Commit 1e1763d

feat(sparsekernel): avoid base64 artifact helpers
1 parent 5f1d70f commit 1e1763d

7 files changed

Lines changed: 361 additions & 43 deletions

docs/architecture/artifact-store.md

Lines changed: 1 addition & 1 deletion
@@ -26,6 +26,6 @@ Retention policies:
 
 Artifact reads and writes are capability-mediated where the caller is not the trusted runtime. Access grants are recorded in `artifact_access`.
 
-The local artifact store accepts bytes, streams, or file paths. File imports stream into a private temporary blob while computing sha256, then move into the content-addressed path, so downloads and snapshots do not need to be loaded fully into memory. The v0 `sparsekerneld` API exposes artifact create/read/metadata endpoints over local JSON. For small compatibility payloads, binary content can still be transported as base64. For large local payloads, use `/artifacts/import-file` with a file staged under the daemon-owned staging directory and `/artifacts/export-file` to copy content into the daemon-owned export directory without moving bytes through JSON.
+The local artifact store accepts bytes, streams, or file paths. File imports stream into a private temporary blob while computing sha256, then move into the content-addressed path, so downloads and snapshots do not need to be loaded fully into memory. The v0 `sparsekerneld` API exposes artifact create/read/metadata endpoints over local JSON. For small compatibility payloads, binary content can still be transported as base64. For large local payloads, use `/artifacts/import-file` with a file staged under the daemon-owned staging directory and `/artifacts/export-file` to copy content into the daemon-owned export directory without moving bytes through JSON. Node clients can use `@openclaw/sparsekernel-client/node-artifacts` to resolve the daemon-compatible staging directory, copy local files into it, and copy exported files to a caller destination without touching the base64 compatibility path.
 
 Browser broker adapters must route screenshots and downloads through this API. Agents should receive artifact ids and metadata, not raw browser download paths or large binary payloads.
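
The staged import path described in the added `artifact-store.md` paragraph can be sketched roughly as follows. This is a minimal illustration and not the code in `node-artifacts.ts`: the `ImportClient` interface and the `importViaStaging` name are hypothetical, and the only detail taken from the commit is that the client receives a `staged_path` and that the stage is removed afterwards.

```typescript
import { copyFile, mkdir, mkdtemp, rm } from "node:fs/promises";
import * as path from "node:path";

// Hypothetical minimal client shape; the real client type may carry more fields.
interface ImportClient<T> {
  importArtifactFile(input: { staged_path: string }): Promise<T>;
}

// Copy a local file into a private subdirectory of the staging directory,
// ask the daemon to import it by path, then remove the staged copy.
async function importViaStaging<T>(
  client: ImportClient<T>,
  filePath: string,
  stagingDir: string,
): Promise<T> {
  await mkdir(stagingDir, { recursive: true });
  const stage = await mkdtemp(path.join(stagingDir, "import-"));
  try {
    const stagedPath = path.join(stage, path.basename(filePath));
    await copyFile(filePath, stagedPath);
    // Only the path crosses the JSON boundary; the bytes stay on disk.
    return await client.importArtifactFile({ staged_path: stagedPath });
  } finally {
    // Clean the stage whether or not the import succeeded.
    await rm(stage, { recursive: true, force: true });
  }
}
```

The sketch assumes the caller and the daemon see the same staging directory on the local filesystem, which is what makes the path-only handoff possible.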

docs/architecture/browser-broker.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Set `OPENCLAW_RUNTIME_BROWSER_BROKER=native` to let SparseKernel launch and supe
 
 Set `OPENCLAW_RUNTIME_BROWSER_REQUIRE_PROXY=1` when a trust zone must use a proxy-backed browser egress path. The trust zone's network policy must contain a loopback `proxy_ref`, and native browser pools launch Chromium with `--proxy-server=<proxy_ref>`. Static or externally managed CDP endpoints are rejected in this mode unless `OPENCLAW_RUNTIME_BROWSER_EXTERNAL_PROXY_OK=1` asserts that the external browser process is already proxy-controlled. This protects the SparseKernel-owned browser process path; it is not host-level egress enforcement for arbitrary host processes.
 
-Supported v0 actions (`status`, `doctor`, `profiles`, `tabs`, `open`, `navigate`, `focus`, `close`, `snapshot`, `console`, `screenshot`, `pdf`, direct file-input `upload`, `dialog`, and brokered `act`) operate against broker-owned targets inside the leased CDP context. Brokered `act` covers the OpenClaw action contract for click, coordinate click, type, press, hover, scroll, drag, select, fill, resize, wait, evaluate, close, and batch using CDP input events plus bounded DOM evaluation. Selector-backed actions retry inside the leased page until their action timeout and now require basic actionability before dispatch: visible connected target, stable bounding box, enabled form state where relevant, editable target for typing, and center-point hit testing. `wait --load networkidle` uses per-target CDP Network events plus a quiet window rather than only checking `document.readyState`. Actions that can change page state are followed by a broker-side navigation check: same-target navigations are accepted only when the resulting URL stays inside the context's allowed-origin policy, same-policy popups are attached as broker-owned targets, and disallowed popups are closed. When an allowed-origin policy is configured, the broker also enables CDP Fetch interception and fails requests outside that policy while recording `browser_network.blocked` observations; this is request control, not host isolation. Before opening or navigating, the ToolBroker checks the trust-zone network policy and denies unsupported schemes, private-network destinations when disallowed, literal denied CIDRs, and, when runtime policy enforcement is enabled, hostnames that resolve to denied/private addresses. Proxy-backed egress control remains future work. Snapshots use a bounded CDP `Runtime.evaluate` DOM read, actions resolve refs from the latest brokered snapshot where needed, console output is captured from CDP runtime/log events per target, and screenshot/PDF output is captured as SparseKernel artifacts, read back through artifact access, and converted to existing tool result formats for compatibility. Closing a broker-owned target now closes that target; the full browser context is released only when the last target closes or broker cleanup runs.
+Supported v0 actions (`status`, `doctor`, `profiles`, `tabs`, `open`, `navigate`, `focus`, `close`, `snapshot`, `console`, `screenshot`, `pdf`, direct file-input `upload`, `dialog`, and brokered `act`) operate against broker-owned targets inside the leased CDP context. Brokered `act` covers the OpenClaw action contract for click, coordinate click, type, press, hover, scroll, drag, select, fill, resize, wait, evaluate, close, and batch using CDP input events plus bounded DOM evaluation. Selector-backed actions retry inside the leased page until their action timeout and now require basic actionability before dispatch: visible connected target, stable bounding box, enabled form state where relevant, editable target for typing, and center-point hit testing. Selector click and hover resolve an actionable center point in the leased page and dispatch real CDP mouse events rather than handing raw DOM click events to page code. `wait --load networkidle` uses per-target CDP Network events plus a quiet window rather than only checking `document.readyState`. Actions that can change page state are followed by a broker-side navigation check: same-target navigations are accepted only when the resulting URL stays inside the context's allowed-origin policy, same-policy popups are attached as broker-owned targets, and disallowed popups are closed. When an allowed-origin policy is configured, the broker also enables CDP Fetch interception and fails requests outside that policy while recording `browser_network.blocked` observations; this is request control, not host isolation. Before opening or navigating, the ToolBroker checks the trust-zone network policy and denies unsupported schemes, private-network destinations when disallowed, literal denied CIDRs, and, when runtime policy enforcement is enabled, hostnames that resolve to denied/private addresses. Proxy-backed egress control remains future work. Snapshots use a bounded CDP `Runtime.evaluate` DOM read, actions resolve refs from the latest brokered snapshot where needed, console output is captured from CDP runtime/log events per target, and screenshot/PDF output is captured as SparseKernel artifacts, read back through artifact access, and converted to existing tool result formats for compatibility. Closing a broker-owned target now closes that target; the full browser context is released only when the last target closes or broker cleanup runs.
 
 Use `openclaw sparsekernel browser-pools` to inspect durable ledger pools and currently materialized native browser process pools. Native pool snapshots include trust zone, profile, active refs, max context slots, idle timeout, endpoint, PID when available, last activity, start count, clean stop count, and crash count.
 
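
The sentence added to `browser-broker.md` says selector click and hover now dispatch real CDP mouse events. As a rough illustration of what that sequence looks like, the sketch below lists the `Input.dispatchMouseEvent` payloads for a hover or a click at an already resolved point. `planMouseEvents` and its types are hypothetical names used only for this example; the broker itself sends the commands directly instead of building a list.

```typescript
type MouseButton = "left" | "right" | "middle";

interface MouseEventParams {
  type: "mouseMoved" | "mousePressed" | "mouseReleased";
  x: number;
  y: number;
  button: MouseButton | "none";
  clickCount?: number;
}

// Hypothetical pure helper: return the Input.dispatchMouseEvent payloads,
// in dispatch order, for a hover or a click at the given point.
function planMouseEvents(
  kind: "hover" | "click",
  point: { x: number; y: number },
  options: { button?: MouseButton; doubleClick?: boolean } = {},
): MouseEventParams[] {
  // Both actions start by moving the pointer over the target.
  const events: MouseEventParams[] = [{ type: "mouseMoved", ...point, button: "none" }];
  if (kind === "hover") {
    return events; // a hover stops after the move
  }
  const button = options.button ?? "left";
  const clickCount = options.doubleClick ? 2 : 1;
  events.push({ type: "mousePressed", ...point, button, clickCount });
  events.push({ type: "mouseReleased", ...point, button, clickCount });
  return events;
}

// A hover yields one event, a click yields three.
console.log(planMouseEvents("hover", { x: 42, y: 24 }).length);
console.log(planMouseEvents("click", { x: 42, y: 24 }).map((event) => event.type));
```

Because the events go through the browser's input pipeline, the page sees them as trusted pointer input at real coordinates, which is why the point has to be resolved and hit-tested first.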

packages/browser-broker/src/index.test.ts

Lines changed: 1 addition & 1 deletion
@@ -327,7 +327,7 @@ class FakeCdpTransport implements CdpTransport {
       this.nextActionNavigationUrl = undefined;
       this.nextActionNewTarget = undefined;
       this.respond(message.id, {
-        result: { value: { ok: true } },
+        result: { value: { ok: true, x: 42, y: 24 } },
       });
       if (navigationUrl) {
         setTimeout(() => this.emitFrameNavigation(navigationUrl, message.sessionId), 0);

packages/browser-broker/src/index.ts

Lines changed: 117 additions & 39 deletions
@@ -329,6 +329,11 @@ type PostActionNavigationObservation =
   | { kind: "same-target"; url?: string }
   | { kind: "new-target"; targetId: string; url?: string };
 
+type CdpActionPoint = {
+  x: number;
+  y: number;
+};
+
 export class SparseKernelCdpBrowserBroker {
   private readonly kernel: SparseKernelBrowserKernelClient;
   private readonly fetchImpl: typeof fetch;
@@ -802,8 +807,34 @@ export class SparseKernelCdpBrowserBroker {
     }
     switch (request.kind) {
       case "click":
+      case "hover": {
+        return await this.withPostActionNavigationGuard(context, request.timeoutMs, async () => {
+          const selector = this.resolveActionSelector(context, request);
+          const evaluated = await context.connection.command<{ result?: { value?: unknown } }>(
+            "Runtime.evaluate",
+            {
+              expression: buildActionPointExpression(
+                request.kind,
+                selector,
+                resolveCdpActionTimeoutMs(request.timeoutMs),
+              ),
+              returnByValue: true,
+              awaitPromise: true,
+            },
+            context.page_session_id,
+            resolveCdpCommandTimeoutMs(request.timeoutMs),
+          );
+          const point = parseCdpActionPoint(evaluated.result?.value);
+          await dispatchCdpMouseAction(context, request, point);
+          return {
+            ok: true,
+            targetId: context.target_id,
+            kind: request.kind,
+            value: evaluated.result?.value,
+          };
+        });
+      }
       case "type":
-      case "hover":
       case "scrollIntoView":
       case "select": {
         return await this.withPostActionNavigationGuard(context, request.timeoutMs, async () => {
@@ -832,43 +863,7 @@ export class SparseKernelCdpBrowserBroker {
       }
       case "clickCoords": {
         return await this.withPostActionNavigationGuard(context, undefined, async () => {
-          const button = normalizeMouseButton(request.button);
-          const clickCount = request.doubleClick ? 2 : 1;
-          await context.connection.command(
-            "Input.dispatchMouseEvent",
-            {
-              type: "mouseMoved",
-              x: request.x,
-              y: request.y,
-              button: "none",
-            },
-            context.page_session_id,
-          );
-          await context.connection.command(
-            "Input.dispatchMouseEvent",
-            {
-              type: "mousePressed",
-              x: request.x,
-              y: request.y,
-              button,
-              clickCount,
-            },
-            context.page_session_id,
-          );
-          if (request.delayMs && request.delayMs > 0) {
-            await delay(Math.min(5_000, Math.floor(request.delayMs)));
-          }
-          await context.connection.command(
-            "Input.dispatchMouseEvent",
-            {
-              type: "mouseReleased",
-              x: request.x,
-              y: request.y,
-              button,
-              clickCount,
-            },
-            context.page_session_id,
-          );
+          await dispatchCdpMouseAction(context, request, { x: request.x, y: request.y });
           return { ok: true, targetId: context.target_id, kind: request.kind };
         });
       }
@@ -1874,6 +1869,53 @@ function normalizeMouseButton(button: string | undefined): "left" | "right" | "m
   return normalized === "right" || normalized === "middle" ? normalized : "left";
 }
 
+async function dispatchCdpMouseAction(
+  context: LiveBrowserContext,
+  request: Extract<SparseKernelBrowserActRequest, { kind: "click" | "clickCoords" | "hover" }>,
+  point: CdpActionPoint,
+): Promise<void> {
+  await context.connection.command(
+    "Input.dispatchMouseEvent",
+    {
+      type: "mouseMoved",
+      x: point.x,
+      y: point.y,
+      button: "none",
+    },
+    context.page_session_id,
+  );
+  if (request.kind === "hover") {
+    return;
+  }
+  const button = normalizeMouseButton(request.button);
+  const clickCount = request.doubleClick ? 2 : 1;
+  await context.connection.command(
+    "Input.dispatchMouseEvent",
+    {
+      type: "mousePressed",
+      x: point.x,
+      y: point.y,
+      button,
+      clickCount,
+    },
+    context.page_session_id,
+  );
+  if (request.delayMs && request.delayMs > 0) {
+    await delay(Math.min(5_000, Math.floor(request.delayMs)));
+  }
+  await context.connection.command(
+    "Input.dispatchMouseEvent",
+    {
+      type: "mouseReleased",
+      x: point.x,
+      y: point.y,
+      button,
+      clickCount,
+    },
+    context.page_session_id,
+  );
+}
+
 function normalizeAllowedOrigins(value: unknown): string[] {
   if (!Array.isArray(value)) {
     return [];
@@ -2482,6 +2524,37 @@ function buildActionExpression(
   throw new Error(`SparseKernel CDP browser action does not support ${request.kind} yet.`);
 }
 
+function buildActionPointExpression(
+  kind: "click" | "hover",
+  selector: string,
+  timeoutMs: number,
+): string {
+  return `(async () => {
+    ${buildActionTargetHelpers(JSON.stringify(selector), JSON.stringify(timeoutMs), kind)}
+    const node = await waitForActionTarget();
+    node.scrollIntoView({ block: "center", inline: "center" });
+    await delay(0);
+    const target = await waitForActionTarget();
+    const rect = target.getBoundingClientRect();
+    const x = Math.min(Math.max(rect.left + rect.width / 2, 0), Math.max(window.innerWidth - 1, 0));
+    const y = Math.min(Math.max(rect.top + rect.height / 2, 0), Math.max(window.innerHeight - 1, 0));
+    return { ok: true, x, y, rect: { left: rect.left, top: rect.top, width: rect.width, height: rect.height } };
+  })()`;
+}
+
+function parseCdpActionPoint(value: unknown): CdpActionPoint {
+  if (!value || typeof value !== "object") {
+    throw new Error("SparseKernel browser action target did not return coordinates");
+  }
+  const record = value as { x?: unknown; y?: unknown };
+  const x = Number(record.x);
+  const y = Number(record.y);
+  if (!Number.isFinite(x) || !Number.isFinite(y)) {
+    throw new Error("SparseKernel browser action target returned invalid coordinates");
+  }
+  return { x, y };
+}
+
 function buildDragExpression(
   startSelector: string,
   endSelector: string,
@@ -2632,7 +2705,12 @@ function buildActionTargetHelpers(selectorJson: string, timeoutJson: string, kin
       const deadline = Date.now() + timeoutMs;
       while (Date.now() <= deadline) {
         const node = document.querySelector(selector);
-        if (node && await isActionable(node)) return node;
+        if (node) {
+          if (actionKind !== "scrollIntoView") {
+            node.scrollIntoView({ block: "center", inline: "center" });
+          }
+          if (await isActionable(node)) return node;
+        }
         await delay(100);
       }
       throw new Error("SparseKernel browser action target not actionable");
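
The center-point arithmetic inside `buildActionPointExpression` is easier to check in isolation. The sketch below restates it as a pure function over a rectangle and a viewport size. `clampedCenter`, `Rect`, and `Viewport` are names introduced only for this example; the commit computes the same expression inline in the evaluated page script.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Viewport {
  innerWidth: number;
  innerHeight: number;
}

// Center of the rect, clamped into [0, innerWidth - 1] x [0, innerHeight - 1]
// so the dispatched mouse event always lands inside the viewport.
function clampedCenter(rect: Rect, viewport: Viewport): { x: number; y: number } {
  const clamp = (value: number, max: number): number =>
    Math.min(Math.max(value, 0), Math.max(max, 0));
  return {
    x: clamp(rect.left + rect.width / 2, viewport.innerWidth - 1),
    y: clamp(rect.top + rect.height / 2, viewport.innerHeight - 1),
  };
}

const viewport: Viewport = { innerWidth: 800, innerHeight: 600 };

// Fully visible element: the plain center, (120, 60).
console.log(clampedCenter({ left: 100, top: 50, width: 40, height: 20 }, viewport));

// Element hanging off the right edge: x is clamped to 799.
console.log(clampedCenter({ left: 790, top: 50, width: 40, height: 20 }, viewport));
```

Clamping only keeps the coordinates in range. It does not guarantee that the clamped point still hits the element, which is presumably why the evaluated script waits for the target to be actionable a second time after scrolling.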

packages/sparsekernel-client/package.json

Lines changed: 2 additions & 1 deletion
@@ -4,6 +4,7 @@
   "private": true,
   "type": "module",
   "exports": {
-    ".": "./src/index.ts"
+    ".": "./src/index.ts",
+    "./node-artifacts": "./src/node-artifacts.ts"
   }
 }
Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
+import { access, mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import path from "node:path";
+import { describe, expect, it } from "vitest";
+import type {
+  SparseKernelArtifact,
+  SparseKernelExportArtifactFileResult,
+  SparseKernelImportArtifactFileInput,
+} from "./index.js";
+import {
+  createArtifactFromLocalFile,
+  defaultSparseKernelArtifactStagingDir,
+  exportArtifactToLocalFile,
+} from "./node-artifacts.js";
+
+describe("SparseKernel Node artifact helpers", () => {
+  it("stages local file imports without base64 transport and cleans the stage", async () => {
+    const root = await mkdtemp(path.join(tmpdir(), "sparsekernel-client-test-"));
+    try {
+      const source = path.join(root, "source.bin");
+      const stagingDir = path.join(root, "staging");
+      await writeFile(source, "large artifact");
+      let imported: SparseKernelImportArtifactFileInput | undefined;
+      let stagedBody = "";
+      const client = {
+        async importArtifactFile(
+          input: SparseKernelImportArtifactFileInput,
+        ): Promise<SparseKernelArtifact> {
+          imported = input;
+          stagedBody = await readFile(input.staged_path, "utf8");
+          return {
+            id: "artifact-1",
+            sha256: "a".repeat(64),
+            size_bytes: stagedBody.length,
+            storage_ref: "sha256/aa/aa/hash",
+            mime_type: input.mime_type,
+            retention_policy: input.retention_policy,
+            created_at: "2026-04-29T00:00:00Z",
+          };
+        },
+      };
+
+      const artifact = await createArtifactFromLocalFile(client, {
+        filePath: source,
+        stagingDir,
+        stagedName: "unsafe/name?.bin",
+        mime_type: "application/octet-stream",
+        retention_policy: "durable",
+      });
+
+      expect(artifact).toMatchObject({ id: "artifact-1", size_bytes: 14 });
+      expect(stagedBody).toBe("large artifact");
+      expect(imported?.staged_path.startsWith(stagingDir)).toBe(true);
+      expect(path.basename(imported?.staged_path ?? "")).toBe("name_.bin");
+      await expect(access(path.dirname(imported?.staged_path ?? ""))).rejects.toThrow();
+    } finally {
+      await rm(root, { recursive: true, force: true });
+    }
+  });
+
+  it("copies daemon staged exports to a caller destination and cleans the exported file", async () => {
+    const root = await mkdtemp(path.join(tmpdir(), "sparsekernel-client-test-"));
+    try {
+      const exportDir = path.join(root, "exports");
+      const stagedPath = path.join(exportDir, "artifact.bin");
+      const destination = path.join(root, "downloads", "artifact.bin");
+      await mkdir(exportDir, { recursive: true });
+      await writeFile(stagedPath, "exported artifact");
+      const client = {
+        async exportArtifactFile(input: {
+          id: string;
+          file_name?: string | null;
+        }): Promise<SparseKernelExportArtifactFileResult> {
+          expect(input).toMatchObject({ id: "artifact-1", file_name: "artifact.bin" });
+          return {
+            artifact: {
+              id: "artifact-1",
+              sha256: "b".repeat(64),
+              size_bytes: 17,
+              storage_ref: "sha256/bb/bb/hash",
+              created_at: "2026-04-29T00:00:00Z",
+            },
+            staged_path: stagedPath,
+          };
+        },
+      };
+
+      const exported = await exportArtifactToLocalFile(client, {
+        id: "artifact-1",
+        destinationPath: destination,
+      });
+
+      expect(exported.artifact.id).toBe("artifact-1");
+      await expect(readFile(destination, "utf8")).resolves.toBe("exported artifact");
+      await expect(access(stagedPath)).rejects.toThrow();
+    } finally {
+      await rm(root, { recursive: true, force: true });
+    }
+  });
+
+  it("exposes the staging directory used by local Node clients", () => {
+    const previousHome = process.env.SPARSEKERNEL_HOME;
+    const previousState = process.env.OPENCLAW_STATE_DIR;
+    const previousStaging = process.env.SPARSEKERNEL_ARTIFACT_STAGING_DIR;
+    try {
+      process.env.SPARSEKERNEL_HOME = path.join("tmp", "sparsekernel-home");
+      delete process.env.OPENCLAW_STATE_DIR;
+      delete process.env.SPARSEKERNEL_ARTIFACT_STAGING_DIR;
+      expect(defaultSparseKernelArtifactStagingDir()).toBe(
+        path.join("tmp", "sparsekernel-home", "artifacts", ".staging"),
+      );
+
+      process.env.SPARSEKERNEL_ARTIFACT_STAGING_DIR = path.join("tmp", "custom-staging");
+      expect(defaultSparseKernelArtifactStagingDir()).toBe(path.join("tmp", "custom-staging"));
+    } finally {
+      restoreEnv("SPARSEKERNEL_HOME", previousHome);
+      restoreEnv("OPENCLAW_STATE_DIR", previousState);
+      restoreEnv("SPARSEKERNEL_ARTIFACT_STAGING_DIR", previousStaging);
+    }
+  });
+});
+
+function restoreEnv(name: string, value: string | undefined): void {
+  if (value === undefined) {
+    delete process.env[name];
+    return;
+  }
+  process.env[name] = value;
+}
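
The tests above pin down two observable behaviors of the helpers: `SPARSEKERNEL_ARTIFACT_STAGING_DIR` takes precedence over `SPARSEKERNEL_HOME` when resolving the staging directory, and a staged name such as `unsafe/name?.bin` ends up as `name_.bin`. The sketch below is one way to get those results and is not the code from `node-artifacts.ts`. The function names and the exact sanitizing pattern are assumptions, and the `OPENCLAW_STATE_DIR` fallback is left out because the diff does not show how it is used.

```typescript
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Assumed precedence, matching only what the test asserts: an explicit
// staging override wins, otherwise <SPARSEKERNEL_HOME>/artifacts/.staging.
function resolveStagingDir(env: Env): string | undefined {
  const override = env.SPARSEKERNEL_ARTIFACT_STAGING_DIR;
  if (override) {
    return override;
  }
  const home = env.SPARSEKERNEL_HOME;
  return home ? path.join(home, "artifacts", ".staging") : undefined;
}

// Assumed sanitizer: drop any directory part, then replace every character
// outside a conservative allowlist with an underscore.
function sanitizeStagedName(name: string): string {
  return path.basename(name).replace(/[^A-Za-z0-9._-]/g, "_");
}

console.log(sanitizeStagedName("unsafe/name?.bin")); // name_.bin
```

Taking the environment as a parameter instead of reading `process.env` directly keeps the sketch testable without the save-and-restore dance that `restoreEnv` performs in the test file.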
