OpenClaw Language Boundary

Language-boundary safety policy plugin for OpenClaw

Audits: Pass

ClawScan: Pass. Agentic behavior and permission review.

Static analysis: Review. Pattern checks against bundled files.

VirusTotal: Pass. Multi-engine malware detections and file reputation.

Install

openclaw plugins install clawhub:openclaw-language-boundary

openclaw-language-boundary

OCAL-style language-boundary safety plugin for OpenClaw.

Status

Stable and reusable. The default hardening phase is observe. After clean audit windows, the plugin recommends moving progressively to guided, strict, and force.

What it does

Builds a conservative Action IR before each tool call
Classifies effect / target / risk without relying on the model
Applies default policy rules
Tracks tool failure state
Writes redacted audit JSONL
Adds observe-mode checks for outbound messages, installs, subagent spawning, and cron lifecycle
Supports progressive hardening phases: observe → guided → strict → force
Can block or request approval when the active hardening phase maps to enforce mode
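As a rough illustration of the Action IR and redacted audit JSONL items above, here is a minimal sketch. The `ActionIR` fields, the `SENSITIVE_KEY` pattern, and the `redact` and `auditLine` helpers are all hypothetical names chosen for this example; the plugin's actual record shape and redaction rules may differ.

```typescript
// Hypothetical shapes; the plugin's real Action IR fields may differ.
type Effect = "read" | "write" | "exec" | "unknown";
type Risk = "low" | "high";

interface ActionIR {
  tool: string;
  effect: Effect;
  target: string;
  risk: Risk;
}

// Assumed heuristic: parameter names that look like credentials are sensitive.
const SENSITIVE_KEY = /(token|secret|password|api[_-]?key)/i;

// Replace sensitive-looking values before anything reaches the audit log.
function redact(params: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(params)) {
    out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : value;
  }
  return out;
}

// Serialize one audit record as a single JSONL line (one JSON object per line).
function auditLine(ir: ActionIR, params: Record<string, unknown>): string {
  return JSON.stringify({ ...ir, params: redact(params) });
}

const line = auditLine(
  { tool: "exec", effect: "exec", target: "local", risk: "high" },
  { command: "ls", apiKey: "abc123" },
);
console.log(line);
```

Redacting by key name is only one possible heuristic; a real implementation would likely also inspect values.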

First-use config

Generate and validate a starter config from source:

npm install
npm run typecheck
npm test
npm run build
npm run release:check
npm run init-config -- --mode=observe
npm run init-config -- --mode=guided
npm run init-config -- --mode=strict
npm run init-config -- --mode=force

For a packaged or clean-install trial, install the package first, then run the same package scripts from the trial directory. Keep the first trial in observe mode unless you are explicitly testing enforcement behavior.

npm run release:check runs the read-only internal gate: typecheck, tests, build, required docs, example config JSON validation, generated init-config validation, user-visible status-card validation, local/private value leak scan, and dist entry verification.

Show a safe operator-facing status card without exposing raw prompt/context:

npm run status-card
npm run status-card -- --json
npm run status-card -- --state-dir ~/.openclaw/state/language-boundary
npm run status-card -- --stale-hours 48

The status card reports current hardening phase, policy mode, runtime state, upgrade recommendation, tool-health summary, substrate health, and sanitized noteworthy tool noise. Non-healthy tool entries older than the stale window are omitted from the operator card so old failures do not permanently block upgrade recommendations.

Use --stale-hours to tune how long non-healthy tool entries remain in the operator summary. The default is 24 hours, which is conservative but prevents old noise from blocking progress indefinitely.
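The stale-window rule can be pictured as a simple filter. The sketch below is an assumption about how such filtering could work, not the plugin's code: `ToolHealthEntry`, `lastSeenMs`, and `visibleEntries` are hypothetical names.

```typescript
// Hypothetical record shape for one tool's health state.
interface ToolHealthEntry {
  tool: string;
  healthy: boolean;
  lastSeenMs: number; // epoch milliseconds of the last recorded event
}

// Keep healthy entries; drop non-healthy entries older than the stale window.
function visibleEntries(
  entries: ToolHealthEntry[],
  nowMs: number,
  staleHours = 24,
): ToolHealthEntry[] {
  const cutoff = nowMs - staleHours * 60 * 60 * 1000;
  return entries.filter((e) => e.healthy || e.lastSeenMs >= cutoff);
}

const HOUR_MS = 60 * 60 * 1000;
const now = Date.UTC(2024, 0, 10);
const sample: ToolHealthEntry[] = [
  { tool: "a", healthy: true, lastSeenMs: now - 120 * HOUR_MS },
  { tool: "b", healthy: false, lastSeenMs: now - 30 * HOUR_MS },
  { tool: "c", healthy: false, lastSeenMs: now - 2 * HOUR_MS },
];

// With the 24-hour default, the 30-hour-old failure of "b" is omitted.
console.log(visibleEntries(sample, now).map((e) => e.tool));
// With a 48-hour window, "b" is shown again.
console.log(visibleEntries(sample, now, 48).map((e) => e.tool));
```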

partial-enforce is still accepted as an alias for guided.

The generated config uses placeholders only, e.g. /path/to/workspace; replace them with your own workspace and production data roots.

Example configs are available in examples/:

minimal-config.json
observe-mode.json
partial-enforce.json (guided phase)
force-mode.json

Progressive hardening

The plugin separates policy mode from runtime state:

policy_mode: observe or enforce — whether blocking decisions are active.
hardening_phase: observe, guided, strict, or force — product-facing safety maturity phase.
runtime_state: normal, degraded, failure_loop, etc. — current substrate/tool health.

Phase behavior:

| Phase | Policy mode | Intended use | Upgrade condition |
| --- | --- | --- | --- |
| `observe` | `observe` | First install, beta trials, noisy environments. Records Action IR/audit without broad blocking. | 7-30 clean days, no recent failures, no unknown-tool noise, reviewed audit samples. |
| `guided` | `enforce` | Recommended first enforcement phase. Enforces only the vetted high-risk set: unknown tools, dangerous exec, local destructive actions, external writes, config/credential targets, and secret outbound. | 14 clean days with no disruptive false positives. |
| `strict` | `enforce` | Sensitive local or team environments that want broader approval coverage. | Manual operator decision after guided stays clean. |
| `force` | `enforce` | Production/compliance mode. | Explicit user/admin confirmation only; never automatic. |
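The phase-to-policy-mode mapping in the table, together with the partial-enforce alias, can be sketched as a lookup. The names `normalizePhase` and `policyModeFor` are hypothetical, and throwing on an unknown phase is an assumption made for this example.

```typescript
type Phase = "observe" | "guided" | "strict" | "force";
type PolicyMode = "observe" | "enforce";

// Mapping taken from the phase table: only observe keeps blocking inactive.
const POLICY_MODE: Record<Phase, PolicyMode> = {
  observe: "observe",
  guided: "enforce",
  strict: "enforce",
  force: "enforce",
};

// Accept the documented alias "partial-enforce" for "guided".
function normalizePhase(input: string): Phase {
  const value = input === "partial-enforce" ? "guided" : input;
  if (value === "observe" || value === "guided" || value === "strict" || value === "force") {
    return value;
  }
  // Assumption: unknown phases are rejected rather than silently defaulted.
  throw new Error(`Unknown hardening phase: ${input}`);
}

function policyModeFor(input: string): PolicyMode {
  return POLICY_MODE[normalizePhase(input)];
}

console.log(policyModeFor("observe"));
console.log(policyModeFor("partial-enforce"));
```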

Recommended rollout:

observe for 7-30 days: record Action IR and audit logs without broad blocking.
Review npm run status-card and audit samples; fix false positives with config overrides.
guided after a clean audit window: enforce the vetted high-risk rule set.
strict only after guided stays clean and the operator accepts extra approval friction.
force only by explicit user/admin confirmation.

By default autoAdvance is "off". If enabled as "guided-only", the plugin may advance only from clean observe to guided; it never auto-enters strict or force.
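The auto-advance rule can be expressed as a small pure function. This is a sketch under assumptions: `nextPhase` and the `auditWindowClean` flag are hypothetical, and how the plugin decides that an audit window is clean is not shown.

```typescript
type Phase = "observe" | "guided" | "strict" | "force";
type AutoAdvance = "off" | "guided-only";

// Only one automatic transition exists: clean observe -> guided.
// Every other phase change is left to an explicit operator decision.
function nextPhase(
  current: Phase,
  autoAdvance: AutoAdvance,
  auditWindowClean: boolean,
): Phase {
  if (autoAdvance === "guided-only" && current === "observe" && auditWindowClean) {
    return "guided";
  }
  return current;
}

console.log(nextPhase("observe", "off", true));
console.log(nextPhase("observe", "guided-only", true));
console.log(nextPhase("guided", "guided-only", true));
```

Keeping the rule in one function makes it easy to verify that strict and force are never reachable automatically.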

Do not treat mode: "enforce" alone as permission to jump to strict or force; set hardening.phase deliberately.

Safety defaults

Unknown tools require approval
Exec requires approval
External write requires approval
Config/credential targets require strong approval
Hook errors fail closed for high-risk actions
Audit logs redact sensitive-looking values
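The defaults above, including the fail-closed behavior, could be modeled roughly as follows. The `Action` shape, the `Decision` names, and the reading of "high-risk" as "anything the defaults would not plainly allow" are assumptions made for this sketch.

```typescript
// Hypothetical action shape and decision names, for illustration only.
type Decision = "allow" | "require_approval" | "require_strong_approval" | "block";

interface Action {
  tool: string;
  knownTool: boolean;
  effect: "read" | "local_write" | "external_write" | "exec";
  target: "workspace" | "config" | "credential" | "external";
}

// Static defaults mirroring the list above; this function cannot throw.
function defaultDecision(action: Action): Decision {
  if (action.target === "config" || action.target === "credential") {
    return "require_strong_approval";
  }
  if (!action.knownTool || action.effect === "exec" || action.effect === "external_write") {
    return "require_approval";
  }
  return "allow";
}

// Run an optional policy hook; if it throws, fail closed for high-risk actions.
function decide(action: Action, hook?: (a: Action) => Decision): Decision {
  const fallback = defaultDecision(action);
  if (!hook) return fallback;
  try {
    return hook(action);
  } catch {
    // Assumption: "high-risk" means anything the defaults would not plainly allow.
    return fallback === "allow" ? "allow" : "block";
  }
}

// A hook that always fails, used to demonstrate the fail-closed path.
const failingHook = (_a: Action): Decision => {
  throw new Error("hook crashed");
};

const execAction: Action = { tool: "exec", knownTool: true, effect: "exec", target: "workspace" };
console.log(decide(execAction));
console.log(decide(execAction, failingHook));
```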

Calibration notes

After initial observe-mode calibration, these OpenClaw core tools are classified as low-risk/read-like when used safely:

process with list / poll / log
update_plan
sessions_yield
subagents with list

Riskier variants stay high risk, e.g. process.kill, process.write, subagents.steer, and exec.
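One way to encode these calibration results is an allowlist keyed by tool and action, with everything else defaulting to high risk. The `LOW_RISK` table and `classifyRisk` function below are hypothetical, and the sketch does not model the "when used safely" condition, which would need additional checks.

```typescript
type Risk = "low" | "high";

// Tool/action pairs listed above as low-risk/read-like.
// "any" marks tools that are listed without specific actions.
const LOW_RISK = new Map<string, ReadonlySet<string> | "any">([
  ["process", new Set(["list", "poll", "log"])],
  ["update_plan", "any"],
  ["sessions_yield", "any"],
  ["subagents", new Set(["list"])],
]);

// Anything not explicitly allowlisted is treated as high risk.
function classifyRisk(tool: string, action?: string): Risk {
  const rule = LOW_RISK.get(tool);
  if (rule === undefined) return "high";
  if (rule === "any") return "low";
  return action !== undefined && rule.has(action) ? "low" : "high";
}

console.log(classifyRisk("process", "list"));
console.log(classifyRisk("process", "kill"));
console.log(classifyRisk("exec"));
```

Defaulting to high risk keeps new or renamed actions conservative until they are reviewed.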

Next steps

Keep mode: "observe" as the project default.
Move to guided (alias: partial-enforce) only after a clean audit window.
Keep calibration focused on false positives in runtime config, shell diagnostics, workspace artifact cleanup, and external writes.

Mainline closeout

The runtime reliability mainline is stable and usable.
The language-boundary mainline is now in calibration / maintenance mode, not active incident response.
Continue only if future audit logs show new false positives or a new boundary gap.

Reusability requirements

This plugin is intended to be released for other OpenClaw users, not just this machine.

No hard-coded user paths such as /Users/hzl, /Users/lsg, or this workspace path.
No dependency on a specific agent/session/machine name.
No dependency on a specific channel, provider account, local credential, cron job, or local extension layout.
Host-specific paths or providers may appear only as examples or local validation notes, never runtime defaults.
Path classification must stay config-driven with safe defaults.
Audit/state paths must be configurable or derived from OpenClaw/home directory.
Default behavior must remain conservative: observe first, guided (partial-enforce) only after calibration.
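The requirement that audit/state paths be configurable or derived from the home directory might look like the sketch below. The `stateDir` config key and `resolveStateDir` helper are hypothetical, and the fallback path is an assumption modeled on the `--state-dir` example shown earlier.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical config shape; the real config key may be named differently.
interface PathConfig {
  stateDir?: string;
}

// Prefer an explicit config value; otherwise derive a path from the home
// directory so that no user-specific path is ever hard-coded.
function resolveStateDir(config: PathConfig, homeDir: string = os.homedir()): string {
  const configured = config.stateDir?.trim();
  if (configured) {
    // Expand a leading "~/" so configs stay portable across users.
    return configured.startsWith("~/")
      ? path.join(homeDir, configured.slice(2))
      : configured;
  }
  // Assumed default, modeled on the --state-dir example.
  return path.join(homeDir, ".openclaw", "state", "language-boundary");
}

console.log(resolveStateDir({}));
console.log(resolveStateDir({ stateDir: "~/custom/state" }));
```

Passing `homeDir` as a parameter keeps the function testable without touching the real home directory.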