OpenClaw Language Boundary
Language-boundary safety policy plugin for OpenClaw
Install
openclaw plugins install clawhub:openclaw-language-boundary
openclaw-language-boundary
OCAL-style language-boundary safety plugin for OpenClaw.
Status
Stable and reusable. Default hardening phase is observe, with progressive recommendations toward guided, strict, and force after clean audit windows.
What it does
- Builds a conservative Action IR before each tool call
- Classifies effect / target / risk without relying on the model
- Applies default policy rules
- Tracks tool failure state
- Writes redacted audit JSONL
- Adds observe-mode checks for outbound messages, installs, subagent spawning, and cron lifecycle
- Supports progressive hardening phases: observe → guided → strict → force
- Can block or request approval when the active hardening phase maps to enforce mode
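The flow above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the plugin's actual code: the `ToolCall` and `ActionIR` shapes, the `buildActionIR`, `redactArgs`, and `toAuditLine` helpers, the placeholder tool names, and the key pattern used for redaction are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the plugin's real field names may differ.
type Effect = "read" | "write" | "exec" | "unknown";
type Risk = "low" | "high";

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface ActionIR {
  tool: string;
  effect: Effect;
  target: string;
  risk: Risk;
}

// Static lookup table, so classification never depends on model output.
// The tool names here are placeholders, not OpenClaw's real tool list.
const KNOWN_EFFECTS: Record<string, Effect> = {
  read_file: "read",
  write_file: "write",
  exec: "exec",
};

function buildActionIR(call: ToolCall): ActionIR {
  const known = Object.prototype.hasOwnProperty.call(KNOWN_EFFECTS, call.tool);
  const effect: Effect = known ? (KNOWN_EFFECTS[call.tool] as Effect) : "unknown";
  const path = call.args["path"];
  const target = typeof path === "string" ? path : "(unspecified)";
  // Conservative default: only a known read counts as low risk.
  const risk: Risk = effect === "read" ? "low" : "high";
  return { tool: call.tool, effect, target, risk };
}

// Assumed heuristic for "sensitive-looking" argument names.
const SENSITIVE_KEY = /token|secret|password|api[_-]?key/i;

function redactArgs(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : args[key];
  }
  return out;
}

// One JSON object per line, as in a JSONL audit log.
function toAuditLine(call: ToolCall, ir: ActionIR): string {
  return JSON.stringify({ ...ir, args: redactArgs(call.args) });
}
```

In this sketch an unrecognized tool falls through to `unknown` and `high`, which mirrors the conservative stance described above; the real classifier is likely richer.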
First-use config
Generate and validate a starter config from source:
npm install
npm run typecheck
npm test
npm run build
npm run release:check
npm run init-config -- --mode=observe
npm run init-config -- --mode=guided
npm run init-config -- --mode=strict
npm run init-config -- --mode=force
For a packaged or clean-install trial, install the package first, then run the
same package scripts from the trial directory. Keep the first trial in
observe mode unless you are explicitly testing enforcement behavior.
npm run release:check runs the read-only internal gate: typecheck, tests, build, required docs, example config JSON validation, generated init-config validation, user-visible status-card validation, local/private value leak scan, and dist entry verification.
Show a safe operator-facing status card without exposing raw prompt/context:
npm run status-card
npm run status-card -- --json
npm run status-card -- --state-dir ~/.openclaw/state/language-boundary
npm run status-card -- --stale-hours 48
The status card reports current hardening phase, policy mode, runtime state, upgrade recommendation, tool-health summary, substrate health, and sanitized noteworthy tool noise. Non-healthy tool entries older than the stale window are omitted from the operator card so old failures do not permanently block upgrade recommendations.
Use --stale-hours to tune how long non-healthy entries stay in the operator summary before they are treated as stale. The default is 24 hours, which is conservative but keeps old noise from blocking progress indefinitely.
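The stale-window rule can be sketched as a simple filter. This is an illustration only: the `ToolHealthEntry` shape, its status names, and the `entriesForStatusCard` function are assumptions, and the only value taken from the text is the 24-hour default.

```typescript
// Hypothetical tool-health record; field and status names are assumptions.
interface ToolHealthEntry {
  tool: string;
  status: "healthy" | "degraded" | "failing";
  lastSeenMs: number; // Unix epoch milliseconds of the last observation
}

const DEFAULT_STALE_HOURS = 24;

// Keep healthy entries, plus non-healthy entries still inside the stale window.
function entriesForStatusCard(
  entries: ToolHealthEntry[],
  nowMs: number,
  staleHours: number = DEFAULT_STALE_HOURS,
): ToolHealthEntry[] {
  const windowMs = staleHours * 60 * 60 * 1000;
  return entries.filter(
    (entry) => entry.status === "healthy" || nowMs - entry.lastSeenMs <= windowMs,
  );
}
```

With the default window, a failure last seen 30 hours ago would drop off the card, while passing `48` would keep it visible.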
partial-enforce is still accepted as an alias for guided.
The generated config uses placeholders only, e.g. /path/to/workspace; replace them with your own workspace and production data roots.
Example configs are available in examples/:
- minimal-config.json
- observe-mode.json
- partial-enforce.json (guided phase)
- force-mode.json
Progressive hardening
The plugin separates policy mode from runtime state:
- policy_mode: observe or enforce (whether blocking decisions are active).
- hardening_phase: observe, guided, strict, or force (product-facing safety maturity phase).
- runtime_state: normal, degraded, failure_loop, etc. (current substrate/tool health).
Phase behavior:
| Phase | Policy mode | Intended use | Upgrade condition |
|---|---|---|---|
| observe | observe | First install, beta trials, noisy environments. Records Action IR/audit without broad blocking. | 7-30 clean days, no recent failures, no unknown-tool noise, reviewed audit samples. |
| guided | enforce | Recommended first enforcement phase. Enforces only the vetted high-risk set: unknown tools, dangerous exec, local destructive actions, external writes, config/credential targets, and secret outbound. | 14 clean days with no disruptive false positives. |
| strict | enforce | Sensitive local or team environments that want broader approval coverage. | Manual operator decision after guided stays clean. |
| force | enforce | Production/compliance mode. | Explicit user/admin confirmation only; never automatic. |
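The phase-to-mode relationship in the table, together with the partial-enforce alias, can be sketched in a few lines. The type and function names are hypothetical, and rejecting unknown phase names is an assumption rather than documented behavior.

```typescript
type PolicyMode = "observe" | "enforce";
type HardeningPhase = "observe" | "guided" | "strict" | "force";

const PHASES: HardeningPhase[] = ["observe", "guided", "strict", "force"];

// Accepts the alias "partial-enforce" for "guided".
function normalizePhase(input: string): HardeningPhase {
  if (input === "partial-enforce") {
    return "guided";
  }
  for (const phase of PHASES) {
    if (phase === input) {
      return phase;
    }
  }
  // Assumption: reject unknown values rather than guess a phase.
  throw new Error(`Unknown hardening phase: ${input}`);
}

// Per the table: only the observe phase maps to observe mode.
function policyModeFor(phase: HardeningPhase): PolicyMode {
  return phase === "observe" ? "observe" : "enforce";
}
```

Keeping the phase as the single input and deriving the policy mode from it avoids configurations where the two disagree.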
Recommended rollout:
- observe for 7-30 days: record Action IR and audit logs without broad blocking.
- Review npm run status-card and audit samples; fix false positives with config overrides.
- guided after a clean audit window: enforce the vetted high-risk rule set.
- strict only after guided stays clean and the operator accepts extra approval friction.
- force only by explicit user/admin confirmation.
By default autoAdvance is "off". If enabled as "guided-only", the plugin
may advance only from clean observe to guided; it never auto-enters strict
or force.
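The auto-advance rule is narrow enough to write down as a single function. This is a sketch: `nextPhase` and the `auditWindowClean` flag are hypothetical names, and how the plugin decides that a window is clean is not modeled here.

```typescript
type HardeningPhase = "observe" | "guided" | "strict" | "force";
type AutoAdvance = "off" | "guided-only";

// Returns the phase to use next. The only automatic transition is
// observe -> guided, and only when auto-advance is enabled and the
// audit window is clean. Every other case keeps the current phase.
function nextPhase(
  current: HardeningPhase,
  autoAdvance: AutoAdvance,
  auditWindowClean: boolean,
): HardeningPhase {
  const eligible =
    autoAdvance === "guided-only" && current === "observe" && auditWindowClean;
  return eligible ? "guided" : current;
}
```

Because the function can only ever return `guided` or the current phase, strict and force remain reachable only through a deliberate config change.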
Do not treat mode: "enforce" alone as permission to jump to strict or
force; set hardening.phase deliberately.
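A small lint-style check can catch the ambiguous case before a config is deployed. The `BoundaryConfig` shape below is an assumption inferred from the `mode` and `hardening.phase` keys named above; the real schema may differ, and `lintConfig` is a hypothetical helper, not part of the plugin.

```typescript
// Assumed config shape; the real schema may contain more fields.
interface BoundaryConfig {
  mode: "observe" | "enforce";
  hardening?: { phase?: "observe" | "guided" | "strict" | "force" };
}

// Returns human-readable warnings; an empty array means nothing was flagged.
function lintConfig(config: BoundaryConfig): string[] {
  const warnings: string[] = [];
  const phase = config.hardening ? config.hardening.phase : undefined;
  if (config.mode === "enforce" && phase === undefined) {
    warnings.push(
      'mode is "enforce" but hardening.phase is not set; choose a phase explicitly.',
    );
  }
  return warnings;
}
```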
Safety defaults
- Unknown tools require approval
- Exec requires approval
- External write requires approval
- Config/credential targets require strong approval
- Hook errors fail closed for high-risk actions
- Audit logs redact sensitive-looking values
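The defaults above can be read as a small decision function. This is a sketch under assumptions: the `Action` shape, the `Decision` labels, and the helper names are hypothetical, and the fallback for low-risk actions when a hook fails is a guess, since the text only specifies the high-risk case.

```typescript
type Decision = "allow" | "approval" | "strong_approval" | "block";

// Hypothetical, simplified view of a classified action.
interface Action {
  effect: "read" | "local_write" | "external_write" | "exec" | "unknown_tool";
  targetKind: "workspace" | "config" | "credential" | "other";
}

function defaultDecision(action: Action): Decision {
  // Config/credential targets take priority and need strong approval.
  if (action.targetKind === "config" || action.targetKind === "credential") {
    return "strong_approval";
  }
  if (
    action.effect === "unknown_tool" ||
    action.effect === "exec" ||
    action.effect === "external_write"
  ) {
    return "approval";
  }
  return "allow";
}

function isHighRisk(action: Action): boolean {
  return defaultDecision(action) !== "allow";
}

// Fail closed: if a policy hook throws on a high-risk action, block it.
function decideWithHook(action: Action, hook: (a: Action) => Decision): Decision {
  try {
    return hook(action);
  } catch (_err) {
    // Assumption: low-risk actions fall back to the default rules instead.
    return isHighRisk(action) ? "block" : defaultDecision(action);
  }
}
```

In observe mode these decisions would presumably be recorded rather than enforced; that distinction is left out of the sketch.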
Calibration notes
After initial observe-mode calibration, these OpenClaw core tools are classified as low-risk/read-like when used safely:
- process with list / poll / log
- update_plan
- sessions_yield
- subagents with list
Riskier variants stay high risk, e.g. process.kill, process.write, subagents.steer, and exec.
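The calibration above amounts to an allowlist keyed by tool and action. The sketch below encodes only the tools named in this section; `riskOf` and the idea that the action arrives as a separate string argument are assumptions for illustration.

```typescript
type Risk = "low" | "high";

// Tools treated as low risk only for the listed actions.
const LOW_RISK_ACTIONS: Record<string, string[]> = {
  process: ["list", "poll", "log"],
  subagents: ["list"],
};

// Tools treated as low risk regardless of action.
const LOW_RISK_TOOLS: string[] = ["update_plan", "sessions_yield"];

// Anything not explicitly allowlisted stays high risk.
function riskOf(tool: string, action?: string): Risk {
  if (LOW_RISK_TOOLS.indexOf(tool) !== -1) {
    return "low";
  }
  if (
    action !== undefined &&
    Object.prototype.hasOwnProperty.call(LOW_RISK_ACTIONS, tool)
  ) {
    const allowed = LOW_RISK_ACTIONS[tool] as string[];
    if (allowed.indexOf(action) !== -1) {
      return "low";
    }
  }
  return "high";
}
```

An allowlist keeps new or renamed actions high risk by default, so calibration can only ever loosen a rule deliberately.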
Next steps
- Keep mode: "observe" as the project default.
- Use partial enforce only after a clean audit window.
- Keep calibration focused on false positives in runtime config, shell diagnostics, workspace artifact cleanup, and external writes.
Mainline closeout
- The runtime reliability mainline is stable and usable.
- The language-boundary mainline is now in calibration / maintenance mode, not active incident response.
- Continue only if future audit logs show new false positives or a new boundary gap.
Reusability requirements
This plugin is intended to be released for other OpenClaw users, not just this machine.
- No hard-coded user paths such as /Users/hzl, /Users/lsg, or this workspace path.
- No dependency on a specific agent/session/machine name.
- No dependency on a specific channel, provider account, local credential, cron job, or local extension layout.
- Host-specific paths or providers may appear only as examples or local validation notes, never runtime defaults.
- Path classification must stay config-driven with safe defaults.
- Audit/state paths must be configurable or derived from OpenClaw/home directory.
- Default behavior must remain conservative: observe first, partial enforce only after calibration.
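The requirement that audit/state paths be configurable or derived from the home directory can be sketched as below. `resolveStateDir` is a hypothetical helper; the fallback subpath reuses the `~/.openclaw/state/language-boundary` example from the status-card section and is assumed, not confirmed, to be the plugin's default. POSIX-style separators are assumed for brevity.

```typescript
// Prefer an explicitly configured directory; otherwise derive one from the
// caller-supplied home directory. Nothing user-specific is hard-coded.
function resolveStateDir(configured: string | undefined, homeDir: string): string {
  if (configured !== undefined && configured.trim() !== "") {
    return configured;
  }
  const home = homeDir.replace(/\/+$/, ""); // drop trailing slashes
  return home + "/.openclaw/state/language-boundary";
}
```

Taking the home directory as a parameter keeps the function free of machine-specific lookups, which also makes it easy to test.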
