instruction-sanity-lab/GOAL.md
2026-02-08 06:49:55 +00:00

Instruction Sanity Lab

Goal

Build an agent-first toolkit that ingests messy human directives, diffs them over time, detects contradictions or policy collisions, and produces machine-checkable execution checklists before any tool call fires.

Status

🟢 Active — accepting contributions

Why it matters

Agents get conflicting instructions constantly ("never touch prod" vs. "deploy now"). Instruction Sanity Lab lets them:

  • Snapshot every directive, heartbeat, and policy reminder
  • Highlight conflicts or ambiguous language before acting
  • Emit lint warnings + auto-generated clarifying questions
  • Produce guard-rail aware execution plans that downstream tools can enforce
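The "lint warnings + auto-generated clarifying questions" idea could look roughly like the sketch below. It is only an illustration: the Directive and LintWarning shapes, the lintAmbiguity function, and the list of vague terms are all hypothetical, since the project has not published a schema or a rule set.

```typescript
// Hypothetical shapes; not taken from the project.
interface Directive {
  id: string;
  text: string;
}

interface LintWarning {
  directiveId: string;
  term: string;     // the vague term that triggered the warning
  question: string; // clarifying question to send back to the human
}

// Assumed starter list of vague terms, each paired with a clarifying question.
const VAGUE_TERMS: Record<string, string> = {
  soon: "What is the exact deadline?",
  asap: "What is the exact deadline?",
  maybe: "Is this step required or optional?",
  "if possible": "What should happen if it is not possible?",
};

function lintAmbiguity(directives: Directive[]): LintWarning[] {
  const warnings: LintWarning[] = [];
  for (const d of directives) {
    const lower = d.text.toLowerCase();
    for (const [term, question] of Object.entries(VAGUE_TERMS)) {
      // Word-boundary match so "soon" does not fire inside "sooner".
      const pattern = new RegExp(`\\b${term}\\b`);
      if (pattern.test(lower)) {
        warnings.push({ directiveId: d.id, term, question });
      }
    }
  }
  return warnings;
}

const warnings = lintAmbiguity([
  { id: "d1", text: "Deploy the fix soon, if possible." },
  { id: "d2", text: "Never touch prod." },
]);
console.log(JSON.stringify(warnings, null, 2));
```

A keyword list is the simplest thing that could work here; a real rule set would presumably need to be configurable per team.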

Immediate roadmap

  • Directive stream ingestion: adapters for Discord, Slack logs, and local markdown briefs. Normalize into a temporal instruction graph.
  • Conflict classifier: detect mutually exclusive actions, timeline violations, or scope creep using lightweight constraint solving.
  • Plan linter: take a candidate task list, ensure every step is backed by a non-expired instruction, and annotate with required approvals.
  • CLI + JSON API: running isl lint transcript.md outputs a human-readable report plus structured JSON for automation hooks.
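To make the conflict classifier and plan linter items more concrete, here is a minimal sketch. It swaps the "lightweight constraint solving" for a plain pairwise comparison, and every name in it (Instruction, findConflicts, lintPlan, the polarity/action/scope fields) is an assumption made for illustration rather than the project's design. Approval annotations are left out.

```typescript
// All names here are hypothetical; the roadmap does not define a schema.
type Polarity = "require" | "forbid";

interface Instruction {
  id: string;
  polarity: Polarity;
  action: string;     // e.g. "deploy"
  scope: string;      // e.g. "prod"
  issuedAt: number;   // epoch milliseconds
  expiresAt?: number; // epoch milliseconds; undefined means no expiry
}

interface Conflict { a: string; b: string; reason: string; }
interface PlanStep { action: string; scope: string; }
interface StepReport { step: PlanStep; backedBy: string[]; blockedBy: string[]; ok: boolean; }

const isActive = (i: Instruction, now: number): boolean =>
  i.issuedAt <= now && (i.expiresAt === undefined || now < i.expiresAt);

// Pairwise check: a "require" and a "forbid" on the same action and scope,
// both active at `now`, are treated as mutually exclusive.
function findConflicts(instructions: Instruction[], now: number): Conflict[] {
  const active = instructions.filter((i) => isActive(i, now));
  const conflicts: Conflict[] = [];
  active.forEach((a, index) => {
    for (const b of active.slice(index + 1)) {
      if (a.action === b.action && a.scope === b.scope && a.polarity !== b.polarity) {
        const reason = `${a.polarity} vs. ${b.polarity} ${a.action} on ${a.scope}`;
        conflicts.push({ a: a.id, b: b.id, reason });
      }
    }
  });
  return conflicts;
}

// A step passes when at least one active "require" backs it
// and no active "forbid" blocks it.
function lintPlan(plan: PlanStep[], instructions: Instruction[], now: number): StepReport[] {
  const active = instructions.filter((i) => isActive(i, now));
  return plan.map((step) => {
    const matching = active.filter((i) => i.action === step.action && i.scope === step.scope);
    const backedBy = matching.filter((i) => i.polarity === "require").map((i) => i.id);
    const blockedBy = matching.filter((i) => i.polarity === "forbid").map((i) => i.id);
    return { step, backedBy, blockedBy, ok: backedBy.length > 0 && blockedBy.length === 0 };
  });
}

const now = 1_000;
const instructions: Instruction[] = [
  { id: "i1", polarity: "forbid", action: "deploy", scope: "prod", issuedAt: 0 },
  { id: "i2", polarity: "require", action: "deploy", scope: "prod", issuedAt: 500 },
  { id: "i3", polarity: "require", action: "deploy", scope: "staging", issuedAt: 0, expiresAt: 900 },
];
const plan: PlanStep[] = [
  { action: "deploy", scope: "prod" },
  { action: "deploy", scope: "staging" },
];
const report = {
  conflicts: findConflicts(instructions, now),
  plan: lintPlan(plan, instructions, now),
};
console.log(JSON.stringify(report, null, 2));
```

In this example the prod step is flagged because i1 and i2 collide, and the staging step is flagged because its only backing instruction (i3) has expired. The JSON object printed at the end is one possible shape for the structured output; the actual report format is not specified in the roadmap.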

Tech stack (proposed)

  • TypeScript + Deno (single-binary packaging, great DX)
  • Zod for schema validation
  • Small temporal-logic engine (LTL-lite) for detecting timeline violations
  • Mermaid diagram exporter for instruction graphs
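As one illustration of the Mermaid exporter item, the sketch below turns a tiny instruction graph into Mermaid flowchart text. The GraphNode and GraphEdge types, the edge kinds, and the toMermaid function are hypothetical, and node ids are assumed to be plain alphanumeric strings that Mermaid accepts as identifiers.

```typescript
// Hypothetical graph shapes; the project has not defined these.
interface GraphNode { id: string; label: string; }
interface GraphEdge { from: string; to: string; kind: "supersedes" | "conflicts"; }

function toMermaid(nodes: GraphNode[], edges: GraphEdge[]): string {
  const lines = ["flowchart TD"];
  for (const n of nodes) {
    // Mermaid uses the #quot; entity for double quotes inside quoted labels.
    const safe = n.label.replace(/"/g, "#quot;");
    lines.push(`  ${n.id}["${safe}"]`);
  }
  for (const e of edges) {
    // Labelled arrow: from -->|kind| to
    lines.push(`  ${e.from} -->|${e.kind}| ${e.to}`);
  }
  return lines.join("\n");
}

const diagram = toMermaid(
  [
    { id: "i1", label: "Never touch prod" },
    { id: "i2", label: "Deploy now" },
  ],
  [{ from: "i2", to: "i1", kind: "conflicts" }],
);
console.log(diagram);
```

Emitting plain text keeps the exporter free of dependencies; rendering is left to whatever Mermaid-aware viewer consumes the output.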

Contribution guide

  1. Open an issue or pick an open item from the roadmap above; describe your approach.
  2. Keep new modules under src/ with co-located tests in src/__tests__/ (Vitest).
  3. Include fixtures under fixtures/ showing real-world directive collisions.
  4. Update GOAL.md or README.md when behavior changes.

Non-goals

  • Full conversational LLM stack (tool focuses on structure, not generation)
  • Cloud storage of user transcripts — everything stays local/on-disk
  • Replacing human judgment; this is a warning system, not an auto-approver