Agents / Trust & Safety · Mock demo
WorkflowPilot Safe Agents
A safe-agent demo that turns a business goal into a proposed multi-step workflow, runs only human-approved tool calls, and records every action in an audit trail.
Next.js · TypeScript · Claude tool-use · Human-in-the-loop · Audit logging · Vercel
At a glance
- ICP: Ops/AI PMs and small teams automating low-risk workflows such as drafting emails, creating tickets, and summarising notes.
- Features:
  - Create a workflow goal
  - Agent proposes a tool-call plan
  - Human approval queue (approve / reject / edit)
  - Execute mock tools first (dry-run by default)
  - Full audit trail of proposed → approved → executed
  - Failure / rollback report + safety settings
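The approve / reject / edit decisions in the approval queue can be modelled as a small pure function over a step's status. The sketch below is illustrative only: the `PlanStep`, `Decision`, and `review` names are hypothetical, and the rule that an edited step stays in the queue until it is explicitly approved is an assumption, not something the demo confirms.

```typescript
// Hypothetical types: names and fields are illustrative assumptions.
type StepStatus = "proposed" | "approved" | "rejected" | "executed";

interface PlanStep {
  id: string;
  tool: string;
  args: Record<string, unknown>;
  status: StepStatus;
}

type Decision =
  | { kind: "approve" }
  | { kind: "reject" }
  | { kind: "edit"; args: Record<string, unknown> };

// Apply a reviewer decision to a step and return a new step object.
// Only steps that are still "proposed" can be reviewed.
function review(step: PlanStep, decision: Decision): PlanStep {
  if (step.status !== "proposed") {
    throw new Error(`Step ${step.id} is ${step.status}, not proposed`);
  }
  switch (decision.kind) {
    case "approve":
      return { ...step, status: "approved" };
    case "reject":
      return { ...step, status: "rejected" };
    case "edit":
      // Assumption: an edit replaces the args but leaves the step
      // "proposed", so the edited version still needs its own approval.
      return { ...step, args: decision.args };
  }
}

const draft: PlanStep = {
  id: "step-1",
  tool: "send_email_mock",
  args: { to: "team@example.com" },
  status: "proposed",
};
const edited = review(draft, { kind: "edit", args: { to: "ops@example.com" } });
const approved = review(edited, { kind: "approve" });
console.log(approved.status, approved.args);
```

Returning a new object rather than mutating the step keeps each state available for an audit record.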
AI architecture
1. Goal: User states a plain-language objective.
2. Planner: claude-opus-4-8 proposes ordered steps, each with a tool, args, and reasoning.
3. Approval queue: No tool runs until a human approves. This is the core safety gate.
4. Execution: Approved steps run against mock tools (send_email_mock, create_ticket_mock, …).
5. Audit log: Every proposal, approval, execution, and failure is recorded.
6. Recovery: Failure report + rollback suggestions; policy violations are never auto-retried.
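The gate between approval and execution can be sketched as follows. This is a minimal illustration, not the demo's actual code: `AuditLog`, `executeStep`, the `"blocked"` event name, and the mock tools' return strings are all assumptions. Only the tool names `send_email_mock` and `create_ticket_mock` come from the description above.

```typescript
// Hypothetical shapes: everything here is an illustrative assumption
// except the two mock tool names.
type AuditEvent = "executed" | "failed" | "blocked";

interface AuditEntry {
  stepId: string;
  event: AuditEvent;
  detail: string;
  at: string; // ISO timestamp
}

interface Step {
  id: string;
  tool: string;
  args: Record<string, unknown>;
  approved: boolean;
}

type MockTool = (args: Record<string, unknown>) => string;

// Mock tools only describe what they would do; nothing leaves the process.
const mockTools = new Map<string, MockTool>([
  ["send_email_mock", (args) => `dry-run: would email ${String(args.to)}`],
  ["create_ticket_mock", (args) => `dry-run: would create ticket "${String(args.title)}"`],
]);

// Append-only by construction: the class exposes no update or delete.
class AuditLog {
  private entries: AuditEntry[] = [];
  append(stepId: string, event: AuditEvent, detail: string): void {
    this.entries.push({ stepId, event, detail, at: new Date().toISOString() });
  }
  all(): readonly AuditEntry[] {
    return [...this.entries];
  }
}

// Runs a step only if it was approved and names a known mock tool.
// Blocked and failed steps are logged and returned as null; there is
// deliberately no retry loop here.
function executeStep(step: Step, log: AuditLog): string | null {
  if (!step.approved) {
    log.append(step.id, "blocked", "step has no human approval");
    return null;
  }
  const tool = mockTools.get(step.tool);
  if (tool === undefined) {
    log.append(step.id, "blocked", `unknown tool: ${step.tool}`);
    return null;
  }
  try {
    const result = tool(step.args);
    log.append(step.id, "executed", result);
    return result;
  } catch (err) {
    log.append(step.id, "failed", err instanceof Error ? err.message : String(err));
    return null;
  }
}
```

Because the check lives inside `executeStep`, no code path reaches a tool without passing the approval test first, and every refusal leaves a record.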
Case study
Product problem
Ops and AI PMs want the leverage of agents without the blast radius. The product's job is to make automation trustworthy: every action is previewed, approved, logged, and reversible.
ICP & MVP scope
ICP: a small team automating low-risk workflows. MVP: goal → plan → approval → mock execution → audit. Out of scope (until explicitly approved): real external integrations and unattended execution.
Trust by design
The guardrail metric, unauthorized tool/action attempts, has a target of zero. Trust isn't a feature here; it's the product. The safety settings and audit trail are the things a buyer actually evaluates.
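One way to compute the guardrail metric is directly from the audit trail. The sketch below assumes, for illustration only, that an "unauthorized attempt" means an `executed` record with no earlier, still-valid `approved` record for the same step. The `AuditRecord` shape and the `unauthorizedAttempts` function are hypothetical, and the real definition may differ.

```typescript
// Hypothetical record shape; the event names mirror the
// proposed → approved → executed trail described on this page.
interface AuditRecord {
  stepId: string;
  event: "proposed" | "approved" | "rejected" | "executed" | "failed";
}

// Counts executions that were not preceded by an approval for the
// same step. Records are assumed to be in chronological order.
function unauthorizedAttempts(records: readonly AuditRecord[]): number {
  const approved = new Set<string>();
  let count = 0;
  for (const record of records) {
    if (record.event === "approved") approved.add(record.stepId);
    // Assumption: a later rejection revokes an earlier approval.
    if (record.event === "rejected") approved.delete(record.stepId);
    if (record.event === "executed" && !approved.has(record.stepId)) count += 1;
  }
  return count;
}

const trail: AuditRecord[] = [
  { stepId: "s1", event: "proposed" },
  { stepId: "s1", event: "approved" },
  { stepId: "s1", event: "executed" },
  { stepId: "s2", event: "proposed" },
  { stepId: "s2", event: "executed" }, // never approved
];
console.log(unauthorizedAttempts(trail)); // 1
```

With a target of zero, a check like this could run as an assertion in tests or as an alert over the live log.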
Resume bullets · AI Engineering
- Built a human-in-the-loop agent with a manual tool-calling loop that gates every action behind explicit approval and an append-only audit log.
- Implemented a permissions model and mock-tool sandbox so agent safety can be demoed with zero real-world blast radius.
Resume bullets · AI PM
- Designed a trust-by-design agent product where approval gates, audit trails, and rollback are the core value, not add-ons.
- Set a zero-tolerance guardrail (unauthorized tool attempts) and framed safety settings as the primary buyer-evaluated surface.