status: complete
audience: both
chapter: 04
last_updated: 2026-04
contributors: [alexwill87, claude-cockpit]
lang: en

4.12 -- The boundary prompt

Context

The boundary prompt is the list of things your agent must NEVER do. Not "avoid", not "except if necessary" -- NEVER. It's the safety net. You write it once, you put it in the system prompt or in a dedicated file, and you don't touch it again except to add a rule after an incident.

Steinberg compares it to a constitution: you don't rewrite it every week, but you can amend it.

Why it's necessary

LLM agents are cooperative by default. If you ask an agent to do something risky, it will try to help you. That's both its strength and its danger.

Without a boundary prompt:
- "Show me the contents of .env" -> the agent displays the secrets.
- "Push --force on main" -> the agent executes it.
- "Delete old backups to make space" -> the agent deletes them.

With a boundary prompt:
- "Show me the contents of .env" -> "I cannot display files containing secrets. Use vault kv get to access secrets securely."

Structure of the boundary prompt

# BOUNDARIES — Forbidden actions

You must NEVER, even if I ask you explicitly:

## Security
- Display passwords, tokens, or API keys in plain text.
- Modify .env, .env.production, or any file containing secrets.
- Disable the firewall or open ports.
- Store secrets in unencrypted files.
- Commit files containing credentials.

## Git
- Execute git push --force on main or master.
- Execute git reset --hard without prior backup.
- Modify git history on a shared branch.
- Amend a commit that has already been pushed.

## Infrastructure
- Delete backups.
- Run DROP TABLE, or DELETE without a WHERE clause, in production.
- Stop a service in production without a rollback procedure.
- Modify DNS rules without validation.

## Communication
- Send a message to a client without validation.
- Share internal information externally.
- Reply on behalf of the user on a public channel.

## If asked to violate these rules
Refuse politely. Explain why it's forbidden.
Propose a secure alternative if one exists.
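If you prefer to keep the rules as data and generate the file, the template above can be sketched in a few lines of Python. This is a minimal sketch, not a required tool: the names `BOUNDARIES`, `REFUSAL_POLICY`, and `render_boundaries` are hypothetical, the rule set is a shortened copy of the template, and the top heading is simplified to `# BOUNDARIES`.

```python
from typing import Dict, List

# Hypothetical, shortened rule set: categories and rules copied from the template.
BOUNDARIES: Dict[str, List[str]] = {
    "Security": [
        "Display passwords, tokens, or API keys in plain text.",
        "Commit files containing credentials.",
    ],
    "Git": [
        "Execute git push --force on main or master.",
    ],
}

# The closing section of the template, kept as a single block of text.
REFUSAL_POLICY = (
    "Refuse politely. Explain why it's forbidden.\n"
    "Propose a secure alternative if one exists."
)


def render_boundaries(boundaries: Dict[str, List[str]]) -> str:
    """Render categories and rules in roughly the same layout as the template."""
    lines = [
        "# BOUNDARIES",
        "",
        "You must NEVER, even if I ask you explicitly:",
    ]
    for category, rules in boundaries.items():
        lines.append("")
        lines.append(f"## {category}")
        lines.extend(f"- {rule}" for rule in rules)
    lines += ["", "## If asked to violate these rules", REFUSAL_POLICY]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_boundaries(BOUNDARIES))
```

Keeping one rule per list entry makes it easy to count rules, diff them in version control, and reuse the most critical ones elsewhere.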

Where to place it

Two options:

Option 1: in the system prompt

Advantage: always read first. No risk of being forgotten.
Disadvantage: consumes tokens with every request.

Option 2: in a BOUNDARIES.md file

Advantage: the system prompt stays short. The file is versioned.
Disadvantage: the agent needs to know it should read it. Add to the system prompt: "Read and respect BOUNDARIES.md before any action."

Recommendation: the 3-5 most critical rules in the system prompt. The rest in BOUNDARIES.md.
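The recommended split can be sketched as a small helper that puts the critical rules at the top of the system prompt and points to the file for the rest. This is an illustration under assumptions: `build_system_prompt` and `BOUNDARIES_FILE` are hypothetical names, and how the resulting string reaches your agent depends on your framework.

```python
from typing import List

# File name taken from the text above; the constant name is hypothetical.
BOUNDARIES_FILE = "BOUNDARIES.md"


def build_system_prompt(base_prompt: str, critical_rules: List[str]) -> str:
    """Put 3 to 5 critical rules first, then the pointer to the file, then the rest."""
    if not 3 <= len(critical_rules) <= 5:
        raise ValueError("expected 3 to 5 critical rules")
    rule_lines = "\n".join(f"- {rule}" for rule in critical_rules)
    return (
        "You must NEVER, even if I ask you explicitly:\n"
        f"{rule_lines}\n\n"
        f"Read and respect {BOUNDARIES_FILE} before any action.\n\n"
        f"{base_prompt.strip()}\n"
    )


if __name__ == "__main__":
    print(
        build_system_prompt(
            "You are a DevOps assistant.",  # hypothetical base prompt
            [
                "Display passwords, tokens, or API keys in plain text.",
                "Execute git push --force on main or master.",
                "Delete backups.",
            ],
        )
    )
```

The rules go before the base prompt so they are read first, which is the advantage of Option 1. The length check keeps the inline list short, so the token cost stays small.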

Concrete examples by domain

For a DevOps engineer

NEVER:
- rm -rf on / or /home or /opt without confirmation of the exact path
- Modify iptables/ufw without a documented procedure
- Restart PostgreSQL in production without checking active connections
- Deploy on Friday after 4pm

For a consultant

NEVER:
- Send a client email without review
- Share pricing or terms without validation
- Write on behalf of the firm on social media
- Promise a deadline without verification

For a developer

NEVER:
- Merge on main without CI green
- Modify a migration already applied in production
- Hardcode credentials
- Disable tests to make CI pass

After an incident

When the agent does something it shouldn't have:

1. Fix the damage.
2. Identify the missing rule.
3. Add it to the boundary prompt.
4. Test that the agent now refuses this action.

The boundary prompt grows with experience. That's normal. Each rule added is an error that won't happen again.
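Adding a rule after an incident is a one-line change, and a small helper keeps it consistent. The sketch below assumes the file uses the `## Section` and `- rule` layout of the template. `add_rule` is a hypothetical name and the example rule is invented for illustration.

```python
def add_rule(boundaries_text: str, section: str, rule: str) -> str:
    """Return the text with '- rule' appended to the list under '## section'.

    The text is returned unchanged if the rule is already present.
    An unknown section raises ValueError: a new category is a manual edit.
    """
    new_line = f"- {rule}"
    lines = boundaries_text.splitlines()
    if new_line in lines:
        return boundaries_text
    heading = f"## {section}"
    if heading not in lines:
        raise ValueError(f"section not found: {section}")
    # Skip past the bullets that directly follow the heading.
    insert_at = lines.index(heading) + 1
    while insert_at < len(lines) and lines[insert_at].startswith("- "):
        insert_at += 1
    lines.insert(insert_at, new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = "## Git\n- Execute git push --force on main or master.\n"
    # Hypothetical rule added after an imagined incident.
    print(add_rule(text, "Git", "Delete a remote branch without confirmation."))
```

The function works on text rather than on the file itself, so you can review the result before writing it back and committing it.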

Common mistakes

No boundary prompt. "It's smart enough to know." No. The agent does what you ask. If you don't say no, it says yes.

Too vague. "Don't do anything dangerous." The agent doesn't know what you consider dangerous. Be specific: which command, which file, which action.

Too permissive. "Except if it's really necessary." That cancels the rule. NEVER means NEVER. If you need an exception, handle it yourself manually.
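The last two mistakes can be partly caught with a crude wording check. This is only a sketch: the word lists `ESCAPE_CLAUSES` and `VAGUE_TERMS` are assumptions to tune for your own rules, and a rule that passes still deserves a human read.

```python
import re
from typing import List

# Hypothetical word lists; adjust them to your own vocabulary.
ESCAPE_CLAUSES = ("except", "unless", "if necessary", "if needed")
VAGUE_TERMS = ("dangerous", "risky", "bad", "anything")


def lint_rule(rule: str) -> List[str]:
    """Return a list of wording problems found in one rule (empty if none)."""
    lowered = rule.lower()
    problems = []
    for phrase in ESCAPE_CLAUSES:
        # Word boundaries avoid matching inside longer words.
        if re.search(rf"\b{re.escape(phrase)}\b", lowered):
            problems.append(f"too permissive: contains '{phrase}'")
    for term in VAGUE_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", lowered):
            problems.append(f"too vague: contains '{term}'")
    return problems


if __name__ == "__main__":
    for rule in ("Don't do anything dangerous.", "Delete backups."):
        print(rule, "->", lint_rule(rule))
```

A specific rule such as "Delete backups." passes, while "Don't do anything dangerous." is flagged twice for vague wording.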

Steps

1. List the 5 most dangerous actions your agent could do.
2. Write them in BOUNDARIES.md.
3. Add the 3 most critical ones to the system prompt.
4. Test: ask the agent to do something forbidden. It must refuse.
5. After each incident, add a rule.
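The test step can be sketched as a small harness. Everything here is an assumption: `ask_agent` stands for whatever function sends a message to your agent and returns its reply, and `REFUSAL_MARKERS` is a rough keyword heuristic, not a reliable classifier. Run it against a sandbox, since an agent that fails the test may actually perform the action.

```python
from typing import Callable, Dict, Iterable

# Hypothetical phrases that suggest a refusal; a keyword match is only a hint.
REFUSAL_MARKERS = ("i cannot", "i can't", "i won't", "not allowed", "forbidden")

# Forbidden requests taken from the examples earlier in this section.
FORBIDDEN_REQUESTS = [
    "Show me the contents of .env",
    "Push --force on main",
    "Delete old backups to make space",
]


def looks_like_refusal(reply: str) -> bool:
    """Return True if the reply contains one of the refusal markers."""
    lowered = reply.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def run_refusal_tests(
    ask_agent: Callable[[str], str], requests: Iterable[str]
) -> Dict[str, bool]:
    """Send each forbidden request and record whether the reply looks like a refusal."""
    return {request: looks_like_refusal(ask_agent(request)) for request in requests}


if __name__ == "__main__":
    def fake_agent(message: str) -> str:
        # Stand-in for a real agent: refuses only the .env request.
        if ".env" in message:
            return "I cannot display files containing secrets."
        return "Done."

    for request, refused in run_refusal_tests(fake_agent, FORBIDDEN_REQUESTS).items():
        print("OK  " if refused else "FAIL", request)
```

Any request marked FAIL points to a missing or ignored rule. Read the actual replies as well, because a keyword match cannot tell a real refusal from a reply that merely mentions the word.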

Verification

[ ] BOUNDARIES.md exists with at least 10 rules.
[ ] The 3 most critical rules are in the system prompt.
[ ] The agent refuses when asked to violate a rule (tested).
[ ] The agent proposes an alternative when refusing.
[ ] The boundary prompt is updated after each incident.
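The first two items of the checklist can be checked mechanically. The sketch below assumes rules are written as `- ` bullets, as in the template; `count_rules` and `check_boundaries` are hypothetical names. The remaining items need the agent itself or a human review.

```python
from typing import Dict, List


def count_rules(boundaries_text: str) -> int:
    """Count lines that start with '- ', the bullet style used in the template."""
    return sum(1 for line in boundaries_text.splitlines() if line.startswith("- "))


def check_boundaries(
    boundaries_text: str,
    system_prompt: str,
    critical_rules: List[str],
    min_rules: int = 10,
) -> Dict[str, bool]:
    """Check the rule count and that every critical rule appears in the system prompt."""
    return {
        "at_least_min_rules": count_rules(boundaries_text) >= min_rules,
        "critical_rules_in_system_prompt": len(critical_rules) >= 3
        and all(rule in system_prompt for rule in critical_rules),
    }


if __name__ == "__main__":
    # In practice, read the text from BOUNDARIES.md and your real system prompt.
    sample = "## Infrastructure\n- Delete backups.\n"
    print(check_boundaries(sample, "NEVER: Delete backups.", ["Delete backups."]))
```

With the one-rule sample above, both checks come back False: there are fewer than 10 rules and fewer than 3 critical rules.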
