This is what prompt engineering looks like when real money, real users, and real compliance requirements are on the line.
Over the past four posts, we covered the mechanisms (system prompts, steering files, agent instructions), the hidden prompt (MCP tool definitions), the formats (structured prompting), and the evaluation pipeline. Now we put it all together with production patterns from three domains — banking, SaaS, and dev tools — plus the anti-patterns that still trip up experienced teams.
Domain 1: Banking & Fintech
Banking is the hardest environment for prompt engineering. Compliance regulations, audit trails, and zero tolerance for errors make every prompt a potential liability.
Pattern: Audit-Trail Prompting
# Every agent response must be auditableclass BankingAgent: def __init__(self): self.audit_log = []
def run(self, query: str): # Log the raw query self.audit_log.append({ "timestamp": datetime.utcnow().isoformat(), "type": "user_query", "content": query, })
# Generate response with explicit reasoning response = self.agent.run(f""" <task>{query}</task>
<compliance_rules> - Do NOT process transactions over $10,000 without verification - Do NOT share account details of other users - Do NOT disclose internal fraud detection logic - All responses must include a disclaimer for financial advice - If unsure, escalate to human agent — do not guess </compliance_rules>
<audit_required> Before responding, explain your reasoning in this audit block (the customer will NOT see this): - What information did you use? - What compliance rules apply? - Did you verify the customer's identity? </audit_required>
<customer_response> Your response to the customer goes here. Include the standard disclaimer. </customer_response> """)
# Audit the full response self.audit_log.append({ "timestamp": datetime.utcnow().isoformat(), "type": "agent_response", "content": response, "compliance_check": self.verify_compliance(response), })
return responseWhy it works: The agent’s reasoning is captured in a machine-parseable audit block before the customer-facing response. If a compliance issue arises, you have the agent’s reasoning trace.
Pattern: Progressive Identity Verification
# The agent doesn't ask for everything upfrontverification_levels = { "balance_check": {"name", "last_4_digits"}, "transaction_history": {"name", "last_4_digits", "dob"}, "transfer": {"full_ssn", "2fa_code", "device_verification"}, "account_changes": {"full_ssn", "2fa_code", "manager_approval"},}
agent = Agent( instructions=f"""Verify identity progressively based on the action: {json.dumps(verification_levels, indent=2)}
For balance_check: ask only name + last 4 digits. For transfers: require SSN + 2FA + device verification. Never ask for more information than needed. If verification fails at any level, do not proceed. """)Domain 2: SaaS Customer-Facing Agents
SaaS agents handle user onboarding, troubleshooting, and configuration. The stakes are lower than banking, but the volume is higher — and bad prompts create support tickets.
Pattern: Empathy-First Escalation
support_agent = Agent( instructions="""You are a SaaS support agent.
Response structure: 1. **Acknowledge**: Show you understand the problem 2. **Diagnose**: Ask clarifying questions if needed 3. **Resolve**: Provide step-by-step solution 4. **Confirm**: Ask if the solution worked
Escalation rules: - If the user is frustrated (uses caps, swears, repeats), skip diagnosis - If you can't resolve in 3 messages, offer to escalate - If the issue affects billing, always loop in billing agent - Never make the user repeat information they already provided
Tone rules: - Professional but warm. Use "I" not "we". - Never blame the user: "Let's fix this together" not "You did X wrong" - If you don't know something, say "I don't have that information" — don't guess """, handoffs=[billing_agent, technical_agent, human_agent],)Pattern: Progressive Onboarding Prompts
# Onboarding changes as the user progressesonboarding_stages = { "welcome": { "instructions": "User just signed up. Be warm. Guide them to create their first project. Offer a quick tour. Don't mention billing yet.", "max_messages": 3, }, "active": { "instructions": "User has created a project. Help them configure features. Ask about their goals. Offer relevant tips based on their actions.", "max_messages": None, }, "power_user": { "instructions": "User has been active for 30+ days. Skip basics. Offer advanced features, API access, and integrations. Ask if they want a demo of new features.", "max_messages": None, }, "churning": { "instructions": "User hasn't logged in for 14+ days. Be proactive but not pushy. Ask what's preventing them from using the product. Offer a call with customer success.", "max_messages": 2, },}Domain 3: Dev Tools & Internal Agents
Dev tool agents have the most freedom but the highest precision requirements. A bad code suggestion wastes developer hours. A bad deployment command breaks production.
Pattern: Read-Then-Write Guard
dev_agent = Agent( instructions="""You write and modify code.
Mandatory workflow: 1. ALWAYS read the relevant file first (use read_source_file tool) 2. ALWAYS check existing tests before adding features 3. ALWAYS run tests after changes 4. NEVER overwrite a file without reading it first 5. NEVER delete code without understanding what it does
If you can't read the file (permission error, not found): - Do not guess the content - Report the error to the user - Ask for the correct file path
Deployments (CRITICAL): - Never deploy on Friday after 2 PM local time - Never deploy without a PR review - Always run the test suite before deployment - If tests fail, fix them before deploying """, mcp_servers=[github_mcp, filesystem_mcp, testing_mcp], system_prompt="You are a senior developer. Read before writing. Test before deploying.")Pattern: CI/CD Gate Prompts
pr_review_agent = Agent( instructions="""Review pull requests for this repository.
Checklist (ALL items must pass): 1. [ ] Tests pass (check CI status) 2. [ ] No secrets committed (check for API keys, tokens) 3. [ ] Code follows project conventions (check steering file) 4. [ ] No commented-out code 5. [ ] Error handling is present 6. [ ] No console.log or debug statements in production code
Scoring: - 0-2 failures: Approve with comments - 3-4 failures: Request changes - 5-6 failures: Block and tag team lead
Tone: Specific, actionable, and respectful. "Line 42: This variable is unused" not "Your code is messy". """, tools=[read_file, search_code, check_ci_status],)The 12 Anti-Patterns
After watching dozens of teams deploy prompt-engineered agents, these are the most common mistakes.
1. The Everything System Prompt
// Bad: 2,500 words covering project rules, identity, tools, output format, safety, examples, and the meaning of lifeFix: System prompt ≤ 200 words. Delegate to steering files, MCP context, and agent instructions.
2. Assuming the Agent Reads Everything
// Bad: Critical rule hidden in paragraph 47 of 83"You must never delete production data... (scrolling)... and also please format dates as ISO 8601"Fix: Critical rules go FIRST. Safety rules go at position 1-3 of the prompt. Format preferences go last.
3. The Silent Failure Mode
# Bad: Agent fails silentlyagent.run("Deploy to production") # Agent decides not to deploy but doesn't say whyFix: Require the agent to explain non-actions.
instructions = """If you decide NOT to do something, explain why.Silent refusals are unacceptable — tell the user what's blocking you."""4. Identical Instructions for Different Agents
# Bad: Same instructions, different namesagent_a = Agent(name="researcher", instructions="Be thorough and accurate")agent_b = Agent(name="writer", instructions="Be thorough and accurate")Fix: Each agent needs unique instructions that define its role, tools, and constraints.
5. No Steering File, Everything in System Prompt
// Bad: Project rules that change every sprint are in a static system promptFix: Steering file (.kiro/steering.md or .claude.md) for anything that changes with the codebase.
6. Prompt That Could Have Been a Tool
// Bad: Prompt tells the agent to do something a tool could do"Calculate the total by multiplying price by quantity for each item and summing"Fix: Write a tool. Prompts are instructions. Tools are capabilities. Don’t prompt for what you can tool.
7. Over-Specifying Output Format
// Bad: The model complies but the content suffers"Format: strict JSON with exactly these 47 fields..."// Result: Valid JSON, hallucinated dataFix: Keep JSON schema to essential fields. Let the model be flexible where precision doesn’t matter.
8. No Evaluation Baseline
// Bad: Deployed a prompt change and "it felt better"Fix: Run before/after evaluation with the same test dataset. Score both versions. Compare.
9. Prompt Changes Without a PR
// Bad: "I just tweaked the instructions in production" → 2 hours later: P0 incidentFix: Every prompt change goes through a PR. Every PR triggers automated evaluation. No exceptions.
10. Ignoring Token Budget
// Bad: Tool descriptions + system prompt + steering file consume 60K tokens// The model has 12K left for the actual conversationFix: Monitor prompt token usage. Keep system + tools + steering under 30% of context window.
11. One Prompt to Rule Them All
# Bad: Same prompt for all users, all contextsagent = Agent(instructions="Help the user with anything")Fix: Use context-aware prompts. Different instructions for new users vs power users vs admins.
12. No Rollback Plan
// Bad: "The new prompt broke everything but we don't have the old version"Fix: Version every prompt. Keep the last 3 versions accessible. Rollback is a one-click operation.
The Prompt Engineering Manifesto (2026 Edition)
- Prompts are code. Version them. Test them. Review them.
- Steering files > system prompts. Project rules belong in the repo.
- Tools are prompts. Your tool descriptions steer agent behavior more than your instructions.
- Structure your output. JSON schemas and Pydantic models prevent parse failures.
- Test before deploy. LLM-as-judge isn’t perfect, but it’s better than nothing.
- Rollback is a feature. Every prompt change must be reversible.
- Context is finite. Every token in your prompt is a token not available for reasoning.
- Agents need different prompts. One prompt for all agents is wrong nine times out of ten.
- Critical rules go first. Safety, compliance, and must-follow rules at the top.
- When in doubt, add a tool. Prompts instruct; tools enable.
Series Recap
| Post | Topic | Key Takeaway |
|---|---|---|
| 1 | System Prompts vs Steering Files vs Agent Instructions | Three layers, different concerns |
| 2 | MCP Tools as Prompts | Tool definitions are instructions |
| 3 | Structured Prompting | XML, JSON, CoT, output types |
| 4 | Prompt Testing & Evaluation | LLM-as-judge, versioning, regression |
| 5 | Production Patterns & Anti-Patterns | Banking, SaaS, dev tools, 12 anti-patterns |
Series: Prompt Engineering 2026 — Production Patterns. Complete in 5 posts.
Next: what should we explore next?
Advertisement
Advertisement