Take your AI agents from "works in the demo" to "safe in production."
A boutique MCP and AI-agent consultancy for CTOs, VPs of Engineering, and Staff+ engineers who moved fast on agents — and now need to govern them before they become a liability.
by Willian Pinho — Maintainer of the MCP server that works in Claude Code, Cursor & Gemini.
What governed looks like
From ad-hoc to declarative in one config.
A production-ready MCP gateway encodes access rights, fail-close posture, routing policy, and observability in a single auditable file — not in Slack threads.
# MCP Gateway — production config (illustrative)
gateway:
default_posture: fail-close # safe-by-default on any degradation
onboarding: declarative # YAML-in-git, reviewed, audited
rbac:
mode: per-function
principals:
- name: agent-read-only
tools: [search, read_file]
- name: agent-privileged
tools: [search, read_file, write_file]
requires_approval: true
routing:
models:
primary: gpt-4o
fallback: claude-3-haiku
cost_controls:
per_team_budget_usd: 500
latency_p95_ms: 3000
observability:
tracing: opentelemetry
log_level: info
reconstruct_calls: true # full agent → tool trace
secrets:
provider: vault
rotation: automatic
zero_inline_credentials: true -
Fail-close by default
Any degradation — gateway, model, or tool — refuses safely rather than silently doing the wrong thing.
-
Per-function RBAC
Each principal has an explicit allow-list of tools. Write access requires approval. Blast radius is bounded by design.
-
Full call reconstruction
OpenTelemetry tracing across agent → gateway → model → tool. Any call can be replayed after the fact.
-
YAML-in-git onboarding
New MCP servers enter production through a reviewed PR, not a Slack message. Declarative, auditable, reversible.
The flagship engagement
MCP Gateway Readiness Audit
In two weeks, you get a precise, evidence-backed verdict on whether your MCP/agent platform is production-grade — and a sequenced plan to close every gap. Fixed scope. Fixed deliverables.
Seven dimensions assessed
-
Tool-access governance / RBAC
Per-function rights management, least-privilege posture, blast-radius of each tool.
-
Fail-close vs fail-open
Degradation behavior of gateway, models, and MCP servers. Are refusals safe by default?
-
Onboarding flow
How new MCP servers and tools enter production — declarative YAML-in-GitHub vs ad-hoc, with review gates and an audit trail.
-
Observability & tracing
End-to-end visibility (OpenTelemetry-grade) across agent → gateway → model → tool. Can you reconstruct any call?
-
Multi-LLM routing & cost controls
Virtual-model routing policy, fallback behavior, per-team cost attribution, latency guardrails.
-
Security & secrets
Secret handling, IDP integration, identity propagation, prompt-injection and exfiltration exposure.
-
Production-readiness gaps
Rollout, kill-switch, rate limits, eval/quality gates — the operational table-stakes for shipping with confidence.
Four deliverables
- ✓
Written Readiness Report
Clear findings per dimension, evidence-backed, written so your engineers AND your security/leadership stakeholders can both act on it.
- ✓
Scored Gap Matrix
All seven dimensions scored red / yellow / green with a severity rating. The entire state of your platform on one screen — defensible and re-runnable.
- ✓
Prioritized 90-Day Roadmap
A sequenced, effort-tagged remediation plan: what to fix in week 1, in month 1, and in quarter 1 — ordered by risk-reduction-per-dollar.
- ✓
Live Review Session
A working session with your team to walk the findings, pressure-test the roadmap, and align on the single highest-leverage next step.
Scored Gap Matrix
| # | Dimension | Status | Finding |
|---|---|---|---|
| 01 | Tool-access governance / RBAC | Critical | No per-function rights model found |
| 02 | Fail-close vs fail-open | Critical | Gateway defaults to fail-open on degradation |
| 03 | Onboarding flow | Review | Ad-hoc today; review gate partially in place |
| 04 | Observability & tracing | Review | Logging present; no end-to-end OTel trace |
| 05 | Multi-LLM routing & cost controls | Pass | Routing policy defined; attribution complete |
| 06 | Security & secrets | Review | Vault in use; IDP propagation incomplete |
| 07 | Production-readiness gaps | Critical | No kill-switch; rate limits absent |
Illustrative — actual scores are evidence-backed findings from your stack.
Two clearly-separated steps — you always know what you're paying for.
Step 1
Paid discovery
A focused working call to map your stack, team size, and the systems in scope.
You walk away with
A fixed-price audit quote and a written scope statement — exactly which systems, which of the seven dimensions, and the two-week timeline. No surprises before you commit.
Step 2
The audit
The full diagnostic across all seven dimensions of your MCP/agent stack.
You walk away with
The four deliverables above: readiness report, scored gap matrix, 90-day roadmap, and live review session.
The discovery exists for one reason: so the audit price is a real number you agree to up front — not a meter that runs while we work.
The path
Discovery → Audit → Implementation. One clean line.
Each step produces the artifact the next step needs. You can stop after any one of them.
Discovery
Scoping call
Maps your stack, team, and scope. Output: a fixed audit quote and written scope statement — you know the exact price before any audit work begins.
Audit
2 weeks · fixed price
Produces the readiness report, scored gap matrix, 90-day roadmap, and live review. A structured diagnostic across the seven dimensions that decide whether a stack is production-grade.
Implementation
Fixed price, scoped from the roadmap
Gateway hardening, declarative onboarding, RBAC, observability, and multi-LLM routing — delivered as a separate fixed-price project scoped directly from the audit roadmap.
This is paid discovery, not a one-off report. The audit deliberately produces the scoping artifact your implementation needs. If you proceed to implementation within 60 days, the full audit fee is credited against the implementation engagement.
Pricing
From US$15K for the two-week audit. Fixed price, no meter.
You buy a deliverable and an outcome — the four artifacts above — not hours.
From US$15,000
Why "from" and not one number? Two things move the figure: the complexity of your stack (how many MCP servers, models, and integration points) and your team size (how many stakeholders the report and review serve). The short paid discovery pins both — so you sign off on an exact, fixed audit price before any audit work begins.
Every engagement includes, in writing
- Written Readiness Report (findings across all 7 dimensions)
- Scored Gap Matrix (red / yellow / green, with severity)
- Prioritized 90-Day Roadmap (effort-tagged, risk-ordered)
- Live Review Session with your team
Payment milestones
Proceed to implementation within 60 days? The full audit fee is credited against the implementation engagement. You pay for the diagnosis once. If you build, the diagnosis was free.
Why this consultant
I build and ship the protocol this audit is about.
-
Maintainer of a published MCP server
Compatible with Claude Code, Cursor, and Gemini CLI. A working cross-client implementation of the protocol, not just an opinion on it.
-
Just did exactly this work
Recently operationalized an MCP-services + virtual-models gateway: YAML-in-GitHub MCP onboarding, per-function rights management, multi-LLM routing, OpenTelemetry observability, and IDP integration.
-
16 years in production engineering
Including a payment platform serving 10M+ users at 99.9% uptime across a fragmented hardware fleet. Production-grade reliability under real stakes.
-
Fintech KYC pipeline shipped in 3 weeks
Fast, scoped delivery against a hard deadline — the same discipline that makes a fixed two-week audit credible.
Find out exactly where your agent stack stands.
Start with a 15-minute discovery call. We map your stack and scope a fixed-price audit — you decide whether to proceed with the number in front of you.
No deck, no obligation. Fixed scope, fixed price, written scope statement before any audit work begins.