Take your AI agents from "works in the demo" to "safe in production."

A boutique MCP and AI-agent consultancy for CTOs, VPs of Engineering, and Staff+ engineers who moved fast on agents — and now need to govern them before they become a liability.

by Willian Pinho — Maintainer of the MCP server that works in Claude Code, Cursor & Gemini.

10M+ users at 99.9% uptime
16y production engineering
3wk fintech KYC pipeline shipped
1 MCP server, 3 AI clients

From ad-hoc to declarative in one config.

A production-ready MCP gateway encodes access rights, fail-close posture, routing policy, and observability in a single auditable file — not in Slack threads.

gateway.yml
  • Fail-close by default

    Any degradation — gateway, model, or tool — refuses safely rather than silently doing the wrong thing.

  • Per-function RBAC

    Each principal has an explicit allow-list of tools. Write access requires approval. Blast radius is bounded by design.

  • Full call reconstruction

    OpenTelemetry tracing across agent → gateway → model → tool. Any call can be replayed after the fact.

  • YAML-in-git onboarding

    New MCP servers enter production through a reviewed PR, not a Slack message. Declarative, auditable, reversible.

MCP Gateway Readiness Audit

In two weeks, you get a precise, evidence-backed verdict on whether your MCP/agent platform is production-grade — and a sequenced plan to close every gap. Fixed scope. Fixed deliverables.

Seven dimensions assessed

  1. Tool-access governance / RBAC

    Per-function rights management, least-privilege posture, blast-radius of each tool.

  2. Fail-close vs fail-open

    Degradation behavior of gateway, models, and MCP servers. Are refusals safe by default?

  3. Onboarding flow

    How new MCP servers and tools enter production — declarative YAML-in-GitHub vs ad-hoc, with review gates and an audit trail.

  4. Observability & tracing

    End-to-end visibility (OpenTelemetry-grade) across agent → gateway → model → tool. Can you reconstruct any call?

  5. Multi-LLM routing & cost controls

    Virtual-model routing policy, fallback behavior, per-team cost attribution, latency guardrails.

  6. Security & secrets

    Secret handling, IDP integration, identity propagation, prompt-injection and exfiltration exposure.

  7. Production-readiness gaps

    Rollout, kill-switch, rate limits, eval/quality gates — the operational table-stakes for shipping with confidence.

Four deliverables

  • Written Readiness Report

    Clear findings per dimension, evidence-backed, written so your engineers AND your security/leadership stakeholders can both act on it.

  • Scored Gap Matrix

    All seven dimensions scored red / yellow / green with a severity rating. The entire state of your platform on one screen — defensible and re-runnable.

  • Prioritized 90-Day Roadmap

    A sequenced, effort-tagged remediation plan: what to fix in week 1, in month 1, and in quarter 1 — ordered by risk-reduction-per-dollar.

  • Live Review Session

    A working session with your team to walk the findings, pressure-test the roadmap, and align on the single highest-leverage next step.

Scored Gap Matrix

Pass Review Critical
# Dimension Status Finding
01 Tool-access governance / RBAC Critical No per-function rights model found
02 Fail-close vs fail-open Critical Gateway defaults to fail-open on degradation
03 Onboarding flow Review Ad-hoc today; review gate partially in place
04 Observability & tracing Review Logging present; no end-to-end OTel trace
05 Multi-LLM routing & cost controls Pass Routing policy defined; attribution complete
06 Security & secrets Review Vault in use; IDP propagation incomplete
07 Production-readiness gaps Critical No kill-switch; rate limits absent

Illustrative — actual scores are evidence-backed findings from your stack.

Two clearly-separated steps — you always know what you're paying for.

Step 1

Paid discovery

Short · scoping

A focused working call to map your stack, team size, and the systems in scope.

You walk away with

A fixed-price audit quote and a written scope statement — exactly which systems, which of the seven dimensions, and the two-week timeline. No surprises before you commit.

Step 2

The audit

2 weeks · fixed price

The full diagnostic across all seven dimensions of your MCP/agent stack.

You walk away with

The four deliverables above: readiness report, scored gap matrix, 90-day roadmap, and live review session.

The discovery exists for one reason: so the audit price is a real number you agree to up front — not a meter that runs while we work.

Discovery → Audit → Implementation. One clean line.

Each step produces the artifact the next step needs. You can stop after any one of them.

Discovery

Scoping call

Maps your stack, team, and scope. Output: a fixed audit quote and written scope statement — you know the exact price before any audit work begins.

Audit

2 weeks · fixed price

Produces the readiness report, scored gap matrix, 90-day roadmap, and live review. A structured diagnostic across the seven dimensions that decide whether a stack is production-grade.

Implementation

Fixed price, scoped from the roadmap

Gateway hardening, declarative onboarding, RBAC, observability, and multi-LLM routing — delivered as a separate fixed-price project scoped directly from the audit roadmap.

This is paid discovery, not a one-off report. The audit deliberately produces the scoping artifact your implementation needs. If you proceed to implementation within 60 days, the full audit fee is credited against the implementation engagement.

From US$15K for the two-week audit. Fixed price, no meter.

You buy a deliverable and an outcome — the four artifacts above — not hours.

From US$15,000

2-week engagement · fixed price · four deliverables

Why "from" and not one number? Two things move the figure: the complexity of your stack (how many MCP servers, models, and integration points) and your team size (how many stakeholders the report and review serve). The short paid discovery pins both — so you sign off on an exact, fixed audit price before any audit work begins.

Every engagement includes, in writing

  • Written Readiness Report (findings across all 7 dimensions)
  • Scored Gap Matrix (red / yellow / green, with severity)
  • Prioritized 90-Day Roadmap (effort-tagged, risk-ordered)
  • Live Review Session with your team

Payment milestones

Kickoff 50%
On delivery + live review 50%

Proceed to implementation within 60 days? The full audit fee is credited against the implementation engagement. You pay for the diagnosis once. If you build, the diagnosis was free.

I build and ship the protocol this audit is about.

  • Maintainer of a published MCP server

    Compatible with Claude Code, Cursor, and Gemini CLI. A working cross-client implementation of the protocol, not just an opinion on it.

  • Just did exactly this work

    Recently operationalized an MCP-services + virtual-models gateway: YAML-in-GitHub MCP onboarding, per-function rights management, multi-LLM routing, OpenTelemetry observability, and IDP integration.

  • 16 years in production engineering

    Including a payment platform serving 10M+ users at 99.9% uptime across a fragmented hardware fleet. Production-grade reliability under real stakes.

  • Fintech KYC pipeline shipped in 3 weeks

    Fast, scoped delivery against a hard deadline — the same discipline that makes a fixed two-week audit credible.

Find out exactly where your agent stack stands.

Start with a 15-minute discovery call. We map your stack and scope a fixed-price audit — you decide whether to proceed with the number in front of you.

No deck, no obligation. Fixed scope, fixed price, written scope statement before any audit work begins.