Build the CLI First

Why Command-Line Tools Still Beat MCP Servers for AI Agents

March 4, 2026

Everyone shipped an MCP server in 2025. Now the developers who actually run AI agents in production are quietly going back to the command line.

The Model Context Protocol exploded onto the scene when Anthropic introduced it in November 2024. Within a year, the ecosystem grew to over 1,800 servers. OpenAI, Google, and Microsoft all adopted it. Anthropic donated it to the Linux Foundation. It was, by any measure, one of the fastest protocol adoptions in tech history.

And yet, by February 2026, a clear counter-signal emerged. Eric Holmes' post "MCP is Dead. Long Live the CLI" hit the top of Hacker News. Octomind, a company that builds testing agents, published "Everyone Scrambled to Ship MCP Servers. The Agents That Actually Work Just Use the Command Line." Developer after developer reported the same thing: after building real systems with MCP, they reverted to CLIs and direct API calls.

This is not a hot take. This is a consulting firm's honest assessment after building AI-powered tooling for ourselves and our clients. Here is what we have learned.

What MCP Actually Is (And What It Promised)

MCP is a standardized protocol, inspired by the Language Server Protocol (LSP), that lets AI models discover and invoke external tools. An MCP server exposes "tools" (functions the AI can call), "resources" (data it can read), and "prompts" (templates for structured interactions). Everything communicates over JSON-RPC 2.0.

The pitch was compelling: build your integration once, and it works with Claude, ChatGPT, Cursor, or any MCP-compatible client. No more custom glue code for every AI platform. The N×M integration problem becomes N+M.

In practice, MCP delivers on several of these promises. The structured tool schemas are genuinely useful: the AI knows exactly what parameters a tool expects, what types they are, and what the tool returns. Dynamic discovery via tools/list means you can add capabilities without changing client code. And the OAuth and consent flows give enterprises the governance layer they need before handing an AI agent any real authority.
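As a concrete sketch, here is roughly what a tools/list response describing a single tool looks like. The field names follow the MCP specification; the tool itself is hypothetical:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "list_pull_requests",
        "description": "List pull requests in a repository",
        "inputSchema": {
          "type": "object",
          "properties": {
            "repo": { "type": "string", "description": "owner/name" },
            "state": { "type": "string", "enum": ["open", "closed", "all"] }
          },
          "required": ["repo"]
        }
      }
    ]
  }
}
```

Note that every definition like this is sent to the model as context, a detail that matters in the next section.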

So why are experienced builders pulling back?

The Case Against Starting with MCP

The Token Tax

This is the most concrete, measurable problem. Every MCP tool definition gets loaded into the AI model's context window before you ask your first question.

The numbers are brutal. The GitHub MCP server ships 93 tools at roughly 55,000 tokens. That's about a quarter of Claude's context window, gone before you have typed "hello." One developer documented connecting four MCP servers to Claude Code and watching 67,000 tokens disappear. At that point, you are paying a significant token tax on every single interaction, and the model has less room to actually think about your problem.

CLIs, by contrast, consume near-zero context tokens. An agent running gh pr list --json number,title uses only the tokens in that command string. It loads tool details progressively via --help only when it needs them.

In benchmarks, CLI-based approaches showed 5–10x better token efficiency. One study measured a Token Efficiency Score of 202 for CLI versus 152 for MCP, a 33% advantage. CLI-based agents completed tasks (like memory profiling) that MCP agents structurally could not, because the MCP agent had burned too much context on tool definitions.

The Debugging Gap

When a CLI command fails, you copy-paste it and run it yourself. You see the exact same error. You add --verbose. You pipe stderr to a file. You have 50 years of debugging tooling at your disposal: tee, strace, redirects, script recording.

When an MCP tool call fails, you are debugging a JSON-RPC exchange over a transport layer (stdio or HTTP), mediated by an AI model that may have misinterpreted the tool schema. The surface area is dramatically larger, and the tooling is immature by comparison.

As one developer put it: "Several of us spent hours debugging MCP connection issues that would have been instant to diagnose with curl or a direct CLI call."

The Audience Problem

MCP servers only work with MCP-compatible clients. Today, that means Claude Desktop, ChatGPT Desktop, Cursor, and a handful of other AI-powered tools. If you build an MCP-only integration, you have excluded:

  • CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins)
  • Cron jobs and scheduled automation
  • Shell scripts and Makefiles
  • Developers who prefer their terminal
  • Systems without an AI agent running

A CLI serves all of these audiences. Every CI/CD system is fundamentally built around running shell commands. Every developer knows how to use --help. Every server with SSH access can run your tool immediately.
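For instance, the same command an agent runs locally drops straight into a CI job with no adapter at all. A hypothetical GitHub Actions step (gh ships preinstalled on GitHub-hosted runners):

```yaml
# Hypothetical workflow step: the CLI an agent drives locally runs unchanged in CI
- name: List open pull requests
  run: gh pr list --json number,title
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

An MCP-only integration has no equivalent here; there is no MCP client in the runner.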

The Security Irony

MCP's governance features (OAuth flows, user consent for tool calls, sandboxed execution) are real strengths. But the current state of the ecosystem undermines them. Research found that 88% of MCP servers require credentials, yet 53% rely on insecure static API keys rather than OAuth (only 8.5% use OAuth properly). Researchers demonstrated prompt injection attacks, tool poisoning (where a tool mutates its own definition after installation), and hundreds of MCP servers running publicly exposed with no authentication.

Meanwhile, battle-tested auth patterns for CLIs already exist and work: AWS profiles, kubeconfig, gh auth login. These patterns work identically whether a human or an AI agent is driving. As Eric Holmes noted: "MCP is unnecessarily opinionated about auth. Battle-tested auth flows work the same whether humans or Claude is driving."

What CLIs Get Right

Composability is a Superpower

Doug McIlroy invented the Unix pipe in 1973. The design decision to make programs communicate via text streams over stdin/stdout created an ecosystem of tools that compose endlessly:

gh pr list --json number,title,author | jq -c '.[] | select(.author.login == "me")' | wc -l

Three tools. Three vendors. Zero integration code. This works because each tool follows a simple contract: read stdin, write stdout, use exit codes.

MCP tools do not compose with grep, jq, or xargs. They compose only with other MCP tools inside the same AI agent context, a fundamentally narrower composition model.

LLMs Already Know CLIs

Large language models were trained on billions of lines of shell commands, man pages, Stack Overflow answers, and CLI documentation. When you tell an agent to run gh pr view 123, it just works. The model has deep familiarity with CLI conventions from its training data.

MCP tool schemas, by contrast, are novel artifacts the model encounters at inference time. The model must parse the schema, understand the parameter constraints, and generate a valid JSON-RPC call, all without the benefit of having seen thousands of examples during training.

CLIs Are Protocol-Independent

If MCP evolves (and it will: the spec has already deprecated SSE transport in favor of Streamable HTTP), your CLI does not care. If a new AI protocol emerges next year, your CLI still works. The tool's value is in its functionality, not in the protocol wrapping it.

Robert Melton documented building a 50,000-line Go CLI for Mail.app, then wrapping it as an MCP server in approximately 200 lines of Python. If MCP changes, he updates 200 lines. The 50,000-line CLI remains untouched.

The Right Mental Model: CLI First, MCP When Needed

The emerging consensus is not "MCP is dead." It is: build the CLI first, add MCP when governance demands it.

Here is the decision framework we use at Crimson Cow Labs:

Start With CLI When:

  • Your users are developers who are comfortable in a terminal
  • You need automation -- CI/CD pipelines, cron jobs, shell scripts
  • You need composability -- piping output to other tools, chaining operations
  • You want debuggability -- reproduce any issue by running the same command
  • You are prototyping -- CLIs have lower setup overhead than MCP servers
  • A CLI already exists for the service (gh, aws, kubectl, terraform, stripe)

Add MCP When:

  • Enterprise governance requires it -- regulated environments where unrestricted shell access is not permitted
  • Your audience includes non-developers -- people who find CLIs intimidating but use AI assistants
  • You need structured discovery -- the AI agent needs to dynamically discover capabilities
  • IDE integration is the primary use case -- VS Code, Cursor, and similar tools benefit from MCP's structured context
  • Multi-tenant scoping is required -- per-user permissions and audit trails in a SaaS product

The Hybrid Pattern

The strongest architecture we have seen is building a well-designed CLI and wrapping it in a thin MCP server:

  1. Build the CLI. Follow conventions: subcommands, --help, --json output flag, proper exit codes, structured error messages on stderr.
  2. Make it excellent. Invest in the developer experience. Good help text. Consistent flag names. Fast startup time.
  3. Add --json output. This is critical. It makes your CLI machine-readable for both scripts and MCP wrappers.
  4. Write the MCP wrapper. It is roughly 200 lines that translate MCP tool calls into CLI invocations and format the JSON output as MCP responses.

This gives you the best of both worlds: a universal tool that works for humans, scripts, CI/CD, and AI agents, plus a structured interface for AI-native contexts when they are required.
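As a sketch of what step 4 amounts to, assuming a hypothetical ccl CLI that honors a --json flag: the heart of the wrapper is a translation from tool parameters to CLI flags plus a subprocess call. The paramsToArgs helper and the camelCase-to-kebab-case flag convention are our own assumptions, not part of any SDK:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Translate MCP-style camelCase params into CLI flags:
// { consultantName: "Jane Smith" } -> ["--consultant-name", "Jane Smith"]
function paramsToArgs(params: Record<string, string>): string[] {
  return Object.entries(params).flatMap(([key, value]) => [
    "--" + key.replace(/[A-Z]/g, (c) => "-" + c.toLowerCase()),
    value,
  ]);
}

// Invoke the CLI and hand its --json output back as an MCP-shaped response.
// tool is the command prefix, e.g. ["ccl", "contracts", "generate"].
async function callCli(tool: string[], params: Record<string, string>) {
  const args = [...tool.slice(1), ...paramsToArgs(params), "--json"];
  const { stdout } = await run(tool[0], args);
  return { content: [{ type: "text", text: stdout }] };
}
```

Each MCP tool handler then becomes a one-line call to callCli; all real logic stays in the CLI, which is why the wrapper stays small.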

A Practical Example

Consider a consulting contract generator (a tool near to our hearts). You could build it two ways:

MCP-Only Approach:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "contracts", version: "1.0.0" });

server.registerTool("generate_contract", {
  description: "Generate a consulting contract",
  inputSchema: {
    consultantName: z.string(),
    clientName: z.string(),
    amount: z.string(),
    // ... 15 more parameters
  },
}, async (params) => {
  const contract = await generateContract(params);
  return { content: [{ type: "text", text: contract }] };
});

This works in Claude Desktop. It does not work in your CI pipeline when you want to batch-generate contracts from a CSV. It does not work when a developer wants to script it into an existing workflow. It does not compose with diff to compare two contract versions.

CLI-First Approach:

# Human use
ccl contracts generate --consultant "Jane Smith" --client "Acme Corp" --amount '$25,000'

# Automation
cat clients.csv | ccl contracts generate --from-csv --output-dir ./contracts/

# Comparison
diff <(ccl contracts generate --template standard) <(ccl contracts generate --template enterprise)

# AI agent use (via MCP wrapper or direct shell)
# The LLM already knows how to call this

Same underlying functionality. Dramatically broader utility.

What This Means for Your Business

If you are building developer tools, internal tooling, or any software that AI agents will interact with, here is our recommendation:

  1. Do not build an MCP-only integration. You are locking yourself into the AI ecosystem and excluding the majority of potential users.

  2. Invest in a great CLI. Follow the Command Line Interface Guidelines. Use a mature framework (Commander.js for Node, Click for Python, Cobra for Go). Support --json output. Write excellent --help text.

  3. Add MCP when the use case justifies it: if your customers are enterprises that need governance, if your tool is primarily used in an IDE context, or if non-developer users need AI-mediated access, add the MCP layer on top of your CLI.

  4. Watch the token budget. If your MCP server exposes more than 10–15 tools, you are likely burning context that the model needs for reasoning. Consider splitting into focused servers or using progressive disclosure.

The command line has been the developer's primary interface for over 50 years. MCP is a useful addition to the toolkit, not a replacement. Build the foundation first.


Crimson Cow Labs helps companies build AI-powered tools and workflows. We specialize in the practical application of LLMs to consulting, legal, and business operations. We always start with the CLI.