Making MCP cheaper via CLI

Is this article from a while back?

> Before your agent can do anything useful, it needs to know what tools are available. MCP’s answer is to dump the entire tool catalog into the conversation as JSON Schema. Every tool, every parameter, every option.

Because this simply isn't true anymore for the best clients, like Claude Code.

Similar to how Skills were designed[1] to be searchable without dumping everything into context, MCP tools can (and does in Claude Code) work the same way.

See https://www.anthropic.com/engineering/advanced-tool-use and https://x.com/trq212/status/2011523109871108570 and https://platform.claude.com/docs/en/agents-and-tools/tool-us...

[1] https://agentskills.io/specification#progressive-disclosure

There is some important context missing from the article.

First, MCP tools are sent on every request. If you look at the notion MCP the search tool description is basically a mini tutorial. This is going right into the context window. Given that in most cases MCP tool loading is all or nothing (unless you pre-select the tools by some other means) MCP in general will bloat your context significantly. I think I counted about 20 tools in GitHub Copilot VSCode extension recently. That's a lot!

Second, MCP tools are not compossible. When I call the notion search tool I get a dump of whatever they decide to return which might be a lot. The model has no means to decide how much data to process. You normally get a JSON data dump with many token-unfriendly data-points like identifiers, urls, etc. The CLI-based approach on the other hand is scriptable. Coding assistant will typically pipe the tool in jq or tail to process the data chunk by chunk because this is how they are trained these days.

If you want to use MCP in your agent, you need to bring in the MCP model and all of its baggage which is a lot. You need to handle oauth, handle tool loading and selection, reloading, etc.

The simpler solution is to have a single MCP server handling all of the things at system level and then have a tiny CLI that can call into the tools.

In the case of mcpshim (which I posted in another comment) the CLI communicates with the sever via a very simple unix socket using simple json. In fact, it is so simple that you can create a bash client in 5 lines of code.

This method is practically universal because most AI agents these days know how to use SKILLs. So the goal is to have more CLI tools. But instead of writing CLI for every service you can simply pivot on top of their existing MCP.

This solves the context problem in a very elegant way in my opinion.

These days you can rewrite everything yourself for very cheap. So this is `mcporter` rewritten. I prefer to use Rust personally for rewrites. Opus 4.6 can churn it out pretty quickly if that's what you want. To be honest, almost all software that I want to try these days I don't even install. Instead I'd rather read the README and produce a personal version. This allows encoding idiosyncrasies and specifics that another author will not accept.

This looks related to Awesome CLIs/TUIs and terminal trove which has lots both CLI and TUI apps.

Awesome TUIs: https://github.com/rothgar/awesome-tuis

Awesome CLIs: https://github.com/agarrharr/awesome-cli-apps

Terminal Trove: https://terminaltrove.com/

I guess this is another one shows that the CLI and Unix is coming back in 2026.

Hehe... nice one. I think we are all thinking the same thing.

I've also launched https://mcpshim.dev (https://github.com/mcpshim/mcpshim).

The unix way is the best way.

True for coding agents running SotA models where you're the human-in-the-loop approving, less true for your deployed agents running on cheap models that you don't see what's being executed.

But yeah, a concrete example is playwright-mcp vs playwright-cli: https://testcollab.com/blog/playwright-cli

Not just cheaper in terms of token usage but accuracy as well.

Even the smallest models are RL trained to use shell commands perfectly. Gemini 3 flash performs better with a cli with 20 commands vs 20+ tools in my testing.

cli also works well in terms of maintaining KV cache (changing tools mid say to improve model performance suffers from kv cache vs cli —help command only showing manual for specific command in append only fashion)

Writing your tools as unix like cli also has a nice benefit of model being able to pipe multiple commands together. In the case of browser, i wrote mini-browser which frontier models use much better than explicit tools to control browser because they can compose a giant command sequence to one shot task.

https://github.com/runablehq/mini-browser

I like this approach ... BUT the big win for me is audit logs. CLIs naturally leave a trail you can replay.

ALSO... the permission boundary is clearer. You can whitelist commands, flags, working dir... it becomes manageable.

HOWEVER... packaging still matters. A “small” CLI that pulls in a giant runtime kills the benefit.

I want the discipline of small protocol plus big cache. Cheap models can summarize what they did and avoid full context in every step...

I’m not sure how this works. A lot of that tool description is important to the Agent understanding what it can and can’t do with the specific MCP provider. You’d have to make up for that with a much longer overarching description. Especially for internal only tools that the LLM has no intrinsic context for.

This sounds similar to MCPorter[0], can anyone point out the differences?

[0] https://github.com/steipete/mcporter

MCP has some schemas though. CLI is a bit of a mess.

But MCP today isn’t ideal. I think we need to have some catalogs where the agents can fetch more information about MCP services instead of filling the context with not relevant noise.

Why are they using JSON in the context? I thought we'd figured out that the extra syntax was a waste of tokens?

Can LLMs compress those documents into smaller files that still retain the full context?

You just reinvented Skills

I've seen folks say that the future of using computers will be with an LLM that generates code on the fly to accomplish tasks. I think this is a bit ridiculous, but I do think that operating computers through natural language instructions is superior for a lot of cases and that seems to be where we are headed.

I can see a future where software is built with a CLI interface underneath the (optional) GUI, letting an LLM hook directly into the underlying "business" logic to drive the application. Since LLM's are basically text machines, we just need somebody to invent a text-driven interface for them to use...oh wait!

Imagine booking a flight - the LLM connects to whatever booking software, pulls a list of commands, issues commands to the software, and then displays the output to the user in some fashion. It's basically just one big language translation task, something an LLM is best at, but you still have the guardrails of the CLI tool itself instead of having the LLM generate arbitrary code.

Another benefit is that the CLI output is introspectable. You can trace everything the LLM is doing if you want, as well as validate its commands if necessary (I want to check before it uses my credit card). You don't get this if it's generating a python script to hit some API.

Even before LLM's developers have been writing GUI applications as basically a CLI + GUI for testability, separation of concerns etc. Hopefully that will become more common.

Also this article was obviously AI generated. I'm not going to share my feelings about that.

Cheaper, but is it more effective?

I know I saw something about the Next.js devs experimenting with just dumping an entire index of doc files into AGENTS.md and it being used significantly more by Claude than any skills/tool call stuff.

A lot of providers already have native CLI tools with usually better auth support and longer sessions than MCP as well as more data in their training set on how to use those cli tools for many things. So why convert mcp->cli tool instead of using the existing cli tools in the first place? Using the atlassian MCP is dog shit for example, but using acli is great. Same for github, aws, etc.

I had deepseek explain MCP to me. Then I asked what was the point of persistent connections and it said it was pretty much hipster bullshit and that some url to post to is really enough for an llm to interact with things.

The article's link to clihub.sh is broken. Looks like https://clihub.org/ is the correct link? I've added that to the toptext as well.

Edit: took out because I think that was something different.

I can give example.

LLM only know `linear` tool exists.

I ask "get me the comments in the last issue"

Next call LLM does is

`linear --help 2>&1 | grep -i -E "search|list.issue|get.issue")` then `linear list-issues --raw '{"limit": 3}' -o json 2>&1 | head -80)` then `linear list-comments --issue-id "abc1ceae-aaaa-bbbb-9aaa-6bef0325ebd0" 2>&1)`

So even the --help has filtering by default. Current models are pretty good

MCP has some schemas though. CLI is a bit of a mess.

But MCP today isn’t ideal. I think we need to have some catalogs where the agents can fetch more information about MCP services instead of filling the context with not relevant noise.

Cheaper, but is it more effective?

I can give example.

LLM only know `linear` tool exists.

I ask "get me the comments in the last issue"

Next call LLM does is

So even the --help has filtering by default. Current models are pretty good

It's the same from functionality perspective. The schema's are converted to CLI versions of it. It's a UI change more than anything.

personal experience, definitely yes. You can try it out with `gh` rather than `Github MCP`. You'll see the difference immediately (espicially more if you have many MCPs)

It's the same from functionality perspective. The schema's are converted to CLI versions of it. It's a UI change more than anything.

personal experience, definitely yes. You can try it out with `gh` rather than `Github MCP`. You'll see the difference immediately (espicially more if you have many MCPs)

There is some important context missing from the article.

If you want to use MCP in your agent, you need to bring in the MCP model and all of its baggage which is a lot. You need to handle oauth, handle tool loading and selection, reloading, etc.

The simpler solution is to have a single MCP server handling all of the things at system level and then have a tiny CLI that can call into the tools.

This solves the context problem in a very elegant way in my opinion.

Hehe... nice one. I think we are all thinking the same thing.

I've also launched https://mcpshim.dev (https://github.com/mcpshim/mcpshim).

The unix way is the best way.

Nice!

Compared both

---

TL;DR CLIHUB compiles MCP servers into portable, self-contained binaries — think of it like a compiler. Best for distribution, CI, and environments where you can't run a daemon.

mcpshim is a runtime bridge — think of it like a local proxy. Best for developers juggling many MCP servers locally, especially when paired with LLM agents that benefit from persistent connections and lightweight aliases.

---

https://cdn.zappy.app/b908e63a442179801e406b01cf412433.png (table comparison)

---

Is this article from a while back?

Because this simply isn't true anymore for the best clients, like Claude Code.

Similar to how Skills were designed[1] to be searchable without dumping everything into context, MCP tools can (and does in Claude Code) work the same way.

See https://www.anthropic.com/engineering/advanced-tool-use and https://x.com/trq212/status/2011523109871108570 and https://platform.claude.com/docs/en/agents-and-tools/tool-us...

[1] https://agentskills.io/specification#progressive-disclosure

This looks related to Awesome CLIs/TUIs and terminal trove which has lots both CLI and TUI apps.

Awesome TUIs: https://github.com/rothgar/awesome-tuis

Awesome CLIs: https://github.com/agarrharr/awesome-cli-apps

Terminal Trove: https://terminaltrove.com/

I guess this is another one shows that the CLI and Unix is coming back in 2026.

Not just cheaper in terms of token usage but accuracy as well.

Even the smallest models are RL trained to use shell commands perfectly. Gemini 3 flash performs better with a cli with 20 commands vs 20+ tools in my testing.

https://github.com/runablehq/mini-browser

I like this approach ... BUT the big win for me is audit logs. CLIs naturally leave a trail you can replay.

ALSO... the permission boundary is clearer. You can whitelist commands, flags, working dir... it becomes manageable.

HOWEVER... packaging still matters. A “small” CLI that pulls in a giant runtime kills the benefit.

I want the discipline of small protocol plus big cache. Cheap models can summarize what they did and avoid full context in every step...

FYI the blog has direct comparison to Anthropic’s Tool Search.

Regardless, most MCPs are dumping. I know Cloudflare MCP is amazing but other 1000 useful MCPs are not.

I actually want to combine this and CLIHub into a directory where someone can download all the official MCPs or CLIs (or MCP to CLIs) with a single command

True for coding agents running SotA models where you're the human-in-the-loop approving, less true for your deployed agents running on cheap models that you don't see what's being executed.

But yeah, a concrete example is playwright-mcp vs playwright-cli: https://testcollab.com/blog/playwright-cli

This is cool!

I was actually thinking if I should support daemons just to support playwright. Now I don't have a use case for it

This sounds similar to MCPorter[0], can anyone point out the differences?

[0] https://github.com/steipete/mcporter

Main differences are

CLIHub

- written in go

- zero-dependency binaries

- cross-compilation built-in (works on all platforms)

- supports OAuth2 w/ PKCE, S2S, Google SA, API key, basic, bearer. Can be extended further

MCPorter

- TS

- huge dependency list

- runtime dependency on bun

- Auth supports OAuth + basic token

- Has many features like SDK, daemons (for certain MCPs), auto config discovery etc.

MCPorter is more complete tbh. Has many nice to have features for advanced use cases.

My use case is simple. Does it generate a CLI that works? Mainly oauth is the blocker since that logic needs to be custom implemented to the CLI.

Why are they using JSON in the context? I thought we'd figured out that the extra syntax was a waste of tokens?

Even before LLM's developers have been writing GUI applications as basically a CLI + GUI for testability, separation of concerns etc. Hopefully that will become more common.

Also this article was obviously AI generated. I'm not going to share my feelings about that.

Ofc it is written by ai, I have a skill for it -

https://github.com/thellimist/thellimist.github.io/blob/mast...

I dump a voice message, then blog comes out. Then I modify a bunch of things, and iterate 1-2 hours to get it right

Can LLMs compress those documents into smaller files that still retain the full context?

You just reinvented Skills

I don't prefer to use online skills where half has malware

Official MCPs are trusted. Official MCPs CLIs are trusted.

Did he? Skills are for CLIs, not for converting MCPs into CLIs.

The article's link to clihub.sh is broken. Looks like https://clihub.org/ is the correct link? I've added that to the toptext as well.

Edit: took out because I think that was something different.

Good catch.

I didn't release the website yet. I'll remove the link

Nice!

Compared both

---

TL;DR CLIHUB compiles MCP servers into portable, self-contained binaries — think of it like a compiler. Best for distribution, CI, and environments where you can't run a daemon.

---

https://cdn.zappy.app/b908e63a442179801e406b01cf412433.png (table comparison)

---

Nice. Love it.

One important aspect of mcpshim which you might want to bring into clihub is the history idea. Imagine if the model wants to know what it did couple of days ago. It will be nice to have an answer for that if you record the tool calls in a file and then allow the agent to query the file.

I was happy with playwright like MCPs that require the daemon so didn't convert them to CLIs.

My use cases are almost all 3rd party integrations.

Have you seen any improvements converting on MCPs that require persistency into CLI?

FYI the blog has direct comparison to Anthropic’s Tool Search.

Regardless, most MCPs are dumping. I know Cloudflare MCP is amazing but other 1000 useful MCPs are not.

I actually want to combine this and CLIHub into a directory where someone can download all the official MCPs or CLIs (or MCP to CLIs) with a single command

This is cool!

I was actually thinking if I should support daemons just to support playwright. Now I don't have a use case for it

Main differences are

CLIHub

- written in go

- zero-dependency binaries

- cross-compilation built-in (works on all platforms)

- supports OAuth2 w/ PKCE, S2S, Google SA, API key, basic, bearer. Can be extended further

MCPorter

- TS

- huge dependency list

- runtime dependency on bun

- Auth supports OAuth + basic token

- Has many features like SDK, daemons (for certain MCPs), auto config discovery etc.

MCPorter is more complete tbh. Has many nice to have features for advanced use cases.

My use case is simple. Does it generate a CLI that works? Mainly oauth is the blocker since that logic needs to be custom implemented to the CLI.

Ofc it is written by ai, I have a skill for it -

https://github.com/thellimist/thellimist.github.io/blob/mast...

I dump a voice message, then blog comes out. Then I modify a bunch of things, and iterate 1-2 hours to get it right

Might need to iterate on them more because it's still quite obviously machine written, and a lot of people find it disrespectful to read content that was LLM generated.

Every AI agent using MCP is quietly overpaying. Not on the API calls themselves - those are fine. The tax is on the instruction manual.

Before your agent can do anything useful, it needs to know what tools are available. MCP’s answer is to dump the entire tool catalog into the conversation as JSON Schema. Every tool, every parameter, every option.

CLI does the same job but cheaper.

Same tools, different packaging

I took an MCP server and generated a CLI from it using CLIHub. Same tools, same OAuth, same API underneath. Two things change: what loads at session start, and how the agent calls a tool.

The numbers below assume a typical setup: 6 MCP servers, 14 tools each, 84 tools total.

1. Session start

MCP dumps every tool schema into the conversation upfront. CLI uses a lightweight skill listing - just names and locations. The agent discovers details when it needs them.1

MCP loads this (~185 tokens * 84 = 15540):

{
  "name": "notion-search",
  "description": "Search for pages and databases",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query text"
      },
      "filter": {
        "type": "object",
        "properties": {
          "property": { "type": "string", "enum": ["object"] },
          "value": { "type": "string", "enum": ["page", "database"] }
        }
      }
    }
  },
  {
  "name": "notion-fetch",
  ...
  }
  ... (84 tools total)
}

CLI loads this (~50 tokens * 6 = 300):

<available_tools>
  <tool>
    <name>notion</name>
    <description>CLI for Notion</description>
    <location>~/bin/notion</location>
  </tool>
  <tool>
    <name>linear</name>
    ...
  </tool>
  ... (6 tools total)
</available_tools>

2. Tool call

Once the agent knows what’s available, it still needs to call a tool.

MCP tool call (~30 tokens):

{
  "tool_call": {
    "name": "notion-search",
    "arguments": {
      "query": "my search"
    }
  }
}

CLI tool call (~610 tokens):

# Step 1: Discover tools (~4 + ~600 tokens)

$ notion --help
notion search <query> [--filter-property ...]
  Search for pages and databases
notion create-page <title> [--parent-id ID]
  Create a new page
... 12 more tools

------------------------------------------------

# Step 2: Execute (~6 tokens)

$ notion search "my search"

MCP’s call is cheaper because definitions are pre-loaded. CLI pays at discovery time - --help returns the full command reference (~600 tokens for 14 tools), then the agent knows what to execute.

Tools used	MCP	CLI	Savings
Session start	~15,540	~300	98%
1 tool	~15,570	~910	94%
10 tools	~15,840	~964	94%
100 tools	~18,540	~1,504	92%

CLI uses ~94% fewer tokens overall.

Anthropic’s Tool Search

Anthropic launched Tool Search which loads a search index instead of every schema then uses fetch tools on demand. It typically drops token usage by 85%.

Same idea as CLI’s lazy loading. But when Tool Search fetches a tool, it still pulls the full JSON Schema.2

Tools used	MCP	TS	CLI	Savings vs TS
Session start	~15,540	~500	~300	40%
1 tool	~15,570	~3,530	~910	74%
10 tools	~15,840	~3,800	~964	75%
100 tools	~18,540	~12,500	~1,504	88%

Tool Search is more expensive, and it’s Anthropic-only. CLI is cheaper and works with any model.

CLIHub

I struggled finding CLIs for many tools so built CLIHub a directory of CLIs for agent use.

Open sourced the converter - one command to create CLIs from MCPs.

I like using formatting of Openclaw's available_skills block for CLI. It can be modified to other formats.
Tool Search: ~500 session start + ~3K per search (loads 3-5 tools) + ~30 per call. Assumes 1 search for 1-10 calls, 3 searches for 100.

I don't prefer to use online skills where half has malware

Official MCPs are trusted. Official MCPs CLIs are trusted.

Did he? Skills are for CLIs, not for converting MCPs into CLIs.

Good catch.

I didn't release the website yet. I'll remove the link

Nice. Love it.

I was happy with playwright like MCPs that require the daemon so didn't convert them to CLIs.

My use cases are almost all 3rd party integrations.

Have you seen any improvements converting on MCPs that require persistency into CLI?

The article says the LLM has to load 15540 tokens every time, I wonder if that can be reduced while retaining the context maybe with deduplications, removing superfluous words, using shorter expressions with the same meaning or things like that.

Might need to iterate on them more because it's still quite obviously machine written, and a lot of people find it disrespectful to read content that was LLM generated.

Hacker Times

Hacker Times

Making MCP cheaper via CLI

Discussion

Discussion

Same tools, different packaging

1. Session start

2. Tool call

Anthropic’s Tool Search

CLIHub