> Instead of a bloated API, an MCP should be a simple, secure gateway that provides a few powerful, high-level tools [...] In this model, MCP’s job isn’t to abstract reality for the agent; its job is to manage the auth, networking, and security boundaries and then get out of the way.
Em dash and "it's not X, it's Y" in one sentence. Tired of reading posts written by AI. Feels disrespectful to your readers
I've found myself doing similar workarounds. I'm guessing anthropic will just make the /compact command do this instead soon enough.
Most of the time I'm just pasting code blocks directly into Raycast, and once I've fixed the bug or got the transformed code into the shape I was aiming for, I paste it back into neovim. Next I'm going to try out "opencode"[0], because I've heard some good things about it. For now, I'm happy with my current workflow.
read the document at https://blog.sshh.io/p/how-i-use-every-claude-code-feature and tell me how to improve my Claude code setup
Are the CLI-based agents better (much better?) than the Cursor app? Why?
I like how easy it is to get Cursor to focus a particular piece of code. I select the text and Cmd-L, saying "fix this part, it's broken like this ____."
I haven't really tried a CLI agent; sending snippets of code by CLI sounds really annoying. "Fix login.ts lines 148-160, it's broken like this ___"
Just to avoid confusion: MCP is like an API, but the underlying API can execute a Skill. So it's not MCP vs. Skills as a contest. It's really the broad concept of a "flexible" skill vs. a "parameter"-based API. And parameter-based APIs can also be flexible depending on how we write them, except they lack the SKILL.md that, in the case of Skills, guides the LLM to be more generic than a pure API.
By the way, if you are a Mac user, you can execute Skills locally via OpenSkills[1], which I created using Apple containers.
1. OpenSkills - https://github.com/BandarLabs/open-skills
I have started experimenting with a skills/ directory in my open source software, and then made a plugin marketplace that just pulls them in. It works well, but I don't know how scalable it will be.
If no anonymous access is provided, is there a way to create an account with a noscript/basic (X)HTML classic web browser in order to get an API key secret?
Because I do not use web engines from the "WHATWG" cartel.
To add insult to injury, my email is self-hosted with IP literals to avoid funding the DNS people, who are now mostly in strong partnership with the "WHATWG" cartel (email with IP literals is "stronger" than SPF since it does the same and more). An email is often required for account registration.
But in general I still don't really use MCP. Agents are just so good at solving problems themselves. I wish MCP would mostly focus on the auth part instead of the tool part. Getting an agent access to an API with credentials usually gives it enough power to solve problems on its own.
[1]: https://x.com/mitsuhiko/status/1984756813850374578?s=46
The people who just copy paste output from ai and ship it as a blog post however, deserve significant condemnation for that.
Didn’t realize you were forced to read this?
> Feels disrespectful to your readers
I didn’t feel disrespected—I felt so respected I read the whole thing.
Really, the interface isn't a meaningful part of it. I also like Cmd-L, but Claude just does better at writing code.
...also, it's nice that Anthropic is just focusing on making cool stuff (like Skills), while the folks from Cursor are... I dunno. Whatever it is they're doing with Cursor 2.0 :shrug:
My concern with hardcoding paths inside a doc is that they will likely become outdated as the codebase evolves.
One solution would be to script it and have it run pre-commit to regenerate the CLAUDE.md with the new paths.
There's probably potential for even more dev tooling that 1. ensures reference paths are always correct, and 2. enforces standards for how references are documented in CLAUDE.md (and lints things like length).
Perhaps using some kind of inline documentation standard like JSDoc if it's a TS file, or a naming convention if it's an .md file.
Example:
// @claude.md // For complex … usage or if you encounter a FooBarError, see ${path} for advanced troubleshooting steps
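A rough sketch of that pre-commit idea, assuming a hypothetical `@claude.md` annotation convention and a `<!-- auto-refs -->` marker line in CLAUDE.md (all names and paths illustrative):

```bash
#!/usr/bin/env bash
# Hypothetical pre-commit step: rebuild the auto-generated "References" section of
# CLAUDE.md from "@claude.md" annotations in source files so paths never go stale.
# Assumes CLAUDE.md contains a "<!-- auto-refs -->" marker line; everything below
# it is regenerated on each commit.
set -euo pipefail

# Keep everything up to and including the marker, dropping the old generated section.
sed -i.bak '/<!-- auto-refs -->/q' CLAUDE.md

# Append one bullet per annotation, pointing at the file that contains it.
{ grep -rn "@claude.md" --include="*.ts" --include="*.md" src/ docs/ || true; } \
  | while IFS=: read -r file _line text; do
      note="${text#*@claude.md}"
      echo "- ${note# } (see ${file})" >> CLAUDE.md
    done
```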
I researched this the other day, the recommended (by Anthropic) way to do this is to have a CLAUDE.md with a single line in it:
@AGENTS.md
Then keep your actual content in the other file: https://docs.claude.com/en/docs/claude-code/claude-code-on-t...

Right now these are reading like a guide to Prolog in the 1980s.
This feels like a false economy to me for real sized changes, but maybe I’m just a weak code reviewer. For code I really don’t care about, I’m happy to do this, but if I ever need to understand that code I have an uphill battle. OTOH reading intermediate diffs and treating the process like actual pair programming has worked well for me, left me with changes I’m happy with, and codebases I understand well enough to debug.
I recommend using it directly instead of via the plugin
There is no customer advantage to developing cheap and fast if the delivered product isn't well conceived from a current and future customer-needs perspective, and a quickly shipped product full of bugs isn't going to help anyone.
I think the same goes for AI in general - CEOs are salivating over adopting "AI" (which people like Altman and Amodei are telling them will be human level tomorrow, or yesterday in the case of Amodei), and using it to reduce employee head count, but the technology is nowhere near the human level needed to actually benefit customers. An "AI" (i.e. LLM) customer service agent/chatbot is just going to piss off customers.
Latest version from 2 months ago, >4700 stars on GitHub
I use Claude Code. A lot.
As a hobbyist, I run it in a VM several times a week on side projects, often with --dangerously-skip-permissions to vibe code whatever idea is on my mind. Professionally, part of my team builds the AI-IDE rules and tooling for our engineering team that consumes several billion tokens per month just for codegen.
The CLI agent space is getting crowded and between Claude Code, Gemini CLI, Cursor, and Codex CLI, it feels like the real race is between Anthropic and OpenAI. But TBH when I talk to other developers, their choice often comes down to superficials—a "lucky" feature implementation or a system prompt "vibe" they just prefer. At this point these tools are all pretty good. I also feel like folks often over-index on the output style or UI. Like, to me the "you're absolutely right!" sycophancy isn't a notable bug; it's a signal that you're too in-the-loop. Generally my goal is to "shoot and forget": delegate, set the context, let it work, and judge the tool by the final PR rather than how it gets there.
Having stuck with Claude Code for the last few months, this post is my set of reflections on Claude Code's entire ecosystem. We'll cover nearly every feature I use (and, just as importantly, the ones I don't), from the foundational CLAUDE.md file and custom slash commands to the powerful world of Subagents, Hooks, and GitHub Actions. This post ended up a bit long, so I'd recommend treating it as a reference rather than something to read in its entirety.
The single most important file in your codebase for using Claude Code effectively is the root CLAUDE.md. This file is the agent’s “constitution,” its primary source of truth for how your specific repository works.
How you treat this file depends on the context. For my hobby projects, I let Claude dump whatever it wants in there.
For my professional work, our monorepo’s CLAUDE.md is strictly maintained and currently sits at 13KB (I could easily see it growing to 25KB).
It only documents tools and APIs used by 30% (arbitrary) or more of our engineers; otherwise, tools are documented in product- or library-specific markdown files.
We’ve even started allocating effectively a max token count for each internal tool’s documentation, almost like selling “ad space” to teams. If you can’t explain your tool concisely, it’s not ready for the CLAUDE.md.
Over time, we’ve developed a strong, opinionated philosophy for writing an effective CLAUDE.md.
Start with Guardrails, Not a Manual. Your CLAUDE.md should start small, documenting based on what Claude is getting wrong.
Don’t @-File Docs. If you have extensive documentation elsewhere, it’s tempting to @-mention those files in your CLAUDE.md. This bloats the context window by embedding the entire file on every run. But if you just mention the path, Claude will often ignore it. You have to pitch the agent on why and when to read the file. “For complex … usage or if you encounter a FooBarError, see path/to/docs.md for advanced troubleshooting steps.”
Don’t Just Say “Never.” Avoid negative-only constraints like “Never use the --foo-bar flag.” The agent will get stuck when it thinks it must use that flag. Always provide an alternative.
Use CLAUDE.md as a Forcing Function. If your CLI commands are complex and verbose, don’t write paragraphs of documentation to explain them. That’s patching a human problem. Instead, write a simple bash wrapper with a clear, intuitive API and document that. Keeping your CLAUDE.md as short as possible is a fantastic forcing function for simplifying your codebase and internal tooling.
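As an example of the kind of wrapper I mean (hypothetical script; the command and flags are placeholders), instead of documenting a verbose test invocation, ship a one-liner and document that:

```bash
#!/usr/bin/env bash
# tools/test.sh (hypothetical): hide the verbose test invocation so CLAUDE.md only
# needs one line: "Test with tools/test.sh [path]".
set -euo pipefail
exec pytest "${1:-.}" -q --maxfail=5 --disable-warnings
```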
Here’s a simplified snapshot:
# Monorepo
## Python
- Always ...
- Test with <command>
... 10 more ...
## <Internal CLI Tool>
... 10 bullets, focused on the 80% of use cases ...
- <usage example>
- Always ...
- Never <x>, prefer <Y>
For <complex usage> or <error> see path/to/<tool>_docs.md
...
Finally, we keep this file synced with an AGENTS.md file to maintain compatibility with other AI IDEs that our engineers might be using.
If you are looking for more tips for writing markdown for coding agents see “AI Can’t Read Your Docs”, “AI-powered Software Engineering”, and “How Cursor (AI IDE) Works”.
The Takeaway: Treat your CLAUDE.md as a high-level, curated set of guardrails and pointers. Use it to guide where you need to invest in more AI (and human) friendly tools, rather than trying to make it a comprehensive manual.
I recommend running /context mid coding session at least once to understand how you are using your 200k token context window (even with Sonnet-1M, I don’t trust that the full context window is actually used effectively). For us a fresh session in our monorepo costs a baseline ~20k tokens (10%) with the remaining 180k for making your change — which can fill up quite fast.
A screenshot of /context in one of my recent side projects. You can almost think of this like disk space that fills up as you work on a feature. After a few minutes or hours you’ll need to clear the messages (purple) to make space to continue.
I have three main workflows:
/compact (Avoid): I avoid this as much as possible. The automatic compaction is opaque, error-prone, and not well-optimized.
/clear + /catchup (Simple Restart): My default reboot. I /clear the state, then run a custom /catchup command to make Claude read all changed files in my git branch.
“Document & Clear” (Complex Restart): For large tasks. I have Claude dump its plan and progress into a .md, /clear the state, then start a new session by telling it to read the .md and continue.
The Takeaway: Don’t trust auto-compaction. Use /clear for simple reboots and the “Document & Clear” method to create durable, external “memory” for complex tasks.
I think of slash commands as simple shortcuts for frequently used prompts, nothing more. My setup is minimal:
/catchup: The command I mentioned earlier. It just prompts Claude to read all changed files in my current git branch (sketched below).
/pr: A simple helper to clean up my code, stage it, and prepare a pull request.
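Under the hood a custom slash command is just a markdown prompt file in `.claude/commands/`. A minimal sketch of what my /catchup could look like (wording illustrative; it assumes `origin/main` is your default branch):

```md
<!-- .claude/commands/catchup.md -->
Read every file that changed on this branch, then summarize the work in progress.

1. Run `git diff --name-only origin/main...HEAD` to list the changed files.
2. Read each file in that list.
3. Summarize what the branch is trying to accomplish and flag anything that looks unfinished.
```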
IMHO if you have a long list of complex, custom slash commands, you’ve created an anti-pattern. To me the entire point of an agent like Claude is that you can type almost whatever you want and get a useful, mergable result. The moment you force an engineer (or non-engineer) to learn a new, documented-somewhere list of essential magic commands just to get work done, you’ve failed.
The Takeaway: Use slash commands as simple, personal shortcuts, not as a replacement for building a more intuitive CLAUDE.md and better-tooled agent.
On paper, custom subagents are Claude Code’s most powerful feature for context management. The pitch is simple: a complex task requires X tokens of input context (e.g., how to run tests), accumulates Y tokens of working context, and produces a Z token answer. Running N tasks means (X + Y + Z) * N tokens in your main window.
The subagent solution is to farm out the (X + Y) * N work to specialized agents, which only return the final Z token answers, keeping your main context clean.
I find they're a powerful idea, but in practice custom subagents create two new problems:
They Gatekeep Context: If I make a PythonTests subagent, I’ve now hidden all testing context from my main agent. It can no longer reason holistically about a change. It’s now forced to invoke the subagent just to know how to validate its own code.
They Force Human Workflows: Worse, they force Claude into a rigid, human-defined workflow. I’m now dictating how it must delegate, which is the very problem I’m trying to get the agent to solve for me.
My preferred alternative is to use Claude’s built-in Task(...) feature to spawn clones of the general agent.
I put all my key context in the CLAUDE.md. Then, I let the main agent decide when and how to delegate work to copies of itself. This gives me all the context-saving benefits of subagents without the drawbacks. The agent manages its own orchestration dynamically.
In my “Building Multi-Agent Systems (Part 2)” post, I called this the “Master-Clone” architecture, and I strongly prefer it over the “Lead-Specialist” model that custom subagents encourage.
The Takeaway: Custom subagents are a brittle solution. Give your main agent the context (in CLAUDE.md) and let it use its own Task/Explore(...) feature to manage delegation.
On a simple level, I use claude --resume and claude --continue frequently. They’re great for restarting a bugged terminal or quickly rebooting an older session. I’ll often claude --resume a session from days ago just to ask the agent to summarize how it overcame a specific error, which I then use to improve our CLAUDE.md and internal tooling.
More in the weeds: Claude Code stores all session history in ~/.claude/projects/, which you can tap into for the raw historical session data. I have scripts that run meta-analysis on these logs, looking for common exceptions, permission requests, and error patterns to help improve agent-facing context.
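As a rough illustration (assuming the session logs are the JSONL files Claude Code writes under ~/.claude/projects/), even a shell one-liner can surface recurring failure modes:

```bash
# Count the most common error-ish strings across recent session logs
# (assumes the logs are the JSONL files under ~/.claude/projects/).
grep -rhoiE '"(error|exception|permission denied)[^"]{0,80}"' \
  --include='*.jsonl' ~/.claude/projects/ \
  | sort | uniq -c | sort -rn | head -20
```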
The Takeaway: Use claude --resume and claude --continue to restart sessions and uncover buried historical context.
Hooks are huge. I don’t use them for hobby projects, but they are critical for steering Claude in a complex enterprise repo. They are the deterministic “must-do” rules that complement the “should-do” suggestions in CLAUDE.md.
We use two types:
Block-at-Submit Hooks: This is our primary strategy. We have a PreToolUse hook that wraps any Bash(git commit) command. It checks for a /tmp/agent-pre-commit-pass file, which our test script only creates if all tests pass. If the file is missing, the hook blocks the commit, forcing Claude into a “test-and-fix” loop until the build is green.
Hint Hooks: These are simple, non-blocking hooks that provide “fire-and-forget” feedback if the agent is doing something suboptimal.
We intentionally do not use “block-at-write” hooks (e.g., on Edit or Write). Blocking an agent mid-plan confuses or even “frustrates” it. It’s far more effective to let it finish its work and then check the final, completed result at the commit stage.
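Here's a minimal sketch of the block-at-submit script (the filename and gate file are ours/hypothetical; it assumes the documented hook contract where a PreToolUse hook receives the tool call as JSON on stdin and a non-zero "blocking" exit code feeds stderr back to the agent):

```bash
#!/usr/bin/env bash
# check-commit-gate.sh (hypothetical): registered as a PreToolUse hook that
# matches Bash tool calls. Blocks `git commit` unless the test script has already
# created the pass file.
input="$(cat)"   # the pending tool call arrives as JSON on stdin
if grep -q '"command"[^}]*git commit' <<< "$input"; then
  if [[ ! -f /tmp/agent-pre-commit-pass ]]; then
    echo "Tests have not passed yet. Run the test script until it writes /tmp/agent-pre-commit-pass, then retry the commit." >&2
    exit 2   # exit code 2 blocks the tool call and surfaces stderr to Claude
  fi
fi
exit 0       # everything else passes through
```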
The Takeaway: Use hooks to enforce state validation at commit time (block-at-submit). Avoid blocking at write time—let the agent finish its plan, then check the final result.
Planning is essential for any “large” feature change with an AI IDE.
For my hobby projects, I exclusively use the built-in planning mode. It’s a way to align with Claude before it starts, defining both how to build something and the “inspection checkpoints” where it needs to stop and show me its work. Using this regularly builds a strong intuition for what minimal context is needed to get a good plan without Claude botching the implementation.
In our work monorepo, we've started rolling out a custom planning tool built on the Claude Code SDK. It's similar to native plan mode but heavily prompted to align its outputs with our existing technical design format. It also enforces our internal best practices—from code structure to data privacy and security—out of the box. This lets our engineers "vibe plan" a new feature as if they were a senior architect (or at least that's the pitch).
The Takeaway: Always use the built-in planning mode for complex changes to align on a plan before the agent starts working.
I agree with Simon Willison's take: Skills are (maybe) a bigger deal than MCP.
If you’ve been following my posts, you’ll know I’ve drifted away from MCP for most dev workflows, preferring to build simple CLIs instead (as I argued in “AI Can’t Read Your Docs”). My mental model for agent autonomy has evolved into three stages:
Single Prompt: Giving the agent all context in one massive prompt. (Brittle, doesn’t scale).
Tool Calling: The “classic” agent model. We hand-craft tools and abstract away reality for the agent. (Better, but creates new abstractions and context bottlenecks).
Scripting: We give the agent access to the raw environment—binaries, scripts, and docs—and it writes code on the fly to interact with them.
With this model in mind, Agent Skills are the obvious next feature. They are the formal productization of the “Scripting” layer.
If, like me, you’ve already been favoring CLIs over MCP, you’ve been implicitly getting the benefit of Skills all along. The SKILL.md file is just a more organized, shareable, and discoverable way to document these CLIs and scripts and expose them to the agent.
The Takeaway: Skills are the right abstraction. They formalize the “scripting”-based agent model, which is more robust and flexible than the rigid, API-like model that MCP represents.
Skills don’t mean MCP is dead (see also “Everything Wrong with MCP”). Previously, many built awful, context-heavy MCPs with dozens of tools that just mirrored a REST API (read_thing_a(), read_thing_b(), update_thing_c()).
The “Scripting” model (now formalized by Skills) is better, but it needs a secure way to access the environment. This to me is the new, more focused role for MCP.
Instead of a bloated API, an MCP should be a simple, secure gateway that provides a few powerful, high-level tools:
download_raw_data(filters…)
take_sensitive_gated_action(args…)
execute_code_in_environment_with_state(code…)
In this model, MCP’s job isn’t to abstract reality for the agent; its job is to manage the auth, networking, and security boundaries and then get out of the way. It provides the entry point for the agent, which then uses its scripting and markdown context to do the actual work.
The only MCP I still use is for Playwright, which makes sense—it’s a complex, stateful environment. All my stateless tools (like Jira, AWS, GitHub) have been migrated to simple CLIs.
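For reference, registering that one MCP server looks something like the following (syntax from memory; check `claude mcp --help` and the Playwright MCP README for the current form):

```bash
# Register the Playwright MCP server for this project (stdio transport).
claude mcp add playwright -- npx @playwright/mcp@latest
```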
The Takeaway: Use MCPs that act as data gateways. Give the agent one or two high-level tools (like a raw data dump API) that it can then script against.
Claude Code isn’t just an interactive CLI; it’s also a powerful SDK for building entirely new agents—for both coding and non-coding tasks. I’ve started using it as my default agent framework over tools like LangChain/CrewAI for most new hobby projects.
I use it in three main ways:
Massive Parallel Scripting: For large-scale refactors, bug fixes, or migrations, I don't use the interactive chat. I write simple bash scripts that call claude -p "in /pathA change all refs from foo to bar" in parallel (see the sketch after this list). This is far more scalable and controllable than trying to get the main agent to manage dozens of subagent tasks.
Building Internal Chat Tools: The SDK is perfect for wrapping complex processes in a simple chat interface for non-technical users. Like an installer that, on error, falls back to the Claude Code SDK to just fix the problem for the user. Or an in-house “v0-at-home” tool that lets our design team vibe-code mock frontends in our in-house UI framework, ensuring their ideas are high-fidelity and the code is more directly usable in frontend production code.
Rapid Agent Prototyping: This is my most common use. It’s not just for coding. If I have an idea for any agentic task (e.g., a “threat investigation agent” that uses custom CLIs or MCPs), I use the Claude Code SDK to quickly build and test the prototype before committing to a full, deployed scaffolding.
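A sketch of that "massive parallel scripting" pattern (directories, prompt, and log paths are illustrative; permission flags omitted):

```bash
#!/usr/bin/env bash
# Fan out headless `claude -p` runs across independent packages, then wait.
set -euo pipefail
for d in services/auth services/billing services/search; do
  (
    cd "$d"
    claude -p "change all refs from foo to bar in this package and fix any test failures" \
      > "/tmp/claude-$(basename "$d").log" 2>&1
  ) &
done
wait   # review the logs and diffs once every run has finished
```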
The Takeaway: The Claude Code SDK is a powerful, general-purpose agent framework. Use it for batch-processing code, building internal tools, and rapidly prototyping new agents before you reach for more complex frameworks.
The Claude Code GitHub Action (GHA) is probably one of my favorite and most slept-on features. It's a simple concept: just run Claude Code in a GHA. But this simplicity is what makes it so powerful.
It’s similar to Cursor’s background agents or the Codex managed web UI but is far more customizable. You control the entire container and environment, giving you more access to data and, crucially, much stronger sandboxing and audit controls than any other product provides. Plus, it supports all the advanced features like Hooks and MCP.
We’ve used it to build custom “PR-from-anywhere” tooling. Users can trigger a PR from Slack, Jira, or even a CloudWatch alert, and the GHA will fix the bug or add the feature and return a fully tested PR.¹
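For orientation, a stripped-down trigger workflow might look roughly like this (a sketch under assumptions: the anthropics/claude-code-action and its anthropic_api_key input are what I recall; check the action's README for its current inputs and triggers):

```yaml
# .github/workflows/claude.yml (illustrative)
name: claude
on:
  issue_comment:
    types: [created]
jobs:
  claude:
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
      issues: read
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```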
Since the GHA logs are the full agent logs, we have an ops process to regularly review these logs at a company level for common mistakes, bash errors, or unaligned engineering practices. This creates a data-driven flywheel: Bugs -> Improved CLAUDE.md / CLIs -> Better Agent.
$ query-claude-gha-logs --since 5d | claude -p "see what the other claudes were getting stuck on and fix it, then put up a PR"
The Takeaway: The GHA is the ultimate way to operationalize Claude Code. It turns it from a personal tool into a core, auditable, and self-improving part of your engineering system.
Finally, I have a few specific settings.json configurations that I’ve found essential for both hobby and professional work.
HTTPS_PROXY/HTTP_PROXY: This is great for debugging. I’ll use it to inspect the raw traffic to see exactly what prompts Claude is sending. For background agents, it’s also a powerful tool for fine-grained network sandboxing.
MCP_TOOL_TIMEOUT/BASH_MAX_TIMEOUT_MS: I bump these. I like running long, complex commands, and the default timeouts are often too conservative. I’m honestly not sure if this is still needed now that bash background tasks are a thing, but I keep it just in case.
ANTHROPIC_API_KEY: At work, we use our enterprise API keys (via apiKeyHelper). It shifts us from a “per-seat” license to “usage-based” pricing, which is a much better model for how we work.
It accounts for the massive variance in developer usage (We’ve seen 1:100x differences between engineers).
It lets engineers tinker with non-Claude-Code LLM scripts, all under our single enterprise account.
“permissions”: I’ll occasionally self-audit the list of commands I’ve allowed Claude to auto-run.
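Putting those together, a minimal sketch of what the file can look like (values are placeholders; the env, apiKeyHelper, and permissions keys are the documented settings.json fields, but double-check against the current docs):

```json
{
  "env": {
    "HTTPS_PROXY": "http://127.0.0.1:8080",
    "MCP_TOOL_TIMEOUT": "600000",
    "BASH_MAX_TIMEOUT_MS": "600000"
  },
  "apiKeyHelper": "~/.claude/get-enterprise-api-key.sh",
  "permissions": {
    "allow": ["Bash(git status)", "Bash(npm run test:*)"]
  }
}
```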
The Takeaway: Your settings.json is a powerful place for advanced customization.
That was a lot, but hopefully, you find it useful. If you’re not already using a CLI-based agent like Claude Code or Codex CLI, you probably should be. There are rarely good guides for these advanced features, so the only way to learn is to dive in.
To me, a fairly interesting philosophical question is how many reviewers should a PR get that was generated directly from a customer request (no internal human prompter)? We’ve settled on 2 human approvals for any AI-initiated PR for now, but it is kind of a weird paradigm shift (for me at least) when it’s no longer a human making something for another human to review.
You’ll also end up dealing with merge conflicts if you haven’t carefully split the work or modularized the code.
I think of it literally as a collection of .md files and scripts to help perform some set of actions. I'm excited for it not really as a "new thing" (as mentioned in the post) but as effectively an endorsement for this pattern of agent-data interaction.
It wasn't possible before for me to do any of this at this kind of scale. Before, getting stuck on a bug could mean hours, days, or maybe even weeks of debugging. I never made the kind of progress I wanted before.
Many of the things I want, do already exist, but are often older, not as efficient or flexible as they could be, or just plain _look_ dated.
But now I can pump out react/shadcn frontends easily, generate apis, and get going relatively quickly. It's still not pure magic. I'm still hitting issues and such, but they are not these demotivating, project-ending, roadblocks anymore.
I can now move at a speed that matches the ideas I have.
I am giving up something to achieve that, by allowing AI to take control so much, but it's a trade that seems worth it.
So if you're building your own agent, this would be a directory of markdown documents with headers that you tell the agent to scan so that it's aware of them, and then if it thinks they could be useful it can choose to read all the instructions into its context? Is it any more than that?
I guess I don't understand how this isn't just RAG with an index you make the agent aware of?
So now you need to get CC to understand _how_ to do that for various tools in a way that's context efficient, because otherwise you're relying on either potentially outdated knowledge that Claude has built in (leading to errors b/c CC doesn't know about recent versions) or chucking the entirety of a man page into your default context (inefficient).
What the Skill files do is then separate the when from the how.
Consider the git cli.
The skill file has a couple of sentences on when to use the git cli and then a much longer section on how it's supposed to be used, and the "how" section isn't loaded until you actually need it.
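To make that split concrete, here's a hypothetical SKILL.md for the git example (the frontmatter format with name and description follows Anthropic's skills spec; the body content is illustrative):

```md
---
name: git-workflow
description: Use when committing, branching, rebasing, or resolving conflicts with the git CLI in this repo.
---
# Git workflow (loaded only when the skill is invoked)

- Branch from `main`; name branches `feat/<ticket>` or `fix/<ticket>`.
- Commit with `git commit -m "<type>: <summary>"`; never pass `--no-verify`.
- Before pushing, rebase on `main` and resolve conflicts locally.
- For anything destructive (`git reset --hard`, force pushes), stop and ask.
```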
I've got skills for stuff like invoking the native screenshot CLI tool on the Mac, for calling a custom shell script that uses the GitHub API to download and pull in screenshots from issues (b/c the CLI doesn't know how to do this), for accessing separate APIs for data, etc.
Our auth, log diving, infra state, etc, is all usable via cli, and it feels pretty good when pointing Claude at it.
Part of it is the snappy, more minimal UX, but also just pure efficacy seems consistently better. Claude does its best work in CC. I'm sure the same is true of Codex.
Codex writes higher quality code, but is slower and less feature rich. I imagine this will change within months. The jury is still out. Exciting times!
It's much easier to review larger changes when you've aligned on a Claude generated plan up front.
I use AI for code, but I never use it for any writing that is for human eyes.
The skills that I use all direct a next action and how to do it. Most of them instruct to use Tasks to isolate context. Some of them provide abstraction specific context (when working with framework code, find all consumers before making changes. add integration tests for the desired state if it’s missing, then run tests to see…) and others just inject only the correct company specific approach to solving only this problem into Task context.
They are composable and you can build the logic table of when an instance is “skilled” enough. I found them worse than hooks with subagents when I started, but now I see them as the coolest thing in Claude code.
The last benefit is nobody on your team even had to know they exist. You can just have them as part of onboarding and everyone can take advantage of what you’ve learned even when working on greenfield projects that don’t have a CLAUDE.md.
You can do anything you want via a CLI but MCP still exists as a standard that folks and platforms might want to adopt as a common interface.
It is why I am a bit puzzled by the people who use an LLM to generate code in anything other than a "tightly scoped" fashion (boilerplate, throwaway code, standalone script, single file, or at the function level). I'm not sure how that makes your job later on any easier if you have an even worse mental model of the code because you didn't even write it. And debugging is almost always more tedious than writing code, so you've traded off the fun/easy part for a more difficult one. Seems like a Faustian deal.
A sibling comment on hooks mentions some approaches. You could also try leveraging the UserPromptSubmit hook to do some prompt analysis and force relevant skill activation.
Skills don't totally deprecate documenting things in CLAUDE.md, but I agree that a lot of these can be defined as skills instead.
Skill frontmatter also still sits in the global context so it's not really a token optimization either.
This is basically a "thinking tax".
If you don't want to think and offload it to an LLM, it burns through a lot of tokens to implement, in an inefficient way, something you could often do in 10 lines if you thought about it for a few minutes.
The agentic part of the equation is improving on both sides all the time.
For example, if you're writing a command line tool in Python, it doesn't really matter what model you use since they're all really great at Python (LOL). However, if you're writing a complicated SPA that uses say, Vue 3 with Vite (and some fancy CSS framework) and Python w/FastAPI... You want the "smartest" model that knows about all these things at once (and regularly gets updated knowledge of the latest versions of things). For me, that means Claude Code.
I am cheap though and only pay Anthropic $20/month. This means I run out of Claude Credits every damned week (haha). To work around this problem, I used to use OpenAI's pay-per-use API with gpt5-mini with VS Code's Copilot extension, switching to GPT5-codex (medium) with the Codex extension for more complicated tasks.
Now that I've got more experience, I've figured out that GPT5-codex costs way too much (in API credits) for what you get in nearly all situations. Seriously: Why TF does it use that much "usage". Anyway...
I've tried them all with my very, very complicated collaborative editor (CRDTs), specifically to learn how to better use AI coding assistants. So here's what I do now:
* Ollama cloud for gpt-oss:120b (it's so fast!)
* Claude Code for everything else.
I cannot overstate how impressed I am with gpt-oss:120b... It's like 10x faster than gpt5-mini and yet seems to perform just as well. Maybe better, actually, because it forces you to narrow your prompts (due to the smaller context window). But because it's just so damned fast, that doesn't matter.

With Claude Code, it's like magic: You give it a really f'ing complicated thing to troubleshoot or implement and it just goes—and keeps going until it finishes or you run out of tokens! It's a "the future is now!" experience for sure.
With gpt-oss:120b it's more like having an actual conversation, where the only time you stop typing is when you're reviewing what it did (which you have to do for all the models... Some more than others).
FYI: The worst is Gemini 2.5. I wouldn't even bother! It's such trash, I can't even fathom how Google is trying to pass it off as anything more than a toy. When it decides to actually run (as opposed to responding with, "Failed... Try again"), it'll either hallucinate things that have absolutely nothing to do with your prompt or it'll behave like some petulant middle school kid that pretends to spend a lot of time thinking about something but ultimately does nothing at all.
If there's enough interest, I might replicate some examples in an open source project.
At the moment, though, I also code on and off with an agent. I'm not ready or willing to only vibe code my projects. For one, I've had tons of examples where the agent gaslighted me, only to turn around at the last stage. And in some cases the code output was too result-focused and didn't consider the broader general usage. And sure, that's in part because I hold it wrong. Don't specify 10 million markdown files, etc. But it's a feedback-loop system. If I don't trust the results, I don't jump in deeper. And I feel a lot of developers have no issue with jumping ever deeper. Write MCPs, now CLIs, and describe projects with custom markdown files. But I think we really need both camps. Otherwise we don't move forward.
If you are using literally any of Claude Code's features the experience isn't close, and regardless of model preference (Claude is my least favorite model by far) you should probably use Claude Code. It's just a much more extensible product for teams.
Before LLMs we simply wouldn't implement many of those features since they were not exactly critical and required a lot of time, but now that the required development time is cut significantly, they suddenly make sense to implement.
> the author clearly read through, organized, and edited the output.
Also worth noting, I've read plenty of human written stuff that has errors in it, so I read everything skeptically anyway.
There's even an official Anthropic VS Code extension to run CC in VS Code. The biggest advantage is being able to use VS Code's diff views, which I like more than in the terminal. But the VS Code CC extension doesn't support all the latest features of the terminal CC, so I'm usually still in the terminal.
The recommended approach has the advantage of separating information specific to Claude Code, but I think that in the long run, Anthropic will have to adopt the AGENTS.md format
Also, when using separate files, memories will be written to CLAUDE.md, and periodic triaging will be required: deciding what to leave there and what to move to AGENTS.md
I suggest everyone who can to try the voice mode. https://getvoicemode.com/
IMO the best advice in life is try not to be fearful of things that happen to everyone and you can't change.
Good news! What you are afraid of will happen, but it'll happen to everyone all at once, and nothing you can do can change it.
So you no longer need to feel fear. You can skip right on over to resignation. (We have cookies, for we are cooked)
Losing access to GPT 5 Pro is also a big hit… it is by far the best for reading full files/repos and creating plans (though it also by far has the worst out of the box tooling)
Whereas I tried Kilo Code and CoPilot and JetBrain's agent and others direct against Sonnet 4 and the output was ... not good ... in comparison.
I have my criticisms of Claude but still find it very impressive.
If you tell me I didn't really need an LLM to be able to do all that in a week, and that just some thought and 10 lines of code would do, I suspect you are not really familiar with the latest developments in AI and vastly underestimate their capability to do tricky stuff.
But I can’t speak to it working across OS.
Maybe CC users haven’t figured out how to parallelize their work because it’s fast enough to just wait or be distracted, and so the Codex waiting seems unbearable.
Anthropic say "put @AGENTS.md in your CLAUDE.md file", and my own experiments confirmed that this dumps the content into the system prompt in the same way as if you had copied it to CLAUDE.md manually, so I'm happy with that solution - at least until Anthropic give in and support AGENTS.md directly.
To see if it is easy to digest, with no repeated code etc., or if it's just slop that should be consumed by another agent and never by a human.
Who gives a shit?
If you can’t stand AI writing and you made it pretty far along before getting upset, who are you mad at, the author or yourself? Would you be happier if you found out this was written without AI and that you were just bad at detecting AI writing?
Only sane (guaranteed portable) option is for it to be a relative symlink to another file within the same repo, of course. i.e. CLAUDE.md would be -> 'AGENTS.md', not '/home/simonw/projects/pelicans-on-bicycles/AGENTS.md' or whatever.
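Concretely, from the repo root (git stores the symlink itself rather than a copy, provided symlinks are enabled on the platform):

```bash
ln -s AGENTS.md CLAUDE.md   # relative target, so it survives clones and moves of the repo
git add CLAUDE.md           # committed as a symlink (mode 120000), not a copy of the file
```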
In my current project I have a top-level chat, then one chat in each of the four component subdirectories.
I have a second terminal with QA-feature.
So 10 tabs total. Plus I have one to run occasional commands real quick (like docker ps).
I’m using qwen.
That's why it took a week with the LLM. And for you it makes sense, as this is new tech.
But if someone already knows those technologies, it would still take a week with the LLM and like 2 days without.
GPT5-codex (medium) is such a token hog for some reason
I thought git by default treats symlinks simply as file copies when cloning fresh.
I.e., git may not be aware of the symlink.
Code is no different! You can tell an AI model to write something for you and that's fine! Except you have to review it! If the code is bad quality just take a moment to tell the AI to fix it!
Like, how hard is it to tell the AI that the code it just wrote is too terse and hard to read? Come on, folks! Take that extra moment! I mean, I'm pretty lazy when working on my hobby projects but even I'm going to get irritated if the code is a gigantic mess.
Just tell it, "this code is a gigantic mess. Refactor it into concise, human-readable files using a logical structure and make sure to add plenty of explanatory comments where anything might be non-obvious. Make sure that the code passes all the tests when you're done."
I think we'll be dealing with slop issues for quite some time, but I also have hopes that AI will raise the bar of code in general.