Yes, and this is a temporary discount; the price increases to $3.48 on 2026/05/31 15:59 UTC.
This is what I’ve been using for non-confidential projects for about a week now (soon after v4 came out). I honestly can’t tell the difference, but I’m not doing anything crazy with it either.
Worth noting that I don't think DeepSeek's API lets you opt out of training. Once this is up on other providers though… (OpenRouter is just proxying to DeepSeek atm)
https://api-docs.deepseek.com/quick_start/agent_integrations...
If you look at the terminal-bench@2.0 leaderboard, you'll quickly see it's actually one of the weakest agentic harnesses. Anthropic's own models score lower with Claude Code than with virtually any other harness.
So it's quite the opposite. Claude Code is arguably the worst harness to run models with.
I was able to use it in agent mode with Roo. I stopped after having it write out a plan, but I'll continue when I have more time.
DeepSeek feels less likely to do a straight-up rug pull, since you can self-host with enough money, but I'm still more excited about local solutions.
Usually I just need grunt work done. I'm not solving difficult problems.
I am using Claude Code with GLM 5.1, MiniMax M2.7, Kimi K2.6 and Xiaomi MiMo V2.5 Pro.
[1] A fancier way of saying "reducing cost."
Also the author checked in their advertising plan: https://github.com/aattaran/deepclaude/commit/a90a399682defc...
Not only can it seamlessly and dynamically switch between DeepSeek V4 Flash, V4 Pro, and other mainstream models within the same context, but it is also 100% compatible with Claude Code.
So I think I'll stay with CC for now.
Also, the author checked in their social media advertising plan: https://github.com/aattaran/deepclaude/commit/a90a399682defc... (which seems to be working)
Is the Flash version on the level of GPT 5.4 mini?
Claude Code, on the other hand, is the most subsidized one, both for consumers (through the Max subscription) and for enterprises (token discounts). It is also heavily optimized for cost, especially token caching and reduced thinking, at the expense of quality.
The best two are Codex and Forge Code.
However, I am using plugins and skills that are only compatible with Claude Code or work best with Claude Code.
So, for me, Claude Code with plugins like claude-meme, Context Mode, Superpowers and Get Shit Done is better than other tools.
I think everyone should test multiple models and multiple agent harnesses for their specific needs, codebase, and way of working.
https://api-docs.deepseek.com/quick_start/agent_integrations...
If you are interested, I've built an agentic terminal that helps manage these types of things better: https://deepbluedynamics.com/hyperia
PS: mentioning Amp because I used to use it, and I pay directly for tokens. I topped up $5, so I'm going to use it and see how far it can take me. But my impression so far is that even once model subsidization ends, those open-source models are quite viable alternatives.
I could see a serious cost-reduction story in using Opus for design and DeepSeek for implementation.
Personally, I would avoid Anthropic entirely. But I get why people don't.
Once you've found the path, patches are trivial and the savings are tiny unless you're doing refactoring/cleanup.
Testing gets more and more complicated. Take a look at opencode go, and you see this:
>Includes GLM-5.1, GLM-5, Kimi K2.5, Kimi K2.6, MiMo-V2-Pro, MiMo-V2-Omni, MiMo-V2.5-Pro, MiMo-V2.5, Qwen3.5 Plus, Qwen3.6 Plus, MiniMax M2.5, MiniMax M2.7, DeepSeek V4 Pro, and DeepSeek V4 Flash
and now you're on your own with the bugs all of these models can produce at scale. Am I missing anything in this picture? What is the real use of cheaper models?
The American firms are not demonstrating escape velocity, and as long as China offers something somewhat comparable at a very low price to compensate for any difference in quality, they will not generate enough cash flow to finance reinvestment. I highly doubt they'll be able to keep raising external financing for many more periods from here on out; they have to start showing strong financials and that they are pulling away from the open-source models.
https://debugml.github.io/cheating-agents/#sneaking-the-answ...
Maybe I need to switch to some news publication that still does real research and writing. Because public forums like this have been completely destroyed by LLMs.
ollama launch claude --model deepseek-v4-pro:cloud
While those are nice, Claude Code has the largest amount of plugins and skills I want to use.
From what I see while building my own agentic system in Elixir, the problem is in training for your specific harness/contracts. Claude/GPT-style models seem to be trained around very specific contracts used by the harness like tool call formats, planning structure, patching, reading files, recovering from errors, and knowing when to stop.
In practice, you either need a very strong general model that can infer and follow those contracts (expensive), or a weaker model that has been fine-tuned / trained specifically on your own agent contracts. Otherwise, the whole thing becomes flaky very quickly. And I suspect that with DeepSeek V4 you may end up with the latter option.
This is a heavily subsidized price and will only last until the end of the month: "The deepseek-v4-pro model is currently offered at a 75% discount, extended until 2026/05/31 15:59 UTC." [0]
The "supported backends" table is also deceiving -- while OpenRouter's server's may be in the US, the only way to get the $0.44/$0.87 pricing is to pass through to the DeepSeek API, which of course is China-based. [1]
I do think the model is quite good; I use it myself through Ollama Cloud for simple tasks. But I think some folks have bought a little too much into the marketing hype around it.
[0] https://api-docs.deepseek.com/quick_start/pricing [1] https://openrouter.ai/deepseek/deepseek-v4-pro/providers
> Maintainers review auto-closed issues daily and reopen worthwhile ones. Issues that do not meet the quality bar below will not be reopened or receive a reply.
Seems like not an unreasonable way to deal with the problem of large numbers of low quality issues being submitted.
https://github.com/badlogic/pi-mono/blob/main/CONTRIBUTING.m...
The token cost makes it tempting to use for token-heavy tasks like this.
It's proven useful for me, and I figure others might appreciate how light of a shim it is between you and the models.
- Claude-style subagents
- an MCP layer for higher-level tools
- Cursor-style control plane modes like Ask, Plan, Debug, and Build
The MCP layer lets the harness use things like GitHub file/code read, PR creation, web search/fetch, structured user questions, plan-mode switching, user skills, and subagents.
So the improvement is mostly from better UI/UX orchestration and tool access. There are some things from Hermes that are interesting as well.
Most of my focus has been on applying this stack to sandboxed cloud agents so you can properly code and work from mobile devices.
I can't definitively say that the stack is better or worse than Claude Code; it's more just tuned for my use case, I guess.
Using the API from DeepSeek or OpenRouter also requires a fee, but it's a different, pay-as-you-go payment model.
All doable, but all vaguely squishy and nuanced problems operationally. Kinda like harness design in general.
Use Claude Code's autonomous agent loop with DeepSeek V4 Pro, OpenRouter, or any Anthropic-compatible backend. Same UX, 17x cheaper.

Claude Code is the best autonomous coding agent — but it costs $200/month with usage caps. DeepSeek V4 Pro scores 96.4% on LiveCodeBench and costs $0.87/M output tokens.
deepclaude swaps the brain while keeping the body:
Your terminal
+-- Claude Code CLI (tool loop, file editing, bash, git - unchanged)
+-- API calls -> DeepSeek V4 Pro ($0.87/M) instead of Anthropic ($15/M)
Everything works: file reading, editing, bash execution, subagent spawning, autonomous multi-step coding loops. The only difference is which model thinks.
Sign up at platform.deepseek.com, add $5 credit, copy your API key.
Windows (PowerShell):
setx DEEPSEEK_API_KEY "sk-your-key-here"
macOS/Linux:
echo 'export DEEPSEEK_API_KEY="sk-your-key-here"' >> ~/.bashrc   # use ~/.zshrc if that's your shell
source ~/.bashrc
Windows:
# Copy the script to a directory in your PATH
Copy-Item deepclaude.ps1 "$env:USERPROFILE\.local\bin\deepclaude.ps1"
# Or add the repo directory to PATH
setx PATH "$env:PATH;C:\path\to\deepclaude"
macOS/Linux:
chmod +x deepclaude.sh
sudo ln -s "$(pwd)/deepclaude.sh" /usr/local/bin/deepclaude
deepclaude # Launch Claude Code with DeepSeek V4 Pro
deepclaude --status # Show available backends and keys
deepclaude --backend or # Use OpenRouter (cheapest, $0.44/M input)
deepclaude --backend fw # Use Fireworks AI (fastest, US servers)
deepclaude --backend anthropic # Normal Claude Code (when you need Opus)
deepclaude --cost # Show pricing comparison
deepclaude --benchmark # Latency test across all providers
deepclaude --switch ds # Switch backend mid-session (no restart)
Claude Code reads these environment variables to determine where to send API calls:
| Variable | What it does |
|---|---|
| ANTHROPIC_BASE_URL | API endpoint (default: api.anthropic.com) |
| ANTHROPIC_AUTH_TOKEN | API key for the backend |
| ANTHROPIC_DEFAULT_OPUS_MODEL | Model name for Opus-tier tasks |
| ANTHROPIC_DEFAULT_SONNET_MODEL | Model name for Sonnet-tier tasks |
| ANTHROPIC_DEFAULT_HAIKU_MODEL | Model name for Haiku-tier (subagents) |
| CLAUDE_CODE_SUBAGENT_MODEL | Model for spawned subagents |
deepclaude sets these per-session (not permanently), launches Claude Code, then restores your original settings on exit.
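The mechanism is simple enough to reproduce by hand. A rough sketch of the per-session redirection (the endpoint URL is an assumption about DeepSeek's Anthropic-compatible API; the model name comes from this README, and the real script also sets the other tiers and restores settings on exit):

# Sketch only - not the actual deepclaude script
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"   # assumed endpoint
export ANTHROPIC_AUTH_TOKEN="$DEEPSEEK_API_KEY"
export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-pro"
claude   # launch Claude Code; the variables die with this shell, so defaults are restored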
| Backend | Flag | Input/M | Output/M | Servers | Notes |
|---|---|---|---|---|---|
| DeepSeek (default) | --backend ds | $0.44 | $0.87 | China | Auto context caching (120x cheaper on repeat turns) |
| OpenRouter | --backend or | $0.44 | $0.87 | US | Cheapest, lowest latency from US/EU |
| Fireworks AI | --backend fw | $1.74 | $3.48 | US | Fastest inference |
| Anthropic | --backend anthropic | $3.00 | $15.00 | US | Original Claude Opus (for hard problems) |
DeepSeek (default - just needs DEEPSEEK_API_KEY):
setx DEEPSEEK_API_KEY "sk-..." # Windows
export DEEPSEEK_API_KEY="sk-..." # macOS/Linux
OpenRouter (optional):
setx OPENROUTER_API_KEY "sk-or-..." # Windows
export OPENROUTER_API_KEY="sk-or-..." # macOS/Linux
Fireworks AI (optional):
setx FIREWORKS_API_KEY "fw_..." # Windows
export FIREWORKS_API_KEY="fw_..." # macOS/Linux
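Whichever keys you set, the --status flag from the command list above will confirm what deepclaude can see:

deepclaude --status   # lists available backends and which keys were found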
| Usage level | Anthropic Max | deepclaude (DeepSeek) | Savings |
|---|---|---|---|
| Light (10 days/mo) | $200/mo (capped) | ~$20/mo | 90% |
| Heavy (25 days/mo) | $200/mo (capped) | ~$50/mo | 75% |
| With auto loops | $200/mo (capped) | ~$80/mo | 60% |
DeepSeek's automatic context caching makes agent loops extremely cheap - after the first request, the system prompt and file context are cached at $0.004/M (vs $0.44/M uncached).
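Rough arithmetic with the rates above: a repeat agent turn that replays a 200k-token cached prefix costs about 200,000 x $0.004 per 1M ≈ $0.0008, versus ≈ $0.088 if it were billed at the uncached $0.44/M rate.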
A few things don't carry over through the compatibility layer:
| Feature | Reason |
|---|---|
| Image/vision input | DeepSeek's Anthropic endpoint doesn't support images |
| Parallel tool use | Supported by DeepSeek (up to 128 per call), but Claude Code sends tools sequentially by default |
| MCP server tools | Not supported through compatibility layer |
| Prompt caching savings | DeepSeek has its own caching (automatic), but Anthropic's cache_control is ignored |
Switch between Anthropic and DeepSeek mid-session - from inside Claude Code itself. No restart, no terminal commands. Just type a slash command.
This works both in the Claude Code terminal and in the VS Code extension.
The proxy runs on localhost:3200 and intercepts all API calls. A control endpoint (/_proxy/mode) lets you switch the active backend instantly:
Claude Code -> localhost:3200 (proxy)
|
+-- /_proxy/mode POST -> switch backend
+-- /_proxy/status GET -> current backend + uptime
+-- /_proxy/cost GET -> token usage + cost savings
|
+-- /v1/messages -> active backend (DeepSeek/OpenRouter/Anthropic)
+-- everything else -> Anthropic (passthrough)
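For example, from any other terminal (same endpoints as in the diagram above):

# Check which backend is currently active
curl -s http://127.0.0.1:3200/_proxy/status
# Flip it without restarting Claude Code
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=openrouter"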
Add these files to ~/.claude/commands/:
deepseek.md:
Switch the model proxy to DeepSeek. Run this command silently and report the result:
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=deepseek"
If successful, say: "Switched to DeepSeek."
anthropic.md:
Switch the model proxy back to Anthropic. Run this command silently and report the result:
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=anthropic"
If successful, say: "Switched to Anthropic."
openrouter.md:
Switch the model proxy to OpenRouter. Run this command silently and report the result:
curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=openrouter"
If successful, say: "Switched to OpenRouter."
Then type /deepseek, /anthropic, or /openrouter in any Claude Code session to switch instantly.
deepclaude --switch deepseek # or: ds, or, fw, anthropic
deepclaude -s anthropic
Add to .vscode/tasks.json (Windows/PowerShell shown; on macOS/Linux, swap each command for the equivalent curl call used earlier):
{
"version": "2.0.0",
"tasks": [
{
"label": "Proxy: Switch to DeepSeek",
"type": "shell",
"command": "Invoke-RestMethod -Uri http://127.0.0.1:3200/_proxy/mode -Method Post -Body 'backend=deepseek'",
"presentation": { "reveal": "always" },
"problemMatcher": []
},
{
"label": "Proxy: Switch to Anthropic",
"type": "shell",
"command": "Invoke-RestMethod -Uri http://127.0.0.1:3200/_proxy/mode -Method Post -Body 'backend=anthropic'",
"presentation": { "reveal": "always" },
"problemMatcher": []
}
]
}
Then bind in keybindings.json:
{ "key": "ctrl+alt+d", "command": "workbench.action.tasks.runTask", "args": "Proxy: Switch to DeepSeek" },
{ "key": "ctrl+alt+a", "command": "workbench.action.tasks.runTask", "args": "Proxy: Switch to Anthropic" }
The proxy tracks token usage and calculates savings vs Anthropic pricing:
curl -s http://127.0.0.1:3200/_proxy/cost
Returns:
{
"backends": {
"deepseek": {
"input_tokens": 125000,
"output_tokens": 45000,
"requests": 12,
"cost": 0.0941,
"anthropic_equivalent": 1.05
}
},
"total_cost": 0.0941,
"anthropic_equivalent": 1.05,
"savings": 0.9559
}
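If you have jq installed, a quick way to pull just the headline numbers out of that response (field names as in the JSON above):

curl -s http://127.0.0.1:3200/_proxy/cost | jq '{total_cost, anthropic_equivalent, savings}'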
Add terminal profiles so you can launch deepclaude from the IDE:
Settings > JSON:
{
"terminal.integrated.profiles.windows": {
"DeepSeek Agent": {
"path": "powershell.exe",
"args": ["-ExecutionPolicy", "Bypass", "-NoExit", "-File", "C:\\path\\to\\deepclaude.ps1"]
}
}
}
Or on macOS/Linux:
{
"terminal.integrated.profiles.linux": {
"DeepSeek Agent": {
"path": "/usr/local/bin/deepclaude"
}
}
}
Remote control (--remote): open a Claude Code session in any browser, with DeepSeek as the brain:
deepclaude --remote # Remote control + DeepSeek
deepclaude --remote -b or # Remote control + OpenRouter
deepclaude --remote -b anthropic # Remote control + Anthropic (normal)
This prints a https://claude.ai/code/session_... URL you can open on your phone, tablet, or any browser.
Remote control needs Anthropic's bridge for the WebSocket connection, but model calls can go elsewhere. deepclaude starts a local proxy that splits the traffic:
claude remote-control
+-- Bridge WebSocket -> wss://bridge.claudeusercontent.com (Anthropic, hardcoded)
+-- Model API calls -> http://localhost:3200 (proxy)
+-- /v1/messages -> DeepSeek ($0.87/M)
+-- everything else -> Anthropic (passthrough)
You still need to be logged in via claude auth login. The proxy starts automatically and stops when the session ends. See proxy/README.md for technical details.
MIT
My understanding is that DeepSeek V4 Pro is going to be uniquely good at working on consumer platforms with SSD offload, due to its extremely lean KV cache. Even if you only have a slow consumer platform, you should be able to just let it grind on a huge batch of tasks in parallel entirely unattended, and wake up later to a finished job.
AIUI, people are even experimenting with offloading the KV cache itself to storage, which may unlock this batching capability even beyond physical RAM limits as contexts grow. (This used to be considered a bad idea with bulky KV caches, due to concerns about wearout and performance, but the much leaner KV cache of DeepSeek V4 changes the picture quite radically.)
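To make the batching intuition concrete (illustrative numbers, not actual DeepSeek specs): if the KV cache costs k KB per token, a 128k-token context pins 128,000 x k KB per in-flight task, so shrinking k by an order of magnitude lets roughly an order of magnitude more tasks run concurrently in the same RAM. SSD offload extends that further, at a latency cost that unattended batch jobs don't care about.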
I finally got fed up and started using GPT 5.5 the past 4 days, and it's a breath of fresh air despite feeling much more minimal. With Claude I had to write so many hooks to enforce behaviors it wouldn't remember and lacked common sense on. GPT 5.5 does a much better job with things like knowing the AWS CDK CLI can hang on long CloudFormation deployments and that it should actively check the deployment status via the CloudFormation API rather than hanging for 30+ minutes - and it does this all without asking.
Maybe there's better tooling built into Codex too, but at least on the surface level it seems like how smart the model is makes a significant difference because Claude has more tools than I can count and still struggles to use "grep".
Edit: Like just now - I can't tell you how many times a day I see this sequence:
"Sorry, I'll run in parallel"
"Error editing file"
"File must be read first"
Repeat 10x for the 10 subagents Claude spawned and then it gets stuck until you press escape and it says "You rejected the parallel agents. Running directly now"
Like, if they are going to sort through all the issues eventually (like they claim), why not just close the ones that are not worthy when they get to them, instead of closing all of them by default?
Is it just so that the project doesn't have open issues on its GitHub page? But they are open issues in reality, because the maintainer will eventually go through them.
Nothing is "unreasonable" in the sense that an open-source project should have the right to do what it wants with its rules, but it's definitely a weird stance.
If we touch grass in person and swap certificate requests, we can actually rebuild a trust network.
This is a pretty old problem with regards to clubs / secret societies and whatnot. And with certificates / PKI, our modern security tools have solved all the technical problems.
It is a guardrail against burnout and tracker spam.
It's based on their implied perspective that the majority of submissions don't follow those guidelines, which helps determine their quality threshold.
https://github.com/badlogic/pi-mono/blob/main/CONTRIBUTING.m...
- https://news.ycombinator.com/item?id=46930961
- https://github.com/mitchellh/vouch
Probably wasn't clear enough if you don't know what that is already, apologies
It's an Asus Ascent GX10, which is a little mini PC with 128GB of LPDDR5X as shared memory for an Nvidia GB10 "Blackwell" (kind of, it's a long story) GPU and a MediaTek ARM CPU
> AIUI, people are even experimenting with offloading the KV cache itself to storage, which may unlock this batching capability even beyond physical RAM limits as contexts grow.
Especially this point. Any reason that this idea was considered bad? Is it due to the speed difference between GPU VRAM and RAM?
No sub-agents. There are many ways to do this: spawn Pi instances via tmux (see the sketch after this list), build your own with extensions, or install a package that does it your way.
No permission popups. Run in a container, or build your own confirmation flow with extensions, in line with your environment and security requirements.
No plan mode. Write plans to files, or build it with extensions, or install a package.
No built-in to-dos. Use a TODO.md file, or build your own with extensions.
No background bash. Use tmux. Full observability, direct interaction.
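A minimal tmux sketch for the subagent case (how you pass the task to pi is up to pi's own CLI; the invocation here is illustrative):

# Spawn a detached 'worker' session running a second agent instance
tmux new-session -d -s worker 'pi'
# Observe its output directly, no hidden state
tmux capture-pane -t worker -p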
> Any reason that this idea was considered bad?
Because the KV cache was too big, even for a small context. This is still an issue with open models other than DeepSeek V4, though to a somewhat smaller extent than used to be the case. But the tiny KV of DeepSeek V4 is genuinely new.
could you tell me the long story?
edit: or wait, is it quasi-Blackwell the way all DGX Sparks are quasi-Blackwell? like the actual silicon is different but it's sorta Blackwell-shaped?
It's an Elixir agent runtime with a thin Go TUI (Bubble Tea). I'm building it mostly to explore agent orchestration: planner/workers/finalizer flows, local file/code-edit tools, MCP tools, permission gates, run context, compaction, and eventually larger swarms. Erlang/Elixir is interesting for this because the actor/supervision model maps pretty naturally to lots of isolated agents and long-running supervised tasks.
As I said, the main lesson so far is that everything around contracts is much more fragile than I expected unless you use a very strong model. Planners return Markdown instead of JSON, tools get called with subtly wrong args, subagents repeat broken tool calls, finalizers lie about success after workers failed. And various permissions may be interpreted by agents in unexpected ways.
I also started with too many modes too early instead of making the core agentic path extremely solid. That made me understand better why these codebases become huge: there are endless corner cases if you want a harness to work across models, providers, tools...
Stronger models hide a lot of harness weakness, and weaker models expose it. Making weaker models good enough requires a surprising amount of contract hardening. But that hardening tends to make the system better for stronger models too.
Also, the Elixir HTTP stack was causing a lot of problems (I eventually needed to switch to gun).
The promise of this chip was “write your code locally, then deploy to the same architecture in the data centre!”
Which is nonsense, because the GB10 is better described as “Hopper with Blackwell characteristics” IMO.
Still great hardware, especially for the price and learning. But we are only just starting to get the kernels written to take advantage of it, and mma.sync is sad compared to tcgen05