Darkbloom – Private inference on idle Macs

I have a hard time believing their numbers. If you can pay off a mac mini in 2-4 months, and make $1-2k profit every month after that, why wouldn’t their business model just be buying mac minis?

I installed this so you don't have to. It did feel a bit quirky and not super polished. Fails to download the image model. The audio/tts model fails to load.

In 15 minutes of serving Gemma, I got precisely zero actual inference requests, and a bunch of health checks and two attestations.

At the moment they don't have enough sustained demand to justify the earning estimates.

You have to install their MDM device management software on your computer. Basically that computer is theirs now. So don't plan on just handing over your laptop temporarily unless you don't mind some company completely owning your box. Still might be a validate use for people with slightly old laptops lying around, but beware trying to share this computer with your daily activities if you e.g. use a bank on a browser on this computer regularly. MDM means they can swap out your SSL certs level of computer access, please correct me if I'm wrong.

They use the TEE to check that the model and code is untampered with. That's a good, valid approach and should work (I've done similar things on AWS with their TEE)

The key question here is how they avoid the outside computer being able to view the memory of the internal process:

> An in-process inference design that embeds the in- ference engine directly in a hardened process, elimi- nating all inter-process communication channels that could be observed, with optional hypervisor mem- ory isolation that extends protection from software- enforced to hardware-enforced via ARM Stage 2 page tables at zero performance cost.[1]

I was under the impression this wasn't possible if you are using the GPU. I could be misled on this though.

[1] https://github.com/Layr-Labs/d-inference/blob/master/papers/...

Unfortunately, verifiable privacy is not physically possible on MacBooks of today. Don't let a nice presentation fool you.

Apple Silicon has a Secure Enclave, but not a public SGX/TDX/SEV-style enclave for arbitrary code, so these claims are about OS hardening, not verifiable confidential execution.

It would be nice if it were possible. There's a lot of cool innovations possible beyond privacy.

Interesting concept. Two sided marketplaces are hard to bootstrap but maybe just enough curiosity would get the flywheel going. Hell they should just try and convince people to enroll as providers but then also use the service even if it’s hitting their own machines until there’s some degree of supply and demand pressure then try and get only providers to sign up. Or set up some way to encourage providers to promote others to use the service (the 100% rev share kind of breaks that concept but anything can change).

I wish this was self hostable, even for a license fee. Many businesses have fleets of Macs, sometimes even in stock as returned equipment from employees. Would allow for a distributed internal inference network, which has appeal for many orgs who value or require privacy.

@eigengajesh - Your cost estimator lists Mac Mini M4 Pro with only 24 or 48GB options, but the M4 Pro mini can also be configured with 64GB. At least, I hope so, as I'm typing this on one. ;-)

Oh, also, you seem to have some bugs:

Gemma: WARN [vllm_mlx] RuntimeError: Failed to load the default metallib. This library is using language version 4.0 which is not supported on this OS. library not found library not found library not found

cohere: 2026-04-16T14:25:10.541562Z WARN [stt] File "/Users/dga/.darkbloom/bin/stt_server.py", line 332, in load_model 2026-04-16T14:25:10.541614Z WARN [stt] from mlx_audio.stt.models.cohere_asr import audio as audio_mod 2026-04-16T14:25:10.541643Z WARN [stt] ModuleNotFoundError: No module named 'mlx_audio.stt.models.cohere_asr'

Trying to download the flux image models fails with:

curl: (56) The requested URL returned error: 404

darkbloom earnings does not work

your documentation is inconstent between saying 100% of revenue to providers vs 95%

I think .. this needs a little more care and feeding before you open it up widely. :) And maybe lay off the LLM generated text before it gets you in trouble for promising things you're not delivering.

Cool idea. Just some back-of-the-envelope math here (not trusting what's on their site):

My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B. Darkbloom's pricing is $0.20 per Mtok output.

That's about $2.24/day or $67/mo revenue if it's fully utilized 24/7.

Now assuming 50W sustained load, that's about 36 kWh/mo, at ~$.25/kWh approx. $9/mo in costs.

Could be good for lunch money every once in a while! Around $700/yr.

This is one of those ideas I think makes perfect sense, but requires so much operational change for the entire stack, that it would be very difficult to scale:

- Convincing labs to run distributed, burst-y inference

- Convincing people to run their Mac all day, hoping to make a little profit

- Convincing users to trust a distributed network of un-trusted devices

I had a similar idea, pre-AI, just for compute in general. But solving even 1 of those 3 (swap AI lab for managed-compute-type-company, eg Supabase, Vercel) is nearly impossible.

Having strong SETI@Home vibes from 25 years ago, except of course, this is not for the greater good of humanity, but a for-profit project.

Problem is, from a technical point of view, what kind of made sense back then (most people running desktops, fans always on, energy saving minimal) is kind of stupid today (even if your laptop has no fan, would you want it to be always generating heat?)...

I definitely want my laptops to be cool, quiet and idle most of the time.

I'd love a way to do this locally -- pool all the PCs in our own office for in-office pools of compute. Any suggestions from anyone? We currently run ollama but manually manage the pools

client side of this kind of needs to be open source unless I'm running it on a dedicated machine and firewalling it from the rest of my network. Or the company needs to have a very strong reputation and certifications. curlbash and go is a pretty hard sell for me

You might not even know it as a user but the payment/distribution here is all built on crypto+stablecoins. This is a great use case for it.

So basically ... Pied Piper.

Wasn't there an idea about 15 years ago where you would open your browser, go to a webpage and that page would have a JavaScript based client that would run distributed workloads?

I believe the idea was that people could submit big workloads, the server would slice them up and then have the clients download and run a small slice. You as the computer owner would then get some payout.

Intersting to see this coming back again.

Interesting to see an offering with this heritage [1] proposing flat earnings rates for inference operators here, rather than trying to sell a dynamic marketplace where operators compete on price in real-time.

Right now the dashboards show 78 providers online, but someone in-thread here said that they spun one up and got no requests. Surely someone would be willing to beat the posted rate and swallow up the demand?

I expect this is a migration target, but a tactical omission from V1 comms both for legitimate legibility reasons (I can sell x for y is easier to parse than 'I can participate in a marketplace') and slightly illegitimate legibility reasons (obscuring likely future price collapse).

Still - neat project that I hope does well.

[1] Layer Labs, formerly EigenLayer, is company built around a protocol to abstract and recycle economic security guarantees from Ethereum proof of stake.

I like the idea but it wont take off until Homomorphic Encryption for inference becomes a thing that's efficient and anyone can be a node.

I won't install some random untrusted binary off of some website. I downloaded it and did some cursory analysis instead.

Got the latest v0.3.8 version from the list here: https://api.darkbloom.dev/v1/releases/latest

Three binaries and a Python file: darkbloom (Rust)

eigeninference-enclave (Swift)

ffmpeg (from Homebrew, lol)

stt_server.py (a simple FastAPI speech-to-text server using mlx_audio).

The good parts: All three binaries are signed with a valid Apple Developer ID and have Hardened runtime enabled.

Bad parts: Binaries aren't notarized. Enrolls the device for remote MDM using micromdm. Downloads and installs a complete Python runtime from Cloudflare R2 (Supply chain risk). PT_DENY_ATTACH to make debugging harder. Collects device serial numbers.

TL;DR: No, not touching that.

I think it’s important that systems like this exist, but getting them off the ground is non-trivial.

We’ve been building something similar for image/video models for the past few months, and it’s made me think distribution might be the real bottleneck.

It’s proving difficult to get enough early usage to reach the point where the system becomes more interesting on its own.

Curious how others have approached that bootstrap problem. Thanks in advance.

Love the concept, with some similarity to folding@home, though more personal gain.

But trying it out it still needs work, I couldn't download a model successfully (and their list of nodes at https://console.darkbloom.dev/providers suggests this is typical).

And as a cursory user, it took me some digging to find out that to cash out you need a Solana address (providers > earnings).

I'm not sure how the economics works out. Pricing for AI inference is based on supply/demand/scarcity. If your hardware is scarce, that means low supply; combine with high demand, it's now valuable. But what happens if you enable every spare Mac on the planet to join the game? Now your supply is high, which means now it's less valuable. So if this becomes really popular, you don't make much money. But if it doesn't become somewhat popular, you don't get any requests, and don't make money. The only way they could ensure a good return would be to first make it popular, then artificially lower the number of hosts.

I've tried to install it on my mac, but not sure what macOS version it should support.

on 15.1 it failed to serve models.

updated to latest 15.5 and it fails to run binary.

> Operators cannot observe inference data.

Is there some actual cryptography behind this, or just fundamentally-breakable DRM and vibes?

Generate images requested by randoms on the internet on your hardware.

What could possibly go wrong?

"These are estimates only. We do not guarantee any specific utilization or earnings. Actual earnings depend on network demand, model popularity, your provider reputation score, and how many other providers are serving the same model.

When your Mac is idle (no inference requests), it consumes minimal power — you don't lose significant money waiting for requests. The electricity costs shown only apply during active inference.

Text models typically see the highest and most consistent demand. Image generation and transcription requests are bursty — high volume during peaks, quiet otherwise."

This feels like defi... de-ai

Actually more useful than Bitcoin. Brilliant idea.

Like the concept. This is not a business - should be an open source GitHub repo maybe.

They lost me with just one microcopy - “start earning”. Huge red signal.

I installed two models, but it just always reports:

    Available models (2):
    CohereLabs/cohere-transcribe-03-2026 (4.6 GB)
    flux_2_klein_9b_q8p.ckpt (20.2 GB)
    ...
    Advertising 0 model(s) (only loaded models)

Also the benchmark just doesn't work.

Interesting idea, but needs some work.

Seems like an interesting way for those people that purchased a Mac Mini to run OpenClaw to pay off the hardware, since mostly it’s now idle.

Why does M1 Max project significantly higher revenue than M3 Max with double the ram?

It's a good project that makes sense. I recommend adding a contractual layer as well, since it's free and makes sense. Operators could legally sign that they will not look into the inference layer. After all, the operators already have a financial relationship with this provider, so it makes sense to add a contract to it and keep operators from looking into other people's data that way, too. I wish this project a lot of success.

Until we have breakthroughs in homomorphic encryption compute, I won't trust such privacy claims

How does the inference work correctly if the payloads are encrypted?

> Every request is end-to-end encrypted

Afaik you will need to decrypt the data the moment it needs to be fed into the model.

How do they do this then?

They could consider registering as a provider on something like OpenRouter if they aren't getting enough inference requests on their own site.

They are almost claiming FHE, isn't it just a matter of creating the right tool to get the generated tokens from RAM before it gets encrypted for transfer. How is it fundamentally different than chutes?

Is this named after the 2011 split album with Grimes and d'Eon?

hey guys! i'm the creator. let me know if you have any questions.

Why isn’t a MacBook Air M5 on the hardware list?

Broken calculator or am I missing something here?

  Macbook Air M2  8GB   12h/day -> $647/month

  Mac Mini M4     32GB  12h/day -> $290/month

I mean, I'd be happy to buy a few used M2 Airs with minimal specs and start printing money but…

Apple should build this, and start giving away free Macs subsidized by idle usage.

Like Fold@home but for profit!

I could imagine this working for the openclaw community if the price is right

That solution actually makes great sense. So Apple won in some strange way again?

Guess there are limitations on size of the models, but if top-tier models will getting democratized I don’t see a reason not to use this API. The only thing that comes to me is data privacy concerns.

I think batch-evals for non-sensitive data has great PMF here.

I thought this was Apple’s plan all along. How is this not already their thing?

Why only Macs? If we think of all PCs and mobile phones running idle, the potential is much larger.

I really want this to succeed

I cant buy credits - says page could not load

I like the idea but it wont take off until Homomorphic Encryption for inference becomes a thing that's efficient and anyone can be a node.

I won't install some random untrusted binary off of some website. I downloaded it and did some cursory analysis instead.

Got the latest v0.3.8 version from the list here: https://api.darkbloom.dev/v1/releases/latest

Three binaries and a Python file: darkbloom (Rust)

eigeninference-enclave (Swift)

ffmpeg (from Homebrew, lol)

stt_server.py (a simple FastAPI speech-to-text server using mlx_audio).

The good parts: All three binaries are signed with a valid Apple Developer ID and have Hardened runtime enabled.

TL;DR: No, not touching that.

I think it’s important that systems like this exist, but getting them off the ground is non-trivial.

We’ve been building something similar for image/video models for the past few months, and it’s made me think distribution might be the real bottleneck.

It’s proving difficult to get enough early usage to reach the point where the system becomes more interesting on its own.

Curious how others have approached that bootstrap problem. Thanks in advance.

Love the concept, with some similarity to folding@home, though more personal gain.

But trying it out it still needs work, I couldn't download a model successfully (and their list of nodes at https://console.darkbloom.dev/providers suggests this is typical).

And as a cursory user, it took me some digging to find out that to cash out you need a Solana address (providers > earnings).

> Operators cannot observe inference data.

Is there some actual cryptography behind this, or just fundamentally-breakable DRM and vibes?

Generate images requested by randoms on the internet on your hardware.

What could possibly go wrong?

When your Mac is idle (no inference requests), it consumes minimal power — you don't lose significant money waiting for requests. The electricity costs shown only apply during active inference.

Text models typically see the highest and most consistent demand. Image generation and transcription requests are bursty — high volume during peaks, quiet otherwise."

I installed two models, but it just always reports:

    Available models (2):
    CohereLabs/cohere-transcribe-03-2026 (4.6 GB)
    flux_2_klein_9b_q8p.ckpt (20.2 GB)
    ...
    Advertising 0 model(s) (only loaded models)

Also the benchmark just doesn't work.

Interesting idea, but needs some work.

Seems like an interesting way for those people that purchased a Mac Mini to run OpenClaw to pay off the hardware, since mostly it’s now idle.

Why does M1 Max project significantly higher revenue than M3 Max with double the ram?

Until we have breakthroughs in homomorphic encryption compute, I won't trust such privacy claims

How does the inference work correctly if the payloads are encrypted?

They could consider registering as a provider on something like OpenRouter if they aren't getting enough inference requests on their own site.

Is this named after the 2011 split album with Grimes and d'Eon?

Broken calculator or am I missing something here?

  Macbook Air M2  8GB   12h/day -> $647/month

  Mac Mini M4     32GB  12h/day -> $290/month

I mean, I'd be happy to buy a few used M2 Airs with minimal specs and start printing money but…

Apple should build this, and start giving away free Macs subsidized by idle usage.

Like Fold@home but for profit!

I thought this was Apple’s plan all along. How is this not already their thing?

I really want this to succeed

I have a hard time believing their numbers. If you can pay off a mac mini in 2-4 months, and make $1-2k profit every month after that, why wouldn’t their business model just be buying mac minis?

I installed this so you don't have to. It did feel a bit quirky and not super polished. Fails to download the image model. The audio/tts model fails to load.

In 15 minutes of serving Gemma, I got precisely zero actual inference requests, and a bunch of health checks and two attestations.

At the moment they don't have enough sustained demand to justify the earning estimates.

They released this like a day ago, I'm not surprised that there's not enough demand right now. Give it some time to take off

weird to learn that they do not generate inference requests to their network themselves to motivate early adopters at least to host their inference software

Has anyone tested the system from the other end... sending a prompt and getting a response?

and I don't think they ever will unless they're highly competitive (hopefully that price they have stays? at least for users)

I was thinking of building this exact thing a year ago but my main stopper was economics: it would never make sense for someone to use the API, thus nobody can make money off of zero demand.

I guess we just have to look at how Uber and Airbnb bootstrapped themselves. Another issue with my original idea was that it was for compute in general, when the main, best use-case, is long(er)-running software like AI training (but I guess inference is long running enough).

But there already exist software out there that lets you rent out your GPU so...

MDMs on macOS are permissioned via AccessRights, and you can verify that their permission set is fairly minimal and does not allow what you've described here (bits 0, 4, 10).

That said, their privacy posture at the cornerstone of their claims is snake oil and has gaping holes in it, so I still wouldn't trust it, but it's worth being accurate about how exactly they're messing up.

MDM is the absolute deal breaker. No way in hell will I ever make my Macbook into an unsellable brick if they so decide to lock my computer with MDM.

Even moreso, not for pennies/month

They use the TEE to check that the model and code is untampered with. That's a good, valid approach and should work (I've done similar things on AWS with their TEE)

The key question here is how they avoid the outside computer being able to view the memory of the internal process:

I was under the impression this wasn't possible if you are using the GPU. I could be misled on this though.

[1] https://github.com/Layr-Labs/d-inference/blob/master/papers/...

While they do make this argument, realistically anyone sending their prompt/data to an external server should assume there will be some level of retention.

And more so in particular, anyone using Darkbloom with commercial intents should only really send non-sensitive data (no tokens, customer data, ...) I'd say only classification tasks, imagine generation, etc.

This entire paper smells of LLM, I'm sure even the most distinguished academic would refrain from using notation to prove that the SIP status cannot change during operation.

Apple Silicon systems have unified memory between CPU and GPU. The hypervisor page table trick is thus claimed to protect GPU memory from RDMA.

Macs do not have an accessible hardware TEE.

Macs have secure enclaves.

Unfortunately, verifiable privacy is not physically possible on MacBooks of today. Don't let a nice presentation fool you.

Apple Silicon has a Secure Enclave, but not a public SGX/TDX/SEV-style enclave for arbitrary code, so these claims are about OS hardening, not verifiable confidential execution.

It would be nice if it were possible. There's a lot of cool innovations possible beyond privacy.

I wrote a whole SDK for using SGX, it's cool tech. But in theory on Apple platforms you can get a long way without it. iOS already offers this capability and it works OK.

macOS has a strong enough security architecture that something like Darkbloom would have at least some credibility if there was a way to remotely attest a Mac's boot sequence and TCC configuration combined with key-to-DR binding. The OS sandbox can keep apps properly separated if the kernel is correct and unhacked. And Apple's systems are full of mitigations and roadblocks to simple exploitation. Would it be as good as a consumer SGX enclave? Not architecturally, but the usability is higher.

As if you get privacy with the inference providers available today? I have more trust in a randomly selected machine on a decentralized network not being compromised than in a centralized provider like OpenAI pinky promising not to read your chats.

Every hardware key will be broken if there is enough incentive to do so. Their claims read like pure hubris.

Cool idea. Just some back-of-the-envelope math here (not trusting what's on their site):

My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B. Darkbloom's pricing is $0.20 per Mtok output.

That's about $2.24/day or $67/mo revenue if it's fully utilized 24/7.

Now assuming 50W sustained load, that's about 36 kWh/mo, at ~$.25/kWh approx. $9/mo in costs.

Could be good for lunch money every once in a while! Around $700/yr.

Well. Running your machine to do inference will utilize more than 50W sustained load, I'd say more than double that. Plus electricity is more expensive here (but granted, I do have solar panels). Plus don't forget to factor in that your hardware will age faster.

I'd say it's not worth it. But the idea is cool.

Also this assumes hardware never fails. I learned about this the hard way back when I started mining crypto on my 5700XT way back when.

I figured since I already used it a lot, and I've never had a GPU fail on me, it would be fine.

The fans on it died in a month of constant use, replacing them was more money than what I made on mining.

Their example big earner models are FLUX.2 Klein 4B and FLUX.2 Klein 9B, which i imagine could generate a lot more tokens/s than a 26B model on your machine.

For Gemma 4 26B their math is:

single_tok/s = (307 GB/s / 4 GB) * 0.60 = 46.0 tok/s

batched_tok/s = 46.0 * 10 * 0.9 = 414.4 tok/s

tok/hr = 414.4 * 3600 = 1,492,020

revenue/hr = (1,492,020 / 1M) * $0.200000 = $0.2984

I have no idea if that is a good estimate of how much an M5 Pro can generate - but that’s what it says on their site.

They do a bit of a sneaky thing with power calculation: they subtract 12Ws of idle power, because they are assuming your machine is idling 24/7, so the only cost is the extra 18W they estimate you’ll use doing inference. Idk about you, but i do turn my machine off when i am not using it.

> My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B.

This seems high. At which quantization? Using LM Studio or something else?

Note: Darkbloom seems to run everything on Q8 MLX.

OpenAI has only about 5% paying customers, how does it generate revenue?

I don’t think this is a sustainable business model. For example, Cubbit tried to build decentralised storage, but I backed out because better alternatives now exist, and hardware continues to improve and become cheaper over time.

Your electricity and ownership are going to get lower return and does not actually requce CO2.

Genuinely curious, is there any way to estimate amortization of Mac?

I’d imagine 1 year of heavy usage would somehow affect its quality.

Any idea what makes for such a diff between your and theirs numbers? Batching? Or could they do a crazy prefix caching across all nodes to reduce the actual processing.

Maybe lunch money for you, but there are people in some parts of the world who live on $200/month. Like Ukraine.

Don't forget to factor in cooling costs.

Having strong SETI@Home vibes from 25 years ago, except of course, this is not for the greater good of humanity, but a for-profit project.

I definitely want my laptops to be cool, quiet and idle most of the time.

I some times play about with local models via ollama/comfyui and more recently ace-step to generate music.

This is short bursts of heat 5-10 m during the render I would not be happy with that for multiple hours a day. I am sure that would have a negative effect on battery health.

I'd love a way to do this locally -- pool all the PCs in our own office for in-office pools of compute. Any suggestions from anyone? We currently run ollama but manually manage the pools

Seems like so much more work than "just" paying for https://huggingface.co or whichever other neocloud who already did all the setup for you and just waits for your credit card per minute/seconds/token.

If you set CPUSchedulingPolicy=idle Nice=19 IOSchedulingClass=idle in the ollama server configuration it should run in the background with lowest priority.

You might not even know it as a user but the payment/distribution here is all built on crypto+stablecoins. This is a great use case for it.

Good. Another great non-speculative use-case for crypto and stablecoins.

So basically ... Pied Piper.

Wasn't there an idea about 15 years ago where you would open your browser, go to a webpage and that page would have a JavaScript based client that would run distributed workloads?

Intersting to see this coming back again.

Or SETI which would search for signs of alien life.

Still - neat project that I hope does well.

[1] Layer Labs, formerly EigenLayer, is company built around a protocol to abstract and recycle economic security guarantees from Ethereum proof of stake.

Like the concept. This is not a business - should be an open source GitHub repo maybe.

They lost me with just one microcopy - “start earning”. Huge red signal.

But why would I donate my Mac Studio's idle time if I couldn't "start earning"?

> Every request is end-to-end encrypted

Afaik you will need to decrypt the data the moment it needs to be fed into the model.

How do they do this then?

The system hosting the model must be one of the ends.

Remember, all encryption is E2EE if you're not picky about the ends.

hey guys! i'm the creator. let me know if you have any questions.

Hey Gajesh! I sent you an email with some of the teething problems I ran into trying to get started as a provider. Hope it didn't end up in your spam folder!

Why isn’t a MacBook Air M5 on the hardware list?

Because the model that generated that list was trained before the M5 came out.

I could imagine this working for the openclaw community if the price is right

That was my first thought too especially for talks that aren’t particularly important like daily digests of online things

If you start buying minis, then you need to house, power, and cool them. So you are building a mini data center. If you are building a small data center, economies of scale will drive you to want to build larger and larger. However, this gets expensive and neighbors tend to not like data centers (for good reason). To me this seems like asymmetric warfare against hyper-scalers.

The numbers are optimistically legit -- it's calculated based purely considering we have demand for all machines at all times. We don't have that right now, but fairly optimistic that people will do it.

That's why we don't recommend purchasing a new machine. Existing machine is no cost for you to run this.

Electricity is one cost, but it will get paid off from every request it receives. Electricity is only deducted when you run an inference. If you have any questions, DM me @gajesh on Twitter.

Solid q. I think the part of it is that it’s really easy to attract some “mass” (capital) of users, as there are definitely quite a few of idle Macs in the world.

Non-VC play (not required until you can raise on your own terms!) and clear differentiation.

If you want to go full-business-evaluation, I would be more worried about someone else implementing same thing with more commission (imo 95% and first to market is good enough).

Because they don’t have that much initial money in their pocket, while the idle computer is already there, and the biggest friction point is convincing people to install some software. Both producing rhetoric and software are several order of magnitude cheaper than to directly own and maintain a large fleet of hardware with high guarantee of getting the electrical stable input in a safe place to store them.

Assuming that getting large chunk of initial investment is just a formality is out of touch with 99% of people reality out there, when it’s actually the biggest friction point in any socio-economical endeavour.

It is too good to be true. When you see it is making more than a claude code subscription for fuck all work per day.

Prolly gonna make $50 a year tops.

Because their numbers don’t work out. When you do the math on token cost versus inference speed, you get something that barely breaks even even with cheap power.

Also they’ve already launched a crypto token, which is a terrible sign.

> These are estimates only. We do not guarantee any specific utilization or earnings. Actual earnings depend on network demand, model popularity, your provider reputation score, and how many other providers are serving the same model.

Others are reporting low demand, eg.: https://news.ycombinator.com/item?id=47789171

The numbers are obviously high, because if this takes off then the price for inference will also drop. But I still think it’s a solid economic model that benefits low income countries the most. In Ukraine, for example, I know people who live on $200/month. A couple Mac Minis could feed a family in many places.

As a business owner, I can think of multiple reasons why a decentralized network is better for me as a business than relying on a hyperscaler inference provider. 1. No dependency on a BigTech provider who can cut me off or change prices at any time. I’m willing to pay a premium for that. 2. I get a residential IP proxy network built-in. AI scrapers pay big money for that. 3. No censorship. 4. Lower latency if inference nodes are located close to me.

Power and racking are difficult and expensive?

Capital and availability?

I cant buy credits - says page could not load

It is too good to be true. When you see it is making more than a claude code subscription for fuck all work per day.

Prolly gonna make $50 a year tops.

Because their numbers don’t work out. When you do the math on token cost versus inference speed, you get something that barely breaks even even with cheap power.

Also they’ve already launched a crypto token, which is a terrible sign.

Others are reporting low demand, eg.: https://news.ycombinator.com/item?id=47789171

I some times play about with local models via ollama/comfyui and more recently ace-step to generate music.

This is short bursts of heat 5-10 m during the render I would not be happy with that for multiple hours a day. I am sure that would have a negative effect on battery health.

If you set CPUSchedulingPolicy=idle Nice=19 IOSchedulingClass=idle in the ollama server configuration it should run in the background with lowest priority.

Good. Another great non-speculative use-case for crypto and stablecoins.

Or SETI which would search for signs of alien life.

But why would I donate my Mac Studio's idle time if I couldn't "start earning"?

The system hosting the model must be one of the ends.

Remember, all encryption is E2EE if you're not picky about the ends.

Hey Gajesh! I sent you an email with some of the teething problems I ran into trying to get started as a provider. Hope it didn't end up in your spam folder!

Because the model that generated that list was trained before the M5 came out.

That was my first thought too especially for talks that aren’t particularly important like daily digests of online things

That solution actually makes great sense. So Apple won in some strange way again?

I think batch-evals for non-sensitive data has great PMF here.

> So Apple won in some strange way again?

Heh, what did they win exactly? This is just a way for another company to extract value out of the single region of the world where Apple is a relevant vendor, and it happens to be the one where it's the easiest to pull people into schemes.

Yes. They never needed to participate in the AI race to zero.

Because they were already at the finish line with Apple Silicon.

> I don’t see a reason not to use this API. The only thing that comes to me is data privacy concerns.

The whole inference is end-to-end encrypted so none of the nodes can see the prompts or the messages.

Why only Macs? If we think of all PCs and mobile phones running idle, the potential is much larger.

From the paper: https://github.com/Layr-Labs/d-inference/blob/master/papers/...

> Apple’s attestation servers will only generate the FreshnessCode for a genuine device that checks in via APNs. A software-only adversary cannot forge the MDA certificate chain (Assumption 3). Com- bined with SIP enforcement (preventing binary replace- ment) and Secure Boot (preventing bootloader tampering), this provides strong evidence that the signing key resides in genuine Apple hardware.

They use the Apple TEE which they claim also protects GPU memory (I wasn't aware of this).

NVidia data center GPUs have a similar path, but not their consumer ones. Not sure about the NVidia Spark.

It's possible AMD Strix Halo can do this, but unlikely for any other PC based GPU environments.

simple first target, PCs have more variability

That's why we don't recommend purchasing a new machine. Existing machine is no cost for you to run this.

Electricity is one cost, but it will get paid off from every request it receives. Electricity is only deducted when you run an inference. If you have any questions, DM me @gajesh on Twitter.

You're not taking into account the thermal strain on the machine, though. A machine that's 100% utilized (even worse if it's in bursts) will last less than an idle machine.

Solid q. I think the part of it is that it’s really easy to attract some “mass” (capital) of users, as there are definitely quite a few of idle Macs in the world.

Non-VC play (not required until you can raise on your own terms!) and clear differentiation.

If you want to go full-business-evaluation, I would be more worried about someone else implementing same thing with more commission (imo 95% and first to market is good enough).

Eigenlayer (which this is spun off from) is a massively VC-funded crypto company.

I think the point they’re making though is that the numbers seem too good to be true.

ie. Does anyone know the payback time for a B100 used just for inference? I assume it’s more than a couple of months? Or is it just training that costs so much?

How many of those people who could live off $200USD/month can afford or already have a mac mini in the house?

Isn't this same premise as "lets buy few GPUs to mine crypto and have passive income"? It didn't work very well and it probably won't work now either. If there is money to be made, bigger players will get in there, buy out all mac minis they can, drive price up for regular people and inevitably drive inference price down so you'll be lucky to get initial investment back

On the latency point - your requests are still going through the coordinator of the system here. So on average strictly worse than a large provider.

You - Darkbloom - Operator - Darkbloom - you, vs

You - Provider - you

---

On the censorship point - this is an interesting risk surface for operators. If people are drawn my decentralized model provisioning for its lax censorship, I'm pretty sure they're using it to generate things that I don't want to be liable for.

If anything, I could imagine dumber and stricter brand-safety style censorship on operator machines.

It's quite funny thinking about a chimpanzee seeing a lot of bananas thinking this could feed my family and then same with humans only with Mac Minis

Power and racking are difficult and expensive?

How difficult? Is running 1000 minis worth $1,000,000/month of effort? I feel like it is.

Capital and availability?

I guess if it only works at scale capital is maybe the answer. Like enough cash to buy 5 or 10 or even 100 minis seem doable - but if the idea only works well when you have 10,000 running - that makes some sense.