Machine hardware evolution is slowing down; pretty soon you'll be able to buy one big-ass server, purpose-built for AI, that could last decades.
Things like "context-based home security"? Yeah, that's just automatic, free, part of the AI system.
Everyone will talk to the AI through their phones, and it'll be connected to the house. It could hold lineage info for the family, passed down through generations, and it'll all be 100% owned, offline, for the family: a forever assistant that's just there.
I think I could vibe code the local ai security system myself.
Why would you run this on your M5 instead of a dedicated machine for it? A Jetson Orin would be faster at prefill and decode, as well as cheaper for home installation.
Seems like trying to manufacture a need from the tools. My security system's front page shows me every event that happened at my house; I don't have to interrogate it about every happenstance, and I don't see the value in that.
"Hey, my mother-in-law is coming today. She drives a blue Ford pickup. Let her in and record the car plate for future use."
"There are servicemen coming today around noon. They should check the electricity box and leave in a few minutes. Let me know if they do something else."
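Instructions like these map naturally onto structured tool calls, which is presumably what the benchmark's tool-use suites exercise. A minimal sketch in the OpenAI function-calling format (the tool name, parameters, and response arguments here are hypothetical illustrations, not from the project):

```python
import json

# Hypothetical tool definition: lets the model whitelist an expected
# vehicle from a natural-language request like the one above.
ALLOW_VEHICLE_TOOL = {
    "type": "function",
    "function": {
        "name": "allow_vehicle",
        "description": "Whitelist an expected vehicle and optionally store its plate.",
        "parameters": {
            "type": "object",
            "properties": {
                "color": {"type": "string"},
                "make_model": {"type": "string"},
                "store_plate": {"type": "boolean"},
                "note": {"type": "string"},
            },
            "required": ["color", "make_model"],
        },
    },
}

# A model's tool-call arguments for the mother-in-law request might look like:
args = json.loads('{"color": "blue", "make_model": "Ford pickup", "store_plate": true}')
print(args["make_model"])
```

Whether a local 9B model emits well-formed arguments like this reliably is exactly the kind of thing such a suite has to measure.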
I've got a 3060 myself, which is nice for playing around with the smaller models for free (minus electricity) and with 100% uptime, but I haven't yet been able to program anything with them that I didn't want to rewrite completely. A heavily quantized Qwen3.5-27B model is getting close, though. Maybe in a few months.
My intuition is that OpenClaw-like systems still make too many mistakes to be trusted with security, and that it will take months or years more until the models and harnesses are truly ready.
That's why most professional inference solutions reach for GPU-heavy hardware like the Jetson. Apple Silicon seems like a strange and overly expensive fit for this use case.
Benchmarks: https://old.reddit.com/r/LocalLLaMA/comments/1rpw17y/ryzen_a...
This is the classic issue in tech right now: it's becoming easier to build the systems, but the compliance/legal hurdles are still real, slow, and human. Even if the monitoring is best in class (which I'd argue it likely is; this is a fantastic application of AI), if the compliance isn't there it won't be a real product.
It is still incredibly impressive, of course! I just wish it were jailbroken.
Qwen3.5-9B scores 93.8% (within 4 points of GPT-5.4) running entirely on a MacBook Pro M5 at 25 tok/s with 765 ms TTFT, using only 13.8 GB of unified memory. Zero API costs. Full data privacy. All local.
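As a rough sanity check on the 13.8 GB figure: Q4_K_M quantization averages roughly 4.8 bits per weight, so the weights of a 9B model alone come to around 5.4 GB, with the remainder going to KV cache, compute buffers, and runtime overhead. These are back-of-envelope approximations, not measurements from the project:

```python
# Rough memory estimate for a Q4_K_M-quantized 9B model.
params = 9e9
bits_per_weight = 4.8  # approximate average for Q4_K_M across layers
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: ~{weights_gb:.1f} GB")
# The rest of the reported 13.8 GB is KV cache, buffers, and runtime overhead.
```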
MacBook Pro M5 · M5 Pro · 18 cores · 64 GB Unified Memory · macOS 15.3 (arm64) · llama.cpp
96-test evaluation across 15 suites covering tool use, security classification, event deduplication, and more.
| Rank | Model | Type | Passed | Failed | Pass Rate | Time |
|---|---|---|---|---|---|---|
| 1 | GPT-5.4 | Cloud | 94 | 2 | 97.9% | 2m 22s |
| 2 | GPT-5.4-mini | Cloud | 92 | 4 | 95.8% | 1m 17s |
| 3 | Qwen3.5-9B (Q4_K_M) | Local | 90 | 6 | 93.8% | 5m 23s |
| 3 | Qwen3.5-27B (Q4_K_M) | Local | 90 | 6 | 93.8% | 15m 8s |
| 5 | Qwen3.5-122B-MoE (IQ1_M) | Local | 89 | 7 | 92.7% | 8m 26s |
| 5 | GPT-5.4-nano | Cloud | 89 | 7 | 92.7% | 1m 34s |
| 7 | Qwen3.5-35B-MoE (Q4_K_L) | Local | 88 | 8 | 91.7% | 3m 30s |
| 8 | GPT-5-mini (2025) | Cloud | 60 | 36 | 62.5% | 7m 38s |
* GPT-5-mini had many failures due to the API rejecting non-default temperature values; listed for completeness only.
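The pass rates above are straightforward to recompute from the Passed/Failed columns (96 tests total); a quick check on a few rows from the table:

```python
# Recompute pass rates from the table's Passed/Failed counts.
results = {
    "GPT-5.4": (94, 2),
    "Qwen3.5-9B (Q4_K_M)": (90, 6),
    "GPT-5-mini (2025)": (60, 36),
}

for model, (passed, failed) in results.items():
    total = passed + failed
    rate = 100 * passed / total
    print(f"{model}: {passed}/{total} = {rate:.1f}%")
```

For example, 90/96 works out to 93.75%, which the table rounds to 93.8%.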
The Qwen3.5-35B-MoE has a lower TTFT than all OpenAI cloud models: 435 ms vs 508 ms for GPT-5.4-nano.
A benchmark we created to evaluate LLMs on real home security assistant workflows: not generic chat, but the actual reasoning, triage, and tool use an AI home security system needs.
All 35 fixture images are AI-generated (no real user footage). Tests run against any OpenAI-compatible endpoint.
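Because the harness only assumes an OpenAI-compatible endpoint, pointing it at a local llama.cpp server versus a cloud API is essentially a base-URL change. A rough sketch of the request shape (the endpoint path and field names follow the standard chat-completions convention; the URL, model name, and prompt are placeholders, not values from the project):

```python
import json

# Sketch of an OpenAI-compatible chat-completions request. A local
# llama.cpp server commonly serves this API at http://localhost:8080/v1.
BASE_URL = "http://localhost:8080/v1"  # placeholder; any compatible endpoint works

payload = {
    "model": "qwen3.5-9b-q4_k_m",  # often ignored by single-model local servers
    "messages": [
        {"role": "system", "content": "You are a home security assistant."},
        {"role": "user", "content": "A blue Ford pickup arrived at 12:03. Expected?"},
    ],
    "temperature": 0,  # deterministic output helps benchmark reproducibility
}

body = json.dumps(payload)
print(f"POST {BASE_URL}/chat/completions ({len(body)} bytes)")
```

The footnote about GPT-5-mini rejecting non-default temperature values is a reminder that even "compatible" endpoints differ in which parameters they accept.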
Watch the benchmark suite execute live on Apple Silicon, with every test visible in real time.
A 9B model on a laptop scoring within 4 points of GPT-5.4 on domain tasks, fully offline with complete privacy, is the value proposition of local AI.
Download Aegis Benchmark on GitHub
System: Aegis-AI – Local-first AI home security on consumer hardware.
Benchmark: HomeSec-Bench – 96 LLM + 35 VLM tests across 16 suites.
Skill Platform: DeepCamera – Decentralized AI skill ecosystem.