Sandboxing AI Agents in Linux

I use Leash [1] [2] for sandboxing my agents (to great effect!). I've been very happy with it: it provides strict policy-level control over all process-level and network-level activity, as well as full visibility and dynamic runtime controls via a WebUI. Way better than bubblewrap imo.

I originally saw it here on HN and have been hooked ever since.

[1] Screenshot: https://camo.githubusercontent.com/99b9e199ffb820c27c4e977f2...

[2] https://github.com/strongdm/leash

Fun fact: Do you know what container / sandboxing system is in most widespread use? Not docker containers, certainly not bubblewrap, and not even full VMs or firecracker. It's Chrome tabs.

I've started using a container (podman) which is just for the AI tools. I start it up for Codex etc. and give it access to the appropriate code directory outside the container.

Anyone else using this approach? Ideas on improvements?
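A minimal sketch of what such a setup could look like, assuming rootless podman. The helper name `podman_agent_cmd`, the `/work` mount point and the image name in the usage line are made up for illustration; the podman flags themselves are standard `podman run` options. The helper only builds the command so it can be inspected before running it.

```shell
#!/usr/bin/env bash
# podman_agent_cmd (hypothetical helper): fill the global array `cmd` with a
# podman invocation that exposes only one project directory to the container.
podman_agent_cmd() {
    local project_dir=$1 image=$2
    shift 2
    cmd=(podman run --rm -it
         --userns=keep-id                  # rootless: keep your UID on files in the mount
         --cap-drop=all                    # drop all Linux capabilities
         --security-opt no-new-privileges  # block privilege gain via setuid binaries
         -v "$project_dir:/work:Z"         # the only host directory exposed (Z = SELinux relabel)
         -w /work
         "$image" "$@")
}

# Usage (image name is an assumption):
#   podman_agent_cmd "$PWD" localhost/agent-tools codex
#   "${cmd[@]}"
```

Anything not mounted with `-v` stays invisible to the agent, so home-directory secrets are out of reach unless you add them explicitly.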

This is the way to go! On my side I've built a very small `claude-vm` wrapper to run each instance in a VM with Lima: https://github.com/sylvinus/agent-vm

I just have an unprivileged secondary local account and do ssh dummy@localhost.

Is this wrong?

I will ask what I've asked before: how to know what resources to make available to agents and what policies to enforce? The agent behavior is not predefined; it may need access to a number of files & web domains.

For example, you said:

> I don't expose entire /etc, just the bare minimum

How is "bare minimum" defined?

> Inspecting the log you can spot which files are needed and bind them as needed.

This requires manual inspection.

I don't know if I want to create an ad-hoc list of permissions. What I would like is something like this: take a snapshot of my current workspace in a VM, run claude there and let it go wild, then kill the box at the end of the session. The only downside is potentially syncing the claude sessions/projects. But I don't think that'd be too difficult.
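The "throwaway snapshot" part of this idea can be sketched without a VM. To be clear, the following is only a plain directory copy and provides no isolation by itself; it would have to be combined with a VM, container or bwrap to be a sandbox. The helper name `run_in_throwaway_copy` is hypothetical.

```shell
#!/usr/bin/env bash
# run_in_throwaway_copy (hypothetical helper): copy a workspace to a temp
# directory, run a command inside the copy, then delete the copy.
# NOTE: this isolates nothing except the workspace files themselves.
run_in_throwaway_copy() {
    local src=$1
    shift
    local scratch
    scratch=$(mktemp -d) || return 1
    # "src/." copies the directory contents, including dotfiles.
    cp -a "$src/." "$scratch/" || { rm -rf "$scratch"; return 1; }
    # Subshell so the caller's working directory is left untouched.
    ( cd "$scratch" && "$@" )
    local status=$?
    rm -rf "$scratch"
    return "$status"
}

# Usage: run_in_throwaway_copy ~/src/myproject claude
```

Whatever the command changes is discarded along with the copy, so anything worth keeping (for example session data) has to be copied out before the function returns.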

If you have ssh installed, with network access it can ssh localhost to escape the sandbox.

I like this approach for Nix: https://dev.to/andersonjoseph/how-i-run-llm-agents-in-a-secu... It makes it also easy to give the agent only access to the tools it actually needs.

As a heads up and affirmation that the approach is correct, here's a small shell bubblewrap wrapper that boils the command line down to `sandbox-run claude --dangerously-skip-permissions`.

https://github.com/sandbox-utils/sandbox-run

I'm launching a SaaS to create yet another solution to the AI Sandboxing problem in linux.

My friends and I have spent a lot of time quietly injecting support down into the kernel without anybody raising a flag, and we finally have the infrastructure in place to solve this problem.

We have also poisoned all the LLMs' training data with our approach, so our marketing is primed and we won't even need to learn Claude to use our tool.

We’re planning a soft launch this month, or maybe next month. Depending on how "in the vibe" (our new word for flow :) our team gets.

We’re calling it `useradd`.

Yes, the man page is intimidating, and the documentation is terrible. But once you're over the learning curve, it puts your machine into a kind of 'main frame' mode where multiple 'virtual teletypes' and users can operate on the same machine.

DM me if you want a beta key.

---

Sorry for the snark, but I cringe at the monuments to complexity I see people building. At least this solution is relatively simple and free. Still, I don't really see what it buys me.

Would love this for MacOS

Saw something last week on HN using bubblewrap as well: github.com/Use-Tusk/fence

Really well targeted!

I'd been thinking of using toolbox or devcontainers going forward, but having to craft containers with all my stuff sounds so painful; it feels like making containers would become another full-time job.

Bubblewrap & passing in a bunch of the current system sounds like a great compromise!

I do wonder what isolation something like systemd-run can offer, if that is enough.

Part #2 for me: I also want observability into what the agent changed. That is one place where containers are such a clear & huge advantage! Having an overlay that contains the changes to the filesystem is so explicit. There are also projects like agentfs, which offers a FUSE filesystem backed by Turso DB (SQLite-compatible).
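Without an overlay filesystem, a rough approximation of "what did the agent change" is to checksum the tree before and after a session and compare the two manifests. This is a minimal sketch using GNU coreutils and findutils, not how containers or agentfs do it; the helper names `snapshot_tree` and `changed_paths` are made up.

```shell
#!/usr/bin/env bash
# snapshot_tree DIR: print "checksum  path" for every regular file under DIR
# (skipping .git), sorted by path.
snapshot_tree() {
    ( cd "$1" && find . -type f -not -path './.git/*' -print0 \
        | sort -z | xargs -0 -r sha256sum )
}

# changed_paths BEFORE AFTER: list paths that were added, removed or modified.
changed_paths() {
    # A line present in only one manifest means the file differs between the
    # snapshots. Strip the checksum column, then collapse duplicate paths
    # (a modified file appears once per manifest).
    sort "$1" "$2" | uniq -u | sed 's/^[0-9a-f]*  //' | sort -u
}

# Usage (store the manifests outside the project directory):
#   snapshot_tree "$PWD" > /tmp/before.sha
#   ...run the agent...
#   snapshot_tree "$PWD" > /tmp/after.sha
#   changed_paths /tmp/before.sha /tmp/after.sha
```

For a project already tracked in git, `git status --porcelain` gives similar information for free; the manifest approach also covers untracked and ignored files.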

Ask the agent to bubblewrap itself

`useradd` doesn't restrict network access.

Well done. It took me all the way up to `useradd`...

Edit: too bad about your edit. The comment was just fine without it.

I have used a separate user, but lately I have been using rootless podman containers instead for this reason. But I know too little about container escapes. So I am thinking about a combination.

Would a podman container run by a separate user provide any benefit over the two by themselves?

That's interesting, how does Chrome implement "sandboxing" in Windows and MacOS? For Linux, does it use the same underlying technology as Docker, Podman, LXD, LXC (cgroups, namespaces...)?

Or is it a custom "sandboxing" implementation not relying on system-level functions (e.g. a VM with restricted functions)?

If the latter, I wonder if something like JRE or .NET CLR is still out there in larger numbers, but obviously, Chrome does have billions of users.

Using Chrome for anything seems like a security failure in itself. It's got great features, but damn do they come at a cost.

> certainly not bubblewrap,

Eh, it might be bubblewrap given it's what flatpak uses.

You can consider these agents criminals, or treat them like babies. Both can do harm for a while, but one offers a future.

Don't give it access to your ssh keys!

`ssh localhost` doesn't work for me. Maybe because I have enabled only key-based ssh and my user key is not in authorized_keys? Am I missing something?

Yes, Chromium has "native" sandboxing on all those platforms: Windows [0], Linux [1] and MacOS [2].

Chromium uses both seccomp filtering as well as user namespaces (the technology that Docker/Podman use).

The Windows and MacOS sandboxing strategies are more "interesting" because I've seen very few (open source) programs that use those APIs as extensively as Chromium. On Windows, it makes use of AppContainer [3] (among other things), while on MacOS it uses the sparsely documented sandbox API [4], which I think was based on code from TrustedBSD?

[0] https://chromium.googlesource.com/chromium/src/+/HEAD/docs/d...

[1] https://chromium.googlesource.com/chromium/src/+/HEAD/sandbo...

[2] https://www.chromium.org/developers/design-documents/sandbox...

[3] https://learn.microsoft.com/en-us/windows/win32/secauthz/app...

[4] https://manp.gs/mac/7/sandbox

Yes, it should have its own dedicated key instead of sharing one of your own.

You are right in that it would still need to authenticate.
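One way to check this on your own setup is to attempt a non-interactive login from inside the sandbox and see whether it succeeds. A minimal sketch: the helper name `can_ssh_localhost` and the `SSH_BIN` override (there only so the function can be exercised without a real ssh) are assumptions, while the `-o` options are standard OpenSSH client options.

```shell
#!/usr/bin/env bash
# can_ssh_localhost (hypothetical helper): succeed only if a login to
# localhost works without any prompt, i.e. with whatever keys or agent
# socket the current process can reach.
can_ssh_localhost() {
    local ssh_bin=${SSH_BIN:-ssh}
    # BatchMode=yes disables password/passphrase prompts, so a missing key
    # means failure. Host key checking is relaxed because this only tests
    # whether authentication would succeed.
    "$ssh_bin" -o BatchMode=yes -o ConnectTimeout=5 \
        -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
        localhost true 2>/dev/null
}

# Usage, run from inside the sandbox:
#   if can_ssh_localhost; then echo "escape possible"; else echo "login refused"; fi
```

If this succeeds from inside the sandbox, the agent can reach a key or agent socket that is accepted by the host, and the sandbox boundary is effectively gone.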

Like many developers, I find myself more and more using AI agents to help with software development.

I currently use Claude Code, the command line interface, together with Opus 4.5 (Anthropic's top model as of this writing). I use it to distill my rough task requirements into a detailed development plan, then implement the plan.

By default, Claude Code asks each time if it may read and write files and run software. This is a sensible default configuration, but it does get annoying after a time. Worse, it interrupts me often enough that I can't do much in parallel while babysitting it.

There's also a --dangerously-skip-permissions (a.k.a. “YOLO”) mode which will happily run anything without asking. This can be risky (although I know of some people that run it like that and still haven't destroyed their dev machines).

Sandboxing

The standard solution is to sandbox the agent – either on a remote machine (exe.dev, sprites.dev, daytona.io), or locally via Docker or another virtualization mechanism.

A lightweight alternative on Linux is bubblewrap, which uses Linux kernel namespaces (user, mount, PID and so on) to limit (jail) a process.

As it turns out, bubblewrap is a good solution for lightweight sandboxing of AI agents. Here's what I personally need from such a solution:

mimic my regular Linux dev machine setup (I don't want to manage multiple dev environments)
minimal/no access to information outside what's required for the current project
write access only to the current project
can directly operate on the files/folders of the project so I can easily inspect or modify the same files from my IDE or run the code myself
network access – both to connect to AI providers and search the internet, and to be able to start a server that I can connect to

Bubblewrap and Docker are not hardened security isolation mechanisms, but that's okay with me. I'm not really concerned about the following risks:

escape via zero-day Linux kernel bug
covert side channel communications
exfiltration of data from current project (including project-specific access keys)
screwing up the codebase (the code is managed via git and backed up at GitHub or elsewhere)

Exfiltration is the tricky one, but even full remote sandboxes can't protect against it. In theory, we could have transparent API proxies that would inject proper access keys without the AI agent ever being aware of it, but this is really non-trivial to set up right now.

An alternative is to contain potential damage by creating project-specific API keys so at least the blast radius is minimal if those keys are leaked.
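A related precaution is to make sure the agent's environment contains only that project-specific key and none of the other secrets exported in your shell. A minimal sketch using `env -i`: the helper name `run_with_project_key`, the key file and the variable name `PROJECT_API_KEY` are all hypothetical, so substitute whatever your provider's tooling expects.

```shell
#!/usr/bin/env bash
# run_with_project_key KEY_FILE CMD...: start CMD with a scrubbed environment
# that holds only a few basics plus one project-scoped key.
run_with_project_key() {
    local key_file=$1
    shift
    local key
    key=$(<"$key_file") || return 1
    # env -i starts from an empty environment; only the variables listed
    # here are visible to the command and its children.
    env -i PATH="$PATH" HOME="${HOME:-/}" TERM="${TERM:-dumb}" \
        PROJECT_API_KEY="$key" "$@"
}

# Usage (paths are assumptions):
#   run_with_project_key ~/.keys/myproject.key ./sandbox.sh
```

This does not stop the agent from leaking the key it was given; it only ensures there is nothing else in the environment to leak.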

In practice

Here's how my bubblewrap sandbox script looks:

#!/usr/bin/bash

# Open the real config on fd 3; bwrap copies its content into the sandbox (see --file below)
exec 3<"$HOME/.claude.json"

exec /usr/bin/bwrap \
    --tmpfs /tmp \
    --dev /dev \
    --proc /proc \
    --hostname bubblewrap --unshare-uts \
    --ro-bind /bin /bin \
    --ro-bind /lib /lib \
    --ro-bind /lib32 /lib32 \
    --ro-bind /lib64 /lib64 \
    --ro-bind /usr/bin /usr/bin \
    --ro-bind /usr/lib /usr/lib \
    --ro-bind /usr/local/bin /usr/local/bin \
    --ro-bind /usr/local/lib /usr/local/lib \
    --ro-bind /opt/node/node-v22.11.0-linux-x64/ /opt/node/node-v22.11.0-linux-x64/ \
    --ro-bind /etc/alternatives /etc/alternatives \
    --ro-bind /etc/resolv.conf /etc/resolv.conf \
    --ro-bind /etc/profile.d /etc/profile.d \
    --ro-bind /etc/bash_completion.d /etc/bash_completion.d \
    --ro-bind /etc/ssl/certs /etc/ssl/certs \
    --ro-bind /etc/ld.so.cache /etc/ld.so.cache \
    --ro-bind /etc/ld.so.conf /etc/ld.so.conf \
    --ro-bind /etc/ld.so.conf.d /etc/ld.so.conf.d \
    --ro-bind /etc/localtime /etc/localtime \
    --ro-bind /usr/share/terminfo /usr/share/terminfo \
    --ro-bind /usr/share/ca-certificates /usr/share/ca-certificates \
    --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf \
    --ro-bind /etc/hosts /etc/hosts \
    --ro-bind /etc/ssl/openssl.cnf /etc/ssl/openssl.cnf \
    --ro-bind /usr/share/zoneinfo /usr/share/zoneinfo \
    --ro-bind "$HOME/.bashrc" "$HOME/.bashrc" \
    --ro-bind "$HOME/.profile" "$HOME/.profile" \
    --ro-bind "$HOME/.gitconfig" "$HOME/.gitconfig" \
    --ro-bind "$HOME/.local" "$HOME/.local" \
    --bind "$HOME/.claude" "$HOME/.claude" \
    --bind "$HOME/.cache" "$HOME/.cache" \
    --file 3 "$HOME/.claude.json" \
    --bind "$PWD" "$PWD" \
    claude --dangerously-skip-permissions "$@"

If this looks rather idiosyncratic, it's because it is. Rather than using some generic rules, I experimented with bwrap until I found the minimal configuration that I need for my system.

Some interesting stuff:

/tmp, /proc and /dev are automatically handled by bwrap
I bind-mount (i.e. expose) files and directories under the same path as on the local machine, so there's no difference in file locations, project paths, etc
I don't expose entire /etc, just the bare minimum
The content of $HOME/.claude.json is injected into the sandbox so any changes there won't get saved to the real one
The content of $HOME/.claude/ directory is mapped read-write, so Claude can save and modify files there (such as session data)
/opt/node/node-v22.11.0-linux-x64/ is my custom nodejs install location
I change the hostname so it's easy to distinguish between the host and sandbox

I will probably be tweaking the script as needed, but this is a pretty good starting point for me.

How to customize

If you want to adapt this to another AI agent or to your system, my suggestion is to tweak the script to run bash instead, then run your agent manually, see what breaks and tweak as appropriate.
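One way to make that switch without editing the script each time is to choose the inner command from an environment variable. This is a sketch only: `SANDBOX_SHELL` and `build_inner_cmd` are names invented here, not something bwrap or Claude Code knows about.

```shell
#!/usr/bin/env bash
# build_inner_cmd ARGS...: set the global array `inner_cmd` to either an
# interactive shell (for debugging the sandbox) or the agent itself.
build_inner_cmd() {
    if [[ ${SANDBOX_SHELL:-} == 1 ]]; then
        inner_cmd=(bash)
    else
        inner_cmd=(claude --dangerously-skip-permissions "$@")
    fi
}

# In the sandbox script, the final line of the bwrap invocation would become:
#   build_inner_cmd "$@"
#   exec /usr/bin/bwrap ...same options as before... "${inner_cmd[@]}"
```

Running the script with `SANDBOX_SHELL=1` then drops you into bash inside the same sandbox, where you can start the agent by hand and watch what fails.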

A useful command for this is strace, which can trace file access system calls so you can see what's needed:

strace -e trace=open,openat,stat,statx,access -o /tmp/strace.log codex

Inspecting the log you can spot which files are needed and bind them as needed.
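The manual inspection can be shortened by pulling out only the paths that the process looked for and did not find, since those are the likeliest candidates for a missing bind. A minimal sketch, assuming the usual strace line format where a failed call ends in `= -1 ENOENT (...)`; the helper name `missing_paths` is made up. It may also help to add `-f` to the strace command so child processes are traced too; the function below tolerates the PID prefix that adds.

```shell
#!/usr/bin/env bash
# missing_paths LOG: list the unique paths from an strace log whose syscall
# failed with ENOENT (file or directory not found).
missing_paths() {
    # Keep only failed lookups, then extract the first double-quoted
    # argument, which is the path for open/openat/stat/statx/access.
    grep -F 'ENOENT' "$1" \
        | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' \
        | sort -u
}

# Usage: missing_paths /tmp/strace.log
```

Not every missing path needs a bind: programs routinely probe for optional files (for example `/etc/ld.so.preload`), so treat the output as a list of candidates rather than a to-do list.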