I've got scraps of paper and post-it notes, and I just throw them away after they've been sitting around for a while and I've forgotten what they're about.
The premise of Atomic, the knowledge base project I'm currently working on, is that there is still significant value in vectors, even in an agentic context. https://github.com/kenforthewin/atomic
Isn’t the biggest benefit of graph databases the indexing and additional query constructs they support, like shortest path finding and whatnot?
I've started to think a fine-tuned model may be needed, specifically for something like "journal data retrieval". Is anyone aware of any existing models for this? I'd do it myself, but since I'm unwilling to send larger parts of my data to third parties, I'm struggling to collect actual data I could use for fine-tuning, leaving me in a bit of a catch-22.
For some client projects I've experimented with the same idea too, with fewer restrictions, and I guess one valuable lesson is that letting LLMs write docs and add them to a "knowledge repository" tends to end up with a mess. The best success we've had is limiting the LLM's job to organizing and moving things around, never adding its own written text. Quality seems to slowly degrade as the context fills up with the model's own text, compared to when it relies only on human-written notes.
Clicked the link expecting to see some tool or method that actually allows graph-like queries and traversals on files in a file system, all I found was some rant about someone on the internet being wrong.
Waste of time.
It can only handle three-way cross references by using 2 folders and a file for now (meta), and it's very verbose on disk (needs type=small, otherwise inodes run out before disk space)... but it's incredibly fast and practically unstoppable in read uptime!
Also, the simplicity of using text and the file system sort of guarantees longevity and stability, even if most people like the monolithic garbled mess that is relational databases' binary table formats...
Anyway, why care how the data is stored? You need a catalog. You need an index. You need automation. It helps keep order and helps with inevitable changes and flips and pivots and whims and trends and moods and backups and restoration and snapshots and history and versioning and moon travels and collaboration and compatibility and long summer evening walks and portability.
Folders give you hierarchical categories.
You still want tags for horizontal grouping. And links and references for precise edges.
But that gives you a really nice foundation that should get you pretty damn far.
I also now tell the LLM to add a summary as the first section if the file is longer.
In somewhat of an inversion, I've been getting the initial naming done by an LLM (well, I was, until CoPilot imposed file upload limits and the new VPN blocked access to it) --- for want of that, I just name each scan by Invoice ID, then use a .bat file made by concatenating columns in a spreadsheet to rename them to the initial state ready for entry.
I'm currently experimenting with Tobi's QMD (https://github.com/tobi/qmd) to see how it performs with local models only on my Obsidian vault.
> I'm struggling collecting actual data I could use for fine-tuning myself,
Journalling or otherwise writing is by far the best way to do this IMO but it doesn't take very much audio to accurately do a voice-clone. The hard thing about journalling is that it can actually be really biased away from the actual "distribution" of you, whether it's more aspirational or emotional or less rigorous/precise with language.
What I'm starting to do is save as many of my prompts as possible, because I realized a lot of my professional writing was there and it was actually pretty valuable data (especially paired with outputs and knowledge of what went well and what didn't) for fine-tuning on my own workloads. Second is assembling/curating a collection of tools and products that I can drop into each new context with LLMs and also use for fine-tuning them on my own needs. Unlike "knowledge repositories", these both accurately model my actual needs and work, and don't really require me to do anything unnatural.
The other thing I'm about to start doing is "natural" in a certain sense but kinda weird, basically recording myself talking to my computer (verbalizing my thoughts more so it can be embedded alongside my actions, which may be much sparser from the computer's perspective) / screen recordings of my session as I work with it. This is something I've had to look into building more specialized tools for, because it creates too much data to save all of it. But basically there are small models, transcoding libraries, and pipelines you can use for audio/temporal/visual segmentation and transcription to compress the data back down into tokens and normal-sized images.
This is basically creating a semantic search engine of yourself as you work, kinda weird, but IMO it's just much weirder that your computer can actually talk back and learn about you now. With 96GB you can definitely do it BTW. I successfully finetuned an audio workload on gemma 4 2b yesterday on a 16GB mac mini. With 96GB you could do a lot.
> letting LLMs write docs and add them to a "knowledge repository"
I think what you actually want them to do is send them to go looking for stuff for you, or actively seeking out "learning" about something like that for their own role/purposes, so they can embed the useful information and better retrieve it when they need it, or produce traces grounded in positive signals (eg having access to this piece of information or tool, or applying this technique or pattern, measurably improves performance at something in-distribution to whatever you have them working on) they can use in fine-tuning themselves.
https://neo4j.com/docs/graph-data-science/current/algorithms...
My “knowledge” is spread out on various SaaS (Google, slack, linear, notion, etc). I don’t see how I can centralize my “knowledge” without a lot of manual labour.
Deep inside a project dir, it feels like some of the ease of LLMs is just not having to cd into the correct directory, but you shouldn't need an LLM for that. I'm gonna try setting up some aliases like "auto cd to wherever foo/main.py is" and see how that goes.
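For the curious, here's a minimal sketch of that "auto cd to wherever foo/main.py is" idea. The script and alias names are made up for illustration; it just prints the directory of the first file whose path ends with the given suffix:

```python
#!/usr/bin/env python3
"""Hypothetical helper: print the directory containing the first file
whose relative path ends with the given suffix (e.g. foo/main.py)."""
import os
import sys

def dir_of(path_suffix, root="."):
    # Walk the tree; return the directory of the first matching file.
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if os.path.join(dirpath, name).endswith(path_suffix):
                return dirpath
    return None

if __name__ == "__main__" and len(sys.argv) > 1:
    target = dir_of(sys.argv[1])
    if target is None:
        sys.exit(1)
    print(target)
```

Paired with a shell function like `cdto() { cd "$(python3 cdto.py "$1")"; }` (again, hypothetical names), that gets you most of the way without an LLM in the loop.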
Maybe that's why mind maps never spoke to me. I felt that a tree structure (or even planar graphs) was not enough to cover any sufficiently complex topic.
1. Why does AI need that folder structure? Why not a flat list of files and let the AI agent explore with BM25 / grep, etc.
2. pre-compute compression vs compute at query time.
Karpathy (and you) are recommending pre-compressing and sorting the data into human-friendly buckets and language, based on hard-coded human opinions about abstractions that may or may not match how the data will actually be queried.
Why not just let the AI calculate this at run time? Many of these use cases have very few files and for a low traffic knowledge store, it probably costs less tokens if you only tokenize the files you need.
Now just need to find a good way to maintain the order...
You may want to do as described and link to Slack messages (etc), but just remember any external link should be treated as ephemeral. You may not have access to the Slack anymore, for example. That may mean you don't need that note either, or it may mean you lost access to a node on your knowledge graph, you have to determine whether that matters.
By starting now, at least everything going forward is captured in a way you can both own and utilize it. Then it may be a bit of a pain and some manual work to get existing notes into your tool of choice, but you can determine what needs to be in there from other tools as you go forward.
Progressive disclosure: the same reason you don't get assaulted with all the information a website has to offer at once, or handed a SQL console and told to figure it out, and instead see a portion of the information in a way that naturally leads you to the next and next bits of information you're looking for.
> use cases
This is essentially just where you're moving the hierarchy/compression, but at least for me these are not very disjoint and separable. I think what I actually want are adaptable LoRAs that loosely correspond to these use cases, but where a dense discriminator or other system is able to adapt and stay in sync with them too. Also, tool-calling + SQL/vector embeddings, so that you can actually get good filesystem search without it feeling like work, and let the model filter out the junk.
> let the AI calculate this at run time?
You still do want to let it do agentic RAG but I think more tools are better. We're using sqlite-vec, generating multimodal and single-mode embeddings, and trying to make everything typed into a walkable graph of entity types, because that makes it much easier to efficiently walk/retrieve the "semantic space" in a way that generalizes. A small local model needs at least enough structure to know these are the X ways available to look for something and they are organized in Y ways, oriented towards Z and A things.
Especially on-device, telling them to "just figure it out" is like dropping a toddler or autonomous vehicle into a dark room and telling them to build you a search engine lol. They need some help and also quite literally to be taught what a search engine means for these purposes. Also, if you just let them explore or write things without any kind of grounding in what you need/any kind of positive signals, they're just going to be making a mess on your computer.
Two reasons I think:
Coding agents simulate similar things to what they have been trained on. Familiarity matters.
And they tend to do much better the more obvious and clear a task is. The more they have to use tools or "thinking", the less reliable they get.
Today, I can use even the small models from OpenAI and Anthropic to get valuable sessions, but if I wanted to actually use those for fine-tuning a local model, I'd need to start sending the data I want to use for fine-tuning to OpenAI and Anthropic, and considering it's private data I'm not willing to share, that's a hard no.
So then my options are basically using stronger local models so I get valuable sessions I can use for fine-tuning a smaller model. But if those "stronger local models" actually worked in practice to give me those good sessions, then I'd just use those, but I'm unable to get anything good enough to serve as a basis for fine-tuning even from the biggest ones I can run.
I'd love to "send them to go looking for stuff for you", but local models aren't great at this today, even with beefy hardware, and since that's about my only option, that leaves me unable to get sessions to use for the fine-tuning in the first place.
Which is great, but on all major OSes you'd eventually hit performance issues with flat directories like this. Might not be an issue in month one, or even year one, but after 10 years of note taking/journaling, that approach will run into the problems of large flat directories.
So eventually you'd need to shard it somehow, so you might as well start categorizing/sorting things from the get-go, at least into some broad major categories, because doing it once you already have 10K entries in a directory sucks big time.
It doesn't. The human creating the files needs it, to make it easier to traverse in future as the file count grows. At 52k files, that's a horrendous list to scroll through to find the thing you're looking for. Meanwhile, an AI can just `find . -type f -exec whatever {} \;` and be able to process it however it needs. Human doesn't need to change the way they work to appease the magic rock in the box under the desk.
Do you still have your prompt, by chance, and are you willing to share it? I took a stab at this and it didn't want to make many changes. I think I need to be more specific, but I'm not sure how to do that in a general way.
If real organization is needed, it seems like that'd be easier in hindsight than with foresight.
and then worked from there, giving feedback on the proposed folder structure, until I was happy
Basically you need a squad of specialized models to do this in a mostly-structured way that ends up looking kind of like a crawling or scraping/search operation. I can share a stack of about 5-6 that are working for us directly if you want, I want to keep the exact stack on the DL for now but you can check my company's recent github activity to get an idea of it. It's basically a "browser agent" where gemma or qwen guide the general navigation/summarization but mostly focus on information extraction and normalization.
The other thing I've done, which obviously not everybody is going to want to do, is create emails and browser profiles for the browser agent (since they basically work when I'm not on the computer, but need identity to navigate the web) and run them on devices that don't have the keys to the kingdom. I also give them my phone number and their own (via an endpoint they can only call me from). That way if they run into something they have a way to escalate it, and I can do limited steering out of the loop. Obviously this is way more work than is reasonable for most people right now though so I'm hoping to show people a proper batteries-included setup for it soon.
Edit: Based on your other comment, I think maybe what you're really looking for most are "personal traces". Right now that's something we're working on with https://github.com/accretional/chromerpc (which uses the lower-level Chrome DevTools Protocol rather than Puppeteer to basically fully automate web navigation, either through an LLM or prescriptive workflows). It would be very simple to set up automation to take a screenshot and save it locally every Xm or in response to certain events, and generate traces for yourself that way, if you want. That alone provides a pretty strong base for a personal dataset.
Symbolic links can form a graph, and you can process them as needed using readlink etc. to traverse the graph, but they'll still be considered broken if they form a cycle.
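A cycle-safe traversal along those lines might look like this. A sketch only, using Python's `os.readlink` to follow the link chain with a visited set, since the OS itself would report a cyclic chain as broken (ELOOP):

```python
import os

def walk_link_graph(start, max_hops=64):
    """Follow a chain of symlinks from `start`, collecting each node.
    A visited set makes traversal cycle-safe even though open() on a
    cyclic link chain would fail with ELOOP."""
    path = os.path.abspath(start)
    seen = {path}
    chain = [path]
    while os.path.islink(path) and len(chain) <= max_hops:
        target = os.readlink(path)
        # readlink may return a relative target; resolve against the link's dir
        path = os.path.abspath(os.path.join(os.path.dirname(path), target))
        if path in seen:
            break  # cycle detected: stop instead of erroring like open() would
        seen.add(path)
        chain.append(path)
    return chain
```

The same visited-set idea generalizes to wikilink graphs or any other edges you layer on top of the file system.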
why? The human would just talk to the AI agent. Why would they need to scroll through that many files?
I made a similar system with 232k files (one file might be a Slack message, GitLab comment, etc). It does a decent job at answering questions with only keyword search, but I think I can get better results with RAG+BM25.
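For anyone wanting to try the BM25 half, the scoring function itself is only a few lines. This is a plain Okapi BM25 sketch (with the usual k1=1.5, b=0.75 defaults), scoring tokenized docs against query terms:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25: score each doc (a list of tokens) against query terms."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency per term
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

In practice you'd pair this with an inverted index rather than scoring every file, but at a few hundred thousand short docs even the naive version is workable.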
Just because AI exists doesn't mean we can neglect basic design principles.
If we throw everything out the window, why don't we just name every file as a hash of its content? Why bother with ASCII names at all?
Fundamentally, it's the human that needs to maintain the system and fix it when it breaks, and that becomes significantly easier if it's designed in a way a human would interact with it. Take the AI away, and you still have a perfectly reasonable data store that a human can continue using.
Sure, but what I'm talking about is that the current SOTA local models are terrible even for specialized small use cases like what you describe, so you can't just throw a local model at that task and get useful sessions out of it that you can use for fine-tuning. If you want distilled data or similar, you (obviously) need to use a better model, but currently there is none that provides the privacy guarantees I need, as described earlier.
All of those things come once you have something suitable for the individual pieces, but I'm trying to say that none of the current local models come close to solving the individual pieces, so all that other stuff is just distraction before you have that in place.
But again, you're missing my point :) I cannot, since the models I could generate useful traces from are run by platforms I'm not willing to hand over very private data to, and local models that I could use I cannot get useful traces from.
And I'm not holding out hope for agent orchestration; people haven't even figured out how to reliably get high-quality results from one agent yet, even less so from a fleet of them. Better to realistically temper your expectations a bit :)
Karpathy recently posted about using LLMs to build personal knowledge bases — collecting raw sources into a directory, having an LLM “compile” them into a wiki of interlinked markdown files, and viewing the whole thing in Obsidian. He followed it up with an “idea file,” a gist you can hand to your agent so it builds the system for you.
This is a great idea, and I’ve been doing some form of it for over a decade. My Staff Eng co-host @davidnoelromas reached out after the tweet to ask for more details on how I’ve been using Obsidian and AI. This is an expanded version of what I told him.
I’ve collected possibly too many markdown files.
```
find . -type f | wc -l
52447
```
That’s my obsidian vault, and I use it with AI everyday without a special database, or a vector store, or a RAG pipeline. It’s merely files on disk.
Think about the context you carry around in your head for your job. The history of decisions on a project. What you discussed with your manager three months ago. The Slack thread where the team landed on an approach. The Google Doc someone shared in a meeting you half-remember. The slowly evolving understanding of how a system works that lives across fifteen people’s heads and nowhere else.
Now think about what happens when you need to produce something from all that context. A design doc. A perf packet. A project handoff. An onboarding guide for a new team member. You spend hours reassembling context from Slack, docs, emails, your own memory, and you still miss things.
The knowledge base turns this into a system instead of a scramble.
A file system with markdown and wikilinks is already a graph database. Files are nodes. Wikilinks are semantic edges. Folders introduce taxonomy. You don’t need a special MCP server or plugin. The file system abstraction is the interface, and LLMs are surprisingly good at navigating it.
I use a structure borrowed from Tiago Forte’s Building a Second Brain, with the PARA taxonomy as a starting point, extended with categories that match how I actually work:
```
/projects/{name}
/areas/{topics}
/people/{slack_handle}
/daily/{year}/{month}/{day}/
/meetings/{year}/{month}/{day}/
```
Markdown files are nodes, wikilinks ([[target]]) are edges, the folder taxonomy is the schema, and the LLM is the query engine. A graph database with a natural language query interface. No infrastructure required.
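To make "files are nodes, wikilinks are edges" concrete, here's a minimal sketch that extracts the edge list from a vault. The regex and function name are my own, not part of Obsidian or any tool mentioned here; it handles the `[[target]]` and `[[target|label]]` forms:

```python
import os
import re

# Match the target of [[target]] or [[target|label]] wikilinks.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def build_graph(vault_root):
    """Map each markdown file (node) to the wikilink targets (edges) it references."""
    graph = {}
    for dirpath, _dirs, files in os.walk(vault_root):
        for name in files:
            if not name.endswith(".md"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                text = f.read()
            node = os.path.relpath(path, vault_root)
            graph[node] = [m.strip() for m in WIKILINK.findall(text)]
    return graph
```

Once you have that adjacency map, backlinks, orphan detection, and simple traversals are dictionary operations, no graph database required.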
After every meeting, the agent creates a note in daily/{year}/{month}/{day}/, downloads any attached Google Docs, and links everything to the long-running notes I keep for each person I interact with regularly. A note from a 1:1 with my boss JP gets a wikilink to [[/people/jp|jp]] and to whatever projects we discussed.
Over months, each person’s note becomes a timeline of every conversation, decision, and open thread. Each project folder accumulates every relevant artifact. You don’t have to remember where things are. The graph remembers.
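A rough sketch of what that note-creation step could look like, assuming a vault layout like the one above. The function name, filename scheme, and note template are my own invention, not the author's actual tooling:

```python
import datetime
import os

def create_meeting_note(vault, title, people, projects, body=""):
    """Create today's meeting note under daily/{year}/{month}/{day}/ and
    wikilink it to the relevant person and project notes."""
    today = datetime.date.today()
    folder = os.path.join(vault, "daily", f"{today:%Y}", f"{today:%m}", f"{today:%d}")
    os.makedirs(folder, exist_ok=True)
    links = [f"[[/people/{p}|{p}]]" for p in people]
    links += [f"[[/projects/{p}|{p}]]" for p in projects]
    note = f"# {title}\n\n{' '.join(links)}\n\n{body}\n"
    path = os.path.join(folder, f"{title.lower().replace(' ', '-')}.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(note)
    return path
```

The point isn't the specific template; it's that every note lands in a predictable place and carries its edges from day one, so the person and project timelines accumulate for free.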
For a work project, I can point the agent at a starting doc and say: “Spider through every tool you have access to and pull down all the related context.” It grabs Slack threads, Google Docs, web resources, all rendered as markdown inside the project folder. From that assembled context, the agent can draft design docs, product vision statements, problem/solution analyses. The output is better than prompting cold because the LLM is working with the real history of the project, not your summary of it.
This is the part Karpathy’s tweet hints at but doesn’t fully spell out: the knowledge base isn’t just for research. It’s a context engineering system. You’re building the exact input your LLM needs to do useful work.
You might be thinking: I already ask Claude to help me write a design doc. True. But there’s a real difference between prompting “help me write a design doc for a rate limiting service” and prompting an LLM that has access to your project folder with six months of meeting notes, three prior design docs, the Slack thread where the team debated the approach, and your notes on the existing architecture.
The knowledge base is a context engineering system. You’re not building a wiki for the sake of having a wiki. You’re building the input layer that makes every future LLM interaction better. Every meeting note, every linked decision, every filed artifact improves the quality of every query that follows.
The piece I haven’t cracked is automated inbox processing. The idea is straightforward: web clippings, meeting notes, Slack saves, and random captures all land in an inbox folder. The agent processes everything new, applies progressive summarization, breaks content into atomic pieces, correlates each piece with the right project, area, or person.
I have a graveyard of experiments here. The LLM is good at summarizing and categorizing. The hard part is defining what “processed” means in a way that’s consistent enough to be useful six months later but flexible enough to handle the variety of stuff that lands in an inbox. Every attempt has been either too rigid (everything gets the same treatment) or too loose (the vault drifts into chaos).
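For reference, the "too rigid" end of that spectrum is easy to sketch: a keyword router that files anything matching a pattern and leaves the rest in the inbox for a human (or a smarter model) to triage. The rule list and folder names here are hypothetical:

```python
import os
import re
import shutil

def route_inbox(vault, rules):
    """Move each inbox markdown file to the first destination whose pattern
    matches its text. `rules` is an ordered list of (regex, dest-folder)
    pairs; anything unmatched stays in the inbox for manual triage."""
    inbox = os.path.join(vault, "inbox")
    moved = {}
    for name in sorted(os.listdir(inbox)):
        if not name.endswith(".md"):
            continue
        src = os.path.join(inbox, name)
        with open(src, encoding="utf-8") as f:
            text = f.read()
        for pattern, dest in rules:
            if re.search(pattern, text, re.IGNORECASE):
                dest_dir = os.path.join(vault, dest)
                os.makedirs(dest_dir, exist_ok=True)
                shutil.move(src, os.path.join(dest_dir, name))
                moved[name] = dest
                break
    return moved
```

This is exactly the "everything gets the same treatment" failure mode described above, but it makes a useful baseline: the interesting open problem is what replaces the static rule list.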
If you’ve solved this, I’d genuinely like to hear about it.
You don’t need 52,000 files to get value from this. Start with three things:
One: Create the folder structure. Projects, areas, people, daily. Even empty, the taxonomy gives you and the LLM a schema.
Two: After your next meeting, have the agent create a note and link it to the relevant person and project. Do this for a week. Watch the graph start to form.
Three: The next time you need to write something (a design doc, a status update, a perf self-review), point the agent at the relevant folders and ask it to draft from what’s there.
The difference is noticeable right away. Not because the LLM is smarter, but because it finally has the context to be useful.
Your work compounds. That’s the thing that feels genuinely new.