Though TypeScript is pretty fast and the language is flexible, we all know how demanding graph databases are, how hard they are to shard, etc. It seems like this could be a performance trap. Are there successful RDBMS or NoSQL databases out there written in TypeScript?
Also, why is everything about LLMs now? Can't we discuss technologies at face value anymore? It's getting kind of old to me personally.
How does Yjs handle schema migrations? If I add a property to a vertex type that existing peers have cached, does it conflict or drop the unknown field?
This looks neat, but if you want it to be used for AI purposes, you might want to show a schema more complicated than a twitter network.
The query syntax looks nice by the way.
IMHO, having a graph database that is really easy to use and can use queries as types/schemas works much better; you don't need strong schema validation so long as you can gracefully ignore what your schema doesn't expect.
I'm sure once that problem has been solved, you can use the built-in map/object of whatever language, and it'll be good enough. Add save/load to disk via JSON and you have long-term persistence too. But since LLMs still aren't clever enough, I don't think the underlying implementation matters too much.
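To make the "built-in map plus JSON" idea concrete, here is a minimal sketch of a toy property graph. The names `TinyGraph`, `serialize`, and `deserialize` are hypothetical and not from any library; writing the resulting string to disk is left to whatever file API you use.

```typescript
// Hypothetical toy graph: vertices in a Map, edges in a plain array.
type Vertex = { id: string; label: string; props: Record<string, unknown> };
type Edge = { from: string; label: string; to: string };

class TinyGraph {
  vertices = new Map<string, Vertex>();
  edges: Edge[] = [];

  addVertex(id: string, label: string, props: Record<string, unknown> = {}): void {
    this.vertices.set(id, { id, label, props });
  }

  addEdge(from: string, label: string, to: string): void {
    if (!this.vertices.has(from) || !this.vertices.has(to)) {
      throw new Error(`unknown endpoint: ${from} -> ${to}`);
    }
    this.edges.push({ from, label, to });
  }

  // Follow outgoing edges with the given label (linear scan, no index).
  out(id: string, label: string): Vertex[] {
    return this.edges
      .filter((e) => e.from === id && e.label === label)
      .map((e) => this.vertices.get(e.to)!);
  }
}

// A Map is not JSON-serializable directly, so flatten it to an array first.
function serialize(g: TinyGraph): string {
  return JSON.stringify({ vertices: Array.from(g.vertices.values()), edges: g.edges });
}

function deserialize(json: string): TinyGraph {
  const data = JSON.parse(json) as { vertices: Vertex[]; edges: Edge[] };
  const g = new TinyGraph();
  for (const v of data.vertices) g.addVertex(v.id, v.label, v.props);
  for (const e of data.edges) g.addEdge(e.from, e.label, e.to);
  return g;
}

// Round-trip through a JSON string.
const original = new TinyGraph();
original.addVertex("alice", "User", { name: "Alice" });
original.addVertex("my-repo", "Repo", { stars: 3 });
original.addEdge("alice", "OWNS", "my-repo");
const restored = deserialize(serialize(original));
```

This is fine for small graphs, but every `out` call scans all edges, which is exactly where a real engine with indexes starts to matter.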
A: One of the main lessons of the RAG era of LLMs was that reranked multiretrieval is a great balance of test time, test compute, and quality, at the expense of maintaining a few costly index types. Graph ended up being a nice little lift when put alongside text, vector, and relational indexing by solving some n-hop use cases.
I'm unsure if the juice is worth the squeeze, but it does make some sense as infra. Making and using these flows isn't that conceptually complicated and most pieces have good, simple OSS around them.
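As a rough illustration of the multiretrieval flow in case A, the sketch below merges ranked results from several indexes. It uses reciprocal rank fusion as a simple stand-in for a learned reranker; the function name, the example document IDs, and the constant k = 60 are assumptions made for illustration.

```typescript
// Merge ranked ID lists from several retrievers with reciprocal rank fusion (RRF).
// Each list is ordered best-first; a larger k flattens the advantage of top ranks.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical results from text, vector, and graph (n-hop) indexes.
const textHits = ["doc1", "doc2", "doc3"];
const vectorHits = ["doc2", "doc4", "doc1"];
const graphHits = ["doc5", "doc2"];

const fused = reciprocalRankFusion([textHits, vectorHits, graphHits]);
```

Here `doc2` appears in all three lists, so it ends up first in `fused`; a document surfaced only by the graph index still makes the final list.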
B: There is another universe of richer KG extraction with even heavier indexing work. I'm less clear on the relative ROI here in typical benchmarks. Imagine going full RDF, vs the simpler property graph queries & ontologies here, and investing in heavy entity resolution etc preprocessing during writes. I don't know how well these improve scores vs regular multiretrieval above, and how easy it is to do at any reasonable scale.
In practice, a lot of KG work lives outside the DB and the agent, in a much fancier KG pipeline. So there is a missing layer with less clear proof and a value burden.
--
Separately, we have been thinking about these internally. We have been building gfql (OSS GPU Cypher queries on dataframes etc. without needing a DB, reusing existing storage tiers by moving into an embedded compute tier), and powering our own LLM usage has been a primary use case for us. Our experiences have led us to prioritize case A as the next step for what the graph engine needs to support inside, and to view case B as something that should live outside of it in a separate library. This post does make me wonder if case B should move closer into the engine to help streamline things for typical users, akin to how solr/lucene/etc. helped make elastic into something useful early on for search.
Plane route demo
Load a snapshot of real airline routes into the graph and query it with TypeScript.
Live demo
Powered by @codemix/graph and @codemix/y-graph-storage: a real graph database, synced via a Yjs CRDT across every open tab. Add yourself, rearrange people, draw connections.
Install the package from npm. It has no native dependencies and runs anywhere Node or a bundler can.
$ pnpm add @codemix/graph
Note: This is alpha-quality software. We use it in production at codemix and it works well for our use cases, but please be careful using it with your own data.
Describe vertices, edges, and indexes in a plain object. Property types flow through every query, traversal, and mutation, with no casts and no runtime surprises.
import { Graph, GraphSchema, InMemoryGraphStorage } from "@codemix/graph";
import { z } from "zod";

const schema = {
  vertices: {
    User: {
      properties: {
        email: { type: z.email(), index: { type: "hash", unique: true } },
        name: { type: z.string() },
      },
    },
    Repo: {
      properties: {
        name: { type: z.string() },
        stars: { type: z.number() },
      },
    },
  },
  edges: {
    OWNS: { properties: {} },
    FOLLOWS: { properties: {} },
  },
} as const satisfies GraphSchema;

const graph = new Graph({ schema, storage: new InMemoryGraphStorage() });
addVertex, addEdge, and updateProperty
Vertices and edges are added through the graph instance. Property arguments are checked against your schema at both compile time and runtime.
// add vertices: args are typed to each label's property schema
const alice = graph.addVertex("User", { name: "Alice", email: "alice@example.com" });
const bob = graph.addVertex("User", { name: "Bob", email: "bob@example.com" });
const myRepo = graph.addVertex("Repo", { name: "my-repo", stars: 0 });

// add edges
graph.addEdge(alice, "OWNS", myRepo, {});
graph.addEdge(bob, "FOLLOWS", alice, {});

// read properties: types come from the schema
alice.get("name"); // string
myRepo.get("stars"); // number

// update in place
graph.updateProperty(myRepo, "stars", 42);

// or via the element itself
myRepo.set("stars", 42);
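For readers curious how property types can "flow" from a schema object into function signatures, here is a minimal, dependency-free sketch of the general technique using `as const` and a mapped type. It is not the library's implementation: the string type tags (in place of zod validators), `TagToType`, `Props`, and `makeVertex` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical schema: plain type tags stand in for zod validators.
const schema = {
  User: { name: "string", email: "string" },
  Repo: { name: "string", stars: "number" },
} as const;

type Schema = typeof schema;
type TagToType = { string: string; number: number };

// Map each label's type tags to real TypeScript types.
type Props<L extends keyof Schema> = {
  -readonly [K in keyof Schema[L]]: TagToType[Schema[L][K] & keyof TagToType];
};

// The signature gives the compile-time check; the loop gives the runtime check.
function makeVertex<L extends keyof Schema>(label: L, props: Props<L>): Props<L> {
  const tags = schema[label] as Record<string, string>;
  const values = props as Record<string, unknown>;
  for (const key of Object.keys(tags)) {
    if (typeof values[key] !== tags[key]) {
      throw new TypeError(`${label}.${key} must be a ${tags[key]}`);
    }
  }
  return props;
}

const repo = makeVertex("Repo", { name: "my-repo", stars: 0 });
const stars: number = repo.stars; // inferred as number from the schema
// makeVertex("Repo", { name: "my-repo", stars: "many" }); // rejected by the compiler
```

The runtime loop matters because data arriving from JSON, a network peer, or an untyped caller never passes through the compiler.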
A Gremlin-style traversal API: familiar step names, but every label, property key, and hop is checked by TypeScript against your schema.
Start a traversal
import { GraphTraversal } from "@codemix/graph";

const g = new GraphTraversal(graph);

for (const path of g.V().hasLabel("User")) {
  path.value.get("name"); // string
  path.value.get("email"); // string
}
Filter by property
// exact match or predicate
const [alice] = g.V()
  .hasLabel("User")
  .has("email", "alice@example.com");

// users whose name starts with "A"
const usersStartingWithA = g.V()
  .hasLabel("User")
  .where((v) => v.get("name").startsWith("A"));
Traverse edges
// follow OWNS edges from User -> Repo
for (const path of g.V()
  .hasLabel("User")
  .has("email", "alice@example.com")
  .out("OWNS").hasLabel("Repo")) {
  path.value.get("stars"); // number, typed from Repo's schema
}
Label and select
// capture vertices at multiple hops and project them together
for (const { user, repo } of g.V()
  .hasLabel("User").as("user")
  .out("FOLLOWS")
  .out("OWNS").hasLabel("Repo").as("repo")
  .select("user", "repo")) {
  console.log(
    user.value.get("name"), // string
    repo.value.get("stars"), // number
  );
}
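The chained steps above can be understood as lazily composed iterators. The sketch below shows that general technique with generators over a plain Map and edge array. It is a hypothetical, untyped illustration (the `Traversal` class and sample data are made up), not how @codemix/graph is implemented, and it omits the schema-level type checking the library provides.

```typescript
// Hypothetical in-memory data: vertices by ID plus labeled edges.
type V = { id: string; label: string; props: Record<string, unknown> };
type E = { from: string; label: string; to: string };

class Traversal implements Iterable<V> {
  constructor(
    private readonly vertices: Map<string, V>,
    private readonly edges: E[],
    private readonly source: () => Iterable<V>,
  ) {}

  // Each step wraps the previous source in a generator, so nothing runs until iteration.
  private step(next: (input: Iterable<V>) => Iterable<V>): Traversal {
    return new Traversal(this.vertices, this.edges, () => next(this.source()));
  }

  hasLabel(label: string): Traversal {
    return this.step(function* (input) {
      for (const v of input) if (v.label === label) yield v;
    });
  }

  has(key: string, value: unknown): Traversal {
    return this.step(function* (input) {
      for (const v of input) if (v.props[key] === value) yield v;
    });
  }

  out(edgeLabel: string): Traversal {
    const { vertices, edges } = this;
    return this.step(function* (input) {
      for (const v of input) {
        for (const e of edges) {
          if (e.from !== v.id || e.label !== edgeLabel) continue;
          const target = vertices.get(e.to);
          if (target) yield target;
        }
      }
    });
  }

  [Symbol.iterator](): Iterator<V> {
    return this.source()[Symbol.iterator]();
  }
}

const vertices = new Map<string, V>([
  ["u1", { id: "u1", label: "User", props: { name: "Alice" } }],
  ["u2", { id: "u2", label: "User", props: { name: "Bob" } }],
  ["r1", { id: "r1", label: "Repo", props: { stars: 42 } }],
]);
const edges: E[] = [
  { from: "u1", label: "OWNS", to: "r1" },
  { from: "u2", label: "FOLLOWS", to: "u1" },
];

const start = new Traversal(vertices, edges, () => vertices.values());
// Bob -> FOLLOWS -> Alice -> OWNS -> my repo
const repos = Array.from(
  start.hasLabel("User").has("name", "Bob").out("FOLLOWS").out("OWNS").hasLabel("Repo"),
);
```

Because each step is a generator, a consumer that stops early (for example after the first match) never pays for the rest of the traversal.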
Swap InMemoryGraphStorage for YGraph and the entire graph lives in a Yjs CRDT document. Every traversal, Cypher query, and index works unchanged; you just get conflict-free sync on top.
Plug in a provider
import * as Y from "yjs";
import { WebsocketProvider } from "y-websocket";
import { YGraph } from "@codemix/y-graph-storage";

const doc = new Y.Doc();
const graph = new YGraph({ schema, doc });

// Connect any Yjs provider; sync happens automatically.
// Every peer that joins the room sees the same graph.
const provider = new WebsocketProvider("wss://my-server", "graph-room", doc);
Subscribe to fine-grained changes
// Events fire for local and remote mutations alike
const unsubscribe = graph.subscribe({
  next(change) {
    // change.kind is one of:
    // "vertex.added" | "vertex.deleted"
    // "edge.added" | "edge.deleted"
    // "vertex.property.set" | "vertex.property.changed"
    console.log(change.kind, change.id);
  },
});
Live queries
// Wraps any traversal and re-fires when the result set could change
const topRepos = graph.query((g) =>
  g.V().hasLabel("Repo").order("stars", "desc").limit(10)
);

const unsubscribe = topRepos.subscribe({
  next() {
    for (const path of topRepos) {
      console.log(path.value.get("name"), path.value.get("stars"));
    }
  },
});

// Adding or updating a Repo elsewhere (even from a remote peer)
// triggers the subscriber automatically.
graph.updateProperty(myRepo, "stars", 99);
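To show the general shape of a live query, here is a deliberately naive sketch: re-run the query on every write and notify only when the result differs. The page does not say how the library decides that a result set "could change", so this is an assumption-laden illustration; `RepoStore` and `liveQuery` are hypothetical names, and a real engine would likely track dependencies rather than re-evaluate everything.

```typescript
type Repo = { name: string; stars: number };
type Listener = () => void;

class RepoStore {
  private repos = new Map<string, Repo>();
  private listeners = new Set<Listener>();

  upsert(repo: Repo): void {
    this.repos.set(repo.name, repo);
    // Notify after every write; each live query decides whether it is affected.
    this.listeners.forEach((listener) => listener());
  }

  all(): Repo[] {
    return Array.from(this.repos.values());
  }

  onChange(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Re-run the query on every store change, but only call back when the result differs.
function liveQuery<T>(
  store: RepoStore,
  query: (repos: Repo[]) => T,
  onResult: (result: T) => void,
): () => void {
  let previous = JSON.stringify(query(store.all()));
  return store.onChange(() => {
    const result = query(store.all());
    const serialized = JSON.stringify(result);
    if (serialized !== previous) {
      previous = serialized;
      onResult(result);
    }
  });
}

const store = new RepoStore();
const seen: string[][] = [];
const topTwo = (repos: Repo[]): string[] =>
  repos.slice().sort((a, b) => b.stars - a.stars).slice(0, 2).map((r) => r.name);

const stop = liveQuery(store, topTwo, (names) => {
  seen.push(names);
});
store.upsert({ name: "a", stars: 1 }); // result changes: callback fires
store.upsert({ name: "b", stars: 5 }); // result changes: callback fires
store.upsert({ name: "c", stars: 0 }); // top two unchanged: no callback
stop();
store.upsert({ name: "d", stars: 9 }); // unsubscribed: no callback
```

Comparing serialized results is the simplest way to suppress redundant notifications, at the cost of re-running the query on every write.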
Collaborative property types
import { GraphSchema } from "@codemix/graph";
import { ZodYText, ZodYArray } from "@codemix/y-graph-storage";
import { z } from "zod";

// Declare Y.Text / Y.Array / Y.Map properties in the schema
const schema = {
  vertices: {
    Document: {
      properties: {
        title: { type: ZodYText }, // collaborative string
        tags: { type: ZodYArray(z.string()) }, // collaborative array
      },
    },
  },
  edges: {},
} as const satisfies GraphSchema;

// Plain values are auto-converted, so there is no need to construct Y.* manually
const doc = graph.addVertex("Document", { title: "Hello", tags: ["crdt"] });

// Mutate in place; all peers see the change with no conflicts
doc.get("title").insert(5, ", world");
doc.get("tags").push(["graph"]);
The same graph is queryable via a Cypher-compatible string language, ideal for exposing data to LLMs via an MCP server, or accepting ad-hoc queries from external clients without bundling a traversal library.
Parse and execute
import { parseQueryToSteps, createTraverser } from "@codemix/graph";

const { steps, postprocess } = parseQueryToSteps(`
  MATCH (u:User)-[:OWNS]->(r:Repo)
  WHERE r.stars > 100
  RETURN u.name, r.name
  ORDER BY r.stars DESC
  LIMIT 10
`);

const traverser = createTraverser(steps);
for (const row of traverser.traverse(graph, [])) {
  console.log(postprocess(row));
  // { u: { name: "Alice" }, r: { name: "my-repo" } }
}
Parameterised queries
// Pass parameters to avoid string interpolation
const { steps, postprocess } = parseQueryToSteps(`
  MATCH (u:User { email: $email })-[:OWNS]->(r:Repo)
  RETURN r.name, r.stars
`);

const traverser = createTraverser(steps);
const rows = Array.from(
  traverser.traverse(graph, [{ email: "alice@example.com" }])
).map(postprocess);
Mutations
// CREATE, MERGE, SET, DELETE are all supported
const { steps } = parseQueryToSteps(`
  MATCH (r:Repo { name: $name })
  SET r.stars = r.stars + 1
`);
createTraverser(steps).traverse(graph, [{ name: "my-repo" }]);

// Enforce read-only: throws ReadonlyGraphError on any write clause
// (`query` is a Cypher string supplied elsewhere)
const { steps: safeSteps } = parseQueryToSteps(query, { readonly: true });
This package is licensed under the MIT license.
It was originally written as a research project by Charles Pick, founder of codemix and author of the infamous ts-sql demo. Later, when we were building codemix, we needed a structured knowledge graph, so we adapted the code and added Yjs support. Later still, Opus 4.5 added a Cypher-like query language.
While you're here
codemix captures what you actually mean (your business domain, your user flows, the concepts, the constraints) and keeps it in sync with your codebase automatically.
Change your product through chat, diagrams, or collaborative editing. Steer coding agents through development and review code with real understanding. Every agent on your team shares the same context.
Create something completely new, or import your existing codebase to get started.
Try codemix for free, no credit card required.