Codeville also used a weave for storage and merge, a concept that originated with SCCS (and thence into Teamware and BitKeeper).
Codeville predates the introduction of CRDTs by almost a decade, and at least on the face of it the two concepts seem like a natural fit.
It was always kind of difficult to argue that weaves produced unambiguously better merge results (and more limited conflicts) than the more heuristically driven approaches of git, Mercurial, et al, because the edit histories required to produce test cases were difficult (at least for me) to reason about.
I like that Bram hasn’t let go of the problem, and is still trying out new ideas in the space.
git config --global merge.conflictstyle diff3
to get something like what is shown in the article.conflict free merging sounds cool, but doesn't that just mean that that a human review step is replaced by "changes become intervals rather than collections of lines" and "last set of intervals always wins"? seems like it makes sense when the conflicts are resolved instantaneously during live editing but does it still make sense with one shot code merges over long intervals of time? today's systems are "get the patch right" and then "get the merge right"... can automatic intervalization be trusted?
edit: actually really interesting if you think about it. crdts have been proven with character at a time edits and use of the mouse select tool.... these are inherently intervalized (select) or easy (character at a time). how does it work for larger patches can have loads of small edits?
Funny, there was just a post a couple of days ago how this is false.
It's not the same as capturing it, but I would also note that there are a wide wide variety of ways to get 3-way merges / 3 way diffs from git too. One semi-recent submission (2022 discussing a 2017) discussed diff3 and has some excellent comments (https://news.ycombinator.com/item?id=31075608), including a fantastic incredibly wide ranging round up of merge tools (https://www.eseth.org/2020/mergetools.html).
However/alas git 2.35's (2022) fabulous zdiff3 doesn't seems to have any big discussions. Other links welcome but perhaps https://neg4n.dev/blog/understanding-zealous-diff3-style-git...? It works excellently for me; enthusiastically recommended!
The goal should be to build a full spec and then build a code forge and ecosystem around this. If it’s truly great, adoption will come. Microsoft doing a terrible job with GitHub is great for new solutions.
- What kind of problems do 1 person, 10 person, 100 person, 1k (etc) teams really run into with managing merge conflicts?
- What do teams of 1, 10, 100, 1k, etc care the most about?
- How does the modern "agent explosion" potentially affect this?
For example, my experience working in the 1-100 regime tells me that, for the most part, the kind of merge conflict being presented here is resolved by assigning subtrees of code to specific teams. For the large part, merge conflicts don't happen, because teams coordinate (in sprints) to make orthogonal changes, and long-running stale branches are discouraged.
However, if we start to mix in agents, a 100 person team could quickly jump into a 1000 person team, esp if each person is using subagents making micro commits.
It's an interesting idea definitely, but without real-world data, it kind of feels like this is just delivering a solution without a clear problem to assign it to. Like, yes merge-conflicts are a bummer, but they happen infrequently enough that it doesn't break your heart.
I assume the proposed system addresses it somehow but I don't see it in my quick read of this.
> ... CRDTs for version control, which is long overdue but hasn’t happened yet
Pijul happened and it has hundreds - perhaps thousands - of hours of real expert developer's toil put in it.
Not that Bram is not one of those, but the post reads like you all know what.
What I do think is the critical challenge (particularly with Git) is scalability.
Size of repository & rate of change of repositories are starting to push limits of git, and I think this needs revisited across the server, client & wire protocols.
What exactly, I don't know. :). But I do know that in my current role (mid-size well-known tech company) is hitting these limits today.
Well, isn't that what the CRDT does in its own data structure ?
Also keep in mind that syntactic correctness doesn't mean functional correctness.
This means that everything that implements eventual consistency (including Git) is using "a CRDT".
If you haven’t resolved conflicts then it probably doesn’t compile and of course tests won’t pass, so I don’t see any point in publishing that change? Maybe the commit is useful as a temporary state locally, but that seems of limited use?
Nowadays I’d ask a coding agent to figure out how to rebase a local branch to the latest published version before sending a pull request.
It's been amazing watching it grow over the last few years.
I’m releasing Manyana, a project which I believe presents a coherent vision for the future of version control — and a compelling case for building it.
It’s based on the fundamentally sound approach of using CRDTs for version control, which is long overdue but hasn’t happened yet because of subtle UX issues. A CRDT merge always succeeds by definition, so there are no conflicts in the traditional sense — the key insight is that changes should be flagged as conflicting when they touch each other, giving you informative conflict presentation on top of a system which never actually fails. This project works that out.
One immediate benefit is much more informative conflict markers. Two people branch from a file containing a function. One deletes the function. The other adds a line in the middle of it. A traditional VCS gives you this:
<<<<<<< left
=======
def calculate(x):
a = x * 2
logger.debug(f"a={a}")
b = a + 1
return b
>>>>>>> right
Two opaque blobs. You have to mentally reconstruct what actually happened.
Manyana gives you this:
<<<<<<< begin deleted left
def calculate(x):
a = x * 2
======= begin added right
logger.debug(f"a={a}")
======= begin deleted left
b = a + 1
return b
>>>>>>> end conflict
Each section tells you what happened and who did it. Left deleted the function. Right added a line in the middle. You can see the structure of the conflict instead of staring at two blobs trying to figure it out.
CRDTs (Conflict-Free Replicated Data Types) give you eventual consistency: merges never fail, and the result is always the same no matter what order branches are merged in — including many branches mashed together by multiple people working independently. That one property turns out to have profound implications for every aspect of version control design.
Line ordering becomes permanent. When two branches insert code at the same point, the CRDT picks an ordering and it sticks. This prevents problems when conflicting sections are both kept but resolved in different orders on different branches.
Conflicts are informative, not blocking. The merge always produces a result. Conflicts are surfaced for review when concurrent edits happen “too near” each other, but they never block the merge itself. And because the algorithm tracks what each side did rather than just showing the two outcomes, the conflict presentation is genuinely useful.
History lives in the structure. The state is a weave — a single structure containing every line which has ever existed in the file, with metadata about when it was added and removed. This means merges don’t need to find a common ancestor or traverse the DAG. Two states go in, one state comes out, and it’s always correct.
One idea I’m particularly excited about: rebase doesn’t have to destroy history. Conventional rebase creates a fictional history where your commits happened on top of the latest main. In a CRDT system, you can get the same effect — replaying commits one at a time onto a new base — while keeping the full history. The only addition needed is a “primary ancestor” annotation in the DAG.
This matters because aggressive rebasing quickly produces merge topologies with no single common ancestor, which is exactly where traditional 3-way merge falls apart. CRDTs don’t care — the history is in the weave, not reconstructed from the DAG.
Manyana is a demo, not a full-blown version control system. It’s about 470 lines of Python which operate on individual files. Cherry-picking and local undo aren’t implemented yet, though the README lays out a vision for how those can be done well.
What it is is a proof that CRDT-based version control can handle the hard UX problems and come out with better answers than the tools we’re all using today — and a coherent design for building the real thing.
The code is public domain. The full design document is in the README.
No posts
I really wonder what kinds of magical AI you're using, because in my experience, Claude Code chokes and chokes hard on complex rebases/merge conflicts to the point that I couldn't trust it anymore.
<<<<<<< left
||||||| base
def calculate(x):
a = x * 2
b = a + 1
return b
=======
def calculate(x):
a = x * 2
logger.debug(f"a={a}")
b = a + 1
return b
>>>>>>> right
With this configuration, a developer reading the raw conflict markers could infer the same information provided by Manyana’s conflict markers: that the right side added the logging line.the key insight is that changes should be flagged as conflicting when they touch each other, giving you informative conflict presentation on top of a system which never actually fails.
> What do teams of 1, 10, 100, 1k, etc care the most about?
Oh god no! That would be about the worst way to do it.
Just make it conceptually sound.
Is it actually okay to try to merge changes to binaries? If two people modify, say, different regions of an image file (even in PNG or another lossless compression format), the sum of the visual changes isn't necessarily equal to the sum of the byte-level changes.
In the general case, such commits cannot be considered the same — consider a commit which flips a boolean that one branch had flipped in another file. But there are common cases where the commits should be considered equivalent, such as many rebased branches. Can the CRDT approach help with e.g. deciding that `git branch -d BRANCH` should succeed when a rebased version of BRANCH has been merged?
From time to time, I do a 'pijul pull -a' into the pijul source tree, and I get a conflict (no local work on my part). Is there a way to do a tracking update pull? I didn't see one, so I toss the repo and reclone. What works for you in tracking what's going on there?
Take a docx, write the file, parse it into entities e.g. paragraph, table, etc. and track changes on those entities instead of the binary blob. You can apply the same logic to files used in game development.
The hard part is making this fast enough. But I am working on this with lix [0].
Started with the machine learning use case for datasets and model weights but seeing a lot of traction in gaming as well.
Always open for feedback and ideas to improve if you want to take it for a spin!
When I was screwing around with the Git file format, tricks I would use to save space like hard-linking or memory-mapping couldn't work, because data is always stored compressed after a header.
A general copy-on-write approach to save checkout space is presumably impossible, but I wonder what other people have traveled down similar paths have concluded.
Partial checkouts are awkward at best, LFS locks are somehow still buggy and the CLI doesn't support batched updates. Checking the status of a remote branch vs your local (to prevent conflicts) is at best a naive polling.
Better rebase would be a nice to have but there's still so much left to improve for trunk based dev.
For some reason, when it comes to this subject, most people don't think about the problem as much as they think they've thought about it.
I recently listened to an episode on a well-liked and respected podcast featuring a guest there to talk about version control systems—including their own new one they were there to promote—and what factors make their industry different from other subfields of software development, and why a new approach to version control was needed. They came across as thoughtful but exasperated with the status quo and brought up issues worthy of consideration while mostly sticking to high-level claims. But after something like a half hour or 45 minutes into the episode, as they were preparing to descend from the high level and get into the nitty gritty of their new VCS, they made an offhand comment contrasting its abilities with Git's, referencing Git's approach/design wrt how it "stores diffs" between revisions of a file. I was bowled over.
For someone to be in that position and not have done even a cursory amount of research before embarking on a months (years) long project to design, implement, and then go on the talk circuit to present their VCS really highlighted that the familiar strain of NIH is still alive, even in the current era where it's become a norm for people to be downright resistant to writing a couple dozen lines of code themselves if there is no existing package to import from NPM/Cargo/PyPI/whatever that purports to solve the problem.
There are many ways to instantiate a CRDT, and a trivial one would be "last write wins" over the whole source tree state. LWW is obviously not what you'd want for source version control. It is "correct" per its own definition, but it is not useful.
Anyone saying "CRDTs solve this" without elaborating on the specifics of their CRDT is not saying very much at all.
I also use the merge tool of JetBrains IDEs such as IntelliJ IDEA (https://www.jetbrains.com/help/idea/resolve-conflicts.html#r...) when working in those IDEs. It uses a three-pane view, not a four-pane view, but there is a menu that allows you to easily open a comparison between any two of the four versions of the file in a new window, so I find it similarly efficient.
The big red block seems the same as "flagged", unless I'm misunderstanding something
What you really need is the ability to diff the base and "ours" or "theirs". I've found most different UIs can't do this. VSCode can, but it's difficult to get to.
I haven't tried p4merge though - if it can do that I'm sold!
It'll fire on merge issues that aren't code problems under a smarter merge, while also missing all the things that merge OK but introduce deeper issues.
Post-merge syntax checks are better for that purpose.
And imminently: agent-based sanity-checks of preserved intent – operating on a logically-whole result file, without merge-tool cruft. Perhaps at higher intensity when line-overlaps – or even more-meaningful hints of cross-purposes – are present.
Technically you could include conflict markers in your commits but I don't think people like that very much
FWIW I've struggled to get AI tools to handle merge conflicts well (especially rebase) for the same underlying reason.
... and of course it is, because Pijul uses Pijul for development, not Git and GitHub!
You would think that if a better, more sound model of storing patches is your whole selling point, you would want to make as easy as possible for people who are interested in the project to actually understand it. It is really weird not to care about the first impression that your manual makes on a curious reader.
Currently, I'm about 6 years into the experiment.
Approximately 2 years in (about 4 years ago), I've actually went to the Pijul Nest and reported [1] the issue. I got an explanation on fixing this issue locally, but weirly enough, the fix still wasn't actually implemented on the public version.
I'll report back in about a year with an update on the experiment.
No matter the tool, merges should always be presented like that. It's the only presentation that makes sense.
Something like base, that is "common base", looks far more apt to my mind. In the same vein, endogenous/exogenous would be far more precise, or at least aligned with the concern at stake. Maybe "local/alien" might be a less pompous vocabulary to convey the same idea.
True but it is not valid syntax. Like, you mean with the conflict lines?
That has not been my experience at all. The changes you introduced is your responsibility. If you synchronizes your working tree to the source of truth, you need to evaluate your patch again whether it introduces conflict or not. In this case a conflict is a nice signal to know where someone has interacted with files you've touched and possibly change their semantics. The pros are substantial, and it's quite easy to resolve conflicts that's only due to syntastic changes (whitespace, formatting, equivalent statement,...)
You can choose to have a workflow where you're never directly editing any commit to "gain back autonomy" of the working copy; and if you really want to, with some scripting, you can even emulate a staging area with a specially-formatted commit below the working copy commit.
On the contrary, I think this is an all-too-familiar pitfall for the, er... technically minded.
"I've implemented it in the code. My work here is done. The rest is window dressing."
I'm surprised! Pijul has been discussed here on HN many, many times. My impression is that many people here were hoping that Pijul might eventually become a serious Git contender but these days people seem to be more excited about Jujutsu, likely because migration is much easier.
Because of that, I think it is worse than “but it is not valid syntax”; it’s “but it may not be valid syntax”. A merge may create a result that compiles but that neither of the parties involved intended to write.
In the example in the article, the inserted line from the right change is floating because the function it was in from the left has been deleted. That's the state of the file, it has the line that has been inserted and it does not have the lines that were deleted, it contains both conflicting changes.
So in that example you indeed must resolve it if you want your program to compile, because the changes together produce something that does not function. But there is no state about the conflict being stored in the file.
Sure, that works – like having one (rare, expensive) savant engineer apply & review everything in a linear canonical order. But that's not as competitive & scalable as flows more tolerant of many independent coders/agents.
That answer is "It depends on the context"
> The reason the "ours" and "theirs" notions get swapped around during rebase is that rebase works by doing a series of cherry-picks, into an anonymous branch (detached HEAD mode). The target branch is the anonymous branch, and the merge-from branch is your original (pre-rebase) branch: so "--ours" means the anonymous one rebase is building while "--theirs" means "our branch being rebased".[0]
[0] https://stackoverflow.com/questions/25576415/what-is-the-pre...
I realized recently that I've subconsciously routed-around merge conflicts as much as possible. My process has just subtly altered to make them less likely. To the point of which seeing a 3-way merge feels jarring. It's really only taking on AI tools that bought this to my attention.
Addendum: I've since long disabled it. A and B changes are enough for me, especially as I rebase instead of merging.
I suspect that this could be because the rebase command is implemented as a serie of merges/cherry-picks from the target branch.
ours means what is in my local codebase.
theirs means what is being merged into my local codebase.
I find it best to avoid merge conflicts than to try to resolve them. Strategies that keep branches short lived and frequently merging main into them helps a lot.
git checkout mybranch
git rebase main
Now git takes main and starts cloning (cherry-picking, as you said) commits from mybranch on top of it. From git's viewpoint it's working on top of main, so if a conflict occurs, main is "ours" and mybranch is "theirs". But from your viewpoint you're still on mybranch, and indeed are left on mybranch when the rebase is complete. (It's a different mybranch, of course; once the rebase is completed, git moves mybranch to point to the new (detached) HEAD.) Which makes "ours" and "theirs" exactly the opposite of what the user expects.What if I'm rebasing a branch onto another? Is "ours" the branch being rebased, or the other one? Or if I'm applying a stash?
i have a branch and i want to merge that branch into main.
is ours the branch and main theirs? or is ours main, and the branch theirs?
git checkout master #check out the branch to apply commits to
git rebase mybranch #Apply all commits from mybranch
Now I just write rebase-current-branch
and it does what I want: fetches origin/master and rebases my working branch on top of it.But "ours"/"theirs" still keeps tripping me up.
As a bonus I can then also merge the feature branch into main as a squash commit, ditching the history of a feature branch for one large commit that implements the feature. There is no point in having half implemented and/or buggy commits from the feature branch clogging up my main history. Nobody should ever need to revert main to that state and if I really really need to look at that particular code commit I can still find it in the feature branch history.
Just checkout the branch you are merging/rebasing into before doing it.
> Or if I'm applying a stash?
The stash is in that case effectively a remote branch you are merging into your local codebase. ours is your local, theirs is the stash.
git checkout mybranch
git rebase main
A conflict happens. Now "ours" is main and "theirs" is mybranch, even though from your perspective you're still on mybranch. Git isn't, however.Not a problem if you are a purist on linear history.