Generally our practice is to pin everything to major versions, in ruby-speak this means like `gem 'net-sftp', '~> 4.0'` which allows 4.0.0 up to 4.9999.9999 but not 5. Exceptions for non-semver such as `pg` and `rails` which we just pin to exact versions and monitor manually. This little file contains our intentions of which gems to update automatically and for any exceptions, why not.
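For anyone who doesn't speak Ruby, here is a rough Python sketch of what the `~>` ("pessimistic") operator accepts. The function names are made up for illustration, it handles purely numeric versions only, and it is not RubyGems' actual implementation.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '4.0.1' into (4, 0, 1). Pre-release tags are out of scope here."""
    return tuple(int(part) for part in version.split("."))


def pad(segments: tuple[int, ...], length: int) -> tuple[int, ...]:
    """Right-pad with zeros so that '4' and '4.0.0' compare as equal."""
    return segments + (0,) * (length - len(segments))


def satisfies_pessimistic(version: str, constraint: str) -> bool:
    """Check `version` against a `~> constraint` style requirement."""
    lower = parse(constraint)
    # Drop the last segment (if there is more than one), then bump the new
    # last segment: '~> 4.0' gets upper bound 5, '~> 4.0.1' gets 4.1.
    head = lower[:-1] if len(lower) > 1 else lower
    upper = head[:-1] + (head[-1] + 1,)
    candidate = parse(version)
    width = max(len(lower), len(upper), len(candidate))
    return pad(lower, width) <= pad(candidate, width) < pad(upper, width)


print(satisfies_pessimistic("4.9999.9999", "4.0"))  # inside the range
print(satisfies_pessimistic("5.0.0", "4.0"))        # outside the range
```

The upper bound is exclusive, which is why 5.0.0 is rejected while every 4.x release is accepted.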
Then we encourage aggressive performances of `bundle update` which pulls in tons of little security patches and minor bugfixes frequently, but intentionally.
Without the lockfile though, you would not be able to do our approach. Every bundle install would be a bundle update, so any random build might upgrade a gem without anyone even meaning to or realizing it, so, your builds are no longer reproducible.
So we'd fix reproducibility by reverting to pinning everything to X.Y.Z, specifically to make the build deterministic, and then count on someone to go in and update every gem's approved version numbers manually on a weekly or monthly basis. (yeah right, definitely will happen).
> But... why would libpupa’s author write a version range that includes versions that don’t exist yet? How could they know that liblupa 0.7.9, whenever it will be released, will continue to work with libpupa? Surely they can’t see the future? Semantic versioning is a hint, but it has never been a guarantee.
> For that, kids, I have no good answer.
Because semantic versioning is good enough for me, as a package author, to say with a good degree of confidence, "if security or stability patches land within the patch (or sometimes, even minor) fields of a semver version number, I'd like to have those rolled out with all new installs, and I'm willing to shoulder the risk."
You actually kind-of answer your own question with this bit. Semver not being a guarantee of anything is true, but I'd extend this (and hopefully it's not a stretch): package authors will republish packages with the same version number, but different package contents or dependency specs. Especially newer authors, or authors new to a language or packaging system, or with packages that are very early in their lifecycle.
There are also cases where packages get yanked! While this isn't a universally-available behaviour, many packaging systems acknowledge that software will ship with unintentional vulnerabilities or serious stability/correctness issues, and give authors the ability to say, "I absolutely have to make sure that nobody can install this specific version again because it could cause problems." In those cases, having flexible subdependency version constraints helps.
It might be helpful to think by analogy here. If a structure is _completely rigid,_ it does have some desirable properties, not the least of which being that you don't have to account for the cascading effects of beams compressing and extending, elements of the structure coming under changing loads, and you can forget about accounting for thermal expansion or contraction and other external factors. Which is great, in a vacuum, but structures exist in environments, and they're subject to wear from usage, heat, cold, rain, and (especially for taller structures), high winds. Incorporating a planned amount of mechanical compliance ends up being the easier way to deal with this, and forces the engineers behind it to account for failure modes that'll arise over its lifetime.
Scala uses Maven repositories (where the common practice is to use fixed dependency versions) but with different resolution rules:
* When there are conflicting transitive versions, the highest number prevails (not the closest to the root).
* Artifacts declare the versioning scheme they use (SemVer is common, but there are others)
* When resolving a conflict, the resolution checks whether the chosen version is compatible with the evicted version according to the declared version scheme. If incompatible, an error is reported.
* You can manually override a transitive resolution and bypass the error if you need to.
The above has all the advantages of all the approaches advocated for here:
* Deterministic, time-independent resolution.
* No need for lock files.
* No silent eviction of a version in favor of an incompatible one.
* For compatible evictions, everything works out of the box.
* Security update in a transitive dependency? No problem, declare a dependency on the new version. (We have bots that even automatically send PRs for this.)
* Conflicting dependencies, but you know what you're doing? No problem, force an override.
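A minimal sketch of the eviction rules listed above, for a single artifact, in Python rather than Scala. The compatibility check is my own simplification of SemVer (same major version, or same minor for 0.x), and `resolve` and `semver_compatible` are hypothetical names, not anything from sbt or Coursier.

```python
from typing import Optional


def parse(version: str) -> tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def semver_compatible(evicted: str, chosen: str) -> bool:
    """Can `chosen` stand in for `evicted` under SemVer-like rules?"""
    old, new = parse(evicted), parse(chosen)
    if new < old:
        return False
    if old[0] == 0:
        # In 0.x, treat a minor bump as potentially breaking.
        return old[:2] == new[:2]
    return old[0] == new[0]


def resolve(requested: list[str], override: Optional[str] = None) -> str:
    """Pick one version of a single artifact from conflicting requests."""
    if override is not None:
        return override  # explicit user decision, bypasses the check
    chosen = max(requested, key=parse)  # highest version prevails
    for version in requested:
        if not semver_compatible(version, chosen):
            raise ValueError(f"{version} evicted by incompatible {chosen}")
    return chosen


print(resolve(["1.2.0", "1.4.1"]))                    # compatible eviction
print(resolve(["1.2.0", "2.0.0"], override="2.0.0"))  # forced override
```

Because the result depends only on the requested versions and never on what the registry has published since, the same inputs always resolve the same way.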
Using hashes also makes it easier to distribute, fetch, proxy, etc. since there's no need for trust. In contrast, fetching code based only on (name and) version number requires more centralised repositories with a bunch of security hoops to jump through.
Also, on that note, I can plug my own post on the topic: http://www.chriswarbo.net/blog/2024-05-17-lock_files_conside...
The joke is this:
Lupa and Pupa received their paycheques, but the accountant messed up, so Lupa received payment belonging to Pupa, and Pupa — belonging to Lupa.
“To Lupa” sounds like “dick head” when translated to Russian. The ending reads as if Pupa received a dick head, which means that he didn’t receive anything.
I am not sure, but it could be that the entire intent of the post is to get English-speaking folks to discuss “libpupa” and “liblupa”.
As a separate thought, it seems that it would be possible to statically analyze the usage of "B" in the source code of "A" and compare it to the public API for any version of "B" to determine API compatibility. This doesn't account for package incompatibility due to side effects that occur behind the API of "B", but it seems that it would get you pretty far. I assume this would be a solution for purely functional languages.
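To make that idea concrete, here is a toy version for Python source using the standard `ast` module. It only catches direct `b.name` attribute access and `from b import name`, the module name `b` and the two API sets are invented for the example, and as said above it knows nothing about behaviour changes behind the API.

```python
import ast


def used_names(source: str, module: str) -> set[str]:
    """Collect the names of `module` that `source` refers to."""
    tree = ast.parse(source)
    aliases: set[str] = set()
    names: set[str] = set()
    # First pass: find how the module is imported.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == module:
                    aliases.add(alias.asname or alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module == module:
            names.update(alias.name for alias in node.names)
    # Second pass: find attribute access on any alias of the module.
    for node in ast.walk(tree):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in aliases):
            names.add(node.attr)
    return names


def missing_from(source: str, module: str, public_api: set[str]) -> set[str]:
    """Names that `source` uses but that this version of the API lacks."""
    return used_names(source, module) - public_api


SOURCE_A = "import b\nfrom b import connect\nb.parse('x')\nb.render(1)\n"
API_V1 = {"connect", "parse", "render"}  # hypothetical API of B 1.x
API_V2 = {"connect", "parse"}            # hypothetical API of B 2.x

print(missing_from(SOURCE_A, "b", API_V1))
print(missing_from(SOURCE_A, "b", API_V2))
```

An empty result means every name A uses still exists, which is necessary but not sufficient for compatibility.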
However, for NPM you will hit issues where 2 packages need a different React version, and if you want to use them both you need to pick. In addition, it is better for security. The lock file is a distributed checksum. Not impervious to supply chain attacks, but better equipped than trusting the package author not to retrospectively bump (I guess you could have a hash for this that included the downloaded source code and claimed deps).
When you first take a dependency, you typically want the latest compatible version, to have all the available bug fixes (especially security fixes).
Once you’ve started building on top of a dependency you need stability and have to choose when to take updates.
It’s about validating the dependency… on first use, there’s no question you will be validating its use in your app. Later, you have to control when you take an update so you can ensure you have a chance to validate it.
BTW, of course semantic versioning isn’t perfect. It just lowers the risk of taking certain bug fixes, making it feasible to take them more frequently.
The lock file just holds the state for this mechanism.
Language ecosystems that try to make it sane for developers usually end up with some sort of snapshot/BOM system that lists versions that are compatible together, and that nudges lib developers to stay compatible with each other. I'm not going to pretend this is easy, because this is hard work on the side of lib devs, but it's invaluable for the community.
Compared to that, people extolling the virtues of semver always seem to miss the mark.
But it's also missing the value of hashes. Even if every package used semver, and you had a script that could easily update to get recent security updates, we would still gain value from lockfile hashes to protect against source code changing underneath the same version number.
Maven and Java are simply broken when dealing with transitive dependencies.
I've been hit so many times with the runtime exception "MethodNotFound" because two libraries have the same transitive dependency and one version gets picked over the other.
Maven, by default, does not check your transitive dependencies for version conflicts. To do that, you need a frustrating plugin that produces much worse error messages than NPM does: https://ourcraft.wordpress.com/2016/08/22/how-to-read-maven-....
How does Maven resolve dependencies when two libraries pull in different versions? It does something insane. https://maven.apache.org/guides/introduction/introduction-to....
Do not pretend, for even half a second, that dependency resolution is not hell in maven (though I do like that packages are namespaced by creators, npm shoulda stolen that).
When I, the owner of an application, choose a library (libuseful 2.1.1), I think it's fine that the library author uses other libraries (libinsecure 0.2.0).
But in 3 months, libinsecure is discovered (surprise!) to be insecure. So they release libinsecure 0.2.1, because they're good at semver. The libuseful library authors, meanwhile, are on vacation because it's August.
I would like to update. Turns out libinsecure's vulnerability is kind of a big deal. And with fully hardcoded dependencies, I cannot, without some horrible annoying work like forking/building/repackaging libuseful. I'd much rather libuseful depend on libinsecure 0.2.*, even if libinsecure isn't terribly good at semver.
I would love software to be deterministically built. But as long as we have security bugs, the current state is a reasonable compromise.
Maven/Java does absolutely insane things, it will just compile and run programs with incompatible version dependencies and then they crash at some point, and pick some arbitrary first version of a dependency it sees. Then you start shading JARs and writing regex rules to change import paths in dependencies and your program crashes with a mysterious error with 1 google result and you spend 8 hours figuring out WTF happened and doing weird surgery on your dependencies dependencies in an XML file with terrible plugins.
This proposed solution is "let's just never use version ranges and hard-code dependency versions". Now a package 5 layers deep is unmaintained and is on an ancient dependency version, other stuff needs a newer version. Now what? Manually dig through dependencies and update versions?
It doesn't even understand lockfiles fully. They don't make your build non-reproducible, they give you both reproducible builds (by not updating the lockfile) and an easy way to update dependencies if and when you want to. They were made for the express purpose of making your build reproducible.
I wish there was a mega article explaining all the concerns, tradeoffs and approaches to dependency management - there are a lot of them.
Suppose every dependency were a `=` and cargo allowed multiple versions of SemVer-compatible packages.
The first impact will be that your build will fail. Say you are using `regex` and you are interacting with two libraries that take a `regex::Regex`. All of the versions need to align to pass `Regex` between yourself and your dependencies.
The second impact will be that your builds will be slow. People are already annoyed when there are multiple SemVer incompatible versions of their dependencies in their dependency tree, now it can happen to any of your dependencies and you are working across your dependency tree to get everything aligned.
The third impact is if you, as the application developer, need a security fix in a transitive dependency. You now need to work through the entire bubble up process before it becomes available to you.
Ultimately, lockfiles are about giving the top-level application control over their dependency tree balanced with build times and cross-package interoperability. Similarly, SemVer is a tool any library with transitive dependencies [0]
[0] https://matklad.github.io/2024/11/23/semver-is-not-about-you...
I believe Zig is also considering adopting it.
If there are any dependencies with the same major version the algorithm simply picks the newest one of them all (but not the newest in the package registry), so you don't need a lockfile to track version decisions.
Go's go.sum contains checksums to validate content, but is not required for version selection decisions.
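A small sketch of that selection rule (minimal version selection), in Python for brevity. The module names and the `requirements` graph are invented, versions are plain tuples, and this ignores things the real Go tooling handles such as replace/exclude directives and graph pruning.

```python
from collections import deque


def build_list(root_requirements, requirements):
    """Return {module: version}, keeping the newest version anyone asked for.

    `requirements` maps (module, version) to the list of (module, version)
    pairs that release declares as its minimum requirements.
    """
    selected = {}
    seen = set()
    queue = deque(root_requirements)
    while queue:
        name, version = queue.popleft()
        if (name, version) in seen:
            continue
        seen.add((name, version))
        # Keep the highest version that is actually mentioned in the graph.
        if name not in selected or version > selected[name]:
            selected[name] = version
        queue.extend(requirements.get((name, version), []))
    return selected


# Hypothetical graph: the registry may also have c 1.9, but nobody asks for it.
REQUIREMENTS = {
    ("a", (1, 1, 0)): [("c", (1, 2, 0))],
    ("b", (1, 0, 0)): [("c", (1, 3, 0))],
}
ROOT = [("a", (1, 1, 0)), ("b", (1, 0, 0))]

print(build_list(ROOT, REQUIREMENTS))
```

Since only versions named somewhere in the requirement graph are candidates, publishing a new release of `c` changes nothing until some module's requirements mention it.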
Go's system may be worth emulating in future designs. It's not perfect (still requires some centralized elements, module identities for versions ≥2 are confusing, etc.) but it does present a way to both not depend strongly on specific centralized authorities without also making any random VCS server on the Internet a potential SPoF for compiling software. On the other hand, it only really works well for module systems that purely deal with source code and not binary artifacts, and it also is going to be the least hazardous when fetching and compiling modules is defined to not allow arbitrary code execution. Those constraints together make this system pretty much uniquely suited to Go for now, which is a bit of a shame, because it has some cool knock-on effects.
(Regarding deterministic MVS resolution: imagine a@1.0 depending on b@1.0, and c@1.0 depending on a@1.1. What if a@1.1 no longer depends on b? You can construct trickier versions of this possibly using loops, but the basic idea is that it might be tricky to give a stable resolution to version constraints when the set of constraints that are applied depends on the set of constraints that are applied. There are possible deterministic ways to resolve this of course, it's just that a lot of these edge cases are pretty hard to reason about and I think Go MVS had a lot of bugs early on.)
The reason we have dependency ranges and lockfiles is so that library a1.0 can declare "I need c >= 2.1" and b1.0 can declare "I need c >= 2.3", and when you depend on both a1.0 and b1.0, we can do dependency resolution and lock in c2.3 as the dependency for the binary.
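Roughly, in Python, with the bounds treated as inclusive minimums (my assumption) and a made-up list of published versions of the shared dependency `c`:

```python
def parse(version: str) -> tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def lock_version(lower_bounds: dict[str, str], available: list[str]) -> str:
    """Pick the newest published version that satisfies every dependent."""
    floor = max(parse(bound) for bound in lower_bounds.values())
    candidates = [v for v in available if parse(v) >= floor]
    if not candidates:
        raise ValueError("no published version satisfies every dependent")
    return max(candidates, key=parse)


# What each library declares about the shared dependency c.
LOWER_BOUNDS = {"a 1.0": "2.1", "b 1.0": "2.3"}
PUBLISHED_C = ["2.0", "2.1", "2.2", "2.3"]  # hypothetical registry contents

lockfile = {"c": lock_version(LOWER_BOUNDS, PUBLISHED_C)}
print(lockfile)
```

The ranges say what is acceptable; the lockfile records the one answer the resolver picked, so later installs don't have to resolve again against a registry that may have changed.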
>How could they know that liblupa 0.7.9, whenever it will be released, will continue to work with libpupa? Surely they can’t see the future? Semantic versioning is a hint, but it has never been a guarantee.
Yes, this is a social contract. Not everything in the universe can be locked into code, and with Semantic versioning, we hope that our fellow humans won't unnecessarily break packages in non-major releases. It happens, and people usually apologize and fix, but it's rare.
This has worked successfully if you look at RubyGems which is 6 years older than npm (although Gemfile.lock was introduced in 2010, npm didn't introduce it until 2017).
RubyGems doesn't have the same reputation for dysfunction as Node does. Neither do Rust, Go, PHP, and Haskell, and probably others that I don't use on a daily basis. Node is the only language where I will come back and find a Docker container that straight up won't build, or a package that requires the entire dependency tree to update because one package pushed a minor-version change that ended up requiring a minor version change to Node, and then that new version of Node isn't compatible with some hack that another package did in its C extension.
In fact, I expect some Node developer to read this article and deploy yet another tool that will break _everything_ in the build process. In other languages I don't even think I've ever really thought about dependency resolution in years.
The package file (whatever your system) is communication to other humans about what you know about the versions you need.
The lockfile is the communication to other computers about the versions you are using.
What you shouldn't have needed is fully defined versions in your package files (but you do need it, in case some package or another doesn't do a good enough job following semver)
So, this:
package1: latest
# We're stuck on an old version b/c of X, Y, Z
package2: ~1.2
(Related: npm/yarn should use a JSON variant (or YAML, regular or simplified) that allows for comments for precisely this reason)
This article appears to be talking about lockfiles for libraries - and I agree, for libraries you shouldn't be locking exact versions because it will inevitably play havoc with other dependencies.
Or maybe I'm missing something about the JavaScript ecosystem here? I mainly understand Python.
I assume Java gets around this by bundling libraries into the deployed .jar file. That is better than a lock file, but doesn't make sense for scripting languages that don't have a build stage. (You won't have trouble convincing me that every language should have a proper build stage, but you might have trouble convincing the millions of lines of code already written in languages that don't.)
The algorithm can be deterministic, but fetching the dependencies of a package is not.
It is usually an HTTP call to some endpoint that might flake out or change its mind.
Lock files were invented to make it either deterministic or fail.
Even with Maven, deterministic builds (such as with Bazel) lock the hashes down.
This article is mistaken.
If a transitive dependency (not directly referenced) updates, this might introduce different behavior. If you test a piece of software and fix some bugs, the next build shouldn't contain completely different versions of dependencies. This might introduce new bugs.
For server-side or other completely controlled environments the only good reason to have lock files is if they are actually hashed and thus allow to confirm security audits. Lock files without hashes do not guarantee security (depending on the package registry, of course, but at least in Python world (damn it) the maintainer can re-publish a package with an existing version but different content).
Given that, I still see some consequences:
The burden for testing if a library can use its dependency falls back on the application developer instead of the library developer. A case could be made that, while library developers should test what their libraries are compatible with, the application developer has the ultimate responsibility for making sure everything can work together.
I also see that there would need to be tooling to automate resolutions. If ranges are retained, the resolver needs to report every conflict and force the developer to explicitly specify the version they want at the top-level. Many package managers automatically pick one and write it into the lock file.
If we don’t have lock files, and we want it to be automatic, then we can have it write to the top level package manager and not the lock file. That creates its own problems.
One of those problems comes from humans and tooling writing to the same configuration file. I have seen problems with that idea pop up — most recently, letting letsencrypt modify nginx configs, and now I have to manually edit those. Letsencrypt can no longer manage them. Arguably, we can also say LLMs can work with that, but I am a pessimist when it comes to LLM capabilities.
So in conclusion, I think the article writer’s reasoning is sound, but incomplete. Humans don’t need lockfiles, but our tooling needs lockfiles until it is capable of working with the chaos of human-managed package files.
However: You absolutely do need a lock file to store a cryptographic hash of each dependency to ensure that what is fetched has not been tampered with. And users are definitely not typing a hash when adding a new dependency to package.json or Cargo.toml.
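As a sketch of what that check amounts to, using SHA-256 from Python's `hashlib`. The `name@version` key format and the in-memory `lock` dict are simplifications for the example, not any particular package manager's lockfile format.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def verify(name: str, version: str, payload: bytes, lock: dict[str, str]) -> None:
    """Raise if the fetched bytes differ from what the lock file recorded."""
    expected = lock[f"{name}@{version}"]
    actual = sha256_hex(payload)
    if actual != expected:
        raise ValueError(
            f"{name}@{version}: expected {expected[:12]}..., got {actual[:12]}..."
        )


# The tool records the hash when the dependency is first added.
original = b"contents of liblupa 0.7.8"
lock = {"liblupa@0.7.8": sha256_hex(original)}

verify("liblupa", "0.7.8", original, lock)  # passes silently

try:
    verify("liblupa", "0.7.8", b"republished contents", lock)
except ValueError as err:
    print("rejected:", err)
```

The version string stays the same in both calls; only the recorded hash notices that the contents changed.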
This works with minimal coordination between authors of the dependencies. It becomes a big deal when you have several unrelated dependencies, each transitively requiring that libpupa. The chance they converge on the same exact version is slim. The chance a satisfying version can be found within specified ranges is much higher.
Physical things that are built from many parts have the very same limitation: they need to specify tolerances to account for the differences in production, and would be unable to be assembled otherwise.
A: You're handling problem X and then unrelated problem Y suddenly arises because you're not locking package versions thoroughly. It's not fun.
B: Now the opposite. You lock all versions of the libs you use. You use renovate or schedule time for updates periodically. You have a thorough test suite that you can automatically exercise when trying the new updates. You can apply the updates and deploy the new version to a test environment to run a final test manually. Things look good. You deploy to production and, quite often, things go smoothly.
A is the blue pill, easy to taste but things are out of your control and will bite you eventually. B is the red pill: you're in control, for better or worse.
.NET doesn't have lock files either, and its dependency tree runs great.
Using fixed versions for dependencies is a best practice, in my opinion.
> Our dependency resolution algorithm thus is like this:
> 1. Get the top-level dependency versions
> 2. Look up versions of libraries they depend on
> 3. Look up versions of libraries they depend on
...would fail in languages like Python where dependencies are shared, and the steps 2, 3, etc. would result in conflicting versions.
In these languages, there is good reason to define dependencies in a relaxed way (with constraints that exclude known-bad versions; but without pins to any specific known-to-work version and without constraining only to existing known-good versions) at first. This way dependency resolution always involves some sort of constraint solving (with indeterminate results due to the constraints being open-ended), but then for the sake of reproducibility the result of the constraint solving process may be used as a lockfile. In the Python world this is only done in the final application (the final environment running the code; this may be the test suite for a pure library) and the pins in the lock aren't published for anyone to reuse.
To reiterate, the originally proposed algorithm doesn't work for languages with shared dependencies. Using version constraints and then lockfiles as a two-layer solution is a common and reasonable way of resolving the dependency topic in these languages.
Perhaps that's a valid assumption for readers of his blog, but once it appears here there are going to be a lot of readers who don't have the context to know what it's about.
Can an "NPM" tag be added to the subject of this post? More generally, I encourage authors to include a bit more context at the top of an article.
https://go.dev/ref/mod#minimal-version-selection https://research.swtch.com/vgo-mvs
Instead of a "lock" file, Go includes a "sum" file, which basically records the checksums of the module versions that were used during a build, so that you can download them from a central place later and ensure you are working from the same thing (so that any surreptitious changes are identified).
> Dependency management - this allows project authors to directly specify the versions of artifacts to be used when they are encountered in transitive dependencies or in dependencies where no version has been specified.
It's just less convenient because you have to manage it yourself.
But it is still useful. It's like a bloom filter. Most of the time you can happily pull minor or patch upgrades with no issue. Occasionally it will break. But that's less work than analysing every upgrade.
People who use a library might use newer versions (via diamond dependencies or because they use latest), but it will result in a combination of dependencies that wasn't tested by the library's authors. Often that's okay because libraries try to maintain backward compatibility.
Old libraries that haven't had a new release in a while are going to specify older dependencies and you just have to deal with that. The authors aren't expected to guess which future versions will work. They don't know about security bugs or broken versions of dependencies that haven't been released yet. There are other mechanisms for communicating about that.
A -> L1 -> L2
They are saying that A should not need a lockfile because it should specify a single version of L1 in its dependencies (i.e. using an == version check in Python), which in turn should specify a single version of L2 (again with an == version check).
Obviously if everybody did this, then we wouldn't need lockfiles (which is what TFA says). The main downsides (which many comments here point out) are:
1. Transitive dependency conflicts would abound
2. Security updates are no longer in the hands of the app developers (in my above example, the developer of A1 is dependent on the developer of L1 whenever a security bug happens in L2).
3. When you update a direct dependency, your transitive dependencies may all change, making what you thought was a small change into a big change.
(FWIW, I put these in order of importance to me; I find #3 to be a nothingburger, since I've hardly ever updated a direct dependency without it increasing the minimum dependency of at least one of its dependencies).
You are wrong; Maven just picks one of lib-x:0.1.4 or lib-x:0.1.5 depending on the ordering of the dependency tree.
Python says "Yes." Every environment manager I've seen, if your version ranges don't overlap for all your dependencies, will end up failing to populate the environment. Known issue; some people's big Python apps just break sometimes and then three or four open source projects have to talk to each other to un-fsck the world.
npm says "No" but in a hilarious way: if lib-a emits objects from lib-x, and lib-b emits objects from lib-x, you'll end up with objects that all your debugging tools will tell you should be the same type, and TypeScript will statically tell you are the same type, but don't `instanceof` the way you'd expect two objects that are the same type should. Conclusion: `instanceof` is sus in a large program; embrace the duck typing (and accept that maybe your a-originated lib-x objects can't be passed to b-functions without explosions because I bet b didn't embrace the duck-typing).
And yeah, I did that right away. Fun for a moment but extremely distracting.
No they are not. Fully reproducible builds have existed without lockfiles for decades
Why? Can’t you specify which version to use?
If it's an old release of a library then it will specify old dependencies, but you just have to deal with that yourself, because library authors aren't expected to have a crystal ball that tells them which future versions will be broken or have security holes.
If the package specification file is code and not data, then this becomes more problematic. Elixir specified dep as data within code. Arguably, we can add code to read and write from a separate file… but at that point, those might as well be lock files.
I actually much prefer that: specify the git revision to use (i.e. a SHA1 hash). I don't particularly care what "version number" that may or may not have.
just my guess.
https://devblogs.microsoft.com/dotnet/enable-repeatable-pack...
https://learn.microsoft.com/en-us/nuget/consume-packages/pac...
Again - there's no free lunch here.
I have had to do that with Ruby apps, where libraries are also shared.
And yet Java and Maven exist...
… they're not, though. Python & Rust both have lockfiles. I don't know enough Go to say if go.sum counts, but it might also be a lockfile. They're definitely not unique to NPM, because nothing about the problem being solved is unique to NPM.
For those times where that's not the case, you can look at the dependency tree to see which is included and why. You can then add a <dependency> override in your pom.xml file specifying the one you want.
It's not an "insane" algorithm. It gives you predictability. Whatever you write in your pom.xml overrides whatever your dependencies require, and you can update your pom.xml if you need to.
And because pom.xml is hand-written there are very few merge conflicts (as much as you'd normally find in source code), vs. a lock file where huge chunks change each time you change a dependency, and when it comes to a merge conflict you just have to delete the lot and redo it and hope nothing important has been changed.
Lockfiles are great.
Maven is dependency heaven.
The point is, "You don't need lockfiles."
And that much is true.
(Miss you on twitter btw. Come back!)
Never used it myself though, just read about it but never had an actual usecase
Im a big fan of anything Aphex Twin for these type of sessions.
I had this happen with JUnit after a JDK upgrade. We needed to update to a newer major version of JUnit to match the new JDK, so we updated the test code accordingly. But then things broke. Methods were missing, imports couldn't be resolved, etc. Turned out something else in the dependency tree was still pulling in JUnit 4, and Maven's “nearest-wins” logic just silently went with the older version. No error, no warning. Just weird runtime/classpath issues. This something else turned out to be spring, for some odd reason it was an ancient version of Junit 4 as well.
And yeah, you can eventually sort it out, maybe with mvn dependency:tree and a lot of manual overrides, but it is a mess. And Maven still doesn't give you anything like a lockfile to consistently reproduce builds over time. That's fine if your whole org pins versions very strictly, but it's naive to claim it "just works" in all cases. Certainly because versions often don't get pinned that strictly and it is easy to set up things in such a way that you think you have pinned the version while that isn't the case. Really fun stuff..
You also have the option of ignoring it if you want to build the old version for some reason, such as testing the broken version.
Just because Java does this doesn't mean every language has to. It's not strongly tied to the dependency management system used. You could have this even with a Java project using lockfiles.
> a package 5 layers deep is unmaintained and is on an ancient dependency version, other stuff needs a newer version. Now what? Manually dig through dependencies and update versions?
Alternatively, just specify the required version in the top-level project's dependency set, as suggested in the article.
If you want to do something cute and fun, whatever, it's your site. But if you actually want people to use your site, make it easy to dismiss. We already have annoying ads and this is honestly worse than many ads.
Also, from the bio that I can barely see he writes about "UI Design" and... included this?
As an aside, I have an article in my blog that has GIFs in it, and they're important for the content, but I'm not a frontend developer by any stretch of the imagination so I'm really at wit's end for how to make it so that the GIFs only play on mouse hover or something else. If anybody reading has some tips, I'd love to hear them. I'm using Zola static site generator, and all I've done is make minor HTML and CSS tweaks, so I really have no idea what I'm doing where it concerns frontend presentation.
* Absence of lockfiles
* Absence of the central registry
* Cryptographically checksummed dependency trees
* Semver-style unification of compatible dependencies
* Ability for the root package to override transitive dependencies
At the cost of
* minver-ish resolution semantics
* deeper critical path in terms of HTTP requests for resolving dependencies
The trick is that, rather than using crates.io as the universe of package versions to resolve against, you look only at the subset of package versions reachable from the root package. See https://matklad.github.io/2024/12/24/minimal-version-selecti...
No, they don't. As the article explains, the resolution process will pick the version that is 'closest to the root' of the project.
> The second impact will be that your builds will be slow....you are working across your dependency tree to get everything aligned.
As mentioned earlier, no you're not. So there's nothing to support the claim that builds will be slower.
> You now need to work through the entire bubble up process before it becomes available to you.
No you don't, because as mentioned earlier, the version that is 'closest to root' will be picked. So you just specify the security fixed version as a direct dependency and you get it immediately.
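Here is the 'closest to root' rule as a breadth-first walk, borrowing the libuseful/libinsecure example from upthread. This is a simplified Python sketch (no scopes, exclusions or dependencyManagement), not Maven's actual resolver, and the graph shape is made up.

```python
from collections import deque


def nearest_wins(direct, graph):
    """Resolve {name: version}; the shallowest declaration of a name wins.

    `direct` is the root's own dependency list; `graph` maps
    (name, version) to that release's declared dependencies.
    Ties at the same depth go to whichever was declared first.
    """
    resolved = {}
    queue = deque(direct)
    while queue:
        name, version = queue.popleft()
        if name in resolved:
            continue  # a nearer (or earlier-declared) version already won
        resolved[name] = version
        queue.extend(graph.get((name, version), []))
    return resolved


GRAPH = {("libuseful", "2.1.1"): [("libinsecure", "0.2.0")]}

# Without an override, the vulnerable transitive version is used.
print(nearest_wins([("libuseful", "2.1.1")], GRAPH))

# Declaring the fixed version directly puts it closest to the root.
print(nearest_wins([("libuseful", "2.1.1"), ("libinsecure", "0.2.1")], GRAPH))
```

The same walk also shows the complaint from the other side: when two libraries pull in different versions at the same depth, the winner depends purely on declaration order.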
Java dependency management is unhinged, antiquated garbage to anyone who has used any other ecosystem.
They would be resolved by just picking the version 'closest to root', as explained in the article.
> Security updates are no longer in the hands of the app developers
It is, the app developers can just put in a direct dependency on the fixed version of L2. As mentioned earlier, this is the version that will be resolved for the project.
> When you update a direct dependency, your transitive dependencies may all change, making what you that was a small change into a big change.
This is the same even if you use a lockfile system. When you update dependencies you are explicitly updating the lockfile as well, so a bunch of transitive dependencies can change.
Or maybe I misread the article and it did not say that.
npm used to allow you to unpublish (and maybe overwrite?) published artifacts, but they removed that feature after a few notable events.
Edit: I was not quite correct. It looks like you can still unpublish, but with specific criteria. However, you cannot ever publish a different package using the same version as an already published package.
Nope, Maven will grab anything which happens to have a particular filename from `~/.m2`, or failing that it will accept whatever a HTTP server gives it for a particular URL. It can compare downloaded artifacts against a hash; but that's misleading, since those hashes are provided by the same HTTP server as the artifact! (Useful for detecting a corrupt download; useless for knowing anything about the artifact or its provenance, etc.)
This isn't an academic/theoretical issue; I've run into it myself https://discuss.gradle.org/t/plugins-gradle-org-serving-inco...
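A rough sketch of the distinction, in Python: a hash only says something about provenance if it was recorded independently of the server that hands you the artifact. The `verify` helper and the byte strings below are hypothetical and stand in for a real download.

```python
import hashlib


def verify(artifact: bytes, pinned_sha256: str) -> bool:
    """Compare the artifact against a digest stored with the project,
    not against a digest fetched from the same server as the artifact."""
    return hashlib.sha256(artifact).hexdigest() == pinned_sha256


# Assumption: the digest was recorded once, when the dependency was first
# reviewed, and is committed alongside the build files.
original = b"example artifact bytes"
pinned = hashlib.sha256(original).hexdigest()

ok = verify(original, pinned)
tampered = verify(b"something else served under the same URL", pinned)
```

A digest fetched next to the artifact would pass both checks if the server itself were serving the wrong file, which is the failure mode described above.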
However, I think most people in the reproducible build space would consider Maven an external uncontrolled input.
https://src.fedoraproject.org/rpms/conky/blob/rawhide/f/sour...
also of flathub
https://github.com/flathub/com.belmoussaoui.ashpd.demo/blob/...
"they are not lockfiles!" is a debatable separate topic, but for a wider disconnected ecosystem of sources, you can't really rely on versions being useful for reproducibility
It's also not about fully reproducible builds; it's about a tradeoff to get a modern package manager (npm, cargo, ...) experience and also somewhat reproducible builds.
show me one "decades old build" of a major project that isn't based on 1) git hashes 2) fixed semver URLs or 3) exact semver in general.
"Supposed to" being the operative phrase. This is of little comfort when you need version X.Y for a security fix but your build breaks.
Note that Maven is more complex than others here have mentioned. In some cases, Maven compares versions lexically (e.g. version 1.2 is considered newer than version 1.10).
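To illustrate the difference between lexical and component-wise comparison, a small Python sketch. This is not Maven's actual comparison algorithm, only a demonstration of why comparing version strings character by character puts 1.2 above 1.10.

```python
def parse(version):
    """Split "1.10" into (1, 10) so components compare as numbers."""
    return tuple(int(part) for part in version.split("."))


versions = ["1.2", "1.10"]

# Plain string comparison: "1.2" > "1.10" because "2" > "1" at the third character.
lexical_newest = max(versions)

# Component-wise comparison: (1, 10) > (1, 2).
numeric_newest = max(versions, key=parse)
```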
Dependency management is indeed hell.
Not being able to build because one thing depends on libpupa 1.2.34.pre5 and another, on 1.2.35 would be a worse outcome, on average.
https://en.wikipedia.org/wiki/File_locking#Lock_files
When I saw the title "We shouldn't have needed lockfiles", I expected something about preferring some other mechanism for resource locking.
More generally, I see a lot of articles that talk about an issue in some language or framework that don't mention that context. Just adding "JavaScript" or "NPM" (or whatever) in the title or near the top of the article would be very helpful.
https://docs.npmjs.com/cli/v11/commands/npm-publish
> The publish will fail if the package name and version combination already exists in the specified registry.
> Once a package is published with a given name and version, that specific name and version combination can never be used again, even if it is removed with npm unpublish.
Isn't that basically a crappy, hand-rolled equivalent to a lock file?
Having worked professionally in C, Java, Rust, Ruby, Perl, PHP I strongly prefer lock files. They make it so much nicer to manage dependencies.
As an escape hatch, you end up doing a lot of exclusions and overrides, basically creating a lockfile smeared over your pom.
P.S. Sadly, I think enough people have left Twitter that it's never going to be what it was again.
But what if all the packages had automatic ci/cd, and libinsecure 0.2.1 is published, libuseful automatically tests a new version of itself that uses 0.2.1, and if it succeeds it publishes a new version. And consumers of libuseful do the same, and so on.
I think the better model is that your package manager let you do exactly what you want -- override libuseful's dependency on libinsecure when building your app.
In binary package managers this kind of workflow seems like an afterthought.
2) "Now a package 5 layers deep is unmaintained and is on an ancient dependency version, other stuff needs a newer version. Now what? Manually dig through dependencies and update versions?"
You can't solve both of these simultaneously.
If you want a library's dependencies to be updated to versions other than the ones the original library author wanted to use (e.g. because that library is unmaintained), then you're going to get those incompatibilities and crashes.
I think it's reasonable to be able to override dependencies (e.g. if something is unmaintained) but you have to accept there are going to be surprises and be prepared to solve them, which might be a bit painful, but necessary.
But I realized something by attempting to read this article several times first.
If I ever want to write an article and reduce people's ability to critically engage with the argument in it, I should add a focus-pulling animation that thwarts concerted focus.
It's like the blog equivalent of public speakers who ramble their audience into a coma.
Inverted colours would've been _mostly fine._ Not great, but mostly fine, but instead, the author went out of their way to add this flashlight thing that's borderline unusable?
What the hell is this website?
"libpupa": "1.2.3"
"liblupa": "0.7.8"
My app
├╴a 1.0
│ └╴d 1.0
└╴b 1.0
  └╴c 2.0
    └╴d 2.0
My app
├╴a 1.0
│ └╴d 1.0 x-- ignored
├╴b 1.0
│ └╴c 2.0
│   └╴d 2.0 x-- ignored
└╴d 2.1 <-- picked
A lockfile provides a specific, concrete, minimized, satisfied solution for what an application or library uses to operate.
Generally, deployed applications have and save lock files so that nothing changes without testing and interactive approval.
Libraries don't usually ship lock files to give the end user more flexibility.
What solved system package dependency hell is allowing multiple versions and configurations of side-by-side dependencies rather than demanding a single, one-size-fits-all dependency that forces exclusion and creates unsatisfiable constraints.
and as some people mentioned, if a dependency of a dependency provides an important security patch, do you want to wait for your dependency to update first? or do you rely on overrides?
The solution is version ranges, but this then necessitates lockfiles, to avoid the problem of uncontrolled upgrades.
That said, there's an option that uses version ranges, and avoids nondeterminism without lockfiles: https://matklad.github.io/2024/12/24/minimal-version-selecti....
Note: maven technically allows version ranges, but they're rarely used.
I want no security bugs, but as a heuristic, I'd strongly prefer the latest patch version of all libraries, even without perfect guarantees. Code rots, and most versioning schemes are designed with that in mind.
You should check how comments work on niconico.
Streaming comments on YouTube give it a run for its money, what absolute garbage.
Some call transforming .java to .class a transpilation, but then a lot of what we call compilation should also be called transpilation.
Well, Java can ALSO be AOT compiled to machine code, more popular nowadays (e.g. GraalVM).
In Go 1.17 they were added so that project loading did not require downloading the go.mod of every dependency in the graph.
And what will this look like if your app doesn't have library C mentioned in its dependencies, only libraries A and B? You are prohibited from answering "well, just specify all the transitive dependencies manually" because it's precisely what a lockfile is/does.
It's not always the correct solution, but sometimes it is. If I have a dependency that uses libUtil 2.0 and another that uses libUtil 3.0 but neither exposes types from libUtil externally, or I don't use functions that expose libUtil types, I shouldn't have to care about the conflict.
vs
"You can use make to ape the job of dependency managers"
wat?
There's a very strong argument that manually managing deps > auto updating, regardless of the ergonomics.
P.S. You're right, but also it's where the greatest remnant remains. :(
Majority of those people came back after a while. The alternatives get near-zero engagement, so it's just shouting into the wind. For the ones that left over political reasons, receiving near-zero engagement takes all the fun out of posting... so they're back.
Go by default uses proxy.golang.org for speed/security, and sum.golang.org for sharing verification, but it works just fine without them.
I think trust in binary packages / no remote code execution is orthogonal to dependency selection.
This may sound judgy, but at the heart it's intended to be descriptive: there are two roughly stable states, and both have their problems.
[edit]
The author confirmed that they are assuming Maven's rules and added it to the bottom of their post.
Which is going to lead to horrible issues when that library isn't compatible with all your other dependencies. What if your app directly depends on both L1 and L2, but L1 is compatible with L3 1.2 ... 1.5 while L2 is compatible with L3 1.4 ... 1.7? A general "stick to latest" policy would have L1: "L3==1.5", L2: "L3==1.7" (which breaks L1 if L2 wins). A general "stick to oldest compatible" policy would have L1: "L3==1.2", L2: "L3==1.4" (which breaks L2 if L1 wins).
The obvious solution would be to use L3 1.4 ... 1.5 - but that will never happen without the app developer manually inspecting the transitive dependencies and hardcoding the solution - in essence reinventing the lock file.
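The L1/L2/L3 example can be worked through in a few lines of Python. The ranges are the ones from the comment; the list of published L3 versions is an assumption made up for the sketch.

```python
def intersect(a, b):
    """Intersect two inclusive (low, high) version ranges; None if disjoint."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


# Versions as (major, minor) tuples.
l1_accepts = ((1, 2), (1, 5))  # L1 works with L3 1.2 ... 1.5
l2_accepts = ((1, 4), (1, 7))  # L2 works with L3 1.4 ... 1.7

overlap = intersect(l1_accepts, l2_accepts)

# Hypothetical set of published L3 versions.
published = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7)]
candidates = [v for v in published if overlap[0] <= v <= overlap[1]]
newest_compatible = max(candidates)
```

A resolver that knows both ranges can land on 1.4 or 1.5 by itself. With exact pins on each side, that information is simply not available to it.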
> It is, the app developers can just put in a direct dependency on the fixed version of L2. As mentioned earlier, this is the version that will be resolved for the project.
And how is that going to work out in practice? Is that direct dependency supposed to sit in your root-level spec file forever? Will there be a special section for all the "I don't really care about this, but we need to manually override it for now" dependencies? Are you going to have to manually specify and bump it until the end of time because you are at risk of your tooling pulling in the vulnerable version? Is there going to be tooling which automatically inspects your dependencies and tells you when it is safe to drop?
> This is the same even if you use a lockfile system. When you update dependencies you are explicitly updating the lockfile as well, so a bunch of transitive dependencies can change.
The difference is that in the lockfile world any changes to transitive dependencies are well-reasoned. If every package specifies a compatibility range for its dependencies, the dependency management system can be reasonably sure that any successful resolution will not lead to issues and that you are getting the newest package versions possible.
With a "closest-to-root" approach, all bets are off. A seemingly-trivial change in your direct dependencies can lead to a transitive dependency completely breaking your entire application, or to a horribly outdated library getting pulled in. Moreover, you might not even be aware that this is happening. After all, if you were keeping track of the specific versions of every single transitive dependency, you'd essentially be storing a lockfile - and that's what you were trying to avoid...
exactly the same thing as a lockfile
Lol, the word "modern" has truly lost all meaning. Your list of "modern package managers" seems to coincide with a list of legacy tooling I wrote four years ago! https://news.ycombinator.com/item?id=29459209
Some guy files a CVE against my library, saying it crashes if you feed it a large, untrusted file.
I decide to put out a new version of the library, fixing the CVE by refusing to load conspicuously large files. The API otherwise remains unchanged.
Is the new release a major, minor, or bugfix release? As I have only an approximate understanding of semantic versioning norms, I could go for any of them to be honest.
Some other library authors are just as confused as me, which is why major.minor.patchlevel is only a hint.
- Nearest Definition Wins: When multiple versions of the same dependency appear in the dependency tree, the version closest to your project in the tree will be used.
- First Declaration Wins: If two versions of the same dependency are at the same depth in the tree, the first one declared in the POM will be used.
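The two rules above amount to a breadth-first walk in declaration order where the first version seen for a name wins. A minimal Python sketch follows; the tree mirrors the a/b/c/d example from earlier in the thread and is not taken from a real POM, and the walk is a simplification of what Maven actually does.

```python
from collections import deque

# Each node is (name, version, children); children are in declaration order.
APP = ("my-app", "1.0", [
    ("a", "1.0", [("d", "1.0", [])]),
    ("b", "1.0", [("c", "2.0", [("d", "2.0", [])])]),
    ("d", "2.1", []),
])


def nearest_wins(root):
    """Breadth-first walk: shallower declarations are visited first, and
    within one depth, earlier declarations are visited first."""
    picked = {}
    queue = deque(root[2])
    while queue:
        name, version, children = queue.popleft()
        if name in picked:
            continue  # a nearer (or earlier) declaration already won
        picked[name] = version
        queue.extend(children)
    return picked


with_override = nearest_wins(APP)

# Same tree without the direct dependency on d: the copy under a (depth 2)
# beats the copy under b -> c (depth 3), even though it is older.
without_override = nearest_wins(("my-app", "1.0", APP[2][:2]))
```

Note that in the second case the older d 1.0 wins purely because of where it sits in the tree, which is the behaviour several commenters object to.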
Good luck finding a project of any complexity that manages to adhere to that kind of design sensibility religiously.
(I think the only language I've ever used that provided top-level support for recognizing that complexity was SML/NJ, and it's been so long that I don't remember exactly how it was done... Modules could take parameters so at the top level you could pass to each module what submodule it would be using, and only then could the module emit types originating from the submodule because the passing-in "app code" had visibility on the submodule to comprehend those types. It was... Exactly as un-ergonomic as you think. A real nightmare. "Turn your brain around backwards" kind of software architecting.)
Edit: actually, depending on the package manager, the auto generated lockfile takes less work than the single override, as they don't have the same issue maven does to require an override in the first place.
Just because thousands of programmers manage to suffer through your bad system every day does not make it good.
In an academic sense, you're probably right.
In practice it turns out that this isn't an issue in 99% of cases. Yes, I have once run into a weird issue where Nexus was corrupted and it took some debugging, so it's not like it can't happen, but assuming you don't do anything weird, the assumption that Maven artifacts are immutable is fairly safe.
I'm not saying that lockfiles aren't technically superior or anything, but the failure modes are so rare that people usually don't bother (even in Gradle where lockfiles are technically supported).
I was talking about tags above, eg. "npm i react@next", and you can use tags in your package.json. npm allows you to republish them at will, and you can never force your users to use a specific version.
Maven and Gradle make up the vast majority of all Java projects in the wild today. So, effectively, Maven is Java in terms of dependency management.
(To be generous: it might be that we didn't build our own bar the moment someone who is at least Nazi-tolerant started sniffing around for the opportunity to purchase the deed to the bar. The big criticism might be "we, as a subculture, aren't punk-rock enough.")
The client who didn't notice a difference would probably call it a bugfix.
The client whose software got ever-so-slightly more reliable probably would call it a minor update.
The client whose software previously was loading large files (luckily) without issue would call it major, because now their software just doesn't work anymore.
What you're describing with SML functors is essentially dependency injection I think; it's a good thing to have in the toolbox but not a universal solution either. (I do like functors for dependency injection, much more than the inscrutable goo it tends to be in OOP languages anyways)
Or have I misunderstood?
For example, as a developer I want Spring to stay on 6.3.x and not suddenly jump to 6.4 - as that is likely to break stuff. I do not care whether I get 6.3.1 or 6.3.6, as they are quite unlikely to cause issues. I do not care the slightest what version of libfoobar I get 5 dependencies down the line.
However, I do not want packages to suddenly change versions between different CI runs. Breakage due to minor version bumps is unlikely, but it will happen. That kind of stuff is only going to cause noise when a rebuild of a PR causes it to break with zero lines changed, so you only want version upgrades to happen specifically when you ask for them. On top of that there's the risk of supply chain attacks, where pulling in the absolute latest version of every single package isn't exactly a great idea.
The package spec allows me to define "spring at 6.3.x", the lockfile stores that we are currently using "spring 6.3.5, libfoobar 1.2.3". I ask it to look for upgrades, it resolves to "spring 6.3.6, libfoobar 1.4.0". There's also a "spring 6.4.0" available, but the spec says that we aren't interested so it gets ignored. All tests pass, it gets merged, and we'll stay at those versions until we explicitly ask for another upgrade.
The whole flow exists for things which aren't root dependencies. The major versions of those are trivial to keep track of manually, and you'll only have a handful of them. It's all the minor versions and downstream dependencies which are the issue: tracking them all manually is a nightmare, and picking whatever happens to be the latest version at time-of-build is a massive no-no.
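The spec/lockfile split described in this comment can be sketched in Python. The "6.3.x" wildcard syntax, the `matches` and `upgrade` helpers, and the list of available versions are all assumptions for illustration, not the behaviour of a specific package manager.

```python
def matches(spec, version):
    """True if version (a tuple) fits a spec like "6.3.x"; "x" is a wildcard."""
    return all(s == "x" or int(s) == v for s, v in zip(spec.split("."), version))


def upgrade(spec, available):
    """Explicit upgrade step: newest available version the spec allows."""
    return max(v for v in available if matches(spec, v))


spec = {"spring": "6.3.x"}     # what the developer cares about
lock = {"spring": (6, 3, 5)}   # what every build actually uses

# Hypothetical registry contents at upgrade time.
available = [(6, 3, 5), (6, 3, 6), (6, 4, 0)]

# Ordinary builds read the lock and never consult the registry.
build_version = lock["spring"]

# Only an explicit upgrade request re-resolves and rewrites the lock.
lock["spring"] = upgrade(spec["spring"], available)
```

Here 6.4.0 is skipped because the spec excludes it, and nothing moves between CI runs until someone reruns the upgrade step.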
You can Google "YAMLException: The incoming YAML document exceeds the limit" - an error introduced in response to CVE-2022-38752 - to see what happens when a library introduces a new input size limit.
What happened in that case is: the updated library bumps their version from 1.31 to 1.32; then a downstream application updates their dependencies, passes all tests, and updates their version from 9.3.8.0 to 9.3.9.0
If you force a transitive dependency in Maven, then yes, some other library may become incompatible with it. But in NPM, when people declare a dependency as, say, ~1.2.3, they also don't know if they will be compatible with a future 1.2.4 version. They just _assume_ the next patch release won't break anything. Yes, npm will try to find a version that satisfies all declarations, but library devs couldn't know the new version would be compatible because it wasn't published at that time.
And my point is that it's _exactly_ the same probability that the next patch version is incompatible in both Maven and NPM. That's why NPM users are not afraid to depend on ~x.x or even ^x.x; they're basically YOLOing.
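For readers unfamiliar with the `~` and `^` notation, here is a rough Python sketch of how such ranges are commonly interpreted for full x.y.z versions. It ignores prerelease tags and partial versions, and the helper names are made up; it is not npm's implementation.

```python
def parse(version):
    return tuple(int(part) for part in version.split("."))


def upper_bound(spec):
    """Exclusive upper bound for a "~x.y.z" or "^x.y.z" spec."""
    op = spec[0]
    major, minor, patch = parse(spec[1:])
    if op == "~":
        return (major, minor + 1, 0)       # patch-level changes only
    if op == "^":
        if major > 0:
            return (major + 1, 0, 0)       # anything below the next major
        if minor > 0:
            return (0, minor + 1, 0)       # 0.y.z: stay within the minor
        return (0, 0, patch + 1)           # 0.0.z: exact patch only
    raise ValueError(f"unsupported spec: {spec}")


def satisfies(version, spec):
    return parse(spec[1:]) <= parse(version) < upper_bound(spec)


tilde_accepts_patch = satisfies("1.2.4", "~1.2.3")
tilde_accepts_minor = satisfies("1.3.0", "~1.2.3")
caret_accepts_minor = satisfies("1.9.0", "^1.2.3")
caret_accepts_major = satisfies("2.0.0", "^1.2.3")
```

In both cases the author is accepting versions that did not exist when the range was written, which is the bet the comment above describes.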
That's precisely because Maven doesn't support version ranges. Maven artifacts are also immutable.
Maven also supports a manual override for when the insane resolution strategy fails: that's the `dependencyManagement` section.
In theory... None of us should be doing it. Emitting raw underlying structures from a dependency coupled with ranged versioning means part of your API is under-specified; "And this function returns an argument, the type of which is whatever this third-party that we don't directly communicate with says the type is." That's hard to code against in the general case (but it works out often enough in the specific case that I think it's safe to do 95-ish percent of the time).
Ultimately, these are imperfect solutions to practical problems, and I know that I much prefer the semantic versioning and lockfile approach to whatever the Java people are into.
But anyway.. isn't that exactly the purpose of lock files? If you don't trust the semver range, it shouldn't matter because every `npm ci` results in the same package versions.
https://maven.apache.org/enforcer/enforcer-rules/versionRang...
The person who wrote the range selected a range that they deem likely to work.
I don't use NPM, but in Python it definitely happens that you see e.g.:
foo >= 0.3.4, <= 0.5.6
Which can save a lot of headaches early on for packages that use ZeroVer[1]
Alternatively, it's a pointer to an opaque data structure. But then that fact (that it's a pointer) is frozen.
Either way, you can rely on dependencies not just pulling the rug from under you.
(I remember, ages ago, trying to wrap my head around Component Object Model. It took me a while to grasp it in the abstract because, I finally realized, it was trying to solve a problem I'd never needed to solve before: ABI compatibility across closed-source binaries with different compilation architectures.)