any other resources like this?
Same for Java, I have yet to in my entire career see enterprise Java be performant and not memory intensive.
At the end of the day, if you care about performance at the app layer, you will use a language better suited to that.
In practice though, for most enterprise web services, a lot of real world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.)
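For illustration, here's a hedged sketch of the bulk-query idea (table and column names are made up): building one multi-row INSERT so a single round trip replaces a loop of single-row statements.

```java
import java.util.Collections;
import java.util.List;

public class BulkSql {
    // Build "INSERT INTO t (a, b) VALUES (?, ?), (?, ?), ..." for rowCount rows,
    // so one round trip replaces rowCount separate statements.
    static String bulkInsertSql(String table, List<String> columns, int rowCount) {
        String row = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, row));
    }

    public static void main(String[] args) {
        // One statement for two rows instead of two single-row statements
        System.out.println(bulkInsertSql("orders", List.of("id", "total"), 2));
        // INSERT INTO orders (id, total) VALUES (?, ?), (?, ?)
    }
}
```

(JDBC batching with `PreparedStatement.addBatch()` is the other common route to the same effect.)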
I'm hopeful that improvements in LLMs mean we can ditch ORMs (adopted under the guise that they're quicker for writing queries and the in-between mapping code) and instead make good use of SQL to harness the power that modern databases provide.
I wish Java had a proper compiler.
It doesn't excuse the "use exceptions for control flow" anti-pattern, but it is a quick patch.
And aside from algorithms, it usually comes down to avoiding memory allocations.
I have my go-to zero-alloc grpc and parquet and json and time libs etc and they make everything fast.
It’s mostly how idiomatic Java uses objects for everything that makes it slow overall.
But eventually after making a JVM app that keeps data in something like data frames etc and feels a long way from J2EE beans you can finally bump up against the limits that only c/c++/rust/etc can get you past.
This one is so prevalent that the JVM has an optimization where it gives up on filling in the stack trace for an exception that is thrown over and over from the exact same place (controlled by the -XX:-OmitStackTraceInFastThrow flag).
Java is only fast-ish even on its best day. The more typical performance is much worse because the culture around the language usually doesn't consider performance or efficiency to be a priority. Historically it was even a bit hostile to it.
I was listening to someone say they write fast code in Java by avoiding allocations with a PoolAllocator that would "cache" small objects with poolAllocator.alloc(), poolAllocator.release(). So just manual memory management with extra steps. At that point why not use a better language for the task?
Well, JS is fast and Go is faster, but Java is C++-fast.
It gets a reaction, though, so great for social media.
This is usually the first thing I look for when someone is complaining about speed. Developers often miss it because they are developing against a database on their local machine which removes any of the network latency that exists in deployed environments.
Also, before jsonb existed, you'd often run into big blobs of properties you don't care to split up into tables. Now it takes some discipline to avoid shoving things into jsonb that shouldn't be.
I recently fixed a treesitter perf issue (for myself) in neovim by just dfsing down the parse tree instead of what most textobject plugins do, which is:
-> walk the entire tree for all subtrees that match this metadata
-> now you have a list of matching subtrees, iterate through said subtree nodes, and see which ones are "close" to your cursor.
But in neovim, when I type "daf", I usually just want to delete the function right under my cursor. So you can just implement the same algorithm by just... dfsing down the parse tree (which has line numbers embedded per nodes) and detecting the matches yourself.
In school, when I did competitive programming and TCS, these gains often came from super clever invariants that you would just sit there for hours, days, weeks, just mulling it over. Then suddenly realize how to do it more cleverly and the entire problem falls away (and a bunch of smart people praise you for being smart :D). This was not one of them - it was just, "go bypass the API and do it faster, but possibly less maintainably".
In industry, it's often trying to manage the tradeoff between readability, maintainability, etc. I'm very much happy to just use some dumb n^2 pattern for n <= 10 in some loop that I don't really care much about, rather than start pulling out some clever state manipulation that could lead to pretty "menial" issues such as:
- accidental mutable variables and duplicating / reusing them later in the code
- when I look back in a week, "What the hell am I doing here?"
- or just tricky logic in general
I only noticed the treesitter textobject issue because I genuinely started working with 1MB autogen C files at work. So... yeah...
I could go and bug the maintainers to expose a "query over text range" API (they only have query, and node text range separately, I believe; at least from the minimal research I've done, I haven't kept up to date with it). But now that ties into considerations far beyond myself - does this expose state in a way that isn't intuitive? Are we adding composable primitives or just ad hoc adding features into the library to make it faster because of the tighter coupling? etc. etc.
I used to think of all of that as just kind of "bs accidentals" and "why shouldn't we just be able to write the best algorithms possible". As a maintainer of some systems now... nah, the architectural design is sometimes more fun!
I may not have these super clever flashes of insight anymore but I feel like my horizons have broadened (though part of it is because GPT Pro started 1 shotting my favorite competitive programming problems circa late 2025 D: )
Maven on the other hand, is just plain boring tech that works. There's plenty of documentation on how to use it properly for many different environments/scenarios, it's declarative while enabling plug-ins for bespoke customisations, it has cruft from its legacy but it's quite settled and it just works.
Could Maven be more modern if it was invented now? Yeah, sure, many other package managers were developed since its inception with newer/more polished concepts but it's dependable, well documented, and it just plain works.
Gradle does suck and maven is ok but a bit ugly.
Programming in Rust is a constant negotiation with the compiler. That isn't necessarily good or bad but I have far more control in Zig, and flexibility in Java.
Maybe we can ditch active models like those we see in sqlalchemy, but the typed query builders that come with ORMs are going to become more important, not less. Leveraging the compiler to catch bad queries is a huge win.
In my demo app, the CPU hotspots were entirely in application code, not I/O wait. And across a fleet, even "smaller" gains in CPU and heap compound into real cost and throughput differences. They're different problems, but your point is valid. The goal here is to get more folks thinking about other aspects of performance, especially when the software is running at scale.
The rest were all very familiar. Well, apart from the new stuff. I think most of my code was running in java 6...
I’ve heard about HFT people using Java for workloads where micro optimization is needed.
To be frank, I just never understood it. From what I've seen/heard, you have to write the code in such a way that it looks clumsy and is incompatible with pretty much any third-party dependency out there.
And at that point, why are you even using Java? Surely you could use C, C++, or any variety of popular or unpopular languages that would be more fitting and ergonomic (sorry, but as a language Java just feels inferior even to C#). The biggest selling point of Java is the ecosystem, and you can't even really use that.
Too many folks have the mindset that there is only one JVM, when that hasn't been the case since the 2000s, after Java for various reasons started popping up everywhere.
After all, even if one has some slow, beastly, unoptimized Spring Boot container that chews through RAM, it's not that expensive (in the grand scheme of things) to just replicate more instances of it.
In practice, for web applications, exposing some sort of `WarmupTask` abstraction in your service chassis that devs can implement will get you quite far. Just delay serving traffic on new deployments until all tasks complete; that way users will never hit a cold node.
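A minimal Java sketch of that idea (names like `WarmupTask` and `WarmupGate` are illustrative, not from any particular framework): readiness flips only after every registered task has completed.

```java
import java.util.List;

// Illustrative sketch: the service chassis runs every registered WarmupTask
// before the readiness probe reports healthy, so users never hit a cold node.
interface WarmupTask {
    void warmUp() throws Exception;
}

final class WarmupGate {
    private final List<WarmupTask> tasks;
    private volatile boolean ready = false;

    WarmupGate(List<WarmupTask> tasks) {
        this.tasks = tasks;
    }

    void runAll() {
        for (WarmupTask task : tasks) {
            try {
                task.warmUp(); // e.g. prime caches, exercise hot paths so the JIT warms up
            } catch (Exception e) {
                throw new IllegalStateException("Warmup failed", e);
            }
        }
        ready = true; // the readiness endpoint would consult this flag
    }

    boolean isReady() {
        return ready;
    }
}
```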
* Most mature Java projects have moved to Kotlin.
* The standard build system is Gradle, scripted in either Groovy or Kotlin, which gets compiled on the JVM so it can then compile your Java.
* Log4shell, amongst other vulnerabilities.
* Super slow to adopt features like async execution
* Standard repo usage is terrible.
There is no point in using Java anymore. I don't agree that Rust is a replacement, but between Python, Node, and C/C++ extensions to those, you can do everything you need.
https://gwern.net/doc/cs/2005-09-30-smith-whyihateframeworks...
My experience with something like the latest Claude Code models these days has been that they are pretty good at SQL. I think some combination of LLM review of SQL code with smoke tests would do the trick here.
But on Java specifically: every Java object still has a 24-byte overhead. How doesn't that thrash your cache?
The advice on avoiding allocations in Java also results in terrible code. For example, in math libraries, you'll often see void Add(Vector3 a, Vector3 b, Vector3 out) as opposed to the more natural Vector3 Add(Vector3 a, Vector3 b). There you go: function composition goes out the window, and the resulting code is garbage to read and write. Not even C is that bad; the compiler will optimize the temporaries away. So you end up with Java that is worse than a low-level imperative language.
And, as far as I know, the best GC for Java still incurs no less than 1ms pauses? I think the stock ones are as bad as 10ms. How anyone does low-latency anything in Java then boggles my mind.
https://foojay.io/today/how-is-leyden-improving-java-perform...
There's a balance with a DB. Doing 1- or 2-row queries 1,000 times is obviously inefficient, but making a 1M-row query can have its own set of problems all the same (even if you need that 1M).
It'll depend on the hardware, but you really want to make sure that anything you do with a DB allows for other instances of your application a chance to also interact with the DB. Nothing worse than finding out the 2 row insert is being blocked by a million row read for 20 seconds.
There's also a question of when you should and shouldn't join data. It's not always a black and white "just let the DB handle it". Sometimes the better route to go down is to make 2 queries rather than joining, particularly if it's something where the main table pulls in 1000 rows with only 10 unique rows pulled from the subtable. Of course, this all depends on how wide these things are as well.
But 100% agree, ORMs are the worst way to handle all these things. They very rarely do the right thing out of the box and to make them fast you ultimately end up needing to comprehend the SQL they are emitting in the first place and potentially you end up writing custom SQL anyways.
AOT options like GraalVM Native Image can help cold starts a lot, but then half your favorite frameworks break and you trade one set of hoops for another. Pick which pain you want.
I long ago concluded that Java was not a client or systems programming language because of the implementation priorities of the JVM maintainers. Note that I say priorities--they are extremely bright and capable engineers that focus on different use cases, and there isn't much money to be made from a client ecosystem.
The folks on embedded get to play with PTC and Aicas.
Android, even if not proper Java, has dex2oat.
I think Java (or other JVM languages) are then best positioned, because of jooq. Still the best SQL generation library I've used.
If it is valuable, I'd be surprised if you can't freeze/resume the state and use it for instantaneous workload-optimized startup.
There are JITs that use dynamic profile guided optimization which can adjust the emitted binary at runtime to adapt to the real world workload. You do not need to have a profile ahead of time like with ordinary PGO. Java doesn't have this yet (afaik), but .NET does and it's a huge deal for things like large scale web applications.
https://devblogs.microsoft.com/dotnet/bing-on-dotnet-8-the-i...
Apart from that my experience over the last 20 years was that a lot of performance is lost because of memory allocation (in GCed languages like Java or JavaScript). Removing allocation in hot loops really goes a long way and leads to 10 or 100 fold runtime improvements.
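A small, hedged illustration of that point in Java: reusing one buffer across iterations instead of allocating a fresh object each time (the row-formatting loop here is made up).

```java
import java.util.ArrayList;
import java.util.List;

public class BufferReuse {
    // Format rows using a single reused StringBuilder; setLength(0) resets the
    // buffer in place rather than allocating a new builder per iteration.
    static List<String> formatRows(int count) {
        List<String> rows = new ArrayList<>(count);
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < count; i++) {
            buf.setLength(0);             // reset, no new allocation
            buf.append("row-").append(i);
            rows.add(buf.toString());     // toString() still copies, but the builder is reused
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(formatRows(3)); // [row-0, row-1, row-2]
    }
}
```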
That said, the article does have the "LLM stank" on it, which is always offputting, but the content itself seems solid.
Well, the whole thing was standard Java OOP, except they also had a bunch of functional programming stuff on top of that. I can relate to that -- I think they were university students when they started, and I definitely had an OOP and FP phase. But then they just... kept it, 10+ years later.
So while it's true that you can write C in any language... those kind of folks don't tend to use Java in the first place ;)
--
(Except Notch? Well, his code looks like C, not sure if it's actually fast! I really enjoyed his 4 kilobyte java games back in the day, I think he published the source for each one too.)
EDIT: Found it!
https://web.archive.org/web/20120317121029/http://www.mojang...
Edit 2: This one has a download, still works!
https://web.archive.org/web/20120301015921/http://www.mojang...
Doing it to avoid memory pressure generally means you simply have a bad algorithm that needs to be tweaked. It's very rarely the right solution.
I've spent a fair few years developing lowish (10-20us wire to wire) latency trading systems and the majority of the code does not need to go fast. It's just wasted effort, a debugging headache, and technical debt. So the natural trade off is a bit of pain to make the hot path fast through spans, unsafe code, pre-allocated object pools, etc and in return you get to use a safe and easy programming language everywhere else.
In C# low latency dev is not even that painful, as there are a lot of tools available specifically for this purpose by the runtime.
This is actually the perfect situation: you are allowed to do it carefully and manually for 1% of code on the hot path, but you don't have to worry about it for the 99% of the code that's not.
A project might also grow into these requirements. I can easily imagine that something wasn't problematic for a long time but suddenly emerged as an issue over time. At that point you wouldn't want to migrate the whole codebase to a better language anymore.
Like I said, it's not hypermodern with batteries included, nor streamlined for the workflows that became common after it was created, but it doesn't need workarounds: defining a plugin to be called in one of the lifecycle steps isn't complicated, and that capability is provided as part of its plugin architecture.
I can understand spending many hours fighting Gradle; even I, with plenty of experience with Gradle (begrudgingly, I don't like it at all), still end up fighting its idiocies. But Maven... it's like any other tool: you need to learn the basics, but after that you will only fight it if you're veering away from the well-documented usage (of which there is plenty; it's been battle-tested for decades).
They store up conserved programming time and then spend it all at once when you hit the edge case.
If you never hit the case, it's great. As soon as you do, it's all returned with interest :)
Because in my experience as of 2026, Java programs are consistently among the most painful or unpleasant to interact with.
When using JDBC, I quickly found myself implementing a poor man's ORM.
I am talking about this bug. It looks like it is still unfixed, in the sense that there is a PR fixing it, but it wasn't merged. LOL.
Regardless of whether this specific bug would be caught by Rust compiler, Bun in general is notorious for crashing, just look at how many open issues there are, how many crashes.
Not saying that you cannot make a correct program in Zig, but I prefer having checks that Rust compiler does, to not having them.
It's cool when your tooling warns you about potential bugs or mistakes in implementation, but it's still your responsibility to write the correct code. If you pick up a hammer and hit your finger instead of the nail, then in most cases (though not always) it’s your own fault.
Oracle is a prime example of this. Stored procedures are the place to put all business logic according to Oracle documentation.
This caused a backlash from developers, who then declared that business logic should never live inside the database, to avoid vendor lock-in.
There's no ideal solution, just tradeoffs.
> But people just want to compare it to building a cli tool in go or rust.
This seems like the key. HN is definitely biased towards simpler, smaller tools. (And that's not a bad thing!). The most compelling JVM stories I hear are all from much larger scale enterprise settings.
Kafka being a good example. It's very good at what it does, but painful to manage and usually not worth the pain for anyone who's not in a mega enterprise.
I mean, that already happens. It's quite rare to see someone migrate from one database to another. Even if they stuck to pure SQL for everything, it's still a pretty daunting process, as the Postgres and MSSQL dialects of SQL aren't the same thing.
Part 1 of 3 in the Java Performance Optimization series. Parts 2 and 3 coming soon.
I built a Java order-processing app for a talk I gave at DevNexus a couple of weeks ago. The app worked. Tests passed. I ran a load test and collected a Java Flight Recording (JFR).
Before any changes: 1,198ms elapsed time, 85,000 orders per second, peak heap sitting at just over 1GB, 19 GC pauses.
After: 239ms. 419,000 orders per second. 139MB heap. 4 GC pauses.
Same app. Same tests. Same JDK. No architectural changes. And those numbers get a lot more meaningful when you consider that code like this doesn’t run on a single box in production. It runs across a fleet.
In Part 2 I’ll walk through the profiling data behind those numbers: the flame graph, which methods were actually hot, and what changed when we fixed them. Before we get there, you need to know what kinds of things we were actually fixing.
The problems were patterns that show up in real codebases. They compile fine, they sneak through code review, and they’re the kind of thing you could miss without profiling data telling you where to look. Here are eight of them.
TL;DR: Fixing anti-patterns like the ones below turned a Java app that took 1,198ms into one that took 239ms: 5x throughput, 87% less heap, 79% fewer GC pauses. Same app, same tests, same JDK.
String report = "";
for (String line : logLines) {
report = report + line + "\n";
}
This code looks good, right? The problem is what String immutability means in practice.
Every time you use +, Java creates a brand new String object, a full copy of all previous content with the new bit appended. The old one gets discarded. This happens every single iteration.
The characters being copied scale as O(n²). If you have 10,000 lines, iteration 1 copies roughly nothing, iteration 5,000 copies 5,000 characters worth of accumulated content, iteration 10,000 copies all of it. BellSoft ran JMH benchmarks on exactly this and showed that when n grows by 4x, the loop-concatenation version slows down by more than 7x, much worse than linear growth.
The fix:
StringBuilder sb = new StringBuilder();
for (String line : logLines) {
sb.append(line).append("\n");
}
String report = sb.toString();
StringBuilder works off a single mutable character buffer. One allocation. Every append writes into that buffer. One toString() at the end.
Note: Since JDK 9, the compiler is smart enough to optimize "Order: " + id + " total: " + amount on a single line. But that optimization doesn’t carry into loops. Inside a loop, you still get a new StringBuilder created and thrown away on every iteration. You have to declare it before the loop yourself, like the fix above shows.
for (Order order : orders) {
int hour = order.timestamp().atZone(ZoneId.systemDefault()).getHour();
long countForHour = orders.stream()
.filter(o -> o.timestamp().atZone(ZoneId.systemDefault()).getHour() == hour)
.count();
ordersByHour.put(hour, countForHour);
}
This looks reasonable. You’re grouping orders by hour. But look at what’s happening: for each order, you’re streaming over the entire list to count how many orders share that hour. If you have 10,000 orders, that’s 10,000 iterations times 10,000 stream elements. That’s 100 million comparisons for what should be a single pass.
In my demo app, this exact pattern was the single largest CPU hotspot. It accounted for nearly 71% of CPU stack samples in the JFR recording.
The fix:
for (Order order : orders) {
int hour = order.timestamp().atZone(ZoneId.systemDefault()).getHour();
ordersByHour.merge(hour, 1L, Long::sum);
}
One pass. O(n). Each order increments its hour’s count directly. You could also use Collectors.groupingBy(... Collectors.counting()) to do it in a single stream pipeline, but the merge approach is clear and avoids the overhead of creating a stream at all.
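For reference, here's a sketch of the single-pass stream alternative mentioned above, using a minimal stand-in for the article's Order type (the record definition and sample data are illustrative).

```java
import java.time.Instant;
import java.time.ZoneId;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByHour {
    // Minimal stand-in for the article's Order type
    record Order(Instant timestamp) {}

    // One pass over the orders: group by hour, counting per group
    static Map<Integer, Long> ordersByHour(List<Order> orders, ZoneId zone) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        o -> o.timestamp().atZone(zone).getHour(),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(Instant.parse("2024-01-01T09:15:00Z")),
                new Order(Instant.parse("2024-01-01T09:45:00Z")),
                new Order(Instant.parse("2024-01-01T10:05:00Z")));
        System.out.println(ordersByHour(orders, ZoneId.of("UTC")));
    }
}
```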
If you see a .stream() call inside a loop body, that’s a signal to pause and check whether you’re doing redundant work.
public String buildOrderSummary(String orderId, String customer, double amount) {
return String.format("Order %s for %s: $%.2f", orderId, customer, amount);
}
String.format() tends to get recommended as the clean, readable way to build strings. It is readable, but it's also the slowest string-building option in Java when you're calling it frequently.
Baeldung ran JMH benchmarks across every string concatenation approach in Java. String.format() came in last in every category. It has to parse the format string every call, run regex-based token matching, and dispatch through the full java.util.Formatter machinery. StringBuilder was consistently the fastest.
The fix:
return "Order " + orderId + " for " + customer + ": $" + String.format("%.2f", amount);
Use String.format() for the numeric formatting where you need it, and let the compiler optimize the rest. Or just use a StringBuilder if you need full control.
String.format() is fine for config loading, startup code, error messages, anywhere that runs infrequently. Move it out of anything your profiler says is hot.
Long sum = 0L;
for (Long value : values) {
sum += value;
}
What’s actually happening at the JVM level:
Long sum = Long.valueOf(0L);
for (Long value : values) {
sum = Long.valueOf(sum.longValue() + value.longValue());
}
Each iteration unboxes sum to get a long, adds, then boxes the result back into a new Long object. With a million elements, you’ve created a million Long objects that the GC has to clean up. Each Long on a 64-bit JVM takes roughly 16 bytes on the heap. That’s 16MB of heap churn for what should be a simple addition loop.
long sum = 0L; // primitive, not the wrapper
for (long value : values) {
sum += value;
}
Where this tends to sneak in: aggregation and processing loops. Summing metrics, accumulating counters, building stats. Boxed types creep in because someone used Long in a collection signature somewhere upstream and nobody thought about what it costs downstream in the loop. That can be legitimately easy to miss.
Watch for Integer, Long, or Double used as local loop variables or accumulators. Also watch for List<Long> and Map<String, Integer> in frequently-called code. Every .get() and .put() involves a box/unbox round trip that you’re paying for silently.
public int parseOrDefault(String value, int defaultValue) {
try {
return Integer.parseInt(value);
} catch (NumberFormatException e) {
return defaultValue;
}
}
If this method is called in a tight loop with a meaningful percentage of non-numeric inputs, you have a performance problem that might not look like one.
The expensive part is Throwable.fillInStackTrace(), which runs inside the Throwable constructor every time an exception is created. It walks the entire call stack via a native method and materializes it into StackTraceElement objects. The deeper your call stack, the more expensive this is. Imagine a situation in a framework like Spring where this can get very deep. Norman Maurer from the Netty project benchmarked this and the difference is significant. Baeldung’s JMH results show that throwing an exception makes a method run hundreds of times slower than a normal return path.
This isn’t theoretical. There’s a real production story of a Scala/JVM templating system that cut response time by 3x after discovering that a NumberFormatException was being thrown on every field of every template render. Every time a field name was being tested to see if it was a numeric index, it threw.
The fix:
public int parseOrDefault(String value, int defaultValue) {
    if (value == null || value.isBlank()) return defaultValue;
    for (int i = 0; i < value.length(); i++) {
        char c = value.charAt(i);
        if (i == 0 && c == '-') {
            if (value.length() == 1) return defaultValue; // a bare "-" is not a number
            continue;
        }
        if (!Character.isDigit(c)) return defaultValue;
    }
    try {
        return Integer.parseInt(value);
    } catch (NumberFormatException e) {
        return defaultValue; // now only reachable on overflow, e.g. "99999999999"
    }
}
Or use NumberUtils.isParsable() from Apache Commons Lang if it’s already on your classpath.
The principle: if invalid input is a routine case in your application, user-provided data, external feeds, anything you don’t fully control, pre-validate explicitly. Exceptions are for genuinely unexpected conditions, not for “this might be in the wrong format.”
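When throwing on a hot path is genuinely unavoidable, a complementary technique (not used in the demo app, just a sketch) is an exception type that opts out of stack-trace capture via the four-argument Throwable constructor:

```java
// Sketch: a control-flow exception that skips stack-trace capture entirely.
// The fourth constructor argument, writableStackTrace = false, means
// fillInStackTrace() is never called, avoiding the expensive stack walk.
class FastParseException extends RuntimeException {
    FastParseException(String message) {
        super(message, null, false, false); // no suppression, no writable stack trace
    }
}

public class StacklessDemo {
    public static void main(String[] args) {
        try {
            throw new FastParseException("not a number");
        } catch (FastParseException e) {
            // No stack walk happened at construction time
            System.out.println(e.getStackTrace().length); // prints 0
        }
    }
}
```

The trade-off is obvious: if such an exception ever escapes, you get no trace to debug with, so this belongs only on paths where the exception is part of expected control flow.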
public class MetricsCollector {
private final Map<String, Long> counts = new HashMap<>();
public synchronized void increment(String key) {
counts.merge(key, 1L, Long::sum);
}
public synchronized long getCount(String key) {
return counts.getOrDefault(key, 0L);
}
}
Shared mutable state needs protection. But synchronized on the whole method means only one thread can call either method at any given time. In a service handling real concurrency, every thread calling increment() queues up waiting for every other thread to finish. The lock itself becomes the bottleneck.
The fix:
private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();
public void increment(String key) {
counts.computeIfAbsent(key, k -> new LongAdder()).increment();
}
public long getCount(String key) {
LongAdder adder = counts.get(key);
return adder == null ? 0L : adder.sum();
}
ConcurrentHashMap handles concurrent reads and writes without locking the whole structure. LongAdder is purpose-built for high-concurrency incrementing. It distributes the counter across internal cells and outperforms AtomicLong under contention.
Worth calling out separately: Collections.synchronizedMap() wrappers have the same broad-lock problem, one lock for the entire map. ConcurrentHashMap is almost always the right replacement.
public String serializeOrder(Order order) throws JsonProcessingException {
return new ObjectMapper().writeValueAsString(order);
}
ObjectMapper is one of the most common examples of an object that looks cheap to create but isn’t. Constructing one involves module discovery, serializer cache initialization, and configuration loading. It’s real work happening on every call here.
Same pattern with DateTimeFormatter.ofPattern("..."), new Gson(), new XmlMapper(). They’re all designed to be constructed once and reused. Creating them in a hot method means paying that setup cost on every invocation.
The fix:
private static final ObjectMapper MAPPER = new ObjectMapper();
public String serializeOrder(Order order) throws JsonProcessingException {
return MAPPER.writeValueAsString(order);
}
ObjectMapper is thread-safe once configured, so sharing a static final instance is fine. The DateTimeFormatter built-ins like DateTimeFormatter.ISO_LOCAL_DATE are already singletons. If you’re calling DateTimeFormatter.ofPattern("...") in a hot method, move it to a constant.
The heuristic: if an object’s constructor does substantial setup work and the object is stateless (or safely shareable) after construction, it should be a field or a constant, not a local variable.
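Applying that heuristic to DateTimeFormatter (the pattern here is just an example):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Timestamps {
    // Built once; DateTimeFormatter is immutable and thread-safe, so sharing is fine
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String format(LocalDateTime t) {
        return t.format(TS); // no per-call ofPattern() pattern parsing
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDateTime.of(2024, 1, 2, 3, 4, 5)));
        // 2024-01-02 03:04:05
    }
}
```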
This one is worth including if you’ve started using virtual threads, introduced as a production feature in Java 21.
Virtual threads work by mounting onto a small pool of platform (OS) threads called carrier threads. When a virtual thread blocks, waiting on I/O for example, the scheduler unmounts it from the carrier, freeing that carrier to run something else. That’s the whole scalability story with virtual threads.
But there’s a catch. When a virtual thread enters a synchronized block and hits a blocking operation while inside it, it can’t be unmounted. It pins the carrier thread. That platform thread is now stuck waiting, unable to serve other virtual threads, for as long as the blocking operation takes.
// This pattern can pin a carrier thread on JDK 21
public synchronized String fetchData(String key) throws IOException {
return Files.readString(Path.of("/data/" + key)); // blocking I/O inside synchronized
}
If this happens frequently enough, all your carrier threads get pinned and your application stalls, even with thousands of virtual threads waiting to do work. Netflix ran into exactly this in production and wrote a post about debugging it.
JFR actually tells you when this is happening. The jdk.VirtualThreadPinned event fires whenever a virtual thread blocks while pinned, and by default it only triggers when the operation takes longer than 20ms, so it’s already filtered to the cases that actually matter.
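If you'd rather capture those events programmatically than via command-line flags, here's a hedged sketch using the jdk.jfr Recording API (the API is JDK 11+; the jdk.VirtualThreadPinned event itself only exists on JDK 21+, and enabling an unknown event name is harmless on older JDKs):

```java
import jdk.jfr.Recording;

import java.nio.file.Path;
import java.time.Duration;

public class PinningRecorder {
    public static void main(String[] args) throws Exception {
        try (Recording recording = new Recording()) {
            // Record only pinned-while-blocked episodes longer than 20ms
            recording.enable("jdk.VirtualThreadPinned").withThreshold(Duration.ofMillis(20));
            recording.start();
            // ... run the suspect workload here ...
            recording.stop();
            recording.dump(Path.of("pinning.jfr")); // inspect in JDK Mission Control
        }
    }
}
```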
The fix on JDK 21–23:
private final ReentrantLock lock = new ReentrantLock();
public String fetchData(String key) throws IOException {
lock.lock();
try {
return Files.readString(Path.of("/data/" + key));
} finally {
lock.unlock();
}
}
ReentrantLock doesn’t use OS-level object monitors, so the JVM can unmount the virtual thread normally when it blocks, instead of pinning it to the carrier.
JDK 24 note: JEP 491, shipped in Java 24, largely resolves this. synchronized no longer causes pinning in most cases on JDK 24+. If you’re still on 21, 22, or 23, this is still relevant and worth checking for with JFR. If you’re on 24, you mostly don’t have to worry about it for synchronized, though native method calls can still cause pinning.
None of these patterns crash your application. They don’t throw exceptions or produce wrong answers. They just make everything a bit slower, chew through more memory, and scale worse than they should.
What makes them hard to find without profiling is that any one of them might be completely harmless in your codebase. String concatenation in a loop that runs once at startup costs you nothing. String.format() in a utility class called twice a day is fine. The issue is when these patterns land in hot paths, code that runs on every request, every event, every iteration of your main processing loop.
In my demo app, patterns like these and others turned a 239ms operation into a 1,198ms one and pushed heap usage from 139MB to over 1GB. No single pattern was catastrophic in isolation. But fix the heap pressure and GC pauses dropped from 19 to 4. Fix the contention and now new hotspots become visible that were previously buried under the noise. The shape of the profile shifts.
And these improvements compound again beyond a single application. Some of these optimizations might seem trivial when you’re looking at a single instance or seeing small improvements in your test suite run time. But often real world Java code doesn’t run on one box. In production, there are apps that run across a fleet handling a large volume of real customer requests. An improvement that shaves a few milliseconds or reduces heap pressure on one host is happening across thousands of hosts simultaneously. At that scale, the aggregate difference is incredible. Cost impact can be significant when you consider throughput improvements and potential instance downsizing across a fleet.
That cascading effect is what I want to show in Part 2, directly in JDK Mission Control. You’ll see the flame graph before any changes, then what it looks like after the first round of fixes, and how the picture keeps changing. In Part 3, we’ll look at automating the process of identifying and implementing performance improvements.
If any of these look familiar, wait until you see what the flame graph looks like. I’m on LinkedIn. Part 2 coming soon: One Method Was Using 71% of CPU. Here’s the Flame Graph.