Orders by hour could be made faster. The issue with it is it's using a map when an array works both faster and just fine.
On top of that, the map boxes the "hour" which is undesirable.
This is how I'd write it
long[] ordersByHour = new long[24];
var deafultTimezone = ZoneId.systemDefault();
for (Order order : orders) {
int hour = order.timestamp().atZone(deafultTimezone).getHour();
ordersByHour[hour]++;
}
If you know the bound of an array, it's not large, and you are directly indexing in it, you really can't do any better performance wise.
It's also not less readable, just less familiar as Java devs don't tend to use arrays that much.
maybe it would be a little better to use ints rather than longs, as Java lists can't be bigger than the int max value anyways. Saves you a cache line or two.
Understanding algorithmic complexity (in particular, avoiding rework in loops), is useful in any language, and is sage advice.
In practice though, for most enterprise web services, a lot of real world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.)
I'm hopeful that improvements in LLMs mean we can ditch ORMs (under the guise that they are quicker to write queries and the inbetween mapping code with) and instead make good use of SQL to harness the powers that modern databases provide.
There's a balance with a DB. Doing 1 or 2 row queries 1000 times is obviously inefficient, but making a 1M row query can have it's own set of problems all the same (even if you need that 1M).
It'll depend on the hardware, but you really want to make sure that anything you do with a DB allows for other instances of your application a chance to also interact with the DB. Nothing worse than finding out the 2 row insert is being blocked by a million row read for 20 seconds.
There's also a question of when you should and shouldn't join data. It's not always a black and white "just let the DB handle it". Sometimes the better route to go down is to make 2 queries rather than joining, particularly if it's something where the main table pulls in 1000 rows with only 10 unique rows pulled from the subtable. Of course, this all depends on how wide these things are as well.
But 100% agree, ORMs are the worst way to handle all these things. They very rarely do the right thing out of the box and to make them fast you ultimately end up needing to comprehend the SQL they are emitting in the first place and potentially you end up writing custom SQL anyways.
Author here. DB and external service calls are often the biggest wins, thanks for calling that out.
In my demo app, the CPU hotspots were entirely in application code, not I/O wait. And across a fleet, even "smaller" gains in CPU and heap compound into real cost and throughput differences. They're different problems, but your point is valid. Goal here is to get more folks thinking about other aspects of performance especially when the software is running at scale.
public int parseOrDefault(String value, int defaultValue) {
if (value == null || value.isBlank()) return defaultValue;
for (int i = 0; i < value.length(); i++) {
char c = value.charAt(i);
if (i == 0 && c == '-') continue;
if (!Character.isDigit(c)) return defaultValue;
}
return Integer.parseInt(value);
}
Is probably worse than Integer.parseInt alone, since it can still throw NumberFormatExceptions for values that overflow (which is no longer handled!). Would maybe fix that. Unfortunately this is a major flaw in the Java standard library; parsing numbers shouldn't throw expensive exceptions.
When you're using a programming language that naturally steers you to write slow code you can't only blame the programmer.
I was listening to someone say they write fast code in Java by avoiding allocations with a PoolAllocator that would "cache" small objects with poolAllocator.alloc(), poolAllocator.release(). So just manual memory management with extra steps. At that point why not use a better language for the task?
You might have an application for which speed is not important most of the time.
Only one or two processes might require allocation-free code. For such a case, why would you burden all of the other code with the additional complexity?
Calling out to a different language then may come with baggage you'd rather avoid.
A project might also grow into these requirements. I can easily imagine that something wasn't problematic for a long time but suddenly emerged as an issue over time. At that point you wouldn't want to migrate the whole codebase to a better language anymore.
Why should compiler optimize obviously dumb code? If developer wants to create billions of heap objects, compiler should respect him. Optimizing dumb code is what made C++ unbearable. When you write one code and compilers generates completely different code.
First request latency also can really suck in Java before hotpathed code gets through the C2 compiler. You can warm up hotpaths by running that code during startup, but it's really annoying having to do that. Using C++, Go, or Rust gets you around that problem without having to jump through the hoops of code path warmup.
You can create a native executable with GraalVM. Alternatively, if you want to keep the JVM: With the ongoing project Leyden, you can already "pre-train" some parts of the JVM warm-up, with full AoT code compilation coming some time in the future.
GraalVM is terrible. Eats gigabytes of memory to compile super simple application. Spends minutes doing that. If you need compiled native app, just use Golang.
And going the other direction, if you want your C++ binaries to benefit from statistics about how to optimize the steady-state behavior of a long-running process, the analogous technique is profile-guided optimization (PGO).
I worked on JVMs long ago (almost twenty years now). At that time most Java usage was for long-running servers. The runtime team staunchly refused to implement AOT caching for as long as possible. This was a huge missed opportunity for Java, as client startup time has always, always, always sucked. Only in the past 3-5 years does it seem like things have started to shift, in part due to the push for Graal native image.
I long ago concluded that Java was not a client or systems programming language because of the implementation priorities of the JVM maintainers. Note that I say priorities--they are extremely bright and capable engineers that focus on different use cases, and there isn't much money to be made from a client ecosystem.
AOT is nice for startup time, but there are tradeoffs in the other direction for long tail performance issues in production.
There are JITs that use dynamic profile guided optimization which can adjust the emitted binary at runtime to adapt to the real world workload. You do not need to have a profile ahead of time like with ordinary PGO. Java doesn't have this yet (afaik), but .NET does and it's a huge deal for things like large scale web applications.
I challenge the idea that first request latency is bottle necked by language choice. I can see how that is plausible, mind. Is it a concern for the vast majority of developers?
I really hate how completely clueless people on hn are about java. This is not, and has not been an issue for many many years in Java and even the most junior of developers know how to avoid it. But oh no, go and rust is alwaayssss the solution sure.
They do, to add to another comment of mine elsewhere, JIT caches go all the way back to products like JRockit, and IBM JVM has and it for years in Maestro, now available as OpenJ9.
Too many folks have this mindset there is only one JVM, when that has never been the case since the 2000's, after Java for various reasons started poping everywhere.
the best way is via CRaC (https://docs.azul.com/crac/) but only a few vendors support it and there’s a bit of process to get it setup.
in practice, for web applications exposing some sort of `WarmupTask` abstraction in your service chassis that devs can implement will get you quite far. just delay serving traffic on new deployments until all tasks complete. that way users will never hit a cold node
My architecture builds a command registry in Clojure/JVM which runs as a daemon, the registry is shared by a dynamically generated babashka (GraalVM) shell that only includes whitelisted commands for that user. So for the user, unauthorized commands don’t even exist, and I get my JVM app with no startup overhead.
This is why I use java for long running processes, if i care about a small binary that launches fast, i just use something slower at runtime but faster at startup like python.
A subject close to my heart, I write a lot of heavily optimised code including a lot of hot data pipelines in Java.
And aside from algorithms, it usually comes down to avoiding memory allocations.
I have my go-to zero-alloc grpc and parquet and json and time libs etc and they make everything fast.
It’s mostly how idiomatic Java uses objects for everything that makes it slow overall.
But eventually after making a JVM app that keeps data in something like data frames etc and feels a long way from J2EE beans you can finally bump up against the limits that only c/c++/rust/etc can get you past.
> And aside from algorithms, it usually comes down to avoiding memory allocations.
I’ve heard about HFT people using Java for workloads where micro optimization is needed.
To be frank, I just never understood it. From what I’ve seen heard/you have to write the code in such a way that makes it look clumsy and incompatible with pretty much any third party dependencies out there.
And at that point, why are you even using Java? Surely you could use C, C++, or any variety of popular or unpopular languages that would be more fitting and ergonomic (sorry but as a language Java just feels inferior to C# even). The biggest swelling point of Java is the ecosystem, and you can’t even really use that.
I ran into 5 and 7 in a Flink app recently - was parsing a timestamp as a number first and then falling back to iso8601 string, which is what it was. The flamegraph showed 10% for the exception handling bit.
While fixing that, also found repeated creation of datetimeformatter.
Both were not in loops, but both were being done for every event, for 10s of 1000s of events every second.
This one is so prevalent that JVM has an optimization where it gives up on filling stack for exception, if it was thrown over and over in exact same place.
You can write many of the bad examples in the article in any language. It is just far more common to see them in Java code than some other languages.
Java is only fast-ish even on its best day. The more typical performance is much worse because the culture around the language usually doesn't consider performance or efficiency to be a priority. Historically it was even a bit hostile to it.
Performance is really not Java's issue. Even bad Java code is still substantially faster than the bulk of modern software that is based on technologies like Python or JavaScript/Node.js.
Which, to be fair, in many cases is ok. If you just need to churn out LOB apps for worker drones as cheap as possible, performance is probably not the most important factor.
My experience with the defaults in JavaScript is that they’re pretty slow. It’s really, really easy to hit the limits of an express app and for those limits to be in your app code. I’ve worked on JVM backed apps and they’re memory hungry (well, they require a reallocation for the JVM) and they’re slow to boot but once they’re going they are absolutely ripping fast and your far more likely to be bottlenecked by your DB long before you need to start doing any horizontal scaling.
Fair point on ecosystem decisions, that's basically the thesis of the post. These patterns aren't Java being slow, they're developers (myself included) writing code that looks fine but works against the JVM. Enterprise Java gets a bad rap partly because these patterns compound silently across large codebases and nobody profiles until something breaks.
I don't think that's a charitable take of the article. To many programmers, it wouldn't be obvious that some of these footguns (autoboxing, string concatenation, etc) are "bad", or what the "good" alternatives are (primitives, StringBuilder, etc).
That said, the article does have the "LLM stank" on it, which is always offputting, but the content itself seems solid.
As much as I love Java, everybody should just be using Rust. That way you are actually in control, know what's going on, etc. Another reason specifically against Java is that the tooling, both Maven and Gradle, still stucks.
Not knowing what's going on in Java is a personal problem. The language and jvm have its own quirks but it's no less knowable than any other compiler optimized code.
The debugging and introspection tooling in Java is also best in class so I would say it's one of the more understandable run times.
Gradle does suck, it gives too much freedom on a tool that should be straightforward and actively design to avoid footguns, it does the opposite by providing a DSL that can create a lot of abstractions to manage dependencies. The only place I worked where the Gradle configuration looked somewhat sane had very strict design guidelines on what was acceptable to be in the Gradle config.
Maven on the other hand, is just plain boring tech that works. There's plenty of documentation on how to use it properly for many different environments/scenarios, it's declarative while enabling plug-ins for bespoke customisations, it has cruft from its legacy but it's quite settled and it just works.
Could Maven be more modern if it was invented now? Yeah, sure, many other package managers were developed since its inception with newer/more polished concepts but it's dependable, well documented, and it just plain works.
I would disagree that either "plain works" because to even package your app into a self-contained .jar, you need a plugin. I can't recall the specifics now, but years ago I spent many hours fighting both Maven and Gradle.
Well, yes? It's a feature provided by a plugin, like any other feature in Maven, you declare the plugin for creating a fat-jar or single-jar and use that. It's just some lines of XML configuration so it plain works.
Like I said, it's not hypermodern with batteries included, and streamlined for what became more common workflows after it was created but it doesn't need workarounds, it's not complicated to define a plugin to be called in one of the steps of the lifecycle, and it's provided as part of its plugin architecture.
I can understand spending many hours fighting Gradle, even I with plenty of experience with Gradle (begrudgingly, I don't like it at all) still end up fighting its idiocies but Maven... It's like any other tool, you need to learn the basics but after that you will only fight it if you are verging away from the well-documented usage (which are plenty, it's been battle-tested for decades).
You "need a plugin" in the sense that every component of maven is a "plugin". The core plugins give you everything you need to build a self-contained jar - if you wanted to, you don't even have to configure the plugins, if you want to write a long cli command instead.
Programming in Rust is a constant negotiation with the compiler. That isn't necessarily good or bad but I have far more control in Zig, and flexibility in Java.
Yes, there is a learning curve to Rust, but once you get proficient, it no longer bothers you. I think this is more good than bad, because, for example, look at Bun, it is written in Zig, it has so many bugs. They had a bug in their filesystem API that freezed your process, and it stayed unfixed for at least half a year after I filed it. Zig is a nice C replacement, but it doesn't have the same correctness guardrails as Rust.
Assuming we're talking about the same bug, The filesystem API freeze wasn't caused by Zig's lack of correctness guarantees, but a design flaw in Bun's implementation.
Maybe I'm stupid, but I never actually understood people who blame programming languages for bugs in software. Because sure, it's good to have guardrails, but in my opinion, if you're writing a program and there's a bug, unless this bug lies somewhere in implementation of compiler/interpreter/etc, you can't blame the tooling, It's you who introduced this bug. It was your mistake.
It's cool when your tooling warns you about potential bugs or mistakes in implementation, but it's still your responsibility to write the correct code. If you pick up a hammer and hit your finger instead of the nail, then in most cases (though not always) it’s your own fault.
When millions of users constantly make the same mistake with the tool, there may be a problem with the tool, whether it's a defect in the tool or just that it's inappropriate for the job. Blaming the user might give one a righteous feeling, but decade after decade that approach has failed to actually fix any problems.
That's why I say "in most cases" - so not always, actually. There might be problems with tools, I'm not trying to deny that. And by the way, what if some (or even most) of the users just don't have enough skill to use the tool properly? Again, there could be a problem with tool, yes, but you can't always blame only tools for mistakes users make.
I am talking about this bug. It looks like it is still unfixed, in the sense, there is a PR fixing it, but it wasn't merged. LOL.
Regardless of whether this specific bug would be caught by Rust compiler, Bun in general is notorious for crashing, just look at how many open issues there are, how many crashes.
Not saying that you cannot make a correct program in Zig, but I prefer having checks that Rust compiler does, to not having them.
Rust has no place other than deployment scenarios where any kind of automatic resource management, be it tracing GC or reference counting, is not wanted for, either due to technical reasons, or being a waste of time trying to change people's mindset.
I'm a fan of Rust too. But there are millions of Java applications running in production right now, and some of them are running these anti-patterns today. Not everyone has the option to rewrite in a different language. For those teams, knowing what to look for in a profiler can make a real difference without changing a single dependency.
I think that right now it is easier than ever to rewrite your app in Rust, due to LLMs. Unfortunately there are still people out there who dismiss this idea, and continue having their back-end written in much inferior languages, like JavaScript or Python. If your back-end is written in Java, you aren't even in the worst spot.
Orders by hour could be made faster. The issue with it is it's using a map when an array works both faster and just fine.
On top of that, the map boxes the "hour" which is undesirable.
This is how I'd write it
If you know the bound of an array, it's not large, and you are directly indexing in it, you really can't do any better performance wise.It's also not less readable, just less familiar as Java devs don't tend to use arrays that much.
In practice though, for most enterprise web services, a lot of real world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.)
I'm hopeful that improvements in LLMs mean we can ditch ORMs (under the guise that they are quicker to write queries and the inbetween mapping code with) and instead make good use of SQL to harness the powers that modern databases provide.
There's a balance with a DB. Doing 1 or 2 row queries 1000 times is obviously inefficient, but making a 1M row query can have it's own set of problems all the same (even if you need that 1M).
It'll depend on the hardware, but you really want to make sure that anything you do with a DB allows for other instances of your application a chance to also interact with the DB. Nothing worse than finding out the 2 row insert is being blocked by a million row read for 20 seconds.
There's also a question of when you should and shouldn't join data. It's not always a black and white "just let the DB handle it". Sometimes the better route to go down is to make 2 queries rather than joining, particularly if it's something where the main table pulls in 1000 rows with only 10 unique rows pulled from the subtable. Of course, this all depends on how wide these things are as well.
But 100% agree, ORMs are the worst way to handle all these things. They very rarely do the right thing out of the box and to make them fast you ultimately end up needing to comprehend the SQL they are emitting in the first place and potentially you end up writing custom SQL anyways.
They store up programming time and then spend it all at once when you hit the edge case.
If you never hit the case, it's great. As soon as you do, it's all returned with interest :)
In my demo app, the CPU hotspots were entirely in application code, not I/O wait. And across a fleet, even "smaller" gains in CPU and heap compound into real cost and throughput differences. They're different problems, but your point is valid. Goal here is to get more folks thinking about other aspects of performance especially when the software is running at scale.
Or even the local filesystem :)
CPU calls are cheap, memory is pretty cheap, disk is bad, spinning disk is very bad, network is 'good luck'.
You can O(pretty bad) as long as you stay within the right category of those.
I was listening to someone say they write fast code in Java by avoiding allocations with a PoolAllocator that would "cache" small objects with poolAllocator.alloc(), poolAllocator.release(). So just manual memory management with extra steps. At that point why not use a better language for the task?
A project might also grow into these requirements. I can easily imagine that something wasn't problematic for a long time but suddenly emerged as an issue over time. At that point you wouldn't want to migrate the whole codebase to a better language anymore.
Doing it to avoid memory pressure generally means you simply have a bad algorithm that needs to be tweaked. It's very rarely the right solution.
Rest of advice is great: things compilers can't really catch but a good code reviewer should point out.
I wish Java had a proper compiler.
https://foojay.io/today/how-is-leyden-improving-java-perform...
https://quarkus.io/blog/leyden-1/
I long ago concluded that Java was not a client or systems programming language because of the implementation priorities of the JVM maintainers. Note that I say priorities--they are extremely bright and capable engineers that focus on different use cases, and there isn't much money to be made from a client ecosystem.
There are JITs that use dynamic profile guided optimization which can adjust the emitted binary at runtime to adapt to the real world workload. You do not need to have a profile ahead of time like with ordinary PGO. Java doesn't have this yet (afaik), but .NET does and it's a huge deal for things like large scale web applications.
https://devblogs.microsoft.com/dotnet/bing-on-dotnet-8-the-i...
The folks on embedded get to play with PTC and Aicas.
Android, even if not proper Java, has dex2oat.
if it is valuable, i'd be surprised you can't freeze/resume the state and use it for instantaneous workload optimized startup.
Because in my experience as of 2026, Java programs are consistently among the most painful or unpleasant to interact with.
Too many folks have this mindset there is only one JVM, when that has never been the case since the 2000's, after Java for various reasons started poping everywhere.
in practice, for web applications exposing some sort of `WarmupTask` abstraction in your service chassis that devs can implement will get you quite far. just delay serving traffic on new deployments until all tasks complete. that way users will never hit a cold node
There are options to turn on which cause the JVM to save off and reload compiled classes. It pretty massively improves performance.
You can get even faster if you do that plus doing a jlink jvm. But that's more of a pain. The AOT cache is a lot simpler to do.
https://openjdk.org/jeps/514
And aside from algorithms, it usually comes down to avoiding memory allocations.
I have my go-to zero-alloc grpc and parquet and json and time libs etc and they make everything fast.
It’s mostly how idiomatic Java uses objects for everything that makes it slow overall.
But eventually after making a JVM app that keeps data in something like data frames etc and feels a long way from J2EE beans you can finally bump up against the limits that only c/c++/rust/etc can get you past.
I’ve heard about HFT people using Java for workloads where micro optimization is needed.
To be frank, I just never understood it. From what I’ve seen heard/you have to write the code in such a way that makes it look clumsy and incompatible with pretty much any third party dependencies out there.
And at that point, why are you even using Java? Surely you could use C, C++, or any variety of popular or unpopular languages that would be more fitting and ergonomic (sorry but as a language Java just feels inferior to C# even). The biggest swelling point of Java is the ecosystem, and you can’t even really use that.
It doesn't excuse the "use exceptions for control flow" anti-pattern, but it is a quick patch.
This one is so prevalent that JVM has an optimization where it gives up on filling stack for exception, if it was thrown over and over in exact same place.
Who’s there?
long pause
Java
Java is only fast-ish even on its best day. The more typical performance is much worse because the culture around the language usually doesn't consider performance or efficiency to be a priority. Historically it was even a bit hostile to it.
Same for Java, I have yet to in my entire career see enterprise Java be performant and not memory intensive.
At the end of the day, if you care about performance at the app layer, you will use a language better suited to that.
Factories! Factories everywhere!
https://gwern.net/doc/cs/2005-09-30-smith-whyihateframeworks...
any other resources like this?
That said, the article does have the "LLM stank" on it, which is always offputting, but the content itself seems solid.
Gradle does suck and maven is ok but a bit ugly.
Maven on the other hand, is just plain boring tech that works. There's plenty of documentation on how to use it properly for many different environments/scenarios, it's declarative while enabling plug-ins for bespoke customisations, it has cruft from its legacy but it's quite settled and it just works.
Could Maven be more modern if it was invented now? Yeah, sure, many other package managers were developed since its inception with newer/more polished concepts but it's dependable, well documented, and it just plain works.
Like I said, it's not hypermodern with batteries included, and streamlined for what became more common workflows after it was created but it doesn't need workarounds, it's not complicated to define a plugin to be called in one of the steps of the lifecycle, and it's provided as part of its plugin architecture.
I can understand spending many hours fighting Gradle, even I with plenty of experience with Gradle (begrudgingly, I don't like it at all) still end up fighting its idiocies but Maven... It's like any other tool, you need to learn the basics but after that you will only fight it if you are verging away from the well-documented usage (which are plenty, it's been battle-tested for decades).
It gets a reaction, though, so great for social media.
Programming in Rust is a constant negotiation with the compiler. That isn't necessarily good or bad but I have far more control in Zig, and flexibility in Java.
It's cool when your tooling warns you about potential bugs or mistakes in implementation, but it's still your responsibility to write the correct code. If you pick up a hammer and hit your finger instead of the nail, then in most cases (though not always) it’s your own fault.
I am talking about this bug. It looks like it is still unfixed, in the sense, there is a PR fixing it, but it wasn't merged. LOL.
Regardless of whether this specific bug would be caught by Rust compiler, Bun in general is notorious for crashing, just look at how many open issues there are, how many crashes.
Not saying that you cannot make a correct program in Zig, but I prefer having checks that Rust compiler does, to not having them.