Reducing Cassandra p99 latency by fixing OS page cache thrashing

invertedohm · 2026-02-03T16:39:01.000Z 1770136741

Kernel 4.18 is _ancient_, weird to test on that. Also I wonder if the whole performance increase can't just be achieved by lowering the readahead size a bit. I see in their notes they have a decent SSD but set the blockdev readahead to 64. Depending on your workload it's much more performant to lower that to 16. As mentioned in TFA, fast storage is the most important factor here, and in my experience combining that with a smaller readahead pretty much fixes any read amplification issues you get with the larger readahead.
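
For anyone wanting to sanity-check the units before experimenting: `blockdev --setra` counts 512-byte sectors, while `/sys/block/<dev>/queue/read_ahead_kb` is in KiB, so 64 works out to 32 KiB and 16 to 8 KiB. Below is a rough Python sketch of that conversion plus a read of the sysfs value. The helper names are made up for illustration, and `nvme0n1` is only a placeholder device name, not something from TFA.

```python
from pathlib import Path

SECTOR_BYTES = 512  # blockdev --setra/--getra count 512-byte sectors


def sectors_to_kib(sectors: int) -> float:
    """Convert a blockdev readahead value (in sectors) to KiB."""
    return sectors * SECTOR_BYTES / 1024


def read_ahead_kib(device: str, sysfs_root: str = "/sys/block"):
    """Return the current readahead in KiB for a block device, or None if
    the sysfs entry is missing (wrong device name, non-Linux host, ...)."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for sectors in (64, 16):
        print(f"setra {sectors} -> {sectors_to_kib(sectors):g} KiB")
    # Placeholder device name; substitute the disk that holds the data directory.
    print("current:", read_ahead_kib("nvme0n1"))
```

Whether the smaller value actually helps depends on the workload, so it is worth measuring p99 before and after rather than assuming.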

AtlasBarfed · 2026-02-03T15:27:46.000Z 1770132466

I'm wondering if compaction can be isolated as a separate processing task with its own memory budget, so that its work is kept apart from the memory pages used for request serving.

Because compaction is about generating a new set of better-optimized SSTables and then discarding the old ones.

Cassandra is/was full of tasks like this, all lumped into the same problematic single JVM heap, even though the data could be partitioned across hundreds of JVMs with smaller, easier-to-GC heaps.

IIRC Scylla takes advantage of this with CPU pinning across cores and other tricks that older Cassandra could not use.
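
To make the idea concrete, here is a toy Python sketch of the general shard-per-core pattern: hash each partition key to a shard, and pin each shard's worker to one CPU. This is not how Scylla is actually implemented; the function names and the hash choice are assumptions for illustration, and the pinning call only exists on Linux.

```python
import hashlib
import os


def shard_for_key(partition_key: bytes, n_shards: int) -> int:
    """Map a partition key to a shard with a stable hash (illustrative only)."""
    digest = hashlib.blake2b(partition_key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % n_shards


def pin_current_process(cpu: int) -> bool:
    """Pin the calling process to a single CPU.

    Returns False on platforms without sched_setaffinity or when the CPU
    is not available to this process.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    try:
        os.sched_setaffinity(0, {cpu})
        return True
    except OSError:
        return False


if __name__ == "__main__":
    n_shards = os.cpu_count() or 1
    # The same key always lands on the same shard, so each worker owns its
    # slice of the data and needs no cross-core locking for it.
    print(shard_for_key(b"user:42", n_shards))
```

Each worker would call `pin_current_process(shard_id)` once at startup and then only touch keys for which `shard_for_key` returns its own id, which is what keeps the per-worker heaps small and independent.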