948 views • Garbage Collection
“Best” Garbage Collector is a myth.
If you are ever writing your Garbage Collector or trying to pick one for your workload, there are 7 metrics that you should be evaluating a GC algorithm on.
No one garbage collector can be the best across all 7 properties and it was found that any GC algorithm is at least 15% better than other algorithms on at least one of the characteristics. Hence it all boils down to the needs of your specific workload to pick one over the other. The key characteristics are
A garbage collector is safe when it never reclaims the space of a LIVE object and always cleans up only the dead objects.
Although this looks like an obvious requirement, some GC algorithms claim space of LIVE objects just to gain that extra ounce of performance.
A garbage collector should be as little time cleaning up the garbage as possible; this way it would ensure that the CPU is spent on doing actual work and not just cleaning up the mess.
Most garbage collectors hence run small cycles frequently and a major cycle does deep cleaning once a while. This way they maximize the overall throughput and ensure we spend more time doing actual work.
A garbage collector is said to be complete when it eventually reclaims all the garbage from the heap.
It is not desirable to do a complete clean-up every time the GC is executed, but eventually, a GC should guarantee that the garbage is cleaned up ensuring zero memory leaks.
Some garbage collectors pause the program execution during the cleanup and this induces a “pause”. Long pauses affect the throughput of the system and may lead to unpredictable outcomes; so a GC is designed and tuned to minimize the pause time.
The garbage collector needs to pause the execution because it needs to either run defragmentation where the heap objects are shuffled freeing up larger contiguous memory segments.
Garbage collectors require auxiliary data structures to track objects efficiently and the memory required to do so is pure overhead. An efficient GC should have this space overhead as low as possible allowing sufficient memory for the program execution.
Most GC algorithms are generic but when bundled with the programing language the GC can exploit the language patterns and object allocation nuances. So, it is important to pick the GC that can leverage these details and make its execution as efficient as possible.
For example, in some programming languages, GC runs in constant time by exploiting how objects are allocated on the heap.
Most GC are efficient in cleaning up a small chunk of memory, but a scalable GC would run efficiently even on a server with large RAM. Similarly, a GC should be able to leverage multiple CPU cores, if available, to speed up the execution.
If you like what you read subscribe you can always subscribe to my newsletter and get the post delivered straight to your inbox. I write essays on various engineering topics and share it through my weekly newsletter.
759 views • 46 likes • 2022-05-20
So, caching doesn't always work! What would happen if we apply caching to speed up our Mark and Sweep Gargabge collecto...
796 views • 41 likes • 2022-05-06
A basic Mark-and-Sweep garbage collection algorithm operates in Stop-the-World mode, which means the program execution p...
1061 views • 57 likes • 2022-04-29
Garbage Collection has to be one of the most interesting topics of discussion out there. In the previous videos, we took...
A set of courses designed to make you a better engineer and excel at your career; no-fluff, pure engineering.
Being a passionate engineer, I love to talk about a wide range of topics, but these are my personal favourites.
Arpit's Newsletter read by 17000+ engineers
🔥 Thrice a week, in your inbox, an essay about system design, distributed systems, microservices, programming languages internals, or a deep dive on some super-clever algorithm, or just a few tips on building highly scalable distributed systems.
Powered by this tech stack.