all essays (74)


Genetic Algorithm to solve the Knapsack Problem

441 reads 2022-03-07

The 0/1 Knapsack Problem has a pseudo-polynomial run-time complexity. In this essay, we look at an approximation algorit...

Pseudorandom Number Generation using LFSR

609 reads 2022-02-21

This essay takes a detailed look into pseudorandom number generation using LFSR, a widely adopted technique to generate ...

Indexing on Partitioned Data

380 reads 2022-02-07

In this essay, we will take a detailed look into how we could index the partitioned data, allowing us to query the data ...

Data Partitioning Strategies

302 reads 2022-01-31

In this essay, we take a detailed look into the two common approaches to horizontally partition the data - Hash Based an...

Data Partitioning

435 reads 2022-01-24

In this essay, we take a detailed look into Partitioning basics and understand how it can help us scale our Reads and Wr...

Leaderless Replication

444 reads 2022-01-16

In this essay, we take a look into a different way of replication, called Leaderless Replication, that comes in handy in...

Conflict Resolution

275 reads 2022-01-09

In this essay, go in-depth to understand ways to resolve and avoid conflicts in a multi-master setup....

Conflict Detection

709 reads 2021-11-28

In this essay, we talk about conflicts and understand what they are, how to detect them in a multi-master setup....

Multi-Master Replication

497 reads 2021-11-03

In this essay, we look at what Multi-Master Replication is, the core challenge it addresses, use-cases we can find this ...

Monotonic Reads

348 reads 2021-10-03

Asynchronous replication leads to a fascinating situation where it feels like we are going through a wormhole traveling ...

Read-your-write Consistency

1120 reads 2021-09-22

Read-Your-Writes consistency states that the system guarantees that, once an item has been updated, any attempt to read ...

Handling outages in a Master-Replica setup

375 reads 2021-09-07

This essay talks about the worse - nodes going down - impact, recovery, and real-world practices....

The New Replica

293 reads 2021-08-23

In this one, we take a look into how these Replicas are set up and understand some quirky nuances about Replication....

Replication Formats

334 reads 2021-08-15

When we are employing a Master-Replica pattern to improve availability, throughput, and fault-tolerance, the big questio...

Replication Strategies

821 reads 2021-08-10

In this essay, we take a quick yet verbose look into Synchronous, Asynchronous, and Semisynchronous replication strategi...

Master-Replica Replication

350 reads 2021-08-07

In this essay, we talk about everything we should know about Master-Replica replication pattern....

ACID - Durability

335 reads 2021-07-19

Durability seems to be a taken-for-granted requirement, but to be honest, it is the most important one. Let's deep dive ...

ACID - Isolation

301 reads 2021-07-05

Isolation is the ability of the database to concurrently process multiple transactions in a way that changes made in one...

ACID - Consistency

387 reads 2021-07-02

In the context of databases, Consistency is Correctness, which means that under no circumstance will the data lose its c...

ACID - Atomicity

748 reads 2021-06-28

A single database transaction often contains multiple statements to be executed on the database. In Relational Databases...

Architectures in Distributed Systems

563 reads 2021-06-22

While designing a Distributed System, it is essential to pick the right kind of architecture. Usually, architectures are...

Mistaken Beliefs of Distributed Systems

358 reads 2021-06-17

In this essay, we learn about a set of false assumptions that programmers new to distributed applications invariably mak...

Fork Bomb

425 reads 2021-06-09

In this essay, we explore a simple yet effective DoS attack called Fork Bomb, also called Rabbit Virus. This attack fork...

Chained Comparison Operators in Python

275 reads 2021-04-27

In this essay, we find how chained comparison expressions are evaluated, understand how short-circuit evaluations happen...

Designing Taxonomy on a Relational DB

1303 reads 2021-04-18

In this essay, design taxonomy on a SQL-based Relational database by taking Udemy as an example, write SQL queries cover...

The Weird Walrus

2377 reads 2021-03-29

In this essay, we alter the Python Grammar and allow it run Assignment Expressions without any parenthesis....

Fully Persistent Arrays

618 reads 2021-02-14

Persistent Data Structures allow us to hold multiple versions of a data structure at any given instant of time. This ena...

Persistent Data Structures - An Introduction

1861 reads 2021-02-07

Persistent Data Structures allow us to hold multiple versions of a data structure at any given instant of time. This ena...

Constant Folding in Python

3917 reads 2021-01-10

Every programming language aims to be performant and Python is no exception. In this essay, we dive deep into Python int...

String Interning in Python

1480 reads 2020-12-20

Every programming language aims to be performant and Python is no exception. In this essay, we dive deep into Python int...

A Simple Recursion Tree Visualizer for Python

4140 reads 2020-12-13

One of the most complicated concepts to wrap our heads around has to be recursion; to understand it well it always helps...

Approximate Count-Distinct using Flajolet Martin Algorithm

3131 reads 2020-12-06

Measuring distinct elements from a stream of values is one of the most common utilities that finds its application acros...

2Q Cache Management Algorithm

1705 reads 2020-11-29

LRU is one of the most widely used cache eviction algorithms suffers from a bunch of limitations especially when used fo...

Israeli Queues

7250 reads 2020-11-22

Israeli Queues are fondly named after a peculiar behavior observed in Israel. This behavior was mimicked to solve a very...

1D Procedural Terrain Generation

1062 reads 2020-11-16

Terrains are at the heart of every Computer Game - be it Counter-Strike, Age of Empires, or even Minecraft. The virtual ...

Set Similarity using Jaccard Similarity Coefficient and MinHash

1481 reads 2020-11-08

Set similarity measure finds its application spanning the Computer Science spectrum; some applications being - user segm...

Time Series Smoothing - Making Aberrations Stand Out

499 reads 2020-11-01

Time Series smoothing algorithms removes short-term irregularities from the plot while preserving long-term trends. But ...

Constant Time LFU

6035 reads 2020-08-23

The most popular implementation of the LFU Cache Eviction Scheme, using a min-heap, implements all three operations with...

Morris's Algorithm for Approximate Counting

2120 reads 2020-08-02

Morris' Algorithm counts a large number of events using a very small space O(log log n). The algorithm uses probabilisti...

Slowsort - A Pessimal Sorting Algorithm

6485 reads 2020-07-26

Slowsort is a pessimal sorting algorithm based on the Multiply and Surrender paradigm. The algorithm is designed to be d...

Bitcask - A Log-Structured Fast KV Store

574 reads 2020-07-19

Bitcask is a Key-Value store that persists its data in append-only log files and still reaps super-performant read-write...

Phi φ Accrual Failure Detection

478 reads 2020-07-12

Phi φ Accrual Failure Detection algorithm, unlike conventional algorithms, is an adaptive failure detection algorithm th...

Deciphering Repeated-key XOR Ciphertext

2797 reads 2020-07-04

Deciphering is the process of recovering the original message from an encrypted byte stream, usually, without having any...

Deciphering Single-byte XOR Ciphertext

1999 reads 2020-06-21

Deciphering is the process of recovering the original message from an encrypted byte stream, usually, without having any...

Making Python Integers Iterable

1790 reads 2020-06-14

In Python, Integers are not iterables but we can make them iterable by implementing __iter__ function. In this essay, we...

Powering Inheritance in C using Structure Composition

3850 reads 2020-06-07

C language does not support inheritance however it does support Structure Compositions which can be tweaked to serve use...

The RUM Conjecture

670 reads 2020-05-31

While designing any storage system the three main aspects we optimize for are Reads, Updates, and auxiliary Memory. RUM ...

Consistent Hashing

1262 reads 2020-05-24

Consistent Hashing is one of the most sought after techniques when it comes to designing highly scalable distributed sys...

Integer Caching in Python

1824 reads 2020-05-17

To gain a performance boost and avoid reallocation of frequently used integers, Python creates singleton instances of sm...

Fractional Cascading - Speeding up Binary Searches

486 reads 2020-05-10

The performance of binary search when applied on k lists independently can be improved using bridges and the technique i...

Copy-on-Write Semantics

559 reads 2020-05-03

Copy-on-write is used to model Time Travel, build databases with no locks, and makes the fork system call super-efficien...

Midpoint Insertion Strategy in MySQL LRU Cache

466 reads 2020-04-26

The MySQL InnoDB Storage engine uses LRU cache but it suffers from a notorious problem. In this article, we find how by ...

Building Finite State Machines with Python Coroutines

7625 reads 2020-04-19

The most intuitive way of building and implementing Finite State Machines is by using Python Coroutines and in this arti...

Better Ranking using Bayesian Average

526 reads 2020-04-12

Ranking a list of movies, products, books or even restaurants is tricky and in this article, we find what works for such...

Sliding Window based Rate Limiter

1056 reads 2020-04-05

A rate limiter is used to control the rate of traffic sent or received on the network and in this article we dive deep a...

Inverse Document Frequency

417 reads 2020-03-06

TF-IDF is extensively used in search engines and in various document classification and clustering techniques. Instead o...

Eight Rituals to be a Better Programmer

2709 reads 2020-02-28

"How to get better at programming?" is the question I had been asked quite a few times, and today I lay down the 8 ritua...

Personalize your Python Prompt

1763 reads 2020-02-21

Personalization is what we all love. In this article we find how we could personalize the Python interpreter prompt >>>...

Pseudorandom Number Generation using Cellular Automata - Rule 30

1011 reads 2020-02-14

Generating pseudorandom numbers is an interesting problem in Computer Science. In this article, we dive deep into an alg...

Function Overloading in Python

12359 reads 2020-02-07

Python natively does not support function overloading - having multiple functions with the same name. Today we see how w...

Isolation Forest Algorithm for Anomaly Detection

1134 reads 2020-01-28

Anomaly detection is an age-old problem and in this article, we dive deep into an unsupervised algorithm, Isolation Fore...

Image Steganography

304 reads 2020-01-17

Steganography has been around since at least 440 BCE but with the rise of computers, the techniques have evolved to hand...

Super Long Integers in Python

3584 reads 2020-01-10

Python must be doing something beautiful internally to support super long integers and today we find out what's under th...

Changing Python

1121 reads 2020-01-03

I changed the Python's source code and made addition incorrect and unpredictable. The addition operation will internally...

Stop an Iterating Loop

426 reads 2019-09-06

There are two ways through which we can stop an iterating loop, first by using break statement and second by making loop...

Sleepsort and Concurrency in Golang

460 reads 2017-07-16

Understanding concurrency in any programming language is tricky let alone Golang; hence to get my hands dirty the first ...

HTTP Requests using Netcat

657 reads 2017-07-05

All our lives we have been hitting REST APIs with libraries and utilities like curl and postman. Its time we do it the h...

Fast and Efficient Pagination in MongoDB

3898 reads 2017-06-06

MongoDB is a document based data store and hence pagination is one of the most common use case of it. Find out how you c...

Why MongoDB's cursor.skip() is Slow?

2088 reads 2017-06-04

MongoDB's cursor.skip() is very inefficient, why is that? Even though it is slow and inefficient, team MongoDB wants to...

Benchmark Pagination Strategies in MongoDB

3538 reads 2017-06-02

Benchmark results for two pagination approaches for MongoDB....

Multiple MySQL server running on same Ubuntu server

285 reads 2016-05-13

Have multiple MySQL versions running on same server within 5 minutes....

Setting up Graphite using Nginx on an Ubuntu server

728 reads 2015-12-14

Part 1: Monitor your production systems and application analytics using Graphite. This article will help you setup these...

Setting up Graphite and Grafana on an Ubuntu server

278 reads 2015-12-14

Part 2: Monitor your production systems and application analytics using Graphite. This article will help you setup these...

Publish python package on PyPI

321 reads 2015-11-10

If you have written something cool in Python and want to make it installable via pip and easy_install, this post will he...


Arpit's Newsletter read by 15000+ engineers

🔥 Thrice a week, in your inbox, an essay about system design, distributed systems, microservices, programming languages internals, or a deep dive on some super-clever algorithm, or just a few tips on building highly scalable distributed systems.



  • v11.0.1
  • © Arpit Bhayani, 2022