Overview of JunoDB - an open source KV store by PayPal

This article is part of a series exploring JunoDB, an open-source key-value database recently released by PayPal. This first part gives an overview of JunoDB, highlighting its features and how it differs from Redis. It also covers an interesting problem called latency bridging and how JunoDB addresses it for PayPal.

Overview of JunoDB

JunoDB is a distributed key-value store designed and built at PayPal. It is not the source of truth for critical services, but it is widely used across PayPal's core backend services, including authentication, login, risk management, and transaction processing.

The main motivation behind developing JunoDB was to provide an efficient way to store and cache data. Unlike Redis, which is primarily an in-memory cache, JunoDB offers persistent storage along with the ability to cache data. This greatly reduces the load on relational databases, which are commonly used in payment platforms because of their ACID guarantees and their role as the source of truth. By caching results, JunoDB avoids recomputing expensive queries and reduces the load on both the database and downstream services.

Implementation Language and Concurrency

Originally, JunoDB was implemented as a single-threaded C++ program, taking inspiration from Redis. However, PayPal decided to rewrite it in Go (Golang) for two main reasons. First, they wanted the high concurrency that Go provides through its lightweight goroutines. Goroutines are cheaper to create and schedule than traditional POSIX threads, and the Go runtime multiplexes them across the available cores, so work can run in parallel. Second, a single-threaded design did not fit JunoDB's use case. While Redis can efficiently handle memory-bound loads on a single thread, JunoDB is designed for CPU-bound workloads and needs multiple CPUs to execute CPU-intensive operations effectively. By using multiple cores, JunoDB can parallelize that work and improve performance.
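To make the multi-core argument concrete, here is a minimal sketch of CPU-bound work fanned out across goroutines. It is not taken from JunoDB's code: the `hashAll` function is hypothetical, and SHA-256 hashing is only a stand-in for whatever CPU-intensive operations the server actually performs.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"runtime"
	"sync"
)

// hashAll computes a SHA-256 digest for every payload, spreading the
// CPU-bound work over a fixed number of worker goroutines.
// (Hypothetical helper for illustration; not part of JunoDB.)
func hashAll(payloads [][]byte, workers int) [][32]byte {
	if workers < 1 {
		workers = 1
	}
	results := make([][32]byte, len(payloads))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one goroutine,
				// so the results slice needs no lock.
				results[i] = sha256.Sum256(payloads[i])
			}
		}()
	}

	// Hand out work, then signal the workers that no more is coming.
	for i := range payloads {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	payloads := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	digests := hashAll(payloads, runtime.NumCPU())
	fmt.Printf("hashed %d payloads with %d workers\n", len(digests), runtime.NumCPU())
}
```

With one worker per core, the Go scheduler can run the hashing on several cores at once, which is the property a single-threaded server cannot offer.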

High Availability and Scalability

JunoDB provides six nines (99.9999%) of availability for PayPal, which translates to at most about 31.56 seconds of downtime per year. It also handles 350 billion requests each day, which shows how well it scales.
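The arithmetic behind these figures is easy to check. The sketch below assumes a year of 365.25 days and treats the 350 billion daily requests as evenly spread, so the per-second number is only an average; the function names are made up for this example.

```go
package main

import "fmt"

// downtimePerYear returns the yearly downtime budget, in seconds, for an
// availability given as a fraction (for example 0.999999 for six nines).
func downtimePerYear(availability float64) float64 {
	const secondsPerYear = 365.25 * 24 * 60 * 60 // 31,557,600 seconds
	return (1 - availability) * secondsPerYear
}

// averagePerSecond converts a daily request count into an average rate.
func averagePerSecond(requestsPerDay float64) float64 {
	const secondsPerDay = 24 * 60 * 60 // 86,400 seconds
	return requestsPerDay / secondsPerDay
}

func main() {
	fmt.Printf("six nines: %.2f seconds of downtime per year\n", downtimePerYear(0.999999))
	fmt.Printf("350 billion/day: about %.2f million requests per second\n", averagePerSecond(350e9)/1e6)
}
```

Six nines leaves a budget of roughly 31.56 seconds per year, and 350 billion requests per day averages out to a little over 4 million requests per second.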

Common Use Cases and Latency Bridging

JunoDB serves several common use cases, with caching being one of the primary ones. It can hold data both in memory and on disk, making it suitable for short-lived caching as well as longer-term persistence. PayPal uses JunoDB to cache user tokens, account details, API responses, and user preferences. These use cases show how JunoDB can replace legacy caching systems, reducing the number of queries made to relational databases and other microservices.
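The caching use case usually follows a cache-aside pattern: look in the cache first, and only on a miss run the expensive query and store its result with a time-to-live. The sketch below illustrates that pattern with a hypothetical in-memory `Cache` type; it does not use the JunoDB client, and all names in it are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value     string
	expiresAt time.Time
}

// Cache is a hypothetical in-memory stand-in for a key-value store with TTLs.
type Cache struct {
	mu   sync.Mutex
	data map[string]entry
}

func NewCache() *Cache { return &Cache{data: make(map[string]entry)} }

// GetOrLoad returns the cached value for key while it is still fresh.
// Otherwise it calls load (the expensive query), stores the result for ttl,
// and returns it. Concurrent misses on the same key may each call load;
// a production cache would usually deduplicate them.
func (c *Cache) GetOrLoad(key string, ttl time.Duration, load func() (string, error)) (string, error) {
	c.mu.Lock()
	e, ok := c.data[key]
	c.mu.Unlock()
	if ok && time.Now().Before(e.expiresAt) {
		return e.value, nil // cache hit: the database is not touched
	}

	v, err := load()
	if err != nil {
		return "", err
	}

	c.mu.Lock()
	c.data[key] = entry{value: v, expiresAt: time.Now().Add(ttl)}
	c.mu.Unlock()
	return v, nil
}

func main() {
	cache := NewCache()
	dbCalls := 0
	loadPrefs := func() (string, error) {
		dbCalls++ // stands in for a query against the relational database
		return "theme=dark", nil
	}

	for i := 0; i < 3; i++ {
		v, _ := cache.GetOrLoad("prefs:user-42", time.Minute, loadPrefs)
		fmt.Println(v)
	}
	fmt.Println("database calls:", dbCalls) // prints "database calls: 1"
}
```

Three reads result in a single database call; the other two are served from the cache until the entry expires.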

Another critical use case is idempotency, which prevents duplicate processing. By storing idempotency keys in JunoDB, a service can ensure that a request is processed only once, no matter how many times it is received. This is especially important for financial platforms like PayPal, where processing the same request twice (for example, charging a customer twice) can cause serious problems.
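A minimal sketch of the idempotency-key idea follows. It uses a hypothetical in-memory `IdempotencyStore` in place of JunoDB and assumes the store can atomically record a key only if it is absent; none of these names come from the JunoDB client.

```go
package main

import (
	"fmt"
	"sync"
)

// IdempotencyStore is a hypothetical in-memory stand-in for a key-value
// store that can atomically "create only if absent".
type IdempotencyStore struct {
	mu   sync.Mutex
	seen map[string]string // idempotency key -> stored result
}

func NewIdempotencyStore() *IdempotencyStore {
	return &IdempotencyStore{seen: make(map[string]string)}
}

// Do runs process at most once per key. Repeated calls with the same key
// return the result stored by the first call and report executed=false.
// Holding the lock while process runs keeps the sketch simple but
// serializes all requests; a real system would not do that.
func (s *IdempotencyStore) Do(key string, process func() string) (result string, executed bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if prev, ok := s.seen[key]; ok {
		return prev, false
	}
	result = process()
	s.seen[key] = result
	return result, true
}

func main() {
	store := NewIdempotencyStore()
	charges := 0
	charge := func() string {
		charges++ // stands in for the real side effect, such as moving money
		return "charged $10"
	}

	// The same request arrives three times, for example because of client retries.
	for attempt := 1; attempt <= 3; attempt++ {
		res, executed := store.Do("payment-req-123", charge)
		fmt.Printf("attempt %d: %s (executed=%t)\n", attempt, res, executed)
	}
	fmt.Println("total charges:", charges) // prints "total charges: 1"
}
```

Only the first attempt executes the charge; the retries get the stored result back without repeating the side effect.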

Additionally, JunoDB solves the problem of latency bridging. Its near-instant inter-cluster replication keeps replication lag between clusters low. PayPal uses this to bridge the latency gap between its Oracle databases, which are set up in an active-active configuration. With JunoDB in place, read and write requests can be handled efficiently even when they are directed to different data centers, because a read that lands in one data center can see data that was just written in another. This improves both the consistency and the availability of data.

Conclusion

In conclusion, JunoDB, the open-source key-value database developed by PayPal, offers efficient multi-core concurrency and near-instant inter-cluster replication. It serves as a cache, reducing the load on relational databases and improving system availability. It is particularly useful for bridging latency in distributed systems, and its ability to handle caching, idempotency, and near-static data makes it a practical fit for a range of use cases. With its performance, scalability, and high availability, JunoDB is a valuable tool for organizations dealing with significant data loads.
