Why is MongoDB's cursor.skip() slow?

Arpit Bhayani

entrepreneur, educator, and tinkerer



MongoDB’s cursor object has a method called skip which, according to the documentation, controls where MongoDB begins returning results. Combined with limit, it makes paginated results easy to implement.
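As a minimal sketch of the arithmetic behind skip-and-limit pagination, the hypothetical helper below (page_window is not part of MongoDB or any driver) turns a 1-based page number and a page size into the two values passed to skip and limit. It is applied to a plain Python list here so that it runs without a server.

```python
def page_window(page_number: int, page_size: int) -> tuple:
    """Return (skip, limit) for a 1-based page number."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be >= 1")
    return (page_number - 1) * page_size, page_size


# Stand-in for a sorted result set; a real query would return documents.
results = list(range(1, 101))

skip, limit = page_window(page_number=3, page_size=10)
page = results[skip:skip + limit]
print(skip, limit, page[0], page[-1])  # 20 10 21 30
```

With PyMongo, the same two numbers would typically go into something like collection.find().sort("_id", 1).skip(skip).limit(limit). The explicit sort is there because page boundaries are only stable when the order is.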

I have written a separate blog post, Fast and Efficient Pagination in MongoDB, on how to paginate well.

While going through the documentation of skip, however, I noticed something interesting: a small warning in the MongoDB documentation that states:

The cursor.skip() method is often expensive because it requires the server to walk from the beginning of the collection or index to get the offset or skip position before beginning to return results. As the offset (e.g. pageNumber above) increases, cursor.skip() will become slower and more CPU intensive. With larger collections, cursor.skip() may become IO bound.

In short, MongoDB has to iterate over documents to skip them. Thus, when the collection or result set is huge and you need to skip documents for pagination, the call to cursor.skip will be expensive. While going through the source code of skip, I found that it cannot use an index to jump straight to the offset: every skipped document or index entry is still walked, so the call gets slower as the offset grows.
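The cost described in the warning can be illustrated with a toy model. The sketch below is not MongoDB's implementation; skip_then_limit is a hypothetical function that mimics a cursor walking an ordered sequence one entry at a time and counts how many entries it touches before the page is complete. It assumes limit is at least 1.

```python
from typing import Iterable, List, Tuple


def skip_then_limit(entries: Iterable[int], skip: int, limit: int) -> Tuple[List[int], int]:
    """Return (page, examined), walking entries one at a time like a cursor."""
    page: List[int] = []
    examined = 0
    for entry in entries:
        examined += 1
        if examined <= skip:
            continue  # skipped entries are still visited, just not returned
        page.append(entry)
        if len(page) == limit:
            break
    return page, examined


entries = range(1_000_000)
for offset in (0, 1_000, 100_000):
    page, examined = skip_then_limit(entries, skip=offset, limit=10)
    print(offset, page[0], examined)
# 0 0 10
# 1000 1000 1010
# 100000 100000 100010
```

Every call returns only ten entries, yet the number of entries examined grows linearly with the offset.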

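This also implies that if you use skip, the “skipping speed” will not improve even if you index the field: an index can speed up filtering and sorting, but the entries being skipped still have to be walked one by one.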

But what if the result set is small? Is calling skip still a terrible idea? If skip were so terrible, the MongoDB team and community would have decided to drop it long ago. But they haven’t … why?

Because it is very efficient and fast for smaller result sets. I took this opportunity to benchmark and compare the two approaches to pagination, and found that skip-and-limit pagination works well for smaller result sets.
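The second approach is not spelled out here. Assuming it is range-based pagination on a sorted, indexed field such as _id, the idea can be sketched in memory as follows. The function next_page and the list ids are hypothetical stand-ins, and the binary search plays the role of an index seek.

```python
from bisect import bisect_right
from typing import List, Optional


def next_page(sorted_ids: List[int], last_seen: Optional[int], limit: int) -> List[int]:
    """Return up to `limit` ids greater than `last_seen` (None means first page)."""
    # bisect_right does a binary search, standing in for an index seek.
    start = 0 if last_seen is None else bisect_right(sorted_ids, last_seen)
    return sorted_ids[start:start + limit]


ids = list(range(0, 1_000_000, 2))  # sorted, unique keys

first = next_page(ids, None, 3)
second = next_page(ids, first[-1], 3)
print(first, second)  # [0, 2, 4] [6, 8, 10]
```

With PyMongo, the equivalent query shape would be roughly collection.find({"_id": {"$gt": last_seen}}).sort("_id", 1).limit(limit). When the result set is small, the walk that skip performs is short as well, so the gap between the two approaches is likely to be small.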

In conclusion, skip is not as bad as one might think. But you must understand your use case well in order to make an informed decision.
