Why is MongoDB's cursor.skip() slow?

Published on 4th Jun 2017

MongoDB’s cursor object has a method called skip which, as per the documentation, controls where MongoDB begins returning results. Combined with limit, it gives you an easy way to paginate results.
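To make the skip and limit combination concrete, here is a minimal sketch in Python. The helper names `page_window` and `fetch_page` are hypothetical, as is the choice to sort by `_id`; `find`, `sort`, `skip` and `limit` are the usual PyMongo collection and cursor methods, and `collection` is assumed to be a PyMongo collection you already have.

```python
def page_window(page_number: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-based page number into (skip, limit) values."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be >= 1")
    return (page_number - 1) * page_size, page_size


def fetch_page(collection, page_number: int, page_size: int) -> list:
    """Fetch one page with skip + limit (hypothetical helper).

    `collection` is assumed to be a PyMongo collection; sorting by _id
    is an assumption made here to keep the page order stable.
    """
    skip, limit = page_window(page_number, page_size)
    cursor = collection.find().sort("_id", 1).skip(skip).limit(limit)
    return list(cursor)


# Page 3 with 25 documents per page skips the first 50 documents.
print(page_window(3, 25))  # (50, 25)
```

Note that the skip value grows linearly with the page number, which is what the rest of this post is about.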

I have written a blog post on how you can have Fast and Efficient Pagination in MongoDB.

But while going through the documentation of skip, there is something interesting to notice. There is a small warning in the MongoDB documentation that states:

The cursor.skip() method is often expensive because it requires the server to walk from the beginning of the collection or index to get the offset or skip position before beginning to return results. As the offset (e.g. pageNumber above) increases, cursor.skip() will become slower and more CPU intensive. With larger collections, cursor.skip() may become IO bound.

In short, MongoDB has to iterate over documents to skip them. Thus when the collection or result set is huge and you need to skip documents for pagination, the call to cursor.skip will be expensive. While going through the source code of skip, I found that it does not use an index to jump straight to the offset; it steps over the skipped entries one by one, and hence gets slower as the result set, and with it the offset, grows.

This also implies that if you use skip, the “skipping speed” will not improve even if you index the field: an index can help with filtering and sorting, but the skipped entries still have to be walked.
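The cost described above can be illustrated with a toy model. This is not MongoDB's implementation, only a pure-Python sketch under the assumption that skipping means stepping over each entry in order; the function name `skip_limit` and the `touched` counter are made up for the illustration.

```python
def skip_limit(sorted_ids, offset, limit):
    """Toy model of skip + limit over an ordered sequence of ids.

    Returns the page and the number of entries that had to be touched.
    """
    touched = 0
    page = []
    for doc_id in sorted_ids:
        touched += 1
        if touched <= offset:
            continue  # entry is walked over, then thrown away
        page.append(doc_id)
        if len(page) == limit:
            break
    return page, touched


ids = list(range(1000))

# First page: only the returned entries are touched.
print(skip_limit(ids, 0, 10)[1])    # 10

# A deep page: every skipped entry is touched as well.
print(skip_limit(ids, 900, 10)[1])  # 910
```

In this model the work is proportional to offset + limit, no matter how the ids are ordered or indexed.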

But what if the result set is small? Is calling skip still a terrible idea? If skip were so terrible, the MongoDB team and community would have dropped it long ago. But they haven’t … why?

Because it is very efficient and fast for smaller result sets. I took this opportunity to benchmark and compare the two approaches to pagination, and found that skip and limit based pagination works well for smaller result sets.
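This post does not spell out the second approach, so the sketch below assumes it is range-based pagination on `_id`, where each request filters for ids greater than the last one seen instead of skipping. The helper name `next_page_filter` is hypothetical; the `$gt` operator and the `find`, `sort` and `limit` calls in the comment are standard MongoDB and PyMongo.

```python
def next_page_filter(last_seen_id=None) -> dict:
    """Build the query filter for the next page (hypothetical helper).

    Assumes documents are paged in ascending _id order. With no
    last_seen_id, the filter is empty and matches from the start.
    """
    if last_seen_id is None:
        return {}
    return {"_id": {"$gt": last_seen_id}}


# Assumed usage with a PyMongo collection:
#   docs = list(collection.find(next_page_filter(last_id))
#                         .sort("_id", 1)
#                         .limit(page_size))
#   last_id = docs[-1]["_id"] if docs else last_id

print(next_page_filter())    # {}
print(next_page_filter(42))  # {'_id': {'$gt': 42}}
```

The trade-off under this assumption is that range-based paging needs the last seen id from the previous page, so it cannot jump to an arbitrary page number the way skip can.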

In conclusion, skip is not as bad as one might think. But you must understand your use case well in order to make an informed decision.
