MongoDB Index Usage Analysis

At MYDBOPS we have ventured into MongoDB recently. We have been exploring it deeply from Jan-2017. In most cases we see MongoDB and MySQL exists together as a part of DB Infra.

We have been engaged as part of open source migration for a biggest retail in US.It involves MongoDB and MySQL as a datastore. Below is one of our experience in Query optimisation for MongoDB.

Index Usage :

As a Database Administrator, One of your prime role is to do housekeeping on secondary indexes, As developers can be adding indexes on every fields on the documents which is not necessary or Wrong indexes are created while tuning queries.

Housekeeping of indexes is trivial to avoid additional over head for index creation during document insertion and help us curb the additional storage taken up by unnecessary indexes.

So your quick question will be, I have so many indexes on my table, How to identify which one is getting used ?

MongoDB by default store the index usage pattern for the collections since the restart.

Using Mongo Shell:

It can be viewed using aggregate command $indexStats.

> db.<collection_name>.aggregate([ { $indexStats: { } } ]);

Example:

db.category.aggregate([ { $indexStats: { } }, { $limit: 1 } ]).pretty();
 {
 "name" : "startTime_1_endTime_1",
 "key" : {
 "startTime" : 1,
 "endTime" : 1
 },
 "host" : "stress-default-prd-mongodb-vm-l-0:27017",
 "accesses" : {
 "ops" : NumberLong(144871), ---> Index Usage Count Since Restart
 "since" : ISODate("2017-05-02T07:10:31.166Z")
 }
 }

With this ouput we can understand this index is used pretty well.

Now, Let’s write a server side function called “idx_usage” to make the output more readable and display in a consolidated fashion

MongoDB PRIMARY>
db.system.js.save ({ _id : "idx_usage", value : function (col) { out="\n"; db.getCollection(col).aggregate([{$indexStats:{}},{"$sort":{"accesses.ops":1}}]).forEach(
function(op) { var c = op.accesses.ops; var k = JSON.stringify(op.key); var n = op.name; ts = "Index Stats Since: "+op.accesses.since;
out = out+"Count: "+c+" Key: "+n+" "+k+"\n"; }); var ret = "\n"+ts+"\n"+out; return ret }});

> db.loadServerScripts()

> idx_usage("category")
Index Stats Since: Tue May 02 2017 02:10:31 GMT-0500 (CDT)

Count: 0 Key: _id_ {"_id":1}
Count: 0 Key: idx_end_start_sp {"endtime":1,"starttime":1,"special":1}
Count: 0 Key: idx_cp {"childProducts":1}
Count: 35849 Key: categoryid {"categoryid":1}
Count: 176627 Key: startTime_1_endTime_1 {"startTime":1,"endTime":1}

We could see that idx_end_start_sp, idx_cp are not used since restart, We can drop those indexes after getting confirmation the developers.

Using MongoDB Compass:

If you’re not a terminal guy and you like to visualise the things, You can use MongoDB Compass.

Compass has a rich GUI interface, It will also display additional info like size of the index and much more to play around.

When using compass, Make sure that you are not pointing to the Primary server, as by default it will sample 10k records before opening up the details for a specified collection. It can incur additional load to the server depending upon the size of your document.

Compass Output:

Compass

The above is the index stats from compass on a collection in MongoDB 3.4 ( Weird Tiger ).

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s