MongoDB

Database Benchmarking for Beginners

An engineer I work with asked me for tips on what to read about database benchmarking. I told him I've learned a lot from reading Mark Callaghan's blog. Now that I think about it, articles and conference talks from Baron Schwartz were just as fundamental, or even more so, early on when I was getting started.

What's in a database storage engine

I overheard - over-read, really - an internet discussion about database storage engines. The discussion was about what functionality is considered part of a storage engine, and what functionality is in the common parts of the database server. My first reaction was something like "how isn't this obvious?" Then I realized that for a lot of database functionality it isn't obvious at all, and the answer really is that it could go either way.

Automated System Performance Testing at MongoDB - the end of a trilogy

We are pleased to announce that our paper Automated System Performance Testing at MongoDB will be presented at DBTest 2020, a workshop held in conjunction with SIGMOD/PODS. It is available today on arXiv.org.

This paper presents the framework we developed to do automated performance testing on realistic MongoDB clusters. We have used and evolved this system to run hundreds of benchmarks every day as part of our Continuous Integration system. In conjunction with publishing this paper, we have also finally open sourced the Python code to this framework. The framework is called Distributed Systems Infrastructure 2.0, or DSI for short.

Automatic retries in MongoDB

At work we were discussing whether MongoDB will retry operations in some circumstances or whether the client needs to be prepared to do so. After a while we realized different participants in the discussion were discussing different retries.

So I sat down to get to the bottom of all the retries that can happen in MongoDB, and write a blog post about them. But after googling a bit it turns out someone has already written that blog post, so this will be a short post for me linking to other posts.

Retries by the driver

If you set retryWrites=true in your MongoDB connection string, then the driver will automatically retry some write operations for some types of failures. Ok, can I be more specific? Yes I can...
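As a minimal sketch of what setting that option looks like, the snippet below assembles a connection string with retryWrites=true using only the Python standard library. The helper name build_mongodb_uri, the host names, the database name and the replica set name are all made up for illustration; only the retryWrites option itself comes from the text above.

```python
from urllib.parse import urlencode


def build_mongodb_uri(hosts, database="", **options):
    """Assemble a mongodb:// connection string (hypothetical helper)."""
    # Options become the query string, in the order they were passed.
    query = urlencode(options)
    uri = "mongodb://" + ",".join(hosts) + "/" + database
    return uri + ("?" + query if query else "")


# Hypothetical hosts, database and replica set name, for illustration only.
uri = build_mongodb_uri(
    ["db0.example.com:27017", "db1.example.com:27017"],
    database="test",
    replicaSet="rs0",
    retryWrites="true",
)
print(uri)
```

With a driver such as PyMongo installed, a string like this is what you would hand to the client constructor. Which operations and failure types then get retried is up to the driver, not this snippet.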

Node failures in Serializable Snapshot Isolation

Previously in this series: Reading about Serializable Snapshot Isolation.

Last week I took a deep dive into articles on Serializable Snapshot Isolation. It ended on a sad note, as I learned that to extend SSI to a sharded database using 2PC for distributed transactions [1], there is a need to persist - which means replicate - all read sets in addition to all writes.

This conclusion has been bothering me, so before diving into other papers on distributed serializable transactions, I wanted to understand better what exactly happens in SSI when a node (shard) fails. This blog post doesn't introduce any new papers, just more details. And more speculation.

[1] For example, MongoDB.

Reading about Serializable Snapshot Isolation

Previously in this series: Toward strict consistency in distributed databases.

Apparently, when Snapshot Isolation was invented, it was received with great enthusiasm, since it avoids all the anomalies familiar from the SQL standard. It was only as time went by that people discovered that SI had some completely new anomalies and therefore wasn't equivalent to serializable isolation. (This I learned from the PostgreSQL paper below.)

Toward strict consistency in distributed databases

July is vacation time in Finland. Which means it is a great time to read blogs and papers on consistency in distributed databases! This and following blogs are my reading notes and thoughts. Mostly for my own benefit but if you benefit too, that's great. Note also that my ambition is not to appear like an authority on consistency. I just write down questions I had and things I learned. If you have comments, feel free to write some below. That's part of the point of blogging this!

Video x2: Measuring performance variability of EC2

I was recently invited to speak at Fwdays Highload in Kyiv. This was my first ever visit to Ukraine, so I was excited to go and visit this large and beautiful European capital. Over a thousand years ago Vikings would row their boats through the rivers in Russia, and take the Dnieper southward to Kyiv and ultimately Turkey. It was exciting to travel in the footsteps of my forefathers.

My talk isn't really MongoDB-specific; rather, it is about an EC2 performance tuning project we did in 2017.

Moving to MongoDB Engineering

It will soon be 3 years that I've been with MongoDB. I joined the company amidst a strong growth spurt, and 5 months later the HR website told me that I had now been in the company longer than 50% of my colleagues.

4 modifications for Raft consensus

A month ago I published a quasi-academic paper, proposing 3 modifications to the Raft replication algorithm. I got some great reviews and feedback on the Raft mailing list. Based on that, I have now updated the paper, hopefully making it much clearer than the first iteration.

OpenLife.cc

Open Source and Databases