MongoDB

3 modifications to the Raft consensus algorithm (paper)

Update: This version of my paper is superseded by a new version: 4 modifications to Raft consensus. Please read it instead.

August is usually a slower month, as a lot of people are on vacation. I try to take advantage of that to work on tasks that require a bit more deliberation and quiet time. This summer I returned to re-reading the paper on the Raft algorithm. In particular, my colleagues in New York pointed out that the PhD thesis that extends the original paper is now complete and contains some additional details.

A MEAN Hackathon

I'm preparing to do some simple MongoDB hackathons in Scandinavia, and because I don't want to forget how to do all the steps, I actually wrote down an example exercise.

This is a simple and fun exercise just to get some data into MongoDB and then get it out again. We're going to use some awesome JavaScript tools for the "out" part: Node.js and Crest for a simple HTTP API, and Angular.js to draw the pretty pictures. So this is not just a MongoDB hackathon but more like a full-stack JavaScript or MEAN hackathon. (Strictly speaking there's no Express.js, so maybe this is an NMCA hackathon?)

Install MongoDB

The usual apt, yum and brew methods will work for this tutorial.

5 MongoDB features that are NOT reasons to choose MongoDB

On May 5th I will be speaking at the ICT Expo in Helsinki. On home field, so to speak! This is the major Finnish ICT event and MongoDB will be exhibiting together with our Nordics partner Altotech.

The title of my talk is "5 Reasons that made MongoDB the Leading NoSQL Database". In preparing the talk I came to think of several MongoDB features that are great, but on reflection are clearly NOT reasons that make MongoDB stand out against other NoSQL databases.

Why? Because these are typical traits of any NoSQL database. Here's my Top 5 list of non-reasons:

MongoDB Aggregation Framework fun: the answer

Last week I posted a challenge about ordering fabric for a bunch of different flags, which can be conveniently solved with the MongoDB aggregation framework. Did you figure out the answer?

So we have a bunch of these flag orders for different countries:

{
	"_id" : ObjectId("52a71923cd4dd732cd060204"),
	"country" : "Norway",
	"colors" : [
		{
			"color" : "red",
			"fabric_units" : 3
		},
		{
			"color" : "blue",
			"fabric_units" : 2
		},
		{
			"color" : "white",
			"fabric_units" : 1
		}
	],
	"num_flags" : 2
}

And to calculate how much fabric we need of each color:

> db.flags.aggregate( [ 
    { $unwind : "$colors" }, 
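(The excerpt is cut off here.) As a rough illustration of what the $unwind stage does and how per-color totals could then be summed, here is a minimal sketch in plain JavaScript that runs without a database. It is not the author's pipeline: the in-memory `flags` array, the helper names `unwindColors` and `fabricPerColor`, and the assumption that the total per color is fabric_units multiplied by num_flags are all mine.

```javascript
// Hypothetical in-memory stand-in for the db.flags collection,
// holding only the Norway order shown above.
const flags = [
  {
    country: "Norway",
    colors: [
      { color: "red", fabric_units: 3 },
      { color: "blue", fabric_units: 2 },
      { color: "white", fabric_units: 1 },
    ],
    num_flags: 2,
  },
];

// Mimics { $unwind : "$colors" }: emits one document per array element,
// with the "colors" field replaced by that single element.
function unwindColors(docs) {
  return docs.flatMap((doc) =>
    doc.colors.map((c) => ({ ...doc, colors: c }))
  );
}

// Assumed grouping step: sum fabric_units * num_flags per color.
// In a real pipeline this would roughly correspond to a $group stage
// using $sum over a $multiply expression.
function fabricPerColor(docs) {
  const totals = {};
  for (const doc of unwindColors(docs)) {
    const key = doc.colors.color;
    totals[key] = (totals[key] || 0) + doc.colors.fabric_units * doc.num_flags;
  }
  return totals;
}

console.log(fabricPerColor(flags));
```

For the single Norway order this gives 6 units of red, 4 of blue and 2 of white. With more orders in the array, the totals accumulate per color across countries.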

MongoDB Aggregation Framework fun: ordering fabric for flags

One of the really cool, well-designed features in MongoDB is the aggregation framework. Basically, it is the feature that brings the MongoDB query language on par with SQL in feature richness. Most importantly, it fully supports sharding for scale-out and parallel processing. I've had a lot of fun with it.

Last night I accidentally came up with a fun exercise that you can solve with a nice aggregation framework pipeline. I'll present it as a challenge; feel free to suggest answers in the comments. I'll share my answer next week.


Suppose you work in a flag factory, and you have received the following orders:

> db.flags.find().pretty()
{
	"_id" : ObjectId("52a7189ecd4dd732cd060201"),
	"country" : "Finland",

Translating reliably between XML and JSON (xml2json)

Last week I was assigned to work on a simple yet interesting problem. MongoDB stores data as JSON. But it turns out we often have customers, especially in the important financial services market segment, whose data is in XML. (Yes, SOAP still exists too!) To store that data in MongoDB, we need to transform it into JSON.

(For the impatient reader: here's the Github link!)

Storing XML as a text field

There are various ways one could do this. For example one could simply store the XML document as a whole in a single field, and then extract just a few parts of the XML that are stored as individual JSON keys. One reason to do this is that they can then be indexed and used in queries.

For example:
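(The original example is cut off in this excerpt.) As a stand-in, here is a minimal sketch of what such a document could look like. The XML payload, the element names, the field names and the `firstTagText` helper are all hypothetical, and the regex-based extraction is deliberately naive, not a real XML parser.

```javascript
// Hypothetical XML payload; element names are invented for illustration.
const xml =
  "<trade><id>42</id><currency>EUR</currency><amount>100.50</amount></trade>";

// Naive helper: returns the text of the first <tag>...</tag> element,
// or null if it is absent. Ignores attributes, nesting and namespaces.
function firstTagText(source, tag) {
  const match = source.match(new RegExp("<" + tag + ">([^<]*)</" + tag + ">"));
  return match ? match[1] : null;
}

// The whole XML is kept as one string field, and a few values are
// copied out into their own keys so they can be indexed and queried.
const doc = {
  xml: xml,
  trade_id: Number(firstTagText(xml, "id")),
  currency: firstTagText(xml, "currency"),
};

console.log(doc);
```

In the mongo shell, a document like this could be inserted into a collection and one of the extracted keys indexed with createIndex, for instance on `currency`, while the `xml` field stays an opaque string.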

MongoDB & Node.js events in Stockholm, Aug 26

Monday, Aug 26 will be an exciting day if you live around Stockholm and are interested in MongoDB, Node.js, or just open source and web programming or anything along those lines.

In the morning, from 9:00 to 12:00 I will be doing MongoDB office hours. This is an "ask the expert(s)" type of free community consulting thing we do regularly at 10gen. You can come and discuss your MongoDB application, architecture or just basic questions. First come, first served. Please RSVP at https://www.meetup.com/Stockholm-MongoDB-User-Group/events/134695732/

Thoughts from Oscon: Why diversity is annoying and assholes run large corporations

Oscon is over, I'm home and recovered both from jet lag and just general exhaustion.

Oscon is a very broad conference, so there is a lot to learn and many people and projects to befriend. There are many things and angles one could write a blog post about. To me, Oscon is above all the conference where you meet other open source people and have deep and inspiring discussions. So in that spirit I will make a few philosophical remarks in this post, thoughts from Oscon 2013.

If you want to read a run-through of the conference itself, I recommend Dirk van den Poel's very extensive summary.

pt-query-digest for MongoDB profiler logs

One of my favorite MySQL tools ever is pt-query-digest. It's the tool you use to generate a report from your slow query log (or some other supported sources), and is similar to, for example, the Query Analyzer in MySQL Enterprise Monitor. Especially in the kind of job I have, where I often just land out of nowhere on a server at the customer site and need to quickly get an overview of what is going on, this tool is always the first one I run. I will show some examples below.

Slides from Spatial functions in MySQL 5.6, MariaDB 5.5, PostGIS 2.0 and others at Percona Live

Slides from my Percona Live talk evaluating the new spatial features in MySQL 5.6 and MariaDB 5.5 are now online. This is new material I have never presented before. It is based on work I have done in my job at Nokia HERE.com location services. So even if at this conference it draws less attention than my HA talks, it is actually what I'm most proud to present.

TL;DR summary is that PostgreSQL has lots of features but MySQL has much better ease of use and performance. (I copy-paste this standard sentence into any PostgreSQL vs MySQL evaluation I do :-) The MongoDB info is basically outdated, as the new 2.4 release introduces a completely new implementation based on GeoJSON and new indexing, neither of which I tested.
