Overview and archaeological exploration of the MepSQL Bakery

This is the final part in a series of posts about the MepSQL build system, known as MepSQL Bakery. MepSQL is (yet another) fork of the MySQL database server, with the server based on the MySQLatFacebook code and the build system based on the MariaDB build system.

In this final post I wish to draw a high-level picture of the complete process of building TAR, DEB and (eventually) RPM packages from the source code. There are not many technical details left to add to the previous posts. Instead I'm going to make some, shall we say, "archaeological" observations, which imho are interesting given how the build system has evolved as it was passed from one project to another. Perhaps more importantly, I will also say a few words about where I think the future direction lies.

For reference, in previous posts we have already discussed:

  1. Announcing MepSQL, continuing the "Cambrian Explosion" of MySQL forks
  2. MepSQL Debs for Ubuntu now released - courtesy of cool tweaks to the build system.
  3. Looking at OpenSuse Build Service and Launchpad PPA (aka: How to build packages for MepSQL?)
  4. Why to choose a cloud service, and which one
  5. Going from MariaDB to MepSQL BuildBot setup and using EC2LatentBuildSlave to save money
  6. Dealing with the Cambrian Explosion 1/2: How to parameterize the package name in source and binary TAR files
  7. Dealing with the Cambrian Explosion 2/2: Parameterizing the package name in DEB files

The full MepSQL Bakery stack, derived from MariaDB, derived from...

When looking at the diagram of scripts and technologies involved in producing the MepSQL packages, the first observation is that there are surprisingly many layers in this picture:

[Figure: MepSQL build system]

The process starts with a commit triggering the BuildBot master to start building, which then calls scripts in the bakery layer, which in turn call down into the next layer, and the next...

One way to try to understand this complexity is the realization that this stack can be seen as "archaeological" layers of evolution, where one open source project has taken the technology from its predecessors and built upon what already exists:

[Figure: MepSQL build system with references to the origin of each technology]

Such is, for instance, the relationship between the top-most BuildBot layer and the Bakery layer following it. In my opinion there is no strict technical boundary between functionality that should be in the BuildBot scripts and functionality in the bakery scripts. The BuildBot scripts are just automation built on top of the already existing OurDelta scripts. Whether a line of code is in one or the other is mainly a result of who happened to write it.

I'll just add a quick hack here and do the real fix (10 years) later...

Such artifacts can be seen throughout the code. As exhibit one, consider that the MySQL sources already contain scripts/make_binary_distribution.sh (covered earlier):

# FIXME should be handled by make file, and to other dir
mkdir -p $DEST/bin $DEST/support-files
cp scripts/mysqlaccess.conf $DEST/bin/
cp support-files/magic $DEST/support-files/

# Create empty data directories, set permission (FIXME why?)
mkdir $DEST/data $DEST/data/mysql $DEST/data/test
chmod o-rwx $DEST/data $DEST/data/mysql $DEST/data/test

What is happening here is that two files are considered to be in the wrong place after make install, and whoever was maintaining this script took the opportunity to fix the issue here rather than in the Makefile. (Btw, I know the answer to the "why?" question; find me as hingo on IRC channel #mysql.) I sympathize with whoever did that. I did exactly the same in a bakery script I happened to be working on several years later:

# TODO: Terrible terrible hack, fix this in makefile, then fall back to the above uncommented code.
mv ${NEW_DIR}/share/mysql/*.sql ${NEW_DIR}/share/

(This fix might be related purely to the mysqlatfacebook branch I use as the base for MepSQL. I doubt this error exists in official MySQL, but I haven't checked.)

The evolution only got us halfway

Another major thing that has changed is the evolution from first Linux distributions, and later OurDelta, being essentially build scripts, to MariaDB and MepSQL being self-contained forks/variants of MySQL. Back in the old days, the job of OurDelta was to take the source tar released by official MySQL, apply some patches, then build packages. Linux distributions do essentially the same. But today everyone has their own bzr branch, so there is no need to patch anything: the code already is what it is supposed to be.

So what should be done in the future is some serious cleanup of historical baggage, to make this system more streamlined. Or put in simple words, the picture above should have fewer boxes.

The entire bakery layer is one such artifact: most of it could be pushed down into the main mepsql/mariadb/mysql code repository itself, just like scripts/make_binary_distribution.sh already is there. Functionality that is hardwired into the physical infrastructure used to build the packages (the server at Kristian's home for MariaDB, or Amazon EC2 for MepSQL) can be merged into the BuildBot configuration file. (As an option, long snippets of bash commands could then possibly be moved out from the BuildBot configuration file into separate shell scripts, if you feel like it's hard to let go of that layer. It might make the BuildBot configuration shorter and easier to digest. But the scripts would still be closely tied to the setup of the BuildBot system running them, whereas any scripts pushed to the MySQL sources would be generic in nature.)

Similarly, the files used to build debs and rpms could be merged into the main source repository. Having the ability to do just make debs and make rpms directly as a first-class function of a source release makes sense to me (same for Windows, OS X and Solaris even). (From what Vlad says, this is apparently considered normal in CMake.) In fact, that is exactly where Percona keeps their rpm spec and debian/ files, although they are hidden inside the XtraDB plugin. Probably again for their own historical reasons?

Looking forward to CMake in MySQL 5.5!

And yes, the Autotools have already been done away with for MySQL 5.5 and are replaced by CMake. Let me just say it out loud: I will not miss them!

I get the usefulness of the idea: You have some template file which is transformed into the actual makefile. Fine. Kind of what we did to abstract the package name into the $MYSQLFORK variable. But who thought it was a great idea to perform a total of 5 passes, where the output of the previous tool is again transformed in new ways by the next? Out of all the layers in the above diagram, figuring out autoconf-related errors was the most grievous experience of all.
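For contrast, a single-pass substitution is about as simple as templating gets. The following is only a minimal sketch: the @MYSQLFORK@ placeholder syntax, the render_template function name and the sample template line are all made up for illustration and are not taken from the bakery scripts. It also assumes the fork name contains no "/" or other characters special to sed.

```shell
# Hypothetical single-pass template renderer (illustration only).
# $1 = template text, $2 = fork name to substitute for @MYSQLFORK@.
render_template() {
    # One sed pass: replace every placeholder occurrence with the fork name.
    printf '%s\n' "$1" | sed "s/@MYSQLFORK@/$2/g"
}

MYSQLFORK=mepsql
render_template 'Package: @MYSQLFORK@-server' "$MYSQLFORK"
# prints: Package: mepsql-server
```

One input, one transformation, one output: when something goes wrong there is only one place to look.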

So I look forward to moving towards the MySQL 5.5 codebase. While I don't have any experience with CMake, I already expect it to be an improvement :-)

Styles, cultures and personalities

Finally, going back to the topic of archaeological exploration, it is always fun to observe different coding styles, or even commenting styles...

# Whatup?
echo "Going to build ${SOURCE_DIR} in ${BUILD_BASE_DIR}"

    -- How they comment their code Down Under.

# A DEB package name would be composed of the below components as such:
# Note:
# $VERSION is appended to this file automatically by tarbake*.sh. Its value is
# parsed from the src tar generated by make dist, in other words it is the
# version set in configure.in:AC_INIT() in the MySQL sources. Hence it is also
# what SELECT VERSION() will output in the server. However, tarbake*.sh will
# remove the string "-$MYSQLFORK" (case *insensitive*) from $VERSION if found.
# $REPO_VERSION is found in the .bzr_version file which is created by preheat.sh.
# Its value is the revision number of the bzr repo. The intent is to provide
# a unique identification of the scripts used to build the packages. (But note
# that bzr sometimes rehashes the revno.)
# $CODENAME is the distro "nickname" as returned by "lsb_release -sc", such as
# "lucid" or "lenny".
# Due to historical reasons there is still some redundancy. (TODO: refactor it
# away!) for instance $PACKAGE_NAME and $MYSQLFORK are synonyms but either is
# used in different places. $VERSION is not used consistently, some scripts seem
# to parse the version from the src tar themselves. (See $UPSTREAM variable.)

    -- Code commented by a pedantic Scandinavian.
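To make the $VERSION rule in that comment concrete, here is a minimal sketch of stripping "-$MYSQLFORK" case-insensitively from a version string. This is not the code from tarbake*.sh: the function name strip_fork_suffix and the sample version string are assumptions made up for illustration.

```shell
# Hypothetical helper (not from tarbake*.sh): removes the first occurrence
# of "-<fork>" from a version string, ignoring case.
# $1 = version string, $2 = fork name.
strip_fork_suffix() {
    awk -v v="$1" -v f="-$2" 'BEGIN {
        # Search in lowercased copies so that the match ignores case,
        # but cut the original string so its own case is preserved.
        pos = index(tolower(v), tolower(f))
        if (pos > 0)
            v = substr(v, 1, pos - 1) substr(v, pos + length(f))
        print v
    }'
}

MYSQLFORK=mepsql
VERSION=$(strip_fork_suffix "5.1.52-MepSQL" "$MYSQLFORK")
echo "$VERSION"
# prints: 5.1.52
```

A version string that does not contain the fork name passes through unchanged.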
