Dealing with the Cambrian Explosion 1/2: How to parameterize the package name in source and binary TAR files

As I mentioned before, it seems that thanks to Git and Bzr introducing distributed version control workflows, the open source community is now living in a phase where forking is easy and happens frequently - referred to by Brian Aker as the Cambrian Explosion of open source. We certainly see that happening in the MySQL Community.

Assuming you have the competence and know your way around a codebase, forking a proper open source project isn't that hard. You create your own project on GitHub or Launchpad and copy the source code. [1] But one thing you do need to do is switch to the new name (Drizzle, Percona, MariaDB, MepSQL...). Typically you want to keep using the original command line and file names (mysql, mysqld_safe, libmysql...), yet the product name as it appears in your installation packages changes to distinguish it from the parent project.

In the life of an open source project a name change is a relatively rare occurrence. Most projects never fork or change their name. So it is not surprising that all the tools and methods we use while programming assume that the product name is a constant. It turns out it is hardwired into build scripts here and there. Sometimes excessively: when building DEB files the word "mysql" is used over a hundred different times!

Just changing the name from one (mysql) to another (mepsql) isn't a difficult task. But in this new Cambrian world we are living with multiple co-existing "friendly forks" that all continue their development and share code back and forth. Distributed version control makes all this relatively easy, so I didn't want the build system to be an obstacle either. If your code differs in over a hundred lines because of the name alone, then the affected files will have merge conflicts whenever you want to exchange code between the friendly forks.

So I wanted the name of the project, the name of the output package files, to become a parameter. This is the only way the same build scripts could be used and co-maintained by all the different forks. It is also the only way for me to use the same script easily to build mysql, percona-server or mariadb packages, in addition to my own mepsql packages, should I ever want to do that.

Producing a differently branded binary tar

MySQL sources contain a script, scripts/make_binary_distribution.sh, that is used to build the binary tar. (Running make bin-dist runs this script.)

This script already does similar things to what we want to do. For instance, it will take the output of uname and change some platform codenames to the better known product names, such as "GNU/Linux" to "linux" or "darwin" to "osx":

system=`echo $system | sed -e 's/darwin9.*/osx10.5/g'`
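As a rough illustration of that kind of mapping, here is a minimal sketch. The function normalize_system is a hypothetical helper made up for this post, not something found in make_binary_distribution.sh, and the real script handles many more platforms than these two.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
normalize_system() {
  # Lower-case the name first, then map codenames to product names.
  echo "$1" | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/darwin9.*/osx10.5/' -e 's;gnu/linux;linux;'
}

normalize_system "Darwin9.8.0"   # prints osx10.5
normalize_system "GNU/Linux"     # prints linux
```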

After all such decorations are figured out, the script simply builds the mysql binaries, runs make install to copy all of them into a directory, and then creates a TAR file of that directory. The name of the TAR file, as well as the top level directory when extracted, is what the script has been busy determining:

if [ x"$SHORT_PRODUCT_TAG" != x"" ] ; then
  NEW_NAME=mysql-$SHORT_PRODUCT_TAG-$VERSION_NAME-$PLATFORM$SUFFIX
else
  NEW_NAME=mysql@MYSQL_SERVER_SUFFIX@-$VERSION_NAME-$PLATFORM$SUFFIX
fi

The only thing hard coded in that name? The string "mysql"!

OK, so when MariaDB was forked, that string was changed to say "mariadb" instead; not a big deal. But now we live in the Cambrian Explosion, so we need to think differently and embrace forking. So I want to parameterize this.

Hence, in the MepSQL code branch you can either pass a --mysqlfork=mepsql command line parameter, or set an environment variable with export MYSQLFORK=mepsql:


# $MYSQLFORK can also be set as an environment variable. This allows for better
# compatibility, earlier versions of this script will just silently ignore it.
if [ ! "$MYSQLFORK" ]; then
  MYSQLFORK=mysql # Used as the first part to NEW_NAME
fi

for arg do
  case "$arg" in
    --tmp=*) TMP=`echo "$arg" | sed -e "s;--tmp=;;"` ;;
    --mysqlfork=*) MYSQLFORK=`echo "$arg" | sed -e "s;--mysqlfork=;;"` ;;

...and this value is then used as the package name:


if [ x"$SHORT_PRODUCT_TAG" != x"" ] ; then
  NEW_NAME=$MYSQLFORK-$SHORT_PRODUCT_TAG-$VERSION_NAME-$PLATFORM$SUFFIX
else
  NEW_NAME=$MYSQLFORK@MYSQL_SERVER_SUFFIX@-$VERSION_NAME-$PLATFORM$SUFFIX
fi

Note that:

  • It is recommended to use the environment variable rather than the command line parameter, since if you happen to build some source that doesn't support the $MYSQLFORK parameter (currently anything but the mepsql branch), then it will just be silently ignored, whereas the --mysqlfork command line parameter would trigger an error.
  • The default is left as "mysql" (backwards compatibility). So to build packages called "mepsql" you really have to specify that.
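To see how the default, the environment variable and the command line parameter interact, here is a small self-contained sketch. It is not the real script: build_name is a hypothetical function, the version and platform values are made up, and it uses the shell's ${arg#prefix} expansion where the real script pipes through sed.

```shell
#!/bin/sh
# Sketch only: build_name is hypothetical, and the values below are
# invented for the demonstration.
VERSION_NAME=5.1.52
PLATFORM=linux-x86_64

build_name() {
  fork=${MYSQLFORK:-mysql}      # environment variable, defaulting to "mysql"
  for arg do
    case "$arg" in
      # A command line parameter overrides the environment variable.
      --mysqlfork=*) fork=${arg#--mysqlfork=} ;;
    esac
  done
  echo "$fork-$VERSION_NAME-$PLATFORM"
}

unset MYSQLFORK
build_name                          # prints mysql-5.1.52-linux-x86_64
( MYSQLFORK=mepsql; build_name )    # prints mepsql-5.1.52-linux-x86_64
build_name --mysqlfork=mariadb      # prints mariadb-5.1.52-linux-x86_64
```

The subshell in the second call is only there to keep the variable from leaking into the third one.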

Building a source tar

The source tar file is a different case. It is the result of running make dist. This make target is a feature of autotools, and the package name is taken from the last argument of the AC_INIT() macro in configure.in:


AC_INIT([MySQL Server], [5.1.57], [], [mysql])

I don't remember the details, but with MariaDB Kristian determined that changing that value from "mysql" to anything else was not a good idea. Instead, the name is changed after the fact, much like for the binary tar above. The OurDelta derived tarbake script wraps around make dist, unpacks the resulting tarball, renames the top level directory and packs it again:


NEWBASE="${PACKAGE_NAME}-${VERSION}"
tar zxf "${DISTNAME}.tar.gz"
mv "${DISTNAME}" "${NEWBASE}"
TARBALL="${NEWBASE}.tar.gz"
tar zcf "${TARBALL}" "${NEWBASE}/"

These lines from the MariaDB script are used here as is; we simply set PACKAGE_NAME=$MYSQLFORK elsewhere.
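The unpack, rename and repack steps can be tried in isolation with a sketch like the one below. It assumes nothing about the rest of tarbake: repack_dist is a hypothetical helper, and the tarball it works on is a fake one created in a scratch directory.

```shell
#!/bin/sh
# Sketch only: repack_dist is hypothetical, and the file names are
# invented for the demonstration.
repack_dist() {
  distname=$1    # base name that "make dist" produced
  newbase=$2     # base name the fork wants
  tar zxf "${distname}.tar.gz" &&
  mv "${distname}" "${newbase}" &&
  tar zcf "${newbase}.tar.gz" "${newbase}/" &&
  rm -rf "${newbase}"
}

# Fake a "make dist" result in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mysql-5.1.52
echo "placeholder" > mysql-5.1.52/README
tar zcf mysql-5.1.52.tar.gz mysql-5.1.52/
rm -rf mysql-5.1.52

repack_dist mysql-5.1.52 mepsql-5.1.52
tar ztf mepsql-5.1.52.tar.gz    # lists the entries under mepsql-5.1.52/
```

Both the file name and the top level directory inside the archive carry the new name, which is the same result the binary tar script produces.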

AC_INIT() name and version

There's one more place to modify when you want to "brand" your own version of MySQL. It is the AC_INIT macro we already saw above. In MariaDB they took the approach of adding the "MariaDB" string here, so that when you connect to a MariaDB server, or do SELECT VERSION() you see that it is a MariaDB flavored MySQL server:

AC_INIT([MariaDB Server], [5.1.55-MariaDB], [], [mysql])

or in MepSQL:

AC_INIT([MepSQL Server], [5.1.52-MepSQL-11.02-alpha2], [], [mysql])

These values are shown like this in the mysql console:


ubuntu@domU-12-31-39-03-AD-C2:~$ mysql -u root
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 39
Server version: 5.1.52-MepSQL-11.02-alpha2 (Ubuntu)
 
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select version();
+----------------------------+
| version()                  |
+----------------------------+
| 5.1.52-MepSQL-11.02-alpha2 |
+----------------------------+
1 row in set (0.00 sec)

mysql>

Note that there is also a configure option, --with-server-suffix, which takes a text string that is appended to the version number. For some reason MariaDB chose to append to the version in AC_INIT() directly, and I agree that feels correct when you have a fork that has its own version numbers rather than just being a suffix to a MySQL version. The --with-server-suffix build parameter is used by Linux distributions, and I believe by MySQL itself when it builds -community and -enterprise flavors of its own sources. The end result is the same: from the above you wouldn't know whether I had used the suffix or appended to AC_INIT() directly.

Cleaning up the fork name from the version string

While it is good practice to show the name of your fork together with the version string as above, it is redundant when the version number is used as part of the package. Imagine if MepSQL packages were called:

mepsql-5.1.52-MepSQL-11.02-alpha2.tar.gz

So in the tarbake script we remove the second "MepSQL" like this:

VERSION=$(echo "${VERSION}" | sed -e "s/-$MYSQLFORK//I")

This is how we get packages like:

mepsql-5.1.52-11.02-alpha2.tar.gz

Again, we have made this line of code generic by looking for the value of the $MYSQLFORK variable, where originally the script just had a hard coded "MariaDB" string there.
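The substitution is easy to try on its own. In the sketch below, strip_fork_tag is a hypothetical wrapper around the same sed expression. Note that the I flag (case-insensitive matching) is a GNU sed extension, so this assumes GNU sed, and it also assumes the fork name contains no characters that are special in a regular expression.

```shell
#!/bin/sh
# Sketch only: strip_fork_tag is a hypothetical wrapper.
strip_fork_tag() {
  version=$1
  fork=$2
  # Remove the first "-<fork>" from the version, ignoring case (GNU sed).
  echo "$version" | sed -e "s/-$fork//I"
}

strip_fork_tag "5.1.52-MepSQL-11.02-alpha2" mepsql
# prints 5.1.52-11.02-alpha2

echo "mepsql-$(strip_fork_tag "5.1.52-MepSQL-11.02-alpha2" mepsql).tar.gz"
# prints mepsql-5.1.52-11.02-alpha2.tar.gz
```

The case-insensitive match is what lets a lower-case MYSQLFORK=mepsql remove the mixed-case "MepSQL" from the version string.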

To be continued: Parameterizing the package name in DEB files

This was still quite easy. Parameterizing the fork name in the DEB packages was much more difficult. I will describe how that was done in a second blog post.

  • [1] As noted in previous posts, if not everything is fully open source, which is the case with MySQL, you face the challenge of reproducing the missing parts from scratch, such as documentation, the build system, etc.