Yesterday I posted results from some MySQL benchmarks I had been doing on a large server. In this post I'd like to list 4 important fixes that were done to avoid bad performance:
- Linux swapping, NUMA: If you have more than 1 CPU socket, you will be using Non-Uniform Memory Allocation. Linux has a bad default strategy of allocating memory here, and it is easy to end up in a situation where you have many GB of free memory left, yet Linux is already swapping. Yes, seeing that happen will be very confusing! This problem has been investigated and explained brilliantly by Jeremy Cole. The solution is a one liner given at the end of his article.
- Swappiness: This is not a bug, just easy to forget, or easy to assume someone else already set this. By default, Linux will think it is a good idea to swap out some processes (especially MySQL, that consumes a lot of memory) in favor of the filesystem cache, so that your disk access becomes faster. This is a good idea in many cases, but not when you're running a database. So you should set
/etc/sysctl.conf. This was covered in Yoshinori's tutorial at the MySQL conference.
- Swappiness, Linux bug: On kernels prior to 2.6.28 (such as CentOS 5...) Linux will still swap out MySQL regardless of vm.swappiness, because the algorithm for balancing swapping and filesystem cache is stupid. The fix is to upgrade your kernel. A workaround is again offered by Yoshinori to those who can't upgrade: A perl script you can run from cron.
- InnoDB contention on 100+ GB buffer pool: It seems that InnoDB performance will severely degrade under some conditions if your buffer pool is 100 GB or more (exact limit unknown). This was originally observed by Vadim. The contention is possibly triggered when your system has low latencies when writing to disk, or when using ext3. A workaround is to make the buffer pool smaller - 64 GB has been verified to work well.
I'd like to thank Jeremy Cole, Yoshinori Matsunobu and Vadim Tkachenko for sharing their experiences and making it easy to fix what would otherwise have been very difficult problems.