Alex Yurchenko has finally posted results from a benchmark he had been planning for a long time: a Galera vs NDB cloud shootout.
Their blog requires registration to comment, so I'll post my comment here instead:
Sysbench can do the load balancing itself, so there is no need for an external load balancer. Just pass a comma-separated list of master MySQL nodes to --mysql-host. This is similar to what the JDBC and PHP drivers can do too, and it is my favorite architecture. Why introduce extra layers that you don't need and that don't bring any additional value?
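To illustrate what I mean by client-side load balancing, here is a minimal Python sketch. It is not sysbench's actual code: the round-robin policy, the function name make_host_picker and the host names node1..node3 are all my own assumptions for illustration.

```python
from itertools import cycle


def make_host_picker(host_list):
    """Return a function that hands out hosts round-robin.

    host_list is a comma-separated string, in the same spirit as the
    value passed to --mysql-host. Round-robin is an assumed policy.
    """
    hosts = [h.strip() for h in host_list.split(",") if h.strip()]
    if not hosts:
        raise ValueError("no hosts given")
    ring = cycle(hosts)
    return lambda: next(ring)


# Hypothetical node names; each new client connection asks for the next host.
pick = make_host_picker("node1,node2,node3")
connections = [pick() for _ in range(6)]
print(connections)
```

The point is that the client already knows every node, so spreading connections takes a few lines and no extra network hop.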
Sysbench is probably not an optimal benchmark for "out of the box" NDB. Sysbench OLTP does a lot of selects on ranges, as well as joins and unions. By default NDB partitions the data using a hash algorithm, which means every sysbench transaction has to "spray" itself across all the partitions. To optimize for the sysbench workload, one should explicitly specify the partitions to be contiguous (PARTITION BY RANGE). In my experience with NDB, correct partitioning for your workload can improve performance by anywhere from 2x to 5x, perhaps even more.
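A small Python sketch makes the "spray" effect concrete. Everything here is an assumption for illustration: 4 partitions, 1000 rows, a 100-row range scan, and crc32 as a stand-in hash (NDB uses its own hash function, not this one).

```python
import zlib

NUM_PARTITIONS = 4   # assumed number of partitions
ROWS = 1000          # assumed table size, ids 1..ROWS


def hash_partition(row_id):
    # Stand-in for hash partitioning; crc32 is NOT what NDB uses.
    return zlib.crc32(str(row_id).encode()) % NUM_PARTITIONS


def range_partition(row_id):
    # Contiguous ranges: ids 1-250 -> 0, 251-500 -> 1, and so on.
    return min((row_id - 1) * NUM_PARTITIONS // ROWS, NUM_PARTITIONS - 1)


def partitions_touched(partition_fn, first_id, last_id):
    """Set of partitions that a scan over first_id..last_id must visit."""
    return {partition_fn(i) for i in range(first_id, last_id + 1)}


# A range select over ids 100..199, similar in spirit to sysbench's range queries.
hash_hits = partitions_touched(hash_partition, 100, 199)
range_hits = partitions_touched(range_partition, 100, 199)
print("hash partitioning touches", len(hash_hits), "partition(s)")
print("range partitioning touches", len(range_hits), "partition(s)")
```

With contiguous ranges the scan stays on a single partition, while the hashed layout scatters neighbouring ids over several partitions, and every extra partition means extra network round trips.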
Btw, is sysbench latency reported per transaction or per statement? I suppose it is per transaction. In that case 220 ms latency is completely expected, and it is due to the non-optimal partitioning I explained above. NDB shouldn't have any problems getting into the 10 ms latency range if partitioning is done such that most transactions hit only a single partition. Yeah, knowing this one rule is really the secret to becoming an NDB guru :-) I suppose using Dolphin interconnects, or with NDB 7.2 also Infiniband, would cut down the latency for you anyway, but correct partitioning is always preferred.
Your conclusion is mostly right. In an "out of the box" dumb benchmark NDB usually does not deliver great performance results, unless you are lucky and happen to be running a pure key-value workload. This is true even if you have non-virtualized 10 Gbit Ethernet super-duper hardware. Latencies of 100 ms to 200 ms are exactly the range where you end up in such a case. However, once you know that NDB is almost always bottlenecked by network communication overhead, you realize that your goal in optimizing is to minimize network communication. After that it becomes an easy task for any experienced DBA to use the correct PARTITIONing scheme when creating tables. (In fact, it's the same optimization you would also do for a partitioned table on a single node, such as a data warehouse on InnoDB or Oracle.)
If you still have the test environment set up, it would be really interesting to also see such a test with NDB, so that you don't just compare the "dumb" case (which is valid in itself, and I totally agree the Galera experience is much more user friendly) but also the best case. Based on this I'd say it could be a pretty even match.