MySQL Conference 2011

I went to the MySQL Conference once again at the Santa Clara Convention Center.

Executive Summary: The talks and informal conversations centered around these main issues:

  1. fragility of replication, and need for a global binlog ID and checksums
  2. stalls in Innodb queue processing, resulting in query latency
  3. general happiness with Oracle’s MySQL 5.5 release, considered by many to be the best release in several years
  4. there were some Postgresql talks and the closing keynote was by Mike Olson, the former CEO of Sleepycat, so a segue is being made to an Open Source database conference in future years.

Some more detailed comments below …

MySQL Replication Tutorial

– read-only except for super-user and repl thread (cool for DW)
– slavereadahead, mk-prefetch

– do
– expensive queries
– non-deterministic functions

– lots of rows or network traffic

semi-sync available in 5.5
– load-plugin on master and slave

delayed replication in 5.6

stop slave;
change master to master_delay=60;
start slave;

See SQL_DELAY: !!!


– now has options for also creating slave and cir


– oak-get-slave-lag
– oak-online-alter-table
– oak-purge-master-logs
– oak-show-replication-status

MySQL Performance Tuning Tips

– sequence diagram time x flow
– show global status like ‘table_wait_%’
– on linux stats are per-process, not per-thread. mysqd is multi-threaded

– add a performance schema like Luke’s

Aspersa toolkit

– missed: iotop, dtrace, systemtap, sample “show processlist”
Justin Swanhart

Diagnosing MySQL Replication Failiures
Justin Swanhart

– recommended to enable binlogs on slave for recovery and diagnostics
– ie. errant write to slave will have server-id=slave-id
– BBU and buggy journal file process common sources of “rolled back” master compared to slave
-use maatkit to identify and update slaves, note that checksum performance may depend on key distribution heuristics
– statement-based and row-based replication have different issues and bugs
– read the replication features 33 sections
– note there are still 37 replication bugs
– mixed innodb and myisam updates is a bad idea, overall ill-defined behavior (at least 3 changes in 5.1.x)
– CREATE TEMPORARY TABLE – bad idea, but at least use a unique name with tmp_ prefix to help the DBA

Automated, Non-stop MySQL Operations and Failover
Mats, DeNA & ngmoco

1) heartbeat + drbd

innodb-flush-log-at-trx-commit=1, sync-binlog=1 for HA, but kills write performance

2) Mysql Cluster – not innodb


4) Google patch – with guid in binlog

– study relay log internals “at” and “end_log_pos”
– to catch up a slave, mysqlbinlog –start-position=101835
– actually does recovery of lost transactions
– disable purge of relay log after sql thread to preserve statements, but will fill disk, so need alternate way of deleting
– purge relay log blocks (90 seconds for 10 GB), so need hard links to do non-blocking delete outside MySQL process
– show full processlist is non-blocking way to show progress of sql thread on slave, unlike master_pos
– mysqlbinlog –base64=always, none have useful properties
– Tool: Master High Availability Toolkit
power off delays
– R610 5-10 seconds using telnet+DRAC
– DL360 4-5 econds ipmitool+iLO
– do not use 5.0.45 or lower version (end_log_pos is incorrect)
– drop user is graceful way of not accepting new app connections, but finishing existing transactions.



we don’t monkeypatch slaves – we rebuild from a reliable source
problem with mysql permissions is cached data


XFS better with 4k blocks
mkfs.xfs -s szie=4096
mount -o nobarrier
multiple threads
Percona server or Innodb plugin or Mysql 5.5
Innodb mutex contention with 120 GB RAM, performs better with 26 GB

Linden Labs
20 TB SSD X25 MLC and SLC, one failed on POST

Interesting trick

FusionIO user for mysql db – disks on master, fusion ion on 8 slaves, very happy now, early days some driver updates


– sweet features, UI, backup options
– still no way setup a remote slave or do root configuration-type commands

Schooner and Virident had booths.

Wed. Keynotes

MariaDB 5.6


Brian Aker
Drizzle GA

MySQL State of the Art @ facebook

DB Operations
– typical DBA

DB Engineering Team
– develop code for MySQL, tools, fix bugs, port patches

DB Performance Team
– full stack solution hw to os to cache to db
– test
– improve

– Nov 2010
– 4 ms read, 5 ms write
– 38 GB traffic on database tier
– 13 million qps

– based on vanilla MySQL 5.1
– custom facebook build on launchpad
– heavily sharded, heavy replication

– lots of generations
– many core, not cpu bounc
– local storage, no SAN
– some flash in production

Performance Ideas
– monitor long-term trends – why are inserts up 10% this week?
– monitor real-time problems – look at Top10, not aggregates
– monitor everything
– throughput vs. latency
– 95th, 99th percentile
– stalls (1s, 2s, 3s why)


– table statistics
– shadows
– async copy of live data on test machine for load testing for main API (not ETL stuff), starts with replication and breaks it
– percona has a tool called logjam
– slocket
– udp dump of slow queries from network
– pmysql
– parallel statement submit
– global show full processlist
– fast – thousands of servers in a second
– replication sampling
– run show full processlist in a loop, look at writes, generate % stats
– aggregated client error logs
– scribe is a distributed log processing system
– query comments
– every single MySQL statement ever
– FB has an API to insert this
– server, page, memcached info
– FB doesn’t use query cache, but could strip out comments to make it query-cache friendly
– backtraces are stored by id
– indigo (query monitor)


– Dogpiles
– Temporary Slow Down
– Stall Tools
– dogpiled
– aspersa stalk, collect
– pmp
– gdb

stalls found
– tables extending
– drop table
– purge connection
– kernel mutex
– transaction creation
– lock creation/removal
– deadlock detection
– many more – servers don’t expect as well as you would expect

“MySQL has to do what Facebook wants. It is a submissive relationship.”


– increasing utiliization of hw
– memory to disk ratio
– finding bottlenecks
– disk bound normally
– sometimes network
– rarely CPU/memory bw
– poor application design
– leads to poor query design

Disk Efficiency

– normally disk found
– queue lengths
– Innodb page size
– default 16k (8k, 4k possible)
– smaller for flash?
– our tests range scans vs. point lookups
– need adjustable per table or index

(linux io request queue may be 128 requests, but each drive has 32 slots, so only 4 drives would be efficietly used)

Memory efficiency

– compact records
– clustered and covering index planning
– accessing historical data
– understand workloads
– full table scans
– ETL type queries
– mysqldump
– tune midpoint insertion for Innodb LRU


– many iops!
– awesome at random reads
– kind of small
– tested many vendors since a lot of mktg is misleading
– long-term performance
– reliability
– edge cases
– failure rates
– different use cases


– flash in front of disks
– slower disk usage possible
– write-back cache
– much more data storage
– able to utilize much more of flash card
– ver long warmup time
– open source

MySQL 2X Project

– flash is too fast, slave replication is too slow, so put multiple mysql instances per server
– use 3307, 3308, 3309 (skip 3306 to find legacy 3306 requests)


– MySQL is never a solved problem
– better/new
– new hw
– new data centers


– never assume
– add more metrics
– full stack
– more innodb info
– more application info


– lag is a big problem
– possible solutions
– better save prefetch
– maatkit version has problems
– tungsten parallel slave
– oracle parallel slave

InnoDB Compression

– originally was planned during 5.1 upgrade
– problems
– replication stream
– increased log writes
– performance in some cases
– goto db eng talk thu at 2.50 pm

env: CentOS/XFS, mysql build with debugging symbols on, xtrabackup


– 5 companies created common standard
– offset losses to processors from providers, or revoked
– outsourcing credit card processing correctly makes you compliant (cc’s never go through your network or touch your disk)

– don’t do PCI unless level 1 or level 2 merchant since it’s not cost-effective
– worry more about your PCI auditor than the written docs

– design access around business processes
– only billing support and ops person needs access to chd

– securich
– MySQL 5.5 has PAM, to help with enforcing password policies

– be judicious with grants
– disable local_infile
– disable old_passwords
– set read_only=on on replication slaves
– enable secure_auth

#, name, service code, exp date can store
mag, cvv2, pin cannot store

– binlogs enabled on master and slaves
– read-only
– semi sync, use furthest ahead slave


– host list
– discovers replication topology by polling hosts
– current master found based on topology, not dns
– mostly stateless
– locality to dc
– also RPC API
– python
– if decider is not online, no problem.

Sanity checks

– no more than N hosts should be down
  – recommend N=1
  – down slave could have most recent tx
  – N < hosts/2 for quorum Will be available in google mysql tools. Thursday

– Keynotes

– Baron Schwartz, Percona
– Mike Olson, former CEO Sleepycat Software, now CEO Cloudera

– remarkable insight into overall state of business problems and databases, no specific technology tips
– consumer devices are generating so much location data that we have no data store big enough and won’t without designing for it.

– 3 Ignite talks (20 slides in 5 minutes)
Baron Schwartz: MySQL failure stats from Percona support db (Tops are replications, human processes)
Canadian dude
Josh Berkus: “Be Sexy. Go Down.” parodying the latest IT trends and unintended anti-scaling (Ruby, etc.)
– some of the younger attendees didn’t realize that the examples were real-world, like the Ruby mongrel anti-scaling, etc.

Chef Tutorial

– central chef server needed, whether yours or on (free for 5 nodes)
– uses RSA keys


Patrick Galbraith

– works on MySQL and Drizzle

Error Detection with MySQL Replication

– uses stream cipher
– eventually to be Open Sourced

Closing Keynote

JC did a talk on MySQL @ Twitter with lots of statistics.

Once again, O’Reilly did a great job organizing the conference. Hope it continues for many years to come, preferably not during tax week.

Former MySQL boss: Code ‘in better shape than ever’ under Oracle

This entry was posted in Business, Japanese, Linux, MySQL, Open Source, Oracle, Perl, Storage, Tech, Toys. Bookmark the permalink.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.