Solving Java GC Pause Outages in Production

I’ve been thinking about how to configure HAProxy with two backend Java servers for high availability (HA).

Java programs pause for garbage collection; these stops are known as “GC pauses.”

The description “Stop the World” (STW) illustrates their true severity – they are a slow-motion train wreck for incoming requests.

If you’re new to this topic, please read:

Willy: “I work with people who use a lot of Java applications, and I’ve seen them spend as much time on tuning the JVM as they spend writing the code, and the result is really worth it.” Anybody have some extra time? 😐

My operational requirements for Java in production are:

  1. understand GC pause activity for my application servers
  2. control GC pause activity to a reasonable and bounded extent
  3. configure HAProxy load balancer to not send requests to servers undergoing GC pauses (ie. don’t lose requests)
  4. use an affordable amount of RAM to accomplish the above, preferably 8 or 16 GB in a shared VM environment.

1. Understand GC pause activity for my application servers

Detailed GC logging can be enabled with:

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

and you can specify a separate GC log with:

-verbose:gc -Xloggc:/tmp/gc.log

See “Understanding Garbage Collection Logs.”
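The logs give the after-the-fact view; the same counters can also be read from inside the JVM. Here is a minimal sketch using the standard GarbageCollectorMXBean. The class name GcStats and its helper methods are my own illustration, and getCollectionTime() is only an approximate accumulated time per collector, not strictly stop-the-world time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reads cumulative GC counters from the running JVM (Java 8 compatible). */
class GcStats {

    /** Sum of collection counts across all collectors; undefined (-1) values count as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionCount());
        }
        return total;
    }

    /** Sum of approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector, e.g. young and old generation collectors.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("total: %d collections, %d ms%n",
                totalCollections(), totalCollectionMillis());
    }
}
```

Sampling these totals on a timer and graphing the deltas gives a cheap picture of GC frequency without parsing log files.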

2. Control GC pause activity to a reasonable and bounded extent

One of the biggest challenges is to control the frequency and duration of GC pauses …

Some configuration approaches:

  • set the heap size and compaction percent only somewhat above need. GCs will be more frequent, but each one will be faster
  • or do the opposite: set the heap size to a large amount and compaction to 100%, then trigger a GC after hours
  • investigate alternate JVMs.

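An example of some of the tuning options (-XX:MaxPermSize applies to Java 7 and earlier; Java 8 removed PermGen and ignores the flag, with -XX:MaxMetaspaceSize as the nearest equivalent):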

java -Xms512m -Xmx1152m -XX:MaxPermSize=256m -XX:MaxNewSize=256m

JRockit JVM: Tuning For a Small Memory Footprint
Tuning Java Virtual Machines (JVMs)
Weblogic Tuning JVM Garbage Collection for Production Deployments

Some programming approaches:

  • use streaming file IO with Files.lines() instead of reading a whole file into a String or HashMap, or use memory-mapped files
  • rewrite portions of your application to build strings with StringBuilder (or StringBuffer when thread safety is needed) instead of repeated String concatenation
  • reduce object copies – if you do not have a problem with thread safety, then you don’t need immutable objects
  • call the dispose() method when available, such as on the SWT Image class
  • for HashMaps, call clear() to reuse the map later, or set the reference to null so that it can be garbage collected
  • split the Java server into real-time and batch servers where possible, with appropriate heap sizes.
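As a sketch of the streaming file IO bullet, this reads a file line by line with Files.lines(), so only the current line needs to live on the heap. The class name, the sample file contents and the demo() helper are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class StreamingLineCount {

    /** Counts lines containing the token without loading the whole file into memory. */
    static long countMatching(Path file, String token) throws IOException {
        // try-with-resources closes the underlying file handle when the stream is done.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.contains(token)).count();
        }
    }

    /** Writes a three-line temporary file and counts the lines containing "pause". */
    static long demo() {
        try {
            Path tmp = Files.createTempFile("gc-demo", ".log");
            try {
                Files.write(tmp, Arrays.asList("GC pause", "app line", "Full GC pause"),
                        StandardCharsets.UTF_8);
                return countMatching(tmp, "pause");
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // two of the three sample lines contain "pause"
    }
}
```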

3. Configure HAProxy load balancer requests to not be sent to servers undergoing GC pause events

This is tricky for several reasons:

  • health checks can be passive or active. Both have check gaps that won’t notice a GC starting before a request is sent
  • even if GC notifications are enabled and the server health check is red, HAProxy will not know (see above)
  • even if GC notifications are enabled and the server health check is now green, HAProxy will not know (see above) :)
  • the HAProxy options log-health-checks and redispatch may be helpful

a) I think the only 100% reliable way is to coordinate from the HAProxy side:

  1. understand your GC pattern
  2. use HAProxy socket interface to drain, then disable one backend
  3. wait for zero connections
  4. force a GC (easier said than done in Oracle Java since System.gc() is only a request for GC), or restart the Java server
  5. use HAProxy socket interface to enable the Java server.
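A rough sketch of the drain and wait steps, assuming the HAProxy stats socket is exposed over TCP at admin level (something like `stats socket ipv4@127.0.0.1:9999 level admin`) and that the backend and server names are placeholders. It relies on the runtime commands `set server <backend>/<server> state drain|maint|ready` and `show stat`, and reads the scur (current sessions) column of the CSV. The class and method names are hypothetical, so treat this as a starting point rather than a finished tool.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class HaproxyDrain {

    /** Builds a "set server" command; state is one of ready, drain or maint. */
    static String setState(String backend, String server, String state) {
        return "set server " + backend + "/" + server + " state " + state + "\n";
    }

    /** Returns scur (5th CSV column of "show stat") for one server, or -1 if not found. */
    static int currentSessions(String showStatCsv, String backend, String server) {
        for (String line : showStatCsv.split("\n")) {
            String[] f = line.split(",", -1);
            if (f.length > 4 && f[0].equals(backend) && f[1].equals(server)) {
                return f[4].isEmpty() ? 0 : Integer.parseInt(f[4]);
            }
        }
        return -1;
    }

    /** Sends one command; HAProxy answers and closes the connection in non-interactive mode. */
    static String send(String host, int port, String command) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.getOutputStream().write(command.getBytes(StandardCharsets.US_ASCII));
            socket.getOutputStream().flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            StringBuilder reply = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                reply.append(line).append('\n');
            }
            return reply.toString();
        }
    }

    /** Drains one server, polls until it has no sessions, then puts it in maintenance. */
    static void drainAndWait(String host, int port, String backend, String server)
            throws IOException, InterruptedException {
        send(host, port, setState(backend, server, "drain"));
        int scur;
        while ((scur = currentSessions(send(host, port, "show stat\n"), backend, server)) != 0) {
            if (scur < 0) {
                throw new IllegalStateException("server not found in show stat output");
            }
            Thread.sleep(1000);
        }
        send(host, port, setState(backend, server, "maint"));
    }
}
```

After the GC or restart, `send(host, port, setState(backend, server, "ready"))` puts the server back in rotation.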

This method would be risky with two Java servers, since during maintenance on one server, the other could GC pause. (facepalm)

b) Another possible approach would be to handle MemoryPoolMXBean MEMORY_THRESHOLD_EXCEEDED events. If they reliably gave advance notice, maybe they could be used to update the health check on the server side, send a drain request to the HAProxy socket, and then force a GC right away, perhaps with the JVM Tool Interface (JVM TI) ForceGarbageCollection() function?
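A minimal sketch of the threshold idea, using only the standard java.lang.management API. The class name, the nearFull flag and the fraction parameter are assumptions of mine; what the health check does with the flag is left open. Usually only the old-generation pool supports a usage threshold, so the method reports how many pools it armed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

class HeapThresholdWatch {

    /** Set to true once a heap pool crosses its threshold; a health check could read this. */
    static final AtomicBoolean nearFull = new AtomicBoolean(false);

    /** Arms a usage threshold at fraction * max on each heap pool that supports one. */
    static int arm(double fraction) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !pool.isUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool invalid or max undefined
            }
            pool.setUsageThreshold((long) (usage.getMax() * fraction));
            armed++;
        }
        // The MemoryMXBean emits the threshold notifications for all pools.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                nearFull.set(true);
            }
        };
        ((NotificationEmitter) memory).addNotificationListener(listener, null, null);
        return armed;
    }

    public static void main(String[] args) {
        System.out.println("pools armed: " + arm(0.80));
    }
}
```

Call arm() once at startup; each call registers another listener. Whether the notification arrives early enough to drain before the collector kicks in depends on allocation rate, so this needs measuring.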

c) And another idea is to write a sentinel file every 250 ms, and if its age reaches 750 ms, assume a GC is happening and drain HAProxy. Unfortunately the JVM TI events GarbageCollectionStart and GarbageCollectionFinish are sent while the VM is stopped, so you’re limited in what you can do when you need the most flexibility.
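The sentinel idea can be sketched with two halves: a writer thread inside the application JVM (which freezes during a stop-the-world pause, along with everything else) and a staleness check meant to run in a separate watcher process. The class name, file handling and method names are illustrative assumptions; a sturdier version would write to a temporary file and rename it so the watcher never sees a partial write.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Sentinel {
    static final long BEAT_MILLIS = 250;   // how often the application writes the sentinel
    static final long STALE_MILLIS = 750;  // age at which the watcher assumes a GC pause

    /** Writer side: rewrites the file with the current time every 250 ms. */
    static Thread startWriter(Path file) {
        Thread writer = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    byte[] now = Long.toString(System.currentTimeMillis())
                            .getBytes(StandardCharsets.US_ASCII);
                    Files.write(file, now);
                    Thread.sleep(BEAT_MILLIS);
                }
            } catch (IOException | InterruptedException e) {
                // stop writing; the watcher will then see the sentinel go stale
            }
        }, "sentinel-writer");
        writer.setDaemon(true);
        writer.start();
        return writer;
    }

    /** Watcher side: true when the last beat is at least 750 ms old. */
    static boolean isStale(long lastBeatMillis, long nowMillis) {
        return nowMillis - lastBeatMillis >= STALE_MILLIS;
    }

    /** Watcher side: reads the last beat, or the fallback if the file is missing or unreadable. */
    static long readBeat(Path file, long fallback) {
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.US_ASCII).trim();
            return text.isEmpty() ? fallback : Long.parseLong(text);
        } catch (IOException | NumberFormatException e) {
            return fallback;
        }
    }
}
```

The watcher would loop on `isStale(readBeat(file, lastSeen), System.currentTimeMillis())` and send the drain command when it turns true. By then the pause has already been running for up to 750 ms, so this limits the damage rather than preventing it.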

Some Java 8 Classes related to GC notifications:

  1. MemoryPoolMXBean – “The memory usage monitoring mechanism is intended for load-balancing or workload distribution use. For example, an application would stop receiving any new workload when its memory usage exceeds a certain threshold. It is not intended for an application to detect and recover from a low memory condition.”
  2. GarbageCollectionNotificationInfo
  3. GarbageCollectorMXBean

Also, investigate mod_jk and AJP. Tomcat uses the same heap as your application, so tuning is very important here too.

4. Use an affordable amount of RAM to accomplish the above, preferably 8 or 16 GB in a shared VM environment

If you work in a VM consolidation environment, it’s important to minimize the footprint of your applications. Requesting an entire server to run a bloated app just isn’t going to cut it. See above for rewriting applications to minimize heap and GCs.

Garbage Collection JMX Notifications Example Code
Blade: A Data Center Garbage Collector
How to Tame Java GC Pauses? Surviving 16GiB Heap and Greater
SO: Garbage Collection Notifications
Letting the Garbage Collector Do Callbacks
How to force garbage collection in Java?
SSL Termination, Load Balancers & Java
Github: Measuring Java Memory Consumption – sample code
Java is not “angry” with you.
Set State to DRAIN vs set weight 0
Scalable web applications [with Java]
Examples of forcing freeing of native memory direct ByteBuffer has allocated, using sun.misc.Unsafe?
Lucene ByteBuffer sample code
Improve availability in Java enterprise applications
The Four Month Bug: JVM statistics cause garbage collection pauses
Memory management when failure is not an option


CASSANDRA-5345: Potential problem with GarbageCollectorMXBean

Posted in Cassandra, GC Pauses, Microservices, Open Source, Oracle, REST API Programming, Storage, Tech

I found jMeter, however, really easy to use.

So, let me get this straight:

  • Java is not safe for use in servers because of GC pauses.
  • And it’s not safe for use in clients because of GC pauses.

Doesn’t leave much left! :)

Thanks to Greg Lindahl, founder of Blekko, for making my day. You’re The Man when it comes to performance!

Another good one that makes you pause:

  • Me at ApacheCon 2009: “So how do you like programming in Java?”
  • Random Attendee in Wifi tables area: “It’s great. Not sure why people gripe about memory consumption.”
  • Me: “Really. Show me your Java app.”
  • Random Attendee: “Well, my Macbook Air doesn’t have enough RAM.” :)


Apache jMeter
Distributed Testing with JMeter on EC2
Dan Luu: HN comments are underrated HN comments

Posted in API Programming, Conferences, GC Pauses, Open Source, Tech

REST API Client Computer Languages and Frameworks Survey

I recently wrote REST API client programs in several programming languages as a subproject of my Perl REST API Framework, and had some surprises, both good and bad.

I would have gladly just linked to somebody else’s sample clients, but I couldn’t find any remotely complete or professional-grade code (complete working program with error-handling, Basic auth and timeout.)

The closest to useful REST clients that I saw were the Java tutorials by, RESTful Java client with Apache HttpClient

Here are my notes:


Java

  • tough to find a working HTTP class for Java 1.8 on CentOS 7. I couldn’t get the Apache HttpClient imports working, so I ended up using HttpURLConnection
  • first experience with immutable data collections – quite jarring to realize you have to copy anything to change it. And an int is not an Integer, and a String is not a StringBuffer. Hahaha, good one!
  • somewhat of a learning curve for the Java build process. See for a minimal build tool
  • lint: javac -Xlint:all
  • I wouldn’t be surprised if Java’s legendary slowness and memory bloat are from the above issues, obvious even from a 200-line program.


Python

  • used the requests HTTP module
  • felt comfortable until running Pylint and discovering how freaky the Python community can get (scoring my working program -1.5/10, but giving it 10/10 after whitespace-only changes. Really?)
  • nice indenting: -r -s 4 -e


Ruby

  • used the httparty HTTP framework
  • elegant, beautiful OO code without even trying.


PHP

  • got it working the fastest, but then took longer to polish it
  • wish there was a lint for PHP


  • not bad – straight-forward to do various requests and get responses


  • inadequate for REST API programming, more for manually fetching files only


Perl

  • LWP is mature, well-documented and readily available, and made Perl the easiest scripting language to work with overall
  • Perl’s built-in lint checking (strict and diagnostics) is much appreciated after its lack in PHP, Python and bash.

RESTful Java client with Apache HttpClient
Why Pylint is both useful and unusable, and how you can actually use it
Notes on Managing Java in the Cloud
Static typing will not save us from broken software
OpenFeign Java HTTP Client Library

Posted in API Programming, Linux, Microservices, Open Source, REST API Programming, Tech

PagerDuty Summit Conference 2016 SF

I went to the complimentary PagerDuty Summit on Sept. 13 on Market Street in SF.

The well-organized conference format was 2 tracks downstairs, with breaks and a small expo area upstairs.

Andrew Fong of Dropbox had a very good talk on their struggle to go from four 9’s (“can use tactics”) to five 9’s (“has to be strategic”.) Their solution was to have a working group composed of anybody who wanted to contribute, across departments. (Not dedicated HA staff.)

Andre Kelly of Google talked about having well-defined post-mortem processes in place now to capture outages in an organized manner and data mine the results over time later.

Apparently there are some popular Open Source post-mortem systems for that. Please leave a comment if you have any experience with those.

Sean Reilley of IBM discussed people issues in communicating agile across a large company with pockets of staff who were used to waiting for permission (ie. not inherently agile.)

Upstairs, the mini-expo seemed to have a couple booths for security-related start-up Cloud products, Datadog, plus a booth for PagerDuty itself to do customer demos and get beta feedback.

Sketch of New PagerDuty Incident Timeline Visualization Tool

The money shot was seeing their new beta graphical incident timeline, to be released in November, which made the trip worthwhile. Until then, you can enable HTML emails for a slightly richer experience.

The historic “Village” venue was not my favorite: climbing up and down steep stairs with a backpack got old fast.

Conference Videos

Posted in Conferences, Tech

Eye of Hurricane Matthew

Eye of Hurricane Matthew

Posted in Tech

John Collins “The Paper Airplane Guy”

CNN linked to a video of John Collins, “The Paper Airplane Guy.”

John holds the world record for paper airplane distance throwing.

I had a chance to see John live recently when he gave a lecture and demo at my office in Silicon Valley.

It was a unique experience:

  1. John is a fun lecturer who really knows aerodynamics and can explain it clearly to both kids and adults
  2. learning the art of making high-performance airplanes was great fun.

I hold a commercial airplane licence and can say that he really knows his stuff. Highly recommended.

Posted in San Jose Bay Area, Tech, Toys

Does Software Rot?

Back in the day, Joel wrote an infamous post asserting that “software doesn’t rot” over time.

I believe Joel was addressing the tendency of new programmers on a project to avoid learning the old codebase and write a new one instead, at great cost in terms of time and money.

But let’s discuss the more interesting topic of whether software can actually rot.

I would say that he was correct in a very narrow sense, namely a program written for a single version of Windows.

But in the big picture, he was completely wrong. Even Windows software requires re-writes for “Certified for Windows” assurance for new shrink-wrapped versions to be shelved in US chain stores. (Stores were trying to reduce the rate of returns and customer support.)

And how’s Silverlight, discontinued in 2012, working out for developers? :)

When it comes to web software, total re-writes have been required for:

  • mobile
  • XML and JSON output
  • new Javascript frameworks.

Posted in API Programming, Open Source

Perl Petstore Enhanced REST API Framework

I’ve been doing a lot of work with REST APIs and microservices, so I decided to write a complete REST API framework in Perl based on the Mojolicious and Swagger2 Petstore sample.

You can git clone the repo and add a new API endpoint in about 5 minutes with automatic parameter validation and documentation:

git clone
cd perl-petstore-enhanced/pets
less ../
vi api.spec cgi-bin/pets.cgi ./lib/Pets/Controller/
# add an Alias for cgi-bin/pets.cgi to httpd or nginx
# point your browser at
# Good job. Have a Modelo! :)

or you can spend an hour to rename the files for your project and tweak it to requirements.

This project serves as a convenient bridge for those who:

  1. can write simple CGI programs and want to write a best-practises Swagger (OpenAPI) REST API server without climbing a steep learning curve, or
  2. want to write a quick proof-of-concept API server to be re-implemented in other languages or frameworks later, as your Swagger spec file is 100% reusable
  3. are targeting a small VM. This will work in a 2 GB RAM VM just fine, or on an existing server running httpd or nginx.

Also of note is the samples/ folder, which has non-trivial client programs in several languages (bash, Java, Perl, PHP and Ruby.)

I learned the importance of Swagger2 and auto-generated API documentation and validation when I was programming with the old Rackspace Cloud v1 and v2 APIs.

People asked me, “How did you get anything to work? You must have really wanted it!” since the Rackspace sample code, docs and live API didn’t match each other. My secret: I actually guessed URLs in the browser to find the endpoints I needed. Swagger prevents that headache.

Posted in API Programming, Open Source, Perl, Tech

Hawaii Trip 2016 – What’s New in Waikiki

Spent Labor Day weekend in Waikiki.

I enjoy going there every few years and seeing what’s new.

However, it’s been completely built out as a mall, so looks kind of corporate now. To combat that, plan to climb Diamondhead and go to the zoo.

Also, who would fly a quadcopter drone at one of the most crowded beaches in the world? Not surprised, just saying.

So what’s new in Waikiki?

  • Two hurricanes were approaching the Islands, but as usual did not make landfall on Oahu
  • Not very busy, likely because of the hurricane news
  • International Marketplace is now a shiny mall that opened Aug. 25. It is anchored by Saks Fifth Avenue, and has the only public restrooms in Waikiki now. It has plaques to remember the mom-and-pop stores they bulldozed.
  • Kalakaua is also a giant hand-bag mall for Japanese tourists
  • Matteo’s Italian (and Seafood) at Seaside and Kuhio closed, and a Crackin Kitchen Seafood opened next door
  • 24 Hour Fitness is charging $25 for a day-pass on Kalakaua, but it does have a beach view
  • Free Kuhio Beach Hula Show (Waikiki) is 6:30 pm Tues/Thu/Sat – features two dozen performers! Bring your own towel or beach chair to sit on, practise your photography.
  • 100 Japanese people were lined up outside Marukame Udon on Kuhio one night at 9 pm. Must be pretty good. Next door is a souvenir shop with the most awesomely tacky items. If you need a hula dancer for your car, get shopping.
  • Princess Kaiulani Hotel buffet ($42/person) still has free Hawaiian music and hula show downstairs, and a very good Polynesian show/dinner upstairs. (They cancelled the downstairs show at least one evening because of Hurricane weather reports.)
  • McD still serves the free pineapple cup with combos, and also offers taro pie – very sweet. They charge $10 for a combo, but you can get a BOGO Big Mac on Mondays and they have a Pick Two special, and they do have drink refills and wifi
  • Duke’s Restaurant is still packed, but the Hula Grill ($60/person) upstairs doesn’t have a wait list. Has restrooms.
  • TheBus is $2.50 per trip now, or $35/4-day tourist pass available in ABC Stores. The Waikiki Trolley is only $2/trip between Waikiki and Ala Moana and the open air cars are good for photography and sight-seeing
  • Lots of hotel and residential construction cranes
  • Flew American Airlines there – they served biscuits instead of meals, and had no entertainment systems. They ran the APU for one hour while finding pilots. Dreadful experience, but this is a USA airline, so I’m being redundant.
  • Disney Aulani is not a theme park – it’s a time-share (ie. scam) with a few hotel room rentals for $450/nite in the middle of nowhere. ok if you’re a large family that wants to cocoon, maybe.
  • if you go on a boat tour of any kind and want to have fun, buy the cheapest tickets or you’ll be stuck with grandparents

Waikiki photo vantage points:

  • beach sunsets
  • surfboard stands
  • rescue canoes
  • Kuhio Hula Show (Tues/Thurs/Sat at 6:30 pm)
  • street performers
  • Diamondhead
  • Honolulu Zoo

If you’re from the mainland, remember that Hawaii is hot and humid. Stay hydrated, wear a hat, and don’t over-exert yourself – especially around noon-time.


Posted in Travel

Re: Botched Go-around Appears To Have Led to Emirates 777 Crash

As a commercially-rated airline pilot who reads accident reports, I always tingle when I fly on anything but a USA majors flight in less than perfect weather.

The recent Emirates 777 crash in Dubai is a case in point.

The airliner, with 300 people aboard, crashed into the runway with a sink rate of 900’/minute, and later the center-tank exploded, killing one firefighter. 22 pax and FAs were injured descending the slides (typically, several people are injured during a slide evacuation.)

It’s important for pilots to always be mindful that a landing approach can end in two ways:

  1. landing
  2. go-around

Though it would take a lot of painstaking research to say where this particular flight started going wrong, we do know some of the links in the “accident chain”:

  1. wind shear from 8 knots headwind to 16 knots tailwind. Depending on when the pilots learned this, their spidey-sense should have been off the scale – i.e. requesting a hold, a go-around or another airport. I also wouldn’t use the autopilot in wind shear because judgment is needed to manage the throttle in that situation
  2. long landing – aim point in an airliner is 1000′, but they had a 1,100 meter (3,609′) warning. If they couldn’t start a normal landing at 1,000′, it was time to seriously think about a go-around
  3. late go-around – if you’re over the runway at idle and 5′ in a wind shear with your gear down, you probably should just land. What were the pilots thinking here? Were they blindly following ATC or book procedures when they really needed piloting skill?
  4. late TOGA power – jet engines take about 6 seconds to produce useful lift, the pilots tried 3 seconds. Do the math.
  5. foreign airline and pilots – for some reason, they’re often not up to challenging weather. They seem to be more interested in epaulets than aerial mastery. I’d suggest making them fly this flight profile in the sim before graduation. Or is the extra $5,000 in fuel for a go-around a career-limiting problem?

Taken together, obviously nobody with a clue was in the cockpit that day. I would rank this accident as bad as the TransAsia GE235 “Oops, I shutdown the good engine” accident in Taipei.

Botched Go-around Appears To Have Led to Emirates 777 Crash

Posted in Tech, Travel

Farewell to Prince

Disbelief at the death of Prince at the relatively young age of 57.

Prince was a musical genius, certainly one of the giants of this century – he wrote, sang, was a virtuoso of 2 dozen instruments, and played guitar at the level of Jimi Hendrix.

He could perform with everybody, or nobody, yet chose to mentor female musicians, introducing them by name in his shows.

I saw one of his shows, but wish I had gone to more.

For business reasons, he never allowed his catalog on YouTube, but there’s a few links from TV performances that indicate his brilliance and show him “bringing the funk”:

Prince & 3RDEYEGIRL Perform ‘She’s Always In My Hair’
Prince Saturday Night Live Full Performance (2014)
Prince playing piano over ‘Summertime’ at Soundcheck, Koshien, Hyogo Prefecture (1990)
PRINCE BET Interview with Tavis Smiley (1998)
Prince’s vision for lifting up black youths: Get them to code
Prince’s Death: Latest News
W: Prince

Posted in Tech

Weekend of Earthquakes

There were a few major earthquakes this weekend:

  • Ueki, Japan – 6.2 (foreshock)
  • Kumamoto, Japan – 7.0
  • Ecuador – 7.8

Hundreds of aftershocks have occurred in Japan.

You would think that Californians, of all people, would be concerned with earthquake safety, but the LA Times has reported on a building safety cover-up involving thousands of older schools and office buildings which will pancake in a major quake.

How Risky Are Older Concrete Buildings?
LA Times FAQ: Concrete buildings, earthquake safety and you
Non-ductile Concrete Buildings

Posted in Tech

Congrats to SpaceX on Ocean Landing

I used to write telemetry collection software for the Space Shuttle, rockets and balloons, but even I watched the SpaceX barge landing with disbelief as the rocket smoothly rotated in all 11 or so degrees of freedom at the same time – no hesitation or staging before the touchdown.

It was like watching a really big lawn dart plant itself. :)

twitter: SpaceX
HN: SpaceX Launch Livestream: CRS-8 Dragon Hosted Webcast
theRegister: SpaceX finally lands Falcon rocket on robo-barge in one piece, SpaceX’s Musk: We’ll reuse today’s Falcon 9 rocket within 2 months

Posted in Tech

MH370 Debris Illuminates Crash Reasons

A few pieces of MH370 have recently been found on a Mozambique beach, and confirmed as authentic parts.

Their excellent condition and relatively large sizes indicate that the accident wasn’t a high-speed impact with an obstacle or water.

As a commercially-rated airplane pilot, my opinion is that leaves:

  1. explosion or decompression
  2. descent (or phugoid) into ocean at relatively low speed
  3. “graveyard spiral dive” pulled the wings off.

MH370 Debris Storm
Tourist who found debris was searching for MH370
Turbulence V-Speeds
Australia Confirms Mozambique Debris Came from MH370
MH370: Debris found in March ‘almost certainly’ from missing plane

Posted in Tech

Congrats to LIGO Team

Congrats to the Laser Interferometer Gravitational-wave Observatory (LIGO) team for directly detecting gravitational waves for the first time.

LIGO was the NSF’s most expensive project, and took scientists basically from the 1960’s to 2015 to fully realize – initially nobody believed it was possible to actually build this instrument.

Two detector locations with perpendicular 4 km 4-mirror laser interferometers were able to detect gravitational waves from a billion-year-old black hole collision:

Direct Gravitational Wave Measurement of Two Black Holes Merging in 1/10 of a second!

(The speed of light is a constant, while gravitational waves distort space, thus changing an interference pattern.)

Basic science is always valuable, but just a few of the reasons why this experiment is important:

  1. confirm the equations for gravitation originally proposed by Einstein in 1915 in the General Theory of Relativity (GTR)
  2. confirm experimentally that light and gravitational waves have different propagation characteristics
  3. develop the technology to make observations at the sub-proton level
  4. study large-scale cosmic events (black holes, colliding galaxies, supernova, binary star systems)
  5. study the time of the Big Bang, as gravitational waves are not filtered like EM waves
  6. confirm or deny cosmic observations and theories made in the EM spectrum, and provide advance notification of occurring events for study in the EM spectrum.

More generally, measurement tools are the highest form of technology, whether for time, space, EM, or gravity. Any investment of time or money in measurement tools is easily repaid 1000x. For example, the GPS system is the result of accurate time measurement using “atomic clocks.”

This decade is an exciting time for science, as several major terrestrial and space instruments come online or are upgraded.

It will be interesting to see if anybody develops a table-top model of LIGO. Experiments in the 60’s with non-laser methods were susceptible to ambient vibrations, but we’ll see.

Gravitational Waves Detected 100 Years After Einstein’s Prediction
Reddit AMA

Posted in Tech

Superbowl 50

I watched Superbowl 50 in Sunnyvale – a nice spring-like day with blue skies.

Got a bonus show: I was just going inside as the Blue Angels did a low-altitude formation flyover, followed by a couple solo approaches, toward Levi’s Stadium.

Denver Broncos over Carolina Panthers 24 – 10, with Denver leading the entire game.

Cam Newton, QB for Carolina, got sacked, to varying degrees, 6 times. He sore-loser sulked during the post-game interviews, which generated a lot of controversy.

Peyton Manning, the Broncos’ QB, got his win (linebacker Von Miller was named MVP), amidst the usual narcissistic drama of whether he’d retire on top, or not.

The turf came under scrutiny, as some linebackers were literally sliding across it.

Halftime Show

Beyonce, looking thick, Bruno Mars, nice moves in a rubber suit, and Coldplay (woefully) performed. Must have been a nostalgic Brit on the halftime committee I guess.

According to the media, Beyonce was doing a Black Power protest, but the show wasn’t particularly different than anything MJ or Janet did. And frankly, I wouldn’t blame her if she did.


Most of the ads were forgettable.

The Amazon ad with Baldwin and Marino was ok.

There were a few annoying prescription ads, though the cartoon intestines with feet one was more than weird.

Municipal Sports Stadium Corruption

I’m local to the Levi’s Stadium, so am aware of the endless tales of corruption (lack of accounting to City Council, failure to make public service reimbursements, destruction of meeting notes and emails, mis-appropriation of a kids soccer park, etc.)

But even I was surprised that the local transit authorities “privatized” the Caltrain and VTA Light Rail for the day, requiring a Super Bowl 50 ticket and a special $40 ticket per passenger to use a taxpayer-funded system. Hmm.

“Event Passengers Must Pre-Purchase VTA Fare Prior to Boarding

All passengers traveling to the Super Bowl must use VTA’s mobile app, EventTIK to purchase a special VTA Super Bowl 50 Day Pass fare AND possess proof of a valid Super Bowl ticket in order to board the special Super Bowl trains.”

Mr. York: next time, pay for your own damn stadium. You can afford it.
Formation Flyover Photo

Posted in San Jose Bay Area

Babbage’s Difference Engine at Computer History Museum

Today was the last chance to see Babbage’s Difference Engine at the CHM in Mountain View before the owner makes it private again.

The Computer History Museum has certainly matured into a world-class museum over the years.

The docent talked for about 45 minutes. Unfortunately, it was displayed at the end of a hallway. So 100+ people with kids and strollers jostled to get a view.

It’s very impressive in person: it consists of 8,000 parts, weighs five tons, and measures 11 feet long. It is moderately noisy and mesmerizing to watch. The cranker used a moderately strong rowing motion.

Babbage, in building the first computer, did not have the hindsight to start with a smaller version first. Thus he never finished building a working model despite a decade of funding from the British government and the remaining days of his life working on it.

CHM did a fantastic job on the DEC PDP-1 and IBM 1401 display rooms. Only about 50 PDP-1’s were made, so to have a working model is amazing.

Posted in Tech

Congratulations on HondaJet USA Certification

Congrats to Honda for earning FAA Production Certification for their first aircraft, the HondaJet HA-420 light business jet.

I’ve been following the news of the HondaJet for over a decade as they progressed step-by-step towards certification.

The HA-420 is the most technologically advanced, fastest (420 knots) and most efficient (by up to 20%) small business jet currently certified. Of interest to owner/operators, it may be flown single-pilot.

The price is $4.5 million, which Honda can finance.

The creation of the HondaJet is an epic story, starting with Honda’s founder dreaming of building an airplane several decades ago, and establishing design facilities 2 decades ago in the USA.

A jet engine, the GE Honda HF120, was also certified for this plane.

The total investment to certify both an airframe and an engine must have been staggering to get to this point. Only a multinational mfg. company with support from top executives like Honda can pull that off in peace time.

Even so, aviation is a tough business to make money in, especially as a new entrant.

Japanese companies have a long history of interesting work in aerodynamics. Both the Battleship Yamato and Bullet Train used duck-bill shaped leading airfoils for significant drag reduction. The HondaJet developers likewise use laminar flow nose (see top photo) and wings, and winglets (see second photo.)

According to a review by a friend of Philip Greenspun, the airplane has some issues: interior noise in the passenger compartment is 6 dB too high, useful load is only 573 pounds with full fuel, and a 4000′ runway is needed. Also, a lot of pilot ergonomics that should have made it in didn’t. Finally, the high price is comparable to the next class up, which is much roomier and has more comfortable useful loads.

yt: Kenny G Live at the HondaJet TC Event with Mr.Fujino,
HondaJet FAA Type Certification Celebration HondaJet Wins FAA Certification
HondaJet Nominated for 2015 Collier Trophy
W: HondaJet HondaJet Pilot Review

Posted in Tech, Toys

TAP Plastics Mountain View

Although I’ve walked by TAP Plastics on Castro St. in Mountain View a hundred times, today was the first time I went inside.

Their motto “the fantastic plastic place” is accurate.

They have specialized in plastics sales since 1952 and have 21 stores.

  • marketing, signs and displays
  • collectibles displays
  • marine
  • fiberglass laminate supplies
  • custom design (linear, not vacuum forming)

Their web site is a gem, supporting 9 languages using Google Translate.

TAP Plastics Inc.
312 Castro Street
Mountain View, CA 94041

Posted in San Jose Bay Area

HOWTO: CentOS 7/Redhat 7 Firewalld Setup for Cassandra Server

How to do initial firewalld configuration for Cassandra Server and Opscenter on CentOS/Redhat 7 with 2 network interfaces, in my case Dell 1950/2950.

1. Verify that your network interfaces are associated with a NetworkManager zone:

# grep -i zone /etc/sysconfig/network-scripts/ifcfg-*
# service network restart

2. Add the Cassandra ports to the internal zone (private interface) and the web and Opscenter ports to the public zone (public interface):


# add ports on internal interface for Cassandra server

firewall-cmd --zone=internal --add-port=7000/tcp --add-port=7199/tcp --add-port=9042/tcp --add-port=9160/tcp --add-port=61619-61621/tcp --permanent

# add ports on public interface for Cassandra server

firewall-cmd --zone=public --add-port=80/tcp --add-port=8888/tcp --permanent

firewall-cmd --reload

Edit the files in /etc/firewalld/zones to remove the desktop helper services, then do

service firewalld restart
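
The firewall-cmd calls above can also be generated from a small port map, which helps when the same rules go onto several nodes. This is only a sketch in Python 3: the `ZONE_PORTS` dictionary and the function names are hypothetical, and it prints the commands rather than running them.

```python
import shlex

# Hypothetical zone -> ports map mirroring the commands in this post.
ZONE_PORTS = {
    "internal": ["7000/tcp", "7199/tcp", "9042/tcp", "9160/tcp", "61619-61621/tcp"],
    "public": ["80/tcp", "8888/tcp"],
}


def firewall_cmd(zone, ports):
    """Return the argv list for one permanent firewall-cmd call."""
    argv = ["firewall-cmd", "--zone=" + zone]
    argv += ["--add-port=" + port for port in ports]
    argv.append("--permanent")
    return argv


def build_commands(zone_ports):
    """Build one command per zone, followed by a final reload."""
    commands = [firewall_cmd(zone, ports) for zone, ports in zone_ports.items()]
    commands.append(["firewall-cmd", "--reload"])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in build_commands(ZONE_PORTS):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing first makes it easy to review the rules before pasting them into a root shell.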

3. Verify configuration:

firewall-cmd --get-active-zones
firewall-cmd --zone=public --list-ports
firewall-cmd --zone=public --list-services
firewall-cmd --zone=internal --list-ports
firewall-cmd --zone=internal --list-services

Output is:

# firewall-cmd --get-active-zones
interfaces: enp4s0
interfaces: enp8s0

# firewall-cmd --zone=internal --list-ports
7000/tcp 7199/tcp 9042/tcp 9160/tcp 61619-61621/tcp

# firewall-cmd --zone=internal --list-services

# firewall-cmd --zone=public --list-ports
80/tcp 8888/tcp

# firewall-cmd --zone=public --list-services

4. Verify firewall rules with nmap:

# nmap -sS

Starting Nmap 5.51 ( ) at 2015-10-15 22:34 PDT
Nmap scan report for
Host is up (0.075s latency).
Not shown: 997 filtered ports
22/tcp open ssh
80/tcp open http
8888/tcp open opscenter

Nice! :)


As always, if you experience network issues on Linux, temporarily disable SELinux, firewalld and TCP wrappers first to check whether they are the source of the problem:

setenforce 0
service firewalld stop
cat /etc/hosts.*

Fedora introduces Network Zones Network Zones

Posted in Cassandra, Linux, Open Source, Storage, Tech | Leave a comment

Notes on Virtualbox 4.3.30 and OS X 10.8.5 for CentOS 7

Virtualbox 4.3.30 on OS X 10.8.5 with CentOS 7 guest VMs works OK on my notebook for web development, but setup was a little fussy.

I use VMs for:

  1. general web development and testing, to stay off the production environment
  2. destructive performance testing (intrusive changes to source code and configurations that require VM rollback to undo, most of which will never be committed.) This is great for work on profiling, i18n, caching, mod_rewrite rules, etc.
  3. accelerating automation testing, since a VM can boot in 10 seconds on my Mac with SSD, and VM creation is scriptable. This is a huge win.
  4. working offline (no-Wifi areas.)


  • “Host” is your Mac notebook. It runs Virtualbox under Mac OS X.
  • “Guest” is the VM running under Virtualbox. A guest can be any operating system, but in this case we’re using CentOS 7.x.

Getting Started

  • check Internet for known software issues first
  • update to the latest version of Virtualbox

Choose Network Topology

I wanted to run my web site in a VM, viewable from the Mac browser, and have the VM be able to run ‘yum update’, so I needed host => guest and guest => Internet routing. There are 2 networking choices that match those requirements:

  1. Bridged – easiest and works best if a Mac network adapter is always connected, like in the office, or at home if your Wifi access point is always on
  2. NAT – always works, but you have to NAT from host => guest (ie. => You can use Mac’s ipfw or ipf firewalls to then NAT from 80 to 8000, making it seamless:

    sudo ipfw add 100 fwd,8080 tcp from any to any 80 in


  • under “Machine … Settings”, choose “Bridged Adapter”
  • guest IP address will come from Virtualbox DHCP server, usually the guest IP address is
  • on the host, you just use the guest’s real IP address from above
  • if you bridge to the Airport interface (en0), and the host Wifi is off, you lose your guest lease (ie. no routing inside or outside guest VM)
  • binds to a host’s physical interface (conceptually speaking)
  • no NAT needed or available in Virtualbox settings
  • the Virtualbox DHCP address is 192.168.x.100


  • under “Machine … Settings”, just choose NAT, not “NAT Network”
  • guest IP address will come from Virtualbox DHCP server, usually or
  • host IP address will be (NATTed to guest address above)
  • click on “Port Forwarding” button and use host ports above 1024 (usually 2222 for ssh and 8000 for HTTP)
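
The same port forwarding can be scripted with VBoxManage instead of the “Port Forwarding” dialog. Below is a minimal Python 3 sketch that only builds the command line. The VM name `centos7` and the guest ports 22 and 80 are my assumptions; the host ports 2222 and 8000 come from the list above. As far as I know, `modifyvm` applies to a powered-off VM.

```python
def natpf_rule(name, host_port, guest_port, proto="tcp"):
    """Format one NAT rule as name,proto,hostip,hostport,guestip,guestport.

    Host and guest IPs are left empty so Virtualbox uses its defaults.
    """
    return "%s,%s,,%d,,%d" % (name, proto, host_port, guest_port)


def modifyvm_args(vm_name, rules):
    """Build a VBoxManage argv that adds the rules to the first NAT adapter."""
    argv = ["VBoxManage", "modifyvm", vm_name]
    for rule in rules:
        argv += ["--natpf1", rule]
    return argv


if __name__ == "__main__":
    rules = [
        natpf_rule("ssh", 2222, 22),    # host 2222 -> guest 22 (assumed guest port)
        natpf_rule("http", 8000, 80),   # host 8000 -> guest 80 (assumed guest port)
    ]
    print(" ".join(modifyvm_args("centos7", rules)))  # "centos7" is a hypothetical VM name
```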


  • the Virtualbox manual is a reference, not a tutorial. After reading this blog post, the manual is useful to fill in details.
  • disable CentOS 7 firewall with ‘service firewalld stop’
  • view CentOS 7 interfaces with ‘ip a’
  • if one networking topology doesn’t work for you, try another. No need to reboot the VM.
  • if you spend more than an hour without success, try VMware Fusion. It covers my use case automatically.


  • do ‘tail -f /var/log/messages’, disable “Cable Connected”, click “OK”, and watch as DHCP lease is lost. Then click on “Cable Connected”, click “OK” to restore
  • if using Bridged on en0, do ‘tail -f /var/log/messages’, do “Turn Wi-fi Off” on Mac, and watch as DHCP lease is lost. Then turn Wifi back on.

Network Security

  • use strong passwords if you value what’s inside the VM
  • enable guest firewall with ‘service firewalld start’
  • TCP wrappers is an easy and effective filtering method



    ALL: ALL

Simulating Production

You can update /etc/hosts to have your browser access your web site in a VM:


# Bridged

But I find that Firefox gets less confused with permanent redirects, etc. by prefixing the hostname:


# Virtualbox NAT Topology (don't forget to use ports 2222 and 8000 from host => guest!)
# Virtualbox Bridged Topology


Take advantage of Virtualbox’s clone and snapshot features.

What does “Cable connected” checkbox change?
Port Forwarding in Mac OSX Mavericks
Port Forwarding in Mac OS Yosemite

Posted in Linux, Open Source, Oracle, Tech | Leave a comment

Percona Clustercheck Improved Error Handling Patch

Here’s my Github pull request for improved error handling in Percona’s clustercheck utility, used by haproxy for health-checking a Percona XtraDB Cluster.

It adds two features:

  1. 401 Unauthorized response for failed authentication
  2. 404 Not Found response if the mysql program can’t be found

The error detection is done in a low-latency manner using PIPESTATUS, without an additional database connection. Here is colored diff output.
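
The patch itself is bash and relies on PIPESTATUS, so the following is not its code. It is a rough Python 3 sketch of the decision logic only, under my own assumptions: exit status 127 is the shell's "command not found" convention, a failed login shows up as MySQL error 1045 on stderr, and a `wsrep_local_state` of 4 (Synced) means the node is healthy. The function name is hypothetical.

```python
def health_status(exit_code, stderr_text, wsrep_local_state):
    """Map the outcome of one mysql client call to an HTTP status for haproxy."""
    if exit_code == 127:
        # The shell could not find the mysql program at all.
        return 404, "Not Found"
    if "ERROR 1045" in stderr_text:
        # Access denied: the clustercheck credentials are wrong.
        return 401, "Unauthorized"
    if exit_code == 0 and wsrep_local_state == 4:
        # Node is synced with the cluster.
        return 200, "OK"
    return 503, "Service Unavailable"


if __name__ == "__main__":
    print(health_status(0, "", 4))
    print(health_status(1, "ERROR 1045 (28000): Access denied for user", None))
    print(health_status(127, "mysql: command not found", None))
```

Because every answer comes from the exit status and stderr of the single client call, no second database connection is needed.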

Posted in API Programming, Linux, MySQL, MySQL Cluster, Open Source, Tech | Leave a comment

SVLUG: Daniel Klopp on Docker

At the Silicon Valley Linux Users Group (SVLUG) tonite, Daniel Klopp, Senior Technical Consultant at Taos Consulting, gave an intermediate talk on “Docker.”

He had some really informative and detailed slides on using Docker, especially his sample cgroup commands.

Some of the interesting things he mentioned were:

  1. cgroups are nested
  2. Docker currently has a limit of 127 “layers”, with prior layers appearing to be read-only to the current layer
  3. Docker is high-level enough to run on multiple operating systems, including both linux and windows

Daniel Klopp

One attendee mentioned that a work-around for the insecure nature of Docker is to combine it with SELinux, though that will involve a fair amount of work.

Over 400 people RSVPed on a related Meetup, and over 150 people attended, a record for this decade.

Pasta Spread

Great turnout!

Pasta Spread

Salad, meat lasagna, pasta alfredo, veggie lasagna from Taos!

Thanks to Taos for providing food for all. Taos has job postings for sys admin, network admin, devops and help desk IT persons.

Thanks to Symantec once again for hosting the event.

Posted in API Programming, Cloud, Linux, Open Source, Tech, User Groups | Leave a comment

Top Utility for Cassandra Clusters – cass_top

DataStax’s OpsCenter is pretty, but sometimes you don’t want to chop holes in your firewall for the server and agents.

So I wrote cass_top. It works like top, but colorizes the output of nodetool status. It also lets you build nodetool commands using menus, run and log the output.

What’s especially nice is that it uses bash (no python required), and uses minimal screen real estate, so you can view all your clusters on one monitor using eterms.
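
cass_top itself is written in bash, so the snippet below is not its source. It is a small Python 3 sketch of the colorizing idea, assuming the usual two-letter codes at the start of each node line in nodetool status output (U/D for up/down, then N/L/J/M for the state). The color choices are mine.

```python
GREEN, YELLOW, RED, RESET = "\033[32m", "\033[33m", "\033[31m", "\033[0m"


def colorize(line):
    """Wrap a nodetool status node line in an ANSI color; pass other lines through."""
    code = line[:2]
    is_node_line = (
        len(line) > 2
        and code[0] in "UD"
        and code[1] in "NLJM"
        and line[2].isspace()
    )
    if not is_node_line:
        return line  # header, legend or blank line
    if code[0] == "D":
        color = RED      # node is down
    elif code == "UN":
        color = GREEN    # up and normal
    else:
        color = YELLOW   # up, but leaving, joining or moving
    return color + line + RESET


if __name__ == "__main__":
    sample = ["Datacenter: datacenter1", "UN  71.87 KB   256", "DN  136.63 KB  256"]
    for row in sample:
        print(colorize(row))
```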

$ cass_top

cass_top Screenshot
cass_top Help Screenshot

Please leave a comment with your suggestions.

github: Cassandra Top cass_top

Posted in Cassandra, Linux, Storage, Tech, Toys | Leave a comment

MariaDB Patch: CREATE [[NO] FORCE] VIEW Options

Below is my patch that implements the CREATE [[NO] FORCE] VIEW options against MySQL/MariaDB 10.1.0.

It adds two new options that look like this:

  1. CREATE NO FORCE VIEW v1 AS SELECT * FROM TABLE1; — base TABLE1 must exist, as before
  2. CREATE FORCE VIEW v1 AS SELECT * FROM TABLE1; — base TABLE1 doesn’t need to exist


  • these options follow the Oracle Enterprise options fairly closely. NO FORCE works like the old default – a user needs database, table, column access and CREATE VIEW grant to create a view (more or less). FORCE allows a user to create a view with only database access and CREATE VIEW grant and no underlying base table. At SELECT time, full access control and grant checking is performed, and an error will occur if those constraints are not met.
  • views are more complicated than one would expect, and can be composed of base tables, derived tables, INFORMATION_SCHEMA (IS), and other views. The only table object not allowed is a temporary table
  • CREATE FORCE VIEW is an important option when managing large sets of views when you don’t want to track the creation sequence, or when creating views via program. An example is mysqldump, which can be simplified by replacing the current temporary tables ordering workarounds with FORCE VIEW.
  • It’s a fairly solid patch. I think the best thing is to commit it to alpha and let it bake for a while.
  • One permutation that will need special handling is this: CREATE FORCE VIEW view1 AS SELECT * FROM table1; Since * is not resolved to column names by FORCE, ” AS SELECT * AS ” is currently generated, causing an error. So just use explicit column names, like CREATE FORCE VIEW view1 AS SELECT id, col1, col2 FROM table1; See this bug.
  • it passes t/view.test:
    # ./ view
    Logging: ./  view
    vardir: /usr/local/mariadb-10.1.0/mysql-test/var
    MariaDB Version 10.1.0-MariaDB-debug
    TEST                                  RESULT   TIME (ms) or COMMENT
    main.view                            [ pass ]   1896
    The servers were restarted 0 times
    Spent 1.896 of 7 seconds executing testcases
    Completed: All 1 tests were successful.
  • I wrote tests/ which does 8,000+ test permutations. It passes. :)
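
When views are created by a program, the explicit-column workaround above is easy to apply automatically. Here is a minimal Python 3 sketch; the function name is hypothetical, the FORCE keyword only works on a server with this patch applied, and the identifiers are assumed to be trusted (no quoting or escaping is done).

```python
def create_force_view_sql(view, table, columns):
    """Build a CREATE FORCE VIEW statement with explicit column names.

    Refuses an empty column list, since that would tempt the caller
    to fall back to SELECT *, which FORCE does not resolve.
    """
    if not columns:
        raise ValueError("explicit column names are required")
    return "CREATE FORCE VIEW %s AS SELECT %s FROM %s;" % (
        view, ", ".join(columns), table)


if __name__ == "__main__":
    print(create_force_view_sql("view1", "table1", ["id", "col1", "col2"]))
```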

$ cat create_force_view.patch

--- ../mariadb-10.1.0/sql/sql_view.h 2014-06-27 04:50:36.000000000 -0700
+++ sql/sql_view.h 2014-09-02 02:35:42.000000000 -0700
@@ -29,10 +29,10 @@
/* Function declarations */

bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
- enum_view_create_mode mode);
+ enum_view_create_mode mode, enum_view_create_force force);

bool mysql_create_view(THD *thd, TABLE_LIST *view,
- enum_view_create_mode mode);
+ enum_view_create_mode mode, enum_view_create_force force);

bool mysql_make_view(THD *thd, File_parser *parser, TABLE_LIST *table,
uint flags);
--- ../mariadb-10.1.0/sql/sql_lex.h 2014-06-27 04:50:33.000000000 -0700
+++ sql/sql_lex.h 2014-09-02 01:21:10.000000000 -0700
@@ -170,6 +170,12 @@
VIEW_CREATE_OR_REPLACE // check only that there are not such table

+enum enum_view_create_force
+ VIEW_CREATE_NO_FORCE, // default - check that there are not such VIEW/table
+ VIEW_CREATE_FORCE, // check that there are not such VIEW/table, then ignore table object dependencies
enum enum_drop_mode
DROP_DEFAULT, // mode is not specified
@@ -2442,6 +2448,7 @@
enum enum_var_type option_type;
enum enum_view_create_mode create_view_mode;
+ enum enum_view_create_force create_view_force;
enum enum_drop_mode drop_mode;

uint profile_query_id;
--- ../mariadb-10.1.0/sql/ 2014-06-27 04:50:34.000000000 -0700
+++ sql/ 2014-09-02 02:34:31.000000000 -0700
@@ -4943,7 +4943,7 @@
Note: SQLCOM_CREATE_VIEW also handles 'ALTER VIEW' commands
as specified through the thd->lex->create_view_mode flag.
- res= mysql_create_view(thd, first_table, thd->lex->create_view_mode);
+ res= mysql_create_view(thd, first_table, thd->lex->create_view_mode, thd->lex->create_view_force);
--- ../mariadb-10.1.0/sql/sql_yacc.yy 2014-06-27 04:50:37.000000000 -0700
+++ sql/sql_yacc.yy 2014-09-05 17:19:29.000000000 -0700
@@ -1851,7 +1851,7 @@
statement sp_suid
sp_c_chistics sp_a_chistics sp_chistic sp_c_chistic xa
opt_field_or_var_spec fields_or_vars opt_load_data_set_spec
- view_algorithm view_or_trigger_or_sp_or_event
+ view_algorithm view_or_trigger_or_sp_or_event view_force_option
definer_tail no_definer_tail
view_suid view_tail view_list_opt view_list view_select
view_check_option trigger_tail sp_tail sf_tail udf_tail event_tail
@@ -2446,6 +2446,7 @@
Lex->create_view_algorithm= DTYPE_ALGORITHM_UNDEFINED;
Lex->create_view_suid= TRUE;
+ Lex->create_view_force= VIEW_CREATE_NO_FORCE; /* initialize just in case */
@@ -15887,6 +15888,15 @@
| event_tail

+ /* empty */ /* 411 - is there a cleaner way of initializing here? */
+ { Lex->create_view_force = VIEW_CREATE_NO_FORCE; }
+ { Lex->create_view_force = VIEW_CREATE_NO_FORCE; }
+ { Lex->create_view_force = VIEW_CREATE_FORCE; }
+ ;

DEFINER clause support.
@@ -15944,7 +15954,7 @@

- view_suid VIEW_SYM table_ident
+ view_suid view_force_option VIEW_SYM table_ident
LEX *lex= thd->lex;
lex->sql_command= SQLCOM_CREATE_VIEW;
--- ../mariadb-10.1.0/sql/ 2014-06-27 04:50:36.000000000 -0700
+++ sql/ 2014-09-05 19:33:58.000000000 -0700
@@ -248,7 +248,7 @@

bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
- enum_view_create_mode mode)
+ enum_view_create_mode mode, enum_view_create_force force)
LEX *lex= thd->lex;
/* first table in list is target VIEW name => cut off it */
@@ -259,7 +259,7 @@

- Privilege check for view creation:
+ Privilege check for view creation with default (NO FORCE):
- user has CREATE VIEW privilege on view table
- user has DROP privilege in case of ALTER VIEW or CREATE OR REPLACE
@@ -272,6 +272,7 @@
checked that we have not more privileges on correspondent column of view
table (i.e. user will not get some privileges by view creation)
if ((check_access(thd, CREATE_VIEW_ACL, view->db,
@@ -285,6 +286,11 @@
check_grant(thd, DROP_ACL, view, FALSE, 1, FALSE))))
goto err;

+ if (force) {
+ res = false;
+ DBUG_RETURN(res || thd->is_error());
+ }
for (sl= select_lex; sl; sl= sl->next_select())
for (tbl= sl->get_table_list(); tbl; tbl= tbl->next_local)
@@ -369,7 +375,7 @@

bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
- enum_view_create_mode mode)
+ enum_view_create_mode mode, enum_view_create_force force)
return FALSE;
@@ -391,7 +397,7 @@

bool mysql_create_view(THD *thd, TABLE_LIST *views,
- enum_view_create_mode mode)
+ enum_view_create_mode mode, enum_view_create_force force)
LEX *lex= thd->lex;
bool link_to_local;
@@ -425,14 +431,13 @@
goto err;

- if ((res= create_view_precheck(thd, tables, view, mode)))
+ if (res= create_view_precheck(thd, tables, view, mode, force))
goto err;

lex->link_first_table_back(view, link_to_local);
view->open_type= OT_BASE_ONLY;

- if (open_temporary_tables(thd, lex->query_tables) ||
- open_and_lock_tables(thd, lex->query_tables, TRUE, 0))
+ if (open_temporary_tables(thd, lex->query_tables) || (!force && open_and_lock_tables(thd, lex->query_tables, TRUE, 0)))
view= lex->unlink_first_table(&link_to_local);
res= TRUE;
@@ -513,6 +518,7 @@

+if (!force) {
/* prepare select to resolve all fields */
lex->context_analysis_only|= CONTEXT_ANALYSIS_ONLY_VIEW;
if (unit->prepare(thd, 0, 0))
@@ -612,6 +618,7 @@

res= mysql_register_view(thd, view, mode);

@@ -621,7 +628,7 @@
meta-data changes after ALTER VIEW.

- if (!res)
+ // if (!res)
+ if (!res && !force) /* 411 - solves segfault problems with CREATE FORCE VIEW option sometimes */
tdc_remove_table(thd, TDC_RT_REMOVE_ALL, view->db, view->table_name, false);

if (mysql_bin_log.is_open())
@@ -908,6 +915,8 @@
fn_format(path_buff, file.str, dir.str, "", MY_UNPACK_FILENAME);
path.length= strlen(path_buff);

if (ha_table_exists(thd, view->db, view->table_name, NULL))
if (mode == VIEW_CREATE_NEW)
--- ../mariadb-10.1.0/mysql-test/t/view.test 2014-06-27 04:50:30.000000000 -0700
+++ mysql-test/t/view.test 2014-09-06 00:23:32.000000000 -0700
@@ -5263,4 +5263,17 @@
--echo # -----------------------------------------------------------------
--echo # -- End of 10.0 tests.
--echo # -----------------------------------------------------------------
+create no force view v1 as select 1;
+drop view if exists v1;
+create force view v1 as select 1;
+drop view if exists v1;
+create force view v1 as select * from missing_base_table;
+drop view if exists v1;
+--echo # -----------------------------------------------------------------
+--echo # -- End of 10.1 tests.
+--echo # -----------------------------------------------------------------
SET optimizer_switch=@save_optimizer_switch;

Posted in API Programming, Linux, MySQL, Open Source, Oracle, Storage, Tech | Leave a comment

Installing Datastax Cassandra and Python Driver on CentOS 5

Cassandra Logo

Cassandra can run on CentOS 5.x, but there is no yum repo support.

If you can’t upgrade linux distros, here’s how to install Datastax Cassandra Community Edition and the python cassandra driver on CentOS 5.x.

It’s not difficult, but there are several steps, including updating java.

(The following steps would make a complete chef or puppet recipe for a non-SSL install with vnodes.)

# setup environment
groupadd -g 602 cassandra
useradd -u 602 -g cassandra -m -s /sbin/nologin cassandra
mkdir /var/lib/cassandra /var/log/cassandra /var/run/cassandra
touch /var/log/cassandra/system.log
chown -R cassandra:cassandra /var/lib/cassandra /var/log/cassandra /var/run/cassandra
mkdir -p /opt && cd /opt

cat >> /etc/security/limits.conf <<EOD
cassandra soft memlock unlimited
cassandra hard memlock unlimited
cassandra soft nofile 8192
cassandra hard nofile 10240
EOD

# upgrade java
yum remove java
# download, then install JDK 7.x from
rpm -Uvh jdk-7u67-linux-x64.rpm
# download, then install recent jna.jar from
mv jna.jar /usr/share/java
# (run this link step after /opt/cassandra has been created in the "get Datastax DCE" step)
ln -s /usr/share/java/jna.jar /opt/cassandra/lib/
# update environment variables
cat >> /etc/profile <<"EOD"
export JAVA_HOME=/usr/java/default
export JRE_HOME=/usr/java/default/jre
export CASSANDRA_HOME=/opt/cassandra
EOD

# get Datastax DCE
curl -L >dsc-cassandra-2.0.9.tar.gz
tar zxvf - < dsc-cassandra-2.0.9.tar.gz
ln -s /opt/dsc-cassandra-2.0.9 /opt/cassandra
chown -R root:root /opt/cassandra/
bash cassandra/switch_snappy 1.0.4

# open cassandra firewall ports if necessary (not needed if using internal interface on most servers)
vi /etc/sysconfig/iptables
-A INPUT -i eth0 -m state --state NEW -m multiport -p tcp --dport 7000,7199,9042,9160 -j ACCEPT
service iptables restart
# configure /opt/cassandra/conf/cassandra.yaml (at least listen_address, rpc_address, seeds and tokens) before starting the server.
# If you need a do-over, clean the cassandra data with: rm -fr /var/lib/cassandra/*

# download startup script:
wget -O /etc/init.d/cassandra
chown root:root /etc/init.d/cassandra
chmod 755 /etc/init.d/cassandra
chkconfig --add cassandra

# start cassandra server (if it is standalone, or a seed server. otherwise start after the seed servers):
service cassandra start

# cat /etc/redhat-release 
CentOS release 5.10 (Final)

[root@www1 conf]# nodetool status
Datacenter: datacenter1
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns   Host ID                               Rack
UN  71.87 KB   256     66.8%  8302c6d5-4c88-4695-bbf4-762bc7f24544  rack1
UN  136.63 KB  256     69.9%  eddb03b2-98d3-46ff-be63-95435414a883  rack1
UN  100.08 KB  256     63.3%  2a8dde5e-29b0-4a67-8204-40769376c44a  rack1

If you only see the node on localhost, then you have a problem:

  • read and fix any errors in /var/log/cassandra/system.log until there are zero errors. snappy-related errors are from /tmp being noexec or not running the switch_snappy 1.0.4 command above.
  • disable iptables firewall, test and reenable later
  • in, increase log4j.rootLogger to DEBUG
  • if you have multiple NICs, JMX (ie. nodetool) can bind to the wrong interface. You likely need to configure the -Djava.rmi.server.hostname=[address] option in - to the address you want to listen on
  • public/private IP address problems in AWS EC2. You may need to set broadcast_address: [public_ec2_address]
  • normally rmiregistry is not needed unless you have some atypical firewalling or routing (NAT.)
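
Before blaming Cassandra, it helps to confirm that the ports are reachable from another node. Below is a small Python 3 sketch (newer than the python26 used elsewhere in this post) that tries a TCP connect to each port opened in the iptables rule above. The function names and the 2-second timeout are my own choices.

```python
import socket

# Ports from the iptables rule in this post.
CASSANDRA_PORTS = (7000, 7199, 9042, 9160)


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host, ports=CASSANDRA_PORTS):
    """Return a {port: reachable} map for one node."""
    return {port: port_open(host, port) for port in ports}


if __name__ == "__main__":
    # Replace localhost with the address of the node you are debugging.
    for port, ok in sorted(check_node("localhost").items()):
        print("%5d %s" % (port, "open" if ok else "closed/filtered"))
```

A port that shows as closed with iptables stopped points at the Cassandra bind addresses rather than at the firewall.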

Datastax Opscenter 5.0

You can install the binary from yum or tarball, but the important things to know are:

  • the monitoring agent will be installed on each cassandra node and uses port 61621. The init script is called datastax-agent.
  • the UI only needs to be installed once, but needs ports 61620, and 8888 for HTTP.
  • to allow Opscenter to remotely manage nodes with ssh, remove old ssh entries from .ssh/known_hosts first, connect manually to each node, then Opscenter should be happy
  • by default, Opscenter listens for agents on, phones home to each day, and does not require web authentication, so you likely want to change those.

Python also needs to be upgraded if you want to use cqlsh or the python client cassandra driver.

# install python 2.6 and dependencies
yum install gcc python26 python26-devel libev libev-devel

# install python's pip module
curl --silent --show-error --retry 5 | python26

# install cassandra driver for python
pip install cassandra-driver

# install
tar zxvf - < blist-1.3.6.tar.gz
cd blist-1.3.6
python26 install
cd ..

# - test installation

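from cassandra.cluster import Cluster

cluster = Cluster([''])

def dump(obj):
   # print every attribute of obj, one per line
   for attr in dir(obj):
       if hasattr( obj, attr ):
           print( "obj.%s = %s" % (attr, getattr(obj, attr)))

# dump the Cluster object to confirm the driver imported and initialized
dump(cluster)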

# python26

obj.__class__ = <class 'cassandra.cluster.Cluster'>

Troubleshooting connection problems in JConsole Storing OpsCenter Data in a Separate Cluster

Posted in Cassandra, Cloud, Linux, Open Source, Tech | Leave a comment

MySQL 5.6 Views and Stored Procedures Tips

I recently tuned an existing application that used dozens of views and hundreds of stored procedures on MySQL 5.6.

There seem to be three attitudes towards using views and stored procedures (SPs) with MySQL:

  1. don’t use them at all to increase portability
  2. just use SPs to reduce network traffic in large reporting queries (my choice)
  3. go crazy and use them everywhere like old-school Oracle Enterprise apps.

Here are some notes on using views:

  • before creating views, review your schema to ensure keys have matching types and charsets for good performance. It’s much easier to spot schema problems in a text listing than to guess why a view is slower than expected at execution time. (This is doubly true for MySQL Cluster.)
  • MySQL currently doesn’t have CREATE FORCE VIEW, although MariaDB 10.1.0 alpha has my patch. The FORCE option will greatly simplify view administration and also mysqldump output, which currently creates temporary tables to ensure views can be created regardless of table/view ordering issues
  • When looking at the MariaDB source code, it’s apparent that some view options were never actually implemented, like RESTRICT/CASCADE

And some notes on stored procedures (SPs):

  • if a SP makes a stateful session change, like set sql_log_bin=0, ensure that isn’t going to be a problem later if an exception condition doesn’t reset it
  • after running a SP, SHOW PROFILES will list all the queries executed with performance statistics
  • SPs that do non-essential SELECTs or INFORMATION SCHEMA queries probably need to be reviewed by a DBA for fundamental problems like non-atomic “reading before writing”
  • MySQL compiles SPs again for each thread.
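
The stateful-session point also applies to client code that calls such SPs. One way to guard against it from the application side is a context manager that always restores the setting, even when the call raises. This is a Python 3 sketch against a generic DB-API cursor. The helper name is hypothetical, and the variable name and values are interpolated directly, so they must come from trusted code.

```python
from contextlib import contextmanager


@contextmanager
def session_setting(cursor, name, temp_value, restore_value):
    """Set a session variable for the duration of a block, then restore it.

    The restore runs in a finally clause, so an exception inside the
    block cannot leave the session in the temporary state.
    """
    cursor.execute("SET SESSION %s = %s" % (name, temp_value))
    try:
        yield cursor
    finally:
        cursor.execute("SET SESSION %s = %s" % (name, restore_value))
```

Usage would look like `with session_setting(cur, "sql_log_bin", 0, 1): cur.execute("CALL my_report()")`, where `my_report` is a placeholder procedure name.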

Both views and SPs are relatively new MySQL features, so budget some extra development and testing time when using them, especially with replication.

[MDEV-6365] CREATE VIEW Ignores RESTRICT/CASCADE Options
Using MySQL triggers and views in Amazon RDS

Posted in MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

SVLUG: Devops and Release Canaries with Linux, CloudStack and MySQL Cluster

I did a talk at the Silicon Valley Linux Users Group (SVLUG) tonite on “Devops and Release Canaries with Linux, CloudStack and MySQL Cluster.”

Thanks again to Symantec for hosting.

Posted in API Programming, Cloud, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

Velocity Conference Santa Clara 2014 Tips Game Cards

The O’Reilly Velocity Web Operations & Performance Conference is June 24-26 in Santa Clara.

Next to the messages/jobs board was a Web Ops & Performance Tips board:

– use source maps to debug compressed JS and CSS
– use ::before to optimize font rendering
– use local storage to persist markup and templates to reduce requests and payload
– avoid CSS block rendering in chrome by not using screen media type until after. Then put screen back to element
– use gatling stress tool for load generation/perf testing (Apache Licence 2.0)
– learn curl
– learn POSIX before recreating another tool that already exists. Bill Joy (?)
– “if you do it more than twice a week, automate”
– it takes no skills to do NoOps! :)

Posted in Cloud, Conferences, Open Source, Tech | Leave a comment

AWS Pop-up Loft, San Francisco

Amazon Web Services pop-up loft (Ask an Architect area, lecture hall, kitchen/lounge)
Photo credit:

I happened to be in SF today, so I went to the Amazon Web Services pop-up loft on Market St.

Amazon rented an empty storefront for 4 weeks for lecture sessions upstairs, and a computer lab and an ‘Ask an Architect’ bar downstairs.

One of the hosts said the loft was a shell in May, and they had to build out everything: the kitchen area, 2 bathrooms and various partitions.

I asked the experts about new EBS and RDS features, and they had answers as well as a $100 AWS credit.

The weather was sunny and warm in SF.

Lots of street performers and hustlers, including a very smooth male R&B singer. A young rapper named Rap2K15 was selling hand-made CDs.

Update 2014 06 23: Apparently a drawing was held, and I was one of 3 winners of a free general pass to the AWS:Reinvent Conference :)

Update 2014 06 24:

AWS Bootcamp

Full-day AWS overview, including EC2, S3, RDS, VPC and IAM, with 2 labs.

“Provisioning and Managing AWS Infrastructure with Chef” with special guest George Miranda, Chef Technical Consultant, Chef

George talked about using Chef tools like chef metal, knife and chef zero and a minimal amount of ruby to make an AMI and provision a MySQL server and 5 Nginx web servers.


@gmiranda23, chef-ami-factory

Update 2014 06 26:

Dealing With Obstacles at Scale, Bob Hagemann, Twilio

To reduce pain:

– UTC timezone
– UTF8
– use thin AMI and chef/puppet instead of thick AMI
– wrote boxconfig a few years ago (like netflix asgard)
– remote admin mainly
– small teams 3-8
– services should run in 3 AZs
– monitoring with nagios, cron, pingdom
– haproxy on each host as proxy
– MySQL, MHA, LVM. Manual failover.
– global low latency with route53
– @bobzilla42
– Uses freeswitch plus own telcom sw
– billing system 100s QPS
– Ops team is about 8 people
– VPNs to HQ and carrier-approved colo
– three founders, one came from Amazon.

925 Market Street, SF
June 4 – 27, 2014 (likely closed on the 27th for dismantling)
Free registration, tshirts and lunch. Closes 5:30 pm, 6:00 pm or 8:00 pm daily.
Muni 30 and 45 return from Market St. and 5th to Caltrain.

@AWSstartups #AWSloft

AWS Loft Returning in Fall 2014

Posted in API Programming, Business, Cloud, Conferences, Linux, MySQL, Open Source, Oracle, San Jose Bay Area, Tech | Leave a comment

Advanced Liquibase Techniques

I recently did some work with liquibase. Here are some techniques for advanced users to work around its limitations and account for query cost.

Liquibase Introduction

Liquibase is an Open Source (Apache 2.0 License) Java utility and API for specifying and versioning schema changes (DDL) for several popular databases. It is commonly introduced to projects by programmers, rather than DBAs.

What liquibase can do:

  • allow “refactoring” of SQL schema changes to target multiple databases using XML by using a database-independent syntax, or raw SQL, depending on your preference
  • allow conditional execution and rollback of SQL based on database type or environment.

What liquibase can’t do:

  • has no built-in provisions for operational concerns, like conditionally executing SQL based on time/cost. There’s an assumption that schema changes are online, often true on Oracle and SQL Server, less so on MySQL, especially prior to 5.6 (unless you do micro-sharding)
  • does not do intelligent merges to the same object across changesets, like adding multiple columns to the same table in one statement.

How liquibase works:

  • the programmer specifies schema changes in Java, XML or JSON and runs the liquibase command
  • liquibase creates 2 tables in your database to store version, user and patch name information and to lock out other simultaneous liquibase runs.

How to Make Liquibase Consider Cost for MySQL

After some experimentation, I found a couple of liquibase features you can use to do more advanced things:

  1. create a savepoint using the tag and rollback options:
    • liquibase tag rel0; liquibase update …; liquibase rollback rel0
  2. prepend and append logic to each changeset that checks the information_schema profiling data for the SQL DDL statement. On failure, exit with 1 (see XML example below)


<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog">

    <changeSet id="1" author="james">
      <sql>
       create table if not exists `profiling` ( `connection_id` int(11) not null default 0, `query_id` int(11) not null default '0', `state` varchar(40) default '', KEY (query_id));
       truncate table profiling;
       set profiling=1;

       alter table department add column test2 int default null;
       insert into profiling (connection_id, query_id, state) select connection_id(), query_id, state from information_schema.profiling where query_id=2;
      </sql>
      <rollback>
        <sql>alter table department drop column test2</sql>
      </rollback>
    </changeSet>

    <changeSet id="1-post" author="james">
      <preConditions onFail="HALT">
        <sqlCheck expectedResult="0">SELECT count(*) from profiling where state='copy to tmp table'</sqlCheck>
      </preConditions>
    </changeSet>

</databaseChangeLog>


  1. the changeset DDL statement will still have run, even if the precondition HALTs – they’re separate changesets, after all
  2. the rollback in “1” will not be executed, even if “1-post” HALTs.

The workaround for those 2 issues is to combine the two techniques in a shell script:


liquibase tag rel0

liquibase update changeset.xml || {
    # fail the build pipeline to not propagate changeset to next stage
    # (ie. don't run in production)
    liquibase rollback rel0
    mysql -e 'alter table test.department drop column test2' 
    exit 1
}

The above looks a little kludgy, but provides a stepping stone for the reader to customize in their particular environment. (The preConditions and bash script can be easily autogenerated with a Perl or Python script.)
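
As a starting point for that autogeneration, here is a minimal Python 3 sketch that emits the “-post” changeset for a given changeset id. The function name and its parameters are hypothetical. The element and attribute names, and the `profiling` table it queries, are the ones used in the XML example above.

```python
import xml.etree.ElementTree as ET


def post_changeset(changeset_id, author, bad_state="copy to tmp table"):
    """Return the XML for a '<id>-post' changeset that HALTs on a costly state."""
    cs = ET.Element("changeSet", {"id": changeset_id + "-post", "author": author})
    pre = ET.SubElement(cs, "preConditions", {"onFail": "HALT"})
    check = ET.SubElement(pre, "sqlCheck", {"expectedResult": "0"})
    # Count profiling rows that show the DDL fell back to a table copy.
    check.text = "SELECT count(*) from profiling where state='%s'" % bad_state
    return ET.tostring(cs, encoding="unicode")


if __name__ == "__main__":
    print(post_changeset("1", "james"))
```

The output is a single unindented element, so it still has to be pasted inside the databaseChangeLog root of the real changelog file.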

An alternative to XML is using the Java API to set everything up.

Please leave a comment if you have any suggestions or a Java API program.

Posted in API Programming, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

Percona Live MySQL Conference Santa Clara 2014

The Percona Live MySQL Conference was held once again in Santa Clara from April 1-4, 2014.

Executive Summary:

  1. Percona hosted another excellent conference, with 1,150 attendees from 43 countries plus a vibrant exhibit hall.
  2. The overall themes that emerged this year were “What’s new in MySQL 5.6?” and “The rise of Galera Cluster.” Oracle delivered the 5.6 features they promised, but unfortunately didn’t bother to ask production DBAs what they really needed (e.g. GTIDs require downtime to configure, and ALTER ONLINE doesn’t support throttling or background operation on slaves (SR 3-8856341908).)
  3. MySQL 5.7 is promising about double the performance of 5.6, but note that the 5.7 feature micro-benchmark effort hasn’t translated into a complete understanding of whole database performance yet.
  4. The currently active branches are: Oracle 5.6/5.7, MariaDB 10.0/10.1, WebScaleSQL (Facebook, Google, LinkedIn, and Twitter), Facebook 5.6 with Deployable GTIDs, and Percona Server 5.6. (The version you want to migrate to is one based on MySQL 5.6.17 or later.)

The Severalnines booth. They create and support cluster and cloud database solutions. Photo credit: Steve Barker


Wed. Keynotes

Percona Live 2014 opening keynote with Percona CEO Peter Zaitsev
Robert Hodges – Getting Serious about MySQL and Hadoop at Continuent
(Continuent needs to pivot into another market as MySQL’s new built-in features displace their replication products.)
‘Raising the MySQL Bar’ with Oracle’s Tomas Ulin, VP of Engineering for MySQL, Oracle
Adventures in MySQL at Dropbox, Renjish Abraham

Wed. Talks

Online schema changes for maximizing uptime, David Turner, Dropbox, Ben Black, Tango

– MySQL 5.6 has online schema change capability; however, there’s no way to throttle the IO consumed during the operation, and the single-threaded slave will lag
– David has tested the ALTER ONLINE in MySQL 5.6.17 and will use it when ported to Percona Server
– for now uses Percona Online Schema Change utility for its throttling feature.

Be the hero of the day with the InnoDB Data recovery tool, Marco “The Grinch” Tusa and Aleksandr Kuzminsky, Percona Services

– tools have been created by Percona to recover InnoDB data if you don’t have backups and you’re out of business otherwise. Call them! :)

Galera Cluster New Features, Seppo Jaakola, Codership

– reviewed features in Galera Cluster versions 3 and 4
– looking good.

MySQL Cluster Performance Tuning, Johan Andersson,

– disable NUMA
– echo 0 > /proc/sys/vm/swappiness
– bind data node threads to CPUs
– cat /proc/interrupts
– LDM = cores/2
– TC = LDM/4
– tune the redo log
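
The two thread-count rules of thumb in these notes (LDM = cores/2, TC = LDM/4) can be worked through with a small sketch. The function name `ndb_thread_counts` is hypothetical, and the integer division and the floor of one thread are my assumptions, since the notes don't say how to round.

```python
def ndb_thread_counts(cores):
    """Apply the rules of thumb from the talk notes: LDM = cores/2, TC = LDM/4.

    Integer division and the minimum of 1 are assumptions made here so the
    result is always a usable whole thread count.
    """
    ldm = max(1, cores // 2)
    tc = max(1, ldm // 4)
    return {"ldm": ldm, "tc": tc}


if __name__ == "__main__":
    for cores in (8, 16, 32):
        print(cores, ndb_thread_counts(cores))
```

For a 16-core data node this gives 8 LDM threads and 2 TC threads; treat the numbers as a starting point for benchmarking rather than a final setting.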



Practical sysbench, Peter Boros, Percona

– prefers “latency” graph style with transparent dots vs. line charts
– uses R and ggplot2 for graphing
– attendees tried to guess SSD performance on Peter’s notebook for different block sizes, most were proven totally wrong by sysbench

Birds of a Feather (BoF) Sessions

“Meet MySQL Team (at Oracle)” BoF

– discussion again this year about parallel query execution (same as at MariaDB BoF last year), with Peter Zaitsev also bringing it up again
– discussion about raw partitions (belief is that they will be 20% more space-efficient and 30% faster, and avoid Linux’s endless limitations and bugs)
– internal “development roadmap” only extends about 12 months at a time, subject to customer demands
– I griped about FK panic/data loss issues in MySQL Cluster 7.3.3. Tomas Ulin, Vice President, MySQL Engineering, said that was news to him. (See SR 3-8717994851 and SR 3-87646727311)
– Mark Callaghan, Facebook, said he was working on MongoDB now, but requested named keys in flexible schema in MySQL.
– Peter Zaitsev, Percona, said several clients are using GTIDs and they seem to work.
– Oracle pleaded with users to drop MyISAM. I mentioned that the main reason people still use it is that legacy systems used older compression methods, but InnoDB could be used instead since it has compression too
– The Oracle MySQL Fabric project is an attempt to counter MongoDB’s automatic slave promotion.


Thursday Keynotes

‘9 Things You Need to Know…’, Peter Zaitsev, Percona
The Evolution of MySQL in the All-Flash Datacenter, Nisha Talagala, Fusion-IO
MySQL, Private Cloud Infrastructure and OpenStack, Sean Chighizola, Big Fish Games
Keynote Panel: The Future of Operating MySQL at Scale

Thu. Talks

Benchmarking Databases for Scale, Peter Boros and Kenny Gryp, Percona

Question: “What is Percona’s secret to professional benchmarks?”
Answer: “Benchmark absolutely everything multiple times, time permitting.”

MySQL 5.7: Performance & Scalability Benchmarks, Dimitri KRAVTCHUK

– comprehensive micro-benchmarking graphs of 5.7 to gain a deeper understanding of parts
– the challenge remains: how to tune the whole database to perform well?

Use Your MySQL Knowledge to Become an Instant Cassandra Guru, Robert Hodges, Continuent and Tim Callaghan, Tokutek

– good comparison of relational data modelling and C* data modelling, lots of similarities
– note that MariaDB has a Cassandra plugin

RDS for MYSQL, Tips, Patterns and Common Pitfalls, Laine Campbell, Blackbird (formerly PalominoDB)

Write Conflicts in Multi-Master Replication Topologies, Seppo Jaakola, Codership

– it’s good to see that Codership is paying attention to the details of replication

MySQL Community Awards

Shlomi has a comprehensive post on this year’s winners.

MySQL Lightning Talks (5 minutes each)

Truncating Sub Optimal DBA Verbal Responses Vectors, David Stokes (Oracle)

MySQL 5.6 Global Transaction IDs: Benefits and Limitations, Stephane Combaudon (Percona)


Zero database downtime using the Federated storage engine and Replication, prasad mani (BBC)

Scaling via adding a Table, Rick James (self)

Rick knows some clever ways to optimize solutions with MySQL. He’s doing consulting now, so contact him.

Extra Table Saves the Day: Slides

No es ‘ano’, es ‘año’! A take on encoding in your DB, Ignacio Nin (Vivid Cortex)

What Not to Say to the MySQL DBA, Gillian Gunson (Blackbird (formerly PalominoDB))
“I’ll code around it.”
“Stop micro-optimizing.”
“Use passive master for QA.”
“MySQL is a toy database.”
“This conference is a support group.”

Hall of Shame, Shlomi Noach
Triple active-replication in gaming anecdote: don’t do that.

The bash slave-prefetch oneliner, Art van Scheppingen (Spil Games)

Unsung Relay Log, Vishnu Rao, FlipKart
Com_relaylog_dump for Tungsten and MySQL 5.5

Unique User Count — Rollup, Rick James (self)

Formula for user visit estimation by counting bits.
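
My notes don't record Rick's actual formula. One well-known technique that matches the description is linear counting: hash each user into a fixed-size bitmap, then estimate the number of distinct users from the fraction of bits that are still zero. The sketch below is an illustration of that technique under my own assumptions (bitmap size `M`, MD5 as the hash, all function names), not a reconstruction of the talk.

```python
import hashlib
import math

M = 8192  # bitmap size in bits; an arbitrary choice for this sketch


def add_user(bitmap, user_id):
    """Set the bit chosen by hashing user_id; returns the updated bitmap (an int)."""
    digest = hashlib.md5(str(user_id).encode()).digest()
    bit = int.from_bytes(digest[:8], "big") % M
    return bitmap | (1 << bit)


def estimate(bitmap):
    """Linear-counting estimate: n ~= -M * ln(zero_bits / M)."""
    zero_bits = M - bin(bitmap).count("1")
    if zero_bits == 0:
        raise ValueError("bitmap is saturated; use a larger M")
    return -M * math.log(zero_bits / M)


def rollup(*bitmaps):
    """Combine per-period bitmaps with OR so repeat visitors are counted once."""
    total = 0
    for b in bitmaps:
        total |= b
    return total


if __name__ == "__main__":
    day1 = 0
    for user in range(1000):        # users 0..999 visit on day 1
        day1 = add_user(day1, user)
    day2 = 0
    for user in range(500, 1500):   # users 500..1499 visit on day 2
        day2 = add_user(day2, user)
    print(round(estimate(day1)), round(estimate(rollup(day1, day2))))
```

The useful property for rollups is that daily bitmaps can be OR-ed into weekly or monthly ones without rescanning the raw visit data, and users who visit on several days are still counted once.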

Logical Backups in the Cloud, Bill Karwin, Percona
Backups for PHP designers
PHP class Mysql/Dump

How to Squat, Kyle Redinger (VividCortex, Inc)

Iron DBA Replication Challenge, Attunity


Friday Keynotes

Percona CMO Terry Erisman opens the 3rd and final day of Percona Live 2014

Keynote: OpenStack Co-Opetition, A View from Within, Boris Renski, Mirantis and OpenStack Boardmember

– one of the best conference keynotes ever, and a great primer on Open Source marketing … up there with the O’Reilly Open Source Conference keynote on the importance of Android – before it shipped.

Friday Talks

Global Transaction ID at Facebook, Evan Elias, Santosh Banda and Yoshinori Matsunobu, Facebook

– just write your own MySQL branch if a feature is too hard to deploy :)

R for MySQL DBAs, Ryan Lowe and Randy Wigginton, Percona

– R has about 1,000 interesting sample databases (demos included diamonds and cars)
– good interface for quick graphing, not so great for complex programs
– Percona uses R and the ggplot2 graphing package for most of the graphs you see now.

MariaDB for Developers, Colin Charles, Chief Evangelist, MariaDB

Closing Prize Drawing

About 30 high-end gifts were handed out.

Some nice prizes contributed by exhibitors, including Nexus 7 tablets, $250 AWS gift certificates, SQLyog and Monyog licenses, and a quad drone!


The exhibits are one of my favorite things at the conference each year because of how strong the MySQL third-party community is.

Some notable absences were Clustrix and Violin Memory, but those were offset by new exhibitors. Webyog was a sponsor but I didn’t see a booth. PalominoDB changed their name to Blackbird, and appear to be offering DevOps as well as DBA services.

And of course, as the organizers, Percona had a large, central spread. :)

Thanks to the sponsors and exhibitors for making a conference like this financially possible.

Facebook Debuts Web-Scale Variant Of MySQL

Facebook’s Yoshinori Matsunobu on MySQL, WebScaleSQL & Percona Live
Twitter’s Calvin Sun on WebScaleSQL, Percona Live
Tweets about PerconaLive
Percona Live MySQL Conference Highlights


Cassandra Operations Checklist

Most of the Cassandra rollouts I’ve heard about at conferences have been “Devopsed” – written by Dev and productionized by Dev, with hand-off to Operations long afterwards.

That’s the opposite of how RDBMS projects are usually deployed in large companies.

As Cassandra becomes more mature, this hand-off will occur earlier after development ends.

Here is a checklist for handing off a Cassandra database to Operations (I only consider non-trivial rings of 3 or more nodes in production with a full data set):

Item | Comments | Node Impact (Performance / Space / Time/IOPs/BW)
Cassandra Server Version | Should be exactly the same minor version across cluster except briefly during server updates |
Token or vnodes? | needs to be configured before first start of server |
Cassandra Client/Connector Version | Thrift or CQL? |
Snitch name? Why? | several choices |
Replication Factor (RF)? Why? | usually RF=3 for SoT* data, defined at keyspace level |
Compaction method? Why? | Size or Level, defined at CF level |
Read Consistency Level? Why? | Netflix recommends CL=ONE. ALL seldom makes sense. |
Write Consistency Level? Why? | ALL seldom makes sense. |
TTL? Why? | Defined at row level. |
Expected Average Query Latency | 10 ms is reasonable, 1 ms is tough. |
nodetool repair/scrub | needed weekly | yes / more space / more
Bootstrapping a new node | | yes / yes
Java gcpause | stop the world | yes / yes
Are there any wide columns? Do they get wider over time? | pathological case for Cassandra | yes / more space / more
Backup in case of application bug or a disaster | Opscenter, Priam, custom. | yes / slightly more for incremental backups, double for local cold copy / more
Restore | requires Cassandra node shutdown | yes
If a storage volume fills, howto fix it? | Especially a problem with multiple JBOD volumes, which fill unevenly. | yes / less space / less
If a storage volume fails, howto fix it? | | yes / less space / less
What is the total data size now? Projected in 12 months? | affects most operations | yes / yes / yes
What is the acceptable query latency? | affects network and hardware choices |
What is the best maintenance window time each week? | |
What are the business and practical SLAs? | |
What training is needed for your Operations team? | Datastax Admin and Data Modelling Classes (recommend most recent Cassandra version) |
What partitioner is used? | Opscenter only supports random partitioner or murmur 3 partitioner for rebalancing |
What procedures need to be written for your Operations team? | |
What monitoring tools? | 1. DSE or DCE/OpsCenter; 2. nodetool; 3. Jconsole/jmxterm; 4. Boundary; 5. nagios/zabbix |
What bugs have been encountered? Which ones still apply? | |
What lessons can Devops share with the Operations team? | |

SoT = Source of Truth
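
The replication factor and consistency level rows in the checklist interact through simple arithmetic, which the sketch below works through. It covers only the ONE, QUORUM and ALL levels, and the function names are hypothetical helpers, not part of any Cassandra driver. The rule it encodes is the usual one: a read is guaranteed to see the latest write when the read and write replica counts add up to more than RF.

```python
def quorum(rf):
    """Number of replicas in a QUORUM for replication factor rf."""
    return rf // 2 + 1


def replicas_required(level, rf):
    """Replicas that must respond for a consistency level (ONE, QUORUM, ALL only)."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def reads_see_latest_write(read_cl, write_cl, rf):
    """True when read and write replica sets must overlap: R + W > RF."""
    return replicas_required(read_cl, rf) + replicas_required(write_cl, rf) > rf


def tolerated_down_nodes(level, rf):
    """Replicas that can be down while the level can still be satisfied."""
    return rf - replicas_required(level, rf)


if __name__ == "__main__":
    rf = 3
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, replicas_required(level, rf), tolerated_down_nodes(level, rf))
    print(reads_see_latest_write("QUORUM", "QUORUM", rf))
    print(reads_see_latest_write("ONE", "ONE", rf))
```

With RF=3, ALL tolerates zero replicas down, which is one way to see why it seldom makes sense in production; ONE/ONE is the fastest choice but gives up the overlap guarantee.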

About Data Consistency in Cassandra
ConstantContact techblog: Cassandra and Backups
Do I absolutely need a minimum of 3 nodes/servers for a Cassandra cluster or will 2 suffice?


Howto Add a New Command to the MySQL Server

Adding a new statement or command to the MySQL server is not difficult.

First, decide if you want to modify the server source code, or if a User-Defined Function (UDF) will meet your needs.

Since I just added the SHUTDOWN server command, I thought it would be helpful to outline the steps needed to add a new command.

Prerequisites:
  1. some familiarity with C/C++ syntax and programming (like “The C Programming Language”, by Kernighan and Ritchie.)
  2. some familiarity with lex and yacc. (I read the Dragon Book a long time ago.)
  3. access to a Linux account with cmake, gcc, make and bison packages.
# CentOS
yum install cmake gcc make bison

# Ubuntu
apt-get update
apt-get install cmake gcc make bison

# unpack the MySQL source code:

tar zxvf - < mariadb-5.5.30.tar.gz

# most of the files you need to modify are in this directory:

cd mariadb-5.5.30/sql
  • sql_yacc.yy
  • sql_lex.h

# add the token(s) (commands and arguments you think you will need) and verify the syntax:

bison -v sql_yacc.yy

# if you get shift/reduce conflict warnings, update the %expect count in sql_yacc.yy

# cut-and-paste a code block from a command with similar syntax to implement your new command, and build a test version of MySQL

# build your new server in a sandbox:


cd mariadb-5.5.30
# configure a debug build so that mysqld accepts --debug
cmake . -DCMAKE_INSTALL_PREFIX:PATH=/usr/local/mariadb-5.5.30 -DCMAKE_BUILD_TYPE=Debug
make
sudo make install

# test your new server with 3 terminal windows:


killall mysqld
/usr/local/mariadb-5.5.30/bin/mysqld_safe --user=mysql --debug &
tail -f  /tmp/mysqld.trace | grep Got &
tail -f /var/log/mysqld.log &
mysql -u root -p
# login, then test your new command while watching the log and trace

# read /var/log/mysqld.log and /tmp/mysqld.trace for errors and panics like this:

Version: '5.5.30-MariaDB-debug'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
mysqld: /home/james/mariadb-5.5.30/sql/ int mysql_execute_command(THD*): Assertion `0' failed.
130515 11:25:19 [ERROR] mysqld got signal 6 ;

This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

The above panic was caused by the SQLCOM_ switch falling through, because the new command was not defined yet.

# When you’re done, make a test

vi mysql-test/t/my_new_command.test

# Create a patch file:

mv mariadb-5.5.30 mariadb-5.5.30-new
tar zxvf - < mariadb-5.5.30.tar.gz

cd mariadb-5.5.30/sql
for i in sql_yacc.yy sql_lex.h; do
   echo "$i"
   diff -u "$i" ../../mariadb-5.5.30-new/sql/ >>patch.txt
done
# don't forget mysql-test/t/my_new_command.test

# apply your patch file:

patch -b < patch.txt

# do a build and test your patch before distributing it.

Easy peasy, right! :)

Sergei Golubchik wrote on the MariaDB developers list: "Reserved words are keywords (listed in the sql/lex.h) that are
not listed in the 'keyword' rule of sql_yacc.yy (and 'keyword_sp' rule, that 'keyword' rule includes)."
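
Sergei's definition is a set difference, which can be sketched in a few lines. The function name and the sample word lists below are hypothetical, hand-picked for illustration; they are not the real contents of sql/lex.h or sql_yacc.yy, and the sketch assumes both lists have already been extracted and normalized to the same spelling.

```python
def reserved_words(lex_keywords, keyword_rule_tokens):
    """Keywords known to the lexer that the grammar's 'keyword' rule does not list."""
    return sorted(set(lex_keywords) - set(keyword_rule_tokens))


if __name__ == "__main__":
    # Tiny hand-picked samples, not the real contents of either file.
    lex_keywords = ["SELECT", "TABLE", "SHUTDOWN", "BEGIN", "COMMENT"]
    keyword_rule = ["SHUTDOWN", "BEGIN", "COMMENT"]
    print(reserved_words(lex_keywords, keyword_rule))
```

In practice this means that when you add a new token, also listing it in the 'keyword' rule keeps it usable as an identifier; leaving it out makes it a reserved word.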

How can I get the output of the DBUG_PRINT
How to find shift/reduce conflict in this yacc file?
MariaDB Contributor Agreement (MCA) Frequently Asked Questions
wikipedia: diff

MySQL Internals Manual
XtraDB / InnoDB internals in drawing
Overloading Procedures
innodb_diagrams project
Understanding MySQL Internals By Sasha Pachev (O'Reilly)
DTrace can tell you what MySQL is doing
MySQL C Client API programming tutorial
MySQL 5.1 Class Index

  • IRC, #maria channel on Freenode
  • (ideas)
  • (search for unassigned tasks)

Keywords: MariaDB, MySQL server programming, tutorial, patch.


Patch to Add Shutdown Statement to MySQL MariaDB

At the OSCON 2011 MariaDB Birds-of-a-Feather (BoF) session, I suggested to Monty that a SHUTDOWN statement be added to MySQL, which was written up as WL#232. Other databases have this feature, and it’s very handy when automating management of a cluster of MySQL servers.

And at the Percona Live MySQL Conference 2013, Monty suggested to MariaDB BoF attendees that a good way to get a new feature added is to write a patch that paves the way for a committer to start with.

Phase 1

So … I sat down last nite and wrote the patch against MariaDB 5.5.30.

Basically it meant telling mysql’s lex/yacc files to parse “shutdown”, then calling the existing MySQL API shutdown kill_mysql() function.

This code is released under the Open Source BSD-new License, according to the MariaDB Contributor Agreement.

shutdown_0.1.patch.txt – MariaDB 5.5.30:

---	2013-03-11 03:29:13.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/	2013-05-15 13:17:05.000000000 -0700
@@ -1305,7 +1305,6 @@
@@ -1333,7 +1332,6 @@
     STATUS_VAR *current_global_status_var;      // Big; Don't allocate on stack
@@ -3736,6 +3734,31 @@
+  {
+    // jeb - This code block is copied from COM_SHUTDOWN above. Since kill_mysql(void) {} doesn't take a level argument, the level code is pointless.
+    // jeb - In fact, the level code should be removed and Oracle Database statements implemented: SHUTDOWN, SHUTDOWN IMMEDIATE and SHUTDOWN ABORT. See WL#232.
+    status_var_increment(thd->status_var.com_other);
+    if (check_global_access(thd,SHUTDOWN_ACL))
+      break; /* purecov: inspected */
+    enum mysql_enum_shutdown_level level;
+    if (level == SHUTDOWN_DEFAULT)
+      level= SHUTDOWN_WAIT_ALL_BUFFERS; // soon default will be configurable
+    else if (level != SHUTDOWN_WAIT_ALL_BUFFERS)
+    {
+      my_error(ER_NOT_SUPPORTED_YET, MYF(0), "this shutdown level");
+      break;
+    }
+    DBUG_PRINT("SQLCOM_SHUTDOWN",("Got shutdown command for level %u", level));
+    my_eof(thd);
+    kill_mysql();
+    res=TRUE;
+    break;
+  }
--- sql_yacc.yy	2013-03-11 03:29:19.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/sql_yacc.yy	2013-05-15 11:12:03.000000000 -0700
@@ -791,7 +791,7 @@
   Currently there are 174 shift/reduce conflicts.
   We should not introduce new conflicts any more.
-%expect 174
+%expect 196
    Comments for TOKENS.
@@ -1645,6 +1645,7 @@
         definer_opt no_definer definer
         parse_vcol_expr vcol_opt_specifier vcol_opt_attribute
         vcol_opt_attribute_list vcol_attribute
+        shutdown
 %type  call sp_proc_stmts sp_proc_stmts1 sp_proc_stmt
@@ -1796,6 +1797,7 @@
         | savepoint
         | select
         | set
+        | shutdown
         | signal_stmt
         | show
         | slave
@@ -13715,6 +13717,17 @@
+          SHUTDOWN
+          {
+            LEX *lex=Lex;
+            lex->value_list.empty();
+            lex->users_list.empty();
+            lex->sql_command= SQLCOM_SHUTDOWN;
+          }
+        ;
           expr { $$=$1; }
         | DEFAULT { $$=0; }
---	2013-03-11 03:29:11.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/	2013-05-15 03:07:00.000000000 -0700
@@ -2173,6 +2173,7 @@
   case SQLCOM_KILL:
---	2013-03-11 03:29:14.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/	2013-05-15 01:20:11.000000000 -0700
@@ -3333,6 +3333,7 @@
   {"savepoint",            (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SAVEPOINT]), SHOW_LONG_STATUS},
   {"select",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SELECT]), SHOW_LONG_STATUS},
   {"set_option",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SET_OPTION]), SHOW_LONG_STATUS},
+  {"shutdown",             (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHUTDOWN]), SHOW_LONG_STATUS},
   {"signal",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SIGNAL]), SHOW_LONG_STATUS},
   {"show_authors",         (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHOW_AUTHORS]), SHOW_LONG_STATUS},
   {"show_binlog_events",   (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHOW_BINLOG_EVENTS]), SHOW_LONG_STATUS},
--- sql_lex.h	2013-03-11 03:29:13.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/sql_lex.h	2013-05-15 01:19:17.000000000 -0700
@@ -193,6 +193,7 @@
     When a command is added here, be sure it's also added in

To apply:

tar zxvf - < mariadb-5.5.30.tar.gz
cd mariadb-5.5.30/sql
patch -b < shutdown_0.1.patch.txt


cd mariadb-5.5.30
# configure a debug build so that mysqld accepts --debug
cmake . -DCMAKE_INSTALL_PREFIX:PATH=/usr/local/mariadb-5.5.30 -DCMAKE_BUILD_TYPE=Debug
make
sudo make install


killall mysqld
/usr/local/mariadb-5.5.30/bin/mysqld_safe --user=mysql --debug &
tail -f  /tmp/mysqld.trace | grep Got &
mysql -u root -p

mysql client (with mysqld.log and mysqld.trace entries overlaid):

mysql> shutdown;
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql> 130515 13:20:38 mysqld_safe mysqld from pid file /var/run/mysqld/ ended


T@4    : | | | >parse_sql
T@4    : | | | <parse_sql
T@4    : | | | >LEX::set_trg_event_type_for_tables
T@4    : | | | <LEX::set_trg_event_type_for_tables
T@4    : | | | >mysql_execute_command
T@4    : | | | | >deny_updates_if_read_only_option
T@4    : | | | | <deny_updates_if_read_only_option
T@4    : | | | | >stmt_causes_implicit_commit
T@4    : | | | | <stmt_causes_implicit_commit
T@4    : | | | | SQLCOM_SHUTDOWN: Got shutdown command for level 16
T@4    : | | | | >set_eof_status
T@4    : | | | | <set_eof_status
T@4    : | | | | >kill_mysql
T@4    : | | | | | quit: After pthread_kill
T@4    : | | | | <kill_mysql
T@4    : | | | | proc_info: /home/james/mariadb-5.5.30/sql/  query end


130515 13:20:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130515 13:20:08 InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
130515 13:20:08 InnoDB: The InnoDB memory heap is disabled
130515 13:20:08 InnoDB: Mutexes and rw_locks use GCC atomic builtins
130515 13:20:08 InnoDB: Compressed tables use zlib 1.2.3
130515 13:20:08 InnoDB: Initializing buffer pool, size = 128.0M
130515 13:20:08 InnoDB: Completed initialization of buffer pool
130515 13:20:08 InnoDB: highest supported file format is Barracuda.
130515 13:20:09  InnoDB: Waiting for the background threads to start
130515 13:20:10 Percona XtraDB ( 5.5.30-MariaDB-30.1 started; log sequence number 1597945
130515 13:20:10 [Note] Plugin 'FEEDBACK' is disabled.
130515 13:20:10 [Note] Event Scheduler: Loaded 0 events
130515 13:20:10 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: ready for connections.
Version: '5.5.30-MariaDB-debug'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
130515 13:20:37 [Note] Got signal 15 to shutdown mysqld
130515 13:20:37 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: Normal shutdown

130515 13:20:37 [Note] Event Scheduler: Purging the queue. 0 events
130515 13:20:37  InnoDB: Starting shutdown...
130515 13:20:38  InnoDB: Shutdown completed; log sequence number 1597945
130515 13:20:38 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: Shutdown complete

130515 13:20:38 mysqld_safe mysqld from pid file /var/run/mysqld/ ended

A possible test would be like this, but it would interfere with operation of the test mysqld instance:



Phase 2

My above patch applies cleanly within the existing MySQL shutdown framework, which implements a feature like Oracle Database's SHUTDOWN IMMEDIATE command.

However, my patch is a Pyrrhic victory, since there's so much wrong with MySQL's existing shutdown framework that it will take an internals committer to sort it out.

The shutdown framework is badly designed, if it was designed at all, since it fails the "does this feel programmed on purpose?" test, and in fact doesn't work reliably:

  1. Conceptually, there should be 3 Oracle Database-style SHUTDOWN options: WAIT, IMMEDIATE and ABORT. Implementing SHUTDOWN WAIT would mean intrusive changes to the MySQL source code, while SHUTDOWN ABORT would be easier to program, but at the risk of data integrity.
  2. the following bug reports describe a race condition between mysqld threads and the shutdown thread:

I guess I'll have to pay myself the worklog bounty of $100. :)

This is actually my second MySQL patch contribution. In 1997 or 1998 I submitted a patch for the installer, which was one of the most troublesome components at that time. Monty rewrote it, but I liked my version better.

Update: Sergei Golubchik committed this patch to MariaDB 10.0.4 on 2013-06-25. Thanks, Sergei!

MySQL's Missing Shutdown Statement
Bug #63276: skip sleep in srv_master_thread when shutdown is in progress


Colgan: The Gift That Keeps on Giving

After the 2009 Colgan regional airline accident near Buffalo, we got the FAA 1,500 hour rule for pilot experience. Never mind that the root cause was botched primary stall training and lack of sleep …

Now, 7.5 years later and counting, the FAA is proposing that “New airline pilots would go through professional development and mentoring programs.”

One might ask what the FAA was doing for almost a decade. Apparently 50 bodies weren’t enough to motivate them.

There are multiple levels of irony in the FAA’s two Colgan “actions”:

  1. the 1,500 hour rule has bankrupted all of the regionals, since there are no eligible pilots now to fly their leased planes – an unforeseen win actually, since there won’t be any more regionals like Colgan
  2. the whole point of the regionals was to reduce costs for the majors, by using inexperienced non-union pilots with minimal training and without mentoring, while the industry hid behind FAA’s “one level of safety” mantra. (Just watch the Senate Colgan hearings to hear the major airline executives refuse to mentor their captive regionals because, “according to the FAA there’s one level of safety.”) :)

Ultimately it was the FAA that allowed race-to-the-bottom regionals to exist, and Colgan was an easily foreseeable accident.

Since this topic is important enough to warrant reflection:

  • Both USA and non-USA airlines need to deal with the original and as yet unaddressed problems: pilot fatigue (building some pilot dorms is cheaper than an accident) and “bad sticks” (inadequately-trained pilots – so do stall training.)
  • Non-USA airlines: the US regional system saved a few bucks for a while, but at the expense of dead pax and ultimately draconian government oversight that raised pilot wages – US-style regional airlines are not a role model for export.

FAA Proposes Mentoring Programs For Airline Hires
AIN Blog: Torqued: ‘One Level of Safety’ Remains a Myth
Frontline: “Flying Cheap” PBS YouTube
