Origin of the Elements Chart

Dr. Edward Murphy, University of Virginia, gave a profound talk on how stars create heavier elements from hydrogen and helium. One of the slides is remarkably informative, and supplements the periodic table:

Abundance of Elements

Lecture notes:

– almost all elements come from inside stars
– at 100 million degrees, massive stars create elements up to the atomic number of iron (Fe)
– as the star goes supernova (or when binary stars collide), massive stars create the elements beyond iron (including precious metals) in about 2 seconds
– because helium (He) has an even atomic number (2), elements with even atomic numbers are about 10x as abundant as those with odd atomic numbers, presumably because the even elements result from common nuclear reactions
– some of the elements are much less common than expected, like Li and Be, so presumably they are consumed in nuclear reactions, like helping fusion (confirmed as they’re both used in nuclear weapons production)
– iron is the most stable element, hence nuclear reactions tend to end at iron, which then slows down later reactions since iron doesn’t participate in reactions until very high temperatures at the end of the star’s life.
– life on Earth is made mostly from carbon and oxygen due to their reactivity and abundance. This is likely true on other planets too.

youtube.com: The Origin of the Elements Lecture (2012), Abundance of Elements Chart

W: r-process
The (truly) Periodic Table @4:12 The Tetris of elements

Posted in Uncategorized

Terraform Workspaces

Terraform workspaces let you keep a separate state file per workspace for the same configuration, switching between them from the CLI (or the equivalent in Terraform Cloud).

Workspaces are used to provide cooperative isolation. They can be thought of like programming language namespaces.

A workspace called production is created (and initially switched to) with:

terraform workspace new production

You can switch back to the default workspace with:

terraform workspace select default

Here’s a nice example of a terraform script that manages 2 workspaces using terraform locals, default (test) and production, depending on the current workspace:
medium.com: Terraform workspaces and locals for environment separation (2017)
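
The pattern in that article boils down to a lookup keyed by the current workspace name (terraform.workspace in a real configuration). Here is a minimal sketch of the idea in Python rather than HCL. The setting names and values are hypothetical and only illustrate the lookup:

```python
# Hypothetical per-workspace settings; names and values are illustrative only.
ENV_SETTINGS = {
    "default": {"environment": "test", "instance_type": "t3.micro", "instance_count": 1},
    "production": {"environment": "production", "instance_type": "t3.large", "instance_count": 3},
}


def settings_for(workspace: str) -> dict:
    """Return the settings for a workspace, failing loudly on unknown names."""
    try:
        return ENV_SETTINGS[workspace]
    except KeyError:
        raise ValueError(f"no settings defined for workspace {workspace!r}") from None


if __name__ == "__main__":
    for name in ("default", "production"):
        print(name, settings_for(name))
```

Failing on an unknown workspace name, instead of silently falling back to test settings, is a design choice here. It avoids applying the wrong environment's values after a typo.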

And here’s a 2021 github issue on how to use workspaces under terragrunt:
github.com: Support for terraform workspaces? #1581

Use Cases for Workspaces

It’s common to use separate AWS accounts for test and production environments, and workspaces let you assign workspace names to teams or clusters within each of those accounts.

Or a single cloud account can be used, and workspaces used with names like default for test, and separate names for teams.

Note that using workspaces provides a separation mechanism that’s better than nothing at avoiding resources getting stepped on, but somebody can still do a select command into your team’s workspace.

W: Terraform
medium.com: Terraform Workspaces Basics with AWS CodePipeline (2019)

Posted in Cloud, Open Source, Tech

OpenAPI to GraphQL Gateways

A lot of companies are in the situation where they have a working, tested REST API server and they want to enable GraphQL queries as easily as possible.

REST and GraphQL are both data access methods that allow end-users and developers to query and optionally update data hosted by a server. Note that there is an “impedance mismatch” between REST and GraphQL, since GraphQL can be considered a higher-level access method that returns pre-aggregated and pre-joined results in a single network request.

There are 3 important object type concepts in GraphQL:

  1. Schema (in this case, generated from the API spec file)
  2. Query (read-only, which in many use cases is adequate alone)
  3. Mutation (optional updates or complex operations)

To be imported, an OpenAPI spec file needs the following:

  1. response schema fields for each endpoint. OpenAPI REST APIs usually don’t need this since JSON can be dumped directly, but it’s mandatory with GraphQL. (GraphQL IDEs, like Altair, use the schema info for autocompletion and pulldowns also.)
  2. compatibility with the gateway OpenAPI parser.

Here are some approaches from easiest to hardest/more expensive:

1. IBM’s OpenAPI to GraphQL gateway and library

Try this out first, but it has a lot of rough edges:

  • read the Issues and Pull Requests first
  • doesn’t parse multiple types per endpoint
  • in my case, auth worked but Altair didn’t show any response fields, possibly because of response envelopes.

Example of minor “enveloping” (the {“users”: part) that seems to confuse it:
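
As a hypothetical illustration (the payload and field names below are invented, not taken from a real API), an enveloped response nests the list under a key such as "users" instead of returning it at the top level. A small sketch in Python of unwrapping such an envelope:

```python
import json

# Invented payload: the list is wrapped ("enveloped") under the "users" key.
ENVELOPED = '{"users": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]}'


def unwrap(payload: str, envelope_key: str) -> list:
    """Parse a JSON body and return the list stored under envelope_key."""
    body = json.loads(payload)
    if not isinstance(body, dict) or envelope_key not in body:
        raise ValueError(f"expected an object with key {envelope_key!r}")
    return body[envelope_key]


if __name__ == "__main__":
    print(unwrap(ENVELOPED, "users"))
```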


Insert this line at the top of openapi-to-graphql:

#!/usr/bin/env node

Starting the listener (it’s a Node.js application) on Linux:

$ openapi-to-graphql api.yaml
GraphQL accessible at: http://localhost:3000/graphql

How to do schema introspection in Altair GraphQL client to see GraphQL resolvers (endpoints):

  {
    __schema { types { name } }
  }

2. Try a couple of lower-level libraries/frameworks

The challenges are installation errors from library dependencies, and more programming effort than using a gateway listener as in #1.

3. Pay for a commercial gateway solution

You will likely get a more polished solution with more technical support than the current Open Source solutions.

Some strategies for evaluating an OpenAPI to GraphQL gateway:

  1. start with a couple of read-only (GET) endpoints to become resolvers. (Read-only may suffice.)
  2. then try a couple of updates (POST/PUT/DELETE) (mutations) if needed.

Two GraphQL advantages are less client-side aggregation of REST API output and fewer network requests. To obtain those benefits from your existing REST API server when using a GraphQL-REST API gateway, you may need to add REST API endpoints for common complex GraphQL requests, for example by doing a SQL JOIN behind the scenes and returning that result in one request. If there are Employees and Companies tables, you may want a GraphQL query to return joined table data with employee name and company name without two REST API calls.
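
To make the Employees and Companies example concrete, here is a minimal sketch in Python that uses in-memory data in place of real REST calls. The record fields (id, name, company_id) and the function name are assumptions for illustration only:

```python
# Stand-ins for the responses of two separate REST endpoints (hypothetical data).
EMPLOYEES = [
    {"id": 1, "name": "Ann", "company_id": 10},
    {"id": 2, "name": "Bob", "company_id": 20},
]
COMPANIES = [
    {"id": 10, "name": "Acme"},
    {"id": 20, "name": "Globex"},
]


def employees_with_company(employees: list, companies: list) -> list:
    """Join employees to companies by company_id, like a SQL JOIN would."""
    company_names = {c["id"]: c["name"] for c in companies}
    return [
        {"employee": e["name"], "company": company_names.get(e["company_id"])}
        for e in employees
    ]


if __name__ == "__main__":
    print(employees_with_company(EMPLOYEES, COMPANIES))
```

If the REST server exposes one endpoint that already returns this joined shape, the gateway can answer the GraphQL query with a single backend call instead of two.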

GraphQL: Building a consistent approach for the API consumer
OpenAPI Specification
GraphQL vs. REST: A Comprehensive Comparison

Posted in API Programming, Business, Cloud, Linux, Open Source, Perl, REST API Programming, Tech

Intro to Designing and Making Your Own PCB Boards


PCB = Printed Circuit Board
EDA = Electronic Design Automation software
hobbyist PCBs = single- or double-layer boards without high-frequency circuits

Designing and making PCB boards is one of the easiest, least expensive, least messy CAD/CAM hobbies. Most people use EDA software, which is specific to designing and making PCBs.

EasyEDA 3D View
3D View of PCB created with EasyEDA Designer illustrating annotations, fiducials (pick and place registration dots), copper pour, plated grounding hole and mounting holes

In 2021, there are 5 main options for hobbyists to learn and/or make PCB boards:

  1. use PCB EDA software, simulate in SPICE, view in 3D, then stop there
  2. use PCB EDA software and click a menu to send to an online PCB fab (turn-around time about 2 weeks)
  3. use PCB EDA software and make your own PCB boards with a chemical process (better results than desktop CNC machines)
  4. use PCB EDA software and make your own PCB boards with a general purpose CNC machine (results depend on skill)
  5. use PCB EDA software and make your own PCB boards with a desktop PCB CNC machine (starts at $4,000, results are hit or miss)

Option 1 is good for hobbyists who want to learn how to use EDA software, with advanced features including vias/holes, SMT and annotations. This is a good route if you don’t really need to make a physical board today, but want to know the initial skills for future use.

Option 2 is good for people who want to use advanced design features (see #1), and actually need to fab the board professionally with those features. EasyEDA’s fab, JLCPCB, can also populate boards if you use their recommended parts and they’re in stock.

Option 3 is good for people who want to design basic boards and also physically make simple boards themselves. It’s unlikely the average hobbyist would be able to successfully make complex boards with annotations (which require silk screening, unless traces are used for lettering), many vias or holes (unless copper PCB rivets are used), multiple layers, or SMT (which requires skilful hand soldering or wave soldering.)

To get a high-resolution image for chemical processing, export as SVG instead of Gerber, then scale and print using an SVG application like Inkscape (free.)

For Option 3, you do chemical processing, and for holes, operate a drill press. New, diamond-coated or carbide drill bits are recommended to avoid shredding PCBs. Chemical processing is the cheapest and highest-resolution method, and there are now simplified “no-heat” methods that use glossy-paper inkjet or laser printer transfers instead of UV light.

For Option 4, you use a CNC machine and/or drill press. This is a possible option for people who already own a CNC machine. Use diamond-coated or carbide drill bits, do a height-map, cut at 0.1 mm depth, and do a test PCB to see what the resolution of your CNC machine is. Often good SMT traces are difficult with low-end CNC machines. Some people have had luck with careful use of a low-end 3018 machine. The disadvantages are messiness, lack of silk-screening annotations, and effort making vias.
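
To illustrate why the height-map matters at a 0.1 mm cutting depth, here is a rough sketch in Python of bilinear interpolation over four probed corner heights. The probe values and board size are made up, and real CNC software does this over a denser grid of probe points:

```python
def surface_height(x, y, width, height, z00, z10, z01, z11):
    """Bilinearly interpolate the board surface height at (x, y).

    z00, z10, z01, z11 are probed heights (mm) at the corners
    (0, 0), (width, 0), (0, height) and (width, height).
    """
    u = x / width
    v = y / height
    bottom = z00 * (1 - u) + z10 * u
    top = z01 * (1 - u) + z11 * u
    return bottom * (1 - v) + top * v


def cut_z(x, y, depth=0.1, **probe):
    """Return the Z coordinate that cuts `depth` mm below the local surface."""
    return surface_height(x, y, **probe) - depth


if __name__ == "__main__":
    # Hypothetical 100 x 70 mm board that rises 0.2 mm toward one corner.
    probe = dict(width=100.0, height=70.0, z00=0.0, z10=0.0, z01=0.0, z11=0.2)
    for point in ((0.0, 0.0), (50.0, 35.0), (100.0, 70.0)):
        print(point, round(cut_z(*point, **probe), 3))
```

With a board that is warped by 0.2 mm, a fixed Z of -0.1 mm would cut three times too deep at the high corner, which is why a warp of this size is enough to ruin fine traces.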

For Option 5, follow the instructions with your desktop PCB CNC machine carefully. Some people have good luck with these machines, and some don’t, due to the limited torque of the built-in motors and lack of rigidity. Some professional engineers have reported good results, and benefit from same-day turnaround on their prototypes. Some users have said they need to watch the machine like a hawk to avoid mishaps, or avoid SMT designs. The disadvantages are lack of silk-screening annotations, and effort making vias.

It’s recommended to use a vacuum hose with CNC machines to keep the work area clean of chips, otherwise the occasional trace can be damaged and the drill bit can wear faster, causing shredding.

The materials and chemicals used in “no-heat” chemical processing are:

  • some type of PCB, depends on process or machine
  • baby oil for transferring your ink jet- or laser-printed circuit from glossy paper to the PCB
  • ferric chloride to etch the copper (or alternative)
  • nail polish remover or diluted acetone to remove ink from board
  • electrical lacquer for coating the final board
  • rubber gloves, mask and steel wool or scotchbrite (avoid using sandpaper on copper traces, but ok on PCB edges)
  • copper PCB rivets

Isopropyl alcohol is commonly used to wipe boards after CNC milling.

If you drill holes, a drill press or drill stand ($50) is recommended to make the holes vertical, to avoid shredding the board, and to preserve the sharpness of the drill bits. Sizes 0.8 mm, 1.0 mm, 1.5 mm, 2.0 mm, 2.5 mm, and 3.0 mm are commonly used. Plan to use bits that use the same size chuck to save effort and time. (The most commonly-used drill bit is 0.8 mm, but some people have to use 0.9 mm if leads don’t fit. This bit is also used to pre-drill larger holes.) If you’re trying to search for drill bits, try lapidary sources.

Note that final traces need to be tinned or coated (electrical lacquer) to reduce corrosion.

Also, chemical methods have high-enough resolution for SMT, while desktop PCB milling machines often don’t.

It’s important to realize the multiple functional properties of a PCB:

  • electrical (EMF/electron drift transmission through traces, wires and solder)
  • thermal (heat is dispersed along traces/pours, vias and air)
  • mechanical (strong component connections)
  • dimensional (rigidity in XYZ)
  • electromagnetic (high power, current and frequency applications create crosstalk via electromagnetic fields in air)

PCB Substrates: Knowing Your Dielectric Material’s Properties

Below are some recommended EDA programs:

Free but Cloud:

– EasyEDA: My Review (Free)

Free and no-cloud:

– KiCad: Homepage, My KiCad Tips, KiCad Links (GPLv3, various Open Source licenses, funded by CERN)

Other EDA Programs:

Diptrace (freeware limits pin count until paid upgrade)
DesignSpark PCB (adware)
Fritzing (GPLv3)
AutoDesk 123d Circuits Tinkercad (Free)
EveryCircuit ($15)
ExpressPCB (Nice FAQ on Placing Components) (Free)

Diptrace vs KiCad

Cut2d Pro: “Cut2D Pro gives you the power to produce complex 2D patterns with profile, pocket, drill and inlay toolpaths. With unlimited job and toolpath size, true shape nesting & job set-up sheets.

Cut2D Pro has easy to use vector drawing and editing tools with powerful 2D machining strategies for CNC routing, milling or engraving and provides a powerful but intuitive software solution for cutting parts on a CNC Router.”

Other CAD/CAM Software

gerbv Viewer for Gerber (GPLv2)
Flatcam Converts formats including Gerber to gcode (free)


Wegstr: PCB making, PCB prototyping quickly and easy – STEP by STEP ($4,000+ plus consumables)
The Wegstr CNC prototyping mill
Double layer PCB prototyping 0.1 mm traces/spaces
CNC PCB – high quality with the budget 3018 CNC
FlatCam 2 sided PCB milling on a CNC
Bantam Tools CNC: Milling Double-Sided PCBs with the Desktop PCB Milling Machine
Creating a homemade PCB with the Voltera V-One PCB Printer ($4,100+)

DIY Chemical Process PCB Making

DIY PCB Toner Transfer (No Heat) & Etching – Easiest
DIY PCBs At Home (Single Sided Presensitized) – Nice UV Process
DIY PCB Fabrication (Dry Film Inkjet Method)
DIY PCB Shaker For Etching (Low Cost Rocker/ Agitator)
Exposing, Developing, Etching and Drilling PCBs – My (Current) Way (with drilling and isolation slots)
Making PCBs (with vias)

Misc Links

– GadgetReboot: Youtube, Github KiCad Projects

– eprpartner.com: Small but mighty: fiducial marks and their purpose
– ema-eda.com: Orcad Lite Limits

Amazon: “Lab Shaker”

Flake Multimeter Watch
Action Box DIY Pro Machines
altium.com: Are Fiducial Marker Placements on PCBs Still Necessary with Modern Manufacturing Capabilities?
Investigation on the efficiency of thermal relief shapes on different Printed Circuit Boards
Mounting hole on a PCB (Fiducials)
My Experience Using the Online PCB Software EasyEDA

PCB Electronics Theory

LED Thermal Relief in Real Life with vias and heatsinks See @10:27
Ground in PCB Layout – Separate or Not Separate? (with Rick Hartley)
[LIVE] How to Achieve Proper Grounding – Rick Hartley – Expert Live Training (US)
Robert Cox Lectures

Posted in CNC, kicad, Tech, Toys

EasyEDA PCB Software Review

I evaluated EasyEDA Designer (v6.4.20.6 July 6, 2021 on Mac OS 10.11) to design a simple but complete (SPICE simulation, parts, mounting holes, annotations) through-hole, one-sided PCB to illuminate an LED.

EasyEDA Designer is a proprietary program, but free to use (they make money by selling parts and PCB fab services online.) It is an Electron (JavaScript) application. (If you have battery usage problems, close EasyEDA Designer, then restart it.)


The initial learning curve was moderately steep, as with any EDA software the first time, but the second PCB is very easy. I was able to accomplish exactly what I wanted. Both the Online and Desktop Clients require an Internet connection, so if that is an issue, try KiCad.

I used EasyEDA Designer on Mac OS 10.11, so it should work on any newer version of Mac OS, and it has a Mac-friendly UI with menus and shortcut keys.

EasyEDA works conceptually the same as KiCad, so it’s a good stepping stone.

Read the “Usage Tips” and “EasyEDA/LTspice Notes” below to save at least an hour when designing your first PCB!

EasyEDA 3D View
3D View of PCB created with EasyEDA Designer illustrating annotations, fiducials (pick and place registration dots), copper pour, plated grounding hole and mounting holes

Review and Notes

There are 3 EasyEDA product types:

  1. Online Editor (browser)
  2. Desktop Client (download), but is still a browser app that’s tied to the cloud
  3. local install (call for pricing.)

I used #2, expecting to use it locally, but a login and cloud connection are needed for Spice and likely other features.

The purpose of EasyEDA is:

– simple UI for creating PCBs
– upload your PCB artwork to their site to purchase services (parts or board fab.)

The workflow to create a PCB is conceptually identical to KiCad:

– New … Project
– (origin is automatically set to 0, 0)
– schematic capture (insert your electronic symbols from libraries. The symbols need Spice models if you plan to simulate.)
– simulate with LTspice (shows voltage (V) and current (I) over time per component)
– convert to PCB components (pick actual parts from the parts libraries)
– click on canvas, then set units to mm or inches, and snap size to 0.050″ or 1 mm and enable it
– inspect 2D or 3D views (provides assurance that the actual PCB resembles what you expected)
– move PCB to (0, 0)
– create any copper areas (pours/ground planes)
– export to Gerber (for sending to CAM machines.)

So learning one product allows you to understand the other, although the menus and keys will be different.

EasyEDA Features

EasyEDA has every feature that a hobbyist would need. This is not surprising, since they offer PCB fab services, so their EDA software literally has to function.

– very intuitive commands via mouse menus or shortcut keys. Click on Setting … Shortcut Keys to view or change shortcuts. Also has Help and Tutorial systems.
– has Panelize menu (in KiCad you must use the external KiKit program), but the panel looks odd in 2D and 3D views.
– has a very convenient dedicated Place … Hole menu for (drilled) board mounting holes
– has a built-in fiducials (alignment markers) feature with Shift+F for pick and place machines
– commercially-maintained components libraries
– set of video tutorials on Youtube.

EasyEDA Limitations and Problems

– requires Internet connection for LTspice and possibly other operations
– 3D view often renders a blank page, requiring closing and re-opening the worksheets
– no blind vias
– it’s unlikely that EasyEDA Designer has non-hobbyist high-frequency features comparable to Altium’s. This includes “creepage” features, etc.

EasyEDA/LTspice Notes

– your schematic must use Spice library components (models), including a voltage source and GND. The EELib parts library does not include Spice models, so will not work for simulation
– not sure why the LTspice V/I graphs disappeared at some point, now I just see a text report that says “succeeded”
– Regardless of EDA software, it’s always recommended to do your Spice modelling on small subsystems, one at a time, to limit the complexity of each circuit.
– requires Internet connection for LTspice and possibly other operations

3D View Notes

– make solder masks invisible to see the traces
– image not as high-resolution as KiCad, but looks ok (see sample above.)

Usage Tips

These tips will save you at least an hour when making your first design, and are easy to remember:

– the default origin is (0, 0), which you don’t need to change for EasyEDA’s fab service. To change the origin, you need to use the EasyEDA API.
– the Start tab has a UI with various options. An important one is a button labeled “Change to Simulation Mode” or “Change to Standard Mode”. (You need to be in Standard Mode for PCB layout.)
– click anywhere on the grid for the snap size setting. (I use 1 mm or 0.050 inches.)
– use Edit … Name Position to position annotations below board components (default is above for some reason)
– use right-click on Projects window to delete old project files
– use Shift+F to insert fiducials for pick and place machines. They should be 1 – 3 mm in diameter, and usually round. You can delete the annotation “U1” above it, and use copy with snap to grid to make 3 fiducials at various corners of the board.
– View … Zoom … Fit in Window or K is the fastest way to scroll the screen to the origin, otherwise you have to use the left and right keys.
– has a very convenient dedicated Place … Hole menu for (drilled) board mounting holes.
– Place … Circle or C is rarely what you want, since it’s just an annotation graphic. Use the pads P, vias V, holes or fiducials Shift+F features instead.
– 3D view often renders a blank page, requiring closing and re-opening the worksheets
– isolation slots (relatively long cutout regions into the board) can be made with various pads or holes, just insert the component, select oval shape and define the oval dimensions. (Isolation slots are often used in various power supply circuits to provide air-gapping for high-frequency or high-current applications.)
– copper areas (pours) can be edited after pouring by clicking precisely on the edge of the pour. This will display the CopperArea Attributes dialog. Also, try the Tools … Copper Area Manager. You can click on the canvas and enable/disable display of copper areas.
– traces can be very wide in EasyEDA, so often copper area pours aren’t needed. For example, I have made traces 50 mm wide.


– Javascript scripts are run from the scripts window under Advanced … Extensions
API Documentation
Forum: API script editor and debugger
Github Shapes Example
User Extensions for EasyEDA Summary
easyeda-ibom-extension (Interactive HTML BOM Tool)

(Note that some experienced engineers avoid EDA scripts in general because the API can change over time without warning, causing problems with existing documents.)

Differences with KiCad

– EasyEDA has a dialog for round PCB corners, while KiCad users usually import a DXF file
– EasyEDA has copper area/pour integration, while KiCad users usually configure zones.
– EasyEDA has a very large, commercially-supported components library, while KiCad is more user-supported.
– KiCad has global labels as shorthand for schematic wires
– KiCad uses tilde ~ for “not”, try # in EasyEDA. See forum post.


Getting To Blinky 4.0 – LTspice simulation
What are the differences between solder mask and paste mask?

Re: What is paste mask?

A stainless steel template in a tightened frame, with holes corresponding to landpatterns on the bare board that are either cut, punched, or drilled with a laser. The solder stencil is used at the beginning of the circuit assembly process where it is placed on top of a bare board, solder paste is pushed through the holes, then the stencil is lifted away. This leaves small solder paste deposits on the bare board, onto which components are placed.

Posted in API Programming, Cloud, CNC, kicad, Tech

KiCad Features

KiKit is a panelization utility that supports panel fiducials (alignment markers.) (EasyEDA and Altium have a built-in panelization UI.) The KiCad fiducials key is “F”.

Panelizing boards within KiCad – Random KiCad Tips

– there is a component table for bulk mgmt. of parts

– use “X” icon for marking pins as No Connect, otherwise DRC will emit errors

– there is a high-frequency creepage plugin that is 2D (Altium has a built-in 3D creepage checker)

KiCAD Quick-Start Tutorial

High Power Circuit Board Design (PCB) – KiCad 5 – : Part 1/2, Part 2/2

Posted in kicad, Linux, Open Source, Tech

GitHub Actions Demo Repo with Several Programming Languages

I created a GitHub repo to test using the GitHub Actions build workflow feature to lint several programming languages (Perl, Python, Ruby and Bash) in different directories simultaneously.

GitHub Actions is pleasant to configure and seems very useful.

The Perl-specific workflow (only 33 lines) is here.

[Screenshot: GitHub Actions UI showing 4 workflow run results]

Note that GitHub Actions can be used to spin up a Jenkins VM, and then you can run existing Jenkins scripts, like Pipeline, etc.

Posted in Cloud, Perl, Tech

OpenLayers 5.3.0 and “SyntaxError: missing : in conditional expression”

Just a PSA for OpenStreetMap OpenLayers 5.3.0 API users.

The rawgit CDN ol.js file appears to contain a malformed ternary operation that causes browsers to not create the ol object, breaking map programs.

The Firefox web console says:

SyntaxError: missing : in conditional expression ol.js:1:34694

The solution is to switch the CDN from rawgit to cdn.jsdelivr.net.


So definitely test your existing OpenLayers code on a current version of Firefox.

There’s only a couple of searchable references starting May, 2020, so it’s something fairly new.

Posted in Open Source

Very Light Jet (VLJ) Market History to 2020

Like everybody connected to the aviation industry, I watched the Very Light Jet (VLJ) market hype (mostly around Vern Raburn’s Eclipse 500) in 2002 and wondered how the market would shake out. After the Eclipse announcement, around 10 competitors announced they would build a VLJ, too.

VLJs are characterized by being lower-cost than other private jets, and being certified for single-pilot (SP) operation. However, “lower-cost” increased from $837,000 in 2002 to $3-$5 million in 2020. The key technology in 2002 was a Williams cruise missile engine, but that turned out to be under-powered for the Eclipse back then and the engines were upgraded to P&W. (A Williams jet engine is used in the Cirrus Jet today.)

Eclipse (twin P&W, 370 knots, ~$1M, 267 made) was predicted to fail because of its ultralow pricing and the company’s manufacturing inexperience. And it did in 2008, although it shipped a few hundred airframes in various stages of completion. Most of the wealthy individual buyers moved on to the Cessna Mustang (twin P&WC, 340 knots, $3.3M, 469 made), which was much more expensive but had the backing of Cessna/Textron.

What’s interesting is that Eclipse reported a major deal with an air taxi company, DayJet. I was skeptical of the deal at the time, but DayJet did receive about 20 jets and provided service until it also failed in 2008, although I wonder about the dispatch rate for a v1 jet.

Besides the Cessna Mustang, another successful jet that eventually emerged was the Cirrus Vision SF50.

The Cirrus Vision SF50 (single Williams, 300 knots, $2.7M, 170 made so far) was certified in 2016 and was the shipment volume winner for 2018 and 2019. It and the Embraer Phenom 100 (twin P&WC, 400 knots, $5 million) are the only VLJs currently being made, with the HondaJet available at twice the SF50’s price ($5.5 million.)

(Note that Cirrus (and most small US airplane companies) are owned by Chinese companies as of 2020.)

On Apr. 18, 2019, an FAA AD was issued after 3 incidents involving AOA sensors were reported. Aerosonic, the AOA supplier, shipped bad units to Cirrus, and the SF50 fleet was grounded until replaced. The effect was the same as the 737 MAX accidents, although Cirrus had a hardware, not a mostly software, problem. It’s interesting that a ferry flight was allowed to repair it.

W: Eclipse 500, Cirrus Vision SF50, Cessna Citation Mustang (Model 510), Honda HA-420 HondaJet, Embraer Phenom 100

Textron Ceases Production of Cessna Citation Mustang

Posted in Tech

Software Fix Found For A220 Engine Acoustic Resonance Shutdowns

Very interesting:

“Airbus has determined that acoustic resonance is behind the destruction of four engines on its new A220 airliners in the last couple of years. The company has come up with a software fix for the condition, which caused in-flight shutdowns of the Pratt & Whitney geared turbofan engines on three Swiss flights and one on an Air Baltic aircraft.”

Engine makers commonly use remote telemetry, so I’m a little surprised it took a couple of years to pinpoint that. But I’m sure they’ll be adding acoustic sensors now.

Acoustics are a surprisingly big deal in aerospace. HondaJet’s composite fuselage rang like a bell and even passengers needed headsets, so the new Elite version focused on reducing cabin noise by 3 dB (half the sound power.)
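
A 3 dB reduction corresponds to halving the sound power, which follows from the decibel definition. A quick check in Python (the function names are mine, for illustration):

```python
import math


def power_ratio_to_db(ratio: float) -> float:
    """Convert a power ratio to decibels: dB = 10 * log10(ratio)."""
    return 10.0 * math.log10(ratio)


def db_to_power_ratio(db: float) -> float:
    """Convert decibels back to a power ratio."""
    return 10.0 ** (db / 10.0)


if __name__ == "__main__":
    print(round(power_ratio_to_db(0.5), 2))   # halving the power: -3.01
    print(round(db_to_power_ratio(-3.0), 3))  # a 3 dB reduction: 0.501
```

Perceived loudness is a different matter. A commonly quoted rule of thumb is that it takes a reduction of about 10 dB for something to sound half as loud.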

avweb.com: Software Fix Found For A220 Engine Shutdowns

Posted in Tech

Cessna 210 Wing Spar Failure

Cessna T210M Wing Spar Shear between top rivets and lower I-beam Flange Narrowing Point (and Lower Forging Lightening Spotface Indentation.) The white foam may have collected water, aiding corrosion. Photo credit: ATSB. (click image to enlarge)

I’ve read thousands of airplane accident reports, but this is one of the most concerning.

The Cessna 210 (and 177) family has a complex wing design made of aluminum components with around 100 features, mostly rivet holes, that can become stress risers.

Since 2012, several spar cracks have been found, and in 2019, there was a fatal accident in Australia after a plane lost a wing in flight, which is not survivable. (Note that this plane was used for aerial survey grids at 200′ with full fuel plus wing tip fuel tanks and 2 pilots, for as many as 5,000 hours. Two survey planes were operated. It would be fascinating to do a teardown on the other one.)

The fatal 2019 accident in Australia appears to be a combination of corrosion on a rivet hole and metal (aluminum) fatigue, possibly combined with the round shape of the lower spar forging spotface (indentation.) Aluminum can shear instantaneously once a crack forms, though in this case the ATSB suspects previous cracking.

Cessna 210 Wing Lower Spar Forging. Photo credit: tennesseeaircraft.net.

Example of a fail-safe spar with a riveted spar web. This might have arrested the crack, but at the cost of being twice as heavy since both the top and bottom half-spars would need to be capable of carrying the load. And all bets are off with corrosion. Diagram credit: aircraftsystemstech.com.

(In the 1995 Kobe earthquake, some of the freeway supports that failed had a steel jacket around the bottom third of the concrete supports, causing a stress riser in a ring around the support. Japan subsequently updated their building code to do full steel jackets. This spar structure failure appears to be very similar if a forging spotface was involved. Under a sideways load, the weakest part would be at the curved spar cap spotface indentation, acting as a stress riser. Under a vertical load, it would still be the same location, since both the spar caps and forgings narrow in thickness at that point.)

This is a tough one to fix, let alone inspect. The spar and other attachments are no longer made, and even if they were, that’s $86,000 into an old airplane. For these to continue to fly, somebody needs to operate these outside the USA and come up with a “massive spar” modification that’s affordable, if not TSO’ed.

Cessna 210 Spar Links

Wings – Aircraft Structures
federalregister.gov: Airworthiness Directives; Textron Aviation Inc. (Type Certificate Previously Held by Cessna Aircraft Company)
tennesseeaircraft.net: 210 Wing Lower Spar Cap Bulletin Sel-57-01 Revision 1 (2012)
avweb.com: Mask Shortage Results In Cessna 210 Spar AD Extension
generalaviationnews.com: Cessna 210 spar caps subject of new Airworthiness Directive
flyingmag.com: FAA Calls for Cessna 210 Wing Spar Inspections
aopa.org: Cessna 210 owners weigh compliance options for corrosion AD
asrs.arc.nasa.gov: Smoking Rivets
W: Spotface
faa.gov: FAA Request for Information on Cessna 210 and 177 Airplanes
pprune.org: Cessna 210 Accident Mt ISA

1995 Kobe Earthquake Freeway Collapse

The Collapse of the Hanshin Expressway (Fukae) Bridge, Kobe 1995: Soil-Foundation-Structure Interaction, Reconstruction, Seismic Isolation
fhwa.dot.gov: Aftermath of The Kobe Earthquake

Posted in Tech

Covid-19: Nobody Saw Another Corona Virus Coming?

The frequent excuse for the lack of civil preparedness for the corona virus is, “Nobody saw it coming.”

Well, actually that’s not true:

  1. BlueDot, a news alert service, notified their subscribers on Dec. 30
  2. a Texas grocery store chain with a Director of Emergency Preparedness did:
    Inside the Story of How H-E-B Planned for the Pandemic

  3. Shanghai Disneyland was closed on Jan. 24

And in the past 20 years, there’s been:

  1. SARS-1 (2002)
  2. H5N1 (2005)
  3. H1N1 (2009)
  4. MERS (2012)
  5. Zika (2015-2016)
  6. SARS-2 (2019)

Chronic epidemics, that especially affect those hospitalized:

  1. MDR-TB
  2. obesity
  3. Ebola (most years)
  4. 18 drug-resistant “hospital” bacteria per CDC

Update 2020-04-24: The Wuhan virus lab published papers on corona virus research, and appears to be the source of the pandemic. The research was funded by the US, Australian and Chinese governments. More research is needed to 100% confirm what was transmitted and when, but nobody denies this possibility now. The US DOE LLNL also confirmed this in a report in May 2020.

politico.com: Inside America’s 2-Decade Failure to Prepare for Coronavirus

A corona patient on a ventilator requires up to 21 days on the machine, unlike non-corona patients, who typically need only 3-4 days. I guess it’s somewhat reasonable to not have enough ventilators on hand for 5x as many patient-days. (Update 2020-04-12: ventilator patients have a 66% to 90% mortality rate, so ventilators were ineffective and a red herring. Cannulas (rubber hose with oxygen bottle) are now recommended as long as possible – until chronic fainting.)

But there should have been a plan on how to make/procure them if needed, since the ventilator inventory count is considered important enough by many countries to be a military secret, including the US.

Some constructive things the USA can do for next time:

  1. study coronavirus (SARS-1 and SARS-2) and understand how to test and cure them
  2. maintain a FEMA/CDC/White House pandemic department and a pandemic mgmt and communications plan, like we used to do
  3. maintain a stockpile of PPE and ventilators (California’s stockpile was not maintained due to budget cuts)
  4. talk to other countries about how they handled it. China, S. Korea and Singapore did a lot better than Western countries.

Covid-19 Timeline Info (Starting Dec. 1, 2019)

TIMELINE: The Trump Administration’s Decisive Actions To Combat the Coronavirus
WHO.int: Pneumonia of unknown cause – China
CDC.gov: Coronavirus (COVID-19)
nationalreview.com: The Comprehensive Timeline of China’s COVID-19 Lies
Wuhan shrimp seller identified as coronavirus “patient zero”
COVID-19 twice as contagious as previously thought – CDC study
Why New York has 12 times as many coronavirus deaths as California
AP report claims China knew of pandemic danger in Wuhan even as officials downplayed risk of virus
Antibody tests reveal that coronavirus infections vastly exceed official counts
Corona Virus Outbreak Dashboard
Coronavirus: What did China do about early outbreak?


Mossad officer describes covert global battle to obtain ventilators at all costs
For some survivors, coronavirus complications can last a ‘lifetime’
npr.org: Ventilators Are No Panacea For Critically Ill COVID-19 Patients
cnn.com: Doctor: Splitting a single ventilator only works for some
A Medical Worker Describes Terrifying Lung Failure From COVID-19 — Even in His Young Patients
With ventilators running out, doctors say the machines are overused for Covid-19
marketwatch.com: Why this epidemiologist is more worried about coronavirus than he was a month ago
washingtonpost.com: I spent six days on a ventilator with covid-19. It saved me, but my life is not the same.
apnews.com: Some doctors moving away from ventilators for virus patients, HN

Tracing Apps

youtube.com: Covid-19 in S. Korea

Aviation During Pandemic

Here’s Why So Many Planes Are Still Flying, Nearly Empty
Coronavirus travel restrictions for visitors to Hawaii in 2020
W: List of busiest airports by passenger traffic


Santa Clara County Coronavirus (COVID-19) Data Dashboard
COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU)
The Johns Hopkins Coronavirus Dashboard Gets 1.2 Billion Interactions a Day
Our World in Data
Rt Covid-19 Chart by State
NY Regional Monitoring Dashboard

Other News

Suspected SARS virus and flu samples found in luggage: FBI report describes China’s ‘biosecurity risk’
medium.com: What Everyone’s Getting Wrong About the Toilet Paper Shortage
The Coronavirus Called America’s Bluff
‘It’s Just Everywhere Already’: How Delays in Testing Set Back the U.S. Coronavirus Response
Best Evidence Yet that Coronavirus Came from Wuhan BSL-4 Lab
Sweden’s Relaxed Approach to the Coronavirus Could Already Be Backfiring
Here’s What You Do With Two-Thirds of the World’s Jets When They Can’t Fly
The Yeast Supply Chain Can’t Just Activate Itself
Near 90% Mortality Rate in Intubated COVID-19 Patients in NYC
36,000 Missing Deaths: Tracking the True Toll of the Coronavirus Crisis
When SARS-1 Ended
Lab-made? CoV2 genealogy through the lens of gain-of-function research
US government review of first Wuhan repatriation says safety protocols not followed
Antibody study suggests Covid-19 could be far more prevalent in the Bay Area than official numbers suggest
Silent hypoxia: Covid-19 patients who should be gasping for air but aren’t


Coronavirus has been in California ‘a lot longer than we believed’ with cases as early as DECEMBER
Wuhan lab was performing coronavirus experiments on bats from the caves where the disease is believed to have originated – with a £3m grant from the US

Disney Shanghai

Bob Iger Thought He Was Leaving on Top. Now, He’s Fighting for Disney’s Life
Shanghai Disney shuts to prevent spread of virus

Due to #COVIDー19, all TCP applications will be converted to UDP to avoid handshakes.

Posted in Business, Tech

Perl Sample Code for Geolocation Lookups with MaxMind GeoLite2

There have been a lot of changes with MaxMind’s GeoIP database:

  1. You must register for an account to download databases.
  2. The database format was changed from GeoIP to GeoIP2. The free databases are called GeoLite2.
  3. Older lookup APIs no longer work, so you must update your libraries and source code.
  4. The new API supports both IPv4 and IPv6.

There are a lot of obsolete and unclear examples online.

Here’s a complete, tested Perl sample (March 2020):


use strict;
use warnings;

use GeoIP2::Database::Reader;

my $ip = "2607:f8b0:4005:804::200e";
my $db = "/usr/share/GeoLite2/GeoLite2-City.mmdb";

# MaxMind's placeholder coordinates for when a geoip lookup fails
my $nf_lat = '37.751';
my $nf_long = '-97.822';

my ($lat, $long) = (0, 0);

eval {
    my $reader = GeoIP2::Database::Reader->new(
       file => $db, locales => ["en"]
    );
    my $where = $reader->city( ip => $ip );
    my $location = $where->location;
    ($lat, $long) = ($location->latitude, $location->longitude);
};

# $@ holds the error if anything inside the eval block died
if ($@ or ($lat eq $nf_lat and $long eq $nf_long)) {
   print "error: lookup failed on '$ip'\n";
}
else {
   print "$lat, $long\n";
}

$ perl test_geoip2.pl

metacpan.org: GeoIP2::Database::Reader
perladvent.org: Where in the World?

Posted in Open Source, Perl, Tech

Can Web Apps Rot?

Can Web apps rot?

Why yes, yes they can, mainly if you rely on cloud integrations.

I recently looked at an old web app (finished 18 months ago) and found the following:

  1. Twilio changed an API path (SMS/Messages => Messages), and restricted their free tier to one “sandbox” phone number for receiving messages. Note: Plivo’s free tier works the same way.
  2. Firefox made CORS more restrictive. The same code still works on current versions of Chrome and Safari, but no amount of configuration changes allows Firefox to call another domain from the current JavaScript on my domain:

    Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at []. (Reason: CORS request did not succeed)

  3. Google Maps changed their free tier, darkening the map and overlaying a dialog box until a credit card is provided. So I migrated to the free OpenLayers library with OpenStreetMap map data.

Free Tier After ToS Change Requiring Payment Method

Posted in API Programming, Cloud, Tech

Korg ARP 2600 Synthesizer Available Again

ARP 2600, new! lol.

It was the first portable (van instead of truck) synth, and was used in the original Star Wars movie for the “voice” of R2-D2.

ARP went out of business after some failed products, and Korg bought them.

Behringer is also doing a more affordable and compact 2600 version.

Behringer 2600 First Demos and Overview

The ARP 2600: The Story of a Legendary Synthesizer | Reverb Feature
W: ARP 2600

Sweetwater Demo – Daniel Fisher: “I’m playing with a prototype here.” lol. How can you tell?

Arturia ARP2600 V Analog Synthesizer Software Instrument

Posted in Tech, Toys

Meetup: New Security Features in Redis 6

Redis Labs Security Product Manager Jamie Scott talked at the Redis Meetup today about “New Security Features in Redis 6 Open Source.”

Because of the Corona virus, the lecture was streamed on YouTube instead of presented to a live audience in the Redis Mountain View office.

The new security features in Redis 6 are:

  1. ACLs – defines users, passwords, access. Errors are logged and viewable.
  2. TLS now built-in, so stunnel, etc. no longer needed. Available for client, cluster and replication encryption.

Combined with Redis databases and namespaces, ACL users provide granular authentication and permissions.


James’ Comments on Compliance

From a security compliance standpoint, the new Redis security features help with:

  1. TLS addresses the encryption-in-transit requirement. Some stunnel users reported that it was 3x slower than patching TLS libraries into the Redis server directly, so this is a huge win considering that for many users, Redis is used as a high-performance cache. It also provides an alternative to paying for Enterprise or AWS ElastiCache licenses.
  2. ACL users address the requirement to not use administrative passwords and to have least-privilege access.
  3. ACL users potentially address the key rotation requirement, if you add a new user/password, then expire the old user/password on a schedule. This would avoid caching layer interruption during the switchover, and lets you use infrastructure-as-code tools to first add the new user/password, then lazily update the application configuration to use the new credentials in the next release, then later drop the old user/password.
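
The rotation sequence in item 3 can be sketched as a small Python helper. It only builds the Redis 6 commands for each phase and does not connect to a server. ACL SETUSER, ACL DELUSER and the two-argument AUTH are real Redis 6 commands, but the user names, password, key pattern and granted commands below are made-up placeholders, not anything from the talk:

```python
def rotation_plan(old_user, new_user, new_password, key_pattern, allowed):
    """Build the Redis commands for each phase of a credential rotation."""
    grants = " ".join("+" + name for name in allowed)
    return [
        # Phase 1: add the new user next to the old one (both work for now).
        ("add new user",
         "ACL SETUSER {} on >{} ~{} {}".format(
             new_user, new_password, key_pattern, grants)),
        # Phase 2: the next application release logs in with the new user.
        ("app switches over", "AUTH {} {}".format(new_user, new_password)),
        # Phase 3: once nothing uses the old credentials, drop them.
        ("drop old user", "ACL DELUSER {}".format(old_user)),
    ]


# Hypothetical names and password, for illustration only.
plan = rotation_plan("cache_app_v1", "cache_app_v2", "n3w-s3cret",
                     "cache:*", ["get", "set"])
for phase, command in plan:
    print(phase + ": " + command)
```

Because the old user stays valid until phase 3, the cache keeps serving requests during the whole switchover, which is the point of rotating users instead of changing one shared password.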

(Box wrote a proxy to accept remote TLS connections, then talk to Redis server on localhost. The proxy also managed password rotation by allowing old and new passwords during password rotation.)

mikeperham.com: Storing Data with Redis (2015)
zdnet.com: SXSW, Google I/O, Facebook F8 and more 2020 tech conference cancellations and travel bans due to coronavirus fears

Redis Labs, Inc
700 E El Camino Real #250 · Mountain View, CA
Posted in Open Source, Tech, User Groups

Music: Affordable Guitar Rack

I recommend ebay seller alienbid’s “New 9 Folding Multiple Guitar Bass Holder Rack Display Stand Black” if you need an affordable guitar rack.

It’s about $19.79+tax, including free delivery.

The rack is well-made from steel, and is strong. It feels like a precision instrument as you thread the bolts into place. No tools are needed, as the bolts are thumb screws.

No extra parts are included. There is a half-page assembly guide but it’s not that helpful. I recommend having somebody help with holding parts during assembly. That way you will save a lot of time and avoid stripping any bolts.

I’d prefer if the neck dividers were cupped instead of pegs. Currently I use padded mailing envelopes to ensure there’s no “stand rash” in case a guitar twists sideways and touches the next one.

I stripped one “bracket screw” during assembly, so I sent an ebay message to the seller. He replied the next business day (in the morning), and I received a complete set of bracket screws in a bag via USPS five calendar days later for no charge. I consider that excellent customer support.

It’s steel, so requires no maintenance, although I’d avoid direct sunlight on the foam parts.

I have no idea how they can make or ship this for $20. I should get a second one. 🙂

Trogly: Gibson Zither Single Guitar Stand Review (requires oiling, humidity control and avoid direct sunlight. Nitro-safe’ish. $200 and up.)

Posted in Tech

Boeing Pushes Estimated MAX Return To Midyear 2020

I predicted in early 2019 that the 737 MAX wouldn’t fly in 2019, based on my engineering and commercial pilot rating experience.

I was right, and most other pundits were very wrong.

Boeing lost $18 billion total on the 737, resulting in a $5 billion overall company loss in 2019 after parking 400 new and about 389 customer aircraft, and fired their CEO, as well as the manager for pilot training. Airbus is now the #1 global manufacturer, climbing from 5% of sales to more than half.

However, Boeing must be a truly dysfunctional organization to still not be ready to fly as we approach March 2020.

What’s interesting is that since Boeing said a sim wouldn’t be needed, there basically aren’t any available globally in 2020.

Additionally, more problems have been found, including wire harness chafing and FOD in fuel tanks. If the FAA decides those issues must be fixed, there’s a small but non-zero chance those 789 parked planes (389 delivered and 400 undelivered) will be scrapped.

Boeing is in an enviable position where the US military and world airlines need them, but that’s in spite of Boeing management, not because of them. In the early days of aviation, a company founder could have stepped in to right the ship. Unfortunately, those days are long gone.

avweb.com: Boeing Pushes Estimated MAX Return To Midyear, Boeing Finds FOD In Stored MAX Aircraft
reuters.com: Boeing proposal to avoid MAX wiring shift does not win U.S. support
Boeing Built Deadly Assumptions Into 737 Max, Blind to a Late Design Change
newrepublic.com: boeing-737-max-investigation-indonesia-lion-air-ethiopian-airlines-managerial-revolution
youtube: Inside the Boeing 737 MAX Scandal That Rocked Aviation | WSJ – Mar 10, 2020, How Boeing Will Get the 737 MAX Flying Again, What has happened to Boeing?!

Posted in Tech

Yamaha DX7 Links

Mark 1 (12-bit DAC), Logo “DX7”, pastel, membrane

Insane custom patches – 80’s Hits, Van Halen Jump, Rush Subdivisions, Rush Tom Sawyer, UK Alaska, Berlin
Herbie Hancock Yamaha DX7 Vintage Demo 80s
Woody Piano Shack: Yamaha DX7 versus DX7ii (Mk2, DX7S)

Mark 2 (16-bit DAC), Logo “Yamaha DX7”, black

DX7s, buttons, black

Woody Piano Shack: Yamaha DX7 does hits of 2017 | Ed Sheeran, DJ Snake, Major Lazer and more


DX7 Centennial (76 keys, silver, 300 made)

Gear Forum: Yamaha DX7 or DX7S???

Note: Mark 1 and Mark 2 cartridges are different sizes.

How to Play Guitar Parts on Keyboards with Daniel Fisher (MODX) @15:31 “Do You Feel”

Yamaha MG10XU

How to replace the battery in a Yamaha DX7s
How to restore Yamaha DX7s factory voice and performance data

Posted in Tech

Music: Understanding the Steinberg UR22C Power Requirements

The Yamaha/Steinberg UR22C audio interface just came out in 2019, so I haven’t seen detailed information on it, aside from what’s in the manual.

I did some research, and the UR22C can be powered in the following 3 ways on MacBook Pros:

  1. using the included cable on Apple MacBook Pros using USB 2.0 (tested on “early 2011” MBP. “System Report … USB … Yamaha UR22C” says “Power Available: 500 mA and Power Required: 500 mA”)
  2. newer Apple MacBook Pros that support USB 3.0
  3. external 5 volt/500 (or more) milliamp mini-USB adapter (mentioned in the UR22C manual, but the adapter is not included, and not tested by me) – nobody has heard of an official adapter, so you very likely don’t need this.

The list above should also be a good starting point for Windows or Linux.

Another thing to know is that Mac OSX supports “aggregate devices”, so if you have an existing audio interface like a 22 (2 in, 2 out) and buy another 22, you can combine them into a 44 using the Mac OSX “Audio MIDI Setup” utility. 🙂

steinberg.net: Steinberg UR22C Manual
support.apple.com: If a Mac accessory needs more power or is using too much power

Posted in Tech

Installing Cubase AI 10.5 on Mac OSX

Most Yamaha music instrument and audio interface products come with a free license for the Yamaha/Steinberg Cubase DAW Windows and Mac OSX software.

The Cubase AI installation process is obnoxious and lengthy, but this tutorial should help save you time and decrease anxiety.

(I installed a Steinberg UR22C today for a Yamaha MX49 on High Sierra, but Cubase is not device-specific and is quite backwards-compatible with older versions of Mac OSX.)

The steps are:

1. Find your Cubase AI license key and serial number of the device
2. Be prepared to spend the time to download 15 GB plus 2 hours for the software installation. You should have 45 GB free disk space – note this could be a problem on 128 GB disks unless it’s a new computer.
3. Plug in your computer and disable the screensaver or system sleep function
4. Follow the Cubase download instructions, which will involve creating a Steinberg account, doing email verification, and downloading the download helper (about 100 MB). Save the license activation code (32 hex digits) using TextEdit.
5. Open the download helper and choose Cubase AI full install.
6. It will start downloading 15 GB and will show the download rate. You can pause it and continue reliably if you need to select a faster wifi connection.
7. After downloading it will start “Verifying.” Wait up to 30 minutes for this to finish. If it takes longer, quit the installer program, look in the Downloads folder for a 15 GB file, and open it in the Finder. A window will open that you can click on to run the program installer.
8. It worked if you see a big green checkmark and the message “The installation was successful.”
9. Drag the Cubase icon to the Dock.
10. Open the Cubase icon. It will start eLicenser and begin license activation. Paste in the code from #4.
11. Open the Cubase icon again. Test Cubase. If successful, you can delete the 15 GB dmg file from Step #7.
12. Click on the download button or link for Mac (10.12 – 10.15) and Yamaha audio firmware updates as needed.
13. Restart.
14. Open Cubase and select the audio driver for your audio interface. Note that you can switch audio interfaces during a Cubase session.

Posted in Tech

Change Data Capture at Netflix with DBLog

Netflix Tech Blog has a post on a relatively new change data capture tool called DBLog.

When I worked at Netflix a while ago, they also had a hacky tool that would find changes in one database and copy them to another. Although hacky, it was effective, and the key thing was that it only copied changed rows in a window older than 10 minutes. For that use case, most update churn was under 5 minutes, so older rows were already quiescent. 🙂
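
Here’s a rough Python sketch of that “only copy rows that have gone quiet” idea. It is not the Netflix tool or DBLog. The updated_at column, the checkpoint argument and the function name are my assumptions for illustration, and only the 10-minute window comes from the anecdote above:

```python
from datetime import datetime, timedelta

QUIESCENT_AFTER = timedelta(minutes=10)  # window size from the anecdote above


def rows_to_copy(rows, last_copied_until, now):
    """Return rows changed since the last run that have been quiet for 10+ minutes."""
    cutoff = now - QUIESCENT_AFTER
    return [r for r in rows if last_copied_until < r["updated_at"] <= cutoff]


now = datetime(2020, 3, 1, 12, 0)
rows = [
    {"id": 1, "updated_at": datetime(2020, 3, 1, 11, 30)},  # quiet for 30 minutes
    {"id": 2, "updated_at": datetime(2020, 3, 1, 11, 57)},  # still churning
    {"id": 3, "updated_at": datetime(2020, 3, 1, 10, 0)},   # copied by an earlier run
]
ready = rows_to_copy(rows, last_copied_until=datetime(2020, 3, 1, 11, 0), now=now)
print([r["id"] for r in ready])  # prints [1]
```

Row 2 is not lost. It is picked up by a later run once it has been untouched for 10 minutes, so each row tends to be copied once instead of once per update.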

netflixtechblog.com: DBLog: A Generic Change-Data-Capture Framework

Posted in Tech

Slick SL57 Guitar

Got a Slick SL57 Aged Vintage Sunburst Maple Fingerboard Alnico Pickups with camo hard case for $259+$59.

Slick is famous/notorious for guitar prices being less than cost of hardware parts. And automotive paint.


– arrived naked in case
– not much case candy, just tremolo bar, screwdriver and invoice
– yup, that looks like automotive paint!
– quick shipment
– Made in China sticker on back of headstock.

Slick Hard Case

– camo looks good. See how it’s perceived in public later 🙂
– hardware very bright gold-colored.

squier-talk.com: NGD Slick SL57 Strat Review

Posted in Tech

tcpdump Tips

I tend to use tcpdump when working on remote servers with multiple services running.

Thus it’s important to specify exactly which hosts and ports I want to see, or I end up buried in output.

Here’s a canonical example of looking at output from a REST service on a remote host that’s listening on port 8080:

# CentOS 6/7

sudo yum -y install tcpdump
sudo tcpdump -nn -vv -i eth0 -s0 -A host X.X.X.X and port 8080

Options decoder: -nn is no name or port lookups, -vv is more verbose output, -i is network interface, -s0 is unlimited snapshot length, -A is ASCII output.

# Mac OS X

brew install tcpdump
sudo tcpdump -nn -vv -i en0 -s0 -A host X.X.X.X and port 8080

A more advanced example is to capture only HTTP data packets on port 8080, skipping the empty TCP session setup and teardown packets (SYN / FIN / ACK):

sudo tcpdump 'tcp port 8080 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
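
The filter above keeps a packet only when its TCP payload length is non-zero: IP total length, minus IP header length, minus TCP header length. Here’s a minimal Python sketch of the same arithmetic on a raw IPv4 packet (no link-layer header). The helper names and the hand-built test packets are mine, and only the header fields that the filter reads are filled in:

```python
def tcp_payload_length(packet):
    """Return the TCP payload size of a raw IPv4 packet, like the filter does."""
    total_length = int.from_bytes(packet[2:4], "big")           # ip[2:2]
    ip_header_len = (packet[0] & 0x0F) << 2                     # (ip[0]&0xf)<<2
    tcp_header_len = (packet[ip_header_len + 12] & 0xF0) >> 2   # (tcp[12]&0xf0)>>2
    return total_length - ip_header_len - tcp_header_len


def make_packet(payload):
    """Build a minimal IPv4+TCP packet with 20-byte headers and the given payload."""
    ip_header = bytearray(20)
    ip_header[0] = 0x45  # version 4, header length 5 words (20 bytes)
    ip_header[2:4] = (20 + 20 + len(payload)).to_bytes(2, "big")
    tcp_header = bytearray(20)
    tcp_header[12] = 0x50  # data offset 5 words (20 bytes)
    return bytes(ip_header) + bytes(tcp_header) + payload


handshake = make_packet(b"")                       # e.g. a bare SYN or ACK
request = make_packet(b"GET / HTTP/1.1\r\n\r\n")   # carries HTTP data
print(tcp_payload_length(handshake))  # prints 0, so the filter drops it
print(tcp_payload_length(request))    # prints 18, so the filter keeps it
```

The shifts convert the header-length fields from 32-bit words to bytes: the IP field is the low nibble of byte 0 (times 4), and the TCP data offset is the high nibble of byte 12 (divided by 16, times 4).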

Some tips:

  1. use tcpdump, wireshark, telnet, netcat and nmap periodically so that you’re familiar with your environment, and proficient when you need them. Ensure you have permission for those hosts!
  2. if you’re using ssh, ensure you’re not dumping port 22 to avoid excessive display output
  3. seeing zero-length responses is worth paying attention to, but some listeners just work that way
  4. seeing duplicate responses is also worth paying even closer attention to as that usually indicates an unexpected problem (unless you’re dumping on localhost)
  5. seeing unexpected responses from port 3128 usually means there is a proxy
  6. if you nuke your terminal with binary data, try the clear or reset commands, or redirect to a file
  7. on a physical server you can see actual packets with MTU size, but in virtualized/cloud environments they will likely be merged before you can see them.
  8. when things just don’t seem to make sense, the ip route, tcproute and telnet commands take you down to bedrock.

Using tcpdump to see HTTP requests and responses
FAQ: Questions about CentOS 7
A Previous Post

Posted in Open Source, Tech

Odd Band Relationships

Every band has drama, but these are especially notable:

  • EVH played Michael Anthony’s bass parts on the Van Halen albums. Also, their songs went pop so that David Lee Roth could try to get a #1 hit.
  • Smashing Pumpkins was really two different acts: Billy Corgan was the main force behind the albums, but the tours were a more equal partnership with the instrumental players. D’arcy was originally a guitar player before meeting Corgan, but he asked her to join as a bass player, which is why she’s considered an “elementary bass player.” Corgan often re-did D’arcy’s bass parts on the albums.
  • Bon Jovi’s record label contract was only with Jon (his cousin owned a studio, so Jon had access to pro-level facilities, session musicians and business advice while still a teen.) The other Bon Jovi band members were his employees. You can hear that in his interviews, as every sentence starts with “I …”.
  • Journey’s Steve Perry fired their bass player and drummer in favor of a drum machine. Original founding member Neal Schon later said, “A drum machine? I still don’t get it.” and Jonathan Cain said, “If you’re a band, be a band.”
  • The Go-Go’s, the first commercially successful all-female band, broke up because their main writer, Jane Wiedlin, made all the money.
  • Before Dave Grohl joined as their drummer, Nirvana was known as “2 musicians and a drummer.”
  • Guns N’ Roses replacement members like Gilby Clarke were contract players. Later Axl told the original band members to either sign over the band name or he’d walk, so they did.
  • Peter Cetera left Chicago because the band mgmt. wanted to keep members as anonymous cogs, and as lead vocalist he was obviously the frontman.
  • Iron Maiden has a 4-part series on how difficult it was to maintain a full line-up. Truly epic drama.
Posted in Tech

Storm Ciara Helps Plane Beat Transatlantic Flight Record

I love the jet stream. Some factoids:

  1. one of the Greek airlines used to always fly in the jet stream for decades, no matter how rough.
  2. I had one flight from SFO to Manila in under 10 hours
  3. here’s a new BA subsonic airliner record:

“Experts are hailing a British Airways flight as the fastest subsonic New York to London journey.

The Boeing 747-436 reached speeds of 825 mph (1,327 km/h) as it rode a jet stream accelerated by Storm Ciara.

The four hours and 56 minutes flight arrived at Heathrow Airport 80 minutes ahead of schedule on Sunday morning.

According to Flightradar24, an online flight tracking service, it beat a previous five hours 13 minutes record held by Norwegian.”

bbc.com: Storm Ciara helps plane beat transatlantic flight record

Posted in Tech

EMB-170/175 Airliner Wire Chafing

An old problem, but you have to wonder how the installers and inspectors overlooked this. 10 aircraft means they didn’t know what they were doing:

“Investigators found a chafed wire in the captain’s control column caused by contact with an untucked safety wire pigtail. Republic inspected its other EMB-170/175s and found nine other aircraft with similar chafing.”

avweb.com: Wire Chafing Checks Recommended After EMB 175 Crew Briefly Loses Pitch Control

Posted in Tech

Ukraine International Airlines Flight 752

Things that make you go, “Hmmm …”:

  1. How do you confuse a scheduled airliner on takeoff with a cruise missile?
  2. How was the airliner “unidentified”?
  3. Does the Iranian air defense system actually work?
  4. Was there confusion that an adversary was flying next to the airliner?

It sounds like Iran, which is a regional industrial power, doesn’t have a functioning air traffic control system.

“Ukraine International Airlines Flight 752 (PS752) was a scheduled international passenger flight from Tehran to Kiev operated by Ukraine International Airlines (UIA). On 8 January 2020, the Boeing 737-800 operating the route was shot down shortly after takeoff from Tehran Imam Khomeini International Airport. All 176 passengers and crew were killed; it was the first fatal air accident for Ukraine International Airlines.”

“When the Revolutionary Guards officer spotted what he thought was an unidentified aircraft near Tehran’s international airport, he had seconds to decide whether to pull the trigger.

Iran had just fired a barrage of ballistic missiles at American forces, the country was on high alert for an American counterattack, and the Iranian military was warning of incoming cruise missiles.

The officer tried to reach the command center for authorization to shoot but couldn’t get through. So he fired an antiaircraft missile. Then another.

The plane, which turned out to be a Ukrainian jetliner with 176 people on board, crashed and exploded in a ball of fire.”

nytimes.com: Anatomy of a Lie: How Iran Covered Up the Downing of an Airliner
W: Ukraine International Airlines Flight 752
Israel Accused Of Hiding Behind Airliners In Missile Strikes

Posted in Tech

Kinko’s Campbell as a Third Place

Kinko’s Campbell used to be 24 hours around 2004, and as the only place open, had a vibrant late-nite community of regulars.

Later the hours were reduced by the regional manager, and then Fedex bought the chain.

On most nites, an eclectic group filtered in and out of the store:

  • saxophonist/chess player (He lived in a large customized van. I read a report of him possibly being killed by a light rail train.)
  • contract accountant
  • chef (with legal problems)
  • real estate title researcher
  • software developer (me) using MIRC chatbot scripts
  • some guy doing email address verification with a laptop left under a desk. 🙂

Kind of like “Friends” meets “Silicon Valley.”

Everybody was friendly – I don’t remember a single cross word ever being said. One chess set went missing one nite, and was presumed stolen.

In the early morning, a construction crew assembled, in full work gear and galoshes, with blueprint rolls.

The reduction in store hours ended the community.

Also, after the Bascom light rail station was built, it seemed like the atmosphere changed in the area. Staff had to deal with a couple mentally ill people wandering in and trashing the restroom, for example.

One other nearby store location removed all their chairs for similar reasons shortly afterwards.

Some related anecdotes:

The VTA had a cleanup crew in white spacesuit-like hazmat suits at Bascom station each morning to disinfect the elevator there. Surreal.

Once they had to dismantle a massive “log cabin” built overnight on the pedestrian bridge, which was a fire hazard to the structure. The “builder” could have just done it at ground level and nobody would have noticed for weeks.

Clickaway complained to the store manager over anti-competitive “free training” for customers of desktop publishing software, so he forbade employees from assisting customers with document creation. I guess they felt “entitled to a profit.”

Posted in Retro, San Jose Bay Area, Tech

BOSS GT-1000 and BCB-60 Pedal Board Review and Tips

I’ve been thinking about what case or pedal board to use with my BOSS GT-1000 guitar effects processor.

I looked around online for reviews but didn’t see much as of Nov. 2019, aside from the BOSS ME-80 nylon case.

The GT-1000 is a moderately large unit, but still smaller than the Line 6 Helix or Headrush, which are briefcase-sized.

My first experiment was to see if it would even fit in the BOSS BCB-60 pedal board case.

TL;DR: Yes, it fits!

Although the BCB-60 was never intended for the GT-1000:

– it fits fine, almost to the millimeter on height, after removing the pads and main cable raceway
– the included AC adapter is the same as the one provided with the GT-1000, so free backup.
– the leftover space is enough for 2 standard full-sized pedals, or one pedal and a wah pedal
– the top spaces (near the handle) can hold accessories such as the AC adapter or a non-Boss string tuner
– easy to configure, like a Lego set. Only tools needed are a Phillips screwdriver and scissors.

Since the GT-1000 is already a pedal board, does it make sense to store it in yet another pedal board?

Using the GT-1000 alone (naked), probably not.

But … yes, if:

– you want some protection for your GT-1000, which is a major investment for most players
– you want to permanently store or cable 2 more effects pedals, or 2 pedals and a non-Boss tuner, or three of the new mini pedals
– you plan to transport it in a car or van and like the form factor and rugged molded-in handle
– you don’t want to use a soft case for whatever reason
– you plan to use it at home, or gig monthly rather than daily.


Pros

– good value with an amazing set of 7 L-cables, one daisy-chained power cable and an extra pad
– easy to configure – more like Lego, not a research or science project
– Boss has amazing warranties and compatibility. You can’t accidentally plug in the wrong voltage and damage your gear if it’s all Boss.


Cons

– plastic case with poorly-made plastic latches that will break off sooner rather than later.
– no velcro or mounting tape included, but not needed except for pro gigging
– plastic case has prominent Boss logo
– doesn’t have power bus across top of case, just a point source
– most of the screws are thumbscrews, but you will need a table and good lighting to thread them
– plastic lip to step over
– “external” AC adapter rather than built-in


Tips

– use the sharpest knife you can get to cut the pads. Use one stroke. The remnants can be used to pad the GT-1000 edges. One pad is not pre-patterned.
– for best results the first time, ensure you place all the pedals and completely cable them before cutting foam
– BOSS and JHS analog pedals fit exactly into the pad cutouts, but MXRs do not fit snugly.
– keep the original product and shipping boxes if you want to ship safely later. Remember, plastic case.
– the BOSS FS-7 foot switch mates nicely with the GT-1000, if you need a switch
– there is a 1” plastic case lip to step over, so try that out first.
– the BCB-60 is fine if you plan to fill it up. Otherwise, there’s no point in the extra weight and expense
– since the BCB-60 is so easy to configure and cable, a viable option to maximize space is to put 2 pedals inside the case and a wah pedal outside the case. Or store the power cable in a bag.
– I’m looking for 9” L-cables, as the included 12” are a little too long and take up too much space.

I’m finding the result intriguing, and will continue experimenting with the GT-1000 and BCB-60. If it doesn’t work out, I’ll definitely use it for an all-analog board, where the built-in cable raceway and junction boxes would shine.

Other Options

– the GT-1000 is a pedal board, so just use a bag, with pockets even
– you can go a long way with just a Zoom multi-effects processor. The Zoom G3Xn is $220, and the Zoom G5n includes an audio interface for $330. They’re compact and cheap enough that you can just throw them in any bag or backpack that you already have lying around.
– Headrush has a user-friendly color touch-screen. See Ola’s demo.

BOSS GT-1000 Download Site

Thon Case Boss GT-1000
Gator GPT-PRO Pedal Tote Pro Pedal Board with Carry Bag
Boss CB-ME80 Carrying Bag for ME-80 and GT-1000 Multi-Effects Processor
On-Stage GPB2000 Compact Pedal Board
Audio Volume Controller – Aluminum Alloy CNC – Control Knob Lossless Speaker Line

GT-1000 Features

– no delay between presets
– no dsp lag with effects
– simplified modeller amp names (crunchy, classic, clean, etc.) rather than fake specific names
– BOSS compatibility (voltage, etc.)

How To Use The Boss GT-10 As An Audio Interface

tonymckenziecom: Roland Boss GT 1000 Inside and Out Review & Editor

Player Reviews

Leon Todd – BOSS GT-1000: First Impressions, 4CM Tones (Effects Only), Dial it in, Exploring the Amp Models
Leon Todd – BIAS FX 2: New Amps, Effects, Pickup Profiling & More (BE 101 @16:55, tri-chorus @18:03)
Brett Kingman GT-1000: My first day with it., How I’ve Using Mine Lately
Dagan: First Impressions

Posted in Tech | Leave a comment

Frank Denis says Stop using ridiculously low DNS TTLs

One of the best DNS posts I’ve seen in the last decade, on short TTL issues. It’s also a great tutorial on how to do data-driven technical analysis.

As a DNS administrator, I knew it was bad, but not that bad. 🙂

I like to use 1 hour TTL during normal times and 1 or 5 minutes during moves, along with a box left behind to redirect requests. But as the article points out, it’s easy to forget to up the TTL after the move.

One of the interesting asides is that if you’re going to use super-low DNS TTL’s, then what about your application cache TTL’s? 🙂

Off-topic, but Frank’s blog post on how he lost weight is quite interesting. At first I thought, “this guy really likes to go overboard,” but large-scale studies of dieters show that diets almost entirely fail, yet he succeeded.

Stop using ridiculously low DNS TTLs HN

Benchmarking the top-level domain names HN

Posted in Toys | Leave a comment

Space Reality Check

Gee, this guy is even more brutally honest than I am:
5 Horrifying Facts You Didn’t Know About the Space Shuttle

Also, an EASA administrator was roasted for his answer that reusable spacecraft would reduce jobs, but it looks like SpaceX is doomed financially because there aren’t enough missions to make low-cost launches financially viable for its 5,000+ staff.

Which is why SpaceX is polluting the sky with microsatellites now.

Around 1998, I had a chance to talk to a de Havilland aircraft engineer. It went like this:

Me: “Since we lose money on every airplane built, why not just close the Toronto factory and do design work?”
de Havilland Engineer: “because without a jobs scheme, there won’t be any budget for design work.”

Of course, then Bombardier (a Quebec jobs scheme) gave the A220 design to Airbus in 2017/2018, so who was right? 🙂

The State of Galileo as seen by an outsider

From my studies of the X-planes program, that was the USA’s most successful and cost-efficient aeronautics program. The research ended up in every civilian and military airplane built.

The pause in the X-planes program in favor of the rocket program was a huge mistake that set back hypersonic flight by decades. You could make a list of 100 advancements due to the X program that essentially cost nothing.

Posted in Tech | Leave a comment

Major Accident Outside Stonestown Galleria Mall

Late Friday night (Oct. 25), I happened to be looking towards the 19th Avenue and Winston Drive intersection outside the Stonestown Galleria Mall, SF, when a major accident occurred.

A car travelling at high speed ran into another car that was turning in the intersection. There was a loud bang, and an orange fireball (just like a Hollywood movie) appeared over one car. The cars moved about 10 yards, and one car remained in the road while the other stopped on the sidewalk. I noticed a police car stopped before the intersection immediately (i.e., instantaneously) after the accident. There was a lot of smoke drifting over the road.

The collision was intense and violent. It happened so fast that very few details beyond that were apparent. I remember thinking, “I hope the people in the 2 cars are ok” and “I don’t need to see something like this.”

Cars continued to drive and turn through the unblocked lane next to the accident. I remember the distinct crunching of plastic and metal parts that were on the road.

Several more police cars arrived within a minute or two while one policeman emptied a small fire extinguisher at the car. I was impressed with the speedy arrival of so many police cars, and also the heroism that several policemen quickly approached the smoking car on a street full of traffic.

(According to the skid marks I saw the next day, it was the high-speed car that ended up on the sidewalk. It applied brakes heavily but briefly, then veered at a 45 degree angle from its lane, over a small median island, onto the sidewalk, and dug into the grass, which stopped it.)

Map of accident at 19th Avenue and Winston, SF

I guess the questions I have now are:

  1. if a car explodes, does that mean all the gasoline has been consumed and it’s safe to approach?
  2. how are the occupants doing?
  3. the police did a great job – after all, this was like a war zone on a busy street. I wonder if there should be more procedure to always carry fire extinguishers towards a burning car, or carry gloves, but they’re not firemen.
  4. Was this a police chase? That would explain the high rate of speed and proximity of so many police cars.
Posted in San Jose Bay Area | Leave a comment

Migrating SQLite Databases to MySQL

Recently I had to migrate a relatively simple (no FKs or views) Grafana 4 database of about 20 tables and 80,000 rows from sqlite3 to mysql5.6. Below are some notes I made on the 3 methods I tried.

  1. First up was MySQL Workbench on Mac OS. This required first finding and installing a working ODBC driver for sqlite3, which was a hassle. (The Devart one worked.) Then configuring it in the Mac’s ODBC Manager. After clicking a few UI buttons, the import failed with an error message for each table and Workbench crashed, as usual for the past 2 decades. Tried again with another crash. Moving on …
  2. Next up was Navicat Premium Trial Edition. It was able to create a sql file that was syntactically valid with MySQL but had 3 major problems:
    1. it applied the TEXT cast to most of the string and date values
    2. which then meant it thought it had to declare most of the character and date columns as LONGTEXT and LONGBLOB
    3. which meant the index definitions using those columns were invalid because of length. In other words, a mess.
  3. Finally I used a bash script that enumerated the table names using .tables, then used .mode insert to dump each table as individual INSERT statements. That worked fine.
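
The same idea as method 3 can be sketched in Python with the standard sqlite3 module. This is only an illustration, not the original bash script: the function names quote_mysql and dump_inserts are made up for the example, and the quoting is simplified (it assumes plain numbers, strings, blobs and NULLs).

```python
import sqlite3


def quote_mysql(value):
    """Render a value read from SQLite as a MySQL literal (simplified)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, bytes):
        return "X'" + value.hex() + "'"
    # Escape backslashes first, then double any single quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "''")
    return "'" + text + "'"


def dump_inserts(conn):
    """Yield one INSERT statement per row for every user table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    for table in tables:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        columns = ", ".join(f"`{col[0]}`" for col in cursor.description)
        for row in cursor:
            values = ", ".join(quote_mysql(v) for v in row)
            yield f"INSERT INTO `{table}` ({columns}) VALUES ({values});"


if __name__ == "__main__":
    # Tiny in-memory database standing in for the real file.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE user (id INTEGER, login TEXT)")
    demo.execute("INSERT INTO user VALUES (1, 'admin')")
    for statement in dump_inserts(demo):
        print(statement)
```

The table definitions still have to be created on the MySQL side first; this only produces the row data.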

Some other ideas to consider for more complex migrations are:

  • Use the Amazon database migration tools. There are at least 3 that I’m aware of.
  • Use the professional-grade pgloader to migrate to PostgreSQL, then use another tool to pivot to MySQL.

The MySQL grants you need for grafana are:

GRANT USAGE ON grafana.* TO 'grafana'@'%' IDENTIFIED BY 'mypw';
GRANT ALL PRIVILEGES ON grafana.* TO 'grafana'@'%';

There is a problem with the Grafana initial table creation script: it uses CREATE TABLE IF NOT EXISTS for the tables, but has no equivalent guard for the indexes, so you see this:

CREATE UNIQUE INDEX `UQE_user_login` ON `user` (`login`);

Posted in MySQL, Open Source, Tech | Leave a comment

PSA: Running Mac OS X csrutil Without a Recovery OS Partition

This blog post contains my notes for advanced users (Mac OS software developers and IT staff), not end-users.

Starting with Mac OS X El Capitan (10.11), some low-level system operations, like setting the NVRAM, are restricted by default by System Integrity Protection (SIP), which is controlled by the csrutil command.

However, the csrutil disable (or enable) command can only be run in a terminal after booting into a recovery OS partition (officially) or an installer (works for me, see below.)

Machines with a cloned OS often don’t have a recovery partition, or if they do, it doesn’t show up in the Startup Manager interfaces.

Even when there is a hidden recovery OS partition, OS versions starting with El Capitan have removed the Disk Util Debug menu option that could enable a hidden recovery OS partition (apparently to hide the EFI partitions). (Leave a note if you’ve used an older version of Diskutil to enable it. 🙂 )

You have 5 options of varying difficulty to choose from if you don’t want to do a full re-install of Mac OS X:

1) not that ez – install (ie. fix) your recovery partition, as recommended in most Internet How-to’s. Good luck, since most of the How to’s are incoherent. A suggestion would be to find one that talks only about one OS, preferably yours, instead of 4 or 5 versions.
2) ez – boot into a USB Mac OS El Capitan (or higher) installer, but don’t install. Just open the OS X Utilities ... Terminal menu and try csrutil disable. This worked for me, but if not, #5 below also worked for me.
3) didn’t try – install the recovery partition to an external drive. My understanding is that this is intended for the Mac mini. Supposed to be ez.
4) didn’t try – try booting into a Lion installer and use the old Diskutil with debug mode to enable the hidden recovery HD partition. If you want to try that, boot into a Lion installer then open Terminal and type the following command, then open Diskutil last:

$ defaults write com.apple.DiskUtility DUDebugMenuEnabled 1

5) super ez – boot into a USB installer, use Utilities … Diskutil … resize your original drive for a new 25 GB OS partition and install to it. Since you’re doing a fresh install, a recovery HD partition will be automatically created. Boot into that with Option+R, run csrutil disable, and reboot again to activate. This sounds kind of round-about, but is really easy, idiot-proof, and worked for me.

You’ll end up with something like this:

$ diskutil list
/dev/disk0 (internal, physical):
#:                  TYPE NAME         SIZE     IDENTIFIER
0: GUID_partition_scheme             *500.1 GB disk0
1:                   EFI EFI          209.7 MB disk0s1
2:             Apple_HFS HardDisk     474.6 GB disk0s2
3:            Apple_Boot Recovery HD  650.0 MB disk0s3 # can't see
4:             Apple_HFS Untitled      24.0 GB disk0s4
5:            Apple_Boot Recovery HD  650.0 MB disk0s5 # from option #5
$ csrutil disable
Successfully disabled System Integrity Protection. Please restart the machine for the changes to take effect.
$ csrutil status
System Integrity Protection status: disabled.

Of course, when you’re done, run csrutil enable because that’s the default, and it protects the NVRAM from malware.

Terminology: “recovery OS partition” and “recovery HD partition” refer to the same thing, but the first is conceptual and the second is an actual disk partition intended for recovery and contains the recovery OS.

Also, “Disk Util” is the Mac OS app, “Diskutil” is the on-boot menu option and “diskutil” is the CLI program. The names vary, but they all do the same thing in different environments.

developer.apple.com: Configuring System Integrity Protection
W: System Integrity Protection

Posted in Tech | Leave a comment

Hadoop Has No Business Model

I had a chance to talk to a Big Data sales manager 5 years ago.

He said, “Hadoop is a tough sale on the East Coast. Hoping Spark will help.”

Now, MapR is facing insolvency and Cloudera/Hortonworks isn’t doing great.

Some of the reasons Hadoop has lost its lustre:

  1. Commercial Hadoop is licensed per server ($5,000+) times the number of cluster nodes ($millions)
  2. Hadoop requires rewriting any current reporting jobs using Java/Mapreduce. Large companies tend to not formally budget for maintenance or re-QA of existing applications
  3. Hadoop jobs tend to be duplicated for different departments and salespersons compared to dedicated internal reporting projects. Yahoo! went from 10 servers to around 1,000 Hadoop nodes for one of their datawarehouses
  4. Google, the inventor of MapReduce, used C++, which resulted in 3x faster results than Java, and has moved on to other systems, like Pregel for graph processing
  5. vendors declined to monetize their distributed file system as a stand-alone product, resulting in salesmen and technical buyers being at odds.

DBA Pro Tip: Don’t use Hadoop, instead use summary tables and aggressive data retention. AWS has volumes large enough to avoid distributed systems entirely.

Cloudera plummets 40% after CEO abruptly departs and company cuts forecast
An Update from MapR “As a privately held company we are unable to provide forward-looking statements regarding financial performance.” – Really?
theregister.co.uk: MapR misses deadline for sale, biz prospects looking thinner than a Hadoop sales pitch

Posted in Hadoop, Java, Open Source, Tech | Leave a comment

Amazon SF Loft: Observability for Startups

Luke Demi from Coinbase (YC) gave a talk on “Observability for Startups” tonite at the AWS Loft at 525 Market Street. Video

He talked about their odyssey through the years to monitor their infrastructure, now over 5,000 AWS servers (originally Ruby on Rails and Mongo) and serverless (AWS Lambda.)

Like most startups, they don’t have a DBA team because features. So they rely on monitoring during outages to tell them when to run EXPLAIN. 🙂

EXPLAIN is your friend!

They have used or evaluated several products over the years:

  • New Relic – likely too expensive
  • Kibana – based on logs, so detailed, but poor aggregation and alerting. Can only afford to store 7 days. Dashboard with 100 ms/200 ms and 500s graphs. They use AWS-managed services as much as possible to reduce workload.
  • Grafana/Prometheus – too DIY, poor alerting UI
  • Datadog – good aggregation and alerting, easiest for new engineers, long retention, poor details and granularity. Special VPC routing for eng. security.

Kibana vs. Datadog – Complementary Features

Now using Kibana and Datadog, which are literally complementary (see slides) but would like to combine the best of both into one tool. Maybe someday! 🙂

For serverless (AWS Lambda), either AWS X-Ray with Datadog, or CloudWatch.

As a finance company, Coinbase does spend effort on compliance, though it’s a Cloud world now.

Slides Video

Coming soon: AWS Database Week in SF from June 4-6.

AWS: Patching Python Libraries to Instrument Downstream Calls
bloomberg.com: Coinbase Says Chief Operating Officer Has Left Crypto Exchange
Firefox zero-day was used in attack against Coinbase employees, not its users

Posted in Business, Tech | Leave a comment

Party City is Facing a Helium Shortage

I’ve read about possible helium shortages for years, but this is the first time I’ve heard of actual impact.

Helium is a strategically important element used in MRI machines and scientific experiments. It’s so light, that if you don’t store it in a cylinder, it escapes into outer space.

Making it (transmutation) is too expensive for commercial use, so it has to be captured from natural gas, where it accumulates from radioactive decay. Hence using it in party balloons is short-sighted.

Interesting that balloon purchases make Party City Amazon-proof. 🙂

Dollar Tree also does a roaring business in the Bay Area for birthday party balloons.

cnn.com: Party City is facing a helium shortage. It’s also closing 45 stores HN
forbes.com: Why We Are Running Out of Helium And What We Can Do About It

Posted in Tech | Leave a comment

Las Vegas 2019 Trip

While on vacation in Las Vegas last week, I saw the record rainfall. I had just entered the hotel, and noticed blackening cumulonimbus clouds over the hills – within 30 minutes torrential rains lashed the hotel windows.

Ultimately a record 0.25 to 1 inch of rain and hail fell.

Torrential rain and hail arriving in Las Vegas. The sky view from downtown Las Vegas was almost all black.

I saw the Blue Man Group, Hershey’s Chocolate World and went on a once-in-a-lifetime tour of the Hoover Dam.

I booked the tour through Grayline Tours, and saw the following:

  1. Ethel M/Mars Chocolate Factory and Botanical Cactus Gardens
  2. Lake Mead “Desert Princess” boat tour
  3. Hoover Dam tour inside (conduits and generators)
  4. Hoover Dam tour from top

The Hoover Dam took 5,000 workers and 5 years to build in the middle of the Depression.

It was the largest construction in history at the time, and still looks like it was built yesterday.

The dam was built primarily for water resource management, with the electrical generation a bonus. The electrical production paid off the dam project by 1986.

Hoover Dam – view from “Desert Princess” boat on Lake Mead

Hoover Dam – 7×130 MW Generators (~1 GW per side totaling 2 GW, supplying half of region’s electrical needs)

Hoover Dam – view from top of dam towards Colorado River showing concrete bridge, which took 9 years to build

W: Hoover Dam

Posted in Tech | Leave a comment

PSA: IBM Cloud Doing DNS Zones Cleanup

One of my TuCows/Hover.com domain names, which was DNS hosted with servermatrix.com, recently became unresolvable.

So it appears IBM Cloud/Softlayer/ServerMatrix may be doing some zone cleanup in 2019, and you may want to double-check any domain names hosted there. Or better yet, add monitoring.

TuCows/Hover.com chat support was very helpful in determining the problem and resolving it. Bad pun, ya? 🙂

Posted in Tech | Leave a comment

Getting Started in Computer Programming

From time to time, people ask me how to get started in the career of computer programming.

If you like solving puzzles and have the ability to focus for hours at a time, then you’re a good candidate to enjoy a programming career. Many programmers are introverts, so if that describes you, great! 🙂

First ask your friends, co-workers or hosting provider which computer language they use. If they don’t have an opinion, read on.

For writing scripts (small programs), the 3 main languages are:

Language | Tutorial | Book
Perl | Tutorial | “Learning Perl”, Randal Schwartz, O’Reilly
Python | Tutorial | Books
PHP (mostly for web programming) | Tutorial | Books

Perl has some advantages in that older sample scripts were usually written in Perl, and it’s an extremely powerful and compact language. Perl is popular with people who want the biggest hammer available for solving problems, or who want to specialize in security engineering scripting. 🙂

On job sites, Python is currently the most popular scripting language. It also has good Machine Learning support. Python is popular with people who have OCD or like to “follow the rules” with its indentation rules. Google uses Python or Java for most programming.

PHP is popular with people who mostly want to do web programming, and has the widest hosting provider (ISP) support. Besides WordPress, Facebook also uses PHP.

After learning simple programming, build a web site for a CD collection inventory to learn basic HTML, CSS and database programming with MySQL.

Posted in Tech | Leave a comment

RedisConf 2019

Redis Labs presented another well-organized, strong technical conference this year. Once again it was held at Pier 27 in SF.

The conference theme this year was, “Redis, Database for the Instant Experience.”

Executive Summary:

  1. Redis Enterprise active-active (based on CRDTs) is being used in production already, which is ideal for HA distributed session caches, and more. RedisSearch active-active is now in preview. 2019 is the first time you can easily add multi-region support to your MySQL with Redis applications.
  2. Box, Inc. has developed a zero-downtime deploy and key rotation proxy for Open Source Redis. It was developed for IT compliance, and is being Open Sourced in April 2020. See diagram.

The Holy Grail: a geo-distributed application that uses active-active-active Redis using a counter

Redis Database Feature Groups

Conference Day One (Tuesday)



Extreme Performance with Redis, Andi Gutmans and Kevin McGehee, AWS

– C5n 100 Gbps
– I3 16 Gbps SSD at 3.3 M iops
– R5d nitro coprocessor board
– Rdbtools tools for redis
– Main use cases: Strings and hashes with TTL
– Socket io contention with command parsing
– AWS added dedicated thread
– 50 to 83% faster
– setReadFrom nearest
– ConnectionPool
– Pipeline/execute
– Scan (paginated) not keys
– Async delete (unlink vs delete)
– Sz of collections
– Slowlog
– Client reply off/skip
– Upgrade
– Cluster mode allows scale-in and out with no downtime
– gutmans@
– McGehee@

Redis Security, Box

– Asked by Secops to solve the problem of securing passwords, but found password rotation to be a big problem
– Old and new password in Vault
– Old and new Redis proxy processes
– iptables to switch proxy process using SO_REUSE
– Drain old proxy
– RPX intelligence for different command groups: Harmless, auth, internal Redis, and other commands
– Zookeeper has list of authorized nodes especially for replication

– 20k new conn per host
– 15k concurrent conn per host
– 5 GBps per redis host
– 0.5 ms per req
– TLS being tested.

Patrick King, New Relic

– used Facebook’s memcached mcrouter for about 5 years
– 7 Tbps bandwidth from monitoring agents
– 232 million agents, some live only 90 seconds
– First rails with Redis
– Mesos -> K8s
– Megabase is their custom persistence service
– 12M agents
– 20M wpm
– 12 M rpm
– Had to shard Sentinel
– consistent hashing can be brittle so use 2 layers to isolate config from app
– Demos of online auto redis-cli create and reshard

Work-stealing, Jim Nelson, Internet Archive

– Instead of monolithic cron jobs, use distributed workers
– Watch/exec with hash and array for field-level ttl
– Srandmember/srem
– https://github.com/internetarchive/work-stealing
– Ok for auditing, sampling, etc
– Not ok for sequential important results
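
The Srandmember/srem note can be sketched like this: each worker picks a random job from a shared set, then tries to remove it, and only the worker whose remove succeeds owns the job. This is my reading of the pattern, not code from the talk or the repo. InMemorySets is a made-up stand-in so the example runs without a server; its methods are modelled on the Redis SADD, SRANDMEMBER and SREM commands, and claim_job is a hypothetical helper name.

```python
import random


class InMemorySets:
    """Hypothetical stand-in for a Redis client, for illustration only."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, *members):
        """Add members; return how many were new (like SADD)."""
        target = self._sets.setdefault(key, set())
        before = len(target)
        target.update(members)
        return len(target) - before

    def srandmember(self, key):
        """Return a random member without removing it, or None if empty."""
        target = self._sets.get(key)
        return random.choice(sorted(target)) if target else None

    def srem(self, key, *members):
        """Remove members; return how many were actually present (like SREM)."""
        target = self._sets.get(key, set())
        present = target & set(members)
        target -= present
        return len(present)


def claim_job(client, key, max_attempts=10):
    """Try to take exclusive ownership of one job from the shared set."""
    for _ in range(max_attempts):
        job = client.srandmember(key)
        if job is None:
            return None  # nothing left to do
        if client.srem(key, job) == 1:
            return job  # our remove succeeded, so the job is ours
        # Another worker removed it between the two calls; pick again.
    return None


if __name__ == "__main__":
    client = InMemorySets()
    client.sadd("jobs", "audit-1", "audit-2", "audit-3")
    print(claim_job(client, "jobs"))
```

Because jobs are picked at random, there is no ordering guarantee, which fits the note that this is fine for auditing and sampling but not for sequential results.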

How to do 1 million OPS with Redis, Jane Paek, Redis Labs jane@redislabs.com

– typical warnings about slow operators like keys and delete
– memtier-benchmark
– redis-benchmark
– redis-benchmark -q script load "redis.call('set','foo','bar')"

Game with Redis, Python and Websockets

– IXWebSocket
– async
– no Boost needed
– auto-reconnect

– Python 3.5+ async IO
– MagicStack lib uvloop
– Nice web sockets module
– Mypy static type checker
– Tracemalloc module new in version 3.4

– Neo
– zlib on web sockets saves 75% bandwidth

Conference Day Two (Wed.)



CRDT’s by Roshan Kumar, Redis Labs

Good use cases:

– counters
– inventory
– sessions
– most else

Not good use cases:

– financial
– order processing
– only one client should pop data

Points to remember:

1. Your app will not receive notification when conflict occurs
2. You cannot override the default conflict resolution semantics
3. Lua script synchronizes commands

Best Practices:

1. Make your apps stateless (let Redis manage the state of your data)
2. Design your solution assuming conflicts may occur

Synchronize your clocks for best results. Note that some operations are time-independent, like adding to an array.
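
To make the counter use case concrete, here is a minimal sketch of a textbook grow-only counter (G-Counter) CRDT. It is a generic illustration, not Redis Enterprise’s implementation: the class name GCounter and the replica names are assumptions for the example. Each replica only increments its own slot, and merging takes the per-replica maximum, so replicas converge without clocks and without notifying the app of conflicts.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments seen from that replica

    def increment(self, amount=1):
        """Record local increments in this replica's own slot."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        """The counter value is the sum over all replica slots."""
        return sum(self.counts.values())

    def merge(self, other):
        """Fold in another replica's state; safe to repeat or reorder."""
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)


if __name__ == "__main__":
    atlanta, boston, chicago = GCounter("atl"), GCounter("bos"), GCounter("chi")
    atlanta.increment(2)
    boston.increment(3)
    chicago.increment()
    # Simulate the regions syncing after a network partition heals.
    for replica in (boston, chicago):
        atlanta.merge(replica)
    print(atlanta.value())
```

Merging the same state twice changes nothing, which is why a counter like this survives a split and restore.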

Github RedisCRDTDemo repo for network partitioning (split and restore) test scripts

Successful Redis active-active-active live demo updating a key-value using CRDTs across 3 regions: Atlanta, Boston and Chicago 🙂

RedisTimeSeries Module, Danni Moiseyev, Redis Labs

– introduces time series TS.* operators
– can insert timestamps with multiple labels
– can define aggregation labels
– compatible with Prometheus and Grafana
– benchmarks show RedisTimeSeries much faster than Prometheus and TimeScaleDB
– very high performance
– compression coming soon

Writing Redis Modules with Rust, Redis Labs

– memory-safe compared to C
– use macros
– see https://github.com/RedisLabsModules/redismodule-rs for sample code
– Rust compiles to a .so for loading like a normal library

RedisSearch Benchmarks and CRDTs, Redis Labs

– 50% to 400% faster than ElasticSearch because C/RESP vs. Java/HTTP
– able to do tens of thousands of indexes for multi-tenant applications, unlike ES which crashed at 912 indexes
– CRDT support in Redis Enterprise Preview
– Best Practice is not to use RedisSearch clusters with session caches, etc. to avoid key conflicts

Cache, Zohaib Hassan, Doordash

– random ahead of time eviction (time jitter) in multiple nodes (clusters)
– one way to avoid thundering herds: timestamp + ttl + (rand() * gap) > now()
– another way: p886-vattani.pdf
– L1 -> L2 -> DB
– See Java Caffeine cache for similar idea, but it might lock thread
– compress large values. lzbench said LZ4 was faster than Google’s Snappy (previously known as Zippy) for menu content (64 KB – 700 KB)
– prolly more reads than writes when caching
– 2x faster GETs with compression
– no spikes after pre-eviction + compression applied during busy periods now
– 15% extra RAM after compression
– Redis is great, but pay attention to the details
– Cloudflare uses Facebook’s zstd for message queues and is happy.
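
The jitter check from the notes, timestamp + ttl + (rand() * gap) > now(), can be sketched as a small function. This is a literal reading of that formula, not Doordash’s code; the name is_fresh and its parameters are assumptions. Because every caller draws its own random offset, entries cached at the same moment stop looking fresh at slightly different times, which spreads out the reloads.

```python
import random
import time


def is_fresh(stored_at, ttl, gap, now=None, rand=random.random):
    """Return True while a cached entry should still be served.

    stored_at: time the entry was written (seconds)
    ttl:       nominal lifetime (seconds)
    gap:       width of the random jitter window (seconds)
    rand:      function returning a float in [0, 1); injectable for testing
    """
    if now is None:
        now = time.time()
    return stored_at + ttl + rand() * gap > now


if __name__ == "__main__":
    written = time.time() - 58  # entry cached 58 seconds ago
    # Nominal TTL of 60 s with up to 10 s of jitter.
    print(is_fresh(written, ttl=60, gap=10))
```

When is_fresh returns False, the caller falls through to the next layer (L2 or the database) and rewrites the entry.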

RedisConf 2019 Exhibit Area

Small exhibit area with AWS, GCP, Azure, Redis, Heimdall Data (proxy), Western Digital, RDBTools (bought by Redis Labs 2 days ago), Redis U.

Got a comprehensive demo of RDBTools. It’s an amazing web UI for Redis databases. I particularly liked the slowlog report and online resharding. Redis Labs bought them 2 days ago, so it is now offered as part of their product lineup for a small monthly fee.

Impressive Figure-8 RC car racetrack!

Several food trucks served lunch.

No walls between the 6 lecture tracks, so Silent Events headphones were provided to amplify regular speaker voices.

Note to Conference Organizers

If possible, please serve drinks on the 2nd floor next year.

Simple Developer Tutorial

$ brew install redis
$ redis-cli
> SETEX pages:about 5 "about us"
> GET pages:about
"about us"
> KEYS *
1) "pages:about"
# wait 5 seconds
> GET pages:about
> KEYS *
> shutdown
> quit

To start Redis later:

brew services start redis

infoworld.com: When to use a CRDT-based database
Redis Labs Videos
Using the command line to check redis health

twitter.com: Improving Key Expiration in Redis
kn100.me: Beating round-trip latency with Redis pipelining
redis.io: redis-cli, the Redis command line interface

Posted in API Programming, Cloud, Linux, Tech | Leave a comment

Announcement: check_s3_encryption.sh Linux Utility Available

Announcing a new SRE utility I wrote called check_s3_encryption.sh to report and optionally encrypt any AWS S3 unencrypted buckets:

  1. it can be run from the command line or a crontab
  2. it’s useful for IT compliance
  3. MIT License.

See the README for documentation and example output.

Of course, after running this, you should enable a config policy to always create S3 buckets with encryption enabled. Also, there’s an option to skip public buckets in case you customized the permissions or redirects.

Posted in Cloud, Linux, Open Source, Tech | Leave a comment

Boeing 737 MAX MCAS Nonsense

Southwest Airlines Boeing 737 MAX8 Parking Lot at Chicago Midway Airport (MDW)

Disasters and manias of our time:

  • Fukushima
  • p-hacking in the social sciences
  • SF Transbay Terminal welding and materials failures
  • Bitcoin speculation
  • and now Boeing MCAS.

So, let me get this straight about the Boeing 737 MAX MCAS sensors:

  1. MCAS is non-redundant – it relies on only one of two fuselage-mounted external Angle-of-Attack (AoA) sensors to control the most critical flight control, the stabilizer
  2. that can be damaged or frozen
  3. the 2 sensors can disagree with no warning to the pilot (except with a warning light in an optional package and compatible display device, which even Southwest initially didn’t have)
  4. that are not checked for zero on the ground
  5. and fuselage-mounted AoA sensors don’t provide true AoA except when wings-level
  6. behavior doesn’t match the certification documents, and no training materials provided.

The Angle-of-attack (AoA) sensor is located near the bottom of the foto. Note that it is exposed to physical damage and icing.

and …

  1. MCAS is activated on manual flight, where the pilot wouldn’t expect to have to disengage any “autopilots”. (But they would know what a stickshaker warning is.)
  2. MCAS is activated on flaps-up, thus immediately after takeoff, with little time for correction. Knowing how to disable MCAS would have to be a memory item, since reading the flight manual or checklist could easily take longer than the 25 seconds MCAS needs to hit the stop and become unrecoverable at low altitude (and in fact, one of the pilots was reading the flight manual at the time of impact.)
  3. MCAS can trim the stabilizer full nose-down to the stop (5 degrees) instead of the original FAA-approved 0.6 degrees, and will continue to do so with a failed sensor like some kind of doomsday machine.

And this wasn’t adequately documented before revenue flying on the most popular airliner ever sold?

What’s interesting is that had a faithful simulator been created, this problem would have been found much earlier. But that would have raised re-certification and training questions.

MCAS was a bad proof-of-concept system, not at all ready for airline use. I can’t believe that not a single Boeing engineer noticed this design error.

Kudos to the crew of the first Lion Air incident flight. The 3 pilots exercised good CRM and found the cutout switches in time to save their plane. (I recommended 3-man crewing for SE Asian flights in a previous post.)

“The FAA said it will mandate Boeing’s software fix in an airworthiness directive no later than April.” – but how does that fix the non-redundant sensor?

flyingmag.com: Lion Air Investigation Takes an Unexpected Turn
b737.org.uk: 737 MAX – MCAS
seattletimes.com: Flawed analysis, failed oversight: How Boeing, FAA certified the suspect 737 MAX flight control system
theaircurrent.com: The World Pulls the Andon Cord on the 737 MAX
Capt. Sullenberger on the FAA and Boeing: ‘Our credibility as leaders in aviation is being damaged’
The emerging 737 MAX scandal, explained
737: The MAX Mess (Very detailed notes)
American Airlines extends Boeing 737 Max flight cancellations
avweb.com: International Committee To Review MAX
avweb.com: Will Boeing Ever Dig Itself Out?

Posted in Tech | Leave a comment

oom-report linux Utility

Announcing a new SRE utility I wrote for processing linux syslog Out of Memory (OOM) entries, oom-report.

It parses logfiles for OOMs and reports on how much RAM (vm + rss) each process used and sorts from smallest to largest. (By default, the linux OOM-killer kills the largest process.)

It’s a typical Unix-style filter program, so you can run it manually on the host, or copy it to each cluster host and use ssh or a configuration management tool to run it remotely.

perl oom_report.pl < /var/log/syslog | tail -10

    salt-minion =         264,981
        dockerd =         297,009
docker-containe =         442,017
          agent =         584,720
          java3 =         877,946 (dd-agent)
          java1 =       1,205,023 (logstash)
          nginx =       2,926,052
          java2 =       5,608,659 (app server)

          total =      13,427,264

Process names are squashed if identical, except java process names are uniquified with a numeric suffix.
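
For readers who prefer Python, the core of the report can be sketched as below. This is not the Perl utility itself, just an illustration of the idea. It assumes the OOM-killer process table lines look like "[ pid ] uid tgid total_vm rss ... name", which varies between kernel versions, and the name oom_usage is made up. Unlike the real tool, it does not uniquify java process names.

```python
import re
from collections import defaultdict

# Assumed row shape (kernel-version dependent):
#   ... [ 1234]  uid  tgid  total_vm  rss  <more columns>  name
# The timestamp "[12345.678]" is skipped because it contains a dot.
ROW = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+.*\s(\S+)\s*$"
)


def oom_usage(lines):
    """Sum total_vm + rss per process name, sorted smallest to largest."""
    usage = defaultdict(int)
    for line in lines:
        match = ROW.search(line)
        if match is None:
            continue  # header lines and unrelated log entries
        total_vm = int(match.group(4))
        rss = int(match.group(5))
        name = match.group(6)
        usage[name] += total_vm + rss  # identical names are squashed
    return sorted(usage.items(), key=lambda item: item[1])


if __name__ == "__main__":
    import sys

    for name, amount in oom_usage(sys.stdin):
        print(f"{name:>15} = {amount:>15,}")
```

It is a filter like the original, so under the same assumptions it could be run as: python3 oom_usage.py < /var/log/syslog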

Posted in Java, Linux, Open Source, Tech | Leave a comment

India Requests Additional MiG-29 Fighters – in 2019

Interesting how supposedly “obsolete” but great military airplanes never disappear:

  • USA has chosen the B-52 (first flown 1952) to outlast the B-1 and B-2 due to high maintenance costs and low dispatch rate of the newer stealth bombers. More.
  • USA relies on the F-5/T-38 (first flown 1959) for a large percentage of its “behind-the-scenes” training and testing military operations. So much so that Cold War Military Assistance Program (MAP) versions are being repatriated from overseas and overhauled for drone use.

    The two copies of NASA’s forward-swept X-29 were built from an F-5 body with F-16 landing gear. Unfortunately the wings on one of the pair were cut with a titanium chainsaw so it could be trucked to an East Coast museum, instead of being shipped via the Panama Canal or ferried, so it won’t fly again.

    X-29 #049. Notice tufting on the wing surface, aft fuselage and aft control surfaces to visualize airflow, which is expected to flow from wingtip to inboard for a forward-swept wing. Each of the parties that funded the research is listed on the side.

    Swiss Air Force F-5E. USA is buying back 22 plus spares for Top Gun and other programs in 2019!

    “Perhaps the most interesting aviation item in the FY20 request is that for 22 Northrop F-5E/F Tiger II aircraft, to be divided equally between the Navy and Marine Corps. These aircraft will be acquired to improve and expand the adversary fleets of both services. The Navy bought 44 F-5E/Fs from Switzerland in the 2000s, refurbishing them as F-5Ns. The new batch of 22 is also coming from Switzerland, and the aircraft are due for refurbishment by Northrop Grumman at its St Augustine, Florida, facility. The value of this requested purchase is just under $40 million for all 22 aircraft and spares.”

  • India lost a MiG-21 (first flown 1955, in service from 1959) in the Kashmir air battle in Feb. 2019
  • India is buying more MiG-29s (first flown 1977) for $40 million each

What those planes all have in common is all-metal airframes that require little maintenance: none are fly-by-wire or composite (the MiG-29 has minimal composites). The fighters can all land on grass strips and be maintained without a hangar or special tools.

nytimes.com: After India Loses Dogfight to Pakistan, Questions Arise About Its ‘Vintage’ Military
reuters.com: Washington wants to know if Pakistan used U.S.-built jets to down Indian warplane
ainonline.com: India Requests Additional MiG-29 Fighters
Pentagon To Retire USS Truman Early, Shrinking Carrier Fleet To 10
Did Pakistan use its Chinese JF-17 jets to shoot down Indian planes?
Fighting Falcon puts off retirement: F-16 to fly for USAF through 2048
Dutch F-16 flies into its own bullets, scores self-inflicted hits
avweb.com: Sometimes Old Technology Is Appropriate
avweb.com: Honeywell Retires Convair 580 after 67 years (mfg. 1952, same as B-52)

W: MiG-21, MiG-29, F-5, B-52, Convair 580

Posted in Tech | Leave a comment

Comprehensive and Well-written Collection of Life and Business Development Topics

35 Hard Truths You Should Know Before Becoming “Successful” is a comprehensive and well-written collection of life and business development topics.

They can be used in many useful ways:

  1. read in one sitting, combined with self-reflection
  2. as a sequence of items to study, one per day
  3. as topics to expand upon, for example, google each of the quotes and examples for more details
  4. as topics to discuss between mentor and mentee
  5. even if some are intuitive, it’s good to explore them deeper.

My favorites are 12, 16, 18 and 20.

How to be More Productive and Eliminate Time Wasting Activities by Using the “Eisenhower Box”
W: Koan
What I Learned From Learning How to Say No
Example of Compliment Sandwich Letter
“Be yourself” is terrible advice HN


  • “Ask yourself periodically, is this who I really wanna be?”
  • “Just do it.”
  • “Be the best version of yourself you can possibly be.”
  • “Become your Platonic ideal.”

We Don’t Have Unprofitable Customers, Just Unhappy Accidents HN
“TL;DR for folks without sales lingo:

Companies often think their business would be healthier if they got rid of their “bad customers,” but a customer who is difficult with one provider (e.g. always asks for discounts, doesn’t commit over time) can act much more nicely with providers who they consider critical to their business. This is evidenced through a merger where two distribution companies compare customer lists and realize that one’s best customer is the other’s worst — in other words, a customer isn’t good or bad, they just behave differently with different suppliers. So the key to business is not to get rid of the bad customers, but to become your customers’ favorite supplier so that they build a long-term partnership with you.”

Okabashi: Footwear Company Makes 1.2 Million Shoes a Year in Georgia

getpocket.com: Why You Can’t Trust Yourself
Every productivity thought I’ve ever had, as concisely as possible, Why Your Brain Loves Procrastination

Posted in Tech | Leave a comment

Postgres Performance on AWS EBS

AWS EBS is network-attached storage … in other words, S L O W, compared to local SSD for Postgres database use.

I’ve been seeing average disk latency of 0.55 – 0.80 milliseconds per block (when operating correctly, otherwise 3 ms to 10 ms), and IOPS and bandwidth are throttled by both the instance and the volume.
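
To put those latencies in perspective, here is a back-of-the-envelope sketch. It assumes a query that issues dependent block reads one at a time (no read-ahead, no parallel I/O, nothing cached) and Postgres's default 8 kB block size. The function name is illustrative, and real workloads overlap I/O, so treat this as a lower bound on throughput per connection.

```python
BLOCK_SIZE_BYTES = 8192  # default Postgres block (page) size


def serial_read_rate(latency_ms, block_size=BLOCK_SIZE_BYTES):
    """Return (reads per second, MB per second) for strictly serial block reads."""
    reads_per_sec = 1000.0 / latency_ms
    mb_per_sec = reads_per_sec * block_size / 1e6
    return reads_per_sec, mb_per_sec


# Latencies from the measurements above: normal range, then degraded range.
for latency_ms in (0.55, 0.80, 3.0, 10.0):
    reads, mb = serial_read_rate(latency_ms)
    print(f"{latency_ms:5.2f} ms/block -> {reads:7.0f} reads/s, {mb:6.2f} MB/s")
```

Under these assumptions even the best case stays below 2,000 uncached block reads per second per connection, which is why the workarounds further down focus on keeping hot data in RAM.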

For m4.2xlarge, only 10,000 IOPS with 100 Mbps are available, regardless of how many EBS volumes you attach or how beefy they are – not impressive for SSD at all:

Figure 1: m4.2xlarge throttling IO from 4G EBS gp2 volume with 10,000 IOPS and 250 Mbps

Figure 2: 4G EBS gp2 unencrypted volume showing minimum read latency of 0.55 ms

In the above case, one thing you can do is to switch from m4.2xlarge to m5.2xlarge, which is cheaper and has double the IO performance.

But if you’re stuck using Postgres with EBS for large databases (bigger than RAM), there are workarounds related to the fact that shared_buffers will store the index in RAM:

  1. carefully configure shared_buffers to be as large as possible, and max_connections as small as possible
  2. run EXPLAIN to see if indexes are used (no Seq Scan) and use the pg_stat_statements extension to identify slow or frequent queries
  3. use covering indexes to read data from the index cache
  4. rewrite queries to do index scans from RAM instead of table scans across the network from EBS (i.e. HAVING => INTERSECT and EXCEPT, WHERE-splitting, etc.)
  5. remove ORDER BY if the sort columns are not indexed and your app doesn’t need sorting
  6. use Redis to cache repeated queries
  7. use io1 instead of gp2 volumes, but do your own benchmarks and latency measurements as they vary with both types
  8. use local instance, “ephemeral” SSD volumes and replication/WAL copy for HA.

It would be nice if Postgres had a setting to indicate network-attached storage as a hint to the optimizer.

Percona has some advice for tuning operating system parameters.

Amazon EBS Volume Types
The most useful Postgres extension: pg_stat_statements
Amazon Postgres RDS pg_stat_statements not loaded
PostgreSQL Workload Analyzer
docs.aws.amazon.com: Initializing Amazon EBS Volumes

aws.amazon.com: Monitoring the Status of Your Volumes

“In relation to alerts for slow volumes, io1 volumes do have a “volume status” that will be updated if performance is below expectation, or I/O is stalled.”

Download NVM_Express_1_2_Gold_20141209.pdf at https://nvmexpress.org

Keywords: cloud, architecture, Postgresql

Posted in Linux, Postgresql, Tech | Leave a comment

Cassandra vnodes Streaming Reliability Calculator

The Cassandra database has a setting in cassandra.yaml, num_tokens, for the number of vnodes. num_tokens is the number of partitions to use per host, and thus the number of parallel streams to use for data updates.

The default was 256 vnodes, but that led to a high probability of a streaming failure, so “DataStax recommends using 8 vnodes (tokens)” now.

A Netflix paper agrees, saying, “the Cassandra default of 256 virtual nodes per physical host is unwise”, as do the experienced DBAs on the Apache Cassandra Users List.

To calculate the impact of vnodes count on cluster streaming reliability:

where Pstreaming-one-failure is the independent probability of a streaming failure of one connection, possibly in the range of .0001 to .00001, during one week. (You could process your log files to get your exact failure count.)
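
A minimal sketch of one way to do such a calculation. It assumes each node streams once per vnode and that failures are independent, so the chance of at least one failure is 1 - (1 - p)^(num_tokens × nodes). That formula, the function name and the 12-node example cluster are assumptions for illustration.

```python
def p_any_stream_failure(p_one, num_tokens, nodes):
    """Probability that at least one stream fails in the period.

    p_one      -- probability that a single stream fails (independent)
    num_tokens -- vnodes per host (num_tokens in cassandra.yaml)
    nodes      -- number of hosts in the cluster

    Assumes one stream per vnode per host, i.e. num_tokens * nodes streams.
    """
    streams = num_tokens * nodes
    return 1.0 - (1.0 - p_one) ** streams


# Hypothetical 12-node cluster with a 0.0001 weekly failure chance per stream.
for tokens in (256, 32, 8):
    p = p_any_stream_failure(0.0001, tokens, 12)
    print(f"num_tokens={tokens:3d}: P(at least one failure) = {p:.4f}")
```

Under these assumptions the failure probability grows roughly in proportion to num_tokens × nodes while it is small, which is the intuition behind lowering num_tokens.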

I wrote a Javascript calculator to help visualize how vnodes increase the probability of streaming failures.


Note that changing num_tokens after a ring bootstraps is not a casual thing. The easiest way is to replicate to a new ring or DC with different num_tokens setting, then fail over.

Examples of Streaming Errors

datastax: Streaming operations throw “java.lang.AssertionError: Memory was freed” error
SO: Can’t add a new Cassandra datacenter due to streaming errors
Cassandra Vnodes and token Ranges
Netflix: Cassandra Availability with vnodes Whitepaper

Posted in Cassandra, Open Source, Tech | Leave a comment

Postgres Monitoring Script pg_glance.sh Available

During the Super Bowl on Sunday, I wrote a small performance monitoring script for Postgres called pg_glance.

You can download it from my github project pg_glance.

Getting Started with pg_glance

It’s easy to get started …

If you’re remotely monitoring postgres instances using ssh keys to login from your notebook or application server:

  1. Download pg_glance.sh to your ~/.ssh directory
  2. update the hosts variable with a space-separated list of postgres servers to monitor
  3. if you don’t have passwords on the linux postgres account, just run:
    watch -n 15 "./pg_glance.sh | grep ':: '"

If you’re monitoring localhost, or using passwords with your postgres login, then you’ll need to spend a minute customizing the script.

Posted in Postgresql, Tech | Leave a comment

Super Bowl LIII 2019

I watched Super Bowl LIII on TV.

The New England Patriots won over the Los Angeles Rams 13-3.

Low-scoring first half, but it picked up in the 4th quarter. Still ended up being the lowest-scoring Super Bowl in history.

Some ugly plays, with one player apparently spearing a receiver in the back near the ground (might have been unintentional), and another reaching into the mask of an opponent.

Final field goal was a miss.

Edelman was MVP.

Tom Brady’s 6th ring at 41 years old – representing middle-agers. 🙂

Colbert’s gloating over a “free commercial spot” was the funniest ad.

W: Super Bowl LIII
Bob Costas, unplugged: From NBC and broadcast icon to dropped from the Super Bowl

Posted in Tech | Leave a comment

Rare snow rollers spotted in field near Marlborough

Rare snow rollers spotted in field near Marlborough

Posted in Tech | Leave a comment

Tonga Fiber Cut 2019

Imagine your whole country’s Internet going dark. Kudos to the private ISP owners in Tonga who financed and organized a solution while the fiber is repaired.

Bali is another island in a similar situation, at 100% electrical and Internet capacity (excluding the private Biznet Fiber link from Jakarta) for over a decade. Restrictive proxying and timeouts are used to control bandwidth usage, along with “USB drive sneaker net”, similar to Cuba.

When there’s a fiber cut, you just spend the week or two at the beach until it’s fixed. 🙂

Geek heroes rescue Tonga from worst case fibre optic cable blackout

Posted in Tech | Leave a comment

The Awesome Airplanes of Burt Rutan

Great photo collection of Burt Rutan-designed planes by Flying Magazine.

He favored composites since he could manufacture prototypes almost 10x faster than with metal or wood.

Northrop Grumman bought his company, Scaled Composites. This is his Firebird H03 Observation plane.

Rutan Model 401 (Photo 2019)

All of his designs appear to be subsonic, which allows non-exotic composites to be used in wing and fuselage design without concern for high skin temperatures.

(SpaceX just replaced carbon fiber with stainless steel in their “Starship” vehicle for that reason. I’m not sure how they went down that blind alley, in 2018 no less.)

One of the prototypes is the Stratolaunch, the world’s largest airplane:

Stratolaunch satellite launch plane (Photo 2019)

One of the nice things about composites is that they’re hail-resistant (small hailstones just rebound) unlike painted aluminum. However, it takes discipline to realize weight savings over metal for structural-related components, and long-term maintenance of composites is still an open question.

yt: Turbofan Killer Bee: Rutan ARES “Mudfighter” for U.S. Army Close Air Support

W: Burt Rutan, Stratolaunch
Composites: Tips for working on Cirrus composite structures
A Hailstorm Completely Obliterated This American Airlines Plane
Preparing for hail season
avweb.com: ‘Secret’ Model 401 Airplane Stops At FBO
thedrive.com: Scaled Composites’ Stealthy Mystery Jet Is Now At The Navy’s Top Flight Test Base
In California, giant Stratolaunch jet flies for first time HN
The world’s largest airplane is up for sale for $400 million
avweb.com: Stratolaunch Price Tag $400,000,000

Posted in Tech | Leave a comment

KiCad, Electronics and EDA DFM Links

Thanks to funding and project management by CERN, KiCad is the most advanced free and Open Source EDA tool now.

CERN has funded differential traces, and dragging.

Free tools will likely never catch up to commercial tools for parts libraries and integration with other tools.

But KiCad gives many users a viable long-term path for creating PCBs that they can still update years from now without Cloud fees a la Eagle (recently bought by Autodesk).

Lessons from Running a Small-Scale Electronics Factory in my Guest Bedroom, part 1: Design HN
Exclave: Hardware Testing in Mass Production, Made Easier

madengr: “I had an electronic module that worked in our lab, but failed at the customers, but then worked again in ours.

Turns out one of the pins in a space grade connector (MDM-25) open-circuited exactly between 68F and 70F. Our lab was 70F, and the customers was 69F. The way we caught it was to be watching the temp chamber when it slowly ramped from cold to ambient.

Turns out the connector supplier had shifted production to Mexico, and they were contaminating the contacts with RTV when sealing it. ”

Skewed Layout
cern.ch: KiCad development Bugs
Seeed Studio guide to DFM
Xilinx: 7 Series FPGAs PCB Design Guide
Soldering SMD LED diodes
Electronics and electrical design checklist
High Power Circuit Board Design (PCB) – KiCad 5 – Part 1
Kicad vs Eagle – Which one is best? [2018 comparison]
Horizon EDA documentation
wellpcb.com: 10 Best PCB Design Software Tools In 2021
KiCad 6.0.0 HN

Posted in Tech | Leave a comment

Manila 2019Q1 Mobile Commerce

This is an annual update of my Manila mobile commerce report. Use the search widget for previous editions.

Rockwell/Powerplant Mall as seen from the newish City Garden Hotel Makati 32nd storey pooldeck. iPhone 8+, hand-held.

Chinese payment methods available in Manila.

Hmm … I guess handicams are still a problem.

Screen protectors that prevent strangers from reading your Facebook are popular.

Not a Mac!

There is a new, locally-advertised prepaid card solution PayMaya/Smart Padala, “duly licensed by the Bangko Sentral ng Pilipinas (BSP).”

For computer repairs, the two main districts are Gilmore and Greenhills Shopping Center. The latter is cheaper for Mac repairs.

HN: Hugely Regret Using Stripe Atlas
bloomberg.com: What Uber Left Behind in Asia

Posted in Tech, Travel | Leave a comment

Southwest 737 Runway Overrun and EMAS

Avweb has an article mentioning EMAS: “Southwest Airlines Flight 278 slid off the end of the runway while landing at California’s Hollywood Burbank Airport (BUR) at 9:05 am local time on Thursday. According to a statement issued by the FAA, the Boeing 737 came to rest in the Engineered Material Arresting System (EMAS) at the end of Runway 8. No injuries …”

Note the crushed “bricks” near the nose wheel – that’s EMAS

The initial cause seems to be that the plane landed on a wet runway with a tailwind. It looks like the EMAS was high enough to touch the landing gear, engines and access panels.

EMAS is used in the USA when there is less than 1000′ of suitable overrun for a runway. As the “bricks” get crushed, speed is dissipated. Of the 13 reported incidents, half have been airliners and half GA or cargo planes.

I’m interested in finding out:

  1. how much it costs to remediate the EMAS
  2. how the pax got back to the terminal
  3. how the plane got free of the EMAS since it can’t taxi
  4. did the engines ingest EMAS?
  5. were the landing gear or doors damaged by EMAS?

faa.gov: Fact Sheet – Engineered Material Arresting System (EMAS)

Posted in Tech | Leave a comment

Dart Pop-out Float Test On R66

Informative video on deploying Dart pop-out floats on a Robinson R66 helicopter, then repacking them.

Takes about 8 seconds to fully inflate, but a couple hours to repack all 4 floats.

Requires 3 men, talcum powder, a vacuum cleaner, replacing the shear rivet and refilling the helium cylinder.

yt: Emergency Dart Float Test On R66
W: Robinson R66 Helicopter

Posted in Tech | Leave a comment


Programmers who want to curry favor (to be polite) with Paul Graham, the founder of Ycombinator, upvote postings on the Lisp programming language on news.ycombinator.com.

Humorously, one commenter noticed this gem in a Lisp article: 🙂

classichasclass 3 hours ago [-]

> correct-endian, i.e. little.

Hey now.

Paul and his co-founders used Lisp to write ViaWeb, an ecommerce startup that Yahoo! bought and rebranded as Yahoo! Stores. That sale became the stake used to start Ycombinator.

Paul himself credits these things with the success of ViaWeb in the pre-Web 1.0 era:

  1. using a scripting language, like Lisp, rather than Java
  2. right place and right time
  3. hiring a PR firm to get the word out and start traction.

Lisp Machine Inc. K-machine: The Deffenbaugh, Marshall, Powell, Willison architecture as remembered by Joe Marshall
W: Viaweb

Posted in Tech, Toys | Leave a comment

mysql-trigger-logger Available on github

I just added a new github repo, mysql-trigger-logger.

It demonstrates how to use MySQL triggers to log the timestamp, user, SQL and a note for unexpected database changes (“heisenbugs”) from UPDATE or INSERT statements to a table.

This is professional-grade code with documentation and tested notification scripts in Perl, Python and bash, and goes beyond what is available in StackOverflow answers.
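
As a rough illustration of the trigger-logging idea only (this is not the code in the repo), the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table, column and trigger names are made up. SQLite has no database users and cannot capture the SQL text, so only the timestamp, action and a note are logged here; MySQL's trigger syntax and functions such as USER() differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);

    -- Audit table written to by the triggers below.
    CREATE TABLE change_log (
        logged_at TEXT NOT NULL,
        action    TEXT NOT NULL,
        row_id    INTEGER NOT NULL,
        note      TEXT NOT NULL
    );

    CREATE TRIGGER accounts_insert_log AFTER INSERT ON accounts
    BEGIN
        INSERT INTO change_log
        VALUES (datetime('now'), 'INSERT', NEW.id, 'balance=' || NEW.balance);
    END;

    CREATE TRIGGER accounts_update_log AFTER UPDATE ON accounts
    BEGIN
        INSERT INTO change_log
        VALUES (datetime('now'), 'UPDATE', NEW.id,
                'balance ' || OLD.balance || ' -> ' || NEW.balance);
    END;
    """
)

# Changes made by any code path are now recorded automatically.
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.execute("UPDATE accounts SET balance = 250 WHERE id = 1")

rows = conn.execute(
    "SELECT action, row_id, note FROM change_log ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

Because the logging lives in the database rather than the application, it also catches changes made by scripts or ad-hoc sessions, which is what makes it useful for heisenbugs.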

github: mysql-trigger-logger

Posted in Tech | Leave a comment

DNS infrastructures still vulnerable to attacks

Nice article on what’s new in DNS in the past 2 years – not much apparently, according to Cricket Liu and ThousandEyes.

Also check out the new DNS book from Mark Jeftovic, the founder and CEO of EasyDNS.com.

O2 ‘to seek millions’ in damages over data outage [due to Ericsson SSL Certificate Expiry in 11 countries] HN
Why CISA issued our first Emergency Directive (2019) HN
cambus.net: Fuzzing DNS zone parsers

Posted in Tech | Leave a comment

Devops and Failing Forward

When deploying a production change, usually you have a rollback procedure – documented even! 🙂

But sometimes after a deploy, things don’t work exactly as expected. At that time, you need to decide which is better: roll back, or fail forward?

If it’s a small, isolated code change and you can operate with the old version, often it’s an easy decision to roll back.

But in more complex situations, sometimes it’s better to accept that the change wasn’t perfect, but:

  • is still an improvement overall, and gets you closer to your goal
  • or is equivalently bad to the previous situation, but in the right direction
  • moves to a new target architecture that can’t be simulated in qa or stage for reasons of time, cost or complexity.
  • serves as a commonly-understood stake in the ground, or anchor point. “Now that we’re here, we can see the right direction!”

and keep the change and fail forward instead of doing a rollback.

Generally to know if failing forward is an option, you need:

  • enough personal and organizational responsibility to accept the risks and handle the consequences.
  • a clear understanding of the overall IT systems and IT risks
  • a clear understanding of the overall business systems and business risks
  • availability of staff to do verification and fix small issues that arise
  • a good time for the change, one that minimizes stress and risk
  • monitoring and application logging tools to help evaluate the situation. (I’d even suggest rounding out your tools inventory beforehand if failing forward is new to you.)
  • advance communication that you may fail forward if necessary, based on a calculated, not reckless, risk assessment, and that rollback is still an option.

Some actual examples of when I have failed forward successfully:

  • firewall rule changes that were closer to the final goal, but broke a couple of servers temporarily.
  • database schema changes that were correct, but required a day or two of minor internal application updates that were not in the original QA test plan.

Some actual examples of when fail forward was not acceptable, and rollback was required:

  • changes from httpd 2.0 to 2.4 that actually required significant re-QA and updates to the deploy process
  • database schema changes that were correct, but required a major application re-build and re-QA totalling more than 3 hours of downtime
  • changes that affected legacy applications with no budget for developers or QA.

Especially with databases, the arrow of time cannot be reversed. A database restore results in the loss of data on busy systems, making fail forward the default policy at many SaaS companies. Additionally, failing forward helps with development velocity.

Posted in Tech | Leave a comment

pico-build: the world’s smallest three-environment build and deploy system

I recently created a new minimal build system, called pico-build, that is hosted on github.

pico-build is the world’s smallest, yet featureful, three-environment (dev, stage and prod) build and deploy system.

It uses make, but in a different way than normal – pico-build manages entire environments, not individual files.

$ make
usage: make [help|check|dev|stage|prod|dist|all]

It is intended for individual programmers and small teams who want to test and deploy to three environments, but don’t want to spend time setting up (and patching) Jenkins or building a CI/CD pipeline.

pico-build deploy flow
Folder structure of a website deployed with pico-build, showing the deploy flow in red

To deploy according to the diagram above:

$ make dev
$ make stage
$ make prod


$ make all

Please try it out and send me feedback or github pull requests! 🙂

Makefiles, Best Practices HN
Fizzbuzz in make

Posted in Tech | Leave a comment

Gitlab for Users Transitioning from Github

After using Github for several years, I recently used Gitlab for a solo developer project.

Some of the differences I noticed over the past two weeks were:

1. After initial signup, I got a 422 error HTTP response for about 5 minutes. I assume my new login was propagating to their caching layer(s). See screenshot below.

2. By default, the master branch on repos is “protected” and a maintainer must remove that for regular users to push to.

3. Gitlab markdown is different than Github’s.

4. No tagging of repos with the probable computer language. (seems to work for me as of Nov. 13, 2018.)

5. When sharing a private project, the “Guest” role cannot view source code.

I hope this helps programmers transitioning from Github to Gitlab. Please leave a comment below if you encounter other differences.

Posted in Cloud, Open Source, Tech | Leave a comment

PSA: Godaddy Late Renewals and Domain Parking Annoyances

PSA: If you host domains with Godaddy, and you’re a day or two late in renewing a domain, it will be “parked” – in other words, assigned a different IP address than the previous one.

This is as bad as it sounds, since your site availability and SEO rank will be hurt as various crawlers will cache the random parking pages for varying periods of time, which can last for weeks.

To remove domain parking so that Internet users can see your site again:

  1. log in to your Godaddy account
  2. pay for your domain renewal if still possible
  3. find your domain and click on the DNS button
  4. look for the parking widget form in the bottom right and click on the trash can to disable parking
  5. next to your A record, the status “Parking” should disappear and the pencil icon should re-appear so you can edit it to the correct IP address again
  6. wait 60 minutes before testing, to avoid negative DNS caching. Then try your site, monitor Google search, etc.
  7. for future reference, print out your DNS zone settings once it is working.
  8. add application monitoring to your site to detect IP and content changes.

Otherwise, contact Godaddy support ASAP.

If you need quick temporary access for intranet/personal use, just add a hosts entry mapping the domains and subdomains you need to the correct IP address, and restart your browser. Your site will work perfectly, but just for you.
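
For step 8 above, a minimal sketch of an IP-change check using Python's standard socket module. The function name, the example domain and the addresses (taken from the documentation-only ranges) are placeholders. A real monitor would run on a schedule, alert on any non-empty result, and also compare page content.

```python
import socket


def unexpected_addresses(domain, expected_ips, resolve=socket.gethostbyname_ex):
    """Return resolved IPv4 addresses for domain that are not in expected_ips.

    resolve is injectable so the check can be exercised without network access.
    """
    _hostname, _aliases, addresses = resolve(domain)
    return sorted(set(addresses) - set(expected_ips))


def fake_resolver(domain):
    # Stand-in resolver simulating a domain that now points at a parking page.
    return (domain, [], ["203.0.113.10"])


# With the real resolver you would call: unexpected_addresses("example.com", {...})
surprises = unexpected_addresses("example.com", {"198.51.100.7"}, resolve=fake_resolver)
if surprises:
    print("ALERT: example.com resolves to unexpected address(es):", surprises)
```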

Posted in Tech | Leave a comment

SELinux Problems in CentOS 7.x

StackOverflow has a solution to the selinux problems encountered by some CentOS 7.4 and 7.5 users related to D-Bus errors.

SELinux corrupted? Now unable to boot CentOS 7 with SELinux enabled

Posted in Linux, Tech | Leave a comment

Congrats to HondaJet for their New Elite Model

HondaJet Elite SP Jet

I’ve been following the development of the HondaJet for over a decade, and they just released their new Elite SP model with several small refinements that add up to major utility and fewer fuel stops over the original HA-420.

Honda also worked on the main complaint of the first version, cabin noise, which was about 3 dB higher (double the sound power) than desired for passengers.

  1. 200 pounds useful load increase (i.e. “put on a diet”), significant for a light jet, esp. for sales to super-sized USA passengers (that’s what killed the Cessna 162 Skycatcher)
  2. 16 gallon additional fuel capacity
  3. better short-field performance and “NBAA IFR range with four passengers is now 1,437 nm, up 17 percent from 1,223 nm”
  4. avionics improvements, including further progress in using electronic checklists, similar to new military jet human factors. These enhancements are critical for SP operation.

$4.9 million is an interesting price point. You have to be a billionaire to afford a Gulfstream, but many successful business owners can handle a new HondaJet or a used Hawker lease, if they want full-size cabin space.

HondaJet Elite is Honda’s Newest Light Jet
HondaJet HA-420 : Buyer’s and Investor’s Guide (2013, but interesting)
HondaJet Flight Demo
[FullHD] Private Honda HA-420 HondaJet takeoff at Geneva/GVA/LSGG
Honda Expands Jet HQ, To Increase Production Rate

Posted in Tech, Toys | Leave a comment

Facebook Production Engineering Open House 2018

I attended the Facebook Production Engineering Open House at their Menlo Park HQ.

“Production Engineering is Facebook’s secret sauce – it draws on multiple disciplines (Software Engineering, Systems Administration, Distributed Systems and Networking) to plan, build and maintain the massive Facebook infrastructure.”

It’s always interesting to see how large operators solve problems at scale. Even if you’re small, usually one can borrow one or two nuggets.

6:00 – 7:00: Registration, apps and drinks

7:00 – 7:30: “Welcome and PE overview” by Fernanda Weiden (OG Ads Team Member) Wikipedia

Production Engineering (PE) is Facebook’s term for Devops. PEs must know how to program, but don’t necessarily end up doing so. Newer services may require more programming from PEs, and older services less.

Tech Talks:
7:30 – 7:50: “Scaling Instagram On-call” by Nick Shortway

– took a long time to refine on-call schedule
– 1-day on-calls were too much administrative effort to schedule
– now newbies are Level 1 and experienced people are Level 2
– “loop” is a period of time
– 3-day loops where the #1 priority is being on-call
– Level 1 escalates to Level 2 within 2-3 minutes if no apparent solution
– Level 1 might not get sleep for 3 days
– 3 day loops are a lot of effort to administer, but manageable.

7:50 – 8:10: “How we monitor and scale FB” by Patrick Taylor

– lsof, strace, nm, /proc/pid/exe and trace.py
– cubism (folded perf graphs) click through to “Deep Dive” graphs
– region, data center, cluster, rack, server
– examples with retransmit, cluster cache failure

Patrick/Facebook has developed a “Maslow Hierarchy of Needs for PE”, similar to this one.

Facebook uses a style of time series visualization called a cubism or horizon graph, first implemented in D3 by Mike Bostock at Square in 2012.

Similar Horizon or Cubism Chart by Splunk. Charts are folded into 25% of original height, with darker colors representing larger values.
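
The folding can be sketched in a few lines of Python. This is a generic illustration of the horizon-chart technique for non-negative values, not Facebook's or Cubism.js's implementation; the function name and sample data are made up. With 4 bands, each layer is 25% of the original height, and higher bands would be drawn on top in darker colors.

```python
def horizon_layers(values, n_bands=4, max_value=None):
    """Split a non-negative series into n_bands stacked layers.

    Layer k holds the part of each value that falls between k * band_height
    and (k + 1) * band_height. Drawing all layers on top of each other in
    the same strip, darker for higher k, gives the folded chart.
    """
    if max_value is None:
        max_value = max(values)
    band_height = max_value / n_bands
    layers = []
    for band in range(n_bands):
        floor = band * band_height
        layers.append([min(max(v - floor, 0.0), band_height) for v in values])
    return layers


series = [10, 35, 60, 100]
for band, layer in enumerate(horizon_layers(series, n_bands=4, max_value=100)):
    print(f"band {band}: {layer}")
```

Summing the layers at any point gives back the original value, so no information is lost; it is encoded in color depth instead of height.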

8:10 – 8:30: “Supporting Global Events in FB Live” by Peter Knowles (10-year employee)

Justin Bieber caused a lot of problems up to 2015 (melt-down of his PostgreSQL shard, cache-busting.)

– 3 methods of load testing:

  1. Remove nodes
  2. Synthetic load
  3. Shadow traffic (duplicated traffic, but don’t show copies in user timeline)

8:30 – 9:00 Q&A Speaker Panel and Mingle

Q: If you run out of capacity, do you prioritize ads or user platform up?
A: Ad systems are lower priority than user platform.

Q: How does Facebook support multiple development languages?
A: Thrift. Longer answer is developers should use “officially supported” languages, but they can use anything they want as long as they write their own Thrift client library.

Q: When is Facebook moving to the Public Cloud?
A: Unnecessary. (Editor: FB is the Cloud.)

Q: What was your longest full outage?
A: 36 minutes or so about 18 months ago. (Speaker said he wrote the memo, so he won’t forget the number of minutes.)

Q: Are you using AI/NLP in monitoring?
A: Not yet, but something we follow.

There were some exhibits, with one item apparently being an OpenCompute server.

Food was chicken sliders, tomato bruschetta, triangle-turnovers and smaller appetizers with an open bar.

Thanks to Facebook for hosting the event.

code.facebook.com: How production engineers support global events on Facebook, PE Blog

Cubism.js Time Series Visualization Slides

r-bloggers.com: Cubism Horizon Charts in R
acm.org: Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations
tableau.com: Horizon Chart Workarounds in Tableau

Posted in API Programming, Cloud, Linux, Microservices, Postgresql, Tech | Leave a comment

Northrop Grumman F-5 Links

Northrop F5 Freedom Fighter HD
Ron Gibb, former Northrop Grumman F-5 Project Office, conducts a walk-around of the F-5A Freedom Fighter
F-0195 Northrop F-5A Tiger (Ad)

$15 Million Fighter Jet Has Everything–but A Buyer
W: Area rule, Northrop F-20 Tigershark, Quail decoy/cruise missile
Paul Allen’s F-5B Take-off from SJC (2014)
Northrop F-5C Skoshi Tiger MTPF 6.5 hours, Abort rate 1.5%
Garmin Goes Supersonic in F-5

Posted in Tech | Leave a comment

PostgreSQL and “PANIC: replication checkpoint has wrong magic” error

PSA: I haven’t seen a solution to this issue online, so posting it here for search engines to index.

PostgreSQL is an ACID-compliant database, but filesystem corruption can still prevent it from starting.

In my case, a linux test instance running in a Virtualbox VM was not cleanly stopped after a power failure.

The result was that data/pg_stat_tmp/ was corrupted, and an invalid replication checkpoint value was written, resulting in PostgreSQL refusing to start.

If you see the following error in data/log/:

2018-05-19 06:18:38.437 UTC [2647] PANIC: replication checkpoint has wrong magic 1767992667 instead of 307747550
2018-05-19 06:18:38.446 UTC [2595] LOG: startup process (PID 2647) was terminated by signal 6: Aborted
2018-05-19 06:18:38.446 UTC [2595] LOG: aborting startup due to startup process failure
2018-05-19 06:18:38.448 UTC [2595] LOG: database system is shut down

The solution after you fix any filesystem problems (if you don’t have a slave):

  1. backup your config: cp -p data/postgresql.conf data/postgresql.conf.bkp
  2. in data/postgresql.conf set max_logical_replication_workers = 0
  3. start PostgreSQL to automatically recover, then stop it
  4. restore your config: cp -p data/postgresql.conf.bkp data/postgresql.conf
  5. restart PostgreSQL
  6. read the logs in data/log/ to understand if there are other problems.
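
Step 2 can be scripted. Here is a minimal Python sketch that edits the config text; the helper name set_conf_param is mine, and it assumes the parameter is written in the "name = value" form (PostgreSQL also accepts the form without "="). Run it only after step 1 has made the backup copy.

```python
import re


def set_conf_param(conf_text, name, value):
    """Return conf_text with every active `name = ...` line set to value.

    Commented-out lines are left alone. If no active line exists, the
    setting is appended at the end of the file.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(name)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{name} = {value}"
    if pattern.search(conf_text):
        return pattern.sub(new_line, conf_text)
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"


if __name__ == "__main__":
    path = "data/postgresql.conf"  # same relative path as in the steps above
    with open(path) as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(set_conf_param(original, "max_logical_replication_workers", "0"))
```

Step 4 then restores the untouched backup, so the temporary override never needs to be undone by hand.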

Although you can fix (“monkey-patch”) missing or corrupted indexes like the following with REINDEX INDEX idx_name, you may just want to re-install PostgreSQL from backup at this point:

2018-05-19 17:30:18.052 UTC [11795] ERROR: index "idx_16573_primary" contains unexpected zero page at block 1733
2018-05-19 17:30:18.052 UTC [11795] HINT: Please REINDEX it.

If you have a slave, you will likely have to:

  1. stop the slave first
  2. do the above
  3. make a new backup from the master and rebuild your slave.
  4. start the new slave.

PostgreSQL Manual: 19.6. Replication
Kernel: disabling IRQ #19 (network card)

Posted in Open Source, Postgresql, Tech | Leave a comment

AWS Elastic Load Balancer (ELB) – Call Me Maybe

When is a load balancer not really a load balancer?

When it’s an AWS Elastic Load Balancer (Classic ELB or ALB.)

Although it’s been well-documented for at least 6 years, it’s still not well-known that one side of a Classic ELB (or ALB) does load balancing, but the other side doesn’t have a VIP, or static IP address, which most users expect.

The reasons AWS decided on no static VIP are:

  1. AWS is likely using round-robin DNS to implement ELBs under the hood, although with options like “sticky” and health-checking
  2. AWS doesn’t want users to rely on permanent IP addresses for “AWS internal network management reasons”
  3. AWS didn’t care about end-user expectations, otherwise they would have called it a “HLB” (Half Load Balancer)
  4. AWS can tweak their own services to consume the output of ELB’s, like endpoints.

The problems with not having a static VIP are:

  1. client programs (browsers, Java applications, k8s, etc.) that connect to an ELB will have apparent random connect failures and have to resolve the endpoint per connection request periodically when the ELB changes the addresses
  2. client programs cannot cache DNS if they want reliable connections, a significant performance problem
  3. ELB addresses cannot be whitelisted in firewalls
  4. ELBs typically return several IP addresses, similar to round-robin DNS, which some applications don’t expect
  5. in network engineering terms, ELBs don’t provide a stable Layer 3 (IP) address, which is immensely unhelpful
  6. without a VIP, designing static architectures is fruitless – how can you guarantee 5×9’s when devices are changing without any advance notification? i.e. you’re “painted into a corner”
  7. if traffic ramps up quickly, the ELB topology will scale by returning a varying list of IP addresses in a short amount of time
  8. connections are broken by ELB, introducing corruption into stateful applications like middleware and orchestration software.
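
If you control the client, the first two problems can be softened by resolving the endpoint for every new connection and trying each returned address in turn. This is a minimal Python sketch of that idea, not an AWS-recommended client; the function names are mine, and the resolver and connector parameters exist only so the logic can be exercised without a network.

```python
import socket


def resolve_all(host, port, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses the name resolves to right now."""
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def connect_fresh(host, port, timeout=3.0,
                  resolver=socket.getaddrinfo,
                  connector=socket.create_connection):
    """Resolve the name again and connect to the first address that answers."""
    last_error = None
    for ip in resolve_all(host, port, resolver):
        try:
            return connector((ip, port), timeout)
        except OSError as exc:
            last_error = exc  # stale or unhealthy address, try the next one
    raise ConnectionError(f"no address for {host} accepted a connection") from last_error
```

The cost is one lookup per connection, which is exactly the performance problem described in item 2, so this is a mitigation rather than a fix.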

Additional problems with ELBs are:

  1. even when expecting multiple IPs, ELBs randomize them, causing cache misses. You must also have a distributed session manager if you don’t use “sticky.”
  2. the default is to break the connection on change, not to drain the connection first.

By now, you’re likely horrified as you see your 5×9’s rapidly disappear in the rear-view mirror. 🙂

For most users, the lack of that static IP disqualifies Classic ELBs and ALBs from any HA architecture design.

Solutions if your client program is expecting a static VIP are:

  1. Network Load Balancers (NLB’s) (introduced in 2017) support an EIP
  2. use EIPs (note that replacing a server in your farm will require binding an EIP in some cases, which may take up to 120 seconds. So you really should start with n+1 servers at all times.)
  3. EIPs to HAProxy
  4. third-party solutions like F5.

Lame workarounds:

  1. AWS has an article on combining NLB+ALB+Lambda to track IP address changes
  2. manually monitor ELB changes and restart your apps, preferably automatically
  3. tell your browser users to retry failed requests, or close their browser and reopen it, whenever they see errors.


  1. Route53 DNS ALIAS and CNAMEs only help if your client program doesn’t cache lookups. I don’t know why AWS documentation erroneously says that ALIAS will somehow help with ELB’s, as client programs cache lookups.

AWS High Availability Patterns : DNS Load Balancing Tier
GCE: Network Load Balancing (uses a static IP as expected)

Posted in Cloud, Tech | Leave a comment

RedisConf 2018

I attended RedisConf 2018 on Thursday at Pier 27 (The James R. Herman Cruise Terminal) in SF.

Just like 2017, the talks were extremely high-quality and imaginative – I wished I could see all of the tracks.

The Herman Center is a beautiful new (2012) venue with breathtaking views of Golden Gate bridge, Alcatraz and the downtown skyline.

Executive Summary

  1. Redis community version (free) will have a number of enterprise features built-in this year: multi-master, SSL/TLS and likely multi-level cache primitives.
  2. Redis can provide a new way to look at and solve business problems by combining two or more built-in features. The multi-level cache talk (see below) shows how Redis caching was combined with Redis pub/sub and Lua, resulting in something powerful with only one page of code.


Keynote videos for 2018

Some of the talks I attended:

Thursday Talks

Techniques for Synchronizing In-Memory Caches with Redis Youtube video link
Ben Malec, Paylocity

“For some highly-accessed value, a network roundtrip incurs too much latency. An obvious solution would be adding an in-memory caching layer, but that brings many challenges around keeping data in sync across multiple clients. This presentation will detail the approach Paylocity implemented, which leverages Redis Pub/Sub, bucketing keys to minimize synchronization message length, and carefully exploiting order-of-operation to eliminate the need for a master synchronization clock.”

Data Flow Diagram of Ben Malec’s Multilevel Redis Cache (with ignore self-pub)


– co-worker suggested multilevel cache design. In the past, that was overly complex. Now it is time to reconsider.
– data Source of Truth (SoT) is MS SQL Server
– multilevel cache with Redis as the cache SoT (still has TTLs) and pub/sub to 50 local clients
– .NET (Windows) using default MS memory cache on clients

#1 Possible Multilevel Cache Design
– broadcast keys and values
– will blow up network

#2 Possible Multilevel Cache Design
– broadcast just keys
– can still blow up network

#3 Possible Multilevel Cache Design
– broadcast 16-bit custom hash slot generated on client nodes
– store last updated array in RAM on client nodes and use lazily
– we will explore this option (see diagram)

– various race conditions to think about though when requiring local cache to always be correct despite various latencies and TTLs

– Redis lets a database application exploit Redis’ O(1) data structures, which RDBMS’ cannot match. Note that Publish is O(N_subscribers) and does not guarantee delivery (James)
– the slot hash starts to look like a zero-knowledge proof – interesting area to research (James)

– can decrement timestamp by 1 to know the newly arriving Redis event is later. (Some people postfix with -1, -2, etc.)
– can ignore self pub/sub messages:

if (dataSyncMessage.senderInstanceId == _instanceId)

– can write Lua script on Redis side to send the pub/sub event and save sending 1 event over network

– possibility of cache thrashing in this design since hash slot is a range of keys with 16-bit ID’s, not a concern in practice
– but 18-byte (instance guid and slot int) messages (very small)


– publish hit/miss metrics into ELK or Redis time-series module maybe
– possibly use Redis XFETCH to optimize cache reload
– add support for more Redis types
– track hot keys
– StackExchange also doing similar multilevel cache design.

– 45x faster with local cache than accessing Redis over network, approaching zero bandwidth
– also client could tell Redis a TTL and not send a pub/sub message!
– look at Redis’ key notifications (Salvatore commented on that)
– Salvatore: “We could spend a month talking about how to incorporate this into Redis.” 🙂

Bandwidth Optimizations

– delete notifications from clients to Redis are slot hash values (that represent a range)
– update notifications to clients are slot hash values that are cached in lastupdate array. This array is consulted before using a value in case it needs to be re-fetched (lazily.)
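
My own rough Python sketch of the slot-hash idea follows (the talk's code was .NET). Assumptions: CRC32 folded to 16 bits stands in for the custom hash, a dict stands in for Redis, FakeBus stands in for Redis pub/sub, and a process-wide logical counter replaces real timestamps, so none of the cross-machine clock issues from the talk are modeled.

```python
import itertools
import uuid
import zlib

SLOT_COUNT = 1 << 16             # 16-bit slot space
_ticks = itertools.count(1)      # logical clock, single process only


def tick():
    return next(_ticks)


def slot_for(key):
    # Assumption: CRC32 folded to 16 bits, not the hash used in the talk.
    return zlib.crc32(key.encode("utf-8")) & (SLOT_COUNT - 1)


class FakeBus:
    """In-process stand-in for Redis pub/sub (hypothetical)."""

    def __init__(self):
        self.subscribers = []

    def publish(self, sender_id, slot):
        for callback in self.subscribers:
            callback(sender_id, slot)


class LocalCache:
    def __init__(self, shared_store, bus):
        self.instance_id = uuid.uuid4().hex
        self.shared = shared_store            # dict standing in for Redis
        self.bus = bus
        self.values = {}                      # key -> (value, cached_at)
        self.last_update = [0] * SLOT_COUNT   # the "lastupdate" array
        bus.subscribers.append(self.on_message)

    def on_message(self, sender_id, slot):
        if sender_id == self.instance_id:
            return                            # ignore self pub/sub messages
        self.last_update[slot] = tick()

    def set(self, key, value):
        self.shared[key] = value
        self.values[key] = (value, tick())
        self.bus.publish(self.instance_id, slot_for(key))

    def get(self, key):
        entry = self.values.get(key)
        if entry is not None and entry[1] > self.last_update[slot_for(key)]:
            return entry[0]                   # local hit, no network
        value = self.shared.get(key)          # lazy re-fetch
        self.values[key] = (value, tick())
        return value
```

Only the sender id and a slot number cross the bus, which is why the messages stay tiny, and a slot update invalidates every key that hashes to it, which is the cache-thrashing trade-off mentioned above.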

Latency Optimizations

– slot hash values, not key:values (Redis keys can be 512 MB)
– local lastupdate array, no network check

Code Optimizations

– using MS Cache and Redis pub/sub
– only need 1 page of code to implement multilevel cache, easier to verify correctness and/or test
– short code can be customized per use case

See links at bottom for related client-side cache discussions.


– idea for pub/sub came in shower
– implemented within one data center
– but could be useful for reducing latency in geo-distributed databases as Redis multi-master goes GA (James)

Application of Redis in IOT Edge Devices
Glenn Edgar, Lacima Ranch (“The Avocado Farmer”)


– IoT by former embedded engineer intended for avocado farmers
– 22 sprinkler wires plus PLC for $300
– tell field-hand where to find sprinkler damage (3 Gallons/Minute)
– gopher can chew on sprinkler, cracks can be non-visible (0.5 Gallons/Minute)
– started with typical embedded programming/web solution with Mongoose web server
– but needed to share data between 2 processes …
– Raspberry Pi with Redis is job queue controller, python apps talk to it
– need to schedule watering, collect info for flow and pump problems, log
– “I have the smallest database at the conference: 40 MB.” (“Power of small data!”)
– then 3 processes, graph database module needed
– reference by a Chinese electric utility to the IEC 61970 standard for SCADA, studied it
– SNMP on steroids
– not using Grafana now, but if he did, would do with monitoring as code
– some graph nodes are weather stations
– 2 types of data structures: system graph/logs and irrigation schedules
– code generates Redis keys to ensure keys are managed properly
– evaporative loss and moisture loss calculations
– Deep Learning and ML to interpret graphs, trends
– “You’re not going to run Tensor Flow on a Raspberry Pi.”
– Method of Synchronization between Cloud and Edge with AMQP
– adoption limited by less sophisticated neighbors and commercial SCADA interests
– cameras are for security: avocado theft

Integrating Redis with ElasticSearch to Get the Best Out of Both Technologies
Dmitry Polyakovsky, Zumobi

ft.aggregate API

– can we count this? faceted search in Redis

– insert item and index in 1 ms
– aggregate vs. search


– top N
– no processing involved


– top surname and count
– count StackOverflow questions by database by month
– filter -> group -> apply -> sort -> add more (all Redis types)
– country -> age -> profession -> languages
– numeric functions and expressions
– distributed: naive uses window function, still too much bandwidth and time
– distributed: better uses coordinator and does aggregates only (1000x less data)
– hyperloglog maintains 4% precision even after coordinator merge
– reservoir sample -> median
– query plan translator has 2 parts: remote query and local query


– single GROUP BY advisable for good performance
– high number of groups – still slow
– exact COUNT DISTINCT and quantiles slow
– for huge parallel workloads, use Spark, etc.


– live sub-second searches of multi-million row StackOverflow data with AWS one-server 15-shard demo server
– APPLY is used last, so efficient on TOP N queries
– redash-client, plus custom changes


– port all existing Redis simple search functionality
– streaming time-window searches

Conference Closing Session

– Open bar with bar snacks
– prizes giveaway
– talked to some of the presenters.

The James R. Herman Cruise Terminal

Though beautiful, it has some limitations as a conference venue, being split on 2 floors with occasional wind rattling. Parking is also limited.

Booth staff wore parkas and were still unbearably cold. Coffee was only available on the first floor, but talks were on the second floor – rather inconvenient, but easy to fix for next time.

The “F” streetcar stops outside.

Scaling a High-traffic Rate Limiting Stack With Redis Cluster
Multilevel cache system for Java 8
RedisConf18: Techniques for Synchronizing In Memory Caches with Redis – Redis Labs
groups.google.com: Client side caching: initial design ideas (2018)
http://antirez.com: Client side caching in Redis 6 (July, 2019)

Posted in API Programming, Cloud, Java, Linux, Microservices, MySQL, Open Source, Tech | Leave a comment

Percona Live 2018 Conference

I attended the Percona Live 2018 database conference in Santa Clara again.

As always, the conference was very well-organized and had great talks.

For 2018, Percona attempted to make it more affordable, with prices in the $600 range. Much appreciated!

Executive Summary

  1. MySQL 8 has been released with many new and improving features, including JSON and dynamic settings support
  2. PostgreSQL 10 has create partition and improving geo-distributed features
  3. ProxySQL can be used to replace HAProxy in HA database architectures.

I mainly attended the PostgreSQL track this year. The presenters were experts – either PostgreSQL server developers, professional trainers or AWS RDS/Aurora staff.

Tuesday Keynotes

Percona Summary

Linux Performance 2018
Brendan Gregg, Netflix
Important to DBA
– BPF support added to perf
– BPF acts as a “sandbox” for monitoring agents
– Google’s BBR TCP algorithm 3x performance with 1% packet loss (hired Van Jacobson)
– Facebook’s Kyber block I/O scheduler reduces 99th-percentile latency by 300x


Real-time with Redis
Jon Hyman, Braze (formerly AppBoy)

– add jitter to timestamp to fix thundering herd
– 15 job queues
– SADD 7 GB arrays overwhelm replication slaves
– CPU utilization can stop replication
– API aggregation of keys
– fine-grained for user flexibility and triggered event use
– 120-140 redis servers at 40,000 ops/seconds
– expensive for replication
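
The jitter tip is easy to show. This is a minimal Python sketch of the general technique, not Braze's code; the function name and the 10% spread are my assumptions.

```python
import random


def jittered_ttl(base_seconds, spread=0.1, rng=random):
    """Return base_seconds shifted by up to +/- spread (a fraction of the base).

    Spreading expiry times keeps many keys or jobs from firing in the
    same instant and stampeding the backend.
    """
    offset = base_seconds * spread * rng.uniform(-1.0, 1.0)
    return max(1, round(base_seconds + offset))
```

For example, jittered_ttl(600) returns a value between 540 and 660 seconds, so keys written together no longer expire together.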


– ProxySQL can help with 1 minute RDS failover and nearly-immediate Aurora failover
– ProxySQL does not forward the client IP yet (bug filed)
– ProxySQL is a good solution for filtering reads to slaves

Tuning Postgresql for High Write Workloads
Grant McAlister – Sr. Principal Engineer – Amazon RDS

– WAL compression will help reduce full page writes
– testing with random data may not provide useful results with WAL
– max_wal_size = 16GB, but increases recovery time
– checkpoint each minute
– async with random data maybe slower (code path not optimized for this workload)
– extra indexes in pg are expensive – hundreds or thousands of %
– prefix uuid with date to update btree on right (right lean)
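
A minimal Python sketch of the date-prefixed key idea; the exact format (UTC timestamp, dash, random hex) is my assumption, not something shown in the talk.

```python
import datetime
import uuid


def dated_id(now=None):
    """Return a key that sorts by creation time: UTC timestamp + random UUID."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return now.strftime("%Y%m%d%H%M%S") + "-" + uuid.uuid4().hex
```

Because new keys always compare greater than old ones (to one-second resolution), inserts land on the right-hand edge of the btree instead of dirtying random pages.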

– run vacuum to maintain perf (HOT and block cache) and avoid transaction ID (XID) wraparound
– vacuum in memory:
– increase checkpoint_timeout
– alter table X set
– in memory before checkpoint 3.5 seconds
– In memory after checkpoint 84.5 seconds
– Vacuum not in memory 165.8 seconds

– Aurora: no checkpoints, no FPW (full page writes), no log buffer
– not SAN, so block level. Also 4/6 quorum
– hash indexes are crash-safe in Aurora 9.6, and will be the same at GA in 10
– managed service AWS staff gets paged on transaction ID (XID) wraparound

Securing Your Data on Postgresql
Payal Singh – OmniTI Computer Consulting Inc. payal@omniti.com

– Policies
– Row-level security (FORCE for table owner)
– Public is “public”
– before 10, policies were “OR”. 10 can do “AND”
– SSL (Certs needs to be 600 or restart will fail)
– event triggers, like DDL events, for auditing; can detect ownership changes
– table_rewrite notification to reject ddl and send message (possible solution for write amp)
– pg_audit
– at-rest encryption: pgcrypto. There is performance impact, so choose columns
– backups
– monitors to monitor queries for compliance (pg_stats extension)
– cryptdb can do encrypted queries (mysql-based)
– read-only replica can be a compliance problem because no logging possible
– desired features: redaction, Oracle-style TDE, show grants
– password strength checking only works on plaintext passwords, not md5 or scram
– no rush for AWS KMS support

Saving Bandwidth When Using MySQL
Georgi Kodinov, Oracle MySQL SrvGen Team Lead (includes Security Team)

– Wireshark (smallest item is data response!) – disable SSL, will show sub requests
– SELECT and CALL generate metadata that is large. It is needed by the MySQL client program, but not by PHP apps. 443 vs. 106 bytes
– SET @@session.resultset_metadata=NONE;
– MySQL protocol has “capability flag” to help with backward compatibility
– Text protocol vs. binary protocol
– MySQL Protocol Doxygen in MySQL 8.0

Amazon Aurora MySQL and RDS MySQL: Lessons Learned
Mariella Di Giacomo

– did benchmarks with t2.medium and a large of RDS MySQL vs. Aurora MySQL. Aurora won, but I’m not sure of the difference in cost

Exhibit Hall

– got a demo of Vivid Cortex SAML/Okta support. Very nice UI.

VividCortex Dinner at Levi’s Stadium
– nice chance to meet other Vivid Cortex users and conference attendees
– Brendan Gregg is starting a cricket team at Netflix.

Wednesday Keynotes

Percona Summary

Wednesday Lectures

PostgreSQL Replication
Christophe Pettus, PostgreSQL Experts thebuild.com

WAL-Based Replication

– first core replication was WAL shipping. All or nothing, monitor your disk. Same major version.
– archive_command can copy to secondary, manage it yourself
– for stream replication, use recovery.conf to point secondary at primary
– optional sync available (sometimes used to avoid lag)
– replicas can cascade
– max_standby_*_delay to control replication lag
– use 0 for DR
– hot_standby_feedback

Trigger-Based Replication

– triggers cascade and run in alphabetical order
– on tables (Slony 1). Bucardo does multi-master replication (ping-pong protection)
– tedious, fiddly, performance impact, no DDL
– Slony requires C-language extensions, so cannot use on RDS
– Bucardo can be used on RDS
– RDS version 10 has WAL files access

Logical Decoding
– introduced in 9.4, improved in 9.6
– create replication slot to capture WAL stream
– tracks WAL position of consumer
– can run out of disk space

Replication Plugins
– can do whatever you want
– eg. export to Kafka

Logical Replication
– PG 10 has built-in logical replication, 9.4 has pglogical
– pg_dump to dump schema
– primary key or unique index is a good idea, often required
– sequence values are not replicated, use disjoint ranges per server or UUIDs
– truncate not replicated, nor cascade
– replicate real table to real table, not view, FK
– pg 10 partitioning cannot be replicated because root table is not a real table
– no temp tables or unlogged tables
– COPY is individual sql inserts, could be millions
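
For the disjoint sequence ranges tip, here is a small Python sketch that computes a non-overlapping block per server and prints the matching DDL. The block size, the helper names and the sequence name are my assumptions, not from the talk.

```python
def sequence_range(server_index, block_size=10**12):
    """Return the (low, high) ID range reserved for one server."""
    low = server_index * block_size + 1
    return low, low + block_size - 1


def create_sequence_sql(name, server_index, block_size=10**12):
    """Build CREATE SEQUENCE DDL confined to that server's range."""
    low, high = sequence_range(server_index, block_size)
    return (
        f"CREATE SEQUENCE {name} MINVALUE {low} MAXVALUE {high} "
        f"START WITH {low} NO CYCLE;"
    )


if __name__ == "__main__":
    for index in range(3):  # three hypothetical servers
        print(create_sequence_sql("orders_id_seq", index))
```

NO CYCLE makes a server fail loudly when its block is exhausted instead of wandering into a neighbor's range.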

– statement-based replication splits commands and sends to 2 servers – don’t use

Amazon DMS
– pg logical decoding
– timestamptz not supported at this time

2ndQuadrant BDR
– closed source, bi-directional

Streaming does DDL – reliable since 9.3, monitor secondary in case of quiet disconnect

PostgreSQL Replication
Simon Riggs, 2ndQuadrant

– Physical Streaming replication is sending WAL. Avail. 8 years. In-core
– Logical is non-WAL, like MySQL
– hot standby is same as read replica (AWS)
– repmgr and barman help mng
– postgres_fdw (foreign data wrapper)
– file_fdw access data like COPY
– PG pub/sub replication (push)

Multi-Node Advanced Features
– Push (Logical) or Pull Data Access (FDW)
– Multi-server hetero SQL
– Sharding (native in PG 11)
– Multi-node Query
– Multi-master database
– PostgreSQL-XL MPP similar to Teradata, Greenplum, Redshift
– Postgres-BDR Geo Cluster
– https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/
– BDR = Bidirectional Replication

Deep Dive into the RDS PostgreSQL Universe
– use cross-region read replica for migration
– create a parameter group with rds.force_ssl=1
– RDS uses pg_upgrade or DMS
– OS Level Enhanced Monitoring
– Amazon RDS Performance Insights (Database Level) – Postgres Aurora today

Conference Closing Session

Lightning Talks

See this Percona Blog.

Prizes Giveaway

– Nice prizes for filling out the exhibitor pass.

Posted in API Programming, Linux, MySQL, Open Source, Oracle, Postgresql, Tech | Leave a comment

Southwest Flight 1380 Accident and Likely Changes

As a commercially-rated airplane pilot who’s read over 2,000 accident reports, I followed the Southwest Flight 1380 fatal uncontained engine explosion closely.

Executive Summary:

    The FAA failed in its oversight responsibility of airline engine maintenance in two cases:

  1. The rotating fan blade that killed the passenger had not been tested since 2012, almost 5 years, meaning this was a preventable accident. Fatigue cracks were found at the blade dovetail, an obvious locus. Update 2018-10-01: engine inspections now 1,600 cycles.
  2. The engine shroud did not contain the flying blade, leading to the death of a passenger.

The FAA is likely investigating the following concerns:

  1. how well did the pilots handle the emergency (it appears well, handling the depressurization event and descending to a breathable altitude.)
  2. engine turbine blade failure (engine design and maintenance)
  3. uncontained engine shrapnel (engine design)
  4. uncontained engine shrapnel into passenger cabin, resulting in the death of a passenger and 7 with minor injuries (plus psychological trauma, esp. to first responders) (fuselage design)
  5. improper use of oxygen masks by most passengers (passenger safety briefing, oxygen mask design.)
  6. the passenger killed was wearing a seatbelt, yet was partially blown out of the airplane. Was the seat belt worn snugly? If so, do we need 3-point or 5-point harnesses?
  7. it’s the first US airline accident in 9 years, and the first en route fatal accident for Southwest. The FAA wants zero accidents for political reasons, and Southwest wants zero accidents as an operating airline.

The oxygen masks descended, but passengers wore the mask over their mouth. Masks are intended to be worn over the nose, so likely the mask shape will be changed and training materials updated.

Regarding the above items, a simple explanation of what was expected and what happened is this.

There are FAA regulatory and political expectations that uncontained engine failures don’t happen. However the physics of large machines with rotating components says otherwise. Likely this will result in armoring the engine shroud and hull with kevlar or metal, and further engine design changes based on energy of rotating components.

There are FAA expectations that the aviation industry mitigates risk so that the uninformed flying public is not injured. However, a passenger died en route.

After the depressurization, oxygen masks were expected to descend and be used by passengers over their nose. Instead they were put over their mouth. Likely passenger briefings will become more detailed, and oxygen masks will be changed from a round to pear-shaped.

Fortunately the pilots descended relatively quickly, at 3,000 fpm, to 10,000′, as:

  1. time of useful consciousness at 35,000′ is only about 1 minute
  2. the masks were worn incorrectly
  3. only about 10 minutes of oxygen is available in canisters when properly provisioned

However, was the emergency descent actually fast enough? With an improperly-worn mask, generally-speaking the body needs sufficient cabin oxygen pressure within 4 minutes to prevent brain-damage or death.
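
A back-of-the-envelope check of that question in Python, using only the numbers in this post. The 35,000′ starting altitude is my assumption, borrowed from the time-of-useful-consciousness figure above, not the flight's actual altitude.

```python
def descent_minutes(start_ft, target_ft, rate_fpm):
    """Minutes needed to descend at a constant rate."""
    return (start_ft - target_ft) / rate_fpm


minutes = descent_minutes(35_000, 10_000, 3_000)
print(f"{minutes:.1f} minutes to reach 10,000 ft")  # prints 8.3
print("within 10 minutes of canister oxygen:", minutes < 10)
print("within the 4-minute limit without a working mask:", minutes < 4)
```

Under those assumptions the descent fits inside the oxygen supply, but takes roughly twice the 4-minute limit for a passenger whose mask is not working.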

avweb.com: Southwest Accident Brings Passenger Safety Briefings To The Forefront
philg: Southwest 1380: think about the flight attendants
avweb.com: Southwest 1380: “Flew Like a Rock”
ainonline.com: Southwest Airlines 1380 Engine Failure 4/17/2018 ATC Audio
1 killed as Southwest jet makes emergency landing after apparently blowing an engine in flight

Qantas Oxygen Mishaps (2007 and 2008)

These Qantas incidents illustrate the importance of vigilance when maintaining passenger oxygen canisters:

1) Careless filling with nitrogen instead of oxygen:

theage.com.au: Probe after Qantas pumps wrong gas into jets PPRuNe

2) Oxygen canister valve failure due to unknown cause:

nytimes.com: Officials Ask Qantas to Inspect Oxygen Canisters
telegraph.co.uk: Exploding oxygen bottle caused hole in Qantas jet
smh.com.au: Valve in oxygen cylinder the culprit in 747 explosion

Oxygen canister valve failure causes depressurization in Qantas 747 (2008)

Regulators Mandate More Inspections for 737NG Fan Blades

Posted in Tech | 1 Comment

Hiller Aviation Museum 2018

I spent the afternoon at the amazing Hiller Aviation Museum at San Carlos Airport (SQL).

The museum is a real gem, with compact (3 hours to see everything) but nice exhibits in the areas of:

1) Hiller helicopters
2) flying platforms
3) engines from 30 HP (Wright B) to 3,500 HP (Wasp), including a Merlin
4) flight simulators
5) 727 and 747 “steam gauge” cockpits
6) a truly beautiful Grumman Albatross
7) models of most other aircraft

(The Boeing SST that was on loan has been returned to Boeing.)

Flight Simulators at Hiller

Hiller has done a great job of integrating modern computer displays and flight simulators into their exhibits:

Mechanical Flight Simulators

– Curtiss P-2 with map table
– portable flight director
– IFR simulator
– Stearman simulator

Computer Flight Simulators

– 16x PC on 2nd floor (FS9 with Saitek yoke, engine quadrant, and pedals with 3 monitors)
– 1x Redbird FMX full-motion AATD
– Google Earth “wall”
– Wright Flyer (MS FS9 with wood levers and 3 screens)
– a real helicopter with cyclic control and 1 screen

They are moving to X-Plane 11 as PC hardware compatibility leaves FS9 behind.

Additional points of interest:

1) The 747-100 fuselage static display behind the museum lets you climb up the 1st class cabin and sit in the cockpit. It has 5 seats (pilot, copilot, navigator, radio and FAA.) It is closed at 4:30 pm.
2) Behind the 747-100 is a small platform where you can observe takeoffs and landings. It is closed at 4:30 pm.
3) The Burger King next door has a helicopter on static display outside, and historical aviation photos and a “briefing area.”

avweb.com: Personal Flight Simulators: one-G Simulation

Posted in Tech | Leave a comment

Postgresql vs MySQL in the Enterprise

There are few people who have used both MySQL and Postgresql in production at scale. Both are great Open Source databases, so I thought I’d add some comments based on my experience.

Feature | MySQL/InnoDB | Postgresql
Replication | Many options, simple to admin | Many options, many warts
Index Write Amplification | No | Yes
Error handling on disk full | Sketchy | Yes
Grants complexity | Simple to understand | Complex
Sync and Semi-sync options | Several | Yes
Easy-to-use CREATE PARTITION | Yes | PG 10+, root table is not real
ON DUPLICATE KEY | Yes | PG 9.5+, but odd
Partial Indexes | No | Yes
Functional Indexes | No | Yes
DELETE Purging | Automatic Purge thread | VACUUM
Object Names | Finely-scoped | Globally-scoped
Invisible Indexes | 8.0+ | hypopg extension

Howto Install Only psql on Mac OS X with ‘brew install libpq’

Posted in MySQL, MySQL Cluster, Open Source, Postgresql, Tech | Leave a comment

AWS Loft Security Week – GuardDuty

This week was Security Week at the free AWS Loft SF. I went to the Threat Detection & Remediation (GuardDuty) day, since I use it.

GuardDuty aggregates logs from 3 sources (VPC Flow Logs, AWS CloudTrail event logs and DNS logs) and lets you filter the events you want.

GuardDuty is free for 30 days and will report on what future use will cost, so when you’re ready, just enable it. Since it monitors other AWS logs, there’s no impact on your other services or instances.

One of the filter methods is a lambda, which can call another lambda (chained lambdas.)

If you create a separate “forensics account” in the same AZ, you can automatically do some sophisticated things:

  1. forward logs and events for analysis that is isolated from your production account
  2. have your lambda move (i.e. “quarantine”) a suspect host from your production account to the forensics account.

The lecture “A Case Study on Insider Threat Detection” was mostly on GuardDuty. In my experience, Loft lecturers are excellent, and this was no exception.

Afterwards was a detailed 2-part lab where you create two hosts, have them interact, and view the events in GuardDuty. Bring a mouse to AWS labs, because you’ll be doing a lot of clicking around. 🙂

I noticed that most of the attendees and even the “Ask the Experts” were not familiar with newer AWS services and features, like GuardDuty and PrivateLink. Such is the rapid progress that AWS is making.

Gripes: The pasta salad was slightly crunchy (pasta was under-cooked by about 3 minutes) and there was no half-and-half for the coffee. Also, the previous Loft configuration with lectures upstairs and Ask the Experts downstairs made more sense, since putting them on the same floor causes noise interference problems.

AWS Pop-up Loft
1446 Market Street, San Francisco
Feb 20 – Feb 23 10:00AM – 4:00PM
(You should register the week in advance for AWS Loft events, but you can also register on-site with a photo id.)

AWS Forensics Marketplace Vendors
AWS Loft London: Incident Response and Forensics Slides
/r/aws: Is AWS GuardDuty an IDS/IPS?

Posted in Cloud, Linux, Tech | Leave a comment

Congrats to SpaceX on the Successful Launch of Falcon Heavy

Congrats to SpaceX on the successful launch of Falcon Heavy, with a Tesla for ballast.

And yes, recovering two boosters simultaneously on adjacent pads is showing off, but of the best kind!


One of the main goals of this launch was to not blow up the launch pad, since that would:

  1. halve the number of pads available
  2. take a year to rebuild
  3. and cost about $50 million.

Although it’s always uplifting to see a successful launch, it’s actually valuable to have failures early-on. Having a successful first launch means the rocket was likely over-engineered (ie. heavy.) Hopefully review of the sensor data can find something worth investigating and fixing before launching satellites or people.

One launch doesn’t mean the mechanical and acoustical vibrations between the 27 engines are damped enough, or even understood at this point.

“In America, if you have lots of money, you can build a space program and launch your own car into heliocentric orbit.” 🙂

SpaceX launches Falcon Heavy, the world’s most powerful rocket
WATCH LIVE: SpaceX to Launch Falcon Heavy Rocket #MarsRocket @3:45pm EST delayed

Posted in Tech | Leave a comment

Maintenance Banner mx_banner Available

While watching Superbowl 52, I finished my github project Maintenance Banner, aka mx_banner.php.

It lets an application administrator schedule maintenance event notifications for a web application in a database, and the web application can then automatically display those on the post-login page.

mx_banner.php is Apache-licenced and written in PHP with PDO and supports MySQL and Postgresql. There are only a few fairly generic SQL statements, so it should work also with SQLite, Microsoft SQL Server and Oracle Enterprise.

mx_banner.php should be installed on a web server that supports SSL and uses authentication, but I suppose for internal applications you could get by without those security layers.

Your web application can be written in any programming language that can query the maintenance events database and insert the event information into a div on your application’s post-login page.
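
As an illustration of that integration from a non-PHP application, here is a Python sketch using SQLite. The table and column names (mx_events, message, display_from, display_until) are hypothetical and are not the actual mx_banner schema; check the project's schema file for the real names.

```python
import html
import sqlite3


def active_banner_html(conn, now):
    """Return a div with the maintenance messages to display at `now`.

    `now` is an ISO-8601 string, which sorts correctly as text.
    Table and column names are hypothetical, not the real schema.
    """
    rows = conn.execute(
        "SELECT message FROM mx_events "
        "WHERE display_from <= ? AND display_until > ? "
        "ORDER BY display_from",
        (now, now),
    ).fetchall()
    if not rows:
        return ""
    items = "".join(f"<p>{html.escape(message)}</p>" for (message,) in rows)
    return f'<div class="mx-banner">{items}</div>'
```

The application calls this once per post-login page render and drops the returned string into its template. An empty string means there is nothing to show.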

If you host multiple applications that can experience unique maintenance windows, then just install an extra copy of mx_banner.php and the database schema for each application maintenance “zone.”

mx_banner.php is a minimal but non-trivial web application that is also good for learning PHP application programming. Also see the WordPress plugin I wrote in PHP on github.

Posted in Linux, MySQL, Open Source, Postgresql, Tech | Leave a comment

Superbowl 52 in Minneapolis

I watched Superbowl 52 on restaurant cable. Great seat for a great game with the most ground and air yardage in history.

PHI won 41 – 33 NE in Minneapolis with some nice trick plays in the end-zone.

Brady started the game with an injured hand and had to bench for a while.

Sadly Brandin Cooks didn’t see a legal but huge hit coming after a reception and ran into a player with helmet contact, resulting in a head injury.

The kickers said they “had problems with the in-turf logos.” There were a few missed field goals, including one that hit the post.

Justin Timberlake did the half-time show, but with only 15 minutes to perform his appearance was too brief.

Nick Foles, High-Powered Eagles Stun Tom Brady, Patriots to Win Super Bowl 52

Jack in the Box (Jack and Martha Stewart) and Amazon (Alexa “turks”) had some funny commercials.

time.com: These Are the Best 2018 Super Bowl Commercials

Posted in Tech | Leave a comment

Table Partitions in MySQL and Postgresql

We’re lucky to have two great Open Source databases, MySQL and Postgresql.

One of the killer features in both MySQL and Postgresql is table partitions – for example, most Silicon Valley adtech companies are powered by MySQL partitions.

They let enterprises, and growing startups, easily manage large volumes of data.


  1. drop millions of rows without purge thread or VACUUM workload, esp. useful for time-series (logging) and retention compliance requirements
  2. save storage space by implying the key in the table name
  3. allow transportable tablespaces for archiving and repair or storage management
  4. allow smaller table scans and index lookups by isolating the row range
  5. potentially allow “parallel query execution” (I believe Oracle has a patent on this, but eventually patents expire.)
  6. potentially allow parallel writes across partitions. I believe this already works in recent versions of MySQL/InnoDB and since Postgresql 8.0.


  1. syntax restrictions on secondary keys, triggers, etc. so read the manual first
  2. pgloader does not support CREATE PARTITION as of Jan. 27, 2018. (The author is looking for sponsorship however.)

Postgresql 10 adds a simpler syntax, known as declarative partitioning, that is similar to MySQL’s syntax.
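To make the retention use case concrete, here is a small Python sketch that generates Postgresql 10 declarative-partitioning DDL for monthly range partitions. The parent table name logs and the naming scheme logs_YYYY_MM are assumptions chosen for illustration; the script only prints SQL and does not connect to a database.

```python
from datetime import date

def month_start(year, month, offset=0):
    """First day of the month `offset` months after (year, month)."""
    index = year * 12 + (month - 1) + offset
    return date(index // 12, index % 12 + 1, 1)

def create_partition_sql(parent, year, month):
    """DDL for one monthly range partition (the upper bound is exclusive)."""
    lo, hi = month_start(year, month), month_start(year, month, 1)
    return (
        f"CREATE TABLE {parent}_{lo:%Y_%m} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{lo}') TO ('{hi}');"
    )

def drop_partition_sql(parent, year, month):
    """Retention: dropping a partition removes its rows without a bulk DELETE."""
    return f"DROP TABLE {parent}_{month_start(year, month):%Y_%m};"

# The (hypothetical) parent table would be declared once, e.g.:
#   CREATE TABLE logs (logged_at date NOT NULL, msg text) PARTITION BY RANGE (logged_at);
print(create_partition_sql("logs", 2018, 12))
print(drop_partition_sql("logs", 2017, 12))
```

A scheduled job could run the generated statements each month, creating the next partition ahead of time and dropping the one that has aged out of the retention window.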

Here’s a chart comparing partitions across database products:

Feature | MySQL 5.0+ | Postgresql 8.0+ | Postgresql 10.0 | Postgresql 11.0
Table Partition Syntax | 5.7 | 9.6 | 10 | TBD

The MySQL bug database lists many serious partition bugs in 5.0 and 5.1, so it’s important to use the latest version possible. MySQL also supports merge tables for MyISAM tables.

Chapter 15 of “PostgreSQL 9 High Performance Book Review” covers Postgresql 9 partitions.

PostgreSQL Partition Manager Extension (pg_partman), Blog, github
Yahoo MySQL Partition Manager, Blog
PalominoDB pdb-parted
PostgreSQL 10: Partitions of… partitions! HN Discussion
wiki.postgresql.org: Table partitioning
Creating partitions automatically in PostgreSQL
rhaas.blogspot.com: Plans for Partitioning in v11
reddit: What are some bad things about PostgreSQL?
PostgreSQL 10 – table partitioning – how to check partitions and manipulate with them

Posted in MySQL, MySQL Cluster, Open Source, Oracle, Postgresql, Tech | Leave a comment

Postgresql Concepts for MySQL Users

Postgres’ UPSERT functionality is an important improvement. Although most developers think of it as a syntax improvement, it’s much deeper than that:

  1. the developer can do new and different things in one statement
  2. duplicate key log messages can be easily suppressed
  3. upsert is atomic, whereas insert/exist/select is not atomic without locking effort
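The atomic behaviour in point 3 can be tried out with a short Python sketch. It uses the standard sqlite3 module as a stand-in, on the assumption that the bundled SQLite is version 3.24 or newer, which accepts the same INSERT … ON CONFLICT … DO UPDATE form as Postgresql 9.5. The table page_hits is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_hits (url TEXT PRIMARY KEY, hits INTEGER NOT NULL)")

# One atomic statement: insert the row, or bump the counter if the key exists.
# The target column is qualified with the table name, which Postgresql requires.
UPSERT = (
    "INSERT INTO page_hits (url, hits) VALUES (?, 1) "
    "ON CONFLICT (url) DO UPDATE SET hits = page_hits.hits + 1"
)

for _ in range(3):
    conn.execute(UPSERT, ("/index.html",))

hits = conn.execute(
    "SELECT hits FROM page_hits WHERE url = ?", ("/index.html",)
).fetchone()[0]
print(hits)
```

There is no separate SELECT to check for the row and no duplicate-key error to catch, which is what removes the race between concurrent sessions.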

Uniqueness in PostgreSQL: Constraints versus Indexes
Avoid naming a constraint directly when using ON CONFLICT DO UPDATE
PostgreSQL Upsert Using INSERT ON CONFLICT statement
How to install PostgreSQL 9.5 on CentOS 7

Alias for table name in SQL insert statement
Postgres 9.5 feature highlight – Upsert
psql: FATAL: database “” does not exist
psql: FATAL: Ident authentication failed for user “username” Error and Solution
Postgresql Repos: 9.5, 9.6
Postgresql Sample Databases
brew install postgresql@9.6

Posted in Open Source, Postgresql, Tech | Leave a comment

Last USA 747 Passenger Airplane Retired

The last 747 has been retired from passenger service in the USA.

I’ll miss the 747 because it was the only widely-used airliner built for trans-oceanic flights:

  1. For safety reasons, I’d rather have 4 engines than 2 for trans-oceanic flights. The slow acceptance of twin-engined airliners under ETOPS over the past few decades means greater fuel efficiency for airlines, but in a pinch you’re left with one engine, which is not a good situation for passengers.
  2. The 777s that I’ve flown on to Asia don’t have eyeball air vents, so I end up roasting on long flights.

Delta Retires Last USA 747 to Boneyard

Malaysia hunts owners of Three Boeing 747s abandoned at airport

Posted in Tech | Leave a comment

Part 2: Migration Notes from MySQL to Postgresql Using pgloader

This is a multi-part series starting with Part 1.

After perfecting the schema and data migration with pgloader rules and getting the application sessions and login code working, it was time to convert the non-working SQL statements to Postgres syntax.

Here’s the SQL syntax I had to change (note: I use placeholders (?) whenever possible):

MySQL | Postgresql 9.1 | Notes/PG 9.5
DATE_ADD(?, INTERVAL ? unit) | ? + INTERVAL '$x timeunit' | Perl DBI can do placeholder with ?::interval and "$x timeunit"
DATE_SUB(?, INTERVAL ? unit) | ? - INTERVAL '$x timeunit' | see above for interval placeholder info
? = 0 or '0000-00-00 00:00:00' | ? IS NULL |
INSERT IGNORE | INSERT in the case when the action is not important. | PG 9.5: INSERT … ON CONFLICT DO NOTHING
REPLACE INTO | custom code in the case when rows are immutable (never updated). | PG 9.5: INSERT … ON CONFLICT … DO UPDATE
LIMIT ?, ? | LIMIT ? OFFSET ? | can use the pg syntax for both databases 🙂
GROUP BY | GROUP BY | Postgresql is stricter about the columns list here
ORDER BY NULL | | MySQL optimization only
SQL_CALC_ … FOUND_ROWS | SELECT COUNT(*) | MySQL optimization only
table `user` | table users | Postgresql has public.user, so treat the user table name as a reserved symbol
UPDATE table1, table2 … | UPDATE table1 … WHERE id IN (SELECT id FROM table2 WHERE …) |
last insert id (various) | RETURNING |
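One detail of the LIMIT row is easy to miss: MySQL’s LIMIT ?, ? takes (offset, count), while LIMIT ? OFFSET ? takes (count, offset), so the bind values have to be swapped along with the syntax. The helper below is a hypothetical sketch of that rewrite in Python. It assumes a single LIMIT clause at the end of the statement whose two placeholders are the last two bind values, and it refuses anything else.

```python
import re

_LIMIT = re.compile(r"\bLIMIT\s+\?\s*,\s*\?", re.IGNORECASE)

def portable_limit(sql, params):
    """Rewrite a trailing 'LIMIT ?, ?' (offset, count) as 'LIMIT ? OFFSET ?' (count, offset)."""
    new_sql, n = _LIMIT.subn("LIMIT ? OFFSET ?", sql)
    if n == 0:
        return sql, tuple(params)  # nothing to rewrite
    if n > 1 or not new_sql.rstrip(" ;").endswith("LIMIT ? OFFSET ?"):
        raise ValueError("only a single trailing LIMIT ?, ? is handled")
    *head, offset, count = params
    return new_sql, (*head, count, offset)

sql, params = portable_limit(
    "SELECT id FROM users WHERE active = ? LIMIT ?, ?", (1, 20, 10)
)
print(sql)
print(params)
```

The rewritten statement and swapped parameters can then be sent to either database, since both accept the LIMIT … OFFSET … form.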

Don’t forget to:

  • ensure string lengths don’t exceed your schema column widths, and that Postgresql char() space-padding isn’t an issue for your application code
  • verify how Postgresql dates use timezones
  • run EXPLAIN on each of your statements. 🙂

Perl CGI::Session Notes

pgloader migrates the MySQL CGI::Session table using a Postgresql text column. This works better:

alter table sessions alter a_session type bytea using a_session::bytea;

postgresql.org: Don’t Do This

Posted in Open Source, Postgresql, Tech | Leave a comment

PSA: Intel and AMD Security Bugs and the DBA

CNN.com homepage featuring Meltdown and Spectre
Also affects Linux servers, which power the Cloud.

There are at least 5 problems related to the ongoing, serious Meltdown and Spectre CPU security bugs (AWS announcement) that impact the Database Administrator (DBA):

  1. in shared environments, like AWS or VMs, neighbour VMs can read your data on unpatched systems. A privacy solution is to provision the entire server to yourself. In AWS terminology, that’s a dedicated server. It costs 1% more per hour and only certain instance types can be provisioned.
  2. forthcoming patches might work, or not. Complex security patches often don’t address the issue on Day One, so there will be a sequence of related patches (whack-a-mole, like Shellshock) that will affect database uptime and cache performance. AWS has revised the related announcement page more than 12 times in 2018. Say good-bye to your 400-day uptimes!
  3. the patches are reported to consume more memory and reduce benchmark performance by 33% on Linux 4.2.0 on Intel processors. If your database server is configured, like with MySQL’s innodb_buffer_pool_size, to use 90% of RAM you should consider 80% or 75% to avoid OOMs.
  4. in AWS, significant clock skew has been reported, so add that to your monitoring.
  5. there are Javascript exploits to read your notebook. That means if you connect to a remote database server with a database client or monitoring program from your notebook, your credentials can be read/changed. So keep your notebook OS and browser(s) up-to-date.

Note: innodb_buffer_pool_size can be set dynamically in MySQL 5.7 with some caveats:

SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;  -- 4 GB; SET does not accept K/M/G suffixes, but it does accept expressions
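For sizing, here is a rough Python sketch that turns “75% of RAM” into a byte value for that statement. It assumes the MySQL 5.7 defaults of a 128 MB innodb_buffer_pool_chunk_size and 8 buffer pool instances, and it rounds down to a multiple of chunk size times instances, since the server otherwise adjusts the value itself. The 64 GiB server is hypothetical.

```python
def buffer_pool_bytes(ram_bytes, fraction=0.75, chunk_bytes=128 * 1024**2, instances=8):
    """Buffer pool size: `fraction` of RAM, rounded down to a multiple of chunk * instances."""
    unit = chunk_bytes * instances
    return max(unit, int(ram_bytes * fraction) // unit * unit)

ram = 64 * 1024**3  # hypothetical 64 GiB server
size = buffer_pool_bytes(ram)
print(f"SET GLOBAL innodb_buffer_pool_size = {size};")
```

Check the actual chunk size and instance count on your server before relying on the rounding, as both are configurable.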

The above applies doubly to server consolidation and microservices in VMs.

Of course, if you’re an experienced production DBA, then you never trusted VMs anyway. 🙂

Some numbers from Redhat (paywalled):

> Measureable: 8-12% – Highly cached random memory, with buffered I/O, OLTP database workloads, and benchmarks with high kernel-to-user space transitions are impacted between 8-12%. Examples include Oracle OLTP (tpm), MariaDB (sysbench), Postgres (pgbench), netperf (< 256 byte), fio (random IO to NvME).

> Modest: 3-7% – Database analytics, Decision Support System (DSS), and Java VMs are impacted less than the “Measureable” category. These applications may have significant sequential disk or network traffic, but kernel/device drivers are able to aggregate requests to moderate level of kernel-to-user transitions. Examples include SPECjbb2005 w/ucode and SQLserver, and MongoDB.

Redis: Meltdown fix impact on Redis performances in virtualized environments
Cassandra: Meltdown/Spectre Linux patch – Performance impact on Cassandra?

I’ll leave it to others to pontificate on what it means when you can’t trust any desktop, server or mobile computer in an Internet-connected world. Or what HIPAA compliance means in the cloud where your server is a party-line telephone.

forums.aws.amazon.com: Degraded performance after forced reboot due to AWS instance maintenance , HN
ARM: Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism
Escaping Docker container using waitid() – CVE-2017-5123
theregister.co.uk: Azure VMs borked following Meltdown patch, er, meltdown
CPU hardware vulnerable to side-channel attacks (Replace CPU hardware), HN (I called this in advance, but there needs to be two steps: re-design CPUs in 2018 if there’s no possible microcode update, then replace them in 2019)
blog.appoptics.com: Visualizing Meltdown on AWS
Intel alerted computer makers to chip flaws on Nov 29 – new claim – Total coincidence: That’s the same day Chipzilla’s CEO sold off his shares
zdnet.com: Researchers discover seven new Meltdown and Spectre attacks HN discussion
phoronix.com: Bisected: The Unfortunate Reason Linux 4.20 Is Running Slower HN
aws.amazon.com: Processor Speculative Execution Research Disclosure
forums.aws.amazon.com: Spectre/Meltdown Vulnerabilities – AWS please clarify
Potentially disastrous Rowhammer bitflips can bypass ECC protections HN
Google Says Spectre And Meltdown Are Too Difficult To Fix
Intel VISA Exploit Gives Access to Computer’s Entire Data, Researchers Show
Intel CPUs impacted by new Zombieload side-channel attack
twitter.com: Mitigations reduce performance by 25% HN
Amazon Linux AMI Security Advisory: ALAS-2019-1205

Keywords: Spectre, Specter, Meltdown, Foreshadow, Zombieload, Rowhammer, Microarchitectural Store Buffer Data Sampling (MSBDS)

Posted in Microservices, MySQL, MySQL Cluster, Tech | Leave a comment

Part 1: Migration Notes from MySQL to Postgresql Using pgloader

Recently I migrated a small but non-trivial (25 tables, about 500 columns, no triggers or SPs) MySQL schema to Postgresql using the Open Source pgloader utility.

pgloader supports migration from several databases/formats (MySQL, Sqlite, MS SQL, dBase, CSV) to Postgresql.

pgloader is fast: it took about 20 seconds to migrate/load both the schema and data into Postgresql 9.2 on my Mac notebook running in Virtualbox. The total number of rows was about 3 million narrow rows.

The impressive speed of pgloader lets you iterate quickly when writing pgloader rules, and lets you embed the loading process into automated scripts without any concern about performance.

Executive Summary

pgloader is a free scriptable ETL tool that lets you quickly migrate database schemas and data to Postgresql. It is helpful for setting up a PoC to estimate the work required for a full migration analysis. (It does not do SQL syntax or application code migration.) Although pgloader is excellent at what it does, any database migration is a major project.

Installation on CentOS 7

yum install freetds postgresql
createdb mydb
# unpack and run the build script from github or use the Docker image
make pgloader
pgloader mysql://root@localhost/mydb postgresql:///mydb

Installation on Debian or Ubuntu

apt-get install pgloader postgresql
createdb mydb
pgloader mysql://root@localhost/mydb postgresql:///mydb

pgloader pluses:

– free
– fast
– 100% scriptable and customizable with pgloader rules (called a “load file”)
– reasonable result for a first pass even without custom pgloader rules
– “pro DBA” feeling – accepts configuration file, and emits log files, similar to the behavior of Oracle’s SQL*Loader or the Sybase/SQL Server bcp utility.

pgloader minuses:

– not included with Postgres
– unusual in that it is written in Lisp, but that is not user-visible
– doesn’t create Postgresql users/roles (to be expected as the SQL standard doesn’t specify these, thus they vary greatly across databases)
– doesn’t convert MySQL partitions to Postgresql inheritance-syntax partitions in 8.x or 9.x, but should work with declarative-syntax partitions in 10.x+
– detects and double-quotes reserved object names, but doesn’t notify of schema name conflicts with the Postgresql public namespace ie. table name ‘user’
– will require time-consuming schema cleanup for most use cases. See below.

My pgloader project files are available for download from my github.

Schema/Data Cleanup Notes

  • MySQL timestamps do not display the tz, but by default pg shows … “+00” unless you specify “timestamp without time zone”, or you use EXTRACT() or TO_CHAR(). Read about related pgloader rules here.
  • MySQL text and varchar columns can be migrated with varying results to pg text and bytea types
  • treat the table name ‘user’ as reserved since there is a public.user symbol that overrides the search path
  • don’t underestimate how long schema cleanup will take. Although pgloader runs quickly, Postgresql does not do casting automatically, so is extremely sensitive to application SQL statements
  • MySQL and Postgresql have different models for representing users/roles and timezones that need to be dealt with sooner than later. Here is some advice on timezone setting: Adding timezone to naive datetime fields from MySQL #331
  • migrating applications from MySQL to Postgresql is easier with Postgresql 9.5 since it has INSERT … ON CONFLICT DO UPDATE (UPSERT.)
  • index names are table-specific in MySQL, but schema-wide in Postgresql, so by default pgloader names them idx_NNNNN_name to uniqify them. If you need to use named indexes, like with ON CONFLICT ON CONSTRAINT in 9.5, then you either need to uniqify the index names in MySQL and do pgloader --with "preserve index names", or add a pgloader rule to do CREATE INDEX your_index_name.

My Migration Results

After renaming the table ‘user’ to ‘users’ and altering the sessions table (see above), around half of the SELECT queries worked as-is and I could login to the application and click around. 🙂

However, virtually all of the INSERT and UPDATE statements had to be rewritten, taking 2 man days to get to an alpha version and 2 weeks for something worth doing formal QA.

Database Migration Alternatives

Amazon AWS provides 3 powerful data migration tools under the AWS Database Migration Service banner that are either free to use, or have a 6 month trial.

Click here to read Part 2.

pgloader: Homepage, github, Manual, Licence
postgresql.org: pg_dump
http://rhaas.blogspot.com: The State of VACUUM
severalnines.com: Upgrading Your Database to PostgreSQL Version 10 – What You Should Know

Posted in MySQL, Open Source, Postgresql, Tech | Leave a comment

PSA: Upgrade Early Macbook Pro Notebooks Now

Macbook Pro 2009/2010 notebooks came pre-installed with Snow Leopard (10.6) Mac OS X.

As of Dec. 2017, very little popular software will work or update on anything older than Mavericks (10.9):

  • Several popular chat programs no longer work – Skype app, GotoMeeting, Google Hangouts, Highfive
  • No major browsers are supported
  • Even Apple updates require 10.8 or higher – for now.

Before you update Mac OS X, note that many users have complained of post-update issues including:

  • failure to boot and very slow operation on hard disks. It is possible to downgrade later, but there’s no guarantee that your data will be ok. So backup your files first!
  • new, empty keychain folder. Delete it and reboot. (You probably will have to re-enter all of your wifi, browser and application passwords.)
  • Xcode will have to be upgraded. Virtualbox will have to be upgraded to open previous VMs, and if you use Dia, read this to fix it
  • if your Mac was not purchased in your name, apps like iMovie will not be updatable from the App Store. (iPhoto 2011 will not work at all on High Sierra since it has been deprecated. Your photos will be migrated into the iCloud and accessed with the new Apple Photos app.)
  • Restart and Shutdown takes a long time or hangs. If I need to restart with High Sierra, I close the apps manually and hold the power button for 6 seconds to force a power off.
  • Resume (open lid to resume) is flaky. Always slow to resume, sometimes restarts.

The best way to update Mac OS X is to use the Apple softwareupdate command line interface (CLI) tool:

  1. Confirm you have 2 GB RAM and at least 8 GB free disk space
  2. plug your charger in and connect to a reliable WiFi hotspot, or find a tutorial on creating a USB update media
  3. Close all open applications, including Textedit.app files
  4. Backup your files and all passwords! If the in-place update fails, you may have to wipe OS X and do a fresh install, and the install will likely bork your keychain file.
  5. open Terminal.app and run these commands to find out what updates are available, and then apply the updates:
    1. sudo softwareupdate -l (copy and paste the updates list to a file and save it for later reference)
    2. sudo softwareupdate -ia
$ sudo softwareupdate -l
Software Update Tool

Finding available software
Software Update found the following new or updated software:
   * macOS High Sierra 10.13.2 Supplemental Update- 
	macOS High Sierra 10.13.2 Supplemental Update ( ), 138293K [recommended] [restart]

Otherwise you need to register with the App Store using a credit card to use Software Update. Expect 10 or more reboots/logins and about 4 hours total time if everything goes smoothly. The final reboot will actually install the new software – it will take about 41 minutes on a hard drive, or 20 minutes on an SSD.


  • if you want to keep all your windows open as long as possible, use “Apple … Restart” instead of the installer restart button until the installer says “Ready to Install”
  • Command+L on the installer dialog window will show the installer log.


  • If the install window closes and you want to retry running the installer, open the Applications folder in Finder and click on “Install macOS High Sierra”.
  • After the installer is downloaded you will not be able to see the update label name in sudo softwareupdate -l
  • An error dialog saying “The recovery server cannot be contacted.” means a network connection error or clock setting error that causes SSL certificate problems
  • if the machine becomes unbootable, power cycle while holding Command+R to restore the old OS X, then try again while following a recovery tutorial on another machine.

apple.com: High Sierra macOS freezing and stops, HN
W: Macintosh operating systems

Posted in Tech | Leave a comment

WordPress Community Scheduler Interval Bug

I opened WordPress Bug 42866: “WordPress Community Scheduler Interval Bug.”

While writing the RackPing Monitoring widget, I found that scheduled tasks frequently run up to 45 seconds earlier than the specified 15 minute interval. This may have a wide and serious impact on the widget community, especially widgets that do scheduled posts and other visible blog changes.

[ WordPress has commented on my bug report with a detailed explanation. ]

theregister.co.uk: WordPress captcha plugin on 300,000 sites had a sneaky backdoor
wordfence.com: Display Widgets Plugin Includes Malicious Code to Publish Spam on WP Sites

Posted in API Programming, Business, Cloud, MySQL, Open Source, REST API Programming, Tech | Leave a comment

How to Autoscale AWS RDS Read Slaves

I don’t see a lot of links on autoscaling AWS RDS read slaves.

Is the reason:

  • AWS RDS users are simply content with relying on features as they arrive?
  • users could just be upsizing RDS masters or moving to Aurora, etc. as load increases?
  • only a small percentage of businesses have big data issues?
  • software developers at other companies religiously run EXPLAIN? 🙂

Below are some insightful comments and some projects that have implemented this.

Jeff Barr, AWS [2010]

“It should be fairly easy to create a data-driven auto scaling database cluster using Read Replicas and CloudWatch metrics.
At the low end, this cluster would consist of a Small DB Instance running in one Availability Zone with 5 GB of storage.

At the high end it would consist of a High Memory Quadruple Extra Large Multi-AZ deployment (primary/secondary pair) DB Instance with 1 TB of storage and 5 associated Read Replicas. That’s quite a range!”

harish11g.com: Load Balancing Amazon RDS Read Replica’s using HAProxy
github.com: ReDS – ReActive Database System – Terraform – Auto-Scale for RDS Instances

Other Resources

docs.aws.amazon.com: Replication with a MySQL or MariaDB Instance Running External to Amazon RDS

Posted in Cloud, Linux, MySQL, Open Source, Oracle, Tech | Leave a comment

Database Blob Normal Form – Blob NF

I propose a new normal form for databases, Blob Normal Form (Blob NF.)

Definition: A database is in Blob Normal Form (Blob NF) when all fields containing binary data or long arbitrary text are moved to tables consisting of a primary key and a blob column.

Good candidates for Blob NF are images and documents where database constraints are not applicable.

Motivation: Blob Normal Form solves physical and operational problems involving storage, performance, retention and character set assignment as blobs have wildly different characteristics from typical database values. A blob generally cannot act as a real primary key, so it can almost always be moved to a different table by itself (possible exceptions would be unique constraints, rarely used for blobs.)
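A minimal sketch of what a schema in Blob NF could look like, using Python’s sqlite3 module so it can be run as-is. The table and column names (documents, document_blobs, body) are hypothetical; the point is only that the blob sits in its own table next to a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    id       INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    uploaded TEXT NOT NULL
);
-- Blob NF: the blob lives alone with its key.
CREATE TABLE document_blobs (
    document_id INTEGER PRIMARY KEY REFERENCES documents(id),
    body        BLOB NOT NULL
);
""")
conn.execute("INSERT INTO documents VALUES (1, 'Q1 report', '2018-01-15')")
conn.execute("INSERT INTO document_blobs VALUES (1, ?)", (b"%PDF-1.4 ...",))

# Listing queries never touch the blob table.
titles = conn.execute("SELECT title FROM documents ORDER BY id").fetchall()
# The blob is fetched only when a single document is opened.
body = conn.execute(
    "SELECT body FROM document_blobs WHERE document_id = ?", (1,)
).fetchone()[0]
print(titles, len(body))
```

With this split, the blob table can be given its own storage, retention and backup policy without affecting the narrow metadata table.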

Comment: Since Blob NF is not strictly a relational algebra database form, it should not have a numeric name as 1NF to 6NF have.

Relationary: The Art of Database Normalization
W: Sixth normal form

Posted in MySQL, Open Source, Tech | Leave a comment

Internet Latency and Multi-Master Database Transactions

There are 2 common misconceptions in engineering West Coast – East Coast data centers:

  1. that packets travel at the speed of light
  2. that database masters can be located anywhere (ie. far apart.)

What happens when we look at the actual latency numbers with ecommerce/advertising applications in mind?

Cross-USA Internet Packet Latency (One-Way)

Figure 1: SF-NY (4,700 km geographic distance)

Transmission Method | End-End Speed | Time SF-NY | Note
Light in vacuum | 299,792 km/s | 16 ms | similar speed in air
Microwave repeaters in air | 235,000 km/s | 20 ms | Repeaters every 48 km (actually built in 1950s in both USA and Canada! Currently HFT applications use 15+ microwave routes from Chicago to New York.)
Light in silica fiber (theoretical) | 204,081 km/s | 22 ms | Index of refraction is 1.45
Oceanic cable for comparison | 156,666 km/s | 30 ms | Including amplification and switching
Google Routing in silica fiber | 150,000 km/s | 31 ms | Extrapolated from The Dalles to Ashburn (4,350 km) at 29 ms
AT&T Routing in silica fiber | 150,000 km/s | 31 ms |
Public Internet Packets in silica fiber | 137,000 km/s | 35 – 36 ms | Public Internet already using MPLS

From Figure 1 above, we see that light can travel from SF to NY in 16 ms, yet the public Internet averages 35 ms. That’s 2.2x longer than expected if packets are expected to travel at the speed of light in a vacuum. So packets don’t travel at the speed of light.
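The arithmetic behind Figure 1 is just distance divided by propagation speed. The Python sketch below recomputes a few rows from the 4,700 km distance and the speeds listed in the figure; the function names are my own, and the fiber row is derived from the stated index of refraction rather than copied from the figure.

```python
DISTANCE_KM = 4700  # SF-NY distance used in Figure 1
C_VACUUM_KM_S = 299_792  # speed of light in vacuum

def one_way_ms(distance_km, speed_km_s):
    """One-way propagation delay in milliseconds."""
    return distance_km / speed_km_s * 1000

def fiber_speed_km_s(refractive_index):
    """Speed of light in a medium is c divided by its index of refraction."""
    return C_VACUUM_KM_S / refractive_index

for label, speed in [
    ("vacuum", C_VACUUM_KM_S),
    ("microwave", 235_000),
    ("fiber, n=1.45", fiber_speed_km_s(1.45)),
    ("Google/AT&T routing", 150_000),
]:
    print(f"{label:22s} {one_way_ms(DISTANCE_KM, speed):5.1f} ms")
```

The results land within about a millisecond of the rounded values in the figure. Remember that these are one-way numbers, so a request and its reply cost at least twice as much.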

Now that we’ve described the latency limits numerically, there are some very interesting things to investigate:

  1. A serious enterprise could construct microwave towers across the USA again for latency-sensitive traffic with 20 ms latency in good weather, with fiber backup. (“It’s better to be fast 99% of the time than slow 99.999% of the time” – mckay-brothers.com has done that between Chicago and NY for HFT. 🙂 )
  2. If SF – NY is too ambitious (after all, SF is earthquake-prone) “pinch” the west and east-most locations by using a central region. (See below.)

Figure 2: Instead of SF-NY (~31 ms today) Data Center locations, “pinch” the network topology of the synchronous master database pair to LAS or SLC and ATL or ORD (~20 ms today). (Map of USA Population Centers According to Major Airport Traffic)

Figure 3: Another interesting topology, using near speed-of-light microwave links from the East-most synchronous master database Chicago to NY (8.5 ms). Instead of spending a few billion dollars on a nation-wide microwave chain, one of the 15+ existing microwave providers in Chicago can be leveraged for 1,300 km for low-bandwidth transaction traffic.

Wide-Area Multi-Master Database Transactions

So how does that help us with multi-master database latency?

  1. for 2-phase/sync commit, 31-35 ms for a medium to high volume of OLTP transactions isn’t workable, especially over the Public Internet. But 17-20 ms of reliable latency is fundamentally different. (10 ms is the same as public Internet latency from San Jose to Las Vegas!) An optimized ecommerce store application would work with a reliable latency near 20 ms. (Confirmed with Percona Consulting.)
  2. if that’s not workable, think beyond 2-phase commit. Lamport/vector clock algorithms have been available since 1988, and have been implemented in Voldemort and since 2018 in Redis (so you can delegate database session handling, etc. to Redis if you need cross-DC availability.) Cassandra uses last-write wins and is DC-aware. Use NTP/GPS/optimization like Google Spanner does.
  3. #1 can be modified by “pinching” the location of the database masters. Instead of thinking SF and NY, locate the masters in Las Vegas or SLC and Atlanta or ORD with read-slaves in SJC and Ashburn as required.
  4. Google and AT&T have virtually unlimited CONUS fiber, meaning unlimited bandwidth and known reliability around 31 ms. A new algorithm can be built according to those constraints. Think git, but for database transactions.
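For readers who have not met vector clocks (point 2 above), here is a minimal, generic Python sketch of the idea. It is not how Voldemort, Redis or any other product implements them, and the node names las and atl are hypothetical masters. Each replica keeps a counter per node; comparing two clocks tells you whether one update happened before the other or whether they were concurrent and need conflict resolution.

```python
def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one local event."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def merge(a, b):
    """Element-wise maximum: the clock after one replica learns of the other's history."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent' for clocks a and b."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"

# Two masters update the same row without coordinating.
base = increment({}, "las")
west = increment(base, "las")   # las has seen two events
east = increment(base, "atl")   # atl has seen one event on top of base
print(compare(base, west))      # before
print(compare(west, east))      # concurrent: the application must resolve the conflict
print(merge(west, east))
```

No cross-country round trip is needed at write time; the cost moves to detecting and resolving the concurrent cases afterwards.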

What Does a Reliable Network Mean?

Reliable for wide-area multi-master database transactions means:

  1. almost always partition-free – 5x9s or more during most-active shopping times (Google is emphasizing partition-free in their networks, as it’s far easier than reducing latency and more predictable overall)
  2. zero packet loss
  3. maintenance windows known in advance
  4. good enough for your DBA Team to say “Yes, we can support this.”

At this time, that requires a dedicated network, either yours or a cloud provider (Spanner with SQL has been available since 2017.)

What is the Low-Hanging Fruit?

From lowest-cost to highest-cost for making database transactions WAN-safe:

  1. wiki exercise – document for your business applications:
    1. how internal and external SLAs are defined
    2. what applications connect to the databases (what options are used, are they persistent and how many round-trips result)
    3. how many database round-trips are needed per page
    4. how sessions and session failover works
    5. what percentage of writes vs. reads are made
    6. are the transactions as thin as possible using row self-updates and removing read-before-write cases aka race conditions
    7. how it all should really work in edge cases (network partitions and slowdowns, etc.)
    8. what can be cached with Redis/Elasticache, Memcached, DynamoDB, etc?
  2. data reduction/archiving (just active OLTP rows, please)
  3. use transaction group commit
  4. pinching west-most and east-most locations closer together. ie. put one master in a central location. See Figures 2 and 3 above.
  5. algorithms like vector clocks, or newer/better
  6. reducing latency on existing routes (MPLS, direct optical routes)
  7. building new private CONUS/Gulf of Mexico fiber route.

In my experience, most organizations never even get to step #1 above. 🙂

Fortunately, there is a half-measure: multi-AZ with AWS uses different data centers in the same region with only 1-2 ms inter-DC latencies. James Hamilton from AWS calls using small data centers in the same region “limiting the blast radius.”

The Speed of Light – Depends on the Medium

The speed of light in a vacuum is 299,792,458 meters per second, or 186,282 miles per second. In any other medium, though, it’s generally a lot slower. In normal optical fibers (silica glass), light travels a full 31% slower.

Exercises for the Reader

  • Fill in the wiki outline above.
  • What regions does my cloud provider support?
  • What is the lowest inter-master latency that can be provisioned?
  • How many TPS does my database do that is directly ecommerce-related (not DW or logging)?

The World’s First West-East MySQL Multi-master Cluster

Yahoo paid MySQL AB about $40,000 for the first replication feature (statement-based) to use on their leased fiber. Because MySQL classic replication is asynchronous, latency is not a big issue for most operations as long as the total throughput is adequate.

Google Spanner

Google has built a database called Spanner that corresponds to what’s discussed in this blog post. SQL was added in 2017.

Please leave a comment!

Please leave a comment (no registration required) if you have any experience implementing similar topologies, or have suggestions or corrections.


How Google Does It

cloudplatform.googleblog.com: With Multi-Region support in Cloud Spanner, have your cake and eat it too
Google Public NTP

Microwave WAN Transmission

The secret world of microwave networks
The Abandoned Microwave Towers That Once Linked the US
Trans Canada Microwave
mckay-brothers.com: Microwave Bandwidth at Extreme Low Latency
109 Microwave Towers Bring the Internet to Remote Alaska Villages

Fiber Optic

Calculating Optical Fiber Latency
$1.5 billion: The cost of cutting London-Tokyo latency by 60ms
Researchers create fiber network that operates at 99.7% speed of light, smashes speed and latency records (fiber optic waveguide)

Public Internet Latency Measurements

SO: How much network latency is “typical” for east – west coast USA?
AWS Inter-Region Latency

Research and Other News

netflix: Active-Active for Multi-Regional Resiliency
Network latency – how low can you go?
W: Multiprotocol Label Switching (MPLS)
Latency: The New Web Performance Bottleneck
developer.apple: Networking Concepts
hpbn.co: Primer on Latency and Bandwidth
Network performance: Links between latency, throughput and packet loss
Turning the Optical Fiber Network into a Giant Earthquake Sensor
fgiesen.wordpress.com: Network latencies and speed of light
Einstein, Poincaré & Modernity: a Conversation

Posted in Business, Cassandra, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

The Awesome Cloudify Orchestration Tool Roundup Slides

Cloudify and Gigaspaces wrote a really useful slide deck titled, “Orchestration Tool Roundup – Docker Swarm vs. Kubernetes, TerraForm vs. TOSCA/Cloudify vs. Heat” dated June 11, 2015.

There are 3 things that make the slides awesome:

  1. they start with a sample reference architecture of a node.js/Mongo deployment
  2. then they show how to manage that architecture using several orchestration tools
  3. and they include the scripts to do so.

So it’s really an evaluation framework for you to add new tools, or customize for your environment. 🙂

Adding Cloudformation Provisioning

To add AWS Cloudformation (CF) after Terraform: both are simple enough that you can repeat the TF slides and change “Terraform” to “Cloudformation”, with the caveat that CF is AWS-specific.

Cloudformation scripts are written as JSON or YAML templates and can be executed using the AWS CLI or API or the Management Console UI, and the resulting cluster definition is called a “stack.”

Cloudformation itself is free to use, thus you only pay for the underlying AWS resources consumed. Conceptually for Chef and Puppet users, a CF template just saves you from calling the AWS API multiple times.
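To make the “template” idea concrete, here is a minimal sketch that builds a one-resource template as a Python dict and serializes it to JSON. The logical name ExampleBucket and the description text are hypothetical; this is not taken from the slide deck.

```python
import json

# Minimal template with one resource.
# "ExampleBucket" is a hypothetical logical name chosen for illustration.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal example stack",
    "Resources": {
        "ExampleBucket": {"Type": "AWS::S3::Bucket"},
    },
}


def to_template_json(tpl):
    """Serialize the template dict to JSON text."""
    return json.dumps(tpl, indent=2)


print(to_template_json(template))
```

Saved to a file, output like this could be passed to the AWS CLI (for example with aws cloudformation create-stack and a --template-body file:// argument) to create the stack in one call.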

Reference Architecture for Orchestration Tool Discussion

Posted in Business, Cloud, Linux, Open Source, REST API Programming, Tech | Leave a comment

Linux zero filesystem bash script

This is a short linux bash script I wrote that’s adequate for zero’ing a filesystem for on-premise server storage.

Note that the only “secure erase” is to destroy the physical drive because of various internal disk-level caches. 😐

To do multiple passes (in this case 10 passes):

# seq 1 10 | xargs -n 1 ./zero.sh

#!/bin/bash
# Program: zero.sh
# Purpose: write one pass of zeros in free space on current volume.
# Date: 2017 10 28
# Author: James Briggs, USA
# Usage: cd "filesystem"; screen ./zero.sh
# Env: CentOS 6,7 bash
# Notes:
#   This program is write-intensive, so not recommended for SSD or Flash drives unless disk wear is ok
#   This is not a secure erase, but is ok for on-premise storage.

# settings: the original values are missing from this listing, so these are
# assumed defaults (sz=1G matches the sample log below)
n=1             # chunk counter
sz=1G           # chunk size, passed to dd as bs=
delay=1         # seconds to pause between chunks
outfile="zero"  # chunk filename prefix
err=""          # set when dd fails, normally because the volume is full

# cleanup after previous runs
/bin/rm -f *.zero.bin "$0.out"

# while [ "$n" -le "4" ]; do
while true; do
   /bin/echo "$0: writing chunk #$n of $sz ..."
   /bin/dd if=/dev/zero of="$outfile-$n.zero.bin" oflag=nocache bs=$sz count=1 conv=fsync || err="yes"
   /bin/echo "$0: syncing chunk #$n of $sz ..."
   /bin/sync

   # stop once dd can no longer write a chunk
   [ -n "$err" ] && break

   sleep $delay
   let n++
done

/bin/echo "$0: deleting $n chunks of $sz ..."
/bin/rm *.zero.bin

/bin/echo "$0: syncing filesystem after chunk deletions ..."
/bin/sync

msg="done zeroing current volume with $n chunks of $sz."
/bin/echo "$0: $msg"
/bin/echo "$0: `/bin/date` $msg" > "$0.out"

exit 0

A one-line log file is written:

# cat zero.sh.out
./zero.sh: Sun Oct 29 04:16:30 UTC 2017 done zeroing current volume with 12 chunks of 1G.

A more complicated utility:

github: redeemer

Posted in Linux, Open Source, Tech | Leave a comment

Redis and CentOS 7

Redis is a feature-packed cache that’s easy to install and work with. Here are some notes on installing and using Redis with CentOS 7.

Install the redis package from your configured yum repos:

# yum install redis
# systemctl enable redis
# systemctl start redis

By default, Redis is configured at installation to bind to localhost with no password in /etc/redis.conf.

See the status of the redis and redis-sentinel services:

# systemctl list-unit-files 'redis*'
redis-sentinel.service disabled
redis.service enabled
2 unit files listed.

Setup for programming with some Perl modules:

# cpan Redis Redis::Fast Redis::List

Perl code snippet for a web page performance footer:

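use strict;
use warnings;
use diagnostics;

use Redis::Fast;

my $r_cache = cache_info();

my $cache_info = '';

# build the footer on one line from the Redis INFO fields
$cache_info = "Redis $r_cache->{'redis_version'} "
   . "up $r_cache->{'uptime_in_seconds'} s "
   . "$r_cache->{'keyspace_hits'}h/$r_cache->{'keyspace_misses'}m"
   . "/$r_cache->{'expired_keys'}e last write: "
   . "$r_cache->{'aof_last_write_status'}" if defined $r_cache;

print $cache_info;

# return the Redis INFO fields as a hash ref, or undef if Redis is unreachable
sub cache_info {
   # trap connection errors so the page still renders without the footer
   my $h = eval { Redis::Fast->new };

   if ($h) {
      my $r_hash = $h->info();
      return $r_hash;
   }

   return undef;
}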

will output a useful message like:

Redis 3.2.10 up 414908 s 1146h/301m/141e last write: ok
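
The hit and miss counters in that footer are enough to derive a cache hit ratio. A minimal sketch in Python, using the numbers from the sample line above (the function name is my own):

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from cache; 0.0 if there were none."""
    total = hits + misses
    return hits / total if total else 0.0


# figures from the sample footer: 1146 hits, 301 misses
print(f"{hit_ratio(1146, 301):.1%}")
```
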
Posted in API Programming, Open Source, Perl, Storage, Tech | Leave a comment

Amazon Waterproof Kindle Oasis

Interesting news that Amazon is selling a waterproof Kindle, and that previously Jeff Bezos used Kindles in a one-gallon Ziploc bag in the bathtub.

I worked on the Cloud backend for the discontinued Ricoh eQuill waterproof business tablet. There are myriad uses for a waterproof work tablet:

  • hotel room staff maid and minibar reporting
  • fairground employee and mechanical supervision notes
  • healthcare biological samples recording and hospital charts
  • since there’s no resale market for a serial-numbered industrial tablet, theft is reduced.

Ricoh eWriter Tablet

Ricoh Waterproof Ewriter Tablet (2011)

Amazon finally made a waterproof Kindle

W: International Protection Marking Code

Disclaimer: I have worked on tablet Cloud software for Amazon, Apple and Ricoh.

Ricoh Introduces the eWriter Solution ─ New Business-class Tablet and Back-end Services Improve Business Efficiencies by Moving Paper Processes Online
The Ricoh eQuill is an e-ink tablet for businesses
Ricoh Targets Channel With Tablet, Digital Workflow Services

Posted in Tablets, Tech, Toys | 1 Comment

Safety-critical Realtime with Linux

LWN has a nice article on “Safety-critical realtime with Linux.”

The intro has a good overview of real-time concepts, and the rest of the article has a summary of Linux implementation techniques.

(RTLinux is not mentioned. The company was bought by Wind River Systems in 2007, which has dropped commercial support.)

W: RTLinux, RTAI
linuxfoundation.org: Preempt RT

Posted in Linux, Open Source, Tech | Leave a comment

Fix for Apache httpd Update Causing WordPress Redirect Loop

The recent CentOS 6 Apache httpd 2.2 server update changes the environment variable HTTP_HOST from ‘domain’ to ‘domain:port’, causing cookie domain match and redirect issues with web apps, especially in a reverse proxy setup.

Sep 23 23:55:25 Updated: httpd-2.2.15-60.el6.centos.5.x86_64

The fix for WordPress in my setup with a httpd front-end and a httpd_php backend (reverse proxy) is:


-      if ( ! $requested_url && isset( $_SERVER['HTTP_HOST'] ) ) {
+      if ( ! $requested_url && isset( $_SERVER['SERVER_NAME'] ) ) {
               // build the URL in the address bar
               $requested_url  = is_ssl() ? 'https://' : 'http://';
-              $requested_url .= $_SERVER['HTTP_HOST'];
+              $requested_url .= $_SERVER['SERVER_NAME'];
               $requested_url .= $_SERVER['REQUEST_URI'];

To fix similar redirect problems in other Open Source web products, just:

  1. backup your files
  2. add a print statement for the domain/cookie handling variables to a file and inspect what is happening
  3. either use SERVER_NAME instead of HTTP_HOST, or use a regex to clean the HTTP_HOST value like s/(:\d+)$//;
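
The regex cleanup in step 3 translates directly to other languages. A minimal sketch in Python, where strip_port is a hypothetical helper name:

```python
import re


def strip_port(http_host):
    """Remove a trailing :port from a Host header value, if present."""
    return re.sub(r":\d+$", "", http_host)


print(strip_port("example.com:8080"))
print(strip_port("example.com"))
```

Values without a port pass through unchanged, so the cleanup is safe to apply unconditionally.
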
Posted in Business, Open Source, Tech | Leave a comment

Domain News Links

Commonly-used country Top Level Domains (TLDs) are often mismanaged.

One of the most popular TLDs is .io, the country code assigned to the British Indian Ocean Territory (the Chagos Islands), hence the .io country domain.

theregister.co.uk: Bloke takes over every .io domain by snapping up crucial name servers
.io name servers down
theregister.co.uk: Telco forgot to renew its web domain, broke deaf folks’ video calls – now gets a $3m paddlin’
hackernoon.com: Stop using .IO Domain Names for Production Traffic HN
theregister.co.uk: Question mark hangs over trendy tech startup domains as UN condemns British empire hangover HN

Posted in Tech | Leave a comment

PSA: Workarounds for CentOS 7.3 Problems with SE Linux 2.5

PSA: If you update to CentOS 7.3 and see odd console or log errors like this resulting in a hung boot:

Failed to start Import network configuration from initramfs


work still pending
FAILED Failed to start Login Service.
See 'systemctl status systemd-logind.service' for details.
FAILED Failed to start Authorization Manager.
See 'systemctl status polkit.service' for details.
DEPEND Dependency failed for Dynamic System Tuning Daemon.

then you have a problem with CentOS 7.3 and SE Linux 2.5.

The workarounds are surprisingly simple:

  1. don’t upgrade anything until CentOS 7.4 is released (verified by me on Sept. 17 on a Dell 1950) or
  2. add selinux=0 to the kernel boot line in the boot menu and/or grub.

ask.fedoraproject.org: Startup fails with multiple errors

SSH: Unable to get valid context

Keywords: Virtualbox, linux, selinux, enforcing.

Posted in Linux, Tech | Leave a comment

Solr Meetup at Cloudera in Palo Alto

Cloudera hosted another Solr Meetup at their office in Palo Alto. About 30 people attended.

Two software engineers from Cloudera did presentations tonight:

1) Michael Sun talked about his nightly Solr microbenchmark and cluster benchmarks.

Solr Nightly Benchmarks (SOLR-10317)

2) Mano Kovacs talked about Solr LIR (leader-initiated recovery) troubleshooting techniques.

Nice t-shirt: “Data is the New Bacon.” 🙂

Leader incorrectly publishes state for replica when it puts replica into LIR (SOLR-9555)

He also talked about limitations and issues of autoAddReplicas.

Recommends PlantUML for documentation.

Audience Questions

– PeerSync – checking 100 document versions may not be enough if there are 1000 total versions

After party at Antonio’s Nuthouse on California Ave.

Thanks to Cloudera for hosting this event and the Mediterranean food!

Cloudera 1001 Page Mill Road Palo Alto, CA 94304

Apache Solr Memory Tuning for Production
techcrunch.com: Algolia raises $53 million for its search engine API

Posted in Cloud, Storage, Tech | Leave a comment

Distributed Systems Laws Applied to Distributed Databases

Avery’s Law of Distributed Systems Reliability: “Distributed systems are more reliable when you can get a service from one node OR another. They get less reliable when a service depends on one node AND another. And the numbers combine multiplicatively, so the more nodes you have, the faster it drops off.”

Lamport’s Observation: “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.”

Both sound simple enough, even obvious for systems serving HTTP(S) or caching.

But apply those to clustered and distributed databases, and you can do powerful analysis on expected availability.

Topo | Nodes | Database | Configuration | Reads during single node failure | Writes without a failure | Writes during single node failure
WAN | 2 | MySQL | Master-Slave Async Replication | OR | Single Master | NA
WAN | 2 | MySQL | Master-Master (standby) Async Replication | OR | OR (Single Active) | OR (Single Active)
LAN | 2 | MySQL | Master-Slave Semi-sync Replication | OR | Single Master | NA
LAN | 2/4 | MySQL | CGE NDB Cluster Sync Replication | OR | OR | NA
LAN | 3 (Min.) | MySQL | Galera Sync Replication | OR | OR | Requires quorum for OR
WAN | 3 (Min.) | Cassandra | Appropriate RF and CL | OR | OR | OR
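
To put numbers on the OR and AND cases, here is a minimal sketch in Python. It assumes every node has the same availability and that failures are independent, which real clusters only approximate. The 99% figure is an assumed example.

```python
def availability_or(node_availability, nodes):
    """Service works if ANY one of the nodes is up."""
    return 1 - (1 - node_availability) ** nodes


def availability_and(node_availability, nodes):
    """Service works only if ALL of the nodes are up."""
    return node_availability ** nodes


a = 0.99  # assumed per-node availability
for n in (1, 2, 3):
    print(n, f"OR={availability_or(a, n):.6f}", f"AND={availability_and(a, n):.6f}")
```

Each added OR node multiplies the remaining downtime by the per-node failure rate, while each added AND node multiplies the uptime by the per-node availability, which is the multiplicative drop-off in Avery’s Law.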

acm.org: Turing Award to Leslie Lamport

Posted in Cassandra, Cloud, Storage, Tech | Leave a comment

Perl, DBI and MySQL utf8mb4 Character Set Support

MySQL’s modern UTF-8 encoding is named utf8mb4 (4 bytes), not utf8 (3 bytes.)

For new applications, especially web, you should start with utf8mb4. For existing applications, you need to decide if an upgrade is worthwhile, and test extensively before a production upgrade.
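
The 3-byte versus 4-byte distinction is easy to see by encoding a few characters. A minimal sketch in Python, where the sample characters are my own choice:

```python
samples = {
    "A": "US-ASCII letter",
    "\u00e9": "Latin letter with accent",
    "\u4e2d": "CJK ideograph",
    "\U0001F600": "emoji (outside the Basic Multilingual Plane)",
}


def utf8_len(ch):
    """Number of bytes needed to encode one character as UTF-8."""
    return len(ch.encode("utf-8"))


for ch, label in samples.items():
    print(f"U+{ord(ch):04X} {label}: {utf8_len(ch)} byte(s)")
```

Anything that needs 4 bytes, such as emoji, cannot be stored in MySQL’s 3-byte utf8 character set.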


  • Note that changing your database character set for production applications should be treated seriously, like the major project that it is.
  • If all your data is in fact currently US-ASCII, then the database migration will be easy since it is a subset of UTF-8. (However your applications may need to do Unicode normalization of strings before insert, for comparisons to work later.)
  • Be careful with converting binary columns to UTF-8, like blobs. The result may be undefined, so test.
  • Since collation is language-specific, the various Unicode collations are almost never the right ones
  • You do need UTF-8 for people and place names, but there’s no reason to use UTF-8 for columns that will always be US-ASCII, like database id’s, IPv6 addresses, etc. MySQL has to allocate more space for UTF-8, so it is a disadvantage when not needed.
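
Two of the points above can be checked quickly in Python: US-ASCII bytes are unchanged under UTF-8, and visually identical strings can compare unequal until they are normalized. This is a sketch using NFC, one of several normalization forms. Which form suits your application is a decision the sketch does not make for you.

```python
import unicodedata

composed = "\u00e9"      # e-acute as one code point
decomposed = "e\u0301"   # e followed by a combining acute accent


def nfc(s):
    """Normalize to NFC so canonically equivalent strings compare equal."""
    return unicodedata.normalize("NFC", s)


print(composed == decomposed)             # raw strings differ
print(nfc(composed) == nfc(decomposed))   # equal after normalization
print("hello".encode("ascii") == "hello".encode("utf-8"))  # ASCII is a subset
```
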

Testing utf8mb4

  1. ensure you’re choosing an up-to-date version of MySQL 5.6 or 5.7 that, if necessary, supports long index keys (more than 767 bytes) using the DYNAMIC row format and Barracuda file format. Check with SHOW VARIABLES LIKE “innodb_file_format”;
  2. identify some test strings using, e.g., emoji and write some small test apps in each of the application languages you support
  3. dump and restore your data on a test instance, especially if you have Asian or emoji characters. Run mysql_upgrade.
  4. convert your schemas, tables and columns to utf8mb4. Update my.cnf.
  5. test using the test programs in #2
  6. update your business applications’ client settings and get acceptance testing. Now is a good time to write a central database connection function, and also a central transaction retry function.
  7. test your database tools, including backup and restore.
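
The long-key concern in step 1 comes down to arithmetic. Without the larger index prefix that DYNAMIC/Barracuda allows, an InnoDB index key is limited to 767 bytes, and MySQL reserves the maximum bytes per character for each indexed column. A minimal sketch in Python:

```python
INDEX_PREFIX_LIMIT = 767  # bytes, older InnoDB row formats


def max_indexable_chars(bytes_per_char, limit=INDEX_PREFIX_LIMIT):
    """Longest fully indexable string column, in characters."""
    return limit // bytes_per_char


print("utf8   :", max_indexable_chars(3))
print("utf8mb4:", max_indexable_chars(4))
```

This is why an indexed VARCHAR(255) column that worked under utf8 can fail to convert to utf8mb4 unless long keys are enabled or the indexed length is reduced.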

Production Migration to utf8mb4

  1. upgrade client libraries on application servers and verify
  2. update all applications in advance if possible and verify
  3. schedule downtime
  4. during downtime, deploy new applications settings on all servers and dump and migrate database
  5. do acceptance testing

If you have applications written in Perl, you need to first upgrade DBD::mysql to a version greater than 4.041. (Even CentOS 7 comes with only 4.023.)

Before (CentOS 7):

$ perl -e 'use DBD::mysql; print $DBD::mysql::VERSION'

# mysql_config
-bash: mysql_config: command not found


# yum install mysql-devel
# cpan DBD::mysql

$ perl -e 'use DBD::mysql; print $DBD::mysql::VERSION'

blogs.perl.org: DBD::mysql – all your UTF-8 bugs are belong to us!!
SO: Trying to install Perl-Mysql DBD, mysql_config can’t be found
MySQL utf8 vs utf8mb4 – What’s the difference between utf8 and utf8mb4? HN
mysqlserverteam.com: MySQL 8.0: When to use utf8mb3 over utf8mb4?
Using Databases with DBI: What Not To Do

Posted in Linux, MySQL, MySQL Cluster, Open Source, Oracle, Perl, Tech | Leave a comment

Lecture: Silicon Valley Perl Meetup – REST API Server Programming With Perl

I gave a talk at the Silicon Valley Perl Meetup on “REST API Server Programming With Perl.”

Here are the slides:

  1. Part 1: “The REST API Landscape in 2017”
  2. Part 2: “REST API Server Programming with Perl, Swagger and the Mojolicious Framework”

github: Perl Petstore Enhanced REST API Framework, Sample REST API Clients

There was an extended audience discussion afterwards with some interesting observations:

  1. “If you use boolean; before use JSON;, you may get the true/false behavior you want.”
  2. “Catalyst fixed their module dependencies issues a few years ago and can be installed quickly now.”
  3. “Even though Swagger2 provides some input parameter validation, all the 3-letter security acronyms have to be handled.”
  4. “Upgrading our spec file from Swagger 1.0 to Swagger 2.0 didn’t work until we added a directive like ‘x-mojo-controller2’.”
  5. “Consider just validating a part of the spec when doing input validation of requests.”

Swagger 2.0: How to specify an input parameter of type ‘object’?
Reading the Swagger 2.0 spec:

  • body location supports input objects now. path and query locations request input arrays for now, and objects later (limited by not having Content-type per input parameter).
  • output response can be objects.
  • having said that, whether your validator supports that or not requires testing.
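
As an illustration of an object passed in the body location, here is a sketch of a Swagger 2.0 fragment written as a Python dict. The path /pets and the names pet, name and tag are hypothetical. Whether a given validator accepts it still requires testing, as noted above.

```python
import json

# "/pets", "pet", "name" and "tag" are hypothetical names for illustration
spec_fragment = {
    "paths": {
        "/pets": {
            "post": {
                "consumes": ["application/json"],
                "parameters": [
                    {
                        "in": "body",
                        "name": "pet",
                        "required": True,
                        "schema": {
                            "type": "object",
                            "required": ["name"],
                            "properties": {
                                "name": {"type": "string"},
                                "tag": {"type": "string"},
                            },
                        },
                    }
                ],
                "responses": {"201": {"description": "created"}},
            }
        }
    }
}


def body_schema(fragment, path, method):
    """Return the schema of the first body parameter, or None."""
    for param in fragment["paths"][path][method].get("parameters", []):
        if param.get("in") == "body":
            return param["schema"]
    return None


print(json.dumps(body_schema(spec_fragment, "/pets", "post"), indent=2))
```

Pulling out just the body schema like this is one way to validate only a part of the spec against incoming requests.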

File Structure

“The Swagger representation of the API is made of a single file. However, parts of the definitions can be split into separate files, at the discretion of the user. This is applicable for $ref fields in the specification as follows from the JSON Schema definitions.

By convention, the Swagger specification file is named swagger.json.”

Thanks to Nvidia for hosting the meetup again.

OpenAPI Specification Version 2.0
URLs are UI
JWT Comments
Jeremy Zawodny: From mod_perl to Mojolicious at craigslist Slides
amihaiemil.com: What is HATEOAS?
Ask HN: What’s your biggest struggle with Microservices?
Ask HN: What are the not-so obvious things to consider while API integration?
Ask HN: In 2018, What makes a good API?

Posted in Open Source, Perl, Tech | Leave a comment

Skype and Facetime Tips for Older Mac OS X Versions

PSA for Mac OS X users on older machines like Lion and Mountain Lion: if you’re stuck on Skype 6.15, and can’t do video chat after Microsoft’s July 3, 2017 breaking update …

Apple Facetime is a free video chat application that comes with Mac OS X.

Some tips for using Facetime:

  1. It seems to use an entire CPU core, and will quickly drain your battery, so plug in.
  2. To add a new contact, you need to know their Apple ID email or their iPhone number. For phone numbers, use the full international prefix with 0 or 1, but without +.
  3. By default, if Facetime has a window visible (i.e. it’s opened), it will enable the camera and consume power. To prevent that, click on the yellow dot to minimize it (i.e. background it.)
  4. To make Facetime wait for calls, minimize it (click on the yellow dot) or leave Contacts open.
  5. If Facetime is not for you, try iMessage/Messages/iChat.
  6. Facetime on iPhone is paused automatically when they switch apps and shows a “Paused” message. Facetime is not paused on Mac OS X.

HN Discussion

Posted in Tech | Leave a comment

Configuring IPv6 on Linux CentOS

Configuring IPv6 on your Linux server (CentOS 5 and 7) is this easy if your ISP is IPv6-ready (if not, see the Tunnelbroker links below):

Edit the /etc/sysconfig/network-scripts/ifcfg-my_interface file with vi, note the settings that start with “IPV6” and update them:

  1. DNS2=2001:4860:4860::8888 – Google Public IPv6 nameserver
  2. IPV6INIT=yes – This is needed when configuring IPv6 on the interface
  3. IPV6ADDR=my_ipv6-address – Specifies a primary static IPv6 address
  4. IPV6_DEFAULTGW=my_ipv6-gateway – Adds a default route through the specified gateway

You don’t need new IPv6 switches, since switching is done at Layer 2. I’m using the old bargain web-managed HP Procurve J9028A. 🙂

(When people say IPv6-capable, that means the management features can be assigned IPv6 addresses, or that it can do Layer 3 routing functions with IPv6 addresses.)

To test:

  1. ping6 ipv6.google.com
    # ping6 ipv6.google.com
    PING ipv6.google.com(sfo07s17-in-x0e.1e100.net (2607:f8b0:4005:80a::200e)) 56 data bytes
    64 bytes from sfo07s17-in-x0e.1e100.net (2607:f8b0:4005:80a::200e): icmp_seq=1 ttl=56 time=1.39 ms
  2. traceroute -6 ipv6.google.com
    # traceroute -6 ipv6.google.com 
    traceroute to ipv6.google.com (2607:f8b0:4005:80a::200e), 30 hops max, 80 byte packets
     1  gateway (xxx:xx:x:xxx::1)  3.959 ms  3.978 ms  4.005 ms
     2  10ge7-3.core3.fmt2.he.net (2001:470:0:274::1)  7.859 ms  7.933 ms  11.998 ms
     3  10ge10-5.core1.pao1.he.net (2001:470:0:263::2)  11.296 ms  0.785 ms  11.311 ms
     4  google-as15169.10gigabitethernet8-2.core1.pao1.he.net (2001:470:0:244::2)  0.944 ms  0.946 ms  0.980 ms
     5  2001:4860:0:1004::1 (2001:4860:0:1004::1)  1.471 ms 2001:4860:0:1006::1 (2001:4860:0:1006::1)  1.478 ms 2001:4860:0:1004::1 (2001:4860:0:1004::1)  1.577 ms
     6  2001:4860:0:1::1f71 (2001:4860:0:1::1f71)  1.328 ms  1.266 ms  1.221 ms
     7  sfo07s17-in-x0e.1e100.net (2607:f8b0:4005:80a::200e)  1.150 ms  1.197 ms  1.210 ms


  1. on CentOS 7, leaving network manager enabled was more successful than attempting to disable it
  2. the files in network-scripts/ are space-sensitive, so don’t use spaces or you will get weird parsing errors
  3. if you’re a Perl programmer, for best results use perl 5.14 or newer and IO::Socket::IP instead of IO::Socket::INET. (Perl 5.10 can work if you upgrade IO::Socket and LWP modules.)
  4. “Check sysctl -a | grep disable_ipv6 output. And if it’s =1, set it to 0.”
  5. “When NetworkManager is running, it may disable ipv6 on the interface if it’s not configured via NM.”
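
Since the ifcfg files are unforgiving about typos, it can help to sanity-check an address before pasting it in. A minimal sketch using Python’s standard ipaddress module, where check_ipv6 is a hypothetical helper name:

```python
import ipaddress


def check_ipv6(text):
    """Return the parsed address if text is a valid IPv6 address, else None."""
    try:
        return ipaddress.IPv6Address(text)
    except ipaddress.AddressValueError:
        return None


dns = check_ipv6("2001:4860:4860::8888")   # the Google nameserver listed above
print(dns.exploded)
print(check_ipv6("2001:4860:4860:::8888"))  # malformed address
```

Stray spaces also fail the check, which matches the space-sensitivity warning above.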

rootusers.com: Configure IPv6 Addresses And Basic Troubleshooting In Linux
google.com: Google Public DNS IP addresses
centos.org: Are you using Network-Manager in no-GUI CentOS 7 Server?

HE Tunnelbroker Links

If your ISP does not support IPv6 yet, you can tunnel traffic in and out of your machine using the HE Tunnelbroker. This is also simple to setup, taking about 5 minutes if your server has IPv6 enabled. (Note that IRC and email traffic to port 25 are filtered to reduce abuse.)

Create Hurricane Tunnel Broker on Raspberry Pi
Hurricane Electric Free IPv6 Tunnel Broker
AWS IPv6 Update – Global Support Spanning 15 Regions & Multiple AWS Services

Posted in Linux, Open Source, Perl, Tech | Leave a comment

Debugging CSS for Programmers

In the 90’s and 00’s, programmers used to scorn front-end designers who called writing HTML “programming.” That changed when CSS arrived and pixel-perfect results raised the bar.

Here are some tips for programmers struggling with fixing CSS rendering problems:

Before Getting Started

  1. Most programmers are not artists, so don’t write your own site-wide layout CSS. Either find a designer or use bootstrap or an alternative. If you’re doing more than minimal JavaScript programming (ie. using AJAX and/or modeless dialogs), use jquery or an alternative.
  2. Treat a CSS project like learning a new programming language. Schedule a day when you’re fresh. You will need all of your concentration ability if you’re new to CSS because of the nested inheritance and browser quirks. If you finish early, that’s a nice bonus.

Getting Started

  1. You may want to capture screenshots of what the original site looks like. The Mac OS X Preview app has professional-looking annotation features by Adobe under “Tools … Annotate.”
  2. make a backup of the old CSS directory and save in a safe place. Then copy the CSS into a subdirectory so you can use the diff command on the old and new versions
  3. if your site uses normalize.css, ensure it’s the most recent version, especially before mobile testing.
  4. ensure your network environment has isolation: no load balancers, proxies or far future expires that can cache your old CSS files while testing
  5. use the W3C HTML and CSS validators to get a feeling for what’s there and pair each div with /div (I helped debug the original HTML validator.) 🙂
  6. use browser developer tools to examine the CSS. My favorite is Firefox’s Inspector.
  7. remember the hierarchy with CSS:
    1. the most-specific style wins
    2. the last definition wins, whether at the block or file level
    3. id’s win over classes. (Use CSS classes whenever possible. id’s are more heavily used with JavaScript DOM code.)
    4. pseudo-classes: “Note: a:hover MUST come after a:link and a:visited in the CSS definition in order to be effective! a:active MUST come after a:hover in the CSS definition in order to be effective! Pseudo-class names are not case-sensitive.”
    5. !important wins. (Try to avoid this unless for example you’re fixing a browser z-order rendering problem.)
    6. all bets are off with syntax errors, missing semicolons, or mis-specified class name lists, so fix those first. See below.
  8. do some reading online about block and inline element display. Some properties can only be applied to a block element, which usually means adding an enclosing div.
  9. CSS uses the original C-style comments of slash-star. Invalid comment characters, like // and #, can cause surprising behavior. Firefox will log “Selector expected. Ruleset ignored due to bad selector.”
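
The “most-specific style wins” and “id’s win over classes” rules in step 7 can be made concrete with a small specificity calculator. This is a simplified sketch in Python that counts ids, then classes/attributes/pseudo-classes, then element names. It does not handle functional pseudo-classes such as :not() or :nth-child(), inline styles, or !important.

```python
import re


def specificity(selector):
    """Return (ids, classes, elements) for a simple CSS selector."""
    # pseudo-elements (::before) count like element names
    pseudo_elements = len(re.findall(r"::[\w-]+", selector))
    s = re.sub(r"::[\w-]+", "", selector)

    ids = len(re.findall(r"#[\w-]+", s))
    # classes, attribute selectors and pseudo-classes share one level
    classes = len(re.findall(r"\.[\w-]+|\[[^\]]*\]|:[\w-]+", s))

    # drop everything already counted; what remains are element names
    s = re.sub(r"#[\w-]+|\.[\w-]+|\[[^\]]*\]|:[\w-]+", " ", s)
    elements = len(re.findall(r"[a-zA-Z][\w-]*", s)) + pseudo_elements
    return (ids, classes, elements)


for sel in ("div p", ".btn", "#nav .item a:hover", "input[type=number]"):
    print(sel, specificity(sel))
```

Tuples compare left to right, so a single id outranks any number of classes. When two rules have the same specificity, the later definition wins, which is why the order of a:link, a:visited, a:hover and a:active matters.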

Note: if you’re paying for HTML or CSS, always validate it before payment. Even better, ask your designer to validate it weekly and before delivery.

Debugging Strategies

  1. verify if there are any CSS media screen blocks that vary with resolution
  2. verify the correctness of individual class definitions, especially lists. If “cascading seems broken”, most likely there is a syntactically-correct but stray ‘div’, ‘p’ or ‘input’ after a class name
  3. once the CSS looks sane, you can debug problems by setting enclosing div classes to red or green and refresh your browser
  4. if there are multiple CSS files included, try varying the order
  5. to debug rounded corners issues, try setting the background to black or white for maximum contrast
  6. test on Firefox, Safari, Chrome and IE11. Budget twice as much time if you want to support IE9 and IE10
  7. try overriding a CSS element or enclosing div with an inline style to narrow down a problem.

Emergency Fallbacks

If you’re on a deadline or don’t have story points for a CSS project, there are workarounds.

  1. Although it’s better to write classes that apply to all elements, being more specific can fix individual form field problems
  2. As a temporary fix, doing an inline style will guarantee what the element looks like.

Getting Done

  1. if you’re not working solo, use the diff command to communicate changes to other developers and artists
  2. coordinate the checkin. Note that artists often are not version control experts, so you need to find out what the natural flow is or somebody’s changes will get stepped on (seen it twice already.) Investing some time now could prevent big problems later

Little CSS Stuff Newcomers Get Confused About

Related Annoyances

  1. maxlength doesn’t work on input type=number
  2. it’s more difficult to left-justify and right-justify 2 items in CSS than with tables


CSS Stats
CompressPNG and tools family

Posted in Tech | Leave a comment

Redis Conference 2017

I went to RedisConf 2017 Wed. and Thu. at the Marriott Marquis in SF hosted by RedisLabs.

About 2,000 people showed up – initially surprising to me, but then Redis is more than a key-value store.

Kudos to RedisLabs for a great conference: excellent venue, very well-organized, good food but intermittent Wifi.

Special credit for the combination of excellent, very technical talks with “lateral” keynotes that were thought-provoking even when not Redis-specific.

Conference slides are here.

Executive Summary

  1. 2017 is the year of three modern “at scale” tools: Redis, Spark and Grafana.
  2. Redis is much more than a sub-millisecond key-value store. It has several killer features, including zsets, geo calculations and pub/sub. Redis can tier storage to SSD, supporting multi-TB datasets that are larger than RAM.
  3. Redis users are already using replication and clustering features. Coming soon is multi-master with vector clocks, enabling multi-region masters, the Holy Grail of data persistence.
  4. Caches should be managed (documented, monitored and administered) by assigned operations staff, and in more complex scenarios, by DBAs.

Wednesday Keynotes

Kelsey Hightower, Staff Developer Advocate, Google

– Kelsey built redis kubernetes cluster in live demo and did failures
– used “Siri” voice commands to do some of the deployment

Sam Ramji, VP Product Management, Google

– very interesting high-level comments
– Dr. Eric Brewer is a VP Eng.
– “Spanner defeats CAP” because:

1) Google’s network is reliable (no partitions expected)
2) atomic clocks in each DC, less than 200 microsecond drift

– Spanner has SQL UI now, so anybody can access data.


Spanner High-Level Architecture Diagram

Wednesday Talks

Case Study: Redis Cluster at Flickr (Yahoo!)
Sean Perkins, Sr. Operations Engineer (18-year veteran)

– traditional large-scale IT environment (still pets not cattle with long-lived masters and names/IP addresses)
– had to do a cleanup and doc phase, dedicated vs. general cache instances
– recommends rcm esp. since hard to tell if redis cluster nodes up-to-date
– managed with yinst, chef, jenkins/screwdriver
– not full-on HA, but good enough for easy upgrades
– no redis cluster talks in 2016, so this is one of the first. very well-received.

Operationalizing Redis At Scale
Brian Ip, Software Engineer, Square

– SSL everywhere with ghostunnel
– read-only using min clients 1000
– redis is ro-cache, but backups made for operational reasons like cache warming
– LXC environment
– looking forward to slides!

Building Large, High Performance Databases with Redis using Flash Memory and Emerging Hardware
Cihan Biyikoglu/Frank Ober
(Redis Labs/Intel)

– RAM is expensive, SSD is cheaper
– so use SSD in tiered storage where redis is aware
– Intel has Optane SSD series for this (gave one away to audience member)

How Roblox keeps millions of users up to date with Redis Pub/Sub
Peter Phillips, Senior Software Engineer

– crazy numbers – 1 million persistent chat connections using web tier, 4 redis nodes
– Windows environment
– try to keep connection count down to 1 per device but not easy
– pub/sub used for cache invalidation
– some attendees said “best technical talk”

Using Redis at scale at Twitter
Rashmi Ramesh, Senior Software Engineer

– LB proxy layer in front of redis
– online rebalancing for qps avg. 600k qps, not size
– nighthawk is internal name
– still use memcached for flat values
– patches or designs for identifying hot keys, recommends caching in app.

Thursday Keynotes

Doing More with Redis
Ofer Bengal and Yiftach Shoolman, Co-Founders, Redis Labs

Real-time intelligence with Spark and Redis
Reynold Xin, CTO, Databricks

– great talk using the Google NIPS 2015 machine learning diagram: LHS can be implemented in Spark, RHS in Redis.

Microservices and Redis
Chris Richardson, Founder, Eventuate.io

– “microservices reflect business units”
– for distributed data modelling, look up Saga and CQRS

Chris is a world-class deep thinker. Check out his slide deck.

Trends in DevOps
Charity Majors, CEO, Honeycomb.io (ex-Parse, ex-Facebook)

Listening to Charity is definitely a unique experience. Very brassy, but also dead-on accurate.

– Redis is “Memory as a Service”
– monitoring vs. observability
– what’s changed?
– infrastructure and storage complexity is growing
– do you know who the DBA is? If not, it’s you! 🙂
– distributed system: your system is never entirely up
– your green dashboard is a lie
– there are no more easy problems. there are only hard problems. (combinations, and you fixed the easy ones)
– context is everything. aggregates throw away data
– dashboards must be people-first and consumer-friendly
– don’t make everyone be an expert
– don’t promote swdevs who don’t know operations: #OwnYourProblems

Thomas Middleditch
(Star of the hit HBO show “Silicon Valley”)

Thomas is a professional comedian/actor who portrays an alpha nerd on TV. IRL, he is a gamer and did some QBasic programming as a kid. He fielded questions from the audience and was generally witty and entertaining. His TV persona is a composite of nerds, not inspired by one particular person. One of the funniest things he did was to read the conference poster buzzwords and exclaim, “I want to get me some more pam-auth.” 🙂

Thursday Talks

Utilizing Redis in a High-tech Ad Traffic Stack
Rahul Babbar, Timesinternet (Advertising in Colombia, South America)

– 99th percentile is 2 ms

Best practices (also see slides):

– do BGSAVE on masters sequentially to reduce IO hotspot
– ensure TTL is set
– rename destructive commands so that rogue client programs don’t have accidents
– set idle timeout
– tune the hz parameter
– have an app strategy in case of redis slowdown or failure

Machine Learning in Real-time with Redis-ML
Shay Nativ, Redis Labs

– Redis’ plugins are a great way to add features
– one plugin is ML, with several models available
– one adtech user was able to use the tree model to reduce from 1200 Java app servers to 40 Redis servers

A Collaborative Canvas Using Bitfields in Redis
Daniel Ellis, Reddit

– about Reddit’s April Fools collaborative painting tool
– initial data model for Cassandra, not really a good fit
– changed to Redis using one key with 15 MB string representing one nybble for each pixel
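The one-nybble-per-pixel layout can be sketched in plain Python. This is only a sketch and not Reddit's code: the Canvas class, the 1000 x 1000 size and the 16-color limit are my assumptions for illustration. In Redis the equivalent write would be a BITFIELD command with a u4 field at the pixel's offset on the single string key.

```python
class Canvas:
    """Packs one 4-bit color index per pixel, two pixels per byte."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # Round up so an odd pixel count still gets its final half-used byte.
        self.data = bytearray((width * height + 1) // 2)

    def _index(self, x, y):
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("pixel out of range")
        return y * self.width + x

    def set_pixel(self, x, y, color):
        if not 0 <= color <= 15:
            raise ValueError("color must fit in 4 bits")
        i = self._index(x, y)
        byte = self.data[i // 2]
        if i % 2 == 0:
            # Even pixels use the high nybble (most significant bits first).
            byte = (byte & 0x0F) | (color << 4)
        else:
            byte = (byte & 0xF0) | color
        self.data[i // 2] = byte

    def get_pixel(self, x, y):
        i = self._index(x, y)
        byte = self.data[i // 2]
        return byte >> 4 if i % 2 == 0 else byte & 0x0F


canvas = Canvas(1000, 1000)  # hypothetical canvas size
canvas.set_pixel(0, 0, 5)
canvas.set_pixel(1, 0, 12)
```

Reading the whole board is then a single GET of one key, which is presumably why this fit Redis better than the Cassandra model.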

Multi-Master Redis : A Deep Dive
Elad Ash, Redis Labs

– current Redis is just LWW (last write wins)
– for multi-master, it’s desirable to have some kind of write conflict strategy
– each data type is getting “vector clock-ized”
– implemented as a Redis plug-in
– available now for beta, to be released in 2018
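To make the difference between LWW and a vector clock concrete, here is a minimal Python sketch of vector clock comparison. It is not how the Redis Labs plug-in is implemented (the talk notes don't say); the function names and the master-a/master-b node names are hypothetical.

```python
def compare(a, b):
    """Compare two vector clocks given as {node_id: counter} dicts.

    Returns "before", "after", "equal" or "concurrent".
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a, b):
    """Element-wise maximum: the clock once both histories have been seen."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# Two masters (hypothetical names) each accept a write without seeing the other's.
write_on_a = {"master-a": 2, "master-b": 1}
write_on_b = {"master-a": 1, "master-b": 2}
relation = compare(write_on_a, write_on_b)
merged = merge(write_on_a, write_on_b)
```

With LWW one of those two writes silently disappears. With vector clocks the result is "concurrent", so a per-data-type conflict strategy can decide what to keep.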

Geofencing Using Redis Geospatial Queries
Matthew Hicks, Appboy

– Appboy library is installed on up to a billion mobile devices/apps, bundled with various apps
– use Redis built-in commands to calculate distance from an advertiser location to the current user location for ad targeting (circle diameter)
– For example “is a passenger approaching a branch location? if so, push an ad.”
– not difficult to implement using Redis, but need detailed user requirements to do so in a relevant way.
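The distance check behind a circular geofence can be sketched without Redis. The haversine formula below is a standard great-circle approximation; the coordinates, the 1500 m radius and the function names are made up for illustration, and this is not Appboy's code. In Redis, GEOADD plus GEODIST or GEORADIUS would do the same job server-side.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def inside_geofence(user, branch, radius_m):
    """True when the user is within radius_m of the branch location."""
    return haversine_m(user[0], user[1], branch[0], branch[1]) <= radius_m


# Hypothetical coordinates: a branch, and a user 0.01 degrees of latitude north of it.
branch = (37.7840, -122.4080)
user = (37.7940, -122.4080)
should_push_ad = inside_geofence(user, branch, radius_m=1500)
```

The hard part, as noted above, is not this arithmetic but deciding which radius and which moment are actually relevant to the user.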

Geospatial Indexing At Scale: the 10 Million QPS Redis Architecture Powering Lyft
Daniel Hochman, Lyft

– good talk.

Getting There

From the Peninsula, take Caltrain to 4th and King then Muni to Powell Station, or BART from Millbrae Station to Powell Station.

The Hardest Part About Microservices: Your Data

Posted in Conferences, Microservices, MySQL, REST API Programming, Storage, Tech

Ecommerce Weather Report for Manila in 2017

This is a follow-up to my 2016 post.

Not much new in mid-2017 for online shopping sites, but Honestbee is doing the first grocery home delivery now in the Philippines. The dominant grocery chain is Robinson’s, but they don’t offer home delivery. Honestbee will mainly shop for clients at Robinson’s and other specialty food providers.

The Manila area is interesting for home delivery for 5 reasons:

  1. large, high-density urban population
  2. relatively low adoption of cars, full trains and buses, traffic jams (exacerbated by Uber contractors), typhoons
  3. large available workforce of “Honestbees” (concierge shoppers and delivery bees) at very low wages
  4. large nuclear families, lots of kids – moms should welcome delivery to the door
  5. restaurant chains (McDonald’s, Shakeys, etc.) deliver, but nobody else does.

Also some recent payment processing changes to checks, credit cards and Philippines Check Image Clearing System (CICS):

  • Checks are now settled in one day nation-wide using check images instead of returning paper to the issuer, requiring new checks with an updated waiver statement on the face. Checkbook holders can only use old non-waivered checks until June 30, 2017. (So checks issued for pre-pay of insurance, etc. after June 30 must be destroyed and re-drawn.)
  • Robinsons Bank ATM cards now use the EMV (Europay, MasterCard, and Visa) smart-card standard.

Robinsons Bank CICS Check Notice

Robinsons Bank EMV Notice

Robinsons Bank Debit Card Notice

Family Ties Bind Philippine Banks


  • Note: NAIA Airport cancelled the P750 (USD $17.00) airport departure fee a while ago, and still hasn’t replaced it. They seem to have working AC throughout the airport now, as this was the most pleasant departure that I can remember from Manila.

  • Some Manila hotels are using Hi-Wire Wifi as their wireless ISP (WISP.) They use a non-standard provider architecture: the same server for both the routing gateway and local/public DNS. That means that unlike most wireless ISPs, DNS access is also restricted until authenticated.
    To get online, you must use DHCP to pick up their DNS server, or manually set your DNS and gateway to theirs. If your network settings use only a third-party DNS, for example Google DNS, login will silently fail, and troubleshooting it is beyond the ability of hotel IT staff. 🙂
  • “Fastest wifi in the Philippines! Shhhh…”
  • Gateway and San Lazaro Malls both have publicly available wifi if you know where to look, but still not Magnolia, which is undergoing major renovations despite being built only a couple years ago.
  • China Eastern Airlines’ hub, Shanghai Pudong Airport, requires registering your phone if you want free access. An alternative is the business lounge wifi if you can obtain the password. Most people would want to use their cell phone instead. (T-mobile USA seemed to work there, as I received travel update SMS.)
  • China Eastern Airlines’s main gate at Pudong seems to be Gate 16. The arrivals and departures display is obscure, but there is a small 2-screen display outside the store located across from Gate 16.

Les Paul Search Engine Traffic Volume

vintageguitarprice.com 2017 guitar search volume analysis: “Many new players might opt for a Epiphone Les Paul over the Gibson equivalent costing at least hundreds of dollars more. However, the popularity of the Les Paul is clear to see, in related queries making up three of the top five top most searched terms related to Epiphone.

When looking at geographic distribution you can see the Philippines makes up the 5th spot by [Les Paul] search interest (this place was held by Norway for Gibson), highlighting the premium vs. standard marketing (and pricing). We think it’s a great thing that Epiphone guitars are accessible to lower income markets.”

Jollibee trading motorbikes for bicycles, to spare the air or …. ?

Apple Pay and the rise of the five-party network
For Western Union, Refugees and Immigrants Are the Ultimate Market
W: Honestbee
bloomberg.com: Last year, $1.2 trillion in mobile payments were made through WeChat, with each user averaging about $85 a month in peer-to-peer transfers
Visa USA Interchange Reimbursement Fees
Xend Business Shipping
Online shopping in Africa doesn’t work because of this web form

Posted in Business, Tech, Travel

Truly Seamless Reloads with HAProxy – No More Hacks!

Landmark “corporate” technical blog post from Willy Tarreau titled “Truly Seamless Reloads with HAProxy – No More Hacks!”

The summary is that there is progress on HAProxy zero-downtime reloads on Linux, even under heavy connection load.

Interestingly, microservice owners doing frequent updates and reloads have been driving the interest behind this.

Thanks to Willy for both the improvements, and the very detailed (and lengthy) blog post explaining the history and status of the fixes.

haproxy.com: DNS for Service Discovery in HAProxy
github.com: AirBnB Synapse

Posted in API Programming, Cloud, Linux, Microservices, Open Source, REST API Programming, Tech

Percona Live MySQL Conference 2017

Percona Live 2017 was again a well-organized and well-attended Open Source database conference this year in Santa Clara.

Percona Live 2017 Keynote Day 3 by Peter Zaitsev
Peter Zaitsev, Co-founder and CEO, Percona

Each year a different conference theme emerges. This year I would say it was timeseries databases, as there was a track dedicated to it, a keynote announcement on Facebook’s Beringei, and half of the expo booths were monitoring and related product vendors (VividCortex, InfluxDB, Timescale, ScaleDB, Grafana Labs, Solarwinds, etc.)

VividCortex actually had the primary expo real estate for the first time, as well as delivering multiple talks. Congrats!

Keynote Videos
Conference Videos

Some other interesting topics were:

These were the BoF Lightning Talks:

  1. “Successful stories around MySQL and MariaDB Multi-Source Replication” by Mariella Di Giacomo
  2. “What is Sharding” by Manjot Singh
  3. “Use slow logs to collect unique queries and their performance continuously” on Percona Server using log_slow_rate_limit with Michael Wang. (Feature to do sampled slow query analysis)
  4. “The two little bugs that almost brought down Booking.com” by Jean-François Gagné. Impact of recent application bug and server replication panic bug due to a memory leak on their complex fanout topology. Possibly bug #69848. Possibly related.

The MyRocks BoF was very popular and lively. RocksDB is used internally at Facebook, and is in MariaDB 10.2 as MyRocks. Mark Callaghan answered several questions and talked about Linux IO accounting as being inaccurate for SSD according to his customized version of fio, and generally lacking in performance-tracking features. Some audience members seemed keen to adopt MyRocks, but it sounds like early days for public use. There was a session on MyRocks.

Thanks to Continuent for hosting their “customer appreciation dinner.” It was interesting talking to the CEO/owner, Eero Teerikorpi, about the history of Continuent over the years, MySQL replication and other major features used by enterprises.

Eero sold Continuent to VMware as a component in their cloud hosting plans, but bought it back in 2016 after those plans were cancelled. So Continuent is independent again. (Robert Hodges and Giuseppe Maxia stayed at VMware.) They plan to invest more in their replication product. (I’ve used it previously for MySQL to Vertica replication.)

Posted in Cloud, Conferences, Linux, MySQL, Open Source, Oracle, San Jose Bay Area, Tech

Cisco ASR 920: A router with a fear of heights?

TheRegister has an article about a Cisco recall of the PSUs in the ASR 920 Series Aggregation Services Routers.

I enjoy reading TheRegister, but this article is a little light on research.

Looking around online, it turns out that PSUs are often designed for certain altitudes when using air as a dielectric and for cooling, and 2,000 meters is common. However, the current ASR 920 datasheet specifies 4,000 meters, hence the recall.

The real story is:

  1. How did Cisco learn of the PSU problem? Device failure, or DC fire?
  2. How many other models have the wrong PSU?

Cisco ASR 920
Various Configurations of the Cisco ASR 920

How Does Altitude Affect AC-DC Power Supplies?
theregister.co.uk: A router with a fear of heights? Yup. It’s a thing
cisco.com: Cisco ASR 920 Series Aggregation Services Routers: Low-Port-Density Models Data Sheet

Posted in Tech

AWS Loft Architecture Week – Databases

I attended 2 days of the AWS Loft SF Database Architecture Conference.

The software scalability work that Amazon has done on databases and caches, especially the Aurora distributed MySQL and Postgres databases, is very impressive. (DBA note: do careful acceptance testing of any distributed database.)

Executive Summary:

  1. AWS has gone far beyond “IaaS EC2 Classic hosting” and developed a complete HA database software stack.
  2. AWS Solution Architects are very knowledgeable, and available to all account holders.
  3. The free data migration tools, SCT (Schema Conversion Tool) then DMS (Data Migration Service), can use any source and destination in both AWS and OnPrem servers, and across several common database products.
  4. The new Intel chips make the new EC2 instance types 34% faster
  5. Slides

Some of the talks I went to:

What’s New in Amazon Aurora for MySQL and PostgreSQL
by Kevin Jernigan, AWS Manager of Tech Product Management, DBS

– very impressive engineering work by AWS engineers – complete internals modernization of MySQL and PostgreSQL
– split Open Source MySQL and PG code into two components (SQL engine and SAN storage modules)
– rewrote algorithms (btree => Z-index), log replay (max. 1.5 seconds) and locking code pushed down for MySQL
– no checkpoints, so 3x less jitter (query latency variation) since data is written to network, so no disk stalls
– Aurora MySQL 5x faster or more
– Aurora PG 2x faster (already well-written internals)
– 6 nodes, 4 required for quorum.
– my opinion as a DBA is that SANs are always a problem, so carefully evaluate this.
– the story on edge cases is not well understood yet, which is critical for operating a large distributed database
– you still choose an instance size for IO/CPU since that’s how their billing works.
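The "6 nodes, 4 required for quorum" note can be worked through in a few lines of Python. Only the numbers 6 and 4 come from the talk notes. The overlap rule (read quorum + write quorum > copies) is the general quorum rule, and I'm assuming it applies here rather than quoting AWS.

```python
COPIES = 6        # storage copies, from the notes above
WRITE_QUORUM = 4  # acknowledgements required, from the notes above


def write_committed(acks):
    """A write commits once WRITE_QUORUM of the COPIES have acknowledged it."""
    return sum(1 for ok in acks if ok) >= WRITE_QUORUM


def quorums_overlap(n, write_q, read_q):
    """Any read set intersects any write set when read_q + write_q > n."""
    return read_q + write_q > n


# Two copies unreachable: the write still commits.
two_down = write_committed([True, True, True, True, False, False])
# Three copies unreachable: the write has to wait or fail.
three_down = write_committed([True, True, True, False, False, False])
# Smallest read quorum guaranteed to include a copy of the latest committed write.
min_read_quorum = next(r for r in range(1, COPIES + 1)
                       if quorums_overlap(COPIES, WRITE_QUORUM, r))
```

So writes survive the loss of two copies, and under the overlap assumption a read needs three copies. What happens in the messier edge cases is exactly what a DBA should test before trusting it.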

The speaker was well received as being authoritatively technical by my co-attendee. I found the MySQL comparisons to Aurora a little contrived as being the worst-case configuration of MySQL. ie. I can fail over in 2-5 seconds with master-master and a load balancer, as compared to his discussion of 30 seconds to a minute or more.

What’s New in Amazon RDS for Open-Source and Commercial Databases
by KD Singh, AWS Partner Solutions Architect:

– MariaDB, Oracle 12c now supported
– Read slaves use regular async replication from product, so could lag. Your app needs to handle it.
– HIPAA, ITAR, USgov, UKgov, SGgov, PCI Level 1 seller approval
– during RDS failover, your app must be programmed to reconnect automatically. expected downtime is about a minute for the CNAME => IP address to update
– SQL Server limit is 4 TB, supports AD, .bak files
– 1-second monitoring now included in Cloudwatch
– “pick the smallest you think will work, and migrate when you need a bigger instance size”
– local timezone now supported everywhere
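The "your app must reconnect automatically" point is worth a sketch. Below is a minimal retry loop with exponential backoff in Python. The connect callable stands in for a real database driver, and the attempt count and delays are my own guesses, sized so that the waits add up to a bit more than the quoted one-minute failover.

```python
import time


def connect_with_retry(connect, attempts=10, base_delay=0.5, max_delay=15.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries.

    connect is any zero-argument callable that returns a connection or raises
    ConnectionError; it is a placeholder for a real driver call.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Simulated failover: the endpoint refuses the first three attempts.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] <= 3:
        raise ConnectionError("endpoint still points at the failed primary")
    return "connection"


waits = []  # record the delays instead of really sleeping
conn = connect_with_retry(flaky_connect, sleep=waits.append)
```

In a real app, also make sure the driver or resolver does not cache the old IP address for longer than the CNAME's TTL, or the retries will keep hitting the dead primary.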

Migrating to Amazon RDS with Database Migration Service
by Dhanraj Pondicherry, Senior Manager of Solutions Architecture, AWS

– SCT (Schema Conversion Tool) then DMS (Data Migration Service)
– Successfully used for Oracle => PG by marquee clients like Shaadi.com
– may need careful VPC setup for source or target
– SCT lists count of tables, SPs so easy to eyeball for QC
– inbound bandwidth is free, so migration is very cheap within AZ
– DMS requires a CDC method to be enabled on the source, like Oracle CDC or MySQL binlogs.
– very impressive effort on these migration tools, with almost any combination of source and target possible now, including OnPrem, EC2 Classic, RDS, Aurora, and Redshift.
– “for your migration tool, pick the smallest you think will work, and migrate when you need a bigger instance size”

Amazon Aurora and Amazon Database Migration Service
by Joyjeet Banerjee, Solutions Architect, AWS Migration Lab

– download and try SCT (OLAP option is for Redshift)
– DMS Online Lab: qwiklabs.com/focuses/2965


Introduction to Amazon DynamoDB
by Sean Shriver, NoSQL Solutions Architect, AWS

– is a key-value store, key up to 2KB, is a string
– not currently related to original Dynamo paper. Most of the authors highly promoted.
– 1 partition, 5 GSIs, 5 LSIs (per partition)
– GSI and LSI separate tables
– GSI need to provision IOPs
– charged in units of 1 KB for writes and 4 KB for reads. reading from a GSI reduces IO cost
– partition key uses consistent hashing
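The "1k writes, 4k reads" billing note is easier to see with arithmetic. A rough Python sketch follows. The 1 KB and 4 KB unit sizes are from the notes, the half-price eventually consistent read is an AWS pricing rule that I'm adding, and the 3500-byte item is hypothetical.

```python
import math

WRITE_UNIT_BYTES = 1024      # one write unit covers up to 1 KB
READ_UNIT_BYTES = 4 * 1024   # one read unit covers up to 4 KB


def write_units(item_bytes):
    """Write capacity units consumed by writing one item."""
    return max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))


def read_units(item_bytes, strongly_consistent=True):
    """Read capacity units consumed by reading one item.

    Eventually consistent reads cost half as much (not in the notes above).
    """
    units = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    return units if strongly_consistent else units / 2


item = 3500  # bytes, hypothetical item size
w = write_units(item)
r = read_units(item)
r_eventual = read_units(item, strongly_consistent=False)
```

For this item a write costs 4 units and a read costs 1. That asymmetry is presumably why reading a slimmed-down projection from a GSI was described as cheaper.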

Amazon Elasticache
by Darin Briskman, Developer Evangelist, AWS

– used to be based on memcached, now redis
– average operation 480 microseconds for 4 KB object, 240 microseconds for 1 KB object
– max. 3.5 TB per server, in clusters of 15 servers. 20 million reads per second, 4.5 million writes
– 300,000 TPS
– HA is 1,000 little details. 999 doesn’t count
– “Fast Data” sub-millisecond requests for IoT, mobile real-time info
– Alexa 1,500 ms budget, but 1,000 ms is network trip. So 500 ms for calculation.
– Alexa is DynamoDB+ElastiCache
memcached challenges:
1. no persistence
2. no HA
3. race conditions on threads
Thus Redis (“Remote Dictionary Server”)
Oracle, SQL Server, Mysql then Redis
– AWS-redis persistence via snapshot to S3
– snapshots to 90% of RAM (even 95%) network copied to alternate node. Allowed 20 snapshots per day.
– replication for HA. 30 seconds to failover usually
– primary and replica (don’t like how master-slave sounds)
– 1 ms in same AZ, 2-3 ms different AZ
– 55 DCs in an AZ in US-EAST
– new Intel chips 34% faster
– no cross-AZ data transfer costs, so similar cost to Classic EC2
– don’t use T2 for prod. Use R or M.
– key CRC16
– promotion is to last-written replica, timeout of 15 seconds in case of network problem
– string key up to 512 MB, really just binary
– hash, set, list, geo, hyperloglog
– could use lambda to notify of OnPrem database update and invalidate Elasticache row
– IGA Works/Adpopcorn is Korean mobile business platform. Moment scoring on mobile users, including when to show ads
– Expedia’s real-time analytics with DynamoDB was 35000 writes, down to 3500 with ElastiCache, 6x savings. 200 million messages daily.
– only a few airlines overseas, but lots of hotels at the destination. mom and pop agencies refreshing expedia also as their backend. 100 most popular routes are 50% of queries. TTL 24 hours, updates 10 times per day
– https://www.youtube.com/watch?v=ie4dWGT76LM
– one day of work and 5 days of testing
– beyond time of year caching, you may know the most popular teams/items
– or cache the whole database if small enough
– cannot add another node, say to go from 5 shards to 6 shards, because you could lose data now. maybe later.
– “you don’t have to do anything. when you woke up later, it’ll be there.”
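The "key CRC16" note above refers to how a clustered Redis maps keys to shards. Here is a small Python sketch of the Redis Cluster rule as I understand it from the cluster spec: CRC16 (XMODEM variant) of the key, modulo 16384 slots, hashing only the {hash tag} when one is present. Whether ElastiCache does exactly this internally is my assumption. The key names are hypothetical.

```python
HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def crc16_xmodem(data):
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Hash slot for a key, honouring a {hash tag} when present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:  # ignore empty tags like "{}"
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % HASH_SLOTS


# Keys sharing a hash tag land in the same slot, so multi-key commands work on them.
slot_a = key_slot("{user1000}.following")
slot_b = key_slot("{user1000}.followers")
```

This also explains the resharding caveat: going from 5 to 6 shards means moving slot ranges between nodes, not just adding a server.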

ElastiCache Best Practices

– set reserved-memory to 90% so writes can fit in without eviction
– swap usage should be zero
– position a read replica in another AZ for HA
– primary with 2 replicas is 5×9’s
– avoid KEYS and other long-running commands
– not needed for like 1 MB of data
– 50% – 90% reduction in cost
– former Solution Architect at IBM for 20 years. At AWS, allowed to recommend ways to save money.

Everything You Need for a Viral Game, Except the Game
by Darin Briskman, Developer Evangelist, AWS

– use Dynamodb and redis
– wechat runs on redis
– publish and subscribe redis commands: subscribe to a channel then publish to it
– twitch offers hosted chat

– CloudTrail tracks every action including DBA-level access to RDS in JSON
– talk to your Solution Architect, available to every AWS account holder

by Darin Briskman, Developer Evangelist, AWS

– most downloaded Open Source app after linux kernel
– nice REST interface
– same code as Open Source, but manageable in AWS
– AWS is green-blue instead of red-black
– can resize
– AWS answers
– Centralized Logging
– CloudSearch is Solr

Hands-on Labs: Amazon ElastiCache
by Darin Briskman, Developer Evangelist, AWS

– https://s3-us-west-2.amazonaws.com/fastdata/ElastiCacheLab.zip

AWS Talks link

Kudos to Kevin Jernigan and Darin Briskman for their excellent Aurora and ElastiCache talks – the best database talks I’ve ever seen.

This version of the AWS Loft is nice as far as “pop-up” conferences go – everything is hosted on the 2nd Floor, so no sprinting up and down stairs every 30 minutes.

Amazon Aurora Under the Hood: Fast DDL

Amazon DynamoDB Accelerator (DAX)


– Windows 10 has openssh support via Ubuntu. No Putty needed.

Getting there: 1446 Market St. SF. Take Muni K or T line to Van Ness station. or take Castro bus on Market St.

Posted in Conferences, Linux, MySQL, Open Source, Oracle, Tech

Advanced Swagger UI Techniques

The benefits of using Swagger/OpenAPI are to maintain consistent API specifications, validation and documentation.

And the Swagger UI documentation tool initially looks very attractive cosmetically. However, there are no publishing or privacy restriction features. The reason is that the Swagger API spec itself doesn’t support publishing controls.

That’s fine for Open source authors and most internal-only corporate users, but commercial sites will be unhappy without more control.

So it’s important to decide well before ship date if the Swagger UI will work for you, or if you need to find another solution (ie. you might find it to be “more hole than donut.”)

Swagger UI showing API endpoints (the colored bars) and Auth Dialog. Note the double Authorize buttons. Everything you see is live (auto-generated from the Swagger API spec file.)

Here’s a summary of the Swagger UI issues that I’ve observed when writing a non-trivial API:

| # | Swagger UI Issue | Solution |
|---|---|---|
| 1 | Swagger UI makes your Swagger API spec file downloadable by end-users. | A half-step is to use Basic Auth for viewing it, but authenticated end-users can still save it to disk. A plan would be to run it server-side. |
| 2 | Swagger UI by default sends your Swagger API spec file to an external validation service. | Fixable. Set validatorUrl: null |
| 3 | The “Try it out” buttons are active, letting anybody too easily send GET, DELETE, PUT, POST, PATCH commands to your server, whether the request makes sense or not. | Fixable. See the supportedSubmitMethods parameter. |
| 4 | The default branding is for the Swagger project. | Fixable. Read the docs or just use Firefox Inspector on the Swagger UI header and change the CSS/JS. |
| 5 | The UI tool is nice, but even nicer is professionally written text with detailed examples. | Not fixable. Outside the scope of Swagger UI. |
| 6 | The UI tool does not publish to a static file. Most commercial publishers want to provide a PDF file. | Not easily fixable, though you can load your Swagger spec file into the Swagger Editor, drag the left frame further left, “print as PDF” and edit the text with Libre Office. |
| 7 | No ability to restrict which paths or request methods are displayed. | Not easily fixable. You could export a minimal Swagger API spec file just for use with Swagger UI. |
| 8 | Add username and password auth. | Easy. Just add a securityDefinitions block in your Swagger spec file and define the key name in Swagger UI’s index.html. Note that multiple auth methods require showing and clicking multiple “Authorize” buttons (see screenshot above) since the Swagger specification considers multiple auth methods to logically OR and your app has to sort them out. |
| 9 | Add API key auth. | Easy. Supported by current versions. |
| 10 | URL bar shows address of Swagger spec file. | Fixable, but end-users can still download your spec file: document. style.visibility= "hidden"; |
| 11 | Themes | Some are available at swagger-ui-themes |
| 12 | There can be one “body” parameter at most. | The Swagger spec only allows one body element (formData or JSON) while you may want to allow both. Some parsers enforce that, and some don’t. |
| 13 | Swagger UI markdown is broken/incomplete. | As of Feb. 2017, markdown support is missing numbered lists, italics, tables and more. Please post a comment when numbered item lists and line continuations work. See Issue #825 for more context. |

* by “not fixable” and “not easily fixable” I mean “total Swagger UI re-design and rewrite required.” 🙂
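For issue 7, the "export a minimal spec just for Swagger UI" workaround can be scripted. Below is a rough Python sketch that filters a Swagger 2.0 spec down to chosen paths and methods. The function name, the example spec and the endpoint names are all hypothetical, and it does not prune unused definitions, which may still leak internal model names.

```python
import copy
import json

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def public_subset(spec, allowed_paths, allowed_methods):
    """Return a copy of a Swagger 2.0 spec limited to the given paths and methods."""
    slim = copy.deepcopy(spec)
    slim["paths"] = {}
    for path, item in spec.get("paths", {}).items():
        if path not in allowed_paths:
            continue
        kept = {
            key: value for key, value in item.items()
            # Keep non-method keys such as "parameters"; filter the operations.
            if key not in HTTP_METHODS or key in allowed_methods
        }
        if any(key in HTTP_METHODS for key in kept):
            slim["paths"][path] = copy.deepcopy(kept)
    return slim


# Hypothetical spec with one public and one internal endpoint.
full_spec = {
    "swagger": "2.0",
    "info": {"title": "Example API", "version": "1.0"},
    "paths": {
        "/orders": {"get": {"responses": {"200": {"description": "OK"}}},
                    "delete": {"responses": {"204": {"description": "Deleted"}}}},
        "/admin/users": {"get": {"responses": {"200": {"description": "OK"}}}},
    },
}

docs_spec = public_subset(full_spec, allowed_paths={"/orders"}, allowed_methods={"get"})
docs_json = json.dumps(docs_spec, indent=2)
```

Point Swagger UI at the slimmed file and keep the full spec for validation and code generation. It is a workaround, not a publishing control.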

Getting Started with Swagger UI

  1. unzip or clone master to a public directory
  2. update dist/index.html with the location of your api.json file
  3. if you’re using a load balancer and see an error like “cannot call https from http”, change scheme from https to http in your Swagger API spec file.

Customizing the Swagger UI

Test Parameter
Security Tokens
How to break swagger 2.0 JSON file into multiple modules
Tom Johnson’s Tutorial
Tuan’s Tip to Add Username and Password (Check spelling carefully)
A Visual Guide to What’s New in Swagger 3.0

Note: if you google or stackoverflow for help, ignore any bug reports about Swagger UI before 2016.

Swagger Editor Custom UI
blog.novatec-gmbh.de: The problems with Swagger, HN discussion
Top 7 Myths about HTTPS and Browser Caching
Ask HN: What’s the best way to write an API spec?
robwin.eu: Documentation of a REST API with Swagger and AsciiDoc
Document your Already Existing APIs with Swagger

Posted in API Programming, Open Source, Tech

Free Mac OS X PDF Editors

When you need to edit a PDF, you really need to edit a PDF.

The free Preview app that comes with Mac OS X can annotate and do some simple operations on PDF files, but does not have a feature to edit the actual text.

I tried the following free or trial PDF editors on Mac OS X 8.5 for editing text on a 30-page Firefox “print to PDF.”

| Product | Recommendation | Notes |
|---|---|---|
| Libre Office for Mac (Open Source) | Recommended | Does a great job of editing text |
| Inkscape (Open Source) | Not Recommended | Can only edit first page “due to SVG spec” unless you install plugin |
| Mac OSX Preview (Manual) | Not Recommended | Can remove PDF pages, copy text and add text. |
| iSkysoft PDF Editor Trial | Not Recommended | Can edit text, but inserts large yellow watermark on save |

iSkySoft Watermark

Multiple page support for Inkscape

Posted in Tech

Super Bowl LI 2017

Best Super Bowl that I can remember, with first Super Bowl Overtime.

The New England Patriots (QB Tom Brady) came from behind to win 34-28 over the Atlanta Falcons (QB Matt Ryan).

Atlanta scored 3 TD’s in Q2:

  1. Freeman does 3 drives resulting in a flying landing in the endzone
  2. xxx runs into endzone
  3. an 82-yard interception return.

Patriots got a 41-yard field goal for 3 points.

Lady Gaga’s half-time performance was good, with surprising acrobatics.

Then the Patriots scored 31 unanswered points to win.

Commercials ($5 million for 30 seconds) weren’t too good, although John Malkovich trying to get his vanity domain from a domain squatter was pretty good (squarespace.com.)

Seemed like mostly cell phone, car and VR-related ads.

W: Super Bowl LI

Posted in Tech

Odd but Handy URLs

| Organization | Link | Purpose |
|---|---|---|
| Apple | http://captive.apple.com/ | non-SSL ProbeURL for WiFi hand-shaking. Replaces success.html. |
| Google | http://www.google.com/ncr | force plain English site (“no country redirect”) |
| IANA | http://www.example.com | official domain for examples in documents |
| IANA | http://www.example.org | official domain for examples in documents |

How to fix “SuccessSuccess” Wi-Fi issue on MacOS X Mavericks temporally

Posted in Tech

SpaceX Launch Begins Era of Space-Based ADS-B Tracking

The big news from the SpaceX launch of 10 Aireon Iridium 2G “Next” satellites is that ADS-B was also included on each satellite.

ADS-B is used for tracking and sending ATC digital information to airplanes. The FAA has mandated that almost all aircraft will install ADS-B transceivers before 2020, at a cost of $5,000 to $1 million per airplane, plus downtime.

Since there’s 150,000 registered US aircraft and thousands of foreign airliners, that just isn’t going to happen with the existing number of Mx shops and remaining 1,080 days. 🙂

These are fairly large satellites at 860 kg each:

SpaceX Launch Begins Era of Space-Based ADS-B Tracking
Iridium-1 Hosted Webcast
Layman HN Commentary
W: Automatic dependent surveillance – broadcast
faa.gov: ADS-B Frequently Asked Questions (FAQs)
avweb.com: New Satellites Promise Better ATC Coverage
gpsworld.com: Clocks fail on some Galileo satellites, backups working

Posted in Tech

Perl and Monotonic Time Functions

Perl on Linux supports the POSIX C clock_gettime() function to get the monotonic time (always increasing system time, except for variable overflow) values:

Comparing monotonic time values:

  • avoid problems with NTP stepping the clock backwards for leap seconds, but can “warp”
  • avoid problems with VM time going backwards
  • can only be used locally, not compared across machines
  • can rollover on variable overflow

Disadvantages of clock_gettime() over time/gmtime:

  • rollover requires awareness and calculation
  • not supported on Mac OS X and buggy before RHEL 5.3
  • relative time, not actual time, so cannot be displayed for humans
  • for most programs, requires code change and re-QA
  • dichotomy still exists between system and database time
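
Example script (/tmp/clock.pl):

use strict;
use diagnostics;

use Time::HiRes qw(clock_gettime CLOCK_REALTIME CLOCK_MONOTONIC);

# Wall-clock time: seconds since the Unix epoch. Can jump when the clock is adjusted.
my $realtime = clock_gettime(CLOCK_REALTIME);

# Monotonic time: seconds since an arbitrary starting point. Only differences are meaningful.
my $mono = clock_gettime(CLOCK_MONOTONIC);

print "realtime = $realtime, monotonic = $mono\n";

Running it:

$ perl /tmp/clock.pl
realtime = 1483451061.64625, monotonic = 4536159.37919642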
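For comparison, the typical use of a monotonic clock is measuring durations, which one of the links below covers for Python. A minimal sketch using the standard time.monotonic() call follows. The timed helper is my own illustrative wrapper, not from any of the linked posts.

```python
import time


def timed(func, *args):
    """Run func and return (result, elapsed_seconds) using the monotonic clock."""
    start = time.monotonic()
    result = func(*args)
    elapsed = time.monotonic() - start
    return result, elapsed


# The elapsed value cannot go negative even if the wall clock is stepped backwards.
result, elapsed = timed(sum, range(100000))
```

As with the Perl version, only the difference between two readings means anything. The raw value is not a timestamp.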

Perl – Time::HiRes
clock_gettime(3) – Linux man page
Erlang – Postscript: Time Goes On
lwn.net: The leap second of doom
SO: How do I get monotonic time durations in python?
SO: Linux clock_gettime(CLOCK_MONOTONIC) strange non-monotonic behavior
SO: Is CLOCK_MONOTONIC process (or thread) specific?
How the NYE leap second clocked Cloudflare – and how a single character fixed it
W: Swatch Internet Time (Beats)

Posted in API Programming, Linux, Open Source, Perl, Tech

eBay Bucks Base Earnings Now 1%


“Changes to eBay Bucks Rewards Program starting January 1, 2017
Effective January 1, 2017 the Base earnings are changing from 2% to 1%.”

Guess I’ll be advising sellers to wait for 8% or 10% eBay Bucks days.

Last time I’ll see one of those.

Basic Economy Fares Don’t Lower Ticket Prices, They Increase Ticket Prices
The Champions of the 401(k) Lament the Revolution They Started

Posted in Tech

Microservices: Java MicroProfile Links

From the MicroProfile FAQ:

“The MicroProfile is a baseline platform definition that optimizes Enterprise Java for a microservices architecture and delivers application portability across multiple MicroProfile runtimes.

The initially planned baseline is JAX-RS + CDI + JSON-P, with the intent of community having an active role in the MicroProfile definition and roadmap.”

Looks more like a REST bundle to me that avoids fixing Java’s inherent flaws:

  1. long GC pauses (seconds)
  2. bloated memory consumption (GBs)
  3. slow start-up time (seconds)
  4. crashes from exceeding pre-configured heap size
  5. licence confusion – is it Apache v2? EPL? “Copyrights are inconsistent at the moment”? what does Oracle say?

microprofile.io: Home, FAQ
theregister.co.uk: MicroServices-friendly Java lands on Eclipse
eclipse.org: Eclipse MicroProfile

Posted in API Programming, Business, Cloud, GC Pauses, Java, Linux, Microservices, Open Source, Oracle, REST API Programming, Tech

Aerodynamics: Rolling Gs

Avweb has an interesting article on ‘Extreme Maneuvering’ about practical applications of the FAA commercial aerobatic maneuvers (chandelles, lazy 8’s, steep spirals) that mentioned “rolling G’s.”

A rolling G occurs when you maneuver an aircraft in more than one axis at a time, causing the airframe or wing to twist. The rolling G design limit is considered to be 2/3 of the normal G limit, according to FAR 23.
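As a quick worked example of the 2/3 rule of thumb, here is the arithmetic in Python. The category limits are the commonly cited FAR 23 positive limit load factors, which I'm supplying as assumptions. The real limits for any airplane come from its POH, and this is illustration only, not flight guidance.

```python
ROLLING_FRACTION = 2 / 3  # rule of thumb quoted above

# Commonly cited FAR 23 positive limit load factors by category (assumed here).
CATEGORY_LIMITS = {"normal": 3.8, "utility": 4.4, "aerobatic": 6.0}


def rolling_g_limit(symmetric_limit):
    """Approximate rolling G limit as 2/3 of the symmetric (one-axis) limit."""
    return round(symmetric_limit * ROLLING_FRACTION, 2)


rolling = {category: rolling_g_limit(g) for category, g in CATEGORY_LIMITS.items()}
```

By this estimate a normal-category airplane has only about 2.5 G available while rolling, which is not much margin in an aggressive pull-up with aileron input.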

Although I performed the commercial maneuvers during my Commercial Airplane practical test and Citabria checkout, I wasn’t really aware of two things:

  1. airframe twisting from rolling G’s can more easily exceed a plane’s load limit. Those limits would be lower in older aircraft, possibly already damaged, than newer ones. It’s important to load the airplane one axis at a time.
  2. the commercial maneuvers can be used to reverse in a box canyon. I know a private pilot who crashed in a box canyon (he luckily survived) because he knew of no course reversal methods, so this is handy to know.

Chandelle (Climbing, Reversing Turn) Animation

Advanced Section

The asymmetric lift, resulting in a torque, caused by the ailerons travelling up and down simultaneously with yawing and pitching maneuvers is believed to have caused several airshow accidents in older airplanes, shearing the wing spar. Contributing factors are the acceleration rate of the control movement, airspeed above maneuvering speed, Va, and wing and fuselage harmonics. Sideslip also affects G limits.

It would be difficult to calculate actual rolling G limits without destroying several aircraft to build a mathematical model. There are a number of reasons for that, but primarily the problem is that dynamic torque must be calculated for multiple types of members, including spars, fuselage skin, and especially attach points. The latter is tricky because attach point hardware may be very strong in one axis, and very weak when loaded off-axis (or corroded.)

pprune.org: Why is Rolling G dangerous?, Normal G limits vs Rolling G limits?
W: Chandelle,
Ice and Tail Stalls

Posted in Tech

Ecommerce Weather Report for Manila in 2016

Current consumer ecommerce weather report for Manila in Dec., 2016 …

US ecommerce sites – Just Say No

Manila sellers are wary of Facebook Pages commissions on retail listings, and “meh” on ebay for the same reason. Craigslist is free but doesn’t have any mindshare in Manila.

Southeast-Asian ecommerce sites – Just Say Yes

So they’re going with Lazada, probably #1 in Manila, or Shopee, which is ad-supported, and Carousell.

Shopee is owned by the Garena Group of Singapore. They have registered country-specific top-level domains (TLDs) for each Asian country supported.

How Shopee works:

  • buyers and sellers download the iPhone or Android mobile app, or use the web site, to upload and view listings
  • Shopee Customer Support, local to each Asian country, approves photos
  • buyers and sellers can apply for free shipping
  • Shopee shows ad banners for $$$$.

I got a tour of the Shopee office. It’s similar to Silicon Valley start-up offices, but has a staffed reception area. 🙂

Car Hire

The most popular car apps are Uber and Grab. Riders use car apps because buses and the MRT (train) are inadequate for longer commutes, and unsafe due to petty criminal gangs. Drivers see car apps as a way to pay off their car loan, and to kill time, due to rampant underemployment.

Uber is rumored to have increased Manila traffic by the equivalent of 19,000 cars. Rush hour used to be 7 am to 9 am and 5 pm to 8 pm. Now it’s 6 am to midnight. (In the USA, there have been mixed reports of increased traffic. SF is reported to have problems, while Phoenix has fewer.)

Uber passengers used to cancel arriving sub-compact hatchbacks like the Toyota Wigo (MSRP USD$10,000) in favor of sedans, but the Wigo is getting more respect 2 years after market introduction.

Garena’s Shopee could be on its way to beating Carousell in Asia

Posted in Business, Tech | Leave a comment

Notes for Installing Percona Xtradb Cluster 5.7 on CentOS 5

Percona supports Percona Xtradb Cluster 5.7 on CentOS 6 and CentOS 7, but not CentOS 5.

You can install the RPMs or the binary tarball, but on start you will see various package dependencies that can’t be resolved on openssl.0.10 and others.

So your options are:

  1. upgrade your OS to CentOS 6 or 7 64-bit first (recommended)
  2. downgrade and install Percona Xtradb Cluster 5.6 with yum install Percona-XtraDB-Cluster-server-56
  3. not recommended, but if you’re stubborn about clinging to CentOS 5.x and you’re a programmer, you can install Percona Xtradb Cluster 5.7 from source. You will need (at least) cmake 2.8.2+, boost 1.59+, and recommended are gcc 4.4 or clang 3.3.

Here are the build instructions that compiled for me with gcc 4.1.2:

1. Download and install cmake 2.8.2 or higher from source first:

yum remove cmake
wget --no-check-certificate https://cmake.org/files/v3.7/cmake-3.7.1.tar.gz
tar zxvf - < cmake-3.7.1.tar.gz
cd cmake-3.7.1
./bootstrap && make && make install
cd ..

2. Download and install boost 1.62 from source:

yum remove boost
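# quote the URL and name the output file, since the next steps expect boost_1_62_0.7z
# (assumes "latest" still resolves to the 1.62 .7z archive)
wget --no-check-certificate -O boost_1_62_0.7z 'https://sourceforge.net/projects/boost/files/latest/download?source=files'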
yum install p7zip
7za x boost_1_62_0.7z
cd boost_1_62_0
./bootstrap.sh --prefix=/usr/local
./b2 install
cd ..

3. Build Percona-XtraDB-Cluster-5.7 source like this:

wget --no-check-certificate https://www.percona.com/downloads/Percona-Server-5.7/LATEST/source/
cd Percona-XtraDB-Cluster-5.7.16-27.19
# remove new gcc 4.4 flag -Wvla:
# -Wvla
#    Warn if variable length array is used in the code. -Wno-vla will prevent the -pedantic warning of the variable length array. 
perl -i.orig -p -e 's/-Wvla//g' `find . -name maintainer.cmake`
cmake . -DMYSQL_DATADIR=/var/lib/mysql
mv boost_1_59_0 /tmp
# fix the boost and gcc version errors. Just replace with your versions.
vi cmake/os/Linux.cmake +27
vi cmake/boost.cmake +265
# insert 2 "out-of-scope" macros os_compare_and_swap_thread_id and os_compare_and_swap from storage/innobase/include/os0atomic.h into these 2 source files:
vi storage/innobase/lock/lock0lock.cc +1904
vi storage/innobase/trx/trx0trx.cc +204
# define os_compare_and_swap(ptr, old_val, new_val) \
        __sync_bool_compare_and_swap(ptr, old_val, new_val)

#  define os_compare_and_swap_thread_id(ptr, old_val, new_val) \
        os_compare_and_swap(ptr, old_val, new_val)
make -j 8
make test
# the new server is located at sql/mysqld
make install
# note that 5.7 has a new grants schema, so your old database won't work until upgraded
# in /etc/init.d/mysql, bindir=/usr/local
Posted in Linux, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

Star Wars ‘Rogue One’ Review

I don’t often go to the movies, but saw ‘Rogue One’ with a date.

The first half seemed kind of slow and disconnected, dealing with various rebel assassination plots (!) on Jedha and Eadu. Good visuals but weak story-telling.

The protagonist, Jyn Erso, portrayed by Felicity Jones, must be a pretty bad actor to have her mother killed in front of her, yet convey being unsympathetic and uninvolved throughout most of the film – I’d rather watch paint dry.

However, the second half dealing with the invasion and destruction of the Imperial base at Scarif and testing of the Death Star was riveting.

Darth Vader’s brutal but ultimately futile light-saber fight scene at the end will please action fans.

And seeing a youthful Princess Leia (CGI) at the end receiving the Death Star plans was a nice surprise.

The rebels’ companion robot, a re-programmed Imperial model named K-2SO, was intelligent and funny enough to be unsettling. Admiral “Fishlips” Raddus, a Mon Calamari, also provided comedic distraction.

Admiral “Fishlips” Raddus. Photo Credit: Lucasfilm

W: Rogue One
IMDB: Rogue One
Can we talk about that final Darth Vader scene in Rogue One?
bbc.com: Rogue One is Star Wars for Better and for Worse
Rogue One: Meet Admiral Raddus, the character inspired by Winston Churchill
Behind the scenes of ‘Rogue One’ with director Gareth Edwards

The story behind Princess Leia’s hairstyle

Keywords: General Fishlips

Posted in Tech | Leave a comment

Cessna Skycatcher 162 Inventory Crushed

The sorry tale of the Cessna 162 Light Sport Aircraft (LSA) has finally concluded. The remaining 80 airplanes have been crushed with a backhoe outside the Chinese factory, including the installed engines and avionics.

It’s believed that liability insurance and parts support didn’t pencil out for the accountants. Crushing solved the liability problem, and also any agreements with suppliers like Continental and Garmin prohibiting resale.

Kind of a shame they couldn’t have sold them to the Chinese government for $1 for use in flight training in exchange for indemnity from lawsuits.

Of the original 1,000 projected orders, 200 were actually delivered and 80 crushed.

There were many problems with the 162:

  1. capabilities were Day and Night VFR, not IMC.
  2. 1,320 pounds gross weight only left room for one American after full fuel with this design
  3. flight schools required a separate check-out, even if you had 152 and 172 experience. This involved additional expense and searching for a slender CFI.
  4. price was high for flight schools given the above limitations. The 162 had teething problems, and some owners had to replace the ADAHRS twice
  5. assembled in China, unlike most trainers.

Crushing C162 with a Backhoe

Cessna Scraps Unsold Skycatchers
Crushing More Than Airplanes
Skycatcher’s Demise: Barely a Ripple [2013]

Posted in Tech | 2 Comments

Mac OS X TextEdit Reads and Writes Microsoft Word Formats

Wow. Who knew the little TextEdit application supports .doc, .docx and PDF formats?

This actually worked for me today:

  1. I imported some Word 97/5.0 business documents
  2. updated and saved them
  3. and I exported them as PDF.

So you can do light but professional documentation, invoicing, etc. without installing additional software.

For multi-column and chart formatting, just use the Format => Table… menu option, similar to old-school HTML layout. You can set the table cell borders to 0 pixels to make them disappear.

Bonus tip: Preview, which is also included with Mac OS X, can be used to professionally annotate images. If you’re a manager or engineer, you will love the results. Just click on Tools … Annotate.

osxdaily.com: Opening DOCX Files on a Mac, Without Microsoft Office

Posted in Business, Tech, Toys | Leave a comment

Storage: Erasure Encoding Acceleration with Intel CPUs

There are basically 3 ways to store online data, where “storage” includes block and/or network locations:

  1. filesystems on top of blocks (zfs, xfs, ext4)
  2. object stores across storage (OpenStack Swift, Backblaze, S3)
  3. files erasure-encoded across storage networks (CleverSafe)

CleverSafe struggled along with VC funding until Oct. 2015, when it was bought by IBM for $1.3 billion.


An HN commenter has done us a favor by listing a handful of links to Intel CPU acceleration techniques useful for computing #3:

“At a glance, this seems like a clear explanation of using standard SIMD instructions to solve the problem, but I think the landscape has changed since this was written such that there are now better approaches.

In 2010, Intel released processors with a dedicated instruction for “packed carry-less multiplication.”

Unfortunately, the early implementations (through Sandy Bridge) were slow, and could be beaten by combining other SIMD operations as shown in this paper.

With the Haswell generation released in 2013, though, PCLMULQDQ got much faster. Instead of being able to complete one instruction every 8 cycles, it became possible to finish one every 2 cycles (inverse throughput went from 8 to 2). This 2015 paper “Faster 64-bit universal hashing using carry-less multiplications [PDF]” shows the difference this makes:

If you are looking for an explanation of how the problem could be solved with the basic building blocks of SIMD, the 2013 Plank, Greenan, Miller paper might be a good resource. But if you are hoping to implement high performance solution for modern processors, the 2015 Lemire and Kaser paper is probably a better starting point.

(This is with the caveat that I don’t actually understand the theory or terminology of Galois fields, and maybe there is something about applying it to Erasure Coding that makes the faster PCLMULQDQ approach inapplicable.)”
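
To make the term concrete: carry-less multiplication treats two integers as polynomials over GF(2), so the partial products are combined with XOR instead of addition. Here's a minimal sketch in plain shell arithmetic, only to illustrate what the PCLMULQDQ instruction computes in hardware. The function name clmul is made up for this example, and a loop like this is far too slow for real erasure coding:

```shell
# clmul A B: carry-less (GF(2) polynomial) product of two small non-negative integers.
clmul() {
    a=$1
    b=$2
    r=0
    while [ "$b" -gt 0 ]; do
        # if the lowest bit of b is set, XOR in the current shifted copy of a
        if [ $((b & 1)) -eq 1 ]; then
            r=$((r ^ a))
        fi
        a=$((a << 1))
        b=$((b >> 1))
    done
    echo "$r"
}
```

For example, clmul 3 3 prints 5 rather than 9, because (x + 1)(x + 1) = x^2 + 1 when coefficients are added modulo 2.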

FAST-2013: Screaming Fast Galois Field Arithmetic Using Intel SIMD Instructions (2013) [PDF]
OpenStack Swift
register.co.uk: IBM Buys CleverSafe
Patent Troll Kills Open Source Project On Speeding Up The Computation Of Erasure Codes

Keywords: Reed, Solomon, Galois, GFC.

Posted in API Programming, Cloud, Linux, Open Source, Storage, Tech, Toys | Leave a comment

apis.json File AKA sitemap.xml for APIs

apis.json is a site discovery format like sitemap.xml, but for your APIs. It is an open project by some ambitious API evangelists to create a new standard.

Steps to create your own apis.json file:

  1. look at existing files featured on apisjson.org, or for a complete example, RackPing.com
  2. validate your apis.json file
  3. contact the apisjson.org maintainers to add a link to your apis.json file.

apis.json: validator, Google Group, github, Proposed Intranet Properties

API Evangelist
github: OpenAPI Specification

Posted in API Programming, Tech | Leave a comment

Retro: GeoCities Cage Photos [1999]

I had a chance to see the GeoCities Exodus 1 cage in 2000 when I was at eBay Payments.

It used LaCie JBOD stacked to the ceiling as storage devices, and a 3′ diameter floor fan to move the hot air to other customers’ cages. 🙂

$50 Floor Fan Protecting $millions in GeoCities equipment from outside their colo cage. What could go wrong?

Their cage left an impression on me, and demonstrated how:

  1. ghetto colo can work
  2. cages can achieve very high densities
  3. devices can work at very high temperatures for extended time intervals
  4. to work your colo provider (there’s no way they got prior approval for that floor fan!)

Below are some photos of one of their cages with Sun and Netapp gear:

The GeoCities Cage at Exodus Communications [1999] HN Comments

Note their use of Veritas Volume Manager.

Until around 1998, Linux did not have a journalled filesystem. I started evaluating Reiserfs 3 on Suse Linux at that time on my personal machine. A Suse salesrep a decade later refused to believe that anybody in the USA could have been using Suse back then. 🙂

The other cage from 2000 that left an impression on me was About.com’s, which had a Sun E10k server ($2 million each fully populated). I don’t think they ever launched a product, yet they had the same equipment as eBay’s main cage.

I was talking to some other sysadmins with gear at Exodus 3 and 4, and they mentioned a lot of customers also built out their colo but never launched.

Dedicated Internet Access & Hosting Agreement between Exodus and GeoCities
W: Exodus Communications

Posted in San Jose Bay Area, Storage, Tech, Toys | Leave a comment

TransAsia Airways Shuts Down After Two Horrific Accidents


TransAsia Shuts Down Amid Safety, Financial Problems

Posted in Tech | Leave a comment

GitLab Validating Ceph in Production For Me

GitLab.com: Spikes are Outages. OSD = Ceph Object Storage Daemon

  • It would be easy for me to criticize GitLab for using a distributed file system in production, especially Ceph, in AWS. I just wouldn’t roll that way.
  • And it would be easy for me to say, “I told you so.” again about AWS latency being a performance killer. It’s physics.

    After all, when you yoke a bunch of water buffalo together, your team is only as fast as the slowest buffalo.

    But I find it fascinating and convenient that they’re doing all that distributed file system testing for me. Thanks, guys! 🙂

    On the plus side, supporting a distributed file system is almost possible on homogeneous hardware …

    Here’s some free consulting from somebody who works on x,000 to xx,000-server data centers:

    1. buy hardware compatible with Ceph
    2. use 10 Gbps switch ports
    3. use cluster-dedicated switches
    4. hire somebody already doing it now
    5. don’t goof up your health-checks. Include all healthy servers, not just the healthiest one
    6. or instead of using Ceph or Gluster, do it right. Implement Backblaze’s object store design. Invert the problem from being “the network and OSD has to always work” to something tractable like “my HTTP API has to work most of the time”. And use a combination of Arista Clos network design and HAProxy as the mesh router to avoid network hotspots and SPOFs. Non-blocking and “Propah!” with multi-terabits per second sustained throughput! Now we’re talking! 😎

    “There is a threshold of performance on the cloud and if you need more, you will have to pay a lot more, be punished with latencies, or leave the cloud.”

    GitLab.com: How We Knew It Was Time to Leave the Cloud
    HN discussion (with Cloud Apologists)
    Proposed server purchase for GitLab.com HN

    Posted in Cassandra, Cloud, Open Source, Storage, Tech, Toys | Leave a comment

    Congrats to Cirrus on Type Certificate for SF50 Jet

    Cirrus SF50 Vision Jet – almost actual size!

    Congrats to Cirrus Aircraft on receiving an FAA type certificate for their new SF50 Vision Jet.

    It’s a short-range, single-engine, 4-seat passenger jet with a parachute. It sells for $1.5 million and can be operated for $660/hour.

    I’ve been following the news on this during the last decade of development. General Aviation (GA) moves slowly, but the SF50 and HondaJet show that eventually small aircraft do get certified.

    The numbers:

    1. 300 KTAS
    2. 2036′ runway
    3. 67 knots stall speed
    4. FL280
    5. FIKI
    6. cabin height is only 4.1′.

    Obviously it’s intended for existing Cirrus owners who want to step up to a jet.

    It uses a Williams FJ33-5A jet engine, similar to the original $800,000 Eclipse 500 jet plans. Williams is a cruise-missile manufacturer, so has lots of manufacturing experience with small jet engines, but not much experience with passenger airplane maintenance.

    The Cirrus SF50 is not like this, a Boeing Business Jet (BBJ)

    Cirrus was bought by a Chinese state company, AVIC, in 2011.

    China has bought most of the American GA manufacturing capacity out of bankruptcy in the past decade to position itself for growth in the emerging Chinese civilian market, including Continental Engines (2010), Superior Air Parts (2008), Mooney (2013) and Diamond Canada (2016.)

    So far, that has resulted in much-needed investment, though it’s unclear what the long-term implications are for the USA.

    avweb.com: Cirrus SF50 Vision Jet: Learning From the Past
    aopa.org: Hourly operating costs of 45 jets compared
    Mooney’s Fortunes Tied to China
    Luxury VIP jets: How the super-rich fly
    avweb.com: Checking The China Acquisition Score Card
    Cirrus Delivers First Vision Jet

    Posted in Tech | Leave a comment

    Linux HTTP Load Testing with httperf

    httperf is an easy-to-use but powerful GPL2 command line (CLI) stress and load testing tool for Linux.

    Installing httperf

    CentOS 6:

    yum install httperf

    CentOS 7:

    wget http://ftp.tu-chemnitz.de/pub/linux/dag/redhat/el7/en/x86_64/rpmforge/RPMS/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
    rpm -Uvh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
    yum install httperf

    Running httperf

    1. Always get permission from the site owner first before doing load testing
    2. It’s important to start by calibrating your tool first. Send one request and check the response:
    $ httperf --server www.example.com --uri /index.php --print-request --print-reply -d10

    If you see non-200 HTTP responses, like this 301 example response below, then you need to ensure you have the correct --uri parameter:

    httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
    httperf: maximum number of open descriptors = 1024
    SH0:GET /index.php HTTP/1.1
    SH0:User-Agent: httperf/0.9.0
    SH0:Host: www.example.com
    SS0: header 83 content 0
    RH0:HTTP/1.1 301 Moved Permanently

    You can ignore the open files warning – it’s a bug in httperf. Just keep the load under 200 connections, or compile your own version from source.

    Now we’re ready to do concurrent testing:

    $ httperf --server www.example.com --uri /index.php --num-conns 20 --num-calls 10 --rate 2 --timeout 5
    httperf --timeout=5 --client=0/1 --server=www.example.com --port=80 --uri=/blog --rate=2 --send-buffer=4096 --recv-buffer=16384 --num-conns=20 --num-calls=10
    httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
    Maximum connect burst length: 1
    Total: connections 20 requests 200 replies 200 test-duration 10.675 s
    Connection rate: 1.9 conn/s (533.8 ms/conn, <=4 concurrent connections)
    Connection time [ms]: min 1175.2 avg 1266.2 max 1728.3 median 1179.5 stddev 179.3
    Connection time [ms]: connect 63.4
    Connection length [replies/conn]: 10.000
    Request rate: 18.7 req/s (53.4 ms/req)
    Request size [B]: 73.0
    Reply rate [replies/s]: min 18.2 avg 19.1 max 20.0 stddev 1.3 (2 samples)
    Reply time [ms]: response 120.3 transfer 0.0
    Reply size [B]: header 238.0 content 0.0 footer 0.0 (total 238.0)
    Reply status: 1xx=0 2xx=0 3xx=200 4xx=0 5xx=0
    CPU time [s]: user 2.36 system 8.30 (user 22.1% system 77.7% total 99.9%)
    Net I/O: 5.7 KB/s (0.0*10^6 bps)
    Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
    Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

    Always check for non-zero error counts.
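
Checking by eye gets old quickly, so here's a minimal sketch that pulls the two numbers I care about out of a saved httperf report. It assumes the report layout shown above ("Errors: total ..." and "Reply status: ..." lines), and the function names httperf_errors and httperf_non2xx are made up for this example:

```shell
# httperf_errors: read httperf output on stdin and print the value after "Errors: total".
# Exits with status 2 if no such line is found.
httperf_errors() {
    awk '$1 == "Errors:" && $2 == "total" { print $3; found = 1 }
         END { if (!found) exit 2 }'
}

# httperf_non2xx: read httperf output on stdin and print the sum of the
# 1xx, 3xx, 4xx and 5xx counts from the "Reply status:" line.
httperf_non2xx() {
    awk '$1 == "Reply" && $2 == "status:" {
             n = 0
             for (i = 3; i <= NF; i++) {
                 split($i, kv, "=")
                 if (kv[1] != "2xx") n += kv[2]
             }
             print n
         }'
}
```

Save the report with something like httperf ... > run.log, then run httperf_errors < run.log and httperf_non2xx < run.log. On the sample run above the first prints 0 and the second prints 200, which is a reminder that zero errors does not mean the test hit the page you intended: every reply was a redirect.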

    Going Pro

    After you're comfortable using httperf, here's how to take it to the next level:

    1. use a dedicated physical machine separate from your subject under test to reduce intrusive latencies, and tail the server logs in separate terminal windows. Graph CPU and RAM consumption of the subject.
    2. build your own version of httperf with your preferred options. On CentOS 7:
      git clone https://github.com/httperf/httperf.git
      cd httperf
      # read Readme.md
      sudo yum install automake openssl-devel libtool
      libtoolize --force
      autoreconf -i
      ./configure
      make
      sudo make install
      # then read the links below and configure open files, port range and TCP timeout
    3. do runs 3 times at different times of the day and/or seasons
    4. again, always check for non-zero error counts
    5. add load and stress testing to your server and application deployment checklists. There's always some kind of surprise just waiting to be discovered. :)

    Advanced Notes

    1. test tools are one of those things where you really need the source code to get what you want
    2. Running strace httperf ..., we see that httperf does polling with the select() system call. Hmm ...
      select(4, [3], [], NULL, {0, 0})        = 0 (Timeout)
      select(4, [3], [], NULL, {0, 0})        = 0 (Timeout)
      select(4, [3], [], NULL, {0, 0})        = 0 (Timeout)

    akamaras.com: stress test your web server with httperf
    SO: Changing the file descriptor size in httperf
    easyengine.io: Increase "Open Files Limit"
    brendangregg.com: The USE Method

    Posted in API Programming, Cloud, Linux, Open Source, Tech | Leave a comment

    The first ever photograph of light as both a particle and wave

    Magnified image of electrons interacting with a standing photonic wave along a thin wire. The standing wave shows the wave nature of light, and the coloration measures the change in velocity as photons interact with electrons (particles)

    The first ever photograph of light as both a particle and wave [2015]

    Posted in Tech | Leave a comment

    Basic JMeter Load Testing of Web Sites and Rest APIs

    This is an intro to load testing with Apache JMeter, an Open Source load testing tool.

    As a developer, QA or Operations engineer, it’s important to be familiar with what load testing tools can do, and to know how to configure a few actual tools.

    I usually reach for a load testing tool in the following scenarios, which appear similar, but really are very different. You can divide stress and QA testing into 4 categories:

    1. I want to know what will happen when 100 requests are sent to a single or handful of endpoints in a brief time interval, usually one second (connection and configuration testing)
    2. I want to know what will happen under sustained load of 50 requests/second to a single or handful of endpoints, for typically 5 minutes (performance testing)
    3. I want to know that all of an application’s pages respond successfully, typically using 1 thread (application testing)
    4. I want to know how many synthetic users an application’s pages can serve successfully under load, typically 20 – 100 threads (application load testing)

    Note that I don’t consider a simulated load to be meaningful for predicting human loads.

    For example, on one intranet project, 70,000 users were happy with a phone book web app that only load tested to 20 simultaneous users. The test was useful in indicating that nothing was misconfigured, but not useful in predicting how many people could actually use it.

    Load tests just tell me:

    1. if something is misconfigured or broken. If I get less than one response per second, or the server stops listening for a period of seconds, then we know there is a problem to investigate.
    2. numbers that I can compare to other runs over time.

    Why JMeter?

    JMeter is:

    • convenient (after the first time you learn it)
    • popular in the Java community, so worth being familiar with
    • Open Source (free)
    • extensible

    The disadvantages of JMeter are that:

    • it has a complex UI
    • Java GC pauses can affect results on longer test runs. You can mitigate that by setting up your tests using the UI, then running the tests from the command line as recommended.

    JMeter Installation

    1. check your Java version for 1.7 or 1.8 with java -version
    2. download and install JMeter
    3. read the JMeter Getting Started guide
    4. read the first 7 pages of the Basic Scripting with JMeter tutorial by Simon Knight
    5. setup an initial test using the JMeter UI. You must include a Response Assertion for a credible test. Then save to a jmx file as bin/Mysite.com.jmx (it’s an XML file with your settings.)


    Make 4 copies of your jmx file with:

    cp -p Mysite.com.jmx Mysite1.jmx
    cp -p Mysite.com.jmx Mysite2.jmx
    cp -p Mysite.com.jmx Mysite3.jmx
    cp -p Mysite.com.jmx Mysite4.jmx

    Edit each jmx file to customize the properties according to the 4 strategies I listed above:


    (If you want to invest some time, you can parameterize those as documented in the JMeter FAQ.)

    Create the following bash script test_mysite1.sh so that you can run your test from the command line:

    # Program: test_mysite1.sh
    rm -f mysite1.log
    ./jmeter -n -t Mysite1.jmx -l mysite1.samples.log -j mysite1.log
    grep "Thread Group" mysite1.samples.log | grep -v [O]K

    Running Tests

    1. Ensure you’re authorized before running any load test against a server you don’t own.
    2. bash test_mysite1.sh
    3. Analyze the response codes and timings. Test samples will be in mysite1.samples.log, and reports in mysite1.log.
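
For step 3, here's a minimal sketch of a summary function. It assumes the samples log was written as CSV with a header row that includes elapsed and success columns, and that no field contains an embedded comma. Both depend on your JMeter save settings, so treat them as assumptions. The name jtl_summary is made up for this example:

```shell
# jtl_summary FILE: print sample count, failure count and mean elapsed ms
# from a CSV samples log. Assumes a header row naming "elapsed" and "success"
# columns, and no commas inside fields. Exits 1 if the columns are missing
# or the file has no samples.
jtl_summary() {
    awk -F, '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("elapsed" in col) || !("success" in col)) { bad = 1; exit }
            next
        }
        {
            n++
            sum += $(col["elapsed"])
            if ($(col["success"]) != "true") fail++
        }
        END {
            if (bad || n == 0) exit 1
            printf "samples=%d failures=%d mean_ms=%.1f\n", n, fail, sum / n
        }' "$1"
}
```

Run it as jtl_summary mysite1.samples.log after each test and keep the one-line result with your release notes, so you have numbers to compare between runs.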

    When to Run

    Every time a change is made to your environment, you should re-run the load tests. So include it as part of your release process checklist.

    Distributed Testing

    After you’re familiar with load testing using a single client with JMeter, you can learn about using multiple load test clients.

    Bonus – “Soak Testing”

    In the telco industry, historically new systems have undergone “soak testing.” This is operating test systems under a realistic load for one month or more to “provide a measure of a system’s stability over an extended period of time.”

    JMeter Too Difficult?

    There are a couple of options for people who want results without fussing with JMeter:

    1. Command Line Interface (CLI) – httperf
    2. Graphical User Interface (GUI) – Microsoft’s discontinued Web Application Stress Tool (WAST) aka “Homer” is an incredibly easy-to-use, distributed and powerful Windows graphical tool – “its ease of use means it actually gets used.” If you want to do load testing from Windows client machines, you can download it from here.

      WAST is so good that it has fans, which can’t be said for any other load test tool. It was replaced by Visual Studio Team System’s (VSTS) Test Manager.

    JMeter: FAQ, Best Practices
    SO: Load test with varying number of threads in JMeter

    Posted in API Programming, GC Pauses, Java, Open Source, REST API Programming, Tech | Leave a comment

    How to Build Linux rkt Container Manager on CentOS 6.7

    Installing the rkt container manager on CentOS 6.x with yum will give you this error:

    # yum -y install go rkt
    # rkt run
    rkt: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by rkt)

    glibc is not something you can easily upgrade yourself, but you can build rkt from source. On CentOS 6.6 and 6.7, this works:

    sudo yum install -y go squashfs-tools libacl-devel glibc-static trousers-devel
    wget https://github.com/coreos/rkt/archive/v1.15.0.tar.gz &&
    tar zxvf - < v1.15.0.tar.gz &&
    cd rkt-1.15.0 &&
    echo "insert 'echo' at line 5667 to workaround old autogen bug"
    vi configure
    ./configure --disable-sdjournal --with-stage1-flavors=fly --disable-tpm &&
    echo 'readlink -f "$@"' > realpath &&
    chmod +x realpath &&
    export PATH=$PATH:. &&
    make &&
    # now test by downloading a Docker Ubuntu image and running it (requires about 256 MB RAM)
    sudo ./build-rkt-1.15.0/target/bin/rkt run --interactive docker://ubuntu --insecure-options=image

    CoreOS Issue #1063: build with old glibc so rkt runs on CentOS 6?

    Posted in API Programming, Cloud, Open Source, Tech | Leave a comment

    Linux rkt on CentOS7 is Just Too Easy

    The rkt (pronounced “rocket”) container manager is just too easy to run on CentOS7!

    Here’s me running a Docker Ubuntu 16.04.1 LTS image on CentOS7 (Dell 1950 III with 8 GB RAM on 100 Mbps Internet connection) for the first time in under a minute. The Ubuntu Docker image actually starts in 3 seconds once downloaded.

    Download the rkt RPM then …

    # rpm -Uvh rkt-1.18.0-1.x86_64.rpm
    # cat /etc/redhat-release 
    CentOS Linux release 7.2.1511 (Core) 
    # uptime
    06:16:04 up 141 days, 51 min, 1 user,load average: 0.18, 0.26, 0.22
    # rkt run --interactive docker://ubuntu --insecure-options=image
    Downloading sha256:6bbedd9b76a [================] 49.9 MB / 49.9 MB
    Downloading sha256:fc19d60a83f [================]     824 B / 824 B
    Downloading sha256:668604fde02 [================]     160 B / 160 B
    Downloading sha256:de413bb911f [================]     444 B / 444 B
    Downloading sha256:2879a7ad314 [================]     678 B / 678 B
    root@rkt-2ee79be0-a70b-44be-90fd-1a1a54c17216:/# cat /etc/os-release 
    VERSION="16.04.1 LTS (Xenial Xerus)"
    PRETTY_NAME="Ubuntu 16.04.1 LTS"
    root@rkt-2ee79be0-a70b-44be-90fd-1a1a54c17216:/# uptime
    06:16:27 up 141 days, 52 min, 0 users,load average: 0.25, 0.27, 0.22
    root@rkt-2ee79be0-a70b-44be-90fd-1a1a54c17216:/# ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 06:16 ?        00:00:00 /usr/lib/systemd/systemd --default-standard-output=tty --log-target=null --show-status=0
    root         3     1  0 06:16 ?        00:00:00 /usr/lib/systemd/systemd-journald
    root         5     1  0 06:16 console  00:00:00 /bin/bash
    root        13     5  0 06:16 console  00:00:00 ps -ef
    root@rkt-2ee79be0-a70b-44be-90fd-1a1a54c17216:/# exit
    # cat /etc/redhat-release 
    CentOS Linux release 7.2.1511 (Core) 
    # uptime
    06:16:58 up 141 days, 52 min, 1 user,load average: 0.15, 0.24, 0.22

    Is 0.0% memory usage light-weight enough? 🙂

    # ps aux | egrep -e "[U]SER|[r]kt"
    root      6711  0.4  0.0  41772  2244 pts/0    S+   07:22   0:01 stage1/rootfs/usr/lib/ld-linux-x86-64.so.2 stage1/rootfs/usr/bin/systemd-nspawn --boot --notify-ready=yes --register=true --link-journal=try-guest --quiet --uuid=40660e8b-09c0-46c2-893e-53de6d4068ff --machine=rkt-40660e8b-09c0-46c2-893e-53de6d4068ff --directory=stage1/rootfs --capability=CAP_AUDIT_WRITE,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FSETID,CAP_FOWNER,CAP_KILL,CAP_MKNOD,CAP_NET_RAW,CAP_NET_BIND_SERVICE,CAP_SETUID,CAP_SETGID,CAP_SETPCAP,CAP_SETFCAP,CAP_SYS_CHROOT -- --default-standard-output=tty --log-target=null --show-status=0

    rkt prepare
    Get Started with rkt Containers in Three Minutes
    build with old glibc so rkt runs on CentOS 6?
    https://github.com/JCMais/node-libcurl/issues/45: ./build-rkt-1.15.0/target/bin/rkt run --interactive docker://ubuntu --insecure-options=image

    Posted in Cloud, Linux, Open Source, Tech | Leave a comment

    Linux Graceful Service Shutdown Techniques

    When doing server upgrades with multiple servers, the ideal way is to:

    1. take one instance out of the pool
    2. drain connections on it
    3. upgrade it
    4. put it back into the pool
    5. back to #1.
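
The loop above can be sketched as a small shell function. The four hooks pool_remove, drain, upgrade and pool_add are hypothetical: you would define them yourself using one of the techniques described in this post. This is only an outline of the control flow, not a finished tool:

```shell
# rolling_upgrade HOST...: upgrade hosts one at a time, stopping at the first failure.
# pool_remove, drain, upgrade and pool_add are hypothetical hooks that the caller
# must define before calling this function.
rolling_upgrade() {
    for host in "$@"; do
        pool_remove "$host" || return 1
        drain "$host"       || return 1
        upgrade "$host"     || return 1
        pool_add "$host"    || return 1
    done
}
```

Stopping at the first failure is deliberate: a host whose upgrade fails stays out of the pool, and the remaining hosts keep serving on the old version until you have had a look.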

    The various techniques can be categorized as:

    1. application-level
    2. load-balancer-level
    3. OS-level

    The most graceful method is to use an application-level feature, since the application knows what its worker status is.

    For example, with httpd on CentOS or Red Hat, either use the apachectl command, or add the graceful-stop option to /etc/init.d/httpd:

    set -e
    echo "info: Draining connections ..."
    apachectl graceful-stop
    echo "info: You have 5 minutes to start and finish your upgrade."
    sleep 300
    apachectl start
    echo "info: httpd restarted!"
    exit 0

    If we didn’t have an application-specific way to do that, we could use iptables:

    set -e
    iptables -I INPUT -j DROP -p tcp --syn --destination-port 80
    echo "info: Draining connections ..."
    sleep 60
    echo "info: You have 5 minutes to start and finish your upgrade"
    sleep 300
    iptables -D INPUT -j DROP -p tcp --syn --destination-port 80
    echo "info: iptables allowing new incoming connections!"
    exit 0

    With HAProxy we can do this on the HAProxy host (do yum -y install socat first):

    set -e
    echo "set server application-backend/www0 state drain" | socat unix-connect:/var/run/haproxy.sock stdio
    echo "info: Draining connections ..."
    sleep 60
    echo "set server application-backend/www0 state maint" | socat unix-connect:/var/run/haproxy.sock stdio
    echo "info: You have 5 minutes to start and finish your upgrade"
    sleep 300
    echo "set server application-backend/www0 state ready" | socat unix-connect:/var/run/haproxy.sock stdio
    echo "info: haproxy allowing new incoming connections!"
    exit 0

    Sample HAProxy “show stat” output while www0 is draining (notice the “DRAIN” status):

    [root@gw ~]# echo "show info" | socat unix-connect:/var/run/haproxy.sock stdio
    Name: HAProxy
    Version: 1.5.10
    [root@gw ~]# echo "show stat" | socat unix-connect:/var/run/haproxy.sock stdio
    # pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,
    application-backend,www0,0,0,0,1,5000,3,583,1078,,0,,0,0,0,0,DRAIN,1,1,0,0,0,385,0,,1,4,1,,3,,2,0,,1,L7OK,301,0,0,2,1,0,0,0,0,,,,0,0,,,,,442,Moved Permanently,,0,0,0,1,
    application-backend,www1,0,0,0,1,5000,4,709,18640,,0,,0,0,0,0,UP,1,1,0,0,0,755,0,,1,4,2,,4,,2,0,,1,L7OK,301,0,0,3,1,0,0,0,0,,,,0,0,,,,,141,Moved Permanently,,0,1,8,8,
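
    To watch a drain from a script rather than by eye, the CSV can be reduced to one line per server. This is a minimal sketch that assumes the column order of the header above (svname is field 2, scur is field 5, status is field 18). The function name haproxy_server_states is made up for illustration, and other HAProxy versions may order the columns differently.

```shell
# Print "proxy/server status current_sessions" for each server row of
# HAProxy "show stat" CSV read from stdin.
# Assumed column positions (from the header line shown above):
#   1=pxname  2=svname  5=scur  18=status
haproxy_server_states() {
    awk -F, '
        /^#/ || NF < 18                      { next }  # skip header and blank lines
        $2 == "FRONTEND" || $2 == "BACKEND"  { next }  # skip aggregate rows
        { print $1 "/" $2, $18, $5 }
    '
}
```

    Usage, with the same socket path as the scripts above: echo "show stat" | socat unix-connect:/var/run/haproxy.sock stdio | haproxy_server_states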

    For nginx:

    set -e
    echo "info: Draining connections ..."
    nginx -s quit
    echo "info: You have 5 minutes to start and finish your upgrade."
    sleep 300
    # "nginx -s" only accepts stop, quit, reopen and reload; run the binary to start it
    nginx
    echo "info: nginx restarted!"
    exit 0

    If you’re using a configuration management system, like Puppet or Chef, you can remove the service from your load balancer pool that way. This works well in practice with only 2 or 3 servers, although connection draining is usually not part of that workflow.

    Note that in the popular “reverse HAProxy” setup, where each application server runs HAProxy on localhost and forwards local requests to the real servers (like httpd), you want to stop or block the httpd service on the real server itself. Otherwise you would have to make changes on multiple application servers.

    In a future post, I’ll discuss zero-downtime deploys.

    Drain connections on restart of NGINX process? (with iptables)
    Tomcat’s Graceful Shutdown with Daemons and Shutdown Hooks
    Get haproxy stats/informations via socat
    Go net/http: add built-in graceful shutdown support to Server #4674
    haproxy.tech-notes.net: HAProxy Socket Commands

    Posted in API Programming, Business, Cloud, Java, Linux, Microservices, Open Source, Tech | Leave a comment

    Solving Java GC Pause Outages in Production

    Just thinking about how to configure HAProxy with two backend Java servers to be HA, despite GC pauses.

    Java programs pause periodically to reclaim memory from objects that are no longer referenced, a process known as garbage collection (GC). The pause itself is called a “GC Pause.”

    The description “Stop the World” (STW) illustrates their true severity – GC pauses are a slow-motion train wreck for incoming requests. They can last from hundreds of milliseconds to minutes, and require intense CPU activity.

    Executive Summary:

    • If you have a latency-sensitive requirement, don’t use Java – use C or Go 1.8+ [GC benchmarks]
    • If you want to use Java, follow the best programming practices listed below to reduce garbage collection pause time, or consider paying Azul $3,500/server
    • HAProxy can be used with option redispatch to load balance across multiple Java servers to maintain availability during GC pauses. You can either use the HAProxy drain feature for rolling deployments, or in more complex setups, iptables.
    • Bonus tip: Java GC pauses don’t only impact your application, they also affect their entire environment like a grenade – performance tools written in Java pause, tomcat pauses, even reflection APIs are paused.

    If you’re new to this topic, please read:

    Willy: “I work with people who use a lot of Java applications, and I’ve seen them spend as much time on tuning the JVM as they spend writing the code, and the result is really worth it.” Anybody have some extra time? 😐

    My operational requirements for Java in production are:

    1. understand GC pause activity for my application servers
    2. control GC pause activity to a reasonable and bounded extent
    3. configure HAProxy load balancer to not send requests to servers undergoing GC pauses (ie. don’t lose requests)
    4. use an affordable amount of RAM to accomplish the above, preferably 8 or 16 GB in a shared VM environment.

    1. Understand GC pause activity for my application servers

    Detailed GC logging and heap dump on OOM can be enabled with:

    -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError

    and you can specify a separate GC log with:

    -verbose:gc -Xloggc:/tmp/gc.log

    See “Understanding Garbage Collection Logs.”
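
    A rough way to summarize pauses from that log is to pull the wall-clock time out of each event. This sketch assumes the Java 8 -XX:+PrintGCDetails layout, where each event ends with "[Times: user=... sys=..., real=... secs]". gc_pause_summary is a hypothetical helper name, and other collectors or JVM versions log differently. Concurrent phases (for example with CMS) also report a real time, so treat the result as an upper bound on pause time.

```shell
# Read a GC log on stdin and print the number of events that carry a
# "real=N secs" field, plus the largest such value in seconds.
gc_pause_summary() {
    awk '
        match($0, /real=[0-9.]+ secs/) {
            # "real=" is 5 characters and " secs" is 5, so the number sits between them
            secs = substr($0, RSTART + 5, RLENGTH - 10) + 0
            n++
            if (secs > max) max = secs
        }
        END { printf "events=%d max_real_secs=%.2f\n", n, max }
    '
}
```

    Usage, with the log path from above: gc_pause_summary < /tmp/gc.log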

    2. Control GC pause activity to a reasonable and bounded extent

    One of the biggest challenges is to control the frequency, duration and intensity of GC pauses …

    Some Java configuration approaches:

    • set heap size and compaction percent only somewhat above need. That will cause GCs to be more frequent, but also faster. Or do the opposite:
    • set heap size to a large amount and compaction to 100%, then trigger GC after hours
    • investigate alternate JVMs.

    An example of some of the tuning options:

    java -Xms512m -Xmx1152m -XX:MaxPermSize=256m -XX:MaxNewSize=256m MyClass

    JRockit JVM: Tuning For a Small Memory Footprint
    Tuning Java Virtual Machines (JVMs)
    Weblogic Tuning JVM Garbage Collection for Production Deployments

    Programming best practices to reduce GC pauses:

    • use streaming file IO with Files.lines() instead of reading into a String or hashmap, or use memory-mapped files
    • rewrite portions of your application to build strings with StringBuilder (or StringBuffer where thread safety is needed) instead of repeated String concatenation
    • Reduce object copies – if you do not have a problem with thread safety, then you don’t need immutable objects.
    • call dispose() method when available, such as SWT image class
    • for HashMaps, call clear() to re-use the memory later, or set the reference to null to make the map eligible for GC
    • split java server into real-time and batch servers where possible with appropriate minimal heap sizes for each
    • preallocate array memory using the length parameter to potentially avoid re-copying the entire array for new elements
    • Note that Java debuggers change the lifetime of variables so that they can be viewed longer scope-wise. Caveat emptor.

    3. Configure HAProxy load balancer requests to not be sent to servers undergoing GC pause events

    The first thing to do is to read up on HAProxy’s option redispatch feature. Continue reading for more in-depth considerations.

    This is tricky for several reasons:

    • health checks can be passive or active. Both have check gaps that won’t notice a GC starting before a request is sent
    • even if GC notifications are enabled and the server health check is red, HAProxy will not know (see above)
    • even if GC notifications are enabled and the server health check is now green, HAProxy will not know (see above) 🙂
    • the HAProxy options log-health-checks and redispatch may be helpful

    a) Some things to think about:

    1. understand your GC pattern
    2. use HAProxy socket interface to drain, then disable one backend
    3. wait for zero connections
    4. force a GC (easier said than done in Oracle Java since System.gc() is only a request for GC), or restart the Java server
    5. use HAProxy socket interface to enable the Java server.

    This method would be risky with two Java servers, since during maintenance on one server, the other could GC pause. (facepalm)
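
    Steps 2 to 5 above can be sketched as shell functions. This assumes the stats socket path and the application-backend/www0 names used in the Linux Graceful Service Shutdown Techniques post. hap_cmd, hap_scur, wait_for_drain and drain_and_restart are hypothetical helper names, and the restart command is whatever your site uses. Polling scur (current sessions) stands in for a fixed sleep.

```shell
# Assumed socket path; override with HAPROXY_SOCK if yours differs.
HAPROXY_SOCK=${HAPROXY_SOCK:-/var/run/haproxy.sock}

# Send one command to the HAProxy stats socket and print the reply.
hap_cmd() {
    echo "$1" | socat unix-connect:"$HAPROXY_SOCK" stdio
}

# Print the current session count (scur, CSV field 5) for backend $1, server $2.
hap_scur() {
    hap_cmd "show stat" | awk -F, -v px="$1" -v sv="$2" '$1 == px && $2 == sv { print $5 }'
}

# Poll once per second until the server has zero sessions or $3 seconds pass.
# Returns 0 when drained, 1 on timeout.
wait_for_drain() {
    waited=0
    while [ "$waited" -lt "$3" ]; do
        [ "$(hap_scur "$1" "$2")" = "0" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Drain, disable, run the site-specific restart command, then re-enable.
# Arguments: backend, server, then the restart command and its arguments.
drain_and_restart() {
    px=$1; sv=$2; shift 2
    hap_cmd "set server $px/$sv state drain"
    wait_for_drain "$px" "$sv" 60 || echo "warn: sessions still open after 60s" >&2
    hap_cmd "set server $px/$sv state maint"
    "$@"
    hap_cmd "set server $px/$sv state ready"
}
```

    Usage: drain_and_restart application-backend www0 service tomcat restart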

    b) Another possible approach would be to handle MemoryPoolMXBean MEMORY_THRESHOLD_EXCEEDED events. Maybe, if you reliably had advance notice, that could be used to update the health check on the server side, send a drain socket request to HAProxy, and then force a GC right away, perhaps by trying the JVM Tool Interface (JVMTI) ForceGarbageCollection() function?

    c) And another idea is to write a sentinel file every 250 ms, and if its age reaches 750 ms, assume a GC is happening and drain the server in HAProxy. Unfortunately the JVMTI events GarbageCollectionStart() and GarbageCollectionEnd() are sent only after the VM has already been stopped, so you’re limited in what you can do when you need the most flexibility.
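
    Idea (c) needs a watchdog outside the JVM, since every Java thread is stopped during the pause. The staleness test itself is simple arithmetic. This sketch assumes the Java server writes its own epoch-millisecond timestamp into the sentinel file; heartbeat_is_stale, now_ms and the file path are hypothetical, and the loop in the comments assumes the watchdog can reach both the sentinel file and the HAProxy stats socket.

```shell
# Return 0 (stale) when the last heartbeat is older than the threshold.
# Arguments: last heartbeat (epoch ms), current time (epoch ms),
# threshold in ms (defaults to the 750 ms suggested above).
heartbeat_is_stale() {
    [ $(( $2 - $1 )) -gt "${3:-750}" ]
}

# Current time in epoch milliseconds (GNU date; %N is not available on BSD/macOS).
now_ms() {
    date +%s%3N
}

# Watchdog loop sketch (hypothetical sentinel path, GNU sleep for fractions):
#
#   while sleep 0.25; do
#       if heartbeat_is_stale "$(cat /var/run/myapp/heartbeat)" "$(now_ms)"; then
#           echo "set server application-backend/www0 state drain" |
#               socat unix-connect:/var/run/haproxy.sock stdio
#       fi
#   done
```

    A request that arrives in the first 750 ms of a pause is still caught by it, so this narrows the window rather than closing it.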

    Some Java 8 Classes related to GC notifications:

    1. MemoryPoolMXBean – “The memory usage monitoring mechanism is intended for load-balancing or workload distribution use. For example, an application would stop receiving any new workload when its memory usage exceeds a certain threshold. It is not intended for an application to detect and recover from a low memory condition.”
    2. GarbageCollectionNotificationInfo
    3. GarbageCollectorMXBean

    Also, investigate mod_jk and AJP. Tomcat uses the same heap as your application, so tuning is very important here too.

    4. Use an affordable amount of RAM to accomplish the above, preferably 8 or 16 GB in a shared VM environment

    If you work in a VM consolidation environment, it’s important to minimize the footprint of your base image and also applications. See above for rewriting applications to minimize heap and GCs.

    SO: Is there any correlation between an out of memory scenario and blocked threads?
    Garbage Collection JMX Notifications Example Code
    Blade: A Data Center Garbage Collector
    How to Tame Java GC Pauses? Surviving 16GiB Heap and Greater
    SO: Garbage Collection Notifications
    Letting the Garbage Collector Do Callbacks
    How to force garbage collection in Java?
    SSL Termination, Load Balancers & Java
    Github: Measuring Java Memory Consumption – sample code
    Java is not “angry” with you.
    Set State to DRAIN vs set weight 0
    Scalable web applications [with Java]
    Examples of forcing freeing of native memory direct ByteBuffer has allocated, using sun.misc.Unsafe?
    Lucene ByteBuffer sample code
    Improve availability in Java enterprise applications
    The Four Month Bug: JVM statistics cause garbage collection pauses
    Memory management when failure is not an option

    Making Garbage Collection faster
    The Complete Guide to Instrumentation: How to Measure Everything You Need Inside Your Application
    Java heap terminology: young, old and permanent generations?
    5 Coding Hacks to Reduce GC Overhead

    Java Debugger Changes Lifetime of Variables
    Objects Should Be Immutable
    Thread Safety and Immutability
    Azul Blog: So, G1 may become the default collector for Java 9?
    Java and Scala Type Systems are Unsound


    Golang: sub-millisecond GC pause on production 18gb heap HN
    Getting Past C
    Go GC: Prioritizing low latency and simplicity
    Sub-millisecond GC pauses in Go 1.8 Graphs


    CASSANDRA-5345: Potential problem with GarbageCollectorMXBean
    Java GC pauses, reality check

    Posted in Cassandra, GC Pauses, Java, Microservices, Open Source, Oracle, REST API Programming, Tech | Leave a comment

    I found jMeter, however, really easy to use.

    So, let me get this straight:

    • Java is not safe for use in servers because of GC pauses.
    • And it’s not safe for use in clients because of GC pauses.

    Doesn’t leave much left! 🙂

    Thanks to Greg Lindahl, founder of Blekko, for making my day. You’re The Man when it comes to performance!

    Another good one that made *me* pause:

    • Me at ApacheCon 2009: “So how do you like programming in Java?”
    • Random Attendee in Wifi tables area: “It’s great. Not sure why people gripe about memory consumption.”
    • Me: “Really. Show me your Java app.”
    • Random Attendee: “Well, my Macbook Air doesn’t have enough RAM.” 🙂


    Apache jMeter
    Distributed Testing with JMeter on EC2

    Analyzing JMeter Application Performance Results

    Dan Luu: HN comments are underrated HN comments

    Posted in API Programming, Conferences, GC Pauses, Java, Open Source, Tech | Leave a comment

    REST API Client Computer Languages and Frameworks Survey

    I recently wrote REST API client programs in several programming languages as a subproject of my Perl REST API Framework, and had some surprises, both good and bad.

    I would have gladly just linked to somebody else’s sample clients, but I couldn’t find any remotely complete or professional-grade code (complete working program with error-handling, Basic auth and timeout.)

    The closest to useful REST clients that I saw were the Java tutorials by mkyong.com, RESTful Java client with Apache HttpClient

    Here are my notes:


    • tough to find a working HTTP class for Java 1.8 on CentOS 7. I couldn’t get Apache HttpClient imports working, so I ended up using HttpURLConnection
    • first experience with immutable data collections – quite jarring to realize you have to copy anything returned from a library first to change it. And an int is not an Integer, and a String is not a StringBuffer. Hahaha, good one!
    • somewhat of a learning curve for java build process. See make_java.sh for a minimal build tool
    • lint: javac -Xlint:all
    • I wouldn’t be surprised if Java’s legendary slowness and memory bloat are from the above issues, obvious even from a 200-line program.
    • since Java uses block scope, when you add try/catch/finally blocks, variables referenced in catch/finally blocks must be declared outside the try block. Makes code a lot messier. In Java 7, try-with-resources partially solves that.


    • overall, programming in Go is a pretty nice experience. The included net/http package has everything you need.
    • but “encoding/json” is overly-complicated. Some XML-head must have designed it.
    • not sure why Go treats unused modules and variables as fatal compiler errors
    • http.StatusNoContent appears to be missing from package net/http


    • used the requests HTTP module
    • felt comfortable until running Pylint and discovering how freaky the python community can get (scoring my working program -1.5/10, but getting 10/10 after whitespace-only changes. Really?)
    • nice indenting: pindent.py -r -s 4 -e


    • used the httparty HTTP framework
    • elegant, beautiful OO code without even trying.


    • got it working the fastest, but then took longer to polish it
    • wish there was a lint for PHP


    • not bad – straight-forward to do various requests and get responses


    • inadequate for REST API programming, more for manually fetching files only


    • LWP is mature, well-documented and readily available and made Perl the easiest scripting language to work with overall
    • Perl’s built-in lint checking (strict and diagnostics) is much appreciated after its lack in PHP, Python and bash.

    RESTful Java client with Apache HttpClient
    Why Pylint is both useful and unusable, and how you can actually use it
    Notes on Managing Java in the Cloud
    Static typing will not save us from broken software
    OpenFeign Java HTTP Client Library
    Java: How To Read a Complete Text File into a String
    Golang’s Real-time GC in Theory and Practice

    Posted in API Programming, Java, Linux, Microservices, Open Source, REST API Programming, Tech | Leave a comment

    PagerDuty Summit Conference 2016 SF

    I went to the complimentary PagerDuty Summit Sept. 13 on Market Street in SF.

    The well-organized conference format was 2 tracks downstairs, with breaks and a small expo area upstairs.

    Andrew Fong of Dropbox had a very good talk on their struggle to go from four 9’s (“can use tactics”) to five 9’s (“has to be strategic”.) Their solution was to have a working group composed of anybody who wanted to contribute, across departments. (Not dedicated HA staff.)

    Andre Kelly of Google talked about having well-defined post-mortem processes in place now to capture outages in an organized manner and data mine the results over time later.

    Apparently there are some popular Open Source post-mortem systems for that. Please leave a comment if you have any experience with those.

    Sean Reilley of IBM discussed people issues in communicating agile across a large company with pockets of staff who were used to waiting for permission (ie. not inherently agile.)

    Upstairs, the mini-expo seemed to have a couple booths for security-related start-up Cloud products, Datadog, plus a booth for PagerDuty itself to do customer demos and get beta feedback.

    PagerDuty Incident Timeline

    Sketch of New PagerDuty Incident Timeline Visualization Tool

    The money shot was seeing their new beta graphical incident timeline, to be released in November, which made the trip worthwhile. Until then, you can enable HTML emails for a slightly richer experience.

    The historic “Village” venue was not my favorite: climbing up and down steep stairs with a backpack got old fast.

    Conference Videos

    Posted in Conferences, Tech | Leave a comment

    Eye of Hurricane Matthew

    Eye of Hurricane Matthew

    Posted in Tech | Leave a comment

    John Collins “The Paper Airplane Guy”

    CNN linked to a video of John Collins, “The Paper Airplane Guy.”

    John holds the world-record for paper airplane distance throwing.

    I had a chance to see John live recently when he gave a lecture and demo at my office in Silicon Valley.

    It was a unique experience:

    1. John is a fun lecturer who really knows aerodynamics and can explain it clearly to both kids and adults
    2. learning the art of making high-performance airplanes was great fun.

    I hold a commercial airplane licence and can say that he really knows his stuff. Highly recommended.

    Posted in San Jose Bay Area, Tech, Toys | Leave a comment

    Does Software Rot?

    Back in the day, Joel wrote an infamous post asserting that “software doesn’t rot” over time.

    I believe Joel was addressing the tendency of new programmers on a project to avoid learning the old codebase and write a new one instead, at great cost in terms of time and money.

    But let’s discuss the more interesting topic of whether software can actually rot.

    I would say that he was correct that it doesn’t rot in a very narrow sense, namely a program written for a single version of Windows.

    But in the big picture, he was completely wrong. Even Windows software requires re-writes for “Certified for Windows” assurance for new shrink-wrapped versions to be shelved in US chain stores. (Stores were trying to reduce the rate of returns and customer support.)

    And how’s Silverlight, discontinued in 2012, working out for developers? 🙂

    When it comes to web software, total re-writes have been required for:

    • mobile
    • Apple “Retina” resolutions
    • REST APIs
    • XML and JSON output
    • new Javascript frameworks
    • web application security headers require origin and Javascript changes
    • multiple web icon resolutions

    Additionally, Y2K often required software changes, as will Y2038.

    In Joel’s article Fire And Motion, he makes an interesting observation that changes can also be used to keep developers off balance. It’s a very effective technique in the software world, where buyers chase version numbers.

    Apple could kill almost 200,000 apps with iOS 11
    joelonsoftware.com: Fire And Motion

    Posted in API Programming, Open Source | Leave a comment

    Perl Petstore Enhanced REST API Framework

    I’ve been doing a lot of work with REST APIs and microservices, so I decided to write a complete REST API framework in Perl based on the Mojolicious and Swagger2 Petstore sample.

    You can git clone the repo and add a new API endpoint in about 5 minutes with automatic parameter validation and documentation:

    git clone git@github.com:jamesbriggs/perl-petstore-enhanced.git
    cd perl-petstore-enhanced/pets
    less ../README.md
    vi api.spec set.sh cgi-bin/pets.cgi ./lib/Pets/Controller/Pet.pm
    # add an Alias for cgi-bin/pets.cgi to httpd or nginx
    # point your browser at http://www.example.com/api/v1.0/pets/1
    # Good job. Have a Modelo! :)

    or you can spend an hour to rename the files for your project and tweak it to requirements.

    This project serves as a convenient bridge for those who:

    1. can write simple CGI programs and want to write a best-practises Swagger (OpenAPI) REST API server without climbing a steep learning curve, or
    2. want to write a quick proof-of-concept API server to be re-implemented in other languages or frameworks later, as your Swagger spec file is 100% reusable
    3. are targeting a small VM. This will work in a 2 GB RAM VM just fine, or on an existing server running httpd or nginx.

    Also of note is the samples/ folder, which has non-trivial client programs in several languages (bash, Java, Perl, PHP and Ruby.)

    I learned the importance of Swagger2 and auto-generated API documentation and validation when I was programming with the old Rackspace Cloud v1 and v2 APIs.

    People asked me, “How did you get anything to work? You must have really wanted it!” since the Rackspace sample code, docs and live API didn’t match each other. My secret: I actually guessed URLs in the browser to find the endpoints I needed. Swagger prevents that headache.

    Swagger UI
    idratherbewriting.com: 10 realizations as I was creating my Swagger spec and Swagger UI

    Are microservices for you? You might be asking the wrong question.

    developer.mozilla.org: List of default Accept values

    Posted in API Programming, Open Source, Perl, Tech | Leave a comment

    Hawaii Trip 2016 – What’s New in Waikiki

    Spent Labor Day weekend in Waikiki.

    I enjoy going there every few years and seeing what’s new.

    However, it’s been completely built out as a mall, so looks kind of corporate now. To combat that, plan to climb Diamondhead and go to the zoo.

    Also, who would fly a quadcopter drone at one of the most crowded beaches in the world? Not surprised, just saying.

    So what’s new in Waikiki?

    • Two hurricanes were approaching the Islands, but as usual did not make landfall on Oahu
    • Not very busy, likely because of the Hurricane news
    • International Marketplace is now a shiny mall that opened Aug. 25. It is anchored by Saks Fifth Avenue, and has the only public restrooms in Waikiki now. It has plaques to remember the mom-and-pop stores they bulldozed.
    • Kalakaua is also a giant hand-bag mall for Japanese tourists
    • Matteo’s Italian (and Seafood) at Seaside and Kuhio closed, and a Crackin Kitchen Seafood opened next door
    • 24hour Fitness is charging $25 for a day-pass on Kalakaua, but it does have a beach view
    • Free Kuhio Beach Hula Show (Waikiki) is 6:30 pm Tues/Thu/Sat – features two dozen performers! Bring your own towel or beach chair to sit on, practise your photography.
    • 100 Japanese people were lined up outside Marukame Udon on Kuhio one night at 9 pm. Must be pretty good. Next door is a souvenir shop with the most awesomely tacky items. If you need a hula dancer for your car, get shopping.
    • Princess Kaiulani Hotel buffet ($42/person) still has free Hawaiian music and hula show downstairs, and a very good Polynesian show/dinner upstairs. (They cancelled the downstairs show at least one evening because of Hurricane weather reports.)
    • McD still serves the free pineapple cup with combos, and also offers taro pie – very sweet. They charge $10 for a combo, but you can get a BOGO Big Mac on Mondays and they have a Pick Two special, and they do have drink refills and wifi
    • Duke’s Restaurant is still packed, but the Hula Grill ($60/person) upstairs doesn’t have a wait list. Has restrooms.
    • TheBus is $2.50 per trip now, or $35/4-day tourist pass available in ABC Stores. The Waikiki Trolley is only $2/trip between Waikiki and Ala Moana and the open air cars are good for photography and sight-seeing
    • Lots of hotel and residential construction cranes
    • Flew American Airlines there – they served biscuits instead of meals, and had no entertainment systems. Ran APU for one-hour while finding pilots. Dreadful experience, but this is a USA airline, so I’m being redundant.
    • Disney Aulani is not a theme park – it’s a time-share (ie. scam) with a few hotel room rentals for $450/nite in the middle of nowhere. ok if you’re a large family that wants to cocoon, maybe.
    • if you go on a boat tour of any kind and want to have fun, buy the cheapest tickets or you’ll be stuck with grandparents

    Waikiki photo vantage points:

    • beach sunsets
    • surfboard stands
    • rescue canoes
    • Kuhio Hula Show (Tues/Thurs/Sat at 6:30 pm)
    • street performers
    • Diamondhead
    • Honolulu Zoo

    If you’re from the mainland, remember that Hawaii is hot and humid. Stay hydrated, wear a hat, and don’t over-exert yourself – especially around noon-time.


    Posted in Travel | Leave a comment

    Re: Botched Go-around Appears To Have Led to Emirates 777 Crash

    As a commercially-rated airline pilot who reads accident reports, I always tingle when I fly on anything but a USA majors flight in less than perfect weather.

    The recent Emirates 777 crash in Dubai is a case in point.

    The airliner, with 300 people aboard, crashed into the runway with a sink rate of 900’/minute, and later the center-tank exploded, killing one firefighter. 22 pax and FAs were injured descending the slides (typically, several people are injured during a slide evacuation.)

    It’s important for pilots to always be mindful that a landing approach can end in two ways:

    1. landing
    2. go-around

    Though it would take a lot of painstaking research to say where this particular flight started going wrong, we do know some of the links in the “accident chain”:

    1. wind shear from 8 knots headwind to 16 knots tailwind. Depending on when the pilots learned this, their spidey-sense should have been off the scale – i.e. either requesting a hold, a go-around or another airport. I also wouldn’t use the autopilot in wind-shear because judgment is needed to manage the throttle in that situation
    2. long landing – aim point in an airliner is 1000′, but they had a 1,100 meter (3,609′) warning. If they couldn’t start a normal landing at 1,000′, it was time to seriously think about a go-around
    3. late go-around – if you’re over the runway at idle and 5′ in a wind shear with your gear down, you probably should just land. What were the pilots thinking here? Were they blindly following ATC or book procedures when they really needed piloting skill?
    4. late TOGA power – jet engines take about 6 seconds to produce useful lift, the pilots tried 3 seconds. Do the math.
    5. foreign airline and pilots – for some reason, they’re often not up to challenging weather. They seem to be more interested in epaulets than aerial mastery. I’d suggest making them fly this flight profile in the sim before graduation. Or is the extra $5,000 in fuel for a go-around a career-limiting problem?

    Taken together, obviously nobody with a clue was in the cockpit that day. I would rank this accident as bad as the TransAsia GE235 “Oops, I shutdown the good engine” accident in Taipei.

    Botched Go-around Appears To Have Led to Emirates 777 Crash

    Posted in Tech, Travel | Leave a comment

    Farewell to Prince

    Disbelief at the death of Prince at the relatively young age of 57.

    Prince was a musical genius, certainly one of the giants of this century – he wrote, sang, was a virtuoso of 2 dozen instruments, and played guitar at the level of Jimi Hendrix.

    He could perform with everybody, or nobody, yet chose to mentor female musicians, introducing them by name in his shows.

    I saw one of his shows, but wish I had gone to more.

    For business reasons, he never allowed his catalog on YouTube, but there’s a few links from TV performances that indicate his brilliance and show him “bringing the funk”:

    Prince & 3RDEYEGIRL Perform ‘She’s Always In My Hair’
    Prince Saturday Night Live Full Performance (2014)
    Prince playing piano over ‘Summertime’ at Soundcheck, Koshien, Hyogo Prefecture (1990)
    PRINCE BET Interview with Tavis Smiley(1998)
    “Stand Back” – Stevie Nicks (writer/vocals) with Prince (synths/drum machine), inspired by Little Red Corvette

    cnn.com: Prince’s vision for lifting up black youths: Get them to code, Prince’s Death: Latest News
    W: Prince
    Nicole Scherzinger sings Purple Rain Tribute
    Mayte Garcia on the Prince you didn’t know

    Posted in Tech | Leave a comment

    Weekend of Earthquakes

    There were a few major earthquakes this weekend:

    • Ueki, Japan – 6.2 (foreshock)
    • Kumamoto, Japan – 7.0
    • Ecuador – 7.8

    Hundreds of aftershocks have occurred in Japan.

    You would think that Californians, of all people, would be concerned with earthquake safety, but the LA Times has reported on a building safety cover-up involving thousands of older schools and office buildings which will pancake in a major quake.

    How Risky Are Older Concrete Buildings?
    LA Times FAQ: Concrete buildings, earthquake safety and you
    Non-ductile Concrete Buildings

    Posted in Tech | Leave a comment

    Congrats to SpaceX on Ocean Landing

    I used to write telemetry collection software for the Space Shuttle, rockets and balloons, but even I watched the SpaceX barge landing with disbelief as the rocket smoothly rotated in all 11 or so degrees of freedom at the same time – no hesitation or staging before the touchdown.

    It was like watching a really big lawn dart plant itself. 🙂


    twitter: SpaceX
    HN: SpaceX Launch Livestream: CRS-8 Dragon Hosted Webcast
    theRegister: SpaceX finally lands Falcon rocket on robo-barge in one piece, SpaceX’s Musk: We’ll reuse today’s Falcon 9 rocket within 2 months

    Posted in Tech | Leave a comment

    MH370 Debris Illuminates Crash Reasons

    A few pieces of MH370, a Boeing 777-200ER, have recently been found on a Mozambique beach, and confirmed as authentic parts.

    Their excellent condition and relatively large sizes indicate that the accident wasn’t a high-speed impact with an obstacle or water.

    As a commercially-rated airplane pilot, my opinion is that leaves:

    1. explosion or decompression
    2. descent (or phugoid) into ocean at relatively low speed
    3. “graveyard spiral dive” pulled the wings off.

    An interesting question would be, “In modern airliners, especially Airbus, anti-phugoid software is deployed. How would that affect an uncontrolled airliner?” Sully said that anti-phugoid software in his Airbus prevented him from slowing descent before impacting the Hudson River.

    MH370 Debris Storm
    Tourist who found debris was searching for MH370
    Turbulence V-Speeds
    Australia Confirms Mozambique Debris Came from MH370
    MH370: Debris found in March ‘almost certainly’ from missing plane
    Investigators Report On MH370 Debris Analysis (2016)
    ATSB MH370 Report [pdf]

    ATSB Image

    Posted in Tech | Leave a comment

    Congrats to LIGO Team

    Congrats to the Laser Interferometer Gravitational-wave Observatory (LIGO) team for directly detecting gravitational waves for the first time.

    LIGO was the NSF’s most expensive project, and took scientists basically from the 1960’s to 2015 to fully realize – initially nobody believed it was possible to actually build this instrument.

    Two detector locations with perpendicular 4 km 4-mirror laser interferometers were able to detect gravitational waves from a billion-year-old black hole collision:

    Direct Gravitational Wave Measurement of Two Black Holes Merging in 1/10 of a second!

    The graph is very interesting and raises the questions:

    1. do the pre-impact lobes represent the outer limits (presumably thinner than the center) of the black holes merging?
    2. the slope of the lines in the main chirp are very steep … are those spikes much taller (or infinite) than we can resolve?
    3. what do the small post-impact lobes mean exactly?

    (The local speed of light in a medium is a constant, while gravitational waves distort space, thus changing an interference pattern.)
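To get a feel for how small that change in the interference pattern is, here is a rough back-of-the-envelope sketch. The 4 km arm length is from above; the peak strain of about 1e-21 and the proton diameter of about 1.7e-15 m are my assumptions for illustration, and delta_L = h * L is the simplest possible model, not LIGO's actual analysis.

```python
# Rough scale of the arm-length change a 4 km interferometer must resolve.
ARM_LENGTH_M = 4_000.0        # 4 km arms, from the post
PEAK_STRAIN = 1.0e-21         # assumed peak strain, order of magnitude only
PROTON_DIAMETER_M = 1.7e-15   # assumed: about twice the ~0.84 fm proton charge radius


def arm_length_change(strain: float, arm_length_m: float) -> float:
    """Differential arm-length change for a dimensionless strain (delta_L = h * L)."""
    return strain * arm_length_m


delta_l = arm_length_change(PEAK_STRAIN, ARM_LENGTH_M)
fraction_of_proton = delta_l / PROTON_DIAMETER_M

print(f"arm length change: {delta_l:.1e} m")
print(f"fraction of a proton diameter: {fraction_of_proton:.4f}")
```

Under those assumptions the mirrors move by a few thousandths of a proton diameter, which is why "sub-proton level" is not an exaggeration.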

    Basic science is always valuable, but just a few of the reasons why this experiment is important:

    1. confirm the equations originally proposed by Einstein in 1915 for gravitation in the General Theory of Relativity (GTR)
    2. confirm experimentally that light and gravitational waves have different propagation characteristics
    3. develop the technology to make observations at the sub-proton level
    4. study large-scale cosmic events (black holes, colliding galaxies, supernovae, binary star systems)
    5. study the time of the Big Bang, as gravitational waves are not filtered like EM waves
    6. confirm or deny cosmic observations and theories made in the EM spectrum, and provide advance notification of occurring events for study in the EM spectrum.

    More generally, measurement tools are the highest form of technology, whether for time, space, EM, or gravity. Any investment of time or money in measurement tools is easily repaid 1000x. For example, the GPS system is the result of accurate time measurement using “atomic clocks.”

    This decade is an exciting time for science, as several major terrestrial and space instruments come online or are upgraded.

    It will be interesting to see if anybody develops a table-top model of LIGO. Experiments in the 60’s with non-laser methods were susceptible to ambient vibrations, but we’ll see. Cryogenics would likely have to be involved since random atomic motions are larger than the signal being acquired.

    LIGO Detects Gravitational Waves for Third Time
    Gravitational Waves Detected 100 Years After Einstein’s Prediction
    W: LIGO
    Reddit AMA
    LIGO black hole echoes hint at general-relativity breakdown
    On the time lags of the LIGO signals HN
    Gravitational waves from a binary black hole merger observed by LIGO and Virgo
    LIGO Architects Win Nobel Prize in Physics
    LIGO and Virgo announce the detection of a black hole binary merger from June 8, 2017

    Posted in Tech | Leave a comment

    Superbowl 50

    I watched Superbowl 50 in Sunnyvale – a nice spring-like day with blue skies.

    Got a bonus show: I was just going inside as the Blue Angels did a low-altitude formation flyover, followed by a couple solo approaches, toward Levi’s Stadium.

    Denver Broncos over Carolina Panthers 24 – 10, with Denver leading the entire game.

    Cam Newton, QB for Carolina, got sacked, to varying degrees, 6 times. He sore-loser sulked during the post-game interviews, which generated a lot of controversy.

    Peyton Manning, the Broncos’ QB, won MVP, amidst the usual narcissistic drama of whether he’d retire on top, or not.

    The turf came under scrutiny, as some linebackers were literally sliding across it.

    Halftime Show

    Beyonce, looking thick, Bruno Mars, nice moves in a rubber suit, and Coldplay (woefully) performed. Must have been a nostalgic Brit on the halftime committee I guess.

    According to the media, Beyonce was doing a Black Power protest, but the show wasn’t particularly different than anything MJ or Janet did. And frankly, I wouldn’t blame her if she did.


    Most of the ads were forgettable.

    The Amazon ad with Baldwin and Marino was ok.

    There were a few annoying prescription ads, though the cartoon intestines with feet one was more than weird.

    Municipal Sports Stadium Corruption

    I’m local to the Levi’s Stadium, so am aware of the endless tales of corruption (lack of accounting to City Council, failure to make public service reimbursements, destruction of meeting notes and emails, mis-appropriation of a kids soccer park, etc.)

    But even I was surprised that the local transit authorities “privatized” the Caltrain and VTA Light Rail for the day, requiring a SuperBowl 50 ticket and special $40 ticket per passenger to use a taxpayer-funded system. Hmm.

    “Event Passengers Must Pre-Purchase VTA Fare Prior to Boarding

    All passengers traveling to the Super Bowl must use VTA’s mobile app, EventTIK to purchase a special VTA Super Bowl 50 Day Pass fare AND possess proof of a valid Super Bowl ticket in order to board the special Super Bowl trains.”

    Mr. York: next time, pay for your own damn stadium. You can afford it.

    Formation Flyover Photo

    Posted in San Jose Bay Area | Leave a comment

    Babbage’s Difference Engine at Computer History Museum

    Today was the last chance to see Babbage’s Difference Engine at the CHM in Mountain View before the owner makes it private again.

    The Computer History Museum has certainly matured into a world-class museum over the years.

    The docent talked for about 45 minutes. Unfortunately, it was displayed at the end of a hallway. So 100+ people with kids and strollers jostled to get a view.

    It’s very impressive in person: it consists of 8,000 parts, weighs five tons, and measures 11 feet long. It is moderately noisy and mesmerizing to watch. The cranker used a moderately strong rowing motion.

    Babbage, in building the first computer, did not have the hindsight to start with a smaller version first. Thus he never finished building a working model despite a decade of funding from the British government and the remaining days of his life working on it.

    CHM did a fantastic job on the DEC PDP-1 and IBM 1401 display rooms. Only about 50 PDP-1’s were made, so to have a working model is amazing.

    Posted in Tech | Leave a comment

    Congratulations on HondaJet USA Certification

    Congrats to Honda for earning FAA Production Certification for their first aircraft, the HondaJet HA-420 light business jet.

    I’ve been following the news of the HondaJet for over a decade as they progressed step-by-step towards certification.

    The HA-420 is the most technologically advanced, fastest (420 knots) and most efficient (by up to 20%) small business jet currently certified. Of interest to owner/operators, it may be flown single-pilot.

    The price is $4.5 million, which Honda can finance.

    The creation of the HondaJet is an epic story, starting with Honda’s founder dreaming of building an airplane several decades ago, and establishing design facilities 2 decades ago in the USA.

    A jet engine, the GE Honda HF120, was also certified for this plane.

    The total investment to certify both an airframe and an engine must have been staggering to get to this point. Only a multinational mfg. company with support from top executives like Honda can pull that off in peacetime.

    Even so, aviation is a tough business to make money in, especially as a new entrant.

    Japanese companies have a long history of interesting work in aerodynamics. Both the Battleship Yamato and Bullet Train used duck-bill shaped leading airfoils for significant drag reduction. The HondaJet developers likewise use laminar flow nose (see top photo) and wings, and winglets (see second photo.)

    According to a review by a friend of Philip Greenspun, the airplane has some issues: interior noise in the passenger compartment is 6 dB too high, there is only 573 pounds of useful load with full fuel, and a 4000′ runway is needed. Also, a lot of pilot ergonomics that should have made it in, didn’t. Finally, the high price is comparable to the next class up, which are much roomier and have more comfortable useful loads.

    yt: Kenny G Live at the HondaJet TC Event with Mr. Fujino
    HondaJet FAA Type Certification Celebration
    avweb.com: HondaJet Wins FAA Certification
    HondaJet Nominated for 2015 Collier Trophy
    W: HondaJet
    philip.greenspun.com: HondaJet Pilot Review
    HondaJet Makes Chinese Debut at ABACE Show
    HondaJet Flight Trial 2017 [Video]
    Honda Aircraft Adds Canada to HondaJet Approvals
    HondaJet Gets Brazilian Nod
    HondaJet, Phenom 300 Spar for Top Light Jet Sales Title

    Posted in Tech, Toys | Leave a comment

    TAP Plastics Mountain View

    Although I’ve walked by TAP Plastics on Castro St. in Mountain View a hundred times, today was the first time I went inside.

    Their motto “the fantastic plastic place” is accurate.

    They have specialized in plastics sales since 1952 and have 21 stores.

    • marketing, signs and displays
    • collectibles displays
    • marine
    • fiberglass laminate supplies
    • custom design (linear, not vacuum forming)

    Their web site is a gem, supporting 9 languages using Google Translate.

    TAP Plastics Inc.
    312 Castro Street
    Mountain View, CA 94041

    Posted in San Jose Bay Area | Leave a comment

    HOWTO: CentOS 7/Redhat 7 Firewalld Setup for Cassandra Server

    How to do initial firewalld configuration for Cassandra Server and Opscenter on CentOS/Redhat 7 with 2 network interfaces, in my case Dell 1950/2950.

    1. Verify that your network interfaces are assigned to a firewalld zone (the ZONE= entry in each ifcfg file):

    # grep -i zone /etc/sysconfig/network-scripts/ifcfg-*
    # service network restart

    2. Add the Cassandra ports to the internal zone (private interface) and the web/OpsCenter ports to the public zone (public interface):


    # add ports on internal interface for Cassandra server

    firewall-cmd --zone=internal --add-port=7000/tcp --add-port=7199/tcp --add-port=9042/tcp --add-port=9160/tcp --add-port=61619-61621/tcp --permanent

    # add ports on public interface for HTTP and OpsCenter

    firewall-cmd --zone=public --add-port=80/tcp --add-port=8888/tcp --permanent

    firewall-cmd --reload

    Edit the files in /etc/firewalld/zones to remove the desktop helper services, then do

    service firewalld restart
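If you have more than a couple of servers, it can help to keep the zone-to-port mapping in one place and generate the commands from it. Here is a minimal sketch that does a dry run only: it prints the same firewall-cmd invocations used above and uses only the flags already shown (--zone, --add-port, --permanent, --reload). The function names are mine, not part of firewalld.

```python
import shlex

# Zone -> ports, copied from the firewall-cmd lines in this post.
ZONE_PORTS = {
    "internal": ["7000/tcp", "7199/tcp", "9042/tcp", "9160/tcp", "61619-61621/tcp"],
    "public": ["80/tcp", "8888/tcp"],
}


def add_port_command(zone, ports):
    """Build one permanent firewall-cmd invocation for a zone as an argv list."""
    argv = ["firewall-cmd", f"--zone={zone}"]
    argv += [f"--add-port={port}" for port in ports]
    argv.append("--permanent")
    return argv


def all_commands(zone_ports):
    """Build the add-port command for every zone, followed by a single reload."""
    commands = [add_port_command(zone, ports) for zone, ports in zone_ports.items()]
    commands.append(["firewall-cmd", "--reload"])
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in all_commands(ZONE_PORTS):
        print(shlex.join(argv))
```

Printing rather than executing is deliberate: you can review the output, then paste it into a root shell or feed it to your config management tool.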

    3. Verify configuration:

    firewall-cmd --get-active-zones
    firewall-cmd --zone=public --list-ports
    firewall-cmd --zone=public --list-services
    firewall-cmd --zone=internal --list-ports
    firewall-cmd --zone=internal --list-services

    Output is:

    # firewall-cmd --get-active-zones
    interfaces: enp4s0
    interfaces: enp8s0

    # firewall-cmd --zone=internal --list-ports
    7000/tcp 7199/tcp 9042/tcp 9160/tcp 61619-61621/tcp

    # firewall-cmd --zone=internal --list-services

    # firewall-cmd --zone=public --list-ports
    80/tcp 8888/tcp

    # firewall-cmd --zone=public --list-services

    4. Verify firewall rules with nmap:

    # nmap -sS my.external.interface.com

    Starting Nmap 5.51 ( http://nmap.org ) at 2015-10-15 22:34 PDT
    Nmap scan report for my.external.interface.com
    Host is up (0.075s latency).
    Not shown: 997 filtered ports
    22/tcp open ssh
    80/tcp open http
    8888/tcp open opscenter

    Nice! 🙂
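The nmap check is easy to automate. This is a small sketch, not a full nmap parser: it only understands the "PORT/tcp open SERVICE" lines shown above and compares them with the ports you meant to expose. The sample text is copied from the scan in this post; the function names are made up for illustration.

```python
import re

# Sample text copied from the nmap run in this post.
SAMPLE_NMAP_OUTPUT = """\
Not shown: 997 filtered ports
22/tcp open ssh
80/tcp open http
8888/tcp open opscenter
"""

EXPECTED_OPEN = {22, 80, 8888}  # ssh plus the two public-zone ports

# Matches lines such as "8888/tcp open opscenter".
PORT_LINE = re.compile(r"^(\d+)/tcp\s+open\b")


def open_tcp_ports(nmap_text):
    """Return the set of TCP port numbers that nmap reported as open."""
    ports = set()
    for line in nmap_text.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            ports.add(int(match.group(1)))
    return ports


def unexpected_ports(nmap_text, expected):
    """Return open ports that are not in the expected set."""
    return open_tcp_ports(nmap_text) - set(expected)


if __name__ == "__main__":
    extra = unexpected_ports(SAMPLE_NMAP_OUTPUT, EXPECTED_OPEN)
    print("unexpected open ports:", sorted(extra))
```

An empty result means the public interface exposes only what you intended.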


    As always, if you experience network issues on linux, disable selinux, firewalld and TCP wrappers first and verify if those are the source of the problem:

    setenforce 0
    service firewalld stop
    cat /etc/hosts.*

    To boot into singleuser mode, replace the linux grub line “ro” item with “rw init=/sysroot/bin/sh”.

    Fedora introduces Network Zones
    fedoraproject.org: Network Zones

    Posted in Cassandra, Linux, Open Source, Storage, Tech | Leave a comment

    Notes on Virtualbox 4.3.30 and OS X 10.8.5 for CentOS 7

    Virtualbox 4.3.30 on OS X 10.8.5 with CentOS 7 guest VMs works OK on my notebook for web development, but setup was a little fussy.

    I use VMs for:

    1. general web development and testing, to stay off the production environment
    2. destructive performance testing (intrusive changes to source code and configurations that require VM rollback to undo, most of which will never be committed.) This is great for work on profiling, i18n, caching, mod_rewrite rules, etc.
    3. accelerating automation testing, since a VM can boot in 10 seconds on my Mac with SSD, and VM creation is scriptable. This is a huge win.
    4. working offline (no-Wifi areas.)


    • “Host” is your Mac notebook. It runs Virtualbox under Mac OS X.
    • “Guest” is the VM running under Virtualbox. A guest can be any operating system, but in this case we’re using CentOS 7.x.

    Getting Started

    • check Internet for known software issues first
    • update to the latest version of Virtualbox

    Choose Network Topology

    I wanted to run my web site in a VM, viewable from the Mac browser, and have the VM be able to run ‘yum update’, so I needed host => guest and guest => Internet routing. There are 2 networking choices that match those requirements:

    1. Bridged – easiest and works best if a Mac network adapter is always connected, like in the office, or at home if your Wifi access point is always on
    2. NAT – always works, but you have to NAT from host => guest (ie. => You can use Mac’s ipfw or ipf firewalls to then NAT from 80 to 8000, making it seamless:

      sudo ipfw add 100 fwd,8080 tcp from any to any 80 in


    • under “Machine … Settings”, choose “Bridged Adapter”
    • guest IP address will come from Virtualbox DHCP server, usually the guest IP address is
    • on the host, you just use the guest’s real IP address from above
    • if you bridge to the Airport interface (en0), and the host Wifi is off, you lose your guest lease (ie. no routing inside or outside guest VM)
    • binds to a host’s physical interface (conceptually speaking)
    • no NAT needed or available in Virtualbox settings
    • the Virtualbox DHCP address is 192.168.x.100


    • under “Machine … Settings”, just choose NAT, not “NAT Network”
    • guest IP address will come from Virtualbox DHCP server, usually or
    • host IP address will be (NATTed to guest address above)
    • click on “Port Forwarding” button and use host ports above 1024 (usually 2222 for ssh and 8000 for HTTP)
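Since VM creation is scriptable, the port forwarding can be scripted too instead of clicked. Here is a sketch that builds VBoxManage modifyvm commands with --natpf1 rules for the 2222 and 8000 host ports mentioned above. The VM name "centos7-dev", the rule names, and the guest ports 22 and 80 are my assumptions for illustration; it only prints the commands.

```python
import shlex


def natpf_rule(name, host_port, guest_port, protocol="tcp"):
    """Format a NAT port-forwarding rule: name,proto,hostip,hostport,guestip,guestport.

    Host and guest IP fields are left empty.
    """
    return f"{name},{protocol},,{host_port},,{guest_port}"


def modifyvm_commands(vm_name, rules):
    """Build one 'VBoxManage modifyvm' argv list per forwarding rule."""
    return [["VBoxManage", "modifyvm", vm_name, "--natpf1", rule] for rule in rules]


VM_NAME = "centos7-dev"  # hypothetical VM name
RULES = [
    natpf_rule("guestssh", 2222, 22),   # host 2222 -> guest ssh (assumed port 22)
    natpf_rule("guesthttp", 8000, 80),  # host 8000 -> guest HTTP (assumed port 80)
]
COMMANDS = modifyvm_commands(VM_NAME, RULES)

if __name__ == "__main__":
    # Dry run: print the commands; run them while the VM is powered off.
    for argv in COMMANDS:
        print(shlex.join(argv))
```

Keeping the rules as data makes it easy to stamp out the same forwarding on every cloned VM.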


    • the Virtualbox manual is a reference, not a tutorial. After reading this blog post, the manual is useful to fill in details.
    • disable CentOS 7 firewall with ‘service firewalld stop’
    • view CentOS 7 interfaces with ‘ip a’
    • if one networking topology doesn’t work for you, try another. No need to reboot the VM.
    • if you spend more than an hour without success, try VMware Fusion. It covers my use case automatically.


    • do ‘tail -f /var/log/messages’, disable “Cable Connected”, click “OK”, and watch as DHCP lease is lost. Then click on “Cable Connected”, click “OK” to restore
    • if using Bridged on en0, do ‘tail -f /var/log/messages’, do “Turn Wi-fi Off” on Mac, and watch as DHCP lease is lost. Then turn Wifi back on.

    Network Security

    • use strong passwords if you value what’s inside the VM
    • enable guest firewall with ‘service firewalld start’
    • TCP wrappers is an easy and effective filtering method



      ALL: ALL

    Simulating Production

    You can update /etc/hosts to have your browser access your web site in a VM:


    # NAT www.mysite.com
    # Bridged www.mysite.com

    But I find that Firefox gets less confused with permanent redirects, etc. by prefixing the hostname:


    # Virtualbox NAT Topology (don't forget to use ports 2222 and 8000 from host => guest!)
    # www.test-mysite.com
    # Virtualbox Bridged Topology
    # www.test-mysite.com
    # www.test-mysite.com


    Take advantage of Virtualbox’s clone and snapshot features.

    forums.virtualbox.org: What does “Cable connected” checkbox change?
    Port Forwarding in Mac OSX Mavericks
    Port Forwarding in Mac OS Yosemite

    Posted in Linux, Open Source, Oracle, Tech | Leave a comment

    Percona Clustercheck Improved Error Handling Patch

    Here’s my Github pull request for improved error handling in Percona’s clustercheck utility, used by haproxy for health-checking a Percona XtraDB Cluster.

    It adds two features:

    1. 401 Unauthorized response for failed authentication
    2. 404 Not Found response if the mysql program can’t be found

    The error detection is done in a low-latency manner using PIPESTATUS, without an additional database connection. Here is colored diff output.
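To illustrate the decision order, here is a rough Python sketch of how a health check like this might map outcomes to HTTP responses. It is not the code from the pull request (that is bash and uses PIPESTATUS). The 200/503 branches and the "wsrep_local_state of 4 means synced" check are my assumptions about the stock behavior; only the 401 and 404 cases come from the patch description above.

```python
SYNCED = 4  # assumed: Galera wsrep_local_state value for a synced node


def health_status(mysql_found, auth_ok, wsrep_local_state):
    """Map check outcomes to an (HTTP status, reason) pair, most specific error first."""
    if not mysql_found:
        return 404, "Not Found"            # mysql client binary is missing
    if not auth_ok:
        return 401, "Unauthorized"         # credentials were rejected
    if wsrep_local_state == SYNCED:
        return 200, "OK"                   # node is usable by the load balancer
    return 503, "Service Unavailable"      # node is up but not synced


def http_response(status, reason, body):
    """Render a minimal HTTP/1.1 response that a load balancer can parse."""
    length = len(body.encode("utf-8"))
    return (
        f"HTTP/1.1 {status} {reason}\r\n"
        "Content-Type: text/plain\r\n"
        "Connection: close\r\n"
        f"Content-Length: {length}\r\n"
        "\r\n"
        f"{body}"
    )


if __name__ == "__main__":
    status, reason = health_status(mysql_found=True, auth_ok=False, wsrep_local_state=None)
    print(http_response(status, reason, "Authentication failed.\r\n"))
```

The point of the extra status codes is that an operator reading haproxy logs can tell a broken check apart from an unhealthy node.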

    Posted in API Programming, Linux, MySQL, MySQL Cluster, Open Source, Tech | Leave a comment

    Percona MySQL Conference 2015

    Wed. Keynotes


    5 companies
    Facebook, Google, Alibaba, Twitter
    Percona and MariaDB
    Please use apache CLA

    MariaDB.com CEO
    Multisource replication from Taobao
    Spider Sharding
    Atomic writes with Fusion IO/Sandisk
    18% faster with 1/4 writes
    CONNECT Storage Engine for Federation
    Galera integrated
    Encryption by Google – tablespace and table
    Amazon Aurora
    Maxscale proxy

    Tomas Ulin, Oracle
    MySQL 5.7
    – Optimizer improvements
    – JSON support in pipeline
    – SYS
    – GIS rewrite
    – Innodb improvements
    – native partitions. Bug fixes and transportable tablespaces
    – dynamic buffer pool size
    – group replication
    – Fabric 1.5
    – Workbench 6.3
    – MySQL Cluster 7.4 GA ???

    Robert Hodges, Continuent/VMware
    – “VMware is creating a new kind of hybrid cloud”
    – vSphere 6 FT – cpu/ram mirror over 10 Gb Ethernet up to 4 vcpus
    – but maintenance still needs Continuent
    – InformationWeek 2014 db popularity
    – Tungsten replication to Vertica, Redshift, Oracle, Hadoop


    Lightning Talks


    Percona Acquires Tokutek!

    Posted in Conferences, MySQL, Open Source, Storage, Tech, Toys | Leave a comment

    SVLUG: Daniel Klopp on Docker

    At the Silicon Valley Linux Users Group (SVLUG) tonight, Daniel Klopp, Senior Technical Consultant, Taos Consulting, gave an intermediate talk on “Docker.”

    He had some really informative and detailed slides on using Docker, especially his cgroup commands samples.

    Some of the interesting things he mentioned were:

    1. cgroups are nested
    2. Docker currently has a limit of 127 “layers”, with prior layers appearing to be read-only to the current layer
    3. Docker is high-level enough to run on multiple operating systems, including both linux and windows

    Daniel Klopp

    One attendee mentioned that a work-around for the insecure nature of Docker is to combine it with SELinux, though that will involve a fair amount of work.

    Over 400 people RSVPed on a related Meetup, and over 150 people attended, a record for this decade.

    Pasta Spread

    Great turnout!

    Salad, meat lasagna, pasta alfredo, veggie lasagna from Taos!

    Thanks to Taos for providing food for all. Taos has job postings for sys admin, network admin, devops and help desk IT persons.

    Thanks to Symantec once again for hosting the event.

    Posted in API Programming, Cloud, Linux, Open Source, Tech, User Groups | Leave a comment

    Top Utility for Cassandra Clusters – cass_top

    DataStax’s OpsCenter is pretty, but sometimes you don’t want to chop holes in your firewall for the server and agents.

    So I wrote cass_top. It works like top, but colorizes the output of nodetool status. It also lets you build nodetool commands using menus, run and log the output.

    What’s especially nice is that it uses bash (no python required), and uses minimal screen real estate, so you can view all your clusters on one monitor using eterms.

    $ cass_top

    cass_top Screenshot
    cass_top Help Screenshot
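For readers curious about the colorizing idea, here is a tiny Python illustration. To be clear, cass_top itself is bash and this is not its code. The sketch assumes nodetool status node lines start with a two-letter code (U/D for up/down, then N/L/J/M for the state), and the color choices are mine.

```python
GREEN, YELLOW, RED, RESET = "\033[32m", "\033[33m", "\033[31m", "\033[0m"


def colorize_status_line(line):
    """Wrap a 'nodetool status' node line in an ANSI color; pass other lines through."""
    code = line[:2]
    is_node_line = (
        len(code) == 2
        and code[0] in "UD"                         # Up / Down
        and code[1] in "NLJM"                       # Normal / Leaving / Joining / Moving
        and (len(line) == 2 or line[2].isspace())   # code must be followed by whitespace
    )
    if not is_node_line:
        return line
    if code == "UN":
        color = GREEN    # healthy
    elif code[0] == "U":
        color = YELLOW   # up, but leaving/joining/moving
    else:
        color = RED      # down
    return f"{color}{line}{RESET}"


if __name__ == "__main__":
    import sys

    # Usage idea: nodetool status | python colorize.py
    for raw in sys.stdin:
        print(colorize_status_line(raw.rstrip("\n")))
```

Header lines such as "Datacenter: datacenter1" do not match the two-letter pattern, so they are printed unchanged.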

    Please leave a comment with your suggestions.

    github: Cassandra Top cass_top

    Posted in Cassandra, Linux, Storage, Tech, Toys | Leave a comment

    MariaDB Patch: CREATE [[NO] FORCE] VIEW Options

    Below is my patch that implements the CREATE [[NO] FORCE] VIEW options against MySQL/MariaDB 10.1.0.

    It adds two new options that look like this:

    1. CREATE NO FORCE VIEW v1 AS SELECT * FROM TABLE1; — base TABLE1 must exist, as before
    2. CREATE FORCE VIEW v1 AS SELECT * FROM TABLE1; — base TABLE1 doesn’t need to exist


    • these options follow the Oracle Enterprise options fairly closely. NO FORCE works like the old default – a user needs database, table, column access and CREATE VIEW grant to create a view (more or less). FORCE allows a user to create a view with only database access and CREATE VIEW grant and no underlying base table. At SELECT time, full access control and grant checking is performed, and an error will occur if those constraints are not met.
    • views are more complicated than one would expect, and can be composed of base tables, derived tables, INFORMATION_SCHEMA (IS), and other views. The only table object not allowed is a temporary table
    • CREATE FORCE VIEW is an important option when managing large sets of views when you don’t want to track the creation sequence, or when creating views via program. An example is mysqldump, which can be simplified by replacing the current temporary tables ordering workarounds with FORCE VIEW.
    • It’s a fairly solid patch. I think the best thing is to commit it to alpha and let it bake for a while.
    • One permutation that will need special handling is this: CREATE FORCE VIEW view1 AS SELECT * FROM table1; Since * is not resolved to column names by FORCE, currently ” AS SELECT * AS ” is generated, causing an error. So just use explicit column names like CREATE FORCE VIEW view1 AS SELECT id, col1, col2 FROM table1; See this bug.
    • it passes t/view.test:
      # ./mysql-test-run.pl view
      Logging: ./mysql-test-run.pl  view
      vardir: /usr/local/mariadb-10.1.0/mysql-test/var
      MariaDB Version 10.1.0-MariaDB-debug
      TEST                                  RESULT   TIME (ms) or COMMENT
      main.view                            [ pass ]   1896
      The servers were restarted 0 times
      Spent 1.896 of 7 seconds executing testcases
      Completed: All 1 tests were successful.
    • I wrote tests/view.pl which does 8,000+ test permutations. It passes. 🙂
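If you generate views from a program, the explicit-column workaround above is easy to enforce in the generator. A minimal sketch follows; the helper name is made up, the statement only works on a server built with this patch, and identifiers are assumed to come from trusted code (no quoting or escaping is done here).

```python
def create_force_view_sql(view, table, columns):
    """Build a CREATE FORCE VIEW statement with explicit column names.

    Rejects '*' because FORCE does not resolve it to column names.
    """
    if not columns or "*" in columns:
        raise ValueError("list explicit column names; '*' is not resolved by FORCE")
    column_list = ", ".join(columns)
    return f"CREATE FORCE VIEW {view} AS SELECT {column_list} FROM {table};"


if __name__ == "__main__":
    print(create_force_view_sql("view1", "table1", ["id", "col1", "col2"]))
```

Because FORCE skips the base-table check, a script can emit these statements in any order and let SELECT-time checking catch real problems.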

    $ cat create_force_view.patch

    --- ../mariadb-10.1.0/sql/sql_view.h 2014-06-27 04:50:36.000000000 -0700
    +++ sql/sql_view.h 2014-09-02 02:35:42.000000000 -0700
    @@ -29,10 +29,10 @@
    /* Function declarations */

    bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
    - enum_view_create_mode mode);
    + enum_view_create_mode mode, enum_view_create_force force);

    bool mysql_create_view(THD *thd, TABLE_LIST *view,
    - enum_view_create_mode mode);
    + enum_view_create_mode mode, enum_view_create_force force);

    bool mysql_make_view(THD *thd, File_parser *parser, TABLE_LIST *table,
    uint flags);
    --- ../mariadb-10.1.0/sql/sql_lex.h 2014-06-27 04:50:33.000000000 -0700
    +++ sql/sql_lex.h 2014-09-02 01:21:10.000000000 -0700
    @@ -170,6 +170,12 @@
    VIEW_CREATE_OR_REPLACE // check only that there are not such table

    +enum enum_view_create_force
    + VIEW_CREATE_NO_FORCE, // default - check that there are not such VIEW/table
    + VIEW_CREATE_FORCE, // check that there are not such VIEW/table, then ignore table object dependencies
    enum enum_drop_mode
    DROP_DEFAULT, // mode is not specified
    @@ -2442,6 +2448,7 @@
    enum enum_var_type option_type;
    enum enum_view_create_mode create_view_mode;
    + enum enum_view_create_force create_view_force;
    enum enum_drop_mode drop_mode;

    uint profile_query_id;
    --- ../mariadb-10.1.0/sql/sql_parse.cc 2014-06-27 04:50:34.000000000 -0700
    +++ sql/sql_parse.cc 2014-09-02 02:34:31.000000000 -0700
    @@ -4943,7 +4943,7 @@
    Note: SQLCOM_CREATE_VIEW also handles 'ALTER VIEW' commands
    as specified through the thd->lex->create_view_mode flag.
    - res= mysql_create_view(thd, first_table, thd->lex->create_view_mode);
    + res= mysql_create_view(thd, first_table, thd->lex->create_view_mode, thd->lex->create_view_force);
    --- ../mariadb-10.1.0/sql/sql_yacc.yy 2014-06-27 04:50:37.000000000 -0700
    +++ sql/sql_yacc.yy 2014-09-05 17:19:29.000000000 -0700
    @@ -1851,7 +1851,7 @@
    statement sp_suid
    sp_c_chistics sp_a_chistics sp_chistic sp_c_chistic xa
    opt_field_or_var_spec fields_or_vars opt_load_data_set_spec
    - view_algorithm view_or_trigger_or_sp_or_event
    + view_algorithm view_or_trigger_or_sp_or_event view_force_option
    definer_tail no_definer_tail
    view_suid view_tail view_list_opt view_list view_select
    view_check_option trigger_tail sp_tail sf_tail udf_tail event_tail
    @@ -2446,6 +2446,7 @@
    Lex->create_view_algorithm= DTYPE_ALGORITHM_UNDEFINED;
    Lex->create_view_suid= TRUE;
    + Lex->create_view_force= VIEW_CREATE_NO_FORCE; /* initialize just in case */
    @@ -15887,6 +15888,15 @@
    | event_tail

    + /* empty */ /* 411 - is there a cleaner way of initializing here? */
    + { Lex->create_view_force = VIEW_CREATE_NO_FORCE; }
    + { Lex->create_view_force = VIEW_CREATE_NO_FORCE; }
    + | FORCE_SYM
    + { Lex->create_view_force = VIEW_CREATE_FORCE; }
    + ;

    DEFINER clause support.
    @@ -15944,7 +15954,7 @@

    - view_suid VIEW_SYM table_ident
    + view_suid view_force_option VIEW_SYM table_ident
    LEX *lex= thd->lex;
    lex->sql_command= SQLCOM_CREATE_VIEW;
    --- ../mariadb-10.1.0/sql/sql_view.cc 2014-06-27 04:50:36.000000000 -0700
    +++ sql/sql_view.cc 2014-09-05 19:33:58.000000000 -0700
    @@ -248,7 +248,7 @@

    bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
    - enum_view_create_mode mode)
    + enum_view_create_mode mode, enum_view_create_force force)
    LEX *lex= thd->lex;
    /* first table in list is target VIEW name => cut off it */
    @@ -259,7 +259,7 @@

    - Privilege check for view creation:
    + Privilege check for view creation with default (NO FORCE):
    - user has CREATE VIEW privilege on view table
    - user has DROP privilege in case of ALTER VIEW or CREATE OR REPLACE
    @@ -272,6 +272,7 @@
    checked that we have not more privileges on correspondent column of view
    table (i.e. user will not get some privileges by view creation)
    if ((check_access(thd, CREATE_VIEW_ACL, view->db,
    @@ -285,6 +286,11 @@
    check_grant(thd, DROP_ACL, view, FALSE, 1, FALSE))))
    goto err;

    + if (force) {
    + res = false;
    + DBUG_RETURN(res || thd->is_error());
    + }
    for (sl= select_lex; sl; sl= sl->next_select())
    for (tbl= sl->get_table_list(); tbl; tbl= tbl->next_local)
    @@ -369,7 +375,7 @@

    bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
    - enum_view_create_mode mode)
    + enum_view_create_mode mode, enum_view_create_force force)
    return FALSE;
    @@ -391,7 +397,7 @@

    bool mysql_create_view(THD *thd, TABLE_LIST *views,
    - enum_view_create_mode mode)
    + enum_view_create_mode mode, enum_view_create_force force)
    LEX *lex= thd->lex;
    bool link_to_local;
    @@ -425,14 +431,13 @@
    goto err;

    - if ((res= create_view_precheck(thd, tables, view, mode)))
    + if (res= create_view_precheck(thd, tables, view, mode, force))
    goto err;

    lex->link_first_table_back(view, link_to_local);
    view->open_type= OT_BASE_ONLY;

    - if (open_temporary_tables(thd, lex->query_tables) ||
    - open_and_lock_tables(thd, lex->query_tables, TRUE, 0))
    + if (open_temporary_tables(thd, lex->query_tables) || (!force && open_and_lock_tables(thd, lex->query_tables, TRUE, 0)))
    view= lex->unlink_first_table(&link_to_local);
    res= TRUE;
    @@ -513,6 +518,7 @@

    +if (!force) {
    /* prepare select to resolve all fields */
    lex->context_analysis_only|= CONTEXT_ANALYSIS_ONLY_VIEW;
    if (unit->prepare(thd, 0, 0))
    @@ -612,6 +618,7 @@

    res= mysql_register_view(thd, view, mode);

    @@ -621,7 +628,7 @@
    meta-data changes after ALTER VIEW.

    - if (!res)
    + // if (!res)
    + if (!res && !force) /* 411 - solves segfault problems with CREATE FORCE VIEW option sometimes */
    tdc_remove_table(thd, TDC_RT_REMOVE_ALL, view->db, view->table_name, false);

    if (mysql_bin_log.is_open())
    @@ -908,6 +915,8 @@
    fn_format(path_buff, file.str, dir.str, "", MY_UNPACK_FILENAME);
    path.length= strlen(path_buff);

    if (ha_table_exists(thd, view->db, view->table_name, NULL))
    if (mode == VIEW_CREATE_NEW)
    --- ../mariadb-10.1.0/mysql-test/t/view.test 2014-06-27 04:50:30.000000000 -0700
    +++ mysql-test/t/view.test 2014-09-06 00:23:32.000000000 -0700
    @@ -5263,4 +5263,17 @@
    --echo # -----------------------------------------------------------------
    --echo # -- End of 10.0 tests.
    --echo # -----------------------------------------------------------------
    +create no force view v1 as select 1;
    +drop view if exists v1;
    +create force view v1 as select 1;
    +drop view if exists v1;
    +create force view v1 as select * from missing_base_table;
    +drop view if exists v1;
    +--echo # -----------------------------------------------------------------
    +--echo # -- End of 10.1 tests.
    +--echo # -----------------------------------------------------------------
    SET optimizer_switch=@save_optimizer_switch;

    Posted in API Programming, Linux, MySQL, Open Source, Oracle, Storage, Tech | 1 Comment

    Installing Datastax Cassandra and Python Driver on CentOS 5

    Cassandra can run on CentOS 5.x, but there is no yum repo support.

    If you can’t upgrade linux distros, here’s how to install Datastax Cassandra Community Edition and the python cassandra driver on CentOS 5.x.

    It’s not difficult, but there are several steps, including updating Java.

    (The following steps would make a complete chef or puppet recipe for a non-SSL install with vnodes.)

    # setup environment
    groupadd -g 602 cassandra
    useradd -u 602 -g cassandra -m -s /sbin/nologin cassandra
    mkdir /var/lib/cassandra /var/log/cassandra /var/run/cassandra
    touch /var/log/cassandra/system.log
    chown -R cassandra:cassandra /var/lib/cassandra /var/log/cassandra /var/run/cassandra
    mkdir -p /opt && cd /opt

    cat >> /etc/security/limits.conf <<EOD
    cassandra soft memlock unlimited
    cassandra hard memlock unlimited
    cassandra soft nofile 8192
    cassandra hard nofile 10240
    EOD

    # upgrade java
    yum remove java
    # download, then install JDK 7.x from oracle.com
    rpm -Uvh jdk-7u67-linux-x64.rpm
    # download, then install recent jna.jar from https://github.com/twall/jna
    mv jna.jar /usr/share/java
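    # note: /opt/cassandra is created in the "get Datastax DCE" step below, so run this ln after that step
    ln -s /usr/share/java/jna.jar /opt/cassandra/lib/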
    # update envariables
    cat >> /etc/profile <<"EOD"
    export JAVA_HOME=/usr/java/default
    export JRE_HOME=/usr/java/default/jre
    export CASSANDRA_HOME=/opt/cassandra
    EOD

    # get Datastax DCE
    curl -L http://downloads.datastax.com/community/dsc.tar.gz >dsc-cassandra-2.0.9.tar.gz
    tar zxvf - < dsc-cassandra-2.0.9.tar.gz
    ln -s /opt/dsc-cassandra-2.0.9 /opt/cassandra
    chown -R root:root /opt/cassandra/
    bash cassandra/switch_snappy 1.0.4

    # open cassandra firewall ports if necessary (not needed if using internal interface on most servers)
    vi /etc/sysconfig/iptables
    # add this rule to the file (the multiport match takes --dports for a port list):
    -A INPUT -i eth0 -m state --state NEW -m multiport -p tcp --dports 7000,7199,9042,9160 -j ACCEPT
    service iptables restart
    # configure /opt/cassandra/conf/cassandra.yaml (at least listen_address, rpc_address, seeds and tokens) before starting the server.
    # If you need a do-over, clean the cassandra data with: rm -fr /var/lib/cassandra/*

    # download startup script:
    wget http://jebriggs.com/php/start_cassandra.txt -O /etc/init.d/cassandra
    chown root:root /etc/init.d/cassandra
    chmod 755 /etc/init.d/cassandra
    chkconfig --add cassandra

    # start cassandra server (if it is standalone, or a seed server. otherwise start after the seed servers):
    service cassandra start

    # cat /etc/redhat-release 
    CentOS release 5.10 (Final)
    [root@www1 conf]# nodetool status
    Datacenter: datacenter1
    |/ State=Normal/Leaving/Joining/Moving
    --  Address   Load       Tokens  Owns   Host ID                               Rack
    UN  71.87 KB   256     66.8%  8302c6d5-4c88-4695-bbf4-762bc7f24544  rack1
    UN  136.63 KB  256     69.9%  eddb03b2-98d3-46ff-be63-95435414a883  rack1
    UN  100.08 KB  256     63.3%  2a8dde5e-29b0-4a67-8204-40769376c44a  rack1

    If you only see the node on localhost, then you have a problem:

    • read and fix any errors in /var/log/cassandra/system.log until there are zero errors. snappy-related errors are from /tmp being noexec or not running the switch_snappy 1.0.4 command above.
    • disable iptables firewall, test and reenable later
    • in log4j-server.properties, increase log4j.rootLogger to DEBUG
    • if you have multiple NICs, JMX (i.e. nodetool) can bind to the wrong interface. You likely need to set the -Djava.rmi.server.hostname=[address] option in cassandra-env.sh to the address you want to listen on
    • public/private IP address problems in AWS EC2. You may need to set broadcast_address: [public_ec2_address]
    • normally rmiregistry is not needed unless you have some atypical firewalling or routing (NAT).
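
    When ruling out firewall problems, it can help to test the Cassandra ports from another node before digging through the logs. Here is a minimal Python sketch, assuming the port list from the iptables rule above. The function name check_ports is made up for illustration and is not part of Cassandra or the Python driver:

```python
# check_ports.py - rough reachability test for Cassandra ports (illustrative sketch)
import socket

# ports taken from the iptables rule above
CASSANDRA_PORTS = [7000, 7199, 9042, 9160]

def check_ports(host, ports=CASSANDRA_PORTS, timeout=2.0):
    """Return a dict of port -> True if a TCP connection to host:port succeeded."""
    results = {}
    for port in ports:
        try:
            sock = socket.create_connection((host, port), timeout)
            sock.close()
            results[port] = True
        except (socket.error, socket.timeout):
            # refused, filtered or timed out
            results[port] = False
    return results
```

    Run check_ports() against each node's address from one of the other nodes. A port that shows False from a remote node but True on the node itself points at iptables or the listen address rather than at Cassandra.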

    Datastax Opscenter 5.0

    You can install the binary from yum or tarball, but the important things to know are:

    • the monitoring agent will be installed on each cassandra node and uses port 61621. The init script is called datastax-agent.
    • the UI only needs to be installed once, but needs ports 61620, and 8888 for HTTP.
    • to allow Opscenter to remotely manage nodes with ssh, remove old ssh entries from .ssh/known_hosts first, connect manually to each node, then Opscenter should be happy
    • by default, Opscenter listens for agents on, phones home to Datastax.com each day, and does not require web authentication, so you likely want to change those.

    Python also needs to be upgraded if you want to use cqlsh or the python client cassandra driver.

    # install python 2.6 and dependencies
    yum install gcc python26 python26-devel libev libev-devel

    # install python's pip module
    curl --silent --show-error --retry 5 https://bootstrap.pypa.io/get-pip.py | python26

    # install cassandra driver for python
    pip install cassandra-driver

    # install blist.py
    tar zxvf - < blist-1.3.6.tar.gz
    cd blist-1.3.6
    python26 setup.py install
    cd ..

    # cluster.py - test installation
    from cassandra.cluster import Cluster
    cluster = Cluster([''])
    def dump(obj):
       # print every attribute of obj, one per line
       for attr in dir(obj):
           if hasattr( obj, attr ):
               print( "obj.%s = %s" % (attr, getattr(obj, attr)))
    dump(cluster)

    # python26 cluster.py
    obj.__class__ = <class 'cassandra.cluster.Cluster'>

    Troubleshooting connection problems in JConsole
    datastax.com: Storing OpsCenter Data in a Separate Cluster

    Posted in Cassandra, Cloud, Linux, Open Source, Tech | Leave a comment

    MySQL 5.6 Views and Stored Procedures Tips

    I recently tuned an existing application that used dozens of views and hundreds of stored procedures using MySQL 5.6.

    There seem to be three attitudes towards using views and stored procedures (SPs) with MySQL:

    1. don’t use them at all to increase portability
    2. just use SPs to reduce network traffic in large reporting queries (my choice)
    3. go crazy and use them everywhere like old-school Oracle Enterprise apps.

    Here are some notes on using views:

    • before creating views, review your schema to ensure keys have matching types and charsets for good performance. It’s much easier to spot schema problems in a text listing than to guess why a view is slower than expected at execution time. (This is doubly true for MySQL Cluster.)
    • MySQL currently doesn’t have CREATE VIEW FORCE, although MariaDB 10.1.0 alpha has my patch. The FORCE option will greatly simplify view administration and also mysqldump output, which currently creates temporary tables to ensure views can be created regardless of table/view ordering issues
    • When looking at the MariaDB source code, it’s apparent that some view options were never actually implemented, like RESTRICT/CASCADE

    And some notes on stored procedures (SPs):

    • if a SP makes a stateful session change, like set sql_log_bin=0, ensure that isn’t going to be a problem later if an exception condition doesn’t reset it
    • after running a SP, SHOW PROFILES will list all the queries executed with performance statistics
    • SPs that do non-essential SELECTs or INFORMATION_SCHEMA queries probably need to be reviewed by a DBA for fundamental problems like non-atomic “reading before writing”
    • MySQL compiles SPs again for each thread.
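
    One way to guard against the first point from the calling side is to restore the session variable in a finally block, so it is reset even if the procedure raises. A minimal Python sketch, assuming a DB-API 2.0 style cursor (execute and callproc) and that sql_log_bin was 1 beforehand. The helper name call_with_binlog_disabled is hypothetical:

```python
# Sketch: restore a session variable even if the stored procedure fails.
# Assumes a DB-API 2.0 cursor; the helper name is illustrative only.

def call_with_binlog_disabled(cursor, proc_name, args=()):
    """Call a stored procedure with sql_log_bin=0, then always restore it."""
    cursor.execute("SET sql_log_bin=0")
    try:
        return cursor.callproc(proc_name, args)
    finally:
        # runs on success and on exception, so the session is not left changed
        # (assumes the value was 1 before the call)
        cursor.execute("SET sql_log_bin=1")
```

    The same idea applies inside the SP itself: put the reset in an exit handler rather than only at the end of the happy path.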

    Both views and SPs are relatively new MySQL features, so budget some extra development and testing time when using them, especially with replication.

    mysqlperformanceblog.com: Using MySQL triggers and views in Amazon RDS

    Posted in MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

    SVLUG: Devops and Release Canaries with Linux, CloudStack and MySQL Cluster

    I did a talk at the Silicon Valley Linux Users Group (SVLUG) tonite on “Devops and Release Canaries with Linux, CloudStack and MySQL Cluster.”

    Thanks again to Symantec for hosting.

    Ravello Arms Deutsche Telekom with On-Demand Cloud Flexibility
    Deutsche Telekom’s Enterprise DevOps Journey with VMware, AWS, Jenkins, Chef & Ravello, Slides

    Posted in API Programming, Cloud, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

    Velocity Conference Santa Clara 2014 Tips Game Cards

    The O’Reilly Velocity Web Operations & Performance Conference is June 24-26 in Santa Clara.

    Next to the messages/jobs board was a Web Ops & Performance Tips board:

    – use source maps to debug compressed JS and CSS
    – use ::before to optimize font rendering
    – use local storage to persist markup and templates to reduce requests and payload
    – avoid render-blocking CSS in Chrome by leaving off the screen media type until after the page loads, then putting screen back on the element
    – use gatling stress tool for load generation/perf testing (Apache Licence 2.0)
    – learn curl
    – learn POSIX before recreating another tool that already exists. Bill Joy (?)
    – “if you do it more than twice a week, automate”
    – it takes no skills to do NoOps! 🙂

    Posted in Cloud, Conferences, Open Source, Tech | Leave a comment

    AWS Pop-up Loft, San Francisco

    Amazon Web Services pop-up loft (Ask an Architect area, lecture hall, kitchen/lounge)
    Photo credit: Amazon.com.

    I happened to be in SF today, so I went to the Amazon Web Services pop-up loft on Market St.

    Amazon rented an empty storefront for 4 weeks for lecture sessions upstairs, and a computer lab and an ‘Ask an Architect’ bar downstairs.

    One of the hosts said the loft was a shell in May, and they had to build out everything: the kitchen area, 2 bathrooms and various partitions.

    I asked the experts about new EBS and RDS features, and they had answers as well as a $100 AWS credit.

    The weather was sunny and warm in SF.

    Lots of street performers and hustlers, including a very smooth male R&B singer. A young rapper named Rap2K15 was selling hand-made CDs.

    Update 2014 06 23: Apparently a drawing was held, and I was one of 3 winners of a free general pass to the AWS re:Invent conference 🙂

    Update 2014 06 24:

    AWS Bootcamp

    Full-day AWS overview, including EC2, S3, RDS, VPC and IAM, with 2 labs.

    “Provisioning and Managing AWS Infrastructure with Chef” with special guest George Miranda, Chef Technical Consultant, Chef

    George talked about using Chef tools like chef metal, knife and chef zero and a minimal amount of ruby to make an AMI and provision a MySQL server and 5 Nginx web servers.


    @gmiranda23, chef-ami-factory

    Update 2014 06 26:

    Dealing With Obstacles at Scale, Bob Hagemann, Twilio

    To reduce pain:

    – UTC timezone
    – UTF8
    – use thin AMI and chef/puppet instead of thick AMI
    – wrote boxconfig a few years ago (like netflix asgard)
    – remote admin mainly
    – small teams 3-8
    – services should run in 3 AZs
    – monitoring with nagios, cron, pingdom
    – haproxy on each host as proxy
    – MySQL, MHA, LVM. Manual failover.
    – SQS DLQ
    – global low latency with route53
    – http://github.com/twilio
    – @bobzilla42
    – uses FreeSWITCH plus their own telecom software
    – billing system 100s QPS
    – Ops team is about 8 people
    – VPNs to HQ and carrier-approved colo
    – three founders, one came from Amazon.

    925 Market Street, SF
    June 4 – 27, 2014 (likely closed on the 27th for dismantling)
    Free registration, tshirts and lunch. Closes 5:30 pm, 6:00 pm or 8:00 pm daily.
    Muni 30 and 45 return from Market St. and 5th to Caltrain.

    @AWSstartups #AWSloft

    AWS Loft Returning in Fall 2014

    Posted in API Programming, Business, Cloud, Conferences, Linux, MySQL, Open Source, Oracle, San Jose Bay Area, Tech | Leave a comment

    Advanced Liquibase Techniques

    I recently did some work with liquibase. Here are some techniques for advanced users to work around its limitations and take query cost into account.

    Liquibase Introduction

    Liquibase is an Open Source (Apache 2.0 License) Java utility and API for specifying and versioning schema changes (DDL) for several popular databases. It is commonly introduced to projects by programmers, rather than DBAs.

    What liquibase can do:

    • allow “refactoring” of SQL schema changes to target multiple databases using XML by using a database-independent syntax, or raw SQL, depending on your preference
    • allow conditional execution and rollback of SQL based on database type or environment.

    What liquibase can’t do:

    • has no built-in provisions for operational concerns, like conditionally executing SQL based on time/cost. There’s an assumption that schema changes are online, often true on Oracle and SQL Server, less so on MySQL, especially prior to 5.6 (unless you do micro-sharding)
    • does not do intelligent merges to the same object across changesets, like adding multiple columns to the same table in one statement.

    How liquibase works:

    • the programmer specifies schema changes in Java, XML or JSON and runs the liquibase command
    • liquibase creates 2 tables in your database to store version, user and patch name information and to lock out other simultaneous liquibase runs.

    How to Make Liquibase Consider Cost for MySQL

    After some experimentation, I found a couple of liquibase features you can use to do more advanced things:

    1. create a savepoint using the tag and rollback options:
      • liquibase tag rel0; liquibase update …; liquibase rollback rel0
    2. prepend and append logic to each changeset to query information_schema about the SQL DDL statement. On failure, exit with 1. (See XML example below.)


    <?xml version="1.0" encoding="UTF-8"?>
    <databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog">

        <changeSet id="1" author="james">
            <sql>
               create table if not exists `profiling` ( `connection_id` int(11) not null default 0, `query_id` int(11) not null default '0', `state` varchar(40) default '', KEY (query_id));
               truncate table profiling;
               set profiling=1;

               alter table department add column test2 int default null;
               insert into profiling (connection_id, query_id, state) select connection_id(), query_id, state from information_schema.profiling where query_id=2;
            </sql>
            <rollback>
                <sql>alter table department drop column test2</sql>
            </rollback>
        </changeSet>

        <changeSet id="1-post" author="james">
          <preConditions onFail="HALT">
            <sqlCheck expectedResult="0">SELECT count(*) from profiling where state='copy to tmp table'</sqlCheck>
          </preConditions>
        </changeSet>

    </databaseChangeLog>

    There are 2 issues with this approach:

    1. the changeset DDL statement will still have run, even if the precondition HALTs, because they’re separate changesets
    2. the rollback in “1” will not be executed, even if “1-post” HALTs.

    The workaround for those 2 issues is to combine the two techniques in a shell script:

    liquibase tag rel0
    liquibase update changeset.xml || {
        # undo the change, then fail the build pipeline so the changeset
        # does not propagate to the next stage (ie. don't run in production)
        liquibase rollback rel0
        mysql -e 'alter table test.department drop column test2' 
        exit 1
    }

    The above looks a little kludgy, but provides a stepping stone for the reader to customize in their particular environment. (The preConditions and bash script can be easily autogenerated with a Perl or Python script.)
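
    As a rough illustration of that autogeneration idea, here is a minimal Python sketch that emits a changeset plus its “-post” precondition changeset for one DDL statement. It assumes the element names from the XML example above (changeSet, sql, preConditions, sqlCheck) plus liquibase's rollback element. The function name make_changesets and the default author are placeholders, and the profiling setup statements and XML namespace are left out for brevity:

```python
# gen_changeset.py - sketch of generating a DDL changeset and its cost check
import xml.etree.ElementTree as ET

COST_CHECK = "SELECT count(*) from profiling where state='copy to tmp table'"

def make_changesets(cs_id, ddl, rollback_ddl, author="james"):
    """Return a databaseChangeLog element holding <id> and <id>-post changesets."""
    root = ET.Element("databaseChangeLog")

    # changeset that runs the DDL and knows how to undo it
    change = ET.SubElement(root, "changeSet", id=cs_id, author=author)
    ET.SubElement(change, "sql").text = ddl
    rollback = ET.SubElement(change, "rollback")
    ET.SubElement(rollback, "sql").text = rollback_ddl

    # follow-up changeset that halts if the DDL copied the table
    post = ET.SubElement(root, "changeSet", id=cs_id + "-post", author=author)
    pre = ET.SubElement(post, "preConditions", onFail="HALT")
    ET.SubElement(pre, "sqlCheck", expectedResult="0").text = COST_CHECK
    return root

root = make_changesets(
    "1",
    "alter table department add column test2 int default null",
    "alter table department drop column test2",
)
xml_text = ET.tostring(root).decode("utf-8")
print(xml_text)
```

    The matching shell wrapper could be generated from the same inputs, since the rollback SQL passed in here is the statement the script has to run by hand after liquibase rollback.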

    An alternative to XML is using the Java API to set everything up.

    Please leave a comment if you have any suggestions or a Java API program.

    Posted in API Programming, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

    Percona Live MySQL Conference Santa Clara 2014

    The Percona Live MySQL Conference was held once again in Santa Clara from April 1-4, 2014.

    Executive Summary:

    1. Percona hosted another excellent conference, with 1,150 attendees from 43 countries plus a vibrant exhibit hall.
    2. The overall themes that emerged this year were “What’s new in MySQL 5.6?” and “The rise of Galera Cluster.” Oracle delivered the 5.6 features they promised, but unfortunately didn’t ask production DBAs what they really needed: for example, GTIDs require downtime to configure, and ALTER ONLINE doesn’t support throttling or background operation on slaves (SR 3-8856341908).
    3. MySQL 5.7 is promising about double the performance of 5.6, but note that the 5.7 feature micro-benchmark effort hasn’t translated into a complete understanding of whole database performance yet.
    4. the current active branches are now: Oracle 5.6/5.7, MariaDB 10.0/10.1, Webscale SQL (Facebook, Google, LinkedIn, and Twitter), Facebook 5.6 with Deployable GTIDs, and Percona Server 5.6. (The version you want to migrate to is one based on MySQL 5.6.17 or later.)

    Severalnines Booth
    Severalnines.com booth. They create and support cluster and cloud database solutions. Photo credit: Steve Barker, SphinxSearch.com


    Wed. Keynotes

    Percona Live 2014 opening keynote with Percona CEO Peter Zaitsev
    Robert Hodges – Getting Serious about MySQL and Hadoop at Continuent
    (Continuent needs to pivot into another market as MySQL’s new built-in features displace their replication products.)
    ‘Raising the MySQL Bar’ with Oracle’s Tomas Ulin, VP of Engineering for MySQL, Oracle
    Adventures in MySQL at Dropbox, Renjish Abraham

    Wed. Talks

    Online schema changes for maximizing uptime, David Turner, Dropbox, Ben Black, Tango

    – MySQL 5.6 has online schema change capability, however there’s no way to throttle IO consumed during the operation and the single-threaded slave will lag
    – David has tested the ALTER ONLINE in MySQL 5.6.17 and will use it when ported to Percona Server
    – for now uses Percona Online Schema Change utility for its throttling feature.

    Be the hero of the day with the InnoDB Data recovery tool, Marco “The Grinch” Tusa and Aleksandr Kuzminsky, Percona Services

    – tools have been created by Percona to recover Innodb data if you don’t have backups and you’re out of business otherwise. Call them! 🙂

    Galera Cluster New Features, Seppo Jaakola, Codership

    – reviewed features in Galera Cluster versions 3 and 4
    – looking good.

    MySQL Cluster Performance Tuning, Johan Andersson, severalnines.com

    – disable NUMA
    – echo 0 > /proc/sys/vm/swappiness
    – bind data node threads to CPUs
    – cat /proc/interrupts
    – LDM = cores/2
    – TC = LDM/4
    – tune redo log
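
    The two thread-count rules of thumb above are simple arithmetic. A tiny Python sketch, assuming integer division and a floor of one thread (both are my assumptions, not from the talk). The function name is made up:

```python
# Sketch of the rule of thumb above: LDM = cores/2, TC = LDM/4.
# Rounding down and the minimum of 1 are assumptions for illustration.

def ndb_thread_counts(cores):
    """Return (ldm_threads, tc_threads) for a data node with `cores` cores."""
    ldm = max(1, cores // 2)
    tc = max(1, ldm // 4)
    return ldm, tc

for cores in (8, 16, 32):
    ldm, tc = ndb_thread_counts(cores)
    print("cores=%d LDM=%d TC=%d" % (cores, ldm, tc))
```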

    Practical sysbench, Peter Boros, Percona

    – prefers “latency” graph style with transparent dots vs. line charts
    – uses R and ggplot2 for graphing
    – attendees tried to guess SSD performance on Peter’s notebook for different block sizes, most were proven totally wrong by sysbench

    Birds of a Feather (BoF) Sessions

    “Meet MySQL Team (at Oracle)” BoF

    – discussion again this year about parallel query execution (same as at MariaDB BoF last year), with Peter Zaitsev also bringing it up again
    – discussion about raw partitions (belief is that they will be 20% more space-efficient and 30% faster, and avoid Linux’s endless limitations and bugs)
    – internal “development roadmap” only extends about 12 months at a time, subject to customer demands
    – I griped about FK panic/data loss issues in MySQL Cluster 7.3.3. Tomas Ulin, Vice President, MySQL Engineering, said that was news to him. (See SR 3-8717994851 and SR 3-87646727311)
    – Mark Callaghan, Facebook, said he was working on MongoDB now, but requested named keys in flexible schema in MySQL.
    – Peter Zaitsev, Percona, said several clients are using GTIDs and they seem to work.
    – Oracle pleaded with users to drop MyISAM. I mentioned the main reason was that legacy systems used older compression methods, but InnoDB could be used since it has compression too
    – The Oracle MySQL Fabric project is an attempt to counter MongoDB’s automatic slave promotion.


    Thursday Keynotes

    ‘9 Things You Need to Know…’, Peter Zaitsev, Percona
    The Evolution of MySQL in the All-Flash Datacenter, Nisha Talagala, Fusion-IO
    MySQL, Private Cloud Infrastructure and OpenStack, Sean Chighizola, Big Fish Games
    Keynote Panel: The Future of Operating MySQL at Scale

    Thu. Talks

    Benchmarking Databases for Scale, Peter Boros and Kenny Gryp, Percona

    Question: “What is Percona’s secret to professional benchmarks?”
    Answer: “Benchmark absolutely everything multiple times, time permitting.”

    MySQL 5.7: Performance & Scalability Benchmarks, Dimitri KRAVTCHUK

    – comprehensive micro-benchmarking graphs of 5.7 to gain a deeper understanding of parts
    – the challenge remains: how to tune the whole database to perform well?

    Use Your MySQL Knowledge to Become an Instant Cassandra Guru, Robert Hodges, Continuent and Tim Callaghan, Tokutek

    – good comparison of relational data modelling and C* data modelling, lots of similarities
    – note that MariaDB has a Cassandra plugin

    RDS for MYSQL, Tips, Patterns and Common Pitfalls, Laine Campbell, Blackbird (formerly PalominoDB)

    Write Conflicts in Multi-Master Replication Topologies, Seppo Jaakola, Codership

    – it’s good to see that Codership is paying attention to the details of replication

    MySQL Community Awards

    Shlomi has a comprehensive post on this year’s winners.

    MySQL Lightning Talks (5 minutes each)

    Truncating Sub Optimal DBA Verbal Responses Vectors, David Stokes (Oracle)

    MySQL 5.6 Global Transaction IDs: Benefits and Limitations, Stephane Combaudon (Percona)


    Zero database downtime using the Federated storage engine and Replication, prasad mani (BBC)

    Scaling via adding a Table, Rick James (self)

    Rick knows some clever ways to optimize solutions with MySQL. He’s doing consulting now, so contact him.

    Extra Table Saves the Day: Slides

    No es ‘ano’, es ‘año’! A take on encoding in your DB, Ignacio Nin (Vivid Cortex)

    What Not to Say to the MySQL DBA, Gillian Gunson (Blackbird (formerly PalominoDB))
    “I’ll code around it.”
    “Stop micro-optimizing.”
    “Use passive master for QA”
    “MySQL is a toy database.”
    “This conference is a support group.”

    Hall of Shame, Shlomi Noach
    Triple active-replication in gaming anecdote: don’t do that.

    The bash slave-prefetch oneliner, Art van Scheppingen (Spil Games)

    Unsung Relay Log, Vishnu Rao, FlipKart
    Com_relaylog_dump for tungsten and mysql 5.5

    Unique User Count — Rollup, Rick James (self)

    Formula for user visit estimation by counting bits.

    Logical Backups in the Cloud, Bill Karwin, Percona
    Backups for PHP designers
    PHP class Mysql/Dump

    How to Squat, Kyle Redinger (VividCortex, Inc)

    Iron DBA Replication Challenge, Attunity


    Friday Keynotes

    Percona CMO Terry Erisman opens the 3rd and final day of Percona Live 201

    Keynote: OpenStack Co-Opetition, A View from Within, Boris Renski, Mirantis and OpenStack Board member

    – one of the best conference keynotes ever, and a great primer on Open Source marketing … up there with the O’Reilly Open Source Conference keynote on the importance of Android – before it shipped.

    Friday Talks

    Global Transaction ID at Facebook, Evan Elias, Santosh Banda and Yoshinori Matsunobu, Facebook

    – just write your own MySQL branch if a feature is too hard to deploy 🙂

    R for MySQL DBAs, Ryan Lowe and Randy Wigginton, Percona

    – R has about 1,000 interesting sample databases (demos included diamonds and cars)
    – good interface for quick graphing, not so great for complex programs
    – Percona uses R and the ggplot graph module for most of the graphs you see now.

    MariaDB for Developers, Colin Charles, Chief Evangelist, MariaDB

    Closing Prize Drawing

    About 30 high-end gifts were handed out.

    Some nice prizes contributed by exhibitors, including Nexus 7 tablets, $250 AWS gift certificates, SQLyog and Monyog licenses, and a quad drone!


    The exhibits are one of my favorite things at the conference each year because of how strong the MySQL third-party community is.

    Some notable absences were Clustrix and Violin Memory, but those were offset by new exhibitors. Webyog was a sponsor but I didn’t see a booth. PalominoDB changed their name to Blackbird, and appears to be offering DevOps as well as DBA services.

    And of course, as the organizers, Percona had a large, central spread. 🙂

    Thanks to the sponsors and exhibitors for making a conference like this financially possible.

    Facebook Debuts Web-Scale Variant Of MySQL

    Facebook’s Yoshinori Matsunobu on MySQL, WebScaleSQL & Percona Live
    Twitter’s Calvin Sun on WebScaleSQL, Percona Live
    Tweets about PerconaLive
    Percona Live MySQL Conference Highlights

    Posted in Cassandra, Cloud, Conferences, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Perl, San Jose Bay Area, Storage, Tech | Leave a comment

    Cassandra Operations Checklist

    Most of the Cassandra rollouts I’ve heard about at conferences have been “Devopsed” – written by Dev and productionized by Dev, with hand-off to Operations long afterwards.

    That’s the opposite to how RDBMS projects are usually deployed in large companies.

    As Cassandra becomes more mature, this hand-off will occur earlier after development ends.

    Here is a checklist for handing off a Cassandra database to Operations (I only consider non-trivial rings of 3 or more nodes in production with a full data set):

    Item | Comments | Node Impact (Performance; Space; Time/IOPs/BW)
    Cassandra Server Version | Should be exactly the same minor version across cluster except briefly during server updates |
    Token or vnodes? | needs to be configured before first start of server |
    Cassandra Client/Connector Version | Thrift or CQL? |
    Snitch name? Why? | several choices |
    Replication Factor (RF)? Why? | usually RF=3 for SoT* data, defined at keyspace level |
    Compaction method? Why? | Size or Level, defined at CF level |
    Read Consistency Level? Why? | Netflix recommends CL=ONE. ALL seldom makes sense. |
    Write Consistency Level? Why? | ALL seldom makes sense. |
    TTL? Why? | Defined at row level. |
    Expected Average Query Latency | 10 ms is reasonable, 1 ms is tough. |
    nodetool repair/scrub | needed weekly | yes; more space; more
    Bootstrapping a new node | | yes; yes
    Java gcpause | stop the world | yes; yes
    Are there any wide columns? do they get wider over time? | pathological case for Cassandra | yes; more space; more
    Backup | in case of application bug or a disaster. Opscenter, Priam, custom. | yes; slightly more for incremental backups, double for local cold copy; more
    Restore | requires Cassandra node shutdown | yes
    If a storage volume fills, howto fix it? | Especially a problem with multiple JBOD volumes, which fill unevenly. | yes; less space; less
    If a storage volume fails, howto fix it? | | yes; less space; less
    What is the total data size now? Projected in 12 months? | affects most operations | yes; yes; yes
    What is the acceptable query latency? | affects network and hardware choices |
    What is the best maintenance window time each week? | |
    What are the business and practical SLAs? | |
    What training is needed for your Operations team? | Datastax Admin and Data Modelling Classes (recommend most recent Cassandra version) |
    What partitioner is used? | Opscenter only supports random partitioner or murmur 3 partitioner for rebalancing |
    What procedures need to be written for your Operations team? | |
    What monitoring tools? | 1. DSE or DCE/OpsCenter; 2. nodetool; 3. Jconsole/jmxterm; 4. Boundary; 5. nagios/zabbix |
    What bugs have been encountered? Which ones still apply? | |
    What lessons can Devops share with the Operations team? | |

    SoT = Source of Truth
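
    For the replication factor and consistency level items, the usual arithmetic is that a quorum is a majority of replicas, and a read sees the latest write when the read and write replica counts overlap (R + W > RF). A short Python sketch of that check. The function names are illustrative and not part of any Cassandra API:

```python
# Sketch: quorum size and read/write overlap for a given replication factor.

def quorum(rf):
    """Majority of replicas, e.g. 2 of 3."""
    return rf // 2 + 1

def replicas_for(level, rf):
    """Number of replicas that must respond for a consistency level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]

def is_strongly_consistent(read_cl, write_cl, rf):
    """True when every read overlaps at least one replica of the latest write."""
    return replicas_for(read_cl, rf) + replicas_for(write_cl, rf) > rf

rf = 3
for read_cl, write_cl in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print("RF=%d read=%s write=%s strong=%s"
          % (rf, read_cl, write_cl, is_strongly_consistent(read_cl, write_cl, rf)))
```

    With RF=3, CL=ONE on both sides trades consistency for latency and availability, while ALL on either side means a single unavailable replica fails the operation, which is one reason ALL seldom makes sense.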

    About Data Consistency in Cassandra
    ConstantContact techblog: Cassandra and Backups
    stackoverflow.com: Do I absolutely need a minimum of 3 nodes/servers for a Cassandra cluster or will 2 suffice?
    Cassandra Parameters Calculator
    Adding vnodes to an existing cluster

    Posted in Business, Cassandra, Cloud, Tech | Leave a comment

    Howto Add a New Command to the MySQL Server

    Adding a new statement or command to the MySQL server is not difficult.

    First, decide if you want to modify the server source code, or if a User-Defined Function (UDF) will meet your needs.

    Since I just added the SHUTDOWN server command, I thought it would be helpful to outline the steps needed to add a new command.


    1. some familiarity with C/C++ syntax and programming (like “The C Programming Language”, by Kernighan and Ritchie.)
    2. some familiarity with lex and yacc. (I read the Dragon Book a long time ago.)
    3. access to a linux account with cmake, gcc, make and bison packages.
    # CentOS
    yum install cmake gcc make bison
    # Ubuntu
    apt-get update
    apt-get install cmake gcc make bison

    # unpack the MySQL source code:

    tar zxvf - < mariadb-5.5.30.tar.gz

    # most of the files you need to modify are in this directory:

    cd mariadb-5.5.30/sql
    • sql_parse.cc
    • sql_yacc.yy
    • sql_prepare.cc
    • mysqld.cc
    • sql_lex.h

    # add the token(s) (commands and arguments you think you will need) and verify the syntax:

    bison -v sql_yacc.yy

    # if you get warnings, fix %expect in sql_yacc.yy

    # cut-and-paste a code block from a command with similar syntax in sql_yacc.yy to implement your new command, and build a test version of MySQL

    # build your new server in a sandbox:


    cd mariadb-5.5.30
    # debug support is a cmake option, not a make flag
    cmake . -DCMAKE_INSTALL_PREFIX:PATH=/usr/local/mariadb-5.5.30 -DWITH_DEBUG=1
    make
    sudo make install

    # test your new server with 3 terminal windows:


    killall mysqld
    /usr/local/mariadb-5.5.30/bin/mysqld_safe --user=mysql --debug &
    tail -f  /tmp/mysqld.trace | grep Got &
    tail -f /var/log/mysqld.log &
    mysql -u root -p
    # login, then test your new command while watching the log and trace

    # read /var/log/mysqld.log and /tmp/mysqld.trace for errors and panics like this:

    Version: '5.5.30-MariaDB-debug'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
    mysqld: /home/james/mariadb-5.5.30/sql/sql_parse.cc:4477: int mysql_execute_command(THD*): Assertion `0' failed.
    130515 11:25:19 [ERROR] mysqld got signal 6 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.

    The above panic was caused by the SQLCOM_ switch falling through, because the new command was not defined yet.

    # When you’re done, make a test

    vi mysql-test/t/my_new_command.test

    # Create a patch file:

    mv mariadb-5.5.30 mariadb-5.5.30-new
    tar zxvf - < mariadb-5.5.30.tar.gz
    cd mariadb-5.5.30/sql
    for i in sql_parse.cc sql_yacc.yy sql_prepare.cc mysqld.cc sql_lex.h; do
       echo $i
       diff -u $i ../../mariadb-5.5.30-new/sql/ >>patch.txt
    done
    # don't forget mysql-test/t/my_new_command.test

    # apply your patch file:

    patch -b < patch.txt

    # do a build and test your patch before distributing it.
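
    If you want to see what diff -u produces without touching the source tree, Python's difflib module emits the same unified format. A small sketch using the %expect change from the SHUTDOWN patch as sample input. The two-line excerpt and file labels are only for illustration:

```python
# Sketch: produce a unified diff (like `diff -u`) with Python's difflib.
import difflib

old = ["  We should not introduce new conflicts any more.\n", "%expect 174\n"]
new = ["  We should not introduce new conflicts any more.\n", "%expect 196\n"]

patch = "".join(difflib.unified_diff(
    old, new,
    fromfile="sql_yacc.yy",
    tofile="mariadb-5.5.30-new/sql/sql_yacc.yy",
))
print(patch)
```

    Lines starting with - come from the original file and lines starting with + from the modified one, which is what patch replays when the patch file is applied.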

    Easy peasy, right! 🙂

    Sergei Golubchik wrote on the MariaDB developers list: "Reserved words are keywords (listed in the sql/lex.h) that are
    not listed in the 'keyword' rule of sql_yacc.yy (and 'keyword_sp' rule, that 'keyword' rule includes)."

    How can I get the output of the DBUG_PRINT
    How to find shift/reduce conflict in this yacc file?
    MariaDB Contributor Agreement (MCA) Frequently Asked Questions
    wikipedia: diff

    MySQL Internals Manual
    mysqlperformanceblog.com: XtraDB / InnoDB internals in drawing
    Overloading Procedures
    innodb_diagrams project
    Understanding MySQL Internals By Sasha Pachev (O'Reilly)
    DTrace can tell you what MySQL is doing
    MySQL C Client API programming tutorial
    MySQL 5.1 Class Index

    • https://launchpad.net/~maria-developers
    • IRC, #maria channel on Freenode
    • https://kb.askmonty.org/en/community-contributing-to-the-mariadb-project/
    • https://kb.askmonty.org/en/contributing-code/
    • https://kb.askmonty.org/en/google-summer-of-code-2013/ (ideas)
    • http://mariadb.org/jira/ (search for unassigned tasks)

    Keywords: MariaDB, MySQL server programming, tutorial, patch.

    Posted in API Programming, Linux, MySQL, Open Source, Oracle, Tech, Toys | 3 Comments

    Patch to Add Shutdown Statement to MySQL MariaDB

    At the OSCON 2011 MariaDB Birds-of-a-Feather (BoF) session, I suggested to Monty adding a MySQL SHUTDOWN statement, which was written up as WL#232. Other databases have this feature, and it’s very handy when automating management of a cluster of MySQL servers.

    And at the Percona Live MySQL Conference 2013, Monty suggested to MariaDB BoF attendees that a good way to get a new feature added is to write a patch that paves the way for a committer to start with.

    Phase 1

    So … I sat down last nite and wrote the patch against MariaDB 5.5.30.

    Basically it meant telling mysql’s lex/yacc files to parse “shutdown”, then calling the existing MySQL API shutdown kill_mysql() function.

    This code is released under the Open Source BSD-new License, according to the MariaDB Contributor Agreement.

    shutdown_0.1.patch.txt – MariaDB 5.5.30:

    --- sql_parse.cc	2013-03-11 03:29:13.000000000 -0700
    +++ /home/james/mariadb-5.5.30-new/sql/sql_parse.cc	2013-05-15 13:17:05.000000000 -0700
    @@ -1305,7 +1305,6 @@
       case COM_SHUTDOWN:
    @@ -1333,7 +1332,6 @@
       case COM_STATISTICS:
         STATUS_VAR *current_global_status_var;      // Big; Don't allocate on stack
    @@ -3736,6 +3734,31 @@
    +  case SQLCOM_SHUTDOWN:
    +  {
    +    // jeb - This code block is copied from COM_SHUTDOWN above. Since kill_mysql(void) {} doesn't take a level argument, the level code is pointless.
    +    // jeb - In fact, the level code should be removed and Oracle Database statements implemented: SHUTDOWN, SHUTDOWN IMMEDIATE and SHUTDOWN ABORT. See WL#232.
    +    status_var_increment(thd->status_var.com_other);
    +    if (check_global_access(thd,SHUTDOWN_ACL))
    +      break; /* purecov: inspected */
    +    enum mysql_enum_shutdown_level level;
    +    level= SHUTDOWN_DEFAULT;
    +    if (level == SHUTDOWN_DEFAULT)
    +      level= SHUTDOWN_WAIT_ALL_BUFFERS; // soon default will be configurable
    +    else if (level != SHUTDOWN_WAIT_ALL_BUFFERS)
    +    {
    +      my_error(ER_NOT_SUPPORTED_YET, MYF(0), "this shutdown level");
    +      break;
    +    }
    +    DBUG_PRINT("SQLCOM_SHUTDOWN",("Got shutdown command for level %u", level));
    +    my_eof(thd);
    +    kill_mysql();
    +    res=TRUE;
    +    break;
    +  }
    --- sql_yacc.yy	2013-03-11 03:29:19.000000000 -0700
    +++ /home/james/mariadb-5.5.30-new/sql/sql_yacc.yy	2013-05-15 11:12:03.000000000 -0700
    @@ -791,7 +791,7 @@
       Currently there are 174 shift/reduce conflicts.
       We should not introduce new conflicts any more.
    -%expect 174
    +%expect 196
        Comments for TOKENS.
    @@ -1645,6 +1645,7 @@
             definer_opt no_definer definer
             parse_vcol_expr vcol_opt_specifier vcol_opt_attribute
             vcol_opt_attribute_list vcol_attribute
    +        shutdown
     %type  call sp_proc_stmts sp_proc_stmts1 sp_proc_stmt
    @@ -1796,6 +1797,7 @@
             | savepoint
             | select
             | set
    +        | shutdown
             | signal_stmt
             | show
             | slave
    @@ -13715,6 +13717,17 @@
    +          SHUTDOWN
    +          {
    +            LEX *lex=Lex;
    +            lex->value_list.empty();
    +            lex->users_list.empty();
    +            lex->sql_command= SQLCOM_SHUTDOWN;
    +          }
    +        ;
               expr { $$=$1; }
             | DEFAULT { $$=0; }
    --- sql_prepare.cc	2013-03-11 03:29:11.000000000 -0700
    +++ /home/james/mariadb-5.5.30-new/sql/sql_prepare.cc	2013-05-15 03:07:00.000000000 -0700
    @@ -2173,6 +2173,7 @@
       case SQLCOM_GRANT:
       case SQLCOM_REVOKE:
       case SQLCOM_KILL:
    +  case SQLCOM_SHUTDOWN:
       case SQLCOM_PREPARE:
    --- mysqld.cc	2013-03-11 03:29:14.000000000 -0700
    +++ /home/james/mariadb-5.5.30-new/sql/mysqld.cc	2013-05-15 01:20:11.000000000 -0700
    @@ -3333,6 +3333,7 @@
       {"savepoint",            (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SAVEPOINT]), SHOW_LONG_STATUS},
       {"select",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SELECT]), SHOW_LONG_STATUS},
       {"set_option",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SET_OPTION]), SHOW_LONG_STATUS},
    +  {"shutdown",             (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHUTDOWN]), SHOW_LONG_STATUS},
       {"signal",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SIGNAL]), SHOW_LONG_STATUS},
       {"show_authors",         (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHOW_AUTHORS]), SHOW_LONG_STATUS},
       {"show_binlog_events",   (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHOW_BINLOG_EVENTS]), SHOW_LONG_STATUS},
    --- sql_lex.h	2013-03-11 03:29:13.000000000 -0700
    +++ /home/james/mariadb-5.5.30-new/sql/sql_lex.h	2013-05-15 01:19:17.000000000 -0700
    @@ -193,6 +193,7 @@
         When a command is added here, be sure it's also added in mysqld.cc

    To apply:

    tar zxvf - < mariadb-5.5.30.tar.gz
    cd mariadb-5.5.30/sql
    wget http://jebriggs.com/php/shutdown_0.1.patch.txt
    patch -b < shutdown_0.1.patch.txt


    cd mariadb-5.5.30
    cmake . -DCMAKE_INSTALL_PREFIX:PATH=/usr/local/mariadb-5.5.30 -DWITH_DEBUG=1
    make
    sudo make install


    killall mysqld
    /usr/local/mariadb-5.5.30/bin/mysqld_safe --user=mysql --debug &
    tail -f  /tmp/mysqld.trace | grep Got &
    mysql -u root -p

    mysql client (with mysqld.log and mysqld.trace entries overlaid):

    mysql> shutdown;
    ERROR 2013 (HY000): Lost connection to MySQL server during query
    mysql> 130515 13:20:38 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended


    T@4    : | | | >parse_sql
    T@4    : | | | <parse_sql
    T@4    : | | | >LEX::set_trg_event_type_for_tables
    T@4    : | | | <LEX::set_trg_event_type_for_tables
    T@4    : | | | >mysql_execute_command
    T@4    : | | | | >deny_updates_if_read_only_option
    T@4    : | | | | <deny_updates_if_read_only_option
    T@4    : | | | | >stmt_causes_implicit_commit
    T@4    : | | | | <stmt_causes_implicit_commit
    T@4    : | | | | SQLCOM_SHUTDOWN: Got shutdown command for level 16
    T@4    : | | | | >set_eof_status
    T@4    : | | | | <set_eof_status
    T@4    : | | | | >kill_mysql
    T@4    : | | | | | quit: After pthread_kill
    T@4    : | | | | <kill_mysql
    T@4    : | | | | proc_info: /home/james/mariadb-5.5.30/sql/sql_parse.cc:4507  query end


    130515 13:20:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    130515 13:20:08 InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
    130515 13:20:08 InnoDB: The InnoDB memory heap is disabled
    130515 13:20:08 InnoDB: Mutexes and rw_locks use GCC atomic builtins
    130515 13:20:08 InnoDB: Compressed tables use zlib 1.2.3
    130515 13:20:08 InnoDB: Initializing buffer pool, size = 128.0M
    130515 13:20:08 InnoDB: Completed initialization of buffer pool
    130515 13:20:08 InnoDB: highest supported file format is Barracuda.
    130515 13:20:09  InnoDB: Waiting for the background threads to start
    130515 13:20:10 Percona XtraDB (http://www.percona.com) 5.5.30-MariaDB-30.1 started; log sequence number 1597945
    130515 13:20:10 [Note] Plugin 'FEEDBACK' is disabled.
    130515 13:20:10 [Note] Event Scheduler: Loaded 0 events
    130515 13:20:10 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: ready for connections.
    Version: '5.5.30-MariaDB-debug'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
    130515 13:20:37 [Note] Got signal 15 to shutdown mysqld
    130515 13:20:37 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: Normal shutdown
    130515 13:20:37 [Note] Event Scheduler: Purging the queue. 0 events
    130515 13:20:37  InnoDB: Starting shutdown...
    130515 13:20:38  InnoDB: Shutdown completed; log sequence number 1597945
    130515 13:20:38 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: Shutdown complete
    130515 13:20:38 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

    A possible test would look like this, but it would interfere with the operation of the test mysqld instance:



    Phase 2

    The patch above applies cleanly within the existing MySQL shutdown framework, which implements a feature like Oracle Database's SHUTDOWN IMMEDIATE command.

    However, my patch is a Pyrrhic victory, since there's so much wrong with MySQL's existing shutdown framework that it will take an internals committer to sort it out.

    The shutdown framework is badly designed, if it was designed at all, since it fails the "does this feel programmed on purpose?" test, and in fact doesn't work reliably:

    1. Conceptually, there should be 3 Oracle Database-style SHUTDOWN options: WAIT, IMMEDIATE and ABORT. Implementing SHUTDOWN WAIT would mean intrusive changes to the MySQL source code, while SHUTDOWN ABORT would be easier to program, but at the risk of data integrity.
    2. The following bug reports describe a race condition between mysqld threads and the shutdown thread:

    I guess I'll have to pay myself the worklog bounty of $100. 🙂

    This is actually my second MySQL patch contribution. In 1997 or 1998 I submitted a patch for the installer, which was one of the most troublesome components at that time. Monty rewrote it, but I liked my version better.

    Update: Sergei Golubchik committed this patch to MariaDB 10.0.4 on 2013-06-25. Thanks, Sergei!
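With the statement available in a server, the cluster automation mentioned at the top could be sketched roughly as follows. This is an illustration under assumptions, not a tested tool: the function names and the MYSQL_BIN and MYSQLADMIN_BIN overrides are hypothetical, the host names are placeholders, and credentials are assumed to come from an option file. Because the client in the session above lost its connection (ERROR 2013) even though the shutdown succeeded, the sketch ignores the client's exit status and polls with mysqladmin ping instead.

```shell
#!/bin/sh
# Sketch only. MYSQL_BIN, MYSQLADMIN_BIN and the function names are
# hypothetical; credentials are assumed to come from an option file.

# Send SHUTDOWN to one host, then poll until the server stops answering.
shutdown_host() {
    host=$1
    tries=${2:-30}
    # The client's exit status is ignored on purpose: the connection may be
    # dropped (ERROR 2013) even when the shutdown itself works.
    "${MYSQL_BIN:-mysql}" -h "$host" -e 'SHUTDOWN' >/dev/null 2>&1 || true
    while [ "$tries" -gt 0 ]; do
        # mysqladmin ping exits 0 while the server is running and
        # non-zero once it no longer answers.
        if ! "${MYSQLADMIN_BIN:-mysqladmin}" -h "$host" ping >/dev/null 2>&1; then
            echo "$host: down"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$host: still answering" >&2
    return 1
}

# Shut down every host given as an argument; return non-zero if any
# host was still answering after the polling period.
shutdown_cluster() {
    failed=0
    for h in "$@"; do
        shutdown_host "$h" || failed=1
    done
    return "$failed"
}
```

It would be called as shutdown_cluster db1.example.com db2.example.com, where the host names are placeholders. Polling is a deliberate choice here: it confirms the server really stopped instead of trusting the client's return code. Note that a missing mysqladmin binary would also be reported as "down", so a real script would want to check for it first.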

    MySQL's Missing Shutdown Statement
    Bug #63276: skip sleep in srv_master_thread when shutdown is in progress

    Posted in Linux, MySQL, Open Source, Oracle, OSCON, Tech | 1 Comment