O’Reilly: Managing Mission-Critical Domains and DNS

It’s been a while since I looked forward to a book, but this one by Mark Jeftovic of easydns.com looks pretty interesting:



What’s amusing is that Amazon lists it as #1 in “Hot New Releases in Unix DNS & Bind.” Unsurprisingly, it’s also the only title in that category. :)

shop.oreilly.com: Managing Mission-Critical Domains and DNS
amazon.com: Managing Mission-Critical Domains and DNS

Posted in Cloud, Open Source, Tech | Leave a comment

Drones and ADS-B


DJI Phantom Drone

AvWeb has a very interesting article on possible Google involvement with drones and ADS-B.

For those new to ADS-B, at a high level it is a digital beacon that must be installed on all airliners worldwide, and in the USA on all airplanes by 2020. The cost is borne by the airplane operator, and ranges from $5,000 for a small plane to $1 million or more for an airliner, including avionics and installation costs.

“ADS-B out” is the digital transmitter, and “ADS-B in” is the digital reception of weather and ATC data.

Since $5,000+ is a lot of money for something that doesn’t make you fly better, if the FAA decides to impose ADS-B on drones, then a big outside player like Google or Amazon will have to design/manufacture low cost versions to make drones cost-effective. Like an order of magnitude cheaper, or even two.

One of the commenters has an interesting question, “Just how many airborne ADS-B devices can the FAA’s ground-bound infrastructure handle at one time? The answer to that question may frame the response that we can expect from the agency… Seriously.”

FAA allows AIG to use drones for insurance inspections
DJI: A Chinese firm has taken the lead in commercial drones

wikipedia: Traffic collision avoidance system

Posted in Tech, Toys | Leave a comment

SVLUG: Daniel Klopp on Docker

At the Silicon Valley Linux Users Group (SVLUG) tonight, Daniel Klopp, Senior Technical Consultant at Taos Consulting, gave an intermediate talk on “Docker.”

He had some really informative and detailed slides on using Docker, especially his cgroup command samples.

Some of the interesting things he mentioned were:

  1. cgroups are nested
  2. Docker currently has a limit of 127 “layers”, with prior layers appearing to be read-only to the current layer
  3. Docker is high-level enough to run on multiple operating systems, including both Linux and Windows


Daniel Klopp

One attendee mentioned that a work-around for the insecure nature of Docker is to combine it with SELinux, though that will involve a fair amount of work.

Over 400 people RSVPed on a related Meetup, and over 150 people attended, a record for this decade.


Pasta Spread

Great turnout!

Pasta Spread

Salad, meat lasagna, pasta alfredo, veggie lasagna from Taos!


Thanks to Taos for providing food for all. Taos has job postings for sys admin, network admin, devops and help desk IT persons.

Thanks to Symantec once again for hosting the event.

Posted in API Programming, Cloud, Linux, Open Source, Tech, User Groups | Leave a comment

Why Can’t ISPs Handle SPF Records?

I’m always appalled when I need to set up a Sender Policy Framework (SPF) record using ISP zone file editors.

It took ThePlanet (now owned by IBM/SoftLayer) 5 years to fix their web interface to handle valid SPF records (re-edit and save) – and that’s *after* I reported the bug.

I had to make an official visit to their CEO as their 109th-largest customer to actually get somebody to look at the ticket. Their engineering staff was in disbelief, until they actually tested it and said, “Oops!” :)

GoDaddy currently has 3 oddities in their new and classic DNS zone editor web programs:

  1. the SPF wizard does not show double quotes, required for records with spaces, as all SPF records have. It silently inserts the quotes, doubling them if you also add them, causing an invalid record.
  2. their SPF wizard wildly flails around, making the longest SPF records I’ve ever seen. That means problems, like more DNS lookups and possibly truncation issues.
  3. it refuses to allow domain names in the left-hand column, forcing the origin (@ symbol). That works for most people, but I hope you’re not the exception.

Can you spot more bugs? :)

Notes:

  • Regarding #3, for those people not familiar with SPF, rules apply to domain names and subdomain names, usually mydomain.com or mail.mydomain.com, the latter of which @ will not match.
  • SPF clients match the SPF or TXT record with the FQDN in the Return-Path header. If you don’t want to add an SPF record for each host (like www0 and www1), then email server masquerading can be used. In sendmail, that’s
    FEATURE(masquerade_envelope)dnl
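
Oddities #1 and #2 above (doubled quotes, overlong records with too many lookups) can be caught by linting a record before or after it goes through a zone editor. Below is a minimal Python sketch, not a full validator. The name lint_spf and its messages are made up for illustration, the limit of 10 DNS lookups comes from RFC 7208, and the 255-character check assumes the record is stored as a single TXT string.

```python
# Hypothetical lint for an SPF TXT value copied from a zone editor.
# lint_spf() and its messages are illustrative, not part of any ISP tool.

# Terms that cost one DNS lookup each; RFC 7208 allows at most 10 per check.
LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists", "redirect"}


def lint_spf(value):
    """Return a list of problems found in one SPF TXT value."""
    problems = []
    if value.startswith('""') or value.endswith('""'):
        problems.append("doubled quotes")
    text = value.strip('"')
    if text.split(" ", 1)[0] != "v=spf1":
        problems.append("does not start with v=spf1")
    if len(text) > 255:
        problems.append("over 255 characters, so it must be split into several strings")
    lookups = 0
    for term in text.split()[1:]:
        name = term.lstrip("+-~?")          # drop the qualifier, if any
        for sep in (":", "/", "="):         # drop the argument, if any
            name = name.split(sep, 1)[0]
        if name.lower() in LOOKUP_TERMS:
            lookups += 1
    if lookups > 10:
        problems.append("%d DNS lookups, limit is 10" % lookups)
    return problems


print(lint_spf('""v=spf1 mx include:_spf.example.com -all""'))
```

The domain _spf.example.com is a placeholder. An empty list means none of these three checks fired, not that the record is correct.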

openspf.org: Common mistakes when creating an SPF record

Posted in Open Source, Tech | Leave a comment

Upgrading Percona Server 5.5 to 5.6 on CentOS

I like using Percona Server for some projects because you get to see what their clients feel is important for operating MySQL at scale, as reflected in the features that Percona adds. Some examples are fast InnoDB log replay and transportable InnoDB tablespaces.

A drawback of Percona Server is that they do limited QA on packaging, so I find that the grant tables get in a bad state after using yum update a few times. So I recommend periodically doing a fresh install.

Here are the steps I used for upgrading from 5.5 to 5.6 on CentOS this weekend. It’s helpful as a checklist for non-DBAs, and as a pre-flight for DBAs so they will know what to expect in advance.

(Nearly all of this applies to upgrading to Percona XtraDB Cluster (Galera) as well. Just change the package names and start the first server with the pxc command.)

  1. read changelog and decide if your app will work with 5.6. For example, timestamp formats have changed since 5.5.
  2. stop apps and monitoring.
  3. backup old databases with mysqldump and capture the old grant commands.

    To mysqldump the MySQL grant tables, you may need the --skip-lock-tables option:

    # mysqldump -h host -u root --skip-lock-tables -p mysql >grants.sql
    
  4. remove Percona packages:
    # yum remove Percona-Server-client-55 Percona-Server-server-55 \
    Percona-Server-shared-55
    
  5. You must rename my.cnf so that yum install will generate the mysql grant tables:
    # mv /etc/my.cnf /etc/my.cnf.old;
    # rm -fr /var/lib/mysql
    # also, comment out deprecated options like table_cache or
    #    the new server will not start
    
  6. Find and remove any files that should have been removed:
    # find / -name Percona
    # find / -name mysql
    
  7. Install new packages:
    # yum install Percona-Server-client-56 Percona-Server-server-56 \
    Percona-Server-shared-56
    
  8. Can move my.cnf back now:
    # mv /etc/my.cnf.old /etc/my.cnf
    
  9. Start the server:
    # service mysql start # if it doesn't start, read mysqld.err
    
  10. if the grant tables were not created, you can do this:
    # chown mysql:mysql /var/lib/mysql
    # chgrp mysql /var/lib/mysql
    # mysql_install_db --user=mysql --ldata=/var/lib/mysql
    
  11. if you use replication, add grant on master now:

    mysql> grant replication slave on *.* to 'repl'@'slave-ip' identified by 'pw';
    mysql> flush logs;
  12. Post-install commands which only need to be run once on the master:
    mysql -e "CREATE FUNCTION fnv1a_64 RETURNS INTEGER SONAME 'libfnv1a_udf.so'"
    mysql -e "CREATE FUNCTION fnv_64 RETURNS INTEGER SONAME 'libfnv_udf.so'"
    mysql -e "CREATE FUNCTION murmur_hash RETURNS INTEGER SONAME 'libmurmur_udf.so'"
    
  13. If you use replication, on the slave:
    mysql> change master to master_host='master-ip',
       master_user='repl',
       master_password='pw',
       master_log_file='my-binlog-prefix.000002',
       master_log_pos=4;
    mysql> start slave;
    mysql> show slave status\G
    
  14. restore your database backups (not the mysql database tables, as their format changes over time. Use GRANT statements instead.)
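
Step 5 warns that a leftover deprecated option will stop the new server from starting. A small pre-flight scan of the saved my.cnf can find such lines before the first start. This is a sketch only: find_options is a hypothetical helper, and the only option name taken from the steps above is table_cache, so any other names passed in are your own assumptions to check against the 5.6 changelog.

```python
# Hypothetical pre-flight check: list my.cnf lines that set options the new
# server may reject. The option names are supplied by the caller; only
# table_cache comes from the upgrade steps.

def find_options(cnf_text, names):
    """Return (line_number, line) pairs for settings whose name is in names."""
    # mysqld accepts dashes and underscores interchangeably in option names
    wanted = set(name.replace("-", "_") for name in names)
    hits = []
    for number, line in enumerate(cnf_text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped[0] in "#;[":
            continue  # skip blanks, comments and [section] headers
        key = stripped.split("=", 1)[0].strip().replace("-", "_")
        if key in wanted:
            hits.append((number, stripped))
    return hits


sample = "[mysqld]\ntable_cache = 512\n# table_cache = 64\nmax_connections = 100\n"
for number, line in find_options(sample, ["table_cache"]):
    print("line %d: %s" % (number, line))
```

To use it on a real file, pass the contents of /etc/my.cnf.old as cnf_text.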
Posted in Linux, MySQL, Open Source, Oracle, Tech | Leave a comment

ArXiv Paper on Using Additive Noise to Determine Cause and Effect

One doesn’t see many scientific papers that are immediately useful, but this one qualifies:

arXiv.org: Distinguishing Cause From Effect Using Observational Data: Methods And Benchmarks
Physics arXiv Blog: Cause And Effect: The Revolutionary New Statistical Test That Can Tease Them Apart

“They say the additive noise model is up to 80 per cent accurate in correctly determining cause-and-effect” … “in the very simple situation in which one variable causes the other.”

Posted in Tech | Leave a comment

HAProxy and SSL SNI Support

The HAProxy 1.5 branch has SSL support built-in, so you don’t need stunnel or other SSL-termination helpers now.

I tested SSL Server Name Indication (SNI) functionality with HAProxy 1.5.10, OpenSSL 1.0.2 and two SSL certificates (GeoTrust from Namecheap.com) on 3 Dell 1950 servers and it worked fine for me. HAProxy ran on one server and the others ran Apache HTTPD using virtual hosts for each domain being load balanced.

SNI lets you use one IP address with multiple SSL certificates. For each site, you just create a single PEM file with key, crt and chain entries, in that exact order. Using SNI reduces the number of IP addresses you need, and also avoids having a separate stunnel process for each SSL certificate.

SNI works fine with most desktop browsers since 2003, but not IE8 or older on Windows XP. Also, custom client applications and embedded devices that use SSL may be confused by SNI. I noticed that the Nagios plugin cannot see the second certificate, even with -H hostname specified.
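
When a monitoring plugin cannot see the second certificate, it helps to check by hand which certificate the proxy returns for a given SNI name. The Python 3 sketch below uses only the standard ssl and socket modules. The function names are made up for illustration, and it assumes the certificates chain to a CA in the system trust store, since create_default_context() verifies the chain and the hostname.

```python
import socket
import ssl


def common_name(cert):
    """Extract commonName from the dict returned by SSLSocket.getpeercert()."""
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def served_common_name(address, sni_name, port=443, timeout=5.0):
    """Connect to address, send sni_name in the TLS handshake, return the cert CN."""
    context = ssl.create_default_context()
    with socket.create_connection((address, port), timeout=timeout) as sock:
        # server_hostname is what puts the SNI extension into the ClientHello
        with context.wrap_socket(sock, server_hostname=sni_name) as tls:
            return common_name(tls.getpeercert())
```

Calling served_common_name() once per domain against the load balancer address should return a different name each time. If the proxy serves the wrong certificate, the handshake should fail with a verification error instead of returning.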

For GeoTrust certs for Apache+OpenSSL as of Feb. 15 2015, the correct installation of the 4 certificates is:

cat server.key server.crt rapidssl_cabundle.crt >server.pem

-----BEGIN RSA PRIVATE KEY-----
-----END RSA PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
INTERMEDIATE CA:
---------------------------------------
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----

Troubleshooting:

  1. note that haproxy prints a general error message of “unable to load SSL private key from PEM file”, regardless of whether it’s a missing filename, incorrect file permissions or incorrectly formatted certificates, so check the filename and permissions first.
  2. ensure there are no malformed header (dashed) lines and delete blank lines
  3. OpenSSL certs are in PEM format by default, so there’s no need to convert them. (Usually it’s Windows users who have to do PEM conversion.)
  4. After haproxy starts, it’s important to verify the certificate chain. Use sslchecker.com and use the Chain Details button to see the intermediate and root certificate names and dates.
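
Since haproxy gives the same error for every PEM problem, the file-level checks in items 1 and 2 can be run separately before starting it. The sketch below is a hypothetical helper (check_pem is not an haproxy or OpenSSL tool) that looks only at block order, blank lines and broken BEGIN/END lines. It does not parse the key or certificates themselves, and it ignores comment lines between blocks.

```python
import re

BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----$")
END = re.compile(r"^-----END ([A-Z0-9 ]+)-----$")


def check_pem(text):
    """Return a list of problems in a combined key + certificate PEM file."""
    problems = []
    labels = []      # block types in file order
    current = None   # label of the block we are inside, if any
    for number, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            problems.append("line %d: blank line" % number)
            continue
        begin, end = BEGIN.match(line), END.match(line)
        if begin:
            current = begin.group(1)
            labels.append(current)
        elif end:
            if end.group(1) != current:
                problems.append("line %d: END does not match BEGIN" % number)
            current = None
        elif line.startswith("-") and ("BEGIN" in line or "END" in line):
            problems.append("line %d: malformed header line" % number)
    if current is not None:
        problems.append("last block is never closed")
    # expected order: private key first, then server cert and chain certs
    if not labels or not labels[0].endswith("PRIVATE KEY"):
        problems.append("private key must be the first block")
    if len(labels) < 2 or any(label != "CERTIFICATE" for label in labels[1:]):
        problems.append("expected only certificates after the key")
    return problems
```

Pass it the contents of server.pem. An empty list only means the layout looks right, so the chain still needs to be verified as in item 4.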

A new section in haproxy.cfg is needed to listen on port 443:

frontend https-in
    bind *:443 ssl crt /etc/ssl/server1.pem crt /etc/ssl/server2.pem
    reqadd X-Forwarded-Proto:\ https
    default_backend application-backend

For CentOS 5 users, SNI requires you to build haproxy from source with a newer version of OpenSSL statically. The README tells you how to do that. Use the latest version of OpenSSL to avoid errors about missing function names.

cd openssl-1.0.2
export STATICLIBSSL=/tmp/staticlibssl
make clean
./config --prefix=$STATICLIBSSL no-shared
make && make test && make install
cd ../haproxy-1.5*
make clean
make TARGET=linux26 USE_OPENSSL=1 SSL_INC=$STATICLIBSSL/include SSL_LIB=$STATICLIBSSL/lib ADDLIB=-ldl
service haproxy stop
make install
service haproxy start

For those upgrading from previous versions of haproxy, old .cfg files should still work, but warnings are emitted for timeout settings, as they have been renamed in 1.5:

service haproxy start
[...]
   | While not properly invalid, you will certainly encounter various problems
   | with such a configuration. To fix this, please ensure that all following
   | timeouts are set to a non-zero value: 'client', 'connect', 'server'.
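
If the old .cfg still uses the legacy one-word timeout keywords, they can be rewritten mechanically. The sketch below assumes the file uses clitimeout, contimeout and srvtimeout, which map to timeout client, timeout connect and timeout server. The function name is made up, and the result should be reviewed before it replaces the live config.

```python
import re

# Legacy keyword -> current spelling. Values keep their meaning
# (milliseconds unless a unit suffix is given).
RENAMES = {
    "clitimeout": "timeout client",
    "contimeout": "timeout connect",
    "srvtimeout": "timeout server",
}

LEGACY = re.compile(r"^([ \t]*)(clitimeout|contimeout|srvtimeout)\b", re.MULTILINE)


def modernize_timeouts(cfg_text):
    """Return cfg_text with legacy timeout keywords replaced, indentation kept."""
    return LEGACY.sub(lambda m: m.group(1) + RENAMES[m.group(2)], cfg_text)


old = "defaults\n    clitimeout 50000\n    contimeout 5000\n    srvtimeout 50000\n"
print(modernize_timeouts(old))
```

Lines that already use the timeout keyword are left unchanged.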

1.5 has only been GA since June 2014, so ensure you test it adequately for your requirements and keep an eye on the changelog.

SO: Configure multiple SSL certificates in Haproxy
HAProxy and SNI-based SSL offloading with intermediate CA
blog.haproxy.com: Enhanced SSL load-balancing with Server Name Indication (SNI) TLS extension
blog.haproxy.com: How to get SSL with HAProxy getting rid of stunnel, stud, nginx or pound

sslmate.com: Buy SSL certs from the command line
How exactly does AES-NI work?

Posted in Linux, Open Source, Tech | Leave a comment

Top Utility for Cassandra Clusters – cass_top

DataStax’s OpsCenter is pretty, but sometimes you don’t want to chop holes in your firewall for the server and agents.

So here’s cass_top. It works like top, but colorizes the output of nodetool status. It also lets you build nodetool commands using menus, run and log the output.

What’s especially nice is that it uses bash (no python required), and uses minimal screen real estate, so you can view all your clusters on one monitor using eterms.

$ cass_top 10.0.1.140


cass_top Screenshot
cass_top Help Screenshot

Please leave a comment with your suggestions.

github: Cassandra Top cass_top

Posted in Cassandra, Linux, Storage, Tech, Toys | Leave a comment

MariaDB Patch: CREATE [[NO] FORCE] VIEW Options

Below is my patch that implements the CREATE [[NO] FORCE] VIEW options against MySQL/MariaDB 10.1.0.

It adds two new options that look like this:

  1. CREATE NO FORCE VIEW v1 AS SELECT * FROM TABLE1; — base TABLE1 must exist, as before
  2. CREATE FORCE VIEW v1 AS SELECT * FROM TABLE1; — base TABLE1 doesn’t need to exist

Notes:

  • these options follow the Oracle Enterprise options fairly closely. NO FORCE works like the old default – a user needs database, table, column access and CREATE VIEW grant to create a view (more or less). FORCE allows a user to create a view with only database access and CREATE VIEW grant and no underlying base table. At SELECT time, full access control and grant checking is performed, and an error will occur if those constraints are not met.
  • views are more complicated than one would expect, and can be composed of base tables, derived tables, INFORMATION_SCHEMA (IS), and other views. The only table object not allowed is a temporary table
  • CREATE FORCE VIEW is an important option when managing large sets of views when you don’t want to track the creation sequence, or when creating views via program. An example is mysqldump, which can be simplified by replacing the current temporary tables ordering workarounds with FORCE VIEW.
  • It’s a fairly solid patch. I think the best thing is to commit it to alpha and let it bake for a while.
  • One permutation that will need special handling is this: CREATE FORCE VIEW view1 AS SELECT * FROM table1; Since * is not resolved to column names by FORCE, currently “ AS SELECT * AS ” is generated, causing an error. So just use explicit column names like CREATE FORCE VIEW view1 AS SELECT id, col1, col2 FROM table1; See this bug.
  • it passes t/view.test:
    # ./mysql-test-run.pl view
    Logging: ./mysql-test-run.pl  view
    vardir: /usr/local/mariadb-10.1.0/mysql-test/var
    MariaDB Version 10.1.0-MariaDB-debug
    
    TEST                                  RESULT   TIME (ms) or COMMENT
    -------------------------------------------------------------------
    main.view                            [ pass ]   1896
    -------------------------------------------------------------------
    The servers were restarted 0 times
    Spent 1.896 of 7 seconds executing testcases
    Completed: All 1 tests were successful.
    
  • I wrote tests/view.pl which does 8,000+ test permutations. It passes. :)
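
To make the "creation sequence" point concrete: without FORCE, any tool that creates many views has to order them so each view comes after the views it selects from. The Python sketch below shows that bookkeeping, which FORCE would make unnecessary. The function and view names are hypothetical, and the dependency map is assumed to be known in advance.

```python
def creation_order(deps):
    """Order views so each one is created after the views it selects from.

    deps maps a view name to the names it selects from. Names that are not
    keys of deps are treated as base tables that already exist.
    """
    remaining = dict((view, set(d) & set(deps)) for view, d in deps.items())
    order = []
    while remaining:
        # views whose view-dependencies have all been created already
        ready = sorted(v for v, d in remaining.items() if not d)
        if not ready:
            raise ValueError("circular view dependency: %s" % sorted(remaining))
        for view in ready:
            order.append(view)
            del remaining[view]
        for d in remaining.values():
            d.difference_update(ready)
    return order


views = {"v3": ["v2", "table1"], "v2": ["v1"], "v1": ["table1"]}
print(creation_order(views))
```

With CREATE FORCE VIEW the three statements could be issued in any order, and the dependencies would only be checked at SELECT time.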

$ cat create_force_view.patch

--- ../mariadb-10.1.0/sql/sql_view.h 2014-06-27 04:50:36.000000000 -0700
+++ sql/sql_view.h 2014-09-02 02:35:42.000000000 -0700
@@ -29,10 +29,10 @@
/* Function declarations */

bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
- enum_view_create_mode mode);
+ enum_view_create_mode mode, enum_view_create_force force);

bool mysql_create_view(THD *thd, TABLE_LIST *view,
- enum_view_create_mode mode);
+ enum_view_create_mode mode, enum_view_create_force force);

bool mysql_make_view(THD *thd, File_parser *parser, TABLE_LIST *table,
uint flags);
--- ../mariadb-10.1.0/sql/sql_lex.h 2014-06-27 04:50:33.000000000 -0700
+++ sql/sql_lex.h 2014-09-02 01:21:10.000000000 -0700
@@ -170,6 +170,12 @@
VIEW_CREATE_OR_REPLACE // check only that there are not such table
};

+enum enum_view_create_force
+{
+ VIEW_CREATE_NO_FORCE, // default - check that there are not such VIEW/table
+ VIEW_CREATE_FORCE, // check that there are not such VIEW/table, then ignore table object dependencies
+};
+
enum enum_drop_mode
{
DROP_DEFAULT, // mode is not specified
@@ -2442,6 +2448,7 @@
};
enum enum_var_type option_type;
enum enum_view_create_mode create_view_mode;
+ enum enum_view_create_force create_view_force;
enum enum_drop_mode drop_mode;

uint profile_query_id;
--- ../mariadb-10.1.0/sql/sql_parse.cc 2014-06-27 04:50:34.000000000 -0700
+++ sql/sql_parse.cc 2014-09-02 02:34:31.000000000 -0700
@@ -4943,7 +4943,7 @@
Note: SQLCOM_CREATE_VIEW also handles 'ALTER VIEW' commands
as specified through the thd->lex->create_view_mode flag.
*/
- res= mysql_create_view(thd, first_table, thd->lex->create_view_mode);
+ res= mysql_create_view(thd, first_table, thd->lex->create_view_mode, thd->lex->create_view_force);
break;
}
case SQLCOM_DROP_VIEW:
--- ../mariadb-10.1.0/sql/sql_yacc.yy 2014-06-27 04:50:37.000000000 -0700
+++ sql/sql_yacc.yy 2014-09-05 17:19:29.000000000 -0700
@@ -1851,7 +1851,7 @@
statement sp_suid
sp_c_chistics sp_a_chistics sp_chistic sp_c_chistic xa
opt_field_or_var_spec fields_or_vars opt_load_data_set_spec
- view_algorithm view_or_trigger_or_sp_or_event
+ view_algorithm view_or_trigger_or_sp_or_event view_force_option
definer_tail no_definer_tail
view_suid view_tail view_list_opt view_list view_select
view_check_option trigger_tail sp_tail sf_tail udf_tail event_tail
@@ -2446,6 +2446,7 @@
VIEW_CREATE_OR_REPLACE);
Lex->create_view_algorithm= DTYPE_ALGORITHM_UNDEFINED;
Lex->create_view_suid= TRUE;
+ Lex->create_view_force= VIEW_CREATE_NO_FORCE; /* initialize just in case */
}
view_or_trigger_or_sp_or_event
{
@@ -15887,6 +15888,15 @@
| event_tail
;

+view_force_option:
+ /* empty */ /* 411 - is there a cleaner way of initializing here? */
+ { Lex->create_view_force = VIEW_CREATE_NO_FORCE; }
+ | NO_SYM FORCE_SYM
+ { Lex->create_view_force = VIEW_CREATE_NO_FORCE; }
+ | FORCE_SYM
+ { Lex->create_view_force = VIEW_CREATE_FORCE; }
+ ;
+
/**************************************************************************

DEFINER clause support.
@@ -15944,7 +15954,7 @@
;

view_tail:
- view_suid VIEW_SYM table_ident
+ view_suid view_force_option VIEW_SYM table_ident
{
LEX *lex= thd->lex;
lex->sql_command= SQLCOM_CREATE_VIEW;
--- ../mariadb-10.1.0/sql/sql_view.cc 2014-06-27 04:50:36.000000000 -0700
+++ sql/sql_view.cc 2014-09-05 19:33:58.000000000 -0700
@@ -248,7 +248,7 @@
*/

bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
- enum_view_create_mode mode)
+ enum_view_create_mode mode, enum_view_create_force force)
{
LEX *lex= thd->lex;
/* first table in list is target VIEW name => cut off it */
@@ -259,7 +259,7 @@
DBUG_ENTER("create_view_precheck");

/*
- Privilege check for view creation:
+ Privilege check for view creation with default (NO FORCE):
- user has CREATE VIEW privilege on view table
- user has DROP privilege in case of ALTER VIEW or CREATE OR REPLACE
VIEW
@@ -272,6 +272,7 @@
checked that we have not more privileges on correspondent column of view
table (i.e. user will not get some privileges by view creation)
*/
+
if ((check_access(thd, CREATE_VIEW_ACL, view->db,
&view->grant.privilege,
&view->grant.m_internal,
@@ -285,6 +286,11 @@
check_grant(thd, DROP_ACL, view, FALSE, 1, FALSE))))
goto err;

+ if (force) {
+ res = false;
+ DBUG_RETURN(res || thd->is_error());
+ }
+
for (sl= select_lex; sl; sl= sl->next_select())
{
for (tbl= sl->get_table_list(); tbl; tbl= tbl->next_local)
@@ -369,7 +375,7 @@
#else

bool create_view_precheck(THD *thd, TABLE_LIST *tables, TABLE_LIST *view,
- enum_view_create_mode mode)
+ enum_view_create_mode mode, enum_view_create_force force)
{
return FALSE;
}
@@ -391,7 +397,7 @@
*/

bool mysql_create_view(THD *thd, TABLE_LIST *views,
- enum_view_create_mode mode)
+ enum_view_create_mode mode, enum_view_create_force force)
{
LEX *lex= thd->lex;
bool link_to_local;
@@ -425,14 +431,13 @@
goto err;
}

- if ((res= create_view_precheck(thd, tables, view, mode)))
+ if (res= create_view_precheck(thd, tables, view, mode, force))
goto err;

lex->link_first_table_back(view, link_to_local);
view->open_type= OT_BASE_ONLY;

- if (open_temporary_tables(thd, lex->query_tables) ||
- open_and_lock_tables(thd, lex->query_tables, TRUE, 0))
+ if (open_temporary_tables(thd, lex->query_tables) || (!force && open_and_lock_tables(thd, lex->query_tables, TRUE, 0)))
{
view= lex->unlink_first_table(&link_to_local);
res= TRUE;
@@ -513,6 +518,7 @@
}
}

+if (!force) {
/* prepare select to resolve all fields */
lex->context_analysis_only|= CONTEXT_ANALYSIS_ONLY_VIEW;
if (unit->prepare(thd, 0, 0))
@@ -612,6 +618,7 @@
}
}
#endif
+}

res= mysql_register_view(thd, view, mode);

@@ -621,7 +628,7 @@
meta-data changes after ALTER VIEW.
*/

- if (!res)
+ // if (!res)
+ if (!res && !force) /* 411 - solves segfault problems with CREATE FORCE VIEW option sometimes */
tdc_remove_table(thd, TDC_RT_REMOVE_ALL, view->db, view->table_name, false);

if (mysql_bin_log.is_open())
@@ -908,6 +915,8 @@
fn_format(path_buff, file.str, dir.str, "", MY_UNPACK_FILENAME);
path.length= strlen(path_buff);

if (ha_table_exists(thd, view->db, view->table_name, NULL))
{
if (mode == VIEW_CREATE_NEW)
--- ../mariadb-10.1.0/mysql-test/t/view.test 2014-06-27 04:50:30.000000000 -0700
+++ mysql-test/t/view.test 2014-09-06 00:23:32.000000000 -0700
@@ -5263,4 +5263,17 @@
--echo # -----------------------------------------------------------------
--echo # -- End of 10.0 tests.
--echo # -----------------------------------------------------------------
+
+create no force view v1 as select 1;
+drop view if exists v1;
+
+create force view v1 as select 1;
+drop view if exists v1;
+
+create force view v1 as select * from missing_base_table;
+drop view if exists v1;
+
+--echo # -----------------------------------------------------------------
+--echo # -- End of 10.1 tests.
+--echo # -----------------------------------------------------------------
SET optimizer_switch=@save_optimizer_switch;

Posted in API Programming, Linux, MySQL, Open Source, Oracle, Storage, Tech | Leave a comment

Installing Datastax Cassandra and Python Driver on CentOS 5


Cassandra can run on CentOS 5.x, but there is no yum repo support.

If you can’t upgrade Linux distros, here’s how to install Datastax Cassandra Community Edition and the Python Cassandra driver on CentOS 5.x.

It’s not difficult, but there are several steps, including updating Java.

(The following steps would make a complete chef or puppet recipe for a non-SSL install with vnodes.)


# setup environment
groupadd -g 602 cassandra
useradd -u 602 -g cassandra -m -s /sbin/nologin cassandra
mkdir /var/lib/cassandra /var/log/cassandra /var/run/cassandra
touch /var/log/cassandra/system.log
chown -R cassandra:cassandra /var/lib/cassandra /var/log/cassandra /var/run/cassandra
mkdir -p /opt && cd /opt


cat >> /etc/security/limits.conf <<EOD
cassandra soft memlock unlimited
cassandra hard memlock unlimited
cassandra soft nofile 8192
cassandra hard nofile 10240
EOD


# upgrade java
yum remove java
# download, then install JDK 7.x from oracle.com
rpm -Uvh jdk-7u67-linux-x64.rpm
# download, then install recent jna.jar from https://github.com/twall/jna
mv jna.jar /usr/share/java
# run this ln after the /opt/cassandra symlink is created in the "get Datastax DCE" step below
ln -s /usr/share/java/jna.jar /opt/cassandra/lib/
# update environment variables
cat >> /etc/profile <<"EOD"
export JAVA_HOME=/usr/java/default
export JRE_HOME=/usr/java/default/jre
export CASSANDRA_HOME=/opt/cassandra
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$CASSANDRA_HOME/bin
EOD


# get Datastax DCE
curl -L http://downloads.datastax.com/community/dsc.tar.gz >dsc-cassandra-2.0.9.tar.gz
tar zxvf - < dsc-cassandra-2.0.9.tar.gz
ln -s /opt/dsc-cassandra-2.0.9 /opt/cassandra
chown -R root:root /opt/cassandra/
bash cassandra/switch_snappy 1.0.4

# open cassandra firewall ports if necessary (not needed if using internal interface on most servers)
vi /etc/sysconfig/iptables
-A INPUT -i eth0 -m state --state NEW -m multiport -p tcp --dport 7000,7199,9042,9160 -j ACCEPT
service iptables restart
# configure /opt/cassandra/conf/cassandra.yaml before starting the server
# (at least listen_address, rpc_address, seeds and tokens).
# If you need a do-over, clean the cassandra data with: rm -fr /var/lib/cassandra/*

# download startup script:
wget http://jebriggs.com/php/start_cassandra.txt -O /etc/init.d/cassandra
chown root:root /etc/init.d/cassandra
chmod 755 /etc/init.d/cassandra
chkconfig --add cassandra

# start cassandra server (if it is standalone, or a seed server. otherwise start after the seed servers):
service cassandra start

# cat /etc/redhat-release 
CentOS release 5.10 (Final)

[root@www1 conf]# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns   Host ID                               Rack
UN  10.0.1.2  71.87 KB   256     66.8%  8302c6d5-4c88-4695-bbf4-762bc7f24544  rack1
UN  10.0.1.3  136.63 KB  256     69.9%  eddb03b2-98d3-46ff-be63-95435414a883  rack1
UN  10.0.1.4  100.08 KB  256     63.3%  2a8dde5e-29b0-4a67-8204-40769376c44a  rack1
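
For a scripted check of that output, the node lines can be picked out by their two-letter state column. This is a minimal Python sketch with a made-up function name. It assumes the column layout shown above, and in practice the text would come from running nodetool status.

```python
def parse_status(text):
    """Return (address, state) pairs from `nodetool status` output."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        state = fields[0]
        # state is two letters: Up/Down, then Normal/Leaving/Joining/Moving
        if len(state) == 2 and state[0] in "UD" and state[1] in "NLJM":
            nodes.append((fields[1], state))
    return nodes


sample = """Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns   Host ID                               Rack
UN  10.0.1.2  71.87 KB   256     66.8%  8302c6d5-4c88-4695-bbf4-762bc7f24544  rack1
UN  10.0.1.3  136.63 KB  256     69.9%  eddb03b2-98d3-46ff-be63-95435414a883  rack1
UN  10.0.1.4  100.08 KB  256     63.3%  2a8dde5e-29b0-4a67-8204-40769376c44a  rack1
"""
nodes = parse_status(sample)
unhealthy = [address for address, state in nodes if state != "UN"]
print(len(nodes), unhealthy)
```
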

If you only see the node on localhost, then you have a problem:

  • read and fix any errors in /var/log/cassandra/system.log until there are zero errors. snappy-related errors are from /tmp being noexec or not running the switch_snappy 1.0.4 command above.
  • disable iptables firewall, test and reenable later
  • in log4j-server.properties, increase log4j.rootLogger to DEBUG
  • if you have multiple NICs, JMX (ie. nodetool) can bind to the wrong interface. You likely need to set the -Djava.rmi.server.hostname=[address] option in cassandra-env.sh to the address you want to listen on
  • public/private IP address problems in AWS EC2. You may need to set broadcast_address: [public_ec2_address]
  • normally rmiregistry is not needed unless you have some atypical firewalling or routing (NAT.)
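
Before disabling the firewall, it can be quicker to test from another node whether the Cassandra ports accept connections at all. The Python 3 sketch below tries a plain TCP connect to each port from the iptables rule above. closed_ports is a hypothetical helper, and a successful connect only shows the port is reachable, not that Cassandra is healthy.

```python
import socket

CASSANDRA_PORTS = (7000, 7199, 9042, 9160)  # same ports as the iptables rule


def closed_ports(host, ports=CASSANDRA_PORTS, timeout=2.0):
    """Return the ports on host that refuse or time out a TCP connection."""
    closed = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass  # connected, so the port is reachable
        except OSError:
            closed.append(port)
    return closed
```

For example, closed_ports("10.0.1.2") run from 10.0.1.3 should return an empty list once the firewall and listen_address are right.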

Datastax Opscenter 5.0

You can install the binary from yum or tarball, but the important things to know are:

  • the monitoring agent will be installed on each cassandra node and uses port 61621. The init script is called datastax-agent.
  • the UI only needs to be installed once, but needs ports 61620, and 8888 for HTTP.
  • to allow Opscenter to remotely manage nodes with ssh, remove old ssh entries from .ssh/known_hosts first, connect manually to each node, then Opscenter should be happy
  • by default, Opscenter listens for agents on 0.0.0.0, phones home to Datastax.com each day, and does not require web authentication, so you likely want to change those.

Python also needs to be upgraded if you want to use cqlsh or the python client cassandra driver.


# install python 2.6 and dependencies
yum install gcc python26 python26-devel libev libev-devel


# install python's pip module
curl --silent --show-error --retry 5 https://bootstrap.pypa.io/get-pip.py | python26


# install cassandra driver for python
pip install cassandra-driver


# install blist.py
tar zxvf - < blist-1.3.6.tar.gz
cd blist-1.3.6
python26 setup.py install
cd ..

# cluster.py - test installation

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])

def dump(obj):
    # print every readable attribute to confirm the driver imports and constructs
    for attr in dir(obj):
        if hasattr(obj, attr):
            print("obj.%s = %s" % (attr, getattr(obj, attr)))

dump(cluster)

# python26 cluster.py

obj.__class__ = <class 'cassandra.cluster.Cluster'>
[...]

Troubleshooting connection problems in JConsole
datastax.com: Storing OpsCenter Data in a Separate Cluster

Posted in Cassandra, Cloud, Linux, Open Source, Tech | Leave a comment

MySQL 5.6 Views and Stored Procedures Tips

I recently tuned an existing application that used dozens of views and hundreds of stored procedures using MySQL 5.6.

There seem to be three attitudes towards using views and stored procedures (SPs) with MySQL:

  1. don’t use them at all to increase portability
  2. just use SPs to reduce network traffic in large reporting queries (my choice)
  3. go crazy and use them everywhere like old-school Oracle Enterprise apps.

Here are some notes on using views:

  • before creating views, review your schema to ensure keys have matching types and charsets for good performance. It’s much easier to spot schema problems in a text listing than to guess why a view is slower than expected at execution time. (This is doubly true for MySQL Cluster.)
  • MySQL currently doesn’t have CREATE FORCE VIEW, although MariaDB 10.1.0 alpha has my patch. The FORCE option will greatly simplify view administration and also mysqldump output, which creates temporary tables to ensure views can be created regardless of table/view ordering issues
  • When looking at the MariaDB source code, it’s apparent that some view options were never actually implemented, like RESTRICT/CASCADE
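
The schema review in the first note can be partly automated by comparing the type and charset of each pair of join keys. The sketch below assumes the column details have already been read from INFORMATION_SCHEMA.COLUMNS (COLUMN_TYPE and CHARACTER_SET_NAME). The helper name, tables and columns are hypothetical.

```python
def mismatched_keys(columns, joins):
    """Return join pairs whose key columns differ in type or charset.

    columns maps (table, column) to (column_type, charset).
    joins lists pairs of (table, column) keys that a view joins on.
    """
    problems = []
    for left, right in joins:
        if columns[left] != columns[right]:
            problems.append((left, right))
    return problems


# Hypothetical schema: the second join compares utf8 with latin1 keys.
columns = {
    ("orders", "customer_id"): ("int(11)", None),
    ("customers", "id"): ("int(11)", None),
    ("orders", "sku"): ("varchar(32)", "utf8"),
    ("products", "sku"): ("varchar(32)", "latin1"),
}
joins = [
    (("orders", "customer_id"), ("customers", "id")),
    (("orders", "sku"), ("products", "sku")),
]
for left, right in mismatched_keys(columns, joins):
    print("%s.%s vs %s.%s" % (left + right))
```
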

And some notes on stored procedures (SPs):

  • if a SP makes a stateful session change, like set sql_log_bin=0, ensure that isn’t going to be a problem later if an exception condition doesn’t reset it
  • after running a SP, SHOW PROFILES will list all the queries executed with performance statistics
  • SPs that do non-essential SELECTs or INFORMATION_SCHEMA queries probably need to be reviewed by a DBA for fundamental problems like non-atomic “reading before writing”
  • MySQL compiles SPs again for each thread.
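
The first note, about session state that is never reset, has a client-side counterpart: wrap the change so it is undone even when the call fails. Below is a minimal Python sketch, assuming a DB-API style cursor with an execute() method. binlog_disabled is a made-up name, and it assumes logging was enabled before the block started.

```python
from contextlib import contextmanager


@contextmanager
def binlog_disabled(cursor):
    """Turn off binary logging for this session, and always turn it back on."""
    cursor.execute("SET sql_log_bin = 0")
    try:
        yield cursor
    finally:
        # assumes logging was enabled before; runs even if the body raises
        cursor.execute("SET sql_log_bin = 1")
```

Anything run inside "with binlog_disabled(cursor):" is followed by the reset statement, whether it succeeds or raises.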

Both views and SPs are relatively new MySQL features, so budget some extra development and testing time when using them, especially with replication.

[MDEV-6365] CREATE VIEW Ignores RESTRICT/CASCADE Options
mysqlperformanceblog.com: Using MySQL triggers and views in Amazon RDS

Posted in MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

SVLUG: Devops and Release Canaries with Linux, CloudStack and MySQL Cluster

I gave a talk at the Silicon Valley Linux Users Group (SVLUG) tonight on “Devops and Release Canaries with Linux, CloudStack and MySQL Cluster.”

Thanks again to Symantec for hosting.

Posted in API Programming, Cloud, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

Velocity Conference Santa Clara 2014 Tips Game Cards

The O’Reilly Velocity Web Operations & Performance Conference is June 24-26 in Santa Clara.

Next to the messages/jobs board was a Web Ops & Performance Tips board:

– use source maps to debug compressed JS and CSS
– use ::before to optimize font rendering
– use local storage to persist markup and templates to reduce requests and payload
– avoid CSS block rendering in Chrome by not using screen media type until after. Then put screen back to element
– use gatling stress tool for load generation/perf testing (Apache License 2.0)
– learn curl
– learn POSIX before recreating another tool that already exists. Bill Joy (?)
– “if you do it more than twice a week, automate”
– it takes no skills to do NoOps! :)

Posted in Cloud, Conferences, Open Source, Tech | Leave a comment

AWS Pop-up Loft, San Francisco



Amazon Web Services pop-up loft (Ask an Architect area, lecture hall, kitchen/lounge)
Photo credit: Amazon.com.

I happened to be in SF today, so I went to the Amazon Web Services pop-up loft on Market St.

Amazon rented an empty storefront for 4 weeks for lecture sessions upstairs, and a computer lab and an ‘Ask an Architect’ bar downstairs.

One of the hosts said the loft was a shell in May, and they had to build out everything: the kitchen area, 2 bathrooms and various partitions.

I asked the experts about new EBS and RDS features, and they had answers as well as a $100 AWS credit.

The weather was sunny and warm in SF.

Lots of street performers and hustlers, including a very smooth male R&B singer. A young rapper named Rap2K15 was selling hand-made CDs.

Update 2014 06 23: Apparently a drawing was held, and I was one of 3 winners of a free general pass to the AWS re:Invent conference :)

Update 2014 06 24:

AWS Bootcamp

Full-day AWS overview, including EC2, S3, RDS, VPC and IAM, with 2 labs.

“Provisioning and Managing AWS Infrastructure with Chef” with special guest George Miranda, Chef Technical Consultant, Chef

George talked about using Chef tools like chef metal, knife and chef zero and a minimal amount of ruby to make an AMI and provision a MySQL server and 5 Nginx web servers.

Slides

@gmiranda23, chef-ami-factory

Update 2014 06 26:

Dealing With Obstacles at Scale, Bob Hagemann, Twilio

To reduce pain:

– UTC timezone
– UTF8
– use thin AMI and chef/puppet instead of thick AMI
– wrote boxconfig a few years ago (like netflix asgard)
– remote admin mainly
– small teams 3-8
– services should run in 3 AZs
– monitoring with nagios, cron, pingdom
– haproxy on each host as proxy
– MySQL, MHA, LVM. Manual failover.
– SQS DLQ
– global low latency with route53
– http://github.com/twilio
– @bobzilla42
– Uses freeswitch plus own telcom sw
– billing system 100s QPS
– Ops team is about 8 people
– VPNs to HQ and carrier-approved colo
– three founders, one came from Amazon.

925 Market Street, SF
June 4 – 27, 2014 (likely closed on the 27th for dismantling)
Free registration, tshirts and lunch. Closes 5:30 pm, 6:00 pm or 8:00 pm daily.
Muni 30 and 45 return from Market St. and 5th to Caltrain.

@AWSstartups #AWSloft

AWS Loft Returning in Fall 2014

Posted in API Programming, Business, Cloud, Conferences, Linux, MySQL, Open Source, Oracle, San Jose Bay Area, Tech | Leave a comment

Advanced Liquibase Techniques

I recently did some work with liquibase. Here are some techniques for advanced users to work around its limitations and take query cost into account.

Liquibase Introduction

Liquibase is an Open Source (Apache 2.0 License) Java utility and API for specifying and versioning schema changes (DDL) for several popular databases. It is commonly introduced to projects by programmers, rather than DBAs.

What liquibase can do:

  • allow “refactoring” of schema changes to target multiple databases, using either a database-independent XML syntax or raw SQL, depending on your preference
  • allow conditional execution and rollback of SQL based on database type or environment.

What liquibase can’t do:

  • has no built-in provisions for operational concerns, like conditionally executing SQL based on time/cost. There’s an assumption that schema changes are online, often true on Oracle and SQL Server, less so on MySQL, especially prior to 5.6 (unless you do micro-sharding)
  • does not do intelligent merges to the same object across changesets, like adding multiple columns to the same table in one statement.

How liquibase works:

  • the programmer specifies schema changes in Java, XML or JSON and runs the liquibase command
  • liquibase creates 2 tables in your database to store version, user and patch name information and to lock out other simultaneous liquibase runs.

How to Make Liquibase Consider Cost for MySQL

After some experimentation, I found a couple of liquibase features you can use to do more advanced things:

  1. create a savepoint using the tag and rollback options:
    • liquibase tag rel0; liquibase update …; liquibase rollback rel0
  2. prepend and append logic to each changeset that uses information_schema to profile the SQL DDL statement; on failure, exit with 1 (see the XML example below)

changeset.xml:

<?xml version="1.0" encoding="UTF-8"?>

<databaseChangeLog
  xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
       http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.1.xsd">

    <changeSet id="1" author="james">
      <sql>
        create table if not exists `profiling` ( `connection_id` int(11) not null default 0, `query_id` int(11) not null default '0', `state` varchar(40) default '', KEY (query_id));
        truncate table profiling;
        set profiling=1;

        alter table department add column test2 int default null;
        insert into profiling (connection_id, query_id, state) select connection_id(), query_id, state from information_schema.profiling where query_id=2;
      </sql>
      <rollback>
        <sql>alter table department drop column test2</sql>
      </rollback>
    </changeSet>

    <changeSet id="1-post" author="james">
      <preConditions onFail="HALT">
        <sqlCheck expectedResult="0">SELECT count(*) from profiling where state='copy to tmp table'</sqlCheck>
      </preConditions>
    </changeSet>
</databaseChangeLog>

Notes:

  1. the changeset DDL statement will still have run, even if the precondition HALTs – they’re separate changesets, after all
  2. the rollback in “1” will not be executed, even if “1-post” HALTs.

The workaround for those 2 issues is to combine the two techniques in a shell script:

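#!/bin/bash

# mark a savepoint before applying the changeset
liquibase tag rel0

# connection settings are assumed to come from liquibase.properties
liquibase --changeLogFile=changeset.xml update || {
    # the update failed or the "1-post" precondition HALTed:
    # roll back to the savepoint and drop the new column
    liquibase --changeLogFile=changeset.xml rollback rel0
    mysql -e 'alter table test.department drop column test2'
    # fail the build pipeline to not propagate changeset to next stage
    # (i.e. don't run in production)
    exit 1
}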

The above looks a little kludgy, but provides a stepping stone for the reader to customize in their particular environment. (The preConditions and bash script can be easily autogenerated with a Perl or Python script.)
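As a rough illustration of that autogeneration idea, here is a minimal Python sketch that emits the same pair of changesets for an arbitrary DDL statement. The function name `cost_checked_changelog` and its parameters are hypothetical, and the profiling statements (including the hard-coded `query_id=2`) are simply copied from changeset.xml above rather than verified for other statements.

```python
from xml.sax.saxutils import escape, quoteattr

# Changelog header copied from changeset.xml above.
HEADER = '''<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
  xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
       http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.1.xsd">
'''

# Profiling statements copied from changeset "1" above; {ddl} is the only variable part.
PROFILED_SQL = [
    "create table if not exists `profiling` ( `connection_id` int(11) not null default 0, "
    "`query_id` int(11) not null default '0', `state` varchar(40) default '', KEY (query_id));",
    "truncate table profiling;",
    "set profiling=1;",
    "{ddl};",
    "insert into profiling (connection_id, query_id, state) "
    "select connection_id(), query_id, state from information_schema.profiling where query_id=2;",
]


def cost_checked_changelog(change_id, author, ddl, rollback_ddl,
                           bad_state="copy to tmp table"):
    """Return changelog XML: a profiled DDL changeset plus its '-post' check."""
    # Substitute the DDL into the copied statement list (hypothetical helper logic).
    sql = "\n".join(line.replace("{ddl}", ddl.rstrip(";")) for line in PROFILED_SQL)
    check = "SELECT count(*) from profiling where state='%s'" % bad_state
    return (
        HEADER
        + "  <changeSet id=%s author=%s>\n" % (quoteattr(change_id), quoteattr(author))
        + "    <sql>\n%s\n    </sql>\n" % escape(sql)
        + "    <rollback>\n      <sql>%s</sql>\n    </rollback>\n" % escape(rollback_ddl)
        + "  </changeSet>\n"
        + "  <changeSet id=%s author=%s>\n"
        % (quoteattr(change_id + "-post"), quoteattr(author))
        + '    <preConditions onFail="HALT">\n'
        + '      <sqlCheck expectedResult="0">%s</sqlCheck>\n' % escape(check)
        + "    </preConditions>\n"
        + "  </changeSet>\n"
        + "</databaseChangeLog>\n"
    )


if __name__ == "__main__":
    print(cost_checked_changelog(
        "2", "james",
        "alter table department add column test3 int default null",
        "alter table department drop column test3",
    ))
```

Redirecting the output to a file gives a changelog that can be fed to the same tag/update/rollback script shown above. The matching manual `drop column` line in the bash script would still need to be generated separately.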

An alternative to XML is using the Java API to set everything up.

Please leave a comment if you have any suggestions or a Java API program.

Posted in API Programming, MySQL, MySQL Cluster, Open Source, Oracle, Tech | Leave a comment

Percona Live MySQL Conference Santa Clara 2014

The Percona Live MySQL Conference was held once again in Santa Clara from April 1-4, 2014.

Executive Summary:

  1. Percona hosted another excellent conference, with 1,150 attendees from 43 countries plus a vibrant exhibit hall.
  2. The overall themes that emerged this year were “What’s new in MySQL 5.6?” and “The rise of Galera Cluster.” Unfortunately, Oracle delivered the 5.6 features they promised, but didn’t bother to ask production DBAs what they really needed (ie. GTIDs require downtime to configure, and ALTER ONLINE doesn’t support throttling or background operation on slaves (SR 3-8856341908).)
  3. MySQL 5.7 is promising about double the performance of 5.6, but note that the 5.7 feature micro-benchmark effort hasn’t translated into a complete understanding of whole database performance yet.
  4. the current active branches are now: Oracle 5.6/5.7, MariaDB 10.0/10.1, Webscale SQL (Facebook, Google, LinkedIn, and Twitter), Facebook 5.6 with Deployable GTIDs, and Percona Server 5.6. (The version you want to migrate to is one based on MySQL 5.6.17 or later.)


Severalnines Booth
Severalnines.com booth. They create and support cluster and cloud database solutions. Photo credit: Steve Barker, SphinxSearch.com

Wednesday

Wed. Keynotes

Percona Live 2014 opening keynote with Percona CEO Peter Zaitsev
Robert Hodges – Getting Serious about MySQL and Hadoop at Continuent
(Continuent needs to pivot into another market as MySQL’s new built-in features displace their replication products.)
‘Raising the MySQL Bar’ with Oracle’s Tomas Ulin, VP of Engineering for MySQL, Oracle
Adventures in MySQL at Dropbox, Renjish Abraham

Wed. Talks

Online schema changes for maximizing uptime, David Turner, Dropbox, Ben Black, Tango

– MySQL 5.6 has online schema change capability, however there’s no way to throttle IO consumed during the operation and the single-threaded slave will lag
– David has tested the ALTER ONLINE in MySQL 5.6.17 and will use it when ported to Percona Server
– for now uses Percona Online Schema Change utility for its throttling feature.

Be the hero of the day with the InnoDB Data recovery tool, Marco “The Grinch” Tusa and Aleksandr Kuzminsky, Percona Services

– tools have been created by Percona to recover Innodb data if you don’t have backups and you’re out of business otherwise. Call them! :)

Galera Cluster New Features, Seppo Jaakola, Codership

– reviewed features in Galera Cluster versions 3 and 4
– looking good.

MySQL Cluster Performance Tuning, Johan Andersson, severalnines.com

– disable NUMA
– echo 0 > /proc/sys/vm/swappiness
– bind data node threads to CPUs
– cat /proc/interrupts
– ThreadConfig: LDM = cores/2, TC = LDM/4
– RealtimeScheduler=1
– NoOfFragmentLogParts=LDM
– tune the redo log: FragmentLogFileSize=256M, NoOfFragmentLogFiles=RedoBuffer=64M
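
The thread-count rules of thumb from these notes (LDM = cores/2, TC = LDM/4, fragment log parts = LDM) can be worked through with a small Python sketch. The function name `ndb_thread_plan`, the rounding down, and the floor of one thread are my assumptions, not something stated in the talk; check the ThreadConfig documentation for which LDM counts your MySQL Cluster version actually accepts.

```python
def ndb_thread_plan(cores):
    """Apply the notes' rules of thumb: LDM = cores/2, TC = LDM/4."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    ldm = max(1, cores // 2)   # assumption: round down, never below 1
    tc = max(1, ldm // 4)      # assumption: round down, never below 1
    # The notes also set the number of fragment log parts equal to LDM.
    return {"ldm": ldm, "tc": tc, "fragment_log_parts": ldm}


if __name__ == "__main__":
    for cores in (8, 16, 32):
        print(cores, ndb_thread_plan(cores))
```

For a 16-core data node this gives 8 LDM threads, 2 TC threads and 8 fragment log parts, leaving the remaining cores for the other thread types.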

Practical sysbench, Peter Boros, Percona

– prefers “latency” graph style with transparent dots vs. line charts
– uses R and ggplot2 for graphing
– attendees tried to guess SSD performance on Peter’s notebook for different block sizes, most were proven totally wrong by sysbench

Birds of a Feather (BoF) Sessions

“Meet MySQL Team (at Oracle)” BoF

– discussion again this year about parallel query execution (same as at MariaDB BoF last year), with Peter Zaitsev also bringing it up again
– discussion about raw partitions (belief is that they will be 20% more space-efficient and 30% faster, and avoid Linux’s endless limitations and bugs)
– internal “development roadmap” only extends about 12 months at a time, subject to customer demands
– I griped about FK panic/data loss issues in MySQL Cluster 7.3.3. Tomas Ulin, Vice President, MySQL Engineering, said that was news to him. (See SR 3-8717994851 and SR 3-87646727311)
– Mark Callaghan, Facebook, said he was working on MongoDB now, but requested named keys in flexible schema in MySQL.
– Peter Zaitsev, Percona, said several clients are using GTIDs and they seem to work.
– Oracle pleaded with users to drop MyISAM. I mentioned the main reason was that legacy systems used older compression methods, but InnoDB could be used since it has compression too
– The Oracle MySQL Fabric project is an attempt to counter MongoDB’s automatic slave promotion.

Thursday

Thursday Keynotes

‘9 Things You Need to Know…’, Peter Zaitsev, Percona
The Evolution of MySQL in the All-Flash Datacenter, Nisha Talagala, Fusion-IO
MySQL, Private Cloud Infrastructure and OpenStack, Sean Chighizola, Big Fish Games
Keynote Panel: The Future of Operating MySQL at Scale

Thu. Talks

Benchmarking Databases for Scale, Peter Boros and Kenny Gryp, Percona

Question: “What is Percona’s secret to professional benchmarks?”
Answer: “Benchmark absolutely everything multiple times, time permitting.”

MySQL 5.7: Performance & Scalability Benchmarks, Dimitri KRAVTCHUK

– comprehensive micro-benchmarking graphs of 5.7 to gain a deeper understanding of parts
– the challenge remains: how to tune the whole database to perform well?

Use Your MySQL Knowledge to Become an Instant Cassandra Guru, Robert Hodges, Continuent and Tim Callaghan, Tokutek

– good comparison of relational data modelling and C* data modelling, lots of similarities
– note that MariaDB has a Cassandra plugin

RDS for MYSQL, Tips, Patterns and Common Pitfalls, Laine Campbell, Blackbird (formerly PalominoDB)

Write Conflicts in Multi-Master Replication Topologies, Seppo Jaakola, Codership

– it’s good to see that Codership is paying attention to the details of replication

MySQL Community Awards

Shlomi has a comprehensive post on this year’s winners.

MySQL Lightning Talks (5 minutes each)

Truncating Sub Optimal DBA Verbal Responses Vectors, David Stokes (Oracle)

MySQL 5.6 Global Transaction IDs: Benefits and Limitations, Stephane Combaudon (Percona)

mysqlfailover
mysqlrpladmin

Zero database downtime using the Federated storage engine and Replication, prasad mani (BBC)

Scaling via adding a Table, Rick James (self)

Rick knows some clever ways to optimize solutions with MySQL. He’s doing consulting now, so contact him.

IPs
Lat/Long
Mysql.rjweb.org
Extra Table Saves the Day: Slides

No es ‘ano’, es ‘año’! A take on encoding in your DB, Ignacio Nin (Vivid Cortex)

What Not to Say to the MySQL DBA, Gillian Gunson (Blackbird (formerly PalominoDB))
“I’ll code around it.”
“Stop micro-optimizing.”
“Use passive master for QA.”
“MySQL is a toy database.”
“This conference is a support group.”

Hall of Shame, Shlomi Noach
Triple active-replication in gaming anecdote: don’t do that.

The bash slave-prefetch oneliner, Art van Scheppingen (Spil Games)

Unsung Relay Log, Vishnu Rao, FlipKart
Com_relaylog_dump for tungsten and mysql 5.5

Unique User Count — Rollup, Rick James (self)

Formula for user visit estimation by counting bits.

Logical Backups in the Cloud, Bill Karwin, Percona
Backups for PHP designers
PHP class Mysql/Dump

How to Squat, Kyle Redinger (VividCortex, Inc)

Iron DBA Replication Challenge, Attunity
Humor

Friday

Friday Keynotes

Percona CMO Terry Erisman opens the 3rd and final day of Percona Live 2014

Keynote: OpenStack Co-Opetition, A View from Within, Boris Renski, Mirantis and OpenStack Boardmember

– one of the best conference keynotes ever, and a great primer on Open Source marketing … up there with the O’Reilly Open Source Conference keynote on the importance of Android – before it shipped.

Friday Talks

Global Transaction ID at Facebook, Evan Elias, Santosh Banda and Yoshinori Matsunobu, Facebook

– just write your own MySQL branch if a feature is too hard to deploy :)

R for MySQL DBAs, Ryan Lowe and Randy Wigginton, Percona

– R has about 1,000 interesting sample databases (demos included diamonds and cars)
– good interface for quick graphing, not so great for complex programs
– Percona uses R and the ggplot2 graphing package for most of the graphs you see now.

MariaDB for Developers, Colin Charles, Chief Evangelist, MariaDB

Closing Prize Drawing

About 30 high-end gifts were handed out.

Some nice prizes contributed by exhibitors, including Nexus 7 tablets, $250 AWS gift certificates, SQLyog and Monyog licenses, and a quad drone!

Exhibits

The exhibits are one of my favorite things at the conference each year because of how strong the MySQL third-party community is.

Some notable absences were Clustrix and Violin memory, but those were offset by new exhibitors. Webyog was a sponsor but I didn’t see a booth. PalominoDB changed their name to Blackbird, and appear to be offering DevOps as well as DBA services.

And of course, as the organizers, Percona had a large, central spread. :)

Thanks to the sponsors and exhibitors for making a conference like this financially possible.

Facebook Debuts Web-Scale Variant Of MySQL

Facebook’s Yoshinori Matsunobu on MySQL, WebScaleSQL & Percona Live
Twitter’s Calvin Sun on WebScaleSQL, Percona Live
Slides
Tweets about PerconaLive
Percona Live MySQL Conference Highlights

Posted in Cassandra, Cloud, Conferences, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Perl, San Jose Bay Area, Storage, Tech | Leave a comment

Cassandra Operations Checklist

Most of the Cassandra rollouts I’ve heard about at conferences have been “Devopsed” – written by Dev and productionized by Dev, with hand-off to Operations long afterwards.

That’s the opposite to how RDBMS projects are usually deployed in large companies.

As Cassandra becomes more mature, this hand-off will occur earlier after development ends.

Here is a checklist for handing off a Cassandra database to Operations (I only consider non-trivial rings of 3 or more nodes in production with a full data set):

Item | Comments | Node impact (performance; space; time/IOPs/BW)
Cassandra Server Version | Should be exactly the same minor version across cluster except briefly during server updates |
Token or vnodes? | needs to be configured before first start of server |
Cassandra Client/Connector Version | Thrift or CQL? |
Snitch name? Why? | several choices |
Replication Factor (RF)? Why? | usually RF=3 for SoT* data, defined at keyspace level |
Compaction method? Why? | Size or Level, defined at CF level |
Read Consistency Level? Why? | Netflix recommends CL=ONE. ALL seldom makes sense. |
Write Consistency Level? Why? | ALL seldom makes sense. |
TTL? Why? | Defined at row level. |
Expected Average Query Latency | 10 ms is reasonable, 1 ms is tough. |
nodetool repair/scrub | needed weekly | yes; more space; more
Bootstrapping a new node | | yes; yes
Java gcpause | stop the world | yes; yes
Are there any wide columns? Do they get wider over time? | pathological case for Cassandra | yes; more space; more
Backup | in case of application bug or a disaster. Opscenter, Priam, custom. | yes; slightly more for incremental backups, double for local cold copy; more
Restore | requires Cassandra node shutdown | yes
If a storage volume fills, how to fix it? | Especially a problem with multiple JBOD volumes, which fill unevenly. | yes; less space; less
If a storage volume fails, how to fix it? | | yes; less space; less
What is the total data size now? Projected in 12 months? | affects most operations | yes; yes; yes
What is the acceptable query latency? | affects network and hardware choices |
What is the best maintenance window time each week? | |
What are the business and practical SLAs? | |
What training is needed for your Operations team? | Datastax Admin and Data Modelling Classes (recommend most recent Cassandra version) |
What partitioner is used? | Opscenter only supports random partitioner or murmur 3 partitioner for rebalancing |
What procedures need to be written for your Operations team? | |
What monitoring tools? | 1. DSE or DCE/OpsCenter; 2. nodetool; 3. Jconsole/jmxterm; 4. Boundary; 5. nagios/zabbix |
What bugs have been encountered? Which ones still apply? | |
What lessons can Devops share with the Operations team? | |

* SoT = Source of Truth

About Data Consistency in Cassandra
ConstantContact techblog: Cassandra and Backups
stackoverflow.com: Do I absolutely need a minimum of 3 nodes/servers for a Cassandra cluster or will 2 suffice?

Posted in Business, Cassandra, Cloud, Tech | Leave a comment

Howto Add a New Command to the MySQL Server

Adding a new statement or command to the MySQL server is not difficult.

First, decide if you want to modify the server source code, or if a User-Defined Function (UDF) will meet your needs.

Since I just added the SHUTDOWN server command, I thought it would be helpful to outline the steps needed to add a new command.

Prerequisites:

  1. some familiarity with C/C++ syntax and programming (like “The C Programming Language”, by Kernighan and Ritchie.)
  2. some familiarity with lex and yacc. (I read the Dragon Book a long time ago.)
  3. access to a linux account with cmake, gcc, make and bison packages.
# CentOS
yum install cmake gcc make bison

# Ubuntu
apt-get update
apt-get install cmake gcc make bison

# unpack the MySQL source code:

tar zxvf - < mariadb-5.5.30.tar.gz

# most of the files you need to modify are in this directory:

cd mariadb-5.5.30/sql
  • sql_parse.cc
  • sql_yacc.yy
  • sql_prepare.cc
  • mysqld.cc
  • sql_lex.h

# add the token(s) (commands and arguments you think you will need) and verify the syntax:

bison -v sql_yacc.yy

# if you get warnings, fix %expect in sql_yacc.yy

# cut-and-paste a code block from a command with similar syntax in sql_yacc.yy to implement your new command, and build a test version of MySQL

# build your new server in a sandbox:

make.sh:

#!/bin/bash

cd mariadb-5.5.30
# -DWITH_DEBUG=1 enables the debug build; make itself has no --with-debug option
cmake . -DCMAKE_INSTALL_PREFIX:PATH=/usr/local/mariadb-5.5.30 -DWITH_DEBUG=1
make
sudo make install

# test your new server with 3 terminal windows:

start.sh:

#!/bin/bash

killall mysqld
/usr/local/mariadb-5.5.30/bin/mysqld_safe --user=mysql --debug &
tail -f  /tmp/mysqld.trace | grep Got &
tail -f /var/log/mysqld.log &
mysql -u root -p
# login, then test your new command while watching the log and trace

# read /var/log/mysqld.log and /tmp/mysqld.trace for errors and panics like this:

Version: '5.5.30-MariaDB-debug'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
mysqld: /home/james/mariadb-5.5.30/sql/sql_parse.cc:4477: int mysql_execute_command(THD*): Assertion `0' failed.
130515 11:25:19 [ERROR] mysqld got signal 6 ;

This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

The above panic was caused by the SQLCOM_ switch falling through, because the new command was not defined yet.
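
One way to catch that kind of omission before building is to check that the new SQLCOM_ token appears in every file that needs it. The Python sketch below does that; the function names `missing_token` and `load_sources` are hypothetical, and the file list is the one given earlier in this post.

```python
import os

# The five files touched when adding a statement, per the list earlier in this post.
SQL_FILES = ["sql_parse.cc", "sql_yacc.yy", "sql_prepare.cc", "mysqld.cc", "sql_lex.h"]


def missing_token(sources, token):
    """sources maps file name -> file text; return the names that lack the token."""
    return [name for name, text in sources.items() if token not in text]


def load_sources(sql_dir, files=SQL_FILES):
    """Read each file under sql_dir; a file that does not exist counts as empty."""
    sources = {}
    for name in files:
        path = os.path.join(sql_dir, name)
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                sources[name] = fh.read()
        except FileNotFoundError:
            sources[name] = ""
    return sources


if __name__ == "__main__":
    # Hypothetical usage: run from the directory holding the unpacked source tree.
    for name in missing_token(load_sources("mariadb-5.5.30/sql"), "SQLCOM_SHUTDOWN"):
        print("token not found in", name)
```

A file showing up in the output does not prove a bug, but a new command missing from sql_parse.cc is exactly the case that triggered the assertion above.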

# When you’re done, make a test

vi mysql-test/t/my_new_command.test

# Create a patch file:

mv mariadb-5.5.30 mariadb-5.5.30-new
tar zxvf - < mariadb-5.5.30.tar.gz

# the files live in sql/, not src/
cd mariadb-5.5.30/sql
>patch.txt
for i in sql_parse.cc sql_yacc.yy sql_prepare.cc mysqld.cc sql_lex.h; do
   echo "$i"
   diff -u "$i" ../../mariadb-5.5.30-new/sql/ >>patch.txt
done
# don't forget mysql-test/t/my_new_command.test
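
If diff is not available, roughly the same patch file can be produced with Python's standard difflib module. This is only a sketch: the function name `unified_patch` and the `new_label` prefix are hypothetical, and the output headers will not carry the timestamps that `diff -u` adds.

```python
import difflib


def unified_patch(old_texts, new_texts, new_label="new/"):
    """Build one unified diff from two mappings of file name -> file text."""
    chunks = []
    for name in old_texts:
        diff = difflib.unified_diff(
            old_texts[name].splitlines(keepends=True),
            new_texts[name].splitlines(keepends=True),
            fromfile=name,
            tofile=new_label + name,
        )
        # unified_diff yields nothing for identical files, so they are skipped.
        chunks.extend(diff)
    return "".join(chunks)


if __name__ == "__main__":
    old = {"sql_lex.h": "  SQLCOM_SHOW_CLIENT_STATS,\n"}
    new = {"sql_lex.h": "  SQLCOM_SHOW_CLIENT_STATS,\n  SQLCOM_SHUTDOWN,\n"}
    print(unified_patch(old, new), end="")
```

Reading the five files from the pristine and modified trees into the two mappings and writing the result to patch.txt gives a file that `patch` can apply in the same way.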

# apply your patch file:

patch -b < patch.txt

# do a build and test your patch before distributing it.

Easy peasy, right? :)

Sergei Golubchik wrote on the MariaDB developers list: "Reserved words are keywords (listed in the sql/lex.h) that are not listed in the 'keyword' rule of sql_yacc.yy (and 'keyword_sp' rule, that 'keyword' rule includes)."

How can I get the output of the DBUG_PRINT
How to find shift/reduce conflict in this yacc file?
MariaDB Contributor Agreement (MCA) Frequently Asked Questions
wikipedia: diff

MySQL Internals Manual
mysqlperformanceblog.com: XtraDB / InnoDB internals in drawing
Overloading Procedures
innodb_diagrams project
Understanding MySQL Internals By Sasha Pachev (O'Reilly)
DTrace can tell you what MySQL is doing
MySQL C Client API programming tutorial
MySQL 5.1 Class Index

  • https://launchpad.net/~maria-developers
  • IRC, #maria channel on Freenode
  • https://kb.askmonty.org/en/community-contributing-to-the-mariadb-project/
  • https://kb.askmonty.org/en/contributing-code/
  • https://kb.askmonty.org/en/google-summer-of-code-2013/ (ideas)
  • http://mariadb.org/jira/ (search for unassigned tasks)

Keywords: MariaDB, MySQL server programming, tutorial, patch.

Posted in API Programming, Linux, MySQL, Open Source, Oracle, Tech, Toys | 3 Comments

Patch to Add Shutdown Statement to MySQL MariaDB

At the OSCON 2011 MariaDB Birds-of-a-Feather (BoF) session, I suggested to Monty that a SHUTDOWN statement be added to MySQL, which was written up as WL#232. Other databases have this feature, and it’s very handy when automating management of a cluster of MySQL servers.

And at the Percona Live MySQL Conference 2013, Monty suggested to MariaDB BoF attendees that a good way to get a new feature added is to write a patch to pave the way for a committer to start with.

Phase 1

So … I sat down last nite and wrote the patch against MariaDB 5.5.30.

Basically it meant telling mysql’s lex/yacc files to parse “shutdown”, then calling the existing kill_mysql() shutdown function.

This code is released under the Open Source BSD-new License, according to the MariaDB Contributor Agreement.

shutdown_0.1.patch.txt – MariaDB 5.5.30:

--- sql_parse.cc	2013-03-11 03:29:13.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/sql_parse.cc	2013-05-15 13:17:05.000000000 -0700
@@ -1305,7 +1305,6 @@
     my_ok(thd);
     break;
   }
-#ifndef EMBEDDED_LIBRARY
   case COM_SHUTDOWN:
   {
     status_var_increment(thd->status_var.com_other);
@@ -1333,7 +1332,6 @@
     error=TRUE;
     break;
   }
-#endif
   case COM_STATISTICS:
   {
     STATUS_VAR *current_global_status_var;      // Big; Don't allocate on stack
@@ -3736,6 +3734,31 @@
                     lex->kill_signal);
     break;
   }
+  case SQLCOM_SHUTDOWN:
+  {
+    // jeb - This code block is copied from COM_SHUTDOWN above. Since kill_mysql(void) {} doesn't take a level argument, the level code is pointless.
+    // jeb - In fact, the level code should be removed and Oracle Database statements implemented: SHUTDOWN, SHUTDOWN IMMEDIATE and SHUTDOWN ABORT. See WL#232.
+
+    status_var_increment(thd->status_var.com_other);
+    if (check_global_access(thd,SHUTDOWN_ACL))
+      break; /* purecov: inspected */
+
+    enum mysql_enum_shutdown_level level;
+    level= SHUTDOWN_DEFAULT;
+    if (level == SHUTDOWN_DEFAULT)
+      level= SHUTDOWN_WAIT_ALL_BUFFERS; // soon default will be configurable
+    else if (level != SHUTDOWN_WAIT_ALL_BUFFERS)
+    {
+      my_error(ER_NOT_SUPPORTED_YET, MYF(0), "this shutdown level");
+      break;
+    }
+    DBUG_PRINT("SQLCOM_SHUTDOWN",("Got shutdown command for level %u", level));
+    my_eof(thd);
+    kill_mysql();
+    res=TRUE;
+    break;
+  }
+
 #ifndef NO_EMBEDDED_ACCESS_CHECKS
   case SQLCOM_SHOW_GRANTS:
   {
--- sql_yacc.yy	2013-03-11 03:29:19.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/sql_yacc.yy	2013-05-15 11:12:03.000000000 -0700
@@ -791,7 +791,7 @@
   Currently there are 174 shift/reduce conflicts.
   We should not introduce new conflicts any more.
 */
-%expect 174
+%expect 196
 
 /*
    Comments for TOKENS.
@@ -1645,6 +1645,7 @@
         definer_opt no_definer definer
         parse_vcol_expr vcol_opt_specifier vcol_opt_attribute
         vcol_opt_attribute_list vcol_attribute
+        shutdown
 END_OF_INPUT
 
 %type <NONE> call sp_proc_stmts sp_proc_stmts1 sp_proc_stmt
@@ -1796,6 +1797,7 @@
         | savepoint
         | select
         | set
+        | shutdown
         | signal_stmt
         | show
         | slave
@@ -13715,6 +13717,17 @@
         ;
 
 
+shutdown:
+          SHUTDOWN
+          {
+            LEX *lex=Lex;
+            lex->value_list.empty();
+            lex->users_list.empty();
+            lex->sql_command= SQLCOM_SHUTDOWN;
+          }
+        ;
+
+
 set_expr_or_default:
           expr { $$=$1; }
         | DEFAULT { $$=0; }
--- sql_prepare.cc	2013-03-11 03:29:11.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/sql_prepare.cc	2013-05-15 03:07:00.000000000 -0700
@@ -2173,6 +2173,7 @@
   case SQLCOM_GRANT:
   case SQLCOM_REVOKE:
   case SQLCOM_KILL:
+  case SQLCOM_SHUTDOWN:
     break;
 
   case SQLCOM_PREPARE:
--- mysqld.cc	2013-03-11 03:29:14.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/mysqld.cc	2013-05-15 01:20:11.000000000 -0700
@@ -3333,6 +3333,7 @@
   {"savepoint",            (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SAVEPOINT]), SHOW_LONG_STATUS},
   {"select",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SELECT]), SHOW_LONG_STATUS},
   {"set_option",           (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SET_OPTION]), SHOW_LONG_STATUS},
+  {"shutdown",             (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHUTDOWN]), SHOW_LONG_STATUS},
   {"signal",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SIGNAL]), SHOW_LONG_STATUS},
   {"show_authors",         (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHOW_AUTHORS]), SHOW_LONG_STATUS},
   {"show_binlog_events",   (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_SHOW_BINLOG_EVENTS]), SHOW_LONG_STATUS},
--- sql_lex.h	2013-03-11 03:29:13.000000000 -0700
+++ /home/james/mariadb-5.5.30-new/sql/sql_lex.h	2013-05-15 01:19:17.000000000 -0700
@@ -193,6 +193,7 @@
   SQLCOM_SHOW_RELAYLOG_EVENTS, 
   SQLCOM_SHOW_USER_STATS, SQLCOM_SHOW_TABLE_STATS, SQLCOM_SHOW_INDEX_STATS,
   SQLCOM_SHOW_CLIENT_STATS,
+  SQLCOM_SHUTDOWN,
 
   /*
     When a command is added here, be sure it's also added in mysqld.cc

To apply:

tar zxvf - < mariadb-5.5.30.tar.gz
cd mariadb-5.5.30/sql
wget http://jebriggs.com/php/shutdown_0.1.patch.txt
patch -b < shutdown_0.1.patch.txt

make.sh:

#!/bin/bash

cd mariadb-5.5.30
# -DWITH_DEBUG=1 enables the debug build; make itself has no --with-debug option
cmake . -DCMAKE_INSTALL_PREFIX:PATH=/usr/local/mariadb-5.5.30 -DWITH_DEBUG=1
make
sudo make install

start.sh:

#!/bin/bash

killall mysqld
/usr/local/mariadb-5.5.30/bin/mysqld_safe --user=mysql --debug &
tail -f  /tmp/mysqld.trace | grep Got &
mysql -u root -p

mysql client (with mysqld.log and mysqld.trace entries overlaid):

mysql> shutdown;
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql> 130515 13:20:38 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

/tmp/mysqld.trace:


T@4    : | | | >parse_sql
T@4    : | | | <parse_sql
T@4    : | | | >LEX::set_trg_event_type_for_tables
T@4    : | | | <LEX::set_trg_event_type_for_tables
T@4    : | | | >mysql_execute_command
T@4    : | | | | >deny_updates_if_read_only_option
T@4    : | | | | <deny_updates_if_read_only_option
T@4    : | | | | >stmt_causes_implicit_commit
T@4    : | | | | <stmt_causes_implicit_commit
T@4    : | | | | SQLCOM_SHUTDOWN: Got shutdown command for level 16
T@4    : | | | | >set_eof_status
T@4    : | | | | <set_eof_status
T@4    : | | | | >kill_mysql
T@4    : | | | | | quit: After pthread_kill
T@4    : | | | | <kill_mysql
T@4    : | | | | proc_info: /home/james/mariadb-5.5.30/sql/sql_parse.cc:4507  query end

/var/log/mysqld.log:

130515 13:20:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130515 13:20:08 InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
130515 13:20:08 InnoDB: The InnoDB memory heap is disabled
130515 13:20:08 InnoDB: Mutexes and rw_locks use GCC atomic builtins
130515 13:20:08 InnoDB: Compressed tables use zlib 1.2.3
130515 13:20:08 InnoDB: Initializing buffer pool, size = 128.0M
130515 13:20:08 InnoDB: Completed initialization of buffer pool
130515 13:20:08 InnoDB: highest supported file format is Barracuda.
130515 13:20:09  InnoDB: Waiting for the background threads to start
130515 13:20:10 Percona XtraDB (http://www.percona.com) 5.5.30-MariaDB-30.1 started; log sequence number 1597945
130515 13:20:10 [Note] Plugin 'FEEDBACK' is disabled.
130515 13:20:10 [Note] Event Scheduler: Loaded 0 events
130515 13:20:10 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: ready for connections.
Version: '5.5.30-MariaDB-debug'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution
130515 13:20:37 [Note] Got signal 15 to shutdown mysqld
130515 13:20:37 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: Normal shutdown

130515 13:20:37 [Note] Event Scheduler: Purging the queue. 0 events
130515 13:20:37  InnoDB: Starting shutdown...
130515 13:20:38  InnoDB: Shutdown completed; log sequence number 1597945
130515 13:20:38 [Note] /usr/local/mariadb-5.5.30/bin/mysqld: Shutdown complete

130515 13:20:38 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
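
When automating a cluster, a script may want to confirm from the log that the server went down cleanly after SHUTDOWN. Here is a minimal Python sketch that looks for the "Normal shutdown" note followed later by "Shutdown complete", as seen in the log above. The function name `clean_shutdown` is hypothetical, and the matching relies only on the message text shown in this one log, so treat it as an assumption for other versions.

```python
def clean_shutdown(log_lines):
    """Return True if 'Normal shutdown' is later followed by 'Shutdown complete'."""
    saw_normal = False
    for line in log_lines:
        text = line.rstrip()
        if "[Note]" in text and text.endswith("Normal shutdown"):
            saw_normal = True
        elif saw_normal and text.endswith("Shutdown complete"):
            # InnoDB's "Shutdown completed; log sequence number ..." line does
            # not match, because it does not end with "Shutdown complete".
            return True
    return False


if __name__ == "__main__":
    # Path taken from the examples in this post; adjust for your install.
    with open("/var/log/mysqld.log", encoding="utf-8", errors="replace") as fh:
        print("clean shutdown" if clean_shutdown(fh) else "no clean shutdown found")
```

Note that it reports on any shutdown recorded in the file, so an automation script would want to pass only the lines written since the last start.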

A possible test would be like this, but it would interfere with operation of the test mysqld instance:

mysql-test/t/shutdown.test:

shutdown;

Phase 2

My above patch applies cleanly within the existing MySQL shutdown framework, which implements a feature like Oracle Database's SHUTDOWN IMMEDIATE command.

However, my patch is a Pyrrhic victory, since there's so much wrong with MySQL's existing shutdown framework that it will take an internals committer to sort it out.

The shutdown framework is badly designed, if it was designed at all, since it fails the "does this feel programmed on purpose?" test, and in fact doesn't work reliably:

  1. Conceptually, there should be 3 Oracle Database-style SHUTDOWN options: WAIT, IMMEDIATE and ABORT. Implementing SHUTDOWN WAIT would mean intrusive changes to the MySQL source code, while SHUTDOWN ABORT would be easier to program, but at the risk of data integrity.
  2. the following bug reports describe a race condition between mysqld threads and the shutdown thread:

I guess I'll have to pay myself the worklog bounty of $100. :)

This is actually my second MySQL patch contribution. In 1997 or 1998 I submitted a patch for the installer, which was one of the most troublesome components at that time. Monty rewrote it, but I liked my version better.

Update: Sergei Golubchik committed this patch to MariaDB 10.0.4 on 2013-06-25. Thanks, Sergei!

shutdown_0.1.patch.txt
MySQL's Missing Shutdown Statement
WL#232
Bug #63276: skip sleep in srv_master_thread when shutdown is in progress

Posted in Linux, MySQL, Open Source, Oracle, OSCON, Tech | 1 Comment

IFR Magazine: Danger Below MDA?

AvWeb has a chilling reprint from IFR Magazine on US airlines intentionally descending below the approach plate MDA …

“Flight inspection noted that a GPWS alert was received at the reported location if the aircraft continued to follow the published Vertical Descent Angle (VDA) below MDA. The airline (and several others) reported that it was their SOP to do so, pointing to the benefits of stabilized approaches and the use of a continuous descent angle.”

(In layman’s terms, often an airplane is supposed to descend from the clouds, level out, then fly at the FAA minimum descent altitude until the runway is in sight. Airlines admitted they were continuously descending almost into obstructions every day to avoid leveling out first – completely crazy.)

Descending below the MDA into terrain would void insurance policies and likely result in the airline company folding.

In the Polish state visit to Russia accident, the captain similarly made up his own approach procedure. He descended below MDA and used an on-board radar altimeter over uneven terrain at treetop height. The resulting crash killed 1/3 of their government and military leaders.

Posted in Tech | Leave a comment