AWS Loft Architecture Week – Databases



I attended 2 days of the AWS Loft SF Database Architecture Conference.

The software scalability work that Amazon has done on databases and caches, especially the Aurora distributed MySQL and Postgres databases, is very impressive. (DBA note: do careful acceptance testing of any distributed database.)

Executive Summary:

  1. AWS has gone far beyond “IaaS EC2 Classic hosting” and developed a complete HA database software stack.
  2. AWS Solution Architects are very knowledgeable, and available to all account holders.
  3. The free data migration tools, SCT (Schema Conversion Tool) then DMS (Database Migration Service), can use any source and destination across AWS and OnPrem servers, and across several common database products.
  4. The new Intel chips make the new EC2 instance types 34% faster.
  5. Slides

Some of the talks I went to:

What’s New in Amazon Aurora for MySQL and PostgreSQL
by Kevin Jernigan, AWS Manager of Tech Product Management, DBS

– very impressive engineering work by AWS engineers – complete internals modernization of MySQL and PostgreSQL
– split Open Source MySQL and PG code into two components (SQL engine and SAN storage modules)
– rewrote algorithms (btree => Z-index), log replay (max. 1.5 seconds) and locking code pushed down for MySQL
– no checkpoints, so 3x less jitter (query latency variation) since data is written to network, so no disk stalls
– Aurora MySQL 5x faster or more
– Aurora PG 2x faster (already well-written internals)
– 6 nodes, 4 required for quorum.
– my opinion as a DBA is that SANs are always a problem, so carefully evaluate this.
– the story on edge cases is not well understood yet, which is critical for operating a large distributed database
– you still choose an instance size for IO/CPU since that’s how their billing works.
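
As a rough illustration of the "6 nodes, 4 required for quorum" point above, here is a minimal sketch of the usual quorum arithmetic. The read quorum of 3 is my assumption (it is not in my notes), and `quorum_properties` is a hypothetical helper, not anything AWS ships.

```python
def quorum_properties(copies: int, write_quorum: int, read_quorum: int) -> dict:
    """Check the classic quorum rules for a replicated storage volume."""
    return {
        # Any two write quorums share a copy, so conflicting writes cannot both win.
        "writes_overlap": 2 * write_quorum > copies,
        # Any read quorum shares a copy with the most recent write quorum.
        "reads_see_latest_write": read_quorum + write_quorum > copies,
        # Number of copies that can be lost while each operation still succeeds.
        "write_failures_tolerated": copies - write_quorum,
        "read_failures_tolerated": copies - read_quorum,
    }


# 6 copies and a write quorum of 4 come from the talk; read_quorum=3 is assumed.
props = quorum_properties(copies=6, write_quorum=4, read_quorum=3)
for name, value in props.items():
    print(f"{name}: {value}")
```

With these numbers, writes survive the loss of 2 copies and reads survive the loss of 3, which is the kind of claim worth confirming in acceptance testing.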

My co-attendee found the speaker authoritatively technical. I found the MySQL comparisons to Aurora a little contrived, since they used the worst-case configuration of MySQL; i.e., I can fail over in 2-5 seconds with master-master and a load balancer, compared with the 30 seconds to a minute or more that he discussed.

What’s New in Amazon RDS for Open-Source and Commercial Databases
by KD Singh, AWS Partner Solutions Architect

– MariaDB, Oracle 12c now supported
– Read slaves use regular async replication from product, so could lag. Your app needs to handle it.
– HIPAA, ITAR, USgov, UKgov, SGgov, PCI Level 1 seller approval
– during RDS failover, your app must be programmed to reconnect automatically. Expected downtime is about a minute while the CNAME => IP address mapping updates
– SQL Server limit is 4 TB, supports AD, .bak files
– 1-second monitoring now included in Cloudwatch
– “pick the smallest you think will work, and migrate when you need a bigger instance size”
– local timezone now supported everywhere
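
The failover note above (the app must reconnect by itself, with about a minute of downtime) could be handled with a retry loop along these lines. This is a minimal sketch: `connect` and `operation` are hypothetical callables you would supply, and I am assuming the driver surfaces failures as `ConnectionError`; real database drivers raise their own exception types.

```python
import time


def run_with_reconnect(connect, operation, attempts=7, base_delay=1.0, sleep=time.sleep):
    """Run operation(conn), opening a new connection and backing off on failure.

    connect   -- hypothetical factory that opens a fresh connection to the endpoint
    operation -- hypothetical callable that does the work on that connection
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # A fresh connection per attempt gives the client a chance to pick up
            # the endpoint's new address once the failover has completed.
            conn = connect()
            return operation(conn)
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Waits of 1, 2, 4, 8, 16, 32 seconds: about 63 seconds in total,
                # chosen here to cover the roughly one-minute failover window.
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The total wait is a design choice: it should be somewhat longer than the expected failover time, so a normal failover shows up as a slow request rather than an error.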

Migrating to Amazon RDS with Database Migration Service
by Dhanraj Pondicherry, Senior Manager of Solutions Architecture, AWS

– SCT (Schema Conversion Tool) then DMS (Database Migration Service)
– Successfully used for Oracle => PG by marquee clients like Shaadi.com
– may need careful VPC setup for source or target
– SCT lists counts of tables and SPs, so it is easy to eyeball for QC
– inbound bandwidth is free, so migration is very cheap within AZ
– DMS requires a CDC method to be enabled on the source, like Oracle CDC or MySQL binlogs.
– very impressive effort on these migration tools, with almost any combination of source and target possible now, including OnPrem, EC2 Classic, RDS, Aurora, and Redshift.
– “for your migration tool, pick the smallest you think will work, and migrate when you need a bigger instance size”

Amazon Aurora and Amazon Database Migration Service
by Joyjeet Banerjee, Solutions Architect, AWS Migration Lab

– download and try SCT (OLAP option is for Redshift)
– DMS Online Lab: qwiklabs.com/focuses/2965

Slides

Introduction to Amazon DynamoDB
by Sean Shriver, NoSQL Solutions Architect, AWS

– a key-value store; the key is a string of up to 2 KB
– not currently related to the original Dynamo paper. Most of the authors have since been highly promoted.
– 1 partition, 5 GSIs, 5 LSIs (per partition)
– GSIs and LSIs are separate tables
– GSIs need their own provisioned IOPS
– charged per 1 KB for writes and per 4 KB for reads. Reading from a GSI reduces IO cost
– partition key uses consistent hashing
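
To make the 1 KB write / 4 KB read charging note concrete, here is a small worked example of the rounding. It is a sketch of the arithmetic only: the function names are hypothetical, and it ignores consistency modes and any other pricing details not covered in the talk.

```python
import math

WRITE_UNIT_BYTES = 1024      # writes are metered in 1 KB steps
READ_UNIT_BYTES = 4 * 1024   # reads are metered in 4 KB steps


def write_units(item_bytes: int) -> int:
    """Units consumed by one write of an item of this size (rounded up)."""
    return max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))


def read_units(item_bytes: int) -> int:
    """Units consumed by one read of an item of this size (rounded up)."""
    return max(1, math.ceil(item_bytes / READ_UNIT_BYTES))


# A 9,000-byte item in the base table versus a 300-byte projection in an index.
full_item, projected_item = 9000, 300
print("write full item:", write_units(full_item))
print("read full item:", read_units(full_item))
print("read projected item:", read_units(projected_item))
```

This is one way to read the "reading from GSI reduces IO cost" remark: an index that projects only a few attributes holds smaller items, so each read rounds up to fewer units.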

Amazon Elasticache
by Darin Briskman, Developer Evangelist, AWS

– used to be based on memcached, now redis
– average operation 480 microseconds for 4 KB object, 240 microseconds for 1 KB object
– max. 3.5 TB per server, in clusters of 15 servers. 20 million reads per second, 4.5 million writes
– 300,000 TPS
– HA is 1,000 little details. 999 doesn’t count
– “Fast Data” sub-millisecond requests for IoT, mobile real-time info
– Alexa 1,500 ms budget, but 1,000 ms is network trip. So 500 ms for calculation.
– Alexa is DynamoDB+ElastiCache
memcached challenges:
1. no persistence
2. no HA
3. race conditions on threads
Thus Redis (“Remote Dictionary Server”)
Oracle, SQL Server, Mysql then Redis
– AWS-redis persistence via snapshot to S3
– snapshots to 90% of RAM (even 95%) network copied to alternate node. Allowed 20 snapshots per day.
– replication for HA. 30 seconds to failover usually
– primary and replica (they don’t like how master-slave sounds)
– 1 ms in same AZ, 2-3 ms different AZ
– 55 DCs in an AZ in US-EAST
– new Intel chips 34% faster
– no cross-AZ data transfer costs, so similar cost to Classic EC2
– don’t use T2 for prod. Use R or M.
– key CRC16
– promotion is to last-written replica, timeout of 15 seconds in case of network problem
– string key up to 512 MB, really just binary
– hash, set, list, geo, hyperloglog
– could use lambda to notify of OnPrem database update and invalidate Elasticache row
– IGA Works/Adpopcorn is Korean mobile business platform. Moment scoring on mobile users, including when to show ads
– Expedia’s real-time analytics with DynamoDB was 35000 writes, down to 3500 with ElastiCache, 6x savings. 200 million messages daily.
– only a few airlines overseas, but lots of hotels at the destination. Mom-and-pop agencies also keep refreshing Expedia as their backend. The 100 most popular routes are 50% of queries. TTL 24 hours, updates 10 times per day
– https://www.youtube.com/watch?v=ie4dWGT76LM
– one day of work and 5 days of testing
– beyond time of year caching, you may know the most popular teams/items
– or cache the whole database if small enough
– cannot add another node to go from, say, 5 shards to 6 shards, because you could lose data for now. Maybe later.
– “you don’t have to do anything. when you woke up later, it’ll be there.”
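
The "key CRC16" note above refers to how a clustered Redis decides which shard owns a key. A minimal sketch, assuming the open-source Redis Cluster scheme (CRC16 of the key modulo 16384 hash slots) and ignoring hash tags in braces; `key_slot` is my own helper name.

```python
import binascii

HASH_SLOTS = 16384  # slot count used by open-source Redis Cluster


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot; each shard owns a range of slots."""
    # crc_hqx with an initial value of 0 computes the CRC16-CCITT (XMODEM)
    # variant, which is the CRC16 flavour Redis Cluster specifies.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


for key in (b"foo", b"hello", b"user:1000"):
    print(key.decode(), "->", key_slot(key))
```

Because the slot depends only on the key, any client can compute where a key lives without asking a central directory, and moving a slot range moves a predictable set of keys.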

ElastiCache Best Practices

– set reserved-memory to 90% so writes can fit in without eviction
– swap usage should be zero
– position a read replica in another AZ for HA
– primary with 2 replicas is 5×9’s
– avoid KEYS and other long-running commands
– not needed for like 1 MB of data
– 50% – 90% reduction in cost
– the speaker was a Solution Architect at IBM for 20 years. At AWS, he is allowed to recommend ways to save money.
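
The cost savings come from the application checking the cache before the database. Here is a minimal cache-aside sketch with a TTL (like the 24-hour TTL in the Expedia example) and explicit invalidation. `TTLCache` is a hypothetical in-memory stand-in written for illustration; it is not an ElastiCache or Redis client, and `loader` represents whatever database query you would otherwise run.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache server (assumption: illustration only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, or call loader(key) and cache the result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # hit: no database work
        value = loader(key)                      # miss or expired: hit the database
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, e.g. when notified that the source row changed."""
        self._store.pop(key, None)


def slow_lookup(route):
    # Hypothetical placeholder for an expensive database query.
    return f"fares for {route}"


cache = TTLCache(ttl_seconds=24 * 3600)
print(cache.get_or_load("SFO-JFK", slow_lookup))  # loads from the "database"
print(cache.get_or_load("SFO-JFK", slow_lookup))  # served from the cache
```

The TTL bounds how stale an entry can get, while `invalidate` is the hook for the Lambda-style notification idea mentioned in the ElastiCache talk.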

Everything You Need for a Viral Game, Except the Game
by Darin Briskman, Developer Evangelist, AWS

– use Dynamodb and redis
– wechat runs on redis
– publish and subscribe redis commands: subscribe to a channel then publish to it
– twitch offers hosted chat
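
The order in the publish and subscribe note matters: a Redis-style channel only delivers to clients that are already subscribed. A minimal in-process sketch of that behaviour follows; `Broker` is a hypothetical stand-in written for illustration, not a Redis client.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a pub/sub server (assumption: illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # channel -> list of callbacks

    def subscribe(self, channel, callback):
        """Register a callback to receive future messages on a channel."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers only; return how many received it."""
        receivers = self._subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)


broker = Broker()
inbox = []

# Published before anyone subscribed: nobody receives it and nothing is stored.
print(broker.publish("game:chat", "anyone here?"))

broker.subscribe("game:chat", inbox.append)
print(broker.publish("game:chat", "gg"))
print(inbox)
```

For a game chat this fire-and-forget model is usually acceptable, but anything that must survive a disconnect needs to be stored somewhere durable as well.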

– CloudTrail tracks every action including DBA-level access to RDS in JSON
– talk to your Solution Architect, available to every AWS account holder

Elasticsearch
by Darin Briskman, Developer Evangelist, AWS

– most downloaded Open Source app after linux kernel
– nice REST interface
– same code as Open Source, but manageable in AWS
– AWS is green-blue instead of red-black
– can resize
– AWS answers
– Centralized Logging
– CloudSearch is Solr

Hands-on Labs: Amazon ElastiCache
by Darin Briskman, Developer Evangelist, AWS

– https://s3-us-west-2.amazonaws.com/fastdata/ElastiCacheLab.zip

AWS Talks link
https://s3-us-west-1.amazonaws.com/architectureweeks/Database/SF+Februray+21-23%2C+2017/Database+S3.pdf

Kudos to Kevin Jernigan and Darin Briskman for their excellent Aurora and ElastiCache talks – the best database talks I’ve ever seen.

This version of the AWS Loft is nice as far as “pop-up” conferences go – everything is hosted on the 2nd Floor, so no sprinting up and down stairs every 30 minutes.

Amazon Aurora Under the Hood: Fast DDL

Amazon DynamoDB Accelerator (DAX)

Tips

– Windows 10 has openssh support via Ubuntu. No Putty needed.

Getting there: 1446 Market St. SF. Take the Muni K or T line to Van Ness station, or take the Castro bus on Market St.

This entry was posted in Conferences, Linux, MySQL, Open Source, Oracle, Tech.
