RedisConf 2018


I attended RedisConf 2018 on Thursday at Pier 27 (The James R. Herman Cruise Terminal) in SF.

Just like in 2017, the talks were extremely high-quality and imaginative – I wish I could have seen all of the tracks.

The Herman Center is a beautiful new (2012) venue with breathtaking views of the Golden Gate Bridge, Alcatraz, and the downtown skyline.






Executive Summary

  1. Redis community version (free) will have a number of enterprise features built-in this year: multi-master, SSL/TLS and likely multi-level cache primitives.
  2. Redis can provide a new way to look at and solve business problems by combining two or more built-in features. The multi-level cache talk (see below) shows how Redis caching was combined with Redis pub/sub and Lua, resulting in something powerful with only one page of code.

Keynotes

Keynote videos for 2018

Some of the talks I attended:

Thursday Talks

Techniques for Synchronizing In-Memory Caches with Redis
Ben Malec, Paylocity

“For some highly-accessed value, a network roundtrip incurs too much latency. An obvious solution would be adding an in-memory caching layer, but that brings many challenges around keeping data in sync across multiple clients. This presentation will detail the approach Paylocity implemented, which leverages Redis Pub/Sub, bucketing keys to minimize synchronization message length, and carefully exploiting order-of-operation to eliminate the need for a master synchronization clock.”



Data Flow Diagram of Ben Malec’s Multilevel Redis Cache (with ignore self-pub)

https://github.com/bmalec/RedisMultilevelCache

– a co-worker suggested a multilevel cache design. In the past that was overly complex; now it's time to reconsider
– data Source of Truth (SoT) is MS SQL Server
– multilevel cache with Redis as the cache SoT (still has TTLs) and pub/sub to 50 local clients
– .NET (Windows) using default MS memory cache on clients

#1 Possible Multilevel Cache Design
– broadcast keys and values
– will blow up network

#2 Possible Multilevel Cache Design
– broadcast just keys
– can still blow up network

#3 Possible Multilevel Cache Design
– broadcast 16-bit custom hash slot generated on client nodes
– store last updated array in RAM on client nodes and use lazily
– we will explore this option (see diagram)
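A minimal sketch of option #3's slot mapping, assuming a CRC-style hash masked to 16 bits – the talk's hash is custom and not shown, so treat the function choice as an illustration only:

```python
# Hypothetical sketch of design #3: collapse each cache key into a 16-bit
# slot ID so invalidation messages carry a 2-byte slot instead of the key.
# The hash (zlib.crc32 masked to 16 bits) is an assumption for illustration.
import zlib

SLOT_BITS = 16
SLOT_COUNT = 1 << SLOT_BITS  # 65536 slots

def key_to_slot(key: str) -> int:
    """Map an arbitrary cache key to a 16-bit slot ID."""
    return zlib.crc32(key.encode("utf-8")) & (SLOT_COUNT - 1)

# A 2-byte slot ID gets broadcast instead of the (up to 512 MB) key itself
slot = key_to_slot("employee:12345:profile")
```

Since a slot covers a range of keys, one invalidation can evict unrelated local entries – the cache-thrashing trade-off noted below.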

– various race conditions to think about, though, when requiring the local cache to always be correct despite varying latencies and TTLs

– Redis lets a database application exploit Redis’ O(1) data structures, which RDBMSs cannot match. Note that PUBLISH is O(N subscribers) and does not guarantee delivery (James)
– the slot hash starts to look like a zero-knowledge proof – interesting area to research (James)

– can decrement the timestamp by 1 so a newly arriving Redis event is known to be later. (Some people postfix with -1, -2, etc.)
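My reading of the timestamp-decrement trick, as a hedged sketch (names are my own, not the talk's code): record the local fetch time minus one tick, so an invalidation stamped at the very same instant still compares as newer and forces a re-fetch, without a master synchronization clock.

```python
# Hedged sketch of the timestamp-decrement trick: store the local fetch time
# minus 1 so an invalidation event bearing the *same* timestamp still wins
# the comparison -- no master sync clock needed.
import time

class LocalEntry:
    def __init__(self, value):
        self.value = value
        # Decrement by one tick: a concurrent invalidation stamped at the
        # same instant must be treated as later than this cached copy.
        self.cached_at = time.monotonic_ns() - 1

def is_stale(entry: LocalEntry, invalidation_ts: int) -> bool:
    """An invalidation strictly after our (decremented) fetch time wins."""
    return invalidation_ts > entry.cached_at
```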
– can ignore self pub/sub messages:

RedisMultilevelCache/MultilevelCacheProvider.cs:122
    // Ignore pub/sub messages that this instance published itself
    if (dataSyncMessage.senderInstanceId == _instanceId)
    {
        return;
    }

– can write Lua script on Redis side to send the pub/sub event and save sending 1 event over network
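A sketch of that server-side Lua idea (my own illustration, not the talk's code): a single EVAL both writes the key and publishes the invalidation, so the client sends one command instead of a SET plus a PUBLISH. The channel name and message format here are assumptions.

```python
# Sketch (not the talk's actual code): a Lua script that sets a key and
# publishes the invalidation notice in one round trip. The "cache-sync"
# channel and argument layout are invented for illustration.
SET_AND_NOTIFY = """
redis.call('SET', KEYS[1], ARGV[1], 'EX', ARGV[2])
redis.call('PUBLISH', 'cache-sync', ARGV[3])
return 1
"""

def eval_args(key: str, value: str, ttl: int, slot_msg: str):
    """Build the (script, numkeys, key, args...) tuple for redis-py's eval()."""
    return (SET_AND_NOTIFY, 1, key, value, str(ttl), slot_msg)

# With a live server this would be: redis.Redis().eval(*eval_args(...))
```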

– possible cache thrashing in this design, since a hash slot covers a range of keys behind a 16-bit ID; not a concern in practice
– messages are only 18 bytes (16-byte instance GUID plus 2-byte slot int), very small

Future

– publish hit/miss metrics into ELK or Redis time-series module maybe
– possibly use Redis XFETCH to optimize cache reload
– add support for more Redis types
– track hot keys
– StackExchange is also doing a similar multilevel cache design

– 45x faster with the local cache than accessing Redis over the network, with bandwidth approaching zero
– also, the client could tell Redis a TTL and avoid sending a pub/sub message!
– look at Redis’ keyspace notifications (Salvatore commented on that)
– Salvatore: “We could spend a month talking about how to incorporate this into Redis.” 🙂

Bandwidth Optimizations

– delete notifications from clients to Redis are slot hash values (that represent a range)
– update notifications to clients are slot hash values that are cached in the lastupdate array. This array is consulted before using a value, in case it needs to be re-fetched (lazily).
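A hedged sketch of that lazy read path (all names are my own; see the RedisMultilevelCache repo for the real code): consult the in-RAM lastupdate array before trusting a local value, and re-fetch from the Redis tier only when the value's slot was invalidated after it was cached.

```python
# Hedged sketch of the lazy read path. A plain dict stands in for the real
# Redis tier, and a logical counter stands in for timestamps.
last_update = {}   # slot ID -> logical time of last invalidation message
local_cache = {}   # key -> (value, logical time when cached)
redis_store = {}   # stand-in for Redis, the cache source of truth
clock = 0          # logical clock for the sketch

def tick():
    global clock
    clock += 1
    return clock

def on_invalidation(slot):
    """Pub/sub handler: remember when this slot was last invalidated."""
    last_update[slot] = tick()

def get(key, slot):
    entry = local_cache.get(key)
    if entry is not None:
        value, cached_at = entry
        # Fresh only if no invalidation for this slot arrived after caching
        if last_update.get(slot, 0) <= cached_at:
            return value               # served locally, no network hop
    value = redis_store.get(key)       # lazy re-fetch from the cache SoT
    local_cache[key] = (value, tick())
    return value
```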

Latency Optimizations

– slot hash values, not key:values (Redis keys can be 512 MB)
– local lastupdate array, no network check

Code Optimizations

– using MS Cache and Redis pub/sub
– only need 1 page of code to implement multilevel cache, easier to verify correctness and/or test
– short code can be customized per use case

Reception

– the idea for pub/sub came in the shower
– implemented within one data center
– but could be useful for reducing latency in geo-distributed databases as Redis multi-master goes GA (James)




Application of Redis in IOT Edge Devices
Glenn Edgar, Lacima Ranch (“The Avocado Farmer”)

Summary

– IoT by a former embedded engineer, intended for avocado farmers
– 22 sprinkler wires plus a PLC for $300
– tell the field hand where to find sprinkler damage (3 gallons/minute)
– gophers can chew on sprinklers; cracks can be not visible (0.5 gallons/minute)
– started with a typical embedded programming/web solution with the Mongoose web server
– but needed to share data between 2 processes …
– a Raspberry Pi with Redis is the job-queue controller; Python apps talk to it
– need to schedule watering, collect info on flow and pump problems, and log
– “I have the smallest database at the conference: 40 MB.” (“Power of small data!”)
– then 3 processes, graph database module needed
– a Chinese electric utility’s reference to the IEC 61970 standard for SCADA; studied it
– SNMP on steroids
– not using Grafana now, but if he did, would do with monitoring as code
– some graph nodes are weather stations
– 2 types of data structures: system graph/logs and irrigation schedules
– code generates Redis keys to ensure keys are managed properly
– evaporative loss and moisture loss calculations
– Deep Learning and ML to interpret graphs, trends
– “You’re not going to run TensorFlow on a Raspberry Pi.”
– Method of Synchronization between Cloud and Edge with AMQP
– adoption limited by less sophisticated neighbors and commercial SCADA interests
– cameras are for security: avocado theft
github
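The “code generates Redis keys” note suggests a key-builder pattern; a hypothetical sketch (the talk's actual naming scheme was not shown, and the "ranch:" prefix and field names are invented):

```python
# Hypothetical illustration of generating Redis keys in code so naming stays
# consistent and malformed keys are caught early. The scheme is invented,
# not the talk's actual key layout.
def sensor_key(ranch_id: str, station: str, metric: str) -> str:
    """Build a namespaced Redis key, rejecting parts that would break it."""
    for part in (ranch_id, station, metric):
        if ":" in part:
            raise ValueError(f"':' not allowed in key part: {part!r}")
    return f"ranch:{ranch_id}:station:{station}:{metric}"
```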

Integrating Redis with ElasticSearch to Get the Best Out of Both Technologies
Dmitry Polyakovsky, Zumobi

FT.AGGREGATE API

– can we count this? faceted search in Redis

– insert item and index in 1 ms
– aggregate vs. search

Search

– top N
– no processing involved

Aggregate

– top surname and count
– count StackOverflow questions by database by month
– filter -> group -> apply -> sort -> add more (all Redis types)
– country -> age -> profession -> languages
– numeric functions and expressions
– distributed: the naive approach uses a window function, still too much bandwidth and time
– distributed: the better approach uses a coordinator and ships only aggregates (1000x less data)
– HyperLogLog maintains 4% precision even after the coordinator merge
– reservoir sample -> median
– query plan translator has 2 parts: remote query and local query
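The coordinator design above can be sketched as merging per-shard partial aggregates – a simplified COUNT-per-group illustration of the idea, not RediSearch internals:

```python
# Simplified illustration of the coordinator approach: each shard returns
# only its small per-group partial counts, and the coordinator merges them.
# Shipping group totals instead of raw rows is where the ~1000x data
# reduction comes from.
from collections import Counter

def shard_aggregate(rows):
    """Runs on each shard: group rows by country and count locally."""
    return Counter(row["country"] for row in rows)

def coordinator_merge(partials):
    """Runs on the coordinator: merge the small partial results."""
    total = Counter()
    for partial in partials:
        total += partial
    return total
```

Exact COUNT DISTINCT resists this kind of merge (partial distinct sets can overlap), which is why approximate structures like HyperLogLog come up in the notes above.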

Limitations

– single GROUP BY advisable for good performance
– high number of groups – still slow
– exact COUNT DISTINCT and quantiles slow
– for huge parallel workloads, use Spark, etc.

Demo

– live sub-second searches of multi-million-row StackOverflow data on a one-server, 15-shard AWS demo setup
– APPLY is used last, so efficient on TOP N queries
– redash-client, plus custom changes

Future

– port all existing Redis simple search functionality
– streaming time-window searches

Conference Closing Session

– Open bar with bar snacks
– prizes giveaway
– talked to some of the presenters.

The James R. Herman Cruise Terminal

Though beautiful, it has some limitations as a conference venue, being split across two floors, with occasional wind rattling. Parking is also limited.

Booth staff wore parkas and were still unbearably cold. Coffee was only available on the first floor, but talks were on the second floor – rather inconvenient, but easy to fix for next time.

The “F” streetcar stops outside.

