Category Archives: Cassandra

Cassandra vnodes Streaming Reliability Calculator

The Cassandra database has a setting in cassandra.yaml, num_tokens, for the number of vnodes. num_tokens is the number of partitions to use per host, and thus the number of parallel streams to use for data updates. The default was 256 … Continue reading

Posted in Cassandra, Open Source, Tech

Internet Latency and Multi-Master Database Transactions

There are 2 common misconceptions in engineering West Coast – East Coast data centers: that packets travel at the speed of light, and that database masters can be located anywhere (i.e., far apart). What happens when we look at the actual latency … Continue reading
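As a rough illustration of why distance matters, here is a minimal sketch of the physical lower bound on coast-to-coast latency. The distance (about 4,100 km, roughly the great-circle distance from San Francisco to New York) and the fiber velocity factor (about two thirds of the speed of light in a vacuum) are assumptions chosen for illustration, not figures taken from the post.

```python
# Back-of-the-envelope lower bound on propagation delay.
# All constants except the speed of light are illustrative assumptions.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum, km/s
FIBER_VELOCITY_FACTOR = 2.0 / 3.0   # assumption: light in fiber moves at ~2/3 c
DISTANCE_KM = 4_100.0               # assumption: approx. great-circle SF to NYC


def one_way_ms(distance_km, velocity_factor=1.0):
    """Return the one-way propagation delay in milliseconds."""
    speed_km_s = SPEED_OF_LIGHT_KM_S * velocity_factor
    return distance_km / speed_km_s * 1000.0


def round_trip_ms(distance_km, velocity_factor=1.0):
    """Return the round-trip propagation delay in milliseconds."""
    return 2.0 * one_way_ms(distance_km, velocity_factor)


if __name__ == "__main__":
    print("vacuum one-way ms:", one_way_ms(DISTANCE_KM))
    print("fiber one-way ms:", one_way_ms(DISTANCE_KM, FIBER_VELOCITY_FACTOR))
    print("fiber round-trip ms:", round_trip_ms(DISTANCE_KM, FIBER_VELOCITY_FACTOR))
```

Under these assumptions the one-way delay is roughly 14 ms in a vacuum and roughly 21 ms in fiber, before any routing, switching, or queuing delay. Real paths are longer than the great-circle distance, so measured latency will be higher, and every synchronous cross-country commit pays at least one round trip.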

Posted in Business, Cassandra, Linux, MySQL, MySQL Cluster, Open Source, Oracle, Tech

Distributed Systems Laws Applied to Distributed Databases

Avery’s Law of Distributed Systems Reliability: “Distributed systems are more reliable when you can get a service from one node OR another. They get less reliable when a service depends on one node AND another. And the numbers combine multiplicatively, … Continue reading
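The OR versus AND distinction in the quote can be sketched numerically. This is a minimal illustration that assumes node failures are independent and uses a hypothetical 99% per-node availability; the function names are made up for this example and the excerpt does not show the post's own numbers.

```python
# Sketch of how availabilities combine, assuming independent node failures.
# The 0.99 per-node availability below is a hypothetical value.

def any_of(availabilities):
    """OR: the service works if at least one node is up."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)   # every node must be down for an outage
    return 1.0 - p_all_down


def all_of(availabilities):
    """AND: the service works only if every node is up."""
    p_all_up = 1.0
    for a in availabilities:
        p_all_up *= a             # availabilities multiply
    return p_all_up


if __name__ == "__main__":
    nodes = [0.99, 0.99]
    print("one node OR another:", any_of(nodes))
    print("one node AND another:", all_of(nodes))
```

With two such nodes, the OR case comes out near 99.99% while the AND case drops to about 98%, which is the multiplicative effect the quote describes.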

Posted in Cassandra, Cloud, Storage, Tech

GitLab Validating Ceph in Production For Me

Spikes are Outages. OSD = Ceph Object Storage Daemon. It would be easy for me to criticize GitLab for using a distributed file system in production, especially Ceph, in AWS. I just wouldn’t roll that way. And it would … Continue reading

Posted in Cassandra, Cloud, Open Source, Storage, Tech, Toys

Solving Java GC Pause Outages in Production

Just thinking about how to configure HAProxy with two backend Java servers to be HA, despite GC pauses. Java programs pause periodically to recycle temporary variables, known as garbage collection (GC). This is called a “GC Pause.” The description “Stop … Continue reading

Posted in Cassandra, GC Pauses, Java, Microservices, Open Source, Oracle, REST API Programming, Tech