AWS EBS is network-attached storage … in other words, S L O W, compared to local SSD for Postgres database use.
I’ve been seeing average disk latency of 0.55 – 0.80 milliseconds per block, and IOPS and bandwidth are throttled by both the instance and the volume.
For m4.2xlarge, only 10,000 IOPS and 100 Mbps are available, regardless of how many volumes you attach or how beefy they are – not impressive for SSD at all:
Figure 1: m4.2xlarge throttling IO from 4G EBS gp2 volume with 10,000 IOPS and 250 Mbps
Figure 2: 4G EBS gp2 unencrypted volume showing minimum read latency of 0.55 ms
In the above case, one thing you can do is to switch from m4.2xlarge to m5.2xlarge, which is cheaper and has double the IO performance.
But if you’re stuck using Postgres with EBS for large databases (bigger than RAM), there are workarounds that exploit the fact that shared_buffers keeps index pages cached in RAM:
- carefully configure shared_buffers to be as large as possible, and max_connections as small as possible
- run EXPLAIN to see if indexes are used (no Seq Scan) and use the pg_stat_statements extension to identify slow or frequent queries
- use covering indexes to read data from the index cache
- rewrite queries to do index scans from RAM instead of table scans across the network from EBS (e.g. HAVING => INTERSECT and EXCEPT, WHERE-splitting, etc.)
- remove ORDER BY if the ordering column isn’t indexed and your app doesn’t need sorting
- use Redis to cache repeated queries
- use io1 instead of gp2 volumes, but run your own benchmarks and latency measurements, as results vary with both types
- use local instance (“ephemeral”) SSD volumes, with replication/WAL shipping for HA.
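For the shared_buffers/max_connections bullet, here’s a starting point for a 32 GB instance like the m4.2xlarge; the values are illustrative only and should be tuned against your own workload:

```
# postgresql.conf — illustrative values for a 32 GB instance
shared_buffers = 8GB          # ~25% of RAM; caches hot index and table pages
effective_cache_size = 24GB   # hint to the planner about OS-level caching
max_connections = 100         # keep low; use a pooler (e.g. PgBouncer) instead
work_mem = 32MB               # per-sort memory; beware max_connections * work_mem
```

Keeping max_connections small matters because each backend can allocate work_mem per sort or hash, which competes with the cache you’re trying to protect.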
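Checking index usage and finding the heaviest queries can look like this (the table and column names are hypothetical, and pg_stat_statements must be listed in shared_preload_libraries):

```sql
-- Verify the plan shows an Index Scan, not a Seq Scan over EBS
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- Find the slowest statements
-- (the column is mean_time on Postgres 12 and earlier)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT query, calls, mean_exec_time, shared_blks_read
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```

The BUFFERS option is what tells you whether pages came from shared_buffers (hit) or had to be read across the network (read).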
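A covering index (the INCLUDE syntax is Postgres 11+; table names here are hypothetical) lets a query be answered entirely from the index cache via an Index Only Scan:

```sql
-- Payload columns ride along in the index so the heap on EBS
-- is never touched for this query
CREATE INDEX idx_orders_customer
    ON orders (customer_id) INCLUDE (status, total);

-- Served by an Index Only Scan, provided VACUUM has kept the
-- visibility map up to date
SELECT status, total FROM orders WHERE customer_id = 42;
```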
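One sketch of the HAVING => INTERSECT rewrite mentioned above, assuming a hypothetical orders table with an index on (year, customer_id):

```sql
-- A grouped scan with HAVING aggregates over every row...
SELECT customer_id FROM orders
GROUP BY customer_id
HAVING count(*) FILTER (WHERE year = 2023) > 0
   AND count(*) FILTER (WHERE year = 2024) > 0;

-- ...while two index scans combined with INTERSECT can stay
-- within the cached index:
SELECT customer_id FROM orders WHERE year = 2023
INTERSECT
SELECT customer_id FROM orders WHERE year = 2024;
```

As always, confirm the win with EXPLAIN on your own data – the planner may already do something smart.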
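The Redis bullet is the classic cache-aside pattern. This sketch substitutes an in-process dict for the Redis client so it runs standalone; with redis-py you would swap in r.get/r.setex as noted in the comments:

```python
import hashlib
import json

cache = {}  # stand-in for Redis; use redis.Redis() in production

def cached_query(sql, params, run_query, ttl=60):
    """Cache-aside: return a cached result if present, else run the
    query against Postgres and remember the answer."""
    key = hashlib.sha256(json.dumps([sql, params]).encode()).hexdigest()
    if key in cache:                 # with Redis: r.get(key)
        return cache[key]
    result = run_query(sql, params)
    cache[key] = result              # with Redis: r.setex(key, ttl, json.dumps(result))
    return result

# Usage: the database is hit only once per distinct query
calls = []
def fake_db(sql, params):
    calls.append(sql)
    return [("row", params[0])]

r1 = cached_query("SELECT * FROM t WHERE id = %s", [1], fake_db)
r2 = cached_query("SELECT * FROM t WHERE id = %s", [1], fake_db)
assert r1 == r2 and len(calls) == 1
```

This only helps for repeated identical queries, and you take on cache-invalidation duty – a TTL is the simplest policy.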
It would be nice if Postgres had a setting to indicate network-attached storage as a hint to the optimizer.
Keywords: cloud, architecture, PostgreSQL