Optimizing Entity-Attribute-Value Design for MySQL

MySQL LogoIn traditional relational database design, we normalize (restructure) data relationships until rows consist of a primary key and only columns that depend on that primary key.

In Entity-Attribute-Value Design (EAV) design, a step beyond normalization is made where only one column in addition to the primary key is stored in each row. A data dictionary is needed to “reassemble” the data later.

EAV has an advantage when data is sparse (storage space would be decreased), the data is mainly logging events, or perhaps when programs auto-generate a system.

Somebody asked me about performance tuning for EAV database applications for when tables grow to 10 million rows or so.

An obvious suggestion would be to ensure that you use numeric-valued primary keys, and that there is enough key buffer to cache the indexes.

MySQL NDB tables might be a good match since you’re mainly doing primary key operations – after all, the row is almost entirely a primary key!

Also, you could look at alternative table types, like archive or merge, where the data can be made read-only to avoid re-building indexes in case of an unclean shutdown.

EAV design is not that appealing to me as a programmer since there’s more mental effort needed to understand the system. For example, if a lot of UI forms will be displayed, then a mapping must be done using the data dictionary.

Also one of the key benefits of a database is enforcing constraints to maintain data integrity. Using EAV and a data dictionary moves that enforcement to the app, which is not a good thing.

Wikipedia: Entity-Attribute-Value model
An Introduction to Entity-Attribute-Value Design for Generic Clinical Study Data Management Systems

This entry was posted in MySQL, Tech. Bookmark the permalink.

Comments are closed.