MUCK – Massively Unified Code-Key, Generic Three Table Data Model


Another Database Design Mistake To Avoid.   Actually, one of the worst DB design mistakes you can make.

One of the worst things you can do to a database system is to make a generic data model of three tables for the whole system. A variation is to use three tables to combine multiple lookup tables into one.

Instead of standard tables and columns, three tables are used for everything.

One table each for:
entity/table
attribute/column/field
values of the data

MUCK - Massively Unified Code-Key, Generic Three Table Data Model

Instead of a number of columns in a row, you now have a single column, and multiple rows.  One writer, Don Peterson, referred to this method, as MUCK, for Massively Unified Code-Key.

This has proved to be such an incredibly bad idea that it has already been discussed online in a few places.  Unanimously the articles state why you should never do this.  A few well written articles can be found here.

http://www.simple-talk.com/sql/database-administration/five-simple–database-design-errors-you-should-avoid/

http://www.simple-talk.com/sql/database-administration/ten-common-database-design-mistakes/

http://www.projectdmx.com/dbdesign/lookup.aspx

Tom Kyte also wrote about it on this in “Effective Oracle By Design”, on page 35.  “But how does this model perform?  Miserably, terribly, and horribly.”

All these articles reach the same conclusion, and give even more (different) reasons why you shouldn’t use this idea.  Do check them out.

The results of MUCK:

Awful Performance:

This kind of system, is actually DESIGNED to go slow.  For even the simplest of queries, you now have to join at least three tables.

Pivots:

Another reason for the bad performance is that each value for each attribute, is stored in a separate row.  So, if you have five attributes, the data will be stored in five different rows.  To get them all back on a single row, you have to pivot the data.  This is of course more work than if all the data was on a single row.

Multiple Inline Views:

Since each piece of data is on a different row, if you have more than one field to retrieve, you have to create an inline view, for each field!  So, to retrieve five fields with their appropriate values on the same row, you actually have to create five inline views!  The queries get to be really big and complex, fast.

Data Type Problems:

To make this system work, the field that holds the value is made of type VARCHAR.  This will hold a representations of other data types, such as a number.  But it will be stored in the character datatype “8″, not as a number, 8.  At some point, you will need to convert it from a string, back to the number.  But eventually, a non numeric value will sneak in, such as “A”, that can’t be converted back into a number.  The query will fail, and you will be debugging, and fixing it manually.  More people time.

Really Complex Inserts:

For an insert on a normal system, you would insert one row with the values for (say) five different fields.  To get the equivalent in the MUCK system, you would need to do five separate insert statements!   More people time.

People Time:

Lots more people time is necessary for even the simplest of queries.  For a simple query, you will sometimes have to query the lookup tables first, before you even compose your SQL statement, so that you can find the codes to filter on.  “What is the code for ???”

Too Confusing For Even The Brightest:

Once my department had a task to double check a lot of data from these tables, and compare them to another system.  With regular tables and columns, we would just select the fields, and got it done in a few minutes.  But we spent a number of hours trying to ensure these complex queries were correct.

And yet, the three of us came up with three different answers!  Which one was correct?  This was in spite of the fact that all three of us knew SQL really well.  How’s that for accuracy?

Cost:

What is the dollar value of the people time to use and maintain this system?  Compared to whatever advantages there were for doing it in the first place?

It’s Cobol:

If you have to look at other fields, or run processing to know what a particular row or field represents, you are basically running a Cobol system.  More on Cobol systems in another post.

Correction:

JUST SAY NO!     Standard database design practice please.

About these ads

One Response to MUCK – Massively Unified Code-Key, Generic Three Table Data Model

  1. rodgersnotes says:

    I have since found out that MUCK is also known as the Entity Attribute Value (EAV) Model, or Open Schema.

    This wikipedia entry describes it.
    http://en.wikipedia.org/wiki/Entity–attribute–value_model

    But remarkably, the article comes off as if EAV is actually a model to consider! It does mention the problems of pivoting, and comes up with three very costly solutions to it. A separate warehouse. Materialized views. And in memory data structures. More complexity upon complexity.

    But still no real added benefit to just doing it right in the first place.

    Amazing how such an incredibly bad idea is made palatable and sold.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: