Failure To Do Data Analysis

October 27, 2010

Databases are all about, data.

So it’s totally bizarre how so many people working with databases never look at the data!

It’s really so fundamental, it’s hard to make a good analogy.  Imagine a tailor making your clothes, but never looked at the fabrics!  Can you think of a better analogy?

I’ve met programmers who have charged ahead and written one thousand lines of code, and have not even looked at either the data in the tables, the input, or the output!

Read the rest of this entry »


System Performance: Cobol Versus Database

October 7, 2010

More Architecture Mistakes To Avoid, Database Design Mistakes To Avoid, Tuning

When I started working with Oracle, I helped to replace a Cobol system. For a while, the company systems ran half on Cobol, and half on Oracle.  The Oracle 7 system was pretty slow then.  After the cold backups had run overnight, it was not unusual in the morning to see a query against a 700K row table take about 30 seconds.

I was asked to write a C program that only interfaced with the Cobol flat files, and read Cobol transaction records.  Each transaction was stored in multiple rows, and given repeating groups of records, could vary in size from a few rows, to dozens of rows.  The files contained about 700K to 800K rows altogether.  The program grabbed information out of the multiple lines, tallied the numbers up, and output one line of summary. Lots of IF statements, and loops.  About 800 lines of code.

I expected this program to take at least 30 seconds, given that database queries to similar size tables took that long, and that there was so much logic and so many lines of code.

In fact, the program ran in about 5 seconds!  I was really surprised.

But when you think of it, it should be obvious that Read the rest of this entry »


The Meaningless Data Model

October 6, 2010

More Database Design Mistakes To Avoid
Architecture Mistakes to Avoid

The saying goes, a picture is worth a thousand words.

If an organization has a good database model, with meaningful table and field names, a look at the model should give you a pretty good idea what the system does.

Like a picture, when you look at a model, it’s supposed to communicate.  Consider a model car, or airplane, or a scale model of building.

Without a model or picture, can you communicate how to build the special research car, XOF1?  How it runs?  Or what it looks like?

Similarly, can you look at an ERD (Entity Relationship Diagram) diagram your of database, and have a pretty good idea of the system does?  Or not?

At so many places, I’ve have to look in many places to understand the system.
– Many queries to the Oracle Data dictionary
– Values in database fields.
– Rows in database tables.
– Extensive data analysis.
– Requirements documents, that don’t give any specifications, most always outdated. Read the rest of this entry »


Cars, License Plates, And Using The Wrong Identifier

September 29, 2010

More Database Design Mistakes To Avoid
Architecture Mistakes To Avoid

I’ve now moved to a few different states, and registered my car in the new state.

Even though I was only a customer, and never saw the back end database, I could quickly tell which identifier the states were using for the car.   Each identifier had huge implications on the whole process, the consumer, the insurance companies, and the DMV.

In Connecticut, the process (in 1999) I followed was as follows.
Read the rest of this entry »


Creating Cobol Systems Inside the Database

September 29, 2010

More Database Design Mistakes To Avoid
Architecture Mistakes To Avoid

I haven’t worked a whole lot with Cobol, but I did do enough.

Coming from a background where the first thing I really learned well on the job was how to do good database design, there were some odd things that I noticed about Cobol.
Read the rest of this entry »


Ways To Get Duplicate Data

September 23, 2010

More Database Design Mistakes To Avoid

Duplicate data.  Designed into the system.  How can I count the ways?
Read the rest of this entry »


Splitting a Table Vertically, AKA, Duplicate Tables!

September 23, 2010

More Database Design Mistakes to Avoid

There is more than one way to split a 1:1 relationship into multiple tables.  In the first example, the table was split horizontally, and different columns went into different tables.  But the number of rows stayed the same in all of them.

The stranger way is to split the table vertically.

In other words, to have duplicate tables!
Read the rest of this entry »


MUCK – Massively Unified Code-Key, Generic Three Table Data Model

September 21, 2010

Another Database Design Mistake To Avoid.   Actually, one of the worst DB design mistakes you can make.

One of the worst things you can do to a database system is to make a generic data model of three tables for the whole system. A variation is to use three tables to combine multiple lookup tables into one.

Instead of standard tables and columns, three tables are used for everything.

One table each for:
entity/table
attribute/column/field
values of the data
Read the rest of this entry »


Too Few Tables

September 21, 2010

More Database Design Mistakes To Avoid

One program that I inherited just wasn’t making sense.

The load of this program loaded data into a single table.  It retrieved data into a few fields.  Then did loop de loops, reading these initial fields, and compared them to hard coded values in the program.  Finally updated the remaining fields, based on the hard coded values in the program.

Eventually it dawned on me.  There was a functional dependency, violating 2NF.  The first field, determined the second field.  This of course should have been moved into another table, but it wasn’t.
Read the rest of this entry »


Binary Trees As The Database Design!

September 21, 2010

More Database Design Mistakes To Avoid

One design that I saw at an interview, was highly unusual.   There was a diagram on the white board.  Inside the database were a number of other boxes, each representing a different subsystem.  I’d never seen anything like it.

We talked about designing the database model for good performance.  We certainly seemed to be on the same page. But then he sat me down at a computer, and had me do a test.  There were a lot of Recursive Ids and parent Ids.  For a given ID, find all the other keys associated with it.

The database “design” was a binary tree, and the test was to traverse the tree!
Read the rest of this entry »


Circular or Cyclical References

September 21, 2010

Another Database Design Mistake To Avoid

One very bizarre design that I saw actually had circular references between tables.

The primary/foreign keys of two tables, were actually found in the other table.
Read the rest of this entry »


Using the Wrong Object

September 20, 2010

More Database Design Mistakes To Avoid

Given that you need more information in or from the database, what do you do?

Do you create one of the following?
Read the rest of this entry »


Defining Recursive Relationships In Lookup Tables!

September 20, 2010

Another Database Design Mistake to Avoid

Recursive programming is used in a few places in computer science. Most notably in stack processing.  The big picture looks something like:

x program starts  (a)
which calls itself again (b)
which calls itself again (c )

then iteration c finishes
iteration b finishes
iteration a finishes

Recursive relationships are also sometimes used in database design.

The classic example is an organization that has employees.  And managers.   The managers are also employees.

Unfortunately, I’ve seen the recursion pulled out into another table!  A lookup table.
Read the rest of this entry »


Simple Rule Of Thumb To Determine the Number of Tables In A Relationship

September 19, 2010

More on Database Design Mistakes to Avoid

Here is my rule of thumb for the number of tables, for each relationship.
Read the rest of this entry »


Combining Identifying Data, Transaction Data, and LOBs in a Single Table

September 19, 2010

Another Database Design Mistake to Avoid

In another post, I mentioned that as a rule of thumb, if data has a 1:1 relationship, all the fields should be in the same table.

Of course, with many rules, you can go overboard.  Blindly following rules is the mark of a beginner.  Search for:
The Dreyfus Model of Skills Acquisition

Here is a caveat.
Read the rest of this entry »


Too Many Tables For a 1:1 Relationship

September 19, 2010

More Database Design Mistakes to Avoid

I’ve seen a few “designs” where a number of 1:1 relationships were actually split up into a number of tables.
Read the rest of this entry »


Script to List Tables With No Foreign Keys

September 19, 2010

More on Database Design Mistakes to Avoid

Related to referential integrity, in addition to systems that do not have primary keys, there are database “designs” where there were no foreign keys!  Much like a Cobol system.

Here is an Oracle script to find the tables that don’t have foreign keys.  Tables that are neither a parent table, or a child table.
Read the rest of this entry »


Redundant Foreign Key

September 18, 2010

Another Database Design Mistake to Avoid is the Redundant Foreign Key.

I’ve seen a redundant FK directly to the parent, of the parent.
Read the rest of this entry »


Referential Integrity and Performance

September 18, 2010

There are three ways to implement Referential Integrity in a database system.  And each method results in different database performance.

1)
Primary and Foreign Keys:
Fastest.  Most structured.  Declarative.  No coding required.

2)
Triggers:
in between.

3)
Application code:
Slowest.  Least Structured.  Procedural.  The most coding required.   If the code is found in applications outside of Oracle, network traffic gets involved and is even slower.
Read the rest of this entry »


Oracle Script To Find Tables With No Primary Key

September 18, 2010

Primary keys are so important in databases.  I keep trying to write an effective post about why they are so important.  But I don’t feel I’m doing the subject justice.  So let me start with a related, but shorter post.

At a number of shops, I’ve come across so many tables that did not have a primary key.  Some of these were very important tables in the system!   Bad, bad, bad.

After discovering so many tables that no primary key, at so many shops, I finally wrote an Oracle script to list all the tables that did not have a PK.
Read the rest of this entry »