Showing posts with label Data Modeling. Show all posts
Showing posts with label Data Modeling. Show all posts

Saturday, April 30, 2011

Things to avoid in a ER data model


Key learning created from webcast Ten things to avoid in a data model

  • Don’t build a model without understanding the biz. rational – don’t build vague models

The purpose of the model dictates level of detail and the level of polish (how complete the model is). Iteration is the key to a good model to ensure the model is complete. Make sure all the ideas are thought through.

  • Literal Modeling

You must interpret and abstract what the customer tells you. Don’t blindly build the data model based on what the customer says. Pay attention to hidden requirements.

Example of literal modeling:


In the above example, what happens if a person is promoted to a supervisor then a manager? Do we create multiple records?

  • Avoid Large Models

Limit a model to no more than 200 tables. You must provide compelling justification for large models. There are exceptions to this rule ex: packaged applications (Oracle Financials/SAP…).

  • Speculative Content

Don’t model something that is not needed now - “might be helpful in future”. This is counter to the philosophy of agile development. Address the requirements but don’t try to exceed them. This will only make the model large and increase the cost of development rather evolve the model as needed.

  • Lack of Clarity

A relational database is declarative. Declare data in your models. A domain is a set of possible values for an attributes that can be defined in ERwin. An enumeration is a domain that has a finite set of values. You don’t want people to put incorrect values in attributes. Declare enumerations in your database. Don’t use cryptic names.

Example of Enumeration:


  • Violation of Normal forms

A normal form is a guideline that increases data consistency. As tables satisfy higher levels of normal forms, they are less likely to store redundant or inconsistent data. Denormalization is only justified for Data Warehouses (Yay!). Be suspicious of wide tables greater than 30 columns. It is acceptable to violate normal forms deliberately when there is a good reason to do so (ex: performance bottleneck). If you are having difficulty defining an entity type, then chances are that you are combining different entity types. Think twice before doing that.

Example of Violation of Normal Form:


  • Needless Redundancy

Don’t include redundant data to compensate for poorly conceived applications. Redundancy is a bad thing. Again, Data Warehouses are exceptions. Materialized views are OK.

  • Avoid Parallel attributes in non-data warehouse apps.



  • Avoid symmetric relationships for relational databases


In the above example, should I double search a record?

  • Avoid anonymous fields


A few incidental user-defined fields are OK.


A data model determines an application’s data quality, extensibility and performance and influences whether it has chance to succeed. You can improve your data models by paying attention to these pitfalls.