Is Your Data Normal?
Ouch - that sounds so, well, judgmental! Who wants to be told that their data isn’t normal (which makes it abnormal, right)? However, if you’re getting ready to move from a tired database, a list in Excel, or a table in Word, keep reading. Your data could probably use some cleaning - and you’ll save time and money by doing that on your own!
So - what IS normal data? You can read all about it at Wikipedia and other places if you wish - but at its simplest, I think normal data has these characteristics:
- You only have ONE kind of data in a field.
- You enter all of the data in the field using the same criteria - for instance, all phone numbers must include an area code, or all zip codes must be in ZIP+4 format.
- You arrange your data (this can be hard to figure out) by function. For instance, all contact information belongs in the same table - but donations belong in a table all of their own.
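If your list can be exported as rows of fields, the "same criteria" rule is something a short script can check for you. Here is a minimal Python sketch, not a finished tool: the field names Phone and Zip, the two patterns, and the sample rows are all assumptions made up for illustration, so swap in whatever criteria your agency actually uses.

```python
import re

# Assumed rules, for illustration only: US phone numbers with an area code,
# and zip codes in ZIP+4 form (12345-6789). Adjust to your own criteria.
PHONE_RE = re.compile(r"^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$")
ZIP_PLUS4_RE = re.compile(r"^\d{5}-\d{4}$")

RULES = {"Phone": PHONE_RE, "Zip": ZIP_PLUS4_RE}


def find_format_problems(records):
    """Return (row_number, field, value) for each value that breaks its rule."""
    problems = []
    for row_number, record in enumerate(records, start=1):
        for field_name, pattern in RULES.items():
            value = record.get(field_name, "").strip()
            if not pattern.match(value):
                problems.append((row_number, field_name, value))
    return problems


# Made-up sample rows (the 555-01xx numbers are fictional).
sample = [
    {"Phone": "(206) 555-0100", "Zip": "98144-1234"},
    {"Phone": "555-0101", "Zip": "98144"},
]
for problem in find_format_problems(sample):
    print(problem)
```

The second sample row is reported twice: once for the phone number with no area code, and once for the five-digit zip.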
Suppose you are keeping track of individual donors - you might have a record that looks like this:
FName    LName  Street          City     State  Zip
Patrick  Shaw   403 23rd Ave S  Seattle  WA     98144
You can tell that I have a field for my first name, my last name, my street address, my city, my state, and my zip. This is normal data!
But what happens if you find out that I’m sharing a house with my twin brother? Then you might be inclined to do this:
FName             LName  Street          City     State  Zip
Patrick and Paul  Shaw   403 23rd Ave S  Seattle  WA     98144
Well - that doesn’t seem so bad at first glance - but the first-name field now holds two people, which means everything will have to be addressed to Patrick and Paul. You won’t be able to send just ME a thank you note. This is “non-normal” data.
You can solve this by creating a new record for my twin:
FName    LName  Street          City     State  Zip
Patrick  Shaw   403 23rd Ave S  Seattle  WA     98144
Paul     Shaw   403 23rd Ave S  Seattle  WA     98144
But we have the same address - so you might end up sending us TWO invitations to your next event, making us think that you’re careless with paper. You can solve this using the concept of a Household - a “parent” record that Paul and I both belong to. Then you can send print mailings and other correspondence to the Household (just one copy!) and still keep track of our individual contact information, donor records and so on.
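To make the Household idea concrete, here is a rough Python sketch. The Person and Household classes and the group_into_households function are names I made up for this example, and matching on an identical address is a simplifying assumption - two people at one address aren’t always a household, so treat the result as a list to review, not a final answer.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    fname: str
    lname: str


@dataclass
class Household:
    street: str
    city: str
    state: str
    zip_code: str
    members: list = field(default_factory=list)


def group_into_households(rows):
    """Group flat contact rows into households keyed by mailing address."""
    households = {}
    for row in rows:
        # Assumption: rows with the same address (ignoring case and stray
        # spaces) belong to the same household.
        key = (
            row["Street"].strip().lower(),
            row["City"].strip().lower(),
            row["State"].strip().upper(),
            row["Zip"].strip(),
        )
        if key not in households:
            households[key] = Household(
                row["Street"], row["City"], row["State"], row["Zip"]
            )
        households[key].members.append(Person(row["FName"], row["LName"]))
    return list(households.values())


rows = [
    {"FName": "Patrick", "LName": "Shaw", "Street": "403 23rd Ave S",
     "City": "Seattle", "State": "WA", "Zip": "98144"},
    {"FName": "Paul", "LName": "Shaw", "Street": "403 23rd Ave S",
     "City": "Seattle", "State": "WA", "Zip": "98144"},
]
households = group_into_households(rows)
for household in households:
    names = ", ".join(person.fname for person in household.members)
    print(household.street, "->", names)
```

Print mailings loop over the households (one envelope each), while thank you notes and donor records still point at the individual people.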
So - what’s the takeaway? If you are considering a migration to a new tool, or if you have the ability to edit your existing tool, you can start “normalizing” your data. That might mean you have to create new records, adjust your naming conventions and so on - but it will help in the long run!
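If you have a lot of shared records like the “Patrick and Paul” one, creating the new records can be partly automated. This is only a sketch under some assumptions: the function name split_shared_record is made up, it only looks at the FName field, and it only understands names joined by “and”, “&” or commas. Anything trickier (titles, initials) still needs a person to look at it.

```python
import re

# Separators assumed for illustration: " and ", " & ", or a comma.
NAME_SEPARATOR = re.compile(r"\s+(?:and|&)\s+|\s*,\s*")


def split_shared_record(record):
    """Return one copy of the record per first name found in FName."""
    names = [name.strip() for name in NAME_SEPARATOR.split(record["FName"])]
    names = [name for name in names if name]
    # Copy every other field unchanged; only FName differs per person.
    return [{**record, "FName": name} for name in names]


shared = {"FName": "Patrick and Paul", "LName": "Shaw",
          "Street": "403 23rd Ave S", "City": "Seattle",
          "State": "WA", "Zip": "98144"}
for person in split_shared_record(shared):
    print(person["FName"], person["LName"])
```

A record with a single first name comes back as a one-item list, so it is safe to run over the whole table.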

Dave van Wagenen wrote:
This was a very good explanation of normalization, but I feel you glossed over the importance of separating FNAME & LNAME to start with.
I would also have liked to see a suggestion on how to handle the creation of a record when the check comes in as Mr. & Mrs. J. Q. Public.
Posted on 13-Jun-07 at 7:36 am
patricks wrote:
Dave,
I’m hoping to provide more information later about normal data, redundant data, and what we call “dirty” data.
In terms of handling a generic check - I think the answer depends a bit on what tools you have available for data entry, and what your agency culture is regarding your donors. I’d be inclined to solve that problem with a people process - pick up the phone (or write a letter) and ask Mr. and Mrs. Public how (or if) they’d like their receipt and other correspondence. This helps in two ways - you are more sure of your data, and you build that relationship. Comments always welcome!
Posted on 13-Jun-07 at 12:10 pm
sayo wrote:
I have some statistical data I want to normalize. I’ve read about some methods already & have tried to apply them, but with no success. Methods like Box-Cox. Can you please give some other straightforward method?
Thanks.
Posted on 10-Jun-08 at 12:17 pm
patricks wrote:
Sayo,
Your best option is to find someone local to help - my blog is much more general than providing particular assistance! Good luck - normalizing data can be a great first step to more effective use of a database!
Posted on 10-Jun-08 at 6:15 pm