Thursday, October 12, 2006

A Morning at Work

01/17/2006

A Morning at Work

This morning, at work, I've found bad news.
A client of mine made a bad mistake.
Each 3 months they extract from the accounting system two files with vendor's commissions and load them in their corporate datawarehouse. It's an old , Informix based, system. We made it, but I, personally, inherited the system. I cannot blame my predecessor for what happened: he worked like everyone else used to work at that time.
My client, as I told before, made a mistake. They picked 4 months to be extracted instead of 3. This meant that there was one month of doubled data loaded up. Luckily I made a one-shot backup of the fact table, from where some other fact tables are rebuilt from scratch for reporting. After a short analysis it was clear that the only viable solution was cleaning up the table and reloading data.
At this point
· I made a copy of the backup file
· I counted the records in table and files to be sure that nothing strange, after backup, has occurred
· I extracted a short selection of records
· I connected to our local test system to test the load
· Ouch, the test system table has not the correct structure
· I connected to the production system
· I prepared to load the small extract of data
· Wait a bit, is that the correct table?
· I checked the sql script which loads data to be sure it was the correct table
· I prepared again to load the extract
· Am I targeting the right database?
· I checked the connection and reconnected
· I loaded the extract and it was ok
· I prepared to delete
· A coworker asked me for an urgent advice
· What was I doing?
· I prepared to delete
· Am I targeting the right db and table?
· I re checked the scripts
· I answered the phone
· I re checked again the scripts
· I prepared to delete
· I checked the SQL
· Is it the right table and the right db? Maybe the script does something strange
· Script rechecked
· Prepared the SQL
· Stared at it for ten minutes
· Run the delete (while a cold drop of sweat run through my spine)
· Done, prepared to load the backup file
· Is that the correct file?
· Check the file
· Prepared to load the file
· Stared at SQL loading statement for ten minutes
· Loading... loaded.
An entire morning spent on this and the other tasks accumulated. If I had to have the tape backups restored the mess would have exploded.
What does this teach me?
1) Never create an incremental datamart without a mechanism to rollback the loading. I had no chance to identify correctly the extra rows, so I had to reload everything. From this point of view, sap BW is perfect. It allows to get rid of wrong data loads in a click. The data load itself is an object which can be activated, deactivated, compressed, archived etc.
In a less structured environment, adding a load identifier is invaluable to delete all the records. Simply add a field to the fact table and fill it with a unique identifier, the same for the entire process. Probably the best way is to generate it from the date and the time of data load start, so it becomes human-readable. Tagged records may be easily and selectively deleted in case of mistakes.
2) Write down a procedure to do so, test it and put a copy within the safe. In this way you'll not be compelled to re-tell the entire story from start.
3) There is no way of cutting away a human being from the loop, so be prepared to correct even the most trivial mistakes. Better, let the user correct everything by themselves.
Even in a minimally complex dw, probably, you’ll have to provide a side application to manage those data which are not managed by the transactional system that feed it. You have to sit down and code anyway. At this point, the overhead to manage also the loadings is not so heavy not to be faced.
It will spare you a lot of time while charging the same 20% of the license fee for maintenance;-).

0 Comments:

Post a Comment

<< Home