Is there a better time to pay me a visit than a Friday afternoon? The “don’t do anything affecting production on a Friday” rule wasn’t obeyed, and I had to add a feature to our watch script. “Fine, let’s get it over with,” I thought, and I proceeded to implement it.
Then everything went haywire.
The script quit with a never-before-seen error message. Since the script takes some time to run (~30 min.), testing turned out to be a pain. To spare you the description of my suffering, I’ll make it short.
# The current production database did not work.
# A reduced-size production database did not work.
# A newly constructed local database did not work.
# A newly constructed reduced-size local database did not work.
# A one-week-old local database worked.
# A one-week-old reduced-size local database worked.
Of course I should have asked: what’s the difference?
Which I did. My first assumption was that a new version of the external tool was being used. But the latest version was from the end of March, so that assumption wasn’t correct. My second idea was that the database file was corrupt, but it was still in use and working fine, so that one was wrong too. What other differences could there be? I was getting desperate, seeing myself 10 hours later sitting in the same spot, staring at the same lines. I thought that maybe there was a maximum number of fields for that database, but the week-old database had nearly the same number and I could add any number to it. Another dead end.
Then I started looking at the reduced-size files, edited one by hand, and voilà, everything went fine. Then I compared it to the autogenerated stuff and noticed that the XML wasn’t valid, because the friggin’ pieces of text had been inserted right in the middle of the database, at a point where nothing was supposed to be inserted.
A few minutes later it hit me. The other day I had added more fields to the database, and the new field names also matched the regular expression I was testing with. So the difference I should have been looking for was the two new fields.
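For illustration, here is a minimal Python sketch of how a lenient pattern can start matching newly added fields. The field names and the pattern are made up for this example (the real ones aren't shown here), so treat it as a picture of the failure mode rather than the actual script.

```python
import re

# Hypothetical field names; the last two stand in for the newly added fields.
FIELDS = ["load", "temp", "temp_min", "temp_max"]

PATTERN = re.compile(r"temp")

def lenient_matches(fields):
    # search() succeeds if the pattern occurs anywhere in the name.
    return [f for f in fields if PATTERN.search(f)]

def strict_matches(fields):
    # fullmatch() succeeds only if the whole name matches the pattern.
    return [f for f in fields if PATTERN.fullmatch(f)]

print(lenient_matches(FIELDS))  # ['temp', 'temp_min', 'temp_max']
print(strict_matches(FIELDS))   # ['temp']
```

The pattern is the same in both cases. The only change is demanding a match on the whole name, which is usually the cheapest way to tighten things up.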
What did I learn today? Check your regular expressions and make them as strict as can be. Stricter even. What else? If you are writing tools that use XML anywhere, please throw exceptions or generate error messages that tell the user that there is a problem with the XML instead of simply saying:
ERROR: unknown date acquisition function ''.
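As a rough sketch of what I would have liked to see, a tool could check that its XML input is at least well-formed before doing anything else and say where it breaks. The example below uses Python's standard `xml.etree.ElementTree`. The function name `check_xml` and the sample documents are invented for illustration and have nothing to do with the actual tool.

```python
import xml.etree.ElementTree as ET

def check_xml(text):
    """Return None if text is well-formed XML, else a readable error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError carries the (line, column) where the parser gave up.
        line, column = err.position
        return f"ERROR: malformed XML at line {line}, column {column}: {err}"
    return None

# Hypothetical inputs: the second one has a tag that is never closed properly.
good = "<db><field name='temp'/></db>"
bad = "<db><field name='temp'</db>"

print(check_xml(good))
print(check_xml(bad))
```

A message with a line and column would have pointed straight at the broken spot instead of at a date function.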
Guess what: as soon as this is fixed, I will be leaving here to pick up my new computer (I got the mail that it was there two hours ago). Have a bug-free weekend.
fn1. I am still thanking all the gods of randomness that this file somehow survived.
fn2. I am pretty sure I have another bug in another project that is also caused by too-lenient regular expressions.