Monday, 29 August 2011
to read the Unicode Standard. Oh.
I was reminded of these issues when I saw a post by Charlie Stross discussing the post "Falsehoods Programmers Believe About Names" by Patrick McKenzie. People's names are difficult to model.
2500th anniversary of the Battle of Marathon a year too early). Changing calendars from Julian to Gregorian -- give us back our 11 days (of tax payments, that is). Changing calendars at different times in different countries. Leap years. The algorithm for leap years (and the difference between the Julian and Gregorian algorithms). The argument about whether the year 2000 should be a leap year. Time zones. Changing time zones. Summer time (aka daylight saving time). Summer time coming and going on different days in different countries. Double summer time. Leap seconds. And so on.
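The two leap-year rules differ only in how they treat century years, which is why 2000 caused an argument and 1900 did not. A minimal Python sketch of both rules; the function names are my own, and it assumes positive year numbers counted in the respective calendar:

```python
def is_leap_julian(year: int) -> bool:
    # Julian rule: every fourth year is a leap year, no exceptions.
    return year % 4 == 0


def is_leap_gregorian(year: int) -> bool:
    # Gregorian rule: every fourth year, except century years,
    # unless the century year is also divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Years where the two calendars disagree are exactly the century
# years not divisible by 400.
disagreements = [
    year for year in range(1600, 2101, 100)
    if is_leap_julian(year) != is_leap_gregorian(year)
]
print(disagreements)
```

Under these rules 2000 is a leap year in both calendars, while 1700, 1800, 1900 and 2100 are leap years only in the Julian one.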
One of the comments in Charlie's post references "Gay marriage: the database engineering perspective", a post with an interesting analysis of marriage database design (ignoring the name problem), and how some designs make changes harder than they need to be.
Notice the continual use of "ids" in that post. Names (even if we decide how to model them adequately) are not unique, and so are not suitable for an identifier. What properties might make a good unique identifier? I recall hearing of a new police database that used "Surname, initial, date of birth" as a unique id. The story goes that this database was installed a few days before the Kray twins were arrested...
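To see why that key fails, here is a small Python sketch. It is purely illustrative: the `natural_key` function and the dictionary standing in for the database are my own inventions, not anything from the story, and the date of birth is the one commonly given for the twins.

```python
from datetime import date


def natural_key(surname: str, forename: str, dob: date) -> str:
    # Hypothetical "surname, initial, date of birth" identifier.
    return f"{surname.upper()},{forename[0].upper()},{dob.isoformat()}"


records = {}      # stands in for the database table
collisions = []   # keys that were already taken on insert

for surname, forename, dob in [
    ("Kray", "Ronald", date(1933, 10, 24)),
    ("Kray", "Reginald", date(1933, 10, 24)),
]:
    key = natural_key(surname, forename, dob)
    if key in records:
        collisions.append(key)
    # The second insert silently overwrites the first.
    records[key] = (surname, forename, dob)

print(len(records), collisions)
```

Two people go in, one record comes out: twins who share a surname and an initial are indistinguishable under this scheme.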
In fact, no attribute is immutable, so none should be used in such an identifier. The same person who told me the Kray twins story also told me of the problems a hospital had in assuming that the "sex" entry was immutable when they did their first sex-change operation.
Wait a minute. No attribute is immutable? What about "date of birth" (dob)? How can that change? Well, remember this database information is a model of reality, not reality itself. Models can have errors. The dob might have been mis-entered, or been lied about (when my maternal grandmother died, the family discovered from her birth certificate that she was 10 years older than she had let on to her children), or it might simply be unknown.
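One common way out is an identifier that carries no meaning at all, so that every real attribute can be corrected, or left unknown, without touching it. A hedged Python sketch, assuming a random UUID as the surrogate id; the `Person` class, its fields and the sample values are hypothetical, chosen only for illustration:

```python
import uuid
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str
    # The date of birth may be unknown, and may later be corrected.
    dob: Optional[date] = None
    # Surrogate id: random, meaningless, never derived from attributes.
    person_id: uuid.UUID = field(default_factory=uuid.uuid4)


people = {}  # stands in for a table keyed by the surrogate id

person = Person("Example Person", date(1910, 5, 1))
people[person.person_id] = person
original_id = person.person_id

# The birth certificate turns up: the recorded dob was ten years out.
person.dob = date(1900, 5, 1)

# The record is still found under the same id after the correction.
print(people[original_id].dob)
```

Because the id is not built from the name, sex or date of birth, fixing any of them is an ordinary update rather than a change of identity.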
Is Pluto a planet or not? When reality doesn't fit our classification, the fault lies not with reality, but with the classification system.