Quotes from Michael Kay
Michael Kay, author of XSLT Programmer's Reference, was among the first to tackle the genealogy XML challenge. Visit Mike's site. The following snippets were collected from the Genealogy XML Discussion Group.
On the sharing of family history
There are important applications of XML in genealogy outside the traditional GEDCOM domain. Arguably there are three phases of data management in genealogy work - raw data collection and compilation, construction of linked pedigrees, and publication of family history. GEDCOM is only really designed for the second of those.
Focus on interchange, let standards follow
Don't start with the goal of developing a standard. Start with the goal of developing an interchange model, and some software that supports it. If it proves popular, then think about how to make it a standard.
On "Person Matching"
In my case I've found that the worst part of the data interchange problem is definitely the "synchronisation" problem, rather than any data model issues. Someone will insist on sending me an update that uses the transient identifiers from my last publication but three, or no identifiers other than name and date of birth, or that includes 95% unchanged data along with the 5% of new data. When I find time to write some XML-based tools, I will probably start with some kind of "person matching" tool to link up the individuals in two files and highlight any differences.
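As a rough illustration of the kind of "person matching" tool described above, here is a minimal Python sketch. It is not Kay's tool: the XML layout (`<person>` elements with `<name>`, `<birth>` and `<death>` children), the function names and the sample data are all assumptions made for the example. It deliberately ignores identifiers and matches on name and date of birth only, then separates new and changed records from the unchanged bulk.

```python
import xml.etree.ElementTree as ET

# Hypothetical file layout, invented for this sketch.
OLD_XML = """<people>
  <person id="I1"><name>John Smith</name><birth>1850</birth></person>
  <person id="I2"><name>Mary Jones</name><birth>1855</birth></person>
</people>"""

NEW_XML = """<people>
  <person id="X9"><name>JOHN SMITH</name><birth>1850</birth><death>1910</death></person>
  <person id="X7"><name>Mary Jones</name><birth>1855</birth></person>
  <person id="X3"><name>Ann Smith</name><birth>1880</birth></person>
</people>"""

KEY_TAGS = ("name", "birth")  # fields used for matching, so not reported as differences


def load_people(xml_text):
    """Return a dict of records keyed by (normalised name, birth date)."""
    people = {}
    for el in ET.fromstring(xml_text).iter("person"):
        fields = {child.tag: (child.text or "").strip() for child in el}
        # Lower-case the name and collapse runs of whitespace.
        name_key = " ".join(fields.get("name", "").lower().split())
        people[(name_key, fields.get("birth", ""))] = fields
    return people


def match_people(old_xml, new_xml):
    """Link individuals across two files and list the differences."""
    old, new = load_people(old_xml), load_people(new_xml)
    report = {"unchanged": [], "changed": [], "added": []}
    for key, fields in new.items():
        name = fields.get("name", "")
        if key not in old:
            report["added"].append(name)
            continue
        before = old[key]
        # Compare every non-key field present in either record.
        diff = {
            tag: (before.get(tag), fields.get(tag))
            for tag in sorted(set(before) | set(fields))
            if tag not in KEY_TAGS and before.get(tag) != fields.get(tag)
        }
        if diff:
            report["changed"].append((name, diff))
        else:
            report["unchanged"].append(name)
    return report


report = match_people(OLD_XML, NEW_XML)
```

Exact matching on name and birth date is the simplest possible rule; a usable tool would need fuzzier comparison (spelling variants, approximate dates) and a way to review doubtful matches by hand.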
On Ambiguity, Provenance and Probability
There was a "gentech" group producing a proposal for a future data model that tried to tackle some of these things. They didn't have enough experience in data modelling theory to do a proper job, though they came close to reinventing the functional data model which I've seen quite successfully used to tackle criminology databases, which have essentially the same structure - not because our ancestors are all criminals, but because you need a very open-ended model that can handle arbitrary events, relationships, times and places, as well as uncertainty (including uncertainty as to who people really are), sources of information, conflicting information or opinions, and the like.
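To make the idea of an open-ended model a little more concrete, here is a small Python sketch in which every fact is stored as a sourced assertion with a confidence value, so conflicting claims can coexist. It is neither the GENTECH proposal nor a functional data model implementation; the class names, attribute names such as `same_as`, and the 0 to 1 confidence scale are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    title: str


@dataclass(frozen=True)
class Assertion:
    subject: str       # identifier of a person as named in one source
    attribute: str     # open-ended: "birth_year", "same_as", ...
    value: str
    source: Source
    confidence: float  # researcher's judgement, 0.0 to 1.0 (assumed scale)


@dataclass
class Evidence:
    assertions: list = field(default_factory=list)

    def add(self, subject, attribute, value, source, confidence):
        self.assertions.append(
            Assertion(subject, attribute, value, source, confidence)
        )

    def conflicts(self):
        """Return {(subject, attribute): sorted values} where sources disagree."""
        seen = {}
        for a in self.assertions:
            seen.setdefault((a.subject, a.attribute), set()).add(a.value)
        return {k: sorted(v) for k, v in seen.items() if len(v) > 1}

    def best(self, subject, attribute):
        """Return the value asserted with the highest confidence, or None."""
        candidates = [
            a for a in self.assertions
            if a.subject == subject and a.attribute == attribute
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda a: a.confidence).value


census = Source("1851 census (hypothetical)")
parish = Source("Parish register (hypothetical)")

ev = Evidence()
ev.add("p1", "birth_year", "1850", census, 0.6)
ev.add("p1", "birth_year", "1852", parish, 0.9)
ev.add("p1", "birth_place", "Leeds", parish, 0.9)
# Uncertainty about identity is just another assertion.
ev.add("p2", "same_as", "p1", census, 0.5)
```

Because nothing is overwritten, `ev.conflicts()` can surface disagreements between sources, while `ev.best(...)` picks a working answer without discarding the alternatives.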