A home-grown project

Monday, September 11, 2006 at 9:39 AM

By Colin Colehour, Partner Solutions team

I work on the Partner Solutions team at Google, and one of the things I do is monitor some of our larger Google Base bulk uploads. As a result, I became interested in adding my own content. So as a side project, I started gathering the genealogy data that my father has been collecting over the past 22 years in order to upload it to Base. The first challenge I faced was getting the individual genealogical records, which are typically stored in GEDCOM files, into a format that made sense for Base. I resolved this by writing a PHP script that extracts the data directly from our website's MySQL database and converts it into a Google Base-friendly XML file.

Another challenge was deciding which attributes to include for each listing, especially since an important aspect of genealogical data is the relationship between family members. For my first batch, I decided to include the attributes that would be useful to someone looking up this information for research purposes. So far, the listings include first and last name, gender, spouse first and last name, and three important dates: birth, death, and marriage. For phase two, I plan to include information on children in the listings as well.
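
For readers curious what this kind of conversion might look like, here is a minimal sketch in Python (the actual script was written in PHP and is not shown here). It assumes the records have already been read from the database into dictionaries. The element names, the `FIELDS` list, the `build_feed` function, and the sample person are all invented for illustration and are not the real Google Base schema or real family data.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute names based on the listing fields described above.
# The real Google Base schema and database columns are not given in this post.
FIELDS = [
    "first_name", "last_name", "gender",
    "spouse_first_name", "spouse_last_name",
    "birth_date", "death_date", "marriage_date",
]


def build_feed(records):
    """Turn a list of record dictionaries into an XML string, one <item> each."""
    root = ET.Element("items")
    for record in records:
        item = ET.SubElement(root, "item")
        for field in FIELDS:
            value = record.get(field)
            if value:  # leave out attributes that are unknown for this person
                ET.SubElement(item, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, standing in for a row read from the database.
sample_records = [
    {
        "first_name": "Jane",
        "last_name": "Doe",
        "gender": "female",
        "spouse_first_name": "John",
        "spouse_last_name": "Doe",
        "birth_date": "1850-03-14",
        "marriage_date": "1872-06-01",
    }
]

feed_xml = build_feed(sample_records)
print(feed_xml)
```

Leaving out empty attributes, as this sketch does, is one way to handle people whose death or marriage dates are not known; a real feed would follow whatever the target format requires.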

By making these listings live, I hope that this information will be more accessible to people interested in genealogy. Maybe it will even encourage others to contribute their own family histories. To date, I have about 10,800 listings live on Base, and I'm working to get the next batch in within the next few weeks. Here are my listings in case you'd like to take a look.

When I asked my dad what he thought of this project, he was pretty excited. "You can look at many dimensions of the data, depending on what info you know before the search," he said. "It gives you an option to look outside of the database as well." Thanks, Dad.