The blog of Dave

Why I wrote DhG2

I first got interested in genealogy when the 1901 census was made available to the public. Using some credits that I was given as a gift, I found the census entries for my paternal grandparents. I asked my father if he could remember any of the other people on the records. I wrote all the information in an exercise book and drew some family trees.

For a while, progress was slow and the hand-written notes served their purpose.

Using a computer

With the appearance of online genealogical information on commercial sites like Ancestry and FindMyPast, as well as the many free sites, it became clear that hand-written notes and hand-drawn trees would not suffice for the flood of information that I was gathering, so I looked for a computer-based approach. Most of the programs that I found would only run on Windows, and I avoid Microsoft products like the plague. I use Linux on all my PCs.

I did briefly consider the web-based offerings, but rejected them because I wouldn’t have full control of the data, and they might tie me into a permanent subscription. Furthermore, there’s a security problem: there are living people in my family tree, some of whom I don’t have any contact with. I don’t want that information to be on a public website. Even if it is hidden behind some kind of privacy flag, it is still subject to terms and conditions over which I have very little control.

Using Linux

I needed a Linux solution.

First of all, I tried an ancient program called Lifelines. It generates very nice tree diagrams but maintaining the database is a bit daunting - you have to use a dialect of GEDCOM. Unfortunately it doesn’t seem to understand everything that you can express in GEDCOM. That means that you still have to keep separate notes for all the evidence that you accumulate, like birth and death certificates, parish register and census pages.

Then I found a program called Gramps that at first seemed to do everything I needed. Gramps uses some kind of database for storing the information. I entered a few people into it, then started adding all the evidence that I had about the people. That’s where I came unstuck. Gramps insisted on storing the absolute location of image files, and might even store the content of the files in its database. I wasn’t very happy about that. However, there was another big problem. It wasn’t clear to me exactly where I went wrong, but suffice to say that I made some mistakes and corrupted the database. The only thing I could do was to start again.

It was then that I realised that a single database isn’t very resilient to failure, so I decided to write my own program that would be robust against failures and bugs in the software itself.

DhG - the first version

I decided to use plain text files instead of a database. This has several advantages: I don’t need to write a special editor for them, and I can maintain them in the kind of revision control system that programmers routinely use for source code.

There’s one text file per person, analogous to an index card that genealogists might have used in the past, but with much more space available for recording information. I decided to use a standardised format for these card files to make it easy for a program to parse in order to construct family trees etc.
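To give a feel for why a standardised format makes parsing easy, here is a minimal sketch in Python. It assumes a hypothetical card layout with one "Key: value" field per line; the field names, the sample card and the `parse_card` function are made up for illustration and are not the real DhG2 card format or code.

```python
import re

# Hypothetical card layout: one "Key: value" field per line.
# This is an illustration only, not the real DhG2 card format.
FIELD_RE = re.compile(r"^(?P<key>[A-Za-z]+):\s*(?P<value>.*)$")


def parse_card(text):
    """Return a dict of the fields found in one person's card text."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            # Keys are lower-cased so lookups don't depend on capitalisation.
            fields[match.group("key").lower()] = match.group("value").strip()
    return fields


SAMPLE_CARD = """\
Name: John Smith
Born: 1870
Father: William Smith
Mother: Mary Jones
"""

person = parse_card(SAMPLE_CARD)
print(person["name"], person["born"])
```

Lines that don't look like a field are simply skipped, so free-form notes could sit in the same file without upsetting the parser.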

I wrote DhG in Perl because regular expressions are baked into the language. I know that Perl is considered by some to be a “write-only” language, but that depends on the programmer, not the language.

The main problem with DhG was that I was developing the structure of the files at the same time as the program. Some of my early decisions about the files turned out to be wrong, hence the “Version 2” in all my current files.

The DhG source code is still available for anyone who wants a laugh. Because of the database changes, it contains a lot of duplicated code, some of which is probably never used.

DhG2 - the rewrite

When the structure of the card files was more-or-less stable, I decided to do a complete re-write to clean up the code. I had been introduced to Python and enjoyed working in it, so I decided to use it instead of Perl. Using Python also meant that I wasn’t tempted to copy-and-paste the old code.

The rewrite adds a few new features and eliminates some of the shortcomings of the Perl version. It still loads its internal database from the card files when you first start it, but maintains the database dynamically when you edit the files from within DhG2. You only need to reload if you change a card file outside the DhG2 interface, although I’d recommend reloading after adding a marriage record.
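The load-once, update-as-you-edit idea can be sketched roughly like this. It is only an illustration under my own assumptions: the `.card` file extension, the function names and the dictionary keyed by file path are all hypothetical, and the real DhG2 code may be organised quite differently.

```python
from pathlib import Path


def load_cards(directory):
    """Read every *.card file under directory into a dict keyed by path.

    The ".card" extension is an assumption made for this sketch.
    """
    return {
        path: path.read_text(encoding="utf-8")
        for path in sorted(Path(directory).rglob("*.card"))
    }


def refresh_card(database, path):
    """Re-read one card after it was edited, leaving the rest untouched."""
    path = Path(path)
    database[path] = path.read_text(encoding="utf-8")
```

Because the card files remain the only permanent store, anything that goes wrong in the in-memory copy is fixed by loading again.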

There’s a configuration file too, so you can set up its standard behaviour and even use it to maintain several different databases.

DhG2 uses templates for most of its output. It searches for templates along a path that you configure, so you can create custom templates without having to modify your copy of the program.

You can change most of the settings while the program is running. There are Father and Mother settings that make it simple and consistent when adding all the children of a family.

There are still some improvements to make. The startup time is quite long the first time you start it, but Linux caches the content of the files so it’s much faster on subsequent reloads. I don’t think this is something I want to change, though. It’s part of what makes DhG2 resilient to corruption.

Error reporting leaves a bit to be desired. It can be quite difficult to locate an error in a card file, especially if the error is reported while generating the website.

If you want to see what the HTML output looks like, take a look at my family history pages. The front page is hand-coded, but most of the rest is generated by DhG2.

Where to get DhG2

It’s on CodeBerg: https://codeberg.org/TheLancashireman/DhG2.

You can find it on GitHub and GitLab too, but I won’t post links for them.

It uses several Python libraries, most of which are part of a standard installation on Linux, I think. The one library that you definitely have to install separately is the Jinja2 template library. On my Linux distro it’s just an “apt install” away.

I’ve been told that DhG2 runs fine on Windows, but I can’t give instructions. It’ll probably work on an Apple Mac too.

The “round twit” department

I really ought to write some better documentation. There’s a user’s guide for the old DhG, which contains a description of the file format that’s still mostly valid. You can find it on my website. There’s some pretty terse help text in the program itself.

I’d really like to have some pretty family trees for a family book that I plan. A GEDCOM export that can be imported by another program like Lifelines might be easier than doing the graphics myself.

I’ll do it when I get a round twit. :-)

If you use DhG2 and find any bugs, please let me know. If you make any changes to the code, please send me a patch. I ignore pull requests.

If anyone out there feels like adding a pointy-clicky GUI front end, be my guest. If you wish, I’ll incorporate it into the main line. Again, send me a patch, not a pull request.