As Y-DNA and mtDNA do not recombine and only mutate, these mutations can be used to group people who share the same mutations into haplogroups, meaning that these people share a direct paternal or direct maternal ancestor who lived within the last centuries or even millennia. Combining information on the locations of the earliest known ancestors of people who share a haplogroup can be quite interesting: it can help in pinpointing the area where the common ancestor might have lived. While the haplotrees of FamilyTreeDNA and YFull do show the origins of many of the participants in the tree, this data is difficult to interpret: the popularity of DNA testing varies wildly, which means that some countries have much larger samples than others. As a solution to this problem I have created a tool called Haplomapper, which draws maps of the FamilyTreeDNA data based on the relative frequency of the haplogroup in each country's sample, much like PhyloGeographer does to make maps from the YFull data. About half of the over 200,000 participants at FamilyTreeDNA have reported a country of origin for their direct paternal and maternal ancestors, so this makes for some nice and useful maps.
So how can this help you interpret your haplogroups? Let’s take the Y-DNA haplogroup R-U106 as an example. In the FamilyTreeDNA Y-DNA haplotree the number of participants who have been shown to belong to haplogroup R-U106 or any of its subclades seems to be the highest amongst participants who trace their paternal lines to England and other parts of the United Kingdom.
When corrected for the number of tests taken in each country, however, the spatial distribution of R-U106 occurrences no longer seems to be centered on the United Kingdom, but rather on the Netherlands and (to a lesser extent) Belgium, with high frequencies occurring in Denmark, Germany, Norway and Sweden as well! This is useful information: whereas someone of Belgian descent might have misread the raw counts (and concluded that their haplogroup did not occur frequently in Belgium), this representation sheds some new light.
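The correction described above can be sketched in a few lines of Python. This is only an illustration of the idea of a relative frequency, not Haplomapper's actual code: the function name `relative_frequencies` is hypothetical and the counts are invented for the example, not real FamilyTreeDNA figures.

```python
def relative_frequencies(haplogroup_counts, total_tests):
    """Share of each country's testers who belong to the haplogroup."""
    return {
        country: haplogroup_counts.get(country, 0) / total
        for country, total in total_tests.items()
        if total > 0  # skip countries without any tests
    }


# Invented numbers, for illustration only (NOT real FamilyTreeDNA data).
total_tests = {"England": 20000, "Netherlands": 2000, "Belgium": 800}
haplogroup_counts = {"England": 4000, "Netherlands": 700, "Belgium": 200}

freqs = relative_frequencies(haplogroup_counts, total_tests)

# Ranking by raw count versus ranking by relative frequency.
ranked_by_count = sorted(haplogroup_counts, key=haplogroup_counts.get, reverse=True)
ranked_by_freq = sorted(freqs, key=freqs.get, reverse=True)

print(ranked_by_count)
print(ranked_by_freq)
```

With these made-up numbers England leads on raw counts simply because it has the most testers, while the Netherlands leads once each count is divided by the number of tests taken in that country.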
Like any tool, however, it should be handled with care. It all depends on the quality of the available data. Take R-S23189, a subclade of R-U106: it shows a high occurrence among people whose paternal lines originate in South Africa (which becomes even more prominent because only 112 people with a paternal origin in South Africa have tested their DNA). However, it is more likely that these people actually descend from the Dutch or English who colonized the country! So be warned: interpretation is key. To leave out countries with small samples, the settings panel lets you exclude countries with up to 200 tests.
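A filter of this kind might look like the sketch below. It assumes the threshold drops every country with that many tests or fewer, which may not match how Haplomapper's setting works exactly; the function name `exclude_small_samples` is hypothetical, and apart from the 112 tests for South Africa mentioned above, the numbers are invented.

```python
def exclude_small_samples(total_tests, min_tests):
    """Drop countries whose number of tests is at or below min_tests.

    Assumption: the threshold is inclusive, so a country with exactly
    min_tests tests is excluded as well.
    """
    return {
        country: total
        for country, total in total_tests.items()
        if total > min_tests
    }


# South Africa's 112 tests come from the text; the other figure is invented.
total_tests = {"South Africa": 112, "Netherlands": 2000}

kept = exclude_small_samples(total_tests, min_tests=200)
print(kept)
```

With the threshold at 200, South Africa no longer appears on the map, so a handful of testers can no longer produce a misleadingly high frequency.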
Curious? Enjoy those maps!
One Reply to “Haplomapper: mapping spatial distributions of haplogroups using FamilyTreeDNA data”
I do have some reservations about the “FamilyTreeDNA Haplogroup” label on the search box of haplomapper. My DNA has been categorized into a maternal haplogroup and a paternal haplogroup by several companies. I understand that haplomapper uses FamilyTreeDNA data to make the visualization of distribution of tests for a particular haplogroup, but isn’t a haplogroup more generic than the company which analysed it? So, shouldn’t the label just be “Haplogroup”?