Cladogram updating

pdonelan
Posts: 9
Joined: Sat, 2024-Jun-01 12:10 pm

Cladogram updating

Post by pdonelan »

I am very curious to know how updates are obtained for the cladogram. You seem to be able to obtain new information very fast.
Until recently, my Big-Y terminal SNP was FTA56326. Two weeks ago FTDNA notified me that it had been updated to FTC37469. However, back in January I had already noticed this myself. Also, what was even more impressive was that I found that the cladogram already contained this information! Is there some way that you obtain updates directly from FTDNA?
My new match at FTC37469 is managed by Sharon LeBlanc, but the person tested was her father, a Cassidy (earliest known Cassidy ancestor Thomas Cassidy, born 1765 in Prince William, Virginia). According to Tibor Féher, the Cassidy sept of Fermanagh (FT65606) is a subgroup of I2a2-M284>L1195, so there seems to have been a surname change event.
Even more impressive, the cladogram shows that I have another SNP below FTC37469, namely FTC36282. I have not been notified about this yet by FTDNA, but my guess is that my match will be James William Donlon (ancestry County Roscommon), who submitted his sample for Big-Y testing at the end of December. His Y111 results were already available in January, and he is a Y111 match to both me and Mr. Cassidy. His Big-Y results are not available yet, but when they are, I expect he will be my match at FTC36282. At that point I will likely make a separate group for James and myself in the FTDNA Donnellan Y-DNA project. At present we are both in the "miscellaneous" group.
It would be interesting to know how you get your early information, because it would shed some light on the process of creating new haplogroups.
Webmaster
Site Admin
Posts: 1412
Joined: Wed, 2019-Jun-26 2:47 pm

Re: Cladogram updating

Post by Webmaster »

Patrick,

Thank you for your kind remarks. They are very much appreciated. Thank you also for the info on Mr. Cassidy. I have updated his info on the Cladogram.

No, I do not get data directly from FTDNA. I diligently check the public FTDNA Y-Haplotree and a few select projects daily, as well as YFull and The Big Tree. It is definitely a labor of love, because otherwise it is a royal pain in the arse. :lol: The Cladogram is an aggregate of the data from all 3 websites and it makes our R1b-DF104 Cladogram one of the most complete on the internet.

Your data comes from your unique variants on The Big Tree. At least, I am assuming you are kit #FTD-B158287. I select the lead variant for the clade using the lowest number "FT" variant first, then "BY" second, and other labs alphabetically after that. That will change if another man shares some or all of your unique variants; FTDNA will then choose which variant becomes the lead variant for the clade. I have yet to figure out how they choose the lead variants.
https://www.ytree.net/SNPinfoForPerson. ... onID=13774
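
For anyone who wants to see that ordering spelled out, here is a minimal sketch in Python. It is only an illustration of the rule described above, not the actual process used for the Cladogram. It assumes that any name beginning with "FT" (FT, FTA, FTC, and so on) counts as an "FT" variant, and that ties are broken by prefix and then by number. The function names `lead_key` and `pick_lead` are made up for this example, and BY4001 and A725 are placeholder names, not real members of this block.

```python
import re

def lead_key(name: str):
    """Sort key: FT-series first, then BY, then other prefixes alphabetically."""
    m = re.match(r"([A-Za-z]+)(\d+)", name)
    if m is None:
        # Names that do not look like letters+digits sort last (assumption).
        return (3, name.upper(), 0)
    prefix, number = m.group(1).upper(), int(m.group(2))
    if prefix.startswith("FT"):
        rank = 0   # assumption: FT, FTA, FTC ... all count as "FT"
    elif prefix == "BY":
        rank = 1
    else:
        rank = 2   # other labs, ordered alphabetically by prefix
    return (rank, prefix, number)

def pick_lead(variants):
    """Return the variant that would lead the block under the rule above."""
    return min(variants, key=lead_key)

# Placeholder block: only FTC36282 is taken from this thread.
block = ["BY4001", "FTC36282", "A725"]
print(pick_lead(block))  # FTC36282
```

Once a second tester shares the block, FTDNA's own choice of lead variant would take over, and that choice is not modeled here.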

FTDNA does a manual review of BigY 700 tests after the initial automated results become available. This usually takes a couple of weeks. There can be some slight changes between the two. You should be able to see your unique variants in your FTDNA dashboard; I think with your Block Tree, but I am not sure since I don't have an FTDNA account.

I hope this answers your questions adequately. :)
pdonelan
Posts: 9
Joined: Sat, 2024-Jun-01 12:10 pm

Re: Cladogram updating

Post by pdonelan »

Thank you, that was very helpful.
pdonelan
Posts: 9
Joined: Sat, 2024-Jun-01 12:10 pm

Re: Cladogram updating

Post by pdonelan »

One more thing is confusing me. On the cladogram, my terminal SNP is given as R1b-FTC36282 (coming after FTC37469, which I share with Sharon LeBlanc's father, Mr. Cassidy). My understanding from your explanation is that this (FTC36282) and the block of SNPs associated with it are my private variants. I do not yet have anyone that I share this SNP with. Is this correct?
FTDNA identify my private variants by just using the position, not name. Also, on the internet I have found statements such as "Private Variants are SNPs that are newer mutations and have not been named yet. When found in 2 or more individuals in high confidence, the SNP is named and placed on the haplotree."
I would appreciate it if you could clear up this confusion for me.
Thanks.
Webmaster
Site Admin
Posts: 1412
Joined: Wed, 2019-Jun-26 2:47 pm

Re: Cladogram updating

Post by Webmaster »

Patrick,

Yes, you do not share your unique variants yet because no one else has yet tested positive for them. That changes over time as more men test.

FTDNA previously did not name variants until 2 or more men were found to share them. That changed a few years ago, so that now FTDNA immediately names all new variants. They typically release these new variants immediately to the YBrowse database maintained by Thomas and Astrid Krahn at YSEQ on behalf of ISOGG. So with the position and allele change, you can find the name, if there is one, in YBrowse.

BUT, to be clear, they do not add variants to the public Y-Haplotree until two or more men share them. So the new variants released to YBrowse may not have any clade information and definitely not anything that could identify the tester.

"Private Variants are SNPs that are newer mutations and have not been named yet."

That statement can be misleading. It means they are more recently DISCOVERED, not that their formation is more recent. The two may be somewhat related, but there are many factors involved. For example, the particular branch involved MAY have been discovered with BigY 500 tests and then a new BigY 700 uncovers variants that may be older in formation, but just were not seen in the BigY 500 tests, which had much less coverage than the BigY 700 test.

Something that must be understood with current sequencing tests is that they DO NOT sequence a consistent contiguous region of the Y chromosome. There is a region they try to sequence, but it can have gaps in coverage. NO two tests produce the exact same results. You could do two BigY 700 tests and one may find variants that the other does not and vice versa. I do not know the exact figure, but the tests do overlap by MAYBE as much as 95%. But that means there is 5% that they DO NOT overlap, and that is why no two tests produce the exact same result.

Also keep in mind the Y chromosome is ~62 million base pairs long. The BigY 700 only sequences ~18.5 million base pairs; so not even 1/3. A WGS 30X test sequences ~23.5 million base pairs; ~5 million base pairs more than a BigY 700 test. That is a whole lot more coverage to discover crucial variants.
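
To put those figures side by side, here is a small worked calculation in Python. It uses only the approximate numbers quoted above (~62 million, ~18.5 million and ~23.5 million base pairs), so the percentages are rough; the constant and function names are just labels chosen for this example.

```python
# Approximate figures quoted in this post (base pairs).
Y_LENGTH_BP = 62_000_000   # full Y chromosome
BIGY700_BP = 18_500_000    # BigY 700
WGS30X_BP = 23_500_000     # WGS 30X

def fraction_covered(covered_bp: int, total_bp: int = Y_LENGTH_BP) -> float:
    """Share of the Y chromosome that a test sequences."""
    return covered_bp / total_bp

print(f"BigY 700: {fraction_covered(BIGY700_BP):.1%}")   # BigY 700: 29.8%
print(f"WGS 30X:  {fraction_covered(WGS30X_BP):.1%}")    # WGS 30X:  37.9%
print(f"Extra:    {WGS30X_BP - BIGY700_BP:,} bp")        # Extra:    5,000,000 bp
```

So the BigY 700 lands just under 30%, which is where the "not even 1/3" comes from.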

The reason for the incomplete coverage of the Y chromosome is technological limitations and the complexity of the Y chromosome structure. Current sequencing chops up the Y chromosome into 150 base pair segments; sequences each of these 150 base pair segments; and then uses computers to glue all these 150 base pair segments back into the ~62 million base pair complete Y chromosome. But 150 base pair segments CANNOT adequately cover the complete Y chromosome because there are places with repeating sequences that are far longer than 150 base pairs, so there is no way to successfully glue the segments back together in these regions.
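
The repeat problem can be shown with a toy example in Python. This is only a sketch under simplifying assumptions: the two "genomes" are invented strings, reads are error-free, and only the set of distinct reads is compared (read depth and paired ends are ignored). The names `read_set`, `genome_a` and `genome_b` are made up for the illustration. The two strings differ only in how many copies of a short repeat they contain.

```python
def read_set(genome: str, read_len: int) -> set:
    """All distinct error-free reads of a given length from a sequence."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}

# Two toy sequences with the same flanks but different repeat counts.
LEFT, RIGHT, UNIT = "GATT", "GGCA", "AC"
genome_a = LEFT + UNIT * 10 + RIGHT   # repeat region of 20 bases
genome_b = LEFT + UNIT * 15 + RIGHT   # repeat region of 30 bases

SHORT_READ = 6    # shorter than the repeat region
LONG_READ = 26    # long enough to span genome_a's repeat plus both flanks

# Short reads: both sequences yield exactly the same distinct reads,
# so the repeat length cannot be recovered from them.
print(read_set(genome_a, SHORT_READ) == read_set(genome_b, SHORT_READ))  # True

# Long reads: a read can reach from one flank across the repeat to the
# other flank, so the two sequences become distinguishable.
print(read_set(genome_a, LONG_READ) == read_set(genome_b, LONG_READ))    # False
```

Real assemblers do use depth and pairing information, so this exaggerates the problem, but the basic point holds: reads that cannot span a repeat cannot anchor it on both sides.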

Within the last couple of years researchers have been able to completely sequence the Y chromosome by using new technology that reads 1000 to 100,000 base pair segments. These can successfully span the problematic regions of the Y chromosome. But these Long Read sequencers are not yet as accurate as the shorter 150 base pair sequencers. The T2T reference that does span the entire Y chromosome was built using a combination of Long Read and short read machines. It cost over US$50,000 in materials and thousands of volunteer man-hours to develop. In other words, you can't get a T2T test commercially yet. Hopefully in 5 years or so!
pdonelan
Posts: 9
Joined: Sat, 2024-Jun-01 12:10 pm

Re: Cladogram updating

Post by pdonelan »

Thank you. That fully explains everything.