R1b-A1742 has this phylogenetic equivalent variant:
BY35985 (chrY: 20083013-G-C)
and its R1b-A12551 subclade has this phylogenetic equivalent variant:
TBT10365 (chrY_KI270740v1_random: 19766-C-G)
But when mapped to the CP086569.2 reference using the UCSC LiftOver tool, both land on the same position with the same alleles:
BY35985 (CP086569.2: 20980089-C-G)
TBT10365 (CP086569.2: 20980089-C-G)
R1b-BY69603 has these two phylogenetic equivalent variants:
BY214017 (chrY: 20082979-C-T)
TBT10388 (chrY_KI270740v1_random: 19800-G-A)
But when mapped to the CP086569.2 reference using the UCSC LiftOver tool, both again land on the same position with the same alleles:
BY214017 (CP086569.2: 20980123-G-A)
TBT10388 (CP086569.2: 20980123-G-A)
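A quick consistency check on the coordinates above, as a minimal Python sketch. It uses only the positions and alleles listed in this post; the helper names `strand_of` and `spacing` and the record layout are illustrative assumptions, not part of any liftover tool. It checks whether each variant kept its alleles or had them complemented, and whether the spacing between the two variant pairs is preserved.

```python
# Coordinates copied from the post.
# (name, GRCh38 contig, GRCh38 pos, ref, alt, CP086569.2 pos, ref, alt)
RECORDS = [
    ("BY35985", "chrY", 20083013, "G", "C", 20980089, "C", "G"),
    ("TBT10365", "chrY_KI270740v1_random", 19766, "C", "G", 20980089, "C", "G"),
    ("BY214017", "chrY", 20082979, "C", "T", 20980123, "G", "A"),
    ("TBT10388", "chrY_KI270740v1_random", 19800, "G", "A", 20980123, "G", "A"),
]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strand_of(ref, alt, t_ref, t_alt):
    """Return '+' if alleles are unchanged, '-' if complemented, '?' otherwise."""
    if (ref, alt) == (t_ref, t_alt):
        return "+"
    if (ref.translate(COMPLEMENT), alt.translate(COMPLEMENT)) == (t_ref, t_alt):
        return "-"
    return "?"


strands = {r[0]: strand_of(r[3], r[4], r[6], r[7]) for r in RECORDS}
by_name = {r[0]: r for r in RECORDS}


def spacing(a, b):
    """Signed distance from variant a to variant b as (GRCh38, CP086569.2)."""
    return (by_name[b][2] - by_name[a][2], by_name[b][5] - by_name[a][5])


print(strands)
print(spacing("BY35985", "BY214017"))   # main chrY pair
print(spacing("TBT10365", "TBT10388"))  # random-contig pair
```

On these four records, the main chrY variants come out complemented with the sign of the 34 bp spacing flipped, while the random-contig variants keep both. That is what one would expect if the two GRCh38 segments are reverse-complement copies of the same sequence, though four data points are suggestive rather than proof.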
I am not sure what to do with these two cases. Currently, I am leaving the apparent duplicates in the database. But it makes me wonder how many more of the GRCh38 chrY_KI270740v1_random contig positions are in fact duplicates of main chrY contig positions.
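One way to answer that question for a whole database would be to group the lifted-over records by target position and alleles, then report every group with more than one variant name. The sketch below is a minimal Python illustration; `find_liftover_duplicates` and the tuple layout are hypothetical, and the sample rows are just the four variants from this post. A real run would feed in the liftover output for every main chrY and chrY_KI270740v1_random variant instead.

```python
from collections import defaultdict

# Lifted-over records: (variant name, target contig, position, ref, alt).
# Sample data only: the four variants discussed in this post.
LIFTED = [
    ("BY35985", "CP086569.2", 20980089, "C", "G"),
    ("TBT10365", "CP086569.2", 20980089, "C", "G"),
    ("BY214017", "CP086569.2", 20980123, "G", "A"),
    ("TBT10388", "CP086569.2", 20980123, "G", "A"),
]


def find_liftover_duplicates(lifted):
    """Group variant names by (contig, pos, ref, alt); keep groups of 2+."""
    groups = defaultdict(list)
    for name, contig, pos, ref, alt in lifted:
        groups[(contig, pos, ref, alt)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


dups = find_liftover_duplicates(LIFTED)
for (contig, pos, ref, alt), names in sorted(dups.items()):
    print(f"{contig}: {pos}-{ref}-{alt} <- {', '.join(names)}")
```

Keying on the alleles as well as the position avoids flagging two genuinely different variants that happen to share a position.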
The main takeaway is the obvious deficiency of the GRCh38 reference and the urgent need for an R1b-based T2T reference instead of the current T2T reference, which is based on NA24385, an Ashkenazi male belonging to Y haplogroup J1-M267. Unfortunately, without specific funding for an R1b-based reference, one is not likely to be forthcoming in the near future.
The GRCh38 reference is primarily based on R1b men, and as the quote below indicates, efforts won't be focused on R1b any time soon. To be VERY clear, this is not an ethnic rant, but a lament about the scientific need for an R1b-based T2T reference, due to the significant differences in Y chromosome structure among the various major Y haplogroups and the fact that a HUGE amount of R1b data has been collected and desperately needs the benefit of an R1b-based T2T reference.
"Since the release of the complete human genome, the priority of human genomic study has now been shifting towards closing gaps in ethnic diversity. Here, we present a fully phased and well-annotated diploid human genome from a Han Chinese male individual (CN1), in which the assemblies of both haploids achieve the telomere-to-telomere (T2T) level."
https://www.nature.com/articles/s41422- ... 23100cbc13