Description

These tracks show curated SARS-CoV-2 protein-coding genes conserved within the Sarbecovirus subgenus as determined using PhyloCSF [1], FRESCo [2], and other comparative genomics methods, consistent with experimental evidence in SARS-CoV-2. Ambiguous gene names were resolved according to the recommendations in [3]. For a complete description of the evidence, see [4].

Data Access

For automated analysis, the genome annotations are available as BED and bigBed files, which can be downloaded from the SARS-CoV-2 PhyloCSF Genes track hub.
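As a starting point for such automated analysis, the following is a minimal sketch of reading BED records in Python. It assumes a tab-separated BED file with at least the chrom, start, and end columns and an optional name column. The `BedRecord` type, the `parse_bed` helper, and the sample lines are hypothetical illustrations, not part of the hub or copied from its files. bigBed files are binary, so they would first need to be converted to BED (for example with the UCSC `bigBedToBed` utility) before this kind of text parsing applies.

```python
from typing import Iterator, NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)
    name: str


def parse_bed(text: str) -> Iterator[BedRecord]:
    """Yield one BedRecord per data line of BED-formatted text."""
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 BED columns: {line!r}")
        name = fields[3] if len(fields) > 3 else ""
        yield BedRecord(fields[0], int(fields[1]), int(fields[2]), name)


# Illustrative sample only: hypothetical lines, not copied from the hub.
SAMPLE_BED = (
    "NC_045512v2\t21562\t25384\tS\n"
    "NC_045512v2\t25392\t26220\tORF3a\n"
)

for rec in parse_bed(SAMPLE_BED):
    # Feature length in nucleotides is end - start under BED coordinates.
    print(rec.name, rec.start, rec.end, rec.end - rec.start)
```

To process a downloaded file instead of the sample string, the file contents could be read with `open(path).read()` and passed to `parse_bed`. Because BED starts are 0-based, add 1 to `start` when comparing against 1-based coordinates such as those in GenBank records.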

Credits

Questions should be directed to Irwin Jungreis.

If you use the SARS-CoV-2 PhyloCSF Genes track hub, please cite Jungreis et al. 2021 [4].

References

[1] Lin MF, Jungreis I, Kellis M (2011). PhyloCSF: a comparative genomics method to distinguish protein-coding and non-coding regions. Bioinformatics 27(13), i275-i282. doi:10.1093/bioinformatics/btr209

[2] Sealfon RS, Lin MF, Jungreis I, Wolf MY, Kellis M, Sabeti PC (2015). FRESCo: finding regions of excess synonymous constraint in diverse viruses. Genome Biology 16(1), 1-14. doi:10.1186/s13059-015-0603-7

[3] Jungreis I, Nelson CW, Ardern Z, Finkel Y, Krogan NJ, Sato K, ..., Kellis M (2021). Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution. Virology 558, 145-151. doi:10.1016/j.virol.2021.02.013

[4] Jungreis I, Sealfon R, Kellis M (2021). SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes. Nature Communications 12(1), 1-20. doi:10.1038/s41467-021-22905-7