With its ambitious mission, the VGP aims to address fundamental questions in biology, conservation, and disease, including identifying the species most genetically at risk of extinction and preserving their genetic information for future generations. The high-quality VGP genomes will become the main references for their species and will be stored in the Genome Ark, a digital open-access library of genomes.

The current Phase 1 of the VGP, the VGP orders project, aims to create reference assemblies of selected species representing all 260 vertebrate orders that diverged from one another shortly after the last mass extinction 66 million years ago. Studying these ordinal-level species will help scientists determine what types of species survived the previous extinction event, which wiped out the dinosaurs. Those studies can also give insights into how other species could survive the current sixth mass extinction event and help identify genetic variants that might protect these species from total extinction.

Among the 15 new genomes are critically endangered species like the platypus and the kakapo parrot. Other species include the zebra finch songbird and Anna’s hummingbird, which, together with parrots, represent the only three vocal learning bird orders among over 40 orders of birds. Two vocal learning bat species are also part of this first data release. Two of the 15 released genomes, a bat and a fish, were sequenced and assembled at the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG) in Dresden.

To conduct the VGP, the umbrella G10K organization, from which the project arose, has convened over 150 experts from academia, industry, and government in 12 countries to develop high-resolution sequencing methods that both reduce costs and eliminate the errors that plague current reference genomes. Many current reference genomes are riddled with errors: parts of genes are missing, some genes are incorrectly assembled, and others are missing entirely. Consequently, researchers may be working with incorrect gene sequences and structures, which hampers their genomic studies. The new VGP genomes eliminate most of these errors.

Genome analysis of bats and fish

The Max Planck Institute of Molecular Cell Biology and Genetics, and in particular its bioinformatics researchers at the Center for Systems Biology Dresden (CSBD), are involved in the sequencing, assembly, and annotation of the initial Phase 1 genomes of the VGP, with a focus on bats and fish. The Dresden scientists are part of the DRESDEN-concept Genome Center (DCGC) and have special expertise in using various long-read sequencing and long-range scaffolding technologies. The Dresden hub, led by Eugene Myers, has contributed two of the 15 released genomes: the greater horseshoe bat (Rhinolophus ferrumequinum) and the flier cichlid fish (Archocentrus centrarchus). In the future, about 10 to 20 percent of the VGP species are expected to be sequenced in Dresden.

Eugene Myers, director at the Dresden Max Planck Institute and founder of the CSBD, says, “The advances in long-read sequencing are revolutionizing DNA sequencing. After a ten-year hiatus, this trend inspired me to return to genome assembly as I believe it implies that we will ultimately be able to produce near-perfect genome reconstructions. I think this capability is going to dramatically alter the landscape of genomics.”

In addition to the VGP, the Max Planck Institute of Molecular Cell Biology and Genetics and the CSBD are actively engaged in synergistic international sequencing projects. The Bat1K project has the goal of sequencing all 1,300 bat species, many of which live unusually long or have near-perfect immune systems. Six bat genomes will be released in the near future, and another 25 species are being prepared to study aging, immunity, and vocal learning in collaboration with the Bat1K consortium, which includes partners Sonja Vernes from the Max Planck Institute for Psycholinguistics in the Netherlands and Emma Teeling of University College Dublin, Ireland.

Another project is the Euro-Fish project, which aims to sequence almost all 600 species of fish living in European freshwaters. One of the main collaborators is Axel Meyer of the University of Konstanz. The Max Planck Society is funding the initial genomes from these synergistic projects. All the genomes will be sequenced to the high-quality standard set by the VGP and will be placed in the Genome Ark repository, where one day all 66,000 vertebrate species will be recorded.

The 15 new genomes

  1. Mammals (4 species)

     - Two bat species, the greater horseshoe bat (Rhinolophus ferrumequinum) and the pale spear-nosed bat (Phyllostomus discolor), used as models for longevity and vocal learning
     - The Canada lynx (Lynx canadensis), once nearly extinct in the United States and now recovering
     - The duck-billed platypus (Ornithorhynchus anatinus), an egg-laying mammal with reptilian traits

  2. Reptiles (1 species)

     - A newly discovered turtle species from Mexico, Goode’s thornscrub tortoise (Gopherus evgoodei)

  3. Amphibians (1 species)

     - Two-lined caecilian (Rhinatrema bivittatum), a limbless amphibian that resembles a snake

  4. Birds (3 species, 4 genomes)

     - In addition to the kakapo (Strigops habroptilus), the VGP re-sequenced species from two other bird orders to represent the only three vocal learning bird orders among more than 40 avian orders
     - A male and a female zebra finch (Taeniopygia guttata), the most commonly studied vocal learner
     - Anna’s hummingbird (Calypte anna), belonging to the smallest group of birds

  5. Fish (5 species that represent a large diversity of traits and are used to study species evolution and adaptation)

     - Flier cichlid (Archocentrus centrarchus), native to Central America
     - Eastern happy (Astatotilapia calliptera), also a cichlid fish, native to Lake Malawi, Africa
     - Climbing perch (Anabas testudineus), native to inland waters of Southeast Asia
     - Tire track eel (Mastacembelus armatus), native to rivers of Southeast Asia
     - Blunt-snouted clingfish (Gouania willdenowi), native to the northern Mediterranean coast, from Syria to Spain