When it was first discovered, the coelacanth caused a lot of excitement. It was a living example of a group of fish that was thought to only exist as fossils. And not just any group of fish. With their long, stalk-like fins, coelacanths and their kin are thought to include the ancestors of all vertebrates that aren't fish—the tetrapods, or vertebrates with four limbs. Meaning, among a lot of other things, us.
Since then, however, evidence has piled up that we're more closely related to lungfish, which live in freshwater and are found in Africa, Australia, and South America. But lungfish are a bit weird. The African and South American species have seen the limb-like fins of their ancestors reduced to thin, floppy strands. And getting some perspective on their evolutionary history has proven difficult because they have the largest genomes known in animals, with the South American lungfish genome containing over 90 billion base pairs. That's 30 times the amount of DNA we have.
But new sequencing technology has made tackling that sort of challenge manageable, and an international collaboration has now completed the largest animal genome sequenced to date, one where all but one chromosome carry more DNA than is found in the entire human genome. The work points to a history where the South American lungfish has been adding 3 billion extra bases of DNA every 10 million years for the last 200 million years, all without adding a significant number of new genes. Instead, it seems to have lost the ability to keep junk DNA in check.
Going long
The work was enabled by a technology generically termed "long-read sequencing." Most genomes completed to date were assembled from short reads, typically around 100–200 base pairs long. The secret was to do enough sequencing that, on average, every base in the genome would be sequenced multiple times. Given that, a cleverly designed computer program could figure out where two bits of sequence overlapped and register them as a single, longer piece of sequence, repeating the process until the computer spit out long strings of contiguous bases.
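To make the overlap-and-merge idea concrete, here is a minimal Python sketch of a greedy assembler working on a toy sequence. It is an illustration only: the function names (`overlap`, `greedy_assemble`) and the `min_len` threshold are assumptions made for this example, the reads are assumed to be error-free, and real assembly software uses far more sophisticated methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when no remaining pair overlaps by at least `min_len` bases and
    returns whatever contiguous pieces are left.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, keeping the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping 8-base reads taken from a made-up 16-base sequence.
reads = ["ATGCGTAC", "GTACCTGA", "CTGAAGTC"]
print(greedy_assemble(reads))
```

With these three reads, each consecutive pair shares four bases, so the merges chain together into a single 16-base string.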
The problem is that most non-microbial species have stretches of repeated sequence (think hundreds of copies of the bases G and A in a row) that are longer than a few hundred bases, along with nearly identical sequences that show up in multiple locations in the genome. These are impossible to match to a unique location, so the output of the genome assembly software has lots of gaps of unknown length and sequence.
This creates extreme difficulty for genomes like that of the lungfish, which is filled with non-functional "junk" DNA, all of which is typically repetitive. The software tends to produce a genome that's more gap than sequence.
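A toy example shows why repeats are such a problem for short reads. The Python sketch below builds a made-up sequence containing two identical copies of a 20-base GA repeat and lists every position where a read could be placed. The sequence, the read, and the `placements` helper are all invented for illustration.

```python
def placements(genome: str, read: str) -> list[int]:
    """Return every start position where `read` matches `genome` exactly."""
    hits = []
    pos = genome.find(read)
    while pos != -1:
        hits.append(pos)
        pos = genome.find(read, pos + 1)  # step by one so overlapping hits count
    return hits


repeat = "GA" * 10  # a 20-base stretch of alternating G and A
genome = "ACGTTC" + repeat + "TTCAGC" + repeat + "CCATGG"

short_read = "GAGAGAGA"   # 8 bases that fall entirely inside the repeat
unique_read = "TTCAGC"    # 6 bases from the unique stretch between the repeats

print(len(placements(genome, short_read)), "possible placements for the repeat read")
print(placements(genome, unique_read), "is the only placement for the unique read")
```

The read drawn from inside the repeat fits equally well at many positions in both copies, so nothing in the read itself says where it belongs. The read from the unique stretch has exactly one home.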
Long-read technology gets around that by doing exactly what its name implies. Rather than being limited to fragments of 200 bases or so, it can generate sequences that are thousands of base pairs long, easily covering the entire repeat that would have otherwise created a gap. One early version of long-read technology involved stuffing long DNA molecules through pores and watching for different voltage changes across the pore as different bases passed through it. Another had a DNA copying enzyme make a duplicate of a long strand while watching for fluorescence changes as different bases were added. These early versions tended to be a bit error-prone but have since been improved, and several newer competing technologies are now on the market.
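The benefit of read length can be sketched with the same kind of toy sequence. The Python example below asks how long a read starting at the first base of a repeat must be before it matches only one place in the sequence. The sequence and both helper functions are hypothetical, and error-free reads are assumed.

```python
def count_occurrences(genome: str, read: str) -> int:
    """Count exact matches of `read` in `genome`, including overlapping ones."""
    count = 0
    pos = genome.find(read)
    while pos != -1:
        count += 1
        pos = genome.find(read, pos + 1)
    return count


def shortest_unique_length(genome: str, start: int):
    """Smallest read length, starting at `start`, that matches only one location.

    Returns None if even the longest possible read from `start` is ambiguous.
    """
    for length in range(1, len(genome) - start + 1):
        if count_occurrences(genome, genome[start:start + length]) == 1:
            return length
    return None


repeat = "GA" * 10  # a 20-base repeat that appears twice
genome = "ACGTTC" + repeat + "TTCAGC" + repeat + "CCATGG"

repeat_start = 6  # index of the first base of the first repeat copy
print(shortest_unique_length(genome, repeat_start))
```

In this toy case, a read only becomes unambiguous once it runs past the end of the 20-base repeat into the unique sequence beyond it. Reads of a few thousand bases do the same thing for repeats that are hundreds of bases long.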
Back in 2021, researchers used this technology to complete the genome of the Australian lungfish—the one that maintains the limb-like fins of the ancestors that gave rise to tetrapods. Now they're back with the genomes from African and South American species. These species seem to have gone their separate ways during the breakup of the supercontinent Gondwana, a process that started nearly 200 million years ago. And having the genomes of all three should give us some perspective on the features that are common to all lungfish species, and thus are more likely to have been shared with the distant ancestors that gave rise to tetrapods.
Lots of junk, no cleaning service
For starters, it's worth noting again how 20 years of technology development has completely revolutionized things. The human genome, at 3 billion bases' worth of DNA, took multiple international consortiums years to finish. For this paper, a team of just 25 people managed to complete genomes that were 40 billion and 90 billion bases long. Those 90 billion bases were spread across 19 chromosomes, and 18 of those were each longer than the entire human genome put together.
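For a sense of scale, the figures quoted here can be run through a few lines of Python. This is only back-of-the-envelope arithmetic on the rounded numbers in the text; the constant and function names are made up for the example.

```python
HUMAN_GENOME_BP = 3_000_000_000               # ~3 billion bases
SOUTH_AMERICAN_LUNGFISH_BP = 90_000_000_000   # ~90 billion bases
LUNGFISH_CHROMOSOMES = 19


def fold_difference(genome_bp: int, reference_bp: int = HUMAN_GENOME_BP) -> float:
    """How many times larger one genome is than a reference genome."""
    return genome_bp / reference_bp


def mean_chromosome_bp(genome_bp: int, n_chromosomes: int) -> float:
    """Average number of bases per chromosome."""
    return genome_bp / n_chromosomes


print(fold_difference(SOUTH_AMERICAN_LUNGFISH_BP))
print(mean_chromosome_bp(SOUTH_AMERICAN_LUNGFISH_BP, LUNGFISH_CHROMOSOMES))
```

The average lungfish chromosome works out to well over 4 billion bases, which fits with nearly all of them being individually larger than the whole human genome. An average alone cannot show that, though; the per-chromosome claim comes from the paper.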
The human genome is also an interesting point of comparison in that our genome has roughly 20,000 protein-coding genes. And these fish, with up to 30 times as much DNA... also have about 20,000 protein-coding genes. As do pretty much all the other tetrapods we've looked at (there are exceptions, like a frog called Xenopus laevis, that carries around an extra set of chromosomes). In fact, the genes appear to be in a configuration that is likely to represent something similar to that found in the ancestor of all tetrapods, meaning that genes that are next to each other now are likely to have been next to each other nearly 400 million years ago.
So, if that extra DNA isn't there to support a lot of additional genes, what's it doing there? All indications are nothing, or at least nothing that's useful to the fish. Instead, most of the additional DNA appears to be junk.
Junk DNA is a generic term that describes genome debris that has a tendency to accumulate. It can be superfluous copies of useful genes, damaged copies of unused ones, pieces of inactivated viruses, and DNA-level parasites called transposable elements that can move about the genome. In the case of the lungfish, most of the junk seems to be transposable elements; the South American species, which has roughly twice as much DNA as its two relatives, also has twice the number of transposable elements.
Most genomes have a large number of transposable elements; they account for about 40 percent of human DNA, for example. But genomes have also evolved mechanisms that keep these elements from hopping out of control. Those mechanisms seem to be considerably weaker in the lungfish. The fish make fewer functional copies of an RNA that helps shut down transposable element movement. And a gene family that silences transposable elements appears to have far fewer members in the South American lungfish. (Humans and other lungfish have about 300 copies, while the South American lungfish only has 23.)
The net result appears to be enough extra copies of transposable elements to create the incredible bloat found in the South American species. Based on evidence of when the different species separated, the researchers estimate that the South American lungfish genome has been growing by the equivalent of a human genome every 10 million years, and doing so for roughly 200 million years.
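The quoted growth rate is easy to turn into a total. The short Python sketch below multiplies it out, assuming for simplicity that the rate was constant over the whole period (the researchers' estimate is an average, and the text does not say the growth was steady). The function name is invented for the example.

```python
def total_bases_added(bases_per_step: int, step_years: int, span_years: int) -> int:
    """Total DNA gained if `bases_per_step` bases are added every `step_years` years."""
    steps = span_years // step_years
    return bases_per_step * steps


added = total_bases_added(
    bases_per_step=3_000_000_000,  # one human genome's worth of DNA
    step_years=10_000_000,         # every 10 million years
    span_years=200_000_000,        # sustained for roughly 200 million years
)
share_of_genome = added / 90_000_000_000  # compared with today's ~90 billion bases

print(added)
print(round(share_of_genome, 2))
```

Under that assumption, about 60 billion of the roughly 90 billion bases in the South American lungfish genome, or around two-thirds, would have been added during that period.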
What made us?
Aside from examining the explosion of junk DNA, the researchers also spent some time looking into the thin fins found in the African and South American species. The researchers hypothesized that this might be related to the activity of the gene Sonic hedgehog (yes, named after the game character), which helps set up the pattern of specialized digits seen in tetrapods. The gene is normally active in a specific location in the developing limb, as well as the fin of the Australian lungfish. But that activity is missing in the African and South American species.
There were some other changes in gene activity found in the developing fins, suggesting that the difference doesn't involve the loss or gain of a gene, but rather changes in how existing genes are used. And this happened at a time when, at least physically, the Australian lineage remained largely unchanged.
The explosion in genome size tells us something about why the amount of an organism's DNA seems largely unrelated to its physical or behavioral complexity. But the real value of these genomes will come once people start using them to understand what changes took place between these fish and the tetrapods that ultimately took over land.