Cruciviruses & RNA-DNA Hybrid Viruses

A strange sequence, a big discovery

Every year since 2001, the Stedman lab has traveled to Lassen Volcanic National Park in Northern California to sample the hot, acidic environments created by volcanic hot spots. One year, while combing through metagenomic data from Boiling Springs Lake in Lassen, we found a strange sequence. The data showed a single-stranded DNA (ssDNA) viral sequence that had homology to a DNA virus but also carried a gene most similar to one from an RNA virus. At first, we assumed that this was an error in our data, because nothing like it had been seen before. However, after finding that the sequence was consistent throughout the metagenome, we were able to piece together what would become the sequence of the very first virus of its kind to be discovered. It was named Boiling Springs Lake RNA-DNA Hybrid Virus (BSL-RDHV), after the place it was found, and the 2012 publication of Diemer & Stedman marked the official discovery of these viral sequences.

Since then, many more viral sequences similar to BSL-RDHV have been detected in metagenomes from a wide range of environments - from Tampa Bay to the Finger Lakes of upstate New York, Antarctic lakes, deep sea sediments, and dragonfly innards. These virus sequences have been dubbed "chimeric viruses" or "cruciviruses" because of the DNA-RNA viral recombination that their genomes imply. The frequency and mechanism of the recombination that generates cruciviruses are unclear, but working them out is critical for understanding virus evolution and the emergence of new viruses, and could offer insights into the transition between RNA and DNA in the evolution of early cellular life on Earth.

Cruciviridae may be the result of recombination between a ssDNA virus and a ssRNA virus. The capsid protein-encoding genes of cruciviruses are most similar to those of ssRNA viruses, while the Rep-encoding genes are most similar to those of ssDNA viruses.

All cellular life forms use DNA as their genetic material; viruses are the only contemporary life forms that sometimes use RNA instead. Viruses have also been proposed to have been the first to use double-stranded DNA (dsDNA) as genetic material in a putative RNA world, with cellular life only later acquiring this strategy. Viruses that use RNA may therefore be remnants of the “RNA World”, and RNA-DNA hybrid viruses may be modern-day relics of the transition from RNA to DNA as the main genetic material.

There are two kinds of RNA viruses: single-stranded (ss), such as influenza, Ebola and Zika viruses, and double-stranded (ds). Another class of viruses is the retroviruses, such as HIV-1, which switch between DNA and RNA as part of their replication cycle. Viruses exchange genetic material within groups - DNA viruses with DNA viruses, and RNA viruses with RNA viruses - but they were thought not to exchange genetic material between RNA and DNA until the discovery of RNA-DNA Hybrid Viruses (RDHVs), now called cruciviruses (CruVs).

Currently, our research on these new and peculiar virus-like sequences in the Stedman lab focuses on three components:
          • Understanding the replication mechanism and how the Rep protein works in BSL-RDHV
          • Characterizing and expressing the capsid protein in BSL-RDHV and other cruciviruses
          • Elucidating the host of BSL-RDHV and other cruciviruses through environmental sampling

 

The mysteries of BSL-RDHV Replication

Scooping up environmental samples from Boiling Springs Lake in Lassen Volcanic National Park. Due to unstable volcanic hot spots in the earth around the lake, sampling is done from afar, and requires some creativity. This is where BSL-RDHV was discovered, and where we hope to find more viral sequences like it.

Almost all ssDNA viruses with circular genomes replicate their genomes in a similar fashion, called rolling circle replication (RCR). The BSL-RDHV genome, reported in 2012, contains a Rep gene very similar to the Rep gene of porcine circovirus 2 (PCV2), an ssDNA virus that belongs to the circular Rep-encoding single-stranded (CRESS) DNA viruses. The ssDNA genome also contains a conserved stem-loop structure, and a gene very similar to capsid genes that, until then, had been found only in RNA viruses.

It is notable that the gene acquired from an RNA virus encodes the capsid protein. The capsid forms the outer shell of the virus and determines what the virus can infect, how it is transmitted, and how it avoids host defenses. This is exactly the kind of gene that would provide an evolutionary advantage to a virus that acquired it. Current work in our lab involves learning more about the BSL-RDHV replication process and the Rep protein, as well as BSL-RDHV capsid expression.

The critical protein for RCR is the virus replication initiation protein, or Rep. This protein initiates replication by cutting the viral DNA at a conserved stem-loop secondary structure, the double-strand origin (DSO). The cut provides a starting point, or primer, which cellular DNA polymerases extend to replicate the DNA. In most ssDNA viruses, the Rep protein is also a helicase that peels off the non-template strand of DNA. Finally, the Rep protein rejoins (ligates) the genome after a single round of replication.
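
The conserved stem-loop that Rep cuts can be searched for computationally as an inverted repeat (the stem) flanking a short unpaired stretch (the loop). The following is a minimal sketch of that idea, not the method used in the lab: the function names, the stem and loop length limits, and the toy genome are all made up for illustration, and only perfect base-pairing is considered.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_stem_loops(genome: str, stem_len: int = 6,
                    min_loop: int = 4, max_loop: int = 12):
    """Find perfect inverted repeats (candidate stem-loops) in a circular genome.

    Returns a list of (start, stem, loop) tuples, where start is the
    0-based position of the first base of the left stem arm.
    The default lengths are illustrative assumptions, not measured values.
    """
    n = len(genome)
    hits = []
    for start in range(n):
        for loop_len in range(min_loop, max_loop + 1):
            total = 2 * stem_len + loop_len
            if total > n:
                break
            # Read the window with wraparound, because the genome is circular.
            window = "".join(genome[(start + i) % n] for i in range(total))
            left = window[:stem_len]
            loop = window[stem_len:stem_len + loop_len]
            right = window[stem_len + loop_len:]
            # The right arm must pair with the left arm to form a stem.
            if right == reverse_complement(left):
                hits.append((start, left, loop))
    return hits


# Made-up toy genome: poly-A flanks around a designed hairpin
# (6-base stem, 9-base loop, 6-base complementary stem).
toy_genome = "A" * 10 + "GCGGCA" + "TAGTATTAC" + "TGCCGC" + "A" * 10
hits = find_stem_loops(toy_genome)
```

On the toy genome, the designed hairpin starting at position 10 is among the reported hits. Real stem-loops can contain mismatches and vary in length, so an exact-match search like this one is only a starting point.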

 
 

Cruci Hunters

In 2017, Stedman lab post-doc Dr. Nacho de la Higuera and undergrad Ellis Torrance discovered a new cruciviral sequence in an environmental sample from the marshland above a peat bog in Woodburn, OR. Since then, they have discovered three new sequences from Woodburn (de la Higuera et al., 2019).

Part of the "crucicrew" on our 2017 trip to Lassen.

The genome of PB1-RDHV

PB1-RDHV is a small sequence, at 2.95 kb, and the majority of its genome consists of two genes: the capsid gene, which is closely related to those of the ssRNA virus family Tombusviridae, and the Rep gene, which, as in BSL-RDHV, is most closely related to those of circular ssDNA viral families.
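
Genes in a small circular genome like this one are typically first proposed by scanning for open reading frames (ORFs) on both strands, allowing a frame to run across the point where the sequence was linearized. The sketch below illustrates that scan under stated assumptions: ATG as the only start codon, the three standard stop codons, a made-up toy sequence rather than the PB1-RDHV genome, and hypothetical function names. It is not the annotation pipeline used for PB1-RDHV.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def orfs_on_strand(seq: str, min_codons: int):
    """Return (start, orf) pairs for ATG-to-stop ORFs on one circular strand.

    Every in-frame ATG is reported, so nested ORFs sharing a stop codon
    appear separately. min_codons counts codons before the stop codon.
    """
    n = len(seq)
    doubled = seq + seq  # lets a reading frame run across the origin
    found = []
    for start in range(n):
        if doubled[start:start + 3] != "ATG":
            continue
        # Walk codon by codon, at most once around the circle.
        for pos in range(start, start + n - 2, 3):
            if doubled[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    found.append((start, doubled[start:pos + 3]))
                break
    return found


def find_orfs(genome: str, min_codons: int = 100):
    """Scan both strands; '-' starts are positions on the reverse complement."""
    results = [("+", s, orf) for s, orf in orfs_on_strand(genome, min_codons)]
    rc = reverse_complement(genome)
    results += [("-", s, orf) for s, orf in orfs_on_strand(rc, min_codons)]
    return results


# Made-up toy genome whose only ORF starts near the end and wraps the origin.
toy_genome = "CCCGGGTAA" + "CTCTCT" + "ATGAAA"
toy_orfs = find_orfs(toy_genome, min_codons=3)
```

Doubling the sequence is a simple way to treat the genome as circular. An ORF found this way is only a candidate gene until its predicted protein is compared against known capsid and Rep sequences.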

The first step in cruci hunting involves environmental sampling. The lab has sampled from the aforementioned Woodburn peat bog and Boiling Springs Lake in Lassen, as well as Drakesbad Marsh in Lassen, the PSU campus greenhouse, and a marsh in Gearhart, OR. Many previous discoveries of cruciviral DNA have come from these "water-mixing" environments - places that are marshy, wet, and muddy - so that's where our efforts have been focused. After collection, the samples are processed and filtered by a number of methods to purify the DNA present. Circular sequences are then selectively amplified, and PCR methods are used to recover the cruciviral sequences within.

The lovely Woodburn "Peat Bog" is the marsh where PB1-RDHV was first detected. We call it that because it sits on top of a peat bog where many Ice Age fossils have also been uncovered, but really it's just a ditch. The crucicrew takes frequent trips here for sampling in hopes of finding more crucis, and to find the host of PB1-RDHV.

Dr. Nacho and Ellis Torrance testing the pH of soil samples from Drakesbad Marsh, Lassen Volcanic Natl. Park, Northern California.

From here, the complete crucivirus genome can be further studied and used in the hunt for its natural host. Because the two genes in PB1 are both closely related to genes of eukaryotic viruses, and because of the marsh-like environment in which it was found, we hypothesize that PB1's natural host is eukaryotic.

 

Key Publications

Researchers