Viruses are
known as the most abundant and genetically diverse life forms on the planet,
but there is still a lack of information on many of the various types that make
up this group, including the phylogenetic relationships between them. Single-stranded
DNA (ssDNA) viruses are currently divided into seven families, infect plants,
animals and bacteria, and have small genomes (between 1.4 and 8.5 kb). Two of
these families are bacteriophages (Inoviridae and Microviridae) and five infect
eukaryotes (Nanoviridae and Geminiviridae, which infect plants; and Circoviridae,
Parvoviridae and Anelloviridae, which infect animals). The genomes of these ssDNA
viruses can encode as few as two proteins: a capsid protein and a replication
initiator.
Metagenomic
techniques in other studies have detected sequences similar to those of ssDNA
viruses in both marine and freshwater systems, but comparative analyses of
database sequences and comprehensive metagenomic studies are needed to identify
ssDNA viruses and to determine how diverse this group is. This study
investigated the diversity of ssDNA viruses in marine ecosystems, in both
temperate (Saanich Inlet, SI, and Strait of Georgia, SOG) and subtropical (Gulf
of Mexico, GOM) waters, to expand current knowledge of the diversity of these
viral families. The authors also wanted to determine whether any novel groups of
ssDNA viruses have diverged to the extent of forming new families, by comparing
the phylogenetic relationships between the viruses identified in the samples.
The authors
recovered 608 complete composite genomes from the 4995 contiguous sequences
(contigs) assembled from the three sites, which almost doubled the number of
sequenced ssDNA viruses in the NCBI database. Only a small share of the contigs
were homologous to known viruses: 4.9% and 2.0% of the total matched the
Circoviridae and Nanoviridae, respectively, and approximately 1.6% matched the
genus Gokushovirus from the Microviridae. Because viruses within extant
families seem to possess similar genomes, the authors suggest that many of the
new groups (also known as CDS groups) are part of previously unknown viral
families.
These composite
genomes were compared with other ssDNA viruses and environmental sequences
using Feature Frequency Profile (FFP) analysis and tBLASTx sequence similarity
searches. The FFP analysis was able to distinguish the different families of
viruses, including nanoviruses. Related sequences were then grouped by
homology, with the tBLASTx results drawn as a network in which each cloud is a
group of sequences that share a gene homolog. This revealed 129 genetically
distinct new groups of ssDNA viruses that had little or no recognisable
sequence similarity to one another or to previously sequenced viruses. Most of
the sequences (84%) fell into 11 major coding DNA sequence (CDS) groups, but the
diversity was greater than expected, with only two of the clusters harbouring
previously sequenced viruses. In addition, most groups were similarly
distributed within temperate and subtropical waters.
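The grouping step can be pictured as finding connected components in a similarity network: two sequences end up in the same group if a chain of significant hits links them. The sketch below is only an illustration of that idea, not the authors' pipeline. It assumes the tBLASTx hits have already been reduced to (query, subject, e-value) tuples, and the function name `group_by_homology`, the e-value cutoff of 1e-5 and the contig names are all hypothetical.

```python
from collections import defaultdict


def group_by_homology(sequence_ids, hits, max_evalue=1e-5):
    """Group sequences that are linked, directly or indirectly, by a hit.

    sequence_ids: iterable of all sequence names (so singletons are kept).
    hits: iterable of (query, subject, evalue) tuples, e.g. parsed from
          tBLASTx output. The cutoff is an assumed value for illustration.
    Returns a list of sets, one per connected component, largest first.
    """
    parent = {seq: seq for seq in sequence_ids}

    def find(seq):
        # Follow parent links to the representative, compressing the path.
        while parent[seq] != seq:
            parent[seq] = parent[parent[seq]]
            seq = parent[seq]
        return seq

    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # too weak to count as a homology link
        # Register sequences that appear only in the hit list.
        parent.setdefault(query, query)
        parent.setdefault(subject, subject)
        root_q, root_s = find(query), find(subject)
        if root_q != root_s:
            parent[root_s] = root_q  # merge the two groups

    groups = defaultdict(set)
    for seq in parent:
        groups[find(seq)].add(seq)
    return sorted(groups.values(), key=len, reverse=True)


# Toy data with hypothetical contig names.
toy_ids = ["contig1", "contig2", "contig3", "contig4", "contig5"]
toy_hits = [
    ("contig1", "contig2", 1e-30),
    ("contig2", "contig3", 1e-12),
    ("contig4", "contig5", 0.5),  # above the cutoff, so not linked
]
toy_groups = group_by_homology(toy_ids, toy_hits)
print([sorted(group) for group in toy_groups])
```

In the toy data, contig1 and contig3 share no direct hit but still land in one group through contig2, which is how a single "cloud" can hold sequences that are only distantly related. Sequences with no significant hit remain as singleton groups.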
To identify
ssDNA viruses in other marine metagenomic libraries, the samples from both
temperate and subtropical waters in this study were compared, using BLAST, with
four DNA-virus metagenomic libraries (NCBI database), comprising samples from
sequenced and isolated viruses that infect plants, animals and bacteria, which
were collected from the Arctic Ocean, Sargasso Sea, SOG and GOM. Approximately 5-15%
of the sequences that were originally from British Columbia, the GOM and the
Sargasso Sea were identified as ssDNA in the composite samples. In addition,
the BLAST comparisons revealed that about 50% of the ssDNA sequences from
temperate waters and the GOM were homologous to each other, but datasets from
the SOG and SI were 72-82% similar to each other. Because these datasets are
made up of hundreds of pooled samples from a range of temperate and
subtropical environments, this indicates that they probably contain sequenced
representatives of most of the ssDNA viruses in the surface waters of the SOG
and the GOM.
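A comparison of this kind comes down to asking what fraction of the sequences in one dataset have a significant hit in another. The following is a minimal sketch under stated assumptions: the hits are in the standard 12-column BLAST tabular layout (`-outfmt 6`), while the function name `fraction_with_hit`, the 1e-5 cutoff and the toy records are hypothetical. The thresholds actually used in the study are not given in this summary.

```python
import csv
import io


def fraction_with_hit(query_ids, blast_tabular, max_evalue=1e-5):
    """Return the fraction of query_ids with at least one significant hit.

    blast_tabular: a file-like object holding tab-separated BLAST output
    in the default 12-column layout, where column 1 is the query id and
    column 11 is the e-value.
    """
    query_ids = set(query_ids)
    if not query_ids:
        return 0.0
    matched = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment or malformed lines
        query, evalue = row[0], float(row[10])
        if query in query_ids and evalue <= max_evalue:
            matched.add(query)
    return len(matched) / len(query_ids)


# Toy example: three hypothetical SOG reads searched against an SI dataset.
toy_output = io.StringIO(
    "sog_1\tsi_7\t85.0\t60\t9\t0\t1\t180\t1\t180\t1e-20\t95.0\n"
    "sog_1\tsi_9\t70.0\t55\t16\t0\t1\t165\t4\t168\t1e-08\t60.1\n"
    "sog_2\tsi_3\t40.0\t30\t18\t0\t1\t90\t1\t90\t2.5\t22.0\n"
)
shared = fraction_with_hit(["sog_1", "sog_2", "sog_3"], toy_output)
print(f"{shared:.0%} of queries have a significant hit")
```

Because the measure is not symmetric, the search would be run in both directions (A against B, then B against A), which may be why a similarity figure is reported as a range rather than a single value.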
At least 10
new clades of environmental ssDNA viruses, distinct from terrestrial viruses,
were found by analysing the phylogenetic relationships between replication
protein sequences, and these new clades are also congruent with genome
organisation. The high degree of evolutionary divergence between sequences
also implies that they belong to viruses that have the capacity to be pathogens
of a wide diversity of organisms, from a range of
different environments. Although some of these sequence groups may result from
viruses that infect bacteria, the few sequence groups that could be associated
with extant viruses belonged to families of viruses that infect eukaryotes, so the
authors suggest that some of the new groups are made up of viruses that infect
eukaryotic phytoplankton and zooplankton that underlie marine food webs.
This study
provides significant insight into how diverse this apparently understudied
group of viruses is, and how easy it can be to overlook a substantial number of
individuals, or even groups of individuals, if the most appropriate techniques
are not applied. It therefore offers a comprehensive method for identifying
ssDNA viruses and characterising their diversity more thoroughly. However, the
authors also mention that the concentration of ssDNA in the pooled samples from
the different sites was very low, so they had to isolate and amplify a
large number of short segments of ssDNA. As a result, this study does not
necessarily cover all of the families and sequences of ssDNA viruses in the
ocean. In addition, their samples were taken from surface waters, so it would
be interesting to see whether diversity differs at greater depths at these
sites.
Hello Eleanor,
Investigating environmental samples with new genomic and metagenomic techniques seems to keep identifying new groups of organisms, and this also reveals new phylogenetic relationships. It is amazing to see that the authors found 129 genetically isolated groups, almost all of them never sequenced before and quite common across different locations. Although these methods require a lot of data manipulation (to avoid artefacts, make the matches comprehensible, exclude poorly defined sequences, etc.), it seems that the known diversity of viruses grows with each study. I think that identifying the functions (and host organisms) of these specific groups could also shed light on the mechanisms that allow such great diversity to persist among viruses with an apparently redundant function (infection). But maybe this really is the key to their success (at least in number).
I agree, the sheer number of new and previously undiscovered groups is amazing. It came from studying the ssDNA viruses from a new perspective, combining techniques, and linking in what other studies have looked at, such as the viruses only found within bacteria.
They do mention at the beginning of the paper that almost nothing is known about the role of viruses in ecosystems, particularly ssDNA viruses, so understanding both the phylogenetic relationships and functions may give a better insight into how their abundance and diversity in the ocean have come about, as you said.