Friday, 4 April 2014

Marine ssDNA viruses are a more diverse group of pathogens than previously thought

Viruses are known as the most abundant and genetically diverse life forms on the planet, but there is still a lack of information on many of the types that make up this group, including the phylogenetic relationships between them. Single-stranded DNA (ssDNA) viruses have small genomes (between 1.4 and 8.5 kb), are known to infect many plants and animals, and have previously been divided into seven families. Two of these families are bacteriophages (Inoviridae and Microviridae) and five infect eukaryotes (Nanoviridae and Geminiviridae, which infect plants; and Circoviridae, Parvoviridae and Anelloviridae, which infect animals). The genomes of these ssDNA viruses can encode as few as two genes: a capsid protein and a replication initiator.

Metagenomic techniques in other studies have detected sequences similar to those of ssDNA viruses in both marine and freshwater systems, but comparative analyses of sequences stored in databases and comprehensive metagenomic studies are needed to identify ssDNA viruses and to determine how diverse this group is. The current study investigated the diversity of ssDNA viruses in marine ecosystems, in both temperate (Saanich Inlet, SI, and Strait of Georgia, SOG) and subtropical (Gulf of Mexico, GOM) waters, to expand the current understanding of the diversity of these viral families. The authors also wanted to determine whether any novel groups of ssDNA viruses have diverged to the extent of forming new families, by comparing the phylogenetic relationships between the viruses identified in the samples.

The authors assembled 608 complete composite genomes from the 4995 contiguous sequences (contigs) that were assembled and organised from the three different sites. Only a small fraction of contigs were homologous to known viruses: 4.9% and 2.0% of the total were homologous to viruses from the Circoviridae and Nanoviridae respectively, and approximately 1.6% were homologous to the genus Gokushovirus from the Microviridae. Because viruses within extant families seem to possess similar genomes, the authors suggest that many of the new groups (also known as CDS groups) are part of previously unknown viral families. The 608 genomes almost doubled the number of sequenced ssDNA viruses in the NCBI database.

These composite genomes were compared with other ssDNA viruses and environmental sequences using Feature Frequency Profile (FFP) analysis and by comparing sequence similarity based on results from tBLASTx. The FFP analysis was able to distinguish the different families of viruses, including nanoviruses. Related sequences were then grouped by sequence homology, using the tBLASTx results displayed as a network (where each cloud within the network is a group of sequences that share a gene homolog). This revealed 129 genetically distinct new groups of ssDNA viruses which had no or almost no recognisable sequence similarity. Most of these sequences (84%) fell into 11 major coding DNA sequence (CDS) groups, but the diversity was greater than expected, with only two of the clusters harbouring previously sequenced viruses. In addition, most groups were similarly distributed within temperate and subtropical waters.
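To give a feel for what a feature frequency comparison involves, here is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used in the paper: it counts overlapping k-mers in each sequence, normalises the counts into a frequency profile, and compares two profiles with the Jensen-Shannon divergence, which is the kind of distance FFP methods are generally built on. The function names, the choice of k and the toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2


def feature_frequency_profile(seq, k=3):
    """Return the normalised k-mer frequencies of a DNA sequence.

    The profile is a list ordered over all 4**k possible k-mers of
    A, C, G and T. Windows containing any other letter are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence has no valid k-mers of length %d" % k)
    return [counts[m] / total for m in kmers]


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two profiles.

    0 means identical profiles; 1 means the profiles share no k-mers.
    """
    mid = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-frequency terms.
        return sum(a * log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


if __name__ == "__main__":
    # Toy sequences (hypothetical, not from the study).
    genome_a = "ATGGCGTACGTTAGCATGGCGTACG"
    genome_b = "ATGGCGTACGTTAGCATGGCGTTCG"
    genome_c = "TTTTAAAATTTTAAAATTTTAAAAT"
    pa, pb, pc = (feature_frequency_profile(g) for g in (genome_a, genome_b, genome_c))
    print("a vs b:", jensen_shannon_divergence(pa, pb))
    print("a vs c:", jensen_shannon_divergence(pa, pc))
```

A matrix of such pairwise distances can then be fed to a tree-building or clustering step. The appeal of this kind of approach for highly divergent viruses is that it needs no sequence alignment, which is useful when genomes share little recognisable similarity.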

To identify ssDNA viruses in other marine metagenomic libraries, the samples from both temperate and subtropical waters in this study were compared using BLAST with four DNA-virus metagenomic libraries (NCBI database) comprising samples from sequenced and isolated viruses that infect plants, animals and bacteria; these libraries were collected from the Arctic Ocean, Sargasso Sea, SOG and GOM. Approximately 5-15% of the sequences that were originally from British Columbia, the GOM and the Sargasso Sea were identified as ssDNA in the composite samples. In addition, the BLAST comparisons revealed that about 50% of the ssDNA sequences from temperate waters and the GOM were homologous to each other, while the datasets from the SOG and SI were 72-82% similar to each other. Because these datasets are made up of hundreds of pooled samples from a range of temperate and subtropical environments, this indicates that they probably include sequenced representatives of most of the ssDNA viruses in the surface waters of the SOG and the GOM.

At least 10 new clades of environmental ssDNA viruses, distinct from terrestrial viruses, were found by analysing the phylogenetic relationships between replication protein sequences, and these new clades are also congruent with genome organisation. The high degree of evolutionary divergence between sequences also implies that these sequences belong to viruses that have the capacity to be pathogens of a wide diversity of organisms, from a range of different environments. Although some of these sequence groups may come from viruses that infect bacteria, the few sequence groups that could be associated with extant viruses belonged to families of viruses that infect eukaryotes, so the authors suggest that some of the new groups are made up of viruses that infect the eukaryotic phytoplankton and zooplankton that underlie marine food webs.

This study provides a significant and extended insight into how diverse this apparently understudied group of viruses is, and how easy it can be to overlook a substantial number of individuals, or even groups of individuals, if the most appropriate techniques are not applied. It therefore offers a comprehensive method to identify ssDNA viruses and their diversity more thoroughly. However, the authors also mention that there was a very low concentration of ssDNA in the pooled samples from the different sites, so they had to isolate and amplify a significant number of short segments of ssDNA. As a result, this study does not necessarily cover all of the families and sequences of ssDNA viruses in the ocean. In addition, their samples were taken from surface waters, so it would be interesting to see whether diversity differs at greater depths at these sites.

Labonté, J.M. and Suttle, C.A. (2013) Previously unknown and highly divergent ssDNA viruses populate the oceans. The ISME Journal, 7: 2169-2177


  1. Hello Eleanor,
    investigating environmental samples with new genomic and metagenomic techniques seems to allow new groups of organisms to be identified continuously, and this also reveals new phylogenetic relationships. It is amazing to see that the authors found 129 genetically isolated groups, with almost all of them never sequenced before and quite common across different locations. Although these methods require a lot of data manipulation (to avoid artefacts, make comprehensible matches, exclude poorly defined sequences, etc.), it seems that known viral diversity grows with each study. I think that identifying the functions (and host organisms) of these specific groups could also shed light on the mechanisms that allow such great diversity to persist among viruses with an apparently redundant function (infection). But maybe this really is the key to their success (at least in number).

  2. I agree, it's amazing the sheer number of new and previously undiscovered groups there were, by studying the ssDNA viruses from a new perspective as well as combining techniques, and linking what other studies have looked at, such as the viruses only found within bacteria.

    They do mention at the beginning of the paper that almost nothing is known about the role of viruses in ecosystems, particularly ssDNA viruses, so understanding both the phylogenetic relationships and functions may give a better insight into how their abundance and diversity in the ocean have come about, as you said.