Wednesday, 12 February 2020

Latest-generation mass sequencing, a world of possibilities

Welcome to the first blog post of the FISABIO Sequencing and Bioinformatics Service! We are very excited to embark on this new adventure. Through this blog we want to keep you informed of any interesting news related to our work. One of our biggest commitments as a service is to help our users understand and choose the best platforms for their sequencing needs, and to explore all the possible bioinformatic analyses, since we offer both standard and customized analyses.

So that you can get to know us better, this first post serves as an introduction: first we describe what mass sequencing is, and then we discuss the platforms currently available in the Service.

Basically, mass sequencing brings together a set of biochemical methods and protocols for determining the order of the four basic components of DNA (adenine, thymine, cytosine and guanine) or RNA (adenine, uracil, cytosine and guanine) in a gene, an amplicon, a genome, or the genetic material of any ecosystem that is of interest to us. One of the wonders of these methodologies is that they allow us to decipher genetic information from anything between a single cell and a very complex community.

These technologies have many applications. Among them we highlight de novo genome assembly, assembly guided by a reference genome, reference-based variant genotyping, functional studies (transcriptomics), the description of the microbial or fungal communities of an ecosystem (metataxonomy), metagenomics and metatranscriptomics.

So far, the most widely used technologies have been the so-called Second Generation Sequencing or Next Generation Sequencing methods, named by comparison with first-generation Sanger sequencing. These second-generation techniques produce a large amount of data in a very short time and at a much lower cost than the Sanger method, thanks to the parallelization of the sequencing reactions. Their main drawbacks so far have been the short length of the fragments that can be sequenced and the considerable biases introduced by the clonal amplification that these fragments undergo before being sequenced. To avoid these problems, a third generation of sequencers has been developed with new, much more sensitive detection techniques that do away with the clonal amplification required by the second generation and also make it possible to sequence considerably longer fragments.

Next, we describe the sequencers available at FISABIO.

Second Generation Sequencers

We have several sequencers based on Illumina technology, specifically the MiSeq and NextSeq 500 models. They differ in the maximum read length and in the maximum data output: 2x300 bp and up to 15 gigabases or 25 million reads for the MiSeq, and 2x150 bp and up to 120 gigabases or 400 million reads for the NextSeq.
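The output figures quoted above follow from simple arithmetic: total bases are the number of sequenced fragments multiplied by the bases read per fragment. The short Python sketch below checks this. It assumes that the quoted read counts refer to read pairs (one pair per fragment), which is the interpretation under which the numbers agree; the function name `yield_gigabases` is purely illustrative.

```python
def yield_gigabases(read_pairs: int, read_length_bp: int, paired: bool = True) -> float:
    """Return the total number of sequenced bases, in gigabases (1 Gb = 1e9 bases).

    Assumption: in paired-end mode each fragment is read twice, once from
    each end, so it contributes 2 * read_length_bp bases.
    """
    reads_per_fragment = 2 if paired else 1
    return read_pairs * reads_per_fragment * read_length_bp / 1e9


# Figures taken from the paragraph above.
miseq_gb = yield_gigabases(read_pairs=25_000_000, read_length_bp=300)     # 2x300 bp
nextseq_gb = yield_gigabases(read_pairs=400_000_000, read_length_bp=150)  # 2x150 bp

print(f"MiSeq:       {miseq_gb:.0f} Gb")    # 15 Gb
print(f"NextSeq 500: {nextseq_gb:.0f} Gb")  # 120 Gb
```

These are maximum values; the data actually obtained in a given run will usually be somewhat lower.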

For a complete guide to all the sequencing protocols and applications that Illumina offers for both DNA and RNA, you can explore the following link. If you have any particular interest, you can contact us to discuss the specifications you need.

Although there are hundreds of sequencing applications and analysis methods, the most common approaches in our service with the Illumina platforms are:
Sequencing of amplicons. For example, to study the diversity of amplified fragments in applications related to biomedicine and microbiology.
Sequencing of complete genomes. We can obtain the complete genome of bacteria and viruses, as well as small eukaryotes (yeasts, ...).
Resequencing of complete genomes. For example, to genotype variants, a very important tool in molecular epidemiology.
Sequencing of transcripts. It allows us to know which functions are being expressed at a given moment, or to characterize the total set of RNAs in a sample.
Sequencing of metagenomes. Here we explore the bacterial genomes present in a representative sample of an environment.
Sequencing of metatranscriptomes. It provides us with the functional activity profiles in complex samples.
Sequencing of miRNAs, siRNAs and other small RNAs. Especially useful for studying the post-transcriptional regulation of gene expression.
Gene panels. Useful with known genomes: panels are designed to amplify specific genes at high resolution, so that mutations can be easily distinguished, even those of low frequency.
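When planning one of the complete-genome applications listed above, a common question is how many samples fit in a single run. A rough answer comes from the mean coverage, that is, the total bases sequenced divided by the genome size. The Python sketch below illustrates the calculation. The 5 Mb genome size and the 100x target are hypothetical values chosen only for the example, and the function names are illustrative.

```python
def mean_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Average number of times each position of the genome is read."""
    return total_bases / genome_size_bp


def max_samples_per_run(run_bases: int, genome_size_bp: int, target_coverage: int) -> int:
    """Theoretical upper bound on the genomes that fit in one run at the target coverage."""
    bases_per_sample = genome_size_bp * target_coverage
    return run_bases // bases_per_sample


MISEQ_MAX_BASES = 15_000_000_000  # 15 gigabases, the MiSeq maximum quoted earlier
GENOME_SIZE_BP = 5_000_000        # hypothetical 5 Mb bacterial genome
TARGET_COVERAGE = 100             # hypothetical 100x target

# Coverage if the whole run were spent on a single genome.
print(mean_coverage(MISEQ_MAX_BASES, GENOME_SIZE_BP))  # 3000.0

# Number of such genomes that could share the run at 100x each.
print(max_samples_per_run(MISEQ_MAX_BASES, GENOME_SIZE_BP, TARGET_COVERAGE))  # 30
```

This is an idealized upper bound: in practice some reads are lost to quality filtering and samples are never pooled perfectly evenly, so it is sensible to leave a margin.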

Third Generation Sequencers

At FISABIO we have two platforms of this generation:

Oxford Nanopore MinION

It is a technology based on pores embedded in an artificial membrane, through which the nucleic acid molecule is passed. The sequence is determined by measuring the changes in electrical current across the membrane in which the pore is immobilized. The biggest advantages of this technology are its miniaturization and its speed (it produces data in real time). On the one hand, some of its models are the size of a pen drive, which makes them fully portable, and they can be connected directly to any computer through a USB port. On the other hand, it is suitable for the immediate in situ detection of pathogenic organisms and for applications in which response time is a vital factor. Although its sequence quality is still lower than that of other platforms, this can be compensated for by combining it with sequences from Illumina technology.

Pacific Biosciences PacBio Sequel II

Its method is based on reading the fluorescence emitted during DNA synthesis by a polymerase immobilized at the bottom of a microscopic well. As each nucleotide is incorporated into the strand being synthesized, it emits fluorescent light of a different color according to its type (A, T, G or C). In this way, a single molecule per well is sequenced in real time (Single Molecule Real Time, SMRT). The reads that we obtain from the Sequel II range in length from 4,000 bp to extremes of 500,000 bp, in a maximum time of 90 hours. The new PacBio sequencers (Sequel series) have largely overcome the problem of low quality, historically typical of third-generation sequencers, by introducing a method called HiFi, in which the same molecule is sequenced several times in a circular mode, generating a high-quality, self-corrected consensus (CCS sequencing).
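The idea behind the circular consensus can be illustrated with a toy example: if the same molecule is read several times and the errors fall at different positions in each pass, a per-position majority vote recovers the original sequence. The Python sketch below is a deliberate simplification and not the algorithm PacBio actually uses. It assumes that the passes are already aligned and of equal length, and that errors are substitutions only, whereas real data also contain insertions and deletions.

```python
from collections import Counter
from typing import Sequence


def majority_consensus(passes: Sequence[str]) -> str:
    """Return the per-position majority base across aligned, equal-length passes.

    Toy model of a circular consensus: each string is one pass over the
    same molecule. If two bases are tied at a position, the one seen first wins.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length")
    # zip(*passes) yields one column (tuple of bases) per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


# Five hypothetical passes over the molecule ACGTACGT, four of them with one error each.
passes = [
    "ACGTACGT",
    "ACCTACGT",  # error at position 3
    "ACGTAGGT",  # error at position 6
    "ACGTACGA",  # error at position 8
    "TCGTACGT",  # error at position 1
]
print(majority_consensus(passes))  # ACGTACGT
```

The more passes are available for a molecule, the less likely it is that the same error dominates at a given position, which is why the consensus is more accurate than any single pass.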

The applications recommended for this platform overlap with those mentioned above for Illumina, with some improvements. Complete genomes of any size and nature, both prokaryotic and eukaryotic, can be sequenced. Its use is also advised for the search for complex structural variants; for the description of microbial and fungal communities in an ecosystem through the sequencing of complete ribosomal genes (16S, 18S, and even complete ribosomal operons) or other amplicons of interest; and for metagenomics and transcriptomics.

We hope this entry has been of interest to you. Soon we will present in detail the bioinformatics services we offer, as well as news about new technologies and emerging platforms, without forgetting practical tutorials for those who want to learn more bioinformatics and analyze their own sequences.

If you have any questions or comments, or ideas about what you would like to read in our blog, do not hesitate to write to us! We will be happy to hear from you.

By Mariana Reyes, M.Sc.