Direct FASTQ Upload
Simply drag and drop your FASTQ files, or use the command line, to upload your samples directly to the platform in a matter of minutes.
Cosmos-Hub allows researchers to import their raw sequencing data directly into the platform and run a number of available bioinformatics pipelines for microbiome analysis.
In just a few easy steps, users can run industry-leading bioinformatics pipelines for a wide range of different data types:
The Cosmos-Hub team have collaborated to implement an optimized pipeline based on Emu, a tool created for estimating microbial relative abundance from full-length, microbial amplicon gene sequences.
Emu is an expectation-maximization pipeline: its probabilistic algorithm considers the entire sample context when assigning taxonomic labels, so that reads matching several candidate taxa are weighted by how abundant each candidate appears to be in the sample. This approach significantly reduces false positives and improves accuracy for closely related species, which is critical for clinical and research applications where precision matters. It is valued in the scientific community for its superior performance, as shown in many benchmarks.
Emu’s core strength lies in its native compatibility with long-read sequencing platforms such as PacBio and Oxford Nanopore Technologies: it takes into account not only the read length but also the error profiles of these sequencers.
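The expectation-maximization idea described above can be illustrated with a minimal Python sketch. This is a generic toy version, not Emu's actual implementation: the function name `em_relative_abundance`, the input layout (one dictionary of per-taxon likelihoods per read), and the stopping parameters are all assumptions made for illustration.

```python
def em_relative_abundance(read_likelihoods, n_iter=100, tol=1e-9):
    """Estimate relative abundances from per-read likelihoods.

    read_likelihoods: list with one dict per read, mapping a candidate
    taxon to P(read | taxon). This layout is hypothetical.
    """
    taxa = sorted({t for likes in read_likelihoods for t in likes})
    if not taxa:
        raise ValueError("no candidate taxa supplied")

    # Start from a uniform abundance estimate.
    abundance = {t: 1.0 / len(taxa) for t in taxa}

    for _ in range(n_iter):
        totals = {t: 0.0 for t in taxa}

        # E-step: split each read across its candidate taxa in proportion
        # to likelihood * current abundance estimate.
        for likes in read_likelihoods:
            weights = {t: p * abundance[t] for t, p in likes.items()}
            norm = sum(weights.values())
            if norm == 0:
                continue  # read cannot be explained by any candidate
            for t, w in weights.items():
                totals[t] += w / norm

        assigned = sum(totals.values())
        if assigned == 0:
            return abundance

        # M-step: the new abundance is each taxon's share of assigned reads.
        updated = {t: v / assigned for t, v in totals.items()}
        delta = max(abs(updated[t] - abundance[t]) for t in taxa)
        abundance = updated
        if delta < tol:
            break

    return abundance


# Three reads match only taxon A; one read matches A and B equally well.
reads = [{"A": 1.0}, {"A": 1.0}, {"A": 1.0}, {"A": 0.5, "B": 0.5}]
print(em_relative_abundance(reads))
```

In this toy input the ambiguous read is pulled toward taxon A because the rest of the sample supports A, which is the sense in which the sample context informs each assignment.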
The Cosmos-Hub pipeline implementation allows the analysis of 4 popular microbial marker genes for taxonomic identification in complex samples:
The 16S rRNA region is the gold standard for bacterial and archaeal profiling, enabling taxonomic identification and quantification of prokaryotic community structure across diverse environments.
The 18S rRNA region targets microbial eukaryotes, such as protists and fungi, broadly capturing the diversity of non-bacterial components of microbiomes. Its conserved and variable regions allow characterization of taxonomic composition among eukaryotes, and protocols often include blocking primers to reduce host DNA contamination in host-associated samples such as plant microbiomes.
The ITS region is widely used as the primary genetic marker for fungal community profiling and species identification. Its high sequence variability enables high-resolution discrimination of closely related fungal taxa and complements the information provided by 16S/18S amplicons.
The complete 16S-ITS-23S rRNA operon offers maximal taxonomic resolution for bacteria and archaea, spanning all hypervariable regions and providing robust species- and even subspecies-level classification. This approach improves confidence in taxonomic assignment and allows detection of complex community structures.
The Cmbio team have curated 8 different databases for users to choose from, based on their amplicon of choice, sample type and personal preferences. For microbiomes that lack a specialized reference database, users can leverage multiple databases to maximize discovery. The Emu-formatted taxonomic profiling outputs of the pipeline can be exported directly from the platform or plugged directly into the Cosmos-Hub Statistics Toolbox.
A comprehensive, high-quality, and regularly updated database of aligned ribosomal RNA (rRNA) gene sequences from Bacteria, Archaea, and Eukaryotes.
An updated reference database unifying genomic and 16S rRNA data within a single, integrated tree to reconcile results between 16S and shotgun metagenomic studies, supporting high-resolution and reproducible microbial analysis.
Backed by extensive benchmarking and validation, the integrated Emu-based pipeline streamlines microbiome analysis with efficiency and accuracy:
Flexible parameters, including read length, quality thresholds, and database selection, let you tailor analyses. Customize your analysis or use the Cosmos-Hub recommended parameters, depending on your comfort level.
Developed in collaboration with our long-read sequencing and analysis center of excellence in Aalborg (formerly DNASense).
Generate results in a matter of hours once the files are successfully uploaded.
Choose from a number of specialized databases to suit your sample type and study question for versatile and precise analyses.
Once profiling is complete, leverage your metadata to create groups and generate statistical analyses and interactive visualizations.
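As a rough illustration of what grouping profiles by metadata can look like once results are exported, here is a minimal Python sketch. It is not the Cosmos-Hub Statistics Toolbox: the function name `mean_abundance_by_group` and the input layout (a sample-to-profile mapping plus a sample-to-group mapping) are assumptions chosen for the example.

```python
from collections import defaultdict


def mean_abundance_by_group(profiles, metadata):
    """Average relative abundance per taxon within each metadata group.

    profiles: {sample_id: {taxon: relative_abundance}}  (hypothetical layout)
    metadata: {sample_id: group_label}                  (hypothetical layout)
    """
    grouped = defaultdict(list)
    for sample, profile in profiles.items():
        grouped[metadata[sample]].append(profile)

    result = {}
    for group, members in grouped.items():
        # Taxa absent from a sample count as zero abundance in that sample.
        taxa = sorted({t for profile in members for t in profile})
        result[group] = {
            t: sum(profile.get(t, 0.0) for profile in members) / len(members)
            for t in taxa
        }
    return result


profiles = {
    "s1": {"Bacteroides": 0.6, "Prevotella": 0.4},
    "s2": {"Bacteroides": 0.2, "Prevotella": 0.8},
    "s3": {"Bacteroides": 1.0},
}
metadata = {"s1": "control", "s2": "control", "s3": "treated"}
print(mean_abundance_by_group(profiles, metadata))
```

Group means like these are only a starting point; statistical testing and visualization would build on the same grouping.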
The decision on when and how to implement long-read amplicon sequencing and analysis depends on the sample type, the application and the kingdom(s) of the organisms of interest.
Whilst long-read amplicon sequencing can be performed on any sample type (human, non-human or environmental), this approach has some particular advantages over shotgun metagenomics in:
Cmbio customers have used long-read amplicon sequencing and analysis to analyze:
Complete your microbiome study effortlessly with a single, integrated platform containing a customizable, no-code pipeline, data storage and statistics toolbox.
Long-read amplicon sequencing approaches have enabled microbiome researchers to generate high-quality amplicon sequencing data covering full-length genetic regions such as the entire 16S rRNA gene, 18S rRNA, or ITS regions. This is achieved through long-read sequencing platforms like Oxford Nanopore Technologies or PacBio sequencing, which read each amplified DNA molecule end to end and so produce reads that span entire amplicons without fragmentation. These methods are especially valuable for characterising microbial communities from complex or novel environments.
The approach supports amplicon sequencing analysis from Illumina amplicon data or other short-read datasets through hybrid assemblies, improving resolution and downstream applications. By including accurate primer sequences during library preparation, researchers can reduce sequencing errors and improve the quality of sequencing data used for downstream analysis.
Compared to short-read strategies, long-read amplicon sequencing delivers improved phylogenetic resolution, enabling precise taxonomic assignment for operational taxonomic units and minimising misclassification. Because it spans the full 16S rRNA gene, it provides more accurate insights into microbial diversity and supports robust genetic variation studies by linking variants to complete gene contexts.
The technique reduces the influence of primer bias, allowing more reliable quantification of amplicon sequencing data and better estimation of microbial relative abundance. It is also sensitive enough for low-biomass samples, overcoming host DNA interference. These strengths make it suitable for studies in environmental microbiology, clinical microbiome profiling, and detection of rare or structurally unique organisms.
In addition, long-read data can be aligned to a reference genome for detection of structural variants or novel alleles, supporting investigations into functional roles and evolutionary relationships. Researchers often combine optimized nucleic acid extraction methods with long-read sequencing to maximize the yield and purity of input DNA.
Short-read workflows, such as those generating Illumina amplicon data, typically rely on partial 16S rRNA gene regions. While this can be effective for broad profiling, it may lead to taxonomic ambiguity and underestimation of diversity. In contrast, long-read strategies generate complete amplicons, improving classification accuracy and enabling confident identification at the species or strain level.
When integrated into a bioinformatics workflow that includes quality filtering, chimera removal, and amplicon sequencing analysis, long-read datasets can yield highly detailed profiles of microbial communities, revealing niche populations that might be missed by short-read approaches.
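The quality-filtering step mentioned above can be sketched in a few lines of Python. This is an illustrative example only, not the Cosmos-Hub pipeline: the function names, the default thresholds, and the use of a simple arithmetic mean of Phred scores (Phred+33 encoding assumed) are all assumptions. Dedicated long-read filtering tools may compute mean quality differently.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Arithmetic mean of Phred scores, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(records, min_len=1000, max_len=2000, min_mean_q=10.0):
    """Keep reads within a length window and above a mean-quality threshold.

    The default thresholds are hypothetical placeholders, not recommendations.
    """
    for name, seq, qual in records:
        if min_len <= len(seq) <= max_len and qual and mean_phred(qual) >= min_mean_q:
            yield name, seq, qual


# Tiny in-memory example with thresholds scaled down to match the toy reads.
fastq_text = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACG\n+\nIII\n"
    "@read3\nACGTACGT\n+\n########\n"
)
for name, seq, qual in filter_reads(
    read_fastq(io.StringIO(fastq_text)), min_len=5, max_len=10, min_mean_q=10
):
    print(name, len(seq), mean_phred(qual))
```

Filtering on a length window suits amplicon data because reads far shorter or longer than the expected amplicon are likely truncated or chimeric products.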