Whole Genome Metagenome

Whole genome metagenome sequencing, often called whole metagenome shotgun sequencing, is a technique for studying the complete collection of genetic material (DNA) in a complex microbial community. Unlike methods that target a single marker gene (such as 16S rRNA sequencing, which profiles bacteria and archaea), it sequences all the DNA from all the organisms in a sample, including bacteria, archaea, viruses, fungi, and other microorganisms.

Whole genome metagenome sequencing opens a window into the hidden world of microbes by capturing their entire genetic blueprint in a single, comprehensive analysis. This approach lets researchers move beyond simply naming the microbes to understanding their potential activities, interactions, and impacts on their surroundings.

By completing this course, learners will gain a solid understanding of whole genome metagenome sequencing, including its workflow from sample collection to data analysis. They will be able to perform quality control, assemble metagenomic data, and identify microbial species through taxonomic profiling. Additionally, learners will develop skills to annotate genes and interpret the functional roles of microbes within complex communities, enabling them to apply these techniques to real-world microbiome research and environmental studies.

👉 Enroll now and take the first step toward a future in data-driven biology and beyond.

## 10% Discount if you register before 15th October, 2025. Hurry up!!


Course Information
Course: Whole Genome Metagenome Data Analysis
Duration: 7 days of online training (2 hours daily, Monday to Friday)

Slots

Our working hours are 9:00 AM to 6:00 PM Indian Standard Time (IST). Available slots: 9:00 AM to 11:00 AM / 11:00 AM to 1:00 PM / 2:00 PM to 4:00 PM / 4:00 PM to 6:00 PM.
For training slots after 6:00 PM, before 9:00 AM, or on weekends, please mention this during registration and the sessions will be scheduled accordingly.

Mode

👉 For online training, candidates need to install Zoom. Remote control of the candidate's system is used to keep sessions fully interactive.
👉 Candidates can record the live sessions, and a PDF manual is provided for future reference.
👉 All our training is 100% practical, industry-oriented, and interactive, and is designed to match the experience of offline learning.
👉 Extra support for clearing doubts is provided as required.
👉 A certificate is provided.

Sequencing Platform: Illumina / Ion Torrent / PacBio / Nanopore
Raw data: Candidates can include a maximum of 4 datasets of their own during training. Publication-standard figures and tables will be generated.
Training Fees
Module-NGS: Whole Genome Metagenome Data Analysis
    📘 Introduction to NGS WGS Metagenome
    - Understanding NGS and Its Applications
    - Types of sequencing data generated
    - Understanding FASTQ files and sequencing quality scores
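
As a small illustration of the FASTQ quality-score topic, the sketch below decodes a quality string into Phred scores. It assumes the Phred+33 (Sanger / Illumina 1.8+) encoding; the record and the helper names `phred_scores` and `error_probability` are made up for this example and are not part of the course material.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores.

    Assumes Phred+33 encoding unless another offset is given.
    """
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# A made-up four-line FASTQ record (illustrative only, not real data).
record = [
    "@read_1",       # line 1: read identifier
    "ACGTACGTAC",    # line 2: bases
    "+",             # line 3: separator
    "IIIIIIII5#",    # line 4: one quality character per base
]

scores = phred_scores(record[3])
mean_q = sum(scores) / len(scores)

print("Per-base Phred scores:", scores)
print("Mean quality:", round(mean_q, 1))
print("Error probability at Q30:", error_probability(30))
```

A score of Q30 corresponds to an expected error rate of 1 in 1,000 base calls, which is why Q30 is a common quality threshold.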

    📘 Linux Basics & Environment Setup
    - Linux Command Line Basics
    - Installing WGS Metagenome Tools
    - Using Conda and Shell Scripting

    📘 Data Retrieval & Formats
    - Fetching data from NCBI SRA using SRA Toolkit
    - Understanding different file formats

    📘 Introduction to R/Bioconductor
    - Installing packages with CRAN and Bioconductor
    - Data types and standardized data containers
    - Data manipulation

    📘 Detailed Metagenomic Data Analysis Pipeline

    1. Assembly:
      Assembling short reads into longer contigs to reconstruct genomic fragments
    2. RNA Prediction and Classification:
      Detecting ribosomal RNA and other non-coding RNAs from assembled sequences and classifying them into known RNA families
    3. ORF (CDS) Prediction:
      Predicting open reading frames (ORFs) or coding sequences (CDS) using tools like Prodigal to identify potential genes
    4. Homology Searching Against Taxonomic and Functional Databases:
      Matching predicted genes against reference databases (e.g., NCBI NR, COG, KEGG) to infer function and taxonomy
    5. HMMER Searching Against Pfam Database:
      Identifying protein domains using HMMER against the Pfam database to support functional annotation
    6. Taxonomic Assignment of Genes:
      Assigning taxonomy to each gene based on homology search results
    7. Functional Assignment of Genes:
      Linking predicted genes to functional categories like metabolic pathways or gene families using KEGG, COG, etc.
    8. BLASTX on Parts of the Contigs with No Gene Prediction or Hits:
      Running BLASTX on contig regions that lack gene predictions or functional hits to discover missed genes
    9. Taxonomic Assignment of Contigs and Disparity Checks:
      Assigning taxonomy at the contig level and verifying consistency among genes within each contig to flag possible chimeras
    10. Coverage and Abundance Estimation for Genes and Contigs:
      Calculating how abundant each gene or contig is by mapping sequencing reads back to the assembly
    11. Estimation of Taxa Abundances:
      Quantifying the abundance of microbial taxa across samples based on read mappings and gene assignments
    12. Estimation of Function Abundances:
      Estimating the abundance of biological functions or pathways in the community using annotated gene data
    13. Merging of Previous Results to Obtain the ORF Table:
      Integrating taxonomic, functional, and abundance data into a single table summarizing all ORFs
    14. Binning with Different Methods:
      Clustering contigs into bins representing draft genomes using binning tools like MetaBAT, MaxBin, or CONCOCT
    15. Binning Integration with DAS Tool:
      Refining binning results by integrating outputs from multiple methods using DAS Tool for higher accuracy
    16. Taxonomic Assignment of Bins and Disparity Checks:
      Assigning taxonomy to bins and verifying internal consistency using marker gene and taxonomic information
    17. Checking of Bins with CheckM2 and GTDB-Tk:
      Assessing bin completeness and contamination with CheckM2, and classifying high-quality bins using GTDB-Tk
    18. Merging of Previous Results to Obtain the Bin Table:
      Compiling bin-level data including taxonomy, abundance, and completeness scores into a comprehensive table
    19. Merging of Previous Results to Obtain the Contig Table:
      Creating a table summarizing all contigs with associated taxonomy, bins, gene content, and abundance
    20. Prediction of KEGG and MetaCyc Pathways for Each Bin:
      Mapping genes in each bin to known metabolic pathways using databases like KEGG and MetaCyc
    21. Final Statistics for the Run:
      Summarizing the entire analysis with metrics on assembly, binning, annotation, and read coverage
    22. Generation of Tables with Aggregated Taxonomic and Functional Profiles:
      Producing overview tables that summarize taxonomic composition and functional potential across samples
    23. Alpha and Beta Diversity Analysis:
      Calculating within-sample (alpha) and between-sample (beta) diversity using taxonomic and functional data
    24. Visualization:
      PCA, Rarefaction, Stacked Bar, Heatmap, Krona, Phylogeny, Bin Visualization, etc.
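
To make the alpha and beta diversity step more concrete, here is a minimal sketch of two widely used measures: the Shannon index (alpha diversity) and Bray-Curtis dissimilarity (beta diversity). The taxon counts are hypothetical, and the choice of these two metrics is an assumption for illustration; the course pipeline may use other measures or dedicated R packages.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples.

    Both lists must give counts for the same taxa in the same order.
    0 means identical composition, 1 means no shared taxa.
    """
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator


# Hypothetical counts for four taxa in two samples (not real data).
sample_a = [25, 25, 25, 25]   # perfectly even community
sample_b = [97, 1, 1, 1]      # dominated by one taxon

print("Shannon A:", round(shannon_index(sample_a), 3))
print("Shannon B:", round(shannon_index(sample_b), 3))
print("Bray-Curtis A vs B:", round(bray_curtis(sample_a, sample_b), 3))
```

The even community scores higher on the Shannon index than the dominated one, and the Bray-Curtis value summarizes how different the two compositions are.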

Preparation

- Whole genome sequencing (Wikipedia): https://en.wikipedia.org/wiki/Whole_genome_sequencing
- NCBI (PMC article): https://pmc.ncbi.nlm.nih.gov/articles/PMC9456280/

Instructor

Industry Experienced

Target Audience

This course is designed for graduate students, postdoctoral researchers, and professionals working in conservation biology, evolutionary genomics, population genetics, or any other life science field who are interested in applying genomic tools to real-world conservation challenges.

Contact

Please write to us at info@arraygen.com, or call or WhatsApp us on +91-9673625446, if you need any clarification or would like custom training based on your own reference paper, content, or tools.

Course Information
Course: Whole Genome Metagenome Data Analysis
Duration: 15 days of online training (2 hours daily, Monday to Friday)

Slots

Our working hours are 9:00 AM to 6:00 PM Indian Standard Time (IST). Available slots: 9:00 AM to 11:00 AM / 11:00 AM to 1:00 PM / 2:00 PM to 4:00 PM / 4:00 PM to 6:00 PM.
For training slots after 6:00 PM, before 9:00 AM, or on weekends, please mention this during registration and the sessions will be scheduled accordingly.

Mode

👉 For online training, candidates need to install Zoom. Remote control of the candidate's system is used to keep sessions fully interactive.
👉 Candidates can record the live sessions, and a PDF manual is provided for future reference.
👉 All our training is 100% practical, industry-oriented, and interactive, and is designed to match the experience of offline learning.
👉 Extra support for clearing doubts is provided as required.
👉 A certificate is provided.

Sequencing Platform: Illumina / Ion Torrent / PacBio / Nanopore
Raw data: Candidates can include a maximum of 4 datasets of their own during training. Publication-standard figures and tables will be generated.
Training Fees
Module-I: Advanced Bioinformatics & Basic Programming
Topics
    📘 Introduction to Bioinformatics
    - Overview of bioinformatics and its applications
    - Key concepts in computational biology
    - Role of bioinformatics in genomics, transcriptomics, and proteomics

    📘 Understanding NGS and Genomics Bioinformatics
    - Basics of Next-Generation Sequencing (NGS)
    - Types of NGS data (RNA-seq, WGS, WES)
    - Overview of NGS data formats: FASTQ, BAM, VCF
    - Introduction to pipelines and tools for NGS data analysis

    📘 Databases & Data Retrieval (NCBI and UCSC)
    - Learning how to retrieve biologically correct data
    - Performing complete batch retrieval (e.g., whole exome)
    - NCBI: understanding gene-level data retrieval
    - UCSC: handling large-scale data retrieval
    - UCSC Genome Browser and Table Browser usage
    - Batch Coordinate Retrieval and Genomic Data downloads
    - GFF/GTF gene annotation formats and how to retrieve them
    - Using BLAT for sequence-based search and alignment
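
For the GFF/GTF annotation topic, the sketch below parses one GFF3 feature line into a dictionary. GFF3 lines have nine tab-separated columns with 1-based, inclusive coordinates. The example line and the function name `parse_gff3_line` are invented for illustration; GTF files use a different attribute syntax and would need a different attribute parser.

```python
def parse_gff3_line(line):
    """Parse a single GFF3 feature line into a dictionary.

    Columns: seqid, source, type, start, end, score, strand, phase, attributes.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    # GFF3 attributes are semicolon-separated key=value pairs.
    attributes = dict(pair.split("=", 1) for pair in attrs.split(";") if pair)

    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
        # Coordinates are 1-based and inclusive, hence the +1.
        "length": int(end) - int(start) + 1,
    }


# Made-up feature line (not taken from a real annotation file).
example = "chr1\texample\tgene\t1000\t1999\t.\t+\t.\tID=gene0001;Name=demoGene"
feature = parse_gff3_line(example)
print(feature["type"], feature["attributes"]["Name"], feature["length"])
```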

    📘 Gene Prediction and Functional Annotation
    - Gene prediction approaches and tools
    - Functional annotation using Gene Ontology (GO)
    - Pathway analysis using KEGG, Reactome
    - Interpreting gene sets and biological relevance

    📘 Standalone/Offline BLAST for Large-Scale Genomic Data
    - Installing and setting up standalone BLAST
    - Running local BLAST for batch sequence alignment
    - Applications in genome-wide homology searches
    - Custom BLAST databases and performance optimization

    📘 PCR Primer Designing and Specificity Check
    - Designing accurate primers for PCR amplification
    - Tools: Primer3, NCBI Primer-BLAST
    - Checking primer specificity using genome-wide BLAST
    - Avoiding non-target amplification through design best practices
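
Two quick checks often applied to candidate primers are GC content and a rough melting temperature. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rule of thumb for short oligonucleotides; tools such as Primer3 use more detailed thermodynamic models. The primer sequence and function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees Celsius (Wallace rule).

    Tm = 2 * (A + T) + 4 * (G + C); a rough estimate for short primers only.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Made-up 20-mer primer (illustrative only).
primer = "AGCTTGACCGATGCATGCAA"

print("Length:", len(primer))
print("GC fraction:", gc_fraction(primer))
print("Wallace Tm:", wallace_tm(primer))
```

A primer pair is usually chosen so that both primers have similar estimated melting temperatures; specificity still has to be checked separately, for example with Primer-BLAST.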

    📘 Understanding Python Programming
    - Introduction to Python for bioinformatics
    - Scripting basics: variables, loops, functions
    - Libraries like Biopython, pandas for biological data handling
    - Automating genomic workflows with Python scripts
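
As a taste of the scripting basics listed above (variables, loops, functions), here is a minimal sketch that reads FASTA-formatted text and reverse-complements each sequence. It deliberately uses only the Python standard library so it runs without extra installs; in practice Biopython's `SeqIO` would typically handle FASTA parsing. The sequences and function names are made up for this example.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {record_id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word after '>' as the record ID.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical FASTA content (not real data).
fasta_text = """>contig_1 hypothetical example
ATGCGT
ACGA
>contig_2
GGGCCC
"""

seqs = read_fasta(fasta_text)
for name, seq in seqs.items():
    print(name, len(seq), reverse_complement(seq))
```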

AND
Module-II: Next Generation Sequencing (NGS) - Whole Genome Metagenome Data Analysis
Topics
    📘 Introduction to NGS WGS Metagenome
    - Understanding NGS and Its Applications
    - Types of sequencing data generated
    - Understanding FASTQ files and sequencing quality scores

    📘 Linux Basics & Environment Setup
    - Linux Command Line Basics
    - Installing WGS Metagenome Tools
    - Using Conda and Shell Scripting

    📘 Data Retrieval & Formats
    - Fetching data from NCBI SRA using SRA Toolkit
    - Understanding different file formats

    📘 Introduction to R/Bioconductor
    - Installing packages with CRAN and Bioconductor
    - Data types and standardized data containers
    - Data manipulation

    📘 Detailed Metagenomic Data Analysis Pipeline

    1. Assembly:
      Assembling short reads into longer contigs to reconstruct genomic fragments
    2. RNA Prediction and Classification:
      Detecting ribosomal RNA and other non-coding RNAs from assembled sequences and classifying them into known RNA families
    3. ORF (CDS) Prediction:
      Predicting open reading frames (ORFs) or coding sequences (CDS) using tools like Prodigal to identify potential genes
    4. Homology Searching Against Taxonomic and Functional Databases:
      Matching predicted genes against reference databases (e.g., NCBI NR, COG, KEGG) to infer function and taxonomy
    5. HMMER Searching Against Pfam Database:
      Identifying protein domains using HMMER against the Pfam database to support functional annotation
    6. Taxonomic Assignment of Genes:
      Assigning taxonomy to each gene based on homology search results
    7. Functional Assignment of Genes:
      Linking predicted genes to functional categories like metabolic pathways or gene families using KEGG, COG, etc.
    8. BLASTX on Parts of the Contigs with No Gene Prediction or Hits:
      Running BLASTX on contig regions that lack gene predictions or functional hits to discover missed genes
    9. Taxonomic Assignment of Contigs and Disparity Checks:
      Assigning taxonomy at the contig level and verifying consistency among genes within each contig to flag possible chimeras
    10. Coverage and Abundance Estimation for Genes and Contigs:
      Calculating how abundant each gene or contig is by mapping sequencing reads back to the assembly
    11. Estimation of Taxa Abundances:
      Quantifying the abundance of microbial taxa across samples based on read mappings and gene assignments
    12. Estimation of Function Abundances:
      Estimating the abundance of biological functions or pathways in the community using annotated gene data
    13. Merging of Previous Results to Obtain the ORF Table:
      Integrating taxonomic, functional, and abundance data into a single table summarizing all ORFs
    14. Binning with Different Methods:
      Clustering contigs into bins representing draft genomes using binning tools like MetaBAT, MaxBin, or CONCOCT
    15. Binning Integration with DAS Tool:
      Refining binning results by integrating outputs from multiple methods using DAS Tool for higher accuracy
    16. Taxonomic Assignment of Bins and Disparity Checks:
      Assigning taxonomy to bins and verifying internal consistency using marker gene and taxonomic information
    17. Checking of Bins with CheckM2 and GTDB-Tk:
      Assessing bin completeness and contamination with CheckM2, and classifying high-quality bins using GTDB-Tk
    18. Merging of Previous Results to Obtain the Bin Table:
      Compiling bin-level data including taxonomy, abundance, and completeness scores into a comprehensive table
    19. Merging of Previous Results to Obtain the Contig Table:
      Creating a table summarizing all contigs with associated taxonomy, bins, gene content, and abundance
    20. Prediction of KEGG and MetaCyc Pathways for Each Bin:
      Mapping genes in each bin to known metabolic pathways using databases like KEGG and MetaCyc
    21. Final Statistics for the Run:
      Summarizing the entire analysis with metrics on assembly, binning, annotation, and read coverage
    22. Generation of Tables with Aggregated Taxonomic and Functional Profiles:
      Producing overview tables that summarize taxonomic composition and functional potential across samples
    23. Alpha and Beta Diversity Analysis:
      Calculating within-sample (alpha) and between-sample (beta) diversity using taxonomic and functional data
    24. Visualization:
      PCA, Rarefaction, Stacked Bar, Heatmap, Krona, Phylogeny, Bin Visualization, etc.
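
The coverage and abundance estimation step can be sketched with simple arithmetic once reads have been mapped back to the assembly. The example below assumes you already have a mapped-read count and a length for each contig (the numbers are hypothetical) and computes an approximate mean coverage plus a length-normalized, TPM-style abundance. Real pipelines derive these values from alignment files and may use different normalizations.

```python
def mean_coverage(read_count, read_length, contig_length):
    """Approximate mean depth: total mapped bases divided by contig length."""
    return read_count * read_length / contig_length


def tpm(read_counts, lengths):
    """TPM-style abundance: length-normalized read rates scaled to one million."""
    rates = {name: read_counts[name] / lengths[name] for name in read_counts}
    total = sum(rates.values())
    return {name: rate / total * 1_000_000 for name, rate in rates.items()}


# Hypothetical mapped-read counts and contig lengths (not real data).
counts = {"contig_1": 200, "contig_2": 100}
lengths = {"contig_1": 2000, "contig_2": 1000}
READ_LENGTH = 150  # assumed read length in bases

for name in counts:
    cov = mean_coverage(counts[name], READ_LENGTH, lengths[name])
    print(name, "mean coverage:", cov)

abundances = tpm(counts, lengths)
print(abundances)
```

Here contig_1 receives twice as many reads as contig_2 but is also twice as long, so after length normalization the two contigs have equal abundance.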
