WGS De Novo Assembly

Whole genome sequencing (WGS) de novo assembly is the process of reconstructing a genome from sequencing reads without using a reference genome. This approach is especially important for newly discovered or poorly characterized organisms, where no prior genomic information is available. The process begins with raw sequencing data, typically in the form of short reads (e.g., from Illumina) or long reads (e.g., from PacBio or Oxford Nanopore). These reads undergo quality control and trimming to remove low-quality sequences and adapter contamination. The cleaned reads are then assembled into contiguous sequences, known as contigs, by identifying overlapping regions.
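
As a rough illustration of how overlapping reads can be joined into a contig, here is a minimal Python sketch of a greedy overlap merge. It is a toy, not how production assemblers work: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all assumptions made for illustration, and the sketch ignores sequencing errors, reverse complements, and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that matches a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and
    returns the remaining sequences (the toy "contigs").
    """
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, keeping the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical error-free reads sampled from one short sequence.
reads = ["ATGGCGT", "GCGTACC", "TACCTTA"]
print(greedy_assemble(reads))
```

Real assemblers replace this all-against-all comparison with De Bruijn graphs or string graphs, which scale to millions of reads and cope with errors and repeats.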

This course provides comprehensive training in de novo genome assembly from whole genome sequencing (WGS) data. It covers short-read, long-read, and hybrid assembly approaches, and includes hands-on training with industry-standard bioinformatics tools. Participants will gain the skills to assemble, polish, scaffold, and evaluate genomes in real-world research scenarios.

By the end of the course, participants will be able to perform de novo genome assembly from raw sequencing data, choose appropriate tools and strategies based on data type, polish and scaffold assemblies, and evaluate assembly quality using standard metrics. They will gain hands-on experience with both short- and long-read technologies, and be equipped to apply these skills to real-world genomic research projects.
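
One widely used assembly metric is N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The sketch below shows one way it could be computed in Python. The function names and the example contig lengths are assumptions for illustration; in practice a dedicated assessment tool would normally report this value.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def assembly_summary(lengths):
    """Collect a few basic statistics for a set of contig lengths."""
    return {
        "contigs": len(lengths),
        "total_length": sum(lengths),
        "largest_contig": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Hypothetical contig lengths in base pairs.
example_lengths = [80, 70, 50, 40, 30, 20]
print(assembly_summary(example_lengths))
```

A higher N50 generally indicates a more contiguous assembly, but it says nothing about correctness, so it is usually read alongside completeness and contamination checks.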

👉 Enroll now and take the first step toward a future in data-driven biology and beyond.

## 10% Discount if you register before 15th October, 2025. Hurry up!!


Course Information
Course: Whole Genome De Novo Assembly (Whole Genome Sequencing Analysis - Prokaryote)
Duration: Online, 7 days of training (2 hours daily, Monday to Friday)

Slots

Our working hours are 9:00 AM to 6:00 PM Indian Standard Time (IST). Available slots: 9:00 AM to 11:00 AM / 11:00 AM to 1:00 PM / 2:00 PM to 4:00 PM / 4:00 PM to 6:00 PM
For training slots after 6:00 PM, before 9:00 AM, or on weekends, please mention this during registration and the sessions will be scheduled accordingly.

Mode

👉 For online training, candidates must install Zoom with remote control enabled on their system, which makes the sessions fully interactive.
👉 Candidates can record the sessions as they run, and a PDF manual will be provided for future reference.
👉 All our training is 100% practical, industry-oriented, and interactive, providing the same experience as offline learning.
👉 Extra support for clearing doubts will be provided as required.
👉 A certificate will be provided.

Sequencing Platform: Illumina / Ion Torrent / PacBio / Nanopore / Hi-C / Bionano
Raw data: Candidates can include a maximum of one dataset of their own during training. Publication-standard figures and tables will be generated.
Training Fees
Module-NGS: Whole Genome De Novo Assembly (Whole Genome Sequencing Analysis - Prokaryote)
    📘 Introduction to Genome Assembly and NGS
    - Sequencing technologies: Illumina, PacBio, Nanopore
    - Reference-guided vs. de novo assembly
    - Genome complexity: repeats, ploidy, heterozygosity
    - Assembly algorithms: De Bruijn Graphs, string graphs
    - Understanding NGS: depth, coverage, sequencing type, quality, mapping quality, CIGAR string
    - Different File Formats: FASTQ, SAM, BAM

    📘 Linux Basics & Environment Setup
    - Linux Command Line Basics
    - Installing WGS Assembly Tools
    - Using Conda and Shell Scripting

    📘 Data Retrieval & Formats
    - Fetching data from NCBI SRA using SRA Toolkit
    - Understanding different file formats

    📘 Data Preprocessing and Quality Control
    - Quality control
    - Trimming/filtering
    - Read correction

    📘 Assembly with Short Reads
    - K-mer selection and parameter tuning
    - Resource management and outputs

    📘 Long-Read and Hybrid Assembly
    - Long-read assemblers
    - Hybrid workflows
    - Assembly benchmarking

    📘 Scaffolding and Polishing
    - Scaffolding
    - Polishing
    - Iterative refinement techniques
    - Chromosome-level scaffolding: Hi-C, Bionano

    📘 Assembly Evaluation and Quality Assessment
    - Assembly metrics
    - Contamination detection

    📘 Advanced Post-Assembly Analysis
    - Pan-genome assembly and comparative & synteny analysis
    - Repeat masking
    - SSR (simple sequence repeat) discovery
    - Structural variant analysis from assemblies
    - Gene/ORF prediction
    - Functional annotation (Gene Ontology & Pathway (KEGG)/COG)
    - Genome and gene representation using Circos Genome Visualization
    - Phylogenetic analysis

Preparation

- De novo sequence assemblers (Wikipedia): https://en.wikipedia.org/wiki/De_novo_sequence_assemblers
- NCBI: https://pmc.ncbi.nlm.nih.gov/articles/PMC2813482/

Instructor

Industry Experienced

Target Audience

This course is designed for graduate students, postdoctoral researchers, and professionals working in conservation biology, evolutionary genomics, population genetics, or any other life science field who are interested in applying genomic tools to real-world conservation challenges.

Contact

Please write to us at info@arraygen.com, or call or WhatsApp us on +91-9673625446, if you need any clarification or custom training based on a candidate's reference paper or own content/tools.

Course Information
Course: Whole Genome De Novo Assembly (Whole Genome Sequencing Analysis - Prokaryote)
Duration: Online, 15 days of training (2 hours daily, Monday to Friday)

Slots

Our working hours are 9:00 AM to 6:00 PM Indian Standard Time (IST). Available slots: 9:00 AM to 11:00 AM / 11:00 AM to 1:00 PM / 2:00 PM to 4:00 PM / 4:00 PM to 6:00 PM
For training slots after 6:00 PM, before 9:00 AM, or on weekends, please mention this during registration and the sessions will be scheduled accordingly.

Mode

👉 For online training, candidates must install Zoom with remote control enabled on their system, which makes the sessions fully interactive.
👉 Candidates can record the sessions as they run, and a PDF manual will be provided for future reference.
👉 All our training is 100% practical, industry-oriented, and interactive, providing the same experience as offline learning.
👉 Extra support for clearing doubts will be provided as required.
👉 A certificate will be provided.

Sequencing Platform: Illumina / Ion Torrent / PacBio / Nanopore
Raw data: Candidates can include a maximum of one dataset of their own during training. Publication-standard figures and tables will be generated.
Training Fees
Module-I: Advanced Bioinformatics & Basic Programming
Topics
    📘 Introduction to Bioinformatics
    - Overview of bioinformatics and its applications
    - Key concepts in computational biology
    - Role of bioinformatics in genomics, transcriptomics, and proteomics

    📘 Understanding NGS and Genomics Bioinformatics
    - Basics of Next-Generation Sequencing (NGS)
    - Types of NGS data (RNA-seq, WGS, WES)
    - Overview of NGS data formats: FASTQ, BAM, VCF
    - Introduction to pipelines and tools for NGS data analysis

    📘 Databases & Data Retrieval (NCBI and UCSC)
    - Learning how to retrieve biologically correct data
    - Performing complete batch retrieval (e.g., whole exome)
    - NCBI: understanding gene-level data retrieval
    - UCSC: handling large-scale data retrieval
    - UCSC Genome Browser and Table Browser usage
    - Batch Coordinate Retrieval and Genomic Data downloads
    - GFF/GTF gene annotation formats and how to retrieve them
    - Using BLAT for sequence-based search and alignment

    📘 Gene Prediction and Functional Annotation
    - Gene prediction approaches and tools
    - Functional annotation using Gene Ontology (GO)
    - Pathway analysis using KEGG, Reactome
    - Interpreting gene sets and biological relevance

    📘 Standalone/Offline BLAST for Large-Scale Genomic Data
    - Installing and setting up standalone BLAST
    - Running local BLAST for batch sequence alignment
    - Applications in genome-wide homology searches
    - Custom BLAST databases and performance optimization

    📘 PCR Primer Designing and Specificity Check
    - Designing accurate primers for PCR amplification
    - Tools: Primer3, NCBI Primer-BLAST
    - Checking primer specificity using genome-wide BLAST
    - Avoiding non-target amplification through design best practices

    📘 Understanding Python Programming
    - Introduction to Python for bioinformatics
    - Scripting basics: variables, loops, functions
    - Libraries like Biopython, pandas for biological data handling
    - Automating genomic workflows with Python scripts

AND
Module-II: Next Generation Sequencing (NGS) - Whole Genome De Novo Assembly (Whole Genome Sequencing Analysis - Prokaryote)
Topics
    📘 Introduction to Genome Assembly and NGS
    - Sequencing technologies: Illumina, PacBio, Nanopore
    - Reference-guided vs. de novo assembly
    - Genome complexity: repeats, ploidy, heterozygosity
    - Assembly algorithms: De Bruijn Graphs, string graphs
    - Understanding NGS: depth, coverage, sequencing type, quality, mapping quality, CIGAR string
    - Different File Formats: FASTQ, SAM, BAM

    📘 Linux Basics & Environment Setup
    - Linux Command Line Basics
    - Installing WGS Assembly Tools
    - Using Conda and Shell Scripting

    📘 Data Retrieval & Formats
    - Fetching data from NCBI SRA using SRA Toolkit
    - Understanding different file formats

    📘 Data Preprocessing and Quality Control
    - Quality control
    - Trimming/filtering
    - Read correction

    📘 Assembly with Short Reads
    - K-mer selection and parameter tuning
    - Resource management and outputs

    📘 Long-Read and Hybrid Assembly
    - Long-read assemblers
    - Hybrid workflows
    - Assembly benchmarking

    📘 Scaffolding and Polishing
    - Scaffolding
    - Polishing
    - Iterative refinement techniques
    - Chromosome-level scaffolding: Hi-C, Bionano

    📘 Assembly Evaluation and Quality Assessment
    - Assembly metrics
    - Contamination detection

    📘 Advanced Post-Assembly Analysis
    - Pan-genome assembly and comparative & synteny analysis
    - Repeat masking
    - SSR (simple sequence repeat) discovery
    - Structural variant analysis from assemblies
    - Gene/ORF prediction
    - Functional annotation (Gene Ontology & Pathway (KEGG)/COG)
    - Genome and gene representation using Circos Genome Visualization
    - Phylogenetic analysis
