Read mapping

Mapping

mapping_rules

minimap2

  • Input:
    • FASTQ or compressed FASTQ file (trimmed reads)
       → provided by seqkit seq rule
    • FASTA file (reference)
  • Output:
    • SAM file (mapping)
       → used by samtools view rule
       → used by samtools flagstat rule
  • Description:
     Map the reads to the provided reference.
  • Default options:
    • --MD → output the MD tag
    • -a → choose SAM as output format
    • -x map-ont → use the preset for mapping Nanopore reads to a reference
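
As a rough sketch of how the default options above combine into one call, the helper below builds a minimap2 argument list. The function name and file names are hypothetical, and writing the output with -o (rather than redirecting stdout) is an assumption, not necessarily what the rule does.

```python
def minimap2_command(reference, reads, output_sam):
    """Build a minimap2 argument list from the default options listed above."""
    return [
        "minimap2",
        "--MD",            # output the MD tag
        "-a",              # write SAM instead of the default PAF
        "-x", "map-ont",   # preset for Nanopore reads against a reference
        "-o", output_sam,  # assumption: write to a file instead of stdout
        reference,         # FASTA reference comes first
        reads,             # FASTQ or compressed FASTQ reads come second
    ]


# Hypothetical file names, for illustration only.
cmd = minimap2_command("reference.fasta", "sample.trimmed.fastq.gz", "sample.sam")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where minimap2 is installed.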

samtools view

  • Input:
    • SAM file (mapping)
       → provided by minimap2 rule
  • Output:
    • BAM file (mapped reads)
       → used by samtools sort rule
  • Description:
     Convert the mapping file to binary format and remove unmapped reads
  • Default options:
    • -b → write output in BAM format
    • -h → include header
    • -S → input is SAM (ignored by recent samtools versions, which auto-detect the input format)
    • -F 4 → exclude reads with flag bit 4 set (unmapped reads)
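
The -F filter works on the bitwise SAM flag: a read is dropped when any bit of the given mask is set in its flag. The sketch below mimics that test in plain Python; the function name is hypothetical and the flag values are illustrative, not taken from real data.

```python
UNMAPPED = 0x4  # SAM flag bit meaning "this read is unmapped"


def passes_exclude_filter(flag, exclude_mask=UNMAPPED):
    """Return True when no bit of exclude_mask is set in the SAM flag."""
    return (flag & exclude_mask) == 0


# Illustrative flag values:
#   0  = mapped, forward strand
#   4  = unmapped
#   16 = mapped, reverse strand
#   77 = paired (1) + unmapped (4) + mate unmapped (8) + first in pair (64)
for flag in (0, 4, 16, 77):
    print(flag, passes_exclude_filter(flag))
```

Because the test is bitwise, a read with flag 77 is removed even though its flag is not exactly 4.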

samtools sort

  • Input:
    • BAM file (mapped reads)
       → provided by samtools view rule
  • Output:
    • BAM file (mapped & sorted reads)
       → used by samtools index rule
       → used by medaka consensus rule
       → used by sniffles rule
       → used by cuteSV rule
       → used by svim rule
       → used by NanoVar rule
       → used by bamCoverage rule
       → used by plotCoverage rule
  • Description:
     Sort mapped reads
  • Default options:
    • -l 9 → use the highest compression level

Indexing

indexing_rules

samtools index

  • Input:
    • BAM file (mapped & sorted reads)
       → provided by samtools sort rule
  • Output:
    • BAI file (mapped & sorted index)
       → used by medaka consensus rule
       → used by sniffles rule
       → used by cuteSV rule
       → used by svim rule
       → used by NanoVar rule
       → used by bamCoverage rule
       → used by plotCoverage rule
  • Description:
     Create an index of the mapped & sorted BAM file
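
A minimal sketch of the indexing step, assuming the rule calls samtools index with a single BAM argument. In that case samtools writes the index next to the input as <input>.bai. The helper and file names below are hypothetical.

```python
def index_command(sorted_bam):
    """Return the samtools index argument list and the BAI path it should produce."""
    # With only the BAM given, samtools index writes <input>.bai by default.
    expected_bai = sorted_bam + ".bai"
    return ["samtools", "index", sorted_bam], expected_bai


# Hypothetical file name, for illustration only.
cmd, bai = index_command("sample.sorted.bam")
print(" ".join(cmd))
print(bai)
```

Indexing needs coordinate-sorted input, which is why this rule takes the output of samtools sort rather than that of samtools view.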

Statistic control

flagstat_rules

samtools flagstat

  • Input:
    • SAM file (mapping)
       → provided by minimap2 rule
  • Output:
    • TXT file (flagstat)
  • Description:
     Compute the mapping statistics

coverage_rules

bamCoverage

  • Input:
    • BAM file (mapped & sorted reads)
       → provided by samtools sort rule
    • BAI file (mapped & sorted index)
       → provided by samtools index rule
  • Output:
    • bedgraph file (coverage file)
  • Description:
     Compute the coverage bedgraph for the sample mapping
  • Default options:
    • --normalizeUsing RPGC → normalize the number of reads per bin with the RPGC method (reads per genomic content, i.e. 1x coverage)
    • -of bedgraph → choose bedgraph as output file type
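
To make the RPGC option more concrete, the sketch below follows the formula given in the deepTools documentation: the per-bin read count is divided by a scaling factor equal to (mapped reads × fragment length) / effective genome size. The function names and numbers are hypothetical, and details such as bin size, read extension and read filtering are not modeled. RPGC also needs an effective genome size, which bamCoverage takes through --effectiveGenomeSize; how the pipeline supplies it is not shown in the options above.

```python
def rpgc_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Average coverage implied by the sequencing depth (the 1x scaling factor)."""
    return (mapped_reads * fragment_length) / effective_genome_size


def rpgc_normalize(reads_in_bin, mapped_reads, fragment_length, effective_genome_size):
    """Normalize a per-bin read count so that the average coverage becomes 1x."""
    factor = rpgc_scale_factor(mapped_reads, fragment_length, effective_genome_size)
    return reads_in_bin / factor


# Made-up numbers: 1,000 reads of 5,000 bp on a 1 Mbp genome give 5x average coverage,
# so a bin with a raw value of 10 is scaled down to 2.
print(rpgc_scale_factor(1_000, 5_000, 1_000_000))
print(rpgc_normalize(10, 1_000, 5_000, 1_000_000))
```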

plotCoverage

  • Input:
    • BAM file (mapped & sorted reads)
       → provided by samtools sort rule
    • BAI file (mapped & sorted index)
       → provided by samtools index rule
  • Output:
    • PDF file (depth plot)
  • Description:
     Create the plot of the coverage for the sample mapping.
  • Default options:
    • --smartLabels → use file names (without path and extension) as labels
    • --plotFileFormat pdf → choose pdf as output format
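
The label that --smartLabels derives from a BAM file name can be approximated as below. This is a sketch of the documented behavior (path and extension removed), not the deepTools source code; the function name and path are hypothetical.

```python
import os


def smart_label(bam_path):
    """Approximate a smart label: the file name without its path and last extension."""
    file_name = os.path.basename(bam_path)
    return os.path.splitext(file_name)[0]


# Hypothetical path, for illustration only.
print(smart_label("/results/mapping/sample1.sorted.bam"))
```

Only the last extension is stripped here, so a file named sample1.sorted.bam is labeled sample1.sorted.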