
        Note that additional data was saved in multiqc_GRCg6a_data when this report was generated.



        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411

        About MultiQC

        This report was generated using MultiQC, version 1.11

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/ewels/MultiQC


        These samples were run by seq2science v0.6.1, a tool for easy preprocessing of NGS data.

        Take a look at our docs for info about how to use this report to the fullest.

        Workflow: rna-seq
        Date: December 20, 2021
        Project: tdewijs
        Contact E-mail: tessa.dewijs2@ru.nl

        Report generated on 2021-12-21, 18:32 based on data in:



        General Statistics

        Showing 38/38 rows and 10/19 columns.
        Sample Name  % Duplication  GC content  % PF    % Adapter  % Dups  % Mapped  M Total seqs  Genome coverage  M Genome reads  M MT genome reads
        DRR032765    28.7%          47.1%       100.0%  -          56.3%   100.0%    31.5          3.0 X            31.5            2.4
        DRR032766    23.8%          47.0%       100.0%  -          50.1%   100.0%    19.3          1.8 X            19.0            1.5
        DRR032767    31.1%          46.4%       100.0%  -          58.4%   100.0%    32.1          3.0 X            31.9            2.5
        DRR032768    31.5%          46.1%       100.0%  -          59.3%   100.0%    30.5          2.8 X            29.8            2.8
        DRR032769    29.9%          45.8%       100.0%  -          59.6%   100.0%    32.0          3.0 X            32.1            2.7
        DRR032770    29.4%          46.1%       100.0%  -          59.1%   100.0%    30.9          2.9 X            31.2            2.4
        DRR032771    29.9%          49.3%       97.9%   0.3%       56.4%   100.0%    38.4          3.8 X            39.8            2.8
        DRR032772    29.8%          45.3%       99.9%   0.4%       57.6%   100.0%    30.7          2.8 X            30.3            2.5
        DRR032773    30.4%          49.4%       97.9%   0.3%       56.2%   100.0%    40.3          4.0 X            41.8            2.7
        DRR032774    29.9%          49.5%       98.0%   0.3%       55.6%   100.0%    37.0          3.6 X            38.3            2.5
        DRR032775    30.5%          49.2%       98.1%   0.8%       57.1%   100.0%    38.8          3.6 X            38.5            3.0
        DRR032776    30.9%          48.9%       98.2%   0.8%       57.8%   100.0%    41.1          3.8 X            40.3            3.3
        DRR032777    25.0%          46.3%       100.0%  0.2%       49.2%   100.0%    21.6          2.0 X            21.4            1.4
        DRR032778    24.3%          46.2%       99.9%   0.2%       48.1%   100.0%    21.1          2.0 X            20.8            1.4
        DRR032779    26.6%          48.9%       98.1%   0.4%       53.1%   100.0%    40.3          3.9 X            41.7            2.5
        DRR032780    27.9%          48.4%       98.3%   -          52.3%   100.0%    42.3          4.0 X            42.6            2.6
        DRR032781    26.6%          49.7%       97.9%   0.4%       56.2%   100.0%    39.0          4.0 X            42.5            2.3
        DRR032782    26.7%          49.4%       97.9%   0.4%       53.9%   100.0%    35.0          3.4 X            36.3            2.3
        DRR032783    25.7%          45.5%       100.0%  -          52.8%   100.0%    31.2          2.9 X            31.3            2.1
        DRR032784    25.7%          45.6%       100.0%  -          53.0%   100.0%    30.5          2.9 X            30.4            2.1
        DRR032785    31.2%          47.4%       100.0%  -          58.6%   100.0%    31.1          2.9 X            31.0            2.4
        DRR032786    26.5%          46.9%       100.0%  -          52.3%   100.0%    20.0          1.9 X            20.0            1.5
        DRR032787    32.5%          49.1%       98.1%   1.0%       58.2%   100.0%    41.6          3.9 X            41.4            3.7
        DRR032788    33.1%          48.7%       98.3%   0.7%       57.4%   100.0%    39.3          3.7 X            38.7            3.4
        ERR753791    32.4%          51.2%       95.7%   0.3%       54.2%   100.0%    20.0          0.7 X            19.6            1.7
        ERR753792    30.6%          53.6%       97.1%   -          50.2%   100.0%    19.8          0.7 X            20.5            0.5
        ERR753793    40.4%          52.5%       96.8%   0.2%       54.7%   100.0%    21.8          0.8 X            22.3            0.6
        ERR753794    40.6%          51.9%       97.1%   0.2%       54.4%   100.0%    21.6          0.8 X            22.4            0.6
        ERR753795    42.4%          53.1%       97.2%   0.2%       57.8%   100.0%    19.8          0.7 X            20.4            0.7
        GSM5017912   33.3%          51.0%       98.2%   1.6%       55.5%   100.0%    38.4          3.1 X            38.5            1.1
        GSM5017913   35.0%          51.6%       98.1%   2.5%       60.4%   100.0%    47.5          3.8 X            47.7            1.4
        GSM5017914   35.3%          51.7%       97.9%   2.5%       60.5%   100.0%    48.7          3.9 X            48.9            1.4
        GSM5017915   35.6%          51.4%       98.1%   1.6%       59.1%   100.0%    46.5          3.7 X            46.2            1.5
        GSM5017916   34.4%          51.4%       98.2%   1.8%       57.6%   100.0%    42.3          3.4 X            42.2            1.2
        GSM5017917   31.9%          51.2%       97.9%   3.4%       56.3%   100.0%    39.2          3.1 X            39.1            1.2
        GSM5017918   31.2%          51.6%       96.5%   5.2%       57.0%   100.0%    41.5          3.3 X            42.1            1.0
        GSM5017919   31.7%          51.5%       94.9%   6.6%       56.5%   100.0%    37.2          3.0 X            37.4            1.1
        GSM5017920   28.3%          51.7%       91.8%   10.9%      52.9%   100.0%    30.7          2.4 X            30.8            0.9

        Workflow explanation

        Preprocessing of reads was done automatically with the workflow tool seq2science v0.6.1. Public samples were downloaded from the Sequence Read Archive with the help of the NCBI E-utilities and pysradb. Genome assembly GRCg6a was downloaded with genomepy 0.11.0. Single-end reads were trimmed with fastp v0.20.1 with default options. Reads were aligned with STAR v2.7.6a with default options. Transcript abundances were quantified with Salmon v1.5.2 with options '--seqBias --gcBias --validateMappings --recoverOrphans'. Afterwards, duplicate reads were marked with Picard MarkDuplicates v2.23.8. General alignment statistics were collected by samtools stats v1.14. Mapped reads were removed if they did not have a minimum mapping quality of 255, were a (secondary) multimapper, or aligned inside the ENCODE blacklist. Afterwards, samples were downsampled to -1 reads. Transcript abundance estimations were aggregated and converted to gene counts using tximeta v1.10.0. Sample sequencing strandedness was inferred using RSeQC v4.0.0 in order to improve quantification accuracy. deepTools v3.5.0 was used for the fingerprint, profile, correlation and dendrogram/heatmap plots, where the heatmap was made with options '--distanceBetweenBins 9000 --binSize 1000'. The UCSC genome browser was used to visualize and inspect alignment. RNA-seq read duplication types were analyzed using dupRadar v1.20.0. Quality control metrics were aggregated by MultiQC v1.11.
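
The post-alignment filter described above can be sketched as a predicate over SAM records. This is a minimal illustration, not the code seq2science runs: the name `passes_sieve` is hypothetical, and blacklist removal is left out because it needs region coordinates. It relies only on the SAM layout (column 2 is FLAG, column 5 is MAPQ, bit 0x4 marks an unmapped read, bit 0x100 a secondary alignment) and on the config comment that a mapping quality of 255 keeps uniquely mapped STAR reads.

```python
UNMAPPED = 0x4     # SAM FLAG bit: read is unmapped
SECONDARY = 0x100  # SAM FLAG bit: secondary alignment


def passes_sieve(sam_line: str, min_mapq: int = 255) -> bool:
    """Return True if a SAM record survives the MAPQ and primary-only filters.

    Hypothetical helper for illustration; blacklist filtering is not modelled.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    mapq = int(fields[4])  # column 5: mapping quality
    if flag & UNMAPPED:
        return False
    if flag & SECONDARY:
        return False
    return mapq >= min_mapq


# Made-up records: a unique hit, a secondary alignment, and a low-MAPQ hit.
unique = "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII"
secondary = "r2\t256\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII"
multimapper = "r3\t0\tchr1\t300\t3\t4M\t*\t0\t0\tACGT\tIIII"
kept = [line.split("\t")[0] for line in (unique, secondary, multimapper)
        if passes_sieve(line)]
```

With these three made-up records only `r1` is kept.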

        Assembly stats

        Genome assembly GRCg6a consists of 464 contigs, with a GC-content of 42.23%, and 0.92% of the sequence is the letter N. The N50-L50 stats are 91315245-4 and the N75-L75 stats are 24153086-10. The genome annotation contains 13384 genes.
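
For readers unfamiliar with the N50-L50 notation, here is a small sketch of the usual definition: sort contigs from longest to shortest and accumulate lengths until the chosen fraction of the assembly is covered; Nx is the length of the last contig added and Lx is how many contigs were needed. The function name `nx_lx` and the contig lengths are made up for illustration, and the report does not state which tool computed its values.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    fraction=0.5 gives N50/L50, fraction=0.75 gives N75/L75.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("lengths must be non-empty")


# Illustrative contig lengths (not taken from GRCg6a).
contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(contigs, 0.5)
n75, l75 = nx_lx(contigs, 0.75)
```

Read this way, the reported N50-L50 of 91315245-4 would mean that the four longest contigs cover at least half of the assembly, the shortest of those four being 91,315,245 bp long.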

        fastp

        fastp is an ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...).

        Filtered Reads

        Filtering statistics of sampled reads.


        Duplication Rates

        Duplication rates of sampled reads.


        Sequence Quality

        Average sequencing quality over each base of all reads.


        GC Content

        Average GC content over each base of all reads.


        N content

        Average N content over each base of all reads.


        Picard

        Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

        Mark Duplicates

        Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.

        The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.

        To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:

        • READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
        • READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
        • READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
        • READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
        • READS_UNMAPPED = UNMAPPED_READS
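
The scheme above can be written out as a short function. This is a sketch under the assumption that the Picard metrics for one sample are available as a dictionary keyed by the column names used in the list; the function name `picard_read_categories` and the example numbers are made up.

```python
def picard_read_categories(m: dict) -> dict:
    """Convert Picard MarkDuplicates columns into per-read categories.

    Values that refer to pairs are doubled so every category counts reads.
    """
    dup_pairs = 2 * m["READ_PAIR_DUPLICATES"]
    optical = 2 * m["READ_PAIR_OPTICAL_DUPLICATES"]
    return {
        "READS_IN_DUPLICATE_PAIRS": dup_pairs,
        "READS_IN_UNIQUE_PAIRS": 2 * (m["READ_PAIRS_EXAMINED"] - m["READ_PAIR_DUPLICATES"]),
        "READS_IN_UNIQUE_UNPAIRED": m["UNPAIRED_READS_EXAMINED"] - m["UNPAIRED_READ_DUPLICATES"],
        "READS_IN_DUPLICATE_PAIRS_OPTICAL": optical,
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL": dup_pairs - optical,
        "READS_IN_DUPLICATE_UNPAIRED": m["UNPAIRED_READ_DUPLICATES"],
        "READS_UNMAPPED": m["UNMAPPED_READS"],
    }


# Made-up metrics for illustration only.
example = {
    "UNPAIRED_READS_EXAMINED": 1000,
    "UNPAIRED_READ_DUPLICATES": 300,
    "READ_PAIRS_EXAMINED": 500,
    "READ_PAIR_DUPLICATES": 100,
    "READ_PAIR_OPTICAL_DUPLICATES": 10,
    "UNMAPPED_READS": 50,
}
categories = picard_read_categories(example)

# READS_IN_DUPLICATE_PAIRS is already split into optical and non-optical,
# so it is left out of the total to avoid counting those reads twice.
total_reads = sum(v for k, v in categories.items() if k != "READS_IN_DUPLICATE_PAIRS")
```

For single-end data such as this run, the pair columns would be zero and only the unpaired and unmapped categories carry reads.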

        SamTools pre-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data.

        The pre-sieve statistics are quality metrics measured before applying the (optional) minimum mapping quality filter, blacklist removal, mitochondrial read removal, read length filtering, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).


        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.


        deepTools

        deepTools is a suite of tools to process and analyze deep sequencing data.

        PCA plot

        PCA plot with the top two principal components calculated based on genome-wide distribution of sequence reads


        Fingerprint plot

        Signal fingerprint according to plotFingerprint


        Strandedness

        The RSeQC package provides a number of useful modules that can comprehensively evaluate high-throughput RNA-seq data.

        Sequencing strandedness was inferred for the following samples, and was called if at least 60% of the sampled reads were explained by either sense (forward) or antisense (reverse).
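
The 60% rule can be sketched as a small decision function. This is an illustration only: the name `call_strandedness`, the returned labels, and the "unstranded" fallback when neither orientation reaches the cutoff are assumptions, not something the report states.

```python
def call_strandedness(sense_fraction: float, antisense_fraction: float,
                      cutoff: float = 0.6) -> str:
    """Call library strandedness from the fractions of sampled reads
    explained by the sense and antisense orientations.

    Labels and the 'unstranded' fallback are assumed for illustration.
    """
    if sense_fraction >= cutoff:
        return "forward"
    if antisense_fraction >= cutoff:
        return "reverse"
    return "unstranded"


# Made-up fractions for three hypothetical samples.
calls = [
    call_strandedness(0.95, 0.03),
    call_strandedness(0.02, 0.96),
    call_strandedness(0.48, 0.50),
]
```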

        Infer experiment

        Infer experiment counts the percentage of reads and read pairs that match the strandedness of overlapping transcripts. It can be used to infer whether RNA-seq library preps are stranded (sense or antisense).


        deepTools - Spearman correlation heatmap of reads in bins across the genome

        Spearman correlation plot generated by deepTools. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.


        deepTools - Pearson correlation heatmap of reads in bins across the genome

        Pearson correlation plot generated by deepTools. Pearson correlation is a parametric method (it makes several assumptions, e.g. normality and homoscedasticity), and assesses the linearity of the relationship.
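
The difference between the two coefficients can be seen on a toy example. The sketch below uses only the Python standard library and made-up values, not deepTools or data from this run; Spearman is computed as the Pearson correlation of the ranks, with ties sharing their average rank.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# A monotonic but non-linear relationship (made-up values).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
```

Because `y` always increases with `x`, the Spearman coefficient is essentially 1, while the Pearson coefficient is lower because the relationship is not a straight line.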


        dupRadar

        Figures generated by [dupRadar](https://bioconductor.riken.jp/packages/3.4/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#plotting-and-interpretation). Click the link for help with interpretation.


        DESeq2 - Sample distance cluster heatmap of counts

        Euclidean distance between samples, based on variance stabilizing transformed counts (RNA: expressed genes, ChIP: bound regions, ATAC: accessible regions). Gives us an overview of similarities and dissimilarities between samples.
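
As a rough illustration of what the heatmap encodes, the sketch below computes pairwise Euclidean distances between per-sample vectors. The function name `distance_matrix` and the numbers are made up; in the report the inputs are variance stabilizing transformed counts from DESeq2, a step that is not reproduced here.

```python
from math import dist


def distance_matrix(samples: dict) -> dict:
    """Pairwise Euclidean distance between per-sample value vectors."""
    names = list(samples)
    return {a: {b: dist(samples[a], samples[b]) for b in names} for a in names}


# Hypothetical transformed values for three samples over two genes.
toy = {
    "sample_A": [0.0, 0.0],
    "sample_B": [3.0, 4.0],
    "sample_C": [3.0, 4.5],
}
distances = distance_matrix(toy)
```

Small distances (here between `sample_B` and `sample_C`) indicate similar expression profiles, which is what clusters together in the heatmap.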


        DESeq2 - Spearman correlation cluster heatmap of counts

        Correlation cluster heatmap based on variance stabilizing transformed counts. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.


        DESeq2 - Pearson correlation cluster heatmap of counts

        Correlation cluster heatmap based on variance stabilizing transformed counts. Pearson correlation is a parametric (lots of assumptions, e.g. normality and homoscedasticity) method, and assesses the linearity of the relationship.


        Samples & Config

        The samples file used for this run:

        sample assembly stage descriptive_name
        DRR032785 GRCg6a HH6 HH6_rep1
        DRR032786 GRCg6a HH6 HH6_rep2
        DRR032787 GRCg6a HH8 HH8_rep1
        DRR032788 GRCg6a HH8 HH8_rep2
        GSM5017912 GRCg6a HH20 RNA-seq_WB_HH20_replicate_1
        GSM5017913 GRCg6a HH20 RNA-seq_WB_HH20_replicate_2
        GSM5017914 GRCg6a HH20 RNA-seq_WB_HH20_replicate_3
        GSM5017915 GRCg6a HH22 RNA-seq_WB_HH22_replicate_1
        GSM5017916 GRCg6a HH22 RNA-seq_WB_HH22_replicate_2
        GSM5017917 GRCg6a HH22 RNA-seq_WB_HH22_replicate_3
        GSM5017918 GRCg6a HH24 RNA-seq_WB_HH24_replicate_1
        GSM5017919 GRCg6a HH24 RNA-seq_WB_HH24_replicate_2
        GSM5017920 GRCg6a HH24 RNA-seq_WB_HH24_replicate_3
        DRR032765 GRCg6a HH11 HH11_rep1
        DRR032766 GRCg6a HH11 HH11_rep2
        DRR032767 GRCg6a HH14 HH14_rep1
        DRR032768 GRCg6a HH14 HH14_rep2
        DRR032769 GRCg6a HH16 HH16_rep1
        DRR032770 GRCg6a HH16 HH16_rep2
        DRR032771 GRCg6a HH19 HH19_rep1
        DRR032772 GRCg6a HH19 HH19_rep2
        DRR032773 GRCg6a HH21 HH21_rep1
        DRR032774 GRCg6a HH21 HH21_rep2
        DRR032775 GRCg6a HH24 HH24_rep1
        DRR032776 GRCg6a HH24 HH24_rep2
        DRR032777 GRCg6a HH28 HH28_rep1
        DRR032778 GRCg6a HH28 HH28_rep2
        DRR032779 GRCg6a HH32 HH32_rep1
        DRR032780 GRCg6a HH32 HH32_rep2
        DRR032781 GRCg6a HH34 HH34_rep1
        DRR032782 GRCg6a HH34 HH34_rep2
        DRR032783 GRCg6a HH38 HH38_rep1
        DRR032784 GRCg6a HH38 HH38_rep2
        ERR753794 GRCg6a 0hpf sample_4_0_hrs
        ERR753795 GRCg6a 5hpf sample_5_5_hrs
        ERR753791 GRCg6a 10hpf sample_1_10_hrs
        ERR753792 GRCg6a 15hpf sample_2_15_hrs
        ERR753793 GRCg6a 20hpf sample_3_20_hrs
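
To see which samples share a developmental stage, the samples file can be grouped on its `stage` column. The sketch below assumes the file is tab-separated, as the config comment says, and uses four rows copied from the table above; the function name `replicates_by_stage` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def replicates_by_stage(tsv_text: str) -> dict:
    """Group sample identifiers by the 'stage' column of a samples TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row["stage"]].append(row["sample"])
    return dict(groups)


# Four rows from the samples file, written out with explicit tabs.
example_tsv = "\n".join([
    "sample\tassembly\tstage\tdescriptive_name",
    "DRR032785\tGRCg6a\tHH6\tHH6_rep1",
    "DRR032786\tGRCg6a\tHH6\tHH6_rep2",
    "DRR032775\tGRCg6a\tHH24\tHH24_rep1",
    "GSM5017918\tGRCg6a\tHH24\tRNA-seq_WB_HH24_replicate_1",
])
by_stage = replicates_by_stage(example_tsv)
```

In this excerpt stage HH24 contains samples from two different accession series (DRR and GSM), which is worth keeping in mind when reading the clustering plots.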

        The config file used for this run:
        # tab-separated file of the samples
        samples: samples.tsv
        
        # pipeline file locations
        result_dir: ./results_Gga  # where to store results
        genome_dir: ./genomes  # where to look for or download the genomes
        fastq_dir: ./fastq  # where to look for or download the fastqs
        
        
        # contact info for multiqc report and trackhub
        email: tessa.dewijs2@ru.nl
        
        # produce a UCSC trackhub?
        create_trackhub: true
        
        # how to handle replicates
        technical_replicates: merge    # change to "keep" to not combine them
        
        # which trimmer to use
        trimmer: fastp
        
        # which quantifier to use
        quantifier: salmon  # or salmon or featurecounts
        
        # which aligner to use (not used for the gene counts matrix if the quantifier is Salmon)
        aligner: star
        
        # filtering after alignment (not used for the gene counts matrix if the quantifier is Salmon)
        remove_blacklist: true
        min_mapping_quality: 255  # (only keep uniquely mapped reads from STAR alignments)
        only_primary_align: true
        remove_dups: false # keep duplicates (check dupRadar in the MultiQC)
        
        ## differential gene expression analysis
        #contrasts:
        #  - 'stage_2_1'