        Note that additional data was saved in multiqc_GRCh38.p13_data when this report was generated.



        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411

        About MultiQC

        This report was generated using MultiQC, version 1.14

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/ewels/MultiQC


        These samples were run by seq2science v1.2.0, a tool for easy preprocessing of NGS data.

        Take a look at our docs for info about how to use this report to the fullest.

        Workflow: rna-seq
        Date: October 04, 2023
        Project: 2023_s2s_run
        Contact E-mail: slrinzema@science.ru.nl

        Report generated on 2023-10-04, 11:19 CEST based on data in:


        General Statistics

        Showing 62/62 rows and 12/22 columns.
        Sample Name                  % Duplication  M Reads After Filtering  GC content  % PF    % Adapter  Insert Size  % Dups  % Mapped  M Total seqs  Genome coverage  M Genome reads  M MT genome reads
        DMSO_1                       17.1%          104.0                    48.9%       100.0%  0.0%       222 bp       20.8%   100.0%    101.1         1.9 X            142.5           3.9
        DMSO_2                       33.6%          143.2                    51.0%       100.0%  0.0%       209 bp       39.5%   100.0%    139.4         2.7 X            202.0           4.7
        DMSO_3                       13.0%          49.7                     48.4%       100.0%  0.0%       198 bp       16.1%   100.0%    48.3          0.8 X            62.8            1.8
        DMSO_4                       32.3%          143.9                    50.6%       100.0%  0.0%       218 bp       20.8%   100.0%    101.3         1.9 X            143.8           3.1
        HSCD12_DMSO_1bio             12.6%          50.4                     49.0%       97.1%   0.0%
        HSCD12_DMSO_1tec             13.0%          53.7                     48.7%       97.4%   0.0%
        HSCD12_DMSO_2bio             19.2%          49.9                     51.2%       96.8%   0.1%
        HSCD12_DMSO_2tec             26.1%          93.4                     50.9%       96.9%   0.1%
        HSCD12_DMSO_3bio             11.1%          35.6                     48.4%       96.4%   0.1%
        HSCD12_DMSO_3tec             7.8%           14.1                     48.4%       96.5%   0.1%
        HSCD12_RU1uMTA100nM_1bio     9.1%           30.5                     48.7%       96.9%   0.1%
        HSCD12_RU1uMTA100nM_1tec     12.3%          58.5                     44.2%       97.1%   0.1%
        HSCD12_RU1uMTA100nM_2bio     13.7%          43.1                     51.4%       95.6%   0.1%
        HSCD12_RU1uMTA100nM_2tec     9.2%           39.8                     45.2%       96.3%   0.1%
        HSCD12_RU1uMTA100nM_3bio     12.7%          44.8                     50.0%       97.3%   0.1%
        HSCD12_RU1uMTA100nM_3tec     11.3%          38.5                     45.3%       96.2%   0.2%
        HSCD12_RU1uMTA1uM_1bio       11.6%          41.5                     48.8%       96.3%   0.1%
        HSCD12_RU1uMTA1uM_1tec       11.9%          42.2                     48.5%       96.7%   0.1%
        HSCD12_RU1uMTA1uM_2bio       10.8%          34.4                     50.0%       97.9%   0.1%
        HSCD12_RU1uMTA1uM_2tec       14.7%          72.3                     49.8%       97.7%   0.1%
        HSCD12_RU1uMTA1uM_3bio       13.8%          44.9                     51.1%       97.7%   0.1%
        HSCD12_RU1uMTA1uM_3tec       13.5%          47.7                     50.9%       97.2%   0.1%
        HSCD12_RU1uM_1bio            12.3%          31.4                     50.3%       97.2%   0.0%
        HSCD12_RU1uM_1tec            21.0%          42.5                     47.9%       96.0%   0.6%
        HSCD12_RU1uM_2bio            10.5%          42.0                     50.3%       96.5%   0.1%
        HSCD12_RU1uM_2tec            16.6%          43.5                     48.4%       97.1%   0.9%
        HSCD12_RU1uM_3bio            12.6%          36.2                     50.4%       96.7%   0.1%
        HSCD12_RU1uM_3tec            11.2%          40.5                     46.4%       97.8%   0.2%
        HSCD12_TA100nM_1bio          12.0%          32.0                     51.3%       97.7%   0.1%
        HSCD12_TA100nM_1tec          17.4%          105.1                    51.2%       97.8%   0.1%
        HSCD12_TA100nM_2bio          11.7%          36.8                     50.7%       95.6%   0.1%
        HSCD12_TA100nM_2tec          14.2%          60.3                     50.5%       95.5%   0.1%
        HSCD12_TA100nM_3bio          14.6%          44.3                     50.1%       96.4%   0.2%
        HSCD12_TA100nM_3tec          12.5%          29.1                     49.9%       96.4%   0.2%
        HSCD12_TA1uM_1bio            12.3%          36.3                     50.2%       97.1%   0.1%
        HSCD12_TA1uM_1tec            17.4%          101.9                    50.2%       97.3%   0.2%
        HSCD12_TA1uM_2bio            10.9%          37.5                     51.5%       97.4%   0.0%
        HSCD12_TA1uM_2tec            13.6%          68.7                     51.4%       97.3%   0.0%
        HSCD12_TA1uM_3bio            13.0%          38.7                     49.2%       98.3%   0.1%
        HSCD12_TA1uM_3tec            15.1%          55.6                     49.1%       98.2%   0.1%
        NP_Day12_supernatant_DMSO_1  17.9%          45.8                     48.9%       97.5%   0.0%
        NP_Day12_supernatant_DMSO_2  11.2%          48.3                     49.1%       97.5%   0.1%
        NP_Day12_supernatant_DMSO_3  57.7%          49.8                     53.5%       97.4%   0.0%
        NP_Day12_supernatant_TA_1    13.1%          40.1                     48.1%       97.0%   0.0%
        NP_Day12_supernatant_TA_2    15.2%          49.7                     52.0%       96.2%   0.0%
        NP_Day12_supernatant_TA_3    54.8%          51.4                     53.3%       98.5%   0.0%
        RU1uMTA100nM_1               13.5%          89.0                     45.8%       100.0%  0.0%       199 bp       16.0%   100.0%    86.3          1.4 X            101.0           4.2
        RU1uMTA100nM_2               14.0%          82.8                     48.5%       100.0%  0.0%       201 bp       17.1%   100.0%    80.6          1.4 X            105.0           3.2
        RU1uMTA100nM_3               14.5%          83.2                     47.8%       100.0%  0.0%       198 bp       17.3%   100.0%    80.7          1.4 X            101.4           3.7
        RU1uMTA1uM_1                 16.1%          83.7                     48.6%       100.0%  0.0%       217 bp       19.7%   100.0%    80.4          1.5 X            110.8           3.3
        RU1uMTA1uM_2                 17.4%          106.7                    49.9%       100.0%  0.0%       217 bp       21.3%   100.0%    103.3         1.9 X            144.1           4.1
        RU1uMTA1uM_3                 17.9%          92.5                     51.0%       100.0%  0.0%       202 bp       21.8%   100.0%    89.6          1.8 X            130.2           4.0
        RU1uM_1                      19.7%          73.9                     48.9%       100.0%  0.0%       188 bp       23.8%   100.0%    71.3          1.7 X            124.4           3.9
        RU1uM_2                      16.0%          85.5                     49.3%       100.0%  0.0%       196 bp       19.7%   100.0%    82.9          1.9 X            140.4           3.3
        RU1uM_3                      14.5%          76.7                     48.3%       100.0%  0.0%       197 bp       17.4%   100.0%    74.7          1.3 X            99.4            3.3
        TA100nM_1                    19.4%          137.0                    51.2%       100.0%  0.0%       172 bp       23.6%   100.0%    133.3         2.7 X            198.2           5.9
        TA100nM_2                    17.6%          97.1                     50.5%       100.0%  0.0%       222 bp       21.7%   100.0%    94.3          1.9 X            138.6           3.0
        TA100nM_3                    18.9%          73.4                     50.0%       100.0%  0.0%       175 bp       23.1%   100.0%    71.2          1.3 X            99.6            3.4
        TA1uM_1                      19.9%          138.1                    50.2%       100.0%  0.0%       194 bp       23.8%   100.0%    134.5         2.6 X            195.7           5.1
        TA1uM_2                      16.6%          106.2                    51.4%       100.0%  0.0%       222 bp       20.7%   100.0%    102.9         2.1 X            153.4           3.3
        TA1uM_3                      18.6%          94.3                     49.1%       100.0%  0.0%       225 bp       22.4%   100.0%    92.1          1.7 X            125.2           4.0
        TA_1                         32.0%          141.1                    51.4%       100.0%  0.0%       230 bp       21.5%   100.0%    100.6         2.0 X            151.3           3.0

        Workflow explanation

        Preprocessing of reads was done automatically by seq2science v1.2.0 using the rna-seq workflow. Paired-end reads were trimmed with fastp v0.23.2 with default options. Genome assembly GRCh38.p13 was downloaded with genomepy 0.16.1. Reads were aligned with STAR v2.7.10b with default options. Decoy sequences were generated in order to improve Salmon mapping accuracy. Afterwards, duplicate reads were marked with Picard MarkDuplicates v3.0.0. Transcript abundances were quantified with Salmon v1.10.1 with options '--seqBias --gcBias --posBias --validateMappings --recoverOrphans'. General alignment statistics were collected by samtools stats v1.16. Transcript abundance estimations were aggregated using pytxi and converted to gene counts using genomepy. Sample sequencing strandedness was inferred using RSeQC v5.0.1 in order to improve quantification accuracy. Deeptools v3.5.1 was used for the fingerprint, profile, correlation and dendrogram/heatmap plots, where the heatmap was made with options '--distanceBetweenBins 9000 --binSize 1000'. RNA-seq read duplication types were analyzed using dupRadar v1.28.0. The UCSC genome browser was used to visualize and inspect alignment. Quality control metrics were aggregated by MultiQC v1.14.

        Assembly stats

        Genome assembly GRCh38.p13 consists of 194 contigs, has a GC content of 40.86%, and 4.96% of it consists of the letter N. The N50-L50 stats are 145138636-9 (N50 = 145,138,636 bp, L50 = 9) and the N75-L75 stats are 114364328-14 (N75 = 114,364,328 bp, L75 = 14). The genome annotation contains 39397 genes.
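
        For readers unfamiliar with the N50-L50 notation, the sketch below shows how such values are commonly computed: sort the contig lengths from longest to shortest and accumulate them until a given fraction of the total assembly length is reached. The function name `nx_lx` and the toy contig lengths are made up for illustration; they are not the GRCh38.p13 contigs, and this is not necessarily how the report computed its numbers.

```python
def nx_lx(contig_lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running total (longest
    contig first) reaches `fraction` of the assembly size; Lx is the
    number of contigs needed to get there.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ordered = sorted(contig_lengths, reverse=True)
    threshold = fraction * sum(ordered)

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Hypothetical toy assembly (total length 300), for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(nx_lx(toy_contigs, 0.5))   # (80, 2): 100 + 80 >= 150
print(nx_lx(toy_contigs, 0.75))  # (60, 3): 100 + 80 + 60 >= 225
```

        Read this way, "145138636-9" says that the 9 longest contigs, the shortest of which is 145,138,636 bp, together cover at least half of the assembly.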

        fastp

        fastp: an ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...). DOI: 10.1093/bioinformatics/bty560.

        Filtered Reads

        Filtering statistics of sampled reads.

        Insert Sizes

        Insert size estimation of sampled reads.

        Sequence Quality

        Average sequencing quality over each base of all reads.

        GC Content

        Average GC content over each base of all reads.

        N content

        Average N content over each base of all reads.

        Picard

        Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

        Insert Size

        Plot shows the number of reads at a given insert size. Reads with different orientations are summed.

        Mark Duplicates

        Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.

        The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.

        To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:

        • READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
        • READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
        • READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
        • READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
        • READS_UNMAPPED = UNMAPPED_READS
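
        The scheme above can be written as a small function. This is a sketch of the arithmetic only, not MultiQC's own code: the function name `picard_read_categories` and the example numbers are hypothetical, while the dictionary keys are the metric names used in the list above.

```python
def picard_read_categories(m):
    """Turn Picard MarkDuplicates metrics into per-read counts.

    `m` is a dict keyed by Picard metric column names. Values that
    refer to read pairs are doubled so every category counts reads.
    """
    dup_pairs = 2 * m["READ_PAIR_DUPLICATES"]
    optical = 2 * m["READ_PAIR_OPTICAL_DUPLICATES"]
    return {
        "READS_IN_DUPLICATE_PAIRS": dup_pairs,
        "READS_IN_UNIQUE_PAIRS": 2 * (m["READ_PAIRS_EXAMINED"] - m["READ_PAIR_DUPLICATES"]),
        "READS_IN_UNIQUE_UNPAIRED": m["UNPAIRED_READS_EXAMINED"] - m["UNPAIRED_READ_DUPLICATES"],
        "READS_IN_DUPLICATE_PAIRS_OPTICAL": optical,
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL": dup_pairs - optical,
        "READS_IN_DUPLICATE_UNPAIRED": m["UNPAIRED_READ_DUPLICATES"],
        "READS_UNMAPPED": m["UNMAPPED_READS"],
    }


# Hypothetical metrics, for illustration only.
example = {
    "READ_PAIRS_EXAMINED": 1000,
    "READ_PAIR_DUPLICATES": 200,
    "READ_PAIR_OPTICAL_DUPLICATES": 50,
    "UNPAIRED_READS_EXAMINED": 100,
    "UNPAIRED_READ_DUPLICATES": 10,
    "UNMAPPED_READS": 30,
}
cats = picard_read_categories(example)

# The plotted categories (the duplicate-pair total is split into its
# optical and non-optical parts) add up to all reads in the file.
plotted = sum(v for k, v in cats.items() if k != "READS_IN_DUPLICATE_PAIRS")
print(plotted)  # 2130 = 2 * 1000 pairs + 100 unpaired + 30 unmapped
```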

        SamTools pre-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data. DOI: 10.1093/bioinformatics/btp352.

        The pre-sieve statistics are quality metrics measured before applying (optional) minimum mapping quality, blacklist removal, mitochondrial read removal, read length filtering, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).

        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.

        deepTools

        deepTools is a suite of tools to process and analyze deep sequencing data. DOI: 10.1093/nar/gkw257.

        PCA plot

        PCA plot of the top two principal components, calculated from the genome-wide distribution of sequence reads.

        Fingerprint plot

        Signal fingerprint according to plotFingerprint

        Strandedness

        RSeQC provides a number of useful modules that can comprehensively evaluate high-throughput RNA-seq data. DOI: 10.1093/bioinformatics/bts356.

        Sequencing strandedness was inferred for the following samples, and was called if 60% of the sampled reads were explained by either sense (forward) or antisense (reverse).
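
        The 60% rule described above can be sketched as follows. This is an illustration under stated assumptions, not the seq2science implementation: the function name `call_strandedness`, the reading of "60%" as "at least 60%", and the fallback label "unstranded" are all assumptions made for this example.

```python
def call_strandedness(frac_sense, frac_antisense, threshold=0.6):
    """Call library strandedness from the fractions of sampled reads
    explained by sense (forward) and antisense (reverse) orientation.

    Assumption: a call is made when a fraction is at least `threshold`;
    otherwise the library is treated as unstranded.
    """
    if frac_sense >= threshold:
        return "forward"
    if frac_antisense >= threshold:
        return "reverse"
    return "unstranded"


# Hypothetical fractions, for illustration only.
print(call_strandedness(0.02, 0.95))  # reverse
print(call_strandedness(0.48, 0.49))  # unstranded
```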

        Infer experiment

        Infer experiment counts the percentage of reads and read pairs that match the strandedness of overlapping transcripts. It can be used to infer whether RNA-seq library preps are stranded (sense or antisense).

        deepTools - Spearman correlation heatmap of reads in bins across the genome

        Spearman correlation plot generated by deeptools. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.


        deepTools - Pearson correlation heatmap of reads in bins across the genome

        Pearson correlation plot generated by deeptools. Pearson correlation is a parametric method (it makes several assumptions, e.g. normality and homoscedasticity) and assesses the linearity of the relationship.
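
        The difference between assessing monotonicity (Spearman) and linearity (Pearson) is easiest to see on data that rise steadily but not in a straight line. The sketch below uses plain Python and made-up values; it is not how deeptools computes its heatmaps. Spearman is implemented here as Pearson correlation applied to ranks, with ties sharing their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up data: y increases with x, but cubically rather than linearly.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(spearman(x, y))  # perfectly monotonic, so (numerically) 1
print(pearson(x, y))   # noticeably below 1, because the trend is not linear
```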


        dupRadar

        Figures generated by dupRadar. For help with interpretation, see https://bioconductor.riken.jp/packages/3.4/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#plotting-and-interpretation


        DESeq2 - Sample distance cluster heatmap of counts

        Euclidean distance between samples, based on variance stabilizing transformed counts (RNA: expressed genes, ChIP: bound regions, ATAC: accessible regions). Gives us an overview of similarities and dissimilarities between samples.


        DESeq2 - Spearman correlation cluster heatmap of counts

        Correlation cluster heatmap based on variance stabilizing transformed counts. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.


        DESeq2 - Pearson correlation cluster heatmap of counts

        Correlation cluster heatmap based on variance stabilizing transformed counts. Pearson correlation is a parametric method (it makes several assumptions, e.g. normality and homoscedasticity) and assesses the linearity of the relationship.


        Samples & Config

        The samples file used for this run:

        sample assembly technical_replicates descriptivate_name
        HSCD12_DMSO_1bio GRCh38.p13 DMSO_1 HSCD12_DMSO_1bio
        HSCD12_DMSO_1tec GRCh38.p13 DMSO_1 HSCD12_DMSO_1tec
        HSCD12_DMSO_2bio GRCh38.p13 DMSO_2 HSCD12_DMSO_2bio
        HSCD12_DMSO_2tec GRCh38.p13 DMSO_2 HSCD12_DMSO_2tec
        HSCD12_DMSO_3bio GRCh38.p13 DMSO_3 HSCD12_DMSO_3bio
        HSCD12_DMSO_3tec GRCh38.p13 DMSO_3 HSCD12_DMSO_3tec
        HSCD12_RU1uM_1bio GRCh38.p13 RU1uM_1 HSCD12_RU1uM_1bio
        HSCD12_RU1uM_1tec GRCh38.p13 RU1uM_1 HSCD12_RU1uM_1tec
        HSCD12_RU1uM_2bio GRCh38.p13 RU1uM_2 HSCD12_RU1uM_2bio
        HSCD12_RU1uM_2tec GRCh38.p13 RU1uM_2 HSCD12_RU1uM_2tec
        HSCD12_RU1uM_3bio GRCh38.p13 RU1uM_3 HSCD12_RU1uM_3bio
        HSCD12_RU1uM_3tec GRCh38.p13 RU1uM_3 HSCD12_RU1uM_3tec
        HSCD12_RU1uMTA100nM_1bio GRCh38.p13 RU1uMTA100nM_1 HSCD12_RU1uMTA100nM_1bio
        HSCD12_RU1uMTA100nM_1tec GRCh38.p13 RU1uMTA100nM_1 HSCD12_RU1uMTA100nM_1tec
        HSCD12_RU1uMTA100nM_2bio GRCh38.p13 RU1uMTA100nM_2 HSCD12_RU1uMTA100nM_2bio
        HSCD12_RU1uMTA100nM_2tec GRCh38.p13 RU1uMTA100nM_2 HSCD12_RU1uMTA100nM_2tec
        HSCD12_RU1uMTA100nM_3bio GRCh38.p13 RU1uMTA100nM_3 HSCD12_RU1uMTA100nM_3bio
        HSCD12_RU1uMTA100nM_3tec GRCh38.p13 RU1uMTA100nM_3 HSCD12_RU1uMTA100nM_3tec
        HSCD12_RU1uMTA1uM_1bio GRCh38.p13 RU1uMTA1uM_1 HSCD12_RU1uMTA1uM_1bio
        HSCD12_RU1uMTA1uM_1tec GRCh38.p13 RU1uMTA1uM_1 HSCD12_RU1uMTA1uM_1tec
        HSCD12_RU1uMTA1uM_2bio GRCh38.p13 RU1uMTA1uM_2 HSCD12_RU1uMTA1uM_2bio
        HSCD12_RU1uMTA1uM_2tec GRCh38.p13 RU1uMTA1uM_2 HSCD12_RU1uMTA1uM_2tec
        HSCD12_RU1uMTA1uM_3bio GRCh38.p13 RU1uMTA1uM_3 HSCD12_RU1uMTA1uM_3bio
        HSCD12_RU1uMTA1uM_3tec GRCh38.p13 RU1uMTA1uM_3 HSCD12_RU1uMTA1uM_3tec
        HSCD12_TA100nM_1bio GRCh38.p13 TA100nM_1 HSCD12_TA100nM_1bio
        HSCD12_TA100nM_1tec GRCh38.p13 TA100nM_1 HSCD12_TA100nM_1tec
        HSCD12_TA100nM_2bio GRCh38.p13 TA100nM_2 HSCD12_TA100nM_2bio
        HSCD12_TA100nM_2tec GRCh38.p13 TA100nM_2 HSCD12_TA100nM_2tec
        HSCD12_TA100nM_3bio GRCh38.p13 TA100nM_3 HSCD12_TA100nM_3bio
        HSCD12_TA100nM_3tec GRCh38.p13 TA100nM_3 HSCD12_TA100nM_3tec
        HSCD12_TA1uM_1bio GRCh38.p13 TA1uM_1 HSCD12_TA1uM_1bio
        HSCD12_TA1uM_1tec GRCh38.p13 TA1uM_1 HSCD12_TA1uM_1tec
        HSCD12_TA1uM_2bio GRCh38.p13 TA1uM_2 HSCD12_TA1uM_2bio
        HSCD12_TA1uM_2tec GRCh38.p13 TA1uM_2 HSCD12_TA1uM_2tec
        HSCD12_TA1uM_3bio GRCh38.p13 TA1uM_3 HSCD12_TA1uM_3bio
        HSCD12_TA1uM_3tec GRCh38.p13 TA1uM_3 HSCD12_TA1uM_3tec
        NP_Day12_supernatant_DMSO_1 GRCh38.p13 DMSO_4 NP_Day12_supernatant_DMSO_1
        NP_Day12_supernatant_DMSO_2 GRCh38.p13 DMSO_4 NP_Day12_supernatant_DMSO_2
        NP_Day12_supernatant_DMSO_3 GRCh38.p13 DMSO_4 NP_Day12_supernatant_DMSO_3
        NP_Day12_supernatant_TA_1 GRCh38.p13 TA_1 NP_Day12_supernatant_TA_1
        NP_Day12_supernatant_TA_2 GRCh38.p13 TA_1 NP_Day12_supernatant_TA_2
        NP_Day12_supernatant_TA_3 GRCh38.p13 TA_1 NP_Day12_supernatant_TA_3
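
        The technical_replicates column determines which runs end up under one merged sample name (for example DMSO_4). The sketch below groups a few rows copied from the table above by that column. It assumes the samples file is tab-separated, as the config comment states; the function name `group_replicates` is hypothetical, and this is not seq2science's own merging code.

```python
import csv
import io
from collections import defaultdict

# A few rows from the samples file, written tab-separated.
SAMPLES_TSV = (
    "sample\tassembly\ttechnical_replicates\n"
    "HSCD12_DMSO_1bio\tGRCh38.p13\tDMSO_1\n"
    "HSCD12_DMSO_1tec\tGRCh38.p13\tDMSO_1\n"
    "NP_Day12_supernatant_DMSO_1\tGRCh38.p13\tDMSO_4\n"
    "NP_Day12_supernatant_DMSO_2\tGRCh38.p13\tDMSO_4\n"
    "NP_Day12_supernatant_DMSO_3\tGRCh38.p13\tDMSO_4\n"
)


def group_replicates(tsv_text):
    """Map each technical_replicates name to the samples that share it."""
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        groups[row["technical_replicates"]].append(row["sample"])
    return dict(groups)


for merged_name, members in group_replicates(SAMPLES_TSV).items():
    print(merged_name, "<-", ", ".join(members))
```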

        The config file used for this run:
        # tab-separated file of the samples
        samples: samples.tsv
        
        # pipeline file locations
        result_dir: ./results  # where to store results
        genome_dir: /ceph/rimlsfnwi/data/genomes/
        fastq_dir: /home/slrinzema/ceph/ghe/samples
        
        
        # contact info for multiqc report and trackhub
        email: slrinzema@science.ru.nl
        
        # produce a UCSC trackhub?
        create_trackhub: true
        
        # how to handle replicates
        technical_replicates: merge    # change to "keep" to not combine them
        
        # which trimmer to use
        trimmer: fastp
        
        # which quantifier to use
        quantifier: salmon
        
        # which aligner to use
        aligner: star
        
        # how to sort bam
        bam_sorter:
          samtools:
            coordinate
        
        # filtering after alignment
        remove_blacklist: true
        only_primary_align: true
        min_mapping_quality: 255 # instead of 30
        remove_dups: false
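
        A note on the min_mapping_quality setting: STAR by default assigns MAPQ 255 to uniquely mapped reads, so a threshold of 255 keeps unique alignments only. The sketch below illustrates how the two alignment filters could be applied to a single record. It is a simplified illustration, not the pipeline's implementation: the function name `passes_filters` is hypothetical, and treating only_primary_align as a test of the SAM "secondary alignment" flag bit (0x100) is an assumption.

```python
SECONDARY = 0x100  # SAM flag bit: not the primary alignment


def passes_filters(flag, mapq, min_mapping_quality=255, only_primary_align=True):
    """Return True if an alignment with this SAM flag and MAPQ is kept.

    Assumption: "primary only" is modelled by rejecting records that
    carry the secondary-alignment flag bit.
    """
    if only_primary_align and flag & SECONDARY:
        return False
    return mapq >= min_mapping_quality


# Hypothetical records as (flag, MAPQ) pairs.
print(passes_filters(99, 255))              # True: primary, uniquely mapped
print(passes_filters(99, 3))                # False: multi-mapper MAPQ
print(passes_filters(99 | SECONDARY, 255))  # False: secondary alignment
```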