
        Note that additional data was saved in multiqc_GRCz11_data when this report was generated.



        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411

        About MultiQC

        This report was generated using MultiQC, version 1.11

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/ewels/MultiQC


        These samples were run by seq2science v0.6.1, a tool for easy preprocessing of NGS data.

        Take a look at our docs for info about how to use this report to the fullest.

        Workflow: rna-seq
        Date: January 12, 2022
        Project: pt6
        Contact E-mail: tessa.dewijs2@ru.nl

        Report generated on 2022-01-13, 20:45 based on data in:



        General Statistics

        Showing 38/38 rows and 11/21 columns.
        Sample Name  % Duplication  GC content  % PF   % Adapter  Insert Size  % Dups  % Mapped  M Total seqs  Genome coverage  M Genome reads  M MT genome reads
        GSM1600035   28.9%          47.9%       99.2%  2.4%       -            61.4%   100.0%    49.0          4.4 X            59.5            2.0
        GSM1600036   53.8%          48.6%       99.4%  -          -            78.0%   100.0%    62.3          4.6 X            62.3            2.2
        GSM1600037   35.2%          47.8%       99.5%  -          -            77.1%   100.0%    45.0          3.2 X            43.9            2.4
        GSM1600038   43.1%          50.0%       99.5%  -          -            84.4%   100.0%    133.6         9.9 X            134.1           3.5
        GSM1600039   34.0%          48.2%       99.5%  -          -            75.8%   100.0%    85.8          6.5 X            88.4            2.9
        GSM1600040   38.6%          50.6%       99.3%  1.7%       -            75.8%   100.0%    85.4          6.5 X            89.2            1.3
        GSM1600041   37.7%          49.3%       99.3%  2.1%       -            77.4%   100.0%    111.4         8.3 X            112.6           1.6
        GSM1600042   45.6%          47.8%       99.6%  -          -            71.1%   100.0%    69.1          5.2 X            70.1            1.0
        GSM1600043   46.1%          47.7%       99.6%  1.0%       -            71.8%   100.0%    81.2          6.0 X            82.3            1.1
        GSM1600044   45.3%          47.4%       99.6%  -          -            72.1%   100.0%    93.9          7.5 X            102.1           1.2
        GSM4252148   26.9%          46.2%       95.5%  0.1%       207 bp       66.6%   100.0%    63.9          2.4 X            44.3            32.5
        GSM4252149   33.6%          48.6%       94.7%  0.2%       207 bp       48.4%   100.0%    64.0          3.6 X            65.3            9.0
        GSM4252150   13.1%          46.8%       96.4%  0.1%       181 bp       68.4%   100.0%    77.5          3.9 X            71.8            41.8
        GSM4252151   20.8%          51.0%       92.8%  0.1%       171 bp       50.7%   100.0%    58.8          5.5 X            100.9           9.0
        GSM4252152   13.3%          45.9%       93.9%  0.1%       188 bp       58.0%   100.0%    74.1          2.5 X            45.6            36.8
        GSM4252153   22.5%          48.5%       93.7%  0.1%       193 bp       46.3%   100.0%    70.6          3.4 X            62.3            17.4
        GSM4252154   13.7%          46.2%       95.0%  0.3%       151 bp       38.7%   100.0%    95.1          5.6 X            102.0           20.9
        GSM4252155   20.6%          48.5%       93.0%  0.2%       167 bp       48.5%   100.0%    70.2          3.7 X            67.8            19.5
        GSM4252156   7.7%           42.6%       94.1%  0.3%       168 bp       18.4%   100.0%    80.0          5.2 X            94.7            6.5
        GSM4252157   25.2%          51.5%       88.6%  0.1%       174 bp       60.0%   100.0%    54.6          4.9 X            88.6            12.8
        GSM5136500   25.3%          49.5%       99.6%  1.0%       349 bp       37.9%   100.0%    58.1          6.5 X            60.1            1.2
        GSM5136501   25.1%          49.7%       99.6%  1.2%       322 bp       37.1%   100.0%    49.0          5.5 X            50.9            1.0
        GSM5136502   26.7%          50.0%       99.6%  1.2%       313 bp       40.7%   100.0%    67.4          8.0 X            73.9            1.4
        GSM5136503   23.8%          49.3%       99.7%  1.1%       321 bp       35.9%   100.0%    46.1          5.2 X            47.7            1.0
        GSM5136504   24.8%          49.2%       99.7%  1.3%       302 bp       36.7%   100.0%    46.5          5.2 X            48.5            1.0
        GSM5136505   19.2%          49.1%       99.7%  1.3%       296 bp       31.6%   100.0%    40.2          4.5 X            41.6            0.9
        GSM5136506   18.8%          49.4%       99.7%  1.1%       331 bp       31.8%   100.0%    40.5          4.6 X            42.6            0.9
        GSM5136507   24.1%          49.0%       99.6%  1.1%       306 bp       36.2%   100.0%    47.4          5.3 X            49.0            1.1
        GSM5136508   24.0%          49.1%       99.6%  1.2%       308 bp       35.8%   100.0%    47.0          5.3 X            48.7            1.0
        GSM5136509   19.7%          49.2%       99.7%  1.1%       311 bp       32.0%   100.0%    38.5          4.4 X            40.7            0.9
        GSM5136510   23.5%          49.1%       99.7%  1.2%       306 bp       35.6%   100.0%    44.5          5.0 X            46.0            1.1
        GSM5136511   19.6%          49.2%       99.7%  1.1%       316 bp       32.1%   100.0%    39.5          4.4 X            40.8            0.9
        GSM5136512   17.8%          49.2%       99.5%  1.0%       412 bp       30.7%   100.0%    55.2          6.2 X            57.0            1.2
        GSM5136513   13.4%          49.3%       99.6%  1.1%       334 bp       26.7%   100.0%    50.6          5.7 X            52.5            1.1
        GSM5136514   14.1%          49.2%       99.6%  1.1%       331 bp       28.1%   100.0%    57.2          6.4 X            59.3            1.3
        GSM5136515   25.9%          49.2%       99.6%  1.3%       307 bp       38.9%   100.0%    51.5          5.8 X            53.2            1.2
        GSM5136516   25.9%          49.6%       99.7%  1.3%       294 bp       40.3%   100.0%    58.5          6.5 X            60.5            1.2
        GSM5136517   27.9%          49.2%       99.6%  1.2%       286 bp       42.5%   100.0%    71.6          8.0 X            74.1            1.7

        Workflow explanation

        Reads were preprocessed automatically with the workflow tool seq2science v0.6.1. Public samples were downloaded from the Sequence Read Archive with the help of the NCBI E-utilities and pysradb. Genome assembly GRCz11 was downloaded with genomepy 0.11.0. Paired-end and single-end reads were trimmed with fastp v0.20.1 with default options. Reads were aligned with STAR v2.7.6a with default options. Transcript abundances were quantified with Salmon v1.5.2 with options '--seqBias --gcBias --validateMappings --recoverOrphans'. Afterwards, duplicate reads were marked with Picard MarkDuplicates v2.23.8. General alignment statistics were collected by samtools stats v1.14. Mapped reads were removed if they had a mapping quality below 255, were a (secondary) multimapper, or aligned inside the ENCODE blacklist. Afterwards, samples were downsampled to -1 reads. Transcript abundance estimates were aggregated and converted to gene counts using tximeta v1.10.0. Sample sequencing strandedness was inferred using RSeQC v4.0.0 in order to improve quantification accuracy. deepTools v3.5.0 was used for the fingerprint, profile, correlation and dendrogram/heatmap plots, where the heatmap was made with options '--distanceBetweenBins 9000 --binSize 1000'. The UCSC genome browser was used to visualize and inspect alignments. RNA-seq read duplication types were analyzed using dupRadar v1.20.0. Quality control metrics were aggregated by MultiQC v1.11.
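The post-alignment filter described above can be sketched in a few lines. This is a minimal illustration, not the seq2science implementation: the function name `keep_alignment` and the example SAM records are hypothetical, and the ENCODE blacklist step is left out because it needs genomic intervals. The sketch only applies the two per-read criteria named in the text, a minimum mapping quality of 255 and no secondary alignments.

```python
# SAM flag bit that marks a secondary alignment (from the SAM specification).
SECONDARY = 0x100


def keep_alignment(sam_line, min_mapq=255):
    """Return True if a SAM record passes the mapping-quality and primary-alignment filters.

    Hypothetical helper for illustration; blacklist filtering is not covered here.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    mapq = int(fields[4])  # column 5: MAPQ
    if flag & SECONDARY:
        return False
    return mapq >= min_mapq


# Made-up records: unique primary hit, secondary hit, low-quality hit.
records = [
    "r1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "r2\t256\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t100\t3\t50M\t*\t0\t0\t*\t*",
]
kept = [line.split("\t")[0] for line in records if keep_alignment(line)]
print(kept)
```

In this example only `r1` is kept, since `r2` is flagged as secondary and `r3` has a mapping quality below 255.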

        Assembly stats

        Genome assembly GRCz11 consists of 993 contigs, with a GC-content of 36.65%, and 0.34% of it consists of the letter N. The N50-L50 stats are 54304671-11 and the N75-L75 stats are 48040578-18. The genome annotation contains 30954 genes.
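For readers unfamiliar with the N50-L50 notation: sorting contigs from longest to shortest and adding up their lengths, N50 is the length of the contig at which the running total first reaches 50% of the assembly, and L50 is how many contigs were needed to get there (N75-L75 uses 75%). The sketch below shows that calculation on made-up contig lengths; the function name `nx_lx` is hypothetical and this is not necessarily how the values in this report were computed.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths, e.g. fraction=0.5 gives (N50, L50)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until the requested fraction is covered.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= fraction * total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Made-up contig lengths, total 300.
contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(contigs, 0.5)
n75, l75 = nx_lx(contigs, 0.75)
print(f"N50-L50: {n50}-{l50}, N75-L75: {n75}-{l75}")
```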

        fastp

        fastp An ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...)

        Filtered Reads

        Filtering statistics of sampled reads.


        Duplication Rates

        Duplication rates of sampled reads.


        Insert Sizes

        Insert size estimation of sampled reads.


        Sequence Quality

        Average sequencing quality over each base of all reads.


        GC Content

        Average GC content over each base of all reads.


        N content

        Average N content over each base of all reads.


        Picard

        Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

        Insert Size

        Plot shows the number of reads at a given insert size. Reads with different orientations are summed.


        Mark Duplicates

        Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.

        The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.

        To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:

        • READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
        • READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
        • READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
        • READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
        • READS_UNMAPPED = UNMAPPED_READS
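
The doubling scheme above can be written out as a short function. This is a sketch of the arithmetic only: the function name `picard_read_categories` and the example numbers are made up, and the input is assumed to be a dictionary keyed by the Picard MarkDuplicates column names used in the list.

```python
def picard_read_categories(m):
    """Convert Picard MarkDuplicates metrics into per-read counts (pair counts doubled)."""
    dup_pairs = 2 * m["READ_PAIR_DUPLICATES"]
    optical = 2 * m["READ_PAIR_OPTICAL_DUPLICATES"]
    return {
        "READS_IN_DUPLICATE_PAIRS": dup_pairs,
        "READS_IN_UNIQUE_PAIRS": 2 * (m["READ_PAIRS_EXAMINED"] - m["READ_PAIR_DUPLICATES"]),
        "READS_IN_UNIQUE_UNPAIRED": m["UNPAIRED_READS_EXAMINED"] - m["UNPAIRED_READ_DUPLICATES"],
        "READS_IN_DUPLICATE_PAIRS_OPTICAL": optical,
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL": dup_pairs - optical,
        "READS_IN_DUPLICATE_UNPAIRED": m["UNPAIRED_READ_DUPLICATES"],
        "READS_UNMAPPED": m["UNMAPPED_READS"],
    }


# Made-up metrics for illustration only.
metrics = {
    "READ_PAIRS_EXAMINED": 1000,
    "READ_PAIR_DUPLICATES": 200,
    "READ_PAIR_OPTICAL_DUPLICATES": 50,
    "UNPAIRED_READS_EXAMINED": 100,
    "UNPAIRED_READ_DUPLICATES": 10,
    "UNMAPPED_READS": 30,
}
categories = picard_read_categories(metrics)
for name, value in categories.items():
    print(name, value)
```

With these numbers, the non-overlapping categories (unique pairs, unique unpaired, optical and non-optical duplicate pairs, duplicate unpaired, unmapped) add up to 2130 reads, which equals 2 * 1000 paired reads + 100 unpaired reads + 30 unmapped reads.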

        SamTools pre-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data.

        The pre-sieve statistics are quality metrics measured before applying the (optional) minimum mapping quality filter, blacklist removal, mitochondrial read removal, read length filtering, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).


        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.


        deepTools

        deepTools is a suite of tools to process and analyze deep sequencing data.

        PCA plot

        PCA plot with the top two principal components calculated based on genome-wide distribution of sequence reads


        Fingerprint plot

        Signal fingerprint according to plotFingerprint


        Strandedness

        The RSeQC package provides a number of useful modules that can comprehensively evaluate high-throughput RNA-seq data.

        Sequencing strandedness was inferred for the following samples, and was called if at least 60% of the sampled reads were explained by either sense (forward) or antisense (reverse).
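
The 60% rule can be expressed as a small decision function. This is a sketch under assumptions, not the pipeline's code: the function name `call_strandedness`, the "unstranded" label for samples that meet neither criterion, and the use of an inclusive threshold are all choices made for illustration.

```python
def call_strandedness(frac_sense, frac_antisense, threshold=0.6):
    """Call library strandedness from the fractions of reads explained by each orientation.

    Hypothetical helper; fractions are expected in the range 0..1.
    """
    if frac_sense >= threshold:
        return "forward"
    if frac_antisense >= threshold:
        return "reverse"
    # Assumed label for samples where neither orientation reaches the threshold.
    return "unstranded"


# Made-up fractions for three samples.
print(call_strandedness(0.95, 0.03))
print(call_strandedness(0.04, 0.92))
print(call_strandedness(0.48, 0.47))
```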

        Infer experiment

        Infer experiment counts the percentage of reads and read pairs that match the strandedness of overlapping transcripts. It can be used to infer whether RNA-seq library preps are stranded (sense or antisense).


        deepTools - Spearman correlation heatmap of reads in bins across the genome

        Spearman correlation plot generated by deeptools. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.


        deepTools - Pearson correlation heatmap of reads in bins across the genome

        Pearson correlation plot generated by deeptools. Pearson correlation is a parametric (lots of assumptions, e.g. normality and homoscedasticity) method, and assesses the linearity of the relationship.
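
The difference between the two correlation measures is easiest to see on a relationship that is monotonic but not linear. The sketch below is a plain-Python illustration with made-up values, not the deepTools implementation: Spearman is computed as the Pearson correlation of the ranks (ties receive their average rank), and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up bin counts: y grows with x, but quadratically rather than linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson:  {pearson(x, y):.4f}")
print(f"Spearman: {spearman(x, y):.4f}")
```

Here Spearman is 1 (up to floating-point rounding) because the ordering of the values agrees perfectly, while Pearson is slightly lower because the relationship is not a straight line.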


        dupRadar

        Figures generated by dupRadar. For help with interpretation, see: https://bioconductor.riken.jp/packages/3.4/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#plotting-and-interpretation


        DESeq2 - Sample distance cluster heatmap of counts

        Euclidean distance between samples, based on variance stabilizing transformed counts (RNA: expressed genes, ChIP: bound regions, ATAC: accessible regions). Gives us an overview of similarities and dissimilarities between samples.
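
As a rough illustration of what the heatmap encodes, the sketch below computes pairwise Euclidean distances between samples. It assumes the counts have already been variance-stabilized (the transformation itself is done by DESeq2 and is not reproduced here); the sample names, values, and the function name `distance_matrix` are made up. It relies on `math.dist`, which requires Python 3.8 or later.

```python
import math


def distance_matrix(samples):
    """Pairwise Euclidean distances between samples.

    `samples` maps a sample name to its vector of (already transformed) gene values.
    """
    names = list(samples)
    return {a: {b: math.dist(samples[a], samples[b]) for b in names} for a in names}


# Made-up transformed values for three samples and two genes.
transformed = {
    "sample_A": [0.0, 0.0],
    "sample_B": [3.0, 4.0],
    "sample_C": [3.0, 4.0],
}
distances = distance_matrix(transformed)
for name, row in distances.items():
    print(name, row)
```

Samples with identical profiles have distance 0, and larger values indicate more dissimilar expression profiles.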


        DESeq2 - Spearman correlation cluster heatmap of counts

        Correlation cluster heatmap based on variance stabilizing transformed counts. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.


        DESeq2 - Pearson correlation cluster heatmap of counts

        Correlation cluster heatmap based on variance stabilizing transformed counts. Pearson correlation is a parametric (lots of assumptions, e.g. normality and homoscedasticity) method, and assesses the linearity of the relationship.


        Samples & Config

        The samples file used for this run:

        sample assembly stage _trep descriptive_name
        GSM1600035 GRCz11 sperm GSM1600035 0hpf_E1
        GSM1600036 GRCz11 oocyte GSM1600036 0hpf_E2
        GSM1600037 GRCz11 1-cell GSM1600037 0hpf_E3
        GSM1600038 GRCz11 4-cell GSM1600038 1hpf_E4
        GSM1600039 GRCz11 16-cell GSM1600039 1.5hpf_E5
        GSM1600040 GRCz11 64-cell GSM1600040 2hpf_E6
        GSM1600041 GRCz11 128-cell GSM1600041 2.25hpf_E7
        GSM1600042 GRCz11 256-cell GSM1600042 2.5hpf_E8
        GSM1600043 GRCz11 1k-cell GSM1600043 3hpf_E9
        GSM1600044 GRCz11 Sphere GSM1600044 4hpf_E10
        GSM4252148 GRCz11 64-cell GSM4252148 2hpf_G1
        GSM4252149 GRCz11 64-cell GSM4252149 2hpf_G2
        GSM4252150 GRCz11 256-cell GSM4252150 2.5hpf_G3
        GSM4252151 GRCz11 256-cell GSM4252151 2.5hpf_G4
        GSM4252152 GRCz11 1000-cell GSM4252152 3hpf_G5
        GSM4252153 GRCz11 1000-cell GSM4252153 3hpf_G6
        GSM4252154 GRCz11 Dome GSM4252154 4.33hpf_G7
        GSM4252155 GRCz11 Dome GSM4252155 4.33hpf_G8
        GSM4252156 GRCz11 Shield GSM4252156 6hpf_G9
        GSM4252157 GRCz11 Shield GSM4252157 6hpf_G10
        GSM5136500 GRCz11 96hpf GSM5136500 96hpf_G11
        GSM5136501 GRCz11 96hpf GSM5136501 96hpf_G12
        GSM5136502 GRCz11 96hpf GSM5136502 96hpf_G13
        GSM5136503 GRCz11 97hpf GSM5136503 97hpf_G14
        GSM5136504 GRCz11 97hpf GSM5136504 97hpf_G15
        GSM5136505 GRCz11 97hpf GSM5136505 97hpf_G16
        GSM5136506 GRCz11 98hpf GSM5136506 98hpf_G17
        GSM5136507 GRCz11 98hpf GSM5136507 98hpf_G18
        GSM5136508 GRCz11 98hpf GSM5136508 98hpf_G19
        GSM5136509 GRCz11 102hpf GSM5136509 102hpf_G20
        GSM5136510 GRCz11 102hpf GSM5136510 102hpf_G21
        GSM5136511 GRCz11 102hpf GSM5136511 102hpf_G22
        GSM5136512 GRCz11 108hpf GSM5136512 108hpf_G23
        GSM5136513 GRCz11 108hpf GSM5136513 108hpf_G24
        GSM5136514 GRCz11 108hpf GSM5136514 108hpf_G25
        GSM5136515 GRCz11 132hpf GSM5136515 132hpf_G26
        GSM5136516 GRCz11 132hpf GSM5136516 132hpf_G27
        GSM5136517 GRCz11 132hpf GSM5136517 132hpf_G28

        The config file used for this run:
        # tab-separated file of the samples
        samples: samples_pt6.tsv
        
        # pipeline file locations
        result_dir: /bank/tdewijs/results_Dre/pt6  # where to store results
        genome_dir: /bank/tdewijs/genomes  # where to look for or download the genomes
        fastq_dir: /bank/tdewijs/fastq  # where to look for or download the fastqs
        
        
        # contact info for multiqc report and trackhub
        email: tessa.dewijs2@ru.nl
        
        # produce a UCSC trackhub?
        create_trackhub: true
        
        # how to handle replicates
        technical_replicates: merge    # change to "keep" to not combine them
        
        # which trimmer to use
        trimmer: fastp
        
        # which quantifier to use
        quantifier: salmon  # or salmon or featurecounts
        
        # which aligner to use (not used for the gene counts matrix if the quantifier is Salmon)
        aligner: star
        
        # filtering after alignment (not used for the gene counts matrix if the quantifier is Salmon)
        remove_blacklist: true
        min_mapping_quality: 255  # (only keep uniquely mapped reads from STAR alignments)
        only_primary_align: true
        remove_dups: false # keep duplicates (check dupRadar in the MultiQC)
        
        ## differential gene expression analysis
        #contrasts:
        #  - 'stage_2_1'