These samples were run by seq2science v0.5.4, a tool for easy preprocessing of NGS data.

Take a look at our docs for info about how to use this report to the fullest.

Workflow: rna-seq
Date: August 31, 2021
Project: ghe_2021
Contact E-mail: yourmail@here.com

Report generated on 2021-08-31, 20:01 based on data in:

/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day10_supernatant_DMSO_A72_S3.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day12_supernatant_DMSO_A74_S8.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotCorrelation/GRCh37-DESeq2_pearson_correlation_clustering_mqc.png
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day1_adherentcells_DMSO_A70_S14.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day12_adherent_DMSO_A61_S9.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day19_supernatant_DMSO_A63_S6.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day19_adherent_DMSO_A65_S2.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day10_supernatant_TA_A71_S7.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day12_adherent_TA_A60_S4.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day19_adherent_TA_A64_S1.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day10_supernatant_TA_A71_S7.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day12_supernatant_TA_A73_S10.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day12_adherent_TA_A60_S4.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day12_supernatant_TA_A73_S10.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day12_supernatant_DMSO_A74_S8.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day10_supernatant_TA_A71_S7.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day10_supernatant_TA_A71_S7.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day19_adherent_TA_A64_S1.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day1_adherentcells_DMSO_A70_S14.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day19_supernatant_DMSO_A63_S6.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day1_adherentcells_DMSO_A70_S14.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day12_supernatant_TA_A73_S10.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day12_supernatant_DMSO_A74_S8.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day10_supernatant_DMSO_A72_S3.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day19_supernatant_TA_A62_S5.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day12_adherent_TA_A60_S4.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotCorrelation/GRCh37-DESeq2_sample_distance_clustering_mqc.png
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day19_adherent_DMSO_A65_S2.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day12_adherent_TA_A60_S4.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day10_supernatant_DMSO_A72_S3.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day19_supernatant_TA_A62_S5.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day1_adherentcells_TA_A66_S11.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day19_adherent_DMSO_A65_S2.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/log/workflow_explanation_mqc.html
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day1_adherentcells_TA_A66_S11.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day1_adherentcells_DMSO_A70_S14.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotCorrelation/GRCh37-deepTools_spearman_correlation_clustering_mqc.png
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day1_adherentcells_TA_A66_S11.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day19_supernatant_DMSO_A63_S6.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day12_adherent_DMSO_A61_S9.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day12_adherent_DMSO_A61_S9.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day12_adherent_DMSO_A61_S9.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day10_supernatant_DMSO_A72_S3.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day12_supernatant_TA_A73_S10.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day12_supernatant_DMSO_A74_S8.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day19_supernatant_TA_A62_S5.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day19_supernatant_DMSO_A63_S6.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day10_supernatant_DMSO_A72_S3.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day19_adherent_TA_A64_S1.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day1_adherentcells_DMSO_A70_S14.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samplesconfig_mqc.html
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day1_adherentcells_TA_A66_S11.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/dupRadar/GRCh37-dupRadar_mqc.png
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day10_supernatant_TA_A71_S7.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day19_supernatant_TA_A62_S5.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day19_supernatant_TA_A62_S5.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day12_supernatant_DMSO_A74_S8.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/star/GRCh37-NP_Day12_supernatant_TA_A73_S10.samtools-coordinate-unsieved.bam.mtnucratiomtnuc.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day12_adherent_DMSO_A61_S9.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/trimming/NP_Day19_adherent_TA_A64_S1.fastp.json
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day1_adherentcells_TA_A66_S11.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day19_supernatant_DMSO_A63_S6.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotCorrelation/GRCh37-DESeq2_spearman_correlation_clustering_mqc.png
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/samtools_stats/star/GRCh37-NP_Day12_adherent_TA_A60_S4.samtools-coordinate.samtools_stats.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/markdup/GRCh37-NP_Day19_adherent_DMSO_A65_S2.samtools-coordinate.metrics.txt
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day19_adherent_TA_A64_S1.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/InsertSizeMetrics/GRCh37-NP_Day19_adherent_DMSO_A65_S2.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotCorrelation/GRCh37-deepTools_pearson_correlation_clustering_mqc.png
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotPCA/GRCh37.tsv
/ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/results/qc/plotFingerprint/GRCh37.tsv

General Statistics

Sample Name	% Duplication	% > Q30	Mb Q30 bases	GC content	% PF	% Adapter	Insert Size	Mean Insert Size	% Dups	Error rate	M Non-Primary	M Reads Mapped	% Mapped	% Proper Pairs	% MapQ 0 Reads	M Total seqs	MT genome coverage	Genome coverage	MT to Nuclear Ratio	M Genome reads	M MT genome reads
NP_Day10_supernatant_DMSO_A72_S3	17.4%	94.7%	1617.9	49.8%	95.9%	0.0%	237 bp	306 bp	22.8%	0.00%	8.0	39.5	100.0%	100.0%	0.5%	39.5	2196.5 X	0.6 X	3480.0	46.6	0.9
NP_Day10_supernatant_TA_A71_S7	18.5%	94.7%	1661.1	51.1%	95.0%	0.0%	238 bp	288 bp	26.7%	0.00%	14.0	40.6	100.0%	100.0%	0.4%	40.6	1754.8 X	0.7 X	2406.3	53.9	0.7
NP_Day12_adherent_DMSO_A61_S9	7.6%	94.7%	2529.6	49.3%	97.5%	0.0%	255 bp	321 bp	14.2%	0.00%	8.8	61.8	100.0%	100.0%	0.4%	61.8	3460.6 X	0.9 X	3694.1	69.2	1.4
NP_Day12_adherent_TA_A60_S4	9.6%	94.8%	1673.1	47.6%	96.6%	0.1%	271 bp	339 bp	14.7%	0.00%	4.9	40.7	100.0%	100.0%	0.4%	40.7	3252.5 X	0.6 X	5423.8	44.3	1.3
NP_Day12_supernatant_DMSO_A74_S8	11.8%	94.9%	1824.8	48.9%	97.5%	0.0%	246 bp	307 bp	18.1%	0.00%	7.2	44.7	100.0%	100.0%	0.4%	44.7	3462.2 X	0.7 X	5068.8	50.5	1.4
NP_Day12_supernatant_TA_A73_S10	6.6%	94.8%	1594.3	48.1%	97.0%	0.0%	253 bp	318 bp	12.8%	0.00%	6.0	38.9	100.0%	100.0%	0.4%	38.9	2722.1 X	0.6 X	4584.7	43.9	1.1
NP_Day19_adherent_DMSO_A65_S2	27.5%	94.6%	1750.0	51.7%	95.7%	0.0%	223 bp	268 bp	35.8%	0.00%	12.5	42.8	100.0%	100.0%	0.4%	42.8	1755.4 X	0.7 X	2378.1	54.6	0.7
NP_Day19_adherent_TA_A64_S1	32.0%	94.8%	1480.5	50.5%	96.2%	0.1%	194 bp	237 bp	37.6%	0.00%	4.8	36.1	100.0%	100.0%	0.5%	36.1	1096.6 X	0.5 X	2003.1	40.4	0.4
NP_Day19_supernatant_DMSO_A63_S6	5.3%	94.9%	1854.8	48.9%	96.8%	0.1%	209 bp	256 bp	9.8%	0.00%	6.8	45.0	100.0%	100.0%	0.6%	45.0	1979.0 X	0.7 X	2865.5	51.0	0.8
NP_Day19_supernatant_TA_A62_S5	10.0%	94.8%	1715.8	49.7%	96.8%	0.1%	246 bp	298 bp	14.3%	0.00%	7.0	41.8	100.0%	100.0%	0.5%	41.8	2051.5 X	0.6 X	3162.4	47.9	0.8
NP_Day1_adherentcells_DMSO_A70_S14	4.5%	94.7%	1976.0	51.2%	95.7%	0.0%	260 bp	331 bp	11.5%	0.00%	15.2	47.9	100.0%	100.0%	0.4%	47.9	2380.2 X	0.8 X	2827.6	62.2	0.9
NP_Day1_adherentcells_TA_A66_S11	7.3%	94.8%	1928.4	49.9%	96.5%	0.0%	250 bp	320 bp	12.1%	0.00%	9.1	46.4	100.0%	100.0%	0.5%	46.4	1817.2 X	0.7 X	2452.8	54.8	0.7

Group	Column	Description	ID	Scale
fastp	% Duplication	Duplication rate before filtering	`pct_duplication`	None
fastp	% > Q30	Percentage of reads > Q30 after filtering	`after_filtering_q30_rate`	None
fastp	Mb Q30 bases	Bases > Q30 after filtering (millions)	`after_filtering_q30_bases`	base_count
fastp	GC content	GC content after filtering	`after_filtering_gc_content`	None
fastp	% PF	Percent reads passing filter	`pct_surviving`	None
fastp	% Adapter	Percentage adapter-trimmed reads	`pct_adapter`	None
Picard	Insert Size	Median Insert Size, all read orientations (bp)	`summed_median`	None
Picard	Mean Insert Size	Mean Insert Size, all read orientations (bp)	`summed_mean`	None
Picard	% Dups	Mark Duplicates - Percent Duplication	`PERCENT_DUPLICATION`	None
SamTools pre-sieve	Error rate	Error rate: mismatches (NM) / bases mapped (CIGAR)	`error_rate`	None
SamTools pre-sieve	M Non-Primary	Non-primary alignments (millions)	`non-primary_alignments`	read_count
SamTools pre-sieve	M Reads Mapped	Reads Mapped in the bam file (millions)	`reads_mapped`	read_count
SamTools pre-sieve	% Mapped	% Mapped Reads	`reads_mapped_percent`	None
SamTools pre-sieve	% Proper Pairs	% Properly Paired Reads	`reads_properly_paired_percent`	None
SamTools pre-sieve	% MapQ 0 Reads	% of Reads that are Ambiguously Placed (MapQ=0)	`reads_MQ0_percent`	None
SamTools pre-sieve	M Total seqs	Total sequences in the bam file (millions)	`raw_total_sequences`	read_count
mtnucratio	MT genome coverage	Average coverage (X) on mitochondrial genome.	`mt_cov_avg`	None
mtnucratio	Genome coverage	Average coverage (X) on nuclear genome.	`nuc_cov_avg`	None
mtnucratio	MT to Nuclear Ratio	Mitochondrial to nuclear reads ratio (MTNUC)	`mt_nuc_ratio`	None
mtnucratio	M Genome reads	Reads on the nuclear genome (millions)	`nucreads`	read_count
mtnucratio	M MT genome reads	Reads on the mitochondrial genome (millions)	`mtreads`	read_count
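
As a quick sanity check on the mtnucratio columns, the rounded values in the General Statistics table appear consistent with the MT to Nuclear Ratio being the ratio of the two average coverages (`mt_cov_avg / nuc_cov_avg`). The sketch below only checks that observation on four rows copied from the table; it is not a statement of how mtnucratio computes the value internally, and the variable names are illustrative.

```python
# Rows copied from the General Statistics table:
# (sample, MT genome coverage, Genome coverage, MT to Nuclear Ratio)
rows = [
    ("NP_Day10_supernatant_DMSO_A72_S3", 2196.5, 0.6, 3480.0),
    ("NP_Day10_supernatant_TA_A71_S7", 1754.8, 0.7, 2406.3),
    ("NP_Day12_adherent_DMSO_A61_S9", 3460.6, 0.9, 3694.1),
    ("NP_Day19_adherent_TA_A64_S1", 1096.6, 0.5, 2003.1),
]

# If ratio = mt_cov / nuc_cov, then mt_cov / ratio should reproduce
# the reported nuclear coverage after rounding to one decimal.
implied_nuc_cov = {
    sample: round(mt_cov / ratio, 1) for sample, mt_cov, _, ratio in rows
}

for sample, _, nuc_cov, _ in rows:
    print(sample, "reported:", nuc_cov, "implied:", implied_nuc_cov[sample])
```

Because the nuclear coverage is shown with a single decimal, dividing the two displayed coverages directly gives only a rough ratio; back-calculating the coverage from the ratio is the more stable check.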

Workflow explanation

Preprocessing of reads was done automatically with the workflow tool seq2science v0.5.4. Paired-end reads were trimmed with fastp v0.20.1 with default options. Genome assembly GRCh37 was downloaded with genomepy 0.9.3. Reads were aligned with STAR v2.7.6a with default options. Mapped reads were removed if they had a mapping quality below 255, were a (secondary) multimapper, or aligned inside the ENCODE blacklist. Transcript abundances were quantified with Salmon v1.3.0 with options '--seqBias --gcBias --validateMappings --recoverOrphans'. General alignment statistics were collected by samtools stats v1.11. Transcript abundance estimations were aggregated and converted to gene counts using tximeta v1.6.3. Afterwards, duplicate reads were marked with Picard MarkDuplicates v2.23.8. Differential gene expression analysis was performed using DESeq2 v1.30.1. To adjust for multiple testing, the (default) Benjamini-Hochberg procedure was performed with an FDR cutoff of 0.1 (the default). Counts were log transformed using the (default) shrinkage estimator apeglm v1.12.0. deepTools v3.5.0 was used for the fingerprint, profile, correlation and dendrogram/heatmap plots, where the heatmap was made with options '--distanceBetweenBins 9000 --binSize 1000'. Sample sequencing strandedness was inferred using RSeQC v4.0.0 in order to improve quantification accuracy. RNA-seq read duplication types were analyzed using dupRadar v1.20.0. The UCSC genome browser was used to visualize and inspect alignments. Quality control metrics were aggregated by MultiQC v1.11.
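
To make the multiple-testing step more concrete, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment with the 0.1 FDR cutoff mentioned above. The p-values are made up for illustration and the function name is hypothetical; DESeq2 performs this adjustment internally (together with other steps such as independent filtering), so this is not the pipeline's actual code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
pvalues = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(pvalues)

FDR_CUTOFF = 0.1  # the cutoff reported in this workflow explanation
significant = [p <= FDR_CUTOFF for p in adjusted]
print(adjusted, significant)
```

Each raw p-value is scaled by `n / rank`, and the running minimum ensures that a gene with a smaller raw p-value never ends up with a larger adjusted value than one ranked after it.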

fastp

fastp is an ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...).

Filtered Reads

Filtering statistics of sampled reads.

Duplication Rates

Duplication rates of sampled reads.

Insert Sizes

Insert size estimation of sampled reads.

Sequence Quality

Average sequencing quality over each base of all reads.

GC Content

Average GC content over each base of all reads.

N content

Average N content over each base of all reads.

Picard

Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

Insert Size

Plot shows the number of reads at a given insert size. Reads with different orientations are summed.

Mark Duplicates

Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.

The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.

To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:

READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
READS_UNMAPPED = UNMAPPED_READS
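
The doubling scheme above can be sketched in a few lines of Python. The input values below are invented for illustration (they are not taken from this run), and the function name is hypothetical; only the metric names and the formulas come from the scheme itself.

```python
def reads_by_duplication_state(m):
    """Convert Picard MarkDuplicates metrics (dict) into per-read counts."""
    out = {}
    out["READS_IN_DUPLICATE_PAIRS"] = 2 * m["READ_PAIR_DUPLICATES"]
    out["READS_IN_UNIQUE_PAIRS"] = 2 * (
        m["READ_PAIRS_EXAMINED"] - m["READ_PAIR_DUPLICATES"]
    )
    out["READS_IN_UNIQUE_UNPAIRED"] = (
        m["UNPAIRED_READS_EXAMINED"] - m["UNPAIRED_READ_DUPLICATES"]
    )
    out["READS_IN_DUPLICATE_PAIRS_OPTICAL"] = 2 * m["READ_PAIR_OPTICAL_DUPLICATES"]
    out["READS_IN_DUPLICATE_PAIRS_NONOPTICAL"] = (
        out["READS_IN_DUPLICATE_PAIRS"] - out["READS_IN_DUPLICATE_PAIRS_OPTICAL"]
    )
    out["READS_IN_DUPLICATE_UNPAIRED"] = m["UNPAIRED_READ_DUPLICATES"]
    out["READS_UNMAPPED"] = m["UNMAPPED_READS"]
    return out


# Hypothetical metrics for a single sample.
metrics = {
    "READ_PAIRS_EXAMINED": 1000,
    "READ_PAIR_DUPLICATES": 150,
    "READ_PAIR_OPTICAL_DUPLICATES": 20,
    "UNPAIRED_READS_EXAMINED": 50,
    "UNPAIRED_READ_DUPLICATES": 5,
    "UNMAPPED_READS": 30,
}
counts = reads_by_duplication_state(metrics)

# The plotted categories exclude READS_IN_DUPLICATE_PAIRS, which is
# already split into its optical and non-optical parts.
plotted_total = sum(
    v for k, v in counts.items() if k != "READS_IN_DUPLICATE_PAIRS"
)
print(counts, plotted_total)
```

With the pair-based values doubled, the plotted categories add up to the total number of reads: twice the examined pairs, plus the examined unpaired reads, plus the unmapped reads.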

SamTools pre-sieve

Samtools is a suite of programs for interacting with high-throughput sequencing data.

The pre-sieve statistics are quality metrics measured before applying the (optional) minimum mapping quality filter, blacklist removal, mitochondrial read removal, read length filtering, and Tn5 shift.

Percent Mapped

Alignment metrics from samtools stats; mapped vs. unmapped reads.

For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).

Alignment metrics

This module parses the output from samtools stats. All numbers in millions.

deepTools

deepTools is a suite of tools to process and analyze deep sequencing data.

PCA plot

PCA plot with the top two principal components calculated based on genome-wide distribution of sequence reads

Fingerprint plot

Signal fingerprint according to plotFingerprint

deepTools - Spearman correlation heatmap of reads in bins across the genome

Spearman correlation plot generated by deeptools. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.

deepTools - Pearson correlation heatmap of reads in bins across the genome

Pearson correlation plot generated by deeptools. Pearson correlation is a parametric (lots of assumptions, e.g. normality and homoscedasticity) method, and assesses the linearity of the relationship.
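
The difference between the two correlation measures can be illustrated with a small pure-Python sketch. The data are synthetic (a monotonic but non-linear relationship) and the helper names are illustrative; deepTools computes its correlations on binned read counts with its own implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic example: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

Spearman reaches 1 here because the ordering of the values is perfectly preserved, while Pearson stays below 1 because the relationship is not a straight line.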

dupRadar

Figures generated by [dupRadar](https://bioconductor.riken.jp/packages/3.4/bioc/vignettes/dupRadar/inst/doc/dupRadar.html#plotting-and-interpretation). Click the link for help with interpretation.

DESeq2 - Sample distance cluster heatmap of counts

Euclidean distance between samples, based on variance stabilizing transformed counts (RNA: expressed genes, ChIP: bound regions, ATAC: accessible regions). Gives us an overview of similarities and dissimilarities between samples.
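
As a rough illustration of what the sample distance heatmap is built from, the sketch below computes a pairwise Euclidean distance matrix between samples. The sample names and transformed count values are invented for illustration; in the actual workflow the input would be the variance stabilizing transformed counts produced with DESeq2.

```python
import math

# Hypothetical transformed counts: one vector of gene values per sample.
transformed_counts = {
    "sample_a": [1.0, 2.0, 3.0],
    "sample_b": [1.0, 2.0, 3.0],
    "sample_c": [4.0, 6.0, 3.0],
}

names = list(transformed_counts)

# Symmetric matrix of Euclidean distances between every pair of samples.
distance_matrix = [
    [math.dist(transformed_counts[a], transformed_counts[b]) for b in names]
    for a in names
]

for name, row in zip(names, distance_matrix):
    print(name, row)
```

Samples with identical profiles have distance 0, and larger values indicate more dissimilar expression profiles; the heatmap then clusters samples on these distances.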

DESeq2 - Spearman correlation cluster heatmap of counts

Correlation cluster heatmap based on variance stabilizing transformed counts. Spearman correlation is a non-parametric (distribution-free) method, and assesses the monotonicity of the relationship.

DESeq2 - Pearson correlation cluster heatmap of counts

Correlation cluster heatmap based on variance stabilizing transformed counts. Pearson correlation is a parametric (lots of assumptions, e.g. normality and homoscedasticity) method, and assesses the linearity of the relationship.

Samples & Config

The samples file used for this run:

sample	assembly	day	fraction	compound	descriptive_name	supernatant	adherent	day19
NP_Day1_adherentcells_TA_A66_S11	GRCh37	1	adherent	TA	Day1_adherent_TA
NP_Day1_adherentcells_DMSO_A70_S14	GRCh37	1	adherent	DMSO	Day1_adherent_DMSO
NP_Day10_supernatant_TA_A71_S7	GRCh37	10	supernatant	TA	Day10_supernatant_TA	TA
NP_Day10_supernatant_DMSO_A72_S3	GRCh37	10	supernatant	DMSO	Day10_supernatant_DMSO	DMSO
NP_Day12_supernatant_TA_A73_S10	GRCh37	12	supernatant	TA	Day12_supernatant_TA	TA
NP_Day12_supernatant_DMSO_A74_S8	GRCh37	12	supernatant	DMSO	Day12_supernatant_DMSO	DMSO
NP_Day12_adherent_TA_A60_S4	GRCh37	12	adherent	TA	Day12_adherent_TA		TA
NP_Day12_adherent_DMSO_A61_S9	GRCh37	12	adherent	DMSO	Day12_adherent_DMSO		DMSO
NP_Day19_supernatant_TA_A62_S5	GRCh37	19	supernatant	TA	Day19_supernatant_TA		TA	supernatant
NP_Day19_supernatant_DMSO_A63_S6	GRCh37	19	supernatant	DMSO	Day19_supernatant_DMSO		DMSO	supernatant
NP_Day19_adherent_TA_A64_S1	GRCh37	19	adherent	TA	Day19_adherent_TA			adherent
NP_Day19_adherent_DMSO_A65_S2	GRCh37	19	adherent	DMSO	Day19_adherent_DMSO			adherent

The config file used for this run:

# tab-separated file of the samples
samples: samples.tsv

# pipeline file locations
result_dir: ./results  # where to store results
genome_dir: /ceph/rimlsfnwi/web_share/mbdata/siebrenf/GRCh37/
fastq_dir: /ceph/rimlsfnwi/data/moldevbio/heeringen/siebrenf/ghe_2021/fastq/


# contact info for multiqc report and trackhub
email: yourmail@here.com

# produce a UCSC trackhub?
create_trackhub: true

# how to handle replicates
technical_replicates: merge    # change to "keep" to not combine them

# which trimmer to use
trimmer: fastp

# which quantifier to use
quantifier: salmon  # or htseq or featurecounts

# which aligner to use (not used for the gene counts matrix if the quantifier is Salmon)
aligner: star

# filtering after alignment (not used for the gene counts matrix if the quantifier is Salmon)
markduplicates: -Xms4G -Xmx6G MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=999  # keep duplicates (check dupRadar in the MultiQC)
remove_blacklist: true
min_mapping_quality: 255  # (only keep uniquely mapped reads from STAR alignments)
only_primary_align: true

# differential gene expression analysis
contrasts:
  - 'supernatant_TA_DMSO'
  - 'adherent_TA_DMSO'
  - 'day19_adherent_supernatant'
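
The contrast names above line up with the last three columns of the samples file (`supernatant`, `adherent`, `day19`) and the values in them. The sketch below splits a contrast string on that basis, assuming the form `column_target_reference` with the second field as the target group and the third as the reference. The `Contrast` type and `parse_contrast` function are hypothetical, and seq2science's own parsing may handle more cases (for example batch terms).

```python
from typing import NamedTuple


class Contrast(NamedTuple):
    column: str     # column in the samples file
    target: str     # assumed: group of interest
    reference: str  # assumed: baseline group


def parse_contrast(contrast: str) -> Contrast:
    """Split 'column_target_reference' from the right, so that a
    column name containing underscores stays intact."""
    column, target, reference = contrast.rsplit("_", 2)
    return Contrast(column, target, reference)


contrasts = [
    "supernatant_TA_DMSO",
    "adherent_TA_DMSO",
    "day19_adherent_supernatant",
]
parsed = [parse_contrast(c) for c in contrasts]
for c in parsed:
    print(c)
```

Under this reading, samples with an empty value in the contrast column (for example the Day1 samples in the `supernatant` column) would simply not take part in that comparison.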
