Sort Order:
(Applies to Sample Metrics)

Precursor Evidence Heatmap (Top 50 Most Variable)

Cumulative Detection Curve

Sample Clustering by Detection Pattern (Jaccard Distance)

Precursor Count per Protein (per Sample)
No TIC data available. Extract from the Search tab before searching, or use the button below to extract from an existing output directory.
Color by:
Viewing Comparison:
Tip: Assign experimental groups (required). Covariate columns are optional - customize names and include in model as needed.
Export
Optional covariates
In model
Column name (click to rename)
Tick “In model” to adjust DE for that factor (e.g., batch effects). The text box renames the column — fill values for each sample in the table below.
Viewing Comparison:
CSV
Per-Group Replicate Statistics
Contaminant Protein Analysis
Per-Sample Contaminant Intensity
Top Contaminant Proteins
Contaminant Expression Heatmap (Top 20)
Viewing Comparison:
💾 Export Full Table

Data Explorer

Export for Claude
Abundance Profiles (Quartile Analysis)
Proteins split into quartiles by average intensity. Colors show per-sample quartile assignment. Proteins that change quartile across samples may be biologically interesting.
Variable Proteins (Quartile Range >= 2)
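The quartile analysis above can be sketched as follows. This is a minimal illustration, not the app's own code: it assumes each sample is cut into quartiles at its own cut points, and that "quartile range" means the highest minus the lowest quartile a protein reaches across samples. The protein names and intensities are hypothetical.

```python
from statistics import quantiles

def quartile_labels(values):
    """Map each value to a quartile 1..4 using that sample's own cut points."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in values]

# Hypothetical log2 intensities: one row per protein, one column per sample.
intensity = {
    "P1": [10.0, 10.2, 18.0],
    "P2": [12.0, 12.1, 12.3],
    "P3": [14.0, 14.2, 14.1],
    "P4": [16.0, 15.8, 16.1],
    "P5": [18.0, 18.3, 10.0],
}
proteins = list(intensity)
n_samples = 3

# Quartile assignment per sample (one list of labels per column).
per_sample = [
    quartile_labels([intensity[p][j] for p in proteins]) for j in range(n_samples)
]

# Quartile range per protein: highest minus lowest quartile across samples.
quartile_range = {
    p: max(col[i] for col in per_sample) - min(col[i] for col in per_sample)
    for i, p in enumerate(proteins)
}

# Proteins flagged as variable (assumed cutoff: range >= 2).
variable = [p for p, r in quartile_range.items() if r >= 2]
```

Here P1 and P5 swap between the bottom and top quartile in the third sample, so both are flagged, while P2 to P4 stay put.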
Sample-Sample Scatter

AI-Powered Analysis Summary

How it works: Click the button below to generate a comprehensive AI-powered analysis across all comparisons in your experiment.

The AI will identify key DE proteins per comparison, cross-comparison biomarkers, and provide biological insights on high-confidence candidates.

Viewing Comparison:
Heatmap of Selected/Top Proteins
PNG SVG
💾 Export
Color by:
Axes:
PNG
CSV

Distribution of Coefficient of Variation (CV) for significant proteins, broken down by experimental group.
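For reference, a per-group CV can be computed as in the sketch below. It assumes the CV is taken on linear-scale (not log-transformed) intensities and uses the sample standard deviation; the group names and values are hypothetical, and the app's exact calculation may differ.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation in percent: 100 * sd / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical linear-scale intensities for one protein, split by group.
groups = {
    "Control": [90.0, 100.0, 110.0],
    "Treated": [100.0, 100.0, 100.0],
}
cv_by_group = {name: cv_percent(vals) for name, vals in groups.items()}
```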

PNG
Proteins detected in one group AND completely missing from the other have no finite logFC, so limma silently drops them from the volcano. They're listed here as presence/absence calls. Most relevant under the MaxLFQ + limma pipeline (DPC-Quant fills missing values, so this list will normally be empty there).
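A presence/absence call of this kind can be sketched as below. The function name, the data layout, and the min_detected parameter are assumptions made for illustration; missing values are represented as None or NaN.

```python
import math

def presence_absence_calls(matrix, groups, group_a, group_b, min_detected=1):
    """Return proteins seen in one group and entirely missing from the other.

    matrix: {protein: {sample: intensity, or None/NaN when missing}}
    groups: {group name: [sample names]}
    min_detected is a hypothetical knob: detections required in the "present" group.
    """
    def n_detected(row, group):
        return sum(
            1 for s in groups[group]
            if row.get(s) is not None and not math.isnan(row[s])
        )

    calls = {}
    for protein, row in matrix.items():
        a, b = n_detected(row, group_a), n_detected(row, group_b)
        if a >= min_detected and b == 0:
            calls[protein] = f"{group_a} only"
        elif b >= min_detected and a == 0:
            calls[protein] = f"{group_b} only"
    return calls

# Hypothetical example data.
groups = {"A": ["A1", "A2"], "B": ["B1", "B2"]}
matrix = {
    "P1": {"A1": 20.1, "A2": 19.8, "B1": None, "B2": None},
    "P2": {"A1": 18.0, "A2": 18.2, "B1": 17.9, "B2": 18.1},
    "P3": {"A1": None, "A2": float("nan"), "B1": 18.0, "B2": None},
}
calls = presence_absence_calls(matrix, groups, "A", "B")
```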
CSV


Enrichment analysis on DE results. Auto-detects organism. Results cached per ontology.

PNG
Configure Comparison
Or upload individual files...
Spectronaut Setup Guide: How to export from Spectronaut
Layers 1-3 only. Run FragPipe-Analyst for full DE comparison.
Attach DIA-NN log files (optional — fills in search parameters)
Upload report_log.txt or the SLURM .out file from each DIA-NN search. Only the command line and summary stats are read.
Protein Details
AI-Powered Comparison Analysis

Generate an AI narrative summary or export data for external analysis.

Export ZIP for Claude Analysis

MOFA2 Factor Decomposition

Treats Run A and Run B as two views of the same samples and decomposes joint variance into shared and run-specific factors.

Chat with Full Data (QC + Expression)


Note: QC Stats (with Groups) + Top 800 Expression Data are sent to AI.

De Novo Search

Standard tryptic database search + Casanovo de novo sequencing + DIAMOND BLAST. Use this for protein discovery in any species — including ancient or non-model organisms where Casanovo's novel peptides + BLAST cascade are the value-add.

  • Enzyme: Trypsin/P, 7–50 AA
  • Variable mods: ox(M), N-term acetyl
  • Downstream: species attribution, BLAST alignments, coverage maps, deamidation tracking

Peptidomics — endogenous peptides

Nonspecific search for endogenous peptides (no enzymatic digestion). Use this for neuropeptides, milk peptides, antimicrobial peptides, or any analysis where peptides arrive in the MS already-cleaved by endogenous proteases.

  • Enzyme: none (cleave_at = ""), 5–25 AA, 400–5000 Da
  • Variable mods: ox(M), pyro-Glu (Q/E N-term), C-term amidation, N-term acetyl
  • Downstream: peptide-source-protein chart, N-/C-term cleavage motifs, PTM landscape

Nonspecific search is ~10–50× slower than tryptic. Walltime auto-bumped to 8 h.

HLA / MHC — immunopeptidomics

MHC class I and II peptide identification. Nonspecific search with class-specific length and charge windows. Use this for immunopeptidome studies, neoantigen discovery, or vaccine target ID.

  • Enzyme: none (cleave_at = "")
  • Variable mods: ox(M), deamidation (N/Q)
  • Charge range: 1–3 (z=1 dominant on TOF instruments)
  • Downstream: length histogram, P2/PΩ anchor logos, source-protein analysis

Load DDA / de novo results by uploading a shared ZIP

Casanovo peptide score = (product of per-residue scores), minus 1 if the peptide fails precursor-mass closure. A negative score therefore means the reconstruction as a whole does not account for the full precursor mass, NOT that the matched residues are wrong (decoy-validated: hits with score < 0 still match references 99% at their confident residues). The default threshold of −1 shows all peptides; raise it only to inspect mass-complete reconstructions.
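The scoring rule as stated here can be written out in a few lines. This is an illustration of the formula in the text, not Casanovo's source code; the function name and arguments are hypothetical.

```python
import math

def peptide_score(residue_scores, mass_closure_ok):
    """Product of per-residue scores, minus 1 when precursor-mass closure fails."""
    score = math.prod(residue_scores)
    return score if mass_closure_ok else score - 1.0

# Same residue-level confidence, with and without mass closure.
closed = peptide_score([0.9, 0.8], mass_closure_ok=True)    # positive score
unclosed = peptide_score([0.9, 0.8], mass_closure_ok=False)  # shifted below zero

# A threshold of -1 keeps both; a threshold of 0 keeps only mass-complete peptides.
kept_at_default = [s for s in (closed, unclosed) if s >= -1.0]
kept_at_zero = [s for s in (closed, unclosed) if s >= 0.0]
```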
Manuscript Summary Statistics (Table 1)
Mass spec files:
Clear (show all)
View:
Contaminants:
Protein filter:
Skin/hair = keratin family; opt-in, off by default.
Peptide length distribution
Tryptic peptides typically span 7–25 aa.
HLA class I shows a sharp peak at 9; class II at 13–15; peptidomics is broad 5–25.

HLA anchor residue frequencies
P2 + PΩ are the dominant anchor positions for MHC-I. Allele preferences fingerprint the donor's HLA type (e.g. A*02:01 → L at P2 + L/V at PΩ).

Cleavage flanking residues
N- and C-terminal residue percentages — fingerprints the endogenous protease activity that produced these peptides. High C-term K/R = trypsin contamination.
Rows = unique peptides; columns = source files; cells = PSM count. Peptides with high total in one file but 0 in others are candidate condition-specific. Sort/filter on the column toolbar; use Copy/CSV/Excel to export.
BLASTs novel peptides against UniProt SwissProt + TrEMBL (SwissProt first, then TrEMBL on misses) on HPC.
Top Proteins by De Novo Peptide Count
Taxonomic Coverage

Identity of each peptide across the top species, grouped by source protein. Reveals patterns like conserved vs species-specific protein regions.

Show full peptide-species identity matrix
Peptide Length Distribution
Charge State Distribution
Post-Translational Modifications

Modification analysis from de novo sequences. In paleoproteomics, high N-deamidation with low Q-deamidation indicates genuine ancient protein (time-dependent asparagine degradation).

Select a near-match peptide from the table below, then click Show Alignment to visualize mismatches with per-residue confidence. Green = genuine variant (AA score > 0.95), Red = possible sequencing error (AA score < 0.70).

This view cross-references BLAST mismatches with Casanovo's per-residue amino acid confidence scores to distinguish species-specific markers from sequencing artifacts.
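The colour rule above can be sketched as a simple classifier. It assumes an ungapped alignment of equal length and labels the band between the two thresholds "uncertain", which is an assumption (the text only defines the green and red cases). The sequences and scores are hypothetical.

```python
def classify_mismatches(query, reference, aa_scores, hi=0.95, lo=0.70):
    """Label each mismatched position by the de novo per-residue confidence."""
    if not (len(query) == len(reference) == len(aa_scores)):
        raise ValueError("sketch assumes an ungapped alignment of equal length")
    calls = []
    for pos, (q, r, score) in enumerate(zip(query, reference, aa_scores), start=1):
        if q == r:
            continue
        if score > hi:
            label = "genuine variant"
        elif score < lo:
            label = "possible sequencing error"
        else:
            label = "uncertain"  # assumed label for scores between lo and hi
        calls.append((pos, r, q, score, label))
    return calls

# Hypothetical de novo peptide vs. its near-match reference.
query = "ACDEFGHK"
reference = "ACDQFGYK"
scores = [0.99, 0.99, 0.99, 0.98, 0.99, 0.99, 0.55, 0.99]
calls = classify_mismatches(query, reference, scores)
```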

How we put an FDR on a de novo species call

De novo peptides are BLASTed against NCBI nr to identify the species — but what is the false-discovery rate when the organism may be absent from the database? We adapt the NovoBoard decoy-spectra method (Tran et al. 2024) to the homology search. Here is the whole process, end to end.

1 Build a decoy spectrum

For every real MS/MS spectrum we keep ~20% of its peaks and replace the other ~80% with peaks drawn at random from the global peak pool — keeping the precursor m/z and charge. The result looks like a real spectrum but encodes no real peptide.
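A minimal sketch of this step, assuming a spectrum is a dict with a precursor m/z, a charge, and a list of (m/z, intensity) peaks. The field names, the function name, and the exact rounding of the 20% are assumptions; the actual implementation may sample differently.

```python
import random

def make_decoy_spectrum(spectrum, peak_pool, keep_fraction=0.2, rng=None):
    """Keep ~keep_fraction of the real peaks and fill the rest from a global pool."""
    if rng is None:
        rng = random.Random()
    peaks = spectrum["peaks"]
    n_keep = round(len(peaks) * keep_fraction)
    kept = rng.sample(peaks, n_keep)
    drawn = [rng.choice(peak_pool) for _ in range(len(peaks) - n_keep)]
    return {
        # Precursor m/z and charge are carried over unchanged.
        "precursor_mz": spectrum["precursor_mz"],
        "charge": spectrum["charge"],
        "peaks": sorted(kept + drawn),
    }

# Hypothetical spectrum and global peak pool.
real = {
    "precursor_mz": 512.27,
    "charge": 2,
    "peaks": [(100.0 + 10 * i, 1.0) for i in range(10)],
}
pool = [(1000.0 + i, 0.5) for i in range(200)]
decoy = make_decoy_spectrum(real, pool, rng=random.Random(0))
```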

2 Sequence real and decoy spectra with Casanovo

The identical Casanovo model and settings run on both sets, giving a population of real de novo peptides and a matched population of decoy de novo peptides — a true 1:1 null with the same number of queries.

3 BLAST both against nr, then compete: FDR = decoy ÷ real

Both populations are BLASTed against nr. Real peptides hit far more often than decoy peptides at every Casanovo score; the cumulative decoy ÷ real hit ratio is the FDR. Worked example below (ocelot, Leopardus pardalis, 9 runs):
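In code, the cumulative ratio can be sketched as below. The inputs are the Casanovo scores of peptides that produced at least one BLAST hit, for the real and decoy populations; because the two populations have the same number of queries, no scaling factor is applied. The function name and the example scores are hypothetical and are not taken from the ocelot data.

```python
def fdr_by_score(real_hit_scores, decoy_hit_scores, thresholds):
    """Cumulative decoy / real BLAST-hit ratio at each score threshold."""
    fdr = {}
    for t in thresholds:
        real = sum(s >= t for s in real_hit_scores)
        decoy = sum(s >= t for s in decoy_hit_scores)
        fdr[t] = decoy / real if real else None  # undefined with no real hits
    return fdr

# Hypothetical scores of peptides with >= 1 BLAST hit.
real_hits = [0.99, 0.97, 0.95, 0.90, 0.80, 0.60]
decoy_hits = [0.70, 0.50]
fdr = fdr_by_score(real_hits, decoy_hits, thresholds=[0.9, 0.5])
```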

! A clean FDR is necessary but NOT sufficient for a species call

The decoy-spectra FDR only controls chance homology. Two further errors need their own controls: (1) de novo sequence error: gate with the Casanovo confidence slider (≈21% error at conf ≥0.95, ≈12% at ≥0.99); (2) species mis-assignment from conserved peptides: use the Species (LCA) pane (lowest common ancestor across all hits, not the single best hit). Full validation: docs/DENOVO_FDR_VALIDATION.md.


Live view — your loaded dataset
Hit rate = % of unique de novo peptides in each Casanovo-score bin with ≥1 nr BLAST hit (per peptide, e-value ≤1). Loads denovo/blast_results_decoy_spectra.tsv (falls back to the legacy shuffled decoy).

Lowest-common-ancestor species attribution from the nr BLAST. Each de novo peptide is placed at the deepest taxon shared by its top hits: species/genus = diagnostic, family+ = conserved (not species-attributed), bacterial/viral = microbiome.
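The placement rule can be illustrated with a small sketch. It assumes each top hit comes with a root-to-leaf lineage whose ranks line up position by position; real taxonomy lineages may need rank alignment first. The function name is hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Deepest taxon shared by all lineages (each a root-to-leaf list of names)."""
    lca = None
    for taxa_at_rank in zip(*lineages):
        if len(set(taxa_at_rank)) != 1:
            break
        lca = taxa_at_rank[0]
    return lca

ocelot = ["Eukaryota", "Mammalia", "Carnivora", "Felidae", "Leopardus", "Leopardus pardalis"]
cat = ["Eukaryota", "Mammalia", "Carnivora", "Felidae", "Felis", "Felis catus"]

# Hits to two felids agree only down to family: conserved, not species-attributed.
shared = lowest_common_ancestor([ocelot, cat])

# A peptide whose top hits are all ocelot resolves to species: diagnostic.
diagnostic = lowest_common_ancestor([ocelot, ocelot])
```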

Per-peptide LCA

One row per de novo peptide combining all three evidence streams: Casanovo confidence, whether Sage found it in the database, and the nr BLAST species/clade (LCA). Rows below the confidence slider setting (top of the page) are hidden: low-confidence calls are excluded by default, but every peptide is one slider-click away.

The de novo peptides assembled into proteins by the same parsimony model FragPipe (ProteinProphet) and IDPicker use — the minimal protein set explaining the peptides, with razor peptide assignment and indistinguishable-protein grouping. Click a protein to pop out a per-residue coverage map (reference vs de novo, full-screen-able) that colours amino-acid substitutions by Casanovo confidence. Honours the confidence slider above.
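The idea behind parsimony inference can be sketched with a greedy set cover. This is an approximation for illustration only, not the ProteinProphet or IDPicker algorithm: proteins with identical peptide sets are grouped as indistinguishable, groups are picked by how many still-unexplained peptides they cover, and each shared peptide is assigned (razor-style) to the first group picked. The function name and data are hypothetical.

```python
def parsimony(protein_to_peptides):
    """Greedy minimal protein set: returns [(protein group, razor peptides), ...]."""
    # Indistinguishable proteins share exactly the same peptide set.
    groups = {}
    for protein, peptides in protein_to_peptides.items():
        groups.setdefault(frozenset(peptides), []).append(protein)

    remaining = set().union(*groups)  # peptides not yet explained
    selected = []
    while remaining:
        # Pick the group explaining the most unexplained peptides.
        best = max(groups, key=lambda peps: len(peps & remaining))
        selected.append((sorted(groups[best]), sorted(best & remaining)))
        remaining -= best
    return selected

# Hypothetical mapping: A and C are indistinguishable, D is subsumed by B.
mapping = {
    "A": {"p1", "p2", "p3"},
    "B": {"p3", "p4"},
    "C": {"p1", "p2", "p3"},
    "D": {"p4"},
}
result = parsimony(mapping)
```

Here the shared peptide p3 goes to the A/C group, B is kept because it alone explains p4, and D is dropped as redundant.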

Export Complete Analysis

Download everything needed to reproduce and share this analysis. Includes all data files, DIA-NN search parameters, and session state.

What's included (click to expand)
  • expression_matrix.csv -- Normalized protein intensities (pipeline-aware: DPC-Quant complete, or MaxLFQ with NAs)
  • DE_Results_Full.csv -- All contrasts × all proteins with logFC, P.Value, adj.P.Val (when DE was run)
  • QC_Metrics.csv -- Per-sample QC metrics + group labels (when QC stats exist)
  • Phospho_DE_Results.csv -- Site-level phospho DE (when phosphoproteomics was run)
  • diann_pg_matrix.tsv -- DIA-NN protein-level matrix with real missing values (0 = not detected, ~200 KB)
  • data_quality_summary.csv -- Per-sample protein counts, % detected, contaminant counts
  • detection_matrix.csv -- Per-protein precursor detection counts per sample
  • quartile_profiles.csv -- Intensity quartile assignments per sample
  • variable_proteins.csv -- Proteins with inconsistent abundance across samples
  • sample_metadata.csv / group_assignments.csv -- Sample groups and identifiers
  • contaminant_summary.csv -- Contaminant protein statistics
  • search_info.md -- Full DIA-NN search parameters and job metadata
  • session.rds -- Complete session state (reload in DE-LIMP)
  • methods.txt / parameters.txt -- Pipeline parameters, normalization, app version
  • reproducibility_log.R -- R code log + sessionInfo() to reproduce every step
  • figures/ -- 9 publication-quality SVG figures: volcano, heatmap_top20, violin_top10_up/down, pca, qc_group_distribution, normalization_density, data_completeness, sample_correlation, pvalue_distribution
  • PROMPT.md -- AI analysis prompt with biological questions and figure-reference instructions (DE-aware)
  • MANIFEST.txt -- Per-section export status (any skipped files explained here)
Export Complete Analysis ZIP

DE Results Table

Quick export of the DE results for the selected comparison. Includes gene symbols, logFC, P.Value, adj.P.Val, and per-sample expression values. One CSV file — no search parameters or session data.

Export Results CSV

CV Analysis

Coefficient of variation for significant proteins. Includes per-group CV and average CV values. One CSV file.

Export CV Analysis CSV

Full DIA-NN Output

The complete DIA-NN search output (report.parquet, precursor matrices, spectral libraries, logs) is stored on the HPC cluster. These files can be large (100 MB+) and are not included in the analysis export.

Action Log: This code recreates your analysis step-by-step. Each section shows:
  • Action name - what you did (e.g., 'Run Pipeline')
  • Timestamp - when you did it
  • R code - how to reproduce it

Copy this entire code block to reproduce your analysis in a fresh R session.

💾 Download Reproducibility Log


DE-LIMP

Differential Expression — LIMPA Pipeline

If DE-LIMP helped your work, a star on GitHub helps other proteomics labs find it. Star DE-LIMP →
Proteomics Resources & Training

Explore video tutorials, training courses, and methodology citations.