Background Immunopeptidomes are the peptide antigen repertoires bound by the molecules encoded by the major histocompatibility complex (MHC; human leukocyte antigen (HLA) in humans). These HLA-peptide complexes are presented on the surface of cells for recognition by T cells of the immune system. Immunopeptidomics is the use of tandem mass spectrometry (MS/MS) to identify and quantify peptides bound to HLA molecules. Data-independent acquisition (DIA) has emerged as a powerful strategy for deep proteome-wide profiling, but its application to immunopeptidomic analyses has so far been limited. Further, of the many DIA data processing tools currently available, there is no consensus in the immunopeptidomics community on the most appropriate pipeline for in-depth and accurate HLA peptide identification. Here, we benchmarked four conventional and recently developed spectral library-based pipelines for processing and targeted analysis of DIA data for label-free immunopeptidome quantification.
Methodology We immunoprecipitated HLA molecules from replicates of 5 × 10⁷ cells of two cell lines, C1R-B*57:01 and C1R-A*02:01, and eluted HLA-bound peptides for DIA analysis on an Orbitrap Fusion™ Tribrid™ mass spectrometer. Data analysis was evaluated across four peptide-centric DIA software tools (Skyline, Spectronaut, DIA-NN, and PEAKS X+) using an extensive DDA library previously acquired from both cell lines.
Results DIA analyses allowed for comparisons of immunopeptidome coverage, reproducibility between replicates, and estimates of external false-discovery rates for each data set. In general, DIA-NN achieved the greatest number of peptide identifications on average at 1% FDR for both cell lines (~3000 peptides for C1R-B*57:01 and 2000 peptides for C1R-A*02:01), with the three other tools reporting fewer peptides (~1200-3000 peptides for C1R-B*57:01 and ~1200-1500 peptides for C1R-A*02:01). Using a hybrid spectral library containing HLA-B*57:01 and HLA-A*02:01 peptides, we examined the external false-positive rates achieved by each software tool. In this context, Skyline and DIA-NN achieved lower false-positive rates for C1R-B*57:01 data, whereas Spectronaut and DIA-NN showed better performance for C1R-A*02:01. In examining the robustness of HLA peptide identification across replicates, Spectronaut and DIA-NN provided higher reproducibility. We also report linearity metrics and other analytical figures of merit in pairwise comparisons to validate the results.
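The external false-positive estimate described above can be illustrated with a minimal sketch. It assumes (this is an assumption, not the authors' stated procedure) that an identification counts as a false hit when the peptide occurs only in the library of the allele the cell line does not express, and that the rate is the fraction of such hits among all identifications. The function name, argument names, and the toy peptide lists are hypothetical and are not data from this study.

```python
def external_false_positive_rate(identified, matched_library, mismatched_library):
    """Fraction of identified peptides found only in the mismatched-allele library.

    Assumption: peptides present in both libraries cannot be called false
    hits, so they are excluded from the false-hit set.
    """
    identified = set(identified)
    if not identified:
        return 0.0
    mismatched_only = set(mismatched_library) - set(matched_library)
    false_hits = identified & mismatched_only
    return len(false_hits) / len(identified)


# Toy, hypothetical inputs (not results from the paper).
b57_library = {"KAFSPEVIPMF", "LSSPVTKSF", "ISPRTLNAW"}
a02_library = {"GILGFVFTL", "NLVPMVATV", "LSSPVTKSF"}  # one peptide shared for illustration
identified_in_b57_run = ["KAFSPEVIPMF", "ISPRTLNAW", "LSSPVTKSF", "GILGFVFTL"]

rate = external_false_positive_rate(identified_in_b57_run, b57_library, a02_library)
print(f"External false-positive rate: {rate:.2f}")  # 1 of 4 identifications -> 0.25
```

In practice the peptide lists would come from each tool's report after filtering at 1% FDR; how shared peptides and modified forms are handled is a design choice that changes the estimate.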
Conclusions Through this extensive analysis, we propose a pathway for users to choose an appropriate tool based on research aims. The current data suggest a combined strategy of applying at least two complementary DIA software tools to achieve the greatest degree of confidence and in-depth coverage of immunopeptidome data.