Competitive Fragmentation Modeling of Mass Spectra for Metabolite Identification

Institution

http://id.loc.gov/authorities/names/n79058482

Degree Level

Doctoral

Degree

Doctor of Philosophy

Department

Department of Computing Science

Abstract

One of the key obstacles to the effective use of mass spectrometry (MS) in high-throughput metabolomics is the difficulty of interpreting measured spectra to identify metabolites accurately and efficiently. Traditional methods for automated metabolite identification compare the target MS spectrum to spectra of known molecules in a reference database, ranking candidate molecules by the closeness of the spectral match. However, the limited coverage of available databases has led to interest in computational methods for generating accurate reference MS spectra from chemical structures. This is the target application for this work. My main research contribution is a method for spectrum prediction, which I call Competitive Fragmentation Modeling (CFM). I demonstrate that this method works effectively for both electron ionization MS (EI-MS) and electrospray tandem MS (ESI-MS/MS). It uses a probabilistic generative model of the fragmentation processes occurring in a mass spectrometer, and a machine learning approach to learn the parameters of this model from data. CFM has been used in both a spectrum prediction task (i.e., predicting the mass spectrum from a chemical structure) and a putative metabolite identification task (i.e., ranking possible structures for a target spectrum). In the spectrum prediction task, CFM showed improved performance when compared to a full enumeration of all peaks corresponding to all substructures of the molecule. In the metabolite identification task, CFM obtained substantially better rankings for the correct candidate than existing methods. As further validation, this method won the structure identification category of the international Critical Assessment of Small Molecule Identification (CASMI) 2014 competition. The method is also available for general use via a web interface.
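
The candidate-ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the CFM method or the thesis's scoring function: it assumes cosine similarity over fixed-width m/z bins as the measure of spectral closeness, and the function names, the bin width, and the peak lists are all hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(target_peaks, candidate_spectra, bin_width=1.0):
    """Return (candidate name, score) pairs sorted from best to worst match."""
    target = bin_spectrum(target_peaks, bin_width)
    scored = [
        (name, cosine_similarity(target, bin_spectrum(peaks, bin_width)))
        for name, peaks in candidate_spectra.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical (m/z, intensity) peak lists, for illustration only.
target = [(50.0, 100.0), (77.0, 50.0)]
candidates = {
    "candidate_A": [(50.2, 100.0), (77.1, 50.0)],
    "candidate_B": [(120.0, 100.0)],
}
ranked = rank_candidates(target, candidates)
```

In this sketch the reference spectra could equally be measured database entries or spectra predicted from candidate structures; only the source of `candidate_spectra` would change.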

Item Type

http://purl.org/coar/resource_type/c_46ec

Other License Text / Link

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

en
