Towards a harmonized identification scoring system in LC-HRMS/MS-based non-target screening (NTS) of emerging contaminants
Non-target screening (NTS) methods are rapidly gaining popularity, enabling researchers to search for an ever-increasing number of contaminants of emerging concern. Because NTS strategies make it possible to screen for thousands of compounds, communicating identification confidence in an automated, concise and unambiguous manner that reflects all available evidence is becoming increasingly important. An automated identification point (IP) system for NTS will support the reproducible and transparent communication of identification confidence, opening up possibilities to develop novel prioritization schemes for the management of chemicals. In this study, we compiled several pieces of evidence necessary for communicating NTS identification confidence and developed an interpretable machine learning-based approach for the automated classification of “sufficient” versus “insufficient” evidence using the NORMAN Digital Sample Freezing Platform. The machine learning approach was trained and optimized using data generated by four laboratories equipped with different instrumentation. It efficiently discarded substances with insufficient identification evidence while revealing the relevance of different parameters for identification. Based on the knowledge generated in these efforts, a simplified and straightforward IP-based system is proposed for communicating the evidence associated with identification confidence. This new NTS-oriented system is based on mass accuracy, retention time plausibility, isotopic fit, and MS fragmentation pattern, and is compatible and comparable with analysis and the currently widely used five-level system. Through the inclusion of evidence scores, it increases the precision of reporting and the reproducibility of current approaches while remaining suitable for automation. The system can be expanded to incorporate additional evidence in the future.
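
As a purely illustrative sketch of how the four evidence types named above could be combined into an identification point total, the following Python snippet may be helpful. It is not the scoring scheme proposed in this work: the `Evidence` fields, the thresholds (`MAX_PPM`, `MIN_ISOTOPIC_FIT`), the point weights, and the function names are all hypothetical and chosen only for illustration. The one grounded element is the standard definition of mass error in parts per million.

```python
from dataclasses import dataclass


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (standard definition)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


@dataclass
class Evidence:
    """Hypothetical container for the four evidence types of one candidate."""
    measured_mz: float
    theoretical_mz: float
    rt_plausible: bool       # outcome of some retention time plausibility check
    isotopic_fit: float      # assumed similarity score in the range 0..1
    matched_fragments: int   # number of matched MS/MS fragments


# Hypothetical thresholds and weights, NOT taken from the study.
MAX_PPM = 5.0
MIN_ISOTOPIC_FIT = 0.8
MAX_FRAGMENT_POINTS = 3


def identification_points(ev: Evidence) -> dict:
    """Return per-evidence points and their sum under the assumed rules."""
    mass_ok = abs(ppm_error(ev.measured_mz, ev.theoretical_mz)) <= MAX_PPM
    points = {
        "mass_accuracy": 1 if mass_ok else 0,
        "retention_time": 1 if ev.rt_plausible else 0,
        "isotopic_fit": 1 if ev.isotopic_fit >= MIN_ISOTOPIC_FIT else 0,
        # Each matched fragment earns one point, capped at the assumed maximum.
        "fragmentation": min(ev.matched_fragments, MAX_FRAGMENT_POINTS),
    }
    points["total"] = sum(points.values())
    return points


if __name__ == "__main__":
    candidate = Evidence(300.0006, 300.0, True, 0.9, 5)
    print(identification_points(candidate))
```

Reporting the per-evidence points alongside the total, rather than the total alone, keeps the individual contributions visible, which is in the spirit of the transparent evidence communication described above.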