Machine learning has been used for many purposes in science, but it has not previously been applied to illegal drugs. This study proposes a new web-based system for cocaine classification, profiling relations, and comparison that can produce meaningful output from a large amount of chemical profiling data. In particular, the Profiling Relations In Drug trafficking in Europe (PRIDE) system offers several advantages for intelligence activities across Europe: it provides a standardized, broad methodology that uses machine learning algorithms to classify and compare drug profiles and to highlight how similar drug samples are and how probable it is that they share a common origin, batch, or preparation process. We evaluated the proposed algorithms using precision and recall and analyzed the quality of their predictions against our gold standard. In our experiments, we reached 88% for the F0.5-measure, 91% for precision, and 78% for recall, confirming our main hypothesis: machine learning can be applied to classify cocaine profiles automatically.
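The relation between the three reported figures can be checked with a short sketch. It assumes the standard F-beta definition (a weighted harmonic mean of precision and recall, with beta = 0.5 weighting precision more heavily than recall); the abstract does not state the formula, and the function name `f_beta` is illustrative only. The inputs are the rounded percentages quoted above, so the result is approximate.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta < 1 favours precision, beta > 1 favours recall, beta = 1 is the F1 score.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both metrics are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values quoted in the abstract (assumed inputs, not raw experimental data).
score = f_beta(precision=0.91, recall=0.78, beta=0.5)
print(f"F0.5 = {score:.2f}")  # prints "F0.5 = 0.88"
```

With precision at 0.91 and recall at 0.78, the F0.5 value lies closer to the precision figure, which is consistent with the 88% reported.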

Cascini, F., De Giovanni, N., Inserra, I., Santaroni, F., Laura, L., A data-driven methodology to discover similarities between cocaine samples, <<SCIENTIFIC REPORTS>>, 2020; 10 (1): 1-12. [doi:10.1038/s41598-020-72652-w] [http://hdl.handle.net/10807/206361]

A data-driven methodology to discover similarities between cocaine samples

Cascini, F.; Inserra, I.
2020

English

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/206361