This study explores the application of topic modeling techniques for auditing purposes in the banking sector, focusing on the analysis of reviews of anti-money laundering alerts. We compare three topic modeling algorithms, Latent Dirichlet Allocation (LDA), Embedded Topic Model (ETM), and Product of Experts LDA (ProdLDA), using a dataset of 35,000 suspicious activity reports from an Italian bank. The models were evaluated using the coherence score, NPMI coherence, and topic diversity metrics. Our results show that ProdLDA consistently outperformed LDA and ETM, with the best performance achieved using 1-gram word embeddings. The study reveals distinct topics related to specific client activities, cross-border transactions, and high-risk business sectors such as gambling. These results demonstrate the potential of advanced topic modeling techniques in enhancing the efficiency and effectiveness of auditing processes in the banking sector, particularly in the analysis of activities that could be tied to money laundering and terrorism.
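
As an illustration of two of the evaluation metrics named above, the following is a minimal sketch of topic diversity and NPMI coherence. It is not the authors' implementation: the function names, the `top_k` defaults, the use of document-level co-occurrence to estimate probabilities, and the handling of edge cases are all assumptions made for this example. Each topic is assumed to be a ranked list of at least two words, and each document a list of tokens.

```python
import math
from itertools import combinations


def topic_diversity(topics, top_k=25):
    """Fraction of unique words among the top_k words of all topics.

    `topics` is a list of ranked word lists (hypothetical input format).
    """
    top_words = [w for topic in topics for w in topic[:top_k]]
    return len(set(top_words)) / len(top_words)


def npmi_coherence(topics, documents, top_k=10):
    """Average NPMI over all word pairs in each topic's top_k words.

    Probabilities are estimated from document-level co-occurrence,
    which is an assumption of this sketch.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Share of documents containing every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    topic_scores = []
    for topic in topics:
        pair_scores = []
        for w1, w2 in combinations(topic[:top_k], 2):
            p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
            if p12 == 0:
                # Words never co-occur: NPMI is -1 by convention.
                pair_scores.append(-1.0)
            elif p12 == 1:
                # Ratio is undefined (0/0); scored as 0.0 here by assumption.
                pair_scores.append(0.0)
            else:
                pmi = math.log(p12 / (p1 * p2))
                pair_scores.append(pmi / -math.log(p12))
        topic_scores.append(sum(pair_scores) / len(pair_scores))
    return sum(topic_scores) / len(topic_scores)


# Toy usage with made-up tokens (not data from the study).
toy_docs = [
    ["casino", "bet"],
    ["casino", "bet"],
    ["wire", "transfer"],
    ["wire", "transfer"],
]
toy_topics = [["casino", "bet"], ["wire", "transfer"]]
diversity = topic_diversity(toy_topics, top_k=2)
coherence = npmi_coherence(toy_topics, toy_docs, top_k=2)
```

Topic diversity close to 1 indicates that topics share few top words, while NPMI ranges from -1 (words never co-occur) to 1 (words always co-occur), so higher values on both suggest more distinct and more interpretable topics.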

Giaconia, A., Chiariello, V., Passarotti, M. C., Topic Modeling for Auditing Purposes in the Banking Sector, in Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), (Pisa, 04-06 December 2024), CEUR Workshop Proceedings, Pisa 2024: 1030-1035 [https://hdl.handle.net/10807/308720]

Topic Modeling for Auditing Purposes in the Banking Sector

Passarotti, Marco Carlo
2024

English
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)
Pisa
4 December 2024
6 December 2024
979-12-210-7060-6
CEUR Workshop Proceedings