IRIS UniCatt

Evaluation of ChatGPT as a Counselling Tool for Italian-Speaking MASLD Patients: Assessment of Accuracy, Completeness and Comprehensibility

Pugliese, Nicola; Polverini, Davide; Lombardi, Rosa; Pennisi, Grazia; Ravaioli, Federico; Armandi, Angelo; Buzzetti, Elena; Dalbeni, Andrea; Liguori, Antonio; Mantovani, Alessandro; Villani, Rosanna; Gardini, Ivan; Hassan, Cesare; Valenti, Luca; Miele, Luca; Petta, Salvatore; Sebastiani, Giada; Aghemo, Alessio

2024

Abstract

Background: Artificial intelligence (AI)-based chatbots have shown promise in providing counseling to patients with metabolic dysfunction-associated steatotic liver disease (MASLD). ChatGPT3.5 has been shown to answer MASLD-related questions comprehensively in English, but its accuracy remains suboptimal. Whether language influences these results is unclear. This study assesses ChatGPT's performance as a counseling tool for Italian-speaking MASLD patients. Methods: Thirteen Italian experts rated the accuracy, completeness and comprehensibility of ChatGPT3.5's answers to 15 MASLD-related questions posed in Italian, using Likert scales: six-point for accuracy and three-point for both completeness and comprehensibility. Results: Mean scores for accuracy, completeness and comprehensibility were 4.57 ± 0.42, 2.14 ± 0.31 and 2.91 ± 0.07, respectively. The physical activity domain achieved the highest mean scores for accuracy and completeness, whereas the specialist referral domain achieved the lowest. Overall, Fleiss's coefficient of concordance for accuracy, completeness and comprehensibility across all 15 questions was 0.016, 0.075 and -0.010, respectively. The age and academic role of the evaluators did not influence the scores. The results did not differ significantly from those of our previous study, which focused on English. Conclusion: Language does not appear to affect ChatGPT's ability to provide comprehensible and complete counseling to MASLD patients, but accuracy remains suboptimal in certain domains.
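The abstract reports Fleiss's coefficient of concordance without showing how such a value is obtained. The following is a minimal sketch of the standard Fleiss' kappa calculation, assuming every item is rated by the same number of raters. The function name `fleiss_kappa` and the toy ratings are hypothetical and are not taken from the study; the authors' actual statistical procedure may differ.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each a list of one label per rater.

    Assumes every item was rated by the same number of raters (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Count how many raters chose each category, item by item.
    counts = [Counter(item) for item in ratings]

    # Observed agreement per item, then averaged over items.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Agreement expected by chance, from the overall category proportions.
    proportions = [
        sum(c[k] for c in counts) / (n_items * n_raters) for k in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical three-point completeness ratings: 3 questions, 4 raters each.
toy_ratings = [
    [2, 2, 3, 2],
    [1, 2, 2, 3],
    [3, 3, 3, 2],
]
print(round(fleiss_kappa(toy_ratings, categories=[1, 2, 3]), 3))
```

Values near zero, such as those reported in the abstract, indicate agreement close to what chance alone would produce, even when the mean scores themselves are high.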

Publication year: 2024

Content language: English

Journal name: Journal of Personalized Medicine

DOI: https://dx.doi.org/10.3390/jpm14060568

Citation: Pugliese, N., Polverini, D., Lombardi, R., Pennisi, G., Ravaioli, F., Armandi, A., Buzzetti, E., Dalbeni, A., Liguori, A., Mantovani, A., Villani, R., Gardini, I., Hassan, C., Valenti, L., Miele, L., Petta, S., Sebastiani, G., Aghemo, A., NAFLD Expert Chatbot Working Group, Evaluation of ChatGPT as a Counselling Tool for Italian-Speaking MASLD Patients: Assessment of Accuracy, Completeness and Comprehensibility, Journal of Personalized Medicine, 2024; 14 (6). [doi:10.3390/jpm14060568] [https://hdl.handle.net/10807/297290]

Appears in types: Journal article, case note

Files in this record:

File	Size	Format
chat.pdf (open access; file type: publisher's version (PDF); licence: Creative Commons)	1.3 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/297290
