GPT Hallucination Detection Through Prompt Engineering

Siino M.
2024-01-01

Abstract

Detecting hallucinated or factually inaccurate information in the output of GPT models can be challenging for humans. Consequently, it is crucial to thoroughly test Large Language Models (LLMs) for accuracy before deployment. One potential method for identifying hallucinated content, explored at the ELOQUENT 2024 Lab hosted at CLEF 2024, is to use LLMs to assess the output of other LLMs. In this paper, we discuss the application of a Mistral 7B model to the task in the hard labelling setup for English and Swedish, combining a few-shot learning strategy with prompt engineering. Our approach achieved an F1 score of 0.72 on the English test set and 0.75 on the Swedish test set, outperforming some of the baselines provided for the competition as well as other LLM-based approaches.
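
As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows a few-shot, prompt-engineered hallucination classifier built on Mistral 7B via the Hugging Face transformers library. The checkpoint name, prompt wording, few-shot demonstrations, and label-parsing heuristic are all illustrative assumptions, not the authors' actual configuration.

    # A minimal sketch of few-shot hallucination labelling with Mistral 7B.
    # Checkpoint, prompt wording, examples, and parsing are assumptions made
    # for illustration; they are not the configuration used in the paper.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Hypothetical few-shot demonstrations: (question, answer, hard label).
    FEW_SHOT = [
        ("What is the capital of Sweden?",
         "The capital of Sweden is Stockholm.", "NOT HALLUCINATED"),
        ("Who wrote Hamlet?",
         "Hamlet was written by Charles Dickens.", "HALLUCINATED"),
    ]

    def build_prompt(question: str, answer: str) -> str:
        # Assemble the few-shot prompt, leaving the final label to the model.
        parts = ["Decide whether the answer is HALLUCINATED or NOT HALLUCINATED."]
        for q, a, lab in FEW_SHOT:
            parts.append(f"Question: {q}\nAnswer: {a}\nLabel: {lab}")
        parts.append(f"Question: {question}\nAnswer: {answer}\nLabel:")
        return "\n\n".join(parts)

    def hard_label(question: str, answer: str) -> str:
        inputs = tokenizer(build_prompt(question, answer),
                           return_tensors="pt").to(model.device)
        # Greedy decoding with a short budget: only the label tokens are needed.
        out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
        completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                      skip_special_tokens=True)
        return "NOT HALLUCINATED" if "NOT" in completion.upper() else "HALLUCINATED"

    print(hard_label("Who painted the Mona Lisa?",
                     "The Mona Lisa was painted by Vincent van Gogh."))

In a setup like this, greedy decoding (do_sample=False) and a tight max_new_tokens budget keep the completion constrained to the label vocabulary, which makes the hard-label parsing step more reliable.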
Keywords

GPT
hallucination detection
LLM
Mistral 7B
prompt engineering

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/689430
Citations
  • Scopus 9