Can a Llama Be a Watchdog? Exploring Llama 3 and Code Llama for Static Application Security Testing

Curto, Claudio;Giordano, Daniela;Indelicato, Daniel Gustav;Patatu, Vladimiro

2024-01-01

Abstract

Research in software vulnerability detection has grown significantly, with numerous systems and techniques being developed. Deep learning approaches have become particularly popular, with various architectures being adapted for this purpose. Llama 3, the newest AI model from Meta, was released in April 2024. It was trained on an extensive text corpus that contains four times as much code as that of its predecessor, Llama 2. In contrast, Code Llama stands out as the only model in the Llama series that has been pre-trained specifically on source code. In this study, we examine the effectiveness of the Llama architectures in static security analysis tasks by fine-tuning Llama 3 and Code Llama for vulnerability classification and detection with high precision. To provide comprehensive insights, we compare their performance against three leading models known for their effectiveness in source code analysis (CodeBERT, PolyCoder, and NatGen), using two benchmark C/C++ datasets.

Year: 2024

Keywords: Cybersecurity; Large Language Models; Transformer; Vulnerability Detection

Publication type: 4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/653390
