Beyond Horspool: A Comparative Analysis in Sampled Matching

IRIS

The exact online string matching problem, pivotal in fields ranging from computational biology to data compression, involves identifying all instances of a speci- fied pattern within a text. Despite extensive examination over the decades, this problem has remained computationally challenging due to the time and space limitations inher- ent in traditional online and offline methods, respectively. Introduced in 1991, sampled string matching has now emerged as a groundbreaking approach, ingeniously combin- ing classical online string matching techniques with efficient text sampling methods. This approach not only addresses the spatial constraints of indexed string matching but also significantly reduces the search duration in online environments, achieving speed increases of up to hundreds of times while requiring less than 4% of the text size for its partial index. In this paper, we explore the adaptability of various online string matching algorithms within the framework of sampled string matching, which has tra- ditionally relied on the Horspool algorithm. Our investigation reveals that integrating alternative string matching algorithms as subroutines markedly enhances overall per- formance. These findings highlight the potential for reevaluating established method- ologies in light of newer, more dynamic solutions and set the stage for transformative impacts across multiple domains.

Beyond Horspool: A Comparative Analysis in Sampled Matching

Simone Faro;Francesco Pio Marino;Andrea Moschetto

2024-01-01

Abstract

The exact online string matching problem, pivotal in fields ranging from computational biology to data compression, involves identifying all instances of a speci- fied pattern within a text. Despite extensive examination over the decades, this problem has remained computationally challenging due to the time and space limitations inher- ent in traditional online and offline methods, respectively. Introduced in 1991, sampled string matching has now emerged as a groundbreaking approach, ingeniously combin- ing classical online string matching techniques with efficient text sampling methods. This approach not only addresses the spatial constraints of indexed string matching but also significantly reduces the search duration in online environments, achieving speed increases of up to hundreds of times while requiring less than 4% of the text size for its partial index. In this paper, we explore the adaptability of various online string matching algorithms within the framework of sampled string matching, which has tra- ditionally relied on the Horspool algorithm. Our investigation reveals that integrating alternative string matching algorithms as subroutines markedly enhances overall per- formance. These findings highlight the potential for reevaluating established method- ologies in light of newer, more dynamic solutions and set the stage for transformative impacts across multiple domains.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2024
			
	Codice ISBN
	
				978-80-01-07328-5
			
	Appare nelle tipologie:
	
				4.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
PSC2024-3.pdf solo gestori archivio Tipologia: Versione Editoriale (PDF) Licenza: NON PUBBLICO - Accesso privato/ristretto Dimensione 621.91 kB Formato Adobe PDF Visualizza/Apri	621.91 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11769/641835

Citazioni

ND

ND

ND

social impact