Given a pattern x of length m and a text y of length n, both over a totally ordered alphabet, the order-preserving pattern matching (OPPM) problem consists in finding all substrings of the text with the same relative order as the pattern. The OPPM problem, which might be viewed as an approximate variant of the well-known exact pattern matching problem, has gained attention in recent years. This interesting problem finds applications in such diverse fields as time series analysis (as share prices on stock markets or weather data analysis) and musical melody matching, just to mention a few. In this paper we present two new filtering approaches which turn out to be much more effective in practice than the previously presented methods, reducing the number of false positives up to 99%. We also present a new efficient approach inspired by the well-known Skip Search algorithm for the exact string matching problem. It makes use of efficient SIMD SSE instructions for speeding up the searching phase. Experimental results show that our proposed algorithms are up to twice faster than previous solutions.

The order-preserving pattern matching problem in practice

Cantone D.;Faro S.
;
2020-01-01

Abstract

Given a pattern x of length m and a text y of length n, both over a totally ordered alphabet, the order-preserving pattern matching (OPPM) problem consists in finding all substrings of the text with the same relative order as the pattern. The OPPM problem, which might be viewed as an approximate variant of the well-known exact pattern matching problem, has gained attention in recent years. This interesting problem finds applications in such diverse fields as time series analysis (as share prices on stock markets or weather data analysis) and musical melody matching, just to mention a few. In this paper we present two new filtering approaches which turn out to be much more effective in practice than the previously presented methods, reducing the number of false positives up to 99%. We also present a new efficient approach inspired by the well-known Skip Search algorithm for the exact string matching problem. It makes use of efficient SIMD SSE instructions for speeding up the searching phase. Experimental results show that our proposed algorithms are up to twice faster than previous solutions.
2020
Approximate text analysis; Experimental algorithms; Filtering algorithms; Text processing
File in questo prodotto:
File Dimensione Formato  
The order-preserving pattern.pdf

solo gestori archivio

Tipologia: Versione Editoriale (PDF)
Licenza: Creative commons
Dimensione 701.63 kB
Formato Adobe PDF
701.63 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11769/373769
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 4
  • ???jsp.display-item.citation.isi??? 1
social impact