APPAGATO: an APproximate PArallel and stochastic GrAph querying TOol for biological networks

Bonnici V; Busato F; Micale G; Bombieri N; Pulvirenti A; Giugno R.

2016-01-01

Abstract

Motivation: Biological network querying is a computationally demanding problem. Given a target and a query network, it aims to find occurrences of the query in the target by considering topological and node similarities (i.e. mismatches between nodes, edges, or node labels). Querying tools that deal with similarities are crucial in biological network analysis, since they provide meaningful results even with noisy data. In addition, as the size of available networks increases steadily, existing algorithms and tools are becoming unsuitable. This raises new challenges for the design of more efficient and accurate solutions.

Results: This paper presents APPAGATO, a stochastic and parallel algorithm to find approximate occurrences of a query network in biological networks. APPAGATO handles node, edge, and node label mismatches. Thanks to its randomized and parallel nature, it applies to large networks and, compared to existing tools, it delivers higher performance and statistically significantly more accurate results. Tests have been performed on protein-protein interaction networks annotated with synthetic and real gene ontology terms. Case studies query protein complexes across different species and tissues.

Availability and implementation: APPAGATO has been developed on top of the CUDA-C++ Toolkit 7.0 framework. The software is available at http://profs.sci.univr.it/~bombieri/APPAGATO.

Year: 2016

Document type: 1.1 Journal article

Files in this record:

File	Size	Format
BIOINFO2016.pdf (open access; type: publisher's version (PDF); license: not specified)	1.53 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/17772
