Large scale data processing in ecology: A case study on long-term underwater video monitoring
Palazzo, S.; Spampinato, Concetto; Giordano, Daniela
2014-01-01
Abstract
Ecology is now an interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analysing the data produced every day is therefore necessary to help researchers investigate complex questions at scales ranging from single particles to animals to the biosphere [1]. As a consequence, ecology-related multimedia content has been produced on a massive scale in recent years: for example, the Xeno-canto project and the Pl@ntNet project have collected, respectively, 140,000 audio recordings of 8,700 bird species and about 60,000 images covering thousands of plant species, for use by scientists or professionals. Manual analysis of this volume of data is infeasible, so automatic analysis tools combined with high-performance computing (HPC) solutions are in heavy demand for making sense of such big ecological data. In this paper we present a case study of large-scale video processing on HPC facilities for underwater fish monitoring, carried out within the Fish4Knowledge project, in which a system for analysing long-term underwater camera footage was developed. The paper reports on the hardware/software architecture employed, the design and deployment of the parallel job manager, and the problems encountered throughout the process, from load balancing to job submission policies to bottlenecks.