HALOS: Hierarchical Aggregation Learning for Overdensity Search

Mezzina A.; Naso L.
2026-01-01

Abstract

The identification of gravitationally bound substructures (subhalos) within cosmological simulations is a cornerstone for understanding galaxy formation and evolution. Traditional algorithms, while accurate, are often computationally intensive, posing a significant bottleneck for the analysis of next-generation cosmological simulations and limiting the feasibility of on-the-fly processing. In this work, we introduce HALOS (Hierarchical Aggregation Learning for Overdensity Search), a novel deep learning pipeline for subhalo identification in 3D point clouds. Our method employs a multi-stage approach that decouples particle classification from instance segmentation. First, we engineer a set of physically motivated features for each particle. Second, a multi-layer perceptron simultaneously performs two tasks: (i) a semantic segmentation to classify particles as either bound to a subhalo or part of the unbound background, and (ii) a regression to predict the 3D coordinates of the parent subhalo centroid for each bound particle. Finally, the HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) algorithm performs a density-based clustering exclusively on the pre-filtered set of bound particles, significantly reducing computational complexity. We train and validate our model using catalogues generated from cosmological N-body simulations by the SUBFIND algorithm. HALOS achieves a semantic classification accuracy of 95%, an Adjusted Rand Index for instance segmentation >90%, and an overall Completeness of 90%, demonstrating a close alignment with SUBFIND, while reducing computational time by a factor of ∼ 16.
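
For readers unfamiliar with the Adjusted Rand Index quoted above, the following is a minimal sketch of the standard Hubert-Arabie definition, which compares two per-particle labelings (for example, reference subhalo IDs and predicted cluster IDs) independently of how the clusters are numbered. The abstract does not state how the authors computed the metric, so the function name `adjusted_rand_index` and the toy labels are illustrative assumptions, not part of HALOS.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index (Hubert-Arabie form) between two labelings.

    Hypothetical helper for illustration; not taken from the HALOS code.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same particles")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two particles: labelings trivially agree

    # Contingency counts: particles sharing a (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of particle pairs grouped together in both / in each labeling.
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = sum_true * sum_pred / total_pairs
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings

    return (index - expected) / (max_index - expected)


# Toy example (made-up labels): six particles, one assigned to the wrong subhalo.
reference_ids = [0, 0, 0, 1, 1, 1]
predicted_ids = [0, 0, 1, 1, 1, 1]
score = adjusted_rand_index(reference_ids, predicted_ids)
```

A score of 1 means the two partitions are identical up to relabeling, and values near 0 indicate agreement no better than chance.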
3D point clouds
Clustering
Cosmology
Deep learning
Halo-finders
Particle segmentation
SUBFIND
Subhalos

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/713172