Comparison between SNP array and imputed data to estimate population structure and ROH hotspots in horse breeds

Chessari, Giorgio (first author); Criscione, Andrea; Tumino, Serena; Bordonaro, Salvatore; Marletta, Donata
2025-01-01

Abstract

Background: Single nucleotide polymorphism (SNP) arrays are commonly used to study the genomic structure and diversity of livestock breeds, but whole-genome sequencing (WGS) provides higher-resolution genomic data. Genotype imputation has become standard practice for increasing the genomic resolution of association studies. This work aimed to extend imputation to biodiversity analyses by comparing SNP array data before and after imputation. A 40 k SNP dataset of 281 horses from 12 breeds (DSSNP) was imputed to sequence level using a reference panel of 327 sequenced individuals, generating approximately 9 million markers after filtering (DSIMP). Both datasets were used to study genetic variability, population structure and runs of homozygosity (ROH).

Results: Genetic indices and relationships showed similar trends in both datasets, with high Pearson correlations and Mantel test values (> 0.8) indicating that the imputed data are a reliable alternative to SNP array data for genetic studies. Multidimensional scaling and admixture analyses showed that the genetic proximity between breeds observed in the DSSNP was amplified by imputation for breeds with only a few sequences in the WGS reference panel. The ROH investigation showed overlapping homozygosity regions between the two datasets, highlighting the benefits of having more markers for gene and QTL annotation. Of the 141 ROH islands identified in the DSSNP, 79 overlapped perfectly with those found in the imputed data. Validation with the reference panel of 327 sequenced horses revealed a single ROH island on ECA11 shared across all three datasets, containing genes associated with morphology and behavioral traits.

Conclusions: High correlations between SNP array and imputed data indicate that imputed genotypes provide a reliable alternative for assessing population structure and genetic diversity in horse breeds. Specifically, imputation can enhance the detection of ROH and the annotation of genes within ROH islands, with the reliability of these results depending, among other factors, on the quality of the reference panel and how well it represents the studied breeds.
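
As an illustration of the Mantel comparison mentioned in the Results, the following is a minimal sketch of a permutation-based Mantel test between two distance matrices computed on the same individuals or breeds. It is not the authors' pipeline: the abstract does not state which software or settings were used, the function names (`upper_triangle`, `pearson`, `mantel`) are hypothetical, and the two small matrices are made-up toy values, not data from the study.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices.

    Rows and columns of d2 are permuted together to build the null
    distribution. Returns (observed r, permutation p-value).
    """
    rng = random.Random(seed)
    n = len(d1)
    x = upper_triangle(d1)
    observed = pearson(x, upper_triangle(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            hits += 1
    # Add 1 to numerator and denominator to count the observed arrangement.
    return observed, (hits + 1) / (permutations + 1)


# Toy distance matrices (illustrative values only, NOT from the study):
# one from array genotypes, one from imputed genotypes, same four units.
d_array = [
    [0.00, 0.10, 0.30, 0.40],
    [0.10, 0.00, 0.25, 0.35],
    [0.30, 0.25, 0.00, 0.15],
    [0.40, 0.35, 0.15, 0.00],
]
d_imputed = [
    [0.00, 0.12, 0.33, 0.41],
    [0.12, 0.00, 0.27, 0.38],
    [0.33, 0.27, 0.00, 0.14],
    [0.41, 0.38, 0.14, 0.00],
]

r, p = mantel(d_array, d_imputed)
print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

With only four units there are just 24 possible relabelings, so the p-value here is coarse; the sketch is meant to show the mechanics, and a real analysis would use the full matrices and an established implementation.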
2025
Genome diversity
Horse species
Imputation
Runs of homozygosity
SNP
Whole-genome sequencing
Files in this record:
Chessari et al., 2025_BMC.pdf

Open access

Type: Publisher's version (PDF)
Licence: Creative Commons
Size: 2.44 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/703889