Accuracy of artificial intelligence fully-automatic cephalometric analysis in linear and angular measurement: a critical scoping review

Polizzi, Alessandro (first author); Serra, Sara; Isola, Gaetano (second-to-last author); Leonardi, Rosalia (last author)
2026-01-01

Abstract

Aim: This scoping review aimed to evaluate the accuracy of fully automatic AI-based cephalometric linear and angular measurements, defined as the degree of agreement between AI-generated and manual tracings and expressed as mean differences or mean absolute errors, by synthesizing evidence from comparative studies.

Materials and methods: A systematic search was conducted in PubMed, Web of Science, and Scopus on March 22, 2025, following PRISMA-ScR guidelines. Eligible studies were retrospective validation or comparative studies involving human lateral cephalograms, fully automatic AI analysis tools, and manual cephalometric measurements as the reference. Studies focusing only on landmark detection without measurement comparison were excluded. Data extracted included sample size, AI software used, types of cephalometric parameters analyzed, and accuracy outcomes.

Results: Out of 629 records screened, 10 studies met the inclusion criteria. The included software systems varied (e.g., WebCeph, OrthoDx, CephX, Cephio), and all used manual tracings by orthodontists as the reference. Compared with manual tracing, AI generally showed good agreement for dental measurements (e.g., U1-NA, L1-NB, IMPA). Conversely, lower agreement was observed for specific skeletal measurements (e.g., SNA, SNB, GoGn-SN) and soft tissue measurements (e.g., nasolabial angle), with deviations often exceeding 2 mm or 2° relative to the reference. Measurement reproducibility was often high but did not always equate to accuracy: many studies reported intraclass correlation coefficient values, which assess reliability and internal consistency rather than the actual magnitude of error when AI measurements are compared with manual tracings.

Conclusions: AI-driven cephalometric measurement tools demonstrate higher accuracy for dental measurements but remain inconsistent for skeletal and soft tissue measurements. Differences in software algorithms and anatomical complexity contribute to variability. Human validation remains essential to ensure clinically reliable measurements.
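The distinction drawn in the Results between reliability and accuracy can be illustrated with a minimal sketch. The angle values below are synthetic and are not taken from the review or from any included study; the function names are hypothetical, and the consistency form ICC(3,1) is an assumption, since the abstract does not state which ICC form the included studies reported. The sketch computes the mean difference, the mean absolute error, and a consistency ICC for an AI reading that is offset from the manual reading by a constant 3°.

```python
from statistics import mean


def mean_difference(ai, manual):
    """Signed mean of (AI - manual); reveals systematic bias."""
    return mean(a - m for a, m in zip(ai, manual))


def mean_absolute_error(ai, manual):
    """Mean of |AI - manual|; the average magnitude of error."""
    return mean(abs(a - m) for a, m in zip(ai, manual))


def icc_consistency(ai, manual):
    """ICC(3,1): two-way mixed, consistency, single measurement, two raters."""
    n, k = len(ai), 2
    rows = list(zip(ai, manual))
    grand = mean(x for row in rows for x in row)
    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in rows)
    ss_cols = n * sum((mean(col) - grand) ** 2 for col in (ai, manual))
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


# Synthetic SNA-like angles in degrees (illustrative only, not study data).
manual_sna = [80.0, 82.5, 78.0, 84.0, 81.5]
ai_sna = [m + 3.0 for m in manual_sna]  # assumed constant 3 degree offset

print("mean difference:", mean_difference(ai_sna, manual_sna))
print("mean absolute error:", mean_absolute_error(ai_sna, manual_sna))
print("ICC(3,1) consistency:", icc_consistency(ai_sna, manual_sna))
```

In this constructed case the mean difference and the mean absolute error are both 3°, above the 2° deviation mentioned in the Results, while the consistency ICC is essentially 1. Absolute-agreement forms of the ICC would penalize such an offset, so the form reported matters when interpreting a high coefficient.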
2026
Artificial intelligence
Cephalometric analysis
Cephalometry
Deep learning
Orthodontics
Files in this item:
File: 1-s2.0-S1073874625000908-main.pdf
Access: archive managers only
Type: Post-print document
License: NOT PUBLIC - Private/restricted access
Size: 699.12 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/700759