Accuracy of artificial intelligence fully-automatic cephalometric analysis in linear and angular measurement: a critical scoping review

Polizzi, Alessandro; Nucci, Ludovica; Serra, Sara; Isola, Gaetano; Leonardi, Rosalia

2026-01-01

Abstract

Aim: This scoping review evaluated the accuracy of fully automatic AI-based cephalometric linear and angular measurements by synthesizing evidence from comparative studies. Accuracy was defined as the degree of agreement between AI-generated and manual tracings, expressed as mean differences or mean absolute errors.

Materials and methods: A systematic search of PubMed, Web of Science, and Scopus was conducted on March 22, 2025, following PRISMA-ScR guidelines. Eligible studies were retrospective validation or comparative studies involving human lateral cephalograms, fully automatic AI analysis tools, and manual cephalometric measurements as the reference. Studies that addressed only landmark detection, without comparing measurements, were excluded. Extracted data included sample size, AI software used, types of cephalometric parameters analyzed, and accuracy outcomes.

Results: Of 629 records screened, 10 studies met the inclusion criteria. The included software systems varied (e.g., WebCeph, OrthoDx, CephX, Cephio), and all studies used manual tracings by orthodontists as the reference. Compared with manual tracing, AI usually showed good agreement for dental measurements (e.g., U1-NA, L1-NB, IMPA). Agreement was lower for specific skeletal measurements (e.g., SNA, SNB, GoGn-SN) and soft tissue measurements (e.g., nasolabial angle), with deviations often exceeding 2 mm or 2° relative to the reference. Measurement reproducibility was often high, but high reproducibility did not always equate to accuracy: many studies reported intraclass correlation coefficient values, which assess reliability and internal consistency rather than the actual magnitude of error between AI measurements and manual tracings.

Conclusions: AI-driven cephalometric measurement tools are more accurate for dental measurements but remain inconsistent for skeletal and soft tissue measurements. Differences in software algorithms and anatomical complexity contribute to this variability. Human validation remains essential to ensure clinically reliable measurements.
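
The distinction drawn in the Results between reliability (intraclass correlation) and accuracy (mean difference, mean absolute error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the angle values are hypothetical, not data from the reviewed studies, the function name `agreement_summary` is invented here, and the consistency-type ICC (often written ICC(3,1)) is only one of several ICC forms; the abstract does not state which form the included studies reported.

```python
from statistics import mean


def agreement_summary(manual, ai):
    """Return (mean difference, mean absolute error, consistency ICC) for two methods."""
    n, k = len(manual), 2  # n cephalograms, k = 2 methods (manual, AI)

    # Accuracy metrics: signed and absolute error of AI against the manual reference.
    diffs = [a - m for m, a in zip(manual, ai)]
    mean_diff = mean(diffs)
    mae = mean(abs(d) for d in diffs)

    # Two-way ANOVA mean squares used by the consistency-type ICC (assumed ICC(3,1)).
    rows = list(zip(manual, ai))
    grand = mean(manual + ai)
    row_means = [mean(r) for r in rows]
    col_means = [mean(manual), mean(ai)]
    ms_rows = k * sum((rm - grand) ** 2 for rm in row_means) / (n - 1)
    ss_err = sum(
        (x - rm - cm + grand) ** 2
        for r, rm in zip(rows, row_means)
        for x, cm in zip(r, col_means)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    icc = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    return mean_diff, mae, icc


# Hypothetical SNA angles in degrees; the AI values carry a constant 3 degree offset.
manual_sna = [80.0, 82.5, 78.0, 84.0, 81.0, 79.5]
ai_sna = [m + 3.0 for m in manual_sna]

mean_diff, mae, icc = agreement_summary(manual_sna, ai_sna)
print(f"mean difference = {mean_diff:.2f}, MAE = {mae:.2f}, ICC = {icc:.3f}")
```

In this constructed case the AI values track the manual ones perfectly, so the consistency ICC is essentially 1, yet every measurement is off by 3°, above the 2° deviation mentioned in the Results. An absolute-agreement ICC would penalize such an offset, which is one reason the ICC form matters when interpreting reported values.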

Year: 2026

Keywords: Artificial intelligence; Cephalometric analysis; Cephalometry; Deep learning; Orthodontics

Appears in types: 1.1 Journal article

Files in this item:

File	Size	Format
1-s2.0-S1073874625000908-main.pdf (archive managers only; type: post-print document; license: NOT PUBLIC - private/restricted access)	699.12 kB	Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11769/700759
