1 Director of Orthodontics, Department of Dentistry and OMFS, Jacobi Medical Center, Bronx, NY, United States
2 Assistant Professor of Statistics, Loyola University Chicago, IL, United States
3 Fellow, Surgical Orthodontics and Craniofacial Orthopedics, Jacobi Medical Center, Bronx, NY, United States
4 Chair, Department of Dentistry and OMFS, Jacobi Medical Center, Bronx, NY, United States
Corresponding author details:
Levine TP, DMD, Director of Orthodontics, Consultant in Orthodontics in Dental Planet
Dept of Dentistry
Jacobi Medical Center
NY, United States
Copyright:
© 2018 Levine TP et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Objective: To establish the reliability of cephalometric landmark identification in three dimensions using ProPlan CMF software.
Methods: Two orthodontists identified a series of 33 cephalometric landmarks on 20 CBCT scans of Class I, pre-orthodontic patients and repeated the landmark identification about two months later. Intraclass correlations (ICC) were calculated by landmark in the X, Y, and Z dimensions, and F-tests were used to assess differences in landmark location in the X, Y, and Z dimensions.
Results: The majority of landmarks had good to excellent ICC for both inter- and intra-observer reliability. The F-tests also showed that the majority of landmarks had no significant difference between the observers.
Conclusion: Most landmarks showed good to very good reliability and reproducibility using ProPlan CMF, with some landmarks proving more reliable than others. Further research is needed to establish the utility and practicality of three-dimensional cephalometrics as a common diagnostic tool in orthodontics.
Dentistry; Orthodontics; Oral surgery; Three-dimensional imaging
Broadbent introduced cephalometric analysis in 1931 [1]. The tool quickly became a critical element in the study and diagnosis of malocclusion and skeletal issues that contribute to malocclusion. Through comparison with established normal values, linear and angular measurements on lateral cephalograms can be used to define relational issues with the teeth and the skeletal structure of the face.
There are numerous limitations to two-dimensional cephalometrics [2]. For one, an entire dimension of measurement is lost by necessity when a three-dimensional object is projected on a two-dimensional film. This also creates artifacts from overlapping structures and magnification of areas of the subject that are farther from the film. Repeating cephalometric films is difficult in practice and even small subject positioning changes can artificially alter relationships between points of interest. Studies show that two-dimensional projections inadequately reflect clinical diagnoses [2,3].
Three-dimensional radiography with cone beam computed tomography (CBCT) can allow clinicians to improve the accuracy of diagnosis and treatment planning [4]. The image is three-dimensional, eliminating projection errors and making irrelevant the distortion and magnification issues inherent in two-dimensional imaging. Its main drawback, increased radiation compared to traditional panoramic or lateral cephalometric films, is mitigated by the fact that the total radiation exposure of a standard orthodontic patient, including lateral and posterior-anterior cephalograms and panoramic and periapical films, is similar to or even greater than that of a single CBCT [5]. Orthodontists would have to adjust to using CBCTs, should they become standard, as most practicing orthodontists were trained only in two-dimensional cephalometrics. Additionally, studies must be done to validate cephalometric analysis of CBCTs and to establish reliability and reproducibility between operators. Software must also be considered, as CBCTs are a purely digital medium and different software packages present different viewing and tracing options. A number of studies have examined the accuracy of CBCTs converted to two-dimensional films, the reproducibility of landmark identification, the reliability of linear measurements, and the reliability of landmark identification between multiple operators [6-14]. No current study has established reliability and reproducibility using ProPlan CMF (Materialise, Belgium), a common software package used in planning orthognathic surgery.
The aims of this study were to assess intra- and inter-observer reliability in locating anatomic landmarks on the hard tissue of the skull using ProPlan CMF on images produced via CBCT.
Institutional review board approval was obtained. Twenty (N=20) pre-treatment CBCTs were collected from a private orthodontic office whose routine pre-treatment records include CBCT images. Images were obtained on an Orthophos XG 3D (Sirona Dental Systems, New York City) operated via a personal computer running the Windows 7 operating system (Microsoft Corporation, Redmond, WA). Records were anonymized, removing all identifying information, and given a unique identifier. Ten female and 10 male patients (average age 14.7 years, range 11.0 to 20.1 years) were selected. Each scan was assessed to ensure all points were viewable on the image, with a field of 8 cm³ and resolution of 160 µm. The raw image was processed by Sidexis NG (Sirona) and exported into a DICOM (Digital Imaging and Communications in Medicine) file. The file was then imported into ProPlan CMF on a dedicated laptop running Windows 7 (Figure 1). A volumetric model was generated via ProPlan CMF.
Two orthodontists were trained and calibrated on the ProPlan CMF software, with assistance from Materialise customer support. Each observer was given several weeks and five “practice” scans not included in this study in order to become acquainted with and calibrated to the software. Following the calibration period, the operators identified 33 cephalometric points commonly used in the Downs, Steiner and Grummons analyses, listed in Table 1, with definitions of locations adapted from de Oliveira [15,16]. All points were identified on each CBCT (T1). Sixty to 80 days later (T2), the 20 CBCTs were re-ordered, and the operators repeated the identification. The ProPlan CMF software then produced numerical values for the X (coronal plane), Y (axial plane), and Z (sagittal plane) coordinates for each point, exported into a comma-separated values (CSV) file, yielding 40 sets of 99 observations for each observer.
For each of the landmarks in each dimension, intra-observer reliability and inter-observer reliability were estimated using intraclass correlation (ICC), with ICC at or above .9 evaluated as “excellent reliability”, .9 to .75 as “good reliability”, .75 to .45 as “fair reliability” and below .45 as “weak reliability” [17].
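As an illustration of the reliability scale described above, the following is a minimal sketch of an ICC calculation and its classification. The text does not state which form of ICC was used, so a one-way random-effects ICC(1,1) is assumed here; the function names and the sample coordinates are hypothetical and are not taken from the study data.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) (an assumed form, not confirmed by the text).

    ratings[i] holds the k repeated measurements of one coordinate
    (for example T1 and T2) for scan i; every scan must have the same k.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Between-scan and within-scan mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(ratings, row_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def classify_icc(icc):
    """Map an ICC estimate onto the four categories used in this study."""
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.45:
        return "fair"
    return "weak"


# Hypothetical X coordinates of one landmark on five scans at T1 and T2
example = [[12.0, 12.5], [30.0, 29.0], [18.5, 18.0], [25.0, 26.0], [40.0, 39.5]]
print(classify_icc(icc_oneway(example)))
```

In the study design, this calculation would be repeated for each of the 99 landmark coordinates, once for the intra-observer comparison and once for the inter-observer comparison.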
An F-test was calculated for the X, Y and Z coordinates of each landmark to test the null hypothesis that there was no significant difference in the mean location of landmarks by each observer. The Family Wise Error Rate (FWER) was set at alpha = .05 and the Hochberg correction for multiple hypothesis testing was used to control the FWER [18]. The sample size of this study was chosen based on sample sizes from similar studies [19]. Therefore, rather than performing a sample size calculation, the detectable effect size was calculated based on the sample. This power calculation was performed using a simulation study with 500 simulations per effect size. For the simulation, we assumed that the error variance was 0.5 and the variance component associated with the patient was 2.2, where these values were calculated from the observed data. In each replicate of the simulation, data were generated assuming different effect sizes and an F-test was performed testing the null hypothesis that the mean difference between the doctors was 0 versus the alternative that the mean difference was non-zero.
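The Hochberg correction mentioned above is a step-up procedure: the p-values are sorted, and the largest p-value that satisfies its threshold determines which hypotheses are rejected. The sketch below shows the general procedure only; the function name and the example p-values are hypothetical, and the text does not state whether the correction was applied across all 99 tests or within each dimension.

```python
def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up procedure controlling the family-wise error rate.

    Returns a list of booleans, in the original order of p_values,
    marking which null hypotheses are rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    reject = [False] * m
    # Step up from the largest p-value: find the largest rank k (1-based)
    # with p_(k) <= alpha / (m - k + 1), then reject hypotheses 1..k.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        if p_values[idx] <= alpha / (m - rank + 1):
            for j in order[:rank]:
                reject[j] = True
            break
    return reject


# Hypothetical p-values for three coordinates
print(hochberg_reject([0.01, 0.04, 0.03]))
```

With these three hypothetical p-values, the largest (0.04) is already below alpha, so all three hypotheses are rejected, whereas a Bonferroni threshold of 0.05/3 would reject only the first.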
Figure 1: A screenshot of ProPlan CMF, demonstrating the multi-planar views and 3D model of a CBCT scan, including the landmarks
identified. Reproduced with permission from Materialise.
Power was calculated using 500 simulations per effect size, with the error variance set to 0.5 and the patient variance component set to 2.2, both calculated from the observed data. In each replicate of the simulation, data were generated using different effect sizes. The F-test of the null hypothesis reached 80 per cent power at an effect size just above 0.3. As the coordinates were a whole number system, an effect size of 0.3 was deemed very satisfactory.
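A simulation-based power calculation of this kind can be sketched as follows. The exact design of the simulation is not given in the text, so this sketch makes several assumptions: 20 patients, one measurement per observer per patient, normally distributed patient and error terms, and an effect size expressed as the mean difference between observers in coordinate units. Under these assumptions the F-test for the observer effect equals the squared paired t statistic, and the critical value is hard-coded. Because the design is assumed, the sketch should not be expected to reproduce the power reported above.

```python
import random
from statistics import mean, variance

N_PATIENTS = 20
# Upper 5% point of F(1, 19), hard-coded; equal to t(0.975, 19) squared.
# Valid only for N_PATIENTS = 20.
F_CRIT_1_19 = 4.381


def simulate_power(effect, var_patient=2.2, var_error=0.5, n_sims=500, seed=0):
    """Estimate power to detect a mean observer difference of `effect`."""
    rng = random.Random(seed)
    sd_patient = var_patient ** 0.5
    sd_error = var_error ** 0.5
    rejections = 0
    for _ in range(n_sims):
        diffs = []
        for _ in range(N_PATIENTS):
            patient = rng.gauss(0.0, sd_patient)  # patient-level random effect
            obs_a = patient + rng.gauss(0.0, sd_error)
            obs_b = patient + effect + rng.gauss(0.0, sd_error)
            diffs.append(obs_b - obs_a)
        # With two observers, the blocked F statistic is the squared paired t
        f_stat = N_PATIENTS * mean(diffs) ** 2 / variance(diffs)
        if f_stat > F_CRIT_1_19:
            rejections += 1
    return rejections / n_sims


for effect in (0.0, 0.3, 0.6, 0.9):
    print(effect, simulate_power(effect))
```

The rejection rate at an effect of zero serves as a check on the procedure, since it should stay close to the nominal alpha of 0.05.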
ICC estimated reliability for each coordinate for each landmark: Table 2 displays all ICC results, by landmark, for both intra- and inter-observer reliability. Table 3 summarizes the ICC estimates for intra-observer reliability and Table 4 summarizes the ICC estimates for inter-observer reliability. Overall, the tables show that ICC indicated excellent reliability for both intra- and inter-observer assessments. Table 5 shows the F-test results for all landmarks, which indicated general agreement between the observers.
For intra-observer reliability, ICC estimates were > 0.9 for 78 of the 99 coordinates (78.79%), and 97 (97.98%) were > 0.75. Only two coordinates (2.02%) were < 0.75 (Z coordinate of Apex LR1 and Y coordinate of menton), and none were < 0.45. Midline structures (A, B, N, ANS, PNS, Gn, Me, Pog, S, crista galli, Ba) had better overall reliability with 32 of 33 coordinates > 0.75; only menton’s Y coordinate was under 0.75. Lateral structures (left and right of each Or, Co, Go, Po, incisal tip of upper and lower 1s, apices of upper and lower 1s, MB cusp of lower 6s, MB cusp of U6s, and apices of upper 6s) had 65 of 66 coordinates > 0.75; only the apex of LR1 was below.
For inter-observer reliability, ICC estimates were > 0.90 for 73 of the 99 coordinates (73.74%), and 94 (94.95%) were > 0.75. The remaining five coordinates were > 0.45 (X coordinates for Incisal of UR1 and for left and right Or, Y coordinate for menton, Z coordinate for apex LR1); none were below 0.45. Midline structures had 32 of 33 coordinates (97.0%) above 0.75, and lateral structures had 62 of 66 coordinates (93.9%) above 0.75.
Table 5 shows the results of the F-test, in which 87 of 99 observations (87.88%) showed no significant difference in coordinates. For the X coordinates, only four of the 33 (12.1%: GoR, MB Cusp L6R, OrL, OrR) produced significant results. In the Y dimension, seven observations (21.2%: apices of all four central incisors, Ba, Incisal L1L, OrR) produced significant results. The Z coordinates showed all but one coordinate with a non-significant result (3.0%: apex L1L).
Table 1: A list of 33 cephalometric points, commonly used in the Downs, Steiner and Grummons analyses, with corresponding anatomic locations in
each dimension
Table 2: Intraclass correlation coefficients for all landmarks for intra- and inter-observer reliability
Table 3: Intraclass Correlation Coefficients for intra-observer
reliability
Table 4: Intraclass Correlation Coefficients for inter-observer
reliability
Table 5: F-test (alpha = 0.05) results for all landmarks (* p < 0.05, **
p < 0.01)
Cephalometrics, as developed by Broadbent and Hofrath decades ago, uses linear and angular measurements based on landmarks on two-dimensional film [1]. CBCTs offer three-dimensional images of three-dimensional objects, e.g., the human skull, eliminating the translation into two dimensions required by traditional cephalometry.
As pointed out by Zamora, two-dimensional cephalometrics relies on projections of a three-dimensional skull onto two dimensions, rather than on specific points on specific bones [19]. This fact hinders any study that attempts to directly apply traditional cephalometrics to CBCTs. Points such as sella, defined broadly as the geometric center of sella turcica, have a new variable, the third dimension, which creates greater variation in identification [19,20]. As per de Oliveira, in these situations there is a natural tendency to identify landmarks in one or two planes that are easily visualized while disregarding a plane where the point is difficult to visualize [16].
This fact is emphasized in the present study by the comparatively lower reliability of the landmarks’ X dimension coordinates, representing the coronal plane; the coronal plane is the plane that is not represented in traditional cephalometry. This was true for both intra- and inter-observer reliability. Even with this increased difficulty, the present study found that overall reliability was excellent. Additionally, ProPlan CMF allows the CBCTs to be viewed in multiplanar (i.e., sagittal, axial and coronal) views as well as volumetric reconstructions. Several studies have shown this to improve reliability of landmark identification [13,14,18,21].
Overall, the present study agrees with previous studies that landmark identification in CBCTs is reliable and reproducible. It also suggests that ProPlan CMF is a program in which three-dimensional cephalometrics can be performed with confidence. The estimates for both intra- and inter-observer reliability were satisfactory: no measurement had a coefficient that would be rated “weak” (below 0.45), and only seven of 198 total observations fell between 0.75 and 0.45, the range rated as “fair.” All other observations (n = 191, 96.46%) were rated at least “good,” and 151 of these were rated “excellent” (above 0.9). Furthermore, the F-test found that 87 of 99 (87.88%) of the observations of the coordinates had no significant difference.
The general trend in the present study matched previous studies, in that midline structures show high reliability when translated into three-dimensional cephalometrics [13,14,18,22,23]. The 11 midline structures showed excellent reliability in all dimensions for both intra- and inter-observer reliability with the exception of menton, which rated as “good” in the X and Z dimensions and only “fair” in the Y dimension. The F-test produced a significant result from a single coordinate for a midline structure, the Y coordinate of basion. The lateral skeletal structures showed overall good to excellent reliability with the ICC. Left orbitale in the X plane for both inter- and intra-operator observations and right orbitale in the X plane for inter-operator observations were under 0.75. The F-test also produced significant results for the X coordinate of both left and right orbitale and the Y coordinate of right orbitale. Right gonion in the X dimension was the only other lateral skeletal structure to produce a significant F-test result. De Oliveira suggested that discrepancies in landmark identification are likely due to inadequate definitions of the points in space and the lack of a clear definition as to where they lie on curved surfaces, which is consistent with the limitations of translating cephalometric language to three-dimensional images.
Dental structures fared somewhat worse than skeletal structures in the ICC estimates of both intra- and inter-operator reliability. Apex of lower right central in the X plane for both inter- and intra-operator observations and the incisal tip of upper right central for inter-operator observations were below 0.75 and only rated “fair.” Seven of the 12 significant results of the F-test in the present study were for dental structures. Katkar et al. had previously found that dental points were less reliably identifiable on CBCTs, while Zamora concluded that dental landmark location was more highly reproducible [14,24]. The present study agrees with Katkar’s findings that, at least with the 160 µm resolution of scans used here, dental landmarks are indeed less reliably identified.
The current study used ProPlan CMF, an extremely common software package for planning orthognathic surgeries. Based on the results of the F-test and ICC, ProPlan appears to be a reliable program for three-dimensional cephalometrics. Lisboa noted that there is a paucity of three-dimensional cephalometric analysis software; thus it is important to test the reliability of those available to us. Two-dimensional cephalometrics remains the standard in orthodontics; the cost, increased exposure and the time investment required for landmark identification in three dimensions all remain obstacles [5,18]. Three-dimensional cephalometrics has to overcome these barriers before it can displace two-dimensional evaluation as the standard diagnostic tool.
Copyright © 2020 Boffin Access Limited.