Abstracts - 2007
Detecting Errors in Manual Labeling Using an Inverse Registration Scheme
Boon Thye Thomas Yeo, Mert Sabuncu & Polina Golland
We are developing a framework for detecting errors in the manual labeling of medical images.
Manual labeling of certain medical images is laborious but common, because the labels are used in many clinical and scientific studies. As computer scientists, our goal is to design algorithms that can automate this process. We therefore use manual labels in two ways: first, as training data for our automatic segmentation algorithms; and second, as gold standards for evaluating those algorithms. The accuracy of manual labeling is thus very important. In practice, we have observed that once automated segmentation reaches about eighty percent accuracy (measured against the manual segmentation), further improvement is limited by the accuracy of the manual segmentation itself.
Previous work (such as [1]) combines multiple manual or automatic segmentations of a single image into a better segmentation. In practice, however, one might have images of the same structures (e.g. brain MRI) from different subjects, with only a single manual segmentation per image. The idea is that we should be able to compare these segmentations of the same structures across different brains to discover errors and inconsistencies. In some sense, this is a more difficult problem than what has been attempted previously, because the correspondences across subjects are unknown. The problem is compounded by possible anatomical differences across subjects.
On the other hand, detecting errors in manual segmentations should be easier than supervised automated segmentation, where one uses subjects with manual labels (training data) to deduce the segmentation of a new (test) subject. In the canonical registration/segmentation scheme, one builds a probabilistic atlas from the training subjects and aligns (registers) a new brain to this atlas using features or geometry presumed relevant to the segmentation of the subject. The algorithm then deduces the “underlying” labels of the new subject from the segmentation atlas. One drawback of this method is the existence of multiple local minima: a wrong registration can occur, and such misregistrations account for most of the segmentation errors.
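To make the atlas step of the canonical scheme concrete, here is a minimal sketch, assuming the training label maps have already been resampled onto a common set of vertices. The function names and the array layout are hypothetical, and a per-vertex label frequency is only one simple way to form a probabilistic atlas; the registration step itself is not shown.

```python
import numpy as np

def build_label_atlas(label_maps, n_labels):
    """Estimate per-vertex label probabilities from aligned training labels.

    label_maps: integer array of shape (n_subjects, n_vertices), an assumed
    layout in which row i holds the manual labels of training subject i.
    Returns an array of shape (n_vertices, n_labels) whose rows sum to 1.
    """
    labels = np.asarray(label_maps)
    # One-hot encode each label, then average over subjects.
    one_hot = labels[..., np.newaxis] == np.arange(n_labels)
    return one_hot.mean(axis=0)

def most_likely_labels(atlas):
    """Assign each vertex the label with the highest atlas probability."""
    return atlas.argmax(axis=1)
```

In this toy form, a new subject that has been registered to the atlas would simply inherit the most probable label at each vertex, which is why a wrong registration translates directly into wrong labels.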
Instead, we propose an Inverse Registration Scheme (IRS). Since we have manual segmentations of all the subjects, this information can be used to co-register the data sets. In practice, this simplifies the problem substantially and yields robust algorithms. Rather than using geometric features to achieve registration and then guessing the labels, we propose the inverse: we use the manual labels to register the training images and then deduce errors from the image geometries. Using the labels yields a well-behaved and smooth registration function. Any geometrical inconsistencies that remain are then due either to incorrect labeling or to anatomical variability, both of which are interesting to detect.
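The inversion can be illustrated with a deliberately simplified toy, not the registration used in this work: signals live on a 1D ring, and "registration" is a brute-force search over circular shifts. All names here are hypothetical. The labels alone choose the transform, and the geometry is compared only afterwards.

```python
import numpy as np

def register_by_labels(moving_labels, fixed_labels):
    """Return the circular shift of moving_labels that best matches fixed_labels."""
    moving_labels = np.asarray(moving_labels)
    fixed_labels = np.asarray(fixed_labels)
    # Count agreeing positions for every candidate shift (toy exhaustive search).
    scores = [np.sum(np.roll(moving_labels, s) == fixed_labels)
              for s in range(len(fixed_labels))]
    return int(np.argmax(scores))

def geometry_residual(moving_labels, moving_geom, fixed_labels, fixed_geom):
    """Align using labels only, then return the shift and squared geometry mismatch."""
    shift = register_by_labels(moving_labels, fixed_labels)
    warped_geom = np.roll(np.asarray(moving_geom, dtype=float), shift)
    return shift, (warped_geom - np.asarray(fixed_geom, dtype=float)) ** 2
```

Positions where the residual stays large after label-driven alignment are the candidates for labeling errors or anatomical variability.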
The figure below shows preliminary results. The two spherical images on the left show two brain surfaces after they have been aligned using their labels. Note that the brown structural boundaries are at the same spatial locations. However, there is a manual segmentation error in the middle brain, resulting in a misalignment of geometry or image intensity. Because all the other brains look like the subject on the left, a simple measure such as the sum of squared differences between the middle spherical image and the remaining brains yields the result on the right, where the bright red spots indicate geometrical discrepancies between the middle brain and the other brains. In this case, we have successfully detected segmentation errors in the middle brain.
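The sum-of-squared-differences measure described above can be sketched as follows, assuming each label-aligned brain has been sampled onto the same set of vertices and stored as one row of an array. The function name and data layout are assumptions made for illustration.

```python
import numpy as np

def discrepancy_map(features, subject):
    """Per-vertex sum of squared differences between one subject and the rest.

    features: array of shape (n_subjects, n_vertices) holding a geometric or
    intensity value per vertex after label-based alignment (assumed layout).
    subject: row index of the brain being checked.
    """
    features = np.asarray(features, dtype=float)
    others = np.delete(features, subject, axis=0)
    # Large values mark vertices where this subject disagrees with the others.
    return ((others - features[subject]) ** 2).sum(axis=0)
```

Vertices with large values in the returned map correspond to the bright spots in the figure; how to threshold the map is left open here.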
Bruce Fischl (MGH), Ron Kiliany (BU)
Boon Thye Thomas Yeo is funded by the Agency for Science, Technology and Research, Singapore. Support for this research was provided in part by NIH NIBIB NAMIC U54-EB005149, NIH NCRR NAC P41-RR13218, NIH NINDS R01-NS051826, NIH NCRR mBIRN U24-RR021382, NCRR P41-RR14075, R01 RR16594-01A1, NSF JHU ERC CISST, the National Institute for Biomedical Imaging and Bioengineering (R01 EB001550), the National Institute for Neurological Disorders and Stroke (R01 NS052585-01), as well as the Mental Illness and Neuroscience Discovery (MIND) Institute.
[1] S.K. Warfield, K.H. Zou, and W.M. Wells. Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation. IEEE Transactions on Medical Imaging, 23(7):903-921, July 2004.