Abstract
Classification-based approaches for segmenting medical images commonly suffer from missing ground truth: often one has to resort to manual labelings by human experts, which may show considerable intra-rater and inter-rater variability. We experimentally evaluate several latent class and latent score models for tumor classification based on manual segmentations of different quality, using approximate variational techniques for inference. For the first time, we also study models that make use of image feature information on this specific task. Additionally, we analyze the outcome of hybrid techniques formed by combining aspects of different models. Benchmarking results on simulated MR images of brain tumors are presented: while simple baseline techniques already gave very competitive performance, significant improvements could be made by explicitly accounting for rater quality. Furthermore, we outline how these models transfer to the task of fusing manual tumor segmentations derived from different imaging modalities on real-world data.
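To make the contrast between a simple baseline and a model that accounts for rater quality concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary voxel labels, conditionally independent raters, and one sensitivity and one specificity per rater, and it swaps in plain EM where the paper uses approximate variational inference. Image features and spatial structure are ignored. All function names, the simulated rater accuracies, and the data sizes are hypothetical choices made for illustration only.

```python
import random


def majority_vote(labels):
    """Baseline: label a voxel as tumor (1) if more than half of the raters say so."""
    return [1 if 2 * sum(row) > len(row) else 0 for row in labels]


def _clip(x, eps=1e-6):
    """Keep probabilities away from 0 and 1 so the E-step never divides by zero."""
    return min(max(x, eps), 1.0 - eps)


def em_latent_class(labels, n_iter=50):
    """EM for a binary latent class model with per-rater sensitivity and specificity.

    labels[i][r] is the 0/1 label that rater r gave to voxel i.
    Returns (posterior tumor probabilities, sensitivities, specificities).
    """
    n_voxels, n_raters = len(labels), len(labels[0])
    # Initialise the posterior with the fraction of raters voting "tumor".
    post = [sum(row) / n_raters for row in labels]
    sens, spec = [0.5] * n_raters, [0.5] * n_raters
    for _ in range(n_iter):
        # M-step: re-estimate class prior and rater quality from soft labels.
        w1 = max(sum(post), 1e-12)
        w0 = max(n_voxels - sum(post), 1e-12)
        prior = _clip(sum(post) / n_voxels)
        sens = [_clip(sum(p * row[r] for p, row in zip(post, labels)) / w1)
                for r in range(n_raters)]
        spec = [_clip(sum((1 - p) * (1 - row[r]) for p, row in zip(post, labels)) / w0)
                for r in range(n_raters)]
        # E-step: posterior probability that each voxel is tumor.
        new_post = []
        for row in labels:
            like1, like0 = prior, 1.0 - prior
            for r, y in enumerate(row):
                like1 *= sens[r] if y else 1.0 - sens[r]
                like0 *= 1.0 - spec[r] if y else spec[r]
            new_post.append(like1 / (like1 + like0))
        post = new_post
    return post, sens, spec


def simulate(n_voxels=2000, accuracies=(0.98, 0.65, 0.65, 0.65, 0.65),
             p_tumor=0.3, seed=0):
    """Hypothetical toy data: one reliable rater and four noisy ones."""
    rng = random.Random(seed)
    truth = [1 if rng.random() < p_tumor else 0 for _ in range(n_voxels)]
    labels = [[t if rng.random() < a else 1 - t for a in accuracies] for t in truth]
    return truth, labels


def accuracy(pred, truth):
    """Fraction of voxels whose predicted label matches the simulated truth."""
    return sum(int(p == t) for p, t in zip(pred, truth)) / len(truth)


if __name__ == "__main__":
    truth, labels = simulate()
    post, sens, spec = em_latent_class(labels)
    em_pred = [1 if p > 0.5 else 0 for p in post]
    print("majority vote accuracy:", accuracy(majority_vote(labels), truth))
    print("latent class accuracy: ", accuracy(em_pred, truth))
    print("estimated sensitivities:", [round(s, 2) for s in sens])
```

In this toy setting the majority vote treats every rater equally, whereas the latent class model can learn to weight the reliable rater more heavily. How large the gap is on real segmentations depends on how well the independence and per-rater quality assumptions hold.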
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kaster, F.O., Menze, B.H., Weber, M.-A., Hamprecht, F.A. (2011). Comparative Validation of Graphical Models for Learning Tumor Segmentations from Noisy Manual Annotations. In: Menze, B., Langs, G., Tu, Z., Criminisi, A. (eds) Medical Computer Vision. Recognition Techniques and Applications in Medical Imaging. MCV 2010. Lecture Notes in Computer Science, vol 6533. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18421-5_8
DOI: https://doi.org/10.1007/978-3-642-18421-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18420-8
Online ISBN: 978-3-642-18421-5
eBook Packages: Computer Science, Computer Science (R0)