
Anomaly Detection Techniques in the Gaia Space Mission Data

Journal of Signal Processing Systems

A Correction to this article was published on 03 November 2021


Abstract

In this paper we address the classification of anomalous data detected by the data reduction system of the Gaia space mission, in operation since 2013. The volume and complexity of the intermediate data and diagnostic plots are beyond any practical possibility of full human evaluation, so automated signal processing tools are becoming increasingly necessary. Our classification task consists in discriminating between “normal” data and data affected by anomalies, which are currently grouped into four classes. We investigate dedicated pre-processing approaches that enable, on the one hand, a tailored technique based on the Hough transform and, on the other hand, several machine learning tools, and we show that the former solves the task exactly. Among the machine learning approaches, random forests and support vector machines provide less than satisfactory performance, while convolutional neural networks achieve very good classification accuracy, up to 91.22%. Per-class precision and recall are also satisfactory.
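As a minimal sketch of the kind of convolutional classifier discussed above, the snippet below builds a five-class model (the “normal” class plus the four anomaly classes) with Keras/TensorFlow. The input size, layer widths, and hyperparameters are placeholders chosen for illustration and do not reproduce the architecture or training setup used in the paper.

```python
# Minimal sketch: a small CNN that assigns a diagnostic image to one of five
# classes ("normal" plus four anomaly types). Input size (64x64, one channel)
# and all hyperparameters are illustrative assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 5  # "normal" + four anomaly classes

model = keras.Sequential([
    layers.Input(shape=(64, 64, 1)),           # assumed image size and channel count
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),                       # regularization; rate is arbitrary here
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data, only to show the expected shapes.
x = np.random.rand(8, 64, 64, 1).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=8)
model.fit(x, y, epochs=1, verbose=0)
print(model.predict(x).shape)  # (8, 5): one probability per class
```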




Notes

  1. http://www.cosmos.esa.int/web/gaia; http://gea.esac.esa.int/archive

  2. Astronomical magnitude is a relative logarithmic unit. The scale is defined so that a difference of five magnitudes corresponds to exactly a factor of 100 in brightness; each step of one magnitude therefore changes the brightness by a factor of approximately 2.512, and a magnitude 1 star is 100 times brighter than a magnitude 6 star. The brighter an object is, the lower its magnitude (see the short worked example after these notes).

  3. https://www.python.org/
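A minimal worked example of the magnitude-to-flux relation in Note 2, assuming only the standard definition (five magnitudes correspond to exactly a factor of 100 in flux, so the flux ratio for a magnitude difference Δm is 100^(Δm/5) ≈ 2.512^Δm):

```python
# Worked example for Note 2: converting a magnitude difference to a flux ratio.
# Five magnitudes correspond to exactly a factor of 100, so one magnitude
# corresponds to 100**(1/5) ~= 2.512.
def flux_ratio(m_faint: float, m_bright: float) -> float:
    """Return how much brighter the m_bright object is than the m_faint one."""
    return 100.0 ** ((m_faint - m_bright) / 5.0)

print(round(100.0 ** 0.2, 3))  # 2.512  (one magnitude step)
print(flux_ratio(6.0, 1.0))    # 100.0  (magnitude 1 vs. magnitude 6)
```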



Acknowledgements

The activity has been partially funded by the Italian Space Agency (ASI) under contracts Gaia Mission, The Italian Participation to DPAC, 2014-025-R.1.2015 and 2018-24-HH.0. This research was partially carried out in the context of the Visiting Professor Program of the Italian Istituto Nazionale di Alta Matematica (INdAM).

Author information


Corresponding author

Correspondence to Marco Roberti.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: in the version published on 13 September 2021, the authors' department address was erroneously listed as that of the Astronomical Observatory with which they have been working. The error was not noticed during the publication process.


About this article


Cite this article

Roberti, M., Druetto, A., Busonero, D. et al. Anomaly Detection Techniques in the Gaia Space Mission Data. J Sign Process Syst 93, 1339–1357 (2021). https://doi.org/10.1007/s11265-021-01688-6


