Ethological data mining: an automata-based approach to extract behavioral units and rules

Data Mining and Knowledge Discovery

Abstract

We propose an efficient automata-based approach to extract behavioral units and rules from continuous sequential data of animal behavior. By introducing novel extensions, we integrate two elemental methods, the N-gram model and Angluin’s machine learning algorithm, into an ethological data mining framework. This allows us to obtain a minimized automaton representation of behavioral rules that accepts (or generates) the smallest set of possible behavioral patterns from sequential data of animal behavior. We demonstrate how this ethological data mining works using real birdsong data from the Bengalese finch, and we evaluate the method experimentally using artificial birdsong data generated by a computer program. The results suggest that our ethological data mining works effectively even for noisy behavioral data, provided that the parameters we introduce are set appropriately. In addition, a case study using the Bengalese finch song shows that our method successfully captures the core structure of the singing behavior, such as loops and branches.
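To make the two building blocks named in the abstract concrete, here is a minimal Python sketch; it is not the authors' implementation. It assumes that song notes are encoded as single characters, uses plain N-gram counting to find frequent note chunks, and uses the zero-reversible case (k = 0) of Angluin's reversible-language inference to generalize example songs into an automaton. The function names, the `min_count` threshold, and the toy songs are hypothetical, and the extensions and parameters that the paper introduces are not described in the abstract and are not reproduced here.

```python
from collections import Counter


def frequent_ngrams(songs, n, min_count):
    """Count every length-n window of notes and keep the frequent ones."""
    counts = Counter()
    for song in songs:
        for i in range(len(song) - n + 1):
            counts[tuple(song[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def infer_zero_reversible(samples):
    """Sketch of zero-reversible inference from positive examples only.

    Returns (start_state, accept_state, transitions), where transitions
    maps (state, symbol) -> state.
    """
    # 1. Build a prefix tree acceptor; state 0 is the root.
    trans, finals, n_states = {}, set(), 1
    for sample in samples:
        q = 0
        for a in sample:
            if (q, a) not in trans:
                trans[(q, a)] = n_states
                n_states += 1
            q = trans[(q, a)]
        finals.add(q)

    # Union-find over states; each block becomes one merged state.
    parent = list(range(n_states))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # 2. A zero-reversible automaton has a single accepting state.
    finals = sorted(finals)
    for f in finals[1:]:
        union(f, finals[0])

    # 3. Merge blocks until the automaton is deterministic in both the
    #    forward and the backward direction. Rescan after every merge.
    changed = True
    while changed:
        changed = False
        fwd, bwd = {}, {}
        for (q, a), r in trans.items():
            bq, br = find(q), find(r)
            seen_target = fwd.setdefault((bq, a), br)
            if seen_target != br:  # two targets for one (block, symbol)
                union(seen_target, br)
                changed = True
                break
            seen_source = bwd.setdefault((a, br), bq)
            if seen_source != bq:  # two sources reach one block on a
                union(seen_source, bq)
                changed = True
                break

    delta = {(find(q), a): find(r) for (q, a), r in trans.items()}
    accept = find(finals[0]) if finals else None
    return find(0), accept, delta


def accepts(automaton, seq):
    """Run seq through the automaton and report whether it is accepted."""
    q, accept, delta = automaton
    for a in seq:
        if (q, a) not in delta:
            return False
        q = delta[(q, a)]
    return q == accept


# Toy data: two hypothetical songs sharing the motif "abc".
songs = ["abcd", "abcabcd"]
chunks = frequent_ngrams(songs, n=3, min_count=3)
automaton = infer_zero_reversible(songs)
```

On these two toy songs, the only trigram seen at least three times is `abc`, and the inferred automaton accepts any number of repetitions of `abc` followed by `d`. The repeated motif has been folded into a loop, which is the kind of structure the abstract refers to as loops and branches. The inferred language is also larger than the sample: the bare song `d` is accepted too, which illustrates why generalization from noisy data needs to be controlled.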

Author information

Corresponding authors

Correspondence to Yasuki Kakishita or Kazutoshi Sasahara.

Additional information

Responsible editor: Eamonn Keogh.

Yasuki Kakishita and Kazutoshi Sasahara have contributed equally to this work.

About this article

Cite this article

Kakishita, Y., Sasahara, K., Nishino, T. et al. Ethological data mining: an automata-based approach to extract behavioral units and rules. Data Min Knowl Disc 18, 446–471 (2009). https://doi.org/10.1007/s10618-008-0122-1

  • DOI: https://doi.org/10.1007/s10618-008-0122-1
