Abstract
Spoken language interactive systems range from speech-enabled command interfaces to dialogue systems that conduct spoken conversations with the user. In the first case, spoken language serves as an alternative input and output modality: commands that the user could type or select from a menu may also be uttered, and system responses can be given as spoken utterances instead of written text or drawings on the screen, so the whole interaction can be conducted in speech. Spoken dialogue systems, by contrast, are built on models of spoken conversation between participants, which allows more flexible interaction. Although the interactions are limited in their topics, turn-taking principles and conversational strategies, these systems aim at natural human–computer interaction that lets the user interact with the system in an intuitive manner. Moreover, by combining insights into the processes that underlie typical human interactions, spoken dialogue modelling also seeks to advance our knowledge and understanding of the principles that govern communicative situations in general.
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Jokinen, K. (2010). Spoken Language Dialogue Models. In: Chen, F., Jokinen, K. (eds) Speech Technology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-73819-2_3
DOI: https://doi.org/10.1007/978-0-387-73819-2_3
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-73818-5
Online ISBN: 978-0-387-73819-2
eBook Packages: Engineering, Engineering (R0)