Design and implementation of a single-chip speech-to-speech translation system
- Author(s): J.-F. Wang ; S.-C. Lin ; J.-C. Wang ; H.-W. Yang
- DOI: 10.1049/ip-cds:20045067
- Affiliation (all authors): Electrical Engineering, National Cheng Kung University, Taiwan, Republic of China
- Source: Volume 153, Issue 5, October 2006, p. 416–426
- Print ISSN 1350-2409, Online ISSN 1359-7000
This work presents the first chip design for a portable speech-to-speech translation application. First, we construct a speech-to-speech translation system based on multiple-translation spotting (MTS). In the proposed MTS method, the optimal multiple-translation spotting template is retrieved and the appropriate target patterns are then extracted. To overcome the computational bottleneck of the MTS algorithm, this work introduces a template retrieval core and a target pattern extraction core. Combined with a low-cost programmable core, the design draws on the strengths of both programmable and application-specific architectures in performance, design complexity, and flexibility. The design was experimentally verified with semi-custom chips fabricated in 0.35 μm CMOS single-poly four-metal technology on a die of approximately 3.85 × 3.85 mm².
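The two MTS stages named in the abstract, template retrieval followed by target pattern extraction, can be illustrated with a toy software sketch. This is not the authors' algorithm or their hardware cores: the `Template` structure, the longest-common-subsequence scoring, the `alignment` map, and the placeholder target labels are all assumptions chosen only to make the retrieve-then-extract flow concrete.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical bilingual template (structure assumed for illustration)."""
    source: list      # source-language words
    target: list      # target-language patterns
    alignment: dict   # source word index -> target pattern index


def matched_positions(query, source):
    """Return indices in `source` that lie on one longest common
    subsequence shared with `query` (assumed similarity measure)."""
    n, m = len(query), len(source)
    # dp[i][j] = LCS length of query[i:] and source[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if query[i] == source[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table forward to recover the matched source indices.
    hits, i, j = [], 0, 0
    while i < n and j < m:
        if query[i] == source[j]:
            hits.append(j)
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return hits


def retrieve_template(query, templates):
    """Stage 1 (sketch): pick the template whose source side shares
    the most words, in order, with the recognised query."""
    return max(templates, key=lambda t: len(matched_positions(query, t.source)))


def extract_target_patterns(query, template):
    """Stage 2 (sketch): keep the target patterns aligned with the
    matched source words, in target-side order."""
    hits = matched_positions(query, template.source)
    target_idx = sorted({template.alignment[j] for j in hits
                         if j in template.alignment})
    return [template.target[k] for k in target_idx]


# Toy data with placeholder target labels (not real translations).
templates = [
    Template(["where", "is", "the", "station"],
             ["<station>", "<where>"], {0: 1, 3: 0}),
    Template(["how", "much", "is", "this"],
             ["<this>", "<how-much>"], {0: 1, 1: 1, 3: 0}),
]
query = ["where", "is", "station"]
best = retrieve_template(query, templates)
patterns = extract_target_patterns(query, best)
```

In this sketch both stages are dominated by the same dynamic-programming table fill, which is run once per stored template during retrieval. Regular, repeated inner loops of this kind are the sort of workload that dedicated cores alongside a programmable core are typically meant to offload, although the actual computation inside the paper's cores is not described in the abstract.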
Inspec keywords: speech recognition equipment; integrated circuit design; CMOS digital integrated circuits; speech synthesis; application specific integrated circuits; language translation
Subjects: Speech recognition and synthesis; Speech processing techniques; Digital circuit design, modelling and testing; CMOS integrated circuits; Semiconductor integrated circuit design, layout, modelling and testing