This paper describes empirical results of information retrieval in 13 languages of the Cross Language Evaluation Forum (CLEF) collection, augmented with results for Turkish, using syllables as a means of managing morphological variation in the languages. This kind of approach has been used in speech retrieval [1], but it has seldom been tried in text-based IR, although it has several clear advantages. First, a reasonably well-performing version can be implemented with a very simple syllabification algorithm consisting only of variants of one syllable structure rule, CV (consonant-vowel). Second, although syllable-based management of word form variation resembles n-gramming [2], the number of units produced with syllables is more restricted, which keeps the text index smaller and retrieval faster. Third, a syllable-based approach makes it possible to use different types of syllabification procedures, ranging from very fine-grained (language-specific) to very coarse (more language-independent). Fourth, syllable-based methods work for both speech and text retrieval. Our results show that the two different CV syllabification procedures produced good results for four morphologically complex languages of the CLEF collection, and also for Turkish. For three of the languages that obtained good results with CV syllabification, German, Finnish and Turkish (De, Fi and Tu), we also tried language-specific, accurate syllabification procedures. Accurate syllabification did not produce IR results as good as those of the CV procedures, but it was not far behind in performance.
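To make the idea of a CV rule concrete, the following is a minimal sketch of one possible coarse syllabifier. It is an illustration only and not the procedure evaluated in the paper, whose exact rule variants are not given here. The function name `cv_syllabify`, the vowel inventory `VOWELS`, and the choice to cut before every consonant that is directly followed by a vowel are all assumptions made for this example.

```python
# Hypothetical vowel inventory (an assumption); a real system would
# need a per-language or at least a broader set.
VOWELS = set("aeiouyäöüı")


def cv_syllabify(word, vowels=VOWELS):
    """Split a word into coarse syllable-like units.

    Assumed rule: start a new unit before every consonant that is
    directly followed by a vowel (a CV onset). Any character that is
    not in `vowels` is treated as a consonant.
    """
    word = word.lower()
    if not word:
        return []
    # Positions where a new unit starts (position 0 is always a start).
    cuts = [
        i
        for i in range(1, len(word) - 1)
        if word[i] not in vowels and word[i + 1] in vowels
    ]
    bounds = [0] + cuts + [len(word)]
    return [word[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    for w in ("kirjasto", "talossa", "auto"):
        print(w, cv_syllabify(w))
```

In a retrieval setting, units like these could be indexed in place of, or alongside, full word forms, in the same way character n-grams are; how the units are combined or weighted is left open in this sketch.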