Intelligent computer assisted blog writing system

https://doi.org/10.1016/j.eswa.2011.09.139

Abstract

In this paper, we design and implement an intelligent computer assisted blog writing system. The system comprises a concept expansion module, a text generation module, and a content replacement module. The concept expansion module expands the concept provided by the user, and the expanded concepts give the system more semantic information. The text generation module is based on a keyword generation model and a text segment generation model: important keywords are extracted from the system corpus to serve as the backbone of a template, while the text segments between keywords make up the content of the template, with candidate text segments retrieved from the corpus by statistical analysis. The content replacement module uses Google to retrieve content from the Web and ranks that content by term and POS tagging similarities. The prototype system shows that the approach works well in the blog writing domain, and the concept of this research could be extended to other domains easily.

Highlights

► Design and implement an intelligent computer assisted blog writing system.
► Propose a replaceable content retrieval mechanism to obtain example texts from the Web.
► Propose a content ranking mechanism to provide a ranking score for replaceable content.

Introduction

With the popularity of the World Wide Web, the Web has become a new knowledge source. Recently, the concept of “Web 2.0” has facilitated communication, information sharing, interoperability, and collaboration on the Web. End users are now not only information consumers but also information producers. The popularity of blogs inspires people to share their feelings or opinions about events on their own blogs. According to Technorati statistics, over 112.8 million blogs have been recorded and over 250 million pieces of tagged social media exist. Over 1.6 million blog entries are updated every day. In addition to the popularity of the Web, the evolution of tools that facilitate the production and maintenance of Web articles has made the publishing process less technical. Meanwhile, such a potentially rich set of on-line writing activities can give users opportunities to learn essay writing, because blog writing draws on familiar writing skills such as organization, paraphrasing, and the development of thinking.

In recent years, essays have become a major part of formal education, and students are expected to demonstrate writing ability in many exams. Although writing is very important, composing an article from scratch is still difficult for many users. Writing also matters beyond school, in daily life. For example, when people communicate with their clients by email, the ability to write helps them express their ideas clearly. According to our observation, many students regard essay writing as a difficult job, yet they are willing to write articles on their own blogs. One reason is that the structure and content of a blog article are less formal than those of an essay, and the topic can be anything. Even so, people can still gain writing experience from blog writing, since they learn by doing.

Over the last few decades, much research has been done on spelling and grammar checkers, and these checkers have been integrated into many word processor applications. In practice, these tools can correct users’ writing errors, but they cannot assist users in writing articles from scratch. For example, when a user wants to compose an article, the major challenge is how to organize the content and how to choose appropriate text segments to express his/her ideas, rather than how to avoid spelling and grammar errors. In this paper, the proposed blog writing system helps users compose blog articles from a single concept.

Typically, a user starts with a subject or title in mind when writing a blog article or an essay. For example, a user who wants to write a blog article about travel will start by collecting ideas about travel and will use text segments related to travel events. The subject or title can be regarded as the main concept of the article, which guides the user in finishing the article. In addition, people tend to reuse text segments or terms that have appeared in other articles. A common learning approach is learning by doing, in which users imitate the structures and patterns of good writing to compose their own articles. Digital content on the Web is currently growing exponentially, and many people are willing to share their blogs on the Web. The Web can therefore be regarded as a large database from which many example texts can be obtained. Based on these observations, we designed and implemented an intelligent assisted blog writing system. The writing system helps users compose a blog article from a single concept and treats the Web as the system corpus that provides example texts.

We collected about 30,000 blog articles from the Internet. A document classification process is first performed to filter out inappropriate articles, and the remaining data is used as the basic data corpus. When the user starts to use the system, he/she is asked to provide a concept from which the system generates appropriate content. The concept expansion module derives related concepts from the concept given by the user and assigns a weight to each related concept. The system then uses these weighted related concepts to rank the content templates, each of which contains keywords and the text segments between keywords. To generate the keywords of a content template, we employ the keyword extraction and expansion modules proposed by Liu, Lee, Yu, and Chen (2011) to retrieve representative keywords and to expand them. To generate the text segments of a content template, we employ a statistical approach to obtain the text segments between keywords from the corpus.
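As a rough illustration of how weighted related concepts could be used to rank content templates, the sketch below scores each template by summing the weights of the expanded concepts that occur among its keywords. The scoring rule, the function names, and the sample data are all assumptions made for illustration; this excerpt does not state the formula the system actually uses.

```python
from typing import Dict, List, Tuple


def score_template(keywords: List[str], concept_weights: Dict[str, float]) -> float:
    """Sum the weights of expanded concepts found among a template's keywords.

    Assumption: a plain weighted sum; the paper's real scoring may differ.
    """
    return sum(concept_weights.get(k, 0.0) for k in set(keywords))


def rank_templates(
    templates: Dict[str, List[str]], concept_weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (template name, score) pairs, highest score first."""
    scored = [
        (name, score_template(kws, concept_weights))
        for name, kws in templates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical output of concept expansion for the user concept "travel".
concept_weights = {"travel": 1.0, "hotel": 0.5, "flight": 0.5, "beach": 0.25}

# Hypothetical content templates, each reduced to its keyword backbone.
templates = {
    "t1": ["travel", "hotel", "breakfast"],
    "t2": ["exam", "school"],
    "t3": ["travel", "flight", "beach"],
}

ranking = rank_templates(templates, concept_weights)
for name, score in ranking:
    print(name, score)
```

With these made-up weights, a template whose keywords overlap more heavily with the expanded concepts is listed first, and a template with no overlap receives a score of zero.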

In addition to using content templates, users can edit or replace content to meet their requirements. The content replacement module uses Google to retrieve candidate content. Since Google may return hundreds of thousands of results, we propose a content ranking mechanism, based on term and Part-Of-Speech (POS) tagging similarities, to select appropriate content. The prototype system shows that the approach works well in the blog application domain, and the concept of this research could be extended to other domains easily. The main contributions of this paper include:

  • 1. Design and implement an intelligent computer assisted blog writing system.

  • 2. Propose a replaceable content retrieval mechanism to obtain example texts from the Web.

  • 3. Propose a content ranking mechanism to provide a ranking score for replaceable content.
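The content ranking mechanism combines term similarity and POS tagging similarity. The sketch below shows one way such a ranking score could be computed. It is only an illustration under stated assumptions: term similarity is taken to be Jaccard overlap of token sets, POS similarity is taken to be the `difflib.SequenceMatcher` ratio over tag sequences, the two are mixed with a hypothetical weight `alpha`, and the inputs are assumed to be tokenized and POS-tagged beforehand. None of these choices is confirmed by this excerpt.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# A segment is a list of (token, POS tag) pairs produced by some tagger upstream.
Tagged = List[Tuple[str, str]]


def term_similarity(a: Tagged, b: Tagged) -> float:
    """Jaccard overlap of the lower-cased token sets (an assumed measure)."""
    terms_a = {word.lower() for word, _ in a}
    terms_b = {word.lower() for word, _ in b}
    if not terms_a and not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def pos_similarity(a: Tagged, b: Tagged) -> float:
    """Similarity of the two POS tag sequences (an assumed measure)."""
    tags_a = [tag for _, tag in a]
    tags_b = [tag for _, tag in b]
    return SequenceMatcher(None, tags_a, tags_b).ratio()


def rank_candidates(
    original: Tagged, candidates: List[Tagged], alpha: float = 0.5
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, best candidate first.

    alpha is a hypothetical mixing weight between term and POS similarity.
    """
    scored = [
        (i, alpha * term_similarity(original, c) + (1 - alpha) * pos_similarity(original, c))
        for i, c in enumerate(candidates)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, hand-tagged example segments (Penn Treebank-style tags).
original = [("the", "DT"), ("beach", "NN"), ("was", "VBD"), ("beautiful", "JJ")]
candidates = [
    [("the", "DT"), ("hotel", "NN"), ("was", "VBD"), ("beautiful", "JJ")],
    [("we", "PRP"), ("booked", "VBD"), ("a", "DT"), ("flight", "NN")],
]

ranked = rank_candidates(original, candidates)
for index, score in ranked:
    print(index, round(score, 3))
```

In this toy example the first candidate shares most of its terms and its entire tag sequence with the original segment, so it is ranked above the second candidate, which shares no terms and only part of the tag sequence.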

The rest of the paper is structured as follows. Section 2 surveys related research. Section 3 presents the system architecture and design. Section 4 reports the experiments and evaluation results. Section 5 concludes the paper.

Section snippets

Related work

In essence, the ability to write plays an important role in language learning. Writing not only improves users’ writing skills but also helps them develop the ability to communicate. Although machine translation is widely used for this purpose, finding an efficient way for humans to collaborate with computers remains an open issue. Recently, various on-line writing assistance tools have been developed to help users compose articles. Liu, Zhou, Gao, Xun, and Huang (2000) proposed

System architecture

Fig. 1 shows the system flow, which includes data preprocessing, concept expansion, keyword extraction and expansion, template article generation, and phrase substitution. These modules are described in the following sections.

Experiment and discussion

In this paper, we designed and implemented an intelligent computer assisted blog writing system. The experimental results are shown in the following sections. In addition, ten people were invited to try the system, and a questionnaire listing several evaluation criteria was used to evaluate it.

Conclusion

This paper presents an intelligent computer assisted blog writing system that includes concept expansion, text generation, and replaceable content suggestion. We propose a layer-based concept expansion approach, based on HowNet, that extends a single concept into several related concepts with weighting information. The related concepts with weighting information then help the system rank and determine content templates, each of which consists of a

Acknowledgment

This work was supported in part by the National Science Council under Grants NSC-99-2221-E-009-150 and NSC-099-2811-E-009-041.

References (19)

  • Huang, T.-C., et al. (2009). A blog article recommendation generating mechanism using an SBACPSO algorithm. Expert Systems with Applications.
  • Liu, C.-L., et al. (2011). Computer assisted writing system. Expert Systems with Applications.
  • Bustamante, F. R., & Leon, F. S. (1996). Gramcheck: A grammar and style checker. In Proceedings of the international...
  • Chang, T.-H., & Lee, C.-H. (2003). Automatic Chinese unknown word extraction using small-corpus-based method. In...
  • Dai, L., et al. Measuring semantic similarity between words using HowNet.
  • Dong, Z., & Dong, Q. (2003). HowNet – a hybrid language and knowledge resource. In Proceedings of the International...
  • Esuli, A., & Sebastiani, F. (2005). Determining the semantic orientation of terms through gloss classification. In...
  • Genthial, D., & Courtin, J. (1992). From detection/correction to computer aided writing. In Proceedings of the 14th...
  • Komatsu, H., Takabayashi, S., & Masui, T. (2005). Corpus-based predictive text input. In Proceedings of the 2005...
There are more references available in the full text version of this article.
