1 Introduction

Knowledge is regarded as one of the most important assets in any organisation and a key factor in representing organisational core competence (Nonaka et al. 2000). Knowledge sharing within an organisation transforms the knowledge of individuals or groups into the knowledge of the whole organisation. The staff can then fully and efficiently utilise the knowledge resources the organisation already owns to create value for it (Hu et al. 2007). Therefore, companies have invested resources in the development of technological solutions specifically to enhance internal knowledge sharing (Chen and Chen 2006).

As organisations seek innovative ways to capture and distribute knowledge, many corporate intranets now try to exploit social and community intelligence (SCI) to identify potential business opportunities as well as experts that conform to their needs (Zhang et al. 2011). In order to meet these requirements, companies have started to perform large-scale analysis of personal, group, and community dynamics over individual activities in virtual environments such as content sharing, blogging, social networking and collaborative tagging. In this work, we exploit community intelligence in corporate wikis, which have been widely adopted in companies to encourage knowledge sharing among participants. The wiki is one of the signature applications of the Web 2.0 wave: a hypertext system that supports community-oriented collaborative composition and includes a set of assisting tools for this collaboration (Harrer et al. 2008). The informal and unstructured nature of wikis makes them ideal for knowledge sharing; success has less to do with technology and more with changing habits so as to ease collaborative participation and knowledge sharing among peers. The content of wiki pages is the fruit of collective work where individuals, in general, collaborate towards a common goal (Pancerella et al. 2001).

In such a multi-editing environment, individuals can take part in a collaborative process of creating, revising and sharing knowledge that might be of interest to many. The amount and quality of knowledge produced and shared will depend on the level of engagement of those involved in the collaborative process. Usually, the more intense the social commitment, the higher the chance of creating refined content. Intense participation facilitates the “brainstorming” process, where new concepts may emerge due to the diversity of contributions (Jun and Weiguo 2008).

Within the organisational context, corporate wikis benefit from a collection of specialised contributions, since different groups have specific knowledge (skills) in different areas of a domain. For instance, in a software factory, a wiki page intended to document a system requires inputs from project managers, software architects, programmers, testers, etc. Project managers depict the work plan and task allocations; software architects report the architectural style and design patterns; programmers report on code conventions and documentation; testers describe the procedure to find bugs (errors or other defects). Each group contributes its own knowledge to the overall content building of a wiki page. Nevertheless, creating this collective knowledge structured in a wiki system requires the commitment of a team to keep such information up to date and consistent. As pointed out by Holtzblatt et al. (2010), “some people did not feel inclined to share specific information because of the extra time and effort involved”. The maintenance of knowledge remains an obstacle to the effective utilisation of wikis as a tool to support the dissemination of knowledge within an enterprise (Holtzblatt et al. 2010; Harrer et al. 2008).

One propitious way of overcoming, or at least lessening, this challenge is to recommend missing (and related) information placed in other pages of the system or in external sources. Recommendations would therefore play the role of employees by providing missing and related information, thereby augmenting the usefulness of wikis for knowledge sharing. A straightforward advantage of recommendations is that end users need not seek further information in other pages or from external sources: the information is right there, ready to be consumed. Obviously, the utility of recommendations will depend on the quality of the recommended content. The tricky point lies in deciding which recommendations are most likely to provide constructive information for the target pages.

Beyond the adoption of traditional similarity measurements to find related pages, the expertise of contributors and the social commitment on the wiki pages should be considered. We believe that promising recommendations can be biased by the collaborative work on a wiki page, i.e. the social commitment in transferring information to share and solve issues for the overall benefit of the community (Portes 1998; Putnam 2000). Our hypothesis is that the collaborative value of wiki pages reflects their knowledge sharing capacity. In this context, we propose a measure for calculating the collaborative value of wiki pages as a summarisation of the individual efforts where collaboration takes place. We name this measure WPCV (Wiki Page Collaborative Value). This value is later used to rank more promising recommendations for problem solving higher. In particular, this study relies on tag-based recommendations for suggesting related wiki pages that might be useful for providing complementary information. Although the model can be applied on top of other types of recommender systems, we test tag-based recommendations given the wide adoption of, and interest in, tagging in Web 2.0 applications (including wikis) (Ricci et al. 2011; Breslin et al. 2009; Tonkin et al. 2008).

The goal of this work is, therefore, to assess whether the collaborative value of wiki pages improves the performance of tag-based recommendations in assisting users to find relevant information. We make the following contributions:

  • We provide a tag-based recommender system that suggests related and prospective wiki pages for knowledge sharing.

  • We develop a method to automatically estimate the value of wiki pages developed within collaborative spaces. We also provide theoretical and experimental justification for our methods.

  • Finally, we conduct an experimental and comparative analysis to evaluate how much tag-based recommendations can benefit from the collaborative value of wiki pages. Our experiments indicate a reasonable improvement in recommendation performance: we observed about 18 % precision and 15 % recall improvement over the pure tag-based recommendation mechanism.

The content of this paper is organised as follows. Section 2 summarises the related work. Section 3 describes the aspects of collaboration and how the collaborative value of wiki pages is measured. Section 4 explains how the tag-based recommendations are generated and powered by the collaborative effort on wiki pages. Section 5 presents the methodology and data of the evaluation study. Section 6 presents the results obtained from the experiments, limitations and a discussion of the findings. Section 7 presents the interviews with people in industry who assessed the potential application of the model in their companies. Section 8 sets forth the conclusions.

2 Related work

This section overviews a number of related works that serve as foundations for our study. We review studies that investigate knowledge sharing through wiki systems; then we review works that address collaborative work in multi-editing environments, with a main focus on wiki systems. Finally, we overview a number of recommendation techniques geared to support social applications.

2.1 Knowledge sharing in wiki systems

Many studies have investigated the adoption of wikis as a means of knowledge sharing in organisations. Tseng and Huang (2011) examine the content, technical and social values of a wiki system for knowledge sharing and job performance. Hu et al. (2007) study how knowledge sharing supported by a wiki platform within an organisation transforms the knowledge of individuals or groups into the knowledge of the whole organisation. Rhee et al. (2008) introduce a wiki system that supports group communication and knowledge sharing in research projects; the system's goal is to provide users with straightforward access to necessary information. Hester (2008) investigates the underlying circumstances fostering the adoption of wikis to overcome problems of ineffective knowledge sharing in organisations. Holtzblatt et al. (2010) investigate factors impeding wiki use in enterprises.

Although wikis are a key ingredient of organisational knowledge sharing, a number of studies outline technical and social barriers that inhibit wiki adoption, leading to ineffective knowledge sharing. Happel (2009) claims that people are inhibited from sharing information due to the lack of central guidance; there is no systematic orientation on how and what to publish on wikis. To face this issue and support wiki use, they designed Woogle, a concept to improve search with collaboration features (“social search”) and to guide knowledge sharing with the actual information needs of the user community (“need-driven knowledge sharing”). In wikis, information needs are conceived as explicit red links, i.e. a page missing content is an entry point for knowledge sharing. Our recommendations might supply this need by providing missing information for such an information gap; recommendations would work as a complementary information source. In a related work, Dearman et al. (2008) argue that time and effort are among the major barriers to knowledge sharing. It is clear that in an enterprise setting, but also in large communities such as Wikipedia, resources for knowledge creation and sharing are limited. Dearman et al. (2008) also emphasise that people generally like to share knowledge with others, but only if they know that it will be useful for them. Similar research was carried out by Holtzblatt et al. (2010), who explore factors that inhibit the use of wikis as a tool to support the dissemination of knowledge within an enterprise. The key factors observed are extra time to share information, lack of guidance, and sensitivities to the openness of information. These barriers to knowledge sharing also open up a new perspective for using recommendations as a new information source, thereby reducing the need for human resources to provide needed information. Sousa et al. (2010) investigate how the usage of a wiki in an organisational environment may work as a knowledge management tool.
The evaluation results show that users usually access the wiki seeking information but rarely to contribute. Further, if the needed information is unavailable, users are reluctant to fill the gaps. Again, this issue could be attenuated if recommendations were there to supply wiki pages with missing information.

2.2 Collaborative work on wikis

Although wikis have been adopted as a platform for knowledge sharing, their success depends on the collaborative work of contributors. A number of studies investigate diverse aspects of collaborative work on wikis, ranging from methods to quantify collaboration and knowledge transfer to metrics to measure the quality of the pages created.

Kotlarsky and Oshri (2005) study the contribution of social ties and knowledge sharing to successful collaboration in distributed IS development teams on a wiki system. They claim that human-related issues in the form of social ties and knowledge sharing are key to successful collaboration. Our recommendations allow a practical implementation of Kotlarsky and Oshri (2005)'s view: in our model, strong social ties and social commitment on the wiki pages are considered key factors in determining the value of recommendations. Stein and Blaschke (2009) propose quantitative and qualitative measures to assess the quality of corporate wikis. In their use case, corporate wiki pages are evaluated to assess the quality of pages created for technical documentation, issue tracking, and status reports for project management. Wilkinson and Huberman (2007) propose a method for assessing the value of cooperation in Wikipedia. They demonstrate a strong correlation between article quality and number of edits, and claim that topics of particular interest or relevance are thus naturally brought to the forefront of quality. The works of Stein and Blaschke (2009) and Wilkinson and Huberman (2007) converge with our approach since both aim at assessing the value of cooperation in wikis. While our model considers a number of wiki activities such as commenting and tagging, they rely solely on the number of edits.

A crucial element lacking from the previous studies of Stein and Blaschke (2009) and Wilkinson and Huberman (2007) is the assessment of social engagement and reputation analysis of the contributors. Although not restricted to wiki systems, Zhang et al. (2011) defend that social and community intelligence can be exploited by leveraging the digital traces left by groups of people while interacting with web, wireless and wearable applications. In the context of corporate wikis, companies are increasingly investing in social media and online communities to capitalise on this valuable resource of information. Another promising direction is seen in the work of Dondio et al. (2006), which proposes a heuristic method for computing the trustworthiness of wiki articles based on article stability and the collaboration in the article. Our work approximates their study in the sense that we address the collaborative work as the primary aspect of the quality of wiki pages. Similarly, Vickery and Wunsch-Vincent (2007) express a concern with the quality of content produced when contributors are located in a non-professional context, outside of traditional media oversight and often without any monitoring. This can have implications for the “quality” of material being posted and can harm the knowledge sharing process, admitting that the concept of quality is hard to define and has both subjective and contextual aspects. In our work, we share the same concern about the quality of produced content. Although we do not examine the text content itself, we believe that quality wiki pages are the result of intense collaborative work and the individual expertise of contributors.

2.3 Social recommender systems

Social Recommender Systems (SRSs) aim to alleviate information overload for social media users by presenting the most attractive and relevant content. SRSs also aim to increase the adoption, engagement, and participation of new and existing users of social media sites. Recommendations of content (blogs, wikis, etc.) (Guy et al. 2010), tags (Sigurbjörnsson and van Zwol 2008), people (Guy et al. 2009), and communities (Chen et al. 2009) often use personalisation techniques adapted to the needs and interests of the individual user, or a set of users. In the following, we present a number of related works divided into recommender systems for collaborative work and the traditional collaborative filtering technique.

2.3.1 Recommender systems for collaborative work

Related works intended to support collaborative work include Liu and Ke (2007), who introduce a knowledge recommendation module that recommends relevant documents and decision-making/dependency knowledge as knowledge support; the support is based on knowledge patterns discovered from predetermined situation and action profiles. Avancini et al. (2007) present a personalised recommender system for a collaborative Digital Library environment to foster collaborative work among users. A work very similar to our approach is Palau et al. (2004), which investigates collaboration in social networks in order to improve the performance of recommendations. The difference from our work is that collaboration is represented solely in terms of social interaction between ties; we instead consider the user expertise and the amount of collaboration on the wiki pages in addition to the social interaction among contributors. Sen et al. (2006) present a model for tagging evolution based on community influence and personal tendency; it shows how four different options to display tags affect users' tagging behaviour. Zanardi and Capra (2008) propose Social Ranking, a method that exploits recommender system techniques to increase the efficiency of searches within social tagging spaces. They propose a mechanism that ranks (and recommends) content based on the inferred semantic distance of the user query to the tags associated with such content, weighted by the similarity of the querying user to the users who created those tags. Similarly to our work, tags are utilised as the engine for generating recommendations aimed at supporting collaborative systems. The difference from our work is that none of them addresses the collaborative value of the recommended items. Our model, in contrast, understands that tag-based recommendations can be improved by this collaborative value, accounted as a collection of collaborative efforts, and not simply by the tagging activity.

2.3.2 Collaborative filtering techniques

Collaborative Filtering (CF) is one of the most successful technologies for recommender systems. It has gained expressive interest with the emergence of e-commerce applications on the Web that recommend products based on past customers' opinions. The underlying assumption of the CF technique is that those who agreed in the past tend to agree again in the future (Ricci et al. 2011).

The tag-based recommender model of this paper is not based on the CF technique, but on the content-based one. In general, our recommendations rely on the similarity of tags weighed by a collaboration factor (see Sect. 4). Although CF and content-based techniques are conceptually distinct, it is important to understand and review CF techniques, since the similarity techniques for computing recommendations are based on collaborative users' opinions. In order to recommend, CF systems need to relate two different entities: users and items in a user-item matrix. Usually, this correlation is given in terms of explicit ratings, which represent a user's opinion on a given item or product. For instance, if a user rated a movie 5 in a 5-star rating schema, this means he liked the movie; on the other hand, a rating of 1 means that he disliked it. Deshpande and Karypis (2004) present a class of model-based top-N recommendation algorithms that use item-to-item or itemset-to-item similarities to compute the recommendations. The difference from this study is that we utilise a cosine similarity weighed by the collaborative value of pages to find similarities, while they rely on a conditional probability-based item similarity scheme and higher-order item-based models. In line with Deshpande and Karypis (2004), Linden et al. (2003) propose the item-to-item collaborative filtering algorithm, which, rather than matching the user to similar customers, matches each of the user's purchased and rated items to similar items, then combines those similar items into a recommendation list. To determine the most similar match for a given item, the algorithm builds a similar-items table by finding items that customers tend to purchase together.
Similarly to our work, their algorithm computes the similarity between two items using the cosine measure, in which each vector corresponds to an item rather than a customer, and the vector's M dimensions correspond to customers who have purchased that item. The difference lies in the fact that no collaboration activity other than rating is considered for weighing the recommendations. A comparative study is presented by Sarwar et al. (2001), who look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarity between item vectors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Item-based techniques first analyse the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users. In their work, the user-item matrix is formed based on rating scores; in our work, the tags assigned to pages where collaboration took place are the engine to determine a user's interest in a given page.

3 The collaborative value of wiki pages: the WPCV model

In this section, we demonstrate how the collaborative value of wiki pages is calculated. First, we present the wiki model used in the context of this work; secondly, we discuss and formally describe the collaboration model we rely on; and finally, we present two essential pillars of the model: user expertise and social interaction between social ties.

3.1 The wiki model

This section introduces the wiki model used to assess our proposal. Besides traditional wiki features, such as easy creation and editing of interlinked web pages via a web browser using a simplified markup language, we propose the support of other social activities such as tagging, commenting, rating and social networking.

3.2 Collaboration model

Our collaboration model is much in line with the definition of Beyerlein et al. (2002): “the collaborative work of two or more individuals where the work is undertaken with a sense of shared purpose and direction”. Inspired by this definition, we outline some characteristics envisaged for the collaboration model:

  • Collaboration is goal-oriented; thus, the reason for working together is to achieve something.

  • Collaboration requires a group of individuals working together; no one can collaborate alone.

  • The group can be of any size, may be geographically dispersed, not necessarily friends or known to each other.

  • Collaboration is coordinated, i.e. no one is supposed to work in a lawless rebellious manner. The coordination may follow a formal methodology, or can be equally implicit and informal.

In conformance with the context of this study, we subdivide the collaborative work into a set of activities including editing, rating, tagging and commenting. We understand collaboration as follows: if two persons edit the same page, they are “collaborating” in some way. If one assigns a tag to a wiki page created by the other, collaboration is also taking place; in fact, one is providing means to classify or organise the other's page. The model does not exclude the addition of other collaborative activities, though.

Provided such a conceptualisation, we formally describe the collaborative work W as a tuple \(W = \{U, P, A, C\}\), where \(U = \{u_{1},\ldots,u_{k}\}\), \(P = \{p_{1},\ldots,p_{m}\}\), and \(A = \{a_{1},\ldots,a_{n}\}\) correspond to users, pages, and activities (including commenting, editing and tagging), respectively. A collaboration is an element of the set C, where \(C \subseteq U \times P \times A\). To understand the collaborations of a single user, we concentrate on the activities and pages that are associated with this particular user. We then define the set of individual collaborations as \(I_{u} = (C_{u}, P_{u}, A_{u})\), where \(C_{u}\) is the set of collaborations of the user: \(C_{u} = \{(a,p) \mid (u,a,p) \in C\}\); \(A_{u}\) is the user's set of collaborative activities: \(A_{u} = \{a \mid (a,p) \in C_{u}\}\); and \(P_{u}\) is the set of pages: \(P_{u} = \{p \mid (a,p) \in C_{u}\}\). More specifically, the collaborations of user u on the pages created by user w are denoted by \(C_{u,w} = \{(a,p) \mid (w,a,p) \in C, w \in U\}\), and the collaborations of a user u on a particular page p are denoted by \(C_{u,p} = \{(a,p) \mid (u,a,p) \in C, p \in P_{u}\}\). These notations will be utilised to describe the recommendation model.
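The set definitions above can be sketched directly with Python sets. The sample users, pages, activities, and the `creators` map below are illustrative assumptions, not data from the paper:

```python
# A minimal sketch of the collaboration model. Collaborations are triples
# (user, activity, page); the creator of each page is kept in a separate map.
# All concrete data (users, pages, creators) is illustrative.

C = {
    ("alice", "edit", "p1"),
    ("alice", "tag", "p2"),
    ("bob", "comment", "p1"),
    ("bob", "edit", "p2"),
}
creators = {"p1": "bob", "p2": "alice"}  # who created each page

def collaborations_of(u):
    """C_u: the (activity, page) pairs of user u."""
    return {(a, p) for (user, a, p) in C if user == u}

def activities_of(u):
    """A_u: the activities user u has performed."""
    return {a for (a, p) in collaborations_of(u)}

def pages_of(u):
    """P_u: the pages user u has collaborated on."""
    return {p for (a, p) in collaborations_of(u)}

def collaborations_on_pages_of(u, w):
    """C_{u,w}: collaborations of u on pages created by w."""
    return {(a, p) for (a, p) in collaborations_of(u) if creators[p] == w}

def collaborations_on_page(u, p):
    """C_{u,p}: collaborations of u on a particular page p."""
    return {(a, q) for (a, q) in collaborations_of(u) if q == p}
```

With this representation, the cardinalities \(|C_{u,w}|\) and \(|C_{u,p}|\) used later in Eqs. (2) and (4) are simply the sizes of the returned sets.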

3.3 User expertise

In addition to the collaborative work, we believe that the value of wiki pages may vary according to the expertise of individuals involved in the overall content building. In this sense, we incorporate the user expertise as a weighting factor in our model.

In the real world, individual knowledge is built up from life experiences and academic education. Likewise, in collaborative environments, individual knowledge can be derived from one's contributions and participation within the community. All produced material and consumed information can be used as an indicator of user expertise. Based on this premise, we envisage a model to infer a user's expertise from the pages that demonstrate user activity. Unlike approaches that rely on self-evaluation, we define the set of user expertise E as the most frequent terms that appear in the pages from \(P_{u}\). For calculating E, we apply the tf–idf metric (term frequency–inverse document frequency) (Baeza-Yates and Ribeiro-Neto 1999). The importance of each expertise \(e \in E\) is proportional to the term frequency and inversely proportional to the document frequency in the corpus. The calculation of user expertise is defined as:

$$ UE(u) = \sum\limits_{p \in P_{u}}\sum\limits_{e \in E} tf(e,p) \times idf(e), $$
(1)

where \(P_{u}\) is the set of pages created, edited, tagged, rated or commented on by a given user u, p is a particular page, e is a particular expertise (or term), and E is the set of expertise (top terms). As a limitation, the present model does not weigh the activities differently. We believe that one activity can be more important than another depending on the value it aggregates to the wiki page. For instance, a tag that categorises a page can be more valuable than a couple of lines added to the text.
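Equation (1) can be sketched as follows. The toy corpus, the whitespace tokenisation, and the choice of the top three most frequent terms as the expertise set E are illustrative assumptions:

```python
import math
from collections import Counter

# Toy corpus mapping page ids to their text (illustrative data).
corpus = {
    "p1": "wiki wiki collaboration knowledge sharing",
    "p2": "tag recommendation wiki",
    "p3": "testing bugs defects",
}

def idf(term):
    """Inverse document frequency of a term over the whole corpus."""
    df = sum(1 for text in corpus.values() if term in text.split())
    return math.log(len(corpus) / df) if df else 0.0

def tf(term, page):
    """Relative term frequency of a term in one page."""
    words = corpus[page].split()
    return words.count(term) / len(words)

def user_expertise(user_pages, top_k=3):
    """UE(u), Eq. (1): summed tf-idf of the user's top-k terms over P_u."""
    counts = Counter(w for p in user_pages for w in corpus[p].split())
    E = [term for term, _ in counts.most_common(top_k)]
    return sum(tf(e, p) * idf(e) for p in user_pages for e in E)
```

A user active on more (and richer) pages accumulates a higher expertise score, since the double sum ranges over both the expertise terms and the user's pages.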

3.4 Interaction between social ties

Besides their own expertise, we believe that individuals may inherit knowledge from their social ties. This premise is motivated by Striukova and Rayna (2007), who claim that strong ties created between members of a virtual team are essential for the community's knowledge sharing. Nevertheless, the intensity of knowledge sharing/inheritance is determined by the degree of interaction with others in the networked community. Every time individuals interact with each other, they contribute towards building highly skilled relationships, besides improving the overall knowledge sharing within the community (Nazir et al. 2008). Motivated also by works on this specific subject by Fiore et al. (2002) and Adiele and Penner (2006), we weigh the inheritance of knowledge considering the interaction between individuals as:

$$ IT(u,w) = \frac{|{C_{u,w}|}} {|{A}| \cdot |{P_{w}}|}, $$
(2)

where \(|C_{u,w}|\) is the total amount of collaborations performed by user \(u \in U\) on the pages created by user \(w \in U\), \(|A|\) corresponds to the total amount of possible collaborative activities in the system, and \(|P_{w}|\) corresponds to the amount of pages created by user w.
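Equation (2) is a simple ratio; a minimal sketch taking the three counts as plain inputs (the function name and the example values in the test are illustrative):

```python
# Sketch of Eq. (2): interaction strength IT(u, w) between users u and w.

def interaction(n_collabs_u_on_w, n_activity_types, n_pages_of_w):
    """IT(u,w) = |C_{u,w}| / (|A| * |P_w|).

    n_collabs_u_on_w : |C_{u,w}|, collaborations of u on w's pages
    n_activity_types : |A|, number of possible activity types in the system
    n_pages_of_w     : |P_w|, number of pages created by w
    """
    return n_collabs_u_on_w / (n_activity_types * n_pages_of_w)
```

For instance, with 3 collaborations, 4 activity types (edit, tag, rate, comment) and 5 pages created by w, the interaction score is 3 / 20 = 0.15.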

Because of social interactions, we redefine the user expertise UE(u) as the user knowledge UK(u), which combines the individual expertise with the knowledge inherited from the user's social ties. We say that a user u has established a social tie with user w if one contributes to the other's page. The user knowledge UK(u) is defined as follows:

$$ UK(u) = UE(u) \cdot \frac{\sum_{s \in S_u} UE(s) \cdot IT(u,s)} {|S_u|}, $$
(3)

where \(S_{u} \subset U\) represents the social ties of user u, \(|S_{u}|\) is the total amount of social ties of user u, s is a particular social tie of user u, and U is the set of users.
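Equation (3) can be sketched as below. The fallback for users without social ties is our assumption, since the average in Eq. (3) is undefined when \(|S_u| = 0\); the function and parameter names are illustrative:

```python
# Sketch of Eq. (3): user knowledge UK(u) as own expertise scaled by the
# average interaction-weighted expertise of the user's social ties.
# UE values and IT scores are assumed to be precomputed.

def user_knowledge(ue_u, tie_expertise, tie_interaction):
    """UK(u) = UE(u) * sum_{s in S_u} UE(s) * IT(u,s) / |S_u|.

    ue_u            : UE(u), the user's own expertise
    tie_expertise   : maps each social tie s to UE(s)
    tie_interaction : maps each social tie s to IT(u, s)
    """
    if not tie_expertise:
        return ue_u  # assumption: a user without ties keeps own expertise
    inherited = sum(ue * tie_interaction[s] for s, ue in tie_expertise.items())
    return ue_u * inherited / len(tie_expertise)
```

For example, a user with UE(u) = 2.0 and a single tie with UE(s) = 1.0 and IT(u, s) = 0.5 obtains UK(u) = 2.0 · (1.0 · 0.5) / 1 = 1.0.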

3.5 The WPCV model

The value of a wiki page built under collaborative work must contemplate a number of user interactions such as editing, tagging, rating and commenting. Thus, we define the Wiki Page Collaborative Value or WPCV(p) of a page as a summation of contributions to that page weighed by individual knowledge.

$$ WPCV(p) = \sum\limits_{u \in U} {|{C_{u,p}}|} \cdot UK(u), $$
(4)

where \(|{C_{u,p}}|\) is the total amount of collaborative activities performed by a user u on a particular page p, weighed by the user knowledge UK(u). The set of users is represented by U, and P represents the set of existing pages in the system. The value obtained with the function WPCV(p) is utilised as an additional factor in our recommendation model to privilege the pages with more collaboration.
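Equation (4) then reduces to a weighted sum over the page's contributors; a minimal sketch with illustrative inputs:

```python
# Sketch of Eq. (4): a page's collaborative value as the sum, over users,
# of each user's number of collaborations on the page weighed by UK(u).

def wpcv(collab_counts, uk):
    """WPCV(p) = sum_{u in U} |C_{u,p}| * UK(u).

    collab_counts : maps user -> |C_{u,p}| for this page
    uk            : maps user -> UK(u)
    """
    return sum(n * uk[u] for u, n in collab_counts.items())
```

For example, a page with two collaborations from a user with UK = 1.5 and one collaboration from a user with UK = 3.0 obtains WPCV = 2 · 1.5 + 1 · 3.0 = 6.0.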

4 Tag-based recommendation for knowledge sharing

As previously introduced, recommendations play a very important role by supplying complementary information to pages where information is poor or scarce. We therefore explore tagging activity as a means of finding related pages, based on the similarity of tags, to provide alternative sources of assistance. In order to promote the most promising pages, we weigh and rank the related pages by the collaborative commitment they carry. This social commitment is calculated using the model described in the previous section. In this section, we demonstrate how the tag-based recommendations are calculated and biased by the collaborative work. First, we introduce the tagging formalism necessary to describe the recommendation model.

4.1 Tagging activity and notations

Tagging activity is the engine of our recommendations. The usage of tagging is motivated not only by its ability to organise collections of content, but also by its increase of the search capabilities in collaborative environments (Smith 2007). Our recommendations are conceived from a tagging scenario Y, formally described as a four-tuple \(Y = \{U, P, T, N\}\), where there exist a set of users, U; a set of pages, P; a set of tags, T; and a set of annotations, N. The annotations N are represented as a set of triples containing a user, tag and page, defined as \(N \subseteq \{\langle u, p, t \rangle : u \in U, p \in P, t \in T\}\). More specifically, \(N_{u}\) represents the set of tag annotations of the user: \(N_{u} = \{(t,p) \mid (u,t,p) \in N\}\); \(T_{u}\) is the user's set of tags: \(T_{u} = \{t \mid (t,p) \in N_{u}\}\); and \(P_{u}\) is the user's set of pages: \(P_{u} = \{p \mid (t,p) \in N_{u}\}\).

4.2 Tag-based recommendation model

In order to select similar pages that address related problems, we apply the cosine similarity over tags plus additional factors, as described below. This tag-based model is based on a previous work of Durao and Dolog (2009b); the difference from the present study is that in the previous work no collaboration model was envisaged or implemented. For a good understanding of how the recommendations are generated, we find it necessary to introduce the previous tag-based recommendation model.

In the process of finding similar pages, we measure the degree of similarity between two pages, x and y, by computing the cosine of the angle between their corresponding tag vectors \(\overrightarrow{x}\) and \(\overrightarrow{y}\), which is defined as \(cosSim(\overrightarrow{x},\overrightarrow{y}) = \frac{\overrightarrow{x} \cdot \overrightarrow{y}}{|\overrightarrow{x}||\overrightarrow{y}|}\) (Baeza-Yates and Ribeiro-Neto 1999). Further, we extend the model with external factors detailed as follows:

  • Tag Popularity—pop(t). It indicates how often a tag \(t \in T\) is assigned to the pages in the system. We calculate pop(t) as \(\frac{n_{t}} {|P|}\), where \(n_{t}\) is the amount of occurrences of a tag \(t \in T\) and |P| is the amount of existing pages in the system.

  • Tag Representativeness—rep(t, p). It indicates how important a tag \(t \in T\) is to a page \(p \in P\), based on the amount of occurrences of t as a term (or word) in p. For calculating rep(t, p), we utilise the term frequency metric (Baeza-Yates and Ribeiro-Neto 1999), with the belief that the tags that most often appear as terms in a page can better describe it.

  • Affinity between User and Tag—aut(u, t). It measures how important a tag \(t \in T_{u}\) is for a user \(u \in U\). We calculate aut(u, t) as \(\frac{n_{t}} {|T_{u}|}\), where \(n_{t}\) is the amount of occurrences of a tag \(t \in T_{u}\) and \(|T_{u}|\) is the amount of tags of a user \(u \in U\).
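The cosine similarity used to compare tag vectors can be sketched as below; representing a page as a tag-to-count dictionary is an illustrative assumption (binary tag vectors would work the same way):

```python
import math

# Cosine similarity of two pages' tag vectors. Each page is represented
# as a dictionary mapping tag -> occurrence count (illustrative choice).

def cosine_sim(x, y):
    """cosSim(x, y) = (x . y) / (|x| * |y|); 0.0 if either vector is empty."""
    dot = sum(x[t] * y.get(t, 0) for t in x)
    nx = math.sqrt(sum(v * v for v in x.values()))
    ny = math.sqrt(sum(v * v for v in y.values()))
    return dot / (nx * ny) if nx and ny else 0.0
```

Two pages with identical tag vectors score 1.0, while pages sharing no tags score 0.0.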

The external factor function that biases the recommendations is defined as:

$$ extFact(p,u) = \sum\limits_{t \in T_{p}} pop(t) \cdot rep(t,p) \cdot aut(u,t), $$
(5)

where \(T_{p}\) is the set of tags assigned to a page \(p \in P\). Given this definition, we describe how the recommendations are generated. It is worth mentioning that this tag-based recommendation model suggests pages based solely on the user's currently visited page; the recommendations are therefore expected to match the content domain addressed in the visited page.
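The three external factors and the combined function of Eq. (5) can be sketched as follows. The function names mirror the paper's notation, while the data representation (annotation triples and raw page text) is an assumption of ours:

```python
def pop(t, N, P):
    """Tag popularity: occurrences of tag t divided by the number of pages."""
    n_t = sum(1 for (u, tag, p) in N if tag == t)
    return n_t / len(P)

def rep(t, page_text):
    """Tag representativeness: term frequency of t among the words of the page."""
    words = page_text.lower().split()
    return words.count(t.lower()) / len(words) if words else 0.0

def aut(u, t, N):
    """Affinity between user and tag: share of u's annotations using tag t."""
    user_tags = [tag for (user, tag, p) in N if user == u]
    return user_tags.count(t) / len(user_tags) if user_tags else 0.0

def ext_fact(p, u, N, P, page_text):
    """extFact(p, u): sum of pop * rep * aut over the tags of page p."""
    tags_p = {tag for (user, tag, page) in N if page == p}
    return sum(pop(t, N, P) * rep(t, page_text) * aut(u, t, N) for t in tags_p)
```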

  1.

    A user \(u \in U\), whose preferences are denoted by \(T_{u}\), visits a page \(p \in P\) with an assigned tag vector \(\overrightarrow{T_{p}}=\{t_{1},\ldots,t_{y}\}\). For each remaining page \(\overline{p} \in \overline{P} = P \setminus \{p\}\), we calculate the cosine similarity between the corresponding tag vectors \(\overrightarrow{T_{\overline{p}}}\) and \(\overrightarrow{T_{p}}\).

  2.

    For each page \(\overline{p} \in \overline{P}\), we weight the cosine similarity with the external factor function extFact(p, u) defined previously. The outcome is a ranked list of pages from \(\overline{P}\) to be recommended. The list follows the ordering \( \tau = [p_{1},p_{2},\ldots,p_{k}] \), where \(p_{i} \in \overline{P}\) and the ordering relation is defined by \(p_{i} \geq p_{j} \Leftrightarrow cosSim(p_{i},p) \cdot extFact(p_{i},u) \geq cosSim(p_{j},p) \cdot extFact(p_{j},u); \)

  3.

    A final ranking \(\tau^{\prime}\) of pages \(\overline{P}^{\prime}\) is produced by multiplying the current ranking score of each page \(p \in \overline{P}\) by the value obtained from the WPCV(p) function defined in Sect. 3.5;

  4.

    The final ranked list of pages \(\overline{P}^{\prime} \subset \overline{P}\) constitutes the recommendations for the user \(u \in U\).
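The four steps above can be condensed into a single scoring-and-sorting routine. Here `cos_sim`, `ext_fact` and `wpcv` stand for the scoring functions of the model and are passed in as parameters; this is a sketch under our own assumptions, not the KiWi implementation:

```python
def recommend(current_page, user, pages, cos_sim, ext_fact, wpcv, k=5):
    """Score every page other than the visited one by
    cosSim * extFact * WPCV and return the top-k as recommendations."""
    candidates = [p for p in pages if p != current_page]
    scored = [(cos_sim(p, current_page) * ext_fact(p, user) * wpcv(p), p)
              for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best score first
    return [p for (_, p) in scored[:k]]
```

Because the three factors are multiplied, a page with a zero collaborative value (WPCV) drops to the bottom of the ranking regardless of its tag similarity.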

Although not considered in this work, a complementary study of the complexity of the tag-based recommender model is discussed in Durao and Dolog (2009a). It analyses how the model scales for cases involving a large number of users, tags and wiki pages. That study does not consider the social factor, but it is not difficult to predict that the performance of the current model may degrade as the amount of collaboration grows: more collaboration increases the inferred user expertise and requires more time to elicit the user's preferences.

5 Experimental evaluation

The motivation of this paper focuses on corporate wikis. Although they are fundamentally deployed in an organisational context, their major objective goes beyond enterprise borders: to enhance knowledge sharing. With this as the focus of the evaluation, and due to practical difficulties in running the experiment in an enterprise, our approach was not evaluated in an organisation with real employees using a wiki as part of their normal activities. Such a scenario is difficult to realise because personnel are usually involved in pre-scheduled tasks, and companies resist allowing a considerable number of people to interrupt their regular activities to take part in an experiment. Perhaps a small group of employees (up to 5 or 10) would be available to participate, but this number would not allow us to claim significant results. In addition, a minimal training session would be necessary to teach the employees how to operate our test wiki and how to proceed in the experiment, which would require additional hours from them. To lessen this gap, we sought answers from ten professionals in software companies that utilise corporate wikis in their daily activity. Section 7 therefore presents questions and answers about the usefulness of our recommendation model from the enterprise point of view.

In the face of the difficulty of conducting the experiments in an enterprise scenario, we decided to simulate the envisaged scenario in a laboratory with a significant number of users (63) performing tasks with fixed goals (see details in Sect. 5.1). This sense of “commitment to the tasks” was crucial to reproduce the conditions of an organisational environment. The tasks were necessary to trigger recommendations, enforce knowledge sharing and provide inputs for assessing the approach. In this context, we could demonstrate the applicability of the approach to other audiences as well as analyse the major goal of this work: to assess whether the collaboration value within wiki pages improves the quality of tag-based recommendations in assisting users to find the needed information.

5.1 Experiment setup and methodology

Our approach was evaluated in a wiki system called KiWi (Knowledge in a Wiki), a semantic wiki platform for knowledge sharing and content management (Schaffert et al. 2009). Besides supporting a wiki's primary features, such as creating and editing pages, the platform supports social tagging and collaborative rating, and allows users to comment on any existing page. Additionally, KiWi supports social networking, where users can establish social ties and form groups for specific purposes. In KiWi, tag-based recommendations play a very important role by revealing extra content in addition to the currently viewed page. Further, the recommendations support users in making a choice from a large number of possible alternatives, since they are ranked according to their degree of similarity with the currently visited page. The accuracy of recommendations is essential to reduce information overload and enable better quality decisions. In the context of this work, the decision refers to which recommendations a user should follow to solve a task. In order to assess the eventual improvements of the approach, we observed and compared the use of the recommendations powered by the collaborative value of wiki pages against the pure tag-based recommendations.

In particular, we challenged the participants to carry out individual tasks: to fill in incomplete wiki pages by collecting information placed in other pages of the system. This encouraged the participants to transfer knowledge along the pages in the system. To do so, they were also required to navigate through the pages using our recommendations to find the needed information. While a participant visited a page searching for a solution, two sets of recommendations were generated: one powered by the collaborative value, the other purely based on tag similarities. The two sets, however, were not distinguished, so as not to bias the final results. Once the users found a helpful page, they were free to edit, rate, tag, comment or increase their social ties as much as they liked. Although these activities were not compulsory, we motivated the participants by introducing the overall benefits of knowledge sharing in collaborative environments. This introduction took place before their participation in the experiment. Despite this introduction, the participants did not have access to any signal of the value of their contributions to the wiki pages.

In principle, we expected that everyone would play according to the instructions. Nevertheless, we made some minimal efforts towards robustness: one participant cannot remove another's tags; a tag can be assigned at most twice to the same page by the same user; and tags are limited to 40 characters, to prevent users from writing long sentences in the tagging space. Besides these automatic controls, participation was monitored to avoid malicious behaviours (e.g., copy and paste instead of editing) and to keep a minimum sense of coordination. The objective of this preparation was to remain aligned with the requirements proposed in the collaboration model in Sect. 3.

5.2 Task scenario

For the evaluation, KiWi was initially populated with pages whose tags were extracted either from the page content or from the keywords meta-information embedded in the HTML code. These tags were not considered in the statistics of the experiment but were necessary to initiate it, so that the participants could run the tasks. Instead of filling KiWi with pages reporting corporate assets such as work plans, task assignments or personnel allocation, we filled it with content (news) imported from communication channels such as the BBC and CNN. For that, KiWi has an importer feature that automatically gathers RSS content and generates pages in the system. Subsequently, we edited the pages to create the “missing parts” and allocated these parts to other pages in the system.

The task scenario was carefully designed so that a chain of dependency between the pages was created. This assembly allowed us to determine which pages should be recommended and which pages would move users away from a possible solution. In line with the motivation of this paper, a solution is successfully achieved only if information from one page is placed in another, i.e. knowledge is transferred.

Figure 1 illustrates the task scenario and the dependency between pages. As seen in the sketch, Page 1 depends on Page 2 because the latter provides the missing information for Page 1: “11 june and 11 july”. In turn, Page 2 depends on the information “August 2007” provided by Page 3, which depends on the information “Oceania Football Confederation” found in Page 4.

Fig. 1 The missing parts are represented by blank spaces and the solutions are underlined

An excerpt of the recommendations in the KiWi system can be seen in Fig. 2. In detail, the page title (a) is followed by some metadata including the date of creation (b), the last update and the last contributor to the page (user44). Just beside it (c), the option “Add as friend” allows the viewer to make new friendships. Below (d), the tag set assigned to the page is shown, followed by the average ratings (e). Next comes the page content (f), containing a missing part (g) that can be filled in by clicking the option “Edit” (h) exposed in the drop-down menu. At the bottom of the page is the comment area (i). In this example, another user (user34) is asking for additional information on that page that he will likely need. The two sets of undistinguished recommendations are located to the right of the page panel (j). Each recommendation is a link to another page containing a possible solution for the current page. Importantly, although one set appears above the other in the UI, we made it very clear to the participants that the most promising link is not always the top one. This reminder was motivated by user interface studies which show that links appearing higher up on a page are much more likely to be seen, read, and clicked (Kalbach and Bosenick 2003).

Fig. 2 Recommendations in the KiWi system

5.3 Participation overview

In total, 63 participants, mostly students from the Computer Science Department at Aalborg University in Denmark, took part in the experiment during December 2009. Before starting the activities, they were introduced to the experiment guidelines as well as the individual tasks. During one month of work, 830 tags (avg. 13.1 tags/user) were assigned to 201 pages and 146 social ties were established. In total, 1,465 ratings were collected across the whole set of pages and, on average, one page was rated by approximately eight participants. This number coincides with the mean number of pages visited before a participant concluded his task. In addition, 93 % of the pages were edited, 91 % were rated, 88 % were tagged and 43 % received comments. Two or more activities were performed on 83 % of the pages and all tasks were successfully completed.

The overall participation and commitment to the experiment are summarised in the subfigures of Fig. 3. Figure 3a describes the distribution of pages visited by the participants and the overall tagging activity. The first pair of bars (1st case, from left to right) indicates that five pages were assigned only one tag on seven occasions. Figure 3b describes the distribution of commenting, editing and rating activities. The first case indicates that only one page received ten comments, one edit and no ratings. Figure 3c presents the distribution of inferred expertise per user. The first case indicates that there were only two users with ten kinds of expertise. Figure 3d shows the evolution of social participation along the experiment. Figure 3e correlates the interaction between users with the evolution of the overall collaborative value along the experiment. Finally, Fig. 3f demonstrates the evolution of task accomplishment during the whole experiment.

Fig. 3 Social commitment with the experiment

5.4 Analysis of social factors

Before showing the results obtained from the experiment, we assessed the social factors of our model: the user expertise and interaction between social ties. This assessment is crucial for the credibility of the results obtained in the experiment. In the following, we present the user expertise evaluation followed by the analysis of interaction between social ties.

5.4.1 User expertise

In order to evaluate whether user expertise was accurately computed, we invited each participant (after his participation) to judge whether the list of terms collected correctly described his expertise. In addition, we provided another list of terms that appeared below the stipulated threshold and were therefore not considered part of the user expertise list. Each term was assessed individually. Terms contained in the set of stop words were not sent for evaluation. The results are expressed in a confusion matrix with four categories: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). A false positive occurs when an expertise is incorrectly assigned to someone who does not in fact have it. A false negative occurs when an expertise is incorrectly predicted as absent. True positives and true negatives are correct predictions. The result of the assessment is presented in Table 1.

Table 1 Percentage mean of true positive, true negative, false positive and false negative

According to Table 1, the overall results were satisfactory: an acceptance of 71 % shows that most user expertise was correctly inferred, although improvements on the incorrectly inferred expertise are necessary. In order to improve the quality of user expertise inference, we aim to consult an ontology of user skill vocabulary to validate the inferred terms. This, however, is planned as future work.

5.4.2 Interaction between social ties

The correlation between the participants' interaction level (PIL) and the amount of expertise inferred (EI) was also analysed. The expertise inferred refers to all knowledge acquired without collaboration. The straightforward model that defines EI is:

$$ EI(u) = UK(u) - UE(u). $$
(6)

For this analysis, we applied the Pearson product-moment correlation coefficient (PMCC). Thus, we could measure the linear dependence between the two variables PIL and EI, giving a value between +1 and −1 inclusive. A result close to +1 indicates a strong positive correlation between the variables, a result close to −1 indicates a strong negative correlation, and values near 0 indicate a weak correlation (Cohen and Cohen 1975).

For each participant, we calculated the PMCC coefficient by comparing the total amount of expertise against the mean interaction level. As shown in Fig. 4, the mean of all PMCC calculations was 0.72133, which indicates a strong correlation between both factors. In other words, this correlation signifies that the most active participants were the ones credited with new kinds of expertise. In fact, some terms were ranked higher so that they became part of the user expertise list. This outcome also lends credibility to the final results presented in the next section.
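The PMCC used in this analysis can be reproduced in a few lines of Python; the sample PIL/EI values below are hypothetical:

```python
from math import sqrt

def pmcc(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical sample: interaction level (PIL) vs. expertise inferred (EI).
pil = [1, 2, 3, 4, 5]
ei = [2, 4, 5, 8, 10]
```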

Fig. 4 Correlation between the participant's interaction level and the amount of expertise inferred

6 Experimental results

In order to ascertain whether our collaborative model assisted users in finding the needed information, we performed a subjective and a quantitative evaluation of the experiment. The subjective assessment measured the user satisfaction expressed by the ratings left on the recommended pages; the quantitative assessment analysed the number of recommendations needed until a task was finalised. Furthermore, we measured the performance of the recommendations, in terms of precision and recall, in leading users to helpful pages.

6.1 Subjective assessment

The ratings left on the visited pages indicate the quality of the recommendations. Since the participants were engaged in fulfilling their tasks, we interpreted pages with high ratings as the most helpful. As a result, the recommendations powered by the collaborative value of wiki pages outperformed the pure tag-based recommendations. According to Table 2, 40 % of the recommendations were rated with the highest ratings (5–6) whereas 24 % were rated with the lowest values (1–2). This advantage was also evidenced when we focused specifically on ratings greater than or equal to 4 stars (66 against 47 %) and on ratings less than or equal to 3. We also performed a t-test comparing the average number of ratings for both sets in each range of ratings. This helps to gauge the magnitude of the difference between both sets and to eliminate any lingering doubt due to randomness. The last row of Table 2 shows that the difference between both sets is statistically significant (p < 0.05).

Table 2 Recommendation ratings (# stars) and P-values from a T-test between both sets

This subjective assessment demonstrates that the participants were more satisfied with the set of recommendations powered by our model than with the pure tag-based set. This result reinforces that the visited pages were more informative and contained important and helpful information.

6.2 Quantitative assessment

The high level of participant satisfaction with the set of recommendations powered by our model is corroborated by the number of pages visited before a task was solved. As seen in Table 3, 19 % of users who arbitrarily followed the recommendations powered by the collaborative model (WPCV) solved the tasks by visiting at most 7 pages; 66 % of users who utilised both sets visited between 8 and 10 pages; while 15 % of users who followed the pure tag-based (PTB) set visited between 11 and 15 pages. We validated this outcome by measuring the precision and recall of each visited page. Precision expresses the fraction of retrieved recommendations that are relevant to a given page, whereas recall expresses the fraction of the relevant recommendations that are retrieved. We calculated precision and recall respectively as:

$$ prec(p) = \frac{|R_{p} \cap R_{p}^{\prime}|} {|R_{p}|} $$
(7)
$$ rec(p) = \frac{|R_{p} \cap R_{p}^{\prime}|} {|R_{p}^{\prime}|} $$
(8)

where \(|R_{p}|\) is the number of retrieved recommendations for a page p and \(|R_{p}^{\prime}|\) is the number of relevant recommendations for the same page. The relevant recommendations for a page p are those items which contain the missing information for page p. Additionally, we calculated the f-measure, the weighted harmonic mean of precision and recall, defined as:

$$ f\text{-}m(p) = \frac{2 \times prec(p)\times rec(p)} {prec(p) + rec(p)} $$
(9)
Table 3 Recommendation usage
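Equations (7)–(9) can be sketched as a single helper that takes the retrieved and the relevant recommendation sets for a page (the function name is ours):

```python
def prf(retrieved, relevant):
    """Precision, recall and F-measure of a recommendation set for one page."""
    hits = len(set(retrieved) & set(relevant))  # |R_p ∩ R'_p|
    prec = hits / len(retrieved) if retrieved else 0.0
    rec = hits / len(relevant) if relevant else 0.0
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f
```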

As with the subjective evaluation, we obtained positive results in the quantitative assessment. Table 4 shows the advantage of the recommendations powered by the collaborative model (WPCV) over the set of pure tag-based recommendations. As seen, precision, recall and f-measure improved at rates of 11, 7 and 12 % respectively.

Table 4 Precision, recall and f-measure rates with their respective standard deviation values (SD)

Figure 5 shows the ascending f-measure rates for both recommendation sets over the course of the experiment. As observed, the collaborative model outperforms the tag-based one during the entire experiment. An interesting observation from Fig. 5 is that the difference between the performance of both approaches diminishes as participation grows. This descending behaviour can be observed in Fig. 6, which shows the difference in performance between both approaches. In particular, from the 40th participation onwards, both recommendation models perform nearly equally; from this point, the difference reaches values no higher than 0.05. This scenario bears two interpretations: (1) at this point, most of the pages had been filled out with the missing piece of information; and (2) the tagging activity had increased considerably, so that the tag-based recommendations performed as well as the collaborative-based ones. This opens up a new line of research to determine the optimal applicability of the collaborative model (WPCV). The current results, however, indicate clearly that for pages that are rarely visited, the proposed model can be particularly helpful. Such research is beyond the scope of this work, as it would require the compilation of several observations.

Fig. 5 F-measure rates for both sets of recommendation

Fig. 6 Difference between F-measure performances of both sets of recommendation

6.3 Limitations and improvements

Although the demonstrated results seem promising, the model suffers from some limitations, and improvements are being envisaged. The first, and likely the most critical, concerns the real value of the contributions. At the present stage, it is still impossible to differentiate whether a contribution left on a page is positive. This means that anyone can add disruptive or nonsensical information that does not contribute at all to solving a targeted task. As an immediate negative effect, recommendations intended to be supportive become useless and disappointing to users. This problem, however, is partially alleviated once the recommended pages can be rated based on the quality of their content. Nevertheless, the best option would be to detect the problem at its origin.

Another limitation concerns the calculation of user expertise. In the current implementation, it is unprotected against malicious behaviour whereby one can edit pages by simply pasting information copied from external sources. As a result, such a user will have his skills incremented regardless of his factual knowledge of the copied content. Additionally, the inferred expertise needs to be refined and/or validated by experts or by controlled vocabularies from domain ontologies. The objective is to filter out terms that do not convey the idea of a skill or knowledge. For instance, the inferred expertise leader could be turned into leadership, or the term study could be turned into education.

As to the calculation of user expertise, equal weight is currently given to all activities without distinction. In future work, we plan to apply different weights depending on the activity. For example, a user who assigned tags to a large number of pages should not have the same expertise rating as a user who wrote the content of these pages. Further, in line with Best and Krueger (2006), the collaborative model should also look at individual activity within the community, and not only at a certain page, considering that participative users are more likely to be engaged in collaborative work than those who contribute sporadically.

Finally, all contributions should be weighted with a time decay. A declining marginal value should weigh each contribution against the existing ones. For instance, is the 50th contribution really just as valuable as the 5th?

6.4 Discussion

As for the support towards collaborative work in problem-solving spaces, we investigated whether recommendations could be more helpful if the collaborative value of wiki pages is taken into account. As the results showed, this hypothesis was confirmed: the participants in fact received the needed support from the tag-based recommendations powered by the collaborative value to find the information they were looking for. In particular, we observed improvements in three dimensions: quality, effectiveness and performance. Quality refers to the content of the pages recommended; the high ratings left on the pages where collaboration was taken into account reflected the participants' satisfaction in finding helpful information to solve tasks. Effectiveness refers to the number of pages visited to accomplish the task; when the collaborative value was considered, we observed that fewer pages were necessary (on average) for participants to complete their tasks. Finally, performance refers to the accuracy of the recommendations in suggesting the precise information.

As to collaborative activities, we observed some interesting behavioural aspects during the experiment. Rating was the most frequently performed activity due to its simplicity of use. It was also the first activity to be performed as soon as the participants visited a page. In very particular cases, we observed that some participants rated pages with high grades even though the content was not really helpful; they probably enjoyed the content even though no support was provided. Tagging was the second most performed activity. In general, participants used terms to describe the page content or to provide some categorisation. Implicitly, they were providing a significant contribution to the experiment, since tagging was the core engine for generating recommendations. Further, a decay in tagging activity was observed as the experiment approached its end. This was interpreted as natural behaviour due to tag saturation on some pages: participants stopped tagging pages whose tag set already seemed sufficient to describe them. Editing was the third most performed activity. In most cases, the page content was moderately improved with shorter sentences, without big restructurings or removals. It was clear that participants respected existing content posted by others. Commenting was the least performed activity. An opportunistic (but minor) behaviour was detected when participants visiting unhelpful recommended pages used the comment area to ask explicitly for the missing parts they needed. Regarding the usage of recommendations, no malicious behaviour was detected and little support was necessary during the experiment.

As to the applicability of the approach, we are quite positive that other recommender systems of any nature (content-based, collaborative filtering, etc.) can utilise and/or adapt the present model to their reality. The proposed model, for example, can be extended with other collaborative activities not addressed in this work, or downsized if not all suggested activities are available. In this work, we utilised a wiki system equipped with a number of Web 2.0 features because of their convenience for evaluating all factors addressed in the model. In addition, these features were important to collect more evidence of collaborative work in a multi-user environment. In order to consolidate the research, the next section presents an interview with professionals from software companies, in which they evaluate the usefulness of our recommendation model in their corporate wikis.

7 Interview with professionals from software industry

In order to ascertain whether our collaborative model is viable in the eyes of industry, we interviewed project managers and software engineers about the usefulness of the model. The questionnaire was answered by ten professionals in the following countries: Denmark, Czech Republic, Brazil and Austria. As a constraint, we solely interviewed professionals from companies where wikis were available as part of their daily activities. The compilation of answers is shown in the following:

  • Question 1: What is the purpose of using a wiki in your company?

    The common answers were that people in industry use the wiki to find information about specific parts of the software development, test systems and lessons learnt. Many others said that a wiki is important for documenting software, creating proposals, and sharing any information that is relevant to several people. They also stated that wikis are commonly used to discuss certain topics based on people's comments.

  • Question 2: Is the wiki available for all employees? On average, how many employees utilise the wiki and which is the main reason for it?

    To the first question, the general answer was that the wiki is available to all employees and only very few pages are restricted to management personnel. To the second question, the answers report that about 90 % of employees read it regularly, 50 % edit the wiki every now and then and 20 % read or edit it on a regular basis. Reasons for the limited participation include: (1) fear of being noticed, (2) shyness and (3) selfishness (why should I share, if what I know makes me special?).

  • Question 3: How do you find specific information within the wiki?

    According to the responses, 90 % of the respondents utilise search or self-created navigation through wiki links. Features such as the tag cloud are rarely implemented or used. Only one company has a wiki with recommendations available.

  • Question 4: How could recommendations be, or how are they, useful within a wiki system? Could you provide an example?

    The general answers state that recommendations support users in discovering related pages not listed in the main menu, besides keeping users focused on a given topic of interest. One respondent stated clearly: “It would be great if I get links to related pages”, i.e. exactly what we propose. As examples, the participants expected personalised recommendations that suggest pages matching their field of interest or that warn them if important pages have changed. In addition, one participant claimed that recommendations become more interesting when they come with a justification of why they were triggered: “I like to use features that I understand”.

  • Question 5: Since a wiki is a collaborative space, what makes a recommendation relevant?

    The general answers state that recommendations help focusing on the currently read content. One participant stated clearly: “I don't want to search, I want to find. In any wiki the users should spend more time reading than writing. But if I don't waste time searching around and instead find what is interesting to me immediately, I am more efficient and I consider the Wiki as a more valuable tool.”

  • Question 6: Do you think the value of a wiki page should take into account the expertise of collaborators? Why?

    In general, the respondents stated that it might help when one wants to learn about a subject in a general way and the system leads them to potentially high-quality content. While the expertise of the collaborators might influence the quality of wiki pages, it does help in filtering information relevant to the information need. Expertise helps to ensure reliability, precision and good coverage of the topic. One participant argued that “people can learn only from those, who are better than themselves (possessing higher expertise) - skill/expertise should thus be reflected in the recommender system”. Some respondents noted that expert posts are not necessarily more important than novice posts; the clarity and readability of posts should also be considered.

  • Question 7: In addition to easy editing, which other features could a wiki implement so that we know more about the collaborators?

    Participants stated that recording a history of contributions to estimate a field of interest is an interesting feature. Many outlined the importance of tagging for searching content and of statistics about the latest page edits. One in particular suggested that integration with existing social networks would attract users. In general, any feature such as tagging, bookmarking, recommendation, or systems that compile overview pages could be useful, as long as they help finding relevant content. One participant outlined his interest in knowing the position of the collaborator: “… I need to see where that person is in the organisation. What department does the person belong to? How far up in the food-chain is the person?”.

  • Question 8: Do you believe that recommender systems that consider the collaborative value of wiki pages (given by the expertise of participants and user’s participation) can be useful for problem solving in corporate wikis? Why?

    Participants appreciated the feature and claimed that it scales: “The larger the wiki, the more useful this could be”. In wikis populated with few pages, users opportunistically know where the information is. On the other hand, large wikis might require sophisticated methods for assessing the quality of the content. To our surprise, one of the participants basically described part of our model: “The more people involved the more wisdom of the crowd would I consider it. On such matters, where I am looking at opinion…more individuals involved in the page means more consensus and more collaborative value”. Some respondents also stated that the collaborative value of a page needs to be carefully analysed, however, because high collaboration does not necessarily mean that the quality of the content is high; it might nevertheless help finding experts. As an example: Paul writes about Topic A, John completely disagrees and changes most of the content. Both start discussing and counter editing through the wiki. The page about topic A will barely reach any form of quality that way. But outsiders can understand that Paul and John are experts in the field with different opinions. That might be a vital aspect in competitive settings. An important aspect was raised by one participant, who argued that “a self-proclaimed expert need not be an expert. A troll may collaborate a lot while having little real understanding of the topic.” This observation matches exactly what we advocate in our model: the user expertise is determined by his participation in the wiki system instead of self-evaluation.

  • Question 9: What could be the possible drawbacks of recommender systems that implement such a solution from question 8?

    General answers stated that it is difficult to understand whether people are really collaborating or simply arguing about something. Some participants stated that it could happen that the same heavily edited content is always on top of the recommendations and that the long tail becomes even harder to find. One participant stated that such new features would make the system more and more complex, and thus harder to adopt.

  • Question 10: Would you implement such a recommendation model (from question 8) in your company? Why?

    Many stated that our recommendation model looks promising and that it could even help boost teamwork. Some respondents emphasised that the benefits would be clearer if current wikis were more people-oriented and community-oriented. Others stated that although the feature is promising, it might not bring much improvement for wikis with few pages or low activity.

7.1 Summary of answers

The questionnaire was important to provide realistic feedback on how our approach is seen from an industry perspective. As the answers showed, wikis are important tools utilised in software companies for knowledge transfer and problem solving. The respondents also reported that, in general, all employees have unrestricted access to wikis regardless of their roles. The questionnaire also unveiled that 90 % of the wikis utilise search as a primary mechanism to find content and, more importantly for our study, that recommendations are rarely implemented by wikis. The respondents also recognised the importance of recommendations as a tool for supporting users in discovering related pages, besides keeping users focused on a given topic of interest. Many also outlined the importance of having personalised recommendations.

The respondents also agreed that recommendations should take into account the user expertise, since it can point to content of potentially high quality, besides helping filter relevant information. Participants also stated that recording a history of contributions to estimate a field of interest is an interesting feature. Focusing exclusively on our model, participants appreciated the feature and claimed that it should work better for large corporate wikis with intense participation. Interestingly, they also stated that self-evaluation should not be decisive in determining the user expertise. Regarding the drawbacks, overall answers outlined that it is difficult to understand whether people are really collaborating or simply arguing about something, and that the complexity of the model could hinder its deployment in existing corporate wikis. Finally, participants also stated that the recommendation model looks promising and that it could even help boost teamwork.
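The idea the respondents endorsed, scaling tag-based similarity by a collaborative value derived from contributor expertise and participation, can be sketched as follows. This is a minimal illustrative sketch, not the exact model from this study: the function names, the linear expertise-by-participation aggregation, and the `1 + value` weighting are all assumptions made for illustration.

```python
# Hypothetical sketch: weight tag-based similarity by the collaborative
# value of a page. All names and formulas are illustrative assumptions,
# not the paper's exact model.

def collaborative_value(contributions):
    """Aggregate contributor expertise weighted by share of edits.

    `contributions` maps a contributor name to a tuple
    (expertise, num_edits), where expertise in [0, 1] is assumed to be
    inferred from past wiki activity rather than self-evaluation.
    """
    total_edits = sum(edits for _, edits in contributions.values())
    if total_edits == 0:
        return 0.0
    return sum(expertise * edits / total_edits
               for expertise, edits in contributions.values())

def score(tag_similarity, contributions):
    """Rank a candidate page: similarity scaled by collaborative value."""
    return tag_similarity * (1 + collaborative_value(contributions))

# A page edited mostly by high-expertise users outranks one with the
# same tag similarity but lower-expertise contributors.
page_a = {"paul": (0.9, 8), "john": (0.8, 2)}   # mostly expert edits
page_b = {"anon": (0.2, 10)}                    # low-expertise edits
assert score(0.5, page_a) > score(0.5, page_b)
```

Weighting by the share of edits rather than raw edit counts keeps a single prolific low-expertise contributor (the “troll” case raised in question 8) from inflating the page's value on volume alone.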

8 Conclusion and future works

This study exploits social and community intelligence in corporate wikis for encouraging knowledge transfer in companies. In particular, we proposed a recommender system weighted by a collaboration model that summarises individual efforts on the wiki pages. The experiment carried out with 63 participants showed that recommendations powered by the collaborative value of wiki pages achieved a better performance than recommendations from pure tag-based similarity in assisting users to solve the proposed tasks. This result was reinforced by ten professionals from software companies, who stated that the method looks promising and is capable of boosting teamwork. As future work, we aim at refining the inference of expertise using controlled vocabulary from domain ontologies. Further, we also plan to analyse the correlation between the social network structure and the performance of recommendations. We also aim at studying the recommendation performance when the present model is measured in collaborative spaces with heterogeneous goals and characteristics rather than problem solving. Finally, we plan to redo the experiments in an enterprise environment and compare the results with those obtained in this study.