Knowledge acquisition from social platforms based on network distributions fitting
Introduction
Social networking sites are used as a research environment, and they provide opportunities to analyze real-world behavior (Abbasi, Chai, Liu, & Sagoo, 2012) as well as online activities (Gjoka et al., 2009, Utz and Beukeboom, 2011), with applications in areas related to collaborative learning (Kwon, Liu, & Johnson, 2014), computer-mediated educational environments (Rummel & Spada, 2005) and knowledge management (Ordóñez de Pablos, 2004). Due to the complexity of network structures, analyses are usually performed on samples, with the aim of finding smaller structures that share similar properties and distributions (Ebbes, Huang, Rangaswamy, & Thadakamalla, 2008). Recent studies in this field have focused on new algorithms (Lee et al., 2006, Stumpf et al., 2005) and various areas of application (Gjoka et al., 2009, Lakhina et al., 2003, Rusmevichientong et al., 2001). The knowledge gathered from social network analysis can be extended using either typical surveys or new approaches based on adaptive surveys that optimize survey costs, quality and response rates. Research in this area is still in its early stages and adaptive methods are rarely implemented (Schouten, Calinescu, & Luiten, 2011). Another motivation for further research on sampling methods is to increase the representativeness of survey data. The majority of studies on social media focus on social network sites such as Facebook, and many of these studies use (online) surveys (Back et al., 2010, Utz and Krämer, 2009). The participants are usually students or self-selected. A problem with this approach is the representativeness of the sample: young, highly educated individuals or highly motivated users are usually overrepresented. Similar issues were identified in the field of knowledge management and collaborative learning when building groups with a specific profile (Dascalua, Bodea, Lytras, Ordoñez de Pablos, & Burlacua, 2014).
Although it is possible to extract behavioral data from social media and use them as the basis of an analysis (Liu, 2007, Thelwall, 2008), social scientists are often interested in the subjective experience of social media users, such as motivations for and gratifications of social media use, or the evaluation of competences and knowledge resources within the network (Colomo-Palacios et al., 2014b, Ordóñez de Pablos, 2004, Różewski and Ciszczyk, 2009). Surveys are still the most suitable tool for evaluating these subjective aspects. In this paper, a new method for judging and enhancing the representativeness of an online sample is presented. The authors argue that network measures such as centrality or degree can serve as a basis for determining the representativeness of an online sample vs. the entire population.
Some users hold a very central social position within online social networks, and they have many more inbound and outbound connections than other users. By comparing the network profile of the sample with that of the overall population, the representativeness of the online sample can be determined. Moreover, it is possible to develop algorithms that suggest which users should be approached in order to enhance the representativeness of a given sample, so that the results will have higher potential in the areas of community building, information dissemination, and collaborative learning (Cowan & Jonard, 2004). The approach presented below is based on selecting an adequate set of candidates in each step of a multistage process to improve the representativeness of the sample in terms of network measures. Depending on the research goal and the area of application, different network characteristics might be considered. To identify opinion leaders, the best candidates for leadership in collaborative learning, or knowledge brokers, it is usually necessary to evaluate centrality measures (Boari & Riboldazzi, 2014). However, occupying a bridge position is more important when the focus is on advertising, diffusing innovation or spreading knowledge among network nodes. From the perspective of collaborative learning, it is important to select nodes with specific characteristics for future activity within the network, and a representative selection can affect the future spread of knowledge within it.
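As an illustration of comparing the network profile of a sample with that of the whole population, the following minimal Python sketch bins node degrees and measures the gap between the two binned distributions. The toy degree lists, the bin boundaries and the use of the total variation distance are assumptions made for this example; they are not taken from the study.

```python
from collections import Counter

def degree_distribution(degrees, bins):
    """Return the share of nodes that falls into each (low, high) degree bin."""
    counts = Counter()
    for d in degrees:
        for i, (low, high) in enumerate(bins):
            if low <= d <= high:
                counts[i] += 1
                break
    total = len(degrees)
    return [counts[i] / total for i in range(len(bins))]

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical toy data: node degrees in the full network and among respondents.
population_degrees = [1, 1, 2, 2, 3, 3, 4, 8, 15, 40]
sample_degrees = [8, 15, 40, 3]

# Assumed bins: low, medium and high degree.
bins = [(0, 2), (3, 9), (10, 10**9)]

pop = degree_distribution(population_degrees, bins)
smp = degree_distribution(sample_degrees, bins)
distance = total_variation(pop, smp)
print(pop, smp, distance)
```

A distance of 0 would mean the binned degree profiles match exactly, while values closer to 1 indicate that the sample over-represents some degree classes, here the well-connected users.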
While the structure of connections within the social network influences collaborative learning processes, there is a clear need to access information about participants and their potential for learning and for sharing information with other participants. Collaborative learning and group-based learning are closely related to dynamic social systems (Strijbos, 2001) in which the members of the community interact and share experiences with one another (Chiu, 2008). During the learning process, members of the community evaluate each other's ideas and engage in monitoring the tasks and progress of other participants (Chiu, 2000). Key problems here include quantifying relevant user features, selecting users with specific characteristics, and splitting users into optimal groups (Long & Qing-hong, 2014) in order to boost the sharing of knowledge in organizations (Lytras, Tennyson, & Ordóñez de Pablos, 2008). During collaborative learning processes, building teams and increasing their potential by acquiring additional representatives with specific knowledge or competences can be very important, not only in terms of knowledge itself, but also in terms of network characteristics. While the ability to obtain knowledge from all nodes of a network can be limited, sampling methods can be applied to acquire the desired information. The proposed method can be adapted to different research goals by using weighted sampling. As online surveys are usually based on voluntary participation, and because response rates may be low, the obtained sample may differ in its characteristics from a random sample. The proposed method makes it possible to direct the selection process towards the expected characteristics of the sample.
Section snippets
Conventional and adaptive network sampling
Research related to network sampling is based on various techniques using both conventional and adaptive approaches. A sampling design is treated as conventional when it does not use the data acquired during sampling to guide the sampling process. The first group of methods in this class is based on random node selection with uniform or degree-proportional probabilities (Maiya & Berger-Wolf, 2010), random edge selection (Ahmed, Neville, & Kompella, 2011) and the egocentric method (Ma, Gustafson, Moitra, &
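The two random node selection designs mentioned in this snippet, uniform and degree-proportional, can be sketched as follows. This is a minimal illustration using only the Python standard library; the toy degree table, the function names and the sequential weighted draw used to sample without replacement are assumptions for the example, not the procedures of the cited papers.

```python
import random

def uniform_node_sample(nodes, k, rng):
    """Draw k distinct nodes, each with the same selection probability."""
    return rng.sample(nodes, k)

def degree_proportional_sample(degree, k, rng):
    """Draw k distinct nodes one at a time, each draw weighted by node degree.

    Assumes at least k nodes have a positive degree.
    """
    remaining = dict(degree)
    chosen = []
    for _ in range(k):
        nodes = list(remaining)
        weights = [remaining[n] for n in nodes]
        pick = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # without replacement
    return chosen

# Hypothetical toy network: node -> degree.
degree = {"a": 1, "b": 2, "c": 2, "d": 5, "e": 10, "f": 30}

rng = random.Random(42)  # fixed seed so the run is repeatable
uniform = uniform_node_sample(list(degree), 3, rng)
weighted = degree_proportional_sample(degree, 3, rng)
print(uniform, weighted)
```

Neither function looks at survey responses collected so far, which is what makes both designs conventional in the sense used above.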
Conceptual framework
In this part, a balanced adaptive distribution fitting approach based on a set of network measure distributions is proposed. Its main goal is to build a representative set of survey responses from a selected set of participants, where representativeness is judged by the distance from the distributions of the whole network. A function minimizing the distance from the vector of network distributions is proposed, and network members are selected to fit the reference distributions for the whole network, which are known in advance. In Fig.
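The idea of selecting members so that the sample fits known reference distributions can be illustrated with a simple greedy sketch. The snippet does not give the authors' actual objective function, so the summed L1 distance, the greedy one-node-at-a-time strategy, the two binned measures and the toy network below are all assumptions made for this example.

```python
def distribution(bin_ids, n_bins):
    """Share of nodes in each bin, given one bin index per node."""
    counts = [0] * n_bins
    for b in bin_ids:
        counts[b] += 1
    return [c / len(bin_ids) for c in counts]

def fit_distance(sample_bins, reference, n_bins):
    """Sum over measures of the L1 distance between sample and reference."""
    total = 0.0
    for m, ref in enumerate(reference):
        smp = distribution([bins[m] for bins in sample_bins], n_bins)
        total += sum(abs(a - b) for a, b in zip(smp, ref))
    return total

def greedy_select(network, reference, n_bins, k):
    """Repeatedly add the node whose inclusion gives the smallest distance."""
    selected = []
    pool = dict(network)
    for _ in range(k):
        current = [network[n] for n in selected]
        best = min(
            pool,
            key=lambda n: fit_distance(current + [pool[n]], reference, n_bins),
        )
        selected.append(best)
        del pool[best]
    return selected

# Hypothetical network: node -> (degree bin, centrality bin), three bins each.
network = {
    "a": (0, 0), "b": (0, 0), "c": (0, 1),
    "d": (1, 1), "e": (1, 2), "f": (2, 2),
}
n_bins = 3

# Reference distributions of the whole network, assumed known in advance.
reference = [
    distribution([bins[m] for bins in network.values()], n_bins)
    for m in range(2)
]

chosen = greedy_select(network, reference, n_bins, k=3)
greedy_dist = fit_distance([network[n] for n in chosen], reference, n_bins)
naive_dist = fit_distance([network[n] for n in ("a", "b", "c")], reference, n_bins)
print(chosen, greedy_dist, naive_dist)
```

In this toy case the greedy choice ends up much closer to the reference distributions than simply taking the first three nodes. A greedy search is not guaranteed to find the best possible subset in general.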
Empirical research
The new approach is demonstrated by presenting the results of a survey performed within an online social network based on a graphical virtual world with both entertainment and educational purposes. An online survey covering motivations, self-disclosure and self-presentation was conducted among portal users and filled in by 373 of them, while 9631 users logged into the system in the examined period and were identified by their unique user_ID. The structural measures computed for the full network
Discussion and summary
Growing engagement in social network systems and the move from traditional environments to online systems create a new space for both theoretical and empirical studies. Together with technological development, the need also grows for new methods that make research processes more efficient and increase their quality. While adaptive survey methodologies were the subject of earlier research, they are not frequently applied to online research. An alternative to available solutions was presented in
Acknowledgments
The work was partially supported by a fellowship co-financed by the European Union within the European Social Fund, by the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 316097 [ENGINE], and by the National Science Centre, decision no. DEC-2013/09/B/ST6/02317.
References (58)
- et al. How knowledge brokers emerge and evolve: The role of actors’ behaviour. Research Policy (2014).
- et al. Network structure and the diffusion of knowledge. Journal of Economic Dynamics and Control (2004).
- et al. Variance approximation under balanced sampling. Journal of Statistical Planning and Inference (2005).
- Estimation of population totals by use of snowball samples.
- et al. Group regulation and social-emotional interactions observed in computer supported collaborative learning: Comparison between good vs. poor collaborators. Computers & Education (2014).
- et al. Real-world behavior analysis through a social media lens, social computing, behavioral – Cultural modeling and prediction. Lecture Notes in Computer Science (2012).
- Ahmed, N., Neville, J., & Kompella, R. (2011). Network Sampling via Edge-based Node Selection with Graph Induction,...
- et al. Analysis of topological characteristics of huge online social networking services.
- et al. Facebook profiles reflect actual personality, not self-idealization. Psychological Science (2010).
- Group problem solving processes: Social interactions and individual actions. Theory of Social Behavior (2000).