1 Introduction

Contrary to traditional data collection approaches, crowdsourcing has enabled the active participation of citizens in decision planning processes by opening the door toward simplifying the challenging task of understanding the complexities underlying human perceptions. Crowdsourcing-based data collection techniques have also made it possible to obtain the knowledge required for constructing a filtered view of urban perceptions while allowing the interpretation of cultural, social and economic factors that contributed to shaping these perceptions as well as individuals’ personal experiences and opinions [1,2,3,4,5,6]. That is, these techniques have facilitated building an accumulative picture of perceptions while allowing the data collection process to span geographical and cultural boundaries [7].

In this paper, we employ crowdsourcing as a human-centered data collection approach to obtain a unified view of entertainment perceptions for local populations in Saudi Arabia. We present the first research attempt that quantifies the entertainment-related perceptions of people in the Saudi context. Using perception data collected by crowdsourcing the task of rating paired images in a visual survey, we show the effectiveness of crowdsourcing as a tool for uncovering differences in entertainment perceptions in Saudi Arabia, a culture representing people of diverse backgrounds, experiences and cultural beliefs.

One of the main properties of Saudi culture, as suggested by Hofstede’s theory of cultural dimensions, is that it is collectivistic, meaning that its individuals work for the welfare of the social group to which they belong [8]. Given this, and the fact that people in Saudi Arabia have different social and cultural norms, we expect that collecting perceptual data from people living in different areas of Saudi Arabia via visual surveys will help reveal and quantify the perceptual differences among various Saudi cultural groups. We also aim to investigate whether the highly masculine nature of Saudi society has any clear effect on individuals’ perceptions of entertainment.

While the psychology literature has unveiled many factors that explain cross-cultural human behavior and perception, the inter-cultural features that explain the perceptions of residents in Arab countries have yet to be explored. In particular, we note that despite the multi-cultural nature of the Saudi environment, the question of what effects this cultural diversity has on individuals’ values and preferences remains open. Although these preferences can be studied from a variety of perspectives, we build on the fact that entertainment preferences form in individuals’ minds at the intersection of a multitude of dimensions, including personal preferences and the acceptability of new entertainment options, to demonstrate the effectiveness of crowdsourcing in unraveling the complexities of these human factors. Thus, for each specific sub-cultural group in Saudi Arabia, we aim to show the factors that could contribute to defining what makes an entertainment option more or less preferred than the other available options. Synthesizing our observations on each group also makes it possible to draw some generalizations that apply to Saudi culture as a whole and, at the same time, to identify the perceptual similarities and differences between all the sub-cultural groups in question.

Our overarching goal is not only to provide insights for urban planners and designers in the Saudi context but also to demonstrate the ability of crowdsourcing-based visual ratings to capture quantified subjective judgments of people regardless of the complex features of their cultures, their social backgrounds or demographics. Therefore, the contributions of the present research effort are: (1) building a human-centered crowdsourcing platform for eliciting perception data in the Saudi context that is based on a new image data set of many entertainment modes, (2) exploiting entertainment as a way for uncovering the perceptual indicators of citizens and residents in Saudi Arabia, and (3) proposing a data collection model that could be generalized for gaining an improved understanding of perceptual responses on a society level while considering the subjective properties of individuals who collectively contribute to shaping the big picture of the society’s overall impression.

2 Related Work

Prior work has demonstrated the effectiveness of online crowdsourced visual surveys as a technique for facilitating the collection of perception data on a large scale and enabling the interpretation and evaluation of urban impressions from multiple dimensions [1, 7]. A number of researchers have exploited this technique to gather data about urban perceptions of places and to relate people’s aggregate impressions to economic, social, demographic and cultural factors [1,2,3, 7]. For instance, Quercia et al. constructed a visualized map representing the collective recognizability of places in London based on the subjective responses collected via a crowdsourcing game [3]. The analysis of the perception data gathered by Quercia et al. facilitated establishing a quantified subjective ranking of the main regions in London according to the recognizability ratings obtained from the respondents [3]. Quercia et al. also demonstrated how the conducted analyses helped them uncover the correlation between a respondent’s evaluation of the recognizability of a place and his/her country of residence [3]. In another research effort that crowdsourced the task of judging the beauty of London’s streets, pairwise image ratings collected from a large pool of respondents helped researchers understand the role played by the aesthetic and visual features of urban environments in shaping public perceptions [2].

Salesses et al. also outsourced the task of rating images on a larger scale to explain the differences in public perceptions in a number of cities and relate the obtained quantified data to participants’ social and demographic variables [1]. Using a similar approach, Ruiz-Correa et al. grounded the study of youth perceptions of outdoor environments on data gathered through crowdsourced visual surveys [9]. Particularly for indoor environments, other researchers built a characterization of place ambiance based on the analysis of image ratings gathered from Amazon Mechanical Turk crowd workers [7]. Based on the results of an experiment conducted in a crowdsourcing setting, Santani and Gatica-Perez also demonstrated the applicability of using images of indoor places as stimuli for understanding the psychological factors that contribute to shaping place perceptions and examining the differences in collective impressions on a large scale [7]. Aggregate impressions of people gathered using crowdsourcing platforms were also utilized in other research attempts to explore what drives people to feel unsafe in urban places [4, 10]. For instance, Traunmueller et al. found that safety perceptions of cities are affected primarily by people’s familiarity with places [10]. In another interesting research effort, a platform named Streetsmart was built specifically to gather safety ratings about people depicted in images rather than focusing on perceptual dimensions of places [4].

The above-mentioned research efforts clearly show that the scalability of image-based crowdsourced surveys has enabled capturing detailed insight into people’s impressions and obtaining a contextualized understanding of social perceptions from multiple perspectives. Despite the tremendous advantages of this data collection approach, no study has used it specifically to understand the collective perceptions of the local populations in Saudi Arabia. We also note that prior work has mostly focused on studying perceptual factors independently, without giving considerable attention to how the interaction of social, cultural and personal factors shapes individuals’ preferences and perceptions. Unlike previous studies, we crowdsource visual surveys to uncover the factors that contribute to shaping entertainment-related perceptions and to demonstrate the effect of the interaction between these factors on public perceptions in the Saudi context, a problem that has not been tackled in the preceding studies.

3 Research Methodology

We followed a structured research methodology that facilitated collecting and analyzing the participants’ perception data in a systematic manner (see Fig. 1). The methodology we followed for conducting this study was inspired by the work of Salesses et al. [1]. As the focus of this study is on uncovering the differences in perceptions of entertainment methods in the Saudi culture, we started by preparing a dataset of images, sourced from Google Images, representing a wide spectrum of entertainment options.

Fig. 1. The research methodology

While selecting the images included in the dataset, we excluded those containing aspects that could distract the participant from the presented entertainment option or steer the participant toward one answer over the other. For instance, images that included large textual descriptions or drawings were excluded. We also made sure that all images considered in the study were of comparable quality, to avoid the possibility of an image being rejected because of its poor resolution or clarity. We also included more than one image of the same place with different factors present (e.g., the same scene at different times of the day), as this helped us study the effect of these factors on viewers’ perceptions. At the end of the image collection phase, we had a dataset containing 1200 different images (see Table 1).

Table 1. Sample of the images included in the dataset

We then manually inspected the collected images and annotated them with labels indicating the entertainment options depicted in each image. We chose to cover a variety of entertainment elements and labeled the collected images accordingly. We also made sure that all image labels are evenly distributed across our dataset to avoid the bias that results from having more images belonging to a particular category than to the others. For each image, our annotations indicated whether the image represents indoor entertainment activities (e.g., shopping malls, arcades, and cafes), outdoor entertainment options (e.g., parks, resorts and beaches), sports activities, or special cultural and social events. We also placed labels indicating whether the entertainment method depicted in the image is electronic (e.g., video games) or kinetic.

These annotations were then used to facilitate extracting the different dimensions that characterize entertainment-related perceptions in each Saudi region. As we aim to elicit users’ perceptions through their image choices, and since images were annotated along a variety of dimensions that contribute to shaping individuals’ perceptions, we note that the majority of the collected images were representative of multiple entertainment preferences, depending on whether the image includes human subjects, whether it shows a sport or cultural activity, and the type of entertainment option that it depicts (e.g., video games vs kinetic games). For instance, in Table 1, the choice of images (a), (h), (i) and (l) could mean that the participant tends to prefer outdoor entertainment options over those in indoor environments. On the other hand, preferring images (e), (g) and (h) signals that the user is unlikely to favor non-cultural events over cultural ones. We also gave special attention to sport activities and heritage places, as depicted in images (a) and (d), respectively. After annotating all the images in the dataset, we recruited a number of students and faculty members from the College of Computer and Information Sciences at King Saud University, as well as participants from some social media sites, for a pilot study that aimed at validating that the images included in the crowdsourcing platform capture and depict the entertainment properties in question.

The collected images were then plugged into the crowdsourcing platform, which was built to collect subjective ratings of how preferable a particular entertainment mode is by presenting these images randomly to each user in a visual survey form. In our crowdsourcing platform, we included one question as a basis for the visual survey: “Which place looks more entertaining, fun or relaxing?” (Figure 2 shows the crowdsourcing platform used to collect urban perceptions of entertainment in the Saudi context). A link to the developed web-based platform was then distributed to Saudi residents via social networking and instant messaging platforms such as Twitter and WhatsApp in order to collect the perception data needed for the study. For each user, we collected demographic data relating to his/her gender, age group, educational background and the particular Saudi region in which he/she resides (i.e., eastern region, central region, northern region and western region). We also noted whether a participant is a Saudi citizen or a Saudi resident to explore the perceptual differences between Saudis and non-Saudis. While analyzing the collected data, we related participants’ demographic and personal properties to our perception-related observations. We also noted the differences in perceptual factors between individuals belonging to different Saudi Arabian regions or cultural groups. This was followed by in-person interviews with individuals belonging to these different groups in order to verify the results of the conducted analyses and gain insights into the applicability of crowdsourcing for capturing a holistic view of entertainment-related impressions in the Saudi context.

Fig. 2. Visual surveys in the Saudi-Impression crowdsourcing system
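The random presentation of image pairs and the recording of each response could be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names `sample_pair` and `record_vote`, the integer image identifiers, and the three outcome labels (`"left"`, `"right"`, `"equal"`) are hypothetical and are not taken from the platform itself.

```python
import random


def sample_pair(image_ids, rng=random):
    """Return two distinct image ids chosen uniformly at random.

    Assumption: every image in the dataset is equally likely to be shown.
    """
    ids = list(image_ids)
    if len(ids) < 2:
        raise ValueError("need at least two images to form a pair")
    left, right = rng.sample(ids, 2)  # sampling without replacement
    return left, right


def record_vote(votes, left, right, outcome):
    """Append one survey response to the in-memory list `votes`.

    `outcome` is "left" or "right" (the preferred image) or "equal".
    These labels are hypothetical names used only for this sketch.
    """
    if outcome not in {"left", "right", "equal"}:
        raise ValueError(f"unknown outcome: {outcome!r}")
    votes.append((left, right, outcome))


# Usage: draw one pair from a dataset of 1200 images and store a click.
votes = []
left, right = sample_pair(range(1200), random.Random(0))
record_vote(votes, left, right, "left")
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which is convenient when testing.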

3.1 Quantifying Perceptions

To measure users’ perceptions, users’ clicks were used to give each image a weight ranging from 0 to 10, with 0 indicating the lowest preference and 10 the highest. The method was adapted from a similar approach proposed by Salesses et al. to quantify users’ perceptions in urban areas [1]. In the crowdsourcing system, images were displayed in pairs. Both images of a pair were chosen randomly, so in some pairs one image would be significantly more attractive than, or superior to, the other. For each image, the total number of image clicks was taken into account (see Eqs. 1 and 2), where \( W_{i} \) and \( L_{i} \) refer to the win score and loss score of image i, respectively. For a given image i, \( w_{i} \), \( l_{i} \), and \( t_{i} \) refer to the number of win clicks, the number of loss clicks, and the number of equality clicks for the image, respectively.

$$ W_{i} = \frac{{w_{i} }}{{( w_{i} + l_{i} + t_{i} )}} $$
(1)
$$ L_{i} = \frac{{l_{i} }}{{( w_{i} + l_{i} + t_{i} )}} $$
(2)

The wins and losses of the images that an image was paired with were also taken into account to validate each image’s weight (see Eq. 3), where \( Q_{i} \) is the accumulated weight of image i, \( n_{i}^{w} \) is the total number of images that i was preferred over in the pairwise comparisons, \( n_{i}^{l} \) is the total number of images that were preferred over i, \( j_{1} \) indexes the subset of images that i was preferred over, and \( j_{2} \) indexes the subset of images preferred over i. Since the win score of i, the mean win score and the mean loss score each lie between 0 and 1, the bracketed sum lies between 0 and 3. The +1 term and the factor \( \frac{10}{3} \) therefore scale the weight to the range 0–10.

$$ Q_{i} = \frac{10}{3} \left( W_{i} + \frac{1}{n_{i}^{w}} \sum_{j_{1} = 1}^{n_{i}^{w}} W_{j_{1}} - \frac{1}{n_{i}^{l}} \sum_{j_{2} = 1}^{n_{i}^{l}} L_{j_{2}} + 1 \right) $$
(3)
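As an illustration of Eqs. 1–3, the following Python sketch computes the win score, loss score and Q-score of every image from a list of pairwise votes. It is a minimal sketch, not the platform’s implementation. The vote format `(left, right, outcome)` and the function name `q_scores` are hypothetical. The sketch also assumes that the two averages in Eq. 3 run over the distinct images that i beat or lost to, and that an empty average (an image that never won or never lost) contributes 0.

```python
from collections import defaultdict


def q_scores(votes):
    """Compute Q-scores (Eq. 3) from pairwise votes.

    votes: iterable of (left, right, outcome) tuples, where outcome is
    "left", "right" or "equal" (hypothetical labels for this sketch).
    """
    wins = defaultdict(int)       # w_i: number of win clicks
    losses = defaultdict(int)     # l_i: number of loss clicks
    ties = defaultdict(int)       # t_i: number of equality clicks
    beaten = defaultdict(set)     # images that i was preferred over
    beaten_by = defaultdict(set)  # images that were preferred over i
    images = set()

    for left, right, outcome in votes:
        images.update((left, right))
        if outcome == "equal":
            ties[left] += 1
            ties[right] += 1
            continue
        winner, loser = (left, right) if outcome == "left" else (right, left)
        wins[winner] += 1
        losses[loser] += 1
        beaten[winner].add(loser)
        beaten_by[loser].add(winner)

    def rate(counts, i):
        # Eqs. 1 and 2: share of wins (or losses) among all clicks on image i.
        total = wins[i] + losses[i] + ties[i]
        return counts[i] / total if total else 0.0

    W = {i: rate(wins, i) for i in images}
    L = {i: rate(losses, i) for i in images}

    def mean(values):
        values = list(values)
        # Assumption: an empty average contributes 0 to Eq. 3.
        return sum(values) / len(values) if values else 0.0

    return {
        i: (10 / 3) * (
            W[i]
            + mean(W[j] for j in beaten[i])
            - mean(L[j] for j in beaten_by[i])
            + 1
        )
        for i in images
    }


# Usage with three made-up images "a", "b" and "c".
example_votes = [
    ("a", "b", "left"),   # a preferred over b
    ("a", "c", "left"),   # a preferred over c
    ("b", "c", "left"),   # b preferred over c
    ("b", "c", "equal"),  # b and c judged equal
]
scores = q_scores(example_votes)
```

For these made-up votes the scores are 130/18 ≈ 7.22 for "a", 40/9 ≈ 4.44 for "b" and 50/18 ≈ 2.78 for "c", all within the 0–10 range.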

4 The Role of Visual Analytics in the Study

With the introduction of social media and the Internet of Things, large amounts of complex data and information are increasingly being made available by the public, representing a rich resource for scientists and stakeholders who intend to draw conclusions and view problems through the eyes of beneficiaries within the public audience. However, the large volume and high heterogeneity of the data pose a challenge for analyzing and understanding it effectively. Visual analysis has therefore been shown to be effective at enhancing human analytical capabilities by exploiting computer intelligence. Visualization is, simply, the study of transforming data and information into interactive visual representations [11]. The iterative, interactive and dynamic integration of human intelligence with data analysis creates a novel analysis dimension, namely visual analytics [12].

In the context of visualizing urban perceptions, visual analysis is used to reveal hidden patterns and represent the structures and distributions of raw perception data. It facilitates obtaining a global understanding of entertainment preferences while correlating entertainment variables between different regions, sub-categories or sub-groups. To visualize the perception data collected for the present study, a number of visualization techniques could be employed to discover the differences in perception among different Saudi cultural contexts while uncovering the common themes in perception between different Saudi regions or different generations. Grounding the visualization of the collected perception data on the Q-scores recorded for each image, and relating these scores to the personal and demographic variables collected from each participant, is expected to help in understanding how different cultural groups prioritize entertainment modes. This would also provide urban planners with a tool that informs their design of entertainment modes for each sub-cultural group while allowing users to grasp complex properties effortlessly.

It is also worth noting that since Saudi Impression is a real-time online tool, real-time visualization would make it possible to observe how the collective perceptions of Saudis change over time. For instance, point-based visualization could be used to plot information about the participants, with each individual represented by a single dot placed on the map in the Saudi region to which he/she belongs, and with the color of the dot indicating the age group or the gender of the participant. The advantage of dot-based visualization is that it lets the viewer observe the state of every single object independently. However, as data accumulates for many objects over time, dot-based representations could become infeasible because the visualization becomes complex and hard to understand [13]. A good alternative could be to use a heatmap to illustrate the aggregate amount of a large number of objects on the Saudi map [13].

Furthermore, the perceptions of the respondents from each Saudi region could be reflected on cartograms, with each cartogram representing the perceptions of individuals belonging to one geographical region, a darker shade representing low Q-score values and a lighter shade reflecting high Q-score values. Region-based representation is suitable for showing the relative preference of each region compared to its neighbors by revealing hidden macro-patterns [14]. On the other hand, it cannot capture micro-patterns, so it is commonly used in combination with other techniques that can go into more detail.
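The shading rule described above (dark for low Q-scores, light for high ones) could be sketched as a simple linear mapping from a region’s mean Q-score to a gray level. This is an assumption-laden sketch: the function names, the grayscale palette and the example region values are hypothetical, and a real cartogram would additionally need region geometries and a plotting library.

```python
def q_to_gray(q, q_min=0.0, q_max=10.0):
    """Map a Q-score to a hex gray color: low scores dark, high scores light.

    Assumption: a linear scale over [q_min, q_max]; out-of-range values
    are clamped to the nearest bound.
    """
    if q_max <= q_min:
        raise ValueError("q_max must be greater than q_min")
    clamped = min(max(q, q_min), q_max)
    level = round(255 * (clamped - q_min) / (q_max - q_min))
    return f"#{level:02x}{level:02x}{level:02x}"


def region_shades(mean_q_by_region):
    """Return a {region: hex color} mapping for a region-based map."""
    return {region: q_to_gray(q) for region, q in mean_q_by_region.items()}


# Usage with made-up mean Q-scores (not results from the study).
shades = region_shades({"Central": 10.0, "Eastern": 0.0})
```

A sequential color palette could replace the grayscale ramp without changing the rest of the sketch.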

Bubble charts could be used to display the set of Q-scores, with each bubble corresponding to a different entertainment mode and the size of the bubble representing the mean Q-score value. For instance, the analysis of responses from the large and industrial cities in the Central and Eastern regions of Saudi Arabia, where the majority of the population is young, could reflect that sport activities and indoor entertainment are preferred over other types of entertainment (see Fig. 3). On the other hand, outdoor entertainment could outperform other types of entertainment in the Southern region of Saudi Arabia due to the moderate weather.

Fig. 3. Using bubble charts for analyzing perception data
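The data preparation behind such a bubble chart could be sketched as follows: mean Q-scores are aggregated per region and entertainment mode, and each mean is converted to a bubble radius. The record layout `(region, mode, q_score)`, the function names, the choice to make bubble area (rather than radius) proportional to the mean Q-score, and the sample values are all assumptions made for illustration. They are not results or design decisions reported in this study.

```python
import math
from collections import defaultdict


def mean_q_by_group(records):
    """Average Q-scores per (region, mode) pair.

    records: iterable of (region, mode, q_score) tuples (hypothetical layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for region, mode, q in records:
        sums[(region, mode)] += q
        counts[(region, mode)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def bubble_radius(mean_q, max_radius=40.0, q_max=10.0):
    """Radius of a bubble whose area is proportional to the mean Q-score.

    Assumption: area scaling, so a score of q_max maps to max_radius
    and a quarter of q_max maps to half of max_radius.
    """
    return max_radius * math.sqrt(max(mean_q, 0.0) / q_max)


# Usage with made-up records (not data from the study).
records = [
    ("Central", "sports", 8.0),
    ("Central", "sports", 6.0),
    ("Southern", "outdoor", 9.0),
]
means = mean_q_by_group(records)
radii = {key: bubble_radius(q) for key, q in means.items()}
```

Scaling the area rather than the radius avoids visually exaggerating differences between high and low mean scores.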

5 Conclusion, Limitations and Future Work

The aim of this work is not only to understand the Saudi view of entertainment and explore potential areas of improvement in entertainment, but also to provide a tool that can be easily generalized to capture an audience’s visual perceptions on any topic and, in turn, translate those perceptions into measurable deliverables for the beneficiaries, whether they are from the public, business or government. The rationale for using a visual survey to capture audience perceptions in an area like entertainment is that entertainment facilities have properties, such as ambiance, natural surroundings and social interactions, that cannot be collectively measured by traditional approaches, which take only tradable factors into account. The strength of the paired-image comparison approach comes from its flexibility: it can be applied to any topic for which related images have been collected beforehand, and it can pose any question of interest. It also has the advantage of simplicity: participants of different demographics, whether tech-savvy youth or non-technology-oriented elderly people, are all able to participate in and even enjoy the game-like questionnaire. It also allows reaching a reasonable characterization of individuals’ preferences that encompasses the whole range of factors affecting the perceptions of individuals, even if they belong to different cultural groups or geographical regions.

In the course of this study, a number of limitations have arisen that leave room for future enhancements. First, since the sample pictures were sourced from Google Images, we lacked full control over their quality and resolution, which may affect the user’s perception of the image content. In some cases, the angle of the image may draw the viewer’s focus to, or distract the viewer from, some of the image contents. In future work, we aim to collect a dataset of manually taken images in which we can unify the ambiance of the sample images. This piece of research contributes a crowdsourcing approach for understanding the collective perception of participants who are not physically located near, or interacting with, the objects under study. It is a flexible approach that can be generalized to any topic, given a set of images that satisfies measurable dimensions.