
1 Introduction

Visualizations are key media for perceiving, understanding, reasoning about and interacting with the sheer volume of data and information in the Big Data era. Understanding what a visualization means and how it creates meaning is a paramount task for both audiences and designers. The essence of a visualization is not solely to provide information, but to act as a platform and inferential starting point from which the audience can frame new information and knowledge [1]. In other words, how designers make a visualization clearer through visual representations and encoding methods is crucial to constructing that starting point for audiences. Among visual representations, acquired codes together with perceptual cues can improve the readability, comprehensibility and familiarity of visualizations [2]. Visual conventions, the perceptually efficient graphic agreements shared by users and designers, can lead to faster recognition with less effort. In addition, the emergence of knowledge visualization [3] makes it worthwhile to investigate the function of visual conventions and perceptual cues in perception and inference when interacting with visualizations.

2 Background Work

Visual conventions and metaphors are indispensable for improving the efficiency of information discovery, extraction and transfer [4]. They make diagrammatic representations superior to sentential ones for searching and identifying data. Larkin and Simon [2] concluded that diagrammatic representations make relations between individual elements more explicit, which facilitates and simplifies perception and comprehension. Encoding forms such as color and size carry many cultural conventions. For example, color carries many common emotional and aesthetic associations, and size can represent the relative importance of entities [5]. As depicted in Fig. 1(a), the size and lightness of a dot indicate the number of linkages it contains; in Fig. 1(b), the lightness of blue indicates the volume of rainfall in each month. Both encodings agree with perceptual decoding conventions and have become universally used visual representations for end users. These encoding forms are termed visual representation conventions. Whether their associations with entities can be perceived, however, depends largely on the context.

Fig. 1. Visual representation conventions in visualization design (Color figure online)

Abundant evidence in the linguistic domain suggests that integrating visual conventions and metaphors facilitates comprehension, and that user performance suffers when they are violated. Santa-Maria and Dyson [6] investigated what happens when users seek information on websites that violate visual conventions, for example by placing the scroll bar on the left side. They found that user performance was far better on convention-conforming websites than on convention-violating ones, and that search efficiency in the latter context improved after training. Lai and Curran [7] characterized the mapping involved in comprehending conventional or novel metaphors as a process rather than a representation stored in long-term memory. They used the N400, a neural indicator of ease of retrieval in lexical research, to capture the ease of retrieving and comparing concepts in conventional-metaphor and metaphor-free contexts. However, the effects of visual metaphors differ between novices and experts. For novices, metaphors serve as a mental scaffolding tool and help them reconstruct their knowledge structure. For experts, who are skilled at consolidating their background knowledge, metaphors greatly facilitate the retrieval of new knowledge [8]. The concepts of convention and metaphor originate from linguistics, and research focusing on visual language is now emerging.

Because visual conventions and metaphors ease memory retrieval and information extraction, they may also play an important role in visual inference, which is characterized by the use of spatial relations to highlight conceptual ones, to represent meaning and to chunk information for computational efficiency [9]. Inference is a higher cognitive process, as indicated by Wang et al. [10] in the Layered Reference Model of the Brain (LRMB), shown in Fig. 2.

Fig. 2. Cognitive flow of perception and comprehension of visual stimuli

Entities in the upper dashed box are the cognitive levels described in the LRMB. Entities in the bottom box depict the cognitive flow of perceiving and comprehending visual stimuli, which consists of interdependent processes, with the earlier processes serving as the basis for the later ones.

Research on inference in visualization demonstrates that diagrammatic representations can improve reasoning, in both accuracy and inference time, compared with sentential representations. Bauer and Johnson-Laird [11] extended the conclusion of Larkin and Simon [2], which concerned target searching only, and found that diagrammatic representations yielded more valid inferences and faster responses. Grant and Spivey [12] found that attentional guidance can improve reasoning in a problem-solving task that relies on a diagram. They related attention capture theory to inference-making, but said little about the mechanisms that support the facilitation. Fabrikant et al. [13] suggested that perceptual salience corresponding to thematic relevance can affect viewing behavior and enhance the efficiency of spatial inference making, as manifested in eye-tracking data. Borgo et al. [14] investigated the impact of visual embellishment on visualization memorization, visual search and concept grasping. In the concept-grasping task, participants had to identify key concepts within or behind the visualizations. Despite the increasing difficulty, their research did not address higher-level cognition. Mineshima et al. [15] explored the efficacy of Euler diagrams in syllogistic reasoning and in checking the invalidity of an inference. They found that the better performance with Euler diagrams derived from the explicit visual representation of the relational structure of elements.

Although the effect of visual representation on inference-making has been studied, little is known about the influence of visual conventions on perception and inference in visualizations. In addition, whether the influence is equivalent for expert and naïve users remains to be studied. This study therefore focuses on the effect of conventional metaphors on capturing, analyzing and reasoning about information, and compares experimental data between experts and novices. The question matters to visualization designers who want to create more effective visualizations and a more enjoyable interactive user experience.

3 Experiment Method

The study is based on a taxonomy of visual conventions that includes color, spatial metaphor and some guidelines from Gestalt theory, such as adjacency. These conventions are integrated into visualizations, each of which describes the context of a certain topic.

3.1 Participants and Equipment

Twenty participants were recruited from the Department of Industrial Design and the Department of Mechanical Automation at Southeast University. A classification procedure similar to Lee’s [8] was administered to assign participants to an expert group or a novice group.

Three questions were asked: (1) Have you ever had experience of reading diagrams or InfoViz (Information Visualization) before? (2) Have you ever built diagrams in a data analysis platform, like Excel or Tableau? (3) Do you know some principles of interface design, including information encoding and data visualization, or combine those guidelines into your practice? Question (1) was designed to ensure that every participant had experience in reading diagrams, so that reading experience would not be a confounding variable. Question (2) was set to ensure that participants could distinguish between the two kinds of visualizations: each time they chose a diagram format in a data analysis platform, they were likely to have kept in mind the relationship between the data category and the visual representation format. Question (3) classified participants into two groups: those with know-how in designing visualization interfaces formed the Expert Group, and those with no prior experience in designing visualizations formed the Novice Group.

There were 10 participants in the Expert Group and 10 in the Novice Group. Their ages ranged from 19 to 27 (Mean1 = 23.1, SD1 = 1.66, Mean2 = 21.7, SD2 = 1.77). All participants had normal or corrected-to-normal vision, without color blindness or color weakness.

The experiment was carried out in the Ergonomics Lab of Southeast University under normal lighting conditions (40 W daylight continuous current tungsten lamp). All stimuli were presented, and both behavioral and eye movement data were collected, with Tobii Studio 3.3.0. The visual stimuli were displayed on a 17 in. CRT monitor (1280 × 1024 pixels) connected to a computer with a 3.0 GHz CPU.

3.2 Procedure

Each participant completed two blocks: one consisted of plain visualizations with no conventional representations, and the other consisted of visualizations with visual conventions. A randomized block design was adopted to counteract the influence of block sequence. During the experiment, eye movement data were recorded with a Tobii X2-30. Eye-tracking indices, including total fixation count and fixation duration, were analyzed statistically. To determine whether participants had made correct choices, their answers were recorded as multiple-choice responses.

Experimental tasks were chosen according to the three main purposes or actions involved in interacting with information graphics [16], as shown in Fig. 3. Participants completed four tasks on each stimulus, and the four tasks were set according to these three actions. Example task instructions are listed in Table 1. In Task One, participants were instructed to locate an exact point in the visualization. In Task Two, simple comparisons between two values were required. In Task Three, they were instructed to compare multiple pairs of values and even the tendency of a data series, which is more demanding and complicated than Task Two. In Task Four, participants were required to make inferences and extract more abstract and latent information from the graphics. The complexity of the four tasks increases progressively.

Fig. 3. Hierarchy of tasks in the interaction with visualization

Table 1. Examples of task instructions and the corresponding action categories

Additionally, the visualizations used in our experiment were carefully designed to balance both perceptual and cognitive load across the stimuli. All the information depicted in the visualizations was drawn from daily life to ensure that participants were familiar with the contexts, such as the precipitation of City A or the citation counts of research topics in a certain journal. However, participants could hardly predict or infer the answers from prior knowledge without viewing the stimuli.

There were 56 trials in total across the two blocks, 28 in each. Each stimulus had four trials, one for each of the four tasks. The flow of one core experimental unit is shown in Fig. 4. The task instruction was presented until the participant pressed the button to indicate that they had finished reading and fully comprehended it. Then a commentary about the visualization was displayed, providing background information on the visualization that followed. A fixation marker followed, lasting 1000 ms. The diagram was then presented with no time limit until a response was made. Multiple-choice options followed to record the answer to each question. All 28 stimuli in each section were pseudo-randomized to ensure that, in each pair, the stimulus with visual conventions had exactly a 50% chance of being displayed before the plain stimulus, and vice versa. After participants finished each task, a 5 min break was given to ease tiredness.
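The trial arithmetic above can be illustrated with a short sketch. It assumes seven stimuli per block (28 trials divided by four tasks per stimulus) and alternates the block order by participant index so that each order is used equally often. Both choices, and all names below, are illustrative assumptions rather than the authors' actual randomization scheme.

```python
import random

TASKS = ("task one", "task two", "task three", "task four")
CONDITIONS = ("plain", "conventional")
STIMULI_PER_BLOCK = 7  # assumption: 28 trials per block / 4 tasks per stimulus


def build_block(condition, rng):
    """Return the 28 trials of one block, with the stimulus order shuffled."""
    stimuli = list(range(1, STIMULI_PER_BLOCK + 1))
    rng.shuffle(stimuli)
    return [(condition, stimulus, task) for stimulus in stimuli for task in TASKS]


def build_session(participant_index, rng):
    """Return both blocks; alternate the leading condition across participants."""
    order = CONDITIONS if participant_index % 2 == 0 else CONDITIONS[::-1]
    return [trial for condition in order for trial in build_block(condition, rng)]


session = build_session(participant_index=0, rng=random.Random(1))
print(len(session))  # 56 trials: 2 blocks x 7 stimuli x 4 tasks
```

Alternating by index is only one way to balance block order; drawing the order at random for each participant would match a 50% chance per participant but not guarantee an even split.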

Fig. 4. Experimental flow for one block

4 Results

Behavioral, subjective and eye-tracking data were collected to compare participants’ performance on the four individual tasks for visualizations with and without visual conventions. ANOVAs and paired t-tests were used to test whether the two experimental conditions differed significantly, and whether performance varied significantly between the expert and novice groups.

4.1 Behavioral Data

Behavioral data include response time and accuracy rate under each condition for the two groups. Among the 20 subjects, two subjects’ response times for Task Four with plain visualizations were rejected because they asked questions about the task instructions while the diagrams were being presented. Their response times were replaced by the average response time of their group.
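The replacement of rejected response times can be sketched as simple mean imputation. The sketch assumes that the group mean is computed over the remaining valid values only, which the text does not specify. The function name and the sample values are hypothetical.

```python
from statistics import mean


def impute_with_group_mean(times):
    """Replace rejected entries (None) with the mean of the remaining values."""
    valid = [t for t in times if t is not None]
    group_mean = mean(valid)
    return [group_mean if t is None else t for t in times]


# Hypothetical Task Four response times (seconds) for one group;
# None marks a rejected trial.
example = [12.0, None, 15.0, 9.0]
print(impute_with_group_mean(example))  # [12.0, 12.0, 15.0, 9.0]
```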

Response Time.

As shown in Fig. 5, both the expert and novice groups responded faster with visualizations without visual conventions in tasks one and two. These two tasks involve locating a single data point or fewer data points than Task Three. In tasks three and four, however, performance with diagrams containing visual conventions was better than with diagrams lacking them. These two tasks involve comparing multiple data points (N > 2) along the entire time span and reasoning toward an abstract conclusion.

Fig. 5. Response time for two groups of participants under different experimental conditions

A 2 × 4 repeated-measures ANOVA (two visualization categories, four task categories) was carried out to further analyze the differences. In the expert group, the main effect of visualization category on response time was not significant (F = 3.552, p = 0.064, α = 0.05, η² = 0.049), but task category had a significant effect on response time (F = 8.962, p < 0.001, α = 0.05, η² = 0.115). The interaction between visualization and task categories was significant (F = 18.180, p < 0.001, α = 0.05, η² = 0.209). In the novice group, the main effects of visualization category and task category on response time were both significant (\( F_{1}=4.723,\ p_{1}=0.033,\ \eta_{1}^{2}=0.064;\ F_{2}=11.130,\ p_{2}<0.001,\ \eta_{2}^{2}=0.139;\ \alpha=0.05 \)). The interaction between visualization and task categories was also significant (F = 11.425, p < 0.001, α = 0.05, η² = 0.142). These ANOVA results indicate that whether a visualization contains visual conventions considerably affects novices’ response time. Both expert and novice participants responded faster with visual conventions than without them in the last two tasks, which supports the conclusion that conventional encodings facilitate more complex tasks such as comparison and knowledge inference. It should be noted that when the task is to locate specific data points, that is, when the task is comparatively easy, visualizations without conventional encoding are more conducive to performance.
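As a consistency check on the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom. The degrees of freedom are not reported in the text. The sketch below assumes 1 for visualization category, 3 for task category and for the interaction, and error degrees of freedom of 69 and 207. Under those assumptions the expert-group values above are reproduced to three decimals.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumed degrees of freedom (not reported in the text): 1 for visualization
# category, 3 for task category and the interaction; error df of 69 and 207.
print(round(partial_eta_squared(3.552, 1, 69), 3))    # 0.049
print(round(partial_eta_squared(8.962, 3, 207), 3))   # 0.115
print(round(partial_eta_squared(18.180, 3, 207), 3))  # 0.209
```

The same assumed degrees of freedom also reproduce the novice-group values (0.064, 0.139 and 0.142), which suggests, but does not confirm, that the reported η² is partial eta squared.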

Figure 5(a) and (b) show the response times of expert and novice participants, respectively, under the two kinds of visualizations.

Accuracy Rate.

Accuracy rates under the different conditions are shown in Fig. 6. The accuracy rate of both the expert and novice groups gradually declined as task difficulty increased for visualizations without visual conventions. With convention-conforming diagrams, however, the accuracy rate rose from task one to task three. In task four, the accuracy rate of both groups dropped markedly, but the expert group still outperformed the novice group in tasks three and four when the diagram contained visual conventions. According to Fig. 6, the accuracy rate of novice participants in task three was higher with visually conventional diagrams than with plain visualizations, but the discrepancy narrowed in task four. Notably, the accuracy rate of the expert group in task four was higher with visual conventions than in the convention-free section. A paired t-test indicates that experts performed significantly better in task four with conventionally encoded representations (t = 2.813, df = 69, Sig.(two-sided) = 0.006, α = 0.05). For novice participants, the difference in task three did not reach significance (t = 1.270, df = 69, Sig.(two-sided) = 0.208, α = 0.05).
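The paired t-test used for these comparisons can be sketched with the Python standard library. The statistic is the mean of the pairwise differences divided by its standard error, with n − 1 degrees of freedom, so the reported df = 69 corresponds to 70 pairs. The function name and the sample scores are hypothetical, and the sketch returns only t and df, not the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired t-test on equal-length samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Sample standard deviation (n - 1 denominator) of the differences.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical scores for four pairs (not data from the study).
t, df = paired_t([1, 2, 3, 4], [0, 1, 1, 2])
print(round(t, 3), df)  # 5.196 3
```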

Fig. 6. Accuracy rate for two groups of participants under different experimental conditions

For expert participants, whether the visualization was plain or visually conventional made a significant difference to the accuracy rate (F = 8.357, p = 0.005, α = 0.05), but the difference was not significant for novices (F = 0.079, p = 0.780, α = 0.05). The interaction between representation category and task followed the same pattern: it was significant for the expert group (F = 5.884, p = 0.018, α = 0.05) but not for the novice group (F = 2.162, p = 0.100, α = 0.05).

The results indicate that visual conventions in InfoViz have little effect on novice participants’ comprehension, but significantly help expert participants make correct inferences. When locating and comparing multiple data points is required, visualizations with visual conventions help experts acquire information, reason about new knowledge and make decisions. Taken together with the response time results, novice participants, like expert participants, made more accurate choices in less time when the visualization contained conventions. This means that participants’ overall performance improved in the conventional section. In other words, conventions help participants perceive information and make inferences. If the task is relatively simple, however, such as locating a single data point, participants perform better with plain visualizations. We hypothesize that the lower performance in tasks one and two when visual conventions were displayed is related to unfamiliarity and increased perceptual load.

4.2 Subjective Data

Each participant filled in a questionnaire after completing the entire experiment. The questionnaire listed five questions: (a) Which section of representation facilitated your acquisition of information? (b) Which task was more difficult for you to answer? Questions (c)–(e), corresponding respectively to task one, tasks two and three, and task four as listed in Fig. 3, were set to collect feedback on confidence in task completion across the two groups.

Table 2 shows that the large majority of participants rated task four as the most difficult task to complete, which validates the task arrangement in our research. Figure 7 shows the results of Questions (c)–(e). Most participants reported that plain visualizations facilitated their information gathering and decision-making in task one. As task difficulty increased, a growing proportion of participants preferred the conventionally encoded visualizations to the plain ones, which supports the merit of visual conventions from a subjective perspective.

Table 2. Subjective difficulty evaluation of four tasks across participant groups

Fig. 7. Confidences of answering for participants with different visualizations

4.3 Eye-Tracking Data

Total Fixation Duration (TFD).

Total fixation duration measures the sum of the durations of all fixations within an area of interest (AOI) [17]. The index is often used to understand and evaluate human cognitive processes and mental load. The mean total fixation duration of participants under the two visualization conditions is presented in Fig. 8. For both groups, the figure shows that total fixation duration in tasks three and four was shorter with visual conventions than with without-convention diagrams. ANOVA results indicate that the main effects of visualization category and task category on total fixation duration are significant across the two groups (\( F_{1}=16.782,\ p_{1}<0.001,\ \eta_{1}^{2}=0.015;\ F_{2}=19.394,\ p_{2}<0.001,\ \eta_{2}^{2}=0.050;\ \alpha=0.05 \)). Paired t-tests support a significant difference in TFD between tasks one and two on the one hand and tasks three and four on the other (task one vs. tasks three and four: Sig.(two-sided) < 0.001; task two vs. tasks three and four: Sig.(two-sided) < 0.001; task three vs. task four: Sig.(two-sided) = 0.055; α = 0.05). No significant effect of participant group was found, despite the gap depicted in Fig. 8.

Fig. 8. Mean of total fixation duration for two groups in two conditions of diagrams

These results are consistent with the hypothesis that visually conventional diagrams can markedly reduce participants’ cognitive workload, since total fixation duration generally gives a good indication of the attention allocated to an object. In addition, cognitive load increases from data localization to knowledge inference. The interaction between visualization and task categories is also significant (F = 9.724, p < 0.001, α = 0.05, η² = 0.026), which supports the hypothesis that, as task categories vary, whether a diagram contains conventional encodings matters for information acquisition and for higher-level cognitive activities such as reasoning and comparing multiple data points. According to the ANOVA results, the main effect of participant group on TFD is not significant (p = 0.988, α = 0.05), and neither is the interaction between participant group and diagram category (p = 0.848, α = 0.05). This non-significance suggests that participants’ knowledge background has little influence when the visualization contains visual conventions. That is, visual conventions can be perceived equally by expert and novice users. The finding reinforces the definition of visual conventions as perceptual agreements shared across users.

Although visualization format was entered as a covariate, we were surprised to find that it also had an effect on total fixation duration (F = 5.970, p = 0.015, α = 0.05, η² = 0.005). In future work, we may focus on the interaction between different forms of visualization and visual conventions in information perception and knowledge inference.

Fixation Count (FC).

Fixation count measures the number of times a participant fixates on an AOI [17]. The mean fixation counts of expert and novice participants are shown in Fig. 9. Mean fixation count varied across conditions in the same pattern as total fixation duration in Fig. 8. Specifically, mean fixation count was markedly higher when diagrams lacked visually conventional encodings than when they contained them. The advantage of visual conventions became more pronounced in tasks three and four, which involve more complex task features, and weakened in tasks one and two. We attribute the disadvantage of visual conventions in tasks one and two to the salience of the visual encodings. Some participants reported after the experiment that their attention was drawn to the most salient elements, even when those elements had nothing to do with the task. For example, the largest bubble signified the highest click count, that of Keyword A in June, whereas participants were instructed to search for the click count of Keyword B in September. The salience of the largest bubble thus became a distractor.
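The two eye-tracking indices used in this section, total fixation duration and fixation count, can both be derived from a list of fixations and an AOI. The sketch below assumes a rectangular AOI in pixel coordinates and a minimal fixation record. These data structures are illustrative and do not represent the export format of Tobii Studio.

```python
from typing import NamedTuple


class Fixation(NamedTuple):
    x: float            # horizontal gaze position in pixels
    y: float            # vertical gaze position in pixels
    duration_ms: float  # fixation duration in milliseconds


class AOI(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation):
        """True if the fixation falls inside the rectangle (edges included)."""
        return (self.left <= fixation.x <= self.right
                and self.top <= fixation.y <= self.bottom)


def aoi_metrics(fixations, aoi):
    """Return (total fixation duration in ms, fixation count) inside the AOI."""
    inside = [f for f in fixations if aoi.contains(f)]
    return sum(f.duration_ms for f in inside), len(inside)


# Hypothetical fixations on a 1280 x 1024 display.
fixations = [
    Fixation(100, 100, 200.0),
    Fixation(150, 120, 300.0),
    Fixation(900, 800, 250.0),
]
tfd, fc = aoi_metrics(fixations, AOI(left=0, top=0, right=640, bottom=512))
print(tfd, fc)  # 500.0 2
```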

Fig. 9. Mean fixation counts for two groups in two conditions of diagrams

To further support this conclusion, an ANOVA was conducted. The main effects of visualization category and task category are significant (\( F_{1}=19.714,\ p_{1}<0.001,\ \eta_{1}^{2}=0.018;\ F_{2}=32.386,\ p_{2}<0.001,\ \eta_{2}^{2}=0.081;\ \alpha=0.05 \)). The interaction between these two factors on mean fixation count is also significant for the two participant groups (F = 16.663, p < 0.001, α = 0.05, η² = 0.043). The significant interaction shows that the influence of conventionally consistent diagrams varies as task features change.

According to the theory that information can be processed further only once it has been fixated, the higher fixation count in plain visualizations suggests that more information had to be processed in depth than in visualizations with visual conventions. The more information there is to process, the higher the cognitive load, regardless of participant group, since the main effect of participant group is not significant (F = 0.003, p = 0.960, α = 0.05, η² = 0.000). Another explanation for the fewer fixations in convention-conforming visualizations is that visual conventions provide more guidance for visual search and high-level cognition. As shown in Fig. 6, accuracy rates in task three are higher with conventions than without. In other words, participants made fewer fixations but performed better with visual conventions. For the two groups, the FC differences between the two conditions in task two are significant according to paired t-tests (Sig 1.(two-sided) = 0.012, Sig 2.(two-sided) = 0.019, α = 0.05).

5 Conclusion

We measured performance and cognitive load through behavioral and eye movement data. Eye tracking is a good complement to behavioral experiments, since it reflects users’ cognitive load both qualitatively and quantitatively. Future research will examine in depth the effect of specific conventional representations on visual perception and new knowledge generation. For the plain and visually conventional visualizations in this study, the following conclusions can be drawn:

  • Visualizations containing visual conventions can facilitate searching for multiple data points, comparison and new knowledge inference. However, the advantage of visual conventions over plain visualizations does not emerge when the task is relatively simple, such as searching for a single data point. For more complex tasks, the advantage of conventional visualizations manifests itself in shorter reaction time, higher accuracy rate, shorter total fixation duration and fewer fixations.

  • Visual conventions can be perceived equally by the expert and novice groups. That is, regardless of users’ prior knowledge, visual conventions can enhance perception and inference performance. Meanwhile, cognitive load can be significantly reduced when interacting with visually conventional representations.

  • Compared with plain visual representations, visual conventions in visualizations provide better visual guidance, which in turn improves users’ confidence in their answers. This advantage in answering confidence is conspicuous in the more complex tasks. In simple tasks, however, conventional encoding elements may distract users’ attention to some extent.

6 Application

Applications of visual conventions are not restricted to static information visualization design. They can also be used in many areas of human-computer interaction. For example, in data visualization, where a sheer volume of data coupled with high-velocity variance is to be presented, visual conventions can help end users mine new knowledge from what is presented dynamically. In addition, visual conventions can be considered an effective aid to layout design in digital interfaces.