Identifying usability and fun problems in a computer game during first use and after some practice
Introduction
Testing products with representative users is one of the core aspects of user-centred design. While it is possible to perform a test to determine quantitative measures like efficiency, effectiveness, and satisfaction (ISO, 1998), another common goal is to identify specific parts of a system that cause users trouble (Hertzum and Jacobsen, 2001) in order to improve the system. This is often called formative testing (Barnum, 2002). In this paper the term ‘user test’ refers to this latter practice of identifying problems by observing actual users using a product. The term ‘user test’ here does not refer directly to assessing the quantitative measures, although detecting and solving the problems in a product should eventually lead to increases in efficiency, effectiveness, and satisfaction.
Ideally, a user test will reveal as many aspects of a product as possible that cause users trouble and that need to be modified. However, research has shown that the set of identified problems depends both on the users taking part in the user test (Nielsen, 1994) and on the evaluators analysing the data (Jacobsen, 1999; Hertzum and Jacobsen, 2001). Usually, products are tested with users who encounter them for the first time, but users' behaviour can be expected to change once they have become more familiar with a product. This difference might also result in different sets of identified problems, because some problems may have been overcome while other problems may arise. Often there is a pattern in the kinds of problems that are likely to be overcome and the kinds that will arise as users become more familiar with a product (Prümper et al., 1992).
When performing formative evaluations in practice, not only detecting problems but also separating severe problems from insignificant ones is important, because resources may not be available to correct all identified problems (Jacobsen, 1999). However, just as the numbers of certain types of detected problems may change, the overall severity estimates of problems may also change when users become more familiar with a game. Consequently, the list of the most important problems to fix might differ between a test of first use and a test of more familiar use.
Depending on the goal of the user test and the intended use of the product, these differences in the numbers and severity of certain problem types could have consequences for the way the user test is organized. For user tests of computer games with young children this seems especially relevant, because children often get help from their parents when playing a game (MPFS, 2003), or they play with somebody else (a sibling, friend, or parent) (Nikken, 2002). It is therefore likely that they will easily overcome some usability problems while others arise over time. Furthermore, computer games should be fun to play, not just the first time but preferably also after a while. Testing a game with children after they have become familiar with it might give a better, or at least different, idea of the problems that need to be fixed than testing its first use.
The study described in this paper examines the differences in the numbers of identified problems of different types, and in their severity, when testing adventure-type computer games with young children, between 5 and 7 years old, during first use and after they have practiced twice with the game for about half an hour. Although real practice effects, as in learning to drive a car or to do complex arithmetic, cannot arise in such a short time, many of the things that young children have to learn in computer games are much simpler. For example, children may not know at first that the purpose of a certain subgame is to catch all the blue flies, because the explanation of the subgame is unclear. Once they have figured this out, they will know what to do the next time they play the same subgame. This type of knowledge development belongs to the simplest types of learning in the cognitive domain according to Bloom's taxonomy of educational objectives (Bloom et al., 1964). In contrast to the more complex types of learning in this taxonomy, this type of learning can probably be attained by playing a game twice for half an hour, especially since children only have to recognize the way to play the subgames.
The current analysis was performed on an existing set of observations (Bekker et al., 2004) to identify the specific problems that can be detected when testing first use or more familiar use, within two theoretical frameworks: human problems in computer use and intrinsic motivation. For this purpose, two other evaluators used Observer logging software (Noldus, 2002) to re-examine the videotapes collected in the first study in much more detail. Subsequently, the method used to test the hypotheses is given, together with reliability analyses of both the problem detection procedure and the problem classification. The results show which types of problems are found more easily during first use, which are found after children have practiced with a game, and what the effects are on the severity estimates of problems and on the ranking of the most important problems. Furthermore, the changes in efficiency, effectiveness, and satisfaction are discussed. Finally, these results are discussed in terms of their generalizability and practical use.
Problems in computer game play
Zapf and colleagues (Frese and Zapf, 1991; Zapf et al., 1992) proposed a taxonomy of problems occurring in work with office computers, combining the work of Reason (1990), Norman and Draper (1986), Rasmussen (1982) and Hacker (1986). Norman's model of user-system interaction was created to describe interactions of humans with all sorts of systems and can easily be applied to games as well (Barendregt and Bekker, 2004). Rasmussen's classification refers to the degree of conscious control exercised by the user
Intrinsic motivation in computer games
Because pleasure and fun are key factors in a computer game (Pagulayan et al., 2003), problems that undermine fun form another category of problems that can occur and are therefore worth examining. In this paper, fun problems are defined as follows:
- Fun problems: problems that occur when there are aspects of the game that make it less motivating to use, even though they are not usability problems. For example, the music can be too scary, the characters can be
Usability problems
When children practice with a game they change from complete novices into more experienced players of the game. Although eight different types of usability problems have been defined, Zapf et al. (1992) found only three significant differences between novices and experts for adults using an office application:
- Experts have significantly fewer knowledge problems than novices.
- Experts have significantly fewer thought problems than novices.
- Experts have
Participants
To test the hypotheses and answer the other research questions, an experiment was run with 25 children from groups three and four (grades one and two) of De Brembocht, an elementary school in Veldhoven, The Netherlands. This school is situated in a neighbourhood mainly inhabited by people who received higher education and earn more than minimum wage. All children were between five and seven years old (mean age 84 months, S.D. = 5.5 months); 8 were girls and 17 were boys. They were recruited by means of a
Reliability of the problem detection
Two evaluators analysed eight of the 25 videotapes from the first session; all 25 videotapes of the last session were analysed by two evaluators. To check the inter-coder reliability of the two evaluators, any-two agreement measures were calculated for the results of the individual breakdown coding and for the lists of problems, as proposed by Hertzum and Jacobsen (2001):

any-two agreement = |P1 ∩ P2| / |P1 ∪ P2|

In this equation, P1 and P2 are the sets of problem indications or the sets of problems detected by the first and the second evaluator, respectively.
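For two evaluators, the any-two agreement thus reduces to the size of the intersection of the two problem sets divided by the size of their union (with more than two evaluators the measure is averaged over all pairs). The following sketch, using hypothetical problem identifiers, illustrates the computation; it is not the authors' own analysis code.

```python
def any_two_agreement(p1: set, p2: set) -> float:
    """Agreement between two evaluators' problem sets: |P1 n P2| / |P1 u P2|."""
    union = p1 | p2
    if not union:
        return 1.0  # two empty sets agree trivially
    return len(p1 & p2) / len(union)

# Hypothetical problem lists produced by two evaluators for the same videotape:
evaluator1 = {"B1", "B2", "B3", "B5"}
evaluator2 = {"B2", "B3", "B4", "B5"}
print(any_two_agreement(evaluator1, evaluator2))  # 3 shared of 5 in total -> 0.6
```

Note that the measure penalizes both problems found by only one evaluator and problems found by only the other, which is why it is stricter than simple percentage overlap with either individual list.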
Types of problems in both test sessions
In the first test session 98 problems were identified and in the last test session 115 problems. The distribution of all unique problems found in the two analysed sessions over the problem categories is given in Fig. 3.
Hypotheses problem types
Most children did not visit exactly the same parts of the game in the first and the last test session; they could visit subgames, story screens, or navigational screens in any order they liked. To test the hypotheses, therefore, only those subgames, story screens, and navigational screens that a child visited in both test sessions were included in the comparison.
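This per-child selection amounts to a simple set intersection of the parts visited in each session. The sketch below illustrates the idea with made-up screen names; it is not taken from the study's materials.

```python
# Hypothetical visit logs for one child: restrict hypothesis testing to the
# game parts (subgames, story screens, navigational screens) that were
# visited in BOTH the first and the last test session.
first_session = {"subgame_flies", "story_intro", "nav_map", "subgame_maze"}
last_session = {"subgame_flies", "nav_map", "subgame_race", "subgame_maze"}

comparable_parts = first_session & last_session
print(sorted(comparable_parts))  # ['nav_map', 'subgame_flies', 'subgame_maze']
```

Only problems occurring in these shared parts can be compared between sessions; parts visited in a single session would otherwise bias the problem counts.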
Considerations about the unconfirmed hypotheses
Contrary to the expectations, the number of knowledge inefficiencies was not significantly higher for the first test session than for the last. In both test sessions this was caused by the low number of inefficiencies, all of which were identified by only a few children. Inefficiencies can sometimes be hard to observe, for two reasons: first, the observers have to be knowledgeable about all the possibilities in the game; second, it must be clear what the child is trying to achieve.
Conclusion
The experiment described in this paper examined the differences in the outcomes of a test when children play a game for the first time and when they have become more familiar with it. The experiment showed that even after only 1 hour of practice with a game, children were able to finish significantly more subgames in the same time, increasing their efficiency in the game. Children were also able to finish a higher percentage of the subgames that they started, increasing the effectiveness.
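The two measures mentioned here can be expressed straightforwardly: efficiency as subgames finished per unit of playing time, effectiveness as the percentage of started subgames that were finished. The sketch below uses invented session numbers purely for illustration.

```python
def efficiency(subgames_finished: int, minutes_played: float) -> float:
    """Subgames finished per hour of play."""
    return subgames_finished / (minutes_played / 60.0)

def effectiveness(subgames_finished: int, subgames_started: int) -> float:
    """Percentage of started subgames that were finished."""
    return 100.0 * subgames_finished / subgames_started

# Hypothetical data for one child, first vs. last half-hour session:
print(efficiency(4, 30), effectiveness(4, 6))  # 8.0 subgames/hour, ~66.7%
print(efficiency(7, 30), effectiveness(7, 8))  # 14.0 subgames/hour, 87.5%
```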
Acknowledgements
This research was funded by the Innovation-Oriented Research Programme Human-Machine Interaction (IOP-MMI) of the Dutch government. We would like to thank Silvia Crombeen and Mariëlle Biesheuvel for conducting the test sessions. We would also like to thank the children and teachers of primary school de Brembocht for taking part in our research, and we would like to thank Prof. Dr. G.W.M. Rauterberg for bringing the relevant book of Frese and Zapf (1991) to our attention. Finally, we would like
References (46)

- Nielsen, J., 1994. Estimating the number of subjects needed for a thinking aloud test. International Journal of Human–Computer Studies.
- Anderson, J.R., 1983. The Architecture of Cognition.
- Barendregt, W., Bekker, M.M., 2004. Towards a framework for design guidelines for young children's computer games. In: ...
- Barendregt, W., Bekker, M.M., Speerstra, M., 2003. Empirical evaluation of usability and fun in computer games for ...
- Barendregt, W., Bekker, M.M., Bouwhuis, D.G., Baauw, E., 2006. Predicting effectiveness of children participants in ...
- Barnum, C.M., 2002. Usability Testing and Research.
- Bekker, M.M., Barendregt, W., Crombeen, S., Biesheuvel, M., 2004. Evaluating usability and fun during initial and ...
- Curiosity and Exploration. Science (1968).
- Bloom, B.S., et al., 1964. Taxonomy of Educational Objectives.
- Clanton, C., 1998. An interpreted demonstration of computer game design. In: Proceedings of the Conference on CHI 98 ...
- Intrinsic rewards and emergent motivation.
- Intrinsic Motivation.
- Frese, M., Zapf, D., 1991. Fehler bei der Arbeit mit dem Computer. Ergebnisse von Beobachtungen und Befragungen im Bürobereich (Errors in Working with Computers: Results of Observations and Interviews in the Office Field).
- Mental Models.
- Hacker, W., 1986. Arbeitspsychologie (Work Psychology).
- Hertzum, M., Jacobsen, N.E., 2001. The evaluator effect: a chilling fact about usability evaluation methods. International Journal of Human–Computer Interaction: Special Issue on Empirical Evaluation of Information Visualisations.
- Comparison of evaluation methods using structured usability problem reports. Behaviour & Information Technology.