1 Introduction

Multi-touch tabletops have the potential to enhance collaborative learning in the classroom by fostering a playful, enjoyable environment that promotes communication [6], awareness of others [10] and equity of participation [24]. Nonetheless, the conventional multi-touch approach of enabling interaction through surface instrumentation does not provide the portability and economic feasibility needed for widespread classroom deployments. In this context, Wilson's proposal of using a front-mounted depth camera to furnish any physical surface with interactive capabilities [25] becomes promising. Explorations building on this initiative have shown that touch input can in fact be achieved on everyday surfaces [13, 17].

Although the results of these studies are highly indicative of portable tabletops' potential for the classroom, further explorations are still required to fully understand how this technology can support typical classroom activities such as sketching, designing and painting. Former research has shown that pen and touch tabletop support can boost the rich interactions required for engaging in such creative tasks [5, 11]; while the dominant hand engages in fine-precision actions with the pen, the non-dominant hand can lend itself to activities that do not require high levels of dexterity, such as zooming, rotating and tapping [12, 14]. In fact, a previous experiment [5] demonstrated that the pen and touch approach offers good support for drawing as a means for problem solving. Moreover, supporting pen and touch can enable students to move seamlessly from the physical to the digital world [3], decreasing the cognitive load users face when shifting from one world to the other. Despite all the advantages of the pen and touch approach, its technological support still poses practical challenges for low-cost portable tabletops; most initiatives in the area have focused on supporting pen and touch either on high-cost interactive surfaces such as Microsoft Surface Hub and Wacom, or on non-portable low-cost ones such as the one proposed in [11].

Furthermore, widespread tabletop classroom deployments should offer support for user differentiation: touch/stroke identification that allows a tabletop to associate objects with users' identities. This ability becomes vital for supporting features of educational activities such as enforced turn-taking and the analysis of group dynamics. Previous studies that required user differentiation for pen-based multi-touch surfaces have mostly relied on commercial high-cost devices that do not work on portable projectable tabletops; some examples are the Mimio pen [8], the Promethean Activeboard pen [15], and the Anoto pen.

Additionally, former research proposing new pen-based approaches to enable user identification on portable tabletops has not aimed at low-cost solutions [4]. This lack of low-cost explorations in the area of differentiating users' pen strokes on portable tabletops can prevent this technology from effectively supporting educational experiences in the classroom. The current study posits that low-cost portable tabletop initiatives need to better address pen and touch support, as well as user stroke identification, in order to become truly ubiquitous in educational settings.

In this paper, we present a low-cost portable multi-touch pen-based tabletop that recognizes the pen strokes of up to three different users using a depth camera and infrared pens. The total cost of the hardware solution is around USD $450. In line with validation approaches used by similar works on projectable tabletops [7, 13], we conducted a user study to quantify the proposed system's support for simple gestures and drawings, as well as a stress test to gauge user differentiation accuracy. Our findings show that the tabletop has strong technological potential to support pen and touch interactions; the proposed solution allows users to draw and move objects with almost no difficulty and with an acceptable response time (0.1–0.13 s). Moreover, the proposed system was able to differentiate users' pen strokes with an error rate lower than those reported in similar works [7, 20]. Nevertheless, the hardware used for recognition tasks needs to be revisited before a full deployment of the system, so that some pen-based actions (drawing complex figures) and touch-based gestures (rotating objects) can be executed more reliably.

This paper is structured as follows: first, related work is presented and the proposed low-cost portable multi-touch tabletop is described. Then, the research context, evaluation and corresponding results are detailed. Finally, conclusions are presented along with reflections on further research.

2 Related Work

Numerous multi-touch technologies enable tabletops to sense touches, among them capacitive, optical, LCD and computer-vision-based approaches. The latter is particularly popular because it can be enabled by low-cost devices [9]. Nonetheless, most of the tabletops that rely on this technology require a fixed-position table to allow physical interaction [19]. Early work presented by Wilson [25] explored multi-touch detection on a non-flat surface using a Kinect sensor as a depth camera. This work triggered other explorations of how a depth-sensing camera can deliver interactive capabilities to any physical surface: HuddleLamp is a desk lamp with an integrated low-cost RGB-D camera that detects and tracks smaller displays on tables [22], and OmniTouch is a wearable projection system that enables surfaces, such as an individual's body, to become interactive [13].

Additionally, there have been initiatives to enhance depth cameras' touch detection mechanisms; Klompmaker et al. proposed dSensingNI [16], a framework for depth-camera sensing that tracks user fingers and hand palms to enable recognition of gestures such as grasping, grouping and stacking; and Murugappan et al. [19] proposed an extended multi-touch approach for low-cost tabletops that can recover the finger, wrist and hand posture of the user. Our work differs from this latter initiative in the approach used to validate the precision of pen and touch interactions; while they focus on gestures used to control actions (e.g. navigation), we validate gestures that allow the manipulation of 2D objects.

In the area of pen-and-touch support for interactive surfaces, research has mostly been conducted on non-portable tabletop solutions such as capacitive multi-touch tabletops [14, 15]. Likewise, most of the approaches proposed to enable pen and touch interactions cannot be considered low-cost solutions; well-known examples are the Anoto pen used for tabletop interactions, Wacom, and Microsoft Surface Hub. In addition, current digital pen technology still exhibits basic issues that hinder natural interaction: existing devices are bulky and restrict the type of movements an individual can perform with them [3].

In the context of vision-based systems, few initiatives have used the visual signature of the pen for user differentiation and recognition. For example, Qin et al. describe the implementation of Ppen, a pressure-sensitive pen with an active IR-emitting tip, a laser emitter for remote interaction, three buttons and an RF module for user identification and data transmission [20]. However, this device relies on high-cost technologies. Among the few studies that seek to differentiate users' pen strokes is the work of Chen et al., who proposed IR pens as writing tools on a low-cost tabletop [7]. Nonetheless, their solution does not support hand touch. Although our work builds on theirs, we propose a less bulky device that allows drawing on a multi-touch low-cost portable surface.

3 Proposed Solution: Portable Tabletop

In order to achieve a low-cost portable tabletop design that supports collaborative pen and touch interactions as well as user differentiation, the proposed solution implements the following design guidelines:

  • Support simultaneous actions of up to three users: As suggested by [23], to engage users in tabletop activities, the surface should support input from multiple users.

  • Allow users to freely move and regulate their workspace: Xambó [26] warned against the harmful effects that tabletops with territorial constraints can have on creativity and free collaborative activities. Following the recommendations presented in [15], this solution lets users choose where to work, just as they would on an ordinary table.

  • Allow pen and touch input: A pen and touch tabletop can boost creative classroom activities by enabling digital interactivity while supporting behaviors found in the pen-and-paper approach [2].

  • Portability and low cost: This guideline aims at enabling tabletop solutions to integrate easily into educational settings.

  • Distinguish users' work: Awareness of individual contributions is key to successful collaborative work [18].

To support these requirements, a combination of hardware and software was deployed. Afterwards, the solution was evaluated in terms of the accuracy of users' pen stroke identification, and a user test of the proposed solution was conducted.

3.1 Hardware Solution

The hardware solution is built using low-cost components: (1) up to four color-tracked pens, (2) two web cameras for pen tracking (60 fps), (3) one Kinect sensor (version 1, 30 fps) for touch interaction and (4) a projector for presenting the image of a canvas where users can interact with the system. The color-tracked pens were constructed by placing an infrared LED on each pen's tip and a colored ball on each pen's top. The projector, the Kinect sensor and the web cameras are located above a flat surface, which becomes the projected area of interaction. Depending on the projector's capabilities, this area can be expanded. Figure 1 depicts the proposed solution and a scheme of the current settings of the system, which covers a projected area of 43 in.

Fig. 1. Prototype and scheme of the proposed solution

3.2 Software Solution

The software solution had to address several challenges to enable the tabletop’s low-cost and portable approach:

  • Low performance of pen identification due to the low resolution of the web cameras.

  • Inaccurate hand tracking due to the low resolution and the related sensor noise of Kinect (version 1).

  • The lack of a centralized controller that can handle the events generated by both pen and touch interactions.

To tackle these issues, we designed a client-server architecture that handles interactions triggered by pen-and-touch inputs. Next, we describe the architecture’s components: a pen-tracking server, a hand-tracking server, and a user interface client component.

  1. Pen-Tracking server component: Its purpose is to recognize and identify each pen by tracking it through infrared (IR) and color web cameras. IR tracking is used to identify pen tips: a binarization process is applied to each frame of the IR camera, which enables the recognition of one or more IR LED lights, and the position and detection time of each IR light source are stored temporarily in memory. Color tracking is used to identify the colored ball located at the top of the pen. In this process, each RGB frame from the color camera is converted to the HSV color space. Then, color filtering is applied to the HSV frame using the HSV values of the colored pens. This process generates the positions of the tracked colors, which are ultimately given to the Camshift algorithm [1]. Camshift is responsible for tracking each colored pen throughout the interaction; in case Camshift loses track of a particular color, the color filtering process is run again to find the corresponding color. Pairs of IR and color points are used to detect a pen, pairing each colored point with its nearest IR point. Once a pen is recognized and identified by its color, a multi-touch event is created and delivered to the system's User Interface client component. (A sketch of this pipeline is given after this list.)

  2. Hand-Tracking server component: Hand tracking is achieved through a scene-capturing process, a depth-image thresholding process, and a blob-tracking process. First, the initial 3D scene of the projection surface is captured. Next, a background subtraction between each frame and the initial 3D scene is calculated; this subtraction reveals 3D objects that were not present in the initial scene. The depth-image thresholding process is then used to detect fingers or hands near the projection table: any object located five centimeters or more above the flat surface is discarded, leaving depth information only about objects close to the surface. This depth information is transformed into a binary image. Since each finger or hand produces a distinct blob in the binary image, this information is suitable for blob tracking. When a blob is detected, the hand-tracking server creates and delivers a multi-touch event. To prevent a pen from being falsely recognized as a hand, the system requires a blob's area to fall between a minimum and a maximum value before confirming it as a hand. (A sketch of these steps is also given after this list.)

  3. User Interface client component: This component is responsible for receiving and interpreting each touch event, storing the status of each object, and rendering the user interface. Touch events coming from the hand-tracking server are used to move objects on the projected surface, while touch events generated by the pen-tracking server are used to draw strokes in the color of the corresponding pen. (A sketch of this dispatching logic follows the tracking sketches below.)
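
The following is a minimal sketch of the pen-tracking pipeline (IR binarization, HSV color filtering, Camshift tracking, and nearest-point pairing), written with OpenCV's Python bindings for illustration only; the actual component was implemented with Openframeworks and OpenCV, and the HSV range, binarization threshold and helper names used here are assumptions.

  import cv2
  import numpy as np

  # Assumed HSV range for one pen's colored ball (the real system keeps one
  # calibrated range per pen color).
  PEN_HSV_LOW = np.array([100, 120, 70])
  PEN_HSV_HIGH = np.array([130, 255, 255])

  def find_ir_tips(ir_frame, threshold=200):
      """Binarize a grayscale IR frame and return centroids of detected LED tips."""
      _, binary = cv2.threshold(ir_frame, threshold, 255, cv2.THRESH_BINARY)
      contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                     cv2.CHAIN_APPROX_SIMPLE)
      tips = []
      for c in contours:
          m = cv2.moments(c)
          if m["m00"] > 0:
              tips.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
      return tips

  def track_pen_color(bgr_frame, prev_window=None):
      """Locate the pen's colored ball via HSV filtering, then track it with Camshift."""
      hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
      mask = cv2.inRange(hsv, PEN_HSV_LOW, PEN_HSV_HIGH)
      if prev_window is None:
          # Color filtering (re)initializes the search window when tracking is lost.
          points = cv2.findNonZero(mask)
          if points is None:
              return None
          prev_window = cv2.boundingRect(points)
      criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)
      _, window = cv2.CamShift(mask, prev_window, criteria)
      return window  # (x, y, w, h) of the tracked color

  def pair_pen(ir_tips, color_window):
      """Pair the tracked color with its nearest IR tip to form one pen detection."""
      if color_window is None or not ir_tips:
          return None
      cx = color_window[0] + color_window[2] / 2.0
      cy = color_window[1] + color_window[3] / 2.0
      return min(ir_tips, key=lambda t: (t[0] - cx) ** 2 + (t[1] - cy) ** 2)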
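
Likewise, a minimal sketch of the hand-tracking steps (background capture, depth thresholding against the initial scene, and blob filtering by area) is shown below. The depth units, the conversion of the five-centimeter threshold, and the blob area limits are assumptions made for illustration; the actual component builds on Kinect Core Vision.

  import cv2
  import numpy as np

  SURFACE_OFFSET_MM = 50     # keep only objects within ~5 cm of the surface
  MIN_BLOB_AREA = 300        # assumed area limits (in pixels) used to reject
  MAX_BLOB_AREA = 5000       # pen tips (too small) and forearms (too large)

  def capture_background(depth_frame_mm):
      """Store the initial 3D scene: the depth of the empty projection surface."""
      return depth_frame_mm.astype(np.float32)

  def detect_hand_blobs(depth_frame_mm, background_mm):
      """Return the centroids of finger/hand blobs close to the surface."""
      # Background subtraction: positive values are objects above the surface.
      height_above = background_mm - depth_frame_mm.astype(np.float32)
      # Depth thresholding: discard anything 5 cm or more above the surface.
      near_surface = np.logical_and(height_above > 0,
                                    height_above < SURFACE_OFFSET_MM)
      binary = near_surface.astype(np.uint8) * 255

      contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                     cv2.CHAIN_APPROX_SIMPLE)
      centroids = []
      for c in contours:
          area = cv2.contourArea(c)
          # The area check prevents a pen or sensor noise from being taken for a hand.
          if MIN_BLOB_AREA <= area <= MAX_BLOB_AREA:
              m = cv2.moments(c)
              if m["m00"] > 0:
                  centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
      return centroids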
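
Finally, the sketch below illustrates how the User Interface client could dispatch events: pen events draw strokes while hand events move a 2D object. It uses the Kivy framework mentioned in the next section and assumes that each tracking server streams TUIO to its own port, so that Kivy exposes the event source as the touch's device name; the provider names, ports and stroke color are illustrative assumptions rather than the system's actual configuration.

  from kivy.app import App
  from kivy.config import Config
  from kivy.graphics import Color, Line
  from kivy.uix.scatter import Scatter
  from kivy.uix.widget import Widget

  # Assumed setup: pens and hands arrive on separate TUIO ports, so Kivy
  # reports them as distinct input devices named 'pens' and 'hands'.
  Config.set('input', 'pens', 'tuio,0.0.0.0:3333')
  Config.set('input', 'hands', 'tuio,0.0.0.0:3334')

  class DrawingCanvas(Widget):
      """Pen touches draw strokes; hand touches fall through to movable widgets."""

      def on_touch_down(self, touch):
          if touch.device == 'pens':
              with self.canvas:
                  Color(0, 0, 1)                    # illustrative stroke color
                  touch.ud['line'] = Line(points=[touch.x, touch.y])
              return True
          return super().on_touch_down(touch)        # hand touches reach the Scatter

      def on_touch_move(self, touch):
          if 'line' in touch.ud:
              touch.ud['line'].points += [touch.x, touch.y]
              return True
          return super().on_touch_move(touch)

  class TabletopApp(App):
      def build(self):
          root = DrawingCanvas()
          root.add_widget(Scatter(size=(200, 120)))  # the movable 2D rectangle
          return root

  if __name__ == '__main__':
      TabletopApp().run()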

4 Tools and Frameworks

Several tools and frameworks were used to build the components described in the previous section. The pen-tracking server component was built on top of the Openframeworks library, and the OpenCV library was used to process images with algorithms such as RGB-to-HSV transformation, binarization and Camshift. As for the hand-tracking server component, a modified version of the Kinect Core Vision software was used to process data coming from the Kinect; the connection between Kinect Core Vision and the hardware itself is handled by the libfreenect library. Furthermore, multi-touch events were implemented using the TUIO multi-touch protocol across all components. Finally, the user interface client component uses the Kivy framework to easily deploy multithreaded multi-touch applications.
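
For illustration, a simplified version of how a tracking server could publish a detection as a TUIO 1.1 cursor event over OSC is sketched below using the python-osc package; the host, port and the flattened (non-bundled) message layout are assumptions, and the actual components use the TUIO support built into the frameworks listed above.

  from pythonosc.udp_client import SimpleUDPClient

  # TUIO trackers conventionally send to UDP port 3333 (assumed here).
  client = SimpleUDPClient("127.0.0.1", 3333)
  frame_id = 0

  def send_cursor(session_id, x, y, vx=0.0, vy=0.0, accel=0.0):
      """Send a simplified TUIO 1.1 2D-cursor update for one pen or finger.

      x and y are normalized to [0, 1] over the projected canvas; a full
      implementation would group these messages into a single OSC bundle.
      """
      global frame_id
      frame_id += 1
      client.send_message("/tuio/2Dcur", ["alive", session_id])
      client.send_message("/tuio/2Dcur",
                          ["set", session_id, float(x), float(y),
                           float(vx), float(vy), float(accel)])
      client.send_message("/tuio/2Dcur", ["fseq", frame_id])

  # Example: the pen-tracking server reports pen #1 at the canvas center.
  send_cursor(session_id=1, x=0.5, y=0.5)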

The implemented solution presents the following technical characteristics: a latency of 0.1 s for pen drawing and 0.133 s for hand interactions; a disparity of 4 mm between the real pen tip position and the position shown by the system; and a multi-touch accuracy of approximately 10 mm. Moreover, the pen-tracking server component works at 48 fps, while the hand-tracking server component works at 29 fps.

The overall hardware cost is around USD $450. The following devices were used to deploy this solution: an Axxa pico projector ($319.99), a Kinect sensor v1 ($99.99) and two web cameras ($16.98 for both). The price of the computer is not included in this total.

5 Evaluation

Two tests were conducted to assess the performance of the low-cost portable tabletop. The first was a user study that gauged accuracy-related variables of both pen and touch interactions, whereas the second was a stress test that measured the accuracy of the color recognition algorithm that enables user differentiation.

5.1 User Study

Twelve people (7 males and 5 females, average age 28) with previous experience using multi-touch surfaces participated in a one-session user study. The study sought to gauge how much time and effort users had to invest in successfully drawing and moving objects on the tabletop. For this purpose, participants were asked to perform tasks in a tabletop application consisting of a canvas with a rectangular 2D object that a user could move with their hands (Fig. 2b); the canvas also allowed users to draw using the proposed IR pen (Fig. 2a). At the beginning of the session, participants were given an average of two minutes to get familiar with the tabletop application. After this, they were assigned a set of 11 tasks: they had to use the pen to draw five different figures on the tabletop (line, square, circle, fork-like figure, spoon-like figure), as shown in Fig. 2c, and they had to perform six object-movement tasks to manipulate the rectangle projected on the tabletop (left-to-right (L-R), right-to-left (R-L), upward and downward movements; circular movements; and object rotations). Participants were instructed to attempt a task as many times as they felt necessary to successfully finish it. At the end of each task, participants were required to fill in a questionnaire. For pen-related tasks, the questionnaire asked for the number of attempts required to successfully execute the task, the time invested in successfully finishing the task, and whether the final result resembled the participant's original intention. For movement-related tasks, the first two questions were the same, and the last question asked whether the participant was able to successfully finish the task.

Fig. 2. Tabletop application used in the evaluation

5.2 Stress Testing

We measured the accuracy of the system in differentiating users' pen strokes. A stress test was used, consisting of the following:

  • One-minute stress test: Three users were asked to simultaneously make the maximum number of strokes they could in a one-minute period. The count of strokes made by each individual during this period was gathered, as well as the number of times the system misidentified each pen.

  • Thirty-strokes stress test: Three users were asked to simultaneously make thirty strokes on a random area of the tabletop. They were asked to count the number of misidentifications of their colored pen. A misidentification is counted when the color of a stroke on the canvas does not match the pen's color.
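
In both tests, the misidentification error rate is simply the fraction of strokes whose on-canvas color did not match the pen that produced them. A minimal sketch of this computation, using hypothetical counts rather than the study's raw data, is:

  def misidentification_rate(stroke_counts, misidentified_counts):
      """Percentage of strokes whose rendered color did not match the pen.

      Both arguments are per-user totals, e.g. one pair of numbers per
      participant in the one-minute test.
      """
      total = sum(stroke_counts)
      errors = sum(misidentified_counts)
      return 100.0 * errors / total if total else 0.0

  # Hypothetical counts for three users (not the study's actual data):
  print(misidentification_rate([30, 30, 30], [2, 2, 2]))  # -> approx. 6.67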

6 Results

The outcomes of the user test are summarized in Tables 1 and 2. Table 1 shows the results related to drawing tasks. As can be seen, most of the participants required fewer than three attempts to finish a given task; one participant had difficulties drawing squares, forks and spoons, and two participants faced challenges when drawing circles. Additionally, most participants agreed that the final version of their drawings resembled their intended drawing. As for the time needed to perform a drawing task, users reported investing 5 to 7 s in drawing squares, forks and spoons. Table 2 shows the results from the movement-related tasks. It is evident that users experienced difficulties rotating objects. In the rest of the tasks, few users (1 or 2) reported needing more than three attempts. The percentages of successful completion of the assigned tasks were high, with the exception of the rotation task, which only reached 50%. Regarding the average time per task, the circular movement and the object rotation took eight and twelve seconds, respectively.

In the one-minute stress test, results showed an average color-detection error of 5.56%, whereas the thirty-strokes stress test presented an error rate of 6.46%.

Table 1. User testing results: drawing tasks
Table 2. User testing results: object-movement tasks

7 Conclusions and Further Work

The proposed system integrates pen and touch interactions with low-cost portable tabletop technologies. The main goal of this solution is to enable widespread tabletop usage in educational settings; therefore, the reported system was designed to allow more than two users to interact with the tabletop using pen and touch, regardless of their location. This approach differs from other proposed solutions: [7] enabled only two users to simultaneously interact with a projected tabletop, and [27] does not report a portable solution. We validated our approach by quantifying the time and effort users had to invest in successfully drawing and moving objects on the tabletop. Moreover, we measured the proposed tabletop's ability to identify different users' strokes. The fact that all participants had previous experience interacting with multi-touch surfaces minimized the time invested in learning how to use the proposed system. Results showed that users seldom had to engage in more than one attempt to draw figures on the proposed tabletop. In contrast, users had to execute an average of two attempts when manipulating 2D objects on the tabletop. Additionally, the system exhibited low rates of stroke misidentification. To our knowledge, these results cannot be directly compared to other initiatives; previous work on projectable tabletops has had a different focus than ours, mostly exploring the technology's support for gesture analysis [16, 19] or pen-based interactions [7], or it has neither attempted to identify different users nor published results. In general, the proposed tabletop is a promising solution at an affordable price. Nevertheless, before a full deployment of the system, the following challenges must be addressed in further work:

  • Challenge 1: More drawing precision is required. On average, 6% of the drawing tasks involving complex shapes required more than three attempts to resemble the intended drawing.

  • Challenge 2: More effectiveness in interpreting gestures is needed. Rotating an object proved challenging for users.

  • Challenge 3: A lower error rate for differentiating users' strokes needs to be achieved. Even though the error rates on user identification are better than or similar to those of other reviewed solutions [7, 21], for tabletops to achieve massive usage in the classroom, error rates should be minimal.

Overall, the first two challenges could be addressed by exploring devices that can more accurately sense users' movements, such as the Kinect 2. The user-recognition challenge could be tackled through a shape recognition algorithm.