1 Introduction

Food and retail stores are aware that increasing customer retention also increases customer satisfaction [1]. Waiting time critically influences customers’ shopping experience, the purchase termination rate and, more generally, customers’ perceptions of the retailer’s service offerings [2]. The study in [3] found that psychological factors at the checkout area, referred to as the irritation of waiting, may influence this perception. Customer retention can be increased by avoiding waiting queues at the checkouts [4].

A custom-designed optimum is required both to avoid queues of waiting customers at the checkouts and to serve them as economically as possible. Cost-effectiveness increases when customers experience reduced waiting and dwell times and when this is achieved with an optimal service capacity. Optimizing the waiting time and the resources needed at the checkouts requires a forecast of the customers’ inflow into the checkout area. Various sensor-based technologies have been developed to obtain more dynamic customer information for better business decision-making. For forecasting purposes, the supermarket can be divided into a shopping area and a checkout area, and the inflow into each area is measured separately to obtain the number of customers in each. This paper surveys the sensor technologies currently applied in location prediction systems, which provide the fundamental data for further optimization of customer satisfaction.

The paper is organized as follows. Section 2 introduces various sensor platforms. Section 3 discusses the location prediction methods developed so far and summarises their applications in the retail environment. Experimental results are presented in Sect. 4. Finally, we conclude in Sect. 5.

2 Sensor Technologies

A variety of sensor technologies is available for collecting dynamic information about the number of customers in a supermarket or other public areas. Mechanical systems were the first generation of counting systems. For example, a turnstile counts customers as they pass. Because the customers are separated, these counting systems achieve high accuracy, but customers cannot pass the entrance simultaneously. Later, electronic counting systems appeared. They can be categorized by mounting position (floor or ceiling) and by whether the sensors are “active” or “passive”. In the following, we briefly introduce these technologies in terms of their principle of operation and measurement, their operating conditions and the features of each type.

2.1 Photoelectric Sensor Systems

Photoelectric sensor systems consist of a paired transmitter and receiver. The transmitter sends a horizontal, linear light beam to a photosensitive sensor element, the receiver. When a person breaks this light beam, a signal is sent to the electronic evaluation device, which interprets it as one counted object. Photoelectric sensor systems are “passive” systems. In retail stores or supermarkets, the transmitter and the receiver can be installed directly at the entrance, so that customers have to cross the light beam of the counting system. To avoid missed detections, multiple transmitter-receiver pairs need to be used in an unobstructed detection zone. Such a system is cheap, but its light beam is limited by distance and suffers from blind spots. Detection accuracy decreases as the width of the entrance increases.

2.2 Radar and Laser Systems

Radar and laser systems are “active” systems. The sensor emits a signal, and the portion reflected by the environment is interpreted. In radar systems, the signals are electromagnetic waves (radar beam); laser systems use a focused light beam (laser beam). The environment of an object generates a specific reflection characteristic, which can be interpreted as the entry of a person. The field of view of a laser system is widened by a rotating mirror inside the device. If two or more parallel laser beams are used, a grid is laid over the floor area, in which case the direction of motion can be determined. Radar systems determine the direction of motion by using the Doppler effect, that is, the change in frequency of a radar wave for an observer moving relative to its source. Both sensor types can be installed at ceiling height. The accuracy of these systems depends on the surface texture of the environment or objects.
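The Doppler-based direction detection described above can be illustrated with a minimal sketch. It uses the standard low-velocity approximation for a reflected (two-way) radar signal, f_d = 2 v f_0 / c. The 24 GHz carrier, the walking speed of 1.4 m/s and the function names are assumptions chosen for illustration and are not taken from the surveyed systems.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_shift(radial_velocity_mps: float, carrier_hz: float) -> float:
    """Approximate two-way Doppler shift in Hz (valid for speeds far below c).

    Positive velocity means the person moves towards the sensor.
    """
    return 2.0 * radial_velocity_mps * carrier_hz / SPEED_OF_LIGHT


def direction(shift_hz: float, threshold_hz: float = 1.0) -> str:
    """Classify the direction of motion from the sign of the Doppler shift."""
    if shift_hz > threshold_hz:
        return "approaching"
    if shift_hz < -threshold_hz:
        return "receding"
    return "stationary"


# Hypothetical example: a person walking at 1.4 m/s towards a 24 GHz sensor.
shift = doppler_shift(1.4, 24e9)
print(round(shift, 1), direction(shift))
```

Under these assumptions the shift is a few hundred hertz, and its sign is what distinguishes entering from leaving customers.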

2.3 Infrared Systems

Infrared systems are either “active” or “passive”. Active infrared systems are similar to radar systems: they emit an infrared beam and analyse the reflected beam, which an evaluation unit interprets as an object. These active sensors are called “position sensitive devices” (PSD sensors). Passive infrared systems detect heat sources by measuring the temperature of the environment. Since the human body temperature differs from that of the surrounding environment, the sensor can detect people easily.

These systems are typically mounted at ceiling height at an entrance. They are commonly used for door openers or revolving doors. The advantage of the passive system is that it does not count objects that are not at human body temperature. However, it is affected by changes in the background, especially sudden temperature changes and strong sunlight. It is also affected by immobile persons, because a person who stands still becomes part of the background. Active systems, in contrast, are unaffected by sudden temperature and light changes and by immobile persons, and they are more accurate than passive systems. To cover wide entrances, an array installation can be set up.

2.4 Video-Based Two-Dimensional (2D) Systems

Video-based systems are composed of a video camera and a downstream processing unit. Video-based two-dimensional counting systems are mounted at ceiling height at the entrance.

Fig. 1. Background and head counting (Vitracom AG)

The processing unit is supplied with images continuously. The counting software evaluates counting events at a virtual counter line placed in the field of view. Counting can be either background-based or head-based, as shown in Fig. 1. The counting results are stored on the processing unit and can be fetched over the network for further processing. However, if an object stops moving, it becomes part of the background and is no longer detected. A counting system that uses an IP camera can be maintained and validated remotely. The accuracy of such systems is very high, even in crowded situations. Characteristics such as shape, colour, velocity, size or kinetic behaviour can be used to distinguish people from objects or pets. Such video sensors can also evaluate further features, such as the dwell time of people in certain areas or the walking paths of customers in a store. However, they are affected by vibrations, changing light conditions and similar disturbances. The image quality and the image processing software influence the accuracy.
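The virtual counter line can be sketched as follows. The sketch assumes that an upstream tracker (not shown) already delivers, for each tracked person, the image y-coordinate of the head or blob centre per frame; the function name and the convention that increasing y means "inward" are hypothetical.

```python
def count_crossings(track_y, line_y):
    """Count inward and outward crossings of a horizontal virtual line.

    track_y: y-coordinates of one tracked person, one value per frame.
    line_y:  y-coordinate of the virtual counter line.
    Returns (inward, outward); inward means moving from y < line_y to y >= line_y.
    """
    inward = outward = 0
    for prev, curr in zip(track_y, track_y[1:]):
        if prev < line_y <= curr:
            inward += 1
        elif prev >= line_y > curr:
            outward += 1
    return inward, outward


# Hypothetical track: a person walks in, turns around and walks out again.
print(count_crossings([40, 60, 40], line_y=50))
```

A person who stops on one side of the line produces no event, which is consistent with counting crossings rather than presence.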

2.5 Video-Based Three-Dimensional (3D) Systems

A single camera can capture only 2D information and therefore has difficulties when occlusion occurs. 3D systems provide not just point or area information but also 3D distance or height information about the object. 3D sensor systems imitate human binocular vision: two camera lenses that are calibrated with each other have different perspectives of the scene. By triangulating the virtual visual rays, the distance to each pixel can be reconstructed. Figure 2 shows the stereoscopic camera of Hella Aglaia. Another option for 3D measurement is Photonic Mixer Devices (PMD). This camera system measures the time of flight of a light signal between the camera and the subject for each point of the image. These “time-of-flight” cameras use a coordinated infrared beam, whose reflection is measured by an optical sensor chip. From the time of flight, the distance to each point of the image is determined continuously. Video-based three-dimensional counting systems are mounted at ceiling height at the entrance. The richer information improves the accuracy of video-based three-dimensional sensor systems, which are also less affected by vibrations and changing light conditions.

Fig. 2. Video-based three-dimensional system in the supermarket (a stereoscopic camera)
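The two 3D measurement principles can be summarised in a short sketch. It assumes an idealised, rectified stereo pair (depth Z = f B / d) and an ideal time-of-flight measurement (distance = c t / 2); the focal length, baseline, disparity and mounting height values are hypothetical and do not describe any specific product.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def stereo_depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from stereo triangulation for a rectified camera pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def tof_distance_m(round_trip_s: float) -> float:
    """Distance from time of flight; the light travels to the object and back."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0


def height_above_floor_m(mount_height_m: float, depth_m: float) -> float:
    """Height of a point seen by a downward-looking, ceiling-mounted sensor."""
    return mount_height_m - depth_m


# Hypothetical values: 800 px focal length, 10 cm baseline, 40 px disparity.
depth = stereo_depth_m(800.0, 0.1, 40.0)
print(depth, height_above_floor_m(3.0, 1.25), tof_distance_m(20e-9))
```

The height value is what allows a ceiling-mounted 3D counter to separate adults, children and trolleys, information that a single 2D image does not provide directly.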

2.6 IoT Based Wireless Sensor Networks

The rapid progress of the Internet of Things (IoT) has dramatically accelerated the development of wireless sensor networks. With advances in wireless communication, it is now possible to use wireless signals to track people who carry a smartphone. Meshlium Scanner [5] is a product from Libelium that detects iPhone and Android devices, or any other device with a WiFi or Bluetooth interface. Devices can be detected without being connected to a specific access point, which enables the detection of any smartphone, laptop or hands-free device that comes into the coverage area of the Meshlium AP scanner. Such a product can therefore be applied in a supermarket to detect the number of people present at a specific time, and the collected data can be used to evaluate and analyse the real traffic of people.

3 Existing Location Detection Methods

A person’s location can then be identified from the captured sensor data. Several location detection techniques have been developed: topological graph-based methods, grid-based methods, Markov models and hidden Markov models, Bayesian networks, self-organizing maps, neural network approaches and state predictor methods. In the following, we compare and contrast these methods in detail.

A topological graph-based method [6] requires sensors that relate to the layout of an environment. Topological graphs appear to avoid the fragility of purely geometrical methods. Because the topological approach depends on the semantics of the environment, it is more capable than other approaches of managing reactive behaviors, especially in large-scale cases [7]. However, its drawback is the coarseness of its representation: such methods may lack the details of an environment and provide only rough information about a person’s location. To overcome this shortcoming, Shi et al. [7] proposed a hybrid map that combines the topological paradigm with the metric paradigm of grid-based approaches. They used the topological map to represent the building map and the grid-based approach for localization. Their research showed that the positive characteristics of both can be integrated to compensate for the weaknesses of each single approach.

Grid-based approaches [8] can represent arbitrary distributions over the discrete state space. However, they incur high computational and space complexity, because the position grid must be kept in memory and updated for predictions. The complexity grows exponentially with the number of dimensions, which suggests using grid-based approaches only for low-dimensional estimation [6]. Applying Bayesian filtering to a Voronoi graph has the advantage that arbitrary probability densities can be represented.

Furthermore, various machine learning-based methods have been integrated into location detection to achieve higher accuracy. Gellert et al. [9] achieved an accuracy of 84.81% by using a hidden Markov model to predict the next location of a moving person. Ashbrook and Starner [10] used a Markov chain model and the K-means clustering algorithm to predict future movements. They clustered GPS data with K-means to find significant locations at which persons stayed for a long time, and then built a Markov chain model from the historical movements among these locations. They found that their model takes a long time to reflect changes in routine and therefore proposed a way of weighting certain updates. Zhou [11] proposed Markov object location prediction to obtain the initial position of an object for compressive tracking. This method locates the object accurately and quickly, and an adaptive updating strategy for the classifier parameters is given based on the confidence map.
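The Markov chain idea can be illustrated with a minimal first-order sketch: transition counts between already identified significant locations are collected from a visit history, and the most frequent successor is predicted. This is a simplified illustration and not the implementation of any cited work; the zone names and function names are hypothetical, and the clustering step that would produce the locations is omitted.

```python
from collections import Counter, defaultdict


def fit_transitions(visits):
    """Count first-order transitions between consecutive locations."""
    counts = defaultdict(Counter)
    for here, nxt in zip(visits, visits[1:]):
        counts[here][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return the most frequently observed successor, or None if unseen."""
    successors = counts.get(current)
    if not successors:
        return None
    return successors.most_common(1)[0][0]


def transition_probability(counts, current, nxt):
    """Maximum-likelihood estimate of P(next = nxt | current)."""
    successors = counts.get(current, Counter())
    total = sum(successors.values())
    return successors[nxt] / total if total else 0.0


# Hypothetical visit history over store zones.
history = ["entrance", "produce", "bakery", "produce",
           "checkout", "entrance", "produce", "bakery"]
model = fit_transitions(history)
print(predict_next(model, "produce"))
```

Because the counts are accumulated over the whole history, a recent change in routine only slowly alters the prediction, which is the effect that motivates weighting certain updates.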

Bayesian filters, on the other hand, can converge to the true posterior probability even in nonlinear dynamic systems. It is further claimed that the Bayesian filtering approach is more efficient than grid-based (cellular automaton) approaches because it focuses its resources (particles) on regions of the state space with high probability. Nevertheless, the efficiency depends on the number of samples used for filtering [6].

In all of the presented methods, the complexity grows exponentially with the dimensions of the state space. Furey et al. [8] noted that researchers applying these methods have to be careful with high-dimensional estimation problems. The complexity of the cellular automaton can be avoided by representing the area in a non-metric way using a topological approach [6]. The authors claimed that motion models generally use topological approaches and give a discrete or fixed number of probabilities. Furthermore, they noted that the efficiency increases in areas where no sensors are available for measuring people.

Furey et al. [8] compared different filter implementations to measure how well the approaches can estimate the location of people given appropriate sensors. Cellular automaton or grid-based approaches appear to be able to reach arbitrary accuracy; high accuracy, on the other hand, means high computational cost. Bayesian Kalman filters offer robustness and efficiency with regard to computation and memory. Han et al. [12] used a Self-Organising Map, building on Ashbrook and Starner [10], for learning without any prior knowledge. Self-Organising Maps close the gap left by missing prior knowledge of movement patterns. They are learning neural networks that preserve the topology of a map as they create it.

By applying it to a Markov chain, Han et al. [12] converted GPS data into significant patterns, so that the next location of a person can be predicted from the output of the Self-Organising Map. Jiang et al. [13] designed multi-order Markov chains that take into account users’ current location and the associated historical mobility data to predict human mobility.

Neural networks offer a further way to predict the movement of persons. Vintan et al. [9] suggested a prediction technique that anticipates a person’s next move by using neural networks. In their study, they used multi-layer perceptron predictors trained with backpropagation. Their results show an accuracy of up to 92% for next-location prediction in pre-trained cases. Mantyjarvi et al. [14] applied the same multi-layer perceptron classifier to recognize human motion.

Assam [15] proposed a robust location predictor for check-in data by using Wavelets and Conditional Random Fields (CRF) with an assumption that check-in generation is governed by the Poisson distribution. In [16] a novel model called Space Time Features-based Recurrent Neural Network (STF-RNN) was proposed for predicting people’s next movement based on mobility patterns obtained from GPS devices logs. Through extracting the internal representation of space and time features automatically, this model improves the capability of RNN and shows good performance to discover useful knowledge about people’s behavior in a more efficient way.

Location detection thus plays a key role in various retail environments. The studies in [17, 18] processed and characterised queuing data of inflow and outflow through distribution models. The studies in [19, 20, 21, 22] focused on queuing control theory for retail stores. The work in [23] calculates a deterministic model that depends on the current inflow and outflow at both the shop area and the checkout area.

The research mentioned above measures only the inflow and outflow of a supermarket [23, 24], and the customers’ dwell time is estimated from the captured data. The system to be developed should therefore control the operational resources depending on the expected number of customers at the checkout desks of a supermarket. Our proposed approach monitors the inflow and outflow of the service area together with the queue length and the inflow to the checkouts, in order to better differentiate between dwell time in the shop area and processing in the checkout area.

4 Experiment and Results

4.1 Setup

Using the video-based counting system, we investigate a supermarket with the following settings as an example.

The selected sales area is approximately 8000 \(\mathrm{m}^2\) and includes a mall with a bakery, a dry cleaner, a post office, a bank, a pharmacy and a newspaper kiosk. The building has two main entrances. The supermarket has 12 checkouts and 4 self-scanning checkouts. To reach the supermarket area, customers first have to pass through the mall area and then use one of the two available entrances into the supermarket. The width of each main entrance is 8 m, the width of each entrance into the supermarket is 3.10 m, and the width of the checkout line is approximately 28 m.

The research assumes that subjects arrive within an observed time interval with a fixed time lag. A further assumption is that the checkout time at the checkouts is constant during the time interval. This means that the system is deterministic. The research also assumes that if the inflow rate and the checkout rate are balanced, the length of the waiting queue does not change. In reality this does not hold, because according to waiting queue theory the random interruptions of events increase continuously. According to the current state of waiting queue theory, the challenge is the interaction between multiple processes that are characterized by non-stationary Poisson processes. This is the basis of the waiting queue models that are formally defined in waiting queue theory.
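The deterministic assumption can be made concrete with a small sketch of a fluid-style queue update, in which the queue grows by the arrivals of an interval and shrinks by the constant service capacity of the open checkouts. The update rule, the function name and the numbers are illustrative assumptions; they are not the model of [23] or the exact model used in this study.

```python
def simulate_queue(arrivals, open_checkouts, service_per_checkout, initial_queue=0.0):
    """Deterministic queue length after each interval.

    arrivals:             customers entering the checkout area per interval.
    open_checkouts:       number of open checkouts (constant here).
    service_per_checkout: customers served per checkout and interval (constant).
    """
    queue = float(initial_queue)
    history = []
    capacity = open_checkouts * service_per_checkout
    for arrived in arrivals:
        # The queue cannot become negative: idle capacity is simply lost.
        queue = max(0.0, queue + arrived - capacity)
        history.append(queue)
    return history


# Balanced case: inflow equals capacity, so the queue length stays at 4.
print(simulate_queue([6, 6, 6], open_checkouts=3, service_per_checkout=2, initial_queue=4))
```

When inflow exceeds capacity the queue grows linearly, and when it falls below capacity the queue drains; random arrivals, which this sketch ignores, are what make the real system deviate from this behaviour.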

In the investigated supermarket, all customers have to pass through one of these checkouts to leave. The selected 3D video technology has to be installed at the points of entering and leaving the main shop (including the mall) and at the points of entry into the supermarket area. Furthermore, counting sensors should be placed at the checkout area in order to count the customers entering this area, to observe the waiting queue and to count the customers leaving the checkout area.

4.2 Results and Analysis

To achieve a good and realistic forecast of dwell times, the sample supermarket area has to be divided into the shop area and the checkout area. When customers enter the checkout area, it can be assumed that operational resources are required. If the customers are not in the checkout area but only in the flow field, no staff are required at the checkouts. Besides the inflow of customers into the supermarket, this research also counts the inflow of customers into the checkout area. Furthermore, the length of the waiting queue in front of each checkout is monitored. In our experiment, the inflow and outflow of customers have to be prepared for presentation, analysis and interpretation. To handle the large amount of data, the average queue length over all checkouts during one day is considered. In addition, to smooth out irregular customer behavior and variations in the number of customers, the minimum measurement period is one week (Monday to Saturday) without any holiday.

Table 1. Average values of each weekday

Table 1 presents the average inflow and outflow values from Monday to Saturday over the 12 months monitored. Fridays and Saturdays have the most customers; the supermarket has fewer customers on the other days. External effects and special situations such as public holidays, bridging days and seasons (e.g. Christmas, Easter) influence the behavior of customers. The proposed system treats these as new situations.

Fig. 3. A sample inflow and outflow trend of customers

Figure 3 presents the trend of inflow and outflow during a weekday for three weeks in April. When we compared three weeks of daily data across different months, we found similar buying behavior of customers. The summary of the system performance is shown in Table 2.

Table 2. Overview of system performance

With our proposed counting method, the dwell time is not affected by the purchase process and can be evaluated separately. Since the desired customer satisfaction is expressed in the waiting queue length, we set the maximum queue length to 3 customers. We noticed a difference between inflow and outflow; the reason for this lies in the counting system. The variation between the inflow and outflow would represent the dwell time. Based on the estimated dwell time, the opening and closing of checkouts has been scheduled more effectively, which helps relieve the stress on staff. Our current counting system is not yet accurate enough for this purpose. In the future, people tracking and trolley tracking systems could increase the accuracy.
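One possible way to turn inflow and outflow counts into a dwell-time estimate is sketched below. It is an assumption made for illustration and not necessarily the evaluation used in this study: the occupancy is taken as the difference of the cumulative counts, and the mean dwell time follows from Little's law (mean occupancy divided by mean arrival rate). The function names and the sample counts are hypothetical.

```python
from itertools import accumulate


def occupancy(inflow, outflow):
    """Customers inside the area at the end of each interval."""
    return [i - o for i, o in zip(accumulate(inflow), accumulate(outflow))]


def mean_dwell_intervals(inflow, outflow):
    """Mean dwell time in units of one counting interval (Little's law: W = L / lambda)."""
    total_in = sum(inflow)
    if total_in == 0:
        raise ValueError("no arrivals observed")
    occ = occupancy(inflow, outflow)
    mean_occupancy = sum(occ) / len(occ)
    arrival_rate = total_in / len(inflow)
    return mean_occupancy / arrival_rate


# Hypothetical counts: every customer leaves one interval after entering.
inflow = [10, 10, 10, 0, 0]
outflow = [0, 10, 10, 10, 0]
print(occupancy(inflow, outflow), mean_dwell_intervals(inflow, outflow))
```

Because counting errors accumulate in the cumulative sums, such an estimate is sensitive to miscounts, which is in line with the accuracy limitation noted above.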

Based on the captured dynamic process data, a forecasting model has been developed. The process data are the inflow of customers at the entrance, the number of customers entering the checkout area, the current waiting queue length and the number of available checkouts. Given these parameters, staffing has to be sufficient to keep the waiting time in the queue short.

The decision to provide more or fewer operational resources is based on a cost-benefit calculation. On the one hand, the personnel costs at the checkout area should be as low as possible; on the other hand, the waiting queue length, and thus the waiting time, should not be so long that it annoys the customers. For that reason, the optimum number of available resources has to be determined using the following cost function K:

(1)

where () is the amount of requested resources, NB the available resources, \(j_A\) the inward flow into the checkout area, \(j_O\) the outward flow of the shop, L the length of the waiting queue and the vector as a combination of boundary conditions. A boundary condition can be, for instance, a minimum time period for operational resources. This boundary condition is important because an evaluation unit based on the inward flow, the outward flow and the waiting queue length is necessary to cover short-term fluctuations. Otherwise, the result could be a very high number of checkout openings and closings (as Fig. 4 shows), which would be neither economical nor reasonable for the staff who have to deal with the customers.

To avoid this fluctuation, the controller has to evaluate whether a change to a new state is necessary, that is, whether the new state is persistent over relevant time periods or merely transient for short periods. If the trend is not persistent, the actual number of open checkouts is not changed.

Fig. 4. Long-term view of the numbers of openings and closings
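The persistence check of the controller can be sketched as follows. The sketch assumes that the required number of checkouts is derived from the total queue length with at most 3 waiting customers per checkout and at most 12 checkouts, as in the setup above, and that a change is applied only after the same new target has been observed for a hypothetical number of consecutive intervals (`persistence`). It is an illustration under these assumptions and does not implement cost function K.

```python
import math


def required_checkouts(queue_len, max_queue_per_checkout=3, min_open=1, max_open=12):
    """Checkouts needed so that no more than max_queue_per_checkout customers wait at each."""
    needed = math.ceil(queue_len / max_queue_per_checkout)
    return max(min_open, min(max_open, needed))


def schedule(queue_lengths, persistence=3, start_open=1):
    """Number of open checkouts per interval, ignoring transient demand changes."""
    open_now = start_open
    candidate, streak = open_now, 0
    plan = []
    for queue_len in queue_lengths:
        target = required_checkouts(queue_len)
        if target == open_now:
            candidate, streak = open_now, 0      # demand matches: reset
        elif target == candidate:
            streak += 1                          # same new target seen again
        else:
            candidate, streak = target, 1        # a different new target appears
        if streak >= persistence:
            open_now, streak = candidate, 0      # change is persistent: apply it
        plan.append(open_now)
    return plan


# Hypothetical queue lengths: a single spike is ignored, a sustained rise is not.
print(schedule([2, 7, 2, 7, 7, 7, 2]))
```

A larger `persistence` value reduces the number of openings and closings at the cost of a slower reaction to genuine demand changes.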

5 Conclusions

This paper investigates how to avoid waiting queues of customers at checkouts as economically as possible. We found that cost-effectiveness increases when customers experience reduced waiting and dwell times and when this is achieved with an optimal service capacity. Since these requirements conflict, the best optimisation can be achieved through the opening and closing of checkouts. This helps to improve staff and customer satisfaction, to improve work processes and reduce stress, and to control waiting time, checkout efficiency and conversion rate. Moreover, similar problems and solutions arise in other fields such as telephone switching systems, computer and communication systems, telecommunication systems, SANs (storage area networks) and recovery systems, economics, quality control, transportation systems and many more.