doi:10.1016/j.cviu.2004.07.011
Copyright © 2004 Elsevier Inc. All rights reserved.
A novel non-intrusive eye gaze estimation using cross-ratio under large head motion
Dong Hyun Yoo and Myung Jin Chung
Division of Electrical Engineering, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology 373-1, Guseong-dong, Yuseong-gu, Daejeon 305-701, Republic of Korea
Received 15 October 2003; accepted 27 July 2004.
Available online 12 October 2004.
Abstract
Eye gaze estimation systems calculate the direction of human eye gaze. Numerous accurate eye gaze estimation systems that account for a user’s head movement have been reported. Although these systems allow large head motion, they require multiple devices and complicated computation to obtain the geometrical positions of the eye, the cameras, and the monitor. The light-reflection-based method proposed in this paper does not require any knowledge of these positions, so a system using the proposed method is lighter and easier to use than conventional systems. To estimate where the user is looking while allowing ample head movement, we use the cross-ratio, a value that is invariant in projective space. In addition, a robust feature detection method using an ellipse-specific active contour is proposed to locate the features precisely. The proposed feature detection and estimation methods are simple and fast, and they give accurate results under large head motion.
Keywords: Non-intrusive eye gaze estimation; Large head motion; Cross-ratio; Robust pupil detection
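The invariance that the abstract relies on can be illustrated with a minimal sketch. The code below is not the authors' implementation; it only shows that the cross-ratio of four collinear points is unchanged by a projective transformation. The function names, the sample points, the matrix H, and the particular cross-ratio convention used here are all assumptions chosen for illustration.

```python
def cross_ratio(p1, p2, p3, p4):
    """Cross-ratio of four collinear 2D points.

    Assumed convention (one of several in common use):
    ((t1 - t3) * (t2 - t4)) / ((t2 - t3) * (t1 - t4)),
    where t is the signed position of each point along the line.
    """
    dx, dy = p4[0] - p1[0], p4[1] - p1[1]

    def t(p):
        # Signed position along the line, up to a constant scale factor
        # that cancels in the ratio.
        return (p[0] - p1[0]) * dx + (p[1] - p1[1]) * dy

    t1, t2, t3, t4 = t(p1), t(p2), t(p3), t(p4)
    return ((t1 - t3) * (t2 - t4)) / ((t2 - t3) * (t1 - t4))


def apply_homography(H, p):
    """Map a 2D point through a 3x3 projective transformation H."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


# Four illustrative points on the line y = 0.5 * x + 1.
points = [(0.0, 1.0), (1.0, 1.5), (3.0, 2.5), (6.0, 4.0)]

# An arbitrary projective transformation (hypothetical values).
H = [
    [1.2, 0.1, 5.0],
    [-0.2, 0.9, 3.0],
    [0.001, 0.002, 1.0],
]

cr_before = cross_ratio(*points)
cr_after = cross_ratio(*(apply_homography(H, p) for p in points))
print(cr_before, cr_after)
```

The two printed values agree up to floating-point rounding. This is the property that lets a ratio measured in the camera image be related to the corresponding ratio on the monitor screen without knowing where the eye, the cameras, or the monitor are.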
Fig. 1. Eye structure (Illustration taken from [15]).
Fig. 2. System configuration: a monitor with four IR LEDs attached to its corners and two cameras, one of which has an IR LED attached to its lens.
Fig. 3. Input images captured by a camera: (A) dark eye image (corneal reflection) and (B) bright eye image.
Fig. 4. The relation between the IR LEDs of the monitor (LED1, LED2, LED3, and LED4) and the virtual projection points (v1, v2, v3, and v4). The virtual projection points are the projections of the IR LEDs onto the surface of the cornea. p is the pupil position in an image and g is the gaze point of the eye.
Fig. 5. Geometrical relation of the reflection. r1, r2, and c are the glints of the LEDs. v1 and v2 are the virtual projection points.
Fig. 6. How to compute a virtual projection point. Σc is the origin of the camera coordinate system, located at the optical center of the camera, and Σi is the origin of the image coordinate system.
(A) ur1.
(B) uv1.
(C) uv1/ur1.
Fig. 7. The variation of the features in the image according to the eye’s location.
Fig. 8. Computation of virtual projection points in an image plane. The white circles (○) are the glints, and the black circles (•) are the virtual projection points of the glints. The cross (+) marks the pupil center.
Fig. 9. How a cross-ratio in an image is computed. uv1, uv2, uv3, and uv4 are the virtual projection points on the cornea. up is the center of the pupil in the image. e is the intersection of the diagonals of the polygon uv1, uv2, uv3, uv4.
Fig. 10. The screen of a monitor. w is the width of the screen and h is its height. g is the estimated eye gaze point.
Fig. 11. Simulation environment.
(A) Cross-ratio.
(B) Estimated gaze point.
Fig. 12. The variation of the cross-ratio according to the eye’s location. The positions of the glints in the image are used to compute the cross-ratio: (A) the cross-ratio and (B) the estimation result.
(A) Cross-ratio.
(B) Estimated gaze point.
Fig. 13. The variation of the cross-ratio according to the eye’s location. The positions of the virtual projection points are used to compute the cross-ratio.
(A) Cornea center = (152, 112, 500).
(B) Cornea center = (10, 0, 500).
Fig. 14. Sensitivity study. The camera is at (152, 0, 100) and the gaze point is (152, 117).
Fig. 15. Detection of the corneal reflections: (A) dark-eye image, (B) segmented result, and (C) detected glints.
Fig. 16. Segmented results of three cases.
Fig. 17. Ellipse fitting to the edge detected by a Sobel operator. The red ellipse is the fitted result.
Fig. 18. Edge detection of a pupil region by a 1D search.
Fig. 19. Boundary detection results for the examples from Fig. 16. The small circle is the initial position of the simplified active contour. The crosses mark the boundary of the pupil region.
Fig. 20. Fitting results of Fig. 16.
(A) t = 16.
(B) t = 72.
(C) t = 94.
(D) t = 111.
(E) t = 16.
(F) t = 72.
(G) t = 94.
(H) t = 111.
Fig. 21. Face and eye tracking results: (A–D) are the images obtained by a wide-view camera, and (E–H) are the images simultaneously obtained by a high-zoom camera.
Fig. 22. Experimental setup: a monitor with four IR LEDs and two cameras mounted on a pan-tilt unit. A small wide-view camera is attached to the top of the high-zoom camera.
(A) Conventional method.
(B) Proposed method.
Fig. 23. Experimental result. The resolution of the screen was 1024 × 768 and the screen had 5 × 5 target points.
Table 1.
The average, standard deviation, and maximum of the estimation error using the conventional method

Table 2.
The average, standard deviation, and maximum of the estimation error using the proposed method
