The perception of a moving object depends on how our visual system processes both the shape of the object and the direction and speed at which it is moving. Understanding how the brain processes and integrates this information into a coherent percept is a fundamental challenge for vision scientists. Here, we investigate the mechanisms underlying form and motion perception, and specifically we examine how local form and motion information influences the overall perceived shape of a moving object.

Historically, form perception and motion perception were thought to be largely independent neural processes. This view was born from an abundance of evidence supporting a categorical boundary between form and motion processing. For example, the magnocellular pathway is more suited to motion detection, whereas the parvocellular pathway is more sensitive to form information (Benardete, Kaplan, & Knight, 1992; Lee, Pokorny, Smith, & Kremers, 1994). Additionally, selective damage to visual cortex can lead to deficits in the ability to perceive object motion, while preserving form and shape (i.e., akinetopsia: Vaina, 1989; Zihl, von Cramon, & Mai, 1983; Zihl, von Cramon, Mai, & Schmid, 1991), and damage to other areas can lead to somewhat opposite deficits (apperceptive agnosia; Lissauer, 1889). Furthermore, some neurons throughout visual cortex respond identically to motion across their receptive fields, independent of the moving object’s shape (Hubel & Wiesel, 1959, 1962, 1968), and other neurons respond to a given shape characteristic, independent of motion or directional information (Schiller, Finlay, & Volman, 1976a, b, c).

However, a growing body of evidence is demonstrating that form and motion interact in various and complex ways (Caplovitz & Tse, 2006, 2007; Georges, Seriès, Frégnac, & Lorenceau, 2002; Tse, 2006; Tse & Caplovitz, 2006; Tse & Hsieh, 2006; Tse & Logothetis, 2002; Whitney et al., 2003). For example, global form perception can influence the perceived speed of moving objects (Caplovitz & Tse, 2007; Kohler, Caplovitz, & Tse, 2009). Specifically, the perceived speed of an object made up of perceptually grouped elements is dictated by the overall shape of the grouped object rather than by the speeds of the local elements (Caplovitz & Tse, 2007; Kohler et al., 2009). In addition, the perceived position of a drifting Gabor pattern can be influenced by the speed and direction of the drift (De Valois & De Valois, 1988; Shapiro, Lu, Huang, Knight, & Ennis, 2010; Tse & Hsieh, 2006). Moreover, Whitney et al. (2003) demonstrated that these drift–position interactions lead to a perceived boundary shift for objects formed from drifting Gabors.

Specifically related to the present article, the orientation of an object relative to its direction of motion (Fig. 1a) has been shown to influence the speed at which it appears to move (Georges et al., 2002; Seriès, Georges, Lorenceau, & Frégnac, 2002). At high speeds (peak ~64°/s), an elongated Gaussian blob oriented parallel to its motion axis appears to translate faster than one oriented perpendicular to that axis. It is hypothesized that this local form–motion interaction arises from the horizontal connections that exist between neighboring neurons with collinearly aligned receptive fields (Field, Hayes, & Hess, 1993; Georges et al., 2002; Seriès et al., 2002). We note that the effect of orientation on perceived speed is quite different for much slower continuous motion (Castet, Lorenceau, Shiffrar, & Bonnet, 1993; Krolik, 1934; Metzger, 1936).

Fig. 1

a Stimuli used in Experiment 1. The left column shows the Gaussian blobs used as the reference stimulus in each condition. The middle and right columns show, respectively, the horizontally and vertically elongated Gaussian blobs used as the test stimuli. b Stimuli used in Experiment 2. The horizontally and vertically elongated Gaussian blobs from Experiment 1 were arranged to create a global object. This figure shows the condition in which the array formed a square (i.e., 6° × 6°). When the leading edge of the stimulus was consistent with the direction of motion, the configuration was said to be parallel. When the leading edge of the stimulus was inconsistent with the direction of motion, the configuration was said to be perpendicular. The left of panel b illustrates the parallel and perpendicular leading-edge conditions when the stimulus traversed a horizontal motion axis. The right of panel b illustrates the parallel and perpendicular leading-edge conditions when the stimulus traversed a vertical motion axis.

In this article, we investigate whether the local form–motion interaction described by Georges et al. (2002) influences global shape perception. Specifically, do the neural processes that dictate how we perceive the shapes of moving objects depend in part on neural mechanisms that mediate local form–motion interactions? To answer this question, we examined the perceived shape of a moving object made up of four small oriented Gaussian blobs, as shown in Fig. 1b.

If local form–motion interactions contribute to global form perception, predictable shape distortions should be observed when objects like those illustrated in Fig. 1b are put in motion. This hypothesis is based on the assumption that if the leading and trailing edge elements were in fact to move at different speeds, the shape of the global object would appear distorted. For example, if the object shown at the left of Fig. 1b were to move rightward, the leading edge would be composed of elements oriented parallel to the motion axis (and thus perceived as moving faster), and the trailing edge of elements oriented perpendicular to the motion axis (thus perceived as moving slower). If the perceived speeds of local elements contribute to global form analysis, the overall perceived shape should be stretched relative to the veridical shape. Similarly, the converse effect (compression) should occur if the leading elements are perpendicular to the motion axis and the trailing edge is parallel to the motion axis. If this is the case, we can conclude that the mechanisms underlying local form–motion interactions precede and influence the construction of global form. However, if global form analyses were independent of local form–motion interactions, no shape distortions would be expected. In the following two experiments, we first quantify the effect of local orientation information on perceived speed and then, by investigating the perceived shapes of stimuli like those shown in Fig. 1b, demonstrate that these local speed differences lead to global shape distortions.

General method

Participants

Seven observers participated in the first experiment, and six observers participated in the second experiment. The participants provided informed consent in accordance with the Institutional Review Board of the University of Nevada, Reno. They reported normal or corrected-to-normal vision, were naïve to the aims and specifics of the experiments, and received course credit for participating.

Apparatus and display

All stimuli were displayed on an 85-Hz CRT monitor (Dell Trinitron P991, 19 in., 1,024 × 768) and were generated and presented using the Psychophysics Toolbox (Brainard, 1997) for MATLAB (MathWorks Inc., Natick, MA). Each participant placed his or her head in a chinrest and viewed the stimuli binocularly from a distance of 57 cm.
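All stimulus sizes in this article are given in degrees of visual angle. As a minimal sketch of how the reported viewing geometry relates to those units, the snippet below converts a physical size on the screen to visual angle at the stated 57-cm viewing distance and derives the frame period of the 85-Hz display. The function and variable names are illustrative and are not taken from the original experiment code.

```python
import math

# Values stated in the Apparatus and display section.
VIEWING_DISTANCE_CM = 57.0
REFRESH_HZ = 85.0


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle subtended by an object of a given size, viewed head-on."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# At 57 cm, 1 cm on the screen subtends very nearly 1 degree.
one_cm_deg = visual_angle_deg(1.0, VIEWING_DISTANCE_CM)

# Duration of a single frame refresh on the 85-Hz monitor, in milliseconds.
frame_ms = 1000.0 / REFRESH_HZ

print(f"1 cm at 57 cm = {one_cm_deg:.3f} deg; one frame = {frame_ms:.2f} ms")
```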

Experiment 1

The goal of the first experiment was to replicate the effect of orientation-dependent speed modulation demonstrated by Georges et al. (2002)—specifically, their finding that the orientation of a Gaussian blob relative to the motion axis influences its perceived speed.

Method

Stimuli and procedure

Using the method of constant stimuli, participants made two-interval forced choice (2IFC) judgments of the perceived speed of apparent-motion stimuli. The stimulus in each interval was a Gaussian blob (mean luminance = 59 cd/m²) that was presented at spatially sequential locations upward or downward along the vertical axis and against a black background (mean luminance = 0.05 cd/m²) (Fig. 1a). The stimulus was shown at each position for two frame refreshes (~23.5 ms) with a 0-ms interstimulus interval (ISI). Each trial consisted of two intervals: a reference and a test, randomly ordered (500-ms ISI). The reference interval consisted of a circular Gaussian blob (standard deviation = 1°) with an apparent motion speed of 64°/s. The test interval consisted of a Gaussian blob that was circular (standard deviation = 1°) or was elongated either vertically (aspect ratio = 4/9) or horizontally (aspect ratio = 9/4). Speed was controlled by varying the spatial separation between successive positions. The speed of the test interval was selected from the following list: 16°/s, 32°/s, 48°/s, 64°/s, 80°/s, 96°/s, or 112°/s, corresponding to spatial separations of 0.38°, 0.75°, 1.13°, 1.5°, 1.88°, 2.25°, or 2.63°, respectively. The number of positions traversed in an interval varied randomly from three to five (70.5, 94, or 117.5 ms) in order to prevent judgments based on the duration or length of the motion sequence. On a given trial, the duration and direction of the reference interval were not necessarily the same as those of the test interval. Each trial began with a central fixation point (colored green; 0.35°) for 500 ms, after which the two intervals were presented. Participants were instructed to maintain fixation and to indicate by pressing one of two buttons the interval in which motion was perceived to be faster.
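The relation between nominal speed and spatial separation described above can be checked with a short calculation. The sketch below assumes that each position lasts exactly two refreshes of the 85-Hz display and that separation equals speed multiplied by that duration; it is not the original stimulus code, and the variable names are illustrative. Under these assumptions the computed separations agree with the listed values to within about 0.01°, and the interval durations agree with the listed ones to within about 0.15 ms, the small gaps being consistent with rounding in the reported figures.

```python
# Values stated in the Method section.
REFRESH_HZ = 85.0
FRAMES_PER_POSITION = 2

# Time spent at each position, in seconds (about 23.5 ms).
position_s = FRAMES_PER_POSITION / REFRESH_HZ

# Nominal test speeds (deg/s) and the separations (deg) listed in the text.
speeds = [16, 32, 48, 64, 80, 96, 112]
listed_separations = [0.38, 0.75, 1.13, 1.5, 1.88, 2.25, 2.63]

# Assumed relation: separation = speed * time per position.
separations = [speed * position_s for speed in speeds]

# Interval durations (ms) for sequences of three, four, or five positions.
durations_ms = [n * position_s * 1000.0 for n in (3, 4, 5)]

for speed, computed, listed in zip(speeds, separations, listed_separations):
    print(f"{speed:>3} deg/s: computed {computed:.3f} deg, listed {listed} deg")
```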

In total, 21 different conditions (three orientations and seven speeds) were presented. Each participant completed 420 pseudorandomly presented trials (20 trials of each condition). The participants were given a break after every 100 trials, and they completed 30 practice trials prior to the experiment that were not included in the analyses.

Results

The number of times that the test stimulus was perceived to move faster than the reference was recorded for each condition. Thus, seven values were calculated for each of the three test stimuli. Weibull functions were fit to the corresponding data, with the ceiling of the fitting function set to the first data point for each participant: \( f(x) = \mathrm{data}(1) - e^{-(x/a)^{b}} \). The fits of the data for each participant were quite good (mean R² = .934, SEM = .013), and no significant differences were observed between the goodnesses of fit for the different conditions [repeated measures ANOVA: F(2, 12) = 0.434, n.s.]. The point of subjective equality (PSE; i.e., the speed at which the test stimulus must move in order to be perceived as moving at the same speed as the reference) was computed by interpolating the 50% point for each curve. The raw data averaged across subjects are shown in Fig. 2, along with curves fit to the mean data. The inset of Fig. 2 illustrates the PSEs for each condition, averaged across subjects.
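To make the PSE computation concrete, the sketch below takes the fitting function exactly as written above and solves it for the 50% point. Note that this uses a closed-form inversion of the function rather than the interpolation procedure described in the text, and that the parameter values in the example are hypothetical placeholders, not fitted values from this study.

```python
import math


def weibull(x: float, a: float, b: float, ceiling: float) -> float:
    """Fitting function as written in the text: ceiling - exp(-(x/a)^b)."""
    return ceiling - math.exp(-((x / a) ** b))


def pse(a: float, b: float, ceiling: float, criterion: float = 0.5) -> float:
    """Stimulus value at which the function crosses the criterion level.

    Solving ceiling - exp(-(x/a)^b) = criterion for x gives
    x = a * (-ln(ceiling - criterion))^(1/b).
    """
    gap = ceiling - criterion
    if not 0.0 < gap < 1.0:
        raise ValueError("criterion is outside the range the function can reach")
    return a * (-math.log(gap)) ** (1.0 / b)


# Hypothetical parameters, chosen only to illustrate the calculation.
example_a, example_b, example_ceiling = 64.0, 3.0, 1.0
example_pse = pse(example_a, example_b, example_ceiling)
print(f"50% point for the example parameters: {example_pse:.2f} deg/s")
```

A closed-form solution is used here because it needs no fitting library; interpolating along a finely sampled fitted curve, as the text describes, should give the same value up to the sampling resolution.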

Fig. 2

Results of Experiment 1. The symbols indicate the percentages of trials in which the participants judged the reference to be faster than the test sequences for each condition (averaged across all participants). The solid curves illustrate the fits of the averaged data. The inset of the figure illustrates the points of subjective equality (PSEs) for the three curves. Asterisks indicate significance at the p < .01 level, and error bars represent ±1 SEM

A repeated measures ANOVA on the PSEs revealed a significant main effect of shape [F(2, 12) = 15.36, p < .001, η² = .72]. Consistent with Georges et al. (2002), a Gaussian blob oriented parallel to the motion axis was perceived to move faster than one oriented perpendicular to the motion [mean difference ~15°/s; t(6) = 5.197, p < .01] or than one containing no orientation information [mean difference ~16°/s; t(6) = 4.56, p < .01]. This effect is comparable to that derived by Georges et al. using similar stimuli. However, unlike the results of Georges et al., no difference was observed between the perpendicular and circular Gaussians [t(6) = 0.58, n.s.]. It is unclear why this discrepancy exists, and it could perhaps depend on subtle differences in the experimental designs. Importantly, having demonstrated that parallel orientations can influence perceived speed relative to both circular and perpendicular stimuli, we next investigated whether this effect could influence the perceived shape of a moving object composed of such elements.

Experiment 2

Here, we tested whether the aspect ratio of a rectangular object would appear distorted as a function of the orientations, relative to the motion axis, of the corner elements comprising the leading and trailing edges. If the effects observed in Experiment 1 contribute to global shape processing, rectangles in which the leading-edge elements appear to move faster than those in the trailing edge should appear elongated.

Method

Stimuli and procedure

Four elongated Gaussian blobs, arranged to form the corners of a rectangle, moved across the screen. The participants judged the orientation (either horizontal or vertical) of the rectangular array in a single interval. On each trial, two of the blobs were elongated perpendicular to the path of motion, and two were oriented parallel to the path of motion. Here we tested two distinct categories of rectangles (Fig. 1b): one in which the two leading-edge blobs were oriented parallel (with the trailing edge oriented perpendicular), and a second with the two leading-edge blobs oriented perpendicular (with the trailing edge oriented parallel) to the motion axis.

The size of each rectangular array was chosen from the following list on each trial: 3° × 6°, 4.5° × 6°, 5.4° × 6°, 6° × 6°, 6.6° × 6°, 7.5° × 6°, 9° × 6°. Each array traversed either a horizontal or a vertical apparent motion trajectory at 64º/s. The apparent motion sequence was composed of six (23.5-ms) steps (each separated by 1.5º), for a total stimulus duration of 141 ms. Importantly, the aspect ratios given above are relative to the direction of motion. Thus, if the trajectory was horizontal, the vertical distance between the blobs was always 6º, and if the trajectory was vertical, the horizontal distance between the blobs was 6º. The direction of motion (left, right, up, or down) was randomized on every trial. This uncertainty was used to prevent judgments based on local strategies for performing the task (e.g., judging the distance between the topmost two elements). In total, there were 14 trial types: the two leading-edge configurations in seven aspect ratios. A total of 20 trials of each condition were pseudorandomly presented, resulting in 280 total trials.
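The design just described can be summarized in a short script. This is a sketch of one way to build such a trial list, not the original experiment code: the dictionary keys, condition labels, and random seed are illustrative assumptions. It also checks the stated stimulus duration and the approximate speed implied by 1.5° steps every 23.5 ms.

```python
import itertools
import random

# Values stated in the Method section.
STEP_MS = 23.5            # duration of each apparent-motion step
N_STEPS = 6               # steps per sequence
STEP_DEG = 1.5            # displacement per step
ACROSS_MOTION_DEG = 6.0   # fixed extent orthogonal to the motion axis
ALONG_MOTION_DEG = [3.0, 4.5, 5.4, 6.0, 6.6, 7.5, 9.0]
REPS_PER_CONDITION = 20

# Labels below are illustrative names for the two leading-edge configurations.
CONFIGS = ["parallel_leading", "perpendicular_leading"]
DIRECTIONS = ["left", "right", "up", "down"]


def build_trials(seed: int = 0) -> list:
    """Return a shuffled list of trials: 2 configurations x 7 sizes x 20 repeats."""
    rng = random.Random(seed)
    trials = []
    for config, along in itertools.product(CONFIGS, ALONG_MOTION_DEG):
        for _ in range(REPS_PER_CONDITION):
            trials.append({
                "config": config,
                "along_motion_deg": along,
                "across_motion_deg": ACROSS_MOTION_DEG,
                # Direction is drawn at random on every trial.
                "direction": rng.choice(DIRECTIONS),
            })
    rng.shuffle(trials)
    return trials


trials = build_trials()
total_ms = N_STEPS * STEP_MS
implied_speed = STEP_DEG / (STEP_MS / 1000.0)
print(len(trials), "trials;", total_ms, "ms per sequence;",
      f"{implied_speed:.1f} deg/s implied speed")
```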

Results

The number of times that participants perceived the global object to be compressed relative to the axis of motion was calculated (i.e., if the stimulus moved horizontally, was it perceived to be oriented vertically?). The data were fit using the same procedures used in Experiment 1 (mean R² = .925, SEM = .027), and no significant difference in goodness of fit was observed between the conditions, according to a paired-samples t test: t(5) = 1.71, n.s. Figure 3 illustrates the raw data averaged across subjects and the corresponding PSEs (50% chance = when participants indicated a perceived square) for the two conditions, as well as data collected in a supplementary control experiment using unoriented (circular) Gaussian blobs (see the supplementary materials for the experimental details and statistical analyses). When the leading edge was oriented parallel to the motion axis, the stimulus appeared stretched ~0.65° of visual angle along the direction of motion: One-sample t tests revealed that the PSEs for the parallel leading-edge condition were significantly different from the value for a square [t(5) = −2.9, p < .05]. However, no such effect was observed for the perpendicular leading-edge condition [t(5) = 0.2, n.s.]. A paired-samples t test between the parallel and perpendicular conditions revealed a significant [t(5) = −3.147, p < .026] effect of leading/trailing-edge orientation on perceived shape. Specifically, the shape of a global object with leading-edge elements oriented parallel to the motion axis appeared stretched relative to when the leading edge was composed of perpendicular elements. However, these results revealed an unexpected asymmetry, in that the parallel trailing-edge elements did not lead to shape compression.

Fig. 3

Results of Experiment 2 and of the first supplementary control experiment. The symbols indicate the percentages of trials in which the participants judged the orientations of the stimuli to be compressed, relative to the motion axis, for the two conditions (averaged across all participants). The solid curves indicate the fits of the averaged data. The data for Experiment 2 are shown as black circles and light gray squares, and the data for the supplementary control experiment are shown as dark gray triangles. The inset of the figure shows the points of subjective equality (PSEs) for the parallel and perpendicular leading-edge conditions (when participants reported that the object was horizontally or vertically oriented at a 50 % chance level). Asterisks indicate significance at the p < .05 level, and error bars represent ±1 SEM

There are at least two hypotheses as to why this asymmetry may exist. It could be that the effects of local orientation are mediated in part by attentionally driven prediction effects that are biased toward the leading edge (Roach, McGraw, & Johnston, 2011). To investigate this possibility, we repeated Experiment 2 using circular rather than oriented Gaussian blobs. Here, the hypothesis predicted a distortion in the shape of the stimulus, as the attention-predictive effects were not thought to be exclusively orientation-dependent. However, the results of the circular-blob experiment revealed no significant changes in perceived shape (see the “unoriented leading edge” data in Fig. 3).

An alternative hypothesis is born out of the positions of the trailing-edge elements in the apparent motion sequences. At 64º/s, each element “jumps” 1.5º from one step to the next in the sequence. As such, starting with the second step, the trailing-edge elements in the 6° × 6° square configuration will occupy locations previously occupied by the leading-edge elements. Because the leading- and trailing-edge elements always have orthogonal orientations, this “imprinting” is likely to interfere with whatever collinear facilitatory effects may arise from the trailing edge, thereby limiting the local-orientation effects to the un-interfered-with leading edge.

The primary observation that we present here, though, is that the orientations of individual elements (at least along the leading edge) can influence the perceived shape of an object that they comprise. For completeness’ sake, we wanted to rule out the possibility that this effect arises due to the inability of observers to accurately perceive the “aspect ratio” of a rectangle formed out of these oriented elements, rather than to the motion of the elements. We therefore repeated Experiment 2 (including the circular elements) using stationary stimuli (i.e., the elements on a given trial never moved). Unlike the case in which they were moving, no significant differences in perceived shape were observed when the stimuli were stationary (Supplementary Fig. 1).

Taken together, these data support the hypothesis that mechanisms underlying local form–motion interactions precede and contribute to constructing global form. Furthermore, these interactions contribute substantially to the overall perceived shape of a moving object.

Discussion

The purpose of this research was to investigate whether local form–motion interactions influence global shape perception. These experiments were based on previous research that demonstrated that an object’s orientation relative to the motion axis influences its perceived speed (Castet et al., 1993; Georges et al., 2002). Here, we demonstrated that this local form–motion interaction contributes to global shape perception. This illustrates that not only do form and motion interact with each other at a local level of processing, but these interactions also contribute to the higher-level perceptual processes that construct global form. These findings are consistent with previous research (Shipley & Kellman, 1994, 1997) demonstrating that local motion signal extraction precedes global motion perception and boundary formation.

A fundamental question that arises from the present results is the degree to which the perceived speeds of the elements contribute to global shape perception. The data suggest that the global shape is not determined solely by the perceived speeds of the elements: If the leading- and trailing-edge elements were in fact moving with speeds that differed by 15°/s (as in Exp. 1), at the end of each trial an object that started as a 6° × 6° square would have stretched by ~2.12°, more than three times the observed distortion.

This underestimation can perhaps be accounted for on the basis of the Seriès et al. (2002) model for long-range horizontal connections. The critical characteristic of the model that differentiates it from the description above is that the latency advances accounting for the shifts in perceived speed are expected to saturate over time and distance. Using the model parameters of Seriès et al. (see their Fig. 7B), and assuming a saturated latency advance of 25 ms for the 6th step in our stimulus array, the model predicts no less than a ~1.6º distortion (25 ms × 64°/s), again much greater than the distortion observed in Experiment 2. However, a number of factors, such as contrast and location in the visual field, can influence the strength and speed of the long-range horizontal connections that presumably underlie the results of Experiment 1. As such, it is difficult to fully interpret the results of Experiment 2 in the context of the Seriès et al. model. Future research will be necessary to fully investigate whether the neural mechanisms underlying effects reported by Georges et al. (2002) are also responsible for the results obtained in Experiment 2.
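The two predicted distortions discussed above follow from simple products of the reported values. The sketch below reproduces that arithmetic and compares each prediction with the ~0.65° stretch observed in Experiment 2. It uses only numbers stated in the text; the variable names are illustrative.

```python
# Values stated in the text.
TRIAL_DURATION_S = 0.141      # six 23.5-ms steps
SPEED_DIFFERENCE = 15.0       # deg/s, parallel vs. perpendicular (Exp. 1)
STIMULUS_SPEED = 64.0         # deg/s
LATENCY_ADVANCE_S = 0.025     # assumed saturated latency advance at step 6
OBSERVED_STRETCH = 0.65       # deg, parallel leading-edge condition (Exp. 2)

# Prediction 1: edges move at constantly different speeds for the whole trial.
constant_speed_stretch = SPEED_DIFFERENCE * TRIAL_DURATION_S

# Prediction 2: a saturated latency advance displaces the leading edge.
latency_stretch = STIMULUS_SPEED * LATENCY_ADVANCE_S

for label, predicted in [("constant speed difference", constant_speed_stretch),
                         ("saturated latency advance", latency_stretch)]:
    ratio = predicted / OBSERVED_STRETCH
    print(f"{label}: {predicted:.2f} deg predicted, "
          f"{ratio:.1f} times the observed stretch")
```

Both predictions exceed the observed distortion by a factor of roughly 2.5 to 3, which is the discrepancy the discussion goes on to address.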

What is more likely is that global form perception relies on multiple sources of information, of which local speed is but one, albeit a strongly contributing source (Bulakowski, Bressler, & Whitney, 2007; Whitney, 2002; Whitney et al., 2003). Other sources of information may arise, for example, from an analysis of the retinotopic spacing of activations across the visual field (Inoue, 1909/2000; Murray, Boyaci, & Kersten, 2006). In the stimuli used here, all four elements that make up the global shape are simultaneously present in the display. As such, neural mechanisms that process the relative positions of the elements independently of their apparent motion may offset the mechanisms underlying the distortions that we report here. The relatively small distortion reported here suggests that these sources of positional information get integrated with local velocity information (De Valois & De Valois, 1988; Shapiro et al., 2010; Tse & Hsieh, 2006; Whitney, 2002; Whitney et al., 2003). It has been demonstrated that V4 is a site for the integration of many types of visual information (Desimone & Schein, 1987). V4 receives input primarily from V2 projections containing information about visual features such as size and orientation (Deyoe & van Essen, 1985; Hubel & Livingstone, 1985; Shipp & Zeki, 1985). This V1 → V2 → V4 pathway plays a critical role in object recognition (Desimone & Schein, 1987). Additionally, V4 receives input from area V3 (Ungerleider, Desimone, & Moran, 1986; Ungerleider, Gattas, Sousa, & Mishkin, 1983), which contains motion-sensitive neurons (Burkhalter, Felleman, Newsome, & van Essen, 1986). The V1–V4 pathway, along with areas MT, MST, and VIP, plays a major role in the ability to judge spatial relationships (Desimone & Schein, 1987; Ungerleider & Desimone, 1986).

The local-orientation-specific interaction of form and motion may also be explained in part by biased feedforward connections to area MT. Long-range horizontal connections in V1 facilitate processing for similarly tuned neurons that respond to an object that is oriented parallel to the motion axis (Field et al., 1993; Georges et al., 2002; Seriès et al., 2002). Neurons in MT that encode perceived speed would subsequently be influenced by the biases in the inputs that they receive from V1 (Georges et al., 2002; Seriès et al., 2002). It is possible that in the processing of global form and motion, these biased motion signals arising in V1 and MT become integrated with other, multiple sources of visual information in area V4 concerning the spatial relationships that comprise a global form (Berzhanskaya, Grossberg, & Mingolla, 2007; Brincat & Connor, 2006; Ditchfield, McKendrick, & Badcock, 2006; Francis & Grossberg, 1995; Van Essen & Gallant, 1994). Thus, the initial misperception of speed may contribute an inaccurate representation of the spatial relationships that form the global object at higher levels of visual processing.

From the findings of this study, we conclude that discrepant and illusory local-motion signals can lead to a distorted percept of global form. Importantly, the illusory local-motion signals can arise because of a specific interaction between the form (or orientation) and motion of the local element. These results add to a growing body of evidence that form information and motion information interact in various and complex ways across multiple levels of visual processing.