Introduction

Due to the rapid and detailed analysis required to distinguish between highly similar faces, face recognition presents a unique challenge for the human mind. It is thought that holistic processing, or paying attention to all aspects of an object, aids in the individuation of visually similar objects such as faces (Richler et al., 2012). Faces are frequently processed at the individual level (e.g., "Paulo"), as opposed to common objects, which are typically identified at the category level (e.g., "cat"; Rosch et al., 1976). This level of processing requires distinguishing between different faces with the same set of features (eyes, nose, and mouth) and a common general configuration (eyes above nose, nose above mouth), requiring perceptual expertise.

On the one hand, Farah and colleagues (Farah, 1991, 1992; Farah et al., 1998; Tanaka & Farah, 1993) proposed that word perception (part-based processing) and face perception (holistic processing) are the two extremes of visual object recognition. On the other hand, word recognition has been proposed to involve discrimination as difficult as that required for face recognition, because typical reading experiences involve rapid identification of letters and their position within words, and words are formed by arranging a fixed number of letters from a limited set with a high degree of self-similarity (Kleinschmidt & Cohen, 2006; Wong et al., 2011).

Holistic processing measures from the face recognition literature have recently been extended to holistic processing of words. As more attention has been paid to visual word recognition within the context of a perceptual expertise framework (e.g., Liu et al., 2016; Ventura, 2014; Wong & Gauthier, 2007), there has been debate about whether holistic processing can also be a marker of perceptual expertise in word recognition (Gauthier et al., 2010; Wong & Gauthier, 2007). Word recognition poses a challenge to the human mind comparable to that of face recognition. As mentioned above, readers must rapidly identify words formed by arranging a fixed number of letters from a limited set with high self-similarity (Kleinschmidt & Cohen, 2006; Wong et al., 2011). In addition, the dual route model of reading states that words can be processed either on a letter-by-letter basis (alphabetic route) or via a direct route towards the word level (orthographic route; Coltheart et al., 2001). The latter route is preferably applied for frequently used words and may involve holistic processing.

The composite effect (Fig. 1) shows that all parts of a visual object are fully processed even if the task requires a decision on one part only. Participants are asked to perform a same-different matching task on a specific visual part (e.g., left half of two words or two faces) of two sequential objects and not on whole objects. Two critical components in this task argue for the holistic processing of objects. The first is the influence of the irrelevant part (e.g., the right half) on performance for the target part (e.g., the left half), that is, a significant congruency effect: performance is better when the response induced by the irrelevant part is congruent with the one induced by the target part (i.e., “same” for both target and distractor parts, or “different” for both) than when it is incongruent (i.e., “same” for target parts and “different” for distractor parts, or vice versa). The second is that the congruency effect is modulated by alignment, that is, it is severely reduced when the two parts of the object are misaligned (e.g., the right part is moved down relative to the left part) rather than aligned, probably because the whole percept is disrupted. This interaction between alignment and congruency is more indicative of holistic processing than the observation of a congruency effect alone, which is tainted by other confounds including response compatibility and decisional processes. For a meta-analysis of studies on holistic face processing using the composite task, see Richler and Gauthier (2014).
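The logic of the two components can be made concrete with a minimal Python sketch. The function names and the d′ values below are hypothetical and serve only as illustration; they are not taken from any of the cited studies.

```python
def is_congruent(target_same: bool, distractor_same: bool) -> bool:
    """A trial is congruent when the irrelevant half calls for the same
    response ("same" or "different") as the target half."""
    return target_same == distractor_same


def congruency_effect(congruent_score: float, incongruent_score: float) -> float:
    """Advantage of congruent over incongruent trials (e.g., in d')."""
    return congruent_score - incongruent_score


def holistic_index(aligned_con: float, aligned_inc: float,
                   misaligned_con: float, misaligned_inc: float) -> float:
    """Alignment x congruency interaction contrast: the congruency effect
    for aligned stimuli minus the congruency effect for misaligned stimuli."""
    return (congruency_effect(aligned_con, aligned_inc)
            - congruency_effect(misaligned_con, misaligned_inc))


# Hypothetical d' values, chosen only to illustrate the computation.
index = holistic_index(aligned_con=2.5, aligned_inc=1.5,
                       misaligned_con=2.25, misaligned_inc=2.0)
```

A positive index corresponds to a larger congruency effect for aligned than for misaligned stimuli, which is the pattern taken as the marker of holistic processing.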

Fig. 1

Illustration of the composite task with left-right face composites. Adapted from Liu and Behrmann (2014)

Composite effects have also been described for fluent readers with English words (Wong et al., 2011), Chinese characters (Chen et al., 2013; Wong et al., 2012), and Portuguese words (Ventura et al., 2017).

In further studies, holistic processing has been consistently found for other non-face objects, but always for experts having extensive practice identifying them, including cars (Gauthier et al., 2000), dogs (Curby et al., 2009; Diamond & Carey, 1986), birds (Diamond & Carey, 1986; Gauthier et al., 2000), fingerprints (Busey & Vanderkolk, 2005), musical notation (Wong & Gauthier, 2010), artificial objects (Rossion et al., 2002), and even inverted faces (Richler, Mack, et al., 2011b).

The discoveries of face-like holistic processing hallmarks for objects of expertise (including words) discussed in the previous paragraphs suggest that this processing style may develop, in part, through extensive experience in identifying those objects. According to the expertise hypothesis, holistic processing is the result of automatized attention to multiple parts of an object, which is developed through extensive discrimination experience (i.e., expertise; Diamond & Carey, 1986; Gauthier & Tarr, 1997; Richler et al., 2011a, b). According to this expertise viewpoint, people process faces holistically because they have learned to individuate faces by attending to multiple facial parts at the same time, most likely because diagnostic information is typically distributed across entire faces (Chua et al., 2015).

However, the expected relationship between face holistic processing and face recognition has been difficult to establish (Richler et al., 2015). One possible reason is that we all have extensive experience individuating faces, which leaves little variation in expertise against which to assess this relationship. Furthermore, manipulating experience with faces to study causes and mechanisms underlying holistic effects is a difficult endeavor (cf. Gauthier, 2010). Artificial objects, on the other hand, are uniquely suited to investigate the effect of experience and the causes of holistic effects (Gauthier, 2010). Indeed, recent research manipulating the degree of experience with different artificial objects revealed that domain-specific (degree of) experience leads to holistic processing (Chua & Gauthier, 2020).

Nonetheless, there have been demonstrations of holistic processing with much less involvement of experience. In the absence of expertise, Zhao et al. (2016) identified face-like holistic processing for unique Gestalt line patterns. Line patterns containing salient Gestalt information (i.e., connectedness, closure, and continuity between components) were processed as holistically as faces without any training. As a result, face-like holistic processing extends beyond faces and objects of expertise. These findings lend support to a dual-route explanation of holistic processing (Zhao et al., 2016), which includes a stimulus-based and an experience-based approach to holistic processing. Both the bottom-up and the experience-based routes are used to execute holistic processing. Observer-based experience shapes what constitutes perceptual Gestalts and how sensitive the visual system is to them, whereas object-based Gestalt information offers a perceptual basis for holistic processing (i.e., bottom-up way).

Curby and Moerel (2019) investigated whether a trade-off in holistic processing indices for faces and Gestalt stimuli can be observed in a task designed to tap an overlap in early perceptual processing stages supporting Gestalt perception. They found evidence of reciprocal interference between the two stimuli. Faces were processed less holistically when an aligned Gestalt line pattern (i.e., processed holistically) was overlaid compared to a misaligned Gestalt line pattern (i.e., not processed holistically). Gestalt line patterns were processed less holistically when an aligned face (i.e., processed holistically) was overlaid compared to a misaligned face (i.e., not processed holistically). This overlap in the holistic processing of faces and Gestalt line patterns suggests an overlap at earlier, automatic, more perceptual processing stages that could support a stimulus-based Gestalt contribution to holistic processing. Curby and Moerel (2019) drew attention to the fact that the interference they found between holistic processing of line stimuli and faces is present even though the processing of only one of the stimuli is task-relevant. This suggests that participants were obliged to process both stimuli, and potentially did so automatically. This provides further support for the locus of this type of interference being in the early processing stages. The evidence found by Curby and Moerel (2019) for early, stimulus-based holistic processing adds to the experience-based approach to holistic processing.

For words, there seems to be some evidence for a similar dual-route account, which adds late, experience-based holistic effects to earlier holistic effects. When several previous findings (Ventura et al., 2017, 2019, 2020b) are considered together, they suggest that holistic word processing is related to abstract, late, lexical factors, depending on extensive experience with words. Ventura et al. (2019) showed that holistic processing of words is involved in fast parallel word reading. Ventura et al. (2017), for example, found specific evidence for a late locus of word holistic effects by evaluating whether low-level visual aspects of words affected the word-composite effect. More specifically, Ventura et al. (2017) found that the composite effect was comparable for words differing in specific visual structure (different fonts: courier; alternating CaSe, and manuscript font), implying an involvement at the level of abstract lexical representations. These findings are also supported by recent research showing that the VWFA's (Visual Word Form Area) lateral anterior region is sensitive to lexical properties and underpins holistic word representations, whereas the VWFA's most posterior region is sensitive to sublexical orthographic representations (Bouhali et al., 2019; Lerma-Usabiaga et al., 2018).

But there appears to be some evidence for another, earlier, locus of the word-composite effect. Indeed, Chen et al. (2013) demonstrated in an event-related potential (ERP) study using Chinese characters that holistic processing of words has an earlier neurophysiological correlate (P1) than that commonly found for face holistic processing (N170; e.g., Jacques & Rossion, 2009), implying the involvement of early visual processes. However, the P1 component found by Chen et al. (2013) in a composite task with Chinese characters is usually larger for attended stimuli than for unattended stimuli, a phenomenon known as the P1 attention effect. As a result, P1 is thought to represent the first sensory and attentional processing of a stimulus by specific areas of the cerebral cortex (e.g., Klimesch, 2011) rather than an early stage of holistic word processing. In another study, Chen et al. (2016) found that changes in exposure duration between 170 and 600 ms did not result in significant changes in the holistic word effect, suggesting an early locus of holistic word processing. Nevertheless, holistic face processing has been observed for durations as short as 50 ms (Richler et al., 2011a, b), and such a short duration was not investigated by Chen et al. (2016).

Nevertheless, early effects of word holistic processing might be related to Gestalt information. Salient Gestalt information (i.e., connectedness, closure, and continuity between parts) is important for letter identification (Pelli et al., 2009) and for word identification (Pelli et al., 2006). Pelli et al. (2009) found that letter identification is inversely proportional to the violation of the Gestalt law of good continuation. The authors found that the efficiency of letter identification, measured through threshold contrast for identifying the letters in visual noise, was determined by the disruption of good continuation promoted by shape perturbation (e.g., grated orientation, phase or offset perturbation).

Thus, an overlap may still exist at an early level in the mechanisms supporting Gestalt stimulus-based contributions to holistic processing and early, Gestalt-based word holistic processing. This type of overlap would be better tapped with a task where the locus of the overlap is at earlier, more perceptual processing stages. The objective of the present study was thus to evaluate whether a trade-off in holistic processing indices for words and Gestalt stimuli could be observed in a task equivalent to the one previously used with faces and Gestalt stimuli by Curby and Moerel (2019). The same Gestalt stimuli (Zhao et al., 2016) were used. Words were presented in Tracker font. This font increases the importance of Gestalt information (i.e., connectedness, closure, and continuity between parts) for words and increases the possibility of shared mechanisms supporting Gestalt stimulus-based contributions to holistic processing of line patterns and word holistic processing. Words were presented overlaid on a Gestalt line pattern.

Considering hypothetical results, three possibilities might arise. There might be no trade-off in holistic processing indices for words and Gestalt stimuli, which would suggest that words are not processed holistically at this early Gestalt-dependent perceptual stage. Alternatively, and as for faces and Gestalt line patterns, there might be a mutual trade-off in holistic processing indices for words and Gestalt stimuli. In this case, words would be processed less holistically when an aligned (i.e., processed holistically), compared with a misaligned (i.e., not processed holistically), Gestalt line pattern was overlaid, and Gestalt line patterns would be processed less holistically when an aligned (i.e., processed holistically), compared to a misaligned (i.e., not processed holistically), word was overlaid. This would imply shared resources, that is, Gestalt information (i.e., connectedness, closure, and continuity between parts) for holistic processing of words and Gestalt line patterns. Finally, it is possible to foresee a non-reciprocal interference between holistic word processing and novel Gestalt line pattern stimuli. When an aligned word is overlaid on a line pattern, the line pattern may be processed less holistically than when a misaligned word is overlaid. However, when an aligned line pattern is overlaid on a word, the word may not be processed less holistically.

This last prediction is based on the notion that words are a more cohesive unit than Gestalt line patterns. This cohesiveness probably reflects the need to deal effectively with many letters in rapid temporal succession and the fact that in usual reading contexts, words appear close to other words both spatially (in terms of location) and temporally (as many words are recognized within a short time window). The human brain, therefore, needs to ensure that letters belonging to the same word are grouped into the same perceptual unit rather than mixed with letters from neighboring words. Such an organization of the complex visual display of letters into representational objects may be crucial in satisfying the highly demanding visual task of word recognition and reading.

A further aspect contributing to higher cohesiveness of words might be feedback from lexical levels to early perceptual ones. There is evidence suggesting that whole-object level representations facilitate early individual part processing, such as the findings on the part-whole effect for words (i.e., better letter identification in strings when the string is a word vs. a pseudoword or a nonword; Reicher, 1969; Wheeler, 1970). The advantage of the word context has been interpreted as a top-down influence of whole-word representations at the orthographic level on the letter identification level (McClelland & Rumelhart, 1981). Finally, words have phonology and semantics that might help the cohesiveness of words through re-entrant feedback to the orthographic level both directly and indirectly (Seidenberg & McClelland, 1989).

Experiment 1

In Experiment 1, the task was to attend to line patterns and ignore words, and the corresponding within-subject factors were line pattern alignment, line pattern congruency, and word alignment. Holistic processing of line patterns is reflected by an interaction between the first two factors (line pattern alignment × line pattern congruency). The question of interest is whether the task-irrelevant word alignment interferes with the holistic processing of line patterns. Thus, the critical comparison is the magnitude of holistic processing of line patterns under different word contexts (word-aligned vs. word-misaligned).

Method

Participants

To perform a power analysis, we took into consideration that in Curby and Moerel (2019) the critical three-way interaction was only present, for each experiment, in one of the analyses, either d′ or response time. In the sensitivity analysis of Experiment 1 of Curby and Moerel (2019), the effect size was ηp2 = .26. According to MorePower 6.0.4 (Campbell & Thompson, 2012), a sample size of 32 would be required to detect that effect at α = 0.05 with a power of 0.9 for a repeated-measures ANOVA with three repeated-measures factors, each with two levels.

Despite the predetermined minimum sample size of 32, all students enrolled in a psychology course in Faculdade de Psicologia of Universidade de Lisboa were invited to participate due to anticipated data exclusion. Seventy-three participants accepted the invitation and took part in the Experiment. Eleven participants were eliminated (see below).

The study's protocol adhered to the guidelines of the Declaration of Helsinki and the Portuguese deontological regulation for Psychology and was approved by the Deontological Committee of Faculdade de Psicologia of Universidade de Lisboa. All participants provided informed consent.

Stimuli

Composite line patterns

Twenty-four pairs of line stimuli created by Zhao et al. (2016) were used to create composite line patterns. Each pair had two top halves and two bottom halves, and each part was 7.3 cm wide and 3.6 cm high. Both top halves of a pair could be matched with both bottom halves while retaining the Gestalt information (e.g., the continuity of the lines). Thus, each of the 24 pairs of line stimuli allowed for the creation of 16 trials, meaning a total of 384 trials.

Instead of grey, the line patterns were made transparent (transparency = 65%) and blue (RGB: 4, 59, 132) so they could be seen when overlaid on the greyscale words. Because the words that would be superimposed on the Gestalt line patterns have a horizontal orientation and are divided into a left and a right half, Gestalt line patterns were rotated counterclockwise, and we considered a horizontal orientation and divided line patterns into a left and a right half (total figure aligned: 248 × 248 pixels; total figure misaligned: 248 × 306 pixels).

Composite words

For this task, 24 sets of four Consonant-Vowel.Consonant-Vowel (CV.CV) Portuguese words in Tracker font were used. Words were divided into a left and a right half, between the second and third letter (total word aligned: 242 × 184 pixels; total word misaligned: 242 × 264 pixels). Within each set, the left and right halves of each word were interchanged to create the four words resulting from the orthogonal manipulation of response (same; different) and congruency (congruent; incongruent): for example, vida, muro, viro, muda. Each word appeared both as study and as test stimuli. Thus, the same distractor parts were used in different conditions (congruent and incongruent). Each set of four words allowed for the creation of 16 trials, thus a total of 384 trials.
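The construction of one word set and its trials can be sketched as follows. This is a minimal illustration, assuming that the 16 trials per set are all ordered study-test pairings of the four words (the text does not state this explicitly); the function names are hypothetical.

```python
from itertools import product

N_SETS = 24  # number of word sets reported in the text


def make_word_set(left_halves, right_halves):
    """Cross two left halves with two right halves to obtain four words."""
    return [left + right for left, right in product(left_halves, right_halves)]


def make_trials(words, split=2):
    """Pair every word (study) with every word (test) and label each trial.

    The target is the left half (first `split` letters); the distractor
    is the right half.
    """
    trials = []
    for study, test in product(words, repeat=2):
        target_same = study[:split] == test[:split]
        distractor_same = study[split:] == test[split:]
        trials.append({
            "study": study,
            "test": test,
            "response": "same" if target_same else "different",
            "congruency": ("congruent" if target_same == distractor_same
                           else "incongruent"),
        })
    return trials


words = make_word_set(("vi", "mu"), ("da", "ro"))
trials = make_trials(words)
total_trials = N_SETS * len(trials)
```

Under this assumption, each set yields four trials in each response × congruency cell, and the 24 sets yield the 384 trials reported.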

Design and procedure

Participants completed a composite task, with a total of 384 trials divided over four blocks. The experiment was implemented in E-Prime Go and run entirely online. In each trial, line composite images were presented with word composite images overlaid on top (see Fig. 2). A 2-pixel vertical blue line separated the second and third letters of a word and the left and right halves of the Gestalt line patterns. Each trial proceeded as follows (see Fig. 2): (1) fixation screen (500 ms, not shown in figure), (2) first stimulus (i.e., line pattern composite with word composite overlaid; 250 ms), (3) pattern mask (500 ms), (4) second stimulus (i.e., line pattern composite with word composite overlaid; 250 ms). Participants were instructed to make same-different judgments on the left half of the line patterns while ignoring the right halves and the overlaid words. Before starting the experiment, participants completed 16 practice trials.

Fig. 2

Trial structure used for the modified composite task. Composite word stimuli were presented overlaid with either aligned (left) or misaligned (right) line patterns. In Experiment 1, participants made judgments on whether the left halves of the line patterns were the same or different. In Experiment 2, participants made judgments on whether the left halves of the words were the same or different

The word and line stimuli parts were either aligned or misaligned, resulting in four stimulus conditions. The conditions were blocked (both stimuli aligned, both stimuli misaligned, and one of the stimuli aligned and the other misaligned, and vice-versa) and block order was randomized (96 trials per block). The congruency for the task-irrelevant word images was counterbalanced with the congruency for the line patterns within each block. Correct responses to words (same/different) were also counterbalanced with correct responses for the line patterns within each block.
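The blocked design can be sketched in Python as below. The sketch assumes that counterbalancing within a block means fully crossing line congruency, word congruency, line response, and word response; the text states only that these factors were counterbalanced, so the exact crossing, like the function names, is an assumption.

```python
import random
from itertools import product

ALIGNMENTS = ("aligned", "misaligned")
TRIALS_PER_BLOCK = 96


def make_block_order(seed=None):
    """Return the four line/word alignment blocks in random order."""
    rng = random.Random(seed)
    blocks = [{"line_alignment": line, "word_alignment": word}
              for line, word in product(ALIGNMENTS, repeat=2)]
    rng.shuffle(blocks)
    return blocks


def make_block_cells(trials_per_block=TRIALS_PER_BLOCK):
    """Fully cross congruency and correct response for lines and words
    (assumed scheme), repeating the cells to fill one block."""
    cells = list(product(("congruent", "incongruent"),   # line congruency
                         ("congruent", "incongruent"),   # word congruency
                         ("same", "different"),          # line response
                         ("same", "different")))         # word response
    repetitions, remainder = divmod(trials_per_block, len(cells))
    if remainder:
        raise ValueError("block size must be a multiple of the cell count")
    return cells * repetitions


block_order = make_block_order(seed=1)
block_cells = make_block_cells()
```

With 16 cells repeated six times, each block holds 96 trials, and the four blocks together give the 384 trials of the experiment.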

Results and discussion

Mean response times were examined for outliers (mean RT > 2.5 SD from group mean), with four participants eliminated from further analysis. Seven participants had poor sensitivity performance (mean d′ < 0) and were also eliminated from further analysis.
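For reference, the sensitivity measure and the two exclusion criteria can be sketched as follows. Several details are assumptions not stated in the text: "same" responses on same trials are treated as hits, a log-linear correction is applied to avoid infinite z-scores, and the 2.5 SD criterion is applied in both directions using the sample standard deviation. All names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to counts, 1 to totals) is used here
    as an assumption; the text does not say how extreme rates were handled.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def retained_participants(mean_rts, mean_dprimes, sd_cutoff=2.5):
    """Drop participants whose mean RT lies more than `sd_cutoff` SDs from
    the group mean, or whose mean d' is below zero."""
    group_mean = mean(mean_rts.values())
    group_sd = stdev(mean_rts.values())
    kept = []
    for pid, rt in mean_rts.items():
        rt_outlier = abs(rt - group_mean) > sd_cutoff * group_sd
        poor_sensitivity = mean_dprimes[pid] < 0
        if not (rt_outlier or poor_sensitivity):
            kept.append(pid)
    return kept
```

A negative mean d′ indicates that a participant responded "same" more often on different trials than on same trials, which is why it is treated as a sign of poor performance.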

Sensitivity analysis

A 2 (line alignment: aligned, misaligned) × 2 (line congruency: congruent, incongruent) × 2 (word alignment: aligned, misaligned) ANOVA performed on the sensitivity (d′) scores revealed a word alignment effect, F(1, 61) = 9.82, p ≤ .005, ηp2 = .13, a line congruency effect, F(1, 61) = 28.93, p ≤ .0001, ηp2 = .32, and an interaction between line congruency and line alignment, F(1, 61) = 4.02, p = .05, ηp2 = .06. There was also a three-way interaction between line alignment, line congruency, and word alignment, F(1, 61) = 7.96, p = .006, ηp2 = .12. This three-way interaction is the crucial effect. To probe the underlying source of the three-way interaction, the data from the trials where the words were aligned and those where they were misaligned were analyzed separately. Holistic processing is indicated by the presence of an interaction between line alignment and line congruency, with a greater congruency effect for aligned than for misaligned line patterns. Thus, the presence or absence of this interaction indicates whether holistic processing of the line patterns is occurring.
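As a quick consistency check, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three-way interaction, taking the error degrees of freedom from the retained sample (73 participants minus 11 exclusions); the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


n_retained = 73 - 11          # participants tested minus participants excluded
df_error = n_retained - 1     # all factors are within-subject with two levels

eta_three_way = partial_eta_squared(7.96, 1, df_error)
eta_congruency = partial_eta_squared(28.93, 1, df_error)
```

Rounded to two decimals, these values are .12 and .32, matching the effect sizes reported for the three-way interaction and the line congruency effect.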

The 2 (line alignment: aligned, misaligned) × 2 (line congruency: congruent, incongruent) ANOVA of the misaligned word trials found no main effect of line pattern alignment, F(1, 61) = 1.50, p = .22, ηp2 = .02, but there was a main effect of line congruency, F(1, 61) = 21.32, p ≤ .0001, ηp2 = .26, and an interaction between line congruency and line pattern alignment, F(1, 61) = 22.46, p ≤ .0001, ηp2 = .27. Thus, line patterns in the presence of the misaligned words showed evidence of being holistically processed (see Fig. 3b). In contrast, the 2 (line pattern alignment) × 2 (line congruency) ANOVA of the aligned word trials revealed a main effect of congruency, F(1, 61) = 17.03, p ≤ .001, ηp2 = .22, but no main effect of line pattern alignment, F < 1, or interaction between line pattern alignment and congruency, F < 1. Thus, while the key marker of holistic processing was present when line patterns were processed in the presence of misaligned words (see Fig. 3b), it was no longer present when they were processed in the context of the aligned word stimuli (see Fig. 3a).

Fig. 3

Mean sensitivity (d′) for the congruent and incongruent conditions, for the words (a – aligned; b – misaligned) overlaid with aligned and misaligned line patterns in Experiment 1. Standard error scores are presented

Response-time analysis

Trials with a response time < 200 ms or > 1,750 ms (< 1% of trials; similar to the criteria used in Curby & Moerel, 2019) were removed from the data. A 2 (line alignment: aligned, misaligned) × 2 (line congruency: congruent, incongruent) × 2 (word alignment: aligned, misaligned) ANOVA was performed on the remaining RT data from correct trials.
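The trimming step can be sketched in Python as follows. The trial representation (a dictionary with `rt` in milliseconds and a `correct` flag) and the function name are hypothetical; only the two cutoffs come from the text.

```python
def trim_rt_trials(trials, lower_ms=200, upper_ms=1750):
    """Remove trials with RT < lower_ms or > upper_ms, then keep only
    correct trials for the RT analysis.

    Returns the retained correct trials and the proportion of trials
    removed by the RT cutoffs.
    """
    within_cutoffs = [t for t in trials if lower_ms <= t["rt"] <= upper_ms]
    proportion_removed = 1 - len(within_cutoffs) / len(trials)
    correct_trials = [t for t in within_cutoffs if t["correct"]]
    return correct_trials, proportion_removed
```

The proportion removed is computed over all trials before the accuracy filter, so that it reflects the RT cutoffs alone.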

The analysis (cf. Fig. 4a and b) revealed a line congruency effect, F(1, 61) = 11.5, p ≤ .001, ηp2 = .16. No other effect was significant, including the three-way interaction between word alignment, line alignment, and line congruency, all ps > .24. Previous studies looking at holistic processing have found markers of holistic processing in RT, but not d′ or vice versa. Here, as in the equivalent Experiment 1 of Curby and Moerel (2019), the effects were significant in d′ but not in RT.

Fig. 4

Mean response times for the congruent and incongruent conditions, for the words (a – aligned; b – misaligned) overlaid with aligned and misaligned line patterns in Experiment 1. Standard error scores are presented

Experiment 2

To test the degree to which line pattern alignment interferes with holistic word processing, Experiment 2 was identical to Experiment 1 except that the task was to attend to words and ignore line patterns. The overlaid aligned and misaligned line patterns create high and low interference conditions, respectively.

Method

Participants

We retained the indication of a sample size of 32 from the power analysis in Experiment 1. Nevertheless, all students enrolled in a psychology course in Faculdade de Psicologia of Universidade de Lisboa were invited to participate due to anticipated data exclusion. Eighty-eight participants accepted the invitation and took part in the Experiment. Five participants were eliminated (see below).

This study's protocol adhered to the guidelines of the Declaration of Helsinki and the Portuguese deontological regulation for Psychology and was approved by the Deontological Committee of Faculdade de Psicologia of Universidade de Lisboa. All participants provided written informed consent.

Design and procedure

The design and procedure were identical to Experiment 1, except for the fact that participants were instructed to make judgements about the words instead of the line patterns.

Results and discussion

Mean response times were examined for outliers (mean RT > 2.5 SD from group mean), resulting in four participants being eliminated from further analysis. One participant had poor sensitivity performance (mean d′ < 0) and was eliminated from further analysis.

Sensitivity analysis

A 2 (word congruency: congruent, incongruent) × 2 (word alignment: aligned, misaligned) × 2 (line alignment: aligned, misaligned) ANOVA performed on the sensitivity (d′) scores revealed a word congruency effect, F(1, 82) = 11.85, p < .001, ηp2 = .13. The interaction of word alignment × word congruency, indicative of holistic processing, was marginally significant, F(1, 82) = 3.24, p = .08, ηp2 = .04. No other effects or interactions were significant (all ps > .15). The non-significant three-way interaction (p = .94) indicates that holistic processing of words is not influenced by the alignment of line patterns (cf. Fig. 5a and b).

Fig. 5

Mean sensitivity (d′) for the congruent and incongruent conditions, for the line stimuli (a – aligned; b – misaligned) overlaid with aligned and misaligned words in Experiment 2. Standard error scores are presented

Response-time analysis

Trials with a response time < 200 ms or > 1,750 ms (< 1% of trials; similar to the criteria in Curby & Moerel, 2019) were removed from the data. A 2 (word congruency: congruent, incongruent) × 2 (word alignment: aligned, misaligned) × 2 (line alignment: aligned, misaligned) ANOVA was performed on the remaining RT data from correct trials (see Fig. 6a and b).

Fig. 6

Mean response times for the congruent and incongruent conditions, for the line stimuli (a – aligned; b – misaligned) overlaid with aligned and misaligned words in Experiment 2. Standard error scores are presented

There was a word alignment effect, F(1, 82) = 4.80, p = .03, ηp2 = .06, and a word congruency effect, F(1, 82) = 8.08, p < .01, ηp2 = .09. The interaction of word alignment and word congruency, indicative of holistic processing, was significant, F(1, 82) = 5.34, p = .02, ηp2 = .07. This interaction was not modulated by line pattern alignment (three-way interaction, F < 1). Thus, it seems that holistic processing of words is not affected by holistic processing of line patterns.

General discussion

In previous studies (Ventura et al., 2017, 2019, 2020a, b), we provided demonstrations of a late, lexical, locus of holistic word processing. Evidence for an earlier locus of word holistic processing is far from clear (Chen et al., 2013; Chen et al., 2016). In the present study, evidence for an early, bottom-up stage of holistic word processing comes from the fact that holistic word processing influences Gestalt line pattern holistic processing. When an aligned word was overlaid on a line pattern, the line pattern was processed less holistically than when a misaligned word was overlaid. This means that an aligned word (processed holistically) affects holistic processing of Gestalt line patterns for which participants had no previous expertise. Findings of interference from holistic processing of words and stimuli strong in Gestalt information suggest that the mechanisms supporting holistic perception of words and novel Gestalt stimuli are not completely independent. This interference must occur at an early level of holistic processing reflecting the type of information that is available for Gestalt line patterns, that is, salient Gestalt information (i.e., connectedness, closure, and continuity between parts). Given that participants had no experience with these line patterns, this precludes an experience-based contribution to holistic processing.

It is worth noting that interference between holistic processing of words and line stimuli exists even though only one of the inputs is task-relevant. This implies that participants were obliged to process both stimuli and probably did so automatically. This adds to the evidence that the source of the interference is in the early rather than late phases of processing. It is critical to emphasize that the interference between word processing and Gestalt line stimuli is specifically in terms of holistic processing, as it is modulated by the extent to which the stimuli draw on holistic processing capacity, that is, by whether the overlaid word stimuli are aligned and thus processed holistically. As a result, these findings cannot be explained by a general overlap of visual stimuli. Unlike Curby and Moerel (2019), who found reciprocal interference between the holistic processing of Gestalt line patterns and faces, we found that when an aligned line pattern was overlaid on a word (vs. when a misaligned line pattern was overlaid on a word), the word was not processed less holistically.

The overlap in the processing of faces and stimuli strong in Gestalt cues occurs via early, perceptual grouping contributions to holistic perception (Zhao et al., 2016). The dual route account of Zhao et al. (2016) proposes two routes to holistic processing: a stimulus-based route and an experience-based route. The processing of faces and novel Gestalt stimuli does not show evidence of a functional overlap in a task tapping experience-based contributions to holistic processing (Curby et al., 2019), that is, a task in which participants must simultaneously hold a face and a Gestalt stimulus in working memory. This task is not ideal for detecting an overlap at earlier, more perceptual stages of processing, such as those underlying perceptual grouping. An overlap between faces and line patterns is nonetheless present in the mechanisms supporting stimulus-based contributions to holistic processing, as revealed by a task in which the locus of the overlap lies at earlier, more perceptual processing stages (Curby & Moerel, 2019).

These results suggest that there are multiple paths to holistic perception, one dependent on experience and the other on stimulus-level properties (Zhao et al., 2016). Therefore, the relative strength and robustness of holistic face perception may arise from the ability of face stimuli to commandeer both paths. A similar model may account for our results with words. We know from previous studies (e.g., Ventura et al., 2017) that there is a late, experience-dependent locus to word holistic processing. Consistently, the lateral anterior region of the VWFA is sensitive to lexical properties, underpins holistic word representations, and has greater connectivity to language and conceptual neural networks (Bouhali et al., 2019; Lerma-Usabiaga et al., 2018). The novel nature of the line stimuli precludes an experience-based account of these holistic effects. The overlap in the processing of words and stimuli strong in Gestalt cues occurs necessarily via early, perceptual grouping contributions to holistic perception given that no other information characterizes line stimuli (Zhao et al., 2016). But the added cohesiveness of words, or higher interconnectivity between elements, would make them more resistant to the reciprocal early influence of line patterns.

The possibility of a non-reciprocal pattern of interference between words and Gestalt line patterns was anticipated in the Introduction and was based on the notion that words may constitute a more cohesive unit than Gestalt line patterns, meaning a heightened inter-connectivity between visual components. Robust holistic processing is at the root of word recognition and, by extension, orthographic reading. The human brain must ensure that letters from the same word are grouped into the same perceptual unit rather than being mixed with letters from neighboring words. Thus, it is crucial for words to have a high degree of cohesiveness. Linguistic factors may also contribute to this cohesiveness: according to statistical learning research (Orbán et al., 2008), linguistic regularities such as word frequency and transitional probabilities between sub-lexical units may lead to the construction of chunks at the whole-word level. Such an organization of the complex visual display of letters into representational objects may be crucial in satisfying the highly demanding visual task of word recognition and reading. This higher cohesiveness would make it more difficult for line patterns to influence the holistic processing of words. Reading, with its rapid spatial and temporal scanning, is a more demanding task than face recognition, which entails the combination of a limited set of facial features with a broadly similar spatial arrangement. Faces are most probably less cohesive because they do not need to be processed in quick temporal and spatial sequences.

In addition, feedback from lexical processing to early perceptual processing contributes to word cohesiveness (Reicher, 1969; Wheeler, 1970). Finally, words have phonology and semantics that might enhance the perceived cohesiveness of words through re-entrant feedback to the orthographic level, both directly and indirectly (Seidenberg & McClelland, 1989). Curby and Moerel (2019) would likely have observed the same asymmetrical interference had they used famous faces (rather than unfamiliar faces), since famous faces are associated with both phonological and semantic information.

Word processing includes both early, perceptual-based (orthographic) and late, experience-based (phonological/semantic) components, whereas line-pattern processing is primarily driven by perceptual variables. This distinction may lead to the uneven influence of holistic processing between the two object types. We should acknowledge that our results refer to the idea of multiple paths to holistic perception, one dependent on experience and the other on stimulus-level properties (Zhao et al., 2016). Nevertheless, the early, perceptual-based (orthographic) and late, experience-based (phonological/semantic) components might map onto early and late holistic processing of words, respectively. In addition, the dual-route model of reading states that words can be processed either on a letter-by-letter basis (alphabetic route) or via a direct route towards the word level (orthographic route; Coltheart et al., 2001). The latter route is preferentially applied to frequently used words, may involve holistic processing, and may also reflect experience-dependent holistic word processing.

One additional difference between the processing of words and line patterns is the degree of automaticity/dominance with which the two classes of stimuli are processed. It has long been assumed that the foundation of skilled reading is the automation of the association between printed symbols and spoken language (e.g., Blau et al., 2009, 2010; Blomert, 2011; Harm & Seidenberg, 1999; Seidenberg & McClelland, 1989). Joo et al. (2021) recently used magnetoencephalography to measure word-selective responses across multiple cognitive tasks. Even when attention was diverted away from the words by performing an attention-demanding fixation task, strong word-selective responses were observed in a language region (i.e., superior temporal gyrus) beginning 300 ms after stimulus onset. Importantly, this automatic word-selective response was predictive of individual reading ability. These findings imply that skilled reading is characterized by the automatic recruitment of spoken-language circuits; with practice, reading becomes effortless as the brain learns to automatically translate letters into sounds and meaning.

The paradigm we adopted may be less suited to testing the idea of overlapping resources than to testing how automatic and robust the holistic processing of line patterns and words is. Even if the two types of stimuli recruit the same resources for holistic processing, they do not necessarily influence each other when one is task-relevant and the other is not. Thus, one plausible difference between line-pattern and word processing is the automaticity/dominance with which word stimuli are processed, and thus their ability to interfere with line-pattern processing (provided the words are intact, i.e., not misaligned). The dominance (speed and automaticity of processing) of word stimuli would also protect them from interference from line patterns to some extent. This asymmetrical interference resembles the Stroop effect, in which words influence font-color naming but font color does not influence word reading.

Considering a dual-route approach to word holistic processing, late, experience-dependent word holistic processing may support the workings of the VWFA. It is generally agreed that the VWFA intervenes in the efficient identification of orthographic stimuli (Dehaene et al., 2001) and enables quick association of such stimuli with phonological and lexical information (Hashimoto & Sakai, 2004). In expert alphabetic readers, the VWFA is organized in a posterior-to-anterior hierarchy (Dehaene et al., 2004; Thesen et al., 2012; Vinckier et al., 2007): posterior parts respond to individual letters (thus underpinning sublexical representations) irrespective of case (Dehaene et al., 2004; Thesen et al., 2012), and as such letters are abstract units at this level. The lateral anterior region is sensitive to lexical properties, underpins holistic word representations, and has greater connectivity to language and conceptual neural networks (Bouhali et al., 2019; Lerma-Usabiaga et al., 2018). Later holistic processing may intervene to bind together the individual letters that activate the posterior part of the VWFA, providing the input that activates more anterior parts of the VWFA. Recent literature shows the relevance of holistic visual word representations in reading (Ventura et al., 2020a, b). Indeed, holistic processing of visual words is associated with efficient access to the orthographic lexicon among adult fluent readers. Specifically, individual differences in the word-composite effect were correlated with those in the word-frequency effect measured in an independent lexical decision task (Ventura et al., 2020a, b). This word-frequency effect reflects efficient access to the lexicon.

In the present study, we found evidence for early perceptual grouping contributions to word holistic perception, given that no other information characterizes line stimuli (Zhao et al., 2016). Salient Gestalt information (i.e., connectedness, closure, and continuity between parts) is important for letter identification (Pelli et al., 2009) and for word identification (Pelli et al., 2006). Although speculative, it is possible that holistic processing at an early level, reflecting Gestalt information processing, helps to hold together the different segments that make up a letter and to keep the letters of the same word (between two blank spaces) together. Further studies are needed to investigate this proposition.

We should acknowledge that the visual properties of the words (e.g., darker color) may also have contributed to the differential pattern of interference: in our task, words might be visually more dominant while Gestalt line patterns might be less dominant, more like a visual background for the words. Another factor that might have contributed to the pattern of results is that the task for the words is very easy and might not be sensitive to the potential influence of line-pattern processing. The interference from words to line patterns, but not from line patterns to words, is related to holistic processing, that is, to the alignment of the word stimuli. Thus, our results cannot be explained in terms of the general expected overlap between the processing of any visual stimuli. If this were true, the aligned and misaligned word stimuli should have been equally effective in interfering with concurrent processing of stimuli from the other stimulus class.

In conclusion, we found evidence of an early, perceptual grouping stage of holistic processing of words, which adds to previous evidence of a late, lexical locus of holistic processing (e.g., Ventura et al., 2017). Line stimuli were processed less holistically when an aligned, compared with a misaligned, word was overlaid, but words were not processed less holistically when an aligned, compared with a misaligned, line stimulus was overlaid. We discussed several factors that might protect words from interference from holistic processing of line patterns and promote their ability to interfere with line-pattern processing (provided the words are intact, i.e., not misaligned): the higher cohesiveness of words, the higher automaticity/dominance of words, feedback from lexical processing, and the fact that words have phonology and semantics. Although all these factors make theoretical sense in explaining our pattern of results, it is of course possible that some unforeseen factors also contribute to the results. In this vein, it would be interesting to run a similar study using known faces, which would arguably have phonological and semantic information associated with them. If our interpretation is correct, and phonology and semantics contribute to the cohesiveness of faces as well as to the cohesiveness of words, we would observe an asymmetrical pattern of interference between famous faces and Gestalt line patterns. Our study expands the literature on the cognitive mechanisms underlying early holistic processing of faces and non-face stimuli, considering factors that might contribute to both word and face holistic processing (phonology and semantics in known faces) or predominantly to word holistic processing (e.g., higher cohesiveness to ensure that letters from the same word are grouped into the same perceptual unit rather than being mixed with letters from neighboring words).
Finally, considering together the results of Curby and Moerel (2019) with faces and our results with words, we might consider the possibility that words and faces share early, Gestalt-dependent, holistic mechanisms. Using a similar paradigm of superimposition, but with faces and words, would also allow us to contribute to the debate regarding whether word and face recognition rely on shared or dissociable neural resources and cognitive processes (for recent reviews, see Burns & Bukach, 2021; Rossion & Lochy, 2021).