The neural time course of art perception: An ERP study on the processing of style versus content in art
Highlights
► Style and content: key features to study processes of art perception.
► Style: art-specific, content: similar to “normal” object and scene perception.
► Current study shows: style follows content in terms of underlying neural processes.
► Style available for classification at ca. 224 ms or between 40 and 94 ms after content.
► Paradigm employed is promising to further uncover the time course of art perception.
Introduction
Many people report that art constitutes an important part of their lives. It inspires and fascinates them. Consequently, the phenomenon of art is puzzling not only to art historians but also to cognitive researchers and neuroscientists, whose interest in the possible sources and neural correlates of such fascination is reflected in a significant number of relatively recent publications on art perception and aesthetics (e.g., Cela-Conde et al., 2004, Chatterjee, 2003, Di Dio et al., 2007, Jacobsen and Hofel, 2003, Jacobsen et al., 2006, Leder et al., 2004, Muller et al., 2010, Nadal et al., 2008, Ramachandran and Hirstein, 1999, Redies, 2007). A review of many of these contributions was recently provided by Chatterjee (2011).
From a vision scientist's point of view, the central question regarding art is even more basic than questions about fascination – but equally unsolved: what is specific about art perception, i.e., what differentiates it from normal object and scene perception? One aspect that has been proposed in this respect is the way in which ambiguities are resolved (Mamassian, 2008) – either with a view to prior constraints (“normal” vision) or by conventions (art). Another important aspect differentiating art perception from normal object and scene perception is the presence and relevance of (artistic) style (Augustin, Leder, Hutzler, & Carbon, 2008). In representational art, content (motif) is closely related to objects and scenes in our surroundings. In contrast, style, the way in which something is depicted, is generally of very little relevance in object and scene perception. We may, for instance, have to find our way or recognise an object through fog or a snowstorm, but in such cases most of us would probably regard the fog or snow as random noise in perceptual terms – which hardly anyone would claim with respect to style in art. Rather, style is an essential aspect of an artwork, not only from an art historical point of view but also with respect to visual perception and cognition. For example, it has been shown that even people without special expertise in the arts are able to successfully judge the style-related similarity of artworks (Cupchik, Winston, & Herz, 1992), are sensitive to style across different media (Hasenfus, Martindale, & Birnbaum, 1983) and refer to both content and style when asked to judge the similarity of representational artworks (Augustin et al., 2008).
By comparing the processing of these two essential perceptual aspects of representational artworks scientists have the opportunity to contrast general processes of vision with art-specific processes, which may yield important knowledge about the vision-related underpinnings of the phenomenon of art.
One central and very basic question with respect to style- and content-related processing is their temporal relation, following the idea that percepts do not exist ex nihilo, but undergo a temporal evolution, a microgenesis (Bachmann, 2000). In the literature on object, face and scene perception the question of the time course and interrelations of different processes has received noticeable attention in the past few years (e.g., Bacon-Mace et al., 2005, Bar et al., 2006, Carbon and Leder, 2005, Grill-Spector and Kanwisher, 2005, Hegde, 2008, Joubert et al., 2007, Joubert et al., 2009, Kent and Lamberts, 2006, Rahman et al., 2002, Thorpe et al., 1996, VanRullen and Thorpe, 2001b). With respect to the time course of the processing of style and content in art, empirical data are much scarcer so far, even though first theoretical approaches have been published (Chatterjee, 2003, Leder et al., 2004) and a few empirical studies have addressed temporal aspects of art perception more generally (Bachmann and Vipper, 1983, Locher et al., 2007). Given that we know so little empirically, what assumptions could we derive from current theories? The model of aesthetic appreciation and aesthetic judgments by Leder et al. (2004) proposes that explicit classifications of style and content take place during the same processing stage, with the general probability of classifying in terms of content or style being related to a person's art-related expertise (see also Belke, Leder, Harsanyi, & Carbon, 2010). This view reflects a definition of style in terms of an abstract category that has to be learned in order to be successfully applied and recognised (see also Hartley & Homa, 1981).
Given such a view, one could also argue for a sequence of processing with style following content, as the classification of content is presumably far more overlearned in real life than the classification of style, given that classification of objects and scenes is an essential ability for biological and social survival. From a completely different perspective, style can be regarded as a combination of different low level features (Augustin et al., 2008), and recent attempts to characterise particular styles on the basis of low level cues by means of image processing tools seem to be in accord with such a view (e.g., Johnson et al., 2008). Regarding this definition of style, important information on the time course of style- versus content-related processing comes from the literature on the relation of low level information versus object-related information in object and scene perception (e.g., Grill-Spector and Kanwisher, 2005, Marr, 1982, Sanocki, 1993). Yet, this information is not entirely clear either: On the one hand, classical theories of object recognition assume, for instance, that the perception of single features such as colour precedes the perception of the object as such (Marr, 1982). According to Feature Integration Theory (FIT) the binding of features to more complex units also requires attention (Treisman & Gelade, 1980). On the other hand there is evidence that processing of some low-level features comes into play relatively late (Yao & Einhauser, 2008), as is for example indicated by evidence showing that visual attention may be guided by complete objects rather than by the early saliency of single features (Einhauser, Spain, & Perona, 2008).
To our knowledge, there is only one experimental study that focused specifically on the temporal relation between style- and content-related processing (Augustin et al., 2008). This study assessed similarity judgments for pairs of pictures that were systematically crossed in style (artist) and content (motif), with presentation times systematically varied between 10 ms and 3000 ms. Effects of content could be found from presentation times as short as 10 ms on and stayed relatively stable over time. In contrast, effects of style began to emerge from 50 ms on, with effect sizes increasing steadily over time. These results suggest that the processing of style starts later and develops more slowly than the processing of content. More precisely, they indicate that the information extracted within a presentation time window of 10 ms is enough for content to become a relevant criterion of similarity, while from 50 ms of presentation time on similarity judgments also significantly rely on style. Two characteristics of similarity judgments have to be borne in mind: On the one hand, similarity judgments reflect the relevance of a certain variable rather than the ability to distinguish that variable. Thus, it would theoretically be possible that participants are able to refer to style as early as to content, if they are explicitly asked to focus on both. On the other hand, if people refer to style or content in similarity judgments this does not necessarily mean that they are also able to explicitly classify style and content (see Augustin et al., 2008). Therefore, the time course suggested by the study by Augustin et al. (2008) cannot necessarily be generalised to tasks requiring explicit classification. The current study aimed to fill this gap. It investigated the relative duration of the processing of style and content in terms of when information has been processed far enough to allow successful classification of style and content, respectively.
To this end, we employed a paradigm that has repeatedly been used in psycholinguistics (Rodriguez-Fornells et al., 2002, Schmitt et al., 2000, Schmitt et al., 2001, Zhang and Damian, 2009) to track the timeline of different processes: a combination of a go/nogo- and a dual choice-task with assessment of event related potentials (ERPs). The dependent measures of interest are the Lateralised Readiness Potential (LRP) and the N200 effect, which are both illustrated in the following sections.
The Lateralised Readiness Potential is derived from the Bereitschaftspotential (English: Readiness Potential, RP), a negative shift in brain activation preceding voluntary hand (and also foot) movements (Kornhuber & Deecke, 1965), with the largest amplitude over the central region contra-lateral to the response limb (Kornhuber and Deecke, 1965, Kutas and Donchin, 1974). While the RP strongly corresponds with the readiness for hand-related motor actions, its lateralised aspect, the LRP, correlates with the preparation of voluntary motor actions of a specific hand, thus making it possible to assess not only general but also task-related aspects of preparation in a dual choice task (Osman, Coles, Donchin, Bashore, & Meyer, 1992). According to van Turennout, Hagoort, and Brown (1998: 573), the LRP “…has been shown to develop as soon as task-relevant perceptual and cognitive information is available for the motor system...” Importantly, it can be observed not only prior to executed movements, but also when a movement is planned but ultimately not executed (Osman et al., 1992, van Turennout et al., 1998). These two characteristics make the LRP an excellent tool for studies on the temporal relation of different processes. A paradigm for this purpose was proposed by Osman et al. (1992) and further explicated by van Turennout et al. (1998): the employment of the LRP in a combined dual choice go/nogo task. In such a task, participants have to refer to two different dimensions of the same stimulus at the same time. Dimension A determines whether or not to react at all (go/nogo), and Dimension B determines which hand to react with if Dimension A signals a “go” (hand). The crucial cases regarding the temporal relation of processes are the nogo trials: If Dimension A is processed before Dimension B, no nogo LRP (i.e., no significant divergence of the LRP curve from baseline in nogo trials) should develop, because the decision not to react should precede any decision about response hand.
In contrast, if Dimension B is processed before Dimension A, a nogo LRP should be traceable, because response preparation presumably starts before the nogo information from Dimension A comes into play.
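The response logic of such a combined dual choice go/nogo task can be sketched in a few lines of Python. This is only an illustration of the mapping described above, not code from any of the cited studies; the function name `expected_response`, the level labels (`a1`, `a2`, `b1`, `b2`) and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional


def expected_response(stimulus: Dict[str, str],
                      go_level: str,
                      hand_map: Dict[str, str]) -> Optional[str]:
    """Return the correct response for one trial.

    stimulus: levels of the two dimensions, e.g. {"A": "a1", "B": "b2"}
    go_level: the level of Dimension A that signals "go" (hypothetical label)
    hand_map: maps each level of Dimension B to "left" or "right"
    """
    # Dimension A decides whether to respond at all (go/nogo).
    if stimulus["A"] != go_level:
        return None  # nogo trial: the response has to be withheld
    # Dimension B decides which hand responds on go trials.
    return hand_map[stimulus["B"]]


hand_map = {"b1": "left", "b2": "right"}
print(expected_response({"A": "a1", "B": "b2"}, "a1", hand_map))  # go trial
print(expected_response({"A": "a2", "B": "b2"}, "a1", hand_map))  # nogo trial
```

On the nogo trial the hand information (Dimension B) is still present in the stimulus, which is why hand-specific preparation, and hence a nogo LRP, can arise if Dimension B is analysed first.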
This paradigm has been successfully employed to examine the time course of processing for different questions in psycholinguistics (Schmitt et al., 2000, Schmitt et al., 2001, van Turennout et al., 1998) as well as in face perception (Rahman et al., 2002).
The term N200 or N2 denotes the second negative peak in an averaged ERP waveform (Folstein & van Petten, 2008). The N200 has been shown to be especially pronounced under conditions of response inhibition, such as in so-called go/nogo tasks (Folstein and van Petten, 2008, Schmitt et al., 2000, Schmitt et al., 2001, Zhang and Damian, 2009). In such tasks, where participants are instructed to react to one kind of stimulus (go) and withhold responses to another (nogo), nogo trials were shown to be associated with larger negativity than go trials (Pfefferbaum, Ford, Weller, & Kopell, 1985), especially at frontal sites (Folstein & van Petten, 2008).
Subtraction of the go- from the nogo waveform yields a difference curve known as the N200 effect (Schmitt et al., 2000). The N200 effect at frontal sites comprises nogo-specific activation. Thus a common interpretation of this effect is in terms of activation related to the inhibition of inappropriate responses (Falkenstein et al., 1999, Thorpe et al., 1996), even though alternative interpretations have also been proposed (e.g., Donkers & van Boxtel, 2004). Importantly, this effect can be utilised to estimate processing times: If a person is able to correctly withhold a response in a go/nogo task, this means that she must have analysed the relevant information to a sufficient degree. Of special importance in this respect is the onset of the N200 effect, the point from which the nogo- and go-curves diverge. According to Schmitt et al. (2000: 474), the onset of the N200 effect “… can be taken as the time by which there must have been enough information available to help the person decide whether or not to respond”. This was illustrated very prominently by Thorpe et al. (1996), who used a go/nogo-paradigm to examine the speed of processing in scene perception. Participants saw scenes flashed for 20 ms and were required to release a button (go) when they saw an animal in the scene, and to keep this button pressed (nogo) when there was no animal. There was a significant difference between go- and nogo ERPs at frontal electrodes starting from 152 ms after stimulus onset, which according to Thorpe et al. (1996) indicates that a great deal of processing of relevant information must have been completed before this time.
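To make the subtraction logic concrete, the following minimal Python sketch computes a nogo-minus-go difference wave and a simple onset estimate. The onset criterion used here (the first of several consecutive samples below a fixed amplitude threshold) is an assumption chosen for illustration, not the statistical procedure of the cited studies, and the waveform values are made-up toy numbers.

```python
def n200_effect(nogo_uv, go_uv):
    """Difference wave: nogo minus go, sample by sample (microvolts)."""
    return [n - g for n, g in zip(nogo_uv, go_uv)]


def onset_latency_ms(diff_uv, times_ms, threshold_uv=-1.0, min_consecutive=3):
    """Time of the first sample that starts a run of `min_consecutive`
    samples at or below `threshold_uv`; None if no such run exists.
    Both parameters are hypothetical defaults, not published criteria."""
    run = 0
    for i, value in enumerate(diff_uv):
        run = run + 1 if value <= threshold_uv else 0
        if run == min_consecutive:
            return times_ms[i - min_consecutive + 1]
    return None


# Toy averaged waveforms at one frontal site (invented values).
times_ms = [0, 50, 100, 150, 200, 250, 300]
go_uv = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
nogo_uv = [0.0, 0.0, -0.2, -1.5, -2.0, -2.5, -1.0]

diff = n200_effect(nogo_uv, go_uv)
print(onset_latency_ms(diff, times_ms))  # prints 150
```

Requiring several consecutive samples is one common way to avoid treating a single noisy sample as an onset; in practice the criterion would be based on a statistical test across participants.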
Unlike the LRP, the N200 effect is not related to motor activity and can be traced earlier than motor-related activity (Thorpe et al., 1996). Its onset has successfully been employed to estimate processing times in psycholinguistics, regarding questions such as processing times for semantic versus phonological encoding in picture naming (Schmitt et al., 2000) or segment versus tone encoding in Chinese spoken word production (Zhang & Damian, 2009). In the present study we utilised the characteristics of the N200 effect to find out more about the time course of style- and content-related processing in art perception by gaining first numerical estimates regarding the respective processing times.
Following up on the findings by Augustin et al. (2008) the current study examined the time course of the processes underlying successful classification of style and content in artworks. We employed a combination of a go/nogo- and a dual choice task that has been reported in studies of psycholinguistics (Schmitt et al., 2000, Schmitt et al., 2001, van Turennout et al., 1998) as well as face perception (Rahman et al., 2002), but – to our knowledge – has not been applied to art perception up to now. The paradigm makes it possible to investigate the relative time course of two cognitive processes. The general logic is that participants have to consider two different dimensions of the same stimulus at the same time. For each trial, one stimulus dimension determines whether to react or not to react (go/nogo), the other dimension determines which hand to react with (hand) – if the go/nogo-dimension signals a “go”. In the present study, the two relevant stimulus dimensions were style and content. The two levels of style used consisted of pictures from two artists with very distinct individual styles, Paul Cézanne (Cézanne) and Ernst Ludwig Kirchner (Kirchner). The two levels of content were operationalized by using pictures of those artists that depicted the motifs landscape and person(s), respectively. To make sure that style- and content-related information were comparably salient in the materials used, the stimuli were chosen on the basis of a pre-study. To furthermore ensure that participants in the present study were definitely able to master both the content- and the style-related part of the task, they received training prior to participating in the main experiment (see Section 2.4).
In the dual choice go/nogo-task the roles of style and content, the roles of the sublevels and the response hands were completely balanced, resulting in 2 (dimension determining the go/nogo-decision) × 2 (level signalling “go”) × 2 (meaning of left and right hand) conditions. Fig. 1 illustrates the experimental paradigm by depicting the go/nogo- and hand-logic for one of the experimental conditions.
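The 2 × 2 × 2 balancing scheme can be enumerated explicitly. The sketch below is only an illustration of the design described above; the dictionary keys and the `swap_hands` flag are hypothetical names, and the ordering of conditions carries no meaning.

```python
from itertools import product

# Levels of the two stimulus dimensions as named in the text.
LEVELS = {
    "style": ("Cézanne", "Kirchner"),
    "content": ("landscape", "person(s)"),
}


def build_conditions():
    """Enumerate all combinations of go/nogo dimension, go level and hand mapping."""
    conditions = []
    for go_dim, go_index, swap_hands in product(("style", "content"), (0, 1), (False, True)):
        # The dimension that does not decide go/nogo decides the response hand.
        hand_dim = "content" if go_dim == "style" else "style"
        first, second = LEVELS[hand_dim]
        left, right = (second, first) if swap_hands else (first, second)
        conditions.append({
            "go_dimension": go_dim,
            "go_level": LEVELS[go_dim][go_index],
            "hand_dimension": hand_dim,
            "left_hand": left,
            "right_hand": right,
        })
    return conditions


conditions = build_conditions()
print(len(conditions))  # prints 8
for condition in conditions:
    print(condition)
```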
Inferring from the results by Augustin et al. (2008), we supposed that participants would be able to classify content earlier than style. The two dependent variables we wanted to test this with were the LRP and the N200 effect.
As described above, the crucial conditions regarding the LRP are the nogo conditions, because in the case of go trials the presence of an LRP is self-evident (if there is motor activity there should be motor preparation). A nogo LRP should be visible for those cases in which the information determining the hand decision is processed before the information regarding the go/nogo decision. Thus, we expected a nogo LRP for those conditions in which the hand-decision was about content and the go/nogo decision was about style (hand = content). In those cases, the LRP was expected to rise but to flatten out as soon as the style-related information was available. In contrast, no nogo LRP at all was expected for cases in which the hand decision was about style and the go/nogo decision was about content (hand = style), because in those cases the information about response inhibition was assumed to be available earlier than information about the response hand.
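The prediction for nogo trials reduces to a single comparison: a nogo LRP is expected only when the dimension that determines the response hand is the one assumed to be processed faster. The short sketch below encodes this rule; the function name and the `faster_dimension` parameter are hypothetical, and the default reflects the hypothesis stated above rather than a result.

```python
def nogo_lrp_expected(hand_dimension: str, faster_dimension: str = "content") -> bool:
    """True if hand information should be available before the nogo information.

    hand_dimension: the dimension that determines the response hand
    faster_dimension: the dimension assumed to be processed first (hypothesis)
    """
    return hand_dimension == faster_dimension


for hand_dimension in ("content", "style"):
    print(hand_dimension, nogo_lrp_expected(hand_dimension))
```

Under the opposite assumption (`faster_dimension="style"`) the predictions flip, which is what makes the nogo conditions diagnostic of processing order.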
Table 1 summarises the hypotheses regarding the LRP results. In addition to an analysis of the relative time course of the processing of style versus content we also aimed to use the LRP data to derive some information about the absolute time course, following the analysis by van Turennout et al. (1998). The idea was to statistically compare the go- and the nogo LRP in those cases where style determined the go/nogo decision, in order to estimate the length of the time interval in which content-related, but no style-related information was available. The relevant time points to estimate the length of this interval were the point from which the go- and nogo LRPs started to diverge from zero and the point from which go- and nogo LRP differed in amplitude, with the go LRP rising further and the nogo LRP returning to baseline.
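The interval logic can be illustrated with a small Python sketch. It assumes the widely used double-subtraction (averaging) definition of the LRP over left- and right-central electrodes, labelled C3 and C4 here; the excerpt does not state which electrodes or which formula were actually used, so both are assumptions, and all numbers are arbitrary toy values rather than results.

```python
def lrp(c3_left, c4_left, c3_right, c4_right):
    """Double-subtraction LRP, sample by sample.

    c3_left / c4_left: averaged signal at C3 / C4 for left-hand trials
    c3_right / c4_right: averaged signal at C3 / C4 for right-hand trials
    Contralateral minus ipsilateral activity is averaged over both hands,
    so negative values indicate preparation of the correct hand.
    """
    return [
        ((c3r - c4r) + (c4l - c3l)) / 2
        for c3l, c4l, c3r, c4r in zip(c3_left, c4_left, c3_right, c4_right)
    ]


def interval_ms(lrp_onset_ms, divergence_onset_ms):
    """Length of the window in which hand-related (content) information is
    available but go/nogo (style) information is not yet effective."""
    return divergence_onset_ms - lrp_onset_ms


# Toy example with two samples (arbitrary microvolt values).
print(lrp([0.0, 0.0], [0.0, -2.0], [0.0, -2.0], [0.0, 0.0]))  # prints [0.0, -2.0]

# Arbitrary placeholder onsets in ms, not values from the study.
print(interval_ms(lrp_onset_ms=100, divergence_onset_ms=130))  # prints 30
```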
With regard to the N200 effect (nogo minus go), we were also interested in numerical estimates regarding the time course of processing, but the logic was slightly different. As explained above, the onset of the N200 effect might be taken as the time point at which enough information is available in order to correctly withhold a response. We aimed to use the onset latencies of the N200 effects to arrive at first estimates of the processing times required for content- and style-related classifications, respectively. The decision to follow up the behavioural study by Augustin et al. (2008) with an EEG study using the paradigm just described was motivated by the fact that this method provides the opportunity to examine the relative time course of different processes with a focus on processing times themselves rather than on required stimulus duration (variation of presentation times, as in Augustin et al., 2008). A central advantage of the current method over a behavioural classification–response time paradigm was that confounds due to the time required for response execution are reduced.
Section snippets
Participants
Twenty-nine people participated, seven of whom had to be excluded due to low recording quality or excessive artefacts (less than 75% of data remaining after artefact correction, see below). The remaining sample of 22 persons (12 men, 10 women) had an age range between 18 and 33 years (mean age 23.2 years). All participants were either students of non-art subjects or graduates who worked in fields that were not art-related. All were right-handed, as tested by the Edinburgh Handedness Inventory
Behavioural results
On average, the participants indicated that they had known only 4.18% (SD = 5.72%) of the pictures before participating in the study. This cross-validates our stimulus selection for low familiarity (see above), indicating that influences of prior knowledge of the materials themselves were relatively unlikely for the given material. All following analyses concentrated exclusively on the behavioural and EEG results of the dual choice go/nogo task.
The mean RT for correct go responses was
Discussion
The current study aimed to investigate the neural time course of the processing of style and content in representational art, using materials that systematically varied in style (artist) and content (motif). Following up on the behavioural study by Augustin et al. (2008), we were interested in finding out about the relative duration of style- and content-related processing in terms of when processing would have proceeded far enough to allow successful classification. Could we find evidence that
Acknowledgements
The authors would like to thank Andreas Gartus for technical support, Jennifer R. Ramautar for helpful comments regarding the N200 data and Lee de-Wit for linguistic advice. We furthermore thank the Staatliche Kunsthalle Karlsruhe, the Von der Heydt-Museum in Wuppertal, the Ernst Ludwig Kirchner Archiv, Galerie Henze & Ketterer, in Wichtrach/Bern and the Philadelphia Museum of Art for granting the rights to reproduce artworks in this article for illustration purposes.
Preparation and data
References (62)
- et al.
Style follows content: On the microgenesis of art perception
Acta Psychologica
(2008)
- et al.
The time course of visual processing: Backward masking and natural scene categorisation
Vision Research
(2005)
- et al.
When a Picasso is a “Picasso”: The entry point in the identification of visual art
Acta Psychologica
(2010)
- et al.
The N2 in go/no-go tasks reflects conflict monitoring not response inhibition
Brain and Cognition
(2004)
- et al.
ERP components in go nogo tasks and their relation to inhibition
Acta Psychologica
(1999)
Time course of visual perception: Coarse-to-fine processing and beyond
Progress in Neurobiology
(2008)
- et al.
The impact of level of expertise on the evaluation of original and altered versions of post-impressionistic paintings
Acta Psychologica
(1996)
- et al.
Brain correlates of aesthetic judgment of beauty
Neuroimage
(2006)
- et al.
Processing scene context: Fast categorization and object interference
Vision Research
(2007)
- et al.
Entitling art: Influence of title information on understanding and appreciation of paintings
Acta Psychologica
(2006)
Ambiguities and conventions in the perception of visual art
Vision Research
Aesthetic judgments of music in experts and laypersons – An ERP study
International Journal of Psychophysiology
The human visual system is optimised for processing the spatial information in natural visual images
Current Biology
ERPs to response production and inhibition
Electroencephalography and Clinical Neurophysiology
Electrophysiological estimates of the time course of semantic and phonological encoding during listening and naming
Neuropsychologia
Feature-integration theory of attention
Cognitive Psychology
Categorizing art: Comparing humans and computers
Computers & Graphics
The time course of segment and tone encoding in Chinese spoken production: An event-related potential study
Neuroscience
Art expertise: A study of concepts and conceptual spaces
Psychology Science
Microgenetic approach to the conscious mind
Perceptual rating of paintings from different artistic styles as a function of semantic differential scales and exposure time
Archiv für Psychologie
Very first impressions
Emotion
When feature information comes first! Early processing of inverted faces
Perception
Activation of the prefrontal cortex in the human visual aesthetic perception
Proceedings of the National Academy of Sciences of the United States of America
Prospects for a cognitive neuroscience of visual aesthetics
Bulletin of Psychology and the Arts
Neuroaesthetics: A coming of age story
Journal of Cognitive Neuroscience
Psyscope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers
Behavior Research Methods Instruments & Computers
The search for meaning in art: Interpretive styles and judgments of quality
Visual Arts Research
Judgments of similarity and difference between paintings
Visual Arts Research
The golden beauty: Brain response to classical and Renaissance sculptures
PLoS One
Objects predict fixations better than early saliency
Journal of Vision
- 1
Present address: Effretikon, Switzerland.