Elsevier

Neuropsychologia

Volume 49, Issue 7, June 2011, Pages 2071-2081
The neural time course of art perception: An ERP study on the processing of style versus content in art

https://doi.org/10.1016/j.neuropsychologia.2011.03.038

Abstract

A central prerequisite for understanding the phenomenon of art in psychological terms is to investigate the nature of the underlying perceptual and cognitive processes. Building on a study by Augustin, Leder, Hutzler, and Carbon (2008), the current ERP study examined the neural time course of two central aspects of representational art, one of which is closely related to object and scene perception, the other of which is art-specific: content and style. We adapted a paradigm that has repeatedly been employed in psycholinguistics and that allows one to examine the neural time course of two processes in terms of when sufficient information is available to allow successful classification. Twenty-two participants viewed pictures that systematically varied in style and content and performed a combined go/nogo dual choice task. The dependent variables of interest were the Lateralised Readiness Potential (LRP) and the N200 effect. Analyses of both measures support the notion that in the processing of art, style follows content, with style-related information being available at around 224 ms or between 40 and 94 ms later than content-related information. The paradigm used here offers a promising approach to further explore the time course of art perception, thus helping to unravel the perceptual and cognitive processes that underlie the phenomenon of art and the fascination it exerts.

Highlights

► Style and content: key features to study processes of art perception.
► Style: art-specific, content: similar to “normal” object and scene perception.
► Current study shows: style follows content in terms of underlying neural processes.
► Style available for classification at ca. 224 ms or between 40 and 94 ms after content.
► Paradigm employed is promising to further uncover the time course of art perception.

Introduction

Many people report that art constitutes an important part of their lives. It inspires and fascinates them. Consequently, the phenomenon of art is puzzling not only to art historians but also to cognitive researchers and neuroscientists, whose interest in the possible sources as well as neural correlates of such fascination has been reflected in a significant number of relatively recent publications that are related to questions of art perception and aesthetics (e.g., Cela-Conde et al., 2004, Chatterjee, 2003, Di Dio et al., 2007, Jacobsen and Hofel, 2003, Jacobsen et al., 2006, Leder et al., 2004, Muller et al., 2010, Nadal et al., 2008, Ramachandran and Hirstein, 1999, Redies, 2007). A review of many of these contributions was recently provided by Chatterjee (2011).

From a vision scientist's point of view, the central question regarding art is even more basic than questions about fascination – but equally unsolved: what is specific about art perception, i.e., what differentiates it from normal object and scene perception? One aspect that has been proposed in this respect is the way in which ambiguities are resolved (Mamassian, 2008) – either with a view to prior constraints (“normal” vision) or by conventions (art). Another important aspect differentiating art perception from normal object and scene perception is the presence and relevance of (artistic) style (Augustin, Leder, Hutzler, & Carbon, 2008). In representational art, content (motif) is closely related to objects and scenes in our surroundings. In contrast, style, the way in which something is depicted, is generally of very little relevance in object and scene perception. It might be that we, for instance, have to find our way or recognise an object through fog or a snowstorm, but in such cases most of us would probably regard the fog or snow as random noise in perceptual terms – which hardly anyone would claim with respect to style in art. Rather, style is an essential aspect of an artwork, not only from an art historical point of view but also with respect to visual perception and cognition. For example, it has been shown that even people without special expertise in the arts are able to successfully judge the style-related similarity of artworks (Cupchik, Winston, & Herz, 1992), are sensitive to style across different media (Hasenfus, Martindale, & Birnbaum, 1983) and refer to both content and style when asked to judge the similarity of representational artworks (Augustin et al., 2008).
By comparing the processing of these two essential perceptual aspects of representational artworks, scientists have the opportunity to contrast general processes of vision with art-specific processes, which may yield important knowledge about the vision-related underpinnings of the phenomenon of art.

One central and very basic question with respect to style- and content-related processing is their temporal relation, following the idea that percepts do not exist ex nihilo, but undergo a temporal evolution, a microgenesis (Bachmann, 2000). In the literature on object, face and scene perception the question of the time course and interrelations of different processes has received noticeable attention in the past few years (e.g., Bacon-Mace et al., 2005, Bar et al., 2006, Carbon and Leder, 2005, Grill-Spector and Kanwisher, 2005, Hegde, 2008, Joubert et al., 2007, Joubert et al., 2009, Kent and Lamberts, 2006, Rahman et al., 2002, Thorpe et al., 1996, VanRullen and Thorpe, 2001b). With respect to the time course of the processing of style and content in art, empirical data are much scarcer so far, even though first theoretical approaches have been published (Chatterjee, 2003, Leder et al., 2004) and a few empirical studies have addressed the question of temporal aspects of art perception more generally (Bachmann and Vipper, 1983, Locher et al., 2007). Given that we know so little empirically, what assumptions could we derive from current theories? The model of aesthetic appreciation and aesthetic judgments by Leder et al. (2004) proposes that explicit classifications of style and content take place during the same processing stage, with the general probability of classifying in terms of content or style being related to a person's art-related expertise (see also Belke, Leder, Harsanyi, & Carbon, 2010). This view reflects a definition of style in terms of an abstract category that has to be learned in order to be successfully applied and recognised (see also Hartley & Homa, 1981).
Given such a view, one could also argue for a sequence of processing with style following content, as the classification of content is presumably far more overlearned in real life than the classification of style, given that classification of objects and scenes is an essential ability for biological and social survival. From a completely different perspective, style can be regarded as a combination of different low-level features (Augustin et al., 2008), and recent attempts to characterise particular styles on the basis of low-level cues by means of image processing tools seem to be in accord with such a view (e.g., Johnson et al., 2008). Regarding this definition of style, important information on the time course of style- versus content-related processing comes from the literature on the relation of low-level information versus object-related information in object and scene perception (e.g., Grill-Spector and Kanwisher, 2005, Marr, 1982, Sanocki, 1993). Yet, this information is not entirely clear either: On the one hand, classical theories of object recognition assume, for instance, that the perception of single features such as colour precedes the perception of the object as such (Marr, 1982). According to Feature Integration Theory (FIT) the binding of features to more complex units also requires attention (Treisman & Gelade, 1980). On the other hand there is evidence that processing of some low-level features comes into play relatively late (Yao & Einhauser, 2008), as is for example indicated by evidence showing that visual attention may be guided by complete objects rather than by the early saliency of single features (Einhauser, Spain, & Perona, 2008).

To our knowledge, there is only one experimental study that focused especially on the temporal relation between style- and content-related processing (Augustin et al., 2008). This study assessed similarity judgments for pairs of pictures that were systematically crossed in style (artist) and content (motif), with presentation times systematically varied between 10 ms and 3000 ms. Effects of content could be found from presentation times as short as 10 ms on and stayed relatively stable over time. In contrast, effects of style emerged slightly from 50 ms on, with effect sizes increasing steadily over time. These results suggest that the processing of style starts later and develops more slowly than the processing of content. More precisely, they indicate that the information extracted within a presentation time window of 10 ms is enough for content to become a relevant criterion of similarity, while from 50 ms of presentation time on similarity judgments also significantly rely on style. Two characteristics of similarity judgments have to be borne in mind: On the one hand, similarity judgments reflect the relevance of a certain variable rather than the ability to distinguish that variable. Thus, it would theoretically be possible that participants are able to refer to style as early as to content, if they are explicitly asked to focus on both. On the other hand, if people refer to style or content in similarity judgments this does not necessarily mean that they are also able to explicitly classify style and content (see Augustin et al., 2008). Therefore, the time course suggested by the study by Augustin et al. (2008) cannot necessarily be generalised to tasks requiring explicit classification. The current study aimed to fill this gap. It investigated the relative duration of the processing of style and content in terms of when information has been processed far enough to allow successful classification of style and content, respectively.
To this end, we employed a paradigm that has repeatedly been used in psycholinguistics (Rodriguez-Fornells et al., 2002, Schmitt et al., 2000, Schmitt et al., 2001, Zhang and Damian, 2009) to track the timeline of different processes: a combination of a go/nogo- and a dual choice-task with assessment of event related potentials (ERPs). The dependent measures of interest are the Lateralised Readiness Potential (LRP) and the N200 effect, which are both illustrated in the following sections.

The Lateralised Readiness Potential is derived from the Bereitschaftspotential (English: Readiness Potential, RP), a negative shift in brain activation preceding voluntary hand- (and also foot-) movements (Kornhuber & Deecke, 1965), with the largest amplitude over the central region contra-lateral to the response limb (Kornhuber and Deecke, 1965, Kutas and Donchin, 1974). While the RP strongly corresponds with the readiness for hand-related motor actions, its lateralised aspect, the LRP, correlates with the preparation of voluntary motor actions of a specific hand, thus allowing one to assess not only general but also task-related aspects of preparation in a dual choice task (Osman, Coles, Donchin, Bashore, & Meyer, 1992). According to van Turennout, Hagoort, and Brown (1998: 573), the LRP “…has been shown to develop as soon as task-relevant perceptual and cognitive information is available for the motor system...” Importantly, it can not only be observed prior to executed movements, but also occurs when a movement is planned but finally not executed (Osman et al., 1992, van Turennout et al., 1998). These two characteristics make the LRP an excellent tool for studies on the temporal relation of different processes. A paradigm for this purpose was proposed by Osman et al. (1992) and further explicated by van Turennout et al. (1998): the employment of the LRP in a combined dual choice go/nogo task. In such a task, participants have to refer to two different dimensions of the same stimulus at the same time. Dimension A determines whether or not to react at all (go/nogo), and Dimension B determines which hand to react with if Dimension A signals a “go” (hand). The crucial cases regarding the temporal relation of processes are the nogo trials: If Dimension A is processed before Dimension B, no nogo LRP (i.e., no significant divergence of the LRP curve from baseline in nogo trials) should develop, because the decision not to react should precede any decision about response hand.
In contrast, if Dimension B is processed before Dimension A, a nogo LRP should be traceable, because response preparation presumably starts before the nogo information from Dimension A comes into play.

This paradigm has been successfully employed to examine the time course of processing for different questions in psycholinguistics (Schmitt et al., 2000, Schmitt et al., 2001, van Turennout et al., 1998) as well as in face perception (Rahman et al., 2002).

The term N200 or N2 denotes the second negative peak in an averaged ERP waveform (Folstein & van Petten, 2008). The N200 has been shown to be especially pronounced under conditions of response inhibition, such as in so-called go/nogo tasks (Folstein and van Petten, 2008, Schmitt et al., 2000, Schmitt et al., 2001, Zhang and Damian, 2009). In such tasks, where participants are instructed to react to one kind of stimulus (go) and withhold responses to another (nogo), nogo trials were shown to be associated with larger negativity than go trials (Pfefferbaum, Ford, Weller, & Kopell, 1985), especially at frontal sites (Folstein & van Petten, 2008).

Subtraction of the go- from the nogo waveform yields a difference curve known as the N200 effect (Schmitt et al., 2000). The N200 effect at frontal sites comprises nogo-specific activation. Thus a common interpretation of this effect is in terms of activation related to the inhibition of inappropriate responses (Falkenstein et al., 1999, Thorpe et al., 1996), even though alternative interpretations have also been proposed (e.g., Donkers & van Boxtel, 2004). Importantly, this effect can be utilised to estimate processing times: If a person is able to correctly withhold a response in a go/nogo task, this means that she must have analysed the relevant information to a sufficient degree. Of special importance in this respect is the onset of the N200 effect, the point from which the nogo and go curves diverge. According to Schmitt et al. (2000: 474), the onset of the N200 effect “… can be taken as the time by which there must have been enough information available to help the person decide whether or not to respond”. This was illustrated very prominently by Thorpe et al. (1996), who used a go/nogo paradigm to examine the speed of processing in scene perception. Participants saw scenes flashed for 20 ms and were required to release a button (go) when they saw an animal in the scene, and to keep this button pressed (nogo) when there was no animal. There was a significant difference between go- and nogo ERPs at frontal electrodes starting from 152 ms after stimulus onset, which according to Thorpe et al. (1996) indicates that a great deal of processing of relevant information must have been completed before this time.
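
The subtraction and onset logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used in the study: the function names, the amplitude threshold (`threshold_uv`) and the requirement of a run of consecutive samples (`min_run`) are assumptions made for the example, and the waveform values are invented.

```python
def difference_wave(nogo, go):
    """Return the point-by-point nogo minus go difference curve."""
    if len(nogo) != len(go):
        raise ValueError("nogo and go waveforms must have the same length")
    return [n - g for n, g in zip(nogo, go)]


def onset_latency(diff, times_ms, threshold_uv=-1.0, min_run=3):
    """Return the first time point from which the difference curve stays at
    or below threshold_uv for min_run consecutive samples, or None.

    Both the threshold and the run length are hypothetical criteria chosen
    for illustration only.
    """
    run = 0
    for i, value in enumerate(diff):
        if value <= threshold_uv:
            run += 1
            if run == min_run:
                # Report the start of the run, not its end.
                return times_ms[i - min_run + 1]
        else:
            run = 0
    return None


# Invented averaged waveforms (microvolts), sampled every 50 ms.
times_ms = [0, 50, 100, 150, 200, 250, 300, 350]
go_wave = [0.0] * 8
nogo_wave = [0.0, 0.0, -0.5, -1.5, -2.0, -2.5, -1.0, 0.0]

n200_effect = difference_wave(nogo_wave, go_wave)
print(onset_latency(n200_effect, times_ms))
```

With these invented values the difference curve first stays below the threshold for three samples starting at 150 ms. Published studies typically determine onsets with statistical tests across participants rather than a fixed amplitude threshold.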

Unlike the LRP, the N200 effect is not related to motor activity and can be traced earlier than motor-related activity (Thorpe et al., 1996). Its onset has successfully been employed to estimate processing times in psycholinguistics, regarding questions such as processing times for semantic versus phonological encoding in picture naming (Schmitt et al., 2000) or segment versus tone encoding in Chinese spoken word production (Zhang & Damian, 2009). In the present study we utilised the characteristics of the N200 effect to find out more about the time course of style- and content-related processing in art perception by gaining first numerical estimates regarding the respective processing times.

Following up on the findings by Augustin et al. (2008), the current study examined the time course of the processes underlying successful classification of style and content in artworks. We employed a combination of a go/nogo- and a dual choice task that has been reported in studies of psycholinguistics (Schmitt et al., 2000, Schmitt et al., 2001, van Turennout et al., 1998) as well as face perception (Rahman et al., 2002), but – to our knowledge – has not been applied to art perception up to now. The paradigm allows one to investigate the relative time course of two cognitive processes. The general logic is that participants have to consider two different dimensions of the same stimulus at the same time. For each trial, one stimulus dimension determines whether to react or not to react (go/nogo), the other dimension determines which hand to react with (hand) – if the go/nogo-dimension signals a “go”. In the present study, the two relevant stimulus dimensions were style and content. The two levels of style used consisted of pictures from two artists with very distinct individual styles, Paul Cézanne (Cézanne) and Ernst Ludwig Kirchner (Kirchner). The two levels of content were operationalized by using pictures of those artists that depicted the motifs landscape and person(s), respectively. To make sure that style- and content-related information were comparably salient in the materials used, the stimuli were chosen on the basis of a pre-study. To furthermore ensure that participants in the present study were definitely able to master both the content- and the style-related part of the task, they received training prior to participating in the main experiment (see Section 2.4).

In the dual choice go/nogo-task the roles of style and content, the roles of the sublevels and the response hands were completely balanced, resulting in 2 (dimension determining the go/nogo-decision) × 2 (level signalling “go”) × 2 (meaning of left and right hand) conditions. Fig. 1 illustrates the experimental paradigm by depicting the go/nogo- and hand-logic for one of the experimental conditions.
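
The 2 × 2 × 2 balancing scheme described above can be enumerated as follows. This is a sketch under stated assumptions: the level names come from the text, but the dictionary layout, the function name `build_conditions` and the way a level is assigned to the left or right hand are hypothetical choices made for the example.

```python
from itertools import product

# Levels named in the text; the data layout itself is illustrative.
LEVELS = {
    "style": ("Cézanne", "Kirchner"),
    "content": ("landscape", "person(s)"),
}


def build_conditions():
    """Enumerate the 2 x 2 x 2 counterbalancing scheme as dictionaries."""
    conditions = []
    for gonogo_dim, go_index, swap_hands in product(LEVELS, (0, 1), (False, True)):
        # The dimension not used for go/nogo determines the response hand.
        hand_dim = "content" if gonogo_dim == "style" else "style"
        first, second = LEVELS[hand_dim]
        # Hypothetical hand assignment: swapping reverses left and right.
        left, right = (second, first) if swap_hands else (first, second)
        conditions.append({
            "gonogo_dimension": gonogo_dim,
            "go_level": LEVELS[gonogo_dim][go_index],
            "hand_dimension": hand_dim,
            "left_hand": left,
            "right_hand": right,
        })
    return conditions


conditions = build_conditions()
print(len(conditions))
for condition in conditions:
    print(condition)
```

Each dictionary describes one instruction set, for example "respond only to Cézanne; left hand for landscape, right hand for person(s)".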

Inferring from the results by Augustin et al. (2008), we supposed that participants would be able to classify content earlier than style. The two dependent variables we wanted to test this with were the LRP and the N200 effect.

As described above, the crucial conditions regarding the LRP are the nogo conditions, because in the case of go trials the presence of an LRP is self-evident (if there is motor activity there should be motor preparation). A nogo LRP should be visible for those cases in which the information determining the hand decision is processed before the information regarding the go/nogo decision. Thus, we expected a nogo LRP for those conditions in which the hand-decision was about content and the go/nogo decision was about style (hand = content). In those cases, the LRP was expected to rise but to flatten out as soon as the style-related information was available. In contrast, no nogo LRP at all was expected for cases in which the hand decision was about style and the go/nogo decision was about content (hand = style), because in those cases the information about response inhibition was assumed to be available earlier than information about the response hand.
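
The prediction logic for nogo trials can be expressed as a small decision rule. The sketch below is illustrative only; the function name and argument names are hypothetical, and the rule simply restates the reasoning above: a nogo LRP is predicted when the dimension that determines the response hand becomes available before the dimension that determines the go/nogo decision.

```python
DIMENSIONS = ("style", "content")


def nogo_lrp_expected(hand_dimension, processed_first):
    """Predict whether a nogo LRP should develop.

    hand_dimension: the dimension that determines the response hand
        (the other dimension determines go/nogo).
    processed_first: the dimension assumed to be available earlier.
    """
    if hand_dimension not in DIMENSIONS or processed_first not in DIMENSIONS:
        raise ValueError("dimensions must be 'style' or 'content'")
    # Hand preparation can only start before the nogo signal arrives
    # if the hand dimension is the one processed first.
    return hand_dimension == processed_first


# Predictions under the hypothesis that content is classified before style.
for hand in DIMENSIONS:
    print(hand, nogo_lrp_expected(hand, processed_first="content"))
```

Under the content-first hypothesis this yields a nogo LRP for hand = content and none for hand = style, matching the expectations stated above.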

Table 1 summarises the hypotheses regarding the LRP results. In addition to an analysis of the relative time course of the processing of style versus content we also aimed to use the LRP data to derive some information about absolute time course, following the analysis by van Turennout et al. (1998). The idea was to statistically compare the go- and the nogo LRP in those cases where style determined the go/nogo decision to estimate the length of the time interval in which content-related, but no style-related information was available. The relevant time points to estimate the length of this interval were the point from which the go- and nogo LRPs started to diverge from zero and the point from which go- and nogo LRP differed in amplitude, with the go LRP rising further and the nogo LRP returning to baseline.
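
The interval estimate described above can be written compactly. The symbols are illustrative labels, not notation from the source: $t_{\text{onset}}$ stands for the point from which the go- and nogo LRPs start to diverge from zero, and $t_{\text{split}}$ for the point from which the go- and nogo LRP differ in amplitude.

```latex
\Delta t_{\text{content only}} = t_{\text{split}} - t_{\text{onset}}
```

On this reading, $\Delta t_{\text{content only}}$ approximates how long content-related information was available before style-related information.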

With regard to the N200 effect (nogo minus go), we were also interested in numerical estimates regarding the time course of processing, but the logic was slightly different. As explained above, the onset of the N200 effect might be taken as the time point at which enough information is available in order to correctly withhold a response. We aimed to use the onset latencies of the N200 effects to come to first estimates of the processing times required for content- and style-related classifications, respectively. Following the behavioural study by Augustin et al. (2008), the decision to conduct an EEG-study with the paradigm just described was motivated by the fact that this method provides the opportunity to examine the relative time course of different processes with a focus on processing times themselves rather than required stimulus duration (variation of presentation times, as in Augustin et al., 2008). A central advantage of the current method over a behavioural classification–response time paradigm was that confounds by times required for response execution are reduced.

Participants

Twenty-nine people participated, seven of whom had to be excluded due to low recording quality or excessive artefacts (less than 75% of data remaining after artefact correction, see below). The remaining sample of 22 persons (12 men, 10 women) had an age range between 18 and 33 years (mean age 23.2 years). All participants were either students of non-art subjects or graduates who worked in fields that were not art-related. All were right-handed, as tested by the Edinburgh Handedness Inventory

Behavioural results

On average, the participants indicated that they had known only 4.18% (SD = 5.72%) of the pictures before participating in the study. This cross-validates our stimulus selection for low familiarity (see above), indicating that influences of prior knowledge of the materials themselves were relatively unlikely for the given material. All following analyses exclusively concentrated on the behavioural and EEG results of the dual choice go/nogo task.

The mean RT for correct go responses was

Discussion

The current study aimed to investigate the neural time course of the processing of style and content in representational art, using materials that systematically varied in style (artist) and content (motif). Following up on the behavioural study by Augustin et al. (2008), we were interested in finding out about the relative duration of style- and content-related processing in terms of when processing would have proceeded far enough to allow successful classification. Could we find evidence that

Acknowledgements

The authors would like to thank Andreas Gartus for technical support, Jennifer R. Ramautar for helpful comments regarding the N200 data and Lee de-Wit for linguistic advice. We furthermore thank the Staatliche Kunsthalle Karlsruhe, the Von der Heydt-Museum in Wuppertal, the Ernst Ludwig Kirchner Archiv, Galerie Henze & Ketterer, in Wichtrach/Bern and the Philadelphia Museum of Art for granting the rights to reproduce artworks in this article for illustration purposes.

Preparation and data

References (62)

  • P. Mamassian. Ambiguities and conventions in the perception of visual art. Vision Research (2008)
  • M. Muller et al. Aesthetic judgments of music in experts and laypersons – An ERP study. International Journal of Psychophysiology (2010)
  • C.A. Parraga et al. The human visual system is optimised for processing the spatial information in natural visual images. Current Biology (2000)
  • A. Pfefferbaum et al. ERPs to response production and inhibition. Electroencephalography and Clinical Neurophysiology (1985)
  • A. Rodriguez-Fornells et al. Electrophysiological estimates of the time course of semantic and phonological encoding during listening and naming. Neuropsychologia (2002)
  • A.M. Treisman et al. Feature-integration theory of attention. Cognitive Psychology (1980)
  • C. Wallraven et al. Categorizing art: Comparing humans and computers. Computers & Graphics (2009)
  • Q.F. Zhang et al. The time course of segment and tone encoding in Chinese spoken production: An event-related potential study. Neuroscience (2009)
  • M.D. Augustin et al. Art expertise: A study of concepts and conceptual spaces. Psychology Science (2006)
  • T. Bachmann. Microgenetic approach to the conscious mind (2000)
  • T. Bachmann et al. Perceptual rating of paintings from different artistic styles as a function of semantic differential scales and exposure time. Archiv für Psychologie (1983)
  • M. Bar et al. Very first impressions. Emotion (2006)
  • C.C. Carbon et al. When feature information comes first! Early processing of inverted faces. Perception (2005)
  • C.J. Cela-Conde et al. Activation of the prefrontal cortex in the human visual aesthetic perception. Proceedings of the National Academy of Sciences of the United States of America (2004)
  • A. Chatterjee. Prospects for a cognitive neuroscience of visual aesthetics. Bulletin of Psychology and the Arts (2003)
  • A. Chatterjee. Neuroaesthetics: A coming of age story. Journal of Cognitive Neuroscience (2011)
  • J. Cohen et al. Psyscope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods Instruments & Computers (1993)
  • G.C. Cupchik et al. The search for meaning in art: Interpretive styles and judgments of quality. Visual Arts Research (1988)
  • G.C. Cupchik et al. Judgments of similarity and difference between paintings. Visual Arts Research (1992)
  • C. Di Dio et al. The golden beauty: Brain response to classical and Renaissance sculptures. PLoS One (2007)
  • W. Einhauser et al. Objects predict fixations better than early saliency. Journal of Vision (2008)
1 Present address: Effretikon, Switzerland.