The question of how behavior is represented in the mind lies at the core of psychology as the science of mind and behavior. While a long-standing research tradition has established two opposing fundamental views of perceptual representation, Structuralism and Gestalt psychology, we test both accounts with respect to action representation: Are multiple actions (characterizing human behavior in general) represented as the sum of their component actions (Structuralist view) or holistically (Gestalt view)? Using a single-/dual-response switch paradigm, we analyzed switches between dual ([A + B]) and single ([A], [B]) responses across different effector systems and revealed comparable performance in partial repetitions and full switches of behavioral requirements (e.g., in [A + B] → [A] vs. [B] → [A], or [A] → [A + B] vs. [B] → [A + B]), but only when the presence of dimensional overlap between responses allows for Gestalt formation. This evidence for a Gestalt view of behavior in our paradigm challenges some fundamental assumptions in current (tacitly Structuralist) action control theories (in particular the idea that all actions are represented compositionally with reference to their components), provides a novel explanatory angle for understanding complex, highly synchronized human behavior (e.g., dance), and delimits the degree to which complex behavior can be analyzed in terms of its basic components.
While a substantial body of research on cognition has been established regarding mechanisms underlying action control 1 , the basic issue of how behavior—typically characterized by multiple, cross-modal motor movements—is mentally represented has not yet received sufficient attention. For example, while driving a car, is pushing the clutch (pedal movement) and shifting the gear (manual movement) represented as a single action or just as the sum of two behavioral parts? This conceptual neglect is particularly surprising considering a long-standing research tradition with respect to perceptual representations. That is, on the input (perceptual) side of processing, two opposing general views have been historically established as predecessors of modern information processing approaches: Structuralism and Gestalt psychology. While Structuralism (harking back to Wundt and Titchener 2,3 ) assumes that complex mental representations can be analyzed in terms of their (atom-like) components, Gestalt psychology in the Wertheimer 4 tradition rather proposes that a complex, holistic representation (“the whole”) is different from the sum of its parts 5 . Examples for Gestalt effects abound: In the domain of perception, it was shown that holistic figures can emerge from seemingly unrelated, simultaneously processed stimulus elements (points, lines …). Relatedly, in the domain of learning there is a long-standing debate on elemental versus configural stimulus learning in association formation 6,7 . However, while both Structuralists and Gestalt psychologists were also concerned with studying and conceptualizing human behavior, there has been a surprising scarcity of research on the specific question of how simultaneously processed actions (essentially characterizing all real-life human behavior) are mentally represented: As the sums of their elementary behavioral parts or as distinct action Gestalten?
A typical method to study cognitive representations and their dynamics is sequential performance analysis. The underlying assumption is that changes in cognitive representations yield performance costs compared to situations involving unchanged (or shared) cognitive representations, the latter often enabling relative performance (repetition) benefits (also referred to as priming). For example, performance declines when subjects change from one action representation to another (e.g., from left to right key press 8 ), or from one task representation to another (task switching 9 ). In this way, performance can—under otherwise controlled conditions—serve as an empirical marker for shared versus different underlying cognitive representations between successive actions.
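The logic of sequential performance analysis can be illustrated with a minimal Python sketch. The function name `repetition_benefit`, the trial format, and the RT values are hypothetical and chosen purely for illustration; they are not data from the present study.

```python
from statistics import mean


def repetition_benefit(trials):
    """Return mean switch RT minus mean repetition RT (in ms).

    `trials` is a list of (transition, rt_ms) pairs, where transition is
    either "repetition" or "switch" (hypothetical labels).
    A positive value indicates a repetition benefit.
    """
    repetition_rts = [rt for transition, rt in trials if transition == "repetition"]
    switch_rts = [rt for transition, rt in trials if transition == "switch"]
    if not repetition_rts or not switch_rts:
        raise ValueError("need at least one trial of each transition type")
    return mean(switch_rts) - mean(repetition_rts)


# Made-up RTs for illustration only (not data from the study).
toy_trials = [
    ("repetition", 400),
    ("switch", 450),
    ("repetition", 420),
    ("switch", 470),
]
benefit_ms = repetition_benefit(toy_trials)
```

In an actual analysis, such a difference would be computed per participant and condition before being submitted to inferential statistics.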
Here, we utilized this rationale to test whether (simultaneous) multiple actions share (or do not share) mental representations with their constituent component actions by having participants switch between single and dual responses from trial to trial (single-/dual-switch paradigm). This novel paradigm is a combined derivative of both the dual-task and the task-switching paradigm 9,10 . Unlike these previous paradigms, we did not focus on comparing single- versus dual-task performance or on differences between switching versus repeating tasks, but more specifically on sequential transitions across single- and dual-responses.
Overall, we expected repeated action requirements (single → single, dual → dual) to yield performance benefits, serving as a proof of concept for the underlying assumption that unchanged cognitive representations result in (relative) benefits. However, the Gestalt and Structuralist accounts of dual-action representation fundamentally differ with respect to predictions for action type switches, thereby allowing (probably for the first time) for a rigorous experimentum crucis (Table 1): If the Structuralist account is true, dual responses (e.g., simultaneous processing of responses [A + B]) should share cognitive representations with either component response ([A], [B]), and thus a switch from [A + B] to [A] should result in better performance than a switch from [B] to [A] (partial repetition benefit for single responses). Similarly, the A-part of [A + B] should be performed better when preceded by [A] instead of [B] (partial repetition benefit for dual responses).
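The transition types that distinguish the two accounts can be made explicit with a short Python sketch. The function name `classify_transition` and the category labels are assumptions introduced here for illustration; only the classification logic follows the description above.

```python
def classify_transition(previous, current, component):
    """Classify a trial transition with respect to one response component.

    `previous` and `current` are collections of response labels, e.g.
    {"A"} for a single response or {"A", "B"} for a dual response.
    `component` is the response whose performance is analyzed and must be
    part of the current trial.
    """
    previous, current = frozenset(previous), frozenset(current)
    if component not in current:
        raise ValueError("component must be required on the current trial")
    if previous == current:
        # Identical response requirements, e.g. [A] -> [A] or [A + B] -> [A + B].
        return "full_repetition"
    if component in previous:
        # Component was also required before, e.g. [A + B] -> [A] or [A] -> [A + B].
        return "partial_repetition"
    # Component was not required before, e.g. [B] -> [A] or [B] -> [A + B].
    return "switch"


# The critical contrasts for response A:
single_partial = classify_transition({"A", "B"}, {"A"}, "A")
single_switch = classify_transition({"B"}, {"A"}, "A")
dual_partial = classify_transition({"A"}, {"A", "B"}, "A")
dual_switch = classify_transition({"B"}, {"A", "B"}, "A")
```

Under the Structuralist account, performance in the `partial_repetition` category should be better than in the `switch` category; under the Gestalt account, the two should be comparable.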
Most importantly, RTs did not significantly differ between partial repetition and switch conditions in Experiments 1–3 (p > .12 for all post hoc LSD contrasts, see Table 3 for a complete overview of ANOVA results). That is, there was no statistical evidence whatsoever for a partial repetition benefit in any of these experiments involving spatial dimensional overlap between responses across effector systems (see Fig. 2). Also note that the maximum border of the 95% CI for the partial repetition benefit amounted to 11 ms, 3 ms, 12 ms, and 15 ms (for Experiments 1A, 1B, 2, and 3, respectively), which is clearly too small to be compatible with the assumption of a meaningful partial repetition benefit overall (especially when compared to the size of the full repetition benefit, see Fig. 2). Specifically, the maximum border of the 95% CIs (see above) never includes an (absolute) effect size of 36 ms that was reported in a previous study for a partial repetition benefit in single task trials, or of 18/62 ms that were previously reported as partial repetition benefits for dual-task trials 12 , serving as a smallest effect size of interest. Therefore, in accordance with typical procedures recommended for equivalence testing 19 , we can conclude that there is significant statistical evidence for the absence of a meaningful partial repetition benefit effect throughout Experiments 1–3.
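The comparison underlying this conclusion can be written out as a small Python sketch. It uses only the upper 95% CI bounds and the previously reported effect sizes stated above; the variable and function names are hypothetical, and the sketch is a simplification that does not reproduce the full equivalence-testing procedure.

```python
# Upper 95% CI bounds of the partial repetition benefit (ms), as reported above.
upper_ci_bound_ms = {"1A": 11, "1B": 3, "2": 12, "3": 15}

# Previously reported partial repetition benefits (ms): 36 for single-task
# trials, 18 and 62 for dual-task trials.
reference_effects_ms = [36, 18, 62]


def excludes_effect(upper_bound_ms, effect_ms):
    """Return True if the CI upper bound lies below the reference effect."""
    return upper_bound_ms < effect_ms


# The smallest reference effect is the strictest criterion.
smallest_effect_ms = min(reference_effects_ms)
all_excluded = {
    experiment: excludes_effect(bound, smallest_effect_ms)
    for experiment, bound in upper_ci_bound_ms.items()
}
```

Because every upper bound lies below even the smallest reference effect, each bound necessarily also lies below the two larger reference values.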
Table 3 ANOVA results.
Thus, even though in Experiment 2 separate stimuli were used to trigger both responses (based on the idea that separate stimulation of two responses might counteract Gestalt formation), we still observed evidence for Gestalt representations. Likewise, even though in Experiment 3 the two spatial responses were spatially incompatible in half of the dual-response trials (based on the idea that the presence of spatially incompatible responses might prevent Gestalt formation), these situations also evidently still enabled Gestalt representations.
Importantly, however, in Experiment 4, where we completely removed any spatial dimensional overlap between responses, the data pattern was fundamentally different: Here, a statistically robust overall partial repetition effect (95% CI 26–79 ms overall, see Fig. 2) emerged for the vocal dual-task condition, indicating the lack of robust Gestalt formation in the absence of a common (spatial) dimension for the two component responses in Experiment 4. Note though that we did not find this similarly for the manual dual-task condition, where a deviant pattern in RTs was observed, as indicated by the corresponding interactions. The fact that these partial repetition benefits were present in the vocal (but not in the manual) responses might be due to the fact that in manual-vocal dual tasks, vocal response processing is typically prioritized 20,21 and might thus benefit more from repeated (component) responses. Note that this finding of partial repetition benefits in Experiment 4 is further supported by two other recent studies that analyzed switches between single- and dual-task trials in situations also characterized by low (spatial) cross-task dimensional overlap 12,13 . Together, this is very robust empirical evidence for compositional, Structuralist action representation when (via the absence of dimensional overlap) a crucial precondition of action Gestalt formation is removed.
Finally, we additionally re-analyzed the RT data by excluding all trials involving the successive execution of exactly the same response requirements (full stimulus repetitions involving response direction repetitions such as left → left), as one might argue that repetition priming (due to repeating the exact same stimulus–response episode) might distort the results. However, this did not substantially alter the pattern of transition effects (see Table 3), indicating that the findings were not solely driven by specific identity priming mechanisms. In addition, this observation empirically supports our tenet that the relevant response components here are defined by effector modalities (e.g., [A] refers to manual, [B] to vocal), not by response direction (see also 22 ). Note that we also did not find any evidence for partial repetition benefits when only analyzing consecutive trials involving the same spatial response direction in Experiments 1–2, where a sufficient number of corresponding trials was available. Finally, refraining from the exclusion of trials with (uninstructed) saccade execution (in Experiments 1A, 2, 3, 4) also did not change the overall pattern of results.
Note that the interpretation of the main effect of transition on RTs is not severely compromised by any interaction of transition with other factors: In Experiment 2, a significant interaction with response condition indicated that the full repetition advantage is more pronounced in single versus dual conditions, whereas in single-task conditions of Experiment 3 a partial repetition cost in manual RTs was traded off against a partial repetition benefit in vocal error rates (no partial repetition benefit or cost was present in RTs or error rates in the dual-task conditions of Exp. 3). In sum, the lack of partial repetition benefits (e.g., better performance of [A] when preceded by [A + B] instead of [B], or better performance of the A-part of [A + B] when preceded by [A] instead of [B]) in Experiments 1–3 shows that dual responses ([A + B]) are distinctly represented without reference to their constituent component responses ([A], [B]), supporting the assumption of Gestalt representations of simultaneous multiple responses whenever (here: spatial) dimensional overlap 23 is present as a reference for establishing a Gestalt representation.
One interesting observation is that in Experiment 3 we did not find clear evidence for a full repetition benefit in the dual-task data (see Fig. 2). One possible explanation would be that due to the presence of both spatially compatible and incompatible dual-task trials in this experiment, participants adopted a strategy within which they no longer made use of the (principally available) benefit of a full repetition, as the incompatible trials were experienced as so difficult to process that they started the processing of any dual-task trial “from scratch”. While this mechanism is rather speculative, it is important to note that any such (potentially strategic) mechanism underlying this particular aspect of the data pattern would not endanger our general conclusions regarding the underlying representation formats throughout the experiments of the present study.
While overall error rates are probably too low for meaningful interpretation in some experiments (participants responded correctly in more than 96.5% of the trials in each of Experiments 1–2), statistical analyses revealed no evidence for partial repetition benefits across Experiments 1–3. If anything, we observed slightly higher error rates in partial repetition versus switch conditions in the dual manual conditions of Experiment 1A/B and in the single manual condition in Experiment 1B (ps < .05 for post hoc LSD contrasts). However, this was not a consistent pattern, since it occurred neither in any of the nine remaining comparisons (see Table 4) nor anywhere in the RT data. Unlike in Experiments 1–3, however, we observed substantial evidence for partial repetition benefits in the error rates of Experiment 4 (in both single- and dual-task conditions). In sum, this confirms the RT-based conclusions that dual responses are represented in a Structuralist fashion in the absence of dimensional overlap.
Table 4 Error rates.
A Gestalt view of human action, which was reliably supported by all relevant contrasts in Experiments 1A, 1B, 2, and 3, has important implications for current action control theories. Traditionally, the field in which cognitive underpinnings of multiple action control are discussed is multitasking research. Interestingly, however, the question of Gestalt versus Structuralist representations of behavior has never been a vital empirical or conceptual concern in this field.
Early theorizing has sometimes interpreted the emergence of dual-task interference per se as evidence for the claim that multitasking (which essentially characterizes any human real-life action in general) is more than the sum of its component tasks 24 . However, this view never entailed the more radical idea that dual-action representations may not resemble their constituent components at all. Instead, more recent theories explicitly 11 or tacitly assume that the cognitive representation of a component task always remains structurally comparable under single- and dual-task requirements. Specifically, performance decrements elicited by the presence of additional action requirements are assumed to originate from interrupted (generic bottleneck models 10 ) or strategically deferred 25 component task processing, or because resource competition, crosstalk phenomena, or activation/inhibition dynamics between representations (associated with each component action) slow down component task processing 26,27,28,29,30 .
However, our present results challenge the underlying assumption of structurally comparable task/action representations under single and dual conditions as a universal, generally valid principle, an assumption that is also a prerequisite for the explanation of dual-task/dual-response costs in terms of the impact of secondary task presence on task processing. Instead, a Gestalt view would rather assume that complex action (i.e., action composed of at least two distinguishable sub-units) can—under appropriate conditions—be configured in a holistic way (similar to the notion of chunking in memory 31 ), and thus attribute putative dual-task costs to a more complex (but essentially unitary) configuration (or selection) process associated with multiple action control. While action components here were defined in terms of effector modalities (e.g., [A] = saccade), a tenet that was also supported by the data, it may be worthwhile in future research to additionally focus on simple (instead of choice) responses.
The idea of action Gestalten being characterized by the lack of any strong reference to representations of their action components (“different from the sum of parts”) renders this view fundamentally distinct from previous action integration (or feature binding) accounts 11 , according to which complex action is merely “more than the sum” (this may also be referred to as a “weak” as opposed to a “strong” Gestalt account). Thus, in integration accounts the component representations still remain intact but are coded in a strongly associated manner 32,33,34 . Effects indicating integrated action (or task) representations were reported, for example, in studies on bi-manual control, task switching, and implicit learning 35,36,37,38,39,40 . A typical example for an integration account of multiple action control is a recent dual-task control framework 30 in which each component response relevant in a dual-task setting is conceptualized as being bound to an integrated event file (containing information associated with the particular action), and the dynamic activation/inhibition patterns within and between event file representations eventually determine multiple action performance. Despite the idea of integrated representations within such an event file (consisting of stimuli, responses, effects etc.), this account still represents essentially Structuralist (compositional) theorizing because each component response calls for a distinct event file. Nevertheless, the account is in principle open to a possible integration of event files (or task representations, see also 37 ).
Interestingly, according to some of these Structuralist integration accounts, most notably so-called feature binding accounts 41,42 , partial repetitions of features (here, on a conceptually somewhat higher level, referred to as task demands) should not yield performance benefits, but rather partial mismatch costs: As any partial repetition necessarily also implies some degree of change (in stimuli, context, or particular response requirements), this change (via the retrieval of an unwarranted type of task/action representation) can eventually make it harder to retrieve the appropriate action (see 41 , for empirical examples). Note, however, that this logic cannot be easily transferred to our study that involved switches between single and multiple actions, and we clearly did not find any substantial evidence for consistent performance costs associated with switching from dual to single actions (or vice versa) in our data. In a similar vein, some previous task switching studies addressed the related question of whether it is possible to re-use some control settings after a partial (vs. full) task switch, or whether all control settings need to be re-set (the latter being more in line with the assumption of holistic task representations). However, these studies yielded inconsistent results 43,44,45,46,47 , most likely due to the lack of a performance baseline regarding the component tasks (i.e., a baseline equivalent to the crucial single-response conditions in the present study).
Several studies in the realm of motor control have previously referred to Gestalt principles, but without empirically testing Structuralist against Gestalt predictions directly with respect to action representations. Nevertheless, these studies laid the groundwork for the present research. For example, in a pioneering attempt to re-conceptualize previous motor control findings in terms of Gestalt principles, Klapp and Jagacinski 48 interpreted the observation that choice reaction time depends on motor chunk complexity as resulting from a motor Gestalt that must be programmed prior to any of its component gestures. However, they did not experimentally rule out the alternative (Structuralist) explanation that a motor chunk might still be represented in terms of its components. Another study 49 reported that lateral oscillations of two index fingers are easier to synchronize when the movements are symmetric on a perceptual level (not on the level of homologous muscles). However, this effect basically demonstrates that perceptual Gestalt principles can be utilized to guide motor control, but it does not directly address the nature of action representations per se. Taken together, the essential question of whether mental representations of multiple actions can be organized in terms of motor Gestalten has not been sufficiently addressed in prior research.
Gestalt psychology has often been criticized for exhibiting a lack of clear, quantifiable predictions and for assuming rather opaque underlying mechanisms 5 . This issue may have prevented a more substantial generalization to other domains, including motor control. The present study demonstrates that—by developing a novel single-/dual-response switch paradigm in which we address trial-by-trial switches from dual- to single-response performance and vice versa—it is possible to derive clear and experimentally testable predictions from Gestalt theory that can be directly pitted against Structuralist accounts. Our present results are also in line with other recent evidence in favor of action Gestalten: When a certain single response [A] is more frequently practiced, this practice does not appear to transfer to allow for an easier execution of the A-part of dual [A + B] responses 50 . In addition, recent research on action imitation showed that executing a dual action (lifting both index and middle finger) is facilitated by observing a corresponding dual action but not by seeing the two composite actions (i.e., one stimulus hand lifts the index finger while another stimulus hand lifts the middle finger) 51 . Note that while our present study still involves rather basic actions, we believe that similar Gestalt representation formats occur as task complexity increases (especially in complex body movements such as dancing). In fact, Structuralist accounts typically have a hard time explaining how it is even possible that complex actions (e.g., dancing, a one-man band) evolved in the first place, as a distinct and time-consuming 10 selection, initiation, and control of each individual component movement (of muscles and joints) in such situations would render any complex, highly synchronized movement virtually impossible. The present Gestalt account may thus offer a solution to this “complex movement paradox”.
At first sight, distinct Gestalt-like representations appear to lack parsimony: Why not benefit from partial feature overlap to more efficiently activate required action patterns? One likely answer is that distinct representations also have the advantage of preventing unwanted conflict between task-relevant response requirements in situations involving switches between different requirements, thereby promoting resistance to interference (shielding, see 52 ).
Nevertheless, despite the evidence for Gestalt representations in Experiments 1–3, the results from Experiment 4 together with other, similar data 12,13 also show that behavior can principally be represented flexibly (i.e., Gestalt-like or in a Structuralist manner) depending on context (in particular, the degree of dimensional overlap across responses), demonstrating an astonishing extent of representational flexibility, in particular with respect to response coding. A corresponding flexibility with respect to task representations has previously been proposed in several lines of research: For example, it has been shown that one might instruct participants to represent a set of (e.g., 8) stimulus–response rules in terms of the individual, distinct (8) rules, or in terms of fewer (2) integrated, higher-order task rules 52,53 . Furthermore, mental task representations were shown to be flexibly configured across different groups: For example, younger adults were reported to rely more on internal (memory-retrieval-based) sources of information than older adults, the latter relying more on environmental cues to guide their behavior 54,55 . This type of mental flexibility, in particular with respect to action representation, should be further explored in the future as a potential source of intelligent behavior in general, as it is distinct from what is usually studied under the umbrella term “cognitive flexibility”, which rather focuses on a flexible “rewiring” of already established representations 56 . 
Despite this representational flexibility, however, we believe that the lack of any dimensional overlap (or other type of relation) between multiple concurrent actions (which fosters Structuralist representations) may represent the exception rather than the rule for real-life behavior, as the latter is typically guided towards a common object or person, or driven by a common overarching goal, thereby supporting the assumption of action Gestalt formation as a major principle of behavior control.
Note that the present study focuses on the mental representation of simultaneous action events. The mental representation of temporal event sequences in Gestalt psychology was studied in the context of melodies and the phi phenomenon, where the whole pattern displays characteristics that reach beyond those of the component elements (e.g., emotional expression emerging from a melody 57 ; perception of continuous motion emerging from still images 58 ). It would thus be interesting to follow up on our results by addressing representations of action sequences 48 . Furthermore, a Gestalt perspective on action control may also stimulate novel promising research lines. For example, compatibility phenomena (e.g., advantage of executing two “right” actions instead of a “right” and a “left” action, see Table 5) may be regarded as special cases of a “common fate”-like principle for actions (at least on a semantic level in the case of vocal actions used here), and temporal response grouping 59 may be interpreted as a means to support (and reflect) motor Gestalten. In the future, the relation between other action-related phenomena and known individual Gestalt principles may be explored systematically (e.g., by re-conceptualizing motor learning in terms of Gestalt formation etc.). In addition, research in the domain of fundamental learning principles, where a long-lasting debate has emerged on elemental versus configural stimulus learning in association formation 6,7 , might benefit from a stronger focus on the potential role of configural inter-action associations, too 36 .
Table 5 Compatibility effects in Experiment 3 (all statistically significant, all ps < .002).
Finally, the present findings may also be relevant on a fundamental methodological level: The quest for rigorous experimental control has led many cognitive psychologists to utilize basic motor responses (i.e., key presses) as a proxy and pars pro toto for the study of behavioral foundations in general (atomistic approach 60 ). Correspondingly, current theorizing on (multiple) action control typically takes a Structuralist approach by assuming individual mental representations corresponding to the elements that occur in an experimental trial in the lab (relevant/irrelevant stimuli, responses, effects) and turning them into mental codes with inhibitory and excitatory connections that thereby allow for some level of separation or integration (based on trials, tasks etc.) 30,41 . However, actual (mental) life does not come chopped up into trials and their elements, so corresponding accounts run the risk of vastly restricting their explanatory range to the very situation they are built upon: subjects repeatedly issuing highly restricted elementary behavior triggered by elementary stimulation in line with a set of rather arbitrary instructions. In line with this critique, the present results (which were notably based on similarly restricted trial-by-trial situations) delimit the degree to which complex behavior can simply be analyzed in terms of its basic components (a hallmark of research methodology in cognition). This should remind us that the study of the principles underlying basic component behavior may not necessarily lead us towards a full understanding of more complex actions that actually characterize human behavior. Instead, a Gestalt view on mental action representation may provide a novel explanatory angle for understanding the human ability to display complex, temporally well-organized behavior.
The datasets generated during and/or analyzed during the current study are available from the corresponding author on request.
The present research was funded by the Deutsche Forschungsgemeinschaft (HU 1847/3-1). We thank Magali Kreutzfeldt and Verena Maag for data collection, and those who kindly volunteered to participate in the study.
Open Access funding enabled and organized by Projekt DEAL.