That we all learn from play seems undeniable. But often it takes an educational psychologist to point out the obvious: that for most of life’s important lessons we as individuals, and as teams, learn first, and perhaps primarily from playing around (Vygotsky, 1966, 1978). Playful learning both relies on distributed team cognition and contributes to its development. While playful learning is a large part of childhood, it also remains a lifelong aspect of formal and informal education. Children play house without the burdens of really cooking, cleaning, or paying bills, or play cowboys without the dangers of real horses, guns, and stampeding animals.
Many simulations involve playful learning and draw on role playing and imagination to establish realistic, if not real, conditions for learning. Distributed cognition is an essential component of successful team play, whether in formal team sports or in massively multiplayer online games such as World of Warcraft. In such cases, key roles are predefined (quarterback vs. tight end, tank vs. healer) but success is largely determined by a shared and coordinated understanding of goals and execution of well-synchronized creative actions.
Yet, important dimensions of playful learning may be lost when play becomes the focus of controlled experimental studies, academic theory, scholarly analysis, assessment, and other research. Traditional approaches to measurement and research methods push for the assessment of isolated and individual play behaviors (not teams), linear conceptions of knowledge, and dispositions (not distributed knowledge) over discrete short periods of play from single game instances (Resnick & Resnick, 1992; Roth, 1998; Schwartz & Arena, 2013). This stands in contrast to our real-life experiences, where we play the same games repeatedly with different opponents, different strategies, and different team roles, and learn not from single instances of game play but from repeated experiences over substantial time and through post-game discussion, reflection, and analysis with other players, which we refer to as metagame experiences. The dynamics of game play across occasions, together with metagame reflections and interactions, are an important aspect of playful learning that is often missed but is an essential element of our ecological and situated approach (Young, 2004; Young et al., 2012). Play is both something we do when we follow the rules of a game and an approach we take when entering into an interaction, perhaps playfully testing the limits of those rules. This latter idea, of a playful approach to the world, represents an individual's or group's goal set to intentionally manipulate aspects of the situation in order to achieve a playful interaction (Young & Slota, 2017).
As children play games, they learn physical and mental skills (Resnick, 2006) and acquire new ways to affect the world (effectivities). They also learn strategies for how to play games and how to interact with other players (Gee, 2003; Squire, 2008). As one future teacher in our college teacher preparation program described it:

When I was a child, I played all kinds of board games and physical games (like Twister or Spud). Those games taught me a lot of rules. For instance, I had to learn the rules for the game to play, but I also learned ethical rules, such as don't cheat and don't touch other people's cards/property. Physical games and games we played outside taught me basic coordination and how to play on a team. Some games were centered around educational concepts like counting (Hershey's Kisses game), or making words, or describing a word (Scrabble, Scattergories, and Taboo). Some required deductive reasoning and critical thinking (Clue), while others required drawing (Pictionary) or acting (charades).
It seems that from games children can learn many of the basic social interactions they will need for the rest of their lives. From games like Scrabble they can learn to be playful with language and to add to their vocabularies with a focus on word relationships and spelling. In many games, like Monopoly, they can learn about money, value, and counting. They can also learn the basics of how to treat each other, including how to follow rules (or, hopefully not, how to cheat). In this process they can playfully explore a personal ethic (strategy) as well as experience the social consequences of defying the rules. Play has also been linked to the development of creativity and other key aspects of child development (Howard-Jones, Taylor, & Sutton, 2002; Lillard, 2013; Resnick, 2008; Russ, 2003).
It is interesting to note that in some approaches to early childhood education, like that of Maria Montessori, the fantasy aspects of play were decoupled from the mechanics of games and playful learning, and this remains an important distinction often discussed when trying to parse simulations versus games.
For our purposes, the question then becomes, what can adults learn from game play, and can game play be as powerful a social learning environment for advanced skills, such as the distributed team cognition of surgical teams debriefing or military planners during war games, as it is for development of early childhood social skills?
One important distinction to make at the outset of this chapter is between structured play, as is the case with most video games, card games, and board games, versus unstructured or “make believe” play. The latter form of playful learning has been discussed by child psychologists with regard to the development of social skills, creativity, and self-regulation (see e.g., Bodrova, Germeroth, & Leong, 2013; Lillard et al., 2013). In this chapter, we will focus on structured play that fits the standard definition of “game,” which generally includes the presence of rules, turn-taking, and some sort of end goal state such as winners and losers.
Games have certainly been suggested as a teaching tool for skills in many advanced disciplines beyond preschool and early elementary school, and appear to be useful for learning about real-world environments including transportation planning (Huang & Levinson, 2012), safety science (Crichton, 2009), 5th grade science (Wilkerson, Shareff, Laina, & Grave, 2018), engineering design (Hirsch & McKenna, 2008), cybersecurity for avoiding phishing (Arachchilage & Love, 2013), and, notably for our present topic, engineering teamwork (Hadley, 2014). Meta-analyses of the effectiveness of games for learning advanced school content generally find small but positive effects of playful learning approaches (Clark, Tanner-Smith, & Killingsworth, 2015; Vogel et al., 2006; Wouters, van Nimwegen, van Oostendorp, & van der Spek, 2013; Young et al., 2012), but such effects are summarized across a diverse set of games and players and thus are hard to interpret at such a broad level of analysis. We have characterized this as looking for our princess in the wrong castle (Young et al., 2012) and have made a case that a more situated analysis of game play is required to account for the rich social context in which game play emerges.
In our work to study games as assessments, distinctions between what makes a game a simulation, or what makes a simulation also a game, have not been useful. At times, simulations are judged by their fidelity and verisimilitude, whereas games are judged by their playfulness. Yet upon closer inspection, games, and more specifically game mechanics, have been characterized variously as a subtype of simulations, as a superset of simulations, and as a parallel category sharing elements with simulations. For example, when developing taxonomies of games, Wilson et al. (2009) described 18 categories of game features, which Bedwell, Pavlas, Heyne, Lazzara, and Salas (2012) refined into nine features. Koehler, Arnold, Greenhalgh, and Boltz (2017) tested the value of these nine features for characterizing how gamers review and evaluate games, only to find that some additional features were needed. Drawing on Malone and Lepper's (1987) description of what makes learning "fun," many of these game features describe things that players enjoy (e.g., competition, conflict, control, human interaction) rather than features that might distinguish key differences between the more veridical nature of simulations (real-world physics constraints) and the sometimes more fantasy-based nature of games. In our own work, we have found it best to characterize both simulations and games along the dimensions of playfulness, emotional experience as fun, and rules/game structures. For example, simulations (constrained by real-world parameters) can be playful, in the sense that they allow users to explore openly and combine actions in ways not fully anticipated by instructors/designers. Likewise, many games involve some real-world constraints (gravity, solid structures), and we are aware of elite soccer players who use soccer video games to simulate player strategies. In the case of professional sports, playing a game may not be experienced as fun at all and may instead be hard work. While some believe all games are a subset of simulations because they use some real-world constraints, others have argued that simulations are a subset of games because they use various game mechanics to various degrees and are sometimes experienced as fun and playful, like some games. For our purposes, none of these attempts to apply a taxonomy to games or to organize the features of games in contrast to simulations has proved useful for research. Instead, we would assert that both games and simulations utilize various degrees of playful mechanics, are experienced with varying degrees of "fun," and have various levels of rules and end states/goals, whether competitive or collaborative or some combination (such as games with a betrayal mechanic).
For the topic of team distributed cognition, our interest is particularly in collaborative board games in which players perform game tasks together through communication, coordination, and other team process behaviors and achieve results as an integrated unit. There is still an element of competition in such games, as they often include individual player scores, but collaborative board games are played against the board itself, and the whole team either wins or loses together. For example, in Forbidden Island, time is running out as the island is sinking, and players must collaborate and leverage their individual skills to escape before the water rises to cover the island; if the island sinks, the players all lose as a team.
As mentioned regarding engineering teamwork, Hadley (2014) used the board game Pandemic as an educational intervention to encourage engineers to reflect on their teamwork skills and strategies. In this study, board game play was the intervention, not the assessment. But this intervention highlights the need to focus work concerning team effectiveness on the interactions of player and board position dynamics rather than on a static momentary snapshot of individual scores, team scores, or individual cognitive strategies. For our purposes team cognition involves:
As stated, our view of game-based learning includes more than simply the interactions that unfold during the game itself; it also includes the metagame. Metagame interactions include post-game reflections and conversations and online searches for game-related hints, cheats, and tutorials; they extend across a series of game plays and include creative reflection by players on how to formulate strategies for subsequent occasions of play. It is the dynamics of game-to-game strategies and collaboration among players outside any single instance of game play that interest us as much as the interactions within a single game experience or the final score from a single instance of game play. This perspective alters classical notions of assessment as starting points, midpoints, or endpoints, as in the cases of placement (e.g., pretest), formative (e.g., midterm), and summative (e.g., final exam) evaluations. Instead, the entire set of temporal trajectories of coordinated activity between team players and across game trials becomes the means for defining operationally what playing a game means (e.g., Andrews et al., 2017; Kim, Almond, & Shute, 2016).
Dating back to classic educational games like Oregon Trail, students have been known to play games without "getting" the key educational premise (the learning objective). That is, some players adopt game strategies (goals and intentions) that are unintended by game designers. For instance, players can focus on "gaming" the game (e.g., accumulating achievement points or killing off cattle) or on goals related to other superficial features of play (accumulating gold in the World of Warcraft auction house), rather than on the play mechanics intended to align with curricular goals and objectives. For example, Caftori (1994) described how the competitive aspect of Oregon Trail led some players (middle schoolers) to race for the end of the trail as fast as possible, without regard for their companions or oxen. This defeated the main objective of the game designers, which was to encourage social problem solving for survival and to draw attention to the difficulties of losses along the trail. In this case, the common video game attraction of killing for loot replaced the instructor-intended and designer-intended student learning outcomes from game play:

The goal becomes so important that players neglect the health of other travelers and their own lives. Another example, shooting animals for food, was designed to teach children about different animals in different terrain, as well as be part of the reality of life on the trail. However, "shoot 'em up" has become a focus of attention for many students (mostly boys). Unfortunately, besides eye-hand coordination, not much else is learned. And eye-hand coordination is not one of the stated objectives of this game.
This simple example highlights that game play is an emergent interaction that results from player strategies arising, on the fly, from a dialectic between game affordances and player intentions. It is this property of board game play, essential to understanding games as assessments, that we focus on in this chapter.
As a final part of our introduction, we would like to separate board game play from typical action video game play. Board game play typically takes place on a much more relaxed and social scale than action video games. Recent work highlights the general cognitive impacts that highly stimulating action video games can have on overall cognition, and of course on distributed cognition, including verbal interactions as well as spatial skills and attention (Bediou et al., 2018). But for our purposes, we would like to focus on the impacts of interacting with others over a table-top board game, where game mechanics draw from board positions, where there is time for extended conversation and strategizing (turn taking rather than timed responses), and where conditions change through random card selection and/or dice rolls. While engaging at a social level, we would suggest that the excitement levels and cognitive attentional demands of board game play would not parallel the impacts on general cognition of playing action video games. The cognitive impacts we observe tend to emerge less from overall cognitive arousal than from dynamic social interactions, problem solving, and group dynamics.
The theoretical framework for our work draws from J. J. Gibson’s (1986) ecological psychology of individual perception-action that defines human behavior as a dynamic interaction between individuals and their environments. An interaction is taken as the fundamental unit of behavior (unit of analysis), thus co-defined by properties of individuals that enable them to affect the world (effectivities) and properties of the environment that afford or invite actions (affordances). This differs from the predominant cognitivist representational view of cognition in several ways and these can be illustrated in an analysis of typical board game play.
For the purposes of our analysis, we next introduce a few key concepts:
These information flow fields, which continuously signal problems and changing conditions that the goal-driven agent (i.e., player) can detect, interact with, and act on, provide the context in which each player/team takes action (or makes moves). In doing so, players are guided by their goals and intentions (for winning), which direct their attention and guide their "pick up" of information directly from the environment, particularly certain regularities in the information flow field that specify affordances for action. These affordances are detected as invariant structures, much as Gibson described consistencies in the visual flow field when observers walked around a stationary object like a cube. As one moves, some things, like the visual background, move while some visual relationships (internal to the object) remain invariant. Those invariant structures specify the object and its properties (such as its graspability). Similarly, we would suggest that there are invariant structures in the information flow fields of game interactions that specify affordances for individual player "moves" and strategies as well as emergent (new) collaborative group objectives that can be perceived and acted on.
An elaboration of Gibson’s ideas was added to this analysis when Shaw and Turvey (1999) described the affordances that are detected in the information flow fields as co‑determined by properties of the individual and properties of the environment, an agent-environment interaction. Properties of the individual include what cognitivists would call their skills and abilities, but what are more precisely described within ecological psychology as “effectivities” or an agent’s abilities to have an effect on the environment through their actions. This would include their physical abilities to roll dice and move tokens around a board, but also their abilities to engage with other players in shared decision making and civil discourse and the skills (gear, weapons, etc.) their character acquires in the game.
To Gibson's ecological framework of cognition, we add the contemporary learning science of situated cognition, which draws heavily on apprenticeship learning and describes learning as the movement of individual behaviors from peripheral participation within communities of practice toward central participation. Situated cognition (Brown, Collins, & Duguid, 1989; Lave & Wenger, 1991; Rogoff, 1990) described learning as inherently social and a process of moving from novice status as a peripheral contributor to the community of gamers toward a more central role of expert player. The ideas of situated cognition date back to John Dewey (1938) and the insight that we learn by doing. This has a modern equivalent in Papert's (1991) constructionism (the word with the "N"), which builds on constructivism (the word with the "V") by adding that it is only by building (constructing) artifacts that we genuinely come to know things. We apply this framework to games and playful learning in general, and specifically for this chapter, to collaborative and competitive board games that may serve as assessments.
To learn or play a game is a goal-driven dynamic process of perceptual tuning (Gibson, 2000). Across a player’s experience within a single game, and across several instances of game play, a player begins to detect invariance in the information flow fields (visual, auditory, and tactile) of the game that specify objects and concepts (including team concepts) that are part of game strategies, rules, movements, and outcomes. Outside the game, a player learns from discussions (online and face-to-face) with other players and eventually moves from status as a novice player to a more central contributor to the gamer world.
Fundamental to our description of game play is the idea that human behavior is a dynamic interaction of perception and action, a person-environment interaction. Action drives and defines what we perceive, and perception guides our actions. Thus, perception and action co-define each other. This reciprocal nature of perception is often missing from traditional descriptions of behavior and from cognitive or behavioral assessments (Young, 1995; Young, Kulikowich, & Barab, 1997). Perception is often described as a passive cognitive act, by individuals and by teams. In describing board games, this would be the equivalent of assuming that the rules of the game are the rules that are guiding each player's actions. But for our purposes, a player's actions are determined on the fly, in each dynamic interaction with the game and other players. Thus, it is essentially impossible that a player will play the game exactly the same way on two occasions, or that a team will act identically from game to game. That is not to say that regularities in the game and in the play of other players will not result in observable or testable consistencies in the overall interactions that constitute game play.
We have selected to focus on collaborative board games because, in addition to an individual player interaction with the game, there must also be a coordinated team interaction. While each individual typically has some unique skills defined by cards or dice rolls, there are also collective team goals that are only achieved through the coordinated action of all the players. While this adds a layer of complexity for a full description of game behavior, it also adds a potential source of information for assessment of team cognition in relation to individual achievements by creating a participatory environment (Gee & Hayes, 2012) where civil discourse, informal mentorship, and intentional learning can occur and be detected. That is, not only may collaborative board game play provide evidence of an individual player’s leadership and communication skills, it may also externalize how a team discerns problems and takes coordinated actions. In short, collaborative games, in comparison to competitive play, represent a richer context for distributed team cognition.
During any particular instance of a game, players must respond to the current board position, their own situations determined by cards, rolls, and prior accomplishments, and also perceive the trajectory of the game as the situation unfolds with other players. We could identify three types of goals for collaborative board game play:
In addition to these three types of goals that unfold during game play itself, there are also game-related activities that occur outside of game play (meta-game interactions). Across instances of game play, players engage in reflective dialog with other players using online cheat/hint sites as well as live conversations about the game. This level of game learning seeks to pick up invariances across games to detect team interactions that proved successful or problematic, and to detect strategies that had worked in other games that had not yet been experienced directly through their own game interactions. Both within and across games, we would posit that the learning is well described as perceptual tuning.
To this we would add that within the community of gamers, there is a dynamic of social progress from peripheral players (passive readers of game sites) toward a more experienced central role (contributor to game sites) that helps define and possibly even develop the game environment. While many games are classics and evolve little through time (Monopoly, Scrabble) other games do change with user content (Cards Against Humanity, Trivial Pursuit).
From the ecological psychology perspective, human behavior is an emergent interaction between an intentional agent and a dynamic environment. The agent-environment interaction is the basic unit of analysis for any understanding of game play or distributed cognitive teamwork. On any particular occasion of human behavior (a speech act or game move), our framework holds that the intentionality of the moment is itself a complex interaction among a hierarchy of competing goals, a hierarchical ontological descent of goals to the moment of a game move. By this, we are not simply referring to Maslow's (1943) hierarchy of needs, but to a complex hierarchy of intentions that each individual may adopt, intentions that exist and play out on a variety of space-time continua, from low-level muscle movements to the highest-level game-long play strategies.
Many have argued that teams think. We agree that team cognition is more than the sum of individual thinking (perceptions and actions). Our ecological perspective on intentionality has been described in more detail in Young, DePalma, and Garrett (2002). Here we reframe that discussion away from butterflies and termite mounds (many simple animals’ intentions to eat and reproduce, when taken together create a shared, collective intentionality and coordinated activity to maintain, expand, and work toward functional goals within an ecosystem), toward a focus on the specific environments of humans playing collaborative board games. Board game designers create constraints on player interactions by constructing game rules. Designers hope to limit the interactions among players and the board to a specific goal space that reduces the degrees of freedom in normal life interactions with the world to a much smaller set of “valid” game moves. This limiting of the problem space may be particularly valuable for using board games as assessments. To explain the majority of board game interactions, it is fair to assume that players/teams have adopted the goal to win the game, and thus their intentions all draw from the state space of legal game actions. It is worth noting that this would not always be the case, and sometimes players may adopt goals to make illegal moves or have a play goal to intentionally undermine or disrupt game play or team success (the betrayer mechanic, as built into games like Betrayal at House on the Hill). Presumably, these non-standard game interactions could be detected somewhat readily, during and as a result of play assessments in the form of inferred player short-term and game-long goals.
Collaborative game play can be viewed as values realizing (Zheng, Newgarden, & Young, 2012). Drawing from Hodges (2007, 2009), our work with language learning through collaborative play in World of Warcraft (five player raid parties) shows how various individual goals for play and for team-building can be braided together with team play. Thus, we view games as ecosystems where value-realizing dynamics come into play. Values are real goods, intentions of the system that can only be realized through perception-action within a particular event/context. From an ecological psychology perspective, language is a perception-action-caring system in which speaking and listening “demand an ongoing commitment to directing others and being directed by them to alter one’s attention and action so that movements from lesser goods (i.e., one’s present board position, achievement or goal) to greater goods (e.g., values) is realized” (Hodges, 2007, p. 599). As such, values are not only properties of a person but are also about relationships and the demands with all others in the context. Collaborative board game play is a socially constrained field where all game decisions and actions are constrained and legitimated by values of the ecosystem with all the others (e.g., other players, board position, game moves). That is, it is a jointly enacted field, intended to realize shared values (i.e., achieving board positions and other subgoals and ultimately winning the game).
Dixon, Holden, Mirman, and Stephen (2012) organized their manuscript around a central question that we would pose to all those interested in team cognition: "Does cognition arise from the activity of insular components or the multiplicative interactions among many nested structures?" We agree with Dixon et al. that there is mounting evidence from a wide range of domains suggesting that cognition is multi-fractal, emerging from dynamic in-situ interactions and distributed across teams, rather than dominated by stable latent traits or constructs (see also Hilpert & Marchand, 2018). Assessment models that break down human cognition into numerous independent components, such as memory, motivation, self-efficacy, interest, spatial skills, verbal skills, and reasoning, then apply an additive model and attribute all else to error variance, may well mask key distributed cognition dynamics that underlie team coordination and thinking. Perhaps a more positive takeaway from the multi-fractal analysis is that one can study cognition at any level (collaborative game play) and expect to see parallels at other levels (individual game play, applied collaborative teamwork).
Similarly, Cooke, Salas, Kiekel, and Bell (2004) argued that team cognition is more than the sum of the cognition of the individual team members. Instead, “team cognition emerges from the interplay of the individual cognition of each team member and team process behaviors” (p. 85). They argued that to measure team cognition, a holistic level of measures is required rather than collective metrics or aggregation procedures. The holistic measures focus on the dynamic team processes and co-actions taken by the team as a whole. In this regard, team consensus or communication consistency, team process behaviors, and team situation awareness (and associated taken-as-shared goals) are posited as potential measures of team cognition (perception-action).
Alternatively, we would like to begin to build an understanding of game interactions and their potential for assessment. Using our current learning games, we would like to collect 100 or so examples of game play and consider training a deep learning network (e.g., using a cloud platform such as Amazon's deep learning machine images) to identify features that might align with 21st-century skills of collaboration, or instances of micro problem solving that occur during game play (in board games like Expertise), much as deep learning has been applied to speech recognition, Chinese handwriting and zip code recognition, spam classification, and computer vision. One approach would be to label examples of successful and unsuccessful collaborative game play activities and then work toward a deep learning network solution for recognizing those activities during game play.
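To make the idea concrete, the following is a minimal, hypothetical sketch (in Python with scikit-learn, rather than any cloud service) of the kind of supervised pipeline we have in mind: hand-labeled snippets of game-play talk train a small neural network to flag successful versus unsuccessful collaborative moves. All transcripts, labels, and feature choices shown are illustrative placeholders, not data or methods from the Expertise project.

```python
# Minimal sketch (not the authors' system): classify transcribed game-play
# segments as "successful" vs. "unsuccessful" collaboration using a small
# neural network over bag-of-words features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

# Hypothetical data: short utterance windows from play sessions,
# hand-labeled by researchers (1 = successful collaborative move, 0 = not).
transcripts = [
    "let's combine the tablet with project-based learning for the science unit",
    "I don't care what you pick, just take your turn",
    "your pedagogy card supports discussion, so I can cover the theory part",
    "whatever, roll the dice",
]
labels = [1, 0, 1, 0]

X_train, X_test, y_train, y_test = train_test_split(
    transcripts, labels, test_size=0.5, stratify=labels, random_state=0
)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # simple lexical features
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
)
model.fit(X_train, y_train)
print(model.predict(X_test))               # predicted collaboration labels
```

In practice, the labeled corpus would need to be far larger, and richer features (board state, turn structure, who is speaking to whom) would matter as much as the words themselves.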
To this approach we would want to add individual and team stated goals and objectives for play, discerned from retrospective game play analysis. This would enable us to begin to construct a description of game play as situated cognition and action. Toward this purpose we next describe one collaborative board game designed to enable and assess the team cognition of teachers working together to wisely integrate technology into classroom instruction.
Board games are just one form of the broader category of playful learning. Board games feature board "position" of players or tokens and often involve cards or dice to introduce chance events. When board games are collaborative, there are aspects related to individual players' actions and decisions, as well as those related to the effectiveness of the team as a whole. Of course, when considering playful approaches to assessing distributed cognition, the traditional concerns that an assessment provide evidence that is consistent (i.e., reliability) and credible/useful (i.e., validity) must be addressed. If an assessment cannot validly capture what it is supposed to measure, it has little value formatively or summatively.
Traditional testing relies on classical test theory, which assumes that observed scores are simply the sum of true scores (i.e., true ability) and error scores, or, in the case of latent trait or item response theory (IRT), on logistic functions that specify the relation between ability (e.g., achievement, reading comprehension, problem solving) and the probability of answering an item correctly on a test. These classical psychometric frameworks assume that there is an individual factor and all else is error, taking no account of other factors such as context. However, in alignment with a perspective of situated cognition, we view cognition as always situated in the context of a world interaction, and contextual complexities (i.e., constraints) can substantially co-define (along with individual abilities) the nature of many activities. As such, our true ability always exists and operates as an interaction within the environment on the fly, co-defined by properties of the individual and properties of the environment. This means that the error term from classical test theory may not be entirely error at all; rather, it needs to be taken into account to define our true ability (Bateson, 2000/1972; Hutchins, 2010; Young, 1995). We thus need to integrate contextual complexities into any analytic framework (i.e., unit of analysis) for the validity of an assessment and to examine to what extent the interaction with the assessment context fully represents what happens in the real world, as well as to what extent the assessment context is really related to an applied practical context (Wiggins, 1993), with the goal of producing evidence showing that the results from the assessment really capture one's authentic practice. Mislevy (2016) made these points in an article entitled, "How Developments in Psychology and Technology Challenge Validity Argumentation," and he drew specific attention to situative/socio-cognitive psychological perspectives as reasons for needed developments in psychometric modeling given opportunities to design interactive performance assessments with digital technologies. We will summarize some of these developments in this chapter. However, we first introduce the board game Expertise to anchor our explanation of the theoretical assessment premises.
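Before turning to Expertise, it is worth recording the standard textbook forms of the two assumptions just described (generic notation, not tied to any particular instrument), since they anchor the contrast developed later in the chapter:

\[
X = T + E
\]

(classical test theory: an observed score is the sum of a true score and error), and, for a two-parameter logistic IRT model,

\[
P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}},
\]

where \(\theta_i\) is examinee \(i\)'s latent ability and \(a_j\) and \(b_j\) are the discrimination and difficulty of item \(j\). The situated argument we are developing is, in effect, that the context-laden interaction relegated to \(E\) in the first equation is not mere noise.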
To establish such ecological validity, or contextual fidelity, we implemented a collaborative board game called Expertise as an assessment environment and examined to what extent the play scores from this collaborative board game parallel real-world practice with regard to teachers' technology integration skills. The unique attributes of Expertise as a collaborative board game constrain, to some extent, the large degrees of freedom present in any social interaction involving as many as five players.
The game play we are investigating comes from a board game we created to assess the wise technology integration skills of master teachers, a game called Expertise. This game is the result of several iterations of game design. Two theoretical frameworks guided its development: our ecological and situated cognition framework and the TPACK framework (i.e., teacher knowledge of technology integration).
Figure 3.1 Expertise game initial game position
The Expertise game setup is shown in Figure 3.1 and includes one board, three technology card decks, one theory card deck, and two pedagogy card decks. Each player also has an expert level card that tracks his or her individual score in light of TPACK-L performance over the course of game play. Up to five players can play this collaborative game, which draws on team distributed knowledge and coordinated action. In every round, the selection of two technology cards, one pedagogy card, and one theory card creates the environment or instructional context for collaborative player action. The game starts with one player (the speaker) sharing a curricular student learning objective with a content area for a lesson that could involve technology integration. In this first phase, the team serves as co-teachers or technology integration advisors to help the speaker construct possible technology integrations. In the second phase, each player judges a brief summary of the proposal presented by the speaker. The game runs two rounds, meaning all players rotate through the role of speaker twice and serve as advisors when not the speaker. In the first round, players use the first-round pedagogy card deck, which includes more traditional instructional strategies such as direct instruction, group discussion, and the like; in the second, more advanced round, they use the second-round pedagogy card deck, which includes more complex and innovative contemporary teaching strategies, such as problem-based learning, gamification, anchored instruction, and the like.
Once the speaker shares the content area, all of the players together have three minutes to discuss their best collaborative solution for teaching the given content, taking the given technology, pedagogy, and theory into account. After this co-construction process is done, the speaker has two minutes to state the solution (as if presenting it to a Board of Education for funding) and explain how the chosen technologies will be integrated with the given pedagogy and learning theory. During the speaker's proposal, each of the other players serves as a reviewer in one of the TPACK areas. Respectively, they decide to what degree the speaker's proposal addresses wise use of the board technologies, sound pedagogy aligned with the pedagogy card, and alignment with the board card for learning theory. To structure their judging, players roll dice to determine a role-play predisposition for that round, which assigns the degree of harshness they apply to their judgments, ranging from "easygoing" through "critical" to "hard-ass." For each TPACK-L component the proposal successfully addresses, the game token for that component moves one step inward toward the center of the board, indicating the level of team success. Once the co-players' judgments are done, the speaker's expert level is promoted based on the result of the judgment. If all of the components are successfully addressed (i.e., all three components move in), meaning the speaker hits TPACK-L reasoning perfectly, his/her game expert level is promoted two ranks. If the proposal meets two components of TPACK-L, his/her expert level is promoted one rank. If the proposal meets only one component, his/her game level stays at the current level, and if the proposal fails to meet any of the TPACK-L components, his/her game level is downgraded accordingly. In this way, the player's expertise in TPACK-L during game play is given an individual score. In a nod toward scoring distributed cognition, after judgment is done, the speaker also has a chance to elevate the individual score of one of the co-players who was most helpful that round. There are five expert levels: Two Blue Agent, Two Summer Agent, Master Two Summer Agent, Technology Coach, and Master Technology Coordinator. At the end of two rounds of play, the final board (team) position and each individual player (expertise) level are recorded.
In addition, in each round, players draw from a technology card deck that corresponds to their expert level. For example, a player at the Two Blue Agent level draws from the Two Blue Agent technology card deck. This way, game technologies become more sophisticated for some speakers as the game proceeds.
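As an illustration of how these scoring rules could be formalized for automated score capture, the following is a minimal sketch of the promotion logic described above. The function name and the numeric encoding of expert levels are our own illustrative choices, not part of the published game materials.

```python
# Illustrative sketch of the Expertise promotion rules described in the text.
# Level names follow the chapter; the numeric ordering is an assumption made
# only for this example.
EXPERT_LEVELS = [
    "Two Blue Agent",
    "Two Summer Agent",
    "Master Two Summer Agent",
    "Technology Coach",
    "Master Technology Coordinator",
]

def promote(current_level: str, components_met: int) -> str:
    """Return the speaker's new expert level given how many of the three
    TPACK-L components (technology, pedagogy, learning theory) the judges
    decided were successfully addressed this round."""
    idx = EXPERT_LEVELS.index(current_level)
    if components_met == 3:        # all components met: up two ranks
        idx += 2
    elif components_met == 2:      # two components met: up one rank
        idx += 1
    elif components_met == 0:      # no components met: downgraded one rank
        idx -= 1
    # exactly one component met: level stays the same
    return EXPERT_LEVELS[max(0, min(idx, len(EXPERT_LEVELS) - 1))]

# Example: a Two Blue Agent whose proposal addressed all three components
print(promote("Two Blue Agent", 3))   # -> "Master Two Summer Agent"
```

Logging each round's inputs to such a function (who spoke, which cards were in play, which components were met, which co-player was credited) is one way the individual and team data streams discussed below could be captured during play.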
While beyond the scope of this chapter, the description of Expertise and its design helps to establish the types of data streams that are possible as means to study complex, interactive team performance and distributed cognition in game environments. The dissertation research of our third author provided evidence of the validity of Expertise as a measure of teacher technology competence. These preliminary research results point to the value of estimating both individual player and team success parameters in the spirit of an ecological description of game play.
Because games are designed with rules and lead to individual and team outcomes that can be observed directly as win, lose, or draw, they provide a means for actions and behaviors to be analyzed as ones that result from goal-driven, intentional dynamics (e.g., McKee, Rappaport, Boker, Moskowitz, & Neale, 2018). One could argue that attempting to get items correct on any familiar standardized achievement test, whether its format is multiple-choice or constructed response, is similar, and learning can occur from completing tests and retests, just as it can from playing a game and replaying it. For example, in their now classic piece entitled "The Theoretical Status of Latent Variables," Borsboom, Mellenbergh, and van Heerden (2003) considered how Albert Einstein might go about completing a general intelligence number series problem, the Fibonacci series problem (1, 1, 2, 3, 5, 8, … ?), as a sequence of steps leading up to the construction of an answer rather than just its simple recording as 13. They wrote:

Let us reconstruct the procedure. Einstein enters the testing situation, sits down, and takes a look at the test. He then perceives the item. This means that the bottom-up and top-down processes in his visual system generate a conscious perception of the task to be fulfilled; it happens to be a number series problem. Now he applies the rule and concludes that the next number must be 13. Einstein then goes through the various motoric processes that result in the appearance of the number 13 on the piece of paper, which is coded as 1 by the person hired to do the typing. Einstein now has a 1 in his response pattern, indicating that he gave a correct response to the item.
Borsboom et al. continued the description by making reference to working memory and to drawing on information from long-term memory (e.g., Einstein recognizes it is the Fibonacci series). While working memory and long-term memory resonate with information-processing accounts of problem solving (e.g., Anderson, Reder, & Simon, 1996; Vera & Simon, 1993) rather than with Gibsonian descriptions of perception-action cycles, the described sequence of Einstein's hypothetical steps as a test-taking narrative does make for an important illustration. It shows how a simple problem, just one test item commonly encountered on a large-scale standardized test, is an in situ experience based on a sequence of goal states continuously created and annihilated through on-the-fly agent-environment interactions as test takers (in this case, Einstein hypothetically) interact with the test environment. Borsboom et al. continued their description of Einstein's test-taking behavior, as well as that of another agent who assigns a score to the response Einstein records as his answer.
Of course, modern test theory (e.g., Embretson & Reise, 2000; Markus & Borsboom, 2013), associated with the study of the reliability and validity of scores on standardized achievement and intelligence tests, does not unravel the perception-action experience of Einstein or any other learner as Borsboom et al. described. Further, Borsboom et al. were not offering an ecological psychology view of problem solving as the alternative to cognitive, in-the-mind theoretical perspectives such as information-processing (e.g., Anderson et al., 1996) or connectionist (e.g., McClelland, 1988) accounts. Instead, their goal was to highlight that within-individual variations over time in "achievement," "creativity," "decision making," "motivation," "problem solving," and the like, when treated as latent variables that drive behavior within the knower, have been under-addressed in psychometric modeling frameworks. The latent variable modeling frameworks that undergird the broad spectrum of current psychometric and statistical techniques, accepted as best-practice trade tools for establishing the reliability and validity of scores, are best understood as models of between-subjects variation examined at one point in time, though open to the study of between-subjects variation over multiple points in time.
As described previously, ecological psychologists work from a different world view focused on perception-action dynamics, and these perception-action cycles or interactions cannot be understood by anything other than the contribution of the individual agent interacting with her/his environment over time (e.g., Young, 1993; Young & Barab, 1999). Therefore, variation is of the within-subjects kind, and yet there are other important sources of within-subjects, perhaps "within-contexts-unit," variation that must be coordinated in any study of game play. Specifically, and at minimum, these include the dynamics within one's team and the associated dynamics of the environment that result in dynamics of individual and team intentions in real spacetime.
There are similarities, but also differences, between game play and test taking as means to study learning, whether learning is latent, somewhat latent, or not. As an initial step in outlining a psychometrics for games as assessments, the distinctions between game play and test taking can be compared and contrasted (see Table 3.1). Psychometric theory, both classical and contemporary, relies extensively on latent traits and between-subjects variation, points we return to in the discussion of Table 3.1 below.
Table 3.1 Characteristics of game play and test taking

| Characteristic | Game Play | Test Taking |
|---|---|---|
| Attribute | Performance (individual and team) | Achievement |
| Psychometric status | Behavior/manifest | Construct/latent |
| Unit of analysis | Perception-action coupling and cycles | Individual, classroom, school, district |
| Stimuli of assessment situations | Board features, opponents, team members (interactions) | Directions, items, options |
| Responses to assessment stimuli | Moves | Item selection or spoken/written response construction |
| Feedback to responses | Immediate | Delayed |
| Goals and intentions based on feedback | Level up, improve strategy, try to win | More correct responses |
| Degrees of freedom | Governed by rules of the game | Governed by directions of the test and sampling space of items |
| Score assignment | Binary (i.e., dichotomous) | Binary (i.e., dichotomous), polytomous |
| Dimensionality of scores | Locally dependent, emergent | Locally independent, unidimensional/multidimensional |
| Analysis of scores | Differential, nonlinear | Discrete, linear |
| Evaluation of the scores | "Making progress," "we can improve" | "Your score is at the 50th percentile or 75th percentile, etc., which means that …" |
| Consequences given evaluation of scores | Rewards for progress | Penalties for lack of improvement |
In the next section, we compare and contrast characteristics of game play and test taking. Both activities require agents to assess their progress in formal and informal learning settings. Our analysis of game play applies the ecological worldview of situational ecological dynamics and could be described as "behavioral, yet intentional" rather than "latent," given definitions and perspectives described by Markus and Borsboom in their Frontiers of Test Validity Theory (2013). Further, we are encouraged by contemporary developments in psychometric and statistical analysis on the specification and testing of nonlinear dynamical systems models (e.g., Helm, Ram, Cole, & Chow, 2016; McKee et al., 2018; Molenaar, 2014) at the level of an individual time series, as well as co-integrated (e.g., Boker & Martin, 2018) with another player's trajectory of intentional dynamics. Individual and co-integrated time series may be the most fruitful avenue of research for the scientific study of distributed cognition within game play. However, while the avenue appears a promising and worthwhile one to take, challenges must be noted and may be difficult to overcome for some time. These challenges relate to theory (Borsboom et al., 2003); game design as assessment design (Kim et al., 2016); data capture (e.g., Lindell, House, Gestring, & Wu, 2018); and mathematical/statistical modeling (e.g., Greenberg, 2014), as well as to difficulties that arise given standard, accepted educational best practices in the use of test scores to diagnose and promote learning. Such forces of accepted practice include the assumptions that the best explanation of achievement/performance is the simplest one (e.g., Templin & Bradshaw, 2014) and that the test score is "my score," not "our score" (von Davier & Halpin, 2013).
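As one generic illustration of what a co-integrated, player-by-player time series model might look like (our own simplified notation, not a specification drawn from the works cited above), two players' coded states could be modeled as a coupled bivariate system:

\[
x_{1,t+1} = a_{11}\,x_{1,t} + a_{12}\,x_{2,t} + \varepsilon_{1,t}, \qquad
x_{2,t+1} = a_{21}\,x_{1,t} + a_{22}\,x_{2,t} + \varepsilon_{2,t},
\]

where \(x_{1,t}\) and \(x_{2,t}\) are time series of some coded aspect of each player's play (e.g., moves or communication acts at turn \(t\)). Non-zero cross-coefficients \(a_{12}\) and \(a_{21}\) would indicate that one player's state carries over into the other's next state, one possible signature of distributed, team-level dynamics rather than of purely individual ones.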
Table 3.1 outlines characteristics of game play and test taking that were selected based on the way psychometricians and statisticians approach data analysis to understand human learning (e.g., Markus & Borsboom, 2013). First, consider the primary attribute of game play versus test taking; the attribute is the assessable variable, an entity that belongs to the learner and for which he/she provides data for measurement, assessment, and evaluation. For game play, we identify the primary attribute as "performance," which occurs in real spacetime and may or may not be influenced by prior experience. For example, consider the student who is playing a game like Expertise for the first time. At minimum, rules need to be learned. However, no one would penalize the player for her/his willingness to do better the next time. Comparatively, the primary attribute for taking a test is demonstrating "achievement," or perhaps aptitude or intelligence, which comes after the fact, is the result of rather than a feature of learning, is "shaped by curriculum standards," and is prone to evaluation at a level beyond any individual student (e.g., the teacher taught well, the school district operated well).
Next, we consider the psychometric status of the primary assessment attribute. The psychometric status pertains to whether the reliability and validity of scores are to be interpreted as behavioral/manifest (formative) versus latent (reflective). For game play, performance is behavioral, as goals must be realized in action (e.g., Kulikowich & Young, 2001; Young, 1993). However, an important distinction is to be made between behavior that is reactive and responsive to the environment, as in the premises laid out by Skinner (1988), and behavior that is intention-driven, given by the learner. Gibsonian dynamics are intentional dynamics, and therefore learners are often referred to as "agents," referencing their agency to shape experience given their continuous interaction with the environment. While some tests adapt to user responses, a game is much more dynamic, with game moves and strategies taken as a given part of the context. In contrast, achievement in test taking is primarily treated as latent. "Reflective" is a psychometric term (e.g., Markus & Borsboom, 2013), and it means that any item or task response that is assigned a score depends upon, or is regressed upon, an underlying long-term stable latent trait that is not directly observed. As such, the latent trait, abstracted from the world and stored in the head of the user, is treated as the cause of student responses. Treatment of attributes as latent traits is arguably the most popular psychometric approach to the study of scores as reliable and valid and is evidenced in the variety of confirmatory factor analysis (CFA) and item response theory (IRT) models that psychometricians specify and test. It is also important to note that these approaches are successful between-subjects covariation techniques in which scores are compared and positioned relative to one another, most often in reference to location in a normal distribution; none of these assumptions is taken as given in an ecological psychology description of game play.
From the discussion of the psychometric status of the attributes of game play and test taking, respectively, it is hopefully becoming clear that the unit of analysis for game play must be a continuous cycle of perception-action couplings in a dynamic agent-environment interaction. These perception-action couplings are manifest given the effectivities of the learner and the affordances available in the game (learning) environment, and neither student nor environment remains the same. The process unfolds and presents a response stream that can be modeled statistically as situated experience. Therefore, there is no abstraction from the world or need for storage of ideas. Comparatively, the unit of analysis for test taking is the static student whose achievement is nested in the static contexts of higher-order or multilevel units of analysis: classrooms, schools, school districts, provinces/states, countries (e.g., Raudenbush & Bryk, 2002). Indeed, covariates (e.g., classroom enrollment size, school socio-economic status) can be entered into the hierarchical equations at any level, signifying contextual elements that explain sources of variation in students' scores. Further, cross-level interactions (e.g., students' prior achievement scores crossed with school socio-economic status) can be evaluated for significance and effect size. However, these covariates or covariate interactions are not affordances students detect and with which they interact during learning. Further, for both students and for higher levels of analysis such as classrooms, variables are most often treated as trait-like and stable, not state-like and continually changing (e.g., Molenaar, 2004).
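For readers less familiar with the multilevel models referenced above, a generic two-level specification (standard hierarchical linear model notation, not a model from any particular study) shows how such covariates and cross-level interactions enter the equations:

\[
\text{Level 1 (students): } Y_{ij} = \beta_{0j} + \beta_{1j}\,(\text{prior achievement})_{ij} + r_{ij}
\]
\[
\text{Level 2 (schools): } \beta_{0j} = \gamma_{00} + \gamma_{01}\,(\text{school SES})_j + u_{0j}, \qquad
\beta_{1j} = \gamma_{10} + \gamma_{11}\,(\text{school SES})_j + u_{1j},
\]

where \(Y_{ij}\) is the score of student \(i\) in school \(j\) and \(\gamma_{11}\) carries the cross-level interaction of prior achievement with school socio-economic status. As argued above, these terms describe static, between-subjects structure; they are not affordances that students detect and act on during learning.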
The next five characteristics in Table 3.1 address the data sources that researchers can analyze for both game play and test taking. These include (1) stimuli of the assessment situation; (2) responses to assessment stimuli; (3) feedback given to responses; (4) goals or intentions based on the feedback provided; and (5) degrees of freedom for any agent to alter the course of game play or test taking based on any feedback provided. We unpack each of these briefly next.
The stimuli of assessment situations are arguably infinite and would include any sensory information that can be detected (e.g., light, sound, touch). However, as an assessment activity, these stimuli are select features that can be incorporated into the quantitative models to indicate students’ progress. As such, they are part of assessment design. For game play, the stimuli would include board features (see Figure 3.1) as well as any additional metagame resources (e.g., notes, internet support sites) that allow the game activity to unfold on a single instance or across instances. Team members and opponents would also be part of the stimulus field. By comparison, while students can complete test-type exercises collaboratively (e.g., Borge & White, 2016), most often, and in the case of standardized achievement tests, they do so alone with limited stimuli that include directions, items, options, calculators, and note-pads. All stimuli can be coded as contributing to classical item characteristics such as level of difficulty, degree to which the item distinguishes between low and high scores (i.e., item discrimination), and guessing. For example, Gorin and Embretson (2006) have used this strategy to understand more about students’ answers on paragraph comprehension items beyond basic estimates of a one- or two-parameter logistic IRT model. Often called multicomponent latent trait models (Embretson, 1985), these IRT models are informed by cognitive theory and are applied best when tenets of the theory are incorporated into the assessment design. Gorin and Embretson (2006) described construction of a spatial analogical reasoning task, like Raven’s Progressive Matrices, and showed how addition, deletion, rotation, and transformation of item stimuli correlate with item difficulty and estimated ability or proficiency of examinees. Still, this approach to assessment, data capture, and psychometric modeling is more trait-based, stable, and between-subjects than what can be modeled for game play, and this leads to the next characteristic of Table 3.1, and potentially, one of the most important: the manifest response indicators (Markus & Borsboom, 2013) that become the data streams by which psychometricians and statisticians test their models.
The primary responses for game play are the moves or turns that each player takes and the contributions they make to team planning and distributed cognition. Most games require several moves or turns in order to arrive at the final outcome: win, lose, or draw. Consider three of the most classic of all games: checkers, chess, and tic-tac-toe. Like Expertise, these are board games, but simpler in design. All have multiple stimuli. All require multiple moves, often timed moves as in the case of elite-performance chess, and all progress in such a way that "on-the-board" stimuli (e.g., game locations, pieces) decrease over time. For test taking, there are some similarities; however, there are also key differences. Test items do afford a sequence of responses, whether selected on multiple-choice tests or constructed for short answer or essay questions. Further, any situation of test taking can be timed, as in the case of the Scholastic Assessment Test (SAT) or Graduate Record Examination (GRE). However, the student, who is presumably attempting to provide her/his best performance response after response, whether selected or constructed, is unaware of what the test developers' intentions and goal dynamics are as each item is encountered. This lack of information also applies to computer-administered tests, as in the case of computer adaptive testing (CAT), where items are tailored to examinees as they record or submit each response in sequence (e.g., Lord, 1968; van der Linden & Glas, 2000). As such, there is little feedback provided to the examinee, certainly no knowledge of results, and not even guidance in a form such as, "Here is an easier item. Try it." or, "Here is a more difficult item. I think you can do it." Instead, the test-taking endeavor maintains the calibration of items so that each examinee can be positioned, in a between-subjects evaluation, as below average, average, or above average within what is most likely a normal distribution. Further, environments for test taking often require a "standard setting," which means that the conditions (e.g., proctor, seating arrangements, time limits) for completing the assessment remain as constant as possible for all examinees.
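To make the item-tailoring idea behind CAT concrete, the following is a deliberately simplified, hypothetical sketch of a maximum-information selection loop under a 2PL model. The item bank, the fixed "correct" response, and the crude ability update are placeholders for illustration, not how operational testing programs such as those named above are implemented.

```python
import math

# Illustrative sketch of adaptive item selection (maximum information under 2PL).
ITEM_BANK = [  # (discrimination a, difficulty b)
    (1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5),
]

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

theta_hat, administered = 0.0, set()
for _ in range(3):  # administer three items adaptively
    # pick the unused item that is most informative at the current estimate
    j = max((i for i in range(len(ITEM_BANK)) if i not in administered),
            key=lambda i: item_information(theta_hat, *ITEM_BANK[i]))
    administered.add(j)
    correct = True  # placeholder: in practice, the examinee's actual response
    # crude step update of the ability estimate (real CATs use ML/EAP estimation)
    theta_hat += 0.5 if correct else -0.5
    print(f"administered item {j}, theta estimate now {theta_hat:.2f}")
```

Note that nothing in this loop returns information to the examinee; the adaptation serves the calibration of the score, which is precisely the contrast with the immediate, intention-resetting feedback of game play discussed next.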
Next in Table 3.1, the characteristic feedback to responses highlights that feedback for each move or turn taken in game play is immediate, and that this immediate feedback resets the dynamics of intentions (a single move) to intentional dynamics (e.g., how has my individual strategy or my team’s goals changed? [Young & Barab, 1999]), unless play is disrupted or interrupted, as with the end of the game or delays in outdoor games (e.g., weather delays). By comparison, feedback to the responses that students provide on tests can be significantly delayed. While contemporary standardized assessments, like the SAT and GRE, provide computerized score reports upon completion of the tests, examinees are not provided any feedback while taking the test. In other test-taking situations, such as classroom assessments, students might wait days or weeks before they have a sense of their progress. Finally, in tests administered for research purposes, participants are unlikely to receive any feedback about how well they performed unless they inquire about their results at the conclusion of the study.
These points made about the immediacy or delay of feedback relate directly to the next characteristic in Table 3.1. While Gibsonian accounts of perception-action are as relevant for game play as they are for test taking, as we have written previously, game play affords more opportunity for discussion and adoption of goals related directly to learning to improve performance as part of the activity (e.g., preparing for the challenges at the next level; altering strategies to improve; identifying opportunities for distributed cognition among one’s teammates) than does test taking, which in some ways can only mean attempting to get as many correct answers as possible, even if by guessing. The overt learning goals of game play highlight how players may be operating at multiple levels of an ontological hierarchy of goals and, within particular instances of games, may not appear to be showing optimal performance, as in the case of testing out strategies or stretching the rules of the game (e.g., playing “what will happen if I try this?”).
These goal reformulation properties are related to what can be called the degrees of freedom for each of the two measurement situations. In statistics, degrees of freedom are values that are free to vary. While the term primarily pertains to samples selected from populations in the estimation of parameters, the idea is relevant for our current discussion. Degrees of freedom relate to the number of parameters that constrain one’s perceiving and acting. In effect, they relate to constraining the opportunities for action. In some situations, such as free play on a schoolyard, as with most human situations, there are nearly infinite degrees of freedom (Mitchell, 2009). In situations like the card game War, there are fewer degrees of freedom and performance can seem repetitive, leaving little behavior to assess. In most games, the rules are set at the start of play, but in some games (like Fluxx) the game rules themselves emerge from play, or they are initially hidden to all but a few players (as in Betrayal at House on the Hill). In these cases, the degrees of freedom are changeable and must be assessed continuously throughout play. Whether emergent or not, the rules of the game establish the degrees of freedom for game play. The patterns of turn-taking, rolling dice, drawing cards, and discussing next moves with teammates are part of the rules of the game.
Note the use of gerunds in the preceding sentence. These nouns derived from verbs illustrate not only the value of action; each gerund also defines constraints under which game play can unfold. For test taking, it is no different. Test taking involves following directions, reading stems, selecting among options A to D, and constructing short answer responses or short essays as in the case of the National Assessment of Educational Progress (NAEP). Consider two hypothetical NAEP-like items, one multiple choice and one a short constructed response.
Multiple-choice item: Which of the following psychologists focused his work on the importance of visual perception?
There are several important points to consider about such items. First, the actions of examinees are arguably more passive or reactive than those for game play. Second, except for the responses that examinees can construct, the constraints are so limiting (e.g., a finite option set, restricted space allocated for writing a constructed response) that there is limited opportunity for students to interact with the environment in ways where their responses provide affordances for further action by an evaluator, team member, teacher, etc. So, in essence, the flow of activity has stopped with the response for that item stimulus field. Examinees must then re-situate to move on to the next stimulus field (i.e., the next item on the test), which may present content that is significantly different from the item just completed.
We acknowledge that many computer-based assessments are now designed to relax some constraints (e.g., Jang et al., 2017; Siddiq, Gochyyev, & Wilson, 2017). For example, Quellmalz and colleagues (2013) studied scores from three different “next-generation” assessment environments designed in accordance with school standards such as the College Board Standards for Science Success (College Board, 2009) and the National Research Council’s Framework for K-12 Science Education (National Research Council [NRC], 2012). With designs referred to as “static,” “active,” and “interactive,” Quellmalz and her research team demonstrated psychometrically that as environments became less constrained (i.e., interactive), the dimensionality of the responses (i.e., how many factors or latent traits predict performance) increased.
Static modality item designs looked very much like the two hypothetical examples we presented previously. As one example, students looked at four different ecosystem food chain diagrams, presented as a stem, that depicted relationships among animals (i.e., bear, caribou, and hare) and plants (i.e., grass and lichen). Students then read a description of a food web and had to select the option that correctly matched the description to a diagram.
In the active and interactive modalities, students’ opportunities to use menu tools to construct food web diagrams increased. For example, students could view simulated animations of predator-prey ecosystem dynamics. With the ability to control viewing and re-viewing of these animations, the active modality afforded examinees more control over managing information before they would either select a response among multiple-choice options or construct a response using arrows to diagram food web flow. Finally, the interactive modality permitted the highest degree of activity (and degrees of freedom) and the most authentic engagement for students. In this modality, learners could conduct scientific inquiry using data streams much as expert scientists do before they would select or construct an item response. Much as in game play, the interactive modality updated information flow based on each input or “move” made by the student.
Tracking the data via computer logs (e.g., Tsai, 2018) allows for process tracing (Lindell et al., 2018), which is an entire specialization in the design of technology-rich learning environments (Jang et al., 2017) used for the study of dynamic decision making. Large amounts of data (big data) can be harvested and used for information systems research (Barki, Titah, & Boffo, 2007). This set of topics also leads to the next characteristic in Table 3.1, score assignment. For highly interactive modalities such as game play, any move, such as a keystroke, menu review, roll of dice, or selection of a card from a deck, can be coded dichotomously as present or absent (i.e., binary 1 or 0) as well as time stamped for duration and even position (space coordinates). By comparison, test taking, described as a static assessment modality, can lead to score assignment that is either dichotomous (e.g., disagree/agree, incorrect/correct) or polytomous (e.g., Likert scale, partial credit). In fact, IRT models vary given the type of assessment (e.g., achievement measure, affective scale) as well as the score assignment (i.e., dichotomous or polytomous). Masters (1982) proposed a partial credit Rasch model to scale distractors on multiple-choice items given degrees of correctness. Similarly, Andrich (1978) presented a Rasch rating scale model to evaluate the contributions of Likert-scale categories (e.g., strongly disagree to strongly agree) to score reliability and validity. Numerous other IRT models have been introduced in the psychometrics literature for polytomous score assignment that might prove useful for game play assessments (Embretson & Reise, 2000).
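To make the contrast in score assignment concrete, the sketch below codes game-play events as time-stamped records that can be scored dichotomously (target action present or absent) or polytomously (an ordered partial-credit category). The record structure and scoring rules are ours, offered only to illustrate the kinds of data streams described above; they are not the coding scheme used for Expertise.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class MoveRecord:
    """One game-play event, coded for later psychometric modeling (illustrative)."""
    player_id: str
    turn: int
    action: str                                   # e.g., "roll_dice", "draw_card", "place_piece"
    timestamp: float                              # seconds from start of play
    position: Optional[Tuple[int, int]] = None    # board coordinates, if applicable

def dichotomous_score(record: MoveRecord, target_action: str) -> int:
    """Score 1 if the target action occurred on this move, else 0."""
    return int(record.action == target_action)

def polytomous_score(duration: float) -> int:
    """Illustrative partial-credit rule: faster deliberate moves earn higher categories."""
    if duration < 5.0:
        return 2      # fluent move
    if duration < 20.0:
        return 1      # deliberate move
    return 0          # stalled move

# Example: two consecutive moves by one player
m1 = MoveRecord("player_A", turn=1, action="draw_card", timestamp=12.4)
m2 = MoveRecord("player_A", turn=2, action="place_piece", timestamp=31.9, position=(3, 5))
print(dichotomous_score(m2, "place_piece"))           # -> 1
print(polytomous_score(m2.timestamp - m1.timestamp))  # -> 1 (19.5 s between moves)
```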
What the vast majority of IRT models share in common leads to the next topic of Table 3.1, dimensionality of scores, which pertains to a topic in statistics called local independence. For test taking, the only property connecting the responses as a sequence is the one or more latent traits that predict the responses as manifest item indicators (Markus & Borsboom, 2013). Various models of dimensionality exist, including the classical one-, two-, and three-parameter dichotomous IRT models (Hambleton, Swaminathan, & Rogers, 1991) and their extensions to polytomous single- and multiple-dimension IRT models (Reckase, 2009), to bifactor models (DeMars, 2013), and even to multilevel IRT models (e.g., Fox, 2004; Wilson, Gochyyev, & Scalise, 2017). In these more advanced models, estimates of item parameters are studied given variance contributed by clusters (e.g., dyads, groups, classrooms, schools, districts) as well as by individuals, which potentially could capture both player and distributed team cognition. Maximum likelihood estimation facilitates the estimation of parameters and the study of model-data fit so that error or residual variance is minimized given score patterns. Therefore, when model-data fit is supported, scores can be summed as a total composite if unidimensional and as discrete subscales if multidimensional (e.g., Dumas & Alexander, 2016; Schilling, 2007). This can facilitate statistical analysis using general linear model (GLM) procedures and their extensions (e.g., HLMs, HGLMs) to address classical research questions such as, “Are there significant differences in instructional treatment conditions on reading comprehension after controlling for prior knowledge?” (e.g., McNamara & Kendeou, 2017), or “Do offline and online reading skills predict critical evaluation of Internet sources?” (e.g., Forzani, 2018). However, these group-based and aggregate modeling techniques (Boker & Martin, 2018) impede, and potentially prohibit, a data-analytic science of learning and problem solving as it occurs most naturally for any one learner attempting to coordinate activity with any other learner, as in the instance of the distributed cognition of collaborative game play.
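Local independence can be stated compactly: conditional on the latent trait(s), item responses are assumed to be statistically independent, so the probability of a whole response pattern factors into a product over items (notation ours, for exposition):

\[
\Pr(X_1 = x_1, \ldots, X_n = x_n \mid \boldsymbol{\theta}) \;=\; \prod_{i=1}^{n} \Pr(X_i = x_i \mid \boldsymbol{\theta})
\]

It is precisely this factorization that the turn-by-turn dependencies of game play violate, which motivates the treatment of local dependence that follows.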
Game play, in contrast to test taking, cannot be anything other than emergent and nonlinear. Local dependence (see Table 3.1) is inevitable, and dimensionality, singular or multiple, unfolds within multiple nested spacetimes (e.g., McKee et al., 2018). As we introduced previously, there is no presumption of a latent abstraction underlying individual and distributed cognition. Summed scores for composite variables, such as those of the many verbal (e.g., vocabulary) and performance (e.g., working memory) subscales of intelligence tests, are not the measured attributes of interest. Instead, researchers use the terminology of “complexity” (e.g., Hilpert & Marchand, 2018) and “dynamical systems” (e.g., Kringelbach, McIntosh, Ritter, Jirsa, & Deco, 2015) and dynamic concepts including “adaptation” (e.g., Mitchell, 2009), “diffusion” (e.g., Dixon et al., 2012), “emergence” (e.g., Hilpert & Marchand, 2018), “entropy” (e.g., Stephen et al., 2009), and “self-organization” (e.g., Greenberg, 2014) to describe and model phenomena scientifically. These are the dynamics that underlie an ecological psychology description of board game play, and that we propose to capture in Expertise play as an assessment.
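As one small, concrete illustration of such dynamic concepts, Shannon entropy computed over a player’s distribution of move types indexes how varied or stereotyped play has become as a game unfolds. The calculation below is generic information theory expressed in Python, not a measure proposed by the sources cited above, and the move labels are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(moves):
    """Shannon entropy (in bits) of a sequence of categorical move types."""
    counts = Counter(moves)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# A player cycling through varied actions vs. one repeating the same action
print(shannon_entropy(["advance", "trade", "block", "advance", "trade"]))  # ~1.52 bits
print(shannon_entropy(["advance"] * 5))                                    # 0.0 bits
```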
Analysis of assessment data relies extensively on a mathematics suited to such terminology for either game play or test taking. Historically, complex and dynamical systems rely on difference or differential equations (e.g., Dixon et al., 2012; Stephen et al., 2009). However, there are numerous challenges when undertaking mathematical and statistical modeling of this kind. As a primary challenge, only the simplest of models are tractable or have approximate solutions (Ocone, Millar, & Sanguinetti, 2013). Relatedly, modeling with differential equations makes assumptions about initial conditions. While it is nearly impossible for researchers to specify the initial conditions of any complex social system, such as learning and problem solving as it takes place in classrooms, the initial board and player conditions of games may be more tractable. A final limitation of such models is that their properties are not easy to understand mathematically or to translate into results that can inform best practices or classroom decision making. As Hilpert and Marchand (2018) discussed in their review of complex systems for educational psychology, dominant component linear models for studying test scores (i.e., test taking), such as regression and its extensions to path and structural equation models (SEMs), will prevail, at least for the near future. These linear (static) models are entrenched in classical texts and manuscripts adopted by the field (e.g., Bollen & Long, 1992; Byrne, 2001). In their final remarks, the authors write: “Integrating CS (complex systems) research into educational psychology may require more flexible thinking about research methods, particularly with regard to significance testing, commensurate forms of data, and generally what counts as sound evidence within empirical research.”
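The tractability and initial-condition concerns raised above can be seen even in a toy difference equation. The Python sketch below iterates the logistic map from two starting values that differ by only 0.001; in the chaotic parameter regime the trajectories diverge within a handful of steps, which is one way to see why specifying initial conditions matters so much for complex systems models of classrooms or games. The map and its parameter are a standard illustration, not a model of Expertise play.

```python
def logistic_map(x0, r=3.9, steps=20):
    """Iterate x_{t+1} = r * x_t * (1 - x_t), a minimal nonlinear difference equation."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two initial conditions differing by 0.001
a = logistic_map(0.500)
b = logistic_map(0.501)
for t in (0, 5, 10, 15, 20):
    print(f"t={t:2d}  x_a={a[t]:.3f}  x_b={b[t]:.3f}  |diff|={abs(a[t] - b[t]):.3f}")
```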
However, such progress in integrating complex systems research also reflects the final two characteristics we present in Table 3.1. For game play, considering emergent dynamics allows for evaluation of progress and rewards for improvement as students’ goals become more numerous, their challenges increase, and distributed team cognition becomes more effective. This evaluation and rewards perspective often differs from typical course grades and high-stakes test scores, which focus on percentile ranks, school comparisons, and possible penalties for lack of improvement. Substantial challenges remain for applying these complex dynamical systems analyses.
Games have long been attractive as learning environments given that games provide goals (i.e., objectives), rules for constrained interactions, relevant and immediate feedback, and content mastery and, more importantly, create playful spaces where players are welcome to explore the context (i.e., engaged participation) and to share and build the experience with others (i.e., co-construction and distributed team cognition) (Gee, 2003; Shaffer, 2006; Squire, 2006; Young, 2004). In a meta-review of the value of video games for classroom learning, Young et al. (2012) contended that individual game play is best described as situated cognition, while collaborative team play can be described as the emergent effectivities of a collective of agents that comes together to act in synchrony on shared goals. Several additional studies have pointed to the affordances of games as useful assessment tools (e.g., Chin, Dukes, & Gamson, 2009; Loh, 2012; Shute & Ke, 2012). Games may create good assessment contexts that capture the interactive/situative nature of group and individual cognition and action (Schwartz & Arena, 2013; Shaffer & Gee, 2012; Steinkuehler & Squire, 2014; Young et al., 2012). For example, Shaffer and Gee (2012) argued that games are good assessments in that every action or decision that players make in a gaming context draws on players’ abilities in the moment along many dimensions, and collaborative play hinges on the situated and embodied nature of team cognition and action (e.g., thinking in situ in the gaming context, strategizing while acting in the context). Steinkuehler and Squire (2014) also asserted that a game itself is a good assessment tool in that games enable us to track and monitor players’ performance, provide just-in-time feedback on any performance, and offer rich data on problem solving in situ. In addition, some games enable us to make observations in authentic contexts where we create the complex, realistic scenarios required to evaluate players’ situative actions and cognition (DiCerbo, 2014).
The dynamics of collaborative board game play seem to parallel the dynamics of complex educational settings like public schools, and we have preliminary data showing positive correlations between game scores, specifically for Expertise, and other measures of master teachers’ performance in the realm of technology integration at the graduate school level. One of the interesting aspects of board game play as a form of distributed team cognition is its defined pattern of turn-taking, so each individual has roughly equal opportunities to contribute to overall team success within the constraints of their various roles. This again nicely constrains some of the degrees of freedom in group collaborative interactions, perhaps enabling complex systems modeling.
It seems clear to us that current assessments that rely on linear assessment models to evaluate collaborative cognition in game play bring with them limiting factors that strain our theory of human cognition. We recommend the adoption of a situated approach that may rely more heavily on big data computer processing and the analysis of deep learning networks. There is no guarantee that these approaches will be better able to characterize the emergent interactions of individual and team cognition as described by ecological psychology and situated learning. But we have to try. The playful learning that occurs in structured board game play may prove a sufficiently bounded context in which to explore the application of these approaches while keeping the degrees of freedom manageable.
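As one sketch of what such deep learning analysis might look like, the toy PyTorch model below encodes a team’s sequence of coded moves with a recurrent network and outputs a single win/lose logit. The architecture, the move coding, and the outcome label are all invented for illustration; we are not claiming this is the analysis to run, only indicating the general family of sequence models that big-data game logs would support.

```python
import torch
import torch.nn as nn

class MoveSequenceClassifier(nn.Module):
    """Toy recurrent network: encode a team's sequence of coded moves, predict game outcome."""
    def __init__(self, n_move_types: int, hidden: int = 32):
        super().__init__()
        self.embed = nn.Embedding(n_move_types, 16)      # move-type codes -> vectors
        self.rnn = nn.GRU(16, hidden, batch_first=True)  # summarize the unfolding sequence
        self.out = nn.Linear(hidden, 1)                  # win/lose logit

    def forward(self, move_ids):                         # move_ids: (batch, sequence_length)
        x = self.embed(move_ids)
        _, h = self.rnn(x)
        return self.out(h[-1]).squeeze(-1)

# Hypothetical batch: 4 games, each a sequence of 12 coded moves drawn from 20 move types
model = MoveSequenceClassifier(n_move_types=20)
moves = torch.randint(0, 20, (4, 12))
logits = model(moves)                                    # one outcome logit per game
print(logits.shape)                                      # torch.Size([4])
```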
We are encouraged by recent developments in psychometrics where attention has been given to modeling complex dynamics (e.g., von Davier, 2017) that could be specifically applied to distributed team cognition and collaborative game play. In a 2017 special issue of the Journal of Educational Measurement, with guest editor Alina von Davier, initial presentations of network models, of time series incorporated into IRT modeling, and of multilevel multidimensional modeling of individual student variation coupled with that of their peers (team player distributed cognition) provided evidence of promising developments that bring together learning theory, game design, big data capture, and psychometric analysis. In our future work, we look to contribute to this dialogue with analysis of the many unfolding data streams available through the study of board games like Expertise. We also acknowledge there is much work yet to do to fully describe the dynamics that unfold on multi-fractal levels when individuals interact with game mechanics as individuals and as collaborative teams. Fortunately, the tools and supporting theoretical frameworks appear to be emerging to support such research efforts. Our suggestion is that this work be framed within the ecological psychology worldview and investigate individual and team cognition as “situated” and emergent in a dynamic agent-environment interaction that is organized by individual and shared goals.