Collaborative Board Games as Authentic Assessments of Professional Practices, Including Team Cognition and Other 21st-Century Soft Skills

Authored by: Michael F. Young , Jonna M. Kulikowich , Beomkyu Choi

Fields of Practice and Applied Solutions within Distributed Team Cognition

Print publication date:  September  2020
Online publication date:  September  2020

Print ISBN: 9781138626003
eBook ISBN: 9780429459542






Learning from Play

That we all learn from play seems undeniable. But often it takes an educational psychologist to point out the obvious: that for most of life’s important lessons we as individuals, and as teams, learn first, and perhaps primarily from playing around (Vygotsky, 1966, 1978). Playful learning both relies on distributed team cognition and contributes to its development. While playful learning is a large part of childhood, it also remains a lifelong aspect of formal and informal education. Children play house without the burdens of really cooking, cleaning, or paying bills, or play cowboys without the dangers of real horses, guns, and stampeding animals.

Many simulations involve playful learning and draw on role playing and imagination to establish realistic, if not real, conditions for learning. Distributed cognition is an essential component of successful team play, whether in formal team sports or in massively multiplayer online games such as World of Warcraft. In such cases, key roles are predefined (quarterback vs. tight end, tank vs. healer) but success is largely determined by a shared and coordinated understanding of goals and execution of well-synchronized creative actions.

Yet, important dimensions of playful learning may be lost when play becomes the focus of controlled experimental studies, academic theory, scholarly analysis, assessment, and other research. Traditional approaches to measurement and research methods push for the assessment of isolated and individual play behaviors (not teams), linear conceptions of knowledge, and dispositions (not distributed knowledge) over discrete short periods of play from only single game instances (Resnick & Resnick, 1992; Roth, 1998; Schwartz & Arena, 2013). This stands in contrast to our real-life experiences, where we play the same games repeatedly with different opponents and even with different strategies and different team roles, and learn not from single instances of game play but from repeated experiences over substantial time and through post-game discussion, reflection, and analysis with other players, referred to as metagame experiences. The dynamics of game play over various occasions, together with metagame reflections and interactions, are an important aspect of playful learning that is often missed but is an essential element of our ecological and situated approach (Young, 2004; Young et al., 2012). Play is both something we do when we follow the rules of a game and an approach we take when entering into an interaction, perhaps playfully testing the limits of those rules. This latter idea, of a playful approach to the world, represents an individual’s or group’s goal set to intentionally manipulate aspects of the situation in order to achieve a playful interaction (Young & Slota, 2017).

As children play games, they learn physical and mental skills (Resnick, 2006) and acquire new ways to affect the world (effectivities). They also learn strategies for how to play games and how to interact with other players (Gee, 2003; Squire, 2008). As described by a quote from a future teacher in our college teacher preparation program,

When I was a child, I played all kinds of board games and physical games (like twister or spud). Those games taught me a lot of rules. For instance, I had to learn the rules for the game to play, but I also learned ethical rules, such as don’t cheat and don’t touch other people’s cards/property. Physical games and games we played outside taught me basic coordination and how to play on a team. Some games were centered around educational concepts like counting (Hershey’s Kisses game), or making words, or describing a word (Scrabble, Scattergories, and Taboo). Some required deductive reasoning and critical thinking (Clue), while others required drawing (Pictionary) or acting (charades).

Rebekah Labak, threaded discussion post, fall 2018

It seems from games children can learn many of the basic social interactions they will need the rest of their lives. From games like Scrabble they can learn to be playful with language and add to their vocabularies with a focus on word relationships and spelling. In many games, like Monopoly, they can learn about money, value, and counting. And they can learn the basics of how to treat each other, including how to follow rules or how to cheat (hopefully not), as well. In this process they can playfully explore a personal ethic (strategy) as well as experience the social consequences of defying the rules. Play has also been linked to the development of creativity and other key aspects of child development (Howard-Jones, Taylor, & Sutton, 2002; Lillard, 2013; Resnick, 2008; Russ, 2003). It is interesting to note that for some approaches to early childhood education, like that of Maria Montessori, the fantasy aspects of play were decoupled from the mechanics of games and playful learning, and this remains an important distinction often discussed when trying to parse simulations versus games.

For our purposes, the question then becomes, what can adults learn from game play, and can game play be as powerful a social learning environment for advanced skills, such as the distributed team cognition of surgical teams debriefing or military planners during war games, as it is for development of early childhood social skills?

One important distinction to make at the outset of this chapter is between structured play, as is the case with most video games, card games, and board games, versus unstructured or “make believe” play. The latter form of playful learning has been discussed by child psychologists with regard to the development of social skills, creativity, and self-regulation (see e.g., Bodrova, Germeroth, & Leong, 2013; Lillard et al., 2013). In this chapter, we will focus on structured play that fits the standard definition of “game,” which generally includes the presence of rules, turn-taking, and some sort of end goal state such as winners and losers.

Games have certainly been suggested as a teaching tool for skills in many advanced disciplines beyond preschool and early elementary school, and appear to be useful for learning about real-world environments including transportation planning (Huang & Levinson, 2012), safety science (Crichton, 2009), 5th grade science (Wilkerson, Shareff, Laina, & Grave, 2018), engineering design (Hirsch & McKenna, 2008), cybersecurity for avoiding phishing (Arachchilage & Love, 2013), and notably for our present topic, engineering teamwork (Hadley, 2014). Meta analyses of the effectiveness of games for learning advanced school content generally find small but positive effects of playful learning approaches (Clark, Tanner-Smith, & Killingsworth, 2015; Vogel et al., 2006; Wouters, van Nimwegen, van Oostendorp, & van der Spek, 2013; Young et al., 2012), but such effects are summarized across a diverse set of games and players and thus are hard to interpret at such a broad level of analysis. We have characterized this as looking for our princess in the wrong castle (Young et al., 2012) and have made a case that a more situated analysis of game play is required to account for the rich social context in which game play emerges.

In our work to study games as assessments, distinctions between what makes a game a simulation, or what makes a simulation also a game, have not been useful. At times, simulations are judged by their fidelity and verisimilitude, whereas games are judged by their playfulness. Yet upon closer inspection, games and more specifically game mechanics have been characterized variously as a subtype of simulations, as the superset of simulations, and as parallel to simulations with shared elements. For example, when developing taxonomies of games, Wilson et al. (2009) described 18 categories of game features, which Bedwell, Pavlas, Heyne, Lazzara, and Salas (2012) refined into nine features. Koehler, Arnold, Greenhalgh, and Boltz (2017) tested the value of these nine features for characterizing how gamers review and evaluate games, only to find that some additional features were needed. Drawing on Malone and Lepper’s (1987) description of what makes learning “fun,” many of these game features describe things that players enjoy (e.g., competition, conflict, control, human interaction) and not features that might distinguish key differences between the more veridical nature of simulations (real-world physics constraints) and the sometimes more fantasy-based nature of games. In our own work, we have found it best to characterize both simulations and games along the dimensions of playfulness, emotional experience as fun, and rules/game structures. For example, simulations (constrained by real-world parameters) can be playful, in the sense they allow users to explore openly and combine actions in ways not fully anticipated by instructors/designers. Likewise, many games involve some real-world constraints (gravity, solid structures) and we are aware of elite soccer players who use soccer video games to simulate player strategies. In the case of professional sports, playing a game may not be experienced as fun at all and may instead be hard work.

While some believe all games are a subset of simulations because they use some real-world constraints, others have argued that simulations are a subset of games because they use various game mechanics to various degrees and are sometimes experienced as fun and playful, like some games. For our purposes, none of these attempts to apply a taxonomy to games or organize the features of games in contrast to simulations have proved useful for research. Instead, we would assert that both games and simulations utilize various degrees of playful mechanics, are experienced with varying degrees of “fun,” and have various levels of rules and end states/goals, whether competitive or collaborative or some combination (such as games with a betrayal mechanic).

For the topic of team distributed cognition, our interest is particularly on collaborative board games in which players perform game tasks together through communication, coordination, and other team process behaviors and achieve results as an integrated unit. There is an element of competition still with such games as they often include individual player scores, but collaborative board games are played against the board itself and the win condition is win or lose for all. For example, in Forbidden Island, time is running out as the island is sinking and players must collaborate and leverage their individual skills to escape before the water rises to cover the island; if the island sinks, the players all lose as a team.

As mentioned regarding engineering teamwork, Hadley (2014) used the board game Pandemic as an educational intervention to encourage engineers to reflect on their teamwork skills and strategies. In this study, board game play was the intervention, not the assessment. But this intervention highlights the need to focus work concerning team effectiveness on the interactions of player and board position dynamics rather than on a static momentary snapshot of individual scores, team scores, or individual cognitive strategies. For our purposes team cognition involves:

  1. Intellectual diversity
  2. Civil discourse, active listening
  3. Balanced contributions across team members
  4. Shared group decision making, consensus building
  5. Coordinated goal/task planning/setting
  6. A shared perception of team cohesiveness

These properties of team cognition were present in the game play of Pandemic while the game environment allowed players to fail with little consequence and in so doing, reflect on how their team contributions impacted success or failure overall. For our analysis, we would note that it was the repeated trials across game play that provided the information “flow field” from which to detect the invariant structure of good distributed cognition. As our analysis suggests, most of the learning from Hadley’s (2014) study of Pandemic came from players reflecting on game play in written memos. This was triangulated with the observations of game facilitators.
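
Of the properties listed above, balanced contributions across team members lends itself most directly to a simple quantitative illustration. What follows is a minimal sketch only, not a measure used by Hadley (2014) or in this chapter. It assumes a hypothetical log recording which player made each move or utterance, and it scores balance as normalized Shannon entropy. The function name, the log format, and the player labels are all illustrative assumptions.

```python
import math
from collections import Counter


def contribution_balance(turn_log, players):
    """Score how evenly contributions are spread across a team.

    turn_log: hypothetical list naming the player behind each logged
              move or utterance, e.g. ["medic", "scientist", "medic"].
    players:  every player at the table, including silent ones.

    Returns normalized Shannon entropy: 1.0 when all players contribute
    equally, 0.0 when a single player makes every contribution.
    """
    if len(players) < 2:
        raise ValueError("balance is only defined for two or more players")

    counts = Counter(turn_log)
    total = sum(counts[p] for p in players)
    if total == 0:
        return 0.0  # nothing logged, so no evidence of balance

    entropy = 0.0
    for p in players:
        share = counts[p] / total
        if share > 0:
            entropy -= share * math.log(share)

    # Divide by the maximum possible entropy for a team of this size.
    return entropy / math.log(len(players))
```

Computed separately for each game in a series, a score of this kind would yield a trajectory across repeated plays rather than a single snapshot, which is closer to the repeated-trials view taken here.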

As stated, our view of game-based learning includes more than simply the interactions that unfold during the game itself; it also includes the metagame. Metagame interactions include post-game reflections and conversations; online searches for game-related hints, cheats, and tutorials; and creative reflection by players on how to formulate strategies for game play on subsequent occasions. They apply across a series of game plays. It is the dynamics across game-to-game strategies and collaboration among game players outside any single instance of game play that interest us as much as the interactions within a single game experience or the final score from a single instance of game play. This perspective alters classical notions of assessment as starting points, midpoints, or endpoints as in the cases of placement (e.g., pretest), formative (e.g., midterm), and summative (e.g., final exam) evaluations. Instead, the entire set of temporal trajectories of coordinated activity between team players and across game trials becomes the means for defining operationally what playing a game means (e.g., Andrews et al., 2017; Kim, Almond, & Shute, 2016).

As far back as classic educational games like Oregon Trail, students have been known to play games without “getting” the key educational premise (the learning objective). That is, some players adopt game strategies (goals and intentions) that are unintended by game designers. For instance, players can focus on “gaming” the game (e.g., accumulating achievement points or killing off cattle) or on goals related to other superficial features of play (accumulating gold in the World of Warcraft auction house), rather than on the play mechanics intended to align with curricular goals and objectives. For example, Caftori (1994) described how the competitive aspect of Oregon Trail led some players (middle schoolers) to race for the end of the trail as fast as possible, without regard for their companions or oxen. This defeated the main objective of the game designers, which was to encourage social problem solving for survival and draw attention to the difficulties of losses along the trail. In this case, the common video game attraction of killing for loot replaced the instructor-intended and designer-intended student learning outcomes from game play:

The goal becomes so important that players neglect the health of other travelers and their own lives. Another example, shooting animals for food, was designed to teach children about different animals in different terrain, as well as be part of the reality of life on the trail. However, “shoot ’em up” has become a focus of attention for many students (mostly boys). Unfortunately, besides eye-hand coordination, not much else is learned. And eye-hand coordination is not one of the stated objectives of this game.

(Caftori, 1994, p. 6)

This simple example highlights that game play is an emergent interaction that results from player strategies that arise from a dialectic interaction between game affordances and player intentions, on the fly. It is this property of board game play, essential to understanding games as assessments, that we focus on in this chapter.

As a final part of our introduction, we would like to separate board game play from typical action video game play. Board game play typically takes place on a much more relaxed and social scale than action video games. Recent work highlights the general cognitive impacts that highly stimulating action video games can have on overall cognition and of course distributed cognition, including verbal interactions as well as spatial skills and attention (Bediou et al., 2018). But for our purposes, we would like to focus on the impacts of interacting with others over a table-top board game where game mechanics draw from board positions, there is time for extended conversation and strategizing (turn taking rather than timed responses), and conditions change with random card selection and/or dice rolls. While board game play is engaging at a social level, we would suggest that its excitement levels and cognitive attentional demands would not parallel the impacts on general cognition of playing action video games. The cognitive impacts we observe tend to emerge less from overall cognitive arousal than from dynamic social interactions, problem solving, and group dynamics.

Theoretical Framework

The theoretical framework for our work draws from J. J. Gibson’s (1986) ecological psychology of individual perception-action that defines human behavior as a dynamic interaction between individuals and their environments. An interaction is taken as the fundamental unit of behavior (unit of analysis), thus co-defined by properties of individuals that enable them to affect the world (effectivities) and properties of the environment that afford or invite actions (affordances). This differs from the predominant cognitivist representational view of cognition in several ways and these can be illustrated in an analysis of typical board game play.

For the purposes of our analysis, we next introduce a few key concepts:

  • The information flow fields (visual, auditory, tactile) of game play
  • Detection of invariance among a sea of variance within the information flow fields across games
  • Affordances detected in the flow field that are co-determined by effectivities of the player and develop during game play
  • The hierarchy of intentionality and associated ontological descent of intentions during a single game and across instances of the same game
  • The boundary constraints created by goal adoption and rule sets of game spaces

In describing human perception and experience, the Greek philosopher Heraclitus of Ephesus is famously quoted as saying, “You can never step in the same river twice.” Drawing from Gibson’s ecological psychology we similarly would contend that you cannot play the same game the same way twice. While you might be guided to enact the same strategy twice, the changing information field created through interactions with other players, random events in the game such as dice rolls, and your own attention to certain details (and not others) mean the game experience as an act of perception-action that unfolds on the fly cannot happen identically twice. We suggest here that game play is thus an interaction between player goals and intention and the information flow field created by the actions of other players and unfolding game events. For us, rather than “turn” or “score” the basic unit of analysis in game play is a player-game position interaction within a context bounded by the game rules and the intentions adopted at the moment by the player/team.

These information flow fields, which continuously signal problems and changing conditions that the goal-driven agent (i.e., player) can detect, interact with, and act on, provide the context in which each player/team takes action (or makes moves). In doing so, players are guided by their goals and intentions (for winning), which direct their attention and guide their “pick up” of information directly from the environment, particularly certain regularities in the information flow field that specify affordances for action. These affordances are detected as invariant structures, much as Gibson described consistencies in the visual flow field when observers walked around a stationary object like a cube. As one moves, some things, like the visual background, move while some visual relationships (internal to the object) remain invariant. Those invariant structures specify the object and its properties (such as its graspability). Similarly, we would suggest that there are invariant structures in the information flow fields of game interactions that specify affordances for individual player “moves” and strategies as well as emergent (new) collaborative group objectives that can be perceived and acted on.

An elaboration of Gibson’s ideas was added to this analysis when Shaw and Turvey (1999) described the affordances that are detected in the information flow fields as co‑determined by properties of the individual and properties of the environment, an agent-environment interaction. Properties of the individual include what cognitivists would call their skills and abilities, but what are more precisely described within ecological psychology as “effectivities” or an agent’s abilities to have an effect on the environment through their actions. This would include their physical abilities to roll dice and move tokens around a board, but also their abilities to engage with other players in shared decision making and civil discourse and the skills (gear, weapons, etc.) their character acquires in the game.

To Gibson’s ecological framework of cognition, we add the contemporary learning science of situated cognition that draws heavily on apprenticeship learning and describes learning as the movement of individual behaviors from peripheral participation within communities of practice toward central participation. Situated cognition (Brown, Collins, & Duguid, 1989; Lave & Wenger, 1991; Rogoff, 1990) described learning as inherently social and a process of moving from the novice status as a peripheral contributor to the community of gamers toward a more central role of expert player. The ideas of situated cognition date back to John Dewey (1938) and the insight that we learn by doing. This has a modern equivalent in Papert’s (1991) constructionism (the word with the “N”), which builds on constructivism (the word with the “V”) by adding that it is only by building (constructing) artifacts that we genuinely come to know things. We apply this framework to games and playful learning in general, and specifically for this chapter, to collaborative and competitive board games that may serve as assessments.

A Situated Description of Learning from Game Play

To learn or play a game is a goal-driven dynamic process of perceptual tuning (Gibson, 2000). Across a player’s experience within a single game, and across several instances of game play, a player begins to detect invariance in the information flow fields (visual, auditory, and tactile) of the game that specify objects and concepts (including team concepts) that are part of game strategies, rules, movements, and outcomes. Outside the game, a player learns from discussions (online and face-to-face) with other players and eventually moves from status as a novice player to a more central contributor to the gamer world.

Fundamental to our description of game play is the idea that human behavior is a dynamic interaction of perception and action, a person-environment interaction. Action drives and defines what we perceive and perception guides our actions. Thus, perception and action co-define each other. The reciprocal nature of perception is often missing from a traditional description of behavior and from cognitive or behavioral assessments (Young, 1995; Young, Kulikowich, & Barab, 1997). Perception is often described as a passive cognitive act, by individuals and by teams. In describing board games, this would be the equivalent of assuming that the rules of the game are the rules that are guiding each player’s actions. But for our purposes, a player’s actions are determined on the fly, in each dynamic interaction with the game and other players. Thus, the chances that a player will play the game exactly the same way on two occasions, or that a team will act identically from game to game, are essentially nil. That is not to say that regularities in the game and in the play of other players will not result in observable or testable consistencies in the overall interactions that constitute game play.

We have chosen to focus on collaborative board games because, in addition to each individual player’s interaction with the game, there must also be a coordinated team interaction. While each individual typically has some unique skills defined by cards or dice rolls, there are also collective team goals that are only achieved through the coordinated action of all the players. While this adds a layer of complexity for a full description of game behavior, it also adds a potential source of information for assessment of team cognition in relation to individual achievements by creating a participatory environment (Gee & Hayes, 2012) where civil discourse, informal mentorship, and intentional learning can occur and be detected. That is, not only may collaborative board game play provide evidence of an individual player’s leadership and communication skills, it may also externalize how a team discerns problems and takes coordinated actions. In short, collaborative games, in comparison to competitive play, represent a richer context for distributed team cognition.

During any particular instance of a game, players must respond to the current board position, their own situations determined by cards, rolls, and prior accomplishments, and also perceive the trajectory of the game as the situation unfolds with other players. We could identify three types of goals for collaborative board game play:

  • Individual interactions with the board
  • Team co-actions with the board
  • Individual perceptions of teammate interactions and trajectories

Since in a collaborative board game, players must also coordinate their actions among their fellow players to work toward team objectives, we can distinguish goals the team shares and associated coordinated actions they take together from individual players’ actions and turn taking. All levels of game learning are done on the fly and in the moment and can be described by the ecological psychology of Gibson’s perception-action. That is, team goals are dynamic, as they are assumed to adjust with each individual play, and individual play is assumed to adjust to each changing game condition as well.

In addition to these three types of goals that unfold during game play itself, there are also game-related activities that occur outside of game play (meta-game interactions). Across instances of game play, players engage in reflective dialog with other players using online cheat/hint sites as well as live conversations about the game. This level of game learning seeks to pick up invariances across games to detect team interactions that proved successful or problematic, and to detect strategies that had worked in other games that had not yet been experienced directly through their own game interactions. Both within and across games, we would posit that the learning is well described as perceptual tuning.

To this we would add that within the community of gamers, there is a dynamic of social progress from peripheral players (passive readers of game sites) toward a more experienced central role (contributor to game sites) that helps define and possibly even develop the game environment. While many games are classics and evolve little through time (Monopoly, Scrabble) other games do change with user content (Cards Against Humanity, Trivial Pursuit).

Goals and Intentions: Individual and Team Strategies

From the ecological psychology perspective, human behavior is an emergent interaction between an intentional agent and a dynamic environment. The agent-environment interaction is the basic unit of analysis for any understanding of game play or distributed cognitive teamwork. On any particular occasion of human behavior (a speech act or game move), our framework holds that the intentionality of the moment is itself a complex interaction among a hierarchy of competing goals: a hierarchical ontological descent of goals to the moment of a game move. By this, we are not simply referring to Maslow’s (1943) hierarchy of needs, but to a complex hierarchy of intentions that each individual may adopt that exist and play out on a variety of space-time continua, including the low-level muscle movements and the highest-level game-long play strategies.

Many have argued that teams think. We agree that team cognition is more than the sum of individual thinking (perceptions and actions). Our ecological perspective on intentionality has been described in more detail in Young, DePalma, and Garrett (2002). Here we reframe that discussion away from butterflies and termite mounds (many simple animals’ intentions to eat and reproduce, when taken together create a shared, collective intentionality and coordinated activity to maintain, expand, and work toward functional goals within an ecosystem), toward a focus on the specific environments of humans playing collaborative board games. Board game designers create constraints on player interactions by constructing game rules. Designers hope to limit the interactions among players and the board to a specific goal space that reduces the degrees of freedom in normal life interactions with the world to a much smaller set of “valid” game moves. This limiting of the problem space may be particularly valuable for using board games as assessments. To explain the majority of board game interactions, it is fair to assume that players/teams have adopted the goal to win the game, and thus their intentions all draw from the state space of legal game actions. It is worth noting that this would not always be the case, and sometimes players may adopt goals to make illegal moves or have a play goal to intentionally undermine or disrupt game play or team success (the betrayer mechanic, as built into games like Betrayal at House on the Hill). Presumably, these non-standard game interactions could be detected somewhat readily, during and as a result of play assessments in the form of inferred player short-term and game-long goals.

Collaborative game play can be viewed as values realizing (Zheng, Newgarden, & Young, 2012). Drawing from Hodges (2007, 2009), our work with language learning through collaborative play in World of Warcraft (five player raid parties) shows how various individual goals for play and for team-building can be braided together with team play. Thus, we view games as ecosystems where value-realizing dynamics come into play. Values are real goods, intentions of the system that can only be realized through perception-action within a particular event/context. From an ecological psychology perspective, language is a perception-action-caring system in which speaking and listening “demand an ongoing commitment to directing others and being directed by them to alter one’s attention and action so that movements from lesser goods (i.e., one’s present board position, achievement or goal) to greater goods (e.g., values) is realized” (Hodges, 2007, p. 599). As such, values are not only properties of a person but are also about relationships and the demands with all others in the context. Collaborative board game play is a socially constrained field where all game decisions and actions are constrained and legitimated by values of the ecosystem with all the others (e.g., other players, board position, game moves). That is, it is a jointly enacted field, intended to realize shared values (i.e., achieving board positions and other subgoals and ultimately winning the game).

Dixon, Holden, Mirman, and Stephen (2012) organized their manuscript around a central question that we would pose to all those interested in team cognition: “Does cognition arise from the activity of insular components or the multiplicative interactions among many nested structures?” We agree with Dixon et al. that there is mounting evidence from a wide range of domains suggesting that cognition is multi-fractal, emerging from dynamic in-situ interactions and distributed across teams, rather than being dominated by stable latent traits or constructs (see also Hilpert & Marchand, 2018). Assessment models that break down human cognition into numerous independent components, such as memory, motivation, self-efficacy, interest, spatial skills, verbal skills, and reasoning, then apply an additive model and attribute all else to error variance, may well mask key distributed cognition dynamics that underlie team coordination and thinking. Perhaps a more positive takeaway from the multi-fractal analysis is that you can study cognition at any level (collaborative game play) and expect to see parallels at other levels (individual game play, applied collaborative teamwork).

Similarly, Cooke, Salas, Kiekel, and Bell (2004) argued that team cognition is more than the sum of the cognition of the individual team members. Instead, “team cognition emerges from the interplay of the individual cognition of each team member and team process behaviors” (p. 85). They argued that to measure team cognition, a holistic level of measures is required rather than collective metrics or aggregation procedures. The holistic measures focus on the dynamic team processes and co-actions taken by the team as a whole. In this regard, team consensus or communication consistency, team process behaviors, and team situation awareness (and associated taken-as-shared goals) are posited as potential measures of team cognition (perception-action).

Alternatively, we would like to begin to build an understanding of game interactions and their potential for assessment. Using our current learning games, we would like to collect 100 or so examples of play and consider training a deep learning network like Amazon’s AMI to identify features that might align with 21st-century skills of collaboration or with instances of micro problem solving that occur during game play (in board games like Expertise). Such networks have already been used in speech recognition, recognition of Chinese handwriting and zip codes, classification of text as spam, and computer vision. One approach would be to label examples of successful and unsuccessful collaborative game play activities and then proceed toward a deep learning network solution for recognizing those activities during game play.

To this approach we would want to add individual and team stated goals and objectives for play, discerned from retrospective game play analysis. This would enable us to begin to construct a description of game play as situated cognition and action. Toward this purpose we next describe one collaborative board game designed to enable and assess the team cognition of teachers working together to wisely integrate technology into classroom instruction.

Playful Assessment for Externalizing Teachers’ Technology Integration Skills: The Board Game Expertise

Board games are just one form of the broader category of playful learning. Board games feature the board “position” of players or tokens and often involve cards or dice to introduce chance events. When board games are collaborative, there are aspects related to individual players’ actions and decisions, as well as those related to the effectiveness of the team as a whole. Of course, when considering playful approaches to assessing distributed cognition, the traditional requirements that an assessment provide evidence that is consistent (i.e., reliability) and credible/useful (i.e., validity) must be addressed. If an assessment cannot validly capture what it is supposed to measure, it has little value formatively or summatively.

Traditional testing relies on classical test theory, which assumes that observed scores are simply the sum of true scores (i.e., true ability) plus error scores, or, in the case of latent trait or item response theory (IRT), on the assumption that logistic functions specify the relation between ability (e.g., achievement, reading comprehension, problem solving) and the probability of getting an item correct on a test. These classical psychometric frameworks assume that there is an individual factor and all else is error, taking no account of other factors such as context. However, in alignment with a perspective of situated cognition, we hold that cognition is always situated in the context of a world interaction, and contextual complexities (i.e., constraints) can substantially co-define (along with individual abilities) the nature of many activities. As such, our true ability always exists and operates as an interaction within the environment on the fly, co-defined by properties of the individual and properties of the environment. This means that the error term from classical test theory may not be entirely error at all, but rather needs to be taken into account to define our true ability (Bateson, 2000/1972; Hutchins, 2010; Young, 1995). We thus need to integrate contextual complexities into any analytic framework (i.e., unit of analysis) for the validity of an assessment and to examine to what extent the interaction with the assessment context fully represents what happens in the real world, as well as to what extent the assessment context is really related to applied practical contexts (Wiggins, 1993), with the goal of producing evidence showing that the results from the assessment really capture one’s authentic practice.

Mislevy (2016) made these points in an article entitled, “How Developments in Psychology and Technology Challenge Validity Argumentation,” and he drew specific attention to situative/socio-cognitive psychological perspectives as reasons for needed developments in psychometric modeling given opportunities to design interactive performance assessments with digital technologies. We will summarize some of these developments in this chapter. However, we first introduce the board game Expertise to anchor our explanation of the theoretical assessment premises.
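
As a compact reference for the two frameworks just described, the classical and IRT assumptions can be written as follows. The notation is an assumption chosen for illustration and is not taken from the chapter: X is an observed score, T a true score, E an error score, θ_j the ability of examinee j, and a_i and b_i the discrimination and difficulty of item i in a two-parameter logistic model.

```latex
% Classical test theory: observed score = true score + error
X = T + E

% Two-parameter logistic IRT model: probability that examinee j
% answers item i correctly
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\left[-a_i(\theta_j - b_i)\right]}
```

In both expressions the only person-side quantity is an individual attribute (T or θ_j), and context has no term of its own. This is the feature that the situated perspective calls into question.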

To establish such ecological validity or contextual fidelity, we implemented a collaborative board game called Expertise as an assessment environment and examined to what extent the play scores from this collaborative board game parallel real-world practice in regard to teachers’ technology integration skills. The unique attributes of Expertise as a collaborative board game constrain, to some extent, the large degrees of freedom present in any social interaction involving as many as five players.

Expertise: The Game

The game play we are investigating comes from a board game we created to assess the wise technology integration skills of master teachers, a game called Expertise. This game is the result of several iterations of game design. Two theoretical frameworks, our ecological and situated cognition framework, and the TPACK framework (i.e., teacher knowledge of technology integration), guided its development.

Figure 3.1   Expertise game initial game position

The Expertise game set up is shown in Figure 3.1, including one board, three technology card decks, one theory card deck, and two pedagogy card decks. Each player also has an expert level card (tracking individual scores) in light of TPACK-L performance over the course of the game play. Up to five players can play this collaborative game that draws on team distributed knowledge and coordinated action. In every round, selection of two technology cards, one pedagogy card, and one theory card creates the environment or instructional context for player collaborative action. The game starts with one player (the speaker) sharing a curricular student learning objective with a content area for a lesson that could involve technology integration. In this first phase, the team serves as co-teachers or technology integration advisors to help the speaker construct possible technology integrations. In the second phase, each player judges a brief summary of the proposal presented by the speaker. The game runs two rounds, meaning all players rotate through the role of speaker two times and serve as advisors when not the speaker. In the first round, players use the first-round pedagogy card deck that includes more traditional instructional strategies such as direct instruction, group discussion and the like; in the second, more advanced round, they use the second-round pedagogy card deck that includes more complex and innovative contemporary teaching strategies, such as problem-based learning, gamification, anchored instruction, and the like.

Once the speaker shares the content area, all of the players together have three minutes to discuss their best collaborative solution to teach the given content by taking the given technology, pedagogy, and theory into account. After this co-construction process is done, the speaker has two minutes to state this solution (as if presenting it to a Board of Education for funding) and how the chosen technologies will be integrated with the given pedagogy and learning theory. During the speaker’s proposal, each of the other players serves as a reviewer in one of the TPACK areas. Respectively, they decide to what degree the speaker’s proposal addresses wise use of board technologies, sound pedagogy aligned with the pedagogy card, and alignment with the board card for learning theory. To structure their judging, players roll dice to indicate their role play predisposition regarding their judgment and are thus assigned a degree of harshness to apply to their judgments, ranging from “easygoing” through “critical” to “hard-ass” for each round. If the proposal successfully hits any of the components of TPACK-L, the game token placed for each of the TPACK-L components moves one step inward toward the center of the board, indicating the level of team success. Once co-players’ judgments are done, the speaker’s expert level is promoted based on the result of the judgment. If all of the components are successfully addressed (i.e., three components move in), meaning that the speaker hits TPACK-L reasoning perfectly, his/her game expert level is promoted two ranks. If the proposal meets two components of TPACK-L, his/her expert level is promoted one rank. If the proposal meets only one component or does not meet any, his/her game level stays at the current level. But if the proposal fails to meet any of the TPACK-L components, his/her game level is downgraded accordingly. This way, the player’s expertise in TPACK-L during the game play is given an individual score.

In a nod toward scoring distributed cognition, after judgment is done, the speaker also has a chance to elevate the individual score of one of the co-players who was most helpful that round. There are five expert levels: Two Blue Agent, Two Summer Agent, Master Two Summer Agent, Technology Coach, and Master Technology Coordinator. At the end of two rounds of play, the final board (team) position and each individual player (expertise) level are recorded.
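
The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the game's official scoring: the function names are hypothetical, the expert levels are assumed to be listed from lowest to highest, levels are assumed to be capped at both ends, and both the speaker's bonus and a downgrade are assumed to be worth one rank. The description leaves open whether a proposal that meets no components holds the current level or is downgraded, so that choice is exposed as a parameter.

```python
# Expert levels as named in the text; the low-to-high ordering is an assumption.
EXPERT_LEVELS = [
    "Two Blue Agent",
    "Two Summer Agent",
    "Master Two Summer Agent",
    "Technology Coach",
    "Master Technology Coordinator",
]


def rank_change(components_met, downgrade_on_zero=False):
    """Rank change for the speaker, given how many of the three judged
    components (technology, pedagogy, learning theory) the proposal met."""
    if components_met not in (0, 1, 2, 3):
        raise ValueError("components_met must be 0, 1, 2, or 3")
    if components_met == 3:
        return 2   # all three components met: promoted two ranks
    if components_met == 2:
        return 1   # two components met: promoted one rank
    if components_met == 0 and downgrade_on_zero:
        return -1  # assumption: "downgraded accordingly" means one rank
    return 0       # otherwise the level stays where it is


def update_level(level_index, change):
    """Apply a rank change, clamped to the valid range (clamping is an assumption)."""
    return max(0, min(len(EXPERT_LEVELS) - 1, level_index + change))


def score_round(speaker_level, components_met, helper_level=None,
                downgrade_on_zero=False):
    """Return (new speaker level, new helper level or None, team token steps)."""
    new_speaker = update_level(
        speaker_level, rank_change(components_met, downgrade_on_zero)
    )
    # Assumption: the speaker's bonus raises the chosen co-player by one rank.
    new_helper = None if helper_level is None else update_level(helper_level, 1)
    # One team token moves one step inward for each component met.
    team_steps = components_met
    return new_speaker, new_helper, team_steps
```

For example, `score_round(0, 3, helper_level=1)` would promote a speaker at the lowest level by two ranks, raise the chosen helper by one rank, and move three team tokens inward, which keeps the individual and team outcomes of a round side by side.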

In addition, in each round, players are asked to use a technology card deck according to their expert level. Suppose that the player expert level is a Two Blue Agent, then this player should use the Two Blue Agent technology card deck. This way, game technologies become more sophisticated for some speakers as the game proceeds.

While a full account is beyond the scope of this chapter, the description of Expertise and its design helps to establish the types of data streams that are possible as means to study complex, interactive team performance and distributed cognition in game environments. The dissertation research of our third author provided evidence of the validity of Expertise as a measure of teacher technology competence. These preliminary research results point to the value of estimating both individual player and team success parameters in the spirit of an ecological description of game play.

Psychometric Issues of Games vs. Tests as Assessments

Because games are designed with rules and lead to individual and team outcomes that can be observed directly as win, lose, or draw, they provide a means for actions and behaviors to be analyzed as ones that result from goal-driven, intentional dynamics (e.g., McKee, Rappaport, Boker, Moskowitz, & Neale, 2018). One could argue that attempting to get items correct on any familiar standardized achievement test, whether its format is multiple-choice or constructed response, is similar, and learning can occur from completing tests and retests, just as it can from playing a game and replaying it. For example, in their now classic piece entitled “The Theoretical Status of Latent Variables,” Borsboom, Mellenbergh, and van Heerden (2003) considered how Albert Einstein might go about completing a general intelligence number series problem, the Fibonacci series problem, 1, 1, 2, 3, 5, 8 …? as a sequence of steps leading up to the construction of an answer rather than just its simple recording as 13. They wrote:

Let us reconstruct the procedure. Einstein enters the testing situation, sits down, and takes a look at the test. He then perceives the item. This means that the bottom-up and top-down processes in his visual system generate a conscious perception of the task to be fulfilled; it happens to be a number series problem.

(p. 213)

Borsboom et al. continued the description, making reference to working memory and drawing on information from long-term memory (e.g., Einstein recognizes it is the Fibonacci series). While working memory and long-term memory resonate with information-processing theory accounts of problem solving (e.g., Anderson, Reder, & Simon, 1996; Vera & Simon, 1993) rather than those positioned as Gibsonian descriptions of perception-action cycles, the described sequence of Einstein’s hypothetical steps as a test-taking narrative does make for an important illustration. It shows how a simple problem, and just one test item, commonly encountered on a large-scale standardized test is an in situ experience that is based on a sequence of goal states continuously created and annihilated through on-the-fly agent-environment interactions by test takers as they (in this case, Einstein hypothetically) interact with the test environment. Borsboom et al. continued their description of Einstein’s test-taking behavior as well as that of another agent who assigns a score to the response Einstein records as his answer:

Now he applies the rule and concludes that the next number must be 13. Einstein then goes through the various motoric processes that result in the appearance of the number 13 on the piece of paper, which is coded as 1 by the person hired to do the typing. Einstein now has a 1 in his response pattern, indicating that he gave a correct response to the item.

(p. 213)

Of course, modern test theory (e.g., Embretson & Reise, 2000; Markus & Borsboom, 2013), associated with the study of the reliability and validity of scores on standardized achievement and intelligence tests, does not unravel the perception-action experience of Einstein or any other learner as Borsboom et al. described. Further, Borsboom et al. were not offering an ecological psychology view of problem solving as the alternative to cognitive, in-the-mind theoretical perspectives such as information-processing (e.g., Anderson et al., 1996) or connectionist (e.g., McClelland, 1988) accounts. Instead, the goal of Borsboom et al. was to highlight that within-individual variations over time in “achievement,” “creativity,” “decision making,” “motivation,” “problem-solving,” etc., treated as latent variables (drivers of behavior within the knower), have been under-addressed in psychometric modeling frameworks. The latent variable modeling frameworks that undergird the broad spectrum of current psychometric and statistical techniques, accepted as best-practice trade tools for establishing the reliability and validity of scores, are best understood as accounts of between-subjects variation examined at one point in time or more, and open to the study of between-subjects variation over multiple points in time.
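
The coding step in the Einstein example reduces an entire sequence of perceiving, recalling, and acting to a single digit. A minimal sketch of that dichotomous scoring, with hypothetical function names chosen only for illustration:

```python
def next_in_series(series):
    """Next term of a Fibonacci-type series: the sum of the last two terms."""
    return series[-1] + series[-2]


def score_item(response, series):
    """Dichotomous scoring: 1 for the keyed answer, 0 for anything else."""
    return 1 if response == next_in_series(series) else 0


item = [1, 1, 2, 3, 5, 8]
einstein_score = score_item(13, item)  # the recorded answer 13 is coded as 1
```

Nothing about how the answer was constructed survives in the returned value, which is the loss of within-individual information that the passage describes.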

As described previously, ecological psychologists work from a different world view focused on perception-action dynamics, and these perception-action cycles or interactions cannot be understood by anything other than the contribution of the individual agent interacting with her/his environment over time (e.g., Young, 1993; Young & Barab, 1999). Therefore, variation is of the within-subjects kind, and yet, there are other important sources of within-subjects, perhaps “within-contexts-unit” variation that must be coordinated in any study of game play. Specifically, and at minimum, these include the dynamics within one’s team and associated dynamics of the environment that result in dynamics of individual and team intentions in real spacetime.

There are similarities, but also, there are differences between game play and test taking as a means to study learning, whether learning is latent, somewhat latent, or not. As an initial step in outlining a psychometrics for games as assessments, the distinctions between game play and test taking can be compared and contrasted (see Table 3.1). Psychometric theory, both classical and contemporary, relies extensively on a test-taking perspective. Only recently is psychometric theory embracing the value of understanding game play (e.g., Kim et al., 2016; Mislevy et al., 2014) as a means to analyze learning and problem solving. Yet, the study of game dynamics is a field by itself in the discipline of mathematics. Finally, game play is much better suited as an assessment of problem solving from an ecological account of learning than standardized test taking because the environment is constrained or designed in such a way to permit analysis of player-board as well as distributed team interactivity as it unfolds in spacetime (e.g., Stephen, Boncoddo, Magnuson, & Dixon, 2009). Such game play spacetimes can be conceived of as goal states of each play against the board individually nested within team goal states as they emerge and are then eliminated when achieved.

Table 3.1   Characteristics of Game Play and Test Taking

| Characteristic | Game Play | Test Taking |
|---|---|---|
| Primary attribute | Performance (individual and team) |  |
| Psychometric status |  |  |
| Unit of analysis | Perception-action coupling and cycles | Individual, classroom, school, district |
| Stimuli of assessment situations | Board features, opponents, team members (interactions) | Directions, items, options |
| Responses to assessment stimuli |  | Item selection or spoken/written response construction |
| Feedback to responses |  |  |
| Goal and intentions based on feedback | Level up, improve strategy, try to win | More correct responses |
| Degrees of freedom | Governed by rules of the game | Governed by directions of the test and sampling space of items |
| Score assignment | Binary (i.e., dichotomous) | Binary (i.e., dichotomous), polytomous |
| Dimensionality of scores | Locally dependent, emergent | Locally independent, unidimensional/multidimensional |
| Analysis of scores | Differential, nonlinear | Discrete, linear |
| Evaluation of the scores | “Making progress,” “we can improve” | “Your score is at the 50th percentile or 75th percentile, etc., which means that …” |
| Consequences given, evaluation of scores | Rewards for progress | Penalties for lack of improvement |

In the next section, we compare and contrast characteristics of game play and test taking. Both activities require agents to assess their progress in formal and informal learning settings. Our analysis of game play applies the ecological worldview of situational ecological dynamics and could be described as “behavioral, yet intentional” rather than “latent” given definitions and perspectives described by Markus and Borsboom in their Frontiers of Test Validity Theory (2013). Further, we are encouraged by contemporary developments in psychometric and statistical analysis on specification and test of nonlinear dynamical systems models (e.g., Helm, Ram, Cole, & Chow, 2016; McKee et al., 2018; Molenaar, 2014) at an individual time series level of analysis as well as co-integrated (e.g., Boker & Martin, 2018) with another player’s trajectory of intentional dynamics. The individual and co-integrated time series may be the most fruitful avenue of research for the scientific study of distributed cognition within game play. However, while the avenue appears a promising and worthwhile one to take, challenges are to be noted and may be difficult to overcome both now and for some time to come. These challenges relate to theory (Borsboom et al., 2003); game design as assessment design (Kim et al., 2016); data capture (e.g., Lindell, House, Gestring, & Wu, 2018); and mathematical/statistical modeling (e.g., Greenberg, 2014) as well as difficulties that arise given standard, accepted educational best practices in the use of test scores to diagnose and promote learning. Such accepted practices include the assumptions that the best explanation of achievement/performance is the simplest one (e.g., Templin & Bradshaw, 2014) and that the test score is “my score,” not “our score” (von Davier & Halpin, 2013).

Table 3.1 outlines characteristics of game play and test taking that were selected based on the way psychometricians and statisticians approach data analysis to understand human learning (e.g., Markus & Borsboom, 2013). First, consider the primary attribute of game play versus test taking, which is an assessable variable, an entity that belongs to the learner and for which he/she provides data for measurement, assessment, and evaluation of assessment. For game play, we identify the primary attribute as “performance,” which occurs in real spacetime and may or may not be influenced by prior experience. For example, consider the student who is playing a game like Expertise for the first time. At minimum, rules need to be learned. However, no one would penalize the player for her/his willingness to do better the next time. Comparatively, the primary attribute for taking a test is demonstrating “achievement” or perhaps aptitude or intelligence, which is after the fact and the result of rather than a feature of learning, “shaped by curriculum standards,” and prone to evaluation at a level beyond any individual student (e.g., the teacher taught well, the school district operated well).

Next, we consider the psychometric status of the primary assessment attribute. The psychometric status pertains to how the reliability and validity of scores are to be interpreted as behavioral/manifest or formative versus latent and reflective. For game play, performance is behavioral as goals must be realized in action (e.g., Kulikowich & Young, 2001; Young, 1993). However, an important distinction is to be made between behavior that is reactive and responsive to the environment, as in premises laid out by Skinner (1988), and behavior that is driven by the intentions of the learner. Gibsonian dynamics are intentional dynamics, and therefore, learners are often referred to as “agents,” referencing their agency to shape experience given their continuous interaction with the environment. While some tests adapt to user responses, a game is much more dynamic with game moves and strategies taken as a given part of the context. In contrast, achievement in test taking is primarily treated as latent. “Reflective” is a psychometric term (e.g., Markus & Borsboom, 2013) and it means that any item or task response that is assigned a score depends upon or is regressed upon an underlying long-term stable latent trait that is not directly observed. As such, the latent trait, abstracted from the world and stored in the head of the users, is the cause of student responses. Treatment of attributes as latent traits is arguably the most popular psychometric approach to the study of scores as reliable and valid and is evidenced in the variety of confirmatory factor analysis (CFA) and item response theory (IRT) models that psychometricians specify and test. It is also important to note that these approaches are successful between-subjects covariation techniques where scores are compared and positioned relative to one another, most often in reference to location in a normal distribution. None of these assumptions is taken as given in an ecological psychology description of game play.
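
The “reflective” reading described above is usually written as a common factor model, in which each scored response is regressed on the latent trait. The symbols below are assumptions chosen for illustration rather than notation from the chapter: x_ij is the scored response of person j to item i, ξ_j the latent trait, ν_i and λ_i the item intercept and loading, and ε_ij a residual.

```latex
% Reflective (common factor) measurement model:
% the latent trait \xi_j is treated as the cause of each observed response
x_{ij} = \nu_i + \lambda_i \xi_j + \varepsilon_{ij}
```

The direction of dependence runs from the unobserved trait to the observed response, and the loadings are estimated from covariation across persons, which is why the approach is a between-subjects technique.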

From the discussion of the psychometric status of the attributes of game play and test taking, respectively, it is hopefully becoming clear that the unit of analysis for game play must be a continuous cycle of perception-action couplings in a dynamic agent-environment interaction. These perception-action couplings are manifest given the effectivities of the learner with the affordances available in the game (learning) environment, and neither student nor environment remain the same. The process unfolds and presents a response stream that can be modeled statistically as situated experience. Therefore, there is no abstraction from the world or need for storage of ideas. Comparatively, the unit of analysis for test taking is the static student whose achievement is nested in the static contexts of higher-order or multilevel units of analysis—classrooms, schools, school districts, provinces/states, countries (e.g., Raudenbush & Bryk, 2002). Indeed, covariates (e.g., classroom enrollment size, school socio-economic status) can be entered into the hierarchical equations at any level signifying contextual elements that explain sources of variation in students’ scores. Further, cross-level interactions (e.g., students’ prior achievement scores crossed with school socio-economic status) can be evaluated for significance and effect size. However, these covariates or covariate interactions are not affordances students detect and with which they interact during learning. Further, for both students and for higher levels of analysis such as classrooms, variables are most often treated as trait-like and stable, not state-like and continually changing (e.g., Molenaar, 2004).
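
The multilevel structure described above, with a student-level predictor, a school-level covariate, and a cross-level interaction, can be written as a two-level model. The notation is a sketch and an assumption for illustration: Y_ij is the score of student i in school j, X_ij the student’s prior achievement, and W_j the school’s socio-economic status, following the example in the text.

```latex
% Level 1 (students within schools)
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}

% Level 2 (schools)
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
```

Here γ_11 carries the cross-level interaction between prior achievement and school socio-economic status. Every term is a fixed property of a student or a school, with no term representing what a student detects and acts on during learning.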

The next five characteristics in Table 3.1 address the data sources that researchers can analyze for both game play and test taking. These include (1) stimuli of assessment situations; (2) responses to assessment stimuli; (3) feedback given responses; (4) goals or intentions based on feedback provided; and (5) degrees of freedom for any agent to alter the course of game play or test taking based on any feedback provided. We unpack each of these briefly next.

The stimuli of assessment situations are arguably infinite and would include any sensory information that can be detected (e.g., light, sound, touch). However, as an assessment activity, these stimuli are select features that can be incorporated into the quantitative models to indicate students’ progress. As such, they are part of assessment design. For game play, the stimuli would include board features (see Figure 3.1) as well as any additional metagame resources (e.g., notes, internet support sites) that allow the game activity to unfold on a single instance or across instances. Team members and opponents would also be part of the stimulus field. By comparison, while students can complete test-type exercises collaboratively (e.g., Borge & White, 2016), most often, and in the case of standardized achievement tests, they do so alone with limited stimuli that include directions, items, options, calculators, and note-pads. All stimuli can be coded as contributing to classical item characteristics such as level of difficulty, degree to which the item distinguishes between low and high scores (i.e., item discrimination), and guessing. For example, Gorin and Embretson (2006) have used this strategy to understand more about students’ answers on paragraph comprehension items beyond basic estimates of a one- or two-parameter logistic IRT model. Often called multicomponent latent trait models (Embretson, 1985), these IRT models are informed by cognitive theory and are applied best when tenets of the theory are incorporated into the assessment design. Gorin and Embretson (2006) described construction of a spatial analogical reasoning task, like Raven’s Progressive Matrices, and showed how addition, deletion, rotation, and transformation of item stimuli correlate with item difficulty and estimated ability or proficiency of examinees.

Still, this approach to assessment, data capture, and psychometric modeling is more trait-based, stable, and between-subjects than what can be modeled for game play, and this leads to the next characteristic of Table 3.1, and potentially, one of the most important: the manifest response indicators (Markus & Borsboom, 2013) that become the data streams by which psychometricians and statisticians test their models.

The primary responses for game play are the moves or the turns that each player takes and contributions they make to team planning and distributed cognition. Most games require several moves or turns in order to arrive at the final outcome: win, lose, or draw. Consider three of the most classic of all games: checkers, chess, and tic-tac-toe. Like Expertise, these are board games, but simpler in design. All have multiple stimuli. All require multiple moves, often timed moves as in the case of elite-performance chess, and all progress in such a way that “on-the-board” stimuli (e.g., game locations, pieces) decrease in time. For test taking, there are some similarities; however, there are also key differences. Test items do afford a sequence of responses as either selected on multiple-choice tests or constructed for short answer or essay questions. Further, any situation of test taking can be timed as in the case of the Scholastic Assessment Test (SAT) or Graduate Record Examination (GRE). However, the student who is likely attempting to provide her/his best performance, response after response, whether it is selected or constructed, is unaware of what the test developers’ intentions and goal dynamics are as each item is encountered. This lack of information also applies for computer-administered tests as in the case of computer adaptive testing (CAT) where items are tailored to examinees as they record or submit each response in sequence (e.g., Lord, 1968; Van der Linden & Glas, 2000). As such, there is little feedback provided to the examinee, certainly no knowledge of results, and not even guidance in a form such as, “Here is an easier item. Try it.” or, “Here is a more difficult item. I think you can do it.” Instead, the test taking endeavor is to maintain the calibration of items as positioned given a between-subjects evaluation as below average, average, or above average within most likely a normal distribution.

Further, environments for test taking often require a “standard setting,” which means that the conditions (e.g., proctor, seating arrangements, time limits) for completing the assessment remain as constant as possible for all examinees.
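
To make the tailoring in computer adaptive testing concrete, the following is a minimal sketch of one common selection rule: choose the unadministered item with the greatest Fisher information at the current ability estimate under a two-parameter logistic model. The rule, the function names, and the item parameters are assumptions for illustration and are not drawn from the sources cited above.

```python
import math


def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, items, administered):
    """Index of the not-yet-administered item that is most informative at theta."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


# Hypothetical item bank of (discrimination a, difficulty b) pairs.
bank = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.5)]
first = next_item(0.2, bank, administered=set())  # the item with difficulty nearest 0.2
```

The selection happens entirely on the test developer’s side. The examinee sees only the next item, with no signal about why it was chosen, which is the absence of feedback that the passage contrasts with game play.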

Next in Table 3.1 the characteristic feedback to responses highlights that feedback for each move or turn taken in game play is immediate, and that immediate feedback resets the dynamics of intentions (a single move) to intentional dynamics (e.g., how has my individual strategy or my team’s goals changed? (Young & Barab, 1999)) unless play is disrupted or interrupted as with the end of the game or game delays as in the case of outdoor games (e.g., weather delays). By comparison, feedback to responses that students provide on tests can be significantly delayed. While contemporary standardized assessments, like the SAT and GRE, provide computerized score reports upon completion of the tests, the examinees are not provided any feedback while taking the test. In other test taking situations such as classroom assessments, students might wait days or weeks before they have a sense of their progress. Finally, as in tests administered for research purposes, participants are unlikely to receive any feedback about how well they performed unless they were to inquire about their results at the conclusion of the study.

These points made about the immediacy or the delay of feedback relate directly to the next characteristic in Table 3.1. While Gibsonian accounts of perception-action are as relevant for game play as they are for test taking, as we have written previously, game play affords more opportunity for discussion and adoption of goals related directly to learning to improve performance as part of the activity (e.g., preparing for the challenges at the next level; altering strategies to improve; identifying opportunities for distributed cognition among one’s teammates) than does test taking, which in some ways can only mean attempting to get as many or more correct answers on a test, even if guessing. The overt learning goals of game play highlight how players may be operating at multiple levels of an ontological hierarchy of goals, and within particular instances of games, may not appear to be showing optimal performance, as in the case of testing out strategies or stretching the rules of the game (e.g., playing “what will happen if I try this?”).

These goal reformulation properties are related to what can be called the degrees of freedom for each of the two measurement situations. In statistics, degrees of freedom are the number of values that are free to vary. While the term primarily pertains to samples selected from populations in the estimation of parameters, the idea is relevant for our current discussion. Degrees of freedom relate to the number of parameters that constrain one’s perceiving and acting. In effect, they relate to constraining the opportunities for action. In some situations, such as free play on a school yard, as with most human situations, there are nearly infinite degrees of freedom (Mitchell, 2009). In situations like the card game War, there are fewer degrees of freedom and performance can seem repetitive, leaving little behavior to assess. In most games, the rules are set at the start of play, but in some games (like Fluxx) the game rules themselves emerge from play or are initially hidden from all but a few players (as in Betrayal at House on the Hill). In these cases, the degrees of freedom are changeable and must be assessed continuously throughout play. Whether emergent or not, the rules of the game establish the degrees of freedom for game play. The patterns of turn-taking, rolling dice, drawing cards, and discussing next moves with teammates are part of the rules of the game.

Note the use of gerunds in the preceding sentence. These nouns derived from verbs illustrate not only the value of action; each gerund also defines constraints under which game play can unfold. For test taking, it is no different. Test taking involves following directions, reading stems, selecting among options A to D, and constructing short answer responses or short essays as in the case of the National Assessment of Educational Progress (NAEP). Consider two hypothetical NAEP-like items, one multiple choice and one short constructed response.

Multiple-choice item: Which of the following psychologists focused his work on the importance of visual perception?

  • Gibson
  • Piaget
  • Skinner
  • Vygotsky
Short constructed response item: For Skinner, stimulus-response connections defined schedules of reinforcement. In the space provided below, define “stimulus” and define “response” according to Skinner.

There are several important points to consider about such items. First, the actions of examinees are arguably more passive or reactive than those for game play. Second, except for the responses that examinees can construct, the constraints are so limiting (e.g., finite option set, restricted space allocation for writing a constructed response) that there is limited opportunity for students to interact with the environment in such ways where their responses now provide affordances for further action by an evaluator, team member, teacher, etc. So, in essence, the flow of activity has stopped with the response for that item stimulus field. Examinees must then re‑situate to move onto the next stimulus field (i.e., the next item on the test) that may present content that is significantly different from the item just completed.

We acknowledge that many computer-based assessments are now designed to relax some constraints (e.g., Jang et al., 2017; Siddiq, Gochyyev, & Wilson, 2017). For example, Quellmalz and colleagues (2013) studied scores of three different “next-generation” assessment environments designed in accordance with school day standards such as those of College Board Standards for Science Success (College Board, 2009) and the National Research Council’s Framework for K-12 Science Education (National Research Council [NRC], 2012). Comparing designs referred to as “static,” “active,” and “interactive,” Quellmalz and her research team demonstrated psychometrically that as environments became less constrained (i.e., interactive), the dimensionality of the responses (i.e., how many factors or latent traits predict performance) increased.

Static modality item designs looked very much like the two hypothetical examples we presented previously. As one example, students looked at four different ecosystem food chain diagrams as a stem that depicted relationships among animals (i.e., bear, caribou, and hare) and plants (i.e., grass and lichen). Then students read a description of a food web diagram. They had to select the correct option that indicated a match of description to diagram.

In the active and interactive modalities, students’ opportunities to use menu tools to construct food web diagrams increased. For example, students could view simulated animations to watch predator-prey ecosystem dynamics. With the ability to control viewing and re-viewing of the videos, the active modality afforded examinees more control over managing information before they would either select a response among multiple-choice options or construct a response using arrows to diagram food web flow. Finally, the interactive modality permitted the highest degree of activity (and degrees of freedom) and authentic engagement for students. In this modality, learners could conduct scientific inquiry using data streams much as expert scientists do before they would select or construct an item response. Much as in game play, the interactive modality updated information flow based on each input or “move” made by the student.

Tracking the data via computer logs (e.g., Tsai, 2018) allows for process tracing (Lindell et al., 2018) that is an entire specialization in design of technology-rich learning environments (Jang et al., 2017) used for the study of dynamic decision making. Large amounts of data (big data) can be harvested and used for information systems research (Barki, Titah, & Boffo, 2007). This set of topics also leads to the next characteristic in Table 3.1, score assignment. For highly interactive modalities such as game play, any move, such as a keystroke, menu review, roll of dice, or selection of a card from a deck can be coded dichotomously as present or absent (i.e., binary 1 or 0) as well as time stamped for duration and even position (space coordinates). By comparison, test taking, described as a static assessment modality, can lead to score assignment that is either dichotomous (e.g., disagree/agree, incorrect/correct) or polytomous (e.g., Likert scale, partial credit). In fact, IRT models vary given the type of assessment (e.g., achievement measure, affective scale) as well as the score assignment (i.e., dichotomous or polytomous). Masters (1982) proposed a partial credit Rasch model to scale distractors on multiple-choice items given degrees of correctness. Similarly, Andrich (1978) presented a Rasch rating scale model to evaluate the contributions of Likert-scale categories (e.g., strongly disagree to strongly agree) in score reliability and validity. Numerous other IRT models have been introduced in psychometrics literature for polytomous score assignment that might prove useful for game play assessments (Embretson & Reise, 2000).
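
The dichotomous coding and time stamping of moves described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the logging scheme of any cited system: the `MoveEvent` record, its field names, the action labels, and the three-event log are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class MoveEvent:
    """One logged game action (all field names are hypothetical)."""
    player: str
    action: str                                  # e.g. "roll_dice", "draw_card"
    start_s: float                               # time stamp at the start of the action (seconds)
    end_s: float                                 # time stamp at the end of the action (seconds)
    position: Optional[Tuple[int, int]] = None   # board coordinates, if applicable

    @property
    def duration_s(self) -> float:
        """Duration of the action in seconds."""
        return self.end_s - self.start_s


def code_presence(events: List[MoveEvent], player: str, actions: List[str]) -> Dict[str, int]:
    """Dichotomous coding: 1 if the player performed the action at least once, else 0."""
    seen = {e.action for e in events if e.player == player}
    return {a: int(a in seen) for a in actions}


# A made-up three-event log, for illustration only.
log = [
    MoveEvent("P1", "roll_dice", 0.0, 2.0),
    MoveEvent("P1", "draw_card", 2.0, 6.5, position=(3, 1)),
    MoveEvent("P2", "roll_dice", 7.0, 8.5),
]

print(code_presence(log, "P1", ["roll_dice", "draw_card", "menu_review"]))
print([e.duration_s for e in log if e.player == "P1"])
```

Presence coding of this kind discards order and timing, which is why the durations and positions are kept on each record. A polytomous scheme would replace the 0/1 value with an ordered category.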

What the vast majority of IRT models share in common leads to the next topic of Table 3.1, dimensionality of scores, which pertains to a topic in statistics called local independence. For test taking, the only property connecting the responses as a sequence is the one or more latent traits that predict the responses as manifest item indicators (Markus & Borsboom, 2013). Various models of dimensionality exist including the classical one-, two-, and three-parameter dichotomous IRT models (Hambleton, Swaminathan, & Rogers, 1991) and their extensions to polytomous singular and multiple dimension IRT models (Reckase, 2009), to bifactor models (DeMars, 2013), and even to multilevel IRT models (e.g., Fox, 2004; Wilson, Gochyyev, & Scalise, 2017). In these more advanced models, estimations of item parameters are studied given variance contributed by clusters (e.g., dyads, groups, classrooms, schools, districts) as well as individuals, which potentially could capture both player and distributed team cognition. Maximum likelihood estimation facilitates the estimation of parameters and study of model-data fit so that error or residual variance is minimized given score patterns. Therefore, when model-data fit is supported, then scores can be summed as a total composite if unidimensional and as discrete subscales if multidimensional (e.g., Dumas & Alexander, 2016; Schilling, 2007). This can facilitate statistical analysis using general linear model (GLM) procedures and their extensions (e.g., HLMs, HGLMs) to address classical research questions such as, “Are there significant differences in instructional treatment conditions on reading comprehension after controlling for prior knowledge?” (e.g., McNamara & Kendeou, 2017), or “Do offline and online reading skills predict critical evaluation of Internet sources?” (e.g., Forzani, 2018). However, these group-based and aggregate modeling techniques (Boker & Martin, 2018) impede, and potentially prohibit, data-analytic science of learning and problem solving as it occurs most naturally for any one learner attempting to coordinate activity with any other learner, as in the instance of the distributed cognition of collaborative game play.
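
To make the local independence assumption discussed above concrete, the following Python sketch computes the standard three-parameter logistic item response function and the likelihood of a response pattern as a simple product over items. It is a minimal illustration, not an estimation routine: the function names and the item parameter values are assumptions chosen for the example, and the optional scaling constant sometimes written as D = 1.7 is omitted.

```python
import math
from typing import Sequence, Tuple

Item = Tuple[float, float, float]  # (a: discrimination, b: difficulty, c: guessing)


def p_correct(theta: float, a: float = 1.0, b: float = 0.0, c: float = 0.0) -> float:
    """Three-parameter logistic model: c + (1 - c) / (1 + exp(-a * (theta - b))).

    With c = 0 this reduces to the two-parameter model, and with a fixed
    across items as well it has the one-parameter form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def pattern_likelihood(theta: float, responses: Sequence[int], items: Sequence[Item]) -> float:
    """Likelihood of a 0/1 response pattern under local independence.

    Given theta, item responses are treated as independent, so the joint
    likelihood is the product of the per-item probabilities.
    """
    likelihood = 1.0
    for u, (a, b, c) in zip(responses, items):
        p = p_correct(theta, a, b, c)
        likelihood *= p if u == 1 else (1.0 - p)
    return likelihood


# Hypothetical item parameters, for illustration only.
items = [(1.0, -0.5, 0.0), (1.2, 0.0, 0.2), (0.8, 1.0, 0.25)]
print(pattern_likelihood(0.0, [1, 1, 0], items))
```

The product in `pattern_likelihood` is exactly what game play violates: when one move changes the board for the next, the per-move probabilities can no longer be multiplied as if each response depended on the latent trait alone.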

Game play, in contrast to test taking, cannot be anything other than emergent and nonlinear. Local dependence (see Table 3.1) is inevitable, and dimensionality, singular or multiple, unfolds within multiple nested spacetimes (e.g., McKee et al., 2018). As we introduced previously, there is no presumption of a latent abstraction underlying individual and distributed cognition. Summed scores for composite variables such as those of the many verbal (e.g., vocabulary) and performance (e.g., working memory) subscales of intelligence tests are not the measured attributes of interest. Instead, researchers use terminology of “complexity” (e.g., Hilpert & Marchand, 2018) and “dynamical systems” (e.g., Kringelbach, McIntosh, Ritter, Jirsa, & Deco, 2015) and dynamic concepts including “adaptation” (e.g., Mitchell, 2009), “diffusion” (e.g., Dixon et al., 2012), “emergence” (e.g., Hilpert & Marchand, 2018), “entropy” (e.g., Stephen et al., 2009), and “self-organization” (e.g., Greenberg, 2014) to describe and model phenomena scientifically. These are the dynamics that underlie an ecological psychology description of board game play, and that we propose to capture in Expertise play as an assessment.

Analysis of assessment data relies extensively on a mathematics that is suited to such terminology for either game play or test taking. Historically, complex and dynamical systems rely on difference or differential equations (e.g., Dixon et al., 2012; Stephen et al., 2009). However, there are numerous challenges when undertaking mathematical and statistical modeling of such kind. As a primary challenge, only the simplest of models are tractable or have approximate solutions (Ocone, Millar, & Sanguinetti, 2013). Relatedly, modeling with differential equations makes assumptions about initial conditions. While it is nearly impossible for researchers to specify initial conditions of any complex social system such as learning and problem solving as it takes place in classrooms, the initial board and player conditions of games may be more tractable. A final limitation of use of such models is that their properties are not easy to understand mathematically or as results that can inform best practices or classroom decision making. As Hilpert and Marchand (2018) discussed in their review of complex systems for educational psychology, dominant component linear models of studying test scores (i.e., test taking) such as regression and their extensions to path and structural equation models (SEMs) will prevail, at least for the near future. These linear (static) models are entrenched in classical texts and manuscripts adopted by the field (e.g., Bollen & Long, 1992; Byrne, 2001). In their final remarks, the authors write:

Integrating CS (complex systems) research into educational psychology may require more flexible thinking about research methods, particularly with regard to significance testing, commensurate forms of data, and generally what counts as sound evidence within empirical research. (p. 15)

However, such progress also reflects the final two characteristics we present in Table 3.1. For game play, considering emergent dynamics allows for evaluations of progress and improvement and rewards for improvement as students’ goals become more numerous, their challenges increase, and distributed team cognition becomes more effective. This evaluation and rewards perspective often differs from typical course grades and high-stakes test scores that focus on percentile ranks, school comparisons, and possible penalties for lack of improvement. Substantial challenges for applying these complex dynamical systems analyses remain, including:
  1. How to move away from the general linear model and take nonlinear dynamics as fundamental
  2. How to characterize the multiple levels of intentions (for play) that may be factors in individual player game acts, and the shared coordinated action of teams
  3. How to account for effects of context (playing a game in a classroom may impose different constraints than playing it at home, for homework, with friends, with strangers, etc.)
  4. How to assess individual play and team play on multiple simultaneous spacetimes defined by various conflicting and coordinated goals
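
The first challenge, together with the earlier point that dynamical models make assumptions about initial conditions, can be illustrated with a textbook nonlinear difference equation. The logistic map used below is a stand-in chosen for brevity; it is not a model proposed in this chapter or by the authors cited, and the parameter values and function names are assumptions made for the sketch.

```python
from typing import List


def logistic_trajectory(x0: float, r: float, steps: int) -> List[float]:
    """Iterate the difference equation x[t+1] = r * x[t] * (1 - x[t])."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def max_gap(x0: float, delta: float, r: float, steps: int = 50) -> float:
    """Largest distance between two trajectories whose starting points differ by delta."""
    a = logistic_trajectory(x0, r, steps)
    b = logistic_trajectory(x0 + delta, r, steps)
    return max(abs(p - q) for p, q in zip(a, b))


# Same tiny difference in initial conditions, two different regimes.
print(max_gap(0.2, 1e-6, r=2.5))  # settles to a fixed point; the gap stays tiny
print(max_gap(0.2, 1e-6, r=4.0))  # chaotic regime; the gap grows to a large fraction of the range
```

In the second regime a measurement error of one part in a million in the starting state is enough to make the two trajectories unrelated within a few dozen steps. That is the practical sense in which a well-specified starting board and known player roles may make games more tractable than classrooms.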

Potential of Games to Assess Individual and Distributed Team Cognition

Games have long been attractive as learning environments given that games provide goals (i.e., objectives), rules for constrained interactions, relevant/immediate feedback, and content mastery, and, more importantly, create playful spaces where players are welcome to explore the context (i.e., engaged participation), share, and build the experience with others (i.e., co-construction and distributed team cognition) (Gee, 2003; Shaffer, 2006; Squire, 2006; Young, 2004). In a 2012 meta-review of the value of video games for classroom learning, Young et al. (2012) contended that individual game play is best described as situated cognition, while collaborative team play can be described as emergent effectivities of a collective of agents that comes together to act in synchrony on shared goals. Several additional studies have pointed to the affordances of games as useful assessment tools (e.g., Chin, Dukes, & Gamson, 2009; Loh, 2012; Shute & Ke, 2012). Games may create good assessment contexts that capture the interactive/situative nature of group and individual cognition and action (Schwartz & Arena, 2013; Shaffer & Gee, 2012; Steinkuehler & Squire, 2014; Young et al., 2012). For example, Shaffer and Gee (2012) argued that games are good assessments in that every action/decision that players make in a gaming context draws on players’ abilities in the moment on many dimensions, and collaborative play hinges on the situated and embodied nature of team cognition and action (e.g., thinking in situ in the gaming context, strategizing while acting in the context). Steinkuehler and Squire (2014) also asserted that a game itself is a good assessment tool in that games enable us to track and monitor players’ performance, provide just-in-time feedback on any performance, and offer rich data on problem solving in situ. In addition, some games enable us to make observations in authentic contexts where we create complex, realistic scenarios required to evaluate players’ situative actions and cognition (DiCerbo, 2014).

The dynamics of collaborative board game play seem to parallel the dynamics of complex educational settings like public schools, and we have preliminary data showing positive correlations between game scores, specifically on Expertise, and other measures of master teachers’ performance in the realm of technology integration at the graduate school level. One interesting aspect of board game play as a form of distributed team cognition is that there is a defined pattern of turn-taking, so each individual has roughly equal opportunities to contribute to the overall team success within the constraints of their various roles. This again nicely constrains some of the degrees of freedom in group collaborative interactions, perhaps enabling complex systems modeling.

Potential of Games to Assess Collaborative Distributed Team Cognition

It seems clear to us that current assessments that rely on linear assessment models to evaluate collaborative cognition in game play bring with them limiting factors that strain our theory of human cognition. We recommend the adoption of a situated approach that may rely more heavily on big data computer processing and analysis of deep learning networks. There is no guarantee that these approaches will necessarily be better able to characterize the emergent interactions of individual and team cognition as described by ecological psychology and situated learning. But we have to try. The playful learning that occurs in structured board game play may prove a sufficiently bounded context in which to explore the application of these approaches while keeping the degrees of freedom manageable.

We are encouraged by recent developments in psychometrics where attention has been given to modeling complex dynamics (e.g., von Davier, 2017) that could be specifically applied to distributed team cognition and collaborative game play. In a 2017 special issue of the Journal of Educational Measurement with guest editor Alina von Davier, initial presentations of network models, time series incorporated into IRT modeling, and multilevel multidimensional modeling of individual student variation coupled with that of their peers (team player distributed cognition) provided evidence of promising developments that bring together learning theory, game design, big data capture, and psychometric analysis. In our future work, we look to contribute to this dialogue with analysis of the many unfolding data streams available through study of board games like Expertise. We also acknowledge there is much work yet to do to fully describe the dynamics that unfold on multi-fractal levels when individuals interact with game mechanics as individuals and collaborative teams. Fortunately, the tools and supporting theoretical frameworks appear to be emerging to support such research efforts. Our suggestion is that this work be framed within the ecological psychology world view and investigate individual and team cognition as “situated” and emergent in a dynamic agent-environment interaction that is organized by individual and shared goals.


Anderson, J. R. , Reder, L. M. , & Simon, H. A. (1996). Situated learning and education. Educational Researcher, 25(4), 5–11.
Andrews, J. J. , Kerr, D. , Mislevy, R. J. , von Davier, A. , Hao, J. , & Liu, L. (2017). Modeling collaborative interaction patterns in a simulation-based task. Journal of Educational Measurement, 54(1), 54–69.
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561–573.
Arachchilage, N. G. A. , & Love, S. (2013). A game design framework for avoiding phishing attacks. Computers in Human Behavior, 29(3), 706–714.
Barki, H. , Titah, R. , & Boffo, C. (2007). Information system use-related activity: An expanded behavioral conceptualization of individual-level information system use. Information Systems Research, 18(2), 173–192.
Bateson, G. (2000/1972). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. Chicago, IL: University of Chicago Press.
Bediou, B. , Adams, D. M. , Mayer, R. E. , Tipton, E. , Green, C. S. , & Bavelier, D. (2018). Meta-analysis of action video game impact on perceptual, attentional, and cognitive skills. Psychological Bulletin, 144(1), 77–110.
Bedwell, W. L. , Pavlas, D. , Heyne, K. , Lazzara, E. H. , & Salas, E. (2012). Toward a taxonomy linking game attributes to learning: An empirical study. Simulation & Gaming, 43, 729–760.
Bodrova, E. , Germeroth, C. , & Leong, D. J. (2013). Play and self-regulation: Lessons from Vygotsky. American Journal of Play, 6(1), 111–123.
Boker, S. M. , & Martin, M. (2018). A conversation between theory, methods, and data. Multivariate Behavioral Research, 53(6), 806–819.
Bollen, K. A. , & Long, J. S. (1992). Tests for structural equation models: Introduction. Sociological Methods & Research, 21(2), 123–131.
Borge, M. , & White, B. (2016). Toward the development of socio-metacognitive expertise: An approach to developing collaborative competence. Cognition and Instruction, 34(4), 323–360.
Borsboom, D. , Mellenbergh, G. J. , & van Heerden, J. (2003). The theoretical status of latent variables. Psychological Review, 110(2), 203–219.
Brown, J. S. , Collins, A. , & Duguid, P. (1989). Situated cognition and the culture of learning. Educational Researcher, 18(1), 32–42.
Byrne, B. M. (2001). Structural equation modeling with AMOS, EQS, and LISREL: Comparative approaches to testing for the factorial validity of a measuring instrument. International Journal of Testing, 1(1), 55–86.
Caftori, N. (1994). Educational effectiveness of computer software. T.H.E. Journal, 22(1), 62–65. Retrieved from
Chin, J. , Dukes, R. , & Gamson, W. (2009). Assessment in simulation and gaming: A review of the last 40 years. Simulation & Gaming, 40(4), 553–568.
Clark, D. B. , Tanner-Smith, E. E. , & Killingsworth, S. S. (2015). Digital games, design, and learning: A systematic review and meta-analysis. Review of Educational Research, 86(1).
College Board. (2009). Science: College Board standards for college success. Retrieved from
Cooke, N. J. , Salas, E. , Kiekel, P. A. , & Bell, B. (2004). Advances in measuring team cognition. In E. Salas & S. M. Fiore (Eds.), Team cognition: Understanding the factors that drive process and performance (pp. 83–106). Washington, DC: American Psychological Association.
Crichton, M. T. (2009). Improving team effectiveness using tactical decision games. Safety Science, 47(3), 330–336.
DeMars, C. E. (2013). A tutorial on interpreting bifactor model scores. International Journal of Testing, 13(4), 354–378.
Dewey, J. (1938). Experience & education. New York: Kappa Delta Pi.
DiCerbo, K. E. (2014). Game-based assessment of persistence. Educational Technology & Society, 17(1), 17–28.
Dixon, J. A. , Holden, J. G. , Mirman, D. , & Stephen, D. G. (2012). Multifractal dynamics in the emergence of cognitive structure. Topics in Cognitive Science, 4, 51–62.
Dumas, D. , & Alexander, P. A. (2016). Calibration of the test of relational reasoning. Psychological Assessment, 28(10), 1303.
Embretson, S. E. (1985). Multicomponent latent trait models for test design. Test Design: Developments in Psychology and Psychometrics, 195–218.
Embretson, S. E. , & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates, Inc., Publishers.
Forzani, E. (2018). How well can students evaluate online science information? Contributions of prior knowledge, gender, socioeconomic status, and offline reading ability. Reading Research Quarterly, 53(4), 385–390.
Fox, J. P. (2004). Applications of multilevel IRT modeling. School Effectiveness and School Improvement, 15(3–4), 261–280.
Gee, J. P. (2003). What video games have to teach us about learning and literacy. Computers in Entertainment (CIE), 1(1), 20–20.
Gee, J. P. , & Hayes, E. (2012). Nurturing affinity spaces and game-based learning. In C. Steinkuehler , K. Squire , & S. Barab (Eds.), Games, learning, and society: Learning and meaning in the digital age (pp. 129–153). New York: Cambridge University Press.
Gibson, E. J. (2000). Perceptual learning in development: Some basic concepts. Ecological Psychology, 12(4), 295–302.
Gibson, J. J. (1986). The ecological approach to visual perception. Hillsdale, NJ: Erlbaum.
Gorin, J. S. , & Embretson, S. E. (2006). Item difficulty modeling of paragraph comprehension items. Applied Psychological Measurement, 30(5), 394–411.
Greenberg, G. (2014). How new ideas in physics and biology influence developmental science. Research in Human Development, 11(1), 5–21.
Hadley, K. R. (2014). Teaching teamwork skills through alignment of features within a commercial board game. International Journal of Engineering Education, 6(A), 1376–1394.
Hambleton, R. K. , Swaminathan, H. , & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
Helm, J. L. , Ram, N. , Cole, P. M. , & Chow, S. M. (2016). Modeling self-regulation as a process using a multiple time-scale multiphase latent basis growth model. Structural Equation Modeling: A Multidisciplinary Journal, 23(5), 635–648.
Hilpert, J. C. , & Marchand, G. C. (2018). Complex systems research in educational psychology: Aligning theory and method. Educational Psychologist, 53(3), 185–202.
Hirsch, P. L. , & McKenna, A. F. (2008). Using reflection to promote teamwork understanding in engineering design education. International Journal of Engineering Education, 24(2), 377–385.
Hodges, B. H. (2007). Good prospects: Ecological and social perspectives on conforming, creating, and caring in conversation. Language Sciences, 29, 584–604.
Hodges, B. H. (2009). Ecological pragmatics: Values, dialogical arrays, complexity and caring. Pragmatics & Cognition, 17(3), 628–652.
Howard-Jones, P. , Taylor, J. , & Sutton, L. (2002). The effect of play on the creativity of young children during subsequent activity. Early Child Development and Care, 172(4), 323–328.
Huang, A. , & Levinson, D. (2012). To game or not to game: Teaching transportation planning with board games. Transportation Research Record, 2307, 141–149.
Hutchins, E. (2010). Cognitive ecology. Topics in Cognitive Science, 2(4), 705–715.
Jang, E. E. , Lajoie, S. P. , Wagner, M. , Xu, Z. , Poitras, E. , & Naismith, L. (2017). Person-oriented approaches to profiling learners in technology-rich learning environments for ecological learner modeling. Journal of Educational Computing Research, 55(4), 552–597.
Kim, Y. J. , Almond, R. G. , & Shute, V. J. (2016). Applying evidence-centered design for the development of game-based assessments in physics playground. International Journal of Testing, 16(2), 142–163.
Koehler, M. J. , Arnold, B. , Greenhalgh, S. P. , & Boltz, L. O. (2017). A taxonomy approach to studying how gamers review games. Simulation & Gaming, 48(3), 363–380.
Kringelbach, M. L. , McIntosh, A. R. , Ritter, P. , Jirsa, V. K. , & Deco, G. (2015). The rediscovery of slowness: Exploring the timing of cognition. Trends in Cognitive Sciences, 19(10), 616–628.
Kulikowich, J. M. , & Young, M. F. (2001). Locating an ecological psychology methodology for situated action. Journal of the Learning Sciences, 10(1 & 2), 165–202.
Lave, J. , & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
Lillard, A. S. (2013). Playful learning and Montessori education. American Journal of Play, 5(2), 157–186. Retrieved from
Lillard, A. S. , Lerner, M. D. , Hopkins, E. J. , Dore, R. A. , Smith, E. D. , & Palmquist, C. M. (2013). The impact of pretend play on children’s development: A review of the evidence. Psychological Bulletin, 139(1), 1–34.
Lindell, M. K. , House, D. H. , Gestring, J. , & Wu, H. C. (2018). A tutorial on DynaSearch: A Web-based system for collecting process-tracing data in dynamic decision tasks. Behavior Research Methods, 1–15.
Loh, C. S. (2012). Information trails: In-process assessment of game-based learning. In D. Ifenthaler , D. Eseryel , & X. Ge (Eds.), Assessment in game‑based learning (pp. 123–144). New York: Springer.
Lord, F. M. (1968). Some test theory for tailored testing. ETS Research Bulletin Series, 1968(2), i–62.
Malone, T. W. , & Lepper, M. R. (1987). Making learning fun: A taxonomy of intrinsic motivations for learning. In R. E. Snow & M. J. Farr (Eds.), Aptitude, learning, and instruction: Cognitive and affective process analysis (Vol. 3, pp. 223–253). Hillsdale, NJ: Lawrence Erlbaum Associates.
Markus, K. A. , & Borsboom, D. (2013). Frontiers in test validity theory: Measurement, causation and meaning. New York: Psychology Press.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174.
McClelland, J. L. (1988). Connectionist models and psychological evidence. Journal of Memory and Language, 27(2), 107–123.
McKee, K. L. , Rappaport, L. M. , Boker, S. M. , Moskowitz, D. S. , & Neale, M. C. (2018). Adaptive equilibrium regulation: Modeling individual dynamics on multiple timescales. Structural Equation Modeling: A Multidisciplinary Journal, 1442224.
McNamara, D. S. , & Kendeou, P. (2017). Translating advances in reading comprehension research to educational practice. International Electronic Journal of Elementary Education, 4(1), 33–46.
Mislevy, R. J. (2016). How developments in psychology and technology challenge validity argumentation. Journal of Educational Measurement, 53(3), 265–292.
Mislevy, R. J. , Oranje, A. , Bauer, M. , von Davier, A. A. , Hao, J. , Corrigan, S. (2014). Psychometric considerations in game‑based assessment. New York: Institute of Play.
Mitchell, M. (2009). Complexity: A guided tour. New York: Oxford University Press.
Molenaar, P. C. M. (2004). A manifesto on psychology as idiographic science: Bringing the person back into scientific psychology, this time forever. Measurement, 2(4), 201–218.
Molenaar, P. C. M. (2014). Dynamic models of biological pattern formation have surprising implications for understanding the epigenetics of development. Research in Human Development, 11(1), 50–62.
National Research Council. (2012). A framework for K‑12 science education: Practices, crosscutting concepts, and core ideas. Washington, DC: National Academies Press.
Ocone, A. , Millar, A. J. , & Sanguinetti, G. (2013). Hybrid regulatory models: A statistically tractable approach to model regulatory network dynamics. Bioinformatics, 29(7), 910–916.
Papert, S. , & Harel, I. (1991). Constructionism. Norwood, NH: Ablex Publishing Corporation.
Quellmalz, E. S. , Davenport, J. L. , Timms, M. J. , DeBoer, G. E. , Jordan, K. A. (2013). Next-generation environments for assessing and promoting complex science learning. Journal of Educational Psychology, 105(4), 1100–1114.
Raudenbush, S. W. , & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks, CA: Sage.
Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer.
Resnick, L. B. , & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools for educational reform. In B. Gifford & M. O’Connor (Eds.), Changing assessments (pp. 37–75). New York: Springer.
Resnick, M. (2006). Computer as paintbrush: Technology, play, and the creative society. In D. Singer , R. Golinkoff , & K. Hirsh-Pasek (Eds.), Play = learning: How play motivates and enhances children’s cognitive and social‑emotional growth (pp. 192–208). New York: Oxford University Press.
Resnick, M. (2008). Sowing the seeds for a more creative society. Learning & Leading with Technology, 35(4), 18–22.
Rogoff, B. (1990). Apprenticeship in thinking: Cognitive development in social context. New York: Oxford University Press.
Roth, W. M. (1998). Situated cognition and assessment of competence in science. Evaluation and Program Planning, 21(2), 155–169.
Russ, S. W. (2003). Play and creativity: Developmental issues. Scandinavian Journal of Educational Research, 47(3), 291–303.
Schilling, S. G. (2007). The role of psychometric modeling in test validation: An application of multidimensional Item Response Theory. Measurement: Interdisciplinary Research and Perspectives, 5(2–3), 93–106.
Schwartz, D. L., & Arena, D. (2013). Measuring what matters most: Choice‑based assessments for the digital age. Cambridge, MA: MIT Press.
Shaffer, D. W. (2006). How computer games help children learn. New York: Palgrave Macmillan.
Shaffer, D. W., & Gee, J. (2012). The right kind of gate: Computer games and the future of assessment. In M. Mayrath, J. Clarke-Midura, & D. Robinson (Eds.), Technology based assessment for 21st century skills: Theoretical and practical implications from modern research. New York: Springer-Verlag.
Shaw, R. E., & Turvey, M. T. (1999). Ecological foundations of cognition: II. Degrees of freedom and conserved quantities in animal-environment system. Journal of Consciousness Studies, 6(11–12), 111–123.
Shute, V., & Ke, F. (2012). Games, learning, and assessment. In D. Ifenthaler, D. Eseryel, & X. Ge (Eds.), Assessment in game‑based learning: Foundations, innovations and perspectives (pp. 43–58). New York: Springer.
Siddiq, F., Gochyyev, P., & Wilson, M. (2017). Learning in digital networks—ICT literacy: A novel assessment of students’ 21st century skills. Computers & Education, 109, 11–37.
Skinner, B. F. (1988). Preface to the behavior of organisms. Journal of the Experimental Analysis of Behavior, 50(2), 355–358.
Squire, K. (2006). From content to context: Videogames as designed experience. Educational Researcher, 35(8), 19–29.
Squire, K. (2008). Video games and education: Designing learning systems for an interactive age. Educational Technology, 17–26.
Steinkuehler, C., & Squire, K. (2014). Videogames and learning. In K. Sawyer (Ed.), Cambridge handbook of the learning sciences (2nd ed.). New York: Cambridge University Press.
Stephen, D. G., Boncoddo, R. A., Magnuson, J. S., & Dixon, J. A. (2009). The dynamics of insight: Mathematical discovery as a phase transition. Memory & Cognition, 37(8), 1132–1149.
Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317–339.
Tsai, F.-H. (2018). The development and evaluation of a computer-simulated science inquiry environment using gamified elements. Journal of Educational Computing Research, 56(1), 3–22.
Van der Linden, W. J., & Glas, C. A. (Eds.). (2000). Computerized adaptive testing: Theory and practice. Dordrecht: Kluwer Academic.
Vera, A. H., & Simon, H. A. (1993). Situated action: A symbolic interpretation. Cognitive Science, 17(1), 7–48.
Vogel, J. J., Vogel, D. S., Cannon-Bowers, J., Bowers, C. A., Muse, K., & Wright, M. (2006). Computer gaming and interactive simulations for learning: A meta-analysis. Journal of Educational Computing Research, 34, 229–243.
von Davier, A. A. (2017). Computational psychometrics in support of collaborative educational assessments. Journal of Educational Measurement, 54(1), 3–11.
von Davier, A. A., & Halpin, P. F. (2013). Collaborative problem solving and the assessment of cognitive skills: Psychometric considerations. ETS Research Report Series, 2013(2), i–36.
Vygotsky, L. S. (1966). Play and its role in the mental development of the child. Soviet Psychology, 12(6), 62–76.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Wiggins, G. P. (1993). Assessing student performance: Exploring the purpose and limits of testing. San Francisco, CA: Jossey-Bass.
Wilkerson, M. H., Shareff, R., Laina, V., & Gravel, B. (2018). Epistemic gameplay and discovery in computational model-based inquiry activities. Instructional Science, 46(1), 35–60.
Wilson, K. A., Bedwell, W. L., Lazzara, E. H., Salas, E., Burke, C. S., Estock, J. L., … Conkey, C. (2009). Relationships between game attributes and learning outcomes: Review and research proposals. Simulation & Gaming, 40, 217–266.
Wilson, M., Gochyyev, P., & Scalise, K. (2017). Modeling data from collaborative assessments: Learning in digital interactive social networks. Journal of Educational Measurement, 54(1), 85–102.
Wouters, P., van Nimwegen, C., van Oostendorp, H., & van der Spek, E. D. (2013). A meta-analysis of the cognitive and motivational effects of serious games. Journal of Educational Psychology, 105, 249–265.
Young, M. F. (1993). Instructional design for situated learning. Educational Technology Research and Development, 41(1), 43–58.
Young, M. F. (1995). Assessment of situated learning using computer environments. Journal of Science Education and Technology, 4(1), 89–96.
Young, M. F. (2004). An ecological description of video games in education. Paper presented at the International Conference on Education and Information Systems Technologies and Applications (EISTA), Orlando, FL, July 2004.
Young, M. F., & Barab, S. A. (1999). Perception of the raison d’etre in anchored instruction: An ecological psychology perspective. Journal of Educational Computing Research, 20(2), 119–141.
Young, M. F., DePalma, A., & Garrett, S. (2002). An ecological psychology perspective on situations, interactions, process and affordances. Instructional Science, 30, 47–63.
Young, M. F., Kulikowich, J. M., & Barab, S. A. (1997). The unit of analysis for situated assessment. Instructional Science, 25(2), 133–150.
Young, M. F., & Slota, S. (2017). Exploding the castle: How video games and game mechanics can shape the future of education. Charlotte, NC: Information Age Publishing.
Young, M. F., Slota, S., Cutter, A. B., Jalette, G., Mullin, G., Lai, B. (2012). Our princess is in another castle: A review of trends in serious gaming for education. Review of Educational Research, 82, 61–89.
Zheng, D., Newgarden, K., & Young, M. F. (2012). Multimodal analysis of language learning in World of Warcraft play: Languaging as values realizing. ReCALL, 24, 339–360.