Emergent cyber-attack threats against cyber-physical systems can create potentially catastrophic impacts. Operators must intervene at the right moment when suspected attacks occur, without over-relying on systems to detect the cyber-attacks. However, military operators are normally trained to trust, rather than suspect, systems. We applied suspicion theory to explore how operators detect and respond to cyber-attacks against an unmanned ground vehicle (UGV) system in the operational context of a human-machine team (HMT). We investigated the relationships between operator suspicion and HMT performance by conducting human-in-the-loop experiments on eight mission scenarios with 32 Air Force officers. The experiment yielded a significant, negative relationship between operator suspicion and HMT performance (quantified both in terms of the desirability of the decision response and the time to respond). Notably, operator suspicion increased with the combined effects of cyber-attacks and a Sentinel alert, but not with the alert alone. This finding was particularly meaningful for “false-negative” scenarios, in which no Sentinel alert was sent despite cyber-attacks having occurred. Although the operators did not receive an alert, they grew more suspicious and sought more information; it took them longer to respond, and their decision responses were highly divergent (17.2% were less-desirable responses, and 21.9% were considered instances of over-reliance). In contrast, in “false-positive” scenarios, 95.3% of the operator responses were highly desirable. This experiment has implications for the role of a Sentinel alert in engineering trustworthy HMT systems so that operators can quickly transition through state-suspicion to the most desirable decision.
A considerable effort is ongoing to prevent, detect, and mitigate cyberattacks on Department of Defense networks and information technology (IT) systems; in contrast, the effort to address these concerns in cyber-physical systems (CPS), such as unmanned vehicle systems, pales in comparison. These systems represent an intrinsic vulnerability and allow adversaries to attempt cyberattacks with the malicious intention of undermining military assets. As an example, Iranian cyber capabilities were believed to have forced down the Central Intelligence Agency-operated RQ-170 Sentinel drone while it was operating near the Iranian border in 2011 [1], causing concern over the potential compromise of highly sensitive surveillance capabilities. This incident sparked much research directed toward the hardware and software security of unmanned vehicle systems [2, 3]. However, research addressing the human dimensions of cyberattack detection and response in the mission operation context remains sparse and represents an emergent area of research needed to fully address cyberattacks against CPS.
Our research took an operator-centric approach toward exploring the human dimensions of cyberattack detection and responses through a scenario-based, human-in-the-loop experiment with Air Force personnel as operators of an unmanned vehicle system in a military context. In prior work, we took a systems-oriented approach to the problem by considering the interaction of a Human–Machine Team (HMT) [4, 5] responding to cyberattacks and defining a framework of performance measurement [6].
In this work, HMT is defined as a team of an operator and Sentinel, an automated cyberattack detection aid. For machine design, the operators’ biases associated with suspicion in their responses to cyberattacks shed light on the development of an adaptive sentinel. For human operators, the findings on the relationship between HMT performance and level of suspicion have implications for the selection, evaluation and training of appropriate personnel.
A challenge in designing for high-performance HMTs is a lack of theory to help understand how humans interact with machines in work contexts. A recent paradigm in human–machine automation considers autonomy as a variable, rather than a fixed parameter, that can be distributed between human and artificial agents to achieve optimal performance at work [7]. An ultimate vision for human–machine teamwork is to “race with machines” [8], not against them, by continuously redefining human roles under new work processes. The promise of complementary engagement of human and machine abilities for enhanced performance has seen some positive examples [9, 10]. Yet, it is difficult to fully accomplish this vision without knowing the constraints of the human, the machine, and the environment [11].
In military operations, mission complexity is outpacing the ability to manage disruptions, which calls for systemic approaches that span technology, human, and mission space [12]. At a minimum, any framework that addresses this complexity should enable the evaluation of human–machine interactions with regard to the nature of problem and solution sets [13, 14], under the situational constraints of the mission context. The traditional framework of Level of Automation (LOA) and its alternatives [15–17] are confined to the concept of function allocation and do not reflect situational constraints.
So far, many unmanned systems [18] have attained assurance by relying on human supervision as the last resort. Some systems attempt to augment human cognitive abilities in particular tasks, such as spatial detection [19] and path planning [20]. Cognitive support in HMTs [4, 21, 22] focuses on team cognition and mental workload. In particular, human–machine collaboration for emergency management has gained attention, with a focus on risk management and resiliency [23, 24]. In emergency situations, HMTs are forced to make decisions on tight time schedules, often with incomplete information, while the new situational complexity is likely to overload team cognitive resources [25]. In military unmanned systems, a failure to respond promptly to emergency situations can result in catastrophic damage, and there are growing concerns over the potential of cyber threats to impede timely responses [26].
Methods have been proposed to help analyze and guide the cognitive responses of a human supervisor under cyberattack (see [27], for instance), but they do not fully consider the dynamic interdependence of human, machine, and situational context. The Instance-Based Learning (IBL) model for cyber situation awareness [28] predicted security analysts’ recognition of cyberattacks based on situational attributes and on their similarity to past instances retrieved from memory. Another example of analyzing cyber situation awareness [29] proposed a distribution-based simulation model to identify cyber-behaviors and their cognitive aspects based on browser log data. In a hybrid approach, the work in [30] proposed a decision-support scheme to assist in response selection against cyber threats by combining qualitative expert assessment, event history, and multi-criteria decision analysis [31]. Although these works presented formal models and methods to represent cognitive performance, they focused exclusively on humans rather than on the dynamics of the HMT.
The dynamics associated with the analysis of HMT performance can be internal (i.e., between human and machine) or external (i.e., situation-specific relations between the team and work-related factors). Regarding internal dynamics, the concept of “trust” is key to successful emergency responses, i.e., how trust is formed, developed, and confirmed with automated agents [32]. The literature on operator trust abounds [33–36], including when autonomous systems are under potential cyberattack [37]. A wide array of factors that influence the level of trust in human-automation interaction has been identified [38]. Not only the formation but also the confirmation of trust becomes critical, particularly when an unmanned system is under cyberattack. In contrast, relatively little attention has been paid to understanding the external dynamics of the HMT. Such investigations are not straightforward because it is not always feasible to keep the situational factors transparent to the supervisor or the machine [39]. For example, under which task-related conditions can HMT performance be weakened (or strengthened)? Are there particular cognitive states of the human supervisor that can help improve HMT performance? What are effective ways for the machine to support the supervisor under cyberattack?
This paper determined the construct of suspicion to be particularly useful for investigating HMT performance in response to cyberattacks. Recent work on the theory of suspicion [40] defines state-suspicion as “a person’s simultaneous state of cognitive activity, uncertainty, and perceived malintent about underlying information that is being electronically generated, collated, sent, analyzed, or implemented by an external agent”. This work also describes the sequential development of state-suspicion across three stages.
Stage 1 refers to perceptual cues and indications from the task environment that can trigger suspicious states in the mind of the operators. For example, missing information, patterns of negative discrepancy, or other system and interface characteristics can serve to provoke different levels of suspicion. In UGV control, an operator and the Sentinel collaborate as a team to detect and respond to cyberattacks, and the Sentinel alert messages, or their absence, on a control interface can serve as stage-1 cues to initiate operator suspicion. Therefore, this research manipulated the Sentinel alert messages to stimulate state-suspicion. For example, a Sentinel alert message that read “Cyberattack: Throttle Control” popped up in the mission video window and remained visible for 30 s; see Figure 3.1. Not only the display of an alert, but also the lack of an alert while the vehicle was maneuvering abnormally, could trigger suspicion from the interface.
Figure 3.1 A Mission Scenario on a UGV Control Interface. (a) Ego-Centric view of the UGV with a sentinel alert, (b) numerical indicators of the UGV control parameters, (c) Bird's-Eye view with way-points (Circled) and the UGV's location and direction (dotted lines).
Stage 2 of the suspicion model identifies individual levels of trust, distrust, training, and other personal traits that can affect state-level suspicion [41, 42]. In particular, an operator’s trait-level attributes, including creativity, cognitive demand and capacity, and propensity to trust [43], can form an internal condition for the arousal of state-suspicion [40]. This research incorporated a set of pre-test surveys, administered prior to the experiment, covering operator self-ratings of intelligence, creativity, general attitude toward complex problems, and propensity to trust.
Finally, stage 3 refers to the behavioral, cognitive, and emotional outcomes of becoming suspicious. In particular, the State-Suspicion Index (SSI) [40] was developed to quantify the level of suspicion through a 20-item questionnaire that assesses the suspicion components of uncertainty, malintent, and cognitive load, as well as overall suspicion. To reflect the operational context of UGV missions, the original SSI was adapted to a 13-item questionnaire in collaboration with one of the original authors.
This research primarily revolves around the relationship between the level of operator suspicion and HMT performance in the mission operation context of an unmanned ground vehicle (UGV). The definitions of the key variables, their measurement, and the experimental process are described in this section.
To answer how suspicion affects HMT performance in a human-in-the-loop simulation, this research paired a UGV operator with a Sentinel for automated cyberattack detection. Guided by suspicion theory, a set of visual cues in the Sentinel alarm and control environment was simulated for anomalous system events under different mission scenarios. On completing each mission scenario, HMT performance, as well as suspicion level, was quantified. HMT performance was evaluated on the two general criteria of speed and accuracy [44] for the detection of, and selection of responses to, suspected cyberattacks. To elaborate on the research question, the following hypotheses were set.
Figure 3.2 depicts how these four hypotheses associate the operator’s suspicion with responses to cyberattacks on unmanned systems. Based on suspicion theory, operator suspicion is presumed to be a latent variable with three components: “Uncertainty”, “Malicious Intent”, and “Cognitive Activity”. The experimental levels, either high or low, of both uncertainty and malicious intent were manipulated as independent variables (IVs) through each mission scenario, while cognitive activity was measured as a dependent variable (DV) at the end of each mission run. For the estimation of cognitive activity, the NASA-TLX [45, 46] and the related items in the SSI questionnaire were used. The two levels of each IV were verified against the responses to the corresponding items in the SSI questionnaire, which helped confirm that the different mission scenarios set different levels of perception as intended. The 13-item SSI questionnaire determined the overall level of suspicion (by a linear combination of its components).
Figure 3.2 Overview of experimental variables, relationships among them, and methods of measurement.
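To make the aggregation concrete, the sketch below scores one 13-item SSI response vector; the item-to-component mapping and the equal weighting in the linear combination are illustrative assumptions, since the actual item assignment and weights are not given in this section.

```python
import numpy as np

# Hypothetical mapping of the 13 SSI items (7-point Likert) to suspicion
# components; the actual item assignment is not published in this section.
SSI_COMPONENTS = {
    "uncertainty":        [0, 1, 2, 3],
    "malicious_intent":   [4, 5, 6, 7],
    "cognitive_activity": [8, 9, 10],
    "overall":            [11, 12],
}

def ssi_scores(responses):
    """Mean score per component, plus overall suspicion as an (assumed)
    equal-weight linear combination of the three components."""
    responses = np.asarray(responses, dtype=float)
    scores = {name: responses[idx].mean() for name, idx in SSI_COMPONENTS.items()}
    scores["suspicion"] = np.mean([scores["uncertainty"],
                                   scores["malicious_intent"],
                                   scores["cognitive_activity"]])
    return scores

print(ssi_scores([4, 5, 3, 4, 6, 5, 5, 6, 4, 3, 4, 5, 5]))
```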
The performance measures of response time and score were recorded while the mission videos dynamically played back anomalous system events, including cyberattacks and Sentinel alert messages. The time to respond to such events was recorded during the experiment using interactive polling software (TurningPoint, Turning Technologies, Ltd.), and the performance score was determined post-experiment based on rubrics. Each mission scenario had its own scoring rubric defined by subject-matter experts. The operator’s response selections from a given set of decision trees were logged in the software and then evaluated against the rubric. Furthermore, the four-way combination of cyberattacks (attack vs. no attack) and Sentinel alert messages (alert vs. no alert) enabled us to analyze operator suspicion under different circumstances.
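As a hedged illustration of this post-experiment scoring step, the snippet below evaluates a logged decision-tree selection against a per-scenario rubric; the `FN_RUBRIC` name and its values are invented placeholders, not the subject-matter experts’ actual rubrics.

```python
# Illustrative post-hoc scoring: each scenario's rubric maps a logged
# decision-tree code (0-6) to a desirability score (placeholder values).
FN_RUBRIC = {0: 0, 1: 0, 2: 50, 3: 100, 4: 75, 5: 25, 6: 10}

def score_response(rubric, decision_code):
    """Evaluate a logged response selection against a scenario rubric."""
    return rubric[decision_code]

# e.g., under this placeholder rubric for a false-negative scenario, the
# operator fixing the problem (code 3) is the most desirable response.
print(score_response(FN_RUBRIC, 3))  # -> 100
```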
The human-in-the-loop experiments were designed and conducted in three phases. In phase 1, we obtained consent from thirty-two military operators (IRB: FWR20160115H) and collected personal information, including demographic and personality-related questionnaires. Phase 2 familiarized participants with the experimental tasks through instruction and demonstrations, so that an acceptable level of fluency was ensured in the operational context. In phase 3, participants were presented, in random order, with a series of eight mission scenarios, each consisting of a mission briefing paired with mission videos. After the mission briefing, the participant responded to events that occurred on the mission videos during each mission scenario, while response selections and response times were recorded simultaneously. On completion of each mission scenario, participants’ perceptions of uncertainty, malicious intent, and cognitive workload during the mission were obtained via the NASA-TLX and SSI questionnaires.
To the operator, a mission scenario was characterized by the combination of mission briefings, illustrated in Figure 3.3, and mission videos. The mission briefings described the mission type, the mission context, and descriptive profiles for the operation of the unmanned ground vehicle system (UGVS). The mission type was either a training or an operational mission for transport and re-supply. The mission context was set in U.S. or Middle Eastern locations, with the corresponding estimated frequencies of cyberattacks in the past.
Figure 3.3 Illustration of a mission briefing (part).
For the machine side, a mission profile configured the UGV behaviors: when a profile was deployed, the UGVS autonomously executed it and generated mission views for playback in the simulation experiments. Overall, both verbal and visual elements of the mission scenarios were constructed to indirectly manipulate the operator’s state-suspicion by forming the two independent variables (IVs), “uncertainty” and “malicious intent”, into a two-level full-factorial design.
After being oriented to the mission briefing, the participants were tasked with recording the UGV speed every thirty seconds while monitoring the mission video, as well as instrument readouts, for anomalous events from the UGV mission. On detecting anomalous events, the participants were instructed to select the most appropriate response from a decision tree that was provided as a guide to standardize the range of possible operator responses (see the response options in Table 3.3). These tasks were closely aligned with typical unmanned vehicle system operator tasks. At the conclusion of each mission scenario, the operator completed two questionnaires: (i) the NASA-TLX questionnaire, which quantifies the operator’s self-assessment of cognitive workload on six dimensions, each rated on a 0–100 scale, and (ii) the 13-item SSI questionnaire, which was developed specifically for this research and evaluated (on a 7-point Likert scale) the operator’s perception of uncertainty, malicious intent, cognitive activation, and overall suspicion. The 13 items measuring suspicion were then aggregated to form an overall quantitative measure of operator suspicion. Cronbach’s alpha [47] for the 13 items was .88, indicating acceptable internal consistency of the measure. Participants were thirty-two Air Force officers from the Air Force Institute of Technology (AFIT), and each experiment took 2 to 2.5 h to complete. Since many current operations associated with unmanned vehicle missions occur in an office environment, the experiment took place in such a space.
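For reference, Cronbach’s alpha follows directly from its standard formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total score); the sketch below computes it for a respondents-by-items matrix, with random stand-in data in place of the actual 32 × 13 responses.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (n_respondents, k_items) matrix of responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Placeholder stand-in for the 32 x 13 Likert response matrix.
rng = np.random.default_rng(0)
responses = rng.integers(1, 8, size=(32, 13))
print(f"alpha = {cronbach_alpha(responses):.2f}")
```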
Table 3.1 Mean ± standard deviation of the dependent variables for each combination of cyberattack and Sentinel alert.

| Dependent Variable | (a) TP: Attack Yes / Alert Yes | (b) TN: Attack No / Alert No | (c) FP: Attack No / Alert Yes | (d) FN: Attack Yes / Alert No |
|---|---|---|---|---|
| Suspicion (7-point Likert) | 4.38 ± 0.97 | 3.97 ± 0.96 | 3.90 ± 0.96 | 4.52 ± 1.00 |
| Score (0–100) | 90.8 ± 18.1 | 94.8 ± 12.1 | 92.7 ± 15.7 | 81.2 ± 27.7 |
| Time (s) | 14.91 ± 16.45 | 1.00 ± 2.56 | 4.61 ± 3.69 | 13.48 ± 12.79 |
| NASA-TLX (rating 0–25) | 22.1 ± 14.1 | 16.6 ± 11.4 | 16.2 ± 9.8 | 23.5 ± 13.8 |
The experiment yielded significant outcomes on the relationship between operator suspicion and HMT performance. The overall level of suspicion derived from the 13-item SSI questionnaire had a significant (p < 0.001) Pearson correlation with the response score (ρ = −0.251), as well as with the response time (ρ = 0.379). As expected from suspicion theory, the subgroups of SSI questionnaire items that correspond to perceptions of uncertainty, malicious intent, and cognitive activation also showed strong, significant correlations with overall suspicion (p < 0.001); the correlations were estimated to be ρ = 0.803, ρ = 0.905, and ρ = 0.828, respectively. In addition, there were predictable relationships among the HMT performance metrics. The response score was negatively correlated with response time (ρ = −0.225), as well as with the standard deviation of its own score (ρ = −0.354), implying that less-desirable decision responses tend to accompany slow and inconsistent responding.
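A minimal sketch of this correlation analysis, assuming per-trial arrays of suspicion, score, and time; the synthetic generating parameters below are arbitrary and only mimic the signs of the reported relationships.

```python
import numpy as np
from scipy.stats import pearsonr

# Synthetic stand-ins for the 256 per-trial records (32 operators x 8 scenarios).
rng = np.random.default_rng(1)
suspicion = rng.normal(4.2, 1.0, 256)
score = 95.0 - 5.0 * suspicion + rng.normal(0, 15, 256)  # negative relation
time_s = 2.0 + 3.0 * suspicion + rng.normal(0, 8, 256)   # positive relation

for name, y in [("score", score), ("time", time_s)]:
    r, p = pearsonr(suspicion, y)
    print(f"suspicion vs {name}: r = {r:+.3f}, p = {p:.3g}")
```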
Contrary to H1, which proposed that Sentinel alerts are related to operator suspicion, the result of the one-way ANOVA was not significant (F(1, 254) = 0.688, p = 0.408); hence, Sentinel alerts alone did not create operator suspicion. The variability within each group (Sentinel alert activated versus not activated) outweighed the between-group variability (MS within-group = 1.009, MS between-group = 0.694). In contrast, for the factor of cyberattack (attack versus no attack), a second one-way ANOVA showed a significantly different level of suspicion (F(1, 254) = 18.393, p < 0.001); i.e., a higher level of suspicion was observed when attacks occurred. These results, in which Sentinel alerts were not an independent factor of suspicion despite their visual salience on the display while cyberattacks did significantly arouse suspicion, imply a complicated cognitive structure of judgment based on the uncertainty of perceptual information [48, 49]. Of particular statistical concern with this uncertainty is a shared-variance structure of an individual operator’s suspicion, which might be determined by the combined effects of the Sentinel alert and cyberattack scenarios.
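The form of these two one-way ANOVAs can be reproduced with scipy’s `f_oneway`; the data below are synthetic placeholders shaped like the experiment’s 256 trials, with suspicion driven by the attack factor rather than the alert.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(2)
alert = rng.integers(0, 2, 256)                          # Sentinel alert on/off
attack = rng.integers(0, 2, 256)                         # cyberattack yes/no
suspicion = 4.0 + 0.5 * attack + rng.normal(0, 1, 256)   # attack drives suspicion here

F_alert, p_alert = f_oneway(suspicion[alert == 0], suspicion[alert == 1])
F_attack, p_attack = f_oneway(suspicion[attack == 0], suspicion[attack == 1])
print(f"alert:  F = {F_alert:.3f}, p = {p_alert:.3f}")
print(f"attack: F = {F_attack:.3f}, p = {p_attack:.3f}")
```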
To resolve these combined effects, a Hierarchical Linear Model (HLM) was applied. HLM is capable of accounting for the shared-variance structure in nested data with hierarchical levels of variables by using a more complex form of Ordinary Least Squares (OLS) regression [50]. This method can effectively compensate for two known risks [51]: the risk of ignoring the between-scenario effects on suspicion (as seen in the first one-way ANOVA of the previous paragraph), as well as the risk of ignoring the individual propensity to trust the Sentinel alert (as seen in the second one-way ANOVA of the previous paragraph).
In general, the final outcome of HLM takes the form of a simple regression in which a dependent variable $Y_{ij}$ is predicted from a level-$i$ variable $X_{ij}$ that is nested within a higher-level unit $j$:

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \tag{3.1}$$

In HLM, this lower-level model (3.1) further incorporates the higher-level models (3.2) and (3.3) below for each of the coefficients, $\beta_{0j}$ and $\beta_{1j}$, in terms of the interim variable $Q_j$ of the level-$j$ unit and the random effects, $U_{0j}$ and $U_{1j}$, that are adjusted for $Q_j$:

$$\beta_{0j} = \beta_{00} + \beta_{01} Q_j + U_{0j} \tag{3.2}$$

$$\beta_{1j} = \beta_{10} + \beta_{11} Q_j + U_{1j} \tag{3.3}$$

The statistical significance of $\beta_{1j}$ can be tested to determine whether the combined levels $i$ and $j$ influence the dependent variable $Y_{ij}$.

Finally, the overall model (3.4) incorporates both the level-$i$ and level-$j$ predictors, $X_{ij}$ and $Q_j$ respectively, by substituting (3.2) and (3.3) into (3.1):

$$Y_{ij} = \beta_{00} + \beta_{01} Q_j + \beta_{10} X_{ij} + \beta_{11} Q_j X_{ij} + U_{0j} + U_{1j} X_{ij} + r_{ij} \tag{3.4}$$
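The paper does not name the estimation software or the grouping unit; as one plausible realization of (3.1)–(3.4), the sketch below fits an analogous two-level model with statsmodels’ MixedLM on synthetic data, assuming operators as the grouping unit, with a random intercept and a random slope for the alert predictor.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data shaped like the experiment: 32 operators x 8 scenarios.
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "operator": np.repeat(np.arange(32), 8),   # assumed grouping unit (shared variance)
    "alert":    rng.integers(0, 2, 256),       # level-i predictor X_ij
    "attack":   rng.integers(0, 2, 256),       # higher-level predictor Q_j
})
df["suspicion"] = (4.0 + 0.45 * df["attack"] + 0.1 * df["alert"]
                   + rng.normal(0, 1, 256))    # outcome Y_ij

# Fixed effects for alert, attack, and their cross-level product (as in (3.4)),
# with a random intercept and random alert slope within each operator.
model = smf.mixedlm("suspicion ~ alert * attack", df,
                    groups=df["operator"], re_formula="~alert")
print(model.fit().summary())
```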
In order to apply HLM, the dependent variables summarized in Figure 3.2 for state-suspicion, HMT performance, and cognitive workload, respectively, were structured for each combination of the level-i Sentinel alert (i = 0 if no alert; i = 1 if an alert message was shown on the display) and the level-j cyberattack (j = 0 if no attack; j = 1 if any attack occurred in an experimental scenario). Table 3.1 summarizes the mean and standard deviation for each combination of the nested levels. Such orthogonal dichotomies of True/False (of cyberattacks) and Positive/Negative (of the Sentinel alarm) on a 2-by-2 contingency table allow us to further analyze the experimental results within the classic framework of signal detection theory [52].
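The cell coding used throughout Tables 3.1–3.3 follows directly from the two flags; a small helper makes the mapping explicit.

```python
def sdt_cell(attack: bool, alert: bool) -> str:
    """Classify a trial into the 2-by-2 signal-detection cell
    (cyberattack = ground truth, Sentinel alert = detection)."""
    if attack and alert:
        return "TP"   # (a) true positive
    if not attack and not alert:
        return "TN"   # (b) true negative
    if alert:
        return "FP"   # (c) false positive: alert without an attack
    return "FN"       # (d) false negative: attack without an alert

assert sdt_cell(attack=True, alert=False) == "FN"
```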
Since cyberattacks are by nature malicious events and require consideration of multiple solutions for the observed behavior, H2 hypothesized that operator suspicion is positively related to HMT performance, suggesting a suspicious operator would score better on the tasks. The experimental results showed the opposite. The linear coefficient of the HLM analysis was significant when modeled after (3.1) (β₁₀ = −5.63, p < 0.001), and the direction of the relationship was negative, meaning that increased operator suspicion had a significantly negative relationship to HMT performance, as depicted in Figure 3.4.
Figure 3.4 Performance score as a function of operator suspicion.
Additionally, H4 proposed that operator suspicion is positively related to operator task response time, i.e., that higher suspicion is associated with a longer task response time. The linear coefficient of the HLM analysis supported H4: the relationship is statistically significant and in a positive direction (β₁₀ = 6.95, p < 0.001). This linear relationship is depicted in Figure 3.5.
Figure 3.5 Time as a function of operator suspicion.
Finally, the four cyberattack and Sentinel alert combinations were tested in the experiment and analyzed by using the HLM as summarized in Table 3.2. The two combinations without cyberattacks, both (b) True Negative (TN) and (c) False Positive (FP), had a significant (p < 0.05) negative impact on operator suspicion, meaning that operator suspicion was lowered in both cases.
Table 3.2 Combined effects of cyberattacks and Sentinel alerts on operator suspicion (HLM results).

| | Cyberattacks: Yes | Cyberattacks: No |
|---|---|---|
| **Sentinel Alert: Yes** | (a) True Positive (TP): increases suspicion ↑ (β₁₀ = +0.255, p = 0.047) | (c) False Positive (FP): decreases suspicion ↓ (β₁₀ = −0.394, p = 0.002) |
| **Sentinel Alert: No** | (d) False Negative (FN): increases suspicion ↑ (β₁₀ = +0.440, p = 0.001) | (b) True Negative (TN): decreases suspicion ↓ (β₁₀ = −0.301, p = 0.019) |
In contrast, the two combinations containing cyberattacks, (a) True Positive (TP) and (d) False Negative (FN), had a significant (p < 0.05) positive impact on operator suspicion. These results are consistent with the finding for H1 that Sentinel alerts alone do not always create suspicion.
The combined effects of Table 3.2 warrant further discussion. Table 3.3 presents a frequency analysis of HMT actions for each combination (a–d) in terms of the four dependent variables: operator decision selections, suspicion, HMT performance score evaluated in terms of the desirability of the decision response to a given mission scenario, and response time. As previously noted, all operators in the experiment responded to suspicious events by referring to a pre-defined tree of decision responses, and the frequencies associated with those response options are summarized in the first section of Table 3.3. The HMT actions in the combinations of (a) True Positive (TP) and (b) True Negative (TN) are predictable based on the findings of other hypotheses and will not require further discussion. The more interesting behaviors are from situations (c) False Positive (FP) that represent scenarios in which no cyberattacks occurred, but the Sentinel sent an alert to the operator anyway, and (d) False Negative (FN) that represent scenarios in which cyberattacks occurred, but the Sentinel failed to send an alert.
Table 3.3 Frequency analysis of HMT actions for each combination of cyberattack and Sentinel alert (N = 64 responses per combination).

| | (a) True Positive | (b) True Negative | (c) False Positive | (d) False Negative |
|---|---|---|---|---|
| **Decision Responses on a Decision Tree** | | | | |
| 0 - No response | - | 51 | 1 | 1 |
| 1 - Continue mission | 2 | 4 | 46 | 6 |
| 2 - Take action; Sentinel fixes the problem; continue | 54 | 5 | 11 | 14 |
| 3 - Take action; operator fixes the problem; continue | 5 | 4 | 6 | 38 |
| 4 - Take action; call backup; continue | - | - | - | 2 |
| 5 - Abort; recovery; backup | 2 | - | - | 2 |
| 6 - Abort; recovery; no backup | 1 | - | - | 1 |
| Subtotal (N) | 64 | 64 | 64 | 64 |
| **Suspicion (SSI Total, range 1–7; higher indicates more suspicious)** | | | | |
| Low (SSI Total: 1–3) | 5 | 10 | 12 | 1 |
| Medium (SSI Total: 3–5) | 40 | 43 | 41 | 40 |
| High (SSI Total: 5–7) | 19 | 11 | 11 | 23 |
| Subtotal (N) | 64 | 64 | 64 | 64 |
| **HMT Performance (Score range 0–100; evaluated by desirability)** | | | | |
| Low (Score: 0–50) | 3 | - | 1 | 11 |
| Medium (Score: 50–75) | 4 | 5 | 2 | 3 |
| High (Score: 75–100) | 57 | 59 | 61 | 50 |
| Subtotal (N) | 64 | 64 | 64 | 64 |
| **Response Time (range 1–60 s)** | | | | |
| Fast (Time: 0–5 s) | | | | |
| Medium (Time: 5–10 s) | | | | |
| Slow (Time: 10–60 s) | | | | |
| Subtotal (N) | 64 | 64 | 64 | 64 |
In FP scenarios, 71.8% of responses (i.e., 46 out of 64) were judged desirable for the mission context by subject-matter experts: when the operators received the Sentinel alert, most of them collected the information available from the system to tell whether a cyberattack was in effect and decided to override the Sentinel alert by continuing the mission without taking additional action. This quick search-and-override decision resulted in relatively higher HMT performance and faster response times compared with the other combinations, as summarized in Table 3.3. Furthermore, there were no “call for backup” or “abort” actions, which would have come at a high cost in mission operation. Overall, the responses in False Positive (FP) scenarios were generally desirable.
In contrast, the HMT actions in False Negative (FN) scenarios were considerably less desirable in the mission context. Even though the operators did not receive a Sentinel alert to prompt an information search, they grew more suspicious when cyberattacks occurred, and it took them longer to respond, yielding lower HMT performance scores. Of the 64 responses, 38 chose to develop their own solutions, 2 called for backup, and 14 even allowed the Sentinel to act, which can be considered instances of over-reliance on the Sentinel even though it had not detected the attack [53]. Another issue that emerged in this operational context was the frequency of missed detections: the operators completely missed the cyberattack seven times (the responses coded 0 or 1). Overall, the HMT behaviors in the FN scenarios were potentially more damaging to mission outcomes.
The analysis of recent cyberattacks on cyber-physical infrastructures reveals that adversaries promptly adapt their attack strategies to mitigation actions [54]. This makes early detection and recognition of incoming cyberattacks even more critical to effective mitigation. So far, much research has focused on engineering cyberattack detection aids [55], without necessarily considering their cognitive effects on the human or on human–machine collaboration in mission contexts.
The finding that Sentinel alerts did not necessarily arouse operator suspicion (i.e., the rejection of H1) has implications for vigilant human–machine integration. Perhaps, rather than the Sentinel alert itself, visual cues of unexpected system behaviors in the mission environment are more likely to determine suspicion.
In fact, our finding could be related to the perceived risk that might have been triggered by a Sentinel alert. In the decision science literature [56–58], a positive correlation between perceived risk and information-seeking behavior is widely observed in decisions under uncertainty. For instance, when a consumer has to choose a service that does not allow feature-by-feature comparison, information seeking is a common strategy to reduce perceived risk. In particular, an information search triggered by perceived risk is more likely to be thorough if decision-makers have less knowledge about their choice and its consequences [59], leading to increased search time. Since the operators did not know the true system state with respect to cyberattacks, a Sentinel alert could have triggered perceived risk, which then initiated a wider information search to resolve suspicion.
In this regard, operator suspicion is a state of suspended or postponed decision-making, and it significantly lengthened mission time, as observed in both (a) TP and (d) FN of Table 3.1. This strong linear relationship between suspicion and time is depicted in Figure 3.5. The negative correlation between suspicion and performance score also suggests that it is wider information-seeking behavior, rather than more elaborate response selection, that actually lengthened the mission time. If the increased mission times were due to more effort invested in response selection, the operators would have obtained better scores.
Yet, one cannot rule out the possibility that the scenarios that evoked suspicion were inherently more difficult to respond to and thus increased mission time. Moreover, the causal relations among the alert, state-suspicion, and information-seeking behavior are not fully established. The current results do not allow us to conclude how state-suspicion is aroused, modulated, and resolved in the context of HMT collaboration.
The novel application of suspicion theory to UGV operations in a military context demonstrated the potential of that theory, particularly for understanding the operation of a human–machine (Sentinel) team. We suggest that operator suspicion needs to be managed in order for an HMT to achieve the best results in detecting cyberattacks, and in the subsequent responses when unmanned vehicle systems incur those attacks. This research provides an understanding of suspicion effects on HMT performance and offers insights about moving quickly (or not) from a position of state-suspicion to making a decision.
A Sentinel alert on a cyberattack symbolizes the roles that automation can play in responding to cyberattacks and sheds light on how HMT design can help exploit operator suspicion. As systems developers consider the balance of false-positive and false-negative errors in the design of cyberattack detection aids, the results of this experiment suggest that erring on the side of false positives is more desirable. In addition, the Sentinel design used in the experiment did not provide operators with any indication of how immediate a response the detected attack required. Providing such information could help operators manage the undesirable delays that were observed during the experiments. Satisfying such a need could be difficult, as it places requirements on the Sentinel to develop more detailed assessments of the attacks it detects and may also require access to additional data sources that would serve this purpose.
Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the United States Department of Defense.
Reprinted with permission: Gay, C., Horowitz, B., Elshaw, J. J., Bobko, P., & Kim, I. (2019). Operator suspicion and human-machine team performance under mission scenarios of unmanned ground vehicle operation. IEEE Access, 7, 36371–36379.