Accurate and timely identification of military personnel who have a high likelihood of engaging in destructive behaviors is crucial to supporting continued military health and readiness. In the wake of the 2009 Fort Hood shooting, an independent review board established by the Department of Defense (DoD) identified the development of behavioral risk assessment tools as an area of pressing need to protect forces at home and abroad [1]. Studies of settings with high risk for interpersonal conflict, such as those with high levels of physical crowding, high levels of social and sensory monotony, and low levels of privacy/control of physical space (e.g., forward deployed areas, open-bay barracks, space, and other isolated areas such as Antarctica [2,3]), echo the need for rapid and precise behavioral risk assessment tools.
To be effective, systems designed to assess the risk of destructive behavior (e.g., illicit drug use, harm to self, and violence toward others) must accomplish the following tasks.
- Separate higher-risk individuals from lower-risk individuals
- Identify periods of particularly heightened risk (i.e., periods of imminent risk that may or may not be anticipated)
- Quickly adapt to the changing nature and form of destructive behaviors as culture and technology evolve
Existing methods of behavioral risk assessment (see Table 1), including those developed after the DoD independent review, focus overwhelmingly on detecting static risk factors to discriminate between high-risk and low-risk individuals. The utility of these methods is limited by low-to-moderate levels of sensitivity and specificity [4, 5]; an unknown ability to determine how soon a higher-risk individual is likely to engage in an act of destructive behavior; and, in the case of clinical interviews and chart reviews, the requirement of a specially trained assessor [6].
Behavioral Signal Processing
One approach that is well suited to assessing the risk of destructive behavior is Behavioral Signal Processing (BSP) [7], a technologically facilitated method of analyzing behavior in near real time using artificial intelligence. BSP was initially developed to detect communication behaviors and affective expressions during structured dyadic interactions, such as interpersonal communication and psychotherapy sessions for substance abuse. It has since been applied to a wide array of behavioral outcomes relevant to military personnel, such as suicide risk assessment.
BSP employs a set of computational techniques for extracting mathematical quantities, referred to as features, from a digital record of behavior captured by text, audio, or video recording. A suite of machine learning techniques then uses these features to assign each individual a score that indicates the risk of becoming violent. These properties give BSP several advantages over existing risk assessment methods.
- High ecological validity because it can be used to analyze documentation of behaviors presented by military service members in their routine course of duties (e.g., audio or video recordings of conversations, timing and location of entry and exit logs for rooms and buildings, and text of typed reports)
- No specific equipment required beyond a method for digitally recording behavior (e.g., audio or video recorders that are already part of the workplace or networked file storage for work products)
- Can be used to sort individuals into higher and lower risk groups and monitor changes in risk for engaging in destructive behavior, as well as engagement in new or different destructive behaviors
In addition to potentially identifying individuals who are at higher risk, BSP may detect when that risk is elevated, all without interrupting service members’ routine duties or requiring additional assessment tasks. BSP achieves this through a series of steps: data acquisition, data processing, computational modeling, and prediction of the level of risk for engaging in destructive behavior(s). Behaviors of interest are defined to include any action taken or communication made by a person, including interactions between people and machines or other non-living entities. GPS logs from cars, work emails, and audio recordings of performance review meetings could all be incorporated into BSP analysis.
The data processing step of BSP involves several substeps that depend on the modality of the data. Figure 1 depicts the substeps involved in processing audio of a conversation and preparing it for acoustic and linguistic feature extraction: cleaning the audio by removing noise (parts of the digital record that are not related to an action taken or communication made by a person); determining who is speaking and when; and generating a transcript.
Similar principles, such as cleaning and grouping the data and identifying actors, apply when processing other modalities of data. The primary difference between processing audio recordings and processing other modalities is the algorithms used.
Following these steps, features are extracted from the acoustic and linguistic data using modality-specific algorithms. Algorithms used to extract features from the acoustic data are referred to as speech signal processing methods, which quantify both the spectral and temporal aspects of sound. Prosodic features are a subset of spectral features that quantify the tune and rhythm of a sound. Generally, prosodic features have perceptual correlates. For example, fundamental frequency (f0) refers to the lowest-frequency harmonic of the speech sound wave and is highly correlated with the perceived pitch of a sound. Higher f0 corresponds to higher perceived pitch. Thousands of acoustic features can be extracted from each second of sound, which allows for a thorough, precise mathematical description of the manner and tone in which something is said.
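To make the idea of an acoustic feature concrete, the following minimal sketch estimates f0 for a single frame of audio using plain autocorrelation. It is an illustration only and not the method used in BSP systems: the function names, the 8 kHz sample rate, the 75-400 Hz search range, and the synthetic sine tone standing in for voiced speech are all assumptions made for this example. Real recordings would call for a dedicated pitch tracker.

```python
import math


def synth_tone(freq_hz, sample_rate=8000, n_samples=400):
    """Generate a pure sine tone as a stand-in for one voiced speech frame."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def estimate_f0(frame, sample_rate=8000, fmin=75.0, fmax=400.0):
    """Estimate f0 in Hz as the lag with the highest autocorrelation.

    Only lags that correspond to frequencies between fmin and fmax
    are searched (hypothetical bounds chosen for this sketch).
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


low = estimate_f0(synth_tone(100.0))
high = estimate_f0(synth_tone(200.0))
print(low, high)  # the second tone should yield the higher estimate
```

A tone with a higher f0 repeats after fewer samples, so its autocorrelation peaks at a shorter lag; this is the quantity that tracks perceived pitch.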
Algorithms used to extract features from linguistic data are referred to as natural language processing techniques. Linguistic features characterize semantic aspects and syntactic aspects of the words used. Semantic features vary in terms of whether those characterizations are abstract or concrete. For example, topic modeling can be used to determine the themes discussed. This abstract characterization of semantic meaning can be used to summarize large corpora of text with a high degree of flexibility. However, the results are not necessarily readily interpretable.
At the other extreme, n-grams can be used to characterize the frequency of specific words (unigrams) or phrases (e.g., bigrams for two-word sequences). This kind of concrete characterization can be used to summarize specific words and phrases that are identified a priori into highly interpretable metrics, but it is much less flexible than more abstract representations.
A combination of concrete and abstract semantic features maximizes the flexibility and interpretability of the feature set and is advisable. For example, topic modeling of the transcript of a conversation between two service members might identify themes of aggression and substance use. These themes may not be of concern in and of themselves if the aggressive language is used in reference to sports teams or in a similar context, rather than to other members of their unit.
Similarly, language related to substance use may not be relevant if in reference to cigarette use, but it could be of strong interest if related to illicit substances or heavy alcohol consumption. N-gram processing of the topic modeling results would provide additional, actionable information in cases where leadership can construct lists of keywords and phrases that are of specific interest. For example, if leadership identified slang terms for methamphetamine (e.g., meth, speed, and crank) as being of specific interest, n-gram processing could return a frequency count of the number of times those terms appear in a substance use topic. An increase in that frequency count could be used to identify increased interest in and use of methamphetamine.
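The keyword counting described above can be sketched in a few lines. This is a simplified illustration rather than a production pipeline: the watch list, the sample sentence, and the function names are hypothetical, and the tokenizer is a bare regular expression.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-word sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def keyword_counts(text, keywords, max_n=2):
    """Count how often each watch-list term (unigram or bigram) appears."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(g for g in ngrams(tokens, n) if g in keywords)
    return counts


# Hypothetical watch list supplied by leadership.
WATCH_LIST = {"meth", "speed", "crank"}

# Hypothetical text already assigned to a substance use topic.
sample = "He said the crank was cheap. More crank, less speed, he joked."

counts = keyword_counts(sample, WATCH_LIST)
print(counts)
```

Because a term such as "speed" is ambiguous on its own, counting it only within text that topic modeling has already assigned to a substance use theme is what keeps the resulting metric interpretable.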
It will not always be possible for leadership to provide such keyword lists. Even when it is possible, there is a large amount of additional information present in linguistic features, and features from other modalities, that can help identify service members’ level of risk for engaging in destructive behaviors. This information is utilized in the next step, computational modeling.
In the computational modeling step of BSP, features extracted in the previous step are used to estimate the presence versus absence and/or level of a range of cognitive, behavioral, and psychological markers. These markers include anger, impulsivity, boredom, and psychological distress, and they are known (or suspected) to be associated with the risk for engaging in destructive behavior.
The cognitive, behavioral, and psychological markers estimated in this step are indices that must be determined by subject matter experts and leadership, as the scores on these markers are estimated using semi-supervised and supervised machine learning approaches, which rely on the availability of data that include the specific behaviors of interest. These data are referred to as the training set and are used to determine the linguistic, acoustic, and other features characteristic of the marker(s) of interest.
Once these characteristic features are identified, the features in the target data being processed are compared to those in the training set. The more similar the two sets of features, the more likely it is that the data being processed contain the marker(s) of interest.
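One simple way to picture this comparison is to average the training examples for a marker into a single feature vector and measure how closely a new observation points in the same direction. The sketch below uses cosine similarity for this purpose; it is an assumption chosen for illustration, not the specific model used in BSP, and the three feature dimensions and all numeric values are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def marker_similarity(target, training_examples):
    """Similarity of a target vector to the average training example."""
    return cosine_similarity(target, centroid(training_examples))


# Hypothetical scaled features: [mean pitch, speech rate, hostile-word rate]
ANGER_TRAINING = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [1.0, 0.7, 0.8]]

similar = marker_similarity([0.85, 0.80, 0.65], ANGER_TRAINING)
dissimilar = marker_similarity([0.10, 0.90, 0.00], ANGER_TRAINING)
print(similar, dissimilar)  # the first score should be the larger one
```

In practice a supervised classifier trained on many labeled examples would replace the centroid, but the principle is the same: the score rises as the target features come to resemble those in the training set.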
Using training data in this step has important benefits. It allows the continual updating of cognitive, behavioral, and psychological markers of risk for destructive behaviors, even as knowledge about risk markers and the forms of destructive behavior advances. It also allows for adapting markers of low and high risk for destructive behaviors to the unique social norms and technical terminology of different military bases, theaters, and job requirements.
For example, different training sets could be used for analyzing an interaction between a supervisor and a subordinate as compared to an interaction between two peers. This adaptation is important because it allows for distinguishing among actions and communications that are appropriate and expected in a particular context, those that are atypical in that context, and those that are inappropriate and of concern in all contexts.
This quality is the reason BSP has the potential for significantly greater sensitivity and specificity in predictive accuracy in comparison to chart reviews and structured clinical interviews that assess broad and general tendencies.
In BSP’s final step, the risk marker estimates from the computational modeling stage and the raw features from the data processing stage are used in combination to estimate risk for engaging in destructive behaviors. Similar to the computational modeling stage, risk estimates from the final step are generated using machine learning methods. However, in this step, unsupervised and semi-supervised machine learning approaches can be used, whereas supervised and semi-supervised approaches are used in the computational modeling stage.
Unsupervised learning is possible by exploiting contextual or multimodal cues. For example, even if the definition of a word is not understood, its use can be observed in context a number of times, which may help to identify its relationship to other words (e.g., “king” is related to “queen” the same way that “man” is related to “woman”). Other forms of association work in a similar manner, such as the behavioral association of similar-sounding speech.
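As a toy illustration of learning from context without definitions, the sketch below represents each word by the counts of its neighboring words and compares those representations. The three-sentence corpus, the one-word window, and the function names are assumptions made for this example; systems that recover analogies such as the king/queen example rely on far larger corpora and learned embeddings.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=1):
    """Map each word to a Counter of the words seen within `window` of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical miniature corpus.
corpus = [
    "the king rules the land",
    "the queen rules the land",
    "the dog chases the ball",
]

vecs = context_vectors(corpus)
print(cosine(vecs["king"], vecs["queen"]))  # same contexts, high similarity
print(cosine(vecs["king"], vecs["dog"]))    # partly different contexts, lower
```

No labels or definitions are supplied at any point; the relationship between "king" and "queen" emerges only because the two words occur in the same surroundings.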
Unsupervised learning can be used to group individuals into categories that present similar levels of risk without having to specify all the possible forms of destructive behavior in which they might engage. Semi-supervised methods start from first-pass estimates produced by initial models and then selectively use the output of those models to train supervised models. Semi-supervised learning can be used to target risk estimation for specific forms of destructive behavior that are of particular interest.
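The unsupervised grouping described above can be sketched with a one-dimensional k-means clustering of risk scores. This is a deliberately small example under stated assumptions: the scores are invented, the function name is hypothetical, and k-means is only one of many clustering methods that could fill this role.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar scores into k groups.

    Returns (labels, centers), where labels[i] is the cluster index of
    values[i]. Initial centers are spread evenly across the observed range
    so that the result is deterministic.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each score to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                  for v in values]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers


# Hypothetical risk scores for six individuals.
scores = [0.10, 0.15, 0.12, 0.80, 0.85, 0.20]
labels, centers = kmeans_1d(scores, k=2)
print(labels)   # cluster index for each individual
print(centers)  # mean score of each cluster
```

The clusters are formed from the scores alone, without any labeled examples of destructive behavior. A semi-supervised variant could then treat the most clearly separated individuals as provisional labels for training a supervised model.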
BSP is a proven method for the accurate prediction of complex, multiply determined, short- and long-term behavioral outcomes [8-12]. Given the recent successful demonstrations and the ongoing development of the technologies utilized in BSP, it is highly likely they could be effectively adapted to detect individuals who pose a greater risk of engaging in destructive behaviors; identify periods of time when high-risk individuals are particularly likely to engage in destructive behaviors; and measure risk for new forms of destructive behavior as they emerge. Close collaboration between technologists and leadership may enhance the likely success of such an endeavor and be a vital component of introducing these methods to military settings.
1. Department of Defense. (2010). Protecting the force: Lessons from Fort Hood, The Report of the DoD Independent Review (Rep.). Retrieved from https://www.defense.gov/Portals/1/Documents/pubs/DOD-Protecting-TheForce-Web_Security_HR_13Jan10.pdf
2. Applewhite, L. (1994). Prevention measures to reduce psychosocial distress in MFO operations. In Peace Operations: Workshop Proceedings (pp. 47–52). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
3. Bartone, P. T. (1996). American IFOR experience: Psychological stressors in the early deployment period. In Proceedings of the 32nd International Applied Military Psychology Symposium (pp. 87–97). Brussels, Belgium.
4. Fazel, S., Singh, J. P., Doll, H., & Grann, M. (2012). Use of risk assessment instruments to predict violence and antisocial behavior in 73 samples involving 24 827 people: Systematic review and meta-analysis. BMJ, 345. doi:10.1136/bmj.e4692
5. Richard-Devantoy, S., Ding, Y., Turecki, G., & Jollant, F. (2016). Attentional bias toward suicide-relevant information in suicide attempters: A cross-sectional study and a meta-analysis. Journal of Affective Disorders, 196, 101-108. doi:10.1016/j.jad.2016.02.046
6. Nock, M. K., Holmberg, E. B., Photos, V. I., & Michel, B. D. (2007). Self-injurious thoughts and behaviors interview: Development, reliability, and validity in an adolescent sample. Psychological Assessment, 19(3).
7. Narayanan, S., & Georgiou, P. G. (2013). Behavioral signal processing: Deriving human behavioral informatics from speech and language. Proceedings of the IEEE, 101(5), 1203-1233. doi:10.1109/jproc.2012.2236291
8. Baucom, B. R., Atkins, D. C., Simpson, L. E., & Christensen, A. (2009). Prediction of response to treatment in a randomized clinical trial of couple therapy: A 2-year follow-up. Journal of Consulting and Clinical Psychology, 77(1), 160-173. doi:10.1037/a0014405
9. Black, M., Katsamanis, N., Baucom, B. R., Lee, C., Lammert, A., Christensen, A., Georgiou, P., & Narayanan, S. (2013). Towards automating a human behavioral coding system for married couples’ interactions using acoustic features. Speech Communication, 55(1). doi:10.1016/j.specom.2011.12.003
10. Kliem, S., Weusthoff, S., Hahlweg, K., Baucom, K. J. W., & Baucom, B. R. (2015). Predicting long-term risk for relationship dissolution using nonparametric conditional survival trees. Journal of Family Psychology, 29(6), 807-817. doi:10.1037/fam0000134
11. Lee, C., Katsamanis, A., Black, M. P., Baucom, B. R., Christensen, A., Georgiou, P. G., & Narayanan, S. S. (2014). Computing vocal entrainment: A signal-derived PCA-based quantification with application to affect recognition in married couples’ interaction. Computer Speech and Language, 28(2), 518-539. doi:10.1016/j.csl.2012.06.006
12. Nasir, M., Baucom, B. R., Georgiou, P., & Narayanan, S. S. (2017). Predicting couple therapy outcomes based on speech acoustic features. PLoS ONE, 12(9). doi:10.1371/journal.pone.0185123
13. Steadman, H. J., Silver, E., Monahan, J., Appelbaum, P. S., Clark Robbins, P., Mulvey, E. P., … & Banks, S. (2000). A classification tree approach to the development of actuarial violence risk assessment tools. Law and Human Behavior, 24(1), 83-100.
14. Douglas, K. S., Ogloff, J. R., Nicholls, T. L., & Grant, I. (1999). Assessing risk for violence among psychiatric patients: The HCR-20 violence risk assessment scheme and the Psychopathy Checklist: Screening Version. Journal of Consulting and Clinical Psychology, 67(6), 917-930.
15. Quinsey, V. L., Harris, G. T., Rice, M. E., & Cormier, C. A. (2006). Actuarial Prediction of Violence. In V. L. Quinsey, G. T. Harris, M. E. Rice, & C. A. Cormier, The law and public policy. Violent offenders: Appraising and managing risk (pp. 155-196). Washington, DC: American Psychological Association.
16. Cha, C. B., Najmi, S., Park, J. M., Finn, C. T., & Nock, M. K. (2010). Attentional bias toward suicide-related stimuli predicts suicidal behavior. Journal of Abnormal Psychology, 119(3), 616-622. doi:10.1037/a0019710
17. Nock, M. K., Park, J. M., Finn, C. T., Deliberto, T. L., Dour, H. J., & Banaji, M. R. (2010). Measuring the suicidal mind: Implicit cognition predicts suicidal behavior. Psychological Science, 21(4), 511-517. doi:10.1177/0956797610364762