At Axon, our mission is to Protect Life. To achieve this, we’ve set an ambitious goal: to reduce gun-related deaths between police and the public in the U.S. by 50% by 2033. A critical part of this effort is enhancing officer training, particularly in verbal de-escalation—one of the most essential yet undertrained skills in law enforcement. Research shows that an officer’s first 45 words during an interaction can determine whether a situation escalates or is peacefully resolved.[1] However, traditional training methods are resource-intensive, rigid, and difficult to scale.
To address these challenges, Axon is developing AI-powered Verbal Skills Training in Axon VR—a set of advanced, scenario-based trainings currently in trials with agencies and planned for general availability in late 2025. These immersive experiences allow officers to engage in real-time, adaptive conversations with AI-driven virtual characters. The system leverages machine learning, natural language processing, and real-time speech synthesis to deliver realistic, scalable, and impactful training tailored to dynamic public safety encounters.
At Axon, ethical AI is a foundational principle guiding how we build and deploy technology. We’ve built our Verbal Skills Training system with safeguards from the ground up, ensuring character responses are not only realistic but also fair, consistent, and free of unintended bias. Through continuous, rigorous testing and a commitment to responsible innovation, we’re setting a new standard for ethical AI in public safety training.
Axon first showcased this training module at the 2024 International Association of Chiefs of Police (IACP) Conference, alongside a range of AI-powered solutions. The module is currently in trials with law enforcement agencies and will be available in the U.S. and Canada in late 2025.
While law enforcement officers undergo extensive training in tactics and procedures, verbal skills training is often overlooked. Agencies report that younger officers, in particular, struggle with communication and de-escalation. Traditional training methods, such as role-playing exercises, face significant scalability challenges—they require additional personnel, resources, and dedicated facilities, making frequent, individualized training difficult.
Costly and inefficient – Requires human trainers, scheduling, and travel.
Repetitive & rigid – Scenarios are often predictable and fail to evolve with officer skill levels.
Not scalable – Agencies struggle to provide frequent, high-quality verbal skills training for all officers.
In contrast, AI-powered Verbal Skills Training in Axon VR offers:
Dynamic, adaptive interactions – AI-driven characters adjust their behavior in real time to each officer’s words and tone.
Scalable & cost-effective – Eliminates the need for human role-players and dedicated training spaces.
On-demand & immersive – Officers can train anytime, anywhere using a wireless VR system.
As Chief Paul Oliveira of New Bedford Police shared as his agency trials the module, "We've never had anything in our profession that's taught cops how to talk to people, and one of our biggest problems we have is cops talking to people...We can teach them de-escalation all day. We can teach them all these different tactics, but most of the time you won't need those tactics if you can just build a rapport with people."
Recent AI advancements, such as Generative AI, have demonstrated how virtual agents can engage in realistic conversations. However, law enforcement training requires far more than a chatbot that responds to text prompts. Axon’s Verbal Skills Training takes that technology a step further: it is built for high-stakes, unscripted interactions where officers must adapt their approach in real time. Instead of relying on pre-scripted responses, the system integrates:
Situational awareness – AI characters track past interactions, tone shifts, and contextual cues to make realistic, evolving decisions.
Emotional intelligence – Characters respond with facial expressions, vocal tones, and physical gestures that mirror real human behavior.
Dynamic narrative progression – The system doesn’t just chat; it drives the scenario forward, forcing officers to make decisions that influence how events unfold.
Unlike traditional AI chatbots that operate in a question-response format, Axon’s AI-driven training requires officers to build rapport, recognize escalating behaviors, and de-escalate situations dynamically. This creates a real-world level of complexity that no simple chatbot can replicate.
Axon VR’s Verbal Skills Training is built on a multi-layered AI architecture that dynamically generates conversations, ensuring that each training session is unique, immersive, and responsive to an officer’s decisions. Unlike traditional simulators that rely on pre-scripted interactions, this system adapts based on real-time speech processing, situational awareness, and emotional intelligence to provide officers with realistic de-escalation training.
To generate natural interactions, the system processes officer speech in real time using:
Automatic Speech Recognition (ASR) – Converts spoken words into structured text for seamless interpretation.
Natural Language Understanding (NLU) – Extracts intent to inform character responses.
This ensures that not only the officer’s words but also their intent shape the direction of the scenario.
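As a toy illustration of the NLU stage described above, a minimal intent extractor might look like the following. The intent labels, keyword lists, and function names are hypothetical, not Axon's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    text: str
    intent: str

# Hypothetical keyword-based stand-in for a real NLU model.
INTENT_KEYWORDS = {
    "rapport": ["my name is", "how are you", "i understand"],
    "command": ["step out", "put down", "show me"],
    "inquiry": ["what happened", "can you tell me", "why"],
}

def understand(transcript: str) -> Utterance:
    """Classify ASR-transcribed officer speech into a coarse intent."""
    lowered = transcript.lower()
    for intent, phrases in INTENT_KEYWORDS.items():
        if any(p in lowered for p in phrases):
            return Utterance(transcript, intent)
    return Utterance(transcript, "statement")
```

A production system would replace the keyword table with a learned model, but the contract is the same: raw transcript in, structured intent out.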
To prevent training from feeling repetitive or predictable, Axon VR continuously adapts conversations based on officer behavior and evolving context. This is powered by:
Agent facts database – Stores real-time details such as character emotions, officer speech, and contextual elements.
Agent rules engine – Guides how characters react based on trust-building, tone, and de-escalation progress.
Fuzzy pattern matching – Allows for organic, unscripted responses rather than rigidly following predefined scripts. [2]
This means that each officer decision influences the scenario’s outcome, reinforcing the need for critical thinking and adaptability.
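The facts database and rules engine described above can be sketched with the criteria-matching approach of the cited fuzzy pattern matching technique [2]: each rule lists criteria over the facts, and the most specific rule whose criteria all hold is selected. The facts, rules, and dialogue lines below are invented for illustration:

```python
# Illustrative agent facts: emotions, officer behavior, context.
facts = {
    "character_mood": "agitated",
    "officer_tone": "calm",
    "trust_level": 2,
}

# Illustrative rules: criteria over the facts, plus a response.
rules = [
    {"criteria": {"character_mood": "agitated"},
     "response": "Why are you even here?"},
    {"criteria": {"character_mood": "agitated", "officer_tone": "calm"},
     "response": "Fine... what do you want to know?"},
    {"criteria": {"officer_tone": "hostile"},
     "response": "Back off!"},
]

def best_rule(facts, rules):
    """Return the most specific rule whose criteria all match the facts."""
    matching = [r for r in rules
                if all(facts.get(k) == v for k, v in r["criteria"].items())]
    # More satisfied criteria == more specific, so that rule wins.
    return max(matching, key=lambda r: len(r["criteria"]), default=None)
```

Because rules compete on specificity rather than following a fixed script, adding new facts or rules changes behavior organically without rewriting a dialogue tree.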
The AI-driven characters in Axon VR are designed to go beyond chatbots by layering three systems that together drive the characters’ responsiveness and dynamism:
Verbal response AI – Generates dialogue aligned with training objectives and officer interactions.
Emotion AI – Modifies facial expressions, tone, and vocal inflections based on scenario progression.
Animation AI – Synchronizes body language and gestures with speech, creating a more immersive experience.
By dynamically generating character responses, the system mirrors real-world conversations, challenging officers to adjust their approach in real time rather than following scripted exchanges.
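One way to picture the three layers working together is a response object that carries all three channels, with the emotion and gesture channels derived from scenario state. The trust thresholds, emotion labels, and gesture names below are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class CharacterResponse:
    dialogue: str   # Verbal response AI: the spoken line
    emotion: str    # Emotion AI: drives facial expression and vocal tone
    gesture: str    # Animation AI: body language synced to the line

def compose_response(dialogue: str, trust: int) -> CharacterResponse:
    """Derive emotion/gesture channels from a coarse trust score (illustrative)."""
    if trust >= 3:
        return CharacterResponse(dialogue, "calm", "relaxed_nod")
    if trust >= 1:
        return CharacterResponse(dialogue, "wary", "crossed_arms")
    return CharacterResponse(dialogue, "agitated", "pacing")
```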
Instead of relying on fixed decision trees to drive scenario progression, Axon VR uses an Agentic LLM Action Generator that evaluates multiple factors to determine how a conversation unfolds. These include:
Task description – What the virtual character is trying to accomplish.
Scenario context – Environmental and past interaction factors influencing behavior.
Persona details – The character’s pre-defined communication style and disposition.
Running transcript – A record of the entire conversation between the officer and the character.
Agent facts & rule nudges – Real-time adjustments guiding responses based on officer input.
This dynamic system eliminates the need for manual scripting, allowing for flexible, adaptive roleplay that challenges officers to refine their communication strategies.
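Given the five inputs listed above, prompt assembly for such an action generator might be sketched as follows. The field labels and layout are assumptions for illustration, not Axon's actual prompt format:

```python
def build_action_prompt(task, scenario, persona, transcript, nudges):
    """Assemble the five listed inputs into a single LLM prompt string.

    transcript: list of (speaker, utterance) pairs (the running transcript);
    nudges: agent facts & rule nudges as short strings.
    """
    lines = [
        f"TASK: {task}",
        f"SCENARIO: {scenario}",
        f"PERSONA: {persona}",
        "TRANSCRIPT:",
        *[f"  {speaker}: {utterance}" for speaker, utterance in transcript],
        "RULE NUDGES:",
        *[f"  - {n}" for n in nudges],
        "Respond in character with the next action and line of dialogue.",
    ]
    return "\n".join(lines)
```

Because the prompt is rebuilt from live state on every turn, the LLM's next action reflects the whole interaction so far rather than a branch in a pre-authored tree.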
To maintain an immersive experience, Axon VR’s AI system ensures fast, seamless interactions through:
Parallel processing – Multiple AI models work simultaneously, reducing lag and enabling responses in under 1.5 seconds.
Low-latency streaming – Maintains real-time conversational flow for natural interactions.
Voice synthesis & lip-syncing – Text-to-speech (TTS) converts AI-generated dialogue into natural-sounding speech, synchronized with facial animations.
These optimizations ensure that training remains fluid, realistic, and as close to real-world interactions as possible.
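The parallel-processing idea can be sketched with Python's asyncio: instead of calling the models one after another (roughly tripling latency), they run concurrently and the response assembles when the slowest finishes. The model names and timings below are placeholders, not Axon's actual pipeline:

```python
import asyncio
import time

# Placeholder model calls; each simulates ~0.3 s of inference latency.
async def verbal_ai(ctx):
    await asyncio.sleep(0.3)
    return "dialogue line"

async def emotion_ai(ctx):
    await asyncio.sleep(0.3)
    return "wary"

async def animation_ai(ctx):
    await asyncio.sleep(0.3)
    return "crossed_arms"

async def generate_response(ctx):
    """Run all three models concurrently and gather their outputs."""
    return await asyncio.gather(verbal_ai(ctx), emotion_ai(ctx), animation_ai(ctx))

start = time.perf_counter()
dialogue, emotion, gesture = asyncio.run(generate_response({}))
elapsed = time.perf_counter() - start  # close to 0.3 s, not 0.9 s
```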
Unlike traditional 2D simulators or pre-scripted role-playing scenarios, Axon VR’s AI-driven interactions evolve dynamically, shaped by both the content and delivery of the officer’s communication.
Take, for example, a noise complaint at a motel. A virtual character might start off calm and cooperative, but certain officer behaviors, like sounding impatient or dismissive, can cause the character to become defensive or even confrontational. Likewise, a tense or agitated character might gradually de-escalate if the officer shows patience and communicates with care.
These changes aren’t always predictable. Just like in real interactions, the character’s responses can catch officers off guard. By simulating this full spectrum of behavior, Axon VR helps officers build the adaptability and communication skills needed to navigate complex, fast-evolving situations.
At Axon, we are committed to responsible innovation—our mission is to harness cutting-edge AI technology to improve community safety while prioritizing the mitigation of bias and other potential risks. As with our work on Draft One[3][4], our AI systems undergo rigorous evaluation to minimize risks such as bias or inappropriate responses. To assess validity, consistency, and appropriateness, we conducted a study analyzing 1,564 AI-generated dialogue responses from 30 full training sessions [5].
Our results showed that the characters’ responses were highly valid, consistent, and appropriate. When we compared each character response to what the real human (not a role player) said at the same point in the original transcript, we found no statistically significant difference in any metric between the human and character responses.
We also tested our AI system for demographic fairness using counterfactual analysis, modifying only the race descriptor of virtual characters across 869 test cases per demographic (“Asian,” “Black,” “Hispanic,” “White,” and a control group with no specified race). No statistically significant variations were found in AI-generated responses, reinforcing that the system does not introduce unintended bias based on race.
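A counterfactual test of this kind can be sketched as follows: generate prompts that are identical except for the race descriptor, score each group's responses with the same metric, then compare the groups statistically (e.g., with an ANOVA). The prompt wording and scoring stub are invented for illustration:

```python
# Demographic groups from the study; None is the no-race control group.
GROUPS = ["Asian", "Black", "Hispanic", "White", None]

def make_prompt(race):
    """Build character prompts that differ only in the race descriptor."""
    base = "A 30-year-old {}motel guest, upset about a noise complaint."
    return base.format(f"{race} " if race else "")

def counterfactual_scores(score_response, n_cases=10):
    """Score n_cases responses per demographic under otherwise identical prompts.

    score_response(prompt, seed) is a stand-in for generating a character
    response and measuring it (e.g., sentiment or hostility score).
    """
    return {race: [score_response(make_prompt(race), i) for i in range(n_cases)]
            for race in GROUPS}
```

With real response metrics, the per-group score lists would then be fed to a significance test; a non-significant result, as in the study above, indicates the descriptor alone does not shift the system's behavior.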
Looking ahead, Axon is focused on expanding emotional AI modeling, ensuring compliance with strict privacy protocols in line with our Criminal Justice Information Services (CJIS) certification, improving long-term scenario adaptation, and integrating verbal training with Axon’s VR use-of-force simulations.
Axon’s AI-powered Verbal Skills Training represents a breakthrough approach to immersive, scalable, and adaptive law enforcement training—currently in trials and coming to market in late 2025. As we refine and expand this technology, Axon remains dedicated to pushing the boundaries of AI-driven training and furthering our mission to protect life through smarter, safer law enforcement solutions.
[1] Rho, E. H., Harrington, M., Zhong, Y., Pryzant, R., Camp, N. P., Jurafsky, D., & Eberhardt, J. L. (2023). Escalated police stops of Black men are linguistically and psychologically distinct in their earliest moments. Proceedings of the National Academy of Sciences.
[2] Ruskin, Elan; Valve Corporation (2012). “AI-driven Dynamic Dialog through Fuzzy Pattern Matching.” Game Developers Conference.
[3] Examining Quality and Bias: Axon’s Studies on Draft One. Retrieved Nov 12, 2024, from https://www.axon.com/blog/examining-quality-and-bias
[4] Draft One: Comparing quality between Officer-only and Draft One report narratives. Retrieved Nov 12, 2024, from https://a.storyblok.com/f/198504/x/7a83779017/axon_marketing_draft-one_double-blind-study_fnl.pdf
[5] Axon utilized training sessions from agencies that have opted into Tier 2 of Axon’s ACEIP program.