By Aaron Festinger, Machine Learning Engineer
In Part II of this series, we discussed acoustically based artificial intelligence and why its development and implementation make sense. In this last part, we’ll discuss feasibility and potential pitfalls.
Feasibility of acoustically based AI: simplicity yields effectiveness
Considering tactical solutions that implement AI (rather than existing AI applications that can be rendered tactical) leads us to an interesting new possibility: acoustically based artificial intelligence for simplified situational awareness. Such a system would deliver boiled-down insights rather than more information, relieving the operator of the unmanageable burden of keeping track of it all. As an analogy: whereas most artificial intelligence applications endeavor to provide a bigger phone book, we can keep the phone book but deliver only the requisite phone numbers.
In its minimal manifestation, a wearable acoustic intelligence unit would include:
- A processor
- A power source
- A microphone array with directional sensitivity
- A single earpiece
At this level the AI is a self-contained piece of equipment with no network support required. For an operator already equipped with a visual AI interface, it is possible that the only additional equipment needed would be the microphone array.
A more complex version of acoustically based AI could involve multiple wireless units connected via an ad-hoc battlefield network, possibly forgoing the directional sensitivity of the individual microphones and instead relying on the network. This design would still be relatively simple, requiring only a handful of networked microphones and a single computing unit. A few functions would prove especially useful and it’s relatively certain they could be successfully implemented.
These are, in order of priority:
- Small arms fire source localization
- Ammo usage count/weapons dry alerts (basic audio classification)
- Vehicle/teammate crisis forecasting
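The first function in this list can be sketched with a classic time-difference-of-arrival (TDOA) approach: each microphone in the networked array timestamps the muzzle blast, and the source is the point whose predicted arrival-time differences best match the measured ones. The microphone positions, grid resolution, and search span below are illustrative assumptions, not a fielded design:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

# Hypothetical planar positions (meters) of three networked microphones.
MICS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

def arrival_times(source):
    """Time for a blast wavefront at `source` to reach each microphone."""
    return [math.dist(source, m) / SPEED_OF_SOUND for m in MICS]

def localize(tdoas, step=0.5, span=50.0):
    """Coarse grid search for the point whose predicted time-difference-of-
    arrival pattern (relative to mic 0) best matches the measured one."""
    best, best_err = None, float("inf")
    y = -span
    while y <= span:
        x = -span
        while x <= span:
            t = arrival_times((x, y))
            predicted = [ti - t[0] for ti in t[1:]]
            err = sum((p - m) ** 2 for p, m in zip(predicted, tdoas))
            if err < best_err:
                best, best_err = (x, y), err
            x += step
        y += step
    return best

# Simulate a shot at a known point and recover it from the TDOAs alone.
true_source = (25.0, 15.0)
t = arrival_times(true_source)
measured_tdoas = [ti - t[0] for ti in t[1:]]
estimate = localize(measured_tdoas)
```

A real system would replace the grid search with a closed-form or least-squares solver and would first have to estimate the TDOAs themselves by cross-correlating the microphone signals, but the geometry is the same.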
Slightly less certain, but still feasible, the acoustic AI could localize sources of indirect fire such as artillery or mortars and report the coordinates to assist with calls for fire. Other capabilities which could be explored for feasibility include casualty detection, detection of separated teammates, proximity detection of stealthy non-teammates, and the detection of dropped equipment.
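The ammo usage count mentioned above rests on the simplest form of audio classification: detecting individual shots in the stream. A toy sketch of the idea, using a per-frame energy threshold with a refractory period so one blast yields one count (the frame size, threshold, and magazine capacity are illustrative, not field-calibrated values):

```python
FRAME = 64        # samples per analysis frame
THRESHOLD = 0.5   # normalized energy level treated as a muzzle blast
REFRACTORY = 3    # frames to ignore after a detection (one blast, one count)

def count_shots(samples, magazine=30):
    """Count impulsive high-energy events and report rounds remaining."""
    shots, cooldown = 0, 0
    for i in range(0, len(samples) - FRAME + 1, FRAME):
        energy = sum(s * s for s in samples[i:i + FRAME]) / FRAME
        if cooldown:
            cooldown -= 1
        elif energy > THRESHOLD:
            shots += 1
            cooldown = REFRACTORY
    return shots, max(magazine - shots, 0)

# Synthetic stream: ten quiet frames with loud bursts in frames 2 and 8.
signal = [0.0] * 640
signal[128:192] = [1.0] * 64
signal[512:576] = [1.0] * 64
fired, remaining = count_shots(signal)
```

A deployable classifier would use a trained model to distinguish the operator's weapon from other impulsive sounds, but the count-and-decrement logic around it stays this simple.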
Relative to other tactical AI systems, an acoustically based AI could involve significantly lower hardware overhead, making it cheaper to produce and easier to implement. Computational requirements would be lower than for most visually based systems because acoustic data is far less dense than visual data, and the absence of remote modules and display interfaces means fewer parts that can break down. Rather than a complex visual display, the acoustically based AI assembly could be built around simple wired or Bluetooth earpieces. In operation, the system would be easier to use because it would stay inconspicuously out of the way except when reporting an event.
Developing acoustic AI: potential pitfalls
While an acoustic AI tool might sound like the perfect solution to situational awareness challenges, there are potential obstacles in the way of developing it.
Overcomplication – Considering the variety of problems an intelligent listening system could solve, it is tempting to immediately pursue multiple features and keep adding functionality. But this would yield a common fault: feature fatigue. By trying to integrate too many functions at once, developers risk creating answers to questions nobody asked.
Too many alerts – A second potential pitfall, related to the first, is being overly helpful. Every pointless or inaccurate alert degrades the value of the information channel and adds unnecessary complexity; at some point, the channel becomes just more background noise. Less is more when it comes to alerts. Rather than making its presence felt constantly, the acoustic AI device should stay silent and out of the way except when an event is detected. For auxiliary information such as the device’s status, the time, coordinates, or weather, the device should respond to a specific set of code words rather than risk becoming a hindrance.
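This push-versus-pull policy is easy to state as code. The sketch below gates output to the earpiece: only critical detections interrupt the operator, and auxiliary information is spoken only in response to a code word. The event names and code words are illustrative, not from a real system:

```python
# Event kinds that are allowed to interrupt the operator unprompted.
CRITICAL = {"small_arms_fire", "weapon_dry", "teammate_crisis"}

class AlertGate:
    """Stay silent except for critical events or explicit operator queries."""

    def __init__(self, auxiliary):
        self.auxiliary = auxiliary  # code word -> status lookup callable
        self.spoken = []            # messages actually sent to the earpiece

    def on_event(self, kind, message):
        # Only critical detections are pushed; everything else stays silent.
        if kind in CRITICAL:
            self.spoken.append(message)

    def on_code_word(self, phrase):
        # Auxiliary info (status, time, weather...) is pull, not push.
        handler = self.auxiliary.get(phrase)
        if handler is not None:
            self.spoken.append(handler())

gate = AlertGate({"ammo count": lambda: "12 rounds remaining"})
gate.on_event("weather", "light rain expected")       # suppressed
gate.on_event("small_arms_fire", "contact, bearing 045")  # interrupts
gate.on_code_word("ammo count")                       # answered on request
```

The design choice here is that the default state is silence: adding a new capability means adding a code word, not a new stream of unsolicited alerts.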
Learning curve – Though an audio interface cannot be cluttered with additional options and features the way a visual interface can, operators will still need to learn its functionality to take advantage of the tool. This creates additional overhead that could keep it out of training, and what doesn’t make it into training is often underutilized or simply left behind when it comes time for combat.
Overcoming objections – It is also difficult to promote adoption of acoustic AI when the answer to the question of what it does is, “It listens to a lot of stuff and gives helpful notifications.” A much better answer would be, “It tells you where the enemy is when they’re shooting at you.” If the interface does that well, no one will object to having the option to say “ammo count” and receive an immediate, accurate response. But until decision makers hear that concrete answer, adoption will remain challenging.
While the potential pitfalls may seem daunting, developing and implementing acoustic AI would solve myriad problems operators face on the battlefield. At Octo, we pride ourselves on creating human-centered solutions by collaborating with our customers. We work with our federal customers, helping them harness the latest technology in a way that makes sense for end users. For questions, reach out to a member of Team Octo.