Voice-assisted AI
A little while ago I read through the engineering publications from Bungie. Some of the AI papers discuss one of the worst problems in game AI: recognizing the player’s intention. The next time I played on Xbox Live, I picked up my controller, put on my headset, and had an idea: why not use voice to help AI figure out what you’re thinking?
For example, if you say “fall back,” the AI could recognize that and do it. If you’re the one in the group with thermal vision and you see enemies around a corner, you might say “enemies around that corner.” Even though you didn’t tell your AI teammates which corner you meant, they could either guess, or - since they’re controlled by the game system - use information the game already has to figure out which corner you mean. It’s cheating, but it improves the user experience, so it works.
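To make the “which corner?” idea concrete, here is a minimal sketch of how a game might resolve it. It is not from any shipped game: `resolve_corner`, its parameters, and the 2D coordinates are all hypothetical. It assumes the game knows the player’s position and facing, a list of corner positions, and where the enemies really are (that last part is the cheat).

```python
import math

def resolve_corner(player_pos, facing, corners, enemy_positions, radius=5.0):
    """Guess which corner the player means by "that corner".

    Hypothetical helper: prefers corners that actually have an enemy
    within `radius` (the game "cheats" by using hidden state), and
    breaks ties by how closely the corner lines up with the direction
    the player is facing.
    """
    if not corners:
        return None

    fx, fy = facing
    facing_len = math.hypot(fx, fy) or 1.0

    def score(corner):
        cx, cy = corner
        dx, dy = cx - player_pos[0], cy - player_pos[1]
        dist = math.hypot(dx, dy) or 1.0
        # Cosine of the angle between the facing direction and the corner.
        alignment = (dx * fx + dy * fy) / (dist * facing_len)
        has_enemy = any(
            math.hypot(ex - cx, ey - cy) <= radius
            for ex, ey in enemy_positions
        )
        # Tuples compare element by element: enemy presence first, then alignment.
        return (has_enemy, alignment)

    return max(corners, key=score)
```

With no enemy information this falls back to “the corner you’re looking at,” which is about what a human teammate would guess.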
So why isn’t this in games already? Aside from localization issues there really aren’t any barriers. Speech recognition works pretty well - it’s a bit slow and not always accurate - but that’s good enough to provide help to an AI in a game (not to entirely guide it - I’m talking about a layer on top of the normal AI you’d find in a game). The only other issue would be that not everyone has a headset. But if you only use voice to assist AI, you can still provide a quality experience otherwise. We should at least be seeing this sort of thing in top-tier games.
In fact, to demonstrate how easy it is to get some very rudimentary speech-driven AI working, I decided to code a little test tonight. I set aside some time for it and went to work. Fifteen minutes later I was done. Turns out .NET 3.0 has speech recognition built in, in the System.Speech.Recognition namespace. The result is this:
Download it here
(requires .NET 3.0 and XNA 2.0)
You say some variation of top, bottom, left, right or center to move the ball to that position on the screen. It’d be trivial to translate that into 3D space from a first-person perspective (though you’d probably want to add a word like “front”). It could also be more context-sensitive - e.g. “on the left” said right after “the door” could refer to the left side of the hall instead of the player’s current “left”.
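The step from recognized words to a screen position is small. The sketch below is in Python, not the demo’s actual .NET code, and it assumes the recognizer hands back a text phrase plus a confidence score between 0 and 1. The names `POSITIONS`, `ALIASES`, `target_for_phrase` and `to_pixels`, and the 0.6 threshold, are made up for illustration.

```python
# Normalized screen positions: (0, 0) is top-left, (1, 1) is bottom-right.
POSITIONS = {
    "top": (0.5, 0.0),
    "bottom": (0.5, 1.0),
    "left": (0.0, 0.5),
    "right": (1.0, 0.5),
    "center": (0.5, 0.5),
}

# Hypothetical "variations" that map onto the five core words.
ALIASES = {"up": "top", "down": "bottom", "middle": "center", "centre": "center"}

def target_for_phrase(phrase, confidence, min_confidence=0.6):
    """Return the normalized position named in `phrase`, or None.

    Low-confidence results are ignored, since recognition is
    not always accurate.
    """
    if confidence < min_confidence:
        return None
    for word in phrase.lower().split():
        word = word.strip(".,!?")
        key = ALIASES.get(word, word)
        if key in POSITIONS:
            return POSITIONS[key]
    return None

def to_pixels(target, width, height):
    """Scale a normalized position to pixel coordinates."""
    x, y = target
    return (round(x * width), round(y * height))
```

Something like `to_pixels(target_for_phrase("go to the left", 0.9), 800, 600)` would then give the ball its new position. Ignoring unrecognized or low-confidence phrases keeps a misheard word from moving anything.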
So what are your thoughts? Why aren’t we seeing this in games already? Are there games that do something like this? I know there are games that use sound recognition, but why wouldn’t the more popular games use voice recognition? I looked for patents, and there are a couple, but they seem very specific, and not really applicable to the first-person genre. Other than taking system resources (which with more powerful systems should be less of an issue), what reasons are there for this not being in games already?
September 15th, 2008 at 8:58 am
SOCOM II on the PS2 had this capability: http://www.gaming-age.com/cgi-bin/reviews/review.pl?sys=ps2&game=socom2
“The voice headset, which is NOT included with SOCOM II, is a very important piece of the game. The headset which was bundled with SOCOM I, or the new and improved headset which is now only sold separately, is integral for really experiencing the game as it was meant to be. In addition to vocally commanding your units to perform certain actions (Examples: “Bravo..Deploy..Read Smoke” or “Fireteam..Lead to..Charlie”)”
September 15th, 2008 at 9:00 am
oh, and I totally forgot to include … very awesome!
I do wish XNA would give you access to the Xbox Live headset data … would be cool to include this in an XNA game that runs on the Xbox
September 15th, 2008 at 9:14 am
Ah, I didn’t see that. Wonder why it’s not in more games.
September 15th, 2008 at 2:51 pm
Heh, pretty cool demo. It is a little odd that more games don’t do it. Games like Brothers in Arms could benefit greatly from voice commands.
Tom Clancy’s EndWar, I think, used voice recognition for all the commands in the game.