Feature Roadmap

Handsfree Acoustic Development Interface

A general-purpose voice user interface for IntelliJ IDEA, intended for blind users and users with RSI (and for distracting coworkers).
It uses CMU Sphinx-4 for speech recognition and MaryTTS for speech synthesis, providing end-to-end voice control in pure Java.
Pretrained language models give good recognition accuracy on small-vocabulary grammars.
Check out this presentation: Using Python to Code by Voice

Idear is currently a work in progress. These are some of the features we have implemented and are currently working on:

User presses a button or activates voice control by saying something like “Okay __, help me.”
“Hello , welcome to the handsfree audio development interface for IntelliJ IDEA.”
“There are a number of commands you can use, for example ‘Open settings’, ‘Find action’, ‘Open file’...”

Action reader. When the user enables a flag, any selected menu options or actions are read back to the user.
Status updates. User says, “Run application”. Plugin responds, “building project”, “compiling application”, “running project”.
Text selection. Plugin reads back selected region (rapidly).
User says, "Where am I?". Plugin responds, "You are inside method X, on line Y".

User says, “open Analyze”. Plugin responds, “Would you like to ‘Inspect Code’, ‘Code Cleanup’...”
User says, “open tip of the day”. Plugin responds, “Did you know that... ”
User says, “activate intentions”. Plugin responds, “Would you like to ‘Invert if condition’, ‘Remove braces’,...”

Understand numbers (one, two, three, four, five, six…)
- Jump to text inside the editor window
- Go to line numbers
Understand free form language
- Finding text in the editor
- Performing arbitrary actions
Menus (open + file, edit, view, navigate, code, analyze, refactor, build, run, tools, version control)
Navigation keys (“page up”, “page down”, “line up”, “line down”, “go left”, “go right”)
Fixed actions (“extract method”, “expand selection”, “shrink selection”, “focus project”)
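
To make the "understand numbers" item concrete, here is a minimal sketch of how recognized number words could be turned into an integer, for example to feed a "go to line" command. The class name `SpokenNumbers` and its `parse` method are hypothetical and not part of the plugin. The sketch assumes the recognizer returns plain lowercase English words and only handles values below one million.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical helper (not from the plugin): converts spoken number words to an int. */
final class SpokenNumbers {
    private static final Map<String, Integer> UNITS = new HashMap<>();
    private static final Map<String, Integer> TENS = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six", "seven",
                "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
                "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) UNITS.put(units[i], i);
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty", "seventy",
                "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) TENS.put(tens[i], (i + 2) * 10);
    }

    /** Returns the parsed value, or empty if the phrase contains a non-number word. */
    static OptionalInt parse(String phrase) {
        int total = 0;    // completed "thousand" groups
        int current = 0;  // group currently being built (below one thousand)
        boolean sawNumber = false;
        for (String word : phrase.toLowerCase(Locale.ROOT).trim().split("[\\s-]+")) {
            if (word.equals("and")) continue;  // "one hundred and five"
            if (UNITS.containsKey(word)) {
                current += UNITS.get(word);
            } else if (TENS.containsKey(word)) {
                current += TENS.get(word);
            } else if (word.equals("hundred")) {
                current = Math.max(current, 1) * 100;
            } else if (word.equals("thousand")) {
                total += Math.max(current, 1) * 1000;
                current = 0;
            } else {
                return OptionalInt.empty();  // not a number phrase
            }
            sawNumber = true;
        }
        return sawNumber ? OptionalInt.of(total + current) : OptionalInt.empty();
    }
}
```

A caller could strip a fixed prefix such as "go to line" from the utterance and pass the remainder to `SpokenNumbers.parse`, treating an empty result as "not understood".
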

Define a grammar & vocabulary
1. For example: dialog.gram
Binding speech results to Actions
1. Current hack: VoiceControlAction.java
2. How to trigger actions programmatically
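
One possible shape for binding speech results to actions is a lookup table from normalized phrases to callbacks. The sketch below is an illustration only and does not describe what VoiceControlAction.java does. `VoiceCommandDispatcher`, `register`, and `dispatch` are hypothetical names, and the `Runnable` stands in for whatever code would trigger the IDE action.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical dispatcher (not from the plugin): maps recognized phrases to callbacks. */
final class VoiceCommandDispatcher {
    private final Map<String, Runnable> bindings = new HashMap<>();

    /** Binds a spoken phrase to a callback; a later binding replaces an earlier one. */
    void register(String phrase, Runnable action) {
        bindings.put(normalize(phrase), action);
    }

    /** Runs the callback bound to the utterance; returns false if nothing is bound. */
    boolean dispatch(String utterance) {
        Runnable action = bindings.get(normalize(utterance));
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }

    /** Lowercases and collapses whitespace so "Open  Settings" matches "open settings". */
    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }
}
```

Returning `false` for an unbound phrase lets the caller decide whether to ignore the utterance or ask the user to repeat it.
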
Speech synthesis API
1. Convert text (in code, comments, menus) to audio
2. SplitCamelCaseText -> Split Camel Case Text
3. Interruptible audio input / output
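
The camel-case step above (SplitCamelCaseText -> Split Camel Case Text) can be sketched with a single regular expression, so that identifiers are spoken as separate words. The class and method names here are hypothetical. The pattern also keeps runs of capitals together, which is an assumption about how acronyms should be read.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not from the plugin): prepares identifiers for speech synthesis. */
final class SpokenIdentifiers {
    // Split between a lowercase letter or digit and an uppercase letter ("tC" in "SplitCamel"),
    // and before the last capital of an acronym that starts a new word ("XML|File").
    private static final Pattern WORD_BOUNDARY =
            Pattern.compile("(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    /** "SplitCamelCaseText" -> "Split Camel Case Text" */
    static String splitCamelCase(String identifier) {
        return String.join(" ", WORD_BOUNDARY.split(identifier));
    }
}
```

The resulting string can then be passed to the speech synthesizer in place of the raw identifier.
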
GUI Navigation
1. Extracting menu text
2. Selecting menu items
Code creation
1. Java language features
2. Refactoring actions
3. Code selection actions
4. Spelling letter-by-letter
Code navigation
1. File based navigation
2. Line based navigation
3. Searching for symbols