# LinVAM
Linux Voice Activated Macro

## Status
This project is currently a work in progress and is minimally functional, for English only.

Using Pocketsphinx, a lightweight voice-to-text engine, you can specify voice commands for the tool to recognise and the actions to perform for each.

Known bugs and planned additions
- To save and apply changes, click Ok on the main GUI, then reload.
- Remember the last loaded profile and load it on start
- Log window showing the spoken words the voice-to-text engine recognises, with the ability to right-click and assign a voice command and actions to the current profile
- Support for joysticks and gaming devices

## Requirements
- python3
- PyQt5
- python3-xlib
- pyaudio
- pocketsphinx
- swig3.0

## Optional requirements
- xdotool
- ffplay (part of ffmpeg, usually already installed)
- HCS voicepacks

## Install
- $ pip3 install PyQt5
- $ pip3 install python3-xlib
- $ pip3 install pyaudio
- $ pip3 install pocketsphinx
- $ sudo apt-get install swig3.0
- $ sudo apt install xdotool (optional)
- $ sudo apt install ffmpeg (optional)
- $ sudo ln -s /usr/bin/swig3.0 /usr/bin/swig
- $ git clone https://github.com/aidygus/LinVAM.git
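
After installing, it can help to confirm that everything is actually available. The following is a minimal sketch, not part of LinVAM: the helper names `missing_modules` and `missing_tools` are made up for illustration, and it assumes the pip packages above are imported under the module names `PyQt5`, `Xlib`, `pyaudio` and `pocketsphinx`.

```python
import importlib.util
import shutil

# Import names for the pip packages listed above
# (the python3-xlib package installs a module named "Xlib").
REQUIRED_MODULES = ["PyQt5", "Xlib", "pyaudio", "pocketsphinx"]
# Command-line tools: swig is required, xdotool and ffplay are optional.
REQUIRED_TOOLS = ["swig"]
OPTIONAL_TOOLS = ["xdotool", "ffplay"]


def missing_modules(names):
    """Return the top-level module names the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names):
    """Return the executables that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules(REQUIRED_MODULES) or "none")
    print("Missing required tools:", missing_tools(REQUIRED_TOOLS) or "none")
    print("Missing optional tools:", missing_tools(OPTIONAL_TOOLS) or "none")
```

Save it under any name (for example `check_deps.py`, a hypothetical file name) and run it with `python3`; anything it reports as missing points to the matching install step above.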

## Usage
This script must be run with root privileges because it hooks into and simulates input devices such as the keyboard and mouse.
- $ cd LinVAM
- $ xhost +
- $ sudo ./main.py

As an alternative, if you use the X Window System (X11) and have xdotool installed, you can run the script without root like this:
- $ cd LinVAM
- $ ./main.py -noroot
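
To illustrate why this mode does not need root: xdotool injects key presses through the X server instead of through the input devices. The sketch below shows one way to call `xdotool key` from Python. It is an assumption-laden illustration, not LinVAM's actual code; `build_xdotool_key_command` and `press_key` are hypothetical names, and the only xdotool features used are the `key` command and its `--window` option.

```python
import shutil
import subprocess


def build_xdotool_key_command(key, window_id=None):
    """Build the argument list for 'xdotool key', optionally for one window."""
    command = ["xdotool", "key"]
    if window_id is not None:
        # --window sends the key press to a specific X11 window id.
        command += ["--window", str(window_id)]
    command.append(key)
    return command


def press_key(key, window_id=None):
    """Send one key press through xdotool; raises if xdotool is missing or fails."""
    if shutil.which("xdotool") is None:
        raise RuntimeError("xdotool is not installed or not on PATH")
    subprocess.run(build_xdotool_key_command(key, window_id), check=True)
```

For example, `press_key("ctrl+alt+t")` would send that combination to the focused window. Key names follow xdotool's own naming, so some keys stored in a profile may need to be remapped.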

### Profiles
Multiple profiles are supported. To create a new profile for a specific task or game, click New and the main profile editor window will be displayed.

![Main GUI](https://raw.githubusercontent.com/aidygus/LinVAM/master/.img/gui.png)
### Key combinations
To assign a key combination, first choose the modifier key by clicking on Ctrl, Alt, Shift or Win to select its left or right variant, then press the actual command key.

![Main GUI](https://raw.githubusercontent.com/aidygus/LinVAM/master/.img/combination.png)
### Complex commands
You can add multiple actions to a voice command to build complex macros, with an optional pause between actions.
You can also assign mouse movements and system commands if required (e.g. opening applications such as a calculator or browser).

![Main GUI](https://raw.githubusercontent.com/aidygus/LinVAM/master/.img/complex.png)
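
Conceptually, a complex command is an ordered list of actions that runs from top to bottom. The sketch below models that idea. It does not reflect LinVAM's real profile format or executor: the `Action` class, the `kind` labels and `run_macro` are all hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Any


@dataclass
class Action:
    kind: str   # hypothetical labels: "key", "pause" or "command"
    value: Any  # key name, pause length in seconds, or a system command


def run_macro(actions, handlers, sleep=time.sleep):
    """Run actions in order; 'pause' sleeps, any other kind goes to its handler."""
    for action in actions:
        if action.kind == "pause":
            sleep(action.value)
        else:
            handlers[action.kind](action.value)


if __name__ == "__main__":
    # Handlers only print here; a real tool would press keys or start programs.
    demo_handlers = {
        "key": lambda key: print("press", key),
        "command": lambda cmd: print("run", cmd),
    }
    macro = [Action("key", "ctrl+alt+t"), Action("pause", 0.5), Action("command", "gnome-calculator")]
    run_macro(macro, demo_handlers)
```

Passing the handlers and the sleep function in as arguments keeps the sketch testable without touching real input devices.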
### Threshold
As a rough guide, start with a value of 10 for each syllable in the voice command, then lower it to improve accuracy.
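
The rule of thumb can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and the syllable count is a crude estimate (each run of vowels counts as one syllable), so treat the result as a starting point to tune by hand.

```python
import re


def estimate_syllables(phrase):
    """Roughly count syllables by counting runs of vowels in each word."""
    words = re.findall(r"[a-z]+", phrase.lower())
    # Every word counts as at least one syllable, even without a vowel run.
    return sum(max(1, len(re.findall(r"[aeiouy]+", word))) for word in words)


def starting_threshold(phrase, per_syllable=10):
    """Apply the '10 per syllable' guide to get an initial threshold value."""
    return estimate_syllables(phrase) * per_syllable


print(starting_threshold("hello"))            # 20
print(starting_threshold("open calculator"))  # 60
```

The estimate miscounts words with silent vowels (for example, "fire" is counted as two syllables), which is one more reason to adjust the value after testing.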

### Output audio
In the Command Edit Dialog, choose 'Other' and then 'Play sound', then pick the sound you would like to play.
For this to work, copy the audio files you would like to use into the 'voicepacks' folder.
Create a subfolder to hold all your audio files (the voicepack folder), then within that subfolder create as many folders as you like to group your audio files (category folders).
Place the audio files in these category folders or in any subfolder within a category folder.
In theory any audio format should work, but only MP3 files have been tested.

Example:
/voicepacks/my voicepack/custom commands/hello.mp3
/voicepacks/my voicepack/other/thank you.mp3

If you own an HCS voicepack, copy the whole voicepack folder (like 'hcspack', 'hcspack-eden', ...) to the 'voicepacks' folder, so it reads like this:
/voicepacks/hcspack/...
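
The voicepack / category / file layout described above can be checked with a short script. This is a sketch under stated assumptions, not LinVAM's own loader: `list_voicepack_sounds` and `ffplay_command` are hypothetical names, only `.mp3` files are collected by default, and the ffplay flags used (`-nodisp`, `-autoexit`) simply play a file without opening a window and exit when it ends.

```python
from pathlib import Path


def list_voicepack_sounds(voicepacks_dir, extensions=(".mp3",)):
    """Map voicepack -> category -> list of sound paths relative to the category."""
    root = Path(voicepacks_dir)
    sounds = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        parts = path.relative_to(root).parts
        # Expect at least voicepack/category/file; skip files placed higher up.
        if len(parts) < 3:
            continue
        voicepack, category = parts[0], parts[1]
        sounds.setdefault(voicepack, {}).setdefault(category, []).append(
            "/".join(parts[2:])
        )
    return sounds


def ffplay_command(sound_path):
    """Build an ffplay call that plays one file without a window, then exits."""
    return ["ffplay", "-nodisp", "-autoexit", str(sound_path)]
```

Calling `list_voicepack_sounds("voicepacks")` from the LinVAM folder should list every category of every voicepack; a file that does not show up is probably sitting outside a category folder.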

### Improve voice recognition accuracy
Please see this resource on how to train the acoustic model of pocketsphinx to match your voice:
https://cmusphinx.github.io/wiki/tutorialadapt/