5. Command Line Interface

Jim Schwoebel edited this page Aug 20, 2020 · 42 revisions
              AAA               lllllll lllllll   iiii                      
              A:::A              l:::::l l:::::l  i::::i                     
             A:::::A             l:::::l l:::::l   iiii                      
            A:::::::A            l:::::l l:::::l                             
           A:::::::::A            l::::l  l::::l iiiiiii     eeeeeeeeeeee    
          A:::::A:::::A           l::::l  l::::l i:::::i   ee::::::::::::ee  
         A:::::A A:::::A          l::::l  l::::l  i::::i  e::::::eeeee:::::ee
        A:::::A   A:::::A         l::::l  l::::l  i::::i e::::::e     e:::::e
       A:::::A     A:::::A        l::::l  l::::l  i::::i e:::::::eeeee::::::e
      A:::::AAAAAAAAA:::::A       l::::l  l::::l  i::::i e:::::::::::::::::e 
     A:::::::::::::::::::::A      l::::l  l::::l  i::::i e::::::eeeeeeeeeee  
    A:::::AAAAAAAAAAAAA:::::A     l::::l  l::::l  i::::i e:::::::e           
   A:::::A             A:::::A   l::::::ll::::::li::::::ie::::::::e          
  A:::::A               A:::::A  l::::::ll::::::li::::::i e::::::::eeeeeeee  
 A:::::A                 A:::::A l::::::ll::::::li::::::i  ee:::::::::::::e  
AAAAAAA                   AAAAAAAlllllllllllllllliiiiiiii    eeeeeeeeeeeeee  
                                                                             

 _____                                           _   _     _            
/  __ \                                         | | | |   (_)           
| /  \/ ___  _ __ ___  _ __ ___   __ _ _ __   __| | | |    _ _ __   ___ 
| |    / _ \| '_ ` _ \| '_ ` _ \ / _` | '_ \ / _` | | |   | | '_ \ / _ \
| \__/\ (_) | | | | | | | | | | | (_| | | | | (_| | | |___| | | | |  __/
 \____/\___/|_| |_| |_|_| |_| |_|\__,_|_| |_|\__,_| \_____/_|_| |_|\___|
                                                                        
                                                                        
 _____      _             __               
|_   _|    | |           / _|              
  | | _ __ | |_ ___ _ __| |_ __ _  ___ ___ 
  | || '_ \| __/ _ \ '__|  _/ _` |/ __/ _ \
 _| || | | | ||  __/ |  | || (_| | (_|  __/
 \___/_| |_|\__\___|_|  |_| \__,_|\___\___|

Allie has a rich command-line interface (CLI) that exposes many of its API functions. This section of the wiki shows how to use the Allie CLI.

To follow along with these examples, quickly seed some data (51 male files / 51 female files):

cd allie
cd datasets
python3 seed_test.py

Help

To get started, explore the Allie CLI commands by typing:

cd ~ 
cd allie
python3 allie.py -h

This should print the ways you can call the Allie APIs from the command line:

Usage: allie.py [options]

Options:
  -h, --help            show this help message and exit
  --c=command, --command=command
                        the target command (annotate API = 'annotate',
                        augmentation API = 'augment',  cleaning API = 'clean',
                        datasets API = 'data',  features API = 'features',
                        model prediction API = 'predict',  preprocessing API =
                        'transform',  model training API = 'train',  testing
                        API = 'test',  visualize API = 'visualize',
                        list/change default settings = 'settings')
  --p=problemtype, --problemtype=problemtype
                        specify the problem type ('c' = classification or 'r'
                        = regression)
  --s=sampletype, --sampletype=sampletype
                        specify the type files that you'd like to operate on
                        (e.g. 'audio', 'text', 'image', 'video', 'csv')
  --n=common_name, --name=common_name
                        specify the common name for the model (e.g. 'gender'
                        for a male/female problem)
  --i=class_, --class=class_
                        specify the class that you wish to annotate (e.g.
                        'male')
  --d=dir, --dir=dir    an array of the target directory (or directories) that
                        contains sample files for the annotation API,
                        prediction API, features API, augmentation API,
                        cleaning API, and preprocessing API (e.g.
                        '/Users/jim/desktop/allie/train_dir/teens/')
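
The help text above has the layout that Python's optparse module prints, with long-style flags such as --c and --command. As a rough illustration of how flags like these could be declared (a sketch under that assumption, not Allie's actual source; the function name build_parser and the shortened help strings are made up here), note how action="append" lets --dir be repeated:

```python
from optparse import OptionParser


def build_parser():
    """Build a parser whose flags mirror the help text above (illustrative only)."""
    parser = OptionParser()
    parser.add_option("--c", "--command", dest="command", metavar="command",
                      help="the target command, e.g. 'annotate' or 'train'")
    parser.add_option("--p", "--problemtype", dest="problemtype", metavar="problemtype",
                      help="'c' = classification or 'r' = regression")
    parser.add_option("--s", "--sampletype", dest="sampletype", metavar="sampletype",
                      help="e.g. 'audio', 'text', 'image', 'video', 'csv'")
    parser.add_option("--n", "--name", dest="common_name", metavar="common_name",
                      help="common name for the model, e.g. 'gender'")
    parser.add_option("--i", "--class", dest="class_", metavar="class_",
                      help="class to annotate, e.g. 'male'")
    # action="append" collects every --dir occurrence into one list.
    parser.add_option("--d", "--dir", dest="dir", metavar="dir", action="append",
                      help="target directory; repeat the flag for several directories")
    return parser


# Parse an example argument list instead of sys.argv so the sketch runs anywhere.
options, _ = build_parser().parse_args([
    "--command", "augment", "--sampletype", "audio",
    "--dir", "/tmp/males", "--dir", "/tmp/females",
])
print(options.command, options.sampletype, options.dir)
```

Each repeated --dir ends up in the same list, which is why the commands later on this page can name several folders in one call.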

Classification problem

You can annotate a folder of audio files as a classification problem, here with the label male, using a command like this:

python3 allie.py --command annotate --sampletype audio --problemtype c --class male --dir /Users/jim/desktop/allie/train_dir/males

Allie then plays back each audio file and asks you to label it for the specified class:


  0%|                                                    | 0/51 [00:00<?, ?it/s]playing file... 16.WAV

16.wav:

 File Size: 137k      Bit Rate: 256k
  Encoding: Signed PCM    
  Channels: 1 @ 16-bit   
Samplerate: 16000Hz      
Replaygain: off         
  Duration: 00:00:04.29  

In:100%  00:00:04.29 [00:00:00.00] Out:189k  [      |      ] Hd:5.9 Clip:0    
Done.
MALE label 1 (yes) or 0 (no)?
yes
error annotating, annotating again...
error - file 16.wav not recognized
  2%|▊                                           | 1/51 [00:07<06:09,  7.39s/it]playing file... 17.WAV

17.wav:

 File Size: 229k      Bit Rate: 256k
  Encoding: Signed PCM    
  Channels: 1 @ 16-bit   
Samplerate: 16000Hz      
Replaygain: off         
  Duration: 00:00:07.17  

In:100%  00:00:07.17 [00:00:00.00] Out:316k  [      |      ]        Clip:0    
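
The prompt asks for 1 or 0, and in the transcript above the answer yes is followed by an annotation error, which suggests that only the numeric answers are accepted. A minimal sketch of that kind of input check, using a hypothetical helper parse_binary_label that is not part of Allie:

```python
def parse_binary_label(answer):
    """Return 1 or 0 for a classification answer, or raise ValueError."""
    cleaned = answer.strip()
    if cleaned not in ("1", "0"):
        raise ValueError(f"expected 1 (yes) or 0 (no), got {answer!r}")
    return int(cleaned)


# Try a few answers, including the one that failed in the transcript.
for raw in ("1", " 0 ", "yes"):
    try:
        print(repr(raw), "->", parse_binary_label(raw))
    except ValueError as err:
        print(repr(raw), "->", err)
```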

Regression problem

To switch to a regression problem, change the problem type to r (--problemtype r) and set the class (--i) to a regression target (e.g. age):

python3 allie.py --command annotate --sampletype audio --problemtype r --i age --dir /Users/jim/desktop/allie/train_dir/males

This similarly allows you to annotate for regression problems:

  0%|                                                    | 0/51 [00:00<?, ?it/s]playing file... 16.WAV

16.wav:

 File Size: 137k      Bit Rate: 256k
  Encoding: Signed PCM    
  Channels: 1 @ 16-bit   
Samplerate: 16000Hz      
Replaygain: off         
  Duration: 00:00:04.29  

In:100%  00:00:04.29 [00:00:00.00] Out:189k  [      |      ] Hd:5.9 Clip:0    
Done.
AGE value?
50
[{'age': {'value': 50.0, 'datetime': '2020-08-07 12:22:06.180569', 'filetype': 'audio', 'file': '16.wav', 'problemtype': 'r', 'annotate_dir': '/Users/jim/desktop/allie/train_dir/males'}}]
  2%|▊                                           | 1/51 [00:11<09:37, 11.55s/it]playing file... 17.WAV

17.wav:

 File Size: 229k      Bit Rate: 256k
  Encoding: Signed PCM    
  Channels: 1 @ 16-bit   
Samplerate: 16000Hz      
Replaygain: off         
  Duration: 00:00:07.17  

In:100%  00:00:07.17 [00:00:00.00] Out:316k  [      |      ]        Clip:0    
Done.
AGE value?
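
Each answer is echoed back as a list holding one dictionary keyed by the label name, as in the age record above. The sketch below reads that printed record back into Python and pulls out the file, label, and value. The helper name summarize is hypothetical, and the sketch assumes other records share the same shape:

```python
import ast

# Record copied verbatim from the transcript above.
printed = (
    "[{'age': {'value': 50.0, 'datetime': '2020-08-07 12:22:06.180569', "
    "'filetype': 'audio', 'file': '16.wav', 'problemtype': 'r', "
    "'annotate_dir': '/Users/jim/desktop/allie/train_dir/males'}}]"
)


def summarize(records):
    """Yield (file, label name, value) for each annotation entry."""
    for record in records:
        for label, details in record.items():
            yield details["file"], label, details["value"]


# literal_eval safely parses the Python-literal text without executing code.
records = ast.literal_eval(printed)
rows = list(summarize(records))
print(rows)
```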

You can augment data like this via the default_augmentation settings:

python3 allie.py --command augment --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males --dir /Users/jim/desktop/allie/train_dir/females

You now have an augmented set of files in both directories:

males:   0%|                                             | 0/52 [00:00<?, ?it/s](87495,)
males:   2%|▋                                    | 1/52 [00:00<00:46,  1.09it/s](88906,)
males:   4%|█▍                                   | 2/52 [00:01<00:34,  1.44it/s](94551,)
males:   6%|██▏                                  | 3/52 [00:01<00:26,  1.87it/s](90317,)
males:   8%|██▊                                  | 4/52 [00:01<00:20,  2.38it/s](90317,)
males:  10%|███▌                                 | 5/52 [00:01<00:16,  2.79it/s](158055,)
males:  12%|████▎                                | 6/52 [00:02<00:16,  2.73it/s](114308,)
males:  13%|████▉                                | 7/52 [00:02<00:15,  2.82it/s](104429,)
males:  15%|█████▋                               | 8/52 [00:02<00:14,  2.98it/s](104429,)
males:  17%|██████▍                              | 9/52 [00:02<00:12,  3.37it/s](129831,)
males:  19%|██████▉                             | 10/52 [00:03<00:12,  3.38it/s](228615,)
males:  21%|███████▌                            | 11/52 [00:03<00:13,  3.08it/s](103018,)
males:  23%|████████▎                           | 12/52 [00:03<00:13,  2.96it/s](101607,)
males:  25%|█████████                           | 13/52 [00:04<00:12,  3.09it/s](87495,)
males:  27%|█████████▋                          | 14/52 [00:04<00:10,  3.54it/s](94551,)
males:  29%|██████████▍                         | 15/52 [00:04<00:09,  3.75it/s](129831,)
males:  31%|███████████                         | 16/52 [00:04<00:09,  3.81it/s](91728,)
males:  33%|███████████▊                        | 17/52 [00:05<00:08,  4.30it/s](198980,)
males:  35%|████████████▍                       | 18/52 [00:05<00:08,  3.85it/s](143943,)
males:  37%|█████████████▏                      | 19/52 [00:05<00:08,  3.76it/s](124186,)
males:  38%|█████████████▊                      | 20/52 [00:05<00:08,  3.82it/s](114308,)
males:  40%|██████████████▌                     | 21/52 [00:06<00:07,  3.93it/s](107252,)
males:  42%|███████████████▏                    | 22/52 [00:06<00:07,  4.25it/s](97373,)
males:  44%|███████████████▉                    | 23/52 [00:06<00:06,  4.62it/s](541901,)
males:  46%|████████████████▌                   | 24/52 [00:07<00:12,  2.28it/s](203213,)
males:  48%|█████████████████▎                  | 25/52 [00:07<00:11,  2.39it/s](214503,)
males:  50%|██████████████████                  | 26/52 [00:08<00:09,  2.61it/s](94551,)
males:  52%|██████████████████▋                 | 27/52 [00:08<00:08,  3.08it/s](111485,)
males:  54%|███████████████████▍                | 28/52 [00:08<00:07,  3.29it/s](303408,)
males:  56%|████████████████████                | 29/52 [00:09<00:08,  2.71it/s](155232,)
males:  58%|████████████████████▊               | 30/52 [00:09<00:07,  3.02it/s](94551,)
males:  60%|█████████████████████▍              | 31/52 [00:09<00:06,  3.45it/s](90317,)
males:  62%|██████████████████████▏             | 32/52 [00:09<00:05,  3.84it/s](117130,)
males:  63%|██████████████████████▊             | 33/52 [00:09<00:04,  4.05it/s](128420,)
males:  65%|███████████████████████▌            | 34/52 [00:10<00:04,  3.94it/s](115719,)
males:  67%|████████████████████████▏           | 35/52 [00:10<00:04,  3.97it/s](134064,)
males:  69%|████████████████████████▉           | 36/52 [00:10<00:04,  3.70it/s](152410,)
males:  71%|█████████████████████████▌          | 37/52 [00:10<00:03,  3.79it/s](145354,)
males:  73%|██████████████████████████▎         | 38/52 [00:11<00:03,  3.64it/s](90317,)
males:  75%|███████████████████████████         | 39/52 [00:11<00:03,  3.82it/s](108663,)
males:  77%|███████████████████████████▋        | 40/52 [00:11<00:02,  4.13it/s](119952,)
males:  79%|████████████████████████████▍       | 41/52 [00:11<00:02,  4.02it/s](108663,)
males:  81%|█████████████████████████████       | 42/52 [00:12<00:02,  4.35it/s](115719,)
males:  83%|█████████████████████████████▊      | 43/52 [00:12<00:02,  4.26it/s](124186,)
males:  85%|██████████████████████████████▍     | 44/52 [00:12<00:01,  4.51it/s](94551,)
males:  87%|███████████████████████████████▏    | 45/52 [00:12<00:01,  4.72it/s](136887,)
males:  88%|███████████████████████████████▊    | 46/52 [00:13<00:01,  4.54it/s](136887,)
males:  90%|████████████████████████████████▌   | 47/52 [00:13<00:01,  4.41it/s](121364,)
males:  92%|█████████████████████████████████▏  | 48/52 [00:13<00:00,  4.64it/s](403604,)
males:  94%|█████████████████████████████████▉  | 49/52 [00:14<00:01,  2.06it/s](94551,)
males:  96%|██████████████████████████████████▌ | 50/52 [00:15<00:00,  2.09it/s](396548,)
males: 100%|████████████████████████████████████| 52/52 [00:16<00:00,  3.15it/s]

females:   0%|                                           | 0/51 [00:00<?, ?it/s](208858,)
females:   2%|▋                                  | 1/51 [00:01<00:50,  1.01s/it](224381,)
females:   4%|█▎                                 | 2/51 [00:01<00:39,  1.23it/s](90317,)
females:   6%|██                                 | 3/51 [00:01<00:30,  1.59it/s](156644,)
females:   8%|██▋                                | 4/51 [00:01<00:24,  1.89it/s](598349,)
females:  10%|███▍                               | 5/51 [00:02<00:30,  1.50it/s](93140,)
females:  12%|████                               | 6/51 [00:03<00:23,  1.89it/s](248372,)
females:  14%|████▊                              | 7/51 [00:03<00:21,  2.02it/s](129831,)
females:  16%|█████▍                             | 8/51 [00:03<00:18,  2.37it/s](196157,)
females:  18%|██████▏                            | 9/51 [00:04<00:17,  2.47it/s](213092,)
females:  20%|██████▋                           | 10/51 [00:04<00:15,  2.60it/s](107252,)
females:  22%|███████▎                          | 11/51 [00:04<00:12,  3.12it/s](104429,)
females:  24%|████████                          | 12/51 [00:04<00:11,  3.45it/s](129831,)
females:  25%|████████▋                         | 13/51 [00:05<00:09,  3.82it/s](118541,)
females:  27%|█████████▎                        | 14/51 [00:05<00:09,  4.07it/s](98784,)
females:  29%|██████████                        | 15/51 [00:05<00:08,  4.19it/s](103018,)
females:  31%|██████████▋                       | 16/51 [00:05<00:08,  4.22it/s](90317,)
females:  33%|███████████▎                      | 17/51 [00:05<00:07,  4.31it/s](249783,)
females:  35%|████████████                      | 18/51 [00:06<00:08,  3.85it/s](124186,)
females:  37%|████████████▋                     | 19/51 [00:06<00:08,  3.88it/s](324576,)
females:  39%|█████████████▎                    | 20/51 [00:06<00:09,  3.13it/s](143943,)
females:  41%|██████████████                    | 21/51 [00:07<00:09,  3.22it/s](93140,)
females:  43%|██████████████▋                   | 22/51 [00:07<00:07,  3.73it/s](153821,)
females:  45%|███████████████▎                  | 23/51 [00:07<00:07,  3.62it/s](156644,)
females:  47%|████████████████                  | 24/51 [00:07<00:07,  3.60it/s](321754,)
females:  49%|████████████████▋                 | 25/51 [00:08<00:08,  3.06it/s](589882,)
females:  51%|█████████████████▎                | 26/51 [00:09<00:12,  2.01it/s](242727,)
females:  53%|██████████████████                | 27/51 [00:09<00:11,  2.09it/s](93140,)
females:  55%|██████████████████▋               | 28/51 [00:09<00:09,  2.55it/s](104429,)
females:  57%|███████████████████▎              | 29/51 [00:10<00:07,  2.90it/s](235671,)
females:  59%|████████████████████              | 30/51 [00:10<00:07,  2.86it/s](101607,)
females:  61%|████████████████████▋             | 31/51 [00:10<00:06,  3.30it/s](87495,)
females:  63%|█████████████████████▎            | 32/51 [00:10<00:05,  3.80it/s](101607,)
females:  65%|██████████████████████            | 33/51 [00:11<00:04,  4.22it/s](122775,)
females:  67%|██████████████████████▋           | 34/51 [00:11<00:04,  4.23it/s](101607,)
females:  69%|███████████████████████▎          | 35/51 [00:11<00:03,  4.50it/s](91728,)
females:  71%|████████████████████████          | 36/51 [00:11<00:03,  4.59it/s](98784,)
females:  73%|████████████████████████▋         | 37/51 [00:11<00:03,  4.62it/s](87495,)
females:  75%|█████████████████████████▎        | 38/51 [00:12<00:02,  5.01it/s](166522,)
females:  76%|██████████████████████████        | 39/51 [00:12<00:02,  4.08it/s](134064,)
females:  78%|██████████████████████████▋       | 40/51 [00:12<00:03,  3.43it/s](118541,)
females:  80%|███████████████████████████▎      | 41/51 [00:13<00:03,  3.26it/s](149588,)
females:  82%|████████████████████████████      | 42/51 [00:13<00:02,  3.37it/s](206036,)
females:  84%|████████████████████████████▋     | 43/51 [00:13<00:02,  3.28it/s](87495,)
females:  86%|█████████████████████████████▎    | 44/51 [00:13<00:01,  3.76it/s](211680,)
females:  88%|██████████████████████████████    | 45/51 [00:14<00:01,  3.45it/s](325988,)
females:  90%|██████████████████████████████▋   | 46/51 [00:14<00:01,  2.55it/s](241316,)
females:  92%|███████████████████████████████▎  | 47/51 [00:15<00:01,  2.71it/s](87495,)
females:  94%|████████████████████████████████  | 48/51 [00:15<00:00,  3.20it/s](90317,)
females:  96%|████████████████████████████████▋ | 49/51 [00:15<00:00,  3.63it/s](90317,)
females:  98%|█████████████████████████████████▎| 50/51 [00:15<00:00,  4.10it/s](101607,)
females: 100%|██████████████████████████████████| 51/51 [00:15<00:00,  3.20it/s]
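
The augmentation command above handles two folders in one call by repeating --dir. If you want to assemble such calls from a script, a small helper can build the argument list. This is only a sketch: allie_command is a hypothetical name, and it assumes allie.py is run with python3 from the allie folder as in the examples on this page.

```python
import shlex


def allie_command(command, sampletype, dirs, script="allie.py"):
    """Return the argv list for one Allie CLI call, repeating --dir per folder."""
    argv = ["python3", script, "--command", command, "--sampletype", sampletype]
    for folder in dirs:
        argv += ["--dir", folder]
    return argv


argv = allie_command(
    "augment",
    "audio",
    [
        "/Users/jim/desktop/allie/train_dir/males",
        "/Users/jim/desktop/allie/train_dir/females",
    ],
)
# Print the command as a shell-safe string for inspection.
print(shlex.join(argv))
# To actually run it from the allie folder: subprocess.run(argv, check=True)
```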

You can clean data like this via the default cleaning settings:

python3 allie.py --command clean --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males --dir /Users/jim/desktop/allie/train_dir/females

You now have a set of cleaned files in both directories.

males:   0%|                                            | 0/102 [00:00<?, ?it/s]ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '6ab789e1-5994-4796-b4ab-63ff9f20cf09.wav':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:04.10, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '6ab789e1-5994-4796-b4ab-63ff9f20cf09_cleaned.wav':
  Metadata:
    ISFT            : Lavf58.45.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
size=     128kB time=00:00:04.09 bitrate= 256.2kbits/s speed=2.32e+03x    
video:0kB audio:128kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.059509%
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Guessed Channel Layout for Input Stream #0.0 : 5.0
Input #0, wav, from '8caa6f96-04e8-48ee-8817-8b9da97734b2.wav':
  Duration: 00:00:08.00, bitrate: 1764 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, 5.0, s16, 1764 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '8caa6f96-04e8-48ee-8817-8b9da97734b2_cleaned.wav':
  Metadata:
    ISFT            : Lavf58.45.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
size=     250kB time=00:00:08.00 bitrate= 256.1kbits/s speed= 276x    
video:0kB audio:250kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.030469%
males:   2%|▋                                   | 2/102 [00:00<00:09, 10.18it/s]ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '49b96c77-5971-409a-9cb0-a2abfd9b1f37.wav':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:04.74, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '49b96c77-5971-409a-9cb0-a2abfd9b1f37_cleaned.wav':
  Metadata:
    ISFT            : Lavf58.45.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
size=     148kB time=00:00:04.73 bitrate= 256.1kbits/s speed=3.48e+03x    
video:0kB audio:148kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.051467%
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '06499054-2859-4861-88d8-841fbaec0365.wav':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:06.21, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '06499054-2859-4861-88d8-841fbaec0365_cleaned.wav':
  Metadata:
    ISFT            : Lavf58.45.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
size=     194kB time=00:00:06.20 bitrate= 256.1kbits/s speed=4.28e+03x    
video:0kB audio:194kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.039264%
males:   4%|█▍                                  | 4/102 [00:00<00:08, 11.29it/s]ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack

...
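
In the ffmpeg output above, each input file is written to a new file with a _cleaned suffix as 16000 Hz mono pcm_s16le audio, including the 22050 Hz 5.0-channel input. A comparable conversion can be sketched with standard ffmpeg options. The helper names below are hypothetical, and the exact arguments Allie passes to ffmpeg are not shown in the log, so treat the flag choice as an assumption:

```python
from pathlib import Path


def cleaned_name(wav_path):
    """Return the output path with the _cleaned suffix seen in the log."""
    path = Path(wav_path)
    return path.with_name(f"{path.stem}_cleaned{path.suffix}")


def ffmpeg_clean_command(wav_path):
    """Build an ffmpeg call that resamples to 16 kHz mono 16-bit PCM."""
    return [
        "ffmpeg", "-i", str(wav_path),
        "-ac", "1",              # one channel (mono)
        "-ar", "16000",          # 16000 Hz sample rate
        "-acodec", "pcm_s16le",  # signed 16-bit little-endian PCM
        str(cleaned_name(wav_path)),
    ]


# Only builds and prints the command; pass it to subprocess.run to execute it.
print(ffmpeg_clean_command("8caa6f96-04e8-48ee-8817-8b9da97734b2.wav"))
```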

You can call the Allie datasets API with this command:

python3 allie.py --command data

You can then download a dataset by following the instructions on its website. Download procedures differ from dataset to dataset, so for now Allie only takes you to the relevant website and leaves the download itself to you. In future versions of Allie, these datasets should be downloadable directly through an API.

/usr/local/lib/python3.7/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
  warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
what dataset would you like to download? (1-audio, 2-text, 3-image, 4-video, 5-csv)
1
found 34 datasets...
----------------------------
here are the available AUDIO datasets
----------------------------
TIMIT dataset
Parkinson's speech dataset
ISOLET Data Set
AudioSet
Multimodal EmotionLines Dataset (MELD)
Free Spoken Digit Dataset
Speech Accent Archive
2000 HUB5 English
Emotional Voice dataset - Nature
LJ Speech
VoxForge
Million Song Dataset
Free Music Archive
Common Voice
Spoken Commands dataset
Bird audio detection challenge
Environmental audio dataset
Urban Sound Dataset
Ted-LIUM
Noisy Dataset
Librispeech
Emotional Voices Database
CMU Wilderness
Arabic Speech Corpus
Flickr Audio Caption
CHIME
Tatoeba
Freesound dataset
Spoken Wikipeida Corpora
Karoldvl-ESC
Zero Resource Speech Challenge
Speech Commands Dataset
Persian Consonant Vowel Combination (PCVC) Speech Dataset
VoxCeleb
what audio dataset would you like to download?
Speech Commmands
found dataset: Speech Commands Dataset
-speech-commands-dataset.html) - The dataset (1.4 GB) has 65,000 one-second long utterances of 30 short words, by thousands of different people, contributed by members of the public through the AIY website.
just confirming, do you want to download the Speech Commands Dataset dataset? (Y - yes, N - no) 
yes
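
The session above accepts the misspelled input "Speech Commmands" and still finds "Speech Commands Dataset". The warning at the top of the log suggests Allie relies on fuzzywuzzy for this; the sketch below is not Allie's code. It uses the standard-library difflib instead, and the function name `find_dataset`, the shortened dataset list, and the 0.6 cutoff are assumptions made for illustration.

```python
import difflib

# A few names copied from the audio list above; the real catalog is larger.
AUDIO_DATASETS = [
    "TIMIT dataset",
    "Common Voice",
    "Spoken Commands dataset",
    "Librispeech",
    "Speech Commands Dataset",
    "VoxCeleb",
]


def find_dataset(query, names, cutoff=0.6):
    """Return the catalog name closest to `query`, or None if nothing is close.

    Hypothetical helper: the comparison is case-insensitive and the cutoff
    is an arbitrary similarity threshold between 0 and 1.
    """
    lowered = {name.lower(): name for name in names}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


match = find_dataset("Speech Commmands", AUDIO_DATASETS)
print(f"found dataset: {match}")
```

A query that resembles nothing in the list returns None, which a caller could use to re-prompt the user.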

You can featurize data in the same way as you augment and clean it:

python3 allie.py --command features --sampletype audio --dir /Users/jim/desktop/allie/train_dir/males --dir /Users/jim/desktop/allie/train_dir/females

This featurizes both folders with the default_audio_features listed in settings.json.
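
As a rough illustration of how a sample type could map to its featurizers, the sketch below reads a settings excerpt and looks up the `default_<sampletype>_features` key. The key names and values are taken from the settings printed in the prediction output further down this page. The helper `default_featurizers` and the inline JSON excerpt are hypothetical and may not match how Allie actually reads settings.json.

```python
import json

# Excerpt only: keys and values are copied from the settings shown in the
# prediction output on this page; the real settings.json has more entries.
SETTINGS_JSON = """
{
  "default_audio_features": ["librosa_features"],
  "default_text_features": ["nltk_features"],
  "default_image_features": ["image_features"]
}
"""


def default_featurizers(settings, sampletype):
    """Look up the featurizers configured for one sample type (hypothetical)."""
    key = f"default_{sampletype}_features"
    if key not in settings:
        raise KeyError(f"no {key} entry in settings")
    return settings[key]


settings = json.loads(SETTINGS_JSON)
audio_featurizers = default_featurizers(settings, "audio")
print(audio_featurizers)
```

The terminal output for the featurization command follows.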

males:   0%|                                            | 0/102 [00:00<?, ?it/s]deepspeech_dict transcribing: 17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.wav
--2020-08-07 12:29:42--  https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/db3b3f80-84bd-11ea-93d7-1ddb76a21efe?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T162942Z&X-Amz-Expires=300&X-Amz-Signature=75a04415e8839d00e611a7414d420fc4a1a465de88a93e80e9417ee7e55c4325&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.pbmm&response-content-type=application%2Foctet-stream [following]
--2020-08-07 12:29:42--  https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/db3b3f80-84bd-11ea-93d7-1ddb76a21efe?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T162942Z&X-Amz-Expires=300&X-Amz-Signature=75a04415e8839d00e611a7414d420fc4a1a465de88a93e80e9417ee7e55c4325&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.pbmm&response-content-type=application%2Foctet-stream
Resolving github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)... 52.216.204.83
Connecting to github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|52.216.204.83|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 188916323 (180M) [application/octet-stream]
Saving to: ‘deepspeech-0.7.0-models.pbmm’

deepspeech-0.7.0-mo 100%[===================>] 180.16M  6.88MB/s    in 18s     

2020-08-07 12:30:00 (9.94 MB/s) - ‘deepspeech-0.7.0-models.pbmm’ saved [188916323/188916323]

--2020-08-07 12:30:00--  https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/49dcc500-84df-11ea-9cb6-ec1d98c50dd4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T163001Z&X-Amz-Expires=300&X-Amz-Signature=ed079c1be3b63caf76b2daf1ad6d62537d0e4a6aa856c2428995993484bd2872&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.scorer&response-content-type=application%2Foctet-stream [following]
--2020-08-07 12:30:01--  https://github-production-release-asset-2e65be.s3.amazonaws.com/60273704/49dcc500-84df-11ea-9cb6-ec1d98c50dd4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200807%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200807T163001Z&X-Amz-Expires=300&X-Amz-Signature=ed079c1be3b63caf76b2daf1ad6d62537d0e4a6aa856c2428995993484bd2872&X-Amz-SignedHeaders=host&actor_id=0&repo_id=60273704&response-content-disposition=attachment%3B%20filename%3Ddeepspeech-0.7.0-models.scorer&response-content-type=application%2Foctet-stream
Resolving github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)... 52.216.236.67
Connecting to github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|52.216.236.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 953363776 (909M) [application/octet-stream]
Saving to: ‘deepspeech-0.7.0-models.scorer’

deepspeech-0.7.0-mo 100%[===================>] 909.20M  10.4MB/s    in 88s     

2020-08-07 12:31:29 (10.3 MB/s) - ‘deepspeech-0.7.0-models.scorer’ saved [953363776/953363776]

ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.wav':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:00:02.00, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned_newaudio.wav':
  Metadata:
    ISFT            : Lavf58.45.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
size=      63kB time=00:00:02.00 bitrate= 256.3kbits/s speed=1.24e+03x    
video:0kB audio:62kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.121875%
deepspeech --model /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm --scorer /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer --audio "17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned_newaudio.wav" >> "17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.txt"
Loading model from file /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm
TensorFlow: v1.15.0-24-gceb46aae58
DeepSpeech: v0.7.4-0-gfcd9563f
2020-08-07 12:31:30.122614: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.0155s.
Loading scorer from files /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer
Loaded scorer in 0.00188s.
Running inference.
Inference took 2.470s for 2.000s audio file.
DEEPSPEECH_DICT
--> 
librosa featurizing: 17ebdf90-b6dc-4940-85c3-055e3f0c5e9a_cleaned.wav
/usr/local/lib/python3.7/site-packages/librosa/beat.py:306: DeprecationWarning: np.asscalar(a) is deprecated since NumPy v1.16, use a.item() instead
  hop_length=hop_length))
[15.0, 44.8, 27.914631766632116, 82.0, 3.0, 52.0, 143.5546875, 1.236462812624379, 0.7251315935053164, 3.334862198133343, 0.0, 1.0859751751040547, 1.0, 0.0, 1.0, 1.0, 1.0, 0.9021154304582399, 0.011871022692166161, 0.9248579351103645, 0.8845252162754503, 0.9007761885347338, 0.8005566025338086, 0.02351835274803656, 0.846290326519876, 0.7666702190624551, 0.7974848758054224, 0.765299804936602, 0.028837831871710528, 0.8210615152089579, 0.723447224850167, 0.7616938878984344, 0.7718090402633874, 0.030151369356669896, 0.8289804789632119, 0.7266534111141562, 0.7686879868973036, 0.7936400196140749, 0.031036953487073464, 0.8507660198788851, 0.7451751937203362, 0.7913818436015906, 0.774629021009383, 0.03215688854479966, 0.8344764767038448, 0.7251595846469461, 0.7719273208794816, 0.7428815548766532, 0.035011430707789774, 0.8084967674473139, 0.6894574657533005, 0.7397077230156677, 0.7471335383622011, 0.03463006342656048, 0.811789409800113, 0.6939570618543236, 0.7441413516783196, 0.7610523695125583, 0.033862478442252354, 0.8235134515207413, 0.7081080052043068, 0.7585638476404928, 0.789820226591492, 0.03384767249624557, 0.8500833679586846, 0.7343945280617632, 0.7885333494859879, 0.8059179015451146, 0.03157630618281619, 0.8615948245740482, 0.7535417225113215, 0.8050269828120753, 0.7935417840439638, 0.031145337061902156, 0.8492406377587695, 0.7427011806455776, 0.7922497709713059, -381.8200653053766, 23.009903107621383, -321.89837910119815, -441.6259904054828, -379.3488122081372, 149.04439804929382, 15.356164419049225, 172.85739766597234, 110.28800952925451, 153.08285883743733, -32.84528000164682, 13.009141709326732, -2.154076829327365, -64.91296682470796, -30.99198128144861, 40.70623621978138, 17.548974836043755, 73.18507958780387, 8.337746078102892, 40.63827945609428, -52.86069958238985, 13.478379908189092, -27.553997955729045, -87.5715612441206, -51.58811003068236, 31.42738418944771, 6.795009930398713, 49.46758858300626, 18.299603573231376, 31.6997571738992, -35.82303959204243, 
8.486268198834747, -14.57639089253998, -55.40606622608898, -34.90037102114016, 15.955209103884254, 8.934103499373093, 44.50048758909077, -5.494667426263748, 15.650980212978268, -17.338170873356056, 5.727612678025376, -3.1374263092378176, -33.480526476176806, -17.446097772263684, -0.4383376230378039, 6.2672128421452875, 15.720566519612913, -14.330033302127145, -0.244906066419725, -1.467000393875816, 6.138683208427911, 15.463481385114175, -15.812384056333133, -2.0209526024605786, -6.972125329645311, 4.816668688995419, 6.338229281172403, -17.349015718809397, -6.496008401327131, 6.688298302126343, 6.351559382372022, 20.66368904480788, -9.92049214526477, 7.446377744032864, -3.146738423029468e-05, 1.0080761356243433e-05, -1.2325730935412251e-05, -6.256804392224112e-05, -3.133272391000924e-05, 0.25608241330935894, 0.07833864054721744, 0.5017299271817217, 0.1071248558508767, 0.2536783052990988, 1698.1068839811428, 247.34317762284775, 2332.782952940378, 1298.7956436686768, 1652.8623945849333, 1916.493769636128, 217.42031398082, 2404.6618216796505, 1537.6071015613695, 1882.6348073434424, 15.195013434264473, 4.030761691034897, 26.309048213128285, 5.981068616687288, 15.426375628186392, 0.0004132288449909538, 0.000528702512383461, 0.004117688629776239, 7.970369915710762e-05, 0.0002881725667975843, 3901.964911099138, 915.5047430674098, 5792.431640625, 2196.38671875, 3757.5439453125, 0.0726977819683908, 0.013766841258384812, 0.11962890625, 0.03857421875, 0.0712890625, 0.009517103433609009, 0.0026407463010400534, 0.01786264032125473, 0.004661164246499538, 0.009199550375342369]
males:   1%|▎                                | 1/102 [01:51<3:08:13, 111.82s/it]deepspeech_dict transcribing: d2a57cd6-f757-435d-9768-cac1667f79e1_cleaned.wav
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with Apple clang version 11.0.3 (clang-1103.0.32.62)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
  libpostproc    55.  7.100 / 55.  7.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'd2a57cd6-f757-435d-9768-cac1667f79e1_cleaned.wav':
  Metadata:
    encoder         : Lavf58.45.100
  Duration: 00:00:08.00, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'd2a57cd6-f757-435d-9768-cac1667f79e1_cleaned_newaudio.wav':
  Metadata:
    ISFT            : Lavf58.45.100
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s
    Metadata:
      encoder         : Lavc58.91.100 pcm_s16le
size=     250kB time=00:00:08.00 bitrate= 256.1kbits/s speed=1.37e+03x    
video:0kB audio:250kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.030469%
deepspeech --model /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm --scorer /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer --audio "d2a57cd6-f757-435d-9768-cac1667f79e1_cleaned_newaudio.wav" >> "d2a57cd6-f757-435d-9768-cac1667f79e1_cleaned.txt"
Loading model from file /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.pbmm
TensorFlow: v1.15.0-24-gceb46aae58
DeepSpeech: v0.7.4-0-gfcd9563f
2020-08-07 12:31:34.542205: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.0235s.
Loading scorer from files /Users/jim/Desktop/allie/allie/features/audio_features/helpers/deepspeech-0.7.0-models.scorer
Loaded scorer in 0.000517s.
Running inference.
...
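
The long list of numbers printed by the librosa featurizer is easier to read next to the labels shown in the prediction output further down this page, where each per-frame measurement appears with the suffixes `_mean`, `_std`, `_maxv`, `_minv`, and `_median`. The sketch below shows one way such a summary could be computed with the standard library. It is an illustration rather than Allie's implementation: the helper `summarize`, the sample values, and the choice of population standard deviation are assumptions.

```python
import statistics


def summarize(name, values):
    """Collapse one per-frame series into five labeled statistics.

    Hypothetical helper: the suffix order mirrors the labels on this page;
    the population standard deviation is an assumption.
    """
    stats = {
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
        "maxv": max(values),
        "minv": min(values),
        "median": statistics.median(values),
    }
    labels = [f"{name}_{suffix}" for suffix in stats]
    return labels, list(stats.values())


frames = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up per-frame values
labels, features = summarize("mfcc_0", frames)
for label, value in zip(labels, features):
    print(label, value)
```

Repeating this for every measurement and concatenating the results gives a fixed-length vector regardless of how long the audio file is.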

You can train machine learning models quickly with:

python3 allie.py --command train

This trains models based on your answers to the CLI prompts. Because we have already featurized the folders, the modeling process will be faster.

is this a classification (c) or regression (r) problem? 
c
what problem are you solving? (1-audio, 2-text, 3-image, 4-video, 5-csv)
1

 OK cool, we got you modeling audio files 

how many classes would you like to model? (2 available) 
2
these are the available classes: 
['females', 'males']
what is class #1 
males
what is class #2 
females
what is the 1-word common name for the problem you are working on? (e.g. gender for male/female classification) 
gender
-----------------------------------
          LOADING MODULES          
-----------------------------------
Requirement already satisfied: scikit-learn==0.22.2.post1 in /usr/local/lib/python3.7/site-packages (0.22.2.post1)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (0.15.1)
Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.4.1)
Requirement already satisfied: numpy>=1.11.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.18.4)
WARNING: You are using pip version 20.2; however, version 20.2.1 is available.
You should consider upgrading via the '/usr/local/opt/python/bin/python3.7 -m pip install --upgrade pip' command.
-----------------------------------
______ _____  ___ _____ _   _______ _____ ___________ _   _ _____ 
|  ___|  ___|/ _ \_   _| | | | ___ \_   _|___  /_   _| \ | |  __ \
| |_  | |__ / /_\ \| | | | | | |_/ / | |    / /  | | |  \| | |  \/
|  _| |  __||  _  || | | | | |    /  | |   / /   | | | . ` | | __ 
| |   | |___| | | || | | |_| | |\ \ _| |_./ /____| |_| |\  | |_\ \
\_|   \____/\_| |_/\_/  \___/\_| \_|\___/\_____/\___/\_| \_/\____/
                                                                  
                                                                  
______  ___ _____ ___  
|  _  \/ _ \_   _/ _ \ 
| | | / /_\ \| |/ /_\ \
| | | |  _  || ||  _  |
| |/ /| | | || || | | |
|___/ \_| |_/\_/\_| |_/
                       
                       

-----------------------------------
-----------------------------------
           FEATURIZING MALES
-----------------------------------
males:   0%|                                             | 0/51 [00:00<?, ?it/s]librosa featurizing: 38.wav

...
... [skipping a lot of output in terminal]
...

WARNING: You are using pip version 20.2; however, version 20.2.1 is available.
You should consider upgrading via the '/usr/local/opt/python/bin/python3.7 -m pip install --upgrade pip' command.
Warning: xgboost.XGBClassifier is not available and will not be used by TPOT.
Generation 1 - Current best internal CV score: 0.8882352941176471               
Generation 2 - Current best internal CV score: 0.8882352941176471     
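
The trained model is stored with evaluation metrics, which appear in the prediction output later on this page (for example `'confusion_matrix': [[5, 1], [2, 2]]` with `'accuracy': 0.7`). As a worked check, the sketch below recomputes several of those numbers from the confusion matrix. It assumes the scikit-learn layout (rows are true classes, columns are predicted classes) and treats the second class as the positive one; the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(confusion):
    """Derive common metrics from a 2x2 confusion matrix.

    Assumes rows are true classes and columns are predicted classes, with
    the second class as the positive one. No guard against empty rows or
    columns, so a degenerate matrix would raise ZeroDivisionError.
    """
    (tn, fp), (fn, tp) = confusion
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / total,
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1_score": 2 * precision * recall / (precision + recall),
    }


metrics = binary_metrics([[5, 1], [2, 2]])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With a test set of only 10 samples, each misclassified file shifts accuracy by 0.1, so these figures are a coarse estimate.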

To make predictions, you need a model that you have trained with Allie in the ./models/[sampletype]_models directory. For example, an audio model that detects gender may be in this tree structure. Since we have already trained the gender model above, we just need to put a sample file (which can be found here) in the ./load_dir directory to make a prediction.
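
To make the directory convention concrete, the sketch below scans a models folder and builds a dictionary shaped like the "IDENTIFIED MODELS" summary in the prediction output. It is a guess at the idea, not Allie's code: the function `identify_models` is hypothetical, and it assumes each trained model sits in its own subdirectory of `[sampletype]_models`.

```python
import tempfile
from pathlib import Path

SAMPLE_TYPES = ["audio", "text", "image", "video", "csv"]


def identify_models(models_root):
    """Map each '<sampletype>_models' folder to the model folders inside it.

    Hypothetical helper: assumes one subdirectory per trained model and
    reports an empty list when a sample-type folder is missing.
    """
    root = Path(models_root)
    found = {}
    for sampletype in SAMPLE_TYPES:
        folder = root / f"{sampletype}_models"
        if folder.is_dir():
            found[folder.name] = sorted(p.name for p in folder.iterdir() if p.is_dir())
        else:
            found[folder.name] = []
    return found


# Build a throwaway directory tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "audio_models" / "gender_tpot_classifier").mkdir(parents=True)
    models = identify_models(tmp)

print(models)
```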

Now call the CLI:

python3 allie.py --command predict

A gender prediction is then made for the sample file:

['gender_tpot_classifier']
['gender_tpot_classifier']
[]
[]
[]
[]
[]
[]
error
------------------------------
	      IDENTIFIED MODELS      
------------------------------
{'audio_models': ['gender_tpot_classifier'], 'text_models': [], 'image_models': [], 'video_models': [], 'csv_models': []}
-----------------------
FEATURIZING AUDIO_MODELS
-----------------------
/Users/jim/Desktop/allie/allie/features/audio_features
load_dir: 100%|█████████████████████████████████| 3/3 [00:00<00:00, 1154.71it/s]
-----------------------
MODELING AUDIO_MODELS
-----------------------
audio_models
--> predicting gender_tpot_classifier
gender_tpot_classifier_transform.pickle
tpot
audio_models
['jim.wav', 'jim.json', 'README.md']
['jim.json']
audio
{'sampletype': 'audio', 'transcripts': {'audio': {}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'features': {'audio': {'librosa_features': {'features': [48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 
28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, -14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955], 'labels': ['onset_length', 'onset_detect_mean', 'onset_detect_std', 'onset_detect_maxv', 'onset_detect_minv', 'onset_detect_median', 'tempo', 'onset_strength_mean', 'onset_strength_std', 
'onset_strength_maxv', 'onset_strength_minv', 'onset_strength_median', 'rhythm_0_mean', 'rhythm_0_std', 'rhythm_0_maxv', 'rhythm_0_minv', 'rhythm_0_median', 'rhythm_1_mean', 'rhythm_1_std', 'rhythm_1_maxv', 'rhythm_1_minv', 'rhythm_1_median', 'rhythm_2_mean', 'rhythm_2_std', 'rhythm_2_maxv', 'rhythm_2_minv', 'rhythm_2_median', 'rhythm_3_mean', 'rhythm_3_std', 'rhythm_3_maxv', 'rhythm_3_minv', 'rhythm_3_median', 'rhythm_4_mean', 'rhythm_4_std', 'rhythm_4_maxv', 'rhythm_4_minv', 'rhythm_4_median', 'rhythm_5_mean', 'rhythm_5_std', 'rhythm_5_maxv', 'rhythm_5_minv', 'rhythm_5_median', 'rhythm_6_mean', 'rhythm_6_std', 'rhythm_6_maxv', 'rhythm_6_minv', 'rhythm_6_median', 'rhythm_7_mean', 'rhythm_7_std', 'rhythm_7_maxv', 'rhythm_7_minv', 'rhythm_7_median', 'rhythm_8_mean', 'rhythm_8_std', 'rhythm_8_maxv', 'rhythm_8_minv', 'rhythm_8_median', 'rhythm_9_mean', 'rhythm_9_std', 'rhythm_9_maxv', 'rhythm_9_minv', 'rhythm_9_median', 'rhythm_10_mean', 'rhythm_10_std', 'rhythm_10_maxv', 'rhythm_10_minv', 'rhythm_10_median', 'rhythm_11_mean', 'rhythm_11_std', 'rhythm_11_maxv', 'rhythm_11_minv', 'rhythm_11_median', 'rhythm_12_mean', 'rhythm_12_std', 'rhythm_12_maxv', 'rhythm_12_minv', 'rhythm_12_median', 'mfcc_0_mean', 'mfcc_0_std', 'mfcc_0_maxv', 'mfcc_0_minv', 'mfcc_0_median', 'mfcc_1_mean', 'mfcc_1_std', 'mfcc_1_maxv', 'mfcc_1_minv', 'mfcc_1_median', 'mfcc_2_mean', 'mfcc_2_std', 'mfcc_2_maxv', 'mfcc_2_minv', 'mfcc_2_median', 'mfcc_3_mean', 'mfcc_3_std', 'mfcc_3_maxv', 'mfcc_3_minv', 'mfcc_3_median', 'mfcc_4_mean', 'mfcc_4_std', 'mfcc_4_maxv', 'mfcc_4_minv', 'mfcc_4_median', 'mfcc_5_mean', 'mfcc_5_std', 'mfcc_5_maxv', 'mfcc_5_minv', 'mfcc_5_median', 'mfcc_6_mean', 'mfcc_6_std', 'mfcc_6_maxv', 'mfcc_6_minv', 'mfcc_6_median', 'mfcc_7_mean', 'mfcc_7_std', 'mfcc_7_maxv', 'mfcc_7_minv', 'mfcc_7_median', 'mfcc_8_mean', 'mfcc_8_std', 'mfcc_8_maxv', 'mfcc_8_minv', 'mfcc_8_median', 'mfcc_9_mean', 'mfcc_9_std', 'mfcc_9_maxv', 'mfcc_9_minv', 'mfcc_9_median', 'mfcc_10_mean', 'mfcc_10_std', 
'mfcc_10_maxv', 'mfcc_10_minv', 'mfcc_10_median', 'mfcc_11_mean', 'mfcc_11_std', 'mfcc_11_maxv', 'mfcc_11_minv', 'mfcc_11_median', 'mfcc_12_mean', 'mfcc_12_std', 'mfcc_12_maxv', 'mfcc_12_minv', 'mfcc_12_median', 'poly_0_mean', 'poly_0_std', 'poly_0_maxv', 'poly_0_minv', 'poly_0_median', 'poly_1_mean', 'poly_1_std', 'poly_1_maxv', 'poly_1_minv', 'poly_1_median', 'spectral_centroid_mean', 'spectral_centroid_std', 'spectral_centroid_maxv', 'spectral_centroid_minv', 'spectral_centroid_median', 'spectral_bandwidth_mean', 'spectral_bandwidth_std', 'spectral_bandwidth_maxv', 'spectral_bandwidth_minv', 'spectral_bandwidth_median', 'spectral_contrast_mean', 'spectral_contrast_std', 'spectral_contrast_maxv', 'spectral_contrast_minv', 'spectral_contrast_median', 'spectral_flatness_mean', 'spectral_flatness_std', 'spectral_flatness_maxv', 'spectral_flatness_minv', 'spectral_flatness_median', 'spectral_rolloff_mean', 'spectral_rolloff_std', 'spectral_rolloff_maxv', 'spectral_rolloff_minv', 'spectral_rolloff_median', 'zero_crossings_mean', 'zero_crossings_std', 'zero_crossings_maxv', 'zero_crossings_minv', 'zero_crossings_median', 'RMSE_mean', 'RMSE_std', 'RMSE_maxv', 'RMSE_minv', 'RMSE_median']}}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'models': {'audio': {'females': [{'sample type': 'audio', 'created date': '2020-08-07 12:43:08.041610', 'device info': {'time': '2020-08-07 12:43', 'timezone': ['EST', 'EDT'], 'operating system': 'Darwin', 'os release': '19.5.0', 'os version': 'Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64', 'cpu data': {'memory': [8589934592, 2205638656, 74.3, 4275126272, 94564352, 2113089536, 1905627136, 2162036736], 'cpu percent': 77.4, 'cpu times': [157014.42, 0.0, 65743.85, 543406.97], 'cpu count': 4, 'cpu stats': [73245, 527841, 971986315, 465241], 'cpu swap': [2147483648, 1335623680, 811859968, 62.2, 542256906240, 1029312512], 'partitions': [['/dev/disk1s6', '/', 'apfs', 
'ro,local,rootfs,dovolfs,journaled,multilabel'], ['/dev/disk1s5', '/System/Volumes/Data', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s4', '/private/var/vm', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s1', '/Volumes/Macintosh HD - Data', 'apfs', 'rw,local,dovolfs,journaled,multilabel'], ['/dev/disk1s3', '/Volumes/Recovery', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel']], 'disk usage': [499963174912, 10985529344, 339299921920, 3.1], 'disk io counters': [11783216, 8846330, 592488718336, 504655544320, 11689626, 6788646], 'battery': [56, -2, True], 'boot time': 1596543104.0}, 'space left': 339.29992192}, 'session id': 'b4a4b8e8-d8cc-11ea-8e12-acde48001122', 'classes': ['males', 'females'], 'problem type': 'classification', 'model name': 'gender_tpot_classifier.pickle', 'model type': 'tpot', 'metrics': {'accuracy': 0.7, 'balanced_accuracy': 0.6666666666666667, 'precision': 0.6666666666666666, 'recall': 0.5, 'f1_score': 0.5714285714285715, 'f1_micro': 0.7, 'f1_macro': 0.6703296703296704, 'roc_auc': 0.6666666666666666, 'roc_auc_micro': 0.6666666666666666, 'roc_auc_macro': 0.6666666666666666, 'confusion_matrix': [[5, 1], [2, 2]], 'classification_report': '              precision    recall  f1-score   support\n\n       males       0.71      0.83      0.77         6\n     females       0.67      0.50      0.57         4\n\n    accuracy                           0.70        10\n   macro avg       0.69      0.67      0.67        10\nweighted avg       0.70      0.70      0.69        10\n'}, 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 
'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}, 'transformer name': 'gender_tpot_classifier_transform.pickle', 'training data': ['gender_all.csv', 'gender_train.csv', 'gender_test.csv', 'gender_all_transformed.csv', 'gender_train_transformed.csv', 'gender_test_transformed.csv'], 'sample X_test': [-0.5905520916740901, -0.5769542196749199, -0.4206917011248405, 0.7529668802365918, 0.09174953222020645, -0.6629166129037093, 0.7375134654386479, -0.15802980098738767, -0.15805735479905567, -0.6081655509519702, 0.9011267037791326, 1.0261687941190032, 0.20454394530543832, -0.8448466788837747, 0.35198391256345163, 0.8316422800614328, -0.8654169740282134, 0.8593094113006882, 0.8285863806706361, -0.7025236163101211], 'sample y_test': 0}]}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'labels': 
['load_dir'], 'errors': [], 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': False, 'select_features': False, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False, 'default_dimensionionality_reducer': ['pca']}}
['librosa_features']
audio
Pipeline(memory=None,
         steps=[('standard_scaler',
                 StandardScaler(copy=True, with_mean=True, with_std=True)),
                ('rfe',
                 RFE(estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3,
                                   epsilon=0.1, gamma='scale', kernel='linear',
                                   max_iter=-1, shrinking=True, tol=0.001,
                                   verbose=False),
                     n_features_to_select=20, step=1, verbose=0))],
         verbose=False)
[48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, 
-14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955]
[[ 0.06457865 -0.71415559 -1.50986974 -0.70250164 -1.28613767  0.25201476
   1.62472075  1.76165016 -1.54345567 -2.54907196  1.38612324  2.71927715
   0.06395576 -0.53985087  1.10550038  0.54977639 -0.55146853  3.84376773
   1.70959641 -0.73457093]]
tpot
[1]
{'males': [0], 'females': [1]}
1
females
females

Note that predictions are stored in the standard data dictionary (as a .json file):

{'sampletype': 'audio', 'transcripts': {'audio': {}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'features': {'audio': {'librosa_features': {'features': [48.0, 204.70833333333334, 114.93240011947118, 396.0, 14.0, 216.5, 103.359375, 1.8399430940147428, 1.9668646890049444, 11.385597904924593, 0.0, 1.0559244294941723, 1.0, 0.0, 1.0, 1.0, 1.0, 0.7548547717034835, 0.010364651324331484, 0.7781876333922462, 0.7413898270731765, 0.7531146388573189, 0.5329470301378001, 0.017993121900389288, 0.5625446261639591, 0.49812177879164954, 0.5362475050227643, 0.5019325152826378, 0.014070606086479512, 0.523632811868841, 0.46894673564752315, 0.5016733890463346, 0.47905132092405744, 0.02472744944944913, 0.5170032241543769, 0.408032583763636, 0.481510183943566, 0.47211244429901056, 0.018043067999864118, 0.4986083755424441, 0.4084943475419884, 0.47487594610615297, 0.4698145764425497, 0.02481760404009747, 0.5249353704873062, 0.4293713399428569, 0.4722612320098678, 0.45261508487773017, 0.026497663310172545, 0.49848564789957234, 0.40880741566998524, 0.4507269108715201, 0.4486803104555413, 0.058559460166888094, 0.5292193402660791, 0.3401144705267203, 0.44999945618536435, 0.47497774707770735, 0.06545659069313127, 0.5736851049778624, 0.37925421500129, 0.4734915617563768, 0.4650337731799947, 0.06320658864729298, 0.5675856011170606, 0.3828128296325481, 0.45491284769941215, 0.4336677569640048, 0.06580364398487831, 0.5229786825561087, 0.3254973876934075, 0.4435446804048719, 0.4510229261935718, 0.0716424867984984, 0.5607997826027251, 0.3319068941555564, 0.45336899240905365, -378.4693712461592, 123.45005738361948, -131.02074973363048, -645.6119532302674, -365.0407612849682, 108.01722016743142, 78.5850621057939, 244.19279156346005, -109.89544987268641, 113.87757464191944, -18.990339871317058, 38.97227759803155, 80.46313291288668, -113.14922433281748, -19.5478460234633, 25.85348830525823, 36.66801973350443, 140.72102808980202, -59.74682246793187, 18.3196627309548, 25.890819294565695, 
28.110070916600474, 109.71209190044716, -32.50655086525428, 24.126562365562382, -12.77779324195114, 25.980150189124338, 37.34024720564918, -89.18596268298815, -14.092855104596493, -14.213402047550273, 17.851386217883952, 24.416921204215857, -53.80916251929509, -15.616460366626296, -11.056262156053059, 18.131479541957944, 25.019042211813467, -65.95011982036516, -10.115261093647717, -2.111560667454096, 11.800353875032327, 33.815281150727785, -35.047615612670526, -2.4632489045982657, -12.855548041442455, 13.841955462451525, 26.49950045235625, -54.65905146286438, -12.258563565004795, -5.991988010947961, 11.560727147262314, 26.699383611419385, -46.86210002294128, -5.08389478450145, -11.905883972886778, 13.110884275285521, 18.96898208296976, -55.222181120197234, -8.889351847151506, -13.282554457300717, 9.363802595261776, 13.125079504552438, -42.40351688080857, -12.904730673116855, -6.647081175227956e-05, 6.790962221819154e-05, 3.898538767970233e-05, -0.0003530532719088282, -5.176161063821292e-05, 0.5775604310470552, 0.5363262958114443, 3.0171051694951547, 0.005876029108677461, 0.447613631105005, 2196.6402149427804, 1460.1082170800585, 6848.122696727527, 474.45532202867423, 1779.7575344580457, 1879.6573011499802, 758.0548156953982, 3968.436183431614, 710.7057371268927, 1783.9133839417857, 25.057721821734972, 7.417488037600184, 48.54069273302066, 7.980294433517432, 26.382808285840404, 0.02705797180533409, 0.049401603639125824, 0.22989588975906372, 2.3204531316878274e-05, 0.0016842428594827652, 3896.30511090472, 2618.9936438064337, 9829.9072265625, 484.4970703125, 2993.115234375, 0.13837594696969696, 0.11062751539644003, 0.62060546875, 0.01220703125, 0.10009765625, 0.025540588423609734, 0.02010413259267807, 0.09340725094079971, 0.00015651443391107023, 0.02306547947227955], 'labels': ['onset_length', 'onset_detect_mean', 'onset_detect_std', 'onset_detect_maxv', 'onset_detect_minv', 'onset_detect_median', 'tempo', 'onset_strength_mean', 'onset_strength_std', 
'onset_strength_maxv', 'onset_strength_minv', 'onset_strength_median', 'rhythm_0_mean', 'rhythm_0_std', 'rhythm_0_maxv', 'rhythm_0_minv', 'rhythm_0_median', 'rhythm_1_mean', 'rhythm_1_std', 'rhythm_1_maxv', 'rhythm_1_minv', 'rhythm_1_median', 'rhythm_2_mean', 'rhythm_2_std', 'rhythm_2_maxv', 'rhythm_2_minv', 'rhythm_2_median', 'rhythm_3_mean', 'rhythm_3_std', 'rhythm_3_maxv', 'rhythm_3_minv', 'rhythm_3_median', 'rhythm_4_mean', 'rhythm_4_std', 'rhythm_4_maxv', 'rhythm_4_minv', 'rhythm_4_median', 'rhythm_5_mean', 'rhythm_5_std', 'rhythm_5_maxv', 'rhythm_5_minv', 'rhythm_5_median', 'rhythm_6_mean', 'rhythm_6_std', 'rhythm_6_maxv', 'rhythm_6_minv', 'rhythm_6_median', 'rhythm_7_mean', 'rhythm_7_std', 'rhythm_7_maxv', 'rhythm_7_minv', 'rhythm_7_median', 'rhythm_8_mean', 'rhythm_8_std', 'rhythm_8_maxv', 'rhythm_8_minv', 'rhythm_8_median', 'rhythm_9_mean', 'rhythm_9_std', 'rhythm_9_maxv', 'rhythm_9_minv', 'rhythm_9_median', 'rhythm_10_mean', 'rhythm_10_std', 'rhythm_10_maxv', 'rhythm_10_minv', 'rhythm_10_median', 'rhythm_11_mean', 'rhythm_11_std', 'rhythm_11_maxv', 'rhythm_11_minv', 'rhythm_11_median', 'rhythm_12_mean', 'rhythm_12_std', 'rhythm_12_maxv', 'rhythm_12_minv', 'rhythm_12_median', 'mfcc_0_mean', 'mfcc_0_std', 'mfcc_0_maxv', 'mfcc_0_minv', 'mfcc_0_median', 'mfcc_1_mean', 'mfcc_1_std', 'mfcc_1_maxv', 'mfcc_1_minv', 'mfcc_1_median', 'mfcc_2_mean', 'mfcc_2_std', 'mfcc_2_maxv', 'mfcc_2_minv', 'mfcc_2_median', 'mfcc_3_mean', 'mfcc_3_std', 'mfcc_3_maxv', 'mfcc_3_minv', 'mfcc_3_median', 'mfcc_4_mean', 'mfcc_4_std', 'mfcc_4_maxv', 'mfcc_4_minv', 'mfcc_4_median', 'mfcc_5_mean', 'mfcc_5_std', 'mfcc_5_maxv', 'mfcc_5_minv', 'mfcc_5_median', 'mfcc_6_mean', 'mfcc_6_std', 'mfcc_6_maxv', 'mfcc_6_minv', 'mfcc_6_median', 'mfcc_7_mean', 'mfcc_7_std', 'mfcc_7_maxv', 'mfcc_7_minv', 'mfcc_7_median', 'mfcc_8_mean', 'mfcc_8_std', 'mfcc_8_maxv', 'mfcc_8_minv', 'mfcc_8_median', 'mfcc_9_mean', 'mfcc_9_std', 'mfcc_9_maxv', 'mfcc_9_minv', 'mfcc_9_median', 'mfcc_10_mean', 'mfcc_10_std', 
'mfcc_10_maxv', 'mfcc_10_minv', 'mfcc_10_median', 'mfcc_11_mean', 'mfcc_11_std', 'mfcc_11_maxv', 'mfcc_11_minv', 'mfcc_11_median', 'mfcc_12_mean', 'mfcc_12_std', 'mfcc_12_maxv', 'mfcc_12_minv', 'mfcc_12_median', 'poly_0_mean', 'poly_0_std', 'poly_0_maxv', 'poly_0_minv', 'poly_0_median', 'poly_1_mean', 'poly_1_std', 'poly_1_maxv', 'poly_1_minv', 'poly_1_median', 'spectral_centroid_mean', 'spectral_centroid_std', 'spectral_centroid_maxv', 'spectral_centroid_minv', 'spectral_centroid_median', 'spectral_bandwidth_mean', 'spectral_bandwidth_std', 'spectral_bandwidth_maxv', 'spectral_bandwidth_minv', 'spectral_bandwidth_median', 'spectral_contrast_mean', 'spectral_contrast_std', 'spectral_contrast_maxv', 'spectral_contrast_minv', 'spectral_contrast_median', 'spectral_flatness_mean', 'spectral_flatness_std', 'spectral_flatness_maxv', 'spectral_flatness_minv', 'spectral_flatness_median', 'spectral_rolloff_mean', 'spectral_rolloff_std', 'spectral_rolloff_maxv', 'spectral_rolloff_minv', 'spectral_rolloff_median', 'zero_crossings_mean', 'zero_crossings_std', 'zero_crossings_maxv', 'zero_crossings_minv', 'zero_crossings_median', 'RMSE_mean', 'RMSE_std', 'RMSE_maxv', 'RMSE_minv', 'RMSE_median']}}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'models': {'audio': {'females': [{'sample type': 'audio', 'created date': '2020-08-07 12:43:08.041610', 'device info': {'time': '2020-08-07 12:43', 'timezone': ['EST', 'EDT'], 'operating system': 'Darwin', 'os release': '19.5.0', 'os version': 'Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64', 'cpu data': {'memory': [8589934592, 2205638656, 74.3, 4275126272, 94564352, 2113089536, 1905627136, 2162036736], 'cpu percent': 77.4, 'cpu times': [157014.42, 0.0, 65743.85, 543406.97], 'cpu count': 4, 'cpu stats': [73245, 527841, 971986315, 465241], 'cpu swap': [2147483648, 1335623680, 811859968, 62.2, 542256906240, 1029312512], 'partitions': [['/dev/disk1s6', '/', 'apfs', 
'ro,local,rootfs,dovolfs,journaled,multilabel'], ['/dev/disk1s5', '/System/Volumes/Data', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s4', '/private/var/vm', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s1', '/Volumes/Macintosh HD - Data', 'apfs', 'rw,local,dovolfs,journaled,multilabel'], ['/dev/disk1s3', '/Volumes/Recovery', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel']], 'disk usage': [499963174912, 10985529344, 339299921920, 3.1], 'disk io counters': [11783216, 8846330, 592488718336, 504655544320, 11689626, 6788646], 'battery': [56, -2, True], 'boot time': 1596543104.0}, 'space left': 339.29992192}, 'session id': 'b4a4b8e8-d8cc-11ea-8e12-acde48001122', 'classes': ['males', 'females'], 'problem type': 'classification', 'model name': 'gender_tpot_classifier.pickle', 'model type': 'tpot', 'metrics': {'accuracy': 0.7, 'balanced_accuracy': 0.6666666666666667, 'precision': 0.6666666666666666, 'recall': 0.5, 'f1_score': 0.5714285714285715, 'f1_micro': 0.7, 'f1_macro': 0.6703296703296704, 'roc_auc': 0.6666666666666666, 'roc_auc_micro': 0.6666666666666666, 'roc_auc_macro': 0.6666666666666666, 'confusion_matrix': [[5, 1], [2, 2]], 'classification_report': '              precision    recall  f1-score   support\n\n       males       0.71      0.83      0.77         6\n     females       0.67      0.50      0.57         4\n\n    accuracy                           0.70        10\n   macro avg       0.69      0.67      0.67        10\nweighted avg       0.70      0.70      0.69        10\n'}, 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 
'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}, 'transformer name': 'gender_tpot_classifier_transform.pickle', 'training data': ['gender_all.csv', 'gender_train.csv', 'gender_test.csv', 'gender_all_transformed.csv', 'gender_train_transformed.csv', 'gender_test_transformed.csv'], 'sample X_test': [-0.5905520916740901, -0.5769542196749199, -0.4206917011248405, 0.7529668802365918, 0.09174953222020645, -0.6629166129037093, 0.7375134654386479, -0.15802980098738767, -0.15805735479905567, -0.6081655509519702, 0.9011267037791326, 1.0261687941190032, 0.20454394530543832, -0.8448466788837747, 0.35198391256345163, 0.8316422800614328, -0.8654169740282134, 0.8593094113006882, 0.8285863806706361, -0.7025236163101211], 'sample y_test': 0}, {'sample type': 'audio', 'created date': '2020-08-07 
12:43:08.041610', 'device info': {'time': '2020-08-07 12:43', 'timezone': ['EST', 'EDT'], 'operating system': 'Darwin', 'os release': '19.5.0', 'os version': 'Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64', 'cpu data': {'memory': [8589934592, 2205638656, 74.3, 4275126272, 94564352, 2113089536, 1905627136, 2162036736], 'cpu percent': 77.4, 'cpu times': [157014.42, 0.0, 65743.85, 543406.97], 'cpu count': 4, 'cpu stats': [73245, 527841, 971986315, 465241], 'cpu swap': [2147483648, 1335623680, 811859968, 62.2, 542256906240, 1029312512], 'partitions': [['/dev/disk1s6', '/', 'apfs', 'ro,local,rootfs,dovolfs,journaled,multilabel'], ['/dev/disk1s5', '/System/Volumes/Data', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s4', '/private/var/vm', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel'], ['/dev/disk1s1', '/Volumes/Macintosh HD - Data', 'apfs', 'rw,local,dovolfs,journaled,multilabel'], ['/dev/disk1s3', '/Volumes/Recovery', 'apfs', 'rw,local,dovolfs,dontbrowse,journaled,multilabel']], 'disk usage': [499963174912, 10985529344, 339299921920, 3.1], 'disk io counters': [11783216, 8846330, 592488718336, 504655544320, 11689626, 6788646], 'battery': [56, -2, True], 'boot time': 1596543104.0}, 'space left': 339.29992192}, 'session id': 'b4a4b8e8-d8cc-11ea-8e12-acde48001122', 'classes': ['males', 'females'], 'problem type': 'classification', 'model name': 'gender_tpot_classifier.pickle', 'model type': 'tpot', 'metrics': {'accuracy': 0.7, 'balanced_accuracy': 0.6666666666666667, 'precision': 0.6666666666666666, 'recall': 0.5, 'f1_score': 0.5714285714285715, 'f1_micro': 0.7, 'f1_macro': 0.6703296703296704, 'roc_auc': 0.6666666666666666, 'roc_auc_micro': 0.6666666666666666, 'roc_auc_macro': 0.6666666666666666, 'confusion_matrix': [[5, 1], [2, 2]], 'classification_report': '              precision    recall  f1-score   support\n\n       males       0.71      0.83      0.77         6\n     females  
     0.67      0.50      0.57         4\n\n    accuracy                           0.70        10\n   macro avg       0.69      0.67      0.67        10\nweighted avg       0.70      0.70      0.69        10\n'}, 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}, 'transformer name': 'gender_tpot_classifier_transform.pickle', 'training data': ['gender_all.csv', 'gender_train.csv', 
'gender_test.csv', 'gender_all_transformed.csv', 'gender_train_transformed.csv', 'gender_test_transformed.csv'], 'sample X_test': [-0.5905520916740901, -0.5769542196749199, -0.4206917011248405, 0.7529668802365918, 0.09174953222020645, -0.6629166129037093, 0.7375134654386479, -0.15802980098738767, -0.15805735479905567, -0.6081655509519702, 0.9011267037791326, 1.0261687941190032, 0.20454394530543832, -0.8448466788837747, 0.35198391256345163, 0.8316422800614328, -0.8654169740282134, 0.8593094113006882, 0.8285863806706361, -0.7025236163101211], 'sample y_test': 0}]}, 'text': {}, 'image': {}, 'video': {}, 'csv': {}}, 'labels': ['load_dir'], 'errors': [], 'settings': {'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 
'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': False, 'select_features': False, 'test_size': 0.1, 'transcribe_audio': False, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False, 'default_dimensionionality_reducer': ['pca']}}
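As a rough sketch, the prediction can be read back out of a dictionary with this shape. It assumes the file on disk is valid JSON with the same keys as the printed dictionary, and that the class name appears as the key under `models` for the sample type, as in the output above. The helper name `predicted_classes` is hypothetical and not part of Allie.

```python
import json

def predicted_classes(data: dict) -> dict:
    """Map each sample type to the class names found under 'models'."""
    out = {}
    for sampletype, by_class in data.get("models", {}).items():
        if by_class:  # skip sample types with no predictions, e.g. 'text': {}
            out[sampletype] = sorted(by_class)
    return out

# Trimmed stand-in for the dictionary shown above (most fields omitted).
example = json.loads("""
{"sampletype": "audio",
 "models": {"audio": {"females": [{"model name": "gender_tpot_classifier.pickle",
                                   "model type": "tpot",
                                   "metrics": {"accuracy": 0.7}}]},
            "text": {}, "image": {}, "video": {}, "csv": {}}}
""")

print(predicted_classes(example))
for entry in example["models"]["audio"]["females"]:
    # Each entry carries the metadata of the model that made the prediction.
    print(entry["model name"], entry["metrics"]["accuracy"])
```

For a real file, `json.load(open(path))` would replace the inline example, where `path` points to the .json file written next to the audio sample.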

Once folders of files have been featurized, you can make a transformer to reduce or select features. Here 'males' and 'females' are the two directories used to fit the transformer:

python3 allie.py --command transform --dir /Users/jim/desktop/allie/allie/train_dir/males --dir /Users/jim/desktop/allie/allie/train_dir/females --sampletype audio --problemtype c --name gender
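If you script this step, the same command can be assembled from a list of class directories. This is only a sketch: the function name `transform_command` is hypothetical, and the only flags it uses are the ones shown in the command above.

```python
import shlex

def transform_command(dirs, sampletype, problemtype, name):
    """Build the argv list for `allie.py --command transform`."""
    argv = ["python3", "allie.py", "--command", "transform"]
    for d in dirs:
        argv += ["--dir", d]  # one --dir flag per class folder
    argv += ["--sampletype", sampletype,
             "--problemtype", problemtype,
             "--name", name]
    return argv

argv = transform_command(
    ["/Users/jim/desktop/allie/allie/train_dir/males",
     "/Users/jim/desktop/allie/allie/train_dir/females"],
    sampletype="audio", problemtype="c", name="gender")

# Print a copy-pasteable version of the command.
print(" ".join(shlex.quote(a) for a in argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` from the folder that contains allie.py would run the same transform command as above.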

This makes the transformer based on the defaults set in the settings.json file:

Requirement already satisfied: scikit-learn==0.22.2.post1 in /usr/local/lib/python3.7/site-packages (0.22.2.post1)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (0.15.1)
Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.4.1)
Requirement already satisfied: numpy>=1.11.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn==0.22.2.post1) (1.18.4)
WARNING: You are using pip version 20.2; however, version 20.2.1 is available.
You should consider upgrading via the '/usr/local/opt/python/bin/python3.7 -m pip install --upgrade pip' command.
/Users/jim/Desktop/allie/allie
True
False
True
['standard_scaler']
['pca']
['rfe']
['males']
['males', 'females']
----------LOADING MALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1904.04it/s]
----------LOADING FEMALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1596.84it/s]
[30.0, 92.66666666666667, 49.01451032319126, 169.0, 3.0, 94.5, 129.19921875, 1.690250229326713, 1.2789544717499288, 7.456481484968146, 0.0, 1.367037342965302, 1.0, 0.0, 1.0, 1.0, 1.0, 0.8144353054485143, 0.07315829369998768, 0.9417573794204208, 0.719033364501182, 0.8028530517913689, 0.6516314010377938, 0.12866423184791223, 0.880534700043115, 0.48682582807073504, 0.628550582813504, 0.6557855743777458, 0.12156072712457477, 0.876943271385895, 0.5062201640787684, 0.6303779638265605, 0.7009278935840026, 0.11771103035069283, 0.9076673271146752, 0.5511437216636424, 0.6804123245018233, 0.7258621781459784, 0.11636256125038048, 0.9226262209665014, 0.5658213459263224, 0.7115627650092918, 0.7319688473295565, 0.10607996699023634, 0.9132149573567513, 0.5828179013969554, 0.7190717663201714, 0.6810593533546547, 0.1271240541410719, 0.8928223130579849, 0.49494090316773426, 0.6701480413560182, 0.6492148665611919, 0.1313768776855918, 0.874588980402631, 0.4626074503575165, 0.6337619863577791, 0.6913642725773188, 0.11647170482925652, 0.8893955618442694, 0.5215713488179923, 0.6793508362465139, 0.740905465414844, 0.11388770333587857, 0.9189510886031139, 0.5580632361822792, 0.7396580862023646, 0.706541518233447, 0.12965432917680048, 0.909703948901588, 0.5021640861839305, 0.7040640631286661, 0.660178894654137, 0.13467299472507263, 0.881431610290283, 0.45922877094491754, 0.6508665445976783, -125.02876889048967, 39.94989669951198, -41.153294052622755, -241.5324712414671, -123.15837201863796, 127.28465582663468, 35.80741987232664, 192.13303272873583, 26.363771628602464, 131.39178180883965, -31.40824555387448, 14.845912346019078, 4.214112235102621, -63.02792432547794, -31.674782272416806, 63.59218904191833, 19.29518727295757, 113.68750652424006, 2.028838106725491, 66.10692313577907, -26.962807839040785, 21.096820187821937, 21.987230456807126, -72.99876213857725, -26.07311047818838, 28.076659576584003, 19.396170848691963, 83.59413022430225, -8.32134239204613, 28.358152011527395, 
-26.198379123200496, 13.790690985287558, -0.4330725985216038, -67.13887308266585, -24.024634643719104, 8.944904763952549, 11.567269550826811, 45.93168128672215, -13.803048109141683, 8.970779559926964, -14.501981105396572, 9.903767724994214, 5.117014585578213, -37.86970568591489, -13.963186489353376, -9.611464362952907, 9.47798222378478, 10.076670810295305, -33.566944481425, -10.062288102168846, -3.327091218464487, 7.47132694440408, 18.9357242293349, -21.052624308721697, -2.9141128249673565, -10.727285681938708, 7.619622699284192, 10.784910450523757, -33.111268975426995, -10.533412108789724, 6.26325664564468, 7.931147571087225, 25.507855843506437, -14.300663616055672, 6.416538653827497, -0.000515034409211756, 0.0002809336537487639, -7.005439972082589e-05, -0.0012700827705259812, -0.0005047523754684405, 4.271716503539869, 2.1772810824505493, 9.883547207471688, 0.6583119006445217, 4.221365244890755, 1862.442990522799, 877.294093498969, 4822.9006596619165, 860.6399518698654, 1546.5592748248223, 1794.3816987793941, 382.3749135091646, 2659.687547401115, 1050.0506924306242, 1772.359533399369, 22.382069858248713, 5.600814877396231, 46.28314734561523, 10.567706808766474, 22.745121605457474, 0.00013835863501299173, 0.0002091049827868119, 0.0008215561974793673, 4.916078069072682e-06, 2.9649212592630647e-05, 3742.1810752467104, 1489.038670641279, 6696.826171875, 1130.4931640625, 3402.24609375, 0.10214501096491228, 0.06874772038732836, 0.37109375, 0.0107421875, 0.08349609375, 0.20060665905475616, 0.09223830699920654, 0.3933629095554352, 0.021954059600830078, 0.22151805460453033]
['onset_length', 'onset_detect_mean', 'onset_detect_std', 'onset_detect_maxv', 'onset_detect_minv', 'onset_detect_median', 'tempo', 'onset_strength_mean', 'onset_strength_std', 'onset_strength_maxv', 'onset_strength_minv', 'onset_strength_median', 'rhythm_0_mean', 'rhythm_0_std', 'rhythm_0_maxv', 'rhythm_0_minv', 'rhythm_0_median', 'rhythm_1_mean', 'rhythm_1_std', 'rhythm_1_maxv', 'rhythm_1_minv', 'rhythm_1_median', 'rhythm_2_mean', 'rhythm_2_std', 'rhythm_2_maxv', 'rhythm_2_minv', 'rhythm_2_median', 'rhythm_3_mean', 'rhythm_3_std', 'rhythm_3_maxv', 'rhythm_3_minv', 'rhythm_3_median', 'rhythm_4_mean', 'rhythm_4_std', 'rhythm_4_maxv', 'rhythm_4_minv', 'rhythm_4_median', 'rhythm_5_mean', 'rhythm_5_std', 'rhythm_5_maxv', 'rhythm_5_minv', 'rhythm_5_median', 'rhythm_6_mean', 'rhythm_6_std', 'rhythm_6_maxv', 'rhythm_6_minv', 'rhythm_6_median', 'rhythm_7_mean', 'rhythm_7_std', 'rhythm_7_maxv', 'rhythm_7_minv', 'rhythm_7_median', 'rhythm_8_mean', 'rhythm_8_std', 'rhythm_8_maxv', 'rhythm_8_minv', 'rhythm_8_median', 'rhythm_9_mean', 'rhythm_9_std', 'rhythm_9_maxv', 'rhythm_9_minv', 'rhythm_9_median', 'rhythm_10_mean', 'rhythm_10_std', 'rhythm_10_maxv', 'rhythm_10_minv', 'rhythm_10_median', 'rhythm_11_mean', 'rhythm_11_std', 'rhythm_11_maxv', 'rhythm_11_minv', 'rhythm_11_median', 'rhythm_12_mean', 'rhythm_12_std', 'rhythm_12_maxv', 'rhythm_12_minv', 'rhythm_12_median', 'mfcc_0_mean', 'mfcc_0_std', 'mfcc_0_maxv', 'mfcc_0_minv', 'mfcc_0_median', 'mfcc_1_mean', 'mfcc_1_std', 'mfcc_1_maxv', 'mfcc_1_minv', 'mfcc_1_median', 'mfcc_2_mean', 'mfcc_2_std', 'mfcc_2_maxv', 'mfcc_2_minv', 'mfcc_2_median', 'mfcc_3_mean', 'mfcc_3_std', 'mfcc_3_maxv', 'mfcc_3_minv', 'mfcc_3_median', 'mfcc_4_mean', 'mfcc_4_std', 'mfcc_4_maxv', 'mfcc_4_minv', 'mfcc_4_median', 'mfcc_5_mean', 'mfcc_5_std', 'mfcc_5_maxv', 'mfcc_5_minv', 'mfcc_5_median', 'mfcc_6_mean', 'mfcc_6_std', 'mfcc_6_maxv', 'mfcc_6_minv', 'mfcc_6_median', 'mfcc_7_mean', 'mfcc_7_std', 'mfcc_7_maxv', 'mfcc_7_minv', 'mfcc_7_median', 
'mfcc_8_mean', 'mfcc_8_std', 'mfcc_8_maxv', 'mfcc_8_minv', 'mfcc_8_median', 'mfcc_9_mean', 'mfcc_9_std', 'mfcc_9_maxv', 'mfcc_9_minv', 'mfcc_9_median', 'mfcc_10_mean', 'mfcc_10_std', 'mfcc_10_maxv', 'mfcc_10_minv', 'mfcc_10_median', 'mfcc_11_mean', 'mfcc_11_std', 'mfcc_11_maxv', 'mfcc_11_minv', 'mfcc_11_median', 'mfcc_12_mean', 'mfcc_12_std', 'mfcc_12_maxv', 'mfcc_12_minv', 'mfcc_12_median', 'poly_0_mean', 'poly_0_std', 'poly_0_maxv', 'poly_0_minv', 'poly_0_median', 'poly_1_mean', 'poly_1_std', 'poly_1_maxv', 'poly_1_minv', 'poly_1_median', 'spectral_centroid_mean', 'spectral_centroid_std', 'spectral_centroid_maxv', 'spectral_centroid_minv', 'spectral_centroid_median', 'spectral_bandwidth_mean', 'spectral_bandwidth_std', 'spectral_bandwidth_maxv', 'spectral_bandwidth_minv', 'spectral_bandwidth_median', 'spectral_contrast_mean', 'spectral_contrast_std', 'spectral_contrast_maxv', 'spectral_contrast_minv', 'spectral_contrast_median', 'spectral_flatness_mean', 'spectral_flatness_std', 'spectral_flatness_maxv', 'spectral_flatness_minv', 'spectral_flatness_median', 'spectral_rolloff_mean', 'spectral_rolloff_std', 'spectral_rolloff_maxv', 'spectral_rolloff_minv', 'spectral_rolloff_median', 'zero_crossings_mean', 'zero_crossings_std', 'zero_crossings_maxv', 'zero_crossings_minv', 'zero_crossings_median', 'RMSE_mean', 'RMSE_std', 'RMSE_maxv', 'RMSE_minv', 'RMSE_median']
STANDARD_SCALER
RFE - 20 features
[('standard_scaler', StandardScaler(copy=True, with_mean=True, with_std=True)), ('rfe', RFE(estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,
                  gamma='scale', kernel='linear', max_iter=-1, shrinking=True,
                  tol=0.001, verbose=False),
    n_features_to_select=20, step=1, verbose=0))]
11
11
transformed training size
[ 0.87867072 -0.2927148  -1.3942374  -2.21466181 -1.24338953 -0.5532292
  0.02975783  0.42827433 -0.28430065 -1.2838709   0.90746239  1.67629585
 -1.48610134  1.03165105  1.35402715  0.73145188 -0.61561207  0.41546984
  0.63114357  1.48055371]
/Users/jim/Desktop/allie/allie/preprocessing
your transform can now be found in the ./preprocessing/audio_transforms directory

For more information about Allie's preprocessing capabilities, see this link.
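
The log above shows a two-step transform: a StandardScaler followed by recursive feature elimination (RFE) wrapped around a linear SVR that keeps 20 features. A minimal sketch of an equivalent scikit-learn pipeline is below. The synthetic data, its shape, and the variable names are assumptions for illustration; this is not Allie's own code.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for featurized audio: 30 samples, 40 features (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 40))
y = rng.integers(0, 2, size=30)  # two classes encoded as 0/1

# Same step names and key parameters as in the printed pipeline.
transform = Pipeline([
    ("standard_scaler", StandardScaler()),
    ("rfe", RFE(estimator=SVR(kernel="linear"), n_features_to_select=20, step=1)),
])

# Fit on the labeled data, then keep only the 20 selected (scaled) features.
X_t = transform.fit_transform(X, y)
print(X_t.shape)
```

Each transformed row has 20 values, which matches the length of the example vector printed in the log.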

You can run unit tests with:

python3 allie.py --command test

The output then reports whether the tests passed or failed:

----------------------------------------------------------------------
Ran 28 tests in 551.517s

OK
-----------------^^^-----------------------
-------------^^^^---^^^^-------------------
-----------CLEANUP TEMP FILES--------------
---------^^^^^^^^^^^^^^^^^^^^^^------------
deleting temp files from FFmpeg and SoX tests
-------------------------------------------
deleting temp files load_dir tests
-------------------------------------------
deleting temp model files (audio, text, image, and video)
-------------------------------------------

Note that these unit tests depend on the settings that you define in the settings.json database, so the results can differ from one configuration to another.
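
To illustrate what settings-dependent tests can look like, here is a minimal sketch using Python's unittest module. The test class, the test names, and the inline SETTINGS dictionary (a stand-in for a loaded settings.json) are hypothetical; Allie's actual test suite may be organized differently.

```python
import json
import unittest

# Stand-in for the parsed contents of settings.json (assumed subset of keys).
SETTINGS = json.loads('{"transcribe_audio": true, "clean_data": false}')


class SettingsAwareTests(unittest.TestCase):
    # Hypothetical test: only runs when cleaning is enabled in the settings.
    @unittest.skipUnless(SETTINGS["clean_data"], "clean_data is disabled in settings")
    def test_cleaning(self):
        self.assertTrue(SETTINGS["clean_data"])

    # Hypothetical test: always runs and checks the type of a setting.
    def test_transcription_flag_is_boolean(self):
        self.assertIsInstance(SETTINGS["transcribe_audio"], bool)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SettingsAwareTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With clean_data set to false, the cleaning test is skipped rather than failed, so the overall run still reports OK.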

You can visualize multi-class problems that have featurized folders with:

python3 allie.py --command visualize

Now specify the folders to run the visualization on:

what is the problem that you are going after (e.g. "audio", "text", "image","video","csv")
audio
audio
how many classes do you want to model? (e.g. 2)
2
what is class #1
males
what is class #2
females
minimum length is...
51
----------LOADING MALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1593.25it/s]
----------LOADING FEMALES----------
100%|█████████████████████████████████████████| 51/51 [00:00<00:00, 1227.09it/s]
...

This will then take you through a visualization prompt to set the classes and structure a visualization session; the results are written to the 'visualization_session' folder.
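
The "minimum length is..." line in the prompt suggests that classes are compared by size before loading. As a sketch only, and assuming the classes are trimmed to the size of the smallest one (this behavior is an assumption, not confirmed by the log), balancing could look like the following. The function name, file names, and class sizes are hypothetical.

```python
import random


def balance_classes(classes, seed=0):
    """Sample every class down to the size of the smallest class."""
    min_len = min(len(items) for items in classes.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return {name: rng.sample(items, min_len) for name, items in classes.items()}


# Hypothetical featurized files for two classes of unequal size.
classes = {
    "males": [f"male_{i}.json" for i in range(60)],
    "females": [f"female_{i}.json" for i in range(51)],
}
balanced = balance_classes(classes)
print({name: len(items) for name, items in balanced.items()})
```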

You can change Allie's settings with:

python3 allie.py --command settings

This will then open up a list of questions to allow you to specify new settings within Allie or visualize the existing settings, as set by the settings.json database.

For example, you may want to turn off the transcribe_video setting by setting it to False:

{'version': '1.0.0', 'augment_data': False, 'balance_data': True, 'clean_data': False, 'create_csv': True, 'default_audio_augmenters': ['augment_tsaug'], 'default_audio_cleaners': ['clean_mono16hz'], 'default_audio_features': ['librosa_features'], 'default_audio_transcriber': ['deepspeech_dict'], 'default_csv_augmenters': ['augment_ctgan_regression'], 'default_csv_cleaners': ['clean_csv'], 'default_csv_features': ['csv_features_regression'], 'default_csv_transcriber': ['raw text'], 'default_dimensionality_reducer': ['pca'], 'default_feature_selector': ['rfe'], 'default_image_augmenters': ['augment_imgaug'], 'default_image_cleaners': ['clean_greyscale'], 'default_image_features': ['image_features'], 'default_image_transcriber': ['tesseract'], 'default_outlier_detector': ['isolationforest'], 'default_scaler': ['standard_scaler'], 'default_text_augmenters': ['augment_textacy'], 'default_text_cleaners': ['remove_duplicates'], 'default_text_features': ['nltk_features'], 'default_text_transcriber': ['raw text'], 'default_training_script': ['tpot'], 'default_video_augmenters': ['augment_vidaug'], 'default_video_cleaners': ['remove_duplicates'], 'default_video_features': ['video_features'], 'default_video_transcriber': ['tesseract (averaged over frames)'], 'dimension_number': 2, 'feature_number': 20, 'model_compress': False, 'reduce_dimensions': False, 'remove_outliers': True, 'scale_features': True, 'select_features': True, 'test_size': 0.1, 'transcribe_audio': True, 'transcribe_csv': True, 'transcribe_image': True, 'transcribe_text': True, 'transcribe_video': True, 'transcribe_videos': True, 'visualize_data': False}

Would you like to change any of these settings? Yes (-y) or No (-n)
y
What setting would you like to change?
transcribe_video
What setting would you like to set here?
False
<class 'bool'>

Note that a list of all possible settings to change can be found here.
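
In the session above, the typed text False is reported as `<class 'bool'>`, which indicates that the answer is converted to a Python value before it is stored. A minimal sketch of that kind of update is below. The helper names, the use of ast.literal_eval, and the temporary file are assumptions for illustration; Allie's own settings code may work differently.

```python
import ast
import json
import tempfile
from pathlib import Path


def parse_value(raw):
    """Convert typed text such as 'False' or '20' to a Python value; keep other text as a string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def change_setting(path, name, raw_value):
    """Load a settings JSON file, update one existing key, and write it back."""
    settings = json.loads(Path(path).read_text())
    if name not in settings:
        raise KeyError(f"unknown setting: {name}")
    settings[name] = parse_value(raw_value)
    Path(path).write_text(json.dumps(settings, indent=2))
    return settings


# Demonstration on a temporary copy, so no real settings.json is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "settings.json"
    demo_path.write_text(json.dumps({"transcribe_video": True, "feature_number": 20}))
    updated = change_setting(demo_path, "transcribe_video", "False")
    saved = json.loads(demo_path.read_text())

print(type(updated["transcribe_video"]))
```

Rejecting unknown keys guards against typos such as video_transcribe silently creating a new, unused setting.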