readme updates (#338)
laves authored Nov 30, 2023
1 parent 1e1463f commit 103bf00
Showing 24 changed files with 335 additions and 201 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/c-demos.yml
@@ -5,9 +5,9 @@ on:
   push:
     branches: [ master ]
     paths:
-      - '!demo/c/README.md'
       - '.github/workflows/c-demos.yml'
       - 'demo/c/**'
+      - '!demo/c/README.md'
       - 'include/**'
       - 'lib/common/**'
       - 'lib/jetson/**'
@@ -20,9 +20,9 @@ on:
   pull_request:
     branches: [ master, 'v[0-9]+.[0-9]+' ]
     paths:
-      - '!demo/c/README.md'
       - '.github/workflows/c-demos.yml'
       - 'demo/c/**'
+      - '!demo/c/README.md'
       - 'include/**'
       - 'lib/common/**'
       - 'lib/jetson/**'
6 changes: 4 additions & 2 deletions .github/workflows/react-native-tests.yml
@@ -4,17 +4,19 @@ on:
   push:
     branches: [ master ]
     paths:
+      - '.github/workflows/react-native-tests.yml'
       - 'binding/react-native/**'
+      - '!binding/react-native/README.md'
       - 'lib/common/**'
-      - '.github/workflows/react-native-tests.yml'
       - 'resources/audio_samples/**'
       - 'resources/.test/**'
   pull_request:
     branches: [ master, 'v[0-9]+.[0-9]+' ]
     paths:
+      - '.github/workflows/react-native-tests.yml'
       - 'binding/react-native/**'
+      - '!binding/react-native/README.md'
       - 'lib/common/**'
-      - '.github/workflows/react-native-tests.yml'
       - 'resources/audio_samples/**'
       - 'resources/.test/**'
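
Both workflow diffs above place the `!` README exclusion after the broader `**` pattern. GitHub Actions evaluates `paths` patterns in order, so an exclusion listed before a matching inclusion is overridden by it. The sketch below illustrates that ordering rule in Python. `path_triggers` is a hypothetical helper, and `fnmatch` only approximates GitHub's glob syntax (its `*` also matches `/`), so this is an illustration of the rule rather than the real matcher.

```python
from fnmatch import fnmatchcase


def path_triggers(path, patterns):
    """Walk the patterns in order; the last pattern that matches decides."""
    included = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(path, glob):
            included = not negated
    return included


# Pattern order before and after the c-demos.yml change (first three entries only).
old_order = ["!demo/c/README.md", ".github/workflows/c-demos.yml", "demo/c/**"]
new_order = [".github/workflows/c-demos.yml", "demo/c/**", "!demo/c/README.md"]

print(path_triggers("demo/c/README.md", old_order))       # exclusion is overridden
print(path_triggers("demo/c/README.md", new_order))       # exclusion takes effect
print(path_triggers("demo/c/leopard_demo.c", new_order))  # source files still match
```

With the old order, a README-only change would still match `demo/c/**` afterwards; with the new order the exclusion is evaluated last and wins.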

121 changes: 68 additions & 53 deletions README.md
@@ -79,10 +79,10 @@ pip3 install pvleoparddemo
 Run the following in the terminal:

 ```bash
-leopard_demo_file --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_PATH}
+leopard_demo_file --access_key ${ACCESS_KEY} --audio_paths ${AUDIO_FILE_PATH}
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
 wish to transcribe.

 ### C Demo
@@ -96,12 +96,12 @@ cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build
 Run the demo:

 ```console
-./demo/c/build/leopard_demo -a ${ACCESS_KEY} -l ${LIBRARY_PATH} -m ${MODEL_PATH} ${AUDIO_PATH}
+./demo/c/build/leopard_demo -a ${ACCESS_KEY} -l ${LIBRARY_PATH} -m ${MODEL_FILE_PATH} ${AUDIO_FILE_PATH}
 ```

 Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${LIBRARY_PATH}` with the path to appropriate
-library under [lib](/lib), `${MODEL_PATH}` to path to [default model file](./lib/common/leopard_params.pv)
-(or your custom one), and `${AUDIO_PATH}` with a path to an audio file you wish to transcribe.
+library under [lib](/lib), `${MODEL_FILE_PATH}` to path to [default model file](./lib/common/leopard_params.pv)
+(or your custom one), and `${AUDIO_FILE_PATH}` with a path to an audio file you wish to transcribe.

 ### iOS Demo

@@ -132,10 +132,10 @@ yarn global add @picovoice/leopard-node-demo
 Run the following in the terminal:

 ```console
-leopard-file-demo --access_key ${ACCESS_KEY} --input_audio_file_path ${AUDIO_PATH}
+leopard-file-demo --access_key ${ACCESS_KEY} --input_audio_file_path ${AUDIO_FILE_PATH}
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
 wish to transcribe.

 For more information about Node.js demos go to [demo/nodejs](./demo/nodejs).
@@ -165,10 +165,10 @@ The demo requires `cgo`, which on Windows may mean that you need to install a gc
 From [demo/go](./demo/go) run the following command from the terminal to build and run the file demo:

 ```console
-go run filedemo/leopard_file_demo.go -access_key "${ACCESS_KEY}" -input_audio_path "${AUDIO_PATH}"
+go run filedemo/leopard_file_demo.go -access_key "${ACCESS_KEY}" -input_audio_path "${AUDIO_FILE_PATH}"
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you wish to transcribe.
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you wish to transcribe.

 For more information about Go demos go to [demo/go](./demo/go).

@@ -202,10 +202,10 @@ From [demo/java](./demo/java) run the following commands from the terminal to bu
 cd demo/java
 ./gradlew build
 cd build/libs
-java -jar leopard-file-demo.jar -a ${ACCESS_KEY} -i ${AUDIO_PATH}
+java -jar leopard-file-demo.jar -a ${ACCESS_KEY} -i ${AUDIO_FILE_PATH}
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you wish to transcribe.
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you wish to transcribe.

 For more information about Java demos go to [demo/java](./demo/java).

@@ -217,10 +217,10 @@ file or on real-time microphone input.
 From [demo/dotnet/LeopardDemo](./demo/dotnet/LeopardDemo) run the following in the terminal:

 ```console
-dotnet run -c FileDemo.Release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_PATH}
+dotnet run -c FileDemo.Release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_FILE_PATH}
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
 wish to transcribe.

 For more information about .NET demos, go to [demo/dotnet](./demo/dotnet).
@@ -233,10 +233,10 @@ file or on real-time microphone input.
 From [demo/rust/filedemo](./demo/rust/filedemo) run the following in the terminal:

 ```console
-cargo run --release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_PATH}
+cargo run --release -- --access_key ${ACCESS_KEY} --input_audio_path ${AUDIO_FILE_PATH}
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_PATH}` with a path to an audio file you
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${AUDIO_FILE_PATH}` with a path to an audio file you
 wish to transcribe.

 For more information about Rust demos, go to [demo/rust](./demo/rust).
@@ -294,14 +294,14 @@ Create an instance of the engine and transcribe an audio file:
 ```python
 import pvleopard

-handle = pvleopard.create(access_key='${ACCESS_KEY}')
+leopard = pvleopard.create(access_key='${ACCESS_KEY}')

-print(handle.process_file('${AUDIO_PATH}'))
+print(leopard.process_file('${AUDIO_FILE_PATH}'))
 ```

 Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and
-`${AUDIO_PATH}` to path an audio file. Finally, when done be sure to explicitly release the resources using
-`handle.delete()`.
+`${AUDIO_FILE_PATH}` to path an audio file. Finally, when done be sure to explicitly release the resources using
+`leopard.delete()`.

 ### C

@@ -314,17 +314,29 @@ Create an instance of the engine and transcribe an audio file:

 #include "pv_leopard.h"

-pv_leopard_t *handle = NULL;
-bool automatic_punctuation = false;
-pv_status_t status = pv_leopard_init("${ACCESS_KEY}", "${MODEL_PATH}", automatic_punctuation, &handle);
+pv_leopard_t *leopard = NULL;
+bool enable_automatic_punctuation = false;
+bool enable_speaker_diarization = false;
+
+pv_status_t status = pv_leopard_init(
+        "${ACCESS_KEY}",
+        "${MODEL_FILE_PATH}",
+        enable_automatic_punctuation,
+        enable_speaker_diarization,
+        &leopard);
 if (status != PV_STATUS_SUCCESS) {
     // error handling logic
 }

 char *transcript = NULL;
 int32_t num_words = 0;
 pv_word_t *words = NULL;
-status = pv_leopard_process_file(handle, "${AUDIO_PATH}", &transcript, &num_words, &words);
+status = pv_leopard_process_file(
+        leopard,
+        "${AUDIO_FILE_PATH}",
+        &transcript,
+        &num_words,
+        &words);
 if (status != PV_STATUS_SUCCESS) {
     // error handling logic
 }
@@ -333,20 +345,21 @@ fprintf(stdout, "%s\n", transcript);
 for (int32_t i = 0; i < num_words; i++) {
     fprintf(
             stdout,
-            "[%s]\t.start_sec = %.1f .end_sec = %.1f .confidence = %.2f\n",
+            "[%s]\t.start_sec = %.1f .end_sec = %.1f .confidence = %.2f .speaker_tag = %d\n",
             words[i].word,
             words[i].start_sec,
             words[i].end_sec,
-            words[i].confidence);
+            words[i].confidence,
+            words[i].speaker_tag);
 }

 pv_leopard_transcript_delete(transcript);
 pv_leopard_words_delete(words);
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_PATH}` to path to
-[default model file](./lib/common/leopard_params.pv) (or your custom one), and `${AUDIO_PATH}` to path an audio file.
-Finally, when done be sure to release resources acquired using `pv_leopard_delete(handle)`.
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_FILE_PATH}` to path to
+[default model file](./lib/common/leopard_params.pv) (or your custom one), and `${AUDIO_FILE_PATH}` to path an audio file.
+Finally, when done be sure to release resources acquired using `pv_leopard_delete(leopard)`.
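
The C hunk above adds a per-word `speaker_tag` to the printed output. As a rough illustration of how such tags might be consumed, the Python sketch below merges consecutive words that share a tag into speaker turns. The `Word` dataclass is a hypothetical mirror of the fields printed by the C loop, not a type from any Leopard binding, and the sample values are made up. It assumes only that `speaker_tag` is an integer label per word, as the `%d` format suggests.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Word:
    # Hypothetical mirror of the fields printed in the C example.
    word: str
    start_sec: float
    end_sec: float
    confidence: float
    speaker_tag: int


def speaker_turns(words):
    """Merge consecutive words with the same speaker_tag into
    (speaker_tag, start_sec, end_sec, text) tuples."""
    turns = []
    for tag, group in groupby(words, key=lambda w: w.speaker_tag):
        chunk = list(group)
        text = " ".join(w.word for w in chunk)
        turns.append((tag, chunk[0].start_sec, chunk[-1].end_sec, text))
    return turns


# Made-up sample data, not real engine output.
words = [
    Word("hello", 0.0, 0.4, 0.98, 1),
    Word("there", 0.5, 0.9, 0.95, 1),
    Word("hi", 1.2, 1.4, 0.97, 2),
    Word("again", 2.0, 2.5, 0.90, 1),
]

for tag, start, end, text in speaker_turns(words):
    print(f"speaker {tag} [{start:.1f}s - {end:.1f}s]: {text}")
```

`groupby` only merges adjacent items, which is the intended behavior here: a speaker who talks again later starts a new turn.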
### iOS
@@ -393,20 +406,20 @@ Create an instance of the engine and transcribe an audio file:
 import ai.picovoice.leopard.*;

 final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
-final String modelPath = "${MODEL_FILE}";
+final String modelPath = "${MODEL_FILE_PATH}";
 try {
-    Leopard handle = new Leopard.Builder()
+    Leopard leopard = new Leopard.Builder()
         .setAccessKey(accessKey)
         .setModelPath(modelPath)
         .build(appContext);

     File audioFile = new File("${AUDIO_FILE_PATH}");
-    LeopardTranscript transcript = handle.processFile(audioFile.getAbsolutePath());
+    LeopardTranscript transcript = leopard.processFile(audioFile.getAbsolutePath());

 } catch (LeopardException ex) { }
 ```

-Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_FILE}` with a custom trained model from [console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.
+Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_FILE_PATH}` with a custom trained model from [console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.

 ### Node.js

@@ -421,19 +434,19 @@ Create instances of the Leopard class:
 ```javascript
 const Leopard = require("@picovoice/leopard-node");
 const accessKey = "${ACCESS_KEY}" // Obtained from the Picovoice Console (https://console.picovoice.ai/)
-let handle = new Leopard(accessKey);
+let leopard = new Leopard(accessKey);

-const result = engineInstance.processFile('${AUDIO_PATH}');
+const result = engineInstance.processFile('${AUDIO_FILE_PATH}');
 console.log(result.transcript);
 ```

 Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and
-`${AUDIO_PATH}` to path an audio file.
+`${AUDIO_FILE_PATH}` to path an audio file.

 When done, be sure to release resources using `release()`:

 ```javascript
-handle.release();
+leopard.release();
 ```

 ### Flutter
@@ -450,29 +463,29 @@ Create an instance of the engine and transcribe an audio file:
 ```dart
 import 'package:leopard/leopard.dart';

-const accessKey = "{ACCESS_KEY}" // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
+final String accessKey = '{ACCESS_KEY}' // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

 try {
-    Leopard _leopard = await Leopard.create(accessKey, '{LEOPARD_MODEL_PATH}');
+    Leopard _leopard = await Leopard.create(accessKey, '{MODEL_FILE_PATH}');
     LeopardTranscript result = await _leopard.processFile("${AUDIO_FILE_PATH}");
     print(result.transcript);
 } on LeopardException catch (err) { }
 ```

-Replace `${ACCESS_KEY}` with your `AccessKey` obtained from [Picovoice Console](https://console.picovoice.ai/), `${MODEL_FILE}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.
+Replace `${ACCESS_KEY}` with your `AccessKey` obtained from [Picovoice Console](https://console.picovoice.ai/), `${MODEL_FILE_PATH}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv), and `${AUDIO_FILE_PATH}` with the path to the audio file.

 ### Go

 Install the Go binding:

 ```console
-go get github.com/Picovoice/leopard/binding/go
+go get github.com/Picovoice/leopard/binding/go/v2
 ```

 Create an instance of the engine and transcribe an audio file:

 ```go
-import . "github.com/Picovoice/leopard/binding/go"
+import . "github.com/Picovoice/leopard/binding/go/v2"

 leopard = Leopard{AccessKey: "${ACCESS_KEY}"}
 err := leopard.Init()
@@ -481,7 +494,7 @@ if err != nil {
 }
 defer leopard.Delete()

-transcript, words, err := leopard.ProcessFile("${AUDIO_PATH}")
+transcript, words, err := leopard.ProcessFile("${AUDIO_FILE_PATH}")
 if err != nil {
     // handle process error
 }
@@ -490,7 +503,7 @@ log.Println(transcript)
 ```

 Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and
-`${AUDIO_PATH}` to path an audio file. Finally, when done be sure to explicitly release the resources using
+`${AUDIO_FILE_PATH}` to path an audio file. Finally, when done be sure to explicitly release the resources using
 `leopard.Delete()`.

 ### React Native
@@ -511,7 +524,7 @@ const getAudioFrame = () => {
 }

 try {
-  const leopard = await Leopard.create("${ACCESS_KEY}", "${MODEL_FILE}")
+  const leopard = await Leopard.create("${ACCESS_KEY}", "${MODEL_FILE_PATH}")
   const { transcript, words } = await leopard.processFile("${AUDIO_FILE_PATH}")
   console.log(transcript)
 } catch (err: any) {
@@ -521,7 +534,7 @@ try {
 }
 ```

-Replace `${ACCESS_KEY}` with your `AccessKey` obtained from Picovoice Console, `${MODEL_FILE}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv) and `${AUDIO_FILE_PATH}` with the absolute path of the audio file.
+Replace `${ACCESS_KEY}` with your `AccessKey` obtained from Picovoice Console, `${MODEL_FILE_PATH}` with a custom trained model from [Picovoice Console](https://console.picovoice.ai/) or the [default model](./lib/common/leopard_params.pv) and `${AUDIO_FILE_PATH}` with the absolute path of the audio file.
 When done be sure to explicitly release the resources using `leopard.delete()`.

 ### Java
@@ -541,14 +554,14 @@ final String accessKey = "${ACCESS_KEY}";

 try {
     Leopard leopard = new Leopard.Builder().setAccessKey(accessKey).build();
-    LeopardTranscript result = leopard.processFile("${AUDIO_PATH}");
+    LeopardTranscript result = leopard.processFile("${AUDIO_FILE_PATH}");
     leopard.delete();
 } catch (LeopardException ex) { }

-System.out.println(transcript);
+System.out.println(result.getTranscriptString());
 ```

-Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and `${AUDIO_PATH}` to the path an audio file. Finally, when done be sure to explicitly release the resources using `leopard.delete()`.
+Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/) and `${AUDIO_FILE_PATH}` to the path an audio file. Finally, when done be sure to explicitly release the resources using `leopard.delete()`.

 ### .NET

@@ -564,14 +577,14 @@ Create an instance of the engine and transcribe an audio file:
 using Pv;

 const string accessKey = "${ACCESS_KEY}";
-const string audioPath = "/absolute/path/to/audio_file";
+const string audioPath = "${AUDIO_FILE_PATH}";

-Leopard handle = Leopard.Create(accessKey);
+Leopard leopard = Leopard.Create(accessKey);

-Console.Write(handle.ProcessFile(audioPath));
+Console.Write(leopard.ProcessFile(audioPath));
 ```

-Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/). Finally, when done release the resources using `handle.Dispose()`.
+Replace `${ACCESS_KEY}` with yours obtained from [Picovoice Console](https://console.picovoice.ai/). Finally, when done release the resources using `leopard.Dispose()`.
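
Several hunks in this diff end with a reminder to release the engine explicitly (`leopard.delete()`, `leopard.Dispose()`, `leopard.release()`). One way to make sure that happens even when transcription raises is a `try`/`finally` wrapper, sketched here in Python. It assumes only the `process_file` and `delete` methods used in the Python hunk; `transcribe_and_release` is a hypothetical helper, and `FakeEngine` is a stand-in so the snippet runs without an AccessKey (a real engine would come from `pvleopard.create(...)`).

```python
def transcribe_and_release(engine, audio_path):
    """Run one transcription and always release the engine afterwards."""
    try:
        return engine.process_file(audio_path)
    finally:
        # Runs on success and on error, so native resources are not leaked.
        engine.delete()


class FakeEngine:
    """Stand-in for illustration only; not part of any Leopard binding."""

    def __init__(self):
        self.released = False

    def process_file(self, path):
        if not path:
            raise ValueError("empty audio path")
        return f"transcript of {path}"

    def delete(self):
        self.released = True


engine = FakeEngine()
print(transcribe_and_release(engine, "sample.wav"))
print(engine.released)
```

The helper releases the engine after a single call, so it suits one-shot scripts; a long-lived service would keep the engine open and call `delete()` at shutdown instead.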

### Rust

@@ -698,8 +711,10 @@ function App(props) {

 ## Releases

-### v2.0.0 - November 27th, 2023
+### v2.0.0 - November 30th, 2023

+- Added speaker diarization feature
+- Added React SDK
 - Improvements to error reporting
 - Upgrades to authorization and authentication system
 - Improved engine accuracy