DANE ASR and download workers add timing + limited prov info to DANE Results #70

Open
jblom opened this issue Oct 18, 2022 · 1 comment
Labels: enhancement (New feature or request)

Comments


jblom commented Oct 18, 2022

To prepare for a full provenance chain, the ASR worker and the download worker should start adding easy-to-obtain provenance and timing information to their DANE Results.

Besides feeding the desired provenance model (e.g. for informing researchers), storing this information is also very useful for more precise debugging of the DANE ASR workflow.

jblom self-assigned this Oct 18, 2022

jblom commented Oct 18, 2022

Update

The ASR worker now stores the following information in each DANE Result:

    asr_processing_time: float  # retrieved via submit_asr_job()
    download_time: float  # retrieved via dane-beng-download-worker or download_content()
    kaldi_nl_version: str = "Kaldi-NL v0.4.1"  # default for now
    kaldi_nl_git_url: str = (
        "https://github.com/opensource-spraakherkenning-nl/Kaldi_NL"  # default for now
    )

The code has already been merged, but it has not yet been tested in a real workflow.
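
For illustration only, the fields listed above could be grouped and filled roughly as sketched here. This is not the merged worker code: the class name `AsrProvenance`, the helper `timed()`, and the stand-in functions `fake_download()` and `fake_asr()` are hypothetical, and measuring the times in wall-clock seconds is an assumption.

```python
import time
from dataclasses import asdict, dataclass
from typing import Any, Callable, Tuple


@dataclass
class AsrProvenance:  # hypothetical name, not taken from the worker code
    asr_processing_time: float  # assumed to be wall-clock seconds
    download_time: float  # assumed to be wall-clock seconds
    kaldi_nl_version: str = "Kaldi-NL v0.4.1"  # default from the issue
    kaldi_nl_git_url: str = (
        "https://github.com/opensource-spraakherkenning-nl/Kaldi_NL"
    )


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-ins for the real download and ASR steps (assumptions for this sketch).
def fake_download() -> str:
    return "input.mp4"


def fake_asr(path: str) -> str:
    return f"transcript of {path}"


path, download_time = timed(fake_download)
transcript, asr_time = timed(fake_asr, path)

provenance = AsrProvenance(asr_processing_time=asr_time, download_time=download_time)
result_payload = asdict(provenance)  # plain dict that could be stored with a result
```

Keeping the timing logic in one small helper means both the download step and the ASR step are measured the same way, which should make the two numbers comparable when debugging a slow workflow.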
