A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

Ars-Linguistica/mlconjug3

mlconjug3: The multi-lingual conjugator

A Command Line application and Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese, and Romanian (with more languages soon to come) using Machine Learning techniques. 🧠

The mlconjug3 project is now a proud member of the ARS Linguistica organization. 🤝 ARS Linguistica is a community-driven, open-source project that aims to develop free and accessible linguistic tools and resources for all. 🌍 With a focus on advancing linguistic research, documentation, and education, ARS Linguistica is dedicated to preserving and promoting linguistic diversity through the use of open source and open science. 💡

With mlconjug3, you can:

  • Conjugate any verb in one of the supported languages, even completely new or made-up verbs, with the help of a pre-trained Machine Learning model. 💪
  • Easily modify and retrain the models using any compatible classifiers from scikit-learn. 🔧
  • Integrate mlconjug3 in your own projects. 🧬
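As a minimal sketch of the first and last points, the snippet below conjugates a French verb and prints a few forms. It assumes the `Conjugator` class, its `conjugate` method, and the `iterate` method of the returned verb object behave as in the library's usage documentation; `format_forms` and `conjugate_verb` are hypothetical helpers written for this example, and the tuple layout (labels first, conjugated form last) is an assumption.

```python
from typing import Iterable, List, Sequence


def format_forms(forms: Iterable[Sequence[object]]) -> List[str]:
    """Render tuples such as (mood, tense, person, form) as readable lines.

    Assumption: the last element of each tuple is the conjugated form and
    everything before it is a label (mood, tense, and optionally person).
    """
    lines = []
    for entry in forms:
        *labels, form = entry
        lines.append(f"{' / '.join(str(label) for label in labels)}: {form}")
    return lines


def conjugate_verb(verb: str, language: str = "fr") -> List[str]:
    """Conjugate `verb` with the pre-trained model for `language`."""
    # Imported lazily so format_forms stays usable without mlconjug3 installed.
    import mlconjug3

    conjugator = mlconjug3.Conjugator(language=language)
    return format_forms(conjugator.conjugate(verb).iterate())


if __name__ == "__main__":
    try:
        for line in conjugate_verb("parler")[:6]:
            print(line)
    except ImportError:
        print("mlconjug3 is not installed; see the Installation section.")
```

Keeping the formatting separate from the library call makes it easier to reuse the same output code if the conjugation step is later swapped for a retrained model.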

Using mlconjug3 in Academic Research

mlconjug3 is a valuable tool for linguistic researchers, as it provides conjugation information for all six supported languages. 🧪 Because it can handle completely new or made-up verbs, mlconjug3 is well suited to exploring new linguistic concepts and theories. 🔍 It can also be used to compare and contrast conjugation patterns across languages, helping researchers identify and understand linguistic trends.

Integrating mlconjug3 in Applications

In addition to academic research, mlconjug3 can be integrated into a wide range of web and desktop applications. 💻 For language learning platforms, mlconjug3 provides an accurate and comprehensive source of conjugation information, helping students to quickly and easily master verb conjugation. 📚 For language translation tools, mlconjug3 can help to ensure that translations are grammatically correct, by providing accurate verb conjugation information in real-time. 💬

By using mlconjug3, you are not only getting a powerful and flexible tool for verb conjugation, but you are also supporting the goals and mission of ARS Linguistica. 🙌 Whether you are a linguistic researcher, language teacher, or simply someone who is passionate about preserving linguistic heritage, your support is crucial to the success of our organization.

Join us in our mission to make linguistic tools and resources accessible to all! 💪


Conjugation for the verb to be.


Supported Languages

  • French
  • English
  • Spanish
  • Italian
  • Portuguese
  • Romanian

Academic publications citing mlconjug3

BibTeX

If you want to cite mlconjug3 in an academic publication, use this citation format:

@article{mlconjug3,
  title={mlconjug3},
  author={Sekou Diao},
  journal={GitHub. Note: https://github.com/Ars-Linguistica/mlconjug3},
  year={2023}
}

Software projects using mlconjug3

  • EDS-NLP provides a set of spaCy components that are used to extract information from clinical notes written in French.
  • Translation flask API for the Helsinki NLP models available in the Huggingface Transformers library.
  • NLP Suite is a package of tools designed for non-specialists, for scholars with no knowledge or little knowledge of Natural Language Processing.
  • Runebook translates various references such as programming languages, frameworks, libraries, and APIs that software engineers refer to in development.
  • This project offers tools to visualize the gender bias in pre-trained language models to better understand the prejudices in the data.
  • This project uses language models to generate text that is well suited to the type of publication.
  • Dockerized microservice with REST API for conjugation of any verb in French and Spanish.
  • A tool to manage and transform HTML documents.
  • A Tux bot.
  • Tweets the words of the French language. Largely inspired by the @botducul (identical lexicon, but code in Python) and the @botsupervnr.
    Posts on @botduslip. Stores the position of the last tweeted word in a Redis database.
  • This project offers a tool to help learn different verbal forms.
  • A collection of common NLP tasks such as dataset parsing and explicit semantic extraction.
  • This project offers a model which recognizes COVID-19 masks.
  • Need an excuse for why you can't show up in your Zoom lectures? Just generate one here!
  • Repository to store Natural Language Processing models.
  • This is a simple virtual assistant. With it, you can search the Internet, access websites, open programs, and more using just your voice.
    This virtual assistant supports the English and Portuguese languages and has many settings that you can adjust to your liking.
  • This python module responds to yes or no questions. It dishes out its advice at random.
    Disclaimer: Do not actually act on this advice ;)
  • Python+Flask web app that uses mlconjug to dynamically generate foreign language conjugation questions.
  • A dwarf-fortress adventure mode-inspired rogue-like Pygame Python3 game.
  • A WebApp to learn Spanish.
  • Application for German-French vocabulary with simple GUI.

Installation

To install mlconjug3, you have multiple options:

Using pip:

This is the preferred method to install mlconjug3, as it will always install the most recent stable release.

To install mlconjug3, run this command in your terminal:

$ pip install mlconjug3

If you don't have pip installed, the Python installation guide can walk you through the process.

Using pipx:

Recommended for users who want to avoid conflicts with other Python packages.

$ pipx install mlconjug3

Using conda:

You can also install mlconjug3 by using Anaconda or Miniconda instead of pip. To install Anaconda or Miniconda, please follow the installation instructions on their respective websites. After having installed Anaconda or Miniconda, run these commands in your terminal:

$ conda config --add channels conda-forge
$ conda config --set channel_priority strict
$ conda install mlconjug3

If you already have Anaconda or Miniconda available on your system, just type this in your terminal:

$ conda install -c conda-forge mlconjug3

You can find detailed instructions for installing mlconjug3 on the Anaconda eco-system here: https://github.com/conda-forge/mlconjug3-feedstock#installing-mlconjug3

Warning

If you intend to install mlconjug3 on an Apple MacBook with an Apple M1, M2, or newer processor, it is advised that you use the conda installation method, as all dependencies will be pre-compiled.

From sources

The sources for mlconjug3 can be downloaded from the GitHub repo.

You can either clone the public repository:

$ git clone https://github.com/Ars-Linguistica/mlconjug3

Or download the tarball:

$ curl -OL https://github.com/Ars-Linguistica/mlconjug3/tarball/master

Once you have a copy of the source, change into the source directory and install it with:

$ python setup.py install

Alternatively, you can use poetry to install the software:

$ pip install poetry

$ poetry install
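Whichever method you used, a quick way to confirm that the package is visible to the interpreter you intend to use is to query its installed version. The sketch below relies only on the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "mlconjug3") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("mlconjug3 is not installed in this environment")
    else:
        print(f"mlconjug3 {found}")
```

Run it with the same interpreter (virtual environment, conda environment, or pipx-managed environment) that you installed into; a `None` result usually means the package went into a different environment.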

Signing of Releases

Starting with version 3.10, all versions of the mlconjug3 package released on PyPI and GitHub will be signed using sigstore. This is to ensure the authenticity and integrity of the package, and to provide an added layer of security for our users.

Signing a software package is a way to ensure that the package has not been tampered with and that it comes from a trusted source. This is important because malicious actors may try to tamper with a package by adding malware or other unwanted code, or by pretending to be the author of the package.

By signing mlconjug3 releases using sigstore, users can verify that the package they are downloading is the one that was created and uploaded by the package's author, Sekou Diao (diao.sekou.nlp@gmail.com), and that it has not been tampered with. This provides an additional layer of security for users and helps to ensure that they can trust the package they are using.

What is sigstore?

Sigstore is an open-source project that lets developers sign their software releases without managing long-lived private keys, and makes it easy for users to verify the authenticity of a package. The signer authenticates through an OpenID Connect identity provider and receives a short-lived certificate bound to that identity; the signature is then recorded in a public transparency log. Verification checks the signature against that certificate and the expected identity, which confirms that the package has not been tampered with and that it was indeed released by the developer who claims to have released it.

How to verify the signature of a release?

To verify the package, you can use the instructions provided below, which will show you how to check the package's signature and certificate using the python package sigstore, and also check for claims specific to GitHub Actions.

To verify a mlconjug3 release, the sigstore python module can be used. By default, sigstore verify will attempt to find a <filename>.sig and <filename>.crt in the same directory as the file being verified. For example, to verify the file mlconjug3-3.10.tar.gz, sigstore verify will look for mlconjug3-3.10.tar.gz.sig and mlconjug3-3.10.tar.gz.crt.
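The naming convention described above can be checked before running the verifier. The following sketch derives the expected `.sig` and `.crt` paths for a downloaded file and reports which ones are missing; `sidecar_paths` and `missing_sidecars` are hypothetical helper names, and the code only inspects the file system, it does not perform any cryptographic verification.

```python
from pathlib import Path
from typing import Dict, List


def sidecar_paths(artifact: Path) -> Dict[str, Path]:
    """Return the signature and certificate paths expected next to `artifact`.

    The suffixes are appended to the full file name, so
    mlconjug3-3.10.tar.gz maps to mlconjug3-3.10.tar.gz.sig and .crt.
    """
    return {
        "signature": artifact.with_name(artifact.name + ".sig"),
        "certificate": artifact.with_name(artifact.name + ".crt"),
    }


def missing_sidecars(artifact: Path) -> List[Path]:
    """List the expected sidecar files that do not exist on disk."""
    return [path for path in sidecar_paths(artifact).values() if not path.is_file()]


if __name__ == "__main__":
    release = Path("mlconjug3-3.10.tar.gz")
    for path in missing_sidecars(release):
        print(f"missing: {path}")
```

Note that `Path.with_suffix` would be the wrong tool here, because it replaces the final `.gz` instead of appending to it.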

To verify the signature, use the following command:

$ python -m sigstore verify identity mlconjug3-3.10.tar.gz \
    --cert-identity 'diao.sekou.nlp@gmail.com' \
    --cert-oidc-issuer 'https://github.com/login/oauth'

Multiple files can be verified at once:

$ python -m sigstore verify identity mlconjug3-3.10.tar.gz mlconjug3-3.10.0-py3-none-any.whl \
    --cert-identity 'diao.sekou.nlp@gmail.com' \
    --cert-oidc-issuer 'https://github.com/login/oauth'
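When verification is part of an automated download step, the same invocation can be assembled from Python. This is a sketch under stated assumptions: it uses only the subcommand and flags shown in the commands above, `build_verify_command` and `verify_release` are hypothetical helper names, and it assumes that `sigstore verify` signals failure through a non-zero exit code.

```python
import subprocess
import sys
from typing import List, Sequence


def build_verify_command(
    files: Sequence[str], cert_identity: str, oidc_issuer: str
) -> List[str]:
    """Assemble the `python -m sigstore verify identity` argument list."""
    if not files:
        raise ValueError("at least one file is required")
    return [
        sys.executable, "-m", "sigstore", "verify", "identity",
        *files,
        "--cert-identity", cert_identity,
        "--cert-oidc-issuer", oidc_issuer,
    ]


def verify_release(files: Sequence[str], cert_identity: str, oidc_issuer: str) -> bool:
    """Run the verifier; assumes a zero exit code means every file verified."""
    command = build_verify_command(files, cert_identity, oidc_issuer)
    return subprocess.run(command, check=False).returncode == 0


if __name__ == "__main__":
    # Print the command only; call verify_release() once the files are downloaded.
    print(" ".join(build_verify_command(
        ["mlconjug3-3.10.tar.gz", "mlconjug3-3.10.0-py3-none-any.whl"],
        "diao.sekou.nlp@gmail.com",
        "https://github.com/login/oauth",
    )))
```

Passing the arguments as a list avoids shell quoting issues with the identity and issuer values.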

If the signature and certificate files are at different paths, they can be specified explicitly (but only for one file at a time):

$ python -m sigstore verify identity mlconjug3-3.10.tar.gz \
    --certificate some/other/path/mlconjug3-3.10.crt \
    --signature some/other/path/mlconjug3-3.10.sig \
    --cert-identity 'diao.sekou.nlp@gmail.com' \
    --cert-oidc-issuer 'https://github.com/login/oauth'

Verifying signatures from GitHub Actions:

$ python -m sigstore verify github mlconjug3-3.10.tar.gz \
    --certificate mlconjug3-3.10.tar.gz.crt \
    --signature mlconjug3-3.10.tar.gz.sig \
    --cert-identity https://github.com/diao.sekou.nlp/mlconjug3/.github/workflows/sign_and_publish.yml@refs/tags/v3.10.0

GitHub Actions specific claims can also be verified by adding flags such as --trigger, --sha, --name, --repository, and --ref.

Please note that these are examples and the exact file names and paths may vary depending on the version and distribution of mlconjug3 being verified. It is important to ensure that the correct signature and certificate files are being used for verification.

Credits

This package was created with the help of Verbiste and scikit-learn.

The logo was designed by Zuur.