This repository is archived as of October 2024. While it was a fun project, it turned out that scaling this application beyond a certain complexity in Python is just not feasible. I do think there is value in the project, but it would require a complete rewrite in a programming language better suited to this use case.
This application's goal is to help people learn languages while conversing with native speakers. It is not a standalone language-learning app; instead, it aims to provide translations for everyday phrases while explaining the grammatical structure and vocabulary of those sentences.
lingolift uses a mixture of Generative AI and Natural Language Processing (NLP) to perform translation and sentence analysis. For example, both the idiomatic translation of the input sentence and the literal translations are generated by an LLM (currently using the OpenAI API); however, the syntactical analysis of sentences is largely achieved using the spaCy library.
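To give a rough idea of the generative side, here is a minimal sketch of how an idiomatic translation could be requested from the OpenAI API. The prompt wording and model name are illustrative assumptions, not lingolift's actual prompts or configuration.

```python
# Minimal sketch of the LLM-backed translation step. The prompt wording and
# model name are assumptions for illustration; lingolift's actual prompts
# and configuration differ.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def translate_idiomatically(sentence: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, not necessarily what lingolift uses
        messages=[
            {
                "role": "system",
                "content": "Translate the user's sentence into natural, idiomatic English.",
            },
            {"role": "user", "content": sentence},
        ],
    )
    return response.choices[0].message.content


print(translate_idiomatically("Das ist nicht mein Bier."))
```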
As of now, lingolift can do the following:
- Auto-detect the language of the input sentence
- Translate sentences from other languages to English
- Provide a literal translation of each word in the input sentence (up to a certain sentence length)
- Provide a coherent syntactical analysis of the input sentence based on part-of-speech tagging (see the spaCy sketch after this list)
- Provide response suggestions for the user to continue the conversation
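For the syntactical analysis, spaCy's part-of-speech tagging and morphological features do most of the heavy lifting. The snippet below is an illustrative sketch of that kind of analysis, not lingolift's actual code; it assumes the German model `de_core_news_sm` has been downloaded.

```python
# Illustrative sketch of spaCy-based part-of-speech tagging; this is not
# lingolift's actual analysis code. Assumes the German model has been
# installed via: python -m spacy download de_core_news_sm
import spacy

nlp = spacy.load("de_core_news_sm")
doc = nlp("Ich habe gestern ein Buch gelesen.")

for token in doc:
    # token.pos_ is the coarse part-of-speech tag; token.morph contains
    # morphological features such as tense, case and number.
    print(f"{token.text:<12} {token.pos_:<6} {token.morph}")
```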
Currently, those features can be accessed via chatbot-like UIs on both the Streamlit Community Cloud and Telegram.
At the moment, I'm working on error detection. I'm also looking to move away from language detection and instead focus on specific languages: language detection is difficult and takes time, and this application won't work equally well for all languages anyway, so it makes more sense to support a few languages and make them work well.
I am hosting an instance of the application on the Streamlit Community Cloud and on Telegram. The backend, as defined in this repository, is hosted as a set of serverless functions on AWS Lambda, abstracted behind an API Gateway.
You can run lingolift locally as a dockerized Flask server. To do so, you need to have Docker installed on your machine. You can simply pull a pre-built Docker image (amd64 only) for a given language from Docker Hub:
```bash
docker pull tobiaswaslowski/lingolift-webserver-de:latest
docker run -p 5001:5001 -e OPENAI_API_KEY="$OPENAI_API_KEY" tobiaswaslowski/lingolift-webserver-de:latest
```
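Once the container is running, you can send requests to it from Python. Note that the endpoint path and payload shape below are hypothetical placeholders for illustration; consult the Flask routes in this repository for the actual API.

```python
# Hypothetical usage sketch: the endpoint path and payload shape below are
# placeholders for illustration only; check the Flask routes in this
# repository for the actual API.
import requests

response = requests.post(
    "http://localhost:5001/translation",  # placeholder endpoint name
    json={"sentence": "Wie geht es dir?"},  # placeholder payload shape
    timeout=30,
)
print(response.status_code)
print(response.json())
```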
Note that this image can only perform syntactical analysis for German. I host another model for the Russian language (`tobiaswaslowski/lingolift-webserver-ru`); if you would like images for other languages, you have to build them yourself, which is not terribly difficult. You can build an image for a given language with the following commands:
```bash
# Build the image for the Spanish language
# Retrieve model id here: https://spacy.io/models
./do build_webserver --spacy_model es_core_news_sm --source-lang es
./do run_webserver es
```
The easiest option to interact with the provided endpoints is to clone the Streamlit-based frontend and run it locally:
```bash
git clone git@github.com:twaslowski/lingolift-frontend.git && cd lingolift-frontend
poetry install --no-root
./do run
```
All contributions are welcome! If you want to contribute, please fork the repository and create a pull request.
You can run tests with `./do test` and perform linting, import sorting and formatting with `./do pc` or `pre-commit run --all-files`.
The codebase for this project is split into four distinct repositories:

- This repository (the one you are currently in), which provides the backend functionality.
- The primary frontend, hosted in the lingolift-frontend repository.
- The Telegram bot, hosted in the lingolift-telegram-bot repository.
- A shared repository containing client functionality for accessing the API provided here, as well as models for all tasks to ensure type safety.