Releases: Hyuto/indo-nlp
v0.3.4
Changelog
Bug Fixing
Fixed and replaced the inner pattern in the replace_word_elongation function.
Updates
- Added a wkwk pattern to the stopwords.
- Used inner (inline) flags for case-insensitive matching.
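As an illustration of inline flags for case-insensitive matching, here is a minimal sketch using only Python's standard `re` module. The pattern and the `strip_laughter` helper are hypothetical and are not taken from the indoNLP source; they only show how a leading `(?i)` replaces a separate `re.IGNORECASE` argument.

```python
import re

# Hypothetical pattern for laughter tokens such as "wkwk"; NOT the library's actual pattern.
# The leading (?i) is an inline flag: it makes the whole pattern case-insensitive
# without passing re.IGNORECASE to re.compile.
LAUGH_PATTERN = re.compile(r"(?i)\b(?:wk)+w?\b")


def strip_laughter(text: str) -> str:
    """Remove laughter tokens and collapse the leftover whitespace."""
    without_laughter = LAUGH_PATTERN.sub(" ", text)
    return " ".join(without_laughter.split())


print(strip_laughter("lucu banget WKWKwk sih"))
```

Because the flag lives inside the pattern string, the pattern can be stored as plain data and still match regardless of letter case.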
v.0.3.3
Changelog
Bug Fixing
Fixed a bug in the slang-word regex pattern caused by a common string in SLANG_DATA.
v0.3.2
Changelog
Update
- Changed the return value of the Dataset.read method to the dataclass Data.
- Updated the docstrings in the code.
v.0.3.1
Changelog
Documentation 📝
Created a documentation site for indoNLP using mkdocs with the mkdocs-material theme, with code references generated automatically by mkdocstrings.
Visit indoNLP website
- Fixed code and changed the code docstrings to Bahasa Indonesia
- Merged docs into master and deployed the site using a GitHub Action
Bug Fixing
- Fixed the top-level import in indoNLP/__init__.py
- Fixed the inconsistent return value of indoNLP.dataset.reader.txt_table_reader
v0.3.0
Changelog
New Feature
Dataset 📖
New module indoNLP.dataset, which provides an easy way to access open Indonesian datasets for NLP.
v0.2.0
Changelog
Bug Fixing
Fixed a bug in preprocessing.replace_word_elongation so that it only replaces repeating characters at the end of words.
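The end-of-word behaviour described in this fix can be sketched with a backreference and a word boundary. This is an assumption-laden illustration built on the standard `re` module, not the implementation in indoNLP; the name `squeeze_trailing` is hypothetical.

```python
import re

# ([^\W\d_]) captures a single letter (word characters minus digits and underscore),
# \1+ matches one or more repeats of that same letter,
# \b requires the run to stop exactly at the end of the word.
TRAILING_REPEAT = re.compile(r"([^\W\d_])\1+\b")


def squeeze_trailing(text: str) -> str:
    """Collapse a repeated trailing letter to one, leaving mid-word repeats alone."""
    return TRAILING_REPEAT.sub(r"\1", text)


print(squeeze_trailing("kenapaaa maaf mantappp!!!"))
```

A purely pattern-based approach like this would also shorten words that legitimately end in a doubled letter, so a real implementation would likely need extra checks.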
New Feature
Emoji Supports 🤗
Text containing emoji can now be preprocessed with the functions:
- emoji_to_words
- words_to_emoji
v.0.1.1
Changelog
Renamed preprocessing.pipline to preprocessing.pipeline.
v0.1.0
Initial Release
Created the preprocessing module, which consists of several common utility functions:
- preprocessing.remove_html
- preprocessing.remove_url
- preprocessing.remove_stopwords
- preprocessing.replace_slang
- preprocessing.replace_word_elongation
- preprocessing.pipeline
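To show how a pipeline of text utilities like those listed above can be chained, here is a small sketch in plain Python. Everything in it (`make_pipeline`, `strip_html`, `strip_urls`, `squeeze_spaces`) is a hypothetical stand-in written for illustration; it does not reproduce the signatures or behaviour of the indoNLP functions.

```python
import re
from functools import reduce
from typing import Callable, Sequence

TextFn = Callable[[str], str]


def make_pipeline(steps: Sequence[TextFn]) -> TextFn:
    """Return a function that applies each step to the text, in order."""
    def run(text: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, text)
    return run


# Simplified stand-ins for illustration only; NOT the library's implementations.
def strip_html(text: str) -> str:
    return re.sub(r"<[^>]+>", "", text)


def strip_urls(text: str) -> str:
    return re.sub(r"https?://\S+", "", text)


def squeeze_spaces(text: str) -> str:
    return " ".join(text.split())


clean = make_pipeline([strip_html, strip_urls, squeeze_spaces])
print(clean("<p>Baca di https://example.com sekarang</p>"))
```

Keeping every step as a function from string to string lets steps be reordered or dropped without changing the pipeline itself.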