Releases: INGEOTEC/dialectid

Version - 0.0.5b

09 Sep 14:15
c3a1f82

DenseBoW can encode a text as a matrix in which the number of columns corresponds to the number of tokens, and each row holds the decision values of the Support Vector Machine trained to identify the token that row represents.

Version - 0.0.5

09 Sep 14:07
268c598

DenseBoW can encode a text as a matrix in which the number of columns corresponds to the number of tokens, and each row holds the decision values of the Support Vector Machine trained to identify the token that row represents.

Version - 0.0.4

02 Aug 15:52
268c598

This version includes the subwords model, which is now the default in DialectId.

Version - 0.0.3

19 Jun 03:10
f611d5b

This version fixes a typo affecting the French-speaking countries.

Version - 0.0.2

15 Jun 14:17
3153dda

This version adds the DialectId class, which identifies the dialect (country) of a text given its language.

Version - 0.0.1

05 Jun 16:00
3f5518a

The first version of dialectid. It has a Bag of Words (BoW) model whose weights were estimated on about 4 million ($2^{22}$) tweets selected uniformly from the Spanish-speaking countries.

Data

05 Jun 14:41
23e4cff

This release exists only to provide a place to store the dialectid data.