Releases · INGEOTEC/dialectid

09 Sep 14:15

mgraffg

v0.0.5b

c3a1f82

Version - 0.0.5b Latest

DenseBoW can encode a text as a matrix with one column per token in the text; each row holds the decision values of a Support Vector Machine trained to identify the token that row represents.
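
The matrix layout described in this note can be illustrated with a minimal NumPy sketch. It is not dialectid's API: the function `encode_tokens`, the random `weights` and `bias`, and the assumption that each token is already represented by a feature vector are all hypothetical, chosen only to show how linear SVM decision values would fill a matrix with one row per model and one column per token.

```python
import numpy as np


def encode_tokens(token_vectors, weights, bias):
    """Hypothetical helper: linear decision values w.x + b for every (model, token) pair.

    token_vectors: (n_tokens, n_features), one feature vector per token of the text
    weights:       (n_models, n_features), one linear SVM per row
    bias:          (n_models,)
    Returns a (n_models, n_tokens) matrix: rows are models, columns are tokens.
    """
    return weights @ token_vectors.T + bias[:, None]


# Toy data (assumed sizes, random values): a text with 3 tokens and 4 linear models.
rng = np.random.default_rng(0)
n_tokens, n_features, n_models = 3, 5, 4
token_vectors = rng.normal(size=(n_tokens, n_features))
weights = rng.normal(size=(n_models, n_features))
bias = rng.normal(size=n_models)

matrix = encode_tokens(token_vectors, weights, bias)
print(matrix.shape)  # (4, 3)
```

Entry `matrix[i, j]` is the decision value of model `i` on token `j`, so the number of columns grows with the length of the text while the number of rows stays fixed by the number of trained models.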

09 Sep 14:07

mgraffg

v0.0.5

268c598

Version - 0.0.5

02 Aug 15:52

mgraffg

v0.0.4

268c598

Version - 0.0.4

This version includes the subwords model, which is the default in DialectId.

19 Jun 03:10

mgraffg

v0.0.3

f611d5b

Version - 0.0.3

It fixes a typo affecting the French-speaking countries.

15 Jun 14:17

mgraffg

v0.0.2

3153dda

Version - 0.0.2

It includes the class DialectId to identify the dialect (country) from a text, given the language.

05 Jun 16:00

mgraffg

v0.0.1

3f5518a

Version - 0.0.1

The first version of dialectid. It has a Bag of Words (BoW) model whose weights were estimated on roughly 4 million ($2^{22} = 4{,}194{,}304$) tweets uniformly selected from the Spanish-speaking countries.

05 Jun 14:41

mgraffg

data

23e4cff

Data

This release exists to provide a place to store dialectid's data.
