naiaden/cococpyp



This repository contains the code for my PhD project "What's left in the bag for latent variable language modelling", a joint-doctorate project between Radboud University Nijmegen, the Netherlands, and KU Leuven, Belgium. In this project I look at bag-of-words representations for language modelling and try to find information in them that is currently unexploited, such as skipgrams. Bayesian models are a prime example of latent variable models, and it is this intersection of language modelling and Bayesian statistics that I find interesting.

Our main model is a hierarchical Pitman-Yor language model based on skipgrams. The models generated by this toolkit are language-agnostic.
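
To make the notion of a skipgram concrete, here is a minimal sketch in Python. It is not code from this toolkit: the function name `skipgrams` and the gap marker `{*}` are chosen for illustration only, and it assumes a skipgram is an n-gram in which one or more interior tokens are replaced by a gap while the first and last tokens stay fixed.

```python
from itertools import combinations


def skipgrams(tokens, n, gap="{*}"):
    """Return every n-gram of `tokens` together with its skipgram variants.

    Illustrative assumption: a skipgram keeps the first and last token of
    the n-gram and replaces any subset of the interior positions by `gap`.
    The empty subset yields the plain n-gram itself.
    """
    patterns = []
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        interior = range(1, n - 1)  # positions that are allowed to become gaps
        for k in range(len(interior) + 1):
            for gaps in combinations(interior, k):
                patterns.append(
                    tuple(gap if i in gaps else tok for i, tok in enumerate(window))
                )
    return patterns


if __name__ == "__main__":
    for pattern in skipgrams("the quick brown fox".split(), 3):
        print(" ".join(pattern))
```

For a trigram window there is a single interior position, so each trigram contributes itself plus one pattern with a gap in the middle. Such patterns let a model condition on context words that are not directly adjacent to the word being predicted.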

It is based on a fork of cpyp (https://github.com/redpony/cpyp) which I enhanced with Colibri-core (https://github.com/proycon/colibri-core).

More info and results will be added later.
