AntonioCoppola/stmpy

Covariate-Augmented Probabilistic Topic Models in PySpark

This Python library implements probabilistic topic models that incorporate arbitrary document-level metadata into the generative process for the data. The model currently implemented is the Structural Topic Model (STM) of Roberts, Stewart, and Tingley; a nonparametric variant is forthcoming. Although the software can run serially on a local machine, it is designed primarily for parallel computation with Apache Spark.
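
To make concrete what it means for document-level covariates to enter the generative process, here is a minimal NumPy sketch of the topic-prevalence part of the STM as described by Roberts, Stewart, and Tingley. It is an illustration only and does not use or reflect this library's API: the function name `sample_document` and all parameter names are hypothetical, content covariates (which modify the topic-word distributions) are omitted, and nothing here runs on Spark.

```python
import numpy as np

def sample_document(x_d, gamma, sigma, beta, n_words, rng):
    """Draw one document from a simplified STM-style generative process.

    x_d   : (P,)        document-level covariates
    gamma : (P, K-1)    prevalence coefficients (hypothetical name)
    sigma : (K-1, K-1)  covariance of the logistic-normal prior
    beta  : (K, V)      topic-word distributions, rows sum to 1
    """
    # Covariates shift the mean of the logistic-normal prior on topic proportions.
    mu = x_d @ gamma
    eta = rng.multivariate_normal(mu, sigma)

    # The last topic's logit is fixed at 0 for identifiability.
    eta_full = np.append(eta, 0.0)

    # Softmax maps logits to topic proportions (shifted for numerical stability).
    theta = np.exp(eta_full - eta_full.max())
    theta /= theta.sum()

    # Each word gets a topic assignment, then a vocabulary index from that topic.
    z = rng.choice(len(theta), size=n_words, p=theta)
    words = np.array([rng.choice(beta.shape[1], p=beta[k]) for k in z])
    return theta, z, words

# Toy usage with made-up dimensions: P covariates, K topics, V vocabulary terms.
rng = np.random.default_rng(0)
P, K, V = 3, 4, 10
gamma = rng.normal(size=(P, K - 1))
sigma = np.eye(K - 1)
beta = rng.dirichlet(np.ones(V), size=K)
x_d = np.array([1.0, 0.5, -1.0])
theta, z, words = sample_document(x_d, gamma, sigma, beta, n_words=20, rng=rng)
```

Changing `x_d` changes the prior mean of the topic proportions, which is how two documents with different metadata can be expected to discuss topics at different rates.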
