Implementation of the lossy counting algorithm with demo examples

mgeletka/lossy-counting

Implementation of the lossy-counting algorithm

An implementation of the lossy counting algorithm, written for the Data Stream class at FCUP, University of Porto.

How to work with the implementation

The whole algorithm is implemented in a single Python file, lossy_counting.py. An example of how to use the LossyCounting class can be found in demo.py.

General use of the implementation:

  • create an instance of the LossyCounting class with the parameters:
    • minSupport - the minimum support an item must have to appear in the results
    • error - the maximum error allowed in the results
  • process the stream by calling processNextElement(element) on that instance for each element
  • get the results by calling getFrequentItems() on the same instance
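
The steps above can be illustrated with a minimal, self-contained sketch of the standard lossy counting procedure. This is not the code from lossy_counting.py: the class name LossyCountingSketch, the constructor signature, the internal attributes, and the dictionary return type are assumptions made for illustration. Only the parameter names minSupport and error and the method names processNextElement and getFrequentItems are taken from this README.

```python
import math


class LossyCountingSketch:
    """Illustrative lossy counting; NOT the repository's implementation."""

    def __init__(self, minSupport, error):
        # Assumed constraint: the error must be smaller than the support.
        if not 0 < error < minSupport <= 1:
            raise ValueError("expected 0 < error < minSupport <= 1")
        self.minSupport = minSupport
        self.error = error
        # The stream is split into buckets of ceil(1 / error) elements.
        self.bucket_width = math.ceil(1 / error)
        self.n = 0          # number of elements seen so far
        self.entries = {}   # item -> [count, max possible undercount]

    def processNextElement(self, element):
        self.n += 1
        bucket_id = math.ceil(self.n / self.bucket_width)
        if element in self.entries:
            self.entries[element][0] += 1
        else:
            # A new item may have been missed in all earlier buckets.
            self.entries[element] = [1, bucket_id - 1]
        # At each bucket boundary, drop items that are too rare to matter.
        if self.n % self.bucket_width == 0:
            self.entries = {
                item: entry
                for item, entry in self.entries.items()
                if entry[0] + entry[1] > bucket_id
            }

    def getFrequentItems(self):
        # Report items whose counted frequency is at least (s - e) * N.
        threshold = (self.minSupport - self.error) * self.n
        return {
            item: count
            for item, (count, _) in self.entries.items()
            if count >= threshold
        }


# Hypothetical stream: two heavy hitters followed by 20 one-off items.
stream = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
counter = LossyCountingSketch(minSupport=0.2, error=0.1)
for element in stream:
    counter.processNextElement(element)
result = counter.getFrequentItems()
print(result)  # {'a': 50, 'b': 30}
```

In this sketch the one-off items are pruned at the bucket boundaries, so only the two frequent items remain in memory at the end of the stream. The reported counts can undercount the true frequency by at most error times the stream length.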

Run example

  • to run the included demos:
    • extract the zip file
    • enter the lossy-counting folder
    • make sure Python 3 is installed on your machine
    • run python3 main.py

Included demos:

  • exponential demo - with samples drawn from the exponential distribution
  • warehouse demo - with samples from anonymized retail market basket data from an anonymous Belgian retail store
  • Romeo and Juliet demo - with word frequencies from the Romeo and Juliet drama
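
As a rough idea of what the input to an exponential demo could look like, the sketch below generates a stream of integers from an exponential distribution and counts them exactly for reference. The function name exponential_stream, the rate of 0.5, the seed, and the rounding to integers are all assumptions; the parameters used by the repository's demo are not stated here.

```python
import random
from collections import Counter


def exponential_stream(n, rate=0.5, seed=0):
    """Yield n integers drawn from an exponential distribution (assumed setup)."""
    rng = random.Random(seed)
    for _ in range(n):
        # Truncate to an integer so that values repeat and have frequencies.
        yield int(rng.expovariate(rate))


# Exact counts, useful as a baseline when checking an approximate counter.
total = 10_000
counts = Counter(exponential_stream(total))
for value, count in counts.most_common(5):
    print(value, count / total)
```

Because small values are far more likely than large ones, a stream like this has a few frequent items and a long tail of rare ones, which is the situation lossy counting is designed for.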
