Implementation of the lossy counting algorithm for the Data Stream class at FCUP, University of Porto.
The whole algorithm is implemented in a single Python file, lossy_counting.py. An example of using the LossyCounting class can be found in demo.py.

General use of the implementation:
- create a LossyCounting instance with the parameters:
  - minSupport - minimum support of the items in the results
  - error - maximum error in the results
- process the stream by calling LossyCounting().processNextElement(element) for each element
- get the results by calling LossyCounting().getFrequentItems()

To run the included demos:
- extract the zip file
- enter the lossy-counting folder
- make sure python3 is installed on your machine
- then simply run
  python3 main.py

The included demos:
- exponential demo - with samples drawn from the exponential distribution
- warehouse demo - with samples from anonymized retail market basket data from an anonymous Belgian retail store
- Romeo and Juliet demo - with word frequencies from the play Romeo and Juliet
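
For readers who want to see how the general use described above fits together, here is a minimal, self-contained sketch of the textbook lossy counting algorithm. It is not the code from lossy_counting.py: the class name LossyCountingSketch, the constructor argument order, and the dictionary returned by getFrequentItems are assumptions made for illustration. Only the parameter names minSupport and error and the two method names are taken from this README.

```python
import math


class LossyCountingSketch:
    """Illustrative lossy counting sketch (hypothetical, not lossy_counting.py)."""

    def __init__(self, minSupport, error):
        # Assumption: both parameters are fractions of the stream length.
        if not 0 < error < minSupport <= 1:
            raise ValueError("expected 0 < error < minSupport <= 1")
        self.minSupport = minSupport
        self.error = error
        self.bucketWidth = math.ceil(1 / error)  # elements per bucket
        self.n = 0                               # elements seen so far
        self.entries = {}                        # element -> [count, maxUndercount]

    def processNextElement(self, element):
        self.n += 1
        currentBucket = math.ceil(self.n / self.bucketWidth)
        if element in self.entries:
            self.entries[element][0] += 1
        else:
            # A new entry may have been missed in every earlier bucket.
            self.entries[element] = [1, currentBucket - 1]
        # At each bucket boundary, prune entries that cannot be frequent.
        if self.n % self.bucketWidth == 0:
            self.entries = {
                e: fd for e, fd in self.entries.items()
                if fd[0] + fd[1] > currentBucket
            }

    def getFrequentItems(self):
        # Report items whose stored count reaches (minSupport - error) * n.
        threshold = (self.minSupport - self.error) * self.n
        return {e: f for e, (f, _) in self.entries.items() if f >= threshold}


# Small deterministic stream: two frequent items and ten rare ones.
stream = ["a"] * 60 + ["b"] * 30 + ["r%d" % i for i in range(10)]
counter = LossyCountingSketch(minSupport=0.2, error=0.05)
for element in stream:
    counter.processNextElement(element)
frequent = counter.getFrequentItems()
print(frequent)
```

In this small stream the items a and b are reported with their counts, while the ten rare items are pruned at the last bucket boundary. The stored counts can undercount the true frequency by at most error times the number of processed elements, which is why the reporting threshold is lowered by error.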