Fix minimum and maximum #6

Open
harendra-kumar opened this issue Dec 23, 2021 · 2 comments
harendra-kumar commented Dec 23, 2021

We can possibly use a heap instead of a Deque, hopefully giving better performance. Here is how it might work.

Assume the input is a tuple of type (a, Maybe a), where the first component is the element being inserted into the window and the second is the element being ejected from the window, if any. Assuming a min heap to find the minimum:

  • When the input is (x, Nothing), insert x into the heap.
  • When inserting x into the heap, discard any existing heap elements that are greater than x. They are older than x, so they will leave the window before x does and can never be the minimum again.
  • When the input is (x, Just y), y is the oldest element in the window, so if y is still present in the heap it must be the minimum. Remove y if it is the top element of the heap, then insert x as described in the previous steps. If there are multiple instances of y in the window then we should not remove it outright: keep a refcount per heap element, increment it when inserting a duplicate, and on ejection decrement it, removing the element only when the refcount becomes 0.
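
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: it uses Data.Sequence from the containers package as a stand-in for the custom heap (the candidates stay in sorted order, so cutting either end is cheap), and the names Candidates, insert, evict, step, currentMin, windowInputs and rollingMin are all made up for this example, not existing library functions.

```haskell
import Data.Sequence (Seq, ViewL (..), ViewR (..), (<|), (|>))
import qualified Data.Sequence as Seq

-- Candidate minimums, oldest on the left, in non-decreasing order.
-- Each value carries a refcount for duplicates. (Hypothetical type.)
type Candidates a = Seq (a, Int)

-- Insert x: discard candidates greater than x, then either bump the
-- refcount of an equal candidate or append x with refcount 1.
insert :: Ord a => a -> Candidates a -> Candidates a
insert x cs =
    case Seq.viewr kept of
        rest :> (y, n) | y == x -> rest |> (y, n + 1)
        _ -> kept |> (x, 1)
  where
    kept = Seq.dropWhileR ((> x) . fst) cs

-- Eject y: only the front candidate can be equal to y. Decrement its
-- refcount and remove it when the count reaches 0.
evict :: Ord a => a -> Candidates a -> Candidates a
evict y cs =
    case Seq.viewl cs of
        (z, n) :< rest
            | z == y -> if n > 1 then (z, n - 1) <| rest else rest
        _ -> cs

-- One window step: eject first (if there is something to eject),
-- then insert the new element.
step :: Ord a => Candidates a -> (a, Maybe a) -> Candidates a
step cs (x, ejected) = insert x (maybe cs (\y -> evict y cs) ejected)

-- The current minimum is the front candidate, if any.
currentMin :: Candidates a -> Maybe a
currentMin cs =
    case Seq.viewl cs of
        (z, _) :< _ -> Just z
        EmptyL -> Nothing

-- Pair each element with the element it pushes out of a window of size w.
windowInputs :: Int -> [a] -> [(a, Maybe a)]
windowInputs w xs = zip xs (replicate w Nothing ++ map Just xs)

-- Rolling minimum over a list, one result per input element.
rollingMin :: Ord a => Int -> [a] -> [Maybe a]
rollingMin w = map currentMin . drop 1 . scanl step Seq.empty . windowInputs w
```

For example, rollingMin 3 [4, 2, 5, 1, 3, 6, 7] evaluates to map Just [4, 2, 2, 1, 1, 1, 3]. With duplicates, rollingMin 2 [3, 3, 5] keeps 3 as the minimum after the first 3 is ejected, because the refcount only drops from 2 to 1.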

Note that we would need a custom heap implementation, because ejection cuts the top of the heap while insertion cuts the bottom of the heap.

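The cost would likely be n * log w, where n is the number of elements in the stream and w is the window size. The worst case for minimum would be when the input stream is sorted in ascending order: no insertion discards anything, so the heap always holds the full window. The best case would be when it is sorted in descending order: every insertion discards all existing elements, so the heap holds a single element.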
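
To see the two extremes, we can count how many candidates survive the insertion rule. A minimal sketch, assuming a plain list stands in for the heap and the window is at least as long as the input, so nothing is ejected. The name retained is hypothetical.

```haskell
-- Candidates kept after feeding the whole input through the insertion
-- rule "discard everything greater than x, then add x". The list is
-- always in non-decreasing order, so the elements to discard form a
-- suffix and takeWhile keeps exactly the survivors.
retained :: Ord a => [a] -> [a]
retained = foldl insertCand []
  where
    insertCand cands x = takeWhile (<= x) cands ++ [x]
```

For an ascending input such as [1 .. 8] all 8 elements are retained, while for a descending input such as [8, 7 .. 1] only the last element, 1, is retained.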

Note that we can perform a partial sort of the stream by scanning it using the min or the max fold.

@harendra-kumar
Member Author

We can possibly combine the ring and the heap into a single data structure to efficiently find the min/max in a rolling window.

@harendra-kumar
Member Author

For smaller window sizes we could just use the ring buffer and perform a linear search in it to find the min/max. The total cost would be n * w.
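
A rough sketch of the linear search variant, assuming w >= 1 and using a plain list of the last w elements as a stand-in for the ring buffer. The name naiveRollingMin is made up for this example.

```haskell
-- Rolling minimum by linear search: keep the last w elements (newest
-- first) and scan all of them for every input element, which costs
-- about n * w comparisons in total. Assumes w >= 1.
naiveRollingMin :: Ord a => Int -> [a] -> [a]
naiveRollingMin w = map minimum . drop 1 . scanl push []
  where
    -- Add the new element and drop whatever falls outside the window.
    push window x = take w (x : window)
```

For example, naiveRollingMin 3 [4, 2, 5, 1, 3, 6, 7] evaluates to [4, 2, 2, 1, 1, 1, 3].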
