Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize auto-commit process #31

Merged
merged 4 commits into from
Jun 25, 2013
Merged

Optimize auto-commit process #31

merged 4 commits into from
Jun 25, 2013

Conversation

mahendra
Copy link
Collaborator

The auto-commit thread is a bit heavy and there are points of optimization. I tried two attempts to do it.

  • Spawn the commit thread only if necessary - If there are no messages being consumed, the timer keeps creating new threads at the specified intervals. This may not be necessary. We can control this behaviour such that the timer thread is started only when a message is consumed

  • The previous approach optimized the commit thread such that the timer started only when there were messages to be consumed. This goes a step further and ensures the following:

    • Only one timer thread is created
    • The main app does not block on exit (waiting for timer thread to finish). In the current code, if you exit the main thread, the app would wait till the commit thread exits (typically 5 seconds).

    This is ensured by having a single thread blocking on an event and keeps calling a function. We use events instead of time.sleep() so as to prevent the python interpreter from running every 50ms checking if the timer has expired (logic copied from threading.Timer)

Also provide support for passing args/kwargs to the timer callback function. Maybe useful in some future scenario

Also, we keep the re-entrant nature of the timer code intact by using an threading.Event and passing it as an argument to the thread target.

Also, with this, the logic of commit() becomes simple for manual or timed operations. We can remove all the extra checks of stopping/restarting the timer now. Commit operation is done by acquiring a lock. By checking for the commit requirements inside the lock, we can avoid any extra steps of stopping/starting the timer (which now runs forever). We do the check for self.count_since_commit in two places

  • Before the lock - so that we can avoid acquiring the lock if there is no need to commit
  • After the lock - so that we can avoid committing if the other thread has done the commit before we could acquire the lock.

If there are no messages being consumed, the timer keeps
creating new threads at the specified intervals. This may
not be necessary. We can control this behaviour such that
the timer thread is started only when a message is consumed
The previous commit optimized the commit thread such that the timer
started only when there were messages to be consumed. This commit
goes a step further and ensures the following:
* Only one timer thread is created
* The main app does not block on exit (waiting for timer thread to finish)

This is ensured by having a single thread blocking on an event and
keeps calling a function. We use events instead of time.sleep() so
as to prevent the python interpreter from running every 50ms checking
if the timer has expired (logic copied from threading.Timer)
@mahendra
Copy link
Collaborator Author

@mumrah did you get a chance to look at this? I was working on a design/prototype for the MutliProcessConsumer. This patch in master would be helpful :-)

mumrah added a commit that referenced this pull request Jun 25, 2013
Optimize auto-commit process
@mumrah mumrah merged commit 64b7578 into dpkp:master Jun 25, 2013
@mumrah
Copy link
Collaborator

mumrah commented Jun 25, 2013

Looks good, @mahendra thanks for the patch

@mahendra
Copy link
Collaborator Author

@mumrah yaaay! thanks for merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants