
Batching Enhancements #289

Open
jganoff opened this issue Feb 25, 2017 · 6 comments
jganoff commented Feb 25, 2017

I'm building an InfluxDB producer for which I'd like to guarantee at-least-once delivery. My throughput requirements are high enough that batching is required. I was happy when I found BatchProcessor, but then realized it has a few shortcomings that prevent me from using it. I'd rather improve this library than create my own batch producer. Here's where the current batch implementation falls short. Are any of these being worked on currently?

  1. There's no way to request a flush of the batch buffer without destroying the backing thread pool. The only option I see today, short of resorting to reflection, is calling InfluxDB.disableBatch followed by InfluxDB.enableBatch to force a flush, but that creates a new executor every time.
  2. BatchProcessor.write() is not thread safe and may be called concurrently from BatchProcessor.flush() on the current thread and from the scheduled timer on a pool thread. This results in duplicate records being sent during flush().
  3. Batched records can fail silently. I'd like to provide a callback with each asynchronous write request that would be invoked once the point is sent, or if an error was encountered while attempting to send it.
  4. The consistency level for all BatchPoints produced by the BatchProcessor is ConsistencyLevel.ONE. Direct control over the consistency level would be nice, but ONE is a showstopper.
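To illustrate point 1, the desired flush semantics can be sketched as a small stand-alone batcher whose flush() only drains the buffer and leaves the scheduler running. This is a hypothetical sketch (the class name FlushableBatcher and its API are made up for illustration, not part of influxdb-java):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for BatchProcessor: the scheduler lives for the
// lifetime of the batcher, and flush() only drains the buffer.
class FlushableBatcher<T> implements AutoCloseable {
    private final BlockingQueue<T> buffer = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Consumer<List<T>> sink;

    FlushableBatcher(Consumer<List<T>> sink, long intervalMs) {
        this.sink = sink;
        scheduler.scheduleAtFixedRate(this::flush, intervalMs, intervalMs,
                TimeUnit.MILLISECONDS);
    }

    void add(T item) {
        buffer.add(item);
    }

    // Explicit flush: drains the buffer without tearing down the executor,
    // unlike the disableBatch()/enableBatch() workaround, which rebuilds
    // the thread pool on every call.
    void flush() {
        List<T> batch = new ArrayList<>();
        buffer.drainTo(batch);
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
    }

    @Override
    public void close() {
        flush();
        scheduler.shutdown();
    }
}
```

With this shape, an explicit `flush()` before shutdown (or at a checkpoint) costs nothing but a queue drain, and the timer keeps running for subsequent batches.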

Some of the above are fairly substantial deviations in behavior from the current version. Would it be worthwhile for this to be a separate batch implementation rather than migrating the current version? Perhaps create a new asynchronous interface named InfluxDBAsync?

jganoff commented Feb 27, 2017

My apologies, I was wrong about 2 above. No duplicate records are written during flush(), since BlockingQueue.drainTo() removes each element atomically, so concurrent drains cannot hand the same record to both callers.
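This property is easy to demonstrate with a stand-alone check (a demo written for this discussion, not library code): two threads draining the same LinkedBlockingQueue concurrently should between them receive each element exactly once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DrainToDemo {
    // Returns true if two concurrent drainTo() calls handed out each element
    // exactly once between them, i.e. no record was duplicated or lost.
    static boolean noDuplicates(int total) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < total; i++) {
            queue.put(i);
        }

        List<Integer> a = new ArrayList<>();
        List<Integer> b = new ArrayList<>();
        Thread t1 = new Thread(() -> queue.drainTo(a));
        Thread t2 = new Thread(() -> queue.drainTo(b));
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Every element must land in exactly one list: the counts add up
        // to the total and the union has no overlap.
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        return a.size() + b.size() == total && union.size() == total;
    }

    public static void main(String[] args) throws InterruptedException {
        if (!noDuplicates(10_000)) {
            throw new AssertionError("duplicate records drained");
        }
    }
}
```

LinkedBlockingQueue serializes drains on its internal take lock, which is what rules out the duplicate-send scenario described in point 2.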

majst01 commented Feb 27, 2017

  1. is easy to fix; we should introduce an InfluxDB.flush() which does this.
  2. is more difficult, see Batching / Request Buffering feature review #148 and some other issues as well.
  3. Seems easy as well, but only on a BatchProcessor instance, not per Point.
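An instance-level failure handler, as suggested for 3, could look roughly like the following (a hypothetical sketch, not the library's API: ErrorAwareSink and its names are invented here). The idea is one error callback per batcher instance, invoked with the failure instead of letting the batch disappear silently:

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical wrapper: one error handler per batcher instance (as suggested
// for BatchProcessor), rather than one callback per Point.
class ErrorAwareSink<T> implements Consumer<List<T>> {
    private final Consumer<List<T>> delegate;
    private final Consumer<Throwable> onError;

    ErrorAwareSink(Consumer<List<T>> delegate, Consumer<Throwable> onError) {
        this.delegate = delegate;
        this.onError = onError;
    }

    @Override
    public void accept(List<T> batch) {
        try {
            delegate.accept(batch);   // e.g. the actual HTTP write to InfluxDB
        } catch (RuntimeException e) {
            onError.accept(e);        // surface the failure instead of swallowing it
        }
    }
}
```

A caller could then log, retry, or buffer the failed batch from the callback, which is the minimum needed for at-least-once delivery on top of batching.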

In my opinion, introducing a new interface is not necessary; to achieve async writes, some methods must be implemented with an async signature.

Do you mind implementing 1 and 3?

jganoff commented Feb 27, 2017

Sure, I'll send a PR for 1 and 3. Unfortunately I won't be able to use the built-in batch support until it's at least possible to be alerted when records are not delivered. I'm not sure how I missed #148, but that's a good list of enhancements. Hopefully a combination of those will improve the batch implementation enough that I can use it; if not, I'll identify the gap and propose additional work.

Regarding the additional interface I can see the desire to stick with a single interface with additional asynchronous method signatures for writes.

jganoff commented Feb 28, 2017

I reviewed the issues and PRs bundled in #148. The last comment by @andrewdodd on #108 makes a lot of sense to me (separating concerns for this "auto batch" mode and the InfluxDB client that has a write(BatchPoints) method). Have you given that much thought?

kkarski commented May 9, 2017

majst01 commented May 9, 2017

The currently most promising approach is #318.
