Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receive handler should failfast on unhandled exception #127

Merged
merged 6 commits into from
Feb 24, 2017

Conversation

serkantkaraca
Copy link
Member

@serkantkaraca serkantkaraca commented Feb 23, 2017

Receivepump goes numb when while loop throws. This is not expected. We should failfast the process so that the developer can take proper actions to fix whatever caused the exception.

  • ReceiveHandler should failfast when an unhandled exception detected.
  • Adding a unit test to cover error handling and recovery for EPH.

Addressing bug reported at #121

This checklist is used to make sure that common guidelines for a pull request are followed.

  • I have read the contribution guidelines.
  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR.
  • The pull request does not introduce breaking changes (unless a major version change occurs in the assembly and module).
  • If applicable, the public code is properly documented.
  • Pull request includes test coverage for the included changes.
  • The code builds without any errors.

@msftclas
Copy link

@serkantkaraca,
Thanks for your contribution as a Microsoft full-time employee or intern. You do not need to sign a CLA.
Thanks,
Microsoft Pull Request Bot

@serkantkaraca serkantkaraca merged commit f88469b into dev Feb 24, 2017
@jtaubensee jtaubensee deleted the ReceiveHandlerShouldFailfastOnUnhandledException branch March 2, 2017 18:16
serkantkaraca added a commit that referenced this pull request Mar 23, 2017
* Bumping up client version to 1.0.1 (#107)

* Fixing build break introduced with preview check-in.

* Bumping up the client version to 1.0.1

* Adding 2 new epoch unit tests (#108)

* Fixing build break introduced with preview check-in.

* Adding 2 new epoch receiver unit tests.

* Update readme.md (#110)

Fixed broken links.

* Deliver receiver disconnected exception to client (#109)

* Fixing build break introduced with preview check-in.

* + PartitionPump to deliver ReceiverDisconnectedException to processor host.
+ Don't look for ReceiverDisconnectedException while creating the receiver since CreateReceiver doesn't throw it.

* Adding unit test covering ReceiverDisconnectedException handling due to higher epoch receiver.

* Bumping up EPH client version to 1.0.1 on dev branch (#118)

* Fixing build break introduced with preview check-in.

* Bumping up EPH client version to 1.0.1 on dev branch

* Keeping sample readmes in sync with docs.microsoft.com

* Fix invoke processor after receive timeout behavior (#124)

* Update readme.md (#110)

Fixed broken links.

* EPH to deliver Enumerable.Empty instead of NULL when EventProcessorOptions.InvokeProcessorAfterReceiveTimeout is set

* Small refactoring to honor positive check.

* Amqp create link should honor time left (#128)

* Update readme.md (#110)

Fixed broken links.

* AmqpSender and AmqpReceiver should honor tile left while creating AMQP link.

* Remove unnecessary variable.

* Receive handler should failfast on unhandled exception (#127)

* Update readme.md (#110)

Fixed broken links.

* ReceiveHandler should failfast when an unhandled exception detected.
Adding a unit test to cover error handling and recovery for EPH.

* Remove unintentional internal modifier.

* Trace receive handler exit w/ error.

* Receive should not throw timeout exception (#130)

* Update readme.md (#110)

Fixed broken links.

* ReceiveAsync call should return NULL instead of throwing System.TimeoutException

* Unregister receive handler fix (#135)

* Update readme.md (#110)

Fixed broken links.

* Unregistering receive handler shouldn't generate OperationCanceledException

* Removing .Fork() comment

* Remove ActiveClientManager from service client (#136)

* Update readme.md (#110)

Fixed broken links.

* Code cleanup: Removing ActiveLinkManager from AmqpServiceClient

* Fix empty lease race issue (#141)

* Update readme.md (#110)

Fixed broken links.

* Handle race condition where downloaded empty-lease might be already LEASED by some other host.

* Remove lease object from LeaseLostException (#143)

* Remove Lease object from LeaseLostException

* Use partition id from context.

* ProcessorHost to close on ReceiverDisconnectedException (#137)

* Update readme.md (#110)

Fixed broken links.

* Fault partition pump on ReceiverDisconnectedException only.

* Moving samples to the azure-event-hubs repo (#142)

* Moving samples to the azure-event-hubs repo

* Fixes

* EPH lease manager and checkpointing fixes and improvements (#117)

* Fixing build break introduced with preview check-in.

* Here is the list of the changes:
+ Moving Offset and SequenceNumbre to base class Lease.cs
+ AzureStorageCheckpointLeaseManager.cs to stop downloading lease at every checkpoint call. This was causing 2 hosts operating on the same lease for a while.
+ Set MaximumExecutionTime on renewRequestOptions used while renewing storage lease.
+ Throw LeaseLostException when lease is lost on operations like Renew and Update.
+ Updating interface ICheckpointManager to accept Lease while updating checkpoint.
+ Some Linq improvements at lease steal and balancing logics.
+ Updating multiple-hosts unit test to validate load balancing.

* Set PartitionContext's SequenceNumber and Offset only before delivering the messages to the handler.

* CheckpointAsync(EventData eventData) should not allow checkpointing if the suplied eventData is ahead of processed batch.

* Add unit test to cover checkpointing messages in the middle of a batch.

* Marking ICheckpointManager.UpdateCheckpointAsync(Checkpoint checkpoint) as obsolute

* Fix: ReceiveHandler and PartitionPump should gracefully handle exceptions thrown at ProcessError callbacks.
Adding a CIT to cover error cases in EPH.

* HostShouldRecoverWhenProcessProcessEventsAsyncThrows unit test to unregister host at the end.

* Cancelling swallow and trace change in ReceiveHandlerProcessErrorAsync(Exception error)

* Remove unnecessary ExceptionHandled trace from EventSource.

* + Removing Obsolete signature from ICheckpointManager.
+ TaskContinuationOptions.OnlyOnRanToCompletion while persisting lease.

* Adding Obsolete ICheckpointManager.UpdateCheckpointAsync(Checkpoint checkpoint) back

* Avoid TaskCancelledException from ContinueWith when blob call fails due to lease lost. (#145)

* Update global.json (#147)

Specifying .net core version for Appveyor build

* Bumping up the Storage version to 8.1.1 (#149)
sjkwak pushed a commit that referenced this pull request Jun 7, 2018
* Bumping up client version to 1.0.1 (#107)

* Fixing build break introduced with preview check-in.

* Bumping up the client version to 1.0.1

* Adding 2 new epoch unit tests (#108)

* Fixing build break introduced with preview check-in.

* Adding 2 new epoch receiver unit tests.

* Update readme.md (#110)

Fixed broken links.

* Deliver receiver disconnected exception to client (#109)

* Fixing build break introduced with preview check-in.

* + PartitionPump to deliver ReceiverDisconnectedException to processor host.
+ Don't look for ReceiverDisconnectedException while creating the receiver since CreateReceiver doesn't throw it.

* Adding unit test covering ReceiverDisconnectedException handling due to higher epoch receiver.

* Bumping up EPH client version to 1.0.1 on dev branch (#118)

* Fixing build break introduced with preview check-in.

* Bumping up EPH client version to 1.0.1 on dev branch

* Keeping sample readmes in sync with docs.microsoft.com

* Fix invoke processor after receive timeout behavior (#124)

* Update readme.md (#110)

Fixed broken links.

* EPH to deliver Enumerable.Empty instead of NULL when EventProcessorOptions.InvokeProcessorAfterReceiveTimeout is set

* Small refactoring to honor positive check.

* Amqp create link should honor time left (#128)

* Update readme.md (#110)

Fixed broken links.

* AmqpSender and AmqpReceiver should honor tile left while creating AMQP link.

* Remove unnecessary variable.

* Receive handler should failfast on unhandled exception (#127)

* Update readme.md (#110)

Fixed broken links.

* ReceiveHandler should failfast when an unhandled exception detected.
Adding a unit test to cover error handling and recovery for EPH.

* Remove unintentional internal modifier.

* Trace receive handler exit w/ error.

* Receive should not throw timeout exception (#130)

* Update readme.md (#110)

Fixed broken links.

* ReceiveAsync call should return NULL instead of throwing System.TimeoutException

* Unregister receive handler fix (#135)

* Update readme.md (#110)

Fixed broken links.

* Unregistering receive handler shouldn't generate OperationCanceledException

* Removing .Fork() comment

* Remove ActiveClientManager from service client (#136)

* Update readme.md (#110)

Fixed broken links.

* Code cleanup: Removing ActiveLinkManager from AmqpServiceClient

* Fix empty lease race issue (#141)

* Update readme.md (#110)

Fixed broken links.

* Handle race condition where downloaded empty-lease might be already LEASED by some other host.

* Remove lease object from LeaseLostException (#143)

* Remove Lease object from LeaseLostException

* Use partition id from context.

* ProcessorHost to close on ReceiverDisconnectedException (#137)

* Update readme.md (#110)

Fixed broken links.

* Fault partition pump on ReceiverDisconnectedException only.

* Moving samples to the azure-event-hubs repo (#142)

* Moving samples to the azure-event-hubs repo

* Fixes

* EPH lease manager and checkpointing fixes and improvements (#117)

* Fixing build break introduced with preview check-in.

* Here is the list of the changes:
+ Moving Offset and SequenceNumbre to base class Lease.cs
+ AzureStorageCheckpointLeaseManager.cs to stop downloading lease at every checkpoint call. This was causing 2 hosts operating on the same lease for a while.
+ Set MaximumExecutionTime on renewRequestOptions used while renewing storage lease.
+ Throw LeaseLostException when lease is lost on operations like Renew and Update.
+ Updating interface ICheckpointManager to accept Lease while updating checkpoint.
+ Some Linq improvements at lease steal and balancing logics.
+ Updating multiple-hosts unit test to validate load balancing.

* Set PartitionContext's SequenceNumber and Offset only before delivering the messages to the handler.

* CheckpointAsync(EventData eventData) should not allow checkpointing if the suplied eventData is ahead of processed batch.

* Add unit test to cover checkpointing messages in the middle of a batch.

* Marking ICheckpointManager.UpdateCheckpointAsync(Checkpoint checkpoint) as obsolute

* Fix: ReceiveHandler and PartitionPump should gracefully handle exceptions thrown at ProcessError callbacks.
Adding a CIT to cover error cases in EPH.

* HostShouldRecoverWhenProcessProcessEventsAsyncThrows unit test to unregister host at the end.

* Cancelling swallow and trace change in ReceiveHandlerProcessErrorAsync(Exception error)

* Remove unnecessary ExceptionHandled trace from EventSource.

* + Removing Obsolete signature from ICheckpointManager.
+ TaskContinuationOptions.OnlyOnRanToCompletion while persisting lease.

* Adding Obsolete ICheckpointManager.UpdateCheckpointAsync(Checkpoint checkpoint) back

* Avoid TaskCancelledException from ContinueWith when blob call fails due to lease lost. (#145)

* Update global.json (#147)

Specifying .net core version for Appveyor build

* Bumping up the Storage version to 8.1.1 (#149)

* Bumping up dependent SDK to latest - 3131. (#150)

* Updating NuGet package descriptions

* Updating EPH NuGet package description

* Slimming down AsyncLock by removing unnecessary checks and calls. (#153)

* Codecov (#155)

* Updated Xunit refs, set assemblies to run in order, added codecov step to build

* Adding an additional filter

* ran same tests twice

* made variables a little more clear

* Missed another variable for build

* Adding codecov badges

* Adding back net46 test runs

* Allow at least 60 seconds for AMQP session create. (#154)

* Allow at least AmqpMinimumOpenSessionTimeoutInSeconds seconds to open the session.

* refactor

* Fixing unit tests to support timeout change.

* Setting appveyor build to only build master/dev branches.

* Batch event data API (#157)

* Draft

* EvetDataBatch first iteration.

* Implementing IDisposable on EventData and EventDataBatch

* Removing EventHubSendBatchMismatchPartitionKey from resources.

* .

* Remove unused NullString constant

* Calling MaxMessageSize() instead of MaxMessageSize property.

* Updating readme with sample location

* Update project to VS 2017 (#158)

* Fix audience on the ActiveClientLink (#164)

* Appveyor and unit test updates (#166)

* Adding API documentation (#168)

* Adding shared access signature support. (#174)

* Adding shared access signature support.
* Fixing a comment typo.
* Reducing the number of clients to 4 in SmallReceiveTimeout unit test.

* Using ConfigureAwait(false) to avoid context restoration (#182)

* Fix lease-lost decision logic (#179)

* Provide inner exception's message as LeaseLostException's message.

* Fix the lease-lost decision logic.

* Fix appveyor.yml (#192)

* Fix appveyor.yml

* Add build log to build.ps1

* .

* .

* .

* .

* .

* Appveyor account changes in readme.md

* Web sockets support (#177)

* Cleaning things up for websockets support.

* Adding AMQP web sockets support

* Moving to AMQP 2.1.0

* Fixing build failure.

* Remove secrets

* Fix SasKey remove issue in unit test.

* Bumping up the client version to 1.0.3 (#190)

* Bumping up the client version to 1.0.3

* Updating version prefix as well.

* Close sender and receiver clients. (#194)

* Fixing receiver pump open/close race issue (#175)

* Fixing receiver pump open/close race issue and adding a new unit test to cover the fix.

* receiveHandler.ProcessErrorAsync should not crash the process. Basically ignore any failure from user code.

* Add missing TaskContinuationOptions.OnlyOnFaulted

* Using a fixed timeout while creating AMQP session. (#195)

* Azure AMQP 2.1.1 and small RetryPolicy fix. (#196)

* Using a fixed timeout while creating AMQP session.

* Move to Azure.Amqp 2.1.1

* Use AddOrUpdate at retry count incrementation.

* Fix Appveyor badge link and add VS 2017 build instructions. (#199)

* Fix Appveyor badge link and add VS 2017 build instructions.

* Update readme.md

* Update readme.md

* Client entities should inherit retry policy from parent client. (#200)

* Client entities should inherit retry policy from client.

* Minor test fix.

* Propagate retry policy updates on client to inner sender.

* Adding PartitionManagerOptions (#202)

* Adding PartitionManagerOptions

* Comment change

* Fixing lease manager initialization issue.

* Addressing SJ's comments.

* Adding receiver runtime metrics support. (#180)

* Adding receiver runtime metrics support.

* Instantiate PartitionContext.RuntimeInformation

* Moving to SDK 2.0 (#205)

* Moving to SDK 2.0

* Updating Appveyor scripts for version change.

* Moving to .Net 4.6.1

* Updating the package version to 2.0.0

* Updating version prefix to 2.0.0

* .

* .

* more versioning fixes

* Removing PackageTargetFallback since failing restore

* Removing PackageTargetFallback in test project

* appveyor.yml fix

* Appveyor version 2.0

* [DS] Added Diagnostic Source instrumentation for Send operation (#203)

* [DS] Added Diagnostic Source instrumentation for Send operation

* [DS] Addressed PR feedback

* [DS] Added Diagnostic Source instrumentation for Receive operation

* [DS] Improve Diagnostic Source instrumentation (#212)

* [DS] Improve Diagnostic Source instrumentation

* [DS] Added back several tags

* [DS] Added Diagnostic instrumentation tests

* [DS] Standardized event payload properties

* [DS] Properly dispose Diagnostics tests

* [DS] Added Diagnostic-Id and Correlation-Context injection (#214)

* [DS] Added Diagnostic-Id and Correlation-Context injection + ExtractActivity helper

* [DS] Fixed diagnostics tests stability

* Fix EventHubsConnectionStringBuilder ctor crash (#219)

When supplying SharedAccessSignature as a parameter

* Fixing deceptive description of InitialOffsetProvider (#224)

* Aad token provider (#225)

* Adding AAD token support.

* improve lock

* Adding MSI provider.

* Some small changes

* Adddresing comments and couple small fixes.

* Remove string resource

* Remove test secrets

* Renaming paramater endpoint to endpointAddress

* Renaming paramater endpoint to endpointAddress

* Remove unused AsyncLock

* Remove redundant config name

* Remove using

* Remove usings

* Merge fix

* API name change

* Xmldoc type fix.

* Remove SecurityToken internal constructor

* Build error fix

* Cleanup code

* Pass temporary audience instead of empty string to SecurityToken.

* Make token type required

* Refactoring SharedAccessSignatureToken for cleanup

* Function renaming to carry UTC notion.

* Remove procted accesors from SecurityToken class properties.

* changing the icon for nuget (#228)

* Adding receiver identifier support (#226)

* Fixing EventDataBatch with partition-key related issues. (#198)

* Fixing DataBatch partition-key usage

* Fixing EventDataBatch with partition-key related issues.

* Incorporating comments from latest team discussion.

* Removing a leftover from debug session.

* Adding maxMessageSize to CreateBatch API on clients.

* Aligning API with Java client.

* Fixing build error.

* Adding EH client changes to support sequence number receiver. (#230)

* EventData.SystemProperties to inherit Dictionary (#206)

* Adding BatchOptions

* Avoid application properties to overwrite IOT set properties.

* Remove batchoptions, not related to this change

* Avoid IOT reserved names to be used in the message propery bag.

* SystemProperties to inherit System.Dictionary.

* Return system property from dictionary instead of assigning them during message conversion.

* Throw if system property is missing.

* Fix merge errors

* Merge fix

* Remove unnecessary partitionid/partitionkey check

* Close receivers

* Adjusting class comment of EventDataBatch class (#213)

* Adjusting class comment of EventDataBatch class

Adjusting top comment, as SendBatch and SendBatchAsync methods don't exist on EventHubClient anymore, the methods are called Send and SendAsync and take either a single EventData or an IEnumerable<EventData>.
In addition the comment makes it not clear why a dev should use this class. Why not just create an IEnumerable<EventData> on your own? A dev should use this class because it takes care of the size limit, that's the main reason and it should be in the header.

* Update EventDataBatch.cs

Fixing the comment to make it XML-compliant.

* Update EventDataBatch.cs

Link EventHubClient in XML comment

* Add default batch size for receive handler. (#235)

* Add default batch size for receive handler.

* Rely on default MaxBatchSize in the unit tests

* XMLdoc correction

* Address James's comment on PR

* Fixed a typographical error (#239)

* invokeWhenNoEvents flag for Pump (#238)

* Adding invokeWhenNoEvents to pump and wiring up EH null callback on this.

* Fix default pump invoke on null behavior

* Stop-CancellationToken for PartitionPump (#187)

* Fixing dev branch badge img URL.

* Bumping up AMQP and some other Nuget package versions (#249)

* Bumping up AMQP and some other Nuget package versions

* Nuet package version bump up for EPH client and fixing Obsolute storage call.

* Enable custom proxy endpoints (#252)

* Enable custom proxy setting fpr client.

* Move webproxy to clients.

* Trivial change

* Addressing SJ's comments.

* Reducing Newtonsoft.Json dependency version to 10.0.3 (#255)

* Fixing empty offset from no-receive partition lease. (#253)

* Notify partitionId in the error callback if lease call fails. (#257)

* Update readme.md

Updating docs to reflect .NET Standard 2.0

* Apply proxy settings for EPH on management client.

* EPH - enable proxy on management client (#261)

* Apply proxy settings for EPH on management client.

* Enable SequenceId receive tests

* Use entire hostname in the EPH traces and error messages. (#263)

* Apply proxy settings for EPH on management client.

* Use hostname instead of trimmed version in the EPH traces.

* Log ReceiverDisconnectedException error message in EPH traces.

* Calculate EPH renew interval (#264)

* Apply proxy settings for EPH on management client.

* Consider reducing the wait time with last lease-walkthrough's time taken.

* Increase operation-timeout for tests to reduce intermittent timeout failures.

* Partition manager lease handling improvements (#266)

* Apply proxy settings for EPH on management client.

* Parallel lease renewal and couple more improvements in partition manager.

* Nit

* Handle lease renew race between checkpointing and partition manager renew loop.

* Don't throw for missing partition-key since not all events will have one.

* Fix event data batch size miscalculation (#268)

* Apply proxy settings for EPH on management client.

* Fix EventDataBatch size calculation

* Fixing overhead calculation for smaller messages.

* Optimize

* Shorten unit test names

* Reduce total run time for EPH tests.

* Increase ResetEvent timeout

* Move EPH negative test to its own class.

* Shorten run time for PartitionKeyValidation test

* Rename GetSize to GetEventSizeForBatch

* Take wrapper message into account while calculating batch size.

* Remove MultipleConsumerGroups from EPH test coverage. This test doesn't really validate any functionality.

* Bumping up Microsoft.Azure.Services.AppAuthentication to 1.0.0 release

* Fix ObjectDisposedException when send retried with the same set of EventDatas (#275)

* Fix ObjectDisposedException when send retried with the same set of messages.

* Correct where to reset AmqpMessage on EventData

* Always use default timeout for AMQP sesssion create. (#272)

* Always use default timeout for AMQP sesssion.

* Fix timeouts for AMQP management calls

* Fix timeouthelper on Sender as well.

* AAD Token Provider support for EPH (#276)

* AAD Token Provider support for EPH

* Remove AAD secrets from test

* Addressing CR comments

* Support Azure Storage AAD/MSI token provider at EPH instantiation. (#281)

* Support Azure Storage AAD/MSI token provider at EPH instantiation.

* Versioning updates.

* Mergin couple more changes from dev to master

* Merge ProcessorTestBase changes from dev to master

* Bump up SDK version to 2.1.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants