Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2.0] Refactor how the events are captured and stored #487

Merged
merged 27 commits into from
Dec 9, 2017
Merged

[2.0] Refactor how the events are captured and stored #487

merged 27 commits into from
Dec 9, 2017

Conversation

ste93cry
Copy link
Collaborator

@ste93cry ste93cry commented Aug 4, 2017

Q A
Branch? 2.0
Bug fix? no
New feature? yes
BC breaks? yes
Deprecations? no
Tests pass? yes
License MIT

This PR introduces an object-oriented approach for the storage of events. Before the events were stored as plain arrays of data which were frequently passed through functions. As all know PHP arrays are passed by variable while instead objects are passed by reference, so there should be improvements in term of performance. Does it makes sense to make the Event class immutable (like the PSR-7 Request class?

@Jean85
Copy link
Collaborator

Jean85 commented Aug 4, 2017

There shouldn't be a use case for modifying events after their creation, so 👍 for immutable events, since they are being passed around by reference now and we should avoid any "spooky action from distance".

Copy link
Collaborator

@Jean85 Jean85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's still a WIP, I've added just small comments.
But good job, go ahead!

composer.json Outdated
@@ -20,6 +20,7 @@
"php-http/discovery": "~1.2",
"php-http/httplug": "~1.1",
"psr/http-message-implementation": "~1.0",
"ramsey/uuid": "~3.6",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we lower this to 3.3.0? Just for more BC, since that release is the oldest that I would advise to use (due to the fix to the collision bug).

@@ -737,7 +694,7 @@ public function sanitize(&$data)
/**
* Process data through all defined \Raven\Processor sub-classes
*
* @param array $data Associative array of data to log
* @param array $data Associative array of data to logf
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo

@@ -65,6 +65,20 @@ public function __construct(Client $client)
$this->reprSerializer = $client->getReprSerializer();
}

/**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation error

@ste93cry
Copy link
Collaborator Author

ste93cry commented Aug 4, 2017

According to the official SDK reference what is not an attribute of the event is assumed to be an interface. I was thinking to split make the event composable: a set of data collectors should collect the data (e.g. the stacktrace or the user information) and then somehow the resulting data should be added to the event. What do you think about this? This way the Event class would be decoupled from the logic to track additional data and it would make easier to add support for new interfaces if needed

@Jean85
Copy link
Collaborator

Jean85 commented Aug 5, 2017

I like the idea of splitted interfaces that follows the official SDK. I just hope that those are not too much work for no benefit!!
For me is 👍 but I would like to ear from @stayallive too, because I'm not sure that we can use a read-only-value-object approach with all of those stuff.

@stayallive
Copy link
Collaborator

Well I'm not sure if we would benefit from making it immutable since an event is mostly only handled by the Client and we might shoot ourselfs int he foot making it immutable (if we later want to introduce middleware for example) to make it easier for 3rd parties to modify the event being sent or scrub some extra data or something, or am I thinking in the wrong direction here?

I do like the idea of splitting the event in components that can be composed to an event however this sounds very complex, how would you see an event being built (in pseudocode)?

@ste93cry ste93cry mentioned this pull request Sep 9, 2017
@nokitakaze
Copy link
Contributor

@Jean85
Copy link
Collaborator

Jean85 commented Sep 11, 2017

I've re-read the code, and I'm now inclined against an immutable object: everywhere we do a $event = $event->withSomething();, which is easy to miss and create issues. We cannot leverage the advantage of immutability here, since the event is created, filled and the immediately sent or stored to be sent; it's no longer juggled around.

Therefore, I'm 👍 for avoiding complexity and go forward with this without immutability. @ste93cry do you want help with this?

@ste93cry
Copy link
Collaborator Author

since the event is created, filled and the immediately sent or stored to be sent; it's no longer juggled around.

This is not entirely true, as the event is supposed to be passed to the processors which can edit the data it contains. Also, an immutable event would prevent any modification of itself after it reached the Httplug client that could create bugs or strange things difficult to track.

do you want help with this?

The main problem with this PR is how to collect things for the SDK data interfaces. I was thinking about creating a sort of system that works with the middleware pattern so that the event can be composed and then the processors are called on top of the built object. This would allow third parties to hook into the process at any time. Does it makes sense?

@Jean85
Copy link
Collaborator

Jean85 commented Sep 11, 2017

For me it make sense, but you would be forced to create a generic ->with($name, $value) method, since the current implementation has freedom to use any string as a sub-category/tag for the event.

@ste93cry
Copy link
Collaborator Author

if (in_array(get_class($exception), $this->config->getExcludedExceptions())) {
return null;
}

The following check can be found in the Client::captureException method. An exception can have multiple previous exceptions, shouldn't this check be repeated for each one instead of the first only?

@Jean85
Copy link
Collaborator

Jean85 commented Oct 23, 2017

The following check can be found in the Client::captureException method. An exception can have multiple previous exceptions, shouldn't this check be repeated for each one instead of the first only?

I don't know, but this would be a change in the behavior of the lib; I think we should be coherent with all the others Sentry implementantions, and we should change this only if the others are doing it like that.

@nokitakaze
Copy link
Contributor

I understood the meaning of this pull request only now. I approve changing of storing internal data from plain array to structured objects.
https://github.com/nokitakaze/php-mutex/blob/master/src/phpdoc.php
https://github.com/nokitakaze/php-keyvalue/blob/master/src/phpdoc.php
https://github.com/nokitakaze/php-queue/blob/master/src/phpdoc.php

This is how structured data (like Sentry events) should be stored. It helps to understand the meaning of every field in event datum, makes highlights in code (in IDE) and allows to find incorrect using of fields (i.e. using integer field as string argument in some method).

@ste93cry
Copy link
Collaborator Author

I think we should be coherent with all the others Sentry implementantions,

I had a look at the other SDKs, and the most common like Javascript or Ruby does not have anything similar to the "previous exception" concept of PHP. Instead, the Python SDK behaves a bit different: the excluded_exceptions option accepts strings or classes. In the first case the string is matched strictly against the name of the exception class, in the second case the exception object instance is matched against each item of the list using issubclass so that you don't have to explicitly configure all exceptions classes if they all share the same common parent. As there doesn't seems to be a shared behaviour between integrations regarding how an exception is excluded, I think my proposal to apply the filter of excluded exceptions to the nested exceptions is valid

@Jean85
Copy link
Collaborator

Jean85 commented Oct 23, 2017

Ok, good investigation. Please list it as a BC in the migration document then.

@stayallive stayallive changed the title [WIP] Refactor how the events are captured and stored [2.0] [WIP] Refactor how the events are captured and stored Oct 29, 2017
@ste93cry
Copy link
Collaborator Author

Does it makes sense to log the session_id() in the user collector? According to the SDK reference of the User interface:

You should provide at least either an id (a unique identifier for an authenticated user) or ip_address (their IP address).

I don't think the session ID is a valuable information (not that the IP address is more valuable anyway imho). If users want to identify a user that got an error, then they should add tags or even better create their own middleware and edit the event object by attaching whatever info they want

@ste93cry
Copy link
Collaborator Author

Ok so here it's the status of this PR: I asked to review the code because it's almost done, the only thing I would like to improve a bit is the code coverage of some methods (e.g. the Client::capture method) but this can be done later with other changes. I would also ask if it makes sense to keep the Client::getLastEventID method as Client::capture already returns the same information and we could then simplify a bit the class.

@Jean85
Copy link
Collaborator

Jean85 commented Nov 27, 2017

I'm 👍 for removing Client::getLastEventID, since it will remove an other bit of state from the client. Anyone else can think of a reason for not dropping it?

Copy link
Collaborator

@Jean85 Jean85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've asked a few questions, but overall this PR is great! 👍


- The `Client::captureQuery` method has been removed.

- The `Client::captureMessage` method has changed its signature by removing the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should add a reason to why we removed that?
Also: it's still possible to send a generic message to sentry like this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example: can we easily attach a stack trace to the message?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's still possible to send a generic message to sentry like this?

It's possible by using the Client::capture method as all other Client::capture<...> methods just forward the call to it by passing the right options as payload.

can we easily attach a stack trace to the message?

you can by passing the exception option in the $payload argument.

/**
* @var callable The tip of the middleware call stack
*/
private $middlewareTip;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should call it firstMiddleware or middlewareStackTip for more clarity?

/**
* Seeds the middleware stack with the first callable.
*/
private function seedMiddlewareStack()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we do it in the constructor once and for all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, original code for the middleware part was taken from a PHP trait available to Slim framework users and meant to be reusable, but there is no reason we have the same requirements here 👍

@@ -979,13 +672,43 @@ private function isEncodingCompressed()
return 'gzip' === $this->config->getEncoding();
}

private function autoloadRavenStacktrace()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this has been removed? It's still needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot reproduce the bug anymore so I assumed it was somehow fixed as it's pretty old (PHP 5.2)... I tried testing with the code found in this issue: https://bugs.php.net/bug.php?id=42098.

];
}

public function testInvokeWithExceptionContainingLatin1Characters()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on migrating all those tests! 👍

@stayallive
Copy link
Collaborator

I am not sure Client::getLastEventID should be removed, I do see why it should, but this would require any plain SDK implementers to implement their own state while it's such a minor thing to do ourselfs. And yes it is also a minor thing to do for the end user, but still I'm not sure.

For the ease of use, just 3-5 lines you can implement the lib (although I am not sure what the state on that is now...) and implementing user feedback means just asking the client for the last ID no matter from where it was called (maybe even by 3rd party components on which you have no control).

So in favor of keeping it simple for the end user (which I think should be considered over ultra-clean code in this lib) it might be a good idea to leave the Client::getLastEventID.

@Jean85
Copy link
Collaborator

Jean85 commented Nov 27, 2017

It's a nice reason, in fact. I could happen frequently that Sentry is not called directly but by third party integrations (the bundle, the Laravel integration...) so yeah, maybe we should leave it.

Even better, maybe we can change it to Client::getLastEvent()! With this PR the Event is an immutable object, so it can be passed to the end user without issues.

@stayallive
Copy link
Collaborator

I would propose to have Client::getLastEvent but also a Client::getLastEventID (which uses Client::getLastEvent to retrieve the event and just returns the ID) for the ease of use and not breaking BC and people needing to know how to retrieve the ID from the Event object.

@Jean85
Copy link
Collaborator

Jean85 commented Nov 28, 2017

Ok. We can deprecate it to drop in in next major, so people will migrate toward Event::getId().

@stayallive
Copy link
Collaborator

Yeah we can, although I'm not bothered by it being there (if it uses the Event for getting the ID).

But if you want to deprecate it I'm not against.

@ste93cry
Copy link
Collaborator Author

I would propose to have Client::getLastEvent but also a Client::getLastEventID

My opinion about removing any getLastEvent* method was dictated by the fact that I don't think it's useful to maintain in the state of the client the last event (ID or object). The only reason I could think of an user would want to get the event is to log it somewhere but this is possible now with a custom middleware. Also, the event is just a container of information that was added to provide a better way to interact with the message and to prevent passing arrays all along which gets copied everytime by PHP: it does not know if it was sent, it's being sent, failed to be sent or whatever else. By the way if all agree to just deprecate it and not take the chance to remove it in 2.0 I'm fine with it

@dcramer
Copy link
Member

dcramer commented Nov 28, 2017

getLastEventID (or getLastEvent, but thats less useful) needs to stay. This is used for things like User Feedback, where you want to simply respond to "there was an error, I want to ask the user for feedback".

In general, a lot of the API that exists, make an assumption that it has a good reason to exist, and I'd consider getting sign off from @getsentry/sdks before removing an actual feature.

@ste93cry
Copy link
Collaborator Author

In general, a lot of the API that exists, make an assumption that it has a good reason to exist

Sorry if my message seemed to be rude, it was not my intention. I like to simplify things the most I can and I didn't know the getLastEventID method was useful for the User Feedback feature. I still think now that we have the middleware pattern we should point users to use that instead of relying on the client maintaining its own state, but it's fine if everyone agrees as I said before. I'm going to fix the remaining things, if anyone wants to add more comments to the review before this PR gets merged this is the good time

@ashwoods
Copy link

I've been refactoring the python SDK and implementing a prototype roughly based on this gist. Some features like an interface for internal SDK errors have only been shortly discussed. Axing the equivalent of getLastEventID is also planned once the integrations don't depend on it.

@Jean85
Copy link
Collaborator

Jean85 commented Nov 29, 2017

Ok so:

  • getLastEvent(): Event should be added
  • we deprecate getLastEventId() in favor of getLastEvent()->getId()
  • we will always have an endpoint that will allow the end user to retrieve the last event in some form (that is required to Python too, @ashwoods!)

It seems a good plan to me 👍

Copy link
Collaborator

@Jean85 Jean85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a lot of work, this LGTM! 👍
If it's no longer WIP, we can merge it!

@ste93cry
Copy link
Collaborator Author

ste93cry commented Dec 8, 2017

we deprecate getLastEventId() in favor of getLastEvent()->getId()

should I trigger a deprecated notice in the getLastEventId() method?

@Jean85
Copy link
Collaborator

Jean85 commented Dec 8, 2017

Since I suggested it, I'm obviously 👍 for it. That will mean that it will not be dropped until next major, so I don't think others will oppose it.

@ste93cry ste93cry changed the title [2.0] [WIP] Refactor how the events are captured and stored [2.0] Refactor how the events are captured and stored Dec 9, 2017
@ste93cry ste93cry merged commit 873ab3c into getsentry:2.0 Dec 9, 2017
@ste93cry ste93cry deleted the event-refactoring branch December 9, 2017 19:47
@Jean85 Jean85 mentioned this pull request May 4, 2018
27 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants