Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APIError unmarshaling broken in 1.4 #339

Closed
dsymonds opened this issue Jun 30, 2021 · 15 comments · Fixed by #382
Closed

APIError unmarshaling broken in 1.4 #339

dsymonds opened this issue Jun 30, 2021 · 15 comments · Fixed by #382
Labels
rest api issue this looks to be an issue with the rest API, not our client
Milestone

Comments

@dsymonds
Copy link
Contributor

When making some types of API calls, the error response fails to unmarshal into the JSON structures and so the error itself is lost to client code, who only gets a JSON error. In particular, I was creating an override in the past, and triggered this.

My apologies for not capturing the error itself (I worked around it, then lost my terminal history), but it should be easy to reproduce. Let me know if that's not the case and I'll try to go back into my local history.

@dsymonds
Copy link
Contributor Author

dsymonds commented Jun 30, 2021

Oh, to be clear, this was using v1.4.1, though I don't think any differences between v1.4.0 and v1.4.1 are relevant.

@theckman
Copy link
Collaborator

@dsymonds I would need a reproduction case so that I can figure it out.

@dsymonds
Copy link
Contributor Author

I reproduced it with the pd command line tool in this repo (after fixing a wee bug in it).

Using an input file like:

{
  "start": "2021-06-29T12:00:00Z",
  "end": "2021-06-29T18:00:00Z",
  "user": {...}
}

or with any other date in the past, you can get an API response like:

{"message":"Invalid Override","code":2001,"errors":"Override must end in the future, Override must end after its start"}

and that doesn't unmarshal into the APIErrorObject struct, which has Errors []string.

@theckman
Copy link
Collaborator

theckman commented Jun 30, 2021

@dsymonds thank you for that. That errors key is supposed to be an array, at least according to their API documentation.

https://developer.pagerduty.com/docs/rest-api-v2/errors/

@stmcallister could you ping someone inside of PagerDuty to see whether that's expected? API documentation shows errors being an array but here it's a string.

@stmcallister
Copy link
Contributor

Interestingly, when I create an override with a raw API call, using dates in the past as you did, my errors object is an array of strings.

[
    {
        "status": 400,
        "errors": [
            "Override must end in the future",
            "Override must end after its start"
        ],
        "override": {
            "start": "2021-07-01T00:24:27Z",
            "end": "2021-06-02T04:00:00Z",
            "user": {
                "id": "PZACW3D",
                "type": "user_reference",
                "summary": "Bart J Simpson",
                "self": "https://api.pagerduty.com/users/PZACW3D",
                "html_url": "https://pdt-smcallister.pagerduty.com/users/PZACW3D"
            }
        }
    }
]

@theckman
Copy link
Collaborator

theckman commented Jul 2, 2021

I'm noticing that the output @stmcallister provided looks to be an array of objects, which may explain the failure. If so, I should be able to fix that.

While looking into that, though, I'm actually surprised the code works at all since it seems to provide a different payload than what the documentation shows. The documentation says the payload needs to be an object with one key, overrides, that's an array of objects (notice the plural).

In our code, it's a single object at the key override (singular):

go-pagerduty/schedule.go

Lines 261 to 265 in e23b94c

d := map[string]Override{
"override": o,
}
resp, err := c.post(ctx, "/schedules/"+id+"/overrides", d, nil)

It seems like we might be making use of some undocumented API feature that allows us to specify a single object. We probably need to fix this too, to be aligned with the spec.

theckman added a commit that referenced this issue Jul 4, 2021
This is to help with debugging issue #339. This branch is based off of #325, so
that I could make use of the new response debugging functionality I added.

I ran the following commands from within the `command` directory to get the
below result:

```shell
go build -o pd .
./pd schedule override create -authtoken "$(cat ../api.key)" PF4QFIN debug_override.json
```

The `api.key` file is just an API key from the PD console. `PF4QFIN` is a
PagerDuty service in my development account. Here is the output:

```
time="2021-07-04T01:00:24-07:00" level=info msg="service id is:PF4QFIN"
time="2021-07-04T01:00:24-07:00" level=info msg="Input file is:debug_override.json"
req:
req.Method: POST
req.URL: https://api.pagerduty.com/schedules/PF4QFIN/overrides
req.Header: http.Header{"Accept":[]string{"application/vnd.pagerduty+json;version=2"}, "Authorization":[]string{"Token token=u+XP5yPH7pzsQ3aJA5EQ"}, "Content-Type":[]string{"application/json"}, "User-Agent":[]string{"go-pagerduty/1.5.0"}}
req.Body:
{"override":{"start":"2021-06-29T12:00:00Z","end":"2021-06-29T18:00:00Z","user":{"id":"P6QKA9O"}}}

err: HTTP response with status code 400, JSON error object decode failed: json: cannot unmarshal string into Go struct field APIError.error of type []string

resp:
resp.Status: 400 Bad Request
resp.Header: http.Header{"Access-Control-Allow-Headers":[]string{"Authorization, Content-Type, AuthorizationOauth, X-EARLY-ACCESS"}, "Access-Control-Allow-Methods":[]string{"GET, POST, PUT, DELETE, OPTIONS"}, "Access-Control-Allow-Origin":[]string{"*"}, "Access-Control-Expose-Headers":[]string{""}, "Access-Control-Max-Age":[]string{"1728000"}, "Cache-Control":[]string{"no-cache"}, "Connection":[]string{"keep-alive"}, "Content-Length":[]string{"95"}, "Content-Type":[]string{"application/json"}, "Date":[]string{"Sun, 04 Jul 2021 08:00:24 GMT"}, "Server":[]string{"nginx"}, "X-Request-Id":[]string{"f6cda459-587c-49a6-ab98-17fc5bd60642"}}
resp.Body:
{"error":{"message":"Invalid Override","code":2001,"errors":"Invalid input for this override"}}

```

Note that `error.errors` is not an array, it's a single string. This seems to
violate the API spec.
@theckman
Copy link
Collaborator

theckman commented Jul 4, 2021

@stmcallister I was able to repro this, and created a dedicated branch with some extra debug code to show it: https://github.com/PagerDuty/go-pagerduty/tree/debugging_weird_errors

This specific commit has the instructions for how to use that branch to show the API returning an incompatible response: b053fc7

In summary,

cd command
go build -o pd .
./pd schedule override create -authtoken "$(cat ../api.key)" PF4QFIN debug_override.json

@theckman
Copy link
Collaborator

theckman commented Jul 4, 2021

@stmcallister I could also replicate it with:

curl -v -X POST \
    -H 'Accept: application/vnd.pagerduty+json;version=2' \
    -H 'Authorization: Token token=u+GLqAhVYVnyjWVAhZYg' \
    -H 'Content-Type: application/json' \
    --data '{"override":{"start":"2021-06-29T12:00:00Z","end":"2021-06-29T18:00:00Z","user":{"id":"P6QKA9O"}}}' \
    https://api.pagerduty.com/schedules/PF4QFIN/overrides
< HTTP/2 400 
< server: nginx
< date: Sun, 04 Jul 2021 17:01:26 GMT
< content-type: application/json
< content-length: 95
< access-control-allow-origin: *
< access-control-allow-methods: GET, POST, PUT, DELETE, OPTIONS
< access-control-max-age: 1728000
< access-control-allow-headers: Authorization, Content-Type, AuthorizationOauth, X-EARLY-ACCESS
< access-control-expose-headers: 
< cache-control: no-cache
< x-request-id: 5d32e7b8-2194-4117-8a07-2504178df57c
< 
* Connection #0 to host api.pagerduty.com left intact
{"error":{"message":"Invalid Override","code":2001,"errors":"Invalid input for this override"}}

My expectation is that the JSON would instead look like this:

{"error":{"message":"Invalid Override","code":2001,"errors":["Invalid input for this override"]}}

Note: the API key shown here is no longer valid.

@theckman theckman added the rest api issue this looks to be an issue with the rest API, not our client label Jul 4, 2021
@theckman theckman added this to the v1.5.0 milestone Aug 29, 2021
@theckman
Copy link
Collaborator

I'll see whether there is a way to fix this within a v1.4.x patch release, but from what I remember from when I explored it last this is maybe a side effect of us using the API incorrectly. We send JSON in a format different than the API documentation specifies, and since I'll be fixing some of that in v1.5.0 I may try to do it within that version.

@theckman
Copy link
Collaborator

theckman commented Sep 3, 2021

I know nobody else but me (and maybe PD employees) can see this, but I've filed a support ticket with PagerDuty to try and get a sense of what the right path forward is here:

https://tickets.pagerduty.com/hc/requests/341221

This feels like an obvious REST API bug to me, but I'm unsure if PagerDuty feels the same. I'll loop-back on this issue once I've some sort of resolution from that support ticket.

@theckman
Copy link
Collaborator

theckman commented Oct 7, 2021

The ticket is still open with PagerDuty Support. The last communication from them on Sept 5th was that they would be escalating it to engineering, and I'd hear back once they have an answer. I'll loop back around to see if they have an estimate of when that may be.

@tstromberg
Copy link

tstromberg commented Nov 4, 2021

I believe I'm hitting this bug with a this call:

resp, err := client.CreateOverrideWithContext(ctx, o.Schedule, pagerduty.Override{User: o.User, Start: o.Start, End: o.End})
if err != nil {
return fmt.Errorf("failed to create override: %w", err)
}

The value of err is HTTP response with status code 400, JSON error object decode failed: json: cannot unmarshal string into Go struct field APIError.error of type []string

The value of o:

{
  User:{
    ID:PKJZEIC
    Type:user_reference
    Summary:Thomas Stromberg Self:https://api.pagerduty.com/users/PKJZEIC 
   HTMLURL:https://packet.pagerduty.com/users/PKJZEIC
  }
  Schedule:PJ4M2NN
  Start:2021-11-09T08:00:00-08:00
  End:2021-11-09T08:00:00-08:00
}

When I changed the code to dump the body, I see:

Status code: 400 Body: {"error":{"message":"Invalid Override","code":2001,"errors":"Override must end after its start"}}

@theckman
Copy link
Collaborator

theckman commented Nov 9, 2021

I just pinged PagerDuty again for our monthly check-in, as well as asked if there are any engineers I can bribe to look at it sooner. 😅

I'll look at this again before v1.5.0 to see if I can noodle on a temporary shim to handle this bad error format edge case, or see if there are other options.

theckman added a commit that referenced this issue Nov 10, 2021
The PagerDuty REST API documentation[1] indicates that the `errors` field of an
error response is an array of error strings. However, as shown in #339 the API
sometimes violates that specification and instead returns the `errors` field as
a string.

There is a PagerDuty customer support ticket open[2] to to address this issue, but
there is currently no ETA on the resolution. As such we are going to create this
workaround to properly parse the invalid responses and return the proper error
type to callers.

Closes #339

[1] https://developer.pagerduty.com/docs/ZG9jOjExMDI5NTYz-errors
[2] https://tickets.pagerduty.com/hc/requests/341221
theckman added a commit that referenced this issue Nov 10, 2021
The PagerDuty REST API documentation[1] indicates that the `errors` field of an
error response is an array of error strings. However, as shown in #339 the API
sometimes violates that specification and instead returns the `errors` field as
a string.

There is a PagerDuty customer support ticket open[2] to to address this issue, but
there is currently no ETA on the resolution. As such we are going to create this
workaround to properly parse the invalid responses and return the proper error
type to callers.

Closes #339

[1] https://developer.pagerduty.com/docs/ZG9jOjExMDI5NTYz-errors
[2] https://tickets.pagerduty.com/hc/requests/341221
@theckman
Copy link
Collaborator

theckman commented Nov 10, 2021

Alrighty. Since it doesn't seem like PagerDuty engineering may be able fix this for a bit of time, I raised #382 to address this bug on our side by trying to reparse the JSON if it seems to fail due to mismatched types.

@theckman theckman modified the milestones: v1.5.0, v1.4.3 Nov 14, 2021
theckman added a commit that referenced this issue Nov 14, 2021
The bug this fixes was introduced as part of v1.4.0, so we need to backport it
into the v1.4.x release line. Because the master branch has diverged towards
v1.5.0, we need to manually backport this to the release-1.4.x branch.

Original commit message (0933613):

The PagerDuty REST API documentation[1] indicates that the errors field of an
error response is an array of error strings. However, as shown in #339 the API
sometimes violates that specification and instead returns the errors field as
a string.

There is a PagerDuty customer support ticket open[2] to to address this issue, but
there is currently no ETA on the resolution. As such we are going to create this
workaround to properly parse the invalid responses and return the proper error
type to callers.

Closes #339

[1] https://developer.pagerduty.com/docs/ZG9jOjExMDI5NTYz-errors
[2] https://tickets.pagerduty.com/hc/requests/341221
theckman added a commit that referenced this issue Nov 14, 2021
The bug this fixes was introduced as part of v1.4.0, so we need to backport it
into the v1.4.x release line. Because the master branch has diverged towards
v1.5.0, we need to manually backport this to the release-1.4.x branch.

Original commit message (0933613):

The PagerDuty REST API documentation[1] indicates that the errors field of an
error response is an array of error strings. However, as shown in #339 the API
sometimes violates that specification and instead returns the errors field as
a string.

There is a PagerDuty customer support ticket open[2] to to address this issue, but
there is currently no ETA on the resolution. As such we are going to create this
workaround to properly parse the invalid responses and return the proper error
type to callers.

Closes #339
Backports #382

[1] https://developer.pagerduty.com/docs/ZG9jOjExMDI5NTYz-errors
[2] https://tickets.pagerduty.com/hc/requests/341221
@theckman
Copy link
Collaborator

@dsymonds I just backported the fix or this into the release-v1.4.x branch, and cut v1.4.3. This issue will be auto-closed in a few moments when I merge the same fix into the master branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rest api issue this looks to be an issue with the rest API, not our client
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants