Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul error handling to prevent unrelated text() error from masking underlying exceptions #135

Merged
merged 8 commits into from
Oct 9, 2023

Conversation

jhamon
Copy link
Collaborator

@jhamon jhamon commented Oct 6, 2023

Problem

Some users are still running into errors that can't be properly diagnosed because of a bug in error handling causing the real error to get obscured by an unrelated problem. There's a point at which we assume we should be dealing with a ResponseError with a text() field; in reality, that's not a correct assumption so it's not safe to cast to that type.

Solution

This was one of those changes where you pull on the thread and the whole carpet unravels.

  • Refactor error handling away from nested callbacks into a flat error handler function
  • Wrap the underlying exception by attaching it as a cause whenever a PineconeConnectionError exception is being thrown. This will help us investigate [Bug] v1 Index<T>.query throwing TypeError: Cannot read properties of undefined (reading 'text') during migration #124
  • Migrate away from method-specific error handling to a consistent approach based on middleware functions. We already had an httpErrorMapper, but with this approach it only has to get wired up twice instead of like 20 times so that really simplifies what we need to cover in unit testing.
  • Implement a bunch of new integration tests to give confidence the middleware is actually working as expected.
    • The cost of this change is we lose a little bit of convenience in the 404 case where we were fetching and presenting the list of valid indexes/collections. But that small bit of convenience doesn't seem worth a radically more complex approach to error handling.
    • We no longer need to test exception handling in most method implementations
    • It unlocks the ability for some really cool debug options. See Add PINECONE_DEBUG request middleware #136.

Summary of files changed:

  • Error handling logic in src/errors/handling.ts:
    • Refactor from nested callbacks into a flat error handler function.
    • Continue to delegate to httpErrorMapper, but check that the error actually is a ResponseError before trying to access the text() property.
    • For other errors, wrap the underlying errors and throw PineconeConnectionError
  • Error classes in src/errors:
    • PineconeConnectionError and BasePineconeError modified to allow reference to underlying errors to be captured.
    • Revised error message that explains more reasons why PineconeConnectionError might occur.
  • Middleware: These get invoked at different points in the lifecycle of every request.
    • Defined middleware using handleApiError
    • Wired up new middleware for error handling in src/pinecone.ts and src/data/index.ts. Now we can be confident we are handling errors consistently across every method without the need for a lot of repetitive unit testing to verify wiring on each client method.
  • New integration tests.
    • Added a more robust suite of integration tests to exercise the error handling in a variety of real scenarios.
    • Deleted method-specific error handling that is no longer needed
    • Deleted method-level unit testing of error handling

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

Test Plan

Describe specific steps for validating this change.

@jhamon jhamon force-pushed the jhamon/error-handling branch from 204670e to 316adc1 Compare October 6, 2023 21:39
@jhamon jhamon force-pushed the jhamon/error-handling branch from 316adc1 to 808366c Compare October 6, 2023 21:45
@jhamon jhamon marked this pull request as ready for review October 6, 2023 22:13
Copy link
Contributor

@austin-denoble austin-denoble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work, really appreciate this cleanup. I found it's much easier overall to reason about how errors are flowing through things now as opposed to before. I also really appreciate adding in additional integration tests while you're here.

Just a few questions, but overall looks good to me. Having error handling and throwing centralized rather than all over the place feels good. 🚢

@@ -8,13 +9,56 @@ import { BasePineconeError } from './base';
* - Incorrect configuration of the client. The client builds its connection url using values supplied in configuration, so if these values are incorrect the request will not reach a Pinecone server.
* - An outage of Pinecone's APIs. See [Pinecone's status page](https://status.pinecone.io/) to find out whether there is an ongoing incident.
*
* The `cause` property will contain a reference to the underlying error. Inspect its value to find out more about the root cause of the error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Thank you for the docstring date. 👍


export const middleware = [
{
onError: async (context) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Huge praise for all this cleanup, I feel like errors are much more straightforward to reason around now, and it's really helpful to pull all the testing into a more centralized place rather than having it spread across all the individual methods.

@@ -0,0 +1,11 @@
import { ResponseError } from '../pinecone-generated-ts-fetch';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Can this file be removed? Doesn't seem like it's actually being used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it can be removed. I had created a new test-helper in here but later I ended up deleting all the unit tests that were using it. 😂

Comment on lines +9 to +24
beforeEach(async () => {
pinecone = new Pinecone();
indexName = randomIndexName('configureIndex');

await pinecone.createIndex({
name: indexName,
dimension: 5,
waitUntilReady: true,
podType: 'p1.x1',
replicas: 2,
});
});

afterEach(async () => {
await pinecone.deleteIndex(indexName);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Do we need to be creating and deleting an index for each test rather than using beforeAll and afterAll? It seems like this would lead to slower tests as we have to wait somewhere around 10 seconds for each index to spin up for each test. For this specific suite could we not just create an index once, perform all the tests and operations, and then delete afterAll for cleanup?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As currently written, this test suite does end up adding quite a bit of time due to the index creation. But for now I want to keep every test fully isolated from others.

There seems to be some timing issues in this configure endpoint when called multiple times at a row. At one point I had a test here that scaled replicas up and then down again (using the same index) and it was throwing a 500 error at a high rate, even when I waited for status to reach the Ready state. I don't think they envisioned people scaling up and down separated only by milliseconds.


test('calling data plane', async () => {
const p = new Pinecone({
apiKey: 'api-key-2',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Should this still be process.env.PINECONE_API_KEY? Asking as the section is described as when environment is wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it to make it more clear that the environment is the only thing that matters here. Ultimately only the environment value matters for this test since it's an input into the url and there is no wrong-environment environment in production.

@jhamon jhamon merged commit edd85a8 into main Oct 9, 2023
18 checks passed
@jhamon jhamon deleted the jhamon/error-handling branch October 9, 2023 17:21
@jhamon jhamon changed the title Overhaul error handling to address text() error Overhaul error handling to prevent unrelated text() error from masking underlying exceptions Oct 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants