Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v6.3.0 connection bug #963

Closed
Tracked by #992
grod220 opened this issue Apr 19, 2024 · 7 comments · Fixed by #994
Closed
Tracked by #992

v6.3.0 connection bug #963

grod220 opened this issue Apr 19, 2024 · 7 comments · Fixed by #994
Assignees
Labels
bug Something isn't working priority Important to work on next

Comments

@grod220
Copy link
Collaborator

grod220 commented Apr 19, 2024

There have been reports of users having syncing issues on v6.3.0. From @conorsch on discord:

today i was able to replicate the cannot-sync problem on the web extension, using v6.3.0 from the publishes releases. here's a screenshot of the browser console error message, and a searching string: "Could not establish connection. Receiving end does not exist."

this error occurred on a fresh launch of Google Chrome, with a previously installed extension version. to resolve, i tried several things that did not work:

1. restarting the browser
2. force-refreshing the page

i was able to "kick it" and restore syncing functionality, by—drum roll please—:

1. opening the rpc settings
2. confirming the default grpc.testnet.penumbra.zone was the only rpc endpoint configured
3. pressing the Save button on the rpc endpoints, even though i made no changes
4. waiting for countdown to restart, then force-refreshed the page.

i am uncertain whether the force-refresh in step 4 was required or not.

Screenshot 2024-04-19 at 2 50 56 PM

For minifront, this produces an "Penumbra extension unavailable" page.

@grod220 grod220 added bug Something isn't working priority Important to work on next mainnet labels Apr 19, 2024
@github-project-automation github-project-automation bot moved this to 🗄️ Backlog in Penumbra web Apr 19, 2024
@grod220 grod220 moved this from 🗄️ Backlog to 📝 Todo in Penumbra web Apr 19, 2024
@Valentine1898
Copy link
Contributor

I can reproduce this, but the thing is, it's happening now because of a network halt
From this we can deduce that you will get “Penumbra extension unavailable” if the services have not been started. When the network is halted this is related to #971
But, when @conorsch mentioned this bug, the network had not been halted yet, so we probably need to look for another reason why the services in extension might not be started

The bad news is that the error log from minifront doesn't provide any useful information because the issue is in extension.

If this bug does not occur for all users, we should make sure that other wallet extensions do not affect the extension transport (e.g. test in a new empty browser profile).

@conorsch it would be very helpful if you could share the extension logs after the testnet is deployed

@Valentine1898
Copy link
Contributor

Based on the steps Connor described to restore sync I suspected this PR
#868

  1. Switched to commit 725be7783a55e784dd01ddcd01b40636fea1f6f4 (before Add RPC selection to onboarding flow #868 was merged)
  • Notice that local storage does not have a grpcEndpoint record
    telegram-cloud-photo-size-2-5285050184096670252-y
  1. Switched to commit 71b71e82bb02a4aa1f8bed7ebf8a92d5757d251e (after Add RPC selection to onboarding flow #868 merged)
  • Service worker does nothing because it is waiting for grpcEndpoint to be set in chrome local storage (look at waitUntilGrpcEndpointExists() )
  • minifront shows “Penumbra extension unavailable”.

I think that's the most likely cause of the bug

We're probably better off with a release with a fix. I see two options:

  • fill grpcEndpoint value in local storage with default value
  • show the user a mandatory dialog with grpc endpoint selection

cc @grod220 @jessepinho

@turbocrime
Copy link
Contributor

i have some work on this in #984 and #983 already

@grod220
Copy link
Collaborator Author

grod220 commented Apr 26, 2024

Ah, good find! This is tricky. Up to this point, we have made the default rpc selection on behalf of users and now that they need to select something---we have to migrate those old users somehow. Yes, think you are right Valentine. We can either a) somewhat forcefully migrate them via hardcoding a default or b) do a better job of signaling errors to users.

I'm somewhat leaning toward option B. Currently, we don't have a way of propagating errors to users within the extension:

Screenshot 2024-04-26 at 10 41 16 AM

However, for an immediate fix, it seems easiest to simply add a default. Perhaps we can do this with a code comment to remove before we go to mainnet. I'll create another issue for error messaging in the extension.

@Valentine1898 Valentine1898 self-assigned this Apr 26, 2024
@Valentine1898
Copy link
Contributor

Valentine1898 commented Apr 26, 2024

It turned out to be more complicated than I anticipated

If we just add grpcEndpoint to localDefaults again, it will break our onboarding behavior
The service worker has a waitUntilGrpcEndpointExists() function that waits until the user selects grpcEndpoint and only then it initializes services.
But if we just return the default value for grpcEndpoint, the service worker will start initializing before the user has selected grpcEndpoint (with the default value)

It seems to me that we have to do a more tricky migration, which should affect users who have already finished onboarding but do not have grpcEndpoint set.

But, the thing is, we can't write the usual migration either
If we just bump the chrome local storage version to V3 and write something like

export const migrations: Migration = {
  grpcEndpoint: {
    [LocalStorageVersion.V1]: old => old || DEFAULT_GRPC_URL,
    [LocalStorageVersion.V2]: old => old || DEFAULT_GRPC_URL,
  },
};

Then it won't work, because migrations only work when we've previously set the value

async get<K extends keyof T>(key: K): Promise<T[K]> {
const result = (await this.storage.get(String(key))) as
| Record<K, StorageItem<T[K]>>
| EmptyObject;
if (isEmptyObj(result)) return this.defaults[key];
else return await this.migrateIfNeeded(key, result[key]);
}

Also, ideally, if a new user interrupts onboarding during the RPC selection phase and resumes onboarding afterwards, we should not consider them a migration candidate

@Valentine1898
Copy link
Contributor

Valentine1898 commented Apr 26, 2024

Also, ideally, if a new user interrupts onboarding during the RPC selection phase and resumes onboarding afterwards, we should not consider them a migration candidate

Ah, okay, we don't have to worry about that because if the user interrupts onboarding during the RPC selection phase he will get a broken extension without the grpcEndpoint set anyway 🤔

@Valentine1898
Copy link
Contributor

#993

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority Important to work on next
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants