Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deprecate DEFAULT_WORKSPACE_ID #5009

Merged
merged 2 commits into from
Jul 28, 2021
Merged

Conversation

cgardens
Copy link
Contributor

@cgardens cgardens commented Jul 27, 2021

closes #4716

What

Describe what the change is solving
It helps to add screenshots if it affects the frontend.

How

  • In the main method of the ServerApp, create a workspace if no workspace exists. This replaces the existing behavior where if the default workspace exists but hasn't been assigned a customer id (for segment tracking) then we do assign it. Now this will be handled as part of creating the workspace.
  • Remove workspace as the seeds in airbyte-config:init
  • Add workspaceId to the TrackingClient iface, so that tracking get fetch the appropriate workspace id

Note

We need to make a trade off here. Because we are opting to not add an "organization" level concept now the tracking story gets a little complicated. The rest of this section will assume that we are not adding an organization concept now. @michel-tricot & @jrhizor would love your thoughts on this to sanity check I haven't over looked some better option.

Option 1: (this is this PR as currently implemented)

In this approach the workspace becomes our core tracking identifier because we choosing to not have a formal way of connecting them. This gets weird in a few cases:

  1. When users have multiple workspaces in the OSS project.
  2. When users do the old manual upgrade process where the old workspace was replaced by the new one (and aliased in amplitude). That's not really reasonable to think about anymore.

For our metrics in BQ we can build up a user identity by assuming that all workspaces that have ever been on the same deployment are the same user. I am not sure how realistic this is in amplitude. In which case we would have some double counting of users in some metrics because we would assume each workspace is a separate user.

Option 2: Restrict multiple workspace in the OSS product

The only other thing I just thought of is saying that in the OSS version we allow one workspace. That should resolve, I think, all of these problems. Potentially just keeping this limitation in the UI will be good enough (I think there was only one user trying to use multiple workspace support via the API and I think they aren't doing it anywhere). In the cases that someone has used the API to have multiple workspaces, the metrics will be wrong, but we will assume it is small enough because it is API-only to not affect things. While not ideal long term this probably could get us by until we add an organization concept.

In this case, we can pretty much follow the protocol we already have on master. Basically whenever we use the import endpoint we alias any existing workspaces to whatever the first workspace is in the import. Again in the case where someone has more than one workspace from setting up in the API that'll be a little funky, but it should be incredibly rare and cause only small errors.

Doing this doesn't seem like that much additional work on top of what's already in this PR, while it's a hack, I at least feel like the metrics will be very accurate. In the multiple workspace world in OSS, I think we might totally slag our metrics.

Recommended reading order

  1. ServerApp.java
  2. TrackingClientSingleton and SegmentTrackingClient

Pre-merge Checklist

  • - need to write a migration that takes every workspace id that is a DEFAULT_WORKSPACE_ID and generate a new UUID for it. Likely will do this as a separate PR as it is easy to review separately.

@cgardens cgardens requested review from jrhizor and tuliren July 27, 2021 01:37
@github-actions github-actions bot added the area/platform issues related to the platform label Jul 27, 2021
import io.airbyte.validation.json.JsonValidationException;
import java.io.IOException;
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.ExecutionException;
import org.checkerframework.checker.nullness.qual.NonNull;

// todo (cgardens) - this class is in an unintuitive module. it is weird that you need to import
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to move this class so that JobNotifier and JobTracker could get at it. feels like a lame brain move, but I couldn't find a better place, and was reticent to add another module just for this.

@@ -119,7 +106,6 @@ public ImportRead importDataWithSeed(String targetVersion, File archive, Path se
private ImportRead importDataInternal(String targetVersion, File archive, Optional<Path> seedPath) {
Preconditions.checkNotNull(seedPath);

final Optional<UUID> previousCustomerIdOptional = getCurrentCustomerId();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no longer trying to track and alias previous workspace. in the automigration case this was superfluous anyway. in the case of using the import endpoint this could mess up metrics a little. see the note about the trade off we are making in the main description of the pr.

Copy link
Contributor

@tuliren tuliren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Some of the files have changes that just add final keywords to the variables. This seems unnecessary, especially because they are not affected by this PR, and some of the files are tests only (e.g. ConfigPersistenceBuilderTest.java), and I don't think anyone is going to redefine those variables there.

identitySupplier.get().getAirbyteVersion(),
identitySupplier.get().getCustomerId(),
identityFetcher.apply(workspaceId).getAirbyteVersion(),
identityFetcher.apply(workspaceId).getCustomerId(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function may query the database. So we probably want to store the result of identityFetcher.apply(workspaceId) instead of calling the apply twice here.

@cgardens
Copy link
Contributor Author

Nit. Some of the files have changes that just add final keywords to the variables. This seems unnecessary, especially because they are not affected by this PR, and some of the files are tests only (e.g. ConfigPersistenceBuilderTest.java), and I don't think anyone is going to redefine those variables there.

Haha. I just do that instinctively when I read code now so that I know that reassignment can't happen. Does it mess you up if I add the finals in? Or is the issue that it makes the reviewing hard because the diff is noisy? I can revise my behavior to make PR reviews easier.

harmony

getting close

Revert "wip"

This reverts commit 8a7ab00.

tracking fixed to work with workspace?

purge default workspace id

fix HealthCheckHandlerTest

fix tests
@cgardens cgardens force-pushed the cgardens/add_workspace_id_to_tracking branch from f908ed6 to d75601c Compare July 28, 2021 20:52
@cgardens cgardens merged commit 1f58fb7 into master Jul 28, 2021
@cgardens cgardens deleted the cgardens/add_workspace_id_to_tracking branch July 28, 2021 23:59
@tuliren
Copy link
Contributor

tuliren commented Jul 29, 2021

Haha. I just do that instinctively when I read code now so that I know that reassignment can't happen. Does it mess you up if I add the finals in? Or is the issue that it makes the reviewing hard because the diff is noisy? I can revise my behavior to make PR reviews easier.

No, the finals do not mess up with my change. But they are a distraction for the code review. In general reassignment is not a big concern. And people usually are careful not to do it. So having finals all over the places seem like a hassle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform issues related to the platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Airbyte Cloud: Remove (or handle) default workspace
2 participants