Extend web_backend/connections/get to include diff of catalog #13232
Comments
Thanks for this ticket @timroes! @lmossman and I were discussing this yesterday and I don't think we need an oldName field here, since we don't process table name changes. If a table name changes, we would read it as a new table being added and another one being removed, so those would be listed under streamsRemoved and streamsAdded.
Thanks for that info. We had that in the designs so I thought it was planned to track. If not we can of course drop this. Does the same apply for field names?
We can track field name changes. We discussed possibly pulling the field name change display out into a separate feature, but given this new API format maybe it makes sense to include it here. @lmossman what do you think?
@alovew I think it will be fine even extending the API later to include field name changes, since it's just an extension to the API, not breaking any backwards compatibility.
I think this will be complicated to do. It means that the connections/get endpoint will have to run the discover catalog job, wait for it to finish, and then process the diff. Processing the diff is doable, but fetching the new catalog should not happen in this endpoint because of the time it will take.
Sorry, I missed the crucial part here. This should only be calculated as long as […] Since its current behavior is already to refetch the catalog with that parameter set to […]
Sorry, didn't see this earlier. I think field name changes would be hard for us to detect, for the same reason that stream name changes are hard to detect: in either case, we would be comparing the new schema with the old schema, and we would just see that some field/stream name is missing from the new schema and a new field/stream name is present in it. I don't think there is any way we can tell that it was renamed. The only thing we can reasonably detect is field type changes, since we can compare the types of fields with the same name in a given stream.
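The name-based comparison discussed in the comments above can be sketched roughly as follows. This is only an illustration under a simplifying assumption: a catalog is modelled as a map from stream name to a map of field name to type. The `Catalog` type, the `diffCatalogs` function and the `fieldTypeChanges` key are hypothetical and are not Airbyte's actual API; only `streamsAdded` and `streamsRemoved` are names taken from this thread.

```typescript
// Hypothetical simplified catalog: stream name -> (field name -> field type).
type Catalog = Record<string, Record<string, string>>;

interface FieldTypeChange {
  stream: string;
  field: string;
  oldType: string;
  newType: string;
}

interface CatalogDiff {
  streamsAdded: string[];
  streamsRemoved: string[];
  // Hypothetical key, used here for illustration only.
  fieldTypeChanges: FieldTypeChange[];
}

// Own-property check, so stream names like "toString" are handled correctly.
const has = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function diffCatalogs(oldCatalog: Catalog, newCatalog: Catalog): CatalogDiff {
  const streamsAdded = Object.keys(newCatalog).filter((name) => !has(oldCatalog, name));
  const streamsRemoved = Object.keys(oldCatalog).filter((name) => !has(newCatalog, name));

  const fieldTypeChanges: FieldTypeChange[] = [];
  for (const stream of Object.keys(newCatalog)) {
    if (!has(oldCatalog, stream)) continue; // already reported as added
    const oldFields = oldCatalog[stream];
    const newFields = newCatalog[stream];
    for (const field of Object.keys(newFields)) {
      // Only fields present under the same name on both sides can be compared.
      if (has(oldFields, field) && oldFields[field] !== newFields[field]) {
        fieldTypeChanges.push({
          stream,
          field,
          oldType: oldFields[field],
          newType: newFields[field],
        });
      }
    }
  }

  return { streamsAdded, streamsRemoved, fieldTypeChanges };
}

// A renamed table ("users" -> "customers") cannot be recognised as a rename:
// it shows up as one removed stream and one added stream.
const exampleDiff = diffCatalogs(
  { users: { id: "integer" }, orders: { total: "string" } },
  { customers: { id: "integer" }, orders: { total: "number" } },
);
console.log(exampleDiff);
```

Because the comparison is keyed purely on names, a rename carries no signal that links the old name to the new one, which is why only type changes on same-named fields are reported.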
Two things I noticed working on the frontend:
@cgardens Can you have a look at the above questions, please?
For the connection isolation changes, we need information about which streams have changed, and how, when we get a new catalog. When the user uses the "Refresh source schema" button in the UI, we call `v1/web_backend/connections/get` and set `withRefreshedCatalog: true` in the query, which will fetch a new catalog and return that as `syncCatalog` in the response. To implement the designs we have for showing changes, we'll also need this API to have a new key `syncCatalogChanges` that will contain the changes between the new catalog and the previously synced one.

I'd expect the format to be as follows:

`syncCatalogChanges: null`, if there was no previous catalog (i.e. if we're just creating the connection for the first time).

Otherwise `syncCatalogChanges` should be an object in the following format:

@benmoriceau Please let me know if there are any questions around the API.
cc @andyjih
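
For illustration, the frontend could distinguish the two cases described above (`null` versus a changes object) roughly as below. The object format is not reproduced in this text, so the shape used here is an assumption: the `streamsAdded` and `streamsRemoved` arrays borrow names from the comments, while the `SyncCatalogChanges` and `ConnectionResponse` interfaces and the `describeCatalogChanges` helper are hypothetical.

```typescript
// Assumed shape; the real format of syncCatalogChanges may differ.
interface SyncCatalogChanges {
  streamsAdded: string[];
  streamsRemoved: string[];
}

// Hypothetical subset of the web_backend/connections/get response.
interface ConnectionResponse {
  syncCatalog: unknown;
  syncCatalogChanges: SyncCatalogChanges | null;
}

function describeCatalogChanges(response: ConnectionResponse): string {
  const changes = response.syncCatalogChanges;
  // null means there was no previous catalog, e.g. a connection being created.
  if (changes === null) {
    return "No previous catalog to compare against.";
  }
  const { streamsAdded, streamsRemoved } = changes;
  if (streamsAdded.length === 0 && streamsRemoved.length === 0) {
    return "No stream changes detected.";
  }
  return `${streamsAdded.length} stream(s) added, ${streamsRemoved.length} stream(s) removed.`;
}

console.log(
  describeCatalogChanges({
    syncCatalog: {},
    syncCatalogChanges: { streamsAdded: ["customers"], streamsRemoved: ["users"] },
  }),
);
```

Treating `null` as its own case keeps "nothing to compare" distinct from "compared and found no changes", which the UI may want to present differently.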