Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(deployments datasource): get by name #343

Merged
merged 4 commits into from
Dec 20, 2024
Merged

Conversation

mitchnielsen
Copy link
Contributor

Summary

For the Deployment datasource, we were getting it by name by using the
/filter endpoint. But we weren't passing any filter body in the request,
so it was always getting all Deployments back.

This change uses the API endpoint to get a deployment by name, which now
requires a flow name as well.

Related to #330

func (c *DeploymentsClient) Get(ctx context.Context, deploymentID uuid.UUID) (*api.Deployment, error) {
// GetByName returns details for a Deployment by name.
func (c *DeploymentsClient) GetByName(ctx context.Context, flowName, deploymentName string) (*api.Deployment, error) {
url := fmt.Sprintf("%s/name/%s/%s", c.routePrefix, flowName, deploymentName)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It surprised me that the API requires the flow_name to get a deployment by name. I started by adding that field as an optional attribute on the datasource specifically, and then validate that either id or name and flow_name are set.

Alternatives considered so far:

  • Going back to List/Filter, and passing the filter payload: the filter payload is pretty complex, and I'm not keen to model the payload and response in Terraform tbh
  • Going back to List/Filter, and getting all Deployments then finding the one with a matching name: this doesn't quite feel right given the potential size of the response, and I have this feeling that flow_name is required for a reason - maybe Deployments can have the same name if they're associated with different flows?
  • Automatically getting the flow_name for the user based on the flow_id: we already have the flow_id set on the Deployment resource, so this would be easier for users in some cases (such as when they don't define the flow in Terraform and therefore don't have access to prefect_flow.example.name). This is a fairly simple API call, and is probably my preference out of these options so far, but still doesn't quite feel optimal.

Open to any opinions here

Copy link
Contributor

@parkedwards parkedwards Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the option we choose will depend on what interface we're choosing to expose to the user, which is coupled to what we think they're going to use the datasource for:

  • search a single deployment by name => return a single Deployment. since deployment names aren't unique + need an extra piece of info to triangulate (the flow itself), we could either implicitly look up the flow as you suggested OR expect the user to include that flow identifier in the datasource schema (so they'd put in a deployment name + flow name) and pass it into the filter call (see next bullet about the Deployment filter object)
  • search all deployments with a name => return a list of Deployments. use the /filter endpoint, and model out the payload. The Deployment filter schema is pretty big and includes a bunch of searchable options (work pool, flow name, etc.) so I'd advocate for investing in defining this. We don't have to model the entire filter payload to start, it could look something like this
type DeploymentFilter struct {
	Deployments struct {
		Name struct {
			Any []string `json:"any_"`
		} `json:"name"`
	} `json:"deployments"`
}

# usage:
filter := DeploymentFilter{}
filter.Deployments.Name.Any = []string{"deployment-1", "deployment-2"}

I don't think the "find all deployments and do the filtering at runtime" is better than either of those options, just given that we have a filtering endpoint that pretty much does this if we're able to represent the filtering payload

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great points - I could see that advanced datasource logic being really helpful. I created #344 to track that. I think for now it makes sense to proceed with the current PR since it addresses the bug based on how we currently say this datasource should work. That's just my 2 cents though, happy to pivot

Comment on lines +63 to +64
// Get returns details for a Deployment by ID.
func (c *DeploymentsClient) Get(ctx context.Context, deploymentID uuid.UUID) (*api.Deployment, error) {
Copy link
Contributor Author

@mitchnielsen mitchnielsen Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff shows things in kind of a weird way. I technically replaced List with GetByName.

So to clarify:

  • Get is unchanged
  • GetByName is added
  • List is removed

For the Deployment datasource, we were getting it by name by using the
/filter endpoint. But we weren't passing any filter body in the request,
so it was always getting all Deployments back.

This change uses the API endpoint to get a deployment by name, which now
requires a flow name as well.

Related to #330
Validates that the correct combination of fields is provided to retrieve
the datasource correctly.
@mitchnielsen mitchnielsen marked this pull request as ready for review December 19, 2024 20:27
@mitchnielsen mitchnielsen requested a review from a team as a code owner December 19, 2024 20:27
@@ -44,6 +44,7 @@ data "prefect_deployment" "test_by_id" {

data "prefect_deployment" "test_by_name" {
name = prefect_deployment.test.name
flow_name = prefect_flow.test.name
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this was only passing before because there was only one deployment in the test workspace, meaning the existing logic actually worked. This obviously breaks in the real world when there is more than one deployment (of any name).

@parkedwards parkedwards merged commit 8ad5d78 into main Dec 20, 2024
@parkedwards parkedwards deleted the fix-deployments-by-name branch December 20, 2024 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Fixing a bug docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants