[ISSUE] CreateCluster is missing data_security_mode attribute #225
Comments
@mgyucht could you please look at this issue? Does this mean that this attribute is missing from the OpenAPI spec? It is definitely accepted and used by the actual endpoint.
Agree with adding this to the class (`ClusterCreate`) and the related method (`clusters.create`). Example:

```python
import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import AutoScale, AwsAttributes, AwsAvailability, \
    ClusterSource, DataSecurityMode, RuntimeEngine

w = WorkspaceClient(profile='DEFAULT')

cluster_policies = [pol for pol in w.cluster_policies.list() if pol.name == 'HIPAA_intelli_curvgh']
cluster_policy = cluster_policies[0]
spark_version = '13.2.x-cpu-ml-scala2.12'

cluster_info = {
    'spark_version': spark_version,
    'autoscale': AutoScale(min_workers=2, max_workers=8),
    'autotermination_minutes': 30,
    'aws_attributes': AwsAttributes(
        availability=AwsAvailability('SPOT_WITH_FALLBACK'),
        ebs_volume_count=0,
        first_on_demand=1,
        instance_profile_arn='<add_arn_role>',
        spot_bid_price_percent=100,
        zone_id='auto'
    ),
    'cluster_name': 'Nick Cluster Copy',
    'cluster_source': ClusterSource('API'),
    'data_security_mode': DataSecurityMode('SINGLE_USER'),
    'driver_node_type_id': 'i3en.2xlarge',
    'enable_elastic_disk': True,
    'enable_local_disk_encryption': False,
    'enable_unity_catalog': True,
    'node_type_id': 'i3en.2xlarge',
    'policy_id': cluster_policy.policy_id,
    'runtime_engine': RuntimeEngine('STANDARD'),
    'single_user_name': '<add_your_user_name>',
    'spark_conf': {'spark.databricks.service.port': '8787', 'spark.databricks.service.server.enabled': 'true'},
    'spark_env_vars': None,
    'ssh_public_keys': None
}

resp = w.clusters.create(**cluster_info)

# wait until the cluster is running
while w.clusters.get(resp.response.cluster_id).state.name == 'PENDING':
    time.sleep(60)

w.clusters.edit(cluster_id=resp.response.cluster_id, **cluster_info)
```
@narquette
This is true, but it is also kind of weird behaviour from Databricks, imo. It isn't clear that creating a cluster should also start it. Once #227 is merged, I'd argue that the obvious (though, as you point out, unnecessary) code would be:

```python
resp = w.clusters.create(**cluster_info)
w.clusters.ensure_cluster_is_running(resp.response.cluster_id)
```

But Databricks has a lot of unintuitive behaviour 🤷
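Under the hood, a call like `ensure_cluster_is_running` amounts to the same poll loop shown in the snippet above. A minimal, SDK-free sketch of that pattern, where `get_state` is a hypothetical stand-in for something like `w.clusters.get(cluster_id).state.name`:

```python
import time

def wait_until_running(get_state, poll_seconds=60, timeout_seconds=1800):
    """Poll get_state() until it reports 'RUNNING' or the timeout expires.

    get_state is a hypothetical stand-in for a call such as
    w.clusters.get(cluster_id).state.name in the real SDK.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        state = get_state()
        if state == 'RUNNING':
            return
        if state in ('TERMINATED', 'ERROR'):
            # fail fast instead of polling a dead cluster
            raise RuntimeError(f'cluster entered terminal state {state}')
        time.sleep(poll_seconds)
    raise TimeoutError('cluster did not reach RUNNING before the timeout')
```

Compared to the bare `while ... PENDING` loop in the earlier comment, this version also bails out on terminal states and enforces a timeout, which is usually what you want in automation.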
@judahrand it's starting a cluster, yes. We will need to make that clear in the documentation.
The SDK docs will get improved over time. Please keep an eye on them :)
More importantly, is this issue likely to be fixed any time soon? It isn't one that the community can help with, since the OpenAPI spec isn't publicly available (I'm still somewhat unclear as to why).
Hi @judahrand, sorry I missed your tag. In the meantime, this field was added to the OpenAPI spec. It is included in the latest release of the SDK: https://github.com/databricks/databricks-sdk-py/blob/main/databricks/sdk/service/compute.py#L4090. As for the OpenAPI spec, we will eventually make the spec public, but we have not prioritized it yet. We understand that your ability to contribute to the SDK is very limited without the spec. For now, we've primarily focused on improving the SDK development cycle for internal contributors, but over time we expect that others will be able to contribute. Thank you for your understanding.
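As an aside, the reason the constructor-style call `DataSecurityMode('SINGLE_USER')` in the earlier example works is that the SDK's mode classes are standard Python `Enum`s, which support lookup by value. A minimal stand-in sketch (the class below is an illustration only, not the real SDK class, which defines additional modes):

```python
from enum import Enum

# Illustrative stand-in for databricks.sdk.service.compute.DataSecurityMode;
# the real enum in the SDK defines more members than shown here.
class DataSecurityMode(Enum):
    NONE = 'NONE'
    SINGLE_USER = 'SINGLE_USER'
    USER_ISOLATION = 'USER_ISOLATION'

# Enum supports lookup by value, so the string-style call resolves to a member.
mode = DataSecurityMode('SINGLE_USER')
print(mode.value)  # SINGLE_USER
```

Passing either the enum member or constructing it from the string therefore ends up with the same object.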
Description

We have policies in place which require `data_security_mode` to be set when creating a cluster. Because this attribute is missing, we cannot create clusters with the SDK.

Expected behavior

One should be able to set `data_security_mode` when calling `ClustersAPI.create`.

Debug Logs

The SDK logs helpful debugging information when debug logging is enabled. Set the log level to debug by adding `logging.basicConfig(level=logging.DEBUG)` to your program, and include the logs here.

Other Information

Additional context

Add any other context about the problem here.