[SPARK-49387][PYTHON] Fix type hint for accuracy in percentile_approx and approx_percentile #47869

Closed


zhengruifeng
Contributor

What changes were proposed in this pull request?

Fix type hint for accuracy in percentile_approx and approx_percentile

Why are the changes needed?

A float accuracy is not supported; the third parameter must be an integral type:

In [9]: df.select(approx_percentile("value", [0.25, 0.5, 0.75], 1.1).alias("quantiles")).show()

...


AnalysisException: [DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve "approx_percentile(value, array(0.25, 0.5, 0.75), 1.1)" due to data type mismatch: The third parameter requires the "INTEGRAL" type, however "1.1" has the type "DOUBLE". SQLSTATE: 42K09;
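To illustrate the constraint reported in the error above, here is a minimal sketch of a client-side check. `validate_accuracy` is a hypothetical helper, not part of PySpark or this PR; it only mirrors the rule that a literal accuracy must be an integer, and it does not cover the case where accuracy is passed as a Column.

```python
from typing import Union


# Hypothetical helper (not a PySpark API): rejects non-integer accuracy values
# before they reach Spark, where they would fail with
# DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE as shown above.
def validate_accuracy(accuracy: Union[int, float]) -> int:
    """Return `accuracy` unchanged if it is an int, otherwise raise TypeError."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(accuracy, bool) or not isinstance(accuracy, int):
        raise TypeError(
            f"accuracy must be an int, got {type(accuracy).__name__}: {accuracy!r}"
        )
    return accuracy


# Intended use, assuming an existing DataFrame `df` with a "value" column:
#   from pyspark.sql.functions import approx_percentile
#   df.select(
#       approx_percentile("value", [0.25, 0.5, 0.75], validate_accuracy(10000))
#   ).show()
```

With this check, a value such as 1.1 fails immediately in Python with a TypeError rather than during Spark's analysis phase.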

Does this PR introduce any user-facing change?

Yes, a minor doc change.

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

No

@HyukjinKwon
Member

Merged to master.

@zhengruifeng zhengruifeng deleted the py_approx_percentile_acc branch August 26, 2024 06:47
IvanK-db pushed a commit to IvanK-db/spark that referenced this pull request Sep 20, 2024
…rox` and `approx_percentile`

### What changes were proposed in this pull request?
Fix type hint for `accuracy` in `percentile_approx` and `approx_percentile`

### Why are the changes needed?
float `accuracy` is not supported:
```
In [9]: df.select(approx_percentile("value", [0.25, 0.5, 0.75], 1.1).alias("quantiles")).show()

...

AnalysisException: [DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve "approx_percentile(value, array(0.25, 0.5, 0.75), 1.1)" due to data type mismatch: The third parameter requires the "INTEGRAL" type, however "1.1" has the type "DOUBLE". SQLSTATE: 42K09;
```

### Does this PR introduce _any_ user-facing change?
yes, minor doc change

### How was this patch tested?
CI

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#47869 from zhengruifeng/py_approx_percentile_acc.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024