Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-33999: [Go] Removed cast from byte[] to string and copied entire string when Value() is called #34450

Closed
wants to merge 1 commit into from

Conversation

rtadepalli
Copy link
Contributor

@rtadepalli rtadepalli commented Mar 4, 2023

Rationale for this change

Remove some unsafe casts and copy a string instead of returning a reference to an underlying array.

What changes are included in this PR?

Remove unsafe.Pointer and add a string deep copy.

Are these changes tested?

I think existing tests should suffice.

Are there any user-facing changes?

There are no user facing/breaking changes.

@rtadepalli rtadepalli requested a review from zeroshade as a code owner March 4, 2023 01:42
@rtadepalli rtadepalli changed the title GH-33999: [Go] Removed cast from byte[] to string and copied entire string when is called GH-33999: [Go] Removed cast from byte[] to string and copied entire string when Value() is called Mar 4, 2023
@github-actions github-actions bot added the awaiting review Awaiting review label Mar 4, 2023
@github-actions
Copy link

github-actions bot commented Mar 4, 2023

@github-actions
Copy link

github-actions bot commented Mar 4, 2023

⚠️ GitHub issue apache/arrow-go#68 has been automatically assigned in GitHub to PR creator.

Comment on lines +53 to +57
sb := strings.Builder{}

i = i + a.array.data.offset
return a.values[a.offsets[i]:a.offsets[i+1]]
sb.WriteString(a.values[a.offsets[i]:a.offsets[i+1]])
return sb.String()
Copy link
Member

@zeroshade zeroshade Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to use the string builder here, you can just do string([]byte(a.values[a.offsets[i]:a.offsets[i+1]])[:]) or something like:

val := a.values[a.offsets[i]:a.offsets[i+1]]
out := make([]byte, len(val))
copy(out, val)
return *(*string)(unsafe.Pointer(&out))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we'd still maintain a method that would let someone retrieve the slice referencing the data to allow consumers to avoid the copy if desired (such as the GetView in the C++ lib)

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Mar 6, 2023
Comment on lines 250 to +252
if vdata := data.buffers[2]; vdata != nil {
b := vdata.Bytes()
a.values = *(*string)(unsafe.Pointer(&b))
a.values = string(b[:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't copy the entire buffer here. Keep the unsafe.Pointer here so we only perform copies in the Value method and only on the values that are requested.

@amol-
Copy link
Member

amol- commented Mar 30, 2023

Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍

@amol- amol- closed this Mar 30, 2023
Copy link

⚠️ GitHub issue #33999 has no components, please add labels for components.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting committer review Awaiting committer review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Go] array.String.Value is unsafe
3 participants