Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union type decorator leaking into CSV output #4207

Closed
philrz opened this issue Nov 14, 2022 · 1 comment · Fixed by #4338
Closed

Union type decorator leaking into CSV output #4207

philrz opened this issue Nov 14, 2022 · 1 comment · Fixed by #4338
Assignees
Labels
bug Something isn't working

Comments

@philrz
Copy link
Contributor

philrz commented Nov 14, 2022

Current repro is with Zed commit 375f11d.

tl;dr - The following input data has two different types for the same field foo. When fused and output as CSV, there's an unexpected appearance of the type decorator for the created union.

$ zq -version
Version: v1.2.0-126-g375f11db

$ echo '{"foo": 1} {"foo": "bar"}' | zq -f csv 'fuse' -
foo
"1((int64,string))"
"""bar""((int64,string))"

Here's a detailed history of how we got to this spot...

Up to a very old commit 7d8f1e5 (from July, 2021), fusing and outputting this as CSV resulted in an additional column with a _2 appended to its name to represent data with the second data type encountered for foo.

$ zq -version
Version: v0.29.0-435-g7d8f1e55

$ echo '{"foo": 1} {"foo": "bar"}' | zq -f csv 'fuse' -
foo,foo_2
1,
,bar

Then starting with the following commit 3b0f493 (associated with #2885), the output changed to the strange-looking:

$ zq -version
Version: v0.29.0-436-g3b0f4936

$ echo '{"foo": 1} {"foo": "bar"}' | zq -f csv 'fuse' -
foo
0:1
1:bar

It looks like those 0 and 1 were references to numbered type definitions that are leaking out.

$ echo '{"foo": 1} {"foo": "bar"}' | zq -z 'fuse' -
{foo:1 (0=((int64,string)))} (=1)
{foo:"bar"} (1)

Then it changed again at commit 547a203 (associated with #3129) when the leaking became more obvious.

$ zq -version
Version: v0.30.0-56-g547a2032

$ echo '{"foo": 1} {"foo": "bar"}' | zq -f csv 'fuse' -
foo
"1 (int64) (int64,string)"
"""bar"" (string) (int64,string)"

Finally, it changed one more time at commit 942500c (associated with #3487), which is the output that's still with us today.

$ zq -version
Version: v0.33.0-113-g942500cf

$ echo '{"foo": 1} {"foo": "bar"}' | zq -f csv 'fuse' -
foo
"1((int64,string))"
"""bar""((int64,string))"
@philrz philrz added the bug Something isn't working label Nov 14, 2022
nwt added a commit that referenced this issue Jan 26, 2023
To format a Zed union, zio/csvio.Writer.Write calls through formatValue
to zson.String, producing a string that includes an unwanted ZSON union
type decorator.  Remove that type decorator by calling zed.Value.Under
in Write.

Closes #4207.
@nwt nwt closed this as completed in #4338 Jan 27, 2023
nwt added a commit that referenced this issue Jan 27, 2023
To format a Zed union, zio/csvio.Writer.Write calls through formatValue
to zson.String, producing a string that includes an unwanted ZSON union
type decorator.  Remove that type decorator by calling zed.Value.Under
in Write.

Closes #4207.
@philrz
Copy link
Contributor Author

philrz commented Jan 27, 2023

Verified in Zed commit b39b4bd.

We now find the union type decorator is no longer leaked into the output.

$ zq -version
Version: v1.4.0-33-gb39b4bdb

$ echo '{"foo": 1} {"foo": "bar"}' | zq -f csv 'fuse' -
foo
1
bar

Alas, while this covers a case involving the union of two primitive types, I've now noticed a different kind of CSV output problem related to the union of a primitive and complex type. I've opened a new issue #4342 to track that.

Thanks @nwt!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants