to_json/to_jsonl does not have encoding options when exporting file? #70
Thanks @chslink for the bug report! I'll look into it; if you have a test case that reproduces the bug, that would be helpful. |
That's strange, maybe I found another bug... My test environment:
My test code:
testlist = [
{"id": 1, "key": u"some unicode 哈哈哈 1"},
{"id": 2, "key": u"some unicode 哈哈哈 2"}
]
data = seq(testlist)
print data.find(lambda x: x['id'] == 1)
print data.drop_while(lambda x: ['id'] == 1)
data.to_json("r.json")
data.to_jsonl("r.jsonl")
Output:
I expected both files to be encoded as UTF-8, but the r.jsonl file was encoded with my system default charset. |
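The encoding difference reported above can be illustrated without the library. The following is a minimal Python 3 sketch, not the library's actual implementation: `write_jsonl` is a hypothetical helper, and the temporary file path is an assumption made for the example. It passes an explicit `encoding` to `open()` so the output does not depend on the system default charset, then reads the raw bytes back to check that they decode as UTF-8.

```python
import json
import os
import tempfile


def write_jsonl(path, records, encoding="utf-8"):
    """Hypothetical helper: write one JSON object per line with an explicit encoding."""
    # Without the `encoding` argument, open() falls back to a platform-dependent
    # default, which is how a file can end up in the system charset.
    with open(path, "w", encoding=encoding) as handle:
        for record in records:
            # ensure_ascii=False keeps non-ASCII characters as-is instead of
            # escaping them as \uXXXX, so the file encoding actually matters.
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


records = [
    {"id": 1, "key": "some unicode 哈哈哈 1"},
    {"id": 2, "key": "some unicode 哈哈哈 2"},
]
path = os.path.join(tempfile.mkdtemp(), "r.jsonl")
write_jsonl(path, records)

# Verify: read the raw bytes and decode them strictly as UTF-8.
with open(path, "rb") as handle:
    raw = handle.read()
decoded = raw.decode("utf-8")
round_tripped = [json.loads(line) for line in decoded.splitlines()]
```

If the file had been written in a different charset, the strict `decode("utf-8")` call would most likely raise `UnicodeDecodeError` for this data, which makes it a simple check to run locally.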
In your code, you wrote
I took a closer look at the code, and your bug report is correct. Currently, the default
My proposal to fix this issue would be:
The outcome is that |
Sorry, I was too careless...
data = seq(1,2,3,4)
print data.drop(1)
print data
Output:
Obviously, the element isn't really dropped from the sequence... |
The behavior you describe is actually exactly the intended behavior. Once a
It would be better to think about the function calls as a declarative sequence of transformations rather than procedural directives. For example, I would interpret your example as:
This can be seen a little more clearly by doing this:
print(data.drop(1)._lineage)
# Lineage: sequence -> drop(1)
print(data._lineage)
# Lineage: sequence
In other words, transformations never modify a sequence in place; instead, they describe a pipeline of transformations to apply to the original sequence.
On the bug, I just pushed a fix in 4a3735b.
UPDATE: I decided to remove the encoding parameter/option. In Python 3, JSON is loaded and dumped as unicode, so it makes sense to stay consistent with that (Python 2 has the encoding option, but it is removed/deprecated in Python 3). Here is the code output now:
It looks like it works, but could you verify on your end to be sure? The best way to do this would be to do one of these inside a
|
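The non-mutating behavior of drop() described above can be sketched with a small stand-in class. This is only an illustration under stated assumptions: `LazySeq` and its `to_list` method are hypothetical names invented for this example and are not the library's real implementation; only the `_lineage` attribute name is borrowed from the comment above.

```python
class LazySeq:
    """Hypothetical, minimal stand-in for a lazily transformed sequence."""

    def __init__(self, source, lineage=("sequence",), transforms=()):
        self._source = source
        self._lineage = tuple(lineage)
        self._transforms = tuple(transforms)

    def drop(self, n):
        # Return a NEW object describing "source, then drop n"; self is untouched.
        return LazySeq(
            self._source,
            self._lineage + ("drop({})".format(n),),
            self._transforms + (lambda items: items[n:],),
        )

    def to_list(self):
        # Apply the recorded transformations only when a result is requested.
        items = list(self._source)
        for transform in self._transforms:
            items = transform(items)
        return items


data = LazySeq([1, 2, 3, 4])
dropped = data.drop(1)

print(" -> ".join(dropped._lineage))  # sequence -> drop(1)
print(dropped.to_list())              # [2, 3, 4]
print(data.to_list())                 # [1, 2, 3, 4]
```

To keep the result of a transformation, bind it to a name (as with `dropped` here) rather than expecting the original object to change.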
I understand your explanation of the drop() behavior now. |
Sounds like everything is good, so I'm resolving this issue. Feel free to open another issue if you find another bug. Thanks for the bug report! |
When using to_json/to_jsonl to export files, I found that the methods have no encoding option. I read the code, and both functions use json.dump with the utf-8 charset.
And thanks for your awesome work, I really love this functional toolkit.