Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Migrate serialization format to JSON #79

Merged
merged 2 commits into from
Oct 25, 2017
Merged

RFC: Migrate serialization format to JSON #79

merged 2 commits into from
Oct 25, 2017

Conversation

ararslan
Copy link
Member

@ararslan ararslan commented Oct 20, 2017

This is probably hot garbage at the moment, but I figured I'd put it up to get feedback while I continue to improve it.

Also note that the dependency on JSON.jl is likely temporary, until I can implement a mini JSON parser in this package.

Fixes #15

@ararslan ararslan requested a review from jrevels October 20, 2017 00:12
Copy link
Member

@jrevels jrevels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to finally get rid of the JLD dependency!

"Only JSON deserialization is supported. Use `load(\"$noext.json\", args...)`."
end
throw(ArgumentError(msg))
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to break out this duplicated assert into its own function (either rewriting the message to be the same for load/save or just pass a flag giving the context).

return nothing
parsed = open(JSON.parse, filename, "r")
if !isa(parsed, Vector) || length(parsed) != 2 || !isa(parsed[1], Dict) || !isa(parsed[2], Vector)
error("Unexpected JSON format. Was this file produced by BenchmarkTools?")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"produced by" --> "originally written by" would be clearer IMO


const VERSIONS = Dict("Julia" => string(VERSION), "BenchmarkTools" => string(BENCHMARKTOOLS_VERSION))
# TODO: Add any new types as they're added
const supported_types = [Benchmark, BenchmarkGroup, Parameters, TagFilter, Trial,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change this to SUPPORTED_TYPES?

isfile(tmp) && rm(tmp)
open(tmp, "w") do f
print(f, """
{"never":1,[{"gonna":2,"give":3,"you":4,"up":5}]}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consult the manual before adding a test like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!!!!!!!!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right Jarrett, my apologies. We're no strangers to unit testing; you know the rules and so do I.

# JLD.translate("BenchmarkTools.Trial", "BenchmarkTools.Trial")
# end
# return result
#end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure you'd do this eventually anyway, but just to be explicit, we should definitely delete this commented out stuff before merging...what a joy it will be to be rid of it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah definitely

@ararslan
Copy link
Member Author

ararslan commented Oct 20, 2017

I'm not sure what to make of that error message. I'm not seeing it locally. :/

Edit: I would have been, I had been checking on 0.6.

@ararslan
Copy link
Member Author

TESTS ARE PASSING ON 0.6

@ararslan
Copy link
Member Author

ararslan commented Oct 23, 2017

I've verified locally that the new serialization tests pass on 0.7. There are other issues that are unrelated to this PR, likely due to the 0.7 macro hygiene chaos, which are keeping the full test suite from passing.

length(x) == 2 || throw(ArgumentError("Expecting a vector of length 2"))
typename = x[1]::String
fields = x[2]::Dict
T = eval(parse(typename))::Type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a bit scary to call on something that comes from a json file? Should there be a whitelist of types and just switch on that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I suppose so. The hope is that people will only read in JSON files that were produced by this package, but I suppose it's pretty easy to falsify that if you know the expected format.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently Jarrett is unfazed by this.

@ararslan
Copy link
Member Author

I tried adding this to address @KristofferC's comment but for some reason it fails with an UndefVarError for NefariousCode rather than the expected failure...

diff --git a/src/serialization.jl b/src/serialization.jl
index 52715ed..ba3e9e1 100644
--- a/src/serialization.jl
+++ b/src/serialization.jl
@@ -22,6 +22,10 @@ function recover(x::Vector)
     length(x) == 2 || throw(ArgumentError("Expecting a vector of length 2"))
     typename = x[1]::String
     fields = x[2]::Dict
+    if !any(t->ismatch(Regex(string('^', t, "({[^}]+)?\$")), typename), SUPPORTED_TYPES)
+        throw(ArgumentError("Cannot deserialize type '$typename'; only BenchmarkTools " *
+                            "types can be deserialized"))
+    end
     T = eval(parse(typename))::Type
     fc = fieldcount(T)
     xs = Vector{Any}(fc)
diff --git a/test/SerializationTests.jl b/test/SerializationTests.jl
index 7a51ae7..54f1f8f 100644
--- a/test/SerializationTests.jl
+++ b/test/SerializationTests.jl
@@ -78,6 +78,22 @@ end
         end
     end

+    withtempdir() do
+        tmp = joinpath(pwd(), "tmp.json")
+        open(tmp, "w") do f
+            print(f, """
+            [{"Julia":"1.2.3","BenchmarkTools":"3.2.1"},[["NefariousCode",{}]]]
+            """)
+        end
+        try
+            BenchmarkTools.load(tmp)
+            error("aaahhh")
+        catch err
+            @test err isa ArgumentError
+            @test contains(err.msg, "Cannot deserialize type 'NefariousCode'")
+        end
+    end
+
     @test_throws ArgumentError BenchmarkTools.recover([1])
 end

@ararslan ararslan changed the title WIP: Migrate serialization format to JSON RFC: Migrate serialization format to JSON Oct 24, 2017
@KristofferC
Copy link
Contributor

KristofferC commented Oct 24, 2017

I think you are missing an end brace in your regex, so should be "({[^}]+})?\$"). However, you could just use a type like BenchmarkTools.Benchmark{println(1+1)}".

On the other hand, I think jld files can already execute arbitrary code so I this shouldn't be any worse security than before.

@jrevels
Copy link
Member

jrevels commented Oct 25, 2017

LGTM! Some of these removed lines were a long time coming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants