Clean up and added test cases for `data_objects.data_containers` #1831

git-abhishek · 2018-06-11T21:37:45Z

PR Summary

Removed dead code
Fixed a few issues that were identified by adding more test cases
Increased coverage

PR Checklist

Code passes flake8 checker
New features are documented, with docstrings and narrative docs (N/A)
Adds a test for any bugs fixed. Adds tests for new features.

git-abhishek · 2018-06-11T21:41:56Z

Unable to add @Colin as a reviewer. If someone can add him, that would be great!

colinmarc

I'm missing the context to really review the content changes or the effectiveness of the new tests, but I left a few surface-level nitpicks :)

colinmarc · 2018-06-12T08:51:29Z

yt/data_objects/tests/test_data_containers.py

+        file_row_2 = file.readline()
+        file_row_2 = np.array(file_row_2.split('\t'), dtype=np.float64)
+    sorted_keys = sorted(sp.field_data.keys())
+    _keys = [str(k) for k in sorted_keys]


I don't see the point of signifying these as private with _.

colinmarc · 2018-06-12T08:52:20Z

yt/data_objects/tests/test_data_containers.py

+
+def test_to_dataframe():
+    try:
+        import pandas as pd


Does import pandas as _ make flake8 happy?

Rather than wrapping the test logic in a try/except, please use yt.testing.requires_module, which is a decorator you can apply to a test function to tell nose that the test function requires some optional module (in this case pandas).

You probably also need to install pandas on travis and appveyor to get this test to actually execute.
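For readers unfamiliar with this pattern, here is a rough sketch of how a "skip if the module is missing" decorator can work. It is not yt's actual implementation of yt.testing.requires_module: the name requires_module_sketch and the two decorated functions are hypothetical, and the sketch signals a skip by raising unittest.SkipTest.

```python
import functools
import importlib
import unittest


def requires_module_sketch(module_name):
    """Skip the decorated function when `module_name` cannot be imported.

    Hypothetical stand-in for a decorator like yt.testing.requires_module.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                importlib.import_module(module_name)
            except ImportError:
                # Test runners report SkipTest as a skip, not a failure.
                raise unittest.SkipTest("%s is not installed" % module_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires_module_sketch("json")
def uses_json():
    # json ships with Python, so this body always runs.
    return "ran"


@requires_module_sketch("surely_not_a_real_module_xyz")
def uses_missing_module():
    # Never reached: the import check raises SkipTest first.
    return "ran"
```

With this shape the test body no longer needs its own try/except around the import, which is the point of the review comment.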

colinmarc · 2018-06-12T08:53:27Z

yt/data_objects/tests/test_data_containers.py

+    rho = dd.quantities["WeightedAverageQuantity"]("density",
+                                                   weight="cell_mass")
+    dd.extract_isocontours("density", rho, "triangles.obj", True)
+    dd.calculate_isocontour_flux("density", rho, "x", "y", "z",


This formatting choice seems odd (the complete line is shorter than the lines above). Are you using an auto-formatter like yapf or eyeballing it?

I'd format these two lines like:

    dd.calculate_isocontour_flux(
        "density", rho, "x", "y", "z", "dx")

cphyc · 2018-06-12T12:55:44Z

yt/data_objects/tests/test_data_containers.py

+    ds = fake_particle_ds()
+    sp = ds.sphere(ds.domain_center, 0.25)
+    sp.save_object("my_sphere_1", filename="test_save_obj")
+    obj = shelve.open("test_save_obj", protocol=-1)


This could be wrapped in a with statement to prevent the file handle from being left open if the assertion below fails.

I had done it earlier using a with statement; however, it was failing in Python 2.7 (error: AttributeError: DbfilenameShelf instance has no attribute '__exit__').

OK, fair enough!
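One option that works on both Python 2.7 and Python 3 is contextlib.closing, which calls close() on exit and does not need the shelf to define __exit__. A minimal sketch, assuming a temporary directory and a plain dict standing in for the saved sphere (both are illustrative, not what the PR does):

```python
import contextlib
import os
import shelve
import shutil
import tempfile

# Illustrative location; the PR's test writes "test_save_obj" directly.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "test_save_obj")

# closing() guarantees obj.close() runs even if the body raises.
with contextlib.closing(shelve.open(path, protocol=-1)) as obj:
    obj["my_sphere_1"] = {"radius": 0.25}  # placeholder for the real object

with contextlib.closing(shelve.open(path, protocol=-1)) as obj:
    loaded = obj["my_sphere_1"]

# Remove the temporary files created for this sketch.
shutil.rmtree(tmpdir)
```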

ngoldbaum

On the whole this is good although I've left a number of comments below.

In addition it looks like you added two empty files (I think from your other PR?) that should be removed.

ngoldbaum · 2018-06-12T14:34:55Z

yt/data_objects/construction_data_containers.py

@@ -922,7 +922,8 @@ def __init__(self, *args, **kwargs):
        self._final_start_index = self.global_startindex

    def _setup_data_source(self, level_state = None):
-        if level_state is None: return
+        if level_state is None:
+            return super(YTSmoothedCoveringGrid, self)._setup_data_source()


I'd format this as:

    if level_state is None:
        super(YTSmoothedCoveringGrid, self)._setup_data_source()
        return

_setup_data_source doesn't return anything, and this way of calling it makes it clearer that that's the case.

ngoldbaum · 2018-06-12T14:39:51Z

yt/data_objects/data_containers.py

-            if iterable(width):
-                radius = max(width)
+            if isinstance(width, tuple):
+               radius = width[0]


I don't think this change is correct - the original code is correct. We're checking for iterable(width) because width might be a list or an ndarray or some other kind of iterable. It's easier to just use the iterable function to test for this rather than using isinstance and insisting on only accepting a certain number of types.

Also it's a behavior change to pick width[0] instead of max(width). Can you explain why you did that? Choosing max(width) is arbitrary, but since we already made the decision it's best to not change our minds without a good reason.

I changed it because I was getting an error for this test case:

    def test_to_frb():
        ds = fake_amr_ds(fields=["density", "cell_mass"], geometry="cylindrical",
                         particles=16**3)
        proj = ds.proj("density", weight_field="cell_mass", axis=1,
                       data_source=ds.all_data())
        proj.to_frb((1.0, 'unitary'), 64)

Error:

    line 1668, in to_frb
        radius = max(width)
    TypeError: '>' not supported between instances of 'str' and 'float'

Width is allowed to be a tuple of tuples, like this:

((1.0, 'unitary'), (0.7, 'unitary'))

this allows constructing images where the width doesn't equal the height.

For the cylindrical FRB the width really represents a radius, and what happens right now is it assumes if you're passing a tuple you're passing two floats and it assumes the radius you really mean is the maximum of the two floats you passed in.

That's not great, so maybe it's ok to change this. Rather than what you've done, though, I'd suggest that if we're passed a tuple, we check whether it's a valid (length, unit) tuple, and if not raise an error. That is, for a cylindrical dataset, it only makes sense to pass in one width to interpret as the radius, so if someone passes in more than one, we should raise an error.

Basically the code here:

yt/yt/data_objects/data_containers.py

Lines 1691 to 1696 in cc299b0

    if iterable(width):
        w, u = width
        if isinstance(w, tuple) and isinstance(u, tuple):
            height = u
            w, u = w
        width = self.ds.quan(w, input_units = u)

But in the if statement raise an error instead of interpreting the tuple as a width and height.
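The suggested check can be sketched in isolation as follows. This is only an illustration of the idea, not the code that was merged: parse_cylindrical_radius is a hypothetical helper, it returns a plain (value, unit) pair instead of calling self.ds.quan, and the choice of ValueError is an assumption.

```python
import numbers


def parse_cylindrical_radius(width):
    """Return (value, unit) for a single radius spec, or raise ValueError.

    Hypothetical helper: accepts a bare number or one (length, unit) tuple,
    and rejects a tuple of tuples, which would mean separate width and height.
    """
    if isinstance(width, numbers.Real):
        return (float(width), None)
    if isinstance(width, (tuple, list)) and len(width) == 2:
        value, unit = width
        if isinstance(value, numbers.Real) and isinstance(unit, str):
            return (float(value), unit)
    raise ValueError(
        "a cylindrical FRB takes a single radius, e.g. (1.0, 'unitary'); "
        "got %r" % (width,))


radius = parse_cylindrical_radius((1.0, 'unitary'))
```

Passing ((1.0, 'unitary'), (0.7, 'unitary')) to this helper raises ValueError instead of silently picking one of the two entries.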

ngoldbaum · 2018-06-12T14:41:27Z

yt/data_objects/tests/test_data_containers.py

+    with assert_raises(RuntimeError) as err:
+        YTDataContainer(None, None)
+    desired = 'Error: ds must be set either through class type or parameter' \
+              ' to the constructor'


style nit: If you put the string inside parentheses then you don't need to use the line continuation
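Adjacent string literals inside parentheses are joined by Python itself, so the backslash can be dropped. Applied to the string from the hunk above:

```python
# Parentheses allow the literal to span lines without a backslash.
desired = ('Error: ds must be set either through class type or parameter'
           ' to the constructor')
```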

ngoldbaum · 2018-06-12T14:42:30Z

yt/data_objects/tests/test_data_containers.py

+    assert_equal('pz' in proj.keys(), False)
+
+    # Delete the key and check if exits
+    proj.__delitem__('px')


Rather than calling __delitem__ directly I'd format this (and similarly below) as del proj['px']
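The del statement dispatches to __delitem__, so both spellings exercise the same method. A small self-contained sketch, where FieldStore is a hypothetical stand-in for the projection object in the test:

```python
class FieldStore(object):
    """Hypothetical container that records which keys were deleted."""

    def __init__(self):
        self.field_data = {"px": 1, "py": 2}
        self.deleted = []

    def __delitem__(self, key):
        # Called by `del store[key]`.
        self.deleted.append(key)
        del self.field_data[key]

    def keys(self):
        return self.field_data.keys()


store = FieldStore()
del store["px"]  # invokes FieldStore.__delitem__("px")
```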

ngoldbaum · 2018-06-12T14:43:32Z

yt/data_objects/tests/test_data_containers.py

+    assert_equal(str(ex.exception), desired)
+
+    # Test the convert method
+    assert_equal(proj.convert('HydroMethod'), -1)


This is actually a relic of the old pre yt-3.0 API that should have been removed a long time ago. Can you remove this test and the function calling it?

ngoldbaum · 2018-06-12T14:47:50Z

yt/data_objects/tests/test_data_containers.py

+
+def test_to_dataframe():
+    try:
+        import pandas as pd


Rather than wrapping the test logic in a try/except, please use yt.testing.requires_module, which is a decorator you can apply to a test function to tell nose that the test function requires some optional module (in this case pandas).

You probably also need to install pandas on travis and appveyor to get this test to actually execute.

ngoldbaum · 2018-06-12T14:49:50Z

yt/data_objects/tests/test_data_containers.py

+                     particles=16**3)
+    proj = ds.proj("density", weight_field="cell_mass", axis=1,
+                   data_source=ds.all_data())
+    proj.to_frb((1.0, 'unitary'), 64)


Can you test that the FixedResolutionBuffer object you get back is correct (e.g. that it produces images with a resolution of (64, 64) and that the image covers the full domain)? For the latter you can just inspect the state of the FixedResolutionBuffer object (e.g. by checking frb.bounds) rather than trying to make sure the image is correct.

In the case of CylindricalFixedResolutionBuffer, the object does not have a bounds field; instead there is buff_size.

ngoldbaum · 2018-06-12T14:50:35Z

yt/data_objects/tests/test_data_containers.py

+    rho = dd.quantities["WeightedAverageQuantity"]("density",
+                                                   weight="cell_mass")
+    dd.extract_isocontours("density", rho, "triangles.obj", True)
+    dd.calculate_isocontour_flux("density", rho, "x", "y", "z",


I'd format these two lines like:

    dd.calculate_isocontour_flux(
        "density", rho, "x", "y", "z", "dx")

ngoldbaum · 2018-06-12T14:52:03Z

yt/data_objects/tests/test_data_containers.py

+    ds = fake_amr_ds(fields=["density", "cell_mass"], particles=16**3)
+    dd = ds.all_data()
+    rho = dd.quantities["WeightedAverageQuantity"]("density",
+                                                   weight="cell_mass")


I'd format these lines like:

    q = dd.quantities["WeightedAverageQuantity"]
    rho = q("density", weight="cell_mass")

Often when you find yourself hitting a line length limit you can get around it by defining an intermediate variable. This has the side benefit of often making the code more understandable.

ngoldbaum · 2018-06-12T14:53:25Z

yt/data_objects/tests/test_data_containers.py

+    min_val, max_val = data_source[field].min() / 2, data_source[field].max() / 2
+
+    data_source.extract_connected_sets(field, 3, min_val, max_val,
+                                              log_space=True, cumulative=True)


Can you test that the data returned by this function are correct? Also, the formatting here is a little odd: the continuation line should be aligned with field, not 3.

Removing this test as it is already present in test_connected_sets.py.

git-abhishek · 2018-06-12T20:30:58Z

yt/data_objects/construction_data_containers.py

@@ -922,7 +922,8 @@ def __init__(self, *args, **kwargs):
        self._final_start_index = self.global_startindex

    def _setup_data_source(self, level_state = None):
-        if level_state is None: return
+        if level_state is None:
+            return super(YTSmoothedCoveringGrid, self)._setup_data_source()


git-abhishek · 2018-06-12T20:36:33Z

yt/data_objects/tests/test_data_containers.py

+    with assert_raises(RuntimeError) as err:
+        YTDataContainer(None, None)
+    desired = 'Error: ds must be set either through class type or parameter' \
+              ' to the constructor'


git-abhishek · 2018-06-12T20:36:39Z

yt/data_objects/tests/test_data_containers.py

+    assert_equal('pz' in proj.keys(), False)
+
+    # Delete the key and check if exits
+    proj.__delitem__('px')


git-abhishek · 2018-06-12T20:50:24Z

yt/data_objects/tests/test_data_containers.py

+    assert_equal(str(ex.exception), desired)
+
+    # Test the convert method
+    assert_equal(proj.convert('HydroMethod'), -1)


Should I delete convert and __delitem__ methods only?

Don't delete __delitem__, that's still needed. I was saying that in the test don't invoke __delitem__ directly, instead use the del statement to invoke it indirectly.

You should delete convert since it's a relic function that doesn't do what it says in its own docstring.

git-abhishek

Updated as per first review comments. Please review again.

git-abhishek · 2018-06-12T23:10:42Z

yt/data_objects/tests/test_data_containers.py

+    assert_equal(str(ex.exception), desired)
+
+    # Test the convert method
+    assert_equal(proj.convert('HydroMethod'), -1)


git-abhishek · 2018-06-12T23:13:24Z

yt/data_objects/tests/test_data_containers.py

+        file_row_2 = file.readline()
+        file_row_2 = np.array(file_row_2.split('\t'), dtype=np.float64)
+    sorted_keys = sorted(sp.field_data.keys())
+    _keys = [str(k) for k in sorted_keys]


git-abhishek · 2018-06-13T04:13:32Z

yt/data_objects/tests/test_data_containers.py

+    ds = fake_particle_ds()
+    sp = ds.sphere(ds.domain_center, 0.25)
+    sp.save_object("my_sphere_1", filename="test_save_obj")
+    obj = shelve.open("test_save_obj", protocol=-1)


I had done it earlier using a with statement; however, it was failing in Python 2.7 (error: AttributeError: DbfilenameShelf instance has no attribute '__exit__').

git-abhishek · 2018-06-13T16:26:14Z

yt/data_objects/tests/test_data_containers.py

+    obj = shelve.open("test_save_obj", protocol=-1)
+    loaded_sphere = obj["my_sphere_1"][1]
+    assert_array_equal(loaded_sphere.center, sp.center)
+    assert_equal(loaded_sphere.radius, sp.radius)


git-abhishek · 2018-06-13T16:32:40Z

yt/data_objects/tests/test_data_containers.py

+
+def test_to_dataframe():
+    try:
+        import pandas as pd


git-abhishek · 2018-06-13T17:04:40Z

yt/data_objects/tests/test_data_containers.py

+    ds = fake_amr_ds(fields=["density", "cell_mass"], particles=16**3)
+    dd = ds.all_data()
+    rho = dd.quantities["WeightedAverageQuantity"]("density",
+                                                   weight="cell_mass")


git-abhishek · 2018-06-13T23:46:01Z

yt/data_objects/tests/test_data_containers.py

+    min_val, max_val = data_source[field].min() / 2, data_source[field].max() / 2
+
+    data_source.extract_connected_sets(field, 3, min_val, max_val,
+                                              log_space=True, cumulative=True)


Removing this test as it is already present in test_connected_sets.py.

git-abhishek · 2018-06-14T00:20:27Z

yt/data_objects/data_containers.py

-            if iterable(width):
-                radius = max(width)
+            if isinstance(width, tuple):
+               radius = width[0]


git-abhishek · 2018-06-14T00:51:32Z

yt/data_objects/tests/test_data_containers.py

+                     particles=16**3)
+    proj = ds.proj("density", weight_field="cell_mass", axis=1,
+                   data_source=ds.all_data())
+    proj.to_frb((1.0, 'unitary'), 64)


In the case of CylindricalFixedResolutionBuffer, the object does not have a bounds field; instead there is buff_size.

git-abhishek · 2018-06-14T04:40:45Z

yt/data_objects/tests/test_data_containers.py

+    filename = "sphere.txt"
+    ds = fake_particle_ds()
+    sp = ds.sphere(ds.domain_center, 0.25)
+    sp.write_out(filename)


ngoldbaum

Just a few minor comments, other than that this looks good to me now.

ngoldbaum · 2018-06-14T16:26:35Z

.travis.yml

@@ -23,6 +23,7 @@ env:
    FLAKE8=flake8
    CODECOV=codecov
    COVERAGE=coverage
+    PANDAS=pandas<0.21


is there some problem with the latest version of pandas?

I was getting the error "Double requirement given: numpy==1.12.1 ... (already in numpy==1.9.3)" and the build was failing on Travis.
Solution: pandas-dev/pandas#20723

Ah, so it's because pandas dropped support for python3.4 - can you only add this version restriction for the python3.4 builder?

To be honest, we should probably drop testing for python3.4, it's going EOL - I'll make a note to bring that up at the team meeting tomorrow.

Our Travis lint stage is executed for python3.4, do you want me to update that too with python3.6?
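The per-interpreter pin discussed in this thread comes down to a small conditional. A sketch in Python, where pandas_requirement is a hypothetical helper and not part of yt's CI scripts:

```python
import sys


def pandas_requirement(version_info=sys.version_info):
    """Return the pip requirement string to use for pandas.

    Hypothetical helper: pins pandas below 0.21 only on Python 3.4,
    as requested in the review, and leaves it unpinned elsewhere.
    """
    if tuple(version_info[:2]) == (3, 4):
        return "pandas<0.21"
    return "pandas"


requirement = pandas_requirement()
```

The same condition can usually be written directly in a requirements file with an environment marker, which avoids the extra script.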

ngoldbaum · 2018-06-14T16:27:09Z

yt/data_objects/data_containers.py

+            elif center.lower() == "min":
+                self.center = self.ds.find_min(("gas", "density"))[1]
+            elif center.startswith("min_"):
+                self.center = self.ds.find_min(center[4:])[1]


good catch!

ngoldbaum · 2018-06-14T16:30:32Z

yt/data_objects/tests/test_data_containers.py

+    for k in loaded_sphere._key_fields:
+        assert_array_equal(loaded_sphere[k], sp[k])
+
+    # Object is saved but retrieval is not working


can you open an issue to track this?

cphyc · 2018-06-16T13:48:13Z

yt/data_objects/data_containers.py

-        field_order += [field for field in fields if field not in field_order]
-        fid = open(filename,"w")
-        fid.write("\t".join(["#"] + field_order + ["\n"]))
+        fid = open(filename, "w")


I'd be happier with a with statement there, to prevent the file handle from being left open.
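A minimal sketch of the with-statement form for the header write shown in the hunk. The field names and the temporary file path are assumptions for illustration; only the fid.write(...) expression is taken from the diff.

```python
import os
import tempfile

field_order = ["x", "y", "z"]  # illustrative field names
filename = os.path.join(tempfile.mkdtemp(), "sphere.txt")

# The file is closed when the block exits, even if write() raises.
with open(filename, "w") as fid:
    fid.write("\t".join(["#"] + field_order + ["\n"]))

with open(filename) as fid:
    header = fid.read()
```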

ngoldbaum · 2018-06-28T22:47:02Z

@yt-fido test this please

git-abhishek added 4 commits June 8, 2018 17:40

Removed dead code, added support to set center at min values, added t…

cc5f39a

…est case

update fiel i/o using context mgr; fixed to_frb

3fca6bc

Fixed YTSmoothedCoveringGrid __init__

ef19719

Added test cases

bf72557

git-abhishek requested review from Xarthisius and ngoldbaum June 11, 2018 21:37

git-abhishek added 8 commits June 11, 2018 17:51

fixed flake8 warnings

680fb92

Merge branch 'master' of https://github.com/yt-project/yt into cc-do-…

78b567a

…containers

Revert the use of context manager due to python 2

218e33c

Fixed Pandas test, sort writing to file

e2b4d70

fixed pandas import flake8 error

1550eab

closing file explicitly due to py2 with context mgr

11f62ca

Removed context mgr usage from test

77c27d5

fixed incompatibility b/w py 2 and 3

9abeca0

colinmarc reviewed Jun 12, 2018

cphyc reviewed Jun 12, 2018

ngoldbaum requested changes Jun 12, 2018

git-abhishek commented Jun 12, 2018

Updated as per first code review comments

70682e5

git-abhishek commented Jun 14, 2018

Resolved pandas dependency issue

cb3bf71

ngoldbaum approved these changes Jun 14, 2018

Updated Lint stage to run in py3.6; pinned pandas for py3.4

4f8fb5d

cphyc reviewed Jun 16, 2018

using with statement for file write ops

d12ca6e

cphyc approved these changes Jun 19, 2018

ngoldbaum approved these changes Jun 19, 2018

Merge branch 'master' into cc-do-containers

0054e09

ngoldbaum merged commit a5db4fc into yt-project:master Jun 29, 2018

git-abhishek deleted the cc-do-containers branch June 29, 2018 22:05
