Inconsistency in common_time with 'bin': are the bins closed or open on the right? #340

Closed
MarcoGorelli opened this issue Feb 22, 2023 · 4 comments


MarcoGorelli commented Feb 22, 2023

Describe the bug
It's not clear what the bins are, nor how values are assigned to them.

To Reproduce

import numpy as np
import pyleoclim as pyleo

ts1 = pyleo.Series(
    time=np.array([1, 2, 3]),
    value=np.array([8, 3, 5]),
    time_unit='yr BP',
)
ts2 = pyleo.Series(
    time=np.array([1, 2, 3]),
    value=np.array([6, 2, 1]),
    time_unit='yr BP',
)
ms = pyleo.MultipleSeries([ts1, ts2])
newms = ms.common_time('bin')
print(newms.series_list)

This gives

{'log': ({0: 'clean_ts', 'applied': True, 'verbose': False},)}
{'log': ({0: 'clean_ts', 'applied': True, 'verbose': False},)}
[None
time [yr BP]
1.5    8.0
2.5    4.0
Name: value, dtype: float64, None
time [yr BP]
1.5    6.0
2.5    1.5
Name: value, dtype: float64]

which I find quite odd. It looks like we have:

  • 1.5 means [1, 2)
  • 2.5 means [2, 3]

In the first case, the bin is closed on the left and open on the right. But in the second case, it is closed on both sides.
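To make the observed convention concrete, here is a minimal pure-Python sketch that reproduces the numbers above, assuming bin edges of [1, 2, 3] and a mean per bin. The function `bin_means_last_closed` is a hypothetical helper written for illustration; it is not pyleoclim's implementation.

```python
def bin_means_last_closed(times, values, edges):
    """Mean value per bin; every bin is [lo, hi) except the last, which is [lo, hi].

    Hypothetical helper for illustration only. Empty bins are omitted.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        if t < edges[0] or t > edges[-1]:
            continue  # point lies outside the overall range
        # Default to the last bin: a point equal to the final edge lands there.
        idx = n_bins - 1
        for i in range(n_bins):
            if edges[i] <= t < edges[i + 1]:
                idx = i
                break
        sums[idx] += v
        counts[idx] += 1
    # Key each non-empty bin by its midpoint, as in the output above.
    return {
        (edges[i] + edges[i + 1]) / 2: sums[i] / counts[i]
        for i in range(n_bins)
        if counts[i]
    }


print(bin_means_last_closed([1, 2, 3], [8, 3, 5], [1, 2, 3]))  # {1.5: 8.0, 2.5: 4.0}
print(bin_means_last_closed([1, 2, 3], [6, 2, 1], [1, 2, 3]))  # {1.5: 6.0, 2.5: 1.5}
```

The point at time 3 sits exactly on the final edge, so it is averaged together with the point at time 2.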

Expected behavior
I'd have expected:

  • 1.5 to mean [1, 2)
  • 2.5 to mean [2, 3)
  • 3.5 to mean [3, 4)
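For comparison, a minimal sketch of the convention I expected, where every bin is half-open and an extra edge at 4 is assumed so that the point at time 3 gets its own bin. `bin_means_half_open` is a hypothetical name used only for this illustration, and the edges are assumed to be sorted.

```python
from bisect import bisect_right


def bin_means_half_open(times, values, edges):
    """Mean value per bin, with every bin [lo, hi), including the last.

    Hypothetical helper for illustration only; `edges` must be sorted.
    Empty bins are omitted.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        idx = bisect_right(edges, t) - 1  # index of the last edge <= t
        if 0 <= idx < n_bins:  # drops points before the first or at/after the last edge
            sums[idx] += v
            counts[idx] += 1
    return {
        (edges[i] + edges[i + 1]) / 2: sums[i] / counts[i]
        for i in range(n_bins)
        if counts[i]
    }


print(bin_means_half_open([1, 2, 3], [8, 3, 5], [1, 2, 3, 4]))
# {1.5: 8.0, 2.5: 3.0, 3.5: 5.0}
```

With edges [1, 2, 3] instead, the same function would drop the point at time 3, since it falls on the open right edge of the last bin.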

Desktop (please complete the following information):

  • OS: Linux
  • Browser: irrelevant
  • Version: development branch, commit e606f8

@alexkjames
Contributor

Hi Marco, currently the bin edges are defined so that all but the last bin are half-open. It is a bit confusing and should be specified more clearly in the docstring. I'm not sure about the solution you propose in #341, though: wouldn't it result in data from outside the range defined by start and stop being included in the binned series? I suppose I'm not totally clear on the advantage of this approach (which may very well be on me!).

@MarcoGorelli
Author

Hey

Hi Marco, currently the bin edges are defined so that all but the last bin are half-open. It is a bit confusing and should be specified more clearly in the docstring.

OK that's fine, maybe all that's needed is to document it then, thanks!

@alexkjames
Contributor

sounds good, will do!

@MarcoGorelli
Author

closed by #347
