Add highlevel.crop_edf function #196

Merged
merged 11 commits into from
Sep 20, 2023

raphaelvallat
Contributor

As discussed in #195, this adds a new function to crop the EDF file.

I'm still working on the unit tests. The following works like a charm:

from pyedflib import highlevel
from datetime import datetime
new_start = datetime(2011, 4, 4, 12, 58, 0)
new_stop = datetime(2011, 4, 4, 13, 0, 0)
highlevel.crop_edf("data/test_generator.edf", new_start=new_start, new_stop=new_stop, verbose=True)

However, when I set both new_start and new_stop to None, I should get True with compare_edf(original, cropped), but instead got the following error:

[image: screenshot of the error, not preserved]

Do you have any idea why that might be? I'm not changing any of the signal headers 🤔

PS: I would recommend using black for code formatting. I usually set a max line length to 100 instead of the default 80 because I find it more readable. If that's something you'd be interested in let me know and I can submit a second PR after this one.

Thanks,
Raphael

@skjerns
Collaborator

skjerns commented Apr 17, 2023

Sorry for the late reply, much to do right now

This might be due to rounding errors introduced by converting back and forth between digital and analog signal space. EDF stores the data in digital space (i.e. amplifier integers) and converts it to floats (i.e. voltage) on output. Depending on the digital range dmin/dmax, there can be loss of information.

Try loading and saving the data with digital=True, the signal should then stay the same
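
To make the rounding issue concrete, here is a minimal sketch in plain Python of a linear digital/physical scaling of the kind described above. It is a simplified model, not pyedflib's actual code: the helper `make_converters` and the ranges used are made up for illustration.

```python
def make_converters(pmin, pmax, dmin, dmax):
    """Return (to_physical, to_digital, gain) for a linear EDF-style scaling."""
    gain = (pmax - pmin) / (dmax - dmin)  # physical units per digital step

    def to_physical(d):
        return pmin + (d - dmin) * gain

    def to_digital(p):
        # Physical values are snapped to the nearest integer step.
        return dmin + int(round((p - pmin) / gain))

    return to_physical, to_digital, gain


# Hypothetical ranges: a 16-bit digital range mapped onto -500..500.
to_physical, to_digital, gain = make_converters(-500.0, 500.0, -32768, 32767)

value = 12.3456  # arbitrary physical value that is not on the digital grid
roundtrip = to_physical(to_digital(value))
error = abs(roundtrip - value)  # nonzero, at most half a digital step

# In this idealized model, integers survive digital -> physical -> digital.
digital_ok = all(
    to_digital(to_physical(d)) == d for d in range(-32768, 32768, 997)
)
```

In this model only the physical-to-digital direction loses information. Real files can behave differently (for example, the EDF header stores the physical range in a fixed number of characters), which is one reason why staying in digital space avoids the question altogether.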

I'm flying to a conference in the US soon, so I will only have time to look at it in May. If you want, you can wait for my further input until then, but knowing that you're a good dev (we'll be citing YASA in our upcoming publication!😊), I assume your code will be good anyway :)

@skjerns
Collaborator

skjerns commented Apr 17, 2023

Using black or any other formatting tool/guideline would be great. Until now, the code base has no enforcement of any contribution guidelines or style guides. However, @holgern is the person that has to decide that in the end.

@raphaelvallat
Contributor Author

Hi @skjerns!

Haha, that's kind of you, but I would not trust myself too much as I've never dived into the pyedflib code before 😅 Anyway, it's pretty straightforward, so hopefully it's not too much work to review.

> Try loading and saving the data with digital=True, the signal should then stay the same

Yep works perfectly, thanks!

I've added unit tests and everything is running smoothly on my machine ✅

pyedflib/highlevel.py (outdated)
Comment on lines 310 to 315
highlevel.crop_edf(
    edf_file, new_file=outfile, new_start=new_start, new_stop=new_stop)
highlevel.crop_edf(
    edf_file, new_file=outfile, new_start=None, new_stop=new_stop, verbose=False)
highlevel.crop_edf(
    edf_file, new_file=outfile, new_start=new_start, new_stop=None)
Collaborator

@skjerns skjerns Sep 7, 2023

Can you add tests here that read the files and check if the data is actually the cropped portion we would expect? The startdatetime and stop are already checked within the function.
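
A check of this kind could look roughly like the sketch below. It uses plain Python lists in place of real EDF data; the helper `expected_crop`, the fake signal, and the choice to treat the stop time as exclusive are all assumptions made for illustration, not what the PR's tests actually do.

```python
def expected_crop(signal, sample_rate, start_sec, stop_sec):
    """Return the slice of `signal` between two offsets given in seconds."""
    start_idx = int(round(start_sec * sample_rate))
    stop_idx = int(round(stop_sec * sample_rate))  # exclusive (assumption)
    return signal[start_idx:stop_idx]


# Fake 10 s recording at 100 Hz; each value equals its own sample index.
original = list(range(1000))

# Stand-in for the data that would be read back from the cropped file.
cropped = original[200:500]

# The cropped data should be exactly the 2 s to 5 s portion of the original.
assert expected_crop(original, 100, 2, 5) == cropped
```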

Contributor Author

Done

@skjerns
Collaborator

skjerns commented Sep 7, 2023

My apologies that this took so long.

The code looks good. Thank you very much for this addition!

One more request: It would be great if cropping could be done with relative times as well. My suggestion would be to have three parameters start_dt, start_sec, start_smp (which are mutually exclusive) and their corresponding stop_x. This way, cropping could be done without knowing the RecordingStartdatetime of the EDF. Think of a scenario where you want to remove e.g. the first 15 minutes of the recording or keep an exact number of samples after midnight (which means start_x and stop_x can be of different types, e.g. start_dt together with stop_smp). Internally, we would transform them to a common unit (as is already done right now) and apply the cropping.

What are your thoughts on this?

@raphaelvallat
Contributor Author

Thanks for the review @skjerns. Sounds good about adding the possibility to crop based on samples or seconds. For simplicity, what about:

highlevel.crop_edf(
    "data/test_generator.edf",
    start=new_start, stop=new_stop,
    start_format="datetime", stop_format="datetime",
    verbose=True
)

where start/stop_format can be one of "datetime", "seconds", or "samples"?
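
As a rough sketch of how such a format argument could be reduced to a common unit internally, the snippet below converts each boundary to seconds since the recording start. The helper `to_offset_seconds`, its `sample_rate` parameter, and the recording start time are hypothetical and only illustrate the idea; they are not part of pyedflib.

```python
from datetime import datetime


def to_offset_seconds(value, fmt, recording_start, sample_rate=None):
    """Convert a crop boundary to seconds since the recording start."""
    if fmt == "datetime":
        return (value - recording_start).total_seconds()
    if fmt == "seconds":
        return float(value)
    if fmt == "samples":
        # A sample count only has a duration relative to one sampling rate.
        if sample_rate is None:
            raise ValueError("'samples' requires a sample_rate")
        return value / sample_rate
    raise ValueError(f"unsupported format: {fmt!r}")


recording_start = datetime(2011, 4, 4, 12, 57, 2)  # made-up start time

# Start and stop may use different formats; both end up as seconds.
start = to_offset_seconds(
    datetime(2011, 4, 4, 12, 58, 0), "datetime", recording_start
)
stop = to_offset_seconds(120, "seconds", recording_start)
```

Once both boundaries are plain second offsets, the cropping logic itself no longer needs to know which format the caller used.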

@skjerns
Collaborator

skjerns commented Sep 12, 2023

Yes that sounds like a good solution!

@raphaelvallat
Contributor Author

Ready for re-review @skjerns !

Of note, I did not add support for "samples" since I think it could lead to errors when the signals have different sampling frequencies. The only supported options are "datetime" and "seconds". I have also updated the unit tests, which now compare that the signal values and headers are the same.
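
The ambiguity with "samples" can be illustrated with a few lines of plain Python. The channel labels and sampling rates below are made up for the example.

```python
# Hypothetical channels with different sampling rates (in Hz).
sample_rates = {"EEG": 256, "ECG": 100}

# The same sample count covers a different duration in each signal.
n_samples = 1000
durations = {label: n_samples / fs for label, fs in sample_rates.items()}

# A duration in seconds, by contrast, maps to one well-defined
# sample count per signal.
crop_seconds = 10
samples_needed = {label: crop_seconds * fs for label, fs in sample_rates.items()}
```

Here 1000 samples is about 3.9 s of the 256 Hz signal but 10 s of the 100 Hz signal, so a single sample index cannot describe one crop point for the whole file.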

@skjerns
Collaborator

skjerns commented Sep 20, 2023

> Ready for re-review @skjerns !
>
> Of note, I did not add support for "samples" since I think it could lead to errors when the signals have different sampling frequencies. The only supported options are "datetime" and "seconds". I have also updated the unit tests, which now compare that the signal values and headers are the same.

ah, yes that makes total sense. Thanks for thinking of this! I'll review it soon.

pyedflib/highlevel.py (outdated; four review threads)
raphaelvallat and others added 5 commits September 20, 2023 18:13
Co-authored-by: Simon Kern <14980558+skjerns@users.noreply.github.com>
@raphaelvallat
Contributor Author

All suggestions merged. Thanks!

@skjerns skjerns merged commit 28084a2 into holgern:master Sep 20, 2023
@skjerns
Collaborator

skjerns commented Sep 20, 2023

Thanks a lot for the contribution @raphaelvallat !
