-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix writing of decimal sample frequency #159
Conversation
Build Python 3.10 wheels for aarch64
Run tests on Python 3.10
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #159 +/- ##
==========================================
+ Coverage 76.08% 81.08% +5.00%
==========================================
Files 4 4
Lines 828 809 -19
Branches 176 170 -6
==========================================
+ Hits 630 656 +26
+ Misses 156 112 -44
+ Partials 42 41 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
nsigs = int(header['n_signals']) | ||
label = [f.read(16).decode() for i in range(nsigs)] | ||
transducer = [f.read(80).decode().strip() for i in range(nsigs)] | ||
dimension = [f.read(8).decode().strip() for i in range(nsigs)] | ||
pmin = [f.read(8).decode() for i in range(nsigs)] | ||
pmax = [f.read(8).decode() for i in range(nsigs)] | ||
dmin = [f.read(8).decode() for i in range(nsigs)] | ||
dmax = [f.read(8).decode() for i in range(nsigs)] | ||
prefilter = [f.read(80).decode().strip() for i in range(nsigs)] | ||
n_samples = [f.read(8).decode() for i in range(nsigs)] | ||
reserved = [f.read(32).decode() for i in range(nsigs)] | ||
_ = zip(label, transducer, dimension, pmin, pmax, dmin, dmax, prefilter, n_samples, reserved) | ||
values = locals().copy() | ||
fields = ['label', 'transducer', 'dimension', 'pmin', 'pmax', 'dmin', 'dmax', 'prefilter', 'n_samples', 'reserved'] | ||
sheaders = [{field:values[field][i] for field in fields} for i in range(nsigs)] | ||
print('\n##### Signal Headers') | ||
print(json.dumps(sheaders, indent=2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated, just ignore, just for debugging
from ._extensions._pyedflib import set_startdatetime, set_starttime_subsecond, set_samplefrequency, set_physical_minimum, set_label, set_physical_dimension | ||
from ._extensions._pyedflib import set_startdatetime, set_starttime_subsecond, set_samples_per_record, set_physical_minimum, set_label, set_physical_dimension |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to accurately reflect what the function actually does.
pyedflib/edfwriter.py
Outdated
record_duration : integer | ||
Sets the datarecord duration in units of seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was wrong. We insert the record_duration
in seconds, not 10mS (this is how it is saved later inside the edf, but not how we indicate it here)
block_size : int | ||
set the block size for writing. Should be divisor of signal length | ||
in seconds. Higher values mean faster writing speed, but if it | ||
is not a divisor of the signal duration, it will append zeros. | ||
Can be any value between 1=><=60, -1 will auto-infer the fastest value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the use of block_size was entirely wrong here, my bad. needs to be removed.
# get annotations, in format [[timepoint, duration, description], [...]] | ||
annotations = header.get('annotations', []) | ||
|
||
with pyedflib.EdfWriter(edf_file, n_channels=n_channels, file_type=file_type) as f: | ||
f.setDatarecordDuration(int(100000 * block_size)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
completely misunderstood the use of record_duration
|
||
def tearDown(self): | ||
# small hack to close handles in case of tests throwing an exception | ||
for obj in gc.get_objects(): | ||
if isinstance(obj, (EdfWriter, EdfReader)): | ||
obj.close() | ||
del obj | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a test fails it will leave the file open. if the same file is accessed in a later test, that will fail again, just due to the file still being open.
added this to close all files that were not closed because a test failed.
sample_frequencies = [1000, 800, 500, 975, 999] | ||
sample_frequencies = [2000, 1600, 1000, 1950, 1998] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these were wrong before. Now they are correct and finally are also what EDFBrowser would display.
sample_frequencies = [1000, 800, 500, 975, 999] | ||
sample_frequencies = [500, 400, 250, 487.5, 499.5] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these were wrong before. Now they are correct and finally are also what EDFBrowser would display.
@@ -156,11 +166,14 @@ def test_fortran_write(self): | |||
|
|||
|
|||
def test_read_write_decimal_sample_frequencies(self): | |||
signals = np.random.randint(-2048, 2048, [3, 256*60]) | |||
# first test with digital signals | |||
signals = np.random.randint(-2048, 2048, [3, 256*60+8]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+8 as the file is padded by zeros else.
Add tests for ChannelDoesNotExist Exception
Fix importlib
change win2016 to win2019 as its deprecated
remove deprecated "mode" parameter
…ue148_samplerate2
Fix writing of annotations in very long records
…ue148_samplerate2
Remove commented code and merged from develop
I changed using Also Python 3.6 is removed from testing as its no longer available. |
Went once through all the changes, added some more comments. I think all changes are okay and it's long overdue to merge this PR Thanks for @gcathelain for reviewing! I'll merge it soon. |
In this PR I'm fixing some mistakes that have been made before, and include some clarifications w.r.t to
sampling_frequency
andsample_rate
Summary:
sample_frequency
is now the main term being used and denoted the samples in Hz (samples per second)sample_rate
. To prevent confusion, for now we pretendsample_frequency
andsample_rate
are the same term towards the user.record_duration
was entirely wrong. Fixed that now.Terms:
smp_per_record
: tells us how many samples there are in each record.record_duration
: denotes the time window of one record. If all sampling frequencies are ints, this can simply be 1.sample_frequency
: is calculated by dividing smp_per_record/record_duration. If we write a file, we have to do the reverse, and find an optimal record_duration such that can actually represent the wished sample_frequency.Problem:
Sampling frequency (samples per seconds in Hz) is not explicitly denoted in EDF. It is calculated by
smp_per_record
(samples, int) /record_duration
(seconds, float). Previously it was just assumed thatsample_rate
is equivalent tosample_frequency
, which is not the case, assample_rate
is taken directly fromsmp_per_record
. However, when therecord_duration
is unequal to 1, this is not the same as the sampling frequency.Similarly, if we want to save a file with decimal
sampling_frequency
, e.g. 0.5 Hz, we need to make the equation0.5=smp_per_record/record_duration
work such thatsmp_per_record
is an int. That means we need to look for arecord_duration
that matches thesmp_per_record
for all channels in the EDF. I posted a simplified solution that should work in most cases (just try outrecord_duration
for 1s to 60s in 1s steps), however, we can think of some more mathematical solution as well later.Proposition
I think nobody cares how many samples are stored within a record_duration. These are technical details. People want to set a sampling frequency. They do not care how this is realized and stored within the file. Therefore I opt to remove any setting of
record_duration
and ``closes #148 #111
This is a rather large commit, and I would appreciate if someone would take a close look at it @holgern