Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor timestamps managing #210

Merged
merged 25 commits into from
Oct 22, 2024

Conversation

Gautzilla
Copy link
Contributor

This PR follows the changes made regarding issue #209

@Gautzilla Gautzilla added the short term Issue that takes a short time to be corrected label Oct 15, 2024
@Gautzilla Gautzilla self-assigned this Oct 15, 2024
@Gautzilla Gautzilla linked an issue Oct 15, 2024 that may be closed by this pull request
@Gautzilla Gautzilla changed the title [DRAFT] rename build files test [DRAFT] refactor dataset build() method Oct 16, 2024
@Gautzilla Gautzilla changed the title [DRAFT] refactor dataset build() method [DRAFT] refactor timestamps managing Oct 17, 2024
@Gautzilla Gautzilla marked this pull request as ready for review October 17, 2024 14:51
@Gautzilla Gautzilla changed the title [DRAFT] refactor timestamps managing refactor timestamps managing Oct 17, 2024
@Gautzilla
Copy link
Contributor Author

Gautzilla commented Oct 17, 2024

@mathieudpnt @cazaudo, basically this PR does:

A rework of how the timestamps are parsed from the audio file names. I added some basic utils which might simplify the old write_timestamp function:

  • Dataset now uses the _write_timestamp_csv_from_audio_files function to write the timestamp csv from the audio files present in the audio folder. this function:
    • get the supported audio files from the audio files folder
    • parse the Timestamps from the audio file names thanks to the provided format pattern string (this is the step that uses all the added utils, see associate_timestamps in the utils\timestamp_utils module)
    • write the timestamps.csv with the uniformed OSmOSE datetime string format (defined in the config module)

I added a bunch of unit and integration tests to make the thing robust, and it should not break the global (untested) OSEkit workflow since the timestamp.csv is written in the same fashion as before (except for the timestamp precision, which has been reduced to the millisecond).

I intentionally kept the build() method as it is for now (and integrated the latter changes to it), but these changes might help refactoring the build() method (I think this is the next step!)

The utils added to timestamp_utils should also be used elsewhere in the codebase (e.g. by the reshaper), I'll do that in the near future too.

PS: pytest throws some warnings on the Python 3.10 used in the github actions here. I managed to solve them locally in an environment that uses Python 3.12.7: is there a reason for sticking with the 3.10 version?

@Gautzilla Gautzilla requested review from mathieudpnt and cazaudo and removed request for mathieudpnt October 17, 2024 15:06
Copy link
Contributor

@mathieudpnt mathieudpnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor changes, overall looks good to me!

src/OSmOSE/utils/timestamp_utils.py Outdated Show resolved Hide resolved
src/OSmOSE/utils/timestamp_utils.py Show resolved Hide resolved
src/OSmOSE/Dataset.py Outdated Show resolved Hide resolved
@Gautzilla Gautzilla merged commit 4138c80 into Project-OSmOSE:main Oct 22, 2024
1 check passed
@Gautzilla Gautzilla deleted the refactor/audio_files_build branch October 22, 2024 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
short term Issue that takes a short time to be corrected
Projects
None yet
Development

Successfully merging this pull request may close these issues.

write_timestamp called during Dataset build should be exploded
2 participants