Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a utility function get_random_sequence_subset #2098

Merged
merged 7 commits into from
Jul 2, 2024

Conversation

amontanez24
Copy link
Contributor

resolves #2085
CU-86b0z70r1

@amontanez24 amontanez24 requested a review from a team as a code owner June 27, 2024 20:57
@amontanez24 amontanez24 requested review from frances-h and removed request for a team June 27, 2024 20:57
@sdv-team
Copy link
Contributor

Copy link
Contributor

@R-Palazzo R-Palazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

sdv/utils/poc.py Outdated Show resolved Hide resolved
sdv/utils/poc.py Outdated
- last_rows: Keep the last n rows of the sequence, where n is the max sequence length.
- random: Randomly choose n rows to keep within the sequence. It is important to keep
the randomly chosen rows in the same order as they appear in the original data.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other poc/utils methods start by validating the metadata. It can be useful here also to prevent the case where the sequence_key inside the metadata is not in the data

tests/unit/utils/test_poc.py Outdated Show resolved Hide resolved
tests/unit/utils/test_poc.py Outdated Show resolved Hide resolved
tests/unit/utils/test_poc.py Outdated Show resolved Hide resolved
tests/unit/utils/test_poc.py Outdated Show resolved Hide resolved
@amontanez24 amontanez24 force-pushed the issue-2085-get-random-sequence-subset branch from c810b15 to 4f7887e Compare June 28, 2024 17:45
Copy link
Contributor

@R-Palazzo R-Palazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@amontanez24 amontanez24 requested review from rwedge and gsheni and removed request for frances-h and rwedge June 28, 2024 19:23
Copy link
Contributor

@gsheni gsheni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. Just minor suggestions

sdv/utils/utils.py Outdated Show resolved Hide resolved
sdv/utils/utils.py Outdated Show resolved Hide resolved
@amontanez24 amontanez24 merged commit d8962df into main Jul 2, 2024
39 checks passed
@amontanez24 amontanez24 deleted the issue-2085-get-random-sequence-subset branch July 2, 2024 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a utility function get_random_sequence_subset
5 participants