Feature/text aug mapper #17

Merged
merged 8 commits into from
Sep 13, 2023

Conversation

HYLcool
Collaborator

@HYLcool HYLcool commented Sep 11, 2023

  • New OPs: simple text augmentation OPs for English (en) and Chinese (zh); each OP aggregates several augmentation methods.
    • They are also updated in config_all.yaml, with augmentation examples.
  • Config optimization:
    • A new argument, export_in_parallel, controls whether the result dataset is exported to a single file in parallel. It is disabled by default.
    • The input config file is backed up in the work directory for later use.
    • Data-Juicer displays the config table before processing starts, which makes it easier to manage the currently running tasks.
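
For readers who want a concrete picture of the config backup and config table items above, here is a minimal standard-library sketch. It is not the code from this PR: `backup_config` and `format_config_table` are hypothetical helper names, the table layout is an assumption, and only the `export_in_parallel` key comes from the description.

```python
import shutil
import tempfile
from pathlib import Path


def backup_config(config_path, work_dir):
    """Copy the input config file into the work directory; return the copy's path.

    Hypothetical helper, not Data-Juicer's actual implementation.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / Path(config_path).name
    shutil.copyfile(config_path, target)
    return target


def format_config_table(cfg):
    """Render a flat dict of config values as a two-column text table."""
    rows = [(str(key), str(value)) for key, value in cfg.items()]
    key_width = max([len("key")] + [len(k) for k, _ in rows])
    value_width = max([len("value")] + [len(v) for _, v in rows])
    lines = [
        f"{'key':<{key_width}} | {'value':<{value_width}}",
        "-" * key_width + "-+-" + "-" * value_width,
    ]
    lines += [f"{k:<{key_width}} | {v:<{value_width}}" for k, v in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Write a throwaway config file, back it up, then show the table.
        config_file = Path(tmp) / "demo.yaml"
        config_file.write_text("export_in_parallel: false\n")
        copy = backup_config(config_file, Path(tmp) / "work_dir")
        print(f"config backed up to {copy}")
        print(format_config_table({"export_in_parallel": False}))
```

Printing the table before any processing begins mirrors the intent stated in the description: the effective settings are visible up front, and the backed-up copy records which config produced a given work directory.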

configs/config_all.yaml (outdated, resolved)
data_juicer/core/tracer.py (outdated, resolved)
Collaborator

@yxdyc yxdyc left a comment

Good job. Please see the inline comments.

configs/config_all.yaml (outdated, resolved)
configs/config_all.yaml (outdated, resolved)
configs/config_all.yaml (outdated, resolved)
data_juicer/config/config.py (resolved)
data_juicer/core/exporter.py (outdated, resolved)
data_juicer/core/tracer.py (outdated, resolved)
data_juicer/ops/base_op.py (outdated, resolved)
@yxdyc yxdyc merged commit e221d06 into main Sep 13, 2023
@HYLcool HYLcool deleted the feature/text_aug_mapper branch September 13, 2023 06:53
@HYLcool HYLcool added the enhancement (New feature or request) label Sep 13, 2023
@HYLcool HYLcool added the dj:op (issues/PRs about some specific OPs) label Jan 8, 2024
Labels
dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)
4 participants