-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dev#135 #183
Conversation
Korpora/cli.py
Outdated
@@ -57,6 +58,20 @@ def main(): | |||
parser_lmdata.add_argument('--save_each', dest='save_each', action='store_true', help='store each corpus as a file') | |||
parser_lmdata.set_defaults(func=create_lmdata) | |||
|
|||
# create parallel corpus data | |||
parser_parallel = subparsers.add_parser('parallel', help='Create parallel corpus data') | |||
parser_parallel.add_argument('--corpus', type=str, required=True, nargs='+', help='corpus names') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
현재 CLI argument의 long form만 존재하는데, short form 또한 같이 추가하는거에 대해서는 어떻게 생각하시나요?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--corpus
처럼 자주 입력해야 하는 arguments 이거나, 다른 arguments 와 혼동되지 않는 arguments 라면 도입하는게 좋을 듯 합니다. 감사합니다.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8b31bc4
에서 반영하였습니다. short term 으로 기술하면 예상하기 어렵거나, 패키지 내 다른 함수에서 자주 이용되는 argument 가 아닌 경우에는 short term 을 추가하지 않았습니다.
Korpora/cli.py
Outdated
parser_parallel.add_argument('--root_dir', type=str, default=None, help='path/to/Korpora') | ||
parser_parallel.add_argument('--output_dir', type=str, required=True, help='output file path') | ||
parser_parallel.add_argument('--sampling_ratio', type=float, default=None, help='Sampling ratio') | ||
parser_parallel.add_argument('--n_first_samples', type=int, default=None, help='Number of first samples') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linux CLI 기반 프로그램의 경우, 해당 argument의 기능을 보통 head로 표현하는 것 같은데, argument명을 head로 변경하는건 어떨까요?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
좋은 아이디어라 생각합니다. 반영하겠습니다.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
daad8ca
에서 반영하였습니다.
Korpora/task_parallel_corpus.py
Outdated
available = [] | ||
for name in corpus_names: | ||
if name not in ITERATE_TEXTS: | ||
print(f'Not provide {name} corpus. Check the `corpus` argument') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not provide {name} corpus --> {name} corpus not provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d68bfff
에서 반영하였습니다.
parser_fetch.add_argument('--root', type=str, default=None, help='path/to/Korpora/') | ||
parser_fetch.add_argument('--force_download', dest='force_download', action='store_true') | ||
parser_fetch.add_argument('--corpus', '-c', type=str, default='all', nargs='+', help='corpus name') | ||
parser_fetch.add_argument('--root', '-r', type=str, default=None, help='path/to/Korpora/') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
저희 패키지 전체적으로 root_dir
이라는 인자를 쓰고 있는데 통일하면 어떨지 싶습니다.
parser_fetch.add_argument('--corpus', type=str, default='all', nargs='+', help='corpus name') | ||
parser_fetch.add_argument('--root', type=str, default=None, help='path/to/Korpora/') | ||
parser_fetch.add_argument('--force_download', dest='force_download', action='store_true') | ||
parser_fetch.add_argument('--corpus', '-c', type=str, default='all', nargs='+', help='corpus name') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
저희 패키지 전체적으로 corpus_name
을 쓰고 있는 것으로 아는데 맞나요?
맞다면 이 역시 corpus_name
으로 통일하면 어떨지 싶습니다.
parser_lmdata.add_argument('--corpus', type=str, required=True, nargs='+', help='corpus names') | ||
parser_lmdata.add_argument('--root_dir', type=str, default=None, help='path/to/Korpora') | ||
parser_lmdata.add_argument('--output_dir', type=str, required=True, help='output file path') | ||
parser_lmdata.add_argument('--corpus', '-c', type=str, required=True, nargs='+', help='corpus names') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
저희 패키지 전체적으로 corpus_name
을 쓰고 있는 것으로 아는데 맞나요?
맞다면 이 역시 corpus_name
으로 통일하면 어떨지 싶습니다.
parser_lmdata.add_argument('--multilingual', dest='multilingual', action='store_true', help='If True, make include train data foreign language text') | ||
parser_lmdata.add_argument('--save_each', dest='save_each', action='store_true', help='store each corpus as a file') | ||
parser_lmdata.set_defaults(func=create_lmdata) | ||
|
||
# create parallel corpus data | ||
parser_parallel = subparsers.add_parser('parallel', help='Create parallel corpus data') | ||
parser_parallel.add_argument('--corpus', '-c', type=str, required=True, nargs='+', help='corpus names') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
저희 패키지 전체적으로 corpus_name
을 쓰고 있는 것으로 아는데 맞나요?
맞다면 이 역시 corpus_name
으로 통일하면 어떨지 싶습니다.
Pull Request
Korpora에 기여해 주셔서 감사합니다.
해당 Pull Request를 제출하기 전에 아래 사항이 완료되었는지 확인 부탁드립니다:
Korpora/test
)1. 해당 PR은 어떤 내용인가요?
번역용 학습말뭉치를 만드는 CLI 기능과 사용법
2. PR과 관련된 이슈가 있나요?
#135