Fix extra blank line between every line of the files produced by the recognition train/test split script #11280
Merged
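After splitting recognition data with gen_ocr_train_val_test.py, the generated train.txt, val.txt, and test.txt contain an extra blank line (\n) after each label line, which causes errors during training. Once the extra \n is removed, training runs normally.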
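For label files that were already generated with the extra blank lines, a small cleanup pass may be enough to make them usable without re-running the split. The following is only a sketch: `drop_blank_lines` is a hypothetical helper, not part of PaddleOCR, and it assumes the label files are UTF-8 text with one label per line.

```python
def drop_blank_lines(label_file):
    """Rewrite label_file in place without empty lines.

    Hypothetical helper (not part of PaddleOCR). Returns the number
    of blank lines that were removed.
    """
    with open(label_file, encoding="utf-8") as f:
        lines = f.readlines()

    # Keep only lines that contain something other than whitespace.
    kept = [line.rstrip("\r\n") for line in lines if line.strip()]

    # newline="\n" stops Python from translating "\n" to "\r\n" on Windows.
    with open(label_file, "w", encoding="utf-8", newline="\n") as f:
        for line in kept:
            f.write(line + "\n")

    return len(lines) - len(kept)
```

It could be called once per generated file, for example `drop_blank_lines("train.txt")`, with the path adjusted to wherever the split script wrote its output.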
The extra blank lines cause the following error:
[2023/11/21 09:58:27] ppocr ERROR: When parsing line D:\PaddleOCR\train_data\rec\train\FAB06_input_Win 2000_crop_1.jpg l , error happened with msg: Traceback (most recent call last):
  File "D:\PaddleOCR\ppocr\data\simple_dataset.py", line 252, in __getitem__
    data['ext_data'] = self.get_ext_data()
  File "D:\PaddleOCR\ppocr\data\simple_dataset.py", line 124, in get_ext_data
    label = substr[1]
IndexError: list index out of range
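The traceback points at `label = substr[1]`, which is the failure you get when a label line is split into fields and there is no second field. A minimal reproduction, assuming each label line has the form `image_path<TAB>label` and is split on a tab (`parse_label_line` is a hypothetical stand-in, not the actual code in simple_dataset.py):

```python
def parse_label_line(line, delimiter="\t"):
    """Split one label line into (file_name, label).

    Hypothetical stand-in for the parsing step named in the traceback;
    the tab delimiter is an assumption.
    """
    substr = line.strip("\n").split(delimiter)
    file_name = substr[0]
    label = substr[1]  # IndexError when the line has no delimiter
    return file_name, label


# A well-formed line parses into two fields.
print(parse_label_line("img_1.jpg\thello\n"))  # ('img_1.jpg', 'hello')

# A blank line splits into [''], so substr[1] does not exist.
try:
    parse_label_line("\n")
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: list index out of range
```

This is why a blank line between labels is enough to break training even though every real label line is well formed.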