executor: fix csv parser #9005
/run-all-tests
Codecov Report
@@ Coverage Diff @@
## master #9005 +/- ##
==========================================
+ Coverage 67.16% 67.17% +0.01%
==========================================
Files 371 371
Lines 76393 76486 +93
==========================================
+ Hits 51311 51383 +72
- Misses 20494 20511 +17
- Partials 4588 4592 +4
It would be better to add test cases for the inputs that this PR fixes.
executor/load_data.go
Outdated
sep = append(sep, e.FieldsInfo.Enclosed)
sep = append(sep, e.FieldsInfo.Terminated...)
sep = append(sep, e.FieldsInfo.Enclosed)
var (
This function is too long. Please split it into smaller pieces.
/run-all-tests
/run-unit-test
LGTM
fp, err = os.Create(path)
c.Assert(err, IsNil)
c.Assert(fp, NotNil)
You can call os.Remove(path) here immediately.
The next test case still uses this file.
executor/load_data.go
Outdated
}

func (w *fieldWriter) GetField() (bool, field) {
// the bool return value implies whether is at the end of line.
s/the/The/
Whether what is at the end of the line? The subject is missing.
w.rollback()
}
}
for {
This is now quite complex and harder to maintain.
If we run into another bug here, I'll consider using a more general method instead of hand-writing this parsing logic.
executor/load_data.go
Outdated
w.term = term
}

func (w *fieldWriter) rollback() {
It would be better to rename this function.
@@ -198,6 +198,11 @@ func (s *testExecSuite) TestGetFieldsFromLine(c *C) {
`"\0\b\n\r\t\Z\\\ \c\'\""`,
[]string{string([]byte{0, '\b', '\n', '\r', '\t', 26, '\\', ' ', ' ', 'c', '\'', '"'})},
},
// Test mixed.
{
`"123",456,"\t7890",abcd`,
It would be better to add more test cases to pin down the behavior we support.
rest LGTM
/run-all-tests
LGTM
What problem does this PR solve?
TiDB panics when it encounters an irregular CSV format, like this.
Test SQL:
This is because the CSV file above mixes enclosed and unenclosed fields.
Another case that causes a panic is shown below.
TiDB:
MySQL:
What is changed and how it works?
Restructure the getFieldsFromLine function.
Check List
Tests
Code changes
Side effects
Related changes