-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/parser #5
Fix/parser #5
Conversation
Introduce a new test, TestBigPgn, to validate parsing and tokenizing of large PGN files. Ensures proper handling of multiple games in large datasets, improving robustness and error reporting.
Replaced generic error messages with the new `ParserError` struct, providing detailed context such as token type, value, and position. Added a `String()` method for `TokenType` to support better error descriptions. This enhances debugging and improves error clarity during PGN parsing.
Refactored how valid moves are retrieved to improve clarity, replacing a method call with direct access to the position's valid moves. Enhanced error messages for better debugging by including actual values in mismatch scenarios.
Enhanced PGN fixtures by including game annotations, opening names, and move evaluations from lichess.org.
This commit introduces a new `big_big.pgn` file containing numerous chess games from Eric Rosen, including rated and casual matches against various opponents. The file will be useful for testing PGN parsing, game rendering, and related functionalities.
This commit introduces functionality for parsing variations in PGN files, including updating parser logic for move generation from root and variation contexts. Additionally, lexer improvements address undefined tokens and support NAG symbols (!, ?). Relevant tests and example PGN files have also been added for validation.
Enhanced the parsing of comments and commands in PGN files by introducing key-value pair storage for commands. Adjusted related test cases and parsing functions to accommodate this change, ensuring better handling of PGN comments and more robust error reporting.
WalkthroughThe pull request introduces comprehensive enhancements to the chess package's parsing and error handling capabilities. Key modifications include a new Changes
Sequence DiagramsequenceDiagram
participant Lexer
participant Parser
participant Game
Lexer->>Parser: Tokenize PGN
Parser->>Parser: Parse Tag Pairs
Parser->>Parser: Parse Moves
Parser->>Game: Construct Game State
Parser-->>Lexer: Handle Errors
Game-->>Parser: Game Representation
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5 +/- ##
==========================================
- Coverage 66.99% 66.23% -0.76%
==========================================
Files 25 25
Lines 3675 3865 +190
==========================================
+ Hits 2462 2560 +98
- Misses 1100 1196 +96
+ Partials 113 109 -4 ☔ View full report in Codecov by Sentry. |
Adjusted code indentation and alignment in several files to improve readability. Reordered struct fields in `ParserError` and `Parser` to match logical grouping. These changes enhance maintainability and adhere to coding standards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (5)
lexer.go (1)
180-181
: Address the TODO: Improve NAG handling for different formatsThe comment indicates that the current NAG handling may not fully conform to the PGN specification, and there is a need to handle different formats more comprehensively.
Would you like assistance in implementing improved NAG handling according to the PGN specification? I can help generate the necessary code or open a GitHub issue to track this task.
pgn.go (1)
340-340
: Avoid magic number8
in rank calculationUsing the magic number
8
directly in calculations reduces code readability and maintainability. Consider defining a constant or using existing methods to compute the rank.Apply this diff:
- if moveData.originRank != "" && strconv.Itoa(int((m.S1()/8)+1)) != moveData.originRank { + const boardSize = 8 // Define a constant for the board size + if moveData.originRank != "" && strconv.Itoa(int((m.S1()/boardSize)+1)) != moveData.originRank {Alternatively, if there is a method to calculate the rank from a square, utilizing it would enhance clarity.
🧰 Tools
🪛 golangci-lint (1.62.2)
340-340: Magic number: 8, in detected
(mnd)
pgn_test.go (3)
115-144
: Enhance variation testing coverageWhile the test verifies basic parsing and move count, it could be strengthened by:
- Validating the structure of variations
- Verifying the specific moves within variations
- Asserting the relationship between main line and variation moves
Would you like me to provide an example of how to enhance this test with more detailed assertions?
🧰 Tools
🪛 golangci-lint (1.62.2)
144-144: unnecessary trailing newline
(whitespace)
217-250
: Consider validating game contentThe test verifies parsing success but doesn't validate the content of parsed games. Consider adding basic validations for each game:
- Verify required tag pairs (Event, Site, Date, etc.)
- Check for valid move notation
- Ensure game termination markers are correct
337-363
: Consider using constants for test dataThe test contains several magic numbers and hardcoded values. Consider:
- Define constants for expected move count
- Create test data structs for expected moves, comments, and annotations
- Group related assertions into helper functions
Example:
const ( expectedMoveCount = 104 firstMove = "d2d4" ) type expectedMove struct { notation string comment string eval string nag string } var expectedMoves = []expectedMove{ {notation: "d2d4", eval: "0.17"}, // ... more moves }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
errors.go
(2 hunks)fixtures/pgns/complete_game.pgn
(1 hunks)fixtures/pgns/variations.pgn
(1 hunks)lexer.go
(4 hunks)move.go
(1 hunks)pgn.go
(15 hunks)pgn_test.go
(3 hunks)
✅ Files skipped from review due to trivial changes (2)
- fixtures/pgns/complete_game.pgn
- fixtures/pgns/variations.pgn
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pgn.go
115-115: unnecessary trailing newline
(whitespace)
340-340: Magic number: 8, in detected
(mnd)
473-473: unnecessary leading newline
(whitespace)
531-531: unnecessary trailing newline
(whitespace)
pgn_test.go
144-144: unnecessary trailing newline
(whitespace)
🔇 Additional comments (4)
lexer.go (1)
552-554
: Handling unrecognized characters with Undefined
TokenType
The addition of the Undefined
TokenType
ensures that unrecognized characters are appropriately tokenized, enhancing error handling during parsing.
pgn.go (1)
417-463
: Improved parsing of comments and commands
The updated parseComment
method now returns both a comment string and a command map, enhancing the handling of embedded commands within comments.
errors.go (1)
42-52
: Introduction of ParserError
enhances error reporting
The addition of the ParserError
type with detailed fields like Message
, TokenType
, TokenValue
, and Position
improves the robustness and clarity of parsing error messages.
move.go (1)
28-29
: Update Move
struct to use a map for commands
Changing the command
field to a map[string]string
allows for flexible command representation, aligning with the updated command parsing logic. This change enhances the ability to store multiple command parameters associated with a move.
func (t TokenType) String() string { | ||
types := []string{ | ||
"EOF", | ||
"TagStart", | ||
"TagEnd", | ||
"TagKey", | ||
"TagValue", | ||
"MoveNumber", | ||
"DOT", | ||
"ELLIPSIS", | ||
"PIECE", | ||
"SQUARE", | ||
"CommentStart", | ||
"CommentEnd", | ||
"COMMENT", | ||
"RESULT", | ||
"CAPTURE", | ||
"FILE", | ||
"RANK", | ||
"KingsideCastle", | ||
"QueensideCastle", | ||
"PROMOTION", | ||
"PromotionPiece", | ||
"CHECK", | ||
"CHECKMATE", | ||
"NAG", | ||
"VariationStart", | ||
"VariationEnd", | ||
"CommandStart", | ||
"CommandName", | ||
"CommandParam", | ||
"CommandEnd", | ||
} | ||
|
||
if t < 0 || int(t) >= len(types) { | ||
return "Unknown" | ||
} | ||
|
||
return types[t] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix mismatch between TokenType
constants and their string representations
In the String
method, the types
slice is missing the "Undefined"
token type, causing misalignment between the TokenType
constants and their string representations. This will lead to incorrect or off-by-one string outputs for token types beyond "EOF"
.
Apply this diff to include the "Undefined"
token in the types
slice:
func (t TokenType) String() string {
types := []string{
"EOF",
+ "Undefined",
"TagStart",
"TagEnd",
"TagKey",
"TagValue",
"MoveNumber",
"DOT",
"ELLIPSIS",
"PIECE",
"SQUARE",
"CommentStart",
"CommentEnd",
"COMMENT",
"RESULT",
"CAPTURE",
"FILE",
"RANK",
"KingsideCastle",
"QueensideCastle",
"PROMOTION",
"PromotionPiece",
"CHECK",
"CHECKMATE",
"NAG",
"VariationStart",
"VariationEnd",
"CommandStart",
"CommandName",
"CommandParam",
"CommandEnd",
}
// Rest of the method remains the same
}
Committable suggestion skipped: line range outside the PR's diff.
pgn_test.go
Outdated
func TestBigBigPgn(t *testing.T) { | ||
t.Skip("This test is too slow") | ||
pgn := mustParsePGN("fixtures/pgns/big_big.pgn") | ||
reader := strings.NewReader(pgn) | ||
|
||
scanner := NewScanner(reader) | ||
count := 0 | ||
|
||
for scanner.HasNext() { | ||
count++ | ||
t.Run(fmt.Sprintf("bigbig pgn : %d", count), func(t *testing.T) { | ||
scannedGame, err := scanner.ScanGame() | ||
if err != nil { | ||
t.Fatalf("fail to scan game from valid pgn: %s", err.Error()) | ||
} | ||
|
||
tokens, err := TokenizeGame(scannedGame) | ||
if err != nil { | ||
t.Fatalf("fail to tokenize game from valid pgn: %s", err.Error()) | ||
} | ||
|
||
raw := scannedGame.Raw | ||
|
||
parser := NewParser(tokens) | ||
game, err := parser.Parse() | ||
if err != nil { | ||
t.Fatalf("fail to read games from valid pgn: %s | %s", err.Error(), raw[:min(200, len(raw))]) | ||
} | ||
|
||
if game == nil { | ||
t.Fatalf("game is nil") | ||
} | ||
}) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve test management strategy
The test duplicates TestBigPgn
logic but is skipped as "too slow". Consider:
- Document the size/complexity differences between
big.pgn
andbig_big.pgn
- Add criteria for when this test should be run (e.g., pre-release validation)
- Consider making it a benchmark instead of a test
- Use build tags to control test execution
Example build tag usage:
//go:build extended_test
pgn_test.go
Outdated
parser := NewParser(tokens) | ||
game, err := parser.Parse() | ||
if err != nil { | ||
t.Fatalf("fail to read games from valid pgn: %s | %s", err.Error(), raw[:min(200, len(raw))]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix undefined min function
The min
function is used but not defined. This could cause compilation errors.
Add this helper function:
func min(a, b int) int {
if a < b {
return a
}
return b
}
pgn_test.go
Outdated
if len(game.Moves()) != 104 { | ||
t.Fatalf("game moves are not correct, expected 52, got %d", len(game.Moves())) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix inconsistent error message
The error message mentions "expected 52" but the actual check is for 104 moves.
- t.Fatalf("game moves are not correct, expected 52, got %d", len(game.Moves()))
+ t.Fatalf("game moves are not correct, expected 104, got %d", len(game.Moves()))
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if len(game.Moves()) != 104 { | |
t.Fatalf("game moves are not correct, expected 52, got %d", len(game.Moves())) | |
} | |
if len(game.Moves()) != 104 { | |
t.Fatalf("game moves are not correct, expected 104, got %d", len(game.Moves())) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
errors.go (1)
49-52
: Consider adding error wrapping support.While the error formatting is good, consider implementing the
Unwrap()
method to support error wrapping, which would allow for better error handling in calling code.+func (e *ParserError) Unwrap() error { + return fmt.Errorf(e.Message) +}pgn.go (1)
85-90
: Consistent error handling pattern.Good use of the new ParserError type with consistent error message formatting. However, consider extracting the error creation into helper functions to reduce code duplication.
+func newParserError(message string, token Token, pos int) *ParserError { + return &ParserError{ + Message: message, + TokenType: token.Type, + TokenValue: token.Value, + Position: pos, + } +}Also applies to: 96-101, 108-114, 121-126
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
errors.go
(2 hunks)game_test.go
(2 hunks)notation.go
(1 hunks)pgn.go
(14 hunks)
✅ Files skipped from review due to trivial changes (1)
- notation.go
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pgn.go
115-115: unnecessary trailing newline
(whitespace)
340-340: Magic number: 8, in detected
(mnd)
473-473: unnecessary leading newline
(whitespace)
531-531: unnecessary trailing newline
(whitespace)
🔇 Additional comments (4)
errors.go (1)
42-47
: LGTM! Well-structured error type.
The new ParserError
type provides comprehensive error context with token value, type, and position, which will greatly improve debugging and error reporting capabilities.
pgn.go (1)
14-14
: Good improvement in error tracking and command handling.
The addition of the errors
field in the Parser struct and the changes to parseComment
to return a command map improve error tracking and command handling capabilities.
Also applies to: 419-424
game_test.go (2)
925-932
: LGTM! Good test coverage for edge case.
The test case properly validates the behavior when attempting to remove a non-existent tag pair.
Line range hint 933-939
: LGTM! Good test coverage for empty map case.
The test case properly validates the behavior when attempting to remove a tag pair from an empty map.
if variationParent.parent != nil { | ||
p.game.pos = variationParent.parent.position.copy() | ||
if newPos := p.game.pos.Update(variationParent); newPos != nil { | ||
p.game.pos = newPos | ||
} | ||
} else { | ||
p.game.pos = StartingPosition() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential nil pointer dereference.
The position update logic needs to handle the case where variationParent.parent
is nil but variationParent
is not.
- if variationParent.parent != nil {
+ if variationParent != nil && variationParent.parent != nil {
p.game.pos = variationParent.parent.position.copy()
if newPos := p.game.pos.Update(variationParent); newPos != nil {
p.game.pos = newPos
}
} else {
p.game.pos = StartingPosition()
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if variationParent.parent != nil { | |
p.game.pos = variationParent.parent.position.copy() | |
if newPos := p.game.pos.Update(variationParent); newPos != nil { | |
p.game.pos = newPos | |
} | |
} else { | |
p.game.pos = StartingPosition() | |
if variationParent != nil && variationParent.parent != nil { | |
p.game.pos = variationParent.parent.position.copy() | |
if newPos := p.game.pos.Update(variationParent); newPos != nil { | |
p.game.pos = newPos | |
} | |
} else { | |
p.game.pos = StartingPosition() |
func (p *Parser) parseCommand() (map[string]string, error) { | ||
command := make(map[string]string) | ||
var key string | ||
|
||
// Consume the opening "[" | ||
p.advance() | ||
|
||
for p.currentToken().Type != CommandEnd && p.position < len(p.tokens) { | ||
switch p.currentToken().Type { | ||
case CommandName, CommandParam: | ||
parts = append(parts, p.currentToken().Value) | ||
|
||
case CommandName: | ||
// The first token in a command is treated as the key | ||
key = p.currentToken().Value | ||
case CommandParam: | ||
// The second token is treated as the value for the current key | ||
if key != "" { | ||
command[key] = p.currentToken().Value | ||
key = "" // Reset key after assigning value | ||
} | ||
default: | ||
return nil, &ParserError{ | ||
Message: "unexpected token in command", | ||
Position: p.position, | ||
TokenType: p.currentToken().Type, | ||
TokenValue: p.currentToken().Value, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve command parsing robustness.
The command parsing logic could be more robust:
- Consider validating the command name format
- Add support for multiple parameters
- Handle empty command values
func (p *Parser) parseCommand() (map[string]string, error) {
command := make(map[string]string)
var key string
+ var params []string
p.advance()
for p.currentToken().Type != CommandEnd && p.position < len(p.tokens) {
switch p.currentToken().Type {
case CommandName:
+ if !isValidCommandName(p.currentToken().Value) {
+ return nil, newParserError("invalid command name", p.currentToken(), p.position)
+ }
key = p.currentToken().Value
case CommandParam:
- if key != "" {
- command[key] = p.currentToken().Value
- key = ""
- }
+ params = append(params, p.currentToken().Value)
default:
return nil, &ParserError{
Message: "unexpected token in command",
Position: p.position,
TokenType: p.currentToken().Type,
TokenValue: p.currentToken().Value,
}
}
p.advance()
}
+
+ if key != "" {
+ if len(params) == 0 {
+ command[key] = ""
+ } else if len(params) == 1 {
+ command[key] = params[0]
+ } else {
+ command[key] = strings.Join(params, " ")
+ }
+ }
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 golangci-lint (1.62.2)
473-473: unnecessary leading newline
(whitespace)
Summary by CodeRabbit