Fix/parser #5

Merged
merged 8 commits into from
Dec 17, 2024
Conversation

CorentinGS
Owner

@CorentinGS CorentinGS commented Dec 17, 2024

Summary by CodeRabbit

  • New Features
    • Introduced a new error type for improved error reporting during parsing.
    • Added new PGN files for a complete chess game and variations, enhancing game record availability.
  • Bug Fixes
    • Enhanced error handling in the parser for better clarity on parsing failures.
    • Improved handling of chess game functionalities, ensuring compliance with game rules.
  • Tests
    • Expanded test coverage for PGN parsing, including tests for variations and complete games.
    • Added tests for various chess game scenarios, such as checkmate and stalemate.

Introduce a new test, TestBigPgn, to validate parsing and tokenizing of large PGN files. Ensures proper handling of multiple games in large datasets, improving robustness and error reporting.
Replaced generic error messages with the new `ParserError` struct, providing detailed context such as token type, value, and position. Added a `String()` method for `TokenType` to support better error descriptions. This enhances debugging and improves error clarity during PGN parsing.
Refactored how valid moves are retrieved to improve clarity, replacing a method call with direct access to the position's valid moves. Enhanced error messages for better debugging by including actual values in mismatch scenarios.
Enhanced PGN fixtures by including game annotations, opening names, and move evaluations from lichess.org.
This commit introduces a new `big_big.pgn` file containing numerous chess games from Eric Rosen, including rated and casual matches against various opponents. The file will be useful for testing PGN parsing, game rendering, and related functionalities.
This commit introduces functionality for parsing variations in PGN files, including updating parser logic for move generation from root and variation contexts. Additionally, lexer improvements address undefined tokens and support NAG symbols (!, ?). Relevant tests and example PGN files have also been added for validation.
Enhanced the parsing of comments and commands in PGN files by introducing key-value pair storage for commands. Adjusted related test cases and parsing functions to accommodate this change, ensuring better handling of PGN comments and more robust error reporting.
Contributor

coderabbitai bot commented Dec 17, 2024

Walkthrough

The pull request introduces comprehensive enhancements to the chess package's parsing and error handling capabilities. Key modifications include a new ParserError type for detailed error reporting, expanded token type handling in the lexer, and improved PGN parsing logic. The changes focus on making the parser more robust by providing more context during parsing, supporting variations, and handling complex game notations. New test cases have been added to validate the enhanced functionality, including tests for game variations, large PGN files, and complete game parsing.

Changes

File | Change Summary
errors.go | Added ParserError struct with detailed error reporting capabilities and updated import statements.
lexer.go | Introduced Undefined token type, enhanced token handling and error management in NextToken.
move.go | Changed command field from string to map[string]string in Move struct.
pgn.go | Improved error handling, modified parsing methods to return structured ParserError objects, and updated command parsing.
pgn_test.go | Added new test functions for variations, large PGN files, and complete game parsing, including error handling and assertions.
fixtures/pgns/complete_game.pgn | New PGN file with a complete chess game including metadata and move details.
fixtures/pgns/variations.pgn | New PGN file demonstrating move variations with branching notation.
game_test.go | Enhanced tests for game rules and scenarios, updated tag pair handling tests.
notation.go | Minor cosmetic changes in comments related to the generateOptions method.

Sequence Diagram

sequenceDiagram
    participant Lexer
    participant Parser
    participant Game
    
    Lexer->>Parser: Tokenize PGN
    Parser->>Parser: Parse Tag Pairs
    Parser->>Parser: Parse Moves
    Parser->>Game: Construct Game State
    Parser-->>Lexer: Handle Errors
    Game-->>Parser: Game Representation

Poem

🐰 A Lexer's Tale of Chess Delight 🏁

With tokens dancing, errors bright,
Our parser leaps from move to might,
Variations bloom, commands take flight,
In PGN's realm of black and white,
A rabbit's code now shines so tight!

🐇 Hop, hop, hooray! 🎉


codecov bot commented Dec 17, 2024

Codecov Report

Attention: Patch coverage is 40.67797% with 140 lines in your changes missing coverage. Please review.

Project coverage is 66.23%. Comparing base (4c75fb7) to head (b9dbcbd).
Report is 1 commit behind head on main.

Files with missing lines | Patch % | Lines
pgn.go | 44.38% | 98 Missing and 1 partial ⚠️
lexer.go | 28.30% | 38 Missing ⚠️
errors.go | 0.00% | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main       #5      +/-   ##
==========================================
- Coverage   66.99%   66.23%   -0.76%     
==========================================
  Files          25       25              
  Lines        3675     3865     +190     
==========================================
+ Hits         2462     2560      +98     
- Misses       1100     1196      +96     
+ Partials      113      109       -4     


Adjusted code indentation and alignment in several files to improve readability. Reordered struct fields in `ParserError` and `Parser` to match logical grouping. These changes enhance maintainability and adhere to coding standards.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (5)
lexer.go (1)

180-181: Address the TODO: Improve NAG handling for different formats

The comment indicates that the current NAG handling may not fully conform to the PGN specification, and there is a need to handle different formats more comprehensively.

Would you like assistance in implementing improved NAG handling according to the PGN specification? I can help generate the necessary code or open a GitHub issue to track this task.
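
One possible direction for this TODO, sketched under assumptions: PGN files carry move annotations either as numeric NAGs (`$1`, `$14`, ...) or as the traditional suffix symbols (`!`, `?`, `!!`, `??`, `!?`, `?!`), which the PGN standard maps to `$1` through `$6`. The helper name `normalizeNAG` below is hypothetical and is not taken from lexer.go; it only illustrates folding both spellings into one numeric code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// suffixToNAG maps the six traditional suffix annotations to the numeric
// NAG codes assigned to them by the PGN standard.
var suffixToNAG = map[string]int{
	"!":  1,
	"?":  2,
	"!!": 3,
	"??": 4,
	"!?": 5,
	"?!": 6,
}

// normalizeNAG (hypothetical helper) accepts either "$<n>" or a suffix
// annotation and returns the numeric NAG code.
func normalizeNAG(s string) (int, error) {
	if strings.HasPrefix(s, "$") {
		digits := s[1:]
		// Reject empty input and anything that is not plain decimal digits.
		if digits == "" || strings.Trim(digits, "0123456789") != "" {
			return 0, fmt.Errorf("invalid numeric NAG %q", s)
		}
		n, err := strconv.Atoi(digits)
		if err != nil || n > 255 {
			return 0, fmt.Errorf("numeric NAG out of range: %q", s)
		}
		return n, nil
	}
	if n, ok := suffixToNAG[s]; ok {
		return n, nil
	}
	return 0, fmt.Errorf("unknown NAG %q", s)
}

func main() {
	for _, in := range []string{"!", "?!", "$14", "$x"} {
		n, err := normalizeNAG(in)
		fmt.Println(in, n, err)
	}
}
```

Normalizing at the lexer boundary would let the parser store a single representation regardless of how the source file spelled the annotation.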

pgn.go (1)

340-340: Avoid magic number 8 in rank calculation

Using the magic number 8 directly in calculations reduces code readability and maintainability. Consider defining a constant or using existing methods to compute the rank.

Apply this diff:

- if moveData.originRank != "" && strconv.Itoa(int((m.S1()/8)+1)) != moveData.originRank {
+ const boardSize = 8 // Define a constant for the board size
+ if moveData.originRank != "" && strconv.Itoa(int((m.S1()/boardSize)+1)) != moveData.originRank {

Alternatively, if there is a method to calculate the rank from a square, utilizing it would enhance clarity.
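
A minimal sketch of such a helper, assuming squares are numbered 0 (a1) through 63 (h8), which is what the `(m.S1()/8)+1` expression implies. The `Square` type and the `rankNumber` and `fileLetter` names are stand-ins for illustration, not the package's actual API.

```go
package main

import "fmt"

// Square is a stand-in for the package's square type: 0 = a1 ... 63 = h8.
type Square int8

// squaresPerRank names the board width so the arithmetic below explains itself.
const squaresPerRank = 8

// rankNumber returns the 1-based rank (1..8) of a square.
func rankNumber(sq Square) int {
	return int(sq)/squaresPerRank + 1
}

// fileLetter returns the file ('a'..'h') of a square.
func fileLetter(sq Square) byte {
	return 'a' + byte(int(sq)%squaresPerRank)
}

func main() {
	e4 := Square(28)
	fmt.Printf("%c%d\n", fileLetter(e4), rankNumber(e4)) // prints "e4"
}
```

With a helper like this, the comparison in the parser could read `strconv.Itoa(rankNumber(sq)) != moveData.originRank`, which also satisfies the mnd linter.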

🧰 Tools
🪛 golangci-lint (1.62.2)

340-340: Magic number: 8, in <condition> detected

(mnd)

pgn_test.go (3)

115-144: Enhance variation testing coverage

While the test verifies basic parsing and move count, it could be strengthened by:

  • Validating the structure of variations
  • Verifying the specific moves within variations
  • Asserting the relationship between main line and variation moves

Would you like me to provide an example of how to enhance this test with more detailed assertions?

🧰 Tools
🪛 golangci-lint (1.62.2)

144-144: unnecessary trailing newline

(whitespace)


217-250: Consider validating game content

The test verifies parsing success but doesn't validate the content of parsed games. Consider adding basic validations for each game:

  • Verify required tag pairs (Event, Site, Date, etc.)
  • Check for valid move notation
  • Ensure game termination markers are correct
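
The first and third checks could be sketched as below. This assumes tag pairs are available as a plain `map[string]string`; how the library actually exposes them is not shown in this thread, so `missingTags` and `validResult` are hypothetical helpers. The seven required tags and the four termination markers come from the PGN standard.

```go
package main

import "fmt"

// sevenTagRoster lists the tags the PGN standard requires, in export order.
var sevenTagRoster = []string{"Event", "Site", "Date", "Round", "White", "Black", "Result"}

// missingTags returns the roster tags absent from tags, in roster order.
func missingTags(tags map[string]string) []string {
	var missing []string
	for _, key := range sevenTagRoster {
		if _, ok := tags[key]; !ok {
			missing = append(missing, key)
		}
	}
	return missing
}

// validResult reports whether s is one of the four PGN termination markers.
func validResult(s string) bool {
	switch s {
	case "1-0", "0-1", "1/2-1/2", "*":
		return true
	}
	return false
}

func main() {
	tags := map[string]string{"Event": "Casual game", "White": "A", "Black": "B", "Result": "1-0"}
	fmt.Println(missingTags(tags), validResult(tags["Result"]))
}
```

Inside the per-game subtest, a non-empty `missingTags` result or a false `validResult` would be reported with `t.Errorf` so that one bad game does not hide the others.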

337-363: Consider using constants for test data

The test contains several magic numbers and hardcoded values. Consider:

  • Define constants for expected move count
  • Create test data structs for expected moves, comments, and annotations
  • Group related assertions into helper functions

Example:

const (
    expectedMoveCount = 104
    firstMove        = "d2d4"
)

type expectedMove struct {
    notation string
    comment  string
    eval     string
    nag      string
}

var expectedMoves = []expectedMove{
    {notation: "d2d4", eval: "0.17"},
    // ... more moves
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4c75fb7 and 69c2dd9.

📒 Files selected for processing (7)
  • errors.go (2 hunks)
  • fixtures/pgns/complete_game.pgn (1 hunks)
  • fixtures/pgns/variations.pgn (1 hunks)
  • lexer.go (4 hunks)
  • move.go (1 hunks)
  • pgn.go (15 hunks)
  • pgn_test.go (3 hunks)
✅ Files skipped from review due to trivial changes (2)
  • fixtures/pgns/complete_game.pgn
  • fixtures/pgns/variations.pgn
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pgn.go

115-115: unnecessary trailing newline

(whitespace)


340-340: Magic number: 8, in <condition> detected

(mnd)


473-473: unnecessary leading newline

(whitespace)


531-531: unnecessary trailing newline

(whitespace)

pgn_test.go

144-144: unnecessary trailing newline

(whitespace)

🔇 Additional comments (4)
lexer.go (1)

552-554: Handling unrecognized characters with Undefined TokenType

The addition of the Undefined TokenType ensures that unrecognized characters are appropriately tokenized, enhancing error handling during parsing.

pgn.go (1)

417-463: Improved parsing of comments and commands

The updated parseComment method now returns both a comment string and a command map, enhancing the handling of embedded commands within comments.
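
To illustrate the comment/command split, here is a string-based sketch. The real parseComment works on lexer tokens, so this is not its implementation; `splitComment` and the regular expression are assumptions, and the `[%name value]` form follows the convention used in lichess.org exports such as `[%eval 0.17]`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// commandRe matches one embedded command of the form "[%name value]".
var commandRe = regexp.MustCompile(`\[%(\w+)\s+([^\]]*)\]`)

// splitComment (hypothetical) separates the free text of a PGN comment body
// from its embedded commands, returning the text and a name-to-value map.
func splitComment(body string) (string, map[string]string) {
	commands := make(map[string]string)
	for _, m := range commandRe.FindAllStringSubmatch(body, -1) {
		commands[m[1]] = strings.TrimSpace(m[2])
	}
	// Remove the commands, then collapse the whitespace they leave behind.
	text := commandRe.ReplaceAllString(body, " ")
	return strings.Join(strings.Fields(text), " "), commands
}

func main() {
	text, commands := splitComment(" [%eval 0.17] [%clk 0:03:00] Solid move ")
	fmt.Printf("%q %v\n", text, commands)
}
```

Storing commands in a map keyed by name is what makes lookups such as an evaluation or clock value direct, at the cost of keeping only the last value if a name repeats within one comment.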

errors.go (1)

42-52: Introduction of ParserError enhances error reporting

The addition of the ParserError type with detailed fields like Message, TokenType, TokenValue, and Position improves the robustness and clarity of parsing error messages.
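
A sketch of what such an error type can look like. The four field names are the ones listed above; the reduced `TokenType` set, its ordering, and the exact `Error()` format are assumptions made for illustration.

```go
package main

import "fmt"

// TokenType is a reduced stand-in for the lexer's token type.
type TokenType int

const (
	EOF TokenType = iota
	Undefined
)

// String gives the token type a readable name for error messages.
func (t TokenType) String() string {
	switch t {
	case EOF:
		return "EOF"
	case Undefined:
		return "Undefined"
	default:
		return "Unknown"
	}
}

// ParserError carries the context needed to locate a parse failure.
type ParserError struct {
	Message    string
	TokenValue string
	TokenType  TokenType
	Position   int
}

// Error implements the error interface (format is an assumption).
func (e *ParserError) Error() string {
	return fmt.Sprintf("parser error at position %d: %s (token %s %q)",
		e.Position, e.Message, e.TokenType, e.TokenValue)
}

func main() {
	var err error = &ParserError{
		Message:    "unexpected token",
		TokenType:  Undefined,
		TokenValue: "@",
		Position:   7,
	}
	fmt.Println(err)
}
```

Returning a pointer to a struct rather than a formatted string lets callers inspect `Position` or `TokenType` programmatically instead of parsing the message.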

move.go (1)

28-29: Update Move struct to use a map for commands

Changing the command field to a map[string]string allows for flexible command representation, aligning with the updated command parsing logic. This change enhances the ability to store multiple command parameters associated with a move.

Comment on lines +44 to +83
func (t TokenType) String() string {
    types := []string{
        "EOF",
        "TagStart",
        "TagEnd",
        "TagKey",
        "TagValue",
        "MoveNumber",
        "DOT",
        "ELLIPSIS",
        "PIECE",
        "SQUARE",
        "CommentStart",
        "CommentEnd",
        "COMMENT",
        "RESULT",
        "CAPTURE",
        "FILE",
        "RANK",
        "KingsideCastle",
        "QueensideCastle",
        "PROMOTION",
        "PromotionPiece",
        "CHECK",
        "CHECKMATE",
        "NAG",
        "VariationStart",
        "VariationEnd",
        "CommandStart",
        "CommandName",
        "CommandParam",
        "CommandEnd",
    }

    if t < 0 || int(t) >= len(types) {
        return "Unknown"
    }

    return types[t]
}

⚠️ Potential issue

Fix mismatch between TokenType constants and their string representations

In the String method, the types slice is missing the "Undefined" token type, causing misalignment between the TokenType constants and their string representations. This will lead to incorrect or off-by-one string outputs for token types beyond "EOF".

Apply this diff to include the "Undefined" token in the types slice:

 func (t TokenType) String() string {
     types := []string{
         "EOF",
+        "Undefined",
         "TagStart",
         "TagEnd",
         "TagKey",
         "TagValue",
         "MoveNumber",
         "DOT",
         "ELLIPSIS",
         "PIECE",
         "SQUARE",
         "CommentStart",
         "CommentEnd",
         "COMMENT",
         "RESULT",
         "CAPTURE",
         "FILE",
         "RANK",
         "KingsideCastle",
         "QueensideCastle",
         "PROMOTION",
         "PromotionPiece",
         "CHECK",
         "CHECKMATE",
         "NAG",
         "VariationStart",
         "VariationEnd",
         "CommandStart",
         "CommandName",
         "CommandParam",
         "CommandEnd",
     }
     // Rest of the method remains the same
 }

Committable suggestion skipped: line range outside the PR's diff.
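
One way to make this class of off-by-one mistake harder to reintroduce is to key the name table by the constants themselves and size it with a trailing sentinel. The sketch below uses a reduced constant set; `numTokenTypes` and `tokenNames` are hypothetical names, and the position of `Undefined` right after `EOF` follows the suggested diff rather than a confirmed reading of lexer.go.

```go
package main

import "fmt"

// TokenType is a reduced stand-in for the lexer's token type.
type TokenType int

const (
	EOF TokenType = iota
	Undefined
	TagStart
	TagEnd
	numTokenTypes // sentinel: must stay last
)

// tokenNames is keyed by constant, so inserting a new constant cannot shift
// the names of existing ones; a forgotten entry is simply an empty string.
var tokenNames = [numTokenTypes]string{
	EOF:       "EOF",
	Undefined: "Undefined",
	TagStart:  "TagStart",
	TagEnd:    "TagEnd",
}

// String returns the token type's name, or "Unknown" if none is registered.
func (t TokenType) String() string {
	if t < 0 || t >= numTokenTypes || tokenNames[t] == "" {
		return "Unknown"
	}
	return tokenNames[t]
}

func main() {
	fmt.Println(EOF, Undefined, TagStart, TokenType(99))
}
```

A small test that loops from 0 to `numTokenTypes` and fails on any empty name then catches a constant added without a matching entry.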

pgn_test.go Outdated
Comment on lines +252 to +286
func TestBigBigPgn(t *testing.T) {
    t.Skip("This test is too slow")
    pgn := mustParsePGN("fixtures/pgns/big_big.pgn")
    reader := strings.NewReader(pgn)

    scanner := NewScanner(reader)
    count := 0

    for scanner.HasNext() {
        count++
        t.Run(fmt.Sprintf("bigbig pgn : %d", count), func(t *testing.T) {
            scannedGame, err := scanner.ScanGame()
            if err != nil {
                t.Fatalf("fail to scan game from valid pgn: %s", err.Error())
            }

            tokens, err := TokenizeGame(scannedGame)
            if err != nil {
                t.Fatalf("fail to tokenize game from valid pgn: %s", err.Error())
            }

            raw := scannedGame.Raw

            parser := NewParser(tokens)
            game, err := parser.Parse()
            if err != nil {
                t.Fatalf("fail to read games from valid pgn: %s | %s", err.Error(), raw[:min(200, len(raw))])
            }

            if game == nil {
                t.Fatalf("game is nil")
            }
        })
    }
}

🛠️ Refactor suggestion

Improve test management strategy

The test duplicates TestBigPgn logic but is skipped as "too slow". Consider:

  • Document the size/complexity differences between big.pgn and big_big.pgn
  • Add criteria for when this test should be run (e.g., pre-release validation)
  • Consider making it a benchmark instead of a test
  • Use build tags to control test execution

Example build tag usage:

//go:build extended_test

pgn_test.go Outdated
parser := NewParser(tokens)
game, err := parser.Parse()
if err != nil {
t.Fatalf("fail to read games from valid pgn: %s | %s", err.Error(), raw[:min(200, len(raw))])

⚠️ Potential issue

Fix undefined min function

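The min function is used without a local definition. Since Go 1.21, min is a built-in function, so this compiles on current toolchains; on Go versions before 1.21 it is a compilation error.

If the module must build with a toolchain older than 1.21, add this helper function: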

func min(a, b int) int {
    if a < b {
        return a
    }
    return b
}

pgn_test.go Outdated
Comment on lines +338 to +340
if len(game.Moves()) != 104 {
    t.Fatalf("game moves are not correct, expected 52, got %d", len(game.Moves()))
}

⚠️ Potential issue

Fix inconsistent error message

The error message mentions "expected 52" but the actual check is for 104 moves.

-		t.Fatalf("game moves are not correct, expected 52, got %d", len(game.Moves()))
+		t.Fatalf("game moves are not correct, expected 104, got %d", len(game.Moves()))

This was referenced Dec 17, 2024
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
errors.go (1)

49-52: Consider adding error wrapping support.

While the error formatting is good, consider implementing the Unwrap() method to support error wrapping, which would allow for better error handling in calling code.

+// Assumes a hypothetical Err field on ParserError holding the underlying cause.
+func (e *ParserError) Unwrap() error {
+    return e.Err
+}
pgn.go (1)

85-90: Consistent error handling pattern.

Good use of the new ParserError type with consistent error message formatting. However, consider extracting the error creation into helper functions to reduce code duplication.

+func newParserError(message string, token Token, pos int) *ParserError {
+    return &ParserError{
+        Message:    message,
+        TokenType:  token.Type,
+        TokenValue: token.Value,
+        Position:   pos,
+    }
+}

Also applies to: 96-101, 108-114, 121-126

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 69c2dd9 and b9dbcbd.

📒 Files selected for processing (4)
  • errors.go (2 hunks)
  • game_test.go (2 hunks)
  • notation.go (1 hunks)
  • pgn.go (14 hunks)
✅ Files skipped from review due to trivial changes (1)
  • notation.go
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pgn.go

115-115: unnecessary trailing newline

(whitespace)


340-340: Magic number: 8, in <condition> detected

(mnd)


473-473: unnecessary leading newline

(whitespace)


531-531: unnecessary trailing newline

(whitespace)

🔇 Additional comments (4)
errors.go (1)

42-47: LGTM! Well-structured error type.

The new ParserError type provides comprehensive error context with token value, type, and position, which will greatly improve debugging and error reporting capabilities.

pgn.go (1)

14-14: Good improvement in error tracking and command handling.

The addition of the errors field in the Parser struct and the changes to parseComment to return a command map improve error tracking and command handling capabilities.

Also applies to: 419-424

game_test.go (2)

925-932: LGTM! Good test coverage for edge case.

The test case properly validates the behavior when attempting to remove a non-existent tag pair.


Line range hint 933-939: LGTM! Good test coverage for empty map case.

The test case properly validates the behavior when attempting to remove a tag pair from an empty map.

Comment on lines +522 to +528
if variationParent.parent != nil {
    p.game.pos = variationParent.parent.position.copy()
    if newPos := p.game.pos.Update(variationParent); newPos != nil {
        p.game.pos = newPos
    }
} else {
    p.game.pos = StartingPosition()

⚠️ Potential issue

Potential nil pointer dereference.

The condition dereferences variationParent without first checking that it is non-nil. The else branch already covers a nil variationParent.parent, so the unhandled case is a nil variationParent itself.

-        if variationParent.parent != nil {
+        if variationParent != nil && variationParent.parent != nil {
             p.game.pos = variationParent.parent.position.copy()
             if newPos := p.game.pos.Update(variationParent); newPos != nil {
                 p.game.pos = newPos
             }
         } else {
             p.game.pos = StartingPosition()
         }

Comment on lines +465 to +490
func (p *Parser) parseCommand() (map[string]string, error) {
    command := make(map[string]string)
    var key string

    // Consume the opening "["
    p.advance()

    for p.currentToken().Type != CommandEnd && p.position < len(p.tokens) {
        switch p.currentToken().Type {
        case CommandName:
            // The first token in a command is treated as the key
            key = p.currentToken().Value
        case CommandParam:
            // The second token is treated as the value for the current key
            if key != "" {
                command[key] = p.currentToken().Value
                key = "" // Reset key after assigning value
            }
        default:
            return nil, &ParserError{
                Message:    "unexpected token in command",
                Position:   p.position,
                TokenType:  p.currentToken().Type,
                TokenValue: p.currentToken().Value,
            }

🛠️ Refactor suggestion

Improve command parsing robustness.

The command parsing logic could be more robust:

  1. Consider validating the command name format
  2. Add support for multiple parameters
  3. Handle empty command values
 func (p *Parser) parseCommand() (map[string]string, error) {
     command := make(map[string]string)
     var key string
+    var params []string
 
     p.advance()
 
     for p.currentToken().Type != CommandEnd && p.position < len(p.tokens) {
         switch p.currentToken().Type {
         case CommandName:
+            if !isValidCommandName(p.currentToken().Value) {
+                return nil, newParserError("invalid command name", p.currentToken(), p.position)
+            }
             key = p.currentToken().Value
         case CommandParam:
-            if key != "" {
-                command[key] = p.currentToken().Value
-                key = ""
-            }
+            params = append(params, p.currentToken().Value)
         default:
             return nil, &ParserError{
                 Message:    "unexpected token in command",
                 Position:   p.position,
                 TokenType:  p.currentToken().Type,
                 TokenValue: p.currentToken().Value,
             }
         }
         p.advance()
     }
+
+    if key != "" {
+        if len(params) == 0 {
+            command[key] = ""
+        } else if len(params) == 1 {
+            command[key] = params[0]
+        } else {
+            command[key] = strings.Join(params, " ")
+        }
+    }

Committable suggestion skipped: line range outside the PR's diff.
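
A self-contained sketch of the multi-parameter idea, using simplified stand-ins for the lexer's `Token` and `TokenType` and plain `fmt.Errorf` errors instead of `ParserError`. It is not the parser's code. Since `strings.Join` returns an empty string for no parameters and the single value for one parameter, the three-way branch in the suggestion above can collapse into one call.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenType and Token are simplified stand-ins for the lexer's types.
type TokenType int

const (
	CommandStart TokenType = iota
	CommandName
	CommandParam
	CommandEnd
)

type Token struct {
	Type  TokenType
	Value string
}

// parseCommand reads one command from tokens, which must begin with
// CommandStart, and returns a map from the command name to its parameters
// joined by single spaces.
func parseCommand(tokens []Token) (map[string]string, error) {
	if len(tokens) == 0 || tokens[0].Type != CommandStart {
		return nil, fmt.Errorf("expected command start")
	}
	var key string
	var params []string
	// Ranging over the slice keeps every access in bounds.
	for i, tok := range tokens[1:] {
		switch tok.Type {
		case CommandName:
			key = tok.Value
		case CommandParam:
			params = append(params, tok.Value)
		case CommandEnd:
			if key == "" {
				return nil, fmt.Errorf("command without a name")
			}
			return map[string]string{key: strings.Join(params, " ")}, nil
		default:
			return nil, fmt.Errorf("unexpected token %q at index %d", tok.Value, i+1)
		}
	}
	return nil, fmt.Errorf("unterminated command")
}

func main() {
	cmd, err := parseCommand([]Token{
		{CommandStart, "["},
		{CommandName, "cal"},
		{CommandParam, "Ge2e4"},
		{CommandParam, "Rd7d5"},
		{CommandEnd, "]"},
	})
	fmt.Println(cmd, err)
}
```

Treating a missing `CommandEnd` as an error, rather than silently returning what was collected, keeps truncated input from producing a half-parsed command.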

🧰 Tools
🪛 golangci-lint (1.62.2)

473-473: unnecessary leading newline

(whitespace)
