Command line length limit in GenerateProtoTask causes overwriting of proto descriptors #774

Open
SlavikN14 opened this issue Feb 3, 2025 · 0 comments

Description

When compiling a large number of .proto files, the generated protoc command line exceeds the plugin's default command length limit (DEFAULT_CMD_LENGTH_LIMIT):

  // Most OSs impose some kind of command length limit.
  // Rather than account for all cases, pick a reasonable default of 64K.
  static final int DEFAULT_CMD_LENGTH_LIMIT = 65536

To circumvent this, the generateCmds function splits the command into multiple smaller commands. However, each of these commands writes to the same output descriptor file via --descriptor_set_out, causing the last command to overwrite the previous ones.

Affected Code

GenerateProtoTask.groovy (Lines 187-211)

static List<List<String>> generateCmds(List<String> baseCmd, List<File> protoFiles, int cmdLengthLimit) {
  List<List<String>> cmds = []
  if (!protoFiles.isEmpty()) {
    int baseCmdLength = baseCmd.sum { it.length() + CMD_ARGUMENT_EXTRA_LENGTH } as int
    List<String> currentArgs = []
    int currentArgsLength = 0
    for (File proto: protoFiles) {
      String protoFileName = proto
      int currentFileLength = protoFileName.length() + CMD_ARGUMENT_EXTRA_LENGTH
      if (baseCmdLength + currentArgsLength + currentFileLength > cmdLengthLimit) {
        cmds.add(baseCmd + currentArgs) // Adds a command before overflow
        currentArgs.clear()
        currentArgsLength = 0
      }
      currentArgs.add(protoFileName)
      currentArgsLength += currentFileLength
    }
    cmds.add(baseCmd + currentArgs)
  }
  return cmds
}

Expected Behavior

Each split command should write to its own descriptor file, and the resulting files should then be merged into the configured output so that no descriptors are lost.

Actual Behavior

Each generated command uses the same --descriptor_set_out parameter, leading to overwriting instead of appending.
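
To make the overwrite concrete, here is a minimal sketch that ports the splitting loop above to plain Java and prints the resulting commands. The value of CMD_ARGUMENT_EXTRA_LENGTH, the file names, the descriptor path, and the tiny limit of 80 are assumptions chosen for illustration, and the sketch assumes the flag is passed as a single --descriptor_set_out=<path> argument.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
  // Assumed value for illustration; the plugin defines its own constant.
  static final int CMD_ARGUMENT_EXTRA_LENGTH = 3;

  // Same batching logic as generateCmds, using file names as plain strings.
  public static List<List<String>> generateCmds(
      List<String> baseCmd, List<String> protoFiles, int cmdLengthLimit) {
    List<List<String>> cmds = new ArrayList<>();
    if (protoFiles.isEmpty()) {
      return cmds;
    }
    int baseCmdLength = 0;
    for (String arg : baseCmd) {
      baseCmdLength += arg.length() + CMD_ARGUMENT_EXTRA_LENGTH;
    }
    List<String> currentArgs = new ArrayList<>();
    int currentArgsLength = 0;
    for (String proto : protoFiles) {
      int currentFileLength = proto.length() + CMD_ARGUMENT_EXTRA_LENGTH;
      if (baseCmdLength + currentArgsLength + currentFileLength > cmdLengthLimit) {
        cmds.add(concat(baseCmd, currentArgs)); // flush the batch before it overflows
        currentArgs.clear();
        currentArgsLength = 0;
      }
      currentArgs.add(proto);
      currentArgsLength += currentFileLength;
    }
    cmds.add(concat(baseCmd, currentArgs));
    return cmds;
  }

  // Returns a new list so clearing currentArgs does not affect stored commands.
  static List<String> concat(List<String> a, List<String> b) {
    List<String> out = new ArrayList<>(a);
    out.addAll(b);
    return out;
  }

  public static void main(String[] args) {
    // Hypothetical base command and inputs.
    List<String> baseCmd =
        List.of("protoc", "--descriptor_set_out=build/descriptor_set.desc");
    List<String> protos = List.of("a.proto", "b.proto", "c.proto", "d.proto");
    for (List<String> cmd : generateCmds(baseCmd, protos, 80)) {
      System.out.println(String.join(" ", cmd));
    }
  }
}
```

Every printed command carries the identical --descriptor_set_out argument because it is part of baseCmd, which is why only the last batch's descriptors remain on disk.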

Steps to Reproduce

  1. Compile enough .proto files that the generated command exceeds the command length limit.
  2. Observe that multiple commands are executed.
  3. Check the final descriptor file – it contains only the last batch of .proto files.

Proposed Fix

  • Modify the --descriptor_set_out path for each split command.
  • Merge the descriptor files after all commands are executed.
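
One possible shape for the two steps above, as a rough sketch in plain Java rather than a patch against the plugin. The class and method names and the .partN suffix are hypothetical, and it again assumes the flag appears as a single --descriptor_set_out=<path> argument. The merge relies on a property of the protobuf wire format: concatenating two serialized messages is equivalent to merging them, and FileDescriptorSet holds only a repeated field of file descriptors, so appending the per-batch files should yield one combined set.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class DescriptorBatches {
  static final String FLAG = "--descriptor_set_out=";

  // Returns a copy of cmd whose descriptor output path gets a per-batch suffix.
  // The ".partN" naming is a hypothetical convention, not something the plugin uses.
  public static List<String> withBatchDescriptorOut(List<String> cmd, int batchIndex) {
    List<String> result = new ArrayList<>(cmd.size());
    for (String arg : cmd) {
      result.add(arg.startsWith(FLAG) ? arg + ".part" + batchIndex : arg);
    }
    return result;
  }

  // Appends the per-batch descriptor files, in order, into the final output file.
  public static void mergeDescriptorSets(List<Path> parts, Path merged) throws IOException {
    try (OutputStream out = Files.newOutputStream(merged)) {
      for (Path part : parts) {
        Files.copy(part, out);
      }
    }
  }

  public static void main(String[] args) {
    // Hypothetical command for the second batch (index 1).
    List<String> cmd = List.of("protoc", "--descriptor_set_out=build/out.desc", "c.proto");
    System.out.println(String.join(" ", withBatchDescriptorOut(cmd, 1)));
  }
}
```

One caveat with plain concatenation: if the batches can contain the same file more than once (for example when imports are included in each batch's output), the merged set would carry duplicate entries, so a real fix might need to parse the parts and de-duplicate by file name instead.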

Would appreciate feedback or any alternative suggestions! 🚀
