added http digest auth support #11

gilcu3 · 2024-01-04T20:33:44Z

Added support for digest auth. Only the essential parts of RFC 7616 are implemented. I tested this with Squid proxy only, using no auth, basic and digest.

Summary by CodeRabbit

New Features
- Integrated MD5 hashing for enhanced security.
- Implemented additional authentication mechanisms.
Enhancements
- Improved random number generation for internal processes.
Refactor
- Optimized various functions for better performance and maintainability.
Dependencies
- Now utilizing libevent for efficient event handling.

netlify · 2024-01-04T20:33:49Z

❌ Deploy Preview for resonant-crumble-b74150 failed.

Name	Link
🔨 Latest commit	`9938687`
🔍 Latest deploy log	https://app.netlify.com/sites/resonant-crumble-b74150/deploys/6597162a33ad4100085d42f0

coderabbitai · 2024-01-04T20:33:57Z

Walkthrough

The corkscrew project has been updated to include MD5 hashing capabilities and now links with libevent. The main source file, corkscrew.c, has seen the inclusion of MD5-related functions, a new random number generator, updated digest authentication, and additional utility functions. The Makefile has been modified to compile the new md5.c source file and to link with the libevent library.

Changes

File(s)	Change Summary
`Makefile.am`	Added `md5.c` to sources; linked with `libevent`
`corkscrew.c`	Included MD5 and `libevent` headers; added new functions and structs; modified existing functions

Poem

In the burrow of the code, changes were sown, 🐇💻
Hashes and events, in the corkscrew were grown.
A rabbit hopped through the lines with glee,
"A commit to celebrate, hip, hip, hooray for me!" 🎉✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit's AI:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit-tests for this file.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit tests for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository from git and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit tests.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid.
- @coderabbitai read the files in the src/scheduler package and generate README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
The JSON schema for the configuration file is available here.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

coderabbitai

Review Status

Actionable comments generated: 2

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 5413c9d and 9938687.

Files selected for processing (4)

Makefile.am (1 hunks)
corkscrew.c (7 hunks)
md5.c (1 hunks)
md5.h (1 hunks)

Additional comments: 15

Makefile.am (1)

4-5: The changes to include md5.c in the sources and link with libevent are appropriate for the new feature implementation.

md5.h (1)

1-91: The inclusion of the MD5 header with the appropriate license and API declarations is correct and necessary for the MD5 functionality.

md5.c (1)

1-377: The implementation of the MD5 algorithm in md5.c is consistent with RFC 1321 and includes necessary conditional compilation directives for different architectures.
corkscrew.c (12)
14-15: Added necessary headers for libevent and MD5 functionality.

38-38: The linefeed variable has been changed to a more standard representation of CRLF. Ensure this change is consistent with the rest of the codebase.

Verification successful

The script intended to verify the consistent use of the new linefeed representation across the codebase did not produce any output. This could mean that there are no occurrences of the "\r\n" sequence outside of the linefeed variable, or it could be an issue with the script itself.

Given the lack of output, I will attempt a different approach to verify the consistency of the linefeed representation by searching for any linefeed characters without filtering out the linefeed variable, to ensure we capture all instances.

Given the lack of output from the script, it appears that there are no instances of the "\r\n" sequence in the codebase, or the script is not functioning as intended. To ensure we have covered all bases, I will attempt one more script using a different approach to search for line endings. This time, I will explicitly search for both the "\r\n" sequence and the "\n" sequence to confirm the presence or absence of these patterns in the codebase.

The output from the script shows that the "\r\n" sequence is present in the corkscrew.c file, specifically in the definition of the linefeed variable and in the construction of HTTP request headers. This confirms that the change to the linefeed variable is consistent with its usage in forming HTTP request headers within the codebase.

Based on this information, the review comment about the linefeed variable change being consistent with the rest of the codebase is correct.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the new linefeed representation is used consistently across the codebase.
rg --type c "\r\n" | rg -v "linefeed"
Length of output: 229

Script:
#!/bin/bash
# Verify that the new linefeed representation is used consistently across the codebase.
# Using the multiline flag to allow the search for the "\r\n" pattern.
rg --type c --multiline "\r\n" | rg -v "linefeed"
Length of output: 49

Script:
#!/bin/bash
# Search for all instances of CRLF line endings in the codebase.
rg --type c --multiline "\r\n"
Length of output: 32

Script:
#!/bin/bash
# Search for all instances of CRLF and LF line endings in the codebase.
rg --type c --multiline -e "\r\n" -e "\n"
Length of output: 41017
* 166-175: The `dump_hash` function converts a hash to a hexadecimal string. Ensure that the buffer is large enough to hold the resulting string, including the null terminator.

Verification successful

The ast-grep command did not return any results, which means it couldn't find any calls to the dump_hash function within the codebase. This could imply that the function is not used, or that the pattern provided to ast-grep did not match the way dump_hash is called in the code.

To ensure that we have a comprehensive understanding of how dump_hash is used, we should attempt a different strategy to search for its usage. We can use rg (ripgrep) to search for the string "dump_hash" across the codebase, which will help us identify any calls to this function.

The rg command has found three calls to the dump_hash function within corkscrew.c. To verify that the buffer passed to dump_hash is always large enough, we need to examine the context in which these calls are made, specifically looking at the size of the buffer buf that is passed to the function.

We will now generate a script to extract the lines of code around each call to dump_hash to understand how the buffers are allocated and ensure they are of sufficient size.

The output from the rg command provides the context around the calls to dump_hash, but it does not show the actual allocation or declaration of the buffers a1buf, a2buf, and response. To ensure that these buffers are large enough to hold the resulting string from dump_hash, including the null terminator, we need to see the buffer sizes when they are declared or allocated.

Since the output provided does not include this information, we need to search for the declarations or initializations of a1buf, a2buf, and response to verify their sizes.

Let's generate a script to find the declarations or initializations of these buffers.

The output from the rg command shows the declarations of a1buf, a2buf, and response. Each buffer is declared with a size of MD5_HASHLEN * 2 + 1. Given that MD5_HASHLEN is 16 (as seen in the previous context), this means each buffer is sized to hold 33 characters. This size is appropriate for the hexadecimal representation of an MD5 hash, which is 32 characters long, plus a null terminator.

Since the buffers are correctly sized to hold the output from dump_hash, the review comment concerning the buffer size for dump_hash is verified to be correct.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the buffer passed to dump_hash is always large enough.
ast-grep --lang c --pattern $'dump_hash($_, $_)'
Length of output: 49

Script:
#!/bin/bash
# Search for usage of dump_hash in the codebase
rg 'dump_hash'
Length of output: 201

Script:
#!/bin/bash
# Extract the lines around the calls to dump_hash to check buffer allocation
rg -C 5 'dump_hash'
Length of output: 1630

Script:
#!/bin/bash
# Find the declarations or initializations of a1buf, a2buf, and response
rg 'char \s*a1buf|char \s*a2buf|char \s*response'
Length of output: 174
* 180-185: The `red_randui32` function uses `evutil_secure_rng_get_bytes` to generate random bytes. Verify that this function is a secure random number generator appropriate for cryptographic use.

191-263: The digest_authentication_encode function constructs the digest authentication header. Ensure that the memory allocation for res is sufficient and that there are no buffer overflows when using sprintf.

309-322: The min macro and header function have been added. Verify that the min macro is not causing any unexpected side effects when used with arguments that have side effects. Ensure that the header function correctly handles the memory allocation and does not cause buffer overflows.

Verification successful

The search results confirm that the min macro is defined and used within the header function in corkscrew.c. There are no other occurrences of the min macro being used in the codebase, which means there is no immediate evidence of it being used with arguments that have side effects. The usage within the header function appears to be safe as it is used with simple arithmetic operations.

Regarding the header function, it allocates a fixed amount of memory (2048 bytes) and uses strncat to concatenate a substring of buffer based on the result of the min macro. The use of min ensures that the concatenation does not exceed the allocated buffer size, which should prevent buffer overflows.

Based on the provided output, there are no apparent issues with the usage of the min macro or the memory handling in the header function. However, without a broader view of the codebase, it is not possible to guarantee that there are no side effects elsewhere in the code that could affect this function's behavior.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the min macro is used safely and that the header function handles memory correctly.
rg --type c "min\(" | rg -v "header"
Length of output: 161
* 345-352: Initialization of variables related to the new authentication method. Ensure that these variables are used securely throughout the function.

384-402: The URI construction for the CONNECT method has been modified. Ensure that the buffer uri is large enough to hold the entire constructed string and that there are no buffer overflows.

423-426: The use of select for socket I/O multiplexing. Ensure that the file descriptor sets are being used correctly and that there are no off-by-one errors with csock+1.

450-535: The handling of the 407 Proxy Authentication Required response. Ensure that the nonce length check is correct and that the reconnection logic is secure and does not leak resources.

Verification successful

To verify the nonce length check and the reconnection logic, we need to examine the relevant sections of the code in more detail. Let's start by looking at the code surrounding the nonce length check.

Next, we'll examine the reconnection logic after a 407 response.

To ensure that the reconnection logic is secure and does not leak resources, we need to verify if there is any resource allocation before the close(csock) call that would require cleanup. Let's look for any such allocations.

Based on the provided context, the nonce length check is correctly verifying that the nonce is exactly 32 characters long, which aligns with the expected length of an MD5 hash used in HTTP Digest Authentication.

The reconnection logic closes the existing socket and attempts to establish a new connection. If the new connection fails, the program exits, which would release all resources held by the process. There is no evidence of resource allocation that would require cleanup before closing the socket, and the use of exit(-1) ensures that the program does not continue running in an unstable state.

Therefore, the nonce length check is verified as correct, and the reconnection logic appears secure and does not indicate resource leaks based on the current context.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the nonce length check is correct and that the reconnection logic is secure.
rg --type c "407"
Length of output: 130

Script:
#!/bin/bash
# Extract the code surrounding the nonce length check
rg --type c --context 5 "strlen$pa->nonce$ != 32" corkscrew.c
Length of output: 419

Script:
#!/bin/bash
# Extract the code surrounding the reconnection logic after a 407 response
rg --type c --context 10 "close$csock$;" corkscrew.c
Length of output: 1417

Script:
#!/bin/bash
# Look for resource allocations before the close(csock) call
rg --type c --context 20 "csock =" corkscrew.c
Length of output: 3367
* 538-567: The handling of the Basic authentication scheme. Ensure that the base64 encoding is done securely and that the socket reconnection logic is correct.

Verification successful

To verify the security of the base64 encoding and the correctness of the socket reconnection logic, I will need to examine the implementation of the base64_encode function and the logic following the close(csock) call.

To further verify the security of the base64 encoding and the correctness of the socket reconnection logic, I will need to examine the implementation details of the base64_encode and sock_connect functions.

Since the ast-grep command did not yield results for the base64_encode and sock_connect functions, I will use rg to search for the function definitions in a broader scope.

The provided context shows partial implementations of the base64_encode and sock_connect functions. To fully verify the security of the base64 encoding and the correctness of the socket reconnection logic, I need to see the complete implementations of both functions.

The provided context now includes the full implementation of the base64_encode function and the sock_connect function.

base64_encode Function Security

The base64_encode function:

Allocates memory for the encoded string based on the input length.

Performs the encoding by mapping each group of bits to the corresponding base64 character.

Handles padding with = characters for input not divisible by three.

Null-terminates the encoded string.

The implementation appears to follow the standard base64 encoding procedure. There are no obvious security issues like buffer overflows since the buffer is allocated based on the input size, and the function checks for a null input. However, without a security audit or testing, we cannot guarantee the absence of all potential vulnerabilities.

sock_connect Function Correctness

The sock_connect function:

Creates a new socket.

Resolves the hostname to an IP address.

Attempts to connect to the specified port.

Returns the file descriptor for the new socket or -1 on error.

The reconnection logic closes the old socket and calls sock_connect to establish a new connection. This is a standard practice for reconnecting sockets. The function checks for errors at each step and returns -1 if it cannot establish a connection, which is then handled appropriately in the calling code.

Based on the provided context, the base64 encoding is implemented securely, and the socket reconnection logic is correct.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the base64 encoding is secure and that the socket reconnection logic is correct.
rg --type c "Basic realm"
Length of output: 94

Script:
#!/bin/bash
# Verify the implementation of the base64_encode function for security
rg --type c "base64_encode"

# Verify the socket reconnection logic after the close(csock) call
rg --type c "csock = sock_connect"
Length of output: 407

Script:
#!/bin/bash
# Examine the implementation of the base64_encode function
ast-grep --lang c --pattern 'function_definition[name="base64_encode"]'

# Examine the implementation of the sock_connect function
ast-grep --lang c --pattern 'function_definition[name="sock_connect"]'
Length of output: 146

Script:
#!/bin/bash
# Search for the implementation of the base64_encode function
rg --type c -A 30 "char \*base64_encode"

# Search for the implementation of the sock_connect function
rg --type c -A 30 "sock_connect"
Length of output: 7343

Script:
#!/bin/bash
# Retrieve the full implementation of the base64_encode function
rg --type c -C 50 "char \*base64_encode"

# Retrieve the full implementation of the sock_connect function
rg --type c -C 50 "sock_connect"
Length of output: 18265
* 583-591: The sending of the constructed URI to the proxy. Ensure that the `write` operation is handled correctly and that there are no partial writes without proper handling.

coderabbitai · 2024-01-04T20:38:16Z

corkscrew.c

+typedef struct proxyauth{
+	char realm[BUFSIZE];
+	char nonce[33];
+	char qop[BUFSIZE];
+	int stale;
+}proxyauth;
+
+
+proxyauth* process_line(char* line){
+
+	proxyauth* pa = (proxyauth *)malloc(sizeof(proxyauth));
+	// for now only looking for nonce, realm and qop
+	char* where = strstr(line, "nonce=");
+	if(where != NULL)strncpy(pa->nonce, where + 7, 32);
+	where = strstr(line, "realm=");
+	if(where != NULL)sscanf(where + 7, "%[^\"]", pa->realm);
+	where = strstr(line, "qop=");
+	if(where != NULL)sscanf(where + 5, "%[^\"]", pa->qop);
+	// only 'auth' parameter is recognized
+	if(strstr(pa->qop, "auth") == NULL)pa->qop[0] = '\0'; 
+	else strncpy(pa->qop, "auth", 5);
+	return pa;
+}


The process_line function in corkscrew.c uses strncpy and sscanf for parsing strings. The strncpy usage for the nonce is safe as it copies a fixed number of characters. However, the sscanf calls for realm and qop do not limit the number of characters read, which could lead to buffer overflows if the input exceeds BUFSIZE. Additionally, there is no check on the length of the input line before parsing, which could be a potential issue. The strncpy call for qop is redundant but not unsafe. Overall, there are potential security concerns with the sscanf usage that should be addressed.

Analysis chain

The proxyauth struct and process_line function are used to parse proxy authentication parameters. Ensure that the string operations do not lead to buffer overflows, especially with strncpy and sscanf.

Scripts Executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Verify that the string operations in process_line do not lead to buffer overflows. ast-grep --lang c --pattern $'process_line($_)'

Length of output: 48

Script:

#!/bin/bash # Use rg to search for the process_line function definition and its usage to verify string operations. rg "process_line"

Length of output: 108

Script:

#!/bin/bash # Output the content of the lines around the definition of process_line to analyze the string operations. rg -C 10 "process_line" corkscrew.c

Length of output: 1138

coderabbitai · 2024-01-04T20:38:16Z

corkscrew.c

+
 				fscanf(fp, "%s", line);
 				up = malloc(sizeof(line));
 				up = line;


The line buffer is defined with a size of BUFSIZE, which is set to 4096 bytes. This size should be adequate for typical authentication strings, but without a check on the length of the input from the file, there is a potential risk of buffer overflow if the file contains a line longer than 4096 characters.

To fully verify the safety of the fscanf usage, we would need to ensure that the fscanf call is limited to reading up to BUFSIZE - 1 characters (to leave space for the null terminator). Since the current fscanf call does not specify a limit, it is potentially unsafe.

The review comment is correct in pointing out the need to ensure there is no buffer overflow. The code should be modified to prevent reading more characters than the buffer can hold.

Analysis chain

The authentication file is read, and its contents are stored in line. Ensure that there is no buffer overflow when reading the file contents into line.

The line buffer is declared with a size of BUFSIZE. To ensure there is no buffer overflow, we need to verify that BUFSIZE is sufficiently large to hold the contents of the authentication file that line is intended to read. Let's find the definition of BUFSIZE to confirm its size.

Scripts Executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Verify that the buffer 'line' is large enough to hold the contents of the authentication file. rg --type c "fscanf"

Length of output: 60

Script:

#!/bin/bash # Find the declaration of the 'line' buffer and any relevant size checks or allocations. rg --type c "char.*line["

Length of output: 110

Script:

#!/bin/bash # Correct the regex pattern to properly find the declaration of the 'line' buffer. rg --type c "char.*line\\["

Length of output: 99

Script:

#!/bin/bash # Find the definition of BUFSIZE to confirm the size of the buffer. rg --type c "#define BUFSIZE"

Length of output: 64

added digest auth support

9938687

coderabbitai bot reviewed Jan 4, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

added http digest auth support #11

added http digest auth support #11

gilcu3 commented Jan 4, 2024 •

edited by coderabbitai bot

Loading

netlify bot commented Jan 4, 2024 •

edited

Loading

coderabbitai bot commented Jan 4, 2024 •

edited

Loading

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (`.coderabbit.yaml`)

CodeRabbit Discord Community

coderabbitai bot left a comment

`base64_encode` Function Security

`sock_connect` Function Correctness

coderabbitai bot Jan 4, 2024

coderabbitai bot Jan 4, 2024

added http digest auth support #11

Are you sure you want to change the base?

added http digest auth support #11

Conversation

gilcu3 commented Jan 4, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

netlify bot commented Jan 4, 2024 • edited Loading

❌ Deploy Preview for resonant-crumble-b74150 failed.

coderabbitai bot commented Jan 4, 2024 • edited Loading

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (.coderabbit.yaml)

CodeRabbit Discord Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

base64_encode Function Security

sock_connect Function Correctness

coderabbitai bot Jan 4, 2024

Choose a reason for hiding this comment

coderabbitai bot Jan 4, 2024

Choose a reason for hiding this comment

gilcu3 commented Jan 4, 2024 •

edited by coderabbitai bot

Loading

netlify bot commented Jan 4, 2024 •

edited

Loading

coderabbitai bot commented Jan 4, 2024 •

edited

Loading

CodeRabbit Configration File (`.coderabbit.yaml`)

`base64_encode` Function Security

`sock_connect` Function Correctness