WAL implementation #1203

Merged
merged 17 commits into from
Nov 7, 2024

Conversation

@arpitbbhayani (Contributor) commented Oct 26, 2024

Naive AOF vs Naive SQLite implementation

[Image: Screenshot 2024-10-29 210934]

@soumya-codes (Contributor) left a comment

Thanks for the PR.
IMO, it may be beneficial to write commands to the WAL asynchronously, with configurable fsync policies similar to what Redis offers.
This is needed to keep our latency low at scale.
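
To make the suggestion concrete, here is a minimal sketch of configurable fsync policies, modeled on the three `appendfsync` modes Redis offers (`always`, `everysec`, `no`). Every name in it (`FsyncPolicy`, `policyWriter`, `countingFile`, `countSyncs`) is hypothetical and not part of this PR. It also simplifies the every-second mode: the sync is checked on each append rather than driven by a background timer, and a fake clock plus an in-memory file keep it deterministic.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
	"time"
)

// FsyncPolicy is a hypothetical setting modeled on Redis's appendfsync modes.
type FsyncPolicy int

const (
	FsyncAlways   FsyncPolicy = iota // sync after every append
	FsyncEverySec                    // sync at most once per interval
	FsyncNo                          // never sync explicitly; leave it to the OS
)

// syncWriter is the subset of *os.File that the writer needs.
type syncWriter interface {
	io.Writer
	Sync() error
}

// policyWriter appends records and syncs according to the chosen policy.
type policyWriter struct {
	mu       sync.Mutex
	dst      syncWriter
	policy   FsyncPolicy
	interval time.Duration
	lastSync time.Time
	now      func() time.Time // injectable clock, so the sketch is testable
}

func (p *policyWriter) Append(rec []byte) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, err := p.dst.Write(rec); err != nil {
		return err
	}
	switch p.policy {
	case FsyncAlways:
		return p.dst.Sync()
	case FsyncEverySec:
		// Simplification: checked on append instead of by a background timer.
		if now := p.now(); now.Sub(p.lastSync) >= p.interval {
			p.lastSync = now
			return p.dst.Sync()
		}
	}
	return nil
}

// countingFile is an in-memory stand-in for a WAL file that counts Sync calls.
type countingFile struct {
	bytes.Buffer
	syncs int
}

func (c *countingFile) Sync() error { c.syncs++; return nil }

// countSyncs appends the given number of records under a policy, using a fake
// clock that advances 400ms per reading, and reports how many syncs occurred.
func countSyncs(policy FsyncPolicy, appends int) int {
	start := time.Unix(0, 0)
	tick := 0
	clock := func() time.Time {
		tick++
		return start.Add(time.Duration(tick) * 400 * time.Millisecond)
	}
	f := &countingFile{}
	w := &policyWriter{dst: f, policy: policy, interval: time.Second, lastSync: start, now: clock}
	for i := 0; i < appends; i++ {
		if err := w.Append([]byte("SET k v\n")); err != nil {
			panic(err)
		}
	}
	return f.syncs
}

func main() {
	fmt.Println(countSyncs(FsyncAlways, 5), countSyncs(FsyncEverySec, 5), countSyncs(FsyncNo, 5))
}
```

The trade-off is the usual one: syncing on every append gives the strongest durability at the highest latency, while the other two modes accept a window of possible data loss in exchange for fewer syncs.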

@@ -379,13 +382,17 @@ func (w *BaseWorker) gather(ctx context.Context, diceDBCmd *cmd.DiceDBCmd, numCm
return err
}

w.wl.LogCommand(diceDBCmd)

We are doing a synchronous, I/O-heavy operation in the critical path, where many workers will contend for the same resource.
This may become a bottleneck and add latency at scale.
Should we do it asynchronously?
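
One possible shape for this, as a rough sketch rather than a proposal for the actual `WAL` type: workers enqueue onto a buffered channel and a single goroutine drains it, so the request path only pays for a channel send. The names `asyncLog`, `newAsyncLog`, and `logFromWorkers` are hypothetical, commands are plain strings, and an in-memory slice stands in for the WAL file.

```go
package main

import (
	"fmt"
	"sync"
)

// asyncLog is a hypothetical async logger: many producers, one writer goroutine.
type asyncLog struct {
	entries chan string
	done    chan struct{}
	written []string // stand-in for the WAL file; touched only by run()
}

func newAsyncLog(buffer int) *asyncLog {
	l := &asyncLog{
		entries: make(chan string, buffer),
		done:    make(chan struct{}),
	}
	go l.run()
	return l
}

// run is the single writer. A real WAL would serialize and write to disk here.
func (l *asyncLog) run() {
	defer close(l.done)
	for e := range l.entries {
		l.written = append(l.written, e)
	}
}

// Log enqueues a command. It blocks only when the buffer is full.
func (l *asyncLog) Log(cmd string) { l.entries <- cmd }

// Close stops accepting entries, waits for the writer to drain the queue,
// and returns everything that was written.
func (l *asyncLog) Close() []string {
	close(l.entries)
	<-l.done
	return l.written
}

// logFromWorkers simulates concurrent workers and returns how many entries
// reached the log.
func logFromWorkers(workers, perWorker int) int {
	l := newAsyncLog(64)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				l.Log(fmt.Sprintf("worker-%d SET k%d v", id, i))
			}
		}(w)
	}
	wg.Wait()
	return len(l.Close())
}

func main() {
	fmt.Println("entries written:", logFromWorkers(4, 25))
}
```

The cost of this design is durability: a command that has been acknowledged to the client but is still sitting in the queue is lost if the process crashes, which is why it pairs naturally with an explicit fsync policy.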

case MultiShard:
err := w.ioHandler.Write(ctx, val.composeResponse(storeOp...))
if err != nil {
slog.Debug("Error sending response to client", slog.String("workerID", w.id), slog.Any("error", err))
return err
}

w.wl.LogCommand(diceDBCmd)

Can we log the command just after it is parsed?

@@ -164,14 +184,14 @@ func main() {
go runServer(ctx, &serverWg, asyncServer, serverErrCh)

if config.EnableHTTP {
-httpServer := server.NewHTTPServer(shardManager)
+httpServer := server.NewHTTPServer(shardManager, wl)

Given all the complexity involved in capturing commands in the WAL, should we now run the server with only one protocol at a time?


// LogCommand serializes a WALLogEntry and writes it to the current WAL file.
func (w *WAL) LogCommand(c *cmd.DiceDBCmd) {
w.mutex.Lock()

LogCommand seems to be a high-contention, compute-intensive method that involves locks and marshalling (although we are using Protobuf). Having it in the critical path may affect latency. Before adding this code, I would suggest we run the memtier benchmarks.
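
As a complement to an end-to-end memtier run, a micro-benchmark can isolate the cost of the lock plus marshalling under contention. The sketch below is only an illustration: `entry`, `lockedLog`, and `benchLogCommand` are hypothetical, `encoding/json` stands in for Protobuf purely to avoid an external dependency, and output goes to `io.Discard`, so it measures contention and serialization but not disk I/O.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"sync"
	"testing"
)

// entry is a hypothetical stand-in for a WAL log entry.
type entry struct {
	Cmd  string   `json:"cmd"`
	Args []string `json:"args"`
}

// lockedLog mimics the pattern under review: lock, marshal, write.
type lockedLog struct {
	mu  sync.Mutex
	out io.Writer
}

func (l *lockedLog) LogCommand(e entry) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, err := json.Marshal(e) // assumption: JSON in place of Protobuf
	if err != nil {
		return err
	}
	_, err = l.out.Write(b)
	return err
}

// benchLogCommand runs LogCommand from parallel goroutines, all contending
// for the same mutex, and returns the benchmark result.
func benchLogCommand() testing.BenchmarkResult {
	l := &lockedLog{out: io.Discard}
	e := entry{Cmd: "SET", Args: []string{"k", "v"}}
	return testing.Benchmark(func(b *testing.B) {
		b.RunParallel(func(pb *testing.PB) {
			for pb.Next() {
				if err := l.LogCommand(e); err != nil {
					b.Error(err)
				}
			}
		})
	})
}

func main() {
	r := benchLogCommand()
	fmt.Printf("%d ops, %d ns/op\n", r.N, r.NsPerOp())
}
```

Numbers from a sketch like this only indicate relative cost; the memtier benchmarks remain the measure of what clients would actually see.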

@arpitbbhayani arpitbbhayani marked this pull request as ready for review November 7, 2024 14:59
@arpitbbhayani arpitbbhayani merged commit 1e9e85b into master Nov 7, 2024
1 of 2 checks passed
2 participants