-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WAL implementation #1203
WAL implementation #1203
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR.
IMO, it may be beneficial if the commands are written to the WAL async with configurable fsync policies similar to how Redis does.
This is needed to ensure our latency numbers are not high at scale.
@@ -379,13 +382,17 @@ func (w *BaseWorker) gather(ctx context.Context, diceDBCmd *cmd.DiceDBCmd, numCm | |||
return err | |||
} | |||
|
|||
w.wl.LogCommand(diceDBCmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are doing an synchronous I/O heavy operation in the critical path where a lot of workers would be contending for the same resource.
This may become a bottleneck and introduce latency at scale.
Should we do it async?
case MultiShard: | ||
err := w.ioHandler.Write(ctx, val.composeResponse(storeOp...)) | ||
if err != nil { | ||
slog.Debug("Error sending response to client", slog.String("workerID", w.id), slog.Any("error", err)) | ||
return err | ||
} | ||
|
||
w.wl.LogCommand(diceDBCmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we log the command just after it is parsed?
@@ -164,14 +184,14 @@ func main() { | |||
go runServer(ctx, &serverWg, asyncServer, serverErrCh) | |||
|
|||
if config.EnableHTTP { | |||
httpServer := server.NewHTTPServer(shardManager) | |||
httpServer := server.NewHTTPServer(shardManager, wl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With all the complexity that will be involved in capturing commands in WAL, should we now run Server with one protocol at a given time?
internal/wal/main.go
Outdated
|
||
// LogCommand serializes a WALLogEntry and writes it to the current WAL file. | ||
func (w *WAL) LogCommand(c *cmd.DiceDBCmd) { | ||
w.mutex.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LogCommand seems to be a high contention and compute intensive method that involves locks and marshalling(although we are using PROTOBUF). Having this as part of critical path may effect latency. Before adding this code I would suggest we run the memtier benchmarks.
Naive AOF vs Naive SQLite implementation