Improvements in command handling #5 #631

Open
wants to merge 155 commits into base: main

@TalZaccai commented Sep 3, 2024

  • Introducing RawStringInput struct to simplify main store command processing
    • BasicCommands
    • ArrayCommands
    • MainStoreOps
    • AdvancedOps
    • BitmapCommands
    • BitmapOps
    • HyperLogLogCommands
    • HyperLogLogOps
    • KeyAdminCommands
    • CustomRawStringFunctions
  • Supporting RawStringInput serialization/deserialization for AOF recovery
  • Relaying parse state to custom commands / transactions
  • Fixing bug in PFCOUNT logic

Benchmarking results:
main:

| Method | Mean | Error | StdDev | Allocated |
|---|---:|---:|---:|---:|
| InlinePing | 1.493 us | 0.0225 us | 0.0188 us | - |
| Set | 8.963 us | 0.1783 us | 0.1751 us | - |
| SetEx | 13.153 us | 0.2306 us | 0.4388 us | - |
| Get | 5.804 us | 0.0819 us | 0.0726 us | - |
| ZAddRem | 73.947 us | 1.1658 us | 1.0334 us | 23552 B |
| LPushPop | 80.001 us | 1.5987 us | 3.0801 us | 30721 B |
| SAddRem | 64.207 us | 1.2544 us | 2.3560 us | 16384 B |
| HSetDel | 79.286 us | 1.0700 us | 1.0009 us | 55297 B |
| MyDictSetGet | 116.929 us | 1.1018 us | 0.9200 us | 30720 B |

current branch:

| Method | Mean | Error | StdDev | Allocated |
|---|---:|---:|---:|---:|
| InlinePing | 1.492 us | 0.0185 us | 0.0173 us | - |
| Set | 8.841 us | 0.1280 us | 0.1198 us | - |
| SetEx | 13.146 us | 0.1434 us | 0.1341 us | - |
| Get | 6.049 us | 0.0763 us | 0.0714 us | - |
| ZAddRem | 71.771 us | 1.2070 us | 1.1291 us | 23552 B |
| LPushPop | 71.520 us | 1.2398 us | 1.1597 us | 30721 B |
| SAddRem | 59.022 us | 0.7709 us | 0.7211 us | 16384 B |
| HSetDel | 76.690 us | 1.2125 us | 1.0748 us | 55297 B |
| MyDictSetGet | 114.768 us | 1.0504 us | 0.9825 us | 30720 B |

@TalZaccai marked this pull request as ready for review September 5, 2024 02:25
libs/server/Resp/HyperLogLog/HyperLogLogCommands.cs (outdated, resolved)
libs/server/Storage/Session/MainStore/HyperLogLogOps.cs (outdated, resolved)
libs/server/InputHeader.cs (resolved)

*(long*)(pcurr) = bOffset; pcurr += sizeof(long);
*pcurr = bSetVal;
#endregion
var input = new RawStringInput

Contributor

It seems this pattern is repeated across commands. Does it make sense to do it right after parseState instead of at the time of command execution?

Contributor Author

a. When you're setting parseState, you don't know the command yet.
b. parseStateStartIdx is 1 only when it's a single-key command, so it really depends on the command syntax.
c. What we can do is have a cleaner constructor for RawStringInput (as well as ObjectInput).

if (parseState.Count > 4)
{
    var sbOffsetType = parseState.GetArgSliceByRef(4).ReadOnlySpan;
    bitOffsetType = sbOffsetType.EqualsUpperCaseSpanIgnoringCase("BIT"u8) ? (byte)0x1 : (byte)0x0;
    if (!sbOffsetType.EqualsUpperCaseSpanIgnoringCase("BIT"u8) &&

Contributor

How difficult is it to propagate the result of this parsing to RMWMethods? It seems we perform this comparison twice, which is not efficient.
Maybe we can have an int array of args for commands that need to translate string arguments into categorical values, so the values can be passed to the backend functions without another translation/comparison.

Contributor Author

True, we are parsing twice, once for validation and once for execution. I think that's part of the price for simplifying the code and making it more readable... @badrishc any ideas here?

Contributor

Why can't this be passed as a parameter in the input, similar to arg1 etc.? If the backend does not find it (not sure why it wouldn't, maybe because of the GarnetAPI code path?), it can reparse it from the parse state.
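
As a rough illustration of this suggestion, the sketch below parses the offset type once and carries it in the input, with a fallback that reparses the raw token when no cached value is present. Everything here is hypothetical: `DemoBitInput`, `OffsetTypeCache`, the `NotParsed` sentinel, and the use of `string` tokens (the real code works on byte spans in the parse state) are assumptions made to keep the example self-contained, and the BYTE/BIT tokens follow the Redis BITCOUNT/BITPOS syntax.

```csharp
using System;

// Hypothetical stand-in for an input struct; not Garnet's RawStringInput.
public struct DemoBitInput
{
    public int OffsetTypeArg; // pre-parsed value, or OffsetTypeCache.NotParsed
    public string RawToken;   // stand-in for the argument kept in the parse state
}

public static class OffsetTypeCache
{
    // Note: a default-initialized DemoBitInput has OffsetTypeArg == 0 (ByteOffsets),
    // so callers must set NotParsed explicitly when nothing was cached.
    public const int NotParsed = -1;
    public const int ByteOffsets = 0;
    public const int BitOffsets = 1;

    // Front end: parse the token once, during validation.
    public static bool TryParse(string token, out int offsetType)
    {
        if (string.Equals(token, "BIT", StringComparison.OrdinalIgnoreCase))
        {
            offsetType = BitOffsets;
            return true;
        }
        if (string.Equals(token, "BYTE", StringComparison.OrdinalIgnoreCase))
        {
            offsetType = ByteOffsets;
            return true;
        }
        offsetType = NotParsed;
        return false;
    }

    // Back end: use the cached value if present, otherwise reparse the raw token.
    public static int Resolve(in DemoBitInput input)
    {
        if (input.OffsetTypeArg != NotParsed) return input.OffsetTypeArg;
        if (TryParse(input.RawToken, out var parsed)) return parsed;
        throw new FormatException("Unrecognized offset type token.");
    }
}
```

With this shape, the common path pays for one comparison at validation time, and only callers that bypass validation fall back to the second parse.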

Contributor

@badrishc left a comment

See comments. In general, please check for all cases where we create an array of structs, such as ArgSlice[] or SpanByte[]. These are allocations that should be avoided where possible by moving them to session-level fields that are reused across calls into Garnet.
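
A minimal sketch of the session-level reuse pattern described above, under stated assumptions: `DemoSession` and `RentScratch` are hypothetical names, `int` stands in for a struct such as ArgSlice, and the session is assumed to be used by one thread at a time.

```csharp
using System;

// Hypothetical session type; not Garnet's StorageSession.
public sealed class DemoSession
{
    // Allocated once per session and reused across calls.
    private int[] scratch = new int[4];

    // Returns a view over the reusable buffer, growing it only when needed.
    // The returned span is only valid until the next RentScratch call.
    public Span<int> RentScratch(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        if (scratch.Length < count)
            Array.Resize(ref scratch, Math.Max(count, scratch.Length * 2));
        return scratch.AsSpan(0, count);
    }
}
```

After the buffer has grown to the largest argument count seen, further calls allocate nothing; the trade-off is that callers must not hold on to the span across calls.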

libs/server/Resp/Parser/ParseUtils.cs (resolved)
libs/server/Resp/HyperLogLog/HyperLogLogCommands.cs (outdated, resolved)
libs/server/AOF/AofProcessor.cs (outdated, resolved)
libs/server/AOF/AofProcessor.cs (outdated, resolved)
libs/server/Resp/ArrayCommands.cs (outdated, resolved)
libs/server/Storage/Session/StorageSession.cs (resolved)
libs/server/Storage/Functions/MainStore/PrivateMethods.cs (outdated, resolved)
libs/server/AOF/AofProcessor.cs (outdated, resolved)
libs/server/InputHeader.cs (resolved)
libs/server/Resp/BasicCommands.cs (outdated, resolved)

Contributor

@TedHartMS left a comment

Reviewed up to BitmapCommands.cs

/// Get header as Span
/// </summary>
/// <returns>Span</returns>
public unsafe Span<byte> AsSpan() => new(ToPointer(), Size);

Contributor

Why would we need this as a Span? The only place I see this used is in Length below, which I think could just return Size. Length in turn is only used by SpanByte, so maybe it can just be consolidated into that.

var serializedLength = header.SpanByte.TotalSize
+ (3 * sizeof(int)) // Length + arg1 + arg2
+ parseState.GetSerializedLength(parseStateStartIdx);

Contributor

Why the local var? This can just be an expression-bodied member (=>).


/// <inheritdoc />
public unsafe void CopyTo(byte* dest)
{

Contributor

We should pass in the destination length and verify it. I know you've verified the length at the call site, but even so, we are essentially doing a C/C++ buffer copy and should follow the security guidelines for those by doing bounds checking, just as earlier secure-code initiatives replaced memcpy with memcpy_s.
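
One way to get the bounds check requested here is to let the destination carry its own length. The sketch below is an assumption, not the PR's code: `BoundedCopy.TryCopyTo` is a hypothetical helper, and it takes a `Span<byte>` rather than a `byte*`. A caller that only has a pointer and a length could wrap them with `new Span<byte>(dest, destLength)` in an unsafe context.

```csharp
using System;

public static class BoundedCopy
{
    // Copies source into dest only if it fits; reports how many bytes were written.
    public static bool TryCopyTo(ReadOnlySpan<byte> source, Span<byte> dest, out int bytesWritten)
    {
        bytesWritten = 0;

        // Refuse to write past the end of the destination (the memcpy_s-style check).
        if (source.Length > dest.Length) return false;

        source.CopyTo(dest);
        bytesWritten = source.Length;
        return true;
    }
}
```

Returning `false` instead of throwing keeps the check cheap on hot paths; whether a failure should throw instead is a design choice for the caller.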

// 1. Header
((RespInputHeader*)pcurr)->SetHeader(RespCommandAccessor.MIGRATE, 0);
var input = new RawStringInput();
input.header.SetHeader(RespCommandAccessor.MIGRATE, 0);

Contributor

Can this pass the RespCommand as a ctor arg? The more we guarantee correct initialization, the better.
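
A minimal sketch of constructor-based initialization, which is also in the spirit of the "cleaner constructor" mentioned earlier in the thread. All names (`DemoCommand`, `DemoHeader`, `DemoInput`, `FirstArgIdx`) are hypothetical stand-ins and do not reflect the actual layout of RawStringInput or RespInputHeader.

```csharp
using System;

// Hypothetical stand-ins; not Garnet's actual types.
public enum DemoCommand : byte { None = 0, Get = 1, Migrate = 2 }

public struct DemoHeader
{
    public DemoCommand Cmd;
    public byte Flags;
}

public struct DemoInput
{
    public DemoHeader Header;
    public int FirstArgIdx; // index of the first argument to consume
    public long Arg1;

    // The command is a required argument, so an input created through this
    // constructor always has its header set.
    public DemoInput(DemoCommand cmd, byte flags = 0, int firstArgIdx = 0, long arg1 = 0)
    {
        Header = new DemoHeader { Cmd = cmd, Flags = flags };
        FirstArgIdx = firstArgIdx;
        Arg1 = arg1;
    }
}
```

A call site then shrinks to `var input = new DemoInput(DemoCommand.Migrate);`, replacing the two-step "create, then SetHeader" pattern. Note that `default(DemoInput)` still bypasses the constructor, so this narrows the window for mistakes rather than closing it.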

var parseSuccessful = TryReadLongSafe(ref ptr, end, out value, out bytesRead, out var signRead,
out var overflow, allowLeadingZeros);

if (parseSuccessful) return true;

Contributor

For readability, put the return on its own line. In fact, parseSuccessful is not needed: just return true if TryReadLongSafe succeeds. Also, only the second of the "return false" lines below is needed.

if (parseStateCount > 0)
{
    parseState.Initialize(parseStateCount);

Contributor

I had to go look to figure out what "count" was here. Better to name it (and the ctor arg) "argCount"

var curr = ptr + sizeof(AofHeader);
ref var key = ref Unsafe.AsRef<SpanByte>(curr);
curr += key.TotalSize;

Contributor

This could be cleaned up with an AofRecordDescriptor or something like that, to cover these pointer manipulations.

/// </summary>
public unsafe SpanByte SpanByte => new(Length, (nint)ToPointer());
public int parseStateStartIdx;

Contributor

This is another opportunity for clearer naming: What index? into the byte*? The reader has to look to see it refers to arguments. You use "token" in some other places, which is also good but should be consistent. (I know this field name came from existing ObjectInput; it would be nice to make these all consistent and clear)

long startOffset = *(long*)(input + sizeof(byte));
long endOffset = *(long*)(input + sizeof(byte) + sizeof(long));
byte offsetType = *(input + sizeof(byte) + sizeof(long) * 2);

if (offsetType == 0x0)

Contributor

offsetType should be an enum or at least named constants. Magic numbers are uninformative and less maintainable. (I know this was there before, but since you're doing such extensive cleanup including this param, we should get this too.)
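
To make the suggestion concrete, here is a small sketch of an enum replacing the 0x0/0x1 values. `BitOffsetType` and `BitRange.ToBitRange` are hypothetical names that do not appear in the PR, and the range conversion assumes non-negative, inclusive offsets in the style of the Redis BYTE|BIT option.

```csharp
using System;

// Hypothetical enum; the values mirror the 0x0 / 0x1 bytes used in the excerpt.
public enum BitOffsetType : byte
{
    Byte = 0x0, // start/end are byte offsets
    Bit = 0x1,  // start/end are bit offsets
}

public static class BitRange
{
    // Converts an inclusive, non-negative [start, end] range to inclusive bit indices.
    public static (long StartBit, long EndBit) ToBitRange(long start, long end, BitOffsetType offsetType)
    {
        switch (offsetType)
        {
            case BitOffsetType.Byte:
                // A byte offset covers 8 bits; the end byte is included in full.
                return (start * 8, end * 8 + 7);
            case BitOffsetType.Bit:
                return (start, end);
            default:
                throw new ArgumentOutOfRangeException(nameof(offsetType));
        }
    }
}
```

Because the enum is byte-backed, it can still be written to and read from the serialized input as a single byte, so the wire layout would not need to change.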

return slice.length != 0 &&
RespReadUtils.TryReadIntSafe(ref ptr, slice.ptr + slice.length, out number, out var bytesRead, out _,
out _, false) &&
(int)bytesRead == slice.length;

Contributor

Use a named argument ("paramName: false") for readability.

E = HyperLogLog.DefaultHLL.Count(value.ToPointer());
*(long*)dst.SpanByte.ToPointer() = E;
return;

Contributor

Was it intentional to remove the
E = HyperLogLog.DefaultHLL.Count(value.ToPointer());
call?

pcurr += sizeof(long);
*pcurr = (byte)(useBitInterval ? 1 : 0);
var startBytes = Encoding.ASCII.GetBytes(start.ToString(CultureInfo.InvariantCulture));
var endBytes = Encoding.ASCII.GetBytes(end.ToString(CultureInfo.InvariantCulture));

Contributor

These do allocations. stackalloc long[1] instead?
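
If the ASCII form of the numbers is still needed, one allocation-free alternative to `Encoding.ASCII.GetBytes(x.ToString(...))` is to format directly into a caller-provided buffer. Note that this is a different technique from the `stackalloc long[1]` idea above, which would pass the raw value instead of text. `AsciiNumbers` is a hypothetical helper; `Utf8Formatter.TryFormat` is the standard .NET API from `System.Buffers.Text`.

```csharp
using System;
using System.Buffers.Text;

public static class AsciiNumbers
{
    // long.MinValue needs 19 digits plus a sign.
    public const int MaxLongChars = 20;

    // Writes the decimal ASCII form of value into destination and returns the byte count.
    public static int Format(long value, Span<byte> destination)
    {
        if (!Utf8Formatter.TryFormat(value, destination, out int bytesWritten))
            throw new ArgumentException("Destination buffer is too small.", nameof(destination));
        return bytesWritten;
    }
}
```

A caller could pair this with a stack buffer, for example `Span<byte> buf = stackalloc byte[AsciiNumbers.MaxLongChars];`, so that neither the string nor the byte array is allocated on the heap.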

*(long*)pcurr = commandArguments[i].value; pcurr += 8;
*pcurr = commandArguments[i].overflowType;
var op = (RespCommand)commandArguments[i].secondaryOpCode;
var opBytes = Encoding.ASCII.GetBytes(op.ToString());

Contributor

More GetBytes allocations and also some ToString() allocations. Can these be avoided? If not, add a comment explaining why they are necessary.

{
*(long*)pcurr = (long)HashUtils.MurmurHash2x64A(ptr, bString.Length);

Contributor

It looks like the Murmur hash has moved into IterateUpdate. Can the string be fixed directly rather than going through another GetBytes?

5 participants