HashedRunFinder run-finding improvements #4974

Merged
merged 3 commits into from
Dec 29, 2023

@lbooker42 lbooker42 commented Dec 21, 2023

I compared a few strategies for run-finding:

  • Linear Probe Hashing (default)
  • Sorted
  • No Run Finding
  • Quadratic Probe Hashing
  • Double Hash Hashing

Here are the results:
[image: benchmark results for the run-finding strategies listed above]

lbooker42 commented Dec 21, 2023

Here is the test code to generate those results:

io.deephaven.engine.table.impl.QueryTable.DISABLE_PARALLEL_WHERE=true

QueryScope.addParam("test", (tableFunction, tableSize, uniqueValues, description, repeat, lowRemove, highRemove) -> {
    // Elapsed time in seconds for each timed countBy call
    List<Double> results = new ArrayList<>();

    for (int ii = 0; ii < repeat; ii++) {
        table = tableFunction(tableSize, uniqueValues);
        start = System.nanoTime();
        t2 = table.countBy("Count", "someIntColumn");
        end = System.nanoTime();
        // Divide by a double literal so each sample is stored as a Double
        results.add((end - start) / 1_000_000_000.0d);
    }
    // Sort so the lowRemove fastest and highRemove slowest samples can be dropped
    Collections.sort(results);
    sum = 0.0;
    count = 0;
    for (int ii = 0; ii < results.size(); ii++) {
        if (ii >= lowRemove && ii <= results.size() - highRemove - 1) {
            sum += results.get(ii);
            count++;
        }
    }
    // Tab-separated: description, table size, unique values, trimmed mean in seconds
    println(description + "\t" + tableSize + "\t" + uniqueValues + "\t" + sum / count);
    results
})

values = [10, 100, 1_000, 10_000, 100_000, 1_000_000]
tableSize = 10_000_000

description = "Ascending"
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=i%" + uniqueValues)
for (v : values) {
    test(tableGen, tableSize, v, description, 10, 0, 2)
}

description = "Descending"
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=(-i%" + uniqueValues + "+" + uniqueValues + ")")
for (v : values) {
    test(tableGen, tableSize, v, description, 10, 0, 2)
}

description = "Random"
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).update("someIntColumn=randomInt(0, " + uniqueValues + ")")
for (v : values) {
    test(tableGen, tableSize, v, description, 10, 0, 2)
}

runLength = 2
description = "RunsLength" + runLength
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=(i-i%" + runLength + ") % (" + runLength + "*" + uniqueValues + ")")
for (v : values) {
     test(tableGen, tableSize, v, description, 10, 0, 2)
}

runLength = 5
description = "RunsLength" + runLength
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=(i-i%" + runLength + ") % (" + runLength + "*" + uniqueValues + ")")
for (v : values) {
     test(tableGen, tableSize, v, description, 10, 0, 2)
}

runLength = 10
description = "RunsLength" + runLength
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=(i-i%" + runLength + ") % (" + runLength + "*" + uniqueValues + ")")
for (v : values) {
     test(tableGen, tableSize, v, description, 10, 0, 2)
}

runLength = 100
description = "RunsLength" + runLength
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=(i-i%" + runLength + ") % (" + runLength + "*" + uniqueValues + ")")
for (v : values) {
     test(tableGen, tableSize, v, description, 10, 0, 2)
}

runLength = 1000
description = "RunsLength" + runLength
tableGen = (tableSize, uniqueValues) -> emptyTable(tableSize).view("someIntColumn=(i-i%" + runLength + ") % (" + runLength + "*" + uniqueValues + ")")
for (v : values) {
     test(tableGen, tableSize, v, description, 10, 0, 2)
}
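
For readers who want the timing summary outside the Groovy console, the trimming step used by the test closure can be sketched in plain Java. This is a minimal sketch, not part of the PR: the class name `TrimmedMean`, the method name `trimmedMean`, and the sample timings are hypothetical. It sorts the samples, drops the `lowRemove` smallest and `highRemove` largest, and averages the rest, which matches the index test in the closure above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TrimmedMean {
    /**
     * Mean of the samples after sorting and dropping the lowRemove smallest
     * and highRemove largest values (hypothetical helper, not Deephaven API).
     */
    static double trimmedMean(List<Double> samples, int lowRemove, int highRemove) {
        List<Double> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int from = lowRemove;                   // first index kept
        int to = sorted.size() - highRemove;    // one past the last index kept
        if (from >= to) {
            throw new IllegalArgumentException("no samples left after trimming");
        }
        double sum = 0.0;
        for (int ii = from; ii < to; ii++) {
            sum += sorted.get(ii);
        }
        return sum / (to - from);
    }

    public static void main(String[] args) {
        // Made-up timings in seconds, for illustration only
        List<Double> seconds = List.of(0.52, 0.48, 0.50, 0.49, 0.51, 0.47, 0.53, 0.50, 0.90, 1.20);
        // Same trimming as test(..., 10, 0, 2): keep the 8 fastest of 10 runs
        System.out.println(trimmedMean(seconds, 0, 2));
    }
}
```

Dropping only the slowest samples (`lowRemove = 0`, `highRemove = 2`) biases the summary toward steady-state timings by discarding warm-up and GC outliers.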

rcaudy commented Dec 21, 2023

I think you should run with all variations of these configuration settings and see where we're at:

    public static final boolean SKIP_RUN_FIND =
            Configuration.getInstance().getBooleanWithDefault("ChunkedOperatorAggregationHelper.skipRunFind", false);
    static final boolean HASHED_RUN_FIND =
            Configuration.getInstance().getBooleanWithDefault("ChunkedOperatorAggregationHelper.hashedRunFind", true);

@rcaudy rcaudy requested a review from cpwright December 21, 2023 22:49
@rcaudy rcaudy added the query engine, core (Core development tasks), NoDocumentationNeeded, and ReleaseNotesNeeded (Release notes are needed) labels Dec 21, 2023
@rcaudy rcaudy added this to the December 2023 milestone Dec 21, 2023
cpwright previously approved these changes Dec 22, 2023

@cpwright cpwright left a comment

Numbers indicate it makes things pretty much universally better.

Ryan's suggestion of an improved comment makes sense.

Did you consider a version that mixes the hash a bit more? It might reduce locality and get worse in the general case.

rcaudy commented Dec 23, 2023

I'm happy to merge this if @cpwright also approves and the nightlies are run and passing. Separately, I want to do more benchmarking to decide whether we should disable run finding in more scenarios, but that needs to cover other aggregations and should involve working with @stanbrub.

hashSlot = (hashSlot + 1) & context.tableMask;
if (probe == 0) {
// double hashing
probe = 1 + (outputPosition % (context.tableSize - 2));

Are we sure this probe sequence can never result in an infinite loop?

@lbooker42 lbooker42 Dec 28, 2023

Yes. As in Trove, we force tableSize to be a prime number, and our h2() function returns a number in the range 1 <= n < tableSize, which guarantees that the probe step and tableSize have no common factors other than 1 (i.e. they are relatively prime). With no common factors, every probe sequence reaches every cell in the table. Also, since we size the table for a load factor of 0.75, we can be certain that there will always be an empty cell to find.

I was initially worried about small desired capacities (0, 1, 2) but have verified that PrimeFinder#nextPrime() returns 3 as the smallest possible prime (i.e. when provided 0 <= x <= 3), so the small cases are covered properly as well.
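
The termination argument above can be checked with a small standalone sketch. It is an illustration under assumptions, not the HashedRunFinder code: the class `DoubleHashProbe` and its methods are hypothetical, and it assumes the probe wraps with `% tableSize`. The step expression mirrors the `1 + (outputPosition % (tableSize - 2))` line quoted in the review. With a prime table size every step visits all slots; with a composite size a step that shares a factor with the size cycles through only part of the table.

```java
import java.util.BitSet;

public class DoubleHashProbe {
    /** Probe step in [1, tableSize - 2], mirroring the expression quoted in the review. */
    static int probeStep(int outputPosition, int tableSize) {
        return 1 + (outputPosition % (tableSize - 2));
    }

    /** Counts the distinct slots touched by tableSize probes starting at startSlot. */
    static int slotsVisited(int startSlot, int step, int tableSize) {
        BitSet seen = new BitSet(tableSize);
        int slot = startSlot;
        for (int ii = 0; ii < tableSize; ii++) {
            seen.set(slot);
            slot = (slot + step) % tableSize; // assumed wrap-around; not taken from the PR
        }
        return seen.cardinality();
    }

    public static void main(String[] args) {
        int prime = 13;
        for (int pos = 0; pos < 100; pos++) {
            int step = probeStep(pos, prime);
            if (slotsVisited(pos % prime, step, prime) != prime) {
                throw new IllegalStateException("probe sequence missed a slot for position " + pos);
            }
        }
        System.out.println("prime table size " + prime + ": every probe sequence covers all slots");

        // Composite size 12 with step 4: gcd(4, 12) = 4, so only slots 0, 4, 8 are reached
        System.out.println("size 12, step 4 visits " + slotsVisited(0, 4, 12) + " slots");
    }
}
```

For the smallest table size of 3, `tableSize - 2` is 1, so the step is always 1 and the probe degenerates to a linear scan, which still covers the table.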

@lbooker42 lbooker42 merged commit f4f1d2b into deephaven:main Dec 29, 2023
19 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 29, 2023
@lbooker42 lbooker42 deleted the lab-hash-perf branch June 26, 2024 19:57