Skip to content

Commit

Permalink
Configure referenceDB request timeout (facebookincubator#9537)
Browse files Browse the repository at this point in the history
Summary:
When window fuzzer is run for longer durations with Presto as reference DB, it is
observed that some queries do not reach Presto for validation and they fail in
velox with the following error message:
```
W20240418 13:15:48.194890 1896311 AggregationFuzzerBase.cpp:470] Query failed in the reference DB
```
This appears to be because the default timeout of `1000ms` for the http requests
made to Presto is lower and increasing this to a higher value, such as `5000ms`,
fixes the issue (thanks for the fix kgpai). This PR makes the request timeout used
by the `cpr::Get` request in `PrestoQueryRunner` a program argument to
`velox_window_fuzzer_test ` and `velox_aggregation_fuzzer_test`, so it can be
configured as necessary.

Pull Request resolved: facebookincubator#9537

Reviewed By: Yuhta

Differential Revision: D56371144

Pulled By: kgpai

fbshipit-source-id: 2e03df66e20bd54325954a441f9afa27ad11a3f5
  • Loading branch information
pramodsatya authored and facebook-github-bot committed Apr 25, 2024
1 parent b1c993b commit a40dd55
Show file tree
Hide file tree
Showing 7 changed files with 63 additions and 18 deletions.
26 changes: 24 additions & 2 deletions velox/docs/develop/testing/fuzzer.rst
Original file line number Diff line number Diff line change
Expand Up @@ -178,14 +178,21 @@ aggregate functions supported by the engine, and call
``AggregationFuzzerRunner::run()`` defined in `AggregationFuzzerRunner.h`_. See
`AggregationFuzzerTest.cpp`_.

.. _AggregationFuzzerRunner.h: https://github.com/facebookincubator/velox/blob/main/velox/exec/tests/AggregationFuzzer.h
.. _AggregationFuzzerRunner.h: https://github.com/facebookincubator/velox/blob/main/velox/exec/fuzzer/AggregationFuzzer.h

.. _AggregationFuzzerTest.cpp: https://github.com/facebookincubator/velox/blob/main/velox/exec/tests/AggregationFuzzerTest.cpp
.. _AggregationFuzzerTest.cpp: https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/fuzzer/AggregationFuzzerTest.cpp

Aggregation Fuzzer allows to indicate functions whose results depend on the
order of inputs and optionally provide custom result verifiers. The Fuzzer
also allows to provide custom input generators for individual functions.

Integration with the Window Fuzzer is similar to Aggregation Fuzzer. See
`WindowFuzzerRunner.h`_ and `WindowFuzzerTest.cpp`_.

.. _WindowFuzzerRunner.h: https://github.com/facebookincubator/velox/blob/main/velox/exec/fuzzer/WindowFuzzer.h

.. _WindowFuzzerTest.cpp: https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/fuzzer/WindowFuzzerTest.cpp

How to run
----------------------------

Expand Down Expand Up @@ -255,6 +262,21 @@ An example set of arguments to run the expression fuzzer with all features enabl
--repro_persist_path=<a_valid_local_path>
--logtostderr=1``


`WindowFuzzerTest.cpp`_ and `AggregationFuzzerTest.cpp`_ allow results to be
verified against Presto. To setup Presto as a reference DB, please follow these
`instructions`_. The following flags control the connection to the presto
cluster; ``--presto_url`` which is the http server url along with its port number
and ``--req_timeout_ms`` which sets the request timeout in milliseconds. The
timeout is set to 1000 ms by default but can be increased if this time is
insufficient for certain queries. Example command:

::

velox/functions/prestosql/fuzzer:velox_window_fuzzer_test --enable_window_reference_verification --presto_url="http://127.0.0.1:8080" --req_timeout_ms=2000 --duration_sec=60 --logtostderr=1 --minloglevel=0

.. _instructions: https://github.com/facebookincubator/velox/issues/8111

How to reproduce failures
-------------------------------------

Expand Down
8 changes: 6 additions & 2 deletions velox/exec/fuzzer/AggregationFuzzerBase.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -787,7 +787,8 @@ void persistReproInfo(

std::unique_ptr<ReferenceQueryRunner> setupReferenceQueryRunner(
const std::string& prestoUrl,
const std::string& runnerName) {
const std::string& runnerName,
const uint32_t& reqTimeoutMs) {
if (prestoUrl.empty()) {
auto duckQueryRunner = std::make_unique<DuckQueryRunner>();
duckQueryRunner->disableAggregateFunctions({
Expand All @@ -802,7 +803,10 @@ std::unique_ptr<ReferenceQueryRunner> setupReferenceQueryRunner(
LOG(INFO) << "Using DuckDB as the reference DB.";
return duckQueryRunner;
} else {
return std::make_unique<PrestoQueryRunner>(prestoUrl, runnerName);
return std::make_unique<PrestoQueryRunner>(
prestoUrl,
runnerName,
static_cast<std::chrono::milliseconds>(reqTimeoutMs));
LOG(INFO) << "Using Presto as the reference DB.";
}
}
Expand Down
3 changes: 2 additions & 1 deletion velox/exec/fuzzer/AggregationFuzzerBase.h
Original file line number Diff line number Diff line change
Expand Up @@ -335,7 +335,8 @@ void persistReproInfo(
// properly.
std::unique_ptr<ReferenceQueryRunner> setupReferenceQueryRunner(
const std::string& prestoUrl,
const std::string& runnerName);
const std::string& runnerName,
const uint32_t& reqTimeoutMs);

// Returns the function name used in a WindowNode. The input `node` should be a
// pointer to a WindowNode.
Expand Down
2 changes: 1 addition & 1 deletion velox/exec/fuzzer/PrestoQueryRunner.h
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ class PrestoQueryRunner : public velox::exec::test::ReferenceQueryRunner {
PrestoQueryRunner(
std::string coordinatorUri,
std::string user,
std::chrono::milliseconds timeout = std::chrono::milliseconds{1000});
std::chrono::milliseconds timeout);

/// Converts Velox query plan to Presto SQL. Supports Values -> Aggregation or
/// Window with an optional Project on top.
Expand Down
24 changes: 14 additions & 10 deletions velox/exec/tests/PrestoQueryRunnerTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -49,8 +49,10 @@ class PrestoQueryRunnerTest : public ::testing::Test,
// This test requires a Presto Coordinator running at localhost, so disable it
// by default.
TEST_F(PrestoQueryRunnerTest, DISABLED_basic) {
auto queryRunner =
std::make_unique<PrestoQueryRunner>("http://127.0.0.1:8080", "hive");
auto queryRunner = std::make_unique<PrestoQueryRunner>(
"http://127.0.0.1:8080",
"hive",
static_cast<std::chrono::milliseconds>(1000));

auto results = queryRunner->execute("SELECT count(*) FROM nation");
auto expected = makeRowVector({
Expand Down Expand Up @@ -87,8 +89,10 @@ TEST_F(PrestoQueryRunnerTest, DISABLED_fuzzer) {
.project({"a0", "a1", "array_sort(a2)"})
.planNode();

auto queryRunner =
std::make_unique<PrestoQueryRunner>("http://127.0.0.1:8080", "hive");
auto queryRunner = std::make_unique<PrestoQueryRunner>(
"http://127.0.0.1:8080",
"hive",
static_cast<std::chrono::milliseconds>(1000));
auto sql = queryRunner->toSql(plan);
ASSERT_TRUE(sql.has_value());

Expand All @@ -104,8 +108,8 @@ TEST_F(PrestoQueryRunnerTest, DISABLED_fuzzer) {
}

TEST_F(PrestoQueryRunnerTest, sortedAggregation) {
auto queryRunner =
std::make_unique<PrestoQueryRunner>("http://unused", "hive");
auto queryRunner = std::make_unique<PrestoQueryRunner>(
"http://unused", "hive", static_cast<std::chrono::milliseconds>(1000));

auto data = makeRowVector({
makeFlatVector<int64_t>({1, 2, 1, 2, 1}),
Expand Down Expand Up @@ -144,8 +148,8 @@ TEST_F(PrestoQueryRunnerTest, sortedAggregation) {
}

TEST_F(PrestoQueryRunnerTest, distinctAggregation) {
auto queryRunner =
std::make_unique<PrestoQueryRunner>("http://unused", "hive");
auto queryRunner = std::make_unique<PrestoQueryRunner>(
"http://unused", "hive", static_cast<std::chrono::milliseconds>(1000));

auto data =
makeRowVector({makeFlatVector<int64_t>({}), makeFlatVector<int64_t>({})});
Expand All @@ -161,8 +165,8 @@ TEST_F(PrestoQueryRunnerTest, distinctAggregation) {
}

TEST_F(PrestoQueryRunnerTest, toSql) {
auto queryRunner =
std::make_unique<PrestoQueryRunner>("http://unused", "hive");
auto queryRunner = std::make_unique<PrestoQueryRunner>(
"http://unused", "hive", static_cast<std::chrono::milliseconds>(1000));
auto dataType = ROW({"c0", "c1", "c2"}, {BIGINT(), BIGINT(), BOOLEAN()});

// Test window queries.
Expand Down
9 changes: 8 additions & 1 deletion velox/functions/prestosql/fuzzer/AggregationFuzzerTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,12 @@ DEFINE_string(
"source of truth. Otherwise, use DuckDB. Example: "
"--presto_url=http://127.0.0.1:8080");

DEFINE_uint32(
req_timeout_ms,
1000,
"Timeout in milliseconds for HTTP requests made to reference DB, "
"such as Presto. Example: --req_timeout_ms=2000");

namespace facebook::velox::exec::test {
namespace {

Expand Down Expand Up @@ -182,6 +188,7 @@ int main(int argc, char** argv) {
facebook::velox::VectorFuzzer::Options::TimestampPrecision::kMilliSeconds;
return Runner::run(
initialSeed,
setupReferenceQueryRunner(FLAGS_presto_url, "aggregation_fuzzer"),
setupReferenceQueryRunner(
FLAGS_presto_url, "aggregation_fuzzer", FLAGS_req_timeout_ms),
options);
}
9 changes: 8 additions & 1 deletion velox/functions/prestosql/fuzzer/WindowFuzzerTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,12 @@ DEFINE_string(
"source of truth. Otherwise, use DuckDB. Example: "
"--presto_url=http://127.0.0.1:8080");

DEFINE_uint32(
req_timeout_ms,
1000,
"Timeout in milliseconds for HTTP requests made to reference DB, "
"such as Presto. Example: --req_timeout_ms=2000");

namespace facebook::velox::exec::test {
namespace {

Expand Down Expand Up @@ -169,6 +175,7 @@ int main(int argc, char** argv) {
facebook::velox::VectorFuzzer::Options::TimestampPrecision::kMilliSeconds;
return Runner::run(
initialSeed,
setupReferenceQueryRunner(FLAGS_presto_url, "window_fuzzer"),
setupReferenceQueryRunner(
FLAGS_presto_url, "window_fuzzer", FLAGS_req_timeout_ms),
options);
}

0 comments on commit a40dd55

Please sign in to comment.