implement billion scale data #612

kmrmt · 2020-08-05T09:26:30Z

Description:

implement billion scale dataset loader and download job.
If you have a good package name, please suggest that.

Related Issue:

Nothing.

How Has This Been Tested?:

There is not test but benchmark.

Environment:

Go Version: 1.14.4
Docker Version: 19.03.8
Kubernetes Version: 1.18.2
NGT Version: 1.12.0

Types of changes:

Bug fix [type/bug]
New feature [type/feature]
Add tests [type/test]
Security related changes [type/security]
Add documents [type/documentation]
Refactoring [type/refactoring]
Update dependencies [type/dependency]
Update benchmarks and performances [type/bench]
Update CI [type/ci]

Changes to Core Features:

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your core changes, as applicable?
Have you successfully ran tests with your changes locally?

Checklist:

I have read the CONTRIBUTING document.
I have checked open Pull Requests for the similar feature or fixes?
I have added tests and benchmarks to cover my changes.
I have ensured all new and existing tests passed.
I have commented my code, particularly in hard-to-understand areas
I have updated the documentation accordingly.

pull-assistant · 2020-08-05T09:26:32Z

Best reviewed: commit by commit

Optimal code review plan (2 warnings)

implement billion scale data loader

...ssets/x1b/loader_test_bench.go 44% changes removed in fix benchmark

fix dataset to datasets

implement billion scale loader

.../e2e/strategy/stream_remove.go 50% changes removed in fix benchmark timer

fix benchmark timer

fix error handling

fix benchmark timers

🤖 Update license headers / Format go codes and yaml files

add x1b for docker build

stop benchmark if error is occurred

fix benchmark

Powered by Pull Assistant. Last update c3e2cb3 ... 7a57e13. Read the comment docs.

vdaas-ci · 2020-08-05T09:26:55Z

[CHATOPS:HELP] ChatOps commands.

🙆‍♀️ /approve - approve
💌 /changelog - add changelog comment
🍱 /format - format codes and add licenses
✅ /gen-test - generate test codes
🏷️ /label - add labels
⏪ /rebase - rebase master

pkg/tools/cli/loadtest/assets/large_dataset.go

codecov · 2020-08-05T09:35:11Z

Codecov Report

Merging #612 into master will decrease coverage by 0.20%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #612      +/-   ##
==========================================
- Coverage   14.53%   14.32%   -0.21%     
==========================================
  Files         417      419       +2     
  Lines       19432    19420      -12     
==========================================
- Hits         2824     2782      -42     
- Misses      16371    16404      +33     
+ Partials      237      234       -3

Impacted Files	Coverage Δ
hack/benchmark/assets/x1b/loader.go	`0.00% <0.00%> (ø)`
...k/benchmark/core/benchmark/strategy/bulk_insert.go	`0.00% <0.00%> (ø)`
...mark/core/benchmark/strategy/bulk_insert_commit.go	`0.00% <0.00%> (ø)`
hack/benchmark/core/benchmark/strategy/insert.go	`0.00% <0.00%> (ø)`
...benchmark/core/benchmark/strategy/insert_commit.go	`0.00% <0.00%> (ø)`
hack/benchmark/core/benchmark/strategy/search.go	`0.00% <0.00%> (ø)`
hack/benchmark/core/benchmark/strategy/util.go	`0.00% <0.00%> (ø)`
hack/benchmark/internal/e2e/strategy/insert.go	`0.00% <0.00%> (ø)`
hack/benchmark/internal/e2e/strategy/remove.go	`0.00% <0.00%> (ø)`
hack/benchmark/internal/e2e/strategy/search.go	`0.00% <0.00%> (ø)`
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dd60c10...7a57e13. Read the comment docs.

Makefile.d/bench.mk

rinx

LGTM 👍

kmrmt · 2020-08-21T05:17:23Z

@hlts2 I want your advice how way to fix e2e benchmark.
I fixed Dataset interface to have same one small dataset and large dataset.
In this case, it is difficult for me to fix the bulk_insert benchmark.

hack/benchmark/assets/x1b/loader_test_bench.go

hack/benchmark/assets/x1b/loader.go

hack/benchmark/assets/x1b/loader_test_bench.go

hack/benchmark/core/benchmark/strategy/bulk_insert.go

kmrmt · 2020-08-27T07:10:57Z

/rebase
/format

vdaas-ci · 2020-08-27T07:12:37Z

[REBASE] Rebase triggered by kmrmt for branch: feature/benchmark/billion_scale_dataset

vdaas-ci · 2020-08-27T07:13:35Z

[FORMAT] Updating license headers and formatting go codes triggered by kmrmt.

vdaas-ci · 2020-09-24T05:47:23Z

[FORMAT] Updating license headers and formatting go codes triggered by vankichi.

vdaas-ci

[APPROVED] This PR is approved by vankichi.

vankichi · 2020-09-24T05:49:40Z

/rebase
/format
/approve

vdaas-ci · 2020-09-24T05:50:55Z

[REBASE] Rebase triggered by vankichi for branch: feature/benchmark/billion_scale_dataset

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

Signed-off-by: vdaas-ci <ci@vdaas.org>

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

vdaas-ci · 2020-09-24T05:51:55Z

[FORMAT] Updating license headers and formatting go codes triggered by vankichi.

vdaas-ci

[APPROVED] This PR is approved by vankichi.

github-actions

Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit

golangci

pkg/tools/cli/loadtest/assets/large_dataset_test.go|1024| pkg/tools/cli/loadtest/assets/large_dataset_test.go:1024: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/large_dataset_test.go|1098| pkg/tools/cli/loadtest/assets/large_dataset_test.go:1098: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/large_dataset_test.go|1114| pkg/tools/cli/loadtest/assets/large_dataset_test.go:1114: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/large_dataset_test.go|1188| pkg/tools/cli/loadtest/assets/large_dataset_test.go:1188: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/large_dataset_test.go|1204| pkg/tools/cli/loadtest/assets/large_dataset_test.go:1204: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/small_dataset_test.go|51| pkg/tools/cli/loadtest/assets/small_dataset_test.go:51: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/small_dataset_test.go|66| pkg/tools/cli/loadtest/assets/small_dataset_test.go:66: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/small_dataset_test.go|128| pkg/tools/cli/loadtest/assets/small_dataset_test.go:128: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/small_dataset_test.go|140| pkg/tools/cli/loadtest/assets/small_dataset_test.go:140: Line contains TODO/BUG/FIXME: "TODO test cases" (godox)
pkg/tools/cli/loadtest/assets/dataset.go|28| File is not gofumpt-ed (gofumpt)
pkg/tools/cli/loadtest/assets/dataset.go|29 col 2| exported var ErrOutOfBounds should have comment or be unexported (golint)
pkg/tools/cli/loadtest/assets/large_dataset.go|1 col 1| package comment should be of the form "Package assets ..." (golint)
pkg/tools/cli/loadtest/assets/large_dataset.go|54 col 27| error strings should not be capitalized or end with punctuation or a newline (golint)
pkg/tools/cli/loadtest/assets/small_dataset.go|90 col 19| G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
pkg/tools/cli/loadtest/assets/small_dataset.go|91 col 19| G404: Use of weak random number generator (math/rand instead of crypto/rand) (gosec)
pkg/tools/cli/loadtest/assets/hdf5_loader_test.go|300| line is 155 characters (lll)
pkg/tools/cli/loadtest/assets/dataset.go|89 col 2| return statements should not be cuddled if block has more than two lines (wsl)
pkg/tools/cli/loadtest/assets/large_dataset.go|67 col 4| only one cuddle assignment allowed before range statement (wsl)
pkg/tools/cli/loadtest/assets/large_dataset.go|70 col 4| append only allowed to cuddle with appended value (wsl)
pkg/tools/cli/loadtest/assets/large_dataset.go|61 col 3| for statement without condition should never be cuddled (wsl)
pkg/tools/cli/loadtest/assets/small_dataset.go|89 col 4| only one cuddle assignment allowed before range statement (wsl)
pkg/tools/cli/loadtest/assets/small_dataset.go|86 col 3| only one cuddle assignment allowed before range statement (wsl)
pkg/tools/cli/loadtest/assets/large_dataset.go|63 col 7| err113: do not compare errors directly, use errors.Is() instead: "err == ErrOutOfBounds" (goerr113)
pkg/tools/cli/loadtest/assets/hdf5_loader_test.go|16 col 9| package should be assets_test instead of assets (testpackage)
pkg/tools/cli/loadtest/assets/large_dataset_test.go|16 col 9| package should be assets_test instead of assets (testpackage)

pkg/tools/cli/loadtest/assets/small_dataset.go

pkg/tools/cli/loadtest/assets/hdf5_loader_test.go

pkg/tools/cli/loadtest/assets/large_dataset_test.go

github-actions bot added area/makefile area/tools/cli/loadtest team/sre SRE team size/L type/feature New feature priority/medium labels Aug 5, 2020

github-actions bot reviewed Aug 5, 2020

View reviewed changes

pkg/tools/cli/loadtest/assets/large_dataset.go Outdated Show resolved Hide resolved

kmrmt requested a review from rinx August 6, 2020 05:08

rinx reviewed Aug 7, 2020

View reviewed changes

Makefile.d/bench.mk Outdated Show resolved Hide resolved

rinx previously approved these changes Aug 11, 2020

View reviewed changes

kmrmt marked this pull request as draft August 14, 2020 05:16

kmrmt dismissed rinx’s stale review via c79fddd August 21, 2020 04:53

kmrmt force-pushed the feature/benchmark/billion_scale_dataset branch from d7955c5 to c79fddd Compare August 21, 2020 04:53

github-actions bot added team/set SET team size/XXL and removed size/L labels Aug 21, 2020

kmrmt marked this pull request as ready for review August 24, 2020 08:42

kmrmt requested a review from hlts2 August 25, 2020 04:08

hlts2 requested changes Aug 25, 2020

View reviewed changes

vdaas-ci force-pushed the feature/benchmark/billion_scale_dataset branch from ef8733d to 9655fcd Compare August 27, 2020 07:12

kmrmt requested a review from hlts2 August 27, 2020 08:00

github-actions bot added size/XXXL and removed size/XXL labels Aug 27, 2020

vdaas-ci dismissed their stale review via 0023a92 September 24, 2020 05:46

vdaas-ci force-pushed the feature/benchmark/billion_scale_dataset branch from c8c1198 to 0023a92 Compare September 24, 2020 05:46

vdaas-ci previously approved these changes Sep 24, 2020

View reviewed changes

kmrmt and others added 15 commits September 24, 2020 05:51

implement billion scale data loader

c3e2cb3

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

fix dataset to datasets

db838a3

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

implement billion scale loader

7a974a1

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

fix benchmark timer

530f5a9

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

fix error handling

ba2c440

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

fix benchmark timers

c7308f1

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

🤖 Update license headers / Format go codes and yaml files

38464c5

Signed-off-by: vdaas-ci <ci@vdaas.org>

add x1b for docker build

f43d558

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

regenerate tests

2635f81

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

regenerate tests

f64b1cc

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

fix generated test

bdd80f2

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

by reviewdog

868e338

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

by comments

fd16f33

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

stop benchmark if error is occurred

1d9d0ba

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

fix benchmark

7a57e13

Signed-off-by: Kosuke Morimoto <kou.morimoto@gmail.com>

vdaas-ci dismissed their stale review via 7a57e13 September 24, 2020 05:51

vdaas-ci force-pushed the feature/benchmark/billion_scale_dataset branch from 0023a92 to 7a57e13 Compare September 24, 2020 05:51

vdaas-ci approved these changes Sep 24, 2020

View reviewed changes

github-actions bot reviewed Sep 24, 2020

View reviewed changes

vankichi merged commit b96c5d0 into master Sep 24, 2020

vankichi deleted the feature/benchmark/billion_scale_dataset branch September 24, 2020 06:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

implement billion scale data #612

implement billion scale data #612

kmrmt commented Aug 5, 2020 •

edited

Loading

pull-assistant bot commented Aug 5, 2020 •

edited

Loading

vdaas-ci commented Aug 5, 2020

codecov bot commented Aug 5, 2020 •

edited

Loading

rinx left a comment

kmrmt commented Aug 21, 2020

kmrmt commented Aug 27, 2020

vdaas-ci commented Aug 27, 2020

vdaas-ci commented Aug 27, 2020

vdaas-ci commented Sep 24, 2020

vdaas-ci left a comment

vankichi commented Sep 24, 2020

vdaas-ci commented Sep 24, 2020

vdaas-ci commented Sep 24, 2020

vdaas-ci left a comment

github-actions bot left a comment

implement billion scale data #612

implement billion scale data #612

Conversation

kmrmt commented Aug 5, 2020 • edited Loading

Description:

Related Issue:

How Has This Been Tested?:

Environment:

Types of changes:

Changes to Core Features:

Checklist:

pull-assistant bot commented Aug 5, 2020 • edited Loading

vdaas-ci commented Aug 5, 2020

codecov bot commented Aug 5, 2020 • edited Loading

Codecov Report

rinx left a comment

Choose a reason for hiding this comment

kmrmt commented Aug 21, 2020

kmrmt commented Aug 27, 2020

vdaas-ci commented Aug 27, 2020

vdaas-ci commented Aug 27, 2020

vdaas-ci commented Sep 24, 2020

vdaas-ci left a comment

Choose a reason for hiding this comment

vankichi commented Sep 24, 2020

vdaas-ci commented Sep 24, 2020

vdaas-ci commented Sep 24, 2020

vdaas-ci left a comment

Choose a reason for hiding this comment

github-actions bot left a comment

Choose a reason for hiding this comment

kmrmt commented Aug 5, 2020 •

edited

Loading

pull-assistant bot commented Aug 5, 2020 •

edited

Loading

codecov bot commented Aug 5, 2020 •

edited

Loading