
Running Workload D and E in Parallel #1061

Closed
ntrhieu89 opened this issue Nov 12, 2017 · 4 comments

Comments

ntrhieu89 commented Nov 12, 2017

Hi,

I am trying to run workloads D and E in parallel (using multiple nodes to issue operations).
The problem is the insert operation: how can I prevent two different nodes from inserting the same new user into the data store?

This link https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload-in-Parallel gives instructions for loading the data store with multiple nodes, but doesn't mention anything about the run phase.

Thanks,
Hieu Nguyen

busbey (Collaborator) commented May 19, 2018

The `insertstart` and `insertcount` properties are used to constrain the keys chosen for all operations, including during the run phase.
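As an illustration of the idea (not YCSB code), the key space can be split into disjoint per-client ranges, with each client passing its pair as `-p insertstart=...` and `-p insertcount=...`. A minimal sketch, assuming a simple even partitioning:

```python
# Illustrative sketch: split a total key space into disjoint
# (insertstart, insertcount) ranges, one per parallel client.

def partition_keyspace(total_records: int, num_clients: int):
    """Return a list of (insertstart, insertcount) pairs, one per client."""
    base = total_records // num_clients
    parts = []
    start = 0
    for i in range(num_clients):
        # Give any remainder to the last client so the ranges cover everything.
        count = base if i < num_clients - 1 else total_records - start
        parts.append((start, count))
        start += count
    return parts

print(partition_keyspace(1_000_000, 2))
# → [(0, 500000), (500000, 500000)]
```

Because the ranges are disjoint, no two clients can choose the same insert key.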

busbey (Collaborator) commented Jun 14, 2018

Closing as stale. If you're still having trouble, please reopen.

perdelt commented Apr 6, 2023

Hi, I still have this problem. I load the data store with:

recordcount=1000000
operationcount=500000
workload=site.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0
updateproportion=0
scanproportion=0.95
insertproportion=0.05
requestdistribution=zipfian
maxscanlength=100
scanlengthdistribution=uniform
insertstart=0
insertcount=500000

and

recordcount=1000000
operationcount=500000
workload=site.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0
updateproportion=0
scanproportion=0.95
insertproportion=0.05
requestdistribution=zipfian
maxscanlength=100
scanlengthdistribution=uniform
insertstart=500000
insertcount=500000

without a problem. I then run two processes for workload E.

First:

recordcount=2000000
operationcount=500000
workload=site.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0
updateproportion=0
scanproportion=0.95
insertproportion=0.05
requestdistribution=zipfian
maxscanlength=100
scanlengthdistribution=uniform
insertstart=1000000
insertcount=500000

Second:

recordcount=2000000
operationcount=500000
workload=site.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0
updateproportion=0
scanproportion=0.95
insertproportion=0.05
requestdistribution=zipfian
maxscanlength=100
scanlengthdistribution=uniform
insertstart=1500000
insertcount=500000

and I receive a lot of primary-key (PK) violation errors.
Can you help me? Where am I going wrong?
Many thanks!

seybi87 (Contributor) commented Apr 20, 2023

Hi @perdelt

the problems with the PK violations are caused by the specified recordcount values in the parallel RUN phase. The recordcount also specifies the upper bound of the key range for inserted records, i.e. in the RUN phase you have specified recordcount=2000000 for both workloads.

In consequence, Workloads E-1 and E-2 will both insert records from the key range insertstart to recordcount:

  • Workload E-1: 1000000 to 2000000
  • Workload E-2: 1500000 to 2000000

Setting the recordcount=1500000 for Workload E-1 should solve the issue.
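The arithmetic behind this can be sketched as follows (assumed semantics, per the explanation above: run-phase inserts draw keys from `insertstart` up to `recordcount`):

```python
# Sketch of the insert key ranges implied by the two run-phase configs,
# assuming inserts draw keys from [insertstart, recordcount).

def insert_range(insertstart: int, recordcount: int) -> range:
    return range(insertstart, recordcount)

def overlap(a: range, b: range) -> int:
    """Number of keys both workloads may insert."""
    return max(0, min(a.stop, b.stop) - max(a.start, b.start))

# Original (overlapping) configuration: both use recordcount=2000000.
e1 = insert_range(1_000_000, 2_000_000)
e2 = insert_range(1_500_000, 2_000_000)
print(overlap(e1, e2))  # → 500000 contended keys, hence the PK violations

# Suggested fix: cap Workload E-1 at recordcount=1500000.
e1_fixed = insert_range(1_000_000, 1_500_000)
print(overlap(e1_fixed, e2))  # → 0, the ranges are disjoint
```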
