Benchmark's memory requirement #9

Open
alihadian opened this issue Jun 28, 2020 · 7 comments
Comments

@alihadian

alihadian commented Jun 28, 2020

On a system with 16GB of memory, the benchmark crashes while generating the lookups, right after the build finishes:

[100%] Built target benchmark
Generating lookups for osm_cellids_200M_uint64
Generating lookups for osm_cellids_400M_uint64
Generating lookups for osm_cellids_600M_uint64
read 600000000 values from ../data/osm_cellids_600M_uint64 in 2095 ms (286.396 M values/s)
terminate called after throwing an instance of 'std::bad_alloc'

I am planning to upgrade the memory, so I wonder what the peak memory usage of the benchmark is. Do you have a rough estimate?

Apparently a bit more than 16GB is enough to generate the lookups for osm_cellids_600M_uint64, but the benchmark execution itself could perhaps take more memory, right?

@alexandervanrenen
Contributor

Yeah, I think the larger datasets simply require more than 16GB to generate. You have two options.

  1. You can skip the larger datasets and only run the smaller (200M keys) ones. For this, just remove the corresponding lines from the prepare.sh script.
  2. Allow swapping. 32GB in total should be sufficient. However, this might blow up the time it takes to generate...

Hope this helps :)

@alihadian
Author

alihadian commented Jun 28, 2020

Thanks for your suggestions, @alexandervanrenen.
Unfortunately, the benchmark doesn't even run on the 200M-key datasets. That is surprising, as the previous version (https://github.com/learnedsystems/SOSD/tree/mlforsys19) could easily run datasets of this size on a system with 16GB of memory (~15GB free memory).

Here is the sample output when I comment out the 400M, 600M, and 800M datasets and only try to run the 200M-record ones (GCC 10):

Executing benchmark and saving results...
Executing workload osm_cellids_200M_uint64
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/osm_cellids_200M_uint64 in 3616 ms (55.3097 M values/s)
data is unique
read 10000000 values from ./data/osm_cellids_200M_uint64_equality_lookups_10M in 388 ms (25.7732 M values/s)
RESULT: RMI,0,228.528,402653216,0,BinarySearch
RESULT: RMI,1,241.258,201326624,0,BinarySearch
RESULT: RMI,2,263.921,100663328,0,BinarySearch
RESULT: RMI,3,313.02,41943040,0,BinarySearch
RESULT: RMI,4,341.368,12582944,0,BinarySearch
RESULT: RMI,5,373.992,6291488,0,BinarySearch
RESULT: RMI,6,407.533,1835008,0,BinarySearch
RESULT: RMI,7,530.899,786448,0,BinarySearch
RESULT: RMI,8,805.79,24592,0,BinarySearch
RESULT: RMI,9,978.256,3088,0,BinarySearch
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Executing workload wiki_ts_200M_uint64
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/wiki_ts_200M_uint64 in 3967 ms (50.4159 M values/s)
data contains duplicates
read 10000000 values from ./data/wiki_ts_200M_uint64_equality_lookups_10M in 356 ms (28.0899 M values/s)
RESULT: RMI,0,153.876,402653216,0,BinarySearch
RESULT: RMI,1,156.456,201326624,0,BinarySearch
RESULT: RMI,2,157.162,100663328,0,BinarySearch
RESULT: RMI,3,168.869,25165856,0,BinarySearch
RESULT: RMI,4,172.909,12582944,0,BinarySearch
RESULT: RMI,5,177.13,6291488,0,BinarySearch
RESULT: RMI,6,179.807,3145760,0,BinarySearch
RESULT: RMI,7,219.125,786464,0,BinarySearch
RESULT: RMI,8,473.818,24608,0,BinarySearch
RESULT: RMI,9,616.995,3088,0,BinarySearch
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Executing workload books_200M_uint64
...

In the previous version, the key-value pairs for 200M records took 3.2 GB (64-bit key and 64-bit payload), plus another 3.2 GB when building any index (copying the data to data_), plus the index size. It's surprising that the benchmark can't handle datasets of that size anymore.
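The arithmetic behind that estimate can be written down as a small sketch. It assumes 8-byte keys and 8-byte payloads stored in a flat array and one extra copy made while building an index, as described for the previous version; the function names are hypothetical, and the index size used in the example is the fourth field of the largest RMI RESULT line, which is assumed here to be a size in bytes.

```python
GB = 10**9  # decimal gigabytes, matching the 3.2 GB figure above


def key_payload_bytes(num_records: int, key_bytes: int = 8, payload_bytes: int = 8) -> int:
    """Bytes needed to hold num_records (key, payload) pairs in a flat array."""
    return num_records * (key_bytes + payload_bytes)


def estimated_peak_bytes(num_records: int, index_bytes: int = 0, copies: int = 2) -> int:
    """Rough lower bound: `copies` arrays of pairs plus the index itself.

    copies=2 models the original data plus the copy made while building an index.
    Allocator overhead and temporary buffers are NOT included.
    """
    return copies * key_payload_bytes(num_records) + index_bytes


data = key_payload_bytes(200_000_000)
print(f"200M pairs: {data / GB:.1f} GB")

# 402653216 is assumed to be an index size in bytes (largest RMI variant in the log).
peak = estimated_peak_bytes(200_000_000, index_bytes=402_653_216)
print(f"200M pairs, two copies plus index: {peak / GB:.1f} GB")

print(f"600M pairs, two copies: {estimated_peak_bytes(600_000_000) / GB:.1f} GB")
```

Under these assumptions a 200M-record dataset stays well below 16GB, while two copies of a 600M-record dataset already exceed it. This is only a lower bound and does not explain the crashes seen on the 200M datasets.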

@alexandervanrenen
Contributor

Ok, for this one I am not sure what is happening (I have not seen it before). It might be that RMI is not freeing some memory or that RS is allocating too much memory. Have any of you seen this one, @RyanMarcus @andreaskipf?

@andreaskipf
Contributor

andreaskipf commented Jun 30, 2020

Not sure what changed w.r.t. the memory requirements (I have no issue running the benchmark on a machine with 32GiB RAM). However, we will replace the current RS implementation soon with the new one, which operates directly on the input array without creating a copy. I'll ping this thread once this is done.

@alexandervanrenen
Contributor

Ok, I will investigate ... should be easy enough to figure out :)

@andreaskipf
Contributor

We have just replaced RS with the new version. @alihadian can you please verify whether you can now build it on your 16GiB machine? Thanks!

@alihadian
Author

alihadian commented Jul 25, 2020

> We have just replaced RS with the new version. @alihadian can you please verify whether you can now build it on your 16GiB machine? Thanks!

[Wrong output was posted in my previous comment]

Thanks. If you want to make the benchmark hands-off on 16GB, then prepare.sh must take into account the selected datasets defined in datasets_under_test.txt. The script currently tries to load all datasets and generate queries for all datasets (including the 400M & 600M-record ones), and hence crashes.
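A minimal sketch of the suggested behaviour, not the project's actual script: read the selected dataset names first and generate lookups only for those. The file format (one dataset name per line, `#` starting a comment), the default path, and the function names are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def selected_datasets(lines: Iterable[str]) -> List[str]:
    """Return dataset names, skipping blank lines and '#' comments."""
    names = []
    for line in lines:
        name = line.split("#", 1)[0].strip()  # drop trailing comments
        if name:
            names.append(name)
    return names


def read_selected(path: str = "datasets_under_test.txt") -> List[str]:
    """Load the selection from a file (hypothetical location)."""
    return selected_datasets(Path(path).read_text().splitlines())


# Example selection with the 400M dataset commented out.
example = """\
osm_cellids_200M_uint64
# osm_cellids_400M_uint64
wiki_ts_200M_uint64
"""

for name in selected_datasets(example.splitlines()):
    # A real preparation step would load the dataset and generate lookups here.
    print(f"would generate lookups for {name}")
```

With a filter like this in front of the lookup generation, datasets that are commented out of the selection would never be loaded, so they could not exhaust memory during preparation.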

I manually commented out the 400M+ datasets from prepare.sh and datasets_under_test.txt, but some algorithms still crash:

Executing workload osm_cellids_200M_uint64
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/osm_cellids_200M_uint64 in 15702 ms (12.7372 M values/s)
data is unique
read 10000000 values from ./data/osm_cellids_200M_uint64_equality_lookups_10M in 2070 ms (4.83092 M values/s)
RESULT: RMI,0,1010.59,402653216,0,BinarySearch
RESULT: RMI,1,1074.33,201326624,0,BinarySearch
RESULT: RMI,2,1095.28,100663328,0,BinarySearch
RESULT: RMI,3,1204.58,41943040,0,BinarySearch
RESULT: RMI,4,1360.01,12582944,0,BinarySearch
RESULT: RMI,5,1403.15,6291488,0,BinarySearch
RESULT: RMI,6,1793.47,1835008,0,BinarySearch
RESULT: RMI,7,2218.07,786448,0,BinarySearch
RESULT: RMI,8,3657.79,24592,0,BinarySearch
RESULT: RMI,9,5234.25,3088,0,BinarySearch
RESULT: RS,1,641.705,507999944,8002877965,BinarySearch
RESULT: RS,2,853.138,251011320,4622782514,BinarySearch
RESULT: RS,3,926.852,127129548,3979255691,BinarySearch
RESULT: RS,4,1026.19,63095928,3763571541,BinarySearch
RESULT: RS,5,1154.16,31774880,3650606262,BinarySearch
RESULT: RS,6,1110.11,15948212,3633930730,BinarySearch
RESULT: RS,7,1106.99,7995228,3461003894,BinarySearch
RESULT: RS,8,1069.86,3994788,3576537020,BinarySearch
RESULT: RS,9,1166.48,1998144,3562924055,BinarySearch
RESULT: RS,10,1207.08,1998144,3363430008,BinarySearch
RESULT: PGM,16,1052.47,27793600,15803107461,BinarySearch
RESULT: PGM,4,1119.51,118835160,18404535380,BinarySearch
RESULT: PGM,8,980.764,57372800,16146689882,BinarySearch
RESULT: PGM,32,1032.2,13610680,14305571668,BinarySearch
RESULT: PGM,64,1161.61,6735780,13661276135,BinarySearch
RESULT: PGM,256,1262.36,1681180,13233111712,BinarySearch
RESULT: PGM,1024,1257.17,432980,12709408050,BinarySearch
RESULT: PGM,2048,1499.83,220880,12661867611,BinarySearch
RESULT: PGM,4096,1686.77,114060,12548411211,BinarySearch
RESULT: PGM,8192,1759.52,59220,12633139100,BinarySearch
Executing workload wiki_ts_200M_uint64
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/wiki_ts_200M_uint64 in 10557 ms (18.9448 M values/s)
data contains duplicates
read 10000000 values from ./data/wiki_ts_200M_uint64_equality_lookups_10M in 528 ms (18.9394 M values/s)
RESULT: RMI,0,657.774,402653216,0,BinarySearch
RESULT: RMI,1,744.178,201326624,0,BinarySearch
RESULT: RMI,2,696.591,100663328,0,BinarySearch
RESULT: RMI,3,640.246,25165856,0,BinarySearch
RESULT: RMI,4,686.239,12582944,0,BinarySearch
RESULT: RMI,5,706.922,6291488,0,BinarySearch
RESULT: RMI,6,750.191,3145760,0,BinarySearch
RESULT: RMI,7,942.32,786464,0,BinarySearch
RESULT: RMI,8,2004.11,24608,0,BinarySearch
RESULT: RMI,9,2584.12,3088,0,BinarySearch
RESULT: RS,1,851.499,506867068,3673072594,BinarySearch
RESULT: RS,2,920.771,247625288,2619266961,BinarySearch
RESULT: RS,3,885.471,125076056,2422303161,BinarySearch
RESULT: RS,4,880.775,63877208,2333537346,BinarySearch
RESULT: RS,5,908.698,31917576,2316853214,BinarySearch
RESULT: RS,6,942.354,15909560,2169338252,BinarySearch
RESULT: RS,7,962.092,7976336,2320432160,BinarySearch
RESULT: RS,8,1075.88,3999204,2117304207,BinarySearch
RESULT: RS,9,850.06,1999416,2216469260,BinarySearch
RESULT: RS,10,940.018,1000212,2141017992,BinarySearch
RESULT: PGM,16,768.075,4604720,6691198227,BinarySearch
RESULT: PGM,4,826.544,45559380,8861879257,BinarySearch
RESULT: PGM,8,807.3,14269500,7485016008,BinarySearch
RESULT: PGM,32,819.202,1722040,6132314744,BinarySearch
RESULT: PGM,64,812.763,754920,5947471202,BinarySearch
RESULT: PGM,256,1005.45,222760,6259760928,BinarySearch
RESULT: PGM,1024,1224.93,89780,6813015896,BinarySearch
RESULT: PGM,2048,1487.19,54580,7067813374,BinarySearch
RESULT: PGM,4096,1701.52,38880,7487636636,BinarySearch
RESULT: PGM,8192,1872.6,20140,7084727705,BinarySearch
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
index ART is not applicable
RESULT: BTree,32,1187.93,116016232,87378266,BinarySearch
Executing workload books_200M_uint64
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/books_200M_uint64 in 9139 ms (21.8842 M values/s)
data is unique
read 10000000 values from ./data/books_200M_uint64_equality_lookups_10M in 507 ms (19.7239 M values/s)
RESULT: RMI,0,538.599,402653216,0,BinarySearch
RESULT: RMI,1,728.999,201326608,0,BinarySearch
RESULT: RMI,2,711.099,100663312,0,BinarySearch
RESULT: RMI,3,645.941,41943040,0,BinarySearch
RESULT: RMI,4,745.704,12582928,0,BinarySearch
RESULT: RMI,5,797.9,6291472,0,BinarySearch
RESULT: RMI,6,770.18,786464,0,BinarySearch
RESULT: RMI,7,1016.37,24608,0,BinarySearch
RESULT: RMI,8,1196.07,6160,0,BinarySearch
RESULT: RMI,9,1391.49,3088,0,BinarySearch
RESULT: RS,1,627.79,505165832,5832526494,BinarySearch
RESULT: RS,2,655.512,253054936,4165524366,BinarySearch
RESULT: RS,3,695.846,125721624,3874965837,BinarySearch
RESULT: RS,4,861.732,56216360,3365617111,BinarySearch
RESULT: RS,5,807.659,31050536,3388414667,BinarySearch
RESULT: RS,6,858.398,14957704,3119480145,BinarySearch
RESULT: RS,7,810.165,7976712,3171154769,BinarySearch
RESULT: RS,8,950.973,3965432,3050638318,BinarySearch
RESULT: RS,9,836.549,1999352,3003286558,BinarySearch
RESULT: RS,10,858.645,1000888,2962058313,BinarySearch
RESULT: PGM,16,816.247,15418080,11628441556,BinarySearch
RESULT: PGM,4,939.067,142554300,18119772055,BinarySearch
RESULT: PGM,8,796.124,45176040,13506001539,BinarySearch
RESULT: PGM,32,773.995,5268520,10054656238,BinarySearch
RESULT: PGM,64,821.887,1615360,8962632296,BinarySearch
RESULT: PGM,256,1012.33,119800,7457546186,BinarySearch
RESULT: PGM,1024,1145.81,9300,6938414580,BinarySearch
RESULT: PGM,2048,1268.6,4160,6848001357,BinarySearch
RESULT: PGM,4096,1378.11,2300,7004965760,BinarySearch
RESULT: PGM,8192,1606.61,1480,7058185953,BinarySearch
Executing workload fb_200M_uint64
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/fb_200M_uint64 in 21916 ms (9.12575 M values/s)
data is unique
read 10000000 values from ./data/fb_200M_uint64_equality_lookups_10M in 1758 ms (5.68828 M values/s)
RESULT: RMI,0,891.706,402653200,0,BinarySearch
RESULT: RMI,1,865.73,201326608,0,BinarySearch
RESULT: RMI,2,933.047,100663312,0,BinarySearch
RESULT: RMI,3,901.982,25165840,0,BinarySearch
RESULT: RMI,4,1068,12582928,0,BinarySearch
RESULT: RMI,5,1023.22,6291472,0,BinarySearch
RESULT: RMI,6,1075.06,3145744,0,BinarySearch
RESULT: RMI,7,1134.2,786448,0,BinarySearch
RESULT: RMI,8,1505.35,24592,0,BinarySearch
RESULT: RMI,9,1938.53,3088,0,BinarySearch
RESULT: RS,1,2242.48,508497300,5382345445,BinarySearch
RESULT: RS,2,1612.51,255289316,3997186516,BinarySearch
RESULT: RS,3,1697.66,125577380,4065322793,BinarySearch
RESULT: RS,4,1400.74,63861140,3492300676,BinarySearch
RESULT: RS,5,1425.41,31846612,3720742681,BinarySearch
RESULT: RS,6,1272.41,15994836,3371243135,BinarySearch
RESULT: RS,7,1255.25,7982532,3372459774,BinarySearch
RESULT: RS,8,1422.1,3995812,3944594772,BinarySearch
RESULT: RS,9,1285.06,3995812,3744742735,BinarySearch
RESULT: RS,10,1347.09,3995812,3323231016,BinarySearch
RESULT: PGM,16,983.225,43774520,17042005299,BinarySearch
RESULT: PGM,4,1124.01,173186120,20824760678,BinarySearch
RESULT: PGM,8,1154.59,87810720,17319094311,BinarySearch
RESULT: PGM,32,1006.56,21637120,15224770298,BinarySearch
RESULT: PGM,64,1049.02,10639900,14408312276,BinarySearch
RESULT: PGM,256,1084.49,2413640,13264859290,BinarySearch
RESULT: PGM,1024,1238.33,371400,11098957265,BinarySearch
RESULT: PGM,2048,1341.34,118520,9836834009,BinarySearch
RESULT: PGM,4096,1529.15,34160,8846405597,BinarySearch
RESULT: PGM,8192,1671.03,9660,8056823886,BinarySearch
Executing workload books_200M_uint32
Repeating lookup code 1 time(s).
Using 1 thread(s).
read 200000000 values from ./data/books_200M_uint32 in 9247 ms (21.6286 M values/s)
data contains duplicates
read 10000000 values from ./data/books_200M_uint32_equality_lookups_10M in 7359 ms (1.35888 M values/s)
RESULT: RMI,0,579.371,402653216,0,BinarySearch
RESULT: RMI,1,625.251,201326624,0,BinarySearch
RESULT: RMI,2,729.407,100663328,0,BinarySearch
RESULT: RMI,3,607.383,41943040,0,BinarySearch
RESULT: RMI,4,713,12582944,0,BinarySearch
RESULT: RMI,5,796.667,6291488,0,BinarySearch
RESULT: RMI,6,833.428,3145760,0,BinarySearch
RESULT: RMI,7,901.001,786464,0,BinarySearch
RESULT: RMI,8,1212.67,24608,0,BinarySearch
RESULT: RMI,9,1447.71,3088,0,BinarySearch
RESULT: RS,1,579.732,493374828,5437196574,BinarySearch
RESULT: RS,2,547.162,241716588,4266883872,BinarySearch
RESULT: RS,3,834.094,127805212,3885179565,BinarySearch
RESULT: RS,4,776.516,63887116,3886596694,BinarySearch
RESULT: RS,5,834.514,31993964,3679867951,BinarySearch
RESULT: RS,6,822.483,15916236,3236048582,BinarySearch
RESULT: RS,7,772.506,7907100,3250247235,BinarySearch
RESULT: RS,8,825.804,3957404,3045642203,BinarySearch
RESULT: RS,9,881.573,1991324,3021639997,BinarySearch
RESULT: RS,10,950.948,999564,2988914402,BinarySearch
RESULT: PGM,16,782.688,11703200,11996685299,BinarySearch
RESULT: PGM,4,818.124,70640688,15973491758,BinarySearch
RESULT: PGM,8,779.107,29643568,13155445944,BinarySearch
RESULT: PGM,32,748.399,4119136,10104298748,BinarySearch
RESULT: PGM,64,826.829,1293168,8914442329,BinarySearch
RESULT: PGM,256,954.821,97760,7234706040,BinarySearch
RESULT: PGM,1024,1184.74,7552,6619491073,BinarySearch
RESULT: PGM,2048,1295.49,3312,6569362634,BinarySearch
RESULT: PGM,4096,1461.65,1904,6614974335,BinarySearch
RESULT: PGM,8192,1567.02,1152,6722357169,BinarySearch
RESULT: BTree,32,1121.36,87075880,41689096,BinarySearch
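To see at a glance which index families reported results for each workload in a log like the one above, a short parser can help. This is a sketch that only relies on the line shapes visible in the log (`Executing workload <name>` and `RESULT: <index>,...`); it does not interpret the numeric fields, whose meaning is not stated in this thread. The function name is hypothetical and the sample log is an abridged excerpt of the output above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def results_per_workload(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each workload to the index names that printed RESULT lines,
    in order of first appearance."""
    seen: Dict[str, List[str]] = defaultdict(list)
    workload = None
    for line in lines:
        line = line.strip()
        if line.startswith("Executing workload "):
            workload = line[len("Executing workload "):]
            seen[workload]  # register the workload even if nothing reports
        elif line.startswith("RESULT: ") and workload is not None:
            index_name = line[len("RESULT: "):].split(",", 1)[0]
            if index_name not in seen[workload]:
                seen[workload].append(index_name)
    return dict(seen)


log = """\
Executing workload osm_cellids_200M_uint64
RESULT: RMI,9,5234.25,3088,0,BinarySearch
RESULT: RS,1,641.705,507999944,8002877965,BinarySearch
RESULT: PGM,16,1052.47,27793600,15803107461,BinarySearch
Executing workload wiki_ts_200M_uint64
RESULT: RMI,0,657.774,402653216,0,BinarySearch
RESULT: BTree,32,1187.93,116016232,87378266,BinarySearch
"""

for workload, indexes in results_per_workload(log.splitlines()).items():
    print(workload, indexes)
```

Applied to the full output, such a summary makes the gaps easy to spot: for example, BTree results appear only for wiki_ts_200M_uint64 and books_200M_uint32 in the log above.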
