Create new test that creates lots of splits #267

Merged: 9 commits, Jan 5, 2024


DomGarguilo
Member

Fixes #266

  • refactored continuous ingest so it can be used by other classes
  • added a test that creates several tables with data, lowers the split threshold on those tables, and checks that the expected number of splits eventually occurs, reporting the time taken for each iteration
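The wait-for-splits step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `SplitWaitSketch`, its methods, and the `Supplier` that stands in for reading per-tablet file sizes from Accumulo are all hypothetical names, chosen so the polling logic can be run without a cluster.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: the Supplier stands in for whatever reads the
// summed file size of each tablet from Accumulo.
class SplitWaitSketch {

  // Counts tablets whose summed file size is still above the threshold.
  static long countOversized(List<Long> tabletSizes, long thresholdBytes) {
    return tabletSizes.stream().filter(size -> size > thresholdBytes).count();
  }

  // Polls until no tablet exceeds the threshold or maxPolls is used up.
  // Returns the number of polls it took, or -1 if the limit was reached
  // or the thread was interrupted while sleeping.
  static int waitForSplits(Supplier<List<Long>> tabletSizes, long thresholdBytes,
      int maxPolls, long sleepMillis) {
    for (int poll = 1; poll <= maxPolls; poll++) {
      if (countOversized(tabletSizes.get(), thresholdBytes) == 0) {
        return poll;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return -1;
      }
    }
    return -1;
  }
}
```

Bounding the number of polls is a design choice of this sketch; it keeps a stuck split from hanging the run forever.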

@DomGarguilo DomGarguilo self-assigned this Dec 8, 2023
@DomGarguilo
Member Author

I marked this as WIP because the test does not yet accept any args to configure the numbers it uses; they are all hardcoded.

It seems the values that are tested against need to be known before the test starts. I guess that is fine, but it makes passing args to this test fragile.

@DomGarguilo
Member Author

DomGarguilo commented Dec 14, 2023

Everything here works; I just need to do the following:

  • remove usage of non-public Accumulo code
  • accept and make use of an initial table splits param from the props file
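As a rough illustration of the second item, reading an initial tablet count from a props file could look like the sketch below. The property key and the class `InitialSplitsSketch` are hypothetical and are not taken from this PR or from the accumulo-testing props file. The default of 1 tablet mirrors the single-tablet table the test currently creates.

```java
import java.util.Properties;

// Hypothetical sketch: the key below is made up for illustration only.
class InitialSplitsSketch {

  static final String INITIAL_TABLETS_KEY = "example.manysplits.initial.tablets";

  // Reads the initial tablet count, defaulting to 1 when the key is absent.
  static int initialTablets(Properties props) {
    String raw = props.getProperty(INITIAL_TABLETS_KEY, "1");
    int tablets = Integer.parseInt(raw.trim());
    if (tablets < 1) {
      throw new IllegalArgumentException("initial tablets must be >= 1, got " + tablets);
    }
    return tablets;
  }

  // A table with N tablets needs N - 1 split points.
  static int splitPointsNeeded(int tablets) {
    return tablets - 1;
  }
}
```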

@DomGarguilo DomGarguilo changed the title WIP - Create new test that creates lots of splits Create new test that creates lots of splits Dec 15, 2023
Co-authored-by: Keith Turner <kturner@apache.org>
@DomGarguilo
Member Author

Here is what the test output looks like currently:

2023-12-15T17:23:41,394 [testing.continuous.ManySplits] INFO : Properties being used to create tables for this test: {table.split.threshold=1G}
2023-12-15T17:23:41,395 [testing.continuous.ManySplits] INFO : Creating initial table: manysplits.table1
2023-12-15T17:23:41,754 [testing.continuous.CreateTable] INFO : Created Accumulo table manysplits.table1 with 1 tablets
2023-12-15T17:23:41,755 [testing.continuous.ManySplits] INFO : Ingesting 10000000 entries into first table, manysplits.table1.
2023-12-15T17:23:41,763 [testing.continuous.ContinuousIngest] INFO : Ingest instance ID: fbd7ecff-2dda-49da-bda0-d2169f2870c5 current time: 1702679021763ms
2023-12-15T17:23:41,763 [testing.continuous.ContinuousIngest] INFO : A flush will occur after every 1000000 entries written
2023-12-15T17:23:41,820 [testing.continuous.ContinuousIngest] INFO : Total entries to be written: 10000000
2023-12-15T17:23:41,822 [testing.continuous.ContinuousIngest] INFO : DELETES will occur with a probability of 0.00
2023-12-15T17:23:46,600 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 1148ms, since last flush: 4780ms, total written: 1000000, total deleted: 0
2023-12-15T17:23:51,798 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 444ms, since last flush: 5198ms, total written: 2000000, total deleted: 0
2023-12-15T17:23:58,619 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 500ms, since last flush: 6821ms, total written: 3000000, total deleted: 0
2023-12-15T17:24:04,419 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 412ms, since last flush: 5800ms, total written: 4000000, total deleted: 0
2023-12-15T17:24:10,281 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 520ms, since last flush: 5862ms, total written: 5000000, total deleted: 0
2023-12-15T17:24:18,411 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 1000ms, since last flush: 8130ms, total written: 6000000, total deleted: 0
2023-12-15T17:24:23,261 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 485ms, since last flush: 4850ms, total written: 7000000, total deleted: 0
2023-12-15T17:24:28,543 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 389ms, since last flush: 5282ms, total written: 8000000, total deleted: 0
2023-12-15T17:24:34,605 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 362ms, since last flush: 6062ms, total written: 9000000, total deleted: 0
2023-12-15T17:24:40,499 [testing.continuous.ContinuousIngest] INFO : FLUSH - duration: 558ms, since last flush: 5894ms, total written: 10000000, total deleted: 0
2023-12-15T17:24:40,545 [testing.continuous.ManySplits] INFO : Creating 2 more tables by cloning the first
2023-12-15T17:24:48,539 [testing.continuous.ManySplits] INFO : Changing split threshold on all tables from 1G to 102M
2023-12-15T17:24:48,580 [testing.continuous.ManySplits] INFO : Waiting for each tablet to have a sum file size <= 102M
2023-12-15T17:24:50,605 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 102M on table manysplits.table2. Diff of avg offending file(s): 194M
2023-12-15T17:24:50,605 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 102M on table manysplits.table1. Diff of avg offending file(s): 197M
2023-12-15T17:24:50,605 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 102M on table manysplits.table3. Diff of avg offending file(s): 194M
....
2023-12-15T17:25:35,736 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 102M on table manysplits.table2. Diff of avg offending file(s): 52M
2023-12-15T17:25:35,736 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 102M on table manysplits.table1. Diff of avg offending file(s): 52M
2023-12-15T17:25:35,736 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 102M on table manysplits.table3. Diff of avg offending file(s): 52M
2023-12-15T17:25:36,740 [testing.continuous.ManySplits] INFO : Time taken for all tables to reach expected total file size (102M): 48 seconds (48199ms)
2023-12-15T17:25:36,742 [testing.continuous.ManySplits] INFO : Changing split threshold on all tables from 102M to 10M
2023-12-15T17:25:36,751 [testing.continuous.ManySplits] INFO : Waiting for each tablet to have a sum file size <= 10M
2023-12-15T17:25:38,758 [testing.continuous.ManySplits] INFO : 3 tablets have file sizes not yet <= 10M on table manysplits.table1. Diff of avg offending file(s): 67M
2023-12-15T17:25:38,758 [testing.continuous.ManySplits] INFO : 3 tablets have file sizes not yet <= 10M on table manysplits.table2. Diff of avg offending file(s): 67M
2023-12-15T17:25:38,758 [testing.continuous.ManySplits] INFO : 3 tablets have file sizes not yet <= 10M on table manysplits.table3. Diff of avg offending file(s): 67M
....
2023-12-15T17:27:30,030 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 10M on table manysplits.table2. Diff of avg offending file(s): 9M
2023-12-15T17:27:33,037 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 10M on table manysplits.table2. Diff of avg offending file(s): 9M
2023-12-15T17:27:36,044 [testing.continuous.ManySplits] INFO : 1 tablets have file sizes not yet <= 10M on table manysplits.table2. Diff of avg offending file(s): 9M
2023-12-15T17:27:37,049 [testing.continuous.ManySplits] INFO : Time taken for all tables to reach expected total file size (10M): 120 seconds (120304ms)
2023-12-15T17:27:37,052 [testing.continuous.ManySplits] INFO : Changing split threshold on all tables from 10M to 1M
2023-12-15T17:27:37,094 [testing.continuous.ManySplits] INFO : Waiting for each tablet to have a sum file size <= 1M
2023-12-15T17:27:39,102 [testing.continuous.ManySplits] INFO : 7 tablets have file sizes not yet <= 1M on table manysplits.table3. Diff of avg offending file(s): 8M
2023-12-15T17:27:39,102 [testing.continuous.ManySplits] INFO : 4 tablets have file sizes not yet <= 1M on table manysplits.table1. Diff of avg offending file(s): 8M
2023-12-15T17:27:39,102 [testing.continuous.ManySplits] INFO : 7 tablets have file sizes not yet <= 1M on table manysplits.table2. Diff of avg offending file(s): 8M
....
2023-12-15T17:29:48,569 [testing.continuous.ManySplits] INFO : 3 tablets have file sizes not yet <= 1M on table manysplits.table3. Diff of avg offending file(s): 187K
2023-12-15T17:29:51,582 [testing.continuous.ManySplits] INFO : 8 tablets have file sizes not yet <= 1M on table manysplits.table2. Diff of avg offending file(s): 189K
2023-12-15T17:29:54,614 [testing.continuous.ManySplits] INFO : Time taken for all tables to reach expected total file size (1M): 137 seconds (137549ms)
2023-12-15T17:29:54,615 [testing.continuous.ManySplits] INFO : Test completed successfully.
2023-12-15T17:29:54,616 [testing.continuous.ManySplits] INFO : Test results:
Total test rounds: 3
Table count: 3
Test round 0:
TABLE_SPLIT_THRESHOLD 1G -> 102M
Splits count:         0 -> 9
Splits per second:    0.19
Test round 1:
TABLE_SPLIT_THRESHOLD 102M -> 10M
Splits count:         9 -> 93
Splits per second:    0.70
Test round 2:
TABLE_SPLIT_THRESHOLD 10M -> 1M
Splits count:         93 -> 1533
Splits per second:    10.51

2023-12-15T17:29:54,616 [testing.continuous.ManySplits] INFO : Deleting tables
2023-12-15T17:30:02,545 [testing.continuous.ManySplits] INFO : Deleting namespace
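The "Splits per second" figures in the results above can be reproduced from the split counts and the logged round durations. The sketch below assumes the rate is the number of new splits divided by the elapsed time truncated to whole seconds. That truncation is inferred from the numbers in the log (1440 splits / 137 s gives 10.51, while 1440 / 137.549 s would give 10.47) and is not confirmed by the source; the class name `SplitRateSketch` is hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch that recomputes the rates shown in the test results.
class SplitRateSketch {

  // New splits divided by whole elapsed seconds (milliseconds are truncated).
  // Assumption: truncation is inferred from the logged figures.
  // A round shorter than one second would yield Infinity or NaN here.
  static double splitsPerSecond(int before, int after, long elapsedMillis) {
    long seconds = elapsedMillis / 1000;
    return (after - before) / (double) seconds;
  }

  public static void main(String[] args) {
    int[][] splitCounts = {{0, 9}, {9, 93}, {93, 1533}};
    long[] elapsedMillis = {48199L, 120304L, 137549L};
    for (int round = 0; round < splitCounts.length; round++) {
      double rate = splitsPerSecond(splitCounts[round][0], splitCounts[round][1],
          elapsedMillis[round]);
      System.out.printf(Locale.ROOT, "Test round %d: %.2f splits per second%n", round, rate);
    }
  }
}
```

With the values from the log, the three rounds come out as 0.19, 0.70, and 10.51, matching the reported results.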



# Conflicts:
#	src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java
@DomGarguilo DomGarguilo merged commit 3a4a09f into apache:main Jan 5, 2024
1 check passed
@DomGarguilo DomGarguilo deleted the splitScaling branch January 5, 2024 21:14
Successfully merging this pull request may close these issues.

Create a split scaling test