HDDS-11503. Add Robot test to verify Container Balancer for EC containers. #7311

Merged
merged 9 commits into from
Oct 18, 2024
@@ -39,6 +39,8 @@ services:
    volumes:
      - tmpfs1:/data
      - ../..:/opt/hadoop
    deploy:
      replicas: ${DATANODE1_REPLICA:-1}
  datanode2:
    <<: *common-config
    ports:
@@ -50,6 +52,8 @@ services:
    volumes:
      - tmpfs2:/data
      - ../..:/opt/hadoop
    deploy:
      replicas: ${DATANODE2_REPLICA:-1}
  datanode3:
    <<: *common-config
    ports:
@@ -61,6 +65,8 @@ services:
    volumes:
      - tmpfs3:/data
      - ../..:/opt/hadoop
    deploy:
      replicas: ${DATANODE3_REPLICA:-1}
  datanode4:
    <<: *common-config
    ports:
@@ -72,6 +78,34 @@ services:
    volumes:
      - tmpfs4:/data
      - ../..:/opt/hadoop
    deploy:
      replicas: ${DATANODE4_REPLICA:-1}
  datanode5:
    <<: *common-config
    ports:
      - 19864
      - 9882
    environment:
      <<: *replication
    command: [ "ozone","datanode" ]
    volumes:
      - tmpfs5:/data
      - ../..:/opt/hadoop
    deploy:
      replicas: ${DATANODE5_REPLICA:-1}
  datanode6:
    <<: *common-config
    ports:
      - 19864
      - 9882
    environment:
      <<: *replication
    command: [ "ozone","datanode" ]
    volumes:
      - tmpfs6:/data
      - ../..:/opt/hadoop
    deploy:
      replicas: ${DATANODE6_REPLICA:-1}
  om1:
    <<: *common-config
    environment:
@@ -175,3 +209,15 @@ volumes:
      o: "size=1g,uid=4000"
      device: tmpfs
      type: tmpfs
  tmpfs5:
    driver: local
    driver_opts:
      o: "size=1g,uid=5000"
      device: tmpfs
      type: tmpfs
  tmpfs6:
    driver: local
    driver_opts:
      o: "size=1g,uid=6000"
      device: tmpfs
      type: tmpfs
5 changes: 4 additions & 1 deletion hadoop-ozone/dist/src/main/compose/ozone-balancer/test.sh
@@ -24,10 +24,13 @@ export OM=om1
export SCM=scm1
export OZONE_REPLICATION_FACTOR=3

export DATANODE2_REPLICA=0
export DATANODE5_REPLICA=0

# shellcheck source=/dev/null
source "$COMPOSE_DIR/../testlib.sh"

# We need 4 dataNodes in this tests
start_docker_env 4

execute_robot_test ${OM} balancer/testBalancer.robot
execute_robot_test ${OM} -v REPLICATION:THREE -v TYPE:RATIS -v KEYS:3 -v LOWER_LIMIT:3 -v UPPER_LIMIT:3.5 balancer/testBalancer.robot
Contributor

I suggest adding a test suite name that clearly indicates the replication type.

Suggested change
execute_robot_test ${OM} -v REPLICATION:THREE -v TYPE:RATIS -v KEYS:3 -v LOWER_LIMIT:3 -v UPPER_LIMIT:3.5 balancer/testBalancer.robot
execute_robot_test ${OM} -v REPLICATION:THREE -v TYPE:RATIS -v KEYS:3 -v LOWER_LIMIT:3 -v UPPER_LIMIT:3.5 -N ozone-balancer-RATIS balancer/testBalancer.robot

Contributor Author

Done
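
A minimal sketch of how the `DATANODE2_REPLICA=0` and `DATANODE5_REPLICA=0` exports in test.sh interact with the `${DATANODEn_REPLICA:-1}` defaults in the compose file. It assumes Compose's `:-` interpolation behaves like the shell's (the default applies only when the variable is unset or empty), so an explicit `0` is kept and the remaining services fall back to one replica each. The `total` variable exists only for illustration and is not part of the PR.

```shell
#!/usr/bin/env sh
# Illustration only: mimics the ${VAR:-1} defaults from the compose file.
# Datanodes 1, 3, 4 and 6 are left unset, as in test.sh.
unset DATANODE1_REPLICA DATANODE3_REPLICA DATANODE4_REPLICA DATANODE6_REPLICA
DATANODE2_REPLICA=0
DATANODE5_REPLICA=0

# An unset variable falls back to 1; an explicit 0 is preserved.
total=$(( ${DATANODE1_REPLICA:-1} + ${DATANODE2_REPLICA:-1} + ${DATANODE3_REPLICA:-1} \
        + ${DATANODE4_REPLICA:-1} + ${DATANODE5_REPLICA:-1} + ${DATANODE6_REPLICA:-1} ))

echo "datanodes started: ${total}"
```

With the two exports from test.sh this prints `datanodes started: 4`, which matches the four datanodes the RATIS run expects. test_ec.sh exports neither variable, so all six services would start.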

30 changes: 30 additions & 0 deletions hadoop-ozone/dist/src/main/compose/ozone-balancer/test_ec.sh
@@ -0,0 +1,30 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#suite:balancer

COMPOSE_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
export COMPOSE_DIR
export OM_SERVICE_ID="om"
export OM=om1
export SCM=scm1
export OZONE_REPLICATION_FACTOR=3

source "$COMPOSE_DIR/../testlib.sh"

start_docker_env 6
Contributor

Suggested change
start_docker_env 6
start_docker_env

This parameter is not needed, since "--scale" is not used and each datanode is defined separately.

Contributor Author

Done

execute_robot_test ${OM} -v REPLICATION:rs-3-2-1024k -v TYPE:EC -v KEYS:7 -v LOWER_LIMIT:1.5 -v UPPER_LIMIT:2.5 balancer/testBalancer.robot
Contributor

@sarvekshayr Oct 16, 2024

Here as well.

Suggested change
execute_robot_test ${OM} -v REPLICATION:rs-3-2-1024k -v TYPE:EC -v KEYS:7 -v LOWER_LIMIT:1.5 -v UPPER_LIMIT:2.5 balancer/testBalancer.robot
execute_robot_test ${OM} -v REPLICATION:rs-3-2-1024k -v TYPE:EC -v KEYS:7 -v LOWER_LIMIT:1.5 -v UPPER_LIMIT:2.5 -N ozone-balancer-EC balancer/testBalancer.robot

Contributor Author

Done
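
For readers less familiar with the EC naming, here is a small sketch of how a replication string such as `rs-3-2-1024k` breaks down, assuming the usual codec-data-parity-chunk layout. The variable names are illustrative and do not come from the PR. The `data` field is presumably what the `.replicationConfig.data == 3` filter in testBalancer.robot matches on.

```shell
#!/usr/bin/env sh
# Illustration only: split an EC replication string into its parts.
replication="rs-3-2-1024k"

# Fields are separated by '-': codec, data units, parity units, chunk size.
IFS=- read -r codec data parity chunk <<EOF
${replication}
EOF

# Each data or parity block of a stripe is placed on a different datanode,
# so one full stripe needs at least data + parity datanodes.
required=$(( data + parity ))

echo "codec=${codec} data=${data} parity=${parity} chunk=${chunk} min_datanodes=${required}"
```

For `rs-3-2-1024k` this gives 3 data units, 2 parity units and a minimum of 5 datanodes, which fits within the six datanode services defined in the compose file.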

19 changes: 10 additions & 9 deletions hadoop-ozone/dist/src/main/smoketest/balancer/testBalancer.robot
@@ -16,6 +16,7 @@
*** Settings ***
Documentation Smoketest ozone cluster startup
Library OperatingSystem
Library String
Library Collections
Resource ../commonlib.robot
Resource ../ozone-lib/shell.robot
@@ -35,7 +36,7 @@ Prepare For Tests
Execute dd if=/dev/urandom of=/tmp/100mb bs=1048576 count=100
Run Keyword if '${SECURITY_ENABLED}' == 'true' Kinit test user testuser testuser.keytab
Execute ozone sh volume create /${VOLUME}
Execute ozone sh bucket create /${VOLUME}/${BUCKET}
Execute ozone sh bucket create --replication ${REPLICATION} --type ${TYPE} /${VOLUME}/${BUCKET}


Datanode In Maintenance Mode
@@ -67,7 +68,7 @@ Run Container Balancer
Wait Finish Of Balancing
${result} = Execute ozone admin containerbalancer status
Should Contain ${result} ContainerBalancer is Running.
Wait Until Keyword Succeeds 3min 10sec ContainerBalancer is Not Running
Wait Until Keyword Succeeds 10min 10sec ContainerBalancer is Not Running
Contributor

Is this timeout increase necessary?

Contributor Author

Fixed
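
As background for the timeout discussion, a rough shell analogue of what `Wait Until Keyword Succeeds <timeout> <interval> <keyword>` does: retry a command at a fixed interval until it succeeds or the time budget is used up. This is a sketch of the general retry pattern, not Robot Framework's implementation, and `wait_until_succeeds` is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Illustration only: retry a command every <interval> seconds for at most
# <timeout> seconds. Returns 0 on the first success, 1 once the budget is spent.
wait_until_succeeds() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep "$interval_s"
    elapsed=$(( elapsed + interval_s ))
  done
  return 0
}

# Example: a command that succeeds immediately returns without sleeping.
wait_until_succeeds 2 1 true && echo "succeeded"
```

A longer timeout only costs time when the condition never becomes true, since the loop exits on the first success.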

Sleep 60000ms

Verify Verbose Balancer Status
@@ -111,7 +112,7 @@ Create Multiple Keys
${fileName} = Set Variable file-${INDEX}.txt
${key} = Set Variable /${VOLUME}/${BUCKET}/${fileName}
LOG ${fileName}
Create Key ${key} ${file}
Create Key ${key} ${file} --replication=${REPLICATION} --type=${TYPE}
Key Should Match Local File ${key} ${file}
END

@@ -126,14 +127,14 @@ Get Uuid

Close All Containers
FOR ${INDEX} IN RANGE 15
${container} = Execute ozone admin container list --state OPEN | jq -r 'select(.replicationConfig.replicationFactor == "THREE") | .containerID' | head -1
${container} = Execute ozone admin container list --state OPEN | jq -r 'select(.replicationConfig.data == 3) | .containerID' | head -1
EXIT FOR LOOP IF "${container}" == "${EMPTY}"
${message} = Execute And Ignore Error ozone admin container close "${container}"
Run Keyword If '${message}' != '${EMPTY}' Should Contain ${message} is in closing state
${output} = Execute ozone admin container info "${container}"
Should contain ${output} CLOS
END
Wait until keyword succeeds 3min 10sec All container is closed
Wait until keyword succeeds 4min 10sec All container is closed

All container is closed
${output} = Execute ozone admin container list --state OPEN
@@ -146,15 +147,15 @@ Get Datanode Ozone Used Bytes Info
[return] ${result}

*** Test Cases ***
Verify Container Balancer for RATIS containers
Verify Container Balancer for RATIS/EC containers
Prepare For Tests

Datanode In Maintenance Mode

${uuid} = Get Uuid
Datanode Usageinfo ${uuid}

Create Multiple Keys 3
Create Multiple Keys ${KEYS}

Close All Containers

@@ -175,8 +176,8 @@ Verify Container Balancer for RATIS containers

${datanodeOzoneUsedBytesInfoAfterContainerBalancing} = Get Datanode Ozone Used Bytes Info ${uuid}
Should Not Be Equal As Integers ${datanodeOzoneUsedBytesInfo} ${datanodeOzoneUsedBytesInfoAfterContainerBalancing}
Should Be True ${datanodeOzoneUsedBytesInfoAfterContainerBalancing} < ${SIZE} * 3.5
Should Be True ${datanodeOzoneUsedBytesInfoAfterContainerBalancing} > ${SIZE} * 3
Should Be True ${datanodeOzoneUsedBytesInfoAfterContainerBalancing} < ${SIZE} * ${UPPER_LIMIT}
Contributor

It would be great to have a description of what UPPER_LIMIT and LOWER_LIMIT are.

Contributor Author

Done

Should Be True ${datanodeOzoneUsedBytesInfoAfterContainerBalancing} > ${SIZE} * ${LOWER_LIMIT}
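
The two closing assertions require the datanode's used bytes after balancing to lie strictly between `SIZE * LOWER_LIMIT` and `SIZE * UPPER_LIMIT`. A minimal sketch of that check follows. The value of `SIZE` is not visible in this diff, so 104857600 (the 100 MiB file created with `dd`) is an assumption, and `within_limits` is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Illustration only: succeeds when size*lower < used < size*upper.
# awk handles the fractional limits (for example 3.5) that shell integer
# arithmetic cannot.
within_limits() {
  awk -v used="$1" -v size="$2" -v lo="$3" -v hi="$4" \
    'BEGIN { exit !(used > size * lo && used < size * hi) }'
}

SIZE=104857600   # assumption: one 100 MiB key, not taken from the diff

# RATIS run from test.sh: LOWER_LIMIT=3, UPPER_LIMIT=3.5
if within_limits 340000000 "$SIZE" 3 3.5; then echo "RATIS sample within limits"; fi

# EC run from test_ec.sh: LOWER_LIMIT=1.5, UPPER_LIMIT=2.5
if within_limits 200000000 "$SIZE" 1.5 2.5; then echo "EC sample within limits"; fi
```

The sample byte counts are made-up inputs chosen to fall inside each window. They are not measurements from a test run.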


