Synchronization specification proposal for multihost tests #1706

Closed
wants to merge 3 commits into from
56 changes: 56 additions & 0 deletions docs/examples.rst
@@ -720,6 +720,62 @@ variables or context dimensions::
ref: $@distro




.. _multihost synchronization example:

Multihost synchronization
------------------------------------------------------------------

Multihost tests can use the synchronization methods defined
for each step under the :ref:`/spec/plans/sync` keyword. This
section shows a more complex example of a multihost plan that
combines several synchronization methods::

    provision:
      - name: server
        how: virtual
      - name: client1
        how: virtual
        role: client
      - name: client2
        how: virtual
        role: client

    prepare:
      - name: packages
        how: ansible
        playbook: plans/packages.yml
        where: client

      # no sync specified -> prepare steps continue executing in parallel
      - name: services
        how: shell
        script:
          - systemctl stop firewalld
          - systemctl start httpd
        where: server

      # Add a synchronization barrier so that the following
      # scripts start at the same time on all hosts
      - name: tuned
        how: shell
        sync: start
        script:
          - dnf install -y tuned
          - tuned-adm profile throughput-performance

    discover:
        how: fmf
        url: https://src.fedoraproject.org/rpms/tmt/
        where: client
        # execute tests on both clients at the same time,
        # with a sync barrier before each executed test
        sync: index

    execute:
        how: tmt

Stories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

5 changes: 5 additions & 0 deletions spec/plans/discover.fmf
@@ -233,6 +233,11 @@ description: |
much more concise config especially when defining several
shell scripts for each guest or role.

Defined tests can be :ref:`synchronized</spec/plans/sync>`
in parallel execution using the ``sync`` keyword
in the discovery step definition.


example:
- |
# Run different script for each guest or role
56 changes: 56 additions & 0 deletions spec/plans/sync.fmf
@@ -0,0 +1,56 @@
summary: Define a synchronization method for each tmt step

description: |
    The main idea of :ref:`/spec/plans/provision/multihost`
    tests is to run certain tasks in parallel (e.g., REST API
    requests from a client to a server). Therefore, these kinds
    of tests may require test synchronization.
    The synchronization method may be specified for each step
    by providing one of the defined options under the ``sync`` keyword.
    This creates a synchronization barrier before each provided step.

    start
        Synchronize at the beginning of the ``step`` by adding
        a synchronization barrier at the beginning of the step.
        In case of multiple configurations, it also adds a barrier
        at the beginning of the ``step``, even if the configuration
        filters out every ``test``.
    name
        Synchronize execution of tests with the same name. Standalone
Collaborator


Does it mean tmt needs to implement some heuristics to do the syncing properly? For example, consider two test sets.
For the server:

test1
test-multi-1
test2
test-multi-2

and for the client:

test-multi-1
test3
test4
test-multi-2

tmt should somehow realize that the first common match is test-multi-1 and sync on that? I guess that may become quite complicated when more than 2 servers are involved and there is quite a lot of space for deadlocks. Would you have some specific idea how this would be implemented?
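
The scenario above can be made concrete with a small simulation. This is only a sketch under assumptions that are not part of the proposal: a test name acts as a barrier only if it appears on two or more guests, every guest that has the name must arrive before the barrier releases, and names are unique per guest. The function name `simulate_name_sync` and the guest names in the second example are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def simulate_name_sync(plans: Dict[str, List[str]]) -> Tuple[List[str], bool]:
    """Simulate name-based barriers; return (release order, finished?)."""
    # Which guests run a test of a given name.
    participants: Dict[str, Set[str]] = {}
    for guest, tests in plans.items():
        for name in tests:
            participants.setdefault(name, set()).add(guest)

    # Assumption: only names shared by two or more guests act as barriers;
    # standalone tests pass through without waiting.
    queues = {
        guest: [name for name in tests if len(participants[name]) > 1]
        for guest, tests in plans.items()
    }
    position = {guest: 0 for guest in plans}
    released: List[str] = []

    while True:
        # The barrier each unfinished guest is currently blocked on.
        waiting = {
            guest: queue[position[guest]]
            for guest, queue in queues.items()
            if position[guest] < len(queue)
        }
        if not waiting:
            return released, True

        progressed = False
        for name in sorted(set(waiting.values())):
            # Release only when every participant is blocked on this name.
            if all(waiting.get(guest) == name for guest in participants[name]):
                for guest in participants[name]:
                    position[guest] += 1
                released.append(name)
                progressed = True

        if not progressed:
            return released, False  # every remaining guest is stuck


server = ["test1", "test-multi-1", "test2", "test-multi-2"]
client = ["test-multi-1", "test3", "test4", "test-multi-2"]
print(simulate_name_sync({"server": server, "client": client}))

# Three guests whose shared tests form a cycle: nobody can proceed.
cyclic = {"a": ["x", "y"], "b": ["y", "z"], "c": ["z", "x"]}
print(simulate_name_sync(cyclic))
```

With the two test sets from this comment, the simulation releases `test-multi-1` and then `test-multi-2` and finishes. The three-guest example never releases a barrier, even though every pair of guests agrees on the order of the names they share, which is the kind of deadlock that becomes possible with more than two guests.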

        tests (without a parallel test on a different guest) pass
        the synchronization barrier instantly. ``tmt`` is not
        responsible for synchronization problems caused by duplicate
        test names and similar mistakes.
    index
        Synchronize execution of all tests by adding a synchronization
        barrier in front of each ``test`` execution.
    none
        **Default** value, which disables the synchronization for
        the given ``step``.
Collaborator


In general, I'm missing details about the properties of this "synchronization barrier", namely the conditions that must be met to pass the barrier. OK, the general idea is clear: there would be some kind of synchronization points in tmt's workflow. But how do they behave, and what actors are involved?

        start
            Synchronize at the beginning of the ``step`` by adding
            a synchronization barrier at the beginning of the step.
            In case of multiple configurations, it also adds a barrier
            at the beginning of the ``step``, even if the configuration
            filters out every ``test``.
  • The literal `step` in this paragraph seems misleading to me; I'd expect a plain "step" here rather than a keyword-looking `step`.
  • Workflow would suspend before entering the step with sync: step - what I'm missing here: when would it resume? Is it waiting for some events? Or for other steps to catch up? If so, which events, which steps, the same "kind" of steps? Must all of them reach the same barrier to proceed, or just one? Which one? If there are multiple step configurations defined (e.g. 5 prepare configs, 2 with where: server, 3 with where: client), what's the expected outcome? Shall server and client wait till they both reach prepare in general and then move at their own speed, or would their configs run in lockstep? What happens to the third client's prepare config, will it wait for something, e.g. for the server's 2nd config to complete?

I'm afraid all this needs to be specified - it should be fairly doable to implement any set of rules, more or less complicated, but synchronization does have many aspects and corner cases, and the specification here seems like a tautology to me: "Synchronize at the beginning of the step by adding a synchronization barrier at the beginning of the step." - sure, but I didn't learn anything new after reading this sentence :)

WRT name and test - these seem like areas we're supposed to think about over Christmas :) To me, it's something that's probably orthogonal to step-level synchronization which could be specified properly first, before diving into test-level synchronization. "Run tests in lockstep" can then enable "run (at least) execute steps in lockstep", for example - step-level might be implemented once fully specified, as a building block to be used later by test-level synchronization. It seems less hairy, and our expectations seem to be more aligned than when it comes to test-level synchronization. The current specification draft leaves many questions unanswered, and that's just the step-level specification. My 2 cents would focus on step-level synchronization first, leaving test-level for later. I already have a rough draft of step-level synchronization, I'm missing the specification to follow :) Tests will be harder...
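
To make one possible answer to the step-level questions concrete, here is a minimal sketch of a "start" barrier. It rests on assumptions the current draft does not state: every guest in the plan participates, and nobody enters the step until all guests have arrived. The function name `run_step_with_start_barrier` and the per-guest delays are hypothetical; only the standard library `threading.Barrier` is used.

```python
import threading
import time
from typing import Dict, List, Tuple


def run_step_with_start_barrier(delays: Dict[str, float]) -> List[Tuple[str, str]]:
    """Run one thread per guest and record the order of barrier events."""
    # Assumption: all guests must reach the barrier before any may proceed.
    barrier = threading.Barrier(len(delays))
    lock = threading.Lock()
    events: List[Tuple[str, str]] = []

    def guest_worker(guest: str, delay: float) -> None:
        time.sleep(delay)  # unsynchronized work done before the step
        with lock:
            events.append(("arrived", guest))
        barrier.wait(timeout=10)  # raises BrokenBarrierError on timeout
        with lock:
            events.append(("started", guest))

    threads = [
        threading.Thread(target=guest_worker, args=(guest, delay))
        for guest, delay in delays.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events


events = run_step_with_start_barrier({"server": 0.05, "client1": 0.0, "client2": 0.02})
for kind, guest in events:
    print(kind, guest)
```

Under these assumptions every "arrived" event is recorded before the first "started" event, regardless of how long each guest took to get there. Other rules (for example, only guests matched by `where` participate) would change how many parties the barrier expects, which is exactly the detail the specification would need to pin down.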


    More complex synchronization scenarios can be found in the
    :ref:`Multihost synchronization examples <multihost synchronization example>`.

example:
  - |
    discover:
        how: fmf
        url: https://src.fedoraproject.org/rpms/tmt/
        where: client
        sync: name # add sync barrier before tests with the same name

  - |
    prepare:
      - name: packages
        how: ansible
        playbook: plans/packages.yml
      - name: services
        how: shell
        script:
          - systemctl start iperf3.service
          - systemctl start netperf.service
          - systemctl stop firewalld.service
        sync: index # add sync barrier in front of each shell command

    execute:
        how: tmt
        sync: start # add sync barrier before the execute step
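
The four options proposed in `spec/plans/sync.fmf` can be compared by listing the barriers each one would place in front of a guest's tests. The sketch below is one reading of the draft, not its implementation: the function `barrier_keys` and the tuple format are hypothetical, and it assumes that guests wait for each other whenever they produce the same key.

```python
from typing import List, Optional, Tuple, Union

BarrierKey = Tuple[str, Optional[Union[int, str]]]


def barrier_keys(sync: str, tests: List[str]) -> List[BarrierKey]:
    """List the barriers a guest would hit for a given sync method."""
    if sync == "none":
        return []  # default: no synchronization
    if sync == "start":
        return [("start", None)]  # a single barrier before the step
    if sync == "index":
        # One barrier per test, matched by position in the test list.
        return [("index", position) for position in range(len(tests))]
    if sync == "name":
        # One barrier per test, matched by test name.
        return [("name", name) for name in tests]
    raise ValueError(f"unknown sync method: {sync}")


client_tests = ["test-multi-1", "test3"]
for method in ("none", "start", "index", "name"):
    print(method, barrier_keys(method, client_tests))
```

Seen this way, `index` pairs tests by position even when their names differ, while `name` pairs them by name even when their positions differ, so the two methods can behave very differently on guests with different test lists.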