This repository has been archived by the owner on May 21, 2024. It is now read-only.

Aktualizr secondary auto reboot #1578

Merged 6 commits on Mar 5, 2020

@lbonn lbonn commented Feb 28, 2020

Tested manually on qemu.

@lbonn lbonn marked this pull request as ready for review February 28, 2020 13:55
@@ -164,6 +175,12 @@ void SecondaryTcpServer::HandleOneConnection(int socket) {
    } else {
      LOG_DEBUG << "Not sending a response to message " << msg->present();
    }

Collaborator

I don't think there is much reason to call stop here, as it's mainly intended for stopping the server from another thread. Perhaps just returning a bool/int from HandleOneConnection() and handling it in the while loop would be slightly better (you could set keep_running_ = false; or just break directly there).
Also, the name may_reboot is a bit confusing; it's more like "want reboot" or "require reboot". Just minor :)
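
For illustration, the suggestion above could look roughly like the following. This is a minimal sketch under assumptions, not the code from this PR: `ServerSketch`, its queue of pending "connections" and the `handled()` counter are hypothetical stand-ins for the real `SecondaryTcpServer` and its socket handling.

```cpp
#include <deque>
#include <utility>

// Hypothetical stand-in for SecondaryTcpServer; not the aktualizr class.
class ServerSketch {
 public:
  enum class ExitReason { kNotApplicable, kRebootNeeded, kUnknown };

  // Each queued bool stands for one connection; true means "this message requires a reboot".
  explicit ServerSketch(std::deque<bool> pending) : pending_(std::move(pending)) {}

  // Serves queued connections until one of them asks the loop to end.
  void run() {
    while (keep_running_ && !pending_.empty()) {
      const bool reboot_requested = pending_.front();
      pending_.pop_front();
      if (!HandleOneConnection(reboot_requested)) {
        break;  // decided inside the serving thread, so no stop() call is needed
      }
    }
  }

  ExitReason exit_reason() const { return exit_reason_; }
  int handled() const { return handled_; }

 private:
  // Returns false when the serving loop should end.
  bool HandleOneConnection(bool reboot_requested) {
    ++handled_;
    if (reboot_requested) {
      exit_reason_ = ExitReason::kRebootNeeded;
      return false;
    }
    return true;
  }

  std::deque<bool> pending_;
  bool keep_running_{true};
  ExitReason exit_reason_{ExitReason::kNotApplicable};
  int handled_{0};
};
```

With this shape, stop() stays reserved for requests coming from other threads, while the reboot case ends the loop from within.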

codecov-io commented Feb 28, 2020

Codecov Report

Merging #1578 into master will decrease coverage by 0.02%.
The diff coverage is 90.32%.

@@            Coverage Diff             @@
##           master    #1578      +/-   ##
==========================================
- Coverage   82.35%   82.32%   -0.03%     
==========================================
  Files         189      189              
  Lines       11851    11872      +21     
==========================================
+ Hits         9760     9774      +14     
- Misses       2091     2098       +7

| Impacted Files | Coverage Δ |
|---|---|
| ...c/aktualizr_secondary/aktualizr_secondary_config.h | 100% <ø> (ø) ⬆️ |
| src/aktualizr_secondary/secondary_tcp_server.h | 100% <ø> (ø) ⬆️ |
| src/aktualizr_secondary/aktualizr_secondary.h | 60% <ø> (ø) ⬆️ |
| src/aktualizr_secondary/update_agent_file.h | 100% <ø> (ø) ⬆️ |
| src/aktualizr_secondary/update_agent_ostree.h | 100% <ø> (ø) ⬆️ |
| src/aktualizr_secondary/update_agent_file.cc | 66.66% <0%> (-2.09%) ⬇️ |
| src/aktualizr_secondary/aktualizr_secondary.cc | 76.74% <100%> (+0.18%) ⬆️ |
| .../aktualizr_secondary/aktualizr_secondary_config.cc | 95.06% <100%> (+0.12%) ⬆️ |
| src/libaktualizr/utilities/utils.h | 93.33% <100%> (+0.47%) ⬆️ |
| src/aktualizr_secondary/update_agent_ostree.cc | 84.31% <100%> (+0.31%) ⬆️ |

... and 9 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3c09519...9689bbb.

tcp_server.run();

if (tcp_server.exit_reason() == SecondaryTcpServer::ExitReason::kRebootNeeded) {
  secondary->completeInstall();

Collaborator

How can we guarantee here that the TCP/IP stack has sent all data from its internal buffers before the reboot is called?
It looks like HandleOneConnection() is missing a connection shutdown and run() is missing a close of the connection socket.

Contributor Author

I've added a RAII socket in run().

Collaborator

@mike-sul mike-sul left a comment

Looks great! I think it's important to make sure that all data in the TCP/IP internal buffers is sent over the wire before the reboot takes place. IMHO, just shut down and close the connection socket, then wait a bit before rebooting.
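
One way to read the shutdown-and-close suggestion is an RAII wrapper around the connection socket. The sketch below is an assumption about how that could look, not the class added in this PR; `ConnectionSocket` is a hypothetical name, and it relies only on POSIX `shutdown()` and `close()`.

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical RAII wrapper for a connected socket descriptor.
// The socket class actually added in this PR is not shown in the thread.
class ConnectionSocket {
 public:
  explicit ConnectionSocket(int fd) : fd_(fd) {}

  // Non-copyable, so the descriptor is closed exactly once.
  ConnectionSocket(const ConnectionSocket&) = delete;
  ConnectionSocket& operator=(const ConnectionSocket&) = delete;

  ~ConnectionSocket() {
    if (fd_ >= 0) {
      ::shutdown(fd_, SHUT_RDWR);  // signal end of stream in both directions
      ::close(fd_);                // release the descriptor
    }
  }

  int get() const { return fd_; }

 private:
  int fd_;
};
```

Neither call confirms that the peer has actually received the data, which is presumably why the comment also suggests waiting a bit before the reboot.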

Collaborator

@pattivacek pattivacek left a comment

Thankfully doesn't look quite as bad as we feared. How can we test this?

SecondaryTcpServer(Uptane::SecondaryInterface& secondary, const std::string& primary_ip, in_port_t primary_port,
                   in_port_t port = 0);
enum class ExitReason {
  kNA,

Collaborator

"NA" for "not applicable"? Minor, but maybe worth spelling out.

Contributor Author

I've also renamed Other to Unknown, more consistent with other enums we have.

@lbonn lbonn force-pushed the feat/OTA-4524/secondary-auto-reboot branch from ac4f0c1 to 3fda90b Compare February 28, 2020 15:36

lbonn commented Feb 28, 2020

> How can we test this?

We should be able to do something with ipsecondary_test.py.


if (need_reboot && reboot_after_install_) {
  exit_reason_ = ExitReason::kRebootNeeded;
  return false;

Collaborator

As an option, exit_reason_ could be returned from HandleOneConnection() to run(), and run() in turn could return the exit status to its caller, so no class member is required to store the exit reason and the client needs no additional call to get it. But it's really minor and not important, just an alternative.
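
A rough sketch of that alternative, under assumptions: `ReturningServerSketch` and `Request` are hypothetical names, and the list of requests stands in for real connections. The exit reason travels through return values only, so there is no member to store it.

```cpp
#include <vector>

enum class ExitReason { kNotApplicable, kRebootNeeded, kUnknown };

// Hypothetical flag standing in for a parsed request from the Primary.
struct Request {
  bool need_reboot;
};

// Hypothetical server: the exit reason is a return value, not a class member.
class ReturningServerSketch {
 public:
  explicit ReturningServerSketch(bool reboot_after_install) : reboot_after_install_(reboot_after_install) {}

  // Returns the reason the serving loop ended.
  ExitReason run(const std::vector<Request>& requests) const {
    for (const Request& request : requests) {
      const ExitReason reason = HandleOneConnection(request);
      if (reason != ExitReason::kNotApplicable) {
        return reason;
      }
    }
    return ExitReason::kUnknown;  // ran out of requests without a specific reason
  }

 private:
  // kNotApplicable means "keep serving".
  ExitReason HandleOneConnection(const Request& request) const {
    if (request.need_reboot && reboot_after_install_) {
      return ExitReason::kRebootNeeded;
    }
    return ExitReason::kNotApplicable;
  }

  bool reboot_after_install_;
};
```

The caller would then compare the value returned by run() directly instead of calling a separate exit_reason() accessor.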

Collaborator

@mike-sul mike-sul left a comment

LGTM, I just think that it would be great to add some test(s) for this functionality.

It's only used for tests, the equivalent case can be handled in the main
one.

Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
It was not explicitly closed!

Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
"ON" is not the only way to get a true value from CMake

Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
@lbonn lbonn force-pushed the feat/OTA-4524/secondary-auto-reboot branch from 3fda90b to 515d858 Compare March 4, 2020 16:14

lbonn commented Mar 4, 2020

Now comes with a test.

pattivacek previously approved these changes Mar 5, 2020
Collaborator

@pattivacek pattivacek left a comment

LGTM.

@with_sysroot()
@with_secondary(start=False, output_logs=False, force_reboot=True)
@with_aktualizr(start=False, run_mode='once', output_logs=True)
def test_secondary_ostree_reboot(uptane_repo, secondary, aktualizr, treehub, sysroot, director, **kwargs):

Collaborator

Do we have a negative test for this? (Same goes for the Primary, actually.) Less important, but just curious.

    with aktualizr:
        aktualizr.wait_for_completion()

    if not director.get_install_result():

Collaborator

I think it makes sense to also check that what is currently installed on the Secondary is what we expect, something like this: `if target_rev != aktualizr.get_current_image_info(secondary.id):`

Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
Signed-off-by: Laurent Bonnans <laurent.bonnans@here.com>
 add_test(NAME test_ip_secondary
          COMMAND ${PROJECT_SOURCE_DIR}/tests/ipsecondary_test.py
-         --build-dir ${PROJECT_BINARY_DIR} --src-dir ${PROJECT_SOURCE_DIR} --ostree ${BUILD_OSTREE})
+         --build-dir ${PROJECT_BINARY_DIR} --src-dir ${PROJECT_SOURCE_DIR} ${TEST_IPSEC_ARGS})

Collaborator

What is ${TEST_IPSEC_ARGS} for?

Contributor Author

It's here to set or not set --ostree as an argument to the script, as using ${BUILD_OSTREE} in the python script would force us to reimplement CMake boolean logic:

> True if the constant is 1, ON, YES, TRUE, Y, or a non-zero number. False if the constant is 0, OFF, NO, FALSE, N, IGNORE, NOTFOUND, the empty string, or ends in the suffix -NOTFOUND. Named boolean constants are case-insensitive. If the argument is not one of these specific constants, it is treated as a variable or string and the following signature is used.
> https://cmake.org/cmake/help/latest/command/if.html#command:if

I had the problem that the tests would not run because I configured my build directory with -DBUILD_OSTREE=on instead of -DBUILD_OSTREE=ON.

Collaborator

OK, I got it.

@lbonn lbonn merged commit e502843 into master Mar 5, 2020
@lbonn lbonn deleted the feat/OTA-4524/secondary-auto-reboot branch March 5, 2020 12:26
4 participants