-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
impl: original integration test. #56
Comments
@utam0k I'd like to try it. |
Sure! I assigned you to it. I'm looking forward your PR. |
I would like to report that I was able to reproduce the container lifecycle of the tutorial on rust's And I have a few concerns that I would like to discuss with you.
As for the future roadmap, I would like to proceed as follows.
Also, I would like to know if there are any high priority tests. |
Hey @minakawa-daiki ,sorry for delay in reply. If it is possible, can you try implementing the validation tests first? As once we have those, we can test the changes made on local system, currently the runtime-tools validation works in github action only, see discussion in #82 . @utam0k can you take a look at the concerns? I think it would be fine to allow root requirements for running the tests, but it would be better to leave docket export part out of the tests, as it would force us to depend on docker being correctly configured and also the network status. It might be better to check if the rootfs folder exists in the root dir of project, and fail the test if it does not. |
@YJDoc2 Thanks for your mention. |
I think it is inevitable that you will need root privileges. I agree with the assumption that root privileges are required, although it is possible that rootless could be used to get around this. As for rootfs, why not mimic runtime-tools and zip it up and put it in the repository first?
You can change the directory with the create argument, so run help and see what happens.
|
@YJDoc2 Sorry, he's actually a real friend of mine and I've been supporting him a lot on discord. Thank you for your attention. I'll use issues as much as possible to promote open discussion. |
Hey, I just thought that you missed the message, so I mentioned you 😅 Also, if possible, let us keep updating main points in issues, even if they are discussed elsewhere, not only for open discussion, but so that we will have history of issues and reason behind implementation at a single place, and can be referenced easily if needed 😃 |
@YJDoc2 Of course! I'll do my best to make it so. |
I agree. I think the work done by @minakawa-daiki has created a great base for integration tests 🎉 Currently we are using integration tests in runtime-tools, which only work in github CI/CD and does not seem to work on local machines, making it really hard to verify that changes made have not accidentally broken things elsewhere. Migrating all tests from runtime-tools to rust would not only allow us to confidently make changes, but reduce one of the external dependencies as well. Currently there are around 56 tests in tuntime-tools, we can maybe make a separate issue for tracking how many are implemented , similar to cgroups tracking issues. I would really like to contribute in it as much as I can. In case, list of test directories in runtime-tools/validation config_updates_without_affect create default delete delete_only_create_resources delete_resources hooks hooks_stdin hostname kill kill_no_effect killsig linux_cgroups_blkio linux_cgroups_cpus linux_cgroups_devices linux_cgroups_hugetlb linux_cgroups_memory linux_cgroups_network linux_cgroups_pids linux_cgroups_relative_blkio linux_cgroups_relative_cpus linux_cgroups_relative_devices linux_cgroups_relative_hugetlb linux_cgroups_relative_memory linux_cgroups_relative_network linux_cgroups_relative_pids linux_devices linux_masked_paths linux_mount_label linux_ns_itype linux_ns_nopath linux_ns_path linux_ns_path_type linux_process_apparmor_profile linux_readonly_paths linux_rootfs_propagation linux_seccomp linux_sysctl linux_uid_mappings misc_props mounts pidfile poststart poststart_fail poststop poststop_fail prestart prestart_fail process process_capabilities process_capabilities_fail process_oom_score_adj process_rlimits process_rlimits_fail process_user root_readonly_true start state Currently the initial tests of container lifecycle depend on panics to indicate failure, maybe we can start by changing that to returning a result or something similar so we can get more information about the error? If that is a good idea, I'd like to try doing that. |
I'd be very interested in implementing the runtime integration tests for youki. I think it would be a great benefit for us to eventually build out cgv2 tests, since I think the testing there could be improved. I agree that the tests output could be improved, and I'd be open to seeing how you're thinking to approach that @YJDoc2. I think maybe breaking this out into a tracking issue like we've done for other things, and implementing each test from the oci-spec is a good way to start and break up the work. It can also provide some good issues for others to get involved with. |
Of course it's a good improvement.
@tsturzl @YJDoc2 |
I understand that you want to push forward with the test implementation as soon as possible. |
Work has been a bit consuming lately, but I should hopefully have time starting in the middle of next month. I would be interested in taking this on and implementing a few vital pieces and maybe getting some feedback. Unless @YJDoc2 has some ideas here and is interested, or can start on it sooner. I've also been beginning to hack on #17 again, but I think this is generally more important and ultimately #17 would greatly benefit from this. I agree with the priority here. |
Hey, So I have been thinking and came up with two approaches for this:
Do you think these could be useful with some changes, or should be try to do something else altogether @utam0k @tsturzl ? I am a bit busy this week, so even if we decide on the idea, I'll start implementing this weekend or next week. Maybe we can come up with something better in meantime as well. On a side note, I think we should hold creating the tracking issue of this until we have decided on this, and implemented lifecycle tests ; as otherwise contributors will start creating tests without having a format, and we will need to mend those later again to fit the format that we decide on. |
@YJDoc2 @tsturzl
This is the minimum feature that I would like to have, and if this is not possible with traits, then I suggest using hashmap.
👍
|
I would like to work on this and let @tsturzl focus on #17 , hopefully I can create a good base for the cgroups v2 tests. As for the implementation, I'm not saying it would not be possible with traits, we can make the test registering function take test name as well, but my problem with traits is that they will need empty structs to be implemented on, and for tests we would only be interested in the test function itself, so by using hashmap we would not need the extra boilerplate.
I'm waiting to hear @tsturzl 's thoughts on this, after that I'd start working of this. |
@YJDoc2 I think the traits pattern is more familiar to some of the patterns already used in youki, eg the controller managers do dispatch in this way. I think the hashmap approach is simple, but seems less idiomatic and possibly like it could get messy. I think creating empty structs is mostly an issue with the code readability than the actual runtime. I think it'll also be more idiomatic Rust to use a trait. There is one solution I can think of for making the empty structs look cleaner, hide the empty struct creation with a macro and derive the test name and thus the struct's name from the function name using a procedural macro(might be hacky). I think perhaps one of the benefits for the hashmap is the ability to quickly find a given number of input tests names to run, however maintaining the order in which tests are invoked could be nice for readability. I think either approach is fine, and perhaps the hashmap is a better way to start out, but eventually maybe the trait approach is going to be more extensible. For example you might eventually want something like a trait method which allows a test to check if it's runnable on that system, like a cgv2 controller might check to see that the subsystem is available on that system and then return a boolean, and then we can have 3 tests states like pass, fail, skipped. You could also eventually have setup and cleanup methods on a test, or even better you can just implement cleanup logic in the Drop trait of the struct and then actually store setup resources in the struct so it's not actually just empty. You could even have a struct represent a body of tests, and then have a reusable test type which takes a test function. I think the struct approach is just more extendable long term, but I do see immediate benefits in the second approach. It's possible even a hybrid of the 2 could turn out to be useful. Maybe even something like this. A logical grouping of tests are an entry in a hashmap, the grouping is a struct that implements a "TestGroup" trait, the tests within the group are just a reusable test type which takes an anonymous function like |
Something loosely like this is appealing to me, you could collect tests for a group into a static array or even a vector and then run them, or you can even just invoke them inline like I did here. |
Might even be worth having our own Test type which is from-able from a anyhow::Result. Something like a TestResult, and this has 3 states which are Ok(T), Err(E), Skip. Skip is good for cases where a test isn't runnable on a system, eg you don't have support for a given cgroup controller. The TestResult can just implement From for the result type we expect, so it's easy to just turn a anyhow::Result<()> into a TestResult for example. |
You could even just have a TestGroup be an array of Testables inserted into a hashmap instead. If the given Testable's don't fit a specific usecase you can always make case specific Testable types as well, but for the most part I imagine you can just make reusable Test types with it. Sorry if this was a long rant, but these were some of the ideas I had for a composable and reusable test framework. |
Hey, these are some great ideas! Thanks for sharing them.
I agree that my issue with traits was more of readability and boilerplate, than usage, and traits would in fact be more idiomatic.
The main reason for suggesting a hashmap was to be able to get the tests quickly from their names, but I agree it is less idiomatic, and might get messy to manage later. Although the constraint that the tests should be run in a consistent order , which hashmap will not guarantee, can be solved by using a BTreeMap, which maintains a fixed order, but still we cannot define a particular order we want, so using a vec might be better solution in long run if we need tests to be ran in a particular order only. ( though this will mean that we will need to iterate over all vec when only cmd specified tests are to be run.) In my opinion, the suggestion of making a trait for group tests, even though elegant and useful in long term, might be too complex for implementing and using in the beginning, so my suggestion would be to start with only a Testable trait, and then as need will arise, introduce appropriate traits. I might not be understanding what you wanted to say, but I do not agree that a test groups should share the setup resources ; using a common method to do setup and tear down is quite sensible, but using same setup across tests might not be a good idea, as a failing or panic-ing test might leave them in undefined state, and the next tests might also fail due to that.
I think your point to use a macro to hide the implementation of test is great, and I have some ideas for that.
For the beginning, what do you think of this enum TestResult<T,E>{
Ok(T),
Skip,
Err(E)
};
#[bitflags]
#[repr(u8)]
enum TestGroup{
NoGroup = 1<<0,
CGroupV1 = 1<<1,
CGroupV2 = 1<<2,
TLBCheck = 1<<3
}
trait IntegrationTest{
pub fn name()->String;
pub fn group()->BitFlags<TestGroup>;
pub fn test()->TestResult<(),String>;
}
// to actually write a test
#[integration_test(test_group = CGroupV1|TLBCheck)]
fn my_integration_test()->Test<Result,String>{
// do test...
}
// integration test will be a proc macro which will generate following
pub struct my_integration_test;
impl IntegrationTest for my_integration_test{
pub fn name()->String{
"my_integration_test".to_string()
}
pub fn group()->BitFlags<TestGroup>{
TestGroup::CGroupV1|TestGroup::TLBCheck
}
pub fn test()->TestResult<(),String>{
// do test
}
};
// in main() we can register test something like :
...
test_manager.register(my_integration_test);
... Where birflags are used from crate enumflags2, which allows bitflags to be set using rust's enums. the test_manager will be a controller struct, which will store tests internally , probably in a vec, and also store them group wise in a hashmap , so when running specific groups we can retrieve those specific tests quickly. I specifically kept the functions in the trait static, instead of instance, because of the point I mentioned above that resources should not be shared, but if needed we can change that as required. Using such traits, we can define custom grouping as well if certain logic is common across several tests using custom traits. My current thoughts of cmd interface are something like this # to run all tests, -p is path to the runtime
integration_test -p ./youki
# to run specific tests
integration_test -p ./youki -t test1 test2 test3
# to run specific groups
integration_test -p ./youki -g group1 group2 What are your thoughts on this? |
@YJDoc2 you're probably right with the fact that groups shouldn't share any resources, so the ability to do a drop trait there isn't all that useful. We can kinda maintain the order it a BTreeMap by making our own key type and doing hacky things, but they're just that, hacky things. Order of test execution is less of a hard requirement and more of a nice to have, a BTree might be good enough too since I'm mostly after just having the output look the same each time. I'm kind of in favor of my later idea. I think having some predefined and reusable Test structs that implement some Testable trait might be better than the proc macro approach. Like you just have a Test struct and you pass a name, a function, and optionally some other args(depending on what the Testable type does). Then you're not creating a struct for each test your just creating an instance of one of the Testable types. Enums for groups might work. I suppose I was thinking of a group as a granular unit similar to oci-spec test. Like the ability to just run the group of tests pertaining to say the cgroups v1 cpu subsystem. Then you can call an entire group or an individual test or set of tests from a group like 'test-cmd group1 group2::a group3::a,b,c' which runs all of group1, one test of group2, and 3 tests of group3. |
Maybe in the morning I can write a more comprehensive example of what I'm thinking. |
Sure, I will also think some more on your as well as my approach in the meantime. Thanks! |
@YJDoc2 This is a comprehensive example of what I'm thinking to combine parts from both our ideas based on our discussion thus far. This avoids having to define structs for each test all together, and therefore also avoids the complexity of making something like a proc macro, and I think has the most reusability and should be clean and relatively easy to start out. I don't think a lot of boiler plate is involved. I decided to go with BTreeMap to keep ordering, HashMap should work just fine too. I kept the concept of a test group because I think it's useful, and I just made it a reusable type so no need to define a new struct, it also adds a lot of reusable methods for groups which I think will be useful. I kept the idea of using lazy statics here so you can have a lazy static for each test group to define a singleton, and then at the top level there is a lazy static BTreeMap which holds all the TestGroups, this should give you O(l(og n) m) time for provided tests to run both for running entire groups or specific tests within groups. I think this design isn't too complicated, hard to implement, and I think it's relatively extensible. I think it's well worth the time to start out with good abstractions that can be built upon. I think the one major place for improvement might be obscuring the ugliness of creating the lazy statics using purpose built macros. I put the code in a gist in case you want to fork it and edit it, also it's just a lot to post in an comment. |
|
Hey @tsturzl I saw your code, and I agree with most of your code, but I still don't agree that TestGroup should be a structure 😅 I would instead suggest that we use bitflags to specify which group a test belongs to. This will allow a test to be part of multiple groups if needed, as well as give granular control over test to be run. In the code you suggested, we will need to specify the tests as group1::test_1,test_2 etc. on cmd , that ends up giving names of tests anyways, (test_1,test_2); so we may as well just give names of the tests themselves , and specify the the groups separately as needed. I have updated your gist with my suggestion of bitflags, so please take a look : https://gist.github.com/YJDoc2/ad9f46c393e45819c8c7f853e6aec3d1 |
I think bitflags for groupings is fine if that works better or looks cleaner. I guess I just see it as cleaner to have a group own the tests it runs versus the tests own which group them belong to; you end up specifying the group for every test you have rather than creating the test and then specifying group once for all the tests, it's just more repetition and kind of the inverse of how you'd probably logically think of the problem. You can create all the groups and then populate them later, and test groups can own a lookup structure for the tests they own. Like in my example a group has its own Map for quickly filtering lookups. I think having granular groups is also important, that's exactly how the original tests are, and I think more often you'll want to run a group of tests pertaining to what you're working on. Like I want to run all the cgv2 memory controller tests because that's what I'm working on, maybe 2 tests are failing and I already know the group so saying 'cgv2_memory::test1,test2' is relatively easy and then the collision of test names isn't a worry by considering test names to be in a global namespace, because cgv1 and cgv2 might have the same test names, and if you start making more verbose test names like "cgv2_memory_max_limit", "cgv1_memory_max_limit" there isn't a ton of efficiency gain from just typing out a group "cgv2_memory::max_limit". In fact I'd argue that there are cases were filtering by group then test is more efficient since you can do "cgv2_memory::test1,test2'" instead of "cgv2_memory_test1 cgv2_memory_test2", which I think will be relatively common. |
Hey, So I have been thinking for last couple of days, and after reading your comment, I can see that the group method would be better 😅 Initially I was thinking that most of our run cases would be either all tests, or particular named tests and groups, but then I realized that either we would run all of our cases or a specific one which has issues, and with your argument about groups providing namespacing, I agree that tests as part of group rather than groups as a part of test would be a better strategy. Sorry for such a long discussion over this 😓 I will now start implementing this framework as an individual crate inside our repo, similar to oci-spec and current rust implementation of integration tests. Thanks for all the help,discussion and comments :) |
Also I was thinking, would it be possible and useful to set up a CI step in github actions once we start implementing this? The step would essentially compile only the integration tests, and run that against runc and/or some other runtime, to validate the tests themselves. This step is easy to manually do locally as well, and including in CI would just make sure that the tests we make are satisfied by "production" runtimes as well. I would also like to know @utam0k 's thoughts on this. |
@YJDoc2 no worries! I hope I wasn't too critical. I think having lazy_statics does simplify the design and I wasn't previously aware of that pattern in rust. I'm excited to see how it turns out. As far as having CI tests for the integration testing framework my opinion would be that we don't want to tie in the workflow actions for testing against runc into the youki repo, so I'd either say we should just omit it from CI or think about breaking out into another repo. Per another conversation I think oci spec is breaking out to its own repo possibly now since it's also useful in other projects in the container group. I'd actually say that these integration tests could end up being useful for other projects, such as testing against crun which is also in the container group. My hunch would be it's a bit early to move this out into it's own repo, and with that I think it's probably best to leave it out of CI. However I think there might be a comfortable medium here. You could make an entirely new workflow for testing the integration suite against runc, and then have some other trigger (eg just a manual invocation). That way we can at least reference these test results easily in GitHub. It might not be as nice as having automatic tests run, but I don't know if we want to convalute youki CI with testing integration tests against runc. That's just my opinion. I'm also interested in what @utam0k thinks, and perhaps that can start the bigger conversation of eventually growing this aspect of the project into it's own efforts to replace oci runtime spec all together with our own implementation that addresses some of the issues we've run into along with supporting things like cgroups v2 |
We believe that this integration test will be very useful for the field of container runtime. If it works well, I will consider separating them into different repositories, but I think it's too early to do that now, since managing separate repositories is more or less a burden.
I think it is a good idea to use runc to verify the correctness of the integration test itself. However, I also think it is better to be able to run them manually. Or, how about running it only when there is a change in the codes in the integration test directory? |
This is ideal, it would be great if users could add tests in the form of plugins. For example, I think it would be a very good project if we could add tests with wasm. However this is for use outside of youki, so I think it should be optional. |
Youki is currently using the integration tests from rutime-tools, but there are some areas where this isn't sufficient (e.g. cgv2), so we will implement our own.
The text was updated successfully, but these errors were encountered: