Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove duplicate and random algorithms from tests #9626

Closed
wants to merge 1 commit into from

Conversation

PrivatePuffin
Copy link
Contributor

This is a spinoff from #8941 seperating general, unrelated, test changes from the zstd PR.

TL:DR:

  • Random tests (send-c_verify_ratio) lead to inpredicatble results
  • Duplicate algorithms lead to duplicate execution of tests
  • Combined the effects are even worse and lead to false positives

Signed-off-by: Kjeld Schouten-Lebbing kjeld@schouten-lebbing.nl

Motivation and Context

During my work on investigating the compression test suite for #8941 a few things became appearant:
send-c_verify_ratio used randomly selected compression algorithms. Those algorithms differ widely in performance and for the majority consist of different levels of gzip. This lead to difficulty comparing performance of tests over multiple runs. This also lead to situations where tests ended with a false-positive (pass) result, due to algorithms being skipped.

During my research into this there seemed to be 2 versions of multi-compression test suits:

  1. Randomly selected from a pool listing all different levels of algorithms
  2. Fixed selective non-random selection of algorithms

The 2 tests using version 1, both suffered from the same problems. (listed above) This wouldn't be that bad, but it gets worse when new algorithms get added with multiple levels.

Description

This change removes duplicate algorithms from all compression tests (and leaves just 1 of each kind), including the random compression pool.

It also removed the random draw of algorithms from send-c_verify_ratio, to make sure results are repeatable. This also removes one whole loop of the test and thus improves performance.

Is * taken into account?

The following is taken into account:

  • There is a seperate test for testing aliasses, those should work regardless
  • If many more algorithms get added, possibly in the future in some places random selection might be needed, but still should only list one level each to prevent limiting the chances of drawing a non-level based algorithm like lz4
  • If seperate levels work totally differently, a seperate test for that algorithm would be a cleaner solution than adding all levels everywhere.

How Has This Been Tested?

i've been debating and testing multiple versions/levels of these changes for about a week now. After I came up with them due to (unrelated) issues in #8941

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

- Random tests (send-c_verify_ratio) lead to inpredicatble results
- Duplicate algorithms lead to duplicate execution of tests
- Combined the effects are even worse and lead to false positives

Signed-off-by: Kjeld Schouten-Lebbing <kjeld@schouten-lebbing.nl>
@codecov
Copy link

codecov bot commented Nov 26, 2019

Codecov Report

Merging #9626 into master will decrease coverage by 0.19%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #9626     +/-   ##
=========================================
- Coverage   79.37%   79.18%   -0.2%     
=========================================
  Files         418      418             
  Lines      123531   123531             
=========================================
- Hits        98057    97817    -240     
- Misses      25474    25714    +240
Flag Coverage Δ
#kernel 79.9% <ø> (-0.02%) ⬇️
#user 66.79% <ø> (-0.51%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c46813...0823d84. Read the comment docs.

@PrivatePuffin
Copy link
Contributor Author

PrivatePuffin commented Nov 27, 2019

*note
Small thank-you to @c0d3z3r0 for mentioning general changes should be kept out of zstd when possible. If I complain about those of others, my own should be seperated as well :)

*additional note:
Test failures are unrelated, known issues

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Nov 27, 2019
@PrivatePuffin
Copy link
Contributor Author

@behlendorf You might want a restart on those failed tests (unrelated).
But I don't think its worth it, it has absolutely nothing to do with the PR and would've been caught by those tests regardless if there where any issues with this.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking in to this! Looks good, I agree we don't want to test each different compression level, and making the tests behave more consistently is a good thing.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Nov 27, 2019
@c0d3z3r0
Copy link
Contributor

What about tests/zfs-tests/include/libtest.shlib?

rand_set_prop $vol compression "on" "off" "lzjb" "gzip" \
"gzip-1" "gzip-2" "gzip-3" "gzip-4" "gzip-5" "gzip-6" \
"gzip-7" "gzip-8" "gzip-9"
rand_set_prop $vol compression "off" "lzjb" "gzip" "lz4"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add zle? (separate PR?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ZLE with random writen data (which is often used) is basically the same as compression=off

@c0d3z3r0
Copy link
Contributor

c0d3z3r0 commented Nov 28, 2019

Oh, I just stumbled over this... aren't we missing zle and lz4 in function get_compress_opts in tests/zfs-tests/include/libtest.shlib?

@PrivatePuffin
Copy link
Contributor Author

PrivatePuffin commented Nov 28, 2019

@c0d3z3r0 I purposefully left this PR to do one thing only:
Remove duplicate algorithms and random selected algorithms where possible.

I have decided not to add the added algorithms brainslayer added to my PR.

  • lz4 is included in libtest.shlib as on
  • I've decided not to remove the gzip list there, I am not 100% certain that list should contain just 1 gzip variant.... The way brainslayer did those changes didn;t look completely right to me, throwing in GZIP_OPTS with just 1 value and adding ZSTD_OPTS with also just one value...

But i'll look into it today, Just need to go over all instances get_compress_opts is called and make sure what the consequences of these changes would be.

@PrivatePuffin
Copy link
Contributor Author

@c0d3z3r0
You made me realise the fact we might need a bigger refactor of this, I'll close this for now while I work on that. We can always reopen if i'm not happy with how that works out :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants