Parse and save quota scopes #190
```ruby
end

def get_container_quota_scopes_graphs(parent, hashes)
  hashes.each do |hash|
```
What do these `hashes` contain (and each `hash` within them)?
To my understanding it is coming from here: https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/190/files#diff-0324981fdb3019ce6d98f9c86d97f2bbR921
a. So it should be `scopes.each do |scope|`?
Since scopes are just an array of strings, why do we need to convert them into hashes? What do we plan to store in them?
👍 good catch, it's strings indeed, too much copy-paste :)
wait, no I did make them hashes. let me wake up and I'll respond again :)
See other discussion.
On naming, assuming these actually remain hashes: I've tried to follow the file convention here; all `get_foos_graph` methods take outputs from `parse_foo` named `hashes` / `hash`.
all get_foos_graph methods take outputs from parse_foo named hashes / hash.
makes sense to me 👍
@@ -910,7 +918,8 @@ def parse_persistent_volume_claim(claim)

```ruby
def parse_resource_quota(resource_quota)
  new_result = parse_base_item(resource_quota)
  new_result[:container_quota_items] = parse_resource_quota_items resource_quota
  new_result[:container_quota_scopes] = resource_quota.spec.scopes.to_a.collect { |scope| {:scope => scope} }
```
Why do we need to convert an array of strings to an array of hashes? Are we planning to put more metadata in the quota_scopes?
I think we need a hash since save inventory container has to know the name of the column to persist the data into. Maybe we should have named this container_quota_scopes.name?
If scopes are strings (e.g. we store only the name of the scope), then it is much more efficient to store them in an array of strings. For example, instead of:
[{name: 'hello'}, {name: 'world'}]
we will store:
['hello', 'world']
This will also simplify the DB and save time parsing and storing inventory to and from the DB, since we will be searching and storing text (array<=>string as CSV or JSON, or even a native array type if ManageIQ uses those) instead of creating and deleting rows in a related table.
This is of course only valid if scope is indeed only a list of strings. Do you know what we plan for it?
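A minimal sketch of the array<=>string idea being suggested here (plain Ruby round-trip, not the PR's actual implementation):

```ruby
require "json"

# Store the scope list as one JSON text value instead of join-table rows.
scopes = ["Terminating", "BestEffort"]

stored = JSON.generate(scopes)  # the single text column value
loaded = JSON.parse(stored)     # reading it back recovers the array
```
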
About this code (given the current db schema):
Mooli is correct, the convention here is that `parse_foo` functions return nested hashes shaped like save_inventory wants them — separate hash per table, key per column.
(Then `get_foos` can use these as-is, and `get_foos_graph` functions do some extra work disconnecting these hashes into separate collections. We might revisit this after we drop the old refresh support.)
about DB schema:
Input is only a list of strings. (actually even a list of "enums" — currently k8s allows only 4 values)
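For reference, the four scope values Kubernetes allowed at the time (per the resource-quota docs linked in the PR description) could be captured as a constant — a sketch, not code from this PR:

```ruby
# The four ResourceQuota scope values defined by Kubernetes at the time.
K8S_QUOTA_SCOPES = %w[Terminating NotTerminating BestEffort NotBestEffort].freeze
```
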
Indeed a postgres array or JSONB column sounds (untested!) more efficient than separate table, at least for storing.
The consequence on efficiency of display / reports is hard (for me) to estimate up-front...
I've asked myself this when adding the schema, and figured that performance here is not important — there will be few (frequently 0 or 1) quotas per project, and up to 2 scopes per quota.
So I decided, absent strong reasons either way, to take the road more travelled by — ManageIQ uses separate tables for arrays almost everywhere, and JSONB almost nowhere.
Oh, there is additional data here beyond the scope string — created_at / deleted_at!
We intend to (mis)use archiving to retain full history of quotas.
If a quota's scopes change (TODO: check if that possible in k8s, for now I'm assuming yes),
having scopes as separate rows allows archiving/adding just the individual scopes, instead of having to deep-copy whole quota and its items.
(We will discuss deep-copying again soon, when I work on the archiving, I have other arguments for and against... But when deciding schema, I wanted to keep both approaches possible.)
In any case, we want this gaprindashvili/yes, but can no longer change the schema there.
performance here is not important
I thought the whole refactoring of this code was for performance? What am I missing?
(actually even a list of "enums" — currently k8s allows only 4 values)
Then why add a join to each query? Just to avoid writing the code to handle these 4 possible strings...
The consequence on efficiency of display / reports is hard (for me) to estimate up-front...
Reports are done only monthly/daily/on request while inventory is done every 15m (?); this should not be taken lightly.
An extra thing to note here is that the current code deletes all scopes and writes them in again on each refresh (15m); using a join instead of a regular query on lots of data can waste time we could use elsewhere.
Extra thing to note here is that current code deletes all scopes and write them in again, each refresh (15m)
No, it shouldn't do that, both old and graph refresh diff what's in DB to what parser emits and should modify nothing if scopes didn't change.
I think I may have confused you about "deep copying".
If only scopes are modified:
- Current schema can archive and create new scope(s) without archiving and re-creating quota + quota items. Because each scope has timestamps.
- If scopes were an array column on the quota table, we would have to archive & re-create the quota + quota items, less efficient for this scenario!
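The timestamp-based archiving idea described above can be sketched with plain Ruby hashes (field and value names assumed for illustration, not taken from the schema):

```ruby
# Each scope is its own row with created_at/deleted_at timestamps, so a scope
# change only archives/creates scope rows; the quota row stays untouched.
quota     = {:name => "compute-quota"}
old_scope = {:scope => "BestEffort", :created_at => Time.utc(2017, 1, 1), :deleted_at => nil}

old_scope[:deleted_at] = Time.now.utc  # archive just the old scope row
new_scope = {:scope => "Terminating", :created_at => Time.now.utc, :deleted_at => nil}
```
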
using a join instead of a regular query on lots of data can waste time we could use elsewhere
join is standard "data normalization", we do them all over, DBs handle them pretty well (not obviously slower than an array column). Old refresh is probably slower because it recurses record by record; graph refresh does just one big join for all scopes of all quotas. And we have similar join for quota items table anyway.
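The per-record vs batched difference can be illustrated with plain Ruby data (not the actual refresh code): graph refresh fetches all scopes for all quotas at once and groups them in memory, whereas old refresh recurses record by record.

```ruby
# Result rows as one big JOIN might return them, for all quotas at once:
scopes = [
  {:quota_id => 1, :scope => "Terminating"},
  {:quota_id => 2, :scope => "BestEffort"},
  {:quota_id => 2, :scope => "NotTerminating"},
]

# One in-memory pass associates every scope with its quota:
scopes_by_quota = scopes.group_by { |row| row[:quota_id] }
```
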
Anyway this schema already exists, and can't be changed for Gaprindashvili. This is what always happens given time pressure to add schema far before code :-(
I'm happy if you already know what's best schema here, because I honestly don't :-), I just went with most "straightforward" one...
If scopes were an array column on the quota table, we would have to archive & re-create the quota + quota items, less efficient for this scenario!
Changing one text field should be faster than doing a join to change a text field... why do you think querying with a join over two tables is more efficient? (Maybe depending on implementation?)
join is standard "data normalization", we do them all over
That is sad, may explain the slowness of things.
not obviously slower than an array column
Do you have benchmarks to show that ? sounds strange.
I'm happy if you already know what's best schema here, because I honestly don't :-)
I suggested above a text field storing data as CSV/JSON; setting up a table holding only one string field ("we do them all over") is sad...
Note: the question of "data normalization" in this context is whether `scope` is an atomic part of `quota`: does using `scope.name` without its related `quota` make sense?
I still think that in this case simple, straightforward solutions are better, like:
using one text field that holds one of the 9 possible scope combinations,
or using two text fields, terminating: yes/no/undefined and best-effort: yes/no/undefined.
But since we want this done, and we abuse :has_many all over the place already, I can live with the current implementation.
Edit: I think we had this conversation, but we figured out that it would be easier to query this schema of the DB. Edit Edit: And the changes are already in the schema anyway.
@zeari that is a very good reason ... 😿
@miq-bot add-label enhancement, gaprindashvili/yes
Checked commit cben@29ef338 with ruby 2.3.3, rubocop 0.47.1, haml-lint 0.20.0, and yamllint 1.10.0
spec/models/manageiq/providers/kubernetes/container_manager/refresh_parser_spec.rb
PTAL.
No, that's OK, just top-level
LGTM, a lot of words were written about the direction of a new model vs a text/json column. I think once we decided to go with a model this PR is good. Pending green tests, which can only happen after the model PR in core manageiq is merged.
@moolitayer core dep merged, I believe this is ready.
LGTM
👍
LGTM 👍
Parse and save quota scopes (cherry picked from commit 139850b)
Gaprindashvili backport details:
Kubernetes namespaces (aka openshift projects) may have quotas.
Each quota may have one or more scopes, which we didn't store before but should:
https://kubernetes.io/docs/concepts/policy/resource-quotas/#quota-scopes
https://docs.openshift.com/container-platform/3.7/admin_guide/quota.html#quota-scopes
https://bugzilla.redhat.com/show_bug.cgi?id=1504560
@zeari @enoodle please review. VCR tests coming in later PR.