
Parse and save quota scopes #190

Merged: 1 commit into ManageIQ:master, Dec 20, 2017

Conversation

cben (Contributor) commented Dec 13, 2017

Kubernetes namespaces (aka OpenShift projects) may have quotas.
Each quota may have one or more scopes, which we didn't store before but should:
https://kubernetes.io/docs/concepts/policy/resource-quotas/#quota-scopes
https://docs.openshift.com/container-platform/3.7/admin_guide/quota.html#quota-scopes

https://bugzilla.redhat.com/show_bug.cgi?id=1504560

@zeari @enoodle please review. VCR tests coming in a later PR.
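
For context, a minimal sketch of what the parser sees and emits here (illustrative values; the real change is in the diff below):

# Illustrative values only: spec.scopes is a plain list of strings from kubeclient
# (k8s currently allows only 4 scope values), and the parser wraps each one in a hash.
resource_quota.spec.scopes
# => ["BestEffort", "Terminating"]
resource_quota.spec.scopes.to_a.collect { |scope| {:scope => scope} }
# => [{:scope => "BestEffort"}, {:scope => "Terminating"}]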

cben (Contributor, Author) commented Dec 13, 2017

miq-bot add-label enhancement, gaprindashvili/yes

end

def get_container_quota_scopes_graphs(parent, hashes)
  hashes.each do |hash|
Member:

hashes & hash? What do these hashes contain?

Member:

a. So it should be:
  scopes.each do |scope|
?

Since scopes are just an array of strings, why do we need to convert them into hashes? What do we plan to store in them?

cben (Contributor, Author) commented Dec 17, 2017

👍 Good catch, it's strings indeed, too much copy-paste :)

cben (Contributor, Author):

Wait, no, I did make them hashes. Let me wake up and I'll respond again :)

cben (Contributor, Author):

See the other discussion.

On naming, assuming these actually remain hashes: I've tried to follow the file convention here; all get_foos_graph methods take outputs from parse_foo, named hashes / hash.

Member:

> all get_foos_graph methods take outputs from parse_foo, named hashes / hash.

Makes sense to me 👍

@@ -910,7 +918,8 @@ def parse_persistent_volume_claim(claim)

  def parse_resource_quota(resource_quota)
    new_result = parse_base_item(resource_quota)
    new_result[:container_quota_items] = parse_resource_quota_items resource_quota
    new_result[:container_quota_scopes] = resource_quota.spec.scopes.to_a.collect { |scope| {:scope => scope} }
Member:

Why do we need to convert an array of strings to an array of hashes? Are we planning to put more metadata in the quota_scopes?

moolitayer:

I think we need a hash, since save_inventory_container has to know the name of the column to persist the data into.

Maybe we should have named this container_quota_scopes.name?
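
For illustration, the nested-hash shape in question (values hypothetical; the nesting is what parse_resource_quota emits in the diff above):

# Hypothetical values; one hash per table, one key per column.
{
  :name                   => "besteffort",
  :container_quota_items  => [],  # one hash per quota item, omitted here
  :container_quota_scopes => [{:scope => "BestEffort"}, {:scope => "Terminating"}],
}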

Member:

If scopes are strings (e.g. we store only the name of the scope), then it is much more efficient to store them in an array of strings.

For example, instead of:
  [{name: 'hello'}, {name: 'world'}]
we will store:
  ['hello', 'world']

This will also simplify the DB and save time parsing and storing inventory to and from the DB, since we will be searching and storing text (array<=>string:csv or array<=>string:json, or even an array type if ManageIQ uses those) instead of creating and deleting table rows in a related table.

This is of course only valid if scope is indeed only a list of strings. Do you know what we plan for it?
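
A minimal sketch of this alternative (not what the PR implements; assumes a hypothetical `scopes` text column on container_quotas, and requires ActiveRecord):

# Not the merged schema: a sketch of the single-text-column alternative.
class ContainerQuota < ApplicationRecord
  serialize :scopes, JSON  # persists ["BestEffort", "Terminating"] as one JSON string
end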

cben (Contributor, Author):

About this code (given the current db schema):

Mooli is correct, the convention here is that parse_foo functions return nested hashes shaped like save_inventory wants them: a separate hash per table, a key per column.
(Then get_foos can use these as-is, and get_foos_graph functions do some extra work disconnecting these hashes into separate collections. We might revisit this after we drop the old refresh support.)

About the DB schema:

Input is only a list of strings. (Actually even a list of "enums"; currently k8s allows only 4 values.)
Indeed a postgres array or JSONB column sounds (untested!) more efficient than a separate table, at least for storing.
The consequence on efficiency of display / reports is hard (for me) to estimate up-front...

I asked myself this when adding the schema, and figured that performance here is not important; there will be few (frequently 0 or 1) quotas per project, and up to 2 scopes per quota.
So I decided, absent strong reasons for one or the other, to take the road more travelled by: ManageIQ uses separate tables for arrays almost everywhere, JSONB almost nowhere.

Oh, there is additional data here beyond the scope string: created_at / deleted_at!
We intend to (mis)use archiving to retain the full history of quotas.
If a quota's scopes change (TODO: check if that's possible in k8s; for now I'm assuming yes),
having scopes as separate rows allows archiving/adding just the individual scopes, instead of having to deep-copy the whole quota and its items.
(We will discuss deep-copying again soon, when I work on the archiving; I have other arguments for and against... But when deciding the schema, I wanted to keep both approaches possible.)

In any case, we want this gaprindashvili/yes, but can no longer change the schema there.
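
As a sketch, the separate-table schema this lands on (a hedged reconstruction from the record shape shown in the console output further down; the real migration lives in the core manageiq repo):

# Reconstructed for illustration; not the actual migration.
create_table :container_quota_scopes do |t|
  t.belongs_to :container_quota, :type => :bigint
  t.string     :scope       # e.g. "BestEffort", "Terminating"
  t.timestamps
  t.datetime   :deleted_on  # set when the scope row is archived, instead of deleting it
end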

Member:

> performance here is not important

I thought the whole refactoring of this code was for performance? What am I missing?

> (Actually even a list of "enums"; currently k8s allows only 4 values.)

Then why add a select join to each query? Just to avoid writing the code to handle these 4 possible strings...

> The consequence on efficiency of display / reports is hard (for me) to estimate up-front...

Reports are done only monthly/daily/on request while inventory is done every 15m (?), so this should not be taken lightly.

An extra thing to note here is that the current code deletes all scopes and writes them in again on each refresh (15m); using 'join' instead of a regular query on lots of data can waste some time we could use elsewhere.

cben (Contributor, Author):

> An extra thing to note here is that the current code deletes all scopes and writes them in again on each refresh (15m)

No, it shouldn't do that; both old and graph refresh diff what's in the DB against what the parser emits, and should modify nothing if scopes didn't change.

I think I may have confused you about "deep copying".
If only scopes are modified:

  • The current schema can archive and create new scope(s) without archiving and re-creating the quota + quota items, because each scope has timestamps.
  • If scopes were an array column on the quota table, we would have to archive & re-create the quota + quota items: less efficient for this scenario!

> using 'join' instead of a regular query on lots of data can waste some time we could use elsewhere.

Join is standard "data normalization"; we do them all over, and DBs handle them pretty well (not obviously slower than an array column). Old refresh is probably slower because it recurses record by record; graph refresh does just one big join for all scopes of all quotas. And we have a similar join for the quota items table anyway.

Anyway, this schema already exists and can't be changed for Gaprindashvili. This is what always happens given time pressure to add schema far before code :-(
I'm happy if you already know what's the best schema here, because I honestly don't :-), I just went with the most "straightforward" one...
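
For what it's worth, a hedged illustration of the "one big join" shape (the association and column names here, :container_quota and :ems_id, are assumptions, not copied from the refresh code):

# Illustrative only: one query for all scopes of all quotas of a provider.
ContainerQuotaScope.joins(:container_quota)
                   .where(:container_quotas => {:ems_id => ems.id})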

Member:

> If scopes were an array column on the quota table, we would have to archive & re-create the quota + quota items: less efficient for this scenario!

Changing one text field should be faster than doing a join to change a text field... Why do you think querying with a join over two tables is more efficient? (Maybe depending on implementation?)

> Join is standard "data normalization"; we do them all over

That is sad, and may explain the slowness of things.

> not obviously slower than an array column

Do you have benchmarks to show that? Sounds strange.

> I'm happy if you already know what's the best schema here, because I honestly don't :-)

I suggested above a text field storing the data as csv/json; setting up a table holding only one string field ("we do them all over") is sad...

Member:

Note: the question of "data normalization" in this context is whether scope is an atomic part of quota; does using scope.name without its related quota make sense?

Member:

I still think that in this case, simple straightforward solutions are better, like:
using one text field that holds one of the 9 possible scope combinations (3 × 3 for the two tri-state dimensions below),
or using two text fields, terminating - yes/no/undefined and best-effort - yes/no/undefined.

But since we want this done, and we abuse :has_many all over the place already, I can live with the current implementation.

zeari commented Dec 14, 2017

@cben Since we know there are only a few types of scopes (Terminating, Non-terminating, etc.),
does it make more sense to keep a scope column (or columns?) on quota, instead of a new table + association + record for each scope?

Edit: I think we had this conversation, but we figured out that it would be easier to query this schema of the DB.

Edit Edit: And the changes are already in schema anyway

yaacov (Member) commented Dec 14, 2017

> Edit Edit: And the changes are already in schema anyway

@zeari that is a very good reason ... 😿

cben (Contributor, Author) commented Dec 17, 2017

Oops, I forgot to write the old refresh get_container_quota_scopes (which is what happens when I try to sneak in a PR without refresher specs :-)

cben changed the title from Parse and save quota scopes to [WIP] Parse and save quota scopes on Dec 17, 2017
cben (Contributor, Author) commented Dec 17, 2017

@miq-bot add-label enhancement, gaprindashvili/yes

cben changed the title from [WIP] Parse and save quota scopes to Parse and save quota scopes on Dec 19, 2017
miq-bot (Member) commented Dec 19, 2017

Checked commit cben@29ef338 with ruby 2.3.3, rubocop 0.47.1, haml-lint 0.20.0, and yamllint 1.10.0
4 files checked, 1 offense detected

spec/models/manageiq/providers/kubernetes/container_manager/refresh_parser_spec.rb

miq-bot removed the wip label on Dec 19, 2017
cben (Contributor, Author) commented Dec 19, 2017

PTAL.
Added parser specs.
Added a trivial assertion that no scopes are generated with the current VCR;
tested non-trivially locally with an upcoming VCR in both old and graph refresh modes, but the cassette I have now is not good for other tests, so this will come in a future PR after I finalize ManageIQ/manageiq-providers-openshift#75.

> Oops, I forgot to write the old refresh get_container_quota_scopes

No, that's OK; just the top-level get_resource_quotas is enough for old refresh, since parse_resource_quota already emits the nested hashes in the necessary format.
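
A hypothetical sketch of what that top-level old-refresh get looks like (method body assumed from the convention described above, not quoted from the diff):

# Sketch only: the nested :container_quota_items / :container_quota_scopes
# hashes ride along inside each parsed quota.
def get_resource_quotas(inventory)
  process_collection(inventory["resource_quotas"], :container_quotas) do |quota|
    parse_resource_quota(quota)
  end
end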

enoodle left a comment

LGTM. A lot of words were written about the direction of a new model vs a text/json column. I think once we decided to go with a model, this PR is good. Pending green tests, which can only happen after the model PR in core manageiq is merged.

cben closed this Dec 20, 2017
cben reopened this Dec 20, 2017

cben (Contributor, Author) commented Dec 20, 2017

@moolitayer core dep merged, I believe this is ready.

yaacov (Member) left a comment

LGTM

moolitayer:

👍
# oc describe resourcequota
Name:		besteffort
Namespace:	default
Scopes:		BestEffort, Terminating
 * Matches all pods that do not have resource requirements set. These pods have a best effort quality of service.
 * Matches all pods that have an active deadline. These pods have a limited lifespan on a node before being actively terminated by the system.
Resource	Used	Hard
--------	----	----
pods		0	1

> ap ContainerQuota.first.container_quota_scopes
[
    [0] #<ContainerQuotaScope:0x000000073b2818> {
                        :id => 2,
        :container_quota_id => 3,
                     :scope => "BestEffort",
                :created_at => Wed, 20 Dec 2017 16:48:46 UTC +00:00,
                :updated_at => Wed, 20 Dec 2017 16:48:46 UTC +00:00,
                :deleted_on => nil
    },
    [1] #<ContainerQuotaScope:0x000000073b2408> {
                        :id => 3,
        :container_quota_id => 3,
                     :scope => "Terminating",
                :created_at => Wed, 20 Dec 2017 16:48:46 UTC +00:00,
                :updated_at => Wed, 20 Dec 2017 16:48:46 UTC +00:00,
                :deleted_on => nil
    }
]

moolitayer left a comment

LGTM 👍

moolitayer merged commit 139850b into ManageIQ:master on Dec 20, 2017
moolitayer self-assigned this Dec 20, 2017
moolitayer added this to the Sprint 76 Ending Jan 1, 2018 milestone Dec 20, 2017
simaishi pushed a commit that referenced this pull request Jan 3, 2018:
Parse and save quota scopes
(cherry picked from commit 139850b)
simaishi commented Jan 3, 2018

Gaprindashvili backport details:

$ git log -1
commit 2b7974358ee909df18efb6806d974a1e369bf216
Author: Mooli Tayer <mtayer@redhat.com>
Date:   Wed Dec 20 18:54:22 2017 +0200

    Merge pull request #190 from cben/quota-scopes
    
    Parse and save quota scopes
    (cherry picked from commit 139850b4b3ff435944e604b1c5480869459db5e6)
