Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cron by ressource #175

Closed

Conversation

Gabriel-Desharnais
Copy link
Contributor

So this modification integrate the cron inside the web app. There is no need to use crontab. Also there is one cron by resource and each of them can be set at different time interval

@justb4 justb4 self-requested a review November 18, 2017 12:00
@justb4
Copy link
Member

justb4 commented Nov 18, 2017

Thanks @Gabriel-Desharnais, this is a much requested feature from other users also described in issue #132 by @alexandreleroux . I was wondering if this could be done within a Flask webapp, especially when running in multi-process WGSI deployments like gunicorn, wouldn't we get one Scheduler per process? See e.g. this issue. As it touches the core of GHC and all existing installations and the database, please allow us some time to carefully review your PR.
But again: much appreciated.

As a first step, you may have noticed that the Travis build has failed. Mostly this is due to the Flake8 Python Coding Standards check. It is quite strict but easy to fix and test. See the Travis details. And just run find . -name "*.py" | grep -v docs | grep -v migrations | xargs flake8.

Also one of the tests failed, unclear why but your can rerun while fixing:

# install GHC with DB etc

# Load testdata into DB
python GeoHealthCheck/models.py load tests/data/fixtures.json y

# Run the healthchecks (what is normally run via cron)
python GeoHealthCheck/models.py run

@Gabriel-Desharnais
Copy link
Contributor Author

That should do it

@justb4
Copy link
Member

justb4 commented Dec 8, 2017

@Gabriel-Desharnais @tomkralidis Note some discussion on Gitter on this PR. Probably better to proceed here.

Copy link
Member

@justb4 justb4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try take this direction. Some general remarks:

  • database upgrade: existing installations need to update their DBs using paver upgrade. A migration needs to be added. I have a migration script, generated when trying your PR.
  • some spelling errors of frequency (will indicate in code)
  • test_frequency use database column default i.s.o. code to always have default of 60 mins.
  • need to tackle the case of "multi-processing": where GHC runs as several separate processes, e.g. in Apache mod_wsgi or gunicorn. See suggestion on Gitter.
  • GHC_RETENTION_DAYS, why the change into days, seconds, microseconds, milliseconds, minutes, hours, weeks? Will be incompatible with existing configs. May use cron type scheduling.
  • flush_runs() runs once a minute, possibly once an hour or even day will do.
  • requirements.txt add specific version of APScheduler as dependency

def start_crons():
# Cold start evry cron of evry ressource
for resource in Resource.query.all():
freq = resource.test_frequency
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do this via setting a default value for test_frequency in Models.py and the migration script. No need to set here.

'max_instances': 100000
})

scheduler.add_job(flush_runs, 'interval', minutes=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite often, possibly once an hour will do.

@@ -37,7 +37,8 @@
# Replace None with 'your secret key string' in quotes
SECRET_KEY = None

GHC_RETENTION_DAYS = 30
# days, seconds, microseconds, milliseconds, minutes, hours, weeks
GHC_RETENTION_DAYS = (30, 0, 0, 0, 0, 0, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to retain the existing type for GHC_RETENTION_DAYS, may expand this to a timedelta explicitly.

# Complete handle of old runs deletion
def flush_runs():
APP = App.get_app()
retention_time = timedelta(*APP.config['GHC_RETENTION_DAYS'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See earlier comments: may just calculate retention time from days, e.g. retention_time in milliseconds will be something like:
GHC_RETENTION_DAYS * 24 * 3600 * 1000

<tr>
<th>Test frenqucy</th>
<td>
<input type="number" id="input_resource_frequency" name="resource_frenquency_value" value="{{ resource.test_frequency }}" style="width: 100%;"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling: resource_frenquency_value should be resource_frequency_value



// Collect test_frequency
var new_frequency = $('input[name="resource_frenquency_value"]').val();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling: resource_frenquency_value should be resource_frequency_value

@@ -80,6 +80,12 @@ <h3 class="page-header">{{ _('Monitoring Period') }}: {{ resource.first_run.chec
<button class="btn btn-{{ resource.reliability| cssize_reliability }} btn-sm nohover">{{ resource.reliability|round2 }}%</button>
</td>
</tr>
<tr>
<th>{{ _('Test frenquency (minutes)') }}</th>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling: frenquency should be frequency

@@ -200,6 +199,7 @@ class Resource(DB.Model):
owner = DB.relationship('User',
backref=DB.backref('username2', lazy='dynamic'))
tags = DB.relationship('Tag', secondary=resource_tags, backref='resource')
test_frequency = DB.Column(DB.Integer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add default 60 mins here

@justb4
Copy link
Member

justb4 commented Nov 5, 2018

Sorry, but closing this PR, as it has been implemented via PR #222. Thanks @Gabriel-Desharnais for the ideas in this PR. The actual implementation in #222 appeared to be more involved as GHC can run multiple processes in cases, e.g. via WSGI. Also attended via #132.

@justb4 justb4 closed this Nov 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants