This repository has been archived by the owner on Apr 14, 2023. It is now read-only.

Remove newsroom.smgov.net from tracked websites #57

Open
thekaveman opened this issue Sep 27, 2017 · 7 comments

Comments

@thekaveman
Contributor

We launched www.santamonica.gov on September 22, which includes the newsroom functionality. On that date, we began redirecting newsroom URLs to the corresponding URLs on the newer site.

In the short-term, we can disable realtime reporting for newsroom.smgov.net.

In the long-term, we can completely remove newsroom.smgov.net. Since our longest reporting period is 90 days, the timeframe here is sometime after December 21, 2017.

thekaveman added a commit that referenced this issue Sep 27, 2017
@thekaveman
Contributor Author

See also #56

@thekaveman
Contributor Author

thekaveman commented Oct 21, 2017

@allejo: related to what I mentioned on the closure of #56. When I made ed16d7d and removed an old site, the aggregate WebJob started failing.

Removing the site from the _websites collection removes the key from the Jekyll-generated reports/variables.json file. Subsequent runs of the other WebJobs won't generate new data for the removed site. This is all expected 👍

However, data generated prior to removing the site is never cleaned up. When aggregate runs following the removal, it uses the contents of the data directory (a subdirectory for each agency) and the keys in reports/variables.json; since there is a mismatch, we get the error.

Two options I can think of: either use the keys from reports/variables.json exclusively, or have a separate cleanup WebJob that continuously deletes subdirectories of data that don't exist as keys in reports/variables.json. (I kind of like the former approach better than the latter.) Your thoughts?
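The former approach could be sketched roughly as below. This is a hypothetical illustration only; the file layout (a `data/<agency>/` subdirectory per site, and `reports/variables.json` mapping each tracked agency key to its config) is an assumption based on this discussion, not the actual aggregate WebJob code:

```python
import json
from pathlib import Path

def aggregate_sites(repo_root):
    """Aggregate report data keyed off the Jekyll-generated
    reports/variables.json, instead of listing the data/ directory.

    Directories left behind by removed sites are simply never
    visited, so no key mismatch can occur. Layout is assumed:
    data/<agency>/ holds per-site JSON report files.
    """
    root = Path(repo_root)
    variables = json.loads(
        (root / "reports" / "variables.json").read_text()
    )

    aggregated = {}
    for agency in variables:  # only sites still tracked
        agency_dir = root / "data" / agency
        if not agency_dir.is_dir():
            continue  # site tracked but no data generated yet
        aggregated[agency] = sorted(
            p.name for p in agency_dir.glob("*.json")
        )
    return aggregated
```

Because iteration is driven entirely by the keys in variables.json, a stale `data/newsroom.smgov.net/` directory would be ignored rather than causing an error.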

@allejo
Collaborator

allejo commented Oct 21, 2017

Ahhh that would make a lot of sense... Yea, I'm in favor of using reports/variables.json exclusively in the aggregate WebJob.

As for cleaning up old data, we could have a manual WebJob that deletes any stale data, which we could run every so often? Or we could tie that WebJob/script to run on deployment as well.
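A cleanup script like that might look something like the following sketch. Again, the layout (`data/<agency>/` subdirectories and `reports/variables.json` as the authoritative key list) is an assumption from this thread, and the function name is hypothetical:

```python
import json
import shutil
from pathlib import Path

def clean_stale_data(repo_root, dry_run=True):
    """Delete data/<agency> directories whose key no longer
    appears in reports/variables.json.

    Defaults to a dry run so a deployment hook can log what it
    would remove before destructive runs are enabled.
    """
    root = Path(repo_root)
    tracked = set(json.loads(
        (root / "reports" / "variables.json").read_text()
    ))
    removed = []
    for agency_dir in (root / "data").iterdir():
        if agency_dir.is_dir() and agency_dir.name not in tracked:
            if not dry_run:
                shutil.rmtree(agency_dir)
            removed.append(agency_dir.name)
    return sorted(removed)
```

Run on deployment, this would keep the data directory in sync with the tracked-sites list instead of letting removed sites' data sit around forever.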

@thekaveman
Contributor Author

thekaveman commented Oct 21, 2017

Oh I like the idea of doing a clean on deployment! That plus moving aggregate to key off the reports/variables.json file should solve our current issue with removing sites and prevent stagnant data from sitting around forever.

@allejo
Collaborator

allejo commented Oct 21, 2017

Should the change go into the feature/aggregate-script-46 branch so that can be revived/merged? Or do it in both branches (rewrite + master).

@thekaveman
Contributor Author

Let's revive that thing and get it merged! I think I was supposed to review your changes, right?

@allejo
Collaborator

allejo commented Oct 21, 2017

Yea, and I just need to confirm that the generated data is the same as with the current script.

allejo added a commit that referenced this issue Oct 21, 2017
Relying on the data in the filesystem is only reliable when working with
a clean slate. However, deleting websites will leave old data behind, so
instead of checking the filesystem, use Jekyll generated files for their
actual purpose: being the authoritative source of sites & reports.

Fixes #57