Have NetKAN bot collect download counts #67
I'll take a close look next coffee break, but we already have all the JSON parsing + HTTP tools available as part of the Perl stack. Is there a good reason to do this in bash?
No, no reason, I just started it as a standalone script to see how far I could get with it, and I like bash for rapid prototyping. I'm more familiar with wget and jq than their Perl equivalents these days. I'll take that as a recommendation to rewrite in Perl...
Yeah, it's a full package install. You can use cpanm + local::lib to do it, instructions in the readme. I've gotta package it up with docker, that'd make deployments much easier! Also then you can grab the docker image instead for testing.
I'd probably do something along the lines of the Status module if I were to reimplement it, but I'm guessing you were going for a quick win 🙂 (which is totally fair)
Ahh, the readme, of course!
Hmm.
in the repo. DZIL is super heavyweight, but it makes release management a breeze. The install will take care of installing the files into bin, and from memory I'm pretty sure local::lib sorts out pathing.
Perl port completed! I'm going to go back and delete some of my comments about installing and errors and tests and so forth...
This is a really neat addition, I can't see anything to prevent merging. Awesome work!
method _build__http {
    return HTTP::Tiny->new(timeout => 15);
}
We do have our own wrapped HTTP::Tiny, but considering the utility I'm not overly stressed.
Counting all downloads makes it less about "popularity" and more about how often a mod was updated and how many bug-fixing releases there were. Maybe count popularity as the maximum of the downloads of the last 2 versions? That neutralizes the above problem, avoids the drop right after a fresh release, and gets KER back in the row.
The plurality of mods are hosted on SpaceDock, which only provides one overall count value, sample:
@techman83, did this get deployed to the server? The file hasn't shown up yet, but the bot otherwise appears to be functioning normally, so if the new code is running then I have some debugging to do...
Hasn't been deployed yet, I wasn't at home at all over the weekend. I'll take a look this arvo if I get a moment.
Motivation
KSP-CKAN/CKAN#2415 suggests using publicly available data from host APIs to display aggregate download counts in CKAN. If this was in a GUI column, users could sort by "popularity" to find major mods that they haven't tried before.
Changes
This pull request is a first step towards making that idea a reality on the infrastructure side.
Now after the NetKAN bot finishes inflating all modules, it will generate a ~37 KB file at CKAN-meta/download_counts.json, that looks like this (but without the whitespace):

This file is then added, committed, and pushed to the CKAN-meta repo. Since clients download the master.tar.gz of CKAN-meta when they update the registry, this will give us the ability in a future pull request to parse this json file into a new Dictionary<string, int> so the counts can be shown in GUI.

Alternatives considered
In theory we could have the client collect this data itself, but it takes several minutes, and users probably don't want to wait for that. It also effectively requires a GitHub token to work.
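For a sense of what collecting a count involves for a GitHub-hosted mod, here is a rough sketch of the summing step, written in Python for brevity rather than the bot's Perl. It assumes the list of releases has already been fetched from the GitHub releases API and decoded, where each release carries an assets list and each asset a download_count. The function name and the sample data are made up for illustration, and this is not the code in this pull request.

```python
from typing import Any


def total_github_downloads(releases: list[dict[str, Any]]) -> int:
    """Sum download_count over every asset of every release.

    `releases` is assumed to be the decoded JSON list returned by the
    GitHub releases endpoint for one repository.
    """
    return sum(
        asset.get("download_count", 0)
        for release in releases
        for asset in release.get("assets", [])
    )


# Illustrative data only; not real counts for any mod.
sample_releases = [
    {"tag_name": "v1.1", "assets": [{"name": "mod-1.1.zip", "download_count": 120}]},
    {
        "tag_name": "v1.0",
        "assets": [
            {"name": "mod-1.0.zip", "download_count": 300},
            {"name": "extras.zip", "download_count": 5},
        ],
    },
    {"tag_name": "v0.9", "assets": []},  # no uploaded assets, contributes 0
]

print(total_github_downloads(sample_releases))
```

A release with an empty assets list adds nothing to the total, and doing this for every module means at least one API request per GitHub-hosted mod.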
We could store a download_count property in .ckan files, but I don't think we want to have the bot constantly updating these files as the counts change. I also think we don't want the counts to be version-specific, as they would be if they were in .ckan files.

Caveats
The download counting isn't perfect, but then what is?
$kref is checked. So if your $kref points to SpaceDock, then your download count will only be your SpaceDock download count, with any downloads from Curse or GitHub not included.
asset_match isn't supported, but could be added in the future. Mods that share downloads will probably be impossible to separate.
x_netkan_github.use_source_archive will not be counted, we only add up values from the assets list of releases.
jq can't handle. I submitted pull requests for 3 of them, but I think the 4th is valid syntax that jq doesn't support (// comments). These will also be excluded: