Do the initial loading of images from GPO #1
Specifically, I'm proposing 4 directories:
I moved the It'd be nice if the main Python script kicked this off (or did it itself) after checking for new photos, so that people don't have to remember to run two scripts to keep the photos up to date. |
Originals added in d293cac For example: Resized to follow. |
Oh, just spotted the directory is called |
Nice! :) Er, sure, if it's not much trouble, might as well rename it, since |
All right, I've beefed up the docs, and updated the script to I've also updated our old contribution page to link here, and added a section to the README on non-GPO contributions (which I hope are rare). Before we update our Congress API docs to link to this instead of our old zips, I'd like to make sure that we have We currently have photos in our Sunlight collection for members back to at least 2007 (the 110th Congress). So we should collect photos for the 112th, 111th, and 110th first. Presumably, the cache means it will only bother to download photos for members we don't already have. |
Also - if the originals are consistently in the 500+x700+ range, I'd be up for revisiting our supported sizes, and putting some burden on clients to downsize where needed in-browser or in-app. For example, maybe just:
|
Why are the directories the sizes and not the bioguide IDs? It would seem to me that this would be more useful: Unless of course you're worried about having descriptive filenames... |
Also, since the original images are not always guaranteed to be the same size, maybe the size should only refer to a single axis (e.g. width) so that you're not distorting (or misrepresenting) the aspect ratio of the image? |
aa5b395 kicks off resizing from the script and adds the resized photos, using the initial 200x250, 100x125, and 40x50 sizes for now. |
About directories, I think About sizes, Flickr only refers to the longest side (unless they're square sizes). I think Last.fm does something similar, and both have labels for image sizes. Anyway, the ImageMagick resize command is using the
|
About sizes, the originals are currently sized:
So most are indeed 500+x700+. It's not terribly hard to resize images on our side, so I'm for having a good selection. Let me know if any sizes should be added/removed. About going back to 2007: memberguide goes back to the 110th and we have a switch to choose the number. Already downloaded images won't be overwritten. Cached member pages won't be replaced or reloaded from the web, but different congress sessions have different pages and even photos:
So for example, we'll download the 113 page as Currently we're matching against |
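The "already downloaded images won't be overwritten" behaviour could be sketched roughly like this. It is only an illustration, not the project's actual script: `save_if_missing` and the `fetch` callable are hypothetical names, and the real download logic may differ.

```python
from pathlib import Path
from typing import Callable


def save_if_missing(dest: Path, fetch: Callable[[], bytes]) -> bool:
    """Write fetch() to dest unless dest already exists.

    Returns True if a new file was written, False if an existing
    file was left untouched. `fetch` is a hypothetical callable that
    returns the photo bytes (for example, an HTTP download).
    """
    if dest.exists():
        return False  # keep the photo we already have
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(fetch())
    return True
```

Because `fetch` is only called when the file is missing, re-running the collection for an earlier Congress would not re-download photos that are already on disk.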
…k why: Smith, Gordon 110/SR/Smith / Specter, Arlen 110/SR/Specter / Stevens, Ted 110/SR/Stevens / Sununu, John E. 110/SR/Sununu / Voinovich, George V. 110/SR/Voinovich / Warner, John W. 110/SR/Warner
@hugovk, looking at your commits, it might be worth reporting the mistakes you found (Curzon, at least) to GPO to see if they can fix it. @GPHemsley, the dir structure is modeled by how Sunlight's managed our photo archive in the past, which included offering zips of individual directories (all For resizing, we can use imagemagick to ensure a consistent width and height without distorting the aspect ratio (by cropping a bit where needed). That's been what we've done so far and it's never looked bad. Also: this is coming together SO WELL! 😸 |
Looking at the image stats you compiled, @hugovk, it looks like It's much easier to ask people to have their browser/app/etc auto-scale an image down than to auto-scale it up. So, proposed sizes:
And if anyone wants smaller, they can auto-scale (or batch process them on their own). Any objections? /cc @dcloud @drinks @jcarbaugh |
Given that the majority of the images are This would mean something like |
And, actually, a lot of the other images are also in a 9:11 ratio with smaller sizes. So I'd propose these sizes:
|
Good point, we don't have to lose data just to keep with the sizes of yore. I'd propose |
And those images that aren't in a 9:11 ratio are in a 33:37 ratio (or roughly 3:4), so they would prefer the following sizes:
The full data table is here:
You'll note that the stars indicate images that are already in an idealized aspect ratio (i.e. one without rounding): in particular, |
I would recommend against attempting to use "roundish" numbers, as that can significantly change the aspect ratio, particularly at smaller sizes. For example, |
Damn, thanks for bringing some rigor to it. And you're right that the aspect ratio should stay consistent between all the sizes, and that it's nice to stay close to the original aspect ratio. But I do think there's a value in round numbers here, and these aren't photographs of detailed scenery -- cropping out some flag or solid-color background fluff around the rim loses no meaningful information, and imagemagick makes it trivial to do. I'm more concerned about ensuring people who make apps see this as trivially easy to integrate. So altogether I'm still inclined to go back to the original proposal:
It's quick to understand, it's easy to fit it into one's existing layout -- in other words, it uses a simpler aspect ratio with more round common denominators people can auto-scale them down to if necessary. That's more important to me than preserving original aspect ratio. |
Any competent image resizing application can maintain aspect ratio—there's no need to worry about having more round common denominators, IMO. Given that the majority of the images are in a single aspect ratio, and the rest are in a fairly similar one, I'd recommend the following sizes:
That way, they're all at least divisible by 5, which makes them round enough. And then you can crop the rest based on their width (removing whatever is left over on the top and bottom). |
By which I mean: resize to the highest standard width that is less than the original width, and then crop off the top and bottom to match the standard height ((original height - standard height) / 2). |
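A minimal sketch of that rule, under a few assumptions that are not in the thread: `plan_resize` is a hypothetical helper, a standard width equal to the original width is allowed, "original height" in the crop formula is read as the height after scaling to the standard width, and a size is skipped if the scaled image would be too short to crop.

```python
def plan_resize(orig_w, orig_h, standard_sizes):
    """Pick the widest standard size that fits and compute the crop.

    Returns (std_w, std_h, scaled_h, crop_top), where scaled_h is the
    height after scaling to std_w and crop_top is the number of pixel
    rows to trim from the top (roughly the same from the bottom).
    Returns None if no standard size fits.
    """
    for std_w, std_h in sorted(standard_sizes, reverse=True):
        if std_w > orig_w:
            continue  # would require upscaling
        # Height after scaling to std_w, rounded to the nearest pixel.
        scaled_h = (orig_h * std_w + orig_w // 2) // orig_w
        if scaled_h < std_h:
            continue  # too short to crop down to the standard height
        return std_w, std_h, scaled_h, (scaled_h - std_h) // 2
    return None


# A 500x700 original with example standard sizes 225x275 and 450x550:
print(plan_resize(500, 700, [(225, 275), (450, 550)]))  # (450, 550, 630, 40)
```

The actual pixel work would still be done by ImageMagick or a similar tool; this only computes which size to target and how much to trim.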
OK, so, I did more thinking on this. Since we want relatively "round" numbers, I determined all the possible sizes that we could have, using multiples of 5:
Then I determined which of these sizes we can get out of each original image size:
According to these calculations, these are the largest sizes we can use that will cover a certain percentage of images:
Given that 71.3% of images can be covered by the original size of two thirds of the images, I recommend we offer that size by default. Then, we can offer a bunch of sizes that apply to most of the images. As such, I propose we offer the following sizes:
In general, they line up with what has already been proposed. But now they're backed by data! |
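For anyone who wants to reproduce the "multiples of 5" enumeration, here is a rough sketch. It is not the calculation used for the tables above: `sizes_in_ratio` is a hypothetical helper, and the 9:11 ratio and the 675-pixel upper bound are taken from other comments in this thread.

```python
def sizes_in_ratio(ratio_w, ratio_h, max_width, step=5):
    """List (width, height) pairs that are exact multiples of the
    ratio_w:ratio_h aspect ratio, where both dimensions are multiples
    of `step`, up to max_width."""
    sizes = []
    k = 1
    while ratio_w * k <= max_width:
        w, h = ratio_w * k, ratio_h * k
        if w % step == 0 and h % step == 0:
            sizes.append((w, h))
        k += 1
    return sizes


candidates = sizes_in_ratio(9, 11, 675)
print(candidates[0], candidates[-1])  # (45, 55) (675, 825)
```

For 9:11, every candidate is a multiple of 45x55, which is why sizes such as 225x275 and 450x550 fall out naturally.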
@konklone, I've emailed GPO to ask them to correct Byrne Bradley and David Curzon. |
@hugovk, thank you! Does anyone else think The reason I think round numbers are important is partly for automatic downscaling, so that someone can still fit these images into their I hate taking so much time to talk about this! But since we're generating permalinks, it's hard to take them back. Anyone have an opinion to help settle this? |
Either way, I think we shouldn't promise a size that only ~70% of the photos have. ~90% is more reasonable, so I'd put OK, so someone, anyone, ring in with a preference for:
|
So I'm already rescaling myself from the original, and I've totally lost track of the data here.... Which of the pairs (400x500 vs 450x550; 200x250 vs 225x275) is the closest to the most common aspect ratio? (i.e. minimize data loss during resizing) |
The bigger ones ( |
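To put a number on the data loss, here is a small sketch. It assumes the most common original is 675x825 (a 9:11 ratio), which is implied but not stated outright in this thread, and that images are scaled to fill the target and then center-cropped. `kept_fraction` is a hypothetical helper.

```python
from fractions import Fraction


def kept_fraction(orig_w, orig_h, target_w, target_h):
    """Fraction of the original image area that survives a
    scale-to-fill resize followed by a center crop."""
    orig = Fraction(orig_w, orig_h)
    target = Fraction(target_w, target_h)
    # Only one axis gets trimmed, so the kept share is the smaller
    # aspect ratio divided by the larger one.
    return min(orig, target) / max(orig, target)


for size in [(400, 500), (450, 550), (200, 250), (225, 275)]:
    print(size, kept_fraction(675, 825, *size))
```

Under those assumptions, 450x550 and 225x275 keep the whole image, while 400x500 and 200x250 keep 44/45 of it, trimming about 2% from the sides.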
So I'd say go with that, but take that as only a +0.1 since I probably won't be using the scaled versions in the near future anyway. (Also, resolutions keep getting higher so 100px is probably not going to be very useful for much longer.) |
All right, then barring objection let's go with the bigger sizes. |
@konklone You don't seem to think that ~70% is a high number, and I'm not sure why. It seems to be the standard size going forward, and it's already the same as the "original" size for most of the ones that are currently available. If we restrict our sizes to the least common denominator, we're going to lose a lot of information for no reason, I think. But perhaps I didn't mention that clearly before: I chose |
@konklone @JoshData Also, the larger sizes ( |
Okay.... I think everyone's happy with the 450 and 225 sizes, and since no one has actually come up with a use case or need for 675x825, I don't think there's any reason to keep talking about it @GPHemsley. |
There may be a use case for a small thumbnail and possibly a square avatar size, but we can add smaller sizes as and when a need arises. |
Definitely. And thank you once again for the follow-through, @hugovk, I see the images and docs have all been updated. :) http://theunitedstates.io/images/congress/450x550/L000551.jpg I'm ready to start pointing people here and telling them to use the URLs. |
The congress directory is empty!