Performance issues

For usage statistics, see Site usage.

These are historical notes; some have been addressed, some not yet.

The software is CPU-bound. The current WDTK server has 4 cores; the load figures discussed here are for the first two. Some cron jobs run with processor affinity set, which is why CPU0 has a higher load than CPU1 (the other cores are not shown).

Why?

Why is it CPU-bound? There may be some performance snafus here, since some of the processes chewing up CPU cycles are performing tasks one would not expect to be computationally intensive (e.g. sending out reminder emails). There are some details below, but more work is needed to understand the causes.

Possible solutions

Use AWS or similar for high-intensity operations, e.g. https://github.com/documentcloud/cloud-crowd/wiki/Getting-Started

Specifically, use the DocumentCloud service for document conversion and hosting.

Store fewer bogus post redirects, i.e. those not created by real people.

Receiving email can be a resource drain, because an app instance is started for each message. Use a daemon instead.
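One way to picture the daemon approach is a single long-running process that loads the app once and polls a spool directory for new messages. The sketch below is only an illustration under assumptions: the spool directory, the `.eml` suffix, and the names `drain_spool` and `run_mail_daemon` are hypothetical and not part of the existing code.

```ruby
# Hypothetical helper: handle every message file currently in `spool_dir`,
# in filename order, yielding the raw message and deleting the file afterwards.
# Returns the number of messages handled.
def drain_spool(spool_dir)
  handled = 0
  Dir.glob(File.join(spool_dir, "*.eml")).sort.each do |path|
    yield File.binread(path)
    File.delete(path)
    handled += 1
  end
  handled
end

# Hypothetical long-running loop: the process starts once and then keeps
# polling, so the start-up cost is not paid again for each incoming email.
def run_mail_daemon(spool_dir, poll_seconds: 5)
  loop do
    drain_spool(spool_dir) { |raw| yield raw }
    sleep poll_seconds
  end
end
```

The mail server would then only need to drop each message into the spool directory, which is cheap compared with booting the application.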

Cache /feed/list/successful

Cache /body/list/a

Cache parts of /body/xxxxx

Cache parts of /user/xxxxx
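As a rough illustration of time-based caching for pages like these, here is a minimal in-memory cache with a time-to-live. `TtlCache` is a hypothetical class written for this note, and the five-minute lifetime is an arbitrary choice; in a Rails app, `Rails.cache.fetch` with `expires_in` would normally play this role.

```ruby
# Minimal in-memory cache with a time-to-live; illustrative only.
class TtlCache
  def initialize(ttl_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Return the cached value for `key` if it is still fresh;
  # otherwise run the block, store its result, and return it.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry[:value] if entry && (now - entry[:stored_at]) < @ttl

    value = yield
    @entries[key] = { value: value, stored_at: now }
    value
  end
end

cache = TtlCache.new(300) # five minutes, an arbitrary choice
renders = 0
2.times { cache.fetch("/body/list/a") { renders += 1; "rendered page" } }
puts renders # prints 1: the expensive block ran only once
```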

Finish the migration to Ruby 1.9; for uncached requests it seems to be twice as fast.

Regular expression library: change to a faster one. Oniguruma isn't enough. This shows the slowness:

    e = InfoRequestEvent.find(213700)
    text = e.incoming_message.get_main_body_text  # XXX alter to call internal, not cache
    IncomingMessage.remove_quoted_sections(text, "")
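To put numbers on this kind of slowness, the standard `Benchmark` module can time a regex-heavy step. The sketch below does not call the real `remove_quoted_sections`; `strip_quoted_lines` is a hypothetical stand-in and the sample text is synthetic, so it only shows the measuring technique.

```ruby
require "benchmark"

# Hypothetical stand-in for a quote-removal step:
# drop every line that starts with ">".
def strip_quoted_lines(text)
  text.gsub(/^>.*\n?/, "")
end

# Synthetic input: alternating reply and quoted lines.
sample = "Reply text\n> quoted line\n" * 10_000

elapsed = Benchmark.realtime { strip_quoted_lines(sample) }
puts format("processed %d bytes in %.3f s", sample.bytesize, elapsed)
```

Running the same measurement under different Ruby versions or regex engines would give a like-for-like comparison on the same input.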

wvWare sometimes loops: https://github.com/mysociety/alaveteli/issues/299

pdftk sometimes loops: http://www.whatdotheyknow.com/request/87534/response/234022/attach/7/HC15.pdf
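One possible guard against external converters that loop is to run them with a hard time limit and kill them when it is exceeded. This is a sketch under assumptions, not how the code currently invokes these tools: `run_with_timeout` is a hypothetical helper, the 30-second default is arbitrary, and keyword arguments need a newer Ruby than 1.9.

```ruby
require "timeout"

# Hypothetical helper: run an external command, discarding its output.
# Returns true or false for the exit status, or :timeout if the command
# had to be killed after `limit` seconds.
def run_with_timeout(*cmd, limit: 30)
  pid = Process.spawn(*cmd, out: File::NULL, err: File::NULL)
  begin
    Timeout.timeout(limit) do
      _, status = Process.wait2(pid)
      status.success?
    end
  rescue Timeout::Error
    begin
      Process.kill("KILL", pid)
      Process.wait(pid) # reap the killed child so it does not linger as a zombie
    rescue Errno::ESRCH, Errno::ECHILD
      # the child had already exited and been reaped
    end
    :timeout
  end
end
```

A caller could treat `:timeout` the same way as a failed conversion and fall back to showing the attachment without a converted view.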

This is slow: http://www.whatdotheyknow.com/request/renumeration_committee

Varnish config: see http://www.varnish-cache.org/wiki/VCLExampleCachingLoggedInUsers

Some requests whose memory use still needs lowering:

PID: 676 CONSUME MEMORY: 16968 KB Now: 102604 KB http://www.whatdotheyknow.com/request/parking_ticket_data_81

PID: 2036 CONSUME MEMORY: 129368 KB Now: 179652 KB http://www.whatdotheyknow.com/request/14186/response/33740
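Log lines in this shape can be parsed and ranked to find the worst offenders. The following is a minimal sketch: the regular expression is inferred from the two lines quoted here, and the reading of "CONSUME MEMORY" as the growth during the request and "Now" as the resulting process size is an assumption. `parse_memory_line` and `heaviest_requests` are hypothetical names.

```ruby
# Pattern inferred from the sample log lines; adjust if the real format differs.
MEMORY_LINE = /PID: (\d+) CONSUME MEMORY: (\d+) KB Now: (\d+) KB (\S+)/

# Turn one log line into a hash, or return nil if it does not match.
def parse_memory_line(line)
  match = MEMORY_LINE.match(line)
  return nil unless match

  {
    pid: match[1].to_i,
    consumed_kb: match[2].to_i,
    now_kb: match[3].to_i,
    url: match[4]
  }
end

# Return the `limit` entries with the largest "CONSUME MEMORY" figure, largest first.
def heaviest_requests(lines, limit = 10)
  lines.map { |line| parse_memory_line(line) }
       .compact
       .sort_by { |entry| -entry[:consumed_kb] }
       .first(limit)
end
```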

Search engines shouldn't be fetching those URLs. And do they really need to unpack so much? They could use the snippet cache.

Things to stop bots crawling somehow:

/request/13683/response?internal_review=1

/request/febrile_neutropenia_154?unfold=1
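One way to identify such URLs in code is to check for the query parameters involved, so that the response can carry a noindex hint (for example a robots meta tag or an `X-Robots-Tag` header). This is a sketch only: the parameter list is taken from the two example URLs, and `discourage_crawling?` is a hypothetical helper name.

```ruby
require "uri"

# Query parameters seen in the example URLs; an assumed, incomplete list.
NO_CRAWL_PARAMS = %w[internal_review unfold].freeze

# True if the URL carries any query parameter that bots should not follow.
def discourage_crawling?(url)
  query = URI.parse(url).query
  return false if query.nil?

  URI.decode_www_form(query).any? { |name, _value| NO_CRAWL_PARAMS.include?(name) }
end
```

Matching on parameter names rather than whole paths keeps the rule independent of the request's URL title.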

Renaming of a body, or changing its domain, should clear the cached bubbles of all requests to that body.
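An alternative to explicitly clearing those cache entries is key-based expiry: build the cache key from the body's name and domain, so that a rename or domain change produces a new key and the old entries simply stop being read. This is a different technique from the clearing described above, sketched under assumptions: `Body` is a hypothetical stand-in for the real model, and the key layout is invented for illustration.

```ruby
require "digest/sha1"

# Hypothetical stand-in for the public body model.
Body = Struct.new(:id, :name, :domain)

# Build a cache key that changes whenever the body's name or domain changes.
def bubble_cache_key(body, request_id)
  fingerprint = Digest::SHA1.hexdigest("#{body.name}\n#{body.domain}")[0, 12]
  "bubbles/body-#{body.id}/#{fingerprint}/request-#{request_id}"
end
```

The trade-off is that stale entries are never deleted by this scheme; they have to age out of the cache on their own.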
