Profiling results #17

Open
takluyver opened this issue Apr 27, 2017 · 5 comments

@takluyver
Contributor

This isn't a bug report, just some information for anyone looking to accelerate pyKriging. I profiled a run of one of the examples (2d_leave_n_out.py) to see which parts of the code take the most time. This was run after my changes in #16, which reduced the time spent in the pyKriging.samplingplan code.

As you can see below, most of the time is spent in the fittingObjective method and the calls it makes to neglikelihood and updateModel. So if anyone's working on performance, these are the places to focus on. I couldn't see any obvious changes like those I made in #16, but I don't understand the maths the code is implementing.

[Screenshot from 2017-04-27: snakeviz profile of the 2d_leave_n_out.py run]

(The visualisation is by snakeviz)
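
For anyone wanting to reproduce this, a minimal sketch (assuming it is run from pyKriging's examples directory, next to 2d_leave_n_out.py):

import cProfile
import pstats

# Profile the example script and save the stats to a file.
cProfile.run("exec(open('2d_leave_n_out.py').read())", '2d_leave_n_out.prof')

# Print the 15 most expensive calls by cumulative time...
pstats.Stats('2d_leave_n_out.prof').sort_stats('cumulative').print_stats(15)

# ...or open the same file interactively with: snakeviz 2d_leave_n_out.prof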

@capaulson
Owner

@takluyver Thanks for the results. I've also looked at those sections of code for ways to accelerate them. I haven't spent too much time on performance simply because I typically train a model once, then query it often. Do you have a use case where model training time would limit the usefulness of pyKriging?

@takluyver
Contributor Author

No, not particularly. I just spotted an easy way to improve performance in the samplingplan module, and so I profiled the rest of an example to see if there was more low-hanging fruit. I couldn't see any, but I thought I'd leave the results here as a starting point for anyone who wants to take it further.

@TsingQAQ
Contributor

TsingQAQ commented May 18, 2017

====== Update Line =======
Hi @capaulson and @takluyver
I've verified that the following line in the updatePsi function is a major contributor (more than 70% of the runtime!) to the overall run time:
newPsi = np.exp(-np.sum(self.theta * np.power(self.distance, self.pl), axis=2))
Since updateModel is called frequently, this line gets executed thousands or even millions of times and accounts for most of the runtime.
So if there is some way to make this line run faster, the whole program will be sped up dramatically.
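
As a starting point for experiments, here is a small self-contained benchmark of that expression on synthetic arrays; the shapes (distance (n, n, k), theta and pl (k,)) are assumptions based on how the line reads, and the dot-product variant is only a guess that would need to be measured against the original:

import timeit
import numpy as np

n, k = 200, 2
rng = np.random.default_rng(0)
distance = np.abs(rng.standard_normal((n, n, k)))  # stand-in for self.distance
theta = rng.uniform(0.1, 10.0, k)                  # stand-in for self.theta
pl = rng.uniform(1.0, 2.0, k)                      # stand-in for self.pl

def original():
    return np.exp(-np.sum(theta * np.power(distance, pl), axis=2))

def dot_variant():
    # multiplying by theta and summing over the last axis is a matrix-vector product
    return np.exp(-np.power(distance, pl) @ theta)

assert np.allclose(original(), dot_variant())
for fn in (original, dot_variant):
    print(fn.__name__, timeit.timeit(fn, number=50))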

========================
Hi there, I've run into the performance problem in #20, and just ran some simple tests to see where the time goes in the code. I believe that what really takes time is the PSO optimization in the train function.
This optimization takes about 1 minute on average in my case.

I have some experience with PSO, though I haven't looked at its details in Inspyred. If it is a classic PSO,
the swarm size could be smaller (40 in my experience), and 1000 max_evaluations can give relatively good results (30000 will take much more time if no termination condition is met).

Take the Rosenbrock benchmark function with 10 dimensions as an example (based on my experience).
This function's global optimum is:
f(1, 1, 1, 1, 1, 1, 1, 1, 1, 1) = 0
PSO with a swarm of 40 particles and 1000 max iterations gives roughly these results:

classic PSO reaches around 0.5-5 (may be trapped in local minima)
PSO with adaptive inertia weight reaches around 0.5
OPSO reaches around 0.0005 - 0.05
None of these PSO variants takes more than 15 seconds on this optimization.

So a relatively smaller swarm size and max_evaluations may save time significantly without losing too much accuracy. If an absolute global minimum is truly needed, I'd suggest the OPSO method.
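
To make the swarm-size and evaluation-budget suggestion concrete, a standalone Inspyred run on the same benchmark (40 particles, 1000 evaluations, Rosenbrock in 10 dimensions) could look like the sketch below; it follows the standard Inspyred PSO recipe and does not touch pyKriging's train code:

from random import Random
import inspyred

prng = Random(0)
problem = inspyred.benchmarks.Rosenbrock(dimensions=10)

ea = inspyred.swarm.PSO(prng)
ea.terminator = inspyred.ec.terminators.evaluation_termination
ea.topology = inspyred.swarm.topologies.ring_topology

final_pop = ea.evolve(generator=problem.generator,
                      evaluator=problem.evaluator,
                      pop_size=40,               # swarm size
                      bounder=problem.bounder,
                      maximize=problem.maximize,
                      max_evaluations=1000)      # evaluation budget

best = max(final_pop)
print('best Rosenbrock value found:', best.fitness)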

@capaulson
Owner

@TsingQAQ thanks for the recommendation. I'll take a look. The line you've identified is indeed the bulk of the computation happening here. This line is where new hyperparameters are tested during optimization. I've looked for ways to expedite that calculation, but haven't been too successful yet (at least not in a way that remains general enough for distribution). There may be ways to speed this up with GPU computing etc.

One thing that would be useful to have is a benchmarking test suite. It would be great if you could compile some of these test functions you're using and create a pull request. I can work from there to start building a way to actually record performance values based on optimizers, swarm sizes, etc.
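
Something along these lines could serve as a starting point; the kriging and samplingplan calls follow the pyKriging README example, while the helper name, test function, and sample counts here are just placeholders:

import time
import numpy as np
import pyKriging
from pyKriging.krige import kriging
from pyKriging.samplingplan import samplingplan

def time_training(testfun, dims=2, samples=20, repeats=3):
    # Build one sampling plan, then time k.train() several times on it.
    sp = samplingplan(dims)
    X = sp.optimallhc(samples)
    y = testfun(X)
    times = []
    for _ in range(repeats):
        k = kriging(X, y)
        start = time.time()
        k.train()
        times.append(time.time() - start)
    return min(times), float(np.mean(times))

best, mean = time_training(pyKriging.testfunctions().branin)
print('train(): best %.2fs, mean %.2fs over 3 runs' % (best, mean))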

@TsingQAQ
Contributor

TsingQAQ commented May 31, 2017

Hi @capaulson, I've updated the test functions in #23, which include the benchmark function tested in #20.

Also, I've tried to make it faster in a couple of ways (numba, numexpr) but failed to make any progress; I'm not familiar with these tools, though, so take this as limited experience for your reference.
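
For reference, a numba attempt on the hot line could look roughly like the sketch below; the array shapes are assumptions taken from the line quoted earlier in this thread, and whether it actually beats the NumPy one-liner would have to be measured:

import numpy as np
from numba import njit

@njit(cache=True)
def update_psi(distance, theta, pl):
    # distance: (n, n, k), theta and pl: (k,); returns the (n, n) Psi update
    n, k = distance.shape[0], distance.shape[2]
    psi = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for m in range(k):
                s += theta[m] * distance[i, j, m] ** pl[m]
            psi[i, j] = np.exp(-s)
    return psi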
