Suggestion - Reduce inserts #15

Closed
michelinok opened this issue Dec 2, 2023 · 29 comments

@michelinok

Hi!
This is not an issue but a suggestion.
Could you modify the code to use fewer inserts?
I don't know where the code is slow, but it seems to me that it takes ages because you do an insert for each QSO.
It would be a great improvement if you could modify it to issue fewer "insert into" statements, each carrying many QSOs; something like 100 QSOs per insert would help a lot.

It's JUST an IDEA.

Two years ago uploading my log required 30 seconds...now (doubled my QSOs!!) it requires 15 minutes.
I don't know how much the code has changed in these 2 years.
PS: I changed my PC...it's much more powerful than 2 years ago....it's not a PC problem
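
For readers following along, the batching idea suggested above could be sketched roughly as follows. This is a minimal illustration only: `build_batch_insert`, the `qsos` table, and its column names are hypothetical and are not taken from SmoothQSL's schema or code.

```php
<?php
// Hypothetical helper: builds one multi-row INSERT statement with
// positional placeholders for $rowCount rows. Not SmoothQSL's code.
function build_batch_insert(string $table, array $columns, int $rowCount): string
{
    // One "(?, ?, ...)" group per row.
    $rowPlaceholders = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $allRows = implode(', ', array_fill(0, $rowCount, $rowPlaceholders));
    return sprintf('INSERT INTO %s (%s) VALUES %s', $table, implode(', ', $columns), $allRows);
}

// Illustrative data; the table and column names below are assumptions.
$qsos = [
    ['NR4M', '10m', 'CW'],
    ['WG3J', '10m', 'CW'],
    ['V26K', '10m', 'CW'],
];

// Split the log into chunks of up to 100 rows, one statement per chunk.
foreach (array_chunk($qsos, 100) as $chunk) {
    $sql = build_batch_insert('qsos', ['callsign', 'band', 'mode'], count($chunk));
    $params = array_merge(...$chunk); // flatten the rows into one parameter list
    // With an open PDO connection one would then run:
    // $pdo->prepare($sql)->execute($params);
}
```

Whether this helps depends on where the time is actually spent, which is what the rest of the thread works out.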

@jxmx
Owner

jxmx commented Dec 2, 2023

That seems very strange and unexpected. I don't believe there's been any material change to the SQL parts and I don't experience that. Can you send me an ADIF file for testing?

@michelinok
Author

Here's my log

mylog.adi.txt

@jxmx
Owner

jxmx commented Dec 24, 2023

@michelinok Loading mylog_adi.txt on my test system took about 2 seconds. There were 7810 contacts in that log and SmoothQSL churned through them like nothing. This is running on a 2-CPU development VM with 2 GB of RAM. No MariaDB tuning has been done; its configuration is "out of the box" Debian 12.

Can you tell me more about your setup?

@michelinok
Author

michelinok commented Dec 24, 2023 via email

@jxmx
Owner

jxmx commented Dec 24, 2023

To confirm, you are running Apache 2.4 on Windows? What version of PHP? How is Apache connecting to PHP - FCGI proxy or mod_php or something else?

Is there a long delay between the first and second load screen, second to third, or both?

@jxmx
Owner

jxmx commented Dec 24, 2023

Also, something is wrong if Apache is “idle” at 25% CPU.

@michelinok
Author

michelinok commented Dec 24, 2023 via email

@michelinok
Author

michelinok commented Dec 24, 2023 via email

@michelinok
Author

michelinok commented Dec 24, 2023

I've tried disabling the entire antivirus, disabling xdebug, disabling cgi_module, and using 127.0.0.1 instead of localhost.
I'm out of ideas.
It seems that QSOs are displayed on the screen in "batches" of 15/20.
The final "commit" is done in a fraction of a second.
I'll investigate.

Apache/2.4.46 (Win64) PHP/7.3.21 mod_fcgid/2.3.10-dev - Port defined for Apache: 80
PHP Version: [Apache module] 7.3.21
[FCGI] 5.6.40 - 7.3.21 - 7.4.9 - 8.1.26 - 8.3.0
5.7.31 - Port defined for MySQL: 3306 - default DBMS

@jxmx
Owner

jxmx commented Dec 24, 2023

I am not sure exactly what the "External hosting" setup is, but if it's on a shared hosting provider, I would expect loading those 7800 QSOs to take tens of seconds, maybe 30 or so. Shared hosting providers are always vastly oversubscribed and inserts are the most expensive database transaction. That's why it's all wrapped in one transaction at the end.

Looking at the above, are you by chance running the pages through PHP twice? From that server signature it sounds like you have mod_php AND mod_fcgid running simultaneously. You definitely should be using one or the other, but not both, although preferably proxy_fcgi (not mod_fcgid) is the most performant way to run PHP.

Also, PHP 7.3 is very, very old. While I don't think it's a root cause, PHP 8.0+ does have significant performance improvements. Although it's likely been over 10 years since I've seen Apache or PHP on a Windows host.

@jxmx
Owner

jxmx commented Dec 24, 2023

@michelinok - If you pull down the latest code out of git, there is now a program load/qsladifloader-cli.php. This won't fully load QSOs, but it will tell you where the bottleneck on your system is. Run it from cmd or PowerShell like so:

php qsladifloader-cli.php -c IU5HES -l Italy -f c:\path\to\mylog.adi.txt

You will get a timestamp output for how long processing each record took and then one timestamp for how long the transaction stage to the database took. For example:

2023-11-26 15:53        NR4M    28.0078 10m     CW      599     IU5HES
Record processed in 0.000035 s
2023-11-26 15:54        WG3J    28.0097 10m     CW      599     IU5HES
Record processed in 0.000035 s
2023-11-26 15:56        V26K    28.0615 10m     CW      599     IU5HES
Record processed in 0.000036 s
2023-11-26 15:57        N6SS    28.0939 10m     CW      599     IU5HES
Record processed in 0.000035 s
Database insert prepped in 0.041264 s

@jxmx
Owner

jxmx commented Dec 24, 2023

Let me know what you get.

@michelinok
Author

Just a part of the log...

2017-08-09 10:08 DG5NET 14.0749 20m FT8 -07 IU5HES
Record processed in 0.000212 s
2017-08-09 10:24 PD7RF 14.0749 20m FT8 -02 IU5HES
Record processed in 0.000209 s
2017-08-09 10:35 S56ECR 14.0749 20m FT8 -01 IU5HES
Record processed in 0.000212 s
2017-08-09 10:49 5B4AIF 14.0752 20m FT8 -15 IU5HES
Record processed in 0.000241 s
2017-08-09 10:53 DH1BBH 14.0752 20m FT8 -10 IU5HES
Record processed in 0.000208 s

I'll run the entire log asap

@michelinok
Author

It seems my system takes 10 times as long as yours to process each QSO....

@jxmx
Owner

jxmx commented Dec 24, 2023

Even 10x is meaningless at that scale. That's taking 0.2 ms per line in your ADIF. If I wasn't using microseconds, my system and your system would both show 0 ms to complete at that line. That means your bottleneck is in your webserver setup somehow.
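
As a rough illustration of the microsecond-resolution timing being discussed, a per-record timer could look like the sketch below. `time_call` is a hypothetical helper written for this note; it is not the code in qsladifloader-cli.php, and the closure body is a stand-in for real record processing.

```php
<?php
// Hypothetical timing helper: returns the wall-clock seconds spent in $fn.
function time_call(callable $fn): float
{
    $start = microtime(true); // float seconds with microsecond resolution
    $fn();
    return microtime(true) - $start;
}

// Stand-in workload for "processing one record" (an assumption for this sketch).
$elapsed = time_call(function () {
    strtolower('<CALL:4>NR4M<EOR>');
});

printf("Record processed in %.6f s\n", $elapsed);
```

At roughly 0.0002 s per record, the 7810 records in the log would account for under two seconds of measured per-record time, which is why the figures above do not explain a 15-minute load.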

@jxmx jxmx added the question label Dec 24, 2023
@michelinok
Author

I've tried from the CLI with different PHP versions (thread safe and non thread safe). Same results...I'll try a live Linux version asap (if you have any suggestions...). It's the last hope...I hope it's not a hardware bottleneck :)
Maybe my problem will help someone else, so please...don't close the ticket now.
Many thanks, I'll let you know

@michelinok
Author

Can you share your php.ini? Maybe on pastebin
Many thanks

@michelinok
Author

Oh my god.....I've profiled your source code (with xdebug) and found what was causing the problem for me....
For me it's the stripos inside adif_parser (inside the get_record function) !!!!
I've replaced stripos with strpos and... it's a rocket!!!!
Since we already replace EOR with eor, it's not a problem at all to search for "eor".
We can also replace
Gotta modify my code now (I've heavily adapted your source for my project...I now include DXCC and flags).
If you want, you can push the mod...I'm still not able to do a "pull request"

Many thanks for your help!!!!!!!!!!!!!!!!!!
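
A minimal sketch of the change described above, assuming the parser lowercases the end-of-record tag once up front. The function names are illustrative and are not the actual adif_parser code.

```php
<?php
// Hypothetical sketch; these functions are not taken from adif_parser.

// Case-insensitive search: stripos() has to deal with letter case on every call.
function next_eor_insensitive(string $data, int $offset)
{
    return stripos($data, '<eor>', $offset);
}

// Case-sensitive search: only valid if the tag case was normalized beforehand.
function next_eor_sensitive(string $data, int $offset)
{
    return strpos($data, '<eor>', $offset);
}

$raw = "<CALL:4>NR4M<EOR>\n<CALL:4>WG3J<eor>\n";

// Normalize the tag once (assumed to mirror the "replace EOR with eor" step
// mentioned in the comment above).
$normalized = str_ireplace('<eor>', '<eor>', $raw);

$a = next_eor_insensitive($raw, 0);      // finds "<EOR>" regardless of case
$b = next_eor_sensitive($normalized, 0); // same position, cheaper search
```

Both calls locate the same record boundary; the second form does the case handling once for the whole file instead of once per search.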

@jxmx
Owner

jxmx commented Dec 25, 2023

So you are saying that replacing stripos with strpos reduces your web processing time from 15 minutes to seconds? That just doesn't make any sense. Stripos is an order of magnitude slower because of what it has to do, but it shouldn't be that bad. Most likely, the right thing to do is replace stripos/strpos with preg_match.

Also, when you say you "can't push", what do you mean? I don't see an open fork of SmoothQSL for you on GitHub. If you fork it and upload your code, you should be able to propose an upstream pull with no issue.

@michelinok
Author

michelinok commented Dec 25, 2023

Sorry for my bad English....
I've tried your "benchmark" against my ADIF.
Before the mod I got 15 minutes; with strpos I get a few seconds. I've read that strpos is much faster because it doesn't handle lowercase/uppercase.
Preg_match is the slowest.

When I say "can't push" I mean that I'm working on your code and haven't forked it....I need to study how "git" works.
I'm still learning a lot of things, working on a little project (restarting an older project from scratch).

I swear I can handle my log in a couple of seconds instead of minutes.
It would be a good idea if you could try the mod with your benchmark on your working machine.
I have no words to thank you for your patience.

AGAIN...I'M A BEGINNER STILL LEARNING, so maybe I'm doing something wrong.

@jxmx
Owner

jxmx commented Dec 25, 2023

If you want to attach a zipfile or tarball here of your code, I can take a look at it. Getting familiar with basic Git and Github would help in the long run though.

@jxmx
Owner

jxmx commented Dec 25, 2023

Also, what is the locale language on your system? I'm assuming Italian? I wonder if the performance problems are due to localization issues with Italian on Windows. I'll look at the ADIF code to see if there are some efficiencies to be had.

@michelinok
Author

michelinok commented Dec 25, 2023 via email

@jxmx
Owner

jxmx commented Dec 25, 2023

@michelinok - Try the updated adif_parser I just committed from source. I removed all of the stripos() reliance and unnecessary string manipulations. It sped up the per-record processing by 200% and cut the database prep time in half. Reading about stripos() in various PHP places, it sounds like it's a known poor performer in certain cases.

2023-11-26 15:53        NR4M    28.0078 10m     CW      599     IU5HES
Record processed in 0.000007 s
2023-11-26 15:54        WG3J    28.0097 10m     CW      599     IU5HES
Record processed in 0.000006 s
2023-11-26 15:56        V26K    28.0615 10m     CW      599     IU5HES
Record processed in 0.000007 s
2023-11-26 15:57        N6SS    28.0939 10m     CW      599     IU5HES
Record processed in 0.000007 s
Database insert prepped in 0.021563 s


@michelinok
Author

Hi,
Great improvements, faster than my previous stripos/strpos mod. The database prep time is also much faster now!
Many thanks!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

@jxmx
Owner

jxmx commented Dec 25, 2023

Fantastic! Upload your code to a different ticket and I'll check out your changes for inclusion in SmoothQSL.

@jxmx jxmx closed this as completed Dec 25, 2023
@jxmx
Owner

jxmx commented Dec 25, 2023

@michelinok
Author

> Fantastic! Upload your code to a different ticket and I'll check out your changes for inclusion in SmoothQSL.

YOU did the final job, I only investigated where my problem was. Maybe your computer is so fast that you didn't notice the "problem". The major improvement was changing stripos to strpos.
I've also learned how to profile PHP code, that's fantastic!

You're a really good programmer with a lot of patience and I'll credit you as soon as the project is finished (I'm working on a permanent HF award ...need to learn Bootstrap too...)

@jxmx
Owner

jxmx commented Dec 25, 2023

You're welcome. I'm just happy people use my code.
