Fails building circuits with large number of anon seeds [still issue with 6.6.0-exp1] #1683
Comments
@synctext I guess this should be assigned to whichever student is in charge of improving the tunnel community? |
OK, yes. Another nice student job. So the tunnel community has scaling problems. Good to hear from our users about this. With a nice performance graph, it's even performance analysis & re-factoring. |
@Pathemeous #21 Seems the 1TByte seeding goal is difficult. Seeding anonymously gives already errors at >60 torrents. |
Yes, this should be the first target to overcome (as expected). It seems that this is related to the big GUI refactor? Without a clean API such high-performance goals are bound to result in errors like these. |
It's not related to it, but in principle we were planning to have it fixed for the wx3 release. Depending on how long the tunnel community refactor takes, the improved tunnel community may not be ready until after the new GUI is, so maybe it's not worth worrying about breaking the GUI parts of it for now. |
If the WX3 part of this milestone is ready before the rest, we can still release that so we get in Debian/Ubuntu ASAP and split the rest for a future milestone/release. |
Starting to target this one as it's the last issue assigned to me for 6.6. What do I have to work with? @colin1497 I see now that it's been open a while. What do you remember? Do you have any stack trace or log related to this issue? My first guess is that downloading or seeding 60 anon torrents means creating >= 60 hidden tunnels, which is quite heavy on the CPU side. Having (blocking) Python calls for 60 tunnels probably results in e.g. Diffie-Hellman handshakes or intro points timing out, which may be what is going on here. @whirm @synctext do you think this is a probable cause? Looking at the code, there are many possible timeouts. |
I'm afraid it's been a while. Looking back at #1605 I originally had 91 torrents, so it was well over 60. I haven't seen this in a long time. Besides build changes, I've also tripled my data rates with my ISP since I originally created the report. If your speculation is right, then you would hit it at some point. Maybe the issue is just that it shouldn't start all the tunnels in parallel? Maybe it should queue them up and start a max of 10 in parallel or something? Just thinking out loud. |
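The queueing idea floated in the comment above can be sketched with a semaphore that caps how many builds run at once. This is only an illustration, not Tribler code: it uses the standard-library `asyncio` rather than Twisted, and `build_circuit`, `build_all`, `run_demo`, and `MAX_PARALLEL` are hypothetical names. The "circuit build" is just a short sleep.

```python
import asyncio

MAX_PARALLEL = 10  # assumption: the "max of 10 in parallel" suggested above


async def build_circuit(circuit_id, sem, stats):
    # Hypothetical stand-in for a real circuit build; it only sleeps.
    async with sem:  # waits here while MAX_PARALLEL builds are already running
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)
        stats["active"] -= 1
    return circuit_id


async def build_all(n_circuits):
    sem = asyncio.Semaphore(MAX_PARALLEL)
    stats = {"active": 0, "peak": 0}
    done = await asyncio.gather(
        *(build_circuit(i, sem, stats) for i in range(n_circuits))
    )
    return done, stats["peak"]


def run_demo(n_circuits=91):
    # 91 matches the torrent count mentioned in the comment above.
    return asyncio.run(build_all(n_circuits))
```

Calling `run_demo(91)` queues all 91 builds but never lets more than `MAX_PARALLEL` of them run at the same time; the rest wait on the semaphore until a slot frees up.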
Even if some requests time out it should still keep building circuits until it hits the circuit target. |
@whirm sure, but if, due to a large number of circuits being built, not a single one can actually be constructed, rescheduling them all concurrently will mean that all the newly scheduled circuits will time out as well. Assuming that this is the issue, of course. |
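The argument in the comment above can be made concrete with a toy model. It assumes (this is not measured from Tribler) that a single thread shares its time equally among all circuits being built, so a batch of `size` circuits each needing `work_per_circuit` seconds all finish after `size * work_per_circuit` seconds, and that each circuit's timeout starts when its batch starts. The function name and all numbers are made up for illustration.

```python
def completed_in_time(n_circuits, work_per_circuit, timeout, max_parallel=None):
    """Count circuits that finish before their timeout in a toy equal-sharing model."""
    # Without a limit, every circuit is started in one big batch.
    batch = n_circuits if max_parallel is None else max_parallel
    finished = 0
    remaining = n_circuits
    while remaining > 0:
        size = min(batch, remaining)
        # One thread shared equally: every circuit in the batch finishes
        # only after the whole batch's work has been done.
        finish_time = size * work_per_circuit
        if finish_time <= timeout:
            finished += size
        remaining -= size
    return finished


# Illustrative numbers only: 91 circuits, 0.5 s of work each, 10 s timeout.
all_at_once = completed_in_time(91, 0.5, 10)
ten_at_a_time = completed_in_time(91, 0.5, 10, max_parallel=10)
```

In this model, starting all 91 at once means none finishes within the timeout, and retrying them all concurrently repeats the same outcome, while batches of 10 each complete in time. Real behaviour depends on scheduling and network effects that the model ignores.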
@lfdversluis let's stop guessing and try to reproduce it instead. Once you've got a scenario where this happens. If you don't have a shitty Internet connection, use If you want to limit the amount of cores Tribler can use (this shouldn't make a huge difference) you can use |
FYI, just updated to 6.6, 7cd6ed7 and it fails to build circuits. 103 seeded torrents. Just looking at the Windows resource monitor it doesn't appear to be CPU bound. Resource monitor shows lots of disc activity on the mechanical drive where torrents are stored (60-100MB/s). Network activity rate is relatively low, well under 1Mbps. On exit, I get a log file each time. I have diffed a couple of the log files and they are basically the same: Edit: Edit2: |
@colin1496 |
No problem, just trying to give as much info as possible. After checks were completed I restarted and again no joy with almost an identical log file. |
@colin1497 Thank you, that is very valuable info. It seems that the IO is too heavy and probably completely blocking the twisted thread, most likely resulting in circuits timing out due to handshake failures and whatnot. I am currently in the process of making the IO non-blocking by pushing it out of the twisted thread in Tribler/dispersy#481, but this migration is still underway. |
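The effect described in the comment above, blocking IO starving the event loop, can be shown in a small sketch. To stay self-contained it uses the standard-library `asyncio` and `run_in_executor` instead of Twisted (where `deferToThread` plays a similar role); it is not the change made in Tribler/dispersy#481. `blocking_disk_read`, `run`, and `run_demo` are hypothetical names, and the "disk read" is just `time.sleep`.

```python
import asyncio
import time


def blocking_disk_read(seconds):
    # Hypothetical stand-in for a slow, blocking disk operation.
    time.sleep(seconds)


async def run(offload):
    events = []

    async def ticker():
        # Stands in for time-sensitive work such as handshakes.
        for _ in range(3):
            await asyncio.sleep(0.01)
            events.append("tick")

    async def io():
        if offload:
            # Run the blocking call in a worker thread; the loop stays free.
            loop = asyncio.get_running_loop()
            await loop.run_in_executor(None, blocking_disk_read, 0.5)
        else:
            # Blocking call on the loop thread: nothing else can run meanwhile.
            blocking_disk_read(0.5)
        events.append("io")

    await asyncio.gather(io(), ticker())
    return events


def run_demo(offload):
    return asyncio.run(run(offload))
```

With `offload=False` the ticker cannot run until the blocking call returns, so `"io"` is recorded first. With `offload=True` the ticks are recorded while the read is still in flight, so `"io"` comes last.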
Hmm looking at the log file I see
is wx related, we are moving to QT soon so that should be fixed soon.
means our version manager is broken? @devos50 what do you make of this? |
I am relatively certain that I didn't get the log entries in the session where I deleted tribler.conf and it rechecked every file. I think that it's only happening when it never is able to build the circuits. Edit: No - seems a clean install just starting tribler fdfd8db gives this log: |
@colin1497 you need to install |
@colin1497 if you are running from git, you should install all the dependencies listed on |
Downloading Windows installer builds from Jenkins. I shouldn't have to separately install dependencies in that scenario, should I? |
ah, the unchecked latest Windows builds. Fresh from Jenkins.Tribler.org then? These are not often checked if they function OK. |
@colin1497 The devel branch is almost exclusively used by developers that are adding additional dependencies (e.g. I am adding several at the moment). So often we add dependencies on our machines before we add them to the builders to check everything is working. The builders then ship these with the installers :) As @synctext said, there are not regular checks on devel. Our next branch is far more stable, but we do not have any guarantees on this either. The only guarantee we do strive to deliver is that all dependencies are shipped with our installers (naturally). But if something is not working, do let us know so we can add it to our todo list. |
Apologies guys, I had been pulling next branch builds previously, and had an issue in 6.5.2 and went to jenkins to grab latest build to see if same issue still existed. Geez, I can see that I clearly ended up grabbing devel branch versions. /facepalm |
@colin1497 heh, no worries, at least we know it needs to be fixed now :) @lfdversluis maybe this is due to the MSVC rebuild you did? Maybe you forgot something on the python-openssl dll chain. |
Quick update since there was concern about CPU performance: I tried a few things. I watched CPU usage and it didn't seem that high, not even enough to force the CPU to peg to its max frequency. I set up an idle priority, 100% usage application and pegged it to one core to force the CPU frequency high. I set Tribler to "realtime" priority level. No change in behavior. I can get 20 connected peers, but can't build circuits for my seeds. Looking at network usage, it's really not that high -- never goes over 1Mbps. I have Gb infrastructure and 50Mbps connection to the internet. Obviously that's all macro level. |
@colin1497 You discovered a problem in the tunnel community. The team made a good performance measurement test. Even with light load the tunnel may take 3 minutes to build. btw Gb infrastructure, nice! |
Good to hear I found a legit issue. WRT infrastructure, we completely renovated a house last year and it's relatively ridiculous what all I did.... |
Just wanted to say that this remains a problem in the 7.0.2 release. I hadn't had an issue with it because:
I'm up to the point where at startup basically everything just spins its wheels saying it's building circuits but none ever get going. |
I'm assuming this to be fixed, but I'll add it to the 7.2 milestone for verification. |
I'm pretty sure this issue has been fixed. Closing the issue. Please let me know if there are any other problems related to circuit building. |
Seeding >60 torrents anon, tribler sometimes fails to build circuits and seed. Lots of info in #1605, see this comment specifically:
#1605 (comment)
Splitting this issue out for tracking purposes, may be related to #1682