
Performance issues on receiving directory listings #7495

Closed
RubenKelevra opened this issue Jun 19, 2020 · 13 comments
Labels
kind/bug: A bug in existing code (including security flaws)
need/author-input: Needs input from the original author
need/triage: Needs initial labeling and prioritization

Comments

@RubenKelevra
Contributor

Version information:

Hosting node:

go-ipfs version: 0.6.0-dev
Repo version: 10
System version: amd64/linux
Golang version: go1.14.4

running master @ 10623a7

Receiving node:

go-ipfs version: 0.6.0-dev
Repo version: 10
System version: amd64/linux
Golang version: go1.14.4

running master @ 10623a7

Description:

I run a node that hosts data in its MFS, and I publish the root folder (a subfolder of / in the MFS) via IPNS.

The IPNS resolve is instant, but receiving the data is quite slow.

We talked before in a different ticket about the slow performance of ls with "hundreds of files", but this folder only contains a few subfolders. It really shouldn't take 20 seconds to receive it.

On the receiving side:

[ruben@i3 ~]$ ipfs resolve /ipns/pkg.pacman.store
/ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4
[ruben@i3 ~]$ time ipfs dht findprovs /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4
QmWsTDap1zmaVLRaUFKBmo25ST6MGtjZQBAT2u72wz4Qma
QmRMh53FcBzcg93PfhNyqg8f1FxAqVtkE9SL1QJtZFUgjQ
^C
Error: canceled

real	0m1,256s
user	0m0,033s
sys	0m0,054s
[ruben@i3 ~]$ time ipfs ls /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4
bafybeidvonwuguxd6mt2jpksqs2muaizn5iwewmv45xhu23nvehakt3o7q - arch/

real	0m0,690s
user	0m0,033s
sys	0m0,037s
[ruben@i3 ~]$ time ipfs ls /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4/arch
bafybeiba6f7xrskiay3nrz6sc7vr6itqxwat55bupbt63lpr3rgtdxvd7e - x86_64/

real	0m1,182s
user	0m0,019s
sys	0m0,027s
[ruben@i3 ~]$ time ipfs ls /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4/arch/x86_64/
bafybeig3ngbwxisiznj4aubzyubjerrcadek7bcqkpjd37hxz3zu7rhspe - default/

real	0m1,010s
user	0m0,035s
sys	0m0,041s
[ruben@i3 ~]$ time ipfs ls /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4/arch/x86_64/default/
bafybeicvkuxveafm2j345d2v3zykjopnog5hnmdmkix7o3h2rwoljtkda4 - cache/
bafybeifxpct32kh5n3yaf2radx6iotlp345nvv43yekqajsdiac3vumd7q - community/
bafybeifgbajsdhruqbjvzdz2oq3f3w4vfdoko24imvaoo7lik227xlgjpy - community-staging/
bafybeibabhm7wsnthpzlfadmdzlkjyblaxkyktuu42lb5baq2n3r75gkwm - community-testing/
bafybeibffbb3r5j2yt3cob6xcuvcoak3we5bnwnsb3fe3knlrbzsjtlwg4 - core/
bafybeigpvalprc3uqiyts3c3hvxe2yhrdej4bhutexx4xgv4me5b2p5x7u - db/
bafybeicod6j5t5h2tt5ivb6xpgx75trngmmvfdcigmy7jau2tlek35xqqi - extra/
bafybeigireazzbmm6pxlfcvud3mv4vf6wyhvtzdncqjqa27tp664olsfwm - gnome-unstable/
bafybeif3o6g4a5judp3g454mtwmt3wzrx3vmuumd3cvdlffnmtlqwa3m3e - kde-unstable/
bafybeiedjipmwah5k27243oyirmpkpotyht5ntuvzp3b4uhheogafkip2u - multilib/
bafybeies6drqyzefbsh54x35cyb2u2jdyhn4t3kpf62bobr5rzmuvu2uxe - multilib-staging/
bafybeig7gjincnxqvzwqmshvslmujqeps47proqn6ad534vcw7hq72typa - multilib-testing/
bafybeibi6d3mx4jfobm3qkno2jeooxowdm6kknzbrpe5n74wy7ajfb7qxm - staging/
bafybeiblxfa3ayxinjgcu26apvgter5n56vtaze7ubkekruhtrqmynvpia - testing/

real	0m20,064s
user	0m0,037s
sys	0m0,032s

On the hosting side:

load average: 0,49, 0,76, 0,85
$ time ipfs ls /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4/arch/x86_64/default/
bafybeicvkuxveafm2j345d2v3zykjopnog5hnmdmkix7o3h2rwoljtkda4 - cache/
bafybeifxpct32kh5n3yaf2radx6iotlp345nvv43yekqajsdiac3vumd7q - community/
bafybeifgbajsdhruqbjvzdz2oq3f3w4vfdoko24imvaoo7lik227xlgjpy - community-staging/
bafybeibabhm7wsnthpzlfadmdzlkjyblaxkyktuu42lb5baq2n3r75gkwm - community-testing/
bafybeibffbb3r5j2yt3cob6xcuvcoak3we5bnwnsb3fe3knlrbzsjtlwg4 - core/
bafybeigpvalprc3uqiyts3c3hvxe2yhrdej4bhutexx4xgv4me5b2p5x7u - db/
bafybeicod6j5t5h2tt5ivb6xpgx75trngmmvfdcigmy7jau2tlek35xqqi - extra/
bafybeigireazzbmm6pxlfcvud3mv4vf6wyhvtzdncqjqa27tp664olsfwm - gnome-unstable/
bafybeif3o6g4a5judp3g454mtwmt3wzrx3vmuumd3cvdlffnmtlqwa3m3e - kde-unstable/
bafybeiedjipmwah5k27243oyirmpkpotyht5ntuvzp3b4uhheogafkip2u - multilib/
bafybeies6drqyzefbsh54x35cyb2u2jdyhn4t3kpf62bobr5rzmuvu2uxe - multilib-staging/
bafybeig7gjincnxqvzwqmshvslmujqeps47proqn6ad534vcw7hq72typa - multilib-testing/
bafybeibi6d3mx4jfobm3qkno2jeooxowdm6kknzbrpe5n74wy7ajfb7qxm - staging/
bafybeiblxfa3ayxinjgcu26apvgter5n56vtaze7ubkekruhtrqmynvpia - testing/

real    0m0,116s
user    0m0,027s
sys     0m0,047s

Since I had already tried to receive it, the blocks could have been cached locally. To rule that out, I garbage-collected the repo on the receiving side and ran the listing again:

[ruben@i3 ~]$ time ipfs repo gc
[...]
[ruben@i3 ~]$ time ipfs ls /ipfs/bafybeih5bguq6vztt6ohud64edfmxtncehr6rusegp6a6j32q3klzq55p4/arch/x86_64/default/
bafybeicvkuxveafm2j345d2v3zykjopnog5hnmdmkix7o3h2rwoljtkda4 - cache/
bafybeifxpct32kh5n3yaf2radx6iotlp345nvv43yekqajsdiac3vumd7q - community/
bafybeifgbajsdhruqbjvzdz2oq3f3w4vfdoko24imvaoo7lik227xlgjpy - community-staging/
bafybeibabhm7wsnthpzlfadmdzlkjyblaxkyktuu42lb5baq2n3r75gkwm - community-testing/
bafybeibffbb3r5j2yt3cob6xcuvcoak3we5bnwnsb3fe3knlrbzsjtlwg4 - core/
bafybeigpvalprc3uqiyts3c3hvxe2yhrdej4bhutexx4xgv4me5b2p5x7u - db/
bafybeicod6j5t5h2tt5ivb6xpgx75trngmmvfdcigmy7jau2tlek35xqqi - extra/
bafybeigireazzbmm6pxlfcvud3mv4vf6wyhvtzdncqjqa27tp664olsfwm - gnome-unstable/
bafybeif3o6g4a5judp3g454mtwmt3wzrx3vmuumd3cvdlffnmtlqwa3m3e - kde-unstable/
bafybeiedjipmwah5k27243oyirmpkpotyht5ntuvzp3b4uhheogafkip2u - multilib/
bafybeies6drqyzefbsh54x35cyb2u2jdyhn4t3kpf62bobr5rzmuvu2uxe - multilib-staging/
bafybeig7gjincnxqvzwqmshvslmujqeps47proqn6ad534vcw7hq72typa - multilib-testing/
bafybeibi6d3mx4jfobm3qkno2jeooxowdm6kknzbrpe5n74wy7ajfb7qxm - staging/
bafybeiblxfa3ayxinjgcu26apvgter5n56vtaze7ubkekruhtrqmynvpia - testing/

real	0m24,286s
user	0m0,026s
sys	0m0,042s

Latency isn't an issue (receiving side to hosting side):

[ruben@i3 ~]$ ipfs ping QmWsTDap1zmaVLRaUFKBmo25ST6MGtjZQBAT2u72wz4Qma
Looking up peer QmWsTDap1zmaVLRaUFKBmo25ST6MGtjZQBAT2u72wz4Qma
PING QmWsTDap1zmaVLRaUFKBmo25ST6MGtjZQBAT2u72wz4Qma.
Pong received: time=44.90 ms
Pong received: time=32.22 ms
Pong received: time=38.13 ms
Pong received: time=40.08 ms
Pong received: time=49.04 ms
Pong received: time=50.87 ms
Pong received: time=38.41 ms
Pong received: time=83.57 ms
Pong received: time=88.84 ms
Pong received: time=126.46 ms
Average latency: 59.25ms

Both sides use noise, tls, and secio as security transports and badgerds as the datastore. QUIC is configured to listen on port 443 on the hosting side. Neither side uses any experimental flags except OverrideSecurityTransports. On the hosting side, DisableBandwidthMetrics and DisableNatPortMap are active.

@RubenKelevra added the kind/bug and need/triage labels on Jun 19, 2020
@Stebalien
Member

Could you try passing --stream? Does it start streaming immediately, or do you get results all at once?

And what about ipfs ls --resolve-type=false --size=false? Does it return instantly in those cases?

@jacobheun added the need/author-input label on Jul 10, 2020
@jacobheun
Contributor

Could you try passing --stream? Does it start streaming immediately, or do you get results all at once?
And what about ipfs ls --resolve-type=false --size=false? Does it return instantly in those cases?

@RubenKelevra have you had a chance to try this?

@RubenKelevra
Contributor Author

RubenKelevra commented Jul 10, 2020

Hey guys,

I've changed my setup quite a lot in the last few days, but I will try to replicate it. :)

Sorry for the delay; I had major issues with the cluster setup, and it took some days to get everything running smoothly again on a new cluster installation.

@FireMasterK

Could you try passing --stream? Does it start streaming immediately, or do you get results all at once?

It seems to list the files one by one, quite slowly, at about half a second per file.

And what about ipfs ls --resolve-type=false --size=false? Does it return instantly in those cases?

It seems to return files instantly.

Another interesting thing: time ipfs ls --stream /ipns/x86-64.archlinux.pkg.pacman.store/ returned all the data at once, but still took 32 seconds.

@FireMasterK

@Stebalien @jacobheun is there anything else I can do to help debug the problem?

@Stebalien
Member

So, the issue here is that we:

  1. List the directory.
  2. Resolve the file types (directory/regular file/symlink) one-by-one by fetching the first block of the file.

That second step is very slow and, unfortunately, won't be fixed in the near future. The best we could do would be to parallelize this (the second step) a bit more, but that's a low priority for the core team so it's only likely to happen if someone decides to contribute a patch.

The patch would need to modify

https://github.com/ipfs/go-ipfs/blob/2ed9254426e900cf00a9b35304dc5b5de8173208/core/coreapi/unixfs.go#L267-L282

Specifically, as links are resolved, it would need to feed those links into a set of workers (e.g., 16?) where each worker would asynchronously call "processLink", feeding the results back into lsFromLinksAsync.
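
For anyone picking this up, here is a minimal sketch of the worker-pool shape such a patch could take. The Link, DirEntry, and processLink definitions below are placeholders standing in for the real go-ipfs types and the processLink/lsFromLinksAsync code referenced above, so this is illustrative only, not the actual implementation:

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Link and DirEntry are placeholders for the actual go-ipfs types.
type Link struct{ Name, Cid string }
type DirEntry struct{ Name, Cid, Type string }

// processLink stands in for the existing per-link resolution step that
// fetches the first block of an entry to determine its type.
func processLink(ctx context.Context, l Link) DirEntry {
	// ... fetch the root block of l and inspect it ...
	return DirEntry{Name: l.Name, Cid: l.Cid, Type: "unknown"}
}

// resolveLinks fans the links out to a fixed number of workers so the
// per-entry fetches overlap instead of running strictly one-by-one.
func resolveLinks(ctx context.Context, links <-chan Link, workers int) <-chan DirEntry {
	out := make(chan DirEntry)
	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for l := range links {
				select {
				case out <- processLink(ctx, l):
				case <-ctx.Done():
					return
				}
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	links := make(chan Link, 2)
	links <- Link{Name: "core"}
	links <- Link{Name: "extra"}
	close(links)
	for e := range resolveLinks(context.Background(), links, 16) {
		fmt.Println(e.Name, e.Type)
	}
}
```

A pool like this returns entries out of order, so the actual change would probably also need to buffer or re-sequence results if ipfs ls is expected to preserve directory order.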

@FireMasterK

Could the issue also be fixed by some kind of indexing mechanism? Perhaps the index can be used instead of resolving each file type?

Also, is it currently possible to not resolve the filetype when using the gateway?

@Stebalien
Member

Could the issue also be fixed by some kind of indexing mechanism? Perhaps the index can be used instead of resolving each file type?

Unfortunately, no. The slow part here is downloading the "root" of the file so we can see its type. We literally don't know what it will be.

The ultimate solution here is unixfsv2 (version 2 of the file format) where we've talked about embedding this information in the directory object itself. But that project is on pause at the moment.

Also, is it currently possible to not resolve the filetype when using the gateway?

Not at the moment. I'd like to asynchronously resolve file types on the gateway by streaming an HTML response, but haven't had the time to work on that. Specifically:

  1. Return an HTML body with the directory listing as fast as possible.
  2. Stream back some trailing style elements to "update" the directory listing with file types.

Given that browsers render HTML as it comes in instead of waiting to receive the full page, this should "work" (although it'll be a bit hacky).
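
To illustrate those two steps, here is a rough sketch (with made-up entry names and a stand-in resolveType helper, not the actual gateway code) of a handler that writes and flushes the listing first, then trails style elements as each type resolves:

```go
package main

import (
	"fmt"
	"net/http"
)

// resolveType stands in for the slow per-entry lookup that fetches the
// first block of each entry to determine whether it is a dir or a file.
func resolveType(name string) string { return "file" }

func listingHandler(w http.ResponseWriter, r *http.Request) {
	entries := []string{"core", "extra", "community"} // placeholder listing
	w.Header().Set("Content-Type", "text/html")

	// 1. Send the directory listing right away, before types are known.
	fmt.Fprint(w, "<html><body><ul>")
	for i, name := range entries {
		fmt.Fprintf(w, `<li id="entry-%d" class="unknown">%s</li>`, i, name)
	}
	fmt.Fprint(w, "</ul>")
	if f, ok := w.(http.Flusher); ok {
		f.Flush() // the browser can render the full list at this point
	}

	// 2. Trail <style> elements that "update" entries as their types resolve.
	for i, name := range entries {
		t := resolveType(name) // the slow part, done after the list is visible
		fmt.Fprintf(w, `<style>#entry-%d { background: url(/icons/%s.svg); }</style>`, i, t)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
	}
	fmt.Fprint(w, "</body></html>")
}

func main() {
	http.HandleFunc("/", listingHandler)
	http.ListenAndServe(":8080", nil)
}
```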

@RubenKelevra
Contributor Author

@Stebalien wrote:

2. Resolve the file types (directory/regular file/symlink) one-by-one by fetching the first block of the file.

Is this a 256 KByte block, or just a block that contains the meta info and the links to the actual data blocks?

@Stebalien
Member

Stebalien commented Mar 9, 2021 via email

@BigLep
Contributor

BigLep commented May 29, 2021

@Stebalien : per 2021-05-24 conversation, please extract out relevant bits into a new issue and close this issue. Thanks!

@aschmahmann
Contributor

Closing, as this is related to how UnixFSv1 works and is not a bug report. Any further issues would relate to the UnixFSv2 proposals.

@Stebalien
Member

I've filed the remaining part in an issue: #8178.
