The issue
We're currently handling p2p communication between Colossus nodes via the public HTTP(S) API, which has several issues:
Communication is slow and limited, because connections are not persistent and there's a substantial protocol overhead
The overhead will increase even more once we introduce authentication / authorization rules which would require verifying, for example, whether a given node is allowed to request a given object (see: Infrastructure authentication #4414)
Take the cleanup service as an example. Currently, for each data object that's about to be removed from local storage, we make HTTP(S) requests to all nodes that are supposed to store this object, to ensure the replication threshold will still be met after removal:
for (const { storageBucket } of movedDataObject.storageBag.storageBuckets) {
  // One HEAD request per storage bucket assigned to the object's bag
  const url = urljoin(bucketOperatorUrlById.get(storageBucket.id), 'api/v1/files', movedDataObject.id)
  await superagent.head(url).timeout(timeoutMs).set('X-COLOSSUS-HOST-ID', hostId)
  dataObjectReplicationCount++
}
Because the requests in this loop are awaited one at a time, this takes a very long time and causes connectivity issues when the number of objects is very large.
Although this particular issue can be resolved by introducing some sort of batch request to check the status of multiple objects at once (which I think is a good idea in general), I think there are many scenarios that would benefit from a faster way of exchanging information between Colossus nodes.
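As a rough illustration of the batching idea, the sketch below groups pending object IDs by storage bucket, so that each node could be asked about many objects in one request instead of one request per object. The names (`PendingObject`, `groupObjectIdsByBucket`, `chunk`) and the batch size limit are hypothetical, and the batch endpoint itself is an assumption that is not shown here.

```typescript
// Hypothetical shape: an object queued for removal and the buckets expected to hold it.
type PendingObject = { id: string; bucketIds: string[] }

// Group object IDs by bucket, so each bucket operator is queried once per batch
// rather than once per object.
function groupObjectIdsByBucket(objects: PendingObject[]): Map<string, string[]> {
  const byBucket = new Map<string, string[]>()
  for (const { id, bucketIds } of objects) {
    for (const bucketId of bucketIds) {
      const ids = byBucket.get(bucketId) ?? []
      ids.push(id)
      byBucket.set(bucketId, ids)
    }
  }
  return byBucket
}

// Split a list into batches of at most `size` items (the size limit is an assumption).
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be at least 1')
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}
```

With this grouping, the number of requests scales with the number of buckets and batches rather than with the number of objects.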
Another example is the synchronization process. It could be much faster (and safer) if the peers to sync from were chosen based on their current load and proximity instead of being selected randomly. But to exchange this kind of information efficiently, we need a means of communication other than the HTTP API.
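To make the peer-selection idea more concrete, here is a minimal sketch that ranks candidate peers by reported load and measured latency (used here as a stand-in for proximity). The `PeerStatus` fields, the weights, and the scoring formula are all assumptions for illustration; how nodes would actually report this information is left open.

```typescript
// Hypothetical status a peer could report over a persistent connection.
type PeerStatus = {
  nodeId: string
  load: number // assumed: fraction of capacity in use, 0 (idle) to 1 (saturated)
  latencyMs: number // assumed: measured round-trip time, a proxy for proximity
}

// Lower score is better. The weights are arbitrary and would need tuning.
function peerScore(peer: PeerStatus, loadWeight = 0.7, latencyWeight = 0.3): number {
  const normalizedLatency = Math.min(peer.latencyMs / 1000, 1) // cap at 1 second
  return loadWeight * peer.load + latencyWeight * normalizedLatency
}

// Return candidate peers ordered from most to least preferable, without mutating the input.
function rankPeers(peers: PeerStatus[]): PeerStatus[] {
  return [...peers].sort((a, b) => peerScore(a) - peerScore(b))
}
```

A sync task could then try peers in ranked order and fall back to the next one on failure, instead of picking one at random.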
Potential solutions
Add support for TCP / WebSocket communication between Colossus nodes.
The initial implementation can target the cleanup task by introducing persistent connections between Colossus nodes, which they can use to exchange information about which data objects they are storing, e.g.:
Node A sends a request to Node B asking whether it has objects [101, 102, 105]
Node B responds with a simple boolean status for each object (for example: b010, meaning: no, yes, no)
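The exchange above could be encoded along these lines. This is only a sketch: the `b` prefix follows the example in this issue, while the function names and the convention of answering in request order are assumptions rather than an agreed wire format.

```typescript
// Encode one presence flag per requested object, in request order, e.g. "b010".
function encodePresence(requestedIds: number[], storedIds: Set<number>): string {
  return 'b' + requestedIds.map((id) => (storedIds.has(id) ? '1' : '0')).join('')
}

// Decode a response back into a map of object ID -> whether the peer stores it.
function decodePresence(requestedIds: number[], response: string): Map<number, boolean> {
  const bits = response.slice(1)
  if (!response.startsWith('b') || bits.length !== requestedIds.length || /[^01]/.test(bits)) {
    throw new Error(`Malformed presence response: ${response}`)
  }
  return new Map(requestedIds.map((id, i): [number, boolean] => [id, bits[i] === '1']))
}
```

For instance, if Node B stores only object 102, `encodePresence([101, 102, 105], new Set([102]))` yields `b010`, and Node A can recover the per-object status with `decodePresence`.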
To enable greater flexibility in the future, I recommend exploring a libp2p-based implementation.