-
Notifications
You must be signed in to change notification settings - Fork 931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug: blob.getAll returns ERROR when no blob is found, but is not error #3292
Comments
Duplicate #3192. We had an internal discussion and agreed that returning both empty value and error is not a go idiomatic way, bc the issue is mainly motivated by Rust and based on the Rust semantics. As our implementation was done using golang, we will stick to the rules that have been defined. Any short-term workarounds can lead to more issues and misunderstandings. |
Issue has nothing to do with any programming language, it's an API semantics issue. Semantically the correct response requesting an object that does not exist should be either a success empty response (this is common for search style requests), or more commonly an error code which indicates explicitly JSON RPC provides very few error codes, although Celestia could choose to define sets of internal server errors that could define this, although I'd argue this isn't an internal error (as indicated by not being a 5XX HTTP code in a restful API). You could argue this is a Especially in the instance of The core problem here is that the error messages currently must be relied upon, making any changes to the string messages is a breaking change. Error codes, when adequate, provide a much more robust and standardized way to handle common errors. Unfortunately JSON RPC is a substandard spec in this regard. This could be solved by utilizing the data field on errors to provide structured information. This structure could be standardized across node API errors and have codes used to augment the lacking JSON RPC error codes. This is also commonly done in REST APIs where you can pass an internal error code which provides user more information through the data field. Generally: Message fields on API errors are intended to be human readable information, and they are being relied upon to pattern match to specific. I consider this a strong API anti-pattern, and it's one you are forcing all users of the API to make. The design of the API isn't about semantically correct Go, it's about users being able to build robust applications on top of it in whatever language they choose. |
I don't see any anti-pattern here. |
Is Looking through the codebase it doesn't seem to be Perhaps this is a broader issue: the problem is there are no documented errors, there's no way to know them without scouring the code base. In order to parse as a consumer I must do string matching. JSON RPC provides a mechanism via |
I agree that's a documentation issue in the first place. If we did a better job documenting this and other errors, you would not have been so frustrated with this I believe. |
What is Celestia-Node's recommendation for client handling of blobs not being found? Is error code 1 only used for blobs not found? (IIRC no, the error code is used for other errors) Should clients match on the error message? Will Celestia-Node provide stability guarantees for the blob-not-found error message? Extending on this: if a client issues a single request to fetch blobs under 2 namespaces |
I still do not think an empty value should return an error, as it is not an error. There is just no share at that namespace and that height. |
I actually ran into this scenario:
The Celestia Node RPC docs state the following (https://node-rpc-docs.celestia.org/?version=v0.13.3#blob.GetAll):
This is a bit confusing. |
So @SuperFluffy in this case, at least |
After thinking about this problem for some time, I think it makes sense to return both empty values in
Before opening any PR, let's agree on all cases that may happen during
|
I think this part needs to be re-thought as well. IMO, it was a mistake to allow Regarding the error itself, I still support returning Callers that do not know whether blobs exist under that namespace (e.g. rollup full nodes that may be looking for txs censored by sequencer) would still expect that err for every height for which there should not be blobs for the namespace. It's cleaner than checking that both return values are nil.
|
Maybe the following Google guidelines for API design could be an inspiration:
Specifically the last point under guidance in AIP 231 is interesting here:
So Right now I think most crucial is not necessarily evolving the API (and introducing breaking changes), but to sharpen the docs and give examples that document expected behavior. Right now https://node-rpc-docs.celestia.org/?version=v0.13.3#blob.GetAll only documents the simplest happy case: blob exists for the given namespace. But it should probably also mention a) the error if the blob does not exist, b) the happy case for 2 namespaces, c) the case where two namespaces are queried but one namespace has resources while the other doesn't. Maybe the reported error could also have more information, noting which namespaces failed/had no blobs. |
I agree that returning an error is a semantically correct approach in Golang. The most common example in stdlib would be Our API is designed with Golang semantics. Celestia-node clients in other languages that want an idiomatic feel would have to abstract Golang flavorings away or use different protocol implementations in the language of interest, like lumina for Rust. I would argue that other protocol implementations should similarly optimize APIs for their language of choice, like celestia-node is doing for Go. Our duty as celestia-node team though is to document our API well, which is not the case and is the root cause here. |
I don't have a strong opinion on |
@SuperFluffy, @joroshiba, would improving the API documentation be an acceptable solution for the Astria team? |
@Wondertan better docs are definitely desirable. Even better would be a dedicated (and documented, stable, supported) JSONRPC error code for The specific semantics of your implementation are orthogonal to this at the end of the day. If I were to write a batch-like API in Rust I would certainly return a On the other hand, if I write a celestia-node client using Golang, how should I match on the error? The JSONRPC spec that your API intends to follow has error codes for reporting what exactly failed (well, that and the |
I don't know why our API should be strictly adhering to golang semantics. If people want to use the golang errors, just import node and call the methods on the services directly, there is no need for an RPC in that case. |
There are two options:
Our Golang API users are also expected to import the error to handle such a case. If this case was well documented, it would have been straightforward to use and implemented new clients. |
Where does the JSON-RPC API define this error? I'm a bit confused as to which API you're talking about. Celestia-Node officially documents a jsonrpc API, which is the one we are interacting with. I'm asking for JSONRPC semantics, while you keep referring to your go library. Is the Celestia-Node go library the only way one is supposed to interact with Celestia-Node? Also, is Celestia-Node commiting to a stable message field then? |
This is issue is really painful to read. @Wondertan @vgonkivs Golang semantics have little to do with JSON-RPC APIs. Please, help the team with either 1) how to handle the error properly (if string matching that means committing to not changing that err string and also making it part of the docs; ideally with an example) or 2) return an empty result instead of an error here. The latter could even happen at the JSON-RPC code layer (if err == ErrBlobNotFound return empty slice instead of passing the error to that layer/the resulting json). Thank you @joroshiba for the analysis here and the general feedback @SuperFluffy @jcstein and @distractedm1nd. Also this is really confusing and deserves its own issue. Can someone open that please? |
It doesn't, and that's the API documentation gap I'm talking about.
I mean the API of the BlobModule, with this error defined next to the interface.
Any API fundamentally has to adhere to some semantics consistently. The JSON RPC is very minimal and doesn't have any. You can apply anything over it. You can even pass protobufs over it if you really want to.
We are not v1, but there is no reason for this error message to be changed. |
Unfortunately, they are. That's the nature of JSON-RPC. It's so minimal and unopinionated that you have to have some additional structure over it for things like error handling in this case. |
The decision to adhere to Golang semantics can be changed though. Back in the day, we saw Golang users like Rolkit as our primary users, and we were writing Golang, so it's easy to stick to it. We can revisit this decision over Interfaces WG and discuss alternative formats to run over JSON RPC or move to a different API altogether. |
@joroshiba, @SuperFluffy, we just discussed internally the notion of Canonical API. The CIP standardized API that every node/consensus client/node has to support. That API would be defined using IDL with strong cross-language support, like proto3/gRPC. This layer on top will hide the internals of the native celestia-node API like we are facing in this issue. We would love your thoughts on this idea and any design requests from the beginning. Let's design a better way of interfacing with celestia together. |
Besides, I got feedback that my communication here was somewhat dismissive, and I want to apologize for that. Reading through the threads, I think I can get a sense of it. Naturally, I am not the best person when it comes to empathy, but I do appreciate it when people deeply care about technology, and you are the perfect example of what caring is, even about such small low-level issues. I hope that the long-term solution above will address your concern by addressing the cross-language barrier. |
Appreciate the understanding and excited to move forward with improving the API. Related to this wanted to respond to this idea, because I think it's important as we rework the API to consider what the user journeys are, and how we want the node to be utilized. The API is a product, the APIs we build fill usecases, but also will push people towards specific design decisions. Even if technically workable, the choice to return an error implies the user is doing something wrong. As pointed out by @renaynay the choice to return an error here suggests that users should know what heights data are posted at.
In this example: I don't think the caller should have to know what heights it expects blobs to exist at. If I have adequate DA to know what height my data got posted at, why not just use that DA to give me the actual data? Instead I can follow and check "do I have data" at each block height instead of relying on alternative DA to tell us where the data is. It seems that is not an intended use case of the current API, but it seems to me as though this is how people should use DA. Things to consider as we move towards the next iteration. |
I think this is exactly where we are desynchronized. In Golang world it's its normal to use errors as a form of signalling to the user. The simplest example would be io.EOF. It's returned every time you finish reading a file and it's not an error per se. But that's a Golang quirk and having it across language barriers isn't optimal. The new Canonical API doesn't have to be consistent with Golang semantics and thus won't have this quirk. |
I think this issue has meandered quite far from the original bug report which is making it hard make the issue actionable. I'd like to propose that a) we collectively acknowledge that the current JSON-RPC api has been caught lacking and thus does not provide an adequate or pleasant developer experience Then we can draw a conclusion under some of the debate. Subsequently, as the canonical API work will take time, I'd like to propose we close this issue, considering proper (it not perfect) action being to take @SuperFluffy's suggestion to more properly document the error strings (and potentially formalize them as errors as proposed in rollkit/go-da#65) as the action as tracked in @liamsi's tracking issue does this ok @jcstein @liamsi @joroshiba @SuperFluffy |
I want to butt in before this error is closed, and say that I find it extremely unintuitive that an error is returned when no blobs are found on the passed height. There are use-cases where you do not know in advance if data is going to be available at a height or not. The transparency dictionary rollup Sebastian is building is an example of this, and it's one of the first things being built natively on top of Celestia without a rollup framework. There will be plenty more. The reason this error is infuriating in particular is because instead of just noop-ing on a loop over the returned list (as its empty), you have to go into your error matching statement, just to string compare and then throw the error away. I don't know if this intuition comes across automatically or not, but it becomes quickly apparent when you use it for the first time. It just reads really badly. And this isn't only an issue in Rust of course -- while writing code snippets for the golang client library tutorial today, there is this simple example that also becomes cumbersome: func SubscribeHeaders(ctx context.Context, url string, token string) error {
client, err := client.NewClient(ctx, url, token)
if err != nil {
return err
}
// create a namespace to filter blobs with
namespace, err := share.NewBlobNamespaceV0([]byte{0xDE, 0xAD, 0xBE, 0xEF})
if err != nil {
return err
}
// subscribe to new headers using a <-chan *header.ExtendedHeader channel
headerChan, err := client.Header.Subscribe(ctx)
if err != nil {
return err
}
for {
select {
case header := <-headerChan:
// fetch all blobs at the height of the new header
blobs, err := client.Blob.GetAll(context.TODO(), header.Height(), []share.Namespace{namespace})
if err != nil {
fmt.Println("Error fetching blobs: %v", err)
}
fmt.Println("Found %d blobs at height %d in 0xDEADBEEF namespace", len(blobs), header.Height())
case <-ctx.Done():
return
}
}
} I can't actually leave the code like this, because of this issue. I would need to change the In general, it seems quite obvious to me that returning an empty array when no data is found there is just the expected behavior. Given that this is probably the most important endpoint in the whole API, I feel very strongly about this behavior. |
@distractedm1nd I believe one way forward would could be to provide for example |
The point of the blob module was to have a simple, intuitive way to do things - which is why it is very opinionated. I find that splitting the method up defeats that purpose by providing multiple ways to do things inside of the module. And especially for such a trivial difference, I think it would just lead to additional confusion. This is the same reason that blob.Submit is not very parametrizable, and the alternative state.SubmitPayForBlobs exists. To be consistent here, my opinion is that we should pick the most intuitive option, and have the client handle the error explicitly for the non-intuitive case. Splitting it would mean we have 3 ways to fetch blobs from the network through the RPC. This is a double-edged sword. More users would have the exact method they need without adding any special handling, but then developers also need to learn all 3 and the differences between them to determine the right one -- and it also puts the same burden on the readers of the code. Let's keep it as simple as possible |
Agreed that things should be kept simple and not introducing several similarly sounding API endpoints for the same tasks.
This is actually not uncommon in golang (see the EOF case that Hlib brought up several times above). That said, while this is acceptable in go, it is not acceptable for the API anyways. And I do agree that we should change this to return an empty result instead of an error. |
I am blocked because of the same issue. I am setting up infrastructure for a new OP-stack-based L2, Camp(aign). It uses Celestia as the DA layer. In order for op-node to sync, I need to supply it with a DA RPC, ie, Celestia light node endpoint. So, I spun up and got mocha-4 light node synced to the tip and passed it to op-node through the op-node:
celestia-node:
Is there a way I can fix this issue? If not, I hope returning an empty response instead of an error will help fix this issue. |
This is the correct solution here IMO. Agree with @distractedm1nd. |
Celestia Node version
v0.13.2
OS
macOS
Install tools
docs
Others
No response
Steps to reproduce it
make a blob.getAll request from a height and namespace which has no blobs, the CLI, for example, returns:
Expected result
from @distractedm1nd: blob.GetAll should not return an error when there is no data at that height under that namespace. Its not an error. It is just empty data
Actual result
The logs for light node show an ERROR, along with WARN and INFO logs for the request:
Relevant log output
Notes
No response
The text was updated successfully, but these errors were encountered: