*: add invocations to applicationlog #3569

Open
wants to merge 6 commits into
base: master

Conversation

ixje
Contributor

@ixje ixje commented Sep 3, 2024

Problem

neo-project/neo#3386

Solution

Implement as an extension. Moved the discussion from Dora's backend PR to here.

To do

  • copy arguments to avoid modifications
  • limit the total number of argument stack items in a single transaction (for safety)
  • make this a configurable feature
  • include native contract calls

Do we want to limit the stack item depth (think MaxJSONDepth) or are we content with just limiting the total stack arguments?

Member

@AnnaShaleva AnnaShaleva left a comment

A good prototype, but I have several design questions that we should solve before review.

pkg/core/interop/contract/call.go Outdated Show resolved Hide resolved
pkg/core/interop/contract/call.go Outdated Show resolved Hide resolved
Comment on lines 73 to 76
ic.InvocationCalls = append(ic.InvocationCalls, state.ContractInvocation{
Hash: u,
Method: method,
Params: stackitem.NewArray(args),
Member

This "flattened" way of tracking invocations loses depth: given [ContractACall, ContractBCall, ContractCCall], it's impossible to say whether contract B calls contract C internally or contract A calls both B and C in sequence. By comparison, the VM-level InvocationsTree gives a clear picture of call depth and nesting, which is good for the user:

neo-go/pkg/vm/vm.go

Lines 395 to 397 in d47fe39

newTree := &invocations.Tree{Current: ctx.ScriptHash()}
curTree.Calls = append(curTree.Calls, newTree)
ctx.sc.invTree = newTree

However, the VM InvocationsTree can't be used in its current state because it does not track call arguments. Making it track them is a problem: it only has access to the loading context with the contract script hash, whereas the arguments are loaded by interop handlers. This may be solved with an additional VM callback.

So the question is: do we need the nesting relationship to be present in the resulting invocations log? It's important to settle this design question before the implementation.

Contributor Author

For the Dora use-case we don't need this information. Keeping it flat would be similar to notifications where we also can't tell who triggered them (i.e. was it user calling contractA which calls Contract B, or was it user calling ContractA and user calling ContractB using 2 System.Contract.Calls in a tx.script).

However, it does seem like this information can be useful to somebody somewhere down the road and changing it later on is going to be a hassle. What would it look like? An option could be

type ContractInvocation struct {
	Hash        util.Uint160         `json:"contract_hash"`
	Method      string               `json:"method"`
	Arguments   *stackitem.Array     `json:"arguments"`
	IsValid     bool                 `json:"is_valid"`
	Invocations []ContractInvocation `json:"invocations"`
}
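
To make the flat-versus-nested trade-off concrete, here is a minimal sketch of how a nested structure distinguishes the two scenarios described above. The `Invocation` type and `Flatten` helper are hypothetical simplifications (a plain string stands in for `util.Uint160` and arguments are omitted), not code from this PR.

```go
package main

import "fmt"

// Invocation is a simplified, hypothetical stand-in for the proposed
// ContractInvocation: a string replaces util.Uint160 and arguments are
// left out so that the sketch has no neo-go dependency.
type Invocation struct {
	Hash        string
	Method      string
	Invocations []Invocation
}

// Flatten walks the tree depth-first and returns "depth:hash.method" entries.
func Flatten(calls []Invocation, depth int) []string {
	var out []string
	for _, c := range calls {
		out = append(out, fmt.Sprintf("%d:%s.%s", depth, c.Hash, c.Method))
		out = append(out, Flatten(c.Invocations, depth+1)...)
	}
	return out
}

func main() {
	// A calls B, and B calls C internally.
	nested := []Invocation{{Hash: "A", Method: "run", Invocations: []Invocation{
		{Hash: "B", Method: "transfer", Invocations: []Invocation{
			{Hash: "C", Method: "onPayment"},
		}},
	}}}
	// A calls B and then calls C itself.
	sequential := []Invocation{{Hash: "A", Method: "run", Invocations: []Invocation{
		{Hash: "B", Method: "transfer"},
		{Hash: "C", Method: "onPayment"},
	}}}
	fmt.Println(Flatten(nested, 0))
	fmt.Println(Flatten(sequential, 0))
}
```

With a flat list both scenarios would record the same three entries; with nesting, C appears at depth 2 in the first case and at depth 1 in the second.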

Member

An option could be

Agreed, that is in fact how the VM InvocationTree works.

But regarding 1D (flattened) / 2D (nested) structure of Invocations: I think we need some third opinion on this topic. Personally, I vote for the nested structure because it contains more information which may be useful in some cases, and especially for contract calls debugging.

Member

A proper call tree would come at a higher price. I'm OK with keeping it this way if it doesn't have a significant performance penalty.

pkg/core/interop/contract/call.go Outdated Show resolved Hide resolved
pkg/core/interop/contract/call.go Outdated Show resolved Hide resolved
@AnnaShaleva
Member

@roman-khimov, I think we need some third opinion on these topics.

@roman-khimov
Member

How about System.Runtime.LoadScript calls, btw?

@AnnaShaleva
Member

How about System.Runtime.LoadScript calls

It creates a new execution context, so it's a valid part of the invocation tree. But is this information useful in practice? Dynamic invocations are identified by the hash160 of the loaded script, so the user can't recover the script because they know only its hash. Still, we may include dynamic invocations in the resulting Invocations tree with some special field like isContractCall: false.

@ixje ixje force-pushed the applog-invocations branch from f4e91f5 to dfd5c9b Compare October 31, 2024 09:03
@ixje
Contributor Author

ixje commented Oct 31, 2024

Picking this up again. I rebased the branch onto latest master and processed some of the feedback. In particular:

  • use stackitem.Serialize instead of deepcopy and re-use the results when storing the data
  • make the behaviour configurable through a SaveInvocations config option

Note: it was unclear to me based on #3569 (comment) whether I should make it a tree or keep it flat. I kept it flat for now.

If the feature is enabled, the applicationlog output looks as follows:

"invocations": [
                    {
                        "contract_hash": "0xd2a4cff31913016155e38e474a2c06d08be276cf",
                        "method": "transfer",
                        "arguments": {
                            "type": "Array",
                            "value": [
                                {
                                    "type": "ByteString",
                                    "value": "krOcd6pg8ptXwXPO2Rfxf9Mhpus="
                                },
                                {
                                    "type": "ByteString",
                                    "value": "AZelPVEEY0csq+FRLl/HJ9cW+Qs="
                                },
                                {
                                    "type": "Integer",
                                    "value": "1000000000000"
                                },
                                {
                                    "type": "Any"
                                }
                            ]
                        },
                        "arguments_count": 4,
                        "is_valid": true
                    }
                ]

and in the disabled state it returns

"invocations": []

I'm looking for feedback on the above before taking care of covering System.Runtime.LoadScript calls.

@ixje ixje requested a review from AnnaShaleva October 31, 2024 09:43
@ixje
Contributor Author

ixje commented Nov 14, 2024

@AnnaShaleva can this PR also get some review love please

@ixje ixje marked this pull request as ready for review November 19, 2024 10:26
@ixje ixje requested a review from roman-khimov November 19, 2024 10:27
Member

@roman-khimov roman-khimov left a comment

I don't see any limits on the overall size of saved data. While the feature is optional, we still need to protect the node from abuse.
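
One possible shape for such a limit, as a rough sketch only: keep a running byte budget per transaction and store a record without its arguments once the budget is exhausted. `maxStoredArgBytes`, `Record` and `Recorder` are hypothetical names, and the budget value is arbitrary; nothing here is taken from the PR.

```go
package main

import "fmt"

// maxStoredArgBytes is a hypothetical per-transaction budget; the
// discussion does not fix a concrete number.
const maxStoredArgBytes = 1024

// Record is a simplified invocation record for this sketch.
type Record struct {
	Method    string
	ArgBytes  []byte
	Truncated bool
}

// Recorder accumulates records and tracks how many argument bytes it kept.
type Recorder struct {
	used    int
	Records []Record
}

// Add stores the invocation. Once the budget would be exceeded, the
// serialized arguments are dropped and the record is marked as truncated.
func (r *Recorder) Add(method string, argBytes []byte) {
	rec := Record{Method: method}
	if r.used+len(argBytes) > maxStoredArgBytes {
		rec.Truncated = true
	} else {
		rec.ArgBytes = argBytes
		r.used += len(argBytes)
	}
	r.Records = append(r.Records, rec)
}

func main() {
	var r Recorder
	r.Add("transfer", make([]byte, 1000))
	r.Add("transfer", make([]byte, 100)) // over budget: kept without arguments
	for _, rec := range r.Records {
		fmt.Println(rec.Method, len(rec.ArgBytes), rec.Truncated)
	}
}
```

This sketch bounds only the argument bytes; a real limit would probably also need to cap the number of records themselves.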

@@ -4,7 +4,6 @@ import (
"encoding/json"
"errors"
"fmt"

Member

Shouldn't happen.

Contributor Author

Not sure what the request is here. Compared to master there are no spaces there either, just newlines (0a):

"fmt"
"github.com/nspcc-dev/neo-go/pkg/io"

Maybe one of my old commits had it slipped in?

Member

Yes, one commit removes it and another one adds it back.

@@ -120,6 +191,7 @@ func (aer *AppExecResult) DecodeBinary(r *io.BinReader) {
aer.Stack = arr
r.ReadArray(&aer.Events)
aer.FaultException = r.ReadString()
r.ReadArray(&aer.Invocations)
Member

This makes the current DB incompatible. We need to keep this compatibility.

Method: method,
Arguments: stackitem.NewArray(args),
})
if ic.Chain.GetConfig().Ledger.SaveInvocations {
Member

GetConfig() for every call is too costly, we need some new field in ic inited on NewContext() (just like Network and Hardforks are initialized currently).
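
A minimal sketch of the suggested pattern, assuming simplified stand-in types (`Config`, `Chain` and `Context` here are hypothetical, not the neo-go types): the configuration is read once when the context is built, and the hot path checks only a cached boolean.

```go
package main

import "fmt"

// Config and Chain are hypothetical stand-ins for the node configuration
// and the blockchain object; calls counts how often GetConfig is invoked.
type Config struct{ SaveInvocations bool }

type Chain struct {
	cfg   Config
	calls int
}

// GetConfig returns a copy of the configuration, which is what makes it
// undesirable on a hot path.
func (c *Chain) GetConfig() Config {
	c.calls++
	return c.cfg
}

// Context mirrors the idea of an interop context with a cached flag.
type Context struct {
	SaveInvocations bool
}

// NewContext reads the configuration once and caches the flag.
func NewContext(c *Chain) *Context {
	return &Context{SaveInvocations: c.GetConfig().SaveInvocations}
}

func main() {
	chain := &Chain{cfg: Config{SaveInvocations: true}}
	ic := NewContext(chain)
	recorded := 0
	for i := 0; i < 1000; i++ { // simulate 1000 contract calls
		if ic.SaveInvocations {
			recorded++
		}
	}
	fmt.Println(recorded, chain.calls) // prints "1000 1"
}
```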

Hash util.Uint160 `json:"contract_hash"`
Method string `json:"method"`
Arguments *stackitem.Array `json:"arguments"`
ArgumentsCount uint32 `json:"arguments_count"`
Member

Why this one? In the optimistic case you have Arguments with some proper number of elements. In pessimistic, does it make any difference?

Contributor Author

The idea was that in case of IsValid = false we at least have some information on the argument count.

As said in #3569 (comment)

I don't think we should throw away the complete invocation record because of a parameter violation. If it's recorded on chain then it was apparently still a valid invocation, regardless if we want to store all of the invocation details

I don't recall a hard necessity for this in my original use-case for this PR, but I think it's cheap to do and harder to add later (assuming the C# follows).

@@ -24,36 +24,28 @@ type ContractInvocation struct {
Hash util.Uint160 `json:"contract_hash"`
Method string `json:"method"`
Arguments *stackitem.Array `json:"arguments"`
ArgumentsBytes []byte `json:"arguments_bytes"`
Member

That's too many details for a public structure used in popular APIs. This should be hidden.

Arguments *stackitem.Array `json:"arguments"`
ArgumentsBytes []byte `json:"arguments_bytes"`
ArgumentsCount uint32 `json:"arguments_count"`
IsValid bool `json:"is_valid"`
Member

Notice we never use snake_case in JSON structures, this should be something different. Also, from the Go perspective it's very convenient to have the default boolean (false) to follow regular use case. Like we use truncated for various find* results. Maybe truncated is applicable here too.
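
As an illustration of this point, here is a small sketch with a hypothetical `invocationJSON` type: keys avoid snake_case, and the boolean is named so that its Go zero value (false) corresponds to the regular, non-truncated case. The exact key names are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// invocationJSON is a hypothetical JSON shape: keys without underscores
// and a "truncated" flag whose zero value means "nothing was cut off".
type invocationJSON struct {
	Hash      string `json:"hash"`
	Method    string `json:"method"`
	Truncated bool   `json:"truncated"`
}

// encodeInvocation marshals the record and returns the JSON text.
func encodeInvocation(ci invocationJSON) (string, error) {
	out, err := json.Marshal(ci)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// decodeTruncated parses a JSON document and reports its truncated flag.
func decodeTruncated(doc string) (bool, error) {
	var ci invocationJSON
	if err := json.Unmarshal([]byte(doc), &ci); err != nil {
		return false, err
	}
	return ci.Truncated, nil
}

func main() {
	// The flag is never set explicitly: the zero value is the regular case.
	ci := invocationJSON{
		Hash:   "0xd2a4cff31913016155e38e474a2c06d08be276cf",
		Method: "transfer",
	}
	out, err := encodeInvocation(ci)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	// A document that lacks the field decodes to the same regular case.
	truncated, err := decodeTruncated(`{"hash":"0x01","method":"transfer"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(truncated)
}
```

With an `is_valid`-style field the same zero value would mean "invalid", so every producer would have to remember to set it.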

arr := stackitem.NewArray(args)
arrCount := len(args)
valid := true
argBytes := []byte{}
Member

Do you need this to be initialized to a zero-length slice? Also, stylistically I'd prefer a

var (
)

block here since you declare a lot of things.

@@ -69,6 +69,26 @@ func Call(ic *interop.Context) error {
return fmt.Errorf("method not found: %s/%d", method, len(args))
}
hasReturn := md.ReturnType != smartcontract.VoidType

if ic.Chain.GetConfig().Ledger.SaveInvocations {
arr := stackitem.NewArray(args)
Member

If you're serializing, this is a useless variable.

@ixje ixje requested a review from roman-khimov November 20, 2024 14:17
@ixje
Contributor Author

ixje commented Dec 1, 2024

ping

@ixje
Contributor Author

ixje commented Dec 3, 2024

I feel there is a lot of resistance against this PR, but I can't really tell why. Other PRs I opened (later than this one, like #3674 and #3677) were swiftly reviewed (multiple times). This one however takes 2-3 weeks per update to get another response despite multiple pings.

Can any of you @AnnaShaleva @roman-khimov elaborate please? If I understand where this resistance is coming from I can perhaps do something about it.

@roman-khimov
Member

@ixje, zero resistance. Sorry, it's just that there are too many things to handle at once. @AnnaShaleva will get back to it soon.

pkg/config/ledger_config.go Outdated Show resolved Hide resolved
pkg/config/ledger_config.go Show resolved Hide resolved
pkg/core/interop/context.go Show resolved Hide resolved
pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
pkg/core/interop/contract/call.go Outdated Show resolved Hide resolved
pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
baseExecFee: baseExecFee,
baseStorageFee: baseStorageFee,
loadToken: loadTokenFunc,
SaveInvocations: bc.GetConfig().SaveInvocations,
Member

Move a call to bc.GetConfig to the upper level (before cfg := bc.GetConfig().ProtocolConfiguration), reuse it for cfg retrieval and for SaveInvocations retrieval.

pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
pkg/core/state/notification_event.go Outdated Show resolved Hide resolved
@AnnaShaleva
Member

@ixje sorry for the delay. Actually, when we don't need #3569 (comment) and #3569 (comment) to be fixed, there are only a couple of non-critical issues left to be fixed, so the PR is almost ready and looks good.

@ixje
Contributor Author

ixje commented Jan 2, 2025

@AnnaShaleva some RPC server tests and TestNEO_CommitteeEvents are failing due to changes requested by you in this comment. If I do write this 1 byte for the length then they pass (with some small updates where necessary).

I've spent some time trying to understand why TestNEO_CommitteeEvents is failing. For some reason the data fetched from the dao does contain the execution events, even though the body of the following logic is never called (breakpoints are not hit in the debugger):

	if invocLen := len(aer.Invocations); invocLen > 0 {
		w.WriteVarUint(uint64(invocLen))
		for i := range aer.Invocations {
			aer.Invocations[i].EncodeBinaryWithContext(w, sc)
		}
	}

A breakpoint in EncodeBinaryWithContext of ContractInvocation is not hit either. I'm not sure yet what I'm overlooking or where. Any help or insight is appreciated.

@ixje ixje requested a review from AnnaShaleva January 2, 2025 12:51
@@ -356,6 +356,52 @@ to various blockchain events (with simple event filtering) and receive them on
the client as JSON-RPC notifications. More details on that are written in the
[notifications specification](notifications.md).

#### Applicationlog invocations
Member

Replace with "getapplicationlog call" to follow the style of other RPC call extension headers?

@@ -356,6 +356,52 @@ to various blockchain events (with simple event filtering) and receive them on
the client as JSON-RPC notifications. More details on that are written in the
[notifications specification](notifications.md).

#### Applicationlog invocations

The `SaveInvocations` node configuration setting stores smart contract invocation
Member

s/node/RPC server

@@ -356,6 +356,52 @@ to various blockchain events (with simple event filtering) and receive them on
the client as JSON-RPC notifications. More details on that are written in the
[notifications specification](notifications.md).

#### Applicationlog invocations

The `SaveInvocations` node configuration setting stores smart contract invocation
Member

s/stores/makes the node to store

or something like this.

#### Applicationlog invocations

The `SaveInvocations` node configuration setting stores smart contract invocation
details into the application logs under the `invocations` key. This feature is
Member

s/into the application logs under the invocations key/as a part of application log

The applog DB key is a purely internal thing, and I'd say it's not relevant to an external user.


Example:
```json
"invocations": [
Member

Let's attach the full response of the getapplicationlog call and extend the Example: description a bit, otherwise the source of this JSON is not clear.

ArgumentsCount: uint32(arrCount),
Truncated: truncated,
})
ci := state.NewContractInvocation(u, method, argBytes, uint32(arrCount), truncated)
Member

argBytes can't be used if err is not nil, because it contains half-serialized data in this case. Explicitly set argBytes to nil in case of non-nil err.
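
The hazard can be sketched as follows. `serialize` is a hypothetical stand-in for a serializer that appends to a buffer as it goes and may fail midway (it is not `stackitem.Serialize`); `prepare` shows the requested handling, where partial output is explicitly discarded.

```go
package main

import (
	"errors"
	"fmt"
)

var errTooBig = errors.New("serialized arguments exceed the limit")

// serialize is a hypothetical serializer that writes length-prefixed
// strings (each shorter than 256 bytes) and can fail midway, leaving
// partial data in the returned buffer.
func serialize(args []string, limit int) ([]byte, error) {
	var buf []byte
	for _, a := range args {
		if len(buf)+len(a)+1 > limit {
			return buf, errTooBig // buf holds half-serialized data here
		}
		buf = append(buf, byte(len(a)))
		buf = append(buf, a...)
	}
	return buf, nil
}

// prepare returns the bytes to store together with the truncated flag.
func prepare(args []string, limit int) ([]byte, bool) {
	argBytes, err := serialize(args, limit)
	if err != nil {
		// Never keep partial output: it cannot be deserialized later.
		return nil, true
	}
	return argBytes, false
}

func main() {
	b, truncated := prepare([]string{"ab", "cd"}, 4)
	fmt.Println(b == nil, truncated)
	b, truncated = prepare([]string{"ab"}, 4)
	fmt.Println(len(b), truncated)
}
```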

type contractInvocationAux struct {
Hash util.Uint160 `json:"hash"`
Method string `json:"method"`
Arguments json.RawMessage `json:"arguments"`
Member

Use omitempty tag?

func (ci ContractInvocation) MarshalJSON() ([]byte, error) {
si, err := stackitem.Deserialize(ci.argumentsBytes)
if err != nil {
return nil, err
Member

argumentsBytes may be empty, in which case deserialization won't work. Add a test for storing/retrieving an invocation that contains too many arguments.

}
params, err := stackitem.FromJSONWithTypes(aux.Arguments)
if err != nil {
return err
Member

You need to handle the case of an invalid type ([]byte(fmt.Sprintf("error: %v", err)) set in the marshaller). Try to unmarshal aux.Arguments into a string and continue the execution flow if unmarshalling succeeds (exactly as for state.Execution unmarshalling). Let's add an Encode -> Decode -> marshal JSON -> unmarshal JSON unit test for the ContractInvocation structure for the case when argumentsBytes is nil.
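
A rough sketch of that fallback, using only encoding/json: `aux`, `typedItem` and `decodeArguments` are hypothetical names, and `typedItem` merely imitates the typed stack item JSON form rather than calling `stackitem.FromJSONWithTypes`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aux mirrors the idea of contractInvocationAux with raw arguments.
type aux struct {
	Method    string          `json:"method"`
	Arguments json.RawMessage `json:"arguments"`
}

// typedItem is a hypothetical stand-in for the typed stack item JSON form.
type typedItem struct {
	Type  string          `json:"type"`
	Value json.RawMessage `json:"value,omitempty"`
}

// decodeArguments returns the typed item when parsing succeeds, or the
// error text when the marshaller stored a plain string instead.
func decodeArguments(data []byte) (*typedItem, string, error) {
	var a aux
	if err := json.Unmarshal(data, &a); err != nil {
		return nil, "", err
	}
	var it typedItem
	if typedErr := json.Unmarshal(a.Arguments, &it); typedErr == nil {
		return &it, "", nil
	}
	// Typed parsing failed: fall back to the string written by the marshaller.
	var s string
	if err := json.Unmarshal(a.Arguments, &s); err != nil {
		return nil, "", err
	}
	return nil, s, nil
}

func main() {
	docs := []string{
		`{"method":"transfer","arguments":{"type":"Array","value":[]}}`,
		`{"method":"transfer","arguments":"error: too many items"}`,
	}
	for _, doc := range docs {
		item, errText, err := decodeArguments([]byte(doc))
		fmt.Println(item != nil, errText, err)
	}
}
```

A round-trip unit test along the lines requested above would feed both kinds of document through this path and compare the result with the original record.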

w.WriteVarUint(uint64(len(aer.Invocations)))
for i := range aer.Invocations {
aer.Invocations[i].EncodeBinaryWithContext(w, sc)
if invocLen := len(aer.Invocations); invocLen > 0 {
Member

some RPC server tests and TestNEO_CommitteeEvents are failing due to changes requested by you #3569 (comment).

Regarding this change: the problem seems a bit more complicated. It turns out that we can't encode aer.Invocations conditionally, because the decoding code then becomes non-deterministic and dependent on if r.Len() > 0 {. We can't rely on the buffer length during decoding because the buffer may contain arbitrary data after AppExecResult (for example, another encoded structure), hence we need a reliable marker that indicates whether aer.Invocations are included in the buffer. So the suggestion is:

  1. Use aer.VMState as the marker. aer.VMState is a byte, whereas there are only 4 known VM states (None, Halt, Fault, Break). Hence, we may use a free bit of aer.VMState to encode a boolean value (if len(aer.Invocations) > 0, set the most significant bit of aer.VMState to 1).
  2. Adjust the decoding code of AppExecResult: decode aer.Invocations only if the most significant bit of aer.VMState is set to 1.
  3. Add a compatibility test to ensure that there's no valid VM state with a value of 1 << 7. This test is required for possible future extensions of the VM state enum.
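
The steps above can be sketched as follows, under heavy simplification: `result` stands in for AppExecResult with a single state byte and one-byte invocation records, the length prefix is a single byte rather than a var-uint, and the state values in `main` are illustrative only. None of these names come from the PR.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// invocationsFlag is the most significant bit of the state byte.
const invocationsFlag byte = 1 << 7

// result is a simplified stand-in for AppExecResult.
type result struct {
	State       byte
	Invocations []byte
}

// encode sets the flag bit only when invocations follow the state byte.
func encode(buf *bytes.Buffer, r result) {
	state := r.State
	if len(r.Invocations) > 0 {
		state |= invocationsFlag
	}
	buf.WriteByte(state)
	if len(r.Invocations) > 0 {
		buf.WriteByte(byte(len(r.Invocations))) // sketch: at most 255 records
		buf.Write(r.Invocations)
	}
}

// decode reads invocations only if the flag bit is set, so it never has
// to look at how much data is left in the buffer.
func decode(buf *bytes.Buffer) (result, error) {
	state, err := buf.ReadByte()
	if err != nil {
		return result{}, err
	}
	r := result{State: state &^ invocationsFlag}
	if state&invocationsFlag == 0 {
		return r, nil
	}
	n, err := buf.ReadByte()
	if err != nil {
		return result{}, err
	}
	r.Invocations = make([]byte, n)
	if got, _ := buf.Read(r.Invocations); got != int(n) {
		return result{}, errors.New("short read of invocations")
	}
	return r, nil
}

// roundTrip encodes several results back to back into one buffer and
// decodes them again.
func roundTrip(rs ...result) ([]result, error) {
	var buf bytes.Buffer
	for _, r := range rs {
		encode(&buf, r)
	}
	out := make([]result, 0, len(rs))
	for range rs {
		r, err := decode(&buf)
		if err != nil {
			return nil, err
		}
		out = append(out, r)
	}
	return out, nil
}

func main() {
	out, err := roundTrip(
		result{State: 1},
		result{State: 2, Invocations: []byte{7, 8}},
		result{State: 1},
	)
	fmt.Println(out, err)
}
```

Because the marker travels inside the state byte, records written without invocations keep their old byte layout, and a record followed by other encoded data still decodes correctly.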

3 participants