Investigate unrealistic base cost of DataReceiptCreationConfig #4482

Closed
Longarithm opened this issue Jul 8, 2021 · 9 comments · Fixed by #4647
Labels: A-contract-runtime (Area: contract compilation and execution, virtual machines, etc), T-core (Team: issues relevant to the core team)

Comments

@Longarithm
Member

In recent runtime-params-estimator runs, the DataReceiptCreationConfig base cost is 50 TGas, which is unrealistically high compared to the current cost of 4.5 TGas.

Examples:
#4231
#4455

data_receipt_creation_config: DataReceiptCreationConfig {
  base_cost: Fee {
      send_sir: 50745691155500,
      send_not_sir: 50745691155500,
      execution: 50745691155500,
  }
  ...
}
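
To put the raw number in perspective, here is a minimal sketch that converts the estimated fee to TGas and compares it with the current cost. The helper names are hypothetical, and the 4.5 TGas reference is simply the figure quoted above expressed in raw gas.

```rust
/// Number of gas units in one TGas (10^12).
const GAS_PER_TGAS: f64 = 1e12;

/// Convert a raw gas amount into TGas.
/// Hypothetical helper for illustration, not part of nearcore.
fn gas_to_tgas(gas: u64) -> f64 {
    gas as f64 / GAS_PER_TGAS
}

/// Ratio between an estimated fee and a reference fee, both in raw gas.
fn fee_ratio(estimated: u64, reference: u64) -> f64 {
    estimated as f64 / reference as f64
}
```

With `estimated = 50_745_691_155_500` and `reference = 4_500_000_000_000`, `gas_to_tgas` gives roughly 50.7 TGas and `fee_ratio` a bit over 11, so the new estimate is about an order of magnitude above the current fee.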

cc @bowenwang1996 @olonho

@Longarithm Longarithm added the A-contract-runtime Area: contract compilation and execution, virtual machines, etc label Jul 8, 2021
@bowenwang1996 bowenwang1996 added the T-core Team: issues relevant to the core team label Jul 8, 2021
@bowenwang1996
Collaborator

To be clear, the current cost of 4.5Tgas is also unrealistically high. See #3279 for a discussion on this.

@Longarithm
Member Author

Longarithm commented Jul 14, 2021

At first sight, the difference is explained by taking IO costs into account:
#3771 (comment) - 50 TGas
#3771 (comment) - 300 Ggas
My local computation in Docker confirms this.

But there is conflicting info here:
#3771 (comment) - 40 Ggas in both cases.
Need to investigate further.

cc #4474 @matklad

@Longarithm
Member Author

Data receipt fee estimation depends on its place in the vector of metrics: https://github.com/near/nearcore/blob/master/runtime/runtime-params-estimator/src/cases.rs#L536

Consider an example:

    let v = calls_helper! {
        data_receipt_base_10b_1000_1 => data_receipt_base_10b_1000,
        data_receipt_10b_1000_1 => data_receipt_10b_1000,
        data_receipt_100kib_1000_1 => data_receipt_100kib_1000,
        cpu_ram_soak_test => cpu_ram_soak_test,
        base_1M => base_1M,
        read_memory_10b_10k => read_memory_10b_10k,
        ...
        data_receipt_base_10b_1000_2 => data_receipt_base_10b_1000,
        data_receipt_10b_1000_2 => data_receipt_10b_1000,
        data_receipt_100kib_1000_2 => data_receipt_100kib_1000
    };

Computation of DataReceipt fees based on data_receipt_base_10b_1000_1, data_receipt_10b_1000_1, data_receipt_100kib_1000_1 yields 30 Ggas; same computation for data_receipt_*_2 fees yields 50 TGas.

Also, if we clean up the data by re-creating RuntimeTestbed before computing the data_receipt_*_2 fees, we get 30 Ggas again.

Supposed explanation:

  • initially storage is in optimal state A
  • we compute data_receipt_*_1 fees and get 30 Ggas
  • computation of some metric M moves storage to suboptimal state B
  • we compute data_receipt_*_2 fees and get 50 Tgas

I currently suspect M = storage_write_10kib_key_10b_value_1k.
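
The suspected mechanism can be illustrated with a toy model. This is only a sketch: `ToyTestbed` and its cost formula are invented for illustration and do not reflect how nearcore's `RuntimeTestbed` or its storage actually behave. The point is just that a measurement whose cost depends on leftover state becomes order-dependent when the testbed is shared.

```rust
/// Hypothetical stand-in for the estimator's testbed: it only tracks how
/// much data earlier measurements have left in storage.
struct ToyTestbed {
    stored_entries: u64,
}

impl ToyTestbed {
    fn new() -> Self {
        ToyTestbed { stored_entries: 0 }
    }

    /// Pretend to run a metric. The measured cost grows with the amount of
    /// data already in storage (an assumed cost model, not nearcore's), and
    /// the metric leaves `writes` new entries behind.
    fn measure(&mut self, writes: u64) -> u64 {
        let cost = 30 + self.stored_entries / 10;
        self.stored_entries += writes;
        cost
    }
}

/// Measure the same metric before and after a write-heavy metric M,
/// reusing one testbed for all three runs.
fn measure_shared() -> (u64, u64) {
    let mut testbed = ToyTestbed::new();
    let first = testbed.measure(1);
    testbed.measure(100_000); // write-heavy metric M
    let second = testbed.measure(1);
    (first, second)
}

/// Same sequence, but every metric gets a fresh testbed.
fn measure_isolated() -> (u64, u64) {
    let first = ToyTestbed::new().measure(1);
    ToyTestbed::new().measure(100_000); // M no longer affects later runs
    let second = ToyTestbed::new().measure(1);
    (first, second)
}
```

In this model `measure_shared()` returns two very different values for the same metric, while `measure_isolated()` returns the same value twice, which mirrors the observation that re-creating the testbed brings the fee back to 30 Ggas.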

It makes sense to use a separate RuntimeTestbed for each computation. But either way, we have to distinguish between the following two cases:

  • on mainnet, storage never reaches state B, thus 30 Ggas is the correct realistic price
  • on mainnet, there is a way to put storage into state B, in which case we risk DataReceipt prices being underestimated by a factor of 1000 for an unknown period of time.

To do so, I plan to use profiling tools to understand more deeply what exactly causes the difference.

cc @matklad @olonho @nearmax

@bowenwang1996
Collaborator

@Longarithm is it correct that data_receipt_creation_cost measures the cost of one storage write? If not, what exactly does it measure?

@Longarithm
Member Author

Longarithm commented Jul 28, 2021

There is also a curious dependency between the fees and the number of accounts.
In the notation of #4482 (comment):

+----------+----------------------+----------------------+
| Accounts | data_receipt_*_1 fee | data_receipt_*_2 fee |
+----------+----------------------+----------------------+
| 10K      | 37 Ggas              | 34 Ggas              |
| 20K      | 35 Ggas              | 50 Tgas              |
| 50K      | 40 Ggas              | 120 Tgas             |
| 100K     | 37 Ggas              | 258 Tgas             |
+----------+----------------------+----------------------+

UPD: added row for 100K accounts
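
To make the trend in the table easier to read, the sketch below computes the ratio of the second measurement to the first for each row. The struct and function names are hypothetical; the numbers are copied from the table, with Tgas values converted to Ggas (1 Tgas = 1000 Ggas).

```rust
/// One row of the table above, with both fees expressed in Ggas.
struct Row {
    accounts: u32,
    fee_1_ggas: f64,
    fee_2_ggas: f64,
}

/// The measurements from the table (Tgas values multiplied by 1000).
fn table() -> Vec<Row> {
    vec![
        Row { accounts: 10_000, fee_1_ggas: 37.0, fee_2_ggas: 34.0 },
        Row { accounts: 20_000, fee_1_ggas: 35.0, fee_2_ggas: 50_000.0 },
        Row { accounts: 50_000, fee_1_ggas: 40.0, fee_2_ggas: 120_000.0 },
        Row { accounts: 100_000, fee_1_ggas: 37.0, fee_2_ggas: 258_000.0 },
    ]
}

/// For each row, return (accounts, second fee / first fee).
fn slowdowns() -> Vec<(u32, f64)> {
    table()
        .iter()
        .map(|row| (row.accounts, row.fee_2_ggas / row.fee_1_ggas))
        .collect()
}
```

The first fee stays roughly flat across rows, while the ratio goes from below 1 at 10K accounts to several thousand at 100K accounts, so the discrepancy grows with state size rather than being a fixed offset.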

@Longarithm
Member Author

Longarithm commented Jul 28, 2021

@bowenwang1996
Base cost measures cost of DataReceipt creation and processing: https://github.com/near/nearcore/blob/master/core/primitives-core/src/runtime/fees.rs#L71-L85
Strictly speaking, it measures cost of:

Though I'm not entirely sure that all these operations are called ~1000 times, as we expect in the measurement.
It's possible that 1000 DataReceipts are processed before the ActionReceipt which collects them is processed, in which case we don't save postponed receipt ids:

ReceiptEnum::Action(ref action_receipt) => {

I need to double-check this. cc @olonho @matklad

Note that we don't consider bytes cost here, in which I see no discrepancies.

@bowenwang1996
Collaborator

@Longarithm is the base cost calculated based on some trivial data or through some statistical method? It looks like this involves at most 4 storage operations and it should not be more expensive than 4 storage writes of [some trivial data]. Also it seems to me that this fee depends on the shape of the trie, but we also have a separate touching_trie_node fee that accounts for this.

@MaksymZavershynskyi
Contributor

@Longarithm Longarithm linked a pull request Aug 6, 2021 that will close this issue
near-bulldozer bot pushed a commit that referenced this issue Aug 13, 2021
**Idea**
We currently reuse the same `RuntimeTestbed` for computing metrics in `calls_helper`. Presumably it saved some time (actually not a lot), and allowed us to skip initialization of testbed.

But this led to issues with fees computation: #4482 (comment)
Explanation: https://near.zulipchat.com/#narrow/stream/295306-dev-contract-runtime/topic/fees.20.26.20state.20size/near/248169740

So we need to create separate testbeds with different `/tmp/data` folders.

**Testing**
Check for discrepancies in resulting `RuntimeConfig`s
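
The idea of giving each metric its own data folder can be sketched as follows. This is an illustration only: the `testbed_dir` helper and the `data_<metric>` naming scheme are assumptions, not the layout used by the actual fix in #4647.

```rust
use std::path::PathBuf;

/// Build a distinct data directory for each metric so that no two
/// measurements share on-disk state. The naming scheme is hypothetical.
fn testbed_dir(base: &str, metric: &str) -> PathBuf {
    PathBuf::from(base).join(format!("data_{}", metric))
}
```

For example, `testbed_dir("/tmp", "data_receipt_10b_1000")` and `testbed_dir("/tmp", "base_1M")` point at different folders, so state written while measuring one metric cannot influence the other.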
@Longarithm
Member Author

Closing, because we have a decent explanation for the issue, and it was fixed in #4647.

near-bulldozer bot pushed a commit that referenced this issue Oct 8, 2021
Stabilize features lowering costs for new release:
* #4795
* #4865

Quality control:

* We run param estimator several times and got consistent results: 
  * beginning of Sep 2021, my GCP instance https://hackmd.io/w6ODyKjUReuuofXTuqdyFQ
  * end of Sep 2021, @matklad instance #4778 (comment)

* The current fee values are explained:
  * Data receipt costs - investigated in #4482; the cause was largely explained and the observed issue was fixed. Note that we don't know the exact root cause, but we assume it is related to a separate fee (touching_trie_node) that accounts for store size. This fee is problematic because it assumes a constant trie height, but we treat that as a separate problem.
  * Ecrecover cost - follow links from this one: #4778 (comment)