Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EIP argumenting for request ids #1

Closed
wants to merge 1 commit into from

Conversation

cburgdorf
Copy link
Owner

No description provided.

## Security Considerations
<!--All EIPs must contain a section that discusses the security implications/considerations relevant to the proposed change. Include information that might be important for security discussions, surfaces risks and can be used throughout the life cycle of the proposal. E.g. include security-relevant design decisions, concerns, important discussions, implementation-specific guidance and pitfalls, an outline of threats and risks and how they are being addressed. EIP submissions missing the "Security Considerations" section will be rejected. An EIP cannot proceed to status "Final" without a Security Considerations discussion deemed sufficient by the reviewers.-->

None?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe 😆

I'm also drawing a blank.

We're introducing a predictable part of the message, making known-plaintext attacks easier? ECIES (any encryption, really) should be resistant to that though.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand the known-plaintext attack scenario. We are talking about the risk of reduced security for encrypted parts of the contents such as a block signer in Görli?

Because, afaik, the commands in itself aren't encrypted (just compressed) anyway.


I mean, the template says any EIP without this sections is rejected so I probably just leave it in and see if someone points out something important during the process.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait, that's not right. The current communication is in fact already encrypted.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you should probably just ignore me here, I brought it up because it's the only thing I could think of, but I don't think this makes the encryption meaningfully easier to break.

## Test Cases
<!--Test cases for an implementation are mandatory for EIPs that are affecting consensus changes. Other EIPs can choose to include links to test cases if applicable.-->

TODO
Copy link

@lithp lithp Jan 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be left blank? This EIP does not make any consensus-critical changes

Copy link

@lithp lithp Jan 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I suppose it could include the encodings of some example messages?

(after editing the definition)

>>> from trinity.protocol.eth.commands import GET_NODE_DATA_STRUCTURE
>>> from eth.constants import BLANK_ROOT_HASH
>>> import rlp
>>> rlp.encode([1, [BLANK_ROOT_HASH]], sedes=GET_NODE_DATA_STRUCTURE).hex()
'e301e1a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421'

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Yeah, I probably just leave it out unless someone demands it later in the process 😅

@cburgdorf cburgdorf force-pushed the christoph/add-request-id-to-eth-protocol branch 6 times, most recently from ec128d5 to d7fcfd9 Compare January 20, 2020 18:27
* Doesn't need to be sequential (can be random)
* Does allow duplicates

## Rationale
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this Q&A section! Are these questions you've already gotten, or are you predicting that these are the kinds of questions people will ask?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of both. Notice that I have posted this EIP now in the real repo: ethereum#2481

But the QA section is inspired by other EIPs which do that in the rational section, I thought it's a nice fit (overall, I struggle with the pre-defined sections and their meaning. I feel like repeating myself but this QA style helped me at least for this section)


## Simple Summary
<!--"If you can't explain it simply, you don't understand it well enough." Provide a simplified and layman-accessible explanation of the EIP.-->
This document proposes a way to slightly increase the efficiency of the `eth` networking protocol while at the same time reducing complexity in Ethereum node implementations.
Copy link

@lithp lithp Jan 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is personal preference so definitely feel free to ignore me, but I don't think it's a slight improvement! This could let nodes sync much faster; I think "This document proposes a way to increase the efficiency [...]" would be more accurate and more compelling.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I have a hard time judging the actual benefits. But I like your suggestion, I'll take it. 👍


This can be particular tricky for responses that are ambigious such as empty responses.

Now consider we change both the `GetBlockHeaders` and the `BlockHeaders` command to include a `request_id`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a little more clear if instead of "now consider we change" if had "this EIP changes". This is line 48 and the EIP still hasn't concretely said what it's proposing except in the title!

* **Current:** `[blockHeader_0, blockHeader_1, ...]`
* **Then**: `[request_id: P, [blockHeader_0, blockHeader_1, ...]]`

The `request_id` is a simple integer set by the client when it makes the request. On the responding side, the exact same `request_id` from the incoming request is put back into the response object.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is me nitpicking but "simple integer" isn't a term which means anything, it would be slightly more clear if this said "an integer" or, maybe, "a 64-bit integer"

To elaborate, each command is altered in the following way:

1. Create a list with the `request_id` being the first element.
2. Make the second element the list that defines the whole command in the current scheme.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you decide to do it this way? Asking because I'm curious, not because I think it's wrong!

There's a small additional cost to representing and de-serializing lists, a GetBlockHeaders which looked like:

[request_id: P, block: {P, B_32}, maxHeaders: P, skip: P, reverse: P in {0, 1}]

would be slightly more efficient and I think easier to code.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question that I also had to ask myself. I agree that this would be slightly more efficient at the cost of feeling slightly less structured. I have a hard time judging between these efficiency and clarity here. Then I looked at les and noticed they went with this scheme so I thought the efficiency gains could be ignored while at the same time keeping the commands aligned with the les commands (not that this is an explicit goal here)

But it definitely feels like something that should go into an alternative proposals section.


The ``request_id`` has the following characteristics:

* 64 bit integer
Copy link

@lithp lithp Jan 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I measured a little under a year ago there were 284M state entries. Each GetNodeData request can hold up to 384 states, so at least 700k GetNodeData requests are fired in order to fast sync. Adding 8 bytes to each request means adding a total of 5MB. I started this by thinking a 64-bit int would be too big but... that sounds good to me!

Maybe this could become a note on overhead in the Q&A section?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great insights! There's also an alternative proposal that could be made where we would also define connection ids upon when a connection is made and then you can identify requests based on connection id + request id which should allow both of them to be reduced to 16 bit integers. This is an idea that Tomasz from the nethermind team brought up in the ACD channel.

But honestly, I don't have a good intuition which integer size would be big enough to not produce too many clashes while at same time being not too big to waste space. This is probably something that could be measured by counting the number of clashes of the last n generated random integer values with different sizes of integers.

Change the following message types in the `eth` protocol:

* `GetBlockHeaders (0x03)`
* **Current:** `[block: {P, B_32}, maxHeaders: P, skip: P, reverse: P in {0, 1}]`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "current" and "then", "eth/65" and "eth/66" would be more specific.


<!--You can leave these HTML comments in your merged EIP and delete the visible duplicate text guides, they will not appear and may be helpful to refer to if you edit it again. This is the suggested template for new EIPs. Note that an EIP number will be assigned by an editor. When opening a pull request to submit your EIP, please use an abbreviated title in the filename, `eip-draft_title_abbrev.md`. The title should be 44 characters or less.-->

## Simple Summary
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outside of the title, this EIP doesn't concretely propose adding request ids until line 48 or 74, depending on how you count. That probably belongs here in the summary! Something like:

This document proposes a way to slightly increase the efficiency of the eth networking protocol while at the same time reducing complexity in Ethereum node implementations. It does so by introducing a request id to all requests which their corresponding responses must include.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Fixing :)


## Motivation
<!--The motivation is critical for EIPs that want to change the Ethereum protocol. It should clearly explain why the existing protocol specification is inadequate to address the problem that the EIP solves. EIP submissions without sufficient motivation may be rejected outright.-->
The lack of request identifiers in the request / response paris of the `eth` protocol puts unnecessary burden of code complexity into every Ethereum client. It also makes the communication slightly less efficient. Another argument can be made that the addition of request identifiers makes the protocol more aligned with the `les` protocol which **does** already defines request identifiers for each request / response pair.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the place to mention that:

  • the extra complexity only occurs if you want to implement pipelining.
  • because of this complexity neither geth or parity have implemented pipelining
  • pipelining allows nodes to sync off each other faster, which is a likely result of this EIP being accepted.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! I'm not exactly sure if I understand the benefits beyond "We can make more parallel requests to node A for any given command e.g. GetBlockHeaders". That is mentioned in the EIP but I think you are talking about different benefits.

Btw you should feel free to add commits to the PR and add yourself as a co-author if you like :)

@cburgdorf cburgdorf force-pushed the christoph/add-request-id-to-eth-protocol branch from d7fcfd9 to 9eee53c Compare January 22, 2020 12:48
@cburgdorf cburgdorf force-pushed the christoph/add-request-id-to-eth-protocol branch from 9eee53c to 0cdf542 Compare January 22, 2020 12:49
@cburgdorf
Copy link
Owner Author

I'm closing this in favor for ethereum#2481

@cburgdorf cburgdorf closed this Feb 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants