Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize tls resumption #235

Conversation

3andne
Copy link
Contributor

@3andne 3andne commented Aug 21, 2023

Background

In the utls codebase, the management of session-related code has long suffered from a lack of coherence, resulting in various issues and bugs. The current challenges/bugs include:

  1. Coexistence of PSK and Session Ticket: Presently, both PSK and Session Ticket extensions can operate concurrently, leading to unintended behavior. Specifically, nonempty PSK and Session Ticket data can both be marshaled. The correct behavior should be that when PSK is valid, the content of the Session Ticket extension should remain empty. This issue emerges due to the independent initialization and configuration of PSK and Session Ticket in disparate places of the codebase.

  2. Unable to Disable Session Resumption: In TLS 1.2, configuring disableSessionTicket = true does not adequately deactivate session resumption. This is due to multiple code segments responsible for session reading/loading, with only the crypto/tls loadSession() function assessing the status of session ticket usage disablement. Notably, the initialization of the TLS 1.2 session ticket occurs without precondition verification.

  3. Inconsistent Session Override Interfaces: The interfaces for overriding sessions exhibit inconsistencies. While the long-standing method of using uconn.SetSessionState for session overrides exists, improper invocation timing or its application to altering or providing TLS 1.3 sessions can result in unpredictable session states.

  4. Missing Session ID: If a client hello doesn't use a session, its session ID won't be set. In contrast, browsers always use a 32-byte session ID under all circumstances.

  5. Duplicated Loading of Sessions: If no usable session is available when calling BuildHandshakeState(), the client hello is marshalled without the session ticket / psk. The client hello is then being used by handshake(). If a session becomes available during the handshake(), the client hello might use that session. However, the clientHello.raw isn't supposed to be changed at that point. This could lead to handshake failures.

  6. Redundant Session-Related Fields in Multiple Locations: There exists redundancy in session-related fields across several areas including uconn.HandshakeState.Hello, uconn.HandshakeState.State13, uconn.utls, SessionTicketExtension, and PreSharedKeyExtension. The recurrence of fields with the same semantics across different places introduces the potential for inconsistent states, thereby complicating comprehension and maintenance efforts.

  7. Incompatible with TLS 1.2 sessions: If the session is a tls 1.2 session, the PSK extension will throw an error. While the PSK extension has the capability to ignore the session and allows for TLS 1.2 resumption, this approach introduces challenges in ensuring proper initialization of the session ticket extension due to a lack of cohesion between the components.

The existence of these bugs shows us a technical debt present in the code. While they can all be fixed independently, it's only through refactoring and addressing the technical debt that we can prevent similar bugs from being introduced in the future, thus permanently resolving such issues.

Refactoring Design

  1. Introduce SessionController: Use this abstraction to manage all operations related to sessions.
  2. Utilize State Machines for State Management: Due to the complexity of session-related states, relying on simple boolean conditions like session == nil cannot adequately describe all possibilities. In the refactoring process, I've employed state machines to ensure proper handling of each state.
  3. Redundant Assertions, Not Data: The previous session design was prone to generating undefined states. Thanks to the refactored state machine, we can use "verbose" assertions to ensure we don't enter unintended states. I've also cleaned up all data related to sessions, ensuring that they are only stored in one place each. Eliminating data redundancy can prevent hard-to-debug state discrepancy issues.
  4. Explicit Session Operations: Instead of splitting the loading session, setting session, and update binder processes in ApplyConfig and MarshalClientHello, I've centralized them in two new functions, uLoadSession and uApplyPatch, and explicitly called conn.loadSession() at most once.
  5. Optimize PreSharedKeyExtension Interface: To make Psk compatible with the refactored session setup process, I've redesigned the PreSharedKeyExtension interface. The main idea behind this redesign is dependency injection. PreSharedKeyExtension shouldn't fetch sessions on its own; instead, uConn should inject the session to ensure precise control over the session's lifecycle.
  6. Optimize PreSharedKeyExtension Code: Currently, both PskExt implementations contain a considerable amount of duplicate code. Code redundancy makes maintenance difficult. For example, during refactoring, I found an error in the implementation of the Len() function in FakePreSharedKeyExtension. The code in UtlsPreSharedKeyExtension is almost identical but doesn't have that error.
  7. Minimize Changes to crypto/tls Code, Reuse Functions Whenever Possible: This approach will make future maintenance more convenient by preserving compatibility with the existing crypto/tls codebase.
  8. Allow session tickets to be set through SetSessionTicketExtension and PSK to be set through SetPskExtension.

The illustrated workflow

  • before
    stateDiagram-v2
      D: clientHandshake
      A: applyPresetByID()
      B: ApplyConfig()
      C: MarshalClientHello()
      E: [loadSession()]
      [*] --> D
      note right of D
             sessions are loaded independently by multiple components.
             lack of centralized session management.
      end note
      state D {
          [*] --> A
          state A {
              A1: SessionTicketExtension<br/>[custom load session #1]
              [*] --> A1
          }
          A --> B
          state B {
              B1: [loadSession()]<br/>partially setting tls1.3 session to hello
              B2: [custom load session #2]<br/>partially setting tls1.3 session to hello
              [*] --> B1: UtlsPreSharedKeyExtension
              [*] --> B2: FakePreSharedKeyExtension
          }
          B --> C
          state C {
              C1: marshal hello to bytes,<br/>[ReadWithRawHello()]<br/>write binders to bytes,<br/>set binders to hello
              C2: [bufferedWriter.ReadFrom(ext)]
              [*] --> C1: PreSharedKeyExtension
              [*] --> C2: Not PreSharedKeyExtension
          }
          C --> E
      }
    
    Loading
  • after
    stateDiagram-v2
     D: [clientHandshake()]
     A: [applyPresetByID()]
     B: [ApplyConfig()]
     C: [MarshalClientHello()]
     E: [loadSession()]
     F: [uLoadSession()]
     G: [uApplyPatch()]
     H: restore session from hello and state13
     [*] --> D
     state D {
         [*] --> A
         A --> B
         B --> F
         note left of F
             Either Session ticket or Psk is set.
             Session is loaded solely by loadSession().
             loadSession() will be called at most ONCE.            
         end note
         state F {
             F1: precondition checks
             F2: [loadSession()]
             F3: initialize and set session ticket
             F4: [InitializeByUtls()]
             F5: [setPsk()]
             [*] --> F1
             F1 --> F2: should load session
             F1 --> F3: user-provided tls1.2 session
             F1 --> F5: user-initialized psk
             F2 --> F3: tls 1.2 session
             F2 --> F4: tls 1.3 session
         }
         F --> C
         C --> G
         state G {
             G1: update binders if necessary,<br/>[PatchBuiltHello()]<br/>[setPsk()]
             [*] --> G1
         }
         G --> E: if not called by [uLoadSession()]<br/>HelloGolang
         G --> H: if already being called
         H --> [*]
         E --> [*]
     }
    
    Loading

State machines

Session Controller State

stateDiagram-v2
    A: NoSession
    B: TicketInitialized
    C: TicketAllSet
    D: PskExtInitialized
    E: PskAllSet
    F: locked
    
    [*] --> A
    A --> B: SetSessionState12<br/>called by user
    A --> B: uLoadSession<br/>returns tls 1.2 session
    B --> C: setSessionTicketToUConn()
    C --> F: finalCheck()
    A --> F: finalCheck()
    A --> D: SetSessionState13<br/>called by user
    A --> D: uLoadSession<br/>returns tls 1.3 session
    D --> E: setPskToHandshake()
    E --> F: finalCheck()
Loading

LoadSessionTrackerState

stateDiagram-v2
    A: NeverCalled
    B: UtlsAboutToCall
    C: CalledByULoadSession
    D: CalledByGoTLS
    
    [*] --> A
    A --> B: utlsAboutToLoadSession()<br/>called by uLoadSession
    B --> C: onLoadSessionReturn()
    C --> [*]
    A --> D: clientHandshake()<br/>onLoadSessionReturn()
    D --> [*]
Loading

u_parrots.go Outdated Show resolved Hide resolved
u_parrots.go Outdated Show resolved Hide resolved
@gaukas
Copy link
Contributor

gaukas commented Aug 22, 2023

Overall this looks legit!

We may not want to break any of the existing interfaces unless we have to. So please consider introducing new interfaces while setting default parameter for the existing interfaces where new required parameters are missing.

feat: add option `OmitEmptyPsk` and throw error on empty psk by default
feat: revert changes to public interfaces
@3andne
Copy link
Contributor Author

3andne commented Aug 27, 2023

New option OmitEmptyPsk introduced (false by default).
see:

        // OmitEmptyPsk determines whether utls will automatically conceal
	// the psk extension when it is empty. When the psk extension is empty, the
	// browser omits it from the client hello. Utls can mimic this behavior,
	// but it deviates from the provided client hello specification, rendering
	// it unsuitable as the default behavior. Users have the option to enable
	// this behavior at their own discretion.
	OmitEmptyPsk bool // [uTLS]

Copy link
Contributor

@gaukas gaukas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gaukas gaukas merged commit 894e4c5 into refraction-networking:add-tls-psk-support Aug 27, 2023
gaukas added a commit that referenced this pull request Aug 27, 2023
* uTLS: X25519Kyber768Draft00 hybrid post-quantum key agreement by cloudflare/go (#222)

* crypto/tls: Add hybrid post-quantum key agreement  (#13)

* import: client-side KEM from cloudflare/go

* import: server-side KEM from cloudflare/go

* fix: modify test to get rid of CFEvents.

Note: uTLS does not promise any server-side functionality, and this change is made to be able to conduct unit tests which requires both side to be able to handle KEM Curves.

Co-authored-by: Christopher Wood <caw@heapingbits.net>
Co-Authored-By: Bas Westerbaan <bas@westerbaan.name>

----

Based on:

* crypto/tls: Add hybrid post-quantum key agreement 

Adds X25519Kyber512Draft00, X25519Kyber768Draft00, and
P256Kyber768Draft00 hybrid post-quantum key agreements with temporary
group identifiers.

The hybrid post-quantum key exchanges uses plain X{25519,448} instead
of HPKE, which we assume will be more likely to be adopted. The order
is chosen to match CECPQ2.

Not enabled by default.

Adds CFEvents to detect `HelloRetryRequest`s and to signal which
key agreement was used.

Co-authored-by: Christopher Wood <caw@heapingbits.net>

 [bas, 1.20.1: also adds P256Kyber768Draft00]
 [pwu, 1.20.4: updated circl to v1.3.3, moved code to cfevent.go]

* crypto: add support for CIRCL signature schemes

* only partially port the commit from cloudflare/go. We would stick to the official x509 at the cost of incompatibility.

Co-Authored-By: Bas Westerbaan <bas@westerbaan.name>
Co-Authored-By: Christopher Patton <3453007+cjpatton@users.noreply.github.com>
Co-Authored-By: Peter Wu <peter@lekensteyn.nl>

* crypto/tls: add new X25519Kyber768Draft00 code point

Ported from cloudflare/go to support the upcoming new post-quantum keyshare.

----

* Point tls.X25519Kyber768Draft00 to the new 0x6399 identifier while the
  old 0xfe31 identifier is available as tls.X25519Kyber768Draft00Old.
* Make sure that the kem.PrivateKey can always be mapped to the CurveID
  that was linked to it. This is needed since we now have two ID
  aliasing to the same scheme, and clients need to be able to detect
  whether the key share presented by the server actually matches the key
  share that the client originally sent.
* Update tests, add the new identifier and remove unnecessary code.

Link: https://mailarchive.ietf.org/arch/msg/tls/HAWpNpgptl--UZNSYuvsjB-Pc2k/
Link: https://datatracker.ietf.org/doc/draft-tls-westerbaan-xyber768d00/02/
Co-Authored-By: Peter Wu <peter@lekensteyn.nl>
Co-Authored-By: Bas Westerbaan <bas@westerbaan.name>

---------

Co-authored-by: Bas Westerbaan <bas@westerbaan.name>
Co-authored-by: Christopher Patton <3453007+cjpatton@users.noreply.github.com>
Co-authored-by: Peter Wu <peter@lekensteyn.nl>

* new: enable PQ parrots (#225)

* Redesign KeySharesEcdheParameters into KeySharesParameters which supports multiple types of keys.

* Optimize program logic to prevent using unwanted keys

* new: more parrots and safety update (#227)

* new: PQ and other parrots

Add new preset parrots:
- HelloChrome_114_Padding_PSK_Shuf
- HelloChrome_115_PQ
- HelloChrome_115_PQ_PSK

* new: ShuffleChromeTLSExtensions

Implement a new function `ShuffleChromeTLSExtensions(exts []TLSExtension) []TLSExtension`.

* update: include psk parameter for parrot-related functions

Update following functions' prototype to accept an optional pskExtension (of type *FakePreSharedKeyExtension):
- `UClient(conn net.Conn, config *Config, clientHelloID ClientHelloID)` => `UClient(conn net.Conn, config *Config, clientHelloID ClientHelloID, pskExtension ...*FakePreSharedKeyExtension)`
- `UTLSIdToSpec(id ClientHelloID)` => `UTLSIdToSpec(id ClientHelloID, pskExtension ...*FakePreSharedKeyExtension)`

* new: pre-defined error from UTLSIdToSpec

Update UTLSIdToSpec to return more comprehensive errors by pre-defining them, allowing easier error comparing/unwrapping.

* new: UtlsPreSharedKeyExtension

In `u_pre_shared_key.go`, create `PreSharedKeyExtension` as an interface, with 3 implementations:
- `UtlsPreSharedKeyExtension` implements full support for `pre_shared_key` less resuming after seeing HRR.
- `FakePreSharedKeyExtension` uses CipherSuiteID, SessionSecret and Identities to calculate the corresponding binders and send them, without setting the internal states. Therefore if the server accepts the PSK and tries to resume, the connection fails.
- `HardcodedPreSharedKeyExtension` allows user to hardcode Identities and Binders to be sent in the extension without setting the internal states. Therefore if the server accepts the PSK and tries to resume, the connection fails.

TODO: Only one of FakePreSharedKeyExtension and HardcodedPreSharedKeyExtension should be kept, the other one should be just removed. We still need to learn more of the safety of hardcoding both Identities and Binders without recalculating the latter.

* update: PSK minor changes and example

* Updates PSK implementations for more comprehensible interfaces when applying preset/json/raw fingerprints.
* Revert FakePreSharedKeyExtension to the old implementation. Add binder size checking.
* Implement TLS-PSK example

New bug: setting `tls.Config.ClientSessionCache` will cause PSK to fail. Currently users must set only `tls.UtlsPreSharedKeyExtension.ClientSessionCacheOverride`.

* fix: PSK failing if config session cache set

* Fix a bug causing PSK to fail if Config.ClientSessionCache is set.
* Removed `ClientSessionCacheOverride` from `UtlsPreSharedKeyExtension`. Set the `ClientSessionCache` in `Config`!

Co-Authored-By: zeeker999 <13848632+zeeker999@users.noreply.github.com>

* Optimize tls resumption (#235)

* feat: bug fix and refactor

* feat: improve example docs: add detailed explanation about the design feat: add assertion on uApplyPatch

* fix: address comments
feat: add option `OmitEmptyPsk` and throw error on empty psk by default
feat: revert changes to public interfaces

* fix: weird residue caused by merging conflict

* fix: remove merge conflict residue code

---------

Co-authored-by: Bas Westerbaan <bas@westerbaan.name>
Co-authored-by: Christopher Patton <3453007+cjpatton@users.noreply.github.com>
Co-authored-by: Peter Wu <peter@lekensteyn.nl>
Co-authored-by: zeeker999 <13848632+zeeker999@users.noreply.github.com>
Co-authored-by: 3andne <52860475+3andne@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants