python: Next-gen dazl connection API #108

Merged
da-tanabe merged 3 commits into master from python-conn-api
Mar 18, 2021

Contributor

@da-tanabe da-tanabe commented Aug 6, 2020

dazl v8 API

Introduces a new API that embraces multi-party subscriptions, and modernizes and simplifies the implementation.

async def main(token):
    async with dazl.connect(url, token=token) as conn:
        await conn.create("Some:Thing", { ... })
        async with conn.stream("Some:Thing") as stream:
            async for event in stream:
                print(event)
                if i_feel_like_it:
                    break
                elif event.cdata["name"] == "Batman":
                    await buy_batmobile(event)

Fixes

dazl turns four years old in May 2021, and many of the design decisions that it was originally built with (even predating the Daml gRPC Ledger API as introduced in October 2018) no longer apply:

No longer consolidates to a single transaction stream per Party

  • SubmitAndWait: older versions of the Ledger API only had a CommandSubmissionService.Submit call, which required clients to listen to TransactionStream events in order to know whether a command succeeded. Establishing independent transaction streams per command submission is obviously an unscalable approach, so dazl instead listens to the transaction stream and doles out completion events internally.
  • ActiveContractSetService didn't originally exist, so dazl has a tradition of reading from the transaction stream, even when it doesn't necessarily need to. This was partially remediated in python: Make initial data reads come from the ACS #10/python: Whoops—actually turn on the ACS fetching for real. #16, but dazl is still very much predicated on reading from the transaction stream, and incorporation of the ActiveContractSetService is somewhat incomplete. By virtue of the single-stream-per-Party design, the ACS is also similarly one-per-Party.
  • The introduction of contract keys removes the need for applications to maintain a lot of in-memory state. In practice, modern Daml applications don't need to hold on to the ACS in memory any more, but dazl generally makes applications pay for maintaining an ACS nonetheless.
  • Exercise result values have been sent over the Ledger API for a while, but since command submission in dazl is only asynchronous, correlating exercise results back to call sites was very challenging.
  • Most importantly, multi-party submissions make dazl's single transaction-stream-per-Party model unworkable.
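
The bookkeeping described in the SubmitAndWait bullet can be pictured roughly as follows. This is a minimal sketch rather than dazl's actual implementation: `CompletionDispatcher`, `register`, and `on_completion` are hypothetical names, and the shared stream is simulated by direct calls.

```python
from __future__ import annotations

import asyncio


class CompletionDispatcher:
    """Hypothetical illustration: one shared stream, many waiting submitters."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def register(self, command_id: str) -> asyncio.Future:
        # Each submission parks a future keyed by its command ID.
        fut = asyncio.get_running_loop().create_future()
        self._pending[command_id] = fut
        return fut

    def on_completion(self, command_id: str, succeeded: bool) -> None:
        # Called for every completion read off the single shared stream.
        fut = self._pending.pop(command_id, None)
        if fut is not None and not fut.done():
            fut.set_result(succeeded)


async def demo() -> list[bool]:
    dispatcher = CompletionDispatcher()
    waiters = [dispatcher.register(cid) for cid in ("cmd-1", "cmd-2")]
    # Simulate the shared stream delivering completions out of order.
    dispatcher.on_completion("cmd-2", False)
    dispatcher.on_completion("cmd-1", True)
    return list(await asyncio.gather(*waiters))


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Every submitter depends on the one shared stream, which is why correlating results back to call sites gets awkward once a command can involve several parties.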

gRPC and asyncio

  • The gRPC library for Python has recently incorporated asyncio support (L58: Async API for gRPC Python grpc/proposal#155)! This removes the need for the complicated internals in dazl devoted to trying to keep Python thread count under control. Every transaction stream subscription required at least three Python threads to maintain, and this was another reason that dazl tried to "conserve" transaction subscriptions.
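
To illustrate why asyncio removes the pressure to "conserve" subscriptions, here is a minimal sketch using only the standard library. It does not use gRPC at all: `fake_stream` is a hypothetical stand-in for a ledger subscription, and the point is only that many concurrent streams can be consumed without adding Python threads.

```python
import asyncio
import threading


async def fake_stream(name: str, count: int):
    # Stand-in for a ledger subscription; yields a few events cooperatively.
    for i in range(count):
        await asyncio.sleep(0)
        yield f"{name}-{i}"


async def consume(name: str, count: int) -> list:
    return [event async for event in fake_stream(name, count)]


async def run_subscriptions(n: int):
    threads_before = threading.active_count()
    # All n "subscriptions" are interleaved on the same event loop thread.
    results = await asyncio.gather(*(consume(f"party{i}", 3) for i in range(n)))
    threads_after = threading.active_count()
    return results, threads_before, threads_after


if __name__ == "__main__":
    results, before, after = asyncio.run(run_subscriptions(10))
    print(len(results), before, after)
```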

HTTP JSON API support/TypeScript library parity

  • The HTTP JSON API exposes a subset of the gRPC Ledger API that supports almost all of what dazl currently exposes as a library, and it should be possible for an application written against dazl to talk to either API.
  • Authenticated ledger support is not well-documented, and the current design does not support token refreshing at all.
  • The TypeScript bindings expose a very similar API to dazl's; with some minor symbol renames, the APIs are effectively identical.

This does not yet include the compatibility layer to help ease the transition from the old API to the new one.

Some new low-level utility libraries have been added: see #172, #175, #176, #179

Due to the size of the change, some long-deprecated APIs were dropped as a prerequisite to this work: #155, #159

@da-tanabe da-tanabe force-pushed the python-conn-api branch 5 times, most recently from ce68dbe to 691a578 Compare February 10, 2021 21:07
@da-tanabe da-tanabe force-pushed the python-conn-api branch 5 times, most recently from ab2fec6 to 8d86132 Compare February 11, 2021 03:45
@da-tanabe da-tanabe force-pushed the python-conn-api branch 3 times, most recently from 3f93460 to 076737d Compare March 11, 2021 14:42
@da-tanabe da-tanabe force-pushed the python-conn-api branch 5 times, most recently from 631e509 to 4a439b4 Compare March 13, 2021 03:29
@da-tanabe da-tanabe force-pushed the python-conn-api branch 3 times, most recently from b11f6b2 to e5ee640 Compare March 17, 2021 14:23
@da-tanabe da-tanabe force-pushed the python-conn-api branch 3 times, most recently from 41c1cde to 2dcc5a6 Compare March 17, 2021 18:07
@da-tanabe da-tanabe force-pushed the python-conn-api branch 3 times, most recently from e46714c to cb97132 Compare March 17, 2021 18:25
@da-tanabe da-tanabe changed the title python: A potential connection API revision. python: Next-gen dazl connection API Mar 17, 2021
@da-tanabe da-tanabe force-pushed the python-conn-api branch 2 times, most recently from 150fab6 to 41b30b3 Compare March 17, 2021 20:05
@da-tanabe da-tanabe marked this pull request as ready for review March 17, 2021 20:14
Collaborator

@mschaef-da mschaef-da left a comment

A couple minor comments, but generally this looks very good and I am looking forward to using it.

Given the scope of this change, my suggestion (which may have actually come from you yourself) is to release this as a beta v8, let us get some time with it over the next couple of weeks, and then cut the official v8 once we're sure it works the way we want.

python/dazl/ledger/grpc/conn_aio.py
"""
if not self._config.access.ledger_id:
    # most calls require a ledger ID; if it wasn't supplied as part of our token or we were
    # never given a token in the first place, fetch the ledger ID from the destination
Collaborator

It may be worth letting the user know that this discovery is occurring.

Contributor Author

Ya good point—will add some INFO logs here.

if isinstance(self._config.access, PropertyBasedAccessConfig):
    self._config.access.ledger_id = response.ledger_id
else:
    raise ValueError("token-based access must supply tokens that provide ledger ID")
Collaborator

This error is not strictly true, is it? (The immediately preceding lines of code are a mechanism by which the access token is NOT required to have a ledger ID.)

Contributor Author

I rephrased this to perhaps make it clearer, but the story is nuanced:

  • For property-based access on the gRPC Ledger API, the ledger ID can be queried. This is how dazl works today, and why you don't really ever need to specify ledger IDs.
  • For property-based access on the HTTP JSON API, dazl mints its own unsigned tokens, because even in an un-authed configuration, clients still identify themselves to the server by using tokens. Here, a ledger ID cannot be inferred, so it must be specified.
  • For token-based access on either gRPC Ledger API or HTTP JSON API, the ledger ID is "part of the spec", so it's required for the token to be considered valid. (And because the token is generally signed, the client can't modify specific claims in the token, particularly ledger ID).
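
The three cases can be summarized as a small decision function. This is a sketch under stated assumptions, not the code in this PR: `resolve_ledger_id` and its string labels for access mode and transport are hypothetical, and only the final error message is taken from the diff above.

```python
from typing import Optional


def resolve_ledger_id(
    access: str,                    # "property" or "token" (hypothetical labels)
    transport: str,                 # "grpc" or "http-json" (hypothetical labels)
    configured: Optional[str],      # ledger ID from config or from token claims
    queried: Optional[str] = None,  # ledger ID reported by the server, if asked
) -> str:
    if configured:
        # An explicitly supplied ledger ID always wins.
        return configured
    if access == "property" and transport == "grpc":
        # Case 1: the gRPC Ledger API can be asked for its ledger ID.
        if queried is None:
            raise ValueError("the server did not report a ledger ID")
        return queried
    if access == "property":
        # Case 2: nothing to infer from over the HTTP JSON API.
        raise ValueError("the HTTP JSON API cannot infer a ledger ID; specify one")
    # Case 3: a signed token cannot be amended by the client.
    raise ValueError("token-based access must supply tokens that provide ledger ID")


if __name__ == "__main__":
    print(resolve_ledger_id("property", "grpc", None, queried="sandbox"))
```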

Comment on lines 283 to 290
if workflow_id:
    # TODO: workflow_id must be a LedgerString; we could enforce some minimal validation
    # here to make for a more obvious error than failing on the server-side
    return workflow_id
Collaborator

It seems like bad things might happen if not workflow_id... does this just turn into a Nothing?
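
For reference, the client-side check hinted at in the TODO could look something like the sketch below. `validate_workflow_id` is a hypothetical helper, the character set and 255-character limit are my recollection of the Ledger API's LedgerString rules and should be checked against the API documentation, and returning None for a missing value is an assumption rather than what this diff does.

```python
import re
from typing import Optional

# Assumption: a LedgerString is 1 to 255 characters drawn from this set.
_LEDGER_STRING = re.compile(r"[A-Za-z0-9#:\-_/ ]{1,255}")


def validate_workflow_id(workflow_id: Optional[str]) -> Optional[str]:
    if not workflow_id:
        # Empty or missing: return None explicitly so the caller can omit the field.
        return None
    if _LEDGER_STRING.fullmatch(workflow_id) is None:
        raise ValueError(f"workflow_id is not a valid LedgerString: {workflow_id!r}")
    return workflow_id


if __name__ == "__main__":
    print(validate_workflow_id("my-workflow_1"))
```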


try:
    offset = None
    async for event in self._acs_events(tx_filter_pb):
Collaborator

Noice.

# now start returning events as they come off the transaction stream; note this
# stream will never naturally close, so it's on the caller to call close() or to
# otherwise exit our current context
async for event in self._tx_events(tx_filter_pb, offset):
Collaborator

What does this look like in the event of a failure? If the input stream goes down, I'm assuming an exception is thrown that the caller can catch and recover, etc?

Contributor Author

Yup exactly! This is one of the things I'm most excited about with respect to the new API: every interaction with the ledger provides an obvious place to wrap with try/except, whereas the more callback-heavy approach made this much more difficult, if not impossible.
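
As a rough picture of what that looks like from the caller's side, here is a self-contained sketch. It does not use dazl: `StreamDown` and `flaky_stream` are hypothetical stand-ins for a dropped connection and a ledger stream, and the retry policy is only one possible choice.

```python
import asyncio


class StreamDown(Exception):
    """Hypothetical error standing in for a dropped connection."""


async def flaky_stream(start: int, fail_at: int):
    # Yields integers from `start` up to 5; fails when it reaches `fail_at`.
    for i in range(start, 6):
        if i == fail_at:
            raise StreamDown(f"lost connection before event {i}")
        yield i


async def consume_with_retry() -> list:
    seen = []
    fail_at = 3  # the first attempt breaks at event 3
    while True:
        try:
            # The failure surfaces right here, at the point of iteration.
            async for event in flaky_stream(len(seen), fail_at):
                seen.append(event)
            return seen
        except StreamDown:
            fail_at = -1  # in this demo the retry is allowed to succeed
            await asyncio.sleep(0)


if __name__ == "__main__":
    print(asyncio.run(consume_with_retry()))
```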

tx_filter_pb = G_TransactionFilter(filters_by_party=filters_by_party)

try:
    offset = None
Collaborator

It may be worth a comment that _acs_events always emits a boundary with an offset.

Contributor Author

Good point—shall add.

warnings.warn(f"Received an unknown event: {event}", ProtocolWarning)
yield event

if self._continue_stream:
Collaborator

A couple things:

  • If offset is None here, then something is wrong, no? Does Python have an assert thing that can enforce this invariant?
  • If the ledger API doesn't guarantee to always be able to produce archived contracts, how long is offset guaranteed to be a valid input for _tx_events? I'm assuming long enough that we don't have to worry here, but not sure.

Contributor Author

On the first point, not necessarily—offset is None means that you're connecting to an empty ledger and are interested in hearing create/archive events going forward.

On the second point, I'm not sure…but there isn't much we can do about it from a client-side perspective.
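
The hand-off being discussed (read the ACS, remember the boundary offset, then continue from the transaction stream) can be sketched as below. This is an illustration only: `acs_events`, `tx_events`, and the tuple-shaped events are hypothetical simplifications of the private methods in the diff, and how a None offset is interpreted is an assumption based on the explanation above.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple

Event = Tuple[str, Optional[str]]  # (kind, payload): hypothetical event shape


async def acs_events(contracts: List[str]) -> AsyncIterator[Event]:
    for cid in contracts:
        yield ("create", cid)
    # Always end with a boundary; in this sketch its offset is None on an empty ledger.
    yield ("boundary", f"offset-{len(contracts)}" if contracts else None)


async def tx_events(offset: Optional[str], live: List[str]) -> AsyncIterator[Event]:
    # Assumption: a None offset means there is no history to skip.
    for cid in live:
        yield ("create", cid)


async def stream(contracts: List[str], live: List[str]) -> List[Event]:
    out: List[Event] = []
    offset = None
    async for kind, payload in acs_events(contracts):
        if kind == "boundary":
            offset = payload  # where the transaction stream should resume
        out.append((kind, payload))
    async for event in tx_events(offset, live):
        out.append(event)
    return out


if __name__ == "__main__":
    print(asyncio.run(stream(["a", "b"], ["c"])))
```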

Comment on lines +89 to +91
``dazl`` has never had a straightforward way of abandoning event callbacks that were no longer
needed. The new API makes stream lifecycle more explicit and the responsibility of the user of the
library. Disposing of streams is now simpler to reason about.
Collaborator

Nice.
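
A small sketch of what "explicit stream lifecycle" means in practice, using only the standard library. `DemoStream` is a hypothetical class, not dazl's stream type. It shows that leaving the `async with` block, including via `break`, is what disposes of the stream.

```python
import asyncio


class DemoStream:
    """Hypothetical stream whose lifetime is bounded by `async with`."""

    def __init__(self) -> None:
        self.closed = False
        self._next = 0

    async def __aenter__(self) -> "DemoStream":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit, on `break`, and when an exception propagates.
        self.closed = True

    def __aiter__(self) -> "DemoStream":
        return self

    async def __anext__(self) -> int:
        if self.closed:
            raise StopAsyncIteration
        await asyncio.sleep(0)
        self._next += 1
        return self._next


async def take_until(limit: int):
    stream = DemoStream()
    seen = []
    async with stream:
        async for event in stream:
            seen.append(event)
            if event >= limit:
                break  # leaving the block is what disposes of the stream
    return seen, stream.closed


if __name__ == "__main__":
    print(asyncio.run(take_until(3)))
```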

@da-tanabe da-tanabe merged commit 66be1a9 into master Mar 18, 2021
@da-tanabe da-tanabe deleted the python-conn-api branch March 18, 2021 13:53