
Clean up async code #631

Open: wants to merge 26 commits into main

Conversation

@fyellin (Contributor) commented Nov 1, 2024

Asynchronous code in _api needed to know too much about the internal workings of the AwaitHandler. Changed it so that there's a little bit more separation.

Also made some asynchronous functions really be asynchronous.

Added on_submitted_work_done.

Commented out the implementation of create_xxx_pipeline_async, because these are unimplemented in wgpu-native. The code is still there, but a constant forces the synchronous implementation.

Added tests. Added dependency on pytest-asyncio to run those tests.
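
For illustration, a test enabled by pytest-asyncio could look roughly like this (a sketch, not one of the PR's actual tests; the request_adapter_async/request_device_async names are assumptions here):

import pytest
import wgpu

@pytest.mark.asyncio
async def test_async_api_smoke():
    # The test itself is a coroutine; pytest-asyncio runs it on an asyncio event loop.
    adapter = await wgpu.gpu.request_adapter_async(power_preference="high-performance")
    device = await adapter.request_device_async()
    assert device is not None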

@fyellin fyellin marked this pull request as ready for review November 1, 2024 23:58
@fyellin fyellin requested a review from Korijn as a code owner November 1, 2024 23:58
@Korijn (Collaborator) commented Nov 2, 2024

Very nice improvements!

@fyellin (Contributor, Author) commented Nov 3, 2024

While I'm cleaning up: why do both enumerate_adapters_sync and enumerate_adapters_async exist? There seems to be no value added by the async version. Can these just be merged into `enumerate_adapters`?

When I look at the Rust documentation, there seems to be only a single synchronous operation.

@almarklein (Member) left a comment

Nice work, I like the set_error() and set_result(). I think that crossed my mind when I worked on this, but I can't recall why I went with that weird result dict.
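
For context, a minimal sketch of the pattern being discussed (simplified and hypothetical, not the PR's actual WgpuAwaitable) could look like this:

class AwaitableResult:
    # Holds the outcome of an async operation; set_result()/set_error() replace a result dict.
    def __init__(self):
        self._value = None
        self._error = None
        self._done = False

    def set_result(self, value):
        self._value = value
        self._done = True

    def set_error(self, error):
        self._error = error
        self._done = True

    def __await__(self):
        while not self._done:
            yield None  # hand control back to the event loop (asyncio-specific, see further down)
        if self._error is not None:
            raise self._error
        return self._value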

@almarklein (Member)

Why do both enumerate_adapters_sync and enumerate_adapters_async exist?

Good question. The reason is that, in order to implement this method in JS, we will probably have to call request_adapter with different power settings, and request_adapter is async.
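
To illustrate (a hypothetical sketch, not actual wgpu-py code; request_adapter_async and its power_preference argument are assumptions), an async enumeration on a JS backend could be emulated roughly like this:

async def enumerate_adapters_async(gpu):
    # There is no synchronous way to list adapters in JS, so request one adapter
    # per power preference and keep the unique results.
    adapters = []
    for power_preference in ("high-performance", "low-power"):
        adapter = await gpu.request_adapter_async(power_preference=power_preference)
        if adapter is not None and adapter not in adapters:
            adapters.append(adapter)
    return adapters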

@fyellin (Contributor, Author) commented Nov 4, 2024 via email

@almarklein (Member)

I was going to ask about anyio.

For testing it definitely makes sense. For implementation, only if it remains somewhat contained and an implementation detail.

There are a lot of changes coming in how wgpu-native implements futures, and I'm also not sure what we'd need to do to support this running in JS. My point is that this code will evolve quite a bit in the coming months, and I don't want to lock us into a dependency.

@fyellin (Contributor, Author) commented Nov 5, 2024 via email

@fyellin (Contributor, Author) commented Nov 5, 2024

I need some advice on how to finish this. I'm temporarily using anyio, which can deal with both trio and asyncio and it's mostly working.

The problem I'm running into is on the build. I've included "anyio" in pyproject.toml, and the build and tests run fine. But when the wheel is being built, it doesn't include anyio. How do I tell the wheel-builder to include this?

Just as an aside, I went to a clean environment and did "pip install wgpu". I was surprised to see pycparser included. I can't find where that dependency is coming from.

@Korijn (Collaborator) commented Nov 5, 2024

The problem I'm running into is on the build. I've included "anyio" in pyproject.toml, and the build and tests run fine. But when the wheel is being built, it doesn't include anyio. How do I tell the wheel-builder to include this?

Wheels don't ship their dependencies, they're only listed in their metadata files. When you build a wheel, and then install that wheel into a new environment, you should see anyio being installed.
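
As a quick sanity check (a sketch; the dist/wgpu-*.whl path is an assumption), you can confirm that a built wheel declares anyio by inspecting its Requires-Dist metadata rather than its file contents:

from glob import glob
from zipfile import ZipFile

wheel_path = sorted(glob("dist/wgpu-*.whl"))[-1]
with ZipFile(wheel_path) as wheel:
    metadata_name = next(name for name in wheel.namelist() if name.endswith("METADATA"))
    metadata = wheel.read(metadata_name).decode()

# Dependencies are listed as "Requires-Dist" lines in the wheel metadata;
# they are not bundled as files inside the wheel itself.
print([line for line in metadata.splitlines() if line.startswith("Requires-Dist")])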

Just as an aside, I went to a clean environment and did "pip install wgpu". I was surprised to see pycparser included. I can't find where that dependency is coming from.

I'm guessing it's a transitive dependency of cffi.

The object returned by WgpuAwaitable is now directly awaitable. Make Almar happy.
@almarklein (Member)

I'm guessing it's a transitive dependency of cffi.

It is: https://github.com/python-cffi/cffi/blob/88f48d22484586d48079fc8780241dfdfa3379c8/pyproject.toml#L12-L14

@almarklein (Member)

It looks like yield None works for Trio too. The example below runs two tasks simultaneously: one uses an awaitable that constantly switches back to the event loop using yield None, the other prints the time on an interval.

import time
import trio

class Awaitable:    
    def __await__(self):
        while True:
            yield None

async def main():
    async with trio.open_nursery() as nursery:
        nursery.start_soon(child1)
        nursery.start_soon(child2)

async def child1():
    await Awaitable()

async def child2():
    while True:
        print(time.time())
        await trio.sleep(0.01)
    
trio.run(main)

@fyellin (Contributor, Author) commented Nov 7, 2024

Everything resolved.

@Korijn (Collaborator) commented Nov 8, 2024

I will agree with Almar on the dependency management topic here.

If you switch from anyio to sniffio, you can write something like the following:

import sniffio

def get_async_backend():
    lib = sniffio.current_async_library()
    if lib == "asyncio":
        import asyncio
        return asyncio
    # and so on

That would reduce our dependency footprint quite a lot.

@almarklein I do prefer to await ...sleep() over using yield None. It may work in specific circumstances, but it probably hinges on implementation details and backward compatibility artefacts...
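
A minimal sketch of that idea (hypothetical; it assumes only sniffio plus whichever event loop is already running): detect the current async library and sleep through it instead of yielding None:

import sniffio

async def sleep_async(seconds):
    # Pick the sleep function of whichever async library is currently running.
    lib = sniffio.current_async_library()
    if lib == "trio":
        import trio
        await trio.sleep(seconds)
    else:
        import asyncio
        await asyncio.sleep(seconds)

async def poll_until_done(is_done, interval=0.001):
    # Backend-agnostic polling loop: no yield None, just short sleeps.
    while not is_done():
        await sleep_async(interval)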

@almarklein (Member)

Interesting that CI is green; if I check out this branch and run an example, it panics in _wgpuDeviceCreateRenderPipeline. Note the invalid vertex format for vertex attribute: 0:

thread '<unnamed>' panicked at src/lib.rs:2169:42:
invalid vertex format for vertex attribute: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at library/core/src/panicking.rs:221:5:
panic in a function that cannot unwind
stack backtrace:
   0:        0x1034a14e8 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h243268f17d714c7f
   1:        0x1034ba19c - core::fmt::write::hb3cfb8a30e72d7ff
   2:        0x10349fb84 - std::io::Write::write_fmt::hfb2314975de9ecf1
   3:        0x1034a2504 - std::panicking::default_hook::{{closure}}::h14c7718ccf39d316
   4:        0x1034a2128 - std::panicking::default_hook::hc62e60da3be2f352
   5:        0x1034a2fc8 - std::panicking::rust_panic_with_hook::h09e8a656f11e82b2
   6:        0x1034a28f0 - std::panicking::begin_panic_handler::{{closure}}::h1230eb3cc91b241c
   7:        0x1034a1974 - std::sys::backtrace::__rust_end_short_backtrace::hc3491307aceda2c2
   8:        0x1034a25e0 - _rust_begin_unwind
   9:        0x1034df108 - core::panicking::panic_nounwind_fmt::h91ee161184879b56
  10:        0x1034df180 - core::panicking::panic_nounwind::heab7ebe7a6cd845c
  11:        0x1034df224 - core::panicking::panic_cannot_unwind::hedc43d82620205bf
  12:        0x1031ec070 - _wgpuDeviceCreateRenderPipeline
  13:        0x1a5051050 - <unknown>
  14:        0x1a5059b04 - <unknown>
  15:        0x101c51188 - _cdata_call
  16:        0x100ed20d0 - __PyObject_Call
  17:        0x100fc4358 - __PyEval_EvalFrameDefault
  18:        0x100eeb4b0 - _gen_send_ex2
  19:        0x100eeb114 - _gen_send_ex
  20:        0x100f20524 - _cfunction_vectorcall_O
  21:        0x100fe3124 - _context_run
  22:        0x100fc3e30 - __PyEval_EvalFrameDefault
  23:        0x100eeb4b0 - _gen_send_ex2
  24:        0x100eeb114 - _gen_send_ex
  25:        0x100fc3cbc - __PyEval_EvalFrameDefault
  26:        0x100fb942c - _PyEval_EvalCode
  27:        0x100fb6d44 - _builtin_exec
  28:        0x100f2024c - _cfunction_vectorcall_FASTCALL_KEYWORDS
  29:        0x100ed1e60 - _PyObject_Vectorcall
  30:        0x100fc2968 - __PyEval_EvalFrameDefault
  31:        0x100fb942c - _PyEval_EvalCode
  32:        0x101019e20 - _run_mod
  33:        0x1010183ec - __PyRun_SimpleFileObject
  34:        0x101017e20 - __PyRun_AnyFileObject
  35:        0x10103bb50 - _Py_RunMain
  36:        0x10103bf10 - _pymain_main
  37:        0x10103bfb0 - _Py_BytesMain
thread caused non-unwinding panic. aborting.

@almarklein (Member) commented Nov 8, 2024

I was wondering how our trio example was able to run, while your new test indicated that trio does not handle the yield None. Turns out that in main, the Awaitable.is_done() calls the poll function, and it seems to always resolve on the first try, so the None is never yielded. Conclusion: the yield None is an asyncio-specific trick.

edit: I was hoping that the yield trick allowed us to stay async-lib-independent without much effort. But it looks like this is not possible.

@fyellin (Contributor, Author) commented Nov 13, 2024

Is there something holding this up?

@Korijn (Collaborator) commented Nov 13, 2024

Is there something holding this up?

Yes, these two comments:

@fyellin (Contributor, Author) commented Nov 14, 2024

I double checked "yield None" with the simplest possible code:

>>> class Foo:
...     def __await__(self):  yield None; return 10
... 
>>> async def foo():
...     await Foo()
... 
>>> import trio
>>> trio.run(foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/fy/Pycharm/wgpu-py/.venv/lib/python3.12/site-packages/trio/_core/_run.py", line 2407, in run
    raise runner.main_task_outcome.error
  File "<stdin>", line 2, in foo
  File "<stdin>", line 2, in __await__
TypeError: trio.run received unrecognized yield message None. Are you trying to use a library written for some other framework like asyncio? That won't work without some kind of compatibility shim.
>

Meanwhile asyncio.run(foo()) worked just fine. So yield None is definitely asyncio-specific.

@fyellin (Contributor, Author) commented Nov 16, 2024

@almarklein What are you doing to see your crash? I haven't seen that on either my Mac or the headless Linux machines I occasionally test on.

@almarklein (Member)

What are you doing to see your crash?

python examples/triangle.py shows a black window. python examples/cube.py crashes.

@fyellin (Contributor, Author) commented Nov 19, 2024

@almarklein Hmm, it works just fine on my Mac... Let me see if I can re-create this on a headless Linux device. If not, I'll just close this PR; it's not worth the hassle it's causing.

@fyellin (Contributor, Author) commented Nov 19, 2024

@almarklein Is there some secret to being able to run examples/cube.py on a headless machine with a GPU, and just having the results written to a backing buffer rather than the actual screen? I can't seem to figure out how to do that.

What is particularly strange is that although examples/cube.py includes an asynchronous implementation, the code that's actually running is completely synchronous. And the error message you're seeing is utterly bizarre.

@fyellin (Contributor, Author) commented Nov 19, 2024

@almarklein Can you give me details on OS, Python version, and GPU? I rewrote cube.py to be windowless, ran it on a Linux machine, and it just worked. The bug you've found sounds serious, but I just can't reproduce it...

@almarklein (Member) commented Nov 20, 2024

I'm on macOS (M1), Python 3.12.4. This is very strange indeed. I'm having a quick look.

edit: I think I know what the problem is. The pipeline descriptor is now built in a separate method, but some of the subfields are no longer referenced and are therefore cleaned up.

@almarklein (Member)

I found and fixed the problem by adding a new_array() function. It makes sure that the substructs are associated with the array so they are not gc'd prematurely. It also returns null when the list of substructs is empty, which simplifies the code in a few other places (and produces a cleaner struct where we did not consider the case of the empty array). This change should also make further refactoring easier.
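
A rough, self-contained sketch of what such a helper could look like (the real new_array() in _api.py may differ; CSubType and the demo at the end are made up for illustration):

import weakref
from cffi import FFI

ffi = FFI()
ffi.cdef("typedef struct { int x; } CSubType;")

_refs_per_array = weakref.WeakKeyDictionary()

def new_array(ctype, elements):
    elements = list(elements)
    if not elements:
        return ffi.NULL  # callers get a null pointer (plus a count of zero)
    array = ffi.new(ctype, elements)  # a contiguous copy of the sub-structs
    # Associate the Python-side sub-structs (and whatever they reference)
    # with the array, so they stay alive as long as the array is alive.
    _refs_per_array[array] = elements
    return array

subs = [ffi.new("CSubType *", [i])[0] for i in range(3)]
arr = new_array("CSubType[]", subs)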

@almarklein (Member)

The last remaining issue is to replace anyio with sniffio.

@fyellin (Contributor, Author) commented Nov 21, 2024

@almarklein Are all bugs gone? I'm still surprised you had GC problems when I didn't. I wonder if cffi is implemented slightly differently on different machines.

@almarklein (Member) commented Nov 21, 2024

For the record, I'll try to explain what was happening a bit better. When you create an ffi struct, it does not hold a reference to its fields when these fields are pointers. So if you do:

sub_struct = ffi.new("CSubType *")
struct = ffi.new("CType *")
struct.aField = sub_struct

then struct does not hold a reference to sub_struct. If sub_struct goes out of scope, its memory is freed, and struct is now pointing to unclaimed memory; using it can result in garbage or a segfault.

We solved this problem long ago using the new_struct() and new_struct_p() functions in _api.py. One of the things these functions do is associate the fields with the struct, using a WeakKeyDictionary.
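
For reference, that mechanism could be sketched like this (simplified and hypothetical; the real new_struct_p() in _api.py does more):

import weakref
from cffi import FFI

ffi = FFI()
ffi.cdef("typedef struct { int x; } CSubType; typedef struct { CSubType *sub; } CType;")

_refs_per_struct = weakref.WeakKeyDictionary()

def new_struct_p(ctype, **fields):
    struct_p = ffi.new(ctype)
    for name, value in fields.items():
        setattr(struct_p, name, value)
    # Keep the cdata objects assigned to pointer fields alive for as long
    # as the struct itself is alive, so they cannot be gc'd prematurely.
    _refs_per_struct[struct_p] = fields
    return struct_p

sub_struct = ffi.new("CSubType *", [3])
struct = new_struct_p("CType *", sub=sub_struct)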

However, for arrays we had code like this in many places:

list_of_sub_structs = [new_struct("CSubType", ...) for ... in ...]
array_of_sub_structs = ffi.new("CSubType[]", list_of_sub_structs)
struct.a_field = array_of_sub_structs

These arrays are contiguous copies of the list of structs. The problem is not that the sub-structs in the list get cleared by the gc (as I thought earlier), but that any sub-sub-structs and/or sub-sub-arrays of these sub-structs get cleared when the sub-struct gets cleared.

The issue with arrays was not hit earlier, because in most cases we create a descriptor and then use it to instantiate a wgpu object. As code is refactored to create the descriptor in a separate method (which is a great change by itself), the lists go out of scope before the descriptor is passed to Rust.

@almarklein (Member)

Wow, this PR is interesting, in being quite technical/deep on two very different topics 😅
