Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix double referenced types in paginated endpoints #337

Merged

Conversation

marioortizmanero
Copy link
Collaborator

Description

This removes the potential double reference to Id types (&'a Id<'_>)
which was previously required due to limitations in the async variant of
the paginate function.
Due to limitations in HRTBs (see
https://kevincox.ca/2022/04/16/rust-generic-closure-lifetimes/) it is
not possible, to write something like Req: for<'a> Fn(&'a) -> ('a + Future<...>). The sync version remains almost unchanged.

Motivation and Context

See #305

Dependencies

None

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

How has this been tested?

CI passes

Is this change properly documented?

No need, it's an internal change

@marioortizmanero
Copy link
Collaborator Author

Whoops, I think that by creating the PR myself @eladyn, the original author, can't commit themselves. Can you confirm that? Otherwise, you can create it yourself to iterate over it.

@eladyn
Copy link
Contributor

eladyn commented Jul 14, 2022

Unfortunately, I'm currently not at home and won't be able to get back to this for a week or so. If you'd like to pick this up and fix the tests for example in the meantime, feel free to do so. I'll be glad to assist and explain things as good as possible. :)

@eladyn eladyn force-pushed the fix_double_reference_pagination branch 2 times, most recently from 7ead720 to 6cc9379 Compare July 29, 2022 10:15
@eladyn
Copy link
Contributor

eladyn commented Jul 29, 2022

So I finally got around to fixing the tests. Let me know if you need to know anything about that implementation. It might be a bit confusing at first, but I didn't find any better way to do it.

@marioortizmanero
Copy link
Collaborator Author

Thanks! I didn't end up having enough time to fix it myself. The article you linked explains it quite well, though.

I wonder if this will be fixed in the future with something like chalk. I guess we need to be able to use impl within the FnOnce return type, but not sure if that's something chalk takes care of.

I'll wait for @ramsayleung to take a look, since this is a bit complicated and he should understand it as well.

@ramsayleung
Copy link
Owner

This PR is a little complex, I need a little time to understand the context and solution :-)

@marioortizmanero
Copy link
Collaborator Author

@ramsayleung, just wanted to let you know that I'll be out for vacation for a couple weeks, so I won't be able to work on the next version myself. However, only this PR and #327 are required, you can ignore #356 for now. I already reviewed the two required PRs, so if you don't need any additional feedback you can just release the version yourself. Otherwise, I'll be back in a couple weeks.

@ramsayleung
Copy link
Owner

Sure, just enjoy your summer vacation :)

This removes the potential double reference to Id types (&'a Id<'_>)
which was previously required due to limitations in the async variant of
the `paginate` function.
Due to limitations in HRTBs (see
https://kevincox.ca/2022/04/16/rust-generic-closure-lifetimes/) it is
not possible, to write something like `Req: for<'a> Fn(&'a) ->
('a + Future<...>)`. The sync version remains almost unchanged.
@eladyn eladyn force-pushed the fix_double_reference_pagination branch from 6cc9379 to 122d22e Compare August 26, 2022 13:04
@eladyn
Copy link
Contributor

eladyn commented Aug 26, 2022

I just realized that by doing cargo fmt on Rust 1.63.0, all those occurrences of pub(in crate) were changed to pub(crate). Should I split that off in a separate PR or is that minor enough to stay in this one?

@ramsayleung
Copy link
Owner

Should I split that off in a separate PR or is that minor enough to stay in this one?

I think the modification is minor so that you could just stay in this one, you don't need a separate PR :)

ctx: Ctx,
req: Request,
page_size: u32,
) -> Paginator<'a, ClientResult<T>>
Copy link
Owner

@ramsayleung ramsayleung Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little confused that what's ctx for? What's purpose of ctx variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the sync case: not much. This is only to keep compatibility with the async version, which needs those. I'll try to explain why on the other review comments.

@@ -9,7 +9,34 @@ use futures::{future::Future, stream::Stream};
/// Alias for `futures::stream::Stream<Item = T>`, since async mode is enabled.
pub type Paginator<'a, T> = Pin<Box<dyn Stream<Item = T> + 'a>>;

pub type RequestFuture<'a, T> = Pin<Box<dyn 'a + Future<Output = ClientResult<Page<T>>>>>;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say, It's a little headache to understand this function signature 😂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean, coming up with it wasn't easy either. 😀 In essence it's mostly the signature of a Future that returns a ClientResult<Page<T>> with the speciality that it is Boxed (and pinned) here.

@ramsayleung
Copy link
Owner

I have fixed the clippy and fmt error in this PR #361, perhaps we could merge #361 first.

) -> Paginator<'a, ClientResult<T>>
where
T: 'a + Unpin,
Request: 'a + for<'ctx> Fn(&'ctx Ctx, u32, u32) -> RequestFuture<'ctx, T>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I understand the design correctly, so I try to retell the story.

What the signature means is to allow req function to hold a reference to Ctx in the returned future with HRTB

So that the returned value of the req function is allowed to reference the argument(Ctx), we pass it.

All the thing we want to do is to specify that the lifetime of Ctx parameter must outlive the lifetime of RequestFuture?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds correct. I'll add my version, so you can double check. :)

The problem we need to solve is: When we move anything into the closure (req), the returned future must not reference it in some way, since we can't ensure that the closure outlives the returned future.

The way we solve it: We pass the reference from outside and await the future nearly immediately afterwards. As such, the lifetime bounds are upheld and the result we get no longer references ctx so we can safely stream it to the consumer.

Due to the limitations in HRTBs, we can't formulate a bound at this time that works with a raw impl Future<...> so we need to Box it (which creates the monstrosity from your above review comment 😅).

Copy link
Owner

@ramsayleung ramsayleung Oct 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little confused about this:

We pass the reference from outside and await the future nearly immediately afterwards. As such, the lifetime bounds are upheld

    fn user_playlists<'a>(
        &'a self,
        user_id: UserId<'a>,
    ) -> Paginator<'_, ClientResult<SimplifiedPlaylist>> {
        paginate_with_ctx(
            (self, user_id),
            move |(slf, user_id), limit, offset| {
                slf.user_playlists_manual(user_id.as_ref(), Some(limit), Some(offset))
            },
            self.get_config().pagination_chunks,
        )
    }

We pass the (self, user_id) as the ctx, but how does the lifetime bound?

and the result we get no longer references ctx so we can safely stream it to the consumer.

According to the function signature: for<'ctx> Fn(&'ctx Ctx, u32, u32), the closure references to ctx, do you mean the result RequestFuture doesn't reference to ctx?

By the way, I realize Rust is hard, I still have a lot to learn. 😂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little confused about this:

We pass the reference from outside and await the future nearly immediately afterwards. As such, the lifetime bounds are upheld

...
We pass the (self, user_id) as the ctx, but how does the lifetime bound?

What I meant by that is:

pub fn paginate_with_ctx<'a, Ctx: 'a, T, Request>(...) -> Paginator<'a, ClientResult<T>>
where
    T: 'a + Unpin,
    Request: 'a + for<'ctx> Fn(&'ctx Ctx, u32, u32) -> RequestFuture<'ctx, T>,
{
    let mut offset = 0;
    Box::pin(stream! {
        loop {
            // we create a reference to ctx here
            let page = req(&ctx, page_size, offset).await?;
            // due to `await`, the reference is not used again after this line
            offset += page.items.len() as u32;
            for item in page.items {
                // doesn't reference ctx, so yielding the items is fine
                yield Ok(item);
            }
            ...
        }
    })
}

and the result we get no longer references ctx so we can safely stream it to the consumer.

According to the function signature: for<'ctx> Fn(&'ctx Ctx, u32, u32), the closure references to ctx, do you mean the result RequestFuture doesn't reference to ctx?

The function itself takes a reference and so does the RequestFuture, but when RequestFuture is awaited, we get a ClientResult<T> which does not reference ctx.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point, it makes sense:

pub fn paginate_with_ctx<'a, Ctx: 'a, T, Request>(...) -> Paginator<'a, ClientResult<T>>
where
    T: 'a + Unpin,
    Request: 'a + for<'ctx> Fn(&'ctx Ctx, u32, u32) -> RequestFuture<'ctx, T>,
{
    let mut offset = 0;
    Box::pin(stream! {
        loop {
            // we create a reference to ctx here. 
            // the `&ctx` is a shortlived reference, it's just like we put `ctx` into a bounded scope.
           //  ---- scope begins here ----
            let page = req(&ctx, page_size, offset).await?;
            // due to `await`, the reference is not used again after this line
           // ------ scope ends here -----
            offset += page.items.len() as u32;
            for item in page.items {
                // doesn't reference ctx, so yielding the items is fine
                yield Ok(item);
            }
            ...
        }
    })
}

It's clear now :)

@ramsayleung
Copy link
Owner

ramsayleung commented Oct 4, 2022

I think this PR is ready to merge, @eladyn would you like to wait for #361 to merge to fix the Clippy error, or you could fix Clippy error in this PR individually, then #361 will not be a blocker?

@eladyn
Copy link
Contributor

eladyn commented Oct 4, 2022

@ramsayleung Everything should be fine now!

@ramsayleung ramsayleung merged commit 3cb1e87 into ramsayleung:master Oct 4, 2022
@ramsayleung
Copy link
Owner

Merged :)

@eladyn eladyn deleted the fix_double_reference_pagination branch October 4, 2022 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants