Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EnqueueLinks modifies userData when adding label, causing unexpected behavior #1725

Closed
mvolfik opened this issue Dec 28, 2022 · 6 comments · Fixed by #1728
Closed

EnqueueLinks modifies userData when adding label, causing unexpected behavior #1725

mvolfik opened this issue Dec 28, 2022 · 6 comments · Fixed by #1728
Labels
bug Something isn't working. discussion

Comments

@mvolfik
Copy link
Contributor

mvolfik commented Dec 28, 2022

Imagine the following code in a page handler function

const data = { title: $('title').text() };

await enqueueLinks({
    selector: '.WebFooter-bottomLinks a',
    userData: data,
    label: 'footer-link',
});

await enqueueLinks({
    selector: '.LandingPage-actorListCards > a',
    userData: data,
    label: 'actor-link',
});

Then all requests are enqueued with label: 'footer-link', which is quite unexpected. That is due to this code:

requestOptions.userData ??= options.userData ?? {};
requestOptions.userData!.label ??= options.label;

enqueueLinks modifies the userData object, and adds the label only if it's not present already. A simple workaround in user code is to provide a shallow copy of the data object (userData: { ...data }).

The possible resolutions are two:

  • do the shallow copy inside the enqueueLinks code (possibly only when setting the label)
  • always override the label if provided, even if it already is present in the userData (so enqueueLinks({ userData: {label: 'foo'}, label: 'bar'}) would enqueue requests with the label bar).

I'm not really sure which variant is better


Full reproduction repository: mvolfik/apify-enqueuelinks-bug

cc @tomasjindra - that's the issue we debugged together

@mvolfik mvolfik added bug Something isn't working. discussion labels Dec 28, 2022
@B4nan
Copy link
Member

B4nan commented Dec 28, 2022

I'd vote for always overriding the label, that would be the expected behavior to me.

@corford
Copy link
Contributor

corford commented Dec 29, 2022

Personally, I wouldn't expect a label from my first call of enqueueLinks() to appear on userData from a second call to enqueueLinks() (if for, that second call, I hadn't explicitly passed a label).

Or said another way, I wouldn't expect enqueueLinks() to mutate a passed userData object - shallow copy feels more obvious to me (if you want a 'label' prop to persist on the passed userData object, then explicitly set it yourself rather than indirectly via a separate param to enqueueLinks() ).

@B4nan
Copy link
Member

B4nan commented Dec 30, 2022

Well, Request.label is a shortcut for working with Request.userData, it is supposed to mutate it, and if you ask me, it is also supposed to take precedence.

@corford
Copy link
Contributor

corford commented Dec 30, 2022

That's fair. We don't use the Request.label shortcut (preferring to be explicit with what we do on userData, since we use it for a lot of our own internal stuff too), so I guess it's from that PoV that, to me, the label param to enqueueLinks felt more internal to that func (and unexpected that it would mutate the passed userData).

@mvolfik
Copy link
Contributor Author

mvolfik commented Dec 30, 2022

Well, we can do both - first create a shallow copy, and then override userData.label with label

@B4nan should I do a PR with that?

@B4nan
Copy link
Member

B4nan commented Dec 30, 2022

Yeah, sure, PR welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working. discussion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants