🐛 Fix listJobs pagination bug #15908

Merged on Aug 24, 2022 (3 commits)

@lmossman lmossman commented Aug 23, 2022

What

I discovered a bug in the listJobs query that takes in a pageSize and offset. The query performs the join between jobs and attempts before applying LIMIT and OFFSET, so both clauses operate on rows of the join result rather than on job records.

Practically, this means that if some connection has the following job table setup:

(job 4: attempt 1, attempt 2, attempt 3)
(job 3: attempt 1, attempt 2)
(job 2: attempt 1)
(job 1: attempt 1, attempt 2)

Then the EXPECTED result of executing listJobs(limit: 2, offset: 1) is:

(job 3: attempt 1, attempt 2)
(job 2: attempt 1)

but the result that is actually returned due to this bug is:

(job 4: attempt 2, attempt 3)

because the limit and offset are applied to each job<>attempt pair.
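
The difference can be reproduced in memory. The following is a minimal sketch, not the code from this PR: the class name, the fixture, and the string representation of rows are all hypothetical, and lists stand in for the jobs and attempts tables from the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PaginationDemo {

  // Hypothetical fixture mirroring the example: job id -> number of attempts, newest job first.
  public static Map<Integer, Integer> fixture() {
    final Map<Integer, Integer> jobs = new LinkedHashMap<>();
    jobs.put(4, 3);
    jobs.put(3, 2);
    jobs.put(2, 1);
    jobs.put(1, 2);
    return jobs;
  }

  // Simulates the jobs<>attempts join: one row per (job, attempt) pair.
  public static List<String> join(final Map<Integer, Integer> jobs) {
    final List<String> rows = new ArrayList<>();
    for (final Map.Entry<Integer, Integer> job : jobs.entrySet()) {
      for (int attempt = 1; attempt <= job.getValue(); attempt++) {
        rows.add("job " + job.getKey() + ": attempt " + attempt);
      }
    }
    return rows;
  }

  // Simulates LIMIT/OFFSET on an ordered list.
  public static <T> List<T> page(final List<T> items, final int limit, final int offset) {
    final int from = Math.min(offset, items.size());
    final int to = Math.min(from + limit, items.size());
    return new ArrayList<>(items.subList(from, to));
  }

  // Buggy order of operations: join first, then paginate the joined rows.
  public static List<String> paginateAfterJoin(final Map<Integer, Integer> jobs, final int limit, final int offset) {
    return page(join(jobs), limit, offset);
  }

  // Fixed order of operations: paginate the jobs, then join their attempts.
  public static List<String> paginateBeforeJoin(final Map<Integer, Integer> jobs, final int limit, final int offset) {
    final Map<Integer, Integer> selected = new LinkedHashMap<>();
    for (final Integer jobId : page(new ArrayList<>(jobs.keySet()), limit, offset)) {
      selected.put(jobId, jobs.get(jobId));
    }
    return join(selected);
  }

  public static void main(final String[] args) {
    System.out.println(paginateAfterJoin(fixture(), 2, 1));
    System.out.println(paginateBeforeJoin(fixture(), 2, 1));
  }
}
```

With limit 2 and offset 1, the first call returns only attempts 2 and 3 of job 4, while the second returns every attempt of jobs 3 and 2.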

How

This PR fixes the bug by changing this query to apply the limit and offset to the jobs table only in a subquery, the result of which is then joined with the attempts table. This results in the expected behavior of limit and offset being applied at the job record level.

To accomplish this, I had to move the old BASE_JOB_SELECT_AND_JOIN value into a new method jobSelectAndJoin() that takes in the string to use as the job subquery. The paginated listJobs() method can then pass its paginated jobs subquery into that method. BASE_JOB_SELECT_AND_JOIN, by contrast, just passes in jobs as the subquery, so the resultant string for that constant is the same as it was before this change. I wanted that because basically every other query in this class uses that constant.
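
The shape of that refactor might look roughly like the sketch below. Only the names jobSelectAndJoin and BASE_JOB_SELECT_AND_JOIN come from this PR; the class name, the selected columns, the join condition, the `?` placeholders, and the outer ORDER BY are assumptions made for illustration and do not reproduce the real query text.

```java
public class JobQuerySketch {

  // Builds the shared SELECT ... JOIN prefix around whatever relation is passed in as "jobs".
  // The column list and join condition are placeholders, not the real query.
  public static String jobSelectAndJoin(final String jobsSubquery) {
    return "SELECT jobs.id AS job_id, attempts.attempt_number AS attempt_number"
        + " FROM " + jobsSubquery
        + " LEFT OUTER JOIN attempts ON jobs.id = attempts.job_id ";
  }

  // Passing the plain table name keeps the constant identical to a hard-coded "FROM jobs" join.
  public static final String BASE_JOB_SELECT_AND_JOIN = jobSelectAndJoin("jobs");

  // Paginated variant: LIMIT/OFFSET live inside the subquery, so they count jobs, not joined rows.
  // Bind parameters (?) are an assumption of this sketch.
  public static String paginatedListJobsQuery() {
    final String jobsSubquery = "(SELECT * FROM jobs WHERE jobs.scope = ?"
        + " ORDER BY jobs.created_at DESC, jobs.id DESC LIMIT ? OFFSET ?) AS jobs";
    // The join does not guarantee row order, so this sketch orders the joined result again.
    return jobSelectAndJoin(jobsSubquery)
        + "ORDER BY jobs.created_at DESC, jobs.id DESC, attempts.attempt_number ASC";
  }
}
```

Aliasing the subquery as jobs means the rest of the query can keep referring to jobs.id and similar columns unchanged.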

I also updated the test for this method to properly check that pagination is being performed correctly even when jobs have multiple attempts.

Recommended reading order

  1. DefaultJobPersistence.java
  2. DefaultJobPersistenceTest.java

🚨 User Impact 🚨

The impact of this change is that the pagination behavior of the listJobs API endpoint and query should now work as expected, in that pagination is applied at the job level rather than at the (job, attempt) pair level.

@github-actions github-actions bot added area/platform issues related to the platform area/scheduler labels Aug 23, 2022
@lmossman lmossman changed the title Fix listJobs pagination bug and update test to check for it 🐛 Fix listJobs pagination bug and update test to check for it Aug 23, 2022
@lmossman lmossman changed the title 🐛 Fix listJobs pagination bug and update test to check for it 🐛 Fix listJobs pagination bug Aug 23, 2022
@lmossman lmossman temporarily deployed to more-secrets August 23, 2022 23:31 Inactive
@lmossman lmossman marked this pull request as ready for review August 23, 2022 23:36
alovew commented Aug 24, 2022

lgtm 👍

@@ -344,13 +347,10 @@ public List<Job> listJobs(final ConfigType configType, final String configId, fi

@Override
public List<Job> listJobs(final Set<ConfigType> configTypes, final String configId, final int pagesize, final int offset) throws IOException {
  final String jobsSubquery = "(SELECT * FROM jobs WHERE CAST(jobs.config_type AS VARCHAR) in " + Sqls.toSqlInFragment(configTypes)
      + " AND jobs.scope = '" + configId + "' ORDER BY jobs.created_at DESC, jobs.id DESC LIMIT " + pagesize + " OFFSET " + offset + ") AS jobs";

A future warning for us: this is fine for now, but if/when the dataset grows large, OFFSET can have some real performance issues. The alternative is to filter using a key or ID, but that has its own problems. This also highlights the need to use pagination from an application framework so we can avoid having to solve this type of problem (e.g. let the framework handle how to generate a paginated query).


@lmossman lmossman Aug 24, 2022


Yep, I definitely agree that using an application framework to generate these queries for us would be greatly preferable. Looks like we also have an EPIC for API Pagination here; it may be a good idea for you to add a note there about avoiding OFFSET and using a different pagination mechanism (e.g. cursor pagination).
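
For reference, cursor (keyset) pagination replaces OFFSET with a filter on the sort key of the last row already returned. In PostgreSQL this is typically written as a row comparison such as `WHERE (created_at, id) < (?, ?)` combined with the same ORDER BY and a LIMIT. The sketch below shows the mechanism in memory only; the class, the Job type, and the nextPage method are hypothetical and are not part of this PR or the codebase.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class KeysetPaginationSketch {

  // Hypothetical minimal job holding only the two columns used for ordering.
  public static final class Job {
    public final long createdAt;
    public final long id;

    public Job(final long createdAt, final long id) {
      this.createdAt = createdAt;
      this.id = id;
    }
  }

  // Newest first, ties broken by id: mirrors ORDER BY created_at DESC, id DESC.
  public static final Comparator<Job> NEWEST_FIRST =
      Comparator.comparingLong((Job j) -> j.createdAt).thenComparingLong(j -> j.id).reversed();

  // Returns up to pageSize jobs that sort strictly after the cursor; a null cursor means the first page.
  // A database would do this with an index range scan instead of sorting everything.
  public static List<Job> nextPage(final List<Job> allJobs, final Job cursor, final int pageSize) {
    final List<Job> sorted = new ArrayList<>(allJobs);
    sorted.sort(NEWEST_FIRST);
    final List<Job> page = new ArrayList<>();
    for (final Job job : sorted) {
      final boolean afterCursor = cursor == null || NEWEST_FIRST.compare(job, cursor) > 0;
      if (afterCursor && page.size() < pageSize) {
        page.add(job);
      }
    }
    return page;
  }
}
```

The caller passes the last job of the previous page as the cursor for the next request, so the cost of fetching a page does not depend on how many pages precede it. The trade-off is that clients can no longer jump to an arbitrary page number.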


@jdpgrailsdev jdpgrailsdev left a comment


:shipit:

@lmossman lmossman temporarily deployed to more-secrets August 24, 2022 18:19 Inactive
@lmossman lmossman merged commit 5c52a44 into master Aug 24, 2022
@lmossman lmossman deleted the lmossman/fix-list-jobs-pagination branch August 24, 2022 18:40
sophia-wiley pushed a commit that referenced this pull request Aug 25, 2022
* fix listJobs pagination bug and update test to check for it

* change variable name
rodireich pushed a commit that referenced this pull request Aug 25, 2022
* fix listJobs pagination bug and update test to check for it

* change variable name