🐛 Fix listJobs pagination bug #15908
lgtm 👍
@@ -344,13 +347,10 @@ public List<Job> listJobs(final ConfigType configType, final String configId, fi
  @Override
  public List<Job> listJobs(final Set<ConfigType> configTypes, final String configId, final int pagesize, final int offset) throws IOException {
    final String jobsSubquery = "(SELECT * FROM jobs WHERE CAST(jobs.config_type AS VARCHAR) in " + Sqls.toSqlInFragment(configTypes)
        + " AND jobs.scope = '" + configId + "' ORDER BY jobs.created_at DESC, jobs.id DESC LIMIT " + pagesize + " OFFSET " + offset + ") AS jobs";
A future warning for us: this is fine for now, but if/when the dataset grows large, OFFSET can have some real performance issues. The alternative is to filter using a key or ID, but that has its own problems. This also highlights the need to use pagination from an application framework so we can avoid having to solve this type of problem (e.g. let the framework handle how to generate a paginated query).
Yep, I definitely agree that using an application framework to generate these queries for us would be greatly preferable. Looks like we also have an EPIC for API Pagination here - it may be a good idea for you to add a note there about avoiding OFFSET and using a different pagination mechanism (e.g. cursor pagination).
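For reference, a minimal sketch of the cursor (keyset) approach mentioned in this thread. It is not what this PR implements. Instead of skipping rows with OFFSET, each page is filtered by the sort key of the last row already returned. The class name, the JobRow type, and the pageAfter helper are all hypothetical. The SQL string assumes PostgreSQL row-value comparison and is only illustrative; the in-memory method mirrors its semantics so the behavior can be checked without a database.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of keyset (cursor) pagination; not code from this PR.
class KeysetPaginationSketch {

  // Hypothetical minimal job row: only the two columns used for ordering.
  static final class JobRow {
    final long id;
    final long createdAt;

    JobRow(final long id, final long createdAt) {
      this.id = id;
      this.createdAt = createdAt;
    }
  }

  // Illustrative SQL shape (assumes PostgreSQL row-value comparison).
  // Parameters: scope, cursor created_at, cursor id, page size.
  static final String KEYSET_QUERY =
      "SELECT * FROM jobs WHERE jobs.scope = ? AND (jobs.created_at, jobs.id) < (?, ?) "
          + "ORDER BY jobs.created_at DESC, jobs.id DESC LIMIT ?";

  // In-memory equivalent of KEYSET_QUERY: returns up to pageSize jobs that sort
  // strictly after the cursor in (created_at DESC, id DESC) order.
  // A null cursor means "start from the newest job".
  static List<JobRow> pageAfter(final List<JobRow> jobs, final JobRow cursor, final int pageSize) {
    final Comparator<JobRow> newestFirst = Comparator
        .comparingLong((JobRow j) -> j.createdAt)
        .thenComparingLong(j -> j.id)
        .reversed();
    return jobs.stream()
        .filter(j -> cursor == null
            || j.createdAt < cursor.createdAt
            || (j.createdAt == cursor.createdAt && j.id < cursor.id))
        .sorted(newestFirst)
        .limit(pageSize)
        .collect(Collectors.toList());
  }
}
```

The caller passes the last row of the previous page as the cursor for the next one. The trade-off raised in the review still applies: this avoids scanning skipped rows, but it cannot jump to an arbitrary page number.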
* fix listJobs pagination bug and update test to check for it
* change variable name
What
I discovered a bug in the listJobs query that takes in a pageSize and offset. The issue is that this query first performs the join between jobs and attempts before applying the LIMIT and OFFSET, which means that LIMIT and OFFSET are applied to the result of the join, so they are actually limiting/offsetting into that join result.
Practically, this means that if some connection has the following job table setup:
Then the EXPECTED result of executing listJobs(limit: 2, offset: 1) is:
but the result that is actually returned due to this bug is:
because the limit and offset are applied to each job<>attempt pair.
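To make the difference concrete, here is a small in-memory sketch of the two orderings of operations. The data is hypothetical (it is not the table from this PR), and the class and method names are made up for illustration. It assumes every job has at least one attempt, so the join yields one row per (job, attempt) pair.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the bug; not code from this PR.
class JoinPaginationDemo {

  // Each job is {jobId, attemptCount}, listed newest first.
  // Each joined row is {jobId, attemptNumber}.
  static List<int[]> joinWithAttempts(final List<int[]> jobs) {
    final List<int[]> rows = new ArrayList<>();
    for (final int[] job : jobs) {
      for (int attempt = 0; attempt < job[1]; attempt++) {
        rows.add(new int[] {job[0], attempt});
      }
    }
    return rows;
  }

  // Equivalent of LIMIT/OFFSET on an already ordered list.
  static <T> List<T> page(final List<T> rows, final int limit, final int offset) {
    final int from = Math.min(offset, rows.size());
    final int to = Math.min(from + limit, rows.size());
    return new ArrayList<>(rows.subList(from, to));
  }

  // Buggy order: join first, then page over (job, attempt) pairs.
  static List<int[]> joinThenPage(final List<int[]> jobs, final int limit, final int offset) {
    return page(joinWithAttempts(jobs), limit, offset);
  }

  // Fixed order: page over jobs first, then join the attempts.
  static List<int[]> pageThenJoin(final List<int[]> jobs, final int limit, final int offset) {
    return joinWithAttempts(page(jobs, limit, offset));
  }

  // Distinct job ids in the order they appear in the joined rows.
  static List<Integer> jobIds(final List<int[]> rows) {
    final Set<Integer> ids = new LinkedHashSet<>();
    for (final int[] row : rows) {
      ids.add(row[0]);
    }
    return new ArrayList<>(ids);
  }
}
```

With hypothetical jobs 3, 2, and 1 (newest first) having 2, 3, and 1 attempts, a limit of 2 and offset of 1 gives jobs [2, 1] with all four of their attempts in the fixed order, while the buggy order returns only two joined rows that touch jobs [3, 2].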
How
This PR fixes the bug by changing this query to apply the limit and offset to the jobs table only, in a subquery, the result of which is then joined with the attempts table. This results in the expected behavior of limit and offset being applied at the job record level.
To accomplish this, I had to move the old BASE_JOB_SELECT_AND_JOIN value into a new method jobSelectAndJoin() that takes in the string to use as the job subquery. The paginated listJobs() method can then use this method and pass in its paginated jobs subquery. BASE_JOB_SELECT_AND_JOIN, by contrast, just passes in jobs as the subquery, so that the resultant string for that constant is actually the same as it was before this change, which I wanted because basically every other query in this class uses that constant.
I also updated the test for this method to properly check that pagination is being performed correctly even when jobs have multiple attempts.
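Roughly, the refactor has the shape sketched below. This is a simplified illustration rather than the actual code: the column list and join clause are shortened stand-ins for the real SQL, the config_type filter is omitted, the scope is shown as a ? bind parameter instead of being concatenated, and the class name and the paginatedJobsSubquery and listJobsQuery helpers are hypothetical. Only jobSelectAndJoin() and BASE_JOB_SELECT_AND_JOIN are names from this PR.

```java
// Simplified, hypothetical sketch of the refactor; the real SQL is longer.
class JobQuerySketch {

  // Builds the select-and-join prefix around whatever is used as the "jobs" relation.
  static String jobSelectAndJoin(final String jobsSubquery) {
    return "SELECT jobs.id AS job_id, attempts.attempt_number AS attempt_number "
        + "FROM " + jobsSubquery + " LEFT OUTER JOIN attempts ON jobs.id = attempts.job_id ";
  }

  // Unpaginated queries keep using the plain table, so this string is unchanged.
  static final String BASE_JOB_SELECT_AND_JOIN = jobSelectAndJoin("jobs");

  // LIMIT and OFFSET live inside the subquery, so they count job rows only.
  static String paginatedJobsSubquery(final int pagesize, final int offset) {
    return "(SELECT * FROM jobs WHERE jobs.scope = ? "
        + "ORDER BY jobs.created_at DESC, jobs.id DESC LIMIT " + pagesize + " OFFSET " + offset + ") AS jobs";
  }

  // The join runs over the already paginated jobs. The outer ORDER BY is an
  // assumption here: SQL does not guarantee that a join preserves subquery order.
  static String listJobsQuery(final int pagesize, final int offset) {
    return jobSelectAndJoin(paginatedJobsSubquery(pagesize, offset))
        + "ORDER BY jobs.created_at DESC, jobs.id DESC";
  }
}
```

Aliasing the subquery as jobs is what lets the rest of the select-and-join text stay identical for both the paginated and unpaginated cases.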
Recommended reading order
🚨 User Impact 🚨
The impact of this change is that the pagination behavior of the listJobs API endpoint and query should now work as expected, in that pagination is applied at the job level rather than at the (job, attempt) pair level.