-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Counts in background #7127
Counts in background #7127
Conversation
bdd8ea6
to
5118f28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah dangit - I think this is going to run up against @david-allison-1 having a strong preference for not pushing task-specific items into TaskData. I am sympathetic to that, I'd prefer TaskData stay not task-specific. @Arthur-Milchior is there a way to implement this passing the data through TaskData but without putting it specifically in to TaskData in the form of concrete types and task-specific methods?
It can be simply saved as an object, but I really don't see the point of loosing typing information. |
My objection is that each time we add an object, it gets harder and harder to refactor it into something reasonable, and we take the hit from allocating more fields on all tasks, including our hot paths. If we're using the internal object field, that makes it easier to turn into a generic, or But, if there's a good reason to use another field, then let's take the hit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Counts can now throw if it hits a path with database access.
Could we ensure that this is properly handled in the reviewer?
I am surprised that there is a hint. In java, this member is simply a pointer. There is no more reason to be worried about pointer than to be worried about ints. And I've not really seen any reason to state that we should avoid introducing more ints when needed. Casting (which require to check whether the cast is valid at all) is far more expensive CollectionTask extends a class which already deal well with Generic, so I don't see why we would want to use generic in TaskData at all in the first place. What is Result ? I don't understand your point about database access. |
d694006 introduced an exception into
Anything which is unnecessarily increasing the memory allocations that |
Current code in the reviewer already call Why would calling it in background introduce more risk than calling it directly from the reviewer ?
Updating the counts is also a part of answering a card. And casting seems a far more expensive operation. So avoiding a cast seems to save more time than avoiding a field. The remaining of the answer does not seems relevant for the current PR, so I'll leave it for when we discuss background tasks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work WRT the JavaScript API? Can a JS call to counts return a stale value?
The JS API send what is currently in the the FlashCardViewer. So you are right, while computation occurs in background, JS only have access to the stale number. Do you know whether calls to |
I do not want any sort of message passing or signaling infrastructure between the model (a card, and it's contents) and the viewers, that seems really overdone to me, for performance. Can we have at least some variant of the counts API just be reliable, such that if you call it....you get counts. Real counts ? |
I am lost once again. Currently, there is a problem, the card is computed in backgrounds, but the counts are computer in the UI thread. This means that when the counts are not trivially known (i.e. after undo, suspend, bury... or when you open a new deck), you have computation done in the UI thread. Worse, you are loosing time because you could display the card content sooner instead of waiting for counts to be computed. For complex deck hierarchy, it's a noticeable difference. Up until now, the answer has always been to start a collection task, and when the data is available, return it to the UI so it can display it. That's exactly what I'm using right now, the reviewer ask for the counts, a task is started to compute the counts, and when the task is done, it returns the counts to the UI. I can do exactly the same thing for javascript of course, when it asks for the number of new card, I create a task in background to compute it. So that it gets the real value. The only downside is that it means the function will takes some time to return if the user press "undo", because the counts of new card takes some time to compute. The creator of the javascript code may not expect it to take time. If we follow this idea, the impact is as follow:
|
I've added two new commits. Once ensure that if counts are asked to be recomputed twice at the same time, it still behaves properly (this is a risk since javascript will need the computation to occur in the UI thread) |
5118f28
to
ead4f4e
Compare
The most sensible (and simple) thing would be to block until the data is available, and make this explicit in the documentation. Realistically, we should only be talking milliseconds. |
If the numbers are not already computed and that the request trigger the computation of numbers, it may takes far more than miliseconds. If it never took more than miliseconds to compute the counts, I would not be doing this PR. |
@david-allison-1 what's the prescriptive advice here to get this one in? It looks like it might be close? |
I don't know, I still stand by my suggestion for a blocking function which is documented, but that's weak. The main aim is to avoid introducing bugs. Maybe a JS async wrapper over a blocking API (I don't do JavaScript, so don't know if this is possible), we haven't verified what blocking will do to the JavaScript. I'm out of my comfort zone for best practices here. |
I've got some JS knowledge and (from react-native-webview plus here) some experience with the actual WebView / JS interfaces. I'll take this and think on it then. |
ead4f4e
to
a8fc078
Compare
Rebased |
This makes a difference only during the fraction of second between the time where the card is displayed and the time where reviewer update its counts. This fraction of second only exists when couts needs to be recomputed
a8fc078
to
17a3459
Compare
Hello 👋, this PR has been opened for more than 2 months with no activity on it. If you think this is a mistake please comment and ping a maintainer to get this merged ASAP! Thanks for contributing! You have 7 days until this gets closed automatically |
This is still extremely relevant. We still do a lot of database work in foreground. So Clearly not to close |
This is going to take a long time for me to review again. Apologies, but I'll be looking at it over the next few days. BUG (not directly relevant to this change): Anki-Android/AnkiDroid/src/main/java/com/ichi2/libanki/sched/SchedV2.java Lines 225 to 228 in 67f0938
I'll fix this |
protected void _resetLrnCount(@Nullable CancelListener cancelListener) { | ||
// sub-day | ||
mLrnCount = mCol.getDb().queryScalar( | ||
int lrnCount = mCol.getDb().queryScalar( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implies a race condition. To my knowledge, CollectionTask
ensured that access to sched
was single-threaded.
Did this occur due to other UI-based interactions, or was this done as a precaution?
If this wasn't a precaution, it means there's the potential for race conditions inside the calling method chain (mHaveCounts
at the very least)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're entirely right, CollectionTask
ensures that a single instance is called at each time.
However, the JS interfaces ankiGetLrnCardCount
akiGetNewCardCount
and ankiGetRevCardCount
all require instant access to the number of cards. This currently works because showing the card content is blocked until the numbers are actually computed. My goal is too show card content before the number are computed, to win up to a second on big collection. This creates usually no problem because the numbers are shown in java on top of screen and JS usually do not use them.
This lead to the problem when js require access to those numbers when a card is created. There is two ways to act:
- return the current number that may be false
- block until the number are computed (I want to emphasize that it is what we are currently doing, except that we block even when those numbers are not used.)
The second case seems to make more sens to me, if we have a function, let it returns the right number.
So my last commit ensure that this method call counts
from scheduler. In general, this will return instantaneously, however, in the case where a reset was just done and the computation not yet redone, it may actually trigger the computation to be done immediately. Since the computation was already planned in background, there is a small risk of actual race condition. However, this method is idempotent (with this change), so it presents no danger if the computation is done twice simultaneously, it just will get slower than necessary when it happens.
An easy solution would be of course to ensure that counts
is synchronized
.
In this case, I can't use CollectionTask
in the java method, since I need to return immediately. If the @JavascriptInterface
can be made async, we could go back to CollectionTask
.
Clearly, it would be ideal as it would move a computation that require database access back to background.
@infinyte7 is it possible to use async/promise in the interface? Do you know if it would break anybody's add-on yet?
The downside is that it suddenly makes add-on code more complex, since users suddenly have to use async/await or promises.
I admit that the current PR is not entirely optimal, but I still believe it to be better than the current state of the code, since it now blocks only if the numbers are actually required and not available, instead of blocking even when the numbers are not useful and don't block anything without introduction the synchronized
cost
#11808 will supercede this with regards to speed, we're pretty sure |
Pull Request template
Purpose / Description
Running the profiler, I realized that one of the reason undo was still slow was that computing counts is done in foreground and not in background. So I moved it to background.
This is not a problem for normal review since counts are required to find the next card, so the only case where counts should actually be computed and can't be read from cache is when a card can be found without using
sched.getCard
, i.e. after undo.How Has This Been Tested?
Running it on my merge branch, reviewing, undoing, and seeing that the card's question appear almost immediately while the counts takes a few seconds to actually update.
Checklist
Please, go through these checks before submitting the PR.
if
statements)