-
Notifications
You must be signed in to change notification settings - Fork 520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added score calculations #3040
Added score calculations #3040
Conversation
@TheRealJessicaLi looking at the CI results, it seems there might be some issues with building & running the domain questions tests. Do they build & run for you locally? Also, I'll need to take a closer look at this tomorrow or Wendesday--apologies for the delay. |
for (questionMetric in questionSessionMetrics) { | ||
val totalHintsPenalty = questionMetric.numberOfHintsUsed * viewHintPenalty | ||
val totalWrongAnswerPenalty = | ||
(questionMetric.numberOfAnswersSubmitted - 1) * wrongAnswerPenalty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to ensure that this isn’t a negative value (I.e. numberOfAnswersSubmitted has to be greater than 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into QuestionAssessmentProgressController and it seems like we're only marking a question as completed once the user actually submits an answer for that question. Since we're not calculating scores until all the questions are completed, I don't think numberOfAnswersSubmitted would ever be 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might it be worth filtering the metrics on those that have at least 1 submitted answer? That might be simple with Kotlin:
for (questionMetric in questionSessionMetrics.filter { it.numberOfAnswersSubmitted > 0 }) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we'd want to filter those out because we would still want to include those questions in the total score. As an alternative, I could make sure the computed totalWrongAnswerPenalty
is non-negative?
this.totalScore.numerator += questionScore | ||
for (linkedSkillId in questionMetric.question.linkedSkillIdsList) { | ||
if (!scorePerSkillMapping.containsKey(linkedSkillId)) continue | ||
scorePerSkillMapping[linkedSkillId]!!.numerator += questionScore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There could be cases where the score per skill mapping has a numerator and a denominator as 0. Are we handling that case anywhere? Ideally it’s best to remove those entries lower in the stack so that when we actually divide the numerator and denominator we do not need to handle that scenario
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed! Thanks for catching that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I missed this..won't we still have cases where both the numerator and denominator can be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're removing any skills with a 0 numerator/denominator in the computeAll() method
/** Compute the overall score as well as the score and mastery per skill. */ | ||
internal fun computeAll(): UserAssessmentPerformance { | ||
calculateScores() | ||
// TODO: set up mastery calculations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you create an issue for this and add the issue number here? I.e. TODO(#issuenum)....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
domain/src/main/java/org/oppia/android/domain/question/QuestionAssessmentCalculation.kt
Outdated
Show resolved
Hide resolved
.../src/test/java/org/oppia/android/domain/question/QuestionAssessmentProgressControllerTest.kt
Show resolved
Hide resolved
Will take a pass after the latest changes are addressed. |
linted more linting
Just addressed Vinita's comments! |
Sorry, will need to look at this tomorrow (but maybe Monday). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @TheRealJessicaLi! Sorry for the super long delay here. Took a first pass (mostly just considering some of the high-level decisions before doing a closer review).
/** | ||
* Private mutable class that stores a grade as a fraction. | ||
*/ | ||
private class MutableFractionGrade(var numerator: BigDecimal, var denominator: BigDecimal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a high-level, why are we using BigDecimal? It seems like a pretty heavy weight solution, and I don't see the need here for perfect precision. What are we trying to mitigate with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was to avoid floating point precision errors when doing computations, but I just saw your comment about using integers so I'll try that instead
/** Compute the overall score as well as the score and mastery per skill. */ | ||
internal fun computeAll(): UserAssessmentPerformance { | ||
calculateScores() | ||
// TODO(#3067): create mastery calculations method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (for consistency):
// TODO(#3067): create mastery calculations method | |
// TODO(#3067): Create mastery calculations method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
@Provides | ||
@WrongAnswerPenalty | ||
fun provideWrongAnswerPenalty(): Double = 0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this & above: it feels a bit like the unit is missing here. Maybe 'WrongAnswerScorePenalty' would be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -82,14 +82,20 @@ message AnsweredQuestionOutcome { | |||
bool is_correct_answer = 2; | |||
} | |||
|
|||
// A user's grade for a practice session represented as a fraction. | |||
message FractionGrade { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit confusing since fractions can't be a real value divided by a real value. For what reason do we want to keep a numerator and denominator here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe Chantel wants to display the score for each skill as a fraction (so we need both numerator and denominator), and Oppia web is calculating scores as real values instead of integers, so I did the same for consistency. Do we want to multiply everything by a factor of 10 so we have integers instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a worthwhile simplification. I see the current implementation and I find it much easier to follow without the floating point bits.
That being said, a few thoughts:
- If you're multiplying everything by '10', we should make that '10' a Dagger constant since it's now a magic value (e.g. something like 'max points per question' or reuse the existing constant)
- 'numerator' and 'denominator' are still really unclear. Could we just use the same terms that are being used elsewhere? E.g. 'pointsReceived' & 'totalPointsAvailable', respectively?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on this--can these values be ints now that everything is being scaled? You could always perform the final decimal calculation in the UI and keep things ints at the domain layer to simplify management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept it as a double because I think Chantel may want to display the scores as a fraction in the UI (e.g. "you got score x / y"), so in that case I thought it would be better to calculate the final values to be consistent with the way Oppia web does it (which may not be integers). To do the calculation, we would also need access to the multiplier constant, which is in the domain module. I thought it made more sense to complete the calculations here before sending off the final score (ready for display) to the app module. Let me know what you think!
|
||
@Provides | ||
@ViewHintPenalty | ||
fun provideViewHintPenalty(): Double = 0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, do these need to be represented as doubles? We could alternatively employ a point system wherein max score is 10 points and penalties are 1 point each. That might let us keep integers everywhere along the computation path which would offer perfect precision, too, until the final division. Would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should work and it makes sense. I believe we went with a decimal system for web and so thats why we translated the same thing here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the implementation to use integers in the score calculation step under the hood (by multiplying everything by a factor of 10), but I'm returning doubles as the result (by dividing everything by 10 again) because Chantel said we may be displaying the scores in fraction form and I wanted to make sure the scores match the decimal system used for web in that case. Let me know what you think!
} | ||
} | ||
|
||
// TODO(#3067): set up finalMasteryPerSkillMapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO(#3067): set up finalMasteryPerSkillMapping | |
// TODO(#3067): Set up finalMasteryPerSkillMapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
@Provides | ||
@ViewHintPenalty | ||
fun provideViewHintPenalty(): Double = 0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should work and it makes sense. I believe we went with a decimal system for web and so thats why we translated the same thing here
this.totalScore.numerator += questionScore | ||
for (linkedSkillId in questionMetric.question.linkedSkillIdsList) { | ||
if (!scorePerSkillMapping.containsKey(linkedSkillId)) continue | ||
scorePerSkillMapping[linkedSkillId]!!.numerator += questionScore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I missed this..won't we still have cases where both the numerator and denominator can be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a bit longer on trying to simplify the calculation class that I intended. LMK what you think about the suggestion and then I will take a pass on the other parts of the PR.
private val scorePerSkillMapping: MutableMap<String, MutableFractionGrade> | ||
) { | ||
/** Calculate the user's overall score and score per skill for this practice session. */ | ||
private fun calculateScores() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are generally two expected ways to approach this problem (unless I'm missing some fundamental requirements):
- Each time a question is answered, determine what needs to be tracked & track that. Results are a filtered aggregate/combination of this data.
- Each time a question is answered, we record that it was answered. When results are needed, we calculate everything in that moment.
You're right that mutable state is needed now, but I think the choices allow flexibility in storing the mutable state either at the stats level (option 1) or at the event level (option 2). Option 1 is more performant when aggregating since we do some work ahead of time, and option 2 is more performant when recording. We should pick whichever option reflects typical access patterns, and which simplifies the implementation.
I think the current approach is more of option 2 in aggregation (everything is happening in compute) but using the storage of option 1. I don't quite see why we need to store MutableFractionGrade when it's only being changed/used in the compute function. It seems a bit like it's memoization, but unless compute is being called a lot that probably isn't needed.
Now, stepping back a moment. Given that the class no longer is storing state in Dagger (& thus has nothing to reset across sessions--we can just create a new one), and it's intended that computeAll only be computed at the end of the session, this additional caching seems a lot less necessary. Avoiding it actually results in a lot of simplifications, I think. One alternative that follows this pattern might be something like (which also conveniently avoids mutable state):
fun computePerformance(questionSessionMetrics: List<QuestionSessionMetrics>): UserAssessmentPerformance {
val allQuestionScores: List<Int> = questionSessionMetrics.map { it.computeQuestionScore() }
val totalPointsScored = allQuestionScores.sum()
val totalAvailablePoints = allQuestionScores.size * maxScorePerQuestion
// Note to Jessica: this also avoids needing to pass in the list to the class's constructor since we no longer need to initialize the list ahead of time. I think that results in a simpler API even though it comes at the expense of calling computeQuestionScore multiple times (thought that *could* be memoized if we found it to have performance issues).
val scoresPerSkill: Map<String, List<Int>> = questionSessionMetrics.flatMap { questionMetric ->
// Convert to List<Pair<String, QuestionSessionMetrics>>>
questionMetric.linkedSkillIdsList.map { skillId -> skillId to questionMetric}
}.groupBy(
// Covert to Map<String, List<QuestionSessionMetrics>> for each linked skill ID.
{ (skillId, _) -> skillId },
valueTransform = { (_, questionMetric) -> questionMetric }
).mapValues { (_, questionMetrics) ->
// Compute the question score for each question, converting type to Map<String, List<Int>>
questionMetrics.map { it.computeQuestionScore() }
}
// Assemble & return the proto, plus any other needed bookkeeping. Use the same sum()/size() * max to compute the total scores/available for each question metric.
}
private fun UserAssessmentPerformance.computeQuestionScore(): Int = if (!didViewSolution) {
val hintsPenalty = numberOfHintsUsed * viewHintPenalty
val wrongAnswerPenalty = getAdjustedWrongAnswerCount() * wrongAnswerPenalty
(maxScorePerQuestion - hintsPenalty - wrongAnswerPenalty).coerceAtLeast(0)
} else 0
private fun UserAssessmentPerformance.getAdjustedWrongAnswerCount(): Int =
(UserAssessmentPerformance - 1).coerceAtLeast(0)
(this is not complete & could be further split up if desired). https://stackoverflow.com/a/43614691 was a big help for the grouping part since I couldn't quite figure out how to map against the linked skill IDs.
WDYT?
I took a look at the suggestions you made and I agree that it makes things a lot cleaner and safer - I've updated QuestionAssessmentCalculation to reflect the changes. Thanks for taking the time to flesh out the pattern! I'll try to use something similar when I'm implementing mastery calculations to avoid mutable state as much as possible. |
Sorry, been really busy with non-work items this week. Will need to take a look later in the week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
pointsReceived = allQuestionScores.sum().toDouble() / internalScoreMultiplyFactor | ||
totalPointsAvailable = | ||
allQuestionScores.size.toDouble() * maxScorePerQuestion / internalScoreMultiplyFactor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to divide both the numerator and denominator of the FractionGrade by the factor? Wont the fraction stay equivalent regardless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might also be simpler to just document the multiplier in the other constants rather than factoring it in with math (since it's unlikely we need to tweak that without changing the other constants).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vinita @BenHenning My reasoning for dividing both the numerator and denominator by the multiplier was to make sure the final FractionGrade's numerator and denominator are consistent with the values that we would have gotten on Oppia web. Since Chantel said she may want to display the score as a fraction (e.g. "you got x / y points"), I thought it would be better to do that here so the app layer doesn't have to simplify it.
I made a separate constant for this multiplier because it's possible that the maxScorePerQuestionConstant can be different from the multiplier if the scoring system were to the change in the future. As an example right now, for mastery calculations, maxMasteryGainPerQuestion is 10, but we multiplied all the constants by a factor of 100 to maintain integer values. In the case that they are different (like for masteries), we would need to use both values to calculate the final scores/masteries. I thought keeping this multiplier separate would be good for consistency and also leave room for future changes so the math still checks out.
Let me know what you think, thanks!
pointsReceived = scores.sum().toDouble() / internalScoreMultiplyFactor | ||
totalPointsAvailable = | ||
scores.size.toDouble() * maxScorePerQuestion / internalScoreMultiplyFactor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @TheRealJessicaLi. Took a full pass on the PR & have a few comments. Note that I'm resetting my previous review status to unblock this PR since I'll be largely unavailable over the next couple weeks.
pointsReceived = allQuestionScores.sum().toDouble() / internalScoreMultiplyFactor | ||
totalPointsAvailable = | ||
allQuestionScores.size.toDouble() * maxScorePerQuestion / internalScoreMultiplyFactor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might also be simpler to just document the multiplier in the other constants rather than factoring it in with math (since it's unlikely we need to tweak that without changing the other constants).
allQuestionScores.size.toDouble() * maxScorePerQuestion / internalScoreMultiplyFactor | ||
}.build() | ||
|
||
private fun QuestionSessionMetrics.computeQuestionScore(): Int = if (!didViewSolution) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: suggest moving these below since they're separate from the private helpers of this class (they sort of belong to both this class & QuestionSessionMetrics as extension functions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
@@ -56,6 +57,7 @@ class QuestionAssessmentProgressController @Inject constructor( | |||
|
|||
private val progress = QuestionAssessmentProgress() | |||
private val progressLock = ReentrantLock() | |||
@Inject internal lateinit var scoreCalculatorFactory: QuestionAssessmentCalculation.Factory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be internal or can it be private? Prefer minimizing access to values when possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I make the field private, Dagger complains and says "error: Dagger does not support injection into private fields", so that's why I made it internal. Do you know any workarounds for this?
return dataProviders.createInMemoryDataProviderAsync( | ||
"user_assessment_performance" | ||
) { | ||
(this::retrieveUserAssessmentPerformanceAsync)(skillIdList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't this just be retrieveUserAssessmentPerformanceAsync(skillIdList)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
@@ -11,6 +11,18 @@ annotation class QuestionCountPerTrainingSession | |||
@Qualifier | |||
annotation class QuestionTrainingSeed | |||
|
|||
@Qualifier | |||
annotation class ViewHintScorePenalty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each of these qualifiers should have documentation to explain what they correspond to/how to use them. E.g.:
/** Qualifier corresponding to the penalty users receive for each hint viewed in a practice session (an application-level injectable int). */
@Qualifier annotation class ViewHintScorePenalty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks!
assertThat(hintAndSolution.correctAnswer.correctAnswer) | ||
.isEqualTo("3.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will these fit on the same line? If so, prefer that (in general, prefer to fill up as much line space as possible before line wrapping).
Ditto elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -994,6 +1003,515 @@ class QuestionAssessmentProgressControllerTest { | |||
assertThat(updatedState.ephemeralState.state.interaction.solution.solutionIsRevealed).isTrue() | |||
} | |||
|
|||
@Test | |||
fun testRevealedSolution_forWrongAnswer_returnScore2OutOf3() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here & below: I suggest factoring high-level actions & their sanity checks into different helper methods to only keep the high-level details relevant to the behavior being tested in the method body. For example, with refactoring this test could instead be:
fun testRevealedSolution_forWrongAnswer_returnScore2OutOf3() {
// Arrange
setUpTestApplicationWithSeed(questionSeed = 0)
subscribeToCurrentQuestionToAllowSessionToLoad()
startTrainingSession(TEST_SKILL_ID_LIST_2)
submitWrongAnswerForQuestion2() // NOTE TO JESSICA: this could also perform verification as a sanity check
submitSolutionIsRevealed()
submitCorrectAnswerForQuestion2()
// Act
val grade = getExpectedSuccessfulGrade()
// Assert
assertThat(grade.numerator).isEqualTo(2)
assertThat(grade.denominator).isEqualTo(3)
}
Notice how this can help focus the test body just to the details that the reader needs to see & hide the boilerplate to make it work. This is an easier test to verify, and reduces a lot of the overall code that we need to maintain for all of the tests being added.
I suggest making this or a similar clean up for all new tests here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored, thanks!
@@ -82,14 +82,20 @@ message AnsweredQuestionOutcome { | |||
bool is_correct_answer = 2; | |||
} | |||
|
|||
// A user's grade for a practice session represented as a fraction. | |||
message FractionGrade { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on this--can these values be ints now that everything is being scaled? You could always perform the final decimal calculation in the UI and keep things ints at the domain layer to simplify management.
Dismissing to unblock the PR since I'll be unavailable for reviews over the next 2 weeks.
@TheRealJessicaLi Assign me this PR once you have addressed Ben's and Vinita's comments so that I can cross verify them in my full PR pass. Thanks. |
I won't be able to take a look at this until I return on or after the 24th of May but I'm happy to defer to @vinitamurthi for domain layer design decisions. |
I will take a look once again to ensure Ben's comments are addressed, but otherwise the approach LGTM. @rt4914 can you please take a look? We are blocked on your codeowner review |
LGTM from my side, @rt4914 PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for codeowners. Thanks @TheRealJessicaLi
Explanation
Added score calculations (with testing) for practice sessions. Mastery calculations to be implemented in a separate PR.
Checklist