Experiment with "stat distance" #9508

Closed
wants to merge 3 commits into from


bhollis
Contributor

@bhollis bhollis commented May 31, 2023

This is an idea I had in response to the idea of "stat clusters". To compare armor against a specific piece, we could use Euclidean distance (vector distance) to judge the "similarity" of stat rolls within the 6-dimensional stat space. This could then be used in Compare or Triage to show "similar stat armor" given some cutoff. It works pretty well at lower distances, though note that it can't say anything about "better" or "worse".
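As a rough illustration of the idea (not the code in this PR), the distance and cutoff could be sketched as below. The stat order and the function names `euclideanDistance` and `similarStatArmor` are assumptions made up for this example.

```typescript
// Assumed stat order for illustration: [MOB, RES, REC, DIS, INT, STR].
type StatVector = readonly number[];

// Euclidean (L2) distance between two stat rolls of equal length.
function euclideanDistance(a: StatVector, b: StatVector): number {
  if (a.length !== b.length) {
    throw new Error('stat vectors must have the same length');
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// Keeps the candidates within `cutoff` of the reference piece, closest first.
function similarStatArmor(
  reference: StatVector,
  candidates: StatVector[],
  cutoff: number
): StatVector[] {
  return candidates
    .map((stats) => ({ stats, dist: euclideanDistance(reference, stats) }))
    .filter((entry) => entry.dist <= cutoff)
    .sort((x, y) => x.dist - y.dist)
    .map((entry) => entry.stats);
}
```

The result is ordered only by closeness, so a piece that is strictly worse than the reference can rank as highly as one that is strictly better.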

Of course, this would be a challenge to communicate / visualize.

Screenshot 2023-05-31 at 11 04 26 AM

@bhollis bhollis requested review from robojumper and nev-r May 31, 2023 18:07
@robojumper
Member

Unrefined thoughts:

  • I get why Euclidean distance is the better choice for "similarity", but perhaps Manhattan distance / L^1 would be simpler and easier to explain?
  • I feel like this is a good way to find similar armor in response to a new armor drop (especially when included in Triage), but doesn't really address the "clustering" part for looking at your whole vault since the hard part is finding a centroid / comparison piece.
  • Maybe some integration with custom stats? Would be difficult in Compare but maybe simpler in Triage.
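
For reference, the Manhattan (L^1) alternative from the first bullet is just the sum of the per-stat differences. A minimal sketch, with `manhattanDistance` being a made-up name rather than anything in DIM:

```typescript
// Manhattan (L1) distance: the total number of stat points by which two rolls differ.
function manhattanDistance(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error('stat vectors must have the same length');
  }
  let total = 0;
  for (let i = 0; i < a.length; i++) {
    total += Math.abs(a[i] - b[i]);
  }
  return total;
}
```

This could be explained to users as "these two pieces differ by N stat points in total", which is arguably easier to communicate than a square root of squared differences.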

Not really related to "stat distance" in particular but still

  • I think what people really want is an answer to "which armor pieces should I get rid of while still having access to the best builds?", and outside of is:statlower, this is just not something that DIM can answer without looking at all 5 slots and the loadouts the user cares about. Two pieces can be very similar but that doesn't mean you can substitute one with the other, and at the end of the day you do want to keep a bunch of armor, perhaps even similar armor, around so that the law of large numbers can do its thing and there are some combinations that end up producing nice round stat numbers.

@JagetYohan

JagetYohan commented Jun 9, 2023

If I may, though it may sound odd coming from a biostatistician: as I understand it, Euclidean distance won't meet your expectations here, as useful as it is in regression, because a vector with a very similar profile (an armor stat set with, say, -2 on each stat) can be at the same distance from your control vector (the item you are comparing against) as a very different one (say, with a random ±2 on each stat).
Example:
control: {10,10,10}
vector1: {8,8,8}
vector2: {12,8,12}

Since you're into similarity and geometry, I'd rather look at the "orientation" of the vectors, which makes the scalar/dot product relevant, or rather cos(control, vector). If it is close to ±1, your armor items are similar; if it is close to 0, their strong spots are opposite. If the numbers were real instead of integers, two items could even be collinear with 2 different totals; but those are integers, so in practice you'll never reach 1 (except if all the stats are the same, of course).
However, this does not address "in which stats are they similar"; that would need another parameter.
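
To make the example above concrete, here is a small sketch that computes both measures for the three vectors. The helper names are made up for illustration.

```typescript
type Vec = readonly number[];

// Dot product of two equal-length vectors.
function dotProduct(a: Vec, b: Vec): number {
  return a.reduce((total, value, i) => total + value * b[i], 0);
}

// Cosine of the angle between two vectors; returns 0 if either vector is all zeros.
function cosineSimilarity(a: Vec, b: Vec): number {
  const magnitudes = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  return magnitudes === 0 ? 0 : dotProduct(a, b) / magnitudes;
}

// Euclidean distance, for comparison.
function euclideanDistance(a: Vec, b: Vec): number {
  return Math.sqrt(a.reduce((total, value, i) => total + (value - b[i]) ** 2, 0));
}

const controlRoll = [10, 10, 10];
const uniformShift = [8, 8, 8]; // vector1: same profile, -2 on each stat
const mixedShift = [12, 8, 12]; // vector2: ±2 on each stat

// Both rolls are sqrt(12) ≈ 3.46 away from the control.
console.log(euclideanDistance(controlRoll, uniformShift));
console.log(euclideanDistance(controlRoll, mixedShift));

// The cosine tells them apart: ≈ 1 for vector1, ≈ 0.985 for vector2.
console.log(cosineSimilarity(controlRoll, uniformShift));
console.log(cosineSimilarity(controlRoll, mixedShift));
```

Euclidean distance cannot separate the two rolls, while the cosine ranks the uniformly shifted one as the closer profile.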

I understand how one could intuitively be seduced by correlation-based approaches to similarity here (see Principal Component Analysis, for example); the problem is that the stats are ABSOLUTELY uncorrelated with one another. If you plot a PCA, the axes will be defined by the specifics of your vault; they won't have any intelligible meaning.

Then again, I'm a biologist before a mathematician.

@robojumper
Member

The dot product argument makes sense if you assume that you'll be comparing armor pieces with very different stat totals, but I feel like this feature is meant for people who have a lot of armor that's good on paper but can't decide what to get rid of, after already dismantling all the low-stat armor (<60 is considered VERY low by the community). In @bhollis' screenshot all the relevant armor is in the [61, 64] bracket, and I have a vault policy of only keeping legendaries in the [65, 68] bracket, so a dot product and a distance metric would probably still serve a very similar purpose. And at that point the better computation might be the one that's easier to explain.

I agree that PCA is not useful here, but not because armor stats are uncorrelated. In fact, some armor stats are heavily correlated, but we don't need a PCA to rediscover that for a base 68 legendary piece, MOB+RES+REC = DIS+INT+STR = 34.
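
The constraint can be checked mechanically. The sketch below assumes a stat order of [MOB, RES, REC, DIS, INT, STR] and uses a made-up roll; `statGroupTotals` is a hypothetical helper, not a DIM function.

```typescript
// Assumed stat order for this sketch: [MOB, RES, REC, DIS, INT, STR].
// Returns [MOB+RES+REC, DIS+INT+STR].
function statGroupTotals(stats: readonly number[]): [number, number] {
  if (stats.length !== 6) {
    throw new Error('expected exactly six stats');
  }
  const sum = (values: readonly number[]) =>
    values.reduce((total, value) => total + value, 0);
  return [sum(stats.slice(0, 3)), sum(stats.slice(3, 6))];
}

// Made-up roll, chosen only so that both groups add up to 34.
const exampleRoll = [2, 16, 16, 10, 20, 4];
const [firstGroupTotal, secondGroupTotal] = statGroupTotals(exampleRoll);
console.log(firstGroupTotal, secondGroupTotal); // 34 34
```

With each group total fixed, a point gained in one stat has to come out of another stat in the same group, which is where the correlation comes from.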

@bhollis
Contributor Author

bhollis commented Jul 29, 2023

Cosine similarity both fits the intention of the feature better and is easier to display as a "similarity percentage" (even though it's not really a percentage). With a carefully chosen threshold we could present a "Similar Stat Profile" button in Compare/Triage. Euclidean distance can be mixed in a bit to get more of an ordering.
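
One possible reading of this, sketched under assumptions: filter by a cosine threshold, show the rounded cosine as the "percentage", and use Euclidean distance only to order pieces that land on the same percentage. The names and the threshold value are made up for illustration, and this is not the code from the PR.

```typescript
type Stats = readonly number[];

// All helpers assume both vectors have the same length.
function dot(a: Stats, b: Stats): number {
  return a.reduce((total, value, i) => total + value * b[i], 0);
}

function cosineSimilarity(a: Stats, b: Stats): number {
  const magnitudes = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return magnitudes === 0 ? 0 : dot(a, b) / magnitudes;
}

function euclideanDistance(a: Stats, b: Stats): number {
  return Math.sqrt(a.reduce((total, value, i) => total + (value - b[i]) ** 2, 0));
}

interface SimilarPiece {
  stats: Stats;
  similarityPercent: number; // rounded cosine * 100, for display only
  distance: number; // Euclidean distance, used as a tie-breaker
}

// Keeps candidates whose cosine is at least `minCosine`, ordered by displayed
// percentage (highest first) and then by distance (closest first).
function similarStatProfiles(
  reference: Stats,
  candidates: Stats[],
  minCosine: number
): SimilarPiece[] {
  return candidates
    .map((stats) => ({
      stats,
      cosine: cosineSimilarity(reference, stats),
      distance: euclideanDistance(reference, stats),
    }))
    .filter((piece) => piece.cosine >= minCosine)
    .map(({ stats, cosine, distance }) => ({
      stats,
      similarityPercent: Math.round(cosine * 100),
      distance,
    }))
    .sort((x, y) => y.similarityPercent - x.similarityPercent || x.distance - y.distance);
}
```

Rounding before sorting is a deliberate choice in this sketch: raw cosines almost never tie exactly, so without it the distance would never influence the order.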

@bhollis bhollis closed this Aug 31, 2024
3 participants