Experiment with "stat distance" #9508
Unrefined thoughts:
Not really related to "stat distance" in particular but still
If I may, though it may sound odd coming from a biostatistician: from my understanding, Euclidean distance won't meet your expectations here, as useful as it is in regression matters. A vector with a rather similar profile (an armor stat set with, say, -2 on each stat) might have the same distance to your control vector (the item you want to compare against) as a very different one (say, with random ±2). Since you're into similarity and geometry, I'd rather look at the "orientation" of the vectors, which makes the scalar/dot product relevant, or rather cos(control, vector). If it is close to 1, your armor items are similar; if it is close to 0, their strong spots are opposite. If the numbers were real instead of integers, two vectors could even be collinear with 2 different totals; but those are integers, so in practice you'll almost never reach exactly 1 (except if all the stats are the same, of course).

I understand how one could intuitively be seduced by correlation approaches regarding similarity here (see Principal Component Analysis, for example), the problem being that the stats are ABSOLUTELY uncorrelated to one another. If you plot a PCA, the axes will be defined by the specifics of your vault; they won't have any intelligible meaning. Then again, I'm a biologist before a mathematician.
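To make the comparison in this comment concrete, here is a minimal sketch of both metrics. The stat values are made up for illustration (ordered MOB, RES, REC, DIS, INT, STR), and the function names are hypothetical rather than anything from the project's code.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two stat vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two stat vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical rolls, ordered MOB, RES, REC, DIS, INT, STR.
control = [20, 10, 4, 16, 12, 6]
same_profile = [18, 8, 2, 14, 10, 4]   # -2 on every stat
mixed_profile = [22, 8, 6, 14, 14, 4]  # +2 or -2 on each stat

for name, item in [("same_profile", same_profile), ("mixed_profile", mixed_profile)]:
    dist = euclidean_distance(control, item)
    cos = cosine_similarity(control, item)
    print(f"{name}: distance={dist:.3f} cosine={cos:.4f}")
```

Both candidates sit at the same Euclidean distance from the control (the square root of 24), while the cosine is slightly higher for the roll that keeps the same shape (roughly 0.997 versus 0.988). With stat totals this close together the gap between the two cosines is small.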
The dot product argument makes sense if you assume that you'll be comparing armor pieces with very different stat totals, but I feel like this feature is meant for people who have a lot of armor that's good on paper but can't decide what to get rid of, after dismantling all the low-stat armor already (<60 is considered VERY low by the community). In @bhollis' screenshot all the relevant armor is in the [61, 64] bracket, and I have a vault policy of only keeping legendaries in the [65, 68] bracket, so a dot product and a distance metric would probably still serve a very similar purpose. And at that point the better computation might be the one that's easier to explain. I agree that PCA is not useful here, but not because armor stats are uncorrelated. In fact, some armor stats are heavily correlated, but we don't need a PCA to rediscover that for a base 68 legendary piece, MOB+RES+REC = DIS+INT+STR = 34.
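The group-sum constraint mentioned at the end of this comment can be checked in a couple of lines. This is only a sketch: the roll is made up and `group_sums` is a hypothetical helper.

```python
def group_sums(stats):
    """Return (MOB+RES+REC, DIS+INT+STR) for a six-stat roll."""
    mob, res, rec, dis, intellect, strength = stats
    return mob + res + rec, dis + intellect + strength

# Hypothetical base 68 roll, ordered MOB, RES, REC, DIS, INT, STR.
roll = [20, 10, 4, 16, 12, 6]
print(group_sums(roll))
```

When both group sums are fixed, any two stats in a group determine the third, so such a roll has only four free values out of six.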
Cosine similarity both hits the intention of the feature better and is easier to display as a "similarity percentage" (even though it's not really a percentage). With a carefully chosen threshold we could present a "Similar Stat Profile" button in compare/triage. Euclidean distance can be mixed in a bit to get some more of an order.
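One possible reading of this suggestion, as a sketch only: filter by a cosine threshold, show the cosine as a rounded "percentage", and use Euclidean distance to order items that land on the same percentage. The threshold of 0.98, the item names, the stat values, and the function names are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two stat vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def similar_stat_profiles(control, candidates, min_cosine=0.98):
    """Return (name, percent) pairs at or above the threshold, most similar first.

    Items with the same rounded percentage are ordered by Euclidean
    distance to the control, nearest first.
    """
    scored = []
    for name, stats in candidates.items():
        cos = cosine_similarity(control, stats)
        if cos >= min_cosine:
            scored.append((name, round(cos * 100), math.dist(control, stats)))
    # Highest percentage first, then smallest distance.
    scored.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(name, percent) for name, percent, _ in scored]

# Hypothetical rolls, ordered MOB, RES, REC, DIS, INT, STR.
control = [20, 10, 4, 16, 12, 6]
vault = {
    "mixed": [22, 8, 6, 14, 14, 4],       # +2 or -2 on each stat
    "half_scale": [10, 5, 2, 8, 6, 3],    # same direction, half the total
    "opposite": [4, 10, 20, 6, 12, 16],   # strong where the control is weak
    "same_shape": [18, 8, 2, 14, 10, 4],  # -2 on every stat
}
print(similar_stat_profiles(control, vault))
```

In this made-up vault, `same_shape` and `half_scale` both round to 100, and the distance term puts the roll with the closer total first. `opposite` falls below the threshold and is dropped.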
This is an idea I had in response to the idea of "stat clusters". For comparing armor against a specific piece, we could use Euclidean distance (vector distance) to judge the "similarity" of stat rolls within the 6-dimensional stat space. This could then be used in Compare or Triage to show "similar stat armor" given some cutoff. It works pretty well at lower distances, though it should be noted that it can't say anything about "better" or "worse".
Of course, this would be a challenge to communicate / visualize.
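A minimal sketch of the cutoff idea described above, assuming stats are held as six-element lists ordered MOB, RES, REC, DIS, INT, STR. The cutoff of 5.0, the item names, and the stat values are placeholders, not values taken from this PR.

```python
import math

def similar_by_distance(control, candidates, cutoff=5.0):
    """Return (name, distance) pairs within `cutoff` of the control roll, nearest first."""
    scored = [(name, math.dist(control, stats)) for name, stats in candidates.items()]
    within = [(name, dist) for name, dist in scored if dist <= cutoff]
    return sorted(within, key=lambda pair: pair[1])

# Hypothetical rolls in the 6-dimensional stat space.
control = [20, 10, 4, 16, 12, 6]
vault = {
    "far": [4, 10, 20, 6, 12, 16],
    "mid": [22, 8, 6, 14, 14, 4],
    "near": [19, 11, 4, 16, 13, 5],
}
for name, dist in similar_by_distance(control, vault):
    print(f"{name}: {dist:.2f}")
```

The distance is symmetric, so a piece that is 2 points higher in a stat and one that is 2 points lower score the same, which is why the result cannot say anything about "better" or "worse".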