FIX: bugs in evaluator #590
…eased the robustness of data.utils.data_preparation, which can divide the dataset into two parts (train, test) or three parts (train, valid, test).
…dle empty dataset now.
FEA: Add config['benchmark_filename'] to load pre-split dataset.
FEA: Increased the robustness of trainer.evaluate && bug fix in GeneralFullDataLoader.
FIX: optimize the update_attentive_A function in KGAT
@guijiql Add some test cases in the |
recbole/evaluator/proxy_evaluator.py
for metrics, evaluator in metric_eval_bind:
    used_metrics = list(metrics_set.intersection(set(metrics.keys())))
    used_metrics = [metric for metric in metrics_list if metric in metrics.keys()]
in metrics.keys() -> in metrics
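A minimal sketch of the suggestion above, not the PR's actual code: membership tests on a dict already check its keys, so `.keys()` is redundant, and a list comprehension keeps the configured order where a set intersection does not guarantee one. The metric names and the `select_metrics` helper are hypothetical and used only for illustration.

```python
# Hypothetical stand-ins for the configured metric list and an evaluator's metric dict.
metrics_list = ["recall", "mrr", "ndcg", "auc"]
topk_metrics = {"ndcg": None, "recall": None, "mrr": None}


def select_metrics(requested, available):
    """Return the requested metric names found in `available`, in request order."""
    # `m in available` checks dict keys directly; `.keys()` is not needed.
    return [m for m in requested if m in available]


used_metrics = select_metrics(metrics_list, topk_metrics)
print(used_metrics)  # ['recall', 'mrr', 'ndcg']
```

Because the result follows `metrics_list`, the order of metrics in any log output stays stable between runs.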
recbole/evaluator/metrics.py
all_with_pos = np.any(pos_len_list == 0)
all_with_neg = np.any(neg_len_list == 0)
non_zero_idx = np.full(len(user_len_list), True, dtype=np.bool)
if all_with_pos:
What does this mean? Why does all_with_pos mean np.any(pos_len_list == 0)? If pos_len_list = array([1, 2, 3]), then np.any(pos_len_list == 0) is False, so all_with_pos is False?
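The question can be checked directly. The sketch below is an illustration under assumed names (`has_user_without_pos`, `has_user_without_neg`, and `valid_user_mask` are hypothetical, not identifiers from the PR): `np.any(x == 0)` is True when at least one user has a zero count, so a name that says what the flag detects is easier to read. It also uses the builtin `bool` as the dtype, since the `np.bool` alias was deprecated in NumPy 1.20 and removed in 1.24.

```python
import numpy as np

# Per-user counts of positive and negative items (made-up example data).
pos_len_list = np.array([1, 2, 3])
neg_len_list = np.array([4, 0, 2])

# np.any(x == 0) answers "does at least one user have zero items?".
has_user_without_pos = bool(np.any(pos_len_list == 0))
has_user_without_neg = bool(np.any(neg_len_list == 0))

# Boolean mask of users that have both positive and negative items.
valid_user_mask = np.full(len(pos_len_list), True, dtype=bool)
valid_user_mask &= (pos_len_list > 0) & (neg_len_list > 0)

print(has_user_without_pos)  # False
print(has_user_without_neg)  # True
print(valid_user_mask)       # [ True False  True]
```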
tests/metrics/test_rank_metrics.py
from recbole.evaluator.metrics import metrics_dict


class TestCases(object):
Maybe checking pos_rank_sum alone can't ensure your metric is right. Please test the function evaluator.collect to ensure your pos_rank_sum is right. The reason to add the test is that calculating it requires complex logic.
1. Individual Evaluator can't raise NotImplementedError when used with eval_setting set to full.
2. Metrics may appear out of order in the log information.
3. GAUC is calculated incorrectly when items have the same predicted score for one user.
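Item 3 can be illustrated with a small rank-based AUC for a single user. This is a sketch under stated assumptions, not RecBole's implementation: `average_ranks`, `ordinal_ranks`, and `user_auc` are hypothetical helpers, and the scores are made up. With ordinal ranks, tied items get arbitrary distinct ranks, so the result depends on input order; with average ranks, each tied positive/negative pair counts as half a correct pair.

```python
import numpy as np


def average_ranks(scores):
    """1-based ascending ranks; tied scores share the mean of their rank positions."""
    scores = np.asarray(scores, dtype=float)
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    end = np.cumsum(counts)            # last rank occupied by each tie group
    start = end - counts               # number of items ranked below the group
    group_rank = (start + 1 + end) / 2.0
    return group_rank[inverse]


def ordinal_ranks(scores):
    """1-based ascending ranks that break ties by input position."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="stable")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks


def user_auc(scores, labels, rank_fn=average_ranks):
    """AUC for one user from the rank sum of its positive items."""
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    pos_rank_sum = rank_fn(scores)[labels].sum()
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


scores = [0.9, 0.5, 0.5, 0.1]  # one positive and one negative tie at 0.5
labels = [1, 1, 0, 0]

print(user_auc(scores, labels))                         # 0.875
print(user_auc(scores, labels, rank_fn=ordinal_ranks))  # 0.75
```

Counting pairs by hand gives the same 0.875: three positive/negative pairs are ordered correctly and the tied pair contributes 0.5, for 3.5 out of 4.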