-
Notifications
You must be signed in to change notification settings - Fork 620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEA: Add Interaction features of the history items for sequential dataloader #547
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! It looks really good.
There is only one small issue, which can be fixed easily (referring to the inline reviews below). After fixing, we'll be delightful to merge this PR. :D
As for the concerns about SEQ features:
- The max length of SEQ features has been stored in
Dataset.field2seqlen
, referring to docs about propertyfield2seqlen
and argseq_len
. - Maybe it's appropriate to be represented as a 3D tensor.
HOWEVER, as we are refactoring DataLoader
in our 0.2.x
branch, where the codes may be changed (but also accelerate a lot), we recommend this part remains NotImplemented temporarily. Thanks for your cautious!
@@ -57,6 +57,21 @@ def __init__(self, config, dataset, | |||
self.target_time_field = self.time_field | |||
self.item_list_length_field = config['ITEM_LIST_LENGTH_FIELD'] | |||
|
|||
for field in dataset.inter_feat: | |||
if field != self.iid_field and field != self.time_field: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If field == self.uid_field
, the corresponding attribute f'{field}_list_field'
should not be setted.
Maybe this line can be replaced by:
if field not in [self.uid_field, self.iid_field, self.time_field]:
@@ -136,6 +151,18 @@ def augmentation(self, uid_list, item_list_index, target_index, item_list_length | |||
for field in self.dataset.inter_feat: | |||
if field != self.iid_field and field != self.time_field: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the aforementioned review.
I agree that 3D tensor could be better, yet looking through the function _dict_to_interaction, Probably, some new FeatureType need to be registered, and then to be applied. Having said that, I may point out that the current bottleneck is we cannot easily incorporate the customized dataloader, and datasampler, since they all reply on the ModelType, which I think is not flexible. I am wondering if Recbole would refactorize the API to adapt to customized essential modules in near future? |
Thanks for pointing this issue out! We are considering refactoring APIs about DataLoader/Sampler selection to make it easier to incorporate custom modules. |
I have added the function as we discussed, and confirm that the feature loaded is correct.
The only issue is, I have not decided how to add SEQ features of the history items, say a number of comments for several movies in the Moview-Len. I am not sure about two aspects: