SL: Debugging _adapt_offline() method #746

Merged: 9 commits merged into main on May 4, 2023

Conversation

@steveyuwono (Collaborator)

Description

Background

Checklists:

  • Read the CONTRIBUTION guide (required).
  • Update the history on the source code (required).

@steveyuwono added the labels bug (Something isn't working), SL (Supervised Learning), next release, v1.2.0 Extension Hub on May 3, 2023
@steveyuwono self-assigned this on May 3, 2023
@steveyuwono linked an issue on May 3, 2023 that may be closed by this pull request
@steveyuwono (Collaborator, Author)

Hi @laxmikantbaheti, thank you once again for noticing the mistakes in MLPro-SL and explaining them in #744. I have updated the sampling and _adapt_offline methods accordingly. Could you review this and let me know if something is still not correct? Otherwise, I will merge this into the main branch.

@laxmikantbaheti (Contributor)

Hi @steveyuwono, thanks for making the changes. Sure, I will review and let you know.

@laxmikantbaheti (Contributor) commented May 3, 2023

Hi @steveyuwono, I think here you are fetching data with the indices from the sampler and storing those specific entries in the input and output keys of the trainer and tester dictionaries. That works, but I don't think you need to implement that entire logic: since you have already provided the sampler to the dataloader, you can simply iterate over it, for example:

trainer["input"] = next(iter(trainer_loader_input))

This returns a tensor of all the tensors at the indices given by the sampler.

However, the current implementation would still only return the first batch of data. In the case of howto_mbrl_gridworld or the default parameters this is fine, as the batch size is equal to the buffer size and thus there is only one batch. However, when the buffer is larger and the batch size is smaller than half the buffer size, there will be more than one batch. In that case, the _adapt_offline() method should iterate over all the batches in the buffer, optimize after each batch's forward pass, and then end.
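A minimal sketch of this behaviour with invented toy data (the dataset, sizes, and index list below are purely illustrative and not the actual MLPro-SL buffer):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, SubsetRandomSampler

# hypothetical toy buffer: 100 samples with 4 features each
data    = TensorDataset(torch.randn(100, 4))
sampler = SubsetRandomSampler(list(range(70)))             # e.g. the train indices
loader  = DataLoader(data, batch_size=20, sampler=sampler)

first_batch = next(iter(loader))   # returns ONLY the first batch of 20 samples
num_batches = len(loader)          # 4 batches are needed to cover all 70 indices
```

With a batch size equal to the buffer size there is a single batch and next(iter(loader)) happens to return everything; otherwise the remaining batches are never touched.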

So I suggest that the sampling method can be left as it was previously, and that in the _adapt_offline() method of the previous version, instead of doing

```python
def _adapt_offline(self, p_dataset:dict) -> bool:

    self._sl_model.train()

    outputs = self.forward(p_dataset["input"])
    loss    = self._calc_loss(outputs, p_dataset["output"])
    self._optimize(loss)

    return True
```

How about:

```python
def _adapt_offline(self, p_dataset:dict) -> bool:

    self._sl_model.train()

    input_batches  = iter(p_dataset["input"])
    output_batches = iter(p_dataset["output"])

    # iterate over every batch delivered by the dataloader
    for i in range(len(p_dataset["input"])):
        outputs = self.forward(next(input_batches))
        loss    = self._calc_loss(outputs, next(output_batches))
        self._optimize(loss)

    return True
```

This might be one of the options.
Thanks again for working on it. Let me know if I can be helpful.

@steveyuwono (Collaborator, Author)

Hi @laxmikantbaheti, previously I was using trainer["input"] = next(iter(trainer_loader_input)), but then I found out that the order of the indices is random. Thus, trainer["input"] and trainer["output"] can end up in different orders and no longer correspond to each other when the network's prediction losses are calculated. Therefore, I fixed the batch in the sampling method.

For instance, in the first sampling, the set train_indices is:
[53, 38, 10, 32, 73, 19, 27, 36, 80, 39, 65, 82, 78, 40, 85, 74, 81, 23, 34, 96, 94, 4, 75, 15, 91, 41, 51, 45, 99, 52, 26, 12, 43, 67, 24, 69, 77, 49, 21, 3, 30, 47, 83, 8, 60, 0, 98, 57, 22, 61, 63, 7, 9, 13, 68, 93, 14, 29, 28, 11, 84, 18, 20, 50, 25, 6, 71, 76, 1, 16]
The order of next(iter(trainer_loader_input)) is:
[82, 8, 37, 2, 49, 67, 83, 1, 66, 84, 13, 50, 29, 9, 90, 40, 31, 10, 38, 6, 86, 43, 17, 44, 12, 33, 81, 11, 45, 7, 52, 73, 80, 24, 0, 69, 46, 68, 92, 19, 47, 39, 65, 87, 27, 93, 41, 14, 48, 18, 25, 75, 21, 23, 26, 42, 70, 96, 74, 76, 20, 4, 35, 51, 53, 22, 54, 15, 28]
Meanwhile, the order of next(iter(trainer_loader_output)) is:
[66, 10, 1, 83, 82, 21, 12, 9, 6, 84, 39, 40, 18, 38, 43, 41, 50, 15, 27, 52, 86, 13, 68, 44, 32, 17, 65, 11, 73, 20, 69, 80, 45, 8, 24, 23, 75, 29, 92, 87, 81, 85, 34, 36, 49, 14, 28, 26, 0, 3, 93, 16, 46, 25, 7, 88, 22, 70, 47, 89, 42, 71, 51, 74, 48, 77, 67, 4, 53, 30]

However, I just found out that this issue can be solved with torch.manual_seed(). Hence, it can now be done in _adapt_offline. I also added p_num_iter as an input of _adapt_offline. Please check it once again and let me know. Thank you!
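A minimal sketch of the torch.manual_seed() idea, assuming two separately shuffled loaders over the same indices (the tensors, sizes, and helper below are invented for illustration, not the actual MLPro-SL code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, SubsetRandomSampler

inputs  = torch.randn(100, 4)
outputs = torch.randn(100, 2)
indices = list(range(70))

def make_loader(tensor):
    # with no explicit generator, the sampler draws its shuffle order
    # from torch's global RNG
    return DataLoader(TensorDataset(tensor), batch_size=20,
                      sampler=SubsetRandomSampler(indices))

torch.manual_seed(42)
input_batch  = next(iter(make_loader(inputs)))[0]
torch.manual_seed(42)
output_batch = next(iter(make_loader(outputs)))[0]
# seeded identically, both loaders visit the indices in the same order,
# so input_batch[i] and output_batch[i] refer to the same buffer entry
```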

@laxmikantbaheti (Contributor)

Ohh, now I understand.
This looks cool. Thanks.

Just one thing: if we call iter() on the dataloader at each call, it will reset and give only the first batch. Also, the number of iterations is currently provided externally; I think it should work differently. Say you have a buffer of size 100 and a batch size of 20, so 5 different batches. Each next() call should then give the next batch of 20, and _adapt_offline should end once all the batches have been processed.
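One possible shape for such a loop, sketched with the method names from the snippets quoted above (the zip-based pairing of the two dataloaders is illustrative, not the merged MLPro-SL code):

```python
def _adapt_offline(self, p_dataset: dict) -> bool:

    self._sl_model.train()

    # assuming p_dataset["input"] and p_dataset["output"] are dataloaders of
    # equal length, this consumes every batch in the buffer exactly once and
    # optimizes after each forward pass
    for input_batch, output_batch in zip(p_dataset["input"], p_dataset["output"]):
        outputs = self.forward(input_batch)
        loss    = self._calc_loss(outputs, output_batch)
        self._optimize(loss)

    return True
```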

@laxmikantbaheti (Contributor)

Thanks very much again for all the changes!!

@steveyuwono (Collaborator, Author) commented May 4, 2023

Hi @laxmikantbaheti, sorry I did not notice your message yesterday. I have fixed the iterations in _adapt_offline according to its batch size. I believe this works properly now. Shall I merge this into main?

Also, thank you for your review.

@laxmikantbaheti (Contributor)

Hi @steveyuwono, thank you. It works perfectly.

@steveyuwono merged commit de5e83f into main on May 4, 2023
@steveyuwono deleted the sl/torch/debugging branch on May 4, 2023 at 09:19
Labels: bug (Something isn't working), SL (Supervised Learning), v1.2.0 Extension Hub
Projects: None yet
Development: Successfully merging this pull request may close the issue "Bug: Handling of batch size in SL _adapt_offline() method"
2 participants