Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue#2878: Adds Siamese Network example #2882

Merged
merged 17 commits into from
Mar 30, 2023

Conversation

DeepC004
Copy link
Contributor

@DeepC004 DeepC004 commented Feb 22, 2023

Related to #2878

Description:
I have added an example on Siamese Networks in the examples folder along with requirements.txt and README.md.
I have tested the network locally with different parameters and set the default parameters to ones which perform the best.
Note: I am also working on Google Colab Notebook for the same and will add it to pytorch-ignite/examples soon.

Check list:

  • New tests are added (if a new feature is added)
  • New doc strings: description and/or example code are in RST format
  • Documentation is updated (if required)

@github-actions github-actions bot added the examples Examples label Feb 22, 2023
@DeepC004
Copy link
Contributor Author

@vfdev-5 please check the PR. I am unable to understand the origin of error. Is the code formatter adding the following line, ✨ 1 file would be formatted, 267 files already formatted ✨, by itself since it does not show up when formatter is run locally

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Feb 23, 2023

@DeepC004 sure, I'll check the PR as soon as possible. To reformat please do

bash ./tests/run_code_style.sh install
bash ./tests/run_code_style.sh fmt

@DeepC004
Copy link
Contributor Author

Thank you. I was running the code formatter in the tests folder earlier, hence the code formatter was not picking up all the files for formatting.

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Feb 23, 2023

Note: I am also working on Google Colab Notebook for the same and will add it to pytorch/examples soon.

@DeepC004 it should be added to https://github.com/pytorch-ignite/examples not pytorch/examples

See #2878 (comment)

Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @DeepC004 !
I left few comments on improving the code and also suggested few changes. Let me know what you think.

examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/README.md Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
@DeepC004 DeepC004 marked this pull request as draft February 24, 2023 06:51
@DeepC004 DeepC004 marked this pull request as ready for review February 25, 2023 23:45
@DeepC004
Copy link
Contributor Author

@vfdev-5 I have made the changes. May you please review them

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Feb 27, 2023

@DeepC004 Thanks ! Can you share here training results (logs, metrics), does model converge etc ?

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Feb 27, 2023

@DeepC004 I just run it locally and it is very very slow. Can you run some profiling to get an idea where is a bottleneck ?
Try this tool: https://pytorch-ignite.ai/how-to-guides/03-time-profiling/#handler-based-profiling-using-handlerstimeprofiler

@DeepC004
Copy link
Contributor Author

DeepC004 commented Feb 27, 2023

Yes, even I ran it locally before submitting and it took me a couple hours to complete the training (with cuda enabled). The loss drops down in ranges of 0.02 after the first epoch itself, and the average test loss came out to be around 0.00006 after 2 epochs of training. I will add time profiling to check for bottlenecks.

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Feb 27, 2023

Sounds good.

I will add time profiling to check for bottlenecks.

The point is not to add profiling into the code but just understand why GPU is almost idle during the run

@DeepC004
Copy link
Contributor Author

I trained the model for 2 epochs and these are the results. Around 98.5% of time is spent in data flow tasks. I tried optimizations like using num_workers and setting pin_memory to true when creating DataLoader object, but there isn't much improvement. Is there any workaround for this problem?
image

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Feb 27, 2023

Thanks for the update, @DeepC004 !
The bottleneck is with dataset and __getitem__ method most probably. Can you check if we could improve the code, maybe take a look how it is done in other repos.

examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
@DeepC004 DeepC004 marked this pull request as draft February 27, 2023 21:35
@DeepC004 DeepC004 marked this pull request as ready for review March 2, 2023 06:41
@DeepC004
Copy link
Contributor Author

DeepC004 commented Mar 2, 2023

@vfdev-5 I have made the changes. The run time is much better now!

Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DeepC004 I left few more comments.
By the way have you committed the changes, because I see a lot of unaddressed comments

examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
@DeepC004
Copy link
Contributor Author

DeepC004 commented Mar 3, 2023

@DeepC004 I left few more comments. By the way have you committed the changes, because I see a lot of unaddressed comments

Oh, I believe I commited the wrong file with incomplete changes. I will update it.

@DeepC004
Copy link
Contributor Author

@vfdev-5 May you please review the changes.

Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DeepC004 thanks for pinging. I left other comments to address

examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
examples/siamese_network/siamese_network.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few other nits and it will be good to go

Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @DeepC004
Next step is to integrate this example into CI.
Please, check this PR as example: #2899

@vfdev-5 vfdev-5 enabled auto-merge (squash) March 30, 2023 11:54
@vfdev-5 vfdev-5 disabled auto-merge March 30, 2023 12:46
@vfdev-5 vfdev-5 merged commit c1c9825 into pytorch:master Mar 30, 2023
@DeepC004
Copy link
Contributor Author

I will look into it. Thank you!

@DeepC004 DeepC004 deleted the deepc004/issue#2878 branch March 30, 2023 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
examples Examples
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants