What are the hardware requirements (graphics card, memory) for running this project? #9

Open
xjyp opened this issue Apr 9, 2023 · 7 comments
Labels
question Further information is requested

Comments

@xjyp

xjyp commented Apr 9, 2023

No description provided.

@chensong1995
Owner

Hello xjyp,

Thanks for your question! In our experiments, we train E-CIR using three Tesla V100-SXM2-32GB GPUs for approximately 100 hours (50 epochs). The batch_size is set to 96. With a smaller batch size, you should be able to train the model on most commercial GPUs. In my experience, you start to get an overall acceptable performance way before reaching the 50th epoch. I hope this helps! Let me know if you have further concerns.
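To put the figures above in perspective, here is a small back-of-the-envelope calculation. It only uses the numbers quoted in the reply; the even split of the batch across the three GPUs is an assumption, not something the reply states.

```python
# Figures quoted in the reply above.
num_gpus = 3
total_hours = 100
epochs = 50
batch_size = 96

gpu_hours = num_gpus * total_hours        # total GPU time consumed
hours_per_epoch = total_hours / epochs    # wall-clock time per epoch
per_gpu_batch = batch_size // num_gpus    # assumes an even split across GPUs

print(f"{gpu_hours} GPU-hours in total")
print(f"{hours_per_epoch:.1f} hours per epoch")
print(f"{per_gpu_batch} samples per GPU per step")
```

Under that assumption, a single smaller card would need a batch size well below the per-GPU share computed here, and each epoch would likely take longer than it does on three V100s.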

@chensong1995 chensong1995 added the question Further information is requested label Apr 9, 2023
@xjyp
Author

xjyp commented Apr 9, 2023

Hello chensong1995,

Thank you for your answer.

  1. I currently have only one RTX 3080 (10 GB) GPU and 40 GB of memory. Can I train the model on this setup?
  2. I see that you are using 3 GPUs. Are you using distributed training?

@chensong1995
Owner

Thanks for the follow-up!

  1. RTX 3080 (10GB) should be good enough for training if you shrink the batch size. I encourage you to also check out our latest work DeblurSR, which has lower computational requirements. With a batch size of 36, DeblurSR only needs about 72 hours and two Tesla V100-SXM2-32GB GPUs for 50 epochs.
  2. We use the DataParallel wrapper from PyTorch instead of DistributedDataParallel mainly because of its simplicity.

I hope this helps! Let me know if you have further concerns.
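For readers unfamiliar with the wrapper mentioned in point 2, the following is a minimal sketch of how `torch.nn.DataParallel` is typically applied. The tiny network and the batch size of 8 are placeholders chosen for illustration; they are not taken from the E-CIR code.

```python
import torch
import torch.nn as nn

# Placeholder network; it stands in for the real model, which is not shown here.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# DataParallel splits each input batch along dim 0 across the visible GPUs.
# With one GPU (or none) there is nothing to split, so the wrapper is skipped.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

batch_size = 8  # hypothetical value, picked to suit a smaller card
inputs = torch.randn(batch_size, 16, device=device)
outputs = model(inputs)
print(outputs.shape)
```

On a single RTX 3080 the wrapper is never applied in this sketch, so the only knob that matters is the batch size.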

@xjyp
Author

xjyp commented Apr 9, 2023

Okay, thank you very much for your patient answer. I will follow your work closely.

@chensong1995
Owner

Thanks for your follow-up! It really depends on what your goal is. Are you searching for an event-based motion deblurring model as an inference model in your application? Are you trying to develop another model that improves E-CIR? I will be in a better position to offer help if you can fill me in with more specifics. If you are hesitant to share the details of your project in public, you can also send me an email privately.

@chensong1995
Owner

Thanks for your reply! If you are developing a follow-up work in event-based motion deblurring, you do not have to run the entire training code. My suggestion is to download only train_0.hdf5, which will allow you to have an overall idea of how each component of our code works. If you change the number 16 to 0 on this line, the program will only load train_0.hdf5 as the training data. You may want to download val_0.hdf5 and val_1.hdf5 as well since they allow you to evaluate the model on the testing split. I hope this helps! Let me know if you need further assistance.
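When working with a partial download like the one suggested above, it can help to check which shards are actually on disk before starting a run. The helper below is a hypothetical sketch: `downloaded_shards` and the `data` directory name are not part of the repository, and only the `train_N.hdf5` / `val_N.hdf5` naming comes from the reply.

```python
from pathlib import Path


def downloaded_shards(data_dir, split="train"):
    """Return the shard files for `split` found in `data_dir`, ordered by index.

    Hypothetical helper, not part of the E-CIR code base.
    """
    prefix = f"{split}_"
    shards = []
    for path in Path(data_dir).glob(f"{prefix}*.hdf5"):
        index = path.stem[len(prefix):]
        if index.isdigit():  # ignore files that do not follow split_N.hdf5
            shards.append((int(index), path))
    # Sort numerically so that shard 10 comes after shard 2.
    return [path for _, path in sorted(shards)]


if __name__ == "__main__":
    # "data" is an assumed location; point this at your download folder.
    for split in ("train", "val"):
        names = [p.name for p in downloaded_shards("data", split)]
        print(split, names)
```

If only `train_0.hdf5`, `val_0.hdf5`, and `val_1.hdf5` were downloaded, the listing should show exactly those three files.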

@xjyp
Author

xjyp commented Apr 9, 2023

Thank you very much.
