Hello all. I am interested in using Lightning for my research project. However, I'm having trouble assessing whether my architecture is feasible in Lightning because of some particularities.
The typical train loop that Lightning abstracts looks like this:
for epoch in range(epochs):
    ...train code...
However, my structure looks more like this:
for task_number in range(number_of_tasks):
    dataloader = Dataloader(task=task_number)  # The dataloader is task dependent.
    if task_number == 0:
        for epoch in range(epochs):
            ...regular train code...
    else:
        for epoch in range(epochs):
            ...selective retraining...  # This uses PyTorch hooks to train only certain nodes by setting the other grads to 0
        model = split(model)  # Logic that may add new nodes to the model (size change); also trains the newly added nodes
        if loss > loss_threshold:
            model = dynamic_expansion(model)  # More logic that changes the model size and trains
As you can see, there are some challenges that don't easily translate to Lightning. First, there is the concept of tasks. Second, the loaders are task dependent (for example, the first task is a subset of MNIST and the second task is a different subset). Third, there is more complex task-dependent logic, which may change the model size and require the newly added nodes to be trained.
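For readers unfamiliar with the selective retraining step mentioned in the pseudocode, here is a minimal sketch of the gradient-hook idea in plain PyTorch. It assumes the "nodes" are output units of an `nn.Linear` layer; the helper `freeze_output_units` and the `trainable` mask are hypothetical names for illustration, not part of the original code or of Lightning.

```python
import torch
from torch import nn


def freeze_output_units(layer: nn.Linear, trainable: torch.Tensor):
    """Zero the gradients of every output unit not flagged in `trainable`.

    `trainable` is a boolean tensor of shape (out_features,).
    Returns the hook handles so the hooks can be removed later.
    (Hypothetical helper, for illustration only.)
    """
    mask = trainable.to(layer.weight.dtype)
    # weight has shape (out_features, in_features): one row per output unit
    h_w = layer.weight.register_hook(lambda grad: grad * mask.unsqueeze(1))
    h_b = layer.bias.register_hook(lambda grad: grad * mask)
    return [h_w, h_b]


torch.manual_seed(0)
layer = nn.Linear(4, 3)
# Only units 0 and 2 should be trained; unit 1 is frozen.
handles = freeze_output_units(layer, torch.tensor([True, False, True]))

loss = layer(torch.randn(8, 4)).pow(2).sum()
loss.backward()

# Call handle.remove() on each handle to make all units trainable again.
```

Note that a zero gradient only guarantees an unchanged weight if the optimizer applies no weight decay and carries no momentum for that parameter.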
I'm interested in using lightning, but I'm having trouble seeing how this arch could fit.
Thank you.
I solved a similar problem by creating a DataLoader that yields a combined batch containing one batch for each task, and writing the "task loop" inside the training step.
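A rough sketch of that idea, in plain Python so it stays framework-agnostic. The names `multi_task_batches`, `training_step`, and `loss_fns` are assumptions made for illustration; the commenter's actual code is not shown in the thread. Note that this trains on all tasks in every step, which is not the same as the strictly sequential task schedule in the question.

```python
from typing import Callable, Dict, Iterable, Iterator


def multi_task_batches(loaders: Dict[str, Iterable]) -> Iterator[Dict[str, object]]:
    """Yield one dict per step, holding one batch from every task's loader.

    Iteration stops at the shortest loader (zip semantics).
    """
    names = list(loaders)
    for batches in zip(*(loaders[name] for name in names)):
        yield dict(zip(names, batches))


def training_step(batch: Dict[str, object], loss_fns: Dict[str, Callable]):
    """The "task loop" inside a single step: sum the per-task losses."""
    total = 0
    for task, task_batch in batch.items():
        total += loss_fns[task](task_batch)
    return total


# Toy stand-ins: each "loader" is a list of batches, each "loss" a simple function.
loaders = {"task_a": [[1, 2], [3, 4]], "task_b": [[10], [20], [30]]}
loss_fns = {"task_a": sum, "task_b": lambda b: 2 * sum(b)}

losses = [training_step(b, loss_fns) for b in multi_task_batches(loaders)]
```

With real data, the per-task entries would be tensors and the returned total would be the loss handed back from the framework's training step.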
One solution is to have a single data loader, then parse the targets and outputs so that only the task at hand is taken into account (for example, one task per entry in a one-hot label). This could require some extra engineering. I will post here if I ever end up doing this.
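One possible reading of this suggestion, sketched in PyTorch: assume the model emits one logit per task and the label has one column per task, then compute the loss on the current task's column only. The function name `single_task_loss` and the `(batch, num_tasks)` layout are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def single_task_loss(logits: torch.Tensor, targets: torch.Tensor, task_index: int) -> torch.Tensor:
    """Binary cross-entropy on one task's column only.

    logits, targets: float tensors of shape (batch, num_tasks).
    Columns other than `task_index` do not contribute to the loss.
    """
    return F.binary_cross_entropy_with_logits(
        logits[:, task_index], targets[:, task_index]
    )


torch.manual_seed(0)
logits = torch.randn(5, 3, requires_grad=True)
targets = torch.randint(0, 2, (5, 3)).float()

loss = single_task_loss(logits, targets, task_index=1)
loss.backward()
# Only column 1 of logits.grad is non-zero; the other tasks receive no gradient.
```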