
model.eval() #2083

Closed
JaejinCho opened this issue Oct 15, 2022 · 4 comments · Fixed by #2400
Labels: docathon-h1-2023 (A label for the docathon in H1 2023), easy, intro

Comments

JaejinCho commented Oct 15, 2022

In the tutorial below, isn't it better to have model.eval() for more general cases in addition to the context manager torch.no_grad(), or at least to have a brief explanation of the difference between the two? I think torch.no_grad() does not take care of dropout or batch norm. Although omitting model.eval() is fine in this tutorial, it generally seems necessary for evaluation.

cc @suraj813 @jerryzh168 @z-a-f @vkuzo

svekars added the arch-optimization (quantization, sparsity, ns) label Oct 17, 2022
z-a-f (Contributor) commented Jan 10, 2023

@svekars I believe the arch-optimization label is for quantization/sparsity related topics, so it might not be applicable to this issue.

svekars added the intro label and removed the arch-optimization (quantization, sparsity, ns) label Mar 1, 2023
svekars added the easy and docathon-h1-2023 (A label for the docathon in H1 2023) labels May 31, 2023
zabboud (Contributor) commented May 31, 2023

/assigntome

zabboud (Contributor) commented May 31, 2023

@JaejinCho As you mentioned, model.eval() is important to ensure that dropout and batch normalization layers are set to evaluation mode. As per the documentation:

Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.

The role of torch.no_grad(), on the other hand, is to disable gradient calculation at inference time, when you are sure you will not call Tensor.backward(). torch.no_grad() also serves a second purpose: it reduces memory consumption for computations involving tensors that have requires_grad=True, because in effect every computation inside the context is performed as if requires_grad=False. For more details, see the documentation.

However, I would tend to agree that, for beginners learning best practices, it is better to use both model.eval() and with torch.no_grad(): the former ensures that batch norm and dropout layers are correctly set to eval mode, and the latter avoids unnecessary gradient computation and the associated memory consumption.
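For illustration, here is a minimal sketch of the difference (not part of the tutorial; the toy dropout model and tensor names below are made up for this example):

import torch
import torch.nn as nn

# Toy model with a dropout layer, used only to illustrate the two modes
model = nn.Sequential(nn.Linear(10, 10), nn.Dropout(p=0.5))
x = torch.randn(1, 10)

# torch.no_grad() alone: gradients are not tracked, but dropout is still active,
# so two forward passes on the same input will generally differ
with torch.no_grad():
    out1, out2 = model(x), model(x)
print(torch.allclose(out1, out2))  # usually False

# model.eval() switches dropout (and batch norm) to evaluation behavior,
# making inference deterministic; torch.no_grad() still skips gradient bookkeeping
model.eval()
with torch.no_grad():
    out1, out2 = model(x), model(x)
print(torch.allclose(out1, out2))  # True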

Do you think what is needed is updating the example with comments that clarify the use case of these two modes, and updating the example to include model.eval()?

zabboud (Contributor) commented May 31, 2023

@JaejinCho Would the following additions to the tutorial be sufficient?

Current test loop:

def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

Addition of comments and model.eval():

def test_loop(dataloader, model, loss_fn):
    # Set the model to evaluation mode - important for batch normalization and dropout layers
    # Unnecessary in this situation but added for best practices
    model.eval()
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0

    # Evaluating the model with torch.no_grad() ensures that no gradients are computed during test mode
    # also serves to reduce unnecessary gradient computations and memory usage for tensors with requires_grad=True
    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

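For completeness, the updated test_loop would be called exactly like the current one; a rough sketch, assuming the train_loop, dataloaders, model, loss_fn, and optimizer already defined earlier in the tutorial:

epochs = 10
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loop(train_dataloader, model, loss_fn, optimizer)
    test_loop(test_dataloader, model, loss_fn)
print("Done!")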
zabboud added a commit to zabboud/tutorials that referenced this issue Jun 1, 2023
zabboud mentioned this issue Jun 1, 2023
svekars pushed a commit that referenced this issue Jun 1, 2023
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
svekars pushed a commit that referenced this issue Jun 2, 2023
…ansforms.Normalize (#2405)

* Fixes #2083 - explain model.eval, torch.no_grad
* set norm to mean & std of CIFAR10 (#1818)

Co-authored-by: Svetlana Karslioglu <svekars@fb.com>