The provided AGP_Pruner example cannot run successfully. An error occurred: TypeError: unsupported operand type(s) for *: 'Tensor' and 'dict' #2035

yeliang2258 · 2020-02-11T13:00:30Z

The error is:

The first epoch can run, but the second epoch reports an error.

The code is from the sample main_torch_pruner.py (https://github.com/microsoft/nni/blob/master/examples/model_compress/main_torch_pruner.py)

Code:

from nni.compression.torch import AGP_Pruner
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms


class Mnist(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(1, 20, 5, 1)
        self.conv2 = torch.nn.Conv2d(20, 50, 5, 1)
        self.fc1 = torch.nn.Linear(4 * 4 * 50, 500)
        self.fc2 = torch.nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


def train(model, device, train_loader, optimizer):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('{:2.0f}% Loss {}'.format(100 * batch_idx / len(train_loader), loss.item()))


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # Sum the per-sample losses so the average below is over the whole dataset.
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)

    print('Loss: {}  Accuracy: {}%\n'.format(
        test_loss, 100 * correct / len(test_loader.dataset)))


def main():
    torch.manual_seed(0)
    device = torch.device('cpu')

    trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=True, download=True, transform=trans),
        batch_size=64, shuffle=True)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=False, transform=trans),
        batch_size=1000, shuffle=True)

    model = Mnist()
    model.to(device)

    # You can change this to LevelPruner to implement it:
    # pruner = LevelPruner(configure_list)
    configure_list = [{
        'initial_sparsity': 0,
        'final_sparsity': 0.8,
        'start_epoch': 0,
        'end_epoch': 10,
        'frequency': 1,
        'op_types': ['default']
    }]

    pruner = AGP_Pruner(model, configure_list)
    model = pruner.compress()

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
    for epoch in range(10):
        # Tell the pruner which epoch is starting so it can update its masks.
        pruner.update_epoch(epoch)
        print('# Epoch {} #'.format(epoch))
        train(model, device, train_loader, optimizer)
        test(model, device, test_loader)
    pruner.export_model('model.pth', 'mask.pth', 'model.onnx', [1, 1, 28, 28])


if __name__ == '__main__':
    main()

Thanks!
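For readers unfamiliar with the configure_list fields in the code above, here is a minimal sketch of the cubic sparsity schedule described by Zhu and Gupta (2017) for automated gradual pruning. The function name agp_target_sparsity is hypothetical, and NNI's own implementation may differ in details such as how frequency is rounded; this only illustrates how initial_sparsity, final_sparsity, start_epoch, end_epoch, and frequency could interact.

```python
def agp_target_sparsity(epoch, initial_sparsity=0.0, final_sparsity=0.8,
                        start_epoch=0, end_epoch=10, frequency=1):
    """Target sparsity at a given epoch under a cubic AGP-style schedule.

    Hypothetical helper for illustration; not NNI's actual code.
    """
    if epoch <= start_epoch:
        return initial_sparsity
    if epoch >= end_epoch:
        return final_sparsity
    # The target only changes every `frequency` epochs.
    steps_done = (epoch - start_epoch) // frequency
    progress = (steps_done * frequency) / (end_epoch - start_epoch)
    # s_t = s_f + (s_i - s_f) * (1 - progress)^3
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


if __name__ == "__main__":
    for epoch in range(11):
        print(epoch, round(agp_target_sparsity(epoch), 4))
```

With the values from the example config, the target rises quickly in the early epochs and flattens out as it approaches final_sparsity, which is the intended shape of the schedule.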

Cjkkkk · 2020-02-11T15:09:43Z

Hi @yeliang2258, thanks for bringing up this issue! Could you try using nni v1.4? This is fixed in the latest version.

yeliang2258 · 2020-02-12T06:52:37Z

Hello, AGP_Pruner works normally on cpu, but running on cuda reports an error; the error message is as follows. The code is the provided example, with cpu changed to cuda. Thanks!

Cjkkkk · 2020-02-12T07:35:00Z

Hi @yeliang2258, could you try adding model = model.to(device) after the line model = pruner.compress() and see if it works? It seems some buffers registered by the pruner are not transferred to cuda, which causes the error. Thanks!
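As background on why a call like model.to(device) matters for buffers, here is a small PyTorch sketch. The MaskedLinear class is hypothetical and is not NNI's wrapper; it only shows that tensors registered with register_buffer follow Module.to(), while tensors stored as plain attributes do not. A dtype cast stands in for the cuda move so the snippet runs without a GPU; Module.to() applies device and dtype conversions to the same set of parameters and buffers.

```python
import torch


class MaskedLinear(torch.nn.Module):
    """Hypothetical layer holding a mask in two different ways."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)
        # Registered buffer: tracked by the module, so .to() converts it.
        self.register_buffer("registered_mask", torch.ones(2, 4))
        # Plain tensor attribute: not tracked, so .to() leaves it untouched.
        self.plain_mask = torch.ones(2, 4)


layer = MaskedLinear().to(torch.float64)
print(layer.linear.weight.dtype)    # converted along with the module
print(layer.registered_mask.dtype)  # converted along with the module
print(layer.plain_mask.dtype)       # still the dtype it was created with
```

A mask left behind in this way would then be combined with a weight that lives elsewhere, which is one plausible way a device mismatch can surface; whether this matches the actual NNI bug is not confirmed in this thread.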

yeliang2258 · 2020-02-12T07:46:20Z

Still not working; the error message is as follows:

Also, the pruner.export_model() call in the example has no effect. Thanks!

Cjkkkk · 2020-02-12T07:53:31Z

Hi @yeliang2258, could you change pruner.export_model('model.pth', 'mask.pth', 'model.onnx', [1, 1, 28, 28]) to pruner.export_model('model.pth', 'mask.pth', 'model.onnx', [1, 1, 28, 28], device)?
The default device for export_model is cpu, which causes the error.
If it works, you are welcome to submit a PR for this outdated example. Thanks!

yeliang2258 · 2020-02-12T08:01:40Z

I modified it and found two problems. First, the pruner.export_model() function did not generate the corresponding files. Second, cuda still could not be used, and the same error was reported. My torch version is 1.2.0 and nni is v1.4.

Cjkkkk · 2020-02-12T08:50:31Z

Hi @yeliang2258, after some debugging, it turns out there is a bug in the code for transferring buffers between devices. Since most examples set the original buffers on cuda, the bug was not spotted. Anyway, I will fix this issue later and let you know once the fix is tested.

yeliang2258 · 2020-02-12T09:10:08Z

OK, thank you very much! Also, the pruner.export_model() function in AGP does not seem to work; it does not generate the corresponding files.

Cjkkkk · 2020-02-12T09:13:09Z

Hi @yeliang2258, the files are generated in the directory from which you run the python command, not necessarily the directory that contains the script. For example, if you run python a/b/example.py, the files are written to the directory you ran the command from rather than to a/b. The files are generated as expected on my machine. Could you check whether they are in another directory? Thanks!
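The behavior described here is how relative paths work in Python generally: a name like 'model.pth' is resolved against the current working directory, not against the script's location. The sketch below simulates that with a temporary directory; the a/b layout and the empty model.pth file are stand-ins for illustration and do not call NNI.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Where a hypothetical a/b/example.py would live.
    script_dir = Path(tmp) / "a" / "b"
    script_dir.mkdir(parents=True)

    previous_cwd = os.getcwd()
    os.chdir(tmp)  # simulate running "python a/b/example.py" from tmp
    try:
        # A relative path, like the 'model.pth' passed to export_model.
        Path("model.pth").write_bytes(b"")
        in_cwd = (Path(tmp) / "model.pth").exists()
        in_script_dir = (script_dir / "model.pth").exists()
    finally:
        os.chdir(previous_cwd)

print(in_cwd, in_script_dir)  # True False
```

To write output next to the script regardless of where the command is run, one option is to build the path from Path(__file__).resolve().parent and pass that absolute path instead.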

scarlett2018 · 2020-04-15T06:10:13Z

Closing as the original problem is fixed. Thanks, @yeliang2258 and @Cjkkkk.

QuanluZhang assigned Cjkkkk on Feb 11, 2020.

Cjkkkk mentioned this issue on Feb 12, 2020 in pull request #2045, "fix buffer transfer bug" (merged).

scarlett2018 added the bug and user raised labels on Feb 14, 2020.

scarlett2018 added the model compression and support labels on Apr 15, 2020.

scarlett2018 closed this as completed on Apr 15, 2020.
