I was wondering if the intermediate layer output initialised by register_forward_hook is the same as the gradient of the intermediate output wrt the image?
- What do you mean by "*intermediate layer output* **initialized** *by* `register_forward_hook`"? – Ivan Apr 29 '22 at 13:38
- When I use PyTorch, there is a function called `register_forward_hook` that allows you to get the output of a specific layer. I was wondering if this intermediate layer output is the same as getting the gradient of the layer. – rkraaijveld May 01 '22 at 12:29
- No, the "activation" is the output of the layer, but the gradient of this layer with respect to its input(s) is a different thing! You would need a different hook to access this information, since it is computed during the backward pass, ***not the forward pass***: applying [`register_module_full_backward_hook`](https://pytorch.org/docs/stable/generated/torch.nn.modules.module.register_module_full_backward_hook.html) on the module is required to do such a thing. Let me know if you need any additional information. – Ivan May 02 '22 at 07:37
- Thank you so much Ivan!! That helps a lot :) Do you perhaps have an example of how to implement it? I can't seem to find anything online. At the moment I implemented it as follows: `def backward_hook(module, grad_input, grad_output): print('grad_output:', grad_output) for name, layer in model.named_modules(): layer.register_module_full_backward_hook(backward_hook) loss_fn = nn.CrossEntropyLoss() model.eval()` – rkraaijveld May 02 '22 at 09:08
- `for ind, (img, label) in enumerate(loader): img.requires_grad=True label = label.type(torch.LongTensor) img, label = label.to(device).float(), label.to(device) output = model(img) loss = loss_fn(output.float(), label.squeeze()) loss.backward()` – rkraaijveld May 02 '22 at 09:10
- This doesn't seem to work; I believe the way I am registering `register_module_full_backward_hook` is wrong. – rkraaijveld May 02 '22 at 09:11
- Oops, sorry, it should be [`register_full_backward_hook`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html?highlight=register_full_backward_hook#torch.nn.Module.register_full_backward_hook) and not [`register_module_full_backward_hook`](https://pytorch.org/docs/stable/generated/torch.nn.modules.module.register_module_full_backward_hook.html). I have written an answer below, hope it helps! – Ivan May 02 '22 at 12:46
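For reference, here is a minimal sketch of the training loop from the comments above with the hook registration corrected as Ivan suggests. It assumes the `model`, `loader`, and `device` from that snippet, and it also fixes the line that overwrote `img` with `label` when moving the tensors to the device:

import torch
import torch.nn as nn

# model, loader, device: defined as in the comments above

def backward_hook(module, grad_input, grad_output):
    # gradient of the loss w.r.t. this module's output
    print('grad_output:', grad_output)

# use register_full_backward_hook, not register_module_full_backward_hook
for name, layer in model.named_modules():
    layer.register_full_backward_hook(backward_hook)

loss_fn = nn.CrossEntropyLoss()
model.eval()

for ind, (img, label) in enumerate(loader):
    img = img.to(device).float().requires_grad_()   # leaf tensor, so img.grad is also populated
    label = label.to(device).long()
    output = model(img)
    loss = loss_fn(output, label.squeeze())
    loss.backward()                                  # the backward hooks fire here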
1 Answer
You can attach a callback function to a given module with nn.Module.register_full_backward_hook to hook onto the backward pass of that layer. This gives you access to the gradient of the loss with respect to the layer's inputs (grad_input) and outputs (grad_output).
Here is a minimal example; define the hook as you did:
def backward_hook(module, grad_input, grad_output):
    # grad_output: gradient of the loss w.r.t. this module's output
    print('grad_output:', grad_output)
Initialize your model and attach the hook to its layers:
>>> model = nn.Sequential(nn.Linear(10, 5), nn.Linear(5, 2))
Sequential(
(0): Linear(in_features=10, out_features=5, bias=True)
(1): Linear(in_features=5, out_features=2, bias=True)
)
>>> for name, layer in model.named_children():
... print(f'hook onto {name}')
... layer.register_full_backward_hook(backward_hook)
hook onto 0
hook onto 1
Perform an inference:
>>> x = torch.rand(5, 10)
>>> y = model(x).mean()
Perform the backward pass:
>>> y.backward()
grad_output: (tensor([[0.1000, 0.1000],
[0.1000, 0.1000],
[0.1000, 0.1000],
[0.1000, 0.1000],
[0.1000, 0.1000]]),)
grad_output: (tensor([[ 0.0135, 0.0141, -0.0468, -0.0378, -0.0123],
[ 0.0135, 0.0141, -0.0468, -0.0378, -0.0123],
[ 0.0135, 0.0141, -0.0468, -0.0378, -0.0123],
[ 0.0135, 0.0141, -0.0468, -0.0378, -0.0123],
[ 0.0135, 0.0141, -0.0468, -0.0378, -0.0123]]),)
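Note that these grad_output values are the gradients of the loss with respect to each layer's output, not the gradient with respect to the image asked about in the question. That gradient lives on the input tensor itself; here is a minimal sketch on the same toy model, treating x as the "image":
>>> x = torch.rand(5, 10, requires_grad=True)  # the "image" must require grad
>>> model(x).mean().backward()                 # the hooks registered above print again here
>>> x.grad.shape                               # gradient of the loss w.r.t. the input
torch.Size([5, 10])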
For more examples, you can look at my other answers related to register_full_backward_hook.
Ivan