pytorch_lightning_spells.utils module

Classes:

EMATracker([alpha])

Keeps the exponential moving average for a single series.

Functions:

count_parameters(parameters)

Count the total number of elements in the given parameters.

freeze_layers(layer_groups, freeze_flags)

Freeze or unfreeze groups of layers.

separate_parameters(module[, skip_list])

Separate BatchNorm2d, GroupNorm, and LayerNorm parameters from others.

set_trainable(layer, trainable)

Freeze or unfreeze all parameters in the layer.

class pytorch_lightning_spells.utils.EMATracker(alpha=0.05)

Bases: object

Keeps the exponential moving average for a single series.

Parameters

alpha (float, optional) – the weight of the new value, by default 0.05

Examples

>>> tracker = EMATracker(0.1)
>>> tracker.update(1.)
>>> tracker.value
1.0
>>> tracker.update(2.)
>>> tracker.value # 1 * 0.9 + 2 * 0.1
1.1
>>> tracker.update(float('nan')) # this won't have any effect
>>> tracker.value
1.1

update(new_value)

Adds a new value to the tracker.

NaN values are ignored, and a warning is issued each time one is encountered.

Parameters

new_value (Union[float, torch.Tensor]) – the incoming value.

property value

The smoothed value.
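
The doctest above is consistent with the usual update rule value = (1 - alpha) * value + alpha * new_value, with the first observation used as the starting value. The following is a minimal pure-Python sketch of that rule, not the library's implementation; the class name SimpleEMA is hypothetical and only floats are handled.

```python
import math
import warnings


class SimpleEMA:
    """Hypothetical stand-in that illustrates the EMA rule; not the library class."""

    def __init__(self, alpha: float = 0.05):
        self.alpha = alpha
        self.value = None  # no observation yet

    def update(self, new_value: float) -> None:
        if math.isnan(new_value):
            # Skip NaNs so a single bad step does not poison the average.
            warnings.warn("NaN encountered; the value was ignored.")
            return
        if self.value is None:
            # The first observation initialises the average.
            self.value = new_value
        else:
            self.value = (1 - self.alpha) * self.value + self.alpha * new_value


tracker = SimpleEMA(0.1)
for x in [1.0, 2.0, float("nan")]:
    tracker.update(x)
print(tracker.value)
```

A small alpha gives a smoother but slower-moving average, which is why it is typically used for noisy per-step training losses.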

pytorch_lightning_spells.utils.count_parameters(parameters)

Count the total number of elements in the given parameters.

Parameters

parameters (Iterable[Union[torch.Tensor, Parameter]]) – parameters you want to count.

Returns

the number of parameters counted.

Return type

int

Example

>>> count_parameters([torch.rand(100), torch.rand(10)])
110
>>> count_parameters([torch.rand(100, 2), torch.rand(10, 3)])
230
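
A common use is to compare the total parameter count of a model with the count of parameters that are still trainable. The sketch below assumes the function simply sums element counts; count_elements is a hypothetical stand-in written with Tensor.numel(), not the library function itself.

```python
from torch import nn


def count_elements(parameters) -> int:
    # Hypothetical stand-in: sum the number of elements in each tensor.
    return sum(p.numel() for p in parameters)


model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
total = count_elements(model.parameters())

# Freeze the first layer, then count only what remains trainable.
model[0].requires_grad_(False)
trainable = count_elements(p for p in model.parameters() if p.requires_grad)
print(total, trainable)
```

Because the argument is any iterable of tensors, a generator expression that filters on requires_grad can be passed directly.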

pytorch_lightning_spells.utils.freeze_layers(layer_groups, freeze_flags)

Freeze or unfreeze groups of layers.

Parameters
  • layer_groups (Sequence[Layer]) – the target groups of layers

  • freeze_flags (Sequence[bool]) – the corresponding freeze flags, one per group (True to freeze; False to unfreeze)

Warning

The values in freeze_flags have the opposite meaning to the trainable argument of set_trainable.

Set a flag to True to freeze the corresponding group; set it to False to unfreeze it.

Examples

>>> model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
>>> freeze_layers([model[0], model[1]], [True, False])
>>> model[0].weight.requires_grad
False
>>> model[1].weight.requires_grad
True
>>> freeze_layers([model[0], model[1]], [False, True])
>>> model[0].weight.requires_grad
True
>>> model[1].weight.requires_grad
False
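
The inverted meaning noted in the warning can be illustrated with a short sketch. Both helper names below are hypothetical, and the bodies are an assumption about the behaviour shown in the examples (each flag is negated and applied to requires_grad), not the library source.

```python
from typing import Sequence

from torch import nn


def set_trainable_sketch(layer: nn.Module, trainable: bool) -> None:
    # Hypothetical helper: toggle gradient tracking for every parameter.
    for param in layer.parameters():
        param.requires_grad = trainable


def freeze_layers_sketch(
    layer_groups: Sequence[nn.Module], freeze_flags: Sequence[bool]
) -> None:
    # Hypothetical helper: freeze=True means trainable=False, and vice versa.
    for layer, freeze in zip(layer_groups, freeze_flags):
        set_trainable_sketch(layer, not freeze)


model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
freeze_layers_sketch([model[0], model[1]], [True, False])
flags = [p.requires_grad for p in model.parameters()]
print(flags)
```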

pytorch_lightning_spells.utils.separate_parameters(module, skip_list=('bias',))

Separate BatchNorm2d, GroupNorm, and LayerNorm parameters from others.

Parameters
  • module (Union[Parameter, nn.Module, List[nn.Module]]) – to be separated.

  • skip_list (Sequence[str]) – names of parameters that are also placed in the no-decay group, by default ('bias',)

Returns

lists of decay and no-decay parameters.

Return type

Tuple[List[Parameter], List[Parameter]]

Example

>>> model = nn.Sequential(nn.Linear(100, 10, bias=True), nn.BatchNorm1d(10))
>>> _ = nn.init.constant_(model[0].weight, 2.)
>>> _ = nn.init.constant_(model[0].bias, 1.)
>>> _ = nn.init.constant_(model[1].weight, 1.)
>>> _ = nn.init.constant_(model[1].bias, 1.)
>>> model[0].weight.data.sum().item()
2000.0
>>> model[0].bias.data.sum().item()
10.0
>>> model[1].weight.data.sum().item()
10.0
>>> model[1].bias.data.sum().item()
10.0
>>> decay, no_decay = separate_parameters(model) # separate the parameters
>>> np.sum([x.sum().detach().numpy() for x in decay]) # nn.Linear weight only
2000.0
>>> np.sum([x.sum().detach().numpy() for x in no_decay]) # nn.Linear bias + nn.BatchNorm1d weight and bias
30.0
>>> optimizer = torch.optim.AdamW([{
...    "params": decay, "weight_decay": 0.1
... }, {
...    "params": no_decay, "weight_decay": 0
... }], lr=1e-3)
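
The split shown in the example can be reproduced with a short sketch that groups parameters by the type of their owning module and by name. This is an assumption about the approach, not the library source: the function name split_for_weight_decay and the NORM_TYPES tuple are hypothetical, and BatchNorm1d is included only because the example above uses it.

```python
import torch
from torch import nn

# Assumed set of normalization layers whose parameters skip weight decay.
NORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.GroupNorm, nn.LayerNorm)


def split_for_weight_decay(module: nn.Module, skip_list=("bias",)):
    decay, no_decay = [], []
    for submodule in module.modules():
        # recurse=False visits each parameter once, via its owning module.
        for name, param in submodule.named_parameters(recurse=False):
            if isinstance(submodule, NORM_TYPES) or name in skip_list:
                no_decay.append(param)
            else:
                decay.append(param)
    return decay, no_decay


model = nn.Sequential(nn.Linear(100, 10, bias=True), nn.BatchNorm1d(10))
decay, no_decay = split_for_weight_decay(model)
n_decay = sum(p.numel() for p in decay)
n_no_decay = sum(p.numel() for p in no_decay)

optimizer = torch.optim.AdamW(
    [
        {"params": decay, "weight_decay": 0.1},
        {"params": no_decay, "weight_decay": 0.0},
    ],
    lr=1e-3,
)
print(n_decay, n_no_decay)
```

Keeping normalization parameters and biases out of weight decay is a common convention, since shrinking them toward zero rarely acts as useful regularization.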

pytorch_lightning_spells.utils.set_trainable(layer, trainable)

Freeze or unfreeze all parameters in the layer.

Parameters
  • layer (Union[torch.nn.Module, torch.nn.ModuleList]) – the target layer

  • trainable (bool) – True to unfreeze; False to freeze

Example

>>> model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
>>> model[0].weight.requires_grad
True
>>> set_trainable(model, False)
>>> model[0].weight.requires_grad
False
>>> set_trainable(model, True)
>>> model[0].weight.requires_grad
True
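
To show what freezing means in practice, the sketch below sets requires_grad directly (which the example above suggests is the effect of set_trainable(layer, False)) and checks that the frozen layer receives no gradient after a backward pass. It uses plain PyTorch only; the variable names are illustrative.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))

# Freeze the first layer by disabling gradient tracking on its parameters.
for param in model[0].parameters():
    param.requires_grad = False

loss = model(torch.rand(4, 10)).sum()
loss.backward()

frozen_has_grad = model[0].weight.grad is not None
head_has_grad = model[1].weight.grad is not None

# Hand only the parameters that are still trainable to the optimizer.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=0.1)
print(frozen_has_grad, head_has_grad, len(trainable_params))
```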