pytorch_lightning_spells.utils module
Classes:

EMATracker(alpha) – Keeps the exponential moving average for a single series.

Functions:

count_parameters(parameters) – Count the number of parameters.

freeze_layers(layer_groups, freeze_flags) – Freeze or unfreeze groups of layers.

separate_parameters(module, skip_list) – Separate BatchNorm2d, GroupNorm, and LayerNorm parameters from others.

set_trainable(layer, trainable) – Freeze or unfreeze all parameters in the layer.
- class pytorch_lightning_spells.utils.EMATracker(alpha=0.05)[source]
Bases:
object
Keeps the exponential moving average for a single series.
- Parameters:
alpha (float, optional) – the weight of the new value, by default 0.05
Examples
>>> tracker = EMATracker(0.1)
>>> tracker.update(1.)
>>> tracker.value
1.0
>>> tracker.update(2.)
>>> tracker.value  # 1 * 0.9 + 2 * 0.1
1.1
>>> tracker.update(float('nan'))  # this won't have any effect
>>> tracker.value
1.1
- update(new_value)[source]
Adds a new value to the tracker.
NaN values are ignored, with a warning issued in those cases.
- Parameters:
new_value (Union[float, torch.Tensor]) – the incoming value.
- property value
The smoothed value.
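A minimal usage sketch: smoothing the raw training loss before logging it from a LightningModule. The DemoModule class, its model, and the metric name below are illustrative assumptions, not part of this package:

import pytorch_lightning as pl
import torch.nn.functional as F
from torch import nn
from pytorch_lightning_spells.utils import EMATracker

class DemoModule(pl.LightningModule):
    """Hypothetical module, for illustration only."""

    def __init__(self):
        super().__init__()
        self.model = nn.Linear(10, 1)
        # heavier smoothing than the 0.05 default
        self.loss_tracker = EMATracker(alpha=0.02)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.mse_loss(self.model(x), y)
        self.loss_tracker.update(loss.detach())  # NaN losses are ignored by the tracker
        self.log("train_loss_smoothed", self.loss_tracker.value)
        return loss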
- pytorch_lightning_spells.utils.count_parameters(parameters)[source]
Count the number of parameters
- Parameters:
parameters (Iterable[Union[torch.Tensor, Parameter]]) – parameters you want to count.
- Returns:
the total number of elements across the given parameters.
- Return type:
int
Example
>>> count_parameters([torch.rand(100), torch.rand(10)])
110
>>> count_parameters([torch.rand(100, 2), torch.rand(10, 3)])
230
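Since parameters is typed as an Iterable, a filtered generator should also work, e.g. to count only the trainable parameters of a model. A small sketch; the freezing call just sets the stage:

>>> model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
>>> _ = model[0].requires_grad_(False)  # freeze the first layer
>>> count_parameters(p for p in model.parameters() if p.requires_grad)
101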
- pytorch_lightning_spells.utils.freeze_layers(layer_groups, freeze_flags)[source]
Freeze or unfreeze groups of layers
- Parameters:
layer_groups (Sequence[Layer]) – the groups of layers to freeze or unfreeze
freeze_flags (Sequence[bool]) – the corresponding freeze flags, one per group
Warning
The values in freeze_flags have the opposite meaning to the trainable argument of set_trainable:
set True to freeze a group; False to unfreeze it.
Examples
>>> model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
>>> freeze_layers([model[0], model[1]], [True, False])
>>> model[0].weight.requires_grad
False
>>> model[1].weight.requires_grad
True
>>> freeze_layers([model[0], model[1]], [False, True])
>>> model[0].weight.requires_grad
True
>>> model[1].weight.requires_grad
False
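One pattern this enables is gradual unfreezing during fine-tuning. A minimal sketch, assuming a two-stage schedule (the stages and the model here are illustrative, not part of the API):

>>> model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
>>> groups = [model[0], model[1]]
>>> freeze_layers(groups, [True, False])   # stage 1: train only the head
>>> # ... fit for a few epochs ...
>>> freeze_layers(groups, [False, False])  # stage 2: unfreeze everything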
- pytorch_lightning_spells.utils.separate_parameters(module, skip_list=('bias',))[source]
Separate BatchNorm2d, GroupNorm, and LayerNorm parameters from others
- Parameters:
module (Union[Parameter, nn.Module, List[nn.Module]]) – the module(s) or parameters to be separated.
skip_list (Sequence[str]) – names of parameters to always place in the no-decay group, by default ('bias',).
- Returns:
lists of decay and no-decay parameters.
- Return type:
Tuple[List[Parameter], List[Parameter]]
Example
>>> model = nn.Sequential(nn.Linear(100, 10, bias=True), nn.BatchNorm1d(10))
>>> _ = nn.init.constant_(model[0].weight, 2.)
>>> _ = nn.init.constant_(model[0].bias, 1.)
>>> _ = nn.init.constant_(model[1].weight, 1.)
>>> _ = nn.init.constant_(model[1].bias, 1.)
>>> model[0].weight.data.sum().item()
2000.0
>>> model[0].bias.data.sum().item()
10.0
>>> model[1].weight.data.sum().item()
10.0
>>> model[1].bias.data.sum().item()
10.0
>>> decay, no_decay = separate_parameters(model)  # separate the parameters
>>> np.sum([x.sum().detach().numpy() for x in decay])  # nn.Linear weight only
2000.0
>>> np.sum([x.sum().detach().numpy() for x in no_decay])  # biases and BatchNorm1d parameters
30.0
>>> optimizer = torch.optim.AdamW([{
...     "params": decay, "weight_decay": 0.1
... }, {
...     "params": no_decay, "weight_decay": 0
... }], lr=1e-3)
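Exempting normalization weights and biases from weight decay, as the two optimizer groups above do, follows a widely used convention (the same grouping appears in the reference BERT training code); decaying those parameters is generally believed to hurt rather than help.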
- pytorch_lightning_spells.utils.set_trainable(layer, trainable)[source]
Freeze or unfreeze all parameters in the layer.
- Parameters:
layer (Union[torch.nn.Module, torch.nn.ModuleList]) – the target layer
trainable (bool) – True to unfreeze; False to freeze
Example
>>> model = nn.Sequential(nn.Linear(10, 100), nn.Linear(100, 1))
>>> model[0].weight.requires_grad
True
>>> set_trainable(model, False)
>>> model[0].weight.requires_grad
False
>>> set_trainable(model, True)
>>> model[0].weight.requires_grad
True
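A typical application is freezing a pretrained backbone while a freshly initialized head trains. The torchvision model below is an illustrative assumption; any nn.Module works:

>>> import torchvision
>>> backbone = torchvision.models.resnet18(weights=None)
>>> set_trainable(backbone, False)  # freeze every parameter in the backbone
>>> head = nn.Linear(1000, 2)       # the new head stays trainable
>>> any(p.requires_grad for p in backbone.parameters())
False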