Lines Matching full:optimizer

56     * ``scaler.step(optimizer)`` safely unscales gradients and calls ``optimizer.step()``.
66 optimizer.zero_grad()
73 # scaler.step() first unscales gradients of the optimizer's params.
74 # If gradients don't contain infs/NaNs, optimizer.step() is then called,
75 # otherwise, optimizer.step() is skipped.
76 scaler.step(optimizer)
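
The excerpted lines 56-76 come from the usage example in the GradScaler class docstring. Below is a hedged, self-contained reconstruction of that loop; the model, optimizer, loss function, data, and the autocast context are placeholders not shown in the excerpt, and scaling is simply disabled on machines without CUDA.

    import torch

    # Placeholder model, optimizer, loss, and data; only the scaler calls mirror the excerpt.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(10, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.MSELoss()
    data = [(torch.randn(8, 10, device=device), torch.randn(8, 1, device=device))
            for _ in range(4)]

    # Create the scaler once, before the training loop.
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

    for input, target in data:
        optimizer.zero_grad()
        with torch.autocast(device_type=device, enabled=(device == "cuda")):
            output = model(input)
            loss = loss_fn(output, target)

        # Scales the loss; backward() on the scaled loss produces scaled gradients.
        scaler.scale(loss).backward()

        # Unscales the optimizer's gradients, then runs optimizer.step() only if they
        # contain no infs/NaNs; otherwise the step is skipped.
        scaler.step(optimizer)

        # Updates the scale factor for the next iteration.
        scaler.update()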
90 …``scaler.step(optimizer)`` (or optional separate ``scaler.unscale_(optimizer)``, see :meth:`unscal…
92 …* If infs/NaNs are found, ``scaler.step(optimizer)`` skips the underlying ``optimizer.step()`` (so…
95 …* If no infs/NaNs are found, ``scaler.step(optimizer)`` runs the underlying ``optimizer.step()`` a…
100 value calibrates. ``scaler.step`` will skip the underlying ``optimizer.step()`` for these
115 invokes the underlying ``optimizer.step()``, and other methods become no-ops.
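
Source line 115 is from the description of the ``enabled`` constructor argument: with ``enabled=False`` the scaler degrades to a pass-through, so one code path serves both modes. A small hedged sketch (the ``use_amp`` flag and the tiny model are placeholders):

    import torch

    # Hypothetical flag: disable scaling on CPU-only machines but keep the same loop.
    use_amp = torch.cuda.is_available()
    device = "cuda" if use_amp else "cpu"

    model = torch.nn.Linear(4, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

    loss = model(torch.randn(2, 4, device=device)).sum()
    scaler.scale(loss).backward()   # scale() returns the loss unmodified when disabled
    scaler.step(optimizer)          # simply calls optimizer.step() when disabled
    scaler.update()                 # no-op when disabled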
236 optimizer: torch.optim.Optimizer,
254 for group in optimizer.param_groups:
287 def unscale_(self, optimizer: torch.optim.Optimizer) -> None:
289 Divides ("unscales") the optimizer's gradient tensors by the scale factor.
300 scaler.unscale_(optimizer)
302 scaler.step(optimizer)
306 optimizer (torch.optim.Optimizer): Optimizer that owns the gradients to be unscaled.
312 :meth:`unscale_` should only be called once per optimizer per :meth:`step` call,
313 … and only after all gradients for that optimizer's assigned parameters have been accumulated.
314 …Calling :meth:`unscale_` twice for a given optimizer between each :meth:`step` triggers a RuntimeE…
324 optimizer_state = self._per_optimizer_states[id(optimizer)]
328 "unscale_() has already been called on this optimizer since the last update()."
339 optimizer, inv_scale, found_inf, False
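
Source lines 287-339 excerpt :meth:`unscale_`, whose docstring pairs it with gradient clipping so that clipping operates on gradients at their true magnitude. A hedged reconstruction of that pattern (the model, optimizer, and ``max_norm`` threshold are placeholders, and a CUDA device is assumed):

    import torch

    model = torch.nn.Linear(10, 1).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = torch.cuda.amp.GradScaler()
    max_norm = 1.0   # assumed clipping threshold

    loss = model(torch.randn(8, 10, device="cuda")).sum()
    scaler.scale(loss).backward()

    # Unscale once, so clipping sees the gradients at their true scale.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)

    # step() records that unscale_ already ran and does not unscale a second time;
    # calling unscale_ again before step() would raise the RuntimeError quoted above.
    scaler.step(optimizer)
    scaler.update()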
345 optimizer: torch.optim.Optimizer,
352 retval = optimizer.step(*args, **kwargs)
356 self, optimizer: torch.optim.Optimizer, *args: Any, **kwargs: Any
358 … """Invoke ``unscale_(optimizer)`` followed by parameter update, if gradients are not infs/NaN.
362 …1. Internally invokes ``unscale_(optimizer)`` (unless :meth:`unscale_` was explicitly called for …
364 2. If no inf/NaN gradients are found, invokes ``optimizer.step()`` using the unscaled
365 gradients. Otherwise, ``optimizer.step()`` is skipped to avoid corrupting the params.
367 ``*args`` and ``**kwargs`` are forwarded to ``optimizer.step()``.
369 Returns the return value of ``optimizer.step(*args, **kwargs)``.
372 optimizer (torch.optim.Optimizer): Optimizer that applies the gradients.
380 return optimizer.step(*args, **kwargs)
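
The two-step contract in the docstring above (unscale, then step only if the gradients are finite) can be made concrete with a small self-contained demonstration. It assumes a CUDA device and injects an inf gradient on purpose, which is not something the excerpt itself does:

    import torch

    model = torch.nn.Linear(4, 1).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scaler = torch.cuda.amp.GradScaler(init_scale=2.0**16)

    weight_before = model.weight.detach().clone()

    loss = model(torch.randn(2, 4, device="cuda")).sum()
    scaler.scale(loss).backward()
    model.weight.grad.fill_(float("inf"))   # poison one gradient on purpose

    scaler.step(optimizer)                  # finds the inf and skips optimizer.step()
    scaler.update()                         # backs the scale off by backoff_factor

    assert torch.equal(weight_before, model.weight.detach())   # parameters untouched
    print(scaler.get_scale())               # now lower than the initial 2.0**16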
389 optimizer_state = self._per_optimizer_states[id(optimizer)]
398 if getattr(optimizer, "_step_supports_amp_scaling", False):
399 … # This optimizer has customized scale-handling logic, so we can call optimizer.step() directly.
401 …# optional grad_scaler kwarg. We append self to the kwargs so the custom optimizer has full infor…
404 … # to `Optimizer.step`. The new behavior is going to add two Tensor attributes of `grad_scale`
405 # and `found_inf` to the passed optimizer so that the optimizer can utilize those
412 "grad_scaler" in inspect.signature(optimizer.step).parameters
417 "optimizer. In the near future GradScaler registers `grad_scale: Tensor` and "
418 … "`found_inf: Tensor` to the passed optimizer and let the optimizer use them directly.",
424 self._check_inf_per_device(optimizer)
436 … # Take the product of the scales, if the user has already set `optimizer.grad_scale`.
437 optimizer.grad_scale = ( # type: ignore[attr-defined]
438 getattr(optimizer, "grad_scale", None)
440 else scaler * getattr(optimizer, "grad_scale", 1)
442 optimizer.found_inf = found_inf # type: ignore[attr-defined]
443 retval = optimizer.step(*args, **kwargs_)
446 del optimizer.grad_scale # type: ignore[attr-defined]
447 del optimizer.found_inf # type: ignore[attr-defined]
451 self.unscale_(optimizer)
455 ), "No inf checks were recorded for this optimizer."
457 retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
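
Source lines 398-447 show the fast path for optimizers that set ``_step_supports_amp_scaling``: the scaler skips its own unscaling, runs only the inf check, and temporarily hands the raw scale and inf count to the optimizer as the ``grad_scale`` and ``found_inf`` tensor attributes. The toy subclass below is a hedged illustration of that contract, not how real fused optimizers implement it (they fold the unscale and the skip into their fused kernels):

    import torch

    class ToyScalingSGD(torch.optim.SGD):
        """Toy example of the attribute-based contract; illustrative only."""

        # Tells GradScaler.step() to call step() directly instead of unscaling itself.
        _step_supports_amp_scaling = True

        @torch.no_grad()
        def step(self, closure=None):
            # GradScaler attaches these tensor attributes just before calling step()
            # and deletes them right after; default to None when no scaler is involved.
            grad_scale = getattr(self, "grad_scale", None)
            found_inf = getattr(self, "found_inf", None)

            if found_inf is not None and found_inf.item():
                return None                          # skip the update on inf/NaN gradients
            if grad_scale is not None:
                for group in self.param_groups:      # unscale in place, then step normally
                    for p in group["params"]:
                        if p.grad is not None:
                            p.grad.div_(grad_scale)
            return super().step(closure)

With such an optimizer, ``scaler.step(optimizer)`` only performs its inf check and attribute hand-off; the unscale and the decision to skip live inside ``step()`` itself.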
466 If any optimizer steps were skipped the scale is multiplied by ``backoff_factor``
479 …th:`update` should only be called at the end of the iteration, after ``scaler.step(optimizer)`` has
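
Lines 466-479 excerpt the :meth:`update` docstring. The schedule it applies can be summarized in a framework-free sketch using the default constructor arguments (``growth_factor=2.0``, ``backoff_factor=0.5``, ``growth_interval=2000``); this is a hedged restatement of the documented behavior, not the actual implementation, which operates in place on the scale tensor:

    def update_sketch(scale, growth_tracker, found_inf,
                      growth_factor=2.0, backoff_factor=0.5, growth_interval=2000):
        """Return the new (scale, growth_tracker) pair for one iteration."""
        if found_inf:                          # some optimizer step was skipped
            return scale * backoff_factor, 0   # back off and reset the streak
        growth_tracker += 1
        if growth_tracker == growth_interval:  # enough consecutive clean iterations
            return scale * growth_factor, 0    # grow the scale and reset the counter
        return scale, growth_tracker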
672 def _check_inf_per_device(self, optimizer: torch.optim.Optimizer) -> Dict[str, Any]:
678 self._per_optimizer_states[id(optimizer)][
680 ] = self._unscale_grads_(optimizer, dummy_inv_scale, found_inf, True)
682 return self._per_optimizer_states[id(optimizer)]["found_inf_per_device"]
684 def _found_inf_per_device(self, optimizer: torch.optim.Optimizer) -> Dict[str, Any]:
685 return self._per_optimizer_states[id(optimizer)]["found_inf_per_device"]
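
The last two helpers are private bookkeeping; user code should not reach into ``_per_optimizer_states``. A common, if indirect, way to tell from outside whether the most recent step was skipped is to watch for a backoff in the scale, as in this short sketch (it assumes the ``scaler`` and ``optimizer`` from the training-loop reconstruction above):

    scale_before = scaler.get_scale()
    scaler.step(optimizer)
    scaler.update()
    step_was_skipped = scaler.get_scale() < scale_before   # a backoff implies infs/NaNs were found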