Lines Matching full:backward
41 The main motivation behind distributed autograd is to enable running a backward
51 used to execute the backward pass. For more details see
55 pass to ensure the backward pass is executed appropriately. For this purpose,
61 The input for this function during the backward pass is received from the
66 node to the appropriate ``send`` function during the backward pass.
69 function on a remote node during the backward pass.
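The excerpted lines above concern how ``send`` and ``recv`` autograd functions are attached while the forward pass is recorded. Below is a minimal sketch of the kind of call that triggers this recording; ``"worker1"`` is a hypothetical worker name and ``rpc.init_rpc`` is assumed to have completed on every worker::

    import torch
    import torch.distributed.rpc as rpc
    import torch.distributed.autograd as dist_autograd

    # Runs on the caller (e.g. Worker 0); assumes rpc.init_rpc(...) has
    # already been called on all workers.
    with dist_autograd.context() as context_id:
        t1 = torch.rand((3, 3), requires_grad=True)
        t2 = torch.rand((3, 3), requires_grad=True)

        # The RPC attaches a ``send`` autograd function on the caller and a
        # matching ``recv`` autograd function on the callee, so gradients can
        # flow back across the wire during the backward pass.
        t3 = rpc.rpc_sync("worker1", torch.add, args=(t1, t2))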
83 Each forward and backward pass that uses distributed autograd is assigned a
90 1. Multiple nodes running distributed backward passes might accumulate
92 tensor would have gradients from a variety of distributed backward passes
94 calling :meth:`torch.autograd.backward` multiple times locally. In order to
95 provide a way of separating out the gradients for each backward pass, the
97 for each backward pass.
102 during the backward pass.
115 dist_autograd.backward(context_id, [loss])
120 to run the backward pass across all participating nodes.
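Read together, the context-related excerpts above suggest a usage pattern along these lines (a hedged, single-caller sketch; the RPC framework is assumed to be initialized, and ``get_gradients`` is shown only to emphasize that gradients are stored per context rather than in ``.grad``)::

    import torch
    import torch.distributed.autograd as dist_autograd

    # Assumes rpc.init_rpc(...) has already been called on this worker.
    with dist_autograd.context() as context_id:
        t1 = torch.rand((3, 3), requires_grad=True)
        t2 = torch.rand((3, 3), requires_grad=True)
        loss = (t1 + t2).sum()

        # Kick off the backward pass for this particular context; gradients
        # are accumulated under context_id instead of in the tensors' .grad.
        dist_autograd.backward(context_id, [loss])

        # Retrieve only the gradients that belong to this backward pass.
        grads = dist_autograd.get_gradients(context_id)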
122 Distributed Backward Pass
126 during a distributed backward pass and describe a couple of algorithms (with
127 tradeoffs) on how we can execute a distributed backward pass.
142 d.sum().backward()
149 The first step the autograd engine performs as part of the backward pass is
153 dependencies. As you can see, this means during the backward pass the ``add``
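As a purely local illustration of the dependency counting mentioned in the two excerpts above (the tensors here are stand-ins, not the ones from the original example)::

    import torch

    t1 = torch.rand((3, 3), requires_grad=True)
    t2 = torch.rand((3, 3), requires_grad=True)
    t3 = t1 * t2                     # creates a MulBackward node
    d = t3 + torch.rand((3, 3))      # creates an AddBackward node

    # Before executing anything, the engine counts dependencies: AddBackward
    # waits on one input (from the SumBackward root created by d.sum()) and
    # MulBackward waits on one input (from AddBackward). A node is run only
    # once its dependency count drops to zero.
    d.sum().backward()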
159 backward pass poses a challenge for distributed autograd. Consider this piece
185 part of the backward pass (most applications don't perform RPCs that aren't
192 function is valid as part of the backward pass. To address this, we have
202 dependency of 1 when we run a backward pass. In other words, we assume we'll
207 1. We start from the worker which has the roots for the backward pass
251 # part in the distributed backward pass must be within
267 # Run the backward pass.
268 dist_autograd.backward(context_id, [loss])
286 3. Since this is the first time ``Worker 1`` has heard about this backward pass,
292 6. Since ``Worker 0`` has already computed dependencies for this backward pass,
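A hedged sketch of the kind of two-worker program the FAST-mode walk-through above refers to; only the caller's side is shown, the worker name is an assumption, and ``rpc.init_rpc`` is assumed to have completed everywhere::

    import torch
    import torch.distributed.rpc as rpc
    import torch.distributed.autograd as dist_autograd

    # Runs on Worker 0; "worker1" is an assumed name for the remote worker.
    with dist_autograd.context() as context_id:
        t1 = torch.rand((3, 3), requires_grad=True)
        t2 = torch.rand((3, 3), requires_grad=True)

        # Records a send function on Worker 0 and a recv function on Worker 1
        # for this context.
        loss = rpc.rpc_sync("worker1", torch.add, args=(t1, t2)).sum()

        # Run the backward pass. Worker 0 computes dependencies locally and
        # starts its engine; when it reaches the recv function for the RPC
        # result, it sends the gradient to Worker 1, which hears about this
        # backward pass (and creates its context) only at that point.
        dist_autograd.backward(context_id, [loss])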
363 # Backward pass (run distributed autograd).
364 dist_autograd.backward(context_id, [loss.sum()])
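The last two excerpts come from an end-to-end example; the sketch below shows how such a backward call is typically followed by a distributed optimizer step (``rref1`` and ``rref2`` are assumed to be RRefs to remote parameters created elsewhere, e.g. via ``rpc.remote``)::

    import torch
    import torch.distributed.autograd as dist_autograd
    from torch import optim
    from torch.distributed.optim import DistributedOptimizer

    def train_step(rref1, rref2):
        with dist_autograd.context() as context_id:
            # Forward pass combining the two remote parameters locally.
            loss = rref1.to_here() + rref2.to_here()

            # Backward pass (run distributed autograd).
            dist_autograd.backward(context_id, [loss.sum()])

            # The distributed optimizer applies the gradients stored in this
            # context on each worker that owns one of the parameters.
            dist_optim = DistributedOptimizer(
                optim.SGD, [rref1, rref2], lr=0.05
            )
            dist_optim.step(context_id)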