Meta device
============

The "meta" device is an abstract device that denotes a tensor which records
only metadata, but no actual data. Meta tensors have two primary use cases:

* Models can be loaded on the meta device, allowing you to load a
  representation of the model without loading the actual parameters
  into memory. This can be helpful if you need to make transformations on
  the model before you load the actual data.

* Most operations can be performed on meta tensors, producing new meta
  tensors that describe what the result would have been if you performed
  the operation on a real tensor. You can use this to perform abstract
  analysis without needing to spend time on compute or space to represent
  the actual tensors. Because meta tensors do not have real data, you cannot
  perform data-dependent operations like :func:`torch.nonzero` or
  :meth:`~torch.Tensor.item`. In some cases, not all device types (e.g., CPU
  and CUDA) have exactly the same output metadata for an operation; we
  typically prefer representing the CUDA behavior faithfully in this
  situation.

.. warning::

    Although in principle meta tensor computation should always be faster than
    an equivalent CPU/CUDA computation, many meta tensor implementations are
    implemented in Python and have not been ported to C++ for speed, so you
    may find that you get lower absolute framework latency with small CPU tensors.
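
The second use case above can be illustrated with a minimal sketch. The shapes
and variable names here are arbitrary choices for illustration, and the exact
exception type raised for a data-dependent call may vary between PyTorch
versions, so the example catches the general ``RuntimeError`` base class.

```python
import torch

# Two meta tensors: shape, dtype, and device are tracked, but no data is stored.
a = torch.empty(128, 64, device="meta")
b = torch.empty(64, 32, device="meta")

# The matmul performs no real computation, yet the result carries the
# metadata a real matmul would have produced.
c = a @ b
print(c.shape, c.dtype, c.device)

# Data-dependent operations fail because there are no values to read.
try:
    c.sum().item()
    item_failed = False
except RuntimeError:  # NotImplementedError is a subclass of RuntimeError
    item_failed = True
print(item_failed)
```

This kind of shape propagation is what makes meta tensors useful for abstract
analysis: the output metadata is available without allocating the operands.
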

Idioms for working with meta tensors
------------------------------------

An object can be loaded with :func:`torch.load` onto meta device by specifying
``map_location='meta'``::

    >>> import torch
    >>> torch.save(torch.randn(2), 'foo.pt')
    >>> torch.load('foo.pt', map_location='meta')
    tensor(..., device='meta', size=(2,))

If you have some arbitrary code which performs tensor construction without
explicitly specifying a device, you can override it to instead construct on
meta device by using the :func:`torch.device` context manager::

    >>> with torch.device('meta'):
    ...     print(torch.randn(30, 30))
    ...
    tensor(..., device='meta', size=(30, 30))

This is especially helpful for NN module construction, where you often are not
able to explicitly pass in a device for initialization::

    >>> from torch.nn.modules import Linear
    >>> with torch.device('meta'):
    ...     print(Linear(20, 30))
    ...
    Linear(in_features=20, out_features=30, bias=True)

You cannot convert a meta tensor directly to a CPU/CUDA tensor, because the
meta tensor stores no data and we do not know what the correct data values for
your new tensor are::

    >>> torch.ones(5, device='meta').to("cpu")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NotImplementedError: Cannot copy out of meta tensor; no data!

Use a factory function like :func:`torch.empty_like` to explicitly specify how
you would like the missing data to be filled in.

NN modules have a convenience method :meth:`torch.nn.Module.to_empty` that
allows you to move the module to another device, leaving all parameters
uninitialized. You are expected to reinitialize the parameters manually::

    >>> from torch.nn.modules import Linear
    >>> with torch.device('meta'):
    ...     m = Linear(20, 30)
    ...
    >>> m.to_empty(device="cpu")
    Linear(in_features=20, out_features=30, bias=True)

:mod:`torch._subclasses.meta_utils` contains undocumented utilities for taking
an arbitrary Tensor and constructing an equivalent meta Tensor with high
fidelity. These APIs are experimental and may be changed in a BC breaking way
at any time.