Windows FAQ
===========

Building from source
--------------------

Include optional components
^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are two supported components for Windows PyTorch:
MKL and MAGMA. Here are the steps to build with them.

.. code-block:: bat

    REM Make sure you have 7z and curl installed.

    REM Download MKL files
    curl https://s3.amazonaws.com/ossci-windows/mkl_2020.2.254.7z -k -O
    7z x -aoa mkl_2020.2.254.7z -omkl

    REM Download MAGMA files
    REM versions available:
    REM 2.5.4 (CUDA 10.1 10.2 11.0 11.1) x (Debug Release)
    REM 2.5.3 (CUDA 10.1 10.2 11.0) x (Debug Release)
    REM 2.5.2 (CUDA 9.2 10.0 10.1 10.2) x (Debug Release)
    REM 2.5.1 (CUDA 9.2 10.0 10.1 10.2) x (Debug Release)
    set CUDA_PREFIX=cuda102
    set CONFIG=release
    curl -k https://s3.amazonaws.com/ossci-windows/magma_2.5.4_%CUDA_PREFIX%_%CONFIG%.7z -o magma.7z
    7z x -aoa magma.7z -omagma

    REM Set essential environment variables
    set "CMAKE_INCLUDE_PATH=%cd%\mkl\include"
    set "LIB=%cd%\mkl\lib;%LIB%"
    set "MAGMA_HOME=%cd%\magma"

Speeding CUDA build for Windows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Visual Studio doesn't currently support parallel custom tasks.
As an alternative, we can use ``Ninja`` to parallelize CUDA
build tasks. It can be enabled by typing only a few lines of code.

.. code-block:: bat

    REM Let's install ninja first.
    pip install ninja

    REM Set it as the cmake generator
    set CMAKE_GENERATOR=Ninja


One key install script
^^^^^^^^^^^^^^^^^^^^^^

You can take a look at `this set of scripts
<https://github.com/peterjc123/pytorch-scripts>`_.
They will guide you through the process.

Extension
---------

CFFI Extension
^^^^^^^^^^^^^^

The support for CFFI Extension is very experimental. You must specify
additional ``libraries`` in the ``Extension`` object to make it build on
Windows.

.. code-block:: python

   ffi = create_extension(
       '_ext.my_lib',
       headers=headers,
       sources=sources,
       define_macros=defines,
       relative_to=__file__,
       with_cuda=with_cuda,
       extra_compile_args=["-std=c99"],
       libraries=['ATen', '_C']  # Append CUDA libraries when necessary, like cudart
   )

Cpp Extension
^^^^^^^^^^^^^

This type of extension has better support compared with
the previous one. However, it still needs some manual
configuration. First, you should open the
**x86_x64 Cross Tools Command Prompt for VS 2017**.
Then, you can start compiling your extension.
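
For reference, a minimal ``setup.py`` for such an extension might look like
the sketch below. The module name ``my_cpp_ext`` and the source file
``my_ext.cpp`` are placeholders for your own code; ``CppExtension`` and
``BuildExtension`` come from ``torch.utils.cpp_extension``.

.. code-block:: python

    # Hypothetical setup.py for a C++ extension; "my_cpp_ext" and
    # "my_ext.cpp" are placeholder names for your module and source file.
    from setuptools import setup
    from torch.utils.cpp_extension import CppExtension, BuildExtension

    setup(
        name='my_cpp_ext',
        ext_modules=[
            CppExtension('my_cpp_ext', ['my_ext.cpp']),
        ],
        # BuildExtension picks the right compiler flags for each platform
        cmdclass={'build_ext': BuildExtension},
    )

From the cross tools prompt mentioned above, run ``python setup.py install``
in the directory containing this file.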

Installation
------------

Package not found in win-32 channel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bat

    Solving environment: failed

    PackagesNotFoundError: The following packages are not available from current channels:

    - pytorch

    Current channels:
    - https://conda.anaconda.org/pytorch/win-32
    - https://conda.anaconda.org/pytorch/noarch
    - https://repo.continuum.io/pkgs/main/win-32
    - https://repo.continuum.io/pkgs/main/noarch
    - https://repo.continuum.io/pkgs/free/win-32
    - https://repo.continuum.io/pkgs/free/noarch
    - https://repo.continuum.io/pkgs/r/win-32
    - https://repo.continuum.io/pkgs/r/noarch
    - https://repo.continuum.io/pkgs/pro/win-32
    - https://repo.continuum.io/pkgs/pro/noarch
    - https://repo.continuum.io/pkgs/msys2/win-32
    - https://repo.continuum.io/pkgs/msys2/noarch

PyTorch doesn't work on 32-bit systems. Please use a 64-bit version of
Windows and Python.
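
If you are unsure which variant you are running, a quick way to check the
bitness of your Python interpreter is:

.. code-block:: python

    import struct

    # A 64-bit interpreter uses 8-byte pointers, a 32-bit one 4-byte pointers
    bits = struct.calcsize("P") * 8
    print(f"This Python is {bits}-bit")

If this prints 32, install a 64-bit Python before installing PyTorch.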


Import error
^^^^^^^^^^^^

.. code-block:: python

    from torch._C import *

    ImportError: DLL load failed: The specified module could not be found.


The problem is caused by missing essential files. Actually, we include
almost all the essential files that PyTorch needs in the conda package,
except for the VC2017 redistributable and some mkl libraries.
You can resolve this by typing the following command.

.. code-block:: bat

    conda install -c peterjc123 vc vs2017_runtime
    conda install mkl_fft intel_openmp numpy mkl

As for the wheels package, since we didn't pack some libraries and the VS2017
redistributable files in, please make sure you install them manually.
The `VS 2017 redistributable installer
<https://aka.ms/vs/15/release/VC_redist.x64.exe>`_ can be downloaded.
You should also pay attention to your installation of NumPy. Make sure it
uses MKL instead of OpenBLAS. You may type in the following command.

.. code-block:: bat

    pip install numpy mkl intel-openmp mkl_fft

Another possible cause may be that you are using the GPU version without
NVIDIA graphics cards. Please replace your GPU package with the CPU one.

.. code-block:: python

    from torch._C import *

    ImportError: DLL load failed: The operating system cannot run %1.


This is actually an upstream issue of Anaconda. When you initialize your
environment with the conda-forge channel, this issue will emerge. You may
fix the intel-openmp libraries through this command.

.. code-block:: bat

    conda install -c defaults intel-openmp -f


Usage (multiprocessing)
-----------------------

Multiprocessing error without if-clause protection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    RuntimeError:
           An attempt has been made to start a new process before the
           current process has finished its bootstrapping phase.

       This probably means that you are not using fork to start your
       child processes and you have forgotten to use the proper idiom
       in the main module:

           if __name__ == '__main__':
               freeze_support()
               ...

       The "freeze_support()" line can be omitted if the program
       is not going to be frozen to produce an executable.

The implementation of ``multiprocessing`` is different on Windows, which
uses ``spawn`` instead of ``fork``. So we have to wrap the code with an
if-clause to protect the code from executing multiple times. Refactor
your code into the following structure.

.. code-block:: python

    import torch

    def main():
        for i, data in enumerate(dataloader):
            # do something here
            pass

    if __name__ == '__main__':
        main()



Multiprocessing error "Broken pipe"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    ForkingPickler(file, protocol).dump(obj)

    BrokenPipeError: [Errno 32] Broken pipe

This issue happens when the child process ends before the parent process
finishes sending data. There may be something wrong with your code. You
can debug your code by reducing the ``num_workers`` of
:class:`~torch.utils.data.DataLoader` to zero and see if the issue persists.
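
As a quick sanity check, the sketch below loads everything in the main
process; the :class:`~torch.utils.data.TensorDataset` here is only a toy
stand-in for your own dataset.

.. code-block:: python

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Toy dataset standing in for your own; replace with your real dataset
    dataset = TensorDataset(torch.arange(12, dtype=torch.float32))

    # num_workers=0 loads batches in the main process, so no worker
    # subprocesses (and no pipes to them) are involved at all
    loader = DataLoader(dataset, batch_size=4, num_workers=0)

    for (batch,) in loader:
        print(batch.shape)  # each batch is a tensor of 4 elements

If the error disappears with ``num_workers=0``, the bug is most likely in
code executed inside the worker processes, e.g. your dataset's
``__getitem__``.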

Multiprocessing error "driver shut down"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

::

    Couldn't open shared file mapping: <torch_14808_1591070686>, error code: <1455> at torch\lib\TH\THAllocator.c:154

    [windows] driver shut down

Please update your graphics driver. If the issue persists, your graphics
card may be too old or the calculation may be too heavy for it. Please
update the TDR settings according to this `post
<https://www.pugetsystems.com/labs/hpc/Working-around-TDR-in-Windows-for-a-better-GPU-computing-experience-777/>`_.

CUDA IPC operations
^^^^^^^^^^^^^^^^^^^

::

   THCudaCheck FAIL file=torch\csrc\generic\StorageSharing.cpp line=252 error=63 : OS call failed or operation not supported on this OS

They are not supported on Windows. Something like doing multiprocessing on
CUDA tensors cannot succeed; there are two alternatives for this.

1. Don't use ``multiprocessing``. Set the ``num_workers`` of
   :class:`~torch.utils.data.DataLoader` to zero.

2. Share CPU tensors instead. Make sure your custom
   :class:`~torch.utils.data.Dataset` returns CPU tensors.

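As a minimal sketch of the second alternative (the class and variable names
here are hypothetical), keep the storage on the CPU inside the dataset and
move each batch to the GPU in the training loop instead:

.. code-block:: python

    import torch
    from torch.utils.data import Dataset

    class CpuTensorDataset(Dataset):
        """A toy dataset that always hands out CPU tensors."""

        def __init__(self, data):
            # Keep the storage on the CPU so worker processes can share it
            self.data = data.cpu()

        def __len__(self):
            return len(self.data)

        def __getitem__(self, idx):
            return self.data[idx]

    # In the training loop, move each batch to the GPU yourself:
    # batch = batch.to('cuda', non_blocking=True)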