.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you figure out how best to support both
   Python 2 & 3 simultaneously.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.

   If you would like to read one core Python developer's take on why Python 3
   came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.

   For help with porting, you can email the python-porting_ mailing list with
   questions.

The Short Explanation
=====================

To make your project be single-source Python 2/3 compatible, the basic steps
are:

#. Update your code to drop support for Python 2.5 or older (supporting only
   Python 2.7 is ideal)
#. Make sure you have good test coverage (coverage.py_ can help;
   ``pip install coverage``)
#. Learn the differences between Python 2 & 3
#. Use Modernize_ or Futurize_ to update your code (``pip install modernize`` or
   ``pip install future``, respectively)
#. Use Pylint_ to help make sure you don't regress on your Python 3 support
   (if only supporting Python 2.7/3.4 or newer; ``pip install pylint``)
#. Use caniusepython3_ to find out which of your dependencies are blocking your
   use of Python 3 (``pip install caniusepython3``)
#. Once your dependencies are no longer blocking you, use continuous integration
   to make sure you stay compatible with Python 2 & 3 (tox_ can help test
   against multiple versions of Python; ``pip install tox``)

If you are dropping support for Python 2 entirely, then after you learn the
differences between Python 2 & 3 you can run 2to3_ over your code and skip the
rest of the steps outlined above.


Details
=======

A key point about supporting Python 2 & 3 simultaneously is that you can start
**today**! Even if your dependencies are not supporting Python 3 yet that does
not mean you can't modernize your code **now** to support Python 3. Most changes
required to support Python 3 lead to cleaner code using newer practices even in
Python 2.

Another key point is that modernizing your Python 2 code to also support
Python 3 is largely automated for you. While you might have to make some API
decisions thanks to Python 3 clarifying text data versus binary data, the
lower-level work is now mostly done for you, so you can at least benefit from
the automated changes immediately.

Keep those key points in mind while you read on about the details of porting
your code to support Python 2 & 3 simultaneously.


Drop support for Python 2.5 and older (at least)
------------------------------------------------

While you can make Python 2.5 work with Python 3, it is **much** easier if you
only have to work with Python 2.6 or newer (and easier still if you only have
to work with Python 2.7). If dropping Python 2.5 is not an option then the six_
project can help you support Python 2.5 & 3 simultaneously
(``pip install six``). Do realize, though, that nearly all the projects listed
in this HOWTO will not be available to you.

If you are able to only support Python 2.6 or newer, then the required changes
to your code should continue to look and feel like idiomatic Python code. At
worst you will have to use a function instead of a method in some instances or
have to import a function instead of using a built-in one, but otherwise the
overall transformation should not feel foreign to you.

But please aim for Python 2.7. Bugfixes for that version of Python will continue
until 2020 while Python 2.6 is no longer supported. There are also some tools
mentioned in this HOWTO which do not support Python 2.6 (e.g., Pylint_), and
this will become more commonplace as time goes on.

Make sure you specify the proper version support in your ``setup.py`` file
--------------------------------------------------------------------------

In your ``setup.py`` file you should have the proper `trove classifier`_
specifying what versions of Python you support. As your project does not support
Python 3 yet you should at least have
``Programming Language :: Python :: 2 :: Only`` specified. Ideally you should
also specify each major/minor version of Python that you do support, e.g.
``Programming Language :: Python :: 2.7``.
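
As a rough sketch, the classifiers for a project that so far runs only under
Python 2.6 and 2.7 might be collected as shown here. The name ``CLASSIFIERS``,
the project name in the comment, and the choice of minor versions are
placeholders for illustration, not something this HOWTO prescribes::

    # Hypothetical classifier list for a project that is still Python 2 only.
    CLASSIFIERS = [
        'Programming Language :: Python :: 2 :: Only',
        'Programming Language :: Python :: 2.6',
        'Programming Language :: Python :: 2.7',
    ]

    # In setup.py the list would be handed to setup(), along the lines of:
    #   setup(name='example-project', classifiers=CLASSIFIERS, ...)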

Have good test coverage
-----------------------

Once you have your code supporting the oldest version of Python 2 you want it
to, you will want to make sure your test suite has good coverage. A good rule of
thumb is that you want to be confident enough in your test suite that any
failures that appear after having tools rewrite your code are actual bugs in the
tools and not in your code. If you want a number to aim for, try to get over 80%
coverage (and don't feel bad if you can't easily get past 90%). If you
don't already have a tool to measure test coverage then coverage.py_ is
recommended.

Learn the differences between Python 2 & 3
-------------------------------------------

Once you have your code well-tested you are ready to begin porting your code to
Python 3! But to fully understand how your code is going to change and what
you want to look out for while you code, you will want to learn what changes
Python 3 makes relative to Python 2. Typically the two best ways of doing that
are reading the `"What's New"`_ doc for each release of Python 3 and the
`Porting to Python 3`_ book (which is free online). There is also a handy
`cheat sheet`_ from the Python-Future project.


Update your code
----------------

Once you feel like you know what is different in Python 3 compared to Python 2,
it's time to update your code! You have a choice between two tools in porting
your code automatically: Modernize_ and Futurize_. Which tool you choose will
depend on how much like Python 3 you want your code to be. Futurize_ does its
best to make Python 3 idioms and practices exist in Python 2, e.g. backporting
the ``bytes`` type from Python 3 so that you have semantic parity between the
major versions of Python. Modernize_, on the other hand, is more conservative
and targets a Python 2/3 subset of Python, relying on six_ to help provide
compatibility.

Regardless of which tool you choose, they will update your code to run under
Python 3 while staying compatible with the version of Python 2 you started with.
Depending on how conservative you want to be, you may want to run the tool over
your test suite first and visually inspect the diff to make sure the
transformation is accurate. After you have transformed your test suite and
verified that all the tests still pass as expected, then you can transform your
application code knowing that any test which fails is a translation failure.

Unfortunately the tools can't automate everything to make your code work under
Python 3 and so there are a handful of things you will need to update manually
to get full Python 3 support (which of these steps are necessary varies between
the tools). Read the documentation for the tool you choose to use to see what it
fixes by default and what it can do optionally to know what will (not) be fixed
for you and what you may have to fix on your own (e.g. using ``io.open()`` over
the built-in ``open()`` function is off by default in Modernize). Luckily,
though, there are only a couple of things to watch out for which can be
considered large issues that may be hard to debug if not watched for.

Division
++++++++

In Python 3, ``5 / 2 == 2.5`` and not ``2``; all division between ``int`` values
results in a ``float``. This change has actually been planned since Python 2.2
which was released in 2002. Since then users have been encouraged to add
``from __future__ import division`` to any and all files which use the ``/`` and
``//`` operators or to be running the interpreter with the ``-Q`` flag. If you
have not been doing this then you will need to go through your code and do two
things:

#. Add ``from __future__ import division`` to your files
#. Update any division operator as necessary to either use ``//`` to use floor
   division or continue using ``/`` and expect a float

The reason that ``/`` isn't simply translated to ``//`` automatically is that if
an object defines its own ``__div__`` method but not ``__floordiv__`` then your
code would begin to fail.
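
The following is a minimal sketch of both points, written to run unchanged on
Python 2.6+ and Python 3. The ``Ratio`` class is hypothetical and exists only
to show what happens to a type that supports ``/`` but not ``//``::

    from __future__ import division  # changes ``/`` on Python 2, no-op on 3


    class Ratio(object):
        """Hypothetical type that supports ``/`` but deliberately not ``//``."""

        def __init__(self, value):
            self.value = value

        def __truediv__(self, other):
            return Ratio(self.value / other)

        # Name that Python 2 looks up when the __future__ import is absent.
        __div__ = __truediv__


    true_quotient = 5 / 2     # 2.5 on both versions thanks to the import
    floor_quotient = 5 // 2   # 2 on both versions

    halved = Ratio(5) / 2     # fine: the class defines true division

    try:
        Ratio(5) // 2         # where a blind ``/`` to ``//`` rewrite would land
    except TypeError:
        floor_division_supported = False
    else:
        floor_division_supported = True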

Text versus binary data
+++++++++++++++++++++++

In Python 2 you could use the ``str`` type for both text and binary data.
Unfortunately this conflation of two different concepts could lead to brittle
code which sometimes worked for either kind of data, sometimes not. It also
could lead to confusing APIs if people didn't explicitly state that something
that accepted ``str`` accepted either text or binary data instead of one
specific type. This complicated the situation especially for anyone supporting
multiple languages as APIs wouldn't bother explicitly supporting ``unicode``
when they claimed text data support.

To make the distinction between text and binary data clearer and more
pronounced, Python 3 did what most languages created in the age of the internet
have done and made text and binary data distinct types that cannot blindly be
mixed together (Python predates widespread access to the internet). For any code
that only deals with text or only binary data, this separation doesn't pose an
issue. But for code that has to deal with both, it does mean you might have to
now care about when you are using text compared to binary data, which is why
this cannot be entirely automated.

To start, you will need to decide which APIs take text and which take binary
(it is **highly** recommended you don't design APIs that can take both due to
the difficulty of keeping the code working; as stated earlier it is difficult to
do well). This means making sure the APIs that take text can work with
``unicode`` in Python 2 and those that work with binary data work with the
``bytes`` type from Python 3 and thus a subset of ``str`` in Python 2 (which the
``bytes`` type in Python 2 is an alias for). Usually the biggest issue is
realizing which methods exist for which types in Python 2 & 3 simultaneously
(for text that's ``unicode`` in Python 2 and ``str`` in Python 3, for binary
that's ``str``/``bytes`` in Python 2 and ``bytes`` in Python 3). The following
table lists the **unique** methods of each data type across Python 2 & 3
(e.g., the ``decode()`` method is usable on the equivalent binary data type in
either Python 2 or 3, but it can't be used by the text data type consistently
between Python 2 and 3 because ``str`` in Python 3 doesn't have the method).

======================== =====================
**Text data**            **Binary data**
------------------------ ---------------------
__mod__ (``%`` operator)
------------------------ ---------------------
\                        decode
------------------------ ---------------------
encode
------------------------ ---------------------
format
------------------------ ---------------------
isdecimal
------------------------ ---------------------
isnumeric
======================== =====================
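
One way to see the table in practice is to ask each type which of the listed
methods it actually has. This is only an illustrative sketch: the names
``TEXT_ONLY``, ``BINARY_ONLY``, ``text_type`` and ``binary_type`` are made up
here, the ``%`` operator is left out, and the results described in the
comments are those of a Python 3 interpreter::

    # Method names taken from the table above (the ``%`` operator is omitted).
    TEXT_ONLY = ['encode', 'format', 'isdecimal', 'isnumeric']
    BINARY_ONLY = ['decode']

    text_type = type(u'')     # ``unicode`` on Python 2, ``str`` on Python 3
    binary_type = type(b'')   # ``str`` on Python 2, ``bytes`` on Python 3

    # On Python 3 none of the text-only methods exist on the binary type,
    # and ``decode`` does not exist on the text type.
    missing_on_binary = [name for name in TEXT_ONLY
                         if not hasattr(binary_type, name)]
    missing_on_text = [name for name in BINARY_ONLY
                       if not hasattr(text_type, name)]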

Making the distinction easier to handle can be accomplished by encoding and
decoding between binary data and text at the edge of your code. This means that
when you receive text in binary data, you should immediately decode it. And if
your code needs to send text as binary data then encode it as late as possible.
This allows your code to work with only text internally and thus eliminates
having to keep track of what type of data you are working with.
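
A small sketch of that "decode early, encode late" shape might look like the
following. The function ``shout`` and its UTF-8 default are assumptions made
for the example, not part of any library::

    def shout(raw_request, encoding='utf-8'):
        """Hypothetical handler: bytes in, bytes out, text in between."""
        text = raw_request.decode(encoding)    # decode as soon as data arrives
        reply = text.strip().upper() + u'!'    # internal logic sees only text
        return reply.encode(encoding)          # encode just before sending


    response = shout(b'  caf\xc3\xa9  ')

Everything between the first and last line of ``shout`` can assume it is
handling text, which is the bookkeeping the approach is meant to remove.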

The next issue is making sure you know whether the string literals in your code
represent text or binary data. At minimum you should add a ``b`` prefix to any
literal that represents binary data. For text you should either use the
``from __future__ import unicode_literals`` statement or add a ``u`` prefix to
the text literal.
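
For illustration, explicit prefixes might look like this; the constant names
are invented for the example. Note that the ``u`` prefix is accepted by
Python 2 and by Python 3.3 or newer, but not by Python 3.0 through 3.2::

    MAGIC = b'\x89PNG'       # binary data: ``str`` on Python 2, ``bytes`` on 3
    GREETING = u'caf\xe9'    # text: ``unicode`` on Python 2, ``str`` on 3

    # Text only becomes binary data through an explicit encode step.
    encoded_greeting = GREETING.encode('utf-8')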

As part of this dichotomy you also need to be careful about opening files.
Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading).  Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** make a decision of whether a file will be used for
binary access (allowing binary data to be read and/or written) or text access
(allowing text data to be read and/or written). You should also use :func:`io.open`
for opening files instead of the built-in :func:`open` function as the :mod:`io`
module is consistent from Python 2 to 3 while the built-in :func:`open` function
is not (in Python 3 it's actually :func:`io.open`).
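
The sketch below writes a scratch file as text and reads it back both ways.
The temporary file and the UTF-8 encoding are assumptions made for the
example::

    import io
    import os
    import tempfile

    # Hypothetical scratch file, used only for this illustration.
    handle, path = tempfile.mkstemp()
    os.close(handle)

    try:
        # Text access: write text and name the encoding explicitly.
        with io.open(path, 'w', encoding='utf-8') as text_file:
            text_file.write(u'caf\xe9')

        # Binary access: the ``b`` mode yields the raw encoded bytes.
        with io.open(path, 'rb') as binary_file:
            raw = binary_file.read()

        # Text access again: the bytes are decoded back into text.
        with io.open(path, 'r', encoding='utf-8') as text_file:
            text = text_file.read()
    finally:
        os.remove(path)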

The constructors of both ``str`` and ``bytes`` have different semantics for the
same arguments between Python 2 & 3. Passing an integer to ``bytes`` in Python 2
will give you the string representation of the integer: ``bytes(3) == '3'``.
But in Python 3, an integer argument to ``bytes`` will give you a bytes object
as long as the integer specified, filled with null bytes:
``bytes(3) == b'\x00\x00\x00'``. A similar worry is necessary when passing a
bytes object to ``str``. In Python 2 you just get the bytes object back:
``str(b'3') == b'3'``. But in Python 3 you get the string representation of the
bytes object: ``str(b'3') == "b'3'"``.
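
One possible way to sidestep the difference, offered here as a suggestion
rather than something the tools do for you, is to spell out the conversion you
actually mean instead of relying on the constructors::

    # Each of these gives the same result on Python 2 and Python 3.
    number_as_bytes = str(3).encode('ascii')   # the digits of 3 as binary data
    zero_filled = b'\x00' * 3                  # three null bytes
    text_from_bytes = b'3'.decode('ascii')     # the text that the bytes encode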

Finally, the indexing of binary data requires careful handling (slicing does
**not** require any special handling). In Python 2,
``b'123'[1] == b'2'`` while in Python 3 ``b'123'[1] == 50``. Because binary data
is simply a collection of binary numbers, Python 3 returns the integer value for
the byte you index on. But in Python 2 because ``bytes == str``, indexing
returns a one-item slice of bytes. The six_ project has a function
named ``six.indexbytes()`` which will return an integer like in Python 3:
``six.indexbytes(b'123', 1)``.
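
If you would rather not depend on six for this one operation, a hand-rolled
helper along these lines behaves the same on both versions. ``byte_at`` is a
hypothetical name, and the body is a stand-in written for this example, not
the six implementation::

    def byte_at(data, index):
        """Return the integer value of the byte at a non-negative ``index``."""
        # A one-item slice is a bytes object on both Python 2 and 3, and
        # ord() accepts a length-one bytes object on both.
        return ord(data[index:index + 1])


    second = byte_at(b'123', 1)
    middle = b'123'[1:2]   # slicing needs no special handling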

To summarize:

#. Decide which of your APIs take text and which take binary data
#. Make sure that your code that works with text also works with ``unicode`` and
   code for binary data works with ``bytes`` in Python 2 (see the table above
   for what methods you cannot use for each type)
#. Mark all binary literals with a ``b`` prefix, use a ``u`` prefix or
   :mod:`__future__` import statement for text literals
#. Decode binary data to text as soon as possible, encode text as binary data as
   late as possible
#. Open files using :func:`io.open` and make sure to specify the ``b`` mode when
   appropriate
#. Be careful when indexing binary data

Prevent compatibility regressions
---------------------------------

Once you have fully translated your code to be compatible with Python 3, you
will want to make sure your code doesn't regress and stop working under
Python 3. This is especially true if you have a dependency which is blocking you
from actually running under Python 3 at the moment.

To help with staying compatible, any new modules you create should have
at least the following block of code at the top::

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function
    from __future__ import unicode_literals

You can also run Python 2 with the ``-3`` flag to be warned about various
compatibility issues your code triggers during execution. If you turn warnings
into errors with ``-Werror`` then you can make sure that you don't accidentally
miss a warning.

You can also use the Pylint_ project and its ``--py3k`` flag to lint your code
to receive warnings when your code begins to deviate from Python 3
compatibility. This also prevents you from having to run Modernize_ or Futurize_
over your code regularly to catch compatibility regressions. This does require
that you only support Python 2.7 and Python 3.4 or newer as that is Pylint's
minimum Python version support.


Check which dependencies block your transition
----------------------------------------------

**After** you have made your code compatible with Python 3 you should begin to
care about whether your dependencies have also been ported. The caniusepython3_
project was created to help you determine which projects
-- directly or indirectly -- are blocking you from supporting Python 3. There
is both a command-line tool as well as a web interface at
https://caniusepython3.com .

The project also provides code which you can integrate into your test suite so
that you will have a failing test when you no longer have dependencies blocking
you from using Python 3. This allows you to avoid having to manually check your
dependencies and to be notified quickly when you can start running on Python 3.

Update your ``setup.py`` file to denote Python 3 compatibility
--------------------------------------------------------------

Once your code works under Python 3, you should update the classifiers in
your ``setup.py`` to contain ``Programming Language :: Python :: 3`` and to not
specify sole Python 2 support. This will tell
anyone using your code that you support Python 2 **and** 3. Ideally you will
also want to add classifiers for each major/minor version of Python you now
support.
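
For a project that now runs on both major versions, the classifier list might
end up looking something like this. The name ``CLASSIFIERS`` and the specific
minor versions are only an example::

    # Hypothetical classifier list once Python 3 support has landed.
    CLASSIFIERS = [
        'Programming Language :: Python :: 2',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.4',
    ]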
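
Use continuous integration to stay compatible
---------------------------------------------

Once you are able to fully run under Python 3 you will want to make sure your
code always works under both Python 2 & 3. Probably the best tool for running
your tests under multiple Python interpreters is tox_. You can then integrate
tox with your continuous integration system so that you never accidentally break
Python 2 or 3 support.

You may also want to use the ``-bb`` flag with the Python 3 interpreter to
trigger an exception when you are comparing bytes to strings. Usually such a
comparison simply evaluates to ``False``, but if you made a mistake in your
separation of text/binary data handling you may be accidentally comparing text
and binary data. This flag will raise an exception when that occurs to help
track down such cases.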
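
To make the failure mode concrete, here is a small sketch of the kind of
comparison the flag is meant to catch. The function name is invented for the
example, and the behaviour described in the comments is that of Python 3::

    import sys


    def mixed_comparison():
        # Without ``-b`` or ``-bb`` this is silently False on Python 3;
        # under ``-bb`` the same line raises BytesWarning instead.
        return b'abc' == u'abc'


    # 0 by default, 1 when run with ``-b``, 2 when run with ``-bb``.
    bytes_warning_level = sys.flags.bytes_warning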

And that's mostly it! At this point your code base is compatible with both
Python 2 and 3 simultaneously. Your testing will also be set up so that you
don't accidentally break Python 2 or 3 compatibility regardless of which version
you typically run your tests under while developing.


Dropping Python 2 support completely
====================================

If you are able to fully drop support for Python 2, then the steps required
to transition to Python 3 simplify greatly.

#. Update your code to only support Python 2.7
#. Make sure you have good test coverage (coverage.py_ can help)
#. Learn the differences between Python 2 & 3
#. Use 2to3_ to rewrite your code to run only under Python 3

After this your code will be fully Python 3 compliant but in a way that is not
supported by Python 2. You should also update the classifiers in your
``setup.py`` to contain ``Programming Language :: Python :: 3 :: Only``.


.. _2to3: https://docs.python.org/3/library/2to3.html
.. _caniusepython3: https://pypi.python.org/pypi/caniusepython3
.. _cheat sheet: http://python-future.org/compatible_idioms.html
.. _coverage.py: https://pypi.python.org/pypi/coverage
.. _Futurize: http://python-future.org/automatic_conversion.html
.. _Modernize: https://python-modernize.readthedocs.org/en/latest/
.. _Porting to Python 3: http://python3porting.com/
.. _Pylint: https://pypi.python.org/pypi/pylint
.. _Python 3 Q & A: https://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html

.. _python-future: http://python-future.org/
.. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
.. _six: https://pypi.python.org/pypi/six
.. _tox: https://pypi.python.org/pypi/tox
.. _trove classifier: https://pypi.python.org/pypi?%3Aaction=list_classifiers
.. _"What's New": https://docs.python.org/3/whatsnew/index.html