.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you figure out how best to support both
   Python 2 & 3 simultaneously.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.

   If you would like to read one core Python developer's take on why Python 3
   came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.

   For help with porting, you can email the python-porting_ mailing list with
   questions.

The Short Explanation
=====================

To make your project single-source Python 2/3 compatible, the basic steps
are:

#. Update your code to drop support for Python 2.5 or older (supporting only
   Python 2.7 is ideal)
#. Make sure you have good test coverage (coverage.py_ can help;
   ``pip install coverage``)
#. Learn the differences between Python 2 & 3
#. Use Modernize_ or Futurize_ to update your code (``pip install modernize`` or
   ``pip install future``, respectively)
#. Use Pylint_ to help make sure you don't regress on your Python 3 support
   (if only supporting Python 2.7/3.4 or newer; ``pip install pylint``)
#. Use caniusepython3_ to find out which of your dependencies are blocking your
   use of Python 3 (``pip install caniusepython3``)
#. Once your dependencies are no longer blocking you, use continuous integration
   to make sure you stay compatible with Python 2 & 3 (tox_ can help test
   against multiple versions of Python; ``pip install tox``)

If you are dropping support for Python 2 entirely, then after you learn the
differences between Python 2 & 3 you can run 2to3_ over your code and skip the
rest of the steps outlined above.


Details
=======

A key point about supporting Python 2 & 3 simultaneously is that you can start
**today**! Even if your dependencies do not support Python 3 yet, that does
not mean you can't modernize your code **now** to support Python 3. Most changes
required to support Python 3 lead to cleaner code using newer practices even in
Python 2.

Another key point is that modernizing your Python 2 code to also support
Python 3 is largely automated for you. While you might have to make some API
decisions thanks to Python 3 clarifying text data versus binary data, the
lower-level work is now mostly done for you, so you can at least benefit from
the automated changes immediately.

Keep those key points in mind while you read on about the details of porting
your code to support Python 2 & 3 simultaneously.


Drop support for Python 2.5 and older (at least)
------------------------------------------------

While you can make Python 2.5 work with Python 3, it is **much** easier if you
only have to work with Python 2.6 or newer (and easier still if you only have
to work with Python 2.7). If dropping Python 2.5 is not an option then the six_
project can help you support Python 2.5 & 3 simultaneously
(``pip install six``). Do realize, though, that nearly all the projects listed
in this HOWTO will not be available to you.

If you are able to only support Python 2.6 or newer, then the required changes
to your code should continue to look and feel like idiomatic Python code.
At worst you will have to use a function instead of a method in some instances
or have to import a function instead of using a built-in one, but otherwise the
overall transformation should not feel foreign to you.

But please aim for Python 2.7. Bugfixes for that version of Python will continue
until 2020 while Python 2.6 is no longer supported. There are also some tools
mentioned in this HOWTO which do not support Python 2.6 (e.g., Pylint_), and
this will become more commonplace as time goes on.

Make sure you specify the proper version support in your ``setup.py`` file
--------------------------------------------------------------------------

In your ``setup.py`` file you should have the proper `trove classifier`_
specifying what versions of Python you support. As your project does not support
Python 3 yet you should at least have
``Programming Language :: Python :: 2 :: Only`` specified. Ideally you should
also specify each major/minor version of Python that you do support, e.g.
``Programming Language :: Python :: 2.7``.

Have good test coverage
-----------------------

Once you have your code supporting the oldest version of Python 2 you want it
to, you will want to make sure your test suite has good coverage. A good rule of
thumb is that you want to be confident enough in your test suite that any
failures that appear after having tools rewrite your code are actual bugs in the
tools and not in your code. If you want a number to aim for, try to get over 80%
coverage (and don't feel bad if you can't easily get past 90%). If you
don't already have a tool to measure test coverage then coverage.py_ is
recommended.

Learn the differences between Python 2 & 3
-------------------------------------------

Once you have your code well-tested you are ready to begin porting your code to
Python 3!
But to fully understand how your code is going to change and what
you want to look out for while you code, you will want to learn what changes
Python 3 makes relative to Python 2. Typically the two best ways of doing that
are reading the `"What's New"`_ doc for each release of Python 3 and the
`Porting to Python 3`_ book (which is free online). There is also a handy
`cheat sheet`_ from the Python-Future project.


Update your code
----------------

Once you feel like you know what is different in Python 3 compared to Python 2,
it's time to update your code! You have a choice between two tools in porting
your code automatically: Modernize_ and Futurize_. Which tool you choose will
depend on how much like Python 3 you want your code to be. Futurize_ does its
best to make Python 3 idioms and practices exist in Python 2, e.g. backporting
the ``bytes`` type from Python 3 so that you have semantic parity between the
major versions of Python. Modernize_, on the other hand, is more conservative
and targets a Python 2/3 subset of Python, relying on six_ to help provide
compatibility.

Regardless of which tool you choose, they will update your code to run under
Python 3 while staying compatible with the version of Python 2 you started with.
Depending on how conservative you want to be, you may want to run the tool over
your test suite first and visually inspect the diff to make sure the
transformation is accurate. After you have transformed your test suite and
verified that all the tests still pass as expected, then you can transform your
application code knowing that any test which fails is a translation failure.

Unfortunately the tools can't automate everything to make your code work under
Python 3 and so there are a handful of things you will need to update manually
to get full Python 3 support (which of these steps are necessary varies between
the tools).
Read the documentation for the tool you choose to use to see what it
fixes by default and what it can do optionally to know what will (not) be fixed
for you and what you may have to fix on your own (e.g. using ``io.open()`` over
the built-in ``open()`` function is off by default in Modernize). Luckily,
though, there are only a couple of things to watch out for which can be
considered large issues that may be hard to debug if not watched for.

Division
++++++++

In Python 3, ``5 / 2 == 2.5`` and not ``2``; all division between ``int`` values
results in a ``float``. This change has actually been planned since Python 2.2
which was released in 2001. Since then users have been encouraged to add
``from __future__ import division`` to any and all files which use the ``/`` and
``//`` operators or to run the interpreter with the ``-Q`` flag. If you
have not been doing this then you will need to go through your code and do two
things:

#. Add ``from __future__ import division`` to your files
#. Update any division operator as necessary to either use ``//`` to use floor
   division or continue using ``/`` and expect a float

The reason that ``/`` isn't simply translated to ``//`` automatically is that if
an object defines its own ``__div__`` method but not ``__floordiv__`` then your
code would begin to fail.

Text versus binary data
+++++++++++++++++++++++

In Python 2 you could use the ``str`` type for both text and binary data.
Unfortunately this conflation of two different concepts could lead to brittle
code which sometimes worked for either kind of data, sometimes not. It also
could lead to confusing APIs if people didn't explicitly state that something
that accepted ``str`` accepted either text or binary data instead of one
specific type.
This complicated the situation especially for anyone supporting
multiple languages as APIs wouldn't bother explicitly supporting ``unicode``
when they claimed text data support.

To make the distinction between text and binary data clearer and more
pronounced, Python 3 did what most languages created in the age of the internet
have done and made text and binary data distinct types that cannot blindly be
mixed together (Python predates widespread access to the internet). For any code
that only deals with text or only binary data, this separation doesn't pose an
issue. But for code that has to deal with both, it does mean you might have to
now care about when you are using text compared to binary data, which is why
this cannot be entirely automated.

To start, you will need to decide which APIs take text and which take binary
(it is **highly** recommended you don't design APIs that can take both due to
the difficulty of keeping the code working; as stated earlier it is difficult to
do well). In Python 2 this means making sure the APIs that take text can work
with ``unicode`` and those that work with binary data work with the
``bytes`` type from Python 3 and thus a subset of ``str`` in Python 2 (which the
``bytes`` type in Python 2 is an alias for). Usually the biggest issue is
realizing which methods exist for which types in Python 2 & 3 simultaneously
(for text that's ``unicode`` in Python 2 and ``str`` in Python 3, for binary
that's ``str``/``bytes`` in Python 2 and ``bytes`` in Python 3). The following
table lists the **unique** methods of each data type across Python 2 & 3
(e.g., the ``decode()`` method is usable on the equivalent binary data type in
either Python 2 or 3, but it can't be used by the text data type consistently
between Python 2 and 3 because ``str`` in Python 3 doesn't have the method).

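
As a minimal sketch of the type pairing described above (the variable names are
purely illustrative), the following should run unchanged on Python 2.6 or newer
and on Python 3, calling only ``encode()`` on text and ``decode()`` on binary
data::

    from __future__ import unicode_literals

    # Text: ``unicode`` in Python 2, ``str`` in Python 3.
    text = "caf\u00e9"

    # Binary: ``str``/``bytes`` in Python 2, ``bytes`` in Python 3.
    data = text.encode("utf-8")
    assert isinstance(data, bytes)

    # ``decode()`` is available on the binary type in both major versions.
    assert data.decode("utf-8") == text

The text value has a ``decode()`` method in Python 2 but not in Python 3, so
code meant to run on both should not call it.
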

======================== =====================
**Text data**            **Binary data**
------------------------ ---------------------
__mod__ (``%`` operator)
------------------------ ---------------------
\                        decode
------------------------ ---------------------
encode
------------------------ ---------------------
format
------------------------ ---------------------
isdecimal
------------------------ ---------------------
isnumeric
======================== =====================

Making the distinction easier to handle can be accomplished by encoding and
decoding between binary data and text at the edge of your code. This means that
when you receive text in binary data, you should immediately decode it. And if
your code needs to send text as binary data then encode it as late as possible.
This allows your code to work with only text internally and thus eliminates
having to keep track of what type of data you are working with.

The next issue is making sure you know whether the string literals in your code
represent text or binary data. At minimum you should add a ``b`` prefix to any
literal that represents binary data. For text you should either use the
``from __future__ import unicode_literals`` statement or add a ``u`` prefix to
the text literal.

As part of this dichotomy you also need to be careful about opening files.
Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** make a decision of whether a file will be used for
binary access (allowing binary data to be read and/or written) or text access
(allowing text data to be read and/or written).
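
A rough sketch of that decision, assuming a hypothetical scratch file created
only for this illustration: the same file is read once in binary mode and
decoded at the edge, and once in text mode with an explicit encoding. It uses
:func:`io.open` so that it behaves the same way under Python 2.6 or newer and
Python 3::

    from __future__ import unicode_literals

    import io
    import os
    import tempfile

    # Hypothetical scratch file; the name is not significant.
    path = os.path.join(tempfile.mkdtemp(), "example.dat")

    # Binary access: the ``b`` mode means bytes go in and bytes come out.
    with io.open(path, "wb") as f:
        f.write(b"caf\xc3\xa9")

    with io.open(path, "rb") as f:
        raw = f.read()
    text = raw.decode("utf-8")  # decode as soon as the data enters your code

    # Text access: with an explicit encoding, reading yields text directly.
    with io.open(path, "r", encoding="utf-8") as f:
        same_text = f.read()

    assert text == same_text == "caf\u00e9"
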
You should also use :func:`io.open`
for opening files instead of the built-in :func:`open` function as the :mod:`io`
module is consistent from Python 2 to 3 while the built-in :func:`open` function
is not (in Python 3 it's actually :func:`io.open`).

The constructors of both ``str`` and ``bytes`` have different semantics for the
same arguments between Python 2 & 3. Passing an integer to ``bytes`` in Python 2
will give you the string representation of the integer: ``bytes(3) == '3'``.
But in Python 3, an integer argument to ``bytes`` will give you a bytes object
as long as the integer specified, filled with null bytes:
``bytes(3) == b'\x00\x00\x00'``. Similar care is needed when passing a
bytes object to ``str``. In Python 2 you just get the bytes object back:
``str(b'3') == b'3'``. But in Python 3 you get the string representation of the
bytes object: ``str(b'3') == "b'3'"``.

Finally, the indexing of binary data requires careful handling (slicing does
**not** require any special handling). In Python 2,
``b'123'[1] == b'2'`` while in Python 3 ``b'123'[1] == 50``. Because binary data
is simply a collection of binary numbers, Python 3 returns the integer value for
the byte you index on. But in Python 2 because ``bytes == str``, indexing
returns a one-item slice of bytes. The six_ project has a function
named ``six.indexbytes()`` which will return an integer like in Python 3:
``six.indexbytes(b'123', 1)``.

To summarize:

#. Decide which of your APIs take text and which take binary data
#. Make sure that your code that works with text also works with ``unicode`` and
   code for binary data works with ``bytes`` in Python 2 (see the table above
   for what methods you cannot use for each type)
#. Mark all binary literals with a ``b`` prefix, use a ``u`` prefix or
   :mod:`__future__` import statement for text literals
#. Decode binary data to text as soon as possible, encode text as binary data as
   late as possible
#. Open files using :func:`io.open` and make sure to specify the ``b`` mode when
   appropriate
#. Be careful when indexing binary data

Prevent compatibility regressions
---------------------------------

Once you have fully translated your code to be compatible with Python 3, you
will want to make sure your code doesn't regress and stop working under
Python 3. This is especially true if you have a dependency which is blocking you
from actually running under Python 3 at the moment.

To help with staying compatible, any new module you create should have
at least the following block of code at its top::

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function
    from __future__ import unicode_literals

You can also run Python 2 with the ``-3`` flag to be warned about various
compatibility issues your code triggers during execution. If you turn warnings
into errors with ``-Werror`` then you can make sure that you don't accidentally
miss a warning.


You can also use the Pylint_ project and its ``--py3k`` flag to lint your code
to receive warnings when your code begins to deviate from Python 3
compatibility. This also prevents you from having to run Modernize_ or Futurize_
over your code regularly to catch compatibility regressions. This does require
you only support Python 2.7 and Python 3.4 or newer as that is Pylint's
minimum Python version support.


Check which dependencies block your transition
----------------------------------------------

**After** you have made your code compatible with Python 3 you should begin to
care about whether your dependencies have also been ported.
The caniusepython3_
project was created to help you determine which projects
-- directly or indirectly -- are blocking you from supporting Python 3. There
is both a command-line tool and a web interface at
https://caniusepython3.com .

The project also provides code which you can integrate into your test suite so
that you will have a failing test when you no longer have dependencies blocking
you from using Python 3. This allows you to avoid having to manually check your
dependencies and to be notified quickly when you can start running on Python 3.

Update your ``setup.py`` file to denote Python 3 compatibility
--------------------------------------------------------------

Once your code works under Python 3, you should update the classifiers in
your ``setup.py`` to contain ``Programming Language :: Python :: 3`` and to not
specify sole Python 2 support. This will tell
anyone using your code that you support Python 2 **and** 3. Ideally you will
also want to add classifiers for each major/minor version of Python you now
support.

Use continuous integration to stay compatible
---------------------------------------------

Once you are able to fully run under Python 3 you will want to make sure your
code always works under both Python 2 & 3. Probably the best tool for running
your tests under multiple Python interpreters is tox_. You can then integrate
tox with your continuous integration system so that you never accidentally break
Python 2 or 3 support.

You may also want to use the ``-bb`` flag with the Python 3 interpreter to
trigger an exception when you are comparing bytes to strings. Usually such a
comparison simply evaluates to ``False``, but if you made a mistake in your
separation of text/binary data handling you may be accidentally comparing text
and binary data. This flag will raise an exception when that occurs to help
track down such cases.

And that's mostly it!
At this point your code base is compatible with both
Python 2 and 3 simultaneously. Your testing will also be set up so that you
don't accidentally break Python 2 or 3 compatibility regardless of which version
you typically run your tests under while developing.


Dropping Python 2 support completely
====================================

If you are able to fully drop support for Python 2, then the steps required
to transition to Python 3 simplify greatly.

#. Update your code to only support Python 2.7
#. Make sure you have good test coverage (coverage.py_ can help)
#. Learn the differences between Python 2 & 3
#. Use 2to3_ to rewrite your code to run only under Python 3

After this your code will be fully Python 3 compliant but in a way that is not
supported by Python 2. You should also update the classifiers in your
``setup.py`` to contain ``Programming Language :: Python :: 3 :: Only``.


.. _2to3: https://docs.python.org/3/library/2to3.html
.. _caniusepython3: https://pypi.python.org/pypi/caniusepython3
.. _cheat sheet: http://python-future.org/compatible_idioms.html
.. _coverage.py: https://pypi.python.org/pypi/coverage
.. _Futurize: http://python-future.org/automatic_conversion.html
.. _Modernize: https://python-modernize.readthedocs.org/en/latest/
.. _Porting to Python 3: http://python3porting.com/
.. _Pylint: https://pypi.python.org/pypi/pylint
.. _Python 3 Q & A: https://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html
.. _python-future: http://python-future.org/
.. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
.. _six: https://pypi.python.org/pypi/six
.. _tox: https://pypi.python.org/pypi/tox
.. _trove classifier: https://pypi.python.org/pypi?%3Aaction=list_classifiers
.. _"What's New": https://docs.python.org/3/whatsnew/index.html