7: ADVANCED TOPICS

At this point, hopefully, you have a handle on how the development process
works. There is still more to learn, however! This section will cover a
number of topics which can be helpful for developers wanting to become a
regular part of the Linux kernel development process.

7.1: MANAGING PATCHES WITH GIT

The use of distributed version control for the kernel began in early 2002,
when Linus first started playing with the proprietary BitKeeper
application. While BitKeeper was controversial, the approach to software
version management it embodied most certainly was not. Distributed version
control enabled an immediate acceleration of the kernel development
project. In current times, there are several free alternatives to
BitKeeper. For better or for worse, the kernel project has settled on git
as its tool of choice.

Managing patches with git can make life much easier for the developer,
especially as the volume of those patches grows. Git also has its rough
edges and poses certain hazards; it is a young and powerful tool which is
still being civilized by its developers. This document will not attempt to
teach the reader how to use git; that would be sufficient material for a
long document in its own right. Instead, the focus here will be on how git
fits into the kernel development process in particular. Developers who
wish to come up to speed with git will find more information at:

	http://git-scm.com/

	http://www.kernel.org/pub/software/scm/git/docs/user-manual.html

and on various tutorials found on the web.

The first order of business is to read the above sites and get a solid
understanding of how git works before trying to use it to make patches
available to others. A git-using developer should be able to obtain a copy
of the mainline repository, explore the revision history, commit changes to
the tree, use branches, etc.
An understanding of git's tools for the
rewriting of history (such as rebase) is also useful. Git comes with its
own terminology and concepts; a new user of git should know about refs,
remote branches, the index, fast-forward merges, pushes and pulls, detached
heads, etc. It can all be a little intimidating at the outset, but the
concepts are not that hard to grasp with a bit of study.

Using git to generate patches for submission by email can be a good
exercise while coming up to speed.

When you are ready to start putting up git trees for others to look at, you
will, of course, need a server that can be pulled from. Setting up such a
server with git-daemon is relatively straightforward if you have a system
which is accessible to the Internet. Otherwise, free, public hosting sites
(GitHub, for example) are starting to appear on the net. Established
developers can get an account on kernel.org, but those are not easy to come
by; see http://kernel.org/faq/ for more information.

The normal git workflow involves the use of a lot of branches. Each line
of development can be separated into a separate "topic branch" and
maintained independently. Branches in git are cheap; there is no reason
not to make free use of them. And, in any case, you should not do your
development in any branch which you intend to ask others to pull from.
Publicly-available branches should be created with care; merge in patches
from development branches when they are in complete form and ready to go -
not before.

Git provides some powerful tools which can allow you to rewrite your
development history. An inconvenient patch (one which breaks bisection,
say, or which has some other sort of obvious bug) can be fixed in place or
made to disappear from the history entirely. A patch series can be
rewritten as if it had been written on top of today's mainline, even though
you have been working on it for months.
Changes can be transparently
shifted from one branch to another. And so on. Judicious use of git's
ability to revise history can help in the creation of clean patch sets with
fewer problems.

Excessive use of this capability can lead to other problems, though, beyond
a simple obsession with the creation of the perfect project history.
Rewriting history will rewrite the changes contained in that history,
turning a tested (hopefully) kernel tree into an untested one. But, beyond
that, developers cannot easily collaborate if they do not have a shared
view of the project history; if you rewrite history which other developers
have pulled into their repositories, you will make life much more difficult
for those developers. So a simple rule of thumb applies here: history
which has been exported to others should generally be seen as immutable
thereafter.

So, once you push a set of changes to your publicly-available server, those
changes should not be rewritten. Git will attempt to enforce this rule if
you try to push changes which do not result in a fast-forward merge
(i.e. changes which do not share the same history). It is possible to
override this check, and there may be times when it is necessary to rewrite
an exported tree. Moving changesets between trees to avoid conflicts in
linux-next is one example. But such actions should be rare. This is one
of the reasons why development should be done in private branches (which
can be rewritten if necessary) and only moved into public branches when
it is in a reasonably advanced state.

As the mainline (or other tree upon which a set of changes is based)
advances, it is tempting to merge with that tree to stay on the leading
edge. For a private branch, rebasing can be an easy way to keep up with
another tree, but rebasing is not an option once a tree is exported to the
world. Once that happens, a full merge must be done.
Merging occasionally
makes good sense, but overly frequent merges can clutter the history
needlessly. The suggested technique in this case is to merge infrequently,
and generally only at specific release points (such as a mainline -rc
release). If you are nervous about specific changes, you can always
perform test merges in a private branch. The git "rerere" tool can be
useful in such situations; it remembers how merge conflicts were resolved
so that you don't have to do the same work twice.

One of the biggest recurring complaints about tools like git is this: the
mass movement of patches from one repository to another makes it easy to
slip in ill-advised changes which go into the mainline below the review
radar. Kernel developers tend to get unhappy when they see that kind of
thing happening; putting up a git tree with unreviewed or off-topic patches
can affect your ability to get trees pulled in the future. Quoting Linus:

	You can send me patches, but for me to pull a git patch from you, I
	need to know that you know what you're doing, and I need to be able
	to trust things *without* then having to go and check every
	individual change by hand.

(http://lwn.net/Articles/224135/).

To avoid this kind of situation, ensure that all patches within a given
branch stick closely to the associated topic; a "driver fixes" branch
should not be making changes to the core memory management code. And, most
importantly, do not use a git tree to bypass the review process. Post an
occasional summary of the tree to the relevant list, and, when the time is
right, request that the tree be included in linux-next.

If and when others start to send patches for inclusion into your tree,
don't forget to review them.
Also ensure that you maintain the correct
authorship information; the git "am" tool does its best in this regard, but
you may have to add a "From:" line to the patch if it has been relayed to
you via a third party.

When requesting a pull, be sure to give all the relevant information: where
your tree is, what branch to pull, and what changes will result from the
pull. The git request-pull command can be helpful in this regard; it will
format the request as other developers expect, and will also check to be
sure that you have remembered to push those changes to the public server.


7.2: REVIEWING PATCHES

Some readers will certainly object to putting this section with "advanced
topics" on the grounds that even beginning kernel developers should be
reviewing patches. It is certainly true that there is no better way to
learn how to program in the kernel environment than by looking at code
posted by others. In addition, reviewers are forever in short supply; by
looking at code you can make a significant contribution to the process as a
whole.

Reviewing code can be an intimidating prospect, especially for a new kernel
developer who may well feel nervous about questioning code - in public -
which has been posted by those with more experience. Even code written by
the most experienced developers can be improved, though. Perhaps the best
piece of advice for reviewers (all reviewers) is this: phrase review
comments as questions rather than criticisms. Asking "how does the lock
get released in this path?" will always work better than stating "the
locking here is wrong."

Different developers will review code from different points of view. Some
are mostly concerned with coding style and whether code lines have trailing
white space. Others will focus primarily on whether the change implemented
by the patch as a whole is a good thing for the kernel or not.
Yet others
will check for problematic locking, excessive stack usage, possible
security issues, duplication of code found elsewhere, adequate
documentation, adverse effects on performance, user-space ABI changes, etc.
All types of review, if they lead to better code going into the kernel, are
welcome and worthwhile.
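
The most mechanical of the checks mentioned above, trailing white space on
added lines, is easy to automate. What follows is only a minimal sketch in
Python, assuming a git-style unified diff as input; the function name
find_trailing_whitespace is hypothetical and is not part of any kernel
tool. In practice the kernel tree's scripts/checkpatch.pl performs this
check along with many others.

```python
def find_trailing_whitespace(patch_text):
    """Return (patch_line_number, content) pairs for added lines that end
    in white space.  Assumes a git-style unified diff; line numbers count
    lines of the patch itself, starting at 1.  Illustrative helper only."""
    problems = []
    in_hunk = False
    for number, line in enumerate(patch_text.split("\n"), start=1):
        if line.startswith("diff "):
            in_hunk = False   # a new per-file header begins
        elif line.startswith("@@"):
            in_hunk = True    # the hunk body follows this marker
        elif in_hunk and line.startswith("+"):
            content = line[1:]  # drop the leading "+" marker
            if content != content.rstrip():
                problems.append((number, content))
    return problems


if __name__ == "__main__":
    sample = (
        "diff --git a/f.c b/f.c\n"
        "--- a/f.c\n"
        "+++ b/f.c\n"
        "@@ -1,2 +1,3 @@\n"
        " int x;\n"
        "+int y; \n"
        "+int z;\n"
    )
    for number, content in find_trailing_whitespace(sample):
        print("patch line %d has trailing white space: %r" % (number, content))
```

Tracking whether the scan is inside a hunk keeps the "+++" file header from
being mistaken for an added line. A script like this only catches the
cosmetic problems; the other kinds of review described above still need a
human reader.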