==============================
Moving LLVM Projects to GitHub
==============================
4
5.. contents:: Table of Contents
6  :depth: 4
7  :local:
8
9Introduction
10============
11
12This is a proposal to move our current revision control system from our own
13hosted Subversion to GitHub. Below are the financial and technical arguments as
14to why we are proposing such a move and how people (and validation
15infrastructure) will continue to work with a Git-based LLVM.
16
17There will be a survey pointing at this document which we'll use to gauge the
18community's reaction and, if we collectively decide to move, the time-frame. Be
19sure to make your view count.
20
21Additionally, we will discuss this during a BoF at the next US LLVM Developer
22meeting (http://llvm.org/devmtg/2016-11/).
23
24What This Proposal is *Not* About
25=================================
26
27Changing the development policy.
28
29This proposal relates only to moving the hosting of our source-code repository
30from SVN hosted on our own servers to Git hosted on GitHub. We are not proposing
31using GitHub's issue tracker, pull-requests, or code-review.
32
Contributors will continue to earn commit access on demand under the Developer
Policy, except that a GitHub account will be required instead of an SVN
username/password-hash.
36
37Why Git, and Why GitHub?
38========================
39
40Why Move At All?
41----------------
42
43This discussion began because we currently host our own Subversion server
44and Git mirror on a voluntary basis. The LLVM Foundation sponsors the server and
45provides limited support, but there is only so much it can do.
46
47Volunteers are not sysadmins themselves, but compiler engineers that happen
48to know a thing or two about hosting servers. We also don't have 24/7 support,
49and we sometimes wake up to see that continuous integration is broken because
50the SVN server is either down or unresponsive.
51
52We should take advantage of one of the services out there (GitHub, GitLab,
53and BitBucket, among others) that offer better service (24/7 stability, disk
54space, Git server, code browsing, forking facilities, etc) for free.
55
56Why Git?
57--------
58
59Many new coders nowadays start with Git, and a lot of people have never used
60SVN, CVS, or anything else. Websites like GitHub have changed the landscape
61of open source contributions, reducing the cost of first contribution and
62fostering collaboration.
63
64Git is also the version control many LLVM developers use. Despite the
65sources being stored in a SVN server, these developers are already using Git
66through the Git-SVN integration.
67
68Git allows you to:
69
70* Commit, squash, merge, and fork locally without touching the remote server.
71* Maintain local branches, enabling multiple threads of development.
72* Collaborate on these branches (e.g. through your own fork of llvm on GitHub).
73* Inspect the repository history (blame, log, bisect) without Internet access.
74* Maintain remote forks and branches on Git hosting services and
75  integrate back to the main repository.
76
77In addition, because Git seems to be replacing many OSS projects' version
78control systems, there are many tools that are built over Git.
79Future tooling may support Git first (if not only).
80
81Why GitHub?
82-----------
83
84GitHub, like GitLab and BitBucket, provides free code hosting for open source
85projects. Any of these could replace the code-hosting infrastructure that we
86have today.
87
88These services also have a dedicated team to monitor, migrate, improve and
89distribute the contents of the repositories depending on region and load.
90
91GitHub has one important advantage over GitLab and
92BitBucket: it offers read-write **SVN** access to the repository
93(https://github.com/blog/626-announcing-svn-support).
94This would enable people to continue working post-migration as though our code
95were still canonically in an SVN repository.
96
97In addition, there are already multiple LLVM mirrors on GitHub, indicating that
98part of our community has already settled there.
99
100On Managing Revision Numbers with Git
101-------------------------------------
102
103The current SVN repository hosts all the LLVM sub-projects alongside each other.
104A single revision number (e.g. r123456) thus identifies a consistent version of
105all LLVM sub-projects.
106
Git does not use sequential integer revision numbers but instead uses a hash to
identify each commit. (Linus mentioned that the lack of such a revision number
is "the only real design mistake" in Git [TorvaldRevNum]_.)
110
111The loss of a sequential integer revision number has been a sticking point in
112past discussions about Git:
113
114- "The 'branch' I most care about is mainline, and losing the ability to say
115  'fixed in r1234' (with some sort of monotonically increasing number) would
116  be a tragic loss." [LattnerRevNum]_
117- "I like those results sorted by time and the chronology should be obvious, but
118  timestamps are incredibly cumbersome and make it difficult to verify that a
119  given checkout matches a given set of results." [TrickRevNum]_
120- "There is still the major regression with unreadable version numbers.
121  Given the amount of Bugzilla traffic with 'Fixed in...', that's a
122  non-trivial issue." [JSonnRevNum]_
123- "Sequential IDs are important for LNT and llvmlab bisection tool." [MatthewsRevNum]_.
124
125However, Git can emulate this increasing revision number:
126``git rev-list --count <commit-hash>``. This identifier is unique only
127within a single branch, but this means the tuple `(num, branch-name)` uniquely
128identifies a commit.
129
130We can thus use this revision number to ensure that e.g. `clang -v` reports a
131user-friendly revision number (e.g. `master-12345` or `4.0-5321`), addressing
132the objections raised above with respect to this aspect of Git.
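
As a rough sketch of the emulation described above, the following builds a
throwaway repository and derives a `<branch>-<count>` identifier from
`git rev-list --count`. The helper name `rev_id`, the scratch repository, and
the demo identity are assumptions made for illustration; nothing here is an
existing LLVM tool.

```shell
#!/bin/sh
# Sketch: derive a "<branch>-<count>" identifier in a throwaway repository.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the demo

for i in 1 2 3; do
  git commit -q --allow-empty -m "commit $i"
done

# rev_id is a hypothetical helper, not an existing Git or LLVM command.
rev_id() {
  branch=$(git rev-parse --abbrev-ref "$1")
  count=$(git rev-list --count "$1")
  echo "$branch-$count"
}

id=$(rev_id HEAD)
echo "$id"
```

The count only grows by one per commit when the branch history is linear,
which is one reason the no-merge policy discussed in the next section matters
for this scheme.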
133
134What About Branches and Merges?
135-------------------------------
136
In contrast to SVN, Git makes branching easy. Git's commit history is
represented as a DAG, a departure from SVN's linear history. However, we propose
to forbid merge commits in our canonical Git repository.
140
141Unfortunately, GitHub does not support server side hooks to enforce such a
142policy.  We must rely on the community to avoid pushing merge commits.
143
144GitHub offers a feature called `Status Checks`: a branch protected by
145`status checks` requires commits to be whitelisted before the push can happen.
146We could supply a pre-push hook on the client side that would run and check the
147history, before whitelisting the commit being pushed [statuschecks]_.
148However this solution would be somewhat fragile (how do you update a script
149installed on every developer machine?) and prevents SVN access to the
150repository.
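
To make the history check concrete, here is a minimal sketch of the test a
client-side script could run before a push: count the merge commits in the
range about to be pushed. The helper name `count_merges` and the scratch
repository are assumptions for illustration; a real pre-push hook would take
the range from the ref information Git passes to the hook.

```shell
#!/bin/sh
# Sketch: detect merge commits in a range of history.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the demo

git commit -q --allow-empty -m "base"
base=$(git rev-parse HEAD)

git checkout -q -b topic
git commit -q --allow-empty -m "topic work"

git checkout -q master
git commit -q --allow-empty -m "mainline work"
linear_tip=$(git rev-parse HEAD)

git merge -q --no-ff -m "merge topic" topic
merged_tip=$(git rev-parse HEAD)

# count_merges is a hypothetical helper: number of merge commits in <old>..<new>.
count_merges() {
  git rev-list --merges --count "$1..$2"
}

linear=$(count_merges "$base" "$linear_tip")
merged=$(count_merges "$base" "$merged_tip")
echo "linear range: $linear merge(s); merged range: $merged merge(s)"
```

A script built on this would refuse the push whenever the count is non-zero.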
151
152What About Commit Emails?
153-------------------------
154
155We will need a new bot to send emails for each commit. This proposal leaves the
156email format unchanged besides the commit URL.
157
158Straw Man Migration Plan
159========================
160
161Step #1 : Before The Move
162-------------------------
163
1. Update docs to mention the move, so people are aware of what is going on.
2. Set up a read-only version of the GitHub project, mirroring our current SVN
   repository.
3. Add the required bots to implement the commit emails, as well as the
   umbrella repository update (if the multirepo is selected) or the read-only
   Git views for the sub-projects (if the monorepo is selected).
170
171Step #2 : Git Move
172------------------
173
4. Update the buildbots to pick up updates and commits from the GitHub
   repository. Not all bots have to migrate at this point, but it'll help
   provide infrastructure testing.
5. Update Phabricator to pick up commits from the GitHub repository.
6. LNT and llvmlab have to be updated: they rely on a unique, monotonically
   increasing integer across branches [MatthewsRevNum]_.
7. Instruct downstream integrators to pick up commits from the GitHub
   repository.
8. Review and prepare an update for the LLVM documentation.

Until this point nothing has changed for developers; it will just
boil down to a lot of work for buildbot and other infrastructure
owners.
187
188The migration will pause here until all dependencies have cleared, and all
189problems have been solved.
190
191Step #3: Write Access Move
192--------------------------
193
9. Collect developers' GitHub account information, and add them to the project.
10. Switch the SVN repository to read-only and allow pushes to the GitHub repository.
11. Update the documentation.
12. Mirror Git to SVN.
198
199Step #4 : Post Move
200-------------------
201
13. Archive the SVN repository.
14. Update links on the LLVM website pointing to viewvc/klaus/phab etc. to
    point to GitHub instead.
205
206One or Multiple Repositories?
207=============================
208
209There are two major variants for how to structure our Git repository: The
210"multirepo" and the "monorepo".
211
212Multirepo Variant
213-----------------
214
215This variant recommends moving each LLVM sub-project to a separate Git
216repository. This mimics the existing official read-only Git repositories
217(e.g., http://llvm.org/git/compiler-rt.git), and creates new canonical
218repositories for each sub-project.
219
220This will allow the individual sub-projects to remain distinct: a
221developer interested only in compiler-rt can checkout only this repository,
222build it, and work in isolation of the other sub-projects.
223
224A key need is to be able to check out multiple projects (i.e. lldb+clang+llvm or
225clang+llvm+libcxx for example) at a specific revision.
226
227A tuple of revisions (one entry per repository) accurately describes the state
228across the sub-projects.
229For example, a given version of clang would be
230*<LLVM-12345, clang-5432, libcxx-123, etc.>*.
231
232Umbrella Repository
233^^^^^^^^^^^^^^^^^^^
234
235To make this more convenient, a separate *umbrella* repository will be
236provided. This repository will be used for the sole purpose of understanding
237the sequence in which commits were pushed to the different repositories and to
238provide a single revision number.
239
240This umbrella repository will be read-only and continuously updated
241to record the above tuple. The proposed form to record this is to use Git
242[submodules]_, possibly along with a set of scripts to help check out a
243specific revision of the LLVM distribution.
244
245A regular LLVM developer does not need to interact with the umbrella repository
246-- the individual repositories can be checked out independently -- but you would
247need to use the umbrella repository to bisect multiple sub-projects at the same
248time, or to check-out old revisions of LLVM with another sub-project at a
249consistent state.
250
251This umbrella repository will be updated automatically by a bot (running on
252notice from a webhook on every push, and periodically) on a per commit basis: a
253single commit in the umbrella repository would match a single commit in a
254sub-project.
255
256Living Downstream
257^^^^^^^^^^^^^^^^^
258
259Downstream SVN users can use the read/write SVN bridges with the following
260caveats:
261
262 * Be prepared for a one-time change to the upstream revision numbers.
263 * The upstream sub-project revision numbers will no longer be in sync.
264
265Downstream Git users can continue without any major changes, with the minor
266change of upstreaming using `git push` instead of `git svn dcommit`.
267
268Git users also have the option of adopting an umbrella repository downstream.
269The tooling for the upstream umbrella can easily be reused for downstream needs,
270incorporating extra sub-projects and branching in parallel with sub-project
271branches.
272
273Multirepo Preview
274^^^^^^^^^^^^^^^^^
275
As a preview (disclaimer: this is a rough prototype, not polished and not
representative of the final solution), you can look at the following:
278
279  * Repository: https://github.com/llvm-beanz/llvm-submodules
280  * Update bot: http://beanz-bot.com:8180/jenkins/job/submodule-update/
281
282Concerns
283^^^^^^^^
284
285 * Because GitHub does not allow server-side hooks, and because there is no
286   "push timestamp" in Git, the umbrella repository sequence isn't totally
287   exact: commits from different repositories pushed around the same time can
288   appear in different orders. However, we don't expect it to be the common case
289   or to cause serious issues in practice.
290 * You can't have a single cross-projects commit that would update both LLVM and
291   other sub-projects (something that can be achieved now). It would be possible
292   to establish a protocol whereby users add a special token to their commit
293   messages that causes the umbrella repo's updater bot to group all of them
294   into a single revision.
295 * Another option is to group commits that were pushed closely enough together
296   in the umbrella repository. This has the advantage of allowing cross-project
297   commits, and is less sensitive to mis-ordering commits. However, this has the
298   potential to group unrelated commits together, especially if the bot goes
299   down and needs to catch up.
300 * This variant relies on heavier tooling. But the current prototype shows that
301   it is not out-of-reach.
 * Submodules don't have a good reputation and complicate the command line.
   However, in the proposed setup, a regular developer will seldom interact with
   submodules directly, and certainly never update them.
 * Refactoring across projects is not friendly: taking some functions from clang
   to make them part of a utility in libSupport wouldn't carry the history of the
   code in the llvm repo, preventing recursively applying `git blame` for
   instance. However, this is not very different from how most people are
   interacting with the repository today, by splitting such a change into
   multiple commits.
311
312Workflows
313^^^^^^^^^
314
315 * :ref:`Checkout/Clone a Single Project, without Commit Access <workflow-checkout-commit>`.
316 * :ref:`Checkout/Clone a Single Project, with Commit Access <workflow-multicheckout-nocommit>`.
317 * :ref:`Checkout/Clone Multiple Projects, with Commit Access <workflow-multicheckout-multicommit>`.
318 * :ref:`Commit an API Change in LLVM and Update the Sub-projects <workflow-cross-repo-commit>`.
319 * :ref:`Branching/Stashing/Updating for Local Development or Experiments <workflow-multi-branching>`.
320 * :ref:`Bisecting <workflow-multi-bisecting>`.
321
322Monorepo Variant
323----------------
324
325This variant recommends moving all LLVM sub-projects to a single Git repository,
326similar to https://github.com/llvm-project/llvm-project.
327This would mimic an export of the current SVN repository, with each sub-project
328having its own top-level directory.
329Not all sub-projects are used for building toolchains. In practice, www/
330and test-suite/ will probably stay out of the monorepo.
331
332Putting all sub-projects in a single checkout makes cross-project refactoring
333naturally simple:
334
335 * New sub-projects can be trivially split out for better reuse and/or layering
336   (e.g., to allow libSupport and/or LIT to be used by runtimes without adding a
337   dependency on LLVM).
338 * Changing an API in LLVM and upgrading the sub-projects will always be done in
339   a single commit, designing away a common source of temporary build breakage.
 * Moving code across sub-projects (during refactoring for instance) in a single
   commit enables accurate `git blame` when tracking code change history.
 * Tooling based on `git grep` works natively across sub-projects, making it
   easier to find refactoring opportunities across projects (for example reusing a
   data structure initially in LLDB by moving it into libSupport).
 * Having all the sources present encourages maintaining the other sub-projects
   when changing an API.
347
348Finally, the monorepo maintains the property of the existing SVN repository that
349the sub-projects move synchronously, and a single revision number (or commit
350hash) identifies the state of the development across all projects.
351
352.. _build_single_project:
353
354Building a single sub-project
355^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
356
357Nobody will be forced to build unnecessary projects.  The exact structure
358is TBD, but making it trivial to configure builds for a single sub-project
359(or a subset of sub-projects) is a hard requirement.
360
As an example, it could look like the following::

  mkdir build && cd build
  # Configure only LLVM (default)
  cmake path/to/monorepo
  # Configure LLVM and lld
  cmake path/to/monorepo -DLLVM_ENABLE_PROJECTS=lld
  # Configure LLVM and clang
  cmake path/to/monorepo -DLLVM_ENABLE_PROJECTS=clang
370
371.. _git-svn-mirror:
372
373Read/write sub-project mirrors
374^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
375
376With the Monorepo, the existing single-subproject mirrors (e.g.
377http://llvm.org/git/compiler-rt.git) with git-svn read-write access would
378continue to be maintained: developers would continue to be able to use the
379existing single-subproject git repositories as they do today, with *no changes
380to workflow*. Everything (git fetch, git svn dcommit, etc.) could continue to
work identically to how it works today. The monorepo can be set up such that the
SVN revision number matches the SVN revision in the GitHub SVN-bridge.
383
384Living Downstream
385^^^^^^^^^^^^^^^^^
386
387Downstream SVN users can use the read/write SVN bridge. The SVN revision
388number can be preserved in the monorepo, minimizing the impact.
389
390Downstream Git users can continue without any major changes, by using the
391git-svn mirrors on top of the SVN bridge.
392
393Git users can also work upstream with monorepo even if their downstream
394fork has split repositories.  They can apply patches in the appropriate
395subdirectories of the monorepo using, e.g., `git am --directory=...`, or
396plain `diff` and `patch`.
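
The `git am --directory=...` route can be tried out locally. The following
sketch uses two throwaway repositories: `mono` stands in for the monorepo and
`clang-split` for a downstream split repository. All names, paths, and file
contents are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: carry a commit from a split repository into a monorepo subdirectory.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the monorepo, with clang living in a top-level directory.
git init -q "$work/mono"
cd "$work/mono"
mkdir clang
echo "version 1" > clang/README.txt
git add clang/README.txt
git commit -q -m "Import clang"

# Stand-in for a downstream split clang repository with one local change.
git init -q "$work/clang-split"
cd "$work/clang-split"
echo "version 1" > README.txt
git add README.txt
git commit -q -m "Import"
echo "version 2" > README.txt
git commit -q -a -m "Bump README"

# Export the topmost commit as a mailbox-style patch.
git format-patch -1 --stdout > "$work/change.patch"

# Apply it in the monorepo, prefixing every path with clang/.
cd "$work/mono"
git am -q --directory=clang "$work/change.patch"

result=$(cat clang/README.txt)
subject=$(git log -1 --format=%s)
echo "$subject: $result"
```

The commit message and authorship travel with the patch; only the paths are
rewritten.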
397
398Alternatively, Git users can migrate their own fork to the monorepo.  As a
399demonstration, we've migrated the "CHERI" fork to the monorepo in two ways:
400
401 * Using a script that rewrites history (including merges) so that it looks
402   like the fork always lived in the monorepo [LebarCHERI]_.  The upside of
403   this is when you check out an old revision, you get a copy of all llvm
404   sub-projects at a consistent revision.  (For instance, if it's a clang
405   fork, when you check out an old revision you'll get a consistent version
406   of llvm proper.)  The downside is that this changes the fork's commit
407   hashes.
408
409 * Merging the fork into the monorepo [AminiCHERI]_.  This preserves the
410   fork's commit hashes, but when you check out an old commit you only get
411   the one sub-project.
412
413Monorepo Preview
414^^^^^^^^^^^^^^^^^
415
As a preview (disclaimer: this is a rough prototype, not polished and not
representative of the final solution), you can look at the following:
418
419  * Full Repository: https://github.com/joker-eph/llvm-project
420  * Single sub-project view with *SVN write access* to the full repo:
421    https://github.com/joker-eph/compiler-rt
422
423Concerns
424^^^^^^^^
425
426 * Using the monolithic repository may add overhead for those contributing to a
427   standalone sub-project, particularly on runtimes like libcxx and compiler-rt
428   that don't rely on LLVM; currently, a fresh clone of libcxx is only 15MB (vs.
429   1GB for the monorepo), and the commit rate of LLVM may cause more frequent
430   `git push` collisions when upstreaming. Affected contributors can continue to
431   use the SVN bridge or the single-subproject Git mirrors with git-svn for
432   read-write.
433 * Using the monolithic repository may add overhead for those *integrating* a
434   standalone sub-project, even if they aren't contributing to it, due to the
435   same disk space concern as the point above. The availability of the
436   sub-project Git mirror addresses this, even without SVN access.
437 * Preservation of the existing read/write SVN-based workflows relies on the
438   GitHub SVN bridge, which is an extra dependency.  Maintaining this locks us
439   into GitHub and could restrict future workflow changes.
440
441Workflows
442^^^^^^^^^
443
444 * :ref:`Checkout/Clone a Single Project, without Commit Access <workflow-checkout-commit>`.
445 * :ref:`Checkout/Clone a Single Project, with Commit Access <workflow-monocheckout-nocommit>`.
446 * :ref:`Checkout/Clone Multiple Projects, with Commit Access <workflow-monocheckout-multicommit>`.
447 * :ref:`Commit an API Change in LLVM and Update the Sub-projects <workflow-cross-repo-commit>`.
448 * :ref:`Branching/Stashing/Updating for Local Development or Experiments <workflow-mono-branching>`.
449 * :ref:`Bisecting <workflow-mono-bisecting>`.
450
451Multi/Mono Hybrid Variant
452-------------------------
453
454This variant recommends moving only the LLVM sub-projects that are *rev-locked*
455to LLVM into a monorepo (clang, lld, lldb, ...), following the multirepo
456proposal for the rest.  While neither variant recommends combining sub-projects
457like www/ and test-suite/ (which are completely standalone), this goes further
458and keeps sub-projects like libcxx and compiler-rt in their own distinct
459repositories.
460
461Concerns
462^^^^^^^^
463
464 * This has most disadvantages of multirepo and monorepo, without bringing many
465   of the advantages.
 * Downstream users have to upgrade to the monorepo structure, but only
   partially, so they will still have to keep the infrastructure to integrate
   the other separate sub-projects.
469 * All projects that use LIT for testing are effectively rev-locked to LLVM.
470   Furthermore, some runtimes (like compiler-rt) are rev-locked with Clang.
471   It's not clear where to draw the lines.
472
473
474Workflow Before/After
475=====================
476
477This section goes through a few examples of workflows, intended to illustrate
478how end-users or developers would interact with the repository for
479various use-cases.
480
481.. _workflow-checkout-commit:
482
483Checkout/Clone a Single Project, without Commit Access
484------------------------------------------------------
485
Except for the URL, nothing changes. The possibilities today are::

  svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
  # or with Git
  git clone http://llvm.org/git/llvm.git

After the move to GitHub, you would do either::

  git clone https://github.com/llvm-project/llvm.git
  # or using the GitHub svn native bridge
  svn co https://github.com/llvm-project/llvm/trunk
497
498The above works for both the monorepo and the multirepo, as we'll maintain the
499existing read-only views of the individual sub-projects.
500
501Checkout/Clone a Single Project, with Commit Access
502---------------------------------------------------
503
504Currently
505^^^^^^^^^
506
::

  # direct SVN checkout
  svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm
  # or using the read-only Git view, with git-svn
  git clone http://llvm.org/git/llvm.git
  cd llvm
  git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
  git config svn-remote.svn.fetch :refs/remotes/origin/master
  git svn rebase -l  # -l avoids fetching ahead of the git mirror.
517
518Commits are performed using `svn commit` or with the sequence `git commit` and
519`git svn dcommit`.
520
521.. _workflow-multicheckout-nocommit:
522
523Multirepo Variant
524^^^^^^^^^^^^^^^^^
525
With the multirepo variant, nothing changes but the URL, and commits can be
performed using `svn commit` or `git commit` and `git push`::

  git clone https://github.com/llvm/llvm.git llvm
  # or using the GitHub svn native bridge
  svn co https://github.com/llvm/llvm/trunk/ llvm
532
533.. _workflow-monocheckout-nocommit:
534
535Monorepo Variant
536^^^^^^^^^^^^^^^^
537
With the monorepo variant, there are a few options, depending on your
constraints. First, you could just clone the full repository::

  git clone https://github.com/llvm/llvm-projects.git llvm
  # or using the GitHub svn native bridge
  svn co https://github.com/llvm/llvm-projects/trunk/ llvm
544
545At this point you have every sub-project (llvm, clang, lld, lldb, ...), which
546:ref:`doesn't imply you have to build all of them <build_single_project>`. You
547can still build only compiler-rt for instance. In this way it's not different
548from someone who would check out all the projects with SVN today.
549
550You can commit as normal using `git commit` and `git push` or `svn commit`, and
551read the history for a single project (`git log libcxx` for example).
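
As a small illustration of reading one sub-project's history, the sketch below
builds a throwaway repository with two top-level directories and limits
`git log` to one of them. The helper `change`, the file names, and the commit
messages are assumptions made for the demo.

```shell
#!/bin/sh
# Sketch: path-limited history in a repository holding several sub-projects.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# change is a hypothetical helper: append a line to <file> and commit it.
change() {
  mkdir -p "$(dirname "$1")"
  echo "$2" >> "$1"
  git add "$1"
  git commit -q -m "$2"
}

change llvm/a.txt   "llvm: first"
change libcxx/b.txt "libcxx: first"
change llvm/a.txt   "llvm: second"
change libcxx/b.txt "libcxx: second"

# Only commits that touched the libcxx directory are listed.
libcxx_log=$(git log --format=%s -- libcxx)
printf '%s\n' "$libcxx_log"
```

The `--` separator is optional here, but it keeps the path from being mistaken
for a revision name.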
552
553Secondly, there are a few options to avoid checking out all the sources.
554
555**Using the GitHub SVN bridge**
556
The GitHub SVN native bridge allows you to check out a subdirectory directly::

  svn co https://github.com/llvm/llvm-projects/trunk/compiler-rt compiler-rt --username=...
560
561This checks out only compiler-rt and provides commit access using "svn commit",
562in the same way as it would do today.
563
**Using a Subproject Git Mirror**

You can use *git-svn* and one of the sub-project mirrors::

  # Clone from the single read-only Git repo
  git clone http://llvm.org/git/llvm.git
  cd llvm
  # Configure the SVN remote and initialize the svn metadata
  git svn init https://github.com/joker-eph/llvm-project/trunk/llvm --username=...
  git config svn-remote.svn.fetch :refs/remotes/origin/master
  git svn rebase -l
575
576In this case the repository contains only a single sub-project, and commits can
577be made using `git svn dcommit`, again exactly as we do today.
578
**Using Sparse Checkouts**

You can hide the other directories using a Git sparse checkout::

  git config core.sparseCheckout true
  echo /compiler-rt > .git/info/sparse-checkout
  git read-tree -mu HEAD
586
587The data for all sub-projects is still in your `.git` directory, but in your
588checkout, you only see `compiler-rt`.
589Before you push, you'll need to fetch and rebase (`git pull --rebase`) as
590usual.
591
Note that when you fetch you'll likely pull in changes to sub-projects you don't
care about. If you are using sparse checkout, the files from other projects
won't appear on your disk. The only effect is that your commit hash changes.

You can check whether the changes in the last fetch are relevant to your commit
by running::

  git log origin/master@{1}..origin/master -- libcxx
600
601This command can be hidden in a script so that `git llvmpush` would perform all
602these steps, fail only if such a dependent change exists, and show immediately
603the change that prevented the push. An immediate repeat of the command would
604(almost) certainly result in a successful push.
605Note that today with SVN or git-svn, this step is not possible since the
606"rebase" implicitly happens while committing (unless a conflict occurs).
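
The checking step of such a script might look like the sketch below. It is an
assumption-laden illustration, not the proposed `git llvmpush`: it records the
old tip of `origin/master` explicitly instead of relying on the `@{1}` reflog
entry, and it uses local throwaway repositories (`upstream.git`, `alice`,
`bob`) in place of GitHub. The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: after a fetch, list new upstream commits touching one sub-project.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Local stand-in for the canonical repository.
git init -q --bare "$work/upstream.git"
git --git-dir="$work/upstream.git" symbolic-ref HEAD refs/heads/master

# "alice" plays the other committers.
git init -q "$work/alice"
git -C "$work/alice" symbolic-ref HEAD refs/heads/master
git -C "$work/alice" remote add origin "$work/upstream.git"
mkdir -p "$work/alice/llvm" "$work/alice/libcxx"

# alice_commit is a hypothetical helper: change <file>, commit, and push.
alice_commit() {
  echo "$2" >> "$work/alice/$1"
  git -C "$work/alice" add "$1"
  git -C "$work/alice" commit -q -m "$2"
  git -C "$work/alice" push -q origin master
}

alice_commit llvm/f "Initial import"

# "bob" is the developer who only cares about libcxx.
git clone -q "$work/upstream.git" "$work/bob"

# Hypothetical helper: fetch, then print subjects of new commits touching <path>.
upstream_changes_touching() {
  old=$(git -C "$work/bob" rev-parse origin/master)
  git -C "$work/bob" fetch -q origin
  git -C "$work/bob" log --format=%s "$old..origin/master" -- "$1"
}

alice_commit llvm/f "Change llvm"
first=$(upstream_changes_touching libcxx)    # nothing relevant arrived

alice_commit libcxx/g "Fix libcxx"
second=$(upstream_changes_touching libcxx)   # a relevant commit arrived

echo "first fetch: '$first'; second fetch: '$second'"
```

A wrapper could rebase and push straight away when the list is empty, and stop
to show the listed commits otherwise.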
607
608Checkout/Clone Multiple Projects, with Commit Access
609----------------------------------------------------
610
Let's look at how to assemble llvm+clang+libcxx at a given revision.

Currently
^^^^^^^^^

::

  svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm -r $REVISION
  cd llvm/tools
  svn co http://llvm.org/svn/llvm-project/clang/trunk clang -r $REVISION
  cd ../projects
  svn co http://llvm.org/svn/llvm-project/libcxx/trunk libcxx -r $REVISION

Or using git-svn::

  git clone http://llvm.org/git/llvm.git
  cd llvm/
  git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
  git config svn-remote.svn.fetch :refs/remotes/origin/master
  git svn rebase -l
  git checkout `git svn find-rev -B r258109`
  cd tools
  git clone http://llvm.org/git/clang.git
  cd clang/
  git svn init https://llvm.org/svn/llvm-project/clang/trunk --username=<username>
  git config svn-remote.svn.fetch :refs/remotes/origin/master
  git svn rebase -l
  git checkout `git svn find-rev -B r258109`
  cd ../../projects/
  git clone http://llvm.org/git/libcxx.git
  cd libcxx
  git svn init https://llvm.org/svn/llvm-project/libcxx/trunk --username=<username>
  git config svn-remote.svn.fetch :refs/remotes/origin/master
  git svn rebase -l
  git checkout `git svn find-rev -B r258109`
646
647Note that the list would be longer with more sub-projects.
648
649.. _workflow-multicheckout-multicommit:
650
651Multirepo Variant
652^^^^^^^^^^^^^^^^^
653
With the multirepo variant, the umbrella repository will be used. This is
where the mapping from a single revision number to the individual repositories'
revisions is stored::

  git clone https://github.com/llvm-beanz/llvm-submodules
  cd llvm-submodules
  git checkout $REVISION
  git submodule init
  git submodule update clang llvm libcxx
  # the list of sub-projects is optional, `git submodule update` would get them all.
664
665At this point the clang, llvm, and libcxx individual repositories are cloned
666and stored alongside each other. There are CMake flags to describe the directory
667structure; alternatively, you can just symlink `clang` to `llvm/tools/clang`,
668etc.
669
Another option is to check out repositories based on the commit timestamp::

  git checkout `git rev-list -n 1 --before="2009-07-27 13:37" master`
673
674.. _workflow-monocheckout-multicommit:
675
676Monorepo Variant
677^^^^^^^^^^^^^^^^
678
The repository natively contains the source for every sub-project at the right
revision, which makes this straightforward::

  git clone https://github.com/llvm/llvm-projects.git llvm-projects
  cd llvm-projects
  git checkout $REVISION
685
686As before, at this point clang, llvm, and libcxx are stored in directories
687alongside each other.
688
689.. _workflow-cross-repo-commit:
690
691Commit an API Change in LLVM and Update the Sub-projects
692--------------------------------------------------------
693
Today this is possible, though not common (or at least not documented), for
Subversion users and for git-svn users. For example, few Git users try to update
LLD or Clang in the same commit as they change an LLVM API.
697
698The multirepo variant does not address this: one would have to commit and push
699separately in every individual repository. It would be possible to establish a
700protocol whereby users add a special token to their commit messages that causes
701the umbrella repo's updater bot to group all of them into a single revision.

The monorepo variant handles this natively.

Branching/Stashing/Updating for Local Development or Experiments
----------------------------------------------------------------

Currently
^^^^^^^^^

SVN does not support this use case, but developers who currently use git-svn
can do it. Let's look at what this means in practice when dealing with multiple
sub-projects.

To update the repository to tip of trunk::

  git pull
  cd tools/clang
  git pull
  cd ../../projects/libcxx
  git pull

To create a new branch::

  git checkout -b MyBranch
  cd tools/clang
  git checkout -b MyBranch
  cd ../../projects/libcxx
  git checkout -b MyBranch

To switch branches::

  git checkout AnotherBranch
  cd tools/clang
  git checkout AnotherBranch
  cd ../../projects/libcxx
  git checkout AnotherBranch
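The repetition above can be wrapped in a small helper. This is only a sketch:
the function name `in_each_checkout` is hypothetical, and it assumes directory
names without spaces and the checkout layout used above (clang in
`tools/clang`, libcxx in `projects/libcxx`):

```shell
#!/bin/sh
# Run the same command in each listed checkout, stopping at the first failure.
# Usage: in_each_checkout "dir1 dir2 ..." command [args...]
in_each_checkout() {
  dirs=$1
  shift
  for dir in $dirs; do
    # A subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && "$@" ) || return 1
  done
}
```

From the top of the llvm checkout,
`in_each_checkout ". tools/clang projects/libcxx" git checkout -b MyBranch`
would then create the branch in all three repositories.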

.. _workflow-multi-branching:

Multirepo Variant
^^^^^^^^^^^^^^^^^

The multirepo variant works the same way as the current Git workflow: every
command needs to be applied to each of the individual repositories.
However, the umbrella repository makes this easy: `git submodule foreach`
replicates a command across all the individual repositories (or submodules
in this case).

To create a new branch::

  git submodule foreach git checkout -b MyBranch

To switch branches::

  git submodule foreach git checkout AnotherBranch

.. _workflow-mono-branching:

Monorepo Variant
^^^^^^^^^^^^^^^^

Regular Git commands are sufficient, because everything is in a single
repository:

To update the repository to tip of trunk::

  git pull

To create a new branch::

  git checkout -b MyBranch

To switch branches::

  git checkout AnotherBranch

Bisecting
---------

Assume a developer is looking for a bug in clang (or lld, or lldb, ...).

Currently
^^^^^^^^^

SVN does not have built-in bisection support, but the single revision number
shared across sub-projects makes it possible to script around this.

Using the existing Git read-only view of the repositories, it is possible to use
the native Git bisection script over the llvm repository, and use some scripting
to synchronize the clang repository to match the llvm revision.
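The synchronization scripting is not spelled out here. One possible starting
point, assuming the Git mirrors were produced by git-svn and therefore record
the SVN revision in a `git-svn-id:` line of each commit message, is a helper
that extracts that revision number. The name `svn_revision` is hypothetical:

```shell
#!/bin/sh
# Print the SVN revision recorded by git-svn in the commit message read from
# stdin. The expected line looks like:
#   git-svn-id: <repository-url>@<revision> <repository-uuid>
# Prints nothing when the message has no such line.
svn_revision() {
  sed -n 's/^[[:space:]]*git-svn-id:[[:space:]]*[^@[:space:]]*@\([0-9]\{1,\}\)[[:space:]].*$/\1/p' |
    head -n 1
}
```

A bisection script could run `git log -1 --format=%B | svn_revision` in the
llvm checkout, then look in the clang checkout for the newest commit whose
revision is not greater than that number.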

.. _workflow-multi-bisecting:

Multirepo Variant
^^^^^^^^^^^^^^^^^

With the multirepo variant, cross-repository synchronization is achieved
using the umbrella repository. This repository contains only
submodules for the other sub-projects. The native Git bisection can be used on
the umbrella repository directly. A subtlety is that the bisect script itself
needs to make sure the submodules are updated accordingly.

For example, to find which commit introduces a regression where clang-3.9
crashes but clang-3.8 passes, one should be able to simply do::

  git bisect start release_39 release_38
  git bisect run ./bisect_script.sh

With the `bisect_script.sh` script being::

  #!/bin/sh
  cd "$UMBRELLA_DIRECTORY"
  git submodule update llvm clang libcxx #....
  cd "$BUILD_DIR"

  ninja clang || exit 125   # an exit code of 125 asks "git bisect"
                            # to "skip" the current commit

  ./bin/clang some_crash_test.cpp

When the `git bisect run` command returns, the umbrella repository is set to
the state where the regression was introduced. The commit diff in the umbrella
indicates which submodule was updated, and the last commit in that sub-project
is the one that the bisect found.
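One detail worth keeping in mind: `git bisect run` treats exit status 0 as
good, 1 to 127 (except 125) as bad, and aborts the bisection on any other
status. If the test command were killed by a signal, the shell would report a
status above 128 and the bisection would stop. A minimal sketch of a wrapper
that folds every failure into status 1 follows; the name `good_or_bad` is
hypothetical:

```shell
#!/bin/sh
# Run a command and reduce its result to what "git bisect run" expects:
#   returns 0 (good) when the command succeeds,
#   returns 1 (bad) for any failure, including death by signal.
good_or_bad() {
  if "$@"; then
    return 0
  fi
  return 1
}
```

With this helper defined in the script, the last line could become
`good_or_bad ./bin/clang some_crash_test.cpp`.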

.. _workflow-mono-bisecting:

Monorepo Variant
^^^^^^^^^^^^^^^^

Bisecting on the monorepo is straightforward, and very similar to the above,
except that the bisection script does not need to include the
`git submodule update` step.

The same example, finding which commit introduces a regression where clang-3.9
crashes but clang-3.8 passes, will look like::

  git bisect start release_39 release_38
  git bisect run ./bisect_script.sh

With the `bisect_script.sh` script being::

  #!/bin/sh
  cd "$BUILD_DIR"

  ninja clang || exit 125   # an exit code of 125 asks "git bisect"
                            # to "skip" the current commit

  ./bin/clang some_crash_test.cpp

Also, since the monorepo allows a single commit to update multiple projects,
you're less likely to encounter a build failure where one commit changes an API
in LLVM and a later one "fixes" the build in clang.


References
==========

.. [LattnerRevNum] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041739.html
.. [TrickRevNum] Andrew Trick, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041721.html
.. [JSonnRevNum] Joerg Sonnenberger, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041688.html
.. [TorvaldRevNum] Linus Torvalds, http://git.661346.n2.nabble.com/Git-commit-generation-numbers-td6584414.html
.. [MatthewsRevNum] Chris Matthews, http://lists.llvm.org/pipermail/cfe-dev/2016-July/049886.html
.. [submodules] Git submodules, https://git-scm.com/book/en/v2/Git-Tools-Submodules
.. [statuschecks] GitHub status-checks, https://help.github.com/articles/about-required-status-checks/
.. [LebarCHERI] Port *CHERI* to a single repository rewriting history, http://lists.llvm.org/pipermail/llvm-dev/2016-July/102787.html
.. [AminiCHERI] Port *CHERI* to a single repository preserving history, http://lists.llvm.org/pipermail/llvm-dev/2016-July/102804.html