Welcome to LLVM! In order to get started, you first need to know some basic information.
First, LLVM comes in three pieces. The first piece is the LLVM suite. This contains all of the tools, libraries, and header files needed to use the low level virtual machine. It contains an assembler, disassembler, bitcode analyzer and bitcode optimizer. It also contains basic regression tests that can be used to test the LLVM tools and the GCC front end.
The second piece is the GCC front end. This component provides a version of GCC that compiles C and C++ code into LLVM bitcode. Currently, the GCC front end uses the GCC parser to convert code to LLVM. Once compiled into LLVM bitcode, a program can be manipulated with the LLVM tools from the LLVM suite.
There is a third, optional piece called Test Suite. It is a suite of programs with a testing harness that can be used to further test LLVM's functionality and performance.
Here's the short story for getting up and running quickly with LLVM:
Specify for directory the full pathname of where you want the LLVM tools and libraries to be installed (default /usr/local).
Optionally, specify for directory the full pathname of the C/C++ front end installation to use with this LLVM configuration. If not specified, the PATH will be searched. This is only needed if you want to run test-suite or do some special kinds of LLVM builds.
Enable the SPEC2000 benchmarks for testing. The SPEC2000 benchmarks should be available in directory.
Consult the Getting Started with LLVM section for detailed information on configuring and compiling LLVM. See Setting Up Your Environment for tips that simplify working with the GCC front end and LLVM tools. Go to Program Layout to learn about the layout of the source code tree.
Before you begin to use the LLVM system, review the requirements given below. Knowing ahead of time what hardware and software you will need may save you some trouble.
LLVM is known to work on the following platforms:
OS | Arch | Compilers |
---|---|---|
AuroraUX | x86 [1] | GCC |
Linux | x86 [1] | GCC |
Linux | amd64 | GCC |
Solaris | V9 (Ultrasparc) | GCC |
FreeBSD | x86 [1] | GCC |
FreeBSD | amd64 | GCC |
MacOS X [2] | PowerPC | GCC |
MacOS X [2, 9] | x86 | GCC |
Cygwin/Win32 | x86 [1, 8, 11] | GCC 3.4.X, binutils 2.20 |
MinGW/Win32 | x86 [1, 6, 8, 10, 11] | GCC 3.4.X, binutils 2.20 |
LLVM has partial support for the following platforms:
OS | Arch | Compilers |
---|---|---|
Windows | x86 [1] | Visual Studio 2005 SP1 or higher [4, 5] |
AIX [3, 4] | PowerPC | GCC |
Linux [3, 5] | PowerPC | GCC |
Linux [7] | Alpha | GCC |
Linux [7] | Itanium (IA-64) | GCC |
HP-UX [7] | Itanium (IA-64) | HP aCC |
Windows x64 | x86-64 | mingw-w64's GCC-4.5.x [12] |
Notes:
Note that you will need about 1-3 GB of space for a full LLVM build in Debug mode, depending on the system (it is so large because of all the debugging information and the fact that the libraries are statically linked into multiple tools). If you do not need many of the tools and you are space-conscious, you can pass ONLY_TOOLS="tools you need" to make. The Release build requires considerably less space.
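As a rough illustration of the space requirement above, the following sketch checks the free space in a build directory before starting a Debug build. The function names `free_gb` and `has_room_for_build` and the 3 GB threshold are assumptions made for this example; they are not part of the LLVM build system.

```shell
# Print the free space, in whole gigabytes, on the filesystem holding directory $1.
free_gb() {
  # df -P gives POSIX-format output and -k reports 1024-byte blocks;
  # on the second line, column 4 is the available space.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed only if directory $1 has at least $2 GB free.
has_room_for_build() {
  [ "$(free_gb "$1")" -ge "$2" ]
}

# Hypothetical usage: ask for 3 GB in the current directory (upper end of the 1-3 GB estimate).
if has_room_for_build . 3; then
  echo "enough space for a Debug build"
else
  echo "low on space: consider a Release build or ONLY_TOOLS"
fi
```

If the check fails, a Release build or a restricted ONLY_TOOLS list, as described above, reduces the footprint.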
The LLVM suite may compile on other platforms, but it is not guaranteed to do so. If compilation is successful, the LLVM utilities should be able to assemble, disassemble, analyze, and optimize LLVM bitcode. Code generation should work as well, although the generated native code may not work on your platform.
The GCC front end is not very portable at the moment. If you want to get it to work on another platform, you can download a copy of the source and try to compile it on your platform.
Compiling LLVM requires that you have several software packages installed. The table below lists those required packages. The Package column is the usual name for the software package that LLVM depends on. The Version column provides "known to work" versions of the package. The Notes column describes how LLVM uses the package and provides other details.
Package | Version | Notes |
---|---|---|
GNU Make | 3.79, 3.79.1 | Makefile/build processor |
GCC | 3.4.2 | C/C++ compiler [1] |
TeXinfo | 4.5 | For building the CFE |
SVN | ≥1.3 | Subversion access to LLVM [2] |
DejaGnu | 1.4.2 | Automated test suite [3] |
tcl | 8.3, 8.4 | Automated test suite [3] |
expect | 5.38.0 | Automated test suite [3] |
perl | ≥5.6.0 | Nightly tester, utilities |
GNU M4 | 1.4 | Macro processor for configuration [4] |
GNU Autoconf | 2.61 | Configuration script builder [4] |
GNU Automake | 1.10 | aclocal macro generator [4] |
libtool | 1.5.22 | Shared library manager [4] |
Notes:
Additionally, your compilation host is expected to have the usual plethora of Unix utilities. Specifically:
LLVM is very demanding of the host C++ compiler, and as such tends to expose bugs in the compiler. In particular, several versions of GCC crash when trying to compile LLVM. We routinely use GCC 3.3.3, 3.4.0, and Apple 4.0.1 successfully (however, see the important notes below). Other versions of GCC will probably work as well. The GCC versions listed below are known not to work. If you are using one of these versions, please try to upgrade your GCC to something more recent. If you run into a problem with a version of GCC not listed here, please let us know. Use the "gcc -v" command to find out which version of GCC you are using.
GCC versions prior to 3.0: GCC 2.96.x and before had several problems in the STL that effectively prevent it from compiling LLVM.
GCC 3.2.2 and 3.2.3: These versions of GCC fail to compile LLVM with a bogus template error. This was fixed in later GCCs.
GCC 3.3.2: This version of GCC suffers from a serious bug which causes it to crash in the "convert_from_eh_region_ranges_1" GCC function.
Cygwin GCC 3.3.3: The version of GCC 3.3.3 commonly shipped with Cygwin does not work. Please upgrade to a newer version if possible.
SuSE GCC 3.3.3: The version of GCC 3.3.3 shipped with SuSE 9.1 (and possibly others) does not compile LLVM correctly (it appears that exception handling is broken in some cases). Please download the FSF 3.3.3 or upgrade to a newer version of GCC.
GCC 3.4.0 on linux/x86 (32-bit): GCC miscompiles portions of the code generator, causing an infinite loop in the llvm-gcc build when built with optimizations enabled (i.e. a release build).
GCC 3.4.2 on linux/x86 (32-bit): GCC miscompiles portions of the code generator at -O3, as with 3.4.0. However, GCC 3.4.2 (unlike 3.4.0) correctly compiles LLVM at -O2. A workaround is to build release LLVM builds with "make ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O2 ..."
GCC 3.4.x on X86-64/amd64: GCC miscompiles portions of LLVM.
GCC 3.4.4 (CodeSourcery ARM 2005q3-2): This compiler miscompiles LLVM when building with optimizations enabled. It appears to work with "make ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O1"; alternatively, build a debug build.
IA-64 GCC 4.0.0: The IA-64 version of GCC 4.0.0 is known to miscompile LLVM.
Apple Xcode 2.3: GCC crashes when compiling LLVM at -O3 (which is the default with ENABLE_OPTIMIZED=1). To work around this, build with "ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O2".
GCC 4.1.1: GCC fails to build LLVM with template concept check errors compiling some files. At the time of this writing, GCC mainline (4.2) did not share the problem.
GCC 4.1.1 on X86-64/amd64: GCC miscompiles portions of LLVM when compiling llvm itself into 64-bit code. LLVM will appear to mostly work but will be buggy, e.g. failing portions of its testsuite.
GCC 4.1.2 on OpenSUSE: Segfaults during the libstdc++ build, and on x86_64 platforms compiling md5.c produces a mangled constant.
GCC 4.1.2 (20061115 (prerelease) (Debian 4.1.1-21)) on Debian: Appears to miscompile parts of LLVM 2.4. One symptom is ValueSymbolTable complaining about symbols remaining in the table on destruction.
GCC 4.1.2 20071124 (Red Hat 4.1.2-42): Suffers from the same symptoms as the previous one. It appears to work with ENABLE_OPTIMIZED=0 (the default).
Cygwin GCC 4.3.2 20080827 (beta) 2: Users reported various problems related to link errors when using this GCC version.
Debian GCC 4.3.2 on X86: Crashes building some files in LLVM 2.6.
GCC 4.3.3 (Debian 4.3.3-10) on ARM: Miscompiles parts of LLVM 2.6 when optimizations are turned on. The symptom is an infinite loop in FoldingSetImpl::RemoveNode while running the code generator.
GCC 4.3.5 and GCC 4.4.5 on ARM: These can miscompile value >> 1 even at -O0. A test failure in test/Assembler/alignstack.ll is one symptom of the problem.
GNU ld 2.16.X: Some 2.16.X versions of the ld linker will produce very long warning messages complaining that some ".gnu.linkonce.t.*" symbol was defined in a discarded section. You can safely ignore these messages as they are erroneous and the linkage is correct. These messages disappear with ld 2.17.
GNU binutils 2.17: Binutils 2.17 contains a bug which causes huge link times (minutes instead of seconds) when building LLVM. We recommend upgrading to a newer version (2.17.50.0.4 or later).
GNU Binutils 2.19.1 Gold: This version of Gold contained a bug which causes intermittent failures when building LLVM with position independent code. The symptom is an error about cyclic dependencies. We recommend upgrading to a newer version of Gold.
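The list above can be checked mechanically. The sketch below covers only the entries that are broken regardless of platform (anything before 3.0, 3.2.2, 3.2.3, 3.3.2, and 4.1.1); the platform-specific and distribution-specific entries still need a manual look. The function name `gcc_known_bad` is an assumption made for this example.

```shell
# Succeed if version string $1 is one of the platform-independent
# "known not to work" GCC versions from the list above.
gcc_known_bad() {
  case "$1" in
    [12].*|3.2.2|3.2.3|3.3.2|4.1.1) return 0 ;;
    *) return 1 ;;
  esac
}

# gcc -dumpversion prints only the version number; fall back to "unknown"
# when no gcc is installed.
ver=$(gcc -dumpversion 2>/dev/null || echo unknown)
if gcc_known_bad "$ver"; then
  echo "GCC $ver is known not to compile LLVM; please upgrade"
else
  echo "GCC $ver is not in the platform-independent part of the list"
fi
```

A negative result here does not mean the compiler is safe: several entries above depend on the target architecture, the optimization level, or the distribution's packaging.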
The remainder of this guide is meant to get you up and running with LLVM and to give you some basic information about the LLVM environment.
The later sections of this guide describe the general layout of the LLVM source tree, a simple example using the LLVM tool chain, and links to find more information about LLVM or to get help via e-mail.
Throughout this manual, the following names are used to denote paths specific to the local system and working environment. These are not environment variables you need to set but just strings used in the rest of this document below. In any of the examples below, simply replace each of these names with the appropriate pathname on your local system. All these paths are absolute:
For the pre-built GCC front end binaries, the LLVMGCCDIR is llvm-gcc/platform/llvm-gcc.
In order to compile and use LLVM, you may need to set some environment variables.
If you have the LLVM distribution, you will need to unpack it before you can begin to compile it. LLVM is distributed as a set of two files: the LLVM suite and the LLVM GCC front end compiled for your platform. There is an additional test suite that is optional. Each file is a TAR archive that is compressed with the gzip program.
The files are as follows, with x.y marking the version number:
If you have access to our Subversion repository, you can get a fresh copy of the entire source code. All you need to do is check it out from Subversion as follows:
This will create an 'llvm' directory in the current directory and fully populate it with the LLVM source code, Makefiles, test directories, and local copies of documentation files.
If you want to get a specific release (as opposed to the most recent revision), you can check it out from the 'tags' directory (instead of 'trunk'). The following releases are located in the following subdirectories of the 'tags' directory:
If you would like to get the LLVM test suite (a separate package as of 1.4), you can get it from the Subversion repository:
% cd llvm/projects
% svn co http://llvm.org/svn/llvm-project/test-suite/trunk test-suite
By placing it in llvm/projects, it will be automatically configured by the LLVM configure script as well as automatically updated when you run svn update.
If you would like to get the GCC front end source code, you can also get it and build it yourself. Please follow these instructions to successfully get and build the LLVM GCC front-end.
Git mirrors are available for a number of LLVM subprojects. These mirrors sync automatically with each Subversion commit and contain all necessary git-svn marks (so you can recreate git-svn metadata locally). Note that right now the mirrors reflect only trunk for each project. You can do a read-only Git clone of LLVM via:
git clone http://llvm.org/git/llvm.git
If you want to check out clang too, run:
git clone http://llvm.org/git/llvm.git
cd llvm/tools
git clone http://llvm.org/git/clang.git
Since the upstream repository is in Subversion, you should use "git pull --rebase" instead of "git pull" to avoid generating a non-linear history in your clone. To configure "git pull" to pass --rebase by default on the master branch, run the following command:
git config branch.master.rebase true
Please read Developer Policy, too.
Assume master points to the upstream and mybranch points to your working branch, and that mybranch is rebased onto master. First, you may want to check for whitespace errors:
git diff --check master..mybranch
The easiest way to generate a patch is as below:
git diff master..mybranch > /path/to/mybranch.diff
This differs slightly from an svn-generated diff: a diff produced by git diff prefixes paths with a/ and b/. This is not a problem; most developers know that such a patch can be applied with patch -p1 -N.
Alternatively, you can generate a patch set with git-format-patch, which produces one patch per commit. To generate patch files to attach to your message:
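To make the role of -p1 concrete: patch -pN strips N leading path components from the file names in the diff headers, so -p1 removes the a/ or b/ prefix that git diff adds. The sketch below only mimics that stripping with shell parameter expansion; it does not invoke patch, and the function name `strip_p1` and the example path are illustrative.

```shell
# Mimic what "patch -p1" does to a file name from a diff header:
# drop everything up to and including the first slash.
strip_p1() {
  printf '%s\n' "${1#*/}"
}

strip_p1 a/lib/VMCore/Value.cpp   # prints lib/VMCore/Value.cpp
strip_p1 b/lib/VMCore/Value.cpp   # prints lib/VMCore/Value.cpp
```

Both sides of the diff resolve to the same path relative to the top of the source tree, which is why the patch should be applied from that directory.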
git format-patch --no-attach master..mybranch -o /path/to/your/patchset
If you would like to send patches directly, you may use git-send-email or git-imap-send. Here is an example that places the patch set in Gmail's [Drafts] folder:
git format-patch --attach master..mybranch --stdout | git imap-send
Your .git/config should then contain an [imap] section:
[imap]
        host = imaps://imap.gmail.com
        user = your.gmail.account@gmail.com
        pass = himitsu!
        port = 993
        sslverify = false
; in English
        folder = "[Gmail]/Drafts"
; example for Japanese, "Modified UTF-7" encoded.
        folder = "[Gmail]/&Tgtm+DBN-"
To set up a clone from which you can submit code using git-svn, run:
git clone http://llvm.org/git/llvm.git
cd llvm
git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
git config svn-remote.svn.fetch :refs/remotes/origin/master
git svn rebase -l  # -l avoids fetching ahead of the git mirror.

# If you have clang too:
cd tools
git clone http://llvm.org/git/clang.git
cd clang
git svn init https://llvm.org/svn/llvm-project/cfe/trunk --username=<username>
git config svn-remote.svn.fetch :refs/remotes/origin/master
git svn rebase -l
To update this clone without generating git-svn tags that conflict with the upstream git repo, run:
git fetch && (cd tools/clang && git fetch)  # Get matching revisions of both trees.
git checkout master
git svn rebase -l
(cd tools/clang && git checkout master && git svn rebase -l)
This leaves your working directories on their master branches, so you'll need to checkout each working branch individually and rebase it on top of its parent branch. (Note: This script is intended for relative newbies to git. If you have more experience, you can likely improve on it.)
The git-svn metadata can get out of sync after you mess around with branches and dcommit. When that happens, git svn dcommit stops working, complaining about files with uncommitted changes. The fix is to rebuild the metadata:
rm -rf .git/svn
git svn rebase -l
Before configuring and compiling the LLVM suite (or if you want to use just the LLVM GCC front end) you can optionally extract the front end from the binary distribution. It is used for running the LLVM test-suite and for compiling C/C++ programs. Note that you can optionally build llvm-gcc yourself after building the main LLVM repository.
To install the GCC front end, do the following (on Windows, use an archival tool like 7-zip that understands gzipped tars):
Once the binary is uncompressed, if you're using a *nix-based system, add a symlink for llvm-gcc and llvm-g++ to some directory in your path. If you're using a Windows-based system, add the bin subdirectory of your front end installation directory to your PATH environment variable. For example, if you uncompressed the binary to c:\llvm-gcc, add c:\llvm-gcc\bin to your PATH.
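On a *nix system, a quick way to confirm that the front end directory is visible is to test whether it appears in PATH. The following is a minimal sketch; the function name `dir_in_path` and the directory /opt/llvm-gcc/bin are placeholders chosen for this example, not locations used by the LLVM distribution.

```shell
# Succeed if directory $1 is one of the colon-separated entries of PATH.
dir_in_path() {
  # Wrapping both sides in colons lets the first and last entries match too.
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical install location; substitute your own front end directory.
if dir_in_path /opt/llvm-gcc/bin; then
  echo "front end directory is in PATH"
else
  echo "front end directory is not in PATH"
fi
```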
If you now want to build LLVM from source, when you configure LLVM, it will automatically detect llvm-gcc's presence (if it is in your path) enabling its use in test-suite. Note that you can always build or install llvm-gcc at any point after building the main LLVM repository: just reconfigure llvm and test-suite will pick it up.
As a convenience for Windows users, the front end binaries for MinGW/x86 include versions of the required w32api and mingw-runtime binaries. The last remaining step for Windows users is to uncompress the binary binutils package from MinGW into your front end installation directory. While the front end installation steps are not quite the same as a typical manual MinGW installation, they should be familiar enough to those who have previously installed MinGW on Windows systems.
To install binutils on Windows:
The binary versions of the LLVM GCC front end may not suit all of your needs. For example, the binary distribution may include an old version of a system header file, not "fix" a header file that needs to be fixed for GCC, or it may be linked with libraries not available on your system. In cases like these, you may want to try building the GCC front end from source. Thankfully, this is much easier now than it was in the past.
We also do not currently support updating of the GCC front end by manually overlaying newer versions of the w32api and mingw-runtime binary packages that may become available from MinGW. At this time, it's best to think of the MinGW LLVM GCC front end binary as a self-contained convenience package that requires Windows users to simply download and uncompress the GNU Binutils binary package from the MinGW project.
Regardless of your platform, if you discover that installing the LLVM GCC front end binaries is not as easy as previously described, or you would like to suggest improvements, please let us know how you would like to see things improved by dropping us a note on our mailing list.
Once checked out from the Subversion repository, the LLVM suite source code must be configured via the configure script. This script sets variables in the various *.in files, most notably llvm/Makefile.config and llvm/include/Config/config.h. It also populates OBJ_ROOT with the Makefiles needed to begin building LLVM.
The following environment variables are used by the configure script to configure the build system:
Variable | Purpose |
---|---|
CC | Tells configure which C compiler to use. By default, configure will look for the first GCC C compiler in PATH. Use this variable to override configure's default behavior. |
CXX | Tells configure which C++ compiler to use. By default, configure will look for the first GCC C++ compiler in PATH. Use this variable to override configure's default behavior. |
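The override behavior described in the table can be pictured with a small sketch. This only imitates the rule stated above (use CC when it is set, otherwise take the first gcc found in PATH); it is not the logic of the real configure script, and the function name `resolve_cc` is an assumption made for this example.

```shell
# Print the C compiler that would be chosen under the rule in the table:
# the value of CC if it is set and non-empty, otherwise the first gcc in PATH.
resolve_cc() {
  if [ -n "${CC:-}" ]; then
    printf '%s\n' "$CC"
  else
    # command -v prints the full path of the first match in PATH.
    command -v gcc || printf 'gcc (not found in PATH)\n'
  fi
}

resolve_cc
```

Running it as `CC=/path/to/other/gcc resolve_cc` prints the override instead of the PATH lookup, which mirrors how setting CC on the configure command line changes the selected compiler.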
The following options can be used to set or enable LLVM specific options:
To configure LLVM, follow these steps:
Change directory into the object root directory:
% cd OBJ_ROOT
Run the configure script located in the LLVM source tree:
% SRC_ROOT/configure --prefix=/install/path [other options]
Once you have configured LLVM, you can build it. There are three types of builds:
Once you have LLVM configured, you can build it by entering the OBJ_ROOT directory and issuing the following command:
% gmake
If the build fails, please check here to see if you are using a version of GCC that is known not to compile LLVM.
If you have multiple processors in your machine, you may wish to use some of the parallel build options provided by GNU Make. For example, you could use the command:
% gmake -j2
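Rather than hard-coding the job count, it can be derived from the number of online processors. The sketch below uses `getconf _NPROCESSORS_ONLN`, which is available on Linux and Mac OS X but is not guaranteed everywhere, so it falls back to 2 (the value used in the example above). The function name `job_count` is an assumption made for this example.

```shell
# Print a job count for "gmake -j": the number of online processors,
# or 2 when that number cannot be determined.
job_count() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=2
  # Guard against empty or non-numeric output.
  case "$n" in
    ''|*[!0-9]*) n=2 ;;
  esac
  echo "$n"
}

# Show the command that would be run; remove the echo to start the build.
echo "gmake -j$(job_count)"
```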
There are several special targets which are useful when working with the LLVM source code:
Please see the Makefile Guide for further details on these make targets and descriptions of other targets available.
It is also possible to override default values from configure by declaring variables on the command line. The following are some examples:
Every directory in the LLVM object tree includes a Makefile to build it and any subdirectories that it contains. Entering any directory inside the LLVM object tree and typing gmake should rebuild anything in or below that directory that is out of date.
It is possible to cross-compile LLVM itself. That is, you can create LLVM executables and libraries to be hosted on a platform different from the platform where they are built (a Canadian Cross build). To configure a cross-compile, supply the configure script with --build and --host options that are different. The values of these options must be legal target triples that your GCC compiler supports.
The result of such a build is executables that are not runnable on the build host (--build option) but can be executed on the compile host (--host option).
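The requirement that --build and --host differ can be enforced before configure is ever run. The following sketch assembles the two options and rejects identical triples. The function name `cross_configure_args` and the triples shown are illustrative only; use triples that your GCC actually supports.

```shell
# Print the --build/--host options for a cross-compile, or fail when the
# two triples are identical (which would not be a cross-compile).
cross_configure_args() {
  build=$1
  host=$2
  if [ "$build" = "$host" ]; then
    echo "error: --build and --host must differ for a cross-compile" >&2
    return 1
  fi
  printf '%s\n' "--build=$build --host=$host"
}

# Illustrative triples: built on 32-bit x86 Linux, hosted on ARM Linux.
cross_configure_args i686-pc-linux-gnu arm-unknown-linux-gnueabi
```

The printed options would be appended to the SRC_ROOT/configure command line.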
The LLVM build system is capable of sharing a single LLVM source tree among several LLVM builds. Hence, it is possible to build LLVM for several different platforms or configurations using the same source tree.
This is accomplished in the typical autoconf manner:
Change directory to where the LLVM object files should live:
% cd OBJ_ROOT
Run the configure script found in the LLVM source directory:
% SRC_ROOT/configure
The LLVM build will place files underneath OBJ_ROOT in directories named after the build type:
If you're running on a Linux system that supports the "binfmt_misc" module, and you have root access on the system, you can set your system up to execute LLVM bitcode files directly. To do this, use commands like this (the first command may not be required if you are already using the module):
$ mount -t binfmt_misc none /proc/sys/fs/binfmt_misc
$ echo ':llvm:M::BC::/path/to/lli:' > /proc/sys/fs/binfmt_misc/register
$ chmod u+x hello.bc   (if needed)
$ ./hello.bc
This allows you to execute LLVM bitcode files directly. On Debian, you can also use this command instead of the 'echo' command above:
$ sudo update-binfmts --install llvm /path/to/lli --magic 'BC'
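The string written to the register file follows the binfmt_misc layout `:name:type:offset:magic:mask:interpreter:flags`, with type M selecting a match on magic bytes and the offset, mask, and flags fields left empty. The helper below, whose name `binfmt_rule` is an assumption made for this example, assembles such a string so that the individual fields are easier to see.

```shell
# Build a binfmt_misc registration string that matches on magic bytes.
#   $1: rule name   $2: magic bytes   $3: interpreter path
binfmt_rule() {
  # Fields: :name:type:offset:magic:mask:interpreter:flags
  # type is M (magic); offset, mask and flags are left empty.
  printf ':%s:M::%s::%s:\n' "$1" "$2" "$3"
}

binfmt_rule llvm BC /path/to/lli   # prints :llvm:M::BC::/path/to/lli:
```

Writing the result to /proc/sys/fs/binfmt_misc/register still requires root access, as noted above.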
One useful source of information about the LLVM source base is the LLVM doxygen documentation available at http://llvm.org/doxygen/. The following is a brief introduction to code layout:
This directory contains some simple examples of how to use the LLVM IR and JIT.
This directory contains public header files exported from the LLVM library. The three main subdirectories of this directory are:
This directory contains most of the source files of the LLVM system. In LLVM, almost all code exists in libraries, making it very easy to share code among the different tools.
This directory contains projects that are not strictly part of LLVM but are shipped with LLVM. This is also the directory where you should create your own LLVM-based projects. See llvm/projects/sample for an example of how to set up your own project.
This directory contains libraries which are compiled into LLVM bitcode and used when linking programs with the GCC front end. Most of these libraries are skeleton versions of real libraries; for example, libc is a stripped down version of glibc.
Unlike the rest of the LLVM suite, this directory needs the LLVM GCC front end to compile.
This directory contains feature and regression tests and other basic sanity checks on the LLVM infrastructure. These are intended to run quickly and cover a lot of territory without being exhaustive.
This is not a directory in the normal llvm module; it is a separate Subversion module that must be checked out (usually to projects/test-suite). This module contains a comprehensive correctness, performance, and benchmarking test suite for LLVM. It is a separate Subversion module because not every LLVM user is interested in downloading or building such a comprehensive test suite. For further details on this test suite, please see the Testing Guide document.
The tools directory contains the executables built out of the libraries above, which form the main part of the user interface. You can always get help for a tool by typing tool_name -help. The following is a brief introduction to the most important tools. More detailed information is in the Command Guide.
This directory contains utilities for working with LLVM source code, and some of the utilities are actually required as part of the build process because they are code generators for parts of LLVM infrastructure.
This section gives an example of using LLVM. llvm-gcc3 is now obsolete, so we only include instructions for llvm-gcc4.
Note: The gcc4 frontend's invocation is considerably different from the previous gcc3 frontend. In particular, the gcc4 frontend does not create bitcode by default: gcc4 produces native code. As the example below illustrates, the '--emit-llvm' flag is needed to produce LLVM bitcode output. For makefiles and configure scripts, the CFLAGS variable needs '--emit-llvm' to produce bitcode output.
First, create a simple C file, name it 'hello.c':
#include <stdio.h>

int main() {
  printf("hello world\n");
  return 0;
}
Next, compile the C file into a native executable:
% llvm-gcc hello.c -o hello
Note that llvm-gcc works just like GCC by default. The standard -S and -c arguments work as usual (producing a native .s or .o file, respectively).
Next, compile the C file into an LLVM bitcode file:
% llvm-gcc -O3 -emit-llvm hello.c -c -o hello.bc
The -emit-llvm option can be used with the -S or -c options to emit an LLVM ".ll" or ".bc" file (respectively) for the code. This allows you to use the standard LLVM tools on the bitcode file.
Unlike llvm-gcc3, llvm-gcc4 correctly responds to -O[0123] arguments.
Run the program in both forms. To run the program, use:
% ./hello
and
% lli hello.bc
The second example shows how to invoke the LLVM JIT, lli.
Use the llvm-dis utility to take a look at the LLVM assembly code:
% llvm-dis < hello.bc | less
Compile the program to native assembly using the LLC code generator:
% llc hello.bc -o hello.s
Assemble the native assembly language file into a program:
Solaris: % /opt/SUNWspro/bin/cc -xarch=v9 hello.s -o hello.native
Others:  % gcc hello.s -o hello.native
Execute the native code program:
% ./hello.native
Note that using llvm-gcc to compile directly to native code (i.e. when the -emit-llvm option is not present) does steps 6/7/8 for you.
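The flag combinations used in this example can be summarized as a small lookup: -S and -c produce native .s and .o files on their own, and .ll and .bc files when combined with -emit-llvm. The helper below, whose name `output_suffix` is an assumption made for this example, encodes only what the text above states; it does not run llvm-gcc.

```shell
# Print the output file suffix for a flag combination described above.
#   $1: -S or -c   $2: -emit-llvm, or omitted for native output
output_suffix() {
  case "$1,${2:-}" in
    -S,-emit-llvm) echo .ll ;;   # LLVM assembly
    -c,-emit-llvm) echo .bc ;;   # LLVM bitcode
    -S,)           echo .s ;;    # native assembly
    -c,)           echo .o ;;    # native object file
    *) echo "unsupported flag combination" >&2; return 1 ;;
  esac
}

output_suffix -c -emit-llvm   # prints .bc
output_suffix -S              # prints .s
```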
If you are having problems building or using LLVM, or if you have any other general questions about LLVM, please consult the Frequently Asked Questions page.
This document is just an introduction on how to use LLVM to do some simple things... there are many more interesting and complicated things that you can do that aren't documented here (but we'll gladly accept a patch if you want to write something up!). For more information about LLVM, check out: