Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------
Some code like entries, trampolines, or boot code needs to be written in
assembly. As in C, such code is grouped into functions and accompanied by
data. Standard assemblers do not force users to mark these pieces precisely
as code or data, or even to specify their length. Nevertheless, assemblers
provide developers with such annotations to aid debuggers throughout
assembly. On top of that, developers also want to mark some functions as
*global* in order to be visible outside of their translation units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Because the macros were never documented, they are
used in the wrong contexts in some places. Clearly, ``ENTRY`` was
intended to denote the beginning of global symbols (be it data or code).
``END`` used to mark the end of data or the end of special functions with
a *non-standard* calling convention. In contrast, ``ENDPROC`` should annotate
only ends of *standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly. For example, the result of
``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for :doc:`ORC unwinder <x86/orc-unwinder>`
for most code. Both of these are especially important to support reliable
stack traces which are in turn necessary for :doc:`Kernel live patching
<livepatch/livepatch>`.

Caveat and Discussion
---------------------
As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and, instead of extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lkml.kernel.org/r/20170217104757.28588-1-jslaby@suse.cz

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions, i.e. the stack contains a return address at
   the predefined place and a return from the function can happen in a
   standard way. When frame pointers are enabled, save/restore of frame
   pointer shall happen at the start/end of a function, respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly ignore checking of these functions. But some debug
   information still can be generated automatically. For correct debug data,
   this code needs hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

Instruction Macros
~~~~~~~~~~~~~~~~~~
This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. As in C, they both align the functions to
  architecture-specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing such
  object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(function_hook)
        ... asm insns ...
    SYM_FUNC_END(function_hook)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_START_ALIAS`` and ``SYM_FUNC_START_LOCAL_ALIAS`` serve those
  who decided to have two or more names for one function. The typical use is::

    SYM_FUNC_START_ALIAS(__memset)
    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)
    SYM_FUNC_END_ALIAS(__memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is emitted into
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. They are used exclusively
  for interrupt handlers and similar where the calling convention is not the C
  one. ``_NOALIGN`` variants exist too. The use is the same as for the ``FUNC``
  category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to the deprecated ``ENTRY`` and
  ``END``, except that ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` macros are used to denote a label inside some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END`` pair. They are very
  similar to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)
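
As a rough illustration of what the paired macros contribute, the following
userspace C sketch builds the directive text that a ``SYM_FUNC_*`` pair and a
``SYM_CODE_*`` pair might emit. The macro bodies are simplified stand-ins
written for this example, not the kernel's definitions: they produce string
literals instead of raw directives, the 16-byte alignment is assumed, and
``my_func``, ``my_entry``, ``annotated_func()`` and ``annotated_code()`` are
hypothetical names::

    /*
     * Simplified stand-ins for the kernel macros (assumptions of this
     * sketch). They expand to string literals so that the directives can
     * be inspected from an ordinary C program.
     */
    #define SYM_FUNC_START(name)                \
            ".globl " #name "\n"                \
            ".balign 16\n"                      \
            #name ":\n"

    /* FUNC: mark the symbol as a function and record its size. */
    #define SYM_FUNC_END(name)                  \
            ".type " #name " STT_FUNC\n"        \
            ".size " #name ", .-" #name "\n"

    /* CODE: same start in this sketch, but no function type at the end. */
    #define SYM_CODE_START(name)    SYM_FUNC_START(name)
    #define SYM_CODE_END(name)                  \
            ".type " #name " STT_NOTYPE\n"      \
            ".size " #name ", .-" #name "\n"

    /* Directive text for a standard, C-callable function. */
    const char *annotated_func(void)
    {
            return SYM_FUNC_START(my_func)
                   "\t/* asm insns */\n"
                   SYM_FUNC_END(my_func);
    }

    /* Directive text for code with a non-C calling convention. */
    const char *annotated_code(void)
    {
            return SYM_CODE_START(my_entry)
                   "\t/* asm insns */\n"
                   SYM_CODE_END(my_entry);
    }

Passing ``annotated_func()`` to ``puts()`` shows the ``.type`` and ``.size``
directives that let tools find the function boundaries. In this sketch the
``SYM_CODE_*`` pair differs only in the symbol type it records.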

Data Macros
~~~~~~~~~~~
Similar to instructions, there are a couple of macros to describe data in the
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label to the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.
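
To make the role of the end label concrete, here is a small userspace C
sketch. It uses simplified, string-producing stand-ins for
``SYM_DATA_START_LOCAL`` and ``SYM_DATA_END_LABEL`` (assumptions made for
illustration, not the kernel's definitions) and applies them to the ``lstack``
example above. ``lstack_directives()`` is a hypothetical helper::

    /* Linkage selectors: a local symbol needs no directive at all. */
    #define SYM_L_GLOBAL(name)      ".globl " #name "\n"
    #define SYM_L_LOCAL(name)       ""

    /* Start of a local data object: just the label. */
    #define SYM_DATA_START_LOCAL(name)      \
            SYM_L_LOCAL(name)               \
            #name ":\n"

    /* End of data: emit the extra end label, then the type and size. */
    #define SYM_DATA_END_LABEL(name, linkage, label)        \
            linkage(label)                                  \
            ".type " #label " STT_OBJECT\n"                 \
            #label ":\n"                                    \
            ".type " #name " STT_OBJECT\n"                  \
            ".size " #name ", .-" #name "\n"

    /* Directive text for a 4096-byte local stack with an end label. */
    const char *lstack_directives(void)
    {
            return SYM_DATA_START_LOCAL(lstack)
                   "\t.skip 4096\n"
                   SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end);
    }

Because ``lstack_end:`` is placed right after the data, it names the first
byte past ``lstack``, while ``.size`` still covers exactly the skipped bytes.
Passing ``SYM_L_GLOBAL`` instead would add a ``.globl lstack_end`` line in
this sketch.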

Support Macros
~~~~~~~~~~~~~~
All the above ultimately reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them. They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.
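
The layering can be pictured with the following C sketch. The definitions are
simplified assumptions made for this example (string literals instead of raw
directives, an assumed 16-byte alignment), and ``start_global()``,
``start_weak()`` and ``start_local()`` are hypothetical helpers. The point is
only that the higher-level macros differ in the linkage and alignment they
pass down to ``SYM_START``::

    /* Linkage selectors, passed by name and applied to the symbol. */
    #define SYM_L_GLOBAL(name)      ".globl " #name "\n"
    #define SYM_L_WEAK(name)        ".weak " #name "\n"
    #define SYM_L_LOCAL(name)       ""

    /* Alignment selectors (16 bytes is an assumption of this sketch). */
    #define SYM_A_ALIGN             ".balign 16\n"
    #define SYM_A_NONE              ""

    /* Lowest layer: linkage directive, alignment, then the label. */
    #define SYM_ENTRY(name, linkage, align) \
            linkage(name)                   \
            align                           \
            #name ":\n"

    #define SYM_START(name, linkage, align) \
            SYM_ENTRY(name, linkage, align)

    /* Higher-level macros only choose the arguments. */
    #define SYM_FUNC_START(name)                    \
            SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)
    #define SYM_FUNC_START_WEAK(name)               \
            SYM_START(name, SYM_L_WEAK, SYM_A_ALIGN)
    #define SYM_FUNC_START_LOCAL_NOALIGN(name)      \
            SYM_START(name, SYM_L_LOCAL, SYM_A_NONE)

    const char *start_global(void) { return SYM_FUNC_START(f); }
    const char *start_weak(void)   { return SYM_FUNC_START_WEAK(f); }
    const char *start_local(void)  { return SYM_FUNC_START_LOCAL_NOALIGN(f); }

In this sketch the three helpers differ only in the first directive
(``.globl``, ``.weak``, or none) and in whether an alignment line is present.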


Overriding Macros
~~~~~~~~~~~~~~~~~
An architecture can also override any of the macros in its own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``).  As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
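
The override mechanism relies on ordinary preprocessor guards. Below is a
minimal C sketch, assuming a hypothetical architecture that wants 64-byte
alignment and guards written with ``#ifndef``. The two commented regions stand
in for the architecture's ``asm/linkage.h`` and for the generic header; the
string-literal macro bodies, the alignment values, and the helpers
``start_directives()`` and ``end_directives()`` are assumptions of this
example::

    /* ---- stand-in for the architecture's asm/linkage.h ---- */
    #define SYM_A_ALIGN     ".balign 64\n"  /* hypothetical override */

    /* ---- stand-in for the generic header, included afterwards ---- */
    #ifndef SYM_A_ALIGN
    #define SYM_A_ALIGN     ".balign 16\n"  /* generic default, skipped here */
    #endif

    #ifndef SYM_T_FUNC
    #define SYM_T_FUNC      "STT_FUNC"      /* not overridden, default applies */
    #endif

    #ifndef SYM_FUNC_START
    #define SYM_FUNC_START(name)            \
            ".globl " #name "\n"            \
            SYM_A_ALIGN                     \
            #name ":\n"
    #endif

    #ifndef SYM_FUNC_END
    #define SYM_FUNC_END(name)                      \
            ".type " #name " " SYM_T_FUNC "\n"      \
            ".size " #name ", .-" #name "\n"
    #endif

    const char *start_directives(void) { return SYM_FUNC_START(f); }
    const char *end_directives(void)   { return SYM_FUNC_END(f); }

Here the generic ``SYM_FUNC_START`` picks up the architecture's alignment
without being redefined itself, while ``SYM_T_FUNC`` falls back to the generic
default because the architecture part did not define it.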