<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>

<chapter id="cl-format" xreflabel="Callgrind Format Specification">
<title>Callgrind Format Specification</title>

<para>This chapter describes the Callgrind Profile Format, Version 1.</para>

<para>A synonymous name is "Calltree Profile Format". Both names refer to the
same format, since Callgrind was previously named Calltree.</para>

<para>The format description is meant to help users understand the file
contents; more importantly, it is intended for authors of measurement or
visualization tools that need to read and write this format.</para>

<sect1 id="cl-format.overview" xreflabel="Overview">
<title>Overview</title>

<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upward compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>

<para>This chapter gives an overview of format features and examples.
For detailed syntax, see the format reference.</para>

<sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
<title>Basic Structure</title>

<para>Each file has a header part consisting of an arbitrary number of lines
of the format "key: value". The lines with key "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names. Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>

<para>The "events" header line is always required, in contrast to the optional
line for "positions", which defaults to "line", i.e. a line number of some
source file. In addition, the second part of the file contains position
specifications of the form "spec=name". "spec" can be e.g. "fn" for a
function name or "fl" for a file name. Cost lines are always related to
the function/file specifications given directly before.</para>
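<para>As an illustration of this two-part structure, the following Python
sketch splits a profile into header settings and body lines. It is not taken
from Callgrind or KCachegrind: the function name
<function>parse_header</function> and the rule used to find the end of the
header (the first line that is not of the form "key: value") are assumptions
made for this example only.</para>
<programlisting><![CDATA[
def parse_header(lines):
    """Return (header dict, remaining body lines) for a list of text lines."""
    header = {"positions": ["line"]}   # default when no "positions" line exists
    body_start = len(lines)
    for index, line in enumerate(lines):
        text = line.strip()
        if not text or text.startswith("#"):
            continue                   # empty lines and comments are skipped
        key, sep, value = text.partition(":")
        # Assumption: a header key is purely alphabetic, so "fl=file.f" or a
        # cost line such as "15 90 14 2" marks the start of the body.
        if not sep or not key.isalpha():
            body_start = index
            break
        if key in ("positions", "events"):
            header[key] = value.split()
        else:
            header[key] = value.strip()
    if "events" not in header:
        raise ValueError('missing required "events" header line')
    return header, lines[body_start:]


sample = """events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12""".splitlines()

header, body = parse_header(sample)
print(header["positions"], header["events"])
print(body)
]]></programlisting>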

</sect2>

<sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
<title>Simple Example</title>

<para>The event names in the following example are quite arbitrary, and are not
related to event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
are bound to simulations of simple machine models for acceptable slowdown.
However, any profiling tool could use the format described in this chapter.</para>

<para>
<screen>events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12</screen></para>

<para>The above example gives profile information for event types "Cycles",
"Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
passed by, number of executed instructions, and number of floating point
operations executed while running code corresponding to some source
position. As there is no line specifying the value of "positions", it defaults
to "line", which means that the first number of a cost line is always a line
number.</para>

<para>Thus, the first cost line specifies that in line 15 of source file
<filename>file.f</filename> there is code belonging to function
<function>main</function>. While running, 90 CPU cycles passed by, and 2 of
the 14 instructions executed were floating point operations. Similarly, the
next line specifies that there were 12 instructions executed in the context
of function <function>main</function> which can be related to line 16 in
file <filename>file.f</filename>, taking 20 CPU cycles. If a cost line
specifies fewer event counts than given in the "events" line, the rest are
assumed to be zero, i.e. there was no floating point instruction executed
relating to line 16.</para>
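<para>The zero-fill rule for short cost lines can be sketched in Python as
follows. The helper name <function>parse_cost_line</function> is hypothetical,
and the sketch assumes absolute subpositions only (relative subpositions are
not handled here).</para>
<programlisting><![CDATA[
def parse_cost_line(line, positions, events):
    """Split one cost line into subpositions and per-event counts."""
    numbers = [int(tok, 16) if tok.lower().startswith("0x") else int(tok)
               for tok in line.split()]
    subpositions = dict(zip(positions, numbers[:len(positions)]))
    counts = numbers[len(positions):]
    # Event counts missing at the end of the line are assumed to be zero.
    counts += [0] * (len(events) - len(counts))
    return subpositions, dict(zip(events, counts))


events = ["Cycles", "Instructions", "Flops"]
print(parse_cost_line("15 90 14 2", ["line"], events))
print(parse_cost_line("16 20 12", ["line"], events))
]]></programlisting>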

<para>Note that regular cost lines always give self (also called exclusive)
cost of code at a given position. If you specify multiple cost lines for the
same position, these will be summed up. On the other hand, in the example above
there is no specification of how many times function
<function>main</function> actually was
called: profile data only contains sums.</para>

</sect2>


<sect2 id="cl-format.overview.associations" xreflabel="Associations">
<title>Associations</title>

<para>The most important extension to the original format of Cachegrind is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of 2 lines. For calls, the format
looks like
<screen>
 calls=(Call Count) (Destination position)
 (Source position) (Inclusive cost of call)
</screen></para>

<para>The destination only specifies subpositions like line number. Therefore,
to be able to specify a call to another function in another source file, you
have to precede the above lines with a "cfn=" specification for the name of the
called function, and a "cfl=" specification if the function is in another
source file. The 2nd line looks like a regular cost line with the difference
that inclusive cost spent inside of the function call has to be specified.</para>

<para>Other associations are for example (conditional) jumps. See the
reference below for details.</para>

</sect2>


<sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
<title>Extended Example</title>

<para>The following example shows 3 functions, <function>main</function>,
<function>func1</function>, and <function>func2</function>. Function
<function>main</function> calls <function>func1</function> once and
<function>func2</function> 3 times. <function>func1</function> calls
<function>func2</function> 2 times.
<screen>events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfl=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfl=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700</screen></para>

<para>One can see that in <function>main</function> only code from line 16
is executed, which is also where the other functions are called. Inclusive cost of
<function>main</function> is 820, which is the sum of self cost 20 and costs
spent in the calls: 400 for the single call to <function>func1</function>
and 400 as sum for the three calls to <function>func2</function>.</para>

<para>Function <function>func1</function> is located in
<filename>file1.c</filename>, the same as <function>main</function>.
Therefore, a "cfl=" specification for the call to <function>func1</function>
is not needed. The function <function>func1</function> only consists of code
at line 51 of <filename>file1.c</filename>, where <function>func2</function>
is called.</para>
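<para>The relation between self cost and inclusive cost can be checked with a
short Python sketch over this example. The function name
<function>summarize</function> is hypothetical, and the code assumes a single
event type and absolute decimal line numbers, as in the example; it is not a
general reader for the format.</para>
<programlisting><![CDATA[
PROFILE = """\
events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfl=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfl=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700
"""


def summarize(text):
    """Return {function: [self cost, inclusive cost]} for one event type."""
    costs = {}
    current = None
    in_call = False                      # True right after a "calls=" line
    for line in text.splitlines():
        if line.startswith("fn="):
            current = line[3:]
            costs.setdefault(current, [0, 0])
        elif line.startswith("calls="):
            in_call = True
        elif line[:1].isdigit():
            cost = int(line.split()[1])  # first number is the line number
            if not in_call:
                costs[current][0] += cost    # regular cost line: self cost
            costs[current][1] += cost        # inclusive = self + call costs
            in_call = False
    return costs


for name, (self_cost, inclusive) in summarize(PROFILE).items():
    print(name, self_cost, inclusive)
# main 20 820
# func1 100 400
# func2 700 700
]]></programlisting>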

</sect2>


<sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
<title>Name Compression</title>

<para>With the introduction of association specifications like calls, it
becomes necessary to specify the same function or file name multiple times. As
absolute filenames or symbol names in C++ can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.
Here, the term "position" corresponds to a file name (source or object file)
or function name.</para>

<para>To support name compression, a position specification can be not only of
the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
integer ID to a name, and "spec=(ID)" to reference a previously defined ID
mapping. There is a separate ID mapping for each position specification,
i.e. you can use ID 1 for both a file name and a symbol name.</para>
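<para>A reader could resolve compressed names roughly as in the following
Python sketch. The class name <classname>NameTable</classname> is hypothetical.
The grouping of specifications into shared tables (for instance "fn" and "cfn"
sharing the function table) is an assumption for this sketch, inferred from
examples in which an ID defined by "cfn=" is later referenced by "fn=".</para>
<programlisting><![CDATA[
import re

# "(ID) name" defines a mapping, "(ID)" alone references an earlier one.
COMPRESSED = re.compile(r"\((\d+)\)\s*(.*)")

# Assumption: specs naming the same kind of position share one ID table.
KIND = {"ob": "object", "cob": "object",
        "fl": "file", "fi": "file", "fe": "file", "cfl": "file",
        "fn": "function", "cfn": "function"}


class NameTable:
    def __init__(self):
        self.tables = {}

    def resolve(self, spec, value):
        """Return the real name for the value of a "spec=value" line."""
        table = self.tables.setdefault(KIND[spec], {})
        match = COMPRESSED.fullmatch(value.strip())
        if match is None:
            return value.strip()         # uncompressed name
        ident, name = int(match.group(1)), match.group(2)
        if name:
            table[ident] = name          # definition: remember the mapping
        return table[ident]              # KeyError if ID was never defined


names = NameTable()
print(names.resolve("fl", "(1) file1.c"))
print(names.resolve("fn", "(1) main"))
print(names.resolve("cfn", "(2) func1"))
print(names.resolve("fn", "(2)"))
]]></programlisting>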

<para>With string compression, the example from
<xref linkend="cl-format.overview.example2"/> looks like this:
<screen>events: Instructions

fl=(1) file1.c
fn=(1) main
16 20
cfn=(2) func1
calls=1 50
16 400
cfl=(2) file2.c
cfn=(3) func2
calls=3 20
16 400

fn=(2)
51 100
cfl=(2)
cfn=(3)
calls=2 20
51 300

fl=(2)
fn=(3)
20 700</screen></para>

<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines. Thus, the above example can also be written as
<screen>events: Instructions

# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2

fl=(1)
fn=(1)
16 20
...</screen></para>

</sect2>


<sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
<title>Subposition Compression</title>

<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify subposition "instr" in the "positions:" header line,
and each cost line has to include the address of some instruction. Addresses
are allowed to have a size of 64 bits to support 64-bit architectures. Thus,
repeating similar, long addresses for almost every line in the data file can
enlarge the file size quite significantly, which
motivates subposition compression: instead of every cost line starting with
a 16-character address, one is allowed to specify relative addresses.
This relative specification is not only allowed for instruction addresses, but
also for line numbers; both addresses and line numbers are called "subpositions".</para>

<para>A relative subposition is always based on the corresponding subposition
of the last cost line, and starts with a "+" to specify a positive difference,
a "-" to specify a negative difference, or consists of "*" to specify the same
subposition. Because absolute subpositions are always positive (i.e. never
prefixed by "-"), any relative specification is unambiguous; additionally,
absolute and relative subposition specifications can be mixed freely.
Assume the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
0x80001237 90 5
0x80001238 91 6</screen></para>

<para>With subposition compression, this looks like
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
+3 * 5
+1 +1 6</screen></para>
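<para>Expanding relative subpositions back into absolute values can be
sketched in Python as follows. The helper names
<function>parse_number</function> and
<function>expand_subpositions</function> are hypothetical, and the starting
value of 0 for the "previous" subpositions is an assumption that only matters
if the very first cost line is relative.</para>
<programlisting><![CDATA[
def parse_number(token):
    """Parse a decimal or "0x"-prefixed hexadecimal number."""
    return int(token, 16) if token.lower().startswith("0x") else int(token)


def expand_subpositions(tokens, previous):
    """Turn possibly relative subposition tokens into absolute numbers."""
    absolute = []
    for token, last in zip(tokens, previous):
        if token == "*":
            absolute.append(last)                            # same as before
        elif token.startswith("+"):
            absolute.append(last + parse_number(token[1:]))  # positive delta
        elif token.startswith("-"):
            absolute.append(last - parse_number(token[1:]))  # negative delta
        else:
            absolute.append(parse_number(token))             # absolute value
    return absolute


cost_lines = ["0x80001234 90 1", "+3 * 5", "+1 +1 6"]
previous = [0, 0]            # one entry per subposition: instr, line
expanded = []
for line in cost_lines:
    tokens = line.split()
    previous = expand_subpositions(tokens[:2], previous)
    expanded.append((hex(previous[0]), previous[1], int(tokens[2])))
print(expanded)
# [('0x80001234', 90, 1), ('0x80001237', 90, 5), ('0x80001238', 91, 6)]
]]></programlisting>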

<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary, i.e. for
relocatable shared objects, often a load offset has to be subtracted.</para>

</sect2>


<sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
<title>Miscellaneous</title>

<sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
<title>Cost Summary Information</title>

<para>For the visualization to be able to show cost percentage, a sum of the
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file. But sometimes, this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run. This has another effect: an import filter can show a progress bar
while loading a large data file if it knows the cost sum in advance.</para>

</sect3>

<sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
<title>Long Names for Event Types and inherited Types</title>

<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify some
longer, more descriptive name. For an event type "Ir" which means "Instruction
Fetches", this can be specified with the header line
<screen>event: Ir : Instruction Fetches
events: Ir Dr</screen></para>

<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>
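<para>Evaluating an inherited event type could look like the following Python
sketch. The function name <function>evaluate_inherited</function> is
hypothetical, the cost values are made up for illustration, and only decimal
factors are handled; the optional factor in front of a name follows the
InheritedExpr rule of the grammar in the reference section.</para>
<programlisting><![CDATA[
import re

# One term of an inherited expression: optional factor, optional "*", name.
TERM = re.compile(r"(?:(\d+)\s*\*?\s*)?([A-Za-z][A-Za-z0-9]*)")


def evaluate_inherited(expression, costs):
    """Evaluate e.g. "Ir + Dr" or "2 * Ir + Dr" against raw event costs."""
    total = 0
    for term in expression.split("+"):
        match = TERM.fullmatch(term.strip())
        if match is None:
            raise ValueError("cannot parse term: %r" % term)
        factor = int(match.group(1)) if match.group(1) else 1
        total += factor * costs[match.group(2)]
    return total


raw = {"Ir": 100, "Dr": 40}      # illustrative values, not real measurements
print(evaluate_inherited("Ir + Dr", raw))
]]></programlisting>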

</sect3>

</sect2>

</sect1>

<sect1 id="cl-format.reference" xreflabel="Reference">
<title>Reference</title>

<sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
<title>Grammar</title>

<para>
<screen>ProfileDataFile := FormatVersion? Creator? PartData*</screen>
<screen>FormatVersion := "version:" Space* Number "\n"</screen>
<screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
<screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
<screen>HeaderLine := (empty line)
  | ('#' NoNewLineChar*)
  | PartDetail
  | Description
  | EventSpecification
  | CostLineDef</screen>
<screen>PartDetail := TargetCommand | TargetID</screen>
<screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
<screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
<screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
<screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
<screen>InheritedDef := "=" InheritedExpr</screen>
<screen>InheritedExpr := Name
  | Number Space* ("*" Space*)? Name
  | InheritedExpr Space* "+" Space* InheritedExpr</screen>
<screen>LongNameDef := ":" NoNewLineChar*</screen>
<screen>CostLineDef := "events:" Space* Name (Space+ Name)*
  | "positions:" "instr"? (Space+ "line")?</screen>
<screen>BodyLine := (empty line)
  | ('#' NoNewLineChar*)
  | CostLine
  | PositionSpecification
  | AssociationSpecification</screen>
<screen>CostLine := SubPositionList Costs?</screen>
<screen>SubPositionList := (SubPosition+ Space+)+</screen>
<screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
<screen>Costs := (Number Space+)+</screen>
<screen>PositionSpecification := Position "=" Space* PositionName</screen>
<screen>Position := CostPosition | CalledPosition</screen>
<screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfl" | "cfn"</screen>
<screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
<screen>AssociationSpecification := CallSpecification
  | JumpSpecification</screen>
<screen>CallSpecification := CallLine "\n" CostLine</screen>
<screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
<screen>JumpSpecification := ...</screen>
<screen>Space := " " | "\t"</screen>
<screen>Number := HexNumber | (Digit)+</screen>
<screen>Digit := "0" | ... | "9"</screen>
<screen>HexNumber := "0x" (Digit | HexChar)+</screen>
<screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
<screen>NoNewLineChar := all characters without "\n"</screen>
</para>
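<para>As a rough aid for tool authors, the BodyLine alternatives of the
grammar can be told apart by their first characters. The following Python
sketch shows one way to do this; the function name
<function>classify_body_line</function> and the returned labels are
hypothetical, and the "jump=" and "jcnd=" prefixes are taken from the
description of body lines rather than from the grammar, which leaves
JumpSpecification open.</para>
<programlisting><![CDATA[
import re

# Position specifications from the CostPosition and CalledPosition rules.
POSITION = re.compile(r"(ob|fl|fi|fe|fn|cob|cfl|cfn)=")


def classify_body_line(line):
    """Return a label describing which kind of body line this is."""
    text = line.strip()
    if not text:
        return "empty"
    if text.startswith("#"):
        return "comment"
    if text.startswith("calls="):
        return "call"
    if text.startswith(("jump=", "jcnd=")):
        return "jump"
    if POSITION.match(text):
        return "position"
    # Cost lines start with a subposition: a number, "+n", "-n" or "*".
    if text[0].isdigit() or text[0] in "+-*":
        return "cost"
    return "unknown"


for sample in ["fn=(1) main", "16 20", "calls=1 50", "+3 * 5", "# note", ""]:
    print(repr(sample), classify_body_line(sample))
]]></programlisting>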

</sect2>

<sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
<title>Description of Header Lines</title>

<para>The header has an arbitrary number of lines of the format
"key: value". Possible <emphasis>key</emphasis> values for the header are:</para>

<itemizedlist>

  <listitem>
    <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
    <para>This is used to distinguish future profile data formats.  A
    major version of 0 or 1 is supposed to be upward compatible with
    Cachegrind's format.  It is optional; if it does not appear, version 1
    is assumed.  Otherwise, this has to be the first header line.</para>
  </listitem>

  <listitem>
    <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
    <para>This specifies the process ID of the supervised application
    for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
    <para>This specifies the full command line of the supervised
    application for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
    <para>This specifies a sequentially incremented number for each dump
    generated, starting at 1.</para>
  </listitem>

  <listitem>
    <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
    <para>This specifies various information for this dump.  For some
    types, the semantic is defined, but any description type is allowed.
    Unknown types should be ignored.</para>
    <para>There are the types "I1 cache", "D1 cache", "LL cache", which
    specify parameters used for the cache simulator.  These are the only
    types originally used by Cachegrind.  Additionally, Callgrind uses
    the following types:  "Timerange" gives a rough range of the basic
    block counter, for which the cost of this dump was collected.
    Type "Trigger" states why this trace was generated,
    e.g. program termination or forced interactive dump.</para>
  </listitem>

  <listitem>
    <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
    <para>For cost lines, this defines the semantic of the first numbers.
    Any combination of "instr", "bb" and "line" is allowed, but has to be
    in this order which corresponds to position numbers at the start of
    the cost lines later in the file.</para>
    <para>If "instr" is specified, the position is the address of an
    instruction whose execution raised the events given later on the
    line.  This address is relative to the offset of the binary/shared
    library file to not have to specify relocation info.  For "line",
    the position is the line number of a source file, which is
    responsible for the events raised. Note that the mapping of "instr"
    and "line" positions is given by the debugging line information
    produced by the compiler.</para>
    <para>This field is optional. If not specified, only "line" is
    assumed.</para>
  </listitem>

  <listitem>
    <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
    <para>A list of short names of the event types logged in this file.
    The order is the same as in cost lines.  The first event type is the
    second or third number in a cost line, depending on the value of
    "positions".  Callgrind does not add additional cost types.  Specify
    exactly once.</para>
    <para>Cost types from original Cachegrind are:
      <itemizedlist>
        <listitem>
          <para><command>Ir</command>: Instruction read access</para>
        </listitem>
        <listitem>
          <para><command>I1mr</command>: Instruction Level 1 read cache miss</para>
        </listitem>
        <listitem>
          <para><command>ILmr</command>: Instruction last-level read cache miss</para>
        </listitem>
        <listitem>
          <para>...</para>
        </listitem>
      </itemizedlist>
    </para>
  </listitem>

  <listitem>
    <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
    <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
    <para>The total number of events covered by this trace
    file.  Both keys have the same meaning, but the "totals:" line
    happens to be at the end of the file, while "summary:" appears in
    the header.  This was added to allow postprocessing tools to know
    the total cost in advance. The two lines always give the same cost
    counts.</para>
  </listitem>

</itemizedlist>

</sect2>

<sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
<title>Description of Body Lines</title>

<para>There exist lines
<computeroutput>spec=position</computeroutput>.  The values for position
specifications are arbitrary strings.  When starting with "(" and a
digit, it's a string in compressed format.  Otherwise it's the real
position string.  This allows for file and symbol names as position
strings, as these never start with "(" + <emphasis>digit</emphasis>.
The compressed format is either "(" <emphasis>number</emphasis> ")"
<emphasis>space</emphasis> <emphasis>position</emphasis> or only
"(" <emphasis>number</emphasis> ")".  The first relates
<emphasis>position</emphasis> to <emphasis>number</emphasis> in the
context of the given format specification from this line to the end of
the file; it makes the (<emphasis>number</emphasis>) an alias for
<emphasis>position</emphasis>.  Compressed format is always
optional.</para>

<para>Position specifications allowed:</para>
<itemizedlist>

  <listitem>
    <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
    <para>The ELF object in which the cost of the next cost lines occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
    <para>The source file including the code which is responsible for
    the cost of the next cost lines. "fi="/"fe=" is used when the source
    file changes inside of a function, i.e. for inlined code.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
    <para>The name of the function in which the cost of the next cost lines
    occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
    <para>The ELF object of the target of the next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
    <para>The source file including the code of the target of the
    next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
    <para>The name of the target function of the next call cost
    lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>calls=</computeroutput> [Callgrind]</para>
    <para>The number of nonrecursive calls which are responsible for the
    cost specified by the next call cost line. This is the cost spent
    inside of the called function.</para>
    <para>After "calls=" there MUST be a cost line. This is the cost
    spent in the called function. The first number is the source line
    from where the call happened.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jump=count target position</computeroutput> [Callgrind]</para>
    <para>Unconditional jump, executed count times, to the given target
    position.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jcnd=exe.count jumpcount target position</computeroutput> [Callgrind]</para>
    <para>Conditional jump, executed exe.count times with jumpcount
    jumps to the given target position.</para>
  </listitem>

</itemizedlist>

</sect2>

</sect1>

</chapter>