• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# Development
2
3Updated: 29 July 2021
4
5This document is meant for the day when I (Gavin D. Howard) get [hit by a
6bus][1]. In other words, it's meant to make the [bus factor][1] a non-issue.
7
8This document is supposed to contain all of the knowledge necessary to develop
9`bc` and `dc`.
10
11In addition, this document is meant to add to the [oral tradition of software
12engineering][118], as described by Bryan Cantrill.
13
14This document will reference other parts of the repository. That is so a lot of
15the documentation can be closest to the part of the repo where it is actually
16necessary.
17
18## What Is It?
19
20This repository contains an implementation of both [POSIX `bc`][2] and [Unix
21`dc`][3].
22
23POSIX `bc` is a standard utility required for POSIX systems. `dc` is a
24historical utility that was included in early Unix and even predates both Unix
25and C. They both are arbitrary-precision command-line calculators with their own
26programming languages. `bc`'s language looks similar to C, with infix notation
27and including functions, while `dc` uses [Reverse Polish Notation][4] and allows
28the user to execute strings as though they were functions.
29
30In addition, it is also possible to build the arbitrary-precision math as a
31library, named [`bcl`][156].
32
33**Note**: for ease, I will refer to both programs as `bc` in this document.
34However, if I say "just `bc`," I am referring to just `bc`, and if I say `dc`, I
35am referring to just `dc`.
36
37### History
38
39This project started in January 2018 when a certain individual on IRC, hearing
40that I knew how to write parsers, asked me to write a `bc` parser for his math
41library. I did so. I thought about writing my own math library, but he
42disparaged my programming skills and made me think that I couldn't do it.
43
44However, he took so long to do it that I eventually decided to give it a try and
45had a working math portion in two weeks. It taught me that I should not listen
46to such people.
47
48From that point, I decided to make it an extreme learning experience about how
49to write quality software.
50
51That individual's main goal had been to get his `bc` into [toybox][16], and I
52managed to get my own `bc` in. I also got it in [busybox][17].
53
54Eventually, in late 2018, I also decided to try my hand at implementing
55[Karatsuba multiplication][18], an algorithm that that unnamed individual
56claimed I could never implement. It took me a bit, but I did it.
57
58This project became a passion project for me, and I continued. In mid-2019,
59Stefan Eßer suggested I improve performance by putting more than 1 digit in each
60section of the numbers. After I showed immaturity because of some burnout, I
61implemented his suggestion, and the results were incredible.
62
63Since that time, I have gradually been improving the `bc` as I have learned more
64about things like fuzzing, [`scan-build`][19], [valgrind][20],
65[AddressSanitizer][21] (and the other sanitizers), and many other things.
66
67One of my happiest moments was when my `bc` was made the default in FreeBSD.
68
69But since I believe in [finishing the software I write][22], I have done less
70work on `bc` over time, though there are still times when I put a lot of effort
71in, such as now (17 June 2021), when I am attempting to convince OpenBSD to use
72my `bc`.
73
74And that is why I am writing this document: someday, someone else is going to
75want to change my code, and this document is my attempt to make it as simple as
76possible.
77
78### Values
79
80[According to Bryan Cantrill][10], all software has values. I think he's
81correct, though I [added one value for programming languages in particular][11].
82
83However, for `bc`, his original list will do:
84
85* Approachability
86* Availability
87* Compatibility
88* Composability
89* Debuggability
90* Expressiveness
91* Extensibility
92* Interoperability
93* Integrity
94* Maintainability
95* Measurability
96* Operability
97* Performance
98* Portability
99* Resiliency
100* Rigor
101* Robustness
102* Safety
103* Security
104* Simplicity
105* Stability
106* Thoroughness
107* Transparency
108* Velocity
109
110There are several values that don't apply. The reason they don't apply is
111because `bc` and `dc` are existing utilities; this is just another
112reimplementation. The designs of `bc` and `dc` are set in stone; there is
113nothing we can do to change them, so let's get rid of those values that would
114apply to their design:
115
116* Compatibility
117* Integrity
118* Maintainability
119* Measurability
120* Performance
121* Portability
122* Resiliency
123* Rigor
124* Robustness
125* Safety
126* Security
127* Simplicity
128* Stability
129* Thoroughness
130* Transparency
131
132Furthermore, some of the remaining ones don't matter to me, so let me get rid of
133those and order the rest according to my *actual* values for this project:
134
135* Robustness
136* Stability
137* Portability
138* Compatibility
139* Performance
140* Security
141* Simplicity
142
143First is **robustness**. This `bc` and `dc` should be robust, accepting any
144input, never crashing, and instead, returning an error.
145
146Closely related to that is **stability**. The execution of `bc` and `dc` should
147be deterministic and never change for the same inputs, including the
148pseudo-random number generator (for the same seed).
149
150Third is **portability**. These programs should run everywhere that POSIX
151exists, as well as Windows. This means that just about every person on the
152planet will have access to these programs.
153
154Next is **compatibility**. These programs should, as much as possible, be
155compatible with other existing implementations and standards.
156
157Then we come to **performance**. A calculator is only usable if it's fast, so
158these programs should run as fast as possible.
159
160After that is **security**. These programs should *never* be the reason a user's
161computer is compromised.
162
163And finally, **simplicity**. Where possible, the code should be simple, while
164deferring to the above values.
165
166Keep these values in mind for the rest of this document, and for exploring any
167other part of this repo.
168
169#### Portability
170
171But before I go on, I want to talk about portability in particular.
172
173Most of these principles just require good attention and care, but portability
174is different. Sometimes, it requires pulling in code from other places and
175adapting it. In other words, sometimes I need to duplicate and adapt code.
176
177This happened in a few cases:
178
179* Option parsing (see [`include/opt.h`][35]).
180* History (see [`include/history.h`][36]).
181* Pseudo-Random Number Generator (see [`include/rand.h`][37]).
182
183This was done because I decided to ensure that `bc`'s dependencies were
184basically zero. In particular, either users have a normal install of Windows or
185they have a POSIX system.
186
187A POSIX system limited me to C99, `sh`, and zero external dependencies. That
188last item is why I pull code into `bc`: if I pull it in, it's not an external
189dependency.
190
191That's why `bc` has duplicated code. Remove it, and you risk `bc` not being
192portable to some platforms.
193
194## Suggested Course
195
196I do have a suggested course for programmers to follow when trying to understand
197this codebase. The order is this:
198
1991.	`bc` Spec.
2002.	Manpages.
2013.	Test suite.
2024.	Understand the build.
2035.	Algorithms manual.
2046.	Code concepts.
2057.	Repo structure.
2068.	Headers.
2079.	Source code.
208
209This order roughly follows this order:
210
2111. High-level requirements
2122. Low-level requirements
2133. High-level implementation
2144. Low-level implementation
215
216In other words, first understand what the code is *supposed* to do, then
217understand the code itself.
218
219## Useful External Tools
220
221I have a few tools external to `bc` that are useful:
222
223* A [Vim plugin with syntax files made specifically for my `bc` and `dc`][132].
224* A [repo of `bc` and `dc` scripts][133].
225* A set of `bash` aliases (see below).
226* A `.bcrc` file with items useful for my `bash` setup (see below).
227
228My `bash` aliases are these:
229
230```sh
231alias makej='make -j16'
232alias mcmake='make clean && make'
233alias mcmakej='make clean && make -j16'
234alias bcdebug='CPPFLAGS="-DBC_DEBUG_CODE=1" CFLAGS="-Weverything -Wno-padded \
235    -Wno-switch-enum -Wno-format-nonliteral -Wno-cast-align \
236    -Wno-unreachable-code-return -Wno-missing-noreturn \
237    -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \
238    -pedantic -std=c99" ./configure.sh'
239alias bcconfig='CFLAGS="-Weverything -Wno-padded -Wno-switch-enum \
240    -Wno-format-nonliteral -Wno-cast-align -Wno-unreachable-code-return \
241    -Wno-missing-noreturn -Wno-disabled-macro-expansion -Wno-unreachable-code \
242    -Wall -Wextra -pedantic -std=c99" ./configure.sh'
243alias bcnoassert='CPPFLAGS="-DNDEBUG" CFLAGS="-Weverything -Wno-padded \
244    -Wno-switch-enum -Wno-format-nonliteral -Wno-cast-align \
245    -Wno-unreachable-code-return -Wno-missing-noreturn \
246    -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \
247    -pedantic -std=c99" ./configure.sh'
248alias bcdebugnoassert='CPPFLAGS="-DNDEBUG -DBC_DEBUG_CODE=1" \
249    CFLAGS="-Weverything -Wno-padded -Wno-switch-enum -Wno-format-nonliteral \
250    -Wno-cast-align -Wno-unreachable-code-return -Wno-missing-noreturn \
251    -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \
252    -pedantic -std=c99" ./configure.sh'
253alias bcunset='unset BC_LINE_LENGTH && unset BC_ENV_ARGS'
254```
255
256`makej` runs `make` with all of my cores.
257
258`mcmake` runs `make clean` before running `make`. It will take a target on the
259command-line.
260
261`mcmakej` is a combination of `makej` and `mcmake`.
262
263`bcdebug` configures `bc` for a full debug build, including `BC_DEBUG_CODE` (see
264[Debugging][134] below).
265
266`bcconfig` configures `bc` with Clang (Clang is my personal default compiler)
267using full warnings, with a few really loud and useless warnings turned off.
268
269`bcnoassert` configures `bc` to not have asserts built in.
270
271`bcdebugnoassert` is like `bcnoassert`, except it also configures `bc` for debug
272mode.
273
274`bcunset` unsets my personal `bc` environment variables, which are set to:
275
276```sh
277export BC_ENV_ARGS="-l $HOME/.bcrc"
278export BC_LINE_LENGTH="74"
279```
280
281Unsetting these environment variables are necessary for running
282[`scripts/release.sh`][83] because otherwise, it will error when attempting to
283run `bc -s` on my `$HOME/.bcrc`.
284
285Speaking of which, the contents of that file are:
286
287```bc
288define void print_time_unit(t){
289	if(t<10)print "0"
290	if(t<1&&t)print "0"
291	print t,":"
292}
293define void sec2time(t){
294	auto s,m,h,d,r
295	r=scale
296	scale=0
297	t=abs(t)
298	s=t%60
299	t-=s
300	m=t/60%60
301	t-=m
302	h=t/3600%24
303	t-=h
304	d=t/86400
305	if(d)print_time_unit(d)
306	if(h)print_time_unit(h)
307	print_time_unit(m)
308	if(s<10)print "0"
309	if(s<1&&s)print "0"
310	s
311	scale=r
312}
313define minutes(secs){
314	return secs/60;
315}
316define hours(secs){
317	return secs/3600;
318}
319define days(secs){
320	return secs/3600/24;
321}
322define years(secs){
323	return secs/3600/24/365.25;
324}
325define fbrand(b,p){
326	auto l,s,t
327	b=abs(b)$
328	if(b<2)b=2
329	s=scale
330	t=b^abs(p)$
331	l=ceil(l2(t),0)
332	if(l>scale)scale=l
333	t=irand(t)/t
334	scale=s
335	return t
336}
337define ifbrand(i,b,p){return irand(abs(i)$)+fbrand(b,p)}
338```
339
340This allows me to use `bc` as part of my `bash` prompt.
341
342## Code Style
343
344The code style for `bc` is...weird, and that comes from historical accident.
345
346In [History][23], I mentioned how I got my `bc` in [toybox][16]. Well, in order
347to do that, my `bc` originally had toybox style. Eventually, I changed to using
348tabs, and assuming they were 4 spaces wide, but other than that, I basically
349kept the same style, with some exceptions that are more or less dependent on my
350taste.
351
352The code style is as follows:
353
354* Tabs are 4 spaces.
355* Tabs are used at the beginning of lines for indent.
356* Spaces are used for alignment.
357* Lines are limited to 80 characters, period.
358* Pointer asterisk (`*`) goes with the variable (on the right), not the type,
359  unless it is for a pointer type returned from a function.
360* The opening brace is put on the same line as the header for the function,
361  loop, or `if` statement.
362* Unless the header is more than one line, in which case the opening brace is
363  put on its own line.
364* If the opening brace is put on its own line, there is no blank line after it.
365* If the opening brace is *not* put on its own line, there *is* a blank line
366  after it, *unless* the block is only one or two lines long.
367* Code lines are grouped into what I call "paragraphs." Basically, lines that
368  seem like they should go together are grouped together. This one comes down
369  to judgment.
370* Bodies of `if` statements, `else` statements, and loops that are one line
371  long are put on the same line as the statement, unless the header is more than
372  one line long, and/or, the header and body cannot fit into 80 characters with
373  a space inbetween them.
374* If single-line bodies are on a separate line from their headers, and the
375  headers are only a single line, then no braces are used.
376* However, braces are *always* used if they contain another `if` statement or
377  loop.
378* Loops with empty bodies are ended with a semicolon.
379* Expressions that return a boolean value are surrounded by paretheses.
380* Macro backslashes are aligned as far to the left as possible.
381* Binary operators have spaces on both sides.
382* If a line with binary operators overflows 80 characters, a newline is inserted
383  *after* binary operators.
384* Function modifiers and return types are on the same line as the function name.
385* With one exception, `goto`'s are only used to jump to the end of a function
386  for cleanup.
387* All structs, enums, and unions are `typedef`'ed.
388* All constant data is in one file: [`src/data.c`][131], but the corresponding
389  `extern` declarations are in the appropriate header file.
390* All local variables are declared at the beginning of the scope where they
391  appear. They may be initialized at that point, if it does not invoke UB or
392  otherwise cause bugs.
393* All precondition `assert()`'s (see [Asserts][135]) come *after* local variable
394  declarations.
395* Besides short `if` statements and loops, there should *never* be more than one
396  statement per line.
397
398### ClangFormat
399
400I attempted three times to use [ClangFormat][24] to impose a standard,
401machine-useful style on `bc`. All three failed. Otherwise, the style in this
402repo would be more consistent.
403
404## Repo Structure
405
406Functions are documented with Doxygen-style doc comments. Functions that appear
407in headers are documented in the headers, while static functions are documented
408where they are defined.
409
410### `bcl.sln`
411
412A Visual Studio solution file for [`bcl`][156]. This, along with
413[`bcl.vcxproj`][63] and [`bcl.vcxproj.filters`][64] is what makes it possible to
414build [`bcl`][156] on Windows.
415
416### `bcl.vcxproj`
417
418A Visual Studio project file for [`bcl`][156]. This, along with [`bcl.sln`][65]
419and [`bcl.vcxproj.filters`][64] is what makes it possible to build [`bcl`][156]
420on Windows.
421
422### `bcl.vcxproj.filters`
423
424A Visual Studio filters file for [`bcl`][156]. This, along with [`bcl.sln`][65]
425and [`bcl.vcxproj`][63] is what makes it possible to build [`bcl`][156] on
426Windows.
427
428### `bc.sln`
429
430A Visual Studio solution file for `bc`. This, along with [`bc.vcxproj`][66]
431and [`bc.vcxproj.filters`][67] is what makes it possible to build `bc` on
432Windows.
433
434### `bc.vcxproj`
435
436A Visual Studio project file for `bc`. This, along with [`bc.sln`][68] and
437[`bc.vcxproj.filters`][67] is what makes it possible to build `bc` on Windows.
438
439### `bc.vcxproj.filters`
440
441A Visual Studio filters file for `bc`. This, along with [`bc.sln`][68] and
442[`bc.vcxproj`][66] is what makes it possible to build `bc` on Windows.
443
444### `configure`
445
446A symlink to [`configure.sh`][69].
447
448### `configure.sh`
449
450This is the script to configure `bc` and [`bcl`][156] for building.
451
452This `bc` has a custom build system. The reason for this is because of
453[*portability*][136].
454
455If `bc` used an outside build system, that build system would be an external
456dependency. Thus, I had to write a build system for `bc` that used nothing but
457C99 and POSIX utilities.
458
459One of those utilities is POSIX `sh`, which technically implements a
460Turing-complete programming language. It's a terrible one, but it works.
461
462A user that wants to build `bc` on a POSIX system (not Windows) first runs
463`configure.sh` with the options he wants. `configure.sh` uses those options and
464the `Makefile` template ([`Makefile.in`][70]) to generate an actual valid
465`Makefile`. Then `make` can do the rest.
466
467For more information about the build process, see the [Build System][142]
468section and the [build manual][14].
469
470For more information about shell scripts, see [POSIX Shell Scripts][76].
471
472`configure.sh` does the following:
473
4741.	It processes command-line arguments and figure out what the user wants to
475	build.
4762.	It reads in [`Makefile.in`][70].
4773.	One-by-one, it replaces placeholders (in [`Makefile.in`][70]) of the form
478	`%%<placeholder_name>%%` based on the [build type][81].
4794.	It appends a list of file targets based on the [build type][81].
4805.	It appends the correct test targets.
4816.	It copies the correct manpage and markdown manual for `bc` and `dc` into a
482	location from which they can be copied for install.
4837.	It does a `make clean` to reset the build state.
484
485### `.gitattributes`
486
487A `.gitattributes` file. This is needed to preserve the `crlf` line endings in
488the Visual Studio files.
489
490### `.gitignore`
491
492The `.gitignore`
493
494### `LICENSE.md`
495
496This is the `LICENSE` file, including the licenses of various software that I
497have borrowed.
498
499### `Makefile.in`
500
501This is the `Makefile` template for [`configure.sh`][69] to use for generating a
502`Makefile`.
503
504For more information, see [`configure.sh`][69], the [Build System][142] section,
505and the [build manual][14].
506
507Because of [portability][136], the generated `Makefile.in` should be a pure
508[POSIX `make`][74]-compatible `Makefile` (minus the placeholders). Here are a
509few snares for the unwary programmer in this file:
510
5111.	No extensions allowed, including and especially GNU extensions.
5122.	If new headers are added, they must also be added to `Makefile.in`.
5133.	Don't delete the `.POSIX:` empty target at the top; that's what tells `make`
514	implementations that pure [POSIX `make`][74] is needed.
515
516In particular, there is no way to set up variables other than the `=` operator.
517There are no conditionals, so all of the conditional stuff must be in
518[`configure.sh`][69]. This is, in fact, why [`configure.sh`][69] exists in the
519first place: [POSIX `make`][74] is barebones and only does a build with no
520configuration.
521
522### `NEWS.md`
523
524A running changelog with an entry for each version. This should be updated at
525the same time that [`include/version.h`][75] is.
526
527### `NOTICE.md`
528
529The `NOTICE` file with proper attributions.
530
531### `README.md`
532
533The `README`. Read it.
534
535### `benchmarks/`
536
537The folder containing files to generate benchmarks.
538
539Each of these files was made, at one time or another, to benchmark some
540experimental feature, so if it seems there is no rhyme or reason to these
541benchmarks, it is because there is none, besides historical accident.
542
543#### `bc/`
544
545The folder containing `bc` scripts to generate `bc` benchmarks.
546
547##### `add.bc`
548
549The file to generate the benchmark to benchmark addition in `bc`.
550
551##### `arrays_and_constants.bc`
552
553The file to generate the benchmark to benchmark `bc` using lots of array names
554and constants.
555
556##### `arrays.bc`
557
558The file to generate the benchmark to benchmark `bc` using lots of array names.
559
560##### `constants.bc`
561
562The file to generate the benchmark to benchmark `bc` using lots of constants.
563
564##### `divide.bc`
565
566The file to generate the benchmark to benchmark division in `bc`.
567
568##### `functions.bc`
569
570The file to generate the benchmark to benchmark `bc` using lots of functions.
571
572##### `irand_long.bc`
573
574The file to generate the benchmark to benchmark `bc` using lots of calls to
575`irand()` with large bounds.
576
577##### `irand_short.bc`
578
579The file to generate the benchmark to benchmark `bc` using lots of calls to
580`irand()` with small bounds.
581
582##### `lib.bc`
583
584The file to generate the benchmark to benchmark `bc` using lots of calls to
585heavy functions in `lib.bc`.
586
587##### `multiply.bc`
588
589The file to generate the benchmark to benchmark multiplication in `bc`.
590
591##### `postfix_incdec.bc`
592
593The file to generate the benchmark to benchmark `bc` using postfix increment and
594decrement operators.
595
596##### `power.bc`
597
598The file to generate the benchmark to benchmark power (exponentiation) in `bc`.
599
600##### `subtract.bc`
601
602The file to generate the benchmark to benchmark subtraction in `bc`.
603
604##### `strings.bc`
605
606The file to generate the benchmark to benchmark `bc` using lots of strings.
607
608#### `dc/`
609
610The folder containing `dc` scripts to generate `dc` benchmarks.
611
612##### `modexp.dc`
613
614The file to generate the benchmark to benchmark modular exponentiation in `dc`.
615
616### `gen/`
617
618A folder containing the files necessary to generate C strings that will be
619embedded in the executable.
620
621All of the files in this folder have license headers, but the program and script
622that can generate strings from them include code to strip the license header out
623before strings are generated.
624
625#### `bc_help.txt`
626
627A text file containing the text displayed for `bc -h` or `bc --help`.
628
629This text just contains the command-line options and a short summary of the
630differences from GNU and BSD `bc`'s. It also directs users to the manpage.
631
632The reason for this is because otherwise, the help would be far too long to be
633useful.
634
635**Warning**: The text has some `printf()` format specifiers. You need to make
636sure the format specifiers match the arguments given to `bc_file_printf()`.
637
638#### `dc_help.txt`
639
640A text file containing the text displayed for `dc -h` or `dc --help`.
641
642This text just contains the command-line options and a short summary of the
643differences from GNU and BSD `dc`'s. It also directs users to the manpage.
644
645The reason for this is because otherwise, the help would be far too long to be
646useful.
647
648**Warning**: The text has some `printf()` format specifiers. You need to make
649sure the format specifiers match the arguments given to `bc_file_printf()`.
650
651#### `lib.bc`
652
653A `bc` script containing the [standard math library][5] required by POSIX. See
654the [POSIX standard][2] for what is required.
655
656This file does not have any extraneous whitespace, except for tabs at the
657beginning of lines. That is because this data goes directly into the binary,
658and whitespace is extra bytes in the binary. Thus, not having any extra
659whitespace shrinks the resulting binary.
660
661However, tabs at the beginning of lines are kept for two reasons:
662
6631.	Readability. (This file is still code.)
6642.	The program and script that generate strings from this file can remove
665	tabs at the beginning of lines.
666
667For more details about the algorithms used, see the [algorithms manual][25].
668
669However, there are a few snares for unwary programmers.
670
671First, all constants must be one digit. This is because otherwise, multi-digit
672constants could be interpreted wrongly if the user uses a different `ibase`.
673This does not happen with single-digit numbers because they are guaranteed to be
674interpreted what number they would be if the `ibase` was as high as possible.
675
676This is why `A` is used in the library instead of `10`, and things like `2*9*A`
677for `180` in [`lib2.bc`][26].
678
679As an alternative, you can set `ibase` in the function, but if you do, make sure
680to set it with a single-digit number and beware the snare below...
681
682Second, `scale`, `ibase`, and `obase` must be safely restored before returning
683from any function in the library. This is because without the `-g` option,
684functions are allowed to change any of the globals.
685
686Third, all local variables in a function must be declared in an `auto` statement
687before doing anything else. This includes arrays. However, function parameters
688are considered predeclared.
689
690Fourth, and this is only a snare for `lib.bc`, not [`lib2.bc`][26], the code
691must not use *any* extensions. It has to work when users use the `-s` or `-w`
692flags.
693
694#### `lib2.bc`
695
696A `bc` script containing the [extended math library][7].
697
698Like [`lib.bc`][8], and for the same reasons, this file should have no
699extraneous whitespace, except for tabs at the beginning of lines.
700
701For more details about the algorithms used, see the [algorithms manual][25].
702
703Also, be sure to check [`lib.bc`][8] for the snares that can trip up unwary
704programmers when writing code for `lib2.bc`.
705
706#### `strgen.c`
707
708Code for the program to generate C strings from text files. This is the original
709program, although [`strgen.sh`][9] was added later.
710
711The reason I used C here is because even though I knew `sh` would be available
712(it must be available to run `configure.sh`), I didn't know how to do what I
713needed to do with POSIX utilities and `sh`.
714
715Later, [`strgen.sh`][9] was contributed by Stefan Eßer of FreeBSD, showing that
716it *could* be done with `sh` and POSIX utilities.
717
718However, `strgen.c` exists *still* exists because the versions generated by
719[`strgen.sh`][9] may technically hit an environmental limit. (See the [draft C99
720standard][12], page 21.) This is because [`strgen.sh`][9] generates string
721literals, and in C99, string literals can be limited to 4095 characters, and
722`gen/lib2.bc` is above that.
723
724Fortunately, the limit for "objects," which include `char` arrays, is much
725bigger: 65535 bytes, so that's what `strgen.c` generates.
726
727However, the existence of `strgen.c` does come with a cost: the build needs C99
728compiler that targets the host machine. For more information, see the ["Cross
729Compiling" section][13] of the [build manual][14].
730
731Read the comments in `strgen.c` for more detail about it, the arguments it
732takes, and how it works.
733
734#### `strgen.sh`
735
736An `sh` script that will generate C strings that uses only POSIX utilities. This
737exists for those situations where a host C99 compiler is not available, and the
738environment limits mentioned above in [`strgen.c`][15] don't matter.
739
740`strgen.sh` takes the same arguments as [`strgen.c`][15], and the arguments mean
741the exact same things, so see the comments in [`strgen.c`][15] for more detail
742about that, and see the comments in `strgen.sh` for more details about it and
743how it works.
744
745For more information about shell scripts, see [POSIX Shell Scripts][76].
746
747### `include/`
748
749A folder containing the headers.
750
751The headers are not included among the source code because I like it better that
752way. Also there were folders within `src/` at one point, and I did not want to
753see `#include "../some_header.h"` or things like that.
754
755So all headers are here, even though only one ([`bcl.h`][30]) is meant for end
756users (to be installed in `INCLUDEDIR`).
757
758#### `args.h`
759
760This file is the API for processing command-line arguments.
761
762#### `bc.h`
763
764This header is the API for `bc`-only items. This includes the `bc_main()`
765function and the `bc`-specific lexing and parsing items.
766
767The `bc` parser is perhaps the most sensitive part of the entire codebase. See
768the documentation in `bc.h` for more information.
769
770The code associated with this header is in [`src/bc.c`][40],
771[`src/bc_lex.c`][41], and [`src/bc_parse.c`][42].
772
773#### `bcl.h`
774
775This header is the API for the [`bcl`][156] library.
776
777This header is meant for distribution to end users and contains the API that end
778users of [`bcl`][156] can use in their own software.
779
780This header, because it's the public header, is also the root header. That means
781that it has platform-specific fixes for Windows. (If the fixes were not in this
782header, the build would fail on Windows.)
783
784The code associated with this header is in [`src/library.c`][43].
785
786#### `dc.h`
787
788This header is the API for `dc`-only items. This includes the `dc_main()`
789function and the `dc`-specific lexing and parsing items.
790
791The code associated with this header is in [`src/dc.c`][44],
792[`src/dc_lex.c`][45], and [`src/dc_parse.c`][46].
793
794#### `file.h`
795
796This header is for `bc`'s internal buffered I/O API.
797
798For more information about `bc`'s error handling and custom buffered I/O, see
799[Error Handling][97] and [Custom I/O][114], along with [`status.h`][176] and the
800notes about version [`3.0.0`][32] in the [`NEWS`][32].
801
802The code associated with this header is in [`src/file.c`][47].
803
804#### `history.h`
805
806This header is for `bc`'s implementation of command-line editing/history, which
807is based on a [UTF-8-aware fork][28] of [`linenoise`][29].
808
809For more information, see the [Command-Line History][189] section.
810
811The code associated with this header is in [`src/history.c`][48].
812
813#### `lang.h`
814
815This header defines the data structures and bytecode used for actual execution
816of `bc` and `dc` code.
817
818Yes, it's misnamed; that's an accident of history where the first things I put
819into it all seemed related to the `bc` language.
820
821The code associated with this header is in [`src/lang.c`][49].
822
823#### `lex.h`
824
825This header defines the common items that both programs need for lexing.
826
827The code associated with this header is in [`src/lex.c`][50],
828[`src/bc_lex.c`][41], and [`src/dc_lex.c`][45].
829
830#### `library.h`
831
832This header defines the things needed for [`bcl`][156] that users should *not*
833have access to. In other words, [`bcl.h`][30] is the *public* header for the
834library, and this header is the *private* header for the library.
835
836The code associated with this header is in [`src/library.c`][43].
837
838#### `num.h`
839
840This header is the API for numbers and math.
841
842The code associated with this header is in [`src/num.c`][39].
843
844#### `opt.h`
845
846This header is the API for parsing command-line arguments.
847
848It's different from [`args.h`][31] in that [`args.h`][31] is for the main code
849to process the command-line arguments into global data *after* they have already
850been parsed by `opt.h` into proper tokens. In other words, `opt.h` actually
851parses the command-line arguments, and [`args.h`][31] turns that parsed data
852into flags (bits), strings, and expressions that will be used later.
853
854Why are they separate? Because originally, `bc` used `getopt_long()` for
855parsing, so [`args.h`][31] was the only one that existed. After it was
856discovered that `getopt_long()` has different behavior on different platforms, I
857adapted a [public-domain option parsing library][34] to do the job instead. And
858in doing so, I gave it its own header.
859
860They could probably be combined, but I don't really care enough at this point.
861
862The code associated with this header is in [`src/opt.c`][51].
863
864#### `parse.h`
865
866This header defines the common items that both programs need for parsing.
867
868Note that the parsers don't produce abstract syntax trees (AST's) or any
869intermediate representations. They produce bytecode directly. In other words,
870they don't have special data structures except what they need to do their job.
871
872The code associated with this header is in [`src/parse.c`][50],
873[`src/bc_lex.c`][42], and [`src/dc_lex.c`][46].
874
875#### `program.h`
876
877This header defines the items needed to manage the data structures in
878[`lang.h`][38] as well as any helper functions needed to generate bytecode or
879execute it.
880
881The code associated with this header is in [`src/program.c`][53].
882
883#### `rand.h`
884
885This header defines the API for the [pseudo-random number generator
886(PRNG)][179].
887
888The PRNG only generates fixed-size integers. The magic of generating random
889numbers of arbitrary size is actually given to the code that does math
890([`src/num.c`][39]).
891
892The code associated with this header is in [`src/rand.c`][54].
893
894#### `read.h`
895
896This header defines the API for reading from files and `stdin`.
897
898Thus, [`file.h`][55] is really for buffered *output*, while this file is for
899*input*. There is no buffering needed for `bc`'s inputs.
900
901The code associated with this header is in [`src/read.c`][56].
902
903#### `status.h`
904
905This header has several things:
906
907* A list of possible errors that internal `bc` code can use.
908* Compiler-specific fixes.
909* Platform-specific fixes.
910* Macros for `bc`'s [error handling][97].
911
912There is no code associated with this header.
913
914#### `vector.h`
915
916This header defines the API for the vectors (resizable arrays) that are used for
917data structures.
918
919Vectors are what do the heavy lifting in almost all of `bc`'s data structures.
920Even the maps of identifiers and arrays use vectors.
921
922#### `version.h`
923
924This header defines the version of `bc`.
925
926There is no code associated with this header.
927
928#### `vm.h`
929
930This header defines the API for setting up and running `bc` and `dc`.
931
932It is so named because I think of it as the "virtual machine" of `bc`, though
933that is probably not true as [`program.h`][57] is probably the "virtual machine"
934API. Thus, the name is more historical accident.
935
936The code associated with this header is in [`src/vm.c`][58].
937
938### `locales/`
939
940This folder contains a bunch of `.msg` files and soft links to the real `.msg`
941files. This is how locale support is implemented in `bc`.
942
943The files are in the format required by the [`gencat`][59] POSIX utility. They
944all have the same messages, in the same order, with the same numbering, under
945the same groups. This is because the locale system expects those messages in
946that order.
947
948The softlinks exist because for many locales, they would contain the exact same
949information. To prevent duplication, they are simply linked to a master copy.
950
951The naming format for all files is:
952
953```
954<language_code>_<country_code>.<encoding>.msg
955```
956
957This naming format must be followed for all locale files.
958
959### `manuals/`
960
961This folder contains the documentation for `bc`, `dc`, and [`bcl`][156], along
962with a few other manuals.
963
964#### `algorithms.md`
965
966This file explains the mathematical algorithms that are used.
967
968The hope is that this file will guide people in understanding how the math code
969works.
970
971#### `bc.1.md.in`
972
973This file is a template for the markdown version of the `bc` manual and
974manpages.
975
976For more information about how the manpages and markdown manuals are generated,
977and for why, see [`scripts/manpage.sh`][60] and [Manuals][86].
978
979#### `bcl.3`
980
981This is the manpage for the [`bcl`][156] library. It is generated from
982[`bcl.3.md`][61] using [`scripts/manpage.sh`][60].
983
984For the reason why I check generated data into the repo, see
985[`scripts/manpage.sh`][60] and [Manuals][86].
986
987#### `bcl.3.md`
988
989This is the markdown manual for the [`bcl`][156] library. It is the source for the
990generated [`bcl.3`][62] file.
991
992#### `benchmarks.md`
993
994This is a document that compares this `bc` to GNU `bc` in various benchmarks. It
995was last updated when version [`3.0.0`][32] was released.
996
997It has very little documentation value, other than showing what compiler options
998are useful for performance.
999
1000#### `build.md`
1001
1002This is the [build manual][14].
1003
1004This `bc` has a custom build system. The reason for this is because of
1005[*portability*][136].
1006
1007If `bc` used an outside build system, that build system would be an external
1008dependency. Thus, I had to write a build system for `bc` that used nothing but
1009C99 and POSIX utilities, including barebones [POSIX `make`][74].
1010
1011for more information about the build system, see the [build system][142]
1012section, the [build manual][14], [`configure.sh`][69], and [`Makefile.in`][70].
1013
1014#### `dc.1.md.in`
1015
1016This file is a template for the markdown version of the `dc` manual and
1017manpages.
1018
1019For more information about how the manpages and markdown manuals are generated,
1020and for why, see [`scripts/manpage.sh`][60] and [Manuals][86].
1021
1022#### `development.md`
1023
1024The file you are reading right now.
1025
1026#### `header_bcl.txt`
1027
1028Used by [`scripts/manpage.sh`][60] to give the [`bcl.3`][62] manpage a proper
1029header.
1030
1031For more information about generating manuals, see [`scripts/manpage.sh`][60]
1032and [Manuals][86].
1033
1034#### `header_bc.txt`
1035
1036Used by [`scripts/manpage.sh`][60] to give the [generated `bc` manpages][79] a
1037proper header.
1038
1039For more information about generating manuals, see [`scripts/manpage.sh`][60]
1040and [Manuals][86].
1041
1042#### `header_dc.txt`
1043
1044Used by [`scripts/manpage.sh`][60] to give the [generated `dc` manpages][80] a
1045proper header.
1046
1047For more information about generating manuals, see [`scripts/manpage.sh`][60]
1048and [Manuals][86].
1049
1050#### `header.txt`
1051
1052Used by [`scripts/manpage.sh`][60] to give all generated manpages a license
1053header.
1054
1055For more information about generating manuals, see [`scripts/manpage.sh`][60]
1056and [Manuals][86].
1057
1058#### `release.md`
1059
1060A checklist that I try to somewhat follow when making a release.
1061
1062#### `bc/`
1063
1064A folder containing the `bc` manuals.
1065
1066Each `bc` manual corresponds to a [build type][81]. See that link for more
1067details.
1068
1069For each manual, there are two copies: the markdown version generated from the
1070template, and the manpage generated from the markdown version.
1071
1072#### `dc/`
1073
1074A folder containing the `dc` manuals.
1075
1076Each `dc` manual corresponds to a [build type][81]. See that link for more
1077details.
1078
1079For each manual, there are two copies: the markdown version generated from the
1080template, and the manpage generated from the markdown version.
1081
1082### `scripts/`
1083
1084This folder contains helper scripts. Most of them are written in pure [POSIX
1085`sh`][72], but one ([`karatsuba.py`][78]) is written in Python 3.
1086
1087For more information about the shell scripts, see [POSIX Shell Scripts][76].
1088
1089#### `afl.py`
1090
1091This script is meant to be used as part of the fuzzing workflow.
1092
1093It does one of two things: checks for valid crashes, or runs `bc` and or `dc`
1094under all of the paths found by [AFL++][125].
1095
1096See [Fuzzing][82] for more information about fuzzing, including this script.
1097
1098#### `alloc.sh`
1099
1100This script is a quick and dirty script to test whether or not the garbage
1101collection mechanism of the [`BcNum` caching][96] works. It has been little-used
1102because it tests something that is not important to correctness.
1103
1104#### `benchmark.sh`
1105
1106A script making it easy to run benchmarks and to run the executable produced by
1107[`ministat.c`][223] on them.
1108
1109For more information, see the [Benchmarks][144] section.
1110
1111#### `bitfuncgen.c`
1112
1113A source file for an executable to generate tests for `bc`'s bitwise functions
1114in [`gen/lib2.bc`][26]. The executable is `scripts/bitfuncgen`, and it is built
1115with `make bitfuncgen`. It produces the test on `stdout` and the expected
1116results on `stderr`. This means that to generat tests, use the following
1117invokation:
1118
1119```
1120scripts/bitfuncgen > tests/bc/bitfuncs.txt 2> tests/bc/bitfuncs_results.txt
1121```
1122
1123It calls `abort()` if it runs into an error.
1124
1125#### `exec-install.sh`
1126
1127This script is the magic behind making sure `dc` is installed properly if it's
1128a symlink to `bc`. It checks to see if it is a link, and if so, it just creates
1129a new symlink in the install directory. Of course, it also installs `bc` itself,
1130or `dc` when it's alone.
1131
1132#### `functions.sh`
1133
1134This file is a bunch of common functions for most of the POSIX shell scripts. It
1135is not supposed to be run; instead, it is *sourced* by other POSIX shell
1136scripts, like so:
1137
1138```
1139. "$scriptdir/functions.sh"
1140```
1141
1142or the equivalent, depending on where the sourcing script is.
1143
1144For more information about the shell scripts, see [POSIX Shell Scripts][76].
1145
1146#### `fuzz_prep.sh`
1147
1148Fuzzing is a regular activity when I am preparing for a release.
1149
1150This script handles all the options and such for building a fuzzable binary.
1151Instead of having to remember a bunch of options, I just put them in this script
1152and run the script when I want to fuzz.
1153
1154For more information about fuzzing, see [Fuzzing][82].
1155
1156#### `karatsuba.py`
1157
1158This script has at least one of two major differences from most of the other
1159scripts:
1160
1161* It's written in Python 3.
1162* It's meant for software packagers.
1163
1164For example, [`scripts/afl.py`][94] and [`scripts/randmath.py`][95] are both in
1165Python 3, but they are not meant for the end user or software packagers and are
1166not included in source distributions. But this script is.
1167
1168This script breaks my rule of only POSIX utilities necessary for package
1169maintainers, but there's a very good reason for that: it's only meant to be run
1170*once* when the package is created for the first time, and maybe not even then.
1171
1172You see, this script does two things: it tests the Karatsuba implementation at
1173various settings for `KARATSUBA_LEN`, and it figures out what the optimal
1174`KARATSUBA_LEN` is for the machine that it is running on.
1175
1176Package maintainers can use this script, when creating a package for this `bc`,
1177to figure out what is optimal for their users. Then they don't have to run it
1178ever again. So this script only has to run on the packagers machine.
1179
1180I tried to write the script in `sh`, by the way, and I finally accepted the
1181tradeoff of using Python 3 when it became too hard.
1182
1183However, I also mentioned that it's for testing Karatsuba with various settings
1184of `KARATSUBA_LEN`. Package maintainers will want to run the [test suite][124],
1185right?
1186
1187Yes, but this script is not part of the [test suite][124]; it's used for testing
1188in the [`scripts/release.sh`][83] script, which is maintainer use only.
1189
1190However, there is one snare with `karatsuba.py`: I didn't want the user to have
1191to install any Python libraries to run it. Keep that in mind if you change it.
1192
1193#### `link.sh`
1194
1195This script is the magic behind making `dc` a symlink of `bc` when both
1196calculators are built.
1197
1198#### `locale_install.sh`
1199
1200This script does what its name says: it installs locales.
1201
1202It turns out that this is complicated.
1203
1204There is a magic environment variable, `$NLSPATH`, that tells you how and where
1205you are supposed to install locales.
1206
1207Yes, *how*. And where.
1208
1209But now is not the place to rant about `$NLSPATH`. For more information on
1210locales and `$NLSPATH`, see [Locales][85].
1211
1212#### `locale_uninstall.sh`
1213
1214This script does what its name says: it uninstalls locales.
1215
1216This is far less complicated than installing locales. I basically generate a
1217wildcard path and then list all paths that fit that wildcard. Then I delete each
1218one of those paths. Easy.
1219
1220For more information on locales, see [Locales][85].
1221
1222#### `manpage.sh`
1223
1224This script is the one that generates markdown manuals from a template and a
1225manpage from a markdown manual.
1226
1227For more information about generating manuals, see [Manuals][86].
1228
1229#### `ministat.c`
1230
1231This is a file copied [from FreeBSD][221] that calculates the standard
1232statistical numbers, such as mean, average, and median, based on numbers
1233obtained from a file.
1234
1235For more information, see the [FreeBSD ministat(1) manpage][222].
1236
1237This file allows `bc` to build the `scripts/ministat` executable using the
1238command `make ministat`, and this executable helps programmers evaluate the
1239results of [benchmarks][144] more accurately.
1240
1241#### `package.sh`
1242
1243This script is what helps `bc` maintainers cut a release. It does the following:
1244
12451.	Creates the appropriate `git` tag.
12462.	Pushes the `git` tag.
12473.	Copies the repo to a temp directory.
12484.	Removes files that should not be included in source distributions.
12495.	Creates the tarballs.
12506.	Signs the tarballs.
12517.	Zips and signs the Windows executables if they exist.
12528.	Calculates and outputs SHA512 and SHA256 sums for all of the files,
1253	including the signatures.
1254
1255This script is for `bc` maintainers to use when cutting a release. It is not
1256meant for outside use. This means that some non-POSIX utilities can be used,
1257such as `git` and `gpg`.
1258
1259In addition, before using this script, it expects that the folders that Windows
1260generated when building `bc`, `dc`, and [`bcl`][156], are in the parent
1261directory of the repo, exactly as Windows generated them. If they are not there,
1262then it will not zip and sign, nor calculate sums of, the Windows executables.
1263
1264Because this script creates a tag and pushes it, it should *only* be run *ONCE*
1265per release.
1266
1267#### `radamsa.sh`
1268
1269A script to test `bc`'s command-line expression parsing code, which, while
1270simple, strives to handle as much as possible.
1271
1272What this script does is it uses the test cases in [`radamsa.txt`][98] an input
1273to the [Radamsa fuzzer][99].
1274
1275For more information, see the [Radamsa][128] section.
1276
1277#### `radamsa.txt`
1278
1279Initial test cases for the [`radamsa.sh`][100] script.
1280
1281#### `randmath.py`
1282
1283This script generates random math problems and checks that `bc`'s and `dc`'s
1284output matches the GNU `bc` and `dc`. (For this reason, it is necessary to have
1285GNU `bc` and `dc` installed before using this script.)
1286
1287One snare: be sure that this script is using the GNU `bc` and `dc`, not a
1288previously-installed version of this `bc` and `dc`.
1289
1290If you want to check for memory issues or failing asserts, you can build the
1291`bc` using `./scripts/fuzz_prep.sh -a`, and then run it under this script. Any
1292errors or crashes should be caught by the script and given to the user as part
1293of the "checklist" (see below).
1294
1295The basic idea behind this script is that it generates as many math problems as
1296it can, biasing towards situations that may be likely to have bugs, and testing
1297each math problem against GNU `bc` or `dc`.
1298
1299If GNU `bc` or `dc` fails, it just continues. If this `bc` or `dc` fails, it
1300stores that problem. If the output mismatches, it also stores the problem.
1301
1302Then, when the user sends a `SIGINT`, the script stops testing and goes into
1303report mode. One-by-one, it will go through the "checklist," the list of failed
1304problems, and present each problem to the user, as well as whether this `bc` or
1305`dc` crashed, and its output versus GNU. Then the user can decide to add them as
1306test cases, which it does automatically to the appropriate test file.
1307
1308#### `release_settings.txt`
1309
1310A text file of settings combinations that [`release.sh`][83] uses to ensure that
1311`bc` and `dc` build and work with various default settings. [`release.sh`][83]
1312simply reads it line by line and uses each line for one build.
1313
1314#### `release.sh`
1315
1316This script is for `bc` maintainers only. It runs `bc`, `dc`, and [`bcl`][156]
1317through a gauntlet that is mostly meant to be used in preparation for a release.
1318
1319It does the following:
1320
13211.	Builds every [build type][81], with every setting combo in
1322	[`release_settings.txt`][93] with both calculators, `bc` alone, and `dc`
1323	alone.
13242.	Builds every [build type][81], with every setting combo in
1325	[`release_settings.txt`][93] with both calculators, `bc` alone, and `dc`
1326	alone for 32-bit.
13273.	Does #1 and #2 for Debug, Release, Release with Debug Info, and Min Size
1328	Release builds.
13294.	Runs the [test suite][124] on every build, if desired.
13305.	Runs the [test suite][124] under [ASan, UBSan, and MSan][21] for every build
1331	type/setting combo.
13326.	Runs [`scripts/karatsuba.py`][78] in test mode.
13337.	Runs the [test suite][124] for both calculators, `bc` alone, and `dc` alone
1334	under [valgrind][20] and errors if there are any memory bugs or memory
1335	leaks.
1336
1337#### `safe-install.sh`
1338
1339A script copied from [musl][101] to atomically install files.
1340
1341#### `test_settings.sh`
1342
1343A quick and dirty script to help automate rebuilding while manually testing the
1344various default settings.
1345
1346This script uses [`test_settings.txt`][103] to generate the various settings
1347combos.
1348
1349For more information about settings, see [Settings][102] in the [build
1350manual][14].
1351
1352#### `test_settings.txt`
1353
1354A list of the various settings combos to be used by [`test_settings.sh`][104].
1355
1356### `src/`
1357
1358This folder is, obviously, where the actual heart and soul of `bc`, the source
1359code, is.
1360
1361All of the source files are in one folder; this simplifies the build system
1362immensely.
1363
1364There are separate files for `bc` and `dc` specific code ([`bc.c`][40],
1365[`bc_lex.c`][41], [`bc_parse.c`][42], [`dc.c`][44], [`dc_lex.c`][45], and
1366[`dc_parse.c`][46]) where possible because it is cleaner to exclude an entire
1367source file from a build than to have `#if`/`#endif` preprocessor guards.
1368
1369That said, it was easier in many cases to use preprocessor macros where both
1370calculators used much of the same code and data structures, so there is a
1371liberal sprinkling of them through the code.
1372
1373#### `args.c`
1374
1375Code for processing command-line arguments.
1376
1377The header for this file is [`include/args.h`][31].
1378
1379#### `bc.c`
1380
1381The code for the `bc` main function `bc_main()`.
1382
1383The header for this file is [`include/bc.h`][106].
1384
1385#### `bc_lex.c`
1386
1387The code for lexing that only `bc` needs.
1388
1389The headers for this file are [`include/lex.h`][180] and [`include/bc.h`][106].
1390
1391#### `bc_parse.c`
1392
1393The code for parsing that only `bc` needs. This code is the most complex and
1394subtle in the entire codebase.
1395
1396The headers for this file are [`include/parse.h`][181] and
1397[`include/bc.h`][106].
1398
1399#### `data.c`
1400
1401Due to [historical accident][23] because of a desire to get my `bc` into
1402[toybox][16], all of the constant data that `bc` needs is all in one file. This
1403is that file.
1404
1405There is no code in this file, but a lot of the const data has a heavy influence
1406on code, including the order of data in arrays because that order has to
1407correspond to the order of other things elsewhere in the codebase. If you change
1408the order of something in this file, run `make test`, and get errors, you
1409changed something that depends on the order that you messed up.
1410
1411Almost all headers have `extern` references to items in this file.
1412
1413#### `dc.c`
1414
1415The code for the `dc` main function `dc_main()`.
1416
1417The header for this file is [`include/dc.h`][182].
1418
1419#### `dc_lex.c`
1420
1421The code for lexing that only `dc` needs.
1422
1423The headers for this file are [`include/lex.h`][180] and [`include/dc.h`][182].
1424
1425#### `dc_parse.c`
1426
1427The code for parsing that only `dc` needs.
1428
1429The headers for this file are [`include/parse.h`][181] and
1430[`include/bc.h`][182].
1431
1432#### `file.c`
1433
1434The code for `bc`'s implementation of buffered I/O. For more information about
1435why I implemented my own buffered I/O, see [`include/file.h`][55], [Error
1436Handling][97], and [Custom I/O][114], along with [`status.h`][176] and the notes
1437about version [`3.0.0`][32] in the [`NEWS`][32].
1438
1439The header for this file is [`include/file.h`][55].
1440
1441#### `history.c`
1442
1443The code for `bc`'s implementation of command-line editing/history, which is
1444based on a [UTF-8-aware fork][28] of [`linenoise`][29].
1445
1446For more information, see the [Command-Line History][189] section.
1447
1448The header for this file is [`include/history.h`][36].
1449
1450#### `lang.c`
1451
1452The data structures used for actual execution of `bc` and `dc` code.
1453
1454While execution is done in [`src/program.c`][53], this file defines functions
1455for initializing, copying, and freeing the data structures, which is somewhat
1456orthogonal to actual execution.
1457
1458Yes, it's misnamed; that's an accident of history where the first things I put
1459into it all seemed related to the `bc` language.
1460
1461The header for this file is [`include/lang.h`][38].
1462
1463#### `lex.c`
1464
1465The code for the common things that both programs need for lexing.
1466
1467The header for this file is [`include/lex.h`][180].
1468
1469#### `library.c`
1470
1471The code to implement the public API of the `bcl` library.
1472
1473The code in this file does a lot to ensure that clients do not have to worry
1474about internal `bc` details, especially error handling with `setjmp()` and
1475`longjmp()`. That and encapsulating the handling of numbers are the bulk of what
1476the code in this file actually does because most of the library is still
1477implemented in [`src/num.c`][39].
1478
1479The headers for this file are [`include/bcl.h`][30] and
1480[`include/library.h`][183].
1481
1482#### `main.c`
1483
1484The entry point for both programs; this is the `main()` function.
1485
1486This file has no headers associated with it.
1487
1488#### `num.c`
1489
1490The code for all of the arbitrary-precision [numbers][177] and [math][178] in
1491`bc`.
1492
1493The header for this file is [`include/num.h`][184].
1494
1495#### `opt.c`
1496
1497The code for parsing command-line options.
1498
1499The header for this file is [`include/opt.h`][35].
1500
1501#### `parse.c`
1502
1503The code for the common items that both programs need for parsing.
1504
1505The header for this file is [`include/parse.h`][181].
1506
1507#### `program.c`
1508
1509The code for the actual execution engine for `bc` and `dc` code.
1510
1511The header for this file is [`include/program.h`][57].
1512
1513#### `rand.c`
1514
1515The code for the [pseudo-random number generator (PRNG)][179] and the special
1516stack handling it needs.
1517
1518The PRNG only generates fixed-size integers. The magic of generating random
1519numbers of arbitrary size is actually given to the code that does math
1520([`src/num.c`][39]).
1521
1522The header for this file is [`include/rand.h`][37].
1523
1524#### `read.c`
1525
1526The code for reading from files and `stdin`.
1527
1528The header for this file is [`include/read.h`][185].
1529
1530#### `vector.c`
1531
1532The code for [vectors][111], [maps][186], and [slab vectors][187], along with
1533slabs.
1534
1535The header for this file is [`include/vector.h`][174].
1536
1537#### `vm.c`
1538
1539The code for setting up and running `bc` and `dc`.
1540
1541It is so named because I think of it as the "virtual machine" of `bc`, though
1542that is probably not true as [`program.h`][57] is probably the "virtual machine"
1543code. Thus, the name is more historical accident.
1544
1545The header for this file is [`include/vm.h`][27].
1546
1547### `tests/`
1548
1549This directory contains the entire [test suite][124] and its infrastructure.
1550
1551#### `all.sh`
1552
1553A convenience script for the `make run_all_tests` target (see the [Group
1554Tests][141] section for more information).
1555
1556#### `all.txt`
1557
1558The file with the names of the calculators. This is to make it easier for the
1559test scripts to know where the standard and other test directories are.
1560
1561#### `bcl.c`
1562
1563The test for the [`bcl`][156] API. For more information, see the [`bcl`
1564Test][157] section.
1565
1566#### `errors.sh`
1567
1568The script to run the error tests for each calculator. For more information, see
1569the [Error Tests][151] section.
1570
1571#### `extra_required.txt`
1572
1573The file with the list of tests which both calculators have that need the [Extra
1574Math build option][188]. This exists to make it easy for test scripts to skip
1575those tests when the [Extra Math build option][188] is disabled.
1576
1577#### `history.py`
1578
1579The file with all of the history tests. For more information, see the [History
1580Tests][155] section.
1581
1582#### `history.sh`
1583
1584The script to integrate [`history.py`][139] into the build system in a portable
1585way, and to skip it if necessary.
1586
1587This script also re-runs the test three times if it fails. This is because
1588`pexpect` can be flaky at times.
1589
1590#### `other.sh`
1591
1592The script to run the "other" (miscellaneous) tests for each calculator. For
1593more information, see the [Other Tests][154] section.
1594
1595#### `read.sh`
1596
1597The script to run the read tests for each calculator. For more information, see
1598the [`read()` Tests][153] section.
1599
1600#### `script.sed`
1601
1602The `sed` script to edit the output of GNU `bc` when generating script tests.
1603For more information, see the [Script Tests][150] section.
1604
1605#### `script.sh`
1606
1607The script for running one script test. For more information, see the [Script
1608Tests][150] section.
1609
1610#### `scripts.sh`
1611
1612The script to help the `make run_all_tests` (see the [Group Tests][141] section)
1613run all of the script tests.
1614
1615#### `stdin.sh`
1616
1617The script to run the `stdin` tests for each calculator. For more information,
1618see the [`stdin` Tests][152] section.
1619
1620#### `test.sh`
1621
1622The script to run one standard test. For more information, see the [Standard
1623Tests][149] section.
1624
1625#### `bc/`
1626
1627The standard tests directory for `bc`. For more information, see the [Standard
1628Tests][149] section.
1629
1630##### `all.txt`
1631
1632The file to tell the build system and `make run_all_tests` (see the [Group
1633Tests][141] section) what standard tests to run for `bc`, as well as in what
1634order.
1635
1636This file just lists the test names, one per line.
1637
1638##### `errors.txt`
1639
1640The initial error test file for `bc`. This file has one test per line. See the
1641[Error Tests][151] section for more information.
1642
1643##### `posix_errors.txt`
1644
1645The file of tests for POSIX compatibility for `bc`. This file has one test per
1646line. For more information, see the [Error Tests][151] section.
1647
1648##### `timeconst.sh`
1649
1650The script to run the `bc` tests that use the [Linux `timeconst.bc` script][6].
1651For more information, see the [Linux `timeconst.bc` Script][191]section.
1652
1653##### `errors/`
1654
1655The directory with error tests for `bc`, most discovered by AFL++ (see the
1656[Fuzzing][82] section). There is one test per file. For more information, see
1657the [Error Tests][151] section.
1658
1659##### `scripts/`
1660
1661The script tests directory for `bc`. For more information, see the [Script
1662Tests][150] section.
1663
1664###### `all.txt`
1665
1666A file to tell the build system and `make run_all_tests` (see the [Group
1667Tests][141] section) what script tests to run for `bc`, as well as in what
1668order.
1669
1670This file just lists the test names, one per line.
1671
1672#### `dc/`
1673
1674The standard tests directory for `dc`. For more information, see the [Standard
1675Tests][149] section.
1676
1677##### `all.txt`
1678
1679The file to tell the build system and `make run_all_tests` (see the [Group
1680Tests][141] section) what standard tests to run for `dc`, as well as in what
1681order.
1682
1683This file just lists the test names, one per line.
1684
1685##### `errors.txt`
1686
1687The initial error test file for `dc`. This file has one test per line. See the
1688[Error Tests][151] section for more information.
1689
1690##### `read_errors.txt`
1691
1692The file of tests errors with the `?` command (`read()` in `bc`). This file has
1693one test per line. See the [Error Tests][151] section for more information.
1694
1695##### `errors/`
1696
1697The directory with error tests for `dc`, most discovered by AFL++ (see the
1698[Fuzzing][82] section). There is one test per file. For more information, see
1699the [Error Tests][151] section.
1700
1701##### `scripts/`
1702
1703The script tests directory for `dc`. For more information, see the [Script
1704Tests][150] section.
1705
1706###### `all.txt`
1707
1708The file to tell the build system and `make run_all_tests` (see the [Group
1709Tests][141] section) what script tests to run for `dc`, as well as in what
1710order.
1711
1712This file just lists the test names, one per line.
1713
1714#### `fuzzing/`
1715
1716The directory containing the fuzzing infrastructure. For more information, see
1717the [Fuzzing][82] section.
1718
1719##### `bc_afl_continue.yaml`
1720
1721The [`tmuxp`][123] config (for use with [`tmux`][122]) for easily restarting a
1722fuzz run. For more information, see the [Convenience][130] subsection of the
1723[Fuzzing][82] section.
1724
1725##### `bc_afl.yaml`
1726
1727The [`tmuxp`][123] config (for use with [`tmux`][122]) for easily starting a
1728fuzz run. For more information, see the [Convenience][130] subsection of the
1729[Fuzzing][82] section.
1730
1731Be aware that this will delete all previous unsaved fuzzing tests in the output
1732directories.
1733
1734##### `bc_inputs1/`
1735
1736The fuzzing input directory for the first third of inputs for `bc`. For more
1737information, see the [Corpuses][192] subsection of the [Fuzzing][82] section.
1738
1739##### `bc_inputs2/`
1740
1741The fuzzing input directory for the second third of inputs for `bc`. For more
1742information, see the [Corpuses][192] subsection of the [Fuzzing][82] section.
1743
1744##### `bc_inputs3/`
1745
1746The fuzzing input directory for the third third of inputs for `bc`. For more
1747information, see the [Corpuses][192] subsection of the [Fuzzing][82] section.
1748
1749##### `dc_inputs/`
1750
1751The fuzzing input directory for the inputs for `dc`. For more information, see
1752the [Corpuses][192] subsection of the [Fuzzing][82] section.
1753
1754## Build System
1755
1756The build system is described in detail in the [build manual][14], so
1757maintainers should start there. This section, however, describes some parts of
1758the build system that only maintainers will care about.
1759
1760### Clean Targets
1761
1762`bc` has a default `make clean` target that cleans up the build files. However,
1763because `bc`'s build system can generate many different types of files, there
1764are other clean targets that may be useful:
1765
1766* `make clean_gen` cleans the `gen/strgen` executable generated from
1767  [`gen/strgen.c`][15]. It has no prerequisites.
1768* `make clean` cleans object files, `*.cat` files (see the [Locales][85]
1769  section), executables, and files generated from text files in [`gen/`][145],
1770  including `gen/strgen` if it was built. So this has a prerequisite on
1771  `make clean_gen` in normal use.
1772* `make clean_benchmarks` cleans [benchmarks][144], including the `ministat`
1773  executable. It has no prerequisites.
1774* `make clean_config` cleans the generated `Makefile` and the manuals that
1775  [`configure.sh`][69] copied in preparation for install. It also depends on
1776  `make clean` and `make clean_benchmarks`, so it cleans those items too. This
1777  is the target that [`configure.sh`][69] uses before it does its work.
1778* `make clean_coverage` cleans the generated coverage files for the [test
1779  suite][124]'s [code coverage][146] capabilities. It has no prerequisites. This
1780  is useful if the code coverage tools are giving errors.
1781* `make clean_tests` cleans *everything*. It has prerequisites on all previous
1782  clean targets, but it also cleans all of the [generated tests][143].
1783
1784When adding more generated files, you may need to add them to one of these
1785targets and/or add a target for them especially.
1786
1787### Preprocessor Macros
1788
1789`bc` and `dc` use *a lot* of preprocessor macros to ensure that each build type:
1790
1791* builds,
1792* works under the [test suite][124], and
1793* excludes as much code as possible from all builds.
1794
1795This section will explain the preprocessor style of `bc` and `dc`, as well as
1796provide an explanation of the macros used.
1797
1798#### Style
1799
1800The style of macro use in `bc` is pretty straightforward: I avoid depending on
1801macro definitions and instead, I set defaults if the macro is not defined and
1802then test the value if the macro with a plain `#if`.
1803
1804(Some examples of setting defaults are in [`include/status.h`][176], just above
1805the definition of the `BcStatus` enum.)
1806
1807In other words, I use `#if` instead of `#ifndef` or `#ifdef`, where possible.
1808
1809There are a couple of cases where I went with standard stuff instead. For
1810example, to test whether I am in debug mode or not, I still use the standard
1811`#ifndef NDEBUG`.
1812
1813#### Standard Macros
1814
1815`BC_ENABLED`
1816
1817:   This macro expands to `1` if `bc` is enabled, `0` if disabled.
1818
1819`DC_ENABLED`
1820
1821:   This macro expands to `1` if `dc` is enabled, `0` if disabled.
1822
1823`BUILD_TYPE`
1824
1825:   The macro expands to the build type, which is one of: `A`, `E`, `H`, `N`,
1826    `EH`, `EN`, `HN`, `EHN`. This build type is used in the help text to direct
1827    the user to the correct markdown manual in the `git.yzena.com` website.
1828
1829`EXECPREFIX`
1830
1831:   Thist macro expands to the prefix on the executable name. This is used to
1832    allow `bc` and `dc` to skip the prefix when finding out which calculator is
1833    executing.
1834
1835`BC_NUM_KARATSUBA_LEN`
1836
1837:   This macro expands to an integer, which is the length of numbers below which
1838    the Karatsuba multiplication algorithm switches to brute-force
1839    multiplication.
1840
1841`BC_ENABLE_EXTRA_MATH`
1842
1843:   This macro expands to `1` if the [Extra Math build option][188] is enabled,
1844    `0` if disabled.
1845
1846`BC_ENABLE_HISTORY`
1847
1848:   This macro expands to `1` if the [History build option][193] is enabled, `0`
1849    if disabled.
1850
1851`BC_ENABLE_NLS`
1852
1853:   This macro expands to `1` if the [NLS build option][193] (for locales) is
1854    enabled, `0` if disabled.
1855
1856`BC_ENABLE_LIBRARY`
1857
1858:   This macro expands to `1` if the [`bcl` library][156] is enabled, `0` if
1859    disabled. If this is enabled, building the calculators themselves is
1860    disabled, but both `BC_ENABLED` and `DC_ENABLED` must be non-zero.
1861
1862`BC_ENABLE_MEMCHECK`
1863
1864:   This macro expands to `1` if `bc` has been built for use with Valgrind's
1865    [Memcheck][194], `0` otherwise. This ensures that fatal errors still free
1866    all of their memory when exiting. `bc` does not do that normally because
1867    what's the point?
1868
1869`BC_ENABLE_AFL`
1870
1871:   This macro expands to `1` if `bc` has been built for fuzzing with
1872    [AFL++][125], `0` otherwise.. See the [Fuzzing][82] section for more
1873    information.
1874
1875`BC_DEFAULT_BANNER`
1876
1877:   This macro expands to the default value for displaying the `bc` banner.
1878
1879`BC_DEFAULT_SIGINT_RESET`
1880
1881:   The macro expands to the default value for whether or not `bc` should reset
1882    on `SIGINT` or quit.
1883
1884`BC_DEFAULT_TTY_MODE`
1885
1886:   The macro expands to the default value for whether or not `bc` should use
1887    TTY mode when it available.
1888
1889`BC_DEFAULT_PROMPT`
1890
1891:   This macro expands to the default value for whether or not `bc` should use a
1892    prompt when TTY mode is available.
1893
1894`DC_DEFAULT_SIGINT_RESET`
1895
1896:   The macro expands to the default value for whether or not `dc` should reset
1897    on `SIGINT` or quit.
1898
1899`DC_DEFAULT_TTY_MODE`
1900
1901:   The macro expands to the default value for whether or not `dc` should use
1902    TTY mode when it available.
1903
1904`DC_DEFAULT_PROMPT`
1905
1906:   This macro expands to the default value for whether or not `dc` should use a
1907    prompt when TTY mode is available.
1908
1909`BC_DEBUG_CODE`
1910
1911:   If this macro expands to a non-zero integer, then `bc` is built with *a lot*
1912    of extra debugging code. This is never set by the build system and must be
1913    set by the programmer manually. This should never be set in builds given to
1914    end users. For more information, see the [Debugging][134] section.
1915
1916## Test Suite
1917
1918While the source code may be the heart and soul of `bc`, the test suite is the
1919arms and legs: it gives `bc` the power to do anything it needs to do.
1920
1921The test suite is what allowed `bc` to climb to such high heights of quality.
1922This even goes for fuzzing because fuzzing depends on the test suite for its
1923input corpuses. (See the [Fuzzing][82] section.)
1924
1925Understanding how the test suite works should be, I think, the first thing that
1926maintainers learn after learning what `bc` and `dc` should do. This is because
1927the test suite, properly used, gives confidence that changes have not caused
1928bugs or regressions.
1929
1930That is why I spent the time to make the test suite as easy to use and as fast
1931as possible.
1932
1933To use the test suite (assuming `bc` and/or `dc` are already built), run the
1934following command:
1935
1936```
1937make test
1938```
1939
1940That's it. That's all.
1941
1942It will return an error code if the test suite failed. It will also print out
1943information about the failure.
1944
1945If you want the test suite to go fast, then run the following command:
1946
1947```
1948make -j<cores> test
1949```
1950
1951Where `<cores>` is the number of cores that your computer has. Of course, this
1952requires a `make` implementation that supports that option, but most do. (And I
1953will use this convention throughout the rest of this section.)
1954
1955I have even tried as much as possible, to put longer-running tests near the
1956beginning of the run so that the entire suite runs as fast as possible.
1957
1958However, if you want to be sure which test is failing, then running a bare
1959`make test` is a great way to do that.
1960
1961But enough about how you have no excuses to use the test suite as much as
1962possible; let's talk about how it works and what you *can* do with it.
1963
1964### Standard Tests
1965
1966The heavy lifting of testing the math in `bc`, as well as basic scripting, is
1967done by the "standard tests" for each calculator.
1968
1969These tests use the files in the [`tests/bc/`][161] and [`tests/dc/`][162]
1970directories (except for [`tests/bc/all.txt`][163], [`tests/bc/errors.txt`][164],
1971[`tests/bc/posix_errors.txt`][165], [`tests/bc/timeconst.sh`][166],
1972[`tests/dc/all.txt`][167], [`tests/dc/errors.txt`][168], and
1973[`tests/dc/read_errors.txt`][175]), which are called the "standard test
1974directories."
1975
1976For every test, there is the test file and the results file. The test files have
1977names of the form `<test>.txt`, where `<test>` is the name of the test, and the
1978results files have names of the form `<test>_results.txt`.
1979
1980If the test file exists but the results file does not, the results for that test
1981are generated by a GNU-compatible `bc` or `dc`. See the [Generated Tests][143]
1982section.
1983
1984The `all.txt` file in each standard tests directory is what tells the test suite
1985and [build system][142] what tests there are, and the tests are either run in
1986that order, or in the case of parallel `make`, that is the order that the
1987targets are listed as prerequisites of `make test`.
1988
1989If the test exists in the `all.txt` file but does not *actually* exist, the test
1990and its results are generated by a GNU-compatible `bc` or `dc`. See the
1991[Generated Tests][143] section.
1992
1993To add a non-generated standard test, do the following:
1994
1995* Add the test file (`<test>.txt` in the standard tests directory).
1996* Add the results file (`<test>_results.txt` in the standard tests directory).
1997  You can skip this step if just the results file needs to be generated. See the
1998  [Generated Tests][147] section for more information.
1999* Add the name of the test to the `all.txt` file in the standard tests
2000  directory, putting it in the order it should be in. If possible, I would put
2001  longer tests near the beginning because they will start running earlier with
2002  parallel `make`. I always keep `decimal` first, though, as a smoke test.
2003
2004If you need to add a generated standard test, see the [Generated Tests][147]
2005section for how to do that.
2006
2007Some standard tests need to be skipped in certain cases. That is handled by the
2008[build system][142]. See the [Integration with the Build System][147] section
2009for more details.
2010
2011In addition to all of the above, the standard test directory is not only the
2012directory for the standard tests of each calculator, it is also the parent
2013directory of all other test directories for each calculator.
2014
2015#### `bc` Standard Tests
2016
2017The list of current (17 July 2021) standard tests for `bc` is below:
2018
2019decimal
2020
2021:   Tests decimal parsing and printing.
2022
2023print
2024
2025:   Tests printing in every base from decimal. This is near the top for
2026    performance of parallel testing.
2027
2028parse
2029
2030:   Tests parsing in any base and outputting in decimal. This is near the top
2031    for performance of parallel testing.
2032
2033lib2
2034
2035:   Tests the extended math library. This is near the top for performance of
2036    parallel testing.
2037
2038print2
2039
2040:   Tests printing at the extreme values of `obase`.
2041
2042length
2043
2044:   Tests the `length()` builtin function.
2045
2046scale
2047
2048:   Tests the `scale()` builtin function.
2049
2050shift
2051
2052:   Tests the left (`<<`) and right (`>>`) shift operators.
2053
2054add
2055
2056:   Tests addition.
2057
2058subtract
2059
2060:   Tests subtraction.
2061
2062multiply
2063
2064:   Tests multiplication.
2065
2066divide
2067
2068:   Tests division.
2069
2070modulus
2071
2072:   Tests modulus.
2073
2074power
2075
2076:   Tests power (exponentiation).
2077
2078sqrt
2079
2080:   Tests the `sqrt()` (square root) builtin function.
2081
2082trunc
2083
2084:   Tests the truncation (`$`) operator.
2085
2086places
2087
2088:   Tests the places (`@`) operator.
2089
2090vars
2091
2092:   Tests some usage of variables. This one came from [AFL++][125] I think.
2093
2094boolean
2095
2096:   Tests boolean operators.
2097
2098comp
2099
2100:   Tests comparison operators.
2101
2102abs
2103
2104:   Tests the `abs()` builtin function.
2105
2106assignments
2107
2108:   Tests assignment operators, including increment/decrement operators.
2109
2110functions
2111
2112:   Tests functions, specifically function parameters being replaced before they
2113    themselves are used. See the comment in `bc_program_call()` about the last
2114    condition.
2115
2116scientific
2117
2118:   Tests scientific notation.
2119
2120engineering
2121
2122:   Tests engineering notation.
2123
2124globals
2125
2126:   Tests that assigning to globals affects callers.
2127
2128strings
2129
2130:   Tests strings.
2131
2132strings2
2133
2134:   Tests string allocation in slabs, to ensure slabs work.
2135
2136letters
2137
2138:   Tests single and double letter numbers to ensure they behave differently.
2139    Single-letter numbers always be set to the same value, regardless of
2140    `ibase`.
2141
2142exponent
2143
2144:   Tests the `e()` function in the math library.
2145
2146log
2147
2148:   Tests the `l()` function in the math library.
2149
2150pi
2151
2152:   Tests that `bc` produces the right value of pi for numbers with varying
2153    `scale` values.
2154
2155arctangent
2156
2157:   Tests the `a()` function in the math library.
2158
2159sine
2160
2161:   Tests the `s()` function in the math library.
2162
2163cosine
2164
2165:   Tests the `c()` function in the math library.
2166
2167bessel
2168
2169:   Tests the `j()` function in the math library.
2170
2171arrays
2172
2173:   Test arrays.
2174
2175misc
2176
2177:   Miscellaneous tests. I named it this because at the time, I struggled to
2178    classify them, but it's really testing multi-line numbers.
2179
2180misc1
2181
2182:   A miscellaneous test found by [AFL++][125].
2183
2184misc2
2185
2186:   A miscellaneous test found by [AFL++][125].
2187
2188misc3
2189
2190:   A miscellaneous test found by [AFL++][125].
2191
2192misc4
2193
2194:   A miscellaneous test found by [AFL++][125].
2195
2196misc5
2197
2198:   A miscellaneous test found by [AFL++][125].
2199
2200misc6
2201
2202:   A miscellaneous test found by [AFL++][125].
2203
2204misc7
2205
2206:   A miscellaneous test found by [AFL++][125].
2207
2208void
2209
2210:   Tests void functions.
2211
2212rand
2213
2214:   Tests the pseudo-random number generator and its special stack handling.
2215
2216recursive_arrays
2217
2218:   Tested the slab vector undo ability in used in `bc_parse_name()` when it
2219    existed. Now used as a stress test.
2220
2221divmod
2222
2223:   Tests divmod.
2224
2225modexp
2226
2227:   Tests modular exponentiation.
2228
2229bitfuncs
2230
2231:   Tests the bitwise functions, `band()`, `bor()`, `bxor()`, `blshift()` and
2232    `brshift()` in [`gen/lib2.bc`][26].
2233
2234#### `dc` Standard Tests
2235
2236The list of current (17 July 2021) standard tests for `dc` is below:
2237
2238decimal
2239
2240:   Tests decimal parsing and printing.
2241
2242length
2243
2244:   Tests the `length()` builtin function, including for strings and arrays.
2245
2246stack_len
2247
2248:   Tests taking the length of the results stack.
2249
2250stack_len
2251
2252:   Tests taking the length of the execution stack.
2253
2254add
2255
2256:   Tests addition.
2257
2258subtract
2259
2260:   Tests subtraction.
2261
2262multiply
2263
2264:   Tests multiplication.
2265
2266divide
2267
2268:   Tests division.
2269
2270modulus
2271
2272:   Tests modulus.
2273
2274divmod
2275
2276:   Tests divmod.
2277
2278power
2279
2280:   Tests power (exponentiation).
2281
2282sqrt
2283
2284:   Tests the `sqrt()` (square root) builtin function.
2285
2286modexp
2287
2288:   Tests modular exponentiation.
2289
2290boolean
2291
2292:   Tests boolean operators.
2293
2294negate
2295
2296:   Tests negation as a command and as part of numbers.
2297
2298trunc
2299
2300:   Tests the truncation (`$`) operator.
2301
2302places
2303
2304:   Tests the places (`@`) operator.
2305
2306shift
2307
2308:   Tests the left (`<<`) and right (`>>`) shift operators.
2309
2310abs
2311
2312:   Tests the `abs()` builtin function.
2313
2314scientific
2315
2316:   Tests scientific notation.
2317
2318engineering
2319
2320:   Tests engineering notation.
2321
2322vars
2323
2324:   Tests some usage of variables. This one came from [AFL++][125] I think.
2325
2326misc
2327
2328:   Miscellaneous tests. I named it this because at the time, I struggled to
2329    classify them.
2330
2331strings
2332
2333:   Tests strings.
2334
2335rand
2336
2337:   Tests the pseudo-random number generator and its special stack handling.
2338
2339exec_stack
2340
2341:   Tests the execution stack depth command.
2342
2343### Script Tests
2344
2345The heavy lifting of testing the scripting of `bc` is done by the "script tests"
2346for each calculator.
2347
2348These tests use the files in the [`tests/bc/scripts/`][169] and
2349[`tests/dc/scripts/`][170] directories (except for
2350[`tests/bc/scripts/all.txt`][171] and [`tests/dc/scripts/all.txt`][172]), which
2351are called the "script test directories."
2352
2353To add a script test, do the following:
2354
2355* Add the test file (`<test>.bc` or `<test>.dc` in the script tests directory).
2356* Add the results file (`<test>.txt` in the script tests directory). You can
2357  skip this step if just the results file needs to be generated. See the
2358  [Generated Tests][147] section for more information.
2359* Add the name of the test to the `all.txt` file in the script tests directory,
2360  putting it in the order it should be in. If possible, I would put longer tests
2361  near the beginning because they will start running earlier with parallel
2362  `make`.
2363
2364Some script tests need to be skipped in certain cases. That is handled by the
2365[build system][142]. See the [Integration with the Build System][147] section
2366for more details.
2367
2368Another unique thing about the script tests, at least for `bc`: they test the
2369`-g` and `--global-stacks` flags. This means that all of the script tests for
2370`bc` are written assuming the `-g` flag was given on the command-line
2371
2372There is one extra piece of script tests: [`tests/script.sed`][190]. This `sed`
2373script is used to remove an incompatibility with GNU `bc`.
2374
2375If there is only one more character to print at the end of `BC_LINE_LENGTH`, GNU
2376`bc` still prints a backslash+newline+digit combo. OpenBSD doesn't, which is
2377correct according to my reading of the `bc` spec, so my `bc` doesn't as well.
2378
2379The `sed` script edits numbers that end with just one digit on a line by itself
2380to put it on the same line as others.
2381
2382#### `bc` Script Tests
2383
2384The list of current (17 July 2021) script tests for `bc` is below:
2385
2386print.bc
2387
2388:   Tests printing even harder than the print standard test.
2389
2390multiply.bc
2391
2392:   Tests multiplication even harder than the multiply standard test.
2393
2394divide.bc
2395
2396:   Tests division even harder than the divide standard test.
2397
2398subtract.bc
2399
2400:   Tests subtraction even harder than the subtract standard test.
2401
2402add.bc
2403
2404:   Tests addition even harder than the add standard test.
2405
2406parse.bc
2407
2408:   Tests parsing even harder than the parse standard test.
2409
2410array.bc
2411
2412:   Tests arrays even harder than the arrays standard test.
2413
2414atan.bc
2415
2416:   Tests arctangent even harder than the arctangent standard test.
2417
2418bessel.bc
2419
2420:   Tests bessel even harder than the bessel standard test.
2421
2422functions.bc
2423
2424:   Tests functions even harder than the functions standard test.
2425
2426globals.bc
2427
2428:   Tests global stacks directly.
2429
2430len.bc
2431
2432:   Tests the `length()` builtin on arrays.
2433
2434rand.bc
2435
2436:   Tests the random number generator in the presence of global stacks.
2437
2438references.bc
2439
2440:   Tests functions with array reference parameters.
2441
2442screen.bc
2443
2444:   A random script provided by an early user that he used to calculate the size
2445    of computer screens
2446
2447strings2.bc
2448
2449:   Tests escaping in strings.
2450
2451#### `dc` Script Tests
2452
2453The list of current (17 July 2021) script tests for `dc` is below:
2454
2455prime.dc
2456
2457:   Tests scripting by generating the first 100,000 primes.
2458
2459asciify.dc
2460
2461:   Tests the asciify command.
2462
2463stream.dc
2464
2465:   Tests the stream command.
2466
2467array.dc
2468
2469:   Tests arrays.
2470
2471else.dc
2472
2473:   Tests else clauses on conditional execution commands.
2474
2475factorial.dc
2476
2477:   Tests scripting with factorial.
2478
2479loop.dc
2480
2481:   Tests scripting by implementing loops.
2482
2483quit.dc
2484
2485:   Tests the quit command in the presence of tail calls.
2486
2487weird.dc
2488
2489:   A miscellaneous test.
2490
2491### Error Tests
2492
2493One of the most useful parts of the `bc` test suite, in my opinion, is the heavy
2494testing of error conditions.
2495
2496Just about every error condition I can think of is tested, along with many
2497machine-generated (by [AFL++][125]) ones.
2498
2499However, because the error tests will often return error codes, they require
2500different infrastructure from the rest of the test suite, which assumes that
2501the calculator under test will return successfully. A lot of that infrastructure
2502is in the [`scripts/functions.sh`][105] script, but it basically allows the
2503calculator to exit with an error code and then tests that there *was* an error
2504code.
2505
2506Besides returning error codes, error tests also ensure that there is output from
2507`stderr`. This is to make sure that an error message is always printed.
2508
2509The error tests for each calculator are spread through two directories, due to
2510historical accident. These two directories are the standard test directory (see
2511the [Standard Tests][149] section) and the `errors/` directory directly
2512underneath the standard tests directory.
2513
2514This split is convenient, however, because the tests in each directory are
2515treated differently.
2516
2517The error tests in the standard test directory, which include `errors.txt` for
2518both calculators, `posix_errors.txt` for `bc`, and `read_errors.txt` for `dc`,
2519are read line-by-line and shoved through `stdin`, and each line is considered a
2520separate test. For this reason, there can't be any blank lines in the error
2521files in the standard tests directory because a blank line causes a successful
2522exit.
2523
2524On the other hand, the tests in the `errors/` directory below the standard tests
2525directory are considered to be one test per file, and they are used differently.
2526They are shoved into the calculator through `stdin`, but they are also executed
2527on the command-line.
2528
2529To add an error test, first figure out which kind you want.
2530
2531Is it a simple one-liner, and you don't care if it's tested through a file?
2532
2533Then put it in one of the error files in the standard test directory. I would
2534only put POSIX errors in the `posix_errors.txt` file for `bc`, and only `read()`
2535errors in the `read_errors.txt` file for `dc`; all others I would put in the
2536respective `errors.txt` file.
2537
2538On the other hand, if you care if the error is run as a file on the
2539command-line, or the error requires multiple lines to reproduce, then put the
2540test in the respective `errors/` directory.
2541
2542After that, you are done; the test suite will automatically pick up the new
2543test, and you don't have to tell the test suite the expected results.
2544
2545### `stdin` Tests
2546
2547The `stdin` tests specifically test the lexing and parsing of multi-line
2548comments and strings. This is important because when reading from `stdin`, the
2549calculators can only read one line at a time, so partial parses are possible.
2550
2551To add `stdin` tests, just add the tests to the `stdin.txt` file in the
2552respective standard tests directory, and add the expected results in the
2553`stdin_results.txt` in the respective standard tests directory.
2554
2555### `read()` Tests
2556
2557The `read()` tests are meant to test the `read()` builtin function, to ensure
2558that the parsing and execution is correct.
2559
2560Each line is one test, as that is the nature of using the `read()` function, so
2561to add a test, just add it as another line in the `read.txt` file in the
2562respective standard tests directory, and add its result to the
2563`read_results.txt` file in the respective standard tests directory.
2564
2565### Other Tests
2566
2567The "other" tests are just random tests that I could not easily classify under
2568other types of tests. They usually include things like command-line parsing and
2569environment variable testing.
2570
2571To add an other test, it requires adding the programming for it to
2572[`tests/other.sh`][195] because all of the tests are written specifically in
2573that script. It would be best to use the infrastructure in
2574[`scripts/functions.sh`][105].
2575
2576### Linux `timeconst.bc` Script
2577
2578One special script that `bc`'s test suite will use is the [Linux `timeconst.bc`
2579script][6].
2580
2581I made the test suite able to use this script because the reason the
2582[toybox][16] maintainer wanted my `bc` is because of this script, and I wanted
2583to be sure that it would run correctly on the script.
2584
2585However, it is not part of the distribution, nor is it part of the repository.
2586The reason for this is because [`timeconst.bc`][6] is under the GPL, while this
2587repo is under a BSD license.
2588
2589If you want `bc` to run tests on [`timeconst.bc`][6], download it and place it
2590at `tests/bc/scripts/timeconst.bc`. If it is there, the test suite will
2591automatically run its tests; otherwise, it will skip it.
2592
2593### History Tests
2594
2595There are automatic tests for history; however, they have dependencies: Python 3
2596and [`pexpect`][137].
2597
2598As a result, because I need the [test suite to be portable][138], like the rest
2599of `bc`, the history tests are carefully guarded with things to ensure that they
2600are skipped, rather than failing if Python and [`pexpect`][137] are not
2601installed. For this reason, there is a `sh` script, [`tests/history.sh`][140]
2602that runs the actual script, [`tests/history.py`][139].
2603
2604I have added as many tests as I could to cover as many lines and branches as
2605possible. I guess I could have done more, but doing so would have required a lot
2606of time.
2607
2608I have tried to make it as easy as possible to run the history tests. They will
2609run automatically if you use the `make test` command, and they will also use
2610parallel execution with `make -j<cores> test`.
2611
2612All of the tests are contained in [`tests/history.py`][139]. The reason for this
2613is because they are in Python, and I don't have an easy way of including Python
2614(or at the very least, I am not familiar enough with Python to do that). So they
2615are all in the same file to make it easier on me.
2616
2617Each test is one function in the script. They all take the same number and type
2618of arguments:
2619
26201.	`exe`: the executable to run.
26212.	`args`: the arguments to pass to the executable.
26223.	`env`: the environment.
2623
2624Each function creates a child process with `pexpect.spawn` and then tests with
2625that child. Then the function returns the child to the caller, who closes it
2626and checks its error code against its expected error code.
2627
2628Yes, the error code is not a success all the time. This is because of the UTF-8
2629tests; `bc` gives a fatal error on any non-ASCII data because ASCII is all `bc`
2630is required to handle, per the [standard][2].
2631
2632So in [`tests/history.py`][139], there are four main arrays:
2633
2634* `bc` test functions,
2635* `bc` expected error codes.
2636* `dc` test functions.
2637* `dc` expected error codes.
2638
2639[`tests/history.py`][139] takes an index as an argument; that index is what test
2640it should run. That index is used to index into the proper test and error code
2641array.
2642
2643If you need to add more history tests, you need to do the following:
2644
26451.	Add the function for that test to [`tests/history.py`][139].
26462.	Add the function to the proper array of tests.
26473.	Add the expected error code to the proper array of error codes.
26484.	Add a target for the test to [`Makefile.in`][70].
26495.	Add that target as a prerequisite to either `test_bc_history` or
2650	`test_dc_history`.
2651
2652You do not need to do anything to add the test to `history_all_tests` (see
2653[Group Tests][141] below) because the scripts will automatically run all of the
2654tests properly.
2655
2656### Generated Tests
2657
2658Some tests are *large*, and as such, it is impractical to check them into `git`.
2659Instead, the tests depend on the existence of a GNU-compatible `bc` in the
2660`PATH`, which is then used to generate the tests.
2661
2662If [`configure.sh`][69] was run with the `-G` argument, which disables generated
2663tests, then `make test` and friends will automatically skip generated tests.
2664This is useful to do on platforms that are not guaranteed to have a
2665GNU-compatible `bc` installed.
2666
2667However, adding a generated test is a complicated because you have to figure out
2668*where* you want to put the file to generate the test.
2669
2670For example, `bc`'s test suite will automatically use a GNU-compatible `bc` to
2671generate a `<test>_results.txt` file in the [standard tests][149] directory
2672(either `tests/bc/` or `tests/dc/`) if none exists for the `<test>` test. If no
2673`<test>.txt` file exists in the [standard tests][149] directory, then `bc`'s
2674test suite will look for a `<test>.bc` or `<test>.dc` file in the [script
2675tests][150] directory (either `tests/bc/scripts` or `tests/dc/scripts`), and if
2676that exists, it will use that script to generate the `<test>.txt` file in the
2677[standard tests][149] directory after which it will generate the
2678`<test>_results.txt` file in the [standard tests][149] directory.
2679
2680So you can choose to either:
2681
2682* Have a test in the [standard tests][149] directory without a corresponding
2683  `*_results.txt` file, or
2684* Have a script in the [script tests][150] directory to generate the
2685  corresponding file in the standard test directory before generating the
2686  corresponding `*_results.txt` file.
2687
2688Adding a script has a double benefit: the script itself can be used as a test.
2689However, script test results can also be generated.
2690
2691If `bc` is asked to run a script test, then if the script does not exist, `bc`'s
2692test suite returns an error. If it *does* exist, but no corresponding
2693`<test>.txt` file exists in the [script tests][150] directory, then a
2694GNU-compatible `bc` is used to generate the `<test>.txt` results file.
2695
2696If generated tests are disabled through [`configure.sh`][69], then these tests
2697are not generated if they do not exist. However, if they *do* exist, then they
2698are run. This can happen if a `make clean_tests` was not run between a build
2699that generated tests and a build that will not.
2700
2701### Group Tests
2702
2703While the test suite has a lot of targets in order to get parallel execution,
2704there are five targets that allow you to run each section, or all, of the test
2705suite as one unit:
2706
2707* `bc_all_tests` (`bc` tests)
2708* `timeconst_all_tests` ([Linux `timeconst.bc` script][6] tests)
2709* `dc_all_tests` (`dc` tests)
2710* `history_all_tests` (history tests)
2711* `run_all_tests` (combination of the previous four)
2712
2713In addition, there are more fine-grained targets available:
2714
2715* `test_bc` runs all `bc` tests (except history tests).
2716* `test_dc` runs all `dc` tests (except history tests).
2717* `test_bc_tests` runs all `bc` [standard tests][149].
2718* `test_dc_tests` runs all `dc` [standard tests][149].
2719* `test_bc_scripts` runs all `bc` [script tests][150].
2720* `test_dc_scripts` runs all `dc` [script tests][150].
2721* `test_bc_stdin` runs the `bc` [`stdin` tests][152].
2722* `test_dc_stdin` runs the `dc` [`stdin` tests][152].
2723* `test_bc_read` runs the `bc` [`read()` tests][153].
2724* `test_dc_read` runs the `dc` [`read()` tests][153].
2725* `test_bc_errors` runs the `bc` [error tests][151].
2726* `test_dc_errors` runs the `dc` [error tests][151].
2727* `test_bc_other` runs the `bc` [other tests][151].
2728* `test_dc_other` runs the `dc` [other tests][151].
2729* `timeconst` runs the tests for the [Linux `timeconst.bc` script][6].
2730* `test_history` runs all history tests.
2731* `test_bc_history` runs all `bc` history tests.
2732* `test_dc_history` runs all `dc` history tests.
2733
2734All of the above tests are parallelizable.
2735
2736### Individual Tests
2737
2738In addition to all of the above, individual test targets are available. These
2739are mostly useful for attempting to fix a singular test failure.
2740
2741These tests are:
2742
2743* `test_bc_<test>`, where `<test>` is the name of a `bc` [standard test][149].
2744  The name is the name of the test file without the `.txt` extension. It is the
2745  name printed by the test suite when running the test.
2746* `test_dc_<test>`, where `<test>` is the name of a `dc` [standard test][149].
2747  The name is the name of the test file without the `.txt` extension. It is the
2748  name printed by the test suite when running the test.
2749* `test_bc_script_<test>`, where `<test>` is the name of a `bc` [script
2750  test][150]. The name of the test is the name of the script without the `.bc`
2751  extension.
2752* `test_dc_script_<test>`, where `<test>` is the name of a `dc` [script
2753  test][150]. The name of the test is the name of the script without the `.dc`
2754  extension.
2755* `test_bc_history<idx>` runs the `bc` history test with index `<idx>`.
2756* `test_dc_history<idx>` runs the `dc` history test with index `<idx>`.
2757
2758### [`bcl`][156] Test
2759
2760When [`bcl`][156] is built, the [build system][142] automatically ensures that
2761`make test` runs the [`bcl`][156] test instead of the `bc` and `dc` tests.
2762
2763There is only one test, and it is built from [`tests/bcl.c`][158].
2764
2765The reason the test is in C is because [`bcl`][156] is a C library; I did not
2766want to have to write C code *and* POSIX `sh` scripts to run it.
2767
2768The reason there is only one test is because most of the code for the library is
2769tested by virtue of testing `bc` and `dc`; the test needs to only ensure that
2770the library bindings and plumbing do not interfere with the underlying code.
2771
2772However, just because there is only one test does not mean that it doesn't test
2773more than one thing. The code actually handles a series of tests, along with
2774error checking to ensure that nothing went wrong.
2775
2776To add a [`bcl`][156] test, just figure out what test you want, figure out where
2777in the [`tests/bcl.c`][158] would be best to put it, and put it there. Do as
2778much error checking as possible, and use the `err(BclError)` function. Ensure
2779that all memory is freed because that test is run through [Valgrind][159] and
2780[AddressSanitizer][160].
2781
2782### Integration with the Build System
2783
2784If it was not obvious by now, the test suite is heavily integrated into the
2785[build system][142], but the integration goes further than just making the test
2786suite easy to run from `make` and generating individual and group tests.
2787
2788The big problem the test suite has is that some `bc` code, stuff that is
2789important to test, is only in *some* builds. This includes all of the extra math
2790extensions, for example.
2791
2792So the test suite needs to have some way of turning off the tests that depend on
2793certain [build types][81] when those [build types][81] are not used.
2794
2795This is the reason the is tightly integrated with the [build system][142]: the
2796[build system][142] knows what [build type][81] was used and can tell the test
2797suite to turn off the tests that do not apply.
2798
2799It does this with arguments to the test scripts that are either a `1` or a `0`,
2800depending on whether tests of that type should be enabled or not. These
2801arguments are why I suggest, in the [Test Scripts][148] section, to always use a
2802`make` target to run the test suite or any individual test. I have added a lot
2803of targets to make this easy and as fast as possible.
2804
2805In addition to all of that, the build system is responsible for selecting the
2806`bc`/`dc` tests or the [`bcl` test][157].
2807
2808### Test Suite Portability
2809
2810The test suite is meant to be run by users and packagers as part of their
2811install process.
2812
2813This puts some constraints on the test suite, but the biggest is that the test
2814suite must be as [portable as `bc` itself][136].
2815
2816This means that the test suite must be implemented in pure POSIX `make`, `sh`,
2817and C99.
2818
2819#### Test Scripts
2820
2821To accomplish the portability, the test suite is run by a bunch of `sh` scripts
2822that have the constraints laid out in [POSIX Shell Scripts][76].
2823
2824However, that means they have some quirks, made worse by the fact that there are
2825[generated tests][143] and [tests that need to be skipped, but only
2826sometimes][147].
2827
2828This means that a lot of the scripts take an awkward number and type of
2829arguments. Some arguments are strings, but most are integers, like
2830[`scripts/release.sh`][83].
2831
2832It is for this reason that I do not suggest running the test scripts directly.
2833Instead, always use an appropriate `make` target, which already knows the
2834correct arguments for the test because of the [integration with the build
2835system][147].
2836
2837### Test Coverage
2838
2839In order to get test coverage information, you need `gcc`, `gcov`, and `gcovr`.
2840
2841If you have them, run the following commands:
2842
2843```
2844CC=gcc ./configure -gO3 -c
2845make -j<cores>
2846make coverage
2847```
2848
2849Note that `make coverage` does not have a `-j<cores>` part; it cannot be run in
2850parallel. If you try, you will get errors. And note that `CC=gcc` is used.
2851
2852After running those commands, you can open your web browser and open the
2853`index.html` file in the root directory of the repo. From there, you can explore
2854all of the coverage results.
2855
2856If you see lines or branches that you think you could hit with a manual
2857execution, do such manual execution, and then run the following command:
2858
2859```
2860make coverage_output
2861```
2862
2863and the coverage output will be updated.
2864
2865If you want to rerun `make coverage`, you must do a `make clean` and build
2866first, like this:
2867
2868```
2869make clean
2870make -j<cores>
2871make coverage
2872```
2873
2874Otherwise, you will get errors.
2875
2876If you want to run tests in parallel, you can do this:
2877
2878```
2879make -j<cores>
2880make -j<cores> test
2881make coverage_output
2882```
2883
2884and that will generate coverage output correctly.
2885
2886### [AddressSanitizer][21] and Friends
2887
2888To run the test suite under [AddressSanitizer][21] or any of its friends, use
2889the following commands:
2890
2891```
2892CFLAGS="-fsanitize=<sanitizer> ./configure -gO3 -m
2893make -j<cores>
2894make -j<cores> test
2895```
2896
2897where `<sanitizer>` is the correct name of the desired sanitizer. There is one
2898exception to the above: `UndefinedBehaviorSanitizer` should be run on a build
2899that has zero optimization, so for `UBSan`, use the following commands:
2900
2901```
2902CFLAGS="-fsanitize=undefined" ./configure -gO0 -m
2903make -j<cores>
2904make -j<cores> test
2905```
2906
2907### [Valgrind][20]
2908
2909To run the test suite under [Valgrind][20], run the following commands:
2910
2911```
2912./configure -gO3 -v
2913make -j<cores>
2914make -j<cores> test
2915```
2916
2917It really is that easy. I have directly added infrastructure to the build system
2918and the test suite to ensure that if [Valgrind][20] detects any memory errors or
2919any memory leaks at all, it will tell the test suite infrastructure to report an
2920error and exit accordingly.
2921
2922## POSIX Shell Scripts
2923
2924There is a lot of shell scripts in this repository, and every single one of them
2925is written in pure [POSIX `sh`][72].
2926
2927The reason that they are written in [POSIX `sh`][72] is for *portability*: POSIX
2928systems are only guaranteed to have a barebones implementation of `sh`
2929available.
2930
2931There are *many* snares for unwary programmers attempting to modify
2932[`configure.sh`][69], any of the scripts in this directory, [`strgen.sh`][9], or
2933any of the scripts in [`tests/`][77]. Here are some of them:
2934
29351.	No `bash`-isms.
29362.	Only POSIX standard utilities are allowed.
29373.	Only command-line options defined in the POSIX standard for POSIX utilities
2938	are allowed.
29394.	Only the standardized behavior of POSIX utilities is allowed.
29405.	Functions return data by *printing* it. Using `return` sets their exit code.
2941
2942In other words, the script must only use what is standardized in the [`sh`][72]
2943and [Shell Command Language][73] standards in POSIX. This is *hard*. It precludes
2944things like `local` and the `[[ ]]` notation.
2945
2946These are *enormous* restrictions and must be tested properly. I put out at
2947least one release with a change to `configure.sh` that wasn't portable. That was
2948an embarrassing mistake.
2949
2950The lack of `local`, by the way, is why variables in functions are named with
2951the form:
2952
2953```
2954_<function_name>_<var_name>
2955```
2956
2957This is done to prevent any clashes of variable names with already existing
2958names. And this applies to *all* shell scripts. However, there are a few times
2959when that naming convention is *not* used; all of them are because those
2960functions are required to change variables in the global scope.
2961
2962### Maintainer-Only Scripts
2963
2964If a script is meant to be used for maintainers (of `bc`, not package
2965maintainers), then rules 2, 3, and 4 don't need to be followed as much because
2966it is assumed that maintainers will be able to install whatever tools are
2967necessary to do the job.
2968
2969## Manuals
2970
2971The manuals for `bc` and `dc` are all generated, and the manpages for `bc`,
2972`dc`, and `bcl` are also generated.
2973
2974Why?
2975
2976I don't like the format of manpages, and I am not confident in my ability to
2977write them. Also, they are not easy to read on the web.
2978
2979So that explains why `bcl`'s manpage is generated from its markdown version. But
2980why are the markdown versions of the `bc` and `dc` generated?
2981
2982Because the content of the manuals needs to change based on the [build
2983type][81]. For example, if `bc` was built with no history support, it should not
2984have the **COMMAND LINE HISTORY** section in its manual. If it did, that would
2985just confuse users.
2986
2987So the markdown manuals for `bc` and `dc` are generated from templates
2988([`manuals/bc.1.md.in`][89] and [`manuals/dc.1.md.in`][90]). And from there,
2989the manpages are generated from the generated manuals.
2990
2991The generated manpage for `bcl` ([`manuals/bcl.3`][62]) is checked into version
2992control, and the generated markdown manuals and manpages for `bc`
2993([`manuals/bc`][79]) and `dc` ([`manuals/dc`][80]) are as well.
2994
2995This is because generating the manuals and manpages requires a heavy dependency
2996that only maintainers should care about: [Pandoc][92]. Because users [should not
2997have to install *any* dependencies][136], the files are generated, checked into
2998version control, and included in distribution tarballs.
2999
3000If you run [`configure.sh`][69], you have an easy way of generating the markdown
3001manuals and manpages: just run `make manpages`. This target calls
3002[`scripts/manpage.sh`][60] appropriately for `bc`, `dc`, and `bcl`.
3003
3004For more on how generating manuals and manpages works, see
3005[`scripts/manpage.sh`][60].
3006
3007## Locales
3008
3009The locale system of `bc` is enormously complex, but that's because
3010POSIX-compatible locales are terrible.
3011
3012How are they terrible?
3013
3014First, `gencat` does not work for generating cross-compilation. In other words,
3015it does not generate machine-portable files. There's nothing I can do about
3016this except for warn users.
3017
3018Second, the format of `.msg` files is...interesting. Thank goodness it is text
3019because otherwise, it would be impossible to get them right.
3020
3021Third, `.msg` files are not used. In other words, `gencat` exists. Why?
3022
3023Fourth, `$NLSPATH` is an awful way to set where and *how* to install locales.
3024
3025Yes, where and *how*.
3026
3027Obviously, from it's name, it's a path, and that's the where. The *how* is more
3028complicated.
3029
3030It's actually *not* a path, but a path template. It's a format string, and it
3031can have a few format specifiers. For more information on that, see [this
3032link][84]. But in essence, those format specifiers configure how each locale is
3033supposed to be installed.
3034
3035With all those problems, why use POSIX locales? Portability, as always. I can't
3036assume that `gettext` will be available, but I *can* pretty well assume that
3037POSIX locales will be available.
3038
3039The locale system of `bc` includes all files under [`locales/`][85],
3040[`scripts/locale_install.sh`][87], [`scripts/locale_uninstall.sh`][88],
3041[`scripts/functions.sh`][105], the `bc_err_*` constants in [`src/data.c`][131],
3042and the parts of the build system needed to activate it. There is also code in
3043[`src/vm.c`][58] (in `bc_vm_gettext()`) for loading the current locale.
3044
3045If the order of error messages and/or categories are changed, the order of
3046errors must be changed in the enum, the default error messages and categories in
3047[`src/data.c`][131], and all of the messages and categories in the `.msg` files
3048under [`locales/`][85].
3049
3050## Static Analysis
3051
3052I do *some* static analysis on `bc`.
3053
3054I used to use [Coverity][196], but I stopped using it when it started giving me
3055too many false positives and also because it had a vulnerability.
3056
3057However, I still use the [Clang Static Analyzer][197] through
3058[`scan-build`][19]. I only use it in debug mode because I have to add some
3059special code to make it not complain about things that are definitely not a
3060problem.
3061
3062The most frequent example of false positives is where a local is passed to a
3063function to be initialized. [`scan-build`][19] misses that fact, so I
3064pre-initialize such locals to prevent the warnings.
3065
3066To run `scan-build`, do the following:
3067
3068```
3069make clean
3070scan-build make
3071```
3072
3073`scan-build` will print its warnings to `stdout`.
3074
3075## Fuzzing
3076
3077The quality of this `bc` is directly related to the amount of fuzzing I did. As
3078such, I spent a lot of work making the fuzzing convenient and fast, though I do
3079admit that it took me a long time to admit that it did need to be faster.
3080
3081First, there were several things which make fuzzing fast:
3082
3083* Using [AFL++][125]'s deferred initialization.
3084* Splitting `bc`'s corpuses.
3085* Parallel fuzzing.
3086
3087Second, there are several things which make fuzzing convenient:
3088
3089* Preprepared input corpuses.
3090* [`scripts/fuzz_prep.sh`][119].
3091* `tmux` and `tmuxp` configs.
3092* [`scripts/afl.py`][94].
3093
3094### Fuzzing Performance
3095
3096Fuzzing with [AFL++][125] can be ***SLOW***. Spending the time to make it as
3097fast as possible is well worth the time.
3098
3099However, there is a caveat to the above: it is easy to make [AFL++][125] crash,
3100be unstable, or be unable to find "paths" (see [AFL++ Quickstart][129]) if the
3101performance enhancements are done poorly.
3102
3103To stop [AFL++][125] from crashing on test cases, and to be stable, these are
3104the requirements:
3105
3106* The state at startup must be *exactly* the same.
3107* The virtual memory setup at startup must be *exactly* the same.
3108
3109The first isn't too hard; it's the second that is difficult.
3110
3111`bc` allocates a lot of memory at start. ("A lot" is relative; it's far less
3112than most programs.) After going through an execution run, however, some of that
3113memory, while it could be cleared and reset, is in different places because of
3114vectors. Since vectors reallocate, their allocations are not guaranteed to be in
3115the same place.
3116
3117So to make all three work, I had to set up the deferred initialization and
3118persistent mode *before* any memory was allocated (except for `vm.jmp_bufs`,
3119which is probably what caused the stability to drop below 100%). However, using
3120deferred alone let me put the [AFL++][125] initialization further back. This
3121works because [AFL++][125] sets up a `fork()` server that `fork()`'s `bc` right
3122at that call. Thus, every run has the exact same virtual memory setup, and each
3123run can skip all of the setup code.
3124
3125I tested `bc` using [AFL++][125]'s deferred initialization, plus persistent
3126mode, plus shared memory fuzzing. In order to do it safely, with stability above
312799%, all of that was actually *slower* than using just deferred initialization
3128with the initialization *right before* `stdin` was read. And as a bonus, the
3129stability in that situation is 100%.
3130
3131As a result, my [AFL++][125] setup only uses deferred initialization. That's the
3132`__AFL_INIT()` call.
3133
3134(Note: there is one more big item that must be done in order to have 100%
3135stability: the pseudo-random number generator *must* start with *exactly* the
3136same seed for every run. This is set up with the `tmux` and `tmuxp` configs that
3137I talk about below in [Convenience][130]. This seed is set before the
3138`__AFL_INIT()` call, so setting it has no runtime cost for each run, but without
3139it, stability would be abysmal.)
3140
3141On top of that, while `dc` is plenty fast under fuzzing (because of a faster
3142parser and less test cases), `bc` can be slow. So I have split the `bc` input
3143corpus into three parts, and I set fuzzers to run on each individually. This
3144means that they will duplicate work, but they will also find more stuff.
3145
3146On top of all of that, each input corpus (the three `bc` corpuses and the one
3147`dc` corpus) is set to run with 4 fuzzers. That works out perfectly for two
3148reasons: first, my machine has 16 cores, and second, the [AFL++][125] docs
3149recommend 4 parallel fuzzers, at least, to run different "power schedules."
3150
3151### Convenience
3152
3153The preprepared input corpuses are contained in the
3154`tests/fuzzing/bc_inputs{1,2,3}/`, and `tests/fuzzing/dc_inputs` directories.
3155There are three `bc` directories and only one `dc` directory because `bc`'s
3156input corpuses are about three times as large, and `bc` is a larger program;
3157it's going to need much more fuzzing.
3158
3159(They do share code though, so fuzzing all of them still tests a lot of the same
3160math code.)
3161
3162The next feature of convenience is the [`scripts/fuzz_prep.sh`][119] script. It
3163assumes the existence of `afl-clang-lto` in the `$PATH`, but if that exists, it
3164automatically configures and builds `bc` with a fuzz-ideal build.
3165
3166A fuzz-ideal build has several things:
3167
3168* `afl-clang-lto` as the compiler. (See [AFL++ Quickstart][129].)
3169* Debug mode, to crash as easily as possible.
3170* Full optimization (including [Link-Time Optimization][126]), for performance.
3171* [AFL++][125]'s deferred initialization (see [Fuzzing Performance][127] above).
3172* And `AFL_HARDEN=1` during the build to harden the build. See the [AFL++][125]
3173  documentation for more information.
3174
3175There is one big thing that a fuzz-ideal build does *not* have: it does not use
3176[AFL++][125]'s `libdislocator.so`. This is because `libdislocator.so` crashes if
3177it fails to allocate memory. I do not want to consider those as crashes because
3178my `bc` does, in fact, handle them gracefully by exiting with a set error code.
3179So `libdislocator.so` is not an option.
3180
3181However, to add to [`scripts/fuzz_prep.sh`][119] making a fuzz-ideal build, in
3182`tests/fuzzing/`, there are two `yaml` files: [`tests/fuzzing/bc_afl.yaml`][120]
3183and [`tests/fuzzing/bc_afl_continue.yaml`][121]. These files are meant to be
3184used with [`tmux`][122] and [`tmuxp`][123]. While other programmers will have to
3185adjust the `start_directory` item, once it is adjusted, then using this command:
3186
3187```
3188tmuxp load tests/fuzzing/bc_afl.yaml
3189```
3190
3191will start fuzzing.
3192
3193In other words, to start fuzzing, the sequence is:
3194
3195```
3196./scripts/fuzz_prep.sh
3197tmuxp load tests/fuzzing/bc_afl.yaml
3198```
3199
3200Doing that will load, in `tmux`, 16 separate instances of [AFL++][125], 12 on
3201`bc` and 4 on `dc`. The outputs will be put into the
3202`tests/fuzzing/bc_outputs{1,2,3}/` and `tests/fuzzing/dc_outputs/` directories.
3203
3204(Note that loading that config will also delete all unsaved [AFL++][125] output
3205from the output directories.)
3206
3207Sometimes, [AFL++][125] will report crashes when there are none. When crashes
3208are reported, I always run the following command:
3209
3210```
3211./scripts/afl.py <dir>
3212```
3213
3214where `dir` is one of `bc1`, `bc2`, `bc3`, or `dc`, depending on which of the
321516 instances reported the crash. If it was one of the first four (`bc11` through
3216`bc14`), I use `bc1`. If it was one of the second four (`bc21` through `bc24`, I
3217use `bc2`. If it was one of the third four (`bc31` through `bc34`, I use `bc3`.
3218And if it was `dc`, I use `dc`.
3219
3220The [`scripts/afl.py`][94] script will report whether [AFL++][125] correctly
3221reported a crash or not. If so, it will copy the crashing test case to
3222`.test.txt` and tell you whether it was from running it as a file or through
3223`stdin`.
3224
3225From there, I personally always investigate the crash and fix it. Then, when the
3226crash is fixed, I either move `.test.txt` to `tests/bc/errors/<idx>.txt` as an
3227error test (if it produces an error) or I create a new `tests/bc/misc<idx>.txt`
3228test for it and a corresponding results file. (See [Test Suite][124] for more
3229information about the test suite.) In either case, `<idx>` is the next number
3230for a file in that particular place. For example, if the last file in
3231`tests/bc/errors/` is `tests/bc/errors/18.txt`, I move `.test.txt` to
3232`tests/bc/error/19.txt`.
3233
3234Then I immediately run [`scripts/afl.py`][94] again to find the next crash
3235because often, [AFL++][125] found multiple test cases that trigger the same
3236crash. If it finds another, I repeat the process until it is happy.
3237
3238Once it *is* happy, I do the same `fuzz_prep.sh`, `tmuxp load` sequence and
3239restart fuzzing. Why do I restart instead of continuing? Because with the
3240changes, the test outputs could be stale and invalid.
3241
3242However, there *is* a case where I continue: if [`scripts/afl.py`][94] finds
3243that every crash reported by [AFL++][125] is invalid. If that's the case, I can
3244just continue with the command:
3245
3246```
3247tmuxp load tests/fuzzing/bc_afl_continue.yaml
3248```
3249
3250(Note: I admit that I usually run [`scripts/afl.py`][94] while the fuzzer is
3251still running, so often, I don't find a need to continue since there was no
3252stop. However, the capability is there, if needed.)
3253
3254In addition, my fuzzing setup, including the `tmux` and `tmuxp` configs,
3255automatically set up [AFL++][125] power schedules (see [Fuzzing
3256Performance][127] above). They also set up the parallel fuzzing such that there
3257is one fuzzer in each group of 4 that does deterministic fuzzing. It's always
3258the first one in each group.
3259
3260For more information about deterministic fuzzing, see the [AFL++][125]
3261documentation.
3262
3263### Corpuses
3264
3265I occasionally add to the input corpuses. These files come from new files in the
3266[Test Suite][124]. In fact, I use soft links when the files are the same.
3267
3268However, when I add new files to an input corpus, I sometimes reduce the size of
3269the file by removing some redundancies.
3270
3271And then, when adding to the `bc` corpuses, I try to add them evenly so that
3272each corpus will take about the same amount of time to get to a finished state.
3273
3274### [AFL++][125] Quickstart
3275
3276The way [AFL++][125] works is complicated.
3277
3278First, it is the one to invoke the compiler. It leverages the compiler to add
3279code to the binary to help it know when certain branches are taken.
3280
3281Then, when fuzzing, it uses that branch information to generate information
3282about the "path" that was taken through the binary.
3283
3284I don't know what AFL++ counts as a new path, but each new path is added to an
3285output corpus, and it is later used as a springboard to find new paths.
3286
3287This is what makes AFL++ so effective: it's not just blindly thrashing a binary;
3288it adapts to the binary by leveraging information about paths.
3289
3290### Fuzzing Runs
3291
3292For doing a fuzzing run, I expect about a week or two where my computer is
3293basically unusable, except for text editing and light web browsing.
3294
3295Yes, it can take two weeks for me to do a full fuzzing run, and that does *not*
3296include the time needed to find and fix crashes; it only counts the time on the
3297*last* run, the one that does not find any crashes. This means that the entire
3298process can take a month or more.
3299
3300What I use as an indicator that the fuzzing run is good enough is when the
3301number of "Pending" paths (see [AFL++ Quickstart][129] above) for all fuzzer
3302instances, except maybe the deterministic instances, is below 50. And even then,
3303I try to let deterministic instances get that far as well.
3304
3305You can see how many pending paths are left in the "path geometry" section of
3306the [AFL++][125] dashboard.
3307
3308Also, to make [AFL++][125] quit, you need to send it a `SIGINT`, either with
3309`Ctrl+c` or some other method. It will not quit until you tell it to.
3310
3311### Radamsa
3312
3313I rarely use [Radamsa][99] instead of [AFL++][125]. In fact, it's only happened
3314once.
3315
3316The reason I use [Radamsa][99] instead of [AFL++][125] is because it is easier
3317to use with varying command-line arguments, which was needed for testing `bc`'s
3318command-line expression parsing code, and [AFL++][125] is best when testing
3319input from `stdin`.
3320
3321[`scripts/radamsa.sh`][100] does also do fuzzing on the [AFL++][125] inputs, but
3322it's not as effective at that, so I don't really use it for that either.
3323
3324[`scripts/radamsa.sh`][100] and [Radamsa][99] were only really used once; I have
3325not had to touch the command-line expression parsing code since.
3326
3327### [AddressSanitizer][21] with Fuzzing
3328
3329One advantage of using [AFL++][125] is that it saves every test case that
3330generated a new path (see [AFL++ Quickstart][129] above), and it doesn't delete
3331them when the user makes it quit.
3332
3333Keeping them around is not a good idea, for several reasons:
3334
3335* They are frequently large.
3336* There are a lot of them.
3337* They go stale; after `bc` is changed, the generated paths may not be valid
3338  anymore.
3339
3340However, before they are deleted, they can definitely be leveraged for even
3341*more* bug squashing by running *all* of the paths through a build of `bc` with
3342[AddressSanitizer][21].
3343
3344This can easily be done with these four commands:
3345
3346```
3347./scripts/fuzz_prep.sh -a
3348./scripts/afl.py --asan bc1
3349./scripts/afl.py --asan bc2
3350./scripts/afl.py --asan bc3
3351./scripts/afl.py --asan dc
3352```
3353
3354(By the way, the last four commands could be run in separate terminals to do the
3355processing in parallel.)
3356
3357These commands build an [ASan][21]-enabled build of `bc` and `dc` and then they
3358run `bc` and `dc` on all of the found crashes and path output corpuses. This is
3359to check that no path or crash has found any memory errors, including memory
3360leaks.
3361
3362Because the output corpuses can contain test cases that generate infinite loops
3363in `bc` or `dc`, [`scripts/afl.py`][94] has a timeout of 8 seconds, which is far
3364greater than the timeout that [AFL++][125] uses and should be enough to catch
3365any crash.
3366
3367If [AFL++][125] fails to find crashes *and* [ASan][21] fails to find memory
3368errors on the outputs of [AFL++][125], that is an excellent indicator of very
3369few bugs in `bc`, and a release can be made with confidence.
3370
3371## Code Concepts
3372
3373This section is about concepts that, if understood, will make it easier to
3374understand the code as it is written.
3375
3376The concepts in this section are not found in a single source file, but they are
3377littered throughout the code. That's why I am writing them all down in a single
3378place.
3379
3380### POSIX Mode
3381
3382POSIX mode is `bc`-only.
3383
3384In fact, POSIX mode is two different modes: Standard Mode and Warning Mode.
3385These modes are designed to help users write POSIX-compatible `bc` scripts.
3386
3387#### Standard Mode
3388
3389Standard Mode is activated with the `-s` or `--standard` flags.
3390
3391In this mode, `bc` will error if any constructs are used that are not strictly
3392compatible with the [POSIX `bc` specification][2].
3393
3394#### Warning Mode
3395
3396Warning Mode is activated with the `-w` or `--warn` flags.
3397
3398In this mode, `bc` will issue warnings, but continue, if any constructs are used
3399that are not strictly compatible with the [POSIX `bc` specification][2].
3400
3401### Memory Management
3402
3403The memory management in `bc` is simple: everything is owned by one thing.
3404
3405If something is in a vector, it is owned by that vector.
3406
3407If something is contained in a struct, it is owned by that struct with one
3408exception: structs can be given pointers to other things, but only if those
3409other things will outlast the struct itself.
3410
3411As an example, the `BcParse` struct has a pointer to the one `BcProgram` in
3412`bc`. This is okay because the program is initialized first and deallocated
3413last.
3414
3415In other words, it's simple: if a field in a struct is a pointer, then unless
3416that pointer is directly allocated by the struct (like the vector array or the
3417number limb array), that struct does not own the item at that pointer.
3418Otherwise, the struct *does* own the item.
3419
3420### [Async-Signal-Safe][115] Signal Handling
3421
3422`bc` is not the typical Unix utility. Most Unix utilities are I/O bound, but
3423`bc` is, by and large, CPU-bound. This has several consequences, but the biggest
3424is that there is no easy way to allow signals to interrupt it.
3425
3426This consequence is not obvious, but it comes from the fact that a lot of I/O
3427operations can be interrupted and return [`EINTR`][198]. This makes such I/O
3428calls natural places for allowing signals to interrupt execution, even when the
3429signal comes during execution, and not interrupting I/O calls. The way this is
3430done is setting a flag in the signal handler, which is checked around the time
3431of the I/O call, when it is convenient.
3432
3433Alternatively, I/O bound programs can use the [self-pipe trick][199].
3434
3435Neither of these are possible in `bc` because the execution of math code can
3436take a long time. If a signal arrives during this long execution time, setting a
3437flag like an I/O bound application and waiting until the next I/O call could
3438take seconds, minutes, hours, or even days. (Last I checked, my `bc` takes a
3439week to calculate a million digits of pi, and it's not slow as far as `bc`
3440implementations go.)
3441
3442Thus, using just the technique of setting the flag just will not work for an
3443interactive calculator.
3444
3445Well, it can, but it requires a lot of code and massive inefficiencies. I know
3446this because that was the original implementation.
3447
3448The original implementation set a flag and just exit the signal handler. Then,
3449on just about every loop header, I have a check for the signal flag. These
3450checks happened on every iteration of every loop. It was a massive waste because
3451it was polling, and [polling is evil][200].
3452
3453So for version [3.0.0][32], I expended a lot of effort to change the
3454implementation.
3455
3456In the new system, code *outside* the signal handler sets a flag (`vm.sig_lock`)
3457to tell the signal handler whether it can use `longjmp()` to stop the current
3458execution. If so, it does. If not, it sets a flag, which then is used by the
3459code outside the signal handler that set the `vm.sig_lock` flag. When that code
3460unsets `vm.sig_lock`, it checks to see if a signal happened, and if so, that
3461code executes the `longjmp()` and stops the current execution.
3462
3463Other than that, the rest of the interrupt-based implementation is best
3464described in the [Error Handling][97].
3465
3466However, there are rules for signal handlers that I must lay out.
3467
3468First, signal handlers can only call [async-signal-safe][115] functions.
3469
3470Second, any field set or read by both the signal handler and normal code must be
3471a `volatile sig_atomic_t`.
3472
3473Third, when setting such fields, they must be set to constants and no math can
3474be done on them. This restriction and the above restriction exist in order to
3475ensure that the setting of the fields is always atomic with respect to signals.
3476
3477These rules exist for *any* code using Unix signal handlers, not just `bc`.
3478
3479#### Vectors and Numbers
3480
3481Vectors and numbers needed special consideration with the interrupt-based signal
3482handling.
3483
3484When vectors and numbers are about to allocate, or *reallocate* their arrays,
3485they need to lock signals to ensure that they do not call `malloc()` and friends
3486and get interrupted by a signal because, as you will see in the [Error
3487Handling][97] section, `longjmp()` cannot be used in a signal handler if it may
3488be able to interrupt a non-[async-signal-safe][115] function like `malloc()` and
3489friends.
3490
3491### Asserts
3492
3493If you asked me what procedure is used the most in `bc`, I would reply without
3494hesitation, "`assert()`."
3495
3496I use `assert()` everywhere. In fact, it is what made [fuzzing][82] with
3497[AFL++][125] so effective. [AFL++][125] is incredibly good at finding crashes,
3498and a failing `assert()` counts as one.
3499
3500So while a lot of bad bugs might have corrupted data and *not* caused crashes,
3501because I put in so many `assert()`'s, they were *turned into* crashing bugs,
3502and [AFL++][125] found them.
3503
3504By far, the most bugs it found this way was in the `bc` parser. (See the [`bc`
3505Parsing][110] for more information.) And even though I was careful to put
3506`assert()`'s everywhere, most parser bugs manifested during execution of
3507bytecode because the virtual machine assumes the bytecode is valid.
3508
3509Sidenote: one of those bugs caused an infinite recursion when running the sine
3510(`s()`) function in the math library, so yes, parser bugs can be *very* weird.
3511
3512Anyway, they way I did `assert()`'s was like this: whenever I realized that I
3513had put assumptions into the code, I would put an `assert()` there to test it
3514**and** to *document* it.
3515
3516Yes, documentation. In fact, by far the best documentation of the code in `bc`
3517is actually the `assert()`'s. The only time I would not put an `assert()` to
3518test an assumption is if that assumption was already tested by an `assert()`
3519earlier.
3520
3521As an example, if a function calls another function and passes a pointer that
3522the caller previously `assert()`'ed was *not* `NULL`, then the callee does not
3523have to `assert()` it too, unless *also* called by another function that does
3524not `assert()` that.
3525
3526At first glance, it may seem like putting asserts for pointers being non-`NULL`
3527everywhere would actually be good, but unfortunately, not for fuzzing. Each
3528`assert()` is a branch, and [AFL++][125] rates its own effectiveness based on
3529how many branches it covers. If there are too many `assert()`'s, it may think
3530that it is not being effective and that more fuzzing is needed.
3531
3532This means that `assert()`'s show up most often in two places: function
3533preconditions and function postconditions.
3534
3535Function preconditions are `assert()`'s that test conditions relating to the
3536arguments a function was given. They appear at the top of the function, usually
3537before anything else (except maybe initializing a local variable).
3538
3539Function postconditions are `assert()`'s that test the return values or other
3540conditions when a function exits. These are at the bottom of a function or just
3541before a `return` statement.
3542
3543The other `assert()`'s cover various miscellaneous assumptions.
3544
3545If you change the code, I ***HIGHLY*** suggest that you use `assert()`'s to
3546document your assumptions. And don't remove them when [AFL++][125] gleefully
3547crashes `bc` and `dc` over and over again.
3548
3549### Vectors
3550
3551In `bc`, vectors mean resizable arrays, and they are the most fundamental piece
3552of code in the entire codebase.
3553
3554I had previously written a [vector implementation][112], which I used to guide
3555my decisions, but I wrote a new one so that `bc` would not have a dependency. I
3556also didn't make it as sophisticated; the one in `bc` is very simple.
3557
3558Vectors store some information about the type that they hold:
3559
3560* The size (as returned by `sizeof`).
3561* An enum designating the destructor.
3562
3563If the destructor is `BC_DTOR_NONE`, it is counted as the type not having a
3564destructor.
3565
3566But by storing the size, the vector can then allocate `size * cap` bytes, where
3567`cap` is the capacity. Then, when growing the vector, the `cap` is doubled again
3568and again until it is bigger than the requested size.
3569
3570But to store items, or to push items, or even to return items, the vector has to
3571figure out where they are, since to it, the array just looks like an array of
3572bytes.
3573
3574It does this by calculating a pointer to the underlying type with
3575`v + (i * size)`, where `v` is the array of bytes, `i` is the index of the
3576desired element, and `size` is the size of the underlying type.
3577
3578Doing that, vectors can avoid undefined behavior (because `char` pointers can
3579be cast to any other pointer type), while calculating the exact position of
3580every element.
3581
3582Because it can do that, it can figure out where to push new elements by
3583calculating `v + (len * size)`, where `len` is the number of items actually in
3584the vector.
3585
3586By the way, `len` is different from `cap`. While `cap` is the amount of storage
3587*available*, `len` is the number of actual elements in the vector at the present
3588point in time.
3589
3590Growing the vector happens when `len` is equal to `cap` *before* pushing new
3591items, not after.
3592
3593To add a destructor, you need to add an enum item to `BcDtorType` in
3594[`include/vector.h`][174] and add the actual destructor in the same place as the
3595enum item in the `bc_vec_dtors[]` array in [`src/data.c`][131].
3596
3597#### Pointer Invalidation
3598
3599There is one big danger with the vectors as currently implemented: pointer
3600invalidation.
3601
3602If a piece of code receives a pointer from a vector, then adds an item to the
3603vector before they finish using the pointer, that code must then update the
3604pointer from the vector again.
3605
3606This is because any pointer inside the vector is calculated based off of the
3607array in the vector, and when the vector grows, it can `realloc()` the array,
3608which may move it in memory. If that is done, any pointer returned by
3609`bc_vec_item()`, `bc_vec_top()` and `bc_vec_item_rev()` will be invalid.
3610
3611This fact was the single most common cause of crashes in the early days of this
3612`bc`; wherever I have put a comment about pointers becoming invalidated and
3613updating them with another call to `bc_vec_item()` and friends, *do **NOT**
3614remove that code!*
3615
3616#### Maps
3617
3618Maps in `bc` are...not.
3619
3620They are really a combination of two vectors. Those combinations are easily
3621recognized in the source because one vector is named `<name>s` (plural), and the
3622other is named `<name>_map`.
3623
3624There are currently three, all in `BcProgram`:
3625
3626* `fns` and `fn_map` (`bc` functions).
3627* `vars` and `var_map` (variables).
3628* `arrs` and `arr_map` (arrays).
3629
3630They work like this: the `<name>_map` vector holds `BcId`'s, which just holds a
3631string and an index. The string is the name of the item, and the index is the
3632index of that item in the `<name>s` vector.
3633
3634Obviously, I could have just done a linear search for items in the `<name>s`
3635vector, but that would be slow with a lot of functions/variables/arrays.
3636Instead, I ensure that whenever an item is inserted into the `<name>_map`
3637vector, the item is inserted in sorted order. This means that the `<name>_map`
3638is always sorted (by the names of the items).
3639
3640So when looking up an item in the "map", what is really done is this:
3641
36421.	A binary search is carried out on the names in the `<name>_map` vector.
36432.	When one is found, it returns the index in the `<name>_map` vector where the
3644	item was found.
36453.	This index is then used to retrieve the `BcId`.
36464.	The index from the `BcId` is then used to index into the `<name>s` vector,
3647	which returns the *actual* desired item.
3648
3649Why were the `<name>s` and `<name>_map` vectors not combined for ease? The
3650answer is that sometime, when attempting to insert into the "map", code might
3651find that something is already there. For example, a function with that name may
3652already exist, or the variable might already exist.
3653
3654If the insert fails, then the name already exists, and the inserting code can
3655forego creating a new item to put into the vector. However, if there is no item,
3656the inserting code must create a new item and insert it.
3657
3658If the two vectors were combined together, it would not be possible to separate
3659the steps such that creating a new item could be avoided if it already exists.
3660
3661#### Slabs and Slab Vectors
3662
3663`bc` allocates *a lot* of small strings, and small allocations are the toughest
3664for general-purpose allocators to handle efficiently.
3665
3666Because of that reason, I decided to create a system for allocating small
3667strings using something that I call a "slab vector" after [slab
3668allocators][201].
3669
3670These vectors allocate what I call "slabs," which are just an allocation of a
3671single page with a length to tell the slab how much of the slab is used.
3672
3673The vector itself holds slabs, and when the slab vector is asked to allocate a
3674string, it attempts to in the last slab. If that slab cannot do so, it allocates
3675a new slab and allocates from that.
3676
3677There is one exception: if a string is going to be bigger than 128 bytes, then
3678the string is directly allocated, and a slab is created with that pointer and a
3679length of `SIZE_MAX`, which tells the slab vector that it is a direct
3680allocation. Then, the last slab is pushed into the next spot and the new special
3681slab is put into the vacated spot. This ensures that a non-special slab is
3682always last.
3683
3684### Command-Line History
3685
3686When I first wrote `bc`, I immediately started using it in order to eat my own
3687dog food.
3688
3689It sucked, and the biggest reason why was because of the lack of command-line
3690history.
3691
3692At first, I just dealt with it, not knowing how command-line history might be
3693implemented.
3694
3695Eventually, I caved and attempted to adapt [`linenoise-mob`][28], which I had
3696known about for some time.
3697
3698It turned out to be easier than I thought; the hardest part was the tedious
3699renaming of everything to fit the `bc` naming scheme.
3700
3701Understanding command-line history in `bc` is really about understanding VT-100
3702escape codes, so I would start there.
3703
3704Now, the history implementation of `bc` has been adapted far beyond that initial
3705adaptation to make the command-line history implementation perfect for `bc`
3706alone, including integrating it into `bc`'s [Custom I/O][114] and making sure
3707that it does not disturb output that did not end with a newline.
3708
3709On top of that, at one point, I attempted to get history to work on Windows. It
3710barely worked after a lot of work and a lot of portability code, but even with
3711all of that, it does not have at least one feature: multi-line pasting from the
3712clipboard.
3713
3714### Error Handling
3715
3716The error handling on `bc` got an overhaul for version [`3.0.0`][32], and it
3717became one of the things that taught me the most about C in particular and
3718programming in general.
3719
3720Before then, error handling was manual. Almost all functions returned a
3721`BcStatus` indicating if an error had occurred. This led to a proliferation of
3722lines like:
3723
3724```
3725if (BC_ERR(s)) return s;
3726```
3727
3728In fact, a quick and dirty count of such lines in version `2.7.2` (the last
3729version before [`3.0.0`][32] turned up 252 occurrences of that sort of line.
3730
3731And that didn't even guarantee that return values were checked *everywhere*.
3732
3733But before I can continue, let me back up a bit.
3734
3735From the beginning, I decided that I would not do what GNU `bc` does on errors;
3736it tries to find a point at which it can recover. Instead, I decided that I
3737would have `bc` reset to a clean slate, which I believed, would reduce the
3738number of bugs where an unclean state caused errors with continuing execution.
3739
3740So from the beginning, errors would essentially unwind the stack until they got
3741to a safe place from which to clean the slate, reset, and ask for more input.
3742
3743Well, if that weren't enough, `bc` also has to handle [POSIX signals][113]. As
3744such, it had a signal handler that set a flag. But it could not safely interrupt
3745execution, so that's all it could do.
3746
3747In order to actually respond to the signal, I had to litter checks for the flag
3748*everywhere* in the code. And I mean *everywhere*. They had to be checked on
3749every iteration of *every* loop. They had to be checked going into and out of
3750certain functions.
3751
3752It was a mess.
3753
3754But fortunately for me, signals did the same thing that errors did: they unwound
3755the stack to the *same* place.
3756
3757Do you see where I am going with this?
3758
3759It turns out that what I needed was a [async-signal-safe][115] form of what
3760programmers call "exceptions" in other languages.
3761
3762I knew that [`setjmp()`][116] and [`longjmp()`][117] are used in C to implement
3763exceptions, so I thought I would learn how to use them. How hard could it be?
3764
3765Quite hard, it turns out, especially in the presence of signals. And that's
3766because there are a few big snares:
3767
37681.	The value of any local variables are not guaranteed to be preserved after a
3769	`longjmp()` back into a function.
37702.	While `longjmp()` is required to be [async-signal-safe][115], if it is
3771	invoked by a signal handler that interrupted a non-[async-signal-safe][115]
3772	function, then the behavior is undefined.
3773
3774Oh boy.
3775
3776For number 1, the answer to this is to hide data that must stay changed behind
3777pointers. Only the *pointers* are considered local, so as long as I didn't do
3778any modifying pointer arithmetic, pointers and their data would be safe. For
3779cases where I have local data that must change and stay changed, I needed to
3780*undo* the `setjmp()`, do the change, and the *redo* the `setjmp()`.
3781
3782For number 2, `bc` needs some way to tell the signal handler that it cannot do a
3783`longjmp()`. This is done by "locking" signals with a `volatile sig_atomic_t`.
3784(For more information, see the [Async-Signal-Safe Signal Handling][173]
3785section.) For every function that calls a function that is not
3786async-signal-safe, they first need to use `BC_SIG_LOCK` to lock signals, and
3787afterward, use `BC_SIG_UNLOCK` to unlock signals.
3788
3789`BC_SIG_UNLOCK` has another requirement: it must check for signals or errors and
3790jump if necessary.
3791
3792On top of all of that, *all* functions with cleanup needed to be able to run
3793their cleanup. This meant that `longjmp()` could not just jump to the finish; it
3794had to start what I call a "jump series," using a stack of `jmp_buf`'s
3795(`jmp_bufs` in `BcVm`). Each `longjmp()` uses the top of the `jmp_bufs` stack to
3796execute its jump. Then, if the cleanup code was executed because of a jump, the
3797cleanup code was responsible for continuing the jump series by popping the
3798previous item off the stack and using the new top of the stack for a jump.
3799
3800In this way, C++-style exceptions were implemented in pure C. Not fun, but it
3801works. However, the order of operations matters, especially in the macros that
3802help implement the error handling.
3803
3804For example, in `BC_UNSETJMP`, signals are unlocked before checking for signals.
3805If a signal comes between, that's fine; it will still cause a jump to the right
3806place. However, disabling the lock after could mean that a signal could come
3807*after* checking for signals, but before signals were unlocked, causing the
3808handling of the signal to be delayed.
3809
3810#### Custom I/O
3811
3812Why did I implement my own buffered I/O for `bc`? Because I use `setjmp()` and
3813`longjmp()` for error handling (see the [Error Handling][97] section), and the
3814buffered I/O in `libc` does not interact well with the use of those procedures;
3815all of the buffered I/O API is basically non-[async-signal-safe][115].
3816
3817Implementing custom buffered I/O had other benefits. First, it allowed me to
3818tightly integrate history with the I/O code. Second, it allowed me to make
3819changes to history in order to make it adapt to user prompts.
3820
3821### Lexing
3822
3823To simplify parsing, both calculators use lexers to turn the text into a more
3824easily-parsable form.
3825
3826While some tokens are only one character long, others require many tokens, and
3827some of those need to store all of the text corresponding to the token for use
3828by the parsers. Tokens that need to store their corresponding text include, but
3829are not limited to:
3830
3831* Strings.
3832* Numbers.
3833* Identifiers.
3834
3835For this purpose, the lexer has a [vector][111] named `str` to store the data
3836for tokens. This data is overwritten if another token is lexed that needs to
3837store data, so the parsers need to copy the data before calling the lexer again.
3838
3839Both lexers do some of the same things:
3840
3841* Lex identifiers into tokens, storing the identifier in `str`.
3842* Lex number strings into tokens, storing the string in `str`.
3843* Lex whitespace.
3844* Lex comments.
3845
3846Other than that, and some common plumbing, the lexers have separate code.
3847
3848#### `dc` Lexing
3849
3850The `dc` lexer is remarkably simple; in fact, besides [`src/main.c`][205],
3851[`src/bc.c`][40], and [`src/dc.c`][44], which just contain one function each,
3852the only file smaller that [`src/dc_lex.c`][45] is [`src/args.c`][206], which
3853just processes command-line arguments after they are parsed by
3854[`src/opt.c`][51].
3855
3856For most characters, the `dc` lexer is able to convert directly from the
3857character to its corresponding token. This happens using `dc_lex_tokens[]` in
3858[`src/data.c`][131].
3859
3860`dc`'s lexer also has to lex the register name after lexing tokens for commands
3861that need registers.
3862
3863And finally, `dc`'s lexer needs to parse `dc` strings, which is the only part of
3864the `dc` lexer that is more complex than the `bc` lexer. This is because `dc`
3865strings need to have a balanced number of brackets.
3866
3867#### `bc` Lexing
3868
3869The `bc` lexer is fairly simple. It does the following things:
3870
3871* Lexes `bc` strings.
3872* Lexes `bc` identifiers. This is necessary because this is how `bc` keywords
3873  are lexed. After ensuring that an identifier is not a keyword, the `bc` lexer
3874  allows the common identifier function to take over.
3875* Turns characters and groups of characters into `bc` operator tokens.
3876
3877### Parsing
3878
3879The difference between parsing `bc` and `dc` code is...vast. The `dc` parser is
3880simple, while the `bc` parser is the most complex piece of code in the entire
3881codebase.
3882
3883However, they both do some of the same things.
3884
3885First, the parsers do *not* use [abstract syntax trees][207]; instead, they
3886directly generate the bytecode that will be executed by the `BcProgram` code.
3887Even in the case of `bc`, this heavily simplifies the parsing because the
3888[Shunting-Yard Algorithm][109] is designed to generate [Reverse Polish
3889Notation][108], which is basically directly executable.
3890
3891Second, any extra data that the `BcProgram` needs for execution is stored into
3892functions (see the [Functions][208] section). These include constants and
3893strings.
3894
3895#### `dc` Parsing
3896
3897The parser for `dc`, like its lexer, is remarkably simple. In fact, the easiness
3898of lexing and parsing [Reverse Polish notation][108] is probably why it was used
3899for `dc` when it was first created at Bell Labs.
3900
3901For most tokens, the `dc` parser is able to convert directly from the token
3902to its corresponding instruction. This happens using `dc_parse_insts[]` in
3903[`src/data.c`][131].
3904
3905`dc`'s parser also has to parse the register name for commands that need
3906registers. This is the most complex part of the `dc` parser; each different
3907register command needs to be parsed differently because most of them require two
3908or more instructions to execute properly.
3909
3910For example, storing in a register requires a swap instruction and an assignment
3911instruction.
3912
3913Another example are conditional execution instructions; they need to produce the
3914instruction for the condition, and then they must parse a possible "else" part,
3915which might not exist.
3916
3917#### `bc` Parsing
3918
3919`bc`'s parser is, by far, the most sensitive piece of code in this software, and
3920there is a very big reason for that: `bc`'s standard is awful and defined a very
3921poor language.
3922
3923The standard says that either semicolons or newlines can end statements. Trying
3924to parse the end of a statement when it can either be a newline or a semicolon
3925is subtle. Doing it in the presence of control flow constructs that do not have
3926to use braces is even harder.
3927
3928And then comes the biggest complication of all: `bc` has to assume that it is
3929*always* at a REPL (Read-Eval-Print Loop). `bc` is, first and foremost, an
3930*interactive* utility.
3931
3932##### Flags
3933
3934All of this means that `bc` has to be able to partially parse something, store
3935enough data to recreate that state later, and return, making sure to not
3936execute anything in the meantime.
3937
3938*That* is what the flags in [`include/bc.h`][106] are: they are the state that
3939`bc` is saving for itself.
3940
3941It saves them in a stack, by the way, because it's possible to nest
3942structures, just like any other programming language. Thus, not only does it
3943have to store state, it needs to do it arbitrarily, and still be able to
3944come back to it.
3945
3946So `bc` stores its parser state with flags in a stack. Careful setting of these
3947flags, along with properly using them and maintaining the flag stack, are what
3948make `bc` parsing work, but it's complicated. In fact, as I mentioned, the `bc`
3949parser is the single most subtle, fickle, and sensitive piece of code in the
3950entire codebase. Only one thing came close once: square root, and that was only
3951sensitive because I wrote it wrong. This parser is pretty good, and it is
3952*still* sensitive. And flags are the reason why.
3953
3954For more information about what individual flags there are, see the comments in
3955[`include/bc.h`][106].
3956
3957##### Labels
3958
3959`bc`'s language is Turing-complete. That means that code needs the ability to
3960jump around, specifically to implement control flow like `if` statements and
3961loops.
3962
3963`bc` handles this while parsing with what I called "labels."
3964
3965Labels are markers in the bytecode. They are stored in functions alongside the
3966bytecode, and they are just indices into the bytecode.
3967
3968When the `bc` parser creates a label, it pushes an index onto the labels array,
3969and the index of the label in that array is the index that will be inserted into
3970the bytecode.
3971
3972Then, when a jump happens, the index pulled out of the bytecode is used to index
3973the labels array, and the label (index) at the index is then used to set the
3974instruction pointer.
3975
3976##### Cond Labels
3977
3978"Cond" labels are so-called because they are used by conditionals.
3979
3980The key to them is that they come *before* the code that uses them. In other
3981words, when jumping to a condition, code is jumping *backwards*.
3982
3983This means that when a cond label is created, the value that should go there is
3984well-known. Cond labels are easy.
3985
3986However, they are still stored on a stack so that the parser knows what cond
3987label to use.
3988
3989##### Exit Labels
3990
3991Exit labels are not so easy.
3992
3993"Exit" labels are so-called because they are used by code "exiting" out of `if`
3994statements or loops.
3995
3996The key to them is that they come *after* the code that uses them. In other
3997words, when jumping to an exit, code is jumping *forwards*.
3998
3999But this means that when an exit label is created, the value that should go
4000there is *not* known. The code that needs it must be parsed and generated first.
4001
4002That means that exit labels are created with the index of `SIZE_MAX`, which is
4003then specifically checked for with an assert in `bc_program_exec()` before using
4004those indices.
4005
4006There should ***NEVER*** be a case when an exit label is not filled in properly
4007if the parser has no bugs. This is because every `if` statement, every loop,
4008must have an exit, so the exit must be set. If not, there is a bug.
4009
4010Exit labels are also stored on a stack so that the parser knows what exit label
4011to use.
4012
4013##### Expression Parsing
4014
4015`bc` has expressions like you might expect in a typical programming language.
4016This means [infix notation][107].
4017
4018One thing about infix notation is that you can't just generate code straight
4019from it like you can with [Reverse Polish notation][108]. It requires more work
4020to shape it into a form that works for execution on a stack machine.
4021
4022That extra work is called the [Shunting-Yard algorithm][109], and the form it
4023translates infix notation into is...[Reverse Polish notation][108].
4024
4025In order to understand the rest of this section, you must understand the
4026[Shunting-Yard algorithm][109]. Go do that before you read on.
4027
4028###### Operator Stack
4029
4030In `bc`, the [Shunting-Yard algorithm][109] is implemented with bytecode as the
4031output and an explicit operator stack (the `ops` field in `BcParse`) as the
4032operator stack. It stores tokens from `BcLex`.
4033
4034However, there is one **HUGE** hangup: multiple expressions can stack. This
4035means that multiple expressions can be parsed at one time (think an array element
4036expression in the middle of a larger expression). Because of that, we need to
4037keep track of where the previous expression ended. That's what `start` parameter
4038to `bc_parse_operator()` is.
4039
4040Parsing multiple expressions on one operator stack only works because
4041expressions can only *stack*; this means that, if an expression begins before
4042another ends, it must *also* end before that other expression ends. This
4043property ensures that operators will never interfere with each other on the
4044operator stack.
4045
4046###### Recursion
4047
4048Because expressions can stack, parsing expressions actually requires recursion.
4049Well, it doesn't *require* it, but the code is much more readable that way.
4050
4051This recursion is indirect; the functions that `bc_parse_expr_err()` (the actual
4052expression parsing function) calls can, in turn, call it.
4053
4054###### Expression Flags
4055
4056There is one more big thing: not all expressions in `bc` are equal.
4057
4058Some expressions have requirements that others don't have. For example, only
4059array arguments can be arrays (which are technically not expressions, but are
4060treated as such for parsing), and some operators (in POSIX) are not allowed in
4061certain places.
4062
4063For this reason, functions that are part of the expression parsing
4064infrastructure in `bc`'s parser usually take a `flags` argument. This is meant
4065to be passed to children, and somewhere, they will be checked to ensure that the
4066resulting expression meets its requirements.
4067
4068There are also places where the flags are changed. This is because the
4069requirements change.
4070
4071Maintaining the integrity of the requirements flag set is an important part of
4072the `bc` parser. However, they do not have to be stored on a stack because their
4073stack is implicit from the recursion that expression parsing uses.
4074
4075### Functions
4076
4077Functions, in `bc`, are data structures that contain the bytecode and data
4078produced by the parsers. Functions are what the `BcProgram` program executes.
4079
4080#### Main and Read Functions
4081
4082There are two functions that always exist, which I call the "main" and "read"
4083functions.
4084
4085The "main" function is the function in which any code and data outside other
4086functions is put. Basically, it is the function where the scripting code ends
4087up.
4088
4089The "read" function is the function that is reset and parsed every time a call
4090to the `read()` builtin function happens.
4091
4092#### `dc` Strings
4093
4094In `dc`, strings can be executed, and since there are no actual "functions" in
4095`dc`, strings are handled as functions. In fact, they are effectively translated
4096into functions by parsing.
4097
4098##### Tail Calls
4099
4100Since strings in `dc` are functions, and the fact that `dc` has no native loops,
4101such loops are implemented in `dc` code using strings with conditional execution
4102commands at the end of strings.
4103
4104When such conditional execution, or even unconditional execution, commands are
4105the very last commands in a string, then `dc` can perform a [tail call][202].
4106
4107This is done by recording the fact that a tail call happened, done by
4108incrementing an integer on a stack. When a string is executed *without* a tail
4109call, a new entry is pushed onto the stack with the integer `1`.
4110
4111When a string finally quits that followed tail calls, its stack entry is popped,
4112eliminating all of those tail calls.
4113
4114Why perform tail calls? Because otherwise, `dc` would be subject to the same
4115thing that plagues [functional programming languages][203]: stack overflow. In
4116`dc`'s case, that would manifest itself as a growing [heap][204], because the
4117execution stack is stored on the heap, until a fatal allocation failure would
4118occur.
4119
4120#### Execution
4121
4122Execution is handled by an interpreter implemented using `BcProgram` and code
4123in [`src/program.c`][53].
4124
4125The interpreter is a mix between a [stack machine][210] and a [register
4126machine][211]. It is a stack machine in that operations happen on a stack I call
4127the "results stack," but it is a register machine in that items on the stack can
4128be stored to and loaded from "registers" (`dc` terminology), variables (`bc`
4129terminology), and arrays.
4130
4131##### Stacks
4132
4133There are two stacks in the interpreter:
4134
4135* The "results" stack (as mentioned above).
4136* The "execution" stack.
4137
4138The results stack (the `results` field of the `BcProgram` struct) is the stack
4139where the results of computations are stored. It is what makes the interpreter
4140part [stack machine][210]. It is filled with `BcResult`'s.
4141
4142The execution stack (the `stack` field of the `BcProgram` struct) is the stack
4143that tracks the current execution state of the interpreter. It is the presence
4144of this separate stack that allows the interpreter to implement the machine as a
4145loop, rather than recursively. It is filled with `BcInstPtr`'s, which are the
4146"instruction pointers."
4147
4148These instruction pointers have three fields, all integers:
4149
4150* `func`, the index of the function that is currently executing.
4151* `idx`, the index of the next bytecode instruction to execute in the function's
4152  bytecode array.
4153* `len`, which is the length of the results stack when the function started
4154  executing. This is not used by `dc`, but it used by `bc` because functions
4155  in `bc` should never affect the results stack of their callers.
4156
4157With these three fields, and always executing using the instruction pointer at
4158the top of the execution stack, the interpreter can always keep track of its
4159execution.
4160
4161When a function or a string starts executing, a new `BcInstPtr` is pushed onto
4162the execution stack for it. This includes if a function was called recursively.
4163And then, when the function or string returns, its `BcInstPtr` is popped off of
4164the execution stack.
4165
4166##### Bytecode
4167
4168Execution of functions are done through bytecode produced directly by the
4169parsers (see the [Parsing][209]). This bytecode is stored in the `code`
4170[vector][111] of the `BcFunc` struct.
4171
4172This is a vector for two reasons:
4173
4174* It makes it easier to add bytecode to the vector in the parsers.
4175* `bc` allows users to redefine functions.
4176
4177The reason I can use bytecode is because there are less than 256 instructions,
4178so an `unsigned char` can store all the bytecodes.
4179
4180###### Bytecode Indices
4181
4182There is one other factor to bytecode: there are instructions that need to
4183reference strings, constants, variables, or arrays. Bytecode need some way to
4184reference those things.
4185
4186Fortunately, all of those things can be referenced in the same way: with indices
4187because all of the items are in vectors.
4188
4189So `bc` has a way of encoding an index into bytecode. It does this by, after
4190pushing the instruction that references anything, pushing a byte set to the
4191length of the index in bytes, then the bytes of the index are pushed in
4192little-endian order.
4193
4194Then, when the interpreter encounters an instruction that needs one or more
4195items, it decodes the index or indices there and updates the `idx` field of the
4196current `BcInstPtr` to point to the byte after the index or indices.
4197
4198One more thing: the encoder of the indices only pushes as many bytes as
4199necessary to encode the index. It stops pushing when the index has no more bytes
4200with any 1 bits.
4201
4202##### Variables
4203
4204In `bc`, the vector of variables, `vars` in `BcProgram`, is not a vector of
4205numbers; it is a vector of vector of numbers. The first vector is the vector of
4206variables, the second is the variable stack, and the last level is the actual
4207number.
4208
4209This is because both `bc` and `dc` need variables to be stacks.
4210
4211For `dc`, registers are *defined* to be stacks.
4212
4213For `bc`, variables as stacks is how function arguments/parameters and function
4214`auto` variables are implemented.
4215
4216When a function is called, and a value needs to be used as a function argument,
4217a copy of the value is pushed onto the stack corresponding to the variable with
4218the same name as the function's parameter. For `auto` variables, a new number
4219set to zero is pushed onto each stack corresponding to the `auto` variables.
4220(Zero is used because the [`bc` spec][2] requires that `auto` variables are set
4221to zero.)
4222
4223It is in this way that the old value of the variable, which may not even be
4224related to the function parameter or `auto` variable, is preserved while the
4225variable is used as a function parameter or `auto` variable.
4226
4227When the function returns, of course, the stacks of the variables for the
4228parameters and `auto`'s will have their top item popped, restoring the old value
4229as it was before the function call.
4230
4231##### Arrays
4232
4233Like variables, arrays are also implemented as stacks. However, because they are
4234arrays, there is yet another level; the `arrs` field in `BcProgram` is a vector
4235of vectors of vectors of numbers. The first of the two levels is the vector of
4236arrays, the second the stack of for each array, the third the actual array, and
4237last the numbers in the array.
4238
4239`dc` has no need of this extra stack, but `bc` does because arrays can be
4240function parameters themselves.
4241
4242When arrays are used for function arguments, they are copied with a deep copy;
4243each item of the source vector is copied. This is because in `bc`, according to
4244the [`bc` spec][2], all function arguments are passed by value.
4245
4246However, array references are possible (see below).
4247
4248When arrays are used as `auto`'s, a new vector is pushed with one element; if
4249more elements are needed, the array is grown automatically, and new elements are
4250given the value of zero.
4251
4252In fact, if *any* array is accessed and does not have an element at that index,
4253the array is automaticall grown to that size, and all new elements are given the
4254value zero. This behavior is guaranteed by the [`bc` spec][2].
4255
4256###### Array References
4257
4258Array references had to be implemented as vectors themselves because they must
4259be pushed on the vectors stacks, which, as seen above, expect vectors
4260themselves.
4261
4262So thus, references are implemented as vectors on the vector stacks. These
4263vectors are not vectors of vectors themselves; they are vectors of bytes; in
4264fact, the fact that they are byte vectors and not vector vectors is how a
4265reference vector is detected.
4266
4267These reference vectors always have the same two things pushed: a byte encoding
4268(the same way bytecode indices are) of the referenced vector's index in the
4269`arrs` vector, and a byte encoding of the referenced vectors index in the vector
4270stack.
4271
4272If an item in a referenced vector is needed, then the reference is dereferenced,
4273and the item is returned.
4274
4275If a reference vector is passed to a function that does *not* expect a
4276reference, the vector is dereferenced and a deep copy is done, in the same way
4277as vectors are copied for normal array function parameters.
4278
4279### Callbacks
4280
4281There are many places in `bc` and `dc` where function pointers are used:
4282
4283* To implement destructors in vectors. (See the [Vectors][111] section.)
4284* To select the correct lex and parse functions for `bc` and `dc`.
4285* To select the correct function to execute unary operators.
4286* To select the correct function to execute binary operators.
4287* To calculate the correct number size for binary operators.
4288* To print a "digit" of a number.
4289* To seed the pseudo-random number generator.
4290
4291And there might be more.
4292
4293In every case, they are used for reducing the amount of code. Instead of
4294`if`/`else` chains, such as:
4295
4296```
4297if (BC_IS_BC) {
4298	bc_parse_parse(vm.parse);
4299}
4300else {
4301	dc_parse_parse(vm.parse);
4302}
4303```
4304
4305The best example of this is `bc_num_binary()`. It is called by every binary
4306operator. It figures out if it needs to allocate space for a new `BcNum`. If so,
4307it allocates the space and then calls the function pointer to the *true*
4308operation.
4309
4310Doing it like that shrunk the code *immensely*. First, instead of every single
4311binary operator duplicating the allocation code, it only exists in one place.
4312Second, `bc_num_binary()` itself does not have a massive `if`/`else` chain or a
4313`switch` statement.
4314
4315But perhaps the most important use was for destructors in vectors.
4316
4317Most of the data structures in `bc` are stored in vectors. If I hadn't made
4318destructors available for vectors, then ensuring that `bc` had no memory leaks
4319would have been nigh impossible. As it is, I check `bc` for memory leaks every
4320release when I change the code, and I have not released `bc` after version
4321`1.0.0` with any memory leaks, as far as I can remember anyway.
4322
4323### Numbers
4324
4325In order to do arbitrary-precision math, as `bc` must do, there must be some way
4326of representing arbitrary-precision numbers. `BcNum` in [`include/num.h`][184]
4327is `bc`'s.
4328
4329(Note: the word ["limb"][214] is used below; it has a specific meaning when
4330applied to arbitrary-precision numbers. It means one piece of the number. It can
4331have a single digit, which is what GNU `bc` does, or it can have multiple, which
4332is what this `bc` does.)
4333
4334This struct needs to store several things:
4335
4336* The array of limbs of the number. This is the `num` field.
4337* The location of the decimal point. This is the `rdx` (short for [radix][215])
4338  field.
4339* The number of limbs the number has. This is the `len` field.
4340* Whether the number is negative or not. This is the least significant bit of
4341  the `rdx` field. More on that later.
4342
4343In addition, `bc`'s number stores the capacity of the limb array; this is the
4344`cap` field.
4345
4346If the number needs to grow, and the capacity of the number is big enough, the
4347number is not reallocated; the number of limbs is just added to.
4348
4349There is one additional wrinkle: to make the usual operations (binary operators)
4350fast, the decimal point is *not* allowed to be in the middle of a limb; it must
4351always be between limbs, after all limbs (integer) or before all limbs (real
4352between -1 and 1).
4353
4354The reason for this is because addition, subtraction, multiplication, and
4355division expect digits to be lined up on the decimal point. By requiring that it
4356be between limbs, no extra alignment is needed, and those operations can proceed
4357without extra overhead.
4358
4359This does make some operations, most notably extending, truncating, and
4360shifting, more expensive, but the overhead is constant, and these operations are
4361usually cheap compared to the binary operators anyway.
4362
4363This also requires something else: `bc` numbers need to know *exactly* how many
4364decimal places they have after the decimal point. If the decimal point must be
4365inbetween limbs, the last decimal place could be in the middle of a limb. The
4366amount of decimal places in a number is carefully tracked and stored in the
4367`scale` field, and this number must always coincide with the `rdx` field by the
4368following formula:
4369
4370```
4371scale + (BC_BASE_DIGS - 1) / BC_BASE_DIGS == rdx >> 1
4372```
4373
4374(`BC_BASE_DIGS` is the number of decimal digits stored in one limb. It is 9 on
437564-bit systems and 4 on other systems.)
4376
4377Yes, `rdx` is shifted; that is because the negative bit is stored in the least
4378significant bit of the `rdx` field, and the actual radix (amount of limbs after
4379the decimal/radix point) is stored in the rest of the bits. This is safe because
4380`BC_BASE_DIGS` is always at least 4, which means `rdx` will always need at least
43812 bits less than `scale`.
4382
4383In addition to `rdx` always matching `scale`, another invariant is that `rdx`
4384must always be less than or equal to `len`. (Because `scale` may be greater than
4385`rdx`, `scale` does not have to be less than or equal to `len`.)
4386
4387Another invariant is that `len` must always be less than or equal to `cap`, for
4388obvious reasons.
4389
4390The last thing programmers need to know is that the limb array is stored in
4391little-endian order. This means that the last decimal places are in the limb
4392stored at index 0, and the most significant digits are stored at index `len-1`.
4393
4394This is done to make the most important operations fast. Addition and
4395subtraction are done from least significant to most significant limbs, which
4396means they can speed through memory in the way most computers are best at.
4397Multiplication does the same, sort of, and with division, it matters less.
4398Comparison does need to go backwards, but that's after exhausting all other
4399alternatives, including for example, checking the length of the integer portion
4400of each limb array.
4401
4402Finally, here are some possible special situations with numbers and what they
4403mean:
4404
4405* `len == 0`: the number equals 0.
4406* `len == 0 && scale != 0`: the number equals 0, but it has a `scale` value.
4407  This is the only case where `scale` does not have to coincide with `rdx`
4408  This can happen with division, for example, that sets a specific `scale` for
4409  the result value but may produce 0.
4410* `(rdx >> 1) < len`: the number is greater than or equal to 1, or less than or
4411  equal to -1.
4412* `(rdx >> 1) == len`: the number is greater than -1 and less than 1, not
4413  including 0, although this will be true for 0 as well. However, 0 is always
4414  assumed to be represented by `len == 0`.
4415* `(rdx >> 1) == 0`: the number is an integer. In this case, `scale` must also
4416  equal 0.
4417
4418#### Math Style
4419
4420When I wrote the math for `bc`, I adopted a certain style that, if known, will
4421make it easier to understand the code. The style follows these rules:
4422
4423* `BcNum` arguments always come before arguments of other types.
4424* Among the `BcNum` arguments, the operands always come first, and the `BcNum`
4425  where the result(s) will be stored come last.
4426* Error checking is placed first in the function.
4427* Easy cases are placed next.
4428* Preparation, such as allocating temporaries, comes next.
4429* The actual math.
4430* Cleanup and ensuring invariants.
4431
4432While these rules are not hard and fast, using them as a guide will probably
4433help.
4434
4435### Strings as Numbers
4436
4437Strings can be assigned to variables. This is a problem because the vectors for
4438variable stacks expect `BcNum` structs only.
4439
4440While I could have made a union, I decided that the complexity of adding an
4441entirely new type, with destructor and everything, was not worth it. Instead, I
4442took advantage of the fact that `free()`, when passed a `NULL` pointer, will do
4443nothing.
4444
4445Using that, I made it so `BcNum`'s could store strings instead. This is marked
4446by the `BcNum` having a `NULL` limb array (`num`) and a `cap` of 0 (which should
4447*never* happen with a real number, though the other fields could be 0).
4448
4449The `BcNum` stores the function that stores the string in the `rdx` field, and
4450it stores the index of the string in the `scale` field. This is used to actually
4451load the string if necessary.
4452
4453Note that historically, string information was stored in the `loc` field of
4454the `d` union in a `BcResult`. This was changed recently to standardize; now,
4455all string information are stored in the `n` field of the `d` union regardless.
4456This means that all string information is stored in `BcNum`'s. This removes
4457extra cases.
4458
4459Also, if a temp is made with a string, then the result type should still be
4460`BC_RESULT_STR`, not `BC_RESULT_TEMP`. This is to make it easier to do type
4461checks.
4462
4463### Pseudo-Random Number Generator
4464
4465In order to understand this section, I suggest you read the information in the
4466manpages about the pseudo-random number generator (PRNG) first; that will help
4467you understand the guarantees it has, which is important because this section
4468delves into implementation details.
4469
4470First, the PRNG I use is seeded; this is because most OS's have an excellent
4471cryptographically secure PRNG available via command-line, usually
4472`/dev/urandom`, but the only *seeded* PRNG available is usually `bash`'s
4473`$RANDOM`, which is essentially a wrapper around C's `rand()`.
4474
4475`rand()` is...bad. It is only guaranteed to return 15 bits of random data.
4476Obviously, getting good random data out of that would be hard with that alone,
4477but implementations also seem to be poor.
4478
4479On top of that, `bc` is an arbitrary-precision calculator; if I made it able to
4480generate random numbers, I could make it generate random numbers of any size,
4481and since it would be seeded, results would be reproducible, when wanted.
4482
4483So to get that, I needed a seeded PRNG with good characteristics. After scouring
4484the Internet, I decided on the [PCG PRNG][215], mostly because of [this blog
4485post][216]. Part of the reason was the behavior of the xoroshiro128+ author, who
4486hates on PCG and its author, but also because PCG seemed to do better when
4487tested by independent parties.
4488
4489After that decision, I faced a challenge: PCG requires 255 bits of seed: 128 for
4490the actual seed, and 127 for the "increment." (Melissa O'Neill, the PCG author,
4491likens the increment to selecting a codebook.)
4492
4493I could, of course, put the entire 255 bits into one massive arbitrary-precision
4494number; `bc` is good at that, after all. But that didn't sit right with me
4495because it would mean any seed selected by users would have the real portion
4496ignored, which is stupid in a program like `bc`.
4497
4498Instead, I decided to make the integer portion the increment (clamped down to
4499size), and the real portion the seed.
4500
4501In most cases, this would be a bad idea because you cannot, in general, know how
4502many decimal places you need to represent any number with `n` real digits in
4503base `b` in another base. However, there is an easy to how many decimal digits
4504after the decimal point it takes to represent reals of base 2 in base 10: the
4505power of two.
4506
4507It turns out that, for base 2 represented in base 10, the power of 2 is
4508*exactly* how many digits are necessary to represent *any* number `n/2^p`, where
4509`p` is the power of 2. This is because at every halving, the number of decimal
4510places increases by 1:
4511
4512```
45130.5
45140.25
45150.125
45160.0625
45170.03125
45180.015625
4519...
4520```
4521
4522So the algorithm to convert all 255 bits of the seed is as follows:
4523
45241.	Convert the increment to a `BcNum`.
45252.	Convert the seed to a `BcNum`.
45263.	Divide the seed by `2^128` with a `scale` of 128. (For 32-bit systems,
4527	substitute 64 bits for 128.)
45284.	Add the two numbers together.
4529
4530Likewise, the algorithm to convert from a user-supplied number to a seed is:
4531
45321.	Truncate a copy of the number.
45332.	Subtract the result from #1 from the original number. This gives the real
4534	portion of the number.
45353.	Clamp the result of #1 to 127 (or 63) bits. This is the increment.
45364.	Multiply the result of #2 by `2^128`.
45375.	Truncate the result of #4. This is the seed.
4538
4539#### Generating Arbitrary-Precision Numbers
4540
4541I wrote a function (`bc_rand_bounded()`) that will return unbiased results with
4542any bound below the max that PCG can generate.
4543
4544To generate an integer of arbitrary size using a bound, `bc` simply uses
4545`bc_rand_bounded()` to generate numbers with a bound `10^BC_BASE_DIGS` for as
4546many limbs as needed to satisfy the bigger bound.
4547
4548To generate numbers with arbitrary precision after the decimal point, `bc`
4549merely generates an arbitrary precision integer with the bound `10^p`, where `p`
4550is the desired number of decimal places, then divides in by `10^p` with a
4551`scale` of `p`.
4552
4553## Debug Code
4554
4555Besides building `bc` in debug mode with the `-g` flag to [`configure.sh`][69],
4556programmers can also add `-DBC_DEBUG_CODE=1` to the `CFLAGS`. This will enable
4557the inclusion of *a lot* of extra code to assist with debugging.
4558
4559For more information, see all of the code guarded by `#if BC_DEBUG_CODE` in the
4560[`include/`][212] directory and in the [`src/`][213] directory.
4561
4562Yes, all of the code is guarded by `#if` preprocessor statements; this is
4563because the code should *never* be in a release build, and by making programmers
4564add this manually (not even an option to [`configure.sh`][69], it is easier to
4565ensure that never happens.
4566
4567However, that said, the extra debug code is useful; that was why I kept it in.
4568
4569## Performance
4570
4571While I have put in a lot of effort to make `bc` as fast as possible, there
4572might be some things you can do to speed it up without changing the code.
4573
4574First, you can probably use [profile-guided optimization][217] to optimize even
4575better, using the test suite to profile.
4576
4577Second, I included macros that might help branch placement and prediction:
4578
4579* `BC_ERR(e)`
4580* `BC_UNLIKELY(e)`
4581* `BC_NO_ERR(e)`
4582* `BC_LIKELY(e)`
4583
4584`BC_ERR` is the same as `BC_UNLIKELY`, and `BC_NO_ERR` is the same as
4585`BC_LIKELY`; I just added them to also document branches that lead to error
4586conditions or *away* from error conditions.
4587
4588Anyway, if `BC_LIKELY` and `BC_UNLIKELY` are not defined during compilation,
4589they expand to nothing but the argument they were given.
4590
4591They can, however, be defined to `__builtin_expect((e), 1)` and
4592`__builtin_expect((e), 0)`, respectively, on GCC and Clang for better branch
4593prediction and placement. (For more information about `__builtin_expect()` see
4594the [GCC documentation][218].)
4595
4596There might be other compilers that can take advantage of that, but I don't know
4597anything about that.
4598
4599Also, as stated in the [build manual][219], link-time optimization is excellent
4600at optimizing this `bc`. Use it.
4601
4602### Benchmarks
4603
4604To help programmers improve performance, I have built and assembled
4605infrastructure to make benchmarking easy.
4606
4607First, in order to easily run benchmarks, I created
4608[`scripts/benchmark.sh`][220].
4609
4610Second, I copied and adapted [`ministat.c`][223] [from FreeBSD][221], to make it
4611easier to judge whether the results are significant or not.
4612
4613Third, I made the `make` clean target `make clean_benchmarks`, to clean
4614`scripts/ministat` and the generated benchmark files.
4615
4616Fourth, I made it so [`scripts/benchmark.sh`][220] outputs the timing and memory
4617data in a format that is easy for `scripts/ministat` to digest.
4618
4619To add a benchmark, add a script in the right directory to generate the
4620benchmark. Yes, generate.
4621
4622All of the benchmarks are generated first, from `.bc` and `.dc` files in the
4623[`benchmarks/bc/`][91] and [`benchmarks/dc/`][224]. This is so that massive
4624amounts of data can be generated and then pushed through the calculators.
4625
4626If you need to benchmark `bc` or `dc` with simple loops, have the generator
4627files simply print the loop code.
4628
4629### Caching of Numbers
4630
4631In order to provide some performance boost, `bc` tries to reuse old `BcNum`'s
4632that have the default capacity (`BC_NUM_DEF_SIZE`).
4633
4634It does this by allowing `bc_num_free()` to put the limb array onto a
4635statically-allocated stack (it's just a global array with a set size). Then,
4636when a `BcNum` with the default capacity is needed, `bc_num_init()` asks if any
4637are available. If the answer is yes, the one on top of the stack is returned.
4638Otherwise, `NULL` is returned, and `bc_num_free()` knows it needs to `malloc()`
4639a new limb array.
4640
4641When the stack is filled, any numbers that `bc` attempts to put on it are just
4642freed.
4643
4644This setup saved a few percent in my testing for version [3.0.0][32], which is
4645when I added it.
4646
4647## `bcl`
4648
4649At the request of one of my biggest users, I spent the time to make a build mode
4650where the number and math code of `bc` could be wrapped into a library, which I
4651called `bcl`.
4652
4653This mode is exclusive; `bc` and `dc` themselves are *not* built when building
4654`bcl`.
4655
4656The only things in the `bc` math code that is not included is:
4657
4658* Printing newlines (clients do not care about `bc`'s line lenth restriction).
4659* `dc`'s stream print.
4660
4661Even the [pseudo-random number generator][179] is included, with extra support
4662for generating real numbers with it. (In `bc`, such support is in
4663[`lib2.bc`][26].)
4664
4665### Signal Handling
4666
4667Like signal handling in `bc` proper (see the [Async-Signal-Safe Signal
4668Handling][173] section), `bcl` has the infrastructure for signal handling.
4669
4670This infrastructure is different, however, as `bcl` assumes that clients will
4671implement their own signal handling.
4672
4673So instead of doing signal handling on its own, `bcl` provides the capability to
4674interrupt executions and return to the clients almost immediately. Like in `bc`,
4675this is done with `setjmp()` and `longjmp()`, although the jump series is
4676stopped before returning normally to client code.
4677
4678### Contexts
4679
4680Contexts were an idea by the same user that requested `bcl`. They are meant to
4681make it so multiple clients in one program can keep their data separate from
4682each other.
4683
4684### Numbers
4685
4686Numbers in `bcl` are literally indices into an encapsulated array of numbers,
4687hidden in the context. These indices are then passed to clients to refer to
4688numbers later.
4689
4690### Operand Consumption
4691
4692Most math functions in `bcl` "consume" their operand arguments; the arguments
4693are freed, whether or not an error is returned.
4694
4695This is to make it easy to implement math code, like this:
4696
4697```
4698n = bcl_add(bcl_mul(a, b), bcl_div(c, d));
4699```
4700
4701If numbers need to be preserved, they can be with `bcl_dup()`:
4702
4703```
4704n = bcl_add(bcl_mul(bcl_dup(a), bc_dup(b)), bcl_div(bcl_dup(c), bcl_dup(d)));
4705```
4706
4707### Errors
4708
4709Errors can be encoded in the indices representing numbers, and where necessary,
4710clients are responsible for checking those errors.
4711
4712The encoding of errors is this: if an error happens, the value `0-error` is
4713returned. To decode, do the exact same thing. Thus, any index above
4714`0-num_errors` is an error.
4715
4716If an index that represents an error is passed to a math function, that function
4717propagates the error to its result and does not perform the math operation.
4718
4719All of this is to, once again, make it easy to implement the math code as above.
4720
4721However, where possible, errors are returned directly.
4722
4723[1]: https://en.wikipedia.org/wiki/Bus_factor
4724[2]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/bc.html#top
4725[3]: https://en.wikipedia.org/wiki/Dc_(Unix)
4726[4]: https://en.wikipedia.org/wiki/Reverse_Polish_notation
4727[5]: ./bc/A.1.md#standard-library
4728[6]: https://github.com/torvalds/linux/blob/master/kernel/time/timeconst.bc
4729[7]: ./bc/A.1.md#extended-library
4730[8]: #libbc-2
4731[9]: #strgensh
4732[10]: https://vimeo.com/230142234
4733[11]: https://gavinhoward.com/2019/12/values-for-yao/
4734[12]: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
4735[13]: ./build.md#cross-compiling
4736[14]: ./build.md
4737[15]: #strgenc
4738[16]: http://landley.net/toybox/about.html
4739[17]: https://www.busybox.net/
4740[18]: https://en.wikipedia.org/wiki/Karatsuba_algorithm
4741[19]: https://clang-analyzer.llvm.org/scan-build.html
4742[20]: https://www.valgrind.org/
4743[21]: https://clang.llvm.org/docs/AddressSanitizer.html
4744[22]: https://gavinhoward.com/2019/11/finishing-software/
4745[23]: #history
4746[24]: https://clang.llvm.org/docs/ClangFormat.html
4747[25]: ./algorithms.md
4748[26]: #lib2bc
4749[27]: #vmh
4750[28]: https://github.com/rain-1/linenoise-mob
4751[29]: https://github.com/antirez/linenoise
4752[30]: #bclh
4753[31]: #argsh
4754[32]: ../NEWS.md#3-0-0
4755[33]: ../NEWS.md
4756[34]: https://github.com/skeeto/optparse
4757[35]: #opth
4758[36]: #historyh
4759[37]: #randh
4760[38]: #langh
4761[39]: #numc
4762[40]: #bcc
4763[41]: #bc_lexc
4764[42]: #bc_parsec
4765[43]: #libraryc
4766[44]: #dcc
4767[45]: #dc_lexc
4768[46]: #dc_parsec
4769[47]: #filec
4770[48]: #historyc
4771[49]: #langc
4772[50]: #lexc
4773[51]: #optc
4774[52]: #parsec
4775[53]: #programc
4776[54]: #randc
4777[55]: #fileh
4778[56]: #readc
4779[57]: #programh
4780[58]: #vmc
4781[59]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/gencat.html#top
4782[60]: #manpagesh
4783[61]: #bcl3md
4784[62]: #bcl3
4785[63]: #bclvcxproj
4786[64]: #bclvcxprojfilters
4787[65]: #bclsln
4788[66]: #bcvcxproj
4789[67]: #bcvcxprojfilters
4790[68]: #bcsln
4791[69]: #configuresh
4792[70]: #makefilein
4793[71]: #functionsh
4794[72]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html#top
4795[73]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18
4796[74]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html#top
4797[75]: #versionh
4798[76]: ##posix-shell-scripts
4799[77]: #tests
4800[78]: #karatsubapy
4801[79]: #bc-1
4802[80]: #dc-1
4803[81]: ./build.md#build-type
4804[82]: #fuzzing-1
4805[83]: #releasesh
4806[84]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_02
4807[85]: #locales-1
4808[86]: #manuals-1
4809[87]: #locale_installsh
4810[88]: #locale_uninstallsh
4811[89]: #bc1mdin
4812[90]: #dc1mdin
4813[91]: #bc
4814[92]: https://pandoc.org/
4815[93]: #release_settingstxt
4816[94]: #aflpy
4817[95]: #randmathpy
4818[96]: #caching-of-numbers
4819[97]: #error-handling
4820[98]: #radamsatxt
4821[99]: https://gitlab.com/akihe/radamsa
4822[100]: #radamsash
4823[101]: https://musl.libc.org/
4824[102]: ./build.md#settings
4825[103]: #test_settingstxt
4826[104]: #test_settingssh
4827[105]: #functionssh
4828[106]: #bch
4829[107]: https://en.wikipedia.org/wiki/Infix_notation
4830[108]: https://en.wikipedia.org/wiki/Reverse_Polish_notation
4831[109]: https://en.wikipedia.org/wiki/Shunting-yard_algorithm
4832[110]: #bc-parsing
4833[111]: #vectors
4834[112]: https://git.yzena.com/Yzena/Yc/src/branch/master/include/yc/vector.h
4835[113]: https://en.wikipedia.org/wiki/Signal_(IPC)
4836[114]: #custom-io
4837[115]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03
4838[116]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/setjmp.html
4839[117]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html
4840[118]: https://www.youtube.com/watch?v=4PaWFYm0kEw
4841[119]: #fuzz_prepsh
4842[120]: #bc_aflyaml
4843[121]: #bc_afl_continueyaml
4844[122]: https://github.com/tmux/tmux
4845[123]: https://tmuxp.git-pull.com/
4846[124]: #test-suite
4847[125]: https://aflplus.plus/
4848[126]: #link-time-optimization
4849[127]: #fuzzing-performance
4850[128]: #radamsa
4851[129]: #afl-quickstart
4852[130]: #convenience
4853[131]: #datac
4854[132]: https://git.yzena.com/gavin/vim-bc
4855[133]: https://git.yzena.com/gavin/bc_libs
4856[134]: #debugging
4857[135]: #asserts
4858[136]: #portability
4859[137]: https://pexpect.readthedocs.io/en/stable/
4860[138]: #test-suite-portability
4861[139]: #historypy
4862[140]: #historysh
4863[141]: #group-tests
4864[142]: #build-system
4865[143]: #generated-tests
4866[144]: #benchmarks-1
4867[145]: #gen
4868[146]: #test-coverage
4869[147]: #integration-with-the-build-system
4870[148]: #test-scripts
4871[149]: #standard-tests
4872[150]: #script-tests
4873[151]: #error-tests
4874[152]: #stdin-tests
4875[153]: #read-tests
4876[154]: #other-tests
4877[155]: #history-tests
4878[156]: #bcl
4879[157]: #bcl-test
4880[158]: #bclc
4881[159]: #valgrind
4882[160]: #addresssanitizer-and-friends
4883[161]: #bc-2
4884[162]: #dc-2
4885[163]: #alltxt-1
4886[164]: #errorstxt
4887[165]: #posix_errorstxt
4888[166]: #timeconstsh
4889[167]: #alltxt-3
4890[168]: #errorstxt-1
4891[169]: #scripts-1
4892[170]: #scripts-2
4893[171]: #alltxt-2
4894[172]: #alltxt-4
4895[173]: #async-signal-safe-signal-handling
4896[174]: #vectorh
4897[175]: #read_errorstxt
4898[176]: #statush
4899[177]: #numbers
4900[178]: #math-style
4901[179]: #pseudo-random-number-generator
4902[180]: #lexh
4903[181]: #parseh
4904[182]: #dch
4905[183]: #libraryh
4906[184]: #numh
4907[185]: #readh
4908[186]: #maps
4909[187]: #slabs-and-slab-vectors
4910[188]: ./build.md#extra-math
4911[189]: #command-line-history
4912[190]: #scriptsed
4913[191]: #linux-timeconstbc-script
4914[192]: #corpuses
4915[193]: ./build.md#history
4916[194]: https://www.valgrind.org/docs/manual/mc-manual.html
4917[195]: #othersh
4918[196]: https://scan.coverity.com/
4919[197]: https://clang-analyzer.llvm.org/
4920[198]: https://unix.stackexchange.com/questions/253349/eintr-is-there-a-rationale-behind-it
4921[199]: https://cr.yp.to/docs/selfpipe.html
4922[200]: https://skarnet.org/cgi-bin/archive.cgi?2:mss:1607:201701:dfblejammjllfkggpcph
4923[201]: https://slembcke.github.io/2020/10/12/CustomAllocators.html#1-slab-allocator
4924[202]: https://en.wikipedia.org/wiki/Tail_call
4925[203]: https://en.wikipedia.org/wiki/Functional_programming_language
4926[204]: https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
4927[205]: #mainc
4928[206]: #argc
4929[207]: https://en.wikipedia.org/wiki/Abstract_syntax_tree
4930[208]: #functions
4931[209]: #parsing
4932[210]: https://en.wikipedia.org/wiki/Stack_machine
4933[211]: https://en.wikipedia.org/wiki/Register_machine
4934[212]: #include
4935[213]: #src
4936[214]: https://gmplib.org/manual/Nomenclature-and-Types
4937[215]: https://en.wikipedia.org/wiki/Radix_point
4938[216]: #main-and-read-functions
4939[215]: https://www.pcg-random.org/
4940[216]: https://lemire.me/blog/2017/08/22/testing-non-cryptographic-random-number-generators-my-results/
4941[217]: https://en.wikipedia.org/wiki/Profile-guided_optimization
4942[218]: https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect
4943[219]: ./build.md#optimization
4944[220]: #benchmarksh
4945[221]: https://cgit.freebsd.org/src/tree/usr.bin/ministat/ministat.c
4946[222]: https://www.freebsd.org/cgi/man.cgi?query=ministat&apropos=0&sektion=0&manpath=FreeBSD+13.0-RELEASE+and+Ports&arch=default&format=html
4947[223]: #ministatc
4948[224]: #dc
4949