1# Development 2 3Updated: 29 July 2021 4 5This document is meant for the day when I (Gavin D. Howard) get [hit by a 6bus][1]. In other words, it's meant to make the [bus factor][1] a non-issue. 7 8This document is supposed to contain all of the knowledge necessary to develop 9`bc` and `dc`. 10 11In addition, this document is meant to add to the [oral tradition of software 12engineering][118], as described by Bryan Cantrill. 13 14This document will reference other parts of the repository. That is so a lot of 15the documentation can be closest to the part of the repo where it is actually 16necessary. 17 18## What Is It? 19 20This repository contains an implementation of both [POSIX `bc`][2] and [Unix 21`dc`][3]. 22 23POSIX `bc` is a standard utility required for POSIX systems. `dc` is a 24historical utility that was included in early Unix and even predates both Unix 25and C. They both are arbitrary-precision command-line calculators with their own 26programming languages. `bc`'s language looks similar to C, with infix notation 27and including functions, while `dc` uses [Reverse Polish Notation][4] and allows 28the user to execute strings as though they were functions. 29 30In addition, it is also possible to build the arbitrary-precision math as a 31library, named [`bcl`][156]. 32 33**Note**: for ease, I will refer to both programs as `bc` in this document. 34However, if I say "just `bc`," I am referring to just `bc`, and if I say `dc`, I 35am referring to just `dc`. 36 37### History 38 39This project started in January 2018 when a certain individual on IRC, hearing 40that I knew how to write parsers, asked me to write a `bc` parser for his math 41library. I did so. I thought about writing my own math library, but he 42disparaged my programming skills and made me think that I couldn't do it. 43 44However, he took so long to do it that I eventually decided to give it a try and 45had a working math portion in two weeks. It taught me that I should not listen 46to such people. 47 48From that point, I decided to make it an extreme learning experience about how 49to write quality software. 50 51That individual's main goal had been to get his `bc` into [toybox][16], and I 52managed to get my own `bc` in. I also got it in [busybox][17]. 53 54Eventually, in late 2018, I also decided to try my hand at implementing 55[Karatsuba multiplication][18], an algorithm that that unnamed individual 56claimed I could never implement. It took me a bit, but I did it. 57 58This project became a passion project for me, and I continued. In mid-2019, 59Stefan Eßer suggested I improve performance by putting more than 1 digit in each 60section of the numbers. After I showed immaturity because of some burnout, I 61implemented his suggestion, and the results were incredible. 62 63Since that time, I have gradually been improving the `bc` as I have learned more 64about things like fuzzing, [`scan-build`][19], [valgrind][20], 65[AddressSanitizer][21] (and the other sanitizers), and many other things. 66 67One of my happiest moments was when my `bc` was made the default in FreeBSD. 68 69But since I believe in [finishing the software I write][22], I have done less 70work on `bc` over time, though there are still times when I put a lot of effort 71in, such as now (17 June 2021), when I am attempting to convince OpenBSD to use 72my `bc`. 73 74And that is why I am writing this document: someday, someone else is going to 75want to change my code, and this document is my attempt to make it as simple as 76possible. 77 78### Values 79 80[According to Bryan Cantrill][10], all software has values. I think he's 81correct, though I [added one value for programming languages in particular][11]. 82 83However, for `bc`, his original list will do: 84 85* Approachability 86* Availability 87* Compatibility 88* Composability 89* Debuggability 90* Expressiveness 91* Extensibility 92* Interoperability 93* Integrity 94* Maintainability 95* Measurability 96* Operability 97* Performance 98* Portability 99* Resiliency 100* Rigor 101* Robustness 102* Safety 103* Security 104* Simplicity 105* Stability 106* Thoroughness 107* Transparency 108* Velocity 109 110There are several values that don't apply. The reason they don't apply is 111because `bc` and `dc` are existing utilities; this is just another 112reimplementation. The designs of `bc` and `dc` are set in stone; there is 113nothing we can do to change them, so let's get rid of those values that would 114apply to their design: 115 116* Compatibility 117* Integrity 118* Maintainability 119* Measurability 120* Performance 121* Portability 122* Resiliency 123* Rigor 124* Robustness 125* Safety 126* Security 127* Simplicity 128* Stability 129* Thoroughness 130* Transparency 131 132Furthermore, some of the remaining ones don't matter to me, so let me get rid of 133those and order the rest according to my *actual* values for this project: 134 135* Robustness 136* Stability 137* Portability 138* Compatibility 139* Performance 140* Security 141* Simplicity 142 143First is **robustness**. This `bc` and `dc` should be robust, accepting any 144input, never crashing, and instead, returning an error. 145 146Closely related to that is **stability**. The execution of `bc` and `dc` should 147be deterministic and never change for the same inputs, including the 148pseudo-random number generator (for the same seed). 149 150Third is **portability**. These programs should run everywhere that POSIX 151exists, as well as Windows. This means that just about every person on the 152planet will have access to these programs. 153 154Next is **compatibility**. These programs should, as much as possible, be 155compatible with other existing implementations and standards. 156 157Then we come to **performance**. A calculator is only usable if it's fast, so 158these programs should run as fast as possible. 159 160After that is **security**. These programs should *never* be the reason a user's 161computer is compromised. 162 163And finally, **simplicity**. Where possible, the code should be simple, while 164deferring to the above values. 165 166Keep these values in mind for the rest of this document, and for exploring any 167other part of this repo. 168 169#### Portability 170 171But before I go on, I want to talk about portability in particular. 172 173Most of these principles just require good attention and care, but portability 174is different. Sometimes, it requires pulling in code from other places and 175adapting it. In other words, sometimes I need to duplicate and adapt code. 176 177This happened in a few cases: 178 179* Option parsing (see [`include/opt.h`][35]). 180* History (see [`include/history.h`][36]). 181* Pseudo-Random Number Generator (see [`include/rand.h`][37]). 182 183This was done because I decided to ensure that `bc`'s dependencies were 184basically zero. In particular, either users have a normal install of Windows or 185they have a POSIX system. 186 187A POSIX system limited me to C99, `sh`, and zero external dependencies. That 188last item is why I pull code into `bc`: if I pull it in, it's not an external 189dependency. 190 191That's why `bc` has duplicated code. Remove it, and you risk `bc` not being 192portable to some platforms. 193 194## Suggested Course 195 196I do have a suggested course for programmers to follow when trying to understand 197this codebase. The order is this: 198 1991. `bc` Spec. 2002. Manpages. 2013. Test suite. 2024. Understand the build. 2035. Algorithms manual. 2046. Code concepts. 2057. Repo structure. 2068. Headers. 2079. Source code. 208 209This order roughly follows this order: 210 2111. High-level requirements 2122. Low-level requirements 2133. High-level implementation 2144. Low-level implementation 215 216In other words, first understand what the code is *supposed* to do, then 217understand the code itself. 218 219## Useful External Tools 220 221I have a few tools external to `bc` that are useful: 222 223* A [Vim plugin with syntax files made specifically for my `bc` and `dc`][132]. 224* A [repo of `bc` and `dc` scripts][133]. 225* A set of `bash` aliases (see below). 226* A `.bcrc` file with items useful for my `bash` setup (see below). 227 228My `bash` aliases are these: 229 230```sh 231alias makej='make -j16' 232alias mcmake='make clean && make' 233alias mcmakej='make clean && make -j16' 234alias bcdebug='CPPFLAGS="-DBC_DEBUG_CODE=1" CFLAGS="-Weverything -Wno-padded \ 235 -Wno-switch-enum -Wno-format-nonliteral -Wno-cast-align \ 236 -Wno-unreachable-code-return -Wno-missing-noreturn \ 237 -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \ 238 -pedantic -std=c99" ./configure.sh' 239alias bcconfig='CFLAGS="-Weverything -Wno-padded -Wno-switch-enum \ 240 -Wno-format-nonliteral -Wno-cast-align -Wno-unreachable-code-return \ 241 -Wno-missing-noreturn -Wno-disabled-macro-expansion -Wno-unreachable-code \ 242 -Wall -Wextra -pedantic -std=c99" ./configure.sh' 243alias bcnoassert='CPPFLAGS="-DNDEBUG" CFLAGS="-Weverything -Wno-padded \ 244 -Wno-switch-enum -Wno-format-nonliteral -Wno-cast-align \ 245 -Wno-unreachable-code-return -Wno-missing-noreturn \ 246 -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \ 247 -pedantic -std=c99" ./configure.sh' 248alias bcdebugnoassert='CPPFLAGS="-DNDEBUG -DBC_DEBUG_CODE=1" \ 249 CFLAGS="-Weverything -Wno-padded -Wno-switch-enum -Wno-format-nonliteral \ 250 -Wno-cast-align -Wno-unreachable-code-return -Wno-missing-noreturn \ 251 -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \ 252 -pedantic -std=c99" ./configure.sh' 253alias bcunset='unset BC_LINE_LENGTH && unset BC_ENV_ARGS' 254``` 255 256`makej` runs `make` with all of my cores. 257 258`mcmake` runs `make clean` before running `make`. It will take a target on the 259command-line. 260 261`mcmakej` is a combination of `makej` and `mcmake`. 262 263`bcdebug` configures `bc` for a full debug build, including `BC_DEBUG_CODE` (see 264[Debugging][134] below). 265 266`bcconfig` configures `bc` with Clang (Clang is my personal default compiler) 267using full warnings, with a few really loud and useless warnings turned off. 268 269`bcnoassert` configures `bc` to not have asserts built in. 270 271`bcdebugnoassert` is like `bcnoassert`, except it also configures `bc` for debug 272mode. 273 274`bcunset` unsets my personal `bc` environment variables, which are set to: 275 276```sh 277export BC_ENV_ARGS="-l $HOME/.bcrc" 278export BC_LINE_LENGTH="74" 279``` 280 281Unsetting these environment variables are necessary for running 282[`scripts/release.sh`][83] because otherwise, it will error when attempting to 283run `bc -s` on my `$HOME/.bcrc`. 284 285Speaking of which, the contents of that file are: 286 287```bc 288define void print_time_unit(t){ 289 if(t<10)print "0" 290 if(t<1&&t)print "0" 291 print t,":" 292} 293define void sec2time(t){ 294 auto s,m,h,d,r 295 r=scale 296 scale=0 297 t=abs(t) 298 s=t%60 299 t-=s 300 m=t/60%60 301 t-=m 302 h=t/3600%24 303 t-=h 304 d=t/86400 305 if(d)print_time_unit(d) 306 if(h)print_time_unit(h) 307 print_time_unit(m) 308 if(s<10)print "0" 309 if(s<1&&s)print "0" 310 s 311 scale=r 312} 313define minutes(secs){ 314 return secs/60; 315} 316define hours(secs){ 317 return secs/3600; 318} 319define days(secs){ 320 return secs/3600/24; 321} 322define years(secs){ 323 return secs/3600/24/365.25; 324} 325define fbrand(b,p){ 326 auto l,s,t 327 b=abs(b)$ 328 if(b<2)b=2 329 s=scale 330 t=b^abs(p)$ 331 l=ceil(l2(t),0) 332 if(l>scale)scale=l 333 t=irand(t)/t 334 scale=s 335 return t 336} 337define ifbrand(i,b,p){return irand(abs(i)$)+fbrand(b,p)} 338``` 339 340This allows me to use `bc` as part of my `bash` prompt. 341 342## Code Style 343 344The code style for `bc` is...weird, and that comes from historical accident. 345 346In [History][23], I mentioned how I got my `bc` in [toybox][16]. Well, in order 347to do that, my `bc` originally had toybox style. Eventually, I changed to using 348tabs, and assuming they were 4 spaces wide, but other than that, I basically 349kept the same style, with some exceptions that are more or less dependent on my 350taste. 351 352The code style is as follows: 353 354* Tabs are 4 spaces. 355* Tabs are used at the beginning of lines for indent. 356* Spaces are used for alignment. 357* Lines are limited to 80 characters, period. 358* Pointer asterisk (`*`) goes with the variable (on the right), not the type, 359 unless it is for a pointer type returned from a function. 360* The opening brace is put on the same line as the header for the function, 361 loop, or `if` statement. 362* Unless the header is more than one line, in which case the opening brace is 363 put on its own line. 364* If the opening brace is put on its own line, there is no blank line after it. 365* If the opening brace is *not* put on its own line, there *is* a blank line 366 after it, *unless* the block is only one or two lines long. 367* Code lines are grouped into what I call "paragraphs." Basically, lines that 368 seem like they should go together are grouped together. This one comes down 369 to judgment. 370* Bodies of `if` statements, `else` statements, and loops that are one line 371 long are put on the same line as the statement, unless the header is more than 372 one line long, and/or, the header and body cannot fit into 80 characters with 373 a space inbetween them. 374* If single-line bodies are on a separate line from their headers, and the 375 headers are only a single line, then no braces are used. 376* However, braces are *always* used if they contain another `if` statement or 377 loop. 378* Loops with empty bodies are ended with a semicolon. 379* Expressions that return a boolean value are surrounded by paretheses. 380* Macro backslashes are aligned as far to the left as possible. 381* Binary operators have spaces on both sides. 382* If a line with binary operators overflows 80 characters, a newline is inserted 383 *after* binary operators. 384* Function modifiers and return types are on the same line as the function name. 385* With one exception, `goto`'s are only used to jump to the end of a function 386 for cleanup. 387* All structs, enums, and unions are `typedef`'ed. 388* All constant data is in one file: [`src/data.c`][131], but the corresponding 389 `extern` declarations are in the appropriate header file. 390* All local variables are declared at the beginning of the scope where they 391 appear. They may be initialized at that point, if it does not invoke UB or 392 otherwise cause bugs. 393* All precondition `assert()`'s (see [Asserts][135]) come *after* local variable 394 declarations. 395* Besides short `if` statements and loops, there should *never* be more than one 396 statement per line. 397 398### ClangFormat 399 400I attempted three times to use [ClangFormat][24] to impose a standard, 401machine-useful style on `bc`. All three failed. Otherwise, the style in this 402repo would be more consistent. 403 404## Repo Structure 405 406Functions are documented with Doxygen-style doc comments. Functions that appear 407in headers are documented in the headers, while static functions are documented 408where they are defined. 409 410### `bcl.sln` 411 412A Visual Studio solution file for [`bcl`][156]. This, along with 413[`bcl.vcxproj`][63] and [`bcl.vcxproj.filters`][64] is what makes it possible to 414build [`bcl`][156] on Windows. 415 416### `bcl.vcxproj` 417 418A Visual Studio project file for [`bcl`][156]. This, along with [`bcl.sln`][65] 419and [`bcl.vcxproj.filters`][64] is what makes it possible to build [`bcl`][156] 420on Windows. 421 422### `bcl.vcxproj.filters` 423 424A Visual Studio filters file for [`bcl`][156]. This, along with [`bcl.sln`][65] 425and [`bcl.vcxproj`][63] is what makes it possible to build [`bcl`][156] on 426Windows. 427 428### `bc.sln` 429 430A Visual Studio solution file for `bc`. This, along with [`bc.vcxproj`][66] 431and [`bc.vcxproj.filters`][67] is what makes it possible to build `bc` on 432Windows. 433 434### `bc.vcxproj` 435 436A Visual Studio project file for `bc`. This, along with [`bc.sln`][68] and 437[`bc.vcxproj.filters`][67] is what makes it possible to build `bc` on Windows. 438 439### `bc.vcxproj.filters` 440 441A Visual Studio filters file for `bc`. This, along with [`bc.sln`][68] and 442[`bc.vcxproj`][66] is what makes it possible to build `bc` on Windows. 443 444### `configure` 445 446A symlink to [`configure.sh`][69]. 447 448### `configure.sh` 449 450This is the script to configure `bc` and [`bcl`][156] for building. 451 452This `bc` has a custom build system. The reason for this is because of 453[*portability*][136]. 454 455If `bc` used an outside build system, that build system would be an external 456dependency. Thus, I had to write a build system for `bc` that used nothing but 457C99 and POSIX utilities. 458 459One of those utilities is POSIX `sh`, which technically implements a 460Turing-complete programming language. It's a terrible one, but it works. 461 462A user that wants to build `bc` on a POSIX system (not Windows) first runs 463`configure.sh` with the options he wants. `configure.sh` uses those options and 464the `Makefile` template ([`Makefile.in`][70]) to generate an actual valid 465`Makefile`. Then `make` can do the rest. 466 467For more information about the build process, see the [Build System][142] 468section and the [build manual][14]. 469 470For more information about shell scripts, see [POSIX Shell Scripts][76]. 471 472`configure.sh` does the following: 473 4741. It processes command-line arguments and figure out what the user wants to 475 build. 4762. It reads in [`Makefile.in`][70]. 4773. One-by-one, it replaces placeholders (in [`Makefile.in`][70]) of the form 478 `%%<placeholder_name>%%` based on the [build type][81]. 4794. It appends a list of file targets based on the [build type][81]. 4805. It appends the correct test targets. 4816. It copies the correct manpage and markdown manual for `bc` and `dc` into a 482 location from which they can be copied for install. 4837. It does a `make clean` to reset the build state. 484 485### `.gitattributes` 486 487A `.gitattributes` file. This is needed to preserve the `crlf` line endings in 488the Visual Studio files. 489 490### `.gitignore` 491 492The `.gitignore` 493 494### `LICENSE.md` 495 496This is the `LICENSE` file, including the licenses of various software that I 497have borrowed. 498 499### `Makefile.in` 500 501This is the `Makefile` template for [`configure.sh`][69] to use for generating a 502`Makefile`. 503 504For more information, see [`configure.sh`][69], the [Build System][142] section, 505and the [build manual][14]. 506 507Because of [portability][136], the generated `Makefile.in` should be a pure 508[POSIX `make`][74]-compatible `Makefile` (minus the placeholders). Here are a 509few snares for the unwary programmer in this file: 510 5111. No extensions allowed, including and especially GNU extensions. 5122. If new headers are added, they must also be added to `Makefile.in`. 5133. Don't delete the `.POSIX:` empty target at the top; that's what tells `make` 514 implementations that pure [POSIX `make`][74] is needed. 515 516In particular, there is no way to set up variables other than the `=` operator. 517There are no conditionals, so all of the conditional stuff must be in 518[`configure.sh`][69]. This is, in fact, why [`configure.sh`][69] exists in the 519first place: [POSIX `make`][74] is barebones and only does a build with no 520configuration. 521 522### `NEWS.md` 523 524A running changelog with an entry for each version. This should be updated at 525the same time that [`include/version.h`][75] is. 526 527### `NOTICE.md` 528 529The `NOTICE` file with proper attributions. 530 531### `README.md` 532 533The `README`. Read it. 534 535### `benchmarks/` 536 537The folder containing files to generate benchmarks. 538 539Each of these files was made, at one time or another, to benchmark some 540experimental feature, so if it seems there is no rhyme or reason to these 541benchmarks, it is because there is none, besides historical accident. 542 543#### `bc/` 544 545The folder containing `bc` scripts to generate `bc` benchmarks. 546 547##### `add.bc` 548 549The file to generate the benchmark to benchmark addition in `bc`. 550 551##### `arrays_and_constants.bc` 552 553The file to generate the benchmark to benchmark `bc` using lots of array names 554and constants. 555 556##### `arrays.bc` 557 558The file to generate the benchmark to benchmark `bc` using lots of array names. 559 560##### `constants.bc` 561 562The file to generate the benchmark to benchmark `bc` using lots of constants. 563 564##### `divide.bc` 565 566The file to generate the benchmark to benchmark division in `bc`. 567 568##### `functions.bc` 569 570The file to generate the benchmark to benchmark `bc` using lots of functions. 571 572##### `irand_long.bc` 573 574The file to generate the benchmark to benchmark `bc` using lots of calls to 575`irand()` with large bounds. 576 577##### `irand_short.bc` 578 579The file to generate the benchmark to benchmark `bc` using lots of calls to 580`irand()` with small bounds. 581 582##### `lib.bc` 583 584The file to generate the benchmark to benchmark `bc` using lots of calls to 585heavy functions in `lib.bc`. 586 587##### `multiply.bc` 588 589The file to generate the benchmark to benchmark multiplication in `bc`. 590 591##### `postfix_incdec.bc` 592 593The file to generate the benchmark to benchmark `bc` using postfix increment and 594decrement operators. 595 596##### `power.bc` 597 598The file to generate the benchmark to benchmark power (exponentiation) in `bc`. 599 600##### `subtract.bc` 601 602The file to generate the benchmark to benchmark subtraction in `bc`. 603 604##### `strings.bc` 605 606The file to generate the benchmark to benchmark `bc` using lots of strings. 607 608#### `dc/` 609 610The folder containing `dc` scripts to generate `dc` benchmarks. 611 612##### `modexp.dc` 613 614The file to generate the benchmark to benchmark modular exponentiation in `dc`. 615 616### `gen/` 617 618A folder containing the files necessary to generate C strings that will be 619embedded in the executable. 620 621All of the files in this folder have license headers, but the program and script 622that can generate strings from them include code to strip the license header out 623before strings are generated. 624 625#### `bc_help.txt` 626 627A text file containing the text displayed for `bc -h` or `bc --help`. 628 629This text just contains the command-line options and a short summary of the 630differences from GNU and BSD `bc`'s. It also directs users to the manpage. 631 632The reason for this is because otherwise, the help would be far too long to be 633useful. 634 635**Warning**: The text has some `printf()` format specifiers. You need to make 636sure the format specifiers match the arguments given to `bc_file_printf()`. 637 638#### `dc_help.txt` 639 640A text file containing the text displayed for `dc -h` or `dc --help`. 641 642This text just contains the command-line options and a short summary of the 643differences from GNU and BSD `dc`'s. It also directs users to the manpage. 644 645The reason for this is because otherwise, the help would be far too long to be 646useful. 647 648**Warning**: The text has some `printf()` format specifiers. You need to make 649sure the format specifiers match the arguments given to `bc_file_printf()`. 650 651#### `lib.bc` 652 653A `bc` script containing the [standard math library][5] required by POSIX. See 654the [POSIX standard][2] for what is required. 655 656This file does not have any extraneous whitespace, except for tabs at the 657beginning of lines. That is because this data goes directly into the binary, 658and whitespace is extra bytes in the binary. Thus, not having any extra 659whitespace shrinks the resulting binary. 660 661However, tabs at the beginning of lines are kept for two reasons: 662 6631. Readability. (This file is still code.) 6642. The program and script that generate strings from this file can remove 665 tabs at the beginning of lines. 666 667For more details about the algorithms used, see the [algorithms manual][25]. 668 669However, there are a few snares for unwary programmers. 670 671First, all constants must be one digit. This is because otherwise, multi-digit 672constants could be interpreted wrongly if the user uses a different `ibase`. 673This does not happen with single-digit numbers because they are guaranteed to be 674interpreted what number they would be if the `ibase` was as high as possible. 675 676This is why `A` is used in the library instead of `10`, and things like `2*9*A` 677for `180` in [`lib2.bc`][26]. 678 679As an alternative, you can set `ibase` in the function, but if you do, make sure 680to set it with a single-digit number and beware the snare below... 681 682Second, `scale`, `ibase`, and `obase` must be safely restored before returning 683from any function in the library. This is because without the `-g` option, 684functions are allowed to change any of the globals. 685 686Third, all local variables in a function must be declared in an `auto` statement 687before doing anything else. This includes arrays. However, function parameters 688are considered predeclared. 689 690Fourth, and this is only a snare for `lib.bc`, not [`lib2.bc`][26], the code 691must not use *any* extensions. It has to work when users use the `-s` or `-w` 692flags. 693 694#### `lib2.bc` 695 696A `bc` script containing the [extended math library][7]. 697 698Like [`lib.bc`][8], and for the same reasons, this file should have no 699extraneous whitespace, except for tabs at the beginning of lines. 700 701For more details about the algorithms used, see the [algorithms manual][25]. 702 703Also, be sure to check [`lib.bc`][8] for the snares that can trip up unwary 704programmers when writing code for `lib2.bc`. 705 706#### `strgen.c` 707 708Code for the program to generate C strings from text files. This is the original 709program, although [`strgen.sh`][9] was added later. 710 711The reason I used C here is because even though I knew `sh` would be available 712(it must be available to run `configure.sh`), I didn't know how to do what I 713needed to do with POSIX utilities and `sh`. 714 715Later, [`strgen.sh`][9] was contributed by Stefan Eßer of FreeBSD, showing that 716it *could* be done with `sh` and POSIX utilities. 717 718However, `strgen.c` exists *still* exists because the versions generated by 719[`strgen.sh`][9] may technically hit an environmental limit. (See the [draft C99 720standard][12], page 21.) This is because [`strgen.sh`][9] generates string 721literals, and in C99, string literals can be limited to 4095 characters, and 722`gen/lib2.bc` is above that. 723 724Fortunately, the limit for "objects," which include `char` arrays, is much 725bigger: 65535 bytes, so that's what `strgen.c` generates. 726 727However, the existence of `strgen.c` does come with a cost: the build needs C99 728compiler that targets the host machine. For more information, see the ["Cross 729Compiling" section][13] of the [build manual][14]. 730 731Read the comments in `strgen.c` for more detail about it, the arguments it 732takes, and how it works. 733 734#### `strgen.sh` 735 736An `sh` script that will generate C strings that uses only POSIX utilities. This 737exists for those situations where a host C99 compiler is not available, and the 738environment limits mentioned above in [`strgen.c`][15] don't matter. 739 740`strgen.sh` takes the same arguments as [`strgen.c`][15], and the arguments mean 741the exact same things, so see the comments in [`strgen.c`][15] for more detail 742about that, and see the comments in `strgen.sh` for more details about it and 743how it works. 744 745For more information about shell scripts, see [POSIX Shell Scripts][76]. 746 747### `include/` 748 749A folder containing the headers. 750 751The headers are not included among the source code because I like it better that 752way. Also there were folders within `src/` at one point, and I did not want to 753see `#include "../some_header.h"` or things like that. 754 755So all headers are here, even though only one ([`bcl.h`][30]) is meant for end 756users (to be installed in `INCLUDEDIR`). 757 758#### `args.h` 759 760This file is the API for processing command-line arguments. 761 762#### `bc.h` 763 764This header is the API for `bc`-only items. This includes the `bc_main()` 765function and the `bc`-specific lexing and parsing items. 766 767The `bc` parser is perhaps the most sensitive part of the entire codebase. See 768the documentation in `bc.h` for more information. 769 770The code associated with this header is in [`src/bc.c`][40], 771[`src/bc_lex.c`][41], and [`src/bc_parse.c`][42]. 772 773#### `bcl.h` 774 775This header is the API for the [`bcl`][156] library. 776 777This header is meant for distribution to end users and contains the API that end 778users of [`bcl`][156] can use in their own software. 779 780This header, because it's the public header, is also the root header. That means 781that it has platform-specific fixes for Windows. (If the fixes were not in this 782header, the build would fail on Windows.) 783 784The code associated with this header is in [`src/library.c`][43]. 785 786#### `dc.h` 787 788This header is the API for `dc`-only items. This includes the `dc_main()` 789function and the `dc`-specific lexing and parsing items. 790 791The code associated with this header is in [`src/dc.c`][44], 792[`src/dc_lex.c`][45], and [`src/dc_parse.c`][46]. 793 794#### `file.h` 795 796This header is for `bc`'s internal buffered I/O API. 797 798For more information about `bc`'s error handling and custom buffered I/O, see 799[Error Handling][97] and [Custom I/O][114], along with [`status.h`][176] and the 800notes about version [`3.0.0`][32] in the [`NEWS`][32]. 801 802The code associated with this header is in [`src/file.c`][47]. 803 804#### `history.h` 805 806This header is for `bc`'s implementation of command-line editing/history, which 807is based on a [UTF-8-aware fork][28] of [`linenoise`][29]. 808 809For more information, see the [Command-Line History][189] section. 810 811The code associated with this header is in [`src/history.c`][48]. 812 813#### `lang.h` 814 815This header defines the data structures and bytecode used for actual execution 816of `bc` and `dc` code. 817 818Yes, it's misnamed; that's an accident of history where the first things I put 819into it all seemed related to the `bc` language. 820 821The code associated with this header is in [`src/lang.c`][49]. 822 823#### `lex.h` 824 825This header defines the common items that both programs need for lexing. 826 827The code associated with this header is in [`src/lex.c`][50], 828[`src/bc_lex.c`][41], and [`src/dc_lex.c`][45]. 829 830#### `library.h` 831 832This header defines the things needed for [`bcl`][156] that users should *not* 833have access to. In other words, [`bcl.h`][30] is the *public* header for the 834library, and this header is the *private* header for the library. 835 836The code associated with this header is in [`src/library.c`][43]. 837 838#### `num.h` 839 840This header is the API for numbers and math. 841 842The code associated with this header is in [`src/num.c`][39]. 843 844#### `opt.h` 845 846This header is the API for parsing command-line arguments. 847 848It's different from [`args.h`][31] in that [`args.h`][31] is for the main code 849to process the command-line arguments into global data *after* they have already 850been parsed by `opt.h` into proper tokens. In other words, `opt.h` actually 851parses the command-line arguments, and [`args.h`][31] turns that parsed data 852into flags (bits), strings, and expressions that will be used later. 853 854Why are they separate? Because originally, `bc` used `getopt_long()` for 855parsing, so [`args.h`][31] was the only one that existed. After it was 856discovered that `getopt_long()` has different behavior on different platforms, I 857adapted a [public-domain option parsing library][34] to do the job instead. And 858in doing so, I gave it its own header. 859 860They could probably be combined, but I don't really care enough at this point. 861 862The code associated with this header is in [`src/opt.c`][51]. 863 864#### `parse.h` 865 866This header defines the common items that both programs need for parsing. 867 868Note that the parsers don't produce abstract syntax trees (AST's) or any 869intermediate representations. They produce bytecode directly. In other words, 870they don't have special data structures except what they need to do their job. 871 872The code associated with this header is in [`src/parse.c`][50], 873[`src/bc_lex.c`][42], and [`src/dc_lex.c`][46]. 874 875#### `program.h` 876 877This header defines the items needed to manage the data structures in 878[`lang.h`][38] as well as any helper functions needed to generate bytecode or 879execute it. 880 881The code associated with this header is in [`src/program.c`][53]. 882 883#### `rand.h` 884 885This header defines the API for the [pseudo-random number generator 886(PRNG)][179]. 887 888The PRNG only generates fixed-size integers. The magic of generating random 889numbers of arbitrary size is actually given to the code that does math 890([`src/num.c`][39]). 891 892The code associated with this header is in [`src/rand.c`][54]. 893 894#### `read.h` 895 896This header defines the API for reading from files and `stdin`. 897 898Thus, [`file.h`][55] is really for buffered *output*, while this file is for 899*input*. There is no buffering needed for `bc`'s inputs. 900 901The code associated with this header is in [`src/read.c`][56]. 902 903#### `status.h` 904 905This header has several things: 906 907* A list of possible errors that internal `bc` code can use. 908* Compiler-specific fixes. 909* Platform-specific fixes. 910* Macros for `bc`'s [error handling][97]. 911 912There is no code associated with this header. 913 914#### `vector.h` 915 916This header defines the API for the vectors (resizable arrays) that are used for 917data structures. 918 919Vectors are what do the heavy lifting in almost all of `bc`'s data structures. 920Even the maps of identifiers and arrays use vectors. 921 922#### `version.h` 923 924This header defines the version of `bc`. 925 926There is no code associated with this header. 927 928#### `vm.h` 929 930This header defines the API for setting up and running `bc` and `dc`. 931 932It is so named because I think of it as the "virtual machine" of `bc`, though 933that is probably not true as [`program.h`][57] is probably the "virtual machine" 934API. Thus, the name is more historical accident. 935 936The code associated with this header is in [`src/vm.c`][58]. 937 938### `locales/` 939 940This folder contains a bunch of `.msg` files and soft links to the real `.msg` 941files. This is how locale support is implemented in `bc`. 942 943The files are in the format required by the [`gencat`][59] POSIX utility. They 944all have the same messages, in the same order, with the same numbering, under 945the same groups. This is because the locale system expects those messages in 946that order. 947 948The softlinks exist because for many locales, they would contain the exact same 949information. To prevent duplication, they are simply linked to a master copy. 950 951The naming format for all files is: 952 953``` 954<language_code>_<country_code>.<encoding>.msg 955``` 956 957This naming format must be followed for all locale files. 958 959### `manuals/` 960 961This folder contains the documentation for `bc`, `dc`, and [`bcl`][156], along 962with a few other manuals. 963 964#### `algorithms.md` 965 966This file explains the mathematical algorithms that are used. 967 968The hope is that this file will guide people in understanding how the math code 969works. 970 971#### `bc.1.md.in` 972 973This file is a template for the markdown version of the `bc` manual and 974manpages. 975 976For more information about how the manpages and markdown manuals are generated, 977and for why, see [`scripts/manpage.sh`][60] and [Manuals][86]. 978 979#### `bcl.3` 980 981This is the manpage for the [`bcl`][156] library. It is generated from 982[`bcl.3.md`][61] using [`scripts/manpage.sh`][60]. 983 984For the reason why I check generated data into the repo, see 985[`scripts/manpage.sh`][60] and [Manuals][86]. 986 987#### `bcl.3.md` 988 989This is the markdown manual for the [`bcl`][156] library. It is the source for the 990generated [`bcl.3`][62] file. 991 992#### `benchmarks.md` 993 994This is a document that compares this `bc` to GNU `bc` in various benchmarks. It 995was last updated when version [`3.0.0`][32] was released. 996 997It has very little documentation value, other than showing what compiler options 998are useful for performance. 999 1000#### `build.md` 1001 1002This is the [build manual][14]. 1003 1004This `bc` has a custom build system. The reason for this is because of 1005[*portability*][136]. 1006 1007If `bc` used an outside build system, that build system would be an external 1008dependency. Thus, I had to write a build system for `bc` that used nothing but 1009C99 and POSIX utilities, including barebones [POSIX `make`][74]. 1010 1011for more information about the build system, see the [build system][142] 1012section, the [build manual][14], [`configure.sh`][69], and [`Makefile.in`][70]. 1013 1014#### `dc.1.md.in` 1015 1016This file is a template for the markdown version of the `dc` manual and 1017manpages. 1018 1019For more information about how the manpages and markdown manuals are generated, 1020and for why, see [`scripts/manpage.sh`][60] and [Manuals][86]. 1021 1022#### `development.md` 1023 1024The file you are reading right now. 1025 1026#### `header_bcl.txt` 1027 1028Used by [`scripts/manpage.sh`][60] to give the [`bcl.3`][62] manpage a proper 1029header. 1030 1031For more information about generating manuals, see [`scripts/manpage.sh`][60] 1032and [Manuals][86]. 1033 1034#### `header_bc.txt` 1035 1036Used by [`scripts/manpage.sh`][60] to give the [generated `bc` manpages][79] a 1037proper header. 1038 1039For more information about generating manuals, see [`scripts/manpage.sh`][60] 1040and [Manuals][86]. 1041 1042#### `header_dc.txt` 1043 1044Used by [`scripts/manpage.sh`][60] to give the [generated `dc` manpages][80] a 1045proper header. 1046 1047For more information about generating manuals, see [`scripts/manpage.sh`][60] 1048and [Manuals][86]. 1049 1050#### `header.txt` 1051 1052Used by [`scripts/manpage.sh`][60] to give all generated manpages a license 1053header. 1054 1055For more information about generating manuals, see [`scripts/manpage.sh`][60] 1056and [Manuals][86]. 1057 1058#### `release.md` 1059 1060A checklist that I try to somewhat follow when making a release. 1061 1062#### `bc/` 1063 1064A folder containing the `bc` manuals. 1065 1066Each `bc` manual corresponds to a [build type][81]. See that link for more 1067details. 1068 1069For each manual, there are two copies: the markdown version generated from the 1070template, and the manpage generated from the markdown version. 1071 1072#### `dc/` 1073 1074A folder containing the `dc` manuals. 1075 1076Each `dc` manual corresponds to a [build type][81]. See that link for more 1077details. 1078 1079For each manual, there are two copies: the markdown version generated from the 1080template, and the manpage generated from the markdown version. 1081 1082### `scripts/` 1083 1084This folder contains helper scripts. Most of them are written in pure [POSIX 1085`sh`][72], but one ([`karatsuba.py`][78]) is written in Python 3. 1086 1087For more information about the shell scripts, see [POSIX Shell Scripts][76]. 1088 1089#### `afl.py` 1090 1091This script is meant to be used as part of the fuzzing workflow. 1092 1093It does one of two things: checks for valid crashes, or runs `bc` and or `dc` 1094under all of the paths found by [AFL++][125]. 1095 1096See [Fuzzing][82] for more information about fuzzing, including this script. 1097 1098#### `alloc.sh` 1099 1100This script is a quick and dirty script to test whether or not the garbage 1101collection mechanism of the [`BcNum` caching][96] works. It has been little-used 1102because it tests something that is not important to correctness. 1103 1104#### `benchmark.sh` 1105 1106A script making it easy to run benchmarks and to run the executable produced by 1107[`ministat.c`][223] on them. 1108 1109For more information, see the [Benchmarks][144] section. 1110 1111#### `bitfuncgen.c` 1112 1113A source file for an executable to generate tests for `bc`'s bitwise functions 1114in [`gen/lib2.bc`][26]. The executable is `scripts/bitfuncgen`, and it is built 1115with `make bitfuncgen`. It produces the test on `stdout` and the expected 1116results on `stderr`. This means that to generat tests, use the following 1117invokation: 1118 1119``` 1120scripts/bitfuncgen > tests/bc/bitfuncs.txt 2> tests/bc/bitfuncs_results.txt 1121``` 1122 1123It calls `abort()` if it runs into an error. 1124 1125#### `exec-install.sh` 1126 1127This script is the magic behind making sure `dc` is installed properly if it's 1128a symlink to `bc`. It checks to see if it is a link, and if so, it just creates 1129a new symlink in the install directory. Of course, it also installs `bc` itself, 1130or `dc` when it's alone. 1131 1132#### `functions.sh` 1133 1134This file is a bunch of common functions for most of the POSIX shell scripts. It 1135is not supposed to be run; instead, it is *sourced* by other POSIX shell 1136scripts, like so: 1137 1138``` 1139. "$scriptdir/functions.sh" 1140``` 1141 1142or the equivalent, depending on where the sourcing script is. 1143 1144For more information about the shell scripts, see [POSIX Shell Scripts][76]. 1145 1146#### `fuzz_prep.sh` 1147 1148Fuzzing is a regular activity when I am preparing for a release. 1149 1150This script handles all the options and such for building a fuzzable binary. 1151Instead of having to remember a bunch of options, I just put them in this script 1152and run the script when I want to fuzz. 1153 1154For more information about fuzzing, see [Fuzzing][82]. 1155 1156#### `karatsuba.py` 1157 1158This script has at least one of two major differences from most of the other 1159scripts: 1160 1161* It's written in Python 3. 1162* It's meant for software packagers. 1163 1164For example, [`scripts/afl.py`][94] and [`scripts/randmath.py`][95] are both in 1165Python 3, but they are not meant for the end user or software packagers and are 1166not included in source distributions. But this script is. 1167 1168This script breaks my rule of only POSIX utilities necessary for package 1169maintainers, but there's a very good reason for that: it's only meant to be run 1170*once* when the package is created for the first time, and maybe not even then. 1171 1172You see, this script does two things: it tests the Karatsuba implementation at 1173various settings for `KARATSUBA_LEN`, and it figures out what the optimal 1174`KARATSUBA_LEN` is for the machine that it is running on. 1175 1176Package maintainers can use this script, when creating a package for this `bc`, 1177to figure out what is optimal for their users. Then they don't have to run it 1178ever again. So this script only has to run on the packagers machine. 1179 1180I tried to write the script in `sh`, by the way, and I finally accepted the 1181tradeoff of using Python 3 when it became too hard. 1182 1183However, I also mentioned that it's for testing Karatsuba with various settings 1184of `KARATSUBA_LEN`. Package maintainers will want to run the [test suite][124], 1185right? 1186 1187Yes, but this script is not part of the [test suite][124]; it's used for testing 1188in the [`scripts/release.sh`][83] script, which is maintainer use only. 1189 1190However, there is one snare with `karatsuba.py`: I didn't want the user to have 1191to install any Python libraries to run it. Keep that in mind if you change it. 1192 1193#### `link.sh` 1194 1195This script is the magic behind making `dc` a symlink of `bc` when both 1196calculators are built. 1197 1198#### `locale_install.sh` 1199 1200This script does what its name says: it installs locales. 1201 1202It turns out that this is complicated. 1203 1204There is a magic environment variable, `$NLSPATH`, that tells you how and where 1205you are supposed to install locales. 1206 1207Yes, *how*. And where. 1208 1209But now is not the place to rant about `$NLSPATH`. For more information on 1210locales and `$NLSPATH`, see [Locales][85]. 1211 1212#### `locale_uninstall.sh` 1213 1214This script does what its name says: it uninstalls locales. 1215 1216This is far less complicated than installing locales. I basically generate a 1217wildcard path and then list all paths that fit that wildcard. Then I delete each 1218one of those paths. Easy. 1219 1220For more information on locales, see [Locales][85]. 1221 1222#### `manpage.sh` 1223 1224This script is the one that generates markdown manuals from a template and a 1225manpage from a markdown manual. 1226 1227For more information about generating manuals, see [Manuals][86]. 1228 1229#### `ministat.c` 1230 1231This is a file copied [from FreeBSD][221] that calculates the standard 1232statistical numbers, such as mean, average, and median, based on numbers 1233obtained from a file. 1234 1235For more information, see the [FreeBSD ministat(1) manpage][222]. 1236 1237This file allows `bc` to build the `scripts/ministat` executable using the 1238command `make ministat`, and this executable helps programmers evaluate the 1239results of [benchmarks][144] more accurately. 1240 1241#### `package.sh` 1242 1243This script is what helps `bc` maintainers cut a release. It does the following: 1244 12451. Creates the appropriate `git` tag. 12462. Pushes the `git` tag. 12473. Copies the repo to a temp directory. 12484. Removes files that should not be included in source distributions. 12495. Creates the tarballs. 12506. Signs the tarballs. 12517. Zips and signs the Windows executables if they exist. 12528. Calculates and outputs SHA512 and SHA256 sums for all of the files, 1253 including the signatures. 1254 1255This script is for `bc` maintainers to use when cutting a release. It is not 1256meant for outside use. This means that some non-POSIX utilities can be used, 1257such as `git` and `gpg`. 1258 1259In addition, before using this script, it expects that the folders that Windows 1260generated when building `bc`, `dc`, and [`bcl`][156], are in the parent 1261directory of the repo, exactly as Windows generated them. If they are not there, 1262then it will not zip and sign, nor calculate sums of, the Windows executables. 1263 1264Because this script creates a tag and pushes it, it should *only* be run *ONCE* 1265per release. 1266 1267#### `radamsa.sh` 1268 1269A script to test `bc`'s command-line expression parsing code, which, while 1270simple, strives to handle as much as possible. 1271 1272What this script does is it uses the test cases in [`radamsa.txt`][98] an input 1273to the [Radamsa fuzzer][99]. 1274 1275For more information, see the [Radamsa][128] section. 1276 1277#### `radamsa.txt` 1278 1279Initial test cases for the [`radamsa.sh`][100] script. 1280 1281#### `randmath.py` 1282 1283This script generates random math problems and checks that `bc`'s and `dc`'s 1284output matches the GNU `bc` and `dc`. (For this reason, it is necessary to have 1285GNU `bc` and `dc` installed before using this script.) 1286 1287One snare: be sure that this script is using the GNU `bc` and `dc`, not a 1288previously-installed version of this `bc` and `dc`. 1289 1290If you want to check for memory issues or failing asserts, you can build the 1291`bc` using `./scripts/fuzz_prep.sh -a`, and then run it under this script. Any 1292errors or crashes should be caught by the script and given to the user as part 1293of the "checklist" (see below). 1294 1295The basic idea behind this script is that it generates as many math problems as 1296it can, biasing towards situations that may be likely to have bugs, and testing 1297each math problem against GNU `bc` or `dc`. 1298 1299If GNU `bc` or `dc` fails, it just continues. If this `bc` or `dc` fails, it 1300stores that problem. If the output mismatches, it also stores the problem. 1301 1302Then, when the user sends a `SIGINT`, the script stops testing and goes into 1303report mode. One-by-one, it will go through the "checklist," the list of failed 1304problems, and present each problem to the user, as well as whether this `bc` or 1305`dc` crashed, and its output versus GNU. Then the user can decide to add them as 1306test cases, which it does automatically to the appropriate test file. 1307 1308#### `release_settings.txt` 1309 1310A text file of settings combinations that [`release.sh`][83] uses to ensure that 1311`bc` and `dc` build and work with various default settings. [`release.sh`][83] 1312simply reads it line by line and uses each line for one build. 1313 1314#### `release.sh` 1315 1316This script is for `bc` maintainers only. It runs `bc`, `dc`, and [`bcl`][156] 1317through a gauntlet that is mostly meant to be used in preparation for a release. 1318 1319It does the following: 1320 13211. Builds every [build type][81], with every setting combo in 1322 [`release_settings.txt`][93] with both calculators, `bc` alone, and `dc` 1323 alone. 13242. Builds every [build type][81], with every setting combo in 1325 [`release_settings.txt`][93] with both calculators, `bc` alone, and `dc` 1326 alone for 32-bit. 13273. Does #1 and #2 for Debug, Release, Release with Debug Info, and Min Size 1328 Release builds. 13294. Runs the [test suite][124] on every build, if desired. 13305. Runs the [test suite][124] under [ASan, UBSan, and MSan][21] for every build 1331 type/setting combo. 13326. Runs [`scripts/karatsuba.py`][78] in test mode. 13337. Runs the [test suite][124] for both calculators, `bc` alone, and `dc` alone 1334 under [valgrind][20] and errors if there are any memory bugs or memory 1335 leaks. 1336 1337#### `safe-install.sh` 1338 1339A script copied from [musl][101] to atomically install files. 1340 1341#### `test_settings.sh` 1342 1343A quick and dirty script to help automate rebuilding while manually testing the 1344various default settings. 1345 1346This script uses [`test_settings.txt`][103] to generate the various settings 1347combos. 1348 1349For more information about settings, see [Settings][102] in the [build 1350manual][14]. 1351 1352#### `test_settings.txt` 1353 1354A list of the various settings combos to be used by [`test_settings.sh`][104]. 1355 1356### `src/` 1357 1358This folder is, obviously, where the actual heart and soul of `bc`, the source 1359code, is. 1360 1361All of the source files are in one folder; this simplifies the build system 1362immensely. 1363 1364There are separate files for `bc` and `dc` specific code ([`bc.c`][40], 1365[`bc_lex.c`][41], [`bc_parse.c`][42], [`dc.c`][44], [`dc_lex.c`][45], and 1366[`dc_parse.c`][46]) where possible because it is cleaner to exclude an entire 1367source file from a build than to have `#if`/`#endif` preprocessor guards. 1368 1369That said, it was easier in many cases to use preprocessor macros where both 1370calculators used much of the same code and data structures, so there is a 1371liberal sprinkling of them through the code. 1372 1373#### `args.c` 1374 1375Code for processing command-line arguments. 1376 1377The header for this file is [`include/args.h`][31]. 1378 1379#### `bc.c` 1380 1381The code for the `bc` main function `bc_main()`. 1382 1383The header for this file is [`include/bc.h`][106]. 1384 1385#### `bc_lex.c` 1386 1387The code for lexing that only `bc` needs. 1388 1389The headers for this file are [`include/lex.h`][180] and [`include/bc.h`][106]. 1390 1391#### `bc_parse.c` 1392 1393The code for parsing that only `bc` needs. This code is the most complex and 1394subtle in the entire codebase. 1395 1396The headers for this file are [`include/parse.h`][181] and 1397[`include/bc.h`][106]. 1398 1399#### `data.c` 1400 1401Due to [historical accident][23] because of a desire to get my `bc` into 1402[toybox][16], all of the constant data that `bc` needs is all in one file. This 1403is that file. 1404 1405There is no code in this file, but a lot of the const data has a heavy influence 1406on code, including the order of data in arrays because that order has to 1407correspond to the order of other things elsewhere in the codebase. If you change 1408the order of something in this file, run `make test`, and get errors, you 1409changed something that depends on the order that you messed up. 1410 1411Almost all headers have `extern` references to items in this file. 1412 1413#### `dc.c` 1414 1415The code for the `dc` main function `dc_main()`. 1416 1417The header for this file is [`include/dc.h`][182]. 1418 1419#### `dc_lex.c` 1420 1421The code for lexing that only `dc` needs. 1422 1423The headers for this file are [`include/lex.h`][180] and [`include/dc.h`][182]. 1424 1425#### `dc_parse.c` 1426 1427The code for parsing that only `dc` needs. 1428 1429The headers for this file are [`include/parse.h`][181] and 1430[`include/bc.h`][182]. 1431 1432#### `file.c` 1433 1434The code for `bc`'s implementation of buffered I/O. For more information about 1435why I implemented my own buffered I/O, see [`include/file.h`][55], [Error 1436Handling][97], and [Custom I/O][114], along with [`status.h`][176] and the notes 1437about version [`3.0.0`][32] in the [`NEWS`][32]. 1438 1439The header for this file is [`include/file.h`][55]. 1440 1441#### `history.c` 1442 1443The code for `bc`'s implementation of command-line editing/history, which is 1444based on a [UTF-8-aware fork][28] of [`linenoise`][29]. 1445 1446For more information, see the [Command-Line History][189] section. 1447 1448The header for this file is [`include/history.h`][36]. 1449 1450#### `lang.c` 1451 1452The data structures used for actual execution of `bc` and `dc` code. 1453 1454While execution is done in [`src/program.c`][53], this file defines functions 1455for initializing, copying, and freeing the data structures, which is somewhat 1456orthogonal to actual execution. 1457 1458Yes, it's misnamed; that's an accident of history where the first things I put 1459into it all seemed related to the `bc` language. 1460 1461The header for this file is [`include/lang.h`][38]. 1462 1463#### `lex.c` 1464 1465The code for the common things that both programs need for lexing. 1466 1467The header for this file is [`include/lex.h`][180]. 1468 1469#### `library.c` 1470 1471The code to implement the public API of the `bcl` library. 1472 1473The code in this file does a lot to ensure that clients do not have to worry 1474about internal `bc` details, especially error handling with `setjmp()` and 1475`longjmp()`. That and encapsulating the handling of numbers are the bulk of what 1476the code in this file actually does because most of the library is still 1477implemented in [`src/num.c`][39]. 1478 1479The headers for this file are [`include/bcl.h`][30] and 1480[`include/library.h`][183]. 1481 1482#### `main.c` 1483 1484The entry point for both programs; this is the `main()` function. 1485 1486This file has no headers associated with it. 1487 1488#### `num.c` 1489 1490The code for all of the arbitrary-precision [numbers][177] and [math][178] in 1491`bc`. 1492 1493The header for this file is [`include/num.h`][184]. 1494 1495#### `opt.c` 1496 1497The code for parsing command-line options. 1498 1499The header for this file is [`include/opt.h`][35]. 1500 1501#### `parse.c` 1502 1503The code for the common items that both programs need for parsing. 1504 1505The header for this file is [`include/parse.h`][181]. 1506 1507#### `program.c` 1508 1509The code for the actual execution engine for `bc` and `dc` code. 1510 1511The header for this file is [`include/program.h`][57]. 1512 1513#### `rand.c` 1514 1515The code for the [pseudo-random number generator (PRNG)][179] and the special 1516stack handling it needs. 1517 1518The PRNG only generates fixed-size integers. The magic of generating random 1519numbers of arbitrary size is actually given to the code that does math 1520([`src/num.c`][39]). 1521 1522The header for this file is [`include/rand.h`][37]. 1523 1524#### `read.c` 1525 1526The code for reading from files and `stdin`. 1527 1528The header for this file is [`include/read.h`][185]. 1529 1530#### `vector.c` 1531 1532The code for [vectors][111], [maps][186], and [slab vectors][187], along with 1533slabs. 1534 1535The header for this file is [`include/vector.h`][174]. 1536 1537#### `vm.c` 1538 1539The code for setting up and running `bc` and `dc`. 1540 1541It is so named because I think of it as the "virtual machine" of `bc`, though 1542that is probably not true as [`program.h`][57] is probably the "virtual machine" 1543code. Thus, the name is more historical accident. 1544 1545The header for this file is [`include/vm.h`][27]. 1546 1547### `tests/` 1548 1549This directory contains the entire [test suite][124] and its infrastructure. 1550 1551#### `all.sh` 1552 1553A convenience script for the `make run_all_tests` target (see the [Group 1554Tests][141] section for more information). 1555 1556#### `all.txt` 1557 1558The file with the names of the calculators. This is to make it easier for the 1559test scripts to know where the standard and other test directories are. 1560 1561#### `bcl.c` 1562 1563The test for the [`bcl`][156] API. For more information, see the [`bcl` 1564Test][157] section. 1565 1566#### `errors.sh` 1567 1568The script to run the error tests for each calculator. For more information, see 1569the [Error Tests][151] section. 1570 1571#### `extra_required.txt` 1572 1573The file with the list of tests which both calculators have that need the [Extra 1574Math build option][188]. This exists to make it easy for test scripts to skip 1575those tests when the [Extra Math build option][188] is disabled. 1576 1577#### `history.py` 1578 1579The file with all of the history tests. For more information, see the [History 1580Tests][155] section. 1581 1582#### `history.sh` 1583 1584The script to integrate [`history.py`][139] into the build system in a portable 1585way, and to skip it if necessary. 1586 1587This script also re-runs the test three times if it fails. This is because 1588`pexpect` can be flaky at times. 1589 1590#### `other.sh` 1591 1592The script to run the "other" (miscellaneous) tests for each calculator. For 1593more information, see the [Other Tests][154] section. 1594 1595#### `read.sh` 1596 1597The script to run the read tests for each calculator. For more information, see 1598the [`read()` Tests][153] section. 1599 1600#### `script.sed` 1601 1602The `sed` script to edit the output of GNU `bc` when generating script tests. 1603For more information, see the [Script Tests][150] section. 1604 1605#### `script.sh` 1606 1607The script for running one script test. For more information, see the [Script 1608Tests][150] section. 1609 1610#### `scripts.sh` 1611 1612The script to help the `make run_all_tests` (see the [Group Tests][141] section) 1613run all of the script tests. 1614 1615#### `stdin.sh` 1616 1617The script to run the `stdin` tests for each calculator. For more information, 1618see the [`stdin` Tests][152] section. 1619 1620#### `test.sh` 1621 1622The script to run one standard test. For more information, see the [Standard 1623Tests][149] section. 1624 1625#### `bc/` 1626 1627The standard tests directory for `bc`. For more information, see the [Standard 1628Tests][149] section. 1629 1630##### `all.txt` 1631 1632The file to tell the build system and `make run_all_tests` (see the [Group 1633Tests][141] section) what standard tests to run for `bc`, as well as in what 1634order. 1635 1636This file just lists the test names, one per line. 1637 1638##### `errors.txt` 1639 1640The initial error test file for `bc`. This file has one test per line. See the 1641[Error Tests][151] section for more information. 1642 1643##### `posix_errors.txt` 1644 1645The file of tests for POSIX compatibility for `bc`. This file has one test per 1646line. For more information, see the [Error Tests][151] section. 1647 1648##### `timeconst.sh` 1649 1650The script to run the `bc` tests that use the [Linux `timeconst.bc` script][6]. 1651For more information, see the [Linux `timeconst.bc` Script][191]section. 1652 1653##### `errors/` 1654 1655The directory with error tests for `bc`, most discovered by AFL++ (see the 1656[Fuzzing][82] section). There is one test per file. For more information, see 1657the [Error Tests][151] section. 1658 1659##### `scripts/` 1660 1661The script tests directory for `bc`. For more information, see the [Script 1662Tests][150] section. 1663 1664###### `all.txt` 1665 1666A file to tell the build system and `make run_all_tests` (see the [Group 1667Tests][141] section) what script tests to run for `bc`, as well as in what 1668order. 1669 1670This file just lists the test names, one per line. 1671 1672#### `dc/` 1673 1674The standard tests directory for `dc`. For more information, see the [Standard 1675Tests][149] section. 1676 1677##### `all.txt` 1678 1679The file to tell the build system and `make run_all_tests` (see the [Group 1680Tests][141] section) what standard tests to run for `dc`, as well as in what 1681order. 1682 1683This file just lists the test names, one per line. 1684 1685##### `errors.txt` 1686 1687The initial error test file for `dc`. This file has one test per line. See the 1688[Error Tests][151] section for more information. 1689 1690##### `read_errors.txt` 1691 1692The file of tests errors with the `?` command (`read()` in `bc`). This file has 1693one test per line. See the [Error Tests][151] section for more information. 1694 1695##### `errors/` 1696 1697The directory with error tests for `dc`, most discovered by AFL++ (see the 1698[Fuzzing][82] section). There is one test per file. For more information, see 1699the [Error Tests][151] section. 1700 1701##### `scripts/` 1702 1703The script tests directory for `dc`. For more information, see the [Script 1704Tests][150] section. 1705 1706###### `all.txt` 1707 1708The file to tell the build system and `make run_all_tests` (see the [Group 1709Tests][141] section) what script tests to run for `dc`, as well as in what 1710order. 1711 1712This file just lists the test names, one per line. 1713 1714#### `fuzzing/` 1715 1716The directory containing the fuzzing infrastructure. For more information, see 1717the [Fuzzing][82] section. 1718 1719##### `bc_afl_continue.yaml` 1720 1721The [`tmuxp`][123] config (for use with [`tmux`][122]) for easily restarting a 1722fuzz run. For more information, see the [Convenience][130] subsection of the 1723[Fuzzing][82] section. 1724 1725##### `bc_afl.yaml` 1726 1727The [`tmuxp`][123] config (for use with [`tmux`][122]) for easily starting a 1728fuzz run. For more information, see the [Convenience][130] subsection of the 1729[Fuzzing][82] section. 1730 1731Be aware that this will delete all previous unsaved fuzzing tests in the output 1732directories. 1733 1734##### `bc_inputs1/` 1735 1736The fuzzing input directory for the first third of inputs for `bc`. For more 1737information, see the [Corpuses][192] subsection of the [Fuzzing][82] section. 1738 1739##### `bc_inputs2/` 1740 1741The fuzzing input directory for the second third of inputs for `bc`. For more 1742information, see the [Corpuses][192] subsection of the [Fuzzing][82] section. 1743 1744##### `bc_inputs3/` 1745 1746The fuzzing input directory for the third third of inputs for `bc`. For more 1747information, see the [Corpuses][192] subsection of the [Fuzzing][82] section. 1748 1749##### `dc_inputs/` 1750 1751The fuzzing input directory for the inputs for `dc`. For more information, see 1752the [Corpuses][192] subsection of the [Fuzzing][82] section. 1753 1754## Build System 1755 1756The build system is described in detail in the [build manual][14], so 1757maintainers should start there. This section, however, describes some parts of 1758the build system that only maintainers will care about. 1759 1760### Clean Targets 1761 1762`bc` has a default `make clean` target that cleans up the build files. However, 1763because `bc`'s build system can generate many different types of files, there 1764are other clean targets that may be useful: 1765 1766* `make clean_gen` cleans the `gen/strgen` executable generated from 1767 [`gen/strgen.c`][15]. It has no prerequisites. 1768* `make clean` cleans object files, `*.cat` files (see the [Locales][85] 1769 section), executables, and files generated from text files in [`gen/`][145], 1770 including `gen/strgen` if it was built. So this has a prerequisite on 1771 `make clean_gen` in normal use. 1772* `make clean_benchmarks` cleans [benchmarks][144], including the `ministat` 1773 executable. It has no prerequisites. 1774* `make clean_config` cleans the generated `Makefile` and the manuals that 1775 [`configure.sh`][69] copied in preparation for install. It also depends on 1776 `make clean` and `make clean_benchmarks`, so it cleans those items too. This 1777 is the target that [`configure.sh`][69] uses before it does its work. 1778* `make clean_coverage` cleans the generated coverage files for the [test 1779 suite][124]'s [code coverage][146] capabilities. It has no prerequisites. This 1780 is useful if the code coverage tools are giving errors. 1781* `make clean_tests` cleans *everything*. It has prerequisites on all previous 1782 clean targets, but it also cleans all of the [generated tests][143]. 1783 1784When adding more generated files, you may need to add them to one of these 1785targets and/or add a target for them especially. 1786 1787### Preprocessor Macros 1788 1789`bc` and `dc` use *a lot* of preprocessor macros to ensure that each build type: 1790 1791* builds, 1792* works under the [test suite][124], and 1793* excludes as much code as possible from all builds. 1794 1795This section will explain the preprocessor style of `bc` and `dc`, as well as 1796provide an explanation of the macros used. 1797 1798#### Style 1799 1800The style of macro use in `bc` is pretty straightforward: I avoid depending on 1801macro definitions and instead, I set defaults if the macro is not defined and 1802then test the value if the macro with a plain `#if`. 1803 1804(Some examples of setting defaults are in [`include/status.h`][176], just above 1805the definition of the `BcStatus` enum.) 1806 1807In other words, I use `#if` instead of `#ifndef` or `#ifdef`, where possible. 1808 1809There are a couple of cases where I went with standard stuff instead. For 1810example, to test whether I am in debug mode or not, I still use the standard 1811`#ifndef NDEBUG`. 1812 1813#### Standard Macros 1814 1815`BC_ENABLED` 1816 1817: This macro expands to `1` if `bc` is enabled, `0` if disabled. 1818 1819`DC_ENABLED` 1820 1821: This macro expands to `1` if `dc` is enabled, `0` if disabled. 1822 1823`BUILD_TYPE` 1824 1825: The macro expands to the build type, which is one of: `A`, `E`, `H`, `N`, 1826 `EH`, `EN`, `HN`, `EHN`. This build type is used in the help text to direct 1827 the user to the correct markdown manual in the `git.yzena.com` website. 1828 1829`EXECPREFIX` 1830 1831: Thist macro expands to the prefix on the executable name. This is used to 1832 allow `bc` and `dc` to skip the prefix when finding out which calculator is 1833 executing. 1834 1835`BC_NUM_KARATSUBA_LEN` 1836 1837: This macro expands to an integer, which is the length of numbers below which 1838 the Karatsuba multiplication algorithm switches to brute-force 1839 multiplication. 1840 1841`BC_ENABLE_EXTRA_MATH` 1842 1843: This macro expands to `1` if the [Extra Math build option][188] is enabled, 1844 `0` if disabled. 1845 1846`BC_ENABLE_HISTORY` 1847 1848: This macro expands to `1` if the [History build option][193] is enabled, `0` 1849 if disabled. 1850 1851`BC_ENABLE_NLS` 1852 1853: This macro expands to `1` if the [NLS build option][193] (for locales) is 1854 enabled, `0` if disabled. 1855 1856`BC_ENABLE_LIBRARY` 1857 1858: This macro expands to `1` if the [`bcl` library][156] is enabled, `0` if 1859 disabled. If this is enabled, building the calculators themselves is 1860 disabled, but both `BC_ENABLED` and `DC_ENABLED` must be non-zero. 1861 1862`BC_ENABLE_MEMCHECK` 1863 1864: This macro expands to `1` if `bc` has been built for use with Valgrind's 1865 [Memcheck][194], `0` otherwise. This ensures that fatal errors still free 1866 all of their memory when exiting. `bc` does not do that normally because 1867 what's the point? 1868 1869`BC_ENABLE_AFL` 1870 1871: This macro expands to `1` if `bc` has been built for fuzzing with 1872 [AFL++][125], `0` otherwise.. See the [Fuzzing][82] section for more 1873 information. 1874 1875`BC_DEFAULT_BANNER` 1876 1877: This macro expands to the default value for displaying the `bc` banner. 1878 1879`BC_DEFAULT_SIGINT_RESET` 1880 1881: The macro expands to the default value for whether or not `bc` should reset 1882 on `SIGINT` or quit. 1883 1884`BC_DEFAULT_TTY_MODE` 1885 1886: The macro expands to the default value for whether or not `bc` should use 1887 TTY mode when it available. 1888 1889`BC_DEFAULT_PROMPT` 1890 1891: This macro expands to the default value for whether or not `bc` should use a 1892 prompt when TTY mode is available. 1893 1894`DC_DEFAULT_SIGINT_RESET` 1895 1896: The macro expands to the default value for whether or not `dc` should reset 1897 on `SIGINT` or quit. 1898 1899`DC_DEFAULT_TTY_MODE` 1900 1901: The macro expands to the default value for whether or not `dc` should use 1902 TTY mode when it available. 1903 1904`DC_DEFAULT_PROMPT` 1905 1906: This macro expands to the default value for whether or not `dc` should use a 1907 prompt when TTY mode is available. 1908 1909`BC_DEBUG_CODE` 1910 1911: If this macro expands to a non-zero integer, then `bc` is built with *a lot* 1912 of extra debugging code. This is never set by the build system and must be 1913 set by the programmer manually. This should never be set in builds given to 1914 end users. For more information, see the [Debugging][134] section. 1915 1916## Test Suite 1917 1918While the source code may be the heart and soul of `bc`, the test suite is the 1919arms and legs: it gives `bc` the power to do anything it needs to do. 1920 1921The test suite is what allowed `bc` to climb to such high heights of quality. 1922This even goes for fuzzing because fuzzing depends on the test suite for its 1923input corpuses. (See the [Fuzzing][82] section.) 1924 1925Understanding how the test suite works should be, I think, the first thing that 1926maintainers learn after learning what `bc` and `dc` should do. This is because 1927the test suite, properly used, gives confidence that changes have not caused 1928bugs or regressions. 1929 1930That is why I spent the time to make the test suite as easy to use and as fast 1931as possible. 1932 1933To use the test suite (assuming `bc` and/or `dc` are already built), run the 1934following command: 1935 1936``` 1937make test 1938``` 1939 1940That's it. That's all. 1941 1942It will return an error code if the test suite failed. It will also print out 1943information about the failure. 1944 1945If you want the test suite to go fast, then run the following command: 1946 1947``` 1948make -j<cores> test 1949``` 1950 1951Where `<cores>` is the number of cores that your computer has. Of course, this 1952requires a `make` implementation that supports that option, but most do. (And I 1953will use this convention throughout the rest of this section.) 1954 1955I have even tried as much as possible, to put longer-running tests near the 1956beginning of the run so that the entire suite runs as fast as possible. 1957 1958However, if you want to be sure which test is failing, then running a bare 1959`make test` is a great way to do that. 1960 1961But enough about how you have no excuses to use the test suite as much as 1962possible; let's talk about how it works and what you *can* do with it. 1963 1964### Standard Tests 1965 1966The heavy lifting of testing the math in `bc`, as well as basic scripting, is 1967done by the "standard tests" for each calculator. 1968 1969These tests use the files in the [`tests/bc/`][161] and [`tests/dc/`][162] 1970directories (except for [`tests/bc/all.txt`][163], [`tests/bc/errors.txt`][164], 1971[`tests/bc/posix_errors.txt`][165], [`tests/bc/timeconst.sh`][166], 1972[`tests/dc/all.txt`][167], [`tests/dc/errors.txt`][168], and 1973[`tests/dc/read_errors.txt`][175]), which are called the "standard test 1974directories." 1975 1976For every test, there is the test file and the results file. The test files have 1977names of the form `<test>.txt`, where `<test>` is the name of the test, and the 1978results files have names of the form `<test>_results.txt`. 1979 1980If the test file exists but the results file does not, the results for that test 1981are generated by a GNU-compatible `bc` or `dc`. See the [Generated Tests][143] 1982section. 1983 1984The `all.txt` file in each standard tests directory is what tells the test suite 1985and [build system][142] what tests there are, and the tests are either run in 1986that order, or in the case of parallel `make`, that is the order that the 1987targets are listed as prerequisites of `make test`. 1988 1989If the test exists in the `all.txt` file but does not *actually* exist, the test 1990and its results are generated by a GNU-compatible `bc` or `dc`. See the 1991[Generated Tests][143] section. 1992 1993To add a non-generated standard test, do the following: 1994 1995* Add the test file (`<test>.txt` in the standard tests directory). 1996* Add the results file (`<test>_results.txt` in the standard tests directory). 1997 You can skip this step if just the results file needs to be generated. See the 1998 [Generated Tests][147] section for more information. 1999* Add the name of the test to the `all.txt` file in the standard tests 2000 directory, putting it in the order it should be in. If possible, I would put 2001 longer tests near the beginning because they will start running earlier with 2002 parallel `make`. I always keep `decimal` first, though, as a smoke test. 2003 2004If you need to add a generated standard test, see the [Generated Tests][147] 2005section for how to do that. 2006 2007Some standard tests need to be skipped in certain cases. That is handled by the 2008[build system][142]. See the [Integration with the Build System][147] section 2009for more details. 2010 2011In addition to all of the above, the standard test directory is not only the 2012directory for the standard tests of each calculator, it is also the parent 2013directory of all other test directories for each calculator. 2014 2015#### `bc` Standard Tests 2016 2017The list of current (17 July 2021) standard tests for `bc` is below: 2018 2019decimal 2020 2021: Tests decimal parsing and printing. 2022 2023print 2024 2025: Tests printing in every base from decimal. This is near the top for 2026 performance of parallel testing. 2027 2028parse 2029 2030: Tests parsing in any base and outputting in decimal. This is near the top 2031 for performance of parallel testing. 2032 2033lib2 2034 2035: Tests the extended math library. This is near the top for performance of 2036 parallel testing. 2037 2038print2 2039 2040: Tests printing at the extreme values of `obase`. 2041 2042length 2043 2044: Tests the `length()` builtin function. 2045 2046scale 2047 2048: Tests the `scale()` builtin function. 2049 2050shift 2051 2052: Tests the left (`<<`) and right (`>>`) shift operators. 2053 2054add 2055 2056: Tests addition. 2057 2058subtract 2059 2060: Tests subtraction. 2061 2062multiply 2063 2064: Tests multiplication. 2065 2066divide 2067 2068: Tests division. 2069 2070modulus 2071 2072: Tests modulus. 2073 2074power 2075 2076: Tests power (exponentiation). 2077 2078sqrt 2079 2080: Tests the `sqrt()` (square root) builtin function. 2081 2082trunc 2083 2084: Tests the truncation (`$`) operator. 2085 2086places 2087 2088: Tests the places (`@`) operator. 2089 2090vars 2091 2092: Tests some usage of variables. This one came from [AFL++][125] I think. 2093 2094boolean 2095 2096: Tests boolean operators. 2097 2098comp 2099 2100: Tests comparison operators. 2101 2102abs 2103 2104: Tests the `abs()` builtin function. 2105 2106assignments 2107 2108: Tests assignment operators, including increment/decrement operators. 2109 2110functions 2111 2112: Tests functions, specifically function parameters being replaced before they 2113 themselves are used. See the comment in `bc_program_call()` about the last 2114 condition. 2115 2116scientific 2117 2118: Tests scientific notation. 2119 2120engineering 2121 2122: Tests engineering notation. 2123 2124globals 2125 2126: Tests that assigning to globals affects callers. 2127 2128strings 2129 2130: Tests strings. 2131 2132strings2 2133 2134: Tests string allocation in slabs, to ensure slabs work. 2135 2136letters 2137 2138: Tests single and double letter numbers to ensure they behave differently. 2139 Single-letter numbers always be set to the same value, regardless of 2140 `ibase`. 2141 2142exponent 2143 2144: Tests the `e()` function in the math library. 2145 2146log 2147 2148: Tests the `l()` function in the math library. 2149 2150pi 2151 2152: Tests that `bc` produces the right value of pi for numbers with varying 2153 `scale` values. 2154 2155arctangent 2156 2157: Tests the `a()` function in the math library. 2158 2159sine 2160 2161: Tests the `s()` function in the math library. 2162 2163cosine 2164 2165: Tests the `c()` function in the math library. 2166 2167bessel 2168 2169: Tests the `j()` function in the math library. 2170 2171arrays 2172 2173: Test arrays. 2174 2175misc 2176 2177: Miscellaneous tests. I named it this because at the time, I struggled to 2178 classify them, but it's really testing multi-line numbers. 2179 2180misc1 2181 2182: A miscellaneous test found by [AFL++][125]. 2183 2184misc2 2185 2186: A miscellaneous test found by [AFL++][125]. 2187 2188misc3 2189 2190: A miscellaneous test found by [AFL++][125]. 2191 2192misc4 2193 2194: A miscellaneous test found by [AFL++][125]. 2195 2196misc5 2197 2198: A miscellaneous test found by [AFL++][125]. 2199 2200misc6 2201 2202: A miscellaneous test found by [AFL++][125]. 2203 2204misc7 2205 2206: A miscellaneous test found by [AFL++][125]. 2207 2208void 2209 2210: Tests void functions. 2211 2212rand 2213 2214: Tests the pseudo-random number generator and its special stack handling. 2215 2216recursive_arrays 2217 2218: Tested the slab vector undo ability in used in `bc_parse_name()` when it 2219 existed. Now used as a stress test. 2220 2221divmod 2222 2223: Tests divmod. 2224 2225modexp 2226 2227: Tests modular exponentiation. 2228 2229bitfuncs 2230 2231: Tests the bitwise functions, `band()`, `bor()`, `bxor()`, `blshift()` and 2232 `brshift()` in [`gen/lib2.bc`][26]. 2233 2234#### `dc` Standard Tests 2235 2236The list of current (17 July 2021) standard tests for `dc` is below: 2237 2238decimal 2239 2240: Tests decimal parsing and printing. 2241 2242length 2243 2244: Tests the `length()` builtin function, including for strings and arrays. 2245 2246stack_len 2247 2248: Tests taking the length of the results stack. 2249 2250stack_len 2251 2252: Tests taking the length of the execution stack. 2253 2254add 2255 2256: Tests addition. 2257 2258subtract 2259 2260: Tests subtraction. 2261 2262multiply 2263 2264: Tests multiplication. 2265 2266divide 2267 2268: Tests division. 2269 2270modulus 2271 2272: Tests modulus. 2273 2274divmod 2275 2276: Tests divmod. 2277 2278power 2279 2280: Tests power (exponentiation). 2281 2282sqrt 2283 2284: Tests the `sqrt()` (square root) builtin function. 2285 2286modexp 2287 2288: Tests modular exponentiation. 2289 2290boolean 2291 2292: Tests boolean operators. 2293 2294negate 2295 2296: Tests negation as a command and as part of numbers. 2297 2298trunc 2299 2300: Tests the truncation (`$`) operator. 2301 2302places 2303 2304: Tests the places (`@`) operator. 2305 2306shift 2307 2308: Tests the left (`<<`) and right (`>>`) shift operators. 2309 2310abs 2311 2312: Tests the `abs()` builtin function. 2313 2314scientific 2315 2316: Tests scientific notation. 2317 2318engineering 2319 2320: Tests engineering notation. 2321 2322vars 2323 2324: Tests some usage of variables. This one came from [AFL++][125] I think. 2325 2326misc 2327 2328: Miscellaneous tests. I named it this because at the time, I struggled to 2329 classify them. 2330 2331strings 2332 2333: Tests strings. 2334 2335rand 2336 2337: Tests the pseudo-random number generator and its special stack handling. 2338 2339exec_stack 2340 2341: Tests the execution stack depth command. 2342 2343### Script Tests 2344 2345The heavy lifting of testing the scripting of `bc` is done by the "script tests" 2346for each calculator. 2347 2348These tests use the files in the [`tests/bc/scripts/`][169] and 2349[`tests/dc/scripts/`][170] directories (except for 2350[`tests/bc/scripts/all.txt`][171] and [`tests/dc/scripts/all.txt`][172]), which 2351are called the "script test directories." 2352 2353To add a script test, do the following: 2354 2355* Add the test file (`<test>.bc` or `<test>.dc` in the script tests directory). 2356* Add the results file (`<test>.txt` in the script tests directory). You can 2357 skip this step if just the results file needs to be generated. See the 2358 [Generated Tests][147] section for more information. 2359* Add the name of the test to the `all.txt` file in the script tests directory, 2360 putting it in the order it should be in. If possible, I would put longer tests 2361 near the beginning because they will start running earlier with parallel 2362 `make`. 2363 2364Some script tests need to be skipped in certain cases. That is handled by the 2365[build system][142]. See the [Integration with the Build System][147] section 2366for more details. 2367 2368Another unique thing about the script tests, at least for `bc`: they test the 2369`-g` and `--global-stacks` flags. This means that all of the script tests for 2370`bc` are written assuming the `-g` flag was given on the command-line 2371 2372There is one extra piece of script tests: [`tests/script.sed`][190]. This `sed` 2373script is used to remove an incompatibility with GNU `bc`. 2374 2375If there is only one more character to print at the end of `BC_LINE_LENGTH`, GNU 2376`bc` still prints a backslash+newline+digit combo. OpenBSD doesn't, which is 2377correct according to my reading of the `bc` spec, so my `bc` doesn't as well. 2378 2379The `sed` script edits numbers that end with just one digit on a line by itself 2380to put it on the same line as others. 2381 2382#### `bc` Script Tests 2383 2384The list of current (17 July 2021) script tests for `bc` is below: 2385 2386print.bc 2387 2388: Tests printing even harder than the print standard test. 2389 2390multiply.bc 2391 2392: Tests multiplication even harder than the multiply standard test. 2393 2394divide.bc 2395 2396: Tests division even harder than the divide standard test. 2397 2398subtract.bc 2399 2400: Tests subtraction even harder than the subtract standard test. 2401 2402add.bc 2403 2404: Tests addition even harder than the add standard test. 2405 2406parse.bc 2407 2408: Tests parsing even harder than the parse standard test. 2409 2410array.bc 2411 2412: Tests arrays even harder than the arrays standard test. 2413 2414atan.bc 2415 2416: Tests arctangent even harder than the arctangent standard test. 2417 2418bessel.bc 2419 2420: Tests bessel even harder than the bessel standard test. 2421 2422functions.bc 2423 2424: Tests functions even harder than the functions standard test. 2425 2426globals.bc 2427 2428: Tests global stacks directly. 2429 2430len.bc 2431 2432: Tests the `length()` builtin on arrays. 2433 2434rand.bc 2435 2436: Tests the random number generator in the presence of global stacks. 2437 2438references.bc 2439 2440: Tests functions with array reference parameters. 2441 2442screen.bc 2443 2444: A random script provided by an early user that he used to calculate the size 2445 of computer screens 2446 2447strings2.bc 2448 2449: Tests escaping in strings. 2450 2451#### `dc` Script Tests 2452 2453The list of current (17 July 2021) script tests for `dc` is below: 2454 2455prime.dc 2456 2457: Tests scripting by generating the first 100,000 primes. 2458 2459asciify.dc 2460 2461: Tests the asciify command. 2462 2463stream.dc 2464 2465: Tests the stream command. 2466 2467array.dc 2468 2469: Tests arrays. 2470 2471else.dc 2472 2473: Tests else clauses on conditional execution commands. 2474 2475factorial.dc 2476 2477: Tests scripting with factorial. 2478 2479loop.dc 2480 2481: Tests scripting by implementing loops. 2482 2483quit.dc 2484 2485: Tests the quit command in the presence of tail calls. 2486 2487weird.dc 2488 2489: A miscellaneous test. 2490 2491### Error Tests 2492 2493One of the most useful parts of the `bc` test suite, in my opinion, is the heavy 2494testing of error conditions. 2495 2496Just about every error condition I can think of is tested, along with many 2497machine-generated (by [AFL++][125]) ones. 2498 2499However, because the error tests will often return error codes, they require 2500different infrastructure from the rest of the test suite, which assumes that 2501the calculator under test will return successfully. A lot of that infrastructure 2502is in the [`scripts/functions.sh`][105] script, but it basically allows the 2503calculator to exit with an error code and then tests that there *was* an error 2504code. 2505 2506Besides returning error codes, error tests also ensure that there is output from 2507`stderr`. This is to make sure that an error message is always printed. 2508 2509The error tests for each calculator are spread through two directories, due to 2510historical accident. These two directories are the standard test directory (see 2511the [Standard Tests][149] section) and the `errors/` directory directly 2512underneath the standard tests directory. 2513 2514This split is convenient, however, because the tests in each directory are 2515treated differently. 2516 2517The error tests in the standard test directory, which include `errors.txt` for 2518both calculators, `posix_errors.txt` for `bc`, and `read_errors.txt` for `dc`, 2519are read line-by-line and shoved through `stdin`, and each line is considered a 2520separate test. For this reason, there can't be any blank lines in the error 2521files in the standard tests directory because a blank line causes a successful 2522exit. 2523 2524On the other hand, the tests in the `errors/` directory below the standard tests 2525directory are considered to be one test per file, and they are used differently. 2526They are shoved into the calculator through `stdin`, but they are also executed 2527on the command-line. 2528 2529To add an error test, first figure out which kind you want. 2530 2531Is it a simple one-liner, and you don't care if it's tested through a file? 2532 2533Then put it in one of the error files in the standard test directory. I would 2534only put POSIX errors in the `posix_errors.txt` file for `bc`, and only `read()` 2535errors in the `read_errors.txt` file for `dc`; all others I would put in the 2536respective `errors.txt` file. 2537 2538On the other hand, if you care if the error is run as a file on the 2539command-line, or the error requires multiple lines to reproduce, then put the 2540test in the respective `errors/` directory. 2541 2542After that, you are done; the test suite will automatically pick up the new 2543test, and you don't have to tell the test suite the expected results. 2544 2545### `stdin` Tests 2546 2547The `stdin` tests specifically test the lexing and parsing of multi-line 2548comments and strings. This is important because when reading from `stdin`, the 2549calculators can only read one line at a time, so partial parses are possible. 2550 2551To add `stdin` tests, just add the tests to the `stdin.txt` file in the 2552respective standard tests directory, and add the expected results in the 2553`stdin_results.txt` in the respective standard tests directory. 2554 2555### `read()` Tests 2556 2557The `read()` tests are meant to test the `read()` builtin function, to ensure 2558that the parsing and execution is correct. 2559 2560Each line is one test, as that is the nature of using the `read()` function, so 2561to add a test, just add it as another line in the `read.txt` file in the 2562respective standard tests directory, and add its result to the 2563`read_results.txt` file in the respective standard tests directory. 2564 2565### Other Tests 2566 2567The "other" tests are just random tests that I could not easily classify under 2568other types of tests. They usually include things like command-line parsing and 2569environment variable testing. 2570 2571To add an other test, it requires adding the programming for it to 2572[`tests/other.sh`][195] because all of the tests are written specifically in 2573that script. It would be best to use the infrastructure in 2574[`scripts/functions.sh`][105]. 2575 2576### Linux `timeconst.bc` Script 2577 2578One special script that `bc`'s test suite will use is the [Linux `timeconst.bc` 2579script][6]. 2580 2581I made the test suite able to use this script because the reason the 2582[toybox][16] maintainer wanted my `bc` is because of this script, and I wanted 2583to be sure that it would run correctly on the script. 2584 2585However, it is not part of the distribution, nor is it part of the repository. 2586The reason for this is because [`timeconst.bc`][6] is under the GPL, while this 2587repo is under a BSD license. 2588 2589If you want `bc` to run tests on [`timeconst.bc`][6], download it and place it 2590at `tests/bc/scripts/timeconst.bc`. If it is there, the test suite will 2591automatically run its tests; otherwise, it will skip it. 2592 2593### History Tests 2594 2595There are automatic tests for history; however, they have dependencies: Python 3 2596and [`pexpect`][137]. 2597 2598As a result, because I need the [test suite to be portable][138], like the rest 2599of `bc`, the history tests are carefully guarded with things to ensure that they 2600are skipped, rather than failing if Python and [`pexpect`][137] are not 2601installed. For this reason, there is a `sh` script, [`tests/history.sh`][140] 2602that runs the actual script, [`tests/history.py`][139]. 2603 2604I have added as many tests as I could to cover as many lines and branches as 2605possible. I guess I could have done more, but doing so would have required a lot 2606of time. 2607 2608I have tried to make it as easy as possible to run the history tests. They will 2609run automatically if you use the `make test` command, and they will also use 2610parallel execution with `make -j<cores> test`. 2611 2612All of the tests are contained in [`tests/history.py`][139]. The reason for this 2613is because they are in Python, and I don't have an easy way of including Python 2614(or at the very least, I am not familiar enough with Python to do that). So they 2615are all in the same file to make it easier on me. 2616 2617Each test is one function in the script. They all take the same number and type 2618of arguments: 2619 26201. `exe`: the executable to run. 26212. `args`: the arguments to pass to the executable. 26223. `env`: the environment. 2623 2624Each function creates a child process with `pexpect.spawn` and then tests with 2625that child. Then the function returns the child to the caller, who closes it 2626and checks its error code against its expected error code. 2627 2628Yes, the error code is not a success all the time. This is because of the UTF-8 2629tests; `bc` gives a fatal error on any non-ASCII data because ASCII is all `bc` 2630is required to handle, per the [standard][2]. 2631 2632So in [`tests/history.py`][139], there are four main arrays: 2633 2634* `bc` test functions, 2635* `bc` expected error codes. 2636* `dc` test functions. 2637* `dc` expected error codes. 2638 2639[`tests/history.py`][139] takes an index as an argument; that index is what test 2640it should run. That index is used to index into the proper test and error code 2641array. 2642 2643If you need to add more history tests, you need to do the following: 2644 26451. Add the function for that test to [`tests/history.py`][139]. 26462. Add the function to the proper array of tests. 26473. Add the expected error code to the proper array of error codes. 26484. Add a target for the test to [`Makefile.in`][70]. 26495. Add that target as a prerequisite to either `test_bc_history` or 2650 `test_dc_history`. 2651 2652You do not need to do anything to add the test to `history_all_tests` (see 2653[Group Tests][141] below) because the scripts will automatically run all of the 2654tests properly. 2655 2656### Generated Tests 2657 2658Some tests are *large*, and as such, it is impractical to check them into `git`. 2659Instead, the tests depend on the existence of a GNU-compatible `bc` in the 2660`PATH`, which is then used to generate the tests. 2661 2662If [`configure.sh`][69] was run with the `-G` argument, which disables generated 2663tests, then `make test` and friends will automatically skip generated tests. 2664This is useful to do on platforms that are not guaranteed to have a 2665GNU-compatible `bc` installed. 2666 2667However, adding a generated test is a complicated because you have to figure out 2668*where* you want to put the file to generate the test. 2669 2670For example, `bc`'s test suite will automatically use a GNU-compatible `bc` to 2671generate a `<test>_results.txt` file in the [standard tests][149] directory 2672(either `tests/bc/` or `tests/dc/`) if none exists for the `<test>` test. If no 2673`<test>.txt` file exists in the [standard tests][149] directory, then `bc`'s 2674test suite will look for a `<test>.bc` or `<test>.dc` file in the [script 2675tests][150] directory (either `tests/bc/scripts` or `tests/dc/scripts`), and if 2676that exists, it will use that script to generate the `<test>.txt` file in the 2677[standard tests][149] directory after which it will generate the 2678`<test>_results.txt` file in the [standard tests][149] directory. 2679 2680So you can choose to either: 2681 2682* Have a test in the [standard tests][149] directory without a corresponding 2683 `*_results.txt` file, or 2684* Have a script in the [script tests][150] directory to generate the 2685 corresponding file in the standard test directory before generating the 2686 corresponding `*_results.txt` file. 2687 2688Adding a script has a double benefit: the script itself can be used as a test. 2689However, script test results can also be generated. 2690 2691If `bc` is asked to run a script test, then if the script does not exist, `bc`'s 2692test suite returns an error. If it *does* exist, but no corresponding 2693`<test>.txt` file exists in the [script tests][150] directory, then a 2694GNU-compatible `bc` is used to generate the `<test>.txt` results file. 2695 2696If generated tests are disabled through [`configure.sh`][69], then these tests 2697are not generated if they do not exist. However, if they *do* exist, then they 2698are run. This can happen if a `make clean_tests` was not run between a build 2699that generated tests and a build that will not. 2700 2701### Group Tests 2702 2703While the test suite has a lot of targets in order to get parallel execution, 2704there are five targets that allow you to run each section, or all, of the test 2705suite as one unit: 2706 2707* `bc_all_tests` (`bc` tests) 2708* `timeconst_all_tests` ([Linux `timeconst.bc` script][6] tests) 2709* `dc_all_tests` (`dc` tests) 2710* `history_all_tests` (history tests) 2711* `run_all_tests` (combination of the previous four) 2712 2713In addition, there are more fine-grained targets available: 2714 2715* `test_bc` runs all `bc` tests (except history tests). 2716* `test_dc` runs all `dc` tests (except history tests). 2717* `test_bc_tests` runs all `bc` [standard tests][149]. 2718* `test_dc_tests` runs all `dc` [standard tests][149]. 2719* `test_bc_scripts` runs all `bc` [script tests][150]. 2720* `test_dc_scripts` runs all `dc` [script tests][150]. 2721* `test_bc_stdin` runs the `bc` [`stdin` tests][152]. 2722* `test_dc_stdin` runs the `dc` [`stdin` tests][152]. 2723* `test_bc_read` runs the `bc` [`read()` tests][153]. 2724* `test_dc_read` runs the `dc` [`read()` tests][153]. 2725* `test_bc_errors` runs the `bc` [error tests][151]. 2726* `test_dc_errors` runs the `dc` [error tests][151]. 2727* `test_bc_other` runs the `bc` [other tests][151]. 2728* `test_dc_other` runs the `dc` [other tests][151]. 2729* `timeconst` runs the tests for the [Linux `timeconst.bc` script][6]. 2730* `test_history` runs all history tests. 2731* `test_bc_history` runs all `bc` history tests. 2732* `test_dc_history` runs all `dc` history tests. 2733 2734All of the above tests are parallelizable. 2735 2736### Individual Tests 2737 2738In addition to all of the above, individual test targets are available. These 2739are mostly useful for attempting to fix a singular test failure. 2740 2741These tests are: 2742 2743* `test_bc_<test>`, where `<test>` is the name of a `bc` [standard test][149]. 2744 The name is the name of the test file without the `.txt` extension. It is the 2745 name printed by the test suite when running the test. 2746* `test_dc_<test>`, where `<test>` is the name of a `dc` [standard test][149]. 2747 The name is the name of the test file without the `.txt` extension. It is the 2748 name printed by the test suite when running the test. 2749* `test_bc_script_<test>`, where `<test>` is the name of a `bc` [script 2750 test][150]. The name of the test is the name of the script without the `.bc` 2751 extension. 2752* `test_dc_script_<test>`, where `<test>` is the name of a `dc` [script 2753 test][150]. The name of the test is the name of the script without the `.dc` 2754 extension. 2755* `test_bc_history<idx>` runs the `bc` history test with index `<idx>`. 2756* `test_dc_history<idx>` runs the `dc` history test with index `<idx>`. 2757 2758### [`bcl`][156] Test 2759 2760When [`bcl`][156] is built, the [build system][142] automatically ensures that 2761`make test` runs the [`bcl`][156] test instead of the `bc` and `dc` tests. 2762 2763There is only one test, and it is built from [`tests/bcl.c`][158]. 2764 2765The reason the test is in C is because [`bcl`][156] is a C library; I did not 2766want to have to write C code *and* POSIX `sh` scripts to run it. 2767 2768The reason there is only one test is because most of the code for the library is 2769tested by virtue of testing `bc` and `dc`; the test needs to only ensure that 2770the library bindings and plumbing do not interfere with the underlying code. 2771 2772However, just because there is only one test does not mean that it doesn't test 2773more than one thing. The code actually handles a series of tests, along with 2774error checking to ensure that nothing went wrong. 2775 2776To add a [`bcl`][156] test, just figure out what test you want, figure out where 2777in the [`tests/bcl.c`][158] would be best to put it, and put it there. Do as 2778much error checking as possible, and use the `err(BclError)` function. Ensure 2779that all memory is freed because that test is run through [Valgrind][159] and 2780[AddressSanitizer][160]. 2781 2782### Integration with the Build System 2783 2784If it was not obvious by now, the test suite is heavily integrated into the 2785[build system][142], but the integration goes further than just making the test 2786suite easy to run from `make` and generating individual and group tests. 2787 2788The big problem the test suite has is that some `bc` code, stuff that is 2789important to test, is only in *some* builds. This includes all of the extra math 2790extensions, for example. 2791 2792So the test suite needs to have some way of turning off the tests that depend on 2793certain [build types][81] when those [build types][81] are not used. 2794 2795This is the reason the is tightly integrated with the [build system][142]: the 2796[build system][142] knows what [build type][81] was used and can tell the test 2797suite to turn off the tests that do not apply. 2798 2799It does this with arguments to the test scripts that are either a `1` or a `0`, 2800depending on whether tests of that type should be enabled or not. These 2801arguments are why I suggest, in the [Test Scripts][148] section, to always use a 2802`make` target to run the test suite or any individual test. I have added a lot 2803of targets to make this easy and as fast as possible. 2804 2805In addition to all of that, the build system is responsible for selecting the 2806`bc`/`dc` tests or the [`bcl` test][157]. 2807 2808### Test Suite Portability 2809 2810The test suite is meant to be run by users and packagers as part of their 2811install process. 2812 2813This puts some constraints on the test suite, but the biggest is that the test 2814suite must be as [portable as `bc` itself][136]. 2815 2816This means that the test suite must be implemented in pure POSIX `make`, `sh`, 2817and C99. 2818 2819#### Test Scripts 2820 2821To accomplish the portability, the test suite is run by a bunch of `sh` scripts 2822that have the constraints laid out in [POSIX Shell Scripts][76]. 2823 2824However, that means they have some quirks, made worse by the fact that there are 2825[generated tests][143] and [tests that need to be skipped, but only 2826sometimes][147]. 2827 2828This means that a lot of the scripts take an awkward number and type of 2829arguments. Some arguments are strings, but most are integers, like 2830[`scripts/release.sh`][83]. 2831 2832It is for this reason that I do not suggest running the test scripts directly. 2833Instead, always use an appropriate `make` target, which already knows the 2834correct arguments for the test because of the [integration with the build 2835system][147]. 2836 2837### Test Coverage 2838 2839In order to get test coverage information, you need `gcc`, `gcov`, and `gcovr`. 2840 2841If you have them, run the following commands: 2842 2843``` 2844CC=gcc ./configure -gO3 -c 2845make -j<cores> 2846make coverage 2847``` 2848 2849Note that `make coverage` does not have a `-j<cores>` part; it cannot be run in 2850parallel. If you try, you will get errors. And note that `CC=gcc` is used. 2851 2852After running those commands, you can open your web browser and open the 2853`index.html` file in the root directory of the repo. From there, you can explore 2854all of the coverage results. 2855 2856If you see lines or branches that you think you could hit with a manual 2857execution, do such manual execution, and then run the following command: 2858 2859``` 2860make coverage_output 2861``` 2862 2863and the coverage output will be updated. 2864 2865If you want to rerun `make coverage`, you must do a `make clean` and build 2866first, like this: 2867 2868``` 2869make clean 2870make -j<cores> 2871make coverage 2872``` 2873 2874Otherwise, you will get errors. 2875 2876If you want to run tests in parallel, you can do this: 2877 2878``` 2879make -j<cores> 2880make -j<cores> test 2881make coverage_output 2882``` 2883 2884and that will generate coverage output correctly. 2885 2886### [AddressSanitizer][21] and Friends 2887 2888To run the test suite under [AddressSanitizer][21] or any of its friends, use 2889the following commands: 2890 2891``` 2892CFLAGS="-fsanitize=<sanitizer> ./configure -gO3 -m 2893make -j<cores> 2894make -j<cores> test 2895``` 2896 2897where `<sanitizer>` is the correct name of the desired sanitizer. There is one 2898exception to the above: `UndefinedBehaviorSanitizer` should be run on a build 2899that has zero optimization, so for `UBSan`, use the following commands: 2900 2901``` 2902CFLAGS="-fsanitize=undefined" ./configure -gO0 -m 2903make -j<cores> 2904make -j<cores> test 2905``` 2906 2907### [Valgrind][20] 2908 2909To run the test suite under [Valgrind][20], run the following commands: 2910 2911``` 2912./configure -gO3 -v 2913make -j<cores> 2914make -j<cores> test 2915``` 2916 2917It really is that easy. I have directly added infrastructure to the build system 2918and the test suite to ensure that if [Valgrind][20] detects any memory errors or 2919any memory leaks at all, it will tell the test suite infrastructure to report an 2920error and exit accordingly. 2921 2922## POSIX Shell Scripts 2923 2924There is a lot of shell scripts in this repository, and every single one of them 2925is written in pure [POSIX `sh`][72]. 2926 2927The reason that they are written in [POSIX `sh`][72] is for *portability*: POSIX 2928systems are only guaranteed to have a barebones implementation of `sh` 2929available. 2930 2931There are *many* snares for unwary programmers attempting to modify 2932[`configure.sh`][69], any of the scripts in this directory, [`strgen.sh`][9], or 2933any of the scripts in [`tests/`][77]. Here are some of them: 2934 29351. No `bash`-isms. 29362. Only POSIX standard utilities are allowed. 29373. Only command-line options defined in the POSIX standard for POSIX utilities 2938 are allowed. 29394. Only the standardized behavior of POSIX utilities is allowed. 29405. Functions return data by *printing* it. Using `return` sets their exit code. 2941 2942In other words, the script must only use what is standardized in the [`sh`][72] 2943and [Shell Command Language][73] standards in POSIX. This is *hard*. It precludes 2944things like `local` and the `[[ ]]` notation. 2945 2946These are *enormous* restrictions and must be tested properly. I put out at 2947least one release with a change to `configure.sh` that wasn't portable. That was 2948an embarrassing mistake. 2949 2950The lack of `local`, by the way, is why variables in functions are named with 2951the form: 2952 2953``` 2954_<function_name>_<var_name> 2955``` 2956 2957This is done to prevent any clashes of variable names with already existing 2958names. And this applies to *all* shell scripts. However, there are a few times 2959when that naming convention is *not* used; all of them are because those 2960functions are required to change variables in the global scope. 2961 2962### Maintainer-Only Scripts 2963 2964If a script is meant to be used for maintainers (of `bc`, not package 2965maintainers), then rules 2, 3, and 4 don't need to be followed as much because 2966it is assumed that maintainers will be able to install whatever tools are 2967necessary to do the job. 2968 2969## Manuals 2970 2971The manuals for `bc` and `dc` are all generated, and the manpages for `bc`, 2972`dc`, and `bcl` are also generated. 2973 2974Why? 2975 2976I don't like the format of manpages, and I am not confident in my ability to 2977write them. Also, they are not easy to read on the web. 2978 2979So that explains why `bcl`'s manpage is generated from its markdown version. But 2980why are the markdown versions of the `bc` and `dc` generated? 2981 2982Because the content of the manuals needs to change based on the [build 2983type][81]. For example, if `bc` was built with no history support, it should not 2984have the **COMMAND LINE HISTORY** section in its manual. If it did, that would 2985just confuse users. 2986 2987So the markdown manuals for `bc` and `dc` are generated from templates 2988([`manuals/bc.1.md.in`][89] and [`manuals/dc.1.md.in`][90]). And from there, 2989the manpages are generated from the generated manuals. 2990 2991The generated manpage for `bcl` ([`manuals/bcl.3`][62]) is checked into version 2992control, and the generated markdown manuals and manpages for `bc` 2993([`manuals/bc`][79]) and `dc` ([`manuals/dc`][80]) are as well. 2994 2995This is because generating the manuals and manpages requires a heavy dependency 2996that only maintainers should care about: [Pandoc][92]. Because users [should not 2997have to install *any* dependencies][136], the files are generated, checked into 2998version control, and included in distribution tarballs. 2999 3000If you run [`configure.sh`][69], you have an easy way of generating the markdown 3001manuals and manpages: just run `make manpages`. This target calls 3002[`scripts/manpage.sh`][60] appropriately for `bc`, `dc`, and `bcl`. 3003 3004For more on how generating manuals and manpages works, see 3005[`scripts/manpage.sh`][60]. 3006 3007## Locales 3008 3009The locale system of `bc` is enormously complex, but that's because 3010POSIX-compatible locales are terrible. 3011 3012How are they terrible? 3013 3014First, `gencat` does not work for generating cross-compilation. In other words, 3015it does not generate machine-portable files. There's nothing I can do about 3016this except for warn users. 3017 3018Second, the format of `.msg` files is...interesting. Thank goodness it is text 3019because otherwise, it would be impossible to get them right. 3020 3021Third, `.msg` files are not used. In other words, `gencat` exists. Why? 3022 3023Fourth, `$NLSPATH` is an awful way to set where and *how* to install locales. 3024 3025Yes, where and *how*. 3026 3027Obviously, from it's name, it's a path, and that's the where. The *how* is more 3028complicated. 3029 3030It's actually *not* a path, but a path template. It's a format string, and it 3031can have a few format specifiers. For more information on that, see [this 3032link][84]. But in essence, those format specifiers configure how each locale is 3033supposed to be installed. 3034 3035With all those problems, why use POSIX locales? Portability, as always. I can't 3036assume that `gettext` will be available, but I *can* pretty well assume that 3037POSIX locales will be available. 3038 3039The locale system of `bc` includes all files under [`locales/`][85], 3040[`scripts/locale_install.sh`][87], [`scripts/locale_uninstall.sh`][88], 3041[`scripts/functions.sh`][105], the `bc_err_*` constants in [`src/data.c`][131], 3042and the parts of the build system needed to activate it. There is also code in 3043[`src/vm.c`][58] (in `bc_vm_gettext()`) for loading the current locale. 3044 3045If the order of error messages and/or categories are changed, the order of 3046errors must be changed in the enum, the default error messages and categories in 3047[`src/data.c`][131], and all of the messages and categories in the `.msg` files 3048under [`locales/`][85]. 3049 3050## Static Analysis 3051 3052I do *some* static analysis on `bc`. 3053 3054I used to use [Coverity][196], but I stopped using it when it started giving me 3055too many false positives and also because it had a vulnerability. 3056 3057However, I still use the [Clang Static Analyzer][197] through 3058[`scan-build`][19]. I only use it in debug mode because I have to add some 3059special code to make it not complain about things that are definitely not a 3060problem. 3061 3062The most frequent example of false positives is where a local is passed to a 3063function to be initialized. [`scan-build`][19] misses that fact, so I 3064pre-initialize such locals to prevent the warnings. 3065 3066To run `scan-build`, do the following: 3067 3068``` 3069make clean 3070scan-build make 3071``` 3072 3073`scan-build` will print its warnings to `stdout`. 3074 3075## Fuzzing 3076 3077The quality of this `bc` is directly related to the amount of fuzzing I did. As 3078such, I spent a lot of work making the fuzzing convenient and fast, though I do 3079admit that it took me a long time to admit that it did need to be faster. 3080 3081First, there were several things which make fuzzing fast: 3082 3083* Using [AFL++][125]'s deferred initialization. 3084* Splitting `bc`'s corpuses. 3085* Parallel fuzzing. 3086 3087Second, there are several things which make fuzzing convenient: 3088 3089* Preprepared input corpuses. 3090* [`scripts/fuzz_prep.sh`][119]. 3091* `tmux` and `tmuxp` configs. 3092* [`scripts/afl.py`][94]. 3093 3094### Fuzzing Performance 3095 3096Fuzzing with [AFL++][125] can be ***SLOW***. Spending the time to make it as 3097fast as possible is well worth the time. 3098 3099However, there is a caveat to the above: it is easy to make [AFL++][125] crash, 3100be unstable, or be unable to find "paths" (see [AFL++ Quickstart][129]) if the 3101performance enhancements are done poorly. 3102 3103To stop [AFL++][125] from crashing on test cases, and to be stable, these are 3104the requirements: 3105 3106* The state at startup must be *exactly* the same. 3107* The virtual memory setup at startup must be *exactly* the same. 3108 3109The first isn't too hard; it's the second that is difficult. 3110 3111`bc` allocates a lot of memory at start. ("A lot" is relative; it's far less 3112than most programs.) After going through an execution run, however, some of that 3113memory, while it could be cleared and reset, is in different places because of 3114vectors. Since vectors reallocate, their allocations are not guaranteed to be in 3115the same place. 3116 3117So to make all three work, I had to set up the deferred initialization and 3118persistent mode *before* any memory was allocated (except for `vm.jmp_bufs`, 3119which is probably what caused the stability to drop below 100%). However, using 3120deferred alone let me put the [AFL++][125] initialization further back. This 3121works because [AFL++][125] sets up a `fork()` server that `fork()`'s `bc` right 3122at that call. Thus, every run has the exact same virtual memory setup, and each 3123run can skip all of the setup code. 3124 3125I tested `bc` using [AFL++][125]'s deferred initialization, plus persistent 3126mode, plus shared memory fuzzing. In order to do it safely, with stability above 312799%, all of that was actually *slower* than using just deferred initialization 3128with the initialization *right before* `stdin` was read. And as a bonus, the 3129stability in that situation is 100%. 3130 3131As a result, my [AFL++][125] setup only uses deferred initialization. That's the 3132`__AFL_INIT()` call. 3133 3134(Note: there is one more big item that must be done in order to have 100% 3135stability: the pseudo-random number generator *must* start with *exactly* the 3136same seed for every run. This is set up with the `tmux` and `tmuxp` configs that 3137I talk about below in [Convenience][130]. This seed is set before the 3138`__AFL_INIT()` call, so setting it has no runtime cost for each run, but without 3139it, stability would be abysmal.) 3140 3141On top of that, while `dc` is plenty fast under fuzzing (because of a faster 3142parser and less test cases), `bc` can be slow. So I have split the `bc` input 3143corpus into three parts, and I set fuzzers to run on each individually. This 3144means that they will duplicate work, but they will also find more stuff. 3145 3146On top of all of that, each input corpus (the three `bc` corpuses and the one 3147`dc` corpus) is set to run with 4 fuzzers. That works out perfectly for two 3148reasons: first, my machine has 16 cores, and second, the [AFL++][125] docs 3149recommend 4 parallel fuzzers, at least, to run different "power schedules." 3150 3151### Convenience 3152 3153The preprepared input corpuses are contained in the 3154`tests/fuzzing/bc_inputs{1,2,3}/`, and `tests/fuzzing/dc_inputs` directories. 3155There are three `bc` directories and only one `dc` directory because `bc`'s 3156input corpuses are about three times as large, and `bc` is a larger program; 3157it's going to need much more fuzzing. 3158 3159(They do share code though, so fuzzing all of them still tests a lot of the same 3160math code.) 3161 3162The next feature of convenience is the [`scripts/fuzz_prep.sh`][119] script. It 3163assumes the existence of `afl-clang-lto` in the `$PATH`, but if that exists, it 3164automatically configures and builds `bc` with a fuzz-ideal build. 3165 3166A fuzz-ideal build has several things: 3167 3168* `afl-clang-lto` as the compiler. (See [AFL++ Quickstart][129].) 3169* Debug mode, to crash as easily as possible. 3170* Full optimization (including [Link-Time Optimization][126]), for performance. 3171* [AFL++][125]'s deferred initialization (see [Fuzzing Performance][127] above). 3172* And `AFL_HARDEN=1` during the build to harden the build. See the [AFL++][125] 3173 documentation for more information. 3174 3175There is one big thing that a fuzz-ideal build does *not* have: it does not use 3176[AFL++][125]'s `libdislocator.so`. This is because `libdislocator.so` crashes if 3177it fails to allocate memory. I do not want to consider those as crashes because 3178my `bc` does, in fact, handle them gracefully by exiting with a set error code. 3179So `libdislocator.so` is not an option. 3180 3181However, to add to [`scripts/fuzz_prep.sh`][119] making a fuzz-ideal build, in 3182`tests/fuzzing/`, there are two `yaml` files: [`tests/fuzzing/bc_afl.yaml`][120] 3183and [`tests/fuzzing/bc_afl_continue.yaml`][121]. These files are meant to be 3184used with [`tmux`][122] and [`tmuxp`][123]. While other programmers will have to 3185adjust the `start_directory` item, once it is adjusted, then using this command: 3186 3187``` 3188tmuxp load tests/fuzzing/bc_afl.yaml 3189``` 3190 3191will start fuzzing. 3192 3193In other words, to start fuzzing, the sequence is: 3194 3195``` 3196./scripts/fuzz_prep.sh 3197tmuxp load tests/fuzzing/bc_afl.yaml 3198``` 3199 3200Doing that will load, in `tmux`, 16 separate instances of [AFL++][125], 12 on 3201`bc` and 4 on `dc`. The outputs will be put into the 3202`tests/fuzzing/bc_outputs{1,2,3}/` and `tests/fuzzing/dc_outputs/` directories. 3203 3204(Note that loading that config will also delete all unsaved [AFL++][125] output 3205from the output directories.) 3206 3207Sometimes, [AFL++][125] will report crashes when there are none. When crashes 3208are reported, I always run the following command: 3209 3210``` 3211./scripts/afl.py <dir> 3212``` 3213 3214where `dir` is one of `bc1`, `bc2`, `bc3`, or `dc`, depending on which of the 321516 instances reported the crash. If it was one of the first four (`bc11` through 3216`bc14`), I use `bc1`. If it was one of the second four (`bc21` through `bc24`, I 3217use `bc2`. If it was one of the third four (`bc31` through `bc34`, I use `bc3`. 3218And if it was `dc`, I use `dc`. 3219 3220The [`scripts/afl.py`][94] script will report whether [AFL++][125] correctly 3221reported a crash or not. If so, it will copy the crashing test case to 3222`.test.txt` and tell you whether it was from running it as a file or through 3223`stdin`. 3224 3225From there, I personally always investigate the crash and fix it. Then, when the 3226crash is fixed, I either move `.test.txt` to `tests/bc/errors/<idx>.txt` as an 3227error test (if it produces an error) or I create a new `tests/bc/misc<idx>.txt` 3228test for it and a corresponding results file. (See [Test Suite][124] for more 3229information about the test suite.) In either case, `<idx>` is the next number 3230for a file in that particular place. For example, if the last file in 3231`tests/bc/errors/` is `tests/bc/errors/18.txt`, I move `.test.txt` to 3232`tests/bc/error/19.txt`. 3233 3234Then I immediately run [`scripts/afl.py`][94] again to find the next crash 3235because often, [AFL++][125] found multiple test cases that trigger the same 3236crash. If it finds another, I repeat the process until it is happy. 3237 3238Once it *is* happy, I do the same `fuzz_prep.sh`, `tmuxp load` sequence and 3239restart fuzzing. Why do I restart instead of continuing? Because with the 3240changes, the test outputs could be stale and invalid. 3241 3242However, there *is* a case where I continue: if [`scripts/afl.py`][94] finds 3243that every crash reported by [AFL++][125] is invalid. If that's the case, I can 3244just continue with the command: 3245 3246``` 3247tmuxp load tests/fuzzing/bc_afl_continue.yaml 3248``` 3249 3250(Note: I admit that I usually run [`scripts/afl.py`][94] while the fuzzer is 3251still running, so often, I don't find a need to continue since there was no 3252stop. However, the capability is there, if needed.) 3253 3254In addition, my fuzzing setup, including the `tmux` and `tmuxp` configs, 3255automatically set up [AFL++][125] power schedules (see [Fuzzing 3256Performance][127] above). They also set up the parallel fuzzing such that there 3257is one fuzzer in each group of 4 that does deterministic fuzzing. It's always 3258the first one in each group. 3259 3260For more information about deterministic fuzzing, see the [AFL++][125] 3261documentation. 3262 3263### Corpuses 3264 3265I occasionally add to the input corpuses. These files come from new files in the 3266[Test Suite][124]. In fact, I use soft links when the files are the same. 3267 3268However, when I add new files to an input corpus, I sometimes reduce the size of 3269the file by removing some redundancies. 3270 3271And then, when adding to the `bc` corpuses, I try to add them evenly so that 3272each corpus will take about the same amount of time to get to a finished state. 3273 3274### [AFL++][125] Quickstart 3275 3276The way [AFL++][125] works is complicated. 3277 3278First, it is the one to invoke the compiler. It leverages the compiler to add 3279code to the binary to help it know when certain branches are taken. 3280 3281Then, when fuzzing, it uses that branch information to generate information 3282about the "path" that was taken through the binary. 3283 3284I don't know what AFL++ counts as a new path, but each new path is added to an 3285output corpus, and it is later used as a springboard to find new paths. 3286 3287This is what makes AFL++ so effective: it's not just blindly thrashing a binary; 3288it adapts to the binary by leveraging information about paths. 3289 3290### Fuzzing Runs 3291 3292For doing a fuzzing run, I expect about a week or two where my computer is 3293basically unusable, except for text editing and light web browsing. 3294 3295Yes, it can take two weeks for me to do a full fuzzing run, and that does *not* 3296include the time needed to find and fix crashes; it only counts the time on the 3297*last* run, the one that does not find any crashes. This means that the entire 3298process can take a month or more. 3299 3300What I use as an indicator that the fuzzing run is good enough is when the 3301number of "Pending" paths (see [AFL++ Quickstart][129] above) for all fuzzer 3302instances, except maybe the deterministic instances, is below 50. And even then, 3303I try to let deterministic instances get that far as well. 3304 3305You can see how many pending paths are left in the "path geometry" section of 3306the [AFL++][125] dashboard. 3307 3308Also, to make [AFL++][125] quit, you need to send it a `SIGINT`, either with 3309`Ctrl+c` or some other method. It will not quit until you tell it to. 3310 3311### Radamsa 3312 3313I rarely use [Radamsa][99] instead of [AFL++][125]. In fact, it's only happened 3314once. 3315 3316The reason I use [Radamsa][99] instead of [AFL++][125] is because it is easier 3317to use with varying command-line arguments, which was needed for testing `bc`'s 3318command-line expression parsing code, and [AFL++][125] is best when testing 3319input from `stdin`. 3320 3321[`scripts/radamsa.sh`][100] does also do fuzzing on the [AFL++][125] inputs, but 3322it's not as effective at that, so I don't really use it for that either. 3323 3324[`scripts/radamsa.sh`][100] and [Radamsa][99] were only really used once; I have 3325not had to touch the command-line expression parsing code since. 3326 3327### [AddressSanitizer][21] with Fuzzing 3328 3329One advantage of using [AFL++][125] is that it saves every test case that 3330generated a new path (see [AFL++ Quickstart][129] above), and it doesn't delete 3331them when the user makes it quit. 3332 3333Keeping them around is not a good idea, for several reasons: 3334 3335* They are frequently large. 3336* There are a lot of them. 3337* They go stale; after `bc` is changed, the generated paths may not be valid 3338 anymore. 3339 3340However, before they are deleted, they can definitely be leveraged for even 3341*more* bug squashing by running *all* of the paths through a build of `bc` with 3342[AddressSanitizer][21]. 3343 3344This can easily be done with these four commands: 3345 3346``` 3347./scripts/fuzz_prep.sh -a 3348./scripts/afl.py --asan bc1 3349./scripts/afl.py --asan bc2 3350./scripts/afl.py --asan bc3 3351./scripts/afl.py --asan dc 3352``` 3353 3354(By the way, the last four commands could be run in separate terminals to do the 3355processing in parallel.) 3356 3357These commands build an [ASan][21]-enabled build of `bc` and `dc` and then they 3358run `bc` and `dc` on all of the found crashes and path output corpuses. This is 3359to check that no path or crash has found any memory errors, including memory 3360leaks. 3361 3362Because the output corpuses can contain test cases that generate infinite loops 3363in `bc` or `dc`, [`scripts/afl.py`][94] has a timeout of 8 seconds, which is far 3364greater than the timeout that [AFL++][125] uses and should be enough to catch 3365any crash. 3366 3367If [AFL++][125] fails to find crashes *and* [ASan][21] fails to find memory 3368errors on the outputs of [AFL++][125], that is an excellent indicator of very 3369few bugs in `bc`, and a release can be made with confidence. 3370 3371## Code Concepts 3372 3373This section is about concepts that, if understood, will make it easier to 3374understand the code as it is written. 3375 3376The concepts in this section are not found in a single source file, but they are 3377littered throughout the code. That's why I am writing them all down in a single 3378place. 3379 3380### POSIX Mode 3381 3382POSIX mode is `bc`-only. 3383 3384In fact, POSIX mode is two different modes: Standard Mode and Warning Mode. 3385These modes are designed to help users write POSIX-compatible `bc` scripts. 3386 3387#### Standard Mode 3388 3389Standard Mode is activated with the `-s` or `--standard` flags. 3390 3391In this mode, `bc` will error if any constructs are used that are not strictly 3392compatible with the [POSIX `bc` specification][2]. 3393 3394#### Warning Mode 3395 3396Warning Mode is activated with the `-w` or `--warn` flags. 3397 3398In this mode, `bc` will issue warnings, but continue, if any constructs are used 3399that are not strictly compatible with the [POSIX `bc` specification][2]. 3400 3401### Memory Management 3402 3403The memory management in `bc` is simple: everything is owned by one thing. 3404 3405If something is in a vector, it is owned by that vector. 3406 3407If something is contained in a struct, it is owned by that struct with one 3408exception: structs can be given pointers to other things, but only if those 3409other things will outlast the struct itself. 3410 3411As an example, the `BcParse` struct has a pointer to the one `BcProgram` in 3412`bc`. This is okay because the program is initialized first and deallocated 3413last. 3414 3415In other words, it's simple: if a field in a struct is a pointer, then unless 3416that pointer is directly allocated by the struct (like the vector array or the 3417number limb array), that struct does not own the item at that pointer. 3418Otherwise, the struct *does* own the item. 3419 3420### [Async-Signal-Safe][115] Signal Handling 3421 3422`bc` is not the typical Unix utility. Most Unix utilities are I/O bound, but 3423`bc` is, by and large, CPU-bound. This has several consequences, but the biggest 3424is that there is no easy way to allow signals to interrupt it. 3425 3426This consequence is not obvious, but it comes from the fact that a lot of I/O 3427operations can be interrupted and return [`EINTR`][198]. This makes such I/O 3428calls natural places for allowing signals to interrupt execution, even when the 3429signal comes during execution, and not interrupting I/O calls. The way this is 3430done is setting a flag in the signal handler, which is checked around the time 3431of the I/O call, when it is convenient. 3432 3433Alternatively, I/O bound programs can use the [self-pipe trick][199]. 3434 3435Neither of these are possible in `bc` because the execution of math code can 3436take a long time. If a signal arrives during this long execution time, setting a 3437flag like an I/O bound application and waiting until the next I/O call could 3438take seconds, minutes, hours, or even days. (Last I checked, my `bc` takes a 3439week to calculate a million digits of pi, and it's not slow as far as `bc` 3440implementations go.) 3441 3442Thus, using just the technique of setting the flag just will not work for an 3443interactive calculator. 3444 3445Well, it can, but it requires a lot of code and massive inefficiencies. I know 3446this because that was the original implementation. 3447 3448The original implementation set a flag and just exit the signal handler. Then, 3449on just about every loop header, I have a check for the signal flag. These 3450checks happened on every iteration of every loop. It was a massive waste because 3451it was polling, and [polling is evil][200]. 3452 3453So for version [3.0.0][32], I expended a lot of effort to change the 3454implementation. 3455 3456In the new system, code *outside* the signal handler sets a flag (`vm.sig_lock`) 3457to tell the signal handler whether it can use `longjmp()` to stop the current 3458execution. If so, it does. If not, it sets a flag, which then is used by the 3459code outside the signal handler that set the `vm.sig_lock` flag. When that code 3460unsets `vm.sig_lock`, it checks to see if a signal happened, and if so, that 3461code executes the `longjmp()` and stops the current execution. 3462 3463Other than that, the rest of the interrupt-based implementation is best 3464described in the [Error Handling][97]. 3465 3466However, there are rules for signal handlers that I must lay out. 3467 3468First, signal handlers can only call [async-signal-safe][115] functions. 3469 3470Second, any field set or read by both the signal handler and normal code must be 3471a `volatile sig_atomic_t`. 3472 3473Third, when setting such fields, they must be set to constants and no math can 3474be done on them. This restriction and the above restriction exist in order to 3475ensure that the setting of the fields is always atomic with respect to signals. 3476 3477These rules exist for *any* code using Unix signal handlers, not just `bc`. 3478 3479#### Vectors and Numbers 3480 3481Vectors and numbers needed special consideration with the interrupt-based signal 3482handling. 3483 3484When vectors and numbers are about to allocate, or *reallocate* their arrays, 3485they need to lock signals to ensure that they do not call `malloc()` and friends 3486and get interrupted by a signal because, as you will see in the [Error 3487Handling][97] section, `longjmp()` cannot be used in a signal handler if it may 3488be able to interrupt a non-[async-signal-safe][115] function like `malloc()` and 3489friends. 3490 3491### Asserts 3492 3493If you asked me what procedure is used the most in `bc`, I would reply without 3494hesitation, "`assert()`." 3495 3496I use `assert()` everywhere. In fact, it is what made [fuzzing][82] with 3497[AFL++][125] so effective. [AFL++][125] is incredibly good at finding crashes, 3498and a failing `assert()` counts as one. 3499 3500So while a lot of bad bugs might have corrupted data and *not* caused crashes, 3501because I put in so many `assert()`'s, they were *turned into* crashing bugs, 3502and [AFL++][125] found them. 3503 3504By far, the most bugs it found this way was in the `bc` parser. (See the [`bc` 3505Parsing][110] for more information.) And even though I was careful to put 3506`assert()`'s everywhere, most parser bugs manifested during execution of 3507bytecode because the virtual machine assumes the bytecode is valid. 3508 3509Sidenote: one of those bugs caused an infinite recursion when running the sine 3510(`s()`) function in the math library, so yes, parser bugs can be *very* weird. 3511 3512Anyway, they way I did `assert()`'s was like this: whenever I realized that I 3513had put assumptions into the code, I would put an `assert()` there to test it 3514**and** to *document* it. 3515 3516Yes, documentation. In fact, by far the best documentation of the code in `bc` 3517is actually the `assert()`'s. The only time I would not put an `assert()` to 3518test an assumption is if that assumption was already tested by an `assert()` 3519earlier. 3520 3521As an example, if a function calls another function and passes a pointer that 3522the caller previously `assert()`'ed was *not* `NULL`, then the callee does not 3523have to `assert()` it too, unless *also* called by another function that does 3524not `assert()` that. 3525 3526At first glance, it may seem like putting asserts for pointers being non-`NULL` 3527everywhere would actually be good, but unfortunately, not for fuzzing. Each 3528`assert()` is a branch, and [AFL++][125] rates its own effectiveness based on 3529how many branches it covers. If there are too many `assert()`'s, it may think 3530that it is not being effective and that more fuzzing is needed. 3531 3532This means that `assert()`'s show up most often in two places: function 3533preconditions and function postconditions. 3534 3535Function preconditions are `assert()`'s that test conditions relating to the 3536arguments a function was given. They appear at the top of the function, usually 3537before anything else (except maybe initializing a local variable). 3538 3539Function postconditions are `assert()`'s that test the return values or other 3540conditions when a function exits. These are at the bottom of a function or just 3541before a `return` statement. 3542 3543The other `assert()`'s cover various miscellaneous assumptions. 3544 3545If you change the code, I ***HIGHLY*** suggest that you use `assert()`'s to 3546document your assumptions. And don't remove them when [AFL++][125] gleefully 3547crashes `bc` and `dc` over and over again. 3548 3549### Vectors 3550 3551In `bc`, vectors mean resizable arrays, and they are the most fundamental piece 3552of code in the entire codebase. 3553 3554I had previously written a [vector implementation][112], which I used to guide 3555my decisions, but I wrote a new one so that `bc` would not have a dependency. I 3556also didn't make it as sophisticated; the one in `bc` is very simple. 3557 3558Vectors store some information about the type that they hold: 3559 3560* The size (as returned by `sizeof`). 3561* An enum designating the destructor. 3562 3563If the destructor is `BC_DTOR_NONE`, it is counted as the type not having a 3564destructor. 3565 3566But by storing the size, the vector can then allocate `size * cap` bytes, where 3567`cap` is the capacity. Then, when growing the vector, the `cap` is doubled again 3568and again until it is bigger than the requested size. 3569 3570But to store items, or to push items, or even to return items, the vector has to 3571figure out where they are, since to it, the array just looks like an array of 3572bytes. 3573 3574It does this by calculating a pointer to the underlying type with 3575`v + (i * size)`, where `v` is the array of bytes, `i` is the index of the 3576desired element, and `size` is the size of the underlying type. 3577 3578Doing that, vectors can avoid undefined behavior (because `char` pointers can 3579be cast to any other pointer type), while calculating the exact position of 3580every element. 3581 3582Because it can do that, it can figure out where to push new elements by 3583calculating `v + (len * size)`, where `len` is the number of items actually in 3584the vector. 3585 3586By the way, `len` is different from `cap`. While `cap` is the amount of storage 3587*available*, `len` is the number of actual elements in the vector at the present 3588point in time. 3589 3590Growing the vector happens when `len` is equal to `cap` *before* pushing new 3591items, not after. 3592 3593To add a destructor, you need to add an enum item to `BcDtorType` in 3594[`include/vector.h`][174] and add the actual destructor in the same place as the 3595enum item in the `bc_vec_dtors[]` array in [`src/data.c`][131]. 3596 3597#### Pointer Invalidation 3598 3599There is one big danger with the vectors as currently implemented: pointer 3600invalidation. 3601 3602If a piece of code receives a pointer from a vector, then adds an item to the 3603vector before they finish using the pointer, that code must then update the 3604pointer from the vector again. 3605 3606This is because any pointer inside the vector is calculated based off of the 3607array in the vector, and when the vector grows, it can `realloc()` the array, 3608which may move it in memory. If that is done, any pointer returned by 3609`bc_vec_item()`, `bc_vec_top()` and `bc_vec_item_rev()` will be invalid. 3610 3611This fact was the single most common cause of crashes in the early days of this 3612`bc`; wherever I have put a comment about pointers becoming invalidated and 3613updating them with another call to `bc_vec_item()` and friends, *do **NOT** 3614remove that code!* 3615 3616#### Maps 3617 3618Maps in `bc` are...not. 3619 3620They are really a combination of two vectors. Those combinations are easily 3621recognized in the source because one vector is named `<name>s` (plural), and the 3622other is named `<name>_map`. 3623 3624There are currently three, all in `BcProgram`: 3625 3626* `fns` and `fn_map` (`bc` functions). 3627* `vars` and `var_map` (variables). 3628* `arrs` and `arr_map` (arrays). 3629 3630They work like this: the `<name>_map` vector holds `BcId`'s, which just holds a 3631string and an index. The string is the name of the item, and the index is the 3632index of that item in the `<name>s` vector. 3633 3634Obviously, I could have just done a linear search for items in the `<name>s` 3635vector, but that would be slow with a lot of functions/variables/arrays. 3636Instead, I ensure that whenever an item is inserted into the `<name>_map` 3637vector, the item is inserted in sorted order. This means that the `<name>_map` 3638is always sorted (by the names of the items). 3639 3640So when looking up an item in the "map", what is really done is this: 3641 36421. A binary search is carried out on the names in the `<name>_map` vector. 36432. When one is found, it returns the index in the `<name>_map` vector where the 3644 item was found. 36453. This index is then used to retrieve the `BcId`. 36464. The index from the `BcId` is then used to index into the `<name>s` vector, 3647 which returns the *actual* desired item. 3648 3649Why were the `<name>s` and `<name>_map` vectors not combined for ease? The 3650answer is that sometime, when attempting to insert into the "map", code might 3651find that something is already there. For example, a function with that name may 3652already exist, or the variable might already exist. 3653 3654If the insert fails, then the name already exists, and the inserting code can 3655forego creating a new item to put into the vector. However, if there is no item, 3656the inserting code must create a new item and insert it. 3657 3658If the two vectors were combined together, it would not be possible to separate 3659the steps such that creating a new item could be avoided if it already exists. 3660 3661#### Slabs and Slab Vectors 3662 3663`bc` allocates *a lot* of small strings, and small allocations are the toughest 3664for general-purpose allocators to handle efficiently. 3665 3666Because of that reason, I decided to create a system for allocating small 3667strings using something that I call a "slab vector" after [slab 3668allocators][201]. 3669 3670These vectors allocate what I call "slabs," which are just an allocation of a 3671single page with a length to tell the slab how much of the slab is used. 3672 3673The vector itself holds slabs, and when the slab vector is asked to allocate a 3674string, it attempts to in the last slab. If that slab cannot do so, it allocates 3675a new slab and allocates from that. 3676 3677There is one exception: if a string is going to be bigger than 128 bytes, then 3678the string is directly allocated, and a slab is created with that pointer and a 3679length of `SIZE_MAX`, which tells the slab vector that it is a direct 3680allocation. Then, the last slab is pushed into the next spot and the new special 3681slab is put into the vacated spot. This ensures that a non-special slab is 3682always last. 3683 3684### Command-Line History 3685 3686When I first wrote `bc`, I immediately started using it in order to eat my own 3687dog food. 3688 3689It sucked, and the biggest reason why was because of the lack of command-line 3690history. 3691 3692At first, I just dealt with it, not knowing how command-line history might be 3693implemented. 3694 3695Eventually, I caved and attempted to adapt [`linenoise-mob`][28], which I had 3696known about for some time. 3697 3698It turned out to be easier than I thought; the hardest part was the tedious 3699renaming of everything to fit the `bc` naming scheme. 3700 3701Understanding command-line history in `bc` is really about understanding VT-100 3702escape codes, so I would start there. 3703 3704Now, the history implementation of `bc` has been adapted far beyond that initial 3705adaptation to make the command-line history implementation perfect for `bc` 3706alone, including integrating it into `bc`'s [Custom I/O][114] and making sure 3707that it does not disturb output that did not end with a newline. 3708 3709On top of that, at one point, I attempted to get history to work on Windows. It 3710barely worked after a lot of work and a lot of portability code, but even with 3711all of that, it does not have at least one feature: multi-line pasting from the 3712clipboard. 3713 3714### Error Handling 3715 3716The error handling on `bc` got an overhaul for version [`3.0.0`][32], and it 3717became one of the things that taught me the most about C in particular and 3718programming in general. 3719 3720Before then, error handling was manual. Almost all functions returned a 3721`BcStatus` indicating if an error had occurred. This led to a proliferation of 3722lines like: 3723 3724``` 3725if (BC_ERR(s)) return s; 3726``` 3727 3728In fact, a quick and dirty count of such lines in version `2.7.2` (the last 3729version before [`3.0.0`][32] turned up 252 occurrences of that sort of line. 3730 3731And that didn't even guarantee that return values were checked *everywhere*. 3732 3733But before I can continue, let me back up a bit. 3734 3735From the beginning, I decided that I would not do what GNU `bc` does on errors; 3736it tries to find a point at which it can recover. Instead, I decided that I 3737would have `bc` reset to a clean slate, which I believed, would reduce the 3738number of bugs where an unclean state caused errors with continuing execution. 3739 3740So from the beginning, errors would essentially unwind the stack until they got 3741to a safe place from which to clean the slate, reset, and ask for more input. 3742 3743Well, if that weren't enough, `bc` also has to handle [POSIX signals][113]. As 3744such, it had a signal handler that set a flag. But it could not safely interrupt 3745execution, so that's all it could do. 3746 3747In order to actually respond to the signal, I had to litter checks for the flag 3748*everywhere* in the code. And I mean *everywhere*. They had to be checked on 3749every iteration of *every* loop. They had to be checked going into and out of 3750certain functions. 3751 3752It was a mess. 3753 3754But fortunately for me, signals did the same thing that errors did: they unwound 3755the stack to the *same* place. 3756 3757Do you see where I am going with this? 3758 3759It turns out that what I needed was a [async-signal-safe][115] form of what 3760programmers call "exceptions" in other languages. 3761 3762I knew that [`setjmp()`][116] and [`longjmp()`][117] are used in C to implement 3763exceptions, so I thought I would learn how to use them. How hard could it be? 3764 3765Quite hard, it turns out, especially in the presence of signals. And that's 3766because there are a few big snares: 3767 37681. The value of any local variables are not guaranteed to be preserved after a 3769 `longjmp()` back into a function. 37702. While `longjmp()` is required to be [async-signal-safe][115], if it is 3771 invoked by a signal handler that interrupted a non-[async-signal-safe][115] 3772 function, then the behavior is undefined. 3773 3774Oh boy. 3775 3776For number 1, the answer to this is to hide data that must stay changed behind 3777pointers. Only the *pointers* are considered local, so as long as I didn't do 3778any modifying pointer arithmetic, pointers and their data would be safe. For 3779cases where I have local data that must change and stay changed, I needed to 3780*undo* the `setjmp()`, do the change, and the *redo* the `setjmp()`. 3781 3782For number 2, `bc` needs some way to tell the signal handler that it cannot do a 3783`longjmp()`. This is done by "locking" signals with a `volatile sig_atomic_t`. 3784(For more information, see the [Async-Signal-Safe Signal Handling][173] 3785section.) For every function that calls a function that is not 3786async-signal-safe, they first need to use `BC_SIG_LOCK` to lock signals, and 3787afterward, use `BC_SIG_UNLOCK` to unlock signals. 3788 3789`BC_SIG_UNLOCK` has another requirement: it must check for signals or errors and 3790jump if necessary. 3791 3792On top of all of that, *all* functions with cleanup needed to be able to run 3793their cleanup. This meant that `longjmp()` could not just jump to the finish; it 3794had to start what I call a "jump series," using a stack of `jmp_buf`'s 3795(`jmp_bufs` in `BcVm`). Each `longjmp()` uses the top of the `jmp_bufs` stack to 3796execute its jump. Then, if the cleanup code was executed because of a jump, the 3797cleanup code was responsible for continuing the jump series by popping the 3798previous item off the stack and using the new top of the stack for a jump. 3799 3800In this way, C++-style exceptions were implemented in pure C. Not fun, but it 3801works. However, the order of operations matters, especially in the macros that 3802help implement the error handling. 3803 3804For example, in `BC_UNSETJMP`, signals are unlocked before checking for signals. 3805If a signal comes between, that's fine; it will still cause a jump to the right 3806place. However, disabling the lock after could mean that a signal could come 3807*after* checking for signals, but before signals were unlocked, causing the 3808handling of the signal to be delayed. 3809 3810#### Custom I/O 3811 3812Why did I implement my own buffered I/O for `bc`? Because I use `setjmp()` and 3813`longjmp()` for error handling (see the [Error Handling][97] section), and the 3814buffered I/O in `libc` does not interact well with the use of those procedures; 3815all of the buffered I/O API is basically non-[async-signal-safe][115]. 3816 3817Implementing custom buffered I/O had other benefits. First, it allowed me to 3818tightly integrate history with the I/O code. Second, it allowed me to make 3819changes to history in order to make it adapt to user prompts. 3820 3821### Lexing 3822 3823To simplify parsing, both calculators use lexers to turn the text into a more 3824easily-parsable form. 3825 3826While some tokens are only one character long, others require many tokens, and 3827some of those need to store all of the text corresponding to the token for use 3828by the parsers. Tokens that need to store their corresponding text include, but 3829are not limited to: 3830 3831* Strings. 3832* Numbers. 3833* Identifiers. 3834 3835For this purpose, the lexer has a [vector][111] named `str` to store the data 3836for tokens. This data is overwritten if another token is lexed that needs to 3837store data, so the parsers need to copy the data before calling the lexer again. 3838 3839Both lexers do some of the same things: 3840 3841* Lex identifiers into tokens, storing the identifier in `str`. 3842* Lex number strings into tokens, storing the string in `str`. 3843* Lex whitespace. 3844* Lex comments. 3845 3846Other than that, and some common plumbing, the lexers have separate code. 3847 3848#### `dc` Lexing 3849 3850The `dc` lexer is remarkably simple; in fact, besides [`src/main.c`][205], 3851[`src/bc.c`][40], and [`src/dc.c`][44], which just contain one function each, 3852the only file smaller that [`src/dc_lex.c`][45] is [`src/args.c`][206], which 3853just processes command-line arguments after they are parsed by 3854[`src/opt.c`][51]. 3855 3856For most characters, the `dc` lexer is able to convert directly from the 3857character to its corresponding token. This happens using `dc_lex_tokens[]` in 3858[`src/data.c`][131]. 3859 3860`dc`'s lexer also has to lex the register name after lexing tokens for commands 3861that need registers. 3862 3863And finally, `dc`'s lexer needs to parse `dc` strings, which is the only part of 3864the `dc` lexer that is more complex than the `bc` lexer. This is because `dc` 3865strings need to have a balanced number of brackets. 3866 3867#### `bc` Lexing 3868 3869The `bc` lexer is fairly simple. It does the following things: 3870 3871* Lexes `bc` strings. 3872* Lexes `bc` identifiers. This is necessary because this is how `bc` keywords 3873 are lexed. After ensuring that an identifier is not a keyword, the `bc` lexer 3874 allows the common identifier function to take over. 3875* Turns characters and groups of characters into `bc` operator tokens. 3876 3877### Parsing 3878 3879The difference between parsing `bc` and `dc` code is...vast. The `dc` parser is 3880simple, while the `bc` parser is the most complex piece of code in the entire 3881codebase. 3882 3883However, they both do some of the same things. 3884 3885First, the parsers do *not* use [abstract syntax trees][207]; instead, they 3886directly generate the bytecode that will be executed by the `BcProgram` code. 3887Even in the case of `bc`, this heavily simplifies the parsing because the 3888[Shunting-Yard Algorithm][109] is designed to generate [Reverse Polish 3889Notation][108], which is basically directly executable. 3890 3891Second, any extra data that the `BcProgram` needs for execution is stored into 3892functions (see the [Functions][208] section). These include constants and 3893strings. 3894 3895#### `dc` Parsing 3896 3897The parser for `dc`, like its lexer, is remarkably simple. In fact, the easiness 3898of lexing and parsing [Reverse Polish notation][108] is probably why it was used 3899for `dc` when it was first created at Bell Labs. 3900 3901For most tokens, the `dc` parser is able to convert directly from the token 3902to its corresponding instruction. This happens using `dc_parse_insts[]` in 3903[`src/data.c`][131]. 3904 3905`dc`'s parser also has to parse the register name for commands that need 3906registers. This is the most complex part of the `dc` parser; each different 3907register command needs to be parsed differently because most of them require two 3908or more instructions to execute properly. 3909 3910For example, storing in a register requires a swap instruction and an assignment 3911instruction. 3912 3913Another example are conditional execution instructions; they need to produce the 3914instruction for the condition, and then they must parse a possible "else" part, 3915which might not exist. 3916 3917#### `bc` Parsing 3918 3919`bc`'s parser is, by far, the most sensitive piece of code in this software, and 3920there is a very big reason for that: `bc`'s standard is awful and defined a very 3921poor language. 3922 3923The standard says that either semicolons or newlines can end statements. Trying 3924to parse the end of a statement when it can either be a newline or a semicolon 3925is subtle. Doing it in the presence of control flow constructs that do not have 3926to use braces is even harder. 3927 3928And then comes the biggest complication of all: `bc` has to assume that it is 3929*always* at a REPL (Read-Eval-Print Loop). `bc` is, first and foremost, an 3930*interactive* utility. 3931 3932##### Flags 3933 3934All of this means that `bc` has to be able to partially parse something, store 3935enough data to recreate that state later, and return, making sure to not 3936execute anything in the meantime. 3937 3938*That* is what the flags in [`include/bc.h`][106] are: they are the state that 3939`bc` is saving for itself. 3940 3941It saves them in a stack, by the way, because it's possible to nest 3942structures, just like any other programming language. Thus, not only does it 3943have to store state, it needs to do it arbitrarily, and still be able to 3944come back to it. 3945 3946So `bc` stores its parser state with flags in a stack. Careful setting of these 3947flags, along with properly using them and maintaining the flag stack, are what 3948make `bc` parsing work, but it's complicated. In fact, as I mentioned, the `bc` 3949parser is the single most subtle, fickle, and sensitive piece of code in the 3950entire codebase. Only one thing came close once: square root, and that was only 3951sensitive because I wrote it wrong. This parser is pretty good, and it is 3952*still* sensitive. And flags are the reason why. 3953 3954For more information about what individual flags there are, see the comments in 3955[`include/bc.h`][106]. 3956 3957##### Labels 3958 3959`bc`'s language is Turing-complete. That means that code needs the ability to 3960jump around, specifically to implement control flow like `if` statements and 3961loops. 3962 3963`bc` handles this while parsing with what I called "labels." 3964 3965Labels are markers in the bytecode. They are stored in functions alongside the 3966bytecode, and they are just indices into the bytecode. 3967 3968When the `bc` parser creates a label, it pushes an index onto the labels array, 3969and the index of the label in that array is the index that will be inserted into 3970the bytecode. 3971 3972Then, when a jump happens, the index pulled out of the bytecode is used to index 3973the labels array, and the label (index) at the index is then used to set the 3974instruction pointer. 3975 3976##### Cond Labels 3977 3978"Cond" labels are so-called because they are used by conditionals. 3979 3980The key to them is that they come *before* the code that uses them. In other 3981words, when jumping to a condition, code is jumping *backwards*. 3982 3983This means that when a cond label is created, the value that should go there is 3984well-known. Cond labels are easy. 3985 3986However, they are still stored on a stack so that the parser knows what cond 3987label to use. 3988 3989##### Exit Labels 3990 3991Exit labels are not so easy. 3992 3993"Exit" labels are so-called because they are used by code "exiting" out of `if` 3994statements or loops. 3995 3996The key to them is that they come *after* the code that uses them. In other 3997words, when jumping to an exit, code is jumping *forwards*. 3998 3999But this means that when an exit label is created, the value that should go 4000there is *not* known. The code that needs it must be parsed and generated first. 4001 4002That means that exit labels are created with the index of `SIZE_MAX`, which is 4003then specifically checked for with an assert in `bc_program_exec()` before using 4004those indices. 4005 4006There should ***NEVER*** be a case when an exit label is not filled in properly 4007if the parser has no bugs. This is because every `if` statement, every loop, 4008must have an exit, so the exit must be set. If not, there is a bug. 4009 4010Exit labels are also stored on a stack so that the parser knows what exit label 4011to use. 4012 4013##### Expression Parsing 4014 4015`bc` has expressions like you might expect in a typical programming language. 4016This means [infix notation][107]. 4017 4018One thing about infix notation is that you can't just generate code straight 4019from it like you can with [Reverse Polish notation][108]. It requires more work 4020to shape it into a form that works for execution on a stack machine. 4021 4022That extra work is called the [Shunting-Yard algorithm][109], and the form it 4023translates infix notation into is...[Reverse Polish notation][108]. 4024 4025In order to understand the rest of this section, you must understand the 4026[Shunting-Yard algorithm][109]. Go do that before you read on. 4027 4028###### Operator Stack 4029 4030In `bc`, the [Shunting-Yard algorithm][109] is implemented with bytecode as the 4031output and an explicit operator stack (the `ops` field in `BcParse`) as the 4032operator stack. It stores tokens from `BcLex`. 4033 4034However, there is one **HUGE** hangup: multiple expressions can stack. This 4035means that multiple expressions can be parsed at one time (think an array element 4036expression in the middle of a larger expression). Because of that, we need to 4037keep track of where the previous expression ended. That's what `start` parameter 4038to `bc_parse_operator()` is. 4039 4040Parsing multiple expressions on one operator stack only works because 4041expressions can only *stack*; this means that, if an expression begins before 4042another ends, it must *also* end before that other expression ends. This 4043property ensures that operators will never interfere with each other on the 4044operator stack. 4045 4046###### Recursion 4047 4048Because expressions can stack, parsing expressions actually requires recursion. 4049Well, it doesn't *require* it, but the code is much more readable that way. 4050 4051This recursion is indirect; the functions that `bc_parse_expr_err()` (the actual 4052expression parsing function) calls can, in turn, call it. 4053 4054###### Expression Flags 4055 4056There is one more big thing: not all expressions in `bc` are equal. 4057 4058Some expressions have requirements that others don't have. For example, only 4059array arguments can be arrays (which are technically not expressions, but are 4060treated as such for parsing), and some operators (in POSIX) are not allowed in 4061certain places. 4062 4063For this reason, functions that are part of the expression parsing 4064infrastructure in `bc`'s parser usually take a `flags` argument. This is meant 4065to be passed to children, and somewhere, they will be checked to ensure that the 4066resulting expression meets its requirements. 4067 4068There are also places where the flags are changed. This is because the 4069requirements change. 4070 4071Maintaining the integrity of the requirements flag set is an important part of 4072the `bc` parser. However, they do not have to be stored on a stack because their 4073stack is implicit from the recursion that expression parsing uses. 4074 4075### Functions 4076 4077Functions, in `bc`, are data structures that contain the bytecode and data 4078produced by the parsers. Functions are what the `BcProgram` program executes. 4079 4080#### Main and Read Functions 4081 4082There are two functions that always exist, which I call the "main" and "read" 4083functions. 4084 4085The "main" function is the function in which any code and data outside other 4086functions is put. Basically, it is the function where the scripting code ends 4087up. 4088 4089The "read" function is the function that is reset and parsed every time a call 4090to the `read()` builtin function happens. 4091 4092#### `dc` Strings 4093 4094In `dc`, strings can be executed, and since there are no actual "functions" in 4095`dc`, strings are handled as functions. In fact, they are effectively translated 4096into functions by parsing. 4097 4098##### Tail Calls 4099 4100Since strings in `dc` are functions, and the fact that `dc` has no native loops, 4101such loops are implemented in `dc` code using strings with conditional execution 4102commands at the end of strings. 4103 4104When such conditional execution, or even unconditional execution, commands are 4105the very last commands in a string, then `dc` can perform a [tail call][202]. 4106 4107This is done by recording the fact that a tail call happened, done by 4108incrementing an integer on a stack. When a string is executed *without* a tail 4109call, a new entry is pushed onto the stack with the integer `1`. 4110 4111When a string finally quits that followed tail calls, its stack entry is popped, 4112eliminating all of those tail calls. 4113 4114Why perform tail calls? Because otherwise, `dc` would be subject to the same 4115thing that plagues [functional programming languages][203]: stack overflow. In 4116`dc`'s case, that would manifest itself as a growing [heap][204], because the 4117execution stack is stored on the heap, until a fatal allocation failure would 4118occur. 4119 4120#### Execution 4121 4122Execution is handled by an interpreter implemented using `BcProgram` and code 4123in [`src/program.c`][53]. 4124 4125The interpreter is a mix between a [stack machine][210] and a [register 4126machine][211]. It is a stack machine in that operations happen on a stack I call 4127the "results stack," but it is a register machine in that items on the stack can 4128be stored to and loaded from "registers" (`dc` terminology), variables (`bc` 4129terminology), and arrays. 4130 4131##### Stacks 4132 4133There are two stacks in the interpreter: 4134 4135* The "results" stack (as mentioned above). 4136* The "execution" stack. 4137 4138The results stack (the `results` field of the `BcProgram` struct) is the stack 4139where the results of computations are stored. It is what makes the interpreter 4140part [stack machine][210]. It is filled with `BcResult`'s. 4141 4142The execution stack (the `stack` field of the `BcProgram` struct) is the stack 4143that tracks the current execution state of the interpreter. It is the presence 4144of this separate stack that allows the interpreter to implement the machine as a 4145loop, rather than recursively. It is filled with `BcInstPtr`'s, which are the 4146"instruction pointers." 4147 4148These instruction pointers have three fields, all integers: 4149 4150* `func`, the index of the function that is currently executing. 4151* `idx`, the index of the next bytecode instruction to execute in the function's 4152 bytecode array. 4153* `len`, which is the length of the results stack when the function started 4154 executing. This is not used by `dc`, but it used by `bc` because functions 4155 in `bc` should never affect the results stack of their callers. 4156 4157With these three fields, and always executing using the instruction pointer at 4158the top of the execution stack, the interpreter can always keep track of its 4159execution. 4160 4161When a function or a string starts executing, a new `BcInstPtr` is pushed onto 4162the execution stack for it. This includes if a function was called recursively. 4163And then, when the function or string returns, its `BcInstPtr` is popped off of 4164the execution stack. 4165 4166##### Bytecode 4167 4168Execution of functions are done through bytecode produced directly by the 4169parsers (see the [Parsing][209]). This bytecode is stored in the `code` 4170[vector][111] of the `BcFunc` struct. 4171 4172This is a vector for two reasons: 4173 4174* It makes it easier to add bytecode to the vector in the parsers. 4175* `bc` allows users to redefine functions. 4176 4177The reason I can use bytecode is because there are less than 256 instructions, 4178so an `unsigned char` can store all the bytecodes. 4179 4180###### Bytecode Indices 4181 4182There is one other factor to bytecode: there are instructions that need to 4183reference strings, constants, variables, or arrays. Bytecode need some way to 4184reference those things. 4185 4186Fortunately, all of those things can be referenced in the same way: with indices 4187because all of the items are in vectors. 4188 4189So `bc` has a way of encoding an index into bytecode. It does this by, after 4190pushing the instruction that references anything, pushing a byte set to the 4191length of the index in bytes, then the bytes of the index are pushed in 4192little-endian order. 4193 4194Then, when the interpreter encounters an instruction that needs one or more 4195items, it decodes the index or indices there and updates the `idx` field of the 4196current `BcInstPtr` to point to the byte after the index or indices. 4197 4198One more thing: the encoder of the indices only pushes as many bytes as 4199necessary to encode the index. It stops pushing when the index has no more bytes 4200with any 1 bits. 4201 4202##### Variables 4203 4204In `bc`, the vector of variables, `vars` in `BcProgram`, is not a vector of 4205numbers; it is a vector of vector of numbers. The first vector is the vector of 4206variables, the second is the variable stack, and the last level is the actual 4207number. 4208 4209This is because both `bc` and `dc` need variables to be stacks. 4210 4211For `dc`, registers are *defined* to be stacks. 4212 4213For `bc`, variables as stacks is how function arguments/parameters and function 4214`auto` variables are implemented. 4215 4216When a function is called, and a value needs to be used as a function argument, 4217a copy of the value is pushed onto the stack corresponding to the variable with 4218the same name as the function's parameter. For `auto` variables, a new number 4219set to zero is pushed onto each stack corresponding to the `auto` variables. 4220(Zero is used because the [`bc` spec][2] requires that `auto` variables are set 4221to zero.) 4222 4223It is in this way that the old value of the variable, which may not even be 4224related to the function parameter or `auto` variable, is preserved while the 4225variable is used as a function parameter or `auto` variable. 4226 4227When the function returns, of course, the stacks of the variables for the 4228parameters and `auto`'s will have their top item popped, restoring the old value 4229as it was before the function call. 4230 4231##### Arrays 4232 4233Like variables, arrays are also implemented as stacks. However, because they are 4234arrays, there is yet another level; the `arrs` field in `BcProgram` is a vector 4235of vectors of vectors of numbers. The first of the two levels is the vector of 4236arrays, the second the stack of for each array, the third the actual array, and 4237last the numbers in the array. 4238 4239`dc` has no need of this extra stack, but `bc` does because arrays can be 4240function parameters themselves. 4241 4242When arrays are used for function arguments, they are copied with a deep copy; 4243each item of the source vector is copied. This is because in `bc`, according to 4244the [`bc` spec][2], all function arguments are passed by value. 4245 4246However, array references are possible (see below). 4247 4248When arrays are used as `auto`'s, a new vector is pushed with one element; if 4249more elements are needed, the array is grown automatically, and new elements are 4250given the value of zero. 4251 4252In fact, if *any* array is accessed and does not have an element at that index, 4253the array is automaticall grown to that size, and all new elements are given the 4254value zero. This behavior is guaranteed by the [`bc` spec][2]. 4255 4256###### Array References 4257 4258Array references had to be implemented as vectors themselves because they must 4259be pushed on the vectors stacks, which, as seen above, expect vectors 4260themselves. 4261 4262So thus, references are implemented as vectors on the vector stacks. These 4263vectors are not vectors of vectors themselves; they are vectors of bytes; in 4264fact, the fact that they are byte vectors and not vector vectors is how a 4265reference vector is detected. 4266 4267These reference vectors always have the same two things pushed: a byte encoding 4268(the same way bytecode indices are) of the referenced vector's index in the 4269`arrs` vector, and a byte encoding of the referenced vectors index in the vector 4270stack. 4271 4272If an item in a referenced vector is needed, then the reference is dereferenced, 4273and the item is returned. 4274 4275If a reference vector is passed to a function that does *not* expect a 4276reference, the vector is dereferenced and a deep copy is done, in the same way 4277as vectors are copied for normal array function parameters. 4278 4279### Callbacks 4280 4281There are many places in `bc` and `dc` where function pointers are used: 4282 4283* To implement destructors in vectors. (See the [Vectors][111] section.) 4284* To select the correct lex and parse functions for `bc` and `dc`. 4285* To select the correct function to execute unary operators. 4286* To select the correct function to execute binary operators. 4287* To calculate the correct number size for binary operators. 4288* To print a "digit" of a number. 4289* To seed the pseudo-random number generator. 4290 4291And there might be more. 4292 4293In every case, they are used for reducing the amount of code. Instead of 4294`if`/`else` chains, such as: 4295 4296``` 4297if (BC_IS_BC) { 4298 bc_parse_parse(vm.parse); 4299} 4300else { 4301 dc_parse_parse(vm.parse); 4302} 4303``` 4304 4305The best example of this is `bc_num_binary()`. It is called by every binary 4306operator. It figures out if it needs to allocate space for a new `BcNum`. If so, 4307it allocates the space and then calls the function pointer to the *true* 4308operation. 4309 4310Doing it like that shrunk the code *immensely*. First, instead of every single 4311binary operator duplicating the allocation code, it only exists in one place. 4312Second, `bc_num_binary()` itself does not have a massive `if`/`else` chain or a 4313`switch` statement. 4314 4315But perhaps the most important use was for destructors in vectors. 4316 4317Most of the data structures in `bc` are stored in vectors. If I hadn't made 4318destructors available for vectors, then ensuring that `bc` had no memory leaks 4319would have been nigh impossible. As it is, I check `bc` for memory leaks every 4320release when I change the code, and I have not released `bc` after version 4321`1.0.0` with any memory leaks, as far as I can remember anyway. 4322 4323### Numbers 4324 4325In order to do arbitrary-precision math, as `bc` must do, there must be some way 4326of representing arbitrary-precision numbers. `BcNum` in [`include/num.h`][184] 4327is `bc`'s. 4328 4329(Note: the word ["limb"][214] is used below; it has a specific meaning when 4330applied to arbitrary-precision numbers. It means one piece of the number. It can 4331have a single digit, which is what GNU `bc` does, or it can have multiple, which 4332is what this `bc` does.) 4333 4334This struct needs to store several things: 4335 4336* The array of limbs of the number. This is the `num` field. 4337* The location of the decimal point. This is the `rdx` (short for [radix][215]) 4338 field. 4339* The number of limbs the number has. This is the `len` field. 4340* Whether the number is negative or not. This is the least significant bit of 4341 the `rdx` field. More on that later. 4342 4343In addition, `bc`'s number stores the capacity of the limb array; this is the 4344`cap` field. 4345 4346If the number needs to grow, and the capacity of the number is big enough, the 4347number is not reallocated; the number of limbs is just added to. 4348 4349There is one additional wrinkle: to make the usual operations (binary operators) 4350fast, the decimal point is *not* allowed to be in the middle of a limb; it must 4351always be between limbs, after all limbs (integer) or before all limbs (real 4352between -1 and 1). 4353 4354The reason for this is because addition, subtraction, multiplication, and 4355division expect digits to be lined up on the decimal point. By requiring that it 4356be between limbs, no extra alignment is needed, and those operations can proceed 4357without extra overhead. 4358 4359This does make some operations, most notably extending, truncating, and 4360shifting, more expensive, but the overhead is constant, and these operations are 4361usually cheap compared to the binary operators anyway. 4362 4363This also requires something else: `bc` numbers need to know *exactly* how many 4364decimal places they have after the decimal point. If the decimal point must be 4365inbetween limbs, the last decimal place could be in the middle of a limb. The 4366amount of decimal places in a number is carefully tracked and stored in the 4367`scale` field, and this number must always coincide with the `rdx` field by the 4368following formula: 4369 4370``` 4371scale + (BC_BASE_DIGS - 1) / BC_BASE_DIGS == rdx >> 1 4372``` 4373 4374(`BC_BASE_DIGS` is the number of decimal digits stored in one limb. It is 9 on 437564-bit systems and 4 on other systems.) 4376 4377Yes, `rdx` is shifted; that is because the negative bit is stored in the least 4378significant bit of the `rdx` field, and the actual radix (amount of limbs after 4379the decimal/radix point) is stored in the rest of the bits. This is safe because 4380`BC_BASE_DIGS` is always at least 4, which means `rdx` will always need at least 43812 bits less than `scale`. 4382 4383In addition to `rdx` always matching `scale`, another invariant is that `rdx` 4384must always be less than or equal to `len`. (Because `scale` may be greater than 4385`rdx`, `scale` does not have to be less than or equal to `len`.) 4386 4387Another invariant is that `len` must always be less than or equal to `cap`, for 4388obvious reasons. 4389 4390The last thing programmers need to know is that the limb array is stored in 4391little-endian order. This means that the last decimal places are in the limb 4392stored at index 0, and the most significant digits are stored at index `len-1`. 4393 4394This is done to make the most important operations fast. Addition and 4395subtraction are done from least significant to most significant limbs, which 4396means they can speed through memory in the way most computers are best at. 4397Multiplication does the same, sort of, and with division, it matters less. 4398Comparison does need to go backwards, but that's after exhausting all other 4399alternatives, including for example, checking the length of the integer portion 4400of each limb array. 4401 4402Finally, here are some possible special situations with numbers and what they 4403mean: 4404 4405* `len == 0`: the number equals 0. 4406* `len == 0 && scale != 0`: the number equals 0, but it has a `scale` value. 4407 This is the only case where `scale` does not have to coincide with `rdx` 4408 This can happen with division, for example, that sets a specific `scale` for 4409 the result value but may produce 0. 4410* `(rdx >> 1) < len`: the number is greater than or equal to 1, or less than or 4411 equal to -1. 4412* `(rdx >> 1) == len`: the number is greater than -1 and less than 1, not 4413 including 0, although this will be true for 0 as well. However, 0 is always 4414 assumed to be represented by `len == 0`. 4415* `(rdx >> 1) == 0`: the number is an integer. In this case, `scale` must also 4416 equal 0. 4417 4418#### Math Style 4419 4420When I wrote the math for `bc`, I adopted a certain style that, if known, will 4421make it easier to understand the code. The style follows these rules: 4422 4423* `BcNum` arguments always come before arguments of other types. 4424* Among the `BcNum` arguments, the operands always come first, and the `BcNum` 4425 where the result(s) will be stored come last. 4426* Error checking is placed first in the function. 4427* Easy cases are placed next. 4428* Preparation, such as allocating temporaries, comes next. 4429* The actual math. 4430* Cleanup and ensuring invariants. 4431 4432While these rules are not hard and fast, using them as a guide will probably 4433help. 4434 4435### Strings as Numbers 4436 4437Strings can be assigned to variables. This is a problem because the vectors for 4438variable stacks expect `BcNum` structs only. 4439 4440While I could have made a union, I decided that the complexity of adding an 4441entirely new type, with destructor and everything, was not worth it. Instead, I 4442took advantage of the fact that `free()`, when passed a `NULL` pointer, will do 4443nothing. 4444 4445Using that, I made it so `BcNum`'s could store strings instead. This is marked 4446by the `BcNum` having a `NULL` limb array (`num`) and a `cap` of 0 (which should 4447*never* happen with a real number, though the other fields could be 0). 4448 4449The `BcNum` stores the function that stores the string in the `rdx` field, and 4450it stores the index of the string in the `scale` field. This is used to actually 4451load the string if necessary. 4452 4453Note that historically, string information was stored in the `loc` field of 4454the `d` union in a `BcResult`. This was changed recently to standardize; now, 4455all string information are stored in the `n` field of the `d` union regardless. 4456This means that all string information is stored in `BcNum`'s. This removes 4457extra cases. 4458 4459Also, if a temp is made with a string, then the result type should still be 4460`BC_RESULT_STR`, not `BC_RESULT_TEMP`. This is to make it easier to do type 4461checks. 4462 4463### Pseudo-Random Number Generator 4464 4465In order to understand this section, I suggest you read the information in the 4466manpages about the pseudo-random number generator (PRNG) first; that will help 4467you understand the guarantees it has, which is important because this section 4468delves into implementation details. 4469 4470First, the PRNG I use is seeded; this is because most OS's have an excellent 4471cryptographically secure PRNG available via command-line, usually 4472`/dev/urandom`, but the only *seeded* PRNG available is usually `bash`'s 4473`$RANDOM`, which is essentially a wrapper around C's `rand()`. 4474 4475`rand()` is...bad. It is only guaranteed to return 15 bits of random data. 4476Obviously, getting good random data out of that would be hard with that alone, 4477but implementations also seem to be poor. 4478 4479On top of that, `bc` is an arbitrary-precision calculator; if I made it able to 4480generate random numbers, I could make it generate random numbers of any size, 4481and since it would be seeded, results would be reproducible, when wanted. 4482 4483So to get that, I needed a seeded PRNG with good characteristics. After scouring 4484the Internet, I decided on the [PCG PRNG][215], mostly because of [this blog 4485post][216]. Part of the reason was the behavior of the xoroshiro128+ author, who 4486hates on PCG and its author, but also because PCG seemed to do better when 4487tested by independent parties. 4488 4489After that decision, I faced a challenge: PCG requires 255 bits of seed: 128 for 4490the actual seed, and 127 for the "increment." (Melissa O'Neill, the PCG author, 4491likens the increment to selecting a codebook.) 4492 4493I could, of course, put the entire 255 bits into one massive arbitrary-precision 4494number; `bc` is good at that, after all. But that didn't sit right with me 4495because it would mean any seed selected by users would have the real portion 4496ignored, which is stupid in a program like `bc`. 4497 4498Instead, I decided to make the integer portion the increment (clamped down to 4499size), and the real portion the seed. 4500 4501In most cases, this would be a bad idea because you cannot, in general, know how 4502many decimal places you need to represent any number with `n` real digits in 4503base `b` in another base. However, there is an easy to how many decimal digits 4504after the decimal point it takes to represent reals of base 2 in base 10: the 4505power of two. 4506 4507It turns out that, for base 2 represented in base 10, the power of 2 is 4508*exactly* how many digits are necessary to represent *any* number `n/2^p`, where 4509`p` is the power of 2. This is because at every halving, the number of decimal 4510places increases by 1: 4511 4512``` 45130.5 45140.25 45150.125 45160.0625 45170.03125 45180.015625 4519... 4520``` 4521 4522So the algorithm to convert all 255 bits of the seed is as follows: 4523 45241. Convert the increment to a `BcNum`. 45252. Convert the seed to a `BcNum`. 45263. Divide the seed by `2^128` with a `scale` of 128. (For 32-bit systems, 4527 substitute 64 bits for 128.) 45284. Add the two numbers together. 4529 4530Likewise, the algorithm to convert from a user-supplied number to a seed is: 4531 45321. Truncate a copy of the number. 45332. Subtract the result from #1 from the original number. This gives the real 4534 portion of the number. 45353. Clamp the result of #1 to 127 (or 63) bits. This is the increment. 45364. Multiply the result of #2 by `2^128`. 45375. Truncate the result of #4. This is the seed. 4538 4539#### Generating Arbitrary-Precision Numbers 4540 4541I wrote a function (`bc_rand_bounded()`) that will return unbiased results with 4542any bound below the max that PCG can generate. 4543 4544To generate an integer of arbitrary size using a bound, `bc` simply uses 4545`bc_rand_bounded()` to generate numbers with a bound `10^BC_BASE_DIGS` for as 4546many limbs as needed to satisfy the bigger bound. 4547 4548To generate numbers with arbitrary precision after the decimal point, `bc` 4549merely generates an arbitrary precision integer with the bound `10^p`, where `p` 4550is the desired number of decimal places, then divides in by `10^p` with a 4551`scale` of `p`. 4552 4553## Debug Code 4554 4555Besides building `bc` in debug mode with the `-g` flag to [`configure.sh`][69], 4556programmers can also add `-DBC_DEBUG_CODE=1` to the `CFLAGS`. This will enable 4557the inclusion of *a lot* of extra code to assist with debugging. 4558 4559For more information, see all of the code guarded by `#if BC_DEBUG_CODE` in the 4560[`include/`][212] directory and in the [`src/`][213] directory. 4561 4562Yes, all of the code is guarded by `#if` preprocessor statements; this is 4563because the code should *never* be in a release build, and by making programmers 4564add this manually (not even an option to [`configure.sh`][69], it is easier to 4565ensure that never happens. 4566 4567However, that said, the extra debug code is useful; that was why I kept it in. 4568 4569## Performance 4570 4571While I have put in a lot of effort to make `bc` as fast as possible, there 4572might be some things you can do to speed it up without changing the code. 4573 4574First, you can probably use [profile-guided optimization][217] to optimize even 4575better, using the test suite to profile. 4576 4577Second, I included macros that might help branch placement and prediction: 4578 4579* `BC_ERR(e)` 4580* `BC_UNLIKELY(e)` 4581* `BC_NO_ERR(e)` 4582* `BC_LIKELY(e)` 4583 4584`BC_ERR` is the same as `BC_UNLIKELY`, and `BC_NO_ERR` is the same as 4585`BC_LIKELY`; I just added them to also document branches that lead to error 4586conditions or *away* from error conditions. 4587 4588Anyway, if `BC_LIKELY` and `BC_UNLIKELY` are not defined during compilation, 4589they expand to nothing but the argument they were given. 4590 4591They can, however, be defined to `__builtin_expect((e), 1)` and 4592`__builtin_expect((e), 0)`, respectively, on GCC and Clang for better branch 4593prediction and placement. (For more information about `__builtin_expect()` see 4594the [GCC documentation][218].) 4595 4596There might be other compilers that can take advantage of that, but I don't know 4597anything about that. 4598 4599Also, as stated in the [build manual][219], link-time optimization is excellent 4600at optimizing this `bc`. Use it. 4601 4602### Benchmarks 4603 4604To help programmers improve performance, I have built and assembled 4605infrastructure to make benchmarking easy. 4606 4607First, in order to easily run benchmarks, I created 4608[`scripts/benchmark.sh`][220]. 4609 4610Second, I copied and adapted [`ministat.c`][223] [from FreeBSD][221], to make it 4611easier to judge whether the results are significant or not. 4612 4613Third, I made the `make` clean target `make clean_benchmarks`, to clean 4614`scripts/ministat` and the generated benchmark files. 4615 4616Fourth, I made it so [`scripts/benchmark.sh`][220] outputs the timing and memory 4617data in a format that is easy for `scripts/ministat` to digest. 4618 4619To add a benchmark, add a script in the right directory to generate the 4620benchmark. Yes, generate. 4621 4622All of the benchmarks are generated first, from `.bc` and `.dc` files in the 4623[`benchmarks/bc/`][91] and [`benchmarks/dc/`][224]. This is so that massive 4624amounts of data can be generated and then pushed through the calculators. 4625 4626If you need to benchmark `bc` or `dc` with simple loops, have the generator 4627files simply print the loop code. 4628 4629### Caching of Numbers 4630 4631In order to provide some performance boost, `bc` tries to reuse old `BcNum`'s 4632that have the default capacity (`BC_NUM_DEF_SIZE`). 4633 4634It does this by allowing `bc_num_free()` to put the limb array onto a 4635statically-allocated stack (it's just a global array with a set size). Then, 4636when a `BcNum` with the default capacity is needed, `bc_num_init()` asks if any 4637are available. If the answer is yes, the one on top of the stack is returned. 4638Otherwise, `NULL` is returned, and `bc_num_free()` knows it needs to `malloc()` 4639a new limb array. 4640 4641When the stack is filled, any numbers that `bc` attempts to put on it are just 4642freed. 4643 4644This setup saved a few percent in my testing for version [3.0.0][32], which is 4645when I added it. 4646 4647## `bcl` 4648 4649At the request of one of my biggest users, I spent the time to make a build mode 4650where the number and math code of `bc` could be wrapped into a library, which I 4651called `bcl`. 4652 4653This mode is exclusive; `bc` and `dc` themselves are *not* built when building 4654`bcl`. 4655 4656The only things in the `bc` math code that is not included is: 4657 4658* Printing newlines (clients do not care about `bc`'s line lenth restriction). 4659* `dc`'s stream print. 4660 4661Even the [pseudo-random number generator][179] is included, with extra support 4662for generating real numbers with it. (In `bc`, such support is in 4663[`lib2.bc`][26].) 4664 4665### Signal Handling 4666 4667Like signal handling in `bc` proper (see the [Async-Signal-Safe Signal 4668Handling][173] section), `bcl` has the infrastructure for signal handling. 4669 4670This infrastructure is different, however, as `bcl` assumes that clients will 4671implement their own signal handling. 4672 4673So instead of doing signal handling on its own, `bcl` provides the capability to 4674interrupt executions and return to the clients almost immediately. Like in `bc`, 4675this is done with `setjmp()` and `longjmp()`, although the jump series is 4676stopped before returning normally to client code. 4677 4678### Contexts 4679 4680Contexts were an idea by the same user that requested `bcl`. They are meant to 4681make it so multiple clients in one program can keep their data separate from 4682each other. 4683 4684### Numbers 4685 4686Numbers in `bcl` are literally indices into an encapsulated array of numbers, 4687hidden in the context. These indices are then passed to clients to refer to 4688numbers later. 4689 4690### Operand Consumption 4691 4692Most math functions in `bcl` "consume" their operand arguments; the arguments 4693are freed, whether or not an error is returned. 4694 4695This is to make it easy to implement math code, like this: 4696 4697``` 4698n = bcl_add(bcl_mul(a, b), bcl_div(c, d)); 4699``` 4700 4701If numbers need to be preserved, they can be with `bcl_dup()`: 4702 4703``` 4704n = bcl_add(bcl_mul(bcl_dup(a), bc_dup(b)), bcl_div(bcl_dup(c), bcl_dup(d))); 4705``` 4706 4707### Errors 4708 4709Errors can be encoded in the indices representing numbers, and where necessary, 4710clients are responsible for checking those errors. 4711 4712The encoding of errors is this: if an error happens, the value `0-error` is 4713returned. To decode, do the exact same thing. Thus, any index above 4714`0-num_errors` is an error. 4715 4716If an index that represents an error is passed to a math function, that function 4717propagates the error to its result and does not perform the math operation. 4718 4719All of this is to, once again, make it easy to implement the math code as above. 4720 4721However, where possible, errors are returned directly. 4722 4723[1]: https://en.wikipedia.org/wiki/Bus_factor 4724[2]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/bc.html#top 4725[3]: https://en.wikipedia.org/wiki/Dc_(Unix) 4726[4]: https://en.wikipedia.org/wiki/Reverse_Polish_notation 4727[5]: ./bc/A.1.md#standard-library 4728[6]: https://github.com/torvalds/linux/blob/master/kernel/time/timeconst.bc 4729[7]: ./bc/A.1.md#extended-library 4730[8]: #libbc-2 4731[9]: #strgensh 4732[10]: https://vimeo.com/230142234 4733[11]: https://gavinhoward.com/2019/12/values-for-yao/ 4734[12]: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf 4735[13]: ./build.md#cross-compiling 4736[14]: ./build.md 4737[15]: #strgenc 4738[16]: http://landley.net/toybox/about.html 4739[17]: https://www.busybox.net/ 4740[18]: https://en.wikipedia.org/wiki/Karatsuba_algorithm 4741[19]: https://clang-analyzer.llvm.org/scan-build.html 4742[20]: https://www.valgrind.org/ 4743[21]: https://clang.llvm.org/docs/AddressSanitizer.html 4744[22]: https://gavinhoward.com/2019/11/finishing-software/ 4745[23]: #history 4746[24]: https://clang.llvm.org/docs/ClangFormat.html 4747[25]: ./algorithms.md 4748[26]: #lib2bc 4749[27]: #vmh 4750[28]: https://github.com/rain-1/linenoise-mob 4751[29]: https://github.com/antirez/linenoise 4752[30]: #bclh 4753[31]: #argsh 4754[32]: ../NEWS.md#3-0-0 4755[33]: ../NEWS.md 4756[34]: https://github.com/skeeto/optparse 4757[35]: #opth 4758[36]: #historyh 4759[37]: #randh 4760[38]: #langh 4761[39]: #numc 4762[40]: #bcc 4763[41]: #bc_lexc 4764[42]: #bc_parsec 4765[43]: #libraryc 4766[44]: #dcc 4767[45]: #dc_lexc 4768[46]: #dc_parsec 4769[47]: #filec 4770[48]: #historyc 4771[49]: #langc 4772[50]: #lexc 4773[51]: #optc 4774[52]: #parsec 4775[53]: #programc 4776[54]: #randc 4777[55]: #fileh 4778[56]: #readc 4779[57]: #programh 4780[58]: #vmc 4781[59]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/gencat.html#top 4782[60]: #manpagesh 4783[61]: #bcl3md 4784[62]: #bcl3 4785[63]: #bclvcxproj 4786[64]: #bclvcxprojfilters 4787[65]: #bclsln 4788[66]: #bcvcxproj 4789[67]: #bcvcxprojfilters 4790[68]: #bcsln 4791[69]: #configuresh 4792[70]: #makefilein 4793[71]: #functionsh 4794[72]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html#top 4795[73]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18 4796[74]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html#top 4797[75]: #versionh 4798[76]: ##posix-shell-scripts 4799[77]: #tests 4800[78]: #karatsubapy 4801[79]: #bc-1 4802[80]: #dc-1 4803[81]: ./build.md#build-type 4804[82]: #fuzzing-1 4805[83]: #releasesh 4806[84]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_02 4807[85]: #locales-1 4808[86]: #manuals-1 4809[87]: #locale_installsh 4810[88]: #locale_uninstallsh 4811[89]: #bc1mdin 4812[90]: #dc1mdin 4813[91]: #bc 4814[92]: https://pandoc.org/ 4815[93]: #release_settingstxt 4816[94]: #aflpy 4817[95]: #randmathpy 4818[96]: #caching-of-numbers 4819[97]: #error-handling 4820[98]: #radamsatxt 4821[99]: https://gitlab.com/akihe/radamsa 4822[100]: #radamsash 4823[101]: https://musl.libc.org/ 4824[102]: ./build.md#settings 4825[103]: #test_settingstxt 4826[104]: #test_settingssh 4827[105]: #functionssh 4828[106]: #bch 4829[107]: https://en.wikipedia.org/wiki/Infix_notation 4830[108]: https://en.wikipedia.org/wiki/Reverse_Polish_notation 4831[109]: https://en.wikipedia.org/wiki/Shunting-yard_algorithm 4832[110]: #bc-parsing 4833[111]: #vectors 4834[112]: https://git.yzena.com/Yzena/Yc/src/branch/master/include/yc/vector.h 4835[113]: https://en.wikipedia.org/wiki/Signal_(IPC) 4836[114]: #custom-io 4837[115]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03 4838[116]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/setjmp.html 4839[117]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html 4840[118]: https://www.youtube.com/watch?v=4PaWFYm0kEw 4841[119]: #fuzz_prepsh 4842[120]: #bc_aflyaml 4843[121]: #bc_afl_continueyaml 4844[122]: https://github.com/tmux/tmux 4845[123]: https://tmuxp.git-pull.com/ 4846[124]: #test-suite 4847[125]: https://aflplus.plus/ 4848[126]: #link-time-optimization 4849[127]: #fuzzing-performance 4850[128]: #radamsa 4851[129]: #afl-quickstart 4852[130]: #convenience 4853[131]: #datac 4854[132]: https://git.yzena.com/gavin/vim-bc 4855[133]: https://git.yzena.com/gavin/bc_libs 4856[134]: #debugging 4857[135]: #asserts 4858[136]: #portability 4859[137]: https://pexpect.readthedocs.io/en/stable/ 4860[138]: #test-suite-portability 4861[139]: #historypy 4862[140]: #historysh 4863[141]: #group-tests 4864[142]: #build-system 4865[143]: #generated-tests 4866[144]: #benchmarks-1 4867[145]: #gen 4868[146]: #test-coverage 4869[147]: #integration-with-the-build-system 4870[148]: #test-scripts 4871[149]: #standard-tests 4872[150]: #script-tests 4873[151]: #error-tests 4874[152]: #stdin-tests 4875[153]: #read-tests 4876[154]: #other-tests 4877[155]: #history-tests 4878[156]: #bcl 4879[157]: #bcl-test 4880[158]: #bclc 4881[159]: #valgrind 4882[160]: #addresssanitizer-and-friends 4883[161]: #bc-2 4884[162]: #dc-2 4885[163]: #alltxt-1 4886[164]: #errorstxt 4887[165]: #posix_errorstxt 4888[166]: #timeconstsh 4889[167]: #alltxt-3 4890[168]: #errorstxt-1 4891[169]: #scripts-1 4892[170]: #scripts-2 4893[171]: #alltxt-2 4894[172]: #alltxt-4 4895[173]: #async-signal-safe-signal-handling 4896[174]: #vectorh 4897[175]: #read_errorstxt 4898[176]: #statush 4899[177]: #numbers 4900[178]: #math-style 4901[179]: #pseudo-random-number-generator 4902[180]: #lexh 4903[181]: #parseh 4904[182]: #dch 4905[183]: #libraryh 4906[184]: #numh 4907[185]: #readh 4908[186]: #maps 4909[187]: #slabs-and-slab-vectors 4910[188]: ./build.md#extra-math 4911[189]: #command-line-history 4912[190]: #scriptsed 4913[191]: #linux-timeconstbc-script 4914[192]: #corpuses 4915[193]: ./build.md#history 4916[194]: https://www.valgrind.org/docs/manual/mc-manual.html 4917[195]: #othersh 4918[196]: https://scan.coverity.com/ 4919[197]: https://clang-analyzer.llvm.org/ 4920[198]: https://unix.stackexchange.com/questions/253349/eintr-is-there-a-rationale-behind-it 4921[199]: https://cr.yp.to/docs/selfpipe.html 4922[200]: https://skarnet.org/cgi-bin/archive.cgi?2:mss:1607:201701:dfblejammjllfkggpcph 4923[201]: https://slembcke.github.io/2020/10/12/CustomAllocators.html#1-slab-allocator 4924[202]: https://en.wikipedia.org/wiki/Tail_call 4925[203]: https://en.wikipedia.org/wiki/Functional_programming_language 4926[204]: https://en.wikipedia.org/wiki/C_dynamic_memory_allocation 4927[205]: #mainc 4928[206]: #argc 4929[207]: https://en.wikipedia.org/wiki/Abstract_syntax_tree 4930[208]: #functions 4931[209]: #parsing 4932[210]: https://en.wikipedia.org/wiki/Stack_machine 4933[211]: https://en.wikipedia.org/wiki/Register_machine 4934[212]: #include 4935[213]: #src 4936[214]: https://gmplib.org/manual/Nomenclature-and-Types 4937[215]: https://en.wikipedia.org/wiki/Radix_point 4938[216]: #main-and-read-functions 4939[215]: https://www.pcg-random.org/ 4940[216]: https://lemire.me/blog/2017/08/22/testing-non-cryptographic-random-number-generators-my-results/ 4941[217]: https://en.wikipedia.org/wiki/Profile-guided_optimization 4942[218]: https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fexpect 4943[219]: ./build.md#optimization 4944[220]: #benchmarksh 4945[221]: https://cgit.freebsd.org/src/tree/usr.bin/ministat/ministat.c 4946[222]: https://www.freebsd.org/cgi/man.cgi?query=ministat&apropos=0&sektion=0&manpath=FreeBSD+13.0-RELEASE+and+Ports&arch=default&format=html 4947[223]: #ministatc 4948[224]: #dc 4949