1# String Builder optimizations 2 3## Overview 4 5This set of optimizations targets String Builder usage specifics. 6 7## Rationality 8 9String Builder is used to construct a string out of smaller pieces. In some cases it is optimal to use String Builder to collect intermediate parts, but in other cases we prefer naive string concatenation, due to an overhead introduced by String Builder object. 10 11## Dependence 12 13* BoundsAnalysis 14* AliasAnalysis 15* LoopAnalysis 16* DominatorsTree 17 18## Algorithm 19 20**Remove unnecessary String Builder** 21 22Example: 23```TS 24let input: String = ... 25let sb = new StringBuilder(input) 26let output = sb.toString() 27``` 28Since there are no `StringBuilder::append()`-calls in between `constructor` and `toString()`-call, `input` string is equal to `output` string. So, the example code is equivalent to 29```TS 30let input: String = ... 31let output = input 32``` 33 34**Replace String Builder with string concatenation** 35 36String concatenation expressed as a plus-operator over string operands turned into a String Builder usage by a frontend. 37 38Example: 39```TS 40let a, b: String 41... 42let output = a + b 43``` 44Frontend output equivalent: 45```TS 46let a, b: String 47... 48let sb = new StringBuilder() 49sb.append(a) 50sb.append(b) 51let output = sb.toString() 52``` 53The overhead of String Builder object exceeds the benefits of its usage (comparing to a naive string concatenation) for a small number of operands (two in the example above). So, we replace String Builder in such a cases back to naive string concatenation. 54 55**Optimize concatenation loops** 56 57Consider a string accumulation loop example: 58```TS 59let inputs: string[] = ... // array of strings 60let output = "" 61for (let input in inputs) 62 output += input 63``` 64Like in **Replace String Builder with string concatenation** section, frontend replaces string accumulation `output += input` with a String Builder usage, resulting in a huge performance degradation (comparing to a naive string concatenation), because *at each loop iteration* String Builder object is constructed, used to append two operands, builds resulting string, and discarded. 65 66The equivalent code looks like the following: 67```TS 68let inputs: string[] = ... // array of strings 69let output = "" 70for (let input in inputs) { 71 let sb = new StringBuilder() 72 sb.append(output) 73 sb.append(input) 74 output = sb.toString() 75} 76``` 77To optimize cases like this, we implement the following equivalent transformation: 78```TS 79let inputs: string[] = ... // array of strings 80let sb = new StringBuilder() 81for (let input in inputs) { 82 sb.append(input) 83} 84let output = sb.toString() 85``` 86 87## Pseudocode 88 89**Complete algorithm** 90 91```C# 92function SimplifyStringBuilder(graph: Graph) 93 foreach loop in graph 94 OptimizeStringConcatenation(loop) 95 foreach block in graph (in RPO) 96 OptimizeStringBuilderToString(block) 97 OptimizeStringConcatenation(block) 98``` 99 100Below we describe the algorithm in more details 101 102**Remove unnecessary String Builder** 103 104The algorithm works as follows: first we search for all the StringBuilder instances in a basic block, then we replace all their toString-call usages with instance constructor argument until we meet any other usage in RPO. 105```C# 106function OptimizeStringBuilderToString(block: BasicBlock) 107 foreach ctor being StringBuilder constructor with String argument in block 108 let instance be StringBuilder instance of ctor-call 109 let arg be ctor-call string argument 110 foreach usage of instance (in RPO) 111 if usage is toString-call 112 replace usage with arg 113 else 114 break 115 if instance is not used 116 remove ctor from block 117 remove instance from block 118``` 119**Replace String Builder with string concatenation** 120 121The algorithm works as follows: first we search for all the StringBuilder instances in a basic block, then we check if the use of instance matches concatenation pattern for 2, 3, or 4 arguments. If so, we replace the whole use of StringBuilder object with concatenation intrinsics. 122```C# 123function OptimizeStringConcatenation(block: BasicBlock) 124 foreach ctor being StringBuilder default constructor in block 125 let instance be StringBuilder instance of ctor-call 126 let match = MatchConcatenation(instance) 127 let appendCount = match.appendCount be number of append-calls of instance 128 let append = match.append be an array of append-calls of instance 129 let toStringCall = match.toStringCall be toString-call of instance 130 let concat01 = ConcatIntrinsic(append[0].input(1), append[1].input(1)) 131 remove append[0] from block 132 remove append[1] from block 133 switch appendCount 134 case 2 135 replace toStringCall with concat01 136 break 137 case 3 138 let concat012 = ConcatIntrinsic(concat01, append[2].input(1)) 139 remove append[2] from block 140 replace toStringCall with concat012 141 break 142 case 4 143 let concat23 = ConcatIntrinsic(append[2].input(1), append[3].input(1)) 144 let concat0123 = ConcatIntrinsic(concat01, concat23) 145 remove append[2] from block 146 remove append[3] from block 147 replace toStringCall with concat0123 148 remove toStringCall from block 149 remove ctor from block 150 remove instance from block 151 152function ConcatIntrinsic(arg0, arg1): IntrinsicInst 153 return concatenation intrinsic for arg0 and arg1 154 155type Match 156 toStringCall: CallInst 157 appendCount: Integer 158 append: array of CallInst 159 160function MatchConcatenation(instance: StringBuilder): Match 161 let match: Match 162 foreach usage of instance 163 if usage is toString-call 164 set match.toStringCall = usage 165 elif usage is append-call 166 add usage to match.append array 167 increment match.appendCount 168 return match 169``` 170**Optimize concatenation loops** 171 172The algorithm works as follows: first we recursively process all the inner loops of current loop, then we search for string concatenation patterns within a current loop. For each pattern found we reconnect StringBuilder usage instructions in a correct way (making them point to the only one instance we have chosen), move chosen String Builder object creation and initial string value appending to a loop pre-header, move chosen StringBuilder object toString-call to loop exit block. We cleanup unused instructions at the end. 173```C# 174function OptimizeStringConcatenation(loop: Loop) 175 foreach innerLoop being inner loop of loop 176 OptimizeStringConcatenation(innerLoop) 177 let matches = MatchConcatenationLoop(loop) 178 foreach match in matches) 179 ReconnectInstructions(match) 180 HoistInstructionsToPreHeader(match) 181 HoistInstructionsToExitBlock(match) 182 } 183 Cleanup(loop, matches) 184 185type Match 186 accValue: PhiInst 187 initialValue: Inst 188 // instructions to be hoisted to preheader 189 preheader: type 190 instance: Inst 191 appendAccValue: IntrinsicInst 192 // instructions to be left inside loop 193 loop: type 194 appendIntrinsics: array of IntrinsicInst 195 // instructions to be deleted 196 temp: array of type 197 toStringCall: Inst 198 instance: Inst 199 appendAccValue: IntrinsicInst 200 // instructions to be hoisted to exit block 201 exit: type 202 toStringCall: Inst 203 204function MatchConcatenationLoop(loop: Loop) 205 let matches: array of Match 206 foreach accValue being string accumulator in a loop 207 let match: Match 208 foreach instance being StringBuilder instance used to update accValue (in RPO) 209 if match is empty 210 // Fill preheader and exit parts of a match 211 set match.accValue = accValue 212 set match.initialValue = FindInitialValue(accValue) 213 set match.exit.toStringCall = FindToStringCall(instance) 214 set match.preheader.instance = instance 215 set match.preheader.appendAccValue = FindAppendIntrinsic(instance, accValue) 216 // Init loop part of a match 217 add other append instrinsics to match.loop.appendInstrinsics array 218 else 219 // Fill loop and temporary parts of a match 220 let temp: TemporaryInstructions 221 set temp.instance = instance 222 set temp.toStringCall = FindToStringCall(instance) 223 foreach appendIntrinsic in FindAppendIntrinsics(instance) 224 if appendIntrinsic.input(1) is accValue 225 set temp.appendAccValue = appendIntrinsic 226 else 227 add appendIntricsic to match.loop.appendInstrinsics array 228 add temp to match.temp array 229 add match to matches array 230 return matches 231 232function ReconnectInstructions(match: Match) 233 match.preheader.appendAcc.setInput(0, match.preheader.instance) 234 match.preheader.appendAcc.setInput(1, be match.initialValue) 235 match.exit.toStringCall.setInput(0, match.preheader.instance) 236 foreach user being users of match.accValue outside loop 237 user.replaceInput(match.accValue, match.exit.toStringCall) 238 239function HoistInstructionsToPreHeader(match: Match) 240 foreach inst in match.preheader 241 hoist inst to loop preheader 242 fix broken save states 243 244function HoistInstructionsToExitBlock(match: Match) 245 let exitBlock be to exit block of loop 246 hoist match.exit.toStringCall to exitBlock 247 foreach input being inputs of match.exit.toStringCall inside loop 248 hoist input to exitBlock 249 250function Cleanup(loop: Loop, matches: array of Match) 251 foreach block in loop 252 fix save states in block 253 foreach match in matches 254 foreach temp in match.temp 255 foreach inst in temp 256 remove inst 257 foreach block in loop 258 foreach phi in block 259 if phi is not used 260 remove phi from block 261``` 262 263## Examples 264 265**Remove unnecessary String Builder** 266 267ETS function example: 268```TS 269function toString0(str: String): String { 270 return new StringBuilder(str).toString(); 271} 272``` 273 274IR before transformation: 275 276(Save state and null check instructions are skipped for simplicity) 277``` 278Method: std.core.String ETSGLOBAL::toString0(std.core.String) 279 280BB 1 281prop: start 282 0.ref Parameter arg 0 -> (v5) 283succs: [bb 0] 284 285BB 0 preds: [bb 1] 286prop: 287 3.ref LoadAndInitClass 'std.core.StringBuilder' ss -> (v4) 288 4.ref NewObject 15300 v3, ss -> (v5, v10) 289 5.void CallStatic 51211 std.core.StringBuilder::<ctor> v4, v0, ss 290 10.ref CallVirtual 51332 std.core.StringBuilder::toString v4, ss -> (v11) 291 11.ref Return v10 292succs: [bb 2] 293 294BB 2 preds: [bb 0] 295prop: end 296``` 297IR after transformation: 298``` 299Method: std.core.String ETSGLOBAL::toString0(std.core.String) 300 301BB 1 302prop: start 303 0.ref Parameter arg 0 -> (v10) 304succs: [bb 0] 305 306BB 0 preds: [bb 1] 307prop: 308 10.ref Return v0 309succs: [bb 2] 310 311BB 2 preds: [bb 0] 312prop: end 313``` 314 315**Replace String Builder with string concatenation** 316 317ETS function example: 318```TS 319function concat0(a: String, b: String): String { 320 return a + b; 321} 322``` 323IR before transformation: 324``` 325Method: std.core.String ETSGLOBAL::concat0(std.core.String, std.core.String) 326 327BB 1 328prop: start 329 0.ref Parameter arg 0 -> (v10) 330 1.ref Parameter arg 1 -> (v13) 331succs: [bb 0] 332 333BB 0 preds: [bb 1] 334prop: 335 4.ref LoadAndInitClass 'std.core.StringBuilder' ss -> (v5) 336 5.ref NewObject 11355 v4, ss -> (v13, v10, v6) 337 6.void CallStatic 60100 std.core.StringBuilder::<ctor> v5, ss 338 10.ref Intrinsic.StdCoreSbAppendString v5, v0, ss 339 13.ref Intrinsic.StdCoreSbAppendString v5, v1, ss 340 16.ref CallStatic 60290 std.core.StringBuilder::toString v5, ss -> (v17) 341 17.ref Return v16 342succs: [bb 2] 343 344BB 2 preds: [bb 0] 345prop: end 346``` 347IR after transformation: 348``` 349Method: std.core.String ETSGLOBAL::concat0(std.core.String, std.core.String) 350 351BB 1 352prop: start 353 0.ref Parameter arg 0 -> (v18) 354 1.ref Parameter arg 1 -> (v18) 355succs: [bb 0] 356 357BB 0 preds: [bb 1] 358prop: 359 18.ref Intrinsic.StdCoreStringConcat2 v0, v1, ss -> (v17) 360 17.ref Return v18 361succs: [bb 2] 362 363BB 2 preds: [bb 0] 364prop: end 365``` 366 367**Optimize concatenation loops** 368 369ETS function example: 370```TS 371function concat_loop0(a: String, n: int): String { 372 let str: String = ""; 373 for (let i = 0; i < n; ++i) 374 str = str + a; 375 return str; 376} 377``` 378IR before transformation: 379``` 380Method: std.core.String ETSGLOBAL::concat_loop0(std.core.String, i32) 381 382BB 4 383prop: start 384 0.ref Parameter arg 0 -> (v9p) 385 1.i32 Parameter arg 1 -> (v10p) 386 3.i64 Constant 0x0 -> (v7p) 387 30.i64 Constant 0x1 -> (v29) 388succs: [bb 0] 389 390BB 0 preds: [bb 4] 391prop: prehead 392 4.ref LoadString 63726 v5 -> (v8p) 393succs: [bb 3] 394 395BB 3 preds: [bb 0, bb 2] 396prop: head, loop 1, depth 1 397 7p.i32 Phi v3(bb0), v29(bb2) -> (v29, v13) 398 8p.ref Phi v4(bb0), v28(bb2) -> (v31, v12) 399 9p.ref Phi v0(bb0), v9p(bb2) -> (v12, v9p, v25) 400 10p.i32 Phi v1(bb0), v10p(bb2) -> (v10p, v13) 401 13.b Compare GE i32 v7p, v10p -> (v14) 402 14. IfImm NE b v13, 0x0 403succs: [bb 1, bb 2] 404 405BB 2 preds: [bb 3] 406prop: loop 1, depth 1 407 16.ref LoadAndInitClass 'std.core.StringBuilder' ss -> (v17) 408 17.ref NewObject 22178 v16, ss -> (v28, v25, v22) 409 18.void CallStatic 60220 std.core.StringBuilder::<ctor> v17, ss 410 22.ref Intrinsic.StdCoreSbAppendString v17, v8p, ss 411 25.ref Intrinsic.StdCoreSbAppendString v17, v9p, ss 412 28.ref CallStatic 60410 std.core.StringBuilder::toString v17, ss -> (v11p, v8p) 413 29.i32 Add v7p, v30 -> (v7p) 414succs: [bb 3] 415 416BB 1 preds: [bb 3] 417prop: 418 31.ref Return v8p 419succs: [bb 5] 420 421BB 5 preds: [bb 1] 422prop: end 423``` 424IR after transformation: 425``` 426Method: std.core.String ETSGLOBAL::concat_loop0(std.core.String, i32) 427 428BB 4 429prop: start 430 0.ref Parameter arg 0 -> (v25) 431 1.i32 Parameter arg 1 -> (v40, v13) 432 3.i64 Constant 0x0 -> (v40, v7p) 433 30.i64 Constant 0x1 -> (v29) 434succs: [bb 0] 435 436BB 0 preds: [bb 4] 437prop: prehead 438 4.ref LoadString 63726 ss -> (v22) 439 16.ref LoadAndInitClass 'std.core.StringBuilder' ss -> (v17) 440 17.ref NewObject 22178 v16, ss -> (v25, v28, v22, v18) 441 18.void CallStatic 60220 std.core.StringBuilder::<ctor> v17, ss 442 22.ref Intrinsic.StdCoreSbAppendString v17, v4, ss 443 40.b Compare GE i32 v3, v1 -> (v41) 444 41. IfImm NE b v40, 0x0 445succs: [bb 1, bb 2] 446 447BB 2 preds: [bb 0, bb 2] 448prop: head, loop 1, depth 1 449 7p.i32 Phi v3(bb0), v29(bb2) -> (v29) 450 25.ref Intrinsic.StdCoreSbAppendString v17, v0, ss 451 29.i32 Add v7p, v30 -> (v13, v7p) 452 13.b Compare GE i32 v29, v1 -> (v14) 453 14. IfImm NE b v13, 0x0 454succs: [bb 1, bb 2] 455 456BB 1 preds: [bb 2, bb 0] 457prop: 458 28.ref CallStatic 60410 std.core.StringBuilder::toString v17, ss 459 31.ref Return v28 460succs: [bb 5] 461 462BB 5 preds: [bb 1] 463prop: end 464``` 465 466## Links 467 468* Implementation 469 * [simplify_string_builder.h](../optimizer/optimizations/simplify_string_builder.h) 470 * [simplify_string_builder.cpp](../optimizer/optimizations/simplify_string_builder.cpp) 471* Tests 472 * [ets_string_builder.sts](../../plugins/ets/tests/checked/ets_string_builder.sts) 473 * [ets_string_concat.sts](../../plugins/ets/tests/checked/ets_string_concat.sts) 474 * [ets_string_concat_loop.sts](../../plugins/ets/tests/checked/ets_string_concat_loop.sts) 475