• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# String Builder optimizations
2
3## Overview
4
5This set of optimizations targets String Builder usage specifics.
6
7## Rationality
8
9String Builder is used to construct a string out of smaller pieces. In some cases it is optimal to use String Builder to collect intermediate parts, but in other cases we prefer naive string concatenation, due to an overhead introduced by String Builder object.
10
11## Dependence
12
13* BoundsAnalysis
14* AliasAnalysis
15* LoopAnalysis
16* DominatorsTree
17
18## Algorithm
19
20**Remove unnecessary String Builder**
21
22Example:
23```TS
24let input: String = ...
25let sb = new StringBuilder(input)
26let output = sb.toString()
27```
28Since there are no `StringBuilder::append()`-calls in between `constructor` and `toString()`-call, `input` string is equal to `output` string. So, the example code is equivalent to
29```TS
30let input: String = ...
31let output = input
32```
33
34**Replace String Builder with string concatenation**
35
36String concatenation expressed as a plus-operator over string operands turned into a String Builder usage by a frontend.
37
38Example:
39```TS
40let a, b: String
41...
42let output = a + b
43```
44Frontend output equivalent:
45```TS
46let a, b: String
47...
48let sb = new StringBuilder()
49sb.append(a)
50sb.append(b)
51let output = sb.toString()
52```
53The overhead of String Builder object exceeds the benefits of its usage (comparing to a naive string concatenation) for a small number of operands (two in the example above). So, we replace String Builder in such a cases back to naive string concatenation.
54
55**Optimize concatenation loops**
56
57Consider a string accumulation loop example:
58```TS
59let inputs: string[] = ... // array of strings
60let output = ""
61for (let input in inputs)
62    output += input
63```
64Like in **Replace String Builder with string concatenation** section, frontend replaces string accumulation `output += input` with a String Builder usage, resulting in a huge performance degradation (comparing to a naive string concatenation), because *at each loop iteration* String Builder object is constructed, used to append two operands, builds resulting string, and discarded.
65
66The equivalent code looks like the following:
67```TS
68let inputs: string[] = ... // array of strings
69let output = ""
70for (let input in inputs) {
71    let sb = new StringBuilder()
72    sb.append(output)
73    sb.append(input)
74    output = sb.toString()
75}
76```
77To optimize cases like this, we implement the following equivalent transformation:
78```TS
79let inputs: string[] = ... // array of strings
80let sb = new StringBuilder()
81for (let input in inputs) {
82    sb.append(input)
83}
84let output = sb.toString()
85```
86
87## Pseudocode
88
89**Complete algorithm**
90
91```C#
92function SimplifyStringBuilder(graph: Graph)
93    foreach loop in graph
94        OptimizeStringConcatenation(loop)
95    foreach block in graph (in RPO)
96        OptimizeStringBuilderToString(block)
97        OptimizeStringConcatenation(block)
98```
99
100Below we describe the algorithm in more details
101
102**Remove unnecessary String Builder**
103
104The algorithm works as follows: first we search for all the StringBuilder instances in a basic block, then we replace all their toString-call usages with instance constructor argument until we meet any other usage in RPO.
105```C#
106function OptimizeStringBuilderToString(block: BasicBlock)
107    foreach ctor being StringBuilder constructor with String argument in block
108        let instance be StringBuilder instance of ctor-call
109        let arg be ctor-call string argument
110        foreach usage of instance (in RPO)
111            if usage is toString-call
112                replace usage with arg
113            else
114                break
115        if instance is not used
116            remove ctor from block
117            remove instance from block
118```
119**Replace String Builder with string concatenation**
120
121The algorithm works as follows: first we search for all the StringBuilder instances in a basic block, then we check if the use of instance matches concatenation pattern for 2, 3, or 4 arguments. If so, we replace the whole use of StringBuilder object with concatenation intrinsics.
122```C#
123function OptimizeStringConcatenation(block: BasicBlock)
124    foreach ctor being StringBuilder default constructor in block
125        let instance be StringBuilder instance of ctor-call
126        let match = MatchConcatenation(instance)
127        let appendCount = match.appendCount be number of append-calls of instance
128        let append = match.append be an array of append-calls of instance
129        let toStringCall = match.toStringCall be toString-call of instance
130        let concat01 = ConcatIntrinsic(append[0].input(1), append[1].input(1))
131        remove append[0] from block
132        remove append[1] from block
133        switch appendCount
134            case 2
135                replace toStringCall with concat01
136                break
137            case 3
138                let concat012 = ConcatIntrinsic(concat01, append[2].input(1))
139                remove append[2] from block
140                replace toStringCall with concat012
141                break
142            case 4
143                let concat23 = ConcatIntrinsic(append[2].input(1), append[3].input(1))
144                let concat0123 = ConcatIntrinsic(concat01, concat23)
145                remove append[2] from block
146                remove append[3] from block
147                replace toStringCall with concat0123
148        remove toStringCall from block
149        remove ctor from block
150        remove instance from block
151
152function ConcatIntrinsic(arg0, arg1): IntrinsicInst
153    return concatenation intrinsic for arg0 and arg1
154
155type Match
156    toStringCall: CallInst
157    appendCount: Integer
158    append: array of CallInst
159
160function MatchConcatenation(instance: StringBuilder): Match
161    let match: Match
162    foreach usage of instance
163        if usage is toString-call
164            set match.toStringCall = usage
165        elif usage is append-call
166            add usage to match.append array
167            increment match.appendCount
168    return match
169```
170**Optimize concatenation loops**
171
172The algorithm works as follows: first we recursively process all the inner loops of current loop, then we search for string concatenation patterns within a current loop. For each pattern found we reconnect StringBuilder usage instructions in a correct way (making them point to the only one instance we have chosen), move chosen String Builder object creation and initial string value appending to a loop pre-header, move chosen StringBuilder object toString-call to loop exit block. We cleanup unused instructions at the end.
173```C#
174function OptimizeStringConcatenation(loop: Loop)
175    foreach innerLoop being inner loop of loop
176        OptimizeStringConcatenation(innerLoop)
177    let matches = MatchConcatenationLoop(loop)
178    foreach match in matches)
179        ReconnectInstructions(match)
180        HoistInstructionsToPreHeader(match)
181        HoistInstructionsToExitBlock(match)
182    }
183    Cleanup(loop, matches)
184
185type Match
186    accValue: PhiInst
187    initialValue: Inst
188    // instructions to be hoisted to preheader
189    preheader: type
190        instance: Inst
191        appendAccValue: IntrinsicInst
192    // instructions to be left inside loop
193    loop: type
194        appendIntrinsics: array of IntrinsicInst
195    // instructions to be deleted
196    temp: array of type
197        toStringCall: Inst
198        instance: Inst
199        appendAccValue: IntrinsicInst
200    // instructions to be hoisted to exit block
201    exit: type
202        toStringCall: Inst
203
204function MatchConcatenationLoop(loop: Loop)
205    let matches: array of Match
206    foreach accValue being string accumulator in a loop
207        let match: Match
208        foreach instance being StringBuilder instance used to update accValue (in RPO)
209            if match is empty
210                // Fill preheader and exit parts of a match
211                set match.accValue = accValue
212                set match.initialValue = FindInitialValue(accValue)
213                set match.exit.toStringCall = FindToStringCall(instance)
214                set match.preheader.instance = instance
215                set match.preheader.appendAccValue = FindAppendIntrinsic(instance, accValue)
216                // Init loop part of a match
217                add other append instrinsics to match.loop.appendInstrinsics array
218            else
219                // Fill loop and temporary parts of a match
220                let temp: TemporaryInstructions
221                set temp.instance = instance
222                set temp.toStringCall = FindToStringCall(instance)
223                foreach appendIntrinsic in FindAppendIntrinsics(instance)
224                    if appendIntrinsic.input(1) is accValue
225                        set temp.appendAccValue = appendIntrinsic
226                    else
227                        add appendIntricsic to match.loop.appendInstrinsics array
228                add temp to match.temp array
229        add match to matches array
230    return matches
231
232function ReconnectInstructions(match: Match)
233    match.preheader.appendAcc.setInput(0, match.preheader.instance)
234    match.preheader.appendAcc.setInput(1, be match.initialValue)
235    match.exit.toStringCall.setInput(0, match.preheader.instance)
236    foreach user being users of match.accValue outside loop
237        user.replaceInput(match.accValue, match.exit.toStringCall)
238
239function HoistInstructionsToPreHeader(match: Match)
240    foreach inst in match.preheader
241        hoist inst to loop preheader
242    fix broken save states
243
244function HoistInstructionsToExitBlock(match: Match)
245    let exitBlock be to exit block of loop
246    hoist match.exit.toStringCall to exitBlock
247    foreach input being inputs of match.exit.toStringCall inside loop
248        hoist input to exitBlock
249
250function Cleanup(loop: Loop, matches: array of Match)
251    foreach block in loop
252        fix save states in block
253    foreach match in matches
254        foreach temp in match.temp
255            foreach inst in temp
256                remove inst
257    foreach block in loop
258        foreach phi in block
259            if phi is not used
260                remove phi from block
261```
262
263## Examples
264
265**Remove unnecessary String Builder**
266
267ETS function example:
268```TS
269function toString0(str: String): String {
270    return new StringBuilder(str).toString();
271}
272```
273
274IR before transformation:
275
276(Save state and null check instructions are skipped for simplicity)
277```
278Method: std.core.String ETSGLOBAL::toString0(std.core.String)
279
280BB 1
281prop: start
282    0.ref  Parameter                  arg 0 -> (v5)
283succs: [bb 0]
284
285BB 0  preds: [bb 1]
286prop:
287    3.ref  LoadAndInitClass 'std.core.StringBuilder' ss -> (v4)
288    4.ref  NewObject 15300            v3, ss -> (v5, v10)
289    5.void CallStatic 51211 std.core.StringBuilder::<ctor> v4, v0, ss
290   10.ref  CallVirtual 51332 std.core.StringBuilder::toString v4, ss -> (v11)
291   11.ref  Return                     v10
292succs: [bb 2]
293
294BB 2  preds: [bb 0]
295prop: end
296```
297IR after transformation:
298```
299Method: std.core.String ETSGLOBAL::toString0(std.core.String)
300
301BB 1
302prop: start
303    0.ref  Parameter                  arg 0 -> (v10)
304succs: [bb 0]
305
306BB 0  preds: [bb 1]
307prop:
308   10.ref  Return                     v0
309succs: [bb 2]
310
311BB 2  preds: [bb 0]
312prop: end
313```
314
315**Replace String Builder with string concatenation**
316
317ETS function example:
318```TS
319function concat0(a: String, b: String): String {
320    return a + b;
321}
322```
323IR before transformation:
324```
325Method: std.core.String ETSGLOBAL::concat0(std.core.String, std.core.String)
326
327BB 1
328prop: start
329    0.ref  Parameter                  arg 0 -> (v10)
330    1.ref  Parameter                  arg 1 -> (v13)
331succs: [bb 0]
332
333BB 0  preds: [bb 1]
334prop:
335    4.ref  LoadAndInitClass 'std.core.StringBuilder' ss -> (v5)
336    5.ref  NewObject 11355            v4, ss -> (v13, v10, v6)
337    6.void CallStatic 60100 std.core.StringBuilder::<ctor> v5, ss
338   10.ref  Intrinsic.StdCoreSbAppendString v5, v0, ss
339   13.ref  Intrinsic.StdCoreSbAppendString v5, v1, ss
340   16.ref  CallStatic 60290 std.core.StringBuilder::toString v5, ss -> (v17)
341   17.ref  Return                     v16
342succs: [bb 2]
343
344BB 2  preds: [bb 0]
345prop: end
346```
347IR after transformation:
348```
349Method: std.core.String ETSGLOBAL::concat0(std.core.String, std.core.String)
350
351BB 1
352prop: start
353    0.ref  Parameter                  arg 0 -> (v18)
354    1.ref  Parameter                  arg 1 -> (v18)
355succs: [bb 0]
356
357BB 0  preds: [bb 1]
358prop:
359   18.ref  Intrinsic.StdCoreStringConcat2 v0, v1, ss -> (v17)
360   17.ref  Return                     v18
361succs: [bb 2]
362
363BB 2  preds: [bb 0]
364prop: end
365```
366
367**Optimize concatenation loops**
368
369ETS function example:
370```TS
371function concat_loop0(a: String, n: int): String {
372    let str: String = "";
373    for (let i = 0; i < n; ++i)
374        str = str + a;
375    return str;
376}
377```
378IR before transformation:
379```
380Method: std.core.String ETSGLOBAL::concat_loop0(std.core.String, i32)
381
382BB 4
383prop: start
384    0.ref  Parameter                  arg 0 -> (v9p)
385    1.i32  Parameter                  arg 1 -> (v10p)
386    3.i64  Constant                   0x0 -> (v7p)
387   30.i64  Constant                   0x1 -> (v29)
388succs: [bb 0]
389
390BB 0  preds: [bb 4]
391prop: prehead
392    4.ref  LoadString 63726           v5 -> (v8p)
393succs: [bb 3]
394
395BB 3  preds: [bb 0, bb 2]
396prop: head, loop 1, depth 1
397   7p.i32  Phi                        v3(bb0), v29(bb2) -> (v29, v13)
398   8p.ref  Phi                        v4(bb0), v28(bb2) -> (v31, v12)
399   9p.ref  Phi                        v0(bb0), v9p(bb2) -> (v12, v9p, v25)
400  10p.i32  Phi                        v1(bb0), v10p(bb2) -> (v10p, v13)
401   13.b    Compare GE i32             v7p, v10p -> (v14)
402   14.     IfImm NE b                 v13, 0x0
403succs: [bb 1, bb 2]
404
405BB 2  preds: [bb 3]
406prop: loop 1, depth 1
407   16.ref  LoadAndInitClass 'std.core.StringBuilder' ss -> (v17)
408   17.ref  NewObject 22178            v16, ss -> (v28, v25, v22)
409   18.void CallStatic 60220 std.core.StringBuilder::<ctor> v17, ss
410   22.ref  Intrinsic.StdCoreSbAppendString v17, v8p, ss
411   25.ref  Intrinsic.StdCoreSbAppendString v17, v9p, ss
412   28.ref  CallStatic 60410 std.core.StringBuilder::toString v17, ss -> (v11p, v8p)
413   29.i32  Add                        v7p, v30 -> (v7p)
414succs: [bb 3]
415
416BB 1  preds: [bb 3]
417prop:
418   31.ref  Return                     v8p
419succs: [bb 5]
420
421BB 5  preds: [bb 1]
422prop: end
423```
424IR after transformation:
425```
426Method: std.core.String ETSGLOBAL::concat_loop0(std.core.String, i32)
427
428BB 4
429prop: start
430    0.ref  Parameter                  arg 0 -> (v25)
431    1.i32  Parameter                  arg 1 -> (v40, v13)
432    3.i64  Constant                   0x0 -> (v40, v7p)
433   30.i64  Constant                   0x1 -> (v29)
434succs: [bb 0]
435
436BB 0  preds: [bb 4]
437prop: prehead
438    4.ref  LoadString 63726           ss -> (v22)
439   16.ref  LoadAndInitClass 'std.core.StringBuilder' ss -> (v17)
440   17.ref  NewObject 22178            v16, ss -> (v25, v28, v22, v18)
441   18.void CallStatic 60220 std.core.StringBuilder::<ctor> v17, ss
442   22.ref  Intrinsic.StdCoreSbAppendString v17, v4, ss
443   40.b    Compare GE i32             v3, v1 -> (v41)
444   41.     IfImm NE b                 v40, 0x0
445succs: [bb 1, bb 2]
446
447BB 2  preds: [bb 0, bb 2]
448prop: head, loop 1, depth 1
449   7p.i32  Phi                        v3(bb0), v29(bb2) -> (v29)
450   25.ref  Intrinsic.StdCoreSbAppendString v17, v0, ss
451   29.i32  Add                        v7p, v30 -> (v13, v7p)
452   13.b    Compare GE i32             v29, v1 -> (v14)
453   14.     IfImm NE b                 v13, 0x0
454succs: [bb 1, bb 2]
455
456BB 1  preds: [bb 2, bb 0]
457prop:
458   28.ref  CallStatic 60410 std.core.StringBuilder::toString v17, ss
459   31.ref  Return                     v28
460succs: [bb 5]
461
462BB 5  preds: [bb 1]
463prop: end
464```
465
466## Links
467
468* Implementation
469    * [simplify_string_builder.h](../optimizer/optimizations/simplify_string_builder.h)
470    * [simplify_string_builder.cpp](../optimizer/optimizations/simplify_string_builder.cpp)
471* Tests
472    * [ets_string_builder.sts](../../plugins/ets/tests/checked/ets_string_builder.sts)
473    * [ets_string_concat.sts](../../plugins/ets/tests/checked/ets_string_concat.sts)
474    * [ets_string_concat_loop.sts](../../plugins/ets/tests/checked/ets_string_concat_loop.sts)
475