; RUN: llc %s -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -o -

; When an i64 sub is expanded to subc + sube:
;
;   libcall #1
;      \
;       \       subc
;        \     /    \
;         \   /      \
;          \ /    libcall #2
;          sube
;
; If the libcalls are not serialized (i.e. both have chains which are the dag
; entry), the legalizer can serialize them in an arbitrary order. If it's
; unlucky, it can force libcall #2 before libcall #1 in the above case:
;
;   subc
;    |
;   libcall #2
;    |
;   libcall #1
;    |
;   sube
;
; However, since subc and sube are "glued" together, this ends up being a
; cycle when the scheduler combines subc and sube into a single scheduling
; unit.
;
; The right solution is to fix LegalizeType to chain the libcalls together.
; However, LegalizeType is not processing nodes in order. The current fix is to
; make subc / sube (and addc / adde) use a physical register dependency instead.
; rdar://10019576

define void @t() nounwind {
entry:
  %tmp = load i64, i64* undef, align 4
  %tmp5 = udiv i64 %tmp, 30
  %tmp13 = and i64 %tmp5, 64739244643450880
  %tmp16 = sub i64 0, %tmp13
  %tmp19 = and i64 %tmp16, 63
  %tmp20 = urem i64 %tmp19, 3
  %tmp22 = and i64 %tmp16, -272346829004752
  store i64 %tmp22, i64* undef, align 4
  store i64 %tmp20, i64* undef, align 4
  ret void
}