Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: IV widening does not kick in for a simple loop with array and string indexing #102068

Open
jakobbotsch opened this issue May 10, 2024 · 3 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:2 Work that is important, but not critical for the release
Milestone

Comments

@jakobbotsch
Copy link
Member

Example from @richlander:

public static void Foo()
{
    string puzzle = "003020600900305001001806400008102900700000008006708200002609500800203009005010300";
    int[] board = new int[81];

    for (int i = 0; i < puzzle.Length; i++)
    {
        board[i] = puzzle[i] - '0';
    }
}

Loop codegen:

       xor      ecx, ecx
       align    [0 bytes for IG03]
						;; size=22 bbWeight=1 PerfScore 1.75
G_M24659_IG03:  ;; offset=0x001A
       mov      edx, ecx
       mov      r8, 0x160D4E52C30      ; '003020600900305001001806400008102900700000008006708200002609500'
       movzx    r8, word  ptr [r8+2*rdx+0x0C]
       add      r8d, -48
       mov      dword ptr [rax+4*rdx+0x10], r8d
       inc      ecx
       cmp      ecx, 81
       jl       SHORT G_M24659_IG03
						;; size=34 bbWeight=3.96 PerfScore 20.79

Expected:

       xor      ecx, ecx
       align    [0 bytes for IG03]
						;; size=22 bbWeight=1 PerfScore 1.75
G_M24659_IG03:  ;; offset=0x001A
       mov      r8, 0x160D4E52C30      ; '003020600900305001001806400008102900700000008006708200002609500'
       movzx    r8, word  ptr [r8+2*rcx+0x0C]
       add      r8d, -48
       mov      dword ptr [rax+4*rcx+0x10], r8d
       inc      ecx
       cmp      ecx, 81
       jl       SHORT G_M24659_IG03
						;; size=34 bbWeight=3.96 PerfScore 20.79
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 10, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label May 10, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch jakobbotsch self-assigned this May 10, 2024
@jakobbotsch jakobbotsch removed the untriaged New issue has not been triaged by the area owner label May 10, 2024
@jakobbotsch jakobbotsch added this to the 9.0.0 milestone May 10, 2024
@jakobbotsch jakobbotsch added the Priority:2 Work that is important, but not critical for the release label May 10, 2024
@jakobbotsch
Copy link
Member Author

We CSE the two zero extensions, but the heuristic then reasons that storing with a zero extension is free in the backend, so it doesn't give any profit to introducing the widened IV. It isn't able to reason that widening will allow the CSE'd local to be removed entirely.

***** BB07 [0001]
STMT00016 ( ??? ... ??? )
N004 (  0,  0) [000119] DA---------                         ▌  STORE_LCL_VAR int    V02 loc2         d:3 $VN.Void
N003 (  0,  0) [000118] -----------                         └──▌  PHI       int    $301
N001 (  0,  0) [000123] ----------- pred BB07                  ├──▌  PHI_ARG   int    V02 loc2         u:4
N002 (  0,  0) [000122] ----------- pred BB06                  └──▌  PHI_ARG   int    V02 loc2         u:2 $c3

***** BB07 [0001]
STMT00004 ( 0x012[E-] ... 0x01E )
N016 ( 12, 20) [000021] DA--GO-----                         ▌  STORE_LCL_VAR int    V04 tmp1         d:2 $VN.Void
N015 ( 12, 20) [000020] -A--GO-----                         └──▌  ADD       int    <l:$547, c:$548>
N013 ( 10, 18) [000043] nA--GO-N---                            ├──▌  IND       ushort <l:$4c3, c:$501>
N012 (  6, 15) [000041] -A---O-----                            │  └──▌  ARR_ADDR  byref ushort[] $84
N011 (  6, 15) [000040] -A-----N---                            │     └──▌  ADD       byref  $403
N009 (  5,  6) [000039] -A-----N---                            │        ├──▌  ADD       long   $3c9
N007 (  4,  5) [000037] -A-----N---                            │        │  ├──▌  LSH       long   $3c8
N005 (  3,  4) [000130] -A---------                            │        │  │  ├──▌  COMMA     long   $3c7
N003 (  2,  3) [000128] DA---------                            │        │  │  │  ├──▌  STORE_LCL_VAR long   V05 cse0         d:1 $VN.Void
N002 (  2,  3) [000035] ---------U-                            │        │  │  │  │  └──▌  CAST      long <- uint $3c7
N001 (  1,  1) [000032] -----------                            │        │  │  │  │     └──▌  LCL_VAR   int    V02 loc2         u:3 $301
N004 (  1,  1) [000129] -----------                            │        │  │  │  └──▌  LCL_VAR   long   V05 cse0         u:1 $3c7
N006 (  1,  1) [000036] -------N---                            │        │  │  └──▌  CNS_INT   long   1 $1c1
N008 (  1,  1) [000038] -----------                            │        │  └──▌  CNS_INT   long   12 $1c3
N010 (  3, 10) [000132] H----------                            │        └──▌  CNS_INT(h) ref     '003020600900305001001806400008102900700000008006708200002609500' $180
N014 (  1,  1) [000019] -----------                            └──▌  CNS_INT   int    -48 $c8

***** BB07 [0001]
STMT00005 ( ??? ... ??? )
N012 (  7,  7) [000024] nA--GO-----                         ▌  STOREIND  int    $VN.Void
N010 (  2,  3) [000055] -----O-N---                         ├──▌  COMMA     byref  $85
N001 (  0,  0) [000047] -----------                         │  ├──▌  NOP       void   $VN.Void
N009 (  2,  3) [000054] -----O-----                         │  └──▌  ARR_ADDR  byref int[] $85
N008 (  2,  3) [000053] -------N---                         │     └──▌  ADD       byref  $404
N002 (  1,  1) [000044] -----------                         │        ├──▌  LCL_VAR   ref    V01 loc1         u:2 $2c0
N007 (  3,  3) [000052] -------N---                         │        └──▌  ADD       long   $3cc
N005 (  2,  2) [000050] -------N---                         │           ├──▌  LSH       long   $3cb
N003 (  1,  1) [000131] -----------                         │           │  ├──▌  LCL_VAR   long   V05 cse0         u:1 $3c7
N004 (  1,  1) [000049] -------N---                         │           │  └──▌  CNS_INT   long   2 $1c4
N006 (  1,  1) [000051] -----------                         │           └──▌  CNS_INT   long   16 $1c5
N011 (  1,  1) [000022] -----------                         └──▌  LCL_VAR   int    V04 tmp1         u:2 (last use) <l:$547, c:$548>

STMT00016 ( ??? ... ??? )
N004 (  0,  0) [000119] DA---------                         ▌  STORE_LCL_VAR int    V02 loc2         d:3 $VN.Void
N003 (  0,  0) [000118] -----------                         └──▌  PHI       int    $301
N001 (  0,  0) [000123] ----------- pred BB07                  ├──▌  PHI_ARG   int    V02 loc2         u:4
N002 (  0,  0) [000122] ----------- pred BB06                  └──▌  PHI_ARG   int    V02 loc2         u:2 $c3
  => <L00, V02.2 (0), 1>
  V02 is a primary induction variable in L00
  Exit BB10 does not need a sink; V02 is not live-in
  Estimated cycle improvement: 0 cycles per invocation
  Estimated size improvement: 0 bytes
  Widening is not profitable

With the extended analysis done in the strength reduction PR we should have enough context to replace all uses. With that PR we have

Found 8 different induction variables.
  [0]: int <L00, 0, 1> (primary IV V02, 2 uses)
    Uses: [000025] [000032]
  [1]: int <L00, 1, 1> (derived IV, 2 uses)
    Uses: [000008] [000027]
    Score: 0
  [2]: long <L00, 12, 2> (derived IV, 1 uses)
    Uses: [000039] (AM)
    Score: -2
  [3]: long <L00, 0, 2> (derived IV, 1 uses)
    Uses: [000037] (AM)
    Score: -2
  [4]: long zext<64>(<L00, 0, 1>) (derived IV, 4 uses)
    Uses: [000131] [000035] [000129] [000130]
    Score: -2
  [5]: byref <L00, (V01.2 + 16), 4> (derived IV, 3 uses)
    Uses: [000053] (AM) [000054] (AM) [000055]
    Score: -2
  [6]: long <L00, 16, 4> (derived IV, 1 uses)
    Uses: [000052] (AM)
    Score: -2
  [7]: long <L00, 0, 4> (derived IV, 1 uses)
    Uses: [000050] (AM)
    Score: -2

and could notice that derived IV [4] will all be replaceable by the new primary IV introduced when widening [0].

@jakobbotsch
Copy link
Member Author

Another relevant example is

private static int Foo(int[] arr, int start, int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++)
    {
        sum += arr[start++];
    }

    return sum;
}

where we similarly end up with a local due to the start++ that blocks things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:2 Work that is important, but not critical for the release
Projects
None yet
Development

No branches or pull requests

1 participant