-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CIR][CIRGen] Improve switch support for unrecheable code #528
base: main
Are you sure you want to change the base?
Conversation
Actually the body of a switch statement can be neither of int f(int x) {
switch (x)
return 1;
return 2;
} This is accepted by clang ( |
8cba98d
to
b5eaa96
Compare
@Lancern Added |
a7b0446
to
5915d80
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for you first patch and for working on this.
Well, things are a little tricky in the case when the body of a void foo(int x) {
switch (x)
while (condition()) {
case 42:
do_something();
}
} The |
It becomes a little complex when we consider The definition assume the size of case attributes is same with regions, unfortunately the region may be nested.
I'm not sure what a reasonable
Otherwise I noticed |
Since |
1e75d7b
to
e19aa5a
Compare
Let me know when this comes out of Draft state and I'll take a look again |
072a50b
to
c9f7e9a
Compare
Appreciate for the suggestions! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, almost there. Few more inline comments.
clang/test/CIR/CodeGen/switch.cpp
Outdated
@@ -275,14 +276,124 @@ void sw12(int a) { | |||
// CHECK-NEXT: cir.break | |||
// CHECK-NEXT: } | |||
|
|||
void fallthrough(int x) { | |||
void sw13(int a) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These testcases are great. Can you add a couple more variations where extra nested switch shows up? Just wanna make sure the lexical scope here is doing the right thing.
c9f7e9a
to
14b2183
Compare
14b2183
to
b7e1c76
Compare
Comments addressed. |
9ddab69
to
ef970b7
Compare
Hi, this pr is ready for review, thanks~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, much easier for me to understand the approach now that the refactoring part is gone. More comments inline.
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
@@ -976,7 +1000,8 @@ mlir::LogicalResult CIRGenFunction::buildSwitchBody( | |||
builder.setInsertionPointToEnd(lastCaseBlock); | |||
res = buildStmt(c, /*useCurrentScope=*/!isa<CompoundStmt>(c)); | |||
} else { | |||
llvm_unreachable("statement doesn't belong to any case region, NYI"); | |||
checkCaseNoneStmt(*c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just call buildStmt()
instead or rewrite more logic in a way that CIRGen just falls out naturally. Like I mentioned in previous reviews, this is dispatching another visitor logic just for the sake of grabbing information that could just be handled in our regular CIRGen visiting path.
Looking at checkCaseNoneStmt
impl, specifically Stmt::CaseStmtClass
/DefaultStmtClass
: you should add a buildCaseStmt
and buildDefaultStmt
and call them from buildSimpleStmt
. That code should already be part of CIRGen emission, and not something that does a side checking. If you need this info, you can walk LexScopes up to find if we are in a switch or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we call buildStmt()
, for ReturnStmt
we need to get currLexScope->RetBlocks
, we expect the block is belong to the case region, but CaseNoneStmt
has no region.
I'll keep working to find a solution. Pushed a wip
commit in case you want to have a look about the draft.
ef970b7
to
5223c3c
Compare
Make logic cleaner and more extensible. Separate collecting `SwitchStmt` information and building op logic into different functions. Add more UT to cover nested switch, which also worked before this pr. This pr is split from #528.
Make logic cleaner and more extensible. Separate collecting `SwitchStmt` information and building op logic into different functions. Add more UT to cover nested switch, which also worked before this pr. This pr is split from #528.
Make logic cleaner and more extensible. Separate collecting `SwitchStmt` information and building op logic into different functions. Add more UT to cover nested switch, which also worked before this pr. This pr is split from #528.
@wenpen still working on this? I don't usually look at draft PRs, just trying to make sure if there's something I should be looking here. |
@bcardosolopes Yes, just be a little busy recently, I will update the pr and request review form you later days, thanks~ |
5223c3c
to
9ae5d1f
Compare
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
// TODO: Rewrite the logic to handle ReturnStmt inside SwitchStmt, then | ||
// clean up the code below. | ||
if (currLexScope->IsInsideCaseNoneStmt) | ||
return mlir::success(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found many sample code that failed due to incorrect terminator in block, e.g.
switch(a) {
case 0:
break;
int x = 1;
}
switch(a) {
case 0:
return 0;
return 1;
int x = 1;
}
for (;;) {
break;
int x = 1;
}
Looks like it's another large work, so I just skip ReturnStmt here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, can you file a new issue and list these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm opposed to return mlir::success();
because it will just silently skips something we don't know how to handle, I rather these things fails or crash, so that it's clear that they aren't implemented? What happens when you remove this return?
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
@@ -328,6 +328,14 @@ mlir::LogicalResult CIRGenFunction::buildLabelStmt(const clang::LabelStmt &S) { | |||
// IsEHa: not implemented. | |||
assert(!(getContext().getLangOpts().EHAsynch && S.isSideEntry())); | |||
|
|||
// TODO: After support case stmt crossing scopes, we should build LabelStmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any TODO
in CIRGen should be TODO(cir)
@@ -2027,6 +2031,8 @@ class CIRGenFunction : public CIRGenTypeCache { | |||
// Scope entry block tracking | |||
mlir::Block *getEntryBlock() { return EntryBlock; } | |||
|
|||
bool IsInsideCaseNoneStmt = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this, reasons below.
// and clean LexicalScope::IsInsideCaseNoneStmt. | ||
for (auto *lexScope = currLexScope; lexScope; | ||
lexScope = lexScope->getParentScope()) { | ||
assert(!lexScope->IsInsideCaseNoneStmt && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if you remove this code? Also, why doesn't it work to just walk the scope up until you find a switch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firstly, we won't need this assert anymore if we could keep the case none stmt somehow as you suggested.
What happens if you remove this code?
Remove this code won't cause incorrect behavior currently (as we didn't support goto in that case yet), but I think it may produce strange error message in the future.
switch (int x) {
foo:
x = 1;
break;
case 2:
goto foo;
}
We need to avoid erasing the CaseNoneStmt
containing label foo
.
why doesn't it work to just walk the scope up until you find a switch?
Refer to the below code, we need to guarantee the removed Stmt
won't contain any LabelStmt
, whether the LabelStmt
is inside another nested switch or not.
switch(x) {
switch(x) {
case 1:
foo:
break;
}
break;
case 1:
goto foo;
}
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
// TODO: Rewrite the logic to handle ReturnStmt inside SwitchStmt, then | ||
// clean up the code below. | ||
if (currLexScope->IsInsideCaseNoneStmt) | ||
return mlir::success(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, can you file a new issue and list these?
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
// TODO: Rewrite the logic to handle ReturnStmt inside SwitchStmt, then | ||
// clean up the code below. | ||
if (currLexScope->IsInsideCaseNoneStmt) | ||
return mlir::success(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm opposed to return mlir::success();
because it will just silently skips something we don't know how to handle, I rather these things fails or crash, so that it's clear that they aren't implemented? What happens when you remove this return?
@@ -704,6 +717,22 @@ CIRGenFunction::buildSwitchCase(const SwitchCase &S, mlir::Type condType, | |||
llvm_unreachable("expect case or default stmt"); | |||
} | |||
|
|||
mlir::LogicalResult CIRGenFunction::buildCaseNoneStmt(const Stmt *S) { | |||
// Create orphan region to skip over the case none stmts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you are creating an orphan region, this mean that anything emitted inside a buildCaseNoneStmt
will never execute, right? The problem if a orphan region is that it won't get attached to anything, so it really adds no value (not even for unrecheable code analysis). If so, better just to split the current basic block A into two: B and C. A should jump to C and you emit the code in B.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find a good place to hold the block of CaseNoneStmt
.
For example
void f(int x) {
switch(x) {
break;
}
}
There is no region inside SwitchOp
, so we have to put the break
block outside SwitchOp
, which cause verification failed: 'cir.break' op must be within a loop or switch
.
Did I misunderstand something? Looking forward to your suggestions~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your point, but if you go for the current approach you might as well skip this codegen entirely, because what you are emitting won't ever be attached to anything. I think it's safer to mimic the original codegen here, what is Clang currently doing for OG codegen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should create a SwitchOp with at least one default region and delete that at the end if it ends up unused?
f726860
to
7a61b3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -704,6 +717,22 @@ CIRGenFunction::buildSwitchCase(const SwitchCase &S, mlir::Type condType, | |||
llvm_unreachable("expect case or default stmt"); | |||
} | |||
|
|||
mlir::LogicalResult CIRGenFunction::buildCaseNoneStmt(const Stmt *S) { | |||
// Create orphan region to skip over the case none stmts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find a good place to hold the block of CaseNoneStmt
.
For example
void f(int x) {
switch(x) {
break;
}
}
There is no region inside SwitchOp
, so we have to put the break
block outside SwitchOp
, which cause verification failed: 'cir.break' op must be within a loop or switch
.
Did I misunderstand something? Looking forward to your suggestions~
// and clean LexicalScope::IsInsideCaseNoneStmt. | ||
for (auto *lexScope = currLexScope; lexScope; | ||
lexScope = lexScope->getParentScope()) { | ||
assert(!lexScope->IsInsideCaseNoneStmt && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Firstly, we won't need this assert anymore if we could keep the case none stmt somehow as you suggested.
What happens if you remove this code?
Remove this code won't cause incorrect behavior currently (as we didn't support goto in that case yet), but I think it may produce strange error message in the future.
switch (int x) {
foo:
x = 1;
break;
case 2:
goto foo;
}
We need to avoid erasing the CaseNoneStmt
containing label foo
.
why doesn't it work to just walk the scope up until you find a switch?
Refer to the below code, we need to guarantee the removed Stmt
won't contain any LabelStmt
, whether the LabelStmt
is inside another nested switch or not.
switch(x) {
switch(x) {
case 1:
foo:
break;
}
break;
case 1:
goto foo;
}
// TODO(cir): Rewrite the logic to handle ReturnStmt inside SwitchStmt, then | ||
// clean up the code below. | ||
if (currLexScope->IsInsideCaseNoneStmt) | ||
return mlir::success(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm opposed to return mlir::success(); because it will just silently skips something we don't know how to handle, I rather these things fails or crash, so that it's clear that they aren't implemented? What happens when you remove this return?
buildReturnStmt()
assume there is exactly one return block in a region, and there is one region in a lexical scope, the only exceptions are switch scope, which has multiple regions. The related code is
mlir::Block *getOrCreateRetBlock(CIRGenFunction &CGF, mlir::Location loc) {
unsigned int regionIdx = 0;
if (isSwitch())
regionIdx = SwitchRegions.size() - 1;
if (regionIdx >= RetBlocks.size())
return createRetBlock(CGF, loc);
return &*RetBlocks.back();
}
So if we remove the return here, the following code will cause crash. regionIdx
will be -1, and we'll call RetBlocks .back()
with empty RetBlocks
int f(int x) {
switch(x) {
return 0;
}
return 1;
}
By the way, I believe the current implementation of getOrCreateRetBlock()
about switch is incorrect and also should be solved after changing definition of SwitchOp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I believe the current implementation of
getOrCreateRetBlock()
about switch is incorrect and also should be solved after changing definition ofSwitchOp
.
Right, we should fix the logic, not take shortcuts like returning mlir::success()
. Can you elaborate on what do you mean by changing the definition of SwitchOp
?
I'm going to resume reviewing this, sorry for the delay! |
@@ -704,6 +717,22 @@ CIRGenFunction::buildSwitchCase(const SwitchCase &S, mlir::Type condType, | |||
llvm_unreachable("expect case or default stmt"); | |||
} | |||
|
|||
mlir::LogicalResult CIRGenFunction::buildCaseNoneStmt(const Stmt *S) { | |||
// Create orphan region to skip over the case none stmts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your point, but if you go for the current approach you might as well skip this codegen entirely, because what you are emitting won't ever be attached to anything. I think it's safer to mimic the original codegen here, what is Clang currently doing for OG codegen?
// TODO(cir): Rewrite the logic to handle ReturnStmt inside SwitchStmt, then | ||
// clean up the code below. | ||
if (currLexScope->IsInsideCaseNoneStmt) | ||
return mlir::success(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I believe the current implementation of
getOrCreateRetBlock()
about switch is incorrect and also should be solved after changing definition ofSwitchOp
.
Right, we should fix the logic, not take shortcuts like returning mlir::success()
. Can you elaborate on what do you mean by changing the definition of SwitchOp
?
@@ -704,6 +717,22 @@ CIRGenFunction::buildSwitchCase(const SwitchCase &S, mlir::Type condType, | |||
llvm_unreachable("expect case or default stmt"); | |||
} | |||
|
|||
mlir::LogicalResult CIRGenFunction::buildCaseNoneStmt(const Stmt *S) { | |||
// Create orphan region to skip over the case none stmts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should create a SwitchOp with at least one default region and delete that at the end if it ends up unused?
Support non-block
case
and statementw that don't belong to anycase
region, fix #520 #521