8326615: C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) #19280

Open, wants to merge 10 commits into base: master

@dafedafe dafedafe commented May 17, 2024

Issue

The test compiler/startup/StartupOutput.java fails intermittently: the VM crashes after correctly printing the error "Initial size of CodeCache is too small" (the test limits the code cache with -XX:InitialCodeCacheSize=1024K -XX:ReservedCodeCacheSize=1200k).
Whether the issue appears depends heavily on thread scheduling. The original report occurred during C1 initialization, but C2 initialization is affected as well.

Causes

There is one occurrence during C1 initialization and one during C2 initialization where a call to RuntimeStub::new_runtime_stub can fail fatally if there is not enough space left.
For C1: Compiler::init_c1_runtime -> Runtime1::initialize -> Runtime1::generate_blob_for -> Runtime1::generate_blob -> RuntimeStub::new_runtime_stub.
For C2: C2Compiler::initialize -> OptoRuntime::generate -> OptoRuntime::generate_stub -> Compile::Compile -> Compile::Code_Gen -> PhaseOutput::install -> PhaseOutput::install_stub -> RuntimeStub::new_runtime_stub.
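
The order dependence can be pictured with a toy model. The sketch below is not HotSpot code: `ToyCodeCache`, `init_c1`, `init_c2`, `run`, and all sizes are hypothetical stand-ins for the code cache and the two initialization paths. It only illustrates that, with a bounded cache shared by both compilers, whichever initializer runs last is the one whose allocation fails.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a bounded code cache (not the HotSpot CodeCache).
struct ToyCodeCache {
  std::size_t capacity;
  std::size_t used = 0;

  explicit ToyCodeCache(std::size_t cap) : capacity(cap) {}

  // Returns false instead of aborting when the request does not fit.
  bool allocate(std::size_t bytes) {
    if (bytes > capacity - used) return false;
    used += bytes;
    return true;
  }
};

// Each initializer needs room for its stubs; the sizes are made up.
bool init_c1(ToyCodeCache& cache) { return cache.allocate(600); }
bool init_c2(ToyCodeCache& cache) { return cache.allocate(700); }

struct InitResult {
  bool c1_ok;
  bool c2_ok;
};

// Runs the two initializers in the given order against a fresh cache.
InitResult run(bool c1_first) {
  ToyCodeCache cache(1000);
  InitResult r{};
  if (c1_first) {
    r.c1_ok = init_c1(cache);
    r.c2_ok = init_c2(cache);
  } else {
    r.c2_ok = init_c2(cache);
    r.c1_ok = init_c1(cache);
  }
  return r;
}
```

In this model `run(true)` leaves C2 without space and `run(false)` leaves C1 without space, which mirrors why the same test can fail in either compiler depending on scheduling.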

Solution

#15970 introduced an optional argument to RuntimeStub::new_runtime_stub that determines whether an allocation failure is fatal. We can take advantage of it to avoid crashing and instead pass the success or failure of the allocation up the C1 and C2 initialization call stacks, to the point where we can mark the compilations as failed.
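
The propagation pattern described above can be sketched as follows. This is a minimal, self-contained model under stated assumptions, not the HotSpot sources: `Stub`, `new_stub`, `generate_blob`, `initialize_runtime`, `ToyCompiler`, `CompilerState`, and the flag name `alloc_fail_is_fatal` are hypothetical stand-ins for `RuntimeStub::new_runtime_stub` and its callers.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <memory>
#include <utility>
#include <vector>

struct Stub {
  int id;
};

// Hypothetical allocator; `space_left` models the remaining code cache space.
// With alloc_fail_is_fatal == true it aborts, otherwise it returns nullptr.
std::unique_ptr<Stub> new_stub(int id, int& space_left,
                               bool alloc_fail_is_fatal = true) {
  if (space_left <= 0) {
    if (alloc_fail_is_fatal) {
      std::fprintf(stderr, "fatal: no space left for stub %d\n", id);
      std::abort();
    }
    return nullptr;
  }
  --space_left;
  return std::make_unique<Stub>(Stub{id});
}

// Middle of the call chain: reports failure to its caller instead of crashing.
bool generate_blob(int id, int& space_left,
                   std::vector<std::unique_ptr<Stub>>& out) {
  std::unique_ptr<Stub> stub =
      new_stub(id, space_left, /*alloc_fail_is_fatal=*/false);
  if (stub == nullptr) return false;
  out.push_back(std::move(stub));
  return true;
}

// Stops at the first failed allocation and passes the result further up.
bool initialize_runtime(int stub_count, int& space_left,
                        std::vector<std::unique_ptr<Stub>>& out) {
  for (int id = 0; id < stub_count; ++id) {
    if (!generate_blob(id, space_left, out)) return false;
  }
  return true;
}

enum class CompilerState { uninitialized, initialized, failed };

// Top of the call chain: the only place that decides what a failure means.
struct ToyCompiler {
  CompilerState state = CompilerState::uninitialized;
  std::vector<std::unique_ptr<Stub>> stubs;

  void initialize(int stub_count, int space_left) {
    state = initialize_runtime(stub_count, space_left, stubs)
                ? CompilerState::initialized
                : CompilerState::failed;
  }
};
```

Keeping the default of the flag at `true` means existing callers keep their fatal behavior, and only the initialization paths that can recover opt in to the non-fatal variant.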


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8326615: C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash) (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19280/head:pull/19280
$ git checkout pull/19280

Update a local copy of the PR:
$ git checkout pull/19280
$ git pull https://git.openjdk.org/jdk.git pull/19280/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19280

View PR using the GUI difftool:
$ git pr show -t 19280

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19280.diff

bridgekeeper bot commented May 17, 2024

👋 Welcome back dfenacci! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented May 17, 2024

@dafedafe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8326615: C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash)

Reviewed-by: thartmann

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 298 new commits pushed to the master branch:

  • 2a37764: 8333743: Change .jcheck/conf branches property to match valid branches
  • 75dc2f8: 8330182: Start of release updates for JDK 24
  • 054362a: 8332550: [macos] Voice Over: java.awt.IllegalComponentStateException: component must be showing on the screen to determine its location
  • 9b436d0: 8333674: Disable CollectorPolicy.young_min_ergo_vm for PPC64
  • 487c477: 8333647: C2 SuperWord: some additional PopulateIndex tests
  • d02cb74: 8333270: HandlersOnComplexResetUpdate and HandlersOnComplexUpdate tests fail with "Unexpected reference" if timeoutFactor is less than 1/3
  • 02f2404: 8333560: -Xlint:restricted does not work with --release
  • 606df44: 8332670: C1 clone intrinsic needs memory barriers
  • 33fd6ae: 8333622: ubsan: relocInfo_x86.cpp:101:56: runtime error: pointer index expression with base (-1) overflowed
  • 8de5d20: 8332865: ubsan: os::attempt_reserve_memory_between reports overflow
  • ... and 288 more: https://git.openjdk.org/jdk/compare/f1ce9b0ecce9b506f5bf7a66fcf03c93b9ae8fed...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title from "JDK-8326615: C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash)" to "8326615: C1/C2 don't handle allocation failure properly during initialization (RuntimeStub::new_runtime_stub fatal crash)" May 17, 2024
openjdk bot commented May 17, 2024

@dafedafe this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8326615
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict label (Pull request has merge conflict with target branch) May 17, 2024
openjdk bot commented May 17, 2024

@dafedafe The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler label (hotspot-compiler-dev@openjdk.org) May 17, 2024
@openjdk openjdk bot removed the merge-conflict label (Pull request has merge conflict with target branch) May 17, 2024
@dafedafe dafedafe marked this pull request as ready for review May 21, 2024 06:45
@openjdk openjdk bot added the rfr label (Pull request is ready for review) May 21, 2024
mlbridge bot commented May 21, 2024

Webrevs

@TobiHartmann TobiHartmann left a comment

This is not a regression in JDK 23, right? Could you please adjust the affects versions in JIRA accordingly?

Looks good to me otherwise.

src/hotspot/share/c1/c1_Compiler.cpp (outdated, resolved)
src/hotspot/share/c1/c1_Runtime1.cpp (outdated, resolved)
@openjdk openjdk bot added the ready label (Pull request is ready to be integrated) May 21, 2024
dafedafe and others added 2 commits May 21, 2024 11:17
Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
@dafedafe
Contributor Author

> This is not a regression in JDK 23, right? Could you please adjust the affects versions in JIRA accordingly?

It is not. Fixing the version.
Thanks a lot for reviewing @TobiHartmann.

@TobiHartmann TobiHartmann left a comment

Thanks!

@@ -280,6 +285,7 @@ void Runtime1::initialize(BufferBlob* blob) {
#endif
BarrierSetC1* bs = BarrierSet::barrier_set()->barrier_set_c1();
bs->generate_c1_runtime_stubs(blob);
Member

Don't we need to handle failures in generate_c1_runtime_stubs? With the assert removed, I think we'll get a nullptr crash.

Contributor Author

Yep, you're right. Actually that call could potentially fail too.
I've added code to handle that case as well.
Thanks @dean-long!
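
One way to picture the fix discussed in this thread is a generator hook that reports failure, with the caller checking the result before continuing. The sketch below is a hypothetical model: `ToyBlob`, `ToyBarrierSetC1`, `TwoStubBarrierSetC1`, `toy_runtime1_initialize`, and the `bool` return type are assumptions made for illustration and may not match the actual change.

```cpp
#include <cassert>

// Hypothetical stand-in for the buffer the stubs are generated into.
struct ToyBlob {
  int free_slots;
};

// Hypothetical base class; by default there are no GC-specific stubs.
class ToyBarrierSetC1 {
 public:
  virtual ~ToyBarrierSetC1() = default;

  // Returns false if a stub could not be allocated.
  virtual bool generate_c1_runtime_stubs(ToyBlob& blob) {
    (void)blob;
    return true;
  }
};

// A made-up collector that needs two stubs.
class TwoStubBarrierSetC1 : public ToyBarrierSetC1 {
 public:
  const char* load_stub = nullptr;
  const char* store_stub = nullptr;

  bool generate_c1_runtime_stubs(ToyBlob& blob) override {
    load_stub = allocate(blob, "load_barrier");
    if (load_stub == nullptr) return false;  // stop before using a null stub
    store_stub = allocate(blob, "store_barrier");
    return store_stub != nullptr;
  }

 private:
  // Returns nullptr when the blob has no room left.
  static const char* allocate(ToyBlob& blob, const char* name) {
    if (blob.free_slots == 0) return nullptr;
    --blob.free_slots;
    return name;
  }
};

// Caller checks the hook's result instead of assuming success.
bool toy_runtime1_initialize(ToyBarrierSetC1& bs, ToyBlob& blob) {
  if (!bs.generate_c1_runtime_stubs(blob)) return false;
  return true;
}
```

With an assert-only check, a failed allocation in a product build would leave a null stub pointer behind; returning the result lets the caller stop initialization cleanly.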

@TobiHartmann TobiHartmann left a comment

Looks good to me otherwise.

src/hotspot/share/gc/x/c1/xBarrierSetC1.cpp (outdated, resolved)
src/hotspot/share/gc/z/c1/zBarrierSetC1.cpp (outdated, resolved)
src/hotspot/share/gc/z/c1/zBarrierSetC1.cpp (outdated, resolved)
dafedafe and others added 3 commits May 28, 2024 09:11
Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
Co-authored-by: Tobias Hartmann <tobias.hartmann@oracle.com>
@dafedafe
Contributor Author

> Looks good to me otherwise.

Thanks for the review @TobiHartmann!

@dean-long
Member

This looks OK, but isn't it a lot of changes just to get this test to pass? Aren't all of these allocation failures ultimately fatal? Is there a simpler way to handle this problem?

dafedafe commented Jun 4, 2024

> This looks OK, but isn't it a lot of changes just to get this test to pass? Aren't all of these allocation failures ultimately fatal? Is there a simpler way to handle this problem?

It does seem like a lot, but I think there is always the possibility of not failing and only disabling the compiler instead. @dean-long, do you think the VM would fail later on anyway?

@dean-long
Member

It may not fail, but if it can't create a C1 or C2 compiler, then that's bad, and we might argue that this kind of failure should be treated like a failure during JVM startup. In fact, I have been thinking that there are reasons why we might want these compiler stubs to be created earlier in startup, when we are still single-threaded. That would get rid of any races, and make a failure fatal. It would also allow us to put these stubs, if they are effectively execute-only because they don't need to be patched, into a special JIT region, avoiding the MAP_JIT overhead on macos-aarch64 (see JDK-8331978).

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org ready Pull request is ready to be integrated rfr Pull request is ready for review
3 participants