Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'make test' fails for julia-1.11.0-beta1 #54280

Open
sergstesh opened this issue Apr 26, 2024 · 6 comments
Open

'make test' fails for julia-1.11.0-beta1 #54280

sergstesh opened this issue Apr 26, 2024 · 6 comments
Labels
test This change adds or pertains to unit tests

Comments

@sergstesh
Copy link

This report is a logical continuation of #54234 . While dealing with #54234 I was requested (see #54234 (comment) ) to run 'make test' on julia-1.11, which I did (see #54234 (comment) ).

So, 'make test' fails for julia-1.11.0-beta1, though differently from how it fails for julia-1.10.2 .

I decided to try two more times to run 'make test' on julia-1.11.0-beta1, and to make life harder I decided to run the two 'make test' tries in parallel. For this I created two directories: /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/1 , /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/2 . After that I unpacked the https://github.com/JuliaLang/julia/releases/download/v1.11.0-beta1/julia-1.11.0-beta1-full.tar.gz source tarball in each of the two directories, and then in each of the two /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/1/julia-1.11.0-beta1 , /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/2/julia-1.11.0-beta1 directories I invoked the 'nice -n 19 make test 1>make_test.log 2>&1' command line. The command lines were invoked almost simultaneously - the difference in time was a couple of seconds.

After completion of 'make test' I created a symlink in the first directory: ln -s make_test.log make_test.1.log and in the second directory: ln -s make_test.log make_test.2.log . Now the two log files have different names and I'm uploading them.

make_test.1.log
make_test.2.log
.

The following has to be noted:

  1. execution time of the two 'make test' is significantly different;
  2. the two 'make test' fail in different tests.

Also it has to be noted that after completion of the two 'make test' there is a dangling Julia process:

sergei@slaptop:~$ ps auwx | grep julia | grep media
sergei    119546 17.8  0.0 5585212 3964 ?        SNsl Apr26  90:14 /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/1/julia-1.11.0-beta1/usr/bin/julia -C native -J/media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/1/julia-1.11.0-beta1/usr/lib/julia/sys.so --depwarn=error --check-bounds=yes -g1 --startup-file=no -C native -J/media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/1/julia-1.11.0-beta1/usr/lib/julia/sys.so -g1 --startup-file=no --check-bounds=yes --startup-file=no --depwarn=error --heap-size-hint=3148M --bind-to 127.0.0.1 --worker
sergei@slaptop:~$ 

.

@fatteneder fatteneder added the test This change adds or pertains to unit tests label Apr 27, 2024
@fatteneder
Copy link
Contributor

IIUC the logs correctly, then in make_log.2.log the loading.jl tests all passed, because there appear two loading.jl entries in that file, once with started and once with elapsed time.
However, loading.jl fails in make_log.1.log (at the relocatable upgrades #51989 test, also failed here #54234 (comment)) , so there seems to be some randomness involved, but nothing obvious as to why that particular test should fail.

@sergstesh
Copy link
Author

there seems to be some randomness involved

A typical phenomenon in multi-threaded environment. In the best case it's too short timeout values, otherwise algorithmic error.

...

And don't miss the dangling Julia process - looks like the kill signal hasn't been properly sent.

...

If one decides to reproduce the tests, make sure there are, say, 40GB virtual memory available (my swap file is 64GB, I saw more than 32GB swap used during the tests execution, but I have a web browser with a lot of windows/tbs open, and it consumes, say, 10GB.

...

Regarding test failures - pay special attention to

    858 loading                                          (3) |         failed at 2024-04-26T23:13:41.669
    859 Test Failed at /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/1/julia-1.11.0-beta1/test/loading.jl:1365
    860   Expression: success(addenv(`$(Base.julia_cmd()) --project=$project_path --startup-file=no -e 'import Pkg; Pkg.develop("Foo51989"); Pkg.precompile(); exit(0)'`, "JULIA_DEPOT_PATH" =>    860  depot))

in 'make_test.1.log' and to

   1002 cmdlineargs                                      (5) |         failed at 2024-04-27T00:19:48.212
   1003 Test Failed at /media/sergei/4c7aa17d-44cf-423c-b211-ce583883925c/home/sergei/2/julia-1.11.0-beta1/test/cmdlineargs.jl:1150
   1004   Expression: success(pipeline(setenv(`$(Base.julia_cmd()) --bug-report=rr-local -e 'exit()'`, "JULIA_RR_RECORD_ARGS" => "-n --nested=ignore", "_RR_TRACE_DIR" => temp_trace_dir); ))

,
i.e. in 'make_test.1.log' the failing test is 'loading.jl' and in 'make_test.2.log' the failing test is 'cmdlineargs.jl' - if I understand the diagnostic messages correctly.

@fatteneder
Copy link
Contributor

Yes, I noticed that too that both failures happen when the tests are hanging inside a separately spawned Julia process, here and here.

@fatteneder
Copy link
Contributor

fatteneder commented Apr 27, 2024

When I just run this part here locally then it also takes ~2 min. The reason it takes so long is that that test sets up an empty depot without any caches, and so I think the spawned julia process then first has to re-compile all of Pkg.jl and its dependencies before the actual script for that instance can run.

If that's true, then I guess that's fixable.
But I am curious as to why that problem doesn't show in CI. Maybe the CI machines are just beefier?

@sergstesh
Copy link
Author

Maybe the CI machines are just beefier?

What is "CI" ?

@fatteneder
Copy link
Contributor

What is "CI" ?

continuous integration, the automated testing pipeline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test This change adds or pertains to unit tests
Projects
None yet
Development

No branches or pull requests

2 participants