Build fails on aarch64 hosts #747

Open
msgilligan opened this issue May 8, 2024 · 18 comments

Comments

@msgilligan
Contributor

msgilligan commented May 8, 2024

Update: The x86_64 build issue turned out to be non-reproducible, so this issue has been renamed. The actual problem I have been troubleshooting is building on aarch64 hosts and there are at least 3 sub-issues that have been discovered. This issue now serves as a parent issue for aarch64 host build issues.

On Debian 12 (x86_64), I'm following the standard build instructions for QEMU v8, e.g.

$ mkdir optee
$ cd optee
$ repo init -u https://github.com/OP-TEE/manifest.git -m qemu_v8.xml -b 4.2.0
$ repo sync
$ cd build
$ make toolchains
$ make run

The error I'm getting is:

  CC      lib/display_options.o
lib/display_options.c: In function ‘print_freq’:
lib/display_options.c:59:9: internal compiler error: Illegal instruction
   59 |         unsigned long d = 1e9;
      |         ^~~~~~~~
0x1739600 diagnostic_impl(rich_location*, diagnostic_metadata const*, int, char const*, __va_list_tag (*) [1], diagnostic_t)
	???:0
0x173a286 internal_error(char const*, ...)
	???:0
0xc26e2f crash_signal(int)
	???:0
0x193cec9 __gmpn_mul_basecase
	???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://bugs.linaro.org/> for instructions.
make[2]: *** [scripts/Makefile.build:257: lib/display_options.o] Error 1
make[1]: *** [Makefile:1853: lib] Error 2
make[1]: Leaving directory '/home/sean/optee/u-boot'
make: *** [Makefile:305: u-boot] Error 2
@msgilligan changed the title from "Build fails on Debian 12 (4.2.0)" to "Build fails on Ubuntu 22.04 & Debian 12 (4.2.0)" on May 9, 2024
@msgilligan
Contributor Author

msgilligan commented May 9, 2024

Update: I tried the build with Ubuntu 22.04.4 LTS (Jammy Jellyfish) and am getting the same error. (I changed the title of the issue to reflect this.)

@jforissier
Contributor

Hi @msgilligan,

The build works for me with Ubuntu 22.04 (22.04.4 LTS) using the Dockerfile at https://optee.readthedocs.io/en/latest/building/prerequisites.html. It looks like, for some reason, your build is not picking up the proper cross-compiler. Could you try building with make V=1 and check the CC command used to build lib/display_options.o?
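
One way to pull the relevant compiler invocation out of a verbose log is sketched below. The helper name `find_cc_command` and the log file name `build.log` are made up for illustration; only `make V=1` and the object path come from this thread.

```shell
# Print the compiler invocation(s) that produce a given object file,
# reading a verbose (V=1) build log from standard input.
#
# Hypothetical usage, after capturing the log with
#   make V=1 2>&1 | tee build.log
# run:
#   find_cc_command lib/display_options.o < build.log
find_cc_command() {
    # Match the literal " -o <object> " argument pair. -F turns off regex
    # interpretation so the dots in the file name are taken literally.
    grep -F -- " -o $1 "
}
```

The first word or two of the printed line (after any `ccache` wrapper) is the compiler binary that was actually used.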

@msgilligan
Contributor Author

Hi @msgilligan,

The build works for me with Ubuntu 22.04 (22.04.4 LTS) using the Dockerfile at https://optee.readthedocs.io/en/latest/building/prerequisites.html. It looks like, for some reason, your build is not picking up the proper cross-compiler. Could you try building with make V=1 and check the CC command used to build lib/display_options.o?

Yes, when I run the same Dockerfile it works. I'm not sure what the difference is between Ubuntu running in a QEMU VM (using Lima on macOS) and the one in Docker.

I will try your suggestion and report back.

@msgilligan
Contributor Author

Here is the result:

make -f ./scripts/Makefile.build obj=lib/zlib
  /usr/bin/ccache /home/sean.linux/optee-qemu8/build/../toolchains/aarch64/bin/aarch64-linux-gnu-gcc -Wp,-MD,lib/.display_options.o.d -nostdinc -isystem /home/sean.linux/optee-qemu8/toolchains/aarch64/bin/../lib/gcc/aarch64-none-linux-gnu/11.3.1/include -Iinclude   -I./arch/arm/include -include ./include/linux/kconfig.h -D__KERNEL__ -D__UBOOT__ -Wall -Wstrict-prototypes -Wno-format-security -fno-builtin -ffreestanding -std=gnu11 -fshort-wchar -fno-strict-aliasing -fno-PIE -Os -fno-stack-protector -fno-delete-null-pointer-checks -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds -Wno-array-bounds -Wno-stringop-overflow -Wno-maybe-uninitialized -fmacro-prefix-map=./= -gdwarf-4 -fstack-usage -Wno-format-nonliteral -Wno-address-of-packed-member -Wno-unused-but-set-variable -Werror=date-time -Wno-packed-not-aligned -D__ARM__ -fno-pic -mstrict-align -ffunction-sections -fdata-sections -fno-common -ffixed-x18 -mgeneral-regs-only -mbranch-protection=none -pipe -march=armv8-a+crc -D__LINUX_ARM_ARCH__=8    -DKBUILD_BASENAME='"display_options"'  -DKBUILD_MODNAME='"display_options"' -c -o lib/display_options.o lib/display_options.c
lib/display_options.c: In function ‘print_freq’:
lib/display_options.c:59:9: internal compiler error: Illegal instruction
   59 |         unsigned long d = 1e9;
      |         ^~~~~~~~
0x1739600 diagnostic_impl(rich_location*, diagnostic_metadata const*, int, char const*, __va_list_tag (*) [1], diagnostic_t)
	???:0
0x173a286 internal_error(char const*, ...)
	???:0
0xc26e2f crash_signal(int)
	???:0
0x193cec9 __gmpn_mul_basecase
	???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://bugs.linaro.org/> for instructions.
make[2]: *** [scripts/Makefile.build:257: lib/display_options.o] Error 1
make[1]: *** [Makefile:1853: lib] Error 2
make[1]: Leaving directory '/home/sean.linux/optee-qemu8/u-boot'
make: *** [Makefile:305: u-boot] Error 2

@msgilligan
Contributor Author

I was able to successfully build the project in another Debian 12 environment and I compared the CC commands and they are the same. I'm working to narrow down other potential differences in the environment.
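
For anyone repeating this comparison, a rough sketch of diffing two captured CC command lines argument by argument follows. The function name `args_only_in_first` is hypothetical, and splitting on whitespace is a simplification that would mis-handle quoted arguments containing spaces.

```shell
# Print the arguments that appear in the first command line but not in
# the second, one per line.
args_only_in_first() (
    set -f                        # no glob expansion while splitting
    tmp=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmp"' EXIT     # clean up when this subshell exits
    # Intentionally unquoted: let the shell split on whitespace.
    printf '%s\n' $1 | sort -u > "$tmp/a"
    printf '%s\n' $2 | sort -u > "$tmp/b"
    # comm -23 keeps only the lines unique to the first sorted file.
    comm -23 "$tmp/a" "$tmp/b"
)
```

Running it once in each direction shows flags unique to either environment; empty output both ways means the argument sets match, although the order of arguments is not compared.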

Of course, the code in display_options.c should probably be changed to unsigned long d = 1000000000; or maybe use an explicit cast.

@msgilligan
Contributor Author

msgilligan commented May 13, 2024

@jforissier My goal is to be able to develop for OP-TEE on my M1 MacBook Pro. I have Docker installed and use Lima for long-lived development VMs. I prefer to use Debian (Bookworm) over Ubuntu, but am willing to use Ubuntu if necessary. However, I do not want to use amd64/x86_64 emulation (builds take many hours); I want to run native arm64 VMs. I am willing to troubleshoot, document, and submit PRs to help make this happen. I can also move this to the mailing list, if that is a better venue.

I have verified that the reference Dockerfile works under Docker on my Mac when I run it with --platform linux/amd64, but without that setting under arm64 mode it fails. If I run the arm64 container with -j1 V=1 I get the following error message:

# Ensure the toolchain, components, and targets we've specified in
# rust-toolchain.toml are ready to go. Since that file sets rustup's
# default toolchain for the entire directory, all we need to do is run
# any rustup-wrapped command to trigger installation. We've arbitrarily
# chosen "cargo --version" since it has no other effect.
Configuring OP-TEE rust examples
/bin/bash: line 1: /.cargo/env: No such file or directory
make[2]: *** [package/pkg-generic.mk:273: /optee/out-br/build/optee_rust_examples_ext-1.0/.stamp_configured] Error 1
make[1]: *** [Makefile:23: _all] Error 2
make[1]: Leaving directory '/optee/out-br'
make: *** [common.mk:341: buildroot] Error 2
The command '/bin/sh -c make -j1 V=1' returned a non-zero code: 2

Although this is different from the symptom I initially reported, it seems like the first place to start, because in the case of a Docker build the environment is more carefully controlled and I'm seeing a difference in behavior based (apparently) solely on the CPU architecture.
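
To make the architecture choice explicit when reproducing the two Docker results above, something like the following could be used. This is only a sketch: the helper name `docker_platform_for` is invented here, while `--platform linux/amd64` is the setting mentioned above.

```shell
# Map a machine name (as printed by `uname -m`) to the matching Docker
# platform string. Unknown machines yield a non-zero exit status.
docker_platform_for() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             return 1 ;;
    esac
}

# Examples (not executed here):
#   emulated build, reported to work in this thread:
#     docker build --platform "$(docker_platform_for x86_64)" .
#   build for whatever the host is:
#     docker build --platform "$(docker_platform_for "$(uname -m)")" .
```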

I've also seen this problem (or similar) in builds inside VMs and on my Intel box. Under Docker the build seems to occur in the root directory, so /.cargo/env is a valid path (on amd64, at least).

From my perspective, it seems like there are at least two issues/todos to work on:

  1. Identify and fix the reason for the arm64 Docker build failure.
  2. Provide and/or (better) document a mechanism to configure the Rust toolchain path. (I've noticed a variable named BR2_PACKAGE_OPTEE_RUST_EXAMPLES_EXT_TC_PATH that can be used, but it's not documented on the website, e.g. on https://optee.readthedocs.io/en/latest/building/optee_with_rust.html.)
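
While investigating the second item, a quick pre-flight check for the Cargo environment file may help narrow things down. The sketch below assumes the file lives at `<toolchain dir>/.cargo/env`; that layout is a guess based on the `/.cargo/env` path in the error above, and the function name is hypothetical.

```shell
# Report whether <dir>/.cargo/env exists, where <dir> is the directory
# the build is expected to use as the Rust toolchain location.
# ASSUMPTION: the env file sits directly under <dir>/.cargo/.
cargo_env_status() {
    env_file="$1/.cargo/env"
    if [ -f "$env_file" ]; then
        echo "ok: $env_file"
    else
        echo "missing: $env_file"
        return 1
    fi
}
```

Calling it with the value that ends up in BR2_PACKAGE_OPTEE_RUST_EXAMPLES_EXT_TC_PATH on both an amd64 and an arm64 build would show whether the file is absent only on arm64.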

On a separate (and positive!) note, when I do run the Docker build using amd64 and use Docker to copy /optee/out/bin/* out to a macOS directory, I can boot the Linux guest using a macOS build of QEMU and run xtest successfully.

I should also mention that when I did preliminary investigations in October 2023, I was able to do the full build in an arm64 VM on macOS and run xtest in a QEMU guest under macOS.

@msgilligan
Contributor Author

p.s. Using RUN make V=1 OPTEE_RUST_ENABLE=n does not seem to disable the use of Rust.

@jforissier
Contributor

jforissier commented May 14, 2024

Hi @msgilligan,

Being able to build natively on arm64 is a good thing obviously, and as you said it used to work. It should not be that hard to fix the issues and I appreciate your willingness to help in that regard.
Rust support has been enabled by default recently but was never tested on arm64, I believe :-/

p.s. Using RUN make V=1 OPTEE_RUST_ENABLE=n does not seem to disable the use of Rust.

That should be RUST_ENABLE=n, see https://github.com/OP-TEE/build/blob/4.2.0/qemu_v8.mk#L32-L33.
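
Since make accepts any VAR=value on the command line without complaint, a misspelled variable simply has no effect. A small sanity check, sketched here with a hypothetical helper name, is to confirm that the name is referenced in the makefiles before relying on it:

```shell
# Succeed if NAME occurs as a whole word in any of the given files.
# -w keeps RUST_ENABLE from matching inside OPTEE_RUST_ENABLE, because
# the underscore counts as a word character.
make_var_is_referenced() {
    name=$1
    shift
    grep -qw -- "$name" "$@"
}

# Example (not executed here), from the build directory:
#   make_var_is_referenced OPTEE_RUST_ENABLE *.mk || echo "not used by these makefiles"
```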

I've noticed a variable named BR2_PACKAGE_OPTEE_RUST_EXAMPLES_EXT_TC_PATH

Yes, it was introduced by f0a2eef. Basically, it takes its value from RUST_TOOLCHAIN_PATH in common.mk, and RUST_TOOLCHAIN_PATH is set in toolchain.mk. make toolchains is supposed to install what's needed.

build$ git grep RUST_TOOLCHAIN_PATH
common.mk:BR2_PACKAGE_OPTEE_RUST_EXAMPLES_EXT_TC_PATH ?= $(RUST_TOOLCHAIN_PATH)
toolchain.mk:RUST_TOOLCHAIN_PATH                ?= $(TOOLCHAIN_ROOT)/rust
toolchain.mk:   $(call dl-rust-toolchain,$(RUST_TOOLCHAIN_PATH))

Does make toolchains work on your platform?

@msgilligan
Contributor Author

OPTEE_RUST_ENABLE

That should be RUST_ENABLE=n

This documentation page shows OPTEE_RUST_ENABLE :
https://github.com/OP-TEE/optee_docs/blob/9841a9fc14ed49b084bed4b6fd8af417a45cbf24/building/optee_with_rust.rst#L41
That's why I was trying that name.

Does make toolchains work on your platform?

Yes.

If I set RUST_ENABLE=n and run the Docker build on arm64, I get:

/usr/bin/ccache /optee/build/../toolchains/aarch64/bin/aarch64-linux-ld.bfd -e__ta_entry -pie -T out/ta/os_test/ta.lds -Map=out/ta/os_test/5b9e0e40-2636-11e1-ad9e-0002a5d5c51b.map --sort-section=alignment -z max-page-size=4096  --as-needed   --dynamic-list out/ta/os_test/dyn_list --eh-frame-hdr out/ta/os_test/init.o out/ta/os_test/os_test.o out/ta/os_test/ta_entry.o out/ta/os_test/test_float_subj.o out/ta/os_test/cxx_tests.o out/ta/os_test/cxx_tests_c.o out/ta/os_test/attestation.o out/ta/os_test/user_ta_header.o -L/optee/out-br/build/optee_test_ext-1.0/ta/os_test_lib/out/ta/os_test_lib -los_test -ldl -L/optee/optee_os/out/arm/export-ta_arm64/lib --start-group -lutils -lutee -lmbedtls -ldl /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/../../../../aarch64-buildroot-linux-gnu/lib/../lib64/libstdc++.a /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/libgcc_eh.a --end-group /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/libgcc.a -lutils -o out/ta/os_test/5b9e0e40-2636-11e1-ad9e-0002a5d5c51b.elf
/optee/build/../toolchains/aarch64/bin/aarch64-linux-ld.bfd: /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/libgcc_eh.a(unwind-dw2-fde-dip.o): in function `_Unwind_Find_FDE':
/optee/out-aarch64-sdk/build/host-gcc-final-12.3.0/build/aarch64-buildroot-linux-gnu/libgcc/../../../libgcc/unwind-dw2-fde-dip.c:512: undefined reference to `_dl_find_object'
make[4]: *** [/optee/optee_os/out/arm/export-ta_arm64/mk/link.mk:123: out/ta/os_test/5b9e0e40-2636-11e1-ad9e-0002a5d5c51b.elf] Error 1
make[3]: *** [/optee/out-br/build/optee_test_ext-1.0/ta/Makefile.gmake:61: ta-os_test] Error 2
make[2]: *** [package/pkg-generic.mk:284: /optee/out-br/build/optee_test_ext-1.0/.stamp_built] Error 2
make[1]: *** [Makefile:23: _all] Error 2
make[1]: Leaving directory '/optee/out-br'
make: *** [common.mk:341: buildroot] Error 2
The command '/bin/sh -c make V=1 RUST_ENABLE=n' returned a non-zero code: 2

@jforissier
Contributor

OPTEE_RUST_ENABLE

That should be RUST_ENABLE=n

This documentation page shows OPTEE_RUST_ENABLE : https://github.com/OP-TEE/optee_docs/blob/9841a9fc14ed49b084bed4b6fd8af417a45cbf24/building/optee_with_rust.rst#L41 That's why I was trying that name.

Ah, yes. The doc needs updating.

Does make toolchains work on your platform?

Yes.

If I set RUST_ENABLE=n and run the Docker build on arm64, I get:

/usr/bin/ccache /optee/build/../toolchains/aarch64/bin/aarch64-linux-ld.bfd -e__ta_entry -pie -T out/ta/os_test/ta.lds -Map=out/ta/os_test/5b9e0e40-2636-11e1-ad9e-0002a5d5c51b.map --sort-section=alignment -z max-page-size=4096  --as-needed   --dynamic-list out/ta/os_test/dyn_list --eh-frame-hdr out/ta/os_test/init.o out/ta/os_test/os_test.o out/ta/os_test/ta_entry.o out/ta/os_test/test_float_subj.o out/ta/os_test/cxx_tests.o out/ta/os_test/cxx_tests_c.o out/ta/os_test/attestation.o out/ta/os_test/user_ta_header.o -L/optee/out-br/build/optee_test_ext-1.0/ta/os_test_lib/out/ta/os_test_lib -los_test -ldl -L/optee/optee_os/out/arm/export-ta_arm64/lib --start-group -lutils -lutee -lmbedtls -ldl /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/../../../../aarch64-buildroot-linux-gnu/lib/../lib64/libstdc++.a /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/libgcc_eh.a --end-group /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/libgcc.a -lutils -o out/ta/os_test/5b9e0e40-2636-11e1-ad9e-0002a5d5c51b.elf
/optee/build/../toolchains/aarch64/bin/aarch64-linux-ld.bfd: /optee/toolchains/aarch64/bin/../lib/gcc/aarch64-buildroot-linux-gnu/12.3.0/libgcc_eh.a(unwind-dw2-fde-dip.o): in function `_Unwind_Find_FDE':
/optee/out-aarch64-sdk/build/host-gcc-final-12.3.0/build/aarch64-buildroot-linux-gnu/libgcc/../../../libgcc/unwind-dw2-fde-dip.c:512: undefined reference to `_dl_find_object'

C++ support in TAs does not work with GCC 12.3. "make toolchains" is supposed to install 11.3, how does your setup end up pulling something from host-gcc-final-12.3.0?
Try adding WITH_CXX_TESTS=n to the make command.
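
The GCC version in play can be read straight off the toolchain paths in the link command. A minimal sketch, with hypothetical function names, that extracts it and checks for the 11.3 series that "make toolchains" is expected to install:

```shell
# Extract the GCC version from a toolchain path of the form
#   .../lib/gcc/<target-triplet>/<version>/...
gcc_version_from_path() {
    printf '%s\n' "$1" |
        sed -n 's|.*/lib/gcc/[^/]*/\([0-9][0-9.]*\)/.*|\1|p'
}

# Succeed only for the 11.3 series (for example 11.3.1).
is_gcc_11_3() {
    case "$1" in
        11.3|11.3.*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

Applied to the libgcc_eh.a path in the log above, this yields 12.3.0, which is outside the expected series.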

make[4]: *** [/optee/optee_os/out/arm/export-ta_arm64/mk/link.mk:123: out/ta/os_test/5b9e0e40-2636-11e1-ad9e-0002a5d5c51b.elf] Error 1
make[3]: *** [/optee/out-br/build/optee_test_ext-1.0/ta/Makefile.gmake:61: ta-os_test] Error 2
make[2]: *** [package/pkg-generic.mk:284: /optee/out-br/build/optee_test_ext-1.0/.stamp_built] Error 2
make[1]: *** [Makefile:23: _all] Error 2
make[1]: Leaving directory '/optee/out-br'
make: *** [common.mk:341: buildroot] Error 2
The command '/bin/sh -c make V=1 RUST_ENABLE=n' returned a non-zero code: 2

@jforissier
Contributor

OPTEE_RUST_ENABLE

That should be RUST_ENABLE=n

This documentation page shows OPTEE_RUST_ENABLE : https://github.com/OP-TEE/optee_docs/blob/9841a9fc14ed49b084bed4b6fd8af417a45cbf24/building/optee_with_rust.rst#L41 That's why I was trying that name.

Ah, yes. The doc needs updating.

OP-TEE/optee_docs#240

@msgilligan
Contributor Author

msgilligan commented May 14, 2024

C++ support in TAs does not work with GCC 12.3. "make toolchains" is supposed to install 11.3, how does your setup end up pulling something from host-gcc-final-12.3.0?

I'm not sure. The only difference in my setup (I'm running a slightly modified version of the example Dockerfile) should be that I'm on arm64.

Try adding WITH_CXX_TESTS=n to the make command.

Will do.

By the way, with the following make line the build completes successfully:

RUN make -j$(nproc) RUST_ENABLE=n arm-tf optee-os qemu

(So disabling Rust and removing the buildroot and linux targets)

@msgilligan
Contributor Author

msgilligan commented May 14, 2024

This works as well:

RUN make -j$(nproc) RUST_ENABLE=n WITH_CXX_TESTS=n

and so does:

RUN make -j$(nproc) RUST_ENABLE=n WITH_CXX_TESTS=n check
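
For a script that should run on both architectures, the workaround flags above could be applied only on aarch64 hosts. This is a sketch of that idea rather than anything in the OP-TEE build system; the function name is hypothetical and the flags are exactly the two that worked here with 4.2.0.

```shell
# Print the extra make variables that let the 4.2.0 build complete on an
# aarch64 host, according to this thread. Other hosts get no extra flags.
optee_workaround_flags() {
    case "$1" in
        aarch64|arm64) echo "RUST_ENABLE=n WITH_CXX_TESTS=n" ;;
        *)             echo "" ;;
    esac
}

# Example (not executed here); the substitution is left unquoted on
# purpose so that each VAR=value becomes its own argument:
#   make -j"$(nproc)" $(optee_workaround_flags "$(uname -m)")
```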

@msgilligan
Contributor Author

msgilligan commented May 20, 2024

I opened Issue #749 for the specific problem of the Rust toolchain not being downloaded on aarch64 hosts and I've submitted a "draft" patch that I believe addresses the issue. And PR #748 seems to solve the download part of the issue, but there is still a separate problem with actually building the Rust examples.

I'll leave this issue open because I think we should probably also make an issue for the C++ tests not working on aarch64 hosts.

@msgilligan
Contributor Author

msgilligan commented May 21, 2024

@jforissier asks:

C++ support in TAs does not work with GCC 12.3. "make toolchains" is supposed to install 11.3, how does your setup end up pulling something from host-gcc-final-12.3.0?

Looking at toolchain.mk it says:

# There isn't any native aarch64 toolchain released from Arm and buildroot
# doesn't support distribution toolchain [1]. So we are left with no choice
# but to build buildroot toolchain from source and use it.
#
# [1] https://buildroot.org/downloads/manual/manual.html#_cross_compilation_toolchain

So that seems to answer the question. Maybe we need to adjust the Buildroot configuration to pull/build a compatible version?

Update: I created Issue #751 for this specific problem as there are at least three sub-issues we've found on this issue.

@msgilligan
Contributor Author

msgilligan commented May 22, 2024

I have been unable to reproduce the original error (which I saw on x86_64) reported in this issue, but there are three aarch64 sub-issues now:

I also opened apache/incubator-teaclave-trustzone-sdk#135 in the Teaclave Rust repository that references these issues, so the Rust developers are made aware -- maybe they have some ideas.

We should probably do one or more of the following:

  • Close this issue
  • Rename this issue to be a parent issue for aarch64 build issues
  • Create a new issue as a parent/catch-all for aarch64 build issues

@jforissier What do you suggest?

@jforissier
Contributor

Let's rename this issue to be a parent for the three you mentioned above. Note that I proposed #753 as a fix for #751. This leaves us with #752 to be investigated and it's good that you have asked for help in the teaclave project.

@msgilligan changed the title from "Build fails on Ubuntu 22.04 & Debian 12 (4.2.0)" to "Build fails on aarch64 hosts" on May 22, 2024
@msgilligan
Contributor Author

Let's rename this issue to be a parent for the three you mentioned above.

Done.
