Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Make OP-TEE physically relocatable when S-EL2 SPMC is not enabled #6712

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

seonghp
Copy link
Contributor

@seonghp seonghp commented Feb 26, 2024

This patch series try to dissociate CFG_CORE_PHYS_RELOCATABLE from CFG_CORE_SEL2_SPMC.

First, this patch allows OP-TEE to register [TZDRAM_BASE, TZDRAM_BASE + TZDRAM_SIZE) as secure memory (i.e., secure_only[0]) when it cannot get secure memory information via FF-A manifest.

Then, this patch also tries to make use of memory regions below the physical offset. With certain scenarios such as physical ASLR, the physical offset can be large and it's not desirable to ignore that region.

Lastly, to make memory monitor remain honest, this patch introduce a mechanism to invalidate certain region within the memory pool. In particular, it invalidates overlapping region between the TEE RAM and the TA RAM so that this region is excluded from memory usage statistics.

Known remaining issue is that there remains redundant mappings, i.e., TA RAM mapping from some VA to PA that belongs to overlapped regions b/w TEE RAM and TA RAM, which I plan to address soon.

Looking forward to feedback from the community.

@jenswi-linaro
Copy link
Contributor

What is the use case for this?

@seonghp
Copy link
Contributor Author

seonghp commented Feb 29, 2024

Hi @jenswi-linaro, thank you for your response. In particular, we are interested in applying physical ASLR to the OP-TEE.

Copy link
Contributor

@jenswi-linaro jenswi-linaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may require quite a bit of physical memory to be meaningful.

Anyway, I'm not so keen on "core: invalidate common RAM regions b/w TA and TEE", we should find a way to have a split up TA memory range.

@@ -1618,6 +1618,8 @@ void __weak boot_save_args(unsigned long a0, unsigned long a1,
boot_arg_nsec_entry = a4;
#endif
}
if (IS_ENABLED(CFG_CORE_PHYS_RELOCATABLE))
core_mmu_set_secure_memory(TZDRAM_BASE, TZDRAM_SIZE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use TRUSTED_DRAM_BASE and TRUSTED_DRAM_SIZE instead.

@@ -146,7 +146,7 @@ void core_mmu_get_ta_range(paddr_t *base, size_t *size)
#else
static_assert(ARRAY_SIZE(secure_only) <= 2);
if (ARRAY_SIZE(secure_only) == 1) {
vaddr_t load_offs = 0;
vaddr_t load_offs __maybe_unused = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this __maybe_unused?

@@ -153,9 +153,14 @@ void core_mmu_get_ta_range(paddr_t *base, size_t *size)

assert(secure_only[0].size >
load_offs + TEE_RAM_VA_SIZE + TEE_SDP_TEST_MEM_SIZE);
#ifdef CFG_CORE_SEL2_SPMC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (IS_ENABLED(CFG_CORE_SEL2_SPMC)) {
...
} else {
...
}

@seonghp
Copy link
Contributor Author

seonghp commented Feb 29, 2024

That may require quite a bit of physical memory to be meaningful.

Yes, that's correct. For example, with few megabytes of secure DRAM, we are given few hundreds to thousands of possible offsets.

Anyway, I'm not so keen on "core: invalidate common RAM regions b/w TA and TEE", we should find a way to have a split up TA memory range.

I see. I thought allowing memory pools to have several scattered regions might introduce more intrusive changes, and that's why I tried this way. I will take time to take another approach.

Copy link

This pull request has been marked as a stale pull request because it has been open (more than) 30 days with no activity. Remove the stale label or add a comment, otherwise this pull request will automatically be closed in 5 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label Mar 31, 2024
@seonghp
Copy link
Contributor Author

seonghp commented Apr 1, 2024

Anyway, I'm not so keen on "core: invalidate common RAM regions b/w TA and TEE", we should find a way to have a split up TA memory range.

Hi @jenswi-linaro, I'd love to hear your thoughts on my approach before I start coding.

The reason why a memory pool can only support up to one single contiguous memory regions is, to my understanding, because tee_mm_pool_t has one lo and size.

I think, by changing tee_mm_pool_t definition to:

#define NUM_REGIONS 2

struct _tee_mm_pool_t {
	tee_mm_entry_t *entry;
	paddr_t lo[NUM_REGIONS]; /* low boundary of the pool */
	paddr_size_t size[NUM_REGIONS]; /* pool size */
	uint32_t flags; /* Config flags for the pool */
	uint8_t shift; /* size shift */
	unsigned int lock;
#ifdef CFG_WITH_STATS
	size_t max_allocated;
#endif
};

we can make the pool consists of split TA RAM regions.

Could you share your thoughts on this approach?

@github-actions github-actions bot removed the Stale label Apr 2, 2024
@jenswi-linaro
Copy link
Contributor

Adding

	paddr_t lo[NUM_REGIONS]; /* low boundary of the pool */
	paddr_size_t size[NUM_REGIONS]; /* pool size */

isn't enough since tee_mm_get_smem() needs to know the lo applicable for the entry. Also, this would allocate space for all the other pools that only need one range.

Unless I'm mistaken, multiple ranges are only needed for allocations from tee_mm_sec_ddr. Perhaps it's enough to do all those allocations via a helper function that knows to look in tee_mm_sec_ddr2 or such if tee_mm_sec_ddr is full? Another option is to keep the tee_mm_alloc2() you added and accept that max_allocated isn't accurate.

In the slightly longer term, I think we should look into managing memory used by OP-TEE core more efficiently. Currently, we take TEE_RAM_VA_SIZE amount of physical memory, but we don't use all of it. It would be nice with a more dynamic approach where the unused physical memory could be reused. This is out of the scope of this PR though.

@seonghp seonghp force-pushed the master branch 4 times, most recently from 8f1a1d4 to da31546 Compare April 3, 2024 10:00
@seonghp
Copy link
Contributor Author

seonghp commented Apr 3, 2024

Thank you for your insights. I think having errors in tee_mm_sec_ddr.max_allocated is acceptable, and more preferable than adding extra logic. I revised the patch.

Another remaining issue is that currently there exist redundant mappings to TEE RAM physical memory.

              VA                 PA

    TA_RAM +------+---------->+------+
           |      |           |      |
           |      |           |      |
           +------+-------+-->+------+ core_mmu_
           |xxxxxx|      /    |      | tee_load_pa
overlapped |xxxxxx|     /     |      |
   area    |xxxxxx|    /      |      |
           |xxxxxx|   |       |      |
           +------+---+----+->+------+
           |      |   |   /   |      |
           |      |   /  /    |      |
           +------+  /  |     +------+
                    /   |
                   /    |
   TEE_RAM +------+    /
           |  RX  |   /
           +------+  /
           |  RW  | /
           |      |/
           +------+

@jenswi-linaro
Copy link
Contributor

Yes, the redundant mapping is an issue.

@seonghp seonghp force-pushed the master branch 2 times, most recently from ba2f107 to a66dbb1 Compare April 5, 2024 09:39
@seonghp
Copy link
Contributor Author

seonghp commented Apr 5, 2024

I decided to split TA RAM regions in the memory map, so there's no redundant mappings now.

		if (IS_ENABLED(CFG_CORE_PHYS_RELOCATABLE) &&
		    !IS_ENABLED(CFG_CORE_SEL2_SPMC)) {
			ADD_PHYS_MEM(MEM_AREA_TA_RAM, ta_base,
				     core_mmu_tee_load_pa - ta_base);

			ADD_PHYS_MEM(MEM_AREA_TA_RAM,
				     core_mmu_tee_load_pa + TEE_RAM_VA_SIZE,
				     ta_size -
				     (core_mmu_tee_load_pa - ta_base) -
				     TEE_RAM_VA_SIZE);
		} else {

Also, I decided to add tee_mm_sec_ddr_extra for second TA RAM region and made a helper function that attempts to allocate from tee_mm_sec_ddr_extra when allocation from tee_mm_sec_ddr fails, and substituted it for tee_mm_alloc() calls, following your suggestion.

Copy link
Contributor

@etienne-lms etienne-lms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicking comments.

@@ -99,6 +99,8 @@ ifeq ($(CFG_CORE_LARGE_PHYS_ADDR),y)
$(call force,CFG_WITH_LPAE,y)
endif

CFG_CORE_PHYS_RELOCATABLE ?= n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move inline description from line 118/119 here, 1st words rephrased?

# Enable or not support in OP-TEE to relocate itself to allow it to run from a
# physical address that differs from the link address

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! applied

@@ -1005,17 +1019,23 @@ static size_t collect_mem_ranges(struct tee_mmap_region *memory_map,
size_t num_elems)
{
const struct core_mmu_phys_mem *mem = NULL;
vaddr_t ram_start = secure_only[0].paddr;
vaddr_t tee_ram_start;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

= 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applied

* middle of the secure DRAM, rather than the secure DRAM base.
* As a result, there may be up to two remaining memory regions
* (i.e., one in the front and one in the back), which can be
* used TA RAM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... can be used for TA RAM."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! applied

@@ -1350,13 +1394,19 @@ static unsigned long init_mem_map(struct tee_mmap_region *memory_map,
*/
vaddr_t id_map_start = (vaddr_t)__identity_map_init_start;
vaddr_t id_map_end = (vaddr_t)__identity_map_init_end;
vaddr_t start_addr = secure_only[0].paddr;
vaddr_t tee_start_addr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

= 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applied


for (map = get_memory_map(); !core_mmap_is_end_of_table(map); map++) {
if (map->type == MEM_AREA_TA_RAM) {
DMSG("map = { .va=%lx, .size=%lx }", map->va, map->size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DMSG("map = { .va=%#"PRIxVA", .size=%#zx }", map->va, map->size);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this was used while debugging and I think this can be removed form the commit

core/mm/tee_mm.c Outdated
mm = tee_mm_alloc(&tee_mm_sec_ddr, size);
if (!mm) {
return tee_mm_alloc(&tee_mm_sec_ddr_extra, size);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coding style: remove braces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! applied

@jforissier
Copy link
Contributor

jforissier commented Apr 8, 2024

How can this be tested? I'd like to verify this on QEMUv8 and perhaps add a line to the CI. To avoid increasing the CI time we could piggyback on make -j$(nproc) check CFG_CRYPTO_WITH_CE82=y (see ci.yml) and simply add the proper flags to that line.

This dissociates CFG_CORE_PHYS_RELOCATABLE from CFG_CORE_SEL2_SPMC,
so that OP-TEE can be physically relocated without an SPMC at EL2.

One of the possible use-cases of enabling CFG_CORE_PHYS_RELOCATABLE
without CFG_CORE_SEL2_SPMC is physical KASLR. In particular, with this
configuration enabled, earlier stage bootloader can load OP-TEE at
an arbitrary 4K-aligned physical address.

Unlike the case where OP-TEE is loaded at some offset by an SPMC at
EL2 (e.g., 16KB by Hafnium) and there are some metadata to keep
(e.g, FF-A manifest) below the physical offset, we can use the memory
region below the physical offset as TA RAM when OP-TEE is loaded at
some random location by earlier stage bootloader, but there's no
SPMC at EL2.

To that end, we introduce another memory pool `tee_mm_sec_ddr_extra`
that accompanies with existing `tee_mm_sec_ddr` to provide secure-
DDR-backed allocation, to support such spliti TA RAM scenarios.

Also note that memory statistic infrastructure for TA RAM will be
inaccurate as it doesn't take account into split TA RAM, i.e.,
memory blocks allocated from `tee_sec_mm_ddr_extra`, as of now.

Signed-off-by: Seonghyun Park <seonghp@amazon.com>
@seonghp
Copy link
Contributor Author

seonghp commented Apr 8, 2024

Thanks @etienne-lms for the review. I believe I've addressed all your comments.

Thanks @jforissier for the feedback. To validate this patch, the recommended approach would be to cherry-pick seonghp/arm-trusted-firmware@f846124039 to TF-A and build OP-TEE with CFG_CORE_PHYS_RELOCATABLE=y.

However, enabling a CI for this change presents a challenge, as it relies on BL2 to load the OP-TEE image at a random physical offset, which falls outside the scope of this repository.

Contributing code to TF-A might be a potential solution, though it's not certain at this point.

@jforissier
Copy link
Contributor

However, enabling a CI for this change presents a challenge, as it relies on BL2 to load the OP-TEE image at a random physical offset, which falls outside the scope of this repository.

I'm thinking we could keep the actual load physical address the same (no change to BL2), but change the OP-TEE link address via CFG_TEE_LOAD_ADDR=xxx perhaps. Would that work?

@@ -1691,6 +1691,9 @@ void __weak boot_save_args(unsigned long a0, unsigned long a1,
boot_arg_nsec_entry = a4;
#endif
}
if (IS_ENABLED(CFG_CORE_PHYS_RELOCATABLE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be complete we should handle the case with CFG_CORE_FFA=y and !IS_ENABLED(CFG_CORE_SEL2_SPMC) && IS_ENABLED(CFG_CORE_PHYS_RELOCATABLE) above.

s = secure_only[0].size - load_offs - TEE_RAM_VA_SIZE -
TEE_SDP_TEST_MEM_SIZE;
if (IS_ENABLED(CFG_CORE_PHYS_RELOCATABLE) &&
!IS_ENABLED(CFG_CORE_SEL2_SPMC)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need special treatment for CFG_CORE_SEL2_SPMC? Is it an optimization? If so please remove it since we can afford that extra overhead to keep the code easier to manage.

This comment applies to the entire patch.

*
* When the TEE is loaded at the secure DRAM base or at the top,
* then either one of the below `ADD_PHYS_MEM()` can result in
* an empty memory region.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense to add empty regions. Please add a check against that.

panic("Extra TA RAM pool is not empty");

if (!get_extra_ta_ram(&s, &e))
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make sure that we can't allocate anything from tee_mm_sec_ddr_extra if this happens.

if (IS_ENABLED(CFG_CORE_PHYS_RELOCATABLE) &&
!IS_ENABLED(CFG_CORE_SEL2_SPMC)) {
if (!tee_mm_is_empty(&tee_mm_sec_ddr_extra))
panic("Extra TA RAM pool is not empty");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "can't happen", an assert() should be enough.

panic("Extra TA RAM is not secure");

/* remove previous config and init Extra TA ddr memory pool */
tee_mm_final(&tee_mm_sec_ddr_extra);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tee_mm_final() isn't needed since the pool is empty.

tee_mm_entry_t *mm = NULL;

mm = tee_mm_alloc(&tee_mm_sec_ddr, size);
if (!mm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer the more direct logic:

if (mm)
     return mm;

return tee_mm_alloc(&tee_mm_sec_ddr_extra, size);

@seonghp
Copy link
Contributor Author

seonghp commented Apr 15, 2024

Thank you, @jforissier, for your suggestion. I will investigate the approach you suggested.

Also, thanks to @jenswi-linaro for the review and feedback. I was not familiar with how OP-TEE operates alongside an SPMC at S-EL2, so I was cautious and tried to avoid disrupting the code executed when CFG_CORE_SPMC_AT_EL2 config is enabled.

I will take some time to clean up the logic, especially for the SPMC_AT_EL=2 case, and will soon submit a revised patch.

@gowthamithiagu
Copy link

As CFG_CORE_PHYS_RELOCATABLE is tightly coupled with CFG_CORE_SEL2_SPMC,
couldn't separate that out with minor changes.

There is another use case where OP-TEE needs to be loaded and executed in a memory which is determined at run time.
Pre assigned secure memory could be less in certain scenarios and OP-TEE needs more memory for its execution.

@seonghp
How is the allocation between TEE_RAM and TA_RAM done here?
Can you explain how the OP-TEE comes to know where it is going to be executed at run time?
Are you passing the OP-TEE load address from the lower level bootloaders?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants