Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

driver/mtd_spi_nor, pkg/littlefs: improve reliability with corrupted flash (new PR) #20589

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

crasbe
Copy link
Contributor

@crasbe crasbe commented Apr 17, 2024

Contribution description

This is a takeover of PR #14908, which has been abandoned. The goal is to implement the proposed changes in the original PR to make it mergeable.

The changes to make it mergeable after four years were necessary due to the following commits:

  1. 60eb75d
    This commit removed the "mtd_write_page" function in favor of the "mtd_write_page_raw" function.

  2. ea105d3
    This commit removed the "mtd_spi_nor_write" function, which was changed in the original PR as well but these changes are not necessary anymore.

Open Tasks:

  • The review from @benpicco ( mtd_spi_nor | littlefs: improve reliability with corrupted flash #14908 (review) ) has to be included in this PR still.
    My plan was to add a header file to the drivers/mtd_spi_nor folder which would probably be called something like "drivers/mtd_spi_nor/include/mtd_spi_nor_security.h" "drivers/mtd_spi_nor/include/mtd_spi_nor_defines.h".

  • The security bits used by this PR are not universal to all manufacturers of Flash chips, specifically the Flash used by the original author @vincent-d is from Macronix and for example ISSI uses a different register and different bit positions. I'll have to check if they are manufacturer specific or device specific.

  • The Kconfig related changes can probably be were deleted, as the Kconfig subsystem has been removed from the mtd_spi_nor driver.

  • There is no timeout for the added functions wait_for_write_enable_cleared and wait_for_write_enable_set, which would get the thread stuck if the chip does not answer.

  • The function wait_for_write_complete does not have a timeout either but it counts the attempts and how many times it yielded the thread. This can be used as a basis for a timeout.

  • Write tests for the Macronix and ISSI flash security features.

  • Add IS25LP128 and IS25LE01G as DuTs to the test (and possibly more, whatever the magic drawer might supply).

Testing procedure

The PR comes with new tests that utilize the Block Protect functions from NOR Flashs which triggers the Program Fail and Erase Fail flags. The Block Protect bits are not permanent and can be reset easily.
The nRF52840DK development board has a suitable flash chip on board and is covered by the test.

All tests should run successfully to verify the PR is working correctly.

Running the test on the nRF52840DK is rather straight forward:

chris@ThinkPias:~/RIOT/tests/drivers/mtd_spi_nor$ USEMODULE+=mtd_spi_nor_mx_security make flash term
Building application "tests_mtd_spi_nor" for "nrf52840dk" with CPU "nrf52".

... all the normal build commands...

2024-05-30 00:08:35,710 # Connect to serial port /dev/ttyACM0
2024-05-30 00:08:36,717 # ...
Welcome to pyterm!
Type '/exit' to exit.
2024-05-30 00:08:36,717 # OK (3 tests)

When enabling the ENABLE_DEBUG and ENABLE_TRACE flags in the drivers/mtd_spi_nor/mtd_spi_nor.c file, the debug output should show the following lines:

...
2024-05-30 00:17:48,493 # mtd_spi_nor_write: ERR: page program failed! scur = 20
...
2024-05-30 00:17:49,062 # mtd_spi_nor_write: ERR: erase failed! scur = 60
...

This indicates that the security related code path was triggered and if the tests still pass, the program and erase failures happened at the right location in the tests.

Issues/PRs references

This closes #14908.

@github-actions github-actions bot added Area: pkg Area: External package ports Area: drivers Area: Device drivers Area: Kconfig Area: Kconfig integration labels Apr 17, 2024
@crasbe
Copy link
Contributor Author

crasbe commented Apr 17, 2024

@benpicco I think I found a suitable solution to get rid of the Kconfig configuration options and have a versatile option to enable manufacturer dependent security options.
The mtd_spi_nor_params_t struct (https://github.com/RIOT-OS/RIOT/blob/master/drivers/include/mtd_spi_nor.h) has a flag field, which is evaluated during compile time and is currently used to signify the page size. However it is a 16-bit wide flag field, so there should be plenty of space to add security flags.

The flags are defined in the global header file https://github.com/RIOT-OS/RIOT/blob/master/drivers/include/mtd_spi_nor.h
This would follow the approach that the mtd_spi_nor driver is configured at a board level.
One "downside" is that the common Opcode structure in https://github.com/RIOT-OS/RIOT/blob/master/drivers/mtd_spi_nor/mtd_spi_nor_configs.c can not be used in this form anymore, since the security related Opcodes are not shared between manufacturers and even worse, the Opcodes collide. While it would be possible to just ignore the collision and use the right opcode in the code later on, it's not the right approach since it blocks future extension of the code (if anyone wants to use the ACTUAL opcode).

Therefore I would introduce four new structures: two for Macronix with security (1byte and 4byte access) and two for ISSI (1byte and 4byte access).
The comment above the structures says that the compiler only adds the selected structure to the memory, so it does not take up any additional space.

I'll probably have the preliminary commit with the proposed changes ready in a couple of hours, that should make everything clearer.

@github-actions github-actions bot removed the Area: Kconfig Area: Kconfig integration label Apr 17, 2024
@crasbe
Copy link
Contributor Author

crasbe commented Apr 17, 2024

The static-test codespell complains about these lines in drivers/mtd_spi_nor/mtd_spi_nor.c being a typo (it would like to have WELL instead of WEL):

/* Wait for WEL to be set */
...
/* Wait for WEL to be cleared */

I'm not sure how to fix that? Oh wel... 😅
Interestingly it does not occur when I run "make static-test" locally.

@kfessel
Copy link
Contributor

kfessel commented Apr 18, 2024

Interestingly it does not occur when I run "make static-test" locally.

the static-test checks if the tool that is running a test is installed - so you probably do not have codespell installed.

I'm not sure how to fix that? Oh wel...

it there is a long form (write enable L... what is the L? eg WREL?) or alternative of WEL (STATUS_WEL ?) you could use that -and workaround the spell checker or as it suggest it could be added to the ignored words list.

@crasbe
Copy link
Contributor Author

crasbe commented Apr 19, 2024

Interestingly it does not occur when I run "make static-test" locally.

the static-test checks if the tool that is running a test is installed - so you probably do not have codespell installed.
That is the weird part about it: I did install codespell and it did complain about a couple of actual typos, but at the moment the tests are all running without failure:

chris@ThinkPias:~/flashdev-riot/RIOT$ make static-test
./dist/tools/ci/static_tests.sh
...
Running "./dist/tools/codespell/check.sh" ✓
Running "./dist/tools/uncrustify/uncrustify.sh --check" ✓
Command output:

	All files are uncrustified!

Running "./dist/tools/shellcheck/check.sh" ✓

I'm not sure how to fix that? Oh wel...

it there is a long form (write enable L... what is the L? eg WREL?) or alternative of WEL (STATUS_WEL ?) you could use that -and workaround the spell checker or as it suggest it could be added to the ignored words list.

WEL stands for "Write Enable Latch". However I found out that codespell added an inline option to ignore certain words. That would avoid adding a global ignored word (and potentially missing typos of "well" in the future): https://github.com/codespell-project/codespell?tab=readme-ov-file#inline-ignore

@kfessel
Copy link
Contributor

kfessel commented Apr 19, 2024

WEL stands for "Write Enable Latch". However I found out that codespell added an inline option to ignore certain words. That would avoid adding a global ignored word (and potentially missing typos of "well" in the future): https://github.com/codespell-project/codespell?tab=readme-ov-file#inline-ignore

citing from https://github.com/codespell-project/codespell?tab=readme-ov-file#ignoring-words

note that spelling errors are case-insensitive but words to ignore are case-sensitive.

i don't think you need to go the inline route (wel would still be detected if WEL was added to the ignore file and if someone writes WEL i got somthing its probably by choice)

@github-actions github-actions bot added Area: doc Area: Documentation Area: tools Area: Supplementary tools labels Apr 19, 2024
@crasbe
Copy link
Contributor Author

crasbe commented Apr 19, 2024

Okay, this was a bit naive of me. Obviously the spellcheck only checks with the ignored word list that is currently in master. So either I have to ignore the failed spell test or open a separate PR with just the updated ignored word list 🤔

@kfessel
Copy link
Contributor

kfessel commented Apr 19, 2024

Okay, this was a bit naive of me. Obviously the spellcheck only checks with the ignored word list that is currently in master. So either I have to ignore the failed spell test or open a separate PR with just the updated ignored word list 🤔

that PR also would get spellchecked i think -> there is a egg and chicken problem

seems like (our)codespell does not follow their doc codespell-project/codespell#3272 (don't know which date our version is

seems like our codespell is case-insensitive in either case

@kfessel
Copy link
Contributor

kfessel commented Apr 19, 2024

i read it wrong by case sensitive they mean you need to type it the way they have it in their typo database (usually lowercase) (i don't know why they do it like this) codespell-project/codespell#2451 (comment)

so for WEL to not trigger you need to add wel but then wel and wEl and Wel would also not trigger

@github-actions github-actions bot removed Area: doc Area: Documentation Area: tools Area: Supplementary tools labels Apr 19, 2024
@crasbe
Copy link
Contributor Author

crasbe commented Apr 19, 2024

So this was quite a rabbit hole. No matter what I did I could not reproduce the issue... It turns out that Linux Mint had an old version of codespell. Now I installed it via pip and not form the package repository and now it complains about WEL as well, yay.

BUT... the latest Release of codespell does not support the inline ignores yet. It's only implemented in the development version.

A workaround is changing "WEL" to "WEL-Flag" and now the spellcheck passes.

@github-actions github-actions bot added Area: doc Area: Documentation Area: tests Area: tests and testing framework labels Apr 24, 2024
@crasbe
Copy link
Contributor Author

crasbe commented Apr 24, 2024

The last PR adds a new test for the mtd_spi_nor driver which tests the security features. Contrary to what I wrote in the first post, you can trigger the P_FAIL (Program Failed) and E_FAIL (Erase Failed) flags by protecting a block with the Block Protect flags in the status register.

The test is not suuuper neat because I had to copy some internal functions from the mtd_spi_nor driver to have direct access to the register. This seemed the least intrusive way to create the tests.

This is a log from the most relevant test for this PR, checking the security functions with the Block Protect bits.

2024-04-24 19:02:42,301 # ..test_mtd_block_protect: Checking the Block Protect Functions
2024-04-24 19:02:42,357 # mtd_spi_nor_write_page: 0x20000800, 0x200017d0, 0x0, 0x0, 0x10
2024-04-24 19:02:42,360 # mtd_spi_cmd: 0x20000800, 06
2024-04-24 19:02:42,364 # mtd_spi_cmd_read: 0x20000800, 05, 0x2000173f, 1
2024-04-24 19:02:42,369 # mtd_spi_nor: wait device status = 0x3e, waiting WEL-Flag
2024-04-24 19:02:42,374 # mtd_spi_cmd_addr_write: 0x20000800, 02, (000000), 0x200017d0, 16
2024-04-24 19:02:42,379 # mtd_spi_cmd_read: 0x20000800, 05, 0x2000172f, 1
2024-04-24 19:02:42,383 # mtd_spi_nor: wait device status = 0x3c, waiting !WIP
2024-04-24 19:02:42,388 # wait loop 0 times, yield 0 times, total wait 0us
2024-04-24 19:02:42,392 # mtd_spi_cmd_read: 0x20000800, 05, 0x2000173f, 1
2024-04-24 19:02:42,397 # mtd_spi_nor: wait device status = 0x3c, waiting !WEL-Flag
2024-04-24 19:02:42,401 # mtd_spi_cmd_read: 0x20000800, 2b, 0x2000176f, 1
2024-04-24 19:02:42,405 # mtd_spi_nor_write: ERR: page program failed!
2024-04-24 19:02:42,409 # mtd_spi_nor_read: 0x20000800, 0x200017e0, 0x0, 0x10
2024-04-24 19:02:42,415 # mtd_spi_cmd_addr_read: 0x20000800, 03, (000000), 0x200017e0, 16
2024-04-24 19:02:42,420 # mtd_spi_nor_erase: 0x20000800, 0x0, 0x1000
2024-04-24 19:02:42,423 # mtd_spi_cmd: 0x20000800, 06
2024-04-24 19:02:42,427 # mtd_spi_cmd_read: 0x20000800, 05, 0x2000176f, 1
2024-04-24 19:02:42,432 # mtd_spi_nor: wait device status = 0x3e, waiting WEL-Flag
2024-04-24 19:02:42,437 # mtd_spi_cmd_addr_write: 0x20000800, 20, (000000), 0, 0
2024-04-24 19:02:42,441 # mtd_spi_cmd_read: 0x20000800, 05, 0x2000175f, 1
2024-04-24 19:02:42,446 # mtd_spi_nor: wait device status = 0x3c, waiting !WIP
2024-04-24 19:02:42,450 # wait loop 0 times, yield 0 times, total wait 0us
2024-04-24 19:02:42,454 # mtd_spi_cmd_read: 0x20000800, 05, 0x2000176f, 1
2024-04-24 19:02:42,459 # mtd_spi_nor: wait device status = 0x3c, waiting !WEL-Flag
2024-04-24 19:02:42,463 # mtd_spi_cmd_read: 0x20000800, 2b, 0x20001797, 1
2024-04-24 19:02:42,467 # mtd_spi_nor_erase: ERR: erase failed!
2024-04-24 19:02:42,467 # 
2024-04-24 19:02:42,468 # OK (4 tests)
2024-04-24 19:03:01,934 # Exiting Pyterm

These two messages are generated by the new code that was created by @vincent-d:

2024-04-24 19:02:42,405 # mtd_spi_nor_write: ERR: page program failed!
2024-04-24 19:02:42,467 # mtd_spi_nor_erase: ERR: erase failed!

HOWEVER... I do not love the solution with the configuration (setting both Flag and Opcode) in drivers/mtd_spi_nor/mtd_spi_nor_configs.c. On one hand it's unnecessary duplication and on the other hand it is really easy to forget to set one of the two (ask me how I know...). And if you forget to set the correct opcodes, it results in false positive security behaviour. An undefined opcode is set to 0x00 (NOP) and that returns 0x00. This 0x00 is interpreted as "everything is fine" by the security code.

My preferred option would be adding some Pseudomodules, similar to the AT24CXXX driver. However I'm not sure if it would make sense to do it the same way by allocating every namespace that begins with "M25F...", "M25R...", "IS25LP...", "IS25LE...", "SST25V..." etc. A better option would probably be something like "MTD_SPI_NOR_MX", "MTD_SPI_NOR_IS", "MTD_SPI_NOR_SST" for defining the opcode set and "MTD_SPI_NOR_SECURITY" and (in a future expansion) "MTD_SPI_NOR_ECC" for defining the features.

Any other ideas about this are very welcome.

@crasbe
Copy link
Contributor Author

crasbe commented Apr 25, 2024

@benpicco You reviewed the original PR, could you take a quick look at this to give me some guidance on the previous question? You don't have to review this PR yet.

@crasbe
Copy link
Contributor Author

crasbe commented Apr 30, 2024

I implemented the Pseudomodule solution and IMO it is way more elegant than the original idea and scales a lot better.
One thing that became clear is that the ISSI flash behaves a lot different than the Macronix chip. The flags in the security register of the Macronix flash reset themselves after a successful read and the ISSI flags remain until cleared with a separate OpCode.
Furthermore, it appears like the WEL (Write Enable Latch) Flag on the ISSI is not reset when the program or erase operation was not successful. This is not clear from the datasheet, it only states that the WEL flag is reset on completion of a program or erase operation, not that is has to be successful. Therefore the WEL flag has to be cleared with a dedicated command.
When you don't reset the WEL flag, the wait_for_wel_reset function will poll the flag indefinitely.

So one thing to add to this function is trying to clear the flag manually perhaps after a certain number of tries. (As well as adding the timeout).

Therefore it took a lot of testing and fiddling until the ISSI chip was running as expected and the solution with the Pseudomodules proved to be superior in this case because I was able to add the additional OpCodes conditionally.

Unfortunately I did not separate the changes in separate commits because I did not intend them to become this involved...

@crasbe
Copy link
Contributor Author

crasbe commented May 6, 2024

I think this PR is already quite extensive and beyond the originally intended scope, so I would like to postpone adding the timeout tasks for now to a separate PR.

@crasbe crasbe marked this pull request as ready for review May 6, 2024 10:59
@crasbe
Copy link
Contributor Author

crasbe commented May 27, 2024

Is there anything I can do to help with the review process?
Perhaps split off the test program (minus the security tests) to a seperate PR?

I could send out an Arduino Uno Shield with a Macronix flash as well for testing with a Nucleo board in case the reviewer does not have an nRF52840DK.

Copy link
Contributor

@kfessel kfessel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just reading, no function review : aside from some // and a TODO this looks good to me

maybe you can add some test results into the pr description ( the one in the comment seem to not be current)

@kfessel
Copy link
Contributor

kfessel commented May 27, 2024

@benpicco @fabian18 maybe someone of you feels more confident.

@kfessel kfessel added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label May 29, 2024
@riot-ci
Copy link

riot-ci commented May 29, 2024

Murdock results

✔️ PASSED

8f25a37 tests/mtd_spi_nor: Added test with security features

Success Failures Total Runtime
10177 0 10178 12m:24s

Artifacts

@crasbe
Copy link
Contributor Author

crasbe commented May 29, 2024

I don't really understand why the tests fail now, the changes I did are unrelated to the failures (all except for the busy_wait_us(1000000) in tests/drivers/mtd_spi_nor/main.c).

It would appear that most of the errors come from the fact that on an 8-bit ATmega an int is 16-bit instead of 32-bit on the other 32-bit architectures: https://gcc.gnu.org/wiki/avr-gcc#Type_Layout

I'm not really sure what to do about the busy_wait_us function, the maximum time I could wait here is 65ms, which defeats the purpose. The idea here was that the terminal needs some time to connect and it's annoying to lose half of the test output.

main.c: In function ‘main’:
main.c:255:18: error: large integer implicitly truncated to unsigned type [-Werror=overflow]
     busy_wait_us(1000000);
                  ^

For the time being I'll just delete it.

@maribu
Copy link
Member

maribu commented May 29, 2024

I'm not really sure what to do about the busy_wait_us function, the maximum time I could wait here is 65ms, which defeats the purpose.

Using the CPU to delay a function with a down-counting loop is the most costly option in terms of power consumption and the least accurate option. It is good enough for a brief delay in a driver's init function to avoid having to depend on ztimer just for that - or when desperate (e.g. early in boot before a timer is available).

IMO: When the unsigned type becomes an issue, it is a red flag that a different function should be used to delay execution.

@github-actions github-actions bot added Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: sys Area: System labels May 29, 2024
@github-actions github-actions bot removed Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: sys Area: System labels May 29, 2024
@crasbe
Copy link
Contributor Author

crasbe commented May 29, 2024

@maribu I absolutely agree with you. As I said, the delay was just a little help for me to catch the actual test output. However, considering that no other tests do it, it was better to delete it.

@kfessel I think I got all of the comments in the incorrect styling and added some more information regarding the test procedure.

Unrelated: I still don't love git 👀
If I had known how this escalates, I would've/should've approached it differently, but oh well. It's a bit tidier now I guess.

@kfessel
Copy link
Contributor

kfessel commented May 30, 2024

If you really want to busy wait long you can always loop a busy_wait

for(uint32_t milliseconds; milliseconds; milliseconds--) busy_wait_us(1000);

also for many cpu counting down ( single_threaded / isr disabled ) isn't to bad in precision ( especially many avr-board that have no clock crystal but a cpu crystal and are very predictable counting) setting up a ztimer in up to millisecond is probably worse in precision on slow mcu

@kfessel
Copy link
Contributor

kfessel commented May 30, 2024

@crasbe: do you want to tackle the missing timeouts (waiting on status change) (from your TODOs)?
want to prepare the interfaces to be able to fail?
ignore them (for reason e.g.: if this happens somthing is very wrong and the system is not recoverable)?

* USEMODULE += mtd_spi_nor
*
* For ISSI and Macronix devices, some security features are provided
* to check if program or erase operations were successful.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like something where we could also have a software fallback

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, however reading back all the data that was written previously or doing a blank check after an erase operation would lead to a significant performance penalty.

That is something I would prefer to do in a second PR, separate from this.

if ((status & SPI_NOR_STATUS_WEL) == 0) {
break;
}
thread_yield();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long does this typically take?
Maybe a proper sleep would be in order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting for WEL usually only takes one cycle, because the majority of the waiting is done in the "wait_for_write_complete" function. The WIP and WEL flags are usually cleared together on a successful write.

Waiting for a write to complete depends a lot on the operation. For short writes it cycles only once, but a chip erase can take up to 400 seconds (6.5 minutes) for my big 1GBit Flash. But that waiting is not done in the wait_for_write_enable_cleared function.

DEBUG("mtd_spi_nor_write: ERR: page program failed! scur = %02x\n", scur_reg);
ret = -EIO;
}
#elif IS_USED(MODULE_MTD_SPI_NOR_ISSI_SECURITY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we know the vendor from the device ID, can we do this check at run-time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is absolutely possible, I decided against it to avoid having too much "dead" code. Considering how different Macronix and ISSI handle that stuff, ST, Microchip, Winbond, ... probably handle it different as well, so we would have many checks left in the final binary which are never needed in normal operation.

Considering so many other parameters of the Flash have to be known at compile time, it didn't make sense to me to check it on runtime.

HOWEVER now that I think about it, it is probably possible to add a .manufacturer variable to the mtd_spi_nor_params_t struct. That would get rid of the pseudomodules and the compiler can optimize the code at compile time, because the variable is already set.

Copy link
Contributor

@benpicco benpicco May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is absolutely possible, I decided against it to avoid having too much "dead" code.

looking at the code, it doesn't appear to be too much code, just one register read for Macronix and two for ISSI.
I don't mind being able to only enable what's needed, but it would be good being able to select multiple 'security' modules if there are board variants with different flash chips.

Considering so many other parameters of the Flash have to be known at compile time, it didn't make sense to me to check it on runtime.

Actually there aren't any critical ones - the timings that get set are just minimal values, the size is determined at run time. The idea is that you can just attach any flash chip as those often get swapped out between board revisions depending on availability.

@@ -622,19 +659,50 @@ static int mtd_spi_nor_write_page(mtd_dev_t *mtd, const void *src, uint32_t page
/* write enable */
mtd_spi_cmd(dev, dev->params->opcode->wren);

/* Wait for WEL-Flag to be set */
wait_for_write_enable_set(dev);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this required for all chips / when integrity checking is not enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, on all Flash devices you have to wait for the WEL flag to be set before starting a Write/Erase operation. Otherwise the operation will be ignored.

https://www.macronix.com/Lists/Datasheet/Attachments/8868/MX25R6435F,%20Wide%20Range,%2064Mb,%20v1.6.pdf Page 34 has a block diagram. However now that I look at it, Macronix says to issue the WREN command again if the flag is not set. So the right thing to do would not be to wait for the flag to be set but trying to set it until it is set.

Maybe that would need a try-counter instead of a timeout. 🤔

*
* USEMODULE += mtd_spi_nor
*
* For ISSI and Macronix devices, some security features are provided
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/security/integrity ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "security" wording comes from the Macronix "Security Register", but I can change it to "integrity", sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When reading 'Security' I tend to think about encryption and prevention from unauthorized access, not protection against data corruption.

/* ISSI flashs do not reset the WEL-Flag when an operation was not *completed* */
mtd_spi_cmd(dev, dev->params->opcode->wrdi);
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better move this block to a helper function to keep the code tidy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do :)

/* ISSI flashs do not reset the WEL-Flag when an operation was not *completed* */
mtd_spi_cmd(dev, dev->params->opcode->wrdi);
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

return mtd_write_page_raw(mtd, buffer, page, off, size);
int ret = mtd_write_page_raw(mtd, buffer, page, off, size);
if (ret == -EIO) {
ret = LFS_ERR_CORRUPT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since EIO is already used by the driver for bus errors, better use something different for errors detected by the device. How about EFAULT, EBADMSG or EILSEQ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This decision was made by the original author, I'll have to look up which return value is most suitable instead of EIO.

tests/drivers/mtd_spi_nor/main.c Show resolved Hide resolved
@crasbe
Copy link
Contributor Author

crasbe commented May 30, 2024

@crasbe: do you want to tackle the missing timeouts (waiting on status change) (from your TODOs)? want to prepare the interfaces to be able to fail? ignore them (for reason e.g.: if this happens somthing is very wrong and the system is not recoverable)?

I'm not sure what the best way to go forward is here. IMO the timeout features have to potential to be a big can of worms as well, so I would tend to do that in a separate PR as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: doc Area: Documentation Area: drivers Area: Device drivers Area: pkg Area: External package ports Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants