Skip to content

feat(spi_nand_flash): Add block device support#606

Draft
RathiSonika wants to merge 8 commits intoespressif:masterfrom
RathiSonika:feat/nand_flash_bdl_support
Draft

feat(spi_nand_flash): Add block device support#606
RathiSonika wants to merge 8 commits intoespressif:masterfrom
RathiSonika:feat/nand_flash_bdl_support

Conversation

@RathiSonika
Copy link
Collaborator

@RathiSonika RathiSonika commented Nov 10, 2025

Change description

  • Add block device support (nand_flash_bdl and nand_wl_bdl)

  • Please refer spi_nand_flash/layered_architecture.md for the details

  • Refactor the component for improved structure, maintainability and readability

  • Removed the fatfs/vfs component dependency; it will be relocated (along with examples) to another component in a different PR.

  • Add Kconfig option NAND_FLASH_ENABLE_BDL to enable/disable BDL support

  • Updated test_apps and host_tests to verify the same

  • OTP and OOB area handling are still missing. I’ll address them in a separate PR, as the scope of this one is already growing.

@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch 2 times, most recently from 5e70c2f to afd1301 Compare November 11, 2025 08:38
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch 2 times, most recently from 69e8bbd to 422413c Compare November 12, 2025 09:38
@RathiSonika RathiSonika self-assigned this Nov 13, 2025
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch 3 times, most recently from b05a6d5 to 2d1b2a3 Compare November 13, 2025 19:11
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch from 2d1b2a3 to 19bd12c Compare November 14, 2025 07:45
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch 15 times, most recently from c335750 to cc21a22 Compare November 19, 2025 12:31
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch from b8c5e19 to c38ec01 Compare February 23, 2026 10:37
* the WEAR-LEVELING layer instead (see spi_nand_flash_wl_get_blockdev).
*
*/
esp_err_t nand_flash_get_blockdev(spi_nand_flash_config_t *config,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create new BDL instance for existing NAND flash device driver?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, we have a single BDL flash instance mapped to the WL layer. A single WL instance manages the entire flash.

Therefore, there is no need to create multiple BDL instances for the NAND flash driver. The flash BDL and WL BDL will be created once during flash initialization.

@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch 3 times, most recently from 6cd97df to 2005e10 Compare February 27, 2026 12:07
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch 2 times, most recently from c8116cf to d46bf54 Compare March 5, 2026 10:36
@RathiSonika RathiSonika force-pushed the feat/nand_flash_bdl_support branch from d6c2cb1 to be8255b Compare March 5, 2026 11:14
.release = nand_flash_blockdev_release,
};

esp_err_t nand_flash_get_blockdev(spi_nand_flash_config_t *config, esp_blockdev_handle_t *out_bdl_handle_ptr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no device_flags initialization in this BDL getter. Please, use ESP_BLOCKDEV_FLAGS_INST_CONFIG_DEFAULT() at least

return res;
}

static esp_err_t nand_flash_blockdev_erase(esp_blockdev_handle_t handle, uint64_t start_addr, size_t erase_len)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems you are deleting only one block here (!!!). erase_len is never used in this context.

Am I missing anything?

}
uint32_t start_page = trim_arg->start_addr / page_size;
uint32_t page_count = trim_arg->erase_len / page_size;
for (uint32_t page_id = start_page; page_id <= (start_page + page_count); page_id++) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong - off by one, you go out of range on the last page! Proper condition check is page_id < (start_page + page_count).

Such points should be easily detected by properly designed tests.

uint32_t ecc_err_exceeding_threshold_count = 0;
uint32_t ecc_err_not_corrected_count = 0;
esp_err_t ret = ESP_OK;
for (uint32_t page = 0; page < num_pages; page++) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this can be very slow on large devices... any idea for performance improvement? Like continuous stats caching?

Comment on lines +113 to +114
esp_err_t res = ESP_OK;
return res;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
esp_err_t res = ESP_OK;
return res;
return ESP_OK;

memcpy(&flash_info->geometry, &dev->chip, sizeof(nand_flash_geometry_t));
return ESP_OK;
}
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreachable break

spi_nand_flash_device_t *dev_handle = (spi_nand_flash_device_t *)nand_handle->ctx;

#ifdef CONFIG_IDF_TARGET_LINUX
ret = nand_emul_deinit(dev_handle);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering: is this call consitent with nand_handle->ops->release(nand_handle) ? Why is this call needed? (haven't checked the details, it just looks suspicious)

uint32_t num_pages;
bdl->ops->ioctl(bdl, ESP_BLOCKDEV_CMD_GET_AVAILABLE_SECTORS, &num_pages);

TEST_ESP_OK((start_page + page_count) > num_pages);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong check - TEST_ESP_OK verifies a value of esp_err_t, not a boolean value (which happens as 0/1 integer in your case)

@pacucha42
Copy link
Collaborator

Found few glitches, please, have a look

@pacucha42
Copy link
Collaborator

pacucha42 commented Mar 9, 2026

Also, there are some test cases missing. AI generated overview:

On-target (test_spi_nand_flash_bdl.c)

1. BDL geometry and flags (Flash BDL)

After nand_flash_get_blockdev, assert:

  • disk_size == num_blocks * block_size
  • read_size / write_size / erase_size match device
  • (Once implementation is fixed) device_flags has expected bits (e.g. erase_before_write, and_type_write, default_val_after_erase)
  • Optionally: ops and all op pointers non-NULL

2. Multi-block erase (Flash BDL)

  • Erase 2–3 consecutive blocks in one erase(handle, start_addr, erase_len) call
  • Read back and verify erased (e.g. 0xFF)
  • Ensures the implementation honours erase_len (will fail until the implementation is fixed)

3. ESP_BLOCKDEV_CMD_MARK_DELETED (WL BDL)

  • Write a few logical pages
  • Call ioctl MARK_DELETED with esp_blockdev_cmd_arg_erase_t for that range
  • Optionally trigger or observe GC; verify pages can be reused or data is no longer required
  • Ensures TRIM path and loop bounds are correct

4. Sync (WL BDL)

  • Write one or more pages
  • Call bdl->ops->sync(bdl)
  • Read back and verify
  • Ensures sync is exercised

5. GET_NAND_FLASH_INFO ioctl (Flash BDL)

  • Call ESP_BLOCKDEV_CMD_GET_NAND_FLASH_INFO
  • Assert non-zero / non-empty chip name, manufacturer_id, device_id, and geometry fields consistent with geometry on the handle

6. Error paths for get_blockdev

  • nand_flash_get_blockdev(NULL, &out) and nand_flash_get_blockdev(&config, NULL)ESP_ERR_INVALID_ARG
  • spi_nand_flash_wl_get_blockdev(NULL, &out) and spi_nand_flash_wl_get_blockdev(flash_bdl, NULL)ESP_ERR_INVALID_ARG
  • (If applicable) invalid or partially initialised flash BDL → appropriate error

7. Invalid args in ops

  • Read with misaligned src_addr or data_read_len not multiple of read_size (where required) → ESP_ERR_INVALID_ARG or ESP_ERR_INVALID_SIZE
  • Erase with misaligned start_addr or erase_lenESP_ERR_INVALID_ARG / ESP_ERR_INVALID_SIZE
  • (Optional) Read with data_read_len > dst_buf_sizeESP_ERR_INVALID_ARG

8. Unsupported ioctl

  • Call ioctl with an unknown command
  • Expect ESP_ERR_NOT_SUPPORTED

Host (test_nand_flash_bdl.cpp)

9. WL BDL on host

  • Create Flash BDL with emul config
  • Call spi_nand_flash_wl_get_blockdev(flash_bdl, &wl_bdl)
  • Assert WL geometry (e.g. disk_size, read_size, write_size)
  • Write a page, read it back, call GET_AVAILABLE_SECTORS
  • Release WL BDL (which releases flash BDL)
  • Ensures WL stack works on Linux

10. Flash BDL geometry and ops

  • After nand_flash_get_blockdev, assert geometry fields
  • Assert all required op pointers non-NULL (read, write, erase, ioctl, release)

11. COPY_PAGE ioctl (Flash BDL)

  • Erase/program source page
  • Call ioctl ESP_BLOCKDEV_CMD_COPY_PAGE from source to dest page
  • Read dest page and verify content
  • Host-only is acceptable if on-target is time-consuming

12. GET_NAND_FLASH_INFO and GET_BAD_BLOCKS_COUNT on host

  • Call both ioctls
  • Assert sane values (e.g. chip name length, bad_block_count <= total blocks)

13. Error path: NULL / invalid

  • nand_flash_get_blockdev(NULL, &out) and nand_flash_get_blockdev(&config, NULL)ESP_ERR_INVALID_ARG
  • Optionally: misaligned read/erase → error

14. Release and no use-after-free

  • Create Flash BDL, call release
  • In same process, create again and run a minimal read/write
  • Ensures release cleans up and a second init works

Priority

Priority Tests
High Geometry/flags (1), multi-block erase (2), TRIM/MARK_DELETED (3), WL BDL on host (9), error paths for get_blockdev (6)
Medium Sync (4), GET_NAND_FLASH_INFO (5, 12), invalid args in ops (7), unsupported ioctl (8), host COPY_PAGE (11)
Lower GET_PAGE_ECC_STATUS (can be slow), release+reconnect (14), extra edge cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants