Skip to content

Fix stale values in the detail struct#2881

Merged
Rot127 merged 4 commits intocapstone-engine:nextfrom
slate5:fix/riscv-init-cs-detail
Mar 29, 2026
Merged

Fix stale values in the detail struct#2881
Rot127 merged 4 commits intocapstone-engine:nextfrom
slate5:fix/riscv-init-cs-detail

Conversation

@slate5
Copy link
Copy Markdown
Contributor

@slate5 slate5 commented Mar 23, 2026

Your checklist for this pull request

  • I've documented or updated the documentation of every API function and struct this PR changes.
  • I've added tests that prove my fix is effective or that my feature works (if possible)

Detailed description

The bug: The detail struct (cs_detail) isn't cleared between instructions, causing stale values. For example, op_count would grow with each new instruction, and was only reset in rare cases when alias instructions are used but alias details are disabled (see RISCV_LLVM_printInstruction():411).

The fix: Added RISCV_init_cs_detail() that gets called once from RISCV_LLVM_getInstruction() before decoding each instruction.

Test plan

Pseudo snippet:

cs_open(CS_ARCH_RISCV, CS_MODE_RISCV64 | CS_MODE_RISCV_C, &handle);
cs_option(handle, CS_OPT_DETAIL, CS_OPT_ON);

while (cs_disasm_iter(handle, &start, &size, &address, insn))
	printf("%d\n", insn->detail->riscv.op_count);

Without the patch, op_count would grow up to the max value of uint8_t (255).

Closing issues

...

@github-actions github-actions bot added the RISCV Arch label Mar 23, 2026
Copy link
Copy Markdown
Collaborator

@Rot127 Rot127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just these two nitpicks (and the build of course). Otherwise lgtm.
Thanks!

// memset all stalled values in the detail struct to 0 before disassembling any next instruction
void RISCV_init_cs_detail(MCInst *MI)
{
if (detail_is_set(MI)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please invert and return early.

Comment on lines +277 to +278
//for (size_t i = 0; i < ARR_SIZE(RISCV_get_detail(MI)->operands); ++i)
// RISCV_get_detail(MI)->operands[i].type = RISCV_OP_INVALID;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RISCV_OP_INVALID should always be 0 because it is = RISCV_OP_INVALID which is in the API.

Suggested change
//for (size_t i = 0; i < ARR_SIZE(RISCV_get_detail(MI)->operands); ++i)
// RISCV_get_detail(MI)->operands[i].type = RISCV_OP_INVALID;

Copy link
Copy Markdown
Collaborator

@Rot127 Rot127 Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can even remove the whole assert actually.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if the API will never change, there is no need for assert :)
Then, should the final version be without an early return? This is all it stayed:

if (detail_is_set(MI))                                        
        memset(get_detail(MI), 0,                             
               offsetof(cs_detail, riscv) + sizeof(cs_riscv));

Copy link
Copy Markdown
Collaborator

@Rot127 Rot127 Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine like it is then :)
Just remove the assert

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 23, 2026

cc @moste00

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 23, 2026

@slate5 Off-topic, but I see on your profile that you are associated with the Barcelona Supercomputing Center?
I plan to apply for funding (https://www.sovereign.tech/) and need to add references of projects using Capstone.
Do you use Capstone in a capacity that I can ask for a quote from you folks?

@slate5
Copy link
Copy Markdown
Contributor Author

slate5 commented Mar 23, 2026

Do you use Capstone in a capacity that I can ask for a quote from you folks?

I don't know to what extent it is generally used in BSC, but I was using it as a dependency of syscall_intercept while porting it to RISC-V (the same applies for PPC and ARM ports done by BSC and RIKEN). And, GekkoFS depends on all syscall_intercept ports.
I’m not sure what you mean by "quote", like a citation? I can share our paper that references Capstone if that helps.

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 23, 2026

I’m not sure what you mean by "quote", like a citation? I can share our paper that references Capstone if that helps.

All good. The repos you linked are enough. Thanks.
It is mostly about showing that Capstone has an essential part in the digital infrastructure.

@slate5
Copy link
Copy Markdown
Contributor Author

slate5 commented Mar 23, 2026

Best of luck in getting funds, you deserve it ;)

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 24, 2026

@moste00 Please take a look as well.

{
if (detail_is_set(MI))
memset(get_detail(MI), 0,
offsetof(cs_detail, riscv) + sizeof(cs_riscv));
Copy link
Copy Markdown
Contributor

@moste00 moste00 Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if offsetof is warranted here, it's overly cautious to only zero out the cs_details structs up to the RISC-V member but not the rest of the strucut, the rest of the struct will probably be garbage anyway since the RISC-V member shares a memory prefix with the other members, so clearing it will likely mangle the memory start part of the other member in some complex way.

Anyway, I personally vote against this size expression and in favor of a simple sizeof(cs_detail), but I wouldn't block the PR because of it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @moste00, by "up to the RISC-V member", do you mean including or excluding cs_riscv? memset() clears the entire cs_detail, including cs_riscv. The difference between using offsetof and only sizeof is that the total size being cleared can be much smaller (424 B vs 2216 B). It's mostly irrelevant, I just don't like making memset() sweat more than necessary XD

What do you mean by "mangle the memory start part of the other member"? The only issue I can see with offsetof would be if new elements are added to cs_detail after union, they wouldn’t be cleared.

Copy link
Copy Markdown
Contributor

@moste00 moste00 Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't know the real sizes, interesting!

My point was more like: the cs_detail is structured as follows, right ?

<<Beginning of cs_detail struct>>
<<<member 1>>>
<<<member 2>>>
....
<<<member N>>>
<<<Beginning of the union of arch-specific struct details>>>
<<<<arch 1 struct >>>>
<<<<arch 2 struct>>>>    <---- [in parallel, sharing the same memory with arch 1 struct]
...
<<<<arch N struct>>>>

As I understand it, your size expression is clearing all the struct up to the memory representing the RISC-V details struct, is that correct ? But if so, this means there are stale bytes from the other union structs (for example, if the x86 struct is some M bytes longer than the RISC-V ones, those M bytes are stale after the clearing operation, correct?).

I thought that was due to you not wanting to clear this memory, but your argument is that the real sizes of the structs involved make this worth it, right? In that case you do have a point.

That said, this is the first time I actually see offsetof used in real code, maybe the saving is just not worth assuming it exists! Capstone is compiled for so many platforms and compilers, some of them are weird and old, and memset is mostly vector code munching through the memory range anyway.

Your call though, I have no strong preference either way. Thank you for clarifying this with numbers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, your size expression is clearing all the struct up to the memory representing the RISC-V details struct, is that correct ?

Yes, those M bytes are irrelevant (if I'm not missing anything) because RV won't touch them at runtime. They will be zeroed at the start, and nothing will be written there in the RV runtime, so nothing has to be cleared.

I thought that was due to you not wanting to clear this memory, but your argument is that the real sizes of the structs involved make this worth it, right?

Haha, yes, I'm just being evil (premature optimization is the root of all evil) :)

offsetof is used in different places in Capstone already. It's part of stddef.h; in fact, I took this example of clearing like this from other architectures in Capstone. Vector instructions can make this size difference irrelevant if you have them on RV and if libc implements them. I tested on LicheePi 4A: isolated memset is 4 times faster for those two sizes, but then i used gprof on capstone, and the difference between those two functions was irrelevant (~10% if gprof can be trusted). As I said, pure evil... :)

Your call though, I have no strong preference either way. Thank you for clarifying this with numbers.

You are welcome! I clearly like things to go faster, but your approach is safer (the corner case I have mentioned: if someone ever puts a new cs_detail's member after archs union). @Rot127, can give a final verdict XD

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I personally vote against this size expression and in favor of a simple sizeof(cs_detail), but I wouldn't block the PR because of it.

Nah, offsetof is fine. It is used like this in any other arch. And I am not sure if there is somewhere some dependency on it not being zeroed except the archs detail struct.

@moste00
Copy link
Copy Markdown
Contributor

moste00 commented Mar 24, 2026

Looks good and thank you @slate5 for spotting this!

I shared non-binding feedback on the size calculation you used to clear out the members, please feel free to take it into account. I think using offsetof in the expression is complexity for no apparent reason, but if this is important for your use case feel free to leave it in.

(Sidenote: The PR description says you added tests but I can't see any test code in the changed files, this is a heads up in case you forgot to push, but it's okay if you need to merge the fix urgently as I will add the tests later.)

@slate5
Copy link
Copy Markdown
Contributor Author

slate5 commented Mar 24, 2026

I addressed offsetof/sizeof in the comment above, either version can be used. sizeof is cleaner and straightforward, and offsetof does only the work that must be done.

Oh yeah, I checked that test box by mistake. I’ll add tests, and then you can merge.

@slate5
Copy link
Copy Markdown
Contributor Author

slate5 commented Mar 26, 2026

@moste00, aside from YAML, I added a unit test because cstest runs with cs_disasm, but the bug manifests when cs_disasm_iter is used. Let me know what you think.

@moste00
Copy link
Copy Markdown
Contributor

moste00 commented Mar 26, 2026

Tests look good @slate5. Thanks again.

@Rot127 Does this merge automatically or do you have to force something?

Copy link
Copy Markdown
Collaborator

@Rot127 Rot127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks a lot for the test!
Wouldn't have been necessary, but I appreciate any pedantic testing!

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 26, 2026

Buffer overflow:

    Start 21: unit_riscv_op_count_iter
4/4 Test #21: unit_riscv_op_count_iter .........Subprocess aborted***Exception:   0.80 sec
Errors while running CTest
=================================================================
==6116==ERROR: AddressSanitizer: global-buffer-overflow on address 0x5566df2371a8 at pc 0x5566df12e736 bp 0x7ffcef2f0100 sp 0x7ffcef2f00f0
READ of size 1 at 0x5566df2371a8 thread T0
    #0 0x5566df12e735 in RISCV_getInstruction /home/runner/work/capstone/capstone/arch/RISCV/RISCVDisassembler.c:612
    #1 0x5566df1314a7 in RISCV_LLVM_getInstruction /home/runner/work/capstone/capstone/arch/RISCV/RISCVDisassembler.c:751
    #2 0x5566ded5623b in cs_disasm_iter /home/runner/work/capstone/capstone/cs.c:1547
    #3 0x5566ded4efd6 in test_riscv_op_count_no_stale /home/runner/work/capstone/capstone/tests/unit/riscv_op_count_iter.c:31
    #4 0x5566ded4f096 in main /home/runner/work/capstone/capstone/tests/unit/riscv_op_count_iter.c:45
    #5 0x7f506002a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
    #6 0x7f506002a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
    #7 0x5566ded4e824 in _start (/home/runner/work/capstone/capstone/build/tests/unit/riscv_op_count_iter+0x93e824) (BuildId: 6533bf10d28db8d9715ed712553181e5ef7d22b3)

0x5566df2371a8 is located 0 bytes after global variable 'code' defined in '/home/runner/work/capstone/capstone/tests/unit/riscv_op_count_iter.c:11:23' (0x5566df2371a0) of size 8
0x5566df2371a8 is located 56 bytes before global variable 'expected_op_counts' defined in '/home/runner/work/capstone/capstone/tests/unit/riscv_op_count_iter.c:16:23' (0x5566df2371e0) of size 12
SUMMARY: AddressSanitizer: global-buffer-overflow /home/runner/work/capstone/capstone/arch/RISCV/RISCVDisassembler.c:612 in RISCV_getInstruction
Shadow bytes around the buggy address:
  0x5566df236f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5566df236f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5566df237000: 00 00 00 00 00 00 00 00 00 00 00 00 05 f9 f9 f9
  0x5566df237080: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x5566df237100: 00 00 00 00 02 f9 f9 f9 f9 f9 f9 f9 00 03 f9 f9
=>0x5566df237180: f9 f9 f9 f9 00[f9]f9 f9 f9 f9 f9 f9 00 04 f9 f9
  0x5566df237200: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x5566df237280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5566df237300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x5566df237380: 00 00 00 00 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x5566df237400: 00 01 f9 f9 f9 f9 f9 f9 00 00 03 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==6116==ABORTING


75% tests passed, 1 tests failed out of 4

Total Test time (real) =   0.94 sec

The following tests FAILED:
	 21 - unit_riscv_op_count_iter (Subprocess aborted)

Once its fixed we can merge.

@slate5
Copy link
Copy Markdown
Contributor Author

slate5 commented Mar 26, 2026

Once its fixed we can merge.

If it's fine, I can just add a fix in this PR for cs_disasm_iter. Something like this:

CAPSTONE_EXPORT
bool CAPSTONE_API cs_disasm_iter(csh ud, const uint8_t **code, size_t *size,
                                 uint64_t *address, cs_insn *insn)
{
+       if (*size == 0)
+               return false;
+
        struct cs_struct *handle;
        uint16_t insn_size;
        MCInst mci;
        bool r;

I don't know what your practice is because this is not related to this PR and will influence all architectures, so should I open a new one, or can the fix go here?

@slate5
Copy link
Copy Markdown
Contributor Author

slate5 commented Mar 26, 2026

Or i can place this early return guard in RISCV_LLVM_getInstruction() to keep it relevant only to riscv, but maybe it's better to check the size for every arch

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 26, 2026

Yes, please open a new one.
Thanks, good that you found it.

{
if (detail_is_set(MI))
memset(get_detail(MI), 0,
offsetof(cs_detail, riscv) + sizeof(cs_riscv));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I personally vote against this size expression and in favor of a simple sizeof(cs_detail), but I wouldn't block the PR because of it.

Nah, offsetof is fine. It is used like this in any other arch. And I am not sure if there is somewhere some dependency on it not being zeroed except the archs detail struct.

@Rot127
Copy link
Copy Markdown
Collaborator

Rot127 commented Mar 28, 2026

Please rebase

@slate5 slate5 force-pushed the fix/riscv-init-cs-detail branch from 7e1b572 to 3768bfd Compare March 28, 2026 19:04
@Rot127 Rot127 merged commit facdaac into capstone-engine:next Mar 29, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RISCV Arch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants