Show chain of references in Ractor errors #935
base: master
d37bed4 to 89a62f7
vm.c
Outdated
!RB_OBJ_SHAREABLE_P(block_self)) {
if (!rb_ractor_shareable_p_continue(block_self, chain)) {
if (chain) {
if (NIL_P(*chain)) {
Why are we duplicating the chain_append logic here?
I don't know what's the best way to share code here. Should I make the function non-static, prefix it somehow and add it to "ractor_core.h"?
I moved it to an inline function in ractor_core.h.
bootstraptest/test_ractor.rb
Outdated
" from block self #<Foo @ivar={}>\n" \
" from hash default value\n" \
" from instance variable @ivar\n" \
" from instance variable @foo", %q{
The chain approach makes sense to me, but I'm finding the error message a little hard to parse - it's not immediately obvious to me from the message what objects the ivars are attached to, or where the hash values are coming from. Is it worth using rb_inspect or rb_obj_as_string here or is the performance an issue?
It's not a performance issue, it's more that the output gets really messy really fast.
Even in the simple case of just instance variables, it will cause a huge wall of text with a lot of repetition:
class A; attr_accessor :b; end
class B; attr_accessor :c; end
class C; attr_accessor :d; end
class D; attr_accessor :e; end
a = A.new
a.b = b = B.new
b.c = c = C.new
c.d = d = D.new
d.e = ->{}
Ractor.make_shareable a

../../test.rb:12:in 'Ractor.make_shareable': Proc's self is not shareable: #<Proc:0x0000000103731150 ../../test.rb:10 (lambda)> (Ractor::IsolationError)
from block self main
from instance variable @e of #<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>
from instance variable @d of #<C:0x0000000103731300 @d=#<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>>
from instance variable @c of #<B:0x00000001037313f0 @c=#<C:0x0000000103731300 @d=#<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>>>
from instance variable @b of #<A:0x00000001037314b0 @b=#<B:0x00000001037313f0 @c=#<C:0x0000000103731300 @d=#<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>>>>
from ../../test.rb:12:in '<main>'

My first approach always had the object under consideration in addition to the "reference" and it wasn't super readable.
Using this branch I've noticed I usually just need the last line anyway.
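To make the "chain of references" idea discussed here concrete, the following is a minimal pure-Ruby sketch. It is not the PR's C implementation: `reference_chain` is a hypothetical helper, and the caller-supplied block stands in for the VM's decision that an object cannot be made shareable. It only follows instance variables, array elements, and hash entries, and it returns the chain innermost reference first, matching the order of the `from ...` lines in the error output.

```ruby
# Hypothetical helper, not part of Ruby or of this PR. Walks an object graph
# and returns the labels of the references leading to the first object for
# which the block returns true, innermost reference first. Returns nil when
# no such object is reachable.
def reference_chain(obj, seen = {}.compare_by_identity, &culprit)
  return [] if culprit.call(obj)
  return nil if seen[obj] # already visited; also guards against cycles
  seen[obj] = true

  children = []
  obj.instance_variables.each do |name|
    children << ["instance variable #{name}", obj.instance_variable_get(name)]
  end
  case obj
  when Array
    obj.each_with_index { |value, i| children << ["array element #{i}", value] }
  when Hash
    obj.each do |key, value|
      children << ["hash key", key]
      children << ["hash value", value]
    end
    children << ["hash default value", obj.default]
  end

  children.each do |label, child|
    chain = reference_chain(child, seen, &culprit)
    return chain + ["from #{label}"] if chain
  end
  nil
end

class A; attr_accessor :b; end
class B; attr_accessor :c; end
class C; attr_accessor :d; end
class D; attr_accessor :e; end

a = A.new
a.b = b = B.new
b.c = c = C.new
c.d = d = D.new
d.e = -> {}

# Treat any Proc as the offending object, as in the example above.
chain = reference_chain(a) { |o| o.is_a?(Proc) }
puts chain
# from instance variable @e
# from instance variable @d
# from instance variable @c
# from instance variable @b
```

Because only the label of each reference is recorded, the output stays one short line per hop, which is the trade-off argued for in the reply above.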
89a62f7 to 2e55ee7
[Feature #21846] There is a single path through our GC sweeping code, and we always call rb_gc_obj_free_vm_weak_references and rb_gc_obj_free before adding the object back to the freelist. We do this even when the object has no external resources that need to be freed and has no weak references pointing to it.

This commit introduces a conservative fast path through gc_sweep_plane that uses the object flags to identify certain cases where these calls can be skipped. These objects are added straight back to the freelist. Any object for which gc_sweep_fast_path_p returns false will use the current full sweep code (referred to here as the slow path).

Currently there are 2 checks that will _always_ require an object to go down the slow path:

1. Has its object_id been observed and stored in the id2ref_table
2. Has it got generic ivars in the gen_fields table

If neither of these is true, then we run some flag checks on the object and send the following cases down the fast path:

- Objects that are not heap allocated
- Embedded strings that aren't in the fstring table
- Embedded Arrays
- Embedded Hashes
- Embedded Bignums
- Embedded Strings
- Floats, Rationals and Complex
- Various IMEMO subtypes that do no allocation

We've benchmarked this code using ruby-bench as well as the gcbench benchmarks inside Ruby (benchmarks/gc). This patch results in a modest speed improvement on almost all of the headline benchmarks (2% in railsbench with YJIT enabled), and an observable 30% improvement in time spent sweeping during the GC benchmarks:

```
master: ruby 4.1.0dev (2026-01-19T12:03:33Z master 859920d) +YJIT +PRISM [x86_64-linux]
experiment: ruby 4.1.0dev (2026-01-16T21:36:46Z mvh-sweep-fast-pat.. c3ffe37) +YJIT +PRISM [x86_64-linux]

--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
bench           master (ms)  stddev (%)  experiment (ms)  stddev (%)  experiment 1st itr  master/experiment
lobsters        N/A          N/A         N/A              N/A         N/A                 N/A
activerecord    132.5        0.9         132.5            1.0         1.056               1.001
chunky-png      577.2        0.4         580.1            0.4         0.994               0.995
erubi-rails     902.9        0.2         894.3            0.2         1.040               1.010
hexapdf         1763.9       3.3         1760.6           3.7         1.027               1.002
liquid-c        56.9         0.6         56.7             1.4         1.004               1.003
liquid-compile  46.3         2.1         46.1             2.1         1.005               1.004
liquid-render   77.8         0.8         75.1             0.9         1.023               1.036
mail            114.7        0.4         113.0            1.4         1.054               1.015
psych-load      1635.4       1.4         1625.9           0.5         0.988               1.006
railsbench      1685.4       2.4         1650.1           2.0         0.989               1.021
rubocop         133.5        8.1         130.3            7.8         1.002               1.024
ruby-lsp        140.3        1.9         137.5            1.8         1.007               1.020
sequel          64.6         0.7         63.9             0.7         1.003               1.011
shipit          1196.2       4.3         1181.5           4.2         1.003               1.012
--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
Legend:
- experiment 1st itr: ratio of master/experiment time for the first benchmarking iteration.
- master/experiment: ratio of master/experiment time. Higher is better for experiment. Above 1 represents a speedup.
```

```
Benchmark      │  Wall(B)  Sweep(B)  Mark(B)  │  Wall(E)  Sweep(E)  Mark(E)  │  Wall Δ  Sweep Δ
───────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────
null           │   0.000s      1ms      4ms   │   0.000s      1ms      4ms   │     0%       0%
hash1          │   4.330s    875ms     46ms   │   3.960s    531ms     44ms   │  +8.6%   +39.3%
hash2          │   6.356s    243ms    988ms   │   6.298s    176ms    1.03s   │  +0.9%   +27.6%
rdoc           │  37.337s    2.42s    1.09s   │  36.678s    2.11s    1.20s   │  +1.8%   +13.1%
binary_trees   │   3.366s    426ms    252ms   │   3.082s    275ms    239ms   │  +8.4%   +35.4%
ring           │   5.252s     14ms    2.47s   │   5.327s     12ms    2.43s   │  -1.4%   +14.3%
redblack       │   2.966s     28ms     41ms   │   2.940s     21ms     38ms   │  +0.9%   +25.0%
───────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────
Legend: (B) = Baseline, (E) = Experiment, Δ = improvement (positive = faster)
Wall = total wallclock, Sweep = GC sweeping time, Mark = GC marking time
Times are median of 3 runs
```

These results are also borne out when YJIT is disabled:

```
master: ruby 4.1.0dev (2026-01-19T12:03:33Z master 859920d) +PRISM [x86_64-linux]
experiment: ruby 4.1.0dev (2026-01-16T21:36:46Z mvh-sweep-fast-pat.. c3ffe37) +PRISM [x86_64-linux]

--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
bench           master (ms)  stddev (%)  experiment (ms)  stddev (%)  experiment 1st itr  master/experiment
lobsters        N/A          N/A         N/A              N/A         N/A                 N/A
activerecord    389.6        0.3         377.5            0.3         1.032               1.032
chunky-png      1123.4       0.2         1109.2           0.2         1.013               1.013
erubi-rails     1754.3       0.1         1725.7           0.1         1.035               1.017
hexapdf         3346.5       0.9         3326.9           0.7         1.003               1.006
liquid-c        84.0         0.5         83.5             0.5         0.992               1.006
liquid-compile  74.0         1.5         73.5             1.4         1.011               1.008
liquid-render   199.9        0.4         199.6            0.4         1.000               1.002
mail            177.8        0.4         176.4            0.4         1.069               1.008
psych-load      2749.6       0.7         2777.0           0.0         0.980               0.990
railsbench      2983.0       1.0         2965.5           0.8         1.041               1.006
rubocop         228.8        1.0         227.5            1.2         1.015               1.005
ruby-lsp        221.8        0.9         216.1            0.8         1.011               1.026
sequel          89.1         0.5         89.1             1.8         1.005               1.000
shipit          2385.6       1.6         2371.8           1.0         1.002               1.006
--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
Legend:
- experiment 1st itr: ratio of master/experiment time for the first benchmarking iteration.
- master/experiment: ratio of master/experiment time. Higher is better for experiment. Above 1 represents a speedup.
```

```
Benchmark      │  Wall(B)  Sweep(B)  Mark(B)  │  Wall(E)  Sweep(E)  Mark(E)  │  Wall Δ  Sweep Δ
───────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────
null           │   0.000s      1ms      4ms   │   0.000s      1ms      3ms   │     0%       0%
hash1          │   4.349s    877ms     45ms   │   4.045s    532ms     44ms   │  +7.0%   +39.3%
hash2          │   6.575s    235ms    967ms   │   6.540s    181ms    1.04s   │  +0.5%   +23.0%
rdoc           │  45.782s    2.23s    1.14s   │  44.925s    1.90s    1.01s   │  +1.9%   +15.0%
binary_trees   │   6.433s    426ms    252ms   │   6.268s    278ms    240ms   │  +2.6%   +34.7%
ring           │   6.584s     17ms    2.33s   │   6.738s     13ms    2.33s   │  -2.3%   +30.8%
redblack       │  13.334s     31ms     42ms   │  13.296s     24ms    107ms   │  +0.3%   +22.6%
───────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────
Legend: (B) = Baseline, (E) = Experiment, Δ = improvement (positive = faster)
Wall = total wallclock, Sweep = GC sweeping time, Mark = GC marking time
Times are median of 3 runs
```
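The fast-path decision described in this commit message can be modelled roughly as below. This is a toy Ruby model, not the C code: the `Obj` struct, its fields, and `sweep_fast_path?` are hypothetical stand-ins for the VM's object flags and `gc_sweep_fast_path_p`, and the non-heap-allocated and IMEMO cases are left out for brevity.

```ruby
# Toy model only. Field names are assumptions made for illustration; the real
# check reads flags on the object header inside the VM.
Obj = Struct.new(:type, :embedded, :id_observed, :generic_ivars, :fstring,
                 keyword_init: true)

def sweep_fast_path?(obj)
  # The two conditions that always force the slow path.
  return false if obj.id_observed || obj.generic_ivars

  case obj.type
  when :float, :rational, :complex
    true
  when :string
    # Embedded strings qualify unless they live in the fstring table.
    !!obj.embedded && !obj.fstring
  when :array, :hash, :bignum
    !!obj.embedded
  else
    false # anything not recognised takes the full (slow) sweep
  end
end

puts sweep_fast_path?(Obj.new(type: :string, embedded: true))                 # true
puts sweep_fast_path?(Obj.new(type: :string, embedded: true, fstring: true))  # false
puts sweep_fast_path?(Obj.new(type: :array, embedded: true, id_observed: true)) # false
```

The model is deliberately conservative in the same way the commit describes: an object takes the fast path only when a check positively identifies it as safe, and everything else falls through to the slow path.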
It relies too much on VM level concerns, such that it can't be built with modular GC enabled. We'll move it into the VM, and then expose it to the GC implementations so they can use it.
Most compilers will optimise this anyway
Continually locking a mutex m can lead to starvation if all other threads are on the waitq of m. See https://bugs.ruby-lang.org/issues/21840 for more details.

Solution: When a thread `T1` wakes up `T2` during mutex unlock but `T1` or any other thread successfully acquires it before `T2`, then we record the `running_time` of the thread during mutex acquisition. Then during unlock, if that thread's `running_time` is less than the saved running time, we set it back to the saved time.

Fixes [Bug #21840]
We would like to do type matching on the VRegId. Extracting the VRegID from a usize makes the code a bit easier to understand and refactor. MemBase uses a VReg, and there is also a VReg in Opnd. We should be sharing types between these two, so this is a step in the direction of sharing a type
Until we get our global register allocator, we need our HIR to be in 100% block-local SSA. Add a validator to enforce that.
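As a rough illustration of what such a validator checks, here is a small Ruby sketch over a made-up IR. It assumes "block-local SSA" means that every operand must be either a parameter of the current block or the result of an earlier instruction in that same block, and that each value is defined once. `Insn`, `Block`, and `block_local_ssa_violations` are hypothetical names, not ZJIT's actual HIR types.

```ruby
# Hypothetical toy IR: an instruction has an optional destination value and a
# list of operand values; a block has parameters and an instruction list.
Insn = Struct.new(:dest, :operands)
Block = Struct.new(:params, :insns)

def block_local_ssa_violations(blocks)
  errors = []
  blocks.each_with_index do |block, idx|
    # Only block parameters are visible on entry to the block.
    visible = block.params.dup
    block.insns.each do |insn|
      insn.operands.each do |operand|
        next if visible.include?(operand)
        errors << "block #{idx}: #{operand} is not defined in this block"
      end
      next unless insn.dest
      errors << "block #{idx}: #{insn.dest} is defined more than once" if visible.include?(insn.dest)
      visible << insn.dest
    end
  end
  errors
end

ok = [Block.new([:v0], [Insn.new(:v1, [:v0]), Insn.new(nil, [:v1])])]
bad = [
  Block.new([], [Insn.new(:v1, [])]),
  Block.new([], [Insn.new(:v2, [:v1])]) # uses v1 from another block
]

p block_local_ssa_violations(ok)  # []
p block_local_ssa_violations(bad) # ["block 1: v1 is not defined in this block"]
```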
The RDoc link format has changed so these are all broken links.
The RDoc link format has changed so these are all broken links. ruby/net-http@97fe6085c3
- T_BIGNUM may have fields via `#object_id`.
- The T_DATA logic was inverted. If `dfree` is unset we don't need cleanup.
Since `on_sp` is emitted, it doesn't do a whole lot anymore. This leaves one incompatibility for code like `"x#$%"`: Ripper confuses this for bare interpolation with a global, but `$%` is not a valid global name. Still, it emits two string tokens in such a case. It doesn't make sense for prism to work around this bug, so the affected files are added as excludes.

Since the only usage of this method makes sense for testing in prism itself, the method is removed instead of deprecated.

ruby/prism@31be379f98
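For readers unfamiliar with bare interpolation, the short Ruby snippet below shows the runtime behaviour behind the `"x#$%"` case. It only demonstrates how the string evaluates; it does not reproduce Ripper's or prism's token output. The global `$greeting` is an arbitrary name chosen for the example.

```ruby
# `$%` is not a valid global variable name, so `#$%` stays literal text.
literal = "x#$%"

# A valid global, by contrast, is interpolated without braces.
$greeting = "hi"
interpolated = "x#$greeting"

puts literal       # x#$%
puts interpolated  # xhi
```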
f44c160 to b4d2a43
…uby#15982) Don't reset `th->running_time_us` when unlocking from `mutex_free` or force unlocking during thread destruction. Follow-up to 994257a.
Closes: Shopify#862

Add dynamic dispatch for `invokesuperforward` instruction as a first step. Specialization like YJIT’s is not implemented yet and will be handled separately.

## Benchmark

### lobsters

<details>
<summary>before patch</summary>

```
Average of last 10, non-warmup iters: 654ms
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (59.5% of total 15,599,811):
  Hash#fetch: 3,185,110 (20.4%)
  Regexp#match?: 708,802 ( 4.5%)
  Hash#key?: 696,422 ( 4.5%)
  String#sub!: 489,840 ( 3.1%)
  Set#include?: 396,625 ( 2.5%)
  String#<<: 396,279 ( 2.5%)
  String#start_with?: 379,336 ( 2.4%)
  Hash#delete: 325,992 ( 2.1%)
  String.new: 307,248 ( 2.0%)
  Integer#===: 279,054 ( 1.8%)
  Symbol#end_with?: 255,539 ( 1.6%)
  Kernel#is_a?: 246,961 ( 1.6%)
  Process.clock_gettime: 221,588 ( 1.4%)
  Integer#>: 219,718 ( 1.4%)
  String#match?: 218,056 ( 1.4%)
  Integer#<=: 202,617 ( 1.3%)
  Time#to_i: 192,214 ( 1.2%)
  Time#subsec: 189,240 ( 1.2%)
  String#to_sym: 185,593 ( 1.2%)
  String#include?: 182,862 ( 1.2%)
Top-20 calls to C functions from JIT code (83.7% of total 126,406,213):
  rb_vm_opt_send_without_block: 37,054,888 (29.3%)
  rb_vm_send: 10,068,319 ( 8.0%)
  rb_vm_env_write: 8,529,584 ( 6.7%)
  rb_hash_aref: 8,014,188 ( 6.3%)
  rb_zjit_writebarrier_check_immediate: 7,697,828 ( 6.1%)
  rb_vm_getinstancevariable: 5,954,987 ( 4.7%)
  rb_ivar_get_at_no_ractor_check: 4,759,191 ( 3.8%)
  rb_obj_is_kind_of: 3,722,656 ( 2.9%)
  rb_vm_invokesuper: 2,663,433 ( 2.1%)
  rb_hash_aset: 2,416,121 ( 1.9%)
  rb_vm_setinstancevariable: 2,355,463 ( 1.9%)
  rb_vm_opt_getconstant_path: 2,297,784 ( 1.8%)
  Hash#fetch: 1,779,524 ( 1.4%)
  fetch: 1,405,586 ( 1.1%)
  rb_vm_invokeblock: 1,385,970 ( 1.1%)
  rb_str_buf_append: 1,369,178 ( 1.1%)
  rb_ec_ary_new_from_values: 1,336,805 ( 1.1%)
  rb_class_allocate_instance: 1,281,590 ( 1.0%)
  rb_hash_new_with_size: 899,859 ( 0.7%)
  rb_vm_sendforward: 798,572 ( 0.6%)
Top-2 not optimized method types for send (100.0% of total 4,889,764):
  iseq: 4,886,942 (99.9%)
  null: 2,822 ( 0.1%)
Top-3 not optimized
method types for send_without_block (100.0% of total 525,349): optimized_send: 478,875 (91.2%) null: 42,175 ( 8.0%) optimized_block_call: 4,299 ( 0.8%) Top-3 not optimized method types for super (100.0% of total 2,350,295): cfunc: 2,239,567 (95.3%) alias: 107,374 ( 4.6%) attrset: 3,354 ( 0.1%) Top-3 instructions with uncategorized fallback reason (100.0% of total 2,216,938): invokeblock: 1,385,970 (62.5%) sendforward: 798,572 (36.0%) opt_send_without_block: 32,396 ( 1.5%) Top-20 send fallback reasons (99.9% of total 51,971,182): send_without_block_polymorphic: 18,639,354 (35.9%) singleton_class_seen: 9,274,307 (17.8%) send_without_block_no_profiles: 7,217,551 (13.9%) send_not_optimized_method_type: 4,889,764 ( 9.4%) send_no_profiles: 2,882,604 ( 5.5%) super_not_optimized_method_type: 2,350,295 ( 4.5%) uncategorized: 2,216,938 ( 4.3%) one_or_more_complex_arg_pass: 1,543,405 ( 3.0%) send_without_block_megamorphic: 723,037 ( 1.4%) send_polymorphic: 544,570 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 483,174 ( 0.9%) send_without_block_not_optimized_need_permission: 390,366 ( 0.8%) too_many_args_for_lir: 312,568 ( 0.6%) super_complex_args_pass: 111,053 ( 0.2%) super_target_complex_args_pass: 104,723 ( 0.2%) super_polymorphic: 87,851 ( 0.2%) argc_param_mismatch: 50,382 ( 0.1%) send_without_block_not_optimized_method_type: 42,175 ( 0.1%) obj_to_string_not_string: 34,861 ( 0.1%) send_without_block_direct_keyword_mismatch: 32,436 ( 0.1%) Top-4 setivar fallback reasons (100.0% of total 2,355,463): not_monomorphic: 2,132,748 (90.5%) not_t_object: 125,163 ( 5.3%) too_complex: 97,531 ( 4.1%) new_shape_needs_extension: 21 ( 0.0%) Top-2 getivar fallback reasons (100.0% of total 6,080,097): not_monomorphic: 5,808,527 (95.5%) too_complex: 271,570 ( 4.5%) Top-3 definedivar fallback reasons (100.0% of total 405,302): not_monomorphic: 397,150 (98.0%) too_complex: 5,122 ( 1.3%) not_t_object: 3,030 ( 0.7%) Top-6 invokeblock handler (100.0% of total 1,385,970): 
monomorphic_iseq: 688,147 (49.7%) polymorphic: 523,864 (37.8%) monomorphic_other: 106,268 ( 7.7%) monomorphic_ifunc: 55,505 ( 4.0%) megamorphic: 6,762 ( 0.5%) no_profiles: 5,424 ( 0.4%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 1,850,659): param_forwardable: 685,936 (37.1%) param_block: 641,355 (34.7%) param_rest: 327,046 (17.7%) param_kwrest: 120,210 ( 6.5%) caller_kw_splat: 36,147 ( 2.0%) caller_splat: 34,029 ( 1.8%) caller_blockarg: 5,826 ( 0.3%) caller_kwarg: 110 ( 0.0%) Top-1 compile error reasons (100.0% of total 191,769): exception_handler: 191,769 (100.0%) Top-6 unhandled YARV insns (100.0% of total 89,278): invokesuperforward: 81,667 (91.5%) getconstant: 3,318 ( 3.7%) setblockparam: 2,837 ( 3.2%) checkmatch: 929 ( 1.0%) expandarray: 360 ( 0.4%) once: 167 ( 0.2%) Top-3 unhandled HIR insns (100.0% of total 236,976): throw: 198,481 (83.8%) invokebuiltin: 35,774 (15.1%) array_max: 2,721 ( 1.1%) Top-20 side exit reasons (100.0% of total 15,409,202): guard_type_failure: 6,871,609 (44.6%) guard_shape_failure: 6,854,409 (44.5%) block_param_proxy_not_iseq_or_ifunc: 1,008,346 ( 6.5%) unhandled_hir_insn: 236,976 ( 1.5%) compile_error: 191,769 ( 1.2%) unhandled_yarv_insn: 89,278 ( 0.6%) fixnum_mult_overflow: 50,739 ( 0.3%) block_param_proxy_modified: 28,119 ( 0.2%) patchpoint_stable_constant_names: 19,872 ( 0.1%) unhandled_newarray_send_pack: 14,481 ( 0.1%) unhandled_block_arg: 13,787 ( 0.1%) fixnum_lshift_overflow: 10,085 ( 0.1%) patchpoint_no_ep_escape: 7,815 ( 0.1%) expandarray_failure: 4,532 ( 0.0%) guard_super_method_entry: 4,475 ( 0.0%) patchpoint_method_redefined: 1,212 ( 0.0%) patchpoint_no_singleton_class: 1,130 ( 0.0%) obj_to_string_fallback: 275 ( 0.0%) guard_less_failure: 163 ( 0.0%) interrupt: 111 ( 0.0%) send_count: 152,221,918 dynamic_send_count: 51,971,182 (34.1%) optimized_send_count: 100,250,736 (65.9%) dynamic_setivar_count: 2,355,463 ( 1.5%) dynamic_getivar_count: 6,080,097 ( 4.0%) dynamic_definedivar_count: 
405,302 ( 0.3%) iseq_optimized_send_count: 40,162,692 (26.4%) inline_cfunc_optimized_send_count: 40,296,415 (26.5%) inline_iseq_optimized_send_count: 3,344,046 ( 2.2%) non_variadic_cfunc_optimized_send_count: 8,915,909 ( 5.9%) variadic_cfunc_optimized_send_count: 7,531,674 ( 4.9%) compiled_iseq_count: 5,554 failed_iseq_count: 0 compile_time: 1,779ms profile_time: 13ms gc_time: 19ms invalidation_time: 248ms vm_write_pc_count: 133,179,978 vm_write_sp_count: 133,179,978 vm_write_locals_count: 129,160,863 vm_write_stack_count: 129,160,863 vm_write_to_parent_iseq_local_count: 693,262 vm_read_from_parent_iseq_local_count: 14,736,626 guard_type_count: 157,425,618 guard_type_exit_ratio: 4.4% guard_shape_count: 64,005,824 guard_shape_exit_ratio: 10.7% code_region_bytes: 29,147,136 zjit_alloc_bytes: 44,468,338 total_mem_bytes: 73,615,474 side_exit_count: 15,409,202 total_insn_count: 934,468,730 vm_insn_count: 166,726,703 zjit_insn_count: 767,742,027 ratio_in_zjit: 82.2% ``` </details> <details> <summary>after patch</summary> ``` Average of last 10, non-warmup iters: 648ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (59.5% of total 15,571,939): Hash#fetch: 3,185,114 (20.5%) Regexp#match?: 708,795 ( 4.6%) Hash#key?: 696,422 ( 4.5%) String#sub!: 489,841 ( 3.1%) Set#include?: 396,625 ( 2.5%) String#<<: 396,279 ( 2.5%) String#start_with?: 370,465 ( 2.4%) Hash#delete: 325,992 ( 2.1%) String.new: 307,248 ( 2.0%) Integer#===: 277,929 ( 1.8%) Symbol#end_with?: 255,540 ( 1.6%) Kernel#is_a?: 246,961 ( 1.6%) Process.clock_gettime: 221,588 ( 1.4%) Integer#>: 219,718 ( 1.4%) String#match?: 218,057 ( 1.4%) Integer#<=: 202,617 ( 1.3%) Time#to_i: 192,214 ( 1.2%) Time#subsec: 189,240 ( 1.2%) String#to_sym: 185,593 ( 1.2%) String#include?: 182,863 ( 1.2%) Top-20 calls to C functions from JIT code (83.7% of total 126,248,940): rb_vm_opt_send_without_block: 36,875,422 (29.2%) rb_vm_send: 10,068,311 ( 8.0%) rb_vm_env_write: 8,529,572 ( 6.8%) rb_hash_aref: 8,014,184 ( 
6.3%) rb_zjit_writebarrier_check_immediate: 7,697,776 ( 6.1%) rb_vm_getinstancevariable: 5,934,206 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,759,185 ( 3.8%) rb_obj_is_kind_of: 3,745,913 ( 3.0%) rb_vm_invokesuper: 2,663,429 ( 2.1%) rb_hash_aset: 2,416,112 ( 1.9%) rb_vm_setinstancevariable: 2,361,107 ( 1.9%) rb_vm_opt_getconstant_path: 2,294,768 ( 1.8%) Hash#fetch: 1,779,524 ( 1.4%) fetch: 1,405,590 ( 1.1%) rb_vm_invokeblock: 1,385,975 ( 1.1%) rb_str_buf_append: 1,369,179 ( 1.1%) rb_ec_ary_new_from_values: 1,336,806 ( 1.1%) rb_class_allocate_instance: 1,281,533 ( 1.0%) rb_hash_new_with_size: 899,857 ( 0.7%) rb_vm_sendforward: 798,572 ( 0.6%) Top-2 not optimized method types for send (100.0% of total 4,889,758): iseq: 4,886,936 (99.9%) null: 2,822 ( 0.1%) Top-3 not optimized method types for send_without_block (100.0% of total 525,350): optimized_send: 478,875 (91.2%) null: 42,176 ( 8.0%) optimized_block_call: 4,299 ( 0.8%) Top-3 not optimized method types for super (100.0% of total 2,350,289): cfunc: 2,239,565 (95.3%) alias: 107,374 ( 4.6%) attrset: 3,350 ( 0.1%) Top-4 instructions with uncategorized fallback reason (100.0% of total 2,298,609): invokeblock: 1,385,975 (60.3%) sendforward: 798,572 (34.7%) invokesuperforward: 81,666 ( 3.6%) opt_send_without_block: 32,396 ( 1.4%) Top-20 send fallback reasons (99.9% of total 51,873,375): send_without_block_polymorphic: 18,540,291 (35.7%) singleton_class_seen: 9,210,394 (17.8%) send_without_block_no_profiles: 7,202,051 (13.9%) send_not_optimized_method_type: 4,889,758 ( 9.4%) send_no_profiles: 2,882,602 ( 5.6%) super_not_optimized_method_type: 2,350,289 ( 4.5%) uncategorized: 2,298,609 ( 4.4%) one_or_more_complex_arg_pass: 1,543,404 ( 3.0%) send_without_block_megamorphic: 723,037 ( 1.4%) send_polymorphic: 544,570 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 483,174 ( 0.9%) send_without_block_not_optimized_need_permission: 389,384 ( 0.8%) too_many_args_for_lir: 312,568 ( 0.6%) super_complex_args_pass: 
111,054 ( 0.2%) super_target_complex_args_pass: 104,723 ( 0.2%) super_polymorphic: 87,852 ( 0.2%) argc_param_mismatch: 50,382 ( 0.1%) send_without_block_not_optimized_method_type: 42,176 ( 0.1%) obj_to_string_not_string: 34,853 ( 0.1%) send_without_block_direct_keyword_mismatch: 32,436 ( 0.1%) Top-4 setivar fallback reasons (100.0% of total 2,361,107): not_monomorphic: 2,138,392 (90.6%) not_t_object: 125,163 ( 5.3%) too_complex: 97,531 ( 4.1%) new_shape_needs_extension: 21 ( 0.0%) Top-2 getivar fallback reasons (100.0% of total 6,059,319): not_monomorphic: 5,787,746 (95.5%) too_complex: 271,573 ( 4.5%) Top-3 definedivar fallback reasons (100.0% of total 405,302): not_monomorphic: 397,150 (98.0%) too_complex: 5,122 ( 1.3%) not_t_object: 3,030 ( 0.7%) Top-6 invokeblock handler (100.0% of total 1,385,975): monomorphic_iseq: 688,157 (49.7%) polymorphic: 523,861 (37.8%) monomorphic_other: 106,268 ( 7.7%) monomorphic_ifunc: 55,505 ( 4.0%) megamorphic: 6,760 ( 0.5%) no_profiles: 5,424 ( 0.4%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 1,850,658): param_forwardable: 685,941 (37.1%) param_block: 641,355 (34.7%) param_rest: 327,046 (17.7%) param_kwrest: 120,209 ( 6.5%) caller_kw_splat: 36,147 ( 2.0%) caller_splat: 34,029 ( 1.8%) caller_blockarg: 5,821 ( 0.3%) caller_kwarg: 110 ( 0.0%) Top-1 compile error reasons (100.0% of total 191,769): exception_handler: 191,769 (100.0%) Top-5 unhandled YARV insns (100.0% of total 7,611): getconstant: 3,318 (43.6%) setblockparam: 2,837 (37.3%) checkmatch: 929 (12.2%) expandarray: 360 ( 4.7%) once: 167 ( 2.2%) Top-3 unhandled HIR insns (100.0% of total 236,976): throw: 198,481 (83.8%) invokebuiltin: 35,774 (15.1%) array_max: 2,721 ( 1.1%) Top-20 side exit reasons (100.0% of total 15,343,302): guard_type_failure: 6,886,972 (44.9%) guard_shape_failure: 6,854,835 (44.7%) block_param_proxy_not_iseq_or_ifunc: 1,008,346 ( 6.6%) unhandled_hir_insn: 236,976 ( 1.5%) compile_error: 191,769 ( 1.2%) 
fixnum_mult_overflow: 50,739 ( 0.3%) block_param_proxy_modified: 28,119 ( 0.2%) patchpoint_stable_constant_names: 19,858 ( 0.1%) unhandled_newarray_send_pack: 14,481 ( 0.1%) unhandled_block_arg: 13,787 ( 0.1%) fixnum_lshift_overflow: 10,085 ( 0.1%) patchpoint_no_ep_escape: 7,815 ( 0.1%) unhandled_yarv_insn: 7,611 ( 0.0%) expandarray_failure: 4,533 ( 0.0%) guard_super_method_entry: 4,475 ( 0.0%) patchpoint_method_redefined: 1,212 ( 0.0%) patchpoint_no_singleton_class: 1,130 ( 0.0%) obj_to_string_fallback: 275 ( 0.0%) guard_less_failure: 163 ( 0.0%) interrupt: 102 ( 0.0%) send_count: 152,019,764 dynamic_send_count: 51,873,375 (34.1%) optimized_send_count: 100,146,389 (65.9%) dynamic_setivar_count: 2,361,107 ( 1.6%) dynamic_getivar_count: 6,059,319 ( 4.0%) dynamic_definedivar_count: 405,302 ( 0.3%) iseq_optimized_send_count: 40,149,182 (26.4%) inline_cfunc_optimized_send_count: 40,168,875 (26.4%) inline_iseq_optimized_send_count: 3,408,619 ( 2.2%) non_variadic_cfunc_optimized_send_count: 8,896,927 ( 5.9%) variadic_cfunc_optimized_send_count: 7,522,786 ( 4.9%) compiled_iseq_count: 5,554 failed_iseq_count: 0 compile_time: 1,784ms profile_time: 13ms gc_time: 19ms invalidation_time: 261ms vm_write_pc_count: 133,027,580 vm_write_sp_count: 133,027,580 vm_write_locals_count: 129,024,228 vm_write_stack_count: 129,024,228 vm_write_to_parent_iseq_local_count: 693,264 vm_read_from_parent_iseq_local_count: 14,727,716 guard_type_count: 157,500,381 guard_type_exit_ratio: 4.4% guard_shape_count: 64,160,894 guard_shape_exit_ratio: 10.7% code_region_bytes: 29,196,288 zjit_alloc_bytes: 44,686,498 total_mem_bytes: 73,882,786 side_exit_count: 15,343,302 total_insn_count: 934,219,385 vm_insn_count: 167,485,651 zjit_insn_count: 766,733,734 ratio_in_zjit: 82.1% ``` </details> ### rails-bench <details> <summary>before patch</summary> ``` Average of last 10, non-warmup iters: 1146ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (52.4% of total 38,306,776): 
Hash#key?: 3,141,619 ( 8.2%) Regexp#match?: 2,420,225 ( 6.3%) Hash#fetch: 2,245,557 ( 5.9%) Integer#===: 1,098,163 ( 2.9%) Hash#delete: 1,014,375 ( 2.6%) Array#any?: 1,007,766 ( 2.6%) String.new: 1,004,713 ( 2.6%) String#b: 797,913 ( 2.1%) String#to_sym: 680,943 ( 1.8%) Array#all?: 650,132 ( 1.7%) Fiber.current: 649,003 ( 1.7%) Array#join: 641,038 ( 1.7%) Array#include?: 613,837 ( 1.6%) Kernel#Array: 610,311 ( 1.6%) String#<<: 606,240 ( 1.6%) Symbol#end_with?: 598,807 ( 1.6%) String#force_encoding: 593,535 ( 1.5%) Kernel#dup: 580,051 ( 1.5%) Array#[]: 562,360 ( 1.5%) Kernel#respond_to?: 550,441 ( 1.4%) Top-20 calls to C functions from JIT code (75.5% of total 262,197,810): rb_vm_opt_send_without_block: 54,534,682 (20.8%) rb_hash_aref: 22,920,285 ( 8.7%) rb_vm_env_write: 19,385,633 ( 7.4%) rb_vm_send: 17,070,477 ( 6.5%) rb_zjit_writebarrier_check_immediate: 13,780,973 ( 5.3%) rb_vm_getinstancevariable: 12,379,513 ( 4.7%) rb_ivar_get_at_no_ractor_check: 12,156,906 ( 4.6%) rb_vm_invokesuper: 8,086,665 ( 3.1%) rb_hash_aset: 5,043,536 ( 1.9%) rb_obj_is_kind_of: 4,431,123 ( 1.7%) rb_vm_invokeblock: 4,036,483 ( 1.5%) Hash#key?: 3,141,619 ( 1.2%) rb_vm_opt_getconstant_path: 3,053,319 ( 1.2%) rb_class_allocate_instance: 2,878,526 ( 1.1%) rb_hash_new_with_size: 2,823,745 ( 1.1%) rb_ec_ary_new_from_values: 2,585,553 ( 1.0%) rb_str_concat_literals: 2,450,764 ( 0.9%) Regexp#match?: 2,420,225 ( 0.9%) rb_obj_alloc: 2,419,171 ( 0.9%) rb_vm_setinstancevariable: 2,357,067 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 8,550,760): iseq: 8,518,289 (99.6%) optimized: 32,471 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 789,641): optimized_send: 606,885 (76.9%) null: 182,756 (23.1%) Top-2 not optimized method types for super (100.0% of total 6,689,859): cfunc: 6,640,180 (99.3%) attrset: 49,679 ( 0.7%) Top-3 instructions with uncategorized fallback reason (100.0% of total 5,962,039): invokeblock: 4,036,483 (67.7%) sendforward: 
1,871,601 (31.4%) opt_send_without_block: 53,955 ( 0.9%) Top-20 send fallback reasons (100.0% of total 85,599,908): send_without_block_polymorphic: 31,804,276 (37.2%) send_without_block_no_profiles: 13,349,825 (15.6%) send_not_optimized_method_type: 8,550,760 (10.0%) super_not_optimized_method_type: 6,689,859 ( 7.8%) uncategorized: 5,962,039 ( 7.0%) send_no_profiles: 5,200,278 ( 6.1%) one_or_more_complex_arg_pass: 4,198,502 ( 4.9%) send_polymorphic: 3,318,658 ( 3.9%) send_without_block_not_optimized_need_permission: 1,274,177 ( 1.5%) too_many_args_for_lir: 1,139,487 ( 1.3%) singleton_class_seen: 1,101,973 ( 1.3%) super_complex_args_pass: 829,842 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 606,885 ( 0.7%) send_without_block_megamorphic: 565,874 ( 0.7%) super_target_complex_args_pass: 414,600 ( 0.5%) send_without_block_not_optimized_method_type: 182,756 ( 0.2%) obj_to_string_not_string: 158,141 ( 0.2%) super_call_with_block: 100,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 99,588 ( 0.1%) super_polymorphic: 52,360 ( 0.1%) Top-2 setivar fallback reasons (100.0% of total 2,357,067): not_monomorphic: 2,255,283 (95.7%) not_t_object: 101,784 ( 4.3%) Top-1 getivar fallback reasons (100.0% of total 12,379,538): not_monomorphic: 12,379,538 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 350,548): not_monomorphic: 350,461 (100.0%) not_t_object: 87 ( 0.0%) Top-6 invokeblock handler (100.0% of total 4,036,483): monomorphic_iseq: 2,189,057 (54.2%) polymorphic: 1,207,002 (29.9%) monomorphic_other: 334,248 ( 8.3%) monomorphic_ifunc: 221,225 ( 5.5%) megamorphic: 84,439 ( 2.1%) no_profiles: 512 ( 0.0%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 5,212,154): param_forwardable: 1,824,953 (35.0%) param_block: 1,792,214 (34.4%) param_rest: 861,894 (16.5%) caller_splat: 283,669 ( 5.4%) caller_kw_splat: 248,291 ( 4.8%) param_kwrest: 200,208 ( 3.8%) caller_blockarg: 752 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 
compile error reasons (100.0% of total 391,562): exception_handler: 391,562 (100.0%) Top-6 unhandled YARV insns (100.0% of total 1,000,531): invokesuperforward: 498,993 (49.9%) getconstant: 400,945 (40.1%) expandarray: 49,985 ( 5.0%) setblockparam: 49,972 ( 5.0%) checkmatch: 480 ( 0.0%) once: 156 ( 0.0%) Top-2 unhandled HIR insns (100.0% of total 268,151): throw: 232,560 (86.7%) invokebuiltin: 35,591 (13.3%) Top-19 side exit reasons (100.0% of total 8,709,784): guard_shape_failure: 2,497,335 (28.7%) block_param_proxy_not_iseq_or_ifunc: 1,988,408 (22.8%) guard_type_failure: 1,722,007 (19.8%) unhandled_yarv_insn: 1,000,531 (11.5%) compile_error: 391,562 ( 4.5%) unhandled_newarray_send_pack: 298,017 ( 3.4%) unhandled_hir_insn: 268,151 ( 3.1%) patchpoint_method_redefined: 200,632 ( 2.3%) unhandled_block_arg: 151,295 ( 1.7%) block_param_proxy_modified: 124,245 ( 1.4%) guard_less_failure: 50,126 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.1%) patchpoint_stable_constant_names: 6,350 ( 0.1%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 405 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 42 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 329,199,237 dynamic_send_count: 85,599,908 (26.0%) optimized_send_count: 243,599,329 (74.0%) dynamic_setivar_count: 2,357,067 ( 0.7%) dynamic_getivar_count: 12,379,538 ( 3.8%) dynamic_definedivar_count: 350,548 ( 0.1%) iseq_optimized_send_count: 93,946,576 (28.5%) inline_cfunc_optimized_send_count: 97,478,983 (29.6%) inline_iseq_optimized_send_count: 9,138,886 ( 2.8%) non_variadic_cfunc_optimized_send_count: 25,367,116 ( 7.7%) variadic_cfunc_optimized_send_count: 17,667,768 ( 5.4%) compiled_iseq_count: 2,888 failed_iseq_count: 0 compile_time: 876ms profile_time: 28ms gc_time: 6ms invalidation_time: 8ms vm_write_pc_count: 287,051,837 vm_write_sp_count: 287,051,837 vm_write_locals_count: 273,948,883 vm_write_stack_count: 273,948,883 vm_write_to_parent_iseq_local_count: 1,079,877 
vm_read_from_parent_iseq_local_count: 30,814,984 guard_type_count: 310,888,965 guard_type_exit_ratio: 0.6% guard_shape_count: 108,669,058 guard_shape_exit_ratio: 2.3% code_region_bytes: 14,352,384 zjit_alloc_bytes: 18,992,674 total_mem_bytes: 33,345,058 side_exit_count: 8,709,784 total_insn_count: 1,705,856,454 vm_insn_count: 122,246,885 zjit_insn_count: 1,583,609,569 ratio_in_zjit: 92.8% ``` </details> <details> <summary>after patch</summary> ``` Average of last 10, non-warmup iters: 1072ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (52.5% of total 38,239,504): Hash#key?: 3,141,619 ( 8.2%) Regexp#match?: 2,420,215 ( 6.3%) Hash#fetch: 2,245,557 ( 5.9%) Integer#===: 1,097,515 ( 2.9%) Hash#delete: 1,014,375 ( 2.7%) Array#any?: 1,007,756 ( 2.6%) String.new: 1,004,713 ( 2.6%) String#b: 797,913 ( 2.1%) String#to_sym: 680,943 ( 1.8%) Array#all?: 650,132 ( 1.7%) Fiber.current: 649,003 ( 1.7%) Array#join: 641,038 ( 1.7%) Array#include?: 613,837 ( 1.6%) Kernel#Array: 610,311 ( 1.6%) String#<<: 606,240 ( 1.6%) Symbol#end_with?: 598,807 ( 1.6%) String#force_encoding: 593,535 ( 1.6%) Kernel#dup: 580,051 ( 1.5%) Array#[]: 562,360 ( 1.5%) Kernel#respond_to?: 550,441 ( 1.4%) Top-20 calls to C functions from JIT code (75.4% of total 262,218,592): rb_vm_opt_send_without_block: 54,249,429 (20.7%) rb_hash_aref: 22,920,271 ( 8.7%) rb_vm_env_write: 19,385,609 ( 7.4%) rb_vm_send: 17,070,463 ( 6.5%) rb_zjit_writebarrier_check_immediate: 13,780,893 ( 5.3%) rb_vm_getinstancevariable: 12,322,924 ( 4.7%) rb_ivar_get_at_no_ractor_check: 12,156,898 ( 4.6%) rb_vm_invokesuper: 8,086,659 ( 3.1%) rb_hash_aset: 5,043,532 ( 1.9%) rb_obj_is_kind_of: 4,474,826 ( 1.7%) rb_vm_invokeblock: 4,036,471 ( 1.5%) Hash#key?: 3,141,619 ( 1.2%) rb_vm_opt_getconstant_path: 3,053,286 ( 1.2%) rb_class_allocate_instance: 2,878,505 ( 1.1%) rb_hash_new_with_size: 2,823,748 ( 1.1%) rb_ec_ary_new_from_values: 2,585,561 ( 1.0%) rb_str_concat_literals: 2,450,756 ( 0.9%) Regexp#match?: 
2,420,215 ( 0.9%) rb_obj_alloc: 2,419,146 ( 0.9%) rb_vm_setinstancevariable: 2,357,065 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 8,550,755): iseq: 8,518,284 (99.6%) optimized: 32,471 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 789,641): optimized_send: 606,885 (76.9%) null: 182,756 (23.1%) Top-2 not optimized method types for super (100.0% of total 6,689,853): cfunc: 6,640,178 (99.3%) attrset: 49,675 ( 0.7%) Top-4 instructions with uncategorized fallback reason (100.0% of total 6,461,020): invokeblock: 4,036,471 (62.5%) sendforward: 1,871,601 (29.0%) invokesuperforward: 498,993 ( 7.7%) opt_send_without_block: 53,955 ( 0.8%) Top-20 send fallback reasons (100.0% of total 85,813,616): send_without_block_polymorphic: 31,519,543 (36.7%) send_without_block_no_profiles: 13,349,751 (15.6%) send_not_optimized_method_type: 8,550,755 (10.0%) super_not_optimized_method_type: 6,689,853 ( 7.8%) uncategorized: 6,461,020 ( 7.5%) send_no_profiles: 5,200,273 ( 6.1%) one_or_more_complex_arg_pass: 4,198,498 ( 4.9%) send_polymorphic: 3,318,658 ( 3.9%) send_without_block_not_optimized_need_permission: 1,273,739 ( 1.5%) too_many_args_for_lir: 1,139,487 ( 1.3%) singleton_class_seen: 1,101,973 ( 1.3%) super_complex_args_pass: 829,842 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 606,885 ( 0.7%) send_without_block_megamorphic: 565,874 ( 0.7%) super_target_complex_args_pass: 414,600 ( 0.5%) send_without_block_not_optimized_method_type: 182,756 ( 0.2%) obj_to_string_not_string: 158,133 ( 0.2%) super_call_with_block: 100,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 99,588 ( 0.1%) super_polymorphic: 52,360 ( 0.1%) Top-2 setivar fallback reasons (100.0% of total 2,357,065): not_monomorphic: 2,255,281 (95.7%) not_t_object: 101,784 ( 4.3%) Top-1 getivar fallback reasons (100.0% of total 12,322,949): not_monomorphic: 12,322,949 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 350,548): 
not_monomorphic: 350,461 (100.0%) not_t_object: 87 ( 0.0%) Top-6 invokeblock handler (100.0% of total 4,036,471): monomorphic_iseq: 2,189,045 (54.2%) polymorphic: 1,207,002 (29.9%) monomorphic_other: 334,248 ( 8.3%) monomorphic_ifunc: 221,225 ( 5.5%) megamorphic: 84,439 ( 2.1%) no_profiles: 512 ( 0.0%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 5,212,150): param_forwardable: 1,824,953 (35.0%) param_block: 1,792,214 (34.4%) param_rest: 861,894 (16.5%) caller_splat: 283,669 ( 5.4%) caller_kw_splat: 248,291 ( 4.8%) param_kwrest: 200,208 ( 3.8%) caller_blockarg: 748 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 compile error reasons (100.0% of total 391,562): exception_handler: 391,562 (100.0%) Top-5 unhandled YARV insns (100.0% of total 501,538): getconstant: 400,945 (79.9%) expandarray: 49,985 (10.0%) setblockparam: 49,972 (10.0%) checkmatch: 480 ( 0.1%) once: 156 ( 0.0%) Top-2 unhandled HIR insns (100.0% of total 268,152): throw: 232,560 (86.7%) invokebuiltin: 35,592 (13.3%) Top-19 side exit reasons (100.0% of total 8,210,699): guard_shape_failure: 2,497,552 (30.4%) block_param_proxy_not_iseq_or_ifunc: 1,988,408 (24.2%) guard_type_failure: 1,721,809 (21.0%) unhandled_yarv_insn: 501,538 ( 6.1%) compile_error: 391,562 ( 4.8%) unhandled_newarray_send_pack: 298,017 ( 3.6%) unhandled_hir_insn: 268,152 ( 3.3%) patchpoint_method_redefined: 200,632 ( 2.4%) unhandled_block_arg: 151,295 ( 1.8%) block_param_proxy_modified: 124,245 ( 1.5%) guard_less_failure: 50,033 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.1%) patchpoint_stable_constant_names: 6,342 ( 0.1%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 405 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 31 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 328,805,013 dynamic_send_count: 85,813,616 (26.1%) optimized_send_count: 242,991,397 (73.9%) dynamic_setivar_count: 2,357,065 ( 0.7%) dynamic_getivar_count: 12,322,949 ( 3.7%) 
dynamic_definedivar_count: 350,548 ( 0.1%) iseq_optimized_send_count: 93,990,621 (28.6%) inline_cfunc_optimized_send_count: 96,851,696 (29.5%) inline_iseq_optimized_send_count: 9,181,467 ( 2.8%) non_variadic_cfunc_optimized_send_count: 25,304,458 ( 7.7%) variadic_cfunc_optimized_send_count: 17,663,155 ( 5.4%) compiled_iseq_count: 2,886 failed_iseq_count: 0 compile_time: 875ms profile_time: 27ms gc_time: 66ms invalidation_time: 9ms vm_write_pc_count: 287,186,308 vm_write_sp_count: 287,186,308 vm_write_locals_count: 274,139,228 vm_write_stack_count: 274,139,228 vm_write_to_parent_iseq_local_count: 1,079,877 vm_read_from_parent_iseq_local_count: 30,810,378 guard_type_count: 310,644,961 guard_type_exit_ratio: 0.6% guard_shape_count: 109,072,242 guard_shape_exit_ratio: 2.3% code_region_bytes: 14,352,384 zjit_alloc_bytes: 19,186,174 total_mem_bytes: 33,538,558 side_exit_count: 8,210,699 total_insn_count: 1,705,193,555 vm_insn_count: 123,691,343 zjit_insn_count: 1,581,502,212 ratio_in_zjit: 92.7% ``` </details>
Also, include the column in here. Hopefully we can do some additional optimizations later. ruby/prism@7759acdd26
OpenBSD is advertising to the preprocessor that it supports C11 but does not include the stdalign.h header. We do not actually need the header, since we can just use the keywords. ruby/prism@b3e2708fff
Instead of having custom classes, use arrays and track which tokens we should ignore the state for in the test. ruby/prism@a333b56ada
e399338 to
c8cce99
Compare
…5877)" (ruby#15990)

This reverts commit 994257a.

I saw some failures in CI that are probably related to the change. Example:

```
  1) Failure:
TestMonitor#test_timedwait [/Users/runner/work/ruby/ruby/src/test/monitor/test_monitor.rb:282]:
```

This starvation problem has not been an issue in real apps afaik, so for now it's best to revert it and think of a better solution.
~~Two commits:~~
### zjit: Optimize send-with-block to iseq methods
This commit enables JIT-to-JIT calls for send instructions with literal blocks (e.g., `foo { |x| x * 2 }`), rather than falling back to the `rb_vm_send` C wrapper. This optimization applies to both methods with explicit block parameters (`def foo(&block)`) and methods implicitly accepting a block (`def foo; yield if block_given?; end`).
Prior to this change, any callsite with a block would miss the JIT-to-JIT fast path and go through an `rb_vm_send` C wrapper call.
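For reference, here is a minimal sketch of the two callee shapes described above, each called with a literal block. The method names `call_explicit` and `call_implicit` are hypothetical and only illustrate the callsite pattern; whether a given callsite is actually specialized depends on the JIT's profiling and is not observable from this code.

```ruby
# Hypothetical callee with an explicit block parameter.
def call_explicit(value, &block)
  block.call(value)
end

# Hypothetical callee that accepts a block implicitly and yields to it.
def call_implicit(value)
  return value unless block_given?
  yield value
end

# Both callsites pass a literal block, which is the shape this commit targets.
explicit_result = call_explicit(21) { |x| x * 2 }
implicit_result = call_implicit(21) { |x| x * 2 }

# Without a block, the implicit variant just returns its argument.
no_block_result = call_implicit(21)

puts explicit_result, implicit_result, no_block_result
```

The behavior is the same with or without the optimization; only the call path taken by JIT-compiled code changes.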
Initially, as Shopify#931 suggested, we assumed this would involve changes to the JIT-to-JIT calling convention to accommodate passing a block argument. However, during implementation, I discovered that @nirvdrum had already wired up the `specval` protocol used by the interpreter in their `invokesuper` work (ruby#887). That infrastructure remained dormant but was exactly what we needed here. After plumbing everything through, it Just Worked™.
It may be possible to design a more direct JIT-to-JIT protocol for passing blocks. In the HIR for `def foo(&block)`, the BB for the JIT entrypoint already takes two arguments (self + &block, presumably), and since `yield` is a keyword, it may be possible to rewrite the implicit case to be explicit (thanks @tenderlove for the idea), and do "better" than passing via `specval`.
I'm not sure if that's a goal eventually, but in any case, if `specval` works, there is no harm in enabling this optimization today.
Implementation notes:
This initial pass largely duplicates the existing `SendWithoutBlock` to `SendWithoutBlockDirect` specialization logic. A future refactor could potentially collapse Send and SendWithoutBlock into a single instruction variant (with `blockiseq: Option<IseqPtr>`, you can always pattern match the Option if needed), since they now follow very similar paths.
However, I wanted to keep this PR focused and also get feedback on that direction first before committing to such a big refactor.
The optimization currently handles `VM_METHOD_TYPE_ISEQ` only. It does not handle "block to block" `VM_METHOD_TYPE_BMETHOD`. It's unclear whether that would be difficult; I just didn't try. Happy to do it as a follow-up.
Any callsites not handled by this specialization continue to fall through safely to the existing `rb_vm_send` harness.
Test coverage includes both explicit block parameters and yield-based methods.
Thanks to @tenderlove for initial ideas and walkthrough, and @nirvdrum for the foundation this builds on.
Closes Shopify#931
### ~~zjit: Allow SendWithoutBlockDirect to def f(&blk)~~
Saving this for another time
### Follow-ups
* [ ] Refactor and simplify by merging `hir::Insn::Send` and `hir::Insn::SendWithoutBlock`. (Pending feedback/approval.)
* [ ] Handle block-to-block `VM_METHOD_TYPE_BMETHOD` calls. It appears that `gen_send_iseq_direct` already attempted to handle it.
* [ ] As far as I can tell, we should be able to just enable `super { ... }` too. Happy to do that as a follow-up if @nirvdrum doesn't have time for it.
This clamps to supported versions based on the current Ruby version. ruby/prism@eb63748e8b
This improves the `URI.open` method documentation by adding a basic usage example that requires `open-uri`.
When I first read the current documentation, I didn't realize that `open-uri` had to be required before calling the method.
I believe the improved version could be more helpful for new users.
```sh-session
$ ruby -r uri -e 'p URI.open("http://example.com")'
-e:1:in '<main>': private method 'open' called for module URI (NoMethodError)
```
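For contrast with the failing session above, a minimal sketch of the working form. It only checks that `URI.open` becomes publicly callable once `open-uri` is loaded; the actual fetch is left commented out because it assumes network access.

```ruby
require "open-uri" # defines the public URI.open singleton method

# With only `require "uri"`, URI.open resolves to the private Kernel#open,
# which is what produces the NoMethodError shown above.
open_is_public = URI.respond_to?(:open)
puts open_is_public

# Assumes network access, so it is not executed here:
# URI.open("http://example.com") { |io| puts io.read.bytesize }
```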
Ref https://docs.ruby-lang.org/en/master/URI.html#method-c-open
Also, this improves formatting with code fonts for better readability.
ruby/open-uri@f4400edc27
* Enable double-quoted options with an `=` sign.
* Replace `$` with `$$` in the batch file without CPP.
* Support for `--with-destdir`.
* Allow Makefile macro definition. (Close rubyGH-15935)

Also handle paths containing spaces when splitting the `--with-opt-dir` argument.
Previously, Visual C++ had only one toolchain for the x86 family, and the only option was to select the target processor level. In recent versions, there are multiple toolchains with the same command name for each host/target platform combination, so it is no longer possible to select the target with a command-line option. Also, configure.bat assumes that the toolchain has been configured before it is executed, so selecting it from this batch file is meaningless. Therefore, the only possible check is whether the specified target and compiler match.
Avoids an issue where `%undefined:A=B%` expands to a literal `A=B` because the parser fails to find the variable before the colon, then parses the following percent as the next variable expansion. Added a definition check to ensure safe expansion.
c8cce99 to
47d9a53
Compare
When an object fails to be made shareable with `Ractor.make_shareable` or when an unshareable object is accessed through module constants or module instance variables, the error message now includes the chain of references that leads to the unshareable value.
47d9a53 to
ee0ecd3
Compare
Improve the messages of exceptions raised by the Ractor implementation.
When an object fails to be made shareable with `Ractor.make_shareable` or when an unshareable object is accessed through module constants or module instance variables, the error message now includes the chain of references that leads to the unshareable value.
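
As an illustration, here is a minimal sketch of a nested object that cannot be made shareable. The `Holder` class and its `@config` instance variable are hypothetical. The exact message text depends on the Ruby build, so the sketch only prints it; with this change applied, the message is expected to also list the references leading to the Proc.

```ruby
# Hypothetical class used only for illustration.
class Holder
  def initialize(callback)
    @config = { callback: callback }
  end
end

# A Proc created at the top level captures `main` as its self, and `main`
# is not shareable, so the Proc cannot be made shareable.
unshareable = proc { :called }

error = nil
begin
  # Traverses Holder -> @config -> Hash value -> Proc and fails at the Proc.
  Ractor.make_shareable(Holder.new(unshareable))
rescue Ractor::Error => e
  error = e
end

puts error.class
puts error.message
```

Rescuing `Ractor::Error` also covers `Ractor::IsolationError`, which is a subclass.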