Conversation
662fb61 to abfaa2a
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough: Adds a required numeric `size` parameter to the object-mapping read path.
Sequence Diagram(s): mermaid (collapsed)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
abfaa2a to bb8c7d1
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/server/object_services/object_server.js`:
- Around line 694-698: Validate req.rpc_params.size before building the object
descriptor: in the block that calls load_bucket(req) and constructs obj (the
line creating obj = { _id: obj_id, size }), ensure size is a finite,
non-negative number (e.g. typeof size === 'number' && isFinite(size) && size >=
0); if it is missing/invalid, fail fast by returning an RPC error/logging and do
not construct/pass an object with undefined/NaN size to sanitize_object_range,
or explicitly coerce to a safe default only after clearly documenting that
behavior. This touches the code around get_obj_id, load_bucket, and the obj
construction where sanitize_object_range later uses obj.size.
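The validation the comment asks for can be sketched as a standalone guard. This is only an illustration: the `validate_object_size` name is made up here, and `RpcError` is modeled as a plain `Error` subclass rather than the actual NooBaa class.

```javascript
// Illustrative stand-in for NooBaa's RpcError.
class RpcError extends Error {
    constructor(rpc_code, message) {
        super(message);
        this.rpc_code = rpc_code;
    }
}

// Reject anything that is not a finite, non-negative number before the
// value can reach sanitize_object_range via obj.size.
function validate_object_size(size) {
    if (typeof size !== 'number' || !Number.isFinite(size) || size < 0) {
        throw new RpcError('BAD_REQUEST', `invalid object size: ${size}`);
    }
    return size;
}
```

Failing fast here keeps an `undefined`/`NaN` size from silently propagating into range arithmetic downstream.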
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 86025c67-302d-49b1-9b53-d704b4a762fb
📒 Files selected for processing (9)
- src/api/object_api.js
- src/sdk/map_client.js
- src/sdk/namespace_cache.js
- src/server/object_services/map_reader.js
- src/server/object_services/md_store.js
- src/server/object_services/object_server.js
- src/test/integration_tests/api/s3/test_s3_bucket_policy.js
- src/test/integration_tests/internal/test_map_builder.js
- src/test/utils/object_api_functions.js
bb8c7d1 to 21db7c2
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/server/object_services/map_reader.js`:
- Around line 26-38: The code reuses joined chunk payloads from
MDStore.instance().find_parts_chunks_blocks_by_range which returns one row per
(part, block), causing duplicated entries in each chunk's frag.blocks when
multiple parts reference the same chunk; update the construction of ChunkDB
instances (the chunks creation step that maps parts to new ChunkDB) to group
rows by chunk id and deduplicate the frag.blocks for each chunk before calling
map_server.prepare_chunks; specifically, aggregate parts/blocks from parts and
chunks_db into a single ChunkDB per unique chunk id (used in the chunks array)
ensuring frag.blocks contains only unique block identifiers.
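The grouping and deduplication step described above can be sketched in isolation. The row and chunk shapes here are simplified stand-ins for the real ChunkDB structures, not the actual NooBaa types.

```javascript
// Group one-row-per-(part, block) query results by chunk id and
// deduplicate each frag's blocks, as the comment suggests.
function group_chunks_dedup_blocks(rows) {
    const chunks_by_id = new Map();
    for (const row of rows) {
        let chunk = chunks_by_id.get(row.chunk_id);
        if (!chunk) {
            chunk = { _id: row.chunk_id, frags: new Map() };
            chunks_by_id.set(row.chunk_id, chunk);
        }
        let frag = chunk.frags.get(row.frag_id);
        if (!frag) {
            frag = { _id: row.frag_id, blocks: new Map() };
            chunk.frags.set(row.frag_id, frag);
        }
        // A Map keyed by block id keeps only one copy of each block,
        // even when several parts reference the same chunk.
        frag.blocks.set(row.block._id, row.block);
    }
    // Flatten the Maps back into plain arrays.
    return [...chunks_by_id.values()].map(chunk => ({
        _id: chunk._id,
        frags: [...chunk.frags.values()].map(frag => ({
            _id: frag._id,
            blocks: [...frag.blocks.values()],
        })),
    }));
}
```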
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3d55dcb1-9f0b-4c8b-bb5f-b0e2fa4c8f74
📒 Files selected for processing (9)
- src/api/object_api.js
- src/sdk/map_client.js
- src/sdk/namespace_cache.js
- src/server/object_services/map_reader.js
- src/server/object_services/md_store.js
- src/server/object_services/object_server.js
- src/test/integration_tests/api/s3/test_s3_bucket_policy.js
- src/test/integration_tests/internal/test_map_builder.js
- src/test/utils/object_api_functions.js
✅ Files skipped from review due to trivial changes (4)
- src/test/integration_tests/internal/test_map_builder.js
- src/sdk/map_client.js
- src/sdk/namespace_cache.js
- src/test/integration_tests/api/s3/test_s3_bucket_policy.js
🚧 Files skipped from review as they are similar to previous changes (4)
- src/test/utils/object_api_functions.js
- src/api/object_api.js
- src/server/object_services/md_store.js
- src/server/object_services/object_server.js
21db7c2 to 6fdc255
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/sdk/map_client.js (1)
389-397: Fail fast when `object_md.size` is missing.
`read_object_mapping()` now depends on `size`, but `MapClient` still accepts `Partial<nb.ObjectInfo>` and the read-side construction in `src/sdk/object_io.js:700-710` forwards `params.object_md` as-is. A local guard here will turn an RPC schema failure into an immediate, easier-to-debug error.
Suggested guard:
```diff
 async read_object_mapping() {
+    if (typeof this.object_md?.size !== 'number') {
+        throw new Error('MapClient.read_object_mapping requires object_md.size');
+    }
     const res = await this.rpc_client.object.read_object_mapping({
         obj_id: this.object_md.obj_id,
         bucket: this.object_md.bucket,
         key: this.object_md.key,
         size: this.object_md.size,
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/sdk/map_client.js` around lines 389 - 397, Add a fast-fail guard in MapClient before calling rpc_client.object.read_object_mapping: verify that this.object_md and this.object_md.size are present (non-null/undefined) and throw a clear, synchronous Error if size is missing so the RPC isn't invoked with an invalid schema; update the call site that uses read_start, read_end, location_info and read_object_mapping to rely on this check (identify symbols: MapClient, this.object_md, read_object_mapping, read_start, read_end, location_info) so callers that pass Partial<nb.ObjectInfo> get an immediate, debuggable failure instead of an RPC schema error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/server/object_services/md_store.js`:
- Around line 1922-1934: The current SQL joins blocks directly and causes
duplicate part/chunk rows; refactor the query inside the method that builds this
SQL (using this._parts.name, this._chunks.name, this._blocks.name and the
p.data->>'chunk' / c.data->>'_id' join) so blocks are aggregated per chunk first
(e.g. a subquery on ${this._blocks.name} grouping by chunk id and using
jsonb_agg(blocks.data) / jsonb_agg(...) AS blocks) and then join that
aggregated-blocks subquery to chunks and parts; preserve all existing WHERE
filters and the ORDER BY on (p.data->>'start')::bigint ASC and ensure the joined
chunk payload uses the aggregated blocks field instead of producing one row per
block.
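The aggregated-query shape the comment suggests might look roughly like the sketch below. The table names come in as parameters standing in for the real `this._parts.name` / `this._chunks.name` / `this._blocks.name` interpolation, and the exact join condition (a direct `_id` column on the chunks table, cast to text) is an assumption taken from the review's advice, not confirmed schema.

```javascript
// Hypothetical sketch of the aggregated-blocks SQL suggested above.
function build_parts_chunks_blocks_sql({ parts, chunks, blocks }) {
    return `
        SELECT p.data AS part_data, c.data AS chunk_data, b.blocks AS blocks
        FROM ${parts} p
        JOIN ${chunks} c ON c._id::text = p.data->>'chunk'
        LEFT JOIN (
            -- Aggregate blocks per chunk first, so the outer join stays
            -- one row per (part, chunk) instead of one row per (part, block).
            SELECT bl.data->>'chunk' AS chunk_id, jsonb_agg(bl.data) AS blocks
            FROM ${blocks} bl
            GROUP BY bl.data->>'chunk'
        ) b ON b.chunk_id = c._id::text
        ORDER BY (p.data->>'start')::bigint ASC`;
}
```

With this shape, `decode_json()` parses each `part_data`/`chunk_data` payload once, and the blocks arrive pre-grouped as a JSON array per chunk.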
---
Nitpick comments:
In `@src/sdk/map_client.js`:
- Around line 389-397: Add a fast-fail guard in MapClient before calling
rpc_client.object.read_object_mapping: verify that this.object_md and
this.object_md.size are present (non-null/undefined) and throw a clear,
synchronous Error if size is missing so the RPC isn't invoked with an invalid
schema; update the call site that uses read_start, read_end, location_info and
read_object_mapping to rely on this check (identify symbols: MapClient,
this.object_md, read_object_mapping, read_start, read_end, location_info) so
callers that pass Partial<nb.ObjectInfo> get an immediate, debuggable failure
instead of an RPC schema error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ee451efa-bbe9-4542-8b76-7dd83a32a155
📒 Files selected for processing (9)
- src/api/object_api.js
- src/sdk/map_client.js
- src/sdk/namespace_cache.js
- src/server/object_services/map_reader.js
- src/server/object_services/md_store.js
- src/server/object_services/object_server.js
- src/test/integration_tests/api/s3/test_s3_bucket_policy.js
- src/test/integration_tests/internal/test_map_builder.js
- src/test/utils/object_api_functions.js
✅ Files skipped from review due to trivial changes (1)
- src/sdk/namespace_cache.js
🚧 Files skipped from review as they are similar to previous changes (3)
- src/test/integration_tests/internal/test_map_builder.js
- src/test/utils/object_api_functions.js
- src/server/object_services/object_server.js
6fdc255 to 1eec208
♻️ Duplicate comments (1)
src/server/object_services/map_reader.js (1)
33-37: ⚠️ Potential issue | 🟠 Major: Duplicate `frag.blocks` issue remains unaddressed.
The combined query returns one row per `(part, block)` pair. When multiple selected parts reference the same chunk, `frag.blocks` may contain duplicates. The existing `read_parts_mapping` path (line 131) loads blocks once per unique chunk, avoiding this issue. Consider deduplicating `frag.blocks` for each chunk before the `keyBy`:
Suggested fix:
```diff
     if (parts.length === 0) return [];
+    for (const chunk_db of chunks_db) {
+        for (const frag of chunk_db.frags || []) {
+            frag.blocks = _.uniqBy(frag.blocks || [], block => block._id.toHexString());
+        }
+    }
     const chunks_db_by_id = _.keyBy(chunks_db, '_id');
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/object_services/map_reader.js` around lines 33 - 37, The fragments returned by the combined query can contain duplicate blocks so before building chunks_db_by_id and mapping parts into ChunkDB you should deduplicate frag.blocks per chunk id: iterate the chunks_db array (the variable used to build chunks_db_by_id), for each entry (refer to frag.blocks / blocks property) remove duplicate block entries (e.g., by block id or unique key) and replace with the de-duplicated blocks array, then build chunks_db_by_id and continue creating ChunkDB instances (the code creating chunks via parts.map -> new ChunkDB(...)). This mirrors the deduplication behavior in read_parts_mapping and prevents duplicate frag.blocks in the resulting ChunkDB objects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/server/object_services/map_reader.js`:
- Around line 33-37: The fragments returned by the combined query can contain
duplicate blocks so before building chunks_db_by_id and mapping parts into
ChunkDB you should deduplicate frag.blocks per chunk id: iterate the chunks_db
array (the variable used to build chunks_db_by_id), for each entry (refer to
frag.blocks / blocks property) remove duplicate block entries (e.g., by block id
or unique key) and replace with the de-duplicated blocks array, then build
chunks_db_by_id and continue creating ChunkDB instances (the code creating
chunks via parts.map -> new ChunkDB(...)). This mirrors the deduplication
behavior in read_parts_mapping and prevents duplicate frag.blocks in the
resulting ChunkDB objects.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f82ef749-3c3e-48a8-a738-f1020d8f34b7
📒 Files selected for processing (9)
- src/api/object_api.js
- src/sdk/map_client.js
- src/sdk/namespace_cache.js
- src/server/object_services/map_reader.js
- src/server/object_services/md_store.js
- src/server/object_services/object_server.js
- src/test/integration_tests/api/s3/test_s3_bucket_policy.js
- src/test/integration_tests/internal/test_map_builder.js
- src/test/utils/object_api_functions.js
✅ Files skipped from review due to trivial changes (1)
- src/sdk/namespace_cache.js
🚧 Files skipped from review as they are similar to previous changes (6)
- src/test/utils/object_api_functions.js
- src/sdk/map_client.js
- src/server/object_services/object_server.js
- src/api/object_api.js
- src/test/integration_tests/internal/test_map_builder.js
- src/server/object_services/md_store.js
1eec208 to 151df05
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/server/object_services/md_store.js (1)
1922-1927: ⚠️ Potential issue | 🟠 Major: Aggregate blocks before joining them back to parts.
This still expands to one row per `(part, block)`, so Postgres ships and `decode_json()` re-parses the same `part_data`/`chunk_data` once per block. On top of that, joining chunks via `c.data->>'_id'` gives up the direct `_id` lookup the old `find_chunks_by_ids()` path had. On replicated/EC chunks that can wipe out the latency win this PR is targeting. Please aggregate blocks per chunk first and join that grouped result back on the chunk table's `_id` column.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/object_services/md_store.js` around lines 1922 - 1927, Aggregate blocks by chunk before joining: replace the direct LEFT JOIN to ${this._blocks.name} with a subquery that groups blocks by their chunk id (e.g., SELECT b.data->>'chunk' AS chunk_id, jsonb_agg(b.data) AS blocks FROM ${this._blocks.name} b WHERE (b.data->'deleted' IS NULL OR b.data->'deleted' = 'null'::jsonb) GROUP BY chunk_id) and then LEFT JOIN that grouped result back to ${this._chunks.name} using the chunk's _id column (match the grouped.chunk_id to the chunk `_id` value) so the main SELECT returns part_data, chunk_data and the aggregated blocks per chunk (e.g., blocks) instead of one row per (part,block); update the SELECT list and any downstream decode_json()/parsing to expect aggregated blocks.
🧹 Nitpick comments (1)
src/server/object_services/map_reader.js (1)
25-39: Extract the shared chunk-assembly step.
`read_parts_mapping_postgres()` now duplicates the `chunks_db_by_id` → `new ChunkDB()` → `prepare_chunks()` flow from `read_parts_mapping()`. A small helper that consumes `{ parts, chunks_db }` would keep the Mongo and Postgres paths aligned the next time this mapping logic changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/server/object_services/map_reader.js` around lines 25 - 39, The Postgres path in read_parts_mapping_postgres duplicates the chunk-assembly flow from read_parts_mapping; extract that shared logic into a new helper (e.g., assemble_chunks_from_parts or buildChunksFromParts) that takes an object with {parts, chunks_db}, constructs ChunkDB instances (using chunk.toHexString() lookups into chunks_db_by_id and attaching parts), calls map_server.prepare_chunks({chunks}), and returns the prepared chunks; then replace the duplicated block in read_parts_mapping_postgres (and update read_parts_mapping to use the same helper) so both functions call this single helper instead of repeating the chunks_db_by_id → new ChunkDB(...) → prepare_chunks flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/sdk/namespace_nb.js`:
- Around line 88-97: The inline-read path uses this.target_bucket instead of the
resolved bucket present in rpc_params, causing reads to use an undefined bucket
when NamespaceNB was called without a fixed bucket; update the call to
object_sdk.object_io.read_entire_object(...) to pass the resolved bucket from
rpc_params (rpc_params.bucket) instead of this.target_bucket so the inline GET
uses the same effective bucket returned/used by
object_sdk.rpc_client.object.read_object_md; keep all other args (client:
object_sdk.rpc_client, object_md: info) unchanged.
In `@src/server/object_services/object_server.js`:
- Around line 696-705: The code currently trusts caller-supplied obj_id/size
before permission and mapping, allowing ID/key mismatch attacks; after
load_bucket and get_obj_id but before calling req.has_s3_bucket_permission or
map_reader/read_object_mapping, look up the object row by its identifier with
bucket and key predicates (e.g., query where _id = obj_id AND bucket =
req.bucket AND key = '/' + key or equivalent), rebind obj_id and replace size
with the stored size from that row (or return NOT_FOUND if no matching row),
then perform the authorization check (req.has_s3_bucket_permission) and only
call map_reader/read_object_mapping using the verified object row and stored
size.
---
Duplicate comments:
In `@src/server/object_services/md_store.js`:
- Around line 1922-1927: Aggregate blocks by chunk before joining: replace the
direct LEFT JOIN to ${this._blocks.name} with a subquery that groups blocks by
their chunk id (e.g., SELECT b.data->>'chunk' AS chunk_id, jsonb_agg(b.data) AS
blocks FROM ${this._blocks.name} b WHERE (b.data->'deleted' IS NULL OR
b.data->'deleted' = 'null'::jsonb) GROUP BY chunk_id) and then LEFT JOIN that
grouped result back to ${this._chunks.name} using the chunk's _id column (match
the grouped.chunk_id to the chunk `_id` value) so the main SELECT returns
part_data, chunk_data and the aggregated blocks per chunk (e.g., blocks) instead
of one row per (part,block); update the SELECT list and any downstream
decode_json()/parsing to expect aggregated blocks.
---
Nitpick comments:
In `@src/server/object_services/map_reader.js`:
- Around line 25-39: The Postgres path in read_parts_mapping_postgres duplicates
the chunk-assembly flow from read_parts_mapping; extract that shared logic into
a new helper (e.g., assemble_chunks_from_parts or buildChunksFromParts) that
takes an object with {parts, chunks_db}, constructs ChunkDB instances (using
chunk.toHexString() lookups into chunks_db_by_id and attaching parts), calls
map_server.prepare_chunks({chunks}), and returns the prepared chunks; then
replace the duplicated block in read_parts_mapping_postgres (and update
read_parts_mapping to use the same helper) so both functions call this single
helper instead of repeating the chunks_db_by_id → new ChunkDB(...) →
prepare_chunks flow.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5a9f2a8b-3d5b-44a1-812f-677e7aeeda97
📒 Files selected for processing (12)
- src/api/object_api.js
- src/endpoint/s3/ops/s3_get_object.js
- src/sdk/map_client.js
- src/sdk/namespace_cache.js
- src/sdk/namespace_nb.js
- src/sdk/namespace_s3.js
- src/server/object_services/map_reader.js
- src/server/object_services/md_store.js
- src/server/object_services/object_server.js
- src/test/integration_tests/api/s3/test_s3_bucket_policy.js
- src/test/integration_tests/internal/test_map_builder.js
- src/test/utils/object_api_functions.js
✅ Files skipped from review due to trivial changes (1)
- src/sdk/map_client.js
🚧 Files skipped from review as they are similar to previous changes (3)
- src/test/utils/object_api_functions.js
- src/sdk/namespace_cache.js
- src/test/integration_tests/internal/test_map_builder.js
```diff
     load_bucket(req);
     const obj_id = get_obj_id(req, 'BAD_OBJECT_ID');
     if (typeof size !== 'number' || size < 0) {
         throw new RpcError('BAD_REQUEST', 'size must be a positive number');
     }
     const obj = { _id: obj_id, size };

     // Check if the requesting account is authorized to read the object
-    if (!await req.has_s3_bucket_permission(req.bucket, 's3:GetObject', '/' + obj.key, undefined)) {
+    if (!await req.has_s3_bucket_permission(req.bucket, 's3:GetObject', '/' + key, undefined)) {
         throw new RpcError('UNAUTHORIZED', 'requesting account is not authorized to read the object');
```
Don’t authorize read_object_mapping against a caller-supplied key alone.
Lines 697-705 no longer verify that obj_id actually belongs to the requested bucket/key/system before the permission check. Since obj_id and size are already exposed by metadata/list responses, a caller can pair an allowed key with another object’s obj_id and fetch that other object’s mapping. Please re-bind the ID to the real object row (or enforce bucket/key predicates in the mapping query) and use its stored size before calling map_reader.
🛡️ Proposed fix
```diff
     load_bucket(req);
     const obj_id = get_obj_id(req, 'BAD_OBJECT_ID');
     if (typeof size !== 'number' || size < 0) {
-        throw new RpcError('BAD_REQUEST', 'size must be a positive number');
+        throw new RpcError('BAD_REQUEST', 'size must be a non-negative number');
     }
-    const obj = { _id: obj_id, size };
+    const obj = await MDStore.instance().find_object_by_id(obj_id);
+    check_object_mode(req, obj, 'NO_SUCH_OBJECT');
-    if (!await req.has_s3_bucket_permission(req.bucket, 's3:GetObject', '/' + key, undefined)) {
+    if (!await req.has_s3_bucket_permission(req.bucket, 's3:GetObject', '/' + obj.key, undefined)) {
         throw new RpcError('UNAUTHORIZED', 'requesting account is not authorized to read the object');
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/server/object_services/object_server.js` around lines 696 - 705, The code
currently trusts caller-supplied obj_id/size before permission and mapping,
allowing ID/key mismatch attacks; after load_bucket and get_obj_id but before
calling req.has_s3_bucket_permission or map_reader/read_object_mapping, look up
the object row by its identifier with bucket and key predicates (e.g., query
where _id = obj_id AND bucket = req.bucket AND key = '/' + key or equivalent),
rebind obj_id and replace size with the stored size from that row (or return
NOT_FOUND if no matching row), then perform the authorization check
(req.has_s3_bucket_permission) and only call map_reader/read_object_mapping
using the verified object row and stored size.
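The verified-lookup flow described in the prompt can be sketched as follows. Here `find_object` is a hypothetical stand-in for an MDStore query with bucket/key predicates, and the error shape is illustrative rather than NooBaa's actual `RpcError`.

```javascript
// Bind the caller-supplied obj_id to the real object row before any
// authorization or mapping work: id, bucket AND key must all match.
async function load_verified_object({ obj_id, bucket, key }, find_object) {
    const obj = await find_object({ _id: obj_id, bucket, key });
    if (!obj) {
        const err = new Error('no such object for the given bucket/key');
        err.rpc_code = 'NO_SUCH_OBJECT';
        throw err;
    }
    // Use the stored size, never the caller-supplied one.
    return { _id: obj._id, key: obj.key, size: obj.size };
}
```

Only after this lookup would the permission check and `map_reader.read_object_mapping` run, so a caller cannot pair an allowed key with another object's `obj_id`.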
151df05 to ea4f625
Signed-off-by: nadav mizrahi <[email protected]>
ea4f625 to 6e28f77
Describe the Problem
Explain the Changes
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
Summary by CodeRabbit
Breaking Changes
Tests