Thread safety: GRAPH_CLIENT_DICT and QUERY_OBJECT_FEEDBACK_BUFFER race conditions #312

@JasonWildMe

Summary

Several mutable data structures on the Flask current_app object are shared across all 16 Gunicorn gthread worker threads with no synchronization. This creates race conditions under concurrent load.

Affected Data Structures

1. GRAPH_CLIENT_DICT (High Priority)

Location: wbia/control/controller_inject.py:156

GLOBAL_APP.GRAPH_CLIENT_DICT = {}

Accessed from: wbia/web/apis_query.py (lines ~1684, 2085, 2120)

Multiple threads can simultaneously read and write this dict:

# Thread 1
current_app.GRAPH_CLIENT_DICT[graph_uuid] = graph_client  # Write

# Thread 2 (concurrent)
graph_client = current_app.GRAPH_CLIENT_DICT.get(graph_uuid, None)  # Read (stale?)

Risks:

  • Lost updates: two threads create clients for the same graph_uuid, and the second overwrites the first
  • KeyError or stale reads when a lookup interleaves with another thread's mutation of the dict
  • Assertion failure at assert graph_uuid_ not in current_app.GRAPH_CLIENT_DICT when another thread registers the same UUID first
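A minimal sketch of how the check and the insert could be combined under one lock so that a client is created at most once per graph_uuid. The app stand-in, the helper name get_or_create_graph_client, and the factory argument are assumptions made for illustration and are not taken from the wbia code; the real handlers would use current_app.

```python
import threading
from types import SimpleNamespace

# Stand-in for the Flask app object (assumption); real code would use current_app.
app = SimpleNamespace(GRAPH_CLIENT_DICT={}, _graph_client_lock=threading.Lock())


def get_or_create_graph_client(graph_uuid, factory):
    """Return the client registered for graph_uuid, creating it at most once.

    `factory` is a hypothetical callable that builds a client from a UUID.
    """
    # The lookup and the insert share one critical section, so no other
    # thread can register the same graph_uuid between the two steps.
    with app._graph_client_lock:
        client = app.GRAPH_CLIENT_DICT.get(graph_uuid)
        if client is None:
            client = factory(graph_uuid)
            app.GRAPH_CLIENT_DICT[graph_uuid] = client
        return client
```

Calling the factory while holding the lock serializes client creation. If construction is slow, a per-UUID lock or a placeholder entry may be preferable, at the cost of more complexity.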

2. QUERY_OBJECT_FEEDBACK_BUFFER (High Priority)

Location: wbia/control/controller_inject.py:158

GLOBAL_APP.QUERY_OBJECT_FEEDBACK_BUFFER = []

Accessed from: wbia/web/routes.py:4124-4125

# Thread 1
while len(current_app.QUERY_OBJECT_FEEDBACK_BUFFER) > 0:
    feedback = current_app.QUERY_OBJECT_FEEDBACK_BUFFER.pop()

# Thread 2 (concurrent)
current_app.QUERY_OBJECT_FEEDBACK_BUFFER.append(feedback)

Risks:

  • TOCTOU race: len() > 0 passes, another thread empties the list, .pop() raises IndexError
  • Lost feedback items when two threads pop concurrently from a single-item list
  • Inconsistent state between QUERY_OBJECT and its feedback buffer
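One way to avoid the len()/pop() race is to take the whole buffer in a single locked step and process it afterwards. The sketch below assumes a lock named _query_object_lock on the app object and two hypothetical helpers, add_feedback and drain_feedback; neither helper exists in the current code.

```python
import threading
from types import SimpleNamespace

# Stand-in for the Flask app object (assumption); real code would use current_app.
app = SimpleNamespace(
    QUERY_OBJECT_FEEDBACK_BUFFER=[],
    _query_object_lock=threading.Lock(),
)


def add_feedback(feedback):
    """Append one feedback item under the lock."""
    with app._query_object_lock:
        app.QUERY_OBJECT_FEEDBACK_BUFFER.append(feedback)


def drain_feedback():
    """Atomically take every pending item, leaving an empty buffer behind."""
    with app._query_object_lock:
        pending = app.QUERY_OBJECT_FEEDBACK_BUFFER
        app.QUERY_OBJECT_FEEDBACK_BUFFER = []
    # Reverse outside the lock so items come back newest first, which is the
    # order the original while/pop() loop would have produced.
    pending.reverse()
    return pending
```

Because the swap happens inside the lock, an item appended by another thread lands either in the drained batch or in the fresh list, and no caller ever pops from an empty list.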

3. QUERY_OBJECT (Medium Priority)

Location: wbia/control/controller_inject.py:156

GLOBAL_APP.QUERY_OBJECT = None
GLOBAL_APP.QUERY_OBJECT_JOBID = None

Classic TOCTOU pattern:

if current_app.QUERY_OBJECT is None:  # Thread A checks
    # Thread B sets it to non-None here
    current_app.QUERY_OBJECT = query_object  # Thread A overwrites Thread B
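The check and the assignment can be made a single step by holding a lock across both. This is a sketch only: the helper name set_query_object_if_unset, its return value, and the app stand-in are assumptions for illustration.

```python
import threading
from types import SimpleNamespace

# Stand-in for the Flask app object (assumption); real code would use current_app.
app = SimpleNamespace(
    QUERY_OBJECT=None,
    QUERY_OBJECT_JOBID=None,
    _query_object_lock=threading.Lock(),
)


def set_query_object_if_unset(query_object, jobid):
    """Install query_object only if none is set; return True if installed."""
    with app._query_object_lock:
        if app.QUERY_OBJECT is not None:
            # Another thread got here first; leave its value in place.
            return False
        # Both attributes change inside the same critical section, so a
        # reader that also takes the lock never sees them out of step.
        app.QUERY_OBJECT = query_object
        app.QUERY_OBJECT_JOBID = jobid
        return True
```

A caller that receives False can decide whether to reuse the existing object or report a conflict, rather than silently overwriting it.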

Proposed Fix

Add a threading.Lock for each shared structure (or one lock for the group):

import threading

GLOBAL_APP._graph_client_lock = threading.Lock()
GLOBAL_APP._query_object_lock = threading.Lock()

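Wrap every read and write of these structures in a with block on the matching lock (for example, with current_app._graph_client_lock:). Each check-then-act sequence must sit inside a single block; locking the check and the write separately leaves the race in place.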

Environment

  • Gunicorn gthread: 1 worker × 16 threads
  • All threads share the same Flask current_app and ibs object
  • These races are only triggered under concurrent load (multiple simultaneous API requests)
