DEADLOCK in updates.Manager when starting with multiple channels containing unread messages #1658

@ponecrazy

What version of gotd are you using?

v0.135.0

Can this issue be reproduced with the latest version?

Yes

What did you do?

I have an account with hundreds of channels that contain a large volume of unread messages. Every time I log in, the update handler gets completely blocked.

After several days of careful investigation, I discovered a circular wait deadlock in the telegram/updates package. The deadlock occurs during the initialization phase when updateManager.Run() starts.

Steps to reproduce:

  1. Have an account with more channels than the internal queue buffer size (10)
  2. Ensure these channels have many unread messages
  3. Call updateManager.Run() to authenticate and start the update handler
  4. The handler will permanently block during getDifferenceLogger()

What did you expect to see?

The update handler should successfully initialize and process all updates from multiple channels, regardless of the number of channels or the volume of unread messages.

What did you see instead?

The update handler enters a permanent deadlock state and becomes completely unresponsive.

Root Cause Analysis

The deadlock is a classic circular wait involving two buffered channels, each with a capacity of 10:

┌─────────────────────────────────────────────────────────────────────────────┐
│                        Circular Wait Deadlock                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   internalState.Run()                                                       │
│        │                                                                    │
│        ├─ getDifferenceLogger()  [BLOCKED]                                  │
│        │      └─ getDifference() → handleUpdates() → applyCombined()        │
│        │            └─ UpdateChannelTooLong → st.Push()                     │
│        │                  │                                                 │
│        │                  ▼                                                 │
│        │           [BLOCKED! Waiting for channelState.updates]              │
│        │                                                                    │
│        ▼                                                                    │
│   for loop (never starts)                                                   │
│        └─ case u := <-s.internalQueue  [should drain the queue]             │
│                                                                             │
│                                                                             │
│   channelState.Run() (per-channel goroutine)                                │
│        │                                                                    │
│        ├─ getDifference()  [BLOCKED]                                        │
│        │      └─ UpdatesChannelDifference → s.out <- update                 │
│        │            │                                                       │
│        │            ▼                                                       │
│        │       [BLOCKED! Waiting for internalQueue]                         │
│        │                                                                    │
│        ▼                                                                    │
│   for loop (never starts)                                                   │
│        └─ case u := <-s.updates  [should consume updates]                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Deadlock Chain Explanation

  1. internalState.Run() calls getDifferenceLogger() before entering its event loop
  2. getDifference() may recursively call itself when processing UpdatesDifferenceSlice
  3. During applyCombined(), UpdateChannelTooLong updates trigger channelState.Push()
  4. channelState.Push() tries to send to channelState.updates (buffer=10)
  5. If ≥11 channels exist, the 11th push blocks because the buffer is full
  6. channelState.Run()'s for loop, which would drain updates, has not started yet
  7. It has not started because channelState.Run() is still blocked in its own getDifference() call
  8. channelState.getDifference() tries to send to internalQueue (buffer=10)
  9. If multiple channels do this simultaneously, internalQueue fills up and blocks
  10. internalState.Run()'s for loop (which would drain internalQueue) hasn't started yet
  11. We're back to step 1 → DEADLOCK

Key Code Locations

  • telegram/updates/state.go:82-83 - internalQueue: make(chan tracedUpdate, 10)
  • telegram/updates/state.go:167 - getDifferenceLogger() called before for loop
  • telegram/updates/state_apply.go:60 - st.Push() can block
  • telegram/updates/state_channel.go:62 - updates: make(chan channelUpdate, 10)
  • telegram/updates/state_channel.go:100 - getDifference() called before for loop
  • telegram/updates/state_channel.go:260 - Send to s.out (which is internalQueue) can block

Temporary Workaround

Increasing the buffer sizes of both channels resolves the issue:

// In telegram/updates/state.go
internalQueue: make(chan tracedUpdate, 1000),  // was 10

// In telegram/updates/state_channel.go
updates: make(chan channelUpdate, 100),  // was 10

After testing with increased buffer sizes, the deadlock no longer occurs.
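
One reason to treat this as temporary: a buffered Go channel with no active receiver accepts exactly as many sends as its capacity, so a larger buffer raises the threshold rather than removing the cycle. The sketch below illustrates only that language-level property; `sendsBeforeBlocking` is a hypothetical helper and is not part of gotd.

```go
package main

import "fmt"

// sendsBeforeBlocking counts how many sends a buffered channel of the given
// capacity accepts while nothing is receiving from it.
func sendsBeforeBlocking(capacity int) int {
	ch := make(chan struct{}, capacity)
	n := 0
	for {
		select {
		case ch <- struct{}{}:
			n++
		default:
			// The buffer is full: a plain `ch <- v` would block here
			// until some receiver drains the channel.
			return n
		}
	}
}

func main() {
	fmt.Println(sendsBeforeBlocking(10))   // 10
	fmt.Println(sendsBeforeBlocking(1000)) // 1000
}
```

Under that assumption, an account with enough channels or pending updates to fill the enlarged buffers could presumably still reach the same blocked state.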

What Go version and environment are you using?

go version go1.24.4 darwin/arm64

go env Output
$ go env

Additional Notes

The fundamental issue is the startup dependency cycle:

  • internalState.Run waits for getDifferenceLogger to complete
  • getDifferenceLogger waits for channelState.Push to complete
  • channelState.Push waits for channelState.Run's for loop
  • channelState.Run waits for getDifference to complete
  • getDifference waits for internalState.Run's for loop
