Skip to content

feat: add custom thread pool name options#771

Open
gabrielfranca95 wants to merge 3 commits intoclj-commons:masterfrom
gabrielfranca95:feature/custom-thread-pool-names
Open

feat: add custom thread pool name options#771
gabrielfranca95 wants to merge 3 commits intoclj-commons:masterfrom
gabrielfranca95:feature/custom-thread-pool-names

Conversation

@gabrielfranca95
Copy link
Copy Markdown

Description

This PR addresses issue #769, which has been open since the community identified the need for better observability of Aleph's thread pools.

Problem

When running multiple Aleph servers or connection pools in the same process, all threads share hardcoded names like aleph-server-pool-1, aleph-netty-client-event-pool-1, etc. This makes it impossible to distinguish which threads belong to which server/client in:

  • Thread dumps during debugging
  • APM and monitoring tools
  • Log analysis during incident response

Solution

Added a new :thread-pool-name option that allows users to assign meaningful, distinguishable names to their thread pools while maintaining full backwards compatibility.

Changes

  • Server-side: Add :thread-pool-name parameter to start-server

    • Allows custom naming of server EventLoopGroup threads
    • Defaults to "aleph-server-pool" preserving existing behavior
  • Client-side: Add :thread-pool-name parameter to connection-pool

    • Enables custom naming for client EventLoopGroup threads
    • Multiple pools with the same name efficiently share the same EventLoopGroup (cached)
    • Without the parameter, uses the existing global shared pool (unchanged behavior)
  • Documentation: Updated docstrings in aleph.http namespace

Backwards Compatibility

Fully backwards compatible - All existing code continues to work without any changes. The new parameter is optional with sensible defaults matching previous behavior.

Example Usage

;; Server with custom thread pool name
(aleph.http/start-server handler {:port 8080 
                                   :thread-pool-name "api-server"})
;; Threads will appear as: api-server-1, api-server-2, ...

;; Client with custom thread pool name  
(aleph.http/connection-pool 
  {:connection-options {:thread-pool-name "external-api-client"}})
;; Threads will appear as: external-api-client-1, external-api-client-2, ...

@DerGuteMoritz
Copy link
Copy Markdown
Collaborator

First of all: Thanks for your contribution! Looking at it now, I'm thinking that a better approach would be to allow passing in a custom thread factory. Then we can leave the caching / re-use concern up to the caller. They could still use aleph.netty/enumerating-thread-factory if all they care about is a custom name prefix (we should remove the :no-doc annotation then and polish the docstring a bit). But they could even pass in a completely different ThreadFactory implementation, allowing for even more customization options. WDYT?

@gabrielfranca95
Copy link
Copy Markdown
Author

First of all: Thanks for your contribution! Looking at it now, I'm thinking that a better approach would be to allow passing in a custom thread factory. Then we can leave the caching / re-use concern up to the caller. They could still use aleph.netty/enumerating-thread-factory if all they care about is a custom name prefix (we should remove the :no-doc annotation then and polish the docstring a bit). But they could even pass in a completely different ThreadFactory implementation, allowing for even more customization options. WDYT?

Thank you for your feedback! I understand the point about Inversion of Control and removing the caching logic; I will apply the suggestions.

- Add thread-pool-name parameter to start-server for custom server pool names
- Add thread-pool-name parameter to connection-pool for custom client pool names
- Custom client pools with same name share the same EventLoopGroup
- Defaults to 'aleph-server-pool' and existing client pool for backwards compatibility
- Update documentation in aleph.http namespace

Closes clj-commons#769
gabrielfranca95 pushed a commit to gabrielfranca95/aleph that referenced this pull request Jan 17, 2026
Based on maintainer feedback on PR clj-commons#771:
- Make enumerating-thread-factory public for users to create custom ThreadFactory
- Server: replace :thread-pool-name with :thread-factory parameter
- Client: replace :thread-pool-name with :event-loop-group parameter
- Remove client-group-cache and get-client-group (leave caching to caller)
- Update docstrings in aleph.http namespace

This follows the Inversion of Control principle, allowing users full
control over thread creation (not just naming) and EventLoopGroup
lifecycle management.

Closes clj-commons#769
@gabrielfranca95 gabrielfranca95 force-pushed the feature/custom-thread-pool-names branch from ea741e2 to 751be85 Compare January 17, 2026 19:58
Based on maintainer feedback on PR clj-commons#771:
- Make enumerating-thread-factory public for users to create custom ThreadFactory
- Server: replace :thread-pool-name with :thread-factory parameter
- Client: replace :thread-pool-name with :event-loop-group parameter
- Remove client-group-cache and get-client-group (leave caching to caller)
- Update docstrings in aleph.http namespace

This follows the Inversion of Control principle, allowing users full
control over thread creation (not just naming) and EventLoopGroup
lifecycle management.

Closes clj-commons#769
@gabrielfranca95 gabrielfranca95 force-pushed the feature/custom-thread-pool-names branch from 751be85 to ef33c04 Compare January 17, 2026 22:25
@gabrielfranca95
Copy link
Copy Markdown
Author

I've updated the PR to implement your suggestions:

Server now accepts :thread-factory instead of :thread-pool-name
Client now accepts :event-loop-group for full EventLoopGroup injection
Made enumerating-thread-factory public with an improved docstring
Removed internal caching (client-group-cache and get-client-group) - reuse is now up to the caller
Regarding the CI failures, I noticed they appear to be pre-existing HTTP/2 protocol issues unrelated to these changes. The failing tests (test-explicit-url, test-files) all show the same pattern:

"First received frame was not SETTINGS" - HTTP/2 protocol error
"connection was closed after X seconds" - consequence of the above
"SSLHandshakeException: No name matching localhost found" - SSL certificate issue
These failures seem to be related to SSL/ALPN negotiation in the Java 8 environment used by CircleCI, not to the ThreadFactory/EventLoopGroup changes in this PR. I also noticed similar HTTP/2 test failures on recent master builds.

Please let me know if there's anything else I should address!

@DerGuteMoritz
Copy link
Copy Markdown
Collaborator

Thanks! The test failure is indeed a known flake which has recently started happening (see #772). I've triggered a re-run of the job.

As for the patch: I'm a bit unhappy about the asymmetry between client and server now (one accepting thread-factory and the other accepting event-loop-group. I think it would be cleaner to have the same for both. Going with event-loop-group seems like the more useful / general option. However, if we do that, the transport option may be in conflict with the passed in event-loop-group. Should we make them mutually exclusive or can you think of a way to make them work together?

Furthermore, we should then also expose and document transport-event-loop-group!

@gabrielfranca95
Copy link
Copy Markdown
Author

Thanks for the feedback! I totally agree that addressing the asymmetry is much cleaner, so I went ahead and implemented event-loop-group for both client and server.

Regarding your question about the conflict: instead of making them mutually exclusive, I implemented a way to make them work together. I added validation that checks if the provided group is compatible with the requested transport; if not, it throws an exception. This feels like the most robust approach.

I also exposed and documented transport-event-loop-group as suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants