Comprehensive code review fixes: security hardening, async safety, and error handling#1418
Comprehensive code review fixes: security hardening, async safety, and error handling#1418danielgerlag wants to merge 16 commits intomasterfrom
Conversation
…ue provider - Use 'using' statements for SqlConnection instead of manual Close() - Validate queue name to prevent SQL injection via string replacement Co-authored-by: Copilot <[email protected]>
- Disable AllowNewToEvaluateAnyType in Dynamic LINQ parsing config - Add TypeNameHandling.None to JSON deserialization settings - Add error handling for missing constructors in reflection calls - Wrap ParseLambda calls in try-catch with descriptive error messages - Handle DynamicInvoke TargetInvocationException by unwrapping inner exception - Add safe Enum.Parse with validation and error handling - Remove bare catch block, add specific exception handling with logging - Add type validation in TypeResolver.FindType Co-authored-by: Copilot <[email protected]>
- Add parameter validation and correct HTTP status codes in WorkflowsController - Add parameter validation in EventsController - Fix ServiceProvider disposal in JsonWorkflowTest - Add filter sanitization in SearchService - Fix race condition in InMemoryConversationStore - Add logging for tool call ID truncation - Add null check in ToolRegistry constructor - Add source validation in AI DefinitionLoader Co-authored-by: Copilot <[email protected]>
- Convert async void RenewLeases to async Task in AzureLockManager - Convert async void SendHeartbeat to async Task in DynamoLockProvider - Convert async void Process to async Task in KinesisStreamConsumer - Remove blocking .Wait() in DynamoPersistenceProvider.EnsureStoreExists - Remove blocking .Wait() in DynamoLockProvider.Stop - Remove blocking .Wait() in SqlServerQueueProvider.Dispose - Fix blocking GetAwaiter().GetResult() in RabbitMQProvider.Dispose Co-authored-by: Copilot <[email protected]>
…e services - Replace lock-on-parameter with proper synchronization in MemoryPersistenceProvider - Await Task.Run in SingleNodeEventHub, use thread-safe collection - Convert async void FutureQueue to async Task with proper tracking - Remove blocking .Wait() in WorkflowHost.Start/Stop - Add timeout to QueueConsumer.Stop blocking wait - Fix TOCTOU race condition in WorkflowRegistry using TryGetValue - Implement proper Dispose in SingleNodeQueueProvider - Replace First/Single with FirstOrDefault/SingleOrDefault in MemoryPersistenceProvider - Fix compensation chain logic in CompensateHandler - Fix lock release on unacquired lock in ActivityController - Fix GreyList TTL race condition - Use ConcurrentDictionary in IndexConsumer - Add logging for null steps in WorkflowExecutor Co-authored-by: Copilot <[email protected]>
- Use TryGetValue and fix throw/throw ex in SqlLockProvider - Synchronize static indexesCreated in MongoDB and RavenDB providers - Replace FirstAsync with FirstOrDefaultAsync in EF persistence provider - Add exception logging in MongoDB ProcessCommands - Remove unused field in RabbitMQ provider - Use using for RabbitMQ channel resources Co-authored-by: Copilot <[email protected]>
Move the null guard from constructor to Register<T>() and GetTool() where _serviceProvider is actually needed. This allows tests to construct ToolRegistry with null when only using Register(tool) instance methods. Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR focuses on hardening WorkflowCore services and providers by improving async/shutdown patterns, reducing race conditions, tightening deserialization/expression safety, and adding some API/input validation.
Changes:
- Replace several sync-over-async and
async voidpatterns with safer task-based patterns; improve shutdown/disposal behavior across consumers and providers. - Add concurrency safeguards and better diagnostics (thread-safe collections, additional logging, safer registry/persistence access).
- Tighten DSL and API surface security (JSON
TypeNameHandlingoff, dynamic expression restrictions, basic request validation).
Reviewed changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| src/WorkflowCore/Services/WorkflowRegistry.cs | Uses TryGetValue/TryRemove patterns to simplify lookups/removals and avoid key-check races. |
| src/WorkflowCore/Services/WorkflowHost.cs | Uses GetAwaiter().GetResult() for sync wrappers around async start/stop. |
| src/WorkflowCore/Services/WorkflowExecutor.cs | Adds warning + skip when a step pointer references a missing step definition. |
| src/WorkflowCore/Services/GreyList.cs | Adds clarifying comment about benign race on eviction. |
| src/WorkflowCore/Services/ErrorHandlers/CompensateHandler.cs | Minor pointer assignment placement adjustment in compensation logic. |
| src/WorkflowCore/Services/DefaultProviders/SingleNodeQueueProvider.cs | Disposes in-memory queues by completing/disposing collections. |
| src/WorkflowCore/Services/DefaultProviders/SingleNodeEventHub.cs | Switches subscribers to concurrent collection and changes publish behavior. |
| src/WorkflowCore/Services/DefaultProviders/MemoryPersistenceProvider.cs | Adjusts lookup semantics (First/Single → OrDefault) and fixes lock target for error persistence. |
| src/WorkflowCore/Services/BackgroundTasks/WorkflowConsumer.cs | Removes manual new Task(...).Start() and avoids async void helper. |
| src/WorkflowCore/Services/BackgroundTasks/QueueConsumer.cs | Adds stop timeout warning, ensures activity disposed/null’d, and logs queued task failures. |
| src/WorkflowCore/Services/BackgroundTasks/IndexConsumer.cs | Uses ConcurrentDictionary for error counts and simplifies updates. |
| src/WorkflowCore/Services/ActivityController.cs | Tracks lock acquisition and adjusts token expiry assignment. |
| src/WorkflowCore.Testing/WorkflowTest.cs | Stores and disposes the created ServiceProvider to prevent test-time resource leaks. |
| src/WorkflowCore.Testing/JsonWorkflowTest.cs | Stores and disposes the created ServiceProvider to prevent test-time resource leaks. |
| src/WorkflowCore.DSL/Services/TypeResolver.cs | Wraps type-load failures with a clearer exception message. |
| src/WorkflowCore.DSL/Services/Deserializers.cs | Forces JSON deserialization to not use type name handling. |
| src/WorkflowCore.DSL/Services/DefinitionLoader.cs | Tightens dynamic parsing config, adds input validation, and improves error reporting around reflection/expression parsing. |
| src/providers/WorkflowCore.QueueProviders.SqlServer/Services/SqlServerQueueProviderMigrator.cs | Uses using for connections; refactors transaction scope handling. |
| src/providers/WorkflowCore.QueueProviders.SqlServer/Services/SqlServerQueueProvider.cs | Uses using for connections and adds queue-name sanitization before SQL substitution. |
| src/providers/WorkflowCore.QueueProviders.RabbitMQ/Services/RabbitMQProvider.cs | Removes unused JSON settings and ensures channels are disposed asynchronously. |
| src/providers/WorkflowCore.Providers.Azure/Services/AzureLockManager.cs | Removes async void timer callback by delegating to a task method. |
| src/providers/WorkflowCore.Providers.AWS/Services/KinesisStreamConsumer.cs | Starts processing via Task.Run and changes processing method to Task. |
| src/providers/WorkflowCore.Providers.AWS/Services/DynamoPersistenceProvider.cs | Uses GetAwaiter().GetResult() for provisioning sync wrapper. |
| src/providers/WorkflowCore.Providers.AWS/Services/DynamoLockProvider.cs | Reworks heartbeat task to be task-based; makes Stop await heartbeat completion. |
| src/providers/WorkflowCore.Persistence.RavenDB/Services/RavendbPersistenceProvider.cs | Adds lock to prevent concurrent index creation. |
| src/providers/WorkflowCore.Persistence.MongoDB/Services/MongoPersistenceProvider.cs | Adds logging, index-creation lock, and transaction abort on failures. |
| src/providers/WorkflowCore.Persistence.MongoDB/ServiceCollectionExtensions.cs | Updates DI to supply ILoggerFactory to Mongo persistence provider. |
| src/providers/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs | Changes many FirstAsync calls to FirstOrDefaultAsync. |
| src/providers/WorkflowCore.LockProviders.SqlServer/SqlLockProvider.cs | Fixes exception rethrow pattern and avoids KeyNotFound on release. |
| src/extensions/WorkflowCore.WebAPI/Controllers/WorkflowsController.cs | Adds paging bounds + ID validation; changes missing-definition response. |
| src/extensions/WorkflowCore.WebAPI/Controllers/EventsController.cs | Adds required parameter validation for posting events. |
| src/extensions/WorkflowCore.AI.AzureFoundry/Services/ToolRegistry.cs | Adds null guard for DI-provided IServiceProvider. |
| src/extensions/WorkflowCore.AI.AzureFoundry/Services/SearchService.cs | Adds basic filter string validation before applying OData filter. |
| src/extensions/WorkflowCore.AI.AzureFoundry/Services/InMemoryConversationStore.cs | Refactors thread creation using GetOrAdd. |
| src/extensions/WorkflowCore.AI.AzureFoundry/Services/ChatCompletionService.cs | Adds logging around tool-call ID generation/truncation and makes helper instance-based. |
Comments suppressed due to low confidence (3)
src/providers/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs:152
FirstOrDefaultAsynccan return null here; passing a nullPersistedWorkflowintoToPersistablewill create a new entity that is not tracked/added to the DbContext, soSaveChangesAsyncwill not persist anything. This also silently changes behavior from throwing when the workflow record is missing. Consider restoringFirstAsync(fail fast) or explicitly adding the new persistable entity to the DbSet whenexistingEntity == null.
var uid = new Guid(workflow.Id);
var existingEntity = await db.Set<PersistedWorkflow>()
.Where(x => x.InstanceId == uid)
.Include(wf => wf.ExecutionPointers)
.ThenInclude(ep => ep.ExtensionAttributes)
.Include(wf => wf.ExecutionPointers)
.AsTracking()
.FirstOrDefaultAsync(cancellationToken);
var persistable = workflow.ToPersistable(existingEntity);
await db.SaveChangesAsync(cancellationToken);
}
src/providers/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs:371
FirstOrDefaultAsyncmay return null here; dereferencingexistingEntitywill throw. Consider keepingFirstAsync(fail fast) or adding a null check with a clearer error/return value (e.g., return false).
var uid = new Guid(eventSubscriptionId);
var existingEntity = await db.Set<PersistedSubscription>()
.Where(x => x.SubscriptionId == uid)
.AsTracking()
.FirstOrDefaultAsync(cancellationToken);
existingEntity.ExternalToken = token;
existingEntity.ExternalWorkerId = workerId;
existingEntity.ExternalTokenExpiry = expiry;
await db.SaveChangesAsync(cancellationToken);
src/providers/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs:389
FirstOrDefaultAsyncmay return null here;existingEntity.ExternalTokendereference will throw. Either restoreFirstAsyncor add a null check and decide how to handle missing subscriptions.
var uid = new Guid(eventSubscriptionId);
var existingEntity = await db.Set<PersistedSubscription>()
.Where(x => x.SubscriptionId == uid)
.AsTracking()
.FirstOrDefaultAsync(cancellationToken);
if (existingEntity.ExternalToken != token)
throw new InvalidOperationException();
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
...ders/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs
Show resolved
Hide resolved
...ders/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs
Outdated
Show resolved
Hide resolved
...ders/WorkflowCore.Persistence.EntityFramework/Services/EntityFrameworkPersistenceProvider.cs
Show resolved
Hide resolved
src/WorkflowCore/Services/DefaultProviders/SingleNodeEventHub.cs
Outdated
Show resolved
Hide resolved
src/extensions/WorkflowCore.AI.AzureFoundry/Services/InMemoryConversationStore.cs
Outdated
Show resolved
Hide resolved
src/extensions/WorkflowCore.WebAPI/Controllers/WorkflowsController.cs
Outdated
Show resolved
Hide resolved
- Update Squadron from 0.17.0 to 1.0.1 across 6 provider test projects (Redis, SqlServer, PostgreSQL, MySQL, MongoDB, Elasticsearch) - Update xunit from 2.9.2 to 2.9.3 in test/Directory.Build.props - Update StackExchange.Redis from 2.0.601 to 2.8.37 (required by Squadron.Redis 1.0.1) - Drop net6.0 TFM from provider tests (Squadron 1.0.1 requires net8.0+) - Fix MongoDB test: pass ILoggerFactory to updated MongoPersistenceProvider constructor - Fix PostgreSQL tests: remove Npgsql legacy timestamp behavior that was incompatible with the timestamptz column migration, causing DateTime values to silently lose timezone offset on round-trip All provider tests now pass: Redis 13/13, PostgreSQL 22/22, MySQL 21/21, SqlServer 44/44, MongoDB 25/25, Elasticsearch 7/7, DynamoDB 19/19 Co-authored-by: Copilot <[email protected]>
🛡️ Comprehensive code review fixes: security hardening, async safety, and error handling
Summary
Full codebase audit of Workflow Core identifying and fixing 58 issues across the core engine, DSL module, persistence/queue/lock providers, and extensions. Fixes span 35 files with changes grouped into 6 categories.
🔴 Security (Critical)
AllowNewToEvaluateAnyTypein Dynamic LINQ parsing config — previously allowed arbitrary type instantiation (e.g.,System.Diagnostics.Process) in user-supplied workflow expressions (DSLDefinitionLoader)TypeNameHandling.Noneto JSON workflow deserialization to prevent deserialization gadget chain attacks (Deserializers.cs)TypeResolver.FindTypeto fail gracefully on invalid typesSqlServerQueueProviderto prevent injection via queue name string replacementSearchServicefor Azure Search queries🔴 Async & Concurrency (Critical/High)
async voidmethods (4 instances) — converted toasync Taskwith proper tracking in:WorkflowConsumer.FutureQueueAzureLockManager.RenewLeasesDynamoLockProvider.SendHeartbeatKinesisStreamConsumer.Process.Wait()calls (6 instances) — replaced withGetAwaiter().GetResult()where sync API is required, orawaitwhere async is possible:WorkflowHost.Start/StopQueueConsumer.Stop(added 30s timeout)DynamoPersistenceProvider.EnsureStoreExistsDynamoLockProvider.StopSqlServerQueueProvider.DisposeTask.RuninSingleNodeEventHub.PublishNotification— now properly awaitedWorkflowRegistry(ContainsKey → TryGetValue) andGreyListHashSetwithConcurrentBaginSingleNodeEventHublock+DictionarywithConcurrentDictionaryinIndexConsumerindexesCreatedflag in MongoDB and RavenDB providersGetOrCreaterace condition inInMemoryConversationStoreusingGetOrAdd🟠 Resource Management (High)
cn.Close()withusingblocks inSqlServerQueueProviderandSqlServerQueueProviderMigrator(5 methods) to prevent connection leaksDisposeinSingleNodeQueueProvider—BlockingCollectionwas never disposedServiceProviderdisposal inWorkflowTestandJsonWorkflowTestPersistWorkflowDisposeAsyncfor RabbitMQ channel cleanup🟠 Error Handling (High)
First()→FirstOrDefault(),Single()→SingleOrDefault()inMemoryPersistenceProvider;FirstAsync()→FirstOrDefaultAsync()inEntityFrameworkPersistenceProvider(9 call sites)Invoke(null)in DSLDefinitionLoader— preventsNullReferenceExceptionon types without parameterless constructorsDynamicInvokecalls withTargetInvocationExceptionunwrapping for meaningful error messagesEnum.Parsewith try-catch and descriptive error for invalid valuescatchwith specificArgumentExceptionhandling in expression property resolutionParseLambdaerror wrapping with step/expression context in error messagesTryGetValueinstead of direct dictionary access inSqlLockProviderthrow ex→throwinSqlLockProviderto preserve stack tracesProcessCommands(was silently swallowed with a TODO)ActivityController— was releasing locks that were never acquiredWorkflowExecutor.ProcessAfterExecutionIteration🟡 API & Validation (Medium)
WorkflowsController— pagination bounds, null checks,404 Not Foundinstead of400 Bad Requestfor missing workflow definitionsEventsController— null/empty checks for event name and keyDefinitionLoaderToolRegistryconstructor🟢 Cleanup (Low)
CompensateHandlerJsonSerializerSettingsfield fromRabbitMQProvideractivity = nullafter dispose inQueueConsumerto prevent double-disposeQueueConsumerTask.WhenAllcatch blockTesting
WorkflowTest/JsonWorkflowTestfor theServiceProviderdisposal fix)Stats