
Concurrency Patterns

Advanced patterns for CRDT queue operations and multi-node coordination

Learn advanced concurrency patterns for distributed task processing with Blazing Batch's CRDT-based queue system.

CRDT Queue Architecture

Blazing Batch uses Conflict-free Replicated Data Types (CRDTs) for safe distributed queue operations:

  • No duplicate processing - each task is processed exactly once
  • Lock-free operations - no distributed locks are required
  • Multi-node coordination - workers on different nodes coordinate automatically
  • Partition tolerance - operations continue across network partitions

Node-Based Queue Partitioning

How It Works

Each API node writes to its own queue segment:

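A minimal sketch of the partitioning scheme: the `queue:{step}:{node_id}` key layout is an assumption, and a dict of deques stands in for the per-node Redis queue segments (the real client is not shown here).

```python
from collections import defaultdict, deque

# Illustrative stand-in for Redis: one FIFO segment per (step, node) pair.
# Assumed key layout: queue:{step}:{node_id}
class SegmentedQueue:
    """Each node enqueues only to its own segment; workers scan all segments."""

    def __init__(self):
        self._segments = defaultdict(deque)

    def enqueue(self, step, node_id, task):
        # A node writes only to its own segment, so nodes never contend.
        self._segments[f"queue:{step}:{node_id}"].append(task)

    def dequeue(self, step):
        # Workers scan every segment of the step and pop the first task found.
        for key, segment in self._segments.items():
            if key.startswith(f"queue:{step}:") and segment:
                return segment.popleft()
        return None

q = SegmentedQueue()
q.enqueue("resize", "node-a", {"id": 1})
q.enqueue("resize", "node-b", {"id": 2})
print(q.dequeue("resize"))  # {'id': 1}
```

Because writes never cross node boundaries, enqueue needs no coordination at all; only the scanning dequeue touches other nodes' segments, and then only to read.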

Workers scan all queue segments and dequeue operations safely.

Example: Multi-Node Enqueue

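A hedged sketch of the multi-node case, with threads standing in for API nodes and a dict of deques standing in for Redis. Since each node owns its segment, concurrent enqueues have no shared write target and nothing to conflict on.

```python
import threading
from collections import defaultdict, deque

segments = defaultdict(deque)  # stand-in for per-node Redis queue segments

def enqueue_from_node(node_id, count):
    for i in range(count):
        # deque.append is atomic under CPython, standing in for Redis RPUSH.
        segments[f"queue:thumbnail:{node_id}"].append((node_id, i))

threads = [
    threading.Thread(target=enqueue_from_node, args=(f"node-{n}", 100))
    for n in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(len(s) for s in segments.values())
print(total)  # 300 - nothing lost, no locks taken
```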

Race Condition Handling

Concurrent Status Updates

Multiple workers can safely update operation status concurrently:

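The internal CRDT merge is not public, but the convergence property can be illustrated with an assumed status ladder (PENDING < RUNNING < COMPLETED) whose merge keeps the furthest state, so concurrent writers always agree on the result.

```python
# Assumed status ordering; the real state machine may have more states.
ORDER = {"PENDING": 0, "RUNNING": 1, "COMPLETED": 2}

def merge_status(a, b):
    # Commutative, associative, and idempotent -> a valid CRDT join:
    # the outcome is the same no matter which update arrives first.
    return a if ORDER[a] >= ORDER[b] else b

status = "PENDING"
for update in ("RUNNING", "COMPLETED", "RUNNING"):  # any interleaving
    status = merge_status(status, update)
print(status)  # COMPLETED regardless of arrival order
```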

Lock-Free Dequeuing

Workers dequeue operations without locks using atomic Redis operations:

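A sketch of the claim pattern: `dict.setdefault` is atomic under CPython and stands in for an atomic Redis `SET ... NX` — whichever worker writes first owns the operation, with no lock held by anyone.

```python
import threading

claims = {}                     # stand-in for Redis claim keys
tasks = ["op-1", "op-2", "op-3"]
won = []                        # (worker_id, op) pairs actually claimed

def worker(worker_id):
    for op in tasks:
        # Atomic claim: only the first writer's value is kept, so exactly
        # one worker "wins" each operation even under contention.
        if claims.setdefault(op, worker_id) == worker_id:
            won.append((worker_id, op))

threads = [threading.Thread(target=worker, args=(w,)) for w in ("w1", "w2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(won))  # 3 - each operation claimed exactly once
```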

Multi-App Isolation

Concurrent App Instances

Multiple Blazing app instances can run concurrently with full isolation:

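A minimal sketch of the isolation mechanism, assuming keys are namespaced by `app_id` (a plain dict stands in for the shared Redis):

```python
store = {}  # one shared Redis, simulated

class App:
    """Every key an app touches is prefixed with its app_id."""

    def __init__(self, app_id):
        self.app_id = app_id

    def put(self, key, value):
        store[f"{self.app_id}:{key}"] = value

    def get(self, key):
        return store.get(f"{self.app_id}:{key}")

billing = App("billing")
reports = App("reports")
billing.put("queue_depth", 7)
reports.put("queue_depth", 42)   # same key name, different namespace
print(billing.get("queue_depth"), reports.get("queue_depth"))  # 7 42
```

Because the prefix is applied on every read and write, neither instance can see (or clobber) the other's state even though both use the same backing store.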

Cross-App Communication

Apps are isolated by default, but can share data through Redis:

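Since isolation comes from key prefixes, opting into a shared, agreed-upon namespace is all cross-app communication needs. The `shared:` prefix below is an illustrative convention, not a documented one.

```python
store = {}  # stand-in for the Redis instance both apps connect to

def publish(channel, value):
    # Any app may write here; "shared:" is outside every app_id namespace.
    store[f"shared:{channel}"] = value

def read(channel):
    return store.get(f"shared:{channel}")

publish("model_version", "v12")  # written by app A
print(read("model_version"))     # read by app B -> v12
```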

Network Resilience

Handling Network Partitions

Blazing Batch is designed to handle network issues gracefully:

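A sketch of the caller-side pattern (not the real client): bound every network call with a timeout and treat connection errors as a degraded-but-known state rather than a crash. The host, port, and helper name are illustrative.

```python
import socket

def fetch_queue_depth(host, port=6379, timeout=0.1):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return 0  # placeholder; a real client would issue a query here
    except OSError:
        return None   # partition or outage: report "unknown", don't crash

# 192.0.2.1 is a reserved TEST-NET address, so the call cannot succeed.
depth = fetch_queue_depth("192.0.2.1")
print(depth)  # None - the caller degrades gracefully
```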

Automatic Retry on Network Failure

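A generic retry-with-exponential-backoff sketch; Blazing Batch's own retry policy is not documented here, so the attempt count and delays are illustrative.

```python
import time

def retry(func, attempts=4, base_delay=0.01):
    """Retry func on ConnectionError, doubling the delay each attempt."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure
            time.sleep(base_delay * (2 ** attempt))  # 10ms, 20ms, 40ms, ...

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

result = retry(flaky)
print(result)  # ok (after two transient failures)
```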

Worker Lifecycle Management

Coordinated Worker Startup

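A sketch of coordinated startup under an assumed protocol: each worker registers itself (a set stands in for Redis heartbeat keys), then waits until the expected number of peers is visible before consuming tasks.

```python
import threading

registry = set()            # stand-in for per-worker heartbeat keys
ready = threading.Event()
EXPECTED = 3                # assumed: the deployment knows its worker count

def start_worker(worker_id):
    registry.add(worker_id)
    if len(registry) >= EXPECTED:
        ready.set()         # quorum reached: everyone may start consuming
    ready.wait(timeout=1.0) # block until the whole group is up

threads = [
    threading.Thread(target=start_worker, args=(f"w{i}",))
    for i in range(EXPECTED)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(ready.is_set())  # True - no worker began before the group was up
```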

Graceful Shutdown

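A graceful-shutdown sketch using stdlib primitives: a stop flag halts new intake, but the worker drains tasks already queued before exiting.

```python
import queue
import threading

tasks = queue.Queue()
stop = threading.Event()
done = []

def worker():
    # Exit only when shutdown was requested AND the queue is drained.
    while not (stop.is_set() and tasks.empty()):
        try:
            done.append(tasks.get(timeout=0.05))
        except queue.Empty:
            pass  # nothing in flight right now; re-check the stop flag

for i in range(5):
    tasks.put(i)

t = threading.Thread(target=worker)
t.start()
stop.set()   # request shutdown...
t.join()     # ...but the worker finishes the in-flight tasks first
print(done)  # [0, 1, 2, 3, 4]
```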

Advanced Patterns

Fan-Out / Fan-In

Process tasks in parallel, then aggregate results:

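A stdlib sketch of the shape: in Blazing Batch the fan-out would be enqueued tasks rather than a thread pool, but the aggregation pattern is the same.

```python
from concurrent.futures import ThreadPoolExecutor

def process(chunk):
    return sum(chunk)  # per-task work

chunks = [[1, 2], [3, 4], [5, 6]]
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(process, chunks))  # fan-out: one task per chunk
total = sum(partials)                           # fan-in: aggregate results
print(total)  # 21
```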

Pipeline Processing

Chain multiple processing stages:

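A pipeline sketch with stdlib queues standing in for the step queues: each stage consumes one queue and feeds the next, so stages run concurrently on different items, and a sentinel propagates shutdown downstream.

```python
import queue
import threading

def stage(fn, inbox, outbox):
    while True:
        item = inbox.get()
        if item is None:       # sentinel: pass shutdown to the next stage
            outbox.put(None)
            return
        outbox.put(fn(item))

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(lambda x: x * 2, q1, q2)),
    threading.Thread(target=stage, args=(lambda x: x + 1, q2, q3)),
]
for t in threads:
    t.start()

for i in range(3):
    q1.put(i)
q1.put(None)

results = []
while (item := q3.get()) is not None:
    results.append(item)
for t in threads:
    t.join()
print(results)  # [1, 3, 5]
```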

Performance Characteristics

CRDT Queue Performance

| Operation       | Time Complexity | Concurrency |
|-----------------|-----------------|-------------|
| Enqueue         | O(1)            | Lock-free   |
| Dequeue         | O(n) segments   | Lock-free   |
| Status Update   | O(1)            | Atomic      |
| Claim Operation | O(1)            | Atomic      |

Multi-Node Scalability

| Nodes | Throughput  | Latency | Coordination Overhead |
|-------|-------------|---------|-----------------------|
| 1     | 1,000 ops/s | 10ms    | None                  |
| 3     | 2,800 ops/s | 12ms    | Minimal               |
| 5     | 4,500 ops/s | 15ms    | Low                   |
| 10    | 8,000 ops/s | 20ms    | Moderate              |

Best Practices

  1. Use NODE_ID - Set a unique NODE_ID for each API instance
  2. Implement timeouts - Always bound network operations with timeouts
  3. Handle retries - Use exponential backoff for transient failures
  4. Monitor queue depth - Track the number of pending operations in each step's queue segments
  5. Graceful shutdown - Wait for in-flight tasks to finish before terminating
  6. Isolate apps - Use a separate app_id for each workload

Source

Based on test suite examples:

  • tests/test_concurrency.py - CRDT queue operations
  • tests/test_network_resilience.py - Network fault tolerance
  • tests/test_worker_lifecycle_timing.py - Worker coordination