DataMate/runtime
Jerry Yan 9988ff00f5 feat(auto-annotation): add concurrent processing support
Enable parallel processing for auto-annotation tasks with configurable worker count and file-level parallelism.

Key features:
- Multi-worker support: WORKER_COUNT env var (default 1) controls number of worker threads
- Intra-task file parallelism: FILE_WORKERS env var (default 1) controls concurrent file processing within a single task
- Operator chain pooling: Pre-create N independent chain instances to avoid thread-safety issues
- Thread-safe progress tracking: Use threading.Lock to protect shared counters
- Stop signal handling: threading.Event for graceful cancellation during concurrent processing
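
A minimal sketch of how the two settings might be read and how WORKER_COUNT worker threads could be started. The helper names read_positive_int and start_workers, and the choice to fall back to the default on invalid or non-positive values, are assumptions made for illustration; they are not taken from the commit.

```python
import os
import threading


def read_positive_int(name, default=1, env=None):
    """Read a positive integer setting, falling back to default if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(name, "")
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return value if value >= 1 else default


def start_workers(worker_loop, env=None):
    """Start WORKER_COUNT daemon threads that each run worker_loop (hypothetical helper)."""
    count = read_positive_int("WORKER_COUNT", env=env)
    threads = []
    for index in range(count):
        thread = threading.Thread(
            target=worker_loop,
            name=f"auto-annotation-worker-{index}",
            daemon=True,
        )
        thread.start()
        threads.append(thread)
    return threads
```

With neither variable set, both values resolve to 1, which corresponds to the single-worker, sequential default described above.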

Implementation details:
- Refactor _process_single_task() to use ThreadPoolExecutor + as_completed()
- Chain pool (queue.Queue): Each worker thread acquires/releases a chain instance
- Protected counters: processed_images, detected_total, file_results with Lock
- Stop check: Periodic check of _is_stop_requested() during concurrent processing
- Refactor start_auto_annotation_worker(): Move recovery logic here, start WORKER_COUNT threads
- Simplify _worker_loop(): Remove recovery call, keep only polling + processing
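
The pattern described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this commit: process_files and make_chain are hypothetical names, a chain is modelled as a plain callable that returns a detection count, and the stop check is reduced to a threading.Event tested before each file starts.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_files(files, make_chain, file_workers=1, stop_event=None):
    """Run one chain per file with at most file_workers files in flight."""
    stop_event = stop_event or threading.Event()

    # Pre-create one independent chain per worker so no instance is shared.
    pool = queue.Queue()
    for _ in range(file_workers):
        pool.put(make_chain())

    lock = threading.Lock()
    stats = {"processed_images": 0, "detected_total": 0, "file_results": []}

    def handle(path):
        if stop_event.is_set():
            return  # skip files that have not started yet
        chain = pool.get()  # acquire a chain instance for this file
        try:
            detections = chain(path)
        finally:
            pool.put(chain)  # always release the chain back to the pool
        with lock:  # counters are shared across worker threads
            stats["processed_images"] += 1
            stats["detected_total"] += detections
            stats["file_results"].append((path, detections))

    with ThreadPoolExecutor(max_workers=file_workers) as executor:
        futures = [executor.submit(handle, path) for path in files]
        for future in as_completed(futures):
            future.result()  # re-raise any exception from a worker
    return stats
```

Because the pool holds exactly as many chains as the executor has workers, a thread that reaches pool.get() always finds a free instance, and with file_workers=1 the files are processed one at a time in submission order.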

Backward compatibility:
- Default config (WORKER_COUNT=1, FILE_WORKERS=1) behaves identically to previous version
- No breaking changes to existing deployments

Testing:
- All 11 unit tests passed, covering:
  * Multi-worker startup
  * Chain pool acquire/release
  * Concurrent file processing
  * Stop signal handling
  * Thread-safe counter updates
  * Backward compatibility (FILE_WORKERS=1)
- py_compile syntax check passed

Performance benefits:
- WORKER_COUNT=3: Process 3 tasks simultaneously
- FILE_WORKERS=4: Process 4 files in parallel within each task
- Combined: Up to 12 files in flight at once (3 workers × 4 files), a theoretical upper bound of 12x throughput
2026-02-10 16:36:34 +08:00