feat: File and Annotation 2-way sync implementation (#63)

* feat: Refactor configuration and sync logic for improved dataset handling and logging

* feat: Enhance annotation synchronization and dataset file management

- Added a new `tags_updated_at` field to the `DatasetFiles` model to track when a file's tags were last updated.
- Implemented new asynchronous methods in the Label Studio client for fetching, creating, updating, and deleting task annotations.
- Introduced bidirectional synchronization for annotations between DataMate and Label Studio, allowing for flexible data management.
- Updated sync service to handle annotation conflicts based on timestamps, ensuring data integrity during synchronization.
- Enhanced dataset file response model to include tags and their update timestamps.
- Modified database initialization script to create a new column for `tags_updated_at` in the dataset files table.
- Updated requirements to ensure compatibility with the latest dependencies.
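
The timestamp-based conflict handling described above could be sketched roughly as follows. This is an illustration only, not the code in this commit: `SyncAction`, `resolve_sync_action`, and `annotation_updated_at` are hypothetical names, and the sketch assumes that naive timestamps (as a database driver typically returns for a `TIMESTAMP` column) are stored in UTC.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class SyncAction(Enum):
    """Hypothetical outcome of comparing the two sides of a sync."""
    PUSH = "push"  # DataMate tags are newer: write them to Label Studio
    PULL = "pull"  # Label Studio annotation is newer: write it to DataMate
    SKIP = "skip"  # nothing to sync, or both sides are equally recent


def _as_utc(ts: Optional[datetime]) -> Optional[datetime]:
    """Normalize to an aware UTC datetime (assumption: naive values are UTC)."""
    if ts is None:
        return None
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def resolve_sync_action(
    tags_updated_at: Optional[datetime],
    annotation_updated_at: Optional[datetime],
) -> SyncAction:
    """Pick a sync direction; the side with the later timestamp wins."""
    local = _as_utc(tags_updated_at)
    remote = _as_utc(annotation_updated_at)

    if local is None and remote is None:
        return SyncAction.SKIP
    if remote is None:
        return SyncAction.PUSH
    if local is None:
        return SyncAction.PULL
    if local > remote:
        return SyncAction.PUSH
    if remote > local:
        return SyncAction.PULL
    return SyncAction.SKIP
```

Normalizing both values before comparing avoids the `TypeError` Python raises when a naive datetime is compared with an aware one, which is easy to hit when one side comes from the database and the other from an ISO 8601 string in an API response.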
Jason Wang
2025-11-07 15:03:07 +08:00
committed by GitHub
parent d136bad38c
commit 78f50ea520
16 changed files with 1336 additions and 290 deletions


@@ -55,6 +55,7 @@ CREATE TABLE IF NOT EXISTS t_dm_dataset_files (
file_size BIGINT DEFAULT 0 COMMENT 'File size (bytes)',
check_sum VARCHAR(64) COMMENT 'File checksum',
tags JSON COMMENT 'File tag information',
tags_updated_at TIMESTAMP NULL COMMENT 'Time the tags were last updated',
metadata JSON COMMENT 'File metadata',
status VARCHAR(50) DEFAULT 'ACTIVE' COMMENT 'File status: ACTIVE/DELETED/PROCESSING',
upload_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP COMMENT 'Upload time',