feat: File and Annotation 2-way sync implementation (#63)

* feat: Refactor configuration and sync logic for improved dataset handling and logging

* feat: Enhance annotation synchronization and dataset file management

- Added a new field, `tags_updated_at`, to the `DatasetFiles` model to track when a file's tags were last updated.
- Implemented new asynchronous methods in the Label Studio client for fetching, creating, updating, and deleting task annotations.
- Introduced bidirectional synchronization for annotations between DataMate and Label Studio, allowing for flexible data management.
- Updated sync service to handle annotation conflicts based on timestamps, ensuring data integrity during synchronization.
- Enhanced dataset file response model to include tags and their update timestamps.
- Modified database initialization script to create a new column for `tags_updated_at` in the dataset files table.
- Updated requirements to ensure compatibility with the latest dependencies.
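
The timestamp-based conflict handling described above can be sketched as a last-write-wins comparison. This is a minimal illustration, not the code in this commit: `SyncAction`, `resolve_direction`, and the idea of comparing `tags_updated_at` against the Label Studio annotation's update time are assumptions made for the example.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class SyncAction(Enum):
    """Hypothetical outcomes of comparing the two sides."""
    PUSH_TO_LABEL_STUDIO = "push_to_label_studio"
    PULL_TO_DATAMATE = "pull_to_datamate"
    NOOP = "noop"


def _as_utc(ts: Optional[datetime]) -> Optional[datetime]:
    """Normalize to aware UTC; naive timestamps are assumed to be UTC."""
    if ts is None:
        return None
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def resolve_direction(
    tags_updated_at: Optional[datetime],
    annotation_updated_at: Optional[datetime],
) -> SyncAction:
    """Pick a sync direction: the side with the newer timestamp wins."""
    dm = _as_utc(tags_updated_at)
    ls = _as_utc(annotation_updated_at)
    if dm is None and ls is None:
        return SyncAction.NOOP  # nothing to sync on either side
    if ls is None:
        return SyncAction.PUSH_TO_LABEL_STUDIO  # only DataMate has data
    if dm is None:
        return SyncAction.PULL_TO_DATAMATE  # only Label Studio has data
    if dm > ls:
        return SyncAction.PUSH_TO_LABEL_STUDIO
    if ls > dm:
        return SyncAction.PULL_TO_DATAMATE
    return SyncAction.NOOP  # equal timestamps: treat as already in sync
```

Normalizing both timestamps to UTC before comparing avoids the `TypeError` Python raises when a naive and an aware `datetime` are compared, which can happen when one side comes from a database column and the other from an API response.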
Jason Wang
2025-11-07 15:03:07 +08:00
committed by GitHub
parent d136bad38c
commit 78f50ea520
16 changed files with 1336 additions and 290 deletions


@@ -1,94 +1,19 @@
# ====================================
# Label Studio Adapter Configuration
# ====================================
# =========================
# Application configuration
# =========================
APP_NAME="Label Studio Adapter"
APP_VERSION="1.0.0"
APP_DESCRIPTION="Adapter for integrating Data Management System with Label Studio"
DEBUG=true
# =========================
# Server configuration
# =========================
# Dev settings
HOST=0.0.0.0
PORT=18000
# =========================
# Logging configuration
# =========================
LOG_LEVEL=INFO
DEBUG=true
LOG_LEVEL=DEBUG
LOG_FILE_DIR=./logs
# =========================
# Label Studio service configuration
# =========================
# Label Studio service URL (adjust for your deployment)
# Docker environment: http://label-studio:8080
# Local development: http://127.0.0.1:8000
LABEL_STUDIO_BASE_URL=http://label-studio:8080
# Label Studio username and password (used to create the user automatically)
LABEL_STUDIO_USERNAME=admin@example.com
LABEL_STUDIO_PASSWORD=password
# Label Studio API authentication token (legacy token, recommended)
# Obtain it from Account & Settings > Access Token in the Label Studio UI
LABEL_STUDIO_USER_TOKEN=your-label-studio-token-here
# Base path for Label Studio local file storage (path inside the container, used for permission checks in Docker deployments)
LABEL_STUDIO_LOCAL_BASE=/label-studio/local_files
# Path prefix for the Label Studio local file service (prefix for file paths in task data)
LABEL_STUDIO_FILE_PATH_PREFIX=/data/local-files/?d=
# Local storage paths inside the Label Studio container (used to configure Local Storage)
LABEL_STUDIO_LOCAL_STORAGE_DATASET_BASE_PATH=/label-studio/local_files/dataset
LABEL_STUDIO_LOCAL_STORAGE_UPLOAD_BASE_PATH=/label-studio/local_files/upload
# Page size for the Label Studio task list
LS_TASK_PAGE_SIZE=1000
# =========================
# Data Management service configuration
# =========================
# DM storage folder prefix (usually matches the Label Studio local-files folder mapping)
DM_FILE_PATH_PREFIX=/
# =========================
# Adapter database configuration (MySQL)
# =========================
# Priority 1: if MySQL is configured, the MySQL database is used first
MYSQL_HOST=adapter-db
# DataBase
MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_USER=label_studio_user
MYSQL_PASSWORD=user_password
MYSQL_DATABASE=label_studio_adapter
MYSQL_USER=root
MYSQL_PASSWORD=password
MYSQL_DATABASE=datamate
# =========================
# CORS configuration
# =========================
# Allowed origins (configure specific domains in production)
ALLOWED_ORIGINS=["*"]
# Label Studio settings
LABEL_STUDIO_BASE_URL=http://localhost:8080
# Allowed HTTP methods
ALLOWED_METHODS=["*"]
# Allowed request headers
ALLOWED_HEADERS=["*"]
# =========================
# Docker Compose configuration
# =========================
# Docker Compose project name prefix
COMPOSE_PROJECT_NAME=ls-adapter
# =========================
# Sync configuration (future extension)
# =========================
# Batch size for bulk sync tasks
SYNC_BATCH_SIZE=100
# Maximum number of retries when a sync fails
MAX_RETRIES=3
LABEL_STUDIO_USER_TOKEN="demo_dev_token"
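
For illustration, the development settings that appear in this file could be read from the environment in Python roughly as follows. This is a sketch under stated assumptions, not the project's actual configuration module: `DevSettings` and `load_settings` are hypothetical names, the variables are assumed to be exported into the process environment already (for example by a dotenv loader), and the defaults simply mirror the values shown in the diff.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class DevSettings:
    """Hypothetical container for the variables shown in the diff."""
    host: str
    port: int
    debug: bool
    log_level: str
    log_file_dir: str
    mysql_host: str
    mysql_port: int
    mysql_user: str
    mysql_password: str
    mysql_database: str
    label_studio_base_url: str
    label_studio_user_token: str


def load_settings(env: Mapping[str, str] = os.environ) -> DevSettings:
    """Build settings from a mapping, falling back to the dev defaults."""
    return DevSettings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "18000")),
        debug=env.get("DEBUG", "false").strip().lower() in _TRUE_VALUES,
        log_level=env.get("LOG_LEVEL", "INFO"),
        log_file_dir=env.get("LOG_FILE_DIR", "./logs"),
        mysql_host=env.get("MYSQL_HOST", "localhost"),
        mysql_port=int(env.get("MYSQL_PORT", "3306")),
        mysql_user=env.get("MYSQL_USER", "root"),
        mysql_password=env.get("MYSQL_PASSWORD", ""),
        mysql_database=env.get("MYSQL_DATABASE", "datamate"),
        label_studio_base_url=env.get(
            "LABEL_STUDIO_BASE_URL", "http://localhost:8080"
        ),
        # No default token: an empty value signals that none was configured.
        label_studio_user_token=env.get("LABEL_STUDIO_USER_TOKEN", ""),
    )
```

Passing the environment in as a mapping keeps the loader easy to exercise in tests, for example `load_settings({"PORT": "18000", "DEBUG": "true"})`, without mutating `os.environ`.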