You've already forked DataMate
* feat: Enhance annotation module with template management and validation - Added DatasetMappingCreateRequest and DatasetMappingUpdateRequest schemas to handle dataset mapping requests with camelCase and snake_case support. - Introduced Annotation Template schemas including CreateAnnotationTemplateRequest, UpdateAnnotationTemplateRequest, and AnnotationTemplateResponse for managing annotation templates. - Implemented AnnotationTemplateService for creating, updating, retrieving, and deleting annotation templates, including validation of configurations and XML generation. - Added utility class LabelStudioConfigValidator for validating Label Studio configurations and XML formats. - Updated database schema for annotation templates and labeling projects to include new fields and constraints. - Seeded initial annotation templates for various use cases including image classification, object detection, and text classification. * feat: Enhance TemplateForm with improved validation and dynamic field rendering; update LabelStudio config validation for camelCase support * feat: Update docker-compose.yml to mark datamate dataset volume and network as external * feat: Add tag configuration management and related components - Introduced new components for tag selection and browsing in the frontend. - Added API endpoint to fetch tag configuration from the backend. - Implemented tag configuration management in the backend, including loading from YAML. - Enhanced template service to support dynamic tag rendering based on configuration. - Updated validation utilities to incorporate tag configuration checks. - Refactored existing code to utilize the new tag configuration structure. * feat: Refactor LabelStudioTagConfig for improved configuration loading and validation * feat: Update Makefile to include backend-python-docker-build in the build process * feat: Migrate to poetry for better deps management * Add pyyaml dependency and update Dockerfile to use Poetry for dependency management - Added pyyaml (>=6.0.3,<7.0.0) to pyproject.toml dependencies. - Updated Dockerfile to install Poetry and manage dependencies using it. - Improved layer caching by copying only dependency files before the application code. - Removed unnecessary installation of build dependencies to keep the final image size small. * feat: Remove duplicated backend-python-docker-build target from Makefile * fix: airflow is not ready for adding yet * feat: update Python version to 3.12 and remove project installation step in Dockerfile
Label Studio Adapter (DataMate)
这是 DataMate 的 Label Studio Adapter 服务,负责将 DataMate 的项目与 Label Studio 同步并提供对外的 HTTP API(基于 FastAPI)。
简要说明
- 框架:FastAPI
- 异步数据库/ORM:SQLAlchemy (async)
- 数据库迁移:Alembic
- 运行器:uvicorn
快速开始(开发)
- 克隆仓库并进入项目目录
- 创建并激活虚拟环境:
python -m venv .venv
source .venv/bin/activate
- 安装依赖:
pip install -r requirements.txt
- 准备环境变量(示例)
创建 .env 并设置必要的变量,例如:
- DATABASE_URL(或根据项目配置使用具体变量)
- LABEL_STUDIO_BASE_URL
- LABEL_STUDIO_USER_TOKEN
(具体变量请参考 .env.example)
- 数据库迁移(开发环境):
alembic upgrade head
- 启动开发服务器(示例与常用参数):
- 本地开发(默认 host/port,自动重载):
uvicorn app.main:app --reload
- 指定主机与端口并打开调试日志:
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload --log-level debug
- 在生产环境使用多个 worker(不使用 --reload):
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4 --log-level info --proxy-headers
- 使用环境变量启动(示例):
HOST=0.0.0.0 PORT=8000 uvicorn app.main:app --reload
注意:
--reload仅用于开发,会监视文件变化并重启进程;不要在生产中使用。--workers提供并发处理能力,但会增加内存占用;生产时通常配合进程管理或容器编排(Kubernetes)使用。- 若需要完整的生产部署建议使用 ASGI 服务器(如 gunicorn + uvicorn workers / 或直接使用 uvicorn 在容器中配合进程管理)。
访问 API 文档:
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc (推荐使用)
使用(简要)
- 所有 API 路径以
/api前缀注册(见app/main.py中app.include_router(api_router, prefix="/api"))。 - 根路径
/返回服务信息和文档链接。
更多细节请查看 doc/usage.md(接口使用)和 doc/development.md(开发说明)。