Add support for detecting new file versions and switching to them:
Backend Changes:
- Add file_version column to AnnotationResult model
- Create Alembic migration for database schema update
- Implement check_file_version() method to compare annotation and file versions
- Implement use_new_version() method to clear annotations and update version
- Update upsert_annotation() to record file version when saving
- Add new API endpoints: GET /version and POST /use-new-version
- Add FileVersionCheckResponse and UseNewVersionResponse schemas
Frontend Changes:
- Add checkFileVersionUsingGet and useNewVersionUsingPost API calls
- Add version warning banner showing current vs latest file version
- Add 'Use New Version' button with confirmation dialog
- Clear version info state when switching files to avoid stale warnings
Bug Fixes:
- Fix previousFileVersion returning updated value (save before update)
- Handle null file_version for historical data compatibility
- Fix segmented annotation clearing (preserve structure, clear results)
- Fix files without annotations incorrectly showing new version warnings
- Preserve total_segments when clearing segmented annotations
Files Modified:
- frontend/src/pages/DataAnnotation/Annotate/LabelStudioTextEditor.tsx
- frontend/src/pages/DataAnnotation/annotation.api.ts
- runtime/datamate-python/app/db/models/annotation_management.py
- runtime/datamate-python/app/module/annotation/interface/editor.py
- runtime/datamate-python/app/module/annotation/schema/editor.py
- runtime/datamate-python/app/module/annotation/service/editor.py
New Files:
- runtime/datamate-python/alembic.ini
- runtime/datamate-python/alembic/env.py
- runtime/datamate-python/alembic/script.py.mako
- runtime/datamate-python/alembic/versions/20250205_0001_add_file_version.py
- Updated `update_file_tags` to support both simplified and full tag formats.
- Introduced `TagFormatConverter` to handle conversion from simplified external tags to internal storage format.
- Added logic to fetch and utilize the appropriate annotation template for conversion.
- Improved error handling for missing templates and unknown controls during tag updates.
- Created example script demonstrating the usage of the new tag format conversion feature.
- Added unit tests for `TagFormatConverter` to ensure correct functionality and edge case handling.
* feat: Enhance annotation module with template management and validation
- Added DatasetMappingCreateRequest and DatasetMappingUpdateRequest schemas to handle dataset mapping requests with camelCase and snake_case support.
- Introduced Annotation Template schemas including CreateAnnotationTemplateRequest, UpdateAnnotationTemplateRequest, and AnnotationTemplateResponse for managing annotation templates.
- Implemented AnnotationTemplateService for creating, updating, retrieving, and deleting annotation templates, including validation of configurations and XML generation.
- Added utility class LabelStudioConfigValidator for validating Label Studio configurations and XML formats.
- Updated database schema for annotation templates and labeling projects to include new fields and constraints.
- Seeded initial annotation templates for various use cases including image classification, object detection, and text classification.
* feat: Enhance TemplateForm with improved validation and dynamic field rendering; update LabelStudio config validation for camelCase support
* feat: Update docker-compose.yml to mark datamate dataset volume and network as external
* feat: Add tag configuration management and related components
- Introduced new components for tag selection and browsing in the frontend.
- Added API endpoint to fetch tag configuration from the backend.
- Implemented tag configuration management in the backend, including loading from YAML.
- Enhanced template service to support dynamic tag rendering based on configuration.
- Updated validation utilities to incorporate tag configuration checks.
- Refactored existing code to utilize the new tag configuration structure.
* feat: Refactor LabelStudioTagConfig for improved configuration loading and validation
* feat: Update Makefile to include backend-python-docker-build in the build process
* feat: Migrate to poetry for better deps management
* Add pyyaml dependency and update Dockerfile to use Poetry for dependency management
- Added pyyaml (>=6.0.3,<7.0.0) to pyproject.toml dependencies.
- Updated Dockerfile to install Poetry and manage dependencies using it.
- Improved layer caching by copying only dependency files before the application code.
- Removed unnecessary installation of build dependencies to keep the final image size small.
* feat: Remove duplicated backend-python-docker-build target from Makefile
* fix: airflow is not ready for adding yet
* feat: update Python version to 3.12 and remove project installation step in Dockerfile
* feat: Enhance annotation module with template management and validation
- Added DatasetMappingCreateRequest and DatasetMappingUpdateRequest schemas to handle dataset mapping requests with camelCase and snake_case support.
- Introduced Annotation Template schemas including CreateAnnotationTemplateRequest, UpdateAnnotationTemplateRequest, and AnnotationTemplateResponse for managing annotation templates.
- Implemented AnnotationTemplateService for creating, updating, retrieving, and deleting annotation templates, including validation of configurations and XML generation.
- Added utility class LabelStudioConfigValidator for validating Label Studio configurations and XML formats.
- Updated database schema for annotation templates and labeling projects to include new fields and constraints.
- Seeded initial annotation templates for various use cases including image classification, object detection, and text classification.
* feat: Enhance TemplateForm with improved validation and dynamic field rendering; update LabelStudio config validation for camelCase support
* feat: Update docker-compose.yml to mark datamate dataset volume and network as external
* feat: Refactor configuration and sync logic for improved dataset handling and logging
* feat: Enhance annotation synchronization and dataset file management
- Added new fields `tags_updated_at` to `DatasetFiles` model for tracking the last update time of tags.
- Implemented new asynchronous methods in the Label Studio client for fetching, creating, updating, and deleting task annotations.
- Introduced bidirectional synchronization for annotations between DataMate and Label Studio, allowing for flexible data management.
- Updated sync service to handle annotation conflicts based on timestamps, ensuring data integrity during synchronization.
- Enhanced dataset file response model to include tags and their update timestamps.
- Modified database initialization script to create a new column for `tags_updated_at` in the dataset files table.
- Updated requirements to ensure compatibility with the latest dependencies.
* feature: implement endpoints with multi-level response models
* refactor: move `/health` and `/config` endpoints to system module, remove example from base schemas
* refactor: remove unused get_standard_response_model()