* feat(generation_service): add image URL extraction and random QA generation logic
* fix(generation_service): increase batch size from 20 to 100 for improved chunk processing
* fix(chart): update Helm chart helpers and values for improved configuration
* feat(SynthesisTaskTab): enhance task table with tooltip support and improved column widths
* feat(CreateTask, SynthFileTask): improve task creation and detail view with enhanced payload handling and UI updates
* feat(SynthFileTask): enhance file display with progress tracking and delete action
* feat(SynthDataDetail): add delete action for chunks with confirmation prompt
* feat(SynthDataDetail): update edit and delete buttons to icon-only format
* feat(SynthDataDetail): add confirmation modals for chunk and synthesis data deletion
* feat(DocumentSplitter): add enhanced document splitting functionality with CJK support and metadata detection
* feat(DataSynthesis): refactor data synthesis models and update task handling logic
* feat(DataSynthesis): streamline synthesis task handling and enhance chunk processing logic
* fix(generation_service): ensure processed chunks are incremented regardless of question generation success
* feat(CreateTask): enhance task creation with new synthesis templates and improved configuration options
* feat(model_chat): enhance JSON parsing by removing additional thought tags and improving fallback logic
* feat(generation_service): add document filtering to remove short documents based on chunk size
* fix: resolve error raised when creating a proportioning task with only the proportioning quantity set
* fix: prevent adding the same file multiple times
* fix: implement a more flexible matching strategy, allowing only the tag name to be configured for matching
* feat(synthesis): add evaluation task creation functionality and UI enhancements
* feat(synthesis): implement synthesis data management features including loading, editing, and deleting
* feat(synthesis): add endpoints for deleting and updating synthesis data and chunks
* fix: Correctly extract file values from selectedFilesMap in AddDataDialog
* feat: add CoT data evaluation function
* fix: add validation of evaluation results
* fix: correct the evaluation prompt
* fix: resolve read failure when the evaluation result is empty
* feat: Implement data synthesis task management with database models and API endpoints
* feat: Update Python version requirements and refine dependency constraints in configuration
* feat: Refactor synthesis task routes and enhance file task management in the API
* feat: Enhance SynthesisTaskTab with tooltip actions and add chunk data retrieval in API
- Updated `update_file_tags` to support both simplified and full tag formats.
- Introduced `TagFormatConverter` to handle conversion from simplified external tags to internal storage format.
- Added logic to fetch and utilize the appropriate annotation template for conversion.
- Improved error handling for missing templates and unknown controls during tag updates.
- Created example script demonstrating the usage of the new tag format conversion feature.
- Added unit tests for `TagFormatConverter` to ensure correct functionality and edge case handling.
* feat: Enhance annotation module with template management and validation
- Added DatasetMappingCreateRequest and DatasetMappingUpdateRequest schemas to handle dataset mapping requests with camelCase and snake_case support.
- Introduced Annotation Template schemas including CreateAnnotationTemplateRequest, UpdateAnnotationTemplateRequest, and AnnotationTemplateResponse for managing annotation templates.
- Implemented AnnotationTemplateService for creating, updating, retrieving, and deleting annotation templates, including validation of configurations and XML generation.
- Added utility class LabelStudioConfigValidator for validating Label Studio configurations and XML formats.
- Updated database schema for annotation templates and labeling projects to include new fields and constraints.
- Seeded initial annotation templates for various use cases including image classification, object detection, and text classification.
* feat: Enhance TemplateForm with improved validation and dynamic field rendering; update LabelStudio config validation for camelCase support
* feat: Update docker-compose.yml to mark datamate dataset volume and network as external
* feat: Add tag configuration management and related components
- Introduced new components for tag selection and browsing in the frontend.
- Added API endpoint to fetch tag configuration from the backend.
- Implemented tag configuration management in the backend, including loading from YAML.
- Enhanced template service to support dynamic tag rendering based on configuration.
- Updated validation utilities to incorporate tag configuration checks.
- Refactored existing code to utilize the new tag configuration structure.
* feat: Refactor LabelStudioTagConfig for improved configuration loading and validation
* feat: Update Makefile to include backend-python-docker-build in the build process
* feat: Migrate to poetry for better deps management
* Add pyyaml dependency and update Dockerfile to use Poetry for dependency management
- Added pyyaml (>=6.0.3,<7.0.0) to pyproject.toml dependencies.
- Updated Dockerfile to install Poetry and manage dependencies using it.
- Improved layer caching by copying only dependency files before the application code.
- Removed unnecessary installation of build dependencies to keep the final image size small.
* feat: Remove duplicated backend-python-docker-build target from Makefile
* fix: Airflow is not yet ready to be added
* feat: update Python version to 3.12 and remove project installation step in Dockerfile
* feat: Refactor configuration and sync logic for improved dataset handling and logging
* feat: Enhance annotation synchronization and dataset file management
- Added a new field `tags_updated_at` to the `DatasetFiles` model for tracking the last update time of tags.
- Implemented new asynchronous methods in the Label Studio client for fetching, creating, updating, and deleting task annotations.
- Introduced bidirectional synchronization for annotations between DataMate and Label Studio, allowing for flexible data management.
- Updated sync service to handle annotation conflicts based on timestamps, ensuring data integrity during synchronization.
- Enhanced dataset file response model to include tags and their update timestamps.
- Modified database initialization script to create a new column for `tags_updated_at` in the dataset files table.
- Updated requirements to ensure compatibility with the latest dependencies.
* feat: implement endpoints with multi-level response models
* refactor: move `/health` and `/config` endpoints to system module, remove example from base schemas
* refactor: remove unused get_standard_response_model()