* feature: unstructured supports simple PDF processing
* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits
* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements
* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml
* feature: update Dockerfile for improved package source mirrors and add mineru-npu to build targets
* feat(synthesis): add evaluation task creation functionality and UI enhancements
* feat(synthesis): implement synthesis data management features including loading, editing, and deleting
* feat(synthesis): add endpoints for deleting and updating synthesis data and chunks
* fix: Correctly extract file values from selectedFilesMap in AddDataDialog
* docs: update README and Makefile for clarity and new development instructions
* feature: Add Label Studio installation and uninstallation commands to Makefile
* feature: Enhance Makefile with detailed help commands and improve install/uninstall targets for services
* feature: Update Makefile help commands to clarify usage of local images
* feature: Improve error handling in Makefile for build, install, and uninstall targets
* feature: Enhance uninstall process in Makefile to prompt for volume deletion and update README with usage details
---------
Co-authored-by: Jason Wang <wjl_jason@qq.com>
* feat: Enhance annotation module with template management and validation
- Added DatasetMappingCreateRequest and DatasetMappingUpdateRequest schemas to handle dataset mapping requests with camelCase and snake_case support.
- Introduced Annotation Template schemas including CreateAnnotationTemplateRequest, UpdateAnnotationTemplateRequest, and AnnotationTemplateResponse for managing annotation templates.
- Implemented AnnotationTemplateService for creating, updating, retrieving, and deleting annotation templates, including validation of configurations and XML generation.
- Added utility class LabelStudioConfigValidator for validating Label Studio configurations and XML formats.
- Updated database schema for annotation templates and labeling projects to include new fields and constraints.
- Seeded initial annotation templates for various use cases including image classification, object detection, and text classification.
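
The kind of check a Label Studio configuration validator performs can be sketched as follows. This is a minimal illustration using only the standard library, not the actual `LabelStudioConfigValidator` from this change: the function name `validate_label_config` and its specific rules (root must be `<View>`, `name` attributes unique, every `toName` pointing at a declared `name`) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def validate_label_config(xml_text: str) -> list[str]:
    """Hypothetical helper: return a list of problems; empty means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Malformed XML: nothing else can be checked.
        return [f"invalid XML: {exc}"]

    errors: list[str] = []

    # Assumed rule: the configuration is wrapped in a <View> root element.
    if root.tag != "View":
        errors.append(f"root element must be <View>, got <{root.tag}>")

    # Collect every declared name; assumed rule: names are unique.
    names: set[str] = set()
    for element in root.iter():
        name = element.get("name")
        if name is None:
            continue
        if name in names:
            errors.append(f"duplicate name: {name}")
        names.add(name)

    # Assumed rule: every toName refers to a name declared somewhere in the config.
    for element in root.iter():
        target = element.get("toName")
        if target is not None and target not in names:
            errors.append(f"<{element.tag}> references unknown name '{target}'")

    return errors
```

For an image classification template such as `<View><Image name="image" value="$image"/><Choices name="choice" toName="image"><Choice value="Cat"/></Choices></View>`, this sketch returns an empty list; changing `toName` to an undeclared name yields one error.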
* feat: Enhance TemplateForm with improved validation and dynamic field rendering; update LabelStudio config validation for camelCase support
* feat: Update docker-compose.yml to mark datamate dataset volume and network as external
* feat: Add tag configuration management and related components
- Introduced new components for tag selection and browsing in the frontend.
- Added API endpoint to fetch tag configuration from the backend.
- Implemented tag configuration management in the backend, including loading from YAML.
- Enhanced template service to support dynamic tag rendering based on configuration.
- Updated validation utilities to incorporate tag configuration checks.
- Refactored existing code to utilize the new tag configuration structure.
* feat: Refactor LabelStudioTagConfig for improved configuration loading and validation
* feat: Update Makefile to include backend-python-docker-build in the build process
* feat: Migrate to Poetry for better dependency management
* Add pyyaml dependency and update Dockerfile to use Poetry for dependency management
- Added pyyaml (>=6.0.3,<7.0.0) to pyproject.toml dependencies.
- Updated Dockerfile to install Poetry and manage dependencies using it.
- Improved layer caching by copying only dependency files before the application code.
- Removed unnecessary installation of build dependencies to keep the final image size small.
* feat: Remove duplicated backend-python-docker-build target from Makefile
* fix: airflow is not yet ready to be added
* feat: update Python version to 3.12 and remove project installation step in Dockerfile
* Enhance CleaningTaskService to track cleaning process progress and update ExecutorType to DATAMATE
* Refactor project to use 'datamate' naming convention for services and configurations