Commit Graph

49 Commits

Author SHA1 Message Date
hhhhsc701
87e73d3bf7 feat: label-studio supports specifying the sc (#200)
* feat: label-studio build script

* feat: label-studio build script

* feat: label-studio build script

* feat: label-studio install script

* feat: label-studio supports specifying the sc
2025-12-25 16:13:38 +08:00
hhhhsc701
7e842c7cd5 feat: label-studio build script (#198)
* feat: label-studio build script

* feat: label-studio build script

* feat: label-studio build script

* feat: label-studio install script
2025-12-25 11:44:05 +08:00
hhhhsc701
1c507ac98a feat: support npu autoscaling (#197)
* feat: npu dynamic scheduling

* feat: dataset pagination optimization

* feat: support npu autoscaling

* feat: support npu autoscaling

* feat: support npu autoscaling

* feat: clean code
2025-12-24 18:03:30 +08:00
hhhhsc701
6d61348388 feat: forward deer-flow through the gateway (#193) 2025-12-23 11:35:45 +08:00
hefanli
c1516c87b6 Feat gateway (#191)
* fix: fix the routes definition

* fix: fix the helm installing file

* fix: modify the logging dependencies
2025-12-22 18:58:55 +08:00
hefanli
d419eec3ec fix: fix the routes definition (#189) 2025-12-22 16:04:49 +08:00
hefanli
e5b28c26b1 add gateway (#187)
* feature: add gateway
2025-12-22 15:41:17 +08:00
hhhhsc701
be875086db feat: add operator-packages-volume to docker-compose and update Docke… (#179)
* feat: add operator-packages-volume to docker-compose and update Dockerfile for site-packages path

* feat: add retry
2025-12-18 20:32:42 +08:00
Dallas98
8113840ac7 fix(docker-compose): update entrypoint and command for mineru-openai-server configuration (#176) 2025-12-17 21:23:03 +08:00
hhhhsc701
924d977d6f Support mineru npu processing (#174)
* feature: unstructured supports simple pdf processing

* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits

* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements

* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements

* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml

* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml

* feature: update Dockerfile for improved package source mirrors and add mineru-npu to build targets
2025-12-17 16:31:06 +08:00
hhhhsc701
62b91b6deb bugfix: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits (#172)
* feature: unstructured supports simple pdf processing

* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits
2025-12-17 10:41:13 +08:00
hhhhsc701
fc9fb07e77 bugfix (#164) 2025-12-11 23:17:01 +08:00
Dallas98
ec87e4f204 feat(frontend): improve the UX of the Synthesis Data Detail page (#163)
* fix(chart): update Helm chart helpers and values for improved configuration

* feat(SynthesisTaskTab): enhance task table with tooltip support and improved column widths

* feat(CreateTask, SynthFileTask): improve task creation and detail view with enhanced payload handling and UI updates

* feat(SynthFileTask): enhance file display with progress tracking and delete action

* feat(SynthFileTask): enhance file display with progress tracking and delete action

* feat(SynthDataDetail): add delete action for chunks with confirmation prompt

* feat(SynthDataDetail): update edit and delete buttons to icon-only format

* feat(SynthDataDetail): add confirmation modals for chunk and synthesis data deletion
2025-12-11 21:02:44 +08:00
Dallas98
174359be9f feat(milvus): update Milvus configuration to use URI and remove deprecated host/port settings (#155) 2025-12-10 18:41:20 +08:00
Dallas98
44d72c446f feat(milvus): update Milvus configuration to use URI and remove deprecated host/port settings 2025-12-10 18:27:58 +08:00
Dallas98
cbb146d3d7 feat(chart): add Helm chart for deploying Label Studio with PostgreSQL (#152)
* feat(chart): add Helm chart for deploying Label Studio with PostgreSQL

* feat(milvus): update Milvus configuration to use URI and remove deprecated host/port settings
2025-12-10 17:46:12 +08:00
hhhhsc701
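103c21945d Fix some features (#138)
* feature: unify versions

* feature: fix incorrect default value display for scheduled sync and add a hint

* feature: fix data collection search

* feature: optimize annotation template query

* feature: disable the webhook feature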
2025-12-10 14:31:05 +08:00
hefanli
f8b32506cf fix: in k8s deployments, the backend-python service needs storage mounted (#144) 2025-12-09 19:09:51 +08:00
Dallas98
bef15f328d feat(config): add proxy configuration for evaluation API endpoint (#141) 2025-12-09 15:01:44 +08:00
hefanli
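758cf93e36 feature: add archive upload support (#137)
* feature: add archive upload support

* fix: refresh the dataset's file-related statistics when a file is deleted

* fix: add routes for the evaluation service in k8s scenarios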
2025-12-09 14:42:27 +08:00
Dallas98
7012a9ad98 feat: enhance backend deployment, frontend file selection and synthesis task management (#129)
* feat: Implement data synthesis task management with database models and API endpoints

* feat: Update Python version requirements and refine dependency constraints in configuration

* fix: Correctly extract file values from selectedFilesMap in AddDataDialog

* feat: Refactor synthesis task routes and enhance file task management in the API

* feat: Enhance SynthesisTaskTab with tooltip actions and add chunk data retrieval in API
2025-12-04 09:57:13 +08:00
hefanli
1d19cd3a62 feature: add data-evaluation
* feature: add evaluation task management function

* feature: add evaluation task detail page

* fix: delete duplicate definition for table t_model_config

* refactor: rename package synthesis to ratio

* refactor: add eval file table and refactor related code

* fix: calling large models in parallel during evaluation
2025-12-04 09:23:54 +08:00
hhhhsc701
b5fa8af900 bugfix: fix deer-flow deployment (#124) 2025-12-02 19:23:30 +08:00
hhhhsc701
f730bd5b0c bugfix: support using a single runtime instance (#118)
* bugfix: support using a single runtime instance
2025-12-01 19:05:50 +08:00
hhhhsc701
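bb3345268e bugfix: cleaning/operators support search by name/description (#116)
* bugfix: adapt milvus to the etcd deploy deployment

* bugfix: allow jumping from the knowledge base page to model creation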
2025-11-29 18:15:43 +08:00
hhhhsc701
fe42b03548 bugfix: some milvus components support an image registry (#114) 2025-11-28 17:39:56 +08:00
hhhhsc701
91390cace0 feature: northbound API: support creating cleaning tasks from templates (#111)
feature: northbound API: support creating cleaning tasks from templates
2025-11-26 17:30:54 +08:00
hhhhsc701
bc2f57f2c0 feature: adjust indentation (#109) 2025-11-26 11:24:50 +08:00
hhhhsc701
af2a01e52d feature: milvus pvc supports local directories (#105)
feature: milvus pvc supports local directories
2025-11-25 16:54:24 +08:00
hhhhsc701
fb399b74cf feature: pvc supports local disks + sc configuration (#104) 2025-11-24 17:29:32 +08:00
hhhhsc701
536ef9f556 feature: rename milvus service for k8s compatibility (#97)
feature: rename milvus service for k8s compatibility (#97)
2025-11-21 12:06:53 +08:00
hhhhsc701
d9e163c163 Develop deer flow (#85)
* fix: deer-flow supports getting the search engine from datamate
2025-11-14 17:36:55 +08:00
Dallas98
aa01f52535 Merge pull request #74
* feat: Implement system parameter management with Redis integration
2025-11-11 22:13:14 +08:00
Jason Wang
c5ccc56cca feat: Add labeling template (#72)
* feat: Enhance annotation module with template management and validation

- Added DatasetMappingCreateRequest and DatasetMappingUpdateRequest schemas to handle dataset mapping requests with camelCase and snake_case support.
- Introduced Annotation Template schemas including CreateAnnotationTemplateRequest, UpdateAnnotationTemplateRequest, and AnnotationTemplateResponse for managing annotation templates.
- Implemented AnnotationTemplateService for creating, updating, retrieving, and deleting annotation templates, including validation of configurations and XML generation.
- Added utility class LabelStudioConfigValidator for validating Label Studio configurations and XML formats.
- Updated database schema for annotation templates and labeling projects to include new fields and constraints.
- Seeded initial annotation templates for various use cases including image classification, object detection, and text classification.

* feat: Enhance TemplateForm with improved validation and dynamic field rendering; update LabelStudio config validation for camelCase support

* feat: Update docker-compose.yml to mark datamate dataset volume and network as external
2025-11-11 09:14:14 +08:00
hhhhsc701
9dd26d622f feature: build database image (#70)
* feature: build database image

* feature: add archive package pipeline
2025-11-10 19:06:53 +08:00
Jason Wang
8a0228b20e feat: Enhanced file and annotation synchronization across DataMate and LabelStudio. fix: change LabelStudio mapping to +1 of DataMate.
* feat: Refactor configuration and sync logic for improved dataset handling and logging

* feat: Enhance annotation synchronization and dataset file management

- Added new field `tags_updated_at` to `DatasetFiles` model for tracking the last update time of tags.
- Implemented new asynchronous methods in the Label Studio client for fetching, creating, updating, and deleting task annotations.
- Introduced bidirectional synchronization for annotations between DataMate and Label Studio, allowing for flexible data management.
- Updated sync service to handle annotation conflicts based on timestamps, ensuring data integrity during synchronization.
- Enhanced dataset file response model to include tags and their update timestamps.
- Modified database initialization script to create a new column for `tags_updated_at` in the dataset files table.
- Updated requirements to ensure compatibility with the latest dependencies.

* fix: Update port mapping for label studio and adjust base URL in DataAnnotation component
2025-11-10 10:04:41 +08:00
hhhhsc701
f78475e29f Develop hsc (#58)
feature: optimize image build/deployment
2025-11-06 17:14:54 +08:00
hhhhsc701
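05b26a2981 feature: update operator names; add validation for task and template creation (#57)
* feature: update operator names; add validation for task and template creation

* feature: add caching to image builds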
2025-11-05 17:38:03 +08:00
Jason Wang
b5fe787c20 feat: Labeling Frontend adaptations + Backend build and deploy + Logging improvement (#55)
* feat: Front-end data annotation page adaptation to the backend API.

* feat: Implement labeling configuration editor and enhance annotation task creation form

* feat: add python backend build and deployment; add backend configuration for Label Studio integration and improve logging setup

* refactor: remove duplicate log configuration
2025-11-05 01:55:53 +08:00
hhhhsc701
f3958f08d9 feature: integrate with deer-flow (#54)
feature: integrate with deer-flow
2025-11-04 20:30:40 +08:00
hhhhsc701
b9b97c1ac2 Develop op (#35)
* refactor: enhance CleaningTaskService and related components with validation and repository updates
* feature: support creating operators by upload
2025-10-30 17:17:00 +08:00
Dallas98
8d2b41ed94 feature: Implement the basic knowledge generation function (#40) 2025-10-30 16:50:54 +08:00
Startalker
155603b1ca feature: add external pdf extract operator by using mineru (#36)
* feature: add UnstructuredFormatter

* feature: add UnstructuredFormatter in db

* feature: add unstructured[docx]==0.18.15

* feature: support doc

* feature: add mineru

* feature: add external pdf extract operator by using mineru

* feature: mineru docker install bugfix

---------

Co-authored-by: Startalker <438747480@qq.com>
2025-10-30 15:55:10 +08:00
Jason Wang
2f7341dc1f refactor: Reorganize datamate-python (#34)
refactor: Reorganize datamate-python (previously label-studio-adapter) into a DDD style structure.
2025-10-30 01:32:59 +08:00
hhhhsc
a69b9f4921 feature: integrate with deer-flow 2025-10-28 10:54:29 +08:00
hhhhsc
2d2419205a refactor: rename and reorganize data models and repositories for clarity 2025-10-24 15:33:46 +08:00
hhhhsc
17e6cea1d9 refactor: reorganize Helm chart structure and update service configurations 2025-10-23 16:57:12 +08:00
hhhhsc701
31ef8bc265 [Feature] Refactor project to use 'datamate' naming convention for services and configurations (#14)
* Enhance CleaningTaskService to track cleaning process progress and update ExecutorType to DATAMATE

* Refactor project to use 'datamate' naming convention for services and configurations
2025-10-22 17:53:16 +08:00
Dallas98
1c97afed7d init datamate 2025-10-21 23:00:48 +08:00