hhhhsc701
7d4dcb756b
fix: 修复入库可能重复;筛选逻辑优化 ( #226 )
...
* 修改数据清洗筛选逻辑-筛选修改为多选
* 修改数据清洗筛选逻辑-筛选修改为多选
* antd 组件库样式定制修改
* fix: 修复入库可能重复
* fix: 算子市场筛选逻辑优化
* fix: 清洗任务创建筛选逻辑优化
* fix: 清洗任务创建筛选逻辑优化
---------
Co-authored-by: chase <byzhangxin11@126.com >
2026-01-06 17:57:25 +08:00
Kecheng Sha
3f1ad6a872
feat(auto-annotation): integrate YOLO auto-labeling and enhance data management ( #223 )
...
* feat(auto-annotation): initial setup
* chore: remove package-lock.json
* chore: 清理本地测试脚本与 Maven 设置
* chore: change package-lock.json
2026-01-05 14:22:44 +08:00
hhhhsc701
6a1eb85e8e
feat: 支持运行data-juicer算子 ( #215 )
...
* feature: 增加data-juicer算子
* feat: 支持运行data-juicer算子
* feat: 支持data-juicer任务下发
* feat: 支持data-juicer结果数据集归档
* feat: 支持data-juicer结果数据集归档
2025-12-31 09:20:41 +08:00
hhhhsc701
1c507ac98a
feat: 支持npu自动扩缩容 ( #197 )
...
* feat: npu动态调度
* feat: 数据集分页优化
* feat: 支持npu自动扩缩容
* feat: 支持npu自动扩缩容
* feat: 支持npu自动扩缩容
* feat: clean code
2025-12-24 18:03:30 +08:00
hhhhsc701
d82bff441a
fix: prevent deletion of predefined operators and improve error handling ( #192 )
...
* fix: prevent deletion of predefined operators and improve error handling
* fix: prevent deletion of predefined operators and improve error handling
2025-12-22 19:30:41 +08:00
hhhhsc701
ab4523b556
add export type settings and enhance metadata structure ( #181 )
...
* fix(session): enhance database connection settings with pool pre-ping and recycle options
* feat(metadata): add export type settings and enhance metadata structure
* fix(base_op): improve sample handling by introducing target_type key and consolidating text/data retrieval logic
* feat(metadata): add export type settings and enhance metadata structure
* feat(metadata): add export type settings and enhance metadata structure
2025-12-19 11:54:08 +08:00
hhhhsc701
62b91b6deb
bugfix: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits ( #172 )
...
* feature: unstructured支持简单pdf处理
* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits
2025-12-17 10:41:13 +08:00
hhhhsc701
fc9fb07e77
bugfix ( #164 )
2025-12-11 23:17:01 +08:00
hhhhsc701
f69ed6b8aa
Revert "feature: 增加data-juicer算子" ( #158 )
...
Revert "feature: 增加data-juicer算子 (#157 )"
This reverts commit 786f13f9c3 .
2025-12-11 10:32:53 +08:00
hhhhsc701
786f13f9c3
feature: 增加data-juicer算子 ( #157 )
2025-12-11 10:32:19 +08:00
hhhhsc701
d59c167da4
算子将抽取与落盘固定到流程中 ( #134 )
...
* feature: 将抽取动作移到每一个算子中
* feature: 落盘算子改为默认执行
* feature: 优化前端展示
* feature: 使用pyproject管理依赖
2025-12-05 17:26:29 +08:00
hhhhsc701
265e284fb8
feature: 修改算子开发指南 ( #127 )
2025-12-03 17:45:08 +08:00
hhhhsc701
c22683d635
优化部分问题 ( #126 )
...
* feature: 支持相对路径引用
* feature: 优化本地部署命令
* feature: 优化算子编排展示
* feature: 优化清洗任务失败后重试
2025-12-03 16:41:48 +08:00
hhhhsc701
07029d07ff
优化清洗重试机制,优化清洗进度展示,修复模板无法展示参数 ( #113 )
...
* bugfix: 模板无法展示参数
* bugfix: 优化清洗进度展示
* bugfix: 优化清洗重试机制
2025-11-28 15:28:10 +08:00
hhhhsc701
6bbde0ec56
feature: 清洗任务详情页 ( #73 )
...
* feature: 清洗任务详情
* fix: 取消构建镜像,改为直接拉取
* fix: 增加清洗任务详情页
* fix: 增加清洗任务详情页
* fix: 算子列表可点击
* fix: 模板详情和更新
2025-11-12 18:00:19 +08:00
hhhhsc701
05b26a2981
feature: 更新算子名称;增加创建任务、模板校验 ( #57 )
...
* feature: 更新算子名称;增加创建任务、模板校验
* feature: 镜像构建增加缓存
2025-11-05 17:38:03 +08:00
Startalker
155603b1ca
feature: add external pdf extract operator by using mineru ( #36 )
...
* feature: add UnstructuredFormatter
* feature: add UnstructuredFormatter in db
* feature: add unstructured[docx]==0.18.15
* feature: support doc
* feature: add mineru
* feature: add external pdf extract operator by using mineru
* feature: mineru docker install bugfix
---------
Co-authored-by: Startalker <438747480@qq.com >
2025-10-30 15:55:10 +08:00
hhhhsc
2d2419205a
refactor: rename and reorganize data models and repositories for clarity
2025-10-24 15:33:46 +08:00
hhhhsc701
31ef8bc265
[Feature] Refactor project to use 'datamate' naming convention for services and configurations ( #14 )
...
* Enhance CleaningTaskService to track cleaning process progress and update ExecutorType to DATAMATE
* Refactor project to use 'datamate' naming convention for services and configurations
2025-10-22 17:53:16 +08:00
Dallas98
1c97afed7d
init datamate
2025-10-21 23:00:48 +08:00