hhhhsc701
ab4523b556
add export type settings and enhance metadata structure ( #181 )
...
* fix(session): enhance database connection settings with pool pre-ping and recycle options
* feat(metadata): add export type settings and enhance metadata structure
* fix(base_op): improve sample handling by introducing target_type key and consolidating text/data retrieval logic
* feat(metadata): add export type settings and enhance metadata structure
* feat(metadata): add export type settings and enhance metadata structure
2025-12-19 11:54:08 +08:00
hhhhsc701
62b91b6deb
bugfix: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits ( #172 )
...
* feature: support simple PDF processing with unstructured
* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits
2025-12-17 10:41:13 +08:00
hhhhsc701
fc9fb07e77
bugfix ( #164 )
2025-12-11 23:17:01 +08:00
hhhhsc701
f69ed6b8aa
Revert "feature: add data-juicer operators" ( #158 )
...
Revert "feature: add data-juicer operators (#157)"
This reverts commit 786f13f9c3.
2025-12-11 10:32:53 +08:00
hhhhsc701
786f13f9c3
feature: add data-juicer operators ( #157 )
2025-12-11 10:32:19 +08:00
hhhhsc701
d59c167da4
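Operators: make extraction and persisting to disk fixed steps of the pipeline ( #134 )
...
* feature: move the extraction step into each operator
* feature: run the persist-to-disk operator by default
* feature: improve frontend display
* feature: manage dependencies with pyproject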
2025-12-05 17:26:29 +08:00
hhhhsc701
265e284fb8
feature: update the operator development guide ( #127 )
2025-12-03 17:45:08 +08:00
hhhhsc701
c22683d635
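Improve several issues ( #126 )
...
* feature: support relative path references
* feature: improve local deployment commands
* feature: improve operator orchestration display
* feature: improve retry after a cleaning task fails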
2025-12-03 16:41:48 +08:00
hhhhsc701
07029d07ff
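Improve cleaning retry mechanism, improve cleaning progress display, fix templates not displaying parameters ( #113 )
...
* bugfix: templates not displaying parameters
* bugfix: improve cleaning progress display
* bugfix: improve cleaning retry mechanism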
2025-11-28 15:28:10 +08:00
hhhhsc701
6bbde0ec56
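feature: cleaning task detail page ( #73 )
...
* feature: cleaning task details
* fix: stop building images and pull them directly instead
* fix: add cleaning task detail page
* fix: add cleaning task detail page
* fix: make the operator list clickable
* fix: template details and update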
2025-11-12 18:00:19 +08:00
hhhhsc701
05b26a2981
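feature: update operator names; add validation for task and template creation ( #57 )
...
* feature: update operator names; add validation for task and template creation
* feature: add caching to image builds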
2025-11-05 17:38:03 +08:00
Startalker
155603b1ca
feature: add external pdf extract operator by using mineru ( #36 )
...
* feature: add UnstructuredFormatter
* feature: add UnstructuredFormatter in db
* feature: add unstructured[docx]==0.18.15
* feature: support doc
* feature: add mineru
* feature: add external pdf extract operator by using mineru
* feature: mineru docker install bugfix
---------
Co-authored-by: Startalker <438747480@qq.com>
2025-10-30 15:55:10 +08:00
hhhhsc
2d2419205a
refactor: rename and reorganize data models and repositories for clarity
2025-10-24 15:33:46 +08:00
hhhhsc701
31ef8bc265
[Feature] Refactor project to use 'datamate' naming convention for services and configurations ( #14 )
...
* Enhance CleaningTaskService to track cleaning process progress and update ExecutorType to DATAMATE
* Refactor project to use 'datamate' naming convention for services and configurations
2025-10-22 17:53:16 +08:00
Dallas98
1c97afed7d
init datamate
2025-10-21 23:00:48 +08:00