7 Commits

Author SHA1 Message Date
b5aaf52bb6 chore(deps): update paddlenlp dependency version
- Downgrade paddlenlp from 3.0b4 to 2.8.1
- Keep other dependency versions unchanged
- Ensure dependency version compatibility
2026-01-09 17:20:05 +08:00
103cb94a6d feat(runtime): add PaddleNLP dependency
- Add paddlenlp==3.0.0b4 dependency to pyproject.toml
- Provide natural language processing support for extending OCR functionality
2026-01-09 15:51:42 +08:00
Kecheng Sha
3f1ad6a872 feat(auto-annotation): integrate YOLO auto-labeling and enhance data management (#223)
* feat(auto-annotation): initial setup

* chore: remove package-lock.json

* chore: clean up local test scripts and Maven settings

* chore: change package-lock.json
2026-01-05 14:22:44 +08:00
hhhhsc701
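6a1eb85e8e feat: support running data-juicer operators (#215)
* feature: add data-juicer operators

* feat: support running data-juicer operators

* feat: support dispatching data-juicer tasks

* feat: support archiving data-juicer result datasets

* feat: support archiving data-juicer result datasets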
2025-12-31 09:20:41 +08:00
hhhhsc701
924d977d6f Support mineru NPU processing (#174)
* feature: support simple PDF processing with unstructured

* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits

* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements

* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements

* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml

* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml

* feature: update Dockerfile for improved package source mirrors and add mineru-npu to build targets
2025-12-17 16:31:06 +08:00
hhhhsc701
19a04df276 feature: add watermark removal and advanced anonymization operators (#151)
* feature: add watermark removal operator

* feature: clean code

* feature: clean code

* feature: add advanced anonymization operator
2025-12-10 18:12:47 +08:00
hhhhsc701
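d59c167da4 Make extraction and writing to disk fixed steps of the operator pipeline (#134)
* feature: move the extraction step into each operator

* feature: make the write-to-disk operator run by default

* feature: improve front-end display

* feature: manage dependencies with pyproject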
2025-12-05 17:26:29 +08:00