hhhhsc701
924d977d6f
Support mineru NPU processing ( #174 )
...
* feature: support simple PDF processing with unstructured
* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits
* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements
* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements
* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml
* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml
* feature: update Dockerfile for improved package source mirrors and add mineru-npu to build targets
2025-12-17 16:31:06 +08:00
hhhhsc701
d59c167da4
Make extraction and disk-write fixed steps in the operator pipeline ( #134 )
...
* feature: move the extraction step into each operator
* feature: run the disk-write operator by default
* feature: improve frontend display
* feature: manage dependencies with pyproject
2025-12-05 17:26:29 +08:00
hhhhsc701
f1bffdcd61
bugfix: update dataset status when creating a cleaning task; operators used in templates or running tasks cannot be deleted
...
* bugfix: update dataset status when creating a cleaning task; operators used in templates or running tasks cannot be deleted
2025-11-27 17:34:53 +08:00
hhhhsc701
f78475e29f
Develop hsc ( #58 )
...
feature: optimize image build/deployment
2025-11-06 17:14:54 +08:00
hhhhsc701
05b26a2981
feature: update operator names; add validation for task and template creation ( #57 )
...
* feature: update operator names; add validation for task and template creation
* feature: add caching to image builds
2025-11-05 17:38:03 +08:00
Startalker
a600c1d793
feature: modify UnstructuredFormatter and ExternalPDFFormatter description ( #44 )
...
* feature: add UnstructuredFormatter
* feature: add UnstructuredFormatter in db
* feature: add unstructured[docx]==0.18.15
* feature: support doc
* feature: add mineru
* feature: add external pdf extract operator by using mineru
* feature: mineru docker install bugfix
* feature: add unstructured xlsx/xls/csv/pptx/ppt
* feature: modify UnstructuredFormatter and ExternalPDFFormatter description
---------
Co-authored-by: Startalker <438747480@qq.com>
2025-10-31 10:32:14 +08:00
Startalker
06b05a65a9
feature: add unstructured xlsx/xls/csv/pptx/ppt ( #41 )
...
* feature: add UnstructuredFormatter
* feature: add UnstructuredFormatter in db
* feature: add unstructured[docx]==0.18.15
* feature: support doc
* feature: add mineru
* feature: add external pdf extract operator by using mineru
* feature: mineru docker install bugfix
* feature: add unstructured xlsx/xls/csv/pptx/ppt
---------
Co-authored-by: Startalker <438747480@qq.com>
2025-10-30 20:21:12 +08:00
Startalker
155603b1ca
feature: add external pdf extract operator by using mineru ( #36 )
...
* feature: add UnstructuredFormatter
* feature: add UnstructuredFormatter in db
* feature: add unstructured[docx]==0.18.15
* feature: support doc
* feature: add mineru
* feature: add external pdf extract operator by using mineru
* feature: mineru docker install bugfix
---------
Co-authored-by: Startalker <438747480@qq.com>
2025-10-30 15:55:10 +08:00
Startalker
f86d4fae25
feature: add unstructured formatter operator for doc/docx ( #17 )
...
* feature: add UnstructuredFormatter
* feature: add UnstructuredFormatter in db
* feature: add unstructured[docx]==0.18.15
* feature: support doc
---------
Co-authored-by: Startalker <438747480@qq.com>
2025-10-23 16:49:03 +08:00
Dallas98
1c97afed7d
init datamate
2025-10-21 23:00:48 +08:00