feature: add unstructured xlsx/xls/csv/pptx/ppt (#41)

* feature: add UnstructuredFormatter

* feature: add UnstructuredFormatter in db

* feature: add unstructured[docx]==0.18.15

* feature: support doc

* feature: add mineru

* feature: add external PDF extraction operator using mineru

* feature: mineru docker install bugfix

* feature: add unstructured xlsx/xls/csv/pptx/ppt

---------

Co-authored-by: Startalker <438747480@qq.com>
Startalker
2025-10-30 20:21:12 +08:00
committed by GitHub
parent b9b97c1ac2
commit 06b05a65a9
3 changed files with 3 additions and 2 deletions


@@ -19,4 +19,4 @@ xmltodict==1.0.2
 zhconv==1.4.3
 sqlalchemy==2.0.40
 pymysql==1.1.1
-unstructured[docx]==0.18.15
+unstructured[docx,csv,xlsx,pptx]==0.18.15
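
The widened extras are what make the `unstructured` partitioners for spreadsheets and slide decks importable. A minimal sketch of how a formatter might dispatch on file suffix is shown here. The `PARTITIONERS` table, `resolve_partitioner`, and `extract_text` are hypothetical names for illustration, not the actual UnstructuredFormatter code from this commit. The sketch also assumes that `.xls` is routed through `partition_xlsx` and that `.ppt` works with only the `pptx` extra when LibreOffice is available for conversion.

```python
import importlib
from pathlib import Path

# Hypothetical lookup table: file suffix -> (module, function) in `unstructured`.
# Assumption: .xls is handled by the xlsx partitioner.
PARTITIONERS = {
    ".docx": ("unstructured.partition.docx", "partition_docx"),
    ".csv": ("unstructured.partition.csv", "partition_csv"),
    ".xlsx": ("unstructured.partition.xlsx", "partition_xlsx"),
    ".xls": ("unstructured.partition.xlsx", "partition_xlsx"),
    ".pptx": ("unstructured.partition.pptx", "partition_pptx"),
    ".ppt": ("unstructured.partition.ppt", "partition_ppt"),
}


def resolve_partitioner(filename):
    """Return the (module, function) pair for a file, based on its suffix."""
    suffix = Path(filename).suffix.lower()
    try:
        return PARTITIONERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix!r}") from None


def extract_text(filename):
    """Partition a file with `unstructured` and join the element texts."""
    module_name, func_name = resolve_partitioner(filename)
    # Imported lazily so that only the extras actually needed must be installed.
    partition = getattr(importlib.import_module(module_name), func_name)
    elements = partition(filename=filename)
    return "\n".join(str(element) for element in elements)
```

Calling `extract_text("report.xlsx")` requires the `xlsx` extra to be installed; `resolve_partitioner` alone has no third-party dependencies, so the suffix mapping can be checked without `unstructured` present.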