43f7d88ad2
fix(data-cleaning): fix missing argument in the dataset file query method call
...
- Add the missing argument to the datasetFileService.getDatasetFiles call
- Ensure paginated queries work correctly
- Resolve the method call failure caused by the parameter mismatch
2026-01-27 19:54:11 +08:00
3a93098b57
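feat(data-management): add annotation-result filtering for dataset files
...
- Add the hasAnnotation query parameter to the OpenAPI spec to filter files that have annotation results
- Update the backend service layer DatasetFileApplicationService to support the hasAnnotation parameter
- Update the data access layer DatasetFileRepositoryImpl to implement an existence query based on annotation results
- Adjust the frontend DatasetFileTransfer component to support annotation filtering
- Remove the unused chunking option config and optimize the select-all logic
- Fix parameter passing and dependency tracking issues in file queries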
2026-01-27 18:11:30 +08:00
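Illustration for the hasAnnotation filter above: a minimal sketch of an existence-based query, written in Python with sqlite3 rather than the project's Java repository layer. The table names dataset_file and annotation_result, their columns, and the list_files helper are all hypothetical; the commit does not show the real schema or say how a false value is handled, so this sketch treats anything other than a true value as "no filter".

```python
import sqlite3


def list_files(conn, dataset_id, has_annotation=None):
    """Return file ids in a dataset, optionally only those with annotation results."""
    # Hypothetical schema: dataset_file(id, dataset_id), annotation_result(id, file_id)
    sql = "SELECT f.id FROM dataset_file f WHERE f.dataset_id = ?"
    if has_annotation:
        # EXISTS stops at the first matching row, so files with many
        # annotation results are not duplicated in the output.
        sql += " AND EXISTS (SELECT 1 FROM annotation_result a WHERE a.file_id = f.id)"
    sql += " ORDER BY f.id"
    return [row[0] for row in conn.execute(sql, (dataset_id,))]
```

An EXISTS subquery is used here instead of a JOIN so that pagination counts stay per file, not per annotation row.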
6835511f5a
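feat(data-management): change knowledge item export to ZIP format
...
- Change the export file format from JSON to a ZIP archive
- Use ZipArchiveOutputStream to create the ZIP file
- Create a separate file entry for each knowledge item
- Add file name normalization and length-limit logic
- Handle duplicate file names by appending an index number
- Remove the Jackson ObjectMapper dependency import
- Update the response Content-Type header to application/zip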
2026-01-26 11:15:58 +08:00
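Illustration for the ZIP export above: a minimal sketch of per-item entries with name normalization, a length cap, and index suffixes for duplicates. It uses Python's zipfile module, not the ZipArchiveOutputStream named in the commit. The 100-character limit, the .md extension, the "untitled" fallback, and the function names are assumptions made for the example.

```python
import io
import re
import zipfile

MAX_NAME_LENGTH = 100  # hypothetical limit; the commit does not state the real value


def normalize_name(title: str) -> str:
    """Replace characters that are unsafe in file names and cap the length."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
    return (cleaned or "untitled")[:MAX_NAME_LENGTH]


def export_items(items: list[tuple[str, str]]) -> bytes:
    """Write each (title, content) pair as its own entry in an in-memory ZIP."""
    used: set[str] = set()
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for title, content in items:
            base = normalize_name(title)
            name = f"{base}.md"
            index = 1
            # Repeated names get an index suffix until the name is unique.
            while name in used:
                name = f"{base}_{index}.md"
                index += 1
            used.add(name)
            archive.writestr(name, content)
    return buffer.getvalue()
```

The returned bytes would be sent with the application/zip content type mentioned in the commit.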
a8c7c9404c
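feat(knowledge): add knowledge item export and file upload support
...
- Add the exportKnowledgeItems method to KnowledgeItemApplicationService to export knowledge items
- Add export-related constants, including the file name format and content type
- Add the findAllBySetId query method to KnowledgeItemRepository
- Add an export endpoint to KnowledgeItemController
- Add file upload support for txt/md/markdown formats to the KnowledgeItemEditor component
- Add an export button to the KnowledgeSetDetail page and wire it to the export API
- Update the frontend API file with the exportKnowledgeItemsUsingGet method
- Configure file upload validation and automatic filling of the title and content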
2026-01-26 11:13:21 +08:00
c5ace0c4cc
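feat(annotation): support an embedded annotation editor for image datasets
...
- Add a file preview endpoint that serves a specified dataset file inline
- Implement image task construction, supporting the data structure for image annotation tasks
- Extend the annotation editor service to support TEXT and IMAGE dataset types
- Add media object classification support and parse the image annotation config
- Implement image file preview URL construction
- Optimize project info retrieval and task response construction
- Fix an incorrect project ID reference in a database query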
2026-01-25 17:25:44 +08:00
73f0ab65fa
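feat(annotation): sync annotation results to knowledge management
...
- Add source dataset ID and file ID fields to the knowledge item entity
- Implement the service logic in the annotation editor that syncs annotation results to knowledge management
- Add a knowledge sync service class that converts annotations into knowledge items and syncs them
- Implement a standalone service module that fetches text content through the download endpoint
- Update the knowledge item query endpoint to support filtering by source dataset and file ID
- Automatically create and link the knowledge set for each annotation project
- Support merging annotation results into the content of text and Markdown files
- Add error handling and logging to the sync process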
2026-01-21 16:09:34 +08:00
e78acbea0a
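feat(data-management): add knowledge base management
...
- Add knowledge base error codes to DataManagementErrorCode
- Create the knowledge set and knowledge item tables in the database initialization script
- Add KnowledgeItemApplicationService to implement CRUD operations for knowledge items
- Add KnowledgeSetApplicationService to implement CRUD operations for knowledge sets
- Define the KnowledgeContentType, KnowledgeSourceType, and KnowledgeStatusType enums
- Create the KnowledgeItem and KnowledgeSet domain model entities
- Implement the KnowledgeItemMapper and KnowledgeSetMapper data access interfaces
- Provide the KnowledgeItemRepositoryImpl and KnowledgeSetRepositoryImpl repository implementations
- Add conditional paginated queries for knowledge items
- Implement importing knowledge items from dataset files
- Support tag management and status control for knowledge sets and knowledge items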
2026-01-21 11:32:45 +08:00
79371ba078
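feat(data-management): add parent-child hierarchy for datasets
...
- Add the parentDatasetId field to the OpenAPI spec for hierarchy filtering
- Implement creation, update, and deletion logic for dataset parent-child relationships
- Rename paths and update file path prefixes when a dataset is moved
- Validate the child dataset count to prevent accidental deletion of a parent dataset
- Update the frontend UI to support parent dataset selection and navigation display
- Optimize path handling in the Python backend's auto-annotation tasks
- Add foreign key constraints to the database schema to ensure data consistency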
2026-01-20 13:34:50 +08:00
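Illustration for the file path prefix update above: a minimal sketch of rewriting stored file paths when a dataset directory moves under a new parent. The function name and the example paths are hypothetical, and the commit does not show how the project actually performs the update. The comparison is done on path components so that a sibling such as /dataset/child2 is not mistaken for something under /dataset/child.

```python
from pathlib import PurePosixPath


def rebase_file_paths(file_paths: list[str], old_root: str, new_root: str) -> list[str]:
    """Swap old_root for new_root on every path that lives under old_root."""
    old = PurePosixPath(old_root)
    new = PurePosixPath(new_root)
    rebased = []
    for path in file_paths:
        p = PurePosixPath(path)
        if p == old or old in p.parents:
            # Keep the part below the old root and attach it to the new root.
            rebased.append(str(new / p.relative_to(old)))
        else:
            rebased.append(path)  # not under the moved dataset; leave untouched
    return rebased
```

A plain string prefix check (startswith) would also rewrite /dataset/child2/b.txt, which is why the sketch compares whole components.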
hhhhsc701
7d4dcb756b
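fix: fix possible duplicates on ingestion; optimize filter logic ( #226 )
...
* Change data cleaning filter logic: switch filters to multi-select
* Change data cleaning filter logic: switch filters to multi-select
* Customize antd component library styles
* fix: fix possible duplicates on ingestion
* fix: optimize operator marketplace filter logic
* fix: optimize filter logic in cleaning task creation
* fix: optimize filter logic in cleaning task creation
---------
Co-authored-by: chase <byzhangxin11@126.com>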
2026-01-06 17:57:25 +08:00
hefanli
a15a6134ff
fix the ratio task config ( #224 )
...
* fix: fix the dataset card icon
* fix: fix the dataset file tag distribution and ratio task
* refactor: change dateRange config from latest to start-end
2026-01-05 17:02:28 +08:00
Kecheng Sha
3f1ad6a872
feat(auto-annotation): integrate YOLO auto-labeling and enhance data management ( #223 )
...
* feat(auto-annotation): initial setup
* chore: remove package-lock.json
* chore: clean up local test scripts and Maven settings
* chore: change package-lock.json
2026-01-05 14:22:44 +08:00
hefanli
ccfb84c034
feature: add mysql collection and starrocks collection ( #222 )
...
* fix: fix the path for backend-python image building
* feature: add mysql collection and starrocks collection
* feature: add mysql collection and starrocks collection
* fix: change the permission of files collected from nfs to 754
* fix: delete collected files, config files and log files while deleting collection task
* fix: add the collection task detail api
* fix: change the log of collecting for dataset
* fix: add collection task selecting while creating and updating dataset
* fix: set the umask value to 0022 for java process
2026-01-04 19:05:08 +08:00
hhhhsc701
f183b9f2f3
feat: adapt operator upload ( #216 )
2025-12-31 10:30:32 +08:00
hhhhsc701
6a1eb85e8e
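feat: support running data-juicer operators ( #215 )
...
* feature: add data-juicer operators
* feat: support running data-juicer operators
* feat: support dispatching data-juicer tasks
* feat: support archiving data-juicer result datasets
* feat: support archiving data-juicer result datasets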
2025-12-31 09:20:41 +08:00
hefanli
63f4e3e447
refactor: modify data collection to python implementation ( #214 )
...
* feature: LabelStudio jumps without login
* refactor: modify data collection to python implementation
* refactor: modify data collection to python implementation
* refactor: modify data collection to python implementation
* refactor: modify data collection to python implementation
* refactor: modify data collection to python implementation
* refactor: modify data collection to python implementation
* fix: remove terrabase dependency
* feature: add the collection task executions page and the collection template page
* fix: fix the collection task creation
* fix: fix the collection task creation
2025-12-30 18:48:43 +08:00
hhhhsc701
80d4dfd285
feat: fix cleaning task creation ( #207 )
2025-12-30 14:41:39 +08:00
hhhhsc701
1c507ac98a
feat: support NPU autoscaling ( #197 )
...
* feat: dynamic NPU scheduling
* feat: optimize dataset pagination
* feat: support NPU autoscaling
* feat: support NPU autoscaling
* feat: support NPU autoscaling
* feat: clean code
2025-12-24 18:03:30 +08:00
hefanli
215d7f0612
Fix the ratio task bug ( #194 )
...
* fix: add feign client configurations
* fix: add nacos configurations
* fix: add python to gateway
* fix: Fix the ratio task bug
2025-12-24 11:40:26 +08:00
hhhhsc701
6d61348388
feat: forward deer-flow through the gateway ( #193 )
2025-12-23 11:35:45 +08:00
hhhhsc701
d82bff441a
fix: prevent deletion of predefined operators and improve error handling ( #192 )
...
* fix: prevent deletion of predefined operators and improve error handling
* fix: prevent deletion of predefined operators and improve error handling
2025-12-22 19:30:41 +08:00
hefanli
c1516c87b6
Feat gateway ( #191 )
...
* fix: fix the routes definition
* fix: fix the helm installing file
* fix: modify the logging dependencies
2025-12-22 18:58:55 +08:00
hefanli
e5b28c26b1
add gateway ( #187 )
...
* feature: add gateway
2025-12-22 15:41:17 +08:00
hhhhsc701
46f4a8c219
feat: add download functionality for example operator and update Dock… ( #188 )
...
* feat: add download functionality for example operator and update Dockerfile
* feat: enhance download response by exposing content disposition header
* feat: update download function to accept filename parameter for example operator
2025-12-22 15:39:32 +08:00
hefanli
082aca1597
fix: the interface for querying data set files is compatible with ret… ( #171 )
...
fix: the interface for querying data set files is compatible with returns in file system format and list returns.
2025-12-16 11:31:52 +08:00
Dallas98
2f3ae21f8a
feat: enhance dataset file fetching with improved pagination and document loading support ( #156 )
2025-12-10 22:39:24 +08:00
Dallas98
cbb146d3d7
feat(chart): add Helm chart for deploying Label Studio with PostgreSQL ( #152 )
...
* feat(chart): add Helm chart for deploying Label Studio with PostgreSQL
* feat(milvus): update Milvus configuration to use URI and remove deprecated host/port settings
2025-12-10 17:46:12 +08:00
hefanli
f87060490c
feature: data management supports nested folders ( #150 )
...
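* fix: mount the storage required by the backend-python service in k8s deployments
* fix: add the API definition for adding dataset files without copying
* fix: initialize evaluation results to an empty value so the API does not fail before evaluation completes
* feature: data management supports nested folders (displayed as a file system; batch downloads keep relative paths)
* fix: remove redundant file renaming logic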
* refactor: remove unused imports
2025-12-10 16:42:45 +08:00
hhhhsc701
103c21945d
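Fix several features ( #138 )
...
* feature: unify versions
* feature: fix incorrect default value display for scheduled sync and add a hint
* feature: fix data collection search
* feature: optimize annotation template queries
* feature: hide the webhook feature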
2025-12-10 14:31:05 +08:00
hefanli
758cf93e36
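feature: add archive upload ( #137 )
...
* feature: add archive upload
* fix: refresh the dataset's file statistics when a file is deleted
* fix: add the route for the evaluation service in k8s deployments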
2025-12-09 14:42:27 +08:00
hhhhsc701
7a9530c1e3
feature: catch exceptions when redis is not deployed ( #131 )
...
* feature: add download-deer-flow
* feature: catch exceptions when redis is not deployed
* feature: clean code
2025-12-04 16:09:29 +08:00
hefanli
1d19cd3a62
feature: add data-evaluation
...
* feature: add evaluation task management function
* feature: add evaluation task detail page
* fix: delete duplicate definition for table t_model_config
* refactor: rename package synthesis to ratio
* refactor: add eval file table and refactor related code
* fix: calling large models in parallel during evaluation
2025-12-04 09:23:54 +08:00
hhhhsc701
c22683d635
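Address several issues ( #126 )
...
* feature: support relative path references
* feature: optimize local deployment commands
* feature: optimize operator orchestration display
* feature: optimize retry after cleaning task failure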
2025-12-03 16:41:48 +08:00
Dallas98
458afa2966
feat: Add original file ID to document metadata in RagEtlService #121
2025-12-02 15:10:07 +08:00
Jason Wang
d692f5fdae
feat: new endpoint allowing only add file path to dataset record without any FS operations ( #119 )
...
* feat: Implement add files' path only to dataset
* refactor: Rename variable for clarity in metadata serialization
2025-12-01 20:31:06 +08:00
Dallas98
9fc35f066f
feat: Add original file ID to document metadata in RagEtlService
2025-12-01 17:04:52 +08:00
hhhhsc701
bb3345268e
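bugfix: cleaning/operators support search by name/description ( #116 )
...
* bugfix: adapt milvus for etcd deploy-based deployment
* bugfix: allow jumping to model creation from the knowledge base page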
2025-11-29 18:15:43 +08:00
hhhhsc701
07029d07ff
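Optimize the cleaning retry mechanism and cleaning progress display; fix templates not showing parameters ( #113 )
...
* bugfix: fix templates not showing parameters
* bugfix: optimize cleaning progress display
* bugfix: optimize the cleaning retry mechanism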
2025-11-28 15:28:10 +08:00
hhhhsc701
f1bffdcd61
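bugfix: update dataset status when creating a cleaning task; operators used by templates or running tasks cannot be deleted
...
* bugfix: update dataset status when creating a cleaning task; operators used by templates or running tasks cannot be deleted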
2025-11-27 17:34:53 +08:00
hhhhsc701
91390cace0
feature: northbound API: support creating cleaning tasks from templates ( #111 )
...
feature: northbound API: support creating cleaning tasks from templates
2025-11-26 17:30:54 +08:00
Dallas98
bc26cfba55
feat: Refactor knowledge base retrieval to return detailed search results and enhance API integration #108
2025-11-25 21:21:21 +08:00
hefanli
c1352ab91f
feature: multiple ratio configurations can be set for the data set. ( #103 )
...
feature: multiple ratio configurations can be set for the data set.
2025-11-24 15:28:17 +08:00
Dallas98
9858388084
feat: Refactor dataset file pagination and enhance retrieval functionality with new request structure #98
...
* feat: Enhance knowledge base management with collection renaming, imp…
* feat: Update Milvus integration with new API, enhance collection mana…
* Merge branch 'refs/heads/main' into dev
* feat: Refactor dataset file pagination and enhance retrieval function…
* Merge branch 'main' into dev
2025-11-21 17:28:25 +08:00
hhhhsc701
536ef9f556
feature: rename the milvus service for k8s compatibility ( #97 )
...
feature: rename the milvus service for k8s compatibility (#97)
2025-11-21 12:06:53 +08:00
hefanli
a07fba23f2
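feature: dataset import supports importing from a selected collection task ( #92 )
...
* feature: implement obs collection
* feature: add handling for files with duplicate names in a dataset
* feature: allow selecting a collection task when importing data into a dataset in the frontend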
2025-11-19 11:05:33 +08:00
Dallas98
4506fa8a91
feat: Enhance AddDataDialog with dataset file selection and improved upload process ( #91 )
2025-11-18 20:48:28 +08:00
Dallas98
04a233b803
fix: fix knowledge base issues ( #89 )
...
* feat: Refactor system parameter management with new data structure and update logic
* feat: Enhance dataset file management with improved file copying
* feat: Enhance dataset file management with improved file copying
* fix: fix knowledge base related issues
* feat: Integrate Milvus service for enhanced knowledge base management and file deletion
2025-11-17 19:11:04 +08:00
Dallas98
145c154d1f
feat: Integrate Milvus service for enhanced knowledge base management and file deletion ( #88 )
...
* feat: Refactor system parameter management with new data structure and update logic
* fix: fix knowledge base related issues
2025-11-17 17:36:09 +08:00
Dallas98
e300d13c21
feat: Enhance dataset file management with improved file copying
2025-11-14 23:30:28 +08:00
Dallas98
5638bdcf1c
feat: add file copying functionality to dataset directory and update base path configuration
2025-11-14 18:05:40 +08:00
hhhhsc701
5cef9cb273
feature: deer-flow supports fetching externally connected models from datamate ( #83 )
...
* feature: deer-flow supports fetching externally connected models from datamate
2025-11-13 20:13:16 +08:00