* fix: correct the path for building the backend-python image
* feature: add mysql collection and starrocks collection
* fix: set the permissions of files collected from NFS to 754
* fix: delete collected files, config files, and log files when deleting a collection task
* fix: add the collection task detail API
* fix: change the log output for dataset collection
* fix: add collection task selection when creating or updating a dataset
* fix: set the umask value to 0022 for the Java process
* feature: allow jumping to LabelStudio without login
* refactor: reimplement data collection in Python
* fix: remove terrabase dependency
* feature: add the collection task executions page and the collection template page
* fix: fix the collection task creation
* feat: add download functionality for example operator and update Dockerfile
* feat: enhance download response by exposing content disposition header
* feat: update download function to accept filename parameter for example operator
* feat(generation_service): add image URL extraction and random QA generation logic
* fix(generation_service): increase batch size from 20 to 100 for improved chunk processing
* fix(chart): update Helm chart helpers and values for improved configuration
* feat(SynthesisTaskTab): enhance task table with tooltip support and improved column widths
* feat(CreateTask, SynthFileTask): improve task creation and detail view with enhanced payload handling and UI updates
* feat(SynthFileTask): enhance file display with progress tracking and delete action
* feat(SynthDataDetail): add delete action for chunks with confirmation prompt
* feat(SynthDataDetail): update edit and delete buttons to icon-only format
* feat(SynthDataDetail): add confirmation modals for chunk and synthesis data deletion
* feat(DocumentSplitter): add enhanced document splitting functionality with CJK support and metadata detection
* feat(DataSynthesis): refactor data synthesis models and update task handling logic
* feat(DataSynthesis): streamline synthesis task handling and enhance chunk processing logic
* fix(generation_service): ensure processed chunks are incremented regardless of question generation success
* feat(CreateTask): enhance task creation with new synthesis templates and improved configuration options
* feat(model_chat): enhance JSON parsing by removing additional thought tags and improving fallback logic
* feat(generation_service): add document filtering to remove short documents based on chunk size
* fix: optimize PDF parsing by implementing concurrent processing with ThreadPoolExecutor
* Refactor to async processing for file extraction
Refactor the file processing to use asyncio for improved performance and concurrency.
* feature: add simple PDF processing support to unstructured
* feature: update values.yaml to enhance ray-cluster configuration with security context, environment variables, and resource limits
* feature: update deploy.yaml and process.py for mineru server configuration and PDF processing enhancements
* feature: improve PDF processing logic and update dependencies in process.py and pyproject.toml
* feature: update Dockerfile for improved package source mirrors and add mineru-npu to build targets