* feat: Update site name to DataMate and refine text for AI data processing * feat: Refactor settings page and implement model access functionality - Created a new ModelAccess component for managing model configurations. - Removed the old Settings component and replaced it with a new SettingsPage component that integrates ModelAccess, SystemConfig, and WebhookConfig. - Added SystemConfig component for managing system settings. - Implemented WebhookConfig component for managing webhook configurations. - Updated API functions for model management in settings.apis.ts. - Adjusted routing to point to the new SettingsPage component. * feat: Implement Data Collection Page with Task Management and Execution Log - Created DataCollectionPage component to manage data collection tasks. - Added TaskManagement and ExecutionLog components for task handling and logging. - Integrated task operations including start, stop, edit, and delete functionalities. - Implemented filtering and searching capabilities in task management. - Introduced SimpleCronScheduler for scheduling tasks with cron expressions. - Updated CreateTask component to utilize new scheduling and template features. - Enhanced BasicInformation component to conditionally render fields based on visibility settings. - Refactored ImportConfiguration component to remove NAS import section. * feat: Update task creation API endpoint and enhance task creation form with new fields and validation * Refactor file upload and operator management components - Removed unnecessary console logs from file download and export functions. - Added size property to TaskItem interface for better task management. - Simplified TaskUpload component by utilizing useFileSliceUpload hook for file upload logic. - Enhanced OperatorPluginCreate component to handle file uploads and parsing more efficiently. - Updated ConfigureStep component to use Ant Design Form for better data handling and validation. - Improved PreviewStep component to navigate back to the operator market. - Added support for additional file types in UploadStep component. - Implemented delete operator functionality in OperatorMarketPage with confirmation prompts. - Cleaned up unused API functions in operator.api.ts to streamline the codebase. - Fixed number formatting utility to handle zero values correctly. * Refactor Knowledge Generation to Knowledge Base - Created new API service for Knowledge Base operations including querying, creating, updating, and deleting knowledge bases and files. - Added constants for Knowledge Base status and type mappings. - Defined models for Knowledge Base and related files. - Removed obsolete Knowledge Base creation and home components, replacing them with new implementations under the Knowledge Base structure. - Updated routing to reflect the new Knowledge Base paths. - Adjusted menu items to align with the new Knowledge Base terminology. - Modified ModelAccess interface to include modelName and type properties. * feat: Implement Knowledge Base Page with CRUD operations and data management - Added KnowledgeBasePage component for displaying and managing knowledge bases. - Integrated search and filter functionalities with SearchControls component. - Implemented CreateKnowledgeBase component for creating and editing knowledge bases. - Enhanced AddDataDialog for file uploads and dataset selections. - Introduced TableTransfer component for managing data transfers between tables. - Updated API functions for knowledge base operations, including file management. - Refactored knowledge base model to include file status and metadata. - Adjusted routing to point to the new KnowledgeBasePage. * feat: enhance OperatorPluginCreate and ConfigureStep for better upload handling and UI updates * refactor: remove unused components and clean up API logging in KnowledgeBase * feat: update icons in various components and improve styling for better UI consistency * fix: adjust upload step handling and improve error display in configuration step * feat: Add RatioTransfer component for dataset selection and configuration - Implemented RatioTransfer component to manage dataset selection and ratio configuration. - Integrated dataset fetching with search and filter capabilities. - Added RatioConfig component for displaying and updating selected datasets' configurations. - Enhanced SelectDataset component with improved UI and functionality for dataset selection. - Updated RatioTasksPage to utilize new ratio task status mapping and improved error handling for task deletion. - Refactored ratio model and constants for better type safety and clarity. - Changed Vite configuration to use local backend service for development. * feat: Add .editorconfig and enhance SystemConfig with table for settings display * feat: Enhance parameter configuration for range inputs and update default values * feat: Update site name to DataMate and refine text for AI data processing * Refactor file upload and operator management components - Removed unnecessary console logs from file download and export functions. - Added size property to TaskItem interface for better task management. - Simplified TaskUpload component by utilizing useFileSliceUpload hook for file upload logic. - Enhanced OperatorPluginCreate component to handle file uploads and parsing more efficiently. - Updated ConfigureStep component to use Ant Design Form for better data handling and validation. - Improved PreviewStep component to navigate back to the operator market. - Added support for additional file types in UploadStep component. - Implemented delete operator functionality in OperatorMarketPage with confirmation prompts. - Cleaned up unused API functions in operator.api.ts to streamline the codebase. - Fixed number formatting utility to handle zero values correctly. * Refactor Knowledge Generation to Knowledge Base - Created new API service for Knowledge Base operations including querying, creating, updating, and deleting knowledge bases and files. - Added constants for Knowledge Base status and type mappings. - Defined models for Knowledge Base and related files. - Removed obsolete Knowledge Base creation and home components, replacing them with new implementations under the Knowledge Base structure. - Updated routing to reflect the new Knowledge Base paths. - Adjusted menu items to align with the new Knowledge Base terminology. - Modified ModelAccess interface to include modelName and type properties. * feat: Implement Knowledge Base Page with CRUD operations and data management - Added KnowledgeBasePage component for displaying and managing knowledge bases. - Integrated search and filter functionalities with SearchControls component. - Implemented CreateKnowledgeBase component for creating and editing knowledge bases. - Enhanced AddDataDialog for file uploads and dataset selections. - Introduced TableTransfer component for managing data transfers between tables. - Updated API functions for knowledge base operations, including file management. - Refactored knowledge base model to include file status and metadata. - Adjusted routing to point to the new KnowledgeBasePage. * feat: enhance OperatorPluginCreate and ConfigureStep for better upload handling and UI updates * feat: update icons in various components and improve styling for better UI consistency * fix: adjust upload step handling and improve error display in configuration step * feat: Update site name to DataMate and refine text for AI data processing * Refactor file upload and operator management components - Removed unnecessary console logs from file download and export functions. - Added size property to TaskItem interface for better task management. - Simplified TaskUpload component by utilizing useFileSliceUpload hook for file upload logic. - Enhanced OperatorPluginCreate component to handle file uploads and parsing more efficiently. - Updated ConfigureStep component to use Ant Design Form for better data handling and validation. - Improved PreviewStep component to navigate back to the operator market. - Added support for additional file types in UploadStep component. - Implemented delete operator functionality in OperatorMarketPage with confirmation prompts. - Cleaned up unused API functions in operator.api.ts to streamline the codebase. - Fixed number formatting utility to handle zero values correctly. * Refactor Knowledge Generation to Knowledge Base - Created new API service for Knowledge Base operations including querying, creating, updating, and deleting knowledge bases and files. - Added constants for Knowledge Base status and type mappings. - Defined models for Knowledge Base and related files. - Removed obsolete Knowledge Base creation and home components, replacing them with new implementations under the Knowledge Base structure. - Updated routing to reflect the new Knowledge Base paths. - Adjusted menu items to align with the new Knowledge Base terminology. - Modified ModelAccess interface to include modelName and type properties. * feat: Implement Knowledge Base Page with CRUD operations and data management - Added KnowledgeBasePage component for displaying and managing knowledge bases. - Integrated search and filter functionalities with SearchControls component. - Implemented CreateKnowledgeBase component for creating and editing knowledge bases. - Enhanced AddDataDialog for file uploads and dataset selections. - Introduced TableTransfer component for managing data transfers between tables. - Updated API functions for knowledge base operations, including file management. - Refactored knowledge base model to include file status and metadata. - Adjusted routing to point to the new KnowledgeBasePage. * feat: enhance OperatorPluginCreate and ConfigureStep for better upload handling and UI updates * feat: update icons in various components and improve styling for better UI consistency * fix: adjust upload step handling and improve error display in configuration step * feat: add settings drawer and integrate SettingsPage component * feat: add ratio task management features including detail view and API integration * feat: enable destruction of hidden settings drawer to free up resources * feat: add data metrics and ratio display components with charts for enhanced data visualization
DataMate All-in-One Data Work Platform
DataMate is an enterprise-level data processing platform for model fine-tuning and RAG retrieval, supporting core functions such as data collection, data management, operator marketplace, data cleaning, data synthesis, data annotation, data evaluation, and knowledge generation.
If you like this project, please give it a Star⭐️!
🌟 Core Features
- Core Modules: Data Collection, Data Management, Operator Marketplace, Data Cleaning, Data Synthesis, Data Annotation, Data Evaluation, Knowledge Generation.
- Visual Orchestration: Drag-and-drop data processing workflow design.
- Operator Ecosystem: Rich built-in operators and support for custom operators.
🚀 Quick Start
Prerequisites
- Git (for pulling source code)
- Make (for building and installing)
- Docker (for building images and deploying services)
- Docker-Compose (for service deployment - Docker method)
- Kubernetes (for service deployment - k8s method)
- Helm (for service deployment - k8s method)
This project supports deployment via two methods: docker-compose and helm. After executing the command, please enter the corresponding number for the deployment method. The command echo is as follows:
Choose a deployment method:
1. Docker/Docker-Compose
2. Kubernetes/Helm
Enter choice:
Clone the Code
git clone git@github.com:ModelEngine-Group/DataMate.git
cd DataMate
Deploy the basic services
make install
Build and deploy Mineru Enhanced PDF Processing
make build-mineru
make install-mineru
Deploy the DeerFlow service
- Modify
runtime/deer-flow/.env.exampleand add configurations for SEARCH_API_KEY and the EMBEDDING model. - Modify
runtime/deer-flow/.conf.yaml.exampleand add basic model service configurations. - Execute
make install-deer-flow
Local Development and Deployment
After modifying the local code, please execute the following commands to build the image and deploy using the local image.
make build
make install REGISTRY=""
🤝 Contribution Guidelines
Thank you for your interest in this project! We warmly welcome contributions from the community. Whether it's submitting bug reports, suggesting new features, or directly participating in code development, all forms of help make the project better.
• 📮 GitHub Issues: Submit bugs or feature suggestions.
• 🔧 GitHub Pull Requests: Contribute code improvements.
📄 License
DataMate is open source under the MIT license. You are free to use, modify, and distribute the code of this project in compliance with the license terms.