* feat: Update site name to DataMate and refine text for AI data processing * feat: Refactor settings page and implement model access functionality - Created a new ModelAccess component for managing model configurations. - Removed the old Settings component and replaced it with a new SettingsPage component that integrates ModelAccess, SystemConfig, and WebhookConfig. - Added SystemConfig component for managing system settings. - Implemented WebhookConfig component for managing webhook configurations. - Updated API functions for model management in settings.apis.ts. - Adjusted routing to point to the new SettingsPage component. * feat: Implement Data Collection Page with Task Management and Execution Log - Created DataCollectionPage component to manage data collection tasks. - Added TaskManagement and ExecutionLog components for task handling and logging. - Integrated task operations including start, stop, edit, and delete functionalities. - Implemented filtering and searching capabilities in task management. - Introduced SimpleCronScheduler for scheduling tasks with cron expressions. - Updated CreateTask component to utilize new scheduling and template features. - Enhanced BasicInformation component to conditionally render fields based on visibility settings. - Refactored ImportConfiguration component to remove NAS import section. * feat: Update task creation API endpoint and enhance task creation form with new fields and validation * Refactor file upload and operator management components - Removed unnecessary console logs from file download and export functions. - Added size property to TaskItem interface for better task management. - Simplified TaskUpload component by utilizing useFileSliceUpload hook for file upload logic. - Enhanced OperatorPluginCreate component to handle file uploads and parsing more efficiently. - Updated ConfigureStep component to use Ant Design Form for better data handling and validation. - Improved PreviewStep component to navigate back to the operator market. - Added support for additional file types in UploadStep component. - Implemented delete operator functionality in OperatorMarketPage with confirmation prompts. - Cleaned up unused API functions in operator.api.ts to streamline the codebase. - Fixed number formatting utility to handle zero values correctly. * Refactor Knowledge Generation to Knowledge Base - Created new API service for Knowledge Base operations including querying, creating, updating, and deleting knowledge bases and files. - Added constants for Knowledge Base status and type mappings. - Defined models for Knowledge Base and related files. - Removed obsolete Knowledge Base creation and home components, replacing them with new implementations under the Knowledge Base structure. - Updated routing to reflect the new Knowledge Base paths. - Adjusted menu items to align with the new Knowledge Base terminology. - Modified ModelAccess interface to include modelName and type properties. * feat: Implement Knowledge Base Page with CRUD operations and data management - Added KnowledgeBasePage component for displaying and managing knowledge bases. - Integrated search and filter functionalities with SearchControls component. - Implemented CreateKnowledgeBase component for creating and editing knowledge bases. - Enhanced AddDataDialog for file uploads and dataset selections. - Introduced TableTransfer component for managing data transfers between tables. - Updated API functions for knowledge base operations, including file management. - Refactored knowledge base model to include file status and metadata. - Adjusted routing to point to the new KnowledgeBasePage. * feat: enhance OperatorPluginCreate and ConfigureStep for better upload handling and UI updates
DataMate All-in-One Data Work Platform
DataMate is an enterprise-level data processing platform for model fine-tuning and RAG retrieval, supporting core functions such as data collection, data management, operator marketplace, data cleaning, data synthesis, data annotation, data evaluation, and knowledge generation.
If you like this project, please give it a Star⭐️!
🌟 Core Features
- Core Modules: Data Collection, Data Management, Operator Marketplace, Data Cleaning, Data Synthesis, Data Annotation, Data Evaluation, Knowledge Generation.
- Visual Orchestration: Drag-and-drop data processing workflow design.
- Operator Ecosystem: Rich built-in operators and support for custom operators.
🚀 Quick Start
Prerequisites
- Git (for pulling source code)
- Make (for building and installing)
- Docker (for building images and deploying services)
- Docker-Compose (for service deployment - Docker method)
- Kubernetes (for service deployment - k8s method)
- Helm (for service deployment - k8s method)
Clone the Code
git clone git@github.com:ModelEngine-Group/DataMate.git
cd DataMate
Build Images
make build
Docker Installation
make install INSTALLER=docker
Kubernetes Installation
make install INSTALLER=k8s
🤝 Contribution Guidelines
Thank you for your interest in this project! We warmly welcome contributions from the community. Whether it's submitting bug reports, suggesting new features, or directly participating in code development, all forms of help make the project better.
• 📮 GitHub Issues: Submit bugs or feature suggestions.
• 🔧 GitHub Pull Requests: Contribute code improvements.
📄 License
DataMate is open source under the MIT license. You are free to use, modify, and distribute the code of this project in compliance with the license terms.