* feat: Refactor configuration and sync logic for improved dataset handling and logging
* feat: Enhance annotation synchronization and dataset file management
- Added a new `tags_updated_at` field to the `DatasetFiles` model to track when tags were last updated.
- Implemented new asynchronous methods in the Label Studio client for fetching, creating, updating, and deleting task annotations.
- Introduced bidirectional synchronization of annotations between DataMate and Label Studio.
- Updated the sync service to resolve annotation conflicts based on timestamps, preserving data integrity during synchronization.
- Enhanced the dataset file response model to include tags and their update timestamps.
- Modified the database initialization script to create the `tags_updated_at` column in the dataset files table.
- Updated requirements to ensure compatibility with the latest dependencies.
DataMate All-in-One Data Work Platform
DataMate is an enterprise-level data processing platform for model fine-tuning and retrieval-augmented generation (RAG). It supports core functions such as data collection, data management, an operator marketplace, data cleaning, data synthesis, data annotation, data evaluation, and knowledge generation.
If you like this project, please give it a Star⭐️!
🌟 Core Features
- Core Modules: Data Collection, Data Management, Operator Marketplace, Data Cleaning, Data Synthesis, Data Annotation, Data Evaluation, Knowledge Generation.
- Visual Orchestration: Drag-and-drop data processing workflow design.
- Operator Ecosystem: Rich built-in operators and support for custom operators.
🚀 Quick Start
Prerequisites
- Git (for pulling source code)
- Make (for building and installing)
- Docker (for building images and deploying services)
- Docker-Compose (for service deployment - Docker method)
- Kubernetes (for service deployment - k8s method)
- Helm (for service deployment - k8s method)
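Before building, it can help to confirm that the tools above are on your `PATH`. The following is a minimal sketch, not part of the DataMate repository: `missing_tools` is a hypothetical helper name, and the command names `git`, `make`, and `docker` are assumed to match how the tools are installed on your system.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DataMate): print the name of each
# given command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        # `command -v` is the POSIX way to test whether a command exists.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools needed by every installation method.
missing=$(missing_tools git make docker)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing >&2
else
    echo "All common prerequisites found."
fi
```

Depending on the installation method you choose, you could pass further names to the same helper, for example `missing_tools docker-compose` or `missing_tools kubectl helm`. Those command names are assumptions too: newer Docker installations may provide Compose as the `docker compose` subcommand rather than a separate `docker-compose` binary.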
Clone the Code
git clone git@github.com:ModelEngine-Group/DataMate.git
cd DataMate
Build Images
make build
Docker Installation
make install INSTALLER=docker
Kubernetes Installation
make install INSTALLER=k8s
🤝 Contribution Guidelines
Thank you for your interest in this project! We warmly welcome contributions from the community. Whether it's submitting bug reports, suggesting new features, or directly participating in code development, all forms of help make the project better.
• 📮 GitHub Issues: Submit bugs or feature suggestions.
• 🔧 GitHub Pull Requests: Contribute code improvements.
📄 License
DataMate is open source under the MIT license. You are free to use, modify, and distribute the code of this project in compliance with the license terms.