Add three new LLM-powered auto-annotation operators:
- LLMTextClassification: Text classification using LLM
- LLMNamedEntityRecognition: Named entity recognition with type validation
- LLMRelationExtraction: Relation extraction with entity and relation type validation

Key features:
- Load LLM config from t_model_config table via modelId parameter
- Lazy loading of LLM configuration on first execute()
- Result validation with whitelist checking for entity/relation types
- Fault-tolerant: returns empty results on LLM failure instead of throwing
- Fully compatible with existing Worker pipeline

Files added:
- runtime/ops/annotation/_llm_utils.py: Shared LLM utilities
- runtime/ops/annotation/llm_text_classification/: Text classification operator
- runtime/ops/annotation/llm_named_entity_recognition/: NER operator
- runtime/ops/annotation/llm_relation_extraction/: Relation extraction operator

Files modified:
- runtime/ops/annotation/__init__.py: Register 3 new operators
- runtime/python-executor/datamate/auto_annotation_worker.py: Add to Worker whitelist
- frontend/src/pages/DataAnnotation/OperatorCreate/hooks/useOperatorOperations.ts: Add to frontend whitelist
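The key features listed in the commit message (lazy config loading, whitelist validation, empty results on failure) could look roughly like the sketch below. This is not the code from `_llm_utils.py` or the operator itself: the class name `RelationExtractionSketch`, the injected `load_config` and `call_llm` callables, and the JSON field names (`entities`, `relations`, `text`, `type`, `head`, `relation`, `tail`) are all assumptions made for illustration. Dropping relations whose endpoints are not validated entities is likewise an assumed policy.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Set


def empty_result() -> Dict[str, List[Any]]:
    # Fresh dict each time so callers cannot mutate a shared default.
    return {"entities": [], "relations": []}


def _as_list(value: Any) -> List[Any]:
    return value if isinstance(value, list) else []


def _allowed(value: Any, whitelist: Set[str]) -> bool:
    return isinstance(value, str) and value in whitelist


def validate_result(parsed: Any, entity_types: Set[str],
                    relation_types: Set[str]) -> Dict[str, List[Any]]:
    """Keep only entities/relations whose types are whitelisted (assumed schema)."""
    if not isinstance(parsed, dict):
        return empty_result()
    entities = [
        e for e in _as_list(parsed.get("entities"))
        if isinstance(e, dict)
        and isinstance(e.get("text"), str)
        and _allowed(e.get("type"), entity_types)
    ]
    known = {e["text"] for e in entities}
    relations = [
        r for r in _as_list(parsed.get("relations"))
        if isinstance(r, dict)
        and _allowed(r.get("relation"), relation_types)
        and _allowed(r.get("head"), known)   # assumed: endpoints must be
        and _allowed(r.get("tail"), known)   # validated entities
    ]
    return {"entities": entities, "relations": relations}


class RelationExtractionSketch:
    """Hypothetical operator; load_config and call_llm are injected stand-ins."""

    def __init__(self, load_config: Callable[[], Dict[str, Any]],
                 call_llm: Callable[[Dict[str, Any], str], str],
                 entity_types: Set[str], relation_types: Set[str]) -> None:
        self._load_config = load_config
        self._call_llm = call_llm
        self._entity_types = entity_types
        self._relation_types = relation_types
        self._config: Optional[Dict[str, Any]] = None  # not loaded yet

    def execute(self, text: str) -> Dict[str, List[Any]]:
        try:
            if self._config is None:
                # Lazy loading: the config is fetched on first execute() only.
                self._config = self._load_config()
            raw = self._call_llm(self._config, text)
            return validate_result(json.loads(raw),
                                   self._entity_types, self._relation_types)
        except Exception:
            # Fault tolerance: any LLM or parsing failure yields empty results.
            return empty_result()
```

Injecting the loader and the LLM call as plain callables keeps the sketch testable without a database or a model endpoint; the real operator presumably reads `t_model_config` through its own utilities.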
35 lines
1.1 KiB
YAML
name: 'LLM关系抽取'
name_en: 'LLM Relation Extraction'
description: '基于大语言模型的关系抽取算子,识别实体并抽取实体间关系三元组。'
description_en: 'LLM-based relation extraction operator that identifies entities and extracts relation triples.'
language: 'python'
vendor: 'datamate'
raw_id: 'LLMRelationExtraction'
version: '1.0.0'
types:
  - 'annotation'
modal: 'text'
inputs: 'text'
outputs: 'text'
settings:
  modelId:
    name: '模型ID'  # Model ID
    description: '已配置的 LLM 模型 ID(留空使用系统默认模型)。'  # ID of a configured LLM model (leave empty to use the system default model).
    type: 'input'
    defaultVal: ''
  entityTypes:
    name: '实体类型'  # Entity types
    description: '逗号分隔的实体类型,如:PER,ORG,LOC'  # Comma-separated entity types, e.g. PER,ORG,LOC
    type: 'input'
    defaultVal: 'PER,ORG,LOC'
  relationTypes:
    name: '关系类型'  # Relation types
    description: '逗号分隔的关系类型,如:属于,位于,创立,工作于'  # Comma-separated relation types, e.g. belongs to, located in, founded, works at
    type: 'input'
    defaultVal: '属于,位于,创立,工作于'  # belongs to, located in, founded, works at
  outputDir:
    name: '输出目录'  # Output directory
    description: '算子输出目录(由运行时自动注入)。'  # Operator output directory (injected automatically by the runtime).
    type: 'input'
    defaultVal: ''
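Since `entityTypes` and `relationTypes` are declared as comma-separated strings, an operator has to turn them into lists before it can use them as whitelists. A minimal sketch of that step follows, assuming a hypothetical helper `parse_type_list`; falling back to the default when the setting is blank and accepting the full-width comma `,` are assumptions, not behavior confirmed by the metadata above.

```python
from typing import List


def parse_type_list(value: str, default: str) -> List[str]:
    """Split a comma-separated setting into a de-duplicated, ordered list.

    Hypothetical helper: falls back to `default` when `value` is blank.
    """
    raw = value if value and value.strip() else default
    # Assumption: users may type the full-width comma, so normalize it.
    raw = raw.replace("\uff0c", ",")
    items: List[str] = []
    for part in raw.split(","):
        part = part.strip()
        if part and part not in items:
            items.append(part)
    return items


# Defaults copied from the settings block above.
entity_types = parse_type_list("", "PER,ORG,LOC")
relation_types = parse_type_list("", "属于,位于,创立,工作于")
```

Keeping the result as an ordered list preserves the order the user typed, which is convenient when the types are echoed back into a prompt; converting to a `set` is enough when the list is only used for membership checks.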