Jerry Yan 49f99527cc feat(auto-annotation): add LLM-based annotation operators
Add three new LLM-powered auto-annotation operators:
- LLMTextClassification: Text classification using LLM
- LLMNamedEntityRecognition: Named entity recognition with type validation
- LLMRelationExtraction: Relation extraction with entity and relation type validation

Key features:
- Load LLM config from t_model_config table via modelId parameter
- Lazy loading of LLM configuration on first execute()
- Result validation with whitelist checking for entity/relation types
- Fault-tolerant: returns empty results on LLM failure instead of throwing
- Fully compatible with existing Worker pipeline
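The lazy-loading and fault-tolerance behaviour listed above could look roughly like the sketch below. It is not the code from this commit: `LazyLLMOperator`, `load_config`, and `call_llm` are hypothetical names, and the real operators read their configuration from the `t_model_config` table rather than from an injected callable.

```python
from typing import Any, Callable, Dict, List, Optional

Config = Dict[str, Any]
Result = List[Dict[str, Any]]


class LazyLLMOperator:
    """Hypothetical operator illustrating lazy config loading and safe failure."""

    def __init__(
        self,
        model_id: str,
        load_config: Callable[[str], Config],    # assumed: looks up config by model ID
        call_llm: Callable[[Config, str], Result],  # assumed: performs the LLM request
    ) -> None:
        self._model_id = model_id
        self._load_config = load_config
        self._call_llm = call_llm
        self._config: Optional[Config] = None  # nothing is loaded at construction time

    def execute(self, text: str) -> Result:
        try:
            # Load the configuration only once, on the first call.
            if self._config is None:
                self._config = self._load_config(self._model_id)
            return self._call_llm(self._config, text)
        except Exception:
            # Fault-tolerant: an LLM or config failure yields an empty result
            # so the surrounding pipeline keeps running.
            return []
```

Injecting the two callables keeps the sketch free of database and network dependencies. A real operator would likely also log the exception before returning the empty result.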

Files added:
- runtime/ops/annotation/_llm_utils.py: Shared LLM utilities
- runtime/ops/annotation/llm_text_classification/: Text classification operator
- runtime/ops/annotation/llm_named_entity_recognition/: NER operator
- runtime/ops/annotation/llm_relation_extraction/: Relation extraction operator

Files modified:
- runtime/ops/annotation/__init__.py: Register 3 new operators
- runtime/python-executor/datamate/auto_annotation_worker.py: Add to Worker whitelist
- frontend/src/pages/DataAnnotation/OperatorCreate/hooks/useOperatorOperations.ts: Add to frontend whitelist
2026-02-10 15:22:23 +08:00

name: 'LLM命名实体识别'
name_en: 'LLM Named Entity Recognition'
description: '基于大语言模型的命名实体识别算子,支持自定义实体类型。'
description_en: 'LLM-based NER operator with custom entity types.'
language: 'python'
vendor: 'datamate'
raw_id: 'LLMNamedEntityRecognition'
version: '1.0.0'
types:
  - 'annotation'
modal: 'text'
inputs: 'text'
outputs: 'text'
settings:
  modelId:
    name: '模型ID'  # Model ID
    # ID of a configured LLM model (leave empty to use the system default model).
    description: '已配置的 LLM 模型 ID(留空使用系统默认模型)。'
    type: 'input'
    defaultVal: ''
  entityTypes:
    name: '实体类型'  # Entity types
    # Comma-separated entity types, e.g. PER,ORG,LOC,DATE
    description: '逗号分隔的实体类型,如:PER,ORG,LOC,DATE'
    type: 'input'
    defaultVal: 'PER,ORG,LOC,DATE'
  outputDir:
    name: '输出目录'  # Output directory
    # Operator output directory (injected automatically by the runtime).
    description: '算子输出目录(由运行时自动注入)。'
    type: 'input'
    defaultVal: ''
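
As an illustration of how the `entityTypes` setting might be consumed, the sketch below splits the comma-separated value into a whitelist and drops any LLM-returned entity whose type is not in it. The function names and the `{"text": ..., "type": ...}` entity shape are assumptions for illustration, not the operator's actual interface.

```python
from typing import Any, Dict, List, Set


def parse_entity_types(raw: str) -> Set[str]:
    """Split a comma-separated setting such as 'PER,ORG,LOC,DATE' into a set.

    Surrounding whitespace and empty items are ignored.
    """
    return {part.strip() for part in raw.split(",") if part.strip()}


def filter_entities(entities: List[Any], allowed: Set[str]) -> List[Dict[str, str]]:
    """Keep only well-formed entities whose type is in the whitelist.

    Assumed (hypothetical) entity shape: {"text": str, "type": str}.
    """
    kept: List[Dict[str, str]] = []
    for ent in entities:
        if not isinstance(ent, dict):
            continue  # LLM output is untrusted; skip malformed items
        text = ent.get("text")
        ent_type = ent.get("type")
        if isinstance(text, str) and text and ent_type in allowed:
            kept.append({"text": text, "type": ent_type})
    return kept


if __name__ == "__main__":
    allowed = parse_entity_types("PER,ORG,LOC,DATE")
    llm_output = [
        {"text": "Alice", "type": "PER"},
        {"text": "42", "type": "NUM"},  # type not in the whitelist
    ]
    print(filter_entities(llm_output, allowed))
```

Matching here is case-sensitive. Whether the real operator normalises case before comparing is not stated in the source.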