Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

quqxui/Awesome-LLM4IE-Papers

Awesome-LLM4IE-Papers

🔥🔥🔥 The article has been accepted by Frontiers of Computer Science (FCS).


Awesome papers about generative Information Extraction using LLMs

The organization of papers is discussed in our survey: Large Language Models for Generative Information Extraction: A Survey.

If you find any relevant academic papers that have not been included in our research, please submit a request for an update. We welcome contributions from everyone.

If you have any suggestions or spot any mistakes, please let us know by email at derongxu@mail.ustc.edu.cn and chenweicw@mail.ustc.edu.cn. We appreciate your feedback and help in improving our work.

If you find our survey useful for your research, please cite the following paper:

@article{xu2024large,
  title={Large language models for generative information extraction: A survey},
  author={Xu, Derong and Chen, Wei and Peng, Wenjun and Zhang, Chao and Xu, Tong and Zhao, Xiangyu and Wu, Xian and Zheng, Yefeng and Wang, Yang and Chen, Enhong},
  journal={Frontiers of Computer Science},
  volume={18},
  number={6},
  pages={186357},
  year={2024},
  publisher={Springer}
}

💡 News

  • Update Logs
    • The details can be found in ./update_new_papers_list.
    • 2024/09/04 Add 22 papers
    • 2024/06/06 Add 41 papers
    • 2024/03/30 Add 27 papers
    • 2024/03/29 Add 20 papers

Information Extraction tasks

A taxonomy by various tasks.

Named Entity Recognition

Models targeting only NER tasks.

Entity Typing

Paper Venue Date Code
Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine Entity Typing EMNLP Findings 2023-12 GitHub
Generative Entity Typing with Curriculum Learning EMNLP 2022-12 GitHub

Entity Identification & Typing

Paper Venue Date Code
Granular Entity Mapper: Advancing Fine-grained Multimodal Named Entity Recognition and Grounding EMNLP Findings 2024
Double-Checker: Large Language Model as a Checker for Few-shot Named Entity Recognition EMNLP Findings 2024 GitHub
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models ACL 2024 GitHub
ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models ACL Findings 2024 GitHub
Rethinking Negative Instances for Generative Named Entity Recognition ACL Findings 2024 GitHub
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition ACL Findings 2024 GitHub
RT: a Retrieving and Chain-of-Thought framework for few-shot medical named entity recognition Others 2024-05 GitHub
P-ICL: Point In-Context Learning for Named Entity Recognition with Large Language Models Arxiv 2024-06 GitHub
Astro-NER -- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator? Arxiv 2024-05
Know-Adapter: Towards Knowledge-Aware Parameter-Efficient Transfer Learning for Few-shot Named Entity Recognition COLING 2024
ToNER: Type-oriented Named Entity Recognition with Generative Language Model COLING 2024
CHisIEC: An Information Extraction Corpus for Ancient Chinese History COLING 2024 GitHub
Astronomical Knowledge Entity Extraction in Astrophysics Journal Articles via Large Language Models Others 2024-04
LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking Arxiv 2024-04 GitHub
Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models Others 2024-04
Knowledge-Enriched Prompt for Low-Resource Named Entity Recognition TALLIP 2024-04
VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition Arxiv 2024-04 GitHub
LLMs in Biomedicine: A study on clinical Named Entity Recognition Arxiv 2024-04
Out of Sesame Street: A Study of Portuguese Legal Named Entity Recognition Through In-Context Learning ResearchGate 2024-04
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Arxiv 2024-04 GitHub
LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty WWW 2024
Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models NAACL Short 2024 GitHub
On-the-fly Definition Augmentation of LLMs for Biomedical NER NAACL 2024 GitHub
MetaIE: Distilling a Meta Model from LLM for All Kinds of Information Extraction Tasks Arxiv 2024-03 GitHub
Distilling Named Entity Recognition Models for Endangered Species from Large Language Models Arxiv 2024-03
Augmenting NER Datasets with LLMs: Towards Automated and Refined Annotation Arxiv 2024-03
ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context AAAI 2024
Embedded Named Entity Recognition using Probing Classifiers Arxiv 2024-03 GitHub
In-Context Learning for Few-Shot Nested Named Entity Recognition Arxiv 2024-02
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition Arxiv 2024-02
Structured information extraction from scientific text with large language models Nature Communications 2024-02 GitHub
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data Arxiv 2024-02
A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction Arxiv 2024-02
PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition Arxiv 2024-02
Small Language Model Is a Good Guide for Large Language Model in Chinese Entity Relation Extraction Arxiv 2024-02
C-ICL: Contrastive In-context Learning for Information Extraction Arxiv 2024-02
UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition ICLR 2024 GitHub
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering Arxiv 2024-01 GitHub
2INER: Instructive and In-Context Learning on Few-Shot Named Entity Recognition EMNLP Findings 2023-12
In-context Learning for Few-shot Multimodal Named Entity Recognition EMNLP Findings 2023-12
Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! EMNLP Findings 2023-12 GitHub
Learning to Rank Context for Named Entity Recognition Using a Synthetic Dataset EMNLP 2023-12 GitHub
LLMaAA: Making Large Language Models as Active Annotators EMNLP Findings 2023-12 GitHub
Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge EMNLP Findings 2023-12 GitHub
GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer Arxiv 2023-11 GitHub
GPT Struct Me: Probing GPT Models on Narrative Entity Extraction WI-IAT 2023-10 GitHub
GPT-NER: Named Entity Recognition via Large Language Models Arxiv 2023-10 GitHub
Prompt-NER: Zero-shot Named Entity Recognition in Astronomy Literature via Large Language Models Arxiv 2023-10
Inspire the Large Language Model by External Knowledge on BioMedical Named Entity Recognition Arxiv 2023-09
One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER IJCAI 2023-09 GitHub
Chain-of-Thought Prompt Distillation for Multimodal Named Entity Recognition and Multimodal Relation Extraction Arxiv 2023-08
Learning In-context Learning for Named Entity Recognition ACL 2023-07 GitHub
Debiasing Generative Named Entity Recognition by Calibrating Sequence Likelihood ACL Short 2023-07
Entity-to-Text based Data Augmentation for various Named Entity Recognition Tasks ACL Findings 2023-07
Large Language Models as Instructors: A Study on Multilingual Clinical Entity Extraction BioNLP 2023-07 GitHub
NAG-NER: a Unified Non-Autoregressive Generation Framework for Various NER Tasks ACL Industry 2023-07
Unified Named Entity Recognition as Multi-Label Sequence Generation IJCNN 2023-06
PromptNER: Prompting For Named Entity Recognition Arxiv 2023-06
Does Synthetic Data Generation of LLMs Help Clinical Text Mining? Arxiv 2023-04
Unified Text Structuralization with Instruction-tuned Language Models Arxiv 2023-03
Structured information extraction from complex scientific text with fine-tuned large language models Arxiv 2022-12 Demo
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting COLING 2022-10 GitHub
De-bias for generative extraction in unified NER task ACL 2022-05
InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER Arxiv 2022-03
Document-level Entity-based Extraction as Template Generation EMNLP 2021-11 GitHub
A Unified Generative Framework for Various NER Subtasks ACL 2021-08 GitHub
Template-Based Named Entity Recognition Using BART ACL Findings 2021-08 GitHub

Relation Extraction

Models targeting only RE tasks.

Relation Classification

Paper Venue Date Code
Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models Others 2024-04
CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with Fine-tuned Large Language Model Arxiv 2024-04 GitHub
Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction IJCAI 2024-04
Empirical Analysis of Dialogue Relation Extraction with Large Language Models IJCAI 2024-04
Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors IJCAI 2024-04
Retrieval-Augmented Generation-based Relation Extraction Arxiv 2024-04 GitHub
Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Arxiv 2024-04
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models AAAI 2024-03
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction Arxiv 2024-02
Chain of Thought with Explicit Evidence Reasoning for Few-shot Relation Extraction EMNLP Findings 2023-12
GPT-RE: In-context Learning for Relation Extraction using Large Language Models EMNLP 2023-12 GitHub
Guideline Learning for In-context Information Extraction EMNLP 2023-12
Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! EMNLP Findings 2023-12 GitHub
LLMaAA: Making Large Language Models as Active Annotators EMNLP Findings 2023-12 GitHub
Improving Unsupervised Relation Extraction by Augmenting Diverse Sentence Pairs EMNLP 2023-12 GitHub
Revisiting Large Language Models as Zero-shot Relation Extractors EMNLP Findings 2023-12
Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment Arxiv 2023-10
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors ACL Findings 2023-07 GitHub
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction? ACL Workshop 2023-07 GitHub
Sequence generation with label augmentation for relation extraction AAAI 2023-06 GitHub
Does Synthetic Data Generation of LLMs Help Clinical Text Mining? Arxiv 2023-04
DORE: Document Ordered Relation Extraction based on Generative Framework EMNLP Findings 2022-12
REBEL: Relation Extraction By End-to-end Language generation EMNLP Findings 2021-11 GitHub

Relation Triplet

Paper Venue Date Code
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis ACL 2024 GitHub
AutoRE: Document-Level Relation Extraction with Large Language Models ACL Demos 2024 GitHub
Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors IJCAI 2024-04
Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction WWW 2024
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction COLING 2024 GitHub
Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction COLING 2024
A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction Arxiv 2024-02
Structured information extraction from scientific text with large language models Nature Communications 2024-02 GitHub
Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models Arxiv 2024-02 GitHub
Small Language Model Is a Good Guide for Large Language Model in Chinese Entity Relation Extraction Arxiv 2024-02
Efficient Data Learning for Open Information Extraction with Pre-trained Language Models EMNLP Findings 2023-12
Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment Arxiv 2023-10
Unified Text Structuralization with Instruction-tuned Language Models Arxiv 2023-03
Document-level Entity-based Extraction as Template Generation EMNLP 2021-11 GitHub

Relation Strict

Paper Venue Date Code
MetaIE: Distilling a Meta Model from LLM for All Kinds of Information Extraction Tasks Arxiv 2024-03 GitHub
Distilling Named Entity Recognition Models for Endangered Species from Large Language Models Arxiv 2024-03
CHisIEC: An Information Extraction Corpus for Ancient Chinese History COLING 2024-03 GitHub
An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction AAAI 2024-03 GitHub
C-ICL: Contrastive In-context Learning for Information Extraction Arxiv 2024-02
REBEL: Relation Extraction By End-to-end Language generation EMNLP Findings 2021-11 GitHub

Event Extraction

Models targeting only EE tasks.

Event Detection

Paper Venue Date Code
Improving Event Definition Following For Zero-Shot Event Detection Arxiv 2024-03
Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment Arxiv 2023-10
Unified Text Structuralization with Instruction-tuned Language Models Arxiv 2023-03
Unleash GPT-2 Power for Event Detection ACL 2021-08

Event Argument Extraction

Paper Venue Date Code
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction ACL 2024 GitHub
Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction ACL Findings 2024 GitHub
KeyEE: Enhancing Low-Resource Generative Event Extraction with Auxiliary Keyword Sub-Prompt Others 2024-04 GitHub
MetaIE: Distilling a Meta Model from LLM for All Kinds of Information Extraction Tasks Arxiv 2024-03 GitHub
Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study EACL 2024-02 GitHub
ULTRA: Unleash LLMs' Potential for Event Argument Extraction through Hierarchical Modeling and Pair-wise Refinement Arxiv 2024-01
Context-Aware Prompt for Generation-based Event Argument Extraction with Diffusion Models CIKM 2023-10
Contextualized Soft Prompts for Extraction of Event Arguments ACL Findings 2023-07
AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model ACL 2023-07 GitHub
Code4Struct: Code Generation for Few-Shot Event Structure Prediction ACL 2023-07 GitHub
Event Extraction as Question Generation and Answering ACL Short 2023-07 GitHub
Global Constraints with Prompting for Zero-Shot Event Argument Classification EACL Findings 2023-05
Prompt for extraction? PAIE: prompting argument interaction for event argument extraction ACL 2022-05 GitHub

Event Detection & Argument Extraction

Paper Venue Date Code
TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction ACL Findings 2024 GitHub
EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models Arxiv 2024-02
Guideline Learning for In-context Information Extraction EMNLP 2023-12
DemoSG: Demonstration-enhanced Schema-guided Generation for Low-resource Event Extraction EMNLP Findings 2023-12 GitHub
Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! EMNLP Findings 2023-12 GitHub
DICE: Data-Efficient Clinical Event Extraction with Generative Models ACL 2023-07 GitHub
A Monte Carlo Language Model Pipeline for Zero-Shot Sociopolitical Event Extraction NeurIPS Workshop 2023-10
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models AAAI 2024-03
DEGREE: A Data-Efficient Generative Event Extraction Model NAACL 2022-07 GitHub
ClarET: Pre-training a correlation-aware context-to-event transformer for event-centric generation and classification ACL 2022-05 GitHub
Dynamic prefix-tuning for generative template-based event extraction ACL 2022-05
Text2event: Controllable sequence-to-structure generation for end-to-end event extraction ACL 2021-08 GitHub
Document-level event argument extraction by conditional generation NAACL 2021-06 GitHub

Universal Information Extraction

Unified models targeting multiple IE tasks.

NL-LLMs based

Paper Venue Date Code
Diluie: constructing diverse demonstrations of in-context learning with large language model for unified information extraction Others 2024-04 GitHub
ChatUIE: Exploring Chat-based Unified Information Extraction using Large Language Models COLING 2024
YAYI-UIE: A Chat-Enhanced Instruction Tuning Framework for Universal Information Extraction Arxiv 2024-04
Set Learning for Generative Information Extraction EMNLP 2023-12
GIELLM: Japanese General Information Extraction Large Language Model Utilizing Mutual Reinforcement Effect Arxiv 2023-11
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction Arxiv 2023-04 GitHub
Zero-Shot Information Extraction via Chatting with ChatGPT Arxiv 2023-02 GitHub
GenIE: Generative Information Extraction NAACL 2022-07 GitHub
DEEPSTRUCT: Pretraining of Language Models for Structure Prediction ACL Findings 2022-05 GitHub
Unified Structure Generation for Universal Information Extraction ACL 2022-05 GitHub
Structured prediction as translation between augmented natural languages ICLR 2021-01 GitHub

Code-LLMs based

Paper Venue Date Code
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction ACL 2024 GitHub
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction ICLR 2024 GitHub
Retrieval-Augmented Code Generation for Universal Information Extraction Arxiv 2023-11
CODEIE: Large Code Generation Models are Better Few-Shot Information Extractors ACL 2023-07 GitHub
CodeKGC: Code Language Model for Generative Knowledge Graph Construction ACM TALLIP 2024-03 GitHub

Information Extraction Techniques

A taxonomy by techniques.

Supervised Fine-tuning

Paper Venue Date Code
Rethinking Negative Instances for Generative Named Entity Recognition ACL Findings 2024 GitHub
Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction ACL Findings 2024 GitHub
AutoRE: Document-Level Relation Extraction with Large Language Models ACL Demos 2024 GitHub
Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction IJCAI 2024-04
Empirical Analysis of Dialogue Relation Extraction with Large Language Models IJCAI 2024-04
An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction AAAI 2024 GitHub
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction COLING 2024 GitHub
ToNER: Type-oriented Named Entity Recognition with Generative Language Model COLING 2024
CHisIEC: An Information Extraction Corpus for Ancient Chinese History COLING 2024 GitHub
KeyEE: Enhancing Low-Resource Generative Event Extraction with Auxiliary Keyword Sub-Prompt Others 2024-04 GitHub
VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition Arxiv 2024-04 GitHub
LLMs in Biomedicine: A study on clinical Named Entity Recognition Arxiv 2024-04
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Arxiv 2024-04 GitHub
CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with Fine-tuned Large Language Model Arxiv 2024-04 GitHub
Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Arxiv 2024-04
Improving Event Definition Following For Zero-Shot Event Detection Arxiv 2024-03
Embedded Named Entity Recognition using Probing Classifiers Arxiv 2024-03 GitHub
EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models Arxiv 2024-02
Structured information extraction from scientific text with large language models Nature Communications 2024-02 GitHub
PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition Arxiv 2024-02
UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition ICLR 2024 GitHub
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction ICLR 2024 GitHub
Set Learning for Generative Information Extraction EMNLP 2023-12
Efficient Data Learning for Open Information Extraction with Pre-trained Language Models EMNLP Findings 2023-12
DemoSG: Demonstration-enhanced Schema-guided Generation for Low-resource Event Extraction EMNLP Findings 2023-12 GitHub
Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine Entity Typing EMNLP Findings 2023-12
GIELLM: Japanese General Information Extraction Large Language Model Utilizing Mutual Reinforcement Effect Arxiv 2023-11
GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer Arxiv 2023-11 GitHub
Context-Aware Prompt for Generation-based Event Argument Extraction with Diffusion Models CIKM 2023-10
Contextualized Soft Prompts for Extraction of Event Arguments ACL Findings 2023-07
AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model ACL 2023-07 GitHub
Debiasing Generative Named Entity Recognition by Calibrating Sequence Likelihood ACL Short 2023-07
DICE: Data-Efficient Clinical Event Extraction with Generative Models ACL 2023-07 GitHub
Event Extraction as Question Generation and Answering ACL Short 2023-07 GitHub
NAG-NER: a Unified Non-Autoregressive Generation Framework for Various NER Tasks ACL Industry 2023-07
Sequence generation with label augmentation for relation extraction AAAI 2023-06 GitHub
Unified Named Entity Recognition as Multi-Label Sequence Generation IJCNN 2023-06
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction Arxiv 2023-04 GitHub
Structured information extraction from complex scientific text with fine-tuned large language models Arxiv 2022-12 Demo
Generative Entity Typing with Curriculum Learning EMNLP 2022-12 GitHub
DORE: Document Ordered Relation Extraction based on Generative Framework EMNLP Findings 2022-12
LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model NeurIPS 2022-10 GitHub
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting COLING 2022-10 GitHub
GenIE: Generative Information Extraction NAACL 2022-07 GitHub
DEGREE: A Data-Efficient Generative Event Extraction Model NAACL 2022-07 GitHub
ClarET: Pre-training a correlation-aware context-to-event transformer for event-centric generation and classification ACL 2022-05 GitHub
DEEPSTRUCT: Pretraining of Language Models for Structure Prediction ACL Findings 2022-05 GitHub
Dynamic prefix-tuning for generative template-based event extraction ACL 2022-05
Prompt for extraction? PAIE: prompting argument interaction for event argument extraction ACL 2022-05 GitHub
Unified Structure Generation for Universal Information Extraction ACL 2022-05 GitHub
De-bias for generative extraction in unified NER task ACL 2022-05
Document-level Entity-based Extraction as Template Generation EMNLP 2021-11 GitHub
REBEL: Relation Extraction By End-to-end Language generation EMNLP Findings 2021-11 GitHub
A Unified Generative Framework for Various NER Subtasks ACL 2021-08 GitHub
Template-Based Named Entity Recognition Using BART ACL Findings 2021-08 GitHub
Text2event: Controllable sequence-to-structure generation for end-to-end event extraction ACL 2021-08 GitHub
Document-level event argument extraction by conditional generation NAACL 2021-06 GitHub
Structured prediction as translation between augmented natural languages ICLR 2021-01 GitHub

Few-shot

Few-shot Fine-tuning

Paper Venue Date Code
Diluie: constructing diverse demonstrations of in-context learning with large language model for unified information extraction Others 2024-04 GitHub
KeyEE: Enhancing Low-Resource Generative Event Extraction with Auxiliary Keyword Sub-Prompt Others 2024-04 GitHub
Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors IJCAI 2024-04
On-the-fly Definition Augmentation of LLMs for Biomedical NER NAACL 2024-03 GitHub
DemoSG: Demonstration-enhanced Schema-guided Generation for Low-resource Event Extraction EMNLP Findings 2023-12 GitHub
One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER IJCAI 2023-09 GitHub
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting COLING 2022-10 GitHub
Unified Structure Generation for Universal Information Extraction ACL 2022-05 GitHub
InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER Arxiv 2022-03
Template-Based Named Entity Recognition Using BART ACL Findings 2021-08 GitHub
Structured prediction as translation between augmented natural languages ICLR 2021-01 GitHub

In-Context Learning

Paper Venue Date Code
TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction ACL Findings 2024 GitHub
RT: a Retrieving and Chain-of-Thought framework for few-shot medical named entity recognition Others 2024-05 GitHub
P-ICL: Point In-Context Learning for Named Entity Recognition with Large Language Models Arxiv 2024-06 GitHub
LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking Arxiv 2024-04 GitHub
Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models Others 2024-04
LLMs in Biomedicine: A study on clinical Named Entity Recognition Arxiv 2024-04
Out of Sesame Street: A Study of Portuguese Legal Named Entity Recognition Through In-Context Learning ResearchGate 2024-04
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Arxiv 2024-04 GitHub
Empirical Analysis of Dialogue Relation Extraction with Large Language Models IJCAI 2024-04
Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models NAACL Short 2024 GitHub
ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context AAAI 2024
On-the-fly Definition Augmentation of LLMs for Biomedical NER NAACL 2024 GitHub
CHisIEC: An Information Extraction Corpus for Ancient Chinese History COLING 2024 GitHub
Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction COLING 2024
CodeKGC: Code Language Model for Generative Knowledge Graph Construction ACM TALLIP 2024-03 GitHub
Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models Arxiv 2024-02 GitHub
In-Context Learning for Few-Shot Nested Named Entity Recognition Arxiv 2024-02
Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study EACL 2024-02 GitHub
Heuristic-Driven Link-of-Analogy Prompting: Enhancing Large Language Models for Document-Level Event Argument Extraction Arxiv 2024-02
LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty WWW 2024
Small Language Model Is a Good Guide for Large Language Model in Chinese Entity Relation Extraction Arxiv 2024-02
C-ICL: Contrastive In-context Learning for Information Extraction Arxiv 2024-02
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering Arxiv 2024-01 GitHub
Chain of Thought with Explicit Evidence Reasoning for Few-shot Relation Extraction EMNLP Findings 2023-12
GPT-RE: In-context Learning for Relation Extraction using Large Language Models EMNLP 2023-12 GitHub
Guideline Learning for In-context Information Extraction EMNLP 2023-12
Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples! EMNLP Findings 2023-12 GitHub
Retrieval-Augmented Code Generation for Universal Information Extraction Arxiv 2023-11
Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment Arxiv 2023-10
GPT-NER: Named Entity Recognition via Large Language Models Arxiv 2023-10 GitHub
GPT Struct Me: Probing GPT Models on Narrative Entity Extraction WI-IAT 2023-10 GitHub
Learning In-context Learning for Named Entity Recognition ACL 2023-07 GitHub
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors ACL Findings 2023-07 GitHub
Code4Struct: Code Generation for Few-Shot Event Structure Prediction ACL 2023-07 GitHub
CODEIE: Large Code Generation Models are Better Few-Shot Information Extractors ACL 2023-07 GitHub
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction? ACL Workshop 2023-07 GitHub
PromptNER: Prompting For Named Entity Recognition Arxiv 2023-06 GitHub
Unified Text Structuralization with Instruction-tuned Language Models Arxiv 2023-03

Zero-shot

Zero-shot Prompting

Paper Venue Date Code
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis ACL 2024 GitHub
Astronomical Knowledge Entity Extraction in Astrophysics Journal Articles via Large Language Models Others 2024-04
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Arxiv 2024-04 GitHub
Empirical Analysis of Dialogue Relation Extraction with Large Language Models IJCAI 2024-04
Retrieval-Augmented Generation-based Relation Extraction Arxiv 2024-04 GitHub
Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Arxiv 2024-04
Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors IJCAI 2024-04
Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models NAACL Short 2024 GitHub
CodeKGC: Code Language Model for Generative Knowledge Graph Construction ACM TALLIP 2024-03 GitHub
On-the-fly Definition Augmentation of LLMs for Biomedical NER NAACL 2024-03 GitHub
Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study EACL 2024-02 GitHub
A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction Arxiv 2024-02
Small Language Model Is a Good Guide for Large Language Model in Chinese Entity Relation Extraction Arxiv 2024-02
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering Arxiv 2024-01 GitHub
Improving Unsupervised Relation Extraction by Augmenting Diverse Sentence Pairs EMNLP 2023-12 GitHub
Prompt-NER: Zero-shot Named Entity Recognition in Astronomy Literature via Large Language Models Arxiv 2023-10
Revisiting Large Language Models as Zero-shot Relation Extractors EMNLP Findings 2023-12
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors ACL Findings 2023-07 GitHub
Code4Struct: Code Generation for Few-Shot Event Structure Prediction ACL 2023-07 GitHub
A Monte Carlo Language Model Pipeline for Zero-Shot Sociopolitical Event Extraction NeurIPS Workshop 2023-10
Global Constraints with Prompting for Zero-Shot Event Argument Classification EACL Findings 2023-05
Zero-Shot Information Extraction via Chatting with ChatGPT Arxiv 2023-02 GitHub

Cross-Domain Learning

Paper Venue Date Code
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction ACL 2024 GitHub
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models ACL 2024 GitHub
Rethinking Negative Instances for Generative Named Entity Recognition ACL Findings 2024 GitHub
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus ACL Short 2024 GitHub
Diluie: constructing diverse demonstrations of in-context learning with large language model for unified information extraction Others 2024-04 GitHub
Advancing Entity Recognition in Biomedicine via Instruction Tuning of Large Language Models Bioinformatics 2024-03 GitHub
ChatUIE: Exploring Chat-based Unified Information Extraction using Large Language Models COLING 2024
ULTRA: Unleash LLMs' Potential for Event Argument Extraction through Hierarchical Modeling and Pair-wise Refinement Arxiv 2024-01
YAYI-UIE: A Chat-Enhanced Instruction Tuning Framework for Universal Information Extraction Arxiv 2024-04
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction ICLR 2024 GitHub
UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition ICLR 2024 GitHub
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction Arxiv 2023-04 GitHub
DEEPSTRUCT: Pretraining of Language Models for Structure Prediction ACL Findings 2022-05 GitHub
Multilingual generative language models for zero-shot cross-lingual event argument extraction ACL 2022-05 GitHub

Cross-Type Learning

Paper Venue Date Code
Document-level event argument extraction by conditional generation NAACL 2021-06 GitHub

Data Augmentation

Data Annotation

Paper Venue Date Code
Astro-NER -- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator? Arxiv 2024-05
MetaIE: Distilling a Meta Model from LLM for All Kinds of Information Extraction Tasks Arxiv 2024-03 GitHub
Augmenting NER Datasets with LLMs: Towards Automated and Refined Annotation Arxiv 2024-03
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data Arxiv 2024-02
Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study EACL 2024-02 GitHub
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition Arxiv 2024-02
LLMaAA: Making Large Language Models as Active Annotators EMNLP Findings 2023-12 GitHub
Improving Unsupervised Relation Extraction by Augmenting Diverse Sentence Pairs EMNLP 2023-12 GitHub
Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models EMNLP 2023-12 GitHub
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction? ACL Workshop 2023-07 GitHub
Large Language Models as Instructors: A Study on Multilingual Clinical Entity Extraction BioNLP Workshop 2023-07 GitHub
Does Synthetic Data Generation of LLMs Help Clinical Text Mining? Arxiv 2023-04
Unleash GPT-2 Power for Event Detection ACL 2021-08

Knowledge Retrieval

Paper Venue Date Code
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition ACL Findings 2024 GitHub
Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction WWW 2024
Learning to Rank Context for Named Entity Recognition Using a Synthetic Dataset EMNLP 2023-12 GitHub
Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge EMNLP Findings 2023-12 GitHub
Chain-of-Thought Prompt Distillation for Multimodal Named Entity Recognition and Multimodal Relation Extraction Arxiv 2023-08

Inverse Generation

Paper Venue Date Code
Distilling Named Entity Recognition Models for Endangered Species from Large Language Models Arxiv 2024-03
Improving Event Definition Following For Zero-Shot Event Detection Arxiv 2024-03
ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models ACL Findings 2024 GitHub
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction Arxiv 2024-02
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction EMNLP 2023-12 GitHub
Entity-to-Text based Data Augmentation for various Named Entity Recognition Tasks ACL Findings 2023-07
Event Extraction as Question Generation and Answering ACL Short 2023-07 GitHub
STAR: Boosting Low-Resource Event Extraction by Structure-to-Text Data Generation with Large Language Models AAAI 2024-03

Synthetic Datasets for Instruction-tuning

Paper Venue Date Code
Rethinking Negative Instances for Generative Named Entity Recognition ACL Findings 2024 GitHub
UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition ICLR 2024-01 GitHub
GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer Arxiv 2023-11 GitHub
Chain-of-Thought Prompt Distillation for Multimodal Named Entity Recognition and Multimodal Relation Extraction Arxiv 2023-08

Prompts Design

Question Answer

Paper Venue Date Code
Knowledge-Enriched Prompt for Low-Resource Named Entity Recognition TALLIP 2024-04
Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models Others 2024-04
Revisiting Large Language Models as Zero-shot Relation Extractors EMNLP Findings 2023-12
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors ACL Findings 2023-07 GitHub
Zero-Shot Information Extraction via Chatting with ChatGPT Arxiv 2023-02 GitHub

Chain of Thought

Paper Venue Date Code
RT: a Retrieving and Chain-of-Thought framework for few-shot medical named entity recognition Others 2024-05 GitHub
Inspire the Large Language Model by External Knowledge on BioMedical Named Entity Recognition Arxiv 2023-09
Chain-of-Thought Prompt Distillation for Multimodal Named Entity Recognition and Multimodal Relation Extraction Arxiv 2023-08
Revisiting Relation Extraction in the era of Large Language Models ACL 2023-07 GitHub
Zero-shot Temporal Relation Extraction with ChatGPT BioNLP 2023-07
PromptNER: Prompting For Named Entity Recognition Arxiv 2023-06

Self-Improvement

Paper Venue Date Code
ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models ACL Findings 2024 GitHub
ULTRA: Unleash LLMs' Potential for Event Argument Extraction through Hierarchical Modeling and Pair-wise Refinement Arxiv 2024-01
Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models NAACL Short 2024 GitHub

Constrained Decoding Generation

Paper Venue Date Code
An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction AAAI 2024-03 GitHub
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning EMNLP 2024-01 GitHub
DORE: Document Ordered Relation Extraction based on Generative Framework EMNLP Findings 2022-12
Autoregressive Structured Prediction with Language Models EMNLP Findings 2022-12 GitHub
Unified Structure Generation for Universal Information Extraction ACL 2022-05 GitHub

Specific Domain

Paper Domain Venue Date Code
Granular Entity Mapper: Advancing Fine-grained Multimodal Named Entity Recognition and Grounding Multimodal EMNLP Findings 2024
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition Multimodal ACL Findings 2024 GitHub
RT: a Retrieving and Chain-of-Thought framework for few-shot medical named entity recognition Medical Others 2024-05 GitHub
Astro-NER -- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator? Astronomy Arxiv 2024-05
Astronomical Knowledge Entity Extraction in Astrophysics Journal Articles via Large Language Models Astronomy Others 2024-04
VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition Biomedical Arxiv 2024-04 GitHub
LLMs in Biomedicine: A study on clinical Named Entity Recognition Biomedical Arxiv 2024-04
Enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models Software Others 2024-04
Out of Sesame Street: A Study of Portuguese Legal Named Entity Recognition Through In-Context Learning Legal ResearchGate 2024-04
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Scientific Arxiv 2024-04 GitHub
Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Acupuncture Point Arxiv 2024-04
Advancing Entity Recognition in Biomedicine via Instruction Tuning of Large Language Models Biomedical Bioinformatics 2024-03 GitHub
Distilling Named Entity Recognition Models for Endangered Species from Large Language Models Endangered Species Arxiv 2024-03
CHisIEC: An Information Extraction Corpus for Ancient Chinese History Historical COLING 2024-03 GitHub
On-the-fly Definition Augmentation of LLMs for Biomedical NER Biomedical NAACL 2024-03 GitHub
Improving LLM-Based Health Information Extraction with In-Context Learning Health Others 2024-03
Structured information extraction from scientific text with large language models Scientific Nat. Commun. 2024-02 GitHub
Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study Pharmacovigilance EACL 2024-02 GitHub
Combining prompt-based language models and weak supervision for labeling named entity recognition on legal documents Legal Others 2024-02
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering Clinical Arxiv 2024-01 GitHub
Impact of Sample Selection on In-Context Learning for Entity Extraction from Scientific Writing Scientific EMNLP Findings 2023-12 GitHub
Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge Multimodal EMNLP Findings 2023-12 GitHub
In-context Learning for Few-shot Multimodal Named Entity Recognition Multimodal EMNLP Findings 2023-12
PolyIE: A Dataset of Information Extraction from Polymer Material Scientific Literature Polymer Material Arxiv 2023-11 GitHub
Prompt-NER: Zero-shot Named Entity Recognition in Astronomy Literature via Large Language Models Astronomy Arxiv 2023-10
Inspire the Large Language Model by External Knowledge on BioMedical Named Entity Recognition Biomedical Arxiv 2023-09
Chain-of-Thought Prompt Distillation for Multimodal Named Entity Recognition and Multimodal Relation Extraction Multimodal Arxiv 2023-08
DICE: Data-Efficient Clinical Event Extraction with Generative Models Clinical ACL 2023-07 GitHub
How far is Language Model from 100% Few-shot Named Entity Recognition in Medical Domain Medical Arxiv 2023-07 GitHub
Large Language Models as Instructors: A Study on Multilingual Clinical Entity Extraction Multilingual / Clinical BioNLP 2023-07 GitHub
Does Synthetic Data Generation of LLMs Help Clinical Text Mining? Clinical Arxiv 2023-04
Yes but.. Can ChatGPT Identify Entities in Historical Documents Historical JCDL 2023-03
Zero-shot Clinical Entity Recognition using ChatGPT Clinical Arxiv 2023-03
Structured information extraction from complex scientific text with fine-tuned large language models Scientific Arxiv 2022-12 Demo
Multilingual generative language models for zero-shot cross-lingual event argument extraction Multilingual ACL 2022-05 GitHub

Evaluation and Analysis

Paper Venue Date Code
TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction ACL Findings 2024 GitHub
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus ACL Short 2024 GitHub
CHisIEC: An Information Extraction Corpus for Ancient Chinese History COLING 2024 GitHub
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models NAACL 2024 GitHub
Empirical Analysis of Dialogue Relation Extraction with Large Language Models IJCAI 2024
Astro-NER -- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator? Arxiv 2024-05
Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Arxiv 2024-04
Mining experimental data from Materials Science literature with Large Language Models: an evaluation study Arxiv 2024-04 GitHub
Distilling Named Entity Recognition Models for Endangered Species from Large Language Models Arxiv 2024-03
LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities Arxiv 2024-02 GitHub
Few shot clinical entity recognition in three languages: Masked language models outperform LLM prompting Arxiv 2024-02
Information Extraction from Legal Wills: How Well Does GPT-4 Do? EMNLP Findings 2023-12 GitHub
Information Extraction in Low-Resource Scenarios: Survey and Perspective Arxiv 2023-12 GitHub
Empirical Study of Zero-Shot NER with ChatGPT EMNLP 2023-12 GitHub
NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval EMNLP Findings 2023-12 GitHub
Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction EMNLP 2023-12 GitHub
PolyIE: A Dataset of Information Extraction from Polymer Material Scientific Literature Arxiv 2023-11 GitHub
XNLP: An Interactive Demonstration System for Universal Structured NLP Arxiv 2023-08 Demo
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks Arxiv 2023-07
How far is Language Model from 100% Few-shot Named Entity Recognition in Medical Domain Arxiv 2023-07 GitHub
Revisiting Relation Extraction in the era of Large Language Models ACL 2023-07 GitHub
Zero-shot Temporal Relation Extraction with ChatGPT BioNLP 2023-07
InstructIE: A Chinese Instruction-based Information Extraction Dataset Arxiv 2023-05 GitHub
Is Information Extraction Solved by ChatGPT? An Analysis of Performance, Evaluation Criteria, Robustness and Errors Arxiv 2023-05 GitHub
Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness Arxiv 2023-04 GitHub
Exploring the Feasibility of ChatGPT for Event Extraction Arxiv 2023-03
Yes but.. Can ChatGPT Identify Entities in Historical Documents JCDL 2023-03
Zero-shot Clinical Entity Recognition using ChatGPT Arxiv 2023-03
Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again EMNLP Findings 2022-12 GitHub
Large Language Models are Few-Shot Clinical Information Extractors EMNLP 2022-12 Huggingface

Project and Toolkit

Paper Type Venue Date Link
ONEKE Project - - Link
TechGPT-2.0: A Large Language Model Project to Solve the Task of Knowledge Graph Construction Project Arxiv 2024-01 Link
CollabKG: A Learnable Human-Machine-Cooperative Information Extraction Toolkit for (Event) Knowledge Graph Construction Toolkit Arxiv 2023-07 Link

Recently Updated Papers

2024/09/04

Paper Venue Date Code
Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction ACL 2024-08 GitHub
Epidemic Information Extraction for Event-Based Surveillance using Large Language Models ICICT 2024-08
SpeechEE: A Novel Benchmark for Speech Event Extraction ACM MM 2024-08 GitHub
HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction Arxiv 2024-08
Knowledge AI: Fine-tuning NLP Models for Facilitating Scientific Knowledge Extraction and Understanding Arxiv 2024-08
Target Prompting for Information Extraction with Vision Language Model Arxiv 2024-08
Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models Arxiv 2024-08 GitHub
Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study Arxiv 2024-08
CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition ECAI 2024-08
LLMs are not Zero-Shot Reasoners for Biomedical Information Extraction Arxiv 2024-08
Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition Arxiv 2024-07
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models Arxiv 2024-08 GitHub
FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios ECAI 2024-07 GitHub
Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters KaLLM workshop 2024-07 GitHub
Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER Arxiv 2024-07
Large Language Models Struggle in Token-Level Clinical Named Entity Recognition AMIA 2024-08
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks Arxiv 2024-08
Retrieval Augmented Instruction Tuning for Open NER with Large Language Models Arxiv 2024-06 GitHub
Beyond Boundaries: Learning a Universal Entity Taxonomy across Datasets and Languages for Open Named Entity Recognition Arxiv 2024-06 GitHub
Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity Recognition IEEE Access 2024-06 GitHub
llmNER: (Zero|Few)-Shot Named Entity Recognition, Exploiting the Power of Large Language Models Arxiv 2024-06 GitHub
Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks Arxiv 2024-06

Datasets

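* marks a multimodal dataset. Columns prefixed with # give counts: #Class is the number of categories, and #Train, #Val, and #Test are the numbers of sentences in each split. For EE datasets, #Class is written as event types/argument roles.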

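| Task | Dataset | Domain | #Class | #Train | #Val | #Test | Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NER | ACE04 | News | 7 | 6202 | 745 | 812 | Link |
| NER | ACE05 | News | 7 | 7299 | 971 | 1060 | Link |
| NER | BC5CDR | Biomedical | 2 | 4560 | 4581 | 4797 | Link |
| NER | Broad Twitter Corpus | Social Media | 3 | 6338 | 1001 | 2000 | Link |
| NER | CADEC | Biomedical | 1 | 5340 | 1097 | 1160 | Link |
| NER | CoNLL03 | News | 4 | 14041 | 3250 | 3453 | Link |
| NER | CoNLLpp | News | 4 | 14041 | 3250 | 3453 | Link |
| NER | CrossNER-AI | Artificial Intelligence | 14 | 100 | 350 | 431 | Link |
| NER | CrossNER-Literature | Literary | 12 | 100 | 400 | 416 | |
| NER | CrossNER-Music | Musical | 13 | 100 | 380 | 465 | |
| NER | CrossNER-Politics | Political | 9 | 199 | 540 | 650 | |
| NER | CrossNER-Science | Scientific | 17 | 200 | 450 | 543 | |
| NER | FabNER | Scientific | 12 | 9435 | 2182 | 2064 | Link |
| NER | Few-NERD | General | 66 | 131767 | 18824 | 37468 | Link |
| NER | FindVehicle | Traffic | 21 | 21565 | 20777 | 20777 | Link |
| NER | GENIA | Biomedical | 5 | 15023 | 1669 | 1854 | Link |
| NER | HarveyNER | Social Media | 4 | 3967 | 1301 | 1303 | Link |
| NER | MIT-Movie | Social Media | 12 | 9774 | 2442 | 2442 | Link |
| NER | MIT-Restaurant | Social Media | 8 | 7659 | 1520 | 1520 | Link |
| NER | MultiNERD | Wikipedia | 16 | 134144 | 10000 | 10000 | Link |
| NER | NCBI | Biomedical | 4 | 5432 | 923 | 940 | Link |
| NER | OntoNotes 5.0 | General | 18 | 59924 | 8528 | 8262 | Link |
| NER | ShARe13 | Biomedical | 1 | 8508 | 12050 | 9009 | Link |
| NER | ShARe14 | Biomedical | 1 | 17404 | 1360 | 15850 | Link |
| NER | SNAP* | Social Media | 4 | 4290 | 1432 | 1459 | Link |
| NER | Temporal Twitter Corpus (TTC) | Social Media | 3 | 10000 | 500 | 1500 | Link |
| NER | Tweebank-NER | Social Media | 4 | 1639 | 710 | 1201 | Link |
| NER | Twitter2015* | Social Media | 4 | 4000 | 1000 | 3357 | Link |
| NER | Twitter2017* | Social Media | 4 | 3373 | 723 | 723 | Link |
| NER | TwitterNER7 | Social Media | 7 | 7111 | 886 | 576 | Link |
| NER | WikiDiverse* | News | 13 | 6312 | 755 | 757 | Link |
| NER | WNUT2017 | Social Media | 6 | 3394 | 1009 | 1287 | Link |
| RE | ACE05 | News | 7 | 10051 | 2420 | 2050 | Link |
| RE | ADE | Biomedical | 1 | 3417 | 427 | 428 | Link |
| RE | CoNLL04 | News | 5 | 922 | 231 | 288 | Link |
| RE | DocRED | Wikipedia | 96 | 3008 | 300 | 700 | Link |
| RE | MNRE* | Social Media | 23 | 12247 | 1624 | 1614 | Link |
| RE | NYT | News | 24 | 56196 | 5000 | 5000 | Link |
| RE | Re-TACRED | News | 40 | 58465 | 19584 | 13418 | Link |
| RE | SciERC | Scientific | 7 | 1366 | 187 | 397 | Link |
| RE | SemEval2010 | General | 19 | 6507 | 1493 | 2717 | Link |
| RE | TACRED | News | 42 | 68124 | 22631 | 15509 | Link |
| RE | TACREV | News | 42 | 68124 | 22631 | 15509 | Link |
| EE | ACE05 | News | 33/22 | 17172 | 923 | 832 | Link |
| EE | CASIE | Cybersecurity | 5/26 | 11189 | 1778 | 3208 | Link |
| EE | GENIA11 | Biomedical | 9/11 | 8730 | 1091 | 1092 | Link |
| EE | GENIA13 | Biomedical | 13/7 | 4000 | 500 | 500 | Link |
| EE | PHEE | Biomedical | 2/16 | 2898 | 961 | 968 | Link |
| EE | RAMS | News | 139/65 | 7329 | 924 | 871 | Link |
| EE | WikiEvents | Wikipedia | 50/59 | 5262 | 378 | 492 | Link |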
