Summary of medical NLP evaluations/competitions, datasets, papers and pre-trained models.
Since Cris Lee left the medical NLP field in 2021, this repo has been maintained by Xidong Wang, Ziyue Lin, and Jing Tang.
- 1. Evaluation
- 2. Competitions
- 3. Datasets
- 4. Open-source Models
- 5. Relevant Papers
- 6. Open-source Toolkits
- 7. Industrial Solutions
- 8. Blog Sharing
- 9. Friendly Links
-
CMB
- GitHub Link: https://github.com/FreedomIntelligence/CMB
- Source: Various clinical medical examinations; complex clinical case consultations
-
CMExam
- GitHub Link: https://github.com/williamliujl/CMExam
- Source: Past questions from the Medical Practitioner Qualification Examination
-
PromptCBLUE
- GitHub Link: https://github.com/michael-wzhu/PromptCBLUE
- Source: CBLUE
-
CBLUE
- GitHub Link: https://github.com/CBLUEbenchmark/CBLUE
- Source: Academic evaluation competitions from past CHIP conferences and datasets from Alibaba Quark's medical search service
-
MultiMedBench
- Description: A large-scale multimodal medical benchmark, introduced alongside the Med-PaLM M generative model
- None at the moment; additions are welcome
-
BioNLP Workshop 2023 Shared Task
- Link: https://aclweb.org/aclwiki/BioNLP_Workshop#SHARED_TASKS_2023
- Source: BioNLP Workshop
-
MedVidQA 2023
- Link: https://medvidqa.github.io/index.html
- Source: National Institutes of Health
-
MEDIQA-2021
- Link: https://sites.google.com/view/mediqa2021
- Source: NAACL-BioNLP 2021 workshop
-
ICLR-2021 International Competition for Medical Dialogue Generation and Automatic Diagnosis
- Source: ICLR 2021 workshop
-
NLP for Medical Imaging - Medical Imaging Diagnostic Report Generation
- Link: https://gaiic.caai.cn/ai2023/
- Source: CAAI Global Artificial Intelligence Innovation Contest (GAIIC)
-
iFlytek - Disease Claims Challenge (2022)
- Link: http://challenge.xfyun.cn/topic/info?type=disease-claims-2022&ch=ds22-dw-sq03
- Source: iFlytek
-
Evaluation Task of the 8th China Health Information Processing Conference (CHIP2022)
- Link: http://cips-chip.org.cn/
- Source: CHIP2022
-
iFlytek - Medical Entity and Relationship Recognition Challenge
- Link: http://www.fudan-disc.com/sharedtask/imcs21/index.html
- Source: iFlytek
-
Huatuo-26M
- Link: https://github.com/FreedomIntelligence/Huatuo-26M
- Description: Huatuo-26M is the largest Chinese medical Q&A dataset to date.
-
Chinese Medical Dialogue Dataset
- Link: https://github.com/FreedomIntelligence/Huatuo-26M
- Description: Medical Q&A data from six departments
-
CBLUE
- Link: https://github.com/CBLUEbenchmark/CBLUE
- Description: Covers medical text information extraction (entity recognition, relation extraction)
-
cMedQA2 (108K)
- Link: https://github.com/zhangsheng93/cMedQA2
- Description: Chinese medical Q&A dataset with over 100,000 entries.
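Q&A corpora such as cMedQA2 typically ship as separate question and answer tables joined on a question id. The stdlib sketch below pairs the two tables; the column names and sample rows are illustrative assumptions, not the release's exact schema:

```python
import csv
import io

# Toy CSVs in the questions/answers-linked-by-id shape common to Q&A corpora.
# Column names here are assumptions; check the actual release for its schema.
questions_csv = "question_id,content\n1,What causes fever?\n2,Is aspirin safe?\n"
answers_csv = (
    "ans_id,question_id,content\n"
    "10,1,Usually infection.\n"
    "11,1,See a doctor.\n"
    "12,2,Generally yes.\n"
)

# Map question id -> question text.
questions = {row["question_id"]: row["content"]
             for row in csv.DictReader(io.StringIO(questions_csv))}

# Group answers under their question to form (question, [answers]) pairs.
pairs = {}
for row in csv.DictReader(io.StringIO(answers_csv)):
    pairs.setdefault(questions[row["question_id"]], []).append(row["content"])

print(pairs["What causes fever?"])  # ['Usually infection.', 'See a doctor.']
```

The same join-then-group pattern applies to most of the Q&A datasets listed here, with only the file paths and column names changing.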
-
xywy-KG (294K triples)
- Link: https://github.com/baiyang2464/chatbot-base-on-Knowledge-Graph
- Description: 44.1K entities, 294.1K triples
-
39Health-KG (210K triples)
- Link: https://github.com/zhihao-chen/QASystemOnMedicalGraph
- Description: Includes 15 information items and 7 entity types, with about 37,000 entities and 210,000 entity relationships.
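Knowledge-graph datasets like the two above are distributed as (head, relation, tail) triples. A minimal sketch of indexing such triples for lookup, as a KG-backed Q&A bot would; the sample triples are invented for illustration:

```python
from collections import defaultdict

# Toy (head, relation, tail) triples; real KG releases use the same shape,
# though entity and relation names will differ.
triples = [
    ("diabetes", "symptom", "polyuria"),
    ("diabetes", "symptom", "fatigue"),
    ("diabetes", "drug", "metformin"),
    ("hypertension", "symptom", "headache"),
]

# Index triples by (head, relation) for constant-time lookup at query time.
index = defaultdict(list)
for head, relation, tail in triples:
    index[(head, relation)].append(tail)

def lookup(entity, relation):
    """Return all tail entities for a given head entity and relation."""
    return index.get((entity, relation), [])

print(lookup("diabetes", "symptom"))  # ['polyuria', 'fatigue']
```

In a real system the entity argument would come from entity recognition and linking over the user's question, and the relation from intent classification.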
-
Medical-Dialogue-System
- Link: https://github.com/UCSD-AI4H/Medical-Dialogue-System
- Description: The MedDialog dataset contains conversations in Chinese between doctors and patients, with 1.1 million dialogues and 4 million utterances. The data is continuously growing and more dialogues will be added.
-
Chinese medical dialogue data
- Link: https://github.com/Toyhom/Chinese-medical-dialogue-data
- Description: The dataset contains 792,099 entries in total, covering six departments: Andrology, Pediatrics, Obstetrics and Gynecology, Internal Medicine, Surgery, and Oncology.
-
Yidu-S4K
- Link: http://openkg.cn/dataset/yidu-s4k
- Description: Named Entity Recognition, Entity and Attribute Extraction
-
Yidu-N7K
- Link: http://openkg.cn/dataset/yidu-n7k
- Description: Clinical Language Standardization
-
Chinese Medical Q&A Dataset
- Link: https://github.com/zhangsheng93/cMedQA2
- Description: Medical Q&A
-
Chinese Doctor-Patient Dialogue Data
- Link: https://github.com/UCSD-AI4H/Medical-Dialogue-System
- Description: Medical Q&A
-
CPubMed-KG (4.4M triples)
- Link: https://cpubmed.openi.org.cn/graph/wiki
- Description: Full-text journal data of high quality from the Chinese Medical Association
-
Chinese Medical Knowledge Graph CMeKG (1M triples)
- Link: http://cmekg.pcl.ac.cn/
- Description: CMeKG(Chinese Medical Knowledge Graph)
-
CHIP Annual Evaluation (Official Evaluation)
- Link: http://cips-chip.org.cn/2022/callforeval ; http://www.cips-chip.org.cn/2021/ ; http://cips-chip.org.cn/2020/
- Description: CHIP Annual Evaluation (Official Evaluation)
-
Ruijin Hospital Diabetes Dataset (Diabetes)
- Link: https://tianchi.aliyun.com/competition/entrance/231687/information
- Description: Diabetes literature mining and knowledge graph construction using diabetes-related textbooks and research papers
-
Tianchi Novel Coronavirus Pneumonia Question Matching Competition (Novel Coronavirus)
- Link: https://tianchi.aliyun.com/competition/entrance/231776/information
- Description: The competition data includes anonymized pairs of medical questions and annotated data.
-
MedMentions
- Link: https://github.com/chanzuckerberg/MedMentions
- Description: Biomedical entity linking dataset based on PubMed abstracts
-
webMedQA
- Link: https://github.com/hejunqing/webMedQA
- Description: Medical Q&A
-
COMETA
- Link: https://www.siphs.org/
- Description: Medical entity linking data in social media. Published at EMNLP2020
-
PubMedQA
- Link: https://arxiv.org/abs/1909.06146
- Description: Biomedical question answering dataset built from PubMed abstracts, with yes/no/maybe answers. Published at EMNLP 2019.
-
MediQA
- Link: https://sites.google.com/view/mediqa2021
- Description: Text summarization
-
ChatDoctor Dataset-1
- Link: https://drive.google.com/file/d/1lyfqIwlLSClhgrCutWuEe_IACNq6XNUt/view?usp=sharing
- Description: 100k real conversations between patients and doctors from HealthCareMagic.com
-
ChatDoctor Dataset-2
- Link: https://drive.google.com/file/d/1ZKbqgYqWc7DJHs3N9TQYQVPdDQmZaClA/view?usp=sharing
- Description: 10k real conversations between patients and doctors from icliniq.com
-
Visual Med-Alpaca Data
- Link: https://github.com/cambridgeltl/visual-med-alpaca/tree/main/data
- Description: Data used for Visual Med-Alpaca training, produced from BigBio, ROCO, and GPT-3.5-Turbo
-
CheXpert Plus
- Link: https://github.com/Stanford-AIMI/chexpert-plus
- Description: The largest publicly available text dataset in radiology, comprising 36 million text tokens, each report paired with high-quality images in DICOM format. The dataset also includes extensive patient metadata covering various clinical and social groups, as well as numerous pathology labels and RadGraph annotations.
- BioBERT:
- Website: https://github.com/naver/biobert-pretrained
- Introduction: A language representation model for the biomedical domain, designed for biomedical text mining tasks such as named entity recognition, relation extraction, and question answering.
- BlueBERT:
- Website: https://github.com/ncbi-nlp/BLUE_Benchmark
- Introduction: The BLUE benchmark consists of five biomedical text-mining tasks over ten corpora. It relies on pre-existing datasets that have been widely used by the BioNLP community as shared tasks. The tasks cover a diverse range of text genres (biomedical literature and clinical notes), dataset sizes, and degrees of difficulty and, more importantly, highlight common biomedical text-mining challenges.
- BioFLAIR:
- Website: https://github.com/flairNLP/flair
- Introduction: Flair is a powerful NLP library that lets you apply state-of-the-art models to your text for named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), word-sense disambiguation, and classification, with special support for biomedical data and a rapidly growing number of languages. Flair is also a text embedding library and a PyTorch NLP framework.
- COVID-Twitter-BERT:
- Website: https://github.com/digitalepidemiologylab/covid-twitter-bert
- Introduction: COVID-Twitter-BERT (CT-BERT) is a transformer-based model pretrained on a large corpus of Twitter messages on the topic of COVID-19. The v2 model is trained on 97M tweets (1.2B training examples).
- bio-lm (Biomedical and Clinical Language Models)
- Website: https://github.com/facebookresearch/bio-lm
- Introduction: This work evaluates many models used for biomedical and clinical NLP tasks and trains new models that perform substantially better.
- BioALBERT
- Website: https://github.com/usmaann/BioALBERT
- Introduction: A biomedical language representation model trained on large domain-specific (biomedical) corpora, designed for biomedical text mining tasks.
- BenTsao:
- Website: https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese
- Introduction: BenTsao is based on LLaMA-7B and has undergone fine-tuning with Chinese medical instructions through instruct-tuning. Researchers built a Chinese medical instruction dataset using a medical knowledge graph and the GPT3.5 API, and used this dataset as the basis for instruct-tuning LLaMA, thereby improving its question-answering capabilities in the medical field.
- BianQue:
- Website: https://github.com/scutcyr/BianQue
- Introduction: A large medical conversation model fine-tuned through joint training with instructions and multi-turn inquiry dialogues. It is based on ClueAI/ChatYuan-large-v2 and fine-tuned using a blended dataset of Chinese medical question and answer instructions as well as multi-turn inquiry dialogues.
- SoulChat:
- Website: https://github.com/scutcyr/SoulChat
- Introduction: SoulChat, initialized from ChatGLM-6B, underwent instruct-tuning on a large-scale dataset of Chinese long-form instructions and multi-turn empathetic conversations in psychological counseling, aiming to enhance the model's empathy, its ability to guide users in expressing themselves, and its capacity to provide thoughtful advice.
- DoctorGLM:
- Website: https://github.com/xionghonglin/DoctorGLM
- Introduction: A large Chinese diagnostic model based on ChatGLM-6B, fine-tuned on a Chinese medical conversation dataset using techniques such as LoRA and P-Tuning v2, and deployed for practical use.
- HuatuoGPT:
- Website: https://github.com/FreedomIntelligence/HuatuoGPT
- Introduction: HuatuoGPT is a GPT-like large language model fine-tuned on Chinese medical instructions, designed specifically for medical consultation. Its training data includes data distilled from ChatGPT and real data from doctors, and reinforcement learning from human feedback (RLHF) was incorporated during training to improve performance.
- HuatuoGPT-II:
- Website: https://github.com/FreedomIntelligence/HuatuoGPT-II
- Introduction: HuatuoGPT-II employs an innovative domain adaptation method to significantly boost its medical knowledge and dialogue proficiency. It achieves state-of-the-art performance on several medical benchmarks, notably surpassing GPT-4 in expert evaluations and on recent medical licensing exams.
- GatorTron:
- Website: https://github.com/uf-hobi-informatics-lab/GatorTron
- Introduction: An early LLM developed for the healthcare domain, investigating how systems using unstructured EHRs can benefit from clinical LLMs with billions of parameters.
- Codex-Med:
- Website: https://github.com/vlievin/medical-reasoning
- Introduction: Codex-Med investigated the effectiveness of GPT-3.5 models on medical question answering, using two multiple-choice medical exam datasets, USMLE and MedMCQA, as well as the medical reading comprehension dataset PubMedQA.
- Galactica:
- Website: https://galactica.org/
- Introduction: Aiming to solve information overload in science, Galactica was proposed to store, combine, and reason about scientific knowledge, including healthcare. It was trained on a large corpus of papers, reference material, and knowledge bases to potentially discover hidden connections between different research and bring insights to the surface.
- DeID-GPT:
- Website: https://github.com/yhydhx/ChatGPT-API
- Introduction: A novel GPT-4-enabled de-identification framework that automatically identifies and removes identifying information from medical text.
- ChatDoctor:
- Website: https://github.com/Kent0n-Li/ChatDoctor
- Introduction: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge.
- MedAlpaca:
- Website: https://github.com/kbressem/medAlpaca
- Introduction: MedAlpaca adopts an open-source policy that enables on-site deployment, mitigating privacy concerns. It is built upon LLaMA models with 7 and 13 billion parameters.
- PMC-LLaMA:
- Website: https://github.com/chaoyi-wu/PMC-LLaMA
- Introduction: PMC-LLaMA is an open-source language model built by tuning LLaMA-7B on 4.8 million biomedical academic papers to inject medical knowledge, enhancing its capability in the medical domain.
- Visual Med-Alpaca:
- Website: https://github.com/cambridgeltl/visual-med-alpaca
- Introduction: Visual Med-Alpaca is an open-source, parameter-efficient biomedical foundation model that can be integrated with medical "visual experts" for multimodal biomedical tasks. Built upon the LLaMa-7B architecture, this model is trained using an instruction set curated collaboratively by GPT-3.5-Turbo and human experts.
- GatorTronGPT:
- Website: https://github.com/uf-hobi-informatics-lab/GatorTronGPT
- Introduction: GatorTronGPT is a clinical generative LLM designed with a GPT-3 architecture comprising 5 or 20 billion parameters. It utilizes a vast corpus of 277 billion words, consisting of a combination of clinical and English text.
- MedAGI:
- Website: https://github.com/JoshuaChou2018/MedAGI
- Introduction: A paradigm for unifying domain-specific medical LLMs at minimal cost, and a possible path toward medical AGI, rather than an LLM itself.
- LLaVA-Med:
- Website: https://github.com/microsoft/LLaVA-Med
- Introduction: LLaVA-Med was initialized from the general-domain LLaVA and then trained in a curriculum-learning fashion (first biomedical concept alignment, then full instruction-tuning).
- Med-Flamingo:
- Website: https://github.com/snap-stanford/med-flamingo
- Introduction: Med-Flamingo is a vision-language model designed to handle interleaved multimodal data comprising both images and text. Building on the achievements of Flamingo, it extends these capabilities to the medical domain by pre-training on diverse multimodal knowledge sources across various medical disciplines.
- Large Language Models Encode Clinical Knowledge. Paper Link
- Performance of ChatGPT on USMLE: The Potential of Large Language Models for AI-Assisted Medical Education. Paper Link
- Turing Test for Medical Advice from ChatGPT. Paper Link
- Toolformer: Language Models Can Self-Learn to Use Tools. Paper Link
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automatic Feedback. Paper Link
- Capability of GPT-4 in Medical Challenge Questions. Paper Link
- Pretrained Language Models in Biomedical Field: A Systematic Review. Paper Link
- Deep Learning Guide for Healthcare. Paper Link. Published in Nature Medicine.
- A Survey of Large Language Models for Healthcare. Paper Link
Articles Related to Electronic Health Records
- Transfer Learning from Medical Literature for Section Prediction in Electronic Health Records. Paper Link
- MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records. Paper Link
Medical Relation Extraction
- Leveraging Dependency Forest for Neural Medical Relation Extraction. Paper Link
Medical Knowledge Graph
- Learning a Health Knowledge Graph from Electronic Medical Records. Paper Link
Auxiliary Diagnosis
- Evaluation and Accurate Diagnoses of Pediatric Diseases Using Artificial Intelligence. Paper Link
Medical Entity Linking (Normalization)
- Medical Entity Linking Using Triplet Network. Paper Link
- A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization. Paper Link
- Deep Neural Models for Medical Concept Normalization in User-Generated Texts. Paper Link
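The triplet-network approach in the first paper above ranks candidate concepts by embedding distance, training so that a mention lands closer to its correct concept than to incorrect ones. A toy sketch of the triplet ranking loss such models optimize; the vectors and margin here are invented for illustration:

```python
import math

def distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: push the positive closer to the anchor than the negative,
    by at least `margin`."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)

# Toy 2-d embeddings: a mention of "heart attack" should land nearer the
# concept "myocardial infarction" than an unrelated concept like "migraine".
mention = [1.0, 0.0]
correct_concept = [0.9, 0.1]
wrong_concept = [-1.0, 0.5]

loss = triplet_loss(mention, correct_concept, wrong_concept)
print(round(loss, 3))  # 0.0: already separated by more than the margin
```

At inference time no loss is needed: the mention is simply linked to the candidate concept with the smallest embedding distance.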
List of Medical-Related Papers from ACL 2020
- A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization. Paper Link
- Biomedical Entity Representations with Synonym Marginalization. Paper Link
- Document Translation vs. Query Translation for Cross-Lingual Information Retrieval in the Medical Domain. Paper Link
- MIE: A Medical Information Extractor towards Medical Dialogues. Paper Link
- Rationalizing Medical Relation Prediction from Corpus-level Statistics. Paper Link
List of Medical NLP Related Papers from AAAI 2020
- On the Generation of Medical Question-Answer Pairs. Paper Link
- LATTE: Latent Type Modeling for Biomedical Entity Linking. Paper Link
- Learning Conceptual-Contextual Embeddings for Medical Text. Paper Link
- Understanding Medical Conversations with Scattered Keyword Attention and Weak Supervision from Responses. Paper Link
- Simultaneously Linking Entities and Extracting Relations from Biomedical Text without Mention-level Supervision. Paper Link
- Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer! Paper Link
List of Medical NLP Related Papers from EMNLP 2020
- Towards Medical Machine Reading Comprehension with Structural Knowledge and Plain Text. Paper Link
- MedDialog: Large-scale Medical Dialogue Datasets. Paper Link
- COMETA: A Corpus for Medical Entity Linking in the Social Media. Paper Link
- Biomedical Event Extraction as Sequence Labeling. Paper Link
- FedED: Federated Learning via Ensemble Distillation for Medical Relation Extraction. Paper Link Paper Analysis
- Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition. Paper Link
- A Knowledge-driven Generative Model for Multi-implication Chinese Medical Procedure Entity Normalization. Paper Link
- BioMegatron: Larger Biomedical Domain Language Model. Paper Link
- Querying Across Genres for Medical Claims in News. Paper Link
- Tokenization tool: PKUSEG (Project Link). Description: A multi-domain Chinese word segmentation toolkit released by Peking University, with support for the medical domain.
- Lingyi Wisdom
- Left Hand Doctor
- Yidu Cloud Research Institute - Medical Natural Language Processing
- Baidu - Medical Text Structuring
- Alibaba Cloud - Medical Natural Language Processing
- Alpaca: A Powerful Open Source Instruction Following Model
- Lessons Learned from Building Natural Language Processing Systems in the Medical Field
- Introduction to Medical Public Databases and Data Mining Techniques in the Big Data Era
- Looking at the Development of NLP Application in the Medical Field from ACL 2021, with Resource Download
- awesome_Chinese_medical_NLP
- Chinese NLP Dataset Search
- medical-data(Large Amount of Medical Related Data)
- Tianchi Dataset (Includes Multiple Medical NLP Datasets)
@misc{medical_NLP_github,
  author = {Xidong Wang and Ziyue Lin and Jing Tang},
title = {Medical NLP},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/FreedomIntelligence/Medical_NLP}}
}