ErAConD: Error Annotated Conversational Dialog Dataset for Grammatical Error Correction

NewsEdits: A News Article Revision Dataset and a Novel Document-Level Reasoning Challenge

SUBS: Subtree Substitution for Compositional Semantic Parsing

Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks

Political Ideology and Polarization: A Multi-dimensional Approach

Cooperative Self-training of Machine Reading Comprehension

GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers

A Robustly Optimized BMRC for Aspect Sentiment Triplet Extraction

Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds

Towards Process-Oriented, Modular, and Versatile Question Generation that Meets Educational Needs

On Synthetic Data for Back Translation

Towards Robust and Semantically Organised Latent Representations for Unsupervised Text Style Transfer

Automatic Correction of Human Translations

On the Robustness of Reading Comprehension Models to Entity Renaming

Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization

Cross-document Misinformation Detection based on Event Graph Reasoning

Machine-in-the-Loop Rewriting for Creative Image Captioning

A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction

User-Centric Gender Rewriting

Reframing Human-AI Collaboration for Generating Free-Text Explanations

EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction

Analyzing Modality Robustness in Multimodal Sentiment Analysis

Easy Adaptation to Mitigate Gender Bias in Multilingual Text Classification

Long-term Control for Dialogue Generation: Methods and Evaluation

Learning Dialogue Representations from Consecutive Utterances

PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining

GRAM: Fast Fine-tuning of Pre-trained Language Models for Content-based Collaborative Filtering

Generating Repetitions with Appropriate Repeated Words

CompactIE: Compact Facts in Open Information Extraction

CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination

OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering

Provably Confidential Language Modelling

KAT: A Knowledge Augmented Transformer for Vision-and-Language

When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it

On Curriculum Learning for Commonsense Reasoning

FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization

ScAN: Suicide Attempt and Ideation Events Dataset

Socially Aware Bias Measurements for Hindi Language Representations

AmbiPun: Generating Humorous Puns with Ambiguous Context

EmpHi: Generating Empathetic Responses with Human-like Intents

Inducing and Using Alignments for Transition-based AMR Parsing

Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?

DREAM: Improving Situational QA by First Elaborating the Situation

Probing via Prompting

Cross-Domain Detection of GPT-2-Generated Technical Text

MultiSpanQA: A Dataset for Multi-Span Question Answering

Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

Commonsense and Named Entity Aware Knowledge Grounded Dialogue Generation

Efficient Hierarchical Domain Adaptation for Pretrained Language Models

Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate

Quality-Aware Decoding for Neural Machine Translation

Pretrained Models for Multilingual Federated Learning

AcTune: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models

Go Back in Time: Generating Flashbacks in Stories with Event Temporal Prompts

Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays

Relation-Specific Attentions over Entity Mentions for Enhanced Document-Level Relation Extraction

Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation

BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation

Disentangled Learning of Stance and Aspect Topics for Vaccine Attitude Detection in Social Media

Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models

MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation

Implicit n-grams Induced by Recurrence

Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness

MuCPAD: A Multi-Domain Chinese Predicate-Argument Dataset

ValCAT: Variable-Length Contextualized Adversarial Transformations Using Encoder-Decoder Language Model

CIAug: Equipping Interpolative Augmentation with Curriculum Learning

Proposition-Level Clustering for Multi-Document Summarization

BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer

Combining Humor and Sarcasm for Improving Political Parody Detection

TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages

Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations

The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation

MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting

DEGREE: A Data-Efficient Generation-Based Event Extraction Model

Hero-Gang Neural Model For Named Entity Recognition

All You May Need for VQA are Image Captions

Frustratingly Easy System Combination for Grammatical Error Correction

Simple Local Attentions Remain Competitive for Long-Context Tasks

Multi-Relational Graph Transformer for Automatic Short Answer Grading

CS1QA: A Dataset for Assisting Code-based Question Answering in an Introductory Programming Course

Unsupervised Cross-Lingual Transfer of Structured Predictors without Source Data

Don’t Take It Literally: An Edit-Invariant Sequence Loss for Text Generation

Reference-free Summarization Evaluation via Semantic Correlation and Compression Ratio

Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models

Sentence-Level Resampling for Named Entity Recognition

Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem

Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks

Do Prompt-Based Models Really Understand the Meaning of Their Prompts?

GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models

SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction

LITE: Intent-based Task Representation Learning Using Weak Supervision

SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling

Combating the Curse of Multilinguality in Cross-Lingual WSD by Aligning Sparse Contextualized Word Representations

What do tokens know about their characters and how do they know it?

AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

Paragraph-based Transformer Pre-training for Multi-Sentence Inference

Interactive Query-Assisted Summarization via Deep Reinforcement Learning

QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization

How Gender Debiasing Affects Internal Model Representations, and Why It Matters

A Structured Span Selector

Learning To Retrieve Prompts for In-Context Learning

Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection

Learning to Retrieve Passages without Supervision

Re2G: Retrieve, Rerank, Generate

Don’t sweat the small stuff, classify the rest: Sample Shielding to protect text classifiers against adversarial attacks

Gender Bias in Masked Language Models for Multiple Languages

Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization

MetaICL: Learning to Learn In Context

Robust Conversational Agents against Imperceptible Toxicity Triggers

Selective Differential Privacy for Language Modeling

Do Trajectories Encode Verb Meaning?

Learning to Borrow– Relation Representation for Without-Mention Entity-Pairs for Knowledge Graph Completion

Modal Dependency Parsing via Language Model Priming

Document-Level Relation Extraction with Sentences Importance Estimation and Focusing

Triggerless Backdoor Attack for NLP Tasks with Clean Labels

PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding

Interpretable Proof Generation via Iterative Backward Reasoning

Incorporating Centering Theory into Neural Coreference Resolution

Progressive Class Semantic Matching for Semi-supervised Text Classification

Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection

A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation

Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis

Analyzing Encoded Concepts in Transformer Language Models

MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction

NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias

Enhance Incomplete Utterance Restoration by Joint Learning Token Extraction and Text Generation

Efficient Constituency Tree based Encoding for Natural Language to Bash Translation

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

A Dataset for N-ary Relation Extraction of Drug Combinations

FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations

Global Entity Disambiguation with BERT

Clues Before Answers: Generation-Enhanced Multiple-Choice QA

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

LUNA: Learning Slot-Turn Alignment for Dialogue State Tracking

Crossroads, Buildings and Neighborhoods: A Dataset for Fine-grained Location Recognition

CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking

Multimodal Dialogue State Tracking

On the Use of Bert for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation

Recognition of They/Them as Singular Personal Pronouns in Coreference Resolution

Transparent Human Evaluation for Image Captioning

DocAMR: Multi-Sentence AMR Representation and Evaluation

ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models

Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

Exposing the Limits of Video-Text Models through Contrast Sets

Zero-shot Sonnet Generation with Discourse-level Planning and Aesthetics Features

When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer

How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns

Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts

Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog

ChapterBreak: A Challenge Dataset for Long-Range Language Models

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction

Quantifying Language Variation Acoustically with Few Resources

Adaptable Adapters

One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation

Can Rationalization Improve Robustness?

On the Effectiveness of Sentence Encoding for Intent Detection Meta-Learning

A Computational Acquisition Model for Multimodal Word Categorization

Residue-Based Natural Language Adversarial Attack Detection

EASE: Entity-Aware Contrastive Learning of Sentence Embedding

Is Neural Topic Modelling Better than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics

TRUE: Re-evaluating Factual Consistency Evaluation

Knowledge Inheritance for Pre-trained Language Models

Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation

On Transferability of Prompt Tuning for Natural Language Processing

DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction

Towards Debiasing Translation Artifacts

WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models

Generative Biomedical Entity Linking via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning

Robust Self-Augmentation for Named Entity Recognition with Meta Reweighting

Optimising Equal Opportunity Fairness in Model Training

Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval

Early Rumor Detection Using Neural Hawkes Process with a New Benchmark Dataset

KCD: Knowledge Walks and Textual Cues Enhanced Political Perspective Detection in News Media

Collective Relevance Labeling for Passage Retrieval

COGMEN: COntextualized GNN based Multimodal Emotion recognitioN

Revisit Overconfidence for OOD Detection: Reassigned Contrastive Learning with Adaptive Class-dependent Threshold

Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers

DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings

A Data Cartography based MixUp for Pre-trained Language Models

TVShowGuess: Character Comprehension in Stories as Speaker Guessing

Causal Distillation for Language Models

FNet: Mixing Tokens with Fourier Transforms

Answer Consolidation: Formulation and Benchmarking

FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation

Improving Compositional Generalization with Latent Structure and Data Augmentation

Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling

Imagination-Augmented Natural Language Understanding

Compositional Task-Oriented Parsing as Abstractive Question Answering

Learning Cross-Lingual IR from an English Retriever

Testing the Ability of Language Models to Interpret Figurative Language

Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity

Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer

Disentangling Categorization in Multi-agent Emergent Communication

GenIE: Generative Information Extraction

Entity Linking via Explicit Mention-Mention Coreference Modeling

Massive-scale Decoding for Text Generation using Lattices

Disentangling Indirect Answers to Yes-No Questions in Real Conversations

Quantifying Adaptability in Pre-trained Language Models with 500 Tasks

A Study of the Attention Abnormality in Trojaned BERTs

EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification

Partial-input baselines show that NLI models can ignore context, but they don’t.

Learning as Conversation: Dialogue Systems Reinforced for Information Acquisition

Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs

What Factors Should Paper-Reviewer Assignments Rely On? Community Perspectives on Issues and Ideals in Conference Peer-Review

Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization

Exact Paired-Permutation Testing for Structured Test Statistics

A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping the Linguistic Blood Bank

SSEGCN: Syntactic and Semantic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis

DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks

SkillSpan: Hard and Soft Skill Extraction from English Job Postings

A Double-Graph Based Framework for Frame Semantic Parsing

An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling

A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction

Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning

JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering

Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens

A Corpus for Understanding and Generating Moral Stories

Modeling Multi-Granularity Hierarchical Features for Relation Extraction

Cross-modal Contrastive Learning for Speech Translation

KALA: Knowledge-Augmented Language Model Adaptation

Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences

Semantically Informed Slang Interpretation

TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding

Syn2Vec: Synset Colexification Graphs for Lexical Semantic Similarity

On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?

Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

Visual Commonsense in Pretrained Unimodal and Multimodal Models

QuALITY: Question Answering with Long Input Texts, Yes!

ExSum: From Local Explanations to Model Understanding

Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

CORWA: A Citation-Oriented Related Work Annotation Dataset

Extreme Zero-Shot Learning for Extreme Text Classification

ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence

Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification

Embedding Hallucination for Few-shot Language Fine-tuning

Cryptocurrency Bubble Detection: A New Stock Market Dataset, Financial Task & Hyperbolic Models

Nearest Neighbor Knowledge Distillation for Neural Machine Translation

DEMix Layers: Disentangling Domains for Modular Language Modeling

Contrastive Learning for Prompt-based Few-shot Language Learners

Identifying Implicitly Abusive Remarks about Identity Groups using a Linguistically Informed Approach

Label Definitions Improve Semantic Role Labeling

Consistency Training with Virtual Adversarial Discrete Perturbation

CoMPM: Context Modeling with Speaker’s Pre-trained Memory Tracking for Emotion Recognition in Conversation

DialSummEval: Revisiting Summarization Evaluation for Dialogues

Hyperbolic Relevance Matching for Neural Keyphrase Extraction

Template-free Prompt Tuning for Few-shot NER

Few-Shot Document-Level Relation Extraction

LaMemo: Language Modeling with Look-Ahead Memory

Exploiting Inductive Bias in Transformers for Unsupervised Disentanglement of Syntax and Semantics with VAEs

Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints

A Holistic Framework for Analyzing the COVID-19 Vaccine Debate

Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training

You Don’t Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers’ Private Personas

Hate Speech and Counter Speech Detection: Conversational Context Does Matter

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction

Diagnosing Vision-and-Language Navigation: What Really Matters

MOVER: Mask, Over-generate and Rank for Hyperbole Generation

Embarrassingly Simple Performance Prediction for Abductive Natural Language Inference

Impact of Training Instance Selection on Domain-Specific Entity Extraction using BERT

MM-GATBT: Enriching Multimodal Representation Using Graph Attention Network

ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation

Compositional Generalization in Grounded Language Learning via Induced Model Sparsity

Strong Heuristics for Named Entity Linking

textless-lib: a Library for Textless Spoken Language Processing

TurkishDelightNLP: A Neural Turkish NLP Toolkit

ZS4IE: A toolkit for Zero-Shot Information Extraction with simple Verbalizations

RESIN-11: Schema-guided Event Prediction for 11 Newsworthy Scenarios

SETSum: Summarization and Visualization of Student Evaluations of Teaching

PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

DadmaTools: Natural Language Processing Toolkit for Persian Language

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

Self-supervised Representation Learning for Speech Processing

Constraining word alignments with posterior regularization for label transfer

Medical Coding with Biomedical Transformer Ensembles and Zero/Few-shot Learning

ReFinED: An Efficient Zero-shot-capable Approach to End-to-End Entity Linking

Knowledge Extraction From Texts Based on Wikidata

AIT-QA: Question Answering Dataset over Complex Tables in the Airline Industry

Fast and Light-Weight Answer Text Retrieval in Dialogue Systems