YiQi0318/LLMs_daily_arxiv

Updated on 2024.07.25

Table of Contents
  1. LLM - Explainable
  2. LLM - Interpretable
  3. LLM - Reasoning
  4. LLM - Uncertainty
  5. LLM - Perplexity

LLM - Explainable

Publish Date Title Authors PDF Code
2024-07-24 ViPer: Visual Personalization of Generative Models via Individual Preference Learning Sogand Salehi et.al. 2407.17365v1 null
2024-07-24 Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism Anhao Zhao et.al. 2407.17011v1 null
2024-07-24 MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues Liyun Zhang et.al. 2407.16552v2 null
2024-07-22 AI for Handball: predicting and explaining the 2024 Olympic Games tournament with Deep Learning and Large Language Models Florian Felice et.al. 2407.15987v1 null
2024-07-22 Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability Zhuoyan Xu et.al. 2407.15720v1 link
2024-07-22 Dissecting Multiplication in Transformers: Insights into LLMs Luyu Qiu et.al. 2407.15360v1 null
2024-07-21 Explaining Decisions of Agents in Mixed-Motive Games Maayan Orner et.al. 2407.15255v1 null
2024-07-21 XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models Erik Cambria et.al. 2407.15248v1 null
2024-07-20 Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models Ze Yu Zhang et.al. 2407.14845v1 null
2024-07-21 Trading Devil Final: Backdoor attack via Stock market and Bayesian Optimization Orson Mengara et.al. 2407.14573v1 null
2024-07-19 Evaluating the Reliability of Self-Explanations in Large Language Models Korbinian Randl et.al. 2407.14487v1 link
2024-07-19 Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier Zachary Wojtowicz et.al. 2407.14452v1 null
2024-07-18 The Software Complexity of Nations Sándor Juhász et.al. 2407.13880v1 null
2024-07-24 The Honorific Effect: Exploring the Impact of Japanese Linguistic Formalities on AI-Generated Physics Explanations Keisuke Sato et.al. 2407.13787v2 null
2024-07-18 COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization Skyler Grandel et.al. 2407.13648v1 null
2024-07-18 SOMONITOR: Explainable Marketing Data Processing and Analysis with Large Language Models Qi Yang et.al. 2407.13117v1 null
2024-07-17 Explainable Biomedical Hypothesis Generation via Retrieval Augmented Generation enabled Large Language Models Alexander R. Pelletier et.al. 2407.12888v1 null
2024-07-16 InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification Yujia Hu et.al. 2407.12882v1 link
2024-07-03 Truth is Universal: Robust Detection of Lies in LLMs Lennart Bürger et.al. 2407.12831v1 null
2024-07-16 InvAgent: A Large Language Model based Multi-Agent System for Inventory Management in Supply Chains Yinzhu Quan et.al. 2407.11384v1 link
2024-06-03 The Life Cycle of Large Language Models: A Review of Biases in Education Jinsook Lee et.al. 2407.11203v1 null
2024-06-25 RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems Robert Friel et.al. 2407.11005v1 null
2024-06-24 Visualization Literacy of Multimodal Large Language Models: A Comparative Study Zhimin Li et.al. 2407.10996v1 null
2024-06-23 Do Large Language Models Understand Verbal Indicators of Romantic Attraction? Sandra C. Matz et.al. 2407.10989v1 null
2024-07-15 GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework Hannah Sansford et.al. 2407.10793v1 null
2024-07-16 Transforming Agency. On the mode of existence of Large Language Models Xabier E. Barandiaran et.al. 2407.10735v2 null
2024-07-15 Learning Dynamics of LLM Finetuning Yi Ren et.al. 2407.10490v1 link
2024-07-19 Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine Omid Rohanian et.al. 2407.10086v2 null
2024-07-13 Building pre-train LLM Dataset for the INDIC Languages: a case study on Hindi Shantipriya Parida et.al. 2407.09855v1 null
2024-07-17 Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models Dong Shu et.al. 2407.09292v2 null
2024-07-12 DAHRS: Divergence-Aware Hallucination-Remediated SRL Projection Sangpil Youm et.al. 2407.09283v1 null
2024-07-11 Fault Diagnosis in Power Grids with Large Language Model Liu Jing et.al. 2407.08836v1 null
2024-07-11 Towards Explainable Evolution Strategies with Large Language Models Jill Baumann et.al. 2407.08331v1 null
2024-07-10 Training on the Test Task Confounds Evaluation and Emergence Ricardo Dominguez-Olmedo et.al. 2407.07890v1 link
2024-07-10 A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models : Safety, Consensus, Objectivity, Reproducibility and Explainability Ting Fang Tan et.al. 2407.07666v1 null
2024-07-08 SimPal: Towards a Meta-Conversational Framework to Understand Teacher's Instructional Goals for K-12 Physics Effat Farhana et.al. 2407.06241v1 null
2024-07-07 Experiments with truth using Machine Learning: Spectral analysis and explainable classification of synthetic, false, and genuine information Vishnu S. Pendyala et.al. 2407.05464v1 null
2024-07-07 Exploring the Educational Landscape of AI: Large Language Models' Approaches to Explaining Conservation of Momentum in Physics Keisuke Sato et.al. 2407.05308v1 null
2024-07-04 From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Stefanie Krause et.al. 2407.03778v1 null
2024-07-04 Improving Self Consistency in LLMs through Probabilistic Tokenization Ashutosh Sathe et.al. 2407.03678v1 null
2024-07-04 The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model Brenden Smith et.al. 2407.03621v1 link
2024-07-03 LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation Hongke Zhao et.al. 2407.02833v1 null
2024-07-01 Engineering Conversational Search Systems: A Review of Applications, Architectures, and Functional Components Phillip Schneider et.al. 2407.00997v1 null
2024-07-08 LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation Longchao Da et.al. 2407.00994v2 null
2024-07-03 HRDE: Retrieval-Augmented Large Language Models for Chinese Health Rumor Detection and Explainability Yanfang Chen et.al. 2407.00668v2 link
2024-06-29 MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation Jinsheng Huang et.al. 2407.00468v1 link
2024-06-28 Evaluating Human Alignment and Model Faithfulness of LLM Rationale Mohsen Fayyaz et.al. 2407.00219v1 null
2024-06-28 Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach Sai Krishna Revanth Vuruma et.al. 2407.00167v1 null
2024-06-28 Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring Jiazheng Li et.al. 2406.19949v1 null
2024-06-27 xTower: A Multilingual LLM for Explaining and Correcting Translation Errors Marcos Treviso et.al. 2406.19482v1 null
2024-06-26 "Is ChatGPT a Better Explainer than My Professor?": Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline Grace Li et.al. 2406.18512v1 null
2024-06-26 Mental Modeling of Reinforcement Learning Agents by Language Models Wenhao Lu et.al. 2406.18505v1 null
2024-06-26 Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming Zhenghao Zhou et.al. 2406.18501v1 null
2024-06-25 From Distributional to Overton Pluralism: Investigating Large Language Model Alignment Thom Lake et.al. 2406.17692v1 link
2024-06-25 Banishing LLM Hallucinations Requires Rethinking Generalization Johnny Li et.al. 2406.17642v1 null
2024-06-23 Unveiling LLM Mechanisms Through Neural ODEs and Control Theory Yukun Zhang et.al. 2406.16985v1 null
2024-06-24 Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track Ronak Pradeep et.al. 2406.16828v1 link
2024-06-24 Large Language Models Are Cross-Lingual Knowledge-Free Reasoners Peng Hu et.al. 2406.16655v1 link
2024-06-24 UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models Zhanyue Qin et.al. 2406.16382v1 null
2024-06-23 Preference Tuning For Toxicity Mitigation Generalizes Across Languages Xiaochen Li et.al. 2406.16235v1 link
2024-06-23 Effectiveness of ChatGPT in explaining complex medical reports to patients Mengxuan Sun et.al. 2406.15963v1 null
2024-06-30 LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning Guangsi Shi et.al. 2406.15859v2 null
2024-06-21 Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network Badr AlKhamissi et.al. 2406.15109v1 link
2024-06-21 Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction Jinge Wu et.al. 2406.15045v1 null
2024-06-20 Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task? Zhiqiang Pi et.al. 2406.14737v1 null
2024-06-20 Self-supervised Interpretable Concept-based Models for Text Classification Francesco De Santis et.al. 2406.14335v1 null
2024-06-20 Definition generation for lexical semantic change detection Mariia Fedorova et.al. 2406.14167v1 link
2024-06-22 Enhancing Travel Choice Modeling with Large Language Models: A Prompt-Learning Approach Xuehao Zhai et.al. 2406.13558v2 null
2024-06-16 Current state of LLM Risks and AI Guardrails Suriya Ganesh Ayyamperumal et.al. 2406.12934v1 null
2024-06-19 Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models Hengyi Wang et.al. 2406.12649v2 null
2024-06-18 An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs Daking Rai et.al. 2406.12288v1 link
2024-06-18 Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization Kwangwook Seo et.al. 2406.12269v1 null
2024-06-18 A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning Lijie Hu et.al. 2406.12255v1 null
2024-06-29 Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM Huaxin Zhang et.al. 2406.12235v2 link
2024-06-28 WellDunn: On the Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions Seyedali Mohammadi et.al. 2406.12058v3 null
2024-05-31 Generative AI Voting: Fair Collective Choice is Resilient to LLM Biases and Inconsistencies Srijoni Majumdar et.al. 2406.11871v1 null
2024-06-17 CELL your Model: Contrastive Explanation Methods for Large Language Models Ronny Luss et.al. 2406.11785v1 null
2024-06-17 GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations Rick Wilming et.al. 2406.11547v1 link
2024-06-17 A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences Leonardo Bertolazzi et.al. 2406.11341v1 null
2024-06-17 TIFG: Text-Informed Feature Generation with Large Language Models Xinhao Zhang et.al. 2406.11177v1 null
2024-06-16 LLMFactor: Extracting Profitable Factors through Prompts for Explainable Stock Movement Prediction Meiyun Wang et.al. 2406.10811v1 null
2024-06-15 A Comprehensive Survey of Foundation Models in Medicine Wasif Khan et.al. 2406.10729v1 null
2024-06-15 Multilingual Large Language Models and Curse of Multilinguality Daniil Gurgurov et.al. 2406.10602v1 null
2024-06-14 Towards Effectively Detecting and Explaining Vulnerabilities Using Large Language Models Qiheng Mao et.al. 2406.09701v1 null
2024-06-13 Automated Molecular Concept Generation and Labeling with Large Language Models Shichang Zhang et.al. 2406.09612v1 null
2024-06-12 LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions Nhat Hoang-Xuan et.al. 2406.08572v1 null
2024-06-13 CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems Qianli Wang et.al. 2406.08101v2 link
2024-06-12 A Concept-Based Explainability Framework for Large Multimodal Models Jayneel Parekh et.al. 2406.08074v1 null
2024-06-13 LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing Hongxiang Zhang et.al. 2406.07714v2 null
2024-06-15 What's in an embedding? Would a rose by any embedding smell as sweet? Venkat Venkatasubramanian et.al. 2406.06870v3 null
2024-06-10 Evaluating Zero-Shot Long-Context LLM Compression Chenyu Wang et.al. 2406.06773v1 null
2024-06-09 Exploring the Efficacy of Large Language Models (GPT-4) in Binary Reverse Engineering Saman Pordanesh et.al. 2406.06637v1 null
2024-06-06 Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models Walid S. Saba et.al. 2406.06610v1 null
2024-06-06 Are Large Language Models the New Interface for Data Pipelines? Sylvio Barbon Junior et.al. 2406.06596v1 null
2024-06-13 From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models Xiaofeng Zhang et.al. 2406.06579v2 null
2024-06-10 Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course Aadarsh Padiyath et.al. 2406.06451v1 null
2024-07-05 Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue Simone Alghisi et.al. 2406.06399v2 null
2024-07-03 MedExQA: Medical Question Answering Benchmark with Multiple Explanations Yunsoo Kim et.al. 2406.06331v2 link
2024-06-10 Safety Alignment Should Be Made More Than Just a Few Tokens Deep Xiangyu Qi et.al. 2406.05946v1 link
2024-06-13 How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States Zhenhong Zhou et.al. 2406.05644v2 link
2024-06-08 Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification Yunhe Gao et.al. 2406.05596v1 null
2024-06-07 Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models Michał Romaszewski et.al. 2406.04926v1 null
2024-06-07 Think out Loud: Emotion Deducing Explanation in Dialogues Jiangnan Li et.al. 2406.04758v1 null
2024-06-07 Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions Jingtan Wang et.al. 2406.04606v1 link
2024-06-08 What Do Language Models Learn in Context? The Structured Task Hypothesis Jiaoda Li et.al. 2406.04216v2 link
2024-06-06 Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective Xinhao Yao et.al. 2406.03768v1 link
2024-06-04 Dynamic and Adaptive Feature Generation with LLM Xinhao Zhang et.al. 2406.03505v1 null
2024-06-05 AD-H: Autonomous Driving with Hierarchical Agents Zaibin Zhang et.al. 2406.03474v1 null
2024-06-06 Large Language Models as Evaluators for Recommendation Explanations Xiaoyu Zhang et.al. 2406.03248v2 link
2024-06-05 Missci: Reconstructing Fallacies in Misrepresented Science Max Glockner et.al. 2406.03181v1 link
2024-06-04 XRec: Large Language Models for Explainable Recommendation Qiyao Ma et.al. 2406.02377v1 link
2024-06-04 I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering Valeriya Goloviznina et.al. 2406.02060v1 null
2024-06-20 What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores Ebrahim Feghhi et.al. 2406.01538v2 link
2024-06-04 Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study Martin J. Hetz et.al. 2406.01428v2 null
2024-06-03 TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine Wenjing Yue et.al. 2406.01126v1 null
2024-06-03 Unveil the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution Shicheng Xu et.al. 2406.00944v1 null
2024-06-01 Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners Zhi Zheng et.al. 2406.00430v1 null
2024-05-31 How In-Context Learning Emerges from Training on Unstructured Data: On the Role of Co-Occurrence, Positional Information, and Noise Structures Kevin Christian Wibisono et.al. 2406.00131v1 link
2024-05-27 How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors? Subhankar Maity et.al. 2406.00039v1 null
2024-05-24 Large Language Model Pruning Hanjuan Huang et.al. 2406.00030v1 null
2024-06-05 SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales Tianyang Xu et.al. 2405.20974v2 link
2024-06-03 Large Language Models are Zero-Shot Next Location Predictors Ciro Beneduce et.al. 2405.20962v2 link
2024-05-31 FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores Alyssa Huang et.al. 2405.20613v1 link
2024-05-30 XPrompt:Explaining Large Language Model's Generation via Joint Prompt Attribution Yurui Chang et.al. 2405.20404v1 null
2024-05-29 Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models Venkat Venkatasubramanian et.al. 2405.19561v1 null
2024-05-29 Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners Jiachun Li et.al. 2405.18915v1 null
2024-06-11 Faithful Logical Reasoning via Symbolic Chain-of-Thought Jundong Xu et.al. 2405.18357v2 link
2024-05-28 Active Use of Latent Constituency Representation in both Humans and Large Language Models Wei Liu et.al. 2405.18241v1 link
2024-05-28 Exploring Activation Patterns of Parameters in Language Models Yudong Wang et.al. 2405.17799v1 null
2024-05-28 Facilitating Holistic Evaluations with LLMs: Insights from Scenario-Based Experiments Toru Ishida et.al. 2405.17728v1 null
2024-05-27 PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends Apurva Sinha et.al. 2405.17533v1 null
2024-07-02 TEII: Think, Explain, Interact and Iterate with Large Language Models to Solve Cross-lingual Emotion Detection Long Cheng et.al. 2405.17129v2 link
2024-05-27 The Uncanny Valley: Exploring Adversarial Robustness from a Flatness Perspective Nils Philipp Walter et.al. 2405.16918v1 null
2024-05-25 Large Language Models Enable Automated Formative Feedback in Human-Robot Interaction Tasks Emily Jensen et.al. 2405.16344v1 null
2024-06-20 Finetuning Large Language Model for Personalized Ranking Zhuoxi Bai et.al. 2405.16127v2 link
2024-05-24 Transformers represent belief state geometry in their residual stream Adam S. Shai et.al. 2405.15943v1 null
2024-05-24 Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment Hao Sun et.al. 2405.15624v1 null
2024-07-03 ChatGPT Code Detection: Techniques for Uncovering the Source of Code Marc Oedingen et.al. 2405.15512v2 link
2024-05-24 From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks Jacob Russin et.al. 2405.15164v1 null
2024-05-28 Explaining Multi-modal Large Language Models by Analyzing their Vision Perception Loris Giulivi et.al. 2405.14612v2 link
2024-05-23 Large Language Models for Explainable Decisions in Dynamic Digital Twins Nan Zhang et.al. 2405.14411v1 link
2024-05-26 Explainable Few-shot Knowledge Tracing Haoxuan Li et.al. 2405.14391v2 link
2024-05-23 Knowledge Localization: Mission Not Accomplished? Enter Query Localization! Yuheng Chen et.al. 2405.14117v1 null
2024-05-22 Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation Cyril Chhun et.al. 2405.13769v1 link
2024-05-22 Mining Action Rules for Defect Reduction Planning Khouloud Oueslati et.al. 2405.13740v1 null
2024-05-22 Navigating User Experience of ChatGPT-based Conversational Recommender Systems: The Effects of Prompt Guidance and Recommendation Domain Yizhe Zhang et.al. 2405.13560v1 null
2024-05-22 HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model Mustafa Yildirim et.al. 2405.13547v1 null
2024-05-21 Investigating Symbolic Capabilities of Large Language Models Neisarg Dave et.al. 2405.13209v1 null
2024-05-11 RAGE Against the Machine: Retrieval-Augmented LLM Explanations Joel Rorseth et.al. 2405.13000v1 null
2024-05-20 Directed Metric Structures arising in Large Language Models Stéphane Gaubert et.al. 2405.12264v1 null
2024-05-19 Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications Subhankar Maity et.al. 2405.11579v1 null
2024-05-17 SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks Michael Shliselberg et.al. 2405.10700v1 null
2024-05-15 LoRA Learns Less and Forgets Less Dan Biderman et.al. 2405.09673v1 null
2024-05-15 Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models Majid Zarharan et.al. 2405.09454v1 link
2024-05-14 Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure Odysseas S. Chlapanis et.al. 2405.08502v1 link
2024-05-14 Challenges and Opportunities in Text Generation Explainability Kenza Amara et.al. 2405.08468v1 null
2024-05-14 Understanding the performance gap between online and offline alignment algorithms Yunhao Tang et.al. 2405.08448v1 null
2024-05-12 ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis Mohammad Amaz Uddin et.al. 2405.08026v1 null
2024-05-13 Can Language Models Explain Their Own Classification Behavior? Dane Sherburn et.al. 2405.07436v1 link
2024-05-10 LLM-Generated Black-box Explanations Can Be Adversarially Helpful Rohan Ajwani et.al. 2405.06800v1 null
2024-05-15 Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling Subhendu Khatuya et.al. 2405.06671v2 link
2024-06-03 XAI4LLM. Let Machine Learning Models and LLMs Collaborate for Enhanced In-Context Learning in Healthcare Fatemeh Nazary et.al. 2405.06270v3 null
2024-05-09 Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding? Yutong Hu et.al. 2405.06105v1 null
2024-05-09 LLMs for XAI: Future Directions for Explaining Explanations Alexandra Zytek et.al. 2405.06064v1 null
2024-05-09 Investigating Interaction Modes and User Agency in Human-LLM Collaboration for Domain-Specific Data Analysis Jiajing Guo et.al. 2405.05548v1 null
2024-05-08 The Effect of Model Size on LLM Post-hoc Explainability via LIME Henning Heyen et.al. 2405.05348v1 link
2024-05-09 LLMs with Personalities in Multi-issue Negotiation Games Sean Noh et.al. 2405.05248v2 null
2024-05-08 Zero-shot LLM-guided Counterfactual Generation for Text Amrita Bhattacharjee et.al. 2405.04793v1 null
2024-05-09 Large Language Models for Cyber Security: A Systematic Literature Review HanXiang Xu et.al. 2405.04760v2 null
2024-05-07 Large Language Models Cannot Explain Themselves Advait Sarkar et.al. 2405.04382v1 null
2024-05-07 Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation Atharvan Dogra et.al. 2405.04325v1 null
2024-05-07 Granite Code Models: A Family of Open Foundation Models for Code Intelligence Mayank Mishra et.al. 2405.04324v1 link
2024-05-07 Semantic API Alignment: Linking High-level User Goals to APIs Robert Feldt et.al. 2405.04236v1 null
2024-05-07 NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions Elliot Gestrin et.al. 2405.04215v1 null
2024-05-07 A Causal Explainable Guardrails for Large Language Models Zhixuan Chu et.al. 2405.04160v1 null
2024-05-06 FOKE: A Personalized and Explainable Education Framework Integrating Foundation Models, Knowledge Graphs, and Prompt Engineering Silan Hu et.al. 2405.03734v1 null
2024-05-06 Explainable Fake News Detection With Large Language Model via Defense Among Competing Wisdom Bo Wang et.al. 2405.03371v1 link
2024-05-03 What does the Knowledge Neuron Thesis Have to do with Knowledge? Jingcheng Niu et.al. 2405.02421v1 link
2024-05-07 A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model Jiexia Ye et.al. 2405.02358v2 link
2024-05-03 Argumentative Large Language Models for Explainable and Contestable Decision-Making Gabriel Freedman et.al. 2405.02079v1 null
2024-05-03 Which Identities Are Mobilized: Towards an automated detection of social group appeals in political texts Felicia Riethmüller et.al. 2405.01904v1 null
2024-05-02 CoS: Enhancing Personalization and Mitigating Bias with Context Steering Jerry Zhi-Yang He et.al. 2405.01768v1 null
2024-05-08 Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving Xin Quan et.al. 2405.01379v2 null
2024-04-26 LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study Van Bach Nguyen et.al. 2405.00722v1 null
2024-05-01 RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models Mohamed Manzour Hussien et.al. 2405.00449v1 null
2024-05-01 Social Life Simulation for Non-Cognitive Skills Learning Zihan Yan et.al. 2405.00273v1 null
2024-04-30 A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications Steph Buongiorno et.al. 2404.19729v1 null
2024-04-30 On Training a Neural Network to Explain Binaries Alexander Interrante-Grant et.al. 2404.19631v1 null
2024-04-29 Large Language Models as Conversational Movie Recommenders: A User Study Ruixuan Sun et.al. 2404.19093v1 null
2024-04-30 Evaluating Concept-based Explanations of Language Models: A Study on Faithfulness and Readability Meng Li et.al. 2404.18533v2 link
2024-04-30 Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages David Ifeoluwa Adelani et.al. 2404.18286v2 null
2024-04-27 Advancing Healthcare Automation: Multi-Agent Systems for Medical Necessity Justification Himanshu Pandey et.al. 2404.17977v1 null
2024-04-11 Rumour Evaluation with Very Large Language Models Dahlia Shehata et.al. 2404.16859v1 link
2024-04-25 TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning Liang Zhang et.al. 2404.16635v1 link
2024-04-04 Elicitron: An LLM Agent-Based Simulation Framework for Design Requirements Elicitation Mohammadmehdi Ataei et.al. 2404.16045v1 null
2024-04-24 Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach Linyu Liu et.al. 2404.15993v1 null
2024-04-25 Detecting Conceptual Abstraction in LLMs Michaela Regneri et.al. 2404.15848v2 null
2024-04-22 Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication John R. Lawson et.al. 2404.15166v1 null
2024-06-04 Graph Machine Learning in the Era of Large Language Models (LLMs) Wenqi Fan et.al. 2404.14928v2 null
2024-05-10 Explaining Arguments' Strength: Unveiling the Role of Attacks and Supports (Technical Report) Xiang Yin et.al. 2404.14304v2 link
2024-04-22 Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach Yao Wan et.al. 2404.14296v1 link
2024-04-22 EventLens: Leveraging Event-Aware Pretraining and Cross-modal Linking Enhances Visual Commonsense Reasoning Mingjie Ma et.al. 2404.13847v1 null
2024-04-29 Large Language Models for Networking: Workflow, Advances and Challenges Chang Liu et.al. 2404.12901v2 null
2024-04-18 MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale Xiaotang Gai et.al. 2404.12372v1 null
2024-04-18 Concept Induction using LLMs: a user experiment for assessment Adrita Barua et.al. 2404.11875v1 null
2024-05-01 Course Recommender Systems Need to Consider the Job Market Jibril Frej et.al. 2404.10876v2 link
2024-06-03 Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model Hengyuan Zhang et.al. 2404.10306v4 link
2024-04-11 Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs Jierui Li et.al. 2404.08148v1 null
2024-05-29 Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts Namasivayam Kalithasan et.al. 2404.07774v2 null
2024-04-11 Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models Marvin Pafla et.al. 2404.07725v1 null
2024-04-07 Explaining EDA synthesis errors with LLMs Siyu Qiu et.al. 2404.07235v1 null
2024-04-11 From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications Yongqiang Ma et.al. 2404.07108v2 null
2024-05-15 A Mathematical Theory for Learning Semantic Languages by Abstract Learners Kuo-Yu Liao et.al. 2404.07009v3 null
2024-04-10 WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers Yuexi Chen et.al. 2404.07005v1 null
2024-04-09 CausalBench: A Comprehensive Benchmark for Causal Learning Capability of Large Language Models Yu Zhou et.al. 2404.06349v1 null
2024-04-07 X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model Jan Held et.al. 2404.06332v1 null
2024-04-07 StockGPT: A GenAI Model for Stock Prediction and Trading Dat Mai et.al. 2404.05101v1 null
2024-04-07 Data Bias According to Bipol: Men are Naturally Right and It is the Role of Women to Follow Their Lead Irene Pagliai et.al. 2404.04838v1 link
2024-04-06 Binary Classifier Optimization for Large Language Model Alignment Seungjae Jung et.al. 2404.04656v1 null
2024-04-04 Language Model Evolution: An Iterated Learning Perspective Yi Ren et.al. 2404.04286v1 link
2024-04-04 Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph Marco Bronzini et.al. 2404.03623v1 null
2024-04-04 Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models Yantao Liu et.al. 2404.03577v1 link
2024-04-04 Edisum: Summarizing and Explaining Wikipedia Edits at Scale Marija Šakota et.al. 2404.03428v1 link
2024-04-04 Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics Fangru Lin et.al. 2404.03301v1 link
2024-04-04 DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models Yuchen Liu et.al. 2404.03275v1 null
2024-04-03 LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models Gabriela Ben Melech Stan et.al. 2404.03118v1 null
2024-04-10 An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models Emmy Liu et.al. 2404.03028v2 null
2024-04-13 Explainable Traffic Flow Prediction with Large Language Models Xusen Guo et.al. 2404.02937v3 null
2024-04-03 Towards detecting unanticipated bias in Large Language Models Anna Kruspe et.al. 2404.02650v1 null
2024-04-03 Task Agnostic Architecture for Algorithm Induction via Implicit Composition Sahil J. Sindhi et.al. 2404.02450v1 null
2024-04-01 Enhancing Reasoning Capacity of SLM using Cognitive Enhancement Jonathan Pan et.al. 2404.01135v1 null
2024-04-01 Query Performance Prediction using Relevance Judgments Generated by Large Language Models Chuan Meng et.al. 2404.01012v1 link
2024-04-12 Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing Zhenyu Qian et.al. 2404.00589v2 link
2024-03-28 "I'm categorizing LLM as a productivity tool": Examining ethics of LLM use in HCI research practices Shivani Kapania et.al. 2403.19876v1 null
2024-03-27 Measuring Political Bias in Large Language Models: What Is Said and How It Is Said Yejin Bang et.al. 2403.18932v1 null
2024-03-26 Targeted Visualization of the Backbone of Encoder LLMs Isaac Roberts et.al. 2403.18872v1 link
2024-03-27 A Path Towards Legal Autonomy: An interoperable and explainable approach to extracting, transforming, loading and computing legal information using large language models, expert systems and Bayesian networks Axel Constant et.al. 2403.18537v1 null
2024-03-27 LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models Mingxing Peng et.al. 2403.18344v1 null
2024-03-27 Exploring the Privacy Protection Capabilities of Chinese Large Language Models Yuqi Yang et.al. 2403.18205v1 null
2024-03-26 Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach Andrea Ferrario et.al. 2403.17873v1 null
2024-03-26 Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons Shijia Zhou et.al. 2403.17760v1 link
2024-03-25 A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection Benjamin Steenhoek et.al. 2403.17218v1 null
2024-03-25 Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making Shuai Ma et.al. 2403.16812v1 null
2024-03-26 RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict Yirong Zeng et.al. 2403.16662v2 link
2024-03-25 ChatDBG: An AI-Powered Debugging Assistant Kyla Levin et.al. 2403.16354v1 link
2024-03-26 Towards a RAG-based Summarization Agent for the Electron-Ion Collider Karthik Suresh et.al. 2403.15729v2 null
2024-03-22 Large language models for crowd decision making based on prompt design strategies using ChatGPT: models, analysis and challenges Cristina Zuheros et.al. 2403.15587v1 null
2024-04-02 Assessing the Utility of Large Language Models for Phenotype-Driven Gene Prioritization in Rare Genetic Disorder Diagnosis Junyoung Kim et.al. 2403.14801v2 null
2024-03-21 A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science Clayton Cohn et.al. 2403.14565v1 null
2024-04-08 MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation Longzheng Wang et.al. 2403.14171v3 link
2024-03-21 From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation Haofei Zhao et.al. 2403.14118v1 null
2024-03-21 PE-GPT: A Physics-Informed Interactive Large Language Model for Power Converter Modulation Design Fanfan Lin et.al. 2403.14059v1 null
2024-03-12 Duwak: Dual Watermarks in Large Language Models Chaoyi Zhu et.al. 2403.13000v1 null
2024-03-19 INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations Lirui Luo et.al. 2403.12451v1 null
2024-05-08 Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales Ayushi Nirmal et.al. 2403.12403v2 link
2024-05-09 From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality? Guangming Huang et.al. 2403.11894v3 null
2024-03-18 DEE: Dual-stage Explainable Evaluation Method for Text Generation Shenyu Zhang et.al. 2403.11509v1 null
2024-04-30 Correcting misinformation on social media with a large language model Xinyi Zhou et.al. 2403.11169v3 link
2024-03-17 Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering Baiyan Zhang et.al. 2403.11129v1 null
2024-03-26 SelfIE: Self-Interpretation of Large Language Model Embeddings Haozhe Chen et.al. 2403.10949v2 link
2024-03-16 Depression Detection on Social Media with Large Language Models Xiaochong Lan et.al. 2403.10750v1 null
2024-03-15 Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization Ratnadira Widyasari et.al. 2403.10507v1 null
2024-03-22 Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? Bruno de Melo et.al. 2403.10482v2 null
2024-03-15 A Question on the Explainability of Large Language Models and the Word-Level Univariate First-Order Plausibility Assumption Jeremie Bogaert et.al. 2403.10275v1 null
2024-03-15 Language to Map: Topological map generation from natural language path instructions Hideki Deguchi et.al. 2403.10008v1 null
2024-03-14 Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey Xiaoyu Liu et.al. 2403.09606v1 null
2024-04-23 Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models Laura Fernández-Becerra et.al. 2403.09567v2 null
2024-03-14 XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization Yequan Bie et.al. 2403.09410v1 null
2024-03-14 Meaningful Learning: Advancing Abstract Reasoning in Large Language Models via Generic Fact Guidance Kai Xiong et.al. 2403.09085v1 null
2024-03-13 Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era Xuansheng Wu et.al. 2403.08946v1 link
2024-03-13 TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation Dingbang Li et.al. 2403.08833v1 null
2024-03-13 Can Large Language Models Identify Authorship? Baixiang Huang et.al. 2403.08213v1 link
2024-03-12 generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation Thilo Spinner et.al. 2403.07627v1 null
2024-03-12 Robustness, Security, Privacy, Explainability, Efficiency, and Usability of Large Language Models for Code Zhou Yang et.al. 2403.07506v1 null
2024-03-11 Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena Leonie Weissweiler et.al. 2403.06965v1 null
2024-03-11 RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems Jianxun Lian et.al. 2403.06465v1 link
2024-03-10 ArgMed-Agents: Explainable Clinical Decision Reasoning with Large Language Models via Argumentation Schemes Shengxin Hong et.al. 2403.06294v1 null
2024-03-10 Low-dose CT Denoising with Language-engaged Dual-space Alignment Zhihao Chen et.al. 2403.06128v1 link
2024-03-10 Explaining Code with a Purpose: An Integrated Approach for Developing Code Comprehension and Prompting Skills Paul Denny et.al. 2403.06050v1 null
2024-03-08 Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings Wei Zhou et.al. 2403.05338v1 null
2024-03-08 Aligning Large Language Models for Controllable Recommendations Wensheng Lu et.al. 2403.05063v1 null
2024-03-07 Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Wei-Lin Chiang et.al. 2403.04132v1 null
2024-04-26 Multimodal Large Language Models to Support Real-World Fact-Checking Jiahui Geng et.al. 2403.03627v2 null
2024-03-06 RouteExplainer: An Explanation Framework for Vehicle Routing Problem Daisuke Kikuta et.al. 2403.03585v1 link
2024-03-06 Explaining Genetic Programming Trees using Large Language Models Paula Maddigan et.al. 2403.03397v1 null
2024-03-05 SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection Peng Qi et.al. 2403.03170v1 null
2024-03-05 Word Importance Explains How Prompts Affect Language Model Outputs Stefan Hackmann et.al. 2403.03028v1 null
2024-03-05 FinReport: Explainable Stock Earnings Forecasting via News Factor Analyzing Model Xiangyu Li et.al. 2403.02647v1 link
2024-03-04 Evaluating the Explainability of Neural Rankers Saran Pandian et.al. 2403.01981v1 null
2024-03-03 SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos Yulei Niu et.al. 2403.01599v1 null
2024-03-03 Logic Rules as Explanations for Legal Case Retrieval Zhongxiang Sun et.al. 2403.01457v1 link
2024-03-02 Improving the Validity of Automatically Generated Feedback via Reinforcement Learning Alexander Scarlatos et.al. 2403.01304v1 link
2024-03-02 STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models Linhai Zhang et.al. 2403.01165v1 link
2024-02-25 Cognitive Bias in High-Stakes Decision-Making with LLMs Jessica Echterhoff et.al. 2403.00811v1 null
2024-03-16 ChatDiet: Empowering Personalized Nutrition-Oriented Food Recommender Chatbots through an LLM-Augmented Framework Zhongqi Yang et.al. 2403.00781v2 null
2024-02-29 FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition Xiaoqiang Wang et.al. 2403.00126v1 null
2024-02-29 Dual Operating Modes of In-Context Learning Ziqian Lin et.al. 2402.18819v1 link
2024-04-15 Cause and Effect: Can Large Language Models Truly Understand Causality? Swagata Ashwani et.al. 2402.18139v2 null
2024-03-13 Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions Hanjie Chen et.al. 2402.18060v3 link
2024-03-04 A Language Model based Framework for New Concept Placement in Ontologies Hang Dong et.al. 2402.17897v2 link
2024-04-12 Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses Juyeon Kim et.al. 2402.17097v2 link
2024-02-26 Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling Hang Jiang et.al. 2402.17019v1 link
2024-02-28 Defending LLMs against Jailbreaking Attacks via Backtranslation Yihan Wang et.al. 2402.16459v2 link
2024-02-26 ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors Zhexin Zhang et.al. 2402.16444v1 link
2024-02-26 Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models Tianyi Tang et.al. 2402.16438v1 null
2024-03-11 Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models Jeonghwan Kim et.al. 2402.16315v2 null
2024-02-24 HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition Yuxuan Liu et.al. 2402.15754v1 null
2024-02-24 Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning Yong Liu et.al. 2402.15751v1 null
2024-03-04 LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper Daoyuan Wu et.al. 2402.15727v2 null
2024-02-26 Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition Yufei Huang et.al. 2402.15175v2 null
2024-02-22 Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark Xiuying Chen et.al. 2402.14359v1 null
2024-02-22 Do Machines and Humans Focus on Similar Code? Exploring Explainability of Large Language Models in Code Summarization Jiliang Li et.al. 2402.14182v1 null
2024-02-21 An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Mohammad Amaz Uddin et.al. 2402.13871v1 null
2024-02-21 Factual Consistency Evaluation of Summarisation in the Era of Large Language Models Zheheng Luo et.al. 2402.13758v1 null
2024-03-08 SaGE: Evaluating Moral Consistency in Large Language Models Vamshi Krishna Bonagiri et.al. 2402.13709v2 link
2024-02-19 Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? Nishant Balepur et.al. 2402.12483v1 link
2024-02-19 Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models Puxuan Yu et.al. 2402.12276v1 link
2024-02-18 Opening the black box of language acquisition Jérôme Michaud et.al. 2402.11681v1 link
2024-02-23 Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Bias Detection Valeria Pastorino et.al. 2402.11621v2 null
2024-02-18 Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network Lin Chen et.al. 2402.11518v1 null
2024-02-18 Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction Yinghui Li et.al. 2402.11420v1 null
2024-02-17 Dissecting Human and LLM Preferences Junlong Li et.al. 2402.11296v1 link
2024-02-17 GenDec: A robust generative Question-decomposition method for Multi-hop reasoning Jian Wu et.al. 2402.11166v1 null
2024-02-16 Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models Zihao Lin et.al. 2402.11122v1 null
2024-02-21 Exploring Value Biases: How LLMs Deviate Towards the Ideal Sarath Sivaprasad et.al. 2402.11005v2 null
2024-03-15 Zero-shot Explainable Mental Health Analysis on Social Media by Incorporating Mental Scales Wenyu Li et.al. 2402.10948v2 null
2024-02-19 Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities Mingyu Jin et.al. 2402.10835v2 null
2024-02-16 RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model Jianhao Yuan et.al. 2402.10828v1 null
2024-02-16 Quantifying the Persona Effect in LLM Simulations Tiancheng Hu et.al. 2402.10811v1 null
2024-02-16 Properties and Challenges of LLM-Generated Explanations Jenny Kunz et.al. 2402.10532v1 null
2024-02-15 Large Language Models for Forecasting and Anomaly Detection: A Systematic Literature Review Jing Su et.al. 2402.10350v1 null
2024-02-15 Case Study: Testing Model Capabilities in Some Reasoning Tasks Min Zhang et.al. 2402.09967v1 null
2024-02-15 Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States Hanyu Duan et.al. 2402.09733v1 null
2024-02-21 CodeMind: A Framework to Challenge Large Language Models for Code Reasoning Changshu Liu et.al. 2402.09664v3 link
2024-02-14 Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems Liang Zhang et.al. 2402.09584v1 null
2024-02-14 SyntaxShap: Syntax-aware Explainability Method for Text Generation Kenza Amara et.al. 2402.09259v1 null
2024-02-12 Why and When LLM-Based Assistants Can Go Wrong: Investigating the Effectiveness of Prompt-Based Interactions for Software Help-Seeking Anjali Khurana et.al. 2402.08030v1 null
2024-02-02 Exploring patient trust in clinical advice from AI-driven LLMs like ChatGPT for self-diagnosis Delong Du et.al. 2402.07920v1 null
2024-01-29 Experimental Interface for Multimodal and Large Language Model Based Explanations of Educational Recommender Systems Hasan Abu-Rasheed et.al. 2402.07910v1 null
2024-02-12 TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection Hui Liu et.al. 2402.07776v1 link
2024-02-12 Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate Kyungha Kim et.al. 2402.07401v1 null
2024-02-11 TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation Peng Wang et.al. 2402.07233v1 null
2024-02-11 X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Design Eric L. Buehler et.al. 2402.07148v1 link
2024-02-08 Integrating LLMs for Explainable Fault Diagnosis in Complex Systems Akshay J. Dave et.al. 2402.06695v1 null
2024-02-09 The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model Gregory Coppola et.al. 2402.06557v1 link
2024-02-06 Personalized Language Modeling from Personalized Human Feedback Xinyu Li et.al. 2402.05133v1 null
2024-02-05 Illuminate: A novel approach for depression detection with explainable analysis and proactive therapy using prompt engineering Aryan Agrawal et.al. 2402.05127v1 null
2024-02-07 Large Language Models As Faithful Explainers Yu-Neng Chuang et.al. 2402.04678v1 null
2024-03-14 Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models Chirag Agarwal et.al. 2402.04614v3 null
2024-02-06 Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models David Sobrín-Hidalgo et.al. 2402.04206v1 null
2024-02-29 Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models Kelvin J. L. Koa et.al. 2402.03659v3 link
2024-01-31 Uncertainty-Aware Explainable Recommendation with Large Language Models Yicui Peng et.al. 2402.03366v1 null
2024-02-05 The Matrix: A Bayesian learning model for LLMs Siddhartha Dalal et.al. 2402.03175v1 null
2024-02-05 Less is KEN: a Universal and Simple Non-Parametric Pruning Algorithm for Large Language Models Michele Mastromattei et.al. 2402.03142v1 link
2024-02-05 How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning Zeping Yu et.al. 2402.02872v1 null
2024-02-04 Selecting Large Language Model to Fine-tune via Rectified Scaling Law Haowei Lin et.al. 2402.02314v1 null
2024-02-03 Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times Byung-Doh Oh et.al. 2402.02255v1 link
2024-02-06 Large Language Model Agent for Hyper-Parameter Optimization Siyi Liu et.al. 2402.01881v2 null
2024-02-02 The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models Moschoula Pternea et.al. 2402.01874v1 null
2024-02-02 Ecologically rational meta-learned inference explains human category learning Akshay K. Jagadish et.al. 2402.01821v1 null
2024-02-01 When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards Norah Alzahrani et.al. 2402.01781v1 null
2024-01-30 Rethinking Interpretability in the Era of Large Language Models Chandan Singh et.al. 2402.01761v1 link
2024-02-24 Contextualization Distillation from Large Language Model for Knowledge Graph Completion Dawei Li et.al. 2402.01729v3 null
2024-03-01 Measuring Moral Inconsistencies in Large Language Models Vamshi Krishna Bonagiri et.al. 2402.01719v3 null
2024-02-16 Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications Yuhang Zhou et.al. 2402.01681v2 null
2024-02-05 SymbolicAI: A framework for logic-based approaches combining generative models and solvers Marius-Constantin Dinu et.al. 2402.00854v2 link
2024-02-01 Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement Xin Quan et.al. 2402.00745v1 link
2024-02-01 IndiVec: An Exploration of Leveraging Large Language Models for Media Bias Detection with Fine-Grained Bias Indicators Luyang Lin et.al. 2402.00345v1 null
2024-02-01 Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective Qun Ma et.al. 2402.00262v1 null
2024-01-31 Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT Diego Machado Reyes et.al. 2402.00137v1 null
2024-03-10 Arrows of Time for Large Language Models Vassilis Papadopoulos et.al. 2401.17505v2 null
2024-01-30 Detecting mental disorder on social media: a ChatGPT-augmented explainable approach Loris Belcastro et.al. 2401.17477v1 link
2024-02-10 Reproducibility, energy efficiency and performance of pseudorandom number generators in machine learning: a comparative study of python, numpy, tensorflow, and pytorch implementations Benjamin Antunes et.al. 2401.17345v2 null
2024-01-30 Incoherent Probability Judgments in Large Language Models Jian-Qiao Zhu et.al. 2401.16646v1 null
2024-02-27 How Good is ChatGPT at Face Biometrics? A First Look into Recognition, Soft Biometrics, and Explainability Ivan DeAndres-Tame et.al. 2401.13641v2 link
2024-01-24 Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models Hongzhan Lin et.al. 2401.13298v1 link
2024-01-23 XAI for All: Can Large Language Models Simplify Explainable AI? Philip Mavrepis et.al. 2401.13110v1 null
2024-02-22 From Understanding to Utilization: A Survey on Explainability for Large Language Models Haoyan Luo et.al. 2401.12874v2 null
2024-01-23 How well can large language models explain business processes? Dirk Fahland et.al. 2401.12846v1 null
2024-02-23 Generating Zero-shot Abstractive Explanations for Rumour Verification Iman Munire Bilal et.al. 2401.12713v3 link
2024-01-23 LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools Qianli Wang et.al. 2401.12576v1 link
2024-01-21 Over-Reasoning and Redundant Calculation of Large Language Models Cheng-Han Chiang et.al. 2401.11467v1 link
2024-01-20 Analyzing Task-Encoding Tokens in Large Language Models Yu Bai et.al. 2401.11323v1 null
2024-01-17 Vlogger: Make Your Dream A Vlog Shaobin Zhuang et.al. 2401.09414v1 link
2024-01-24 Supporting Student Decisions on Learning Recommendations: An LLM-Based Chatbot with Knowledge Graph Contextualization for Conversational Explainability and Mentoring Hasan Abu-Rasheed et.al. 2401.08517v3 null
2024-01-16 LLM-Guided Multi-View Hypergraph Learning for Human-Centric Explainable Recommendation Zhixuan Chu et.al. 2401.08217v1 null
2024-02-15 Are self-explanations from Large Language Models faithful? Andreas Madsen et.al. 2401.07927v3 link
2024-01-15 Quantum Transfer Learning for Acceptability Judgements Giuseppe Buonaiuto et.al. 2401.07777v1 null
2024-01-14 Harnessing Large Language Models Over Transformer Models for Detecting Bengali Depressive Social Media Text: A Comprehensive Study Ahmadul Karim Chowdhury et.al. 2401.07310v1 null
2024-01-12 TestSpark: IntelliJ IDEA's Ultimate Test Generation Companion Arkadii Sapozhnikov et.al. 2401.06580v1 link
2024-01-12 Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models Asma Ghandeharioun et.al. 2401.06102v2 null
2024-01-11 Video Anomaly Detection and Explanation via Large Language Models Hui Lv et.al. 2401.05702v1 null
2024-01-11 REBUS: A Robust Evaluation Benchmark of Understanding Symbols Andrew Gritsevskiy et.al. 2401.05604v1 link
2024-01-08 LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems Mohamad Fakih et.al. 2401.05443v1 link
2024-01-10 Prompting Large Language Models for Recommender Systems: A Comprehensive Framework and Empirical Analysis Lanling Xu et.al. 2401.04997v1 null
2024-01-08 ExTraCT -- Explainable Trajectory Corrections from language inputs using Textual description of features J-Anne Yow et.al. 2401.03701v1 null
2024-01-06 Autonomous Crowdsensing: Operating and Organizing Crowdsensing for Sensing Automation Wansen Wu et.al. 2401.03229v1 null
2024-01-02 Evaluating Large Language Models on the GMAT: Implications for the Future of Business Education Vahid Ashrafimoghari et.al. 2401.02985v1 null
2024-01-05 Large Language Models in Plant Biology Hilbert Yuen In Lam et.al. 2401.02789v1 null
2024-01-02 VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics Ammar A. Siddiqui et.al. 2401.01414v1 null
2023-12-30 The Problem of Alignment Tsvetelina Hristova et.al. 2401.00210v1 null
2023-12-29 Building Efficient Universal Classifiers with Natural Language Inference Moritz Laurer et.al. 2312.17543v1 link
2023-12-23 An Explainable AI Approach to Large Language Model Assisted Causal Model Auditing and Development Yanming Zhang et.al. 2312.16211v1 null
2024-01-03 Unlocking the Potential of Large Language Models for Explainable Recommendations Yucong Luo et.al. 2312.15661v3 link
2023-12-11 Transportation Transformed: A Comprehensive Review of Dynamic Rerouting in Multimodal Networks Suyash Pratap et.al. 2312.14953v1 null
2023-12-22 VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation Max Ku et.al. 2312.14867v1 null
2023-12-21 Deep de Finetti: Recovering Topic Distributions from Large Language Models Liyi Zhang et.al. 2312.14226v1 null
2023-12-16 Learning Interpretable Queries for Explainable Image Classification with Information Pursuit Stefan Kolek et.al. 2312.11548v1 null
2023-12-19 The Good, The Bad, and Why: Unveiling Emotions in Generative AI Cheng Li et.al. 2312.11111v2 null
2023-12-17 Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression Luis Balderas et.al. 2312.10702v1 null
2024-01-17 LLM-SQL-Solver: Can LLMs Determine SQL Equivalence? Fuheng Zhao et.al. 2312.10321v2 null
2023-12-15 GPT-doctor: Customizing Large Language Models for Medical Consultation Wen Wang et.al. 2312.10225v1 null
2023-12-04 A collection of principles for guiding and evaluating large language models Konstantin Hebenstreit et.al. 2312.10059v1 null
2023-12-15 Prompting Datasets: Data Discovery with Conversational Agents Johanna Walker et.al. 2312.09947v1 null
2023-12-15 SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models Lee Hyun et.al. 2312.09818v1 link
2023-12-14 Successor Heads: Recurring, Interpretable Attention Heads In The Wild Rhys Gould et.al. 2312.09230v1 null
2023-12-27 Fine-Grained Image-Text Alignment in Medical Imaging Enables Cyclic Image-Report Generation Wenting Chen et.al. 2312.08078v4 null
2023-12-13 Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning Jinta Weng et.al. 2312.08027v1 null
2023-12-12 Tell, don't show: Declarative facts influence how LLMs generalize Alexander Meinke et.al. 2312.07779v1 null
2023-12-05 Building Trustworthy NeuroSymbolic AI Systems: Consistency, Reliability, Explainability, and Safety Manas Gaur et.al. 2312.06798v1 null
2023-12-10 Evidence-based Interpretable Open-domain Fact-checking with Large Language Models Xin Tan et.al. 2312.05834v1 null
2023-11-30 Applying Large Language Models and Chain-of-Thought for Automatic Scoring Gyeong-Geon Lee et.al. 2312.03748v1 null
2023-12-06 XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering Joel Stremmel et.al. 2312.03567v1 null
2023-12-03 TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents James Enouen et.al. 2312.01279v1 null
2023-11-30 Large Language Models for Travel Behavior Prediction Baichuan Mo et.al. 2312.00819v1 null
2023-11-30 CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation Pei Ke et.al. 2311.18702v1 link
2023-11-30 Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension Akira Kawabata et.al. 2311.18353v1 null
2023-11-29 Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation Xijia Zhang et.al. 2311.18062v1 null
2023-11-29 Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning Xiaoqian Wu et.al. 2311.17365v1 null
2023-11-29 Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering Zeqing Wang et.al. 2311.17331v1 null
2024-02-12 Large language models can enhance persuasion through linguistic feature alignment Minkyu Shin et.al. 2311.16466v2 null
2023-11-16 Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities Avishree Khare et.al. 2311.16169v1 null
2023-11-27 Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models Stephen MacNeil et.al. 2311.16017v1 null
2023-11-27 Justifiable Artificial Intelligence: Engineering Large Language Models for Legal Applications Sabine Wehnert et.al. 2311.15716v1 null
2023-11-27 Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination Haoqiang Kang et.al. 2311.15548v1 null
2023-11-25 Code Generation Based Grading: Evaluating an Auto-grading Mechanism for "Explain-in-Plain-English" Questions David H. Smith IV et.al. 2311.14903v1 null
2023-11-10 ChatGPT Exhibits Gender and Racial Biases in Acute Coronary Syndrome Management Angela Zhang et.al. 2311.14703v1 null
2023-11-23 Towards Auditing Large Language Models: Improving Text-based Stereotype Detection Wu Zekun et.al. 2311.14126v1 null
2023-11-23 Towards Explainable Strategy Templates using NLP Transformers Pallavi Bagga et.al. 2311.14061v1 null
2023-11-22 Large Language Models in Education: Vision and Opportunities Wensheng Gan et.al. 2311.13160v1 null
2023-11-21 A Survey on Large Language Models for Personalized and Explainable Recommendations Junyi Chen et.al. 2311.12338v1 null
2023-11-20 Unifying Corroborative and Contributive Attributions in Large Language Models Theodora Worledge et.al. 2311.12233v1 null
2023-11-20 LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions Songhao Han et.al. 2311.11904v1 null
2023-11-20 Large Language Models and Explainable Law: a Hybrid Methodology Marco Billi et.al. 2311.11811v1 null
2023-11-20 Exploring Prompting Large Language Models as Explainable Metrics Ghazaleh Mahmoudi et.al. 2311.11552v1 link
2023-11-19 Using Causal Threads to Explain Changes in a Dynamic System Robert B. Allen et.al. 2311.11334v1 null
2023-12-17 Rethinking Large Language Models in Mental Health Applications Shaoxiong Ji et.al. 2311.11267v2 null
2023-11-16 ChatGPT-3.5, ChatGPT-4, Google Bard, and Microsoft Bing to Improve Health Literacy and Communication in Pediatric Populations and Beyond Kanhai S. Amin et.al. 2311.10075v1 null
2023-11-16 Is "A Helpful Assistant" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts Mingqian Zheng et.al. 2311.10054v1 null
2023-11-15 Explaining Explanation: An Empirical Study on Explanation in Code Reviews Ratnadira Widyasari et.al. 2311.09020v1 null
2023-11-15 Data Similarity is Not Enough to Explain Language Model Performance Gregory Yauney et.al. 2311.09006v1 link
2023-11-15 XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making Zichen Chen et.al. 2311.08614v1 null
2023-11-14 UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations Wenting Zhao et.al. 2311.08469v1 null
2023-11-16 Are Large Language Models Temporally Grounded? Yifu Qiu et.al. 2311.08398v2 link
2023-11-13 In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax Aaron Mueller et.al. 2311.07811v1 link
2023-11-13 On Measuring Faithfulness of Natural Language Explanations Letitia Parcalabescu et.al. 2311.07466v1 link
2023-11-12 SELF-EXPLAIN: Teaching Large Language Models to Reason Complex Questions by Themselves Jiachen Zhao et.al. 2311.06985v1 null
2023-11-10 Distilling Large Language Models using Skill-Occupation Graph Context for HR-Related Tasks Pouya Pezeshkpour et.al. 2311.06383v1 link
2023-11-08 DEMASQ: Unmasking the ChatGPT Wordsmith Kavita Kumari et.al. 2311.05019v1 null
2023-11-01 From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems Samyar Janatian et.al. 2311.04911v1 link
2023-11-07 Extracting human interpretable structure-property relationships in chemistry using XAI and large language models Geemi P. Wellawatte et.al. 2311.04047v1 link
2023-11-07 Which is better? Exploring Prompting Strategy For LLM-based Metrics Joonghoon Kim et.al. 2311.03754v1 link
2023-11-07 Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning Ruosen Li et.al. 2311.03734v1 link
2023-11-04 Can ChatGPT support software verification? Christian Janßen et.al. 2311.02433v1 null
2023-11-12 Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models Sean Xie et.al. 2311.01732v2 link
2023-09-26 Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI Muhammad Aurangzeb Ahmad et.al. 2311.01463v1 null
2023-11-01 Emotion Detection for Misinformation: A Review Zhiwei Liu et.al. 2311.00671v1 null
2023-11-22 HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning Yongjin Yang et.al. 2311.00321v2 link
2023-11-01 ChatGPT-Powered Hierarchical Comparisons for Image Classification Zhiyuan Ren et.al. 2311.00206v1 null
2023-11-14 Learning From Mistakes Makes LLM Better Reasoner Shengnan An et.al. 2310.20689v2 link
2023-10-31 Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests Max J. van Duijn et.al. 2310.20320v1 null
2023-10-30 The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics Christoph Leiter et.al. 2310.19792v1 link
2023-10-30 Explaining Tree Model Decisions in Natural Language for Network Intrusion Detection Noah Ziems et.al. 2310.19658v1 null
2023-10-28 The Synergy of Speculative Decoding and Batching in Serving Large Language Models Qidong Su et.al. 2310.18813v1 null
2023-11-01 Will releasing the weights of future large language models grant widespread access to pandemic agents? Anjali Gopal et.al. 2310.18233v2 null
2023-10-26 Beyond MLE: Convex Learning for Text Generation Chenze Shao et.al. 2310.17217v1 null
2023-10-26 DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models Ge Zheng et.al. 2310.16436v2 null
2023-10-25 Graph Agent: Explicit Reasoning Agent for Graphs Qinyong Wang et.al. 2310.16421v1 null
2023-12-29 Evaluating General-Purpose AI with Psychometrics Xiting Wang et.al. 2310.16379v2 null
2023-10-24 UI Layout Generation with LLMs Guided by UI Grammar Yuwen Lu et.al. 2310.15455v1 null
2023-10-22 Evaluating Subjective Cognitive Appraisals of Emotions from Large Language Models Hongli Zhan et.al. 2310.14389v1 link
2023-10-22 Towards Harmful Erotic Content Detection through Coreference-Driven Contextual Analysis Inez Okulska et.al. 2310.14325v1 null
2023-10-21 Large Language Models and Multimodal Retrieval for Visual Word Sense Disambiguation Anastasia Kritharoula et.al. 2310.14025v1 link
2023-10-20 Ecologically Valid Explanations for Label Variation in NLI Nan-Jiang Jiang et.al. 2310.13850v1 link
2023-10-30 Why Can Large Language Models Generate Correct Chain-of-Thoughts? Rasul Tutunov et.al. 2310.13571v2 null
2023-10-20 The Perils & Promises of Fact-checking with Large Language Models Dorian Quelle et.al. 2310.13549v1 null
2023-10-20 Explaining Interactions Between Text Spans Sagnik Ray Choudhury et.al. 2310.13506v1 link
2023-10-19 Frozen Transformers in Language Models Are Effective Visual Encoder Layers Ziqi Pang et.al. 2310.12973v1 link
2023-10-28 Probing LLMs for hate speech detection: strengths and vulnerabilities Sarthak Roy et.al. 2310.12860v2 null
2023-10-19 Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong Chenglei Si et.al. 2310.12558v1 null
2023-10-17 Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations Shiyuan Huang et.al. 2310.11207v1 null
2023-11-11 Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms Seungju Han et.al. 2310.10418v2 link
2023-10-15 EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification Huanhuan Ma et.al. 2310.09754v1 link
2023-10-13 A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models Takuma Udagawa et.al. 2310.08797v1 null
2023-10-12 Circuit Component Reuse Across Tasks in Transformer Language Models Jack Merullo et.al. 2310.08744v1 null
2023-10-12 Who Wrote it and Why? Prompting Large-Language Models for Authorship Verification Chia-Yu Hung et.al. 2310.08123v1 null
2023-10-12 Large Language Models for Scientific Synthesis, Inference and Explanation Yizhen Zheng et.al. 2310.07984v1 link
2023-10-11 Large Language Models Are Zero-Shot Time Series Forecasters Nate Gruver et.al. 2310.07820v1 link
2023-10-10 Benchmarking and Explaining Large Language Model-based Code Generation: A Causality-Centric Approach Zhenlan Ji et.al. 2310.06680v1 null
2023-10-10 SCAR: Power Side-Channel Analysis at RTL-Level Amisha Srivastava et.al. 2310.06257v1 null
2023-10-11 The Importance of Prompt Tuning for Automated Neuron Explanations Justin Lee et.al. 2310.06200v2 null
2023-10-09 A Meta-Learning Perspective on Transformers for Causal Language Modeling Xinbo Wu et.al. 2310.05884v1 null
2023-10-10 Are Large Language Models Post Hoc Explainers? Nicholas Kroeger et.al. 2310.05797v2 link
2023-10-09 A Closer Look into Automatic Evaluation Using Large Language Models Cheng-Han Chiang et.al. 2310.05657v1 link
2023-10-09 Explaining the Complex Task Reasoning of Large Language Models with Template-Content Structure Haotong Yang et.al. 2310.05452v1 null
2023-10-20 Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models Haoran Wang et.al. 2310.05253v2 link
2023-10-08 Scaling Laws of RoPE-based Extrapolation Xiaoran Liu et.al. 2310.05209v1 null
2023-10-08 Harnessing the Power of ChatGPT in Fake News: An In-Depth Exploration in Generation, Detection and Explanation Yue Huang et.al. 2310.05046v1 null
2023-10-08 Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading Howard Chen et.al. 2310.05029v1 null
2023-10-08 Domain Knowledge Graph Construction Via A Simple Checker Yueling Zeng et.al. 2310.04949v1 null
2023-11-11 FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets Neng Wang et.al. 2310.04793v2 link
2023-10-03 Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions Naiming Liu et.al. 2310.02439v1 null
2023-10-13 Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving Long Chen et.al. 2310.01957v2 link
2023-11-28 DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models Albert Garde et.al. 2310.01870v2 link
2023-12-07 UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities Hejia Geng et.al. 2310.01441v2 null
2023-10-02 Automated Evaluation of Classroom Instructional Support with LLMs and BoWs: Connecting Global Predictions to Specific Feedback Jacob Whitehill et.al. 2310.01132v1 null
2023-10-08 Back to the Future: Towards Explainable Temporal Reasoning with Large Language Models Chenhan Yuan et.al. 2310.01074v2 link
2023-10-01 Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning Mustafa Shukor et.al. 2310.00647v1 link
2023-11-22 Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals Yair Gat et.al. 2310.00603v2 null
2023-09-29 Tell Me a Story! Narrative-Driven XAI with Large Language Models David Martens et.al. 2309.17057v1 link
2023-09-28 T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems Ming Wang et.al. 2309.16146v1 link
2023-09-28 TPE: Towards Better Compositional Reasoning over Conceptual Tools with Multi-persona Collaboration Hongru Wang et.al. 2309.16090v1 null
2023-09-27 HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs) Tarek Ali et.al. 2309.16021v1 null
2023-09-27 MindGPT: Interpreting What You See with Non-invasive Brain Recordings Jiaxuan Chen et.al. 2309.15729v1 link
2023-09-23 LLMs as Counterfactual Explanation Modules: Can ChatGPT Explain Black-box Text Classifiers? Amrita Bhattacharjee et.al. 2309.13340v1 null
2023-09-21 JobRecoGPT -- Explainable job recommendations using LLMs Preetam Ghosh et.al. 2309.11805v1 null
2023-09-20 Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction Masahiro Kaneko et.al. 2309.11439v1 link


LLM - Interpretable

Publish Date Title Authors PDF Code
2024-07-24 How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations? Leo Yu-Ho Lo et.al. 2407.17291v1 null
2024-07-24 SAFETY-J: Evaluating Safety with Critique Yixiu Liu et.al. 2407.17075v1 null
2024-07-24 Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism Anhao Zhao et.al. 2407.17011v1 null
2024-07-23 PhenoFlow: A Human-LLM Driven Visual Analytics System for Exploring Large and Complex Stroke Datasets Jaeyoung Kim et.al. 2407.16329v1 null
2024-07-22 Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs Abhay Sheshadri et.al. 2407.15549v1 null
2024-07-22 Decoding BACnet Packets: A Large Language Model Approach for Packet Interpretation Rashi Sharma et.al. 2407.15428v1 null
2024-07-22 Dissecting Multiplication in Transformers: Insights into LLMs Luyu Qiu et.al. 2407.15360v1 null
2024-07-23 LLMExplainer: Large Language Model based Bayesian Inference for Graph Explanation Generation Jiaxing Zhang et.al. 2407.15351v2 null
2024-07-21 XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models Erik Cambria et.al. 2407.15248v1 null
2024-07-19 Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context Nilanjana Das et.al. 2407.14644v1 null
2024-07-19 On Pre-training of Multimodal Language Models Customized for Chart Understanding Wan-Cyuan Fan et.al. 2407.14506v1 null
2024-07-19 Check-Eval: A Checklist-based Approach for Evaluating Text Quality Jayr Pereira et.al. 2407.14467v1 null
2024-07-02 Predictive Simultaneous Interpretation: Harnessing Large Language Models for Democratizing Real-Time Multilingual Communication Kurando Iida et.al. 2407.14269v1 null
2024-07-19 KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models Kemou Jiang et.al. 2407.14239v1 null
2024-07-19 LeKUBE: A Legal Knowledge Update BEnchmark Changyue Wang et.al. 2407.14192v1 null
2024-07-19 ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? Siddhant Waghjale et.al. 2407.14044v1 link
2024-07-18 PRAGyan -- Connecting the Dots in Tweets Rahul Ravi et.al. 2407.13909v1 null
2024-07-18 X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs Sirnam Swetha et.al. 2407.13851v1 null
2024-07-24 The Honorific Effect: Exploring the Impact of Japanese Linguistic Formalities on AI-Generated Physics Explanations Keisuke Sato et.al. 2407.13787v2 null
2024-07-03 RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring Ali Ghiasvand Mohammadkhani et.al. 2407.13781v1 null
2024-07-20 EarthMarker: Visual Prompt Learning for Region-level and Point-level Remote Sensing Imagery Comprehension Wei Zhang et.al. 2407.13596v2 link
2024-07-18 CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis Junying Chen et.al. 2407.13301v1 null
2024-07-18 SOMONITOR: Explainable Marketing Data Processing and Analysis with Large Language Models Qi Yang et.al. 2407.13117v1 null
2024-07-18 TrialEnroll: Predicting Clinical Trial Enrollment Success with Deep & Cross Network and Large Language Models Ling Yue et.al. 2407.13115v1 null
2024-07-10 Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey) Krishnaram Kenthapadi et.al. 2407.12858v1 null
2024-07-01 AutoFlow: Automated Workflow Generation for Large Language Model Agents Zelong Li et.al. 2407.12821v1 link
2024-07-17 AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism William Brannon et.al. 2407.12613v1 link
2024-07-17 NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models Gengze Zhou et.al. 2407.12366v1 link
2024-07-16 GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text Kyle Hamilton et.al. 2407.11827v1 null
2024-07-15 Mechanistic interpretability of large language models with applications to the financial services industry Ashkan Golgoon et.al. 2407.11215v1 null
2024-06-27 Does ChatGPT Have a Mind? Simon Goldstein et.al. 2407.11015v1 null
2024-06-24 Visualization Literacy of Multimodal Large Language Models: A Comparative Study Zhimin Li et.al. 2407.10996v1 null
2024-07-15 Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval Shengjie Ma et.al. 2407.10805v1 null
2024-07-15 Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping Wenhao Zhu et.al. 2407.10795v1 link
2024-07-15 Interpretability analysis on a pathology foundation model reveals biologically relevant embeddings across modalities Nhat Le et.al. 2407.10785v1 null
2024-07-15 Learning Dynamics of LLM Finetuning Yi Ren et.al. 2407.10490v1 link
2024-07-17 LAB-Bench: Measuring Capabilities of Language Models for Biology Research Jon M. Laurent et.al. 2407.10362v3 null
2024-07-22 TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation Roni Goldshmidt et.al. 2407.10114v2 null
2024-07-14 Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation Ge Gao et.al. 2407.10091v1 null
2024-07-13 Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks Shengbin Yue et.al. 2407.09893v1 link
2024-07-13 Speech-Guided Sequential Planning for Autonomous Navigation using Large Language Model Meta AI 3 (Llama3) Alkesh K. Srivastava et.al. 2407.09890v1 null
2024-06-26 Prompting Whole Slide Image Based Genetic Biomarker Prediction Ling Zhang et.al. 2407.09540v1 null
2024-07-12 SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers Shraman Pramanick et.al. 2407.09413v1 link
2024-07-11 Fault Diagnosis in Power Grids with Large Language Model Liu Jing et.al. 2407.08836v1 null
2024-07-11 Tamil Language Computing: the Present and the Future Kengatharaiyer Sarveswaran et.al. 2407.08618v1 null
2024-07-11 Incorporating Large Language Models into Production Systems for Enhanced Task Automation and Flexibility Yuchen Xia et.al. 2407.08550v1 null
2024-07-11 Tactics, Techniques, and Procedures (TTPs) in Interpreted Malware: A Zero-Shot Generation with Large Language Models Ying Zhang et.al. 2407.08532v1 null
2024-07-11 On the attribution of confidence to large language models Geoff Keeling et.al. 2407.08388v1 null
2024-07-11 Towards Explainable Evolution Strategies with Large Language Models Jill Baumann et.al. 2407.08331v1 null
2024-07-11 GeNet: A Multimodal LLM-Based Co-Pilot for Network Topology and Configuration Beni Ifland et.al. 2407.08249v1 null
2024-07-10 On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments Jingchao Fang et.al. 2407.08067v1 null
2024-07-10 Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models Yuji Zhang et.al. 2407.08039v1 null
2024-07-10 Transformer Alignment in Large Language Models Murdock Aubry et.al. 2407.07810v1 null
2024-07-10 Interpretable Differential Diagnosis with Dual-Inference Large Language Models Shuang Zhou et.al. 2407.07330v1 null
2024-07-09 Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling: A Survey of Early Trends, Datasets, and Challenges Emilio Ferrara et.al. 2407.07196v1 null
2024-07-09 Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models Flor Miriam Plaza-del-Arco et.al. 2407.06908v1 null
2024-07-10 Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts Shuangkang Fang et.al. 2407.06842v2 null
2024-07-09 Combining Knowledge Graphs and Large Language Models Amanda Kau et.al. 2407.06564v1 null
2024-07-09 Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons Yongqi Leng et.al. 2407.06488v1 null
2024-07-08 Artificial Intuition: Efficient Classification of Scientific Abstracts Harsh Sakhrani et.al. 2407.06093v1 null
2024-07-08 GenFollower: Enhancing Car-Following Prediction with Large Language Models Xianda Chen et.al. 2407.05611v1 null
2024-07-07 Experiments with truth using Machine Learning: Spectral analysis and explainable classification of synthetic, false, and genuine information Vishnu S. Pendyala et.al. 2407.05464v1 null
2024-07-06 Enhance the Robustness of Text-Centric Multimodal Alignments Ting-Yu Yen et.al. 2407.05036v1 null
2024-07-05 MobileFlow: A Multimodal LLM For Mobile GUI Agent Songqin Nong et.al. 2407.04346v1 null
2024-07-05 Crafting Large Language Models for Enhanced Interpretability Chung-En Sun et.al. 2407.04307v1 null
2024-07-17 DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Chengpeng Li et.al. 2407.04078v3 link
2024-07-04 A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations Md Tahmid Rahman Laskar et.al. 2407.04069v1 null
2024-07-04 Semantic Graphs for Syntactic Simplification: A Revisit from the Age of LLM Peiran Yao et.al. 2407.04067v1 link
2024-07-15 LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking Amy Xin et.al. 2407.04020v2 link
2024-07-04 Generative Technology for Human Emotion Recognition: A Scope Review Fei Ma et.al. 2407.03640v1 null
2024-07-04 The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model Brenden Smith et.al. 2407.03621v1 link
2024-07-03 Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering Zhaohe Liao et.al. 2407.03008v1 null
2024-07-03 FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Xiaochen Wang et.al. 2407.02964v1 null
2024-07-03 Model-Enhanced LLM-Driven VUI Testing of VPA Apps Suwan Li et.al. 2407.02791v1 null
2024-06-27 Meta Large Language Model Compiler: Foundation Models of Compiler Optimization Chris Cummins et.al. 2407.02524v1 null
2024-06-23 INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness Hung Le et.al. 2407.02518v1 null
2024-07-02 GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning Zhisheng Tang et.al. 2407.01892v1 link
2024-06-29 Potential Renovation of Information Search Process with the Power of Large Language Model for Healthcare Forhan Bin Emdad et.al. 2407.01627v1 null
2024-07-01 Agentless: Demystifying LLM-based Software Engineering Agents Chunqiu Steven Xia et.al. 2407.01489v1 link
2024-07-01 Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models Xiaolin Xing et.al. 2407.01358v1 link
2024-07-01 Calibrated Large Language Models for Binary Question Answering Patrizio Giovannotti et.al. 2407.01122v1 null
2024-07-01 Human-like object concept representations emerge naturally in multimodal large language models Changde Du et.al. 2407.01067v1 null
2024-07-01 Background-aware Multi-source Fusion Financial Trend Forecasting Mechanism Fengting Mo et.al. 2407.00904v1 null
2024-06-29 Financial Knowledge Large Language Model Cehao Yang et.al. 2407.00365v1 null
2024-06-29 LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods Zhenhua Wang et.al. 2407.00322v1 null
2024-06-27 Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks Ibrahim Abdelaziz et.al. 2407.00121v1 null
2024-06-17 A Personalised Learning Tool for Physics Undergraduate Students Built On a Large Language Model for Symbolic Regression Yufan Zhu et.al. 2407.00065v1 null
2024-06-28 Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification Anisha Gunjal et.al. 2406.20079v1 link
2024-06-28 Learning Interpretable Legal Case Retrieval via Knowledge-Guided Case Reformulation Chenlong Deng et.al. 2406.19760v1 link
2024-06-27 PathAlign: A vision-language model for whole slide images in histopathology Faruk Ahmed et.al. 2406.19578v1 null
2024-06-27 DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions Nigel Fernandez et.al. 2406.19356v1 null
2024-06-27 Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding Yue Fan et.al. 2406.19263v1 link
2024-06-27 Towards Learning Abductive Reasoning using VSA Distributed Representations Giacomo Camposampiero et.al. 2406.19121v1 link
2024-06-27 LayoutCopilot: An LLM-powered Multi-agent Collaborative Framework for Interactive Analog Layout Design Bingyang Liu et.al. 2406.18873v1 null
2024-06-27 DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment Ke-Han Lu et.al. 2406.18871v1 null
2024-06-27 ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation Jizheng Chen et.al. 2406.18825v1 null
2024-06-26 Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism Shi Zong et.al. 2406.18762v1 null
2024-07-15 Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models Georgios Tziafas et.al. 2406.18746v2 null
2024-06-26 Themis: Towards Flexible and Interpretable NLG Evaluation Xinyu Hu et.al. 2406.18365v1 link
2024-06-26 AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations Adam Dahlgren Lindström et.al. 2406.18346v1 null
2024-06-26 A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter Mohamed Salah Bouafif et.al. 2406.18075v1 null
2024-06-26 Diagnosis Assistant for Liver Cancer Utilizing a Large Language Model with Three Types of Knowledge Xuzhou Wu et.al. 2406.18039v1 null
2024-06-26 Automated Clinical Data Extraction with Knowledge Conditioned LLMs Diya Li et.al. 2406.18027v1 null
2024-06-25 Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective Hanqi Yan et.al. 2406.17969v1 null
2024-06-25 Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback Zhongtao Miao et.al. 2406.17873v1 link
2024-06-25 Human-Object Interaction from Human-Level Instructions Zhen Wu et.al. 2406.17840v1 null
2024-06-22 MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries? Xirui Li et.al. 2406.17806v1 null
2024-06-25 Banishing LLM Hallucinations Requires Rethinking Generalization Johnny Li et.al. 2406.17642v1 null
2024-06-25 Large Language Models are Interpretable Learners Ruochen Wang et.al. 2406.17224v1 link
2024-07-01 Large Language Models Assume People are More Rational than We Really are Ryan Liu et.al. 2406.17055v2 link
2024-06-23 Unveiling LLM Mechanisms Through Neural ODEs and Control Theory Yukun Zhang et.al. 2406.16985v1 null
2024-06-24 USDC: A Dataset of User Stance and Dogmatism in Long Conversations Mounika Marreddy et.al. 2406.16833v1 null
2024-06-25 RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale Beck LaBash et.al. 2406.16801v2 link
2024-06-24 OCALM: Object-Centric Assessment with Language Models Timo Kaufmann et.al. 2406.16748v1 null
2024-06-29 EmoLLM: Multimodal Emotional Understanding Meets Large Language Models Qu Yang et.al. 2406.16442v2 link
2024-06-25 Graph-Augmented LLMs for Personalized Health Insights: A Case Study in Sleep Analysis Ajan Subramanian et.al. 2406.16252v2 null
2024-06-23 Preference Tuning For Toxicity Mitigation Generalizes Across Languages Xiaochen Li et.al. 2406.16235v1 link
2024-06-23 Towards Natural Language-Driven Assembly Using Foundation Models Omkar Joglekar et.al. 2406.16093v1 null
2024-06-23 Unlocking the Future: Exploring Look-Ahead Planning Mechanistic Interpretability in Large Language Models Tianyi Men et.al. 2406.16033v1 null
2024-06-25 AudioBench: A Universal Benchmark for Audio Large Language Models Bin Wang et.al. 2406.16020v2 link
2024-06-23 Memorizing Documents with Guidance in Large Language Models Bumjin Park et.al. 2406.15996v1 null
2024-06-30 LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning Guangsi Shi et.al. 2406.15859v2 null
2024-06-22 DABL: Detecting Semantic Anomalies in Business Processes Using Large Language Models Wei Guan et.al. 2406.15781v1 link
2024-06-22 MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception Guanqun Wang et.al. 2406.15768v1 null
2024-06-21 Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph Roman Vashurin et.al. 2406.15627v1 null
2024-06-19 Dr.E Bridges Graphs with Large Language Models through Words Zipeng Liu et.al. 2406.15504v1 null
2024-06-21 A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation Irune Zubiaga et.al. 2406.15227v1 null
2024-06-21 Unsupervised Extraction of Dialogue Policies from Conversations Makesh Narsimhan Sreedhar et.al. 2406.15214v1 null
2024-06-21 Asynchronous Large Language Model Enhanced Planner for Autonomous Driving Yuan Chen et.al. 2406.14556v2 null
2024-06-20 LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors Sheikh Asif Imran et.al. 2406.14498v1 link
2024-06-20 Self-supervised Interpretable Concept-based Models for Text Classification Francesco De Santis et.al. 2406.14335v1 null
2024-07-01 QuST-LLM: Integrating Large Language Models for Comprehensive Spatial Transcriptomics Analysis Chao Hui Huang et.al. 2406.14307v2 link
2024-06-20 Definition generation for lexical semantic change detection Mariia Fedorova et.al. 2406.14167v1 link
2024-06-20 Finding Safety Neurons in Large Language Models Jianhui Chen et.al. 2406.14144v1 null
2024-06-19 Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning Yuval Shalev et.al. 2406.13858v1 null
2024-06-19 Fine-Tuning Gemma-7B for Enhanced Sentiment Analysis of Financial News Headlines Kangtong Mo et.al. 2406.13626v1 null
2024-06-27 VDebugger: Harnessing Execution Feedback for Debugging Visual Programs Xueqing Wu et.al. 2406.13444v2 link
2024-06-19 Finding Blind Spots in Evaluator LLMs with Interpretable Checklists Sumanth Doddapaneni et.al. 2406.13439v1 link
2024-06-19 Data Contamination Can Cross Language Barriers Feng Yao et.al. 2406.13236v1 link
2024-06-19 Locating and Extracting Relational Concepts in Large Language Models Zijian Wang et.al. 2406.13184v1 link
2024-06-19 LLMatDesign: Autonomous Materials Discovery with Large Language Models Shuyi Jia et.al. 2406.13163v1 null
2024-06-18 Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts Haoxiang Wang et.al. 2406.12845v1 link
2024-06-18 ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Team GLM et.al. 2406.12793v1 link
2024-06-18 UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions Xunzhi Wang et.al. 2406.12784v1 link
2024-06-18 Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning Bingchen Zhao et.al. 2406.12742v1 link
2024-06-18 On the Robustness of Language Models for Tabular Question Answering Kushal Raj Bhandari et.al. 2406.12719v1 null
2024-06-18 Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction Haoqiu Yan et.al. 2406.12707v1 link
2024-06-18 MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL Arian Askari et.al. 2406.12692v1 null
2024-06-18 Estimating Knowledge in Large Language Models Without Generating a Single Token Daniela Gottesman et.al. 2406.12673v1 null
2024-06-18 Transforming Surgical Interventions with Embodied Intelligence for Ultrasound Robotics Huan Xu et.al. 2406.12651v1 null
2024-06-19 Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models Hengyi Wang et.al. 2406.12649v2 null
2024-06-19 Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models Eldar Kurtic et.al. 2406.12572v2 link
2024-06-18 LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation Yuhao Wang et.al. 2406.12529v1 null
2024-06-18 Interpreting Bias in Large Language Models: A Feature-Based Approach Nirmalendu Prakash et.al. 2406.12347v1 null
2024-06-18 A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning Lijie Hu et.al. 2406.12255v1 null
2024-06-29 Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM Huaxin Zhang et.al. 2406.12235v2 link
2024-06-24 Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector Gangwei Jiang et.al. 2406.12227v2 null
2024-06-17 Satyrn: A Platform for Analytics Augmented Generation Marko Sterbentz et.al. 2406.12069v1 null
2024-06-17 ARTIST: Improving the Generation of Text-rich Images by Disentanglement Jianyi Zhang et.al. 2406.12044v1 null
2024-06-17 Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Junmo Kang et.al. 2406.12034v1 null
2024-06-17 How Do Large Language Models Acquire Factual Knowledge During Pretraining? Hoyeon Chang et.al. 2406.11813v1 null
2024-06-17 WaDec: Decompile WebAssembly Using Large Language Model Xinyu She et.al. 2406.11346v1 null
2024-06-17 Can Machines Resonate with Humans? Evaluating the Emotional and Empathic Comprehension of LMs Muhammad Arslan Manzoor et.al. 2406.11250v1 null
2024-06-17 Enabling robots to follow abstract instructions and complete complex dynamic tasks Ruaridh Mon-Williams et.al. 2406.11231v1 null
2024-06-17 Compound Schema Registry Silvery D. Fu et.al. 2406.11227v1 null
2024-06-17 MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model Jiahao Huo et.al. 2406.11193v1 null
2024-06-18 DELRec: Distilling Sequential Pattern to Enhance LLM-based Recommendation Guohao Sun et.al. 2406.11156v2 null
2024-07-01 The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models Bolei Ma et.al. 2406.11096v2 null
2024-06-16 Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens Weiyao Luo et.al. 2406.10985v1 null
2024-06-18 City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization Zihao Jiao et.al. 2406.10958v2 null
2024-06-28 Large Language Model Enhanced Clustering for News Event Detection Adane Nega Tarekegn et.al. 2406.10552v3 null
2024-06-17 Requirements are All You Need: From Requirements to Code with LLMs Bingyang Wei et.al. 2406.10101v2 link
2024-06-14 Exploring the Correlation between Human and Machine Evaluation of Simultaneous Speech Translation Xiaoman Wang et.al. 2406.10091v1 null
2024-06-14 Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam Nabor C. Mendonça et.al. 2406.09671v1 link
2024-06-12 LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions Nhat Hoang-Xuan et.al. 2406.08572v1 null
2024-06-12 Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning Jaehyun Nam et.al. 2406.08527v1 null
2024-06-12 Leveraging Large Language Models for Web Scraping Aman Ahluwalia et.al. 2406.08246v1 null
2024-06-12 AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection Pia Pachinger et.al. 2406.08080v1 null
2024-06-12 A Concept-Based Explainability Framework for Large Multimodal Models Jayneel Parekh et.al. 2406.08074v1 null
2024-06-12 Toward a Method to Generate Capability Ontologies from Natural Language Descriptions Luis Miguel Vieira da Silva et.al. 2406.07962v1 null
2024-06-11 Estimating the Hallucination Rate of Generative AI Andrew Jesson et.al. 2406.07457v1 null
2024-06-11 Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities Delfina Sol Martinez Pandiani et.al. 2406.07353v1 link
2024-06-11 Instruct Large Language Models to Drive like Humans Ruijun Zhang et.al. 2406.07296v1 link
2024-06-10 Harnessing AI for efficient analysis of complex policy documents: a case study of Executive Order 14110 Mark A. Kramer et.al. 2406.06657v1 null
2024-06-09 Exploring the Efficacy of Large Language Models (GPT-4) in Binary Reverse Engineering Saman Pordanesh et.al. 2406.06637v1 null
2024-06-09 LLM Questionnaire Completion for Automatic Psychiatric Assessment Gony Rosenman et.al. 2406.06636v1 null
2024-06-07 LinkQ: An LLM-Assisted Visual Interface for Knowledge Graph Question-Answering Harry Li et.al. 2406.06621v1 link
2024-06-06 Prototypical Reward Network for Data-Efficient RLHF Jinghan Zhang et.al. 2406.06606v1 null
2024-06-13 From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models Xiaofeng Zhang et.al. 2406.06579v2 null
2024-06-18 OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step Owen Dugan et.al. 2406.06576v2 null
2024-06-02 Inverse Constitutional AI: Compressing Preferences into Principles Arduin Findeis et.al. 2406.06560v1 link
2024-06-11 Transforming Wearable Data into Health Insights using Large Language Model Agents Mike A. Merrill et.al. 2406.06464v2 null
2024-06-10 Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization Yi Gu et.al. 2406.06382v1 link
2024-06-10 MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows Xingjian Zhang et.al. 2406.06357v1 link
2024-06-11 iMotion-LLM: Motion Prediction Instruction Tuning Abdulwahab Felemban et.al. 2406.06211v2 null
2024-06-10 Prompting Large Language Models with Audio for General-Purpose Speech Summarization Wonjune Kang et.al. 2406.05968v1 link
2024-06-16 RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation Kiseung Kim et.al. 2406.05794v2 null
2024-06-08 VP-LLM: Text-Driven 3D Volume Completion with Large Language Models through Patchification Jianmeng Liu et.al. 2406.05543v1 null
2024-06-08 MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention Prince Jha et.al. 2406.05344v1 link
2024-06-07 LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration Tavor Lipman et.al. 2406.05107v1 null
2024-06-07 LLM-based speaker diarization correction: A generalizable approach Georgios Efstathiadis et.al. 2406.04927v1 link
2024-06-07 Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models Michał Romaszewski et.al. 2406.04926v1 null
2024-06-07 WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild Bill Yuchen Lin et.al. 2406.04770v1 link
2024-06-07 LogiCode: an LLM-Driven Framework for Logical Anomaly Detection Yiheng Zhang et.al. 2406.04687v1 link
2024-06-07 Large Language Model-guided Document Selection Xiang Kong et.al. 2406.04638v1 null
2024-06-07 OCDB: Revisiting Causal Discovery with a Comprehensive Benchmark and Evaluation Framework Wei Zhou et.al. 2406.04598v1 null
2024-06-06 MAIRA-2: Grounded Radiology Report Generation Shruthi Bannur et.al. 2406.04449v1 null
2024-06-01 Large Language Model Confidence Estimation via Black-Box Access Tejaswini Pedapati et.al. 2406.04370v1 null
2024-06-06 Verbalized Machine Learning: Revisiting Machine Learning with Language Models Tim Z. Xiao et.al. 2406.04344v1 null
2024-06-06 Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People Dun-Ming Huang et.al. 2406.04278v1 link
2024-06-06 Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts Shubham Kumar Nigam et.al. 2406.04136v1 link
2024-06-06 Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning Xiaohu Du et.al. 2406.03718v1 link
2024-06-13 Ranking Manipulation for Conversational Search Engines Samuel Pfrommer et.al. 2406.03589v2 link
2024-06-04 Dynamic and Adaptive Feature Generation with LLM Xinhao Zhang et.al. 2406.03505v1 null
2024-06-05 Cycles of Thought: Measuring LLM Confidence through Stable Explanations Evan Becker et.al. 2406.03441v1 null
2024-06-05 Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models Qiang Sun et.al. 2406.02962v1 link
2024-06-06 Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers Brian K Chen et.al. 2406.02847v2 null
2024-06-04 Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks Tianyu He et.al. 2406.02550v1 link
2024-06-04 Iteration Head: A Mechanistic Study of Chain-of-Thought Vivien Cabannes et.al. 2406.02128v1 null
2024-06-04 I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering Valeriya Goloviznina et.al. 2406.02060v1 null
2024-06-04 Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs Nik Bear Brown et.al. 2406.01943v1 null
2024-06-05 Dishonesty in Helpful and Harmless Alignment Youcheng Huang et.al. 2406.01931v2 null
2024-06-21 Large Language Model-Enabled Multi-Agent Manufacturing Systems Jonghan Lim et.al. 2406.01893v2 null
2024-06-04 PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning Yupeng Zheng et.al. 2406.01587v2 null
2024-06-03 LoFiT: Localized Fine-tuning on LLM Representations Fangcong Yin et.al. 2406.01563v1 link
2024-06-20 What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores Ebrahim Feghhi et.al. 2406.01538v2 link
2024-06-03 The Geometry of Categorical and Hierarchical Concepts in Large Language Models Kiho Park et.al. 2406.01506v1 link
2024-06-11 AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation Junhao Cheng et.al. 2406.01388v2 link
2024-06-03 Large Language Model Assisted Optimal Bidding of BESS in FCAS Market: An AI-agent based Approach Borui Zhang et.al. 2406.00974v1 null
2024-06-04 Efficient Behavior Tree Planning with Commonsense Pruning and Heuristic Xinglin Chen et.al. 2406.00965v2 null
2024-06-10 Are you still on track!? Catching LLM Task Drift with Activations Sahar Abdelnabi et.al. 2406.00799v2 null
2024-06-02 An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging Sulaiman Khan et.al. 2406.00667v1 null
2024-06-02 Presence or Absence: Are Unknown Word Usages in Dictionaries? Xianghe Ma et.al. 2406.00656v1 link
2024-06-11 InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation Jacob Si et.al. 2406.00426v3 link
2024-06-01 Controlling Large Language Model Agents with Entropic Activation Steering Nate Rahn et.al. 2406.00244v1 null
2024-05-31 DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models Linli Yao et.al. 2405.20985v1 null
2024-05-31 Improving Reward Models with Synthetic Critiques Zihuiwen Ye et.al. 2405.20850v1 null
2024-05-31 Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning Cheng Tan et.al. 2405.20834v1 null
2024-05-31 UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation Hanzhang Zhou et.al. 2405.20612v1 null
2024-05-30 XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution Yurui Chang et.al. 2405.20404v1 null
2024-05-30 Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks Chen Xiong et.al. 2405.20099v1 null
2024-05-30 Deciphering Human Mobility: Inferring Semantics of Trajectories with Large Language Models Yuxiao Luo et.al. 2405.19850v1 null
2024-05-30 Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model Chaochen Gao et.al. 2405.19846v1 null
2024-05-30 Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback Jingwei Sun et.al. 2405.19686v1 null
2024-05-29 Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation Atrisha Sarkar et.al. 2405.19328v1 null
2024-05-29 Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models Tianrun Chen et.al. 2405.19326v1 null
2024-05-29 Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery Sounak Lahiri et.al. 2405.19164v1 null
2024-06-02 Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design Markus J. Buehler et.al. 2405.19076v2 link
2024-06-03 Genshin: General Shield for Natural Language Processing with Large Language Models Xiao Peng et.al. 2405.18741v2 null
2024-06-02 LLM-based Hierarchical Concept Decomposition for Interpretable Fine-Grained Image Classification Renyi Qu et.al. 2405.18672v2 null
2024-05-28 Large Language Models as Partners in Student Essay Evaluation Toru Ishida et.al. 2405.18632v1 null
2024-05-28 OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning Pengxiang Li et.al. 2405.18380v1 link
2024-05-28 FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models Yang Zhang et.al. 2405.18218v1 null
2024-05-28 Exploring Context Window of Large Language Models via Decomposed Positional Vectors Zican Dong et.al. 2405.18009v1 null
2024-05-28 SkinCAP: A Multi-modal Dermatology Dataset Annotated with Rich Medical Captions Juexiao Zhou et.al. 2405.18004v1 null
2024-05-28 Knowledge Circuits in Pretrained Transformers Yunzhi Yao et.al. 2405.17969v1 link
2024-05-28 Arithmetic Reasoning with LLM: Prolog Generation & Permutation Xiaocheng Yang et.al. 2405.17893v1 null
2024-05-27 Mechanistic Interpretability of Binary and Ternary Transformers Jason Li et.al. 2405.17703v1 link
2024-05-27 Deployment of NLP and LLM Techniques to Control Mobile Robots at the Edge: A Case Study Using GPT-4-Turbo and LLaMA 2 Pascal Sikorski et.al. 2405.17670v1 null
2024-05-27 Enhanced Robot Arm at the Edge with NLP and Vision Systems Pascal Sikorski et.al. 2405.17665v1 null
2024-05-27 BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments Yusuf Roohani et.al. 2405.17631v1 link
2024-05-25 Revisit, Extend, and Enhance Hessian-Free Influence Functions Ziao Yang et.al. 2405.17490v1 null
2024-05-28 LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding Haoyu Zhao et.al. 2405.17104v2 null
2024-05-27 Exploring the LLM Journey from Cognition to Expression with Linear Representations Yuzi Yan et.al. 2405.16964v1 null
2024-05-27 TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing Xinyu Zhang et.al. 2405.16803v1 null
2024-05-26 Crafting Interpretable Embeddings by Asking LLMs Questions Vinamra Benara et.al. 2405.16714v1 link
2024-05-26 Attaining Human's Desirable Outcomes in Human-AI Interaction via Structural Causal Games Anjie Liu et.al. 2405.16588v1 null
2024-05-26 Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search Max Liu et.al. 2405.16450v1 null
2024-05-26 Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level Runlin Lei et.al. 2405.16405v1 null
2024-05-25 Large Language Models Enable Automated Formative Feedback in Human-Robot Interaction Tasks Emily Jensen et.al. 2405.16344v1 null
2024-06-03 Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge Brendan Park et.al. 2405.16277v3 link
2024-05-25 Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention Andrew Li et.al. 2405.16042v1 null
2024-05-24 Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models Yue Zhang et.al. 2405.15684v1 null
2024-05-24 Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges Jonas Becker et.al. 2405.15604v1 link
2024-05-24 ChatGPT Code Detection: Techniques for Uncovering the Source of Code Marc Oedingen et.al. 2405.15512v1 link
2024-05-24 Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search Nicola Dainese et.al. 2405.15383v1 null
2024-05-24 Large Language Models can Deliver Accurate and Interpretable Time Series Anomaly Detection Jun Liu et.al. 2405.15370v1 null
2024-05-24 V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM Abdur Rahman et.al. 2405.15341v1 null
2024-05-24 Decompose and Aggregate: A Step-by-Step Interpretable Evaluation Framework Minzhi Li et.al. 2405.15329v1 null
2024-05-24 Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation Ge Qu et.al. 2405.15307v1 link
2024-05-23 AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Bin Lei et.al. 2405.14906v1 link
2024-05-28 Explaining Multi-modal Large Language Models by Analyzing their Vision Perception Loris Giulivi et.al. 2405.14612v2 link
2024-05-23 Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning Jiapu Wang et.al. 2405.14170v1 null
2024-05-28 DeTox: Toxic Subspace Projection for Model Editing Rheeya Uppaal et.al. 2405.13967v3 link
2024-05-22 Large Language Models are Good Spontaneous Multilingual Learners: Is the Multilingual Annotated Data Necessary? Shimao Zhang et.al. 2405.13816v1 link
2024-05-22 Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation Gauthier Guinet et.al. 2405.13622v1 null
2024-05-24 ECLIPSE: Semantic Entropy-LCS for Cross-Lingual Industrial Log Parsing Wei Zhang et.al. 2405.13548v2 null
2024-05-22 HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model Mustafa Yildirim et.al. 2405.13547v1 null
2024-05-21 A Survey of Robotic Language Grounding: Tradeoffs Between Symbols and Embeddings Vanya Cohen et.al. 2405.13245v1 null
2024-05-21 GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation Govind Ramesh et.al. 2405.13077v1 null
2024-05-19 Human-Centered LLM-Agent User Interface: A Position Paper Daniel Chin et.al. 2405.13050v1 null
2024-05-15 IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues Diji Yang et.al. 2405.13021v1 null
2024-05-21 Quantifying Emergence in Large Language Models Hang Chen et.al. 2405.12617v1 link
2024-05-21 Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models Charles O'Neill et.al. 2405.12522v1 null
2024-05-20 Directed Metric Structures arising in Large Language Models Stéphane Gaubert et.al. 2405.12264v1 null
2024-05-20 "Set It Up!": Functional Object Arrangement with Compositional Generative Models Yiqing Xu et.al. 2405.11928v1 null
2024-05-20 Unveiling and Manipulating Prompt Influence in Large Language Models Zijian Feng et.al. 2405.11891v1 link
2024-05-21 Decoding by Contrasting Knowledge: Enhancing LLMs' Confidence on Edited Facts Baolong Bi et.al. 2405.11613v2 link
2024-05-17 Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models Paula Akemi Aoyagui et.al. 2405.11048v1 null
2024-05-20 The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks Lucius Bushnaq et.al. 2405.10928v2 link
2024-05-17 COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain Dimitrios P. Panagoulias et.al. 2405.10893v1 null
2024-05-17 MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains Zhaohuan Zhan et.al. 2405.10620v1 null
2024-05-20 Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks Anwoy Chatterjee et.al. 2405.10548v2 null
2024-05-14 Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs Akhila Yerukola et.al. 2405.08760v1 null
2024-05-14 Challenges and Opportunities in Text Generation Explainability Kenza Amara et.al. 2405.08468v1 null
2024-05-14 Compositional Text-to-Image Generation with Dense Blob Representations Weili Nie et.al. 2405.08246v1 null
2024-05-13 Interpreting Latent Student Knowledge Representations in Programming Assignments Nigel Fernandez et.al. 2405.08213v1 null
2024-05-11 Translating Expert Intuition into Quantifiable Features: Encode Investigator Domain Knowledge via LLM for Enhanced Predictive Analytics Phoebe Jing et.al. 2405.08017v1 null
2024-05-13 A Generalist Learner for Multifaceted Medical Image Interpretation Hong-Yu Zhou et.al. 2405.07988v1 null
2024-05-13 MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning Shuo Yin et.al. 2405.07551v1 null
2024-05-13 Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions Xinglin Chen et.al. 2405.07474v1 null
2024-05-12 Human-interpretable clustering of short-text using large language models Justin K. Miller et.al. 2405.07278v1 null
2024-05-11 Automating Thematic Analysis: How LLMs Analyse Controversial Topics Awais Hameed Khan et.al. 2405.06919v1 null
2024-05-21 AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents Shuyuan Xu et.al. 2405.06907v2 link
2024-05-10 MEIC: Re-thinking RTL Debug Automation using LLMs Ke Xu et.al. 2405.06840v1 null
2024-05-10 Large Language Model in Financial Regulatory Interpretation Zhiyu Cao et.al. 2405.06808v1 null
2024-05-15 On the Shape of Brainscores for Large Language Models (LLMs) Jingkai Li et.al. 2405.06725v3 link
2024-05-09 Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses Gaurav Kumar Gupta et.al. 2405.06712v1 null
2024-05-08 Interpretable Cross-Examination Technique (ICE-T): Using highly informative features to boost LLM performance Goran Muric et.al. 2405.06703v1 null
2024-05-13 Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling Lyumanshan Ye et.al. 2405.06495v2 null
2024-05-10 Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL Ning Cheng et.al. 2405.06410v1 null
2024-05-09 LLMs for XAI: Future Directions for Explaining Explanations Alexandra Zytek et.al. 2405.06064v1 null
2024-05-09 Probing Multimodal LLMs as World Models for Driving Shiva Sreeram et.al. 2405.05956v1 link
2024-05-09 One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations Yoonjoo Lee et.al. 2405.05581v1 null
2024-05-11 Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals Joshua Clymer et.al. 2405.05466v2 null
2024-05-08 Empathy Through Multimodality in Conversational Interfaces Mahyar Abbasian et.al. 2405.04777v1 null
2024-05-09 Large Language Models for Cyber Security: A Systematic Literature Review HanXiang Xu et.al. 2405.04760v2 null
2024-05-13 A Transformer with Stack Attention Jiaoda Li et.al. 2405.04515v2 link
2024-05-06 In Situ AI Prototyping: Infusing Multimodal Prompts into Mobile Settings with MobileMaker Savvas Petridis et.al. 2405.03806v1 null
2024-05-06 Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames Keith Burghardt et.al. 2405.03688v1 link
2024-05-23 AlphaMath Almost Zero: process Supervision without process Guoxin Chen et.al. 2405.03553v2 link
2024-05-06 MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline Mohamed Yaseen Jabarulla et.al. 2405.03359v1 link
2024-05-06 WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning Yuanhan Zhang et.al. 2405.03272v1 null
2024-05-06 A Philosophical Introduction to Language Models - Part II: The Way Forward Raphaël Millière et.al. 2405.03207v1 null
2024-05-23 Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions Ruizhe Li et.al. 2405.03205v2 link
2024-05-06 Exploring the Potential of the Large Language Models (LLMs) in Identifying Misleading News Headlines Md Main Uddin Rony et.al. 2405.03153v1 null
2024-05-05 Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management Bingzhang Wang et.al. 2405.03076v1 null
2024-05-22 A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs) Lingyao Li et.al. 2405.03066v2 null
2024-05-07 Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models Tianze Xu et.al. 2405.02801v2 link
2024-05-04 TREC iKAT 2023: A Test Collection for Evaluating Conversational and Interactive Knowledge Assistants Mohammad Aliannejadi et.al. 2405.02637v1 link
2024-05-03 What does the Knowledge Neuron Thesis Have to do with Knowledge? Jingcheng Niu et.al. 2405.02421v1 link
2024-05-03 LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model Yulin Luo et.al. 2405.02363v1 null
2024-04-18 NL2FOL: Translating Natural Language to First-Order Logic for Logical Fallacy Detection Abhinav Lalwani et.al. 2405.02318v1 null
2024-05-03 Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows Jasmine Y. Shih et.al. 2405.02260v1 null
2024-05-03 Argumentative Large Language Models for Explainable and Contestable Decision-Making Gabriel Freedman et.al. 2405.02079v1 null
2024-05-02 A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law Zhiyu Zoey Chen et.al. 2405.01769v1 null
2024-05-02 ALCM: Autonomous LLM-Augmented Causal Discovery Framework Elahe Khatibi et.al. 2405.01744v1 null
2024-05-01 GOLD: Geometry Problem Solver with Natural Language Description Jiaxin Zhang et.al. 2405.00494v1 link
2024-05-01 The Pyramid of Captions Delong Chen et.al. 2405.00485v1 null
2024-05-01 CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model Wei Zhang et.al. 2405.00435v1 null
2024-04-30 PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification Leon Garza et.al. 2404.19744v1 null
2024-05-22 Neuro-Vision to Language: Enhancing Visual Reconstruction and Language Interaction through Brain Recordings Guobin Shen et.al. 2404.19438v3 null
2024-04-30 Transcrib3D: 3D Referring Expression Resolution through Large Language Models Jiading Fang et.al. 2404.19221v1 null
2024-04-29 SuperCLUE-Fin: Graded Fine-Grained Analysis of Chinese LLMs on Diverse Financial Tasks and Applications Liang Xu et.al. 2404.19063v1 null
2024-04-29 AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering Wenxiang Zhao et.al. 2404.18816v1 null
2024-04-29 PECC: Problem Extraction and Coding Challenges Patrick Haller et.al. 2404.18766v1 link
2024-04-29 HFT: Half Fine-Tuning for Large Language Models Tingfeng Hui et.al. 2404.18466v1 null
2024-04-28 Logic Agent: Enhancing Validity with Logic Rule Invocation Hanmeng Liu et.al. 2404.18130v1 null
2024-04-27 MediFact at MEDIQA-CORR 2024: Why AI Needs a Human Touch Nadia Saeed et.al. 2404.17999v1 link
2024-04-27 Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning Dapeng Li et.al. 2404.17780v1 null
2024-04-29 On the Use of Large Language Models to Generate Capability Ontologies Luis Miguel Vieira da Silva et.al. 2404.17524v2 null
2024-04-26 Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study Yang Wu et.al. 2404.17136v1 link
2024-04-25 AutoGenesisAgent: Self-Generating Multi-Agent Systems for Complex Tasks Jeremy Harper et.al. 2404.17017v1 null
2024-04-25 Evolve Cost-aware Acquisition Functions Using Large Language Models Yiming Yao et.al. 2404.16906v1 null
2024-04-11 Rumour Evaluation with Very Large Language Models Dahlia Shehata et.al. 2404.16859v1 link
2024-04-25 RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis Xiaoman Zhang et.al. 2404.16754v1 null
2024-04-25 Evolutionary Large Language Models for Hardware Security: A Comparative Survey Mohammad Akyash et.al. 2404.16651v1 null
2024-04-25 Interpreting Answers to Yes-No Questions in Dialogues from Multiple Domains Zijie Wang et.al. 2404.16262v1 link
2024-04-24 Return of EM: Entity-driven Answer Set Expansion for QA Evaluation Dongryeol Lee et.al. 2404.15650v1 null
2024-04-27 PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models Shashi Kant Gupta et.al. 2404.15549v2 null
2024-04-01 Automated Assessment of Encouragement and Warmth in Classrooms Leveraging Multimodal Emotional Features and ChatGPT Ruikun Hou et.al. 2404.15310v1 null
2024-04-23 Aligning LLM Agents by Learning Latent Preference from User Edits Ge Gao et.al. 2404.15269v1 link
2024-04-22 Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication John R. Lawson et.al. 2404.15166v1 null
2024-04-23 Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans Vittoria Dentella et.al. 2404.14883v1 null
2024-04-23 Think-Program-reCtify: 3D Situated Reasoning with Large Language Models Qingrong He et.al. 2404.14705v1 null
2024-04-26 Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training Mengzhao Jia et.al. 2404.14604v3 null
2024-04-22 Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning Mohammed Abugurain et.al. 2404.14547v1 null
2024-04-22 CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment Kanglei Zhou et.al. 2404.13999v1 link
2024-05-23 Towards General Conceptual Model Editing via Adversarial Representation Engineering Yihao Zhang et.al. 2404.13752v2 link
2024-04-21 FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization Zhaopeng Gu et.al. 2404.13671v1 null
2024-04-21 Trojan Detection in Large Language Models: Insights from The Trojan Detection Challenge Narek Maloyan et.al. 2404.13660v1 null
2024-04-21 ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval Kelong Mao et.al. 2404.13556v1 link
2024-04-20 "I Wish There Were an AI": Challenges and AI Potential in Cancer Patient-Provider Communication Ziqi Yang et.al. 2404.13409v1 null
2024-04-20 Large Language Models as Test Case Generators: Performance Evaluation and Enhancement Kefan Li et.al. 2404.13340v1 null
2024-04-19 CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models Manish Bhatt et.al. 2404.13161v1 link
2024-04-19 Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation Guanhua Chen et.al. 2404.12879v1 null
2024-04-19 Large Language Model Supply Chain: A Research Agenda Shenao Wang et.al. 2404.12736v1 null
2024-04-19 Just Like Me: The Role of Opinions and Personal Experiences in The Perception of Explanations in Subjective Decision-Making Sharon Ferguson et.al. 2404.12558v1 null
2024-04-18 BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models Yu Feng et.al. 2404.12494v1 null
2024-04-18 MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale Xiaotang Gai et.al. 2404.12372v1 null
2024-04-23 Large Language Models for Synthetic Participatory Planning of Synergistic Transportation Systems Jiangbo Yu et.al. 2404.12317v3 null
2024-04-18 Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair Yusuke Sakai et.al. 2404.12299v1 null
2024-04-18 Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM Michelle S. Lam et.al. 2404.12259v1 link
2024-04-18 EVIT: Event-Oriented Instruction Tuning for Event Reasoning Zhengwei Tao et.al. 2404.11978v1 null
2024-04-18 Aligning Language Models to Explicitly Handle Ambiguity Hyuhng Joon Kim et.al. 2404.11972v1 null
2024-04-18 Concept Induction using LLMs: a user experiment for assessment Adrita Barua et.al. 2404.11875v1 null
2024-04-17 MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory Ali Modarressi et.al. 2404.11672v1 null
2024-04-16 Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases Yanze Li et.al. 2404.10595v1 null
2024-04-16 Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning Xiao Wang et.al. 2404.10552v1 null
2024-04-15 Evolving Interpretable Visual Classifiers with Large Language Models Mia Chiquier et.al. 2404.09941v1 null
2024-04-15 Reimagining Self-Adaptation in the Age of Large Language Models Raghav Donakanti et.al. 2404.09866v1 null
2024-04-16 How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models Xiuwei Shang et.al. 2404.09836v2 null
2024-04-15 Resilience of Large Language Models for Noisy Instructions Bin Wang et.al. 2404.09754v1 null
2024-04-15 Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction David Sobrín-Hidalgo et.al. 2404.09705v1 null
2024-04-15 Bridging Vision and Language Spaces with Assignment Prediction Jungin Park et.al. 2404.09632v1 link
2024-04-15 MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems Kaixin Li et.al. 2404.09486v1 link
2024-04-14 Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions Taojun Hu et.al. 2404.09135v1 null
2024-04-17 Incremental Residual Concept Bottleneck Models Chenming Shang et.al. 2404.08978v2 null
2024-04-13 Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Mengnan Qi et.al. 2404.08885v1 null
2024-04-12 LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning Junchi Wang et.al. 2404.08767v1 link
2024-04-12 Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases Xiang Zhang et.al. 2404.08727v1 null
2024-04-05 Effects of Different Prompts on the Quality of GPT-4 Responses to Dementia Care Questions Zhuochun Li et.al. 2404.08674v1 null
2024-03-25 Linear Cross-document Event Coreference Resolution with X-AMR Shafiuddin Rehan Ahmed et.al. 2404.08656v1 link
2024-04-12 Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward Xuan Xie et.al. 2404.08517v1 null
2024-04-12 Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task Hassan Ali et.al. 2404.08424v1 null
2024-03-22 Content Knowledge Identification with Multi-Agent Large Language Models (LLMs) Kaiqi Yang et.al. 2404.07960v1 null
2024-04-11 DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation Anna C. Doris et.al. 2404.07917v1 link
2024-04-12 Reflectance Estimation for Proximity Sensing by Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Masashi Osada et.al. 2404.07717v2 link
2024-04-11 Can Large Language Models Assess Serendipity in Recommender Systems? Yu Tokutake et.al. 2404.07499v1 null
2024-04-10 Vision-Language Model-based Physical Reasoning for Robot Liquid Perception Wenqiang Lai et.al. 2404.06904v1 null
2024-04-09 Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language? Omid Ghahroodi et.al. 2404.06644v1 null
2024-04-09 Building A Knowledge Graph to Enrich ChatGPT Responses in Manufacturing Service Discovery Yunqing Li et.al. 2404.06571v1 null
2024-04-09 Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python Valdecy Pereira et.al. 2404.06370v1 link
2024-04-21 AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning Senkang Hu et.al. 2404.06345v2 null
2024-04-07 X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model Jan Held et.al. 2404.06332v1 null
2024-04-08 LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding Chuwei Luo et.al. 2404.05225v1 link
2024-04-08 LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models Shibo Hao et.al. 2404.05221v1 null
2024-04-07 Facial Affective Behavior Analysis with Instruction Tuning Yifan Li et.al. 2404.05052v1 null
2024-04-07 FRACTAL: Fine-Grained Scoring from Aggregate Text Labels Yukti Makhija et.al. 2404.04817v1 null
2024-04-06 Multicalibration for Confidence Scoring in LLMs Gianluca Detommaso et.al. 2404.04689v1 null
2024-04-06 Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology Dyke Ferber et.al. 2404.04667v1 null
2024-04-06 Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model Zhonghan Zhao et.al. 2404.04619v1 null
2024-04-05 Scope Ambiguities in Large Language Models Gaurav Kamath et.al. 2404.04332v1 link
2024-04-05 Assessing the quality of information extraction Filip Seitl et.al. 2404.04068v1 null
2024-04-04 Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph Marco Bronzini et.al. 2404.03623v1 null
2024-04-04 Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity Jake Varley et.al. 2404.03570v1 null
2024-04-03 LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models Gabriela Ben Melech Stan et.al. 2404.03118v1 null
2024-04-03 Towards a Fully Interpretable and More Scalable RSA Model for Metaphor Understanding Gaia Carenini et.al. 2404.02983v1 null
2024-04-13 Explainable Traffic Flow Prediction with Large Language Models Xusen Guo et.al. 2404.02937v3 null
2024-04-13 Toward Informal Language Processing: Knowledge of Slang in Large Language Models Zhewei Sun et.al. 2404.02323v2 null
2024-04-02 ZeroCAP: Zero-Shot Multi-Robot Context Aware Pattern Formation via Large Language Models Vishnunandan L. N. Venkatesh et.al. 2404.02318v1 null
2024-04-02 Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation Veronica Valeros et.al. 2404.01940v1 null
2024-04-02 InsightLens: Discovering and Exploring Insights from Conversational Contexts in Large-Language-Model-Powered Data Analysis Luoxuan Weng et.al. 2404.01644v1 null
2024-03-29 Wait, It's All Token Noise? Always Has Been: Interpreting LLM Behavior Using Shapley Value Behnam Mohammadi et.al. 2404.01332v1 null
2024-04-01 Chat Modeling: Natural Language-based Procedural Modeling of Biological Structures without Training Donggang Jia et.al. 2404.01063v1 null
2024-04-11 Source-Aware Training Enables Knowledge Attribution in Language Models Muhammad Khalifa et.al. 2404.01019v2 link
2024-04-01 Query Performance Prediction using Relevance Judgments Generated by Large Language Models Chuan Meng et.al. 2404.01012v1 link
2024-04-01 Exploring the Nexus of Large Language Models and Legal Systems: A Short Survey Weicong Qin et.al. 2404.00990v1 null
2024-04-12 Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing Zhenyu Qian et.al. 2404.00589v2 link
2024-03-30 PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression Muhammad Asif Ali et.al. 2404.00489v1 null
2024-03-30 Do Vision-Language Models Understand Compound Nouns? Sonal Kumar et.al. 2404.00419v1 null
2024-03-30 EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs Cheng Jiayang et.al. 2404.00209v1 link
2024-03-29 User Modeling Challenges in Interactive AI Assistant Systems Megan Su et.al. 2403.20134v1 null
2024-03-28 Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving Akshay Gopalkrishnan et.al. 2403.19838v1 link
2024-03-28 AlloyBERT: Alloy Property Prediction with Large Language Models Akshat Chaudhari et.al. 2403.19783v1 null
2024-03-28 Enhancing Anomaly Detection in Financial Markets with an LLM-based Multi-Agent Framework Taejin Park et.al. 2403.19735v1 null
2024-04-01 Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis Chenyang Liu et.al. 2403.19646v2 link
2024-03-28 Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation Yutong He et.al. 2403.19103v1 null
2024-03-27 A Survey on Large Language Models from Concept to Implementation Chen Wang et.al. 2403.18969v1 null
2024-03-27 CheckEval: Robust Evaluation Framework using Large Language Model via Checklist Yukyung Lee et.al. 2403.18771v1 null
2024-04-03 Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective Meiqi Chen et.al. 2403.18346v3 null
2024-03-27 LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models Mingxing Peng et.al. 2403.18344v1 null
2024-03-27 Can LLMs Converse Formally? Automatically Assessing LLMs in Translating and Interpreting Formal Specifications Rushang Karia et.al. 2403.18327v1 null
2024-03-26 Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications Fouad Trad et.al. 2403.17787v1 null
2024-03-25 Generation of Asset Administration Shell with Large Language Model Agents: Interoperability in Digital Twins with Semantic Node Yuchen Xia et.al. 2403.17209v1 null
2024-03-25 The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition Georgios Chochlakis et.al. 2403.17125v1 null
2024-03-25 Grounding Language Plans in Demonstrations Through Counterfactual Perturbations Yanwei Wang et.al. 2403.17124v1 null
2024-03-25 Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models Hao Shao et.al. 2403.16999v1 link
2024-03-25 PropTest: Automatic Property Testing for Improved Visual Programming Jaywon Koo et.al. 2403.16921v1 null
2024-04-22 Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography Jiayue Zhang et.al. 2403.16687v3 null
2024-03-28 Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Minyu Chen et.al. 2403.16097v2 null
2024-04-15 Computational Sentence-level Metrics Predicting Human Sentence Comprehension Kun Sun et.al. 2403.15822v2 null
2024-03-23 EDDA: A Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection Daijun Ding et.al. 2403.15715v1 link
2024-04-03 Evaluating GPT-4 with Vision on Detection of Radiological Findings on Chest Radiographs Yiliang Zhou et.al. 2403.15528v2 null
2024-03-21 Open Source Conversational LLMs do not know most Spanish words Javier Conde et.al. 2403.15491v1 null
2024-03-15 ChatPattern: Layout Pattern Customization via Natural Language Zixiao Wang et.al. 2403.15434v1 null
2024-03-22 Can large language models explore in-context? Akshay Krishnamurthy et.al. 2403.15371v1 null
2024-04-03 AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models Chaoyun Zhang et.al. 2403.15157v2 null
2024-03-22 Comprehensive Lipidomic Automation Workflow using Large Language Models Connor Beveridge et.al. 2403.15076v1 null
2024-03-21 MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Renrui Zhang et.al. 2403.14624v1 null
2024-03-21 Dermacen Analytica: A Novel Methodology Integrating Multi-Modal Large Language Models with Machine Learning in tele-dermatology Dimitrios P. Panagoulias et.al. 2403.14243v1 null
2024-04-08 MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation Longzheng Wang et.al. 2403.14171v3 link
2024-03-20 CoMo: Controllable Motion Generation through Language Guided Pose Code Editing Yiming Huang et.al. 2403.13900v1 null
2024-03-20 Encoding the Subsurface in 3D with Seismic Ben Lasscock et.al. 2403.13593v1 null
2024-03-20 IndiTag: An Online Media Bias Analysis and Annotation System Using Fine-Grained Bias Indicators Luyang Lin et.al. 2403.13446v1 link
2024-03-19 A Canary in the AI Coal Mine: American Jews May Be Disproportionately Harmed by Intellectual Property Dispossession in Large Language Model Training Heila Precel et.al. 2403.13073v1 null
2024-04-02 AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models Shuo Jiang et.al. 2403.13002v2 null
2024-03-19 Semantic Layering in Room Segmentation via LLMs Taehyeon Kim et.al. 2403.12920v1 null
2024-03-19 Pragmatic Competence Evaluation of Large Language Models for Korean Dojun Park et.al. 2403.12675v1 null
2024-04-02 Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code Andreas Florath et.al. 2403.12627v2 null
2024-03-19 AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework Xiang Li et.al. 2403.12582v1 link
2024-03-19 INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations Lirui Luo et.al. 2403.12451v1 null
2024-03-19 Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales Ayushi Nirmal et.al. 2403.12403v1 null
2024-03-19 Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models Ying-Chun Lin et.al. 2403.12388v1 null
2024-04-02 Investigating Markers and Drivers of Gender Bias in Machine Translations Peter J Barclay et.al. 2403.11896v2 null
2024-03-18 Metaphor Understanding Challenge Dataset for LLMs Xiaoyu Tong et.al. 2403.11810v1 null
2024-03-22 Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning Rao Fu et.al. 2403.11401v2 null
2024-04-10 StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows Yiran Wu et.al. 2403.11322v3 link
2024-03-17 ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models Siyuan Huang et.al. 2403.11289v1 link
2024-03-26 SelfIE: Self-Interpretation of Large Language Model Embeddings Haozhe Chen et.al. 2403.10949v2 link
2024-03-16 A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment Tianhe Wu et.al. 2403.10854v1 link
2024-03-16 LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices Jingping Nie et.al. 2403.10779v1 null
2024-03-16 NARRATE: Versatile Language Architecture for Optimal Control in Robotics Seif Ismail et.al. 2403.10762v1 null
2024-03-15 Uncovering Latent Themes of Messaging on Social Media by Integrating LLMs: A Case Study on Climate Campaigns Tunazzina Islam et.al. 2403.10707v1 null
2024-03-22 Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction Chen Chen et.al. 2403.10581v2 null
2024-03-15 TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale Pengcheng Jiang et.al. 2403.10351v1 null
2024-03-14 Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors Guanghua Li et.al. 2403.09747v1 null
2024-03-14 XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization Yequan Bie et.al. 2403.09410v1 null
2024-03-14 UniCode: Learning a Unified Codebook for Multimodal Large Language Models Sipeng Zheng et.al. 2403.09072v1 null
2024-02-21 Diet-ODIN: A Novel Framework for Opioid Misuse Detection with Interpretable Dietary Patterns Zheyuan Zhang et.al. 2403.08820v1 link
2024-03-13 A Picture Is Worth a Thousand Words: Exploring Diagram and Video-Based OOP Exercises to Counter LLM Over-Reliance Bruno Pereira Cipriano et.al. 2403.08396v1 null
2024-03-13 Embedded Translations for Low-resource Automated Glossing Changbing Yang et.al. 2403.08189v1 null
2024-03-12 NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning Bingqian Lin et.al. 2403.07376v1 link
2024-03-11 From English to ASIC: Hardware Implementation with Large Language Model Emil Goh et.al. 2403.07039v1 link
2024-03-11 Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning Zijian Zhou et.al. 2403.06728v1 null
2024-03-11 FashionReGen: LLM-Empowered Fashion Report Generation Yujuan Ding et.al. 2403.06660v1 null
2024-03-10 Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs! Huanqi Yang et.al. 2403.06201v1 null
2024-03-10 Reframe Anything: LLM Agent for Open World Video Reframing Jiawang Cao et.al. 2403.06070v1 null
2024-03-09 LEVA: Using Large Language Models to Enhance Visual Analytics Yuheng Zhao et.al. 2403.05816v1 null
2024-03-08 Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach Zhen Tan et.al. 2403.05636v1 null
2024-03-08 ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment Xiwei Hu et.al. 2403.05135v1 null
2024-03-11 Embracing Large Language and Multimodal Models for Prosthetic Technologies Sharmita Dey et.al. 2403.04974v2 null
2024-03-07 Automatic and Universal Prompt Injection Attacks against Large Language Models Xiaogeng Liu et.al. 2403.04957v1 link
2024-03-07 iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries Adam Coscia et.al. 2403.04760v1 link
2024-03-07 KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts Adam Coscia et.al. 2403.04758v1 link
2024-03-07 Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition Aneta Koleva et.al. 2403.04577v1 link
2024-03-08 Do Large Language Model Understand Multi-Intent Spoken Language? Shangjian Yin et.al. 2403.04481v2 link
2024-03-18 Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models Changjiang Gao et.al. 2403.04325v2 null
2024-03-13 Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning Deepanway Ghosal et.al. 2403.03864v3 link
2024-03-06 Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery Wei Zhang et.al. 2403.03790v1 null
2024-03-06 GPTopic: Dynamic and Interactive Topic Representations Arik Reuter et.al. 2403.03628v1 null
2024-03-06 Explaining Genetic Programming Trees using Large Language Models Paula Maddigan et.al. 2403.03397v1 null
2024-03-05 Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement Rafaela Martelo et.al. 2403.03188v1 link
2024-03-05 HINTs: Sensemaking on large collections of documents with Hypergraph visualization and INTelligent agents Sam Yu-Te Lee et.al. 2403.02752v1 null
2024-03-05 HARGPT: Are LLMs Zero-Shot Human Activity Recognizers? Sijie Ji et.al. 2403.02727v1 null
2024-03-05 Updating the Minimum Information about CLinical Artificial Intelligence (MI-CLAIM) checklist for generative modeling research Brenda Y. Miao et.al. 2403.02558v1 link
2024-03-26 FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction Alessandro Scirè et.al. 2403.02270v2 null
2024-03-04 Towards Intent-Based Network Management: Large Language Models for Intent Extraction in 5G Core Networks Dimitrios Michael Manias et.al. 2403.02238v1 null
2024-03-04 Evaluating the Explainability of Neural Rankers Saran Pandian et.al. 2403.01981v1 null
2024-03-03 Logic Rules as Explanations for Legal Case Retrieval Zhongxiang Sun et.al. 2403.01457v1 link
2024-03-02 Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers Melanie Subbiah et.al. 2403.01061v1 link
2024-03-01 Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries Zelalem Gero et.al. 2403.01002v1 link
2024-02-26 InteraRec: Interactive Recommendations Using Multimodal Large Language Models Saketh Reddy Karra et.al. 2403.00822v1 null
2024-02-25 Bootstrapping Cognitive Agents with a Large Language Model Feiyu Zhu et.al. 2403.00810v1 null
2024-02-18 Ploutos: Towards interpretable stock movement prediction with financial large language model Hanshuang Tong et.al. 2403.00782v1 null
2024-02-18 ChatDiet: Empowering Personalized Nutrition-Oriented Food Recommender Chatbots through an LLM-Augmented Framework Zhongqi Yang et.al. 2403.00781v1 null
2024-03-27 LLMs in Political Science: Heralding a New Era of Visual Analysis Yu Wang et.al. 2403.00154v2 null
2024-02-29 FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition Xiaoqiang Wang et.al. 2403.00126v1 null
2024-02-29 Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines Lijia Ma et.al. 2402.19421v1 null
2024-03-12 Data Interpreter: An LLM Agent For Data Science Sirui Hong et.al. 2402.18679v3 link
2024-02-28 Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning Jiachun Li et.al. 2402.18344v1 null
2024-02-29 MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery Feihong Lu et.al. 2402.18169v2 null
2024-02-28 Cause and Effect: Can Large Language Models Truly Understand Causality? Swagata Ashwani et.al. 2402.18139v1 null
2024-02-28 ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection Takashi Koide et.al. 2402.18093v1 null
2024-02-27 Automated Statistical Model Discovery with Language Models Michael Y. Li et.al. 2402.17879v1 null
2024-03-07 ByteComposer: a Human-like Melody Composition Method based on Language Model Agent Xia Liang et.al. 2402.17785v2 null
2024-02-27 Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data Xiao Liu et.al. 2402.17644v1 link
2024-02-27 Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides Kaikai An et.al. 2402.17531v1 null
2024-02-27 Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models Xiaolong Wang et.al. 2402.17226v1 null
2024-03-20 OSCaR: Object State Captioning and State Change Representation Nguyen Nguyen et.al. 2402.17128v3 link
2024-02-24 Enforcing Temporal Constraints on Generative Agent Behavior with Reactive Synthesis Raven Rothkopf et.al. 2402.16905v1 null
2024-02-26 Mysterious Projections: Multimodal LLMs Gain Domain-Specific Visual Capabilities Without Richer Cross-Modal Projections Gaurav Verma et.al. 2402.16832v1 null
2024-02-28 StructLM: Towards Building Generalist Models for Structured Knowledge Grounding Alex Zhuang et.al. 2402.16671v2 null
2024-03-04 Improving LLM-based Machine Translation with Systematic Self-Correction Zhaopeng Feng et.al. 2402.16379v2 link
2024-02-25 AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation Yasheng Sun et.al. 2402.16124v1 null
2024-02-25 Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression Xinze Li et.al. 2402.16058v1 link
2024-02-25 LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding Yuxuan Wang et.al. 2402.16050v1 link
2024-02-23 Language-Based User Profiles for Recommendation Joyce Zhou et.al. 2402.15623v1 null
2024-02-19 Detecting misinformation through Framing Theory: the Frame Element-based Model Guan Wang et.al. 2402.15525v1 null
2024-02-23 Explorations of Self-Repair in Language Models Cody Rushing et.al. 2402.15390v1 link
2024-02-23 Substrate Prediction for RiPP Biosynthetic Enzymes via Masked Language Modeling and Transfer Learning Joseph D. Clark et.al. 2402.15181v1 null
2024-02-23 Large Multimodal Agents: A Survey Junlin Xie et.al. 2402.15116v1 null
2024-03-08 LLMBind: A Unified Modality-Task Integration Framework Bin Zhu et.al. 2402.14891v3 null
2024-02-21 Driving Generative Agents With Their Personality Lawrence J. Klinkert et.al. 2402.14879v1 null
2024-02-20 A Dual-Prompting for Interpretable Mental Health Language Models Hyolim Jeon et.al. 2402.14854v1 null
2024-02-19 RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning Congyun Jin et.al. 2402.14840v1 null
2024-02-23 A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health Nikhil Behari et.al. 2402.14807v2 null
2024-02-22 Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation Jiawei Wang et.al. 2402.14744v1 null
2024-02-22 COMPASS: Computational Mapping of Patient-Therapist Alliance Strategies with Language Modeling Baihan Lin et.al. 2402.14701v1 null
2024-02-28 OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement Tianyu Zheng et.al. 2402.14658v2 null
2024-02-22 Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond Xinyu Wang et.al. 2402.14522v1 null
2024-02-22 Data Science with LLMs and Interpretable Models Sebastian Bordt et.al. 2402.14474v1 link
2024-02-21 MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms Yiqiao Jin et.al. 2402.14154v1 null
2024-02-21 DeiSAM: Segment Anything with Deictic Prompting Hikaru Shindo et.al. 2402.14123v1 link
2024-02-21 An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Mohammad Amaz Uddin et.al. 2402.13871v1 null
2024-02-21 LLM4SBR: A Lightweight and Effective Framework for Integrating Large Language Models in Session-based Recommendation Shutong Qiao et.al. 2402.13840v1 null
2024-03-15 CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models Fuwen Luo et.al. 2402.13607v2 null
2024-02-21 Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment Yunxin Li et.al. 2402.13561v1 null
2024-02-21 Round Trip Translation Defence against Large Language Model Jailbreaking Attacks Canaan Yung et.al. 2402.13517v1 link
2024-02-20 SymBa: Symbolic Backward Chaining for Multi-step Natural Language Reasoning Jinu Lee et.al. 2402.12806v1 null
2024-02-20 Are Large Language Models Rational Investors? Yuhang Zhou et.al. 2402.12713v1 null
2024-02-18 scInterpreter: Training Large Language Models to Interpret scRNA-seq Data for Cell Type Annotation Cong Li et.al. 2402.12405v1 null
2024-02-19 Reformatted Alignment Run-Ze Fan et.al. 2402.12219v1 link
2024-02-19 ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning Renqiu Xia et.al. 2402.12185v1 link
2024-02-19 Distilling Large Language Models for Text-Attributed Graph Learning Bo Pan et.al. 2402.12022v1 null
2024-02-25 How Interpretable are Reasoning Explanations from Prompting Large Language Models? Wei Jie Yeo et.al. 2402.11863v2 link
2024-02-22 ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs Fengqing Jiang et.al. 2402.11753v2 null
2024-02-18 A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models Jaylen Jones et.al. 2402.11676v1 link
2024-02-18 Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals Francesco Ortu et.al. 2402.11655v1 link
2024-02-17 TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks Benjamin Feuer et.al. 2402.11137v1 link
2024-02-09 Zero-shot Explainable Mental Health Analysis on Social Media by incorporating Mental Scales Wenyu Li et.al. 2402.10948v1 null
2024-02-16 How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs? Ehsan Doostmohammadi et.al. 2402.10770v1 null
2024-02-16 Inference to the Best Explanation in Large Language Models Dhairya Dalal et.al. 2402.10767v1 null
2024-02-16 Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability Haiyan Zhao et.al. 2402.10688v1 null
2024-02-16 LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models Minsuk Kahng et.al. 2402.10524v1 null
2024-02-15 OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Shubham Toshniwal et.al. 2402.10176v1 link
2024-02-15 Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States Hanyu Duan et.al. 2402.09733v1 null
2024-02-15 Answer is All You Need: Instruction-following Text Embedding via Answering the Question Letian Peng et.al. 2402.09642v1 link
2024-02-14 Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems Liang Zhang et.al. 2402.09584v1 null
2024-02-14 AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach Maryam Amirizaniani et.al. 2402.09334v1 null
2024-02-14 Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code Vahid Majdinasab et.al. 2402.09299v1 null
2024-02-14 SyntaxShap: Syntax-aware Explainability Method for Text Generation Kenza Amara et.al. 2402.09259v1 null
2024-02-14 Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models Goutham Rajendran et.al. 2402.09236v1 null
2024-02-13 Large Language Models for the Automated Analysis of Optimization Algorithms Camilo Chacón Sartori et.al. 2402.08472v1 link
2024-02-13 Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks Jusung Lee et.al. 2402.08360v1 null
2024-02-17 LLaGA: Large Language and Graph Assistant Runjin Chen et.al. 2402.08170v2 link
2024-02-25 Policy Improvement using Language Feedback Models Victor Zhong et.al. 2402.07876v3 null
2024-02-12 Game Agent Driven by Free-Form Text Command: Using LLM-based Code Generation and Behavior Branch Ray Ito et.al. 2402.07442v1 null
2024-02-14 Natural Language Reinforcement Learning Xidong Feng et.al. 2402.07157v2 null
2024-02-09 InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning Huaiyuan Ying et.al. 2402.06332v1 link
2024-02-09 ContPhy: Continuum Physical Concept Learning and Reasoning from Videos Zhicheng Zheng et.al. 2402.06119v1 null
2024-02-02 Character-based Outfit Generation with Vision-augmented Style Extraction via LLMs Najmeh Forouzandehmehr et.al. 2402.05941v1 null
2024-02-08 Driving Everywhere with Large Language Model Policy Adaptation Boyi Li et.al. 2402.05932v1 null
2024-02-05 Zero-Shot Clinical Trial Patient Matching with LLMs Michael Wornow et.al. 2402.05125v1 null
2024-02-07 Opening the AI black box: program synthesis via mechanistic interpretability Eric J. Michaud et.al. 2402.05110v1 link
2024-02-07 Improving Cross-Domain Low-Resource Text Generation through LLM Post-Editing: A Programmer-Interpreter Approach Zhuang Li et.al. 2402.04609v1 null
2024-02-06 Chatbot Meets Pipeline: Augment Large Language Model with Definite Finite Automaton Yiyou Sun et.al. 2402.04411v1 null
2024-02-06 Assured LLM-Based Software Engineering Nadia Alshahwan et.al. 2402.04380v1 null
2024-02-06 Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models David Sobrín-Hidalgo et.al. 2402.04206v1 null
2024-02-06 SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models Yichen Shi et.al. 2402.04178v1 link
2024-02-06 Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science Pengfei Liu et.al. 2402.04119v1 link
2024-02-07 Position Paper: Against Spurious Sparks $-$ Dovelating Inflated AI Claims Patrick Altmeyer et.al. 2402.03962v2 null
2024-02-06 Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience Xilin Jiang et.al. 2402.03710v1 null
2024-02-27 Distinguishing the Knowable from the Unknowable with Language Models Gustaf Ahdritz et.al. 2402.03563v2 link
2024-01-25 When Geoscience Meets Generative AI and Large Language Models: Foundations, Trends, and Future Challenges Abdenour Hadid et.al. 2402.03349v1 null
2024-03-04 English Prompts are Better for NLI-based Zero-Shot Emotion Classification than Target-Language Prompts Patrick Barreiß et.al. 2402.03223v2 null
2024-02-22 PuzzleBench: Can LLMs Solve Challenging First-Order Combinatorial Reasoning Problems? Chinmay Mittal et.al. 2402.02611v2 null
2024-02-04 Integration of cognitive tasks into artificial general intelligence test for large models Youzhi Qu et.al. 2402.02547v1 null
2024-02-03 A Data Generation Perspective to the Mechanism of In-Context Learning Haitao Mao et.al. 2402.02212v1 null
2024-02-03 Vi(E)va LLM! A Conceptual Stack for Evaluating and Interpreting Generative AI-based Visualizations Luca Podo et.al. 2402.02167v1 link
2024-02-13 PresAIse, A Prescriptive AI Solution for Enterprises Wei Sun et.al. 2402.02006v2 null
2024-02-02 The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning Daniel Cunnington et.al. 2402.01889v1 null
2024-02-06 Large Language Model Agent for Hyper-Parameter Optimization Siyi Liu et.al. 2402.01881v2 null
2024-02-02 The Political Preferences of LLMs David Rozado et.al. 2402.01789v1 null
2024-01-30 Rethinking Interpretability in the Era of Large Language Models Chandan Singh et.al. 2402.01761v1 link
2024-01-29 Compensatory Biases Under Cognitive Load: Reducing Selection Bias in Large Language Models J. E. Eicher et.al. 2402.01740v1 null
2024-01-25 ChatGPT vs Gemini vs LLaMA on Multilingual Sentiment Analysis Alessio Buscemi et.al. 2402.01715v1 null
2024-01-23 Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study Zhe He et.al. 2402.01693v1 null
2024-02-16 Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications Yuhang Zhou et.al. 2402.01681v2 null
2024-02-02 BAT: Learning to Reason about Spatial Sounds with Large Language Models Zhisheng Zheng et.al. 2402.01591v1 null
2024-02-02 From Words to Molecules: A Survey of Large Language Models in Chemistry Chang Liao et.al. 2402.01439v1 null
2024-02-02 Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis Zeeshan Rasheed et.al. 2402.01386v1 null
2024-02-02 Reasoning Capacity in Multi-Agent Systems: Limitations, Challenges and Human-Centered Solutions Pouya Pezeshkpour et.al. 2402.01108v1 null
2024-02-01 Executable Code Actions Elicit Better LLM Agents Xingyao Wang et.al. 2402.01030v1 link
2024-02-01 Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement Xin Quan et.al. 2402.00745v1 link
2024-02-01 Transforming and Combining Rewards for Aligning Large Language Models Zihao Wang et.al. 2402.00742v1 null
2024-02-01 AssertLLM: Generating and Evaluating Hardware Verification Assertions from Design Specifications via Multi-LLMs Wenji Fang et.al. 2402.00386v1 null
2024-02-01 IndiVec: An Exploration of Leveraging Large Language Models for Media Bias Detection with Fine-Grained Bias Indicators Luyang Lin et.al. 2402.00345v1 null
2024-02-01 Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning Yao-Hung Hubert Tsai et.al. 2402.00251v1 null
2024-01-31 Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT Diego Machado Reyes et.al. 2402.00137v1 null
2024-01-31 ChIRAAG: ChatGPT Informed Rapid and Automated Assertion Generation Bhabesh Mali et.al. 2402.00093v1 null
2024-02-07 Detecting Multimedia Generated by Large AI Models: A Survey Li Lin et.al. 2402.00045v3 link
2024-01-21 Training microrobots to swim by a large language model Zhuoqun Xu et.al. 2402.00044v1 null
2024-02-05 Comparative Analysis of LLaMA and ChatGPT Embeddings for Molecule Embedding Shaghayegh Sadeghi et.al. 2402.00024v2 link
2024-02-03 EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation Jonathan W. Kim et.al. 2401.18006v2 null
2024-01-31 Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study Qirui Jiao et.al. 2401.17981v1 null
2024-01-31 Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction Philipp Wicke et.al. 2401.17858v1 null
2024-01-30 Detecting mental disorder on social media: a ChatGPT-augmented explainable approach Loris Belcastro et.al. 2401.17477v1 link
2024-02-05 EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain Wei Zhang et.al. 2401.16822v2 null
2024-01-30 A Cross-Language Investigation into Jailbreak Attacks in Large Language Models Jie Li et.al. 2401.16765v1 null
2024-02-03 Engineering A Large Language Model From Scratch Abiodun Finbarrs Oketunji et.al. 2401.16736v3 null
2024-01-29 Probabilistic Abduction for Visual Abstract Reasoning via Learning Rules in Vector-symbolic Architectures Michael Hersche et.al. 2401.16024v1 link
2024-01-29 APIGen: Generative API Method Recommendation Yujia Chen et.al. 2401.15843v1 link
2024-02-12 Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks Zackary Okun Dunivin et.al. 2401.15170v2 null
2024-01-26 Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias Yu He Ke et.al. 2401.14589v1 null
2024-01-25 LongHealth: A Question Answering Benchmark with Long Clinical Documents Lisa Adams et.al. 2401.14490v1 link
2024-01-25 GPTVoiceTasker: LLM-Powered Virtual Assistant for Smartphone Minh Duc Vu et.al. 2401.14268v1 null
2024-01-25 CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks Andrei Tomut et.al. 2401.14109v1 null
2024-01-25 A Survey of Deep Learning and Foundation Models for Time Series Forecasting John A. Miller et.al. 2401.13912v1 null
2024-01-24 AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents Chang Ma et.al. 2401.13178v1 link
2024-01-23 From Understanding to Utilization: A Survey on Explainability for Large Language Models Haoyan Luo et.al. 2401.12874v1 null
2024-01-23 How well can large language models explain business processes? Dirk Fahland et.al. 2401.12846v1 null
2024-01-27 C2Ideas: Supporting Creative Interior Color Design Ideation with Large Language Model Yihan Hou et.al. 2401.12586v2 null
2024-01-30 SLANG: New Concept Comprehension of Large Language Models Lingrui Mei et.al. 2401.12585v2 null
2024-01-23 LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools Qianli Wang et.al. 2401.12576v1 link
2024-01-23 Automated Fact-Checking of Climate Change Claims with Large Language Models Markus Leippold et.al. 2401.12566v1 null
2024-01-22 CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Zhihong Chen et.al. 2401.12208v1 null
2024-01-21 Integration of Large Language Models in Control of EHD Pumps for Precise Color Synthesis Yanhong Peng et.al. 2401.11500v1 null
2024-01-18 LangProp: A code optimization framework using Language Models applied to driving Shu Ishida et.al. 2401.10314v1 link
2024-01-18 Advancing Large Multi-modal Models with Explicit Chain-of-Reasoning and Visual Question Generation Kohei Uehara et.al. 2401.10005v1 null
2024-01-18 Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models Li Sun et.al. 2401.09861v1 null
2024-01-17 Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models Haonan Guo et.al. 2401.09083v1 link
2024-01-17 What makes for a 'good' social actor? Using respect as a lens to evaluate interactions with language agents Lize Alberts et.al. 2401.09082v1 null
2024-01-16 AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media Alessandro Gambetti et.al. 2401.08825v1 null
2024-01-15 Assistant, Parrot, or Colonizing Loudspeaker? ChatGPT Metaphors for Developing Critical AI Literacies Anuj Gupta et.al. 2401.08711v1 null
2024-01-16 Anchor function: a type of benchmark functions for studying language models Zhongwang Zhang et.al. 2401.08309v1 null
2024-01-16 AesBench: An Expert Benchmark for Multimodal Large Language Models on Image Aesthetics Perception Yipo Huang et.al. 2401.08276v1 link
2024-01-16 LLM-Guided Multi-View Hypergraph Learning for Human-Centric Explainable Recommendation Zhixuan Chu et.al. 2401.08217v1 null
2024-02-16 MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline Minpeng Liao et.al. 2401.08190v2 link
2024-02-15 Are self-explanations from Large Language Models faithful? Andreas Madsen et.al. 2401.07927v3 link
2024-01-17 See the Unseen: Better Context-Consistent Knowledge-Editing by Noises Youcheng Huang et.al. 2401.07544v2 null
2024-01-12 Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data Yubin Kim et.al. 2401.06866v1 null
2024-01-12 Enhancing the Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought Zaijing Li et.al. 2401.06836v1 null
2024-01-12 From Automation to Augmentation: Large Language Models Elevating Essay Scoring Landscape Changrong Xiao et.al. 2401.06431v1 link
2024-01-23 How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs Yi Zeng et.al. 2401.06373v2 link
2024-01-12 Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models Asma Ghandeharioun et.al. 2401.06102v2 null
2024-01-11 Large Language Models vs. Search Engines: Evaluating User Preferences Across Varied Information Retrieval Scenarios Kevin Matthe Caramancion et.al. 2401.05761v1 null
2024-01-11 Towards Conversational Diagnostic AI Tao Tu et.al. 2401.05654v1 null
2024-01-17 Theory of Mind abilities of Large Language Models in Human-Robot Interaction : An Illusion? Mudit Verma et.al. 2401.05302v2 null
2024-01-10 Aligning Translation-Specific Understanding to General Understanding in Large Language Models Yichong Huang et.al. 2401.05072v1 null
2024-01-10 ANGO: A Next-Level Evaluation Benchmark For Generation-Oriented Language Models In Chinese Domain Bingchao Wang et.al. 2401.04898v1 null
2024-01-08 Evaluating Brain-Inspired Modular Training in Automated Circuit Discovery for Mechanistic Interpretability Jatin Nainani et.al. 2401.03646v1 null
2024-01-05 UMIE: Unified Multimodal Information Extraction with Instruction Tuning Lin Sun et.al. 2401.03082v1 link
2024-02-01 Object-Centric Instruction Augmentation for Robotic Manipulation Junjie Wen et.al. 2401.02814v2 null
2024-02-06 VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model Pengying Wu et.al. 2401.02695v2 null
2024-01-05 Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks Hartwig H. Hochmair et.al. 2401.02404v2 null
2024-01-04 DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models Wendi Cui et.al. 2401.02132v1 link
2024-01-03 Large Language Models Relearn Removed Concepts Michelle Lo et.al. 2401.01814v1 link
2024-01-12 WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope Jun-Yan He et.al. 2401.01699v2 null
2024-01-02 VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics Ammar A. Siddiqui et.al. 2401.01414v1 null
2024-01-02 A Novel Evaluation Framework for Assessing Resilience Against Prompt Injection Attacks in Large Language Models Daniel Wankit Yip et.al. 2401.00991v1 null
2023-12-31 AllSpark: a multimodal spatiotemporal general model Run Shao et.al. 2401.00546v1 null
2023-12-31 keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM Chaojie Wang et.al. 2401.00426v1 null
2024-01-12 Advancing TTP Analysis: Harnessing the Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation Reza Fayyazi et.al. 2401.00280v2 null
2023-12-30 Is Knowledge All Large Language Models Needed for Causal Reasoning? Hengrui Cai et.al. 2401.00139v1 link
2023-12-27 Conversational Question Answering with Reformulations over Knowledge Graph Lihui Liu et.al. 2312.17269v1 null
2023-12-29 Large Language Model for Causal Decision Making Haitao Jiang et.al. 2312.17122v2 null
2023-12-27 Rethinking Tabular Data Understanding with Large Language Models Tianyang Liu et.al. 2312.16702v1 link
2023-12-26 Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers Jacob Dunefsky et.al. 2312.16291v1 link
2023-12-26 Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models Fan Liu et.al. 2312.16275v1 null
2023-12-26 Large Language Models as Traffic Signal Control Agents: Capacity and Opportunity Siqi Lai et.al. 2312.16044v1 link
2024-01-29 ChartBench: A Benchmark for Complex Visual Reasoning in Charts Zhengzhuo Xu et.al. 2312.15915v2 null
2023-12-26 Think and Retrieval: A Hypothesis Knowledge Graph Enhanced Medical Large Language Models Xinke Jiang et.al. 2312.15883v1 null
2023-12-22 Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention Zhen Tan et.al. 2312.15033v1 null
2023-12-22 Theory of Hallucinations based on Equivariance Hisaichi Shibata et.al. 2312.14504v1 null
2023-12-22 Don't Believe Everything You Read: Enhancing Summarization Interpretability through Automatic Identification of Hallucinations in Large Language Models Priyesh Vakharia et.al. 2312.14346v1 null
2023-12-19 Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning Xiaodan Zhang et.al. 2312.14184v1 null
2023-12-21 Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs Juraj Vladika et.al. 2312.13881v1 null
2023-12-21 A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties Junfei Xiao et.al. 2312.13764v1 link
2023-12-20 ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training Rongsheng Wang et.al. 2312.13316v1 link
2023-12-21 AMD:Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion Beibei Jing et.al. 2312.12763v2 null
2023-12-21 A Case Study on Test Case Construction with Large Language Models: Unveiling Practical Insights and Challenges Roberto Francisco de Lima Junior et.al. 2312.12598v2 null
2024-01-30 Locating Factual Knowledge in Large Language Models: Exploring the Residual Stream and Analyzing Subvalues in Vocabulary Space Zeping Yu et.al. 2312.12141v2 null
2023-12-19 Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach Weiyu Ma et.al. 2312.11865v1 link
2023-12-16 Learning Interpretable Queries for Explainable Image Classification with Information Pursuit Stefan Kolek et.al. 2312.11548v1 null
2023-12-22 A mathematical perspective on Transformers Borjan Geshkovski et.al. 2312.10794v2 link
2023-12-17 kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning Wenting Zhao et.al. 2312.10771v1 null
2023-12-17 Knowledge Trees: Gradient Boosting Decision Trees on Knowledge Neurons as Probing Classifier Sergey A. Saltykov et.al. 2312.10746v1 null
2023-12-17 Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression Luis Balderas et.al. 2312.10702v1 null
2023-12-16 Continuous Prompt Generation from Linear Combination of Discrete Prompt Embeddings Pascal Passigan et.al. 2312.10323v1 null
2023-12-23 Shedding Light on Software Engineering-specific Metaphors and Idioms Mia Mohammad Imran et.al. 2312.10297v2 link
2023-12-15 A Review of Repository Level Prompting for LLMs Douglas Schonholtz et.al. 2312.10101v1 null
2023-12-04 Generative AI in Writing Research Papers: A New Type of Algorithmic Bias and Uncertainty in Scholarly Work Rishab Jain et.al. 2312.10057v1 null
2023-12-15 Neurosymbolic Value-Inspired AI (Why, What, and How) Amit Sheth et.al. 2312.09928v1 null
2023-12-15 GPT-4 Surpassing Human Performance in Linguistic Pragmatics Ljubisa Bojic et.al. 2312.09545v1 null
2023-12-14 Large Language Models for Autonomous Driving: Real-World Experiments Can Cui et.al. 2312.09397v1 null
2023-12-14 Successor Heads: Recurring, Interpretable Attention Heads In The Wild Rhys Gould et.al. 2312.09230v1 null
2023-12-14 Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models Zhiyuan You et.al. 2312.08962v1 null
2023-12-14 Learning Safety Constraints From Demonstration Using One-Class Decision Trees Mattijs Baert et.al. 2312.08837v1 null
2023-12-13 Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning Jinta Weng et.al. 2312.08027v1 null
2023-12-07 Large Language Models for Intent-Driven Session Recommendations Zhu Sun et.al. 2312.07552v1 link
2023-12-12 Efficiently Programming Large Language Models using SGLang Lianmin Zheng et.al. 2312.07104v1 link
2023-12-12 Towards Enhanced Human Activity Recognition through Natural Language Generation and Pose Estimation Nikhil Kashyap et.al. 2312.06965v1 null
2023-12-27 Steering Llama 2 via Contrastive Activation Addition Nina Rimsky et.al. 2312.06681v2 link
2023-12-11 AnyHome: Open-Vocabulary Generation of Structured and Textured 3D Homes Zehao Wen et.al. 2312.06644v1 null
2023-12-11 DiffVL: Scaling Up Soft Body Manipulation using Vision-Language Driven Differentiable Physics Zhiao Huang et.al. 2312.06408v1 null
2023-12-11 GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models Jiaxu Zhao et.al. 2312.06315v1 null
2023-12-11 ProtoCode: Leveraging Large Language Models for Automated Generation of Machine-Readable Protocols from Scientific Publications Shuo Jiang et.al. 2312.06241v1 null
2023-12-10 Evidence-based Interpretable Open-domain Fact-checking with Large Language Models Xin Tan et.al. 2312.05834v1 null
2023-12-19 Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning Subhabrata Dutta et.al. 2312.05571v2 link
2023-12-09 Image and Data Mining in Reticular Chemistry Using GPT-4V Zhiling Zheng et.al. 2312.05468v1 null
2023-12-09 Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation Atoosa Chegini et.al. 2312.05464v1 null
2023-12-08 GlitchBench: Can large multimodal models detect video game glitches? Mohammad Reza Taesiri et.al. 2312.05291v1 null
2023-12-08 Retrieval-based Video Language Model for Efficient Long Video Question Answering Jiaqi Xu et.al. 2312.04931v1 null
2023-12-08 Ophtha-LLaMA2: A Large Language Model for Ophthalmology Huan Zhao et.al. 2312.04906v1 null
2024-01-10 KwaiAgents: Generalized Information-seeking Agent System with Large Language Models Haojie Pan et.al. 2312.04889v3 link
2023-12-07 AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making Shusen Liu et.al. 2312.04494v1 null
2023-12-07 LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs Yunsheng Ma et.al. 2312.04372v1 null
2023-12-27 Towards Knowledge-driven Autonomous Driving Xin Li et.al. 2312.04316v3 link
2023-12-07 Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models Yijie Zhang et.al. 2312.04019v1 null
2023-12-05 How should the advent of large language models affect the practice of science? Marcel Binz et.al. 2312.03759v1 null
2023-12-04 Near-real-time Earthquake-induced Fatality Estimation using Crowdsourced Data and Large-Language Models Chenguang Wang et.al. 2312.03755v1 null
2023-12-08 Methods to Estimate Large Language Model Confidence Maia Kotelanski et.al. 2312.03733v2 null
2023-12-06 GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models Haicheng Liao et.al. 2312.03543v1 link
2023-12-05 FlexModel: A Framework for Interpretability of Distributed Large Language Models Matthew Choi et.al. 2312.03140v1 link
2023-12-07 Evaluating Agents using Social Choice Theory Marc Lanctot et.al. 2312.03121v2 link
2023-12-05 Breast Ultrasound Report Generation using LangChain Jaeyoung Huh et.al. 2312.03013v1 null
2023-12-05 Harmonizing Global Voices: Culturally-Aware Models for Enhanced Content Moderation Alex J. Chan et.al. 2312.02401v1 null
2023-12-04 LLMs Accelerate Annotation for Medical Information Extraction Akshay Goel et.al. 2312.02296v1 null
2023-12-04 Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition Chengyou Jia et.al. 2312.02226v1 null
2023-11-28 Training Chain-of-Thought via Latent-Variable Inference Du Phan et.al. 2312.02179v1 null
2023-12-04 Learning Machine Morality through Experience and Interaction Elizaveta Tennant et.al. 2312.01818v1 null
2023-12-26 Jellyfish: A Large Language Model for Data Preprocessing Haochen Zhang et.al. 2312.01678v3 null
2023-12-11 Characterizing Large Language Model Geometry Solves Toxicity Detection and Generation Randall Balestriero et.al. 2312.01648v2 link
2023-12-04 The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning Bill Yuchen Lin et.al. 2312.01552v1 null
2023-12-03 SAGE: Bridging Semantic and Actionable Parts for GEneralizable Articulated-Object Manipulation under Language Instructions Haoran Geng et.al. 2312.01307v1 null
2023-12-03 TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents James Enouen et.al. 2312.01279v1 null
2023-12-02 From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews Alex Liu et.al. 2312.01202v1 null
2023-12-01 Leveraging Large Language Models to Improve REST API Testing Myeongsoo Kim et.al. 2312.00894v1 null
2023-12-18 Empowering Autonomous Driving with Large Language Models: A Safety Perspective Yixuan Wang et.al. 2312.00812v3 null
2023-11-30 Towards Accurate Differential Diagnosis with Large Language Models Daniel McDuff et.al. 2312.00164v1 null
2023-11-30 PoseGPT: Chatting about 3D Human Pose Yao Feng et.al. 2311.18836v1 null
2023-11-30 CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation Zineng Tang et.al. 2311.18775v1 null
2023-12-05 AlignBench: Benchmarking Chinese Alignment of Large Language Models Xiao Liu et.al. 2311.18743v3 link
2023-11-30 Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent Yuxiao Chen et.al. 2311.18307v1 null
2023-11-29 Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Andrea W Wen-Yi et.al. 2311.18034v1 link
2023-11-28 Unlocking Spatial Comprehension in Text-to-Image Diffusion Models Mohammad Mahdi Derakhshani et.al. 2311.17937v1 null
2023-11-29 VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following Yujie Lu et.al. 2311.17647v1 null
2023-11-29 Exploring Large Language Models for Human Mobility Prediction under Public Events Yuebing Liang et.al. 2311.17351v1 null
2023-11-29 Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering Zeqing Wang et.al. 2311.17331v1 null
2023-11-28 Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis Xiaohui Chen et.al. 2311.17126v1 null
2023-11-30 Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following Yutong Feng et.al. 2311.17002v2 null
2023-12-27 StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models Kazuki Yamauchi et.al. 2311.16509v2 null
2023-12-10 LLMGA: Multimodal Large Language Model based Generation Assistant Bin Xia et.al. 2311.16500v2 link
2023-11-27 ChartLlama: A Multimodal LLM for Chart Understanding and Generation Yucheng Han et.al. 2311.16483v1 null
2023-11-27 Have we built machines that think like people? Luca M. Schulze Buschoff et.al. 2311.16093v1 link
2023-11-27 Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models Stephen MacNeil et.al. 2311.16017v1 null
2023-11-27 Sparsify-then-Classify: From Internal Neurons of Large Language Models To Efficient Text Classifiers Yilun Liu et.al. 2311.15983v1 link
2023-11-27 Dawning of a New Era in Gravitational Wave Data Analysis: Unveiling Cosmic Mysteries via Artificial Intelligence -- A Systematic Review Tianyu Zhao et.al. 2311.15585v1 null
2023-12-03 See and Think: Embodied Agent in Virtual Environment Zhonghan Zhao et.al. 2311.15209v2 null
2023-11-25 Localizing Lying in Llama: Understanding Instructed Dishonesty on True-False Questions Through Prompting, Probing, and Patching James Campbell et.al. 2311.15131v1 null
2023-11-19 Zero-Shot Question Answering over Financial Documents using Large Language Models Karmvir Singh Phogat et.al. 2311.14722v1 null
2023-11-24 Benchmarking Large Language Models for Log Analysis, Security, and Interpretation Egil Karlsen et.al. 2311.14519v1 null
2023-11-30 A density estimation perspective on learning from pairwise human preferences Vincent Dumoulin et.al. 2311.14115v2 link
2023-11-23 Towards Explainable Strategy Templates using NLP Transformers Pallavi Bagga et.al. 2311.14061v1 null
2023-11-23 Challenges of Large Language Models for Mental Health Counseling Neo Christopher Chung et.al. 2311.13857v1 null
2023-12-03 FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design Yangyang Yu et.al. 2311.13743v2 link
2023-11-22 Vamos: Versatile Action Models for Video Understanding Shijie Wang et.al. 2311.13627v1 null
2023-11-22 ADriver-I: A General World Model for Autonomous Driving Fan Jia et.al. 2311.13549v1 null
2023-12-15 Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs Yonghui Wang et.al. 2311.13194v2 link
2023-11-25 From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models Zachary Englhardt et.al. 2311.13063v2 null
2023-11-21 ALPHA: AnomaLous Physiological Health Assessment Using Large Language Models Jiankai Tang et.al. 2311.12524v1 link
2023-11-21 Adapting LLMs for Efficient, Personalized Information Retrieval: Methods and Implications Samira Ghodratnama et.al. 2311.12287v1 null
2023-11-20 Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents Zhuosheng Zhang et.al. 2311.11797v1 link
2023-11-20 Incorporating LLM Priors into Tabular Learners Max Zhu et.al. 2311.11628v1 null
2023-11-20 GPT in Data Science: A Practical Exploration of Model Selection Nathalia Nascimento et.al. 2311.11516v1 null
2023-11-20 Meta Prompting for AGI Systems Yifan Zhang et.al. 2311.11482v1 link
2023-12-17 Rethinking Large Language Models in Mental Health Applications Shaoxiong Ji et.al. 2311.11267v2 null
2023-11-18 Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models Haoran Zhao et.al. 2311.11012v1 null
2023-11-18 RecExplainer: Aligning Large Language Models for Recommendation Model Interpretability Yuxuan Lei et.al. 2311.10947v1 null
2023-11-17 Flexible Model Interpretability through Natural Language Model Editing Karel D'Oosterlinck et.al. 2311.10905v1 null
2023-11-27 A Language Agent for Autonomous Driving Jiageng Mao et.al. 2311.10813v3 link
2023-11-15 MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning Fuxiao Liu et.al. 2311.10774v1 link
2023-11-16 MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning Xiangru Tang et.al. 2311.10537v1 link
2023-11-16 Interpreting User Requests in the Context of Natural Language Standing Instructions Nikita Moghe et.al. 2311.09796v1 null
2023-11-16 On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering Linyong Nan et.al. 2311.09721v1 null
2023-11-16 Evaluating In-Context Learning of Libraries for Code Generation Arkil Patel et.al. 2311.09635v1 null
2023-11-16 Efficient End-to-End Visual Document Understanding with Rationale Distillation Wang Zhu et.al. 2311.09612v1 null
2023-11-16 Pachinko: Patching Interpretable QA Models through Natural Language Feedback Chaitanya Malaviya et.al. 2311.09558v1 link
2023-11-09 Chain of Images for Intuitively Reasoning Fanxu Meng et.al. 2311.09241v1 link
2023-11-15 TableLlama: Towards Open Large Generalist Models for Tables Tianshu Zhang et.al. 2311.09206v1 null
2023-11-15 MELA: Multilingual Evaluation of Linguistic Acceptability Ziyin Zhang et.al. 2311.09033v1 null
2023-11-15 Identifying Linear Relational Concepts in Large Language Models David Chanin et.al. 2311.08968v1 null
2023-11-15 I Was Blind but Now I See: Implementing Vision-Enabled Dialogue in Social Robots Giulio Antonio Abbo et.al. 2311.08957v1 null
2023-11-15 HELLaMA: LLaMA-based Table to Text Generation by Highlighting the Important Evidence Junyi Bian et.al. 2311.08896v1 null
2023-11-15 Token Prediction as Implicit Classification to Identify LLM-Generated Text Yutian Chen et.al. 2311.08723v1 link
2023-11-15 Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling Bairu Hou et.al. 2311.08718v1 link
2023-11-15 XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making Zichen Chen et.al. 2311.08614v1 null
2023-11-15 Navigating the Ocean of Biases: Political Bias Attribution in Language Models via Causal Structures David F. Jenny et.al. 2311.08605v1 link
2023-11-14 Towards Evaluating AI Systems for Moral Status Using Self-Reports Ethan Perez et.al. 2311.08576v1 null
2023-11-14 Taxonomy, Semantic Data Schema, and Schema Alignment for Open Data in Urban Building Energy Modeling Liang Zhang et.al. 2311.08535v1 null
2023-11-14 Plum: Prompt Learning using Metaheuristic Rui Pan et.al. 2311.08364v1 link
2023-11-14 Human-Centric Autonomous Systems With LLMs for User Command Reasoning Yi Yang et.al. 2311.08206v1 link
2023-11-11 Conceptual Model Interpreter for Large Language Models Felix Härer et.al. 2311.07605v1 link
2023-11-13 It's Not Easy Being Wrong: Evaluating Process of Elimination Reasoning in Large Language Models Nishant Balepur et.al. 2311.07532v1 link
2023-11-13 Finding and Editing Multi-Modal Neurons in Pre-Trained Transformer Haowen Pan et.al. 2311.07470v1 null
2023-11-13 On Measuring Faithfulness of Natural Language Explanations Letitia Parcalabescu et.al. 2311.07466v1 link
2023-11-13 Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models Junpeng Li et.al. 2311.07314v1 null
2023-11-12 Assessing the Interpretability of Programmatic Policies with Large Language Models Zahra Bashir et.al. 2311.06979v1 null
2023-11-12 Simulating Public Administration Crisis: A Novel Generative Agent-Based Simulation System to Lower Technology Barriers in Social Science Research Bushi Xiao et.al. 2311.06957v1 null
2023-11-10 ChatGPT in the context of precision agriculture data analytics Ilyas Potamitis et.al. 2311.06390v1 link
2023-11-09 Deep Natural Language Feature Learning for Interpretable Prediction Felipe Urrutia et.al. 2311.05754v1 link
2023-11-09 Do personality tests generalize to Large Language Models? Florian E. Dorner et.al. 2311.05297v1 null
2023-11-02 Chain of Empathy: Enhancing Empathetic Response of Large Language Models Based on Psychotherapy Models Yoon Kyung Lee et.al. 2311.04915v1 null
2023-11-08 SEMQA: Semi-Extractive Multi-Source Question Answering Tal Schuster et.al. 2311.04886v1 link
2023-11-07 Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning Sai Munikoti et.al. 2311.04348v1 null
2023-11-07 Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves Yihe Deng et.al. 2311.04205v1 link
2023-11-07 Perturbed examples reveal invariances shared by language models Ruchit Rawal et.al. 2311.04166v1 null
2023-11-07 Extracting human interpretable structure-property relationships in chemistry using XAI and large language models Geemi P. Wellawatte et.al. 2311.04047v1 link
2023-11-07 Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models Yichao Cao et.al. 2311.03799v1 link
2023-11-07 Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning Ruosen Li et.al. 2311.03734v1 link
2023-11-07 The Linear Representation Hypothesis and the Geometry of Large Language Models Kiho Park et.al. 2311.03658v1 link
2023-11-06 Beyond Words: A Mathematical Framework for Interpreting Large Language Models Javier González et.al. 2311.03033v1 null
2023-11-06 QualEval: Qualitative Evaluation for Model Improvement Vishvak Murahari et.al. 2311.02807v1 link
2023-11-03 Don't Make Your LLM an Evaluation Benchmark Cheater Kun Zhou et.al. 2311.01964v1 null
2023-11-06 Large Language Models to the Rescue: Reducing the Complexity in Scientific Workflow Development Using ChatGPT Mario Sänger et.al. 2311.01825v2 null
2023-11-12 Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models Sean Xie et.al. 2311.01732v2 link
2023-11-02 TopicGPT: A Prompt-based Topic Modeling Framework Chau Minh Pham et.al. 2311.01449v1 link
2023-11-02 REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots Andrea Tagliabue et.al. 2311.01403v1 null
2023-11-02 Revisiting the Knowledge Injection Frameworks Peng Fu et.al. 2311.01150v1 null
2023-11-02 Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game Sam Toyer et.al. 2311.01011v1 null
2023-11-02 Vision-Language Interpreter for Robot Task Planning Keisuke Shirai et.al. 2311.00967v1 link
2023-11-02 M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place Wentao Yuan et.al. 2311.00926v1 null
2023-11-01 Emotion Detection for Misinformation: A Review Zhiwei Liu et.al. 2311.00671v1 null
2023-11-01 De-Diffusion Makes Text a Strong Cross-Modal Interface Chen Wei et.al. 2311.00618v1 null
2023-11-01 The Mystery and Fascination of LLMs: A Comprehensive Survey on the Interpretation and Analysis of Emergent Abilities Yuxiang Zhou et.al. 2311.00237v1 null
2023-11-01 Is GPT Powerful Enough to Analyze the Emotions of Memes? Jingjing Wang et.al. 2311.00223v1 null
2023-10-31 Large Language Model Can Interpret Latent Space of Sequential Recommender Zhengyi Yang et.al. 2310.20487v1 link
2023-10-31 The SourceData-NLP dataset: integrating curation into scientific publishing for training large language models Jorge Abreu-Vicente et.al. 2310.20440v1 link
2023-10-30 Generative retrieval-augmented ontologic graph and multi-agent strategies for interpretive large language model-based materials design Markus J. Buehler et.al. 2310.19998v1 null
2023-10-30 GPCR-BERT: Interpreting Sequential Design of G Protein Coupled Receptors Using Protein Language Models Seongwon Kim et.al. 2310.19915v1 null

LLM - Reasoning

Publish Date Title Authors PDF Code
2024-07-24 Grammar-based Game Description Generation using Large Language Models Tsunehiko Tanaka et.al. 2407.17404v1 null
2024-07-24 Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching Yuyang Ding et.al. 2407.17349v1 null
2024-07-24 LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover Zijian Wu et.al. 2407.17227v1 null
2024-07-24 Fusing LLMs and KGs for Formal Causal Reasoning behind Financial Risk Contagion Guanyuan Yu et.al. 2407.17190v1 null
2024-07-24 Reinforced Prompt Personalization for Recommendation with Large Language Models Wenyu Mao et.al. 2407.17115v1 link
2024-07-24 A Voter-Based Stochastic Rejection-Method Framework for Asymptotically Safe Language Model Outputs Jake R. Watts et.al. 2407.16994v1 null
2024-07-24 ScholarChemQA: Unveiling the Power of Language Models in Chemical Research Question Answering Xiuying Chen et.al. 2407.16931v1 null
2024-07-23 CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs Jihyung Kil et.al. 2407.16837v1 link
2024-07-23 PreAlign: Boosting Cross-Lingual Transfer by Early Establishment of Multilingual Alignment Jiahuan Li et.al. 2407.16222v1 null
2024-07-23 Figure it Out: Analyzing-based Jailbreak Attack on Large Language Models Shi Lin et.al. 2407.16205v1 null
2024-07-23 UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models Liu Qi et.al. 2407.16160v1 null
2024-07-22 Enhancing Temporal Understanding in LLMs for Semi-structured Tables Irwin Deng et.al. 2407.16030v1 null
2024-07-22 Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability Zhuoyan Xu et.al. 2407.15720v1 link
2024-07-22 CrashEventLLM: Predicting System Crashes with Large Language Models Priyanka Mudgal et.al. 2407.15716v1 null
2024-07-22 HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning Zhecan Wang et.al. 2407.15680v1 null
2024-07-22 Dissecting Multiplication in Transformers: Insights into LLMs Luyu Qiu et.al. 2407.15360v1 null
2024-07-21 Evidence-Based Temporal Fact Verification Anab Maulana Barik et.al. 2407.15291v1 null
2024-07-21 MIBench: Evaluating Multimodal Large Language Models over Multiple Images Haowei Liu et.al. 2407.15272v1 null
2024-07-21 Multi-Agent Causal Discovery Using Large Language Models Hao Duong Le et.al. 2407.15073v1 null
2024-07-22 Knowledge Mechanisms in Large Language Models: A Survey and Perspective Mengru Wang et.al. 2407.15017v1 null
2024-07-20 Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data Antonis Antoniades et.al. 2407.14985v1 null
2024-07-20 TraveLLM: Could you plan my new public transit route in face of a network disruption? Bowen Fang et.al. 2407.14926v1 null
2024-07-20 Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models Ze Yu Zhang et.al. 2407.14845v1 null
2024-07-20 Can VLMs be used on videos for action recognition? LLMs are Visual Reasoning Coordinators Harsh Lunia et.al. 2407.14834v1 null
2024-07-20 On the Design and Analysis of LLM-Based Algorithms Yanxi Chen et.al. 2407.14788v1 link
2024-07-19 Adversarial Databases Improve Success in Retrieval-based Large Language Models Sean Wu et.al. 2407.14609v1 null
2024-07-18 Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Though Xiaoyu Tan et.al. 2407.14562v1 null
2024-07-19 Internal Consistency and Self-Feedback in Large Language Models: A Survey Xun Liang et.al. 2407.14507v1 link
2024-07-19 On Pre-training of Multimodal Language Models Customized for Chart Understanding Wan-Cyuan Fan et.al. 2407.14506v1 null
2024-07-18 ViLLa: Video Reasoning Segmentation with Large Language Model Rongkun Zheng et.al. 2407.14500v1 link
2024-07-19 Evaluating the Reliability of Self-Explanations in Large Language Models Korbinian Randl et.al. 2407.14487v1 link
2024-07-19 OpenSU3D: Open World 3D Scene Understanding using Foundation Models Rafay Mohiuddin et.al. 2407.14279v1 null
2024-07-19 LeKUBE: A Legal Knowledge Update BEnchmark Changyue Wang et.al. 2407.14192v1 null
2024-07-19 Visual Text Generation in the Wild Yuanzhi Zhu et.al. 2407.14138v1 link
2024-07-19 Enhancing Data-Limited Graph Neural Networks by Actively Distilling Knowledge from Large Language Models Quan Li et.al. 2407.13989v1 null
2024-07-18 Werewolf Arena: A Case Study in LLM Evaluation via Social Deduction Suma Bailis et.al. 2407.13943v1 null
2024-07-18 PRAGyan -- Connecting the Dots in Tweets Rahul Ravi et.al. 2407.13909v1 null
2024-07-18 X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs Sirnam Swetha et.al. 2407.13851v1 null
2024-07-18 Which objects help me to act effectively? Reasoning about physically-grounded affordances Anne Kemmeren et.al. 2407.13811v1 null
2024-07-18 SegPoint: Segment Any Point Cloud via Large Language Model Shuting He et.al. 2407.13761v1 null
2024-07-18 Prover-Verifier Games improve legibility of LLM outputs Jan Hendrik Kirchner et.al. 2407.13692v1 null
2024-07-18 Weak-to-Strong Reasoning Yuqing Yang et.al. 2407.13647v1 link
2024-07-18 KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration Youfu Yan et.al. 2407.13598v1 null
2024-07-18 Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation Hassan Ali et.al. 2407.13505v1 null
2024-07-18 Combining Constraint Programming Reasoning with Large Language Model Predictions Florian Régin et.al. 2407.13490v1 null
2024-07-18 BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models Moon Ye-Bin et.al. 2407.13442v1 null
2024-07-18 Reconstruct the Pruned Model without Any Retraining Pingjie Wang et.al. 2407.13331v1 null
2024-07-18 CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis Junying Chen et.al. 2407.13301v1 null
2024-07-18 Are Large Language Models Capable of Generating Human-Level Narratives? Yufei Tian et.al. 2407.13248v1 null
2024-07-18 Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data Wufei Ma et.al. 2407.13094v1 null
2024-07-17 Leveraging Environment Interaction for Automated PDDL Generation and Planning with Large Language Models Sadegh Mahdavi et.al. 2407.12979v1 null
2024-07-16 BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval Hongjin Su et.al. 2407.12883v1 null
2024-07-16 Large Visual-Language Models Are Also Good Classifiers: A Study of In-Context Multimodal Fake News Detection Ye Jiang et.al. 2407.12879v1 null
2024-07-16 Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning Yaswanth Narsupalli et.al. 2407.12877v1 null
2024-07-12 Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models Jung Hyun Lee et.al. 2407.12863v1 null
2024-07-10 Analyzing Large language models chatbots: An experimental approach using a probability test Melise Peruchini et.al. 2407.12862v1 null
2024-07-17 Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? Ben Yao et.al. 2407.12725v1 null
2024-07-17 Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models Xihe Qiu et.al. 2407.12532v1 null
2024-07-17 Struct-X: Enhancing Large Language Models Reasoning with Structured Data Xiaoyu Tan et.al. 2407.12522v1 null
2024-07-17 Case2Code: Learning Inductive Reasoning with Synthetic Data Yunfan Shao et.al. 2407.12504v1 link
2024-07-17 Evaluating Linguistic Capabilities of Multimodal LLMs in the Lens of Few-Shot Learning Mustafa Dogan et.al. 2407.12498v1 null
2024-07-17 F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions Jie Yang et.al. 2407.12435v1 null
2024-07-17 TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish Arda Yüksel et.al. 2407.12402v1 null
2024-07-17 Mamba-PTQ: Outlier Channels in Recurrent Large Language Models Alessandro Pierro et.al. 2407.12397v1 null
2024-07-17 NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models Gengze Zhou et.al. 2407.12366v1 link
2024-07-17 LLM-based query paraphrasing for video search Jiaxin Wu et.al. 2407.12341v1 null
2024-07-16 Private prediction for large-scale synthetic text generation Kareem Amin et.al. 2407.12108v1 null
2024-07-16 Better RAG using Relevant Information Gain Marc Pickett et.al. 2407.12101v1 link
2024-07-16 NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Mo Li et.al. 2407.11963v1 link
2024-07-17 Harnessing Large Language Models for Multimodal Product Bundling Xiaohao Liu et.al. 2407.11712v2 null
2024-07-16 A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting He Chang et.al. 2407.11638v1 null
2024-07-16 Reasoning with Large Language Models, a Survey Aske Plaat et.al. 2407.11511v1 null
2024-07-16 SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions Shicheng Liu et.al. 2407.11417v1 null
2024-07-19 Reliable Reasoning Beyond Natural Language Nasim Borazjanizadeh et.al. 2407.11373v2 null
2024-07-16 VISA: Reasoning Video Object Segmentation via Large Language Models Cilin Yan et.al. 2407.11325v1 link
2024-07-15 Making New Connections: LLMs as Puzzle Generators for The New York Times' Connections Word Game Tim Merino et.al. 2407.11240v1 null
2024-07-17 Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay Gonçalo Hora de Carvalho et.al. 2407.11068v2 link
2024-07-15 Can Textual Semantics Mitigate Sounding Object Segmentation Preference? Yaoting Wang et.al. 2407.10947v1 link
2024-07-15 Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval Shengjie Ma et.al. 2407.10805v1 null
2024-07-15 Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping Wenhao Zhu et.al. 2407.10795v1 link
2024-07-15 Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education Rui Yang et.al. 2407.10794v1 link
2024-07-16 Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning Yulong Wang et.al. 2407.10718v2 link
2024-07-18 Qwen2 Technical Report An Yang et.al. 2407.10671v3 link
2024-07-17 LAB-Bench: Measuring Capabilities of Language Models for Biology Research Jon M. Laurent et.al. 2407.10362v3 null
2024-07-20 Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models Yuchen Yang et.al. 2407.10299v2 link
2024-07-14 GenSco: Can Question Decomposition based Passage Alignment improve Question Answering? Barah Fazili et.al. 2407.10245v1 null
2024-07-20 BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs Zhiting Fan et.al. 2407.10241v2 null
2024-07-22 Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model Xunyu Zhu et.al. 2407.10167v2 null
2024-07-14 ChatLogic: Integrating Logic Programming with Large Language Models for Multi-Step Reasoning Zhongsheng Wang et.al. 2407.10162v1 link
2024-07-19 Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine Omid Rohanian et.al. 2407.10086v2 null
2024-07-14 All Roads Lead to Rome: Unveiling the Trajectory of Recommender Systems Across the LLM Era Bo Chen et.al. 2407.10081v1 null
2024-07-13 Benchmarking LLMs for Optimization Modeling and Enhancing Reasoning via Reverse Socratic Synthesis Zhicheng Yang et.al. 2407.09887v1 link
2024-07-13 IoT-LM: Large Multisensory Language Models for the Internet of Things Shentong Mo et.al. 2407.09801v1 link
2024-07-17 Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study Yulong Yang et.al. 2407.09295v2 null
2024-07-17 Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models Dong Shu et.al. 2407.09292v2 null
2024-07-12 Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning Thuy Ngoc Nguyen et.al. 2407.09281v1 null
2024-07-12 Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors Nico Daheim et.al. 2407.09136v1 link
2024-07-12 STD-LLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with LLMs Yiheng Huang et.al. 2407.09096v1 null
2024-07-12 SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Yuzhang Tian et.al. 2407.09025v1 null
2024-07-12 Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures? Yingming Pu et.al. 2407.08922v1 link
2024-07-11 Evaluating Nuanced Bias in Large Language Model Free Response Answers Jennifer Healey et.al. 2407.08842v1 null
2024-07-11 MAVIS: Mathematical Visual Instruction Tuning Renrui Zhang et.al. 2407.08739v1 link
2024-07-11 Real-Time Anomaly Detection and Reactive Planning with Large Language Models Rohan Sinha et.al. 2407.08735v1 null
2024-07-11 Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist Zihao Zhou et.al. 2407.08733v1 null
2024-07-11 GTA: A Benchmark for General Tool Agents Jize Wang et.al. 2407.08713v1 link
2024-07-11 Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight Zhiqiang Xie et.al. 2407.08694v1 null
2024-07-15 Emergent Visual-Semantic Hierarchies in Image-Text Representations Morris Alper et.al. 2407.08521v2 null
2024-07-16 Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents Haoyi Xiong et.al. 2407.08516v2 null
2024-07-11 Investigating LLMs as Voting Assistants via Contextual Augmentation: A Case Study on the European Parliament Elections 2024 Ilias Chalkidis et.al. 2407.08495v1 null
2024-07-11 Lynx: An Open Source Hallucination Evaluation Model Selvan Sunitha Ravi et.al. 2407.08488v1 null
2024-07-17 Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On Liang Zeng et.al. 2407.08348v2 null
2024-07-12 RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL Zhenhe Wu et.al. 2407.08273v2 null
2024-07-16 Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding Minghui Wu et.al. 2407.08150v2 null
2024-07-10 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang et.al. 2407.08044v1 link
2024-07-10 A Critical Review of Causal Reasoning Benchmarks for Large Language Models Linying Yang et.al. 2407.08029v1 null
2024-07-04 CaseGPT: a case reasoning framework based on language models and retrieval-augmented generation Rui Yang et.al. 2407.07913v1 null
2024-07-12 A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends Daizong Liu et.al. 2407.07403v2 link
2024-07-10 LokiLM: Technical Report Justin Kiefel et.al. 2407.07370v1 null
2024-07-10 Interpretable Differential Diagnosis with Dual-Inference Large Language Models Shuang Zhou et.al. 2407.07330v1 null
2024-07-10 Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Wenqi Zhang et.al. 2407.07053v2 link
2024-07-09 Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective Shahana Ibrahim et.al. 2407.06902v1 null
2024-07-08 A Single Transformer for Scalable Vision-Language Modeling Yangyi Chen et.al. 2407.06438v1 link
2024-07-08 Multimodal Chain-of-Thought Reasoning via ChatGPT to Protect Children from Age-Inappropriate Apps Chuanbo Hu et.al. 2407.06309v1 null
2024-07-08 CodeUpdateArena: Benchmarking Knowledge Editing on API Updates Zeyu Leo Liu et.al. 2407.06249v1 null
2024-07-08 SimPal: Towards a Meta-Conversational Framework to Understand Teacher's Instructional Goals for K-12 Physics Effat Farhana et.al. 2407.06241v1 null
2024-07-08 Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision Orr Zohar et.al. 2407.06189v1 link
2024-07-08 iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement Aoyu Pang et.al. 2407.06025v1 link
2024-07-09 Distilling System 2 into System 1 Ping Yu et.al. 2407.06023v2 null
2024-07-08 Towards Optimizing and Evaluating a Retrieval Augmented QA Chatbot using LLMs with Human in the Loop Anum Afzal et.al. 2407.05925v1 null
2024-07-08 When is the consistent prediction likely to be a correct prediction? Alex Nguyen et.al. 2407.05778v1 null
2024-07-08 Large Language Models Understand Layouts Weiming Li et.al. 2407.05750v1 null
2024-07-08 Empirical Study of Symmetrical Reasoning in Conversational Chatbots Daniela N. Rim et.al. 2407.05734v1 null
2024-07-08 Retrieved In-Context Principles from Previous Mistakes Hao Sun et.al. 2407.05682v1 null
2024-07-07 Training Task Experts through Retrieval Based Distillation Jiaxin Ge et.al. 2407.05463v1 null
2024-07-07 LTLBench: Towards Benchmarks for Evaluating Temporal Logic Reasoning in Large Language Models Weizhi Tang et.al. 2407.05434v1 link
2024-07-10 SBoRA: Low-Rank Adaptation with Regional Weight Updates Lai-Man Po et.al. 2407.05413v2 link
2024-07-07 ElecBench: a Power Dispatch Evaluation Benchmark for Large Language Models Xiyuan Zhou et.al. 2407.05365v1 link
2024-07-07 VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool Yan Wang et.al. 2407.05355v1 null
2024-07-07 WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks Léo Boisvert et.al. 2407.05291v1 link
2024-07-07 Beyond Binary Gender Labels: Revealing Gender Biases in LLMs through Gender-Neutral Name Predictions Zhiwen You et.al. 2407.05271v1 link
2024-07-06 Lucy: Think and Reason to Solve Text-to-SQL Nina Narodytska et.al. 2407.05153v1 null
2024-07-06 Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns? Kuei-Chun Kao et.al. 2407.05134v1 null
2024-07-06 Progress or Regress? Self-Improvement Reversal in Post-training Ting Wu et.al. 2407.05013v1 null
2024-07-06 LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts Yijia Xiao et.al. 2407.04973v1 link
2024-07-06 MemoCRS: Memory-enhanced Sequential Conversational Recommender Systems with Large Language Models Yunjia Xi et.al. 2407.04960v1 link
2024-07-06 Safe Generative Chats in a WhatsApp Intelligent Tutoring System Zachary Levonian et.al. 2407.04915v1 null
2024-07-06 Algorithmic Language Models with Neurally Compiled Libraries Lucas Saldyt et.al. 2407.04899v1 null
2024-07-12 On scalable oversight with weak LLMs judging strong LLMs Zachary Kenton et.al. 2407.04622v2 null
2024-07-05 Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model Duy M. H. Nguyen et.al. 2407.04489v1 null
2024-07-05 cosmosage: A Natural-Language Assistant for Cosmologists Tijmen de Haan et.al. 2407.04420v1 link
2024-07-05 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Petr Anokhin et.al. 2407.04363v1 link
2024-07-05 Towards Context-aware Support for Color Vision Deficiency: An Approach Integrating LLM and AR Shogo Morita et.al. 2407.04362v1 null
2024-07-05 WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning Yiheng Li et.al. 2407.04281v1 null
2024-07-09 DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Chengpeng Li et.al. 2407.04078v2 link
2024-07-04 Semantic Graphs for Syntactic Simplification: A Revisit from the Age of LLM Peiran Yao et.al. 2407.04067v1 link
2024-07-04 A Survey on Natural Language Counterfactual Generation Yongjie Wang et.al. 2407.03993v1 null
2024-07-04 MobileExperts: A Dynamic Tool-Enabled Agent Team in Mobile Devices Jiayi Zhang et.al. 2407.03913v1 null
2024-07-04 From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI Stefanie Krause et.al. 2407.03778v1 null
2024-07-04 STOC-TOT: Stochastic Tree-of-Thought with Constrained Decoding for Complex Reasoning in Multi-Hop Question Answering Zhenyu Bi et.al. 2407.03687v1 null
2024-07-04 Improving Self Consistency in LLMs through Probabilistic Tokenization Ashutosh Sathe et.al. 2407.03678v1 null
2024-07-14 Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction Amanda Dsouza et.al. 2407.03651v2 link
2024-07-04 Visualizing Dialogues: Enhancing Image Selection through Dialogue Understanding with Large Language Models Chang-Sheng Kao et.al. 2407.03615v1 link
2024-07-03 UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization Md Nayem Uddin et.al. 2407.03525v1 null
2024-07-03 On Large Language Models in National Security Applications William N. Caballero et.al. 2407.03453v1 null
2024-07-03 How Does Quantization Affect Multilingual LLMs? Kelly Marchisio et.al. 2407.03211v1 null
2024-07-03 TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts Ruida Wang et.al. 2407.03203v1 link
2024-07-03 Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models Haritz Puerto et.al. 2407.03181v1 link
2024-07-03 Investigating Decoder-only Large Language Models for Speech-to-text Translation Chao-Wei Huang et.al. 2407.03169v1 null
2024-07-03 Social Bias Evaluation for Large Language Models Requires Prompt Variations Rem Hida et.al. 2407.03129v1 link
2024-07-03 ALTER: Augmentation for Large-Table-Based Reasoning Han Zhang et.al. 2407.03061v1 link
2024-07-03 Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering Zhaohe Liao et.al. 2407.03008v1 null
2024-07-03 SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research Meghal Dani et.al. 2407.03004v1 null
2024-07-03 Large Language Models as Evaluators for Scientific Synthesis Julia Evans et.al. 2407.02977v1 null
2024-07-03 FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Xiaochen Wang et.al. 2407.02964v1 null
2024-07-03 GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models Zike Yuan et.al. 2407.02936v1 link
2024-07-03 LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation Hongke Zhao et.al. 2407.02833v1 null
2024-07-02 Reasoning in Large Language Models: A Geometric Perspective Romain Cosentino et.al. 2407.02678v1 null
2024-07-02 An AI-Based System Utilizing IoT-Enabled Ambient Sensors and LLMs for Complex Activity Tracking Yuan Sun et.al. 2407.02606v1 null
2024-07-02 Open Scene Graphs for Open World Object-Goal Navigation Joel Loo et.al. 2407.02473v1 null
2024-07-02 TokenPacker: Efficient Visual Projector for Multimodal LLM Wentong Li et.al. 2407.02392v1 link
2024-07-02 Generative Large Language Models in Automated Fact-Checking: A Survey Ivan Vykopal et.al. 2407.02351v1 null
2024-07-02 RVISA: Reasoning and Verification for Implicit Sentiment Analysis Wenna Lai et.al. 2407.02340v1 null
2024-07-02 Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks Adrian Rebmann et.al. 2407.02310v1 link
2024-07-02 Multilingual Trolley Problems for Language Models Zhijing Jin et.al. 2407.02273v1 link
2024-07-04 Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models Xiangrui Kong et.al. 2407.02220v2 null
2024-07-02 Automatic Adaptation Rule Optimization via Large Language Models Yusei Ishimizu et.al. 2407.02203v1 null
2024-07-02 Is Your Large Language Model Knowledgeable or a Choices-Only Cheater? Nishant Balepur et.al. 2407.01992v1 null
2024-07-04 Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction Chenlong Deng et.al. 2407.01964v3 link
2024-07-02 Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness Khyathi Raghavi Chandu et.al. 2407.01942v1 null
2024-07-02 GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning Zhisheng Tang et.al. 2407.01892v1 link
2024-07-01 DiscoveryBench: Towards Data-Driven Discovery with Large Language Models Bodhisattwa Prasad Majumder et.al. 2407.01725v1 link
2024-07-01 Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning Akshara Prabhakar et.al. 2407.01687v1 link
2024-07-01 KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches Jiayi Yuan et.al. 2407.01527v1 null
2024-07-02 Empowering 3D Visual Grounding with Reasoning Capabilities Chenming Zhu et.al. 2407.01525v2 null
2024-07-01 TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind Guiyang Hou et.al. 2407.01455v1 null
2024-07-01 MIRAI: Evaluating LLM Agents for Event Forecasting Chenchen Ye et.al. 2407.01231v1 null
2024-07-01 EconNLI: Evaluating Large Language Models on Economics Reasoning Yue Guo et.al. 2407.01212v1 link
2024-07-01 IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation Senyu Han et.al. 2407.01093v1 link
2024-07-03 FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in Large Language Models Yiyuan Li et.al. 2407.01046v2 link
2024-07-01 DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models Jiabao Pan et.al. 2407.01009v1 null
2024-07-01 Data on the Move: Traffic-Oriented Data Trading Platform Powered by AI Agent with Common Sense Yi Yu et.al. 2407.00995v1 null
2024-07-01 Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents Shihan Deng et.al. 2407.00993v1 null
2024-07-01 Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving Ran Tian et.al. 2407.00959v1 null
2024-07-01 MalAlgoQA: A Pedagogical Approach for Evaluating Counterfactual Reasoning Abilities Naiming Liu et.al. 2407.00938v1 null
2024-07-01 MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula Shubhra Mishra et.al. 2407.00900v1 link
2024-07-01 Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks Yue Zhou et.al. 2407.00869v1 null
2024-07-02 Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning Zimu Lu et.al. 2407.00782v2 link
2024-06-30 Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs Yifei Zhang et.al. 2407.00653v1 null
2024-06-29 LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement Jiahao Ying et.al. 2407.00497v1 null
2024-06-29 MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation Jinsheng Huang et.al. 2407.00468v1 link
2024-06-29 Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs Tamzeed Mahfuz et.al. 2407.00416v1 null
2024-06-29 Advancing Process Verification for Large Language Models via Tree-Based Preference Learning Mingqian He et.al. 2407.00390v1 null
2024-06-28 Evaluating Human Alignment and Model Faithfulness of LLM Rationale Mohsen Fayyaz et.al. 2407.00219v1 null
2024-06-27 From Efficient Multimodal Models to World Models: A Survey Xinji Mai et.al. 2407.00118v1 null
2024-06-26 Visual Reasoning and Multi-Agent Approach in Multimodal Large Language Models (MLLMs): Solving TSP and mTSP Combinatorial Challenges Mohammed Elhenawy et.al. 2407.00092v1 null
2024-06-28 LLaRA: Supercharging Robot Learning Data for Vision-Language Policy Xiang Li et.al. 2406.20095v1 link
2024-06-28 Scaling Synthetic Data Creation with 1,000,000,000 Personas Xin Chan et.al. 2406.20094v1 link
2024-06-28 Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language Yicheng Chen et.al. 2406.20085v1 null
2024-07-02 BMW Agents -- A Framework For Task Automation Through Multi-Agent Collaboration Noel Crawford et.al. 2406.20041v3 null
2024-06-28 ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models Yuxiang Zhang et.al. 2406.20015v1 link
2024-06-28 Into the Unknown: Generating Geospatial Descriptions for New Environments Tzuf Paz-Argaman et.al. 2406.19967v1 null
2024-06-28 BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering Zheng Chu et.al. 2406.19820v1 null
2024-06-28 Belief Revision: The Adaptability of Large Language Models Reasoning Bryan Wilie et.al. 2406.19764v1 null
2024-07-02 ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning Christopher E. Mower et.al. 2406.19741v2 link
2024-06-28 MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics? Jinming Li et.al. 2406.19693v1 null
2024-06-27 Rethinking harmless refusals when fine-tuning foundation models Florin Pop et.al. 2406.19552v1 null
2024-06-27 Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations Ritam Dutt et.al. 2406.19545v1 link
2024-06-27 Context Matters: An Empirical Study of the Impact of Contextual Information in Temporal Question Answering Systems Dan Schumacher et.al. 2406.19538v1 null
2024-07-04 Using Large Language Models to Assist Video Content Analysis: An Exploratory Study of Short Videos on Depression Jiaying Liu et.al. 2406.19528v2 null
2024-06-27 Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning Miyoung Ko et.al. 2406.19502v1 link
2024-07-02 ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos Jr-Jen Chen et.al. 2406.19392v2 link
2024-06-27 From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data Zheyang Xiong et.al. 2406.19292v1 null
2024-06-27 Aligning Teacher with Student Preferences for Tailored Training Data Generation Yantao Liu et.al. 2406.19227v1 null
2024-06-27 Towards Learning Abductive Reasoning using VSA Distributed Representations Giacomo Camposampiero et.al. 2406.19121v1 link
2024-06-27 STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis Wenbin Li et.al. 2406.19065v1 link
2024-06-28 UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models Siyuan Wu et.al. 2406.18966v2 link
2024-06-27 Disentangling Knowledge-based and Visual Reasoning by Question Decomposition in KB-VQA Elham J. Barezi et.al. 2406.18839v1 null
2024-06-26 Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism Shi Zong et.al. 2406.18762v1 null
2024-06-26 Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models Georgios Tziafas et.al. 2406.18746v1 null
2024-07-01 Towards Open-World Grasping with Large Vision-Language Models Georgios Tziafas et.al. 2406.18722v2 null
2024-06-26 Learning to Correct for QA Reasoning with Black-box LLMs Jaehyung Kim et.al. 2406.18695v1 link
2024-06-26 Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Guanting Dong et.al. 2406.18676v1 link
2024-06-26 Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Xin Lai et.al. 2406.18629v1 link
2024-06-26 An LLM-based Knowledge Synthesis and Scientific Reasoning Framework for Biomedical Discovery Oskar Wysocki et.al. 2406.18626v1 null
2024-06-26 CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Zirui Wang et.al. 2406.18521v1 link
2024-06-26 Mental Modeling of Reinforcement Learning Agents by Language Models Wenhao Lu et.al. 2406.18505v1 null
2024-06-26 MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data Meng Fang et.al. 2406.18321v1 null
2024-06-26 AI-native Memory: A Pathway from LLMs Towards AGI Jingbo Shang et.al. 2406.18312v1 null
2024-06-26 SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding Zhenglin Wang et.al. 2406.18200v1 null
2024-06-26 Knowledge Graph Enhanced Retrieval-Augmented Generation for Failure Mode and Effects Analysis Lukas Bahr et.al. 2406.18114v1 link
2024-06-26 Multi-step Knowledge Retrieval and Inference over Unstructured Data Aditya Kalyanpur et.al. 2406.17987v1 null
2024-06-25 NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization Md Mahadi Hasan Nahid et.al. 2406.17961v1 null
2024-06-25 Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback Zhongtao Miao et.al. 2406.17873v1 link
2024-06-22 MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries? Xirui Li et.al. 2406.17806v1 null
2024-06-25 LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic Aditya Kalyanpur et.al. 2406.17663v1 null
2024-06-25 Banishing LLM Hallucinations Requires Rethinking Generalization Johnny Li et.al. 2406.17642v1 null
2024-06-25 "Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations? Beiduo Chen et.al. 2406.17600v1 null
2024-06-26 LongIns: A Challenging Long-context Instruction-based Exam for LLMs Shawn Gavin et.al. 2406.17588v2 null
2024-06-25 Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats Ryan Pavlich et.al. 2406.17574v1 null
2024-06-25 The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Guilherme Penedo et.al. 2406.17557v1 null
2024-06-25 Tell Me Where You Are: Multimodal LLMs Meet Place Recognition Zonglin Lyu et.al. 2406.17520v1 null
2024-06-25 Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA Minzheng Wang et.al. 2406.17419v1 link
2024-06-25 Leveraging LLMs for Dialogue Quality Measurement Jinghan Jia et.al. 2406.17304v1 null
2024-06-26 Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models Wenhao Shi et.al. 2406.17294v2 link
2024-06-25 DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph Zhehao Zhang et.al. 2406.17271v1 link
2024-06-24 CogExplore: Contextual Exploration with Language-Encoded Environment Representations Harel Biggie et.al. 2406.17180v1 null
2024-06-24 Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models Nisarg Patel et.al. 2406.17169v1 link
2024-06-24 USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations Mounika Marreddy et.al. 2406.16833v1 null
2024-06-25 Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs Ashwinee Panda et.al. 2406.16797v2 link
2024-06-24 Scaling Laws for Linear Complexity Language Models Xuyang Shen et.al. 2406.16690v1 link
2024-06-24 Large Language Models Are Cross-Lingual Knowledge-Free Reasoners Peng Hu et.al. 2406.16655v1 link
2024-06-25 OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer Lu Zhang et.al. 2406.16620v2 null
2024-06-24 Evaluating the Ability of Large Language Models to Reason about Cardinal Directions Anthony G Cohn et.al. 2406.16528v1 null
2024-06-24 eagerlearners at SemEval2024 Task 5: The Legal Argument Reasoning Task in Civil Procedure Hoorieh Sabzevari et.al. 2406.16490v1 link
2024-06-24 Evaluating and Analyzing Relationship Hallucinations in LVLMs Mingrui Wu et.al. 2406.16449v1 link
2024-06-29 EmoLLM: Multimodal Emotional Understanding Meets Large Language Models Qu Yang et.al. 2406.16442v2 link
2024-06-24 UniCoder: Scaling Code Large Language Model via Universal Code Tao Sun et.al. 2406.16441v1 null
2024-06-24 Anomaly Detection of Tabular Data Using LLMs Aodong Li et.al. 2406.16308v1 null
2024-06-23 GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets Qiming Wu et.al. 2406.16176v1 null
2024-06-23 Chain-of-Probe: Examing the Necessity and Accuracy of CoT Step-by-Step Zezhong Wang et.al. 2406.16144v1 null
2024-06-23 PORT: Preference Optimization on Reasoning Traces Salem Lahlou et.al. 2406.16061v1 null
2024-06-23 Can LLM Graph Reasoning Generalize beyond Pattern Memorization? Yizhuo Zhang et.al. 2406.15992v1 null
2024-06-26 BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Terry Yue Zhuo et.al. 2406.15877v2 link
2024-06-30 LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning Guangsi Shi et.al. 2406.15859v2 null
2024-06-22 MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception Guanqun Wang et.al. 2406.15768v1 null
2024-06-22 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models Guangzhi Sun et.al. 2406.15704v1 link
2024-06-21 Robust Reinforcement Learning from Corrupted Human Feedback Alexander Bukharin et.al. 2406.15568v1 null
2024-06-18 On the Principles behind Opinion Dynamics in Multi-Agent Systems of Large Language Models Pedro Cisneros-Velarde et.al. 2406.15492v1 null
2024-06-21 Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network Badr AlKhamissi et.al. 2406.15109v1 link
2024-06-21 MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens Yongqi Fan et.al. 2406.15019v1 link
2024-06-21 Do Large Language Models Exhibit Cognitive Dissonance? Studying the Difference Between Revealed Beliefs and Stated Answers Manuel Mondal et.al. 2406.14986v1 null
2024-06-21 ICLEval: Evaluating In-Context Learning Ability of Large Language Models Wentong Chen et.al. 2406.14955v1 link
2024-06-21 Autonomous Agents for Collaborative Task under Information Asymmetry Wei Liu et.al. 2406.14928v1 link
2024-06-21 Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video Zhengbang Yang et.al. 2406.14877v1 null
2024-06-21 DistiLRR: Transferring Code Repair for Low-Resource Programming Languages Kyle Wong et.al. 2406.14867v1 link
2024-06-21 Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models Jiayu Wang et.al. 2406.14852v1 null
2024-06-20 ACR: A Benchmark for Automatic Cohort Retrieval Dung Ngoc Thai et.al. 2406.14780v1 null
2024-06-20 A Learn-Then-Reason Model Towards Generalization in Knowledge Base Question Answering Lingxi Zhang et.al. 2406.14763v1 null
2024-06-20 Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task? Zhiqiang Pi et.al. 2406.14737v1 null
2024-06-20 Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell Taiming Lu et.al. 2406.14673v1 link
2024-06-20 HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation Jin Wang et.al. 2406.14655v1 null
2024-06-20 Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities Sachit Menon et.al. 2406.14562v1 null
2024-06-21 Asynchronous Large Language Model Enhanced Planner for Autonomous Driving Yuan Chen et.al. 2406.14556v2 null
2024-06-20 Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data Johannes Treutlein et.al. 2406.14546v1 link
2024-06-20 Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs Yuxuan Qiao et.al. 2406.14544v1 link
2024-06-25 SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages Gayane Ghazaryan et.al. 2406.14425v2 null
2024-06-20 The neural correlates of logical-mathematical symbol systems processing resemble that of spatial cognition more than natural language processing Yuannan Li et.al. 2406.14358v1 null
2024-06-20 medIKAL: Integrating Knowledge Graphs as Assistants of LLMs for Enhanced Clinical Diagnosis on EMRs Mingyi Jia et.al. 2406.14326v1 null
2024-06-27 Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning Chaojie Wang et.al. 2406.14283v3 null
2024-06-20 SeCoKD: Aligning Large Language Models for In-Context Learning with Fewer Shots Weixing Wang et.al. 2406.14208v1 null
2024-06-20 Timo: Towards Better Temporal Reasoning for Language Models Zhaochen Su et.al. 2406.14192v1 link
2024-06-20 Definition generation for lexical semantic change detection Mariia Fedorova et.al. 2406.14167v1 link
2024-07-01 Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration Haokun Liu et.al. 2406.14097v2 null
2024-06-20 MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models Zhongshen Zeng et.al. 2406.13975v1 null
2024-06-20 Causal Inference with Latent Variables: Recent Advances and Future Prospectives Yaochen Zhu et.al. 2406.13966v1 null
2024-06-20 CityGPT: Empowering Urban Spatial Cognition of Large Language Models Jie Feng et.al. 2406.13948v1 null
2024-06-20 AspirinSum: an Aspect-based utility-preserved de-identification Summarization framework Ya-Lun Li et.al. 2406.13947v1 null
2024-06-19 Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events Mohammad Abu Tami et.al. 2406.13894v1 null
2024-06-19 Adaptable Logical Control for Large Language Models Honghua Zhang et.al. 2406.13892v1 link
2024-06-19 Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning Yuval Shalev et.al. 2406.13858v1 null
2024-06-27 Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning? Nirjhor Rouf et.al. 2406.13808v3 null
2024-06-19 WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia Yufang Hou et.al. 2406.13805v1 null
2024-06-19 Semantic Structure-Mapping in LLM and Human Analogical Reasoning Sam Musker et.al. 2406.13803v1 link
2024-06-19 Can LLMs Reason in the Wild with Programs? Yuan Yang et.al. 2406.13764v1 link
2024-06-19 Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models Zhawnen Chen et.al. 2406.13763v1 null
2024-06-19 Improving Visual Commonsense in Language Models via Multiple Image Generation Guy Yariv et.al. 2406.13621v1 link
2024-06-27 VDebugger: Harnessing Execution Feedback for Debugging Visual Programs Xueqing Wu et.al. 2406.13444v2 link
2024-06-19 Finding Blind Spots in Evaluator LLMs with Interpretable Checklists Sumanth Doddapaneni et.al. 2406.13439v1 link
2024-06-19 MoreHopQA: More Than Multi-hop Reasoning Julian Schnitzler et.al. 2406.13397v1 link
2024-06-19 ALiiCE: Evaluating Positional Fine-grained Citation Generation Yilong Xu et.al. 2406.13375v1 null
2024-06-19 Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets Lucas Druart et.al. 2406.13269v1 null
2024-06-19 Bridging Law and Data: Augmenting Reasoning via a Semi-Structured Dataset with IRAC methodology Xiaoxi Kang et.al. 2406.13217v1 null
2024-06-19 Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata Mykhailo Poliakov et.al. 2406.13213v1 link
2024-06-19 DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents Jiho Kim et.al. 2406.13144v1 link
2024-06-19 Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation Yuhang Zhou et.al. 2406.13114v1 null
2024-06-18 Assessing AI vs Human-Authored Spear Phishing SMS Attacks: An Empirical Study Using the TRAPD Method Jerson Francia et.al. 2406.13049v1 null
2024-06-18 MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction Yuyan Liu et.al. 2406.12950v1 link
2024-06-18 DrVideo: Document Retrieval Based Long Video Understanding Ziyu Ma et.al. 2406.12846v1 null
2024-06-18 LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation Seyedarmin Azizi et.al. 2406.12832v1 link
2024-06-18 UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions Xunzhi Wang et.al. 2406.12784v1 link
2024-06-18 Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries Eden Biran et.al. 2406.12775v1 link
2024-06-18 OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Zhen Huang et.al. 2406.12753v1 link
2024-06-18 Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning Bingchen Zhao et.al. 2406.12742v1 link
2024-06-18 MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL Arian Askari et.al. 2406.12692v1 null
2024-06-18 DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence? Zhouhong Gu et.al. 2406.12641v1 link
2024-06-18 Ask-before-Plan: Proactive Language Agents for Real-World Planning Xuan Zhang et.al. 2406.12639v1 link
2024-06-18 Large Language Models based Multi-Agent Framework for Objective Oriented Control Design in Power Electronics Chenggang Cui et.al. 2406.12628v1 null
2024-06-18 Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges Aman Singh Thakur et.al. 2406.12624v1 null
2024-06-18 Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling Yao-Ching Yu et.al. 2406.12585v1 link
2024-06-19 Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models Eldar Kurtic et.al. 2406.12572v2 link
2024-06-18 Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models Philipp Mondorf et.al. 2406.12546v1 null
2024-06-18 LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation Yuhao Wang et.al. 2406.12529v1 null
2024-06-18 LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization Masafumi Enomoto et.al. 2406.12494v1 null
2024-06-18 RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding Linrui Xu et.al. 2406.12479v1 link
2024-06-18 IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models Qiyao Wang et.al. 2406.12386v1 link
2024-06-18 Problem-Solving in Language Model Networks Ciaran Regan et.al. 2406.12374v1 link
2024-06-18 Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding Weizhi Fei et.al. 2406.12331v1 null
2024-06-18 PRePair: Pointwise Reasoning Enhance Pairwise Evaluating for Robust Instruction-Following Assessments Hawon Jeong et.al. 2406.12319v1 null
2024-06-18 An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs Daking Rai et.al. 2406.12288v1 null
2024-06-18 Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization Kwangwook Seo et.al. 2406.12269v1 null
2024-06-18 A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning Lijie Hu et.al. 2406.12255v1 null
2024-06-24 Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector Gangwei Jiang et.al. 2406.12227v2 null
2024-06-18 Leveraging Large Language Model for Heterogeneous Ad Hoc Teamwork Collaboration Xinzhu Liu et.al. 2406.12224v1 null
2024-06-18 Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems Nasim Borazjanizadeh et.al. 2406.12172v1 null
2024-06-19 Is poisoning a real threat to LLM alignment? Maybe more so than you think Pankayaraj Pathmanathan et.al. 2406.12091v2 link
2024-06-17 InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States Mohammad Beigi et.al. 2406.12053v1 null
2024-06-17 MedCalc-Bench: Evaluating Large Language Models for Medical Calculations Nikhil Khandekar et.al. 2406.12036v1 link
2024-06-17 Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Junmo Kang et.al. 2406.12034v1 null
2024-06-17 GAugLLM: Improving Graph Contrastive Learning for Text-Attributed Graphs with Large Language Models Yi Fang et.al. 2406.11945v1 link
2024-06-16 A Notion of Complexity for Theory of Mind via Discrete World Models X. Angelo Huang et.al. 2406.11911v1 link
2024-06-15 A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges Yuqi Nie et.al. 2406.11903v1 null
2024-06-17 Improving Multi-Agent Debate with Sparse Communication Topology Yunxuan Li et.al. 2406.11776v1 null
2024-06-17 Meta Reasoning for Large Language Models Peizhong Gao et.al. 2406.11698v1 null
2024-06-17 TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy Yiqun Chen et.al. 2406.11678v1 link
2024-06-17 A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method using GPT-4 Ming Gu et.al. 2406.11651v1 link
2024-06-17 Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models Sheng Feng et.al. 2406.11568v1 link
2024-06-17 MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation Jiakuan Xie et.al. 2406.11566v1 null
2024-06-17 AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation Chuyan Xiong et.al. 2406.11548v1 null
2024-06-17 Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs Yi Fang et.al. 2406.11514v1 null
2024-06-17 Can AI with High Reasoning Ability Replicate Human-like Decision Making in Economic Experiments? Ayato Kitadai et.al. 2406.11426v1 null
2024-06-17 P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models Shuo Yang et.al. 2406.11391v1 null
2024-06-17 A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences Leonardo Bertolazzi et.al. 2406.11341v1 null
2024-06-17 ClawMachine: Fetching Visual Tokens as An Entity for Referring and Grounding Tianren Ma et.al. 2406.11327v1 null
2024-06-17 Enhancing Biomedical Knowledge Retrieval-Augmented Generation with Self-Rewarding Tree Search and Proximal Policy Optimization Minda Hu et.al. 2406.11258v1 null
2024-06-18 AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval Shirley Wu et.al. 2406.11200v2 link
2024-06-17 Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning Zebang Cheng et.al. 2406.11161v1 link
2024-06-21 Contextual Knowledge Graph Chengjin Xu et.al. 2406.11160v2 null
2024-06-19 Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG Xueying Du et.al. 2406.11147v2 null
2024-06-17 RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents Weizhe Chen et.al. 2406.11132v1 null
2024-06-17 Exploring Safety-Utility Trade-Offs in Personalized Language Models Anvesh Rao Vijjini et.al. 2406.11107v1 null
2024-06-16 A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners Bowen Jiang et.al. 2406.11050v1 null
2024-06-16 RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models Yuqing Wang et.al. 2406.11020v1 null
2024-06-18 Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game Prisha Samadarshi et.al. 2406.11012v2 link
2024-06-16 Not All Bias is Bad: Balancing Rational Deviations and Cognitive Biases in Large Language Model Reasoning Liman Wang et.al. 2406.10999v1 null
2024-06-18 City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization Zihao Jiao et.al. 2406.10958v2 null
2024-06-16 E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models Zhenyu Zhang et.al. 2406.10950v1 null
2024-06-16 Effective Generative AI: The Human-Algorithm Centaur Soroush Saghafian et.al. 2406.10942v1 null
2024-06-16 Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies Hung-Ting Su et.al. 2406.10923v1 null
2024-06-16 RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models Zhuoran Jin et.al. 2406.10890v1 link
2024-06-16 Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions Yiming Tang et.al. 2406.10878v1 null
2024-06-16 Step-level Value Preference Optimization for Mathematical Reasoning Guoxin Chen et.al. 2406.10858v1 null
2024-06-16 Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning Joykirat Singh et.al. 2406.10834v1 null
2024-06-16 Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses Zhiwen Fan et.al. 2406.10789v1 null
2024-06-15 FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models Zhikai Zhang et.al. 2406.10740v1 null
2024-06-15 Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions Yexin Liu et.al. 2406.10638v1 link
2024-06-15 On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models Sree Harsha Tanneru et.al. 2406.10625v1 null
2024-06-15 Reactor Mk.1 performances: MMLU, HumanEval and BBH test results TJ Dunham et.al. 2406.10515v1 null
2024-06-14 What is the Visual Cognition Gap between Humans and Multimodal LLMs? Xu Cao et.al. 2406.10424v1 link
2024-06-14 Self-Reflection Outcome is Sensitive to Prompt Construction Fengyuan Liu et.al. 2406.10400v1 link
2024-06-18 Efficient Prompting for LLM-based Generative Internet of Things Bin Xiao et.al. 2406.10382v2 null
2024-06-14 Unlock the Correlation between Supervised Fine-Tuning and Reinforcement Learning in Training Code Large Language Models Jie Chen et.al. 2406.10305v1 null
2024-06-12 Mimicking User Data: On Mitigating Fine-Tuning Risks in Closed Large Language Models Francisco Eiras et.al. 2406.10288v1 null
2024-06-11 FoodSky: A Food-oriented Large Language Model that Passes the Chef and Dietetic Examination Pengfei Zhou et.al. 2406.10261v1 null
2024-06-10 The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs Mert Yazan et.al. 2406.10251v1 null
2024-06-14 BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack Yuri Kuratov et.al. 2406.10149v1 null
2024-06-14 Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning Jiaqi Li et.al. 2406.10099v1 null
2024-06-18 First Multi-Dimensional Evaluation of Flowchart Comprehension for Multimodal Large Language Models Enming Zhang et.al. 2406.10057v2 link
2024-06-14 Precision Empowers, Excess Distracts: Visual Question Answering With Dynamically Infused Knowledge In Language Models Manas Jhalani et.al. 2406.09994v1 null
2024-06-14 A Better LLM Evaluator for Text Generation: The Impact of Prompt Output Sequencing and Optimization KuanChao Chu et.al. 2406.09972v1 null
2024-06-14 Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam Nabor C. Mendonça et.al. 2406.09671v1 link
2024-06-13 ImageNet3D: Towards General-Purpose Object-Level 3D Understanding Wufei Ma et.al. 2406.09613v1 link
2024-06-12 Pandora: Towards General World Model with Natural Language Actions and Video States Jiannan Xiang et.al. 2406.09455v1 null
2024-06-13 VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding Muhammad Maaz et.al. 2406.09418v1 link
2024-06-13 Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms Miaosen Zhang et.al. 2406.09397v1 null
2024-06-13 GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning Zhen Xiang et.al. 2406.09187v1 null
2024-06-13 ReMI: A Dataset for Reasoning with Multiple Images Mehran Kazemi et.al. 2406.09175v1 null
2024-06-13 Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning Bahare Fatemi et.al. 2406.09170v1 null
2024-06-13 Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs Xuan Zhang et.al. 2406.09136v1 link
2024-06-13 MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era Jiahao Nie et.al. 2406.09121v1 link
2024-06-13 Chain-of-Though (CoT) prompting strategies for medical error detection and correction Zhaolong Wu et.al. 2406.09103v1 null
2024-06-13 SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models Kehua Feng et.al. 2406.09098v1 link
2024-06-13 Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? Zhaochen Su et.al. 2406.09072v1 link
2024-06-13 MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning Hanqing Wang et.al. 2406.09044v1 null
2024-06-14 Language Models are Crossword Solvers Soumadeep Saha et.al. 2406.09043v2 null
2024-06-13 ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models Jing Liu et.al. 2406.09041v1 null
2024-06-13 Cognitively Inspired Energy-Based World Models Alexi Gladstone et.al. 2406.08862v1 null
2024-06-13 LLM-Driven Robots Risk Enacting Discrimination, Violence, and Unlawful Actions Rumaisa Azeem et.al. 2406.08824v1 null
2024-06-13 Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models Minghao Wu et.al. 2406.08811v1 null
2024-06-13 A Survey on Compositional Learning of AI Models: Theoretical and Experimetnal Practices Sania Sinha et.al. 2406.08787v1 null
2024-06-12 Mistral-C2F: Coarse to Fine Actor for Analytical and Reasoning Enhancement in RLHF and Effective-Merged LLMs Chen Zheng et.al. 2406.08657v1 null
2024-06-12 LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models Alison Bartsch et.al. 2406.08648v1 null
2024-06-12 CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Xiaoshuai Song et.al. 2406.08587v1 link
2024-06-12 Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning Jaehyun Nam et.al. 2406.08527v1 null
2024-06-12 Research Trends for the Interplay between Large Language Models and Knowledge Graphs Hanieh Khorashadizadeh et.al. 2406.08223v1 null
2024-06-12 ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs Irene Huang et.al. 2406.08164v1 link
2024-06-16 Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests Amogh Mannekote et.al. 2406.07794v2 null
2024-06-11 Out-Of-Context Prompting Boosts Fairness and Robustness in Large Language Model Predictions Leonardo Cotta et.al. 2406.07685v1 null
2024-06-11 QuickLLaMA: Query-aware Inference Acceleration for Large Language Models Jingyao Li et.al. 2406.07528v1 link
2024-06-11 TextGrad: Automatic "Differentiation" via Text Mert Yuksekgonul et.al. 2406.07496v1 link
2024-06-17 VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Zesen Cheng et.al. 2406.07476v2 link
2024-06-11 On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations Shiao Meng et.al. 2406.07444v1 link
2024-06-13 Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B Di Zhang et.al. 2406.07394v2 link
2024-06-11 Limited Out-of-Context Knowledge Reasoning in Large Language Models Peng Hu et.al. 2406.07393v1 null
2024-06-11 Large Language Models for Constrained-Based Causal Discovery Kai-Hendrik Cohrs et.al. 2406.07378v1 link
2024-06-11 Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities Delfina Sol Martinez Pandiani et.al. 2406.07353v1 null
2024-06-11 Instruct Large Language Models to Drive like Humans Ruijun Zhang et.al. 2406.07296v1 link
2024-06-11 Needle In A Multimodal Haystack Weiyun Wang et.al. 2406.07230v1 link
2024-06-11 Scaling Large-Language-Model-based Multi-Agent Collaboration Chen Qian et.al. 2406.07155v1 link
2024-06-11 Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees Sijia Chen et.al. 2406.07115v1 null
2024-06-17 Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph Sergey Linok et.al. 2406.07113v2 null
2024-06-11 DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs Haishuo Fang et.al. 2406.07080v1 link
2024-06-11 CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only Junhee Cho et.al. 2406.06947v1 link
2024-06-15 What's in an embedding? Would a rose by any embedding smell as sweet? Venkat Venkatasubramanian et.al. 2406.06870v3 null
2024-06-11 Eyeballing Combinatorial Problems: A Case Study of Using Multimodal Large Language Models to Solve Traveling Salesman Problems Mohammed Elhenawy et.al. 2406.06865v1 null
2024-06-11 Ollabench: Evaluating LLMs' Reasoning for Human-centric Interdependent Cybersecurity Tam N. Nguyen et.al. 2406.06863v1 link
2024-06-07 GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents Anthony Costarelli et.al. 2406.06613v1 link
2024-06-06 Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models Walid S. Saba et.al. 2406.06610v1 null
2024-06-05 Improve Mathematical Reasoning in Language Models by Automated Process Supervision Liangchen Luo et.al. 2406.06592v1 null
2024-06-05 Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models Flavio Petruzzellis et.al. 2406.06588v1 null
2024-06-05 Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining Shuqi Liu et.al. 2406.06586v1 null
2024-06-04 Break the Chain: Large Language Models Can be Shortcut Reasoners Mengru Ding et.al. 2406.06580v1 null
2024-06-04 From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models Xiaofeng Zhang et.al. 2406.06579v1 null
2024-06-10 Towards a Personal Health Large Language Model Justin Cosentino et.al. 2406.06474v1 null
2024-06-11 Transforming Wearable Data into Health Insights using Large Language Model Agents Mike A. Merrill et.al. 2406.06464v2 null
2024-06-15 Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies Junlin Wang et.al. 2406.06461v3 null
2024-06-15 Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching Xiaoying Zhang et.al. 2406.06326v3 null
2024-06-11 LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages Andrew M. Bean et.al. 2406.06196v2 link
2024-06-10 Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation Aadharsh Aadhithya A et.al. 2406.06124v1 null
2024-06-10 Prompting Large Language Models with Audio for General-Purpose Speech Summarization Wonjune Kang et.al. 2406.05968v1 link
2024-06-10 CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark David Romero et.al. 2406.05967v1 null
2024-06-10 Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models Xi Li et.al. 2406.05948v1 null
2024-06-09 Hello Again! LLM-powered Personalized Agent for Long-term Dialogue Hao Li et.al. 2406.05925v1 link
2024-06-09 Why Don't Prompt-Based Fairness Metrics Correlate? Abdelrahman Zayed et.al. 2406.05918v1 null
2024-06-09 LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning Utsav Singh et.al. 2406.05881v1 null
2024-06-09 A Survey on LLM-Based Agentic Workflows and LLM-Profiled Components Xinzhe Li et.al. 2406.05804v1 null
2024-06-09 Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking Fangxu Yu et.al. 2406.05673v1 link
2024-06-09 Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses Maryam Amirizaniani et.al. 2406.05659v1 null
2024-06-08 Verbalized Probabilistic Graphical Modeling with Large Language Models Hengguan Huang et.al. 2406.05516v1 null
2024-06-08 Towards a Benchmark for Causal Business Process Reasoning with LLMs Fabiana Fournier et.al. 2406.05506v1 null
2024-06-08 Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation Neeraj Varshney et.al. 2406.05494v1 null
2024-06-08 Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios Yuhang Zhou et.al. 2406.05322v1 null
2024-06-07 LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs Arash Gholami Davoodi et.al. 2406.05194v1 link
2024-06-07 Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions Shi-Yu Tian et.al. 2406.05055v1 null
2024-06-07 Quantifying Geospatial in the Common Crawl Corpus Ilya Ilyankou et.al. 2406.04952v1 null
2024-06-07 Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models Michał Romaszewski et.al. 2406.04926v1 null
2024-06-07 ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering Raphael Gruber et.al. 2406.04866v1 link
2024-06-07 Experiences from Integrating Large Language Model Chatbots into the Classroom Arto Hellas et.al. 2406.04817v1 null
2024-06-07 Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in Large Language Models Weizhi Tang et.al. 2406.04800v1 null
2024-06-07 Think out Loud: Emotion Deducing Explanation in Dialogues Jiangnan Li et.al. 2406.04758v1 null
2024-06-07 LogiCode: an LLM-Driven Framework for Logical Anomaly Detection Yiheng Zhang et.al. 2406.04687v1 link
2024-06-07 LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model Dongkai Wang et.al. 2406.04659v1 link
2024-06-07 LinkGPT: Teaching Large Language Models To Predict Missing Links Zhongmou He et.al. 2406.04640v1 null
2024-06-07 What do MLLMs hear? Examining reasoning with text and sound components in Multimodal Large Language Models Enis Berk Çoban et.al. 2406.04615v1 null
2024-06-07 StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation Weike Fang et.al. 2406.04568v1 null
2024-06-07 SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models Md Imbesat Hassan Rizvi et.al. 2406.04566v1 link
2024-06-06 FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models Max Zhu et.al. 2406.04501v1 null
2024-06-06 Time Sensitive Knowledge Editing through Efficient Finetuning Xiou Ge et.al. 2406.04496v1 null
2024-06-06 On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing Alexander Kovrigin et.al. 2406.04464v1 link
2024-06-06 MAIRA-2: Grounded Radiology Report Generation Shruthi Bannur et.al. 2406.04449v1 null
2024-06-06 MoralBench: Moral Evaluation of LLMs Jianchao Ji et.al. 2406.04428v1 link
2024-06-06 RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation Jiaming Liu et.al. 2406.04339v1 null
2024-06-06 Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models Phat Nguyen et.al. 2406.04300v1 null
2024-06-06 Generative AI-in-the-loop: Integrating LLMs and GPTs into the Next Generation Networks Han Zhang et.al. 2406.04276v1 null
2024-06-06 Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models Ling Yang et.al. 2406.04271v1 link
2024-06-06 DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning Shangqing Tu et.al. 2406.04197v1 link
2024-06-06 ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints Divij Handa et.al. 2406.04046v1 null
2024-06-06 Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt Zonghao Ying et.al. 2406.04031v1 link
2024-06-14 POEM: Interactive Prompt Optimization for Enhancing Multimodal Reasoning of Large Language Models Jianben He et.al. 2406.03843v2 null
2024-06-06 Tool-Planner: Dynamic Solution Tree Planning for Large Language Model with Tool Clustering Yanming Liu et.al. 2406.03807v1 link
2024-06-06 Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective Xinhao Yao et.al. 2406.03768v1 link
2024-06-06 VisLTR: Visualization-in-the-Loop Table Reasoning Jianing Hao et.al. 2406.03753v1 null
2024-06-06 A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions Lei Liu et.al. 2406.03712v1 null
2024-06-06 Evaluating the World Model Implicit in a Generative Model Keyon Vafa et.al. 2406.03689v1 link
2024-06-05 TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools Avi Caciularu et.al. 2406.03618v1 null
2024-06-05 AD-H: Autonomous Driving with Hierarchical Agents Zaibin Zhang et.al. 2406.03474v1 null
2024-06-05 Pre-trained Large Language Models Use Fourier Features to Compute Addition Tianyi Zhou et.al. 2406.03445v1 null
2024-06-05 IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models David Ifeoluwa Adelani et.al. 2406.03368v1 null
2024-06-05 CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning Xinrui Lin et.al. 2406.03367v1 null
2024-06-06 Large Language Models as Evaluators for Recommendation Explanations Xiaoyu Zhang et.al. 2406.03248v2 link
2024-06-05 Missci: Reconstructing Fallacies in Misrepresented Science Max Glockner et.al. 2406.03181v1 link
2024-06-05 Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation Tingjia Shen et.al. 2406.03085v1 null
2024-06-05 How Truncating Weights Improves Reasoning in Language Models Lei Chen et.al. 2406.03068v1 null
2024-06-05 Verified Code Transpilation with LLMs Sahil Bhatia et.al. 2406.03003v1 null
2024-06-05 NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models Ancheng Xu et.al. 2406.02864v1 link
2024-06-05 LLM as a Scorer: The Impact of Output Order on Dialogue Evaluation Yi-Pei Chen et.al. 2406.02863v1 null
2024-06-05 Item-Language Model for Conversational Recommendation Li Yang et.al. 2406.02844v1 null
2024-06-04 Chain of Agents: Large Language Models Collaborating on Long-Context Tasks Yusen Zhang et.al. 2406.02818v1 null
2024-06-04 $\texttt{ACCORD}$: Closing the Commonsense Measurability Gap François Roewer-Després et.al. 2406.02804v1 link
2024-06-04 Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities Wenyue Hua et.al. 2406.02787v1 null
2024-06-04 Adaptive Preference Scaling for Reinforcement Learning with Human Feedback Ilgee Hong et.al. 2406.02764v1 null
2024-06-09 RATT: A Thought Structure for Coherent and Correct LLM Reasoning Jinghan Zhang et.al. 2406.02746v2 null
2024-06-04 Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller Min Cai et.al. 2406.02721v1 link
2024-06-04 Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data Maxime Griot et.al. 2406.02394v1 link
2024-06-04 Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks Andrew Gambardella et.al. 2406.02356v1 null
2024-06-04 mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models Huiyuan Lai et.al. 2406.02301v1 link
2024-06-04 Iteration Head: A Mechanistic Study of Chain-of-Thought Vivien Cabannes et.al. 2406.02128v1 null
2024-06-04 MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset Weiqi Wang et.al. 2406.02106v1 link
2024-06-04 Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data Haolong Li et.al. 2406.02100v1 null
2024-06-05 Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models Marianna Nezhurina et.al. 2406.02061v2 link
2024-06-05 Multimodal Reasoning with Multimodal Knowledge Graph Junlin Lee et.al. 2406.02030v2 null
2024-06-04 Why Would You Suggest That? Human Trust in Language Model Responses Manasi Sharma et.al. 2406.02018v1 null
2024-06-04 Process-Driven Autoformalization in Lean 4 Jianqiao Lu et.al. 2406.01940v1 link
2024-06-04 PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning Yupeng Zheng et.al. 2406.01587v2 null
2024-06-03 LoFiT: Localized Fine-tuning on LLM Representations Fangcong Yin et.al. 2406.01563v1 link
2024-06-03 FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs Sushant Gautam et.al. 2406.01311v1 null
2024-06-03 EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs Zixuan Dong et.al. 2406.01238v1 null
2024-06-03 Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph Guangyi Liu et.al. 2406.01145v1 null
2024-06-03 SemCoder: Training Code Language Models with Comprehensive Semantics Yangruibo Ding et.al. 2406.01006v1 null
2024-06-04 Efficient Behavior Tree Planning with Commonsense Pruning and Heuristic Xinglin Chen et.al. 2406.00965v2 null
2024-06-04 MEDIQ: Question-Asking LLMs for Adaptive and Reliable Clinical Reasoning Shuyue Stella Li et.al. 2406.00922v2 link
2024-06-02 Pretrained Hybrids with MAD Skills Nicholas Roberts et.al. 2406.00894v1 null
2024-06-02 OLIVE: Object Level In-Context Visual Embeddings Timothy Ossowski et.al. 2406.00872v1 link
2024-06-02 Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection Chentao Cao et.al. 2406.00806v1 null
2024-06-02 Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction Xiaoyuan Li et.al. 2406.00755v1 link
2024-06-01 Task Planning for Object Rearrangement in Multi-room Environments Karan Mirakhor et.al. 2406.00451v1 null
2024-06-01 Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners Zhi Zheng et.al. 2406.00430v1 null
2024-06-01 A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters Long Hei Matthew Lam et.al. 2406.00284v1 link
2024-06-01 Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs Mohammed Saidul Islam et.al. 2406.00257v1 null
2024-06-05 Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey Bowen Jiang et.al. 2406.00252v2 link
2024-05-31 Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training Maximillian Chen et.al. 2406.00222v1 null
2024-05-31 Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation Bernd Bohnet et.al. 2406.00179v1 null
2024-05-31 QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation Zhuo Chen et.al. 2406.00132v1 null
2024-05-31 Towards LLM-Powered Verilog RTL Assistant: Self-Verification and Self-Correction Hanxian Huang et.al. 2406.00115v1 null
2024-05-31 Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training Feiteng Fang et.al. 2405.20978v1 null
2024-06-05 SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales Tianyang Xu et.al. 2405.20974v2 link
2024-06-03 Large Language Models are Zero-Shot Next Location Predictors Ciro Beneduce et.al. 2405.20962v2 link
2024-05-31 Preemptive Answer "Attacks" on Chain-of-Thought Reasoning Rongwu Xu et.al. 2405.20902v1 null
2024-05-31 Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning Cheng Tan et.al. 2405.20834v1 null
2024-05-27 Exploring Backdoor Attacks against Large Language Model-based Decision Making Ruochen Jiao et.al. 2405.20774v1 null
2024-05-31 Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning Atharva Gundawar et.al. 2405.20625v1 null
2024-05-30 Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning Xinlu Zhang et.al. 2405.20535v1 null
2024-05-30 SECURE: Benchmarking Generative Large Language Models for Cybersecurity Advisory Dipkamal Bhusal et.al. 2405.20441v1 null
2024-05-30 MotionLLM: Understanding Human Behaviors from Human Motions and Videos Ling-Hao Chen et.al. 2405.20340v1 null
2024-05-30 TAIA: Large Language Models are Out-of-Distribution Data Learners Shuyang Jiang et.al. 2405.20192v1 link
2024-05-30 Nadine: An LLM-driven Intelligent Social Robot with Affective Capabilities and Human-like Memory Hangyeol Kang et.al. 2405.20189v1 null
2024-05-30 Reasoning about concepts with LLMs: Inconsistencies abound Rosario Uceda-Sosa et.al. 2405.20163v1 null
2024-05-30 GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning Costas Mavromatis et.al. 2405.20139v1 link
2024-05-30 Improve Student's Reasoning Generalizability through Cascading Decomposed CoTs Distillation Chengwei Dai et.al. 2405.19842v1 link
2024-05-30 VQA Training Sets are Self-play Environments for Generating Few-shot Pools Tautvydas Misiunas et.al. 2405.19773v1 null
2024-05-30 Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation Chengwei Dai et.al. 2405.19737v1 link
2024-05-30 Enhancing Large Vision Language Models with Self-Training on Image Comprehension Yihe Deng et.al. 2405.19716v1 null
2024-05-30 AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization Jiawei Chen et.al. 2405.19668v1 null
2024-06-01 Easy Problems That LLMs Get Wrong Sean Williams et.al. 2405.19616v2 link
2024-05-30 The Accuracy of Domain Specific and Descriptive Analysis Generated by Large Language Models Denish Omondi Otieno et.al. 2405.19578v1 null
2024-05-29 Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models Venkat Venkatasubramanian et.al. 2405.19561v1 null
2024-05-29 MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions Zhenwen Liang et.al. 2405.19444v1 link
2024-05-29 X-VILA: Cross-Modality Alignment for Large Language Model Hanrong Ye et.al. 2405.19335v1 null
2024-06-02 MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Ge Zhang et.al. 2405.19327v3 link
2024-05-29 Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models Tianrun Chen et.al. 2405.19326v1 null
2024-05-29 Towards Next-Generation Urban Decision Support Systems through AI-Powered Generation of Scientific Ontology using Large Language Models -- A Case in Optimizing Intermodal Freight Transportation Jose Tupayachi et.al. 2405.19255v1 null
2024-05-29 VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos Ziyang Wang et.al. 2405.19209v1 link
2024-05-29 Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery Sounak Lahiri et.al. 2405.19164v1 null
2024-05-29 PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering Fangzhi Xu et.al. 2405.19109v1 null
2024-06-02 Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design Markus J. Buehler et.al. 2405.19076v2 link
2024-05-29 Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners Jiachun Li et.al. 2405.18915v1 null
2024-05-31 LLMs achieve adult human performance on higher-order theory of mind tasks Winnie Street et.al. 2405.18870v2 null
2024-06-02 Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts S. Mostafa Mousavi et.al. 2405.18732v2 null
2024-05-29 Efficient Model-agnostic Alignment via Bayesian Persuasion Fengshuo Bai et.al. 2405.18718v1 null
2024-05-29 Calibrating Reasoning in Language Models with Internal Consistency Zhihui Xie et.al. 2405.18711v1 null
2024-05-30 Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Tiansheng Huang et.al. 2405.18641v2 link
2024-05-28 Don't Forget to Connect! Improving RAG with Graph-based Reranking Jialin Dong et.al. 2405.18414v1 null
2024-05-28 OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning Pengxiang Li et.al. 2405.18380v1 link
2024-05-28 LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models Anthony Sarah et.al. 2405.18377v1 null
2024-05-28 Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning Phakphum Artkaew et.al. 2405.18375v1 link
2024-05-28 PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework Eshaan Agarwal et.al. 2405.18369v1 null
2024-05-28 Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? Yifan Bai et.al. 2405.18361v1 null
2024-05-28 MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning Somnath Kumar et.al. 2405.18358v1 null
2024-05-28 Faithful Logical Reasoning via Symbolic Chain-of-Thought Jundong Xu et.al. 2405.18357v1 link
2024-05-28 Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning Renzhi Wang et.al. 2405.18292v1 null
2024-05-28 A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models Chengxing Xie et.al. 2405.18208v1 null
2024-05-28 LLM experiments with simulation: Large Language Model Multi-Agent System for Process Simulation Parametrization in Digital Twins Yuchen Xia et.al. 2405.18092v1 link
2024-05-28 Towards Dialogues for Joint Human-AI Reasoning and Value Alignment Elfia Bezou-Vrakatseli et.al. 2405.18073v1 null
2024-05-28 TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models Jaewoo Ahn et.al. 2405.18027v1 null
2024-05-28 Knowledge Circuits in Pretrained Transformers Yunzhi Yao et.al. 2405.17969v1 link
2024-05-28 Self-Guiding Exploration for Combinatorial Problems Zangir Iklassov et.al. 2405.17950v1 link
2024-05-28 Arithmetic Reasoning with LLM: Prolog Generation & Permutation Xiaocheng Yang et.al. 2405.17893v1 null
2024-05-28 Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action Zhenyu Pan et.al. 2405.17822v1 null
2024-05-28 XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference Shengnan Wang et.al. 2405.17755v1 null
2024-05-28 CLAIM Your Data: Enhancing Imputation Accuracy with Contextual Large Language Models Ahatsham Hayat et.al. 2405.17712v1 null
2024-05-27 Video Enriched Retrieval Augmented Generation Using Aligned Video Captions Kevin Dela Rosa et.al. 2405.17706v1 link
2024-05-27 BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments Yusuf Roohani et.al. 2405.17631v1 link
2024-05-30 Code Repair with LLMs gives an Exploration-Exploitation Tradeoff Hao Tang et.al. 2405.17503v2 null
2024-05-27 Matryoshka Multimodal Models Mu Cai et.al. 2405.17430v1 null
2024-05-27 Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model Kuan-Chih Huang et.al. 2405.17427v1 link
2024-05-27 Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation Jiaming Liu et.al. 2405.17418v1 null
2024-05-27 MindMerger: Efficient Boosting LLM Reasoning in non-English Languages Zixian Huang et.al. 2405.17386v1 link
2024-05-27 Assessing LLMs Suitability for Knowledge Graph Completion Vasile Ionut Remus Iga et.al. 2405.17249v1 link
2024-05-27 LLM-Assisted Static Analysis for Detecting Security Vulnerabilities Ziyang Li et.al. 2405.17238v1 null
2024-05-29 Position: Foundation Agents as the Paradigm Shift for Decision Making Xiaoqian Liu et.al. 2405.17009v3 link
2024-05-28 Entity Alignment with Noisy Annotations from Large Language Models Shengyuan Chen et.al. 2405.16806v2 link
2024-05-27 TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing Xinyu Zhang et.al. 2405.16803v1 null
2024-05-29 AutoCV: Empowering Reasoning with Automated Process Labeling via Confidence Variation Jianqiao Lu et.al. 2405.16802v3 link
2024-05-28 Large Scale Knowledge Washing Yu Wang et.al. 2405.16720v2 null
2024-05-26 RLSF: Reinforcement Learning via Symbolic Feedback Piyush Jha et.al. 2405.16661v1 null
2024-05-30 Meta-Task Planning for Language Agents Cong Zhang et.al. 2405.16510v3 null
2024-05-26 M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought Qiguang Chen et.al. 2405.16473v1 link
2024-05-26 Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search Max Liu et.al. 2405.16450v1 null
2024-05-26 Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models Jiankun Wang et.al. 2405.16413v1 null
2024-05-28 SpinQuant: LLM quantization with learned rotations Zechun Liu et.al. 2405.16406v2 null
2024-05-28 STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making Chuanhao Li et.al. 2405.16376v2 link
2024-06-03 Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge Brendan Park et.al. 2405.16277v3 link
2024-05-25 MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time Jikun Kang et.al. 2405.16265v1 null
2024-05-25 Finetuning Large Language Model for Personalized Ranking Zhuoxi Bai et.al. 2405.16127v1 null
2024-05-25 Keypoint-based Progressive Chain-of-Thought Distillation for LLMs Kaituo Feng et.al. 2405.16064v1 null
2024-05-25 Streaming Long Video Understanding with Large Language Models Rui Qian et.al. 2405.16009v1 null
2024-05-30 SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation Kun Zhao et.al. 2405.15924v3 link
2024-05-24 HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis Shraddha Barke et.al. 2405.15880v1 null
2024-05-24 Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications Yang Li et.al. 2405.15877v1 null
2024-05-24 Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models Yue Zhang et.al. 2405.15684v1 null
2024-05-24 M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models Hongyu Wang et.al. 2405.15638v1 link
2024-05-24 Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges Jonas Becker et.al. 2405.15604v1 link
2024-05-24 Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation Ge Qu et.al. 2405.15307v1 link
2024-05-24 Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation Zhiwei Wang et.al. 2405.15302v1 null
2024-05-24 Coaching Copilot: Blended Form of an LLM-Powered Chatbot and a Human Coach to Effectively Support Self-Reflection for Leadership Growth Riku Arakawa et.al. 2405.15250v1 null
2024-05-24 A Solution-based LLM API-using Methodology for Academic Information Seeking Yuanchun Wang et.al. 2405.15165v1 link
2024-05-24 From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks Jacob Russin et.al. 2405.15164v1 null
2024-05-24 OptLLM: Optimal Assignment of Queries to Large Language Models Yueyue Liu et.al. 2405.15130v1 link
2024-05-24 Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning Yuyue Zhao et.al. 2405.15114v1 null
2024-05-23 Dissociation of Faithful and Unfaithful Reasoning in LLMs Evelyn Yee et.al. 2405.15092v1 link
2024-05-23 OAC: Output-adaptive Calibration for Accurate Post-training Quantization Ali Edalati et.al. 2405.15025v1 null
2024-05-23 Agentic Skill Discovery Xufeng Zhao et.al. 2405.15019v1 null
2024-05-23 A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns Asaf Yehudai et.al. 2405.14863v1 null
2024-05-23 Bitune: Bidirectional Instruction-Tuning Dawid J. Kopiczko et.al. 2405.14862v1 null
2024-05-23 Efficient Medical Question Answering with Knowledge-Augmented Question Generation Julien Khlaut et.al. 2405.14654v1 null
2024-05-24 Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models Jiyang Zhang et.al. 2405.14619v2 null
2024-05-26 Explainable Few-shot Knowledge Tracing Haoxuan Li et.al. 2405.14391v2 link
2024-05-23 Can Large Language Models Create New Knowledge for Spatial Reasoning Tasks? Thomas Greatrix et.al. 2405.14379v1 null
2024-05-23 JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models Kun Zhou et.al. 2405.14365v1 null
2024-05-23 DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Huajian Xin et.al. 2405.14333v1 null
2024-05-26 Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration Yang Zhang et.al. 2405.14314v2 null
2024-05-23 Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning Jiapu Wang et.al. 2405.14170v1 null
2024-05-23 Towards Transferable Attacks Against Vision-LLMs in Autonomous Driving with Typography Nhat Chung et.al. 2405.14169v1 null
2024-05-23 Large Language Models Can Self-Correct with Minimal Effort Zhenyu Wu et.al. 2405.14092v1 null
2024-05-23 $T^2$ of Thoughts: Temperature Tree Elicits Reasoning in Large Language Models Chengkun Cai et.al. 2405.14075v1 null
2024-05-22 On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models Mudit Verma et.al. 2405.13966v1 null
2024-05-22 PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery Runlong He et.al. 2405.13949v1 link
2024-05-22 FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering Yuan Sui et.al. 2405.13873v1 null
2024-05-29 Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models Qiji Zhou et.al. 2405.13872v2 null
2024-05-22 Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation Cyril Chhun et.al. 2405.13769v1 link
2024-05-22 HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model Mustafa Yildirim et.al. 2405.13547v1 null
2024-05-22 LIRE: listwise reward enhancement for preference alignment Mingye Zhu et.al. 2405.13516v1 null
2024-05-22 Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning Yuanhao Yue et.al. 2405.13448v1 null
2024-05-22 Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction Tingchen Fu et.al. 2405.13432v1 null
2024-05-21 Investigating Symbolic Capabilities of Large Language Models Neisarg Dave et.al. 2405.13209v1 null
2024-05-21 Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding Rong Gao et.al. 2405.13206v1 null
2024-05-20 Can Github issues be solved with Tree Of Thoughts? Ricardo La Rosa et.al. 2405.13057v1 link
2024-05-17 Surgical Feature-Space Decomposition of LLMs: Why, When and How? Arnav Chavan et.al. 2405.13039v1 null
2024-05-16 Can formal argumentative reasoning enhance LLMs performances? Federico Castagna et.al. 2405.13036v1 null
2024-05-15 IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues Diji Yang et.al. 2405.13021v1 null
2024-05-14 QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models Wei Wang et.al. 2405.13014v1 null
2024-05-12 MathDivide: Improved mathematical reasoning by large language models Saksham Sahai Srivastava et.al. 2405.13004v1 null
2024-05-21 Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models Zhangyue Yin et.al. 2405.12939v1 link
2024-05-21 Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs Bilgehan Sel et.al. 2405.12933v1 null
2024-05-21 DrHouse: An LLM-empowered Diagnostic Reasoning System through Harnessing Outcomes from Sensor Data and Expert Knowledge Bufang Yang et.al. 2405.12541v1 null
2024-05-21 LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs Sudhir Agarwal et.al. 2405.12433v1 null
2024-05-20 Eliciting Problem Specifications via Large Language Models Robert E. Wray et.al. 2405.12147v1 null
2024-05-20 MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Ting Jiang et.al. 2405.12130v1 link
2024-05-20 DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction Hao Chen et.al. 2405.12100v1 null
2024-05-20 KG-RAG: Bridging the Gap Between Knowledge and Creativity Diego Sanmartin et.al. 2405.12035v1 null
2024-05-20 Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs Siyu Lou et.al. 2405.11880v1 null
2024-05-20 Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities Junqi Wang et.al. 2405.11841v1 link
2024-05-19 Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning Zishan Gu et.al. 2405.11640v1 null
2024-05-19 MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation Jianbo Dai et.al. 2405.11430v1 link
2024-05-17 Are Large Language Models Moral Hypocrites? A Study Based on Moral Foundations José Luiz Nunes et.al. 2405.11100v1 null
2024-05-17 From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT Jace Grandinetti et.al. 2405.11040v1 null
2024-05-17 Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities Hao Zhou et.al. 2405.10825v1 null
2024-05-17 Efficient Multimodal Large Language Models: A Survey Yizhang Jin et.al. 2405.10739v1 link
2024-05-17 MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains Zhaohuan Zhan et.al. 2405.10620v1 null
2024-05-17 RDRec: Rationale Distillation for LLM-based Recommendation Xinfeng Wang et.al. 2405.10587v1 link
2024-05-17 Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset Jie Zhu et.al. 2405.10542v1 link
2024-05-16 Retrieving and Refining: A Hybrid Framework with Large Language Models for Rare Disease Identification Jinge Wu et.al. 2405.10440v1 null
2024-05-16 When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models Xianzheng Ma et.al. 2405.10255v1 null
2024-05-16 A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks Xuanfan Ni et.al. 2405.10251v1 null
2024-05-16 LFED: A Literary Fiction Evaluation Dataset for Large Language Models Linhao Yu et.al. 2405.10166v1 link
2024-05-16 SEEK: Semantic Reasoning for Object Goal Navigation in Real World Inspection Tasks Muhammad Fadhil Ginting et.al. 2405.09822v1 null
2024-05-16 LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery Pingchuan Ma et.al. 2405.09783v1 null
2024-05-15 Matching domain experts by training from scratch on domain knowledge Xiaoliang Luo et.al. 2405.09395v1 null
2024-05-15 Exploring the Potential of Large Language Models for Automation in Technical Customer Service Jochen Wulf et.al. 2405.09161v1 null
2024-05-14 A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine Hanguang Xiao et.al. 2405.08603v1 null
2024-05-14 Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure Odysseas S. Chlapanis et.al. 2405.08502v1 link
2024-05-14 PromptMind Team at MEDIQA-CORR 2024: Improving Clinical Text Correction with Error Categorization and LLM Ensembles Satya Kesav Gundabathula et.al. 2405.08373v1 null
2024-05-13 LLM Theory of Mind and Alignment: Opportunities and Risks Winnie Street et.al. 2405.08154v1 null
2024-05-13 EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning Yinzhu Quan et.al. 2405.07938v1 null
2024-05-13 Generating Human Motion in 3D Scenes from Text Descriptions Zhi Cen et.al. 2405.07784v1 null
2024-05-13 Backdoor Removal for Generative Large Language Models Haoran Li et.al. 2405.07667v1 null
2024-05-13 MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning Shuo Yin et.al. 2405.07551v1 null
2024-05-13 Oedipus: LLM-enchanced Reasoning CAPTCHA Solver Gelei Deng et.al. 2405.07496v1 null
2024-05-14 MedConceptsQA: Open Source Medical Concepts QA Benchmark Ofir Ben Shoham et.al. 2405.07348v2 link
2024-05-12 Learnable Tokenizer for LLM-based Generative Recommendation Wenjie Wang et.al. 2405.07314v1 null
2024-05-12 MM-InstructEval: Zero-Shot Evaluation of (Multimodal) Large Language Models on Multimodal Reasoning Tasks Xiaocui Yang et.al. 2405.07229v1 link
2024-05-11 Automating Thematic Analysis: How LLMs Analyse Controversial Topics Awais Hameed Khan et.al. 2405.06919v1 null
2024-05-09 Hypothesis Testing Prompting Improves Deductive Reasoning in Large Language Models Yitian Li et.al. 2405.06707v1 null
2024-05-09 LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought Zhuoxuan Jiang et.al. 2405.06705v1 null
2024-05-07 SUTRA: Scalable Multilingual Language Model Architecture Abhijit Bendale et.al. 2405.06694v1 null
2024-05-07 Fleet of Agents: Coordinated Problem Solving with Large Language Models using Genetic Particle Filtering Akhil Arora et.al. 2405.06691v1 null
2024-05-05 Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning Jun Zhao et.al. 2405.06680v1 null
2024-05-10 Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus Filipe Marinho Rocha et.al. 2405.06399v1 null
2024-05-09 LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models Ruihao Gong et.al. 2405.06001v1 link
2024-05-09 OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning Dan Qiao et.al. 2405.05957v1 link
2024-05-09 Probing Multimodal LLMs as World Models for Driving Shiva Sreeram et.al. 2405.05956v1 link
2024-05-09 Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes Ziang Guo et.al. 2405.05885v1 null
2024-05-09 Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning Artem Lykov et.al. 2405.05824v1 link
2024-05-09 Redefining Information Retrieval of Structured Database via Large Language Models Mingzhu Wang et.al. 2405.05508v1 null
2024-05-08 SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants Masoud Moghani et.al. 2405.05226v1 null
2024-05-08 MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning Inderjeet Nair et.al. 2405.05189v1 null
2024-05-08 QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs Weijia Zhang et.al. 2405.05109v1 null
2024-05-08 Federated Adaptation for Foundation Model-based Recommendations Chunxu Zhang et.al. 2405.04840v1 link
2024-05-08 ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation Ana Brassard et.al. 2405.04818v1 link
2024-05-08 Chain of Thoughtlessness: An Analysis of CoT in Planning Kaya Stechly et.al. 2405.04776v1 null
2024-05-08 BiasKG: Adversarial Knowledge Graphs to Induce Bias in Large Language Models Chu Fei Luo et.al. 2405.04756v1 link
2024-05-07 Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking Emre Can Acikgoz et.al. 2405.04685v1 null
2024-05-07 Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics Hanlin Zhu et.al. 2405.04669v1 null
2024-05-07 ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning Jing Lin et.al. 2405.04533v1 null
2024-05-08 Unveiling Disparities in Web Task Handling Between Human and Web Agent Kihoon Son et.al. 2405.04497v2 null
2024-05-07 Large Language Models Cannot Explain Themselves Advait Sarkar et.al. 2405.04382v1 null
2024-05-07 NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions Elliot Gestrin et.al. 2405.04215v1 null
2024-05-07 D-NLP at SemEval-2024 Task 2: Evaluating Clinical Inference Capabilities of Large Language Models Duygu Altinok et.al. 2405.04170v1 link
2024-05-07 Optimizing Language Model's Reasoning Abilities with Weak Supervision Yongqi Tong et.al. 2405.04086v1 null
2024-05-14 Generating Probabilistic Scenario Programs from Natural Language Karim Elmaaroufi et.al. 2405.03709v2 null
2024-05-08 How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs Muhammad Uzair Khattak et.al. 2405.03690v2 null
2024-05-06 Language-Image Models with 3D Understanding Jang Hyun Cho et.al. 2405.03685v1 null
2024-05-06 Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment Abhinav Agarwalla et.al. 2405.03594v1 null
2024-05-23 AlphaMath Almost Zero: process Supervision without process Guoxin Chen et.al. 2405.03553v2 link
2024-05-15 MAmmoTH2: Scaling Instructions from the Web Xiang Yue et.al. 2405.03548v3 null
2024-05-06 Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning Yubo Mai et.al. 2405.03509v1 null
2024-05-06 Explainable Fake News Detection With Large Language Model via Defense Among Competing Wisdom Bo Wang et.al. 2405.03371v1 link
2024-05-06 MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline Mohamed Yaseen Jabarulla et.al. 2405.03359v1 link
2024-05-06 WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning Yuanhan Zhang et.al. 2405.03272v1 null
2024-05-06 CRAFT: Extracting and Tuning Cultural Instructions from the Wild Bin Wang et.al. 2405.03138v1 link
2024-05-05 High Order Reasoning for Time Critical Recommendation in Evidence-based Medicine Manjiang Yu et.al. 2405.03010v1 null
2024-05-05 MedAdapter: Efficient Test-Time Adaptation of Large Language Models towards Medical Reasoning Wenqi Shi et.al. 2405.03000v1 null
2024-05-05 Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy Aftab Hussain et.al. 2405.02828v1 null
2024-05-04 CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions Hanchong Zhang et.al. 2405.02712v1 link
2024-05-04 A Literature Review and Framework for Human Evaluation of Generative Large Language Models in Healthcare Thomas Yu Chow Tam et.al. 2405.02559v1 null
2024-05-20 GigSense: An LLM-Infused Tool for Workers' Collective Intelligence Kashif Imteyaz et.al. 2405.02528v2 null
2024-05-09 REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs Deepa Tilwani et.al. 2405.02228v2 null
2024-05-03 Argumentative Large Language Models for Explainable and Contestable Decision-Making Gabriel Freedman et.al. 2405.02079v1 null
2024-05-03 Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on the Travelling Salesman Problem Using GPT-3.5 Turbo Mahmoud Masoud et.al. 2405.01997v1 null
2024-05-03 Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems Chuang Li et.al. 2405.01868v1 null
2024-05-02 ALCM: Autonomous LLM-Augmented Causal Discovery Framework Elahe Khatibi et.al. 2405.01744v1 null
2024-05-08 Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning Tianle Xia et.al. 2405.01649v3 null
2024-04-30 Large Language Model Agent for Fake News Detection Xinyi Li et.al. 2405.01593v1 null
2024-04-28 Tabular Embedding Model (TEM): Finetuning Embedding Models For Tabular RAG Applications Sujit Khanna et.al. 2405.01585v1 null
2024-05-02 OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning Shihao Wang et.al. 2405.01533v1 link
2024-05-02 Analyzing the Role of Semantic Representations in the Era of Large Language Models Zhijing Jin et.al. 2405.01502v1 link
2024-05-08 Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving Xin Quan et.al. 2405.01379v2 null
2024-05-02 GAIA: A General AI Assistant for Intelligent Accelerator Operations Frank Mayet et.al. 2405.01359v1 null
2024-05-02 The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights Wenhao Zhu et.al. 2405.01345v1 link
2024-05-02 Bayesian Optimization with LLM-Based Acquisition Functions for Natural Language Preference Elicitation David Eric Austin et.al. 2405.00981v1 null
2024-05-02 CACTUS: Chemistry Agent Connecting Tool-Usage to Science Andrew D. McNaughton et.al. 2405.00972v1 link
2024-04-25 Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models Xu Ji et.al. 2405.00718v1 null
2024-04-25 Large Language Models in Healthcare: A Comprehensive Benchmark Andrew Liu et.al. 2405.00716v1 null
2024-05-01 HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models Ningke Li et.al. 2405.00648v1 null
2024-05-01 Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning Yuxi Xie et.al. 2405.00451v1 null
2024-05-01 RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models Mohamed Manzour Hussien et.al. 2405.00449v1 null
2024-05-01 Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models Leonardo Ranaldi et.al. 2405.00402v1 null
2024-05-01 AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts Zefang Liu et.al. 2405.00361v1 link
2024-05-03 Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model Yu Cui et.al. 2405.00338v2 null
2024-05-03 A Careful Examination of Large Language Model Performance on Grade School Arithmetic Hugh Zhang et.al. 2405.00332v3 null
2024-05-01 DFKI-NLP at SemEval-2024 Task 2: Towards Robust LLMs Using Data Perturbations and MinMax Training Bhuvanesh Verma et.al. 2405.00321v1 null
2024-04-30 General Purpose Verification for Chain of Thought Prompting Robert Vacareanu et.al. 2405.00204v1 null
2024-04-30 Better & Faster Large Language Models via Multi-token Prediction Fabian Gloeckle et.al. 2404.19737v1 null
2024-04-30 Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners Chun Feng et.al. 2404.19696v1 null
2024-04-30 Do Large Language Models Understand Conversational Implicature -- A case study with a Chinese sitcom Shisen Yue et.al. 2404.19509v1 link
2024-05-01 Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings Guobin Shen et.al. 2404.19438v2 null
2024-04-30 Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships D. Panas et.al. 2404.19432v1 null
2024-04-30 Evaluating Telugu Proficiency in Large Language Models: A Comparative Analysis of ChatGPT and Gemini Katikela Sreeharsha Kishore et.al. 2404.19369v1 null
2024-04-30 Multi-hop Question Answering over Knowledge Graphs using Large Language Models Abir Chakraborty et.al. 2404.19234v1 null
2024-04-30 Transcrib3D: 3D Referring Expression Resolution through Large Language Models Jiading Fang et.al. 2404.19221v1 null
2024-04-29 SuperCLUE-Fin: Graded Fine-Grained Analysis of Chinese LLMs on Diverse Financial Tasks and Applications Liang Xu et.al. 2404.19063v1 null
2024-04-29 Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models Houjun Liu et.al. 2404.19055v1 null
2024-04-29 Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs Bahar Radmehr et.al. 2404.18978v1 null
2024-04-29 Benchmarking Benchmark Leakage in Large Language Models Ruijie Xu et.al. 2404.18824v1 link
2024-04-29 PECC: Problem Extraction and Coding Challenges Patrick Haller et.al. 2404.18766v1 link
2024-04-29 Injecting Salesperson's Dialogue Strategies in Large Language Models with Chain-of-Thought Reasoning Wen-Yu Chang et.al. 2404.18564v1 null
2024-04-29 Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in Utkarsh Agarwal et.al. 2404.18460v1 null
2024-04-29 FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models Wei Li et.al. 2404.18359v1 null
2024-04-30 Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages David Ifeoluwa Adelani et.al. 2404.18286v2 null
2024-04-28 Logic Agent: Enhancing Validity with Logic Rule Invocation Hanmeng Liu et.al. 2404.18130v1 null
2024-04-28 Generative AI for Low-Carbon Artificial Intelligence of Things Jinbo Wen et.al. 2404.18077v1 null
2024-04-27 CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments Kaixuan Huang et.al. 2404.18021v1 null
2024-04-27 Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction Guozheng Li et.al. 2404.17809v1 null
2024-04-26 CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving Pei Chen et.al. 2404.17729v1 link
2024-04-26 PLAYER: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games* Qinglin Zhu et.al. 2404.17662v1 link
2024-05-09 Large Language Model Agent as a Mechanical Designer Yayati Jadhav et.al. 2404.17525v2 null
2024-04-29 On the Use of Large Language Models to Generate Capability Ontologies Luis Miguel Vieira da Silva et.al. 2404.17524v2 null
2024-04-26 Enhancing Legal Compliance and Regulation Analysis with Large Language Models Shabnam Hassani et.al. 2404.17522v1 null
2024-04-26 A Comprehensive Evaluation on Event Reasoning of Large Language Models Zhengwei Tao et.al. 2404.17513v1 link
2024-04-26 Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System Robin Schmucker et.al. 2404.17460v1 null
2024-04-26 Small Language Models Need Strong Verifiers to Self-Correct Reasoning Yunxiang Zhang et.al. 2404.17140v1 null
2024-04-26 Make Your LLM Fully Utilize the Context Shengnan An et.al. 2404.16811v2 link
2024-04-25 Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning Tianhui Zhang et.al. 2404.16807v1 null
2024-04-25 RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis Xiaoman Zhang et.al. 2404.16754v1 null
2024-04-25 Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents Giorgio Piatti et.al. 2404.16698v1 null
2024-04-25 EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning Hongxia Xie et.al. 2404.16670v1 link
2024-04-25 Evolutionary Large Language Models for Hardware Security: A Comparative Survey Mohammad Akyash et.al. 2404.16651v1 null
2024-04-25 Evaluating Consistency and Reasoning Capabilities of Large Language Models Yash Saxena et.al. 2404.16478v1 null
2024-04-25 List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs An Yan et.al. 2404.16375v1 link
2024-04-24 The Feasibility of Implementing Large-Scale Transformers on Multi-FPGA Platforms Yu Gao et.al. 2404.16158v1 null
2024-04-24 Cantor: Inspiring Multimodal Chain-of-Thought of MLLM Timin Gao et.al. 2404.16033v1 null
2024-04-24 GeckOpt: LLM System Efficiency via Intent-Based Tool Selection Michael Fore et.al. 2404.15804v1 null
2024-04-24 Leveraging Large Language Models for Multimodal Search Oriol Barbany et.al. 2404.15790v1 null
2024-04-24 Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs Yu Xia et.al. 2404.15676v1 null
2024-04-24 Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Hossein Salami et.al. 2404.15578v1 null
2024-04-23 Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models Mihir Parmar et.al. 2404.15522v1 link
2024-04-25 ToM-LM: Delegating Theory of Mind Reasoning to External Symbolic Executors in Large Language Models Weizhi Tang et.al. 2404.15515v2 null
2024-04-23 Re-Thinking Inverse Graphics With Large Language Models Peter Kulits et.al. 2404.15228v1 null
2024-04-23 Regressive Side Effects of Training Language Models to Mimic Student Misconceptions Shashank Sonkar et.al. 2404.15156v1 null
2024-04-23 Rethinking LLM Memorization through the Lens of Adversarial Compression Avi Schwarzschild et.al. 2404.15146v1 null
2024-04-28 Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Reasoners Qihuang Zhong et.al. 2404.14963v2 null
2024-04-23 Graph Machine Learning in the Era of Large Language Models (LLMs) Wenqi Fan et.al. 2404.14928v1 null
2024-04-23 Pattern-Aware Chain-of-Thought Prompting in Large Language Models Yufeng Zhang et.al. 2404.14812v1 null
2024-04-23 A Survey of Large Language Models on Generative Graph Analytics: Query, Learning, and Applications Wenbo Shang et.al. 2404.14809v1 null
2024-04-23 Med42 -- Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches Clément Christophe et.al. 2404.14779v1 null
2024-04-23 CT-Agent: Clinical Trial Multi-Agent with Large Language Model-based Reasoning Ling Yue et.al. 2404.14777v1 null
2024-04-23 Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks Amir Saeidi et.al. 2404.14723v1 null
2024-04-23 Think-Program-reCtify: 3D Situated Reasoning with Large Language Models Qingrong He et.al. 2404.14705v1 null
2024-04-23 NExT: Teaching Large Language Models to Reason about Code Execution Ansong Ni et.al. 2404.14662v1 null
2024-04-26 Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training Mengzhao Jia et.al. 2404.14604v3 null
2024-04-22 Tree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question Answering Li Jiapeng et.al. 2404.14464v1 null
2024-04-14 Enhancing Fault Detection for Large Language Models via Mutation-Based Confidence Smoothing Qiang Hu et.al. 2404.14419v1 null
2024-04-22 An Artificial Neuron for Enhanced Problem Solving in Large Language Models Sumedh Rasal et.al. 2404.14222v1 null
2024-04-22 Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction Zheye Deng et.al. 2404.14215v1 link
2024-04-24 Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion Yingxuan Li et.al. 2404.13993v2 null
2024-04-22 Information Re-Organization Improves Reasoning in Large Language Models Xiaoxia Cheng et.al. 2404.13985v1 null
2024-04-22 MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit Boning Zhang et.al. 2404.13925v1 link
2024-04-22 Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models Yukyung Lee et.al. 2404.13919v1 null
2024-04-22 EventLens: Leveraging Event-Aware Pretraining and Cross-modal Linking Enhances Visual Commonsense Reasoning Mingjie Ma et.al. 2404.13847v1 null
2024-04-24 MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning Yifan Jiang et.al. 2404.13591v2 link
2024-04-20 Large Language Models as Test Case Generators: Performance Evaluation and Enhancement Kefan Li et.al. 2404.13340v1 null
2024-05-03 LLMChain: Blockchain-based Reputation System for Sharing and Evaluating Large Language Models Mouhamed Amine Bouchiha et.al. 2404.13236v2 link
2024-04-19 Beyond Self-Consistency: Ensemble Reasoning Boosts Consistency and Accuracy of LLMs in Cancer Staging Chia-Hsuan Chang et.al. 2404.13149v1 null
2024-04-17 TREACLE: Thrifty Reasoning via Context-Aware LLM and Prompt Selection Xuechen Zhang et.al. 2404.13082v1 null
2024-04-14 Evidence from counterfactual tasks supports emergent analogical reasoning in large language models Taylor Webb et.al. 2404.13070v1 link
2024-04-19 Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs Biyang Guo et.al. 2404.13033v1 link
2024-04-24 Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models Yian Li et.al. 2404.12966v2 null
2024-04-29 Large Language Models for Networking: Workflow, Advances and Challenges Chang Liu et.al. 2404.12901v2 null
2024-04-19 Towards Logically Consistent Language Models via Probabilistic Reasoning Diego Calanzone et.al. 2404.12843v1 null
2024-04-19 TextSquare: Scaling up Text-Centric Visual Instruction Tuning Jingqun Tang et.al. 2404.12803v1 null
2024-04-19 Relevant or Random: Can LLMs Truly Perform Analogical Reasoning? Chengwei Qin et.al. 2404.12728v1 null
2024-04-19 Enabling Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration Yichong Huang et.al. 2404.12715v1 null
2024-04-22 Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs Boyang Yang et.al. 2404.12636v2 null
2024-04-18 BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models Yu Feng et.al. 2404.12494v1 null
2024-04-18 NORMAD: A Benchmark for Measuring the Cultural Adaptability of Large Language Models Abhinav Rao et.al. 2404.12464v1 null
2024-04-25 BLINK: Multimodal Large Language Models Can See but Not Perceive Xingyu Fu et.al. 2404.12390v2 null
2024-04-18 MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale Xiaotang Gai et.al. 2404.12372v1 null
2024-04-18 Large Language Models in Targeted Sentiment Analysis Nicolay Rusnachenko et.al. 2404.12342v1 link
2024-04-18 Normative Requirements Operationalization with Large Language Models Nick Feng et.al. 2404.12335v1 null
2024-04-18 Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Ye Tian et.al. 2404.12253v1 null
2024-04-19 AccidentBlip2: Accident Detection With Multi-View MotionBlip2 Yihua Shao et.al. 2404.12149v2 link
2024-04-18 RAGAR, Your Falsehood RADAR: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models M. Abdul Khaliq et.al. 2404.12065v1 null
2024-04-18 EVIT: Event-Oriented Instruction Tuning for Event Reasoning Zhengwei Tao et.al. 2404.11978v1 null
2024-04-18 Large Language Models Can Plan Your Travels Rigorously with Formal Verification Tools Yilun Hao et.al. 2404.11891v1 null
2024-04-18 CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models Minjung Shin et.al. 2404.11835v1 null
2024-04-19 Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study Zooey Nguyen et.al. 2404.11792v2 null
2024-04-21 Missed Connections: Lateral Thinking Puzzles for Large Language Models Graham Todd et.al. 2404.11730v2 null
2024-04-17 How often are errors in natural language reasoning due to paraphrastic variability? Neha Srikanth et.al. 2404.11717v1 null
2024-04-17 Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models Yue Zhou et.al. 2404.11500v1 link
2024-04-17 Exploring the Transferability of Visual Prompting for Multimodal Large Language Models Yichi Zhang et.al. 2404.11207v1 link
2024-04-17 Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales Minghe Gao et.al. 2404.11129v1 null
2024-04-17 TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment Qinfeng Li et.al. 2404.11121v1 null
2024-04-18 ViLLM-Eval: A Comprehensive Evaluation Suite for Vietnamese Large Language Models Trong-Hieu Nguyen et.al. 2404.11086v2 null
2024-04-17 On the Empirical Complexity of Reasoning and Planning in LLMs Liwei Kang et.al. 2404.11041v1 null
2024-04-17 Empowering Large Language Models on Robotic Manipulation with Affordance Prompting Guangran Cheng et.al. 2404.11027v1 null
2024-04-17 Many-Shot In-Context Learning Rishabh Agarwal et.al. 2404.11018v1 null
2024-04-16 Self-playing Adversarial Language Game Enhances LLM Reasoning Pengyu Cheng et.al. 2404.10642v1 link
2024-04-16 Private Attribute Inference from Images with Vision-Language Models Batuhan Tömekçe et.al. 2404.10618v1 null
2024-04-16 Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases Yanze Li et.al. 2404.10595v1 null
2024-04-16 CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Moshe Berchansky et.al. 2404.10513v1 null
2024-04-16 MEEL: Multi-Modal Event Evolution Learning Zhengwei Tao et.al. 2404.10429v1 link
2024-04-16 Reasoning on Efficient Knowledge Paths: Knowledge Graph Guides Large Language Model for Domain Question Answering Yuqi Wang et.al. 2404.10384v1 null
2024-04-16 Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards Hyeonbin Hwang et.al. 2404.10346v1 link
2024-04-28 RLRF: Reinforcement Learning from Reflection through Debates as Feedback for Bias Mitigation in LLMs Ruoxi Cheng et.al. 2404.10160v2 null
2024-04-15 TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition Md Mahadi Hasan Nahid et.al. 2404.10150v1 link
2024-04-15 ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis Aashish Anantha Ramakrishnan et.al. 2404.10141v1 link
2024-04-15 A Survey on Deep Learning for Theorem Proving Zhaoyu Li et.al. 2404.09939v1 link
2024-04-15 Compression Represents Intelligence Linearly Yuzhen Huang et.al. 2404.09937v1 link
2024-04-15 AI-Driven Statutory Reasoning via Software Engineering Methods Rohan Padhye et.al. 2404.09868v1 null
2024-04-15 Reimagining Self-Adaptation in the Age of Large Language Models Raghav Donakanti et.al. 2404.09866v1 null
2024-04-15 Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model Hyunsoo Cho et.al. 2404.09717v1 null
2024-04-15 Generative AI for Game Theory-based Mobile Networking Long He et.al. 2404.09699v1 null
2024-04-15 Bridging Vision and Language Spaces with Assignment Prediction Jungin Park et.al. 2404.09632v1 link
2024-04-15 Bridging the Gap between Different Vocabularies for LLM Ensemble Yangyifan Xu et.al. 2404.09492v1 link
2024-04-15 Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning Sungwon Han et.al. 2404.09491v1 link
2024-04-15 MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems Kaixin Li et.al. 2404.09486v1 link
2024-04-14 A Survey on Integration of Large Language Models with Intelligent Robots Yeseung Kim et.al. 2404.09228v1 null
2024-04-16 Post-Semantic-Thinking: A Robust Strategy to Distill Reasoning Capacity from Large Language Models Xiaoshu Chen et.al. 2404.09170v2 null
2024-04-14 When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models Yanhong Li et.al. 2404.09129v1 null
2024-04-13 CuriousLLM: Elevating Multi-Document QA with Reasoning-Infused Knowledge Graph Prompting Zukang Yang et.al. 2404.09077v1 link
2024-04-12 "Don't forget to put the milk back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations James F. Mullen Jr et.al. 2404.08827v1 null
2024-04-12 LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning Junchi Wang et.al. 2404.08767v1 link
2024-04-11 MM-PhyQA: Multimodal Physics Question-Answering With Multi-Image CoT Prompting Avinash Anand et.al. 2404.08704v1 null
2024-04-10 Apollonion: Profile-centric Dialog Agent Shangyu Chen et.al. 2404.08692v1 null
2024-04-06 ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Simone Tedeschi et.al. 2404.08676v1 link
2024-04-12 Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts Övgü Özdemir et.al. 2404.08589v1 link
2024-04-12 LaSagnA: Language-based Segmentation Assistant for Complex Queries Cong Wei et.al. 2404.08506v1 link
2024-04-12 Strategic Interactions between Large Language Models-based Agents in Beauty Contests Siting Lu et.al. 2404.08492v1 null
2024-04-12 Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian Stefano De Paoli et.al. 2404.08488v1 null
2024-04-11 Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs Jierui Li et.al. 2404.08148v1 null
2024-04-11 Data-Augmentation-Based Dialectal Adaptation for LLMs Fahim Faisal et.al. 2404.08092v1 link
2024-04-10 Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition Kehua Feng et.al. 2404.08008v1 link
2024-04-17 LaVy: Vietnamese Multimodal Large Language Model Chi Tran et.al. 2404.07922v4 link
2024-04-11 ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs Lei Sun et.al. 2404.07677v1 null
2024-04-11 WESE: Weak Exploration to Strong Exploitation for LLM Agents Xu Huang et.al. 2404.07456v1 null
2024-04-11 Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs Kanchana Ranasinghe et.al. 2404.07449v1 null
2024-04-10 Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs Bowen Jin et.al. 2404.07103v1 link
2024-04-10 VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning Alexandros Xenos et.al. 2404.07078v1 link
2024-04-10 Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study Hongru Du et.al. 2404.06962v1 link
2024-04-10 Vision-Language Model-based Physical Reasoning for Robot Liquid Perception Wenqiang Lai et.al. 2404.06904v1 null
2024-04-09 GenCHiP: Generating Robot Policy Code for High-Precision and Contact-Rich Manipulation Tasks Kaylee Burns et.al. 2404.06645v1 null
2024-04-09 Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language? Omid Ghahroodi et.al. 2404.06644v1 null
2024-04-09 AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents Luca Gioacchini et.al. 2404.06411v1 link
2024-04-09 Model Generation from Requirements with LLMs: an Exploratory Study Alessio Ferrari et.al. 2404.06371v1 null
2024-04-21 AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning Senkang Hu et.al. 2404.06345v2 null
2024-04-09 DRE: Generating Recommendation Explanations by Aligning Large Language Models at Data-level Shen Gao et.al. 2404.06311v1 null
2024-04-09 Multimodal Road Network Generation Based on Large Language Model Jiajing Chen et.al. 2404.06227v1 null
2024-04-08 Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning Ruiqi Zhang et.al. 2404.05868v1 null
2024-04-08 Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Keen You et.al. 2404.05719v1 null
2024-04-08 Evaluating Mathematical Reasoning Beyond Accuracy Shijie Xia et.al. 2404.05692v1 link
2024-04-18 CoReS: Orchestrating the Dance of Reasoning and Segmentation Xiaoyi Bao et.al. 2404.05673v2 null
2024-04-08 MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering Iñigo Alonso et.al. 2404.05590v1 null
2024-04-08 Evaluating Interventional Reasoning Capabilities of Large Language Models Tejas Kasetty et.al. 2404.05545v1 null
2024-04-08 HAMMR: HierArchical MultiModal React agents for generic VQA Lluis Castrejon et.al. 2404.05465v1 null
2024-04-11 RoT: Enhancing Large Language Models with Reflection on Search Trees Wenyang Hui et.al. 2404.05449v2 link
2024-04-08 Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models Yutao Ouyang et.al. 2404.05291v1 null
2024-04-08 LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models Shibo Hao et.al. 2404.05221v1 null
2024-04-08 LLM-BT: Performing Robotic Adaptive Tasks based on Large Language Models and Behavior Trees Haotian Zhou et.al. 2404.05134v1 null
2024-04-07 Facial Affective Behavior Analysis with Instruction Tuning Yifan Li et.al. 2404.05052v1 null
2024-04-07 MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models Zihao Wei et.al. 2404.04990v1 link
2024-04-07 SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials Mael Jullien et.al. 2404.04963v1 null
2024-04-07 RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models Qi Lv et.al. 2404.04929v1 null
2024-04-07 LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead Junda He et.al. 2404.04834v1 null
2024-04-07 FRACTAL: Fine-Grained Scoring from Aggregate Text Labels Yukti Makhija et.al. 2404.04817v1 null
2024-04-07 GenEARL: A Training-Free Generative Framework for Multimodal Event Argument Role Labeling Hritik Bansal et.al. 2404.04763v1 null
2024-04-06 Challenges Faced by Large Language Models in Solving Multi-Agent Flocking Peihan Li et.al. 2404.04752v1 null
2024-04-06 Navigating the Landscape of Hint Generation Research: From the Past to the Future Anubhav Jangra et.al. 2404.04728v1 null
2024-04-06 Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology Dyke Ferber et.al. 2404.04667v1 null
2024-04-06 Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement Zaid Khan et.al. 2404.04627v1 null
2024-04-06 IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials Shreyasi Mandal et.al. 2404.04510v1 link
2024-04-05 Exploring Autonomous Agents through the Lens of Large Language Models: A Review Saikat Barua et.al. 2404.04442v1 null
2024-04-05 Assisting humans in complex comparisons: automated information comparison at scale Truman Yuen et.al. 2404.04351v1 null
2024-04-05 Koala: Key frame-conditioned long video-LLM Reuben Tan et.al. 2404.04346v1 null
2024-04-04 CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLMs for Legal Question Answering Nirmalie Wiratunga et.al. 2404.04302v1 link
2024-04-04 Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding Yanda Li et.al. 2404.04293v1 null
2024-04-05 Physical Property Understanding from Language-Embedded Feature Fields Albert J. Zhai et.al. 2404.04242v1 null
2024-04-05 Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents Harsh Kohli et.al. 2404.04237v1 null
2024-04-05 Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer Hele-Andra Kuulmets et.al. 2404.04042v1 null
2024-04-05 Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning Gawon Choi et.al. 2404.03891v1 link
2024-04-08 SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models Hyeonwoo Kim et.al. 2404.03887v2 null
2024-04-04 Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra Darioush Kevian et.al. 2404.03647v1 null
2024-04-04 Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph Marco Bronzini et.al. 2404.03623v1 null
2024-04-04 Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models Wenshan Wu et.al. 2404.03622v1 null
2024-04-04 Sailor: Open Language Models for South-East Asia Longxu Dou et.al. 2404.03608v1 link
2024-04-04 Evaluating LLMs at Detecting Errors in LLM Responses Ryo Kamoi et.al. 2404.03602v1 link
2024-04-04 Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models Yantao Liu et.al. 2404.03577v1 link
2024-04-04 Edisum: Summarizing and Explaining Wikipedia Edits at Scale Marija Šakota et.al. 2404.03428v1 link
2024-04-04 Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought Jooyoung Lee et.al. 2404.03414v1 null
2024-04-04 nicolay-r at SemEval-2024 Task 3: Using Flan-T5 for Reasoning Emotion Cause in Conversations with Chain-of-Thought on Emotion States Nicolay Rusnachenko et.al. 2404.03361v1 link
2024-04-04 Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics Fangru Lin et.al. 2404.03301v1 link
2024-04-04 The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models Noah Y. Siegel et.al. 2404.03189v1 null
2024-04-04 Robust Pronoun Use Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased? Vagrant Gautam et.al. 2404.03134v1 link
2024-04-10 An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models Emmy Liu et.al. 2404.03028v2 null
2024-04-03 Towards a Fully Interpretable and More Scalable RSA Model for Metaphor Understanding Gaia Carenini et.al. 2404.02983v1 null
2024-04-03 Explainable Traffic Flow Prediction with Large Language Models Xusen Guo et.al. 2404.02937v1 null
2024-04-03 KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking Jiawei Zhang et.al. 2404.02935v1 link
2024-04-03 GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning Jeffy Yu et.al. 2404.02934v1 null
2024-04-03 I-Design: Personalized LLM Interior Designer Ata Çelen et.al. 2404.02838v1 null
2024-04-03 Empowering Biomedical Discovery with AI Agents Shanghua Gao et.al. 2404.02831v1 null
2024-04-05 A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches Zhigen Zhao et.al. 2404.02817v2 null
2024-04-03 Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Hyungjoo Chae et.al. 2404.02575v1 null
2024-04-03 VIAssist: Adapting Multi-modal Large Language Models for Users with Visual Impairments Bufang Yang et.al. 2404.02508v1 null
2024-04-03 Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Amirhossein Abaskohi et.al. 2404.02403v1 link
2024-04-02 $\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning Gurusha Juneja et.al. 2404.02255v1 null
2024-04-02 Advancing LLM Reasoning Generalists with Preference Trees Lifan Yuan et.al. 2404.02078v1 link
2024-04-04 Long-context LLMs Struggle with Long In-context Learning Tianle Li et.al. 2404.02060v2 link
2024-04-02 Large Language Models for Orchestrating Bimanual Robots Kun Chu et.al. 2404.02018v1 null
2024-04-13 HyperCLOVA X Technical Report Kang Min Yoo et.al. 2404.01954v2 null
2024-04-02 Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey Philipp Mondorf et.al. 2404.01869v1 null
2024-04-02 Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation Shanshan Feng et.al. 2404.01855v1 link
2024-04-03 Towards Generalizable and Faithful Logic Reasoning over Natural Language via Resolution Refutation Zhouhao Sun et.al. 2404.01677v2 null
2024-04-02 METAL: Towards Multilingual Meta-Evaluation Rishav Hada et.al. 2404.01667v1 null
2024-04-02 InsightLens: Discovering and Exploring Insights from Conversational Contexts in Large-Language-Model-Powered Data Analysis Luoxuan Weng et.al. 2404.01644v1 null
2024-04-01 Syntactic Robustness for LLM-based Code Generation Laboni Sarker et.al. 2404.01535v1 null
2024-04-01 Are large language models superhuman chemists? Adrian Mirza et.al. 2404.01475v1 null
2024-04-01 Will the Real Linda Please Stand up...to Large Language Models? Examining the Representativeness Heuristic in LLMs Pengda Wang et.al. 2404.01461v1 null
2024-03-31 CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs Jingzhe Shi et.al. 2404.01343v1 null
2024-04-01 FABLES: Evaluating faithfulness and content selection in book-length summarization Yekyung Kim et.al. 2404.01261v1 link
2024-04-01 A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules Xiang Li et.al. 2404.01245v1 null
2024-04-01 LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models Yadong Zhang et.al. 2404.01230v1 null
2024-04-01 Enhancing Reasoning Capacity of SLM using Cognitive Enhancement Jonathan Pan et.al. 2404.01135v1 null
2024-04-01 Enabling Memory Safety of C Programs using LLMs Nausheen Mohammed et.al. 2404.01096v1 null
2024-04-01 Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning Rongjie Li et.al. 2404.00909v1 null
2024-04-02 An Abundance of Katherines: The Game Theory of Baby Naming Katy Blumer et.al. 2404.00732v2 null
2024-03-30 Multi-hop Question Answering under Temporal Knowledge Editing Keyuan Cheng et.al. 2404.00492v1 null
2024-04-04 Planning and Editing What You Retrieve for Enhanced Tool Learning Tenghao Huang et.al. 2404.00450v2 link
2024-03-30 Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks Hyunjae Kim et.al. 2404.00376v1 null
2024-03-30 Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange Ankit Satpute et.al. 2404.00344v1 link
2024-03-30 Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World Guande Wu et.al. 2404.00246v1 link
2024-03-30 Aligning Large Language Models with Recommendation Knowledge Yuwei Cao et.al. 2404.00245v1 null
2024-03-30 DeFT: Flash Tree-attention with IO-Awareness for Efficient Tree-search-based LLM Inference Jinwei Yao et.al. 2404.00242v1 null
2024-03-30 Multi-Conditional Ranking with Large Language Models Pouya Pezeshkpour et.al. 2404.00211v1 link
2024-03-30 EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs Cheng Jiayang et.al. 2404.00209v1 link
2024-03-30 Conceptual and Unbiased Reasoning in Language Models Ben Zhou et.al. 2404.00205v1 null
2024-03-29 Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Ahmad Diab et.al. 2404.00141v1 null
2024-03-29 Measuring Taiwanese Mandarin Language Understanding Po-Heng Chen et.al. 2403.20180v1 null
2024-03-29 ITCMA: A Generative Agent Based on a Computational Consciousness Structure Hanzhong Zhang et.al. 2403.20097v1 null
2024-03-29 On Large Language Models' Hallucination with Regard to Known Facts Che Jiang et.al. 2403.20009v1 null
2024-03-29 Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning Qinhao Zhou et.al. 2403.19962v1 null
2024-03-28 LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces Xiaomin Ouyang et.al. 2403.19857v1 null
2024-03-28 Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving Akshay Gopalkrishnan et.al. 2403.19838v1 link
2024-03-28 Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models Yucheng Shi et.al. 2403.19631v1 null
2024-03-28 BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation Yuhong He et.al. 2403.19414v1 null
2024-03-28 RAIL: Robot Affordance Imagination with Large Language Models Ceng Zhang et.al. 2403.19369v1 null
2024-03-28 IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation Jiacui Huang et.al. 2403.19336v1 null
2024-03-28 Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models Jiaxing Chen et.al. 2403.19322v1 null
2024-04-01 TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios Xiaokang Zhang et.al. 2403.19318v2 link
2024-03-28 Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering Yexin Wu et.al. 2403.19167v1 null
2024-03-28 MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering Che Guan et.al. 2403.19116v1 null
2024-03-28 Learning From Correctness Without Prompting Makes LLM Efficient Reasoner Yuxuan Yao et.al. 2403.19094v1 null
2024-03-27 LITA: Language Instructed Temporal-Localization Assistant De-An Huang et.al. 2403.19046v1 link
2024-03-27 Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Yanwei Li et.al. 2403.18814v1 link
2024-04-03 Long-form factuality in large language models Jerry Wei et.al. 2403.18802v3 link
2024-03-27 A Path Towards Legal Autonomy: An interoperable and explainable approach to extracting, transforming, loading and computing legal information using large language models, expert systems and Bayesian networks Axel Constant et.al. 2403.18537v1 null
2024-03-27 TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions Jamshid Mozafari et.al. 2403.18426v1 link
2024-03-27 The Topos of Transformer Networks Mattia Jacopo Villani et.al. 2403.18415v1 null
2024-03-27 An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM Wonkyun Kim et.al. 2403.18406v1 link
2024-03-27 Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval Shengjie Ma et.al. 2403.18405v1 null
2024-03-27 BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models Haitao Li et.al. 2403.18365v1 null
2024-04-03 Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective Meiqi Chen et.al. 2403.18346v3 null
2024-03-27 LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models Mingxing Peng et.al. 2403.18344v1 null
2024-03-27 Dual Instruction Tuning with Large Language Models for Mathematical Reasoning Yongwei Zhou et.al. 2403.18295v1 null
2024-03-27 Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models Yiwu Zhong et.al. 2403.18252v1 link
2024-03-27 Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation Chuwen Wang et.al. 2403.18230v1 link
2024-03-28 Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models Kartikeya Bhardwaj et.al. 2403.18159v2 null
2024-03-26 Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization Jin Peng Zhou et.al. 2403.18120v1 link
2024-03-26 ShapeGrasp: Zero-Shot Task-Oriented Grasping with Large Language Models through Geometric Decomposition Samuel Li et.al. 2403.18062v1 null
2024-03-26 MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution Wei Tao et.al. 2403.17927v1 null
2024-03-26 Assessment of Multimodal Large Language Models in Alignment with Human Values Zhelun Shi et.al. 2403.17830v1 null
2024-03-26 Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons Shijia Zhou et.al. 2403.17760v1 link
2024-03-26 Large Language Models Enhanced Collaborative Filtering Zhongxiang Sun et.al. 2403.17688v1 null
2024-03-26 DGoT: Dynamic Graph of Thoughts for Scientific Abstract Generation Xinyu Ning et.al. 2403.17491v1 link
2024-03-26 ChatGPT Rates Natural Language Explanation Quality Like Humans: But on Which Scales? Fan Huang et.al. 2403.17368v1 link
2024-03-26 Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models Zhenyu Pan et.al. 2403.17359v1 null
2024-03-25 TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models Ishika Singh et.al. 2403.17246v1 null
2024-03-25 A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection Benjamin Steenhoek et.al. 2403.17218v1 null
2024-03-25 Grounding Language Plans in Demonstrations Through Counterfactual Perturbations Yanwei Wang et.al. 2403.17124v1 null
2024-03-25 Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models Hao Shao et.al. 2403.16999v1 link
2024-03-25 PropTest: Automatic Property Testing for Improved Visual Programming Jaywon Koo et.al. 2403.16921v1 null
2024-03-25 Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art Neeloy Chakraborty et.al. 2403.16527v1 null
2024-03-25 Harnessing the power of LLMs for normative reasoning in MASs Bastin Tony Roy Savarimuthu et.al. 2403.16524v1 null
2024-03-25 Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study Shawn He et.al. 2403.16517v1 null
2024-03-25 Evaluating Large Language Models with Runtime Behavior of Program Execution Junkai Chen et.al. 2403.16437v1 null
2024-03-27 Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation Ziyan Wang et.al. 2403.16427v3 null
2024-03-28 Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA Zhuowan Li et.al. 2403.16385v2 null
2024-03-28 Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Minyu Chen et.al. 2403.16097v2 null
2024-03-24 Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications Wei Ma et.al. 2403.16073v1 null
2024-03-23 Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning Zhouhang Xie et.al. 2403.15737v1 null
2024-03-23 LLMs Instruct LLMs: An Extraction and Editing Method Xin Zhang et.al. 2403.15736v1 null
2024-03-21 Open Source Conversational LLMs do not know most Spanish words Javier Conde et.al. 2403.15491v1 null
2024-03-19 LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction Hejie Cui et.al. 2403.15464v1 null
2024-04-01 LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models Yuzhang Shang et.al. 2403.15388v3 null
2024-03-22 Can large language models explore in-context? Akshay Krishnamurthy et.al. 2403.15371v1 null
2024-03-22 CoLLEGe: Concept Embedding Generation for Large Language Models Ryan Teehan et.al. 2403.15362v1 null
2024-03-22 Sphere Neural-Networks for Rational Reasoning Tiansi Dong et.al. 2403.15297v1 null
2024-03-22 MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection Taeheon Kim et.al. 2403.15209v1 null
2024-03-22 CACA Agent: Capability Collaboration based AI Agent Peng Xu et.al. 2403.15137v1 null
2024-04-03 MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thoughts Md Nishat Raihan et.al. 2403.14982v2 null
2024-03-22 Attention-Driven Reasoning: Unlocking the Potential of Large Language Models Bingli Liao et.al. 2403.14932v1 null
2024-03-25 VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding Ahmad Mahmood et.al. 2403.14743v2 null
2024-03-21 MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? Renrui Zhang et.al. 2403.14624v1 null
2024-03-21 A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science Clayton Cohn et.al. 2403.14565v1 null
2024-03-21 ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting Xiaoxue Cheng et.al. 2403.14312v1 link
2024-03-21 ERD: A Framework for Improving LLM Reasoning for Cognitive Distortion Classification Sehee Lim et.al. 2403.14255v1 null
2024-03-23 K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Kyuhee Kim et.al. 2403.14253v2 link
2024-03-21 Empowering Segmentation Ability to Multi-modal Large Language Models Yuqi Yang et.al. 2403.14141v1 null
2024-03-21 Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations Jiaxing Sun et.al. 2403.14112v1 link
2024-03-21 Empowering Personalized Learning through a Conversation-based Tutoring System with Student Modeling Minju Park et.al. 2403.14071v1 null
2024-03-14 Circuit Transformer: End-to-end Circuit Design by Predicting the Next Gate Xihan Li et.al. 2403.13838v1 null
2024-03-23 Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts Guangzeng Han et.al. 2403.13786v2 null
2024-03-22 Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs Ilias Chalkidis et.al. 2403.13592v2 link
2024-03-20 PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns Yew Ken Chia et.al. 2403.13315v1 link
2024-03-20 LeanReasoner: Boosting Complex Logical Reasoning with Lean Dongwei Jiang et.al. 2403.13312v1 link
2024-03-20 Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs Zhihong Sun et.al. 2403.13271v1 null
2024-03-19 VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning Yongshuo Zong et.al. 2403.13164v1 link
2024-03-13 AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models Shuo Jiang et.al. 2403.13002v1 null
2024-03-11 Prompt Selection and Augmentation for Few Examples Code Generation in Large Language Model and its Application in Robotics Control On Tai Wu et.al. 2403.12999v1 null
2024-03-19 Dated Data: Tracing Knowledge Cutoffs in Large Language Models Jeffrey Cheng et.al. 2403.12958v1 null
2024-03-19 Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models Joana Ribeiro de Faria et.al. 2403.12936v1 null
2024-03-19 mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding Anwen Hu et.al. 2403.12895v1 link
2024-03-19 HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning Fucai Ke et.al. 2403.12884v1 null
2024-03-19 Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Zehui Chen et.al. 2403.12881v1 link
2024-03-19 Compositional 3D Scene Synthesis with Scene Graph Guided Layout-Shape Generation Yao Wei et.al. 2403.12848v1 null
2024-03-19 RelationVLM: Making Large Vision-Language Models Understand Visual Relations Zhipeng Huang et.al. 2403.12801v1 null
2024-03-18 NovelQA: A Benchmark for Long-Range Novel Question Answering Cunxiang Wang et.al. 2403.12766v1 link
2024-03-19 Instructing Large Language Models to Identify and Ignore Irrelevant Conditions Zhenyu Wu et.al. 2403.12744v1 link
2024-03-19 Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs Victor Carbune et.al. 2403.12596v1 null
2024-03-19 AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework Xiang Li et.al. 2403.12582v1 link
2024-03-19 To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions Daniel Tanneberg et.al. 2403.12533v1 null
2024-03-19 Embodied LLM Agents Learn to Cooperate in Organized Teams Xudong Guo et.al. 2403.12482v1 null
2024-03-19 Dr3: Ask Large Language Models Not to Give Off-Topic Answers in Open Domain Multi-Hop Question Answering Yuan Gao et.al. 2403.12393v1 null
2024-03-22 RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners Chi Hu et.al. 2403.12373v3 null
2024-03-18 OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety Chuang Liu et.al. 2403.12316v1 null
2024-03-18 TnT-LLM: Text Mining at Scale with Large Language Models Mengting Wan et.al. 2403.12173v1 null
2024-03-18 EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents Abhay Zala et.al. 2403.12014v1 null
2024-03-18 QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction Xiang Huang et.al. 2403.11886v1 null
2024-03-18 Agent3D-Zero: An Agent for Zero-shot 3D Understanding Sha Zhang et.al. 2403.11835v1 null
2024-03-18 Metaphor Understanding Challenge Dataset for LLMs Xiaoyu Tong et.al. 2403.11810v1 null
2024-03-25 Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models Mingyang Song et.al. 2403.11802v2 link
2024-03-18 Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus Seungpil Lee et.al. 2403.11793v1 null
2024-03-20 LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning Shu Wang et.al. 2403.11552v2 link
2024-03-22 Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning Rao Fu et.al. 2403.11401v2 null
2024-03-17 ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models Siyuan Huang et.al. 2403.11289v1 link
2024-03-17 Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering Baiyan Zhang et.al. 2403.11129v1 null
2024-03-17 GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment Lance Ying et.al. 2403.11075v1 null
2024-03-26 SelfIE: Self-Interpretation of Large Language Model Embeddings Haozhe Chen et.al. 2403.10949v2 link
2024-03-16 BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English Sheikh Shafayat et.al. 2403.10900v1 link
2024-03-16 A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment Tianhe Wu et.al. 2403.10854v1 link
2024-03-16 NARRATE: Versatile Language Architecture for Optimal Control in Robotics Seif Ismail et.al. 2403.10762v1 null
2024-03-15 VideoAgent: Long-form Video Understanding with Large Language Model as Agent Xiaohan Wang et.al. 2403.10517v1 null
2024-03-15 Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization Ratnadira Widyasari et.al. 2403.10507v1 null
2024-03-15 HawkEye: Training Video-Text LLMs for Grounding Text in Videos Yueqian Wang et.al. 2403.10228v1 link
2024-03-15 AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation Arkajit Datta et.al. 2403.10171v1 null
2024-03-15 RAFT: Adapting Language Model to Domain Specific RAG Tianjun Zhang et.al. 2403.10131v1 link
2024-03-15 Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning Hang Zhang et.al. 2403.10107v1 null
2024-03-15 Knowledge Condensation and Reasoning for Knowledge-based VQA Dongze Hao et.al. 2403.10037v1 null
2024-03-15 ViTCN: Vision Transformer Contrastive Network For Reasoning Bo Song et.al. 2403.09962v1 null
2024-03-14 Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models Zhuoqun Li et.al. 2403.09750v1 link
2024-03-14 Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors Guanghua Li et.al. 2403.09747v1 null
2024-03-13 Do Large Language Models Solve ARC Visual Analogies Like People Do? Gustaw Opiełka et.al. 2403.09734v1 null
2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model Haoyu Zhen et.al. 2403.09631v1 null
2024-03-22 MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Brandon McKinzie et.al. 2403.09611v3 null
2024-03-14 Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey Xiaoyu Liu et.al. 2403.09606v1 null
2024-03-14 Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis Gregory Coppola et.al. 2403.09599v1 null
2024-03-15 ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models Runyu Ma et.al. 2403.09583v2 null
2024-03-22 Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation Yunhao Gou et.al. 2403.09572v2 null
2024-03-21 Less is More: Data Value Estimation for Visual Instruction Tuning Zikang Liu et.al. 2403.09559v2 null
2024-03-14 Exploring the Comprehension of ChatGPT in Traditional Chinese Medicine Knowledge Li Yizhen et.al. 2403.09164v1 null
2024-03-14 Caveat Lector: Large Language Models in Legal Practice Eliza Mik et.al. 2403.09163v1 null
2024-03-14 USimAgent: Large Language Models for Simulating Search Users Erhan Zhang et.al. 2403.09142v1 null
2024-03-14 Meaningful Learning: Advancing Abstract Reasoning in Large Language Models via Generic Fact Guidance Kai Xiong et.al. 2403.09085v1 null
2024-03-14 Query Rewriting via Large Language Models Jie Liu et.al. 2403.09060v1 null
2024-03-13 Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era Xuansheng Wu et.al. 2403.08946v1 link
2024-03-13 AcademiaOS: Automating Grounded Theory Development in Qualitative Research with Large Language Models Thomas Übellacker et.al. 2403.08844v1 link
2024-03-13 TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation Dingbang Li et.al. 2403.08833v1 null
2024-03-13 Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework Jingling Li et.al. 2403.08743v1 null
2024-03-13 The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models Carlo Nicolini et.al. 2403.08739v1 null
2024-03-14 Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation Daniel Honerkamp et.al. 2403.08605v2 link
2024-03-13 Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments Sitao Cheng et.al. 2403.08593v1 null
2024-03-13 CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model Cheng Chen et.al. 2403.08350v1 link
2024-03-13 LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments Maonan Wang et.al. 2403.08337v1 link
2024-03-13 Can Large Language Models Identify Authorship? Baixiang Huang et.al. 2403.08213v1 link
2024-03-13 Large Language Models are Contrastive Reasoners Liang Yao et.al. 2403.08211v1 link
2024-03-12 DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies William Xie et.al. 2403.07832v1 null
2024-03-12 Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM Sainbayar Sukhbaatar et.al. 2403.07816v1 null
2024-03-12 Fine-tuning Large Language Models with Sequential Instructions Hanxu Hu et.al. 2403.07794v1 link
2024-03-15 Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations Carlos Jose Xavier Cruz et.al. 2403.07769v3 link
2024-03-12 FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models Yan Liu et.al. 2403.07747v1 null
2024-03-12 Multi-modal Auto-regressive Modeling via Visual Words Tianshuo Peng et.al. 2403.07720v1 link
2024-03-12 DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models Yuanfei Lin et.al. 2403.07470v1 link
2024-03-12 Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs Tianqing Fang et.al. 2403.07398v1 null
2024-03-12 NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning Bingqian Lin et.al. 2403.07376v1 link
2024-03-11 Narrating Causal Graphs with Large Language Models Atharva Phatak et.al. 2403.07118v1 null
2024-03-13 Naming, Describing, and Quantifying Visual Objects in Humans and LLMs Alberto Testoni et.al. 2403.06935v2 link
2024-03-11 ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis Yanming Liu et.al. 2403.06932v1 link
2024-03-11 RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback Yanming Liu et.al. 2403.06840v1 link
2024-03-11 KELLMRec: Knowledge-Enhanced Large Language Models for Recommendation Weiqing Luo et.al. 2403.06642v1 null
2024-03-11 **Guiding Clinical Reasoning with Large Language Models via K