Updated on 2024.07.25

Table of Contents

LLM - Explainable
LLM - Interpretable
LLM - Reasoning
LLM - Uncertainty
LLM - Perplexity

LLM - Explainable

Publish Date	Title	Authors	PDF	Code
2024-07-24	ViPer: Visual Personalization of Generative Models via Individual Preference Learning	Sogand Salehi et.al.	2407.17365v1	null
2024-07-24	Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism	Anhao Zhao et.al.	2407.17011v1	null
2024-07-24	MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues	Liyun Zhang et.al.	2407.16552v2	null
2024-07-22	AI for Handball: predicting and explaining the 2024 Olympic Games tournament with Deep Learning and Large Language Models	Florian Felice et.al.	2407.15987v1	null
2024-07-22	Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability	Zhuoyan Xu et.al.	2407.15720v1	link
2024-07-22	Dissecting Multiplication in Transformers: Insights into LLMs	Luyu Qiu et.al.	2407.15360v1	null
2024-07-21	Explaining Decisions of Agents in Mixed-Motive Games	Maayan Orner et.al.	2407.15255v1	null
2024-07-21	XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models	Erik Cambria et.al.	2407.15248v1	null
2024-07-20	Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models	Ze Yu Zhang et.al.	2407.14845v1	null
2024-07-21	Trading Devil Final: Backdoor attack via Stock market and Bayesian Optimization	Orson Mengara et.al.	2407.14573v1	null
2024-07-19	Evaluating the Reliability of Self-Explanations in Large Language Models	Korbinian Randl et.al.	2407.14487v1	link
2024-07-19	Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier	Zachary Wojtowicz et.al.	2407.14452v1	null
2024-07-18	The Software Complexity of Nations	Sándor Juhász et.al.	2407.13880v1	null
2024-07-24	The Honorific Effect: Exploring the Impact of Japanese Linguistic Formalities on AI-Generated Physics Explanations	Keisuke Sato et.al.	2407.13787v2	null
2024-07-18	COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization	Skyler Grandel et.al.	2407.13648v1	null
2024-07-18	SOMONITOR: Explainable Marketing Data Processing and Analysis with Large Language Models	Qi Yang et.al.	2407.13117v1	null
2024-07-17	Explainable Biomedical Hypothesis Generation via Retrieval Augmented Generation enabled Large Language Models	Alexander R. Pelletier et.al.	2407.12888v1	null
2024-07-16	InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification	Yujia Hu et.al.	2407.12882v1	link
2024-07-03	Truth is Universal: Robust Detection of Lies in LLMs	Lennart Bürger et.al.	2407.12831v1	null
2024-07-16	InvAgent: A Large Language Model based Multi-Agent System for Inventory Management in Supply Chains	Yinzhu Quan et.al.	2407.11384v1	link
2024-06-03	The Life Cycle of Large Language Models: A Review of Biases in Education	Jinsook Lee et.al.	2407.11203v1	null
2024-06-25	RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems	Robert Friel et.al.	2407.11005v1	null
2024-06-24	Visualization Literacy of Multimodal Large Language Models: A Comparative Study	Zhimin Li et.al.	2407.10996v1	null
2024-06-23	Do Large Language Models Understand Verbal Indicators of Romantic Attraction?	Sandra C. Matz et.al.	2407.10989v1	null
2024-07-15	GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework	Hannah Sansford et.al.	2407.10793v1	null
2024-07-16	Transforming Agency. On the mode of existence of Large Language Models	Xabier E. Barandiaran et.al.	2407.10735v2	null
2024-07-15	Learning Dynamics of LLM Finetuning	Yi Ren et.al.	2407.10490v1	link
2024-07-19	Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine	Omid Rohanian et.al.	2407.10086v2	null
2024-07-13	Building pre-train LLM Dataset for the INDIC Languages: a case study on Hindi	Shantipriya Parida et.al.	2407.09855v1	null
2024-07-17	Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models	Dong Shu et.al.	2407.09292v2	null
2024-07-12	DAHRS: Divergence-Aware Hallucination-Remediated SRL Projection	Sangpil Youm et.al.	2407.09283v1	null
2024-07-11	Fault Diagnosis in Power Grids with Large Language Model	Liu Jing et.al.	2407.08836v1	null
2024-07-11	Towards Explainable Evolution Strategies with Large Language Models	Jill Baumann et.al.	2407.08331v1	null
2024-07-10	Training on the Test Task Confounds Evaluation and Emergence	Ricardo Dominguez-Olmedo et.al.	2407.07890v1	link
2024-07-10	A Proposed S.C.O.R.E. Evaluation Framework for Large Language Models: Safety, Consensus, Objectivity, Reproducibility and Explainability	Ting Fang Tan et.al.	2407.07666v1	null
2024-07-08	SimPal: Towards a Meta-Conversational Framework to Understand Teacher's Instructional Goals for K-12 Physics	Effat Farhana et.al.	2407.06241v1	null
2024-07-07	Experiments with truth using Machine Learning: Spectral analysis and explainable classification of synthetic, false, and genuine information	Vishnu S. Pendyala et.al.	2407.05464v1	null
2024-07-07	Exploring the Educational Landscape of AI: Large Language Models' Approaches to Explaining Conservation of Momentum in Physics	Keisuke Sato et.al.	2407.05308v1	null
2024-07-04	From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI	Stefanie Krause et.al.	2407.03778v1	null
2024-07-04	Improving Self Consistency in LLMs through Probabilistic Tokenization	Ashutosh Sathe et.al.	2407.03678v1	null
2024-07-04	The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model	Brenden Smith et.al.	2407.03621v1	link
2024-07-03	LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation	Hongke Zhao et.al.	2407.02833v1	null
2024-07-01	Engineering Conversational Search Systems: A Review of Applications, Architectures, and Functional Components	Phillip Schneider et.al.	2407.00997v1	null
2024-07-08	LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation	Longchao Da et.al.	2407.00994v2	null
2024-07-03	HRDE: Retrieval-Augmented Large Language Models for Chinese Health Rumor Detection and Explainability	Yanfang Chen et.al.	2407.00668v2	link
2024-06-29	MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation	Jinsheng Huang et.al.	2407.00468v1	link
2024-06-28	Evaluating Human Alignment and Model Faithfulness of LLM Rationale	Mohsen Fayyaz et.al.	2407.00219v1	null
2024-06-28	Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach	Sai Krishna Revanth Vuruma et.al.	2407.00167v1	null
2024-06-28	Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring	Jiazheng Li et.al.	2406.19949v1	null
2024-06-27	xTower: A Multilingual LLM for Explaining and Correcting Translation Errors	Marcos Treviso et.al.	2406.19482v1	null
2024-06-26	"Is ChatGPT a Better Explainer than My Professor?": Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline	Grace Li et.al.	2406.18512v1	null
2024-06-26	Mental Modeling of Reinforcement Learning Agents by Language Models	Wenhao Lu et.al.	2406.18505v1	null
2024-06-26	Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming	Zhenghao Zhou et.al.	2406.18501v1	null
2024-06-25	From Distributional to Overton Pluralism: Investigating Large Language Model Alignment	Thom Lake et.al.	2406.17692v1	link
2024-06-25	Banishing LLM Hallucinations Requires Rethinking Generalization	Johnny Li et.al.	2406.17642v1	null
2024-06-23	Unveiling LLM Mechanisms Through Neural ODEs and Control Theory	Yukun Zhang et.al.	2406.16985v1	null
2024-06-24	Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track	Ronak Pradeep et.al.	2406.16828v1	link
2024-06-24	Large Language Models Are Cross-Lingual Knowledge-Free Reasoners	Peng Hu et.al.	2406.16655v1	link
2024-06-24	UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models	Zhanyue Qin et.al.	2406.16382v1	null
2024-06-23	Preference Tuning For Toxicity Mitigation Generalizes Across Languages	Xiaochen Li et.al.	2406.16235v1	link
2024-06-23	Effectiveness of ChatGPT in explaining complex medical reports to patients	Mengxuan Sun et.al.	2406.15963v1	null
2024-06-30	LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning	Guangsi Shi et.al.	2406.15859v2	null
2024-06-21	Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network	Badr AlKhamissi et.al.	2406.15109v1	link
2024-06-21	Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction	Jinge Wu et.al.	2406.15045v1	null
2024-06-20	Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?	Zhiqiang Pi et.al.	2406.14737v1	null
2024-06-20	Self-supervised Interpretable Concept-based Models for Text Classification	Francesco De Santis et.al.	2406.14335v1	null
2024-06-20	Definition generation for lexical semantic change detection	Mariia Fedorova et.al.	2406.14167v1	link
2024-06-22	Enhancing Travel Choice Modeling with Large Language Models: A Prompt-Learning Approach	Xuehao Zhai et.al.	2406.13558v2	null
2024-06-16	Current state of LLM Risks and AI Guardrails	Suriya Ganesh Ayyamperumal et.al.	2406.12934v1	null
2024-06-19	Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models	Hengyi Wang et.al.	2406.12649v2	null
2024-06-18	An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs	Daking Rai et.al.	2406.12288v1	link
2024-06-18	Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization	Kwangwook Seo et.al.	2406.12269v1	null
2024-06-18	A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning	Lijie Hu et.al.	2406.12255v1	null
2024-06-29	Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM	Huaxin Zhang et.al.	2406.12235v2	link
2024-06-28	WellDunn: On the Robustness and Explainability of Language Models and Large Language Models in Identifying Wellness Dimensions	Seyedali Mohammadi et.al.	2406.12058v3	null
2024-05-31	Generative AI Voting: Fair Collective Choice is Resilient to LLM Biases and Inconsistencies	Srijoni Majumdar et.al.	2406.11871v1	null
2024-06-17	CELL your Model: Contrastive Explanation Methods for Large Language Models	Ronny Luss et.al.	2406.11785v1	null
2024-06-17	GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations	Rick Wilming et.al.	2406.11547v1	link
2024-06-17	A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences	Leonardo Bertolazzi et.al.	2406.11341v1	null
2024-06-17	TIFG: Text-Informed Feature Generation with Large Language Models	Xinhao Zhang et.al.	2406.11177v1	null
2024-06-16	LLMFactor: Extracting Profitable Factors through Prompts for Explainable Stock Movement Prediction	Meiyun Wang et.al.	2406.10811v1	null
2024-06-15	A Comprehensive Survey of Foundation Models in Medicine	Wasif Khan et.al.	2406.10729v1	null
2024-06-15	Multilingual Large Language Models and Curse of Multilinguality	Daniil Gurgurov et.al.	2406.10602v1	null
2024-06-14	Towards Effectively Detecting and Explaining Vulnerabilities Using Large Language Models	Qiheng Mao et.al.	2406.09701v1	null
2024-06-13	Automated Molecular Concept Generation and Labeling with Large Language Models	Shichang Zhang et.al.	2406.09612v1	null
2024-06-12	LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions	Nhat Hoang-Xuan et.al.	2406.08572v1	null
2024-06-13	CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems	Qianli Wang et.al.	2406.08101v2	link
2024-06-12	A Concept-Based Explainability Framework for Large Multimodal Models	Jayneel Parekh et.al.	2406.08074v1	null
2024-06-13	LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing	Hongxiang Zhang et.al.	2406.07714v2	null
2024-06-15	What's in an embedding? Would a rose by any embedding smell as sweet?	Venkat Venkatasubramanian et.al.	2406.06870v3	null
2024-06-10	Evaluating Zero-Shot Long-Context LLM Compression	Chenyu Wang et.al.	2406.06773v1	null
2024-06-09	Exploring the Efficacy of Large Language Models (GPT-4) in Binary Reverse Engineering	Saman Pordanesh et.al.	2406.06637v1	null
2024-06-06	Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models	Walid S. Saba et.al.	2406.06610v1	null
2024-06-06	Are Large Language Models the New Interface for Data Pipelines?	Sylvio Barbon Junior et.al.	2406.06596v1	null
2024-06-13	From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models	Xiaofeng Zhang et.al.	2406.06579v2	null
2024-06-10	Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course	Aadarsh Padiyath et.al.	2406.06451v1	null
2024-07-05	Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue	Simone Alghisi et.al.	2406.06399v2	null
2024-07-03	MedExQA: Medical Question Answering Benchmark with Multiple Explanations	Yunsoo Kim et.al.	2406.06331v2	link
2024-06-10	Safety Alignment Should Be Made More Than Just a Few Tokens Deep	Xiangyu Qi et.al.	2406.05946v1	link
2024-06-13	How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States	Zhenhong Zhou et.al.	2406.05644v2	link
2024-06-08	Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification	Yunhe Gao et.al.	2406.05596v1	null
2024-06-07	Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models	Michał Romaszewski et.al.	2406.04926v1	null
2024-06-07	Think out Loud: Emotion Deducing Explanation in Dialogues	Jiangnan Li et.al.	2406.04758v1	null
2024-06-07	Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions	Jingtan Wang et.al.	2406.04606v1	link
2024-06-08	What Do Language Models Learn in Context? The Structured Task Hypothesis	Jiaoda Li et.al.	2406.04216v2	link
2024-06-06	Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective	Xinhao Yao et.al.	2406.03768v1	link
2024-06-04	Dynamic and Adaptive Feature Generation with LLM	Xinhao Zhang et.al.	2406.03505v1	null
2024-06-05	AD-H: Autonomous Driving with Hierarchical Agents	Zaibin Zhang et.al.	2406.03474v1	null
2024-06-06	Large Language Models as Evaluators for Recommendation Explanations	Xiaoyu Zhang et.al.	2406.03248v2	link
2024-06-05	Missci: Reconstructing Fallacies in Misrepresented Science	Max Glockner et.al.	2406.03181v1	link
2024-06-04	XRec: Large Language Models for Explainable Recommendation	Qiyao Ma et.al.	2406.02377v1	link
2024-06-04	I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering	Valeriya Goloviznina et.al.	2406.02060v1	null
2024-06-20	What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores	Ebrahim Feghhi et.al.	2406.01538v2	link
2024-06-04	Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study	Martin J. Hetz et.al.	2406.01428v2	null
2024-06-03	TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine	Wenjing Yue et.al.	2406.01126v1	null
2024-06-03	Unveil the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution	Shicheng Xu et.al.	2406.00944v1	null
2024-06-01	Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners	Zhi Zheng et.al.	2406.00430v1	null
2024-05-31	How In-Context Learning Emerges from Training on Unstructured Data: On the Role of Co-Occurrence, Positional Information, and Noise Structures	Kevin Christian Wibisono et.al.	2406.00131v1	link
2024-05-27	How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors?	Subhankar Maity et.al.	2406.00039v1	null
2024-05-24	Large Language Model Pruning	Hanjuan Huang et.al.	2406.00030v1	null
2024-06-05	SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales	Tianyang Xu et.al.	2405.20974v2	link
2024-06-03	Large Language Models are Zero-Shot Next Location Predictors	Ciro Beneduce et.al.	2405.20962v2	link
2024-05-31	FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores	Alyssa Huang et.al.	2405.20613v1	link
2024-05-30	XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution	Yurui Chang et.al.	2405.20404v1	null
2024-05-29	Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models	Venkat Venkatasubramanian et.al.	2405.19561v1	null
2024-05-29	Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners	Jiachun Li et.al.	2405.18915v1	null
2024-06-11	Faithful Logical Reasoning via Symbolic Chain-of-Thought	Jundong Xu et.al.	2405.18357v2	link
2024-05-28	Active Use of Latent Constituency Representation in both Humans and Large Language Models	Wei Liu et.al.	2405.18241v1	link
2024-05-28	Exploring Activation Patterns of Parameters in Language Models	Yudong Wang et.al.	2405.17799v1	null
2024-05-28	Facilitating Holistic Evaluations with LLMs: Insights from Scenario-Based Experiments	Toru Ishida et.al.	2405.17728v1	null
2024-05-27	PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends	Apurva Sinha et.al.	2405.17533v1	null
2024-07-02	TEII: Think, Explain, Interact and Iterate with Large Language Models to Solve Cross-lingual Emotion Detection	Long Cheng et.al.	2405.17129v2	link
2024-05-27	The Uncanny Valley: Exploring Adversarial Robustness from a Flatness Perspective	Nils Philipp Walter et.al.	2405.16918v1	null
2024-05-25	Large Language Models Enable Automated Formative Feedback in Human-Robot Interaction Tasks	Emily Jensen et.al.	2405.16344v1	null
2024-06-20	Finetuning Large Language Model for Personalized Ranking	Zhuoxi Bai et.al.	2405.16127v2	link
2024-05-24	Transformers represent belief state geometry in their residual stream	Adam S. Shai et.al.	2405.15943v1	null
2024-05-24	Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment	Hao Sun et.al.	2405.15624v1	null
2024-07-03	ChatGPT Code Detection: Techniques for Uncovering the Source of Code	Marc Oedingen et.al.	2405.15512v2	link
2024-05-24	From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks	Jacob Russin et.al.	2405.15164v1	null
2024-05-28	Explaining Multi-modal Large Language Models by Analyzing their Vision Perception	Loris Giulivi et.al.	2405.14612v2	link
2024-05-23	Large Language Models for Explainable Decisions in Dynamic Digital Twins	Nan Zhang et.al.	2405.14411v1	link
2024-05-26	Explainable Few-shot Knowledge Tracing	Haoxuan Li et.al.	2405.14391v2	link
2024-05-23	Knowledge Localization: Mission Not Accomplished? Enter Query Localization!	Yuheng Chen et.al.	2405.14117v1	null
2024-05-22	Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation	Cyril Chhun et.al.	2405.13769v1	link
2024-05-22	Mining Action Rules for Defect Reduction Planning	Khouloud Oueslati et.al.	2405.13740v1	null
2024-05-22	Navigating User Experience of ChatGPT-based Conversational Recommender Systems: The Effects of Prompt Guidance and Recommendation Domain	Yizhe Zhang et.al.	2405.13560v1	null
2024-05-22	HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model	Mustafa Yildirim et.al.	2405.13547v1	null
2024-05-21	Investigating Symbolic Capabilities of Large Language Models	Neisarg Dave et.al.	2405.13209v1	null
2024-05-11	RAGE Against the Machine: Retrieval-Augmented LLM Explanations	Joel Rorseth et.al.	2405.13000v1	null
2024-05-20	Directed Metric Structures arising in Large Language Models	Stéphane Gaubert et.al.	2405.12264v1	null
2024-05-19	Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications	Subhankar Maity et.al.	2405.11579v1	null
2024-05-17	SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks	Michael Shliselberg et.al.	2405.10700v1	null
2024-05-15	LoRA Learns Less and Forgets Less	Dan Biderman et.al.	2405.09673v1	null
2024-05-15	Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models	Majid Zarharan et.al.	2405.09454v1	link
2024-05-14	Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure	Odysseas S. Chlapanis et.al.	2405.08502v1	link
2024-05-14	Challenges and Opportunities in Text Generation Explainability	Kenza Amara et.al.	2405.08468v1	null
2024-05-14	Understanding the performance gap between online and offline alignment algorithms	Yunhao Tang et.al.	2405.08448v1	null
2024-05-12	ExplainableDetector: Exploring Transformer-based Language Modeling Approach for SMS Spam Detection with Explainability Analysis	Mohammad Amaz Uddin et.al.	2405.08026v1	null
2024-05-13	Can Language Models Explain Their Own Classification Behavior?	Dane Sherburn et.al.	2405.07436v1	link
2024-05-10	LLM-Generated Black-box Explanations Can Be Adversarially Helpful	Rohan Ajwani et.al.	2405.06800v1	null
2024-05-15	Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling	Subhendu Khatuya et.al.	2405.06671v2	link
2024-06-03	XAI4LLM. Let Machine Learning Models and LLMs Collaborate for Enhanced In-Context Learning in Healthcare	Fatemeh Nazary et.al.	2405.06270v3	null
2024-05-09	Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding?	Yutong Hu et.al.	2405.06105v1	null
2024-05-09	LLMs for XAI: Future Directions for Explaining Explanations	Alexandra Zytek et.al.	2405.06064v1	null
2024-05-09	Investigating Interaction Modes and User Agency in Human-LLM Collaboration for Domain-Specific Data Analysis	Jiajing Guo et.al.	2405.05548v1	null
2024-05-08	The Effect of Model Size on LLM Post-hoc Explainability via LIME	Henning Heyen et.al.	2405.05348v1	link
2024-05-09	LLMs with Personalities in Multi-issue Negotiation Games	Sean Noh et.al.	2405.05248v2	null
2024-05-08	Zero-shot LLM-guided Counterfactual Generation for Text	Amrita Bhattacharjee et.al.	2405.04793v1	null
2024-05-09	Large Language Models for Cyber Security: A Systematic Literature Review	HanXiang Xu et.al.	2405.04760v2	null
2024-05-07	Large Language Models Cannot Explain Themselves	Advait Sarkar et.al.	2405.04382v1	null
2024-05-07	Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation	Atharvan Dogra et.al.	2405.04325v1	null
2024-05-07	Granite Code Models: A Family of Open Foundation Models for Code Intelligence	Mayank Mishra et.al.	2405.04324v1	link
2024-05-07	Semantic API Alignment: Linking High-level User Goals to APIs	Robert Feldt et.al.	2405.04236v1	null
2024-05-07	NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions	Elliot Gestrin et.al.	2405.04215v1	null
2024-05-07	A Causal Explainable Guardrails for Large Language Models	Zhixuan Chu et.al.	2405.04160v1	null
2024-05-06	FOKE: A Personalized and Explainable Education Framework Integrating Foundation Models, Knowledge Graphs, and Prompt Engineering	Silan Hu et.al.	2405.03734v1	null
2024-05-06	Explainable Fake News Detection With Large Language Model via Defense Among Competing Wisdom	Bo Wang et.al.	2405.03371v1	link
2024-05-03	What does the Knowledge Neuron Thesis Have to do with Knowledge?	Jingcheng Niu et.al.	2405.02421v1	link
2024-05-07	A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model	Jiexia Ye et.al.	2405.02358v2	link
2024-05-03	Argumentative Large Language Models for Explainable and Contestable Decision-Making	Gabriel Freedman et.al.	2405.02079v1	null
2024-05-03	Which Identities Are Mobilized: Towards an automated detection of social group appeals in political texts	Felicia Riethmüller et.al.	2405.01904v1	null
2024-05-02	CoS: Enhancing Personalization and Mitigating Bias with Context Steering	Jerry Zhi-Yang He et.al.	2405.01768v1	null
2024-05-08	Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving	Xin Quan et.al.	2405.01379v2	null
2024-04-26	LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study	Van Bach Nguyen et.al.	2405.00722v1	null
2024-05-01	RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models	Mohamed Manzour Hussien et.al.	2405.00449v1	null
2024-05-01	Social Life Simulation for Non-Cognitive Skills Learning	Zihan Yan et.al.	2405.00273v1	null
2024-04-30	A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications	Steph Buongiorno et.al.	2404.19729v1	null
2024-04-30	On Training a Neural Network to Explain Binaries	Alexander Interrante-Grant et.al.	2404.19631v1	null
2024-04-29	Large Language Models as Conversational Movie Recommenders: A User Study	Ruixuan Sun et.al.	2404.19093v1	null
2024-04-30	Evaluating Concept-based Explanations of Language Models: A Study on Faithfulness and Readability	Meng Li et.al.	2404.18533v2	link
2024-04-30	Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages	David Ifeoluwa Adelani et.al.	2404.18286v2	null
2024-04-27	Advancing Healthcare Automation: Multi-Agent Systems for Medical Necessity Justification	Himanshu Pandey et.al.	2404.17977v1	null
2024-04-11	Rumour Evaluation with Very Large Language Models	Dahlia Shehata et.al.	2404.16859v1	link
2024-04-25	TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning	Liang Zhang et.al.	2404.16635v1	link
2024-04-04	Elicitron: An LLM Agent-Based Simulation Framework for Design Requirements Elicitation	Mohammadmehdi Ataei et.al.	2404.16045v1	null
2024-04-24	Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach	Linyu Liu et.al.	2404.15993v1	null
2024-04-25	Detecting Conceptual Abstraction in LLMs	Michaela Regneri et.al.	2404.15848v2	null
2024-04-22	Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication	John R. Lawson et.al.	2404.15166v1	null
2024-06-04	Graph Machine Learning in the Era of Large Language Models (LLMs)	Wenqi Fan et.al.	2404.14928v2	null
2024-05-10	Explaining Arguments' Strength: Unveiling the Role of Attacks and Supports (Technical Report)	Xiang Yin et.al.	2404.14304v2	link
2024-04-22	Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach	Yao Wan et.al.	2404.14296v1	link
2024-04-22	EventLens: Leveraging Event-Aware Pretraining and Cross-modal Linking Enhances Visual Commonsense Reasoning	Mingjie Ma et.al.	2404.13847v1	null
2024-04-29	Large Language Models for Networking: Workflow, Advances and Challenges	Chang Liu et.al.	2404.12901v2	null
2024-04-18	MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale	Xiaotang Gai et.al.	2404.12372v1	null
2024-04-18	Concept Induction using LLMs: a user experiment for assessment	Adrita Barua et.al.	2404.11875v1	null
2024-05-01	Course Recommender Systems Need to Consider the Job Market	Jibril Frej et.al.	2404.10876v2	link
2024-06-03	Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model	Hengyuan Zhang et.al.	2404.10306v4	link
2024-04-11	Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs	Jierui Li et.al.	2404.08148v1	null
2024-05-29	Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts	Namasivayam Kalithasan et.al.	2404.07774v2	null
2024-04-11	Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models	Marvin Pafla et.al.	2404.07725v1	null
2024-04-07	Explaining EDA synthesis errors with LLMs	Siyu Qiu et.al.	2404.07235v1	null
2024-04-11	From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications	Yongqiang Ma et.al.	2404.07108v2	null
2024-05-15	A Mathematical Theory for Learning Semantic Languages by Abstract Learners	Kuo-Yu Liao et.al.	2404.07009v3	null
2024-04-10	WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers	Yuexi Chen et.al.	2404.07005v1	null
2024-04-09	CausalBench: A Comprehensive Benchmark for Causal Learning Capability of Large Language Models	Yu Zhou et.al.	2404.06349v1	null
2024-04-07	X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model	Jan Held et.al.	2404.06332v1	null
2024-04-07	StockGPT: A GenAI Model for Stock Prediction and Trading	Dat Mai et.al.	2404.05101v1	null
2024-04-07	Data Bias According to Bipol: Men are Naturally Right and It is the Role of Women to Follow Their Lead	Irene Pagliai et.al.	2404.04838v1	link
2024-04-06	Binary Classifier Optimization for Large Language Model Alignment	Seungjae Jung et.al.	2404.04656v1	null
2024-04-04	Language Model Evolution: An Iterated Learning Perspective	Yi Ren et.al.	2404.04286v1	link
2024-04-04	Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph	Marco Bronzini et.al.	2404.03623v1	null
2024-04-04	Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models	Yantao Liu et.al.	2404.03577v1	link
2024-04-04	Edisum: Summarizing and Explaining Wikipedia Edits at Scale	Marija Šakota et.al.	2404.03428v1	link
2024-04-04	Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics	Fangru Lin et.al.	2404.03301v1	link
2024-04-04	DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models	Yuchen Liu et.al.	2404.03275v1	null
2024-04-03	LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models	Gabriela Ben Melech Stan et.al.	2404.03118v1	null
2024-04-10	An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models	Emmy Liu et.al.	2404.03028v2	null
2024-04-13	Explainable Traffic Flow Prediction with Large Language Models	Xusen Guo et.al.	2404.02937v3	null
2024-04-03	Towards detecting unanticipated bias in Large Language Models	Anna Kruspe et.al.	2404.02650v1	null
2024-04-03	Task Agnostic Architecture for Algorithm Induction via Implicit Composition	Sahil J. Sindhi et.al.	2404.02450v1	null
2024-04-01	Enhancing Reasoning Capacity of SLM using Cognitive Enhancement	Jonathan Pan et.al.	2404.01135v1	null
2024-04-01	Query Performance Prediction using Relevance Judgments Generated by Large Language Models	Chuan Meng et.al.	2404.01012v1	link
2024-04-12	Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing	Zhenyu Qian et.al.	2404.00589v2	link
2024-03-28	"I'm categorizing LLM as a productivity tool": Examining ethics of LLM use in HCI research practices	Shivani Kapania et.al.	2403.19876v1	null
2024-03-27	Measuring Political Bias in Large Language Models: What Is Said and How It Is Said	Yejin Bang et.al.	2403.18932v1	null
2024-03-26	Targeted Visualization of the Backbone of Encoder LLMs	Isaac Roberts et.al.	2403.18872v1	link
2024-03-27	A Path Towards Legal Autonomy: An interoperable and explainable approach to extracting, transforming, loading and computing legal information using large language models, expert systems and Bayesian networks	Axel Constant et.al.	2403.18537v1	null
2024-03-27	LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models	Mingxing Peng et.al.	2403.18344v1	null
2024-03-27	Exploring the Privacy Protection Capabilities of Chinese Large Language Models	Yuqi Yang et.al.	2403.18205v1	null
2024-03-26	Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach	Andrea Ferrario et.al.	2403.17873v1	null
2024-03-26	Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons	Shijia Zhou et.al.	2403.17760v1	link
2024-03-25	A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection	Benjamin Steenhoek et.al.	2403.17218v1	null
2024-03-25	Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making	Shuai Ma et.al.	2403.16812v1	null
2024-03-26	RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict	Yirong Zeng et.al.	2403.16662v2	link
2024-03-25	ChatDBG: An AI-Powered Debugging Assistant	Kyla Levin et.al.	2403.16354v1	link
2024-03-26	Towards a RAG-based Summarization Agent for the Electron-Ion Collider	Karthik Suresh et.al.	2403.15729v2	null
2024-03-22	Large language models for crowd decision making based on prompt design strategies using ChatGPT: models, analysis and challenges	Cristina Zuheros et.al.	2403.15587v1	null
2024-04-02	Assessing the Utility of Large Language Models for Phenotype-Driven Gene Prioritization in Rare Genetic Disorder Diagnosis	Junyoung Kim et.al.	2403.14801v2	null
2024-03-21	A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science	Clayton Cohn et.al.	2403.14565v1	null
2024-04-08	MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation	Longzheng Wang et.al.	2403.14171v3	link
2024-03-21	From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation	Haofei Zhao et.al.	2403.14118v1	null
2024-03-21	PE-GPT: A Physics-Informed Interactive Large Language Model for Power Converter Modulation Design	Fanfan Lin et.al.	2403.14059v1	null
2024-03-12	Duwak: Dual Watermarks in Large Language Models	Chaoyi Zhu et.al.	2403.13000v1	null
2024-03-19	INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations	Lirui Luo et.al.	2403.12451v1	null
2024-05-08	Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales	Ayushi Nirmal et.al.	2403.12403v2	link
2024-05-09	From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality?	Guangming Huang et.al.	2403.11894v3	null
2024-03-18	DEE: Dual-stage Explainable Evaluation Method for Text Generation	Shenyu Zhang et.al.	2403.11509v1	null
2024-04-30	Correcting misinformation on social media with a large language model	Xinyi Zhou et.al.	2403.11169v3	link
2024-03-17	Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering	Baiyan Zhang et.al.	2403.11129v1	null
2024-03-26	SelfIE: Self-Interpretation of Large Language Model Embeddings	Haozhe Chen et.al.	2403.10949v2	link
2024-03-16	Depression Detection on Social Media with Large Language Models	Xiaochong Lan et.al.	2403.10750v1	null
2024-03-15	Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization	Ratnadira Widyasari et.al.	2403.10507v1	null
2024-03-22	Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst?	Bruno de Melo et.al.	2403.10482v2	null
2024-03-15	A Question on the Explainability of Large Language Models and the Word-Level Univariate First-Order Plausibility Assumption	Jeremie Bogaert et.al.	2403.10275v1	null
2024-03-15	Language to Map: Topological map generation from natural language path instructions	Hideki Deguchi et.al.	2403.10008v1	null
2024-03-14	Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey	Xiaoyu Liu et.al.	2403.09606v1	null
2024-04-23	Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models	Laura Fernández-Becerra et.al.	2403.09567v2	null
2024-03-14	XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization	Yequan Bie et.al.	2403.09410v1	null
2024-03-14	Meaningful Learning: Advancing Abstract Reasoning in Large Language Models via Generic Fact Guidance	Kai Xiong et.al.	2403.09085v1	null
2024-03-13	Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era	Xuansheng Wu et.al.	2403.08946v1	link
2024-03-13	TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation	Dingbang Li et.al.	2403.08833v1	null
2024-03-13	Can Large Language Models Identify Authorship?	Baixiang Huang et.al.	2403.08213v1	link
2024-03-12	generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation	Thilo Spinner et.al.	2403.07627v1	null
2024-03-12	Robustness, Security, Privacy, Explainability, Efficiency, and Usability of Large Language Models for Code	Zhou Yang et.al.	2403.07506v1	null
2024-03-11	Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena	Leonie Weissweiler et.al.	2403.06965v1	null
2024-03-11	RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems	Jianxun Lian et.al.	2403.06465v1	link
2024-03-10	ArgMed-Agents: Explainable Clinical Decision Reasoning with Large Language Models via Argumentation Schemes	Shengxin Hong et.al.	2403.06294v1	null
2024-03-10	Low-dose CT Denoising with Language-engaged Dual-space Alignment	Zhihao Chen et.al.	2403.06128v1	link
2024-03-10	Explaining Code with a Purpose: An Integrated Approach for Developing Code Comprehension and Prompting Skills	Paul Denny et.al.	2403.06050v1	null
2024-03-08	Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings	Wei Zhou et.al.	2403.05338v1	null
2024-03-08	Aligning Large Language Models for Controllable Recommendations	Wensheng Lu et.al.	2403.05063v1	null
2024-03-07	Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference	Wei-Lin Chiang et.al.	2403.04132v1	null
2024-04-26	Multimodal Large Language Models to Support Real-World Fact-Checking	Jiahui Geng et.al.	2403.03627v2	null
2024-03-06	RouteExplainer: An Explanation Framework for Vehicle Routing Problem	Daisuke Kikuta et.al.	2403.03585v1	link
2024-03-06	Explaining Genetic Programming Trees using Large Language Models	Paula Maddigan et.al.	2403.03397v1	null
2024-03-05	SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection	Peng Qi et.al.	2403.03170v1	null
2024-03-05	Word Importance Explains How Prompts Affect Language Model Outputs	Stefan Hackmann et.al.	2403.03028v1	null
2024-03-05	FinReport: Explainable Stock Earnings Forecasting via News Factor Analyzing Model	Xiangyu Li et.al.	2403.02647v1	link
2024-03-04	Evaluating the Explainability of Neural Rankers	Saran Pandian et.al.	2403.01981v1	null
2024-03-03	SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos	Yulei Niu et.al.	2403.01599v1	null
2024-03-03	Logic Rules as Explanations for Legal Case Retrieval	Zhongxiang Sun et.al.	2403.01457v1	link
2024-03-02	Improving the Validity of Automatically Generated Feedback via Reinforcement Learning	Alexander Scarlatos et.al.	2403.01304v1	link
2024-03-02	STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models	Linhai Zhang et.al.	2403.01165v1	link
2024-02-25	Cognitive Bias in High-Stakes Decision-Making with LLMs	Jessica Echterhoff et.al.	2403.00811v1	null
2024-03-16	ChatDiet: Empowering Personalized Nutrition-Oriented Food Recommender Chatbots through an LLM-Augmented Framework	Zhongqi Yang et.al.	2403.00781v2	null
2024-02-29	FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition	Xiaoqiang Wang et.al.	2403.00126v1	null
2024-02-29	Dual Operating Modes of In-Context Learning	Ziqian Lin et.al.	2402.18819v1	link
2024-04-15	Cause and Effect: Can Large Language Models Truly Understand Causality?	Swagata Ashwani et.al.	2402.18139v2	null
2024-03-13	Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions	Hanjie Chen et.al.	2402.18060v3	link
2024-03-04	A Language Model based Framework for New Concept Placement in Ontologies	Hang Dong et.al.	2402.17897v2	link
2024-04-12	Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses	Juyeon Kim et.al.	2402.17097v2	link
2024-02-26	Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling	Hang Jiang et.al.	2402.17019v1	link
2024-02-28	Defending LLMs against Jailbreaking Attacks via Backtranslation	Yihan Wang et.al.	2402.16459v2	link
2024-02-26	ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors	Zhexin Zhang et.al.	2402.16444v1	link
2024-02-26	Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models	Tianyi Tang et.al.	2402.16438v1	null
2024-03-11	Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models	Jeonghwan Kim et.al.	2402.16315v2	null
2024-02-24	HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition	Yuxuan Liu et.al.	2402.15754v1	null
2024-02-24	Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning	Yong Liu et.al.	2402.15751v1	null
2024-03-04	LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper	Daoyuan Wu et.al.	2402.15727v2	null
2024-02-26	Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition	Yufei Huang et.al.	2402.15175v2	null
2024-02-22	Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark	Xiuying Chen et.al.	2402.14359v1	null
2024-02-22	Do Machines and Humans Focus on Similar Code? Exploring Explainability of Large Language Models in Code Summarization	Jiliang Li et.al.	2402.14182v1	null
2024-02-21	An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach	Mohammad Amaz Uddin et.al.	2402.13871v1	null
2024-02-21	Factual Consistency Evaluation of Summarisation in the Era of Large Language Models	Zheheng Luo et.al.	2402.13758v1	null
2024-03-08	SaGE: Evaluating Moral Consistency in Large Language Models	Vamshi Krishna Bonagiri et.al.	2402.13709v2	link
2024-02-19	Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?	Nishant Balepur et.al.	2402.12483v1	link
2024-02-19	Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models	Puxuan Yu et.al.	2402.12276v1	link
2024-02-18	Opening the black box of language acquisition	Jérôme Michaud et.al.	2402.11681v1	link
2024-02-23	Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Bias Detection	Valeria Pastorino et.al.	2402.11621v2	null
2024-02-18	Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network	Lin Chen et.al.	2402.11518v1	null
2024-02-18	Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction	Yinghui Li et.al.	2402.11420v1	null
2024-02-17	Dissecting Human and LLM Preferences	Junlong Li et.al.	2402.11296v1	link
2024-02-17	GenDec: A robust generative Question-decomposition method for Multi-hop reasoning	Jian Wu et.al.	2402.11166v1	null
2024-02-16	Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models	Zihao Lin et.al.	2402.11122v1	null
2024-02-21	Exploring Value Biases: How LLMs Deviate Towards the Ideal	Sarath Sivaprasad et.al.	2402.11005v2	null
2024-03-15	Zero-shot Explainable Mental Health Analysis on Social Media by Incorporating Mental Scales	Wenyu Li et.al.	2402.10948v2	null
2024-02-19	Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities	Mingyu Jin et.al.	2402.10835v2	null
2024-02-16	RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model	Jianhao Yuan et.al.	2402.10828v1	null
2024-02-16	Quantifying the Persona Effect in LLM Simulations	Tiancheng Hu et.al.	2402.10811v1	null
2024-02-16	Properties and Challenges of LLM-Generated Explanations	Jenny Kunz et.al.	2402.10532v1	null
2024-02-15	Large Language Models for Forecasting and Anomaly Detection: A Systematic Literature Review	Jing Su et.al.	2402.10350v1	null
2024-02-15	Case Study: Testing Model Capabilities in Some Reasoning Tasks	Min Zhang et.al.	2402.09967v1	null
2024-02-15	Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States	Hanyu Duan et.al.	2402.09733v1	null
2024-02-21	CodeMind: A Framework to Challenge Large Language Models for Code Reasoning	Changshu Liu et.al.	2402.09664v3	link
2024-02-14	Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems	Liang Zhang et.al.	2402.09584v1	null
2024-02-14	SyntaxShap: Syntax-aware Explainability Method for Text Generation	Kenza Amara et.al.	2402.09259v1	null
2024-02-12	Why and When LLM-Based Assistants Can Go Wrong: Investigating the Effectiveness of Prompt-Based Interactions for Software Help-Seeking	Anjali Khurana et.al.	2402.08030v1	null
2024-02-02	Exploring patient trust in clinical advice from AI-driven LLMs like ChatGPT for self-diagnosis	Delong Du et.al.	2402.07920v1	null
2024-01-29	Experimental Interface for Multimodal and Large Language Model Based Explanations of Educational Recommender Systems	Hasan Abu-Rasheed et.al.	2402.07910v1	null
2024-02-12	TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection	Hui Liu et.al.	2402.07776v1	link
2024-02-12	Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate	Kyungha Kim et.al.	2402.07401v1	null
2024-02-11	TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation	Peng Wang et.al.	2402.07233v1	null
2024-02-11	X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Design	Eric L. Buehler et.al.	2402.07148v1	link
2024-02-08	Integrating LLMs for Explainable Fault Diagnosis in Complex Systems	Akshay J. Dave et.al.	2402.06695v1	null
2024-02-09	The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model	Gregory Coppola et.al.	2402.06557v1	link
2024-02-06	Personalized Language Modeling from Personalized Human Feedback	Xinyu Li et.al.	2402.05133v1	null
2024-02-05	Illuminate: A novel approach for depression detection with explainable analysis and proactive therapy using prompt engineering	Aryan Agrawal et.al.	2402.05127v1	null
2024-02-07	Large Language Models As Faithful Explainers	Yu-Neng Chuang et.al.	2402.04678v1	null
2024-03-14	Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models	Chirag Agarwal et.al.	2402.04614v3	null
2024-02-06	Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models	David Sobrín-Hidalgo et.al.	2402.04206v1	null
2024-02-29	Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models	Kelvin J. L. Koa et.al.	2402.03659v3	link
2024-01-31	Uncertainty-Aware Explainable Recommendation with Large Language Models	Yicui Peng et.al.	2402.03366v1	null
2024-02-05	The Matrix: A Bayesian learning model for LLMs	Siddhartha Dalal et.al.	2402.03175v1	null
2024-02-05	Less is KEN: a Universal and Simple Non-Parametric Pruning Algorithm for Large Language Models	Michele Mastromattei et.al.	2402.03142v1	link
2024-02-05	How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning	Zeping Yu et.al.	2402.02872v1	null
2024-02-04	Selecting Large Language Model to Fine-tune via Rectified Scaling Law	Haowei Lin et.al.	2402.02314v1	null
2024-02-03	Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times	Byung-Doh Oh et.al.	2402.02255v1	link
2024-02-06	Large Language Model Agent for Hyper-Parameter Optimization	Siyi Liu et.al.	2402.01881v2	null
2024-02-02	The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models	Moschoula Pternea et.al.	2402.01874v1	null
2024-02-02	Ecologically rational meta-learned inference explains human category learning	Akshay K. Jagadish et.al.	2402.01821v1	null
2024-02-01	When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards	Norah Alzahrani et.al.	2402.01781v1	null
2024-01-30	Rethinking Interpretability in the Era of Large Language Models	Chandan Singh et.al.	2402.01761v1	link
2024-02-24	Contextualization Distillation from Large Language Model for Knowledge Graph Completion	Dawei Li et.al.	2402.01729v3	null
2024-03-01	Measuring Moral Inconsistencies in Large Language Models	Vamshi Krishna Bonagiri et.al.	2402.01719v3	null
2024-02-16	Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications	Yuhang Zhou et.al.	2402.01681v2	null
2024-02-05	SymbolicAI: A framework for logic-based approaches combining generative models and solvers	Marius-Constantin Dinu et.al.	2402.00854v2	link
2024-02-01	Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement	Xin Quan et.al.	2402.00745v1	link
2024-02-01	IndiVec: An Exploration of Leveraging Large Language Models for Media Bias Detection with Fine-Grained Bias Indicators	Luyang Lin et.al.	2402.00345v1	null
2024-02-01	Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective	Qun Ma et.al.	2402.00262v1	null
2024-01-31	Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT	Diego Machado Reyes et.al.	2402.00137v1	null
2024-03-10	Arrows of Time for Large Language Models	Vassilis Papadopoulos et.al.	2401.17505v2	null
2024-01-30	Detecting mental disorder on social media: a ChatGPT-augmented explainable approach	Loris Belcastro et.al.	2401.17477v1	link
2024-02-10	Reproducibility, energy efficiency and performance of pseudorandom number generators in machine learning: a comparative study of python, numpy, tensorflow, and pytorch implementations	Benjamin Antunes et.al.	2401.17345v2	null
2024-01-30	Incoherent Probability Judgments in Large Language Models	Jian-Qiao Zhu et.al.	2401.16646v1	null
2024-02-27	How Good is ChatGPT at Face Biometrics? A First Look into Recognition, Soft Biometrics, and Explainability	Ivan DeAndres-Tame et.al.	2401.13641v2	link
2024-01-24	Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models	Hongzhan Lin et.al.	2401.13298v1	link
2024-01-23	XAI for All: Can Large Language Models Simplify Explainable AI?	Philip Mavrepis et.al.	2401.13110v1	null
2024-02-22	From Understanding to Utilization: A Survey on Explainability for Large Language Models	Haoyan Luo et.al.	2401.12874v2	null
2024-01-23	How well can large language models explain business processes?	Dirk Fahland et.al.	2401.12846v1	null
2024-02-23	Generating Zero-shot Abstractive Explanations for Rumour Verification	Iman Munire Bilal et.al.	2401.12713v3	link
2024-01-23	LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools	Qianli Wang et.al.	2401.12576v1	link
2024-01-21	Over-Reasoning and Redundant Calculation of Large Language Models	Cheng-Han Chiang et.al.	2401.11467v1	link
2024-01-20	Analyzing Task-Encoding Tokens in Large Language Models	Yu Bai et.al.	2401.11323v1	null
2024-01-17	Vlogger: Make Your Dream A Vlog	Shaobin Zhuang et.al.	2401.09414v1	link
2024-01-24	Supporting Student Decisions on Learning Recommendations: An LLM-Based Chatbot with Knowledge Graph Contextualization for Conversational Explainability and Mentoring	Hasan Abu-Rasheed et.al.	2401.08517v3	null
2024-01-16	LLM-Guided Multi-View Hypergraph Learning for Human-Centric Explainable Recommendation	Zhixuan Chu et.al.	2401.08217v1	null
2024-02-15	Are self-explanations from Large Language Models faithful?	Andreas Madsen et.al.	2401.07927v3	link
2024-01-15	Quantum Transfer Learning for Acceptability Judgements	Giuseppe Buonaiuto et.al.	2401.07777v1	null
2024-01-14	Harnessing Large Language Models Over Transformer Models for Detecting Bengali Depressive Social Media Text: A Comprehensive Study	Ahmadul Karim Chowdhury et.al.	2401.07310v1	null
2024-01-12	TestSpark: IntelliJ IDEA's Ultimate Test Generation Companion	Arkadii Sapozhnikov et.al.	2401.06580v1	link
2024-01-12	Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models	Asma Ghandeharioun et.al.	2401.06102v2	null
2024-01-11	Video Anomaly Detection and Explanation via Large Language Models	Hui Lv et.al.	2401.05702v1	null
2024-01-11	REBUS: A Robust Evaluation Benchmark of Understanding Symbols	Andrew Gritsevskiy et.al.	2401.05604v1	link
2024-01-08	LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems	Mohamad Fakih et.al.	2401.05443v1	link
2024-01-10	Prompting Large Language Models for Recommender Systems: A Comprehensive Framework and Empirical Analysis	Lanling Xu et.al.	2401.04997v1	null
2024-01-08	ExTraCT -- Explainable Trajectory Corrections from language inputs using Textual description of features	J-Anne Yow et.al.	2401.03701v1	null
2024-01-06	Autonomous Crowdsensing: Operating and Organizing Crowdsensing for Sensing Automation	Wansen Wu et.al.	2401.03229v1	null
2024-01-02	Evaluating Large Language Models on the GMAT: Implications for the Future of Business Education	Vahid Ashrafimoghari et.al.	2401.02985v1	null
2024-01-05	Large Language Models in Plant Biology	Hilbert Yuen In Lam et.al.	2401.02789v1	null
2024-01-02	VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics	Ammar A. Siddiqui et.al.	2401.01414v1	null
2023-12-30	The Problem of Alignment	Tsvetelina Hristova et.al.	2401.00210v1	null
2023-12-29	Building Efficient Universal Classifiers with Natural Language Inference	Moritz Laurer et.al.	2312.17543v1	link
2023-12-23	An Explainable AI Approach to Large Language Model Assisted Causal Model Auditing and Development	Yanming Zhang et.al.	2312.16211v1	null
2024-01-03	Unlocking the Potential of Large Language Models for Explainable Recommendations	Yucong Luo et.al.	2312.15661v3	link
2023-12-11	Transportation Transformed: A Comprehensive Review of Dynamic Rerouting in Multimodal Networks	Suyash Pratap et.al.	2312.14953v1	null
2023-12-22	VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation	Max Ku et.al.	2312.14867v1	null
2023-12-21	Deep de Finetti: Recovering Topic Distributions from Large Language Models	Liyi Zhang et.al.	2312.14226v1	null
2023-12-16	Learning Interpretable Queries for Explainable Image Classification with Information Pursuit	Stefan Kolek et.al.	2312.11548v1	null
2023-12-19	The Good, The Bad, and Why: Unveiling Emotions in Generative AI	Cheng Li et.al.	2312.11111v2	null
2023-12-17	Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression	Luis Balderas et.al.	2312.10702v1	null
2024-01-17	LLM-SQL-Solver: Can LLMs Determine SQL Equivalence?	Fuheng Zhao et.al.	2312.10321v2	null
2023-12-15	GPT-doctor: Customizing Large Language Models for Medical Consultation	Wen Wang et.al.	2312.10225v1	null
2023-12-04	A collection of principles for guiding and evaluating large language models	Konstantin Hebenstreit et.al.	2312.10059v1	null
2023-12-15	Prompting Datasets: Data Discovery with Conversational Agents	Johanna Walker et.al.	2312.09947v1	null
2023-12-15	SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models	Lee Hyun et.al.	2312.09818v1	link
2023-12-14	Successor Heads: Recurring, Interpretable Attention Heads In The Wild	Rhys Gould et.al.	2312.09230v1	null
2023-12-27	Fine-Grained Image-Text Alignment in Medical Imaging Enables Cyclic Image-Report Generation	Wenting Chen et.al.	2312.08078v4	null
2023-12-13	Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning	Jinta Weng et.al.	2312.08027v1	null
2023-12-12	Tell, don't show: Declarative facts influence how LLMs generalize	Alexander Meinke et.al.	2312.07779v1	null
2023-12-05	Building Trustworthy NeuroSymbolic AI Systems: Consistency, Reliability, Explainability, and Safety	Manas Gaur et.al.	2312.06798v1	null
2023-12-10	Evidence-based Interpretable Open-domain Fact-checking with Large Language Models	Xin Tan et.al.	2312.05834v1	null
2023-11-30	Applying Large Language Models and Chain-of-Thought for Automatic Scoring	Gyeong-Geon Lee et.al.	2312.03748v1	null
2023-12-06	XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering	Joel Stremmel et.al.	2312.03567v1	null
2023-12-03	TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents	James Enouen et.al.	2312.01279v1	null
2023-11-30	Large Language Models for Travel Behavior Prediction	Baichuan Mo et.al.	2312.00819v1	null
2023-11-30	CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation	Pei Ke et.al.	2311.18702v1	link
2023-11-30	Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension	Akira Kawabata et.al.	2311.18353v1	null
2023-11-29	Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation	Xijia Zhang et.al.	2311.18062v1	null
2023-11-29	Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning	Xiaoqian Wu et.al.	2311.17365v1	null
2023-11-29	Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering	Zeqing Wang et.al.	2311.17331v1	null
2024-02-12	Large language models can enhance persuasion through linguistic feature alignment	Minkyu Shin et.al.	2311.16466v2	null
2023-11-16	Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities	Avishree Khare et.al.	2311.16169v1	null
2023-11-27	Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models	Stephen MacNeil et.al.	2311.16017v1	null
2023-11-27	Justifiable Artificial Intelligence: Engineering Large Language Models for Legal Applications	Sabine Wehnert et.al.	2311.15716v1	null
2023-11-27	Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination	Haoqiang Kang et.al.	2311.15548v1	null
2023-11-25	Code Generation Based Grading: Evaluating an Auto-grading Mechanism for "Explain-in-Plain-English" Questions	David H. Smith IV et.al.	2311.14903v1	null
2023-11-10	ChatGPT Exhibits Gender and Racial Biases in Acute Coronary Syndrome Management	Angela Zhang et.al.	2311.14703v1	null
2023-11-23	Towards Auditing Large Language Models: Improving Text-based Stereotype Detection	Wu Zekun et.al.	2311.14126v1	null
2023-11-23	Towards Explainable Strategy Templates using NLP Transformers	Pallavi Bagga et.al.	2311.14061v1	null
2023-11-22	Large Language Models in Education: Vision and Opportunities	Wensheng Gan et.al.	2311.13160v1	null
2023-11-21	A Survey on Large Language Models for Personalized and Explainable Recommendations	Junyi Chen et.al.	2311.12338v1	null
2023-11-20	Unifying Corroborative and Contributive Attributions in Large Language Models	Theodora Worledge et.al.	2311.12233v1	null
2023-11-20	LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions	Songhao Han et.al.	2311.11904v1	null
2023-11-20	Large Language Models and Explainable Law: a Hybrid Methodology	Marco Billi et.al.	2311.11811v1	null
2023-11-20	Exploring Prompting Large Language Models as Explainable Metrics	Ghazaleh Mahmoudi et.al.	2311.11552v1	link
2023-11-19	Using Causal Threads to Explain Changes in a Dynamic System	Robert B. Allen et.al.	2311.11334v1	null
2023-12-17	Rethinking Large Language Models in Mental Health Applications	Shaoxiong Ji et.al.	2311.11267v2	null
2023-11-16	ChatGPT-3.5, ChatGPT-4, Google Bard, and Microsoft Bing to Improve Health Literacy and Communication in Pediatric Populations and Beyond	Kanhai S. Amin et.al.	2311.10075v1	null
2023-11-16	Is "A Helpful Assistant" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts	Mingqian Zheng et.al.	2311.10054v1	null
2023-11-15	Explaining Explanation: An Empirical Study on Explanation in Code Reviews	Ratnadira Widyasari et.al.	2311.09020v1	null
2023-11-15	Data Similarity is Not Enough to Explain Language Model Performance	Gregory Yauney et.al.	2311.09006v1	link
2023-11-15	XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making	Zichen Chen et.al.	2311.08614v1	null
2023-11-14	UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations	Wenting Zhao et.al.	2311.08469v1	null
2023-11-16	Are Large Language Models Temporally Grounded?	Yifu Qiu et.al.	2311.08398v2	link
2023-11-13	In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax	Aaron Mueller et.al.	2311.07811v1	link
2023-11-13	On Measuring Faithfulness of Natural Language Explanations	Letitia Parcalabescu et.al.	2311.07466v1	link
2023-11-12	SELF-EXPLAIN: Teaching Large Language Models to Reason Complex Questions by Themselves	Jiachen Zhao et.al.	2311.06985v1	null
2023-11-10	Distilling Large Language Models using Skill-Occupation Graph Context for HR-Related Tasks	Pouya Pezeshkpour et.al.	2311.06383v1	link
2023-11-08	DEMASQ: Unmasking the ChatGPT Wordsmith	Kavita Kumari et.al.	2311.05019v1	null
2023-11-01	From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems	Samyar Janatian et.al.	2311.04911v1	link
2023-11-07	Extracting human interpretable structure-property relationships in chemistry using XAI and large language models	Geemi P. Wellawatte et.al.	2311.04047v1	link
2023-11-07	Which is better? Exploring Prompting Strategy For LLM-based Metrics	Joonghoon Kim et.al.	2311.03754v1	link
2023-11-07	Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning	Ruosen Li et.al.	2311.03734v1	link
2023-11-04	Can ChatGPT support software verification?	Christian Janßen et.al.	2311.02433v1	null
2023-11-12	Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models	Sean Xie et.al.	2311.01732v2	link
2023-09-26	Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI	Muhammad Aurangzeb Ahmad et.al.	2311.01463v1	null
2023-11-01	Emotion Detection for Misinformation: A Review	Zhiwei Liu et.al.	2311.00671v1	null
2023-11-22	HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning	Yongjin Yang et.al.	2311.00321v2	link
2023-11-01	ChatGPT-Powered Hierarchical Comparisons for Image Classification	Zhiyuan Ren et.al.	2311.00206v1	null
2023-11-14	Learning From Mistakes Makes LLM Better Reasoner	Shengnan An et.al.	2310.20689v2	link
2023-10-31	Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests	Max J. van Duijn et.al.	2310.20320v1	null
2023-10-30	The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics	Christoph Leiter et.al.	2310.19792v1	link
2023-10-30	Explaining Tree Model Decisions in Natural Language for Network Intrusion Detection	Noah Ziems et.al.	2310.19658v1	null
2023-10-28	The Synergy of Speculative Decoding and Batching in Serving Large Language Models	Qidong Su et.al.	2310.18813v1	null
2023-11-01	Will releasing the weights of future large language models grant widespread access to pandemic agents?	Anjali Gopal et.al.	2310.18233v2	null
2023-10-26	Beyond MLE: Convex Learning for Text Generation	Chenze Shao et.al.	2310.17217v1	null
2023-10-26	DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models	Ge Zheng et.al.	2310.16436v2	null
2023-10-25	Graph Agent: Explicit Reasoning Agent for Graphs	Qinyong Wang et.al.	2310.16421v1	null
2023-12-29	Evaluating General-Purpose AI with Psychometrics	Xiting Wang et.al.	2310.16379v2	null
2023-10-24	UI Layout Generation with LLMs Guided by UI Grammar	Yuwen Lu et.al.	2310.15455v1	null
2023-10-22	Evaluating Subjective Cognitive Appraisals of Emotions from Large Language Models	Hongli Zhan et.al.	2310.14389v1	link
2023-10-22	Towards Harmful Erotic Content Detection through Coreference-Driven Contextual Analysis	Inez Okulska et.al.	2310.14325v1	null
2023-10-21	Large Language Models and Multimodal Retrieval for Visual Word Sense Disambiguation	Anastasia Kritharoula et.al.	2310.14025v1	link
2023-10-20	Ecologically Valid Explanations for Label Variation in NLI	Nan-Jiang Jiang et.al.	2310.13850v1	link
2023-10-30	Why Can Large Language Models Generate Correct Chain-of-Thoughts?	Rasul Tutunov et.al.	2310.13571v2	null
2023-10-20	The Perils & Promises of Fact-checking with Large Language Models	Dorian Quelle et.al.	2310.13549v1	null
2023-10-20	Explaining Interactions Between Text Spans	Sagnik Ray Choudhury et.al.	2310.13506v1	link
2023-10-19	Frozen Transformers in Language Models Are Effective Visual Encoder Layers	Ziqi Pang et.al.	2310.12973v1	link
2023-10-28	Probing LLMs for hate speech detection: strengths and vulnerabilities	Sarthak Roy et.al.	2310.12860v2	null
2023-10-19	Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong	Chenglei Si et.al.	2310.12558v1	null
2023-10-17	Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations	Shiyuan Huang et.al.	2310.11207v1	null
2023-11-11	Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms	Seungju Han et.al.	2310.10418v2	link
2023-10-15	EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification	Huanhuan Ma et.al.	2310.09754v1	link
2023-10-13	A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models	Takuma Udagawa et.al.	2310.08797v1	null
2023-10-12	Circuit Component Reuse Across Tasks in Transformer Language Models	Jack Merullo et.al.	2310.08744v1	null
2023-10-12	Who Wrote it and Why? Prompting Large-Language Models for Authorship Verification	Chia-Yu Hung et.al.	2310.08123v1	null
2023-10-12	Large Language Models for Scientific Synthesis, Inference and Explanation	Yizhen Zheng et.al.	2310.07984v1	link
2023-10-11	Large Language Models Are Zero-Shot Time Series Forecasters	Nate Gruver et.al.	2310.07820v1	link
2023-10-10	Benchmarking and Explaining Large Language Model-based Code Generation: A Causality-Centric Approach	Zhenlan Ji et.al.	2310.06680v1	null
2023-10-10	SCAR: Power Side-Channel Analysis at RTL-Level	Amisha Srivastava et.al.	2310.06257v1	null
2023-10-11	The Importance of Prompt Tuning for Automated Neuron Explanations	Justin Lee et.al.	2310.06200v2	null
2023-10-09	A Meta-Learning Perspective on Transformers for Causal Language Modeling	Xinbo Wu et.al.	2310.05884v1	null
2023-10-10	Are Large Language Models Post Hoc Explainers?	Nicholas Kroeger et.al.	2310.05797v2	link
2023-10-09	A Closer Look into Automatic Evaluation Using Large Language Models	Cheng-Han Chiang et.al.	2310.05657v1	link
2023-10-09	Explaining the Complex Task Reasoning of Large Language Models with Template-Content Structure	Haotong Yang et.al.	2310.05452v1	null
2023-10-20	Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models	Haoran Wang et.al.	2310.05253v2	link
2023-10-08	Scaling Laws of RoPE-based Extrapolation	Xiaoran Liu et.al.	2310.05209v1	null
2023-10-08	Harnessing the Power of ChatGPT in Fake News: An In-Depth Exploration in Generation, Detection and Explanation	Yue Huang et.al.	2310.05046v1	null
2023-10-08	Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading	Howard Chen et.al.	2310.05029v1	null
2023-10-08	Domain Knowledge Graph Construction Via A Simple Checker	Yueling Zeng et.al.	2310.04949v1	null
2023-11-11	FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets	Neng Wang et.al.	2310.04793v2	link
2023-10-03	Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions	Naiming Liu et.al.	2310.02439v1	null
2023-10-13	Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving	Long Chen et.al.	2310.01957v2	link
2023-11-28	DeepDecipher: Accessing and Investigating Neuron Activation in Large Language Models	Albert Garde et.al.	2310.01870v2	link
2023-12-07	UPAR: A Kantian-Inspired Prompting Framework for Enhancing Large Language Model Capabilities	Hejia Geng et.al.	2310.01441v2	null
2023-10-02	Automated Evaluation of Classroom Instructional Support with LLMs and BoWs: Connecting Global Predictions to Specific Feedback	Jacob Whitehill et.al.	2310.01132v1	null
2023-10-08	Back to the Future: Towards Explainable Temporal Reasoning with Large Language Models	Chenhan Yuan et.al.	2310.01074v2	link
2023-10-01	Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning	Mustafa Shukor et.al.	2310.00647v1	link
2023-11-22	Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals	Yair Gat et.al.	2310.00603v2	null
2023-09-29	Tell Me a Story! Narrative-Driven XAI with Large Language Models	David Martens et.al.	2309.17057v1	link
2023-09-28	T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems	Ming Wang et.al.	2309.16146v1	link
2023-09-28	TPE: Towards Better Compositional Reasoning over Conceptual Tools with Multi-persona Collaboration	Hongru Wang et.al.	2309.16090v1	null
2023-09-27	HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)	Tarek Ali et.al.	2309.16021v1	null
2023-09-27	MindGPT: Interpreting What You See with Non-invasive Brain Recordings	Jiaxuan Chen et.al.	2309.15729v1	link
2023-09-23	LLMs as Counterfactual Explanation Modules: Can ChatGPT Explain Black-box Text Classifiers?	Amrita Bhattacharjee et.al.	2309.13340v1	null
2023-09-21	JobRecoGPT -- Explainable job recommendations using LLMs	Preetam Ghosh et.al.	2309.11805v1	null
2023-09-20	Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction	Masahiro Kaneko et.al.	2309.11439v1	link


LLM - Interpretable

Publish Date	Title	Authors	PDF	Code
2024-07-24	How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations?	Leo Yu-Ho Lo et.al.	2407.17291v1	null
2024-07-24	SAFETY-J: Evaluating Safety with Critique	Yixiu Liu et.al.	2407.17075v1	null
2024-07-24	Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism	Anhao Zhao et.al.	2407.17011v1	null
2024-07-23	PhenoFlow: A Human-LLM Driven Visual Analytics System for Exploring Large and Complex Stroke Datasets	Jaeyoung Kim et.al.	2407.16329v1	null
2024-07-22	Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs	Abhay Sheshadri et.al.	2407.15549v1	null
2024-07-22	Decoding BACnet Packets: A Large Language Model Approach for Packet Interpretation	Rashi Sharma et.al.	2407.15428v1	null
2024-07-22	Dissecting Multiplication in Transformers: Insights into LLMs	Luyu Qiu et.al.	2407.15360v1	null
2024-07-23	LLMExplainer: Large Language Model based Bayesian Inference for Graph Explanation Generation	Jiaxing Zhang et.al.	2407.15351v2	null
2024-07-21	XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models	Erik Cambria et.al.	2407.15248v1	null
2024-07-19	Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context	Nilanjana Das et.al.	2407.14644v1	null
2024-07-19	On Pre-training of Multimodal Language Models Customized for Chart Understanding	Wan-Cyuan Fan et.al.	2407.14506v1	null
2024-07-19	Check-Eval: A Checklist-based Approach for Evaluating Text Quality	Jayr Pereira et.al.	2407.14467v1	null
2024-07-02	Predictive Simultaneous Interpretation: Harnessing Large Language Models for Democratizing Real-Time Multilingual Communication	Kurando Iida et.al.	2407.14269v1	null
2024-07-19	KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models	Kemou Jiang et.al.	2407.14239v1	null
2024-07-19	LeKUBE: A Legal Knowledge Update BEnchmark	Changyue Wang et.al.	2407.14192v1	null
2024-07-19	ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?	Siddhant Waghjale et.al.	2407.14044v1	link
2024-07-18	PRAGyan -- Connecting the Dots in Tweets	Rahul Ravi et.al.	2407.13909v1	null
2024-07-18	X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs	Sirnam Swetha et.al.	2407.13851v1	null
2024-07-24	The Honorific Effect: Exploring the Impact of Japanese Linguistic Formalities on AI-Generated Physics Explanations	Keisuke Sato et.al.	2407.13787v2	null
2024-07-03	RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring	Ali Ghiasvand Mohammadkhani et.al.	2407.13781v1	null
2024-07-20	EarthMarker: Visual Prompt Learning for Region-level and Point-level Remote Sensing Imagery Comprehension	Wei Zhang et.al.	2407.13596v2	link
2024-07-18	CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis	Junying Chen et.al.	2407.13301v1	null
2024-07-18	SOMONITOR: Explainable Marketing Data Processing and Analysis with Large Language Models	Qi Yang et.al.	2407.13117v1	null
2024-07-18	TrialEnroll: Predicting Clinical Trial Enrollment Success with Deep & Cross Network and Large Language Models	Ling Yue et.al.	2407.13115v1	null
2024-07-10	Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)	Krishnaram Kenthapadi et.al.	2407.12858v1	null
2024-07-01	AutoFlow: Automated Workflow Generation for Large Language Model Agents	Zelong Li et.al.	2407.12821v1	link
2024-07-17	AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism	William Brannon et.al.	2407.12613v1	link
2024-07-17	NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models	Gengze Zhou et.al.	2407.12366v1	link
2024-07-16	GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text	Kyle Hamilton et.al.	2407.11827v1	null
2024-07-15	Mechanistic interpretability of large language models with applications to the financial services industry	Ashkan Golgoon et.al.	2407.11215v1	null
2024-06-27	Does ChatGPT Have a Mind?	Simon Goldstein et.al.	2407.11015v1	null
2024-06-24	Visualization Literacy of Multimodal Large Language Models: A Comparative Study	Zhimin Li et.al.	2407.10996v1	null
2024-07-15	Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval	Shengjie Ma et.al.	2407.10805v1	null
2024-07-15	Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping	Wenhao Zhu et.al.	2407.10795v1	link
2024-07-15	Interpretability analysis on a pathology foundation model reveals biologically relevant embeddings across modalities	Nhat Le et.al.	2407.10785v1	null
2024-07-15	Learning Dynamics of LLM Finetuning	Yi Ren et.al.	2407.10490v1	link
2024-07-17	LAB-Bench: Measuring Capabilities of Language Models for Biology Research	Jon M. Laurent et.al.	2407.10362v3	null
2024-07-22	TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation	Roni Goldshmidt et.al.	2407.10114v2	null
2024-07-14	Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation	Ge Gao et.al.	2407.10091v1	null
2024-07-13	Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks	Shengbin Yue et.al.	2407.09893v1	link
2024-07-13	Speech-Guided Sequential Planning for Autonomous Navigation using Large Language Model Meta AI 3 (Llama3)	Alkesh K. Srivastava et.al.	2407.09890v1	null
2024-06-26	Prompting Whole Slide Image Based Genetic Biomarker Prediction	Ling Zhang et.al.	2407.09540v1	null
2024-07-12	SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers	Shraman Pramanick et.al.	2407.09413v1	link
2024-07-11	Fault Diagnosis in Power Grids with Large Language Model	Liu Jing et.al.	2407.08836v1	null
2024-07-11	Tamil Language Computing: the Present and the Future	Kengatharaiyer Sarveswaran et.al.	2407.08618v1	null
2024-07-11	Incorporating Large Language Models into Production Systems for Enhanced Task Automation and Flexibility	Yuchen Xia et.al.	2407.08550v1	null
2024-07-11	Tactics, Techniques, and Procedures (TTPs) in Interpreted Malware: A Zero-Shot Generation with Large Language Models	Ying Zhang et.al.	2407.08532v1	null
2024-07-11	On the attribution of confidence to large language models	Geoff Keeling et.al.	2407.08388v1	null
2024-07-11	Towards Explainable Evolution Strategies with Large Language Models	Jill Baumann et.al.	2407.08331v1	null
2024-07-11	GeNet: A Multimodal LLM-Based Co-Pilot for Network Topology and Configuration	Beni Ifland et.al.	2407.08249v1	null
2024-07-10	On LLM Wizards: Identifying Large Language Models' Behaviors for Wizard of Oz Experiments	Jingchao Fang et.al.	2407.08067v1	null
2024-07-10	Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models	Yuji Zhang et.al.	2407.08039v1	null
2024-07-10	Transformer Alignment in Large Language Models	Murdock Aubry et.al.	2407.07810v1	null
2024-07-10	Interpretable Differential Diagnosis with Dual-Inference Large Language Models	Shuang Zhou et.al.	2407.07330v1	null
2024-07-09	Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling: A Survey of Early Trends, Datasets, and Challenges	Emilio Ferrara et.al.	2407.07196v1	null
2024-07-09	Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models	Flor Miriam Plaza-del-Arco et.al.	2407.06908v1	null
2024-07-10	Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts	Shuangkang Fang et.al.	2407.06842v2	null
2024-07-09	Combining Knowledge Graphs and Large Language Models	Amanda Kau et.al.	2407.06564v1	null
2024-07-09	Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons	Yongqi Leng et.al.	2407.06488v1	null
2024-07-08	Artificial Intuition: Efficient Classification of Scientific Abstracts	Harsh Sakhrani et.al.	2407.06093v1	null
2024-07-08	GenFollower: Enhancing Car-Following Prediction with Large Language Models	Xianda Chen et.al.	2407.05611v1	null
2024-07-07	Experiments with truth using Machine Learning: Spectral analysis and explainable classification of synthetic, false, and genuine information	Vishnu S. Pendyala et.al.	2407.05464v1	null
2024-07-06	Enhance the Robustness of Text-Centric Multimodal Alignments	Ting-Yu Yen et.al.	2407.05036v1	null
2024-07-05	MobileFlow: A Multimodal LLM For Mobile GUI Agent	Songqin Nong et.al.	2407.04346v1	null
2024-07-05	Crafting Large Language Models for Enhanced Interpretability	Chung-En Sun et.al.	2407.04307v1	null
2024-07-17	DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning	Chengpeng Li et.al.	2407.04078v3	link
2024-07-04	A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations	Md Tahmid Rahman Laskar et.al.	2407.04069v1	null
2024-07-04	Semantic Graphs for Syntactic Simplification: A Revisit from the Age of LLM	Peiran Yao et.al.	2407.04067v1	link
2024-07-15	LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking	Amy Xin et.al.	2407.04020v2	link
2024-07-04	Generative Technology for Human Emotion Recognition: A Scope Review	Fei Ma et.al.	2407.03640v1	null
2024-07-04	The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model	Brenden Smith et.al.	2407.03621v1	link
2024-07-03	Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering	Zhaohe Liao et.al.	2407.03008v1	null
2024-07-03	FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering	Xiaochen Wang et.al.	2407.02964v1	null
2024-07-03	Model-Enhanced LLM-Driven VUI Testing of VPA Apps	Suwan Li et.al.	2407.02791v1	null
2024-06-27	Meta Large Language Model Compiler: Foundation Models of Compiler Optimization	Chris Cummins et.al.	2407.02524v1	null
2024-06-23	INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness	Hung Le et.al.	2407.02518v1	null
2024-07-02	GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning	Zhisheng Tang et.al.	2407.01892v1	link
2024-06-29	Potential Renovation of Information Search Process with the Power of Large Language Model for Healthcare	Forhan Bin Emdad et.al.	2407.01627v1	null
2024-07-01	Agentless: Demystifying LLM-based Software Engineering Agents	Chunqiu Steven Xia et.al.	2407.01489v1	link
2024-07-01	Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models	Xiaolin Xing et.al.	2407.01358v1	link
2024-07-01	Calibrated Large Language Models for Binary Question Answering	Patrizio Giovannotti et.al.	2407.01122v1	null
2024-07-01	Human-like object concept representations emerge naturally in multimodal large language models	Changde Du et.al.	2407.01067v1	null
2024-07-01	Background-aware Multi-source Fusion Financial Trend Forecasting Mechanism	Fengting Mo et.al.	2407.00904v1	null
2024-06-29	Financial Knowledge Large Language Model	Cehao Yang et.al.	2407.00365v1	null
2024-06-29	LLM-Generated Natural Language Meets Scaling Laws: New Explorations and Data Augmentation Methods	Zhenhua Wang et.al.	2407.00322v1	null
2024-06-27	Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks	Ibrahim Abdelaziz et.al.	2407.00121v1	null
2024-06-17	A Personalised Learning Tool for Physics Undergraduate Students Built On a Large Language Model for Symbolic Regression	Yufan Zhu et.al.	2407.00065v1	null
2024-06-28	Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification	Anisha Gunjal et.al.	2406.20079v1	link
2024-06-28	Learning Interpretable Legal Case Retrieval via Knowledge-Guided Case Reformulation	Chenlong Deng et.al.	2406.19760v1	link
2024-06-27	PathAlign: A vision-language model for whole slide images in histopathology	Faruk Ahmed et.al.	2406.19578v1	null
2024-06-27	DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Nigel Fernandez et.al.	2406.19356v1	null
2024-06-27	Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding	Yue Fan et.al.	2406.19263v1	link
2024-06-27	Towards Learning Abductive Reasoning using VSA Distributed Representations	Giacomo Camposampiero et.al.	2406.19121v1	link
2024-06-27	LayoutCopilot: An LLM-powered Multi-agent Collaborative Framework for Interactive Analog Layout Design	Bingyang Liu et.al.	2406.18873v1	null
2024-06-27	DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment	Ke-Han Lu et.al.	2406.18871v1	null
2024-06-27	ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation	Jizheng Chen et.al.	2406.18825v1	null
2024-06-26	Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism	Shi Zong et.al.	2406.18762v1	null
2024-07-15	Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models	Georgios Tziafas et.al.	2406.18746v2	null
2024-06-26	Themis: Towards Flexible and Interpretable NLG Evaluation	Xinyu Hu et.al.	2406.18365v1	link
2024-06-26	AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations	Adam Dahlgren Lindström et.al.	2406.18346v1	null
2024-06-26	A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter	Mohamed Salah Bouafif et.al.	2406.18075v1	null
2024-06-26	Diagnosis Assistant for Liver Cancer Utilizing a Large Language Model with Three Types of Knowledge	Xuzhou Wu et.al.	2406.18039v1	null
2024-06-26	Automated Clinical Data Extraction with Knowledge Conditioned LLMs	Diya Li et.al.	2406.18027v1	null
2024-06-25	Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective	Hanqi Yan et.al.	2406.17969v1	null
2024-06-25	Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback	Zhongtao Miao et.al.	2406.17873v1	link
2024-06-25	Human-Object Interaction from Human-Level Instructions	Zhen Wu et.al.	2406.17840v1	null
2024-06-22	MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?	Xirui Li et.al.	2406.17806v1	null
2024-06-25	Banishing LLM Hallucinations Requires Rethinking Generalization	Johnny Li et.al.	2406.17642v1	null
2024-06-25	Large Language Models are Interpretable Learners	Ruochen Wang et.al.	2406.17224v1	link
2024-07-01	Large Language Models Assume People are More Rational than We Really are	Ryan Liu et.al.	2406.17055v2	link
2024-06-23	Unveiling LLM Mechanisms Through Neural ODEs and Control Theory	Yukun Zhang et.al.	2406.16985v1	null
2024-06-24	USDC: A Dataset of User Stance and Dogmatism in Long Conversations	Mounika Marreddy et.al.	2406.16833v1	null
2024-06-25	RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale	Beck LaBash et.al.	2406.16801v2	link
2024-06-24	OCALM: Object-Centric Assessment with Language Models	Timo Kaufmann et.al.	2406.16748v1	null
2024-06-29	EmoLLM: Multimodal Emotional Understanding Meets Large Language Models	Qu Yang et.al.	2406.16442v2	link
2024-06-25	Graph-Augmented LLMs for Personalized Health Insights: A Case Study in Sleep Analysis	Ajan Subramanian et.al.	2406.16252v2	null
2024-06-23	Preference Tuning For Toxicity Mitigation Generalizes Across Languages	Xiaochen Li et.al.	2406.16235v1	link
2024-06-23	Towards Natural Language-Driven Assembly Using Foundation Models	Omkar Joglekar et.al.	2406.16093v1	null
2024-06-23	Unlocking the Future: Exploring Look-Ahead Planning Mechanistic Interpretability in Large Language Models	Tianyi Men et.al.	2406.16033v1	null
2024-06-25	AudioBench: A Universal Benchmark for Audio Large Language Models	Bin Wang et.al.	2406.16020v2	link
2024-06-23	Memorizing Documents with Guidance in Large Language Models	Bumjin Park et.al.	2406.15996v1	null
2024-06-30	LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning	Guangsi Shi et.al.	2406.15859v2	null
2024-06-22	DABL: Detecting Semantic Anomalies in Business Processes Using Large Language Models	Wei Guan et.al.	2406.15781v1	link
2024-06-22	MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception	Guanqun Wang et.al.	2406.15768v1	null
2024-06-21	Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph	Roman Vashurin et.al.	2406.15627v1	null
2024-06-19	Dr.E Bridges Graphs with Large Language Models through Words	Zipeng Liu et.al.	2406.15504v1	null
2024-06-21	A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation	Irune Zubiaga et.al.	2406.15227v1	null
2024-06-21	Unsupervised Extraction of Dialogue Policies from Conversations	Makesh Narsimhan Sreedhar et.al.	2406.15214v1	null
2024-06-21	Asynchronous Large Language Model Enhanced Planner for Autonomous Driving	Yuan Chen et.al.	2406.14556v2	null
2024-06-20	LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors	Sheikh Asif Imran et.al.	2406.14498v1	link
2024-06-20	Self-supervised Interpretable Concept-based Models for Text Classification	Francesco De Santis et.al.	2406.14335v1	null
2024-07-01	QuST-LLM: Integrating Large Language Models for Comprehensive Spatial Transcriptomics Analysis	Chao Hui Huang et.al.	2406.14307v2	link
2024-06-20	Definition generation for lexical semantic change detection	Mariia Fedorova et.al.	2406.14167v1	link
2024-06-20	Finding Safety Neurons in Large Language Models	Jianhui Chen et.al.	2406.14144v1	null
2024-06-19	Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning	Yuval Shalev et.al.	2406.13858v1	null
2024-06-19	Fine-Tuning Gemma-7B for Enhanced Sentiment Analysis of Financial News Headlines	Kangtong Mo et.al.	2406.13626v1	null
2024-06-27	VDebugger: Harnessing Execution Feedback for Debugging Visual Programs	Xueqing Wu et.al.	2406.13444v2	link
2024-06-19	Finding Blind Spots in Evaluator LLMs with Interpretable Checklists	Sumanth Doddapaneni et.al.	2406.13439v1	link
2024-06-19	Data Contamination Can Cross Language Barriers	Feng Yao et.al.	2406.13236v1	link
2024-06-19	Locating and Extracting Relational Concepts in Large Language Models	Zijian Wang et.al.	2406.13184v1	link
2024-06-19	LLMatDesign: Autonomous Materials Discovery with Large Language Models	Shuyi Jia et.al.	2406.13163v1	null
2024-06-18	Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts	Haoxiang Wang et.al.	2406.12845v1	link
2024-06-18	ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools	Team GLM et.al.	2406.12793v1	link
2024-06-18	UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions	Xunzhi Wang et.al.	2406.12784v1	link
2024-06-18	Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning	Bingchen Zhao et.al.	2406.12742v1	link
2024-06-18	On the Robustness of Language Models for Tabular Question Answering	Kushal Raj Bhandari et.al.	2406.12719v1	null
2024-06-18	Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction	Haoqiu Yan et.al.	2406.12707v1	link
2024-06-18	MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL	Arian Askari et.al.	2406.12692v1	null
2024-06-18	Estimating Knowledge in Large Language Models Without Generating a Single Token	Daniela Gottesman et.al.	2406.12673v1	null
2024-06-18	Transforming Surgical Interventions with Embodied Intelligence for Ultrasound Robotics	Huan Xu et.al.	2406.12651v1	null
2024-06-19	Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models	Hengyi Wang et.al.	2406.12649v2	null
2024-06-19	Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models	Eldar Kurtic et.al.	2406.12572v2	link
2024-06-18	LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation	Yuhao Wang et.al.	2406.12529v1	null
2024-06-18	Interpreting Bias in Large Language Models: A Feature-Based Approach	Nirmalendu Prakash et.al.	2406.12347v1	null
2024-06-18	A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning	Lijie Hu et.al.	2406.12255v1	null
2024-06-29	Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM	Huaxin Zhang et.al.	2406.12235v2	link
2024-06-24	Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector	Gangwei Jiang et.al.	2406.12227v2	null
2024-06-17	Satyrn: A Platform for Analytics Augmented Generation	Marko Sterbentz et.al.	2406.12069v1	null
2024-06-17	ARTIST: Improving the Generation of Text-rich Images by Disentanglement	Jianyi Zhang et.al.	2406.12044v1	null
2024-06-17	Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts	Junmo Kang et.al.	2406.12034v1	null
2024-06-17	How Do Large Language Models Acquire Factual Knowledge During Pretraining?	Hoyeon Chang et.al.	2406.11813v1	null
2024-06-17	WaDec: Decompile WebAssembly Using Large Language Model	Xinyu She et.al.	2406.11346v1	null
2024-06-17	Can Machines Resonate with Humans? Evaluating the Emotional and Empathic Comprehension of LMs	Muhammad Arslan Manzoor et.al.	2406.11250v1	null
2024-06-17	Enabling robots to follow abstract instructions and complete complex dynamic tasks	Ruaridh Mon-Williams et.al.	2406.11231v1	null
2024-06-17	Compound Schema Registry	Silvery D. Fu et.al.	2406.11227v1	null
2024-06-17	MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model	Jiahao Huo et.al.	2406.11193v1	null
2024-06-18	DELRec: Distilling Sequential Pattern to Enhance LLM-based Recommendation	Guohao Sun et.al.	2406.11156v2	null
2024-07-01	The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models	Bolei Ma et.al.	2406.11096v2	null
2024-06-16	Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens	Weiyao Luo et.al.	2406.10985v1	null
2024-06-18	City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization	Zihao Jiao et.al.	2406.10958v2	null
2024-06-28	Large Language Model Enhanced Clustering for News Event Detection	Adane Nega Tarekegn et.al.	2406.10552v3	null
2024-06-17	Requirements are All You Need: From Requirements to Code with LLMs	Bingyang Wei et.al.	2406.10101v2	link
2024-06-14	Exploring the Correlation between Human and Machine Evaluation of Simultaneous Speech Translation	Xiaoman Wang et.al.	2406.10091v1	null
2024-06-14	Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam	Nabor C. Mendonça et.al.	2406.09671v1	link
2024-06-12	LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions	Nhat Hoang-Xuan et.al.	2406.08572v1	null
2024-06-12	Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning	Jaehyun Nam et.al.	2406.08527v1	null
2024-06-12	Leveraging Large Language Models for Web Scraping	Aman Ahluwalia et.al.	2406.08246v1	null
2024-06-12	AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection	Pia Pachinger et.al.	2406.08080v1	null
2024-06-12	A Concept-Based Explainability Framework for Large Multimodal Models	Jayneel Parekh et.al.	2406.08074v1	null
2024-06-12	Toward a Method to Generate Capability Ontologies from Natural Language Descriptions	Luis Miguel Vieira da Silva et.al.	2406.07962v1	null
2024-06-11	Estimating the Hallucination Rate of Generative AI	Andrew Jesson et.al.	2406.07457v1	null
2024-06-11	Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities	Delfina Sol Martinez Pandiani et.al.	2406.07353v1	link
2024-06-11	Instruct Large Language Models to Drive like Humans	Ruijun Zhang et.al.	2406.07296v1	link
2024-06-10	Harnessing AI for efficient analysis of complex policy documents: a case study of Executive Order 14110	Mark A. Kramer et.al.	2406.06657v1	null
2024-06-09	Exploring the Efficacy of Large Language Models (GPT-4) in Binary Reverse Engineering	Saman Pordanesh et.al.	2406.06637v1	null
2024-06-09	LLM Questionnaire Completion for Automatic Psychiatric Assessment	Gony Rosenman et.al.	2406.06636v1	null
2024-06-07	LinkQ: An LLM-Assisted Visual Interface for Knowledge Graph Question-Answering	Harry Li et.al.	2406.06621v1	link
2024-06-06	Prototypical Reward Network for Data-Efficient RLHF	Jinghan Zhang et.al.	2406.06606v1	null
2024-06-13	From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models	Xiaofeng Zhang et.al.	2406.06579v2	null
2024-06-18	OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step	Owen Dugan et.al.	2406.06576v2	null
2024-06-02	Inverse Constitutional AI: Compressing Preferences into Principles	Arduin Findeis et.al.	2406.06560v1	link
2024-06-11	Transforming Wearable Data into Health Insights using Large Language Model Agents	Mike A. Merrill et.al.	2406.06464v2	null
2024-06-10	Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization	Yi Gu et.al.	2406.06382v1	link
2024-06-10	MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows	Xingjian Zhang et.al.	2406.06357v1	link
2024-06-11	iMotion-LLM: Motion Prediction Instruction Tuning	Abdulwahab Felemban et.al.	2406.06211v2	null
2024-06-10	Prompting Large Language Models with Audio for General-Purpose Speech Summarization	Wonjune Kang et.al.	2406.05968v1	link
2024-06-16	RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation	Kiseung Kim et.al.	2406.05794v2	null
2024-06-08	VP-LLM: Text-Driven 3D Volume Completion with Large Language Models through Patchification	Jianmeng Liu et.al.	2406.05543v1	null
2024-06-08	MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention	Prince Jha et.al.	2406.05344v1	link
2024-06-07	LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration	Tavor Lipman et.al.	2406.05107v1	null
2024-06-07	LLM-based speaker diarization correction: A generalizable approach	Georgios Efstathiadis et.al.	2406.04927v1	link
2024-06-07	Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models	Michał Romaszewski et.al.	2406.04926v1	null
2024-06-07	WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild	Bill Yuchen Lin et.al.	2406.04770v1	link
2024-06-07	LogiCode: an LLM-Driven Framework for Logical Anomaly Detection	Yiheng Zhang et.al.	2406.04687v1	link
2024-06-07	Large Language Model-guided Document Selection	Xiang Kong et.al.	2406.04638v1	null
2024-06-07	OCDB: Revisiting Causal Discovery with a Comprehensive Benchmark and Evaluation Framework	Wei Zhou et.al.	2406.04598v1	null
2024-06-06	MAIRA-2: Grounded Radiology Report Generation	Shruthi Bannur et.al.	2406.04449v1	null
2024-06-01	Large Language Model Confidence Estimation via Black-Box Access	Tejaswini Pedapati et.al.	2406.04370v1	null
2024-06-06	Verbalized Machine Learning: Revisiting Machine Learning with Language Models	Tim Z. Xiao et.al.	2406.04344v1	null
2024-06-06	Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People	Dun-Ming Huang et.al.	2406.04278v1	link
2024-06-06	Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts	Shubham Kumar Nigam et.al.	2406.04136v1	link
2024-06-06	Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning	Xiaohu Du et.al.	2406.03718v1	link
2024-06-13	Ranking Manipulation for Conversational Search Engines	Samuel Pfrommer et.al.	2406.03589v2	link
2024-06-04	Dynamic and Adaptive Feature Generation with LLM	Xinhao Zhang et.al.	2406.03505v1	null
2024-06-05	Cycles of Thought: Measuring LLM Confidence through Stable Explanations	Evan Becker et.al.	2406.03441v1	null
2024-06-05	Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models	Qiang Sun et.al.	2406.02962v1	link
2024-06-06	Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers	Brian K Chen et.al.	2406.02847v2	null
2024-06-04	Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks	Tianyu He et.al.	2406.02550v1	link
2024-06-04	Iteration Head: A Mechanistic Study of Chain-of-Thought	Vivien Cabannes et.al.	2406.02128v1	null
2024-06-04	I've got the "Answer"! Interpretation of LLMs Hidden States in Question Answering	Valeriya Goloviznina et.al.	2406.02060v1	null
2024-06-04	Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs	Nik Bear Brown et.al.	2406.01943v1	null
2024-06-05	Dishonesty in Helpful and Harmless Alignment	Youcheng Huang et.al.	2406.01931v2	null
2024-06-21	Large Language Model-Enabled Multi-Agent Manufacturing Systems	Jonghan Lim et.al.	2406.01893v2	null
2024-06-04	PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning	Yupeng Zheng et.al.	2406.01587v2	null
2024-06-03	LoFiT: Localized Fine-tuning on LLM Representations	Fangcong Yin et.al.	2406.01563v1	link
2024-06-20	What Are Large Language Models Mapping to in the Brain? A Case Against Over-Reliance on Brain Scores	Ebrahim Feghhi et.al.	2406.01538v2	link
2024-06-03	The Geometry of Categorical and Hierarchical Concepts in Large Language Models	Kiho Park et.al.	2406.01506v1	link
2024-06-11	AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation	Junhao Cheng et.al.	2406.01388v2	link
2024-06-03	Large Language Model Assisted Optimal Bidding of BESS in FCAS Market: An AI-agent based Approach	Borui Zhang et.al.	2406.00974v1	null
2024-06-04	Efficient Behavior Tree Planning with Commonsense Pruning and Heuristic	Xinglin Chen et.al.	2406.00965v2	null
2024-06-10	Are you still on track!? Catching LLM Task Drift with Activations	Sahar Abdelnabi et.al.	2406.00799v2	null
2024-06-02	An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging	Sulaiman Khan et.al.	2406.00667v1	null
2024-06-02	Presence or Absence: Are Unknown Word Usages in Dictionaries?	Xianghe Ma et.al.	2406.00656v1	link
2024-06-11	InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation	Jacob Si et.al.	2406.00426v3	link
2024-06-01	Controlling Large Language Model Agents with Entropic Activation Steering	Nate Rahn et.al.	2406.00244v1	null
2024-05-31	DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models	Linli Yao et.al.	2405.20985v1	null
2024-05-31	Improving Reward Models with Synthetic Critiques	Zihuiwen Ye et.al.	2405.20850v1	null
2024-05-31	Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning	Cheng Tan et.al.	2405.20834v1	null
2024-05-31	UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation	Hanzhang Zhou et.al.	2405.20612v1	null
2024-05-30	XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution	Yurui Chang et.al.	2405.20404v1	null
2024-05-30	Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks	Chen Xiong et.al.	2405.20099v1	null
2024-05-30	Deciphering Human Mobility: Inferring Semantics of Trajectories with Large Language Models	Yuxiao Luo et.al.	2405.19850v1	null
2024-05-30	Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model	Chaochen Gao et.al.	2405.19846v1	null
2024-05-30	Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback	Jingwei Sun et.al.	2405.19686v1	null
2024-05-29	Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation	Atrisha Sarkar et.al.	2405.19328v1	null
2024-05-29	Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models	Tianrun Chen et.al.	2405.19326v1	null
2024-05-29	Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery	Sounak Lahiri et.al.	2405.19164v1	null
2024-06-02	Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design	Markus J. Buehler et.al.	2405.19076v2	link
2024-06-03	Genshin: General Shield for Natural Language Processing with Large Language Models	Xiao Peng et.al.	2405.18741v2	null
2024-06-02	LLM-based Hierarchical Concept Decomposition for Interpretable Fine-Grained Image Classification	Renyi Qu et.al.	2405.18672v2	null
2024-05-28	Large Language Models as Partners in Student Essay Evaluation	Toru Ishida et.al.	2405.18632v1	null
2024-05-28	OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning	Pengxiang Li et.al.	2405.18380v1	link
2024-05-28	FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models	Yang Zhang et.al.	2405.18218v1	null
2024-05-28	Exploring Context Window of Large Language Models via Decomposed Positional Vectors	Zican Dong et.al.	2405.18009v1	null
2024-05-28	SkinCAP: A Multi-modal Dermatology Dataset Annotated with Rich Medical Captions	Juexiao Zhou et.al.	2405.18004v1	null
2024-05-28	Knowledge Circuits in Pretrained Transformers	Yunzhi Yao et.al.	2405.17969v1	link
2024-05-28	Arithmetic Reasoning with LLM: Prolog Generation & Permutation	Xiaocheng Yang et.al.	2405.17893v1	null
2024-05-27	Mechanistic Interpretability of Binary and Ternary Transformers	Jason Li et.al.	2405.17703v1	link
2024-05-27	Deployment of NLP and LLM Techniques to Control Mobile Robots at the Edge: A Case Study Using GPT-4-Turbo and LLaMA 2	Pascal Sikorski et.al.	2405.17670v1	null
2024-05-27	Enhanced Robot Arm at the Edge with NLP and Vision Systems	Pascal Sikorski et.al.	2405.17665v1	null
2024-05-27	BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments	Yusuf Roohani et.al.	2405.17631v1	link
2024-05-25	Revisit, Extend, and Enhance Hessian-Free Influence Functions	Ziao Yang et.al.	2405.17490v1	null
2024-05-28	LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding	Haoyu Zhao et.al.	2405.17104v2	null
2024-05-27	Exploring the LLM Journey from Cognition to Expression with Linear Representations	Yuzi Yan et.al.	2405.16964v1	null
2024-05-27	TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing	Xinyu Zhang et.al.	2405.16803v1	null
2024-05-26	Crafting Interpretable Embeddings by Asking LLMs Questions	Vinamra Benara et.al.	2405.16714v1	link
2024-05-26	Attaining Human's Desirable Outcomes in Human-AI Interaction via Structural Causal Games	Anjie Liu et.al.	2405.16588v1	null
2024-05-26	Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search	Max Liu et.al.	2405.16450v1	null
2024-05-26	Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level	Runlin Lei et.al.	2405.16405v1	null
2024-05-25	Large Language Models Enable Automated Formative Feedback in Human-Robot Interaction Tasks	Emily Jensen et.al.	2405.16344v1	null
2024-06-03	Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge	Brendan Park et.al.	2405.16277v3	link
2024-05-25	Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention	Andrew Li et.al.	2405.16042v1	null
2024-05-24	Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models	Yue Zhang et.al.	2405.15684v1	null
2024-05-24	Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges	Jonas Becker et.al.	2405.15604v1	link
2024-05-24	ChatGPT Code Detection: Techniques for Uncovering the Source of Code	Marc Oedingen et.al.	2405.15512v1	link
2024-05-24	Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search	Nicola Dainese et.al.	2405.15383v1	null
2024-05-24	Large Language Models can Deliver Accurate and Interpretable Time Series Anomaly Detection	Jun Liu et.al.	2405.15370v1	null
2024-05-24	V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM	Abdur Rahman et.al.	2405.15341v1	null
2024-05-24	Decompose and Aggregate: A Step-by-Step Interpretable Evaluation Framework	Minzhi Li et.al.	2405.15329v1	null
2024-05-24	Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation	Ge Qu et.al.	2405.15307v1	link
2024-05-23	AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct	Bin Lei et.al.	2405.14906v1	link
2024-05-28	Explaining Multi-modal Large Language Models by Analyzing their Vision Perception	Loris Giulivi et.al.	2405.14612v2	link
2024-05-23	Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning	Jiapu Wang et.al.	2405.14170v1	null
2024-05-28	DeTox: Toxic Subspace Projection for Model Editing	Rheeya Uppaal et.al.	2405.13967v3	link
2024-05-22	Large Language Models are Good Spontaneous Multilingual Learners: Is the Multilingual Annotated Data Necessary?	Shimao Zhang et.al.	2405.13816v1	link
2024-05-22	Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation	Gauthier Guinet et.al.	2405.13622v1	null
2024-05-24	ECLIPSE: Semantic Entropy-LCS for Cross-Lingual Industrial Log Parsing	Wei Zhang et.al.	2405.13548v2	null
2024-05-22	HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model	Mustafa Yildirim et.al.	2405.13547v1	null
2024-05-21	A Survey of Robotic Language Grounding: Tradeoffs Between Symbols and Embeddings	Vanya Cohen et.al.	2405.13245v1	null
2024-05-21	GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation	Govind Ramesh et.al.	2405.13077v1	null
2024-05-19	Human-Centered LLM-Agent User Interface: A Position Paper	Daniel Chin et.al.	2405.13050v1	null
2024-05-15	IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues	Diji Yang et.al.	2405.13021v1	null
2024-05-21	Quantifying Emergence in Large Language Models	Hang Chen et.al.	2405.12617v1	link
2024-05-21	Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models	Charles O'Neill et.al.	2405.12522v1	null
2024-05-20	Directed Metric Structures arising in Large Language Models	Stéphane Gaubert et.al.	2405.12264v1	null
2024-05-20	"Set It Up!": Functional Object Arrangement with Compositional Generative Models	Yiqing Xu et.al.	2405.11928v1	null
2024-05-20	Unveiling and Manipulating Prompt Influence in Large Language Models	Zijian Feng et.al.	2405.11891v1	link
2024-05-21	Decoding by Contrasting Knowledge: Enhancing LLMs' Confidence on Edited Facts	Baolong Bi et.al.	2405.11613v2	link
2024-05-17	Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models	Paula Akemi Aoyagui et.al.	2405.11048v1	null
2024-05-20	The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks	Lucius Bushnaq et.al.	2405.10928v2	link
2024-05-17	COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain	Dimitrios P. Panagoulias et.al.	2405.10893v1	null
2024-05-17	MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains	Zhaohuan Zhan et.al.	2405.10620v1	null
2024-05-20	Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks	Anwoy Chatterjee et.al.	2405.10548v2	null
2024-05-14	Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs	Akhila Yerukola et.al.	2405.08760v1	null
2024-05-14	Challenges and Opportunities in Text Generation Explainability	Kenza Amara et.al.	2405.08468v1	null
2024-05-14	Compositional Text-to-Image Generation with Dense Blob Representations	Weili Nie et.al.	2405.08246v1	null
2024-05-13	Interpreting Latent Student Knowledge Representations in Programming Assignments	Nigel Fernandez et.al.	2405.08213v1	null
2024-05-11	Translating Expert Intuition into Quantifiable Features: Encode Investigator Domain Knowledge via LLM for Enhanced Predictive Analytics	Phoebe Jing et.al.	2405.08017v1	null
2024-05-13	A Generalist Learner for Multifaceted Medical Image Interpretation	Hong-Yu Zhou et.al.	2405.07988v1	null
2024-05-13	MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	Shuo Yin et.al.	2405.07551v1	null
2024-05-13	Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions	Xinglin Chen et.al.	2405.07474v1	null
2024-05-12	Human-interpretable clustering of short-text using large language models	Justin K. Miller et.al.	2405.07278v1	null
2024-05-11	Automating Thematic Analysis: How LLMs Analyse Controversial Topics	Awais Hameed Khan et.al.	2405.06919v1	null
2024-05-21	AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents	Shuyuan Xu et.al.	2405.06907v2	link
2024-05-10	MEIC: Re-thinking RTL Debug Automation using LLMs	Ke Xu et.al.	2405.06840v1	null
2024-05-10	Large Language Model in Financial Regulatory Interpretation	Zhiyu Cao et.al.	2405.06808v1	null
2024-05-15	On the Shape of Brainscores for Large Language Models (LLMs)	Jingkai Li et.al.	2405.06725v3	link
2024-05-09	Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses	Gaurav Kumar Gupta et.al.	2405.06712v1	null
2024-05-08	Interpretable Cross-Examination Technique (ICE-T): Using highly informative features to boost LLM performance	Goran Muric et.al.	2405.06703v1	null
2024-05-13	Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling	Lyumanshan Ye et.al.	2405.06495v2	null
2024-05-10	Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL	Ning Cheng et.al.	2405.06410v1	null
2024-05-09	LLMs for XAI: Future Directions for Explaining Explanations	Alexandra Zytek et.al.	2405.06064v1	null
2024-05-09	Probing Multimodal LLMs as World Models for Driving	Shiva Sreeram et.al.	2405.05956v1	link
2024-05-09	One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations	Yoonjoo Lee et.al.	2405.05581v1	null
2024-05-11	Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals	Joshua Clymer et.al.	2405.05466v2	null
2024-05-08	Empathy Through Multimodality in Conversational Interfaces	Mahyar Abbasian et.al.	2405.04777v1	null
2024-05-09	Large Language Models for Cyber Security: A Systematic Literature Review	HanXiang Xu et.al.	2405.04760v2	null
2024-05-13	A Transformer with Stack Attention	Jiaoda Li et.al.	2405.04515v2	link
2024-05-06	In Situ AI Prototyping: Infusing Multimodal Prompts into Mobile Settings with MobileMaker	Savvas Petridis et.al.	2405.03806v1	null
2024-05-06	Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames	Keith Burghardt et.al.	2405.03688v1	link
2024-05-23	AlphaMath Almost Zero: process Supervision without process	Guoxin Chen et.al.	2405.03553v2	link
2024-05-06	MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline	Mohamed Yaseen Jabarulla et.al.	2405.03359v1	link
2024-05-06	WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning	Yuanhan Zhang et.al.	2405.03272v1	null
2024-05-06	A Philosophical Introduction to Language Models - Part II: The Way Forward	Raphaël Millière et.al.	2405.03207v1	null
2024-05-23	Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions	Ruizhe Li et.al.	2405.03205v2	link
2024-05-06	Exploring the Potential of the Large Language Models (LLMs) in Identifying Misleading News Headlines	Md Main Uddin Rony et.al.	2405.03153v1	null
2024-05-05	Traffic Performance GPT (TP-GPT): Real-Time Data Informed Intelligent ChatBot for Transportation Surveillance and Management	Bingzhang Wang et.al.	2405.03076v1	null
2024-05-22	A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs)	Lingyao Li et.al.	2405.03066v2	null
2024-05-07	Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models	Tianze Xu et.al.	2405.02801v2	link
2024-05-04	TREC iKAT 2023: A Test Collection for Evaluating Conversational and Interactive Knowledge Assistants	Mohammad Aliannejadi et.al.	2405.02637v1	link
2024-05-03	What does the Knowledge Neuron Thesis Have to do with Knowledge?	Jingcheng Niu et.al.	2405.02421v1	link
2024-05-03	LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model	Yulin Luo et.al.	2405.02363v1	null
2024-04-18	NL2FOL: Translating Natural Language to First-Order Logic for Logical Fallacy Detection	Abhinav Lalwani et.al.	2405.02318v1	null
2024-05-03	Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows	Jasmine Y. Shih et.al.	2405.02260v1	null
2024-05-03	Argumentative Large Language Models for Explainable and Contestable Decision-Making	Gabriel Freedman et.al.	2405.02079v1	null
2024-05-02	A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law	Zhiyu Zoey Chen et.al.	2405.01769v1	null
2024-05-02	ALCM: Autonomous LLM-Augmented Causal Discovery Framework	Elahe Khatibi et.al.	2405.01744v1	null
2024-05-01	GOLD: Geometry Problem Solver with Natural Language Description	Jiaxin Zhang et.al.	2405.00494v1	link
2024-05-01	The Pyramid of Captions	Delong Chen et.al.	2405.00485v1	null
2024-05-01	CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model	Wei Zhang et.al.	2405.00435v1	null
2024-04-30	PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification	Leon Garza et.al.	2404.19744v1	null
2024-05-22	Neuro-Vision to Language: Enhancing Visual Reconstruction and Language Interaction through Brain Recordings	Guobin Shen et.al.	2404.19438v3	null
2024-04-30	Transcrib3D: 3D Referring Expression Resolution through Large Language Models	Jiading Fang et.al.	2404.19221v1	null
2024-04-29	SuperCLUE-Fin: Graded Fine-Grained Analysis of Chinese LLMs on Diverse Financial Tasks and Applications	Liang Xu et.al.	2404.19063v1	null
2024-04-29	AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering	Wenxiang Zhao et.al.	2404.18816v1	null
2024-04-29	PECC: Problem Extraction and Coding Challenges	Patrick Haller et.al.	2404.18766v1	link
2024-04-29	HFT: Half Fine-Tuning for Large Language Models	Tingfeng Hui et.al.	2404.18466v1	null
2024-04-28	Logic Agent: Enhancing Validity with Logic Rule Invocation	Hanmeng Liu et.al.	2404.18130v1	null
2024-04-27	MediFact at MEDIQA-CORR 2024: Why AI Needs a Human Touch	Nadia Saeed et.al.	2404.17999v1	link
2024-04-27	Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning	Dapeng Li et.al.	2404.17780v1	null
2024-04-29	On the Use of Large Language Models to Generate Capability Ontologies	Luis Miguel Vieira da Silva et.al.	2404.17524v2	null
2024-04-26	Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study	Yang Wu et.al.	2404.17136v1	link
2024-04-25	AutoGenesisAgent: Self-Generating Multi-Agent Systems for Complex Tasks	Jeremy Harper et.al.	2404.17017v1	null
2024-04-25	Evolve Cost-aware Acquisition Functions Using Large Language Models	Yiming Yao et.al.	2404.16906v1	null
2024-04-11	Rumour Evaluation with Very Large Language Models	Dahlia Shehata et.al.	2404.16859v1	link
2024-04-25	RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis	Xiaoman Zhang et.al.	2404.16754v1	null
2024-04-25	Evolutionary Large Language Models for Hardware Security: A Comparative Survey	Mohammad Akyash et.al.	2404.16651v1	null
2024-04-25	Interpreting Answers to Yes-No Questions in Dialogues from Multiple Domains	Zijie Wang et.al.	2404.16262v1	link
2024-04-24	Return of EM: Entity-driven Answer Set Expansion for QA Evaluation	Dongryeol Lee et.al.	2404.15650v1	null
2024-04-27	PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models	Shashi Kant Gupta et.al.	2404.15549v2	null
2024-04-01	Automated Assessment of Encouragement and Warmth in Classrooms Leveraging Multimodal Emotional Features and ChatGPT	Ruikun Hou et.al.	2404.15310v1	null
2024-04-23	Aligning LLM Agents by Learning Latent Preference from User Edits	Ge Gao et.al.	2404.15269v1	link
2024-04-22	Pixels and Predictions: Potential of GPT-4V in Meteorological Imagery Analysis and Forecast Communication	John R. Lawson et.al.	2404.15166v1	null
2024-04-23	Language in Vivo vs. in Silico: Size Matters but Larger Language Models Still Do Not Comprehend Language on a Par with Humans	Vittoria Dentella et.al.	2404.14883v1	null
2024-04-23	Think-Program-reCtify: 3D Situated Reasoning with Large Language Models	Qingrong He et.al.	2404.14705v1	null
2024-04-26	Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training	Mengzhao Jia et.al.	2404.14604v3	null
2024-04-22	Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning	Mohammed Abugurain et.al.	2404.14547v1	null
2024-04-22	CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment	Kanglei Zhou et.al.	2404.13999v1	link
2024-05-23	Towards General Conceptual Model Editing via Adversarial Representation Engineering	Yihao Zhang et.al.	2404.13752v2	link
2024-04-21	FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization	Zhaopeng Gu et.al.	2404.13671v1	null
2024-04-21	Trojan Detection in Large Language Models: Insights from The Trojan Detection Challenge	Narek Maloyan et.al.	2404.13660v1	null
2024-04-21	ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval	Kelong Mao et.al.	2404.13556v1	link
2024-04-20	"I Wish There Were an AI": Challenges and AI Potential in Cancer Patient-Provider Communication	Ziqi Yang et.al.	2404.13409v1	null
2024-04-20	Large Language Models as Test Case Generators: Performance Evaluation and Enhancement	Kefan Li et.al.	2404.13340v1	null
2024-04-19	CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models	Manish Bhatt et.al.	2404.13161v1	link
2024-04-19	Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation	Guanhua Chen et.al.	2404.12879v1	null
2024-04-19	Large Language Model Supply Chain: A Research Agenda	Shenao Wang et.al.	2404.12736v1	null
2024-04-19	Just Like Me: The Role of Opinions and Personal Experiences in The Perception of Explanations in Subjective Decision-Making	Sharon Ferguson et.al.	2404.12558v1	null
2024-04-18	BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models	Yu Feng et.al.	2404.12494v1	null
2024-04-18	MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale	Xiaotang Gai et.al.	2404.12372v1	null
2024-04-23	Large Language Models for Synthetic Participatory Planning of Synergistic Transportation Systems	Jiangbo Yu et.al.	2404.12317v3	null
2024-04-18	Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair	Yusuke Sakai et.al.	2404.12299v1	null
2024-04-18	Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM	Michelle S. Lam et.al.	2404.12259v1	link
2024-04-18	EVIT: Event-Oriented Instruction Tuning for Event Reasoning	Zhengwei Tao et.al.	2404.11978v1	null
2024-04-18	Aligning Language Models to Explicitly Handle Ambiguity	Hyuhng Joon Kim et.al.	2404.11972v1	null
2024-04-18	Concept Induction using LLMs: a user experiment for assessment	Adrita Barua et.al.	2404.11875v1	null
2024-04-17	MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory	Ali Modarressi et.al.	2404.11672v1	null
2024-04-16	Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases	Yanze Li et.al.	2404.10595v1	null
2024-04-16	Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning	Xiao Wang et.al.	2404.10552v1	null
2024-04-15	Evolving Interpretable Visual Classifiers with Large Language Models	Mia Chiquier et.al.	2404.09941v1	null
2024-04-15	Reimagining Self-Adaptation in the Age of Large Language Models	Raghav Donakanti et.al.	2404.09866v1	null
2024-04-16	How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models	Xiuwei Shang et.al.	2404.09836v2	null
2024-04-15	Resilience of Large Language Models for Noisy Instructions	Bin Wang et.al.	2404.09754v1	null
2024-04-15	Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction	David Sobrín-Hidalgo et.al.	2404.09705v1	null
2024-04-15	Bridging Vision and Language Spaces with Assignment Prediction	Jungin Park et.al.	2404.09632v1	link
2024-04-15	MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems	Kaixin Li et.al.	2404.09486v1	link
2024-04-14	Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions	Taojun Hu et.al.	2404.09135v1	null
2024-04-17	Incremental Residual Concept Bottleneck Models	Chenming Shang et.al.	2404.08978v2	null
2024-04-13	Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension	Mengnan Qi et.al.	2404.08885v1	null
2024-04-12	LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning	Junchi Wang et.al.	2404.08767v1	link
2024-04-12	Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases	Xiang Zhang et.al.	2404.08727v1	null
2024-04-05	Effects of Different Prompts on the Quality of GPT-4 Responses to Dementia Care Questions	Zhuochun Li et.al.	2404.08674v1	null
2024-03-25	Linear Cross-document Event Coreference Resolution with X-AMR	Shafiuddin Rehan Ahmed et.al.	2404.08656v1	link
2024-04-12	Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward	Xuan Xie et.al.	2404.08517v1	null
2024-04-12	Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task	Hassan Ali et.al.	2404.08424v1	null
2024-03-22	Content Knowledge Identification with Multi-Agent Large Language Models (LLMs)	Kaiqi Yang et.al.	2404.07960v1	null
2024-04-11	DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation	Anna C. Doris et.al.	2404.07917v1	link
2024-04-12	Reflectance Estimation for Proximity Sensing by Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics	Masashi Osada et.al.	2404.07717v2	link
2024-04-11	Can Large Language Models Assess Serendipity in Recommender Systems?	Yu Tokutake et.al.	2404.07499v1	null
2024-04-10	Vision-Language Model-based Physical Reasoning for Robot Liquid Perception	Wenqiang Lai et.al.	2404.06904v1	null
2024-04-09	Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?	Omid Ghahroodi et.al.	2404.06644v1	null
2024-04-09	Building A Knowledge Graph to Enrich ChatGPT Responses in Manufacturing Service Discovery	Yunqing Li et.al.	2404.06571v1	null
2024-04-09	Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python	Valdecy Pereira et.al.	2404.06370v1	link
2024-04-21	AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning	Senkang Hu et.al.	2404.06345v2	null
2024-04-07	X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model	Jan Held et.al.	2404.06332v1	null
2024-04-08	LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding	Chuwei Luo et.al.	2404.05225v1	link
2024-04-08	LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models	Shibo Hao et.al.	2404.05221v1	null
2024-04-07	Facial Affective Behavior Analysis with Instruction Tuning	Yifan Li et.al.	2404.05052v1	null
2024-04-07	FRACTAL: Fine-Grained Scoring from Aggregate Text Labels	Yukti Makhija et.al.	2404.04817v1	null
2024-04-06	Multicalibration for Confidence Scoring in LLMs	Gianluca Detommaso et.al.	2404.04689v1	null
2024-04-06	Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology	Dyke Ferber et.al.	2404.04667v1	null
2024-04-06	Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model	Zhonghan Zhao et.al.	2404.04619v1	null
2024-04-05	Scope Ambiguities in Large Language Models	Gaurav Kamath et.al.	2404.04332v1	link
2024-04-05	Assessing the quality of information extraction	Filip Seitl et.al.	2404.04068v1	null
2024-04-04	Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph	Marco Bronzini et.al.	2404.03623v1	null
2024-04-04	Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity	Jake Varley et.al.	2404.03570v1	null
2024-04-03	LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models	Gabriela Ben Melech Stan et.al.	2404.03118v1	null
2024-04-03	Towards a Fully Interpretable and More Scalable RSA Model for Metaphor Understanding	Gaia Carenini et.al.	2404.02983v1	null
2024-04-13	Explainable Traffic Flow Prediction with Large Language Models	Xusen Guo et.al.	2404.02937v3	null
2024-04-13	Toward Informal Language Processing: Knowledge of Slang in Large Language Models	Zhewei Sun et.al.	2404.02323v2	null
2024-04-02	ZeroCAP: Zero-Shot Multi-Robot Context Aware Pattern Formation via Large Language Models	Vishnunandan L. N. Venkatesh et.al.	2404.02318v1	null
2024-04-02	Towards Better Understanding of Cybercrime: The Role of Fine-Tuned LLMs in Translation	Veronica Valeros et.al.	2404.01940v1	null
2024-04-02	InsightLens: Discovering and Exploring Insights from Conversational Contexts in Large-Language-Model-Powered Data Analysis	Luoxuan Weng et.al.	2404.01644v1	null
2024-03-29	Wait, It's All Token Noise? Always Has Been: Interpreting LLM Behavior Using Shapley Value	Behnam Mohammadi et.al.	2404.01332v1	null
2024-04-01	Chat Modeling: Natural Language-based Procedural Modeling of Biological Structures without Training	Donggang Jia et.al.	2404.01063v1	null
2024-04-11	Source-Aware Training Enables Knowledge Attribution in Language Models	Muhammad Khalifa et.al.	2404.01019v2	link
2024-04-01	Query Performance Prediction using Relevance Judgments Generated by Large Language Models	Chuan Meng et.al.	2404.01012v1	link
2024-04-01	Exploring the Nexus of Large Language Models and Legal Systems: A Short Survey	Weicong Qin et.al.	2404.00990v1	null
2024-04-12	Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing	Zhenyu Qian et.al.	2404.00589v2	link
2024-03-30	PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression	Muhammad Asif Ali et.al.	2404.00489v1	null
2024-03-30	Do Vision-Language Models Understand Compound Nouns?	Sonal Kumar et.al.	2404.00419v1	null
2024-03-30	EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs	Cheng Jiayang et.al.	2404.00209v1	link
2024-03-29	User Modeling Challenges in Interactive AI Assistant Systems	Megan Su et.al.	2403.20134v1	null
2024-03-28	Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving	Akshay Gopalkrishnan et.al.	2403.19838v1	link
2024-03-28	AlloyBERT: Alloy Property Prediction with Large Language Models	Akshat Chaudhari et.al.	2403.19783v1	null
2024-03-28	Enhancing Anomaly Detection in Financial Markets with an LLM-based Multi-Agent Framework	Taejin Park et.al.	2403.19735v1	null
2024-04-01	Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis	Chenyang Liu et.al.	2403.19646v2	link
2024-03-28	Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation	Yutong He et.al.	2403.19103v1	null
2024-03-27	A Survey on Large Language Models from Concept to Implementation	Chen Wang et.al.	2403.18969v1	null
2024-03-27	CheckEval: Robust Evaluation Framework using Large Language Model via Checklist	Yukyung Lee et.al.	2403.18771v1	null
2024-04-03	Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective	Meiqi Chen et.al.	2403.18346v3	null
2024-03-27	LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models	Mingxing Peng et.al.	2403.18344v1	null
2024-03-27	Can LLMs Converse Formally? Automatically Assessing LLMs in Translating and Interpreting Formal Specifications	Rushang Karia et.al.	2403.18327v1	null
2024-03-26	Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications	Fouad Trad et.al.	2403.17787v1	null
2024-03-25	Generation of Asset Administration Shell with Large Language Model Agents: Interoperability in Digital Twins with Semantic Node	Yuchen Xia et.al.	2403.17209v1	null
2024-03-25	The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition	Georgios Chochlakis et.al.	2403.17125v1	null
2024-03-25	Grounding Language Plans in Demonstrations Through Counterfactual Perturbations	Yanwei Wang et.al.	2403.17124v1	null
2024-03-25	Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models	Hao Shao et.al.	2403.16999v1	link
2024-03-25	PropTest: Automatic Property Testing for Improved Visual Programming	Jaywon Koo et.al.	2403.16921v1	null
2024-04-22	Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography	Jiayue Zhang et.al.	2403.16687v3	null
2024-03-28	Can Language Models Pretend Solvers? Logic Code Simulation with LLMs	Minyu Chen et.al.	2403.16097v2	null
2024-04-15	Computational Sentence-level Metrics Predicting Human Sentence Comprehension	Kun Sun et.al.	2403.15822v2	null
2024-03-23	EDDA: A Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection	Daijun Ding et.al.	2403.15715v1	link
2024-04-03	Evaluating GPT-4 with Vision on Detection of Radiological Findings on Chest Radiographs	Yiliang Zhou et.al.	2403.15528v2	null
2024-03-21	Open Source Conversational LLMs do not know most Spanish words	Javier Conde et.al.	2403.15491v1	null
2024-03-15	ChatPattern: Layout Pattern Customization via Natural Language	Zixiao Wang et.al.	2403.15434v1	null
2024-03-22	Can large language models explore in-context?	Akshay Krishnamurthy et.al.	2403.15371v1	null
2024-04-03	AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models	Chaoyun Zhang et.al.	2403.15157v2	null
2024-03-22	Comprehensive Lipidomic Automation Workflow using Large Language Models	Connor Beveridge et.al.	2403.15076v1	null
2024-03-21	MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?	Renrui Zhang et.al.	2403.14624v1	null
2024-03-21	Dermacen Analytica: A Novel Methodology Integrating Multi-Modal Large Language Models with Machine Learning in tele-dermatology	Dimitrios P. Panagoulias et.al.	2403.14243v1	null
2024-04-08	MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation	Longzheng Wang et.al.	2403.14171v3	link
2024-03-20	CoMo: Controllable Motion Generation through Language Guided Pose Code Editing	Yiming Huang et.al.	2403.13900v1	null
2024-03-20	Encoding the Subsurface in 3D with Seismic	Ben Lasscock et.al.	2403.13593v1	null
2024-03-20	IndiTag: An Online Media Bias Analysis and Annotation System Using Fine-Grained Bias Indicators	Luyang Lin et.al.	2403.13446v1	link
2024-03-19	A Canary in the AI Coal Mine: American Jews May Be Disproportionately Harmed by Intellectual Property Dispossession in Large Language Model Training	Heila Precel et.al.	2403.13073v1	null
2024-04-02	AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models	Shuo Jiang et.al.	2403.13002v2	null
2024-03-19	Semantic Layering in Room Segmentation via LLMs	Taehyeon Kim et.al.	2403.12920v1	null
2024-03-19	Pragmatic Competence Evaluation of Large Language Models for Korean	Dojun Park et.al.	2403.12675v1	null
2024-04-02	Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code	Andreas Florath et.al.	2403.12627v2	null
2024-03-19	AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework	Xiang Li et.al.	2403.12582v1	link
2024-03-19	INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations	Lirui Luo et.al.	2403.12451v1	null
2024-03-19	Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales	Ayushi Nirmal et.al.	2403.12403v1	null
2024-03-19	Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models	Ying-Chun Lin et.al.	2403.12388v1	null
2024-04-02	Investigating Markers and Drivers of Gender Bias in Machine Translations	Peter J Barclay et.al.	2403.11896v2	null
2024-03-18	Metaphor Understanding Challenge Dataset for LLMs	Xiaoyu Tong et.al.	2403.11810v1	null
2024-03-22	Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning	Rao Fu et.al.	2403.11401v2	null
2024-04-10	StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows	Yiran Wu et.al.	2403.11322v3	link
2024-03-17	ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models	Siyuan Huang et.al.	2403.11289v1	link
2024-03-26	SelfIE: Self-Interpretation of Large Language Model Embeddings	Haozhe Chen et.al.	2403.10949v2	link
2024-03-16	A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment	Tianhe Wu et.al.	2403.10854v1	link
2024-03-16	LLM-based Conversational AI Therapist for Daily Functioning Screening and Psychotherapeutic Intervention via Everyday Smart Devices	Jingping Nie et.al.	2403.10779v1	null
2024-03-16	NARRATE: Versatile Language Architecture for Optimal Control in Robotics	Seif Ismail et.al.	2403.10762v1	null
2024-03-15	Uncovering Latent Themes of Messaging on Social Media by Integrating LLMs: A Case Study on Climate Campaigns	Tunazzina Islam et.al.	2403.10707v1	null
2024-03-22	Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction	Chen Chen et.al.	2403.10581v2	null
2024-03-15	TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale	Pengcheng Jiang et.al.	2403.10351v1	null
2024-03-14	Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors	Guanghua Li et.al.	2403.09747v1	null
2024-03-14	XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization	Yequan Bie et.al.	2403.09410v1	null
2024-03-14	UniCode: Learning a Unified Codebook for Multimodal Large Language Models	Sipeng Zheng et.al.	2403.09072v1	null
2024-02-21	Diet-ODIN: A Novel Framework for Opioid Misuse Detection with Interpretable Dietary Patterns	Zheyuan Zhang et.al.	2403.08820v1	link
2024-03-13	A Picture Is Worth a Thousand Words: Exploring Diagram and Video-Based OOP Exercises to Counter LLM Over-Reliance	Bruno Pereira Cipriano et.al.	2403.08396v1	null
2024-03-13	Embedded Translations for Low-resource Automated Glossing	Changbing Yang et.al.	2403.08189v1	null
2024-03-12	NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning	Bingqian Lin et.al.	2403.07376v1	link
2024-03-11	From English to ASIC: Hardware Implementation with Large Language Model	Emil Goh et.al.	2403.07039v1	link
2024-03-11	Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning	Zijian Zhou et.al.	2403.06728v1	null
2024-03-11	FashionReGen: LLM-Empowered Fashion Report Generation	Yujuan Ding et.al.	2403.06660v1	null
2024-03-10	Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs!	Huanqi Yang et.al.	2403.06201v1	null
2024-03-10	Reframe Anything: LLM Agent for Open World Video Reframing	Jiawang Cao et.al.	2403.06070v1	null
2024-03-09	LEVA: Using Large Language Models to Enhance Visual Analytics	Yuheng Zhao et.al.	2403.05816v1	null
2024-03-08	Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach	Zhen Tan et.al.	2403.05636v1	null
2024-03-08	ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment	Xiwei Hu et.al.	2403.05135v1	null
2024-03-11	Embracing Large Language and Multimodal Models for Prosthetic Technologies	Sharmita Dey et.al.	2403.04974v2	null
2024-03-07	Automatic and Universal Prompt Injection Attacks against Large Language Models	Xiaogeng Liu et.al.	2403.04957v1	link
2024-03-07	iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries	Adam Coscia et.al.	2403.04760v1	link
2024-03-07	KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts	Adam Coscia et.al.	2403.04758v1	link
2024-03-07	Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition	Aneta Koleva et.al.	2403.04577v1	link
2024-03-08	Do Large Language Model Understand Multi-Intent Spoken Language?	Shangjian Yin et.al.	2403.04481v2	link
2024-03-18	Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models	Changjiang Gao et.al.	2403.04325v2	null
2024-03-13	Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning	Deepanway Ghosal et.al.	2403.03864v3	link
2024-03-06	Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery	Wei Zhang et.al.	2403.03790v1	null
2024-03-06	GPTopic: Dynamic and Interactive Topic Representations	Arik Reuter et.al.	2403.03628v1	null
2024-03-06	Explaining Genetic Programming Trees using Large Language Models	Paula Maddigan et.al.	2403.03397v1	null
2024-03-05	Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement	Rafaela Martelo et.al.	2403.03188v1	link
2024-03-05	HINTs: Sensemaking on large collections of documents with Hypergraph visualization and INTelligent agents	Sam Yu-Te Lee et.al.	2403.02752v1	null
2024-03-05	HARGPT: Are LLMs Zero-Shot Human Activity Recognizers?	Sijie Ji et.al.	2403.02727v1	null
2024-03-05	Updating the Minimum Information about CLinical Artificial Intelligence (MI-CLAIM) checklist for generative modeling research	Brenda Y. Miao et.al.	2403.02558v1	link
2024-03-26	FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction	Alessandro Scirè et.al.	2403.02270v2	null
2024-03-04	Towards Intent-Based Network Management: Large Language Models for Intent Extraction in 5G Core Networks	Dimitrios Michael Manias et.al.	2403.02238v1	null
2024-03-04	Evaluating the Explainability of Neural Rankers	Saran Pandian et.al.	2403.01981v1	null
2024-03-03	Logic Rules as Explanations for Legal Case Retrieval	Zhongxiang Sun et.al.	2403.01457v1	link
2024-03-02	Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers	Melanie Subbiah et.al.	2403.01061v1	link
2024-03-01	Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries	Zelalem Gero et.al.	2403.01002v1	link
2024-02-26	InteraRec: Interactive Recommendations Using Multimodal Large Language Models	Saketh Reddy Karra et.al.	2403.00822v1	null
2024-02-25	Bootstrapping Cognitive Agents with a Large Language Model	Feiyu Zhu et.al.	2403.00810v1	null
2024-02-18	Ploutos: Towards interpretable stock movement prediction with financial large language model	Hanshuang Tong et.al.	2403.00782v1	null
2024-02-18	ChatDiet: Empowering Personalized Nutrition-Oriented Food Recommender Chatbots through an LLM-Augmented Framework	Zhongqi Yang et.al.	2403.00781v1	null
2024-03-27	LLMs in Political Science: Heralding a New Era of Visual Analysis	Yu Wang et.al.	2403.00154v2	null
2024-02-29	FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition	Xiaoqiang Wang et.al.	2403.00126v1	null
2024-02-29	Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines	Lijia Ma et.al.	2402.19421v1	null
2024-03-12	Data Interpreter: An LLM Agent For Data Science	Sirui Hong et.al.	2402.18679v3	link
2024-02-28	Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning	Jiachun Li et.al.	2402.18344v1	null
2024-02-29	MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery	Feihong Lu et.al.	2402.18169v2	null
2024-02-28	Cause and Effect: Can Large Language Models Truly Understand Causality?	Swagata Ashwani et.al.	2402.18139v1	null
2024-02-28	ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection	Takashi Koide et.al.	2402.18093v1	null
2024-02-27	Automated Statistical Model Discovery with Language Models	Michael Y. Li et.al.	2402.17879v1	null
2024-03-07	ByteComposer: a Human-like Melody Composition Method based on Language Model Agent	Xia Liang et.al.	2402.17785v2	null
2024-02-27	Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data	Xiao Liu et.al.	2402.17644v1	link
2024-02-27	Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides	Kaikai An et.al.	2402.17531v1	null
2024-02-27	Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models	Xiaolong Wang et.al.	2402.17226v1	null
2024-03-20	OSCaR: Object State Captioning and State Change Representation	Nguyen Nguyen et.al.	2402.17128v3	link
2024-02-24	Enforcing Temporal Constraints on Generative Agent Behavior with Reactive Synthesis	Raven Rothkopf et.al.	2402.16905v1	null
2024-02-26	Mysterious Projections: Multimodal LLMs Gain Domain-Specific Visual Capabilities Without Richer Cross-Modal Projections	Gaurav Verma et.al.	2402.16832v1	null
2024-02-28	StructLM: Towards Building Generalist Models for Structured Knowledge Grounding	Alex Zhuang et.al.	2402.16671v2	null
2024-03-04	Improving LLM-based Machine Translation with Systematic Self-Correction	Zhaopeng Feng et.al.	2402.16379v2	link
2024-02-25	AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation	Yasheng Sun et.al.	2402.16124v1	null
2024-02-25	Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression	Xinze Li et.al.	2402.16058v1	link
2024-02-25	LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding	Yuxuan Wang et.al.	2402.16050v1	link
2024-02-23	Language-Based User Profiles for Recommendation	Joyce Zhou et.al.	2402.15623v1	null
2024-02-19	Detecting misinformation through Framing Theory: the Frame Element-based Model	Guan Wang et.al.	2402.15525v1	null
2024-02-23	Explorations of Self-Repair in Language Models	Cody Rushing et.al.	2402.15390v1	link
2024-02-23	Substrate Prediction for RiPP Biosynthetic Enzymes via Masked Language Modeling and Transfer Learning	Joseph D. Clark et.al.	2402.15181v1	null
2024-02-23	Large Multimodal Agents: A Survey	Junlin Xie et.al.	2402.15116v1	null
2024-03-08	LLMBind: A Unified Modality-Task Integration Framework	Bin Zhu et.al.	2402.14891v3	null
2024-02-21	Driving Generative Agents With Their Personality	Lawrence J. Klinkert et.al.	2402.14879v1	null
2024-02-20	A Dual-Prompting for Interpretable Mental Health Language Models	Hyolim Jeon et.al.	2402.14854v1	null
2024-02-19	RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning	Congyun Jin et.al.	2402.14840v1	null
2024-02-23	A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health	Nikhil Behari et.al.	2402.14807v2	null
2024-02-22	Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation	Jiawei Wang et.al.	2402.14744v1	null
2024-02-22	COMPASS: Computational Mapping of Patient-Therapist Alliance Strategies with Language Modeling	Baihan Lin et.al.	2402.14701v1	null
2024-02-28	OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement	Tianyu Zheng et.al.	2402.14658v2	null
2024-02-22	Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond	Xinyu Wang et.al.	2402.14522v1	null
2024-02-22	Data Science with LLMs and Interpretable Models	Sebastian Bordt et.al.	2402.14474v1	link
2024-02-21	MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms	Yiqiao Jin et.al.	2402.14154v1	null
2024-02-21	DeiSAM: Segment Anything with Deictic Prompting	Hikaru Shindo et.al.	2402.14123v1	link
2024-02-21	An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach	Mohammad Amaz Uddin et.al.	2402.13871v1	null
2024-02-21	LLM4SBR: A Lightweight and Effective Framework for Integrating Large Language Models in Session-based Recommendation	Shutong Qiao et.al.	2402.13840v1	null
2024-03-15	CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models	Fuwen Luo et.al.	2402.13607v2	null
2024-02-21	Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment	Yunxin Li et.al.	2402.13561v1	null
2024-02-21	Round Trip Translation Defence against Large Language Model Jailbreaking Attacks	Canaan Yung et.al.	2402.13517v1	link
2024-02-20	SymBa: Symbolic Backward Chaining for Multi-step Natural Language Reasoning	Jinu Lee et.al.	2402.12806v1	null
2024-02-20	Are Large Language Models Rational Investors?	Yuhang Zhou et.al.	2402.12713v1	null
2024-02-18	scInterpreter: Training Large Language Models to Interpret scRNA-seq Data for Cell Type Annotation	Cong Li et.al.	2402.12405v1	null
2024-02-19	Reformatted Alignment	Run-Ze Fan et.al.	2402.12219v1	link
2024-02-19	ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning	Renqiu Xia et.al.	2402.12185v1	link
2024-02-19	Distilling Large Language Models for Text-Attributed Graph Learning	Bo Pan et.al.	2402.12022v1	null
2024-02-25	How Interpretable are Reasoning Explanations from Prompting Large Language Models?	Wei Jie Yeo et.al.	2402.11863v2	link
2024-02-22	ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs	Fengqing Jiang et.al.	2402.11753v2	null
2024-02-18	A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models	Jaylen Jones et.al.	2402.11676v1	link
2024-02-18	Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals	Francesco Ortu et.al.	2402.11655v1	link
2024-02-17	TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks	Benjamin Feuer et.al.	2402.11137v1	link
2024-02-09	Zero-shot Explainable Mental Health Analysis on Social Media by incorporating Mental Scales	Wenyu Li et.al.	2402.10948v1	null
2024-02-16	How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?	Ehsan Doostmohammadi et.al.	2402.10770v1	null
2024-02-16	Inference to the Best Explanation in Large Language Models	Dhairya Dalal et.al.	2402.10767v1	null
2024-02-16	Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability	Haiyan Zhao et.al.	2402.10688v1	null
2024-02-16	LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models	Minsuk Kahng et.al.	2402.10524v1	null
2024-02-15	OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset	Shubham Toshniwal et.al.	2402.10176v1	link
2024-02-15	Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States	Hanyu Duan et.al.	2402.09733v1	null
2024-02-15	Answer is All You Need: Instruction-following Text Embedding via Answering the Question	Letian Peng et.al.	2402.09642v1	link
2024-02-14	Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems	Liang Zhang et.al.	2402.09584v1	null
2024-02-14	AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach	Maryam Amirizaniani et.al.	2402.09334v1	null
2024-02-14	Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code	Vahid Majdinasab et.al.	2402.09299v1	null
2024-02-14	SyntaxShap: Syntax-aware Explainability Method for Text Generation	Kenza Amara et.al.	2402.09259v1	null
2024-02-14	Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models	Goutham Rajendran et.al.	2402.09236v1	null
2024-02-13	Large Language Models for the Automated Analysis of Optimization Algorithms	Camilo Chacón Sartori et.al.	2402.08472v1	link
2024-02-13	Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks	Jusung Lee et.al.	2402.08360v1	null
2024-02-17	LLaGA: Large Language and Graph Assistant	Runjin Chen et.al.	2402.08170v2	link
2024-02-25	Policy Improvement using Language Feedback Models	Victor Zhong et.al.	2402.07876v3	null
2024-02-12	Game Agent Driven by Free-Form Text Command: Using LLM-based Code Generation and Behavior Branch	Ray Ito et.al.	2402.07442v1	null
2024-02-14	Natural Language Reinforcement Learning	Xidong Feng et.al.	2402.07157v2	null
2024-02-09	InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning	Huaiyuan Ying et.al.	2402.06332v1	link
2024-02-09	ContPhy: Continuum Physical Concept Learning and Reasoning from Videos	Zhicheng Zheng et.al.	2402.06119v1	null
2024-02-02	Character-based Outfit Generation with Vision-augmented Style Extraction via LLMs	Najmeh Forouzandehmehr et.al.	2402.05941v1	null
2024-02-08	Driving Everywhere with Large Language Model Policy Adaptation	Boyi Li et.al.	2402.05932v1	null
2024-02-05	Zero-Shot Clinical Trial Patient Matching with LLMs	Michael Wornow et.al.	2402.05125v1	null
2024-02-07	Opening the AI black box: program synthesis via mechanistic interpretability	Eric J. Michaud et.al.	2402.05110v1	link
2024-02-07	Improving Cross-Domain Low-Resource Text Generation through LLM Post-Editing: A Programmer-Interpreter Approach	Zhuang Li et.al.	2402.04609v1	null
2024-02-06	Chatbot Meets Pipeline: Augment Large Language Model with Definite Finite Automaton	Yiyou Sun et.al.	2402.04411v1	null
2024-02-06	Assured LLM-Based Software Engineering	Nadia Alshahwan et.al.	2402.04380v1	null
2024-02-06	Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models	David Sobrín-Hidalgo et.al.	2402.04206v1	null
2024-02-06	SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models	Yichen Shi et.al.	2402.04178v1	link
2024-02-06	Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science	Pengfei Liu et.al.	2402.04119v1	link
2024-02-07	Position Paper: Against Spurious Sparks - Dovelating Inflated AI Claims	Patrick Altmeyer et.al.	2402.03962v2	null
2024-02-06	Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience	Xilin Jiang et.al.	2402.03710v1	null
2024-02-27	Distinguishing the Knowable from the Unknowable with Language Models	Gustaf Ahdritz et.al.	2402.03563v2	link
2024-01-25	When Geoscience Meets Generative AI and Large Language Models: Foundations, Trends, and Future Challenges	Abdenour Hadid et.al.	2402.03349v1	null
2024-03-04	English Prompts are Better for NLI-based Zero-Shot Emotion Classification than Target-Language Prompts	Patrick Barreiß et.al.	2402.03223v2	null
2024-02-22	PuzzleBench: Can LLMs Solve Challenging First-Order Combinatorial Reasoning Problems?	Chinmay Mittal et.al.	2402.02611v2	null
2024-02-04	Integration of cognitive tasks into artificial general intelligence test for large models	Youzhi Qu et.al.	2402.02547v1	null
2024-02-03	A Data Generation Perspective to the Mechanism of In-Context Learning	Haitao Mao et.al.	2402.02212v1	null
2024-02-03	Vi(E)va LLM! A Conceptual Stack for Evaluating and Interpreting Generative AI-based Visualizations	Luca Podo et.al.	2402.02167v1	link
2024-02-13	PresAIse, A Prescriptive AI Solution for Enterprises	Wei Sun et.al.	2402.02006v2	null
2024-02-02	The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning	Daniel Cunnington et.al.	2402.01889v1	null
2024-02-06	Large Language Model Agent for Hyper-Parameter Optimization	Siyi Liu et.al.	2402.01881v2	null
2024-02-02	The Political Preferences of LLMs	David Rozado et.al.	2402.01789v1	null
2024-01-30	Rethinking Interpretability in the Era of Large Language Models	Chandan Singh et.al.	2402.01761v1	link
2024-01-29	Compensatory Biases Under Cognitive Load: Reducing Selection Bias in Large Language Models	J. E. Eicher et.al.	2402.01740v1	null
2024-01-25	ChatGPT vs Gemini vs LLaMA on Multilingual Sentiment Analysis	Alessio Buscemi et.al.	2402.01715v1	null
2024-01-23	Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study	Zhe He et.al.	2402.01693v1	null
2024-02-16	Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications	Yuhang Zhou et.al.	2402.01681v2	null
2024-02-02	BAT: Learning to Reason about Spatial Sounds with Large Language Models	Zhisheng Zheng et.al.	2402.01591v1	null
2024-02-02	From Words to Molecules: A Survey of Large Language Models in Chemistry	Chang Liao et.al.	2402.01439v1	null
2024-02-02	Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis	Zeeshan Rasheed et.al.	2402.01386v1	null
2024-02-02	Reasoning Capacity in Multi-Agent Systems: Limitations, Challenges and Human-Centered Solutions	Pouya Pezeshkpour et.al.	2402.01108v1	null
2024-02-01	Executable Code Actions Elicit Better LLM Agents	Xingyao Wang et.al.	2402.01030v1	link
2024-02-01	Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement	Xin Quan et.al.	2402.00745v1	link
2024-02-01	Transforming and Combining Rewards for Aligning Large Language Models	Zihao Wang et.al.	2402.00742v1	null
2024-02-01	AssertLLM: Generating and Evaluating Hardware Verification Assertions from Design Specifications via Multi-LLMs	Wenji Fang et.al.	2402.00386v1	null
2024-02-01	IndiVec: An Exploration of Leveraging Large Language Models for Media Bias Detection with Fine-Grained Bias Indicators	Luyang Lin et.al.	2402.00345v1	null
2024-02-01	Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning	Yao-Hung Hubert Tsai et.al.	2402.00251v1	null
2024-01-31	Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT	Diego Machado Reyes et.al.	2402.00137v1	null
2024-01-31	ChIRAAG: ChatGPT Informed Rapid and Automated Assertion Generation	Bhabesh Mali et.al.	2402.00093v1	null
2024-02-07	Detecting Multimedia Generated by Large AI Models: A Survey	Li Lin et.al.	2402.00045v3	link
2024-01-21	Training microrobots to swim by a large language model	Zhuoqun Xu et.al.	2402.00044v1	null
2024-02-05	Comparative Analysis of LLaMA and ChatGPT Embeddings for Molecule Embedding	Shaghayegh Sadeghi et.al.	2402.00024v2	link
2024-02-03	EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation	Jonathan W. Kim et.al.	2401.18006v2	null
2024-01-31	Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study	Qirui Jiao et.al.	2401.17981v1	null
2024-01-31	Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction	Philipp Wicke et.al.	2401.17858v1	null
2024-01-30	Detecting mental disorder on social media: a ChatGPT-augmented explainable approach	Loris Belcastro et.al.	2401.17477v1	link
2024-02-05	EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain	Wei Zhang et.al.	2401.16822v2	null
2024-01-30	A Cross-Language Investigation into Jailbreak Attacks in Large Language Models	Jie Li et.al.	2401.16765v1	null
2024-02-03	Engineering A Large Language Model From Scratch	Abiodun Finbarrs Oketunji et.al.	2401.16736v3	null
2024-01-29	Probabilistic Abduction for Visual Abstract Reasoning via Learning Rules in Vector-symbolic Architectures	Michael Hersche et.al.	2401.16024v1	link
2024-01-29	APIGen: Generative API Method Recommendation	Yujia Chen et.al.	2401.15843v1	link
2024-02-12	Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks	Zackary Okun Dunivin et.al.	2401.15170v2	null
2024-01-26	Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias	Yu He Ke et.al.	2401.14589v1	null
2024-01-25	LongHealth: A Question Answering Benchmark with Long Clinical Documents	Lisa Adams et.al.	2401.14490v1	link
2024-01-25	GPTVoiceTasker: LLM-Powered Virtual Assistant for Smartphone	Minh Duc Vu et.al.	2401.14268v1	null
2024-01-25	CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks	Andrei Tomut et.al.	2401.14109v1	null
2024-01-25	A Survey of Deep Learning and Foundation Models for Time Series Forecasting	John A. Miller et.al.	2401.13912v1	null
2024-01-24	AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents	Chang Ma et.al.	2401.13178v1	link
2024-01-23	From Understanding to Utilization: A Survey on Explainability for Large Language Models	Haoyan Luo et.al.	2401.12874v1	null
2024-01-23	How well can large language models explain business processes?	Dirk Fahland et.al.	2401.12846v1	null
2024-01-27	C2Ideas: Supporting Creative Interior Color Design Ideation with Large Language Model	Yihan Hou et.al.	2401.12586v2	null
2024-01-30	SLANG: New Concept Comprehension of Large Language Models	Lingrui Mei et.al.	2401.12585v2	null
2024-01-23	LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools	Qianli Wang et.al.	2401.12576v1	link
2024-01-23	Automated Fact-Checking of Climate Change Claims with Large Language Models	Markus Leippold et.al.	2401.12566v1	null
2024-01-22	CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation	Zhihong Chen et.al.	2401.12208v1	null
2024-01-21	Integration of Large Language Models in Control of EHD Pumps for Precise Color Synthesis	Yanhong Peng et.al.	2401.11500v1	null
2024-01-18	LangProp: A code optimization framework using Language Models applied to driving	Shu Ishida et.al.	2401.10314v1	link
2024-01-18	Advancing Large Multi-modal Models with Explicit Chain-of-Reasoning and Visual Question Generation	Kohei Uehara et.al.	2401.10005v1	null
2024-01-18	Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models	Li Sun et.al.	2401.09861v1	null
2024-01-17	Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models	Haonan Guo et.al.	2401.09083v1	link
2024-01-17	What makes for a 'good' social actor? Using respect as a lens to evaluate interactions with language agents	Lize Alberts et.al.	2401.09082v1	null
2024-01-16	AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media	Alessandro Gambetti et.al.	2401.08825v1	null
2024-01-15	Assistant, Parrot, or Colonizing Loudspeaker? ChatGPT Metaphors for Developing Critical AI Literacies	Anuj Gupta et.al.	2401.08711v1	null
2024-01-16	Anchor function: a type of benchmark functions for studying language models	Zhongwang Zhang et.al.	2401.08309v1	null
2024-01-16	AesBench: An Expert Benchmark for Multimodal Large Language Models on Image Aesthetics Perception	Yipo Huang et.al.	2401.08276v1	link
2024-01-16	LLM-Guided Multi-View Hypergraph Learning for Human-Centric Explainable Recommendation	Zhixuan Chu et.al.	2401.08217v1	null
2024-02-16	MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline	Minpeng Liao et.al.	2401.08190v2	link
2024-02-15	Are self-explanations from Large Language Models faithful?	Andreas Madsen et.al.	2401.07927v3	link
2024-01-17	See the Unseen: Better Context-Consistent Knowledge-Editing by Noises	Youcheng Huang et.al.	2401.07544v2	null
2024-01-12	Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data	Yubin Kim et.al.	2401.06866v1	null
2024-01-12	Enhancing the Emotional Generation Capability of Large Language Models via Emotional Chain-of-Thought	Zaijing Li et.al.	2401.06836v1	null
2024-01-12	From Automation to Augmentation: Large Language Models Elevating Essay Scoring Landscape	Changrong Xiao et.al.	2401.06431v1	link
2024-01-23	How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs	Yi Zeng et.al.	2401.06373v2	link
2024-01-12	Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models	Asma Ghandeharioun et.al.	2401.06102v2	null
2024-01-11	Large Language Models vs. Search Engines: Evaluating User Preferences Across Varied Information Retrieval Scenarios	Kevin Matthe Caramancion et.al.	2401.05761v1	null
2024-01-11	Towards Conversational Diagnostic AI	Tao Tu et.al.	2401.05654v1	null
2024-01-17	Theory of Mind abilities of Large Language Models in Human-Robot Interaction : An Illusion?	Mudit Verma et.al.	2401.05302v2	null
2024-01-10	Aligning Translation-Specific Understanding to General Understanding in Large Language Models	Yichong Huang et.al.	2401.05072v1	null
2024-01-10	ANGO: A Next-Level Evaluation Benchmark For Generation-Oriented Language Models In Chinese Domain	Bingchao Wang et.al.	2401.04898v1	null
2024-01-08	Evaluating Brain-Inspired Modular Training in Automated Circuit Discovery for Mechanistic Interpretability	Jatin Nainani et.al.	2401.03646v1	null
2024-01-05	UMIE: Unified Multimodal Information Extraction with Instruction Tuning	Lin Sun et.al.	2401.03082v1	link
2024-02-01	Object-Centric Instruction Augmentation for Robotic Manipulation	Junjie Wen et.al.	2401.02814v2	null
2024-02-06	VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model	Pengying Wu et.al.	2401.02695v2	null
2024-01-05	Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks	Hartwig H. Hochmair et.al.	2401.02404v2	null
2024-01-04	DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models	Wendi Cui et.al.	2401.02132v1	link
2024-01-03	Large Language Models Relearn Removed Concepts	Michelle Lo et.al.	2401.01814v1	link
2024-01-12	WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope	Jun-Yan He et.al.	2401.01699v2	null
2024-01-02	VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics	Ammar A. Siddiqui et.al.	2401.01414v1	null
2024-01-02	A Novel Evaluation Framework for Assessing Resilience Against Prompt Injection Attacks in Large Language Models	Daniel Wankit Yip et.al.	2401.00991v1	null
2023-12-31	AllSpark: a multimodal spatiotemporal general model	Run Shao et.al.	2401.00546v1	null
2023-12-31	keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM	Chaojie Wang et.al.	2401.00426v1	null
2024-01-12	Advancing TTP Analysis: Harnessing the Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation	Reza Fayyazi et.al.	2401.00280v2	null
2023-12-30	Is Knowledge All Large Language Models Needed for Causal Reasoning?	Hengrui Cai et.al.	2401.00139v1	link
2023-12-27	Conversational Question Answering with Reformulations over Knowledge Graph	Lihui Liu et.al.	2312.17269v1	null
2023-12-29	Large Language Model for Causal Decision Making	Haitao Jiang et.al.	2312.17122v2	null
2023-12-27	Rethinking Tabular Data Understanding with Large Language Models	Tianyang Liu et.al.	2312.16702v1	link
2023-12-26	Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers	Jacob Dunefsky et.al.	2312.16291v1	link
2023-12-26	Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models	Fan Liu et.al.	2312.16275v1	null
2023-12-26	Large Language Models as Traffic Signal Control Agents: Capacity and Opportunity	Siqi Lai et.al.	2312.16044v1	link
2024-01-29	ChartBench: A Benchmark for Complex Visual Reasoning in Charts	Zhengzhuo Xu et.al.	2312.15915v2	null
2023-12-26	Think and Retrieval: A Hypothesis Knowledge Graph Enhanced Medical Large Language Models	Xinke Jiang et.al.	2312.15883v1	null
2023-12-22	Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention	Zhen Tan et.al.	2312.15033v1	null
2023-12-22	Theory of Hallucinations based on Equivariance	Hisaichi Shibata et.al.	2312.14504v1	null
2023-12-22	Don't Believe Everything You Read: Enhancing Summarization Interpretability through Automatic Identification of Hallucinations in Large Language Models	Priyesh Vakharia et.al.	2312.14346v1	null
2023-12-19	Large Language Models in Medical Term Classification and Unexpected Misalignment Between Response and Reasoning	Xiaodan Zhang et.al.	2312.14184v1	null
2023-12-21	Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs	Juraj Vladika et.al.	2312.13881v1	null
2023-12-21	A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties	Junfei Xiao et.al.	2312.13764v1	link
2023-12-20	ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training	Rongsheng Wang et.al.	2312.13316v1	link
2023-12-21	AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion	Beibei Jing et.al.	2312.12763v2	null
2023-12-21	A Case Study on Test Case Construction with Large Language Models: Unveiling Practical Insights and Challenges	Roberto Francisco de Lima Junior et.al.	2312.12598v2	null
2024-01-30	Locating Factual Knowledge in Large Language Models: Exploring the Residual Stream and Analyzing Subvalues in Vocabulary Space	Zeping Yu et.al.	2312.12141v2	null
2023-12-19	Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach	Weiyu Ma et.al.	2312.11865v1	link
2023-12-16	Learning Interpretable Queries for Explainable Image Classification with Information Pursuit	Stefan Kolek et.al.	2312.11548v1	null
2023-12-22	A mathematical perspective on Transformers	Borjan Geshkovski et.al.	2312.10794v2	link
2023-12-17	kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning	Wenting Zhao et.al.	2312.10771v1	null
2023-12-17	Knowledge Trees: Gradient Boosting Decision Trees on Knowledge Neurons as Probing Classifier	Sergey A. Saltykov et.al.	2312.10746v1	null
2023-12-17	Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression	Luis Balderas et.al.	2312.10702v1	null
2023-12-16	Continuous Prompt Generation from Linear Combination of Discrete Prompt Embeddings	Pascal Passigan et.al.	2312.10323v1	null
2023-12-23	Shedding Light on Software Engineering-specific Metaphors and Idioms	Mia Mohammad Imran et.al.	2312.10297v2	link
2023-12-15	A Review of Repository Level Prompting for LLMs	Douglas Schonholtz et.al.	2312.10101v1	null
2023-12-04	Generative AI in Writing Research Papers: A New Type of Algorithmic Bias and Uncertainty in Scholarly Work	Rishab Jain et.al.	2312.10057v1	null
2023-12-15	Neurosymbolic Value-Inspired AI (Why, What, and How)	Amit Sheth et.al.	2312.09928v1	null
2023-12-15	GPT-4 Surpassing Human Performance in Linguistic Pragmatics	Ljubisa Bojic et.al.	2312.09545v1	null
2023-12-14	Large Language Models for Autonomous Driving: Real-World Experiments	Can Cui et.al.	2312.09397v1	null
2023-12-14	Successor Heads: Recurring, Interpretable Attention Heads In The Wild	Rhys Gould et.al.	2312.09230v1	null
2023-12-14	Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models	Zhiyuan You et.al.	2312.08962v1	null
2023-12-14	Learning Safety Constraints From Demonstration Using One-Class Decision Trees	Mattijs Baert et.al.	2312.08837v1	null
2023-12-13	Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning	Jinta Weng et.al.	2312.08027v1	null
2023-12-07	Large Language Models for Intent-Driven Session Recommendations	Zhu Sun et.al.	2312.07552v1	link
2023-12-12	Efficiently Programming Large Language Models using SGLang	Lianmin Zheng et.al.	2312.07104v1	link
2023-12-12	Towards Enhanced Human Activity Recognition through Natural Language Generation and Pose Estimation	Nikhil Kashyap et.al.	2312.06965v1	null
2023-12-27	Steering Llama 2 via Contrastive Activation Addition	Nina Rimsky et.al.	2312.06681v2	link
2023-12-11	AnyHome: Open-Vocabulary Generation of Structured and Textured 3D Homes	Zehao Wen et.al.	2312.06644v1	null
2023-12-11	DiffVL: Scaling Up Soft Body Manipulation using Vision-Language Driven Differentiable Physics	Zhiao Huang et.al.	2312.06408v1	null
2023-12-11	GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models	Jiaxu Zhao et.al.	2312.06315v1	null
2023-12-11	ProtoCode: Leveraging Large Language Models for Automated Generation of Machine-Readable Protocols from Scientific Publications	Shuo Jiang et.al.	2312.06241v1	null
2023-12-10	Evidence-based Interpretable Open-domain Fact-checking with Large Language Models	Xin Tan et.al.	2312.05834v1	null
2023-12-19	Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning	Subhabrata Dutta et.al.	2312.05571v2	link
2023-12-09	Image and Data Mining in Reticular Chemistry Using GPT-4V	Zhiling Zheng et.al.	2312.05468v1	null
2023-12-09	Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation	Atoosa Chegini et.al.	2312.05464v1	null
2023-12-08	GlitchBench: Can large multimodal models detect video game glitches?	Mohammad Reza Taesiri et.al.	2312.05291v1	null
2023-12-08	Retrieval-based Video Language Model for Efficient Long Video Question Answering	Jiaqi Xu et.al.	2312.04931v1	null
2023-12-08	Ophtha-LLaMA2: A Large Language Model for Ophthalmology	Huan Zhao et.al.	2312.04906v1	null
2024-01-10	KwaiAgents: Generalized Information-seeking Agent System with Large Language Models	Haojie Pan et.al.	2312.04889v3	link
2023-12-07	AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making	Shusen Liu et.al.	2312.04494v1	null
2023-12-07	LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs	Yunsheng Ma et.al.	2312.04372v1	null
2023-12-27	Towards Knowledge-driven Autonomous Driving	Xin Li et.al.	2312.04316v3	link
2023-12-07	Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models	Yijie Zhang et.al.	2312.04019v1	null
2023-12-05	How should the advent of large language models affect the practice of science?	Marcel Binz et.al.	2312.03759v1	null
2023-12-04	Near-real-time Earthquake-induced Fatality Estimation using Crowdsourced Data and Large-Language Models	Chenguang Wang et.al.	2312.03755v1	null
2023-12-08	Methods to Estimate Large Language Model Confidence	Maia Kotelanski et.al.	2312.03733v2	null
2023-12-06	GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models	Haicheng Liao et.al.	2312.03543v1	link
2023-12-05	FlexModel: A Framework for Interpretability of Distributed Large Language Models	Matthew Choi et.al.	2312.03140v1	link
2023-12-07	Evaluating Agents using Social Choice Theory	Marc Lanctot et.al.	2312.03121v2	link
2023-12-05	Breast Ultrasound Report Generation using LangChain	Jaeyoung Huh et.al.	2312.03013v1	null
2023-12-05	Harmonizing Global Voices: Culturally-Aware Models for Enhanced Content Moderation	Alex J. Chan et.al.	2312.02401v1	null
2023-12-04	LLMs Accelerate Annotation for Medical Information Extraction	Akshay Goel et.al.	2312.02296v1	null
2023-12-04	Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition	Chengyou Jia et.al.	2312.02226v1	null
2023-11-28	Training Chain-of-Thought via Latent-Variable Inference	Du Phan et.al.	2312.02179v1	null
2023-12-04	Learning Machine Morality through Experience and Interaction	Elizaveta Tennant et.al.	2312.01818v1	null
2023-12-26	Jellyfish: A Large Language Model for Data Preprocessing	Haochen Zhang et.al.	2312.01678v3	null
2023-12-11	Characterizing Large Language Model Geometry Solves Toxicity Detection and Generation	Randall Balestriero et.al.	2312.01648v2	link
2023-12-04	The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning	Bill Yuchen Lin et.al.	2312.01552v1	null
2023-12-03	SAGE: Bridging Semantic and Actionable Parts for GEneralizable Articulated-Object Manipulation under Language Instructions	Haoran Geng et.al.	2312.01307v1	null
2023-12-03	TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents	James Enouen et.al.	2312.01279v1	null
2023-12-02	From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews	Alex Liu et.al.	2312.01202v1	null
2023-12-01	Leveraging Large Language Models to Improve REST API Testing	Myeongsoo Kim et.al.	2312.00894v1	null
2023-12-18	Empowering Autonomous Driving with Large Language Models: A Safety Perspective	Yixuan Wang et.al.	2312.00812v3	null
2023-11-30	Towards Accurate Differential Diagnosis with Large Language Models	Daniel McDuff et.al.	2312.00164v1	null
2023-11-30	PoseGPT: Chatting about 3D Human Pose	Yao Feng et.al.	2311.18836v1	null
2023-11-30	CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation	Zineng Tang et.al.	2311.18775v1	null
2023-12-05	AlignBench: Benchmarking Chinese Alignment of Large Language Models	Xiao Liu et.al.	2311.18743v3	link
2023-11-30	Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent	Yuxiao Chen et.al.	2311.18307v1	null
2023-11-29	Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings	Andrea W Wen-Yi et.al.	2311.18034v1	link
2023-11-28	Unlocking Spatial Comprehension in Text-to-Image Diffusion Models	Mohammad Mahdi Derakhshani et.al.	2311.17937v1	null
2023-11-29	VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following	Yujie Lu et.al.	2311.17647v1	null
2023-11-29	Exploring Large Language Models for Human Mobility Prediction under Public Events	Yuebing Liang et.al.	2311.17351v1	null
2023-11-29	Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering	Zeqing Wang et.al.	2311.17331v1	null
2023-11-28	Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis	Xiaohui Chen et.al.	2311.17126v1	null
2023-11-30	Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following	Yutong Feng et.al.	2311.17002v2	null
2023-12-27	StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models	Kazuki Yamauchi et.al.	2311.16509v2	null
2023-12-10	LLMGA: Multimodal Large Language Model based Generation Assistant	Bin Xia et.al.	2311.16500v2	link
2023-11-27	ChartLlama: A Multimodal LLM for Chart Understanding and Generation	Yucheng Han et.al.	2311.16483v1	null
2023-11-27	Have we built machines that think like people?	Luca M. Schulze Buschoff et.al.	2311.16093v1	link
2023-11-27	Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models	Stephen MacNeil et.al.	2311.16017v1	null
2023-11-27	Sparsify-then-Classify: From Internal Neurons of Large Language Models To Efficient Text Classifiers	Yilun Liu et.al.	2311.15983v1	link
2023-11-27	Dawning of a New Era in Gravitational Wave Data Analysis: Unveiling Cosmic Mysteries via Artificial Intelligence -- A Systematic Review	Tianyu Zhao et.al.	2311.15585v1	null
2023-12-03	See and Think: Embodied Agent in Virtual Environment	Zhonghan Zhao et.al.	2311.15209v2	null
2023-11-25	Localizing Lying in Llama: Understanding Instructed Dishonesty on True-False Questions Through Prompting, Probing, and Patching	James Campbell et.al.	2311.15131v1	null
2023-11-19	Zero-Shot Question Answering over Financial Documents using Large Language Models	Karmvir Singh Phogat et.al.	2311.14722v1	null
2023-11-24	Benchmarking Large Language Models for Log Analysis, Security, and Interpretation	Egil Karlsen et.al.	2311.14519v1	null
2023-11-30	A density estimation perspective on learning from pairwise human preferences	Vincent Dumoulin et.al.	2311.14115v2	link
2023-11-23	Towards Explainable Strategy Templates using NLP Transformers	Pallavi Bagga et.al.	2311.14061v1	null
2023-11-23	Challenges of Large Language Models for Mental Health Counseling	Neo Christopher Chung et.al.	2311.13857v1	null
2023-12-03	FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design	Yangyang Yu et.al.	2311.13743v2	link
2023-11-22	Vamos: Versatile Action Models for Video Understanding	Shijie Wang et.al.	2311.13627v1	null
2023-11-22	ADriver-I: A General World Model for Autonomous Driving	Fan Jia et.al.	2311.13549v1	null
2023-12-15	Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs	Yonghui Wang et.al.	2311.13194v2	link
2023-11-25	From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models	Zachary Englhardt et.al.	2311.13063v2	null
2023-11-21	ALPHA: AnomaLous Physiological Health Assessment Using Large Language Models	Jiankai Tang et.al.	2311.12524v1	link
2023-11-21	Adapting LLMs for Efficient, Personalized Information Retrieval: Methods and Implications	Samira Ghodratnama et.al.	2311.12287v1	null
2023-11-20	Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents	Zhuosheng Zhang et.al.	2311.11797v1	link
2023-11-20	Incorporating LLM Priors into Tabular Learners	Max Zhu et.al.	2311.11628v1	null
2023-11-20	GPT in Data Science: A Practical Exploration of Model Selection	Nathalia Nascimento et.al.	2311.11516v1	null
2023-11-20	Meta Prompting for AGI Systems	Yifan Zhang et.al.	2311.11482v1	link
2023-12-17	Rethinking Large Language Models in Mental Health Applications	Shaoxiong Ji et.al.	2311.11267v2	null
2023-11-18	Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models	Haoran Zhao et.al.	2311.11012v1	null
2023-11-18	RecExplainer: Aligning Large Language Models for Recommendation Model Interpretability	Yuxuan Lei et.al.	2311.10947v1	null
2023-11-17	Flexible Model Interpretability through Natural Language Model Editing	Karel D'Oosterlinck et.al.	2311.10905v1	null
2023-11-27	A Language Agent for Autonomous Driving	Jiageng Mao et.al.	2311.10813v3	link
2023-11-15	MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning	Fuxiao Liu et.al.	2311.10774v1	link
2023-11-16	MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning	Xiangru Tang et.al.	2311.10537v1	link
2023-11-16	Interpreting User Requests in the Context of Natural Language Standing Instructions	Nikita Moghe et.al.	2311.09796v1	null
2023-11-16	On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering	Linyong Nan et.al.	2311.09721v1	null
2023-11-16	Evaluating In-Context Learning of Libraries for Code Generation	Arkil Patel et.al.	2311.09635v1	null
2023-11-16	Efficient End-to-End Visual Document Understanding with Rationale Distillation	Wang Zhu et.al.	2311.09612v1	null
2023-11-16	Pachinko: Patching Interpretable QA Models through Natural Language Feedback	Chaitanya Malaviya et.al.	2311.09558v1	link
2023-11-09	Chain of Images for Intuitively Reasoning	Fanxu Meng et.al.	2311.09241v1	link
2023-11-15	TableLlama: Towards Open Large Generalist Models for Tables	Tianshu Zhang et.al.	2311.09206v1	null
2023-11-15	MELA: Multilingual Evaluation of Linguistic Acceptability	Ziyin Zhang et.al.	2311.09033v1	null
2023-11-15	Identifying Linear Relational Concepts in Large Language Models	David Chanin et.al.	2311.08968v1	null
2023-11-15	I Was Blind but Now I See: Implementing Vision-Enabled Dialogue in Social Robots	Giulio Antonio Abbo et.al.	2311.08957v1	null
2023-11-15	HELLaMA: LLaMA-based Table to Text Generation by Highlighting the Important Evidence	Junyi Bian et.al.	2311.08896v1	null
2023-11-15	Token Prediction as Implicit Classification to Identify LLM-Generated Text	Yutian Chen et.al.	2311.08723v1	link
2023-11-15	Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling	Bairu Hou et.al.	2311.08718v1	link
2023-11-15	XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making	Zichen Chen et.al.	2311.08614v1	null
2023-11-15	Navigating the Ocean of Biases: Political Bias Attribution in Language Models via Causal Structures	David F. Jenny et.al.	2311.08605v1	link
2023-11-14	Towards Evaluating AI Systems for Moral Status Using Self-Reports	Ethan Perez et.al.	2311.08576v1	null
2023-11-14	Taxonomy, Semantic Data Schema, and Schema Alignment for Open Data in Urban Building Energy Modeling	Liang Zhang et.al.	2311.08535v1	null
2023-11-14	Plum: Prompt Learning using Metaheuristic	Rui Pan et.al.	2311.08364v1	link
2023-11-14	Human-Centric Autonomous Systems With LLMs for User Command Reasoning	Yi Yang et.al.	2311.08206v1	link
2023-11-11	Conceptual Model Interpreter for Large Language Models	Felix Härer et.al.	2311.07605v1	link
2023-11-13	It's Not Easy Being Wrong: Evaluating Process of Elimination Reasoning in Large Language Models	Nishant Balepur et.al.	2311.07532v1	link
2023-11-13	Finding and Editing Multi-Modal Neurons in Pre-Trained Transformer	Haowen Pan et.al.	2311.07470v1	null
2023-11-13	On Measuring Faithfulness of Natural Language Explanations	Letitia Parcalabescu et.al.	2311.07466v1	link
2023-11-13	Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models	Junpeng Li et.al.	2311.07314v1	null
2023-11-12	Assessing the Interpretability of Programmatic Policies with Large Language Models	Zahra Bashir et.al.	2311.06979v1	null
2023-11-12	Simulating Public Administration Crisis: A Novel Generative Agent-Based Simulation System to Lower Technology Barriers in Social Science Research	Bushi Xiao et.al.	2311.06957v1	null
2023-11-10	ChatGPT in the context of precision agriculture data analytics	Ilyas Potamitis et.al.	2311.06390v1	link
2023-11-09	Deep Natural Language Feature Learning for Interpretable Prediction	Felipe Urrutia et.al.	2311.05754v1	link
2023-11-09	Do personality tests generalize to Large Language Models?	Florian E. Dorner et.al.	2311.05297v1	null
2023-11-02	Chain of Empathy: Enhancing Empathetic Response of Large Language Models Based on Psychotherapy Models	Yoon Kyung Lee et.al.	2311.04915v1	null
2023-11-08	SEMQA: Semi-Extractive Multi-Source Question Answering	Tal Schuster et.al.	2311.04886v1	link
2023-11-07	Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning	Sai Munikoti et.al.	2311.04348v1	null
2023-11-07	Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves	Yihe Deng et.al.	2311.04205v1	link
2023-11-07	Perturbed examples reveal invariances shared by language models	Ruchit Rawal et.al.	2311.04166v1	null
2023-11-07	Extracting human interpretable structure-property relationships in chemistry using XAI and large language models	Geemi P. Wellawatte et.al.	2311.04047v1	link
2023-11-07	Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models	Yichao Cao et.al.	2311.03799v1	link
2023-11-07	Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning	Ruosen Li et.al.	2311.03734v1	link
2023-11-07	The Linear Representation Hypothesis and the Geometry of Large Language Models	Kiho Park et.al.	2311.03658v1	link
2023-11-06	Beyond Words: A Mathematical Framework for Interpreting Large Language Models	Javier González et.al.	2311.03033v1	null
2023-11-06	QualEval: Qualitative Evaluation for Model Improvement	Vishvak Murahari et.al.	2311.02807v1	link
2023-11-03	Don't Make Your LLM an Evaluation Benchmark Cheater	Kun Zhou et.al.	2311.01964v1	null
2023-11-06	Large Language Models to the Rescue: Reducing the Complexity in Scientific Workflow Development Using ChatGPT	Mario Sänger et.al.	2311.01825v2	null
2023-11-12	Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models	Sean Xie et.al.	2311.01732v2	link
2023-11-02	TopicGPT: A Prompt-based Topic Modeling Framework	Chau Minh Pham et.al.	2311.01449v1	link
2023-11-02	REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots	Andrea Tagliabue et.al.	2311.01403v1	null
2023-11-02	Revisiting the Knowledge Injection Frameworks	Peng Fu et.al.	2311.01150v1	null
2023-11-02	Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game	Sam Toyer et.al.	2311.01011v1	null
2023-11-02	Vision-Language Interpreter for Robot Task Planning	Keisuke Shirai et.al.	2311.00967v1	link
2023-11-02	M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place	Wentao Yuan et.al.	2311.00926v1	null
2023-11-01	Emotion Detection for Misinformation: A Review	Zhiwei Liu et.al.	2311.00671v1	null
2023-11-01	De-Diffusion Makes Text a Strong Cross-Modal Interface	Chen Wei et.al.	2311.00618v1	null
2023-11-01	The Mystery and Fascination of LLMs: A Comprehensive Survey on the Interpretation and Analysis of Emergent Abilities	Yuxiang Zhou et.al.	2311.00237v1	null
2023-11-01	Is GPT Powerful Enough to Analyze the Emotions of Memes?	Jingjing Wang et.al.	2311.00223v1	null
2023-10-31	Large Language Model Can Interpret Latent Space of Sequential Recommender	Zhengyi Yang et.al.	2310.20487v1	link
2023-10-31	The SourceData-NLP dataset: integrating curation into scientific publishing for training large language models	Jorge Abreu-Vicente et.al.	2310.20440v1	link
2023-10-30	Generative retrieval-augmented ontologic graph and multi-agent strategies for interpretive large language model-based materials design	Markus J. Buehler et.al.	2310.19998v1	null
2023-10-30	GPCR-BERT: Interpreting Sequential Design of G Protein Coupled Receptors Using Protein Language Models	Seongwon Kim et.al.	2310.19915v1	null

LLM - Reasoning

Publish Date	Title	Authors	PDF	Code
2024-07-24	Grammar-based Game Description Generation using Large Language Models	Tsunehiko Tanaka et.al.	2407.17404v1	null
2024-07-24	Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching	Yuyang Ding et.al.	2407.17349v1	null
2024-07-24	LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover	Zijian Wu et.al.	2407.17227v1	null
2024-07-24	Fusing LLMs and KGs for Formal Causal Reasoning behind Financial Risk Contagion	Guanyuan Yu et.al.	2407.17190v1	null
2024-07-24	Reinforced Prompt Personalization for Recommendation with Large Language Models	Wenyu Mao et.al.	2407.17115v1	link
2024-07-24	A Voter-Based Stochastic Rejection-Method Framework for Asymptotically Safe Language Model Outputs	Jake R. Watts et.al.	2407.16994v1	null
2024-07-24	ScholarChemQA: Unveiling the Power of Language Models in Chemical Research Question Answering	Xiuying Chen et.al.	2407.16931v1	null
2024-07-23	CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs	Jihyung Kil et.al.	2407.16837v1	link
2024-07-23	PreAlign: Boosting Cross-Lingual Transfer by Early Establishment of Multilingual Alignment	Jiahuan Li et.al.	2407.16222v1	null
2024-07-23	Figure it Out: Analyzing-based Jailbreak Attack on Large Language Models	Shi Lin et.al.	2407.16205v1	null
2024-07-23	UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models	Liu Qi et.al.	2407.16160v1	null
2024-07-22	Enhancing Temporal Understanding in LLMs for Semi-structured Tables	Irwin Deng et.al.	2407.16030v1	null
2024-07-22	Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability	Zhuoyan Xu et.al.	2407.15720v1	link
2024-07-22	CrashEventLLM: Predicting System Crashes with Large Language Models	Priyanka Mudgal et.al.	2407.15716v1	null
2024-07-22	HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning	Zhecan Wang et.al.	2407.15680v1	null
2024-07-22	Dissecting Multiplication in Transformers: Insights into LLMs	Luyu Qiu et.al.	2407.15360v1	null
2024-07-21	Evidence-Based Temporal Fact Verification	Anab Maulana Barik et.al.	2407.15291v1	null
2024-07-21	MIBench: Evaluating Multimodal Large Language Models over Multiple Images	Haowei Liu et.al.	2407.15272v1	null
2024-07-21	Multi-Agent Causal Discovery Using Large Language Models	Hao Duong Le et.al.	2407.15073v1	null
2024-07-22	Knowledge Mechanisms in Large Language Models: A Survey and Perspective	Mengru Wang et.al.	2407.15017v1	null
2024-07-20	Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data	Antonis Antoniades et.al.	2407.14985v1	null
2024-07-20	TraveLLM: Could you plan my new public transit route in face of a network disruption?	Bowen Fang et.al.	2407.14926v1	null
2024-07-20	Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models	Ze Yu Zhang et.al.	2407.14845v1	null
2024-07-20	Can VLMs be used on videos for action recognition? LLMs are Visual Reasoning Coordinators	Harsh Lunia et.al.	2407.14834v1	null
2024-07-20	On the Design and Analysis of LLM-Based Algorithms	Yanxi Chen et.al.	2407.14788v1	link
2024-07-19	Adversarial Databases Improve Success in Retrieval-based Large Language Models	Sean Wu et.al.	2407.14609v1	null
2024-07-18	Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Thought	Xiaoyu Tan et.al.	2407.14562v1	null
2024-07-19	Internal Consistency and Self-Feedback in Large Language Models: A Survey	Xun Liang et.al.	2407.14507v1	link
2024-07-19	On Pre-training of Multimodal Language Models Customized for Chart Understanding	Wan-Cyuan Fan et.al.	2407.14506v1	null
2024-07-18	ViLLa: Video Reasoning Segmentation with Large Language Model	Rongkun Zheng et.al.	2407.14500v1	link
2024-07-19	Evaluating the Reliability of Self-Explanations in Large Language Models	Korbinian Randl et.al.	2407.14487v1	link
2024-07-19	OpenSU3D: Open World 3D Scene Understanding using Foundation Models	Rafay Mohiuddin et.al.	2407.14279v1	null
2024-07-19	LeKUBE: A Legal Knowledge Update BEnchmark	Changyue Wang et.al.	2407.14192v1	null
2024-07-19	Visual Text Generation in the Wild	Yuanzhi Zhu et.al.	2407.14138v1	link
2024-07-19	Enhancing Data-Limited Graph Neural Networks by Actively Distilling Knowledge from Large Language Models	Quan Li et.al.	2407.13989v1	null
2024-07-18	Werewolf Arena: A Case Study in LLM Evaluation via Social Deduction	Suma Bailis et.al.	2407.13943v1	null
2024-07-18	PRAGyan -- Connecting the Dots in Tweets	Rahul Ravi et.al.	2407.13909v1	null
2024-07-18	X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs	Sirnam Swetha et.al.	2407.13851v1	null
2024-07-18	Which objects help me to act effectively? Reasoning about physically-grounded affordances	Anne Kemmeren et.al.	2407.13811v1	null
2024-07-18	SegPoint: Segment Any Point Cloud via Large Language Model	Shuting He et.al.	2407.13761v1	null
2024-07-18	Prover-Verifier Games improve legibility of LLM outputs	Jan Hendrik Kirchner et.al.	2407.13692v1	null
2024-07-18	Weak-to-Strong Reasoning	Yuqing Yang et.al.	2407.13647v1	link
2024-07-18	KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration	Youfu Yan et.al.	2407.13598v1	null
2024-07-18	Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation	Hassan Ali et.al.	2407.13505v1	null
2024-07-18	Combining Constraint Programming Reasoning with Large Language Model Predictions	Florian Régin et.al.	2407.13490v1	null
2024-07-18	BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models	Moon Ye-Bin et.al.	2407.13442v1	null
2024-07-18	Reconstruct the Pruned Model without Any Retraining	Pingjie Wang et.al.	2407.13331v1	null
2024-07-18	CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis	Junying Chen et.al.	2407.13301v1	null
2024-07-18	Are Large Language Models Capable of Generating Human-Level Narratives?	Yufei Tian et.al.	2407.13248v1	null
2024-07-18	Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data	Wufei Ma et.al.	2407.13094v1	null
2024-07-17	Leveraging Environment Interaction for Automated PDDL Generation and Planning with Large Language Models	Sadegh Mahdavi et.al.	2407.12979v1	null
2024-07-16	BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval	Hongjin Su et.al.	2407.12883v1	null
2024-07-16	Large Visual-Language Models Are Also Good Classifiers: A Study of In-Context Multimodal Fake News Detection	Ye Jiang et.al.	2407.12879v1	null
2024-07-16	Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning	Yaswanth Narsupalli et.al.	2407.12877v1	null
2024-07-12	Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models	Jung Hyun Lee et.al.	2407.12863v1	null
2024-07-10	Analyzing Large language models chatbots: An experimental approach using a probability test	Melise Peruchini et.al.	2407.12862v1	null
2024-07-17	Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models?	Ben Yao et.al.	2407.12725v1	null
2024-07-17	Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models	Xihe Qiu et.al.	2407.12532v1	null
2024-07-17	Struct-X: Enhancing Large Language Models Reasoning with Structured Data	Xiaoyu Tan et.al.	2407.12522v1	null
2024-07-17	Case2Code: Learning Inductive Reasoning with Synthetic Data	Yunfan Shao et.al.	2407.12504v1	link
2024-07-17	Evaluating Linguistic Capabilities of Multimodal LLMs in the Lens of Few-Shot Learning	Mustafa Dogan et.al.	2407.12498v1	null
2024-07-17	F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions	Jie Yang et.al.	2407.12435v1	null
2024-07-17	TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish	Arda Yüksel et.al.	2407.12402v1	null
2024-07-17	Mamba-PTQ: Outlier Channels in Recurrent Large Language Models	Alessandro Pierro et.al.	2407.12397v1	null
2024-07-17	NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models	Gengze Zhou et.al.	2407.12366v1	link
2024-07-17	LLM-based query paraphrasing for video search	Jiaxin Wu et.al.	2407.12341v1	null
2024-07-16	Private prediction for large-scale synthetic text generation	Kareem Amin et.al.	2407.12108v1	null
2024-07-16	Better RAG using Relevant Information Gain	Marc Pickett et.al.	2407.12101v1	link
2024-07-16	NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?	Mo Li et.al.	2407.11963v1	link
2024-07-17	Harnessing Large Language Models for Multimodal Product Bundling	Xiaohao Liu et.al.	2407.11712v2	null
2024-07-16	A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting	He Chang et.al.	2407.11638v1	null
2024-07-16	Reasoning with Large Language Models, a Survey	Aske Plaat et.al.	2407.11511v1	null
2024-07-16	SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions	Shicheng Liu et.al.	2407.11417v1	null
2024-07-19	Reliable Reasoning Beyond Natural Language	Nasim Borazjanizadeh et.al.	2407.11373v2	null
2024-07-16	VISA: Reasoning Video Object Segmentation via Large Language Models	Cilin Yan et.al.	2407.11325v1	link
2024-07-15	Making New Connections: LLMs as Puzzle Generators for The New York Times' Connections Word Game	Tim Merino et.al.	2407.11240v1	null
2024-07-17	Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay	Gonçalo Hora de Carvalho et.al.	2407.11068v2	link
2024-07-15	Can Textual Semantics Mitigate Sounding Object Segmentation Preference?	Yaoting Wang et.al.	2407.10947v1	link
2024-07-15	Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval	Shengjie Ma et.al.	2407.10805v1	null
2024-07-15	Multilingual Contrastive Decoding via Language-Agnostic Layers Skipping	Wenhao Zhu et.al.	2407.10795v1	link
2024-07-15	Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education	Rui Yang et.al.	2407.10794v1	link
2024-07-16	Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning	Yulong Wang et.al.	2407.10718v2	link
2024-07-18	Qwen2 Technical Report	An Yang et.al.	2407.10671v3	link
2024-07-17	LAB-Bench: Measuring Capabilities of Language Models for Biology Research	Jon M. Laurent et.al.	2407.10362v3	null
2024-07-20	Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models	Yuchen Yang et.al.	2407.10299v2	link
2024-07-14	GenSco: Can Question Decomposition based Passage Alignment improve Question Answering?	Barah Fazili et.al.	2407.10245v1	null
2024-07-20	BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs	Zhiting Fan et.al.	2407.10241v2	null
2024-07-22	Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model	Xunyu Zhu et.al.	2407.10167v2	null
2024-07-14	ChatLogic: Integrating Logic Programming with Large Language Models for Multi-Step Reasoning	Zhongsheng Wang et.al.	2407.10162v1	link
2024-07-19	Rapid Biomedical Research Classification: The Pandemic PACT Advanced Categorisation Engine	Omid Rohanian et.al.	2407.10086v2	null
2024-07-14	All Roads Lead to Rome: Unveiling the Trajectory of Recommender Systems Across the LLM Era	Bo Chen et.al.	2407.10081v1	null
2024-07-13	Benchmarking LLMs for Optimization Modeling and Enhancing Reasoning via Reverse Socratic Synthesis	Zhicheng Yang et.al.	2407.09887v1	link
2024-07-13	IoT-LM: Large Multisensory Language Models for the Internet of Things	Shentong Mo et.al.	2407.09801v1	link
2024-07-17	Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study	Yulong Yang et.al.	2407.09295v2	null
2024-07-17	Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models	Dong Shu et.al.	2407.09292v2	null
2024-07-12	Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning	Thuy Ngoc Nguyen et.al.	2407.09281v1	null
2024-07-12	Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors	Nico Daheim et.al.	2407.09136v1	link
2024-07-12	STD-LLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with LLMs	Yiheng Huang et.al.	2407.09096v1	null
2024-07-12	SpreadsheetLLM: Encoding Spreadsheets for Large Language Models	Yuzhang Tian et.al.	2407.09025v1	null
2024-07-12	Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures?	Yingming Pu et.al.	2407.08922v1	link
2024-07-11	Evaluating Nuanced Bias in Large Language Model Free Response Answers	Jennifer Healey et.al.	2407.08842v1	null
2024-07-11	MAVIS: Mathematical Visual Instruction Tuning	Renrui Zhang et.al.	2407.08739v1	link
2024-07-11	Real-Time Anomaly Detection and Reactive Planning with Large Language Models	Rohan Sinha et.al.	2407.08735v1	null
2024-07-11	Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist	Zihao Zhou et.al.	2407.08733v1	null
2024-07-11	GTA: A Benchmark for General Tool Agents	Jize Wang et.al.	2407.08713v1	link
2024-07-11	Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight	Zhiqiang Xie et.al.	2407.08694v1	null
2024-07-15	Emergent Visual-Semantic Hierarchies in Image-Text Representations	Morris Alper et.al.	2407.08521v2	null
2024-07-16	Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents	Haoyi Xiong et.al.	2407.08516v2	null
2024-07-11	Investigating LLMs as Voting Assistants via Contextual Augmentation: A Case Study on the European Parliament Elections 2024	Ilias Chalkidis et.al.	2407.08495v1	null
2024-07-11	Lynx: An Open Source Hallucination Evaluation Model	Selvan Sunitha Ravi et.al.	2407.08488v1	null
2024-07-17	Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On	Liang Zeng et.al.	2407.08348v2	null
2024-07-12	RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL	Zhenhe Wu et.al.	2407.08273v2	null
2024-07-16	Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding	Minghui Wu et.al.	2407.08150v2	null
2024-07-10	RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization	Xijie Huang et.al.	2407.08044v1	link
2024-07-10	A Critical Review of Causal Reasoning Benchmarks for Large Language Models	Linying Yang et.al.	2407.08029v1	null
2024-07-04	CaseGPT: a case reasoning framework based on language models and retrieval-augmented generation	Rui Yang et.al.	2407.07913v1	null
2024-07-12	A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends	Daizong Liu et.al.	2407.07403v2	link
2024-07-10	LokiLM: Technical Report	Justin Kiefel et.al.	2407.07370v1	null
2024-07-10	Interpretable Differential Diagnosis with Dual-Inference Large Language Models	Shuang Zhou et.al.	2407.07330v1	null
2024-07-10	Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model	Wenqi Zhang et.al.	2407.07053v2	link
2024-07-09	Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective	Shahana Ibrahim et.al.	2407.06902v1	null
2024-07-08	A Single Transformer for Scalable Vision-Language Modeling	Yangyi Chen et.al.	2407.06438v1	link
2024-07-08	Multimodal Chain-of-Thought Reasoning via ChatGPT to Protect Children from Age-Inappropriate Apps	Chuanbo Hu et.al.	2407.06309v1	null
2024-07-08	CodeUpdateArena: Benchmarking Knowledge Editing on API Updates	Zeyu Leo Liu et.al.	2407.06249v1	null
2024-07-08	SimPal: Towards a Meta-Conversational Framework to Understand Teacher's Instructional Goals for K-12 Physics	Effat Farhana et.al.	2407.06241v1	null
2024-07-08	Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision	Orr Zohar et.al.	2407.06189v1	link
2024-07-08	iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement	Aoyu Pang et.al.	2407.06025v1	link
2024-07-09	Distilling System 2 into System 1	Ping Yu et.al.	2407.06023v2	null
2024-07-08	Towards Optimizing and Evaluating a Retrieval Augmented QA Chatbot using LLMs with Human in the Loop	Anum Afzal et.al.	2407.05925v1	null
2024-07-08	When is the consistent prediction likely to be a correct prediction?	Alex Nguyen et.al.	2407.05778v1	null
2024-07-08	Large Language Models Understand Layouts	Weiming Li et.al.	2407.05750v1	null
2024-07-08	Empirical Study of Symmetrical Reasoning in Conversational Chatbots	Daniela N. Rim et.al.	2407.05734v1	null
2024-07-08	Retrieved In-Context Principles from Previous Mistakes	Hao Sun et.al.	2407.05682v1	null
2024-07-07	Training Task Experts through Retrieval Based Distillation	Jiaxin Ge et.al.	2407.05463v1	null
2024-07-07	LTLBench: Towards Benchmarks for Evaluating Temporal Logic Reasoning in Large Language Models	Weizhi Tang et.al.	2407.05434v1	link
2024-07-10	SBoRA: Low-Rank Adaptation with Regional Weight Updates	Lai-Man Po et.al.	2407.05413v2	link
2024-07-07	ElecBench: a Power Dispatch Evaluation Benchmark for Large Language Models	Xiyuan Zhou et.al.	2407.05365v1	link
2024-07-07	VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool	Yan Wang et.al.	2407.05355v1	null
2024-07-07	WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks	Léo Boisvert et.al.	2407.05291v1	link
2024-07-07	Beyond Binary Gender Labels: Revealing Gender Biases in LLMs through Gender-Neutral Name Predictions	Zhiwen You et.al.	2407.05271v1	link
2024-07-06	Lucy: Think and Reason to Solve Text-to-SQL	Nina Narodytska et.al.	2407.05153v1	null
2024-07-06	Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?	Kuei-Chun Kao et.al.	2407.05134v1	null
2024-07-06	Progress or Regress? Self-Improvement Reversal in Post-training	Ting Wu et.al.	2407.05013v1	null
2024-07-06	LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts	Yijia Xiao et.al.	2407.04973v1	link
2024-07-06	MemoCRS: Memory-enhanced Sequential Conversational Recommender Systems with Large Language Models	Yunjia Xi et.al.	2407.04960v1	link
2024-07-06	Safe Generative Chats in a WhatsApp Intelligent Tutoring System	Zachary Levonian et.al.	2407.04915v1	null
2024-07-06	Algorithmic Language Models with Neurally Compiled Libraries	Lucas Saldyt et.al.	2407.04899v1	null
2024-07-12	On scalable oversight with weak LLMs judging strong LLMs	Zachary Kenton et.al.	2407.04622v2	null
2024-07-05	Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model	Duy M. H. Nguyen et.al.	2407.04489v1	null
2024-07-05	cosmosage: A Natural-Language Assistant for Cosmologists	Tijmen de Haan et.al.	2407.04420v1	link
2024-07-05	AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents	Petr Anokhin et.al.	2407.04363v1	link
2024-07-05	Towards Context-aware Support for Color Vision Deficiency: An Approach Integrating LLM and AR	Shogo Morita et.al.	2407.04362v1	null
2024-07-05	WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning	Yiheng Li et.al.	2407.04281v1	null
2024-07-09	DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning	Chengpeng Li et.al.	2407.04078v2	link
2024-07-04	Semantic Graphs for Syntactic Simplification: A Revisit from the Age of LLM	Peiran Yao et.al.	2407.04067v1	link
2024-07-04	A Survey on Natural Language Counterfactual Generation	Yongjie Wang et.al.	2407.03993v1	null
2024-07-04	MobileExperts: A Dynamic Tool-Enabled Agent Team in Mobile Devices	Jiayi Zhang et.al.	2407.03913v1	null
2024-07-04	From Data to Commonsense Reasoning: The Use of Large Language Models for Explainable AI	Stefanie Krause et.al.	2407.03778v1	null
2024-07-04	STOC-TOT: Stochastic Tree-of-Thought with Constrained Decoding for Complex Reasoning in Multi-Hop Question Answering	Zhenyu Bi et.al.	2407.03687v1	null
2024-07-04	Improving Self Consistency in LLMs through Probabilistic Tokenization	Ashutosh Sathe et.al.	2407.03678v1	null
2024-07-14	Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction	Amanda Dsouza et.al.	2407.03651v2	link
2024-07-04	Visualizing Dialogues: Enhancing Image Selection through Dialogue Understanding with Large Language Models	Chang-Sheng Kao et.al.	2407.03615v1	link
2024-07-03	UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization	Md Nayem Uddin et.al.	2407.03525v1	null
2024-07-03	On Large Language Models in National Security Applications	William N. Caballero et.al.	2407.03453v1	null
2024-07-03	How Does Quantization Affect Multilingual LLMs?	Kelly Marchisio et.al.	2407.03211v1	null
2024-07-03	TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts	Ruida Wang et.al.	2407.03203v1	link
2024-07-03	Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models	Haritz Puerto et.al.	2407.03181v1	link
2024-07-03	Investigating Decoder-only Large Language Models for Speech-to-text Translation	Chao-Wei Huang et.al.	2407.03169v1	null
2024-07-03	Social Bias Evaluation for Large Language Models Requires Prompt Variations	Rem Hida et.al.	2407.03129v1	link
2024-07-03	ALTER: Augmentation for Large-Table-Based Reasoning	Han Zhang et.al.	2407.03061v1	link
2024-07-03	Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering	Zhaohe Liao et.al.	2407.03008v1	null
2024-07-03	SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research	Meghal Dani et.al.	2407.03004v1	null
2024-07-03	Large Language Models as Evaluators for Scientific Synthesis	Julia Evans et.al.	2407.02977v1	null
2024-07-03	FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering	Xiaochen Wang et.al.	2407.02964v1	null
2024-07-03	GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models	Zike Yuan et.al.	2407.02936v1	link
2024-07-03	LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation	Hongke Zhao et.al.	2407.02833v1	null
2024-07-02	Reasoning in Large Language Models: A Geometric Perspective	Romain Cosentino et.al.	2407.02678v1	null
2024-07-02	An AI-Based System Utilizing IoT-Enabled Ambient Sensors and LLMs for Complex Activity Tracking	Yuan Sun et.al.	2407.02606v1	null
2024-07-02	Open Scene Graphs for Open World Object-Goal Navigation	Joel Loo et.al.	2407.02473v1	null
2024-07-02	TokenPacker: Efficient Visual Projector for Multimodal LLM	Wentong Li et.al.	2407.02392v1	link
2024-07-02	Generative Large Language Models in Automated Fact-Checking: A Survey	Ivan Vykopal et.al.	2407.02351v1	null
2024-07-02	RVISA: Reasoning and Verification for Implicit Sentiment Analysis	Wenna Lai et.al.	2407.02340v1	null
2024-07-02	Evaluating the Ability of LLMs to Solve Semantics-Aware Process Mining Tasks	Adrian Rebmann et.al.	2407.02310v1	link
2024-07-02	Multilingual Trolley Problems for Language Models	Zhijing Jin et.al.	2407.02273v1	link
2024-07-04	Embodied AI in Mobile Robots: Coverage Path Planning with Large Language Models	Xiangrui Kong et.al.	2407.02220v2	null
2024-07-02	Automatic Adaptation Rule Optimization via Large Language Models	Yusei Ishimizu et.al.	2407.02203v1	null
2024-07-02	Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?	Nishant Balepur et.al.	2407.01992v1	null
2024-07-04	Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction	Chenlong Deng et.al.	2407.01964v3	link
2024-07-02	Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness	Khyathi Raghavi Chandu et.al.	2407.01942v1	null
2024-07-02	GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning	Zhisheng Tang et.al.	2407.01892v1	link
2024-07-01	DiscoveryBench: Towards Data-Driven Discovery with Large Language Models	Bodhisattwa Prasad Majumder et.al.	2407.01725v1	link
2024-07-01	Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning	Akshara Prabhakar et.al.	2407.01687v1	link
2024-07-01	KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches	Jiayi Yuan et.al.	2407.01527v1	null
2024-07-02	Empowering 3D Visual Grounding with Reasoning Capabilities	Chenming Zhu et.al.	2407.01525v2	null
2024-07-01	TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind	Guiyang Hou et.al.	2407.01455v1	null
2024-07-01	MIRAI: Evaluating LLM Agents for Event Forecasting	Chenchen Ye et.al.	2407.01231v1	null
2024-07-01	EconNLI: Evaluating Large Language Models on Economics Reasoning	Yue Guo et.al.	2407.01212v1	link
2024-07-01	IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation	Senyu Han et.al.	2407.01093v1	link
2024-07-03	FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in Large Language Models	Yiyuan Li et.al.	2407.01046v2	link
2024-07-01	DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models	Jiabao Pan et.al.	2407.01009v1	null
2024-07-01	Data on the Move: Traffic-Oriented Data Trading Platform Powered by AI Agent with Common Sense	Yi Yu et.al.	2407.00995v1	null
2024-07-01	Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents	Shihan Deng et.al.	2407.00993v1	null
2024-07-01	Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving	Ran Tian et.al.	2407.00959v1	null
2024-07-01	MalAlgoQA: A Pedagogical Approach for Evaluating Counterfactual Reasoning Abilities	Naiming Liu et.al.	2407.00938v1	null
2024-07-01	MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula	Shubhra Mishra et.al.	2407.00900v1	link
2024-07-01	Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks	Yue Zhou et.al.	2407.00869v1	null
2024-07-02	Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning	Zimu Lu et.al.	2407.00782v2	link
2024-06-30	Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs	Yifei Zhang et.al.	2407.00653v1	null
2024-06-29	LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement	Jiahao Ying et.al.	2407.00497v1	null
2024-06-29	MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation	Jinsheng Huang et.al.	2407.00468v1	link
2024-06-29	Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs	Tamzeed Mahfuz et.al.	2407.00416v1	null
2024-06-29	Advancing Process Verification for Large Language Models via Tree-Based Preference Learning	Mingqian He et.al.	2407.00390v1	null
2024-06-28	Evaluating Human Alignment and Model Faithfulness of LLM Rationale	Mohsen Fayyaz et.al.	2407.00219v1	null
2024-06-27	From Efficient Multimodal Models to World Models: A Survey	Xinji Mai et.al.	2407.00118v1	null
2024-06-26	Visual Reasoning and Multi-Agent Approach in Multimodal Large Language Models (MLLMs): Solving TSP and mTSP Combinatorial Challenges	Mohammed Elhenawy et.al.	2407.00092v1	null
2024-06-28	LLaRA: Supercharging Robot Learning Data for Vision-Language Policy	Xiang Li et.al.	2406.20095v1	link
2024-06-28	Scaling Synthetic Data Creation with 1,000,000,000 Personas	Xin Chan et.al.	2406.20094v1	link
2024-06-28	Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language	Yicheng Chen et.al.	2406.20085v1	null
2024-07-02	BMW Agents -- A Framework For Task Automation Through Multi-Agent Collaboration	Noel Crawford et.al.	2406.20041v3	null
2024-06-28	ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models	Yuxiang Zhang et.al.	2406.20015v1	link
2024-06-28	Into the Unknown: Generating Geospatial Descriptions for New Environments	Tzuf Paz-Argaman et.al.	2406.19967v1	null
2024-06-28	BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering	Zheng Chu et.al.	2406.19820v1	null
2024-06-28	Belief Revision: The Adaptability of Large Language Models Reasoning	Bryan Wilie et.al.	2406.19764v1	null
2024-07-02	ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning	Christopher E. Mower et.al.	2406.19741v2	link
2024-06-28	MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?	Jinming Li et.al.	2406.19693v1	null
2024-06-27	Rethinking harmless refusals when fine-tuning foundation models	Florin Pop et.al.	2406.19552v1	null
2024-06-27	Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations	Ritam Dutt et.al.	2406.19545v1	link
2024-06-27	Context Matters: An Empirical Study of the Impact of Contextual Information in Temporal Question Answering Systems	Dan Schumacher et.al.	2406.19538v1	null
2024-07-04	Using Large Language Models to Assist Video Content Analysis: An Exploratory Study of Short Videos on Depression	Jiaying Liu et.al.	2406.19528v2	null
2024-06-27	Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning	Miyoung Ko et.al.	2406.19502v1	link
2024-07-02	ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos	Jr-Jen Chen et.al.	2406.19392v2	link
2024-06-27	From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data	Zheyang Xiong et.al.	2406.19292v1	null
2024-06-27	Aligning Teacher with Student Preferences for Tailored Training Data Generation	Yantao Liu et.al.	2406.19227v1	null
2024-06-27	Towards Learning Abductive Reasoning using VSA Distributed Representations	Giacomo Camposampiero et.al.	2406.19121v1	link
2024-06-27	STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis	Wenbin Li et.al.	2406.19065v1	link
2024-06-28	UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models	Siyuan Wu et.al.	2406.18966v2	link
2024-06-27	Disentangling Knowledge-based and Visual Reasoning by Question Decomposition in KB-VQA	Elham J. Barezi et.al.	2406.18839v1	null
2024-06-26	Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism	Shi Zong et.al.	2406.18762v1	null
2024-06-26	Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models	Georgios Tziafas et.al.	2406.18746v1	null
2024-07-01	Towards Open-World Grasping with Large Vision-Language Models	Georgios Tziafas et.al.	2406.18722v2	null
2024-06-26	Learning to Correct for QA Reasoning with Black-box LLMs	Jaehyung Kim et.al.	2406.18695v1	link
2024-06-26	Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation	Guanting Dong et.al.	2406.18676v1	link
2024-06-26	Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs	Xin Lai et.al.	2406.18629v1	link
2024-06-26	An LLM-based Knowledge Synthesis and Scientific Reasoning Framework for Biomedical Discovery	Oskar Wysocki et.al.	2406.18626v1	null
2024-06-26	CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs	Zirui Wang et.al.	2406.18521v1	link
2024-06-26	Mental Modeling of Reinforcement Learning Agents by Language Models	Wenhao Lu et.al.	2406.18505v1	null
2024-06-26	MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data	Meng Fang et.al.	2406.18321v1	null
2024-06-26	AI-native Memory: A Pathway from LLMs Towards AGI	Jingbo Shang et.al.	2406.18312v1	null
2024-06-26	SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding	Zhenglin Wang et.al.	2406.18200v1	null
2024-06-26	Knowledge Graph Enhanced Retrieval-Augmented Generation for Failure Mode and Effects Analysis	Lukas Bahr et.al.	2406.18114v1	link
2024-06-26	Multi-step Knowledge Retrieval and Inference over Unstructured Data	Aditya Kalyanpur et.al.	2406.17987v1	null
2024-06-25	NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization	Md Mahadi Hasan Nahid et.al.	2406.17961v1	null
2024-06-25	Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback	Zhongtao Miao et.al.	2406.17873v1	link
2024-06-22	MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?	Xirui Li et.al.	2406.17806v1	null
2024-06-25	LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic	Aditya Kalyanpur et.al.	2406.17663v1	null
2024-06-25	Banishing LLM Hallucinations Requires Rethinking Generalization	Johnny Li et.al.	2406.17642v1	null
2024-06-25	"Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?	Beiduo Chen et.al.	2406.17600v1	null
2024-06-26	LongIns: A Challenging Long-context Instruction-based Exam for LLMs	Shawn Gavin et.al.	2406.17588v2	null
2024-06-25	Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats	Ryan Pavlich et.al.	2406.17574v1	null
2024-06-25	The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale	Guilherme Penedo et.al.	2406.17557v1	null
2024-06-25	Tell Me Where You Are: Multimodal LLMs Meet Place Recognition	Zonglin Lyu et.al.	2406.17520v1	null
2024-06-25	Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA	Minzheng Wang et.al.	2406.17419v1	link
2024-06-25	Leveraging LLMs for Dialogue Quality Measurement	Jinghan Jia et.al.	2406.17304v1	null
2024-06-26	Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models	Wenhao Shi et.al.	2406.17294v2	link
2024-06-25	DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph	Zhehao Zhang et.al.	2406.17271v1	link
2024-06-24	CogExplore: Contextual Exploration with Language-Encoded Environment Representations	Harel Biggie et.al.	2406.17180v1	null
2024-06-24	Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models	Nisarg Patel et.al.	2406.17169v1	link
2024-06-24	USDC: A Dataset of User Stance and Dogmatism in Long Conversations	Mounika Marreddy et.al.	2406.16833v1	null
2024-06-25	Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs	Ashwinee Panda et.al.	2406.16797v2	link
2024-06-24	Scaling Laws for Linear Complexity Language Models	Xuyang Shen et.al.	2406.16690v1	link
2024-06-24	Large Language Models Are Cross-Lingual Knowledge-Free Reasoners	Peng Hu et.al.	2406.16655v1	link
2024-06-25	OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer	Lu Zhang et.al.	2406.16620v2	null
2024-06-24	Evaluating the Ability of Large Language Models to Reason about Cardinal Directions	Anthony G Cohn et.al.	2406.16528v1	null
2024-06-24	eagerlearners at SemEval2024 Task 5: The Legal Argument Reasoning Task in Civil Procedure	Hoorieh Sabzevari et.al.	2406.16490v1	link
2024-06-24	Evaluating and Analyzing Relationship Hallucinations in LVLMs	Mingrui Wu et.al.	2406.16449v1	link
2024-06-29	EmoLLM: Multimodal Emotional Understanding Meets Large Language Models	Qu Yang et.al.	2406.16442v2	link
2024-06-24	UniCoder: Scaling Code Large Language Model via Universal Code	Tao Sun et.al.	2406.16441v1	null
2024-06-24	Anomaly Detection of Tabular Data Using LLMs	Aodong Li et.al.	2406.16308v1	null
2024-06-23	GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets	Qiming Wu et.al.	2406.16176v1	null
2024-06-23	Chain-of-Probe: Examing the Necessity and Accuracy of CoT Step-by-Step	Zezhong Wang et.al.	2406.16144v1	null
2024-06-23	PORT: Preference Optimization on Reasoning Traces	Salem Lahlou et.al.	2406.16061v1	null
2024-06-23	Can LLM Graph Reasoning Generalize beyond Pattern Memorization?	Yizhuo Zhang et.al.	2406.15992v1	null
2024-06-26	BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions	Terry Yue Zhuo et.al.	2406.15877v2	link
2024-06-30	LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning	Guangsi Shi et.al.	2406.15859v2	null
2024-06-22	MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception	Guanqun Wang et.al.	2406.15768v1	null
2024-06-22	video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models	Guangzhi Sun et.al.	2406.15704v1	link
2024-06-21	Robust Reinforcement Learning from Corrupted Human Feedback	Alexander Bukharin et.al.	2406.15568v1	null
2024-06-18	On the Principles behind Opinion Dynamics in Multi-Agent Systems of Large Language Models	Pedro Cisneros-Velarde et.al.	2406.15492v1	null
2024-06-21	Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network	Badr AlKhamissi et.al.	2406.15109v1	link
2024-06-21	MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens	Yongqi Fan et.al.	2406.15019v1	link
2024-06-21	Do Large Language Models Exhibit Cognitive Dissonance? Studying the Difference Between Revealed Beliefs and Stated Answers	Manuel Mondal et.al.	2406.14986v1	null
2024-06-21	ICLEval: Evaluating In-Context Learning Ability of Large Language Models	Wentong Chen et.al.	2406.14955v1	link
2024-06-21	Autonomous Agents for Collaborative Task under Information Asymmetry	Wei Liu et.al.	2406.14928v1	link
2024-06-21	Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video	Zhengbang Yang et.al.	2406.14877v1	null
2024-06-21	DistiLRR: Transferring Code Repair for Low-Resource Programming Languages	Kyle Wong et.al.	2406.14867v1	link
2024-06-21	Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models	Jiayu Wang et.al.	2406.14852v1	null
2024-06-20	ACR: A Benchmark for Automatic Cohort Retrieval	Dung Ngoc Thai et.al.	2406.14780v1	null
2024-06-20	A Learn-Then-Reason Model Towards Generalization in Knowledge Base Question Answering	Lingxi Zhang et.al.	2406.14763v1	null
2024-06-20	Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?	Zhiqiang Pi et.al.	2406.14737v1	null
2024-06-20	Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell	Taiming Lu et.al.	2406.14673v1	link
2024-06-20	HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation	Jin Wang et.al.	2406.14655v1	null
2024-06-20	Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities	Sachit Menon et.al.	2406.14562v1	null
2024-06-21	Asynchronous Large Language Model Enhanced Planner for Autonomous Driving	Yuan Chen et.al.	2406.14556v2	null
2024-06-20	Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data	Johannes Treutlein et.al.	2406.14546v1	link
2024-06-20	Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs	Yuxuan Qiao et.al.	2406.14544v1	link
2024-06-25	SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages	Gayane Ghazaryan et.al.	2406.14425v2	null
2024-06-20	The neural correlates of logical-mathematical symbol systems processing resemble that of spatial cognition more than natural language processing	Yuannan Li et.al.	2406.14358v1	null
2024-06-20	medIKAL: Integrating Knowledge Graphs as Assistants of LLMs for Enhanced Clinical Diagnosis on EMRs	Mingyi Jia et.al.	2406.14326v1	null
2024-06-27	Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning	Chaojie Wang et.al.	2406.14283v3	null
2024-06-20	SeCoKD: Aligning Large Language Models for In-Context Learning with Fewer Shots	Weixing Wang et.al.	2406.14208v1	null
2024-06-20	Timo: Towards Better Temporal Reasoning for Language Models	Zhaochen Su et.al.	2406.14192v1	link
2024-06-20	Definition generation for lexical semantic change detection	Mariia Fedorova et.al.	2406.14167v1	link
2024-07-01	Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration	Haokun Liu et.al.	2406.14097v2	null
2024-06-20	MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models	Zhongshen Zeng et.al.	2406.13975v1	null
2024-06-20	Causal Inference with Latent Variables: Recent Advances and Future Prospectives	Yaochen Zhu et.al.	2406.13966v1	null
2024-06-20	CityGPT: Empowering Urban Spatial Cognition of Large Language Models	Jie Feng et.al.	2406.13948v1	null
2024-06-20	AspirinSum: an Aspect-based utility-preserved de-identification Summarization framework	Ya-Lun Li et.al.	2406.13947v1	null
2024-06-19	Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events	Mohammad Abu Tami et.al.	2406.13894v1	null
2024-06-19	Adaptable Logical Control for Large Language Models	Honghua Zhang et.al.	2406.13892v1	link
2024-06-19	Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning	Yuval Shalev et.al.	2406.13858v1	null
2024-06-27	Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning?	Nirjhor Rouf et.al.	2406.13808v3	null
2024-06-19	WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia	Yufang Hou et.al.	2406.13805v1	null
2024-06-19	Semantic Structure-Mapping in LLM and Human Analogical Reasoning	Sam Musker et.al.	2406.13803v1	link
2024-06-19	Can LLMs Reason in the Wild with Programs?	Yuan Yang et.al.	2406.13764v1	link
2024-06-19	Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models	Zhawnen Chen et.al.	2406.13763v1	null
2024-06-19	Improving Visual Commonsense in Language Models via Multiple Image Generation	Guy Yariv et.al.	2406.13621v1	link
2024-06-27	VDebugger: Harnessing Execution Feedback for Debugging Visual Programs	Xueqing Wu et.al.	2406.13444v2	link
2024-06-19	Finding Blind Spots in Evaluator LLMs with Interpretable Checklists	Sumanth Doddapaneni et.al.	2406.13439v1	link
2024-06-19	MoreHopQA: More Than Multi-hop Reasoning	Julian Schnitzler et.al.	2406.13397v1	link
2024-06-19	ALiiCE: Evaluating Positional Fine-grained Citation Generation	Yilong Xu et.al.	2406.13375v1	null
2024-06-19	Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets	Lucas Druart et.al.	2406.13269v1	null
2024-06-19	Bridging Law and Data: Augmenting Reasoning via a Semi-Structured Dataset with IRAC methodology	Xiaoxi Kang et.al.	2406.13217v1	null
2024-06-19	Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata	Mykhailo Poliakov et.al.	2406.13213v1	link
2024-06-19	DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents	Jiho Kim et.al.	2406.13144v1	link
2024-06-19	Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation	Yuhang Zhou et.al.	2406.13114v1	null
2024-06-18	Assessing AI vs Human-Authored Spear Phishing SMS Attacks: An Empirical Study Using the TRAPD Method	Jerson Francia et.al.	2406.13049v1	null
2024-06-18	MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction	Yuyan Liu et.al.	2406.12950v1	link
2024-06-18	DrVideo: Document Retrieval Based Long Video Understanding	Ziyu Ma et.al.	2406.12846v1	null
2024-06-18	LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation	Seyedarmin Azizi et.al.	2406.12832v1	link
2024-06-18	UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions	Xunzhi Wang et.al.	2406.12784v1	link
2024-06-18	Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries	Eden Biran et.al.	2406.12775v1	link
2024-06-18	OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI	Zhen Huang et.al.	2406.12753v1	link
2024-06-18	Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning	Bingchen Zhao et.al.	2406.12742v1	link
2024-06-18	MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL	Arian Askari et.al.	2406.12692v1	null
2024-06-18	DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence?	Zhouhong Gu et.al.	2406.12641v1	link
2024-06-18	Ask-before-Plan: Proactive Language Agents for Real-World Planning	Xuan Zhang et.al.	2406.12639v1	link
2024-06-18	Large Language Models based Multi-Agent Framework for Objective Oriented Control Design in Power Electronics	Chenggang Cui et.al.	2406.12628v1	null
2024-06-18	Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges	Aman Singh Thakur et.al.	2406.12624v1	null
2024-06-18	Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling	Yao-Ching Yu et.al.	2406.12585v1	link
2024-06-19	Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models	Eldar Kurtic et.al.	2406.12572v2	link
2024-06-18	Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models	Philipp Mondorf et.al.	2406.12546v1	null
2024-06-18	LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation	Yuhao Wang et.al.	2406.12529v1	null
2024-06-18	LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization	Masafumi Enomoto et.al.	2406.12494v1	null
2024-06-18	RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding	Linrui Xu et.al.	2406.12479v1	link
2024-06-18	IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models	Qiyao Wang et.al.	2406.12386v1	link
2024-06-18	Problem-Solving in Language Model Networks	Ciaran Regan et.al.	2406.12374v1	link
2024-06-18	Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding	Weizhi Fei et.al.	2406.12331v1	null
2024-06-18	PRePair: Pointwise Reasoning Enhance Pairwise Evaluating for Robust Instruction-Following Assessments	Hawon Jeong et.al.	2406.12319v1	null
2024-06-18	An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs	Daking Rai et.al.	2406.12288v1	null
2024-06-18	Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization	Kwangwook Seo et.al.	2406.12269v1	null
2024-06-18	A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning	Lijie Hu et.al.	2406.12255v1	null
2024-06-24	Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector	Gangwei Jiang et.al.	2406.12227v2	null
2024-06-18	Leveraging Large Language Model for Heterogeneous Ad Hoc Teamwork Collaboration	Xinzhu Liu et.al.	2406.12224v1	null
2024-06-18	Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems	Nasim Borazjanizadeh et.al.	2406.12172v1	null
2024-06-19	Is poisoning a real threat to LLM alignment? Maybe more so than you think	Pankayaraj Pathmanathan et.al.	2406.12091v2	link
2024-06-17	InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States	Mohammad Beigi et.al.	2406.12053v1	null
2024-06-17	MedCalc-Bench: Evaluating Large Language Models for Medical Calculations	Nikhil Khandekar et.al.	2406.12036v1	link
2024-06-17	Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts	Junmo Kang et.al.	2406.12034v1	null
2024-06-17	GAugLLM: Improving Graph Contrastive Learning for Text-Attributed Graphs with Large Language Models	Yi Fang et.al.	2406.11945v1	link
2024-06-16	A Notion of Complexity for Theory of Mind via Discrete World Models	X. Angelo Huang et.al.	2406.11911v1	link
2024-06-15	A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges	Yuqi Nie et.al.	2406.11903v1	null
2024-06-17	Improving Multi-Agent Debate with Sparse Communication Topology	Yunxuan Li et.al.	2406.11776v1	null
2024-06-17	Meta Reasoning for Large Language Models	Peizhong Gao et.al.	2406.11698v1	null
2024-06-17	TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy	Yiqun Chen et.al.	2406.11678v1	link
2024-06-17	A Two-dimensional Zero-shot Dialogue State Tracking Evaluation Method using GPT-4	Ming Gu et.al.	2406.11651v1	link
2024-06-17	Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models	Sheng Feng et.al.	2406.11568v1	link
2024-06-17	MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation	Jiakuan Xie et.al.	2406.11566v1	null
2024-06-17	AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation	Chuyan Xiong et.al.	2406.11548v1	null
2024-06-17	Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs	Yi Fang et.al.	2406.11514v1	null
2024-06-17	Can AI with High Reasoning Ability Replicate Human-like Decision Making in Economic Experiments?	Ayato Kitadai et.al.	2406.11426v1	null
2024-06-17	P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models	Shuo Yang et.al.	2406.11391v1	null
2024-06-17	A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences	Leonardo Bertolazzi et.al.	2406.11341v1	null
2024-06-17	ClawMachine: Fetching Visual Tokens as An Entity for Referring and Grounding	Tianren Ma et.al.	2406.11327v1	null
2024-06-17	Enhancing Biomedical Knowledge Retrieval-Augmented Generation with Self-Rewarding Tree Search and Proximal Policy Optimization	Minda Hu et.al.	2406.11258v1	null
2024-06-18	AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval	Shirley Wu et.al.	2406.11200v2	link
2024-06-17	Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning	Zebang Cheng et.al.	2406.11161v1	link
2024-06-21	Contextual Knowledge Graph	Chengjin Xu et.al.	2406.11160v2	null
2024-06-19	Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG	Xueying Du et.al.	2406.11147v2	null
2024-06-17	RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents	Weizhe Chen et.al.	2406.11132v1	null
2024-06-17	Exploring Safety-Utility Trade-Offs in Personalized Language Models	Anvesh Rao Vijjini et.al.	2406.11107v1	null
2024-06-16	A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners	Bowen Jiang et.al.	2406.11050v1	null
2024-06-16	RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models	Yuqing Wang et.al.	2406.11020v1	null
2024-06-18	Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game	Prisha Samadarshi et.al.	2406.11012v2	link
2024-06-16	Not All Bias is Bad: Balancing Rational Deviations and Cognitive Biases in Large Language Model Reasoning	Liman Wang et.al.	2406.10999v1	null
2024-06-18	City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization	Zihao Jiao et.al.	2406.10958v2	null
2024-06-16	E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models	Zhenyu Zhang et.al.	2406.10950v1	null
2024-06-16	Effective Generative AI: The Human-Algorithm Centaur	Soroush Saghafian et.al.	2406.10942v1	null
2024-06-16	Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies	Hung-Ting Su et.al.	2406.10923v1	null
2024-06-16	RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models	Zhuoran Jin et.al.	2406.10890v1	link
2024-06-16	Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions	Yiming Tang et.al.	2406.10878v1	null
2024-06-16	Step-level Value Preference Optimization for Mathematical Reasoning	Guoxin Chen et.al.	2406.10858v1	null
2024-06-16	Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning	Joykirat Singh et.al.	2406.10834v1	null
2024-06-16	Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses	Zhiwen Fan et.al.	2406.10789v1	null
2024-06-15	FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models	Zhikai Zhang et.al.	2406.10740v1	null
2024-06-15	Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions	Yexin Liu et.al.	2406.10638v1	link
2024-06-15	On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models	Sree Harsha Tanneru et.al.	2406.10625v1	null
2024-06-15	Reactor Mk.1 performances: MMLU, HumanEval and BBH test results	TJ Dunham et.al.	2406.10515v1	null
2024-06-14	What is the Visual Cognition Gap between Humans and Multimodal LLMs?	Xu Cao et.al.	2406.10424v1	link
2024-06-14	Self-Reflection Outcome is Sensitive to Prompt Construction	Fengyuan Liu et.al.	2406.10400v1	link
2024-06-18	Efficient Prompting for LLM-based Generative Internet of Things	Bin Xiao et.al.	2406.10382v2	null
2024-06-14	Unlock the Correlation between Supervised Fine-Tuning and Reinforcement Learning in Training Code Large Language Models	Jie Chen et.al.	2406.10305v1	null
2024-06-12	Mimicking User Data: On Mitigating Fine-Tuning Risks in Closed Large Language Models	Francisco Eiras et.al.	2406.10288v1	null
2024-06-11	FoodSky: A Food-oriented Large Language Model that Passes the Chef and Dietetic Examination	Pengfei Zhou et.al.	2406.10261v1	null
2024-06-10	The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs	Mert Yazan et.al.	2406.10251v1	null
2024-06-14	BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack	Yuri Kuratov et.al.	2406.10149v1	null
2024-06-14	Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning	Jiaqi Li et.al.	2406.10099v1	null
2024-06-18	First Multi-Dimensional Evaluation of Flowchart Comprehension for Multimodal Large Language Models	Enming Zhang et.al.	2406.10057v2	link
2024-06-14	Precision Empowers, Excess Distracts: Visual Question Answering With Dynamically Infused Knowledge In Language Models	Manas Jhalani et.al.	2406.09994v1	null
2024-06-14	A Better LLM Evaluator for Text Generation: The Impact of Prompt Output Sequencing and Optimization	KuanChao Chu et.al.	2406.09972v1	null
2024-06-14	Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam	Nabor C. Mendonça et.al.	2406.09671v1	link
2024-06-13	ImageNet3D: Towards General-Purpose Object-Level 3D Understanding	Wufei Ma et.al.	2406.09613v1	link
2024-06-12	Pandora: Towards General World Model with Natural Language Actions and Video States	Jiannan Xiang et.al.	2406.09455v1	null
2024-06-13	VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	Muhammad Maaz et.al.	2406.09418v1	link
2024-06-13	Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms	Miaosen Zhang et.al.	2406.09397v1	null
2024-06-13	GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning	Zhen Xiang et.al.	2406.09187v1	null
2024-06-13	ReMI: A Dataset for Reasoning with Multiple Images	Mehran Kazemi et.al.	2406.09175v1	null
2024-06-13	Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning	Bahare Fatemi et.al.	2406.09170v1	null
2024-06-13	Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs	Xuan Zhang et.al.	2406.09136v1	link
2024-06-13	MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era	Jiahao Nie et.al.	2406.09121v1	link
2024-06-13	Chain-of-Though (CoT) prompting strategies for medical error detection and correction	Zhaolong Wu et.al.	2406.09103v1	null
2024-06-13	SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models	Kehua Feng et.al.	2406.09098v1	link
2024-06-13	Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?	Zhaochen Su et.al.	2406.09072v1	link
2024-06-13	MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning	Hanqing Wang et.al.	2406.09044v1	null
2024-06-14	Language Models are Crossword Solvers	Soumadeep Saha et.al.	2406.09043v2	null
2024-06-13	ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models	Jing Liu et.al.	2406.09041v1	null
2024-06-13	Cognitively Inspired Energy-Based World Models	Alexi Gladstone et.al.	2406.08862v1	null
2024-06-13	LLM-Driven Robots Risk Enacting Discrimination, Violence, and Unlawful Actions	Rumaisa Azeem et.al.	2406.08824v1	null
2024-06-13	Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models	Minghao Wu et.al.	2406.08811v1	null
2024-06-13	A Survey on Compositional Learning of AI Models: Theoretical and Experimetnal Practices	Sania Sinha et.al.	2406.08787v1	null
2024-06-12	Mistral-C2F: Coarse to Fine Actor for Analytical and Reasoning Enhancement in RLHF and Effective-Merged LLMs	Chen Zheng et.al.	2406.08657v1	null
2024-06-12	LLM-Craft: Robotic Crafting of Elasto-Plastic Objects with Large Language Models	Alison Bartsch et.al.	2406.08648v1	null
2024-06-12	CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery	Xiaoshuai Song et.al.	2406.08587v1	link
2024-06-12	Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning	Jaehyun Nam et.al.	2406.08527v1	null
2024-06-12	Research Trends for the Interplay between Large Language Models and Knowledge Graphs	Hanieh Khorashadizadeh et.al.	2406.08223v1	null
2024-06-12	ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs	Irene Huang et.al.	2406.08164v1	link
2024-06-16	Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests	Amogh Mannekote et.al.	2406.07794v2	null
2024-06-11	Out-Of-Context Prompting Boosts Fairness and Robustness in Large Language Model Predictions	Leonardo Cotta et.al.	2406.07685v1	null
2024-06-11	QuickLLaMA: Query-aware Inference Acceleration for Large Language Models	Jingyao Li et.al.	2406.07528v1	link
2024-06-11	TextGrad: Automatic "Differentiation" via Text	Mert Yuksekgonul et.al.	2406.07496v1	link
2024-06-17	VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs	Zesen Cheng et.al.	2406.07476v2	link
2024-06-11	On the Robustness of Document-Level Relation Extraction Models to Entity Name Variations	Shiao Meng et.al.	2406.07444v1	link
2024-06-13	Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B	Di Zhang et.al.	2406.07394v2	link
2024-06-11	Limited Out-of-Context Knowledge Reasoning in Large Language Models	Peng Hu et.al.	2406.07393v1	null
2024-06-11	Large Language Models for Constrained-Based Causal Discovery	Kai-Hendrik Cohrs et.al.	2406.07378v1	link
2024-06-11	Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities	Delfina Sol Martinez Pandiani et.al.	2406.07353v1	null
2024-06-11	Instruct Large Language Models to Drive like Humans	Ruijun Zhang et.al.	2406.07296v1	link
2024-06-11	Needle In A Multimodal Haystack	Weiyun Wang et.al.	2406.07230v1	link
2024-06-11	Scaling Large-Language-Model-based Multi-Agent Collaboration	Chen Qian et.al.	2406.07155v1	link
2024-06-11	Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees	Sijia Chen et.al.	2406.07115v1	null
2024-06-17	Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph	Sergey Linok et.al.	2406.07113v2	null
2024-06-11	DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs	Haishuo Fang et.al.	2406.07080v1	link
2024-06-11	CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only	Junhee Cho et.al.	2406.06947v1	link
2024-06-15	What's in an embedding? Would a rose by any embedding smell as sweet?	Venkat Venkatasubramanian et.al.	2406.06870v3	null
2024-06-11	Eyeballing Combinatorial Problems: A Case Study of Using Multimodal Large Language Models to Solve Traveling Salesman Problems	Mohammed Elhenawy et.al.	2406.06865v1	null
2024-06-11	Ollabench: Evaluating LLMs' Reasoning for Human-centric Interdependent Cybersecurity	Tam n. Nguyen et.al.	2406.06863v1	link
2024-06-07	GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents	Anthony Costarelli et.al.	2406.06613v1	link
2024-06-06	Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models	Walid S. Saba et.al.	2406.06610v1	null
2024-06-05	Improve Mathematical Reasoning in Language Models by Automated Process Supervision	Liangchen Luo et.al.	2406.06592v1	null
2024-06-05	Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models	Flavio Petruzzellis et.al.	2406.06588v1	null
2024-06-05	Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining	Shuqi Liu et.al.	2406.06586v1	null
2024-06-04	Break the Chain: Large Language Models Can be Shortcut Reasoners	Mengru Ding et.al.	2406.06580v1	null
2024-06-04	From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models	Xiaofeng Zhang et.al.	2406.06579v1	null
2024-06-10	Towards a Personal Health Large Language Model	Justin Cosentino et.al.	2406.06474v1	null
2024-06-11	Transforming Wearable Data into Health Insights using Large Language Model Agents	Mike A. Merrill et.al.	2406.06464v2	null
2024-06-15	Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies	Junlin Wang et.al.	2406.06461v3	null
2024-06-15	Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching	Xiaoying Zhang et.al.	2406.06326v3	null
2024-06-11	LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages	Andrew M. Bean et.al.	2406.06196v2	link
2024-06-10	Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation	Aadharsh Aadhithya A et.al.	2406.06124v1	null
2024-06-10	Prompting Large Language Models with Audio for General-Purpose Speech Summarization	Wonjune Kang et.al.	2406.05968v1	link
2024-06-10	CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark	David Romero et.al.	2406.05967v1	null
2024-06-10	Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models	Xi Li et.al.	2406.05948v1	null
2024-06-09	Hello Again! LLM-powered Personalized Agent for Long-term Dialogue	Hao Li et.al.	2406.05925v1	link
2024-06-09	Why Don't Prompt-Based Fairness Metrics Correlate?	Abdelrahman Zayed et.al.	2406.05918v1	null
2024-06-09	LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning	Utsav Singh et.al.	2406.05881v1	null
2024-06-09	A Survey on LLM-Based Agentic Workflows and LLM-Profiled Components	Xinzhe Li et.al.	2406.05804v1	null
2024-06-09	Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking	Fangxu Yu et.al.	2406.05673v1	link
2024-06-09	Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses	Maryam Amirizaniani et.al.	2406.05659v1	null
2024-06-08	Verbalized Probabilistic Graphical Modeling with Large Language Models	Hengguan Huang et.al.	2406.05516v1	null
2024-06-08	Towards a Benchmark for Causal Business Process Reasoning with LLMs	Fabiana Fournier et.al.	2406.05506v1	null
2024-06-08	Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation	Neeraj Varshney et.al.	2406.05494v1	null
2024-06-08	Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios	Yuhang Zhou et.al.	2406.05322v1	null
2024-06-07	LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs	Arash Gholami Davoodi et.al.	2406.05194v1	link
2024-06-07	Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions	Shi-Yu Tian et.al.	2406.05055v1	null
2024-06-07	Quantifying Geospatial in the Common Crawl Corpus	Ilya Ilyankou et.al.	2406.04952v1	null
2024-06-07	Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models	Michał Romaszewski et.al.	2406.04926v1	null
2024-06-07	ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering	Raphael Gruber et.al.	2406.04866v1	link
2024-06-07	Experiences from Integrating Large Language Model Chatbots into the Classroom	Arto Hellas et.al.	2406.04817v1	null
2024-06-07	Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in Large Language Models	Weizhi Tang et.al.	2406.04800v1	null
2024-06-07	Think out Loud: Emotion Deducing Explanation in Dialogues	Jiangnan Li et.al.	2406.04758v1	null
2024-06-07	LogiCode: an LLM-Driven Framework for Logical Anomaly Detection	Yiheng Zhang et.al.	2406.04687v1	link
2024-06-07	LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model	Dongkai Wang et.al.	2406.04659v1	link
2024-06-07	LinkGPT: Teaching Large Language Models To Predict Missing Links	Zhongmou He et.al.	2406.04640v1	null
2024-06-07	What do MLLMs hear? Examining reasoning with text and sound components in Multimodal Large Language Models	Enis Berk Çoban et.al.	2406.04615v1	null
2024-06-07	StackSight: Unveiling WebAssembly through Large Language Models and Neurosymbolic Chain-of-Thought Decompilation	Weike Fang et.al.	2406.04568v1	null
2024-06-07	SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models	Md Imbesat Hassan Rizvi et.al.	2406.04566v1	link
2024-06-06	FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models	Max Zhu et.al.	2406.04501v1	null
2024-06-06	Time Sensitive Knowledge Editing through Efficient Finetuning	Xiou Ge et.al.	2406.04496v1	null
2024-06-06	On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing	Alexander Kovrigin et.al.	2406.04464v1	link
2024-06-06	MAIRA-2: Grounded Radiology Report Generation	Shruthi Bannur et.al.	2406.04449v1	null
2024-06-06	MoralBench: Moral Evaluation of LLMs	Jianchao Ji et.al.	2406.04428v1	link
2024-06-06	RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation	Jiaming Liu et.al.	2406.04339v1	null
2024-06-06	Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models	Phat Nguyen et.al.	2406.04300v1	null
2024-06-06	Generative AI-in-the-loop: Integrating LLMs and GPTs into the Next Generation Networks	Han Zhang et.al.	2406.04276v1	null
2024-06-06	Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models	Ling Yang et.al.	2406.04271v1	link
2024-06-06	DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning	Shangqing Tu et.al.	2406.04197v1	link
2024-06-06	ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints	Divij Handa et.al.	2406.04046v1	null
2024-06-06	Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt	Zonghao Ying et.al.	2406.04031v1	link
2024-06-14	POEM: Interactive Prompt Optimization for Enhancing Multimodal Reasoning of Large Language Models	Jianben He et.al.	2406.03843v2	null
2024-06-06	Tool-Planner: Dynamic Solution Tree Planning for Large Language Model with Tool Clustering	Yanming Liu et.al.	2406.03807v1	link
2024-06-06	Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective	Xinhao Yao et.al.	2406.03768v1	link
2024-06-06	VisLTR: Visualization-in-the-Loop Table Reasoning	Jianing Hao et.al.	2406.03753v1	null
2024-06-06	A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions	Lei Liu et.al.	2406.03712v1	null
2024-06-06	Evaluating the World Model Implicit in a Generative Model	Keyon Vafa et.al.	2406.03689v1	link
2024-06-05	TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools	Avi Caciularu et.al.	2406.03618v1	null
2024-06-05	AD-H: Autonomous Driving with Hierarchical Agents	Zaibin Zhang et.al.	2406.03474v1	null
2024-06-05	Pre-trained Large Language Models Use Fourier Features to Compute Addition	Tianyi Zhou et.al.	2406.03445v1	null
2024-06-05	IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models	David Ifeoluwa Adelani et.al.	2406.03368v1	null
2024-06-05	CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning	Xinrui Lin et.al.	2406.03367v1	null
2024-06-06	Large Language Models as Evaluators for Recommendation Explanations	Xiaoyu Zhang et.al.	2406.03248v2	link
2024-06-05	Missci: Reconstructing Fallacies in Misrepresented Science	Max Glockner et.al.	2406.03181v1	link
2024-06-05	Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation	Tingjia Shen et.al.	2406.03085v1	null
2024-06-05	How Truncating Weights Improves Reasoning in Language Models	Lei Chen et.al.	2406.03068v1	null
2024-06-05	Verified Code Transpilation with LLMs	Sahil Bhatia et.al.	2406.03003v1	null
2024-06-05	NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models	Ancheng Xu et.al.	2406.02864v1	link
2024-06-05	LLM as a Scorer: The Impact of Output Order on Dialogue Evaluation	Yi-Pei Chen et.al.	2406.02863v1	null
2024-06-05	Item-Language Model for Conversational Recommendation	Li Yang et.al.	2406.02844v1	null
2024-06-04	Chain of Agents: Large Language Models Collaborating on Long-Context Tasks	Yusen Zhang et.al.	2406.02818v1	null
2024-06-04	$\texttt{ACCORD}$: Closing the Commonsense Measurability Gap	François Roewer-Després et.al.	2406.02804v1	link
2024-06-04	Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities	Wenyue Hua et.al.	2406.02787v1	null
2024-06-04	Adaptive Preference Scaling for Reinforcement Learning with Human Feedback	Ilgee Hong et.al.	2406.02764v1	null
2024-06-09	RATT: A Thought Structure for Coherent and Correct LLM Reasoning	Jinghan Zhang et.al.	2406.02746v2	null
2024-06-04	Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller	Min Cai et.al.	2406.02721v1	link
2024-06-04	Multiple Choice Questions and Large Languages Models: A Case Study with Fictional Medical Data	Maxime Griot et.al.	2406.02394v1	link
2024-06-04	Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks	Andrew Gambardella et.al.	2406.02356v1	null
2024-06-04	mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models	Huiyuan Lai et.al.	2406.02301v1	link
2024-06-04	Iteration Head: A Mechanistic Study of Chain-of-Thought	Vivien Cabannes et.al.	2406.02128v1	null
2024-06-04	MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset	Weiqi Wang et.al.	2406.02106v1	link
2024-06-04	Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data	Haolong Li et.al.	2406.02100v1	null
2024-06-05	Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models	Marianna Nezhurina et.al.	2406.02061v2	link
2024-06-05	Multimodal Reasoning with Multimodal Knowledge Graph	Junlin Lee et.al.	2406.02030v2	null
2024-06-04	Why Would You Suggest That? Human Trust in Language Model Responses	Manasi Sharma et.al.	2406.02018v1	null
2024-06-04	Process-Driven Autoformalization in Lean 4	Jianqiao Lu et.al.	2406.01940v1	link
2024-06-04	PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning	Yupeng Zheng et.al.	2406.01587v2	null
2024-06-03	LoFiT: Localized Fine-tuning on LLM Representations	Fangcong Yin et.al.	2406.01563v1	link
2024-06-03	FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs	Sushant Gautam et.al.	2406.01311v1	null
2024-06-03	EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs	Zixuan Dong et.al.	2406.01238v1	null
2024-06-03	Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph	Guangyi Liu et.al.	2406.01145v1	null
2024-06-03	SemCoder: Training Code Language Models with Comprehensive Semantics	Yangruibo Ding et.al.	2406.01006v1	null
2024-06-04	Efficient Behavior Tree Planning with Commonsense Pruning and Heuristic	Xinglin Chen et.al.	2406.00965v2	null
2024-06-04	MEDIQ: Question-Asking LLMs for Adaptive and Reliable Clinical Reasoning	Shuyue Stella Li et.al.	2406.00922v2	link
2024-06-02	Pretrained Hybrids with MAD Skills	Nicholas Roberts et.al.	2406.00894v1	null
2024-06-02	OLIVE: Object Level In-Context Visual Embeddings	Timothy Ossowski et.al.	2406.00872v1	link
2024-06-02	Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection	Chentao Cao et.al.	2406.00806v1	null
2024-06-02	Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction	Xiaoyuan Li et.al.	2406.00755v1	link
2024-06-01	Task Planning for Object Rearrangement in Multi-room Environments	Karan Mirakhor et.al.	2406.00451v1	null
2024-06-01	Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners	Zhi Zheng et.al.	2406.00430v1	null
2024-06-01	A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters	Long Hei Matthew Lam et.al.	2406.00284v1	link
2024-06-01	Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs	Mohammed Saidul Islam et.al.	2406.00257v1	null
2024-06-05	Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey	Bowen Jiang et.al.	2406.00252v2	link
2024-05-31	Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training	Maximillian Chen et.al.	2406.00222v1	null
2024-05-31	Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation	Bernd Bohnet et.al.	2406.00179v1	null
2024-05-31	QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation	Zhuo Chen et.al.	2406.00132v1	null
2024-05-31	Towards LLM-Powered Verilog RTL Assistant: Self-Verification and Self-Correction	Hanxian Huang et.al.	2406.00115v1	null
2024-05-31	Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training	Feiteng Fang et.al.	2405.20978v1	null
2024-06-05	SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales	Tianyang Xu et.al.	2405.20974v2	link
2024-06-03	Large Language Models are Zero-Shot Next Location Predictors	Ciro Beneduce et.al.	2405.20962v2	link
2024-05-31	Preemptive Answer "Attacks" on Chain-of-Thought Reasoning	Rongwu Xu et.al.	2405.20902v1	null
2024-05-31	Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning	Cheng Tan et.al.	2405.20834v1	null
2024-05-27	Exploring Backdoor Attacks against Large Language Model-based Decision Making	Ruochen Jiao et.al.	2405.20774v1	null
2024-05-31	Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning	Atharva Gundawar et.al.	2405.20625v1	null
2024-05-30	Unveiling the Impact of Coding Data Instruction Fine-Tuning on Large Language Models Reasoning	Xinlu Zhang et.al.	2405.20535v1	null
2024-05-30	SECURE: Benchmarking Generative Large Language Models for Cybersecurity Advisory	Dipkamal Bhusal et.al.	2405.20441v1	null
2024-05-30	MotionLLM: Understanding Human Behaviors from Human Motions and Videos	Ling-Hao Chen et.al.	2405.20340v1	null
2024-05-30	TAIA: Large Language Models are Out-of-Distribution Data Learners	Shuyang Jiang et.al.	2405.20192v1	link
2024-05-30	Nadine: An LLM-driven Intelligent Social Robot with Affective Capabilities and Human-like Memory	Hangyeol Kang et.al.	2405.20189v1	null
2024-05-30	Reasoning about concepts with LLMs: Inconsistencies abound	Rosario Uceda-Sosa et.al.	2405.20163v1	null
2024-05-30	GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning	Costas Mavromatis et.al.	2405.20139v1	link
2024-05-30	Improve Student's Reasoning Generalizability through Cascading Decomposed CoTs Distillation	Chengwei Dai et.al.	2405.19842v1	link
2024-05-30	VQA Training Sets are Self-play Environments for Generating Few-shot Pools	Tautvydas Misiunas et.al.	2405.19773v1	null
2024-05-30	Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation	Chengwei Dai et.al.	2405.19737v1	link
2024-05-30	Enhancing Large Vision Language Models with Self-Training on Image Comprehension	Yihe Deng et.al.	2405.19716v1	null
2024-05-30	AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization	Jiawei Chen et.al.	2405.19668v1	null
2024-06-01	Easy Problems That LLMs Get Wrong	Sean Williams et.al.	2405.19616v2	link
2024-05-30	The Accuracy of Domain Specific and Descriptive Analysis Generated by Large Language Models	Denish Omondi Otieno et.al.	2405.19578v1	null
2024-05-29	Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models	Venkat Venkatasubramanian et.al.	2405.19561v1	null
2024-05-29	MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions	Zhenwen Liang et.al.	2405.19444v1	link
2024-05-29	X-VILA: Cross-Modality Alignment for Large Language Model	Hanrong Ye et.al.	2405.19335v1	null
2024-06-02	MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series	Ge Zhang et.al.	2405.19327v3	link
2024-05-29	Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models	Tianrun Chen et.al.	2405.19326v1	null
2024-05-29	Towards Next-Generation Urban Decision Support Systems through AI-Powered Generation of Scientific Ontology using Large Language Models -- A Case in Optimizing Intermodal Freight Transportation	Jose Tupayachi et.al.	2405.19255v1	null
2024-05-29	VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos	Ziyang Wang et.al.	2405.19209v1	link
2024-05-29	Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery	Sounak Lahiri et.al.	2405.19164v1	null
2024-05-29	PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering	Fangzhi Xu et.al.	2405.19109v1	null
2024-06-02	Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design	Markus J. Buehler et.al.	2405.19076v2	link
2024-05-29	Towards Faithful Chain-of-Thought: Large Language Models are Bridging Reasoners	Jiachun Li et.al.	2405.18915v1	null
2024-05-31	LLMs achieve adult human performance on higher-order theory of mind tasks	Winnie Street et.al.	2405.18870v2	null
2024-06-02	Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts	S. Mostafa Mousavi et.al.	2405.18732v2	null
2024-05-29	Efficient Model-agnostic Alignment via Bayesian Persuasion	Fengshuo Bai et.al.	2405.18718v1	null
2024-05-29	Calibrating Reasoning in Language Models with Internal Consistency	Zhihui Xie et.al.	2405.18711v1	null
2024-05-30	Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning	Tiansheng Huang et.al.	2405.18641v2	link
2024-05-28	Don't Forget to Connect! Improving RAG with Graph-based Reranking	Jialin Dong et.al.	2405.18414v1	null
2024-05-28	OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning	Pengxiang Li et.al.	2405.18380v1	link
2024-05-28	LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models	Anthony Sarah et.al.	2405.18377v1	null
2024-05-28	Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning	Phakphum Artkaew et.al.	2405.18375v1	link
2024-05-28	PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework	Eshaan Agarwal et.al.	2405.18369v1	null
2024-05-28	Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?	Yifan Bai et.al.	2405.18361v1	null
2024-05-28	MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning	Somnath Kumar et.al.	2405.18358v1	null
2024-05-28	Faithful Logical Reasoning via Symbolic Chain-of-Thought	Jundong Xu et.al.	2405.18357v1	link
2024-05-28	Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning	Renzhi Wang et.al.	2405.18292v1	null
2024-05-28	A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models	Chengxing Xie et.al.	2405.18208v1	null
2024-05-28	LLM experiments with simulation: Large Language Model Multi-Agent System for Process Simulation Parametrization in Digital Twins	Yuchen Xia et.al.	2405.18092v1	link
2024-05-28	Towards Dialogues for Joint Human-AI Reasoning and Value Alignment	Elfia Bezou-Vrakatseli et.al.	2405.18073v1	null
2024-05-28	TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models	Jaewoo Ahn et.al.	2405.18027v1	null
2024-05-28	Knowledge Circuits in Pretrained Transformers	Yunzhi Yao et.al.	2405.17969v1	link
2024-05-28	Self-Guiding Exploration for Combinatorial Problems	Zangir Iklassov et.al.	2405.17950v1	link
2024-05-28	Arithmetic Reasoning with LLM: Prolog Generation & Permutation	Xiaocheng Yang et.al.	2405.17893v1	null
2024-05-28	Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action	Zhenyu Pan et.al.	2405.17822v1	null
2024-05-28	XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference	Shengnan Wang et.al.	2405.17755v1	null
2024-05-28	CLAIM Your Data: Enhancing Imputation Accuracy with Contextual Large Language Models	Ahatsham Hayat et.al.	2405.17712v1	null
2024-05-27	Video Enriched Retrieval Augmented Generation Using Aligned Video Captions	Kevin Dela Rosa et.al.	2405.17706v1	link
2024-05-27	BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments	Yusuf Roohani et.al.	2405.17631v1	link
2024-05-30	Code Repair with LLMs gives an Exploration-Exploitation Tradeoff	Hao Tang et.al.	2405.17503v2	null
2024-05-27	Matryoshka Multimodal Models	Mu Cai et.al.	2405.17430v1	null
2024-05-27	Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model	Kuan-Chih Huang et.al.	2405.17427v1	link
2024-05-27	Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation	Jiaming Liu et.al.	2405.17418v1	null
2024-05-27	MindMerger: Efficient Boosting LLM Reasoning in non-English Languages	Zixian Huang et.al.	2405.17386v1	link
2024-05-27	Assessing LLMs Suitability for Knowledge Graph Completion	Vasile Ionut Remus Iga et.al.	2405.17249v1	link
2024-05-27	LLM-Assisted Static Analysis for Detecting Security Vulnerabilities	Ziyang Li et.al.	2405.17238v1	null
2024-05-29	Position: Foundation Agents as the Paradigm Shift for Decision Making	Xiaoqian Liu et.al.	2405.17009v3	link
2024-05-28	Entity Alignment with Noisy Annotations from Large Language Models	Shengyuan Chen et.al.	2405.16806v2	link
2024-05-27	TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing	Xinyu Zhang et.al.	2405.16803v1	null
2024-05-29	AutoCV: Empowering Reasoning with Automated Process Labeling via Confidence Variation	Jianqiao Lu et.al.	2405.16802v3	link
2024-05-28	Large Scale Knowledge Washing	Yu Wang et.al.	2405.16720v2	null
2024-05-26	RLSF: Reinforcement Learning via Symbolic Feedback	Piyush Jha et.al.	2405.16661v1	null
2024-05-30	Meta-Task Planning for Language Agents	Cong Zhang et.al.	2405.16510v3	null
2024-05-26	M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought	Qiguang Chen et.al.	2405.16473v1	link
2024-05-26	Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search	Max Liu et.al.	2405.16450v1	null
2024-05-26	Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models	Jiankun Wang et.al.	2405.16413v1	null
2024-05-28	SpinQuant: LLM quantization with learned rotations	Zechun Liu et.al.	2405.16406v2	null
2024-05-28	STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making	Chuanhao Li et.al.	2405.16376v2	link
2024-06-03	Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge	Brendan Park et.al.	2405.16277v3	link
2024-05-25	MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time	Jikun Kang et.al.	2405.16265v1	null
2024-05-25	Finetuning Large Language Model for Personalized Ranking	Zhuoxi Bai et.al.	2405.16127v1	null
2024-05-25	Keypoint-based Progressive Chain-of-Thought Distillation for LLMs	Kaituo Feng et.al.	2405.16064v1	null
2024-05-25	Streaming Long Video Understanding with Large Language Models	Rui Qian et.al.	2405.16009v1	null
2024-05-30	SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation	Kun Zhao et.al.	2405.15924v3	link
2024-05-24	HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis	Shraddha Barke et.al.	2405.15880v1	null
2024-05-24	Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications	Yang Li et.al.	2405.15877v1	null
2024-05-24	Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models	Yue Zhang et.al.	2405.15684v1	null
2024-05-24	M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models	Hongyu Wang et.al.	2405.15638v1	link
2024-05-24	Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges	Jonas Becker et.al.	2405.15604v1	link
2024-05-24	Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation	Ge Qu et.al.	2405.15307v1	link
2024-05-24	Towards Understanding How Transformer Perform Multi-step Reasoning with Matching Operation	Zhiwei Wang et.al.	2405.15302v1	null
2024-05-24	Coaching Copilot: Blended Form of an LLM-Powered Chatbot and a Human Coach to Effectively Support Self-Reflection for Leadership Growth	Riku Arakawa et.al.	2405.15250v1	null
2024-05-24	A Solution-based LLM API-using Methodology for Academic Information Seeking	Yuanchun Wang et.al.	2405.15165v1	link
2024-05-24	From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks	Jacob Russin et.al.	2405.15164v1	null
2024-05-24	OptLLM: Optimal Assignment of Queries to Large Language Models	Yueyue Liu et.al.	2405.15130v1	link
2024-05-24	Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning	Yuyue Zhao et.al.	2405.15114v1	null
2024-05-23	Dissociation of Faithful and Unfaithful Reasoning in LLMs	Evelyn Yee et.al.	2405.15092v1	link
2024-05-23	OAC: Output-adaptive Calibration for Accurate Post-training Quantization	Ali Edalati et.al.	2405.15025v1	null
2024-05-23	Agentic Skill Discovery	Xufeng Zhao et.al.	2405.15019v1	null
2024-05-23	A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns	Asaf Yehudai et.al.	2405.14863v1	null
2024-05-23	Bitune: Bidirectional Instruction-Tuning	Dawid J. Kopiczko et.al.	2405.14862v1	null
2024-05-23	Efficient Medical Question Answering with Knowledge-Augmented Question Generation	Julien Khlaut et.al.	2405.14654v1	null
2024-05-24	Generating Exceptional Behavior Tests with Reasoning Augmented Large Language Models	Jiyang Zhang et.al.	2405.14619v2	null
2024-05-26	Explainable Few-shot Knowledge Tracing	Haoxuan Li et.al.	2405.14391v2	link
2024-05-23	Can Large Language Models Create New Knowledge for Spatial Reasoning Tasks?	Thomas Greatrix et.al.	2405.14379v1	null
2024-05-23	JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models	Kun Zhou et.al.	2405.14365v1	null
2024-05-23	DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data	Huajian Xin et.al.	2405.14333v1	null
2024-05-26	Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration	Yang Zhang et.al.	2405.14314v2	null
2024-05-23	Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning	Jiapu Wang et.al.	2405.14170v1	null
2024-05-23	Towards Transferable Attacks Against Vision-LLMs in Autonomous Driving with Typography	Nhat Chung et.al.	2405.14169v1	null
2024-05-23	Large Language Models Can Self-Correct with Minimal Effort	Zhenyu Wu et.al.	2405.14092v1	null
2024-05-23	$T^2$ of Thoughts: Temperature Tree Elicits Reasoning in Large Language Models	Chengkun Cai et.al.	2405.14075v1	null
2024-05-22	On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models	Mudit Verma et.al.	2405.13966v1	null
2024-05-22	PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery	Runlong He et.al.	2405.13949v1	link
2024-05-22	FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering	Yuan Sui et.al.	2405.13873v1	null
2024-05-29	Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models	Qiji Zhou et.al.	2405.13872v2	null
2024-05-22	Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation	Cyril Chhun et.al.	2405.13769v1	link
2024-05-22	HighwayLLM: Decision-Making and Navigation in Highway Driving with RL-Informed Language Model	Mustafa Yildirim et.al.	2405.13547v1	null
2024-05-22	LIRE: listwise reward enhancement for preference alignment	Mingye Zhu et.al.	2405.13516v1	null
2024-05-22	Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning	Yuanhao Yue et.al.	2405.13448v1	null
2024-05-22	Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction	Tingchen Fu et.al.	2405.13432v1	null
2024-05-21	Investigating Symbolic Capabilities of Large Language Models	Neisarg Dave et.al.	2405.13209v1	null
2024-05-21	Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding	Rong Gao et.al.	2405.13206v1	null
2024-05-20	Can Github issues be solved with Tree Of Thoughts?	Ricardo La Rosa et.al.	2405.13057v1	link
2024-05-17	Surgical Feature-Space Decomposition of LLMs: Why, When and How?	Arnav Chavan et.al.	2405.13039v1	null
2024-05-16	Can formal argumentative reasoning enhance LLMs performances?	Federico Castagna et.al.	2405.13036v1	null
2024-05-15	IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues	Diji Yang et.al.	2405.13021v1	null
2024-05-14	QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models	Wei Wang et.al.	2405.13014v1	null
2024-05-12	MathDivide: Improved mathematical reasoning by large language models	Saksham Sahai Srivastava et.al.	2405.13004v1	null
2024-05-21	Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models	Zhangyue Yin et.al.	2405.12939v1	link
2024-05-21	Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs	Bilgehan Sel et.al.	2405.12933v1	null
2024-05-21	DrHouse: An LLM-empowered Diagnostic Reasoning System through Harnessing Outcomes from Sensor Data and Expert Knowledge	Bufang Yang et.al.	2405.12541v1	null
2024-05-21	LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs	Sudhir Agarwal et.al.	2405.12433v1	null
2024-05-20	Eliciting Problem Specifications via Large Language Models	Robert E. Wray et.al.	2405.12147v1	null
2024-05-20	MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning	Ting Jiang et.al.	2405.12130v1	link
2024-05-20	DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction	Hao Chen et.al.	2405.12100v1	null
2024-05-20	KG-RAG: Bridging the Gap Between Knowledge and Creativity	Diego Sanmartin et.al.	2405.12035v1	null
2024-05-20	Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs	Siyu Lou et.al.	2405.11880v1	null
2024-05-20	Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities	Junqi Wang et.al.	2405.11841v1	link
2024-05-19	Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning	Zishan Gu et.al.	2405.11640v1	null
2024-05-19	MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation	Jianbo Dai et.al.	2405.11430v1	link
2024-05-17	Are Large Language Models Moral Hypocrites? A Study Based on Moral Foundations	José Luiz Nunes et.al.	2405.11100v1	null
2024-05-17	From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT	Jace Grandinetti et.al.	2405.11040v1	null
2024-05-17	Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities	Hao Zhou et.al.	2405.10825v1	null
2024-05-17	Efficient Multimodal Large Language Models: A Survey	Yizhang Jin et.al.	2405.10739v1	link
2024-05-17	MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains	Zhaohuan Zhan et.al.	2405.10620v1	null
2024-05-17	RDRec: Rationale Distillation for LLM-based Recommendation	Xinfeng Wang et.al.	2405.10587v1	link
2024-05-17	Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset	Jie Zhu et.al.	2405.10542v1	link
2024-05-16	Retrieving and Refining: A Hybrid Framework with Large Language Models for Rare Disease Identification	Jinge Wu et.al.	2405.10440v1	null
2024-05-16	When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models	Xianzheng Ma et.al.	2405.10255v1	null
2024-05-16	A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks	Xuanfan Ni et.al.	2405.10251v1	null
2024-05-16	LFED: A Literary Fiction Evaluation Dataset for Large Language Models	Linhao Yu et.al.	2405.10166v1	link
2024-05-16	SEEK: Semantic Reasoning for Object Goal Navigation in Real World Inspection Tasks	Muhammad Fadhil Ginting et.al.	2405.09822v1	null
2024-05-16	LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery	Pingchuan Ma et.al.	2405.09783v1	null
2024-05-15	Matching domain experts by training from scratch on domain knowledge	Xiaoliang Luo et.al.	2405.09395v1	null
2024-05-15	Exploring the Potential of Large Language Models for Automation in Technical Customer Service	Jochen Wulf et.al.	2405.09161v1	null
2024-05-14	A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine	Hanguang Xiao et.al.	2405.08603v1	null
2024-05-14	Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure	Odysseas S. Chlapanis et.al.	2405.08502v1	link
2024-05-14	PromptMind Team at MEDIQA-CORR 2024: Improving Clinical Text Correction with Error Categorization and LLM Ensembles	Satya Kesav Gundabathula et.al.	2405.08373v1	null
2024-05-13	LLM Theory of Mind and Alignment: Opportunities and Risks	Winnie Street et.al.	2405.08154v1	null
2024-05-13	EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning	Yinzhu Quan et.al.	2405.07938v1	null
2024-05-13	Generating Human Motion in 3D Scenes from Text Descriptions	Zhi Cen et.al.	2405.07784v1	null
2024-05-13	Backdoor Removal for Generative Large Language Models	Haoran Li et.al.	2405.07667v1	null
2024-05-13	MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning	Shuo Yin et.al.	2405.07551v1	null
2024-05-13	Oedipus: LLM-enchanced Reasoning CAPTCHA Solver	Gelei Deng et.al.	2405.07496v1	null
2024-05-14	MedConceptsQA: Open Source Medical Concepts QA Benchmark	Ofir Ben Shoham et.al.	2405.07348v2	link
2024-05-12	Learnable Tokenizer for LLM-based Generative Recommendation	Wenjie Wang et.al.	2405.07314v1	null
2024-05-12	MM-InstructEval: Zero-Shot Evaluation of (Multimodal) Large Language Models on Multimodal Reasoning Tasks	Xiaocui Yang et.al.	2405.07229v1	link
2024-05-11	Automating Thematic Analysis: How LLMs Analyse Controversial Topics	Awais Hameed Khan et.al.	2405.06919v1	null
2024-05-09	Hypothesis Testing Prompting Improves Deductive Reasoning in Large Language Models	Yitian Li et.al.	2405.06707v1	null
2024-05-09	LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought	Zhuoxuan Jiang et.al.	2405.06705v1	null
2024-05-07	SUTRA: Scalable Multilingual Language Model Architecture	Abhijit Bendale et.al.	2405.06694v1	null
2024-05-07	Fleet of Agents: Coordinated Problem Solving with Large Language Models using Genetic Particle Filtering	Akhil Arora et.al.	2405.06691v1	null
2024-05-05	Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning	Jun Zhao et.al.	2405.06680v1	null
2024-05-10	Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus	Filipe Marinho Rocha et.al.	2405.06399v1	null
2024-05-09	LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models	Ruihao Gong et.al.	2405.06001v1	link
2024-05-09	OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning	Dan Qiao et.al.	2405.05957v1	link
2024-05-09	Probing Multimodal LLMs as World Models for Driving	Shiva Sreeram et.al.	2405.05956v1	link
2024-05-09	Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes	Ziang Guo et.al.	2405.05885v1	null
2024-05-09	Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning	Artem Lykov et.al.	2405.05824v1	link
2024-05-09	Redefining Information Retrieval of Structured Database via Large Language Models	Mingzhu Wang et.al.	2405.05508v1	null
2024-05-08	SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants	Masoud Moghani et.al.	2405.05226v1	null
2024-05-08	MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning	Inderjeet Nair et.al.	2405.05189v1	null
2024-05-08	QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs	Weijia Zhang et.al.	2405.05109v1	null
2024-05-08	Federated Adaptation for Foundation Model-based Recommendations	Chunxu Zhang et.al.	2405.04840v1	link
2024-05-08	ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation	Ana Brassard et.al.	2405.04818v1	link
2024-05-08	Chain of Thoughtlessness: An Analysis of CoT in Planning	Kaya Stechly et.al.	2405.04776v1	null
2024-05-08	BiasKG: Adversarial Knowledge Graphs to Induce Bias in Large Language Models	Chu Fei Luo et.al.	2405.04756v1	link
2024-05-07	Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking	Emre Can Acikgoz et.al.	2405.04685v1	null
2024-05-07	Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics	Hanlin Zhu et.al.	2405.04669v1	null
2024-05-07	ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning	Jing Lin et.al.	2405.04533v1	null
2024-05-08	Unveiling Disparities in Web Task Handling Between Human and Web Agent	Kihoon Son et.al.	2405.04497v2	null
2024-05-07	Large Language Models Cannot Explain Themselves	Advait Sarkar et.al.	2405.04382v1	null
2024-05-07	NL2Plan: Robust LLM-Driven Planning from Minimal Text Descriptions	Elliot Gestrin et.al.	2405.04215v1	null
2024-05-07	D-NLP at SemEval-2024 Task 2: Evaluating Clinical Inference Capabilities of Large Language Models	Duygu Altinok et.al.	2405.04170v1	link
2024-05-07	Optimizing Language Model's Reasoning Abilities with Weak Supervision	Yongqi Tong et.al.	2405.04086v1	null
2024-05-14	Generating Probabilistic Scenario Programs from Natural Language	Karim Elmaaroufi et.al.	2405.03709v2	null
2024-05-08	How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs	Muhammad Uzair Khattak et.al.	2405.03690v2	null
2024-05-06	Language-Image Models with 3D Understanding	Jang Hyun Cho et.al.	2405.03685v1	null
2024-05-06	Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment	Abhinav Agarwalla et.al.	2405.03594v1	null
2024-05-23	AlphaMath Almost Zero: process Supervision without process	Guoxin Chen et.al.	2405.03553v2	link
2024-05-15	MAmmoTH2: Scaling Instructions from the Web	Xiang Yue et.al.	2405.03548v3	null
2024-05-06	Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning	Yubo Mai et.al.	2405.03509v1	null
2024-05-06	Explainable Fake News Detection With Large Language Model via Defense Among Competing Wisdom	Bo Wang et.al.	2405.03371v1	link
2024-05-06	MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline	Mohamed Yaseen Jabarulla et.al.	2405.03359v1	link
2024-05-06	WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning	Yuanhan Zhang et.al.	2405.03272v1	null
2024-05-06	CRAFT: Extracting and Tuning Cultural Instructions from the Wild	Bin Wang et.al.	2405.03138v1	link
2024-05-05	High Order Reasoning for Time Critical Recommendation in Evidence-based Medicine	Manjiang Yu et.al.	2405.03010v1	null
2024-05-05	MedAdapter: Efficient Test-Time Adaptation of Large Language Models towards Medical Reasoning	Wenqi Shi et.al.	2405.03000v1	null
2024-05-05	Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy	Aftab Hussain et.al.	2405.02828v1	null
2024-05-04	CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions	Hanchong Zhang et.al.	2405.02712v1	link
2024-05-04	A Literature Review and Framework for Human Evaluation of Generative Large Language Models in Healthcare	Thomas Yu Chow Tam et.al.	2405.02559v1	null
2024-05-20	GigSense: An LLM-Infused Tool for Workers' Collective Intelligence	Kashif Imteyaz et.al.	2405.02528v2	null
2024-05-09	REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs	Deepa Tilwani et.al.	2405.02228v2	null
2024-05-03	Argumentative Large Language Models for Explainable and Contestable Decision-Making	Gabriel Freedman et.al.	2405.02079v1	null
2024-05-03	Exploring Combinatorial Problem Solving with Large Language Models: A Case Study on the Travelling Salesman Problem Using GPT-3.5 Turbo	Mahmoud Masoud et.al.	2405.01997v1	null
2024-05-03	Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems	Chuang Li et.al.	2405.01868v1	null
2024-05-02	ALCM: Autonomous LLM-Augmented Causal Discovery Framework	Elahe Khatibi et.al.	2405.01744v1	null
2024-05-08	Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning	Tianle Xia et.al.	2405.01649v3	null
2024-04-30	Large Language Model Agent for Fake News Detection	Xinyi Li et.al.	2405.01593v1	null
2024-04-28	Tabular Embedding Model (TEM): Finetuning Embedding Models For Tabular RAG Applications	Sujit Khanna et.al.	2405.01585v1	null
2024-05-02	OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning	Shihao Wang et.al.	2405.01533v1	link
2024-05-02	Analyzing the Role of Semantic Representations in the Era of Large Language Models	Zhijing Jin et.al.	2405.01502v1	link
2024-05-08	Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving	Xin Quan et.al.	2405.01379v2	null
2024-05-02	GAIA: A General AI Assistant for Intelligent Accelerator Operations	Frank Mayet et.al.	2405.01359v1	null
2024-05-02	The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights	Wenhao Zhu et.al.	2405.01345v1	link
2024-05-02	Bayesian Optimization with LLM-Based Acquisition Functions for Natural Language Preference Elicitation	David Eric Austin et.al.	2405.00981v1	null
2024-05-02	CACTUS: Chemistry Agent Connecting Tool-Usage to Science	Andrew D. McNaughton et.al.	2405.00972v1	link
2024-04-25	Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models	Xu Ji et.al.	2405.00718v1	null
2024-04-25	Large Language Models in Healthcare: A Comprehensive Benchmark	Andrew Liu et.al.	2405.00716v1	null
2024-05-01	HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models	Ningke Li et.al.	2405.00648v1	null
2024-05-01	Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning	Yuxi Xie et.al.	2405.00451v1	null
2024-05-01	RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models	Mohamed Manzour Hussien et.al.	2405.00449v1	null
2024-05-01	Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models	Leonardo Ranaldi et.al.	2405.00402v1	null
2024-05-01	AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts	Zefang Liu et.al.	2405.00361v1	link
2024-05-03	Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model	Yu Cui et.al.	2405.00338v2	null
2024-05-03	A Careful Examination of Large Language Model Performance on Grade School Arithmetic	Hugh Zhang et.al.	2405.00332v3	null
2024-05-01	DFKI-NLP at SemEval-2024 Task 2: Towards Robust LLMs Using Data Perturbations and MinMax Training	Bhuvanesh Verma et.al.	2405.00321v1	null
2024-04-30	General Purpose Verification for Chain of Thought Prompting	Robert Vacareanu et.al.	2405.00204v1	null
2024-04-30	Better & Faster Large Language Models via Multi-token Prediction	Fabian Gloeckle et.al.	2404.19737v1	null
2024-04-30	Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners	Chun Feng et.al.	2404.19696v1	null
2024-04-30	Do Large Language Models Understand Conversational Implicature -- A case study with a Chinese sitcom	Shisen Yue et.al.	2404.19509v1	link
2024-05-01	Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings	Guobin Shen et.al.	2404.19438v2	null
2024-04-30	Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships	D. Panas et.al.	2404.19432v1	null
2024-04-30	Evaluating Telugu Proficiency in Large Language Models: A Comparative Analysis of ChatGPT and Gemini	Katikela Sreeharsha Kishore et.al.	2404.19369v1	null
2024-04-30	Multi-hop Question Answering over Knowledge Graphs using Large Language Models	Abir Chakraborty et.al.	2404.19234v1	null
2024-04-30	Transcrib3D: 3D Referring Expression Resolution through Large Language Models	Jiading Fang et.al.	2404.19221v1	null
2024-04-29	SuperCLUE-Fin: Graded Fine-Grained Analysis of Chinese LLMs on Diverse Financial Tasks and Applications	Liang Xu et.al.	2404.19063v1	null
2024-04-29	Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models	Houjun Liu et.al.	2404.19055v1	null
2024-04-29	Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs	Bahar Radmehr et.al.	2404.18978v1	null
2024-04-29	Benchmarking Benchmark Leakage in Large Language Models	Ruijie Xu et.al.	2404.18824v1	link
2024-04-29	PECC: Problem Extraction and Coding Challenges	Patrick Haller et.al.	2404.18766v1	link
2024-04-29	Injecting Salesperson's Dialogue Strategies in Large Language Models with Chain-of-Thought Reasoning	Wen-Yu Chang et.al.	2404.18564v1	null
2024-04-29	Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in	Utkarsh Agarwal et.al.	2404.18460v1	null
2024-04-29	FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models	Wei Li et.al.	2404.18359v1	null
2024-04-30	Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages	David Ifeoluwa Adelani et.al.	2404.18286v2	null
2024-04-28	Logic Agent: Enhancing Validity with Logic Rule Invocation	Hanmeng Liu et.al.	2404.18130v1	null
2024-04-28	Generative AI for Low-Carbon Artificial Intelligence of Things	Jinbo Wen et.al.	2404.18077v1	null
2024-04-27	CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments	Kaixuan Huang et.al.	2404.18021v1	null
2024-04-27	Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction	Guozheng Li et.al.	2404.17809v1	null
2024-04-26	CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving	Pei Chen et.al.	2404.17729v1	link
2024-04-26	PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games	Qinglin Zhu et.al.	2404.17662v1	link
2024-05-09	Large Language Model Agent as a Mechanical Designer	Yayati Jadhav et.al.	2404.17525v2	null
2024-04-29	On the Use of Large Language Models to Generate Capability Ontologies	Luis Miguel Vieira da Silva et.al.	2404.17524v2	null
2024-04-26	Enhancing Legal Compliance and Regulation Analysis with Large Language Models	Shabnam Hassani et.al.	2404.17522v1	null
2024-04-26	A Comprehensive Evaluation on Event Reasoning of Large Language Models	Zhengwei Tao et.al.	2404.17513v1	link
2024-04-26	Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System	Robin Schmucker et.al.	2404.17460v1	null
2024-04-26	Small Language Models Need Strong Verifiers to Self-Correct Reasoning	Yunxiang Zhang et.al.	2404.17140v1	null
2024-04-26	Make Your LLM Fully Utilize the Context	Shengnan An et.al.	2404.16811v2	link
2024-04-25	Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning	Tianhui Zhang et.al.	2404.16807v1	null
2024-04-25	RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis	Xiaoman Zhang et.al.	2404.16754v1	null
2024-04-25	Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents	Giorgio Piatti et.al.	2404.16698v1	null
2024-04-25	EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning	Hongxia Xie et.al.	2404.16670v1	link
2024-04-25	Evolutionary Large Language Models for Hardware Security: A Comparative Survey	Mohammad Akyash et.al.	2404.16651v1	null
2024-04-25	Evaluating Consistency and Reasoning Capabilities of Large Language Models	Yash Saxena et.al.	2404.16478v1	null
2024-04-25	List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs	An Yan et.al.	2404.16375v1	link
2024-04-24	The Feasibility of Implementing Large-Scale Transformers on Multi-FPGA Platforms	Yu Gao et.al.	2404.16158v1	null
2024-04-24	Cantor: Inspiring Multimodal Chain-of-Thought of MLLM	Timin Gao et.al.	2404.16033v1	null
2024-04-24	GeckOpt: LLM System Efficiency via Intent-Based Tool Selection	Michael Fore et.al.	2404.15804v1	null
2024-04-24	Leveraging Large Language Models for Multimodal Search	Oriol Barbany et.al.	2404.15790v1	null
2024-04-24	Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs	Yu Xia et.al.	2404.15676v1	null
2024-04-24	Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations?	Hossein Salami et.al.	2404.15578v1	null
2024-04-23	Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models	Mihir Parmar et.al.	2404.15522v1	link
2024-04-25	ToM-LM: Delegating Theory of Mind Reasoning to External Symbolic Executors in Large Language Models	Weizhi Tang et.al.	2404.15515v2	null
2024-04-23	Re-Thinking Inverse Graphics With Large Language Models	Peter Kulits et.al.	2404.15228v1	null
2024-04-23	Regressive Side Effects of Training Language Models to Mimic Student Misconceptions	Shashank Sonkar et.al.	2404.15156v1	null
2024-04-23	Rethinking LLM Memorization through the Lens of Adversarial Compression	Avi Schwarzschild et.al.	2404.15146v1	null
2024-04-28	Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Reasoners	Qihuang Zhong et.al.	2404.14963v2	null
2024-04-23	Graph Machine Learning in the Era of Large Language Models (LLMs)	Wenqi Fan et.al.	2404.14928v1	null
2024-04-23	Pattern-Aware Chain-of-Thought Prompting in Large Language Models	Yufeng Zhang et.al.	2404.14812v1	null
2024-04-23	A Survey of Large Language Models on Generative Graph Analytics: Query, Learning, and Applications	Wenbo Shang et.al.	2404.14809v1	null
2024-04-23	Med42 -- Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches	Clément Christophe et.al.	2404.14779v1	null
2024-04-23	CT-Agent: Clinical Trial Multi-Agent with Large Language Model-based Reasoning	Ling Yue et.al.	2404.14777v1	null
2024-04-23	Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks	Amir Saeidi et.al.	2404.14723v1	null
2024-04-23	Think-Program-reCtify: 3D Situated Reasoning with Large Language Models	Qingrong He et.al.	2404.14705v1	null
2024-04-23	NExT: Teaching Large Language Models to Reason about Code Execution	Ansong Ni et.al.	2404.14662v1	null
2024-04-26	Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training	Mengzhao Jia et.al.	2404.14604v3	null
2024-04-22	Tree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question Answering	Li Jiapeng et.al.	2404.14464v1	null
2024-04-14	Enhancing Fault Detection for Large Language Models via Mutation-Based Confidence Smoothing	Qiang Hu et.al.	2404.14419v1	null
2024-04-22	An Artificial Neuron for Enhanced Problem Solving in Large Language Models	Sumedh Rasal et.al.	2404.14222v1	null
2024-04-22	Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction	Zheye Deng et.al.	2404.14215v1	link
2024-04-24	Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion	Yingxuan Li et.al.	2404.13993v2	null
2024-04-22	Information Re-Organization Improves Reasoning in Large Language Models	Xiaoxia Cheng et.al.	2404.13985v1	null
2024-04-22	MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit	Boning Zhang et.al.	2404.13925v1	link
2024-04-22	Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models	Yukyung Lee et.al.	2404.13919v1	null
2024-04-22	EventLens: Leveraging Event-Aware Pretraining and Cross-modal Linking Enhances Visual Commonsense Reasoning	Mingjie Ma et.al.	2404.13847v1	null
2024-04-24	MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning	Yifan Jiang et.al.	2404.13591v2	link
2024-04-20	Large Language Models as Test Case Generators: Performance Evaluation and Enhancement	Kefan Li et.al.	2404.13340v1	null
2024-05-03	LLMChain: Blockchain-based Reputation System for Sharing and Evaluating Large Language Models	Mouhamed Amine Bouchiha et.al.	2404.13236v2	link
2024-04-19	Beyond Self-Consistency: Ensemble Reasoning Boosts Consistency and Accuracy of LLMs in Cancer Staging	Chia-Hsuan Chang et.al.	2404.13149v1	null
2024-04-17	TREACLE: Thrifty Reasoning via Context-Aware LLM and Prompt Selection	Xuechen Zhang et.al.	2404.13082v1	null
2024-04-14	Evidence from counterfactual tasks supports emergent analogical reasoning in large language models	Taylor Webb et.al.	2404.13070v1	link
2024-04-19	Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs	Biyang Guo et.al.	2404.13033v1	link
2024-04-24	Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models	Yian Li et.al.	2404.12966v2	null
2024-04-29	Large Language Models for Networking: Workflow, Advances and Challenges	Chang Liu et.al.	2404.12901v2	null
2024-04-19	Towards Logically Consistent Language Models via Probabilistic Reasoning	Diego Calanzone et.al.	2404.12843v1	null
2024-04-19	TextSquare: Scaling up Text-Centric Visual Instruction Tuning	Jingqun Tang et.al.	2404.12803v1	null
2024-04-19	Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?	Chengwei Qin et.al.	2404.12728v1	null
2024-04-19	Enabling Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration	Yichong Huang et.al.	2404.12715v1	null
2024-04-22	Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs	Boyang Yang et.al.	2404.12636v2	null
2024-04-18	BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models	Yu Feng et.al.	2404.12494v1	null
2024-04-18	NORMAD: A Benchmark for Measuring the Cultural Adaptability of Large Language Models	Abhinav Rao et.al.	2404.12464v1	null
2024-04-25	BLINK: Multimodal Large Language Models Can See but Not Perceive	Xingyu Fu et.al.	2404.12390v2	null
2024-04-18	MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale	Xiaotang Gai et.al.	2404.12372v1	null
2024-04-18	Large Language Models in Targeted Sentiment Analysis	Nicolay Rusnachenko et.al.	2404.12342v1	link
2024-04-18	Normative Requirements Operationalization with Large Language Models	Nick Feng et.al.	2404.12335v1	null
2024-04-18	Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing	Ye Tian et.al.	2404.12253v1	null
2024-04-19	AccidentBlip2: Accident Detection With Multi-View MotionBlip2	Yihua Shao et.al.	2404.12149v2	link
2024-04-18	RAGAR, Your Falsehood RADAR: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models	M. Abdul Khaliq et.al.	2404.12065v1	null
2024-04-18	EVIT: Event-Oriented Instruction Tuning for Event Reasoning	Zhengwei Tao et.al.	2404.11978v1	null
2024-04-18	Large Language Models Can Plan Your Travels Rigorously with Formal Verification Tools	Yilun Hao et.al.	2404.11891v1	null
2024-04-18	CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models	Minjung Shin et.al.	2404.11835v1	null
2024-04-19	Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study	Zooey Nguyen et.al.	2404.11792v2	null
2024-04-21	Missed Connections: Lateral Thinking Puzzles for Large Language Models	Graham Todd et.al.	2404.11730v2	null
2024-04-17	How often are errors in natural language reasoning due to paraphrastic variability?	Neha Srikanth et.al.	2404.11717v1	null
2024-04-17	Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models	Yue Zhou et.al.	2404.11500v1	link
2024-04-17	Exploring the Transferability of Visual Prompting for Multimodal Large Language Models	Yichi Zhang et.al.	2404.11207v1	link
2024-04-17	Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales	Minghe Gao et.al.	2404.11129v1	null
2024-04-17	TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment	Qinfeng Li et.al.	2404.11121v1	null
2024-04-18	ViLLM-Eval: A Comprehensive Evaluation Suite for Vietnamese Large Language Models	Trong-Hieu Nguyen et.al.	2404.11086v2	null
2024-04-17	On the Empirical Complexity of Reasoning and Planning in LLMs	Liwei Kang et.al.	2404.11041v1	null
2024-04-17	Empowering Large Language Models on Robotic Manipulation with Affordance Prompting	Guangran Cheng et.al.	2404.11027v1	null
2024-04-17	Many-Shot In-Context Learning	Rishabh Agarwal et.al.	2404.11018v1	null
2024-04-16	Self-playing Adversarial Language Game Enhances LLM Reasoning	Pengyu Cheng et.al.	2404.10642v1	link
2024-04-16	Private Attribute Inference from Images with Vision-Language Models	Batuhan Tömekçe et.al.	2404.10618v1	null
2024-04-16	Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases	Yanze Li et.al.	2404.10595v1	null
2024-04-16	CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity	Moshe Berchansky et.al.	2404.10513v1	null
2024-04-16	MEEL: Multi-Modal Event Evolution Learning	Zhengwei Tao et.al.	2404.10429v1	link
2024-04-16	Reasoning on Efficient Knowledge Paths: Knowledge Graph Guides Large Language Model for Domain Question Answering	Yuqi Wang et.al.	2404.10384v1	null
2024-04-16	Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards	Hyeonbin Hwang et.al.	2404.10346v1	link
2024-04-28	RLRF: Reinforcement Learning from Reflection through Debates as Feedback for Bias Mitigation in LLMs	Ruoxi Cheng et.al.	2404.10160v2	null
2024-04-15	TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition	Md Mahadi Hasan Nahid et.al.	2404.10150v1	link
2024-04-15	ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis	Aashish Anantha Ramakrishnan et.al.	2404.10141v1	link
2024-04-15	A Survey on Deep Learning for Theorem Proving	Zhaoyu Li et.al.	2404.09939v1	link
2024-04-15	Compression Represents Intelligence Linearly	Yuzhen Huang et.al.	2404.09937v1	link
2024-04-15	AI-Driven Statutory Reasoning via Software Engineering Methods	Rohan Padhye et.al.	2404.09868v1	null
2024-04-15	Reimagining Self-Adaptation in the Age of Large Language Models	Raghav Donakanti et.al.	2404.09866v1	null
2024-04-15	Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model	Hyunsoo Cho et.al.	2404.09717v1	null
2024-04-15	Generative AI for Game Theory-based Mobile Networking	Long He et.al.	2404.09699v1	null
2024-04-15	Bridging Vision and Language Spaces with Assignment Prediction	Jungin Park et.al.	2404.09632v1	link
2024-04-15	Bridging the Gap between Different Vocabularies for LLM Ensemble	Yangyifan Xu et.al.	2404.09492v1	link
2024-04-15	Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning	Sungwon Han et.al.	2404.09491v1	link
2024-04-15	MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems	Kaixin Li et.al.	2404.09486v1	link
2024-04-14	A Survey on Integration of Large Language Models with Intelligent Robots	Yeseung Kim et.al.	2404.09228v1	null
2024-04-16	Post-Semantic-Thinking: A Robust Strategy to Distill Reasoning Capacity from Large Language Models	Xiaoshu Chen et.al.	2404.09170v2	null
2024-04-14	When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models	Yanhong Li et.al.	2404.09129v1	null
2024-04-13	CuriousLLM: Elevating Multi-Document QA with Reasoning-Infused Knowledge Graph Prompting	Zukang Yang et.al.	2404.09077v1	link
2024-04-12	"Don't forget to put the milk back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations	James F. Mullen Jr et.al.	2404.08827v1	null
2024-04-12	LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning	Junchi Wang et.al.	2404.08767v1	link
2024-04-11	MM-PhyQA: Multimodal Physics Question-Answering With Multi-Image CoT Prompting	Avinash Anand et.al.	2404.08704v1	null
2024-04-10	Apollonion: Profile-centric Dialog Agent	Shangyu Chen et.al.	2404.08692v1	null
2024-04-06	ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming	Simone Tedeschi et.al.	2404.08676v1	link
2024-04-12	Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts	Övgü Özdemir et.al.	2404.08589v1	link
2024-04-12	LaSagnA: Language-based Segmentation Assistant for Complex Queries	Cong Wei et.al.	2404.08506v1	link
2024-04-12	Strategic Interactions between Large Language Models-based Agents in Beauty Contests	Siting Lu et.al.	2404.08492v1	null
2024-04-12	Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian	Stefano De Paoli et.al.	2404.08488v1	null
2024-04-11	Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs	Jierui Li et.al.	2404.08148v1	null
2024-04-11	Data-Augmentation-Based Dialectal Adaptation for LLMs	Fahim Faisal et.al.	2404.08092v1	link
2024-04-10	Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition	Kehua Feng et.al.	2404.08008v1	link
2024-04-17	LaVy: Vietnamese Multimodal Large Language Model	Chi Tran et.al.	2404.07922v4	link
2024-04-11	ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs	Lei Sun et.al.	2404.07677v1	null
2024-04-11	WESE: Weak Exploration to Strong Exploitation for LLM Agents	Xu Huang et.al.	2404.07456v1	null
2024-04-11	Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs	Kanchana Ranasinghe et.al.	2404.07449v1	null
2024-04-10	Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs	Bowen Jin et.al.	2404.07103v1	link
2024-04-10	VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning	Alexandros Xenos et.al.	2404.07078v1	link
2024-04-10	Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study	Hongru Du et.al.	2404.06962v1	link
2024-04-10	Vision-Language Model-based Physical Reasoning for Robot Liquid Perception	Wenqiang Lai et.al.	2404.06904v1	null
2024-04-09	GenCHiP: Generating Robot Policy Code for High-Precision and Contact-Rich Manipulation Tasks	Kaylee Burns et.al.	2404.06645v1	null
2024-04-09	Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?	Omid Ghahroodi et.al.	2404.06644v1	null
2024-04-09	AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents	Luca Gioacchini et.al.	2404.06411v1	link
2024-04-09	Model Generation from Requirements with LLMs: an Exploratory Study	Alessio Ferrari et.al.	2404.06371v1	null
2024-04-21	AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning	Senkang Hu et.al.	2404.06345v2	null
2024-04-09	DRE: Generating Recommendation Explanations by Aligning Large Language Models at Data-level	Shen Gao et.al.	2404.06311v1	null
2024-04-09	Multimodal Road Network Generation Based on Large Language Model	Jiajing Chen et.al.	2404.06227v1	null
2024-04-08	Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning	Ruiqi Zhang et.al.	2404.05868v1	null
2024-04-08	Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs	Keen You et.al.	2404.05719v1	null
2024-04-08	Evaluating Mathematical Reasoning Beyond Accuracy	Shijie Xia et.al.	2404.05692v1	link
2024-04-18	CoReS: Orchestrating the Dance of Reasoning and Segmentation	Xiaoyi Bao et.al.	2404.05673v2	null
2024-04-08	MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering	Iñigo Alonso et.al.	2404.05590v1	null
2024-04-08	Evaluating Interventional Reasoning Capabilities of Large Language Models	Tejas Kasetty et.al.	2404.05545v1	null
2024-04-08	HAMMR: HierArchical MultiModal React agents for generic VQA	Lluis Castrejon et.al.	2404.05465v1	null
2024-04-11	RoT: Enhancing Large Language Models with Reflection on Search Trees	Wenyang Hui et.al.	2404.05449v2	link
2024-04-08	Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models	Yutao Ouyang et.al.	2404.05291v1	null
2024-04-08	LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models	Shibo Hao et.al.	2404.05221v1	null
2024-04-08	LLM-BT: Performing Robotic Adaptive Tasks based on Large Language Models and Behavior Trees	Haotian Zhou et.al.	2404.05134v1	null
2024-04-07	Facial Affective Behavior Analysis with Instruction Tuning	Yifan Li et.al.	2404.05052v1	null
2024-04-07	MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models	Zihao Wei et.al.	2404.04990v1	link
2024-04-07	SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials	Mael Jullien et.al.	2404.04963v1	null
2024-04-07	RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models	Qi Lv et.al.	2404.04929v1	null
2024-04-07	LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead	Junda He et.al.	2404.04834v1	null
2024-04-07	FRACTAL: Fine-Grained Scoring from Aggregate Text Labels	Yukti Makhija et.al.	2404.04817v1	null
2024-04-07	GenEARL: A Training-Free Generative Framework for Multimodal Event Argument Role Labeling	Hritik Bansal et.al.	2404.04763v1	null
2024-04-06	Challenges Faced by Large Language Models in Solving Multi-Agent Flocking	Peihan Li et.al.	2404.04752v1	null
2024-04-06	Navigating the Landscape of Hint Generation Research: From the Past to the Future	Anubhav Jangra et.al.	2404.04728v1	null
2024-04-06	Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology	Dyke Ferber et.al.	2404.04667v1	null
2024-04-06	Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement	Zaid Khan et.al.	2404.04627v1	null
2024-04-06	IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials	Shreyasi Mandal et.al.	2404.04510v1	link
2024-04-05	Exploring Autonomous Agents through the Lens of Large Language Models: A Review	Saikat Barua et.al.	2404.04442v1	null
2024-04-05	Assisting humans in complex comparisons: automated information comparison at scale	Truman Yuen et.al.	2404.04351v1	null
2024-04-05	Koala: Key frame-conditioned long video-LLM	Reuben Tan et.al.	2404.04346v1	null
2024-04-04	CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLMs for Legal Question Answering	Nirmalie Wiratunga et.al.	2404.04302v1	link
2024-04-04	Reason from Fallacy: Enhancing Large Language Models' Logical Reasoning through Logical Fallacy Understanding	Yanda Li et.al.	2404.04293v1	null
2024-04-05	Physical Property Understanding from Language-Embedded Feature Fields	Albert J. Zhai et.al.	2404.04242v1	null
2024-04-05	Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents	Harsh Kohli et.al.	2404.04237v1	null
2024-04-05	Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer	Hele-Andra Kuulmets et.al.	2404.04042v1	null
2024-04-05	Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning	Gawon Choi et.al.	2404.03891v1	link
2024-04-08	SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models	Hyeonwoo Kim et.al.	2404.03887v2	null
2024-04-04	Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra	Darioush Kevian et.al.	2404.03647v1	null
2024-04-04	Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph	Marco Bronzini et.al.	2404.03623v1	null
2024-04-04	Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models	Wenshan Wu et.al.	2404.03622v1	null
2024-04-04	Sailor: Open Language Models for South-East Asia	Longxu Dou et.al.	2404.03608v1	link
2024-04-04	Evaluating LLMs at Detecting Errors in LLM Responses	Ryo Kamoi et.al.	2404.03602v1	link
2024-04-04	Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models	Yantao Liu et.al.	2404.03577v1	link
2024-04-04	Edisum: Summarizing and Explaining Wikipedia Edits at Scale	Marija Šakota et.al.	2404.03428v1	link
2024-04-04	Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought	Jooyoung Lee et.al.	2404.03414v1	null
2024-04-04	nicolay-r at SemEval-2024 Task 3: Using Flan-T5 for Reasoning Emotion Cause in Conversations with Chain-of-Thought on Emotion States	Nicolay Rusnachenko et.al.	2404.03361v1	link
2024-04-04	Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics	Fangru Lin et.al.	2404.03301v1	link
2024-04-04	The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models	Noah Y. Siegel et.al.	2404.03189v1	null
2024-04-04	Robust Pronoun Use Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased?	Vagrant Gautam et.al.	2404.03134v1	link
2024-04-10	An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models	Emmy Liu et.al.	2404.03028v2	null
2024-04-03	Towards a Fully Interpretable and More Scalable RSA Model for Metaphor Understanding	Gaia Carenini et.al.	2404.02983v1	null
2024-04-03	Explainable Traffic Flow Prediction with Large Language Models	Xusen Guo et.al.	2404.02937v1	null
2024-04-03	KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking	Jiawei Zhang et.al.	2404.02935v1	link
2024-04-03	GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning	Jeffy Yu et.al.	2404.02934v1	null
2024-04-03	I-Design: Personalized LLM Interior Designer	Ata Çelen et.al.	2404.02838v1	null
2024-04-03	Empowering Biomedical Discovery with AI Agents	Shanghua Gao et.al.	2404.02831v1	null
2024-04-05	A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches	Zhigen Zhao et.al.	2404.02817v2	null
2024-04-03	Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models	Hyungjoo Chae et.al.	2404.02575v1	null
2024-04-03	VIAssist: Adapting Multi-modal Large Language Models for Users with Visual Impairments	Bufang Yang et.al.	2404.02508v1	null
2024-04-03	Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT	Amirhossein Abaskohi et.al.	2404.02403v1	link
2024-04-02	$\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning	Gurusha Juneja et.al.	2404.02255v1	null
2024-04-02	Advancing LLM Reasoning Generalists with Preference Trees	Lifan Yuan et.al.	2404.02078v1	link
2024-04-04	Long-context LLMs Struggle with Long In-context Learning	Tianle Li et.al.	2404.02060v2	link
2024-04-02	Large Language Models for Orchestrating Bimanual Robots	Kun Chu et.al.	2404.02018v1	null
2024-04-13	HyperCLOVA X Technical Report	Kang Min Yoo et.al.	2404.01954v2	null
2024-04-02	Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey	Philipp Mondorf et.al.	2404.01869v1	null
2024-04-02	Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation	Shanshan Feng et.al.	2404.01855v1	link
2024-04-03	Towards Generalizable and Faithful Logic Reasoning over Natural Language via Resolution Refutation	Zhouhao Sun et.al.	2404.01677v2	null
2024-04-02	METAL: Towards Multilingual Meta-Evaluation	Rishav Hada et.al.	2404.01667v1	null
2024-04-02	InsightLens: Discovering and Exploring Insights from Conversational Contexts in Large-Language-Model-Powered Data Analysis	Luoxuan Weng et.al.	2404.01644v1	null
2024-04-01	Syntactic Robustness for LLM-based Code Generation	Laboni Sarker et.al.	2404.01535v1	null
2024-04-01	Are large language models superhuman chemists?	Adrian Mirza et.al.	2404.01475v1	null
2024-04-01	Will the Real Linda Please Stand up...to Large Language Models? Examining the Representativeness Heuristic in LLMs	Pengda Wang et.al.	2404.01461v1	null
2024-03-31	CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs	Jingzhe Shi et.al.	2404.01343v1	null
2024-04-01	FABLES: Evaluating faithfulness and content selection in book-length summarization	Yekyung Kim et.al.	2404.01261v1	link
2024-04-01	A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules	Xiang Li et.al.	2404.01245v1	null
2024-04-01	LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models	Yadong Zhang et.al.	2404.01230v1	null
2024-04-01	Enhancing Reasoning Capacity of SLM using Cognitive Enhancement	Jonathan Pan et.al.	2404.01135v1	null
2024-04-01	Enabling Memory Safety of C Programs using LLMs	Nausheen Mohammed et.al.	2404.01096v1	null
2024-04-01	Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning	Rongjie Li et.al.	2404.00909v1	null
2024-04-02	An Abundance of Katherines: The Game Theory of Baby Naming	Katy Blumer et.al.	2404.00732v2	null
2024-03-30	Multi-hop Question Answering under Temporal Knowledge Editing	Keyuan Cheng et.al.	2404.00492v1	null
2024-04-04	Planning and Editing What You Retrieve for Enhanced Tool Learning	Tenghao Huang et.al.	2404.00450v2	link
2024-03-30	Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks	Hyunjae Kim et.al.	2404.00376v1	null
2024-03-30	Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange	Ankit Satpute et.al.	2404.00344v1	link
2024-03-30	Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World	Guande Wu et.al.	2404.00246v1	link
2024-03-30	Aligning Large Language Models with Recommendation Knowledge	Yuwei Cao et.al.	2404.00245v1	null
2024-03-30	DeFT: Flash Tree-attention with IO-Awareness for Efficient Tree-search-based LLM Inference	Jinwei Yao et.al.	2404.00242v1	null
2024-03-30	Multi-Conditional Ranking with Large Language Models	Pouya Pezeshkpour et.al.	2404.00211v1	link
2024-03-30	EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs	Cheng Jiayang et.al.	2404.00209v1	link
2024-03-30	Conceptual and Unbiased Reasoning in Language Models	Ben Zhou et.al.	2404.00205v1	null
2024-03-29	Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections	Ahmad Diab et.al.	2404.00141v1	null
2024-03-29	Measuring Taiwanese Mandarin Language Understanding	Po-Heng Chen et.al.	2403.20180v1	null
2024-03-29	ITCMA: A Generative Agent Based on a Computational Consciousness Structure	Hanzhong Zhang et.al.	2403.20097v1	null
2024-03-29	On Large Language Models' Hallucination with Regard to Known Facts	Che Jiang et.al.	2403.20009v1	null
2024-03-29	Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning	Qinhao Zhou et.al.	2403.19962v1	null
2024-03-28	LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces	Xiaomin Ouyang et.al.	2403.19857v1	null
2024-03-28	Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving	Akshay Gopalkrishnan et.al.	2403.19838v1	link
2024-03-28	Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models	Yucheng Shi et.al.	2403.19631v1	null
2024-03-28	BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation	Yuhong He et.al.	2403.19414v1	null
2024-03-28	RAIL: Robot Affordance Imagination with Large Language Models	Ceng Zhang et.al.	2403.19369v1	null
2024-03-28	IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation	Jiacui Huang et.al.	2403.19336v1	null
2024-03-28	Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models	Jiaxing Chen et.al.	2403.19322v1	null
2024-04-01	TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios	Xiaokang Zhang et.al.	2403.19318v2	link
2024-03-28	Mitigating Misleading Chain-of-Thought Reasoning with Selective Filtering	Yexin Wu et.al.	2403.19167v1	null
2024-03-28	MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering	Che Guan et.al.	2403.19116v1	null
2024-03-28	Learning From Correctness Without Prompting Makes LLM Efficient Reasoner	Yuxuan Yao et.al.	2403.19094v1	null
2024-03-27	LITA: Language Instructed Temporal-Localization Assistant	De-An Huang et.al.	2403.19046v1	link
2024-03-27	Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models	Yanwei Li et.al.	2403.18814v1	link
2024-04-03	Long-form factuality in large language models	Jerry Wei et.al.	2403.18802v3	link
2024-03-27	A Path Towards Legal Autonomy: An interoperable and explainable approach to extracting, transforming, loading and computing legal information using large language models, expert systems and Bayesian networks	Axel Constant et.al.	2403.18537v1	null
2024-03-27	TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions	Jamshid Mozafari et.al.	2403.18426v1	link
2024-03-27	The Topos of Transformer Networks	Mattia Jacopo Villani et.al.	2403.18415v1	null
2024-03-27	An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM	Wonkyun Kim et.al.	2403.18406v1	link
2024-03-27	Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval	Shengjie Ma et.al.	2403.18405v1	null
2024-03-27	BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models	Haitao Li et.al.	2403.18365v1	null
2024-04-03	Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective	Meiqi Chen et.al.	2403.18346v3	null
2024-03-27	LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models	Mingxing Peng et.al.	2403.18344v1	null
2024-03-27	Dual Instruction Tuning with Large Language Models for Mathematical Reasoning	Yongwei Zhou et.al.	2403.18295v1	null
2024-03-27	Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models	Yiwu Zhong et.al.	2403.18252v1	link
2024-03-27	Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation	Chuwen Wang et.al.	2403.18230v1	link
2024-03-28	Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models	Kartikeya Bhardwaj et.al.	2403.18159v2	null
2024-03-26	Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization	Jin Peng Zhou et.al.	2403.18120v1	link
2024-03-26	ShapeGrasp: Zero-Shot Task-Oriented Grasping with Large Language Models through Geometric Decomposition	Samuel Li et.al.	2403.18062v1	null
2024-03-26	MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution	Wei Tao et.al.	2403.17927v1	null
2024-03-26	Assessment of Multimodal Large Language Models in Alignment with Human Values	Zhelun Shi et.al.	2403.17830v1	null
2024-03-26	Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons	Shijia Zhou et.al.	2403.17760v1	link
2024-03-26	Large Language Models Enhanced Collaborative Filtering	Zhongxiang Sun et.al.	2403.17688v1	null
2024-03-26	DGoT: Dynamic Graph of Thoughts for Scientific Abstract Generation	Xinyu Ning et.al.	2403.17491v1	link
2024-03-26	ChatGPT Rates Natural Language Explanation Quality Like Humans: But on Which Scales?	Fan Huang et.al.	2403.17368v1	link
2024-03-26	Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models	Zhenyu Pan et.al.	2403.17359v1	null
2024-03-25	TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models	Ishika Singh et.al.	2403.17246v1	null
2024-03-25	A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection	Benjamin Steenhoek et.al.	2403.17218v1	null
2024-03-25	Grounding Language Plans in Demonstrations Through Counterfactual Perturbations	Yanwei Wang et.al.	2403.17124v1	null
2024-03-25	Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models	Hao Shao et.al.	2403.16999v1	link
2024-03-25	PropTest: Automatic Property Testing for Improved Visual Programming	Jaywon Koo et.al.	2403.16921v1	null
2024-03-25	Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art	Neeloy Chakraborty et.al.	2403.16527v1	null
2024-03-25	Harnessing the power of LLMs for normative reasoning in MASs	Bastin Tony Roy Savarimuthu et.al.	2403.16524v1	null
2024-03-25	Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study	Shawn He et.al.	2403.16517v1	null
2024-03-25	Evaluating Large Language Models with Runtime Behavior of Program Execution	Junkai Chen et.al.	2403.16437v1	null
2024-03-27	Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation	Ziyan Wang et.al.	2403.16427v3	null
2024-03-28	Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA	Zhuowan Li et.al.	2403.16385v2	null
2024-03-28	Can Language Models Pretend Solvers? Logic Code Simulation with LLMs	Minyu Chen et.al.	2403.16097v2	null
2024-03-24	Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications	Wei Ma et.al.	2403.16073v1	null
2024-03-23	Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning	Zhouhang Xie et.al.	2403.15737v1	null
2024-03-23	LLMs Instruct LLMs: An Extraction and Editing Method	Xin Zhang et.al.	2403.15736v1	null
2024-03-21	Open Source Conversational LLMs do not know most Spanish words	Javier Conde et.al.	2403.15491v1	null
2024-03-19	LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction	Hejie Cui et.al.	2403.15464v1	null
2024-04-01	LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models	Yuzhang Shang et.al.	2403.15388v3	null
2024-03-22	Can large language models explore in-context?	Akshay Krishnamurthy et.al.	2403.15371v1	null
2024-03-22	CoLLEGe: Concept Embedding Generation for Large Language Models	Ryan Teehan et.al.	2403.15362v1	null
2024-03-22	Sphere Neural-Networks for Rational Reasoning	Tiansi Dong et.al.	2403.15297v1	null
2024-03-22	MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection	Taeheon Kim et.al.	2403.15209v1	null
2024-03-22	CACA Agent: Capability Collaboration based AI Agent	Peng Xu et.al.	2403.15137v1	null
2024-04-03	MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thoughts	Md Nishat Raihan et.al.	2403.14982v2	null
2024-03-22	Attention-Driven Reasoning: Unlocking the Potential of Large Language Models	Bingli Liao et.al.	2403.14932v1	null
2024-03-25	VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding	Ahmad Mahmood et.al.	2403.14743v2	null
2024-03-21	MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?	Renrui Zhang et.al.	2403.14624v1	null
2024-03-21	A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science	Clayton Cohn et.al.	2403.14565v1	null
2024-03-21	ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting	Xiaoxue Cheng et.al.	2403.14312v1	link
2024-03-21	ERD: A Framework for Improving LLM Reasoning for Cognitive Distortion Classification	Sehee Lim et.al.	2403.14255v1	null
2024-03-23	K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression	Kyuhee Kim et.al.	2403.14253v2	link
2024-03-21	Empowering Segmentation Ability to Multi-modal Large Language Models	Yuqi Yang et.al.	2403.14141v1	null
2024-03-21	Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations	Jiaxing Sun et.al.	2403.14112v1	link
2024-03-21	Empowering Personalized Learning through a Conversation-based Tutoring System with Student Modeling	Minju Park et.al.	2403.14071v1	null
2024-03-14	Circuit Transformer: End-to-end Circuit Design by Predicting the Next Gate	Xihan Li et.al.	2403.13838v1	null
2024-03-23	Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts	Guangzeng Han et.al.	2403.13786v2	null
2024-03-22	Llama meets EU: Investigating the European Political Spectrum through the Lens of LLMs	Ilias Chalkidis et.al.	2403.13592v2	link
2024-03-20	PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns	Yew Ken Chia et.al.	2403.13315v1	link
2024-03-20	LeanReasoner: Boosting Complex Logical Reasoning with Lean	Dongwei Jiang et.al.	2403.13312v1	link
2024-03-20	Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs	Zhihong Sun et.al.	2403.13271v1	null
2024-03-19	VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning	Yongshuo Zong et.al.	2403.13164v1	link
2024-03-13	AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models	Shuo Jiang et.al.	2403.13002v1	null
2024-03-11	Prompt Selection and Augmentation for Few Examples Code Generation in Large Language Model and its Application in Robotics Control	On Tai Wu et.al.	2403.12999v1	null
2024-03-19	Dated Data: Tracing Knowledge Cutoffs in Large Language Models	Jeffrey Cheng et.al.	2403.12958v1	null
2024-03-19	Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models	Joana Ribeiro de Faria et.al.	2403.12936v1	null
2024-03-19	mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding	Anwen Hu et.al.	2403.12895v1	link
2024-03-19	HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning	Fucai Ke et.al.	2403.12884v1	null
2024-03-19	Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models	Zehui Chen et.al.	2403.12881v1	link
2024-03-19	Compositional 3D Scene Synthesis with Scene Graph Guided Layout-Shape Generation	Yao Wei et.al.	2403.12848v1	null
2024-03-19	RelationVLM: Making Large Vision-Language Models Understand Visual Relations	Zhipeng Huang et.al.	2403.12801v1	null
2024-03-18	NovelQA: A Benchmark for Long-Range Novel Question Answering	Cunxiang Wang et.al.	2403.12766v1	link
2024-03-19	Instructing Large Language Models to Identify and Ignore Irrelevant Conditions	Zhenyu Wu et.al.	2403.12744v1	link
2024-03-19	Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs	Victor Carbune et.al.	2403.12596v1	null
2024-03-19	AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework	Xiang Li et.al.	2403.12582v1	link
2024-03-19	To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions	Daniel Tanneberg et.al.	2403.12533v1	null
2024-03-19	Embodied LLM Agents Learn to Cooperate in Organized Teams	Xudong Guo et.al.	2403.12482v1	null
2024-03-19	Dr3: Ask Large Language Models Not to Give Off-Topic Answers in Open Domain Multi-Hop Question Answering	Yuan Gao et.al.	2403.12393v1	null
2024-03-22	RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners	Chi Hu et.al.	2403.12373v3	null
2024-03-18	OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety	Chuang Liu et.al.	2403.12316v1	null
2024-03-18	TnT-LLM: Text Mining at Scale with Large Language Models	Mengting Wan et.al.	2403.12173v1	null
2024-03-18	EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents	Abhay Zala et.al.	2403.12014v1	null
2024-03-18	QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction	Xiang Huang et.al.	2403.11886v1	null
2024-03-18	Agent3D-Zero: An Agent for Zero-shot 3D Understanding	Sha Zhang et.al.	2403.11835v1	null
2024-03-18	Metaphor Understanding Challenge Dataset for LLMs	Xiaoyu Tong et.al.	2403.11810v1	null
2024-03-25	Counting-Stars: A Simple, Efficient, and Reasonable Strategy for Evaluating Long-Context Large Language Models	Mingyang Song et.al.	2403.11802v2	link
2024-03-18	Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus	Seungpil Lee et.al.	2403.11793v1	null
2024-03-20	LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning	Shu Wang et.al.	2403.11552v2	link
2024-03-22	Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning	Rao Fu et.al.	2403.11401v2	null
2024-03-17	ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models	Siyuan Huang et.al.	2403.11289v1	link
2024-03-17	Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering	Baiyan Zhang et.al.	2403.11129v1	null
2024-03-17	GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment	Lance Ying et.al.	2403.11075v1	null
2024-03-26	SelfIE: Self-Interpretation of Large Language Model Embeddings	Haozhe Chen et.al.	2403.10949v2	link
2024-03-16	BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English	Sheikh Shafayat et.al.	2403.10900v1	link
2024-03-16	A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment	Tianhe Wu et.al.	2403.10854v1	link
2024-03-16	NARRATE: Versatile Language Architecture for Optimal Control in Robotics	Seif Ismail et.al.	2403.10762v1	null
2024-03-15	VideoAgent: Long-form Video Understanding with Large Language Model as Agent	Xiaohan Wang et.al.	2403.10517v1	null
2024-03-15	Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization	Ratnadira Widyasari et.al.	2403.10507v1	null
2024-03-15	HawkEye: Training Video-Text LLMs for Grounding Text in Videos	Yueqian Wang et.al.	2403.10228v1	link
2024-03-15	AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation	Arkajit Datta et.al.	2403.10171v1	null
2024-03-15	RAFT: Adapting Language Model to Domain Specific RAG	Tianjun Zhang et.al.	2403.10131v1	link
2024-03-15	Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning	Hang Zhang et.al.	2403.10107v1	null
2024-03-15	Knowledge Condensation and Reasoning for Knowledge-based VQA	Dongze Hao et.al.	2403.10037v1	null
2024-03-15	ViTCN: Vision Transformer Contrastive Network For Reasoning	Bo Song et.al.	2403.09962v1	null
2024-03-14	Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models	Zhuoqun Li et.al.	2403.09750v1	link
2024-03-14	Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors	Guanghua Li et.al.	2403.09747v1	null
2024-03-13	Do Large Language Models Solve ARC Visual Analogies Like People Do?	Gustaw Opiełka et.al.	2403.09734v1	null
2024-03-14	3D-VLA: A 3D Vision-Language-Action Generative World Model	Haoyu Zhen et.al.	2403.09631v1	null
2024-03-22	MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training	Brandon McKinzie et.al.	2403.09611v3	null
2024-03-14	Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey	Xiaoyu Liu et.al.	2403.09606v1	null
2024-03-14	Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis	Gregory Coppola et.al.	2403.09599v1	null
2024-03-15	ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models	Runyu Ma et.al.	2403.09583v2	null
2024-03-22	Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation	Yunhao Gou et.al.	2403.09572v2	null
2024-03-21	Less is More: Data Value Estimation for Visual Instruction Tuning	Zikang Liu et.al.	2403.09559v2	null
2024-03-14	Exploring the Comprehension of ChatGPT in Traditional Chinese Medicine Knowledge	Li Yizhen et.al.	2403.09164v1	null
2024-03-14	Caveat Lector: Large Language Models in Legal Practice	Eliza Mik et.al.	2403.09163v1	null
2024-03-14	USimAgent: Large Language Models for Simulating Search Users	Erhan Zhang et.al.	2403.09142v1	null
2024-03-14	Meaningful Learning: Advancing Abstract Reasoning in Large Language Models via Generic Fact Guidance	Kai Xiong et.al.	2403.09085v1	null
2024-03-14	Query Rewriting via Large Language Models	Jie Liu et.al.	2403.09060v1	null
2024-03-13	Usable XAI: 10 Strategies Towards Exploiting Explainability in the LLM Era	Xuansheng Wu et.al.	2403.08946v1	link
2024-03-13	AcademiaOS: Automating Grounded Theory Development in Qualitative Research with Large Language Models	Thomas Übellacker et.al.	2403.08844v1	link
2024-03-13	TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation	Dingbang Li et.al.	2403.08833v1	null
2024-03-13	Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework	Jingling Li et.al.	2403.08743v1	null
2024-03-13	The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models	Carlo Nicolini et.al.	2403.08739v1	null
2024-03-14	Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation	Daniel Honerkamp et.al.	2403.08605v2	link
2024-03-13	Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments	Sitao Cheng et.al.	2403.08593v1	null
2024-03-13	CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model	Cheng Chen et.al.	2403.08350v1	link
2024-03-13	LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments	Maonan Wang et.al.	2403.08337v1	link
2024-03-13	Can Large Language Models Identify Authorship?	Baixiang Huang et.al.	2403.08213v1	link
2024-03-13	Large Language Models are Contrastive Reasoners	Liang Yao et.al.	2403.08211v1	link
2024-03-12	DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies	William Xie et.al.	2403.07832v1	null
2024-03-12	Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM	Sainbayar Sukhbaatar et.al.	2403.07816v1	null
2024-03-12	Fine-tuning Large Language Models with Sequential Instructions	Hanxu Hu et.al.	2403.07794v1	link
2024-03-15	Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations	Carlos Jose Xavier Cruz et.al.	2403.07769v3	link
2024-03-12	FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models	Yan Liu et.al.	2403.07747v1	null
2024-03-12	Multi-modal Auto-regressive Modeling via Visual Words	Tianshuo Peng et.al.	2403.07720v1	link
2024-03-12	DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models	Yuanfei Lin et.al.	2403.07470v1	link
2024-03-12	Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs	Tianqing Fang et.al.	2403.07398v1	null
2024-03-12	NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning	Bingqian Lin et.al.	2403.07376v1	link
2024-03-11	Narrating Causal Graphs with Large Language Models	Atharva Phatak et.al.	2403.07118v1	null
2024-03-13	Naming, Describing, and Quantifying Visual Objects in Humans and LLMs	Alberto Testoni et.al.	2403.06935v2	link
2024-03-11	ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis	Yanming Liu et.al.	2403.06932v1	link
2024-03-11	RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback	Yanming Liu et.al.	2403.06840v1	link
2024-03-11	KELLMRec: Knowledge-Enhanced Large Language Models for Recommendation	Weiqing Luo et.al.	2403.06642v1	null
2024-03-11	**Guiding Clinical Reasoning with Large Language Models via K

YiQi0318/LLMs_daily_arxiv
