Latest Large Language Models Research Papers

Research on large language models including GPT, Claude, Llama, and other transformer-based architectures for natural language understanding and generation.

49 Papers

ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

Tao Liu, Taiqiang Wu, Runming Yang +3 more

Supervised fine-tuning (SFT) is a fundamental post-training strategy to align Large Language Models (LLMs) with human intent. However, traditional SFT often ignores the one-to-many nature of language by forcing alignment with a single reference answer, leading to the model overfitting to non-core ex...

supervised fine-tuning, Large Language Models, one-to-many nature, token probability, semantic importance, +2 more
Jan 14, 2026 · 13
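
A minimal sketch of the probability-guided token weighting idea gestured at in the abstract: per-token SFT loss is kept only for reference tokens the model assigns low probability to. The selection rule, function name, and the keep_fraction parameter are illustrative assumptions, not ProFit's actual criterion.

```python
import torch
import torch.nn.functional as F

def token_weighted_sft_loss(logits, labels, keep_fraction=0.5, ignore_index=-100):
    """Keep only a fraction of tokens per batch, chosen by the model's own
    probability for the reference token (lowest-probability tokens kept).
    Generic sketch, not the paper's selection rule."""
    vocab = logits.size(-1)
    per_token = F.cross_entropy(
        logits.reshape(-1, vocab), labels.reshape(-1),
        ignore_index=ignore_index, reduction="none",
    ).reshape(labels.shape)

    valid = labels.ne(ignore_index)
    # Probability the model assigns to the reference token (higher = "easier").
    ref_prob = torch.exp(-per_token)

    # Treat the hardest (lowest-probability) tokens as the high-value signal.
    k = max(1, int(keep_fraction * valid.sum().item()))
    flat = ref_prob.masked_fill(~valid, float("inf")).reshape(-1)
    threshold = flat.kthvalue(k).values
    selected = valid & (ref_prob <= threshold)

    return (per_token * selected).sum() / selected.sum().clamp(min=1)
```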

YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

Abdelaziz Bounhar, Rania Hossam Elmohamady Elbadry, Hadi Abdine +3 more

Steering Large Language Models (LLMs) through activation interventions has emerged as a lightweight alternative to fine-tuning for alignment and personalization. Recent work on Bi-directional Preference Optimization (BiPO) shows that dense steering vectors can be learned directly from preference dat...

Large Language Models, activation interventions, fine-tuning, Bi-directional Preference Optimization, Direct Preference Optimization, +13 more
Jan 13, 2026 · 6
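
For readers unfamiliar with activation steering, here is a minimal sketch of intervening on one layer's hidden states with an additive vector. The hook mechanics are standard PyTorch; the layer path, placeholder vector, and alpha scale are assumptions, and YaPO/BiPO learn the vector (and its sparsity) from preference data rather than fixing it by hand.

```python
import torch

def add_steering_hook(layer, steering_vector, alpha=1.0):
    """Register a forward hook that adds a steering vector to a layer's
    hidden states at every position. Illustrative only."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * steering_vector.to(hidden.device, hidden.dtype)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden

    return layer.register_forward_hook(hook)

# Usage sketch (module path assumes a Llama-style Hugging Face model):
# layer = model.model.layers[15]
# v = torch.zeros(model.config.hidden_size); v[:128] = 0.1  # placeholder vector
# handle = add_steering_hook(layer, v, alpha=4.0)
# ... generate ...
# handle.remove()
```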

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Yao Tang, Li Dong, Yaru Hao +3 more

Large language models often solve complex reasoning tasks more effectively with Chain-of-Thought (CoT), but at the cost of long, low-bandwidth token sequences. Humans, by contrast, often reason softly by maintaining a distribution over plausible next steps. Motivated by this, we propose Multiplex Th...

Chain-of-Thought, stochastic soft reasoning, multiplex token, continuous multiplex token, on-policy reinforcement learning, +3 more
Jan 13, 2026 · 33
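
A rough sketch of the "soft" decoding idea the abstract alludes to: instead of committing to one sampled token, the next input is a probability-weighted mixture of the top-k token embeddings. This is a generic soft-reasoning illustration under assumed Hugging Face-style interfaces, not the paper's multiplex-token construction.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def soft_token_step(model, embed_matrix, inputs_embeds, k=8, temperature=1.0):
    """One soft decoding step: append a probability-weighted mixture of the
    top-k next-token embeddings instead of a single sampled token."""
    out = model(inputs_embeds=inputs_embeds)
    logits = out.logits[:, -1, :] / temperature
    probs, idx = F.softmax(logits, dim=-1).topk(k, dim=-1)    # (batch, k)
    probs = probs / probs.sum(dim=-1, keepdim=True)           # renormalize over top-k
    branch_embeds = embed_matrix[idx]                          # (batch, k, hidden)
    merged = (probs.unsqueeze(-1) * branch_embeds).sum(dim=1)  # (batch, hidden)
    return torch.cat([inputs_embeds, merged.unsqueeze(1)], dim=1)
```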

Enhancing Sentiment Classification and Irony Detection in Large Language Models through Advanced Prompt Engineering Techniques

Marvin Schmitt, Anne Schwerk, Sebastian Lempert

This study investigates the use of prompt engineering to enhance large language models (LLMs), specifically GPT-4o-mini and gemini-1.5-flash, in sentiment analysis tasks. It evaluates advanced prompting techniques like few-shot learning, chain-of-thought prompting, and self-consistency against a bas...

prompt engineering, large language models, few-shot learning, chain-of-thought prompting, self-consistency, +3 more
Jan 13, 2026 · 4
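
Of the techniques listed, self-consistency is the most mechanical; a minimal sketch follows. The ask_llm callable, its return signature, and the agreement score are placeholders, not the study's actual setup.

```python
from collections import Counter

def self_consistency(ask_llm, prompt, n_samples=5, temperature=0.7):
    """Sample several chain-of-thought completions and return the
    majority-vote label. `ask_llm(prompt, temperature)` is a placeholder
    for whatever API call returns (reasoning, final_label)."""
    votes = []
    for _ in range(n_samples):
        _reasoning, label = ask_llm(prompt, temperature=temperature)
        votes.append(label.strip().lower())
    winner, count = Counter(votes).most_common(1)[0]
    return winner, count / n_samples  # label plus an agreement score
```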

Entropy Sentinel: Continuous LLM Accuracy Monitoring from Decoding Entropy Traces in STEM

Pedro Memoli Buffa, Luciano Del Corro

Deploying LLMs raises two coupled challenges: (1) monitoring - estimating where a model underperforms as traffic and domains drift - and (2) improvement - prioritizing data acquisition to close the largest performance gaps. We test whether an inference-time signal can estimate slice-level accuracy u...

output-entropy profile, next-token probabilities, final-layer top-k logprobs, instance correctness, +4 more
Jan 13, 2026 · 15
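
A minimal sketch of the kind of inference-time signal the abstract describes: per-token entropy estimated from top-k logprobs, averaged over a response. The renormalization over the truncated distribution and the aggregation are illustrative assumptions; the paper's exact estimator and slice-level monitoring may differ.

```python
import math

def mean_decoding_entropy(topk_logprobs):
    """Average next-token entropy over a response, estimated from the top-k
    log-probabilities at each decoding step (tail mass outside the top k
    is ignored)."""
    entropies = []
    for step in topk_logprobs:               # step: list of logprobs for one token
        probs = [math.exp(lp) for lp in step]
        total = sum(probs)
        probs = [p / total for p in probs]   # renormalize the truncated distribution
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0))
    return sum(entropies) / max(len(entropies), 1)
```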

RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

Sunzhu Li, Jiale Zhao, Miteto Wei +6 more

Reinforcement Learning with Verifiable Rewards (RLVR) has driven substantial progress in reasoning-intensive domains like mathematics. However, optimizing open-ended generation remains challenging due to the lack of ground truth. While rubric-based evaluation offers a structured proxy for verificati...

Reinforcement Learning with Verifiable Rewards, rubric-based evaluation, principle-guided synthesis, multi-model aggregation, difficulty evolution, +3 more
Jan 13, 2026 · 50