Curiosity-Driven Reinforcement Learning from Human Feedback (CD-RLHF): An AI Framework that Mitigates the Diversity...

Large Language Models (LLMs) have become increasingly reliant on Reinforcement Learning from...

Memorization vs. Generalization: How Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) Shape Foundation Model...

Modern AI systems rely heavily on post-training techniques like supervised fine-tuning (SFT)...

Meta AI Proposes EvalPlanner: A Preference Optimization Algorithm for Thinking-LLM-as-a-Judge

The rapid advancement of Large Language Models (LLMs) has significantly improved their...

Agentic AI: The Foundations Based on Perception Layer, Knowledge Representation and Memory Systems

Agentic AI stands at the intersection of autonomy, intelligence, and adaptability, offering...

From Deep Knowledge Tracing to DKT2: A Leap Forward in Educational AI

Knowledge Tracing (KT) plays a crucial role in Intelligent Tutoring Systems (ITS) by modeling students’ knowledge states and predicting their future performance. Traditional KT...

Baidu Research Introduces EICopilot: An Intelligent Agent-based Chatbot to Retrieve and Interpret Enterprise Information...

Knowledge graphs have lately been used extensively in the enterprise domain,...

Open Thoughts: An Open Source Initiative Advancing AI Reasoning with High-Quality Datasets and Models...

The critical issue of restricted access to high-quality reasoning datasets has limited...

Decoupling Tokenization: How Over-Tokenized Transformers Redefine Vocabulary Scaling in Language Models

Tokenization plays a fundamental role in the performance and scalability of Large...

Quantization Space Utilization Rate (QSUR): A Novel Post-Training Quantization Method Designed to Enhance the...

Post-training quantization (PTQ) focuses on reducing the size and improving the speed...

YuE: An Open-Source Music Generation AI Model Family Capable of Creating Full-Length Songs with...

Significant progress has been made in short-form instrumental compositions in AI and...
