| # | Title | Date | Count | ~Words | Status |
|---|-------|------|-------|--------|--------|
| 2 | The State Of LLMs 2025: Progress, Problems, and Predictions | JUL 19, 2025 | 28 | ~6857 | ✅ Complete |
| 1 | Categories of Inference-Time Scaling for Improved LLM Reasoning | JUL 19, 2025 | 17 | ~7335 | ✅ Complete |
| 3 | LLM Research Papers: The 2025 List (July to December) | JUL 19, 2025 | 16 | ~3314 | ✅ Complete |
| 4 | From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates | JUL 19, 2025 | 22 | ~5889 | ✅ Complete |
| 5 | Beyond Standard LLMs | JUL 19, 2025 | 28 | ~7404 | ✅ Complete |
| 6 | Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) | JUL 19, 2025 | 18 | ~6473 | ✅ Complete |
| 7 | Understanding and Implementing Qwen3 From Scratch | JUL 19, 2025 | 21 | ~7964 | ✅ Complete |
| 8 | From GPT-2 to gpt-oss: Analyzing the Architectural Advances | JUL 19, 2025 | 26 | ~5647 | ✅ Complete |
| 9 | The Big LLM Architecture Comparison | FEB 5, 2025 | 60 | ~12056 | ✅ Complete |
| 10 | LLM Research Papers: The 2025 List (January to June) | JUL 19, 2025 | 11 | ~4119 | ✅ Complete |
| 11 | Understanding and Coding the KV Cache in LLMs from Scratch | JUL 19, 2025 | 15 | ~3110 | ✅ Complete |
| 12 | Coding LLMs from the Ground Up: A Complete Course | JUL 19, 2025 | 1 | ~1017 | ✅ Complete |
| 13 | The State of Reinforcement Learning for LLM Reasoning | JUL 19, 2025 | 36 | ~7957 | ✅ Complete |
| 14 | First Look at Reasoning From Scratch: Chapter 1 | JUL 19, 2025 | 7 | ~3935 | ✅ Complete |
| 15 | The State of LLM Reasoning Model Inference | JUL 19, 2025 | 26 | ~4627 | ✅ Complete |
| 16 | Understanding Reasoning LLMs | JUL 19, 2025 | 18 | ~4363 | ✅ Complete |
| 17 | Noteworthy AI Research Papers of 2024 (Part Two) | JUL 19, 2025 | 22 | ~5751 | ✅ Complete |
| 18 | Noteworthy AI Research Papers of 2024 (Part One) | JUL 19, 2025 | 11 | ~3402 | ✅ Complete |
| 19 | LLM Research Papers: The 2024 List | JUL 19, 2025 | 2 | ~6515 | ✅ Complete |
| 20 | Understanding Multimodal LLMs | JUL 19, 2025 | 31 | ~5021 | ✅ Complete |
| 21 | Building A GPT-Style LLM Classifier From Scratch | JUL 19, 2025 | 21 | ~4475 | ✅ Complete |
| 22 | Building LLMs from the Ground Up: A 3-hour Coding Workshop | JUL 19, 2025 | 1 | ~355 | ✅ Complete |
| 23 | New LLM Pre-training and Post-training Paradigms | JUL 19, 2025 | 21 | ~4595 | ✅ Complete |
| 24 | Instruction Pretraining LLMs | JUL 19, 2025 | 18 | ~6797 | ✅ Complete |
| 25 | Developing an LLM: Building, Training, Finetuning | JUL 19, 2025 | 0 | ~204 | ✅ Complete |
| 26 | LLM Research Insights: Instruction Masking and New LoRA Finetuning Experiments | JUL 19, 2025 | 19 | ~4398 | ✅ Complete |
| 27 | How Good Are the Latest Open LLMs? And Is DPO Better Than PPO? | JUL 19, 2025 | 19 | ~5651 | ✅ Complete |
| 28 | Using and Finetuning Pretrained Transformers | JUL 19, 2025 | 9 | ~3541 | ✅ Complete |
| 29 | Tips for LLM Pretraining and Evaluating Reward Models | JUL 19, 2025 | 15 | ~5333 | ✅ Complete |
| 30 | A LoRA Successor, Small Finetuned LLMs Vs Generalist LLMs, and Transparent LLM Research | JUL 19, 2025 | 18 | ~5229 | ✅ Complete |
| 31 | Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch | JUL 19, 2025 | 13 | ~3241 | ✅ Complete |
| 32 | Model Merging, Mixtures of Experts, and Towards Smaller LLMs | JUL 19, 2025 | 21 | ~5792 | ✅ Complete |
| 33 | Understanding and Coding Self-Attention, Multi-Head Attention, Causal-Attention, and Cross-Attention in LLMs | JUL 19, 2025 | 21 | ~4997 | ✅ Complete |
| 34 | Ten Noteworthy AI Research Papers of 2023 | JUL 19, 2025 | 27 | ~4553 | ✅ Complete |
| 35 | Tackling Hallucinations, Boosting Reasoning Abilities, and New Insights into the Transformer Architecture | JUL 19, 2025 | 19 | ~5369 | ✅ Complete |
| 36 | Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation) | JUL 19, 2025 | 17 | ~3605 | ✅ Complete |
| 37 | A Potential Successor to RLHF for Efficient LLM Alignment and the Resurgence of CNNs | JUL 19, 2025 | 11 | ~3559 | ✅ Complete |
| 38 | AI and Open Source in 2023 | JUL 19, 2025 | 18 | ~3238 | ✅ Complete |
| 39 | LLM Business and Busyness: Recent Company Investments and AI Adoption, New Small Openly Available LLMs, and LoRA Research | JUL 19, 2025 | 14 | ~3326 | ✅ Complete |
| 40 | From Self-Alignment to LongLoRA | JUL 19, 2025 | 23 | ~1976 | ✅ Complete |
| 41 | LLM Training: RLHF and Its Alternatives | JUL 19, 2025 | 18 | ~3362 | ✅ Complete |
| 42 | The Missing Bits: Llama 2 Weights Have Changed | JUL 19, 2025 | 10 | ~1283 | ✅ Complete |
| 43 | New Foundation Models: CodeLlama and other highlights in Open-Source AI | JUL 19, 2025 | 22 | ~4244 | ✅ Complete |
| 44 | Llama 2, Flash-Attention 2, and More | JUL 19, 2025 | 15 | ~1250 | ✅ Complete |
| 45 | Large Language Models and Nearest Neighbors | JUL 19, 2025 | 11 | ~3039 | ✅ Complete |
| 46 | Long Contexts and Scaling Transformers to 1,000,000,000 Tokens | JUL 19, 2025 | 23 | ~1822 | ✅ Complete |
| 47 | State of Computer Vision 2023: From Vision Transformers to Neural Radiance Fields | JUL 19, 2025 | 18 | ~3132 | ✅ Complete |
| 48 | Accelerating PyTorch Model Training | JUL 19, 2025 | 15 | ~2002 | ✅ Complete |
| 49 | Understanding Encoder And Decoder LLMs | JUL 19, 2025 | 5 | ~1566 | ✅ Complete |
| 50 | Direct-Preference Optimization for Human Feedback and More | JUL 19, 2025 | 25 | ~1999 | ✅ Complete |
| 51 | LLM Tuning & Dataset Perspectives | JUL 19, 2025 | 19 | ~3462 | ✅ Complete |
| 52 | About LayerNorm Variants in the Original Transformer Paper, and Some Other Interesting Historical Tidbits About LLMs | JUL 19, 2025 | 6 | ~1156 | ✅ Complete |
| 53 | Finetuning LLMs Efficiently with Adapters | JUL 19, 2025 | 12 | ~1830 | ✅ Complete |
| 54 | Transformers for Long Inputs and Less Training Data | JUL 19, 2025 | 13 | ~1772 | ✅ Complete |
| 55 | Insights from Large-Scale LLM Training Runs | JUL 19, 2025 | 13 | ~2923 | ✅ Complete |
| 56 | Understanding Parameter-Efficient LLM Finetuning: Prompt Tuning And Prefix Tuning | JUL 19, 2025 | 8 | ~1016 | ✅ Complete |
| 57 | Finetuning Large Language Models | JUL 19, 2025 | 9 | ~2246 | ✅ Complete |
| 58 | Understanding Large Language Models | JUL 19, 2025 | 21 | ~3501 | ✅ Complete |
| 59 | Large Language Models 3.0 | JUL 19, 2025 | 16 | ~4087 | ✅ Complete |
| 60 | TrAIn Differently: Do We Need Reinforcement Learning with Human Feedback (RLHF)? | JUL 19, 2025 | 17 | ~4586 | ✅ Complete |
| 61 | RevAIval of Ideas: From Next-Generation Convolutional Neural Networks to LLMs | JUL 19, 2025 | 26 | ~4682 | ✅ Complete |
| 62 | Looking Back at 2022: A Big Year For AI | JUL 19, 2025 | 15 | ~3059 | ✅ Complete |
| 63 | Launching Large Language Models and Open Source Software | JUL 19, 2025 | 18 | ~2936 | ✅ Complete |
| 64 | Transformers, Fast and Slow: New Developments in Language Processing | JUL 19, 2025 | 13 | ~2953 | ✅ Complete |
| 65 | A Diffusion of Innovations: Recent Developments in Generative Learning | JUL 19, 2025 | 12 | ~2110 | ✅ Complete |