Recent Activity
mryab submitted a paper to Daily Papers 2 months ago
lewtun submitted 2 papers to Daily Papers 3 months ago
Post
4705
What happened in AI in 2025?
We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!
Play with it here:
2025-ai-timeline/2025-ai-timeline
Here's my personal quarterly TL;DR:
Q1: Learning to Reason
DeepSeek not only releases a top-notch reasoning model, but also shows how to train one and compete with closed frontier models. OpenAI debuts Deep Research.
Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)
Q2: Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.
Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4
Q3: "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in math olympiads and hard benchmarks. OpenAI releases strong open source models and Google releases the much-anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.
Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5
Q4: Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!
Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1
Credits
NHLOCAL for the source data: https://github.com/NHLOCAL/AiTimeline
@reach-vb for the original idea, design and recipe
@ariG23498 and yours truly for compiling and verifying the 2025 edition
Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases!
alozowski authored a paper 5 months ago
ariG23498 authored a paper 6 months ago
Post
11917
deepseek-ai/DeepSeek-OCR is out! my take:
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient in its vision tokens/performance ratio
> covers 100 languages
jfchi authored 6 papers 7 months ago
FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods
Paper • 2306.09468 • Published • 1
The Llama 3 Herd of Models
Paper • 2407.21783 • Published • 118
Persistent Pre-Training Poisoning of LLMs
Paper • 2410.13722 • Published
Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks
Paper • 2410.18210 • Published
Large Reasoning Models Learn Better Alignment from Flawed Thinking
Paper • 2510.00938 • Published • 60
Shape it Up! Restoring LLM Safety during Finetuning
Paper • 2505.17196 • Published • 1
Post
7000
large AI labs open-sourced a ton of models last week
here are a few picks, find even more here: merve/sep-16-releases-68d13ea4c547f02f95842f05
> IBM released a new Docling model with 258M params based on Granite (Apache 2.0) ibm-granite/granite-docling-258M
> Xiaomi released a 7B audio LM with base and instruct variants (MIT) XiaomiMiMo/mimo-audio-68cc7202692c27dae881cce0
> DecartAI released Lucy Edit, an open Nano Banana (NC) decart-ai/Lucy-Edit-Dev
> OpenGVLab released a family of agentic computer use models (3B/7B/32B) with the dataset OpenGVLab/scalecua-68c912cf56f7ff4c8e034003
> Meituan Longcat released a thinking version of LongCat-Flash meituan-longcat/LongCat-Flash-Thinking
Post
3531
IBM just released a small Swiss Army knife for document models: granite-docling-258M on Hugging Face
> not only a document converter: it can also do document question answering and understands multiple languages
> best part: released with Apache 2.0 license, use it with your commercial projects!
> it supports transformers, vLLM and MLX from the get-go!
> built on SigLIP2 & granite-165M
model: ibm-granite/granite-docling-258M
demo: ibm-granite/granite-docling-258m-demo
Post
1284
a ton of image/video generation models and LLMs from big labs
> Meta released facebook/mobilellm-r1-68c4597b104fac45f28f448e, smol LLMs for on-device use
> Tencent released tencent/SRPO, a high-res image generation model, and tencent/POINTS-Reader, a cutting-edge OCR model
> ByteDance released bytedance-research/HuMo, video generation from any input
find more models, datasets, demos here merve/sep-11-releases-68c7dbfa26bea8cd921fa0ac
Post
1077
fan-favorite vision LM Florence-2 is now officially supported in transformers
find all the models in the florence-community org
Post
1874
the past week was great for open LLMs: merve/sep-1-releases-68bede0e729c12597eefd050
> Google released google/embeddinggemma-300m, a new embedding model with 300M params
> a new update to Kimi-K2 just landed: moonshotai/Kimi-K2-Instruct-0905
> OpenBMB released a new version of MiniCPM with 8B params: openbmb/MiniCPM4.1-8B
also soooo many Qwen-Image & Kontext LoRAs dropped!