hypothetical (Kirill)

reacted to their post with 👀🤯 1 day ago

Post

3710

The smallest, highest-quality Gemma4 E2B and E4B models in the world! About 7x compression, from 9.3GB to 1.4GB!

TheStageAI/gemma-4-E2B-it
TheStageAI/gemma-4-E4B-it
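
For readers who want to check the headline figure, here is a minimal sketch of the arithmetic. It uses only the two sizes quoted in the post; the function name `compression_ratio` is a hypothetical helper, not part of any TheStage AI tooling.

```python
def compression_ratio(original_gb: float, compressed_gb: float) -> float:
    """Return how many times smaller the compressed checkpoint is."""
    return original_gb / compressed_gb


# Sizes as quoted in the post.
ratio = compression_ratio(9.3, 1.4)
reduction_pct = 100 * (1 - 1 / ratio)

# The exact ratio is a little under 7; the post rounds it up.
print(f"{ratio:.1f}x smaller, {reduction_pct:.0f}% size reduction")
```

The quoted sizes work out to roughly 6.6x, which the post rounds to 7x.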

2 replies


replied to their post 1 day ago

Yes, fully understood. The team is working on a new set of releases.

Right now, a lot of compressed checkpoints are being released without proper evaluation.
Our team's pipeline always includes reproducing the "paper" results for the released models (this takes time, partly because it is sometimes hard to recover the full evaluation technique). We then evaluate the best existing checkpoints, and finally run TheStage AI algorithms with evals to confirm the quality improvement.

Each step takes time compared to just releasing a set of quantized models. Our goal is to build meaningful, controllable compression and to release models that are not just small but can really do the work, with clearly stated limitations.
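
The three steps described above can be sketched as a simple release gate. This is only an illustration under stated assumptions: the function names, the tolerance, the benchmark names, and all scores are hypothetical and are not taken from TheStage AI's actual pipeline.

```python
from typing import Dict

Scores = Dict[str, float]  # benchmark name -> score, higher is better


def reproduces_reference(measured: Scores, reported: Scores, tol: float) -> bool:
    """Step 1: the eval setup must recover the published numbers within tol."""
    return all(abs(measured[name] - value) <= tol for name, value in reported.items())


def approve_release(reported: Scores, measured_original: Scores,
                    best_existing: Scores, candidate: Scores,
                    tol: float = 0.5) -> bool:
    """Hypothetical release gate; the threshold and criteria are assumptions."""
    if not reproduces_reference(measured_original, reported, tol):
        return False  # the evaluation setup itself is not trusted yet
    # Steps 2 and 3: the compressed candidate must match or beat the best
    # existing compressed checkpoint on every benchmark.
    return all(candidate[name] >= score for name, score in best_existing.items())


# Made-up scores, for illustration only.
reported = {"bench_a": 60.0, "bench_b": 45.0}
measured = {"bench_a": 59.8, "bench_b": 45.3}
existing = {"bench_a": 55.0, "bench_b": 40.0}
candidate = {"bench_a": 58.1, "bench_b": 43.9}

print(approve_release(reported, measured, existing, candidate))
```

Gating on the reproduction step first means a compressed model is never compared against numbers the evaluation setup could not recover for the original model.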

You can check some high-level ideas of automated/controllable compression here: https://huggingface.co/spaces/TheStageAI/ANNA-LLM

liked a model 3 days ago

TheStageAI/gemma-4-E4B-it

Image-Text-to-Text • Updated 4 days ago • 506 • 8

reacted to their post with 🚀 3 days ago

posted an update 3 days ago

liked a model 3 days ago

TheStageAI/gemma-4-E2B-it

Image-Text-to-Text • Updated 3 days ago • 619 • 4

liked a model 18 days ago

TheStageAI/Elastic-Z-Image-Turbo

Updated Apr 15 • 151 • 2

posted an update about 2 months ago

Post

184

Very cool updates! With our stack you can do the same for your own networks!
Wan 2.2 generation in 34 seconds on a single H100!

The stack is general and can be applied to your own neural networks for acceleration.

https://app.thestage.ai/blog/Generate-Wan-2.2-Videos-5.3x-Faster-with-Qlip?id=10

TheStageAI/Wan2.2-T2V-A14B

reacted to their post with 🤗 2 months ago

Post

1136

An intuitive, simple, open-source experiment dashboard built with Streamlit. It offers pre-defined layouts for different evaluations and convenient overlay comparisons of outputs, which are especially valuable during model compression when comparing results with the original model.

GitHub: https://github.com/TheStageAI/Spikes-Pipes

posted an update 2 months ago
