I am going to start a brand-new research project and don’t know whether I should go directly with transformers v5 or stay on the latest v4 instead. I will need to use a lot of libraries like the state space model optimizers and wonder whether v4 wouldn’t be a safer choice. I am using Python 3.14.2.
For your current use case, I think it’s better to stick with v4. The Python version is the bigger challenge…
When using PyTorch or Hugging Face libraries, I think you’ll run into quite a few compatibility issues if you use anything other than Python 3.10 to 3.12…
For your case, start on Transformers v4 (latest stable) and keep Transformers v5 (RC) in a separate “try-it” environment until v5 is final and your CUDA-extension stack is proven on Python 3.14.2.
That is the lowest-risk path for a brand-new research project that depends on SSM kernels and other compiled extensions.
Background that matters: what “v4 vs v5” really means
Transformers v4
- The current stable line. Installing `transformers` normally gives you v4 today.
- Latest v4 release on PyPI is 4.57.3 (Nov 25, 2025). (PyPI)
Transformers v5
- Not a finalized stable release yet. It is shipping as a release candidate (RC).
- HF explicitly says it is opt-in and pushed to PyPI as a pre-release. Installing without `--pre` keeps you on v4. (GitHub)
- Latest pre-releases shown on PyPI include 5.0.0rc0 (Dec 1, 2025) and 5.0.0rc1 (Dec 11, 2025). (PyPI)
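If you want to confirm which line a given environment actually resolved to, a minimal check like the following works; it assumes the `packaging` library is installed (it usually is, as a transitive dependency):

```python
# Minimal sketch: report whether the installed transformers is the v4 stable line
# or a v5 pre-release. Assumes the `packaging` package is available.
from importlib.metadata import version
from packaging.version import Version

v = Version(version("transformers"))
if v >= Version("5.0.0rc0"):
    print(f"transformers {v}: v5 line ({'pre-release' if v.is_prerelease else 'final'})")
else:
    print(f"transformers {v}: v4 stable line")
```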
Interpretation: v5 is “the future,” but today it is still in the stabilization window. That is fine for testing. It is risky as the foundation of a fresh research codebase where you want momentum.
The bigger issue in your case: Python 3.14.2 + GPU stack
When you say “state space model optimizers,” that usually means packages that ship custom CUDA kernels (or build them locally). Those stacks tend to break on new Python minors first.
A very direct, recent signal: a PyTorch issue reports no CUDA-enabled wheels for Python 3.14 (cp314) on the official CUDA wheel indexes, leading to “CPU-only” installs or “no matching distribution” errors for CUDA builds. (GitHub)
Why this matters for SSM tooling:
- mamba-ssm lists requirements that include Linux, an NVIDIA GPU, CUDA 11.6+, and PyTorch 1.12+, and it explicitly warns about install difficulties and suggests `--no-build-isolation`. (PyPI)
- FlashAttention similarly depends on a strict CUDA toolchain (CUDA 12+) and even recommends using an NVIDIA PyTorch container to simplify installs. (GitHub)
- Real-world failures often surface as a mismatch between the system CUDA toolkit and the CUDA version PyTorch was compiled with, which is a common compiled-extension trap. (Stack Overflow)
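A quick way to spot that mismatch before it derails an extension build is to compare the CUDA version baked into your PyTorch wheel with the local toolkit. A rough sketch, assuming PyTorch is installed and `nvcc` may or may not be on PATH:

```python
# Rough sketch: compare the CUDA version PyTorch was built against with the local
# CUDA toolkit that compiled extensions (mamba-ssm, flash-attn, etc.) would actually use.
import shutil
import subprocess
import torch

print("PyTorch build CUDA:", torch.version.cuda)  # None means a CPU-only wheel

nvcc = shutil.which("nvcc")
if nvcc is None:
    print("nvcc not on PATH: source builds of CUDA extensions will fail")
else:
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    print("Local toolkit:", out.strip().splitlines()[-1])
```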
Bottom line: if Python 3.14.2 pushes you into unusual PyTorch builds (nightly, source builds, CPU-only), your SSM optimizer stack becomes the main source of delay. In that world, adding “Transformers major-version transition” on top is usually not the best use of risk budget.
Why v4 is the safer baseline for you
1) v5 is still RC
Hugging Face’s own release notes are explicit: v5 is a release candidate, not final, opt-in, pre-release. (GitHub)
In practice, RCs are where downstream libraries find edge cases.
2) Your dependency graph is likely to touch internals
Many research stacks import utility functions, patch model classes, or rely on behavior that was “stable by convention” in v4. Major releases remove deprecations and can move internals.
HF describes v5 as a big refactor that removes long-standing deprecations and simplifies APIs and internals. (GitHub)
That is good long-term. It is also exactly how compatibility breaks happen short-term.
3) v5 includes real surface changes you will eventually deal with anyway
Example: authentication parameter changes.
- The official v5 migration guide says `use_auth_token` is deprecated in favor of `token` “everywhere” and suggests search-and-replace. (GitHub)
If you start on v5 you avoid later migration, but you accept RC churn now.
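For concreteness, the rename looks like this in calling code; the repo id and token argument below are placeholders for illustration, not anything taken from the migration guide:

```python
# Illustrative only: repo_id and hf_token are placeholders.
from transformers import AutoModel

def load_private_model(repo_id: str, hf_token: str):
    # v4 spelling (deprecated in v5): AutoModel.from_pretrained(repo_id, use_auth_token=hf_token)
    # v5 spelling per the migration guide:
    return AutoModel.from_pretrained(repo_id, token=hf_token)
```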
Example: tokenization changes.
- HF is actively redesigning tokenization for v5 toward being simpler and more modular. (Hugging Face)
If you do anything custom with tokenizers, you want to be deliberate about when you absorb that change.
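One cheap safeguard is to snapshot token IDs for a handful of probe strings in your current environment and diff that snapshot whenever you bump transformers. A minimal sketch, using `gpt2` only as a stand-in for whatever tokenizer you actually depend on:

```python
# Minimal sketch: snapshot token IDs so a tokenizer change across versions shows up as a diff.
import json
from transformers import AutoTokenizer

PROBES = ["state space models", "  leading whitespace", "naïve edge-cases, punctuation!"]

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in; use your project's tokenizer
snapshot = {text: tok(text)["input_ids"] for text in PROBES}
print(json.dumps(snapshot, indent=2, ensure_ascii=False))
# Save the output per environment and diff it after upgrading transformers.
```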
What I would do for a brand-new research project
Core recommendation
- Primary environment (the one you do experiments in):
  - Transformers v4.57.3 pinned. (PyPI)
  - Pin PyTorch and every CUDA-extension package tightly.
- Secondary environment (compat/testing):
  - Install Transformers v5 RC (pre-release) and run a small smoke test suite weekly or when you bump dependencies (a sketch follows below). HF explicitly documents the opt-in install via `--pre`. (GitHub)
This gives you two benefits at once:
- Stability for day-to-day research.
- Early warning when it is time to migrate.
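The smoke test for the secondary environment does not need to be elaborate. Something along these lines is enough to catch loading and generation breakage; the `gpt2` checkpoint is just a stand-in, and you should swap in a small model that exercises the code paths your project actually uses:

```python
# Minimal smoke test sketch for the v5 RC environment.
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

print("transformers", transformers.__version__)

name = "gpt2"  # stand-in; use a small checkpoint relevant to your project
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tok("state space models", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5)
print(tok.decode(out[0], skip_special_tokens=True))
```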
Practical decision rule for your situation
Choose v4 now if any of these are true (likely for you)
- You expect to use mamba-ssm / causal-conv1d / flash-attn / Triton-style extensions.
- You want fewer moving parts while you debug Python 3.14.2 packaging issues.
- You are building an experiment harness you want to reuse for months.
This aligns with v4 being the stable default and v5 being explicitly RC today. (GitHub)
Choose v5 RC now only if all of these are true
- You are comfortable being on a pre-release.
- You have time budget to chase breakages and report them upstream.
- Your other stack pieces are already stable (PyTorch CUDA wheels, kernels install cleanly).
If you are already fighting cp314 CUDA wheel gaps, you are not in this situation. (GitHub)
The Python 3.14.2 question you should answer first
Because you said you are using Python 3.14.2, you should treat this as a gating check:
If you need GPU training or fast kernels
- Verify you can install a CUDA-enabled PyTorch build on your machine and platform.
- The cp314 “no CUDA wheels available” report is current and credible. (GitHub)
If you cannot get CUDA PyTorch cleanly, you will lose far more time here than you will ever save by picking v5 vs v4.
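A minimal gating check, assuming PyTorch installs and imports at all on your 3.14.2 interpreter:

```python
# Minimal gating check: did this interpreter get a CUDA-enabled PyTorch wheel at all?
import sys
import torch

print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("Built against CUDA:", torch.version.cuda)  # None indicates a CPU-only wheel
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```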
If you are CPU-only (some research is)
- Then the Python 3.14.2 risk is lower.
- In that case the main remaining question is: “Do you want to build on a stable library (v4) or an RC (v5)?” The stable answer is still v4.
Common pitfalls to expect and how they map to your choice
Pitfall A: auth breaks when pulling private models
- v5 migration: `use_auth_token` → `token`. (GitHub)
- If your lab has private repos, this shows up quickly.
Pitfall B: tokenizer behavior changes break data pipelines
- v5 tokenization work is active and ongoing. (Hugging Face)
- If your research depends on exact token boundaries (common in probing and compression), plan extra validation when you switch to v5.
Pitfall C: CUDA extension install failures derail the project
- mamba-ssm requirements and install notes make clear it is a Linux + CUDA + PyTorch-dependent build. (PyPI)
- flash-attn similarly expects a modern CUDA toolchain. (GitHub)
- Python 3.14.2 increases the chance of “wheel not found” and “build from source” cascades. (GitHub)
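Before queueing real jobs, it is worth confirming the compiled-extension stack imports cleanly. A small sketch; the module names reflect the kernels named above and should be adjusted to your actual stack:

```python
# Small sketch: confirm the compiled-extension stack imports before running real jobs.
import importlib

for mod in ("torch", "mamba_ssm", "causal_conv1d", "flash_attn"):
    try:
        m = importlib.import_module(mod)
        print(f"{mod}: OK ({getattr(m, '__version__', 'version unknown')})")
    except Exception as exc:  # ImportError, undefined CUDA symbols, ABI mismatches all land here
        print(f"{mod}: FAILED -> {exc}")
```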
This pitfall dominates your timeline more than the Transformers major version.