Diffusers documentation

AutoencoderKLLTX2Audio


AutoencoderKLLTX2Audio

The 3D variational autoencoder (VAE) model with KL loss used in LTX-2 was introduced by Lightricks. It is used for encoding and decoding audio latent representations.

The model can be loaded with the following code snippet.

import torch
from diffusers import AutoencoderKLLTX2Audio

vae = AutoencoderKLLTX2Audio.from_pretrained("Lightricks/LTX-2", subfolder="vae", torch_dtype=torch.float32).to("cuda")

AutoencoderKLLTX2Audio

class diffusers.AutoencoderKLLTX2Audio

( base_channels: int = 128, output_channels: int = 2, ch_mult: typing.Tuple[int, ...] = (1, 2, 4), num_res_blocks: int = 2, attn_resolutions: typing.Optional[typing.Tuple[int, ...]] = None, in_channels: int = 2, resolution: int = 256, latent_channels: int = 8, norm_type: str = 'pixel', causality_axis: typing.Optional[str] = 'height', dropout: float = 0.0, mid_block_add_attention: bool = False, sample_rate: int = 16000, mel_hop_length: int = 160, is_causal: bool = True, mel_bins: typing.Optional[int] = 64, double_z: bool = True )

LTX2 audio VAE for encoding and decoding audio latent representations.
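The default configuration implies a fixed compression factor: with mel_hop_length = 160 at a 16 kHz sample rate, the mel spectrogram has 100 frames per second, and if (as is common in VAE encoders) each entry in ch_mult beyond the first corresponds to one 2x downsampling stage, the latent resolution follows directly from the defaults. The helper below is an illustrative sketch under that downsampling assumption, not part of the library API:

```python
def latent_shape(num_samples, sample_rate=16000, mel_hop_length=160,
                 mel_bins=64, ch_mult=(1, 2, 4), latent_channels=8):
    """Estimate the latent shape (channels, frames, bins) for a mono waveform.

    Assumes one 2x downsampling stage per ch_mult entry beyond the first,
    which is a common VAE convention but not confirmed by this page.
    """
    downsample = 2 ** (len(ch_mult) - 1)          # assumed: 2 stages -> factor 4
    mel_frames = num_samples // mel_hop_length    # spectrogram frames before encoding
    return (latent_channels, mel_frames // downsample, mel_bins // downsample)

# One second of 16 kHz audio -> 100 mel frames -> (8, 25, 16) under these assumptions
print(latent_shape(16000))
```

Under these assumptions, one second of audio compresses to an 8-channel latent of 25 temporal frames by 16 frequency bins.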
