[Bugfix] Fix granite speech shape validation (#21762)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung 2025-07-29 02:19:21 +08:00 committed by GitHub
parent ec261b0291
commit e17a4d3bf9

@@ -64,14 +64,15 @@ class GraniteSpeechAudioInputs(TensorSchema):
     Dimensions:
         - b: Batch size
-        - nf: Number of audio features (variable length)
+        - fi: Number of input features from the Mel spectrogram.
+        - fo: Number of output features, i.e. the embedding size.
         - 160: Fixed feature dimension for Mel spectrogram features
     """
-    input_features: Annotated[torch.Tensor, TensorShape("b", "nf", 160)]
+    input_features: Annotated[torch.Tensor, TensorShape("b", "fi", 160)]
     """Audio input features."""
-    input_features_mask: Annotated[torch.Tensor, TensorShape("b", "nf")]
+    input_features_mask: Annotated[torch.Tensor, TensorShape("b", "fo")]
     """Mask for variable length audio features."""
     audio_embed_sizes: Annotated[list[int], TensorShape("b")]
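
Why the rename matters can be sketched without vLLM or torch. The snippet below is a minimal illustration, assuming that a symbolic dimension name shared by two fields must resolve to the same size in both. `check_shapes`, `is_valid`, and the sizes 844 and 169 are hypothetical and chosen only for the example; this is not the `TensorSchema` implementation.

```python
# Hypothetical stand-in for a named-dimension shape check.
# This is NOT vLLM's TensorSchema; it only illustrates the idea.

def check_shapes(schema, shapes):
    """Bind each symbolic dimension to a size; raise ValueError on conflict."""
    bound = {}
    for field, expected in schema.items():
        actual = shapes[field]
        if len(actual) != len(expected):
            raise ValueError(f"{field}: rank {len(actual)} != {len(expected)}")
        for dim, size in zip(expected, actual):
            if isinstance(dim, int):
                # Literal dimensions (e.g. 160) must match exactly.
                if size != dim:
                    raise ValueError(f"{field}: expected {dim}, got {size}")
            elif bound.setdefault(dim, size) != size:
                # A name already bound by an earlier field must agree.
                raise ValueError(
                    f"{field}: '{dim}' is {bound[dim]} elsewhere, got {size}"
                )
    return bound


def is_valid(schema, shapes):
    """Return True if the shapes satisfy the schema, False otherwise."""
    try:
        check_shapes(schema, shapes)
        return True
    except ValueError:
        return False


# Schemas before and after the change, written as plain tuples.
OLD_SCHEMA = {
    "input_features": ("b", "nf", 160),
    "input_features_mask": ("b", "nf"),
}
NEW_SCHEMA = {
    "input_features": ("b", "fi", 160),
    "input_features_mask": ("b", "fo"),
}

# Illustrative sizes only: the mask length differs from the feature length.
shapes = {
    "input_features": (2, 844, 160),
    "input_features_mask": (2, 169),
}

old_ok = is_valid(OLD_SCHEMA, shapes)
new_ok = is_valid(NEW_SCHEMA, shapes)
```

Under this assumption, the old schema rejects the input because `nf` would have to be both 844 and 169, while the new schema accepts it because `fi` and `fo` are bound independently.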