Encoder Model
This directory contains the pre-trained encoder model for voice conversion.
Model Details
- File: encoder.pt
- Size: ~17.1 MB
- Input: Audio waveform
- Output: Speaker embeddings
Usage
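import torch

# Load the encoder model. encoder.pt is assumed to hold the full pickled module,
# so weights_only=False is needed on PyTorch 2.6+ (only load files you trust).
encoder = torch.load('encoder.pt', weights_only=False)
encoder.eval()

# Process audio (audio_tensor is a torch.Tensor holding the waveform)
with torch.no_grad():
    embedding = encoder(audio_tensor)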
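The usage snippet does not show where `audio_tensor` comes from. As a minimal sketch, assuming the encoder accepts a mono float32 waveform with a leading batch dimension (this README does not state the expected shape or sample rate, so treat both as assumptions), the hypothetical helper `prepare_waveform` shapes a NumPy array such as the one returned by `librosa.load`:

```python
import numpy as np


def prepare_waveform(waveform: np.ndarray) -> np.ndarray:
    """Return a float32 array of shape (1, num_samples).

    Hypothetical helper: the mono, batch-first layout is an assumption,
    not something this README specifies.
    """
    samples = np.asarray(waveform, dtype=np.float32)
    if samples.ndim == 2:
        # librosa.load(..., mono=False) returns (channels, samples); average to mono
        samples = samples.mean(axis=0)
    elif samples.ndim != 1:
        raise ValueError(f"expected 1-D or 2-D audio, got {samples.ndim}-D")
    peak = np.max(np.abs(samples)) if samples.size else 0.0
    if peak > 1.0:
        # Scale down only when samples fall outside [-1, 1]
        samples = samples / peak
    # Add the leading batch dimension
    return samples[np.newaxis, :]
```

The result can be wrapped with `torch.from_numpy(...)` to obtain a tensor to pass to the encoder.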
Dependencies
- PyTorch
- NumPy
- Librosa (for audio processing)
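One way to confirm that these packages are available before loading the model is sketched below. It uses only the standard library; the function name `missing_modules` is hypothetical, and `torch`, `numpy`, and `librosa` are the import names of the packages listed above.

```python
import importlib.util

# Import names for the dependencies listed above
REQUIRED_MODULES = ("torch", "numpy", "librosa")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```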
Model Configuration
See config.json for model architecture and training parameters.
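To inspect the configuration programmatically, a sketch along these lines may help. It assumes only that config.json is a JSON object at the top level; no specific keys are assumed, and the helper names `load_config` and `summarize` are hypothetical.

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Parse the JSON config file and return its contents."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def summarize(config):
    """Return one 'key: type' line per top-level entry, sorted by key."""
    return [f"{key}: {type(value).__name__}" for key, value in sorted(config.items())]
```

Calling `summarize(load_config())` from this directory lists the top-level entries, which is a quick way to see which architecture and training parameters the file records.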