---
language: en
license: apache-2.0
tags:
- t5
- music
- spotify
- text2json
- audio-features
- fine-tuned
base_model: t5-base
datasets:
- custom
library_name: transformers
pipeline_tag: text2text-generation
---
# T5-Base Fine-tuned for Spotify Features Prediction
T5-Base fine-tuned to convert natural language prompts into Spotify audio feature JSON
## Model Details
- **Base Model**: t5-base
- **Model Type**: Text-to-JSON generation
- **Language**: English
- **Task**: Convert natural language music preferences into Spotify audio feature JSON objects
- **Fine-tuning Dataset**: Custom dataset of prompts to Spotify audio features
## Training Configuration
- **Epochs**: 7
- **Learning Rate**: 3e-4
- **Batch Size**: 8 (per device)
- **Gradient Accumulation Steps**: 4
- **Scheduler**: Cosine with warmup
- **Optimizer**: AdamW
- **Max Length**: 256 tokens
- **Precision**: bfloat16
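The released checkpoint does not include a training script; the sketch below shows one way these hyperparameters could map onto the Hugging Face `Seq2SeqTrainer` API. The toy dataset, `output_dir`, and `warmup_ratio` are illustrative assumptions, not part of the actual training setup.
```python
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    T5ForConditionalGeneration,
    T5Tokenizer,
)

model = T5ForConditionalGeneration.from_pretrained("t5-base")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Toy stand-in for the custom prompt -> features dataset (illustrative only)
raw = Dataset.from_dict({
    "prompt": ["I want energetic dance music with high energy and danceability"],
    "target": ['{"danceability": 0.85, "energy": 0.90, "valence": 0.75}'],
})

def preprocess(batch):
    # Inputs use the "prompt: " prefix described in this card;
    # both sides are truncated to the 256-token maximum length
    model_inputs = tokenizer(
        ["prompt: " + p for p in batch["prompt"]],
        max_length=256, truncation=True,
    )
    labels = tokenizer(text_target=batch["target"], max_length=256, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_ds = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

# Hyperparameters as listed above; warmup_ratio is an assumption
# (the card only says "cosine with warmup")
args = Seq2SeqTrainingArguments(
    output_dir="t5-spotify-features",
    num_train_epochs=7,
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    bf16=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```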
## Usage
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import json

# Load model and tokenizer
model = T5ForConditionalGeneration.from_pretrained("afsagag/t5-spotify-features")
tokenizer = T5Tokenizer.from_pretrained("afsagag/t5-spotify-features")

# Build the input using the "prompt: " prefix the model was trained on
prompt = "I want energetic dance music with high energy and danceability"
input_text = f"prompt: {prompt}"

# Tokenize and generate with beam search
input_ids = tokenizer(input_text, return_tensors="pt", max_length=256, truncation=True).input_ids
outputs = model.generate(
    input_ids,
    max_length=256,
    num_beams=4,
    early_stopping=True,
    do_sample=False
)

# Decode the generated sequence
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)

# Parse the JSON output
try:
    spotify_features = json.loads(result)
    print("Generated Spotify Features:", spotify_features)
except json.JSONDecodeError:
    print("Generated text is not valid JSON")
```
## Expected Output Format
The model generates JSON objects of Spotify audio features, with each value typically in the 0.0 to 1.0 range:
```json
{
  "danceability": 0.85,
  "energy": 0.90,
  "valence": 0.75,
  "acousticness": 0.15,
  "instrumentalness": 0.05,
  "speechiness": 0.08
}
```
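Because the output is free-form text, a light validation pass is useful before passing the values to downstream code. The following is a minimal sketch; the expected feature names and the 0.0 to 1.0 clamp are assumptions based on Spotify's audio-feature definitions, not part of the released model.
```python
import json

EXPECTED_FEATURES = {
    "danceability", "energy", "valence",
    "acousticness", "instrumentalness", "speechiness",
}

def clean_features(raw_json: str) -> dict:
    """Parse model output, keep known features, and clamp them to 0.0-1.0."""
    parsed = json.loads(raw_json)
    cleaned = {}
    for key, value in parsed.items():
        if key in EXPECTED_FEATURES:
            cleaned[key] = min(max(float(value), 0.0), 1.0)
    return cleaned

print(clean_features('{"danceability": 0.85, "energy": 1.2, "tempo": 128}'))
# {'danceability': 0.85, 'energy': 1.0}
```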
## Metrics
- **Per-set Mean Absolute Error (MAE)**: Average absolute difference between predicted and reference feature values in each generated feature set
- **Per-set Root Mean Squared Error (RMSE)**: Like MAE, but penalizes large individual errors more heavily
- **Per-feature Pearson Correlation**: Correlation between predicted and reference values for each individual audio feature
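No evaluation script ships with the model; the sketch below shows one way these metrics could be computed with NumPy and SciPy, assuming predictions and references are lists of feature dictionaries. The function and example values are illustrative only.
```python
import numpy as np
from scipy.stats import pearsonr

def evaluate(predictions, references, features=("danceability", "energy", "valence")):
    """Per-set MAE/RMSE and per-feature Pearson correlation."""
    pred = np.array([[p[f] for f in features] for p in predictions])
    ref = np.array([[r[f] for f in features] for r in references])

    mae = np.mean(np.abs(pred - ref))              # per-set MAE
    rmse = np.sqrt(np.mean((pred - ref) ** 2))     # per-set RMSE
    corr = {f: pearsonr(pred[:, i], ref[:, i])[0]  # per-feature correlation
            for i, f in enumerate(features)}
    return {"mae": mae, "rmse": rmse, "pearson": corr}

preds = [{"danceability": 0.80, "energy": 0.90, "valence": 0.70},
         {"danceability": 0.30, "energy": 0.20, "valence": 0.40},
         {"danceability": 0.55, "energy": 0.60, "valence": 0.65}]
refs  = [{"danceability": 0.85, "energy": 0.95, "valence": 0.75},
         {"danceability": 0.25, "energy": 0.15, "valence": 0.45},
         {"danceability": 0.60, "energy": 0.55, "valence": 0.60}]
print(evaluate(preds, refs))
```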
## Model Files
- `config.json`: Model configuration
- `pytorch_model.bin`: Model weights
- `tokenizer.json`: Tokenizer vocabulary
- `tokenizer_config.json`: Tokenizer configuration
- `special_tokens_map.json`: Special token mappings
## Limitations
- Model may occasionally generate invalid JSON that requires post-processing (a simple repair sketch follows this list)
- Expects inputs in the training format, i.e. prefixed with `prompt: `; other phrasings may reduce output quality
- Performance depends on similarity to training data distribution
- May not generalize well to very abstract or unusual music descriptions
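For the first limitation, a small repair-and-retry step often recovers usable output. The sketch below strips trailing commas, a common failure mode for generated JSON, before giving up; this regex-based fix is an assumption, not part of the released model.
```python
import json
import re

def parse_or_repair(text: str):
    """Try to parse generated text as JSON, repairing trailing commas if needed."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Remove trailing commas before a closing brace/bracket, then retry
        repaired = re.sub(r",\s*([}\]])", r"\1", text)
        try:
            return json.loads(repaired)
        except json.JSONDecodeError:
            return None  # caller can fall back to regenerating the output

print(parse_or_repair('{"danceability": 0.85, "energy": 0.90,}'))
# {'danceability': 0.85, 'energy': 0.9}
```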
## Training Data
The model was trained on a custom dataset pairing natural language music descriptions with corresponding Spotify audio feature values.
## Ethical Considerations
This model generates music preference predictions and should not be used as the sole basis for music recommendation systems without human oversight.