Literary Tone Classifier v0.1
A DeBERTa-based classifier for detecting emotional tone in literary fiction passages. Designed for commercial fiction filtering and training data curation.
Model Description
- Base Model: microsoft/deberta-v3-small (86M parameters)
- Context Window: 2048 tokens (~1500 words, 3-5 pages)
- Training Data: 4,052 literary passages from various fiction genres
- Task: Multi-class text classification (5 tones)
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Mitchins/literary-tone-classifier-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = """Your literary passage here..."""
inputs = tokenizer(text, max_length=2048, truncation=True,
                   padding='max_length', return_tensors='pt')
with torch.no_grad():  # inference only; skip gradient bookkeeping
    outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=1)[0]

tones = ['dark', 'humorous', 'lyrical', 'melancholic', 'warm']
for tone, score in zip(tones, probs):
    print(f"{tone}: {score:.2%}")
```
Classes
The model predicts 5 emotional tone categories:
| Class | What It Detects | Validation F1 | Typical Use |
|---|---|---|---|
| dark | Grim, violent, oppressive, noir | 91.1% | Filter grimdark fantasy, horror, noir |
| humorous | Satirical, witty, comedic | 91.1% | Filter comedy, satire, lighthearted fiction |
| warm | Comforting, affectionate, uplifting | 94.2% | Filter romance, cozy mysteries |
| melancholic | Sad, loss-focused, bittersweet | 88.7% | Filter emotional drama, literary fiction |
| lyrical | Poetic prose, metaphorical | 90.5% | Identify literary/poetic writing style |
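The class order used throughout this card is alphabetical (the same order as the `tones` list in the Quick Start). To avoid hard-coding positional indices like `probs[0]`, a small lookup helper can be used; `TONE_INDEX` and `tone_score` below are illustrative names, not part of the model API, and the order should be verified against `model.config.id2label`.

```python
# Map tone names to logit indices (alphabetical order, matching the
# `tones` list in the Quick Start; verify against model.config.id2label).
TONES = ['dark', 'humorous', 'lyrical', 'melancholic', 'warm']
TONE_INDEX = {tone: i for i, tone in enumerate(TONES)}

def tone_score(probs, tone):
    """Return the probability for a named tone from a probability vector."""
    return probs[TONE_INDEX[tone]]

scores = [0.72, 0.05, 0.08, 0.10, 0.05]
print(tone_score(scores, 'dark'))  # dark is index 0
```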
Performance
Validation Set (406 passages)
- Overall Accuracy: 91.38%
- F1-Macro: 91.12%
- Precision-Macro: 90.72%
Real-World Testing (Published Books)
- Dark detection: 80% accuracy (tested on Abercrombie, Jemisin, Sanderson, King, Lawrence)
- Humorous detection: 100% accuracy (tested on Pratchett, Weir)
- Overall: 87.5% accuracy on 1500-word passages from published fiction
Synthetic Test Set
- Accuracy: 93.75% (15/16 correct)
- Accuracy 2.4pp higher than on the validation set, suggesting the model generalizes rather than overfits
Practical Usage: Commercial Fiction Filtering
The model works best when interpreted as a commercial fiction filter:
```python
# Recommended interpretation
if probs[tones.index('humorous')] >= 0.60:
    category = "Comedy/Satire"        # 98% precision - very reliable
elif probs[tones.index('dark')] >= 0.60:
    category = "Grim/Horror"          # 88% precision - reliable
elif probs[tones.index('warm')] >= 0.70 or probs[tones.index('melancholic')] >= 0.70:
    category = "Heartfelt/Emotional"  # combine for relationship-focused fiction
else:
    category = "Neutral"              # likely action/thriller/plot-driven
```
What Works Well
✅ Dark/Grim (grimdark fantasy, noir, horror)
- High precision (88.5%)
- Works on Abercrombie, Jemisin, King
- Good for filtering dark content
✅ Humorous (comedy, satire)
- Very high precision (98.2%)
- Perfect accuracy on Pratchett, Weir
- Rarely produces false positives
✅ Warm (romance, cozy)
- Best validation performance (94.2% F1)
- High recall (97.6%)
- Not yet tested on published romance
✅ Melancholic (drama, loss)
- Good recall (95.9%)
- Catches emotional passages well
- Not yet tested on published literary fiction
Limitations
⚠️ Lyrical class has narrow application
- Detects poetic/metaphorical prose, not epic scale
- Rarely triggers on genre fiction (fantasy/sci-fi)
- Only triggered on 1/7 published books tested
- More useful as "literary style" metadata than tone category
⚠️ Cannot detect pacing
- No "action-packed" vs "contemplative" detection
- "Tense/Propulsive" is not included (considered book-level, not passage-level)
⚠️ Limited testing on some categories
- Warm and melancholic not yet validated on published romance/literary fiction
- Most testing focused on dark and humorous
⚠️ Edge cases
- Struggled with Stephen King's conversational horror style
- Can confuse lyrical with melancholic (both reflective)
- May classify Pratchett's dark humor scenes inconsistently
Training Details
- Optimizer: AdamW
- Learning Rate: 2e-5
- Batch Size: 2 per device, gradient accumulation 16 (effective: 32)
- Epochs: 5
- Loss Function: Focal Loss (γ=2.0) with class weights
- Early Stopping: Patience 3 on validation F1-macro
- Hardware: Single GPU, ~1.5 hours training time
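Focal loss down-weights examples the model already classifies confidently via the (1 − p)^γ factor, focusing training on hard examples. A minimal pure-Python sketch of the per-example loss (illustrative only; training presumably used a PyTorch implementation, and the class-weight values are not published):

```python
import math

def focal_loss(p_correct, gamma=2.0, class_weight=1.0):
    """Focal loss for one example: -w * (1 - p)^gamma * log(p),
    where p is the predicted probability of the true class."""
    return -class_weight * (1.0 - p_correct) ** gamma * math.log(p_correct)

# A confident correct prediction is heavily down-weighted,
# while an uncertain one keeps most of its cross-entropy weight.
easy = focal_loss(0.95)
hard = focal_loss(0.30)
```

With γ=0 the focal term vanishes and the loss reduces to weighted cross-entropy.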
Data
- Total samples: 4,052 literary passages
- Train/Val split: 90/10 stratified
- Source: Consolidated from 12-tone taxonomy
- Class distribution:
  - humorous: 1,264 (31.2%)
  - dark: 980 (24.2%)
  - warm: 828 (20.4%)
  - lyrical: 490 (12.1%)
  - melancholic: 490 (12.1%)
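The exact class weights used with the focal loss are not published; a common choice for an imbalance like the one above is inverse-frequency weighting. The sketch below shows that scheme on the distribution listed here, as an assumption rather than the recorded training configuration:

```python
# Class counts from the distribution above.
counts = {'humorous': 1264, 'dark': 980, 'warm': 828,
          'lyrical': 490, 'melancholic': 490}
total = sum(counts.values())  # 4052
n_classes = len(counts)

# Weights proportional to inverse class frequency: a class with exactly
# total / n_classes samples (~810 here) would get weight 1.0, and the
# minority classes (lyrical, melancholic) get the largest weights.
weights = {cls: total / (n_classes * n) for cls, n in counts.items()}
```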
Recommended Thresholds
Based on validation precision:
| Tone | Conservative | Balanced | Aggressive |
|---|---|---|---|
| dark | ≥0.75 | ≥0.65 | ≥0.55 |
| humorous | ≥0.70 | ≥0.60 | ≥0.50 |
| warm/melancholic | ≥0.80 | ≥0.70 | ≥0.60 |
Higher thresholds = fewer false positives, lower recall.
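The table above can be encoded as threshold presets. `THRESHOLDS` and `classify_tone` below are illustrative names I've chosen for this sketch (they are not part of the model); the function returns the first tone whose score clears its cutoff, or `None` for the "neutral" bucket.

```python
# Threshold presets from the table above (tone -> minimum probability).
THRESHOLDS = {
    'conservative': {'dark': 0.75, 'humorous': 0.70, 'warm': 0.80, 'melancholic': 0.80},
    'balanced':     {'dark': 0.65, 'humorous': 0.60, 'warm': 0.70, 'melancholic': 0.70},
    'aggressive':   {'dark': 0.55, 'humorous': 0.50, 'warm': 0.60, 'melancholic': 0.60},
}

def classify_tone(scores, preset='balanced'):
    """Return the first tone whose score clears its threshold, else None.

    `scores` maps tone name -> probability (e.g. from the Quick Start probs).
    """
    for tone, cutoff in THRESHOLDS[preset].items():
        if scores.get(tone, 0.0) >= cutoff:
            return tone
    return None
```

For example, a dark score of 0.68 passes the balanced preset but not the conservative one, which is the intended recall/precision trade-off.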
Example Use Cases
1. Filter Grimdark Passages
```python
dark_passages = []
for passage in corpus:
    inputs = tokenizer(passage, max_length=2048, truncation=True,
                       padding='max_length', return_tensors='pt')
    with torch.no_grad():  # inference only
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)[0]
    if probs[0] >= 0.65:  # dark threshold
        dark_passages.append((passage, probs[0].item()))
# Expected precision: ~85%
```
2. Filter Comedy
```python
# Very high precision due to conservative model
if probs[1] >= 0.60:  # humorous threshold
    comedy_passages.append(passage)
# Expected precision: ~95%+
```
3. Identify Literary Prose Style
```python
# Use lyrical score as prose density indicator
if probs[2] >= 0.20:  # lyrical threshold
    literary_style = "poetic/metaphorical"
else:
    literary_style = "commercial/invisible"
```
When NOT to Use This Model
❌ Don't use for:
- Detecting pacing (fast vs slow)
- Epic/wondrous atmosphere (model's "lyrical" doesn't capture this)
- Fine-grained romance subgenres
- Non-English text
- Very short passages (<200 words)
- Poetry or non-prose text
❌ Known failure modes:
- Conversational horror (e.g., Stephen King) may not register as dark
- Dry satirical passages may not register as humorous
- Epic fantasy worldbuilding won't trigger lyrical
- Action scenes default to "neutral" (no action category)
Comparison to Baseline
Previous 12-tone model:
- Validation: 64.76%
- Synthetic test: 45.83%
- Published books: 66.67%
This 5-tone model v0.1:
- Validation: 91.38% (+26.6pp)
- Synthetic test: 93.75% (+47.9pp)
- Published books: 87.5% (+20.8pp)
Model Version
- Version: 0.1
- Date: 2025-12-15
- Status: Production-ready for dark and humorous filtering; warm/melancholic need more testing
Future Improvements (v0.2)
Planned improvements:
- Replace "lyrical" with "wondrous/atmospheric" for epic scale detection
- Validate warm/melancholic on published romance and literary fiction
- Potentially add pacing classifier as separate model
- Test on more diverse edge cases
Citation
```bibtex
@misc{literary-tone-v0.1,
  title={Literary Tone Classifier v0.1},
  author={Your Name},
  year={2025},
  howpublished={HuggingFace Model Hub}
}
```
License
- Model weights: [Your chosen license]
- Base model (DeBERTa): MIT License
Contact
For issues or questions about this model, please open an issue on the model repository.
Note: This is v0.1 - a working prototype optimized for dark and humorous detection. Some categories (warm, melancholic, lyrical) need additional validation on diverse published fiction before production use in all scenarios.