# my_awesome_qa_model
This model is a fine-tuned version of [google/muril-base-cased](https://huggingface.co/google/muril-base-cased) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.2420
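A minimal usage sketch with the `transformers` question-answering pipeline. The repo id `shlok123/my_awesome_qa_model` is taken from this card; the example question and context are illustrative only, and running the guarded block downloads the checkpoint:

```python
def build_qa(model_id: str = "shlok123/my_awesome_qa_model"):
    # Imported lazily so the sketch only needs transformers when actually run.
    from transformers import pipeline

    # Loads the fine-tuned checkpoint and its tokenizer from the Hub.
    return pipeline("question-answering", model=model_id)


if __name__ == "__main__":
    qa = build_qa()
    result = qa(
        question="Where is the Taj Mahal located?",
        context="The Taj Mahal is an ivory-white marble mausoleum in Agra, India.",
    )
    print(result["answer"], result["score"])
```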
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
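The `linear` scheduler decays the learning rate from its initial value to zero over the full training run. A small sketch of the schedule implied by these settings, assuming no warmup (the Trainer default) and the 2610 total steps shown in the training results below:

```python
INITIAL_LR = 2e-4   # learning_rate above
TOTAL_STEPS = 2610  # 87 steps/epoch x 30 epochs (see training results)


def linear_lr(step: int, initial_lr: float = INITIAL_LR, total: int = TOTAL_STEPS) -> float:
    # Linear decay to zero, matching lr_scheduler_type: linear with zero warmup steps.
    return initial_lr * max(0.0, (total - step) / total)


print(linear_lr(0))     # 0.0002 at the first step
print(linear_lr(1305))  # 0.0001 at the halfway point
print(linear_lr(2610))  # 0.0 at the final step
```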
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 1.0 | 87 | 3.3597 |
| 3.9557 | 2.0 | 174 | 2.1384 |
| 2.3734 | 3.0 | 261 | 1.3267 |
| 1.2469 | 4.0 | 348 | 1.0861 |
| 0.709 | 5.0 | 435 | 1.0629 |
| 0.4988 | 6.0 | 522 | 1.6941 |
| 0.3718 | 7.0 | 609 | 1.3660 |
| 0.3718 | 8.0 | 696 | 2.0104 |
| 0.2292 | 9.0 | 783 | 2.1057 |
| 0.1848 | 10.0 | 870 | 2.1225 |
| 0.1241 | 11.0 | 957 | 1.8473 |
| 0.1352 | 12.0 | 1044 | 1.5934 |
| 0.0767 | 13.0 | 1131 | 1.7822 |
| 0.0589 | 14.0 | 1218 | 1.9077 |
| 0.0502 | 15.0 | 1305 | 1.9062 |
| 0.0502 | 16.0 | 1392 | 1.9073 |
| 0.0559 | 17.0 | 1479 | 1.9963 |
| 0.0441 | 18.0 | 1566 | 1.7880 |
| 0.0296 | 19.0 | 1653 | 2.3304 |
| 0.0204 | 20.0 | 1740 | 2.3634 |
| 0.0165 | 21.0 | 1827 | 2.1404 |
| 0.0152 | 22.0 | 1914 | 1.8899 |
| 0.01 | 23.0 | 2001 | 2.0763 |
| 0.01 | 24.0 | 2088 | 2.2466 |
| 0.0079 | 25.0 | 2175 | 2.2306 |
| 0.0072 | 26.0 | 2262 | 2.2067 |
| 0.0128 | 27.0 | 2349 | 2.2512 |
| 0.0113 | 28.0 | 2436 | 2.2725 |
| 0.0062 | 29.0 | 2523 | 2.2457 |
| 0.0064 | 30.0 | 2610 | 2.2420 |
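A quick consistency check on the table above, assuming no gradient accumulation: with a train batch size of 16, the 87 optimizer steps per epoch imply a training set of between 1377 and 1392 examples (the exact size is not stated in this card), and 30 epochs give the 2610 total steps in the final row:

```python
import math

BATCH_SIZE = 16
STEPS_PER_EPOCH = 87
EPOCHS = 30


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    # One optimizer step per batch; the last, possibly partial, batch still counts.
    return math.ceil(num_examples / batch_size)


# Every dataset size in this range reproduces 87 steps per epoch.
sizes = [n for n in range(1, 2001) if steps_per_epoch(n, BATCH_SIZE) == STEPS_PER_EPOCH]
print(min(sizes), max(sizes))    # 1377 1392
print(STEPS_PER_EPOCH * EPOCHS)  # 2610 total steps
```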
### Framework versions
- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.1