physisolver-gpt2
This model is a fine-tuned version of gpt2 on a high school physics dataset.
Model description
Physisolver-GPT2 is a fine-tuned version of GPT-2, optimized for answering physics-related multiple-choice questions. It was trained on a custom dataset consisting of physics questions, multiple-choice options, and correct answers. The goal of the model is to generate accurate answers to physics questions by understanding the context and the provided choices.
Training Data:
The model was trained on a dataset of physics questions in a multiple-choice question format. Each question is paired with multiple possible answers, and the correct answer is also included. The dataset is formatted in JSON, where each entry contains a question, a list of choices, and the correct answer.
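A single entry might look like the following (the field names are illustrative; the actual schema of the custom dataset may differ):

```json
{
  "question": "A ball is dropped from rest. What is its speed after 2 s? (take g = 10 m/s^2)",
  "choices": ["5 m/s", "10 m/s", "20 m/s", "40 m/s"],
  "answer": "20 m/s"
}
```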
Training Objective:
The model has been fine-tuned to predict the correct answer given the context of the question and available choices. During training, the model learned to associate questions with their corresponding answers through supervised learning, using the transformer architecture’s language modeling objective.
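As a rough sketch of this setup, each entry can be serialized into a single text sequence and trained with the standard next-token prediction loss. The prompt template below is an assumption for illustration, not necessarily the exact format used during fine-tuning:

```python
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

def build_example(entry):
    """Serialize one MCQ entry into a training sequence (illustrative template)."""
    choices = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(entry["choices"]))
    text = (
        f"Question: {entry['question']}\n"
        f"Choices:\n{choices}\n"
        f"Answer: {entry['answer']}{tokenizer.eos_token}"
    )
    enc = tokenizer(text, truncation=True, max_length=512)
    # Causal LM objective: the labels are the input tokens themselves
    # (the model shifts them internally to predict the next token).
    enc["labels"] = enc["input_ids"].copy()
    return enc
```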
Capabilities:
- The model can generate answers to a wide variety of physics-related questions.
- It understands the question prompt and the choices provided.
- It generates the most likely answer based on context and trained knowledge (see the scoring sketch below).
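One common way to realize "most likely answer" with a causal language model is to score each candidate choice by the total log-probability the model assigns to it and take the argmax. The sketch below is illustrative, not necessarily how this model was evaluated:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mrohith29/physisolver-gpt2")
model = AutoModelForCausalLM.from_pretrained("mrohith29/physisolver-gpt2")
model.eval()

def score_choice(prompt: str, choice: str) -> float:
    """Total log-probability of the choice tokens, conditioned on the prompt."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)  # predicts tokens 1..T-1
    targets = full_ids[:, 1:]
    token_scores = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    return token_scores[:, prompt_len - 1:].sum().item()   # count only choice tokens

prompt = "Question: What is the SI unit of force?\nAnswer:"
choices = ["newton", "joule", "watt", "pascal"]
# The leading space keeps the GPT-2 BPE boundary between prompt and choice clean.
best = max(choices, key=lambda c: score_choice(prompt, " " + c))
print(best)
```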
Intended Use: This model is suitable for interactive applications such as:
- Physics tutoring systems.
- AI-powered physics question-answering platforms.
- Integration with educational tools for physics learning.
Performance:
The model performs well on the physics topics covered by the dataset. However, because it was trained on a custom dataset, it may not generalize to unseen physics topics outside the scope of the training data.
Limitations:
- The model's performance is limited by the scope and quality of the dataset it was trained on.
- It may struggle with complex or highly specialized physics concepts not covered in the training data.
- Its reasoning is limited to the patterns it learned from the dataset, so it may not always provide optimal answers to ambiguous questions.
Deployment:
Physisolver-GPT2 can be accessed and integrated into applications through the Hugging Face Model Hub, where it is available for both research and production use.
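For example, the model can be loaded directly from the Hub with the transformers library. The prompt format below is an assumption; adjust it to match the format used during fine-tuning:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mrohith29/physisolver-gpt2")
model = AutoModelForCausalLM.from_pretrained("mrohith29/physisolver-gpt2")

prompt = (
    "Question: A ball is dropped from rest. What is its speed after 2 s? "
    "(take g = 10 m/s^2)\n"
    "Choices:\nA. 5 m/s\nB. 10 m/s\nC. 20 m/s\nD. 40 m/s\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10,
                         pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:],
                       skip_special_tokens=True))
```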
Future Improvements:
- Expanding the dataset with more diverse physics topics to improve the model's generalization.
- Fine-tuning on a broader set of questions from different physics domains, such as thermodynamics and electromagnetism.
- Adding capabilities for reasoning through multi-step problems or understanding context beyond single-choice questions.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
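For reference, these settings correspond roughly to the following TrainingArguments (output_dir is a placeholder; unlisted arguments keep their defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="physisolver-gpt2",   # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,   # effective train batch size: 8
    num_train_epochs=2,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # native AMP mixed-precision training
)
```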
Framework versions
- Transformers 4.51.1
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for mrohith29/physisolver-gpt2
- Base model: openai-community/gpt2