lucweber committed on
Commit 9a31823 · verified · 1 Parent(s): d4c0208

Update README.md

Files changed (1)
  1. README.md +63 -0
README.md CHANGED
@@ -0,0 +1,63 @@
+ # Difficulty Scorer v2
+
+ A Qwen3-8B-based difficulty scorer trained on our own difficulty data, as used in our EMNLP 2025 submission titled
+
+ **Stratified Selective Sampling for Instruction Tuning with Dedicated Scoring Strategy** [REF]
+
+ ## Model Architecture
+
+ - Base model: [`Qwen/Qwen3-8B`](https://huggingface.co/Qwen/Qwen3-8B)
+ - Custom head: regression head on top of a pooling layer.
+
+ For more details, see `model.py`.
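
As a rough illustration of what a regression head on top of a pooling layer can look like, the sketch below applies masked mean pooling to per-token hidden states and then a linear map to a single scalar. Mean pooling, the function names, and the toy numbers are all assumptions made for illustration; the actual implementation is in `model.py` and may differ. Plain Python lists stand in for tensors to keep the example dependency-free.

```python
from typing import List, Sequence


def masked_mean_pool(hidden: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, m in zip(hidden, mask) if m]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def regression_head(pooled: Sequence[float], weight: Sequence[float], bias: float) -> float:
    """Linear map from the pooled vector to one scalar difficulty score."""
    return sum(w * x for w, x in zip(weight, pooled)) + bias


# Toy input (hypothetical): 3 tokens with hidden size 2; the last token is padding.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

pooled = masked_mean_pool(hidden_states, attention_mask)
score = regression_head(pooled, weight=[0.5, -1.0], bias=0.25)
print(pooled, score)
```

The padding token has no influence on the pooled vector, which is the main reason to mask before averaging.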
+
+ ## Use Cases
+
+ The model can be used to score the difficulty of instructions in instruction-tuning data. More challenging instructions are associated with better learning outcomes during training.
+
+ ---
+
+ ## How to Use
+
+ ### Inference
+
+ ```python
+ pass
+
+ ```
+
+ ---
+
+ ## Model Files
+
+ * `pytorch_model-0000x-of-00002.bin` – fine-tuned model weights
+ * `regression_head.bin` – custom regression head
+ * `config.json` – configuration including base model and head details
+ * `tokenizer.json`, `vocab.txt`, etc. – tokenizer files
+ * `model.py` – custom regression model implementation
+
+
+ ---
+
+ ## Evaluation
+
+ We mainly validated the scorer through its downstream benefits in training (see paper).
+ As an additional sanity check, we compared its scores on coding data from [deepmind/code_contests](https://huggingface.co/datasets/deepmind/code_contests), which contains difficulty scores:
+
+ ![Correlation code contest](./scatter_code_contests_vs_difficulty.png)
+
+
+ The correlation of our difficulty scores with the code_contests difficulty scores is `r = 0.41`.
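
The README does not state which correlation coefficient `r` refers to. Assuming it is Pearson's r, a check of this kind could be computed as in the sketch below. The function name and the toy values are hypothetical and are not taken from code_contests or from the model's outputs.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance numerator and the two root-sum-of-squares terms.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


# Made-up scores: model predictions vs. reference difficulty labels.
predicted = [1.0, 2.0, 3.0, 4.0]
reference = [2.0, 1.0, 4.0, 3.0]
print(round(pearson_r(predicted, reference), 2))  # close to 0.6 for this toy data
```

If the reference difficulty labels are ordinal rather than interval-scaled, a rank correlation such as Spearman's may be the more appropriate choice.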
+
+ ---
+
+ ## Responsible
+
+ Mostly Lucas W.
+
+
+
+
+
+
+