HarleyCoops committed on
Commit a299ff0 · 1 Parent(s): 6500f68

Clarify Grammar Gym roadmap

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -46,3 +46,19 @@ This interface provides insights into Christian H. Cooper's groundbreaking work
  ## Updates
 
  **March 8th, 2025**: Updated the Gemini model name to the latest version and refreshed the API key for improved performance and reliability.
+
+ ## Grammar Gym & RL Training Roadmap
+
+ Work on the Stoney Grammar Gym is in the research-and-design stage. The latest design notes summarize how the existing dictionary work will eventually flow into a reinforcement-learning loop based on verifier-style reward functions:
+
+ - **Pipeline concept** – Extract rules from the grammar PDF, curate them, and generate task datasets for GRPO-style training with custom verifiers.
+ - **Reward coverage** – Plan for multi-dimensional rewards (letter accuracy, word accuracy, semantic similarity, edit distance) to reflect cultural nuance rather than single-score grading.
+ - **Integration target** – Re-use the same bilingual dataset plumbing so Grammar Gym training artifacts can live alongside the fine-tuning JSONL files published to the community.
+
+ Although no runnable Grammar Gym scripts ship with this Space yet, the specification is ready for implementation. The next development sprint should focus on:
+
+ 1. Building the extraction tooling (`pdf_ingest.py`, `rule_extractor.py`, `rule_organizer.py`, `task_generator.py`) exactly as defined in the design document.
+ 2. Wiring the generated tasks into a verifiers-compatible environment and standing up GRPO training experiments.
+ 3. Publishing artifacts (rules, tasks, training telemetry) back into the public dataset so community reviewers can audit each stage.
+
+ Once those deliverables are in place, we can expand the README with concrete execution instructions and add automation hooks so the Space surfaces the latest RL progress inside the UI.
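
Review note on the **Reward coverage** bullet: the README says no Grammar Gym scripts ship yet, so the following is only a minimal sketch of what a multi-dimensional reward could look like, not the design document's implementation. All function names (`letter_accuracy`, `word_accuracy`, `edit_similarity`, `score`) are hypothetical, the position-wise definitions of letter and word accuracy are assumptions, and NFC normalization is an assumed preprocessing step for text with diacritics. Semantic similarity is left out because it would need an embedding model that the README does not name.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def edit_similarity(pred: str, ref: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(pred), len(ref))
    return 1.0 if longest == 0 else 1.0 - levenshtein(pred, ref) / longest


def _positional_accuracy(pred_items, ref_items) -> float:
    """Fraction of aligned positions that match, over the longer sequence."""
    longest = max(len(pred_items), len(ref_items))
    if longest == 0:
        return 1.0
    matches = sum(p == r for p, r in zip(pred_items, ref_items))
    return matches / longest


def letter_accuracy(pred: str, ref: str) -> float:
    return _positional_accuracy(pred, ref)


def word_accuracy(pred: str, ref: str) -> float:
    return _positional_accuracy(pred.split(), ref.split())


def score(pred: str, ref: str) -> dict:
    """Return each reward dimension separately instead of one blended grade."""
    # Assumption: compare NFC-normalized text so composed and decomposed
    # diacritics are treated as the same letter.
    pred = unicodedata.normalize("NFC", pred)
    ref = unicodedata.normalize("NFC", ref)
    return {
        "letter_accuracy": letter_accuracy(pred, ref),
        "word_accuracy": word_accuracy(pred, ref),
        "edit_similarity": edit_similarity(pred, ref),
    }
```

Returning a dictionary rather than a single float keeps each dimension inspectable, which matches the roadmap's stated preference for multi-dimensional rewards over single-score grading. How the dimensions would be weighted or passed to a GRPO trainer is left open here.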