Spaces: HarleyCooper/AskAboutCIL

HarleyCoops committed on Oct 9, 2025

Commit a299ff0 (1 parent: 6500f68)

Clarify Grammar Gym roadmap

README.md +16 -0

@@ -46,3 +46,19 @@ This interface provides insights into Christian H. Cooper's groundbreaking work
 ## Updates
 **March 8th, 2025**: Updated the Gemini model name to the latest version and refreshed the API key for improved performance and reliability.

+## Grammar Gym & RL Training Roadmap
+Work on the Stoney Grammar Gym is in the research-and-design stage. The latest design notes summarize how the existing dictionary work will eventually flow into a reinforcement-learning loop based on verifier-style reward functions:
+- **Pipeline concept** – Extract rules from the grammar PDF, curate them, and generate task datasets for GRPO-style training with custom verifiers.
+- **Reward coverage** – Plan for multi-dimensional rewards (letter accuracy, word accuracy, semantic similarity, edit distance) to reflect cultural nuance rather than single-score grading.
+- **Integration target** – Re-use the same bilingual dataset plumbing so Grammar Gym training artifacts can live alongside the fine-tuning JSONL files published to the community.
+Although no runnable Grammar Gym scripts ship with this Space yet, the specification is ready for implementation. The next development sprint should focus on:
+1. Building the extraction tooling (`pdf_ingest.py`, `rule_extractor.py`, `rule_organizer.py`, `task_generator.py`) exactly as defined in the design document.
+2. Wiring the generated tasks into a verifiers-compatible environment and standing up GRPO training experiments.
+3. Publishing artifacts (rules, tasks, training telemetry) back into the public dataset so community reviewers can audit each stage.
+Once those deliverables are in place, we can expand the README with concrete execution instructions and add automation hooks so the Space surfaces the latest RL progress inside the UI.
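
The "Reward coverage" item in the roadmap describes multi-dimensional rewards rather than a single score. As a rough illustration only, the sketch below computes three of the four named dimensions (letter accuracy, word accuracy, edit distance) for a predicted string against a reference. Nothing here comes from the design document: the names `levenshtein`, `similarity`, `RewardBreakdown`, and `score` are hypothetical, and defining "accuracy" as one minus normalized edit distance is an assumption. Semantic similarity is left out because it would need an embedding model that the README does not specify. The sample strings are English placeholders, not Stoney data.

```python
from dataclasses import dataclass


def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed definition: 1 - edit distance / length of the longer input."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


@dataclass
class RewardBreakdown:
    """Hypothetical container that keeps each reward dimension separate."""
    letter_accuracy: float  # character-level similarity in [0, 1]
    word_accuracy: float    # whitespace-token-level similarity in [0, 1]
    edit_distance: int      # raw character-level edit distance


def score(prediction, reference):
    """Score a prediction against a reference on each dimension."""
    pred, ref = prediction.strip(), reference.strip()
    return RewardBreakdown(
        letter_accuracy=similarity(pred, ref),
        word_accuracy=similarity(pred.split(), ref.split()),
        edit_distance=levenshtein(pred, ref),
    )


if __name__ == "__main__":
    # Placeholder strings for illustration only.
    print(score("a b c", "a x c"))
```

Keeping the dimensions in separate fields, instead of collapsing them into one number straight away, lets a verifier weight or inspect each signal independently, which is one way to read the roadmap's preference for multi-dimensional rewards over single-score grading.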