---
license: mit
base_model: answerdotai/ModernBERT-base
tags:
  - modernbert
  - entity-infilling
  - text-summarization
  - masked-modeling
  - pytorch
library_name: transformers
datasets:
  - cnn_dailymail
model-index:
  - name: Glazkov/sum-entity-infilling-onehead
    results:
      - task:
          type: entity-infilling
          name: Entity Infilling
        dataset:
          name: cnn_dailymail
          type: cnn_dailymail
        metrics:
          - name: Entity Recall
            type: entity_recall
            value: TBD
---

Glazkov/sum-entity-infilling-onehead

This model is a fine-tuned version of answerdotai/ModernBERT-base trained on the cnn_dailymail dataset for entity infilling tasks.

Model Description

The model is designed to reconstruct masked entities in text using summary context. It was trained with a masked-modeling objective: entity spans in the source text are replaced with <mask> tokens, and the model learns to predict the original entities from the surrounding text and the paired summary.
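
For illustration, a training example can be thought of as a (summary, masked source) pair. The snippet below is a hedged sketch of that construction; the separator, masking granularity, and choice of entity are illustrative assumptions, not details taken from the training code.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")

summary = "Membership gives the ICC jurisdiction over alleged crimes..."
source = "The Palestinians officially became the 123rd member of the International Criminal Court..."
entity = "The Palestinians"  # illustrative entity span to hide

# Replace the entity span with the tokenizer's mask token
masked_source = source.replace(entity, tokenizer.mask_token, 1)

# Assumed layout: summary and masked source joined with the separator token
text = summary + " " + tokenizer.sep_token + " " + masked_source
encoding = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")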

Intended Uses & Limitations

Intended Uses:

  • Entity reconstruction in summarization
  • Text completion and infilling
  • Research in masked language modeling
  • Educational purposes

Limitations:

  • Trained primarily on news article data
  • May not perform well on highly technical or domain-specific content
  • Performance varies with entity length and context

Training Details

The training procedure follows the setup listed under Training Configuration below.

Evaluation Results

The model was evaluated with entity recall on a validation split of the CNN/DailyMail dataset; a sketch of the recall computation follows the metric list below.

Metrics:

  • Entity Recall: Percentage of correctly reconstructed entities
  • Token Accuracy: Token-level prediction accuracy
  • Exact Match: Full sequence reconstruction accuracy
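
As a rough illustration, entity recall can be computed as the fraction of masked entities whose predicted string matches the original. The function below is a minimal sketch, not the repository's evaluation script; the case and whitespace normalization are assumptions.

def entity_recall(predicted: list[str], gold: list[str]) -> float:
    """Fraction of masked entities reconstructed exactly (normalization assumed)."""
    if not gold:
        return 0.0
    hits = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(predicted, gold)
    )
    return hits / len(gold)

# Example: two of three entities recovered -> recall of about 0.67
print(entity_recall(["The Palestinians", "ICC", "Rome"],
                    ["The Palestinians", "ICC", "Hague"]))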

Usage

from transformers import AutoTokenizer, AutoModelForMaskedLM

from src.train.inference import EntityInfillingInference  # helper from this repository's source tree

# Load the model and tokenizer with plain transformers
tokenizer = AutoTokenizer.from_pretrained("Glazkov/sum-entity-infilling-onehead")
model = AutoModelForMaskedLM.from_pretrained("Glazkov/sum-entity-infilling-onehead")

# Initialize the repository's inference wrapper (it loads the model from model_path itself)
inference = EntityInfillingInference(
    model_path="Glazkov/sum-entity-infilling-onehead",
    device="cuda",  # or "cpu"
)

# Example inference: the summary supplies context for the masked entity
summary = "Membership gives the ICC jurisdiction over alleged crimes..."
masked_text = "(<mask> officially became the 123rd member of the International Criminal Court..."

predictions = inference.predict_masked_entities(
    summary=summary,
    masked_text=masked_text,
)
print(predictions)
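
If the repository's inference wrapper is unavailable, the tokenizer and model loaded above can fill the masks directly. This is a minimal sketch under the assumption that the summary and masked text are simply joined with the tokenizer's separator token; the wrapper may use a different input layout.

import torch

# Map the card's <mask> marker onto the tokenizer's actual mask token
joined = summary + " " + tokenizer.sep_token + " " + masked_text.replace("<mask>", tokenizer.mask_token)
inputs = tokenizer(joined, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy fill: decode the top-scoring token at each mask position
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
for pos in mask_positions:
    print(tokenizer.decode([logits[0, pos].argmax().item()]))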

Training Configuration

This model was trained using the following configuration:

  • Base Model: answerdotai/ModernBERT-base
  • Dataset: cnn_dailymail
  • Task: Entity Infilling
  • Framework: PyTorch with Accelerate
  • Training Date: 2025-11-25

For more details about the training process, see the training configuration file.

Model Architecture

The model uses the ModernBERT architecture with the following settings (a quick config check follows the list):

  • 22 transformer layers
  • Hidden size: 768
  • Vocabulary: Custom with <mask> token support
  • Maximum sequence length: 512 tokens
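
These values can be verified against the hosted model configuration; the attributes used below are standard transformers config fields.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Glazkov/sum-entity-infilling-onehead")
print(config.num_hidden_layers, config.hidden_size)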

License

This model is licensed under the MIT License.