Failed to run on MacBook: requiring flash_attn

by Handgun1773 - opened Jun 18, 2024

Discussion

Handgun1773

Jun 18, 2024 (edited Jun 18, 2024)

Seems like the same problem as: https://huggingface.co/microsoft/phi-1_5/discussions/72

I'm trying the workaround provided there, adapted to modeling_florence2.py (the model is still downloading, but it hasn't crashed so far):

import os
from unittest.mock import patch

import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor
from transformers.dynamic_module_utils import get_imports


def fixed_get_imports(filename: str | os.PathLike) -> list[str]:
    """Work around for https://huggingface.co/microsoft/phi-1_5/discussions/72."""
    if not str(filename).endswith("/modeling_florence2.py"):
        return get_imports(filename)
    imports = get_imports(filename)
    # Drop the flash_attn requirement; the guard avoids a ValueError if it is absent.
    if "flash_attn" in imports:
        imports.remove("flash_attn")
    return imports


# The patch only needs to be active while the remote model code is loaded.
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large-ft", trust_remote_code=True)
    processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large-ft", trust_remote_code=True)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)


def run_example(prompt):
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    # Decode with special tokens kept, then let the processor parse the task output.
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed_answer = processor.post_process_generation(generated_text, task=prompt, image_size=(image.width, image.height))
    print(parsed_answer)


prompt = "<MORE_DETAILED_CAPTION>"
run_example(prompt)

Edit: With this workaround, it works on my MacBook!
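
For readers wondering why the model has to be loaded inside the `with` block: `unittest.mock.patch` swaps the function only for the duration of the block and restores the original afterwards. A minimal sketch of that behaviour, using a hypothetical stand-in object (`fake_utils`) and a made-up import list instead of the real transformers module:

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for transformers.dynamic_module_utils; the import
# list it returns is invented for illustration.
fake_utils = SimpleNamespace(
    get_imports=lambda filename: ["torch", "flash_attn", "einops"]
)

# Keep a reference to the unpatched function before patching.
original_get_imports = fake_utils.get_imports


def filtered_get_imports(filename):
    """Return the original import list without flash_attn."""
    return [name for name in original_get_imports(filename) if name != "flash_attn"]


with patch.object(fake_utils, "get_imports", filtered_get_imports):
    # Inside the block, callers see the filtered list.
    inside = fake_utils.get_imports("modeling_florence2.py")

# Outside the block, the original function is back in place.
after = fake_utils.get_imports("modeling_florence2.py")

print(inside)
print(after)
```

This is why `from_pretrained` is called inside the block in the post above: the import check needs to run while the replacement is active.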

spacepxl

Jun 19, 2024

I was trying to run on Windows and wondering why the patch wasn't working, until I realized the forward slash wouldn't match a Windows file path.

For a more general solution that should work on any platform, replace this line:

if not str(filename).endswith("/modeling_florence2.py"):

with

if os.path.basename(filename) != "modeling_florence2.py":
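
To illustrate the difference between the two checks, here is a small sketch that compares them on a POSIX-style and a Windows-style path. The paths are made up, and `posixpath` / `ntpath` are used as stand-ins for what `os.path` resolves to on each platform, so the comparison can be run anywhere:

```python
import ntpath
import posixpath

TARGET = "modeling_florence2.py"


def matches_with_slash(filename: str) -> bool:
    """Original check: only matches paths that use forward slashes."""
    return filename.endswith("/" + TARGET)


def matches_with_basename(filename: str, pathmod) -> bool:
    """Basename check; pathmod stands in for os.path on a given platform."""
    return pathmod.basename(filename) == TARGET


# Hypothetical cache locations, for illustration only.
posix_path = "/home/user/.cache/huggingface/modules/modeling_florence2.py"
windows_path = r"C:\Users\user\.cache\huggingface\modules\modeling_florence2.py"

print(matches_with_slash(posix_path))
print(matches_with_slash(windows_path))
print(matches_with_basename(posix_path, posixpath))
print(matches_with_basename(windows_path, ntpath))
```

Only the forward-slash check on the Windows-style path fails to match, which is consistent with the behaviour described above.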

Norod78

Jun 20, 2024

Thank you for this patch, it works well on Mac.
I duplicated the Zero-GPU Space for the large model, switched it to the base-ft model, and applied your patch. It works both locally on Mac and on a CPU-based Space:
https://huggingface.co/spaces/Norod78/Florence-2-base-ft
Cheers!

mokolo1

Jun 24, 2024

Hi, I tried using your patch but I keep running into this error:
TypeError: Object of type Florence2LanguageConfig is not JSON serializable

It is raised on this line:
model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large-ft", trust_remote_code=True)

Is there something I'm missing?

Norod78

Jun 25, 2024 (edited Jun 25, 2024)

I'm also on Mac and have never seen this error before. Try making sure your transformers is recent, your numpy is on 1.x, and so on.
Here is the pip dump from the environment this worked on, Python 3.10.13 (Miniconda). The env is general purpose and has lots of stuff in it, but perhaps you can find a version difference in one of the major frameworks:
https://pastebin.com/5rgmUtwL

mokolo1

Jun 25, 2024

Thanks @Norod78, changing numpy to a version below 2 fixed it.
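
A quick way to check for this situation before loading the model might look like the sketch below. The helper name `is_numpy_1x` is made up for illustration, and the check only looks at the major version number:

```python
from importlib.metadata import PackageNotFoundError, version


def is_numpy_1x(version_string: str) -> bool:
    """Return True when the major version component is below 2."""
    major = int(version_string.split(".")[0])
    return major < 2


try:
    installed = version("numpy")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("numpy is not installed")
elif is_numpy_1x(installed):
    print(f"numpy {installed} is a 1.x release")
else:
    # Downgrading is what resolved the error in this thread.
    print(f"numpy {installed} is 2.x or newer; try: pip install 'numpy<2'")
```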

Norod78

Jun 25, 2024

mokolo1 wrote: "Thanks @Norod78, changing numpy to a version below 2 fixed it."

Yay! Glad I could help

Norod78

Jun 26, 2024 (edited Jun 26, 2024)

Note that this trick also works when choosing "MPS" (Apple Silicon) as your torch backend.
https://huggingface.co/spaces/Norod78/Florence-2-base-ft/blob/main/app.py
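
For reference, device selection with an MPS fallback could be sketched roughly as follows. This is not taken from the linked app.py; the helper `pick_device` and its preference order are assumptions made for illustration:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple Silicon MPS, then CPU (an assumed ordering)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


try:
    import torch

    device = pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
except ImportError:
    # torch is not installed; fall back to CPU so the sketch still runs.
    device = "cpu"

print(device)

# With a loaded model and processor, the tensors would then be moved over,
# for example:
#   model = model.to(device)
#   inputs = processor(text=prompt, images=image, return_tensors="pt").to(device)
```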
