# olmOCR Runner
## Introduction
This project is a small local runner for converting PDFs to Markdown with
olmOCR. It uses the upstream `olmocr` Python package for document processing and
an already-installed `llama-server` binary from llama.cpp for model inference.
The runner works with either AMD (ROCm) or NVIDIA (CUDA) GPUs, provided the
`llama-server` on `PATH` was built for the target GPU backend. The Python
environment is the same for both targets; CUDA and ROCm selection happens in
llama.cpp, not through PyTorch or vLLM dependencies.
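One heuristic way to confirm which backend your `llama-server` binary was built for is to inspect the libraries it links (a sketch, assuming a dynamically linked Linux build; statically linked or CPU-only builds won't match either pattern):

```shell
# Heuristic backend check: CUDA builds link libcudart/libcublas,
# ROCm builds link hipblas/rocblas. Only meaningful for dynamic builds.
bin="$(command -v llama-server || true)"
if [ -z "$bin" ]; then
  echo "llama-server not on PATH"
elif ldd "$bin" | grep -qiE 'cudart|cublas'; then
  echo "CUDA build: $bin"
elif ldd "$bin" | grep -qiE 'hipblas|rocblas'; then
  echo "ROCm build: $bin"
else
  echo "backend unclear (CPU-only or statically linked?): $bin"
fi
```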
## Installation
Install `uv` and make sure `llama-server` is available on `PATH` before running
the installer. This repository does not download or install `llama-server`.
Run:
```bash
./install.sh
```
The installer will:
1. Verify that `llama-server` is available.
2. Run `uv sync`.
3. Download the GGUF olmOCR model and multimodal projection file into:
```text
~/models/olmOCR-2-7B-1025-Q4_K_M-GGUF/
```
By default, the runner expects these files:
```text
~/models/olmOCR-2-7B-1025-Q4_K_M-GGUF/olmocr-2-7b-1025-fp8-q4_k_m.gguf
~/models/olmOCR-2-7B-1025-Q4_K_M-GGUF/mmproj-f16.gguf
```
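An illustrative preflight check for these defaults (not part of the installer; it only reports which of the two expected files are present):

```shell
# Report any missing model file before attempting a run.
model_dir="$HOME/models/olmOCR-2-7B-1025-Q4_K_M-GGUF"
for f in "$model_dir/olmocr-2-7b-1025-fp8-q4_k_m.gguf" \
         "$model_dir/mmproj-f16.gguf"; do
  if [ -f "$f" ]; then
    echo "found:   $f"
  else
    echo "missing: $f"
  fi
done
```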
You can override paths with environment variables:
```bash
export LLAMA_SERVER=/path/to/llama-server
export OLMOCR_GGUF_MODEL=/path/to/olmocr.gguf
export OLMOCR_MMPROJ=/path/to/mmproj.gguf
```
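The overrides follow the usual shell default-resolution pattern: each variable keeps its exported value if set, otherwise falls back to the documented default. A sketch (the exact logic lives in the repository's scripts):

```shell
# Sketch of default resolution via ${VAR:-default} expansion.
model_dir="$HOME/models/olmOCR-2-7B-1025-Q4_K_M-GGUF"
LLAMA_SERVER="${LLAMA_SERVER:-llama-server}"
OLMOCR_GGUF_MODEL="${OLMOCR_GGUF_MODEL:-$model_dir/olmocr-2-7b-1025-fp8-q4_k_m.gguf}"
OLMOCR_MMPROJ="${OLMOCR_MMPROJ:-$model_dir/mmproj-f16.gguf}"
echo "server: $LLAMA_SERVER"
echo "model:  $OLMOCR_GGUF_MODEL"
echo "mmproj: $OLMOCR_MMPROJ"
```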
## Usage
Convert a PDF to Markdown with:
```bash
./ocr.sh path/to/input.pdf
```
The output is written next to the PDF with a `.md` extension. For example:
```bash
./ocr.sh docs/example.pdf
```
creates:
```text
docs/example.md
```
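The output path is the input path with its `.pdf` suffix swapped for `.md`, which in shell is a one-line suffix substitution:

```shell
# Derive the output path: same directory and basename, .pdf -> .md.
input="docs/example.pdf"
output="${input%.pdf}.md"
echo "$output"   # prints docs/example.md
```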
To run the integration test that generates a PDF with Pandoc and verifies that
olmOCR preserves a table and a formula:
```bash
uv run pytest -q tests/test_ocr_integration.py
```
That test requires `pandoc`, `xelatex`, `llama-server`, the model files, and a
working GPU backend. It skips cleanly if any of those are missing.
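If you want to know up front whether the test will skip, you can check its external tool prerequisites yourself (an illustrative preflight; the test performs equivalent checks internally and this does not cover the model files or GPU):

```shell
# Report whether the integration test's external tools are on PATH.
missing=""
for tool in pandoc xelatex llama-server; do
  command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
done
if [ -n "$missing" ]; then
  echo "integration test will skip; missing:$missing"
else
  echo "all external tools found"
fi
```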