
Quick Start

Get neuropt running in under 2 minutes.


01

Install neuropt

pip install "neuropt[llm]"

Set your API key:

export ANTHROPIC_API_KEY="sk-ant-..."

02

Write a training function

Your function trains a model and returns results. neuropt calls it with different configs each time.

# train.py
import torch
import torch.nn as nn

def train_fn(config):
    # You use the config values to build your model
    model = nn.Sequential(
        nn.Linear(784, config["hidden_dim"]),
        nn.ReLU() if config["activation"] == "relu" else nn.GELU(),
        nn.Dropout(config["dropout"]),
        nn.Linear(config["hidden_dim"], 10),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])

    train_losses, val_losses = [], []
    for epoch in range(10):
        # ... train one epoch ...
        train_losses.append(train_loss)
        val_losses.append(val_loss)

    return {
        "score": val_losses[-1],        # required
        "train_losses": train_losses,    # helps the LLM spot overfitting
        "val_losses": val_losses,        # helps the LLM spot underfitting
        "accuracy": val_acc,             # any extra metrics you want
    }

Return per-epoch losses

The more you return, the smarter the LLM gets. Per-epoch curves let it reason about why a config worked — not just that it did.
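To make the contract concrete, here is a minimal end-to-end version of the function above. The synthetic tensors and the short 3-epoch loop are placeholder choices for illustration, not part of neuropt; only the config keys and the return dict follow the pattern shown in this guide.

```python
import torch
import torch.nn as nn

def train_fn(config):
    # Synthetic stand-in data: 256 flattened 28x28 "images", 10 classes
    X = torch.randn(256, 784)
    y = torch.randint(0, 10, (256,))

    model = nn.Sequential(
        nn.Linear(784, config["hidden_dim"]),
        nn.ReLU() if config["activation"] == "relu" else nn.GELU(),
        nn.Dropout(config["dropout"]),
        nn.Linear(config["hidden_dim"], 10),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    loss_fn = nn.CrossEntropyLoss()

    train_losses, val_losses = [], []
    for epoch in range(3):
        model.train()
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
        train_losses.append(loss.item())

        # Reusing the training tensors as "validation" data, purely to
        # keep the sketch self-contained
        model.eval()
        with torch.no_grad():
            val_losses.append(loss_fn(model(X), y).item())

    return {
        "score": val_losses[-1],
        "train_losses": train_losses,
        "val_losses": val_losses,
    }

result = train_fn({"hidden_dim": 64, "activation": "relu",
                   "dropout": 0.1, "lr": 1e-3})
```

Running it once by hand like this, before handing it to neuropt, is a quick way to confirm the return dict has the shape the optimizer expects.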


03

Define a search space

# train.py (continued)

search_space = {
    "lr": (1e-4, 1e-1),              # auto → log-scale
    "hidden_dim": (64, 512),          # auto → integer
    "dropout": (0.0, 0.5),            # auto → uniform
    "activation": ["relu", "gelu"],   # auto → categorical
}

Tuples become ranges, lists become choices. neuropt infers the right sampling strategy from the param name and value types.
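The inference rule can be pictured with a small sketch. This is illustrative only; neuropt's actual heuristics are not documented here and may differ:

```python
def infer_strategy(name, spec):
    """Guess a sampling strategy from a search-space entry (illustrative)."""
    if isinstance(spec, list):
        return "categorical"        # lists become choices
    lo, hi = spec                   # tuples become ranges
    if isinstance(lo, int) and isinstance(hi, int):
        return "integer"
    # A learning-rate-like name, or a wide positive range, suggests log scale
    if "lr" in name or (lo > 0 and hi / lo >= 100):
        return "log-uniform"
    return "uniform"

infer_strategy("lr", (1e-4, 1e-1))           # "log-uniform"
infer_strategy("hidden_dim", (64, 512))      # "integer"
infer_strategy("dropout", (0.0, 0.5))        # "uniform"
infer_strategy("activation", ["relu", "gelu"])  # "categorical"
```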


04

Run it

neuropt run train.py
neuropt run train.py --backend claude -n 50

Or drive it from Python:

from neuropt import ArchSearch

search = ArchSearch(
    train_fn=train_fn,
    search_space=search_space,
    backend="claude",
)
search.run(max_evals=50)

print(search.best_config)
print(search.best_score)

Or skip the search space

Give it a model — neuropt figures out what to tune.

import torchvision
from neuropt import ArchSearch

model = torchvision.models.resnet18(num_classes=10)

def train_fn(config):
    m = config["model"].to("cuda")   # neuropt passes in a modified deep copy of your model
    # ... train ...
    return {"score": val_loss, "train_losses": [...], "val_losses": [...]}

search = ArchSearch.from_model(model, train_fn)
search.run(max_evals=30)

Works with PyTorch models, XGBoost, LightGBM, Random Forest, and any sklearn estimator. See Model Introspection for details.
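For sklearn-style estimators, this kind of introspection is possible because every estimator exposes its constructor arguments through the standard get_params() API. The snippet below is a sketch of that idea, not neuropt's implementation:

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()

# get_params() returns every constructor argument and its current value,
# so a tool can enumerate candidate hyperparameters automatically.
# Here we keep only numeric ones (excluding booleans like `bootstrap`).
numeric_params = {
    name: value
    for name, value in model.get_params().items()
    if isinstance(value, (int, float)) and not isinstance(value, bool)
}
```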


Options

Flag                  What it does
--backend claude      Use Claude (default if ANTHROPIC_API_KEY is set)
--backend none        Random search; no API key needed
-n 50                 Stop after 50 experiments
--log results.jsonl   Custom log file (crash-safe, resumable)
--device cuda         Force GPU device
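Because the log is plain JSONL (one experiment per line), it is easy to inspect by hand. A minimal sketch for pulling out the best run; the "config" and "score" field names are an assumption about the log format, not documented here:

```python
import json

def best_from_log(path):
    """Return the entry with the lowest score from a JSONL results log."""
    best = None
    with open(path) as f:
        for line in f:
            entry = json.loads(line)
            if best is None or entry["score"] < best["score"]:
                best = entry
    return best
```

Line-per-record logs like this are also what makes the run crash-safe: each completed experiment is flushed independently, so a resumed run can replay the file and skip finished configs.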