Skip to content

Configuration Reference

All config lives in YAML files. configs/base.yaml is the default; experiment configs override it via base: configs/base.yaml.

Top-Level

Key Type Default Description
mode str "submit" "submit" for Kaggle inference, "train" for local training

paths

Key Type Default Description
data_dir str /kaggle/input/competitions/birdclef-2026 Competition data root
model_dir str /kaggle/input/models/.../perch_v2_cpu/1 Perch v2 model directory
cache_dir str cache/ Writable embedding cache directory
cache_input_dir str /kaggle/input/perch-meta Read-only pre-computed cache dataset

pipeline

Key Type Default Description
batch_files int 32 Audio files per Perch embedding batch
proxy_reduce str "max" Reduction over proxy scores ("max" or "mean")
require_full_cache_in_submit bool true Fail fast if cache missing in submit mode
verbose bool false Extra logging
run_oof_baseline bool false Compute OOF baseline scores
run_probe_check bool false Sanity-check probe predictions
dryrun_n_files int 20 Files to process in dryrun/debug mode

proto_ssm (architecture -- LOCKED)

These are fixed and should not be tuned.

Key Type Default Description
d_model int 128 Hidden dimension
d_state int 16 SSM state size
n_ssm_layers int 2 Number of SSM layers
dropout float 0.15 Dropout rate
n_prototypes int 1 Prototypes per class
n_sites int 20 Site-embedding cardinality
meta_dim int 16 Metadata embedding dimension
use_cross_attn bool true Enable cross-attention layer
cross_attn_heads int 4 Number of cross-attention heads

proto_ssm_train (neuropt-tunable)

Parameters marked with a star are included in the neuropt search space.

Key Type Default neuropt Description
n_epochs int 35 -- Training epochs
lr float 5.5e-4 log_uniform [5e-4, 3e-3] Learning rate
weight_decay float 1e-3 log_uniform [8e-4, 5e-3] AdamW weight decay
val_ratio float 0.15 -- Validation holdout fraction
patience int 15 -- Early-stopping patience (epochs)
pos_weight_cap float 41.0 uniform [20, 45] Cap on positive class weight
distill_weight float 0.23 uniform [0.05, 0.25] Distillation loss weight
proto_margin float 0.1 -- Prototype margin loss
label_smoothing float 0.019 uniform [0.01, 0.05] Label smoothing factor
oof_n_splits int 3 -- OOF cross-validation folds
mixup_alpha float 0.25 uniform [0.1, 0.45] Mixup interpolation alpha
focal_gamma float 1.15 uniform [1.0, 3.0] Focal loss gamma
swa_start_frac float 0.68 uniform [0.6, 0.85] Fraction of training before SWA
swa_lr float 5e-4 -- SWA learning rate

fusion

Key Type Default Description
lambda_event float 0.4 Weight for event-level scores
lambda_texture float 1.0 Weight for texture-level scores
lambda_proxy_texture float 0.8 Weight for proxy-texture blend
smooth_texture float 0.35 Temporal smoothing for texture stream
smooth_event float 0.15 Temporal smoothing for event stream

residual_ssm

Key Type Default Description
d_model int 64 Hidden dimension
d_state int 8 SSM state size
n_ssm_layers int 1 Number of SSM layers
dropout float 0.1 Dropout rate
correction_weight float 0.3 Blend weight for residual correction
n_epochs int 20 Training epochs
lr float 1e-3 Learning rate
patience int 8 Early-stopping patience

temperature

Key Type Default Description
aves float 1.10 Temperature for bird (Aves) logits
texture float 0.95 Temperature for texture taxa (Amphibia, Insecta)

Inference (top-level)

Key Type Default Description
file_level_top_k int 2 Keep top-k species per file
tta_shifts list[int] [0] Test-time augmentation shifts (seconds)

probe

Key Type Default Description
backend str "mlp" "mlp" or "logistic"
pca_dim int 64 PCA dimension for probe input
min_pos int 8 Minimum positive examples to train a probe
C float 0.5 Logistic regularisation strength
alpha float 0.4 Probe blend weight

probe.mlp

Key Type Default Description
hidden_layer_sizes list[int] [128] MLP hidden layers
activation str "relu" Activation function
max_iter int 100 Max training iterations
early_stopping bool true Enable early stopping
validation_fraction float 0.15 Validation split for early stopping
n_iter_no_change int 10 Early stopping patience
learning_rate_init float 0.001 Initial learning rate
l2_alpha float 0.01 L2 regularisation

timer

Key Type Default Description
budget_seconds float 5400.0 Total wall-clock budget (90 min)
residual_ssm_min_remaining float 240.0 Skip residual SSM if less time remains

neuropt

Key Type Default Description
search_space dict (see below) Parameter search ranges
batch_size int 3 Parallel evaluations per neuropt round
max_evals int 60 Maximum total evaluations
backend str "claude" neuropt backend

neuropt.search_space

Param Type Range
lr log_uniform [5e-4, 3e-3]
weight_decay log_uniform [8e-4, 5e-3]
distill_weight uniform [0.05, 0.25]
label_smoothing uniform [0.01, 0.05]
mixup_alpha uniform [0.1, 0.45]
focal_gamma uniform [1.0, 3.0]
swa_start_frac uniform [0.6, 0.85]
pos_weight_cap uniform [20, 45]