nubench_datasets
Curated datasets from the NuBench benchmark suite (arXiv:2511.13111).
- class graphnet.datasets.nubench_datasets.NuBenchSpec(erda_hash, detector_cls, experiment, comments, features=<factory>, event_truth=<factory>, db_relpath='merged/merged.db', selection_relpaths=<factory>, pulsemap_per_split=<factory>)
Bases: object
Static configuration for a single NuBench dataset.
- Parameters:
erda_hash (str)
detector_cls (Type[NuBenchDetector])
experiment (str)
comments (str)
features (List[str])
event_truth (List[str])
db_relpath (str)
selection_relpaths (Dict[str, str])
pulsemap_per_split (Dict[str, str])
- erda_hash: str
- detector_cls: Type[NuBenchDetector]
- experiment: str
- comments: str
- features: List[str]
- event_truth: List[str]
- db_relpath: str = 'merged/merged.db'
- selection_relpaths: Dict[str, str]
- pulsemap_per_split: Dict[str, str]
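The `<factory>` defaults in the signature are how a dataclass field with a `default_factory` is usually rendered, so the spec can be pictured roughly as below. This is a minimal sketch, not the graphnet class: `SpecSketch` and `FakeDetector` are hypothetical stand-ins, the empty-container factories are an assumption (the real defaults are not shown here), and the split keys `train`, `val`, and `test` are assumed names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Type


class FakeDetector:
    """Hypothetical stand-in for a NuBenchDetector subclass."""


@dataclass
class SpecSketch:
    """Illustrative mirror of the NuBenchSpec fields; not the graphnet class."""

    erda_hash: str
    detector_cls: Type[FakeDetector]
    experiment: str
    comments: str
    # Assumption: empty containers; the real factories are not documented here.
    features: List[str] = field(default_factory=list)
    event_truth: List[str] = field(default_factory=list)
    db_relpath: str = "merged/merged.db"
    selection_relpaths: Dict[str, str] = field(default_factory=dict)
    pulsemap_per_split: Dict[str, str] = field(default_factory=dict)


spec = SpecSketch(
    erda_hash="<hash>",  # placeholder, not a real ERDA hash
    detector_cls=FakeDetector,
    experiment="example",
    comments="illustrative only",
    pulsemap_per_split={
        "train": "merged_photons",
        "val": "merged_photons",
        "test": "pulses_no_noise",
    },
)
```

Using a factory for the list and dict fields gives each spec its own container, so editing one spec's `features` cannot leak into another.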
- class graphnet.datasets.nubench_datasets.NuBenchDataset(name, download_dir, data_representation, **kwargs)
Bases: ERDAHostedDataset
Single entry point for every NuBench benchmark dataset.
Pick a dataset by its registry name (see available_datasets()) and pass a DataRepresentation whose detector matches the dataset. The tarball is downloaded from ERDA on first use and extracted into {download_dir}/{name}/.
The NuBench convention is that train/val events live in the merged_photons pulsemap while test events live in pulses_no_noise. This class builds each split against the correct pulsemap automatically.
Example:
    from graphnet.models.graphs import KNNGraph
    from graphnet.models.detector.nubench import Hexagon
    from graphnet.datasets import NuBenchDataset

    ds = NuBenchDataset(
        name="hexagon_ice_le",
        download_dir="/path/to/nubench_data",
        data_representation=KNNGraph(detector=Hexagon()),
    )
Construct a NuBench dataset by registry name.
- Parameters:
name (str) – Registry key of the NuBench dataset (see available_datasets()).
download_dir (str) – Directory to download and extract the dataset into.
data_representation (DataRepresentation) – Data representation whose detector must match the one expected by the selected dataset.
**kwargs (Any) – Forwarded to ERDAHostedDataset.
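The split-to-pulsemap convention described for this class can be pictured as a simple lookup. The following is a minimal sketch under stated assumptions, not the class's actual implementation: `PULSEMAP_PER_SPLIT` and `pulsemap_for` are hypothetical names, and the split keys `train`, `val`, and `test` are assumed; only the two pulsemap names come from the description.

```python
from typing import Dict

# Assumed split names; the pulsemap names follow the NuBench convention.
PULSEMAP_PER_SPLIT: Dict[str, str] = {
    "train": "merged_photons",
    "val": "merged_photons",
    "test": "pulses_no_noise",
}


def pulsemap_for(split: str) -> str:
    """Return the pulsemap name to read for the given split."""
    try:
        return PULSEMAP_PER_SPLIT[split]
    except KeyError:
        valid = sorted(PULSEMAP_PER_SPLIT)
        raise ValueError(
            f"Unknown split {split!r}; expected one of {valid}"
        ) from None
```

Keeping the mapping in one place means a caller never has to remember that test events use a different pulsemap from train and val events.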
- classmethod available_datasets()
Return the list of registered NuBench dataset names.
- Return type:
List[str]
- property dataset_dir: str
Return the root directory of the extracted dataset.
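Since the class description states that the tarball is extracted into {download_dir}/{name}/, the value of this property can presumably be predicted before the dataset is built. The helper below is a hypothetical illustration of that layout, assuming dataset_dir points at exactly that directory; `expected_dataset_dir` is not part of graphnet.

```python
import os


def expected_dataset_dir(download_dir: str, name: str) -> str:
    """Join the download directory and registry name, per the documented layout."""
    # Assumption: dataset_dir equals {download_dir}/{name}.
    return os.path.join(download_dir, name)


# Values taken from the class example; the path itself is a placeholder.
root = expected_dataset_dir("/path/to/nubench_data", "hexagon_ice_le")
```

A check such as `os.path.isdir(root)` could then tell whether a dataset has already been extracted, again assuming the layout above.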