dataset_config
Config classes for the graphnet.data.dataset module.
- class graphnet.utilities.config.dataset_config.DatasetConfig(*, path, pulsemaps, features, truth, node_truth, index_column, truth_table, node_truth_table, string_selection, selection, loss_weight_table, loss_weight_column, loss_weight_default_value, seed, graph_definition, labels)
Bases: BaseConfig
Configuration for all `Dataset`s.
Construct DatasetConfig.
Can be used for dataset configuration as code, thereby making dataset construction more transparent and reproducible.
Examples
In one session, do:
>>> dataset = Dataset(...)
>>> dataset.config.dump()
path: (...)
pulsemaps:
    - (...)
(...)
>>> dataset.config.dump("dataset.yml")

In another session, you can then do:
>>> dataset = Dataset.from_config("dataset.yml")

# Uniquely for DatasetConfig, you can also define and load
# multiple datasets
>>> dataset.config.selection = {
    "train": "event_no % 2 == 0",
    "test": "event_no % 2 == 1",
}
>>> dataset.config.dump("dataset.yml")
>>> datasets: Dict[str, Dataset] = Dataset.from_config(
    "dataset.yml"
)
>>> datasets
{
    "train": Dataset(...),
    "test": Dataset(...),
}

# You can also combine multiple selections into a single, named
# dataset
>>> dataset.config.selection = {
    "train": [
        "event_no % 2 == 0 & abs(pid) == 12",
        "event_no % 2 == 0 & abs(pid) == 14",
        "event_no % 2 == 0 & abs(pid) == 16",
    ],
    (...)
}
>>> dataset.config.dump("dataset.yml")
>>> datasets: Dict[str, EnsembleDataset] = Dataset.from_config(
    "dataset.yml"
)
>>> datasets
{
    "train": EnsembleDataset(...),
    (...)
}

# Finally, you can still reference existing selection files in CSV
# or JSON formats:
>>> dataset.config.selection = {
    "train": "50000 random events ~ train_selection.csv",
    "test": "test_selection.csv",
}
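To make the train/test split in the example above concrete, the following standalone sketch evaluates the two conditions on a toy list of event numbers in plain Python. It does not use graphnet and makes no claim about how selection strings are actually parsed; the `selections` predicates are hypothetical stand-ins for the strings "event_no % 2 == 0" and "event_no % 2 == 1".

```python
from typing import Callable, Dict, List

# Toy event numbers; a real dataset would read these from its index column.
event_numbers: List[int] = list(range(10))

# Hypothetical stand-ins for the selection strings in the example.
selections: Dict[str, Callable[[int], bool]] = {
    "train": lambda event_no: event_no % 2 == 0,
    "test": lambda event_no: event_no % 2 == 1,
}

# One list of event numbers per named selection.
splits: Dict[str, List[int]] = {
    name: [e for e in event_numbers if keep(e)]
    for name, keep in selections.items()
}
```

Because the two conditions are mutually exclusive and together cover all integers, every event lands in exactly one split.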
- Parameters:
path (str | List[str])
pulsemaps (str | List[str])
features (List[str])
truth (List[str])
node_truth (List[str] | None)
index_column (str)
truth_table (str)
node_truth_table (str | None)
string_selection (List[int] | None)
selection (str | List[str] | List[int | List[int]] | Dict[str, str | List[str]] | None)
loss_weight_table (str | None)
loss_weight_column (str | None)
loss_weight_default_value (float | None)
seed (int | None)
graph_definition (Any)
labels (Dict[str, Any] | None)
- path: Union[str, List[str]]
- pulsemaps: Union[str, List[str]]
- features: List[str]
- truth: List[str]
- node_truth: Optional[List[str]]
- index_column: str
- truth_table: str
- node_truth_table: Optional[str]
- string_selection: Optional[List[int]]
- selection: Union[str, List[str], List[Union[int, List[int]]], Dict[str, Union[str, List[str]]], None]
- loss_weight_table: Optional[str]
- loss_weight_column: Optional[str]
- loss_weight_default_value: Optional[float]
- seed: Optional[int]
- graph_definition: Any
- labels: Optional[Dict[str, Any]]
- as_dict()
Represent DatasetConfig as a dict.
This builds on BaseModel.dict() but wraps the output in a single-key dictionary to make it unambiguous to identify model arguments that are themselves models.
- Return type:
Dict[str, Dict[str, Any]]
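The single-key wrapping described for as_dict() can be illustrated with a minimal, standalone sketch. `ToyConfig` is a hypothetical stand-in built on a dataclass rather than the real pydantic-based class, and using the class name as the single key is an assumption; the description above only states that the output is wrapped in a single-key dictionary.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a config class (not graphnet's)."""

    path: str
    features: List[str]

    def as_dict(self) -> Dict[str, Dict[str, Any]]:
        # Assumption: the single key is the class name, so a nested
        # config can be told apart from an ordinary dict argument.
        return {type(self).__name__: asdict(self)}


wrapped = ToyConfig(path="data.db", features=["x", "y"]).as_dict()
```

With this shape, code that walks a nested configuration can recognise a config by checking for a one-key dictionary whose key names a known class.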
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).
- graphnet.utilities.config.dataset_config.save_dataset_config(init_fn)
Save the arguments passed to an __init__ function as a DatasetConfig member on the instance.
- Return type:
Callable
- Parameters:
init_fn (Callable)
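The general pattern of a decorator that records `__init__` arguments can be sketched as follows. This is an illustration under stated assumptions, not graphnet's implementation: `save_init_args`, `ToyDataset`, and the `_saved_config` attribute are hypothetical names, and the captured arguments are stored as a plain dict rather than a DatasetConfig.

```python
import inspect
from functools import wraps
from typing import Any, Callable, Dict


def save_init_args(init_fn: Callable) -> Callable:
    """Hypothetical decorator: store the arguments of `__init__` on the instance."""

    @wraps(init_fn)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> None:
        # Bind positional and keyword arguments to parameter names,
        # filling in defaults for anything not passed explicitly.
        bound = inspect.signature(init_fn).bind(self, *args, **kwargs)
        bound.apply_defaults()
        captured: Dict[str, Any] = dict(bound.arguments)
        captured.pop("self", None)

        # Run the real constructor, then attach the captured arguments.
        init_fn(self, *args, **kwargs)
        self._saved_config = captured

    return wrapper


class ToyDataset:
    """Hypothetical class used only to exercise the decorator."""

    @save_init_args
    def __init__(self, path: str, features: list, seed: int = 0) -> None:
        self.path = path


ds = ToyDataset("data.db", features=["x"])
```

Binding through `inspect.signature` means positional arguments and defaults are recorded under their parameter names, which is what a reproducible config needs.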
- class graphnet.utilities.config.dataset_config.DatasetConfigSaverMeta
Bases: type
Metaclass for DatasetConfig that saves the config after __init__.
- class graphnet.utilities.config.dataset_config.DatasetConfigSaverABCMeta(name, bases, namespace, **kwargs)
Bases: DatasetConfigSaverMeta, ABCMeta
Common interface between DatasetConfigSaver and ABC Metaclasses.
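Why a combined metaclass is needed can be shown with a small, standalone sketch. All names here (`ConfigSaverMeta`, `ConfigSaverABCMeta`, `ToyBase`, `ToyDataset`, `_saved_kwargs`) are hypothetical, and the hook simply stores keyword arguments; it is not the graphnet implementation. A class can have only one metaclass, so an abstract base class that also wants the config-saving behaviour needs a metaclass deriving from both.

```python
from abc import ABCMeta, abstractmethod
from typing import Any


class ConfigSaverMeta(type):
    """Hypothetical metaclass: record constructor keyword arguments."""

    def __call__(cls, *args: Any, **kwargs: Any) -> Any:
        # type.__call__ runs __new__ and __init__; the hook runs afterwards.
        instance = super().__call__(*args, **kwargs)
        instance._saved_kwargs = dict(kwargs)
        return instance


class ConfigSaverABCMeta(ConfigSaverMeta, ABCMeta):
    """Combine both metaclasses so abstract classes can use the saver."""


class ToyBase(metaclass=ConfigSaverABCMeta):
    def __init__(self, path: str) -> None:
        self.path = path

    @abstractmethod
    def load(self) -> str:
        """Subclasses must implement this."""


class ToyDataset(ToyBase):
    def load(self) -> str:
        return f"loading {self.path}"


ds = ToyDataset(path="data.db")
```

The concrete subclass gets its arguments recorded, while instantiating `ToyBase` directly still raises `TypeError` because of the unimplemented abstract method.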