string_selection_resolver
Utilities for resolving string-based selections to event indices.
- class graphnet.data.utilities.string_selection_resolver.StringSelectionResolver(dataset, index_column, seed, use_cache)
Bases: Logger
Resolve string-based selection to event indices.
String-based selection, used in DatasetConfig, is a flexible way of defining event selections. The example below shows an involved event selection that should cover most standard event selections currently in use with graphnet.
```yml
# dataset/config.yml
selection:
  - test:
      50000 random events ~ event_no % 5 == 0 & abs(pid) == 12
      50000 random events ~ event_no % 5 == 0 & abs(pid) == 14
      50000 random events ~ event_no % 5 == 0 & abs(pid) == 16
      50000 random events ~ event_no % 5 == 0 & abs(pid) == 13
      50000 random events ~ event_no % 5 == 0 & abs(pid) == 1
  - validation:
      10000 random events ~ event_no % 5 == 1 & abs(pid) == 12
      10000 random events ~ event_no % 5 == 1 & abs(pid) == 14
      10000 random events ~ event_no % 5 == 1 & abs(pid) == 16
      10000 random events ~ event_no % 5 == 1 & abs(pid) == 13
      10000 random events ~ event_no % 5 == 1 & abs(pid) == 1
  - train:
      10000 random events ~ event_no % 5 > 1 & abs(pid) == 12
      10000 random events ~ event_no % 5 > 1 & abs(pid) == 14
      10000 random events ~ event_no % 5 > 1 & abs(pid) == 16
      10000 random events ~ event_no % 5 > 1 & abs(pid) == 13
      10000 random events ~ event_no % 5 > 1 & abs(pid) == 1
```
Construct StringSelectionResolver.
- Parameters:
dataset (Dataset)
index_column (str)
seed (int | None)
use_cache (bool)
- resolve(selection)
Resolve a selection string to a list of indices.
Selections are expected to have pandas.DataFrame.query-compatible syntax, e.g.,
`"event_no % 5 > 0"`
Selections may also specify a fixed number or a percentage of events to randomly sample, e.g.,
`"10000 random events ~ event_no % 5 > 0"`
`"20% random events ~ event_no % 5 > 0"`
- Return type:
List[int]
- Parameters:
selection (str)
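The selection syntax described for resolve can be illustrated with a small standalone sketch. This is not graphnet's implementation: the helper name `resolve_selection_sketch`, the regular expression, the toy `events` table, and the choice to cap a fixed count at the number of matching events are all assumptions made for illustration. It only relies on `pandas.DataFrame.query` and `pandas.DataFrame.sample`.

```python
import re
from typing import List, Optional

import pandas as pd

# Hypothetical pattern for the optional "<N> random events ~ " or
# "<P>% random events ~ " prefix in front of the query.
_RANDOM_PREFIX = re.compile(
    r"^\s*(\d+(?:\.\d+)?)(%?)\s+random\s+events\s*~\s*(.+)$"
)


def resolve_selection_sketch(
    df: pd.DataFrame,
    selection: str,
    index_column: str = "event_no",
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper: resolve a selection string to index values."""
    match = _RANDOM_PREFIX.match(selection)
    query = match.group(3) if match else selection

    # Filter rows using pandas' query syntax.
    selected = df.query(query)

    if match:
        amount = float(match.group(1))
        if match.group(2) == "%":
            # Percentage of the events that pass the query.
            n = int(round(len(selected) * amount / 100.0))
        else:
            # Assumption: cap at the number of available events.
            n = min(int(amount), len(selected))
        # A fixed seed makes the random subsample reproducible.
        selected = selected.sample(n=n, random_state=seed)

    return selected[index_column].tolist()


# Toy event table standing in for a real dataset.
events = pd.DataFrame(
    {"event_no": range(100), "pid": [12, -12, 14, 13] * 25}
)

plain = resolve_selection_sketch(events, "event_no % 5 > 0")
fixed = resolve_selection_sketch(
    events, "10 random events ~ event_no % 5 > 0", seed=42
)
share = resolve_selection_sketch(
    events, "20% random events ~ event_no % 5 > 0", seed=42
)
```

In this sketch the seed plays the role that the `seed` constructor parameter presumably plays in the class: repeating a call with the same seed yields the same random subset.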