graphnet.data.utilities.string_selection_resolver module

Utilities for resolving string-based selections to event indices.

class graphnet.data.utilities.string_selection_resolver.StringSelectionResolver(dataset, index_column, seed, use_cache)

Bases: Logger

Resolve string-based selection to event indices.

String-based selection, as used in DatasetConfig, is a very flexible way of defining event selections. The example below shows a fairly involved event selection, which should cover most standard event selections currently in use with graphnet.

```yml
# dataset/config.yml
selection:
  test:
    - 50000 random events ~ event_no % 5 == 0 & abs(pid) == 12
    - 50000 random events ~ event_no % 5 == 0 & abs(pid) == 14
    - 50000 random events ~ event_no % 5 == 0 & abs(pid) == 16
    - 50000 random events ~ event_no % 5 == 0 & abs(pid) == 13
    - 50000 random events ~ event_no % 5 == 0 & abs(pid) == 1
  validation:
    - 10000 random events ~ event_no % 5 == 1 & abs(pid) == 12
    - 10000 random events ~ event_no % 5 == 1 & abs(pid) == 14
    - 10000 random events ~ event_no % 5 == 1 & abs(pid) == 16
    - 10000 random events ~ event_no % 5 == 1 & abs(pid) == 13
    - 10000 random events ~ event_no % 5 == 1 & abs(pid) == 1
  train:
    - 10000 random events ~ event_no % 5 > 1 & abs(pid) == 12
    - 10000 random events ~ event_no % 5 > 1 & abs(pid) == 14
    - 10000 random events ~ event_no % 5 > 1 & abs(pid) == 16
    - 10000 random events ~ event_no % 5 > 1 & abs(pid) == 13
    - 10000 random events ~ event_no % 5 > 1 & abs(pid) == 1
```
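The modulo conditions in the selection above keep the three splits disjoint: an event number cannot satisfy more than one of `event_no % 5 == 0`, `event_no % 5 == 1`, and `event_no % 5 > 1`. The following is a minimal sketch of that property using plain pandas rather than graphnet itself; the toy `events` table and the `splits` dictionary are hypothetical stand-ins for a real dataset and config.

```python
import pandas as pd

# Hypothetical toy table standing in for a dataset's event index column.
events = pd.DataFrame({"event_no": range(20)})

# The three modulo conditions used in the config example.
splits = {
    "test": "event_no % 5 == 0",
    "validation": "event_no % 5 == 1",
    "train": "event_no % 5 > 1",
}

# Evaluate each condition with DataFrame.query and collect the matching events.
indices = {
    name: set(events.query(expr)["event_no"].tolist())
    for name, expr in splits.items()
}

# Events shared between any two splits (expected to be empty).
overlap = (
    (indices["test"] & indices["validation"])
    | (indices["test"] & indices["train"])
    | (indices["validation"] & indices["train"])
)

# Events assigned to at least one split (expected to be all of them).
covered = indices["test"] | indices["validation"] | indices["train"]

print(sorted(indices["test"]))  # [0, 5, 10, 15]
print(len(overlap), len(covered))  # 0 20
```

If event numbers are spread evenly over the residues modulo 5, these conditions give roughly a 20/20/60 split between test, validation, and train before any random subsampling is applied.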

Construct StringSelectionResolver.

Parameters:
  • dataset (Dataset)

  • index_column (str)

  • seed (int | None)

  • use_cache (bool)

resolve(selection)

Resolve a selection given as a string to a list of indices.

Selections are expected to use pandas.DataFrame.query-compatible syntax, e.g., `"event_no % 5 > 0"`. Selections may also specify a fixed number or a percentage of events to randomly sample, e.g., `"10000 random events ~ event_no % 5 > 0"` or `"20% random events ~ event_no % 5 > 0"`.

Return type:

List[int]

Parameters:

selection (str)
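To make the selection syntax described for `resolve` concrete, here is a rough sketch of how such a string could be split into a sampling prefix and a query and then evaluated with pandas. It is not graphnet's implementation: the function `resolve_selection_sketch`, its parameters, and the rounding and sorting choices are assumptions made for illustration only. It also assumes that `~` appears only as the separator and never as a negation operator inside the query.

```python
import re
from typing import List, Optional

import numpy as np
import pandas as pd


def resolve_selection_sketch(
    df: pd.DataFrame,
    selection: str,
    index_column: str = "event_no",
    seed: Optional[int] = None,
) -> List[int]:
    """Illustrative resolver (hypothetical, not part of graphnet)."""
    # Split "<N or P%> random events ~ <query>" at the separator.
    # Assumption: "~" is only ever used as the separator.
    prefix, sep, query = selection.partition("~")
    if not sep:
        # No separator: the whole string is a plain query.
        prefix, query = "", selection

    # Evaluate the query and keep the index values of matching rows.
    matched = df.query(query.strip())[index_column].to_numpy()
    if not prefix.strip():
        return matched.tolist()

    # Parse the sampling prefix, e.g. "10000 random events" or "20% ...".
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)(%?)\s+random\s+events\s*", prefix
    )
    if match is None:
        raise ValueError(f"Unrecognised sampling prefix: {prefix!r}")
    amount = float(match.group(1))
    if match.group(2) == "%":
        # Assumption: percentages are rounded down to whole events.
        n_events = int(len(matched) * amount / 100)
    else:
        n_events = int(amount)
    # Never request more events than the query matched.
    n_events = min(n_events, len(matched))
    if n_events == 0:
        return []

    # Seeded sampling without replacement, so results are reproducible.
    rng = np.random.default_rng(seed)
    sampled = rng.choice(matched, size=n_events, replace=False)
    return sorted(sampled.tolist())


# Hypothetical toy data: 20 events, 16 of which satisfy the query.
toy = pd.DataFrame({"event_no": range(20)})
plain = resolve_selection_sketch(toy, "event_no % 5 > 0")
fixed = resolve_selection_sketch(
    toy, "10 random events ~ event_no % 5 > 0", seed=42
)
percent = resolve_selection_sketch(
    toy, "20% random events ~ event_no % 5 > 0", seed=42
)
print(len(plain), len(fixed), len(percent))  # 16 10 3
```

Passing the same `seed` twice yields the same sample in this sketch, which mirrors the purpose of the `seed` argument in the constructor signature; how the real class uses `seed` and `use_cache` internally is not shown here.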