nodes

Class(es) for building/connecting graphs.

class graphnet.models.data_representation.graphs.nodes.nodes.NodeDefinition(*args, **kwargs)[source]

Bases: Model

Base class for graph building.

Construct NodeDefinition.

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

object

forward(x)[source]

Construct nodes from raw node features.

Parameters:
  • x (tensor) – standardized node features with shape [num_pulses, d], where d is the number of node features.

  • node_feature_names – list of names for each column in x.

Returns:

a graph without edges

Return type:

graph

property nb_outputs: int

Return number of output features.

This is the default, but it may be overridden by specific inheriting classes.

set_number_of_inputs(input_feature_names)[source]

Set the number of inputs expected by the node definition.

Parameters:

input_feature_names (List[str]) – name of each input feature column.

Return type:

None

set_output_feature_names(input_feature_names)[source]

Set output feature names as a member variable.

Parameters:
  • input_feature_names (List[str]) – List of column names of the input to the node definition.

Return type:

None
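The way set_number_of_inputs, set_output_feature_names and nb_outputs fit together can be illustrated with a small stand-in class. This is a minimal sketch and not graphnet's implementation: ToyNodeDefinition and its attribute names are hypothetical, and it assumes a pass-through definition whose output features are the same as its input features.

```python
from typing import List


class ToyNodeDefinition:
    """Hypothetical stand-in that mirrors the documented method names."""

    def __init__(self) -> None:
        self.nb_inputs = 0
        self._output_feature_names: List[str] = []

    def set_number_of_inputs(self, input_feature_names: List[str]) -> None:
        # One input per feature column.
        self.nb_inputs = len(input_feature_names)

    def set_output_feature_names(self, input_feature_names: List[str]) -> None:
        # Assumption: a pass-through definition keeps the input names.
        self._output_feature_names = list(input_feature_names)

    @property
    def nb_outputs(self) -> int:
        # Default behaviour: one output per output feature name.
        return len(self._output_feature_names)


names = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
node_definition = ToyNodeDefinition()
node_definition.set_number_of_inputs(names)
node_definition.set_output_feature_names(names)
```

A subclass that changes the feature set (for example by summarizing pulses per cluster) would produce a different list of output names, which is why nb_outputs may be overridden.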

class graphnet.models.data_representation.graphs.nodes.nodes.NodesAsPulses(*args, **kwargs)[source]

Bases: NodeDefinition

Represent each measured pulse of Cherenkov Radiation as a node.

Construct NodesAsPulses.

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

object

class graphnet.models.data_representation.graphs.nodes.nodes.PercentileClusters(*args, **kwargs)[source]

Bases: NodeDefinition

Represent nodes as clusters with percentile summary node features.

If cluster_on is set to the xyz coordinates of DOMs, e.g. cluster_on = ['dom_x', 'dom_y', 'dom_z'], each node will be a unique DOM and the pulse information (charge, time) is summarized using percentiles.

Construct PercentileClusters.

Parameters:
  • cluster_on (List[str]) – Names of features to create clusters from.

  • percentiles (List[int]) – List of percentiles. E.g. [10, 50, 90].

  • add_counts (bool, default: True) – If True, number of duplicates is added to output array.

  • input_feature_names (Optional[List[str]], default: None) – (Optional) column names for input features.

  • args (Any)

  • kwargs (Any)

Return type:

object
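The clustering idea described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the library's code: the helper names percentile and percentile_clusters are hypothetical, percentiles use linear interpolation between sorted samples, and the output column order (cluster columns, then percentiles per remaining feature, then the count) is chosen for readability only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between sorted samples."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_clusters(
    rows: List[List[float]],
    feature_names: List[str],
    cluster_on: List[str],
    percentiles: List[int],
    add_counts: bool = True,
) -> List[List[float]]:
    """Group rows by the cluster_on columns and summarize the rest."""
    cluster_idx = [feature_names.index(name) for name in cluster_on]
    summary_idx = [i for i in range(len(feature_names)) if i not in cluster_idx]

    groups: Dict[Tuple[float, ...], List[List[float]]] = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in cluster_idx)].append(row)

    nodes = []
    for key, members in groups.items():
        node = list(key)
        for i in summary_idx:
            column = [member[i] for member in members]
            node.extend(percentile(column, q) for q in percentiles)
        if add_counts:
            # Number of pulses that were merged into this cluster.
            node.append(float(len(members)))
        nodes.append(node)
    return nodes


feature_names = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
rows = [
    [0.0, 0.0, 0.0, 10.0, 1.0],
    [0.0, 0.0, 0.0, 20.0, 3.0],
    [0.0, 0.0, 0.0, 30.0, 2.0],
    [1.0, 0.0, 0.0, 5.0, 0.5],
]
nodes = percentile_clusters(
    rows, feature_names, ["dom_x", "dom_y", "dom_z"], [0, 50, 100]
)
```

Four pulses on two DOMs collapse into two nodes, each carrying the DOM position, three percentiles of time, three percentiles of charge, and the pulse count.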

class graphnet.models.data_representation.graphs.nodes.nodes.NodeAsDOMTimeSeries(*args, **kwargs)[source]

Bases: NodeDefinition

Represent each node as a DOM with time and charge time series data.

Construct NodeAsDOMTimeSeries.

Parameters:
  • keys (List[str], default: ['dom_x', 'dom_y', 'dom_z', 'dom_time', 'charge']) – Names of features in the data (in order).

  • id_columns (List[str], default: ['dom_x', 'dom_y', 'dom_z']) – List of columns that uniquely identify a DOM.

  • time_column (str, default: 'dom_time') – Name of time column.

  • charge_column (str, default: 'charge') – Name of charge column.

  • max_activations (Optional[int], default: None) – Maximum number of activations to include in the time series.

  • args (Any)

  • kwargs (Any)

Return type:

object
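How pulses might be grouped into per-DOM time series can be sketched as follows. This is a hedged illustration rather than graphnet's implementation: dom_time_series is a hypothetical helper, and it assumes that each series is sorted by time and that max_activations keeps the earliest activations.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def dom_time_series(
    rows: List[List[float]],
    keys: List[str],
    id_columns: List[str],
    time_column: str = "dom_time",
    charge_column: str = "charge",
    max_activations: Optional[int] = None,
) -> Dict[Tuple[float, ...], List[Tuple[float, float]]]:
    """Collect (time, charge) pairs per DOM, ordered in time."""
    id_idx = [keys.index(name) for name in id_columns]
    time_idx = keys.index(time_column)
    charge_idx = keys.index(charge_column)

    series: Dict[Tuple[float, ...], List[Tuple[float, float]]] = defaultdict(list)
    for row in rows:
        dom = tuple(row[i] for i in id_idx)
        series[dom].append((row[time_idx], row[charge_idx]))

    result = {}
    for dom, activations in series.items():
        # Sort by time (ties are broken by charge).
        activations = sorted(activations)
        if max_activations is not None:
            # Assumption: keep the earliest activations only.
            activations = activations[:max_activations]
        result[dom] = activations
    return result


keys = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
pulses = [
    [0.0, 0.0, 0.0, 30.0, 2.0],
    [0.0, 0.0, 0.0, 10.0, 1.0],
    [1.0, 0.0, 0.0, 5.0, 0.5],
    [0.0, 0.0, 0.0, 20.0, 3.0],
]
time_series = dom_time_series(
    pulses, keys, ["dom_x", "dom_y", "dom_z"], max_activations=2
)
```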

class graphnet.models.data_representation.graphs.nodes.nodes.IceMixNodes(*args, **kwargs)[source]

Bases: NodeDefinition

Calculate ice properties and perform random sampling.

Ice properties are calculated based on the z-coordinate of the pulse. For each event, a random sampling is performed to keep the number of pulses below a maximum number of pulses if n_pulses is over the limit.

Construct IceMixNodes.

Parameters:
  • input_feature_names (Optional[List[str]], default: None) – Column names for input features. Minimum required features are z coordinate and hlc column names.

  • max_pulses (int, default: 768) – Maximum number of pulses to keep in the event.

  • z_name (str, default: 'dom_z') – Name of the z-coordinate column.

  • hlc_name (Optional[str], default: 'hlc') – Name of the Hard Local Coincidence Check column.

  • add_ice_properties (bool, default: True) – If True, scattering and absorption length of ice in IceCube are added to the feature set based on z coordinate.

  • ice_args (Dict[str, Optional[float]], default: {'z_offset': None, 'z_scaling': None}) – Offset and scaling of the z coordinate in the Detector, to be able to make similar conversion in the ice data.

  • sample_pulses (bool, default: True) – Enable sampling random pulses. If True and the event is longer than max_pulses, the pulses will be sampled. If False, then only the first max_pulses pulses will be selected.

  • args (Any)

  • kwargs (Any)

Return type:

object
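The effect of max_pulses and sample_pulses can be sketched with a small helper. This is a simplified illustration under stated assumptions and not the library's implementation: select_pulses is a hypothetical function, sampling is uniform without replacement, the original pulse order is preserved, and the hlc column is ignored entirely.

```python
import random
from typing import List, Optional, Sequence


def select_pulses(
    pulses: Sequence[float],
    max_pulses: int = 768,
    sample_pulses: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Keep at most max_pulses pulses from one event."""
    if len(pulses) <= max_pulses:
        # Short events are returned unchanged.
        return list(pulses)
    if sample_pulses:
        rng = rng or random.Random()
        # Draw distinct indices, then restore the original order.
        keep = sorted(rng.sample(range(len(pulses)), max_pulses))
        return [pulses[i] for i in keep]
    # Without sampling, keep only the first max_pulses pulses.
    return list(pulses[:max_pulses])


event = list(range(1000))
sampled = select_pulses(event, max_pulses=768, rng=random.Random(0))
truncated = select_pulses(event, max_pulses=768, sample_pulses=False)
short_event = select_pulses([1, 2, 3])
```

Passing a seeded random.Random makes the selection reproducible, which is convenient when comparing runs.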

class graphnet.models.data_representation.graphs.nodes.nodes.ClusterSummaryFeatures(*args, **kwargs)[source]

Bases: NodeDefinition

Represent pulse maps as clusters with summary features.

If cluster_on is set to the xyz coordinates of optical modules, e.g. cluster_on = ['dom_x', 'dom_y', 'dom_z'], each node will be a unique optical module and the pulse information (e.g. charge, time) is summarized.

NOTE: Developed to be used with features [dom_x, dom_y, dom_z, charge, time].

Possible features per cluster:

  • total charge

    feature name: total_charge

  • charge accumulated after <X> time units

    feature name: charge_after_<X>ns

  • time of first hit in the optical module

    feature name: time_of_first_hit

  • time spread per optical module

    feature name: time_spread

  • time std per optical module

    feature name: time_std

  • time taken to collect <X> percent of the total charge per cluster

    feature name: time_after_charge_pct<X>

  • number of pulses per cluster

    feature name: counts

For more details on some of the features, see Theo Glauch's thesis (chapter 5.3): https://mediatum.ub.tum.de/node?id=1584755

Construct ClusterSummaryFeatures.

Parameters:
  • cluster_on (List[str]) – Names of features to create clusters from.

  • input_feature_names (List[str]) – Column names for input features.

  • charge_label (str, default: 'charge') – Name of the charge column.

  • time_label (str, default: 'dom_time') – Name of the time column.

  • total_charge (bool, default: True) – If True, calculates total charge as feature.

  • charge_after_t (List[int], default: [10, 50, 100]) – List of times at which the accumulated charge is calculated as a feature.

  • time_of_first_hit (bool, default: True) – If True, time of first hit is added as a feature.

  • time_spread (bool, default: True) – If True, time spread is added as a feature.

  • time_std (bool, default: True) – If True, time std is added as a feature.

  • time_after_charge_pct (List[int], default: [1, 3, 5, 11, 15, 20, 50, 80]) – List of percentiles to calculate time after charge.

  • charge_standardization (Union[float, str], default: 'log') – Either a float or 'log'. If a float, the features are multiplied by this factor. If 'log', the features are transformed to log10 scale.

  • time_standardization (float, default: 0.001) – Standardization factor for time-based features.

  • order_in_time (bool, default: True) –

    If True, clusters are ordered in time.

    If your data is already ordered in time, you can set this to False to avoid a potential overhead.

    NOTE: Only set this to False if you are sure that the input data is already ordered in time; otherwise the results will be incorrect.

  • add_counts (bool, default: False) – If True, log10 of the number of pulses per cluster is added as a feature.

  • args (Any)

  • kwargs (Any)

Return type:

object

NOTE: Make sure that either the input data is not already standardized or that the charge_standardization and time_standardization parameters are set to 1, to avoid double standardization.
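The summary features listed for this class can be illustrated for a single cluster. This is a minimal sketch under stated assumptions, not graphnet's implementation: summarize_cluster is a hypothetical helper, times are measured relative to the first hit in the cluster, time spread is taken as last minus first hit, the standard deviation is the population standard deviation, and no charge or time standardization is applied.

```python
from statistics import pstdev
from typing import Dict, Sequence


def summarize_cluster(
    times: Sequence[float],
    charges: Sequence[float],
    charge_after_t: Sequence[int] = (10, 50, 100),
    time_after_charge_pct: Sequence[int] = (50,),
) -> Dict[str, float]:
    """Compute per-cluster summary features from raw pulses."""
    # Order the pulses in time before accumulating charge.
    order = sorted(range(len(times)), key=lambda i: times[i])
    t = [times[i] for i in order]
    q = [charges[i] for i in order]
    first, total = t[0], sum(q)

    features = {
        "total_charge": total,
        "time_of_first_hit": first,
        "time_spread": t[-1] - first,
        "time_std": pstdev(t),
        "counts": len(t),
    }
    for x in charge_after_t:
        # Charge collected within x time units of the first hit.
        features[f"charge_after_{x}ns"] = sum(
            qi for ti, qi in zip(t, q) if ti - first <= x
        )
    for pct in time_after_charge_pct:
        # Time until pct percent of the total charge has been collected.
        threshold, running, elapsed = total * pct / 100.0, 0.0, t[-1] - first
        for ti, qi in zip(t, q):
            running += qi
            if running >= threshold:
                elapsed = ti - first
                break
        features[f"time_after_charge_pct{pct}"] = elapsed
    return features


features = summarize_cluster(
    times=[12.0, 2.0, 62.0, 7.0], charges=[1.0, 2.0, 1.0, 4.0]
)
```

The input pulses are deliberately unordered here, which is why the helper sorts them first; this mirrors the reason order_in_time should only be disabled for data that is already time-ordered.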

set_indices(feature_names)[source]

Set the indices for the input features.

Return type:

None

Parameters:

feature_names (List[str])