nodes

Class(es) for building/connecting graphs.

class graphnet.models.data_representation.graphs.nodes.nodes.NodeDefinition(*args, **kwargs)

Bases: Model

Base class for graph building.

Construct NodeDefinition.

Parameters:

args (Any)
kwargs (Any)

Return type:

object

forward(x)

Construct nodes from raw node features.

Parameters:

x (tensor) – standardized node features with shape [num_pulses, d], where d is the number of node features.
node_feature_names – list of names for each column in x.

Returns:

a graph without edges

Return type:

graph

property nb_outputs: int

Return number of output features.

This is the default, but may be overridden by specific inheriting classes.

set_number_of_inputs(input_feature_names)

Set the number of inputs expected by the node definition.

Parameters:

input_feature_names (List[str]) – name of each input feature column.

Return type:

None

set_output_feature_names(input_feature_names)

Set output feature names as a member variable.

Parameters:

input_feature_names (List[str]) – List of column names of the input to the node definition.

Return type:

None

class graphnet.models.data_representation.graphs.nodes.nodes.NodesAsPulses(*args, **kwargs)

Bases: NodeDefinition

Represent each measured pulse of Cherenkov radiation as a node.

Construct NodesAsPulses.

Parameters:

args (Any)
kwargs (Any)

Return type:

object

class graphnet.models.data_representation.graphs.nodes.nodes.PercentileClusters(*args, **kwargs)

Bases: NodeDefinition

Represent nodes as clusters with percentile summary node features.

If cluster_on is set to the xyz coordinates of DOMs, e.g. cluster_on = ['dom_x', 'dom_y', 'dom_z'], each node will be a unique DOM and the pulse information (charge, time) is summarized using percentiles.

Construct PercentileClusters.

Parameters:

cluster_on (List[str]) – Names of features to create clusters from.
percentiles (List[int]) – List of percentiles. E.g. [10, 50, 90].
add_counts (bool, default: True) – If True, number of duplicates is added to output array.
input_feature_names (Optional[List[str]], default: None) – (Optional) column names for input features.
args (Any)
kwargs (Any)

Return type:

object
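
The clustering idea described for PercentileClusters can be illustrated with a minimal pure-Python sketch. This is not the graphnet implementation: the helper names percentile and percentile_clusters are hypothetical, the linear-interpolation percentile and the raw (unscaled) count column are assumptions, and plain lists stand in for tensors.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile using linear interpolation (assumed method)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def percentile_clusters(
    rows: Sequence[Sequence[float]],
    feature_names: List[str],
    cluster_on: List[str],
    percentiles: List[int],
    add_counts: bool = True,
) -> List[List[float]]:
    """Hypothetical helper: one output row per unique cluster_on value."""
    key_idx = [feature_names.index(name) for name in cluster_on]
    summary_idx = [i for i in range(len(feature_names)) if i not in key_idx]

    # Group pulses that share the same values in the cluster_on columns.
    groups: Dict[Tuple[float, ...], List[Sequence[float]]] = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in key_idx)].append(row)

    nodes = []
    for key, members in groups.items():
        node = list(key)
        # Summarize every remaining column with the requested percentiles.
        for i in summary_idx:
            column = [m[i] for m in members]
            node.extend(percentile(column, q) for q in percentiles)
        if add_counts:
            node.append(len(members))  # number of pulses merged into this node
        nodes.append(node)
    return nodes


feature_names = ["dom_x", "dom_y", "dom_z", "charge", "dom_time"]
pulses = [
    [0.0, 0.0, 0.0, 1.0, 10.0],
    [0.0, 0.0, 0.0, 3.0, 30.0],
    [0.0, 0.0, 0.0, 2.0, 20.0],
    [1.0, 0.0, 0.0, 5.0, 15.0],
]
nodes = percentile_clusters(
    pulses, feature_names, ["dom_x", "dom_y", "dom_z"], [0, 50, 100]
)
```

In this toy input, the three pulses on the first DOM collapse into a single node whose charge and time columns are each replaced by three percentile values, followed by the count.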

class graphnet.models.data_representation.graphs.nodes.nodes.NodeAsDOMTimeSeries(*args, **kwargs)

Bases: NodeDefinition

Represent each node as a DOM with time and charge time series data.

Construct NodeAsDOMTimeSeries.

Parameters:

keys (List[str], default: ['dom_x', 'dom_y', 'dom_z', 'dom_time', 'charge']) – Names of features in the data (in order).
id_columns (List[str], default: ['dom_x', 'dom_y', 'dom_z']) – List of columns that uniquely identify a DOM.
time_column (str, default: 'dom_time') – Name of time column.
charge_column (str, default: 'charge') – Name of charge column.
max_activations (Optional[int], default: None) – Maximum number of activations to include in the time series.
args (Any)
kwargs (Any)

Return type:

object
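
As a rough illustration of the DOM time series representation, the sketch below groups pulses by their identifying columns and orders each group in time. The function dom_time_series is hypothetical, and truncating to the earliest max_activations hits is an assumption; the library may select activations differently.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def dom_time_series(
    rows: Sequence[Sequence[float]],
    keys: List[str],
    id_columns: List[str],
    time_column: str = "dom_time",
    charge_column: str = "charge",
    max_activations: Optional[int] = None,
) -> Dict[Tuple[float, ...], List[Tuple[float, float]]]:
    """Hypothetical helper: map each DOM to its (time, charge) series."""
    id_idx = [keys.index(c) for c in id_columns]
    t_idx = keys.index(time_column)
    q_idx = keys.index(charge_column)

    series: Dict[Tuple[float, ...], List[Tuple[float, float]]] = defaultdict(list)
    # Visiting pulses in time order keeps every per-DOM series sorted.
    for row in sorted(rows, key=lambda r: r[t_idx]):
        dom = tuple(row[i] for i in id_idx)
        series[dom].append((row[t_idx], row[q_idx]))

    if max_activations is not None:
        # Assumption: keep only the earliest activations of each DOM.
        return {dom: hits[:max_activations] for dom, hits in series.items()}
    return dict(series)


keys = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
pulses = [
    [0.0, 0.0, 0.0, 30.0, 0.5],
    [0.0, 0.0, 0.0, 10.0, 1.5],
    [1.0, 0.0, 0.0, 20.0, 2.0],
    [0.0, 0.0, 0.0, 20.0, 1.0],
]
series = dom_time_series(
    pulses, keys, ["dom_x", "dom_y", "dom_z"], max_activations=2
)
```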

class graphnet.models.data_representation.graphs.nodes.nodes.IceMixNodes(*args, **kwargs)

Bases: NodeDefinition

Calculate ice properties and perform random sampling.

Ice properties are calculated based on the z-coordinate of the pulse. For each event, a random sampling is performed to keep the number of pulses below a maximum number of pulses if n_pulses is over the limit.

Construct IceMixNodes.

Parameters:

input_feature_names (Optional[List[str]], default: None) – Column names for input features. Minimum required features are z coordinate and hlc column names.
max_pulses (int, default: 768) – Maximum number of pulses to keep in the event.
z_name (str, default: 'dom_z') – Name of the z-coordinate column.
hlc_name (Optional[str], default: 'hlc') – Name of the Hard Local Coincidence Check column.
add_ice_properties (bool, default: True) – If True, scattering and absorption length of ice in IceCube are added to the feature set based on z coordinate.
ice_args (Dict[str, Optional[float]], default: {'z_offset': None, 'z_scaling': None}) – Offset and scaling of the z coordinate in the Detector, to be able to make similar conversion in the ice data.
sample_pulses (bool, default: True) – Enable sampling random pulses. If True and the event is longer than the max_length, they will be sampled. If False, then only the first max_length pulses will be selected.
args (Any)
kwargs (Any)

Return type:

object
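
The pulse-limiting behaviour controlled by max_pulses and sample_pulses can be sketched as follows. This is an illustration only: limit_pulses is a hypothetical helper, uniform sampling without replacement is an assumption, and any role the hlc column plays in the library's selection is not modelled here.

```python
import random
from typing import List, Optional, Sequence


def limit_pulses(
    pulses: Sequence[Sequence[float]],
    max_pulses: int = 768,
    sample_pulses: bool = True,
    rng: Optional[random.Random] = None,
) -> List[Sequence[float]]:
    """Hypothetical helper: keep at most max_pulses pulses of one event."""
    if len(pulses) <= max_pulses:
        return list(pulses)  # short events pass through unchanged
    if not sample_pulses:
        return list(pulses[:max_pulses])  # keep only the first pulses
    if rng is None:
        rng = random.Random()
    # Assumption: uniform sampling without replacement, original order kept.
    keep = sorted(rng.sample(range(len(pulses)), max_pulses))
    return [pulses[i] for i in keep]


event = [[float(i)] for i in range(10)]
first_four = limit_pulses(event, max_pulses=4, sample_pulses=False)
sampled = limit_pulses(event, max_pulses=4, rng=random.Random(0))
```

Passing a seeded random.Random instance makes the sampled subset reproducible, which is convenient when comparing runs.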

class graphnet.models.data_representation.graphs.nodes.nodes.ClusterSummaryFeatures(*args, **kwargs)

Bases: NodeDefinition

Represent pulse maps as clusters with summary features.

If cluster_on is set to the xyz coordinates of optical modules, e.g. cluster_on = ['dom_x', 'dom_y', 'dom_z'], each node will be a unique optical module and the pulse information (e.g. charge, time) is summarized. NOTE: Developed to be used with features [dom_x, dom_y, dom_z, charge, time].

Possible features per cluster:

total charge – feature name: total_charge
charge accumulated after <X> time units – feature name: charge_after_<X>ns
time of first hit in the optical module – feature name: time_of_first_hit
time spread per optical module – feature name: time_spread
time std per optical module – feature name: time_std
time taken to collect <X> percent of total charge per cluster – feature name: time_after_charge_pct<X>
number of pulses per cluster – feature name: counts

For more details on some of the features, see Theo Glauch's thesis (chapter 5.3): https://mediatum.ub.tum.de/node?id=1584755

Construct ClusterSummaryFeatures.

Parameters:

cluster_on (List[str]) – Names of features to create clusters from.
input_feature_names (List[str]) – Column names for input features.
charge_label (str, default: 'charge') – Name of the charge column.
time_label (str, default: 'dom_time') – Name of the time column.
total_charge (bool, default: True) – If True, calculates total charge as feature.
charge_after_t (List[int], default: [10, 50, 100]) – List of times at which the accumulated charge is calculated as a feature.
time_of_first_hit (bool, default: True) – If True, time of first hit is added as a feature.
time_spread (bool, default: True) – If True, time spread is added as a feature.
time_std (bool, default: True) – If True, time std is added as a feature.
time_after_charge_pct (List[int], default: [1, 3, 5, 11, 15, 20, 50, 80]) – List of percentiles to calculate time after charge.
charge_standardization (Union[float, str], default: 'log') – Either a float or ‘log’. If a float, the features are multiplied by this factor. If ‘log’, the features are transformed to log10 scale.
time_standardization (float, default: 0.001) – Standardization factor for features with a time
order_in_time (bool, default: True) – If True, clusters are ordered in time. If your data is already ordered in time, you can set this to False to avoid a potential overhead. NOTE: Should only be set to False if you are sure that the input data is already ordered in time. Will lead to incorrect results otherwise.
add_counts (bool, default: False) – If True, log10 of the number of counts per cluster is added as a feature.
args (Any)
kwargs (Any)

Return type:

object

NOTE: Make sure that either the input data is not already standardized or that the charge_standardization and time_standardization parameters are set to 1 to avoid a double standardization.
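
To make the summary features listed for this class concrete, here is a minimal sketch that computes them for the pulses of a single cluster. It is not the library's implementation: summarize_cluster is a hypothetical helper, no standardization is applied, and the exact definitions (times measured relative to the first hit, spread as last minus first hit, population standard deviation) are assumptions.

```python
import math
from typing import Dict, Sequence, Tuple


def summarize_cluster(
    hits: Sequence[Tuple[float, float]],
    charge_after_t: Sequence[int] = (10, 50, 100),
    time_after_charge_pct: Sequence[int] = (50,),
) -> Dict[str, float]:
    """Hypothetical helper: summarize (time, charge) hits of one cluster."""
    hits = sorted(hits)  # order in time, as order_in_time=True would require
    times = [t for t, _ in hits]
    total = sum(q for _, q in hits)
    t0 = times[0]
    mean_t = sum(times) / len(times)

    features: Dict[str, float] = {
        "total_charge": total,
        "time_of_first_hit": t0,
        "time_spread": times[-1] - t0,  # assumed: last hit minus first hit
        "time_std": math.sqrt(sum((t - mean_t) ** 2 for t in times) / len(times)),
        "counts": len(hits),
    }
    # Charge accumulated within dt time units of the first hit (assumed).
    for dt in charge_after_t:
        features[f"charge_after_{dt}ns"] = sum(q for t, q in hits if t - t0 <= dt)
    # Time since the first hit at which pct percent of the charge is collected.
    for pct in time_after_charge_pct:
        threshold = total * pct / 100.0
        running = 0.0
        for t, q in hits:
            running += q
            if running >= threshold:
                features[f"time_after_charge_pct{pct}"] = t - t0
                break
    return features


features = summarize_cluster([(100.0, 1.0), (105.0, 2.0), (160.0, 1.0)])
```

For the three hits above, half of the total charge has been collected by the second hit, so time_after_charge_pct50 is 5.0 time units after the first hit.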

set_indices(feature_names)

Set the indices for the input features.

Parameters:

feature_names (List[str])

Return type:

None