nodes
Class(es) for building/connecting graphs.
- class graphnet.models.data_representation.graphs.nodes.nodes.NodeDefinition(*args, **kwargs)
Bases: Model
Base class for graph building.
Construct NodeDefinition.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
object
- forward(x)
Construct nodes from raw node features.
- Parameters:
x (tensor) – standardized node features with shape [num_pulses, d], where d is the number of node features.
node_feature_names – list of names for each column in x.
- Returns:
a graph without edges
- Return type:
graph
- property nb_outputs: int
Return number of output features.
This is the default, but may be overridden by specific inheriting classes.
- class graphnet.models.data_representation.graphs.nodes.nodes.NodesAsPulses(*args, **kwargs)
Bases: NodeDefinition
Represent each measured pulse of Cherenkov Radiation as a node.
Construct NodesAsPulses.
- Parameters:
args (Any)
kwargs (Any)
- Return type:
object
- class graphnet.models.data_representation.graphs.nodes.nodes.PercentileClusters(*args, **kwargs)
Bases: NodeDefinition
Represent nodes as clusters with percentile summary node features.
If cluster_on is set to the xyz coordinates of DOMs, e.g. cluster_on = ['dom_x', 'dom_y', 'dom_z'], each node will be a unique DOM and the pulse information (charge, time) is summarized using percentiles.
Construct PercentileClusters.
- Parameters:
cluster_on (List[str]) – Names of features to create clusters from.
percentiles (List[int]) – List of percentiles. E.g. [10, 50, 90].
add_counts (bool, default: True) – If True, number of duplicates is added to output array.
input_feature_names (Optional[List[str]], default: None) – (Optional) column names for input features.
args (Any)
kwargs (Any)
- Return type:
object
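The clustering idea described for PercentileClusters can be illustrated with a minimal, standard-library sketch. This is not the graphnet implementation: the helper names (percentile, cluster_percentiles), the output column names such as charge_pct50, the use of a raw count for add_counts, and the linear-interpolation percentile rule are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile using linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def cluster_percentiles(
    rows: List[Dict[str, float]],
    cluster_on: List[str],
    percentiles: List[int],
    add_counts: bool = True,
) -> List[Dict[str, float]]:
    """Group rows by the cluster_on columns and summarise the other columns."""
    groups: Dict[Tuple[float, ...], List[Dict[str, float]]] = defaultdict(list)
    for row in rows:
        groups[tuple(row[name] for name in cluster_on)].append(row)

    summary_columns = [name for name in rows[0] if name not in cluster_on]
    nodes = []
    for key, members in groups.items():
        node = dict(zip(cluster_on, key))
        for name in summary_columns:
            column = [member[name] for member in members]
            for q in percentiles:
                # Hypothetical naming scheme for the summary columns.
                node[f"{name}_pct{q}"] = percentile(column, q)
        if add_counts:
            # Assumption: a plain count of the rows merged into this node.
            node["counts"] = float(len(members))
        nodes.append(node)
    return nodes


pulses = [
    {"dom_x": 0.0, "dom_y": 0.0, "dom_z": 1.0, "charge": 1.0, "dom_time": 10.0},
    {"dom_x": 0.0, "dom_y": 0.0, "dom_z": 1.0, "charge": 3.0, "dom_time": 30.0},
    {"dom_x": 5.0, "dom_y": 0.0, "dom_z": 2.0, "charge": 2.0, "dom_time": 20.0},
]
nodes = cluster_percentiles(pulses, ["dom_x", "dom_y", "dom_z"], [10, 50, 90])
```

Three pulses on two DOMs collapse into two nodes, each carrying the percentile summaries of charge and time in place of the individual pulses.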
- class graphnet.models.data_representation.graphs.nodes.nodes.NodeAsDOMTimeSeries(*args, **kwargs)
Bases: NodeDefinition
Represent each node as a DOM with time and charge time series data.
Construct NodeAsDOMTimeSeries.
- Parameters:
keys (List[str], default: ['dom_x', 'dom_y', 'dom_z', 'dom_time', 'charge']) – Names of features in the data (in order).
id_columns (List[str], default: ['dom_x', 'dom_y', 'dom_z']) – List of columns that uniquely identify a DOM.
time_column (str, default: 'dom_time') – Name of time column.
charge_column (str, default: 'charge') – Name of charge column.
max_activations (Optional[int], default: None) – Maximum number of activations to include in the time series.
args (Any)
kwargs (Any)
- Return type:
object
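As a rough illustration of the per-DOM time series described above, the sketch below groups pulses by their identifying columns, orders them in time, and optionally truncates the series. It uses only the standard library and is not the graphnet implementation: the function name dom_time_series, the output layout, and the choice to keep the earliest activations when truncating are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Pulse = Dict[str, float]


def dom_time_series(
    pulses: List[Pulse],
    id_columns: List[str],
    time_column: str = "dom_time",
    charge_column: str = "charge",
    max_activations: Optional[int] = None,
) -> Dict[Tuple[float, ...], Dict[str, List[float]]]:
    """Collect a time-ordered (time, charge) series for each unique DOM."""
    grouped: Dict[Tuple[float, ...], List[Pulse]] = defaultdict(list)
    for pulse in pulses:
        grouped[tuple(pulse[name] for name in id_columns)].append(pulse)

    series = {}
    for dom, members in grouped.items():
        members = sorted(members, key=lambda member: member[time_column])
        if max_activations is not None:
            # Assumption: truncation keeps the earliest activations.
            members = members[:max_activations]
        series[dom] = {
            "time": [member[time_column] for member in members],
            "charge": [member[charge_column] for member in members],
        }
    return series


pulses = [
    {"dom_x": 0.0, "dom_y": 0.0, "dom_z": 1.0, "dom_time": 30.0, "charge": 3.0},
    {"dom_x": 0.0, "dom_y": 0.0, "dom_z": 1.0, "dom_time": 10.0, "charge": 1.0},
    {"dom_x": 0.0, "dom_y": 0.0, "dom_z": 1.0, "dom_time": 20.0, "charge": 2.0},
    {"dom_x": 5.0, "dom_y": 0.0, "dom_z": 2.0, "dom_time": 5.0, "charge": 0.5},
]
series = dom_time_series(
    pulses, ["dom_x", "dom_y", "dom_z"], max_activations=2
)
```

Each key of series is one DOM position, and its value holds the time and charge sequences that would form that node's time series.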
- class graphnet.models.data_representation.graphs.nodes.nodes.IceMixNodes(*args, **kwargs)
Bases: NodeDefinition
Calculate ice properties and perform random sampling.
Ice properties are calculated based on the z-coordinate of the pulse. For each event whose number of pulses exceeds the limit, random sampling is performed to keep the number of pulses within the maximum.
Construct IceMixNodes.
- Parameters:
input_feature_names (Optional[List[str]], default: None) – Column names for input features. Minimum required features are z coordinate and hlc column names.
max_pulses (int, default: 768) – Maximum number of pulses to keep in the event.
z_name (str, default: 'dom_z') – Name of the z-coordinate column.
hlc_name (Optional[str], default: 'hlc') – Name of the Hard Local Coincidence Check column.
add_ice_properties (bool, default: True) – If True, scattering and absorption length of ice in IceCube are added to the feature set based on z coordinate.
ice_args (Dict[str, Optional[float]], default: {'z_offset': None, 'z_scaling': None}) – Offset and scaling of the z coordinate in the Detector, to be able to make similar conversion in the ice data.
sample_pulses (bool, default: True) – Enable random sampling of pulses. If True and the event is longer than max_pulses, the pulses will be sampled. If False, only the first max_pulses pulses will be selected.
args (Any)
kwargs (Any)
- Return type:
object
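The pulse-limiting behaviour controlled by max_pulses and sample_pulses can be sketched as follows. This is a simplified, standard-library illustration, not the graphnet code: the helper name limit_pulses is hypothetical, sampling is assumed to be uniform without replacement, and neither the role of the hlc column nor the ice-property lookup is modelled here.

```python
import random
from typing import List, Optional


def limit_pulses(
    n_pulses: int,
    max_pulses: int = 768,
    sample_pulses: bool = True,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return the indices of the pulses to keep for one event."""
    if n_pulses <= max_pulses:
        # Short events are kept unchanged.
        return list(range(n_pulses))
    if sample_pulses:
        rng = rng or random.Random()
        # Assumption: uniform sampling without replacement; the indices
        # are sorted so the kept pulses stay in their original order.
        return sorted(rng.sample(range(n_pulses), max_pulses))
    # Sampling disabled: keep only the first max_pulses pulses.
    return list(range(max_pulses))


sampled = limit_pulses(1000, max_pulses=768, rng=random.Random(0))
truncated = limit_pulses(1000, max_pulses=5, sample_pulses=False)
untouched = limit_pulses(3)
```

Passing a seeded random.Random instance makes the selection reproducible, which is convenient when comparing runs.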
- class graphnet.models.data_representation.graphs.nodes.nodes.ClusterSummaryFeatures(*args, **kwargs)
Bases: NodeDefinition
Represent pulse maps as clusters with summary features.
If cluster_on is set to the xyz coordinates of optical modules, e.g. cluster_on = ['dom_x', 'dom_y', 'dom_z'], each node will be a unique optical module and the pulse information (e.g. charge, time) is summarized.
NOTE: Developed to be used with features [dom_x, dom_y, dom_z, charge, time].
Possible features per cluster:
- total charge (feature name: total_charge)
- charge accumulated after <X> time units (feature name: charge_after_<X>ns)
- time of first hit in the optical module (feature name: time_of_first_hit)
- time spread per optical module (feature name: time_spread)
- time std per optical module (feature name: time_std)
- time taken to collect <X> percent of total charge per cluster (feature name: time_after_charge_pct<X>)
- number of pulses per cluster (feature name: counts)
For more details on some of the features, see Theo Glauch's thesis (chapter 5.3): https://mediatum.ub.tum.de/node?id=1584755
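To make the feature list concrete, here is a minimal standard-library sketch that computes these summaries for the pulses of one cluster. It is not the graphnet implementation, and several details are assumptions: the helper name summarise_cluster, measuring times relative to the first hit, taking time spread as last minus first hit, using the population standard deviation, and reporting counts as a raw (not log-scaled) number.

```python
import math
from typing import Dict, Sequence


def summarise_cluster(
    times: Sequence[float],
    charges: Sequence[float],
    charge_after_t: Sequence[int] = (10, 50, 100),
    time_after_charge_pct: Sequence[int] = (50,),
) -> Dict[str, float]:
    """Summary features for the pulses of a single cluster."""
    # Order the pulses in time so cumulative quantities are well defined.
    order = sorted(range(len(times)), key=lambda i: times[i])
    t = [times[i] for i in order]
    q = [charges[i] for i in order]
    total = sum(q)
    first = t[0]
    mean_t = sum(t) / len(t)

    features = {
        "total_charge": total,
        "time_of_first_hit": first,
        "time_spread": t[-1] - first,
        "time_std": math.sqrt(sum((x - mean_t) ** 2 for x in t) / len(t)),
        "counts": float(len(t)),
    }
    for window in charge_after_t:
        # Charge collected within `window` time units of the first hit.
        features[f"charge_after_{window}ns"] = sum(
            qi for ti, qi in zip(t, q) if ti - first <= window
        )
    for pct in time_after_charge_pct:
        threshold = total * pct / 100.0
        accumulated = 0.0
        for ti, qi in zip(t, q):
            accumulated += qi
            if accumulated >= threshold:
                # Time needed to collect `pct` percent of the total charge.
                features[f"time_after_charge_pct{pct}"] = ti - first
                break
    return features


features = summarise_cluster(
    times=[30.0, 10.0, 70.0], charges=[2.0, 1.0, 1.0]
)
```

The unsorted input in the example shows why time ordering matters: the cumulative features are only meaningful once the pulses are sorted.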
Construct ClusterSummaryFeatures.
- Parameters:
cluster_on (List[str]) – Names of features to create clusters from.
input_feature_names (List[str]) – Column names for input features.
charge_label (str, default: 'charge') – Name of the charge column.
time_label (str, default: 'dom_time') – Name of the time column.
total_charge (bool, default: True) – If True, calculates total charge as feature.
charge_after_t (List[int], default: [10, 50, 100]) – List of times at which the accumulated charge is calculated as a feature.
time_of_first_hit (bool, default: True) – If True, time of first hit is added as a feature.
time_spread (bool, default: True) – If True, time spread is added as a feature.
time_std (bool, default: True) – If True, time std is added as a feature.
time_after_charge_pct (List[int], default: [1, 3, 5, 11, 15, 20, 50, 80]) – List of percentiles to calculate time after charge.
charge_standardization (Union[float, str], default: 'log') – Either a float or 'log'. If a float, the features are multiplied by this factor. If 'log', the features are transformed to log10 scale.
time_standardization (float, default: 0.001) – Standardization factor for time-based features.
order_in_time (bool, default: True) – If True, clusters are ordered in time. If your data is already ordered in time, you can set this to False to avoid a potential overhead. NOTE: Should only be set to False if you are sure that the input data is already ordered in time. Will lead to incorrect results otherwise.
add_counts (bool, default: False) – If True, log10 of the number of pulses per cluster is added as a feature.
args (Any)
kwargs (Any)
- Return type:
object
NOTE: Make sure that either the input data is not already standardized or that the charge_standardization and time_standardization parameters are set to 1 to avoid a double standardization.
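The double-standardization concern can be seen in a small sketch. The two helper functions below are hypothetical stand-ins that only mirror the parameter descriptions (multiply by a float factor, or take log10 when the setting is 'log'); they are not the graphnet code.

```python
import math
from typing import Union


def standardize_charge(
    value: float, charge_standardization: Union[float, str] = "log"
) -> float:
    """Scale a charge-like feature by a factor, or map it to log10."""
    if charge_standardization == "log":
        return math.log10(value)
    return value * float(charge_standardization)


def standardize_time(value: float, time_standardization: float = 1e-3) -> float:
    """Scale a time-like feature by a constant factor."""
    return value * time_standardization


raw_time = 1500.0
once = standardize_time(raw_time)  # scaled a single time
twice = standardize_time(once)  # scaled again by mistake
safe = standardize_time(once, time_standardization=1.0)  # factor 1: no change
```

Here twice ends up a further factor of 1000 smaller than once, whereas setting the factor to 1 leaves already standardized input untouched.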