utils

Utility functions for construction of graphs.

graphnet.models.graphs.utils.lex_sort(x, cluster_columns)[source]

Sort a NumPy array by the columns listed in `cluster_columns`.

Note that x is sorted along the columns in cluster_columns in reverse order. For example, cluster_columns = [0,1,2] means x is sorted along [2,1,0].

Parameters:
  • x (array) – array to be sorted.

  • cluster_columns (List[int]) – Columns of x to be sorted along.

Return type:

ndarray

Returns:

A sorted version of x.
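The reversed ordering described above matches the key convention of `np.lexsort`, where the last key is the primary sort key. The following is a minimal sketch assuming that convention, not necessarily how the library implements it; the name `lex_sort_sketch` is hypothetical.

```python
import numpy as np

def lex_sort_sketch(x, cluster_columns):
    """Sort rows of x; the LAST entry of cluster_columns is the primary key."""
    # np.lexsort treats the last key in the tuple as the primary sort key.
    keys = tuple(x[:, column] for column in cluster_columns)
    return x[np.lexsort(keys), :]

x = np.array([[3, 1], [1, 2], [2, 1]])
# With cluster_columns = [0, 1], rows are ordered by column 1 first,
# and ties are broken by column 0.
sorted_x = lex_sort_sketch(x, [0, 1])
```

Here the two rows sharing the value 1 in column 1 come first, ordered among themselves by column 0.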

graphnet.models.graphs.utils.gather_cluster_sequence(x, feature_idx, cluster_columns)[source]

Turn x into rows of clusters with sequences along columns.

For each cluster, the values of the feature in x at column index feature_idx are gathered into a sequence and added as columns. Sequences are padded with NaN to be of the same length. The clustered array has dimensions [n_clusters, l + len(cluster_columns)], where l is the largest sequence length.

Example: Suppose x represents a neutrino event, that we have chosen to cluster on the PMT positions, and that feature_idx corresponds to pulse time.

The resulting array will have dimensions [n_pmts, m + 3], where m is the maximum number of same-PMT pulses found in x and the `+3` accounts for the three spatial directions defining each cluster.

Parameters:
  • x (ndarray) – Array for clustering

  • feature_idx (int) – Index of the feature in x to be gathered for each cluster.

  • cluster_columns (List[int]) – Indices of the columns in x from which to build clusters.

Returns:
  • array – Array with dimensions [n_clusters, l + len(cluster_columns)].

  • column_offset – Indices of the columns in array that define clusters.
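The gathering and NaN padding can be illustrated with plain NumPy. This is a sketch under stated assumptions: the name `gather_sequences_sketch` is hypothetical, and placing the cluster columns first in the output is an assumption that the text above does not confirm.

```python
import numpy as np

def gather_sequences_sketch(x, feature_idx, cluster_columns):
    """Return one row per cluster: cluster values, then a NaN-padded sequence."""
    clusters, inverse, counts = np.unique(
        x[:, cluster_columns], axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # keep a flat index across NumPy versions
    n_cols = len(cluster_columns)
    # Pre-fill with NaN so that shorter sequences are padded automatically.
    out = np.full((len(clusters), n_cols + counts.max()), np.nan)
    out[:, :n_cols] = clusters  # assumption: cluster columns come first
    filled = np.zeros(len(clusters), dtype=int)
    for row, k in zip(x, inverse):
        out[k, n_cols + filled[k]] = row[feature_idx]
        filled[k] += 1
    return out

# Three pulses on two PMTs; columns are x, y, z, time (synthetic values).
pulses = np.array([
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 2.0],
    [1.0, 0.0, 0.0, 3.0],
])
clustered = gather_sequences_sketch(pulses, feature_idx=3, cluster_columns=[0, 1, 2])
```

The first PMT has two pulses, so m = 2 and the result has shape [2, 2 + 3]; the second PMT's sequence is padded with one NaN.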

graphnet.models.graphs.utils.identify_indices(feature_names, cluster_on)[source]

Identify indices for clustering and summarization.

Return type:

Tuple[List[int], List[int], List[str]]

Parameters:
  • feature_names (List[str])

  • cluster_on (List[str])
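The entry above does not describe the returned tuple. One plausible reading of the return type, sketched here as an assumption, is: the indices of the cluster_on features, the indices of the remaining features to be summarized, and the names of those remaining features. The function name and feature names below are hypothetical.

```python
from typing import List, Tuple

def identify_indices_sketch(
    feature_names: List[str], cluster_on: List[str]
) -> Tuple[List[int], List[int], List[str]]:
    """Split feature indices into cluster columns and columns to summarize."""
    cluster_indices = [feature_names.index(name) for name in cluster_on]
    # Assumption: every feature not used for clustering is summarized.
    summarized_names = [name for name in feature_names if name not in cluster_on]
    summarization_indices = [feature_names.index(name) for name in summarized_names]
    return cluster_indices, summarization_indices, summarized_names

# Hypothetical feature names, for illustration only.
names = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
cluster_idx, summ_idx, summ_names = identify_indices_sketch(
    names, ["dom_x", "dom_y", "dom_z"]
)
```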

graphnet.models.graphs.utils.cluster_summarize_with_percentiles(x, summarization_indices, cluster_indices, percentiles, add_counts)[source]

Turn x into clusters with percentile summary.

x is turned into clusters based on the variables at column indices cluster_indices. Within each cluster, the columns of x given by summarization_indices are summarized using percentiles. It is assumed that x represents a single event.

Example use-case: Suppose x contains raw pulses from a neutrino event where some DOMs have multiple measurements of Cherenkov radiation. If cluster_indices is set to the columns corresponding to the xyz-position of the DOMs, and the features specified in summarization_indices correspond to time and charge, then each row in the returned array will correspond to a DOM, and the time and charge for each DOM will be summarized by percentiles. The returned array has dimensions [n_clusters, len(percentiles)*len(summarization_indices) + len(cluster_indices)].

Parameters:
  • x (ndarray) – Array to be clustered

  • summarization_indices (List[int]) – List of column indices that define the features to be summarized with percentiles.

  • cluster_indices (List[int]) – List of column indices on which the clusters are constructed.

  • percentiles (List[int]) – Percentiles used to summarize x, e.g. [10,50,90].

  • add_counts (bool)

Return type:

ndarray

Returns:

Percentile-summarized array
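The percentile summary and the stated output dimensions can be sketched with `np.unique` and `np.percentile`. This is an illustration under assumptions, not the library's implementation: the name `summarize_sketch` is hypothetical, the column order of the output (cluster columns first, then the percentiles grouped by feature) is assumed, and add_counts is not covered.

```python
import numpy as np

def summarize_sketch(x, summarization_indices, cluster_indices, percentiles):
    """Return one row per cluster: cluster values, then percentiles per feature."""
    clusters, inverse = np.unique(x[:, cluster_indices], axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # keep a flat index across NumPy versions
    n_cluster_cols = len(cluster_indices)
    n_summary_cols = len(percentiles) * len(summarization_indices)
    out = np.empty((len(clusters), n_cluster_cols + n_summary_cols))
    out[:, :n_cluster_cols] = clusters  # assumption: cluster columns come first
    for k in range(len(clusters)):
        members = x[inverse == k][:, summarization_indices]
        # Result has shape (n_percentiles, n_features); transpose so that
        # all percentiles of one feature are adjacent in the output row.
        pct = np.percentile(members, percentiles, axis=0)
        out[k, n_cluster_cols:] = pct.T.reshape(-1)
    return out

# Columns: x, y, z, time, charge (synthetic values).
pulses = np.array([
    [0.0, 0.0, 0.0, 1.0, 10.0],
    [0.0, 0.0, 0.0, 3.0, 30.0],
    [1.0, 0.0, 0.0, 5.0, 50.0],
])
summary = summarize_sketch(pulses, [3, 4], [0, 1, 2], [0, 50, 100])
```

With two summarized features and three percentiles, each row has 3*2 + 3 = 9 columns, in line with the dimensions given above.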

graphnet.models.graphs.utils.ice_transparency(z_offset, z_scaling)[source]

Return interpolation functions for optical properties of IceCube.

NOTE: The resulting interpolation functions assume that the Z-coordinates of pulses are scaled as z = z/500. Any deviation from this scaling yields inaccurate results.

Parameters:
  • z_offset (Optional[float], default: None) – Offset to be added to the depth of the DOM.

  • z_scaling (Optional[float], default: None) – Scaling factor to be applied to the depth of the DOM.

Returns:
  • f_scattering – Function that takes a normalized depth and returns the corresponding normalized scattering length.

  • f_absorption – Function that takes a normalized depth and returns the corresponding normalized absorption length.
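The idea of returning a pair of depth-interpolation functions can be sketched as follows. Everything here is an assumption made for illustration: the name `ice_transparency_sketch`, the normalization (depth + z_offset) / z_scaling, the use of linear interpolation via `np.interp`, and the table values, which are synthetic placeholders rather than measured ice properties.

```python
import numpy as np

def ice_transparency_sketch(depth, scattering, absorption,
                            z_offset=0.0, z_scaling=500.0):
    """Build two interpolators over normalized depth (illustrative only)."""
    # Assumed normalization: add the offset first, then divide by the scaling.
    z_norm = (np.asarray(depth, dtype=float) + z_offset) / z_scaling
    order = np.argsort(z_norm)  # np.interp requires increasing x values
    z_sorted = z_norm[order]
    scat = np.asarray(scattering, dtype=float)[order]
    absn = np.asarray(absorption, dtype=float)[order]

    def f_scattering(z):
        return np.interp(z, z_sorted, scat)

    def f_absorption(z):
        return np.interp(z, z_sorted, absn)

    return f_scattering, f_absorption

# Synthetic placeholder table, treated as already normalized lengths.
depth = [-500.0, 0.0, 500.0]
scattering = [0.2, 0.5, 0.8]
absorption = [1.0, 2.0, 3.0]
f_scattering, f_absorption = ice_transparency_sketch(depth, scattering, absorption)
scat_at_zero = f_scattering(0.0)
abs_at_half = f_absorption(0.5)
```

Both functions expect depths already scaled as z/500, matching the note above; passing unscaled depths would evaluate the interpolators far outside the table.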