graphnet.data.writers.sqlite_writer module

Module containing `GraphNeTFileSaveMethod`(s).

These modules are used to save the interim data format from DataConverter to a deep-learning-friendly file format.

class graphnet.data.writers.sqlite_writer.SQLiteWriter(merged_database_name, max_table_size, index_column)[source]

Bases: GraphNeTWriter

A method for saving GraphNeT’s interim data format to SQLite.

Initialize SQLiteWriter.

Parameters:
  • merged_database_name (str, default: 'merged.db') – Name (not path) of the database that files will be merged into. Defaults to “merged.db”.

  • max_table_size (Optional[int], default: None) – The maximum number of rows in any given table. If given, the merging procedure splits the databases into partitions, each with a maximum table size of max_table_size. Note that the size is approximate. This feature is useful if you have many events, as tables exceeding 400 million rows tend to be noticeably slower to query. Defaults to None (all events are put into a single database).

  • index_column (str, default: 'event_no') – Name of column that contains event id.
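To illustrate the effect of max_table_size, the following is a minimal sketch of one way input files could be grouped into partitions. It is not GraphNeT's implementation: the helper name `plan_partitions`, its `row_counts` argument, and the greedy whole-file grouping are assumptions made for illustration only. Because whole files are assigned to a partition in this sketch, the resulting sizes are only approximately bounded, which is one way to read the "size is approximate" note above.

```python
from typing import Dict, List, Optional


def plan_partitions(
    row_counts: Dict[str, int], max_table_size: Optional[int] = None
) -> List[List[str]]:
    """Group files into partitions of roughly `max_table_size` rows.

    Hypothetical helper for illustration; not part of GraphNeT.
    `row_counts` maps each input file name to its number of rows.
    """
    files = list(row_counts)
    if max_table_size is None:
        # No limit given: everything goes into a single database.
        return [files]

    partitions: List[List[str]] = []
    current: List[str] = []
    current_rows = 0
    for name in files:
        # Close the current partition if adding this file would exceed the limit.
        if current and current_rows + row_counts[name] > max_table_size:
            partitions.append(current)
            current, current_rows = [], 0
        current.append(name)
        current_rows += row_counts[name]
    if current:
        partitions.append(current)
    return partitions
```

In this sketch, a single file larger than max_table_size still ends up in its own partition, so the limit acts as a target and not a hard cap.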

merge_files(files, output_dir, primary_key_rescue)[source]

SQLite-specific method for merging output files/databases.

Parameters:
  • files (List[str]) – paths to SQLite databases that need to be merged.

  • output_dir (str) – path to store the merged database(s) in.

  • database_name – name, not path, of database. E.g. “my_database”.

  • max_table_size – The maximum number of rows in any given table. If given, the merging procedure splits the databases into partitions, each with a maximum table size of max_table_size. Note that the size is approximate. This feature is useful if you have many events, as tables exceeding 400 million rows tend to be noticeably slower to query. Defaults to None (all events are put into a single database).

  • primary_key_rescue (str, default: 'event_no') – The name of the column on which the primary key is constructed. This will only be used if it is not possible to infer the primary key name.

Return type:

None
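The general idea of merging several SQLite databases into one can be sketched with Python's standard `sqlite3` module. This is an illustrative stand-in, not the code behind `merge_files`: the function name `merge_sqlite_files` and its `database_name` argument are hypothetical, it ignores max_table_size, and it copies column names only, so primary keys and indexes are not preserved.

```python
import os
import sqlite3
from typing import List


def merge_sqlite_files(
    files: List[str], output_dir: str, database_name: str = "merged.db"
) -> str:
    """Append every table from each input database into one merged database.

    Hypothetical helper for illustration; not GraphNeT's merge_files.
    Returns the path of the merged database.
    """
    os.makedirs(output_dir, exist_ok=True)
    merged_path = os.path.join(output_dir, database_name)

    # isolation_level=None puts the connection in autocommit mode, so
    # ATTACH/DETACH never run inside an open transaction.
    con = sqlite3.connect(merged_path, isolation_level=None)
    try:
        for path in files:
            con.execute("ATTACH DATABASE ? AS src", (path,))
            tables = [
                row[0]
                for row in con.execute(
                    "SELECT name FROM src.sqlite_master "
                    "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
                )
            ]
            for table in tables:
                # Create an empty table with the same columns on first sight.
                con.execute(
                    f'CREATE TABLE IF NOT EXISTS main."{table}" '
                    f'AS SELECT * FROM src."{table}" WHERE 0'
                )
                # Append all rows from the attached database.
                con.execute(
                    f'INSERT INTO main."{table}" SELECT * FROM src."{table}"'
                )
            con.execute("DETACH DATABASE src")
    finally:
        con.close()
    return merged_path
```

The sketch assumes that tables with the same name share the same column layout across all input files.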