oracle.pretrained package

Submodules

oracle.pretrained.ELAsTiCC module

Pretrained model(s) for the ELAsTiCC dataset.

class oracle.pretrained.ELAsTiCC.ORACLE1_ELAsTiCC(model_dir='/Users/vedshah/Documents/Research/NU-Miller/Projects/Hierarchical-VT/models/ELAsTiCC/vocal-bird-160')

Bases: GRU_MD

ORACLE1_ELAsTiCC is a model class that inherits from GRU_MD. It loads a pretrained ELAsTiCC model and makes predictions on time-series data augmented with static features. The model uses a hierarchical taxonomy to output predictions at multiple levels of granularity.

taxonomy

An instance of the taxonomy used to structure the class labels.

Type:: ORACLE_Taxonomy

ts_feature_dim

Dimensionality of the time series input features.

Type:: int

static_feature_dim

Dimensionality of the static input features.

Type:: int

model_dir

Directory path where the model weights are stored.

Type:: str

embed(table)

Embed a table into its latent space representation.

Parameters:: table – The input data (e.g., a table or structured data) to be embedded. The exact format is expected to be compatible with the make_batch method.
Returns:: A NumPy array containing the latent space embeddings corresponding to the input table.
Return type:: numpy.ndarray

make_batch(table)

Create a batch from the input table.

Parameters:: table (astropy.table.Table) – Input data containing one or more rows.
Returns:: A dictionary containing the batch data.
Return type:: dict

predict(table)

Predict the label at each hierarchical level for the table.

Parameters:: table (astropy.table.Table) – Input data containing one or more rows.
Returns:: Mapping from hierarchical level (as returned by self.score) to the predicted class label. For each level, self.score(table) is expected to return a pandas.DataFrame of shape (n_samples, n_classes) with class labels as columns; the predicted label is the column with the highest score for the first sample.
Return type:: dict
Raises:: Any exception raised by self.score or by NumPy operations (e.g., if the score DataFrame is empty) is propagated.
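
The selection rule described for predict can be illustrated with a minimal sketch. This is not the package's implementation: the helper name predict_from_scores, the level numbers, and the class names are hypothetical, and the input only imitates the documented shape of the score output (one DataFrame per level, class labels as columns).

```python
import pandas as pd


def predict_from_scores(scores_by_level):
    """For each level, return the column with the highest score in the first row."""
    predictions = {}
    for level, scores in scores_by_level.items():
        if scores.empty:
            # The documented method propagates whatever error arises here;
            # this sketch raises explicitly to keep the failure mode visible.
            raise ValueError(f"No scores available at level {level}")
        # idxmax on the first row returns the label of the best-scoring column.
        predictions[level] = scores.iloc[0].idxmax()
    return predictions


# Hypothetical two-level scores; class names are placeholders.
example_scores = {
    1: pd.DataFrame([[0.2, 0.8]], columns=["Transient", "Periodic"]),
    2: pd.DataFrame([[0.1, 0.6, 0.3]], columns=["SN", "Cepheid", "RR Lyrae"]),
}
example_predictions = predict_from_scores(example_scores)
```

Only the first row is consulted, which matches the documented behaviour of using the first sample.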

predict_full_scores(table)

Predict class probability scores for a single time-series table.

Prepares a single observation table for the model by calling augment_table and converting time-dependent and time-independent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’, and must provide metadata entries for the keys listed in time_independent_feature_list.

Parameters:

table (astropy.table.Table) – Astropy Table containing time-series data and metadata.

Returns:

A DataFrame containing class probability scores for each class in the taxonomy.

Return type:

pd.DataFrame

Raises:

KeyError – If one or more keys in time_independent_feature_list are missing from table.meta.
ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.

Note

Numeric inputs are converted to numpy.float32 and then to torch tensors.
The produced batch has the following keys and tensor shapes:
‘ts’ : torch.FloatTensor, shape (1, T, D)
‘static’ : torch.FloatTensor, shape (1, S)
‘length’ : torch.FloatTensor, shape (1,)
augment_table(table) is called and its return value is used; the original table may be replaced by the augmented one.
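
The batch layout listed in the note can be sketched with NumPy arrays standing in for torch tensors (a float32 array can be converted with torch.from_numpy). The helper name make_single_batch and the example feature values are hypothetical; only the keys, the float32 conversion, and the shapes come from the note above.

```python
import numpy as np


def make_single_batch(ts_rows, static_values):
    """Build a one-sample batch with the documented keys and shapes."""
    ts = np.asarray(ts_rows, dtype=np.float32)            # shape (T, D)
    static = np.asarray(static_values, dtype=np.float32)  # shape (S,)
    if ts.ndim != 2:
        raise ValueError("ts_rows must be a 2-D sequence of shape (T, D)")
    return {
        "ts": ts[np.newaxis, :, :],        # (1, T, D)
        "static": static[np.newaxis, :],   # (1, S)
        # Number of time steps, stored as float32 to match the documented type.
        "length": np.array([ts.shape[0]], dtype=np.float32),  # (1,)
    }


# Hypothetical input: T = 2 observations, D = 3 features, S = 2 static values.
batch = make_single_batch([[0.0, 10.0, 1.0], [1.5, 12.0, 1.1]], [0.3, 21.0])
```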

score(table)

Compute hierarchical scores for the input table. Predicts scores for all taxonomy nodes using self.predict_full_scores(), then groups those scores by taxonomy depth and returns a mapping from depth levels to DataFrames containing the corresponding node scores. Taxonomy levels 0 and 3 are removed because they are irrelevant in the current taxonomy.

Parameters:: table (astropy.table.Table) – Input observations/features to be scored.
Returns:: A mapping from taxonomy depth level to a DataFrame of predicted scores for nodes at that level. Each DataFrame is a subset of the full prediction DataFrame containing only the columns for the nodes at that depth.
Return type:: dict[int, pandas.DataFrame]
Raises:: KeyError – If expected node columns (from self.taxonomy.get_nodes_by_depth()) are not present in the DataFrame returned by predict_full_scores().
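
A minimal sketch of the grouping step, under stated assumptions: the return format of self.taxonomy.get_nodes_by_depth() is not documented here, so the sketch assumes a dict mapping depth to a list of node names. The helper name group_scores_by_depth and all node names are hypothetical.

```python
import pandas as pd


def group_scores_by_depth(full_scores, nodes_by_depth, skip_levels=(0, 3)):
    """Split a full score DataFrame into one DataFrame per taxonomy depth."""
    grouped = {}
    for depth, nodes in nodes_by_depth.items():
        if depth in skip_levels:
            continue  # levels 0 and 3 are dropped, as described above
        # Selecting with a list of labels raises KeyError if any node is missing.
        grouped[depth] = full_scores[list(nodes)]
    return grouped


# Hypothetical taxonomy and scores for a single sample.
nodes_by_depth = {
    0: ["Alert"],
    1: ["Transient", "Periodic"],
    2: ["SN", "Cepheid"],
    3: ["SN Ia"],
}
full_scores = pd.DataFrame(
    [[1.0, 0.7, 0.3, 0.6, 0.2, 0.5]],
    columns=["Alert", "Transient", "Periodic", "SN", "Cepheid", "SN Ia"],
)
scores_by_depth = group_scores_by_depth(full_scores, nodes_by_depth)
```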

Note

This is very similar to the model used for the original ORACLE paper.

class oracle.pretrained.ELAsTiCC.ORACLE1_ELAsTiCC_lite(model_dir='/Users/vedshah/Documents/Research/NU-Miller/Projects/Hierarchical-VT/models/ELAsTiCC-lite/revived-star-159')

Bases: GRU

Predict class probability scores for a single time-series table.

Prepares a single observation table for the model by calling augment_table and converting time-dependent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’. This model does not require additional metadata entries.

Parameters:: table (astropy.table.Table) – Astropy Table containing time-series data.
Returns:: A DataFrame containing class probability scores for each class in the taxonomy.
Return type:: pd.DataFrame
Raises:: ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.

Note

This model does not use static features, and can classify using the light curve alone.
Numeric inputs are converted to numpy.float32 and then to torch tensors.
The produced batch has the following keys and tensor shapes:
‘ts’ : torch.FloatTensor, shape (1, T, D)
‘length’ : torch.FloatTensor, shape (1,)

embed(table)

Embed a table into its latent space representation.

Parameters:: table – The input data (e.g., a table or structured data) to be embedded. The exact format is expected to be compatible with the make_batch method.
Returns:: A NumPy array containing the latent space embeddings corresponding to the input table.
Return type:: numpy.ndarray

make_batch(table)

Create a batch from the input table.

Parameters:: table (astropy.table.Table) – Input data containing one or more rows.
Returns:: A dictionary containing the batch data.
Return type:: dict

predict(table)

Predict the label at each hierarchical level for the table.

Parameters:

table (astropy.table.Table) – Input data containing one or more rows.

Returns:

Mapping from hierarchical level (as returned by self.score) to the predicted class label. For each level, self.score(table) is expected to return a pandas.DataFrame of shape (n_samples, n_classes) with class labels as columns; the predicted label is the column with the highest score for the first sample.

Return type:

dict

Raises:

Any exception raised by self.score or by NumPy operations (e.g., if the score DataFrame is empty) is propagated.

predict_full_scores(table)

Predict class probability scores for a single time-series table.

Prepares a single observation table for the model by calling augment_table and converting time-dependent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’.

Parameters:: table (astropy.table.Table) – Astropy Table containing time-series data.
Returns:: A DataFrame containing class probability scores for each class in the taxonomy.
Return type:: pd.DataFrame
Raises:: ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.

Note

Numeric inputs are converted to numpy.float32 and then to torch tensors.
The produced batch has the following keys and tensor shapes:
‘ts’ : torch.FloatTensor, shape (1, T, D)
‘length’ : torch.FloatTensor, shape (1,)

score(table)

Compute hierarchical scores for the input table. Predicts scores for all taxonomy nodes using self.predict_full_scores(), then groups those scores by taxonomy depth and returns a mapping from depth levels to DataFrames containing the corresponding node scores. Taxonomy levels 0 and 3 are removed because they are irrelevant in the current taxonomy.

Parameters:

table (astropy.table.Table) – Input observations/features to be scored.

Returns:

A mapping from taxonomy depth level to a DataFrame of predicted scores for nodes at that level. Each DataFrame is a subset of the full prediction DataFrame containing only the columns for the nodes at that depth.

Return type:

dict[int, pandas.DataFrame]

Raises:

KeyError – If expected node columns (from self.taxonomy.get_nodes_by_depth()) are not present in the DataFrame returned by predict_full_scores().

Note

This is very similar to the model used for the original ORACLE paper.

oracle.pretrained.ELAsTiCC.augment_table(table)

Augments an astronomical observation table by cleaning and transforming its data. This function performs several modifications on the input table:

Creates a copy of the table to avoid modifying the original.

Removes rows where the ‘PHOTFLAG’ column indicates saturation (the bitwise AND with 1024 is non-zero).

Reassigns the ‘PHOTFLAG’ column values:

Sets to 1 for detections (when the bitwise AND with 4096 is non-zero).

Sets to 0 for non-detections.

Converts band labels in the ‘BAND’ column to mean wavelengths using the ‘LSST_passband_to_wavelengths’ mapping.

Normalizes time data by subtracting the Modified Julian Date (MJD) of the first detection from all ‘MJD’ entries.

Reorders the columns based on the predefined list ‘time_dependent_feature_list’.

Iterates over the metadata (‘meta’) of the table and replaces any values that match entries in ‘missing_data_flags’ with ‘flag_value’.

Parameters:

table (astropy.table.Table) – An Astropy table containing the columns ‘PHOTFLAG’, ‘BAND’, and ‘MJD’, and a ‘meta’ attribute.

Returns:

Table-like object: A new, augmented table with the applied cleaning, conversion, and reordering of columns and metadata.
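
The steps listed for augment_table can be sketched as follows. This is an illustration, not the package's code: it uses a pandas DataFrame and a separate meta dict in place of an Astropy Table, and the wavelength values, column order, missing-data flags, flag value, and metadata keys are all hypothetical placeholders for ‘LSST_passband_to_wavelengths’, ‘time_dependent_feature_list’, ‘missing_data_flags’, and ‘flag_value’.

```python
import pandas as pd

# Hypothetical stand-ins; the package's actual values may differ.
PASSBAND_TO_WAVELENGTH = {"u": 3671.0, "g": 4827.0, "r": 6223.0,
                          "i": 7546.0, "z": 8691.0, "Y": 9712.0}
TIME_DEPENDENT_FEATURES = ["MJD", "FLUX", "FLUXERR", "BAND", "PHOTFLAG"]
MISSING_DATA_FLAGS = (-9, -99, -999, 999)
FLAG_VALUE = -9


def augment(df, meta):
    """Return cleaned copies of the observations and metadata."""
    out = df.copy()  # never modify the caller's table
    # Drop saturated rows (bit 1024 set).
    out = out[(out["PHOTFLAG"] & 1024) == 0].copy()
    # Collapse PHOTFLAG to 1 for detections (bit 4096 set), 0 otherwise.
    out["PHOTFLAG"] = ((out["PHOTFLAG"] & 4096) != 0).astype(int)
    # Replace band labels with wavelengths.
    out["BAND"] = out["BAND"].map(PASSBAND_TO_WAVELENGTH)
    # Shift times so the first detection is at 0 (assumes rows are time-ordered).
    detections = out.loc[out["PHOTFLAG"] == 1, "MJD"]
    if not detections.empty:
        out["MJD"] = out["MJD"] - detections.iloc[0]
    # Reorder columns.
    out = out[TIME_DEPENDENT_FEATURES].reset_index(drop=True)
    # Replace missing-data sentinels in the metadata.
    new_meta = {k: (FLAG_VALUE if v in MISSING_DATA_FLAGS else v)
                for k, v in meta.items()}
    return out, new_meta


df = pd.DataFrame({
    "MJD": [60000.0, 60001.0, 60002.0, 60003.0],
    "FLUX": [1.0, 50.0, 80.0, 900.0],
    "FLUXERR": [1.0, 2.0, 2.0, 5.0],
    "BAND": ["g", "r", "g", "i"],
    "PHOTFLAG": [0, 4096, 4096, 1024],
})
aug, aug_meta = augment(df, {"feature_a": -999.0, "feature_b": 0.03})
```

In this example the last row is dropped as saturated, and the remaining times become offsets from the first detection.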

Module contents

Module for working with pretrained models in the ORACLE framework.