oracle.pretrained.ELAsTiCC module

Pretrained model(s) for the ELAsTiCC dataset.

class oracle.pretrained.ELAsTiCC.ORACLE1_ELAsTiCC(model_dir='/Users/vedshah/Documents/Research/NU-Miller/Projects/Hierarchical-VT/models/ELAsTiCC/vocal-bird-160')

Bases: GRU_MD

ORACLE1_ELAsTiCC is a model class, derived from GRU_MD, that loads a pretrained ELAsTiCC model and runs predictions on time-series data augmented with static features. It uses a hierarchical taxonomy to output predictions at multiple levels of granularity.

taxonomy

An instance of the taxonomy used to structure the class labels.

Type:

ORACLE_Taxonomy

ts_feature_dim

Dimensionality of the time series input features.

Type:

int

static_feature_dim

Dimensionality of the static input features.

Type:

int

model_dir

Directory path where the model weights are stored.

Type:

str

embed(table)

Embed a table into its latent space representation.

Parameters:

table – The input data (e.g., a table or structured data) to be embedded. The exact format is expected to be compatible with the make_batch method.

Returns:

A NumPy array containing the latent space embeddings corresponding to the input table.

Return type:

numpy.ndarray

make_batch(table)

Create a batch from the input table.

Parameters:

table (astropy.table.Table) – Input data containing one or more rows.

Returns:

A dictionary containing the batch data.

Return type:

dict

predict(table)

Predict the label at each hierarchical level for the table.

Parameters:

table (astropy.table.Table) – Input data containing one or more rows.

Returns:

Mapping from hierarchical level (as returned by self.score) to the predicted class label. For each level, self.score(table) is expected to return a pandas.DataFrame of shape (n_samples, n_classes) with class labels as columns; the predicted label is the column with the highest score for the first sample.

Return type:

dict

Raises:

Any exceptions raised by self.score or by numpy operations (e.g., if the score DataFrame is empty) will be propagated.
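The selection rule described for predict can be illustrated with a minimal sketch. It is not the library's implementation: the helper name predict_from_scores, the level numbers, and the class labels below are hypothetical, and the per-level DataFrames merely stand in for what self.score(table) is documented to return.

```python
import numpy as np
import pandas as pd


def predict_from_scores(scores_by_level):
    """Return the top-scoring column label of the first row at each level."""
    predictions = {}
    for level, scores in scores_by_level.items():
        # Row 0 is the first sample; indexing it raises IndexError if the
        # DataFrame is empty, and that error propagates to the caller.
        first_row = scores.to_numpy()[0]
        predictions[level] = scores.columns[int(np.argmax(first_row))]
    return predictions


# Hypothetical scores for one sample at two taxonomy levels.
scores_by_level = {
    1: pd.DataFrame([[0.8, 0.2]], columns=["Transient", "Periodic"]),
    2: pd.DataFrame([[0.1, 0.7, 0.2]], columns=["SN", "Fast", "Long"]),
}
predicted = predict_from_scores(scores_by_level)
```

Only the first row is consulted, so a DataFrame holding several samples would still yield a single label per level.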

predict_full_scores(table)

Predict class probability scores for a single time-series table.

Prepares a single observation table for the model by calling augment_table and converting time-dependent and time-independent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’, and must provide metadata entries for the keys listed in time_independent_feature_list.

Parameters:

table (astropy.table.Table) – Astropy Table containing time-series data and metadata.

Returns:

A DataFrame containing class probability scores for each class in the taxonomy.

Return type:

pd.DataFrame

Raises:
  • KeyError – If one or more keys in time_independent_feature_list are missing from table.meta.

  • ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.

Note

  • Numeric inputs are converted to numpy.float32 and then to torch tensors.

  • The produced batch has the following keys and tensor shapes:
    • ‘ts’ : torch.FloatTensor, shape (1, T, D)

    • ‘static’ : torch.FloatTensor, shape (1, S)

    • ‘length’ : torch.FloatTensor, shape (1,)

  • augment_table(table) is called and its return value is used; the original table may be replaced by the augmented one.
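The batch layout listed in the note can be sketched with NumPy alone. This is an illustration under stated assumptions, not the method's actual code: make_single_batch is a hypothetical helper, T is taken to be the number of observations, D the number of time-dependent features, and S the number of static features. The final conversion to torch tensors is left out to keep the example dependency-free.

```python
import numpy as np


def make_single_batch(ts_columns, static_values):
    """Stack per-feature columns into arrays with a leading batch axis of 1."""
    lengths = {len(column) for column in ts_columns}
    if len(lengths) != 1:
        raise ValueError("time-series columns have inconsistent lengths")
    # Each column becomes one feature: the stacked array has shape (T, D).
    ts = np.stack(
        [np.asarray(column, dtype=np.float32) for column in ts_columns], axis=-1
    )
    static = np.asarray(static_values, dtype=np.float32)  # shape (S,)
    return {
        "ts": ts[np.newaxis, ...],  # shape (1, T, D)
        "static": static[np.newaxis, ...],  # shape (1, S)
        "length": np.array([ts.shape[0]], dtype=np.float32),  # shape (1,)
    }


# Two hypothetical time-dependent features with three observations each,
# plus four hypothetical static features.
batch = make_single_batch(
    [[1.0, 2.0, 3.0], [0.1, 0.1, 0.2]],
    [0.5, 1.5, 2.5, 3.5],
)
```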

score(table)

Compute hierarchical scores for the input table. Predicts scores for all taxonomy nodes using self.predict_full_scores(), then groups those scores by taxonomy depth and returns a mapping from depth levels to DataFrames containing the corresponding node scores. Taxonomy levels 0 and 3 are removed because they are irrelevant in the current taxonomy.

Parameters:

table (astropy.table.Table) – Input observations/features to be scored.

Returns:

A mapping from taxonomy depth level to a DataFrame of predicted scores for nodes at that level. Each DataFrame is a subset of the full prediction DataFrame containing only the columns for the nodes at that depth.

Return type:

dict[int, pandas.DataFrame]

Raises:

KeyError – If expected node columns (from self.taxonomy.get_nodes_by_depth()) are not present in the DataFrame returned by predict_full_scores().

Note

  • This is very similar to the model used for the original ORACLE paper.
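The depth grouping that score describes can be sketched as follows. The helper group_scores_by_depth, the node names, and the nodes_by_depth dictionary are hypothetical; the dictionary only stands in for whatever self.taxonomy.get_nodes_by_depth() returns, whose exact structure is assumed here.

```python
import pandas as pd


def group_scores_by_depth(full_scores, nodes_by_depth, skip_levels=(0, 3)):
    """Split a full score DataFrame into one DataFrame per taxonomy depth."""
    grouped = {}
    for depth, nodes in nodes_by_depth.items():
        if depth in skip_levels:
            continue  # levels 0 and 3 are dropped, as described above
        # Selecting by column labels raises KeyError if a node is missing.
        grouped[depth] = full_scores[list(nodes)]
    return grouped


# Hypothetical taxonomy and a single-row score table covering every node.
nodes_by_depth = {0: ["Root"], 1: ["A", "B"], 2: ["A1", "A2", "B1"], 3: ["A1x"]}
full_scores = pd.DataFrame(
    [[1.0, 0.6, 0.4, 0.5, 0.1, 0.4, 0.3]],
    columns=["Root", "A", "B", "A1", "A2", "B1", "A1x"],
)
grouped = group_scores_by_depth(full_scores, nodes_by_depth)
```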

class oracle.pretrained.ELAsTiCC.ORACLE1_ELAsTiCC_lite(model_dir='/Users/vedshah/Documents/Research/NU-Miller/Projects/Hierarchical-VT/models/ELAsTiCC-lite/revived-star-159')

Bases: GRU

Predict class probability scores for a single time-series table.

Prepares a single observation table for the model by calling augment_table and converting time-dependent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’. This model does not require additional metadata entries.

Parameters:

table (astropy.table.Table) – Astropy Table containing time-series data.

Returns:

A DataFrame containing class probability scores for each class in the taxonomy.

Return type:

pd.DataFrame

Raises:

ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.

Note

  • This model does not use static features, and can classify using the light curve alone.

  • Numeric inputs are converted to numpy.float32 and then to torch tensors.

  • The produced batch has the following keys and tensor shapes:
    • ‘ts’ : torch.FloatTensor, shape (1, T, D)

    • ‘length’ : torch.FloatTensor, shape (1,)

embed(table)

Embed a table into its latent space representation.

Parameters:

table – The input data (e.g., a table or structured data) to be embedded. The exact format is expected to be compatible with the make_batch method.

Returns:

A NumPy array containing the latent space embeddings corresponding to the input table.

Return type:

numpy.ndarray

make_batch(table)

Create a batch from the input table.

Parameters:

table (astropy.table.Table) – Input data containing one or more rows.

Returns:

A dictionary containing the batch data.

Return type:

dict

predict(table)

Predict the label at each hierarchical level for the table.

Parameters:

table (astropy.table.Table) – Input data containing one or more rows.

Returns:

Mapping from hierarchical level (as returned by self.score) to the predicted class label. For each level, self.score(table) is expected to return a pandas.DataFrame of shape (n_samples, n_classes) with class labels as columns; the predicted label is the column with the highest score for the first sample.

Return type:

dict

Raises:

Any exceptions raised by self.score or by numpy operations (e.g., if the score DataFrame is empty) will be propagated.

predict_full_scores(table)

Predict class probability scores for a single time-series table.

Prepares a single observation table for the model by calling augment_table and converting time-dependent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’.

Parameters:

table (astropy.table.Table) – Astropy Table containing time-series data.

Returns:

A DataFrame containing class probability scores for each class in the taxonomy.

Return type:

pd.DataFrame

Raises:

ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.

Note

  • Numeric inputs are converted to numpy.float32 and then to torch tensors.

  • The produced batch has the following keys and tensor shapes:
    • ‘ts’ : torch.FloatTensor, shape (1, T, D)

    • ‘length’ : torch.FloatTensor, shape (1,)

score(table)

Compute hierarchical scores for the input table. Predicts scores for all taxonomy nodes using self.predict_full_scores(), then groups those scores by taxonomy depth and returns a mapping from depth levels to DataFrames containing the corresponding node scores. Taxonomy levels 0 and 3 are removed because they are irrelevant in the current taxonomy.

Parameters:

table (astropy.table.Table) – Input observations/features to be scored.

Returns:

A mapping from taxonomy depth level to a DataFrame of predicted scores for nodes at that level. Each DataFrame is a subset of the full prediction DataFrame containing only the columns for the nodes at that depth.

Return type:

dict[int, pandas.DataFrame]

Raises:

KeyError – If expected node columns (from self.taxonomy.get_nodes_by_depth()) are not present in the DataFrame returned by predict_full_scores().

Note

  • This is very similar to the model used for the original ORACLE paper.

oracle.pretrained.ELAsTiCC.augment_table(table)

Augments an astronomical observation table by cleaning and transforming its data. This function performs several modifications on the input table:

  • Creates a copy of the table to avoid modifying the original.

  • Removes rows where the ‘PHOTFLAG’ column indicates saturation (using a bitmask with 1024).

  • Reassigns the ‘PHOTFLAG’ column values:
    • Sets to 1 for detections (when the bitmask with 4096 is non-zero).

    • Sets to 0 for non-detections.

  • Converts band labels in the ‘BAND’ column to mean wavelengths using the ‘LSST_passband_to_wavelengths’ mapping.

  • Normalizes time data by subtracting the Modified Julian Date (MJD) of the first detection from all ‘MJD’ entries.

  • Reorders the columns based on the predefined list ‘time_dependent_feature_list’.

  • Iterates over the metadata (‘meta’) of the table and replaces any values that match entries in ‘missing_data_flags’ with ‘flag_value’.

Parameters:

table (astropy.table.Table) – An Astropy table containing the columns ‘PHOTFLAG’, ‘BAND’, and ‘MJD’, along with a ‘meta’ attribute. The table is expected to adhere to the structure required by the function.

Returns:

A new, augmented table with the applied cleaning, conversion, and reordering of columns and metadata.

Return type:

Table-like object
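The steps listed for augment_table can be sketched on plain NumPy arrays. This is a simplified illustration, not the function itself: it works on a dictionary of columns instead of an Astropy table, and the constants below are hypothetical stand-ins for ‘LSST_passband_to_wavelengths’, ‘time_dependent_feature_list’, ‘missing_data_flags’, and ‘flag_value’, whose real values are not given here. It also assumes at least one detection is present.

```python
import numpy as np

# Hypothetical placeholder values, for illustration only.
BAND_TO_WAVELENGTH = {"g": 1.0, "r": 2.0}
COLUMN_ORDER = ["MJD", "FLUX", "FLUXERR", "BAND", "PHOTFLAG"]
MISSING_FLAGS = (-99.0, -999.0)
FLAG_VALUE = -9.0


def augment(columns, meta):
    """Apply the documented cleaning steps to a dict of columns and metadata."""
    # Work on copies so the caller's data is left untouched.
    cols = {name: np.array(values) for name, values in columns.items()}
    flags = cols["PHOTFLAG"].astype(np.int64)
    # Drop saturated rows (bit 1024 set).
    keep = (flags & 1024) == 0
    cols = {name: values[keep] for name, values in cols.items()}
    flags = flags[keep]
    # 1 for detections (bit 4096 set), 0 otherwise.
    cols["PHOTFLAG"] = ((flags & 4096) != 0).astype(np.int64)
    # Replace band labels with wavelengths.
    cols["BAND"] = np.array([BAND_TO_WAVELENGTH[band] for band in cols["BAND"]])
    # Shift times so the first detection sits at zero.
    detected = cols["PHOTFLAG"] == 1
    cols["MJD"] = cols["MJD"] - cols["MJD"][detected].min()
    # Reorder the columns and replace missing-data flags in the metadata.
    ordered = {name: cols[name] for name in COLUMN_ORDER}
    new_meta = {
        key: (FLAG_VALUE if value in MISSING_FLAGS else value)
        for key, value in meta.items()
    }
    return ordered, new_meta


# Hypothetical input: four observations, one of them saturated.
columns = {
    "PHOTFLAG": [0, 4096, 1024, 4096],
    "BAND": ["g", "r", "g", "r"],
    "MJD": [60000.0, 60001.0, 60002.0, 60003.0],
    "FLUX": [1.0, 5.0, 9.0, 7.0],
    "FLUXERR": [0.5, 0.5, 0.5, 0.5],
}
meta = {"feature_a": -99.0, "feature_b": 0.02}
augmented, augmented_meta = augment(columns, meta)
```

Times before the first detection come out negative, which keeps non-detections that precede the event in the light curve.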