oracle.pretrained package
Submodules
oracle.pretrained.ELAsTiCC module
Pretrained model(s) for the ELAsTiCC dataset.
- class oracle.pretrained.ELAsTiCC.ORACLE1_ELAsTiCC(model_dir='/Users/vedshah/Documents/Research/NU-Miller/Projects/Hierarchical-VT/models/ELAsTiCC/vocal-bird-160')
Bases:
GRU_MD
ORACLE1_ELAsTiCC is a model class that inherits from GRU_MD and is designed to load a pre-trained ELAsTiCC model and perform predictions on time series data augmented with static features. The model uses a hierarchical taxonomy to output predictions at multiple levels of granularity.
- taxonomy
An instance of the taxonomy used to structure the class labels.
- Type:
- ts_feature_dim
Dimensionality of the time series input features.
- Type:
int
- static_feature_dim
Dimensionality of the static input features.
- Type:
int
- model_dir
Directory path where the model weights are stored.
- Type:
str
- embed(table)
Embed a table into its latent space representation.
- Parameters:
table – The input data (e.g., a table or structured data) to be embedded. The exact format is expected to be compatible with the make_batch method.
- Returns:
A NumPy array containing the latent space embeddings corresponding to the input table.
- Return type:
numpy.ndarray
- make_batch(table)
Create a batch from the input table.
- Parameters:
table (astropy.table.Table) – Input data containing one or more rows.
- Returns:
A dictionary containing the batch data.
- Return type:
dict
- predict(table)
Predict the label at each hierarchical level for the table.
- Parameters:
table (astropy.table.Table) – Input data containing one or more rows.
- Returns:
Mapping from hierarchical level (as returned by self.score) to the predicted class label. For each level, self.score(table) is expected to return a pandas.DataFrame of shape (n_samples, n_classes) with class labels as columns; the predicted label is the column with the highest score for the first sample.
- Return type:
dict
- Raises:
Any exceptions raised by self.score or by numpy operations (e.g., if the score DataFrame is empty) will be propagated.
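The selection rule described for predict can be sketched as follows. This is a minimal illustration, not the package's code: `predict_from_scores` is a hypothetical helper, and the level keys, class labels, and score values are made up.

```python
import numpy as np
import pandas as pd


def predict_from_scores(scores_by_level):
    """Pick, for each level, the column with the highest score in the first row.

    Hypothetical helper illustrating the documented rule; not the package's code.
    """
    predictions = {}
    for level, df in scores_by_level.items():
        first_row = df.to_numpy()[0]  # scores for the first sample only
        predictions[level] = df.columns[int(np.argmax(first_row))]
    return predictions


# Made-up scores for two taxonomy levels (labels are illustrative only).
example_scores = {
    1: pd.DataFrame([[0.8, 0.2]], columns=["Transient", "Variable"]),
    2: pd.DataFrame([[0.1, 0.7, 0.2]], columns=["SN", "Fast", "Periodic"]),
}
example_predictions = predict_from_scores(example_scores)
```

In this sketch an empty DataFrame makes the `[0]` lookup fail with an IndexError, which is one way a numpy-level exception could reach the caller.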
- predict_full_scores(table)
Predict class probability scores for a single time-series table.
Prepares a single observation table for the model by calling augment_table and converting time-dependent and time-independent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’, and must provide metadata entries for the keys listed in time_independent_feature_list.
- Parameters:
table (astropy.table.Table) – Astropy Table containing time-series data and metadata.
- Returns:
A DataFrame containing class probability scores for each class in the taxonomy.
- Return type:
pd.DataFrame
- Raises:
KeyError – If one or more keys in time_independent_feature_list are missing from table.meta.
ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.
Note
Numeric inputs are converted to numpy.float32 and then to torch tensors.
The produced batch has the following keys and tensor shapes:
‘ts’ : torch.FloatTensor, shape (1, T, D)
‘static’ : torch.FloatTensor, shape (1, S)
‘length’ : torch.FloatTensor, shape (1,)
augment_table(table) is called and its return value is used; the original table may be replaced by the augmented one.
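The batch layout in the note above can be sketched with NumPy alone. `make_single_batch` is a hypothetical helper and the feature values are made up; the actual method produces torch tensors, which could be obtained from arrays like these with `torch.from_numpy`.

```python
import numpy as np


def make_single_batch(ts_columns, static_values):
    """Assemble arrays with the documented shapes from one light curve.

    ts_columns: list of equal-length 1-D sequences, one per time-series feature.
    static_values: 1-D sequence of static (metadata) features.
    Hypothetical helper; not the package's implementation.
    """
    lengths = {len(col) for col in ts_columns}
    if len(lengths) != 1:
        raise ValueError("time-series columns have inconsistent lengths")

    # Stack the D feature columns side by side: shape (T, D), dtype float32.
    ts = np.stack([np.asarray(c, dtype=np.float32) for c in ts_columns], axis=-1)
    static = np.asarray(static_values, dtype=np.float32)  # shape (S,)

    return {
        "ts": ts[np.newaxis, ...],  # (1, T, D)
        "static": static[np.newaxis, ...],  # (1, S)
        "length": np.array([ts.shape[0]], dtype=np.float32),  # (1,)
    }


# Four observations (T=4), three time-series features (D=3), two static features (S=2).
batch = make_single_batch(
    ts_columns=[
        [0.0, 1.5, 3.0, 4.5],
        [10.0, 12.0, 9.0, 8.0],
        [1.0, 1.1, 0.9, 1.2],
    ],
    static_values=[0.3, 0.05],
)
```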
- score(table)
Compute hierarchical scores for the input table. Predicts scores for all taxonomy nodes using self.predict_full_scores(), then groups those scores by taxonomy depth and returns a mapping from depth levels to DataFrames containing the corresponding node scores. Taxonomy levels 0 and 3 are removed because they are irrelevant in the current taxonomy.
- Parameters:
table (astropy.table.Table) – Input observations/features to be scored.
- Returns:
A mapping from taxonomy depth level to a DataFrame of predicted scores for nodes at that level. Each DataFrame is a subset of the full prediction DataFrame containing only the columns for the nodes at that depth.
- Return type:
dict[int, pandas.DataFrame]
- Raises:
KeyError – If expected node columns (from self.taxonomy.get_nodes_by_depth()) are not present in the DataFrame returned by predict_full_scores().
Note
This is very similar to the model used for the original ORACLE paper.
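The grouping step that score describes can be sketched as below. The `nodes_by_depth` dict stands in for whatever `self.taxonomy.get_nodes_by_depth()` returns; its structure here is an assumption, and the node names and scores are made up. `group_scores_by_depth` is a hypothetical helper.

```python
import pandas as pd


def group_scores_by_depth(full_scores, nodes_by_depth, drop_levels=(0, 3)):
    """Split a full score DataFrame into one DataFrame per taxonomy depth.

    Hypothetical helper; assumes nodes_by_depth maps depth -> list of node names.
    """
    grouped = {}
    for depth, nodes in nodes_by_depth.items():
        if depth in drop_levels:
            continue  # levels 0 and 3 are dropped, as the description states
        # Selecting a list of columns raises KeyError if any node is missing.
        grouped[depth] = full_scores[list(nodes)]
    return grouped


# Made-up taxonomy and scores, for illustration only.
nodes_by_depth = {
    0: ["Alert"],
    1: ["Transient", "Variable"],
    2: ["SN", "Periodic"],
    3: ["SNIa"],
}
full_scores = pd.DataFrame(
    [[1.0, 0.7, 0.3, 0.6, 0.3, 0.5]],
    columns=["Alert", "Transient", "Variable", "SN", "Periodic", "SNIa"],
)
grouped = group_scores_by_depth(full_scores, nodes_by_depth)
```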
- class oracle.pretrained.ELAsTiCC.ORACLE1_ELAsTiCC_lite(model_dir='/Users/vedshah/Documents/Research/NU-Miller/Projects/Hierarchical-VT/models/ELAsTiCC-lite/revived-star-159')
Bases:
GRU
Predict class probability scores for a single time-series table.
Prepares a single observation table for the model by calling augment_table and converting time-dependent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’. This model does not require additional metadata entries.
- Parameters:
table (astropy.table.Table) – Astropy Table containing time-series data.
- Returns:
A DataFrame containing class probability scores for each class in the taxonomy.
- Return type:
pd.DataFrame
- Raises:
ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.
Note
This model does not use static features, and can classify using the light curve alone.
Numeric inputs are converted to numpy.float32 and then to torch tensors.
The produced batch has the following keys and tensor shapes:
‘ts’ : torch.FloatTensor, shape (1, T, D)
‘length’ : torch.FloatTensor, shape (1,)
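Because the lite model needs only the five light-curve columns, a caller may want to check for them before scoring. The sketch below is a hypothetical pre-check, not something the package is known to provide.

```python
# Column names taken from the description above.
REQUIRED_COLUMNS = ("FLUX", "FLUXERR", "BAND", "MJD", "PHOTFLAG")


def missing_columns(column_names, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent, in declaration order.

    Hypothetical helper for illustration.
    """
    present = set(column_names)
    return [name for name in required if name not in present]


ok = missing_columns(["MJD", "FLUX", "FLUXERR", "BAND", "PHOTFLAG"])
bad = missing_columns(["MJD", "FLUX", "BAND"])
```

With an Astropy table, `table.colnames` supplies the list of column names to pass in.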
- embed(table)
Embed a table into its latent space representation.
- Parameters:
table – The input data (e.g., a table or structured data) to be embedded. The exact format is expected to be compatible with the make_batch method.
- Returns:
A NumPy array containing the latent space embeddings corresponding to the input table.
- Return type:
numpy.ndarray
- make_batch(table)
Create a batch from the input table.
- Parameters:
table (astropy.table.Table) – Input data containing one or more rows.
- Returns:
A dictionary containing the batch data.
- Return type:
dict
- predict(table)
Predict the label at each hierarchical level for the table.
- Parameters:
table (astropy.table.Table) – Input data containing one or more rows.
- Returns:
Mapping from hierarchical level (as returned by self.score) to the predicted class label. For each level, self.score(table) is expected to return a pandas.DataFrame of shape (n_samples, n_classes) with class labels as columns; the predicted label is the column with the highest score for the first sample.
- Return type:
dict
- Raises:
Any exceptions raised by self.score or by numpy operations (e.g., if the score DataFrame is empty) will be propagated.
- predict_full_scores(table)
Predict class probability scores for a single time-series table.
Prepares a single observation table for the model by calling augment_table and converting time-dependent features into torch tensors. The input table must contain the columns ‘FLUX’, ‘FLUXERR’, ‘BAND’, ‘MJD’, and ‘PHOTFLAG’.
- Parameters:
table (astropy.table.Table) – Astropy Table containing time-series data.
- Returns:
A DataFrame containing class probability scores for each class in the taxonomy.
- Return type:
pd.DataFrame
- Raises:
ValueError – If time-series columns have inconsistent lengths or if the table is empty in a way that the downstream model cannot handle.
Note
Numeric inputs are converted to numpy.float32 and then to torch tensors.
The produced batch has the following keys and tensor shapes:
‘ts’ : torch.FloatTensor, shape (1, T, D)
‘length’ : torch.FloatTensor, shape (1,)
- score(table)
Compute hierarchical scores for the input table. Predicts scores for all taxonomy nodes using self.predict_full_scores(), then groups those scores by taxonomy depth and returns a mapping from depth levels to DataFrames containing the corresponding node scores. Taxonomy levels 0 and 3 are removed because they are irrelevant in the current taxonomy.
- Parameters:
table (astropy.table.Table) – Input observations/features to be scored.
- Returns:
A mapping from taxonomy depth level to a DataFrame of predicted scores for nodes at that level. Each DataFrame is a subset of the full prediction DataFrame containing only the columns for the nodes at that depth.
- Return type:
dict[int, pandas.DataFrame]
- Raises:
KeyError – If expected node columns (from self.taxonomy.get_nodes_by_depth()) are not present in the DataFrame returned by predict_full_scores().
Note
This is very similar to the model used for the original ORACLE paper.
- oracle.pretrained.ELAsTiCC.augment_table(table)
Augments an astronomical observation table by cleaning and transforming its data. This function performs several modifications on the input table:
Creates a copy of the table to avoid modifying the original.
Removes rows where the ‘PHOTFLAG’ column indicates saturation (using a bitmask with 1024).
Reassigns the ‘PHOTFLAG’ column values: 1 for detections (when the bitmask with 4096 is non-zero) and 0 for non-detections.
Converts band labels in the ‘BAND’ column to mean wavelengths using the ‘LSST_passband_to_wavelengths’ mapping.
Normalizes time data by subtracting the Modified Julian Date (MJD) of the first detection from all ‘MJD’ entries.
Reorders the columns based on the predefined list ‘time_dependent_feature_list’.
Iterates over the metadata (‘meta’) of the table and replaces any values that match entries in ‘missing_data_flags’ with ‘flag_value’.
- Parameters:
table (astropy.table.Table) – An Astropy table containing the columns ‘PHOTFLAG’, ‘BAND’, and ‘MJD’, plus a ‘meta’ attribute. The table is expected to adhere to the structure required by the function.
- Returns:
A new, augmented table-like object with the applied cleaning, conversion, and reordering of columns and metadata.
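The main augmentation steps can be sketched on plain NumPy arrays. This is an illustration under stated assumptions, not the package's implementation: `augment_columns` is a hypothetical helper, the wavelength mapping, missing-data flags, and flag value are placeholder values, "first detection" is taken to mean the earliest detected MJD, and the column reordering step is omitted.

```python
import numpy as np

# Placeholder values for illustration; the package defines its own.
BAND_TO_WAVELENGTH = {"g": 1.0, "r": 2.0, "i": 3.0}
MISSING_DATA_FLAGS = (-9, -99, -999)
FLAG_VALUE = -9.0


def augment_columns(columns, meta):
    """Apply the documented cleaning steps to a dict of columns and a meta dict."""
    # Work on copies so the caller's data is left untouched.
    cols = {name: np.asarray(values).copy() for name, values in columns.items()}

    # Remove saturated observations (bit 1024 set in PHOTFLAG).
    keep = (cols["PHOTFLAG"] & 1024) == 0
    cols = {name: values[keep] for name, values in cols.items()}

    # Reassign PHOTFLAG: 1 for detections (bit 4096 set), 0 otherwise.
    detected = (cols["PHOTFLAG"] & 4096) != 0
    cols["PHOTFLAG"] = detected.astype(np.int64)

    # Convert band labels to wavelengths via the (placeholder) mapping.
    cols["BAND"] = np.array(
        [BAND_TO_WAVELENGTH[band] for band in cols["BAND"]], dtype=np.float32
    )

    # Express times relative to the earliest detection.
    cols["MJD"] = cols["MJD"] - cols["MJD"][detected].min()

    # Replace missing-data flags in the metadata with a single flag value.
    new_meta = {
        key: (FLAG_VALUE if value in MISSING_DATA_FLAGS else value)
        for key, value in meta.items()
    }
    return cols, new_meta


columns = {
    "MJD": [60000.0, 60001.0, 60002.0, 60003.0],
    "FLUX": [1.0, 50.0, 80.0, 900.0],
    "BAND": ["g", "r", "i", "r"],
    "PHOTFLAG": [0, 4096, 4096, 4096 | 1024],  # last row is saturated
}
meta = {"REDSHIFT": 0.1, "HOSTGAL_MASS": -999}
augmented, augmented_meta = augment_columns(columns, meta)
```

In this example the saturated fourth row is dropped, the non-detection before the first detection ends up at a negative relative time, and the original `columns` dict is unchanged.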
Module contents
Module for working with pretrained models in the ORACLE framework.