API Reference¶

Surface¶

class lantern.model.surface.Phenotype(D, K, mean, kernel, variational_strategy)¶

A phenotype surface, learned with an approximate GP.

Parameters

D (int) – The phenotype dimension
K (int) – The latent effect dimension
mean (gpytorch.means.Mean) – The mean function of the GP
kernel (gpytorch.kernels.Kernel) – The GP kernel function
variational_strategy (gpytorch.variational.VariationalStrategy) – The strategy for variational inference

Return type

None

Method generated by attrs for class Phenotype.

property Kbasis¶: The number of dimensions provided by the basis

classmethod build(D, K, Ni=800, inducScale=10, distribution=<class 'gpytorch.variational.cholesky_variational_distribution.CholeskyVariationalDistribution'>, mean=None, kernel=None, learn_inducing_locations=True, *args, **kwargs)¶

Build a phenotype surface object.

Parameters

D (int) – Number of dimensions of the (output) phenotype
K (int) – Number of latent dimensions
Ni (int, optional) – Number of inducing points
inducScale (float, optional) – Range to initialize inducing points over (uniform from [-inducScale, inducScale])
distribution (gpytorch.VariationalDistribution) – The distribution of the variational approximation
mean (gpytorch.means.Mean, optional) – Mean function of the GP
kernel (gpytorch.kernels.Kernel, optional) – The kernel of the GP
learn_inducing_locations (bool, optional) – Whether to learn location of inducing points
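The inducScale parameter describes a uniform initialization of the inducing points over [-inducScale, inducScale]. A minimal sketch of that idea in plain Python follows. It assumes each of the Ni inducing points has K coordinates. The helper name init_inducing_points is hypothetical and not part of lantern, and the library itself presumably works with torch tensors rather than lists.

```python
import random


def init_inducing_points(Ni, K, induc_scale=10.0, seed=None):
    """Hypothetical helper (not a lantern function): draw Ni points in K
    dimensions, each coordinate uniform on [-induc_scale, induc_scale]."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-induc_scale, induc_scale) for _ in range(K)]
        for _ in range(Ni)
    ]


# Defaults mirror the build() signature above: Ni=800, inducScale=10.
points = init_inducing_points(Ni=800, K=2, induc_scale=10.0, seed=0)
```

With learn_inducing_locations=True these starting locations would only be an initial guess, since the locations are then optimized along with the other parameters.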

forward(z)¶: The forward prediction of the phenotype for a position in latent phenotype space.

classmethod fromDataset(ds, *args, **kwargs)¶: Build a phenotype surface matching a dataset

Basis¶

class lantern.model.basis.Basis¶

A dimension reducing basis for mutational data.

Parameters

p (int) – Input dimension, e.g. the number of mutations
K (int) – Output dimension, e.g. the number of latent directions

Return type

None

Method generated by attrs for class Basis.

property order¶: The rank order of latent dimensions

class lantern.model.basis.VariationalBasis(W_mu, W_log_sigma, log_alpha, log_beta, alpha_prior)¶

A variational basis for reducing mutational data.

Method generated by attrs for class VariationalBasis.

Parameters

W_mu (torch.nn.parameter.Parameter) –
W_log_sigma (torch.nn.parameter.Parameter) –
log_alpha (torch.nn.parameter.Parameter) –
log_beta (torch.nn.parameter.Parameter) –
alpha_prior (torch.distributions.gamma.Gamma) –

Return type

None

property order¶: The rank order of latent dimensions

Loss¶

class lantern.loss.ELBO_GP(mll)¶

The variational ELBO objective for GPs

Method generated by attrs for class ELBO_GP.

Return type: None

Dataset¶

class lantern.dataset.tokenizer.Tokenizer(lookup, tokens, sites, mutations, delim=':')¶

A class for tokenizing strings representing genetic variants.

Parameters

lookup (Dict[str, int]) – A lookup from token to index
tokens (List[str]) – A lookup from index to token
sites (List[int]) – A site number for each token, if valid
mutations (List[Union[None, str]]) – A mutation value for each token, if valid
delim (str) – The delimiter for this tokenizer

Return type

None

Method generated by attrs for class Tokenizer.

detokenize(t)¶: Convert a binarized token tensor into a mutation string

classmethod fromVariants(substitutions, delim=':', regex='(?P<wt>[a-zA-Z*])(?P<site>\\d+)(?P<mut>[a-zA-Z*])')¶: Construct a tokenizer from a list of variants.

property p¶: Total number of tokens

tokenize(*s)¶: Convert a mutation string (or strings) into a binarized tensor
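The tokenize/detokenize round trip can be illustrated with a small standalone sketch. It is not the lantern implementation: the functions below are hypothetical stand-ins that use plain lists instead of tensors, sort the tokens for a deterministic order (an assumption), and treat an empty string as the wild type (also an assumption). The regular expression is the default shown for fromVariants.

```python
import re

# Default pattern from fromVariants: wild-type residue, site number, mutant residue.
PATTERN = re.compile(r"(?P<wt>[a-zA-Z*])(?P<site>\d+)(?P<mut>[a-zA-Z*])")


def build_lookup(substitutions, delim=":"):
    """Collect the unique tokens across all variants (sorted, by assumption)."""
    tokens = sorted({t for s in substitutions for t in s.split(delim) if t})
    for t in tokens:
        if PATTERN.fullmatch(t) is None:
            raise ValueError(f"unrecognized token: {t!r}")
    lookup = {t: i for i, t in enumerate(tokens)}
    return lookup, tokens


def tokenize(s, lookup, delim=":"):
    """Binarize one mutation string into a 0/1 list of length p."""
    vec = [0] * len(lookup)
    for t in s.split(delim):
        if t:  # an empty string is treated as the wild type in this sketch
            vec[lookup[t]] = 1
    return vec


def detokenize(vec, tokens, delim=":"):
    """Invert tokenize: join the tokens whose entries are set."""
    return delim.join(t for t, bit in zip(tokens, vec) if bit)


variants = ["A1G", "A1G:C3T", "", "C3T:D7*"]
lookup, tokens = build_lookup(variants)
# Per-token site numbers and mutation values, parsed with the same regex.
sites = [int(PATTERN.fullmatch(t).group("site")) for t in tokens]
mutations = [PATTERN.fullmatch(t).group("mut") for t in tokens]
encoded = tokenize("A1G:C3T", lookup)
```

Here len(lookup) plays the role of the p property, and detokenize(encoded, tokens) recovers the original mutation string.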

class lantern.dataset.dataset._Base(substitutions='substitutions', phenotypes=['phenotype'], errors=None, tokenizer=None)¶

Base genotype-phenotype dataset class, shuttling a pandas dataframe to a TensorDataset.

Parameters

substitutions (str) – The column containing raw mutation data for each variant.
phenotypes (list[str]) – The columns of observed phenotypes for each variant
errors (list[str], optional) – The error columns associated with each phenotype, assumed to be variance (\(\sigma^2_y\))
tokenizer (lantern.dataset.tokenizer.Tokenizer) – The tokenizer converting raw mutations into one-hot encoded tensors

Method generated by attrs for class _Base.

property D¶: The number of dimensions of the phenotype

_errors_correct_length(attribute, value)¶: Check for correct length between errors and phenotypes
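As a rough illustration of this check, the sketch below assumes that "correct length" means one error column per phenotype column whenever errors is given. The function is a hypothetical stand-in with a simplified signature, not the attrs validator used by the library.

```python
def errors_correct_length(phenotypes, errors):
    """Hypothetical check: if error columns are given, require exactly
    one error column per phenotype column."""
    if errors is not None and len(errors) != len(phenotypes):
        raise ValueError(
            f"expected {len(phenotypes)} error column(s), got {len(errors)}"
        )


errors_correct_length(["phenotype"], None)         # no errors given: accepted
errors_correct_length(["p1", "p2"], ["e1", "e2"])  # one error column per phenotype
```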

meanEffects()¶: The mean effects of each mutation against each phenotype, returned as a (p x D) tensor

property p¶: The number of mutations in the dataset

to(device)¶: Send to device

class lantern.dataset.Dataset(df, substitutions='substitutions', phenotypes=['phenotype'], errors=None, tokenizer=None)¶

The runtime option for datasets, taking a dataframe as the first argument.

Method generated by attrs for class Dataset.

classmethod from_sequences(df, wildtype, sequence_column='sequence', substitutions='substitutions', *args, **kwargs)¶

Build a dataframe dataset from full sequences, converting each sequence to a compressed substitution string.

Parameters

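wildtype (str) – The wild-type sequence that each full sequence is compared against
sequence_column (str) – The dataframe column containing the full sequences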
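The conversion from a full sequence to a compressed substitution string can be sketched as below. This is an illustration under stated assumptions rather than the library's code: the helper sequence_to_substitutions is hypothetical, sites are numbered from 1, sequences are assumed to have the same length as the wild type, and substitutions are joined with the ':' delimiter used elsewhere on this page.

```python
def sequence_to_substitutions(sequence, wildtype, delim=":"):
    """Hypothetical helper: list the positions where sequence differs
    from wildtype, written as <wt><site><mut> with 1-based sites."""
    if len(sequence) != len(wildtype):
        raise ValueError("sequence and wildtype must have the same length")
    return delim.join(
        f"{wt}{site}{mut}"
        for site, (wt, mut) in enumerate(zip(wildtype, sequence), start=1)
        if wt != mut
    )


wildtype = "ACDEF"
sequences = ["ACDEF", "GCDEF", "ACTE*"]
compressed = [sequence_to_substitutions(s, wildtype) for s in sequences]
```

In this sketch the wild-type sequence itself maps to an empty string, and each resulting string matches the token pattern expected by the tokenizer's default regex.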

class lantern.dataset.CsvDataset(pth, substitutions='substitutions', phenotypes=['phenotype'], errors=None, tokenizer=None)¶: Method generated by attrs for class CsvDataset.
