LANTERN: an interpretable genotype-phenotype landscape model

What is LANTERN?

LANTERN is a tool for learning interpretable models of genotype-phenotype landscape (GPL) data.


LANTERN must currently be installed from source. Installing it into a virtual environment (e.g. venv or conda) is recommended:

python -m pip install git+


LANTERN provides a straightforward interface for training models:

import pandas as pd
from torch.optim import Adam

from lantern.dataset import Dataset
from lantern.model import Model
from lantern.model.basis import VariationalBasis
from lantern.model.surface import Phenotype

# create a dataframe containing GPL data
df = pd.DataFrame(
    {"substitutions": ["", "+a", "+b", "+a:+b"], "phenotype": [0.0, 1.0, 1.0, 0.8]},
)

# convert the data to a LANTERN dataset
ds = Dataset(df)

# build a LANTERN model based on the dataset, using an upper-bound
# of 8 latent dimensions
model = Model(
    VariationalBasis.fromDataset(ds, 8),
    Phenotype.fromDataset(ds, 8),
)

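# build the loss for a dataset of N observations and load the full dataset
loss = model.loss(N=len(ds))
X, y = ds[:len(ds)]

# optimize the loss parameters with Adam for 100 full-batch steps
optimizer = Adam(loss.parameters(), lr=0.01)
for i in range(100):
    optimizer.zero_grad()      # clear gradients from the previous step
    yhat = model(X)
    lss = loss(yhat, y)        # dictionary of loss terms
    total = sum(lss.values())  # combine the terms into a single scalar
    total.backward()           # backpropagate
    optimizer.step()           # update the parameters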
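
To make the input format in the example concrete, the sketch below shows one way the "substitutions" strings could be turned into a one-hot matrix. It is an illustration only, not LANTERN's implementation: the helper name one_hot_substitutions is hypothetical, and it assumes (inferred from the example data) that ":" separates individual substitutions and that the empty string denotes the variant with no substitutions.

```python
# Hypothetical helper for illustration; not part of the LANTERN API.
def one_hot_substitutions(variants, sep=":"):
    """Return (tokens, rows): sorted unique tokens and one 0/1 row per variant."""
    # Split each variant into substitution tokens; "" yields no tokens.
    split = [[t for t in v.split(sep) if t] for v in variants]
    # Assign each distinct substitution a column, in sorted order.
    tokens = sorted({t for toks in split for t in toks})
    index = {t: i for i, t in enumerate(tokens)}
    rows = []
    for toks in split:
        row = [0] * len(tokens)
        for t in toks:
            row[index[t]] = 1  # mark the substitution as present
        rows.append(row)
    return tokens, rows


tokens, rows = one_hot_substitutions(["", "+a", "+b", "+a:+b"])
print(tokens)  # ['+a', '+b']
print(rows)    # [[0, 0], [1, 0], [0, 1], [1, 1]]
```

Under these assumptions, the four variants in the example dataframe map to the rows shown: the variant without substitutions is all zeros, and the double mutant "+a:+b" has both columns set.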

For a more thorough introduction, see the Tutorial.


LANTERN can be cited as: <insert biorxiv link>

The workflow used for generating the results of the manuscript is available at
