Metalearning

One of the hardest challenges in scientific machine learning is the lack of access to sufficient data. Experiments are often slow and expensive, and there is no easy way to gain access to more data. What do you do then?

This module contains a collection of techniques for low-data learning. “Metalearning” traditionally refers to techniques for “learning to learn”, but here we take it to mean any technique that proves effective for learning from small amounts of data.

MetaLearner

This is the abstract superclass for metalearning algorithms.

class MetaLearner[source]

Model and data to which the MAML algorithm can be applied.

To use MAML, create a subclass of this defining the learning problem to solve. It consists of a model that can be trained to perform many different tasks, and data for training it on a large (possibly infinite) set of different tasks.

compute_model(inputs, variables, training)[source]

Compute the model for a set of inputs and variables.

Parameters
  • inputs (list of tensors) – the inputs to the model

  • variables (list of tensors) – the values to use for the model’s variables. This might be the actual variables (as returned by the MetaLearner’s variables property), or alternatively it might be the values of those variables after one or more steps of gradient descent for the current task.

  • training (bool) – indicates whether the model is being invoked for training or prediction

Returns

(loss, outputs), where loss is the value of the model’s loss function and outputs is a list of the model’s outputs

property variables[source]

Get the list of TensorFlow variables to train.

select_task()[source]

Select a new task to train on.

If there is a fixed set of training tasks, this will typically cycle through them. If there are infinitely many training tasks, this can simply select a new one each time it is called.

get_batch()[source]

Get a batch of data for training.

This should return the data as a list of arrays, one for each of the model’s inputs. This will usually be called twice for each task, and should return a different batch on each call.
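
Below is a minimal sketch of a MetaLearner subclass for a toy sine-wave regression problem, loosely following the regression example from the MAML paper. The class name SineLearner, the network architecture, and the task distribution are illustrative assumptions, not part of the DeepChem API; only the four overridden members above are required.

    import numpy as np
    import tensorflow as tf
    from deepchem.metalearning import MetaLearner

    class SineLearner(MetaLearner):
        """Toy learner: each task is regression onto a sine wave with its own amplitude and phase."""

        def __init__(self, batch_size=10):
            self.batch_size = batch_size
            # A small fully connected network: 1 input -> 40 hidden units -> 1 output.
            self.w1 = tf.Variable(tf.random.normal([1, 40]))
            self.b1 = tf.Variable(tf.zeros([40]))
            self.w2 = tf.Variable(tf.random.normal([40, 1]))
            self.b2 = tf.Variable(tf.zeros([1]))
            self.select_task()

        def compute_model(self, inputs, variables, training):
            # 'variables' may be the current variables or values produced by
            # task-specific gradient steps, so use them rather than self.w1, etc.
            x, y = inputs
            w1, b1, w2, b2 = variables
            hidden = tf.nn.relu(tf.matmul(x, w1) + b1)
            output = tf.matmul(hidden, w2) + b2
            loss = tf.reduce_mean(tf.square(output - y))
            return loss, [output]

        @property
        def variables(self):
            return [self.w1, self.b1, self.w2, self.b2]

        def select_task(self):
            # Each task is defined by a randomly chosen amplitude and phase.
            self.amplitude = np.random.uniform(0.1, 5.0)
            self.phase = np.random.uniform(0.0, np.pi)

        def get_batch(self):
            # One array per model input; a different random batch on every call.
            x = np.random.uniform(-5.0, 5.0, (self.batch_size, 1)).astype(np.float32)
            y = (self.amplitude * np.sin(x + self.phase)).astype(np.float32)
            return [x, y]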

MAML

class MAML(learner, learning_rate=0.001, optimization_steps=1, meta_batch_size=10, optimizer=<deepchem.models.optimizers.Adam object>, model_dir=None)[source]

Implements the Model-Agnostic Meta-Learning algorithm for low data learning.

The algorithm is described in Finn et al., “Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks” (https://arxiv.org/abs/1703.03400). It is used for training models that can perform a variety of tasks, depending on what data they are trained on. It assumes you have training data for many tasks, but only a small amount for each one. It performs “meta-learning” by looping over tasks and trying to minimize the loss on each one after one or a few steps of gradient descent. That is, it does not try to create a model that can directly solve the tasks, but rather tries to create a model that is very easy to train.
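In the notation of that paper, with α the per-task (inner) learning rate, β the meta-learning rate, and L_{T_i} the loss on task T_i, a single meta-learning step with one inner gradient step can be written as:

    \theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{T_i}(\theta)
    \theta \leftarrow \theta - \beta \nabla_\theta \sum_{T_i} \mathcal{L}_{T_i}(\theta_i')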

To use this class, create a subclass of MetaLearner that encapsulates the model and data for your learning problem. Pass it to a MAML object and call fit(). You can then use train_on_current_task() to fine-tune the model for a particular task.
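
A hypothetical end-to-end sketch, assuming the SineLearner class from the MetaLearner section above and the module path deepchem.metalearning; the step counts are arbitrary:

    from deepchem.metalearning import MAML

    learner = SineLearner()
    maml = MAML(learner, learning_rate=0.001, optimization_steps=1, meta_batch_size=10)
    maml.fit(9000)                                    # meta-train across many randomly drawn tasks

    learner.select_task()                             # pick a fresh task with only a little data
    maml.train_on_current_task(optimization_steps=1)  # fine-tune on that task
    loss, outputs = maml.predict_on_batch(learner.get_batch())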

__init__(learner, learning_rate=0.001, optimization_steps=1, meta_batch_size=10, optimizer=<deepchem.models.optimizers.Adam object>, model_dir=None)[source]

Create an object for performing meta-optimization.

Parameters
  • learner (MetaLearner) – defines the meta-learning problem

  • learning_rate (float or Tensor) – the learning rate to use for optimizing each task (not to be confused with the one used for meta-learning). This can optionally be made a variable (represented as a Tensor), in which case the learning rate will itself be learnable (see the sketch after this parameter list).

  • optimization_steps (int) – the number of steps of gradient descent to perform for each task

  • meta_batch_size (int) – the number of tasks to use for each step of meta-learning

  • optimizer (Optimizer) – the optimizer to use for meta-learning (not to be confused with the gradient descent optimization performed for each task)

  • model_dir (str) – the directory in which the model will be saved. If None, a temporary directory will be created.
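
As a sketch of the learnable learning rate option described above: passing a TensorFlow variable instead of a float is assumed here to be accepted by the installed DeepChem version, and the name inner_lr is purely illustrative.

    inner_lr = tf.Variable(0.001)        # trainable inner-loop learning rate
    maml = MAML(learner, learning_rate=inner_lr)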

fit(steps, max_checkpoints_to_keep=5, checkpoint_interval=600, restore=False)[source]

Perform meta-learning to train the model.

Parameters
  • steps (int) – the number of steps of meta-learning to perform

  • max_checkpoints_to_keep (int) – the maximum number of checkpoint files to keep. When this number is reached, older files are deleted.

  • checkpoint_interval (float) – the time interval at which to save checkpoints, measured in seconds

  • restore (bool) – if True, restore the model from the most recent checkpoint before training it further
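
A short sketch of meta-training in two stages, using the restore flag to continue from the most recent checkpoint; the checkpoint directory, step counts, and interval are arbitrary choices for illustration:

    maml = MAML(learner, model_dir='/tmp/maml_sine')  # hypothetical checkpoint directory
    maml.fit(5000, checkpoint_interval=300)           # save a checkpoint about every five minutes
    maml.fit(5000, restore=True)                      # later: reload the latest checkpoint and keep training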

restore()[source]

Reload the model parameters from the most recent checkpoint file.

train_on_current_task(optimization_steps=1, restore=True)[source]

Perform a few steps of gradient descent to fine-tune the model on the current task.

Parameters
  • optimization_steps (int) – the number of steps of gradient descent to perform

  • restore (bool) – if True, restore the model from the most recent checkpoint before optimizing
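
For example, one might compare the loss on a freshly selected task before and after fine-tuning. This is a sketch reusing the hypothetical SineLearner and an already meta-trained MAML object from the examples above:

    learner.select_task()                             # a new task to adapt to
    batch = learner.get_batch()
    maml.restore()                                    # start from the meta-trained parameters
    loss_before, _ = maml.predict_on_batch(batch)
    maml.train_on_current_task(optimization_steps=1, restore=False)
    loss_after, _ = maml.predict_on_batch(batch)
    print('loss before adaptation:', float(loss_before), 'after:', float(loss_after))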

predict_on_batch(inputs)[source]

Compute the model’s outputs for a batch of inputs.

Parameters

inputs (list of arrays) – the inputs to the model

Returns

(loss, outputs), where loss is the value of the model’s loss function and outputs is a list of the model’s outputs