Layers

Deep learning models are often said to be made up of “layers”. Intuitively, a “layer” is a function which transforms some tensor into another tensor. DeepChem maintains an extensive collection of layers which perform various useful scientific transformations. For now, most layers are Keras-only, but over time we expect this support to expand to other types of models and layers.
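
For example, a standard Keras layer (used here purely for illustration, independent of DeepChem) maps an input tensor to an output tensor:

>>> import numpy as np
>>> import tensorflow as tf
>>> x = tf.convert_to_tensor(np.random.rand(4, 16).astype(np.float32))
>>> y = tf.keras.layers.Dense(8, activation='relu')(x)
>>> tuple(y.shape)
(4, 8)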

Keras Layers

class InteratomicL2Distances(*args, **kwargs)[source]

Compute (squared) L2 Distances between atoms given neighbors.

This class computes pairwise distances between its inputs.

Examples

>>> import numpy as np
>>> import deepchem as dc
>>> atoms = 5
>>> neighbors = 2
>>> coords = np.random.rand(atoms, 3)
>>> neighbor_list = np.random.randint(0, atoms, size=(atoms, neighbors))
>>> layer = InteratomicL2Distances(atoms, neighbors, 3)
>>> result = np.array(layer([coords, neighbor_list]))
>>> result.shape
(5, 2)
__init__(N_atoms: int, M_nbrs: int, ndim: int, **kwargs)[source]

Constructor for this layer.

Parameters
  • N_atoms (int) – Number of atoms in the system total.

  • M_nbrs (int) – Number of neighbors to consider when computing distances.

  • ndim (int) – Number of descriptors for each atom.

get_config() → Dict[source]

Returns config dictionary for this layer.

call(inputs)[source]

Invokes this layer.

Parameters

inputs (list) – Should be of form inputs=[coords, nbr_list] where coords is a tensor of shape (None, N, 3) and nbr_list is a list.

Returns

Tensor of shape (N_atoms, M_nbrs) with interatomic distances.

class GraphConv(*args, **kwargs)[source]

Graph Convolutional Layers

This layer implements the graph convolution introduced in [1]. The graph convolution combines per-node feature vectors in a nonlinear fashion with the feature vectors for neighboring nodes. This “blends” information in local neighborhoods of a graph.
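
Examples

A minimal usage sketch (assuming RDKit is installed and that the featurizer accepts SMILES strings); the inputs are assembled from a ConvMol batch in the same way GraphConvModel does:

>>> import numpy as np
>>> import tensorflow as tf
>>> import deepchem as dc
>>> from deepchem.feat.mol_graphs import ConvMol
>>> from deepchem.models.layers import GraphConv
>>> mols = dc.feat.ConvMolFeaturizer().featurize(["CCO", "CCC"])
>>> multi_mol = ConvMol.agglomerate_mols(mols)
>>> atom_features = multi_mol.get_atom_features().astype(np.float32)
>>> degree_slice = multi_mol.deg_slice
>>> membership = np.array(multi_mol.membership)
>>> deg_adjs = list(multi_mol.get_deg_adjacency_lists()[1:])
>>> layer = GraphConv(out_channel=64, activation_fn=tf.nn.relu)
>>> out = layer([atom_features, degree_slice, membership] + deg_adjs)  # per-atom features of width 64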

References

[1] Duvenaud, David K., et al. “Convolutional networks on graphs for learning molecular fingerprints.” Advances in neural information processing systems. 2015. https://arxiv.org/abs/1509.09292

__init__(out_channel: int, min_deg: int = 0, max_deg: int = 10, activation_fn: Optional[Callable] = None, **kwargs)[source]

Initialize a graph convolutional layer.

Parameters
  • out_channel (int) – The number of output channels per graph node.

  • min_deg (int, optional (default 0)) – The minimum allowed degree for each graph node.

  • max_deg (int, optional (default 10)) – The maximum allowed degree for each graph node. Note that this is set to 10 to handle complex molecules (some organometallic compounds have strange structures). If you’re using this for non-molecular applications, you may need to set this much higher depending on your dataset.

  • activation_fn (function) – A nonlinear activation function to apply. If you’re not sure, tf.nn.relu is probably a good default for your application.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

sum_neigh(atoms, deg_adj_lists)[source]

Store the summed atoms by degree

class GraphPool(*args, **kwargs)[source]

A GraphPool gathers data from local neighborhoods of a graph.

This layer does a max-pooling over the feature vectors of atoms in a neighborhood. You can think of this layer as analogous to a max-pooling layer for 2D convolutions but which operates on graphs instead. This technique is described in [1].
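
Examples

A minimal usage sketch (assuming RDKit is installed); the inputs are assembled from a ConvMol batch exactly as in the GraphConv example above, and the output has the same shape as the input atom features:

>>> import numpy as np
>>> import deepchem as dc
>>> from deepchem.feat.mol_graphs import ConvMol
>>> from deepchem.models.layers import GraphPool
>>> mols = dc.feat.ConvMolFeaturizer().featurize(["CCO", "CCC"])
>>> multi_mol = ConvMol.agglomerate_mols(mols)
>>> atom_features = multi_mol.get_atom_features().astype(np.float32)
>>> degree_slice = multi_mol.deg_slice
>>> membership = np.array(multi_mol.membership)
>>> deg_adjs = list(multi_mol.get_deg_adjacency_lists()[1:])
>>> out = GraphPool()([atom_features, degree_slice, membership] + deg_adjs)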

References

[1] Duvenaud, David K., et al. “Convolutional networks on graphs for learning molecular fingerprints.” Advances in neural information processing systems. 2015. https://arxiv.org/abs/1509.09292

__init__(min_degree=0, max_degree=10, **kwargs)[source]

Initialize this layer

Parameters
  • min_degree (int, optional (default 0)) – The minimum allowed degree for each graph node.

  • max_degree (int, optional (default 10)) – The maximum allowed degree for each graph node. Note that this is set to 10 to handle complex molecules (some organometallic compounds have strange structures). If you’re using this for non-molecular applications, you may need to set this much higher depending on your dataset.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class GraphGather(*args, **kwargs)[source]

A GraphGather layer pools node-level feature vectors to create a graph feature vector.

Many graph convolutional networks manipulate feature vectors per graph-node. For a molecule for example, each node might represent an atom, and the network would manipulate atomic feature vectors that summarize the local chemistry of the atom. However, at the end of the application, we will likely want to work with a molecule level feature representation. The GraphGather layer creates a graph level feature vector by combining all the node-level feature vectors.

One subtlety about this layer is that it depends on the batch_size. This is done for internal implementation reasons. The GraphConv and GraphPool layers pool all nodes from all graphs in a batch that’s being processed. The GraphGather reassembles these jumbled node feature vectors into per-graph feature vectors.
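
Examples

A minimal usage sketch (assuming RDKit is installed). Two molecules are featurized so that batch_size=2, and the inputs are assembled from a ConvMol batch as in the GraphConv example above:

>>> import numpy as np
>>> import tensorflow as tf
>>> import deepchem as dc
>>> from deepchem.feat.mol_graphs import ConvMol
>>> from deepchem.models.layers import GraphGather
>>> mols = dc.feat.ConvMolFeaturizer().featurize(["CCO", "CCC"])
>>> multi_mol = ConvMol.agglomerate_mols(mols)
>>> atom_features = multi_mol.get_atom_features().astype(np.float32)
>>> degree_slice = multi_mol.deg_slice
>>> membership = np.array(multi_mol.membership)
>>> deg_adjs = list(multi_mol.get_deg_adjacency_lists()[1:])
>>> layer = GraphGather(batch_size=2, activation_fn=tf.nn.tanh)
>>> out = layer([atom_features, degree_slice, membership] + deg_adjs)  # one feature vector per molecule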

References

[1] Duvenaud, David K., et al. “Convolutional networks on graphs for learning molecular fingerprints.” Advances in neural information processing systems. 2015. https://arxiv.org/abs/1509.09292

__init__(batch_size, activation_fn=None, **kwargs)[source]

Initialize this layer.

Parameters
  • batch_size (int) – The batch size for this layer. Note that the layer’s behavior changes depending on the batch size.

  • activation_fn (function) – A nonlinear activation function to apply. If you’re not sure, tf.nn.relu is probably a good default for your application.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

Invoke this layer.

Parameters

inputs (list) – This list should consist of inputs = [atom_features, deg_slice, membership, deg_adj_list placeholders…]. These are all tensors that are created/processed by GraphConv and GraphPool.

class MolGANConvolutionLayer(*args, **kwargs)[source]

Graph convolution layer used in the MolGAN model. MolGAN is a WGAN-type model for generation of small molecules. It is not used directly; higher-level layers like MolGANMultiConvolutionLayer use it. This layer performs a basic convolution on one-hot encoded matrices containing atom and bond information. The layer also accepts three inputs, for the case when convolution is performed more than once and the results of a previous convolution need to be used. It was designed this way to avoid creating a separate layer that accepts three inputs rather than two. The last input is the so-called hidden_layer, which holds the results of the convolution, while the first two are the unchanged input tensors.

Example

See MolGANMultiConvolutionLayer for how this layer is used inside larger layers.

>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input
>>> vertices = 9
>>> nodes = 5
>>> edges = 5
>>> units = 128
>>> layer1 = MolGANConvolutionLayer(units=units,edges=edges, name='layer1')
>>> layer2 = MolGANConvolutionLayer(units=units,edges=edges, name='layer2')
>>> adjacency_tensor= Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices,nodes))
>>> hidden1 = layer1([adjacency_tensor,node_tensor])
>>> output = layer2(hidden1)
>>> model = Model(inputs=[adjacency_tensor,node_tensor], outputs=[output])

References

[1] Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: int, activation: Callable = <function tanh>, dropout_rate: float = 0.0, edges: int = 5, name: str = '', **kwargs)[source]

Initialize this layer.

Parameters
  • units (int) – Dimension of the dense layers used for convolution.

  • activation (function, optional (default=tanh)) – Activation function used across the model; default is tanh.

  • dropout_rate (float, optional (default=0.0)) – Dropout rate used by the dropout layer.

  • edges (int, optional (default=5)) – Number of dense layers used for convolution. Typically equal to the number of bond types used in the model.

  • name (string, optional (default="")) – Name of the layer.

call(inputs, training=False)[source]

Invoke this layer

Parameters
  • inputs (list) – List of two input tensors: the adjacency tensor and the node features tensor, in one-hot encoding format.

  • training (bool) – Whether this layer should be run in training mode. Typically decided by the main model; influences things like dropout.

Returns

The first and second outputs are the original input tensors; the third is the result of the convolution.

Return type

tuple(tf.Tensor,tf.Tensor,tf.Tensor)

get_config() → Dict[source]

Returns config dictionary for this layer.

class MolGANAggregationLayer(*args, **kwargs)[source]

Graph aggregation layer used in the MolGAN model. MolGAN is a WGAN-type model for generation of small molecules. This layer performs aggregation on the tensor resulting from the convolution layers. Given its simple nature, it might be removed in the future and folded into MolGANEncoderLayer.

Example

>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input
>>> vertices = 9
>>> nodes = 5
>>> edges = 5
>>> units = 128
>>> layer_1 = MolGANConvolutionLayer(units=units,edges=edges, name='layer1')
>>> layer_2 = MolGANConvolutionLayer(units=units,edges=edges, name='layer2')
>>> layer_3 = MolGANAggregationLayer(units=128, name='layer3')
>>> adjacency_tensor= Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices,nodes))
>>> hidden_1 = layer_1([adjacency_tensor,node_tensor])
>>> hidden_2 = layer_2(hidden_1)
>>> output = layer_3(hidden_2[2])
>>> model = Model(inputs=[adjacency_tensor,node_tensor], outputs=[output])

References

[1] Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: int = 128, activation: Callable = <function tanh>, dropout_rate: float = 0.0, name: str = '', **kwargs)[source]

Initialize the layer

Parameters
  • units (int, optional (default=128)) – Dimension of the dense layers used for aggregation.

  • activation (function, optional (default=tanh)) – Activation function used across the model; default is tanh.

  • dropout_rate (float, optional (default=0.0)) – Rate used by the dropout layer.

  • name (string, optional (default="")) – Name of the layer.

call(inputs, training=False)[source]

Invoke this layer

Parameters
  • inputs (List) – Single tensor resulting from a graph convolution layer.

  • training (bool) – Whether this layer should be run in training mode. Typically decided by the main model; influences things like dropout.

Returns

aggregation tensor – Result of aggregation function on input convolution tensor.

Return type

tf.Tensor

get_config() → Dict[source]

Returns config dictionary for this layer.

class MolGANMultiConvolutionLayer(*args, **kwargs)[source]

Multiple-pass convolution layer used in the MolGAN model. MolGAN is a WGAN-type model for generation of small molecules. This layer takes the outputs of a previous convolution layer and uses them as inputs to the next one. It simplifies the overall framework, but might be moved into MolGANEncoderLayer in the future to reduce the number of layers.

Example

>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input
>>> vertices = 9
>>> nodes = 5
>>> edges = 5
>>> units = 128
>>> layer_1 = MolGANMultiConvolutionLayer(units=(128,64), name='layer1')
>>> layer_2 = MolGANAggregationLayer(units=128, name='layer2')
>>> adjacency_tensor= Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices,nodes))
>>> hidden = layer_1([adjacency_tensor,node_tensor])
>>> output = layer_2(hidden)
>>> model = Model(inputs=[adjacency_tensor,node_tensor], outputs=[output])

References

[1] Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: Tuple = (128, 64), activation: Callable = <function tanh>, dropout_rate: float = 0.0, edges: int = 5, name: str = '', **kwargs)[source]

Initialize the layer

Parameters
  • units (Tuple, optional (default=(128, 64)), min_length=2) – Dimensions used by consecutive convolution layers; each additional value adds another convolution layer.

  • activation (function, optional (default=tanh)) – Activation function used across the model; default is tanh.

  • dropout_rate (float, optional (default=0.0)) – Rate used by the dropout layer.

  • edges (int, optional (default=5)) – Controls how many dense layers are used by a single convolution unit. Typically matches the number of bond types used in the molecule.

  • name (string, optional (default="")) – Name of the layer.

call(inputs, training=False)[source]

Invoke this layer

Parameters
  • inputs (list) – List of two input tensors: the adjacency tensor and the node features tensor, in one-hot encoding format.

  • training (bool) – Whether this layer should be run in training mode. Typically decided by the main model; influences things like dropout.

Returns

convolution tensor – Result of input tensors going through convolution a number of times.

Return type

tf.Tensor

get_config() → Dict[source]

Returns config dictionary for this layer.

class MolGANEncoderLayer(*args, **kwargs)[source]

Main learning layer used by the MolGAN model. MolGAN is a WGAN-type model for generation of small molecules. Its role is to further simplify the model. This layer can also be built manually by stacking graph convolution layers followed by graph aggregation.

Example

>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input, Dropout,Dense
>>> vertices = 9
>>> edges = 5
>>> nodes = 5
>>> dropout_rate = .0
>>> adjacency_tensor= Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices, nodes))
>>> graph = MolGANEncoderLayer(units = [(128,64),128], dropout_rate= dropout_rate, edges=edges)([adjacency_tensor,node_tensor])
>>> dense = Dense(units=128, activation='tanh')(graph)
>>> dense = Dropout(dropout_rate)(dense)
>>> dense = Dense(units=64, activation='tanh')(dense)
>>> dense = Dropout(dropout_rate)(dense)
>>> output = Dense(units=1)(dense)
>>> model = Model(inputs=[adjacency_tensor,node_tensor], outputs=[output])

References

[1] Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: List = [(128, 64), 128], activation: Callable = <function tanh>, dropout_rate: float = 0.0, edges: int = 5, name: str = '', **kwargs)[source]

Initialize the layer.

Parameters
  • units (List, optional (default=[(128, 64), 128])) – List of units for MolGANMultiConvolutionLayer and MolGANAggregationLayer, i.e. [(128, 64), 128] means two convolution layers with dims [128, 64] followed by an aggregation layer with dim 128.

  • activation (function, optional (default=tanh)) – Activation function used across the model; default is tanh.

  • dropout_rate (float, optional (default=0.0)) – Rate used by the dropout layer.

  • edges (int, optional (default=5)) – Controls how many dense layers are used by a single convolution unit. Typically matches the number of bond types used in the molecule.

  • name (string, optional (default="")) – Name of the layer.

call(inputs, training=False)[source]

Invoke this layer

Parameters
  • inputs (list) – List of two input tensors: the adjacency tensor and the node features tensor, in one-hot encoding format.

  • training (bool) – Whether this layer should be run in training mode. Typically decided by the main model; influences things like dropout.

Returns

encoder tensor – Tensor that has been through a number of convolutions followed by aggregation.

Return type

tf.Tensor

get_config() → Dict[source]

Returns config dictionary for this layer.

class LSTMStep(*args, **kwargs)[source]

Layer that performs a single step LSTM update.

This layer performs a single step LSTM update. Note that it is not a full LSTM recurrent network. The LSTMStep layer is useful as a primitive for designing layers such as the AttnLSTMEmbedding or the IterRefLSTMEmbedding below.
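
Examples

A minimal sketch of a single update step; the shapes below are illustrative (input vectors of dimension 2 * n_feat, hidden and cell states of dimension n_feat):

>>> import numpy as np
>>> from deepchem.models.layers import LSTMStep
>>> n_feat = 10
>>> layer = LSTMStep(n_feat, 2 * n_feat)
>>> x = np.random.rand(5, 2 * n_feat).astype(np.float32)
>>> h_0 = np.zeros((5, n_feat), dtype=np.float32)
>>> c_0 = np.zeros((5, n_feat), dtype=np.float32)
>>> h, state = layer([x, h_0, c_0])  # state is [h, c]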

__init__(output_dim, input_dim, init_fn='glorot_uniform', inner_init_fn='orthogonal', activation_fn='tanh', inner_activation_fn='hard_sigmoid', **kwargs)[source]
Parameters
  • output_dim (int) – Dimensionality of output vectors.

  • input_dim (int) – Dimensionality of input vectors.

  • init_fn (str) – TensorFlow initialization to use for W.

  • inner_init_fn (str) – TensorFlow initialization to use for U.

  • activation_fn (str) – TensorFlow activation to use for output.

  • inner_activation_fn (str) – TensorFlow activation to use for inner steps.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Constructs learnable weights for this layer.

call(inputs)[source]

Execute this layer on input tensors.

Parameters

inputs (list) – List of three tensors (x, h_tm1, c_tm1). h_tm1 means “h, t-1”.

Returns

Returns h, [h, c]

Return type

list

class AttnLSTMEmbedding(*args, **kwargs)[source]

Implements the AttnLSTM from the matching networks paper.

The AttnLSTM embedding adjusts two sets of vectors, the “test” and “support” sets. The “support” consists of a set of evidence vectors. Think of these as the small training set for low-data machine learning. The “test” consists of the queries we wish to answer with the small amounts of available data. The AttnLSTMEmbedding allows us to modify the embedding of the “test” set depending on the contents of the “support”. The AttnLSTMEmbedding is thus a type of learnable metric that allows a network to modify its internal notion of distance.

See references [1] [2] for more details.
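
Examples

A minimal sketch adjusting a random test set against a random support set (sizes are illustrative):

>>> import numpy as np
>>> from deepchem.models.layers import AttnLSTMEmbedding
>>> n_test, n_support, n_feat, max_depth = 5, 11, 10, 3
>>> layer = AttnLSTMEmbedding(n_test, n_support, n_feat, max_depth)
>>> test = np.random.rand(n_test, n_feat).astype(np.float32)
>>> support = np.random.rand(n_support, n_feat).astype(np.float32)
>>> test_out, support_out = layer([test, support])  # shapes (n_test, n_feat) and (n_support, n_feat)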

References

[1] Vinyals, Oriol, et al. “Matching networks for one shot learning.” Advances in neural information processing systems. 2016.

[2] Vinyals, Oriol, Samy Bengio, and Manjunath Kudlur. “Order matters: Sequence to sequence for sets.” arXiv preprint arXiv:1511.06391 (2015).

__init__(n_test, n_support, n_feat, max_depth, **kwargs)[source]
Parameters
  • n_support (int) – Size of support set.

  • n_test (int) – Size of test set.

  • n_feat (int) – Number of features per atom

  • max_depth (int) – Number of “processing steps” used by sequence-to-sequence for sets model.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Execute this layer on input tensors.

Parameters

inputs (list) – List of two tensors (X, Xp). X should be of shape (n_test, n_feat) and Xp should be of shape (n_support, n_feat) where n_test is the size of the test set, n_support that of the support set, and n_feat is the number of per-atom features.

Returns

Returns two tensors of same shape as input. Namely the output shape will be [(n_test, n_feat), (n_support, n_feat)]

Return type

list

class IterRefLSTMEmbedding(*args, **kwargs)[source]

Implements the Iterative Refinement LSTM.

Much like AttnLSTMEmbedding, the IterRefLSTMEmbedding is another type of learnable metric which adjusts “test” and “support”. Recall that “support” is the small amount of data available in a low data machine learning problem, and that “test” is the query. The AttnLSTMEmbedding only modifies the “test” based on the contents of the support. However, the IterRefLSTM modifies both the “support” and “test” based on each other. This allows the learnable metric to be more malleable than that from AttnLSTMEmbedding.
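
Examples

A minimal sketch, analogous to the AttnLSTMEmbedding example; here both the test and support embeddings are adjusted:

>>> import numpy as np
>>> from deepchem.models.layers import IterRefLSTMEmbedding
>>> n_test, n_support, n_feat, max_depth = 5, 11, 10, 3
>>> layer = IterRefLSTMEmbedding(n_test, n_support, n_feat, max_depth)
>>> test = np.random.rand(n_test, n_feat).astype(np.float32)
>>> support = np.random.rand(n_support, n_feat).astype(np.float32)
>>> test_out, support_out = layer([test, support])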

__init__(n_test, n_support, n_feat, max_depth, **kwargs)[source]

Unlike the AttnLSTM model which only modifies the test vectors additively, this model allows for an additive update to be performed to both test and support using information from each other.

Parameters
  • n_support (int) – Size of support set.

  • n_test (int) – Size of test set.

  • n_feat (int) – Number of input atom features

  • max_depth (int) – Number of LSTM Embedding layers.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Execute this layer on input tensors.

Parameters

inputs (list) – List of two tensors (X, Xp). X should be of shape (n_test, n_feat) and Xp should be of shape (n_support, n_feat) where n_test is the size of the test set, n_support that of the support set, and n_feat is the number of per-atom features.

Returns

Returns two tensors of same shape as input. Namely the output shape will be [(n_test, n_feat), (n_support, n_feat)].

class SwitchedDropout(*args, **kwargs)[source]

Apply dropout based on an input.

This is required for uncertainty prediction. The standard Keras Dropout layer only performs dropout during training, but we sometimes need to do it during prediction. The second input to this layer should be a scalar equal to 0 or 1, indicating whether to perform dropout.
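
Examples

A minimal sketch, assuming the constructor takes the dropout rate as its rate argument; the second input switches dropout on (1) or off (0):

>>> import numpy as np
>>> from deepchem.models.layers import SwitchedDropout
>>> layer = SwitchedDropout(rate=0.5)
>>> x = np.random.rand(4, 6).astype(np.float32)
>>> switch = np.array([[1.0]], dtype=np.float32)  # 1.0 applies dropout, 0.0 leaves x unchanged
>>> y = layer([x, switch])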

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class WeightedLinearCombo(*args, **kwargs)[source]

Computes a weighted linear combination of input layers, with the weights defined by trainable variables.
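
Examples

A minimal sketch combining two equally shaped inputs with trainable scalar weights:

>>> import numpy as np
>>> from deepchem.models.layers import WeightedLinearCombo
>>> x1 = np.random.rand(4, 6).astype(np.float32)
>>> x2 = np.random.rand(4, 6).astype(np.float32)
>>> out = WeightedLinearCombo()([x1, x2])  # w1 * x1 + w2 * x2 with trainable w1, w2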

__init__(std=0.3, **kwargs)[source]

Initialize this layer.

Parameters

std (float, optional (default 0.3)) – The standard deviation to use when randomly initializing weights.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class CombineMeanStd(*args, **kwargs)[source]

Generate Gaussian noise.
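
Examples

A minimal sketch; during training each output element is drawn from a Gaussian whose mean and standard deviation come from the corresponding elements of the two inputs:

>>> import numpy as np
>>> from deepchem.models.layers import CombineMeanStd
>>> layer = CombineMeanStd(training_only=True, noise_epsilon=1.0)
>>> mean = np.random.rand(4, 6).astype(np.float32)
>>> std = np.random.rand(4, 6).astype(np.float32)
>>> sample = layer([mean, std], training=True)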

__init__(training_only=False, noise_epsilon=1.0, **kwargs)[source]

Create a CombineMeanStd layer.

This layer should have two inputs with the same shape, and its output also has the same shape. Each element of the output is a Gaussian distributed random number whose mean is the corresponding element of the first input, and whose standard deviation is the corresponding element of the second input.

Parameters
  • training_only (bool) – if True, noise is only generated during training. During prediction, the output is simply equal to the first input (that is, the mean of the distribution used during training).

  • noise_epsilon (float) – The noise is scaled by this factor

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs, training=True)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class Stack(*args, **kwargs)[source]

Stack the inputs along a new axis.
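
Examples

A minimal sketch, assuming the default stacking axis is 1; two (4, 6) inputs become a single (4, 2, 6) tensor:

>>> import numpy as np
>>> from deepchem.models.layers import Stack
>>> x1 = np.random.rand(4, 6).astype(np.float32)
>>> x2 = np.random.rand(4, 6).astype(np.float32)
>>> out = Stack()([x1, x2])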

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class VinaFreeEnergy(*args, **kwargs)[source]

Computes free-energy as defined by Autodock Vina.

TODO(rbharath): Make this layer support batching.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

nonlinearity(c, w)[source]

Computes non-linearity used in Vina.

repulsion(d)[source]

Computes Autodock Vina’s repulsion interaction term.

hydrophobic(d)[source]

Computes Autodock Vina’s hydrophobic interaction term.

hydrogen_bond(d)[source]

Computes Autodock Vina’s hydrogen bond interaction term.

gaussian_first(d)[source]

Computes Autodock Vina’s first Gaussian interaction term.

gaussian_second(d)[source]

Computes Autodock Vina’s second Gaussian interaction term.

call(inputs)[source]
Parameters
  • X (tf.Tensor of shape (N, d)) – Coordinates/features.

  • Z (tf.Tensor of shape (N)) – Atomic numbers of neighbor atoms.

Returns

layer – The free energy of each complex in batch

Return type

tf.Tensor of shape (B)

class NeighborList(*args, **kwargs)[source]

Computes a neighbor-list in TensorFlow.

Neighbor-lists (also called Verlet Lists) are a tool for grouping atoms which are close to each other spatially. This layer computes a Neighbor List from a provided tensor of atomic coordinates. You can think of this as a general “k-means” layer, but optimized for the case k==3.

TODO(rbharath): Make this layer support batching.
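
Examples

A minimal sketch on a handful of random atoms in a 10 Angstrom box (all parameter values are illustrative):

>>> import numpy as np
>>> from deepchem.models.layers import NeighborList
>>> N_atoms, M_nbrs, ndim = 5, 2, 3
>>> layer = NeighborList(N_atoms, M_nbrs, ndim, nbr_cutoff=3.0, start=0, stop=10)
>>> coords = 10 * np.random.rand(N_atoms, ndim).astype(np.float32)
>>> nbr_list = layer(coords)  # shape (N_atoms, M_nbrs), indices of neighboring atoms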

__init__(N_atoms, M_nbrs, ndim, nbr_cutoff, start, stop, **kwargs)[source]
Parameters
  • N_atoms (int) – Maximum number of atoms this layer will neighbor-list.

  • M_nbrs (int) – Maximum number of spatial neighbors possible for an atom.

  • ndim (int) – Dimensionality of space atoms live in. (Typically 3D, but sometimes will want to use higher dimensional descriptors for atoms).

  • nbr_cutoff (float) – Length in Angstroms (?) at which atom boxes are gridded.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

compute_nbr_list(coords)[source]

Get closest neighbors for atoms.

Needs to handle padding for atoms with no neighbors.

Parameters

coords (tf.Tensor) – Shape (N_atoms, ndim)

Returns

nbr_list – Shape (N_atoms, M_nbrs) of atom indices

Return type

tf.Tensor

get_atoms_in_nbrs(coords, cells)[source]

Get the atoms in neighboring cells for each cell.

Returns

atoms_in_nbrs – Of shape (N_atoms, n_nbr_cells, M_nbrs).

get_closest_atoms(coords, cells)[source]

For each cell, find M_nbrs closest atoms.

Let N_atoms be the number of atoms.

Parameters
  • coords (tf.Tensor) – (N_atoms, ndim) shape.

  • cells (tf.Tensor) – (n_cells, ndim) shape.

Returns

closest_inds – Of shape (n_cells, M_nbrs)

Return type

tf.Tensor

get_cells_for_atoms(coords, cells)[source]

Compute the cells each atom belongs to.

Parameters
  • coords (tf.Tensor) – Shape (N_atoms, ndim)

  • cells (tf.Tensor) – (n_cells, ndim) shape.

Returns

cells_for_atoms – Shape (N_atoms, 1)

Return type

tf.Tensor

get_neighbor_cells(cells)[source]

Compute neighbors of cells in grid.

TODO(rbharath): Do we need to handle periodic boundary conditions properly here?

TODO(rbharath): This doesn’t handle boundaries well. We hard-code looking for n_nbr_cells neighbors, which isn’t right for boundary cells in the cube.

Parameters

cells (tf.Tensor) – (n_cells, ndim) shape.

Returns

nbr_cells – (n_cells, n_nbr_cells)

Return type

tf.Tensor

get_cells()[source]

Returns the locations of all grid points in box.

Suppose start is -10 Angstrom, stop is 10 Angstrom, and nbr_cutoff is 1. Then this would return a list of length 20^3 whose entries would be [(-10, -10, -10), (-10, -10, -9), …, (9, 9, 9)].

Returns

cells – (n_cells, ndim) shape.

Return type

tf.Tensor

class AtomicConvolution(*args, **kwargs)[source]

Implements the atomic convolutional transform introduced in

Gomes, Joseph, et al. “Atomic convolutional networks for predicting protein-ligand binding affinity.” arXiv preprint arXiv:1703.10603 (2017).

At a high level, this transform performs a graph convolution on the nearest neighbors graph in 3D space.
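
Examples

A minimal sketch with random coordinates and neighbor lists; each entry of radial_params is assumed to be a (cutoff, mean, width) triple:

>>> import numpy as np
>>> from deepchem.models.layers import AtomicConvolution
>>> B, N, M, d = 1, 5, 2, 3
>>> layer = AtomicConvolution(radial_params=[[1.5, 0.0, 1.0]])
>>> X = np.random.rand(B, N, d).astype(np.float32)
>>> Nbrs = np.random.randint(0, N, size=(B, N, M))
>>> Nbrs_Z = np.random.randint(1, 10, size=(B, N, M)).astype(np.float32)
>>> out = layer([X, Nbrs, Nbrs_Z])  # shape (B, N, l) as described below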

__init__(atom_types=None, radial_params=[], boxsize=None, **kwargs)[source]

Atomic convolution layer

N = max_num_atoms, M = max_num_neighbors, B = batch_size, d = num_features, l = num_radial_filters * num_atom_types

Parameters
  • atom_types (list or None) – Of length a, where a is number of atom types for filtering.

  • radial_params (list) – Of length l, where l is number of radial filters learned.

  • boxsize (float or None) – Simulation box length [Angstrom].

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]
Parameters
  • X (tf.Tensor of shape (B, N, d)) – Coordinates/features.

  • Nbrs (tf.Tensor of shape (B, N, M)) – Neighbor list.

  • Nbrs_Z (tf.Tensor of shape (B, N, M)) – Atomic numbers of neighbor atoms.

Returns

layer – A new tensor representing the output of the atomic conv layer

Return type

tf.Tensor of shape (B, N, l)

radial_symmetry_function(R, rc, rs, e)[source]

Calculates radial symmetry function.

B = batch_size, N = max_num_atoms, M = max_num_neighbors, d = num_filters

Parameters
  • R (tf.Tensor of shape (B, N, M)) – Distance matrix.

  • rc (float) – Interaction cutoff [Angstrom].

  • rs (float) – Gaussian distance matrix mean.

  • e (float) – Gaussian distance matrix width.

Returns

retval – Radial symmetry function (before summation)

Return type

tf.Tensor of shape (B, N, M)

radial_cutoff(R, rc)[source]

Calculates radial cutoff matrix.

B = batch_size, N = max_num_atoms, M = max_num_neighbors

Parameters
  • R (tf.Tensor of shape (B, N, M)) – Distance matrix.

  • rc (tf.Variable) – Interaction cutoff [Angstrom].

Returns

FC [B, N, M] – Radial cutoff matrix.

Return type

tf.Tensor

gaussian_distance_matrix(R, rs, e)[source]

Calculates gaussian distance matrix.

B = batch_size, N = max_num_atoms, M = max_num_neighbors

Parameters
  • R (tf.Tensor of shape (B, N, M)) – Distance matrix.

  • rs (tf.Variable) – Gaussian distance matrix mean.

  • e (tf.Variable) – Gaussian distance matrix width (e = .5/std**2).

Returns

retval [B, N, M] – Gaussian distance matrix.

Return type

tf.Tensor

distance_tensor(X, Nbrs, boxsize, B, N, M, d)[source]

Calculates distance tensor for batch of molecules.

B = batch_size, N = max_num_atoms, M = max_num_neighbors, d = num_features

Parameters
  • X (tf.Tensor of shape (B, N, d)) – Coordinates/features tensor.

  • Nbrs (tf.Tensor of shape (B, N, M)) – Neighbor list tensor.

  • boxsize (float or None) – Simulation box length [Angstrom].

Returns

D – Coordinates/features distance tensor.

Return type

tf.Tensor of shape (B, N, M, d)

distance_matrix(D)[source]

Calculates the distance matrix from the distance tensor.

B = batch_size, N = max_num_atoms, M = max_num_neighbors, d = num_features

Parameters

D (tf.Tensor of shape (B, N, M, d)) – Distance tensor.

Returns

R – Distance matrix.

Return type

tf.Tensor of shape (B, N, M)

class AlphaShareLayer(*args, **kwargs)[source]

Part of a sluice network. Adds alpha parameters to control sharing between the main and auxiliary tasks.

The factory method AlphaShare should be used for construction.

Parameters

in_layers (list of Layers or tensors) – Tensors in the list must be the same size, and the list must include two or more tensors.

Returns

out_tensor – A tensor with shape [len(in_layers), x, y], where x and y are the original layer dimensions.
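
Examples

A minimal sketch; two equally shaped inputs come back re-mixed through the trainable alpha parameters:

>>> import numpy as np
>>> from deepchem.models.layers import AlphaShareLayer
>>> x1 = np.random.rand(4, 6).astype(np.float32)
>>> x2 = np.random.rand(4, 6).astype(np.float32)
>>> out1, out2 = AlphaShareLayer()([x1, x2])  # each output has shape (4, 6)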

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API, masking support is passed to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class SluiceLoss(*args, **kwargs)[source]

Calculates the loss in a sluice network. Every input into an AlphaShare layer should also be used as an input to SluiceLoss.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API. In the Keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides the compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules: - inputs must be explicitly passed. A layer cannot have zero

    arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved: - training: Boolean scalar tensor of Python boolean indicating

    whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class BetaShare(*args, **kwargs)[source]

Part of a sluice network. Adds beta parameters to control which layer outputs are used for prediction.

Parameters

in_layers (list of Layers or tensors) – tensors in list must be the same size and list must include two or more tensors

Returns

output_layers – Outputs weighted by the learned beta parameters.

Return type

list of Layers or tensors with same size as in_layers
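A minimal usage sketch (assumed, not from the original documentation): pass a list of two or more equally sized tensors and the layer combines them according to its beta parameters.

>>> import tensorflow as tf
>>> t1 = tf.random.normal((4, 10))
>>> t2 = tf.random.normal((4, 10))
>>> combined = BetaShare()([t1, t2])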

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

The sizes of the input layers must all be the same.

class ANIFeat(*args, **kwargs)[source]

Performs transform from 3D coordinates to ANI symmetry functions

__init__(max_atoms=23, radial_cutoff=4.6, angular_cutoff=3.1, radial_length=32, angular_length=8, atom_cases=[1, 6, 7, 8, 16], atomic_number_differentiated=True, coordinates_in_bohr=True, **kwargs)[source]

Only X can be transformed

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

Input should be a tensor of dtype tf.float32 and shape (None, self.max_atoms, 4).
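A minimal input sketch (the layout of the last dimension as the atomic number followed by the 3-D coordinates, and zero-padding of unused atom slots, are assumptions):

>>> import numpy as np
>>> import tensorflow as tf
>>> batch_size, max_atoms = 1, 23
>>> inputs = np.zeros((batch_size, max_atoms, 4), dtype=np.float32)
>>> inputs[0, 0] = [1.0, 0.0, 0.0, 0.0]  # assumed layout: atomic number, then x, y, z
>>> inputs[0, 1] = [1.0, 1.4, 0.0, 0.0]  # a second hydrogen nearby
>>> ani_feat = ANIFeat(max_atoms=max_atoms)
>>> symmetry_functions = ani_feat(tf.convert_to_tensor(inputs))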

distance_matrix(coordinates, flags)[source]

Generate distance matrix

distance_cutoff(d, cutoff, flags)[source]

Generate distance matrix with trainable cutoff

radial_symmetry(d_cutoff, d, atom_numbers)[source]

Radial Symmetry Function

angular_symmetry(d_cutoff, d, atom_numbers, coordinates)[source]

Angular Symmetry Function

class GraphEmbedPoolLayer(*args, **kwargs)[source]

GraphCNNPool Layer from Robust Spatial Filtering with Graph Convolutional Neural Networks https://arxiv.org/abs/1703.00792

This is a learnable pool operation. It constructs a new adjacency matrix for a graph with a specified number of nodes.

This differs from our other pool operations which set vertices to a function value without altering the adjacency matrix.

V_{emb} = SpatialGraphCNN(V_{in})

V_{out} = \sigma(V_{emb})^T V_{in}

A_{out} = V_{emb}^T A_{in} V_{emb}

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]
Parameters
  • num_filters (int) – Number of filters to have in the output

  • in_layers (list of Layers or tensors) –

[V, A, mask]. V holds the vertex features and must be of shape (batch, vertex, channel).

A holds the adjacency matrices for each graph, of shape (batch, from_vertex, adj_matrix, to_vertex).

mask is optional, to be used when not every graph has the same number of vertices.

Returns

A tf.Tensor with a graph convolution applied. The shape will be (batch, vertex, self.num_filters).
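A rough usage sketch following the [V, A, mask] convention above (the constructor argument name num_vertices is an assumption; per the equations, the layer produces pooled vertex features together with a new adjacency matrix):

>>> import numpy as np
>>> batch, n_vertices, n_channels = 2, 10, 16
>>> V = np.random.rand(batch, n_vertices, n_channels).astype(np.float32)
>>> A = np.random.rand(batch, n_vertices, 1, n_vertices).astype(np.float32)  # one edge type
>>> pool_layer = GraphEmbedPoolLayer(num_vertices=4)  # argument name assumed
>>> pooled = pool_layer([V, A])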

class GraphCNN(*args, **kwargs)[source]

GraphCNN Layer from Robust Spatial Filtering with Graph Convolutional Neural Networks https://arxiv.org/abs/1703.00792

Spatial-domain convolutions can be defined as

H = h_0 I + h_1 A + h_2 A^2 + … + h_k A^k,  H ∈ R^(N×N)

We approximate this by H ≈ h_0 I + h_1 A.

We can define a convolution as applying multiple of these linear filters over edges of different types (think up, down, left, right, diagonal in images), where each edge type has its own adjacency matrix:

H ≈ h_0 I + h_1 A_1 + h_2 A_2 + … + h_{L−1} A_{L−1}

V_out = sum_{c=1}^{C} H^{(c)} V^{(c)} + b

__init__(num_filters, **kwargs)[source]
Parameters
  • num_filters (int) – Number of filters to have in the output

  • in_layers (list of Layers or tensors) –

[V, A, mask]. V holds the vertex features and must be of shape (batch, vertex, channel).

A holds the adjacency matrices for each graph, of shape (batch, from_vertex, adj_matrix, to_vertex).

mask is optional, to be used when not every graph has the same number of vertices.

Returns

A tf.Tensor with a graph convolution applied. The shape will be (batch, vertex, self.num_filters).
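A rough usage sketch based on the parameters above (input shapes follow the [V, A, mask] convention; calling the layer without a mask is an assumption):

>>> import numpy as np
>>> batch, n_vertices, n_channels = 2, 10, 16
>>> V = np.random.rand(batch, n_vertices, n_channels).astype(np.float32)
>>> A = np.random.rand(batch, n_vertices, 1, n_vertices).astype(np.float32)  # one edge type
>>> layer = GraphCNN(num_filters=8)
>>> V_out = layer([V, A])  # expected shape (batch, n_vertices, 8)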

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API. In the Keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides the compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules: - inputs must be explicitly passed. A layer cannot have zero

    arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved: - training: Boolean scalar tensor of Python boolean indicating

    whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class Highway(*args, **kwargs)[source]

Create a highway layer. y = H(x) * T(x) + x * (1 - T(x))

H(x) = activation_fn(matmul(W_H, x) + b_H) is the non-linear transformed output.

T(x) = sigmoid(matmul(W_T, x) + b_T) is the transform gate.

Implementation based on paper

Srivastava, Rupesh Kumar, Klaus Greff, and Jürgen Schmidhuber. “Highway networks.” arXiv preprint arXiv:1505.00387 (2015).

This layer expects its input to be a two-dimensional tensor of shape (batch size, # input features). Outputs will have the same shape.
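A minimal usage sketch (invoking the layer directly on a single 2-D tensor is an assumption):

>>> import numpy as np
>>> x = np.random.rand(5, 16).astype(np.float32)  # (batch size, # input features)
>>> layer = Highway()
>>> y = layer(x)  # same shape as x, per the description above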

__init__(activation_fn='relu', biases_initializer='zeros', weights_initializer=None, **kwargs)[source]
Parameters
  • activation_fn (object) – the Tensorflow activation function to apply to the output

  • biases_initializer (callable object) – the initializer for bias values. This may be None, in which case the layer will not include biases.

  • weights_initializer (callable object) – the initializer for weight values

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API. In the Keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides the compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules: - inputs must be explicitly passed. A layer cannot have zero

    arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved: - training: Boolean scalar tensor of Python boolean indicating

    whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class WeaveLayer(*args, **kwargs)[source]

This class implements the core Weave convolution from the Google graph convolution paper [1]_

This model handles atom features and bond features separately. Here, bond features are also called pair features. The model implements four types of transformations: atom->atom, atom->pair, pair->atom, and pair->pair.

Examples

This layer expects 4 inputs in a list of the form [atom_features, pair_features, pair_split, atom_to_pair]. We’ll walk through the structure of these inputs. Let’s start with some basic definitions.

>>> import deepchem as dc
>>> import numpy as np

Suppose you have a batch of molecules

>>> smiles = ["CCC", "C"]

Note that there are 4 atoms in total in this system. This layer expects its input molecules to be batched together.

>>> total_n_atoms = 4

Let’s suppose that we have a featurizer that computes n_atom_feat features per atom.

>>> n_atom_feat = 75

Then conceptually, atom_feat is the array of shape (total_n_atoms, n_atom_feat) of atomic features. For simplicity, let’s just go with a random such matrix.

>>> atom_feat = np.random.rand(total_n_atoms, n_atom_feat)

Let’s suppose we have n_pair_feat pairwise features

>>> n_pair_feat = 14

For each molecule, we compute a matrix of shape (n_atoms*n_atoms, n_pair_feat) of pairwise features for each pair of atoms in the molecule. Let’s construct this conceptually for our example.

>>> pair_feat = [np.random.rand(3*3, n_pair_feat), np.random.rand(1*1, n_pair_feat)]
>>> pair_feat = np.concatenate(pair_feat, axis=0)
>>> pair_feat.shape
(10, 14)

pair_split is an index into pair_feat which tells us which atom each row belongs to. In our case, we have

>>> pair_split = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])

That is, the first 9 entries belong to “CCC” and the last entry to “C”. The final entry, atom_to_pair, is more detailed than pair_split: it tells us the precise pair of atoms each pair feature belongs to. In our case

>>> atom_to_pair = np.array([[0, 0],
...                          [0, 1],
...                          [0, 2],
...                          [1, 0],
...                          [1, 1],
...                          [1, 2],
...                          [2, 0],
...                          [2, 1],
...                          [2, 2],
...                          [3, 3]])

Let’s now define the actual layer

>>> layer = WeaveLayer()

And invoke it

>>> [A, P] = layer([atom_feat, pair_feat, pair_split, atom_to_pair])

The weave layer produces new atom/pair features. Let’s check their shapes

>>> A = np.array(A)
>>> A.shape
(4, 50)
>>> P = np.array(P)
>>> P.shape
(10, 50)

The 4 is total_n_atoms and the 10 is the total number of pairs. Where does 50 come from? It’s from the default arguments n_atom_output_feat and n_pair_output_feat.

References

1

Kearnes, Steven, et al. “Molecular graph convolutions: moving beyond fingerprints.” Journal of computer-aided molecular design 30.8 (2016): 595-608.

__init__(n_atom_input_feat: int = 75, n_pair_input_feat: int = 14, n_atom_output_feat: int = 50, n_pair_output_feat: int = 50, n_hidden_AA: int = 50, n_hidden_PA: int = 50, n_hidden_AP: int = 50, n_hidden_PP: int = 50, update_pair: bool = True, init: str = 'glorot_uniform', activation: str = 'relu', batch_normalize: bool = True, batch_normalize_kwargs: Dict = {'renorm': True}, **kwargs)[source]
Parameters
  • n_atom_input_feat (int, optional (default 75)) – Number of features for each atom in input.

  • n_pair_input_feat (int, optional (default 14)) – Number of features for each pair of atoms in input.

  • n_atom_output_feat (int, optional (default 50)) – Number of features for each atom in output.

  • n_pair_output_feat (int, optional (default 50)) – Number of features for each pair of atoms in output.

  • n_hidden_AA (int, optional (default 50)) – Number of units (convolution depths) in the corresponding hidden layer

  • n_hidden_PA (int, optional (default 50)) – Number of units (convolution depths) in the corresponding hidden layer

  • n_hidden_AP (int, optional (default 50)) – Number of units (convolution depths) in the corresponding hidden layer

  • n_hidden_PP (int, optional (default 50)) – Number of units (convolution depths) in the corresponding hidden layer

  • update_pair (bool, optional (default True)) – Whether to update pair features; this can be turned off for the last layer

  • init (str, optional (default 'glorot_uniform')) – Weight initialization for filters.

  • activation (str, optional (default 'relu')) – Activation function applied

  • batch_normalize (bool, optional (default True)) – If this is turned on, apply batch normalization before applying activation functions on convolutional layers.

  • batch_normalize_kwargs (Dict, optional (default {'renorm': True})) – Batch normalization is a complex layer which has many potential arguments which change behavior. This layer accepts user-defined parameters which are passed to all BatchNormalization layers in WeaveModel, WeaveLayer, and WeaveGather.

get_config()Dict[source]

Returns config dictionary for this layer.

build(input_shape)[source]

Construct internal trainable weights.

Parameters

input_shape (tuple) – Ignored since we don’t need the input shape to create internal weights.

call(inputs: List)List[source]

Creates weave tensors.

Parameters

inputs (List) – Should contain 4 tensors [atom_features, pair_features, pair_split, atom_to_pair]

class WeaveGather(*args, **kwargs)[source]

Implements the weave-gathering section of weave convolutions.

Implements the gathering layer from [1]_. The weave gathering layer gathers per-atom features to create a molecule-level fingerprint in a weave convolutional network. This layer can also perform Gaussian histogram expansion as detailed in [1]_. Note that the gathering function here is simply addition, as in [1]_.

Examples

This layer expects 2 inputs in a list of the form [atom_features, pair_features]. We’ll walk through the structure of these inputs. Let’s start with some basic definitions.

>>> import deepchem as dc
>>> import numpy as np

Suppose you have a batch of molecules

>>> smiles = ["CCC", "C"]

Note that there are 4 atoms in total in this system. This layer expects its input molecules to be batched together.

>>> total_n_atoms = 4

Let’s suppose that we have n_atom_feat features per atom.

>>> n_atom_feat = 75

Then conceptually, atom_feat is the array of shape (total_n_atoms, n_atom_feat) of atomic features. For simplicity, let’s just go with a random such matrix.

>>> atom_feat = np.random.rand(total_n_atoms, n_atom_feat)

We then need to provide a mapping of indices to the atoms they belong to. In our case this would be

>>> atom_split = np.array([0, 0, 0, 1])

Let’s now define the actual layer

>>> gather = WeaveGather(batch_size=2, n_input=n_atom_feat)
>>> output_molecules = gather([atom_feat, atom_split])
>>> len(output_molecules)
2

References

1

Kearnes, Steven, et al. “Molecular graph convolutions: moving beyond fingerprints.” Journal of computer-aided molecular design 30.8 (2016): 595-608.

Note

This class requires tensorflow_probability to be installed.

__init__(batch_size: int, n_input: int = 128, gaussian_expand: bool = True, compress_post_gaussian_expansion: bool = False, init: str = 'glorot_uniform', activation: str = 'tanh', **kwargs)[source]
Parameters
  • batch_size (int) – number of molecules in a batch

  • n_input (int, optional (default 128)) – number of features for each input molecule

  • gaussian_expand (boolean, optional (default True)) – Whether to expand each dimension of atomic features by gaussian histogram

  • compress_post_gaussian_expansion (bool, optional (default False)) – If True, compress the results of the Gaussian expansion back to the original dimensions of the input by using a linear layer with the specified activation function. Note that this compression was not in the original paper, but was present in the original DeepChem implementation, so it is kept for backwards compatibility.

  • init (str, optional (default 'glorot_uniform')) – Weight initialization for filters if compress_post_gaussian_expansion is True.

  • activation (str, optional (default 'tanh')) – Activation function applied for filters if compress_post_gaussian_expansion is True. Should be recognizable by tf.keras.activations.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs: List)List[source]

Creates weave tensors.

Parameters

inputs (List) – Should contain 2 tensors [atom_features, atom_split]

Returns

output_molecules – Each entry in this list is of shape (self.n_inputs,)

Return type

List

gaussian_histogram(x)[source]

Expands input into a set of gaussian histogram bins.

Parameters

x (tf.Tensor) – Of shape (N, n_feat)

Examples

This method uses 11 bins spanning portions of a Gaussian with zero mean and unit standard deviation.

>>> gaussian_memberships = [(-1.645, 0.283), (-1.080, 0.170),
...                         (-0.739, 0.134), (-0.468, 0.118),
...                         (-0.228, 0.114), (0., 0.114),
...                         (0.228, 0.114), (0.468, 0.118),
...                         (0.739, 0.134), (1.080, 0.170),
...                         (1.645, 0.283)]

We construct a Gaussian at gaussian_memberships[i][0] with standard deviation gaussian_memberships[i][1]. Each feature in x is assigned the probability of falling in each Gaussian, and probabilities are normalized across the 11 different Gaussians.

Returns

outputs – Of shape (N, 11*n_feat)

Return type

tf.Tensor
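To make the membership computation above concrete, here is an illustrative NumPy re-implementation for a single scalar feature (a sketch of the described math, not the library code):

>>> import numpy as np
>>> memberships = [(-1.645, 0.283), (-1.080, 0.170), (-0.739, 0.134), (-0.468, 0.118),
...                (-0.228, 0.114), (0., 0.114), (0.228, 0.114), (0.468, 0.118),
...                (0.739, 0.134), (1.080, 0.170), (1.645, 0.283)]
>>> x = 0.1  # a single feature value
>>> dens = np.array([np.exp(-(x - mean)**2 / (2 * std**2)) / (std * np.sqrt(2 * np.pi))
...                  for mean, std in memberships])  # density of x under each Gaussian
>>> probs = dens / dens.sum()  # normalized across the 11 Gaussians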

class DTNNEmbedding(*args, **kwargs)[source]
__init__(n_embedding=30, periodic_table_length=30, init='glorot_uniform', **kwargs)[source]
Parameters
  • n_embedding (int, optional) – Number of features for each atom

  • periodic_table_length (int, optional) – Length of the periodic-table embedding table (e.g. 83 covers elements up to Bi).

  • init (str, optional) – Weight initialization for filters.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

parent layers: atom_number
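A minimal usage sketch (assumed, not from the original documentation): pass a 1-D tensor of atomic numbers and receive one embedding vector per atom.

>>> import tensorflow as tf
>>> embed = DTNNEmbedding(n_embedding=30, periodic_table_length=83)
>>> atom_numbers = tf.constant([1, 6, 7, 8])  # H, C, N, O
>>> atom_embeddings = embed(atom_numbers)  # expected shape (4, 30)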

class DTNNStep(*args, **kwargs)[source]
__init__(n_embedding=30, n_distance=100, n_hidden=60, init='glorot_uniform', activation='tanh', **kwargs)[source]
Parameters
  • n_embedding (int, optional) – Number of features for each atom

  • n_distance (int, optional) – granularity of distance matrix

  • n_hidden (int, optional) – Number of nodes in hidden layer

  • init (str, optional) – Weight initialization for filters.

  • activation (str, optional) – Activation function applied

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

parent layers: atom_features, distance, distance_membership_i, distance_membership_j

class DTNNGather(*args, **kwargs)[source]
__init__(n_embedding=30, n_outputs=100, layer_sizes=[100], output_activation=True, init='glorot_uniform', activation='tanh', **kwargs)[source]
Parameters
  • n_embedding (int, optional) – Number of features for each atom

  • n_outputs (int, optional) – Number of features for each molecule(output)

  • layer_sizes (list of int, optional (default=[100])) – Structure of the hidden layer(s)

  • init (str, optional) – Weight initialization for filters.

  • activation (str, optional) – Activation function applied

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

parent layers: atom_features, atom_membership
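A minimal usage sketch (assumed, not from the original documentation): per-atom features are pooled into per-molecule features according to atom_membership.

>>> import numpy as np
>>> import tensorflow as tf
>>> n_atoms, n_embedding = 4, 30
>>> atom_features = np.random.rand(n_atoms, n_embedding).astype(np.float32)
>>> atom_membership = tf.constant([0, 0, 0, 1])  # three atoms in molecule 0, one in molecule 1
>>> gather = DTNNGather(n_embedding=n_embedding, n_outputs=10)
>>> molecule_features = gather([atom_features, atom_membership])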

class DAGLayer(*args, **kwargs)[source]

DAG computation layer.

This layer generates a directed acyclic graph for each atom in a molecule. This layer is based on the algorithm from the following paper:

Lusci, Alessandro, Gianluca Pollastri, and Pierre Baldi. “Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules.” Journal of chemical information and modeling 53.7 (2013): 1563-1575.

This layer performs a sort of inward sweep. Recall that for each atom, a DAG is generated that “points inward” to that atom from the undirected molecule graph. Picture this as “picking up” the atom as the vertex and using the natural tree structure that forms from gravity. The layer “sweeps inwards” from the leaf nodes of the DAG upwards to the atom. This is batched so the transformation is done for each atom.

__init__(n_graph_feat=30, n_atom_feat=75, max_atoms=50, layer_sizes=[100], init='glorot_uniform', activation='relu', dropout=None, batch_size=64, **kwargs)[source]
Parameters
  • n_graph_feat (int, optional) – Number of features for each node (and the whole graph).

  • n_atom_feat (int, optional) – Number of features listed per atom.

  • max_atoms (int, optional) – Maximum number of atoms in molecules.

  • layer_sizes (list of int, optional(default=[100])) – List of hidden layer size(s): length of this list represents the number of hidden layers, and each element is the width of corresponding hidden layer.

  • init (str, optional) – Weight initialization for filters.

  • activation (str, optional) – Activation function applied.

  • dropout (float, optional) – Dropout probability in hidden layer(s).

  • batch_size (int, optional) – number of molecules in a batch.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Construct internal trainable weights.

call(inputs, training=True)[source]

parent layers: atom_features, parents, calculation_orders, calculation_masks, n_atoms

class DAGGather(*args, **kwargs)[source]
__init__(n_graph_feat=30, n_outputs=30, max_atoms=50, layer_sizes=[100], init='glorot_uniform', activation='relu', dropout=None, **kwargs)[source]

DAG vector gathering layer

Parameters
  • n_graph_feat (int, optional) – Number of features for each atom.

  • n_outputs (int, optional) – Number of features for each molecule.

  • max_atoms (int, optional) – Maximum number of atoms in molecules.

  • layer_sizes (list of int, optional) – List of hidden layer size(s): length of this list represents the number of hidden layers, and each element is the width of corresponding hidden layer.

  • init (str, optional) – Weight initialization for filters.

  • activation (str, optional) – Activation function applied.

  • dropout (float, optional) – Dropout probability in the hidden layer(s).

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs, training=True)[source]

parent layers: atom_features, membership

class MessagePassing(*args, **kwargs)[source]

General class for MPNN default structures built according to https://arxiv.org/abs/1511.06391

__init__(T, message_fn='enn', update_fn='gru', n_hidden=100, **kwargs)[source]
Parameters
  • T (int) – Number of message passing steps

  • message_fn (str, optional) – message function in the model

  • update_fn (str, optional) – update function in the model

  • n_hidden (int, optional) – number of hidden units in the passing phase

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Perform T steps of message passing

class EdgeNetwork(*args, **kwargs)[source]

Submodule for Message Passing

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API. In the Keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides the compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules: - inputs must be explicitly passed. A layer cannot have zero

    arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved: - training: Boolean scalar tensor of Python boolean indicating

    whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class GatedRecurrentUnit(*args, **kwargs)[source]

Submodule for Message Passing

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API. In the Keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides the compute_mask() method to support masking.

Parameters
  • inputs

    Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules: - inputs must be explicitly passed. A layer cannot have zero

    arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs

    Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved: - training: Boolean scalar tensor of Python boolean indicating

    whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class SetGather(*args, **kwargs)[source]

Set2set gather layer for graph-based models.

Models using this layer must set pad_batches=True.

__init__(M, batch_size, n_hidden=100, init='orthogonal', **kwargs)[source]
Parameters
  • M (int) – Number of LSTM steps

  • batch_size (int) – Number of samples in a batch (all batches must have the same size)

  • n_hidden (int, optional) – number of hidden units in the passing phase

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call.

This is typically used to create the weights of Layer subclasses.

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Perform M steps of set2set gather.

See https://arxiv.org/abs/1511.06391 for a detailed description.

Torch Layers

class ScaleNorm(scale: float, eps: float = 1e-05)[source]

Apply Scale Normalization to input.

The ScaleNorm layer first computes the square root of the scale, then computes the matrix/vector norm of the input tensor. The norm value is calculated as sqrt(scale) / matrix norm. Finally, the result is returned as input_tensor * norm value.

This layer can be used instead of LayerNorm when a scaled version of the norm is required. Instead of performing the scaling operation (scale / norm) in a lambda-like layer, we are defining it within this layer to make prototyping more efficient.
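The computation described above can be sketched as follows (an illustrative re-implementation, not the library code; taking the norm along the last dimension and clamping by eps are assumptions):

>>> import torch
>>> def scale_norm_reference(x: torch.Tensor, scale: float, eps: float = 1e-5) -> torch.Tensor:
...     # sqrt(scale) divided by the norm of x along the last dimension, clamped by eps
...     norm = torch.sqrt(torch.tensor(scale)) / torch.norm(x, dim=-1, keepdim=True).clamp(min=eps)
...     return x * norm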

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import ScaleNorm
>>> scale = 0.35
>>> layer = ScaleNorm(scale)
>>> input_tensor = torch.tensor([[1.269, 39.36], [0.00918, -9.12]])
>>> output_tensor = layer(input_tensor)
__init__(scale: float, eps: float = 1e-05)[source]

Initialize a ScaleNorm layer.

Parameters
  • scale (float) – Scale magnitude.

  • eps (float) – Epsilon value. Default = 1e-5.

forward(x: torch.Tensor)torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class MATEncoderLayer(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, sa_hsize: int = 1024, sa_dropout_p: float = 0.0, output_bias: bool = True, d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, ff_dropout_p: float = 0.0, encoder_hsize: int = 1024, encoder_dropout_p: float = 0.0)[source]

Encoder layer for use in the Molecular Attention Transformer [1]_.

The MATEncoder layer primarily consists of a self-attention layer (MultiHeadedMATAttention) and a feed-forward layer (PositionwiseFeedForward). This layer can be stacked multiple times to form an encoder.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> from rdkit import Chem
>>> import torch
>>> import deepchem
>>> from deepchem.models.torch_models.layers import MATEmbedding, MATEncoderLayer
>>> input_smile = "CC"
>>> feat = deepchem.feat.MATFeaturizer()
>>> out = feat.featurize(input_smile)
>>> node = torch.tensor(out[0].node_features).float().unsqueeze(0)
>>> adj = torch.tensor(out[0].adjacency_matrix).float().unsqueeze(0)
>>> dist = torch.tensor(out[0].distance_matrix).float().unsqueeze(0)
>>> mask = torch.sum(torch.abs(node), dim=-1) != 0
>>> layer = MATEncoderLayer()
>>> op = MATEmbedding()(node)
>>> output = layer(op, mask, adj, dist)
__init__(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, sa_hsize: int = 1024, sa_dropout_p: float = 0.0, output_bias: bool = True, d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, ff_dropout_p: float = 0.0, encoder_hsize: int = 1024, encoder_dropout_p: float = 0.0)[source]

Initialize a MATEncoder layer.

Parameters
  • dist_kernel (str) – Kernel activation to be used. Can be either ‘softmax’ for softmax or ‘exp’ for exponential, for the self-attention layer.

  • lambda_attention (float) – Constant to be multiplied with the attention matrix in the self-attention layer.

  • lambda_distance (float) – Constant to be multiplied with the distance matrix in the self-attention layer.

  • h (int) – Number of attention heads for the self-attention layer.

  • sa_hsize (int) – Size of dense layer in the self-attention layer.

  • sa_dropout_p (float) – Dropout probability for the self-attention layer.

  • output_bias (bool) – If True, dense layers will use bias vectors in the self-attention layer.

  • d_input (int) – Size of input layer in the feed-forward layer.

  • d_hidden (int) – Size of hidden layer in the feed-forward layer.

  • d_output (int) – Size of output layer in the feed-forward layer.

  • activation (str) – Activation function to be used in the feed-forward layer. Can choose between ‘relu’ for ReLU, ‘leakyrelu’ for LeakyReLU, ‘prelu’ for PReLU, ‘tanh’ for TanH, ‘selu’ for SELU, ‘elu’ for ELU and ‘linear’ for linear activation.

  • n_layers (int) – Number of layers in the feed-forward layer.

  • ff_dropout_p (float) – Dropout probability in the feed-forward layer.

  • encoder_hsize (int) – Size of Dense layer for the encoder itself.

  • encoder_dropout_p (float) – Dropout probability for connections in the encoder layer.

forward(x: torch.Tensor, mask: torch.Tensor, adj_matrix: torch.Tensor, distance_matrix: torch.Tensor, sa_dropout_p: float = 0.0)torch.Tensor[source]

Output computation for the MATEncoder layer.

In the MATEncoderLayer initialization, self.sublayer is defined as an nn.ModuleList of 2 layers. We pass our computation through these layers sequentially. nn.ModuleList is subscriptable, so we can access it as self.sublayer[0], for example.

Parameters
  • x (torch.Tensor) – Input tensor.

  • mask (torch.Tensor) – Masks out padding values so that they are not taken into account when computing the attention score.

  • adj_matrix (torch.Tensor) – Adjacency matrix of a molecule.

  • distance_matrix (torch.Tensor) – Distance matrix of a molecule.

  • sa_dropout_p (float) – Dropout probability for the self-attention layer (MultiHeadedMATAttention).

class MultiHeadedMATAttention(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, hsize: int = 1024, dropout_p: float = 0.0, output_bias: bool = True)[source]

First constructs an attention layer tailored to the Molecular Attention Transformer [1]_ and then converts it into Multi-Headed Attention.

In multi-headed attention, the attention mechanism is applied multiple times in parallel through the multiple attention heads, so different subsequences of a given sequence can be processed differently. The query, key and value parameters are split multiple ways and each split is passed separately through a different attention head.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> from deepchem.models.torch_models.layers import MultiHeadedMATAttention, MATEmbedding
>>> import deepchem as dc
>>> import torch
>>> input_smile = "CC"
>>> feat = dc.feat.MATFeaturizer()
>>> out = feat.featurize(input_smile)
>>> node = torch.tensor(out[0].node_features).float().unsqueeze(0)
>>> adj = torch.tensor(out[0].adjacency_matrix).float().unsqueeze(0)
>>> dist = torch.tensor(out[0].distance_matrix).float().unsqueeze(0)
>>> mask = torch.sum(torch.abs(node), dim=-1) != 0
>>> layer = MultiHeadedMATAttention(
...    dist_kernel='softmax',
...    lambda_attention=0.33,
...    lambda_distance=0.33,
...    h=16,
...    hsize=1024,
...    dropout_p=0.0)
>>> op = MATEmbedding()(node)
>>> output = layer(op, op, op, mask, adj, dist)
__init__(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, hsize: int = 1024, dropout_p: float = 0.0, output_bias: bool = True)[source]

Initialize a multi-headed attention layer.

Parameters
  • dist_kernel (str) – Kernel activation to be used. Can be either ‘softmax’ for softmax or ‘exp’ for exponential.

  • lambda_attention (float) – Constant to be multiplied with the attention matrix.

  • lambda_distance (float) – Constant to be multiplied with the distance matrix.

  • h (int) – Number of attention heads.

  • hsize (int) – Size of dense layer.

  • dropout_p (float) – Dropout probability.

  • output_bias (bool) – If True, dense layers will use bias vectors.

forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, mask: torch.Tensor, adj_matrix: torch.Tensor, distance_matrix: torch.Tensor, dropout_p: float = 0.0, eps: float = 1e-06, inf: float = 1000000000000.0)torch.Tensor[source]

Output computation for the MultiHeadedAttention layer.

Parameters
  • query (torch.Tensor) – Standard query parameter for attention.

  • key (torch.Tensor) – Standard key parameter for attention.

  • value (torch.Tensor) – Standard value parameter for attention.

  • mask (torch.Tensor) – Masks out padding values so that they are not taken into account when computing the attention score.

  • adj_matrix (torch.Tensor) – Adjacency matrix of the input molecule, returned from dc.feat.MATFeaturizer().

  • dist_matrix (torch.Tensor) – Distance matrix of the input molecule, returned from dc.feat.MATFeaturizer().

  • dropout_p (float) – Dropout probability.

  • eps (float) – Epsilon value.

  • inf (float) – Value of infinity to be used.

class SublayerConnection(size: int, dropout_p: float = 0.0)[source]

SublayerConnection layer which establishes a residual connection, as used in the Molecular Attention Transformer [1]_.

The SublayerConnection layer is a residual layer which is then passed through Layer Normalization. The residual connection is established by computing the dropout-adjusted layer output of a normalized tensor and adding this to the original input tensor.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import SublayerConnection
>>> layer = SublayerConnection(2, 0.)
>>> input_ar = torch.tensor([[1., 2.], [5., 6.]])
>>> output = layer(input_ar, input_ar)
__init__(size: int, dropout_p: float = 0.0)[source]

Initialize a SublayerConnection Layer.

Parameters
  • size (int) – Size of layer.

  • dropout_p (float) – Dropout probability.

forward(x: torch.Tensor, output: torch.Tensor)torch.Tensor[source]

Output computation for the SublayerConnection layer.

Takes an input tensor x, then adds the dropout-adjusted sublayer output for normalized x to it. This is done to add a residual connection followed by LayerNorm.

Parameters
  • x (torch.Tensor) – Input tensor.

  • output (torch.Tensor) – Layer whose normalized output will be added to x.
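
A minimal sketch of the residual computation described above, assuming the dropout-adjusted LayerNorm of the sublayer output is added to x; the real class may differ in implementation details. Illustrative only, not the DeepChem implementation.

import torch
import torch.nn as nn

class SublayerConnectionSketch(nn.Module):
    """Illustrative residual + LayerNorm block, not the DeepChem class itself."""

    def __init__(self, size: int, dropout_p: float = 0.0):
        super().__init__()
        self.norm = nn.LayerNorm(size)
        self.dropout = nn.Dropout(dropout_p)

    def forward(self, x: torch.Tensor, output: torch.Tensor) -> torch.Tensor:
        # Residual connection: add the dropout-adjusted, normalized
        # sublayer output back onto the original input tensor.
        return x + self.dropout(self.norm(output))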

class PositionwiseFeedForward(d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, dropout_p: float = 0.0)[source]

PositionwiseFeedForward is a layer used to define the position-wise feed-forward (FFN) algorithm for the Molecular Attention Transformer [1]_.

Each layer in the MAT encoder contains a fully connected feed-forward network which applies two linear transformations and the given activation function. This is done in addition to the SublayerConnection module.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import PositionwiseFeedForward
>>> feed_fwd_layer = PositionwiseFeedForward(d_input = 2, d_hidden = 2, d_output = 2, activation = 'relu', n_layers = 1, dropout_p = 0.1)
>>> input_tensor = torch.tensor([[1., 2.], [5., 6.]])
>>> output_tensor = feed_fwd_layer(input_tensor)
__init__(d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, dropout_p: float = 0.0)[source]

Initialize a PositionwiseFeedForward layer.

Parameters
  • d_input (int) – Size of input layer.

  • d_hidden (int (same as d_input if d_output = 0)) – Size of hidden layer.

  • d_output (int (same as d_input if d_output = 0)) – Size of output layer.

  • activation (str) – Activation function to be used. Can choose between ‘relu’ for ReLU, ‘leakyrelu’ for LeakyReLU, ‘prelu’ for PReLU, ‘tanh’ for TanH, ‘selu’ for SELU, ‘elu’ for ELU and ‘linear’ for linear activation.

  • n_layers (int) – Number of layers.

  • dropout_p (float) – Dropout probability.

forward(x: torch.Tensor)torch.Tensor[source]

Output computation for the PositionwiseFeedForward layer.

Parameters

x (torch.Tensor) – Input tensor.
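
The single-layer (n_layers = 1) case can be sketched as two linear transformations with an activation and dropout in between. The snippet below is an illustrative sketch under that assumption, not the DeepChem implementation.

import torch
import torch.nn as nn

class PositionwiseFeedForwardSketch(nn.Module):
    """Illustrative position-wise feed-forward block, not the DeepChem class itself."""

    def __init__(self, d_input: int, d_hidden: int, d_output: int, dropout_p: float = 0.0):
        super().__init__()
        self.linear_1 = nn.Linear(d_input, d_hidden)
        self.linear_2 = nn.Linear(d_hidden, d_output)
        self.dropout = nn.Dropout(dropout_p)
        self.activation = nn.ReLU()  # stands in for the configurable activation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Applied independently at every position:
        # linear -> activation -> dropout -> linear.
        return self.linear_2(self.dropout(self.activation(self.linear_1(x))))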

class MATEmbedding(d_input: int = 36, d_output: int = 1024, dropout_p: float = 0.0)[source]

Embedding layer to create embeddings for inputs.

An embedding layer converts each input into a vector representation. In the MATEmbedding layer, the input tensor is passed through a dropout-adjusted linear layer and the resulting vector is returned.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import MATEmbedding
>>> layer = MATEmbedding(d_input = 3, d_output = 3, dropout_p = 0.2)
>>> input_tensor = torch.tensor([1., 2., 3.])
>>> output = layer(input_tensor)
__init__(d_input: int = 36, d_output: int = 1024, dropout_p: float = 0.0)[source]

Initialize a MATEmbedding layer.

Parameters
  • d_input (int) – Size of input layer.

  • d_output (int) – Size of output layer.

  • dropout_p (float) – Dropout probability for layer.

forward(x: torch.Tensor)torch.Tensor[source]

Computation for the MATEmbedding layer.

Parameters

x (torch.Tensor) – Input tensor to be converted into a vector.
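
A minimal sketch of the embedding step described above (a dropout-adjusted linear projection); illustrative only, not the DeepChem implementation.

import torch
import torch.nn as nn

class MATEmbeddingSketch(nn.Module):
    """Illustrative embedding step, not the DeepChem class itself."""

    def __init__(self, d_input: int = 36, d_output: int = 1024, dropout_p: float = 0.0):
        super().__init__()
        self.linear = nn.Linear(d_input, d_output)
        self.dropout = nn.Dropout(dropout_p)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Project raw per-atom features to the model dimension, then apply dropout.
        return self.dropout(self.linear(x))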

class MATGenerator(hsize: int = 1024, aggregation_type: str = 'mean', d_output: int = 1, n_layers: int = 1, dropout_p: float = 0.0, attn_hidden: int = 128, attn_out: int = 4)[source]

MATGenerator defines the linear and softmax generator step for the Molecular Attention Transformer [1]_.

The MATGenerator performs the final Linear + Softmax generation step. Depending on the aggregation type selected (‘grover’, ‘mean’ or ‘contextual’), the attention output layer performs different operations.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import MATGenerator
>>> layer = MATGenerator(hsize = 3, aggregation_type = 'mean', d_output = 1, n_layers = 1, dropout_p = 0.3, attn_hidden = 128, attn_out = 4)
>>> input_tensor = torch.tensor([1., 2., 3.])
>>> mask = torch.tensor([1., 1., 1.])
>>> output = layer(input_tensor, mask)
__init__(hsize: int = 1024, aggregation_type: str = 'mean', d_output: int = 1, n_layers: int = 1, dropout_p: float = 0.0, attn_hidden: int = 128, attn_out: int = 4)[source]

Initialize a MATGenerator.

Parameters
  • hsize (int) – Size of input layer.

  • aggregation_type (str) – Type of aggregation to be used. Can be ‘grover’, ‘mean’ or ‘contextual’.

  • d_output (int) – Size of output layer.

  • n_layers (int) – Number of layers in MATGenerator.

  • dropout_p (float) – Dropout probability for layer.

  • attn_hidden (int) – Size of hidden attention layer.

  • attn_out (int) – Size of output attention layer.

forward(x: torch.Tensor, mask: torch.Tensor)torch.Tensor[source]

Computation for the MATGenerator layer.

Parameters
  • x (torch.Tensor) – Input tensor.

  • mask (torch.Tensor) – Mask for padding so that padded values do not get included in attention score calculation.
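
As an illustration of the ‘mean’ aggregation path only, the sketch below masks out padded positions, averages the remaining node embeddings and applies a linear readout. It is a hypothetical simplification, not the DeepChem implementation, which also supports ‘grover’ and ‘contextual’ aggregation and stacked layers.

import torch
import torch.nn as nn

class MATGeneratorMeanSketch(nn.Module):
    """Illustrative 'mean' aggregation + linear readout, not the DeepChem class itself."""

    def __init__(self, hsize: int = 1024, d_output: int = 1):
        super().__init__()
        self.proj = nn.Linear(hsize, d_output)

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # x: (batch, nodes, hsize); mask: (batch, nodes) with 1 for real atoms.
        mask = mask.unsqueeze(-1).float()
        # Zero out padded positions, then average over the node dimension.
        avg = (x * mask).sum(dim=-2) / mask.sum(dim=-2).clamp(min=1.0)
        # Linear readout produces the final prediction.
        return self.proj(avg)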

cosine_dist(x, y)[source]

Computes the pairwise cosine similarity (inner product of l2-normalized vectors) between the rows of two tensors.

This assumes that the two input tensors contain rows of vectors where each column represents a different feature. The output tensor will have elements that represent the inner product between pairs of normalized vectors taken from the rows of x and y. The two tensors must have the same number of columns, because one cannot take the dot product between vectors of different lengths. For example, in sentence similarity and sentence classification tasks the number of columns is the embedding size, the rows of the input tensors are different test vectors or sentences, and the two tensors themselves can come from different batches. Vectors or tensors of all 0s should be avoided, since a zero vector cannot be meaningfully l2-normalized.

The vectors in the input tensors are first l2-normalized so that each vector has a length (magnitude) of 1. The inner product (dot product) is then taken between every pair of row vectors, one from x and one from y, and the resulting matrix of similarities is returned.
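
As a rough sketch of this computation (illustrative only; the DeepChem function may differ in details such as epsilon handling):

import tensorflow as tf

def cosine_dist_sketch(x, y):
    # L2-normalize each row so every vector has unit length.
    x_norm = tf.nn.l2_normalize(x, axis=1)
    y_norm = tf.nn.l2_normalize(y, axis=1)
    # All pairwise dot products between rows of x and rows of y: shape (n, m).
    return tf.matmul(x_norm, y_norm, transpose_b=True)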

Examples

The cosine similarity between two equivalent vectors will be 1. The cosine similarity between two equivalent tensors (tensors where all the elements are the same) will be a tensor of 1s. In this scenario, if the input tensors x and y are each of shape (n,p), where each element in x and y is the same, then the output tensor would be a tensor of shape (n,n) with 1 in every entry.

>>> import tensorflow as tf
>>> import deepchem.models.layers as layers
>>> x = tf.ones((6, 4), dtype=tf.dtypes.float32, name=None)
>>> y_same = tf.ones((6, 4), dtype=tf.dtypes.float32, name=None)
>>> cos_sim_same = layers.cosine_dist(x,y_same)

x and y_same are the same tensor (equivalent at every element, in this case 1). As such, the pairwise inner product of the rows in x and y_same will always be 1. The output tensor will be of shape (6, 6).

>>> diff = cos_sim_same - tf.ones((6, 6), dtype=tf.dtypes.float32, name=None)
>>> tf.reduce_sum(diff) == 0 # True
<tf.Tensor: shape=(), dtype=bool, numpy=True>
>>> cos_sim_same.shape
TensorShape([6, 6])

The cosine similarity between two orthogonal vectors will be 0 (by definition). If every row in x is orthogonal to every row in y, then the output will be a tensor of 0s. In the following example, each row in the tensor x1 is orthogonal to each row in x2 because they are halves of an identity matrix.

>>> identity_tensor = tf.eye(512, dtype=tf.dtypes.float32)
>>> x1 = identity_tensor[0:256,:]
>>> x2 = identity_tensor[256:512,:]
>>> cos_sim_orth = layers.cosine_dist(x1,x2)

Each row in x1 is orthogonal to each row in x2. As such, the pairwise inner product of the rows in x1 and x2 will always be 0. Furthermore, because the input tensors are both of shape (256, 512), the output tensor will be of shape (256, 256).

>>> tf.reduce_sum(cos_sim_orth) == 0 # True
<tf.Tensor: shape=(), dtype=bool, numpy=True>
>>> cos_sim_orth.shape
TensorShape([256, 256])
Parameters
  • x (tf.Tensor) – Input Tensor of shape (n, p). The shape of this input tensor should be n rows by p columns. Note that n need not equal m (the number of rows in y).

  • y (tf.Tensor) – Input Tensor of shape (m, p) The shape of this input tensor should be m rows by p columns. Note that m need not equal n (the number of rows in x).

Returns

Returns a tensor of shape (n, m), that is, n rows by m columns. Each i,j-th entry of this output tensor is the inner product between the l2-normalized i-th row of the input tensor x and the l2-normalized j-th row of the input tensor y.

Return type

tf.Tensor

Jax Layers