# Layers

Deep learning models are often described as being made up of “layers”. Intuitively, a layer is a function that transforms one tensor into another. DeepChem maintains an extensive collection of layers that perform useful scientific transformations. For now, most layers are Keras-only, but we expect support to expand to other types of models and layers over time.

## Keras Layers

class InteratomicL2Distances(*args, **kwargs)[source]

Compute (squared) L2 Distances between atoms given neighbors.

This class computes pairwise distances between its inputs.

Examples

>>> import numpy as np
>>> import deepchem as dc
>>> atoms = 5
>>> neighbors = 2
>>> coords = np.random.rand(atoms, 3)
>>> neighbor_list = np.random.randint(0, atoms, size=(atoms, neighbors))
>>> layer = dc.models.layers.InteratomicL2Distances(atoms, neighbors, 3)
>>> result = np.array(layer([coords, neighbor_list]))
>>> result.shape
(5, 2)

__init__(N_atoms: int, M_nbrs: int, ndim: int, **kwargs)[source]

Constructor for this layer.

Parameters
• N_atoms (int) – Number of atoms in the system total.

• M_nbrs (int) – Number of neighbors to consider when computing distances.

• ndim (int) – Number of descriptors for each atom.

get_config() Dict[source]

Returns config dictionary for this layer.

call(inputs: List)[source]

Invokes this layer.

Parameters

inputs (list) – Should be of form inputs=[coords, nbr_list] where coords is a tensor of shape (None, N, 3) and nbr_list is a list.

Return type

Tensor of shape (N_atoms, M_nbrs) with interatomic distances.
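To make the computed quantity concrete, here is a minimal NumPy sketch of squared L2 distances between each atom and its listed neighbors. It is an illustration only, not DeepChem's implementation: the helper name `squared_neighbor_distances` is hypothetical, and the sketch assumes `coords` has shape (N_atoms, ndim) and `nbr_list` holds integer atom indices with shape (N_atoms, M_nbrs).

```python
import numpy as np


def squared_neighbor_distances(coords, nbr_list):
    """Squared L2 distance from each atom to each of its listed neighbors.

    Hypothetical helper for illustration only; not part of DeepChem.
    """
    coords = np.asarray(coords, dtype=float)
    nbr_list = np.asarray(nbr_list, dtype=int)
    # Gather neighbor coordinates: shape (N_atoms, M_nbrs, ndim).
    nbr_coords = coords[nbr_list]
    # Broadcast each atom's own coordinates against its neighbors.
    diff = nbr_coords - coords[:, np.newaxis, :]
    # Sum squared differences over the coordinate axis: (N_atoms, M_nbrs).
    return np.sum(diff ** 2, axis=-1)


coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
nbr_list = np.array([[1, 2], [0, 2], [0, 1]])
distances = squared_neighbor_distances(coords, nbr_list)
print(distances.shape)  # (3, 2)
```

The output has one row per atom and one column per neighbor slot, matching the (N_atoms, M_nbrs) return shape documented above.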

class GraphConv(*args, **kwargs)[source]

Graph Convolutional Layers

This layer implements the graph convolution introduced in [1]. The graph convolution combines per-node feature vectors in a nonlinear fashion with the feature vectors of neighboring nodes. This “blends” information in local neighborhoods of a graph.

References

1

Duvenaud, David K., et al. “Convolutional networks on graphs for learning molecular fingerprints.” Advances in neural information processing systems. 2015. https://arxiv.org/abs/1509.09292
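The "blending" of a node's features with those of its neighbors can be sketched in NumPy as follows. This is a deliberately simplified illustration, not the GraphConv implementation: it assumes a single shared pair of weight matrices (`w_self`, `w_nbr`) and ignores any degree-specific handling implied by the `min_deg` and `max_deg` parameters. All names in the block are hypothetical.

```python
import numpy as np


def simple_graph_conv(node_feats, adj_list, w_self, w_nbr, bias):
    """One simplified graph-convolution pass (illustrative, not DeepChem code)."""
    n_nodes, n_feat = node_feats.shape
    out = np.zeros((n_nodes, w_self.shape[1]))
    for i, nbrs in enumerate(adj_list):
        # Sum the feature vectors of the neighboring nodes.
        nbr_sum = np.zeros(n_feat)
        for j in nbrs:
            nbr_sum += node_feats[j]
        # Combine the node's own features with the neighbor summary.
        out[i] = node_feats[i] @ w_self + nbr_sum @ w_nbr + bias
    # ReLU supplies the nonlinearity.
    return np.maximum(out, 0.0)


node_feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
adj_list = [[1], [0, 2], [1]]  # path graph 0 - 1 - 2
blended = simple_graph_conv(
    node_feats, adj_list, np.eye(2), np.eye(2), np.zeros(2)
)
```

With identity weights, each output row is simply the node's own features plus the sum of its neighbors' features, which makes the local mixing easy to verify by hand.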

__init__(out_channel: int, min_deg: int = 0, max_deg: int = 10, activation_fn: Optional[Callable] = None, **kwargs)[source]

Initialize a graph convolutional layer.

Parameters
• out_channel (int) – The number of output channels per graph node.

• min_deg (int, optional (default 0)) – The minimum allowed degree for each graph node.

• max_deg (int, optional (default 10)) – The maximum allowed degree for each graph node. Note that this is set to 10 to handle complex molecules (some organometallic compounds have strange structures). If you’re using this for non-molecular applications, you may need to set this much higher depending on your dataset.

• activation_fn (function) – A nonlinear activation function to apply. If you’re not sure, tf.nn.relu is probably a good default for your application.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.
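The advice above about copying the returned dict can be illustrated with a small stand-in class. `CachedConfigLayer` is hypothetical and exists only to mimic a layer whose `get_config()` hands back the same dict object on every call; it is not a DeepChem or Keras class.

```python
import copy


class CachedConfigLayer:
    """Hypothetical stand-in that returns the same config dict on every call."""

    def __init__(self, out_channel):
        self._config = {"out_channel": out_channel, "degrees": [0, 10]}

    def get_config(self):
        # No fresh copy is made here, mirroring the caveat in the text.
        return self._config


layer = CachedConfigLayer(64)

# Deep-copy before modifying so the layer's own config stays untouched.
config = copy.deepcopy(layer.get_config())
config["out_channel"] = 128
config["degrees"].append(11)
```

A deep copy is used because a shallow copy would still share nested values such as the `degrees` list.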

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs – Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

  - inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

  - NumPy array or Python scalar values in inputs get cast as tensors.

  - Layers are built (build(input_shape) method) using shape info from inputs only.

  - input_spec compatibility is only checked against inputs.

  - Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

  - The SavedModel input specification is generated using inputs only.

  - Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc. is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs – Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

  - training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

  - mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

Store the summed atoms by degree

class GraphPool(*args, **kwargs)[source]

A GraphPool gathers data from local neighborhoods of a graph.

This layer performs max-pooling over the feature vectors of atoms in a neighborhood. You can think of it as analogous to a max-pooling layer for 2D convolutions, but operating on graphs instead. This technique is described in [1].

References

1

Duvenaud, David K., et al. “Convolutional networks on graphs for learning molecular fingerprints.” Advances in neural information processing systems. 2015. https://arxiv.org/abs/1509.09292
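Max-pooling over a graph neighborhood can be sketched in NumPy as shown below. This is an illustration rather than the GraphPool implementation; it assumes that each neighborhood consists of the node itself plus its adjacent nodes, and the function name `graph_max_pool` is hypothetical.

```python
import numpy as np


def graph_max_pool(node_feats, adj_list):
    """Elementwise max over each node's neighborhood (illustrative sketch)."""
    pooled = np.empty_like(node_feats)
    for i, nbrs in enumerate(adj_list):
        # Assumption: the neighborhood includes the node itself.
        neighborhood = node_feats[[i] + list(nbrs)]
        pooled[i] = neighborhood.max(axis=0)
    return pooled


node_feats = np.array([[1.0, 5.0], [3.0, 2.0], [0.0, 4.0]])
adj_list = [[1], [0, 2], [1]]  # path graph 0 - 1 - 2
pooled = graph_max_pool(node_feats, adj_list)
```

As with 2D max-pooling, the output keeps the number of features per node and replaces each feature with the largest value seen in the local window.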

__init__(min_degree=0, max_degree=10, **kwargs)[source]

Initialize this layer

Parameters
• min_degree (int, optional (default 0)) – The minimum allowed degree for each graph node.

• max_degree (int, optional (default 10)) – The maximum allowed degree for each graph node. Note that this is set to 10 to handle complex molecules (some organometallic compounds have strange structures). If you’re using this for non-molecular applications, you may need to set this much higher depending on your dataset.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs – Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

  - inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

  - NumPy array or Python scalar values in inputs get cast as tensors.

  - Layers are built (build(input_shape) method) using shape info from inputs only.

  - input_spec compatibility is only checked against inputs.

  - Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

  - The SavedModel input specification is generated using inputs only.

  - Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc. is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs – Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

  - training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

  - mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class GraphGather(*args, **kwargs)[source]

A GraphGather layer pools node-level feature vectors to create a graph feature vector.

Many graph convolutional networks manipulate feature vectors per graph-node. For a molecule for example, each node might represent an atom, and the network would manipulate atomic feature vectors that summarize the local chemistry of the atom. However, at the end of the application, we will likely want to work with a molecule level feature representation. The GraphGather layer creates a graph level feature vector by combining all the node-level feature vectors.

One subtlety about this layer is that it depends on the batch_size. This is done for internal implementation reasons. The GraphConv and GraphPool layers pool all nodes from all graphs in the batch being processed. The GraphGather layer reassembles these jumbled node feature vectors into per-graph feature vectors.

References

1

Duvenaud, David K., et al. “Convolutional networks on graphs for learning molecular fingerprints.” Advances in neural information processing systems. 2015. https://arxiv.org/abs/1509.09292
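The reassembly of pooled node vectors into per-graph vectors can be sketched with a membership array that records which graph each node belongs to. This is a hedged illustration: reducing with a sum is an assumption made for the sketch, the real layer may combine nodes differently, and the name `gather_by_graph` is hypothetical.

```python
import numpy as np


def gather_by_graph(node_feats, membership, batch_size):
    """Sum node feature vectors per graph (illustrative sketch only)."""
    graph_feats = np.zeros((batch_size, node_feats.shape[1]))
    # Unbuffered scatter-add: row i of node_feats is added to the row
    # of graph_feats selected by membership[i].
    np.add.at(graph_feats, membership, node_feats)
    return graph_feats


# Three nodes from two graphs, stored together in one flat array.
node_feats = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
membership = np.array([0, 0, 1])
graph_feats = gather_by_graph(node_feats, membership, batch_size=2)
```

The sketch also shows one reason a batch size is needed up front: it fixes the number of rows in the output before any node is assigned to a graph.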

__init__(batch_size, activation_fn=None, **kwargs)[source]

Initialize this layer.

Parameters
• batch_size (int) – The batch size for this layer. Note that the layer’s behavior changes depending on the batch size.

• activation_fn (function) – A nonlinear activation function to apply. If you’re not sure, tf.nn.relu is probably a good default for your application.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

Invoking this layer.

Parameters

inputs (list) – This list should consist of inputs = [atom_features, deg_slice, membership, deg_adj_list placeholders…]. These are all tensors that are created/processed by GraphConv and GraphPool.

class MolGANConvolutionLayer(*args, **kwargs)[source]

Graph convolution layer used in the MolGAN model. MolGAN is a WGAN-type model for generating small molecules. This layer is not used directly; higher-level layers such as MolGANMultiConvolutionLayer use it. It performs a basic convolution on one-hot encoded matrices containing atom and bond information. The layer also accepts three inputs, for the case where convolution is performed more than once and the results of the previous convolution need to be used. This design avoids creating a separate layer that accepts three inputs rather than two. The last input is the so-called hidden_layer, which holds the results of the previous convolution, while the first two are the unchanged input tensors.

Example

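See MolGANMultiConvolutionLayer for an example of this layer being used inside another layer.

>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input
>>> from deepchem.models.layers import MolGANConvolutionLayer
>>> vertices = 9
>>> nodes = 5
>>> edges = 5
>>> units = 128

>>> layer1 = MolGANConvolutionLayer(units=units,edges=edges, name='layer1')
>>> layer2 = MolGANConvolutionLayer(units=units,edges=edges, name='layer2')
>>> # Assumed shape: one (vertices x vertices) adjacency slice per edge type.
>>> adjacency_tensor = Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices,nodes))
>>> hidden1 = layer1([adjacency_tensor, node_tensor])
>>> output = layer2(hidden1)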


References

1

Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: int, activation: typing.Callable = <function tanh>, dropout_rate: float = 0.0, edges: int = 5, name: str = '', **kwargs)[source]

Initialize this layer.

Parameters
• units (int) – Dimension of dense layers used for convolution

• activation (function, optional (default=Tanh)) – activation function used across model, default is Tanh

• dropout_rate (float, optional (default=0.0)) – Dropout rate used by dropout layer

• edges (int, optional (default=5)) – How many dense layers to use in convolution. Typically equal to number of bond types used in the model.

• name (string, optional (default="")) – Name of the layer

call(inputs, training=False)[source]

Invoke this layer

Parameters
• inputs (list) – List of two input matrices, adjacency tensor and node features tensors in one-hot encoding format.

• training (bool) – Should this layer be run in training mode. Typically decided by main model, influences things like dropout.

Returns

The first and second elements are the original input tensors; the third is the result of the convolution.

Return type

tuple(tf.Tensor,tf.Tensor,tf.Tensor)
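The one-hot input format mentioned above can be sketched with NumPy. This block only builds example arrays and does not call any DeepChem layer. The layout (vertices, vertices, edges) for the adjacency tensor, the use of channel 0 to mean "no bond", and the toy atom and bond labels are all assumptions made for illustration.

```python
import numpy as np

vertices = 4  # atoms in the toy graph
nodes = 3     # number of atom types (assumed)
edges = 3     # number of bond types, with 0 meaning "no bond" (assumed)

# Integer labels for a small chain of four atoms.
atom_types = np.array([0, 1, 1, 2])
bond_types = np.array([
    [0, 1, 0, 0],
    [1, 0, 2, 0],
    [0, 2, 0, 1],
    [0, 0, 1, 0],
])

# Indexing an identity matrix with integer labels yields one-hot vectors.
node_tensor = np.eye(nodes)[atom_types]       # (vertices, nodes)
adjacency_tensor = np.eye(edges)[bond_types]  # (vertices, vertices, edges)

# Add a leading batch axis and pair the tensors in the documented order.
inputs = [adjacency_tensor[np.newaxis], node_tensor[np.newaxis]]
```

Each position in either tensor has exactly one active channel, which is what "one-hot encoding format" means here.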

get_config() Dict[source]

Returns config dictionary for this layer.

class MolGANAggregationLayer(*args, **kwargs)[source]

Graph Aggregation layer used in MolGAN model. MolGAN is a WGAN type model for generation of small molecules. Performs aggregation on tensor resulting from convolution layers. Given its simple nature it might be removed in future and moved to MolGANEncoderLayer.

Example

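>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input
>>> from deepchem.models.layers import MolGANAggregationLayer, MolGANConvolutionLayer
>>> vertices = 9
>>> nodes = 5
>>> edges = 5
>>> units = 128

>>> layer_1 = MolGANConvolutionLayer(units=units,edges=edges, name='layer1')
>>> layer_2 = MolGANConvolutionLayer(units=units,edges=edges, name='layer2')
>>> layer_3 = MolGANAggregationLayer(units=128, name='layer3')
>>> # Assumed shape: one (vertices x vertices) adjacency slice per edge type.
>>> adjacency_tensor = Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices,nodes))
>>> hidden_1 = layer_1([adjacency_tensor, node_tensor])
>>> hidden_2 = layer_2(hidden_1)
>>> output = layer_3(hidden_2[2])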


References

1

Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: int = 128, activation: typing.Callable = <function tanh>, dropout_rate: float = 0.0, name: str = '', **kwargs)[source]

Initialize the layer

Parameters
• units (int, optional (default=128)) – Dimension of dense layers used for aggregation

• activation (function, optional (default=Tanh)) – activation function used across model, default is Tanh

• dropout_rate (float, optional (default=0.0)) – Used by dropout layer

• name (string, optional (default="")) – Name of the layer

call(inputs, training=False)[source]

Invoke this layer

Parameters
• inputs (List) – Single tensor resulting from graph convolution layer

• training (bool) – Should this layer be run in training mode. Typically decided by main model, influences things like dropout.

Returns

aggregation tensor – Result of aggregation function on input convolution tensor.

Return type

tf.Tensor

get_config() Dict[source]

Returns config dictionary for this layer.

class MolGANMultiConvolutionLayer(*args, **kwargs)[source]

Multiple pass convolution layer used in MolGAN model. MolGAN is a WGAN type model for generation of small molecules. It takes outputs of previous convolution layer and uses them as inputs for the next one. It simplifies the overall framework, but might be moved to MolGANEncoderLayer in the future in order to reduce number of layers.

Example

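>>> from tensorflow.keras import Model
>>> from tensorflow.keras.layers import Input
>>> from deepchem.models.layers import MolGANAggregationLayer, MolGANMultiConvolutionLayer
>>> vertices = 9
>>> nodes = 5
>>> edges = 5
>>> units = 128

>>> layer_1 = MolGANMultiConvolutionLayer(units=(128,64), name='layer1')
>>> layer_2 = MolGANAggregationLayer(units=128, name='layer2')
>>> # Assumed shape: one (vertices x vertices) adjacency slice per edge type.
>>> adjacency_tensor = Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices,nodes))
>>> hidden = layer_1([adjacency_tensor, node_tensor])
>>> output = layer_2(hidden)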


References

1

Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: typing.Tuple = (128, 64), activation: typing.Callable = <function tanh>, dropout_rate: float = 0.0, edges: int = 5, name: str = '', **kwargs)[source]

Initialize the layer

Parameters
• units (Tuple, optional (default=(128,64)), min_length=2) – List of dimensions used by consecutive convolution layers. The more values the more convolution layers invoked.

• activation (function, optional (default=tanh)) – activation function used across model, default is Tanh

• dropout_rate (float, optional (default=0.0)) – Used by dropout layer

• edges (int, optional (default=5)) – Controls how many dense layers to use for a single convolution unit. Typically matches the number of bond types used in the molecule.

• name (string, optional (default="")) – Name of the layer

call(inputs, training=False)[source]

Invoke this layer

Parameters
• inputs (list) – List of two input matrices, adjacency tensor and node features tensors in one-hot encoding format.

• training (bool) – Should this layer be run in training mode. Typically decided by main model, influences things like dropout.

Returns

convolution tensor – Result of input tensors going through convolution a number of times.

Return type

tf.Tensor

get_config() Dict[source]

Returns config dictionary for this layer.

class MolGANEncoderLayer(*args, **kwargs)[source]

Main learning layer used by the MolGAN model. MolGAN is a WGAN-type model for generating small molecules. Its role is to further simplify the model. This layer can also be built manually by stacking graph convolution layers followed by graph aggregation.

Example

>>> from tensorflow.keras import Model
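>>> from tensorflow.keras.layers import Input, Dropout,Dense
>>> from deepchem.models.layers import MolGANEncoderLayer
>>> vertices = 9
>>> edges = 5
>>> nodes = 5
>>> dropout_rate = .0
>>> # Assumed shape: one (vertices x vertices) adjacency slice per edge type.
>>> adjacency_tensor = Input(shape=(vertices, vertices, edges))
>>> node_tensor = Input(shape=(vertices, nodes))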

>>> graph = MolGANEncoderLayer(units = [(128,64),128], dropout_rate= dropout_rate, edges=edges)([adjacency_tensor,node_tensor])
>>> dense = Dense(units=128, activation='tanh')(graph)
>>> dense = Dropout(dropout_rate)(dense)
>>> dense = Dense(units=64, activation='tanh')(dense)
>>> dense = Dropout(dropout_rate)(dense)
>>> output = Dense(units=1)(dense)

>>> model = Model(inputs=[adjacency_tensor,node_tensor], outputs=[output])


References

1

Nicola De Cao et al. “MolGAN: An implicit generative model for small molecular graphs”, https://arxiv.org/abs/1805.11973

__init__(units: typing.List = [(128, 64), 128], activation: typing.Callable = <function tanh>, dropout_rate: float = 0.0, edges: int = 5, name: str = '', **kwargs)[source]

Initialize the layer.

Parameters
• units (List, optional (default=[(128, 64), 128])) – List of units for MolGANMultiConvolutionLayer and MolGANAggregationLayer, e.g. [(128,64),128] means two convolution layers with dims [128, 64] followed by an aggregation layer with dims 128.

• activation (function, optional (default=Tanh)) – activation function used across model, default is Tanh

• dropout_rate (float, optional (default=0.0)) – Used by dropout layer

• edges (int, optional (default=5)) – Controls how many dense layers to use for a single convolution unit. Typically matches the number of bond types used in the molecule.

• name (string, optional (default="")) – Name of the layer
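The structure of the `units` argument described above can be unpacked with a few lines of plain Python. The helper `describe_units` is hypothetical and only restates the documented convention: a tuple of convolution sizes followed by a single aggregation size. The "per node" and "per graph" wording reflects the assumption that aggregation pools node features into one vector per graph.

```python
def describe_units(units):
    """Spell out what a MolGAN-style units spec means (illustrative helper)."""
    conv_units, aggregation_units = units
    # One convolution pass per entry in the first element.
    steps = [f"convolution -> {u} features per node" for u in conv_units]
    # A single aggregation step follows the convolutions.
    steps.append(f"aggregation -> {aggregation_units} features per graph")
    return steps


steps = describe_units([(128, 64), 128])
for step in steps:
    print(step)
```

Adding more values to the inner tuple would add more convolution passes before the aggregation step.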

call(inputs, training=False)[source]

Invoke this layer

Parameters
• inputs (list) – List of two input matrices, adjacency tensor and node features tensors in one-hot encoding format.

• training (bool) – Should this layer be run in training mode. Typically decided by main model, influences things like dropout.

Returns

encoder tensor – Tensor that has been through a number of convolutions followed by aggregation.

Return type

tf.Tensor

get_config() Dict[source]

Returns config dictionary for this layer.

class LSTMStep(*args, **kwargs)[source]

Layer that performs a single step LSTM update.

This layer performs a single step LSTM update. Note that it is not a full LSTM recurrent network. The LSTMStep layer is useful as a primitive for designing layers such as the AttnLSTMEmbedding or the IterRefLSTMEmbedding below.

__init__(output_dim, input_dim, init_fn='glorot_uniform', inner_init_fn='orthogonal', activation_fn='tanh', inner_activation_fn='hard_sigmoid', **kwargs)[source]
Parameters
• output_dim (int) – Dimensionality of output vectors.

• input_dim (int) – Dimensionality of input vectors.

• init_fn (str) – TensorFlow initialization to use for W.

• inner_init_fn (str) – TensorFlow initialization to use for U.

• activation_fn (str) – TensorFlow activation to use for output.

• inner_activation_fn (str) – TensorFlow activation to use for inner steps.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Constructs learnable weights for this layer.

call(inputs)[source]

Execute this layer on input tensors.

Parameters

inputs (list) – List of three tensors (x, h_tm1, c_tm1). h_tm1 means “h, t-1”.

Returns

Returns h, [h, c]

Return type

list
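For readers unfamiliar with what a single LSTM update involves, the following NumPy sketch walks through one step with the same input and output structure as described above: it takes (x, h_tm1, c_tm1) and returns h, [h, c]. It is not the LSTMStep implementation. The gate ordering inside the fused weight matrices, the piecewise-linear form of `hard_sigmoid`, and all function and variable names are assumptions made for illustration.

```python
import numpy as np


def hard_sigmoid(z):
    # Piecewise-linear approximation of the sigmoid (assumed form).
    return np.clip(0.2 * z + 0.5, 0.0, 1.0)


def lstm_step(x, h_tm1, c_tm1, W, U, b):
    """One LSTM update; returns h, [h, c] like the call() described above."""
    # Fused pre-activations for the four gates: (batch, 4 * output_dim).
    z = x @ W + h_tm1 @ U + b
    z_i, z_f, z_c, z_o = np.split(z, 4, axis=-1)
    i = hard_sigmoid(z_i)                # input gate
    f = hard_sigmoid(z_f)                # forget gate
    c = f * c_tm1 + i * np.tanh(z_c)     # new cell state
    o = hard_sigmoid(z_o)                # output gate
    h = o * np.tanh(c)                   # new hidden state
    return h, [h, c]


input_dim, output_dim = 3, 2
rng = np.random.default_rng(0)
W = rng.normal(size=(input_dim, 4 * output_dim))   # input weights
U = rng.normal(size=(output_dim, 4 * output_dim))  # recurrent weights
b = np.zeros(4 * output_dim)

x = rng.normal(size=(1, input_dim))
h_tm1 = np.zeros((1, output_dim))
c_tm1 = np.zeros((1, output_dim))
h, (h_again, c) = lstm_step(x, h_tm1, c_tm1, W, U, b)
```

A full recurrent network would apply this step repeatedly along a sequence, feeding the returned h and c back in as h_tm1 and c_tm1.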

class AttnLSTMEmbedding(*args, **kwargs)[source]

Implements AttnLSTM as in matching networks paper.

The AttnLSTM embedding adjusts two sets of vectors, the “test” and “support” sets. The “support” consists of a set of evidence vectors. Think of these as the small training set for low-data machine learning. The “test” consists of the queries we wish to answer with the small amounts of available data. The AttnLSTMEmbedding allows us to modify the embedding of the “test” set depending on the contents of the “support”. The AttnLSTMEmbedding is thus a type of learnable metric that allows a network to modify its internal notion of distance.

See references [1] and [2] for more details.

References

1

Vinyals, Oriol, et al. “Matching networks for one shot learning.” Advances in neural information processing systems. 2016.

2

Vinyals, Oriol, Samy Bengio, and Manjunath Kudlur. “Order matters: Sequence to sequence for sets.” arXiv preprint arXiv:1511.06391 (2015).
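The idea that a test embedding can depend on the contents of the support set can be sketched with a single attention readout in NumPy. This is a simplified illustration only: it omits the LSTM update entirely, and the choice of cosine similarity followed by a softmax, as well as the name `attention_readout`, are assumptions rather than a description of the AttnLSTMEmbedding code.

```python
import numpy as np


def attention_readout(test, support):
    """Attention-weighted summary of the support set for each test vector."""
    # Cosine similarity between every test vector and every support vector.
    test_unit = test / np.linalg.norm(test, axis=1, keepdims=True)
    support_unit = support / np.linalg.norm(support, axis=1, keepdims=True)
    similarity = test_unit @ support_unit.T  # (n_test, n_support)
    # Softmax over the support axis turns similarities into weights.
    weights = np.exp(similarity - similarity.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each test vector reads out a weighted average of the support vectors.
    return weights @ support  # (n_test, n_feat)


test = np.array([[1.0, 0.0]])
support = np.array([[1.0, 0.0], [0.0, 1.0]])
readout = attention_readout(test, support)
```

The readout leans toward the support vectors most similar to the query, which is the sense in which the embedding acts as a learnable notion of distance.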

__init__(n_test, n_support, n_feat, max_depth, **kwargs)[source]
Parameters
• n_support (int) – Size of support set.

• n_test (int) – Size of test set.

• n_feat (int) – Number of features per atom

• max_depth (int) – Number of “processing steps” used by sequence-to-sequence for sets model.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Execute this layer on input tensors.

Parameters

inputs (list) – List of two tensors (X, Xp). X should be of shape (n_test, n_feat) and Xp should be of shape (n_support, n_feat) where n_test is the size of the test set, n_support that of the support set, and n_feat is the number of per-atom features.

Returns

Returns two tensors of same shape as input. Namely the output shape will be [(n_test, n_feat), (n_support, n_feat)]

Return type

list

class IterRefLSTMEmbedding(*args, **kwargs)[source]

Implements the Iterative Refinement LSTM.

Much like AttnLSTMEmbedding, the IterRefLSTMEmbedding is another type of learnable metric which adjusts “test” and “support.” Recall that “support” is the small amount of data available in a low-data machine learning problem, and that “test” is the query. The AttnLSTMEmbedding only modifies the “test” based on the contents of the support. However, the IterRefLSTM modifies both the “support” and “test” based on each other. This allows the learnable metric to be more malleable than that from AttnLSTMEmbedding.

__init__(n_test, n_support, n_feat, max_depth, **kwargs)[source]

Unlike the AttnLSTM model which only modifies the test vectors additively, this model allows for an additive update to be performed to both test and support using information from each other.

Parameters
• n_support (int) – Size of support set.

• n_test (int) – Size of test set.

• n_feat (int) – Number of input atom features

• max_depth (int) – Number of LSTM Embedding layers.
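The mutual additive update described above, where test and support are each adjusted using information from the other, can be sketched in NumPy. This is a rough illustration under strong assumptions: it drops the LSTM components, uses plain dot-product attention, and treats the number of refinement steps as a stand-in for `max_depth`. The names `softmax_rows` and `mutual_refine` are hypothetical.

```python
import numpy as np


def softmax_rows(scores):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)


def mutual_refine(test, support, n_steps):
    """Additively update test and support from each other (sketch only)."""
    for _ in range(n_steps):
        # Each test vector reads a weighted average of the support set.
        test_read = softmax_rows(test @ support.T) @ support
        # Each support vector reads a weighted average of the test set.
        support_read = softmax_rows(support @ test.T) @ test
        # Both sets receive an additive update in the same step.
        test, support = test + test_read, support + support_read
    return test, support


test = np.array([[1.0, 0.0]])
support = np.array([[0.0, 1.0]])
new_test, new_support = mutual_refine(test, support, n_steps=1)
```

Both outputs keep the shapes of their inputs, matching the documented return shapes of (n_test, n_feat) and (n_support, n_feat).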

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Execute this layer on input tensors.

Parameters

inputs (list) – List of two tensors (X, Xp). X should be of shape (n_test, n_feat) and Xp should be of shape (n_support, n_feat) where n_test is the size of the test set, n_support that of the support set, and n_feat is the number of per-atom features.

Returns

Returns two tensors of same shape as input. Namely the output shape will be [(n_test, n_feat), (n_support, n_feat)]

class SwitchedDropout(*args, **kwargs)[source]

Apply dropout based on an input.

This is required for uncertainty prediction. The standard Keras Dropout layer only performs dropout during training, but we sometimes need to do it during prediction. The second input to this layer should be a scalar equal to 0 or 1, indicating whether to perform dropout.
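The switching behavior can be sketched in NumPy as follows. This is an illustration rather than the layer's code: the `rate` argument, the inverted-dropout rescaling, and the function name `switched_dropout` are assumptions, and `rate` is expected to be strictly less than 1.

```python
import numpy as np


def switched_dropout(x, switch, rate, rng):
    """Apply dropout only when switch is 1 (illustrative sketch)."""
    # With switch == 0 the keep probability is 1, so nothing is dropped.
    keep_prob = 1.0 - switch * rate
    mask = rng.random(x.shape) < keep_prob
    # Rescale the kept values so the expected sum is unchanged.
    return np.where(mask, x / keep_prob, 0.0)


rng = np.random.default_rng(0)
x = np.ones((4, 5))
unchanged = switched_dropout(x, switch=0.0, rate=0.5, rng=rng)
dropped = switched_dropout(x, switch=1.0, rate=0.5, rng=rng)
```

Passing the switch as an input rather than relying on a training flag is what lets dropout stay active at prediction time, for example when sampling several stochastic predictions to estimate uncertainty.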

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs – Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

  - inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

  - NumPy array or Python scalar values in inputs get cast as tensors.

  - Layers are built (build(input_shape) method) using shape info from inputs only.

  - input_spec compatibility is only checked against inputs.

  - Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

  - The SavedModel input specification is generated using inputs only.

  - Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc. is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs – Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

  - training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

  - mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class WeightedLinearCombo(*args, **kwargs)[source]

Computes a weighted linear combination of input layers, with the weights defined by trainable variables.

__init__(std=0.3, **kwargs)[source]

Initialize this layer.

Parameters

std (float, optional (default 0.3)) – The standard deviation to use when randomly initializing weights.
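A weighted linear combination of several inputs can be sketched in NumPy as shown below. This is a simplification for illustration: it assumes one scalar weight per input drawn from a normal distribution with standard deviation `std`, whereas the actual layer may use a different weight shape. The function name `weighted_linear_combo` is hypothetical.

```python
import numpy as np


def weighted_linear_combo(inputs, weights):
    """Sum of inputs scaled by per-input weights (illustrative sketch)."""
    out = np.zeros_like(inputs[0], dtype=float)
    for x, w in zip(inputs, weights):
        out += w * x
    return out


rng = np.random.default_rng(0)
std = 0.3
inputs = [np.array([[1.0, 2.0]]), np.array([[3.0, 4.0]])]
# Random initial weights; in a trained layer these would be learned.
weights = rng.normal(0.0, std, size=len(inputs))
combined = weighted_linear_combo(inputs, weights)
```

All inputs must share the same shape, since the output is their elementwise weighted sum.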

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.
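Because get_config() may hand back the same dictionary on every call, copying it before editing avoids changing the original layer's configuration by accident. The sketch below uses a hypothetical TinyLayer class as a stand-in; it is not a DeepChem or Keras class.

```python
import copy

class TinyLayer:
    """Hypothetical stand-in for a layer with a config dictionary."""

    def __init__(self, std=0.3):
        self._config = {"std": std}

    def get_config(self):
        # Like the real method, this may return the same dict each time.
        return self._config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

layer = TinyLayer(std=0.3)
config = copy.deepcopy(layer.get_config())  # copy before modifying
config["std"] = 0.1
clone = TinyLayer.from_config(config)       # new layer, no trained weights
```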

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class CombineMeanStd(*args, **kwargs)[source]

Generate Gaussian noise.

__init__(training_only=False, noise_epsilon=1.0, **kwargs)[source]

Create a CombineMeanStd layer.

This layer should have two inputs with the same shape, and its output also has the same shape. Each element of the output is a Gaussian distributed random number whose mean is the corresponding element of the first input, and whose standard deviation is the corresponding element of the second input.

Parameters
• training_only (bool) – if True, noise is only generated during training. During prediction, the output is simply equal to the first input (that is, the mean of the distribution used during training).

• noise_epsilon (float) – The noise is scaled by this factor
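The sampling rule described above can be sketched in NumPy. This is an illustration under stated assumptions rather than the layer's TensorFlow code: the function name combine_mean_std and the rng argument are hypothetical, and the noise is assumed to be drawn from a normal distribution with standard deviation noise_epsilon before being scaled by the second input.

```python
import numpy as np

def combine_mean_std(mean, std, training=True, training_only=False,
                     noise_epsilon=1.0, rng=None):
    """Sample mean + std * noise, with noise ~ N(0, noise_epsilon**2)."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    if training_only and not training:
        # Prediction mode: return the mean unchanged.
        return mean
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, noise_epsilon, size=mean.shape)
    return mean + std * noise

mean = np.array([[0.0, 1.0], [2.0, 3.0]])
std = np.array([[0.1, 0.1], [0.5, 0.5]])
sample = combine_mean_std(mean, std, rng=np.random.default_rng(0))
prediction = combine_mean_std(mean, std, training=False, training_only=True)
```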

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs, training=True)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class Stack(*args, **kwargs)[source]

Stack the inputs along a new axis.
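As a small illustration of stacking along a new axis, the NumPy sketch below combines two equally shaped arrays. The position of the new axis (1 here) is an assumption made for the example; this page does not state which axis the layer uses.

```python
import numpy as np

a = np.zeros((4, 3))
b = np.ones((4, 3))
# Assumption: the new axis is inserted at position 1.
stacked = np.stack([a, b], axis=1)  # shape (4, 2, 3)
```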

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class VinaFreeEnergy(*args, **kwargs)[source]

Computes free-energy as defined by Autodock Vina.

TODO(rbharath): Make this layer support batching.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

nonlinearity(c, w)[source]

Computes non-linearity used in Vina.

repulsion(d)[source]

Computes Autodock Vina’s repulsion interaction term.

hydrophobic(d)[source]

Computes Autodock Vina’s hydrophobic interaction term.

hydrogen_bond(d)[source]

Computes Autodock Vina’s hydrogen bond interaction term.

gaussian_first(d)[source]

Computes Autodock Vina’s first Gaussian interaction term.

gaussian_second(d)[source]

Computes Autodock Vina’s second Gaussian interaction term.
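For orientation, the interaction terms can be sketched in NumPy using the functional forms published for the AutoDock Vina scoring function, where d is the surface distance between two atoms in Angstroms. This is a sketch under that assumption and may differ in detail from the layer's TensorFlow code; in particular, the explicit n_rot argument (number of rotatable bonds) is a hypothetical addition that does not appear in the documented nonlinearity(c, w) signature.

```python
import numpy as np

def repulsion(d):
    """d**2 where atom surfaces overlap (d < 0), otherwise 0."""
    d = np.asarray(d, dtype=float)
    return np.where(d < 0, d ** 2, 0.0)

def hydrophobic(d):
    """1 below 0.5 A, linear ramp down to 0 at 1.5 A."""
    d = np.asarray(d, dtype=float)
    return np.where(d < 0.5, 1.0, np.where(d < 1.5, 1.5 - d, 0.0))

def hydrogen_bond(d):
    """1 below -0.7 A, linear ramp down to 0 at 0 A."""
    d = np.asarray(d, dtype=float)
    return np.where(d < -0.7, 1.0, np.where(d < 0, -d / 0.7, 0.0))

def gaussian_first(d):
    """Gaussian centered at 0 A with width 0.5 A."""
    return np.exp(-(np.asarray(d, dtype=float) / 0.5) ** 2)

def gaussian_second(d):
    """Gaussian centered at 3 A with width 2 A."""
    return np.exp(-((np.asarray(d, dtype=float) - 3.0) / 2.0) ** 2)

def nonlinearity(c, w, n_rot):
    """Scale the summed score c by the rotatable-bond penalty."""
    return c / (1.0 + w * n_rot)
```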

call(inputs)[source]
Parameters
• X (tf.Tensor of shape (N, d)) – Coordinates/features.

• Z (tf.Tensor of shape (N)) – Atomic numbers of neighbor atoms.

Returns

layer – The free energy of each complex in batch

Return type

tf.Tensor of shape (B)

class NeighborList(*args, **kwargs)[source]

Computes a neighbor-list in Tensorflow.

Neighbor-lists (also called Verlet Lists) are a tool for grouping atoms which are close to each other spatially. This layer computes a Neighbor List from a provided tensor of atomic coordinates. You can think of this as a general “k-means” layer, but optimized for the case k==3.

TODO(rbharath): Make this layer support batching.

__init__(N_atoms, M_nbrs, ndim, nbr_cutoff, start, stop, **kwargs)[source]
Parameters
• N_atoms (int) – Maximum number of atoms this layer will neighbor-list.

• M_nbrs (int) – Maximum number of spatial neighbors possible for atom.

• ndim (int) – Dimensionality of space atoms live in. (Typically 3D, but sometimes will want to use higher dimensional descriptors for atoms).

• nbr_cutoff (float) – Length in Angstroms (?) at which atom boxes are gridded.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

compute_nbr_list(coords)[source]

Get closest neighbors for atoms.

Needs to handle padding for atoms with no neighbors.

Parameters

coords (tf.Tensor) – Shape (N_atoms, ndim)

Returns

nbr_list – Shape (N_atoms, M_nbrs) of atom indices

Return type

tf.Tensor
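To make the expected output concrete, the sketch below builds a neighbor list of shape (N_atoms, M_nbrs) by brute force in NumPy. It is not the cell-based algorithm this layer uses, it applies no distance cutoff, and it does not handle padding; the function name brute_force_nbr_list is hypothetical.

```python
import numpy as np

def brute_force_nbr_list(coords, M_nbrs):
    """Return indices of the M_nbrs closest atoms for every atom."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise squared distances, shape (N_atoms, N_atoms).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # An atom is not its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the M_nbrs smallest distances in each row.
    return np.argsort(dist2, axis=1)[:, :M_nbrs]

coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [3.0, 0.0, 0.0],
                   [7.0, 0.0, 0.0]])
nbr_list = brute_force_nbr_list(coords, M_nbrs=2)
```

The brute-force version costs O(N_atoms^2) in time and memory, which is the cost that gridding atoms into cells is meant to avoid.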

get_atoms_in_nbrs(coords, cells)[source]

Get the atoms in neighboring cells for each cell.

Return type

atoms_in_nbrs = (N_atoms, n_nbr_cells, M_nbrs)

get_closest_atoms(coords, cells)[source]

For each cell, find M_nbrs closest atoms.

Let N_atoms be the number of atoms.

Parameters
• coords (tf.Tensor) – (N_atoms, ndim) shape.

• cells (tf.Tensor) – (n_cells, ndim) shape.

Returns

closest_inds – Of shape (n_cells, M_nbrs)

Return type

tf.Tensor

get_cells_for_atoms(coords, cells)[source]

Compute the cells each atom belongs to.

Parameters
• coords (tf.Tensor) – Shape (N_atoms, ndim)

• cells (tf.Tensor) – (n_cells, ndim) shape.

Returns

cells_for_atoms – Shape (N_atoms, 1)

Return type

tf.Tensor

get_neighbor_cells(cells)[source]

Compute neighbors of cells in grid.

TODO(rbharath): Do we need to handle periodic boundary conditions properly here?

TODO(rbharath): This doesn’t handle boundaries well. We hard-code looking for n_nbr_cells neighbors, which isn’t right for boundary cells in the cube.

Parameters

cells (tf.Tensor) – (n_cells, ndim) shape.

Returns

nbr_cells – (n_cells, n_nbr_cells)

Return type

tf.Tensor

get_cells()[source]

Returns the locations of all grid points in box.

Suppose start is -10 Angstrom, stop is 10 Angstrom, and nbr_cutoff is 1. Then this would return a list of length 20^3 whose entries would be [(-10, -10, -10), (-10, -10, -9), …, (9, 9, 9)]

Returns

cells – (n_cells, ndim) shape.

Return type

tf.Tensor
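The grid in the example above can be reproduced with a few lines of standard-library Python. The helper name grid_cells is hypothetical, and the sketch returns a plain list of tuples rather than a tf.Tensor.

```python
import itertools

def grid_cells(start, stop, nbr_cutoff, ndim=3):
    """List the lower corners of all grid cells in the box."""
    n_per_axis = int(round((stop - start) / nbr_cutoff))
    ticks = [start + i * nbr_cutoff for i in range(n_per_axis)]
    # Cartesian product over ndim axes; the last axis varies fastest.
    return list(itertools.product(ticks, repeat=ndim))

cells = grid_cells(-10, 10, 1)
```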

class AtomicConvolution(*args, **kwargs)[source]

Implements the atomic convolutional transform introduced in

Gomes, Joseph, et al. “Atomic convolutional networks for predicting protein-ligand binding affinity.” arXiv preprint arXiv:1703.10603 (2017).

At a high level, this transform performs a graph convolution on the nearest neighbors graph in 3D space.

Atomic convolution layer

N = max_num_atoms, M = max_num_neighbors, B = batch_size, d = num_features, l = num_radial_filters * num_atom_types

Parameters
• atom_types (list or None) – Of length a, where a is number of atom types for filtering.

• radial_params (list) – Of length l, where l is number of radial filters learned.

• boxsize (float or None) – Simulation box length [Angstrom].

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]
Parameters
• X (tf.Tensor of shape (B, N, d)) – Coordinates/features.

• Nbrs (tf.Tensor of shape (B, N, M)) – Neighbor list.

• Nbrs_Z (tf.Tensor of shape (B, N, M)) – Atomic numbers of neighbor atoms.

Returns

layer – A new tensor representing the output of the atomic conv layer

Return type

tf.Tensor of shape (B, N, l)

radial_symmetry_function(R, rc, rs, e)[source]

Calculates radial symmetry function.

B = batch_size, N = max_num_atoms, M = max_num_neighbors, d = num_filters

Parameters
• R (tf.Tensor of shape (B, N, M)) – Distance matrix.

• rc (float) – Interaction cutoff [Angstrom].

• rs (float) – Gaussian distance matrix mean.

• e (float) – Gaussian distance matrix width.

Returns

retval – Radial symmetry function (before summation)

Return type

tf.Tensor of shape (B, N, M)

radial_cutoff(R, rc)[source]

Calculates radial cutoff matrix.

B = batch_size, N = max_num_atoms, M = max_num_neighbors

Parameters
• R (tf.Tensor of shape (B, N, M)) – Distance matrix.

• rc (tf.Variable) – Interaction cutoff [Angstrom].

Returns

FC – Radial cutoff matrix.

Return type

tf.Tensor of shape (B, N, M)

gaussian_distance_matrix(R, rs, e)[source]

Calculates gaussian distance matrix.

B = batch_size, N = max_num_atoms, M = max_num_neighbors

Parameters
• R (tf.Tensor of shape (B, N, M)) – Distance matrix.

• rs (tf.Variable) – Gaussian distance matrix mean.

• e (tf.Variable) – Gaussian distance matrix width (e = .5/std**2).

Returns

retval – Gaussian distance matrix.

Return type

tf.Tensor of shape (B, N, M)
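The radial quantities documented above can be sketched in NumPy as follows. The Gaussian form follows from the stated width e = .5/std**2; the cosine shape of the cutoff and the product of the two terms are assumptions based on the usual definition of radial symmetry functions, not something this page states. The function names radial_cutoff and radial_symmetry_function are chosen for the sketch.

```python
import numpy as np

def gaussian_distance_matrix(R, rs, e):
    """Gaussian of the distances, centered at rs with width e."""
    return np.exp(-e * (R - rs) ** 2)

def radial_cutoff(R, rc):
    """Cosine cutoff: decays smoothly to 0 at R = rc, 0 beyond it."""
    T = 0.5 * (np.cos(np.pi * R / rc) + 1.0)
    return np.where(R <= rc, T, 0.0)

def radial_symmetry_function(R, rc, rs, e):
    """Gaussian term damped by the cutoff (before summation)."""
    return gaussian_distance_matrix(R, rs, e) * radial_cutoff(R, rc)

R = np.array([[[0.0, 2.0, 6.0]]])  # shape (B, N, M) = (1, 1, 3)
out = radial_symmetry_function(R, rc=4.0, rs=0.0, e=0.5)
```

Neighbors beyond rc contribute exactly zero, so the output keeps the (B, N, M) shape of the distance matrix while ignoring distant atoms.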

distance_tensor(X, Nbrs, boxsize, B, N, M, d)[source]

Calculates distance tensor for batch of molecules.

B = batch_size, N = max_num_atoms, M = max_num_neighbors, d = num_features

Parameters
• X (tf.Tensor of shape (B, N, d)) – Coordinates/features tensor.

• Nbrs (tf.Tensor of shape (B, N, M)) – Neighbor list tensor.

• boxsize (float or None) – Simulation box length [Angstrom].

Returns

D – Coordinates/features distance tensor.

Return type

tf.Tensor of shape (B, N, M, d)

distance_matrix(D)[source]

Calculates the distance matrix from the distance tensor.

B = batch_size, N = max_num_atoms, M = max_num_neighbors, d = num_features

Parameters

D (tf.Tensor of shape (B, N, M, d)) – Distance tensor.

Returns

R – Distance matrix.

Return type

tf.Tensor of shape (B, N, M)
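The relationship between the distance tensor and the distance matrix can be sketched in NumPy. This is an illustration under assumptions: shapes are inferred from the arrays instead of being passed as B, N, M and d, and the periodic wrap for boxsize uses the minimum-image convention, which this page does not spell out.

```python
import numpy as np

def distance_tensor(X, Nbrs, boxsize=None):
    """Displacement from each atom to each of its neighbors, (B, N, M, d)."""
    B = X.shape[0]
    batch_idx = np.arange(B)[:, None, None]
    nbr_coords = X[batch_idx, Nbrs]        # gather neighbors: (B, N, M, d)
    D = nbr_coords - X[:, :, None, :]
    if boxsize is not None:
        # Assumption: minimum-image convention for a periodic box.
        D = D - np.round(D / boxsize) * boxsize
    return D

def distance_matrix(D):
    """Euclidean norm over the feature axis, giving shape (B, N, M)."""
    return np.sqrt(np.sum(D ** 2, axis=3))

X = np.array([[[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]])  # (B, N, d) = (1, 2, 3)
Nbrs = np.array([[[1], [0]]])                       # (B, N, M) = (1, 2, 1)
D = distance_tensor(X, Nbrs)
R = distance_matrix(D)
```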

class AlphaShareLayer(*args, **kwargs)[source]

Part of a sluice network. Adds alpha parameters to control sharing between the main and auxiliary tasks.

Factory method AlphaShare should be used for construction

Parameters

in_layers (list of Layers or tensors) – tensors in list must be the same size and list must include two or more tensors

Returns

out_tensor – A tensor with shape [len(in_layers), x, y], where x and y are the original layer dimensions.
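One way to picture alpha sharing is as a learned mixing matrix applied across the stacked inputs. The NumPy sketch below assumes that each output is a weighted sum of all inputs, out[i] = sum_j alphas[i, j] * in[j]; the function name alpha_share and the fixed alphas values are hypothetical, and the layer's own code may organize the computation differently.

```python
import numpy as np

def alpha_share(in_tensors, alphas):
    """Mix equally shaped (x, y) inputs with an (n, n) alpha matrix."""
    stacked = np.stack(in_tensors)                    # (n, x, y)
    return np.einsum("ij,jxy->ixy", alphas, stacked)  # (n, x, y)

main = np.ones((2, 3))
aux = 3.0 * np.ones((2, 3))
# Assumption: mostly keep each task's own features, share a little.
alphas = np.array([[0.9, 0.1],
                   [0.1, 0.9]])
out_tensor = alpha_share([main, aux], alphas)
```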

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class SluiceLoss(*args, **kwargs)[source]

Calculates the loss in a sluice network. Every input into an AlphaShare should be used in SluiceLoss.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class BetaShare(*args, **kwargs)[source]

Part of a sluice network. Adds beta params to control which layer outputs are used for prediction

Parameters

in_layers (list of Layers or tensors) – tensors in list must be the same size and list must include two or more tensors

Returns

output_layers

Return type

list of Layers or tensors with same size as in_layers

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Size of input layers must all be the same

class ANIFeat(*args, **kwargs)[source]

Performs transform from 3D coordinates to ANI symmetry functions

__init__(max_atoms=23, radial_cutoff=4.6, angular_cutoff=3.1, radial_length=32, angular_length=8, atom_cases=[1, 6, 7, 8, 16], atomic_number_differentiated=True, coordinates_in_bohr=True, **kwargs)[source]

Only X can be transformed

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

call(inputs)[source]

Inputs should be of dtype tf.float32 and shape (None, self.max_atoms, 4).

distance_matrix(coordinates, flags)[source]

Generate distance matrix

distance_cutoff(d, cutoff, flags)[source]

Generate distance matrix with trainable cutoff

angular_symmetry(d_cutoff, d, atom_numbers, coordinates)[source]

Angular Symmetry Function

class GraphEmbedPoolLayer(*args, **kwargs)[source]

GraphCNNPool Layer from Robust Spatial Filtering with Graph Convolutional Neural Networks https://arxiv.org/abs/1703.00792

This is a learnable pool operation. It constructs a new adjacency matrix for a graph with a specified number of nodes.

This differs from our other pool operations which set vertices to a function value without altering the adjacency matrix.

$$V_{emb} = \mathrm{SpatialGraphCNN}(V_{in})$$

$$V_{out} = \sigma(V_{emb})^{T} V_{in}$$

$$A_{out} = V_{emb}^{T} A_{in} V_{emb}$$
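The pooling equations can be traced through a small NumPy sketch for a single graph with one adjacency matrix. Several things here are assumptions: sigma is taken to be a softmax over the input vertices, V_emb is a random stand-in for the output of the SpatialGraphCNN step, and the function names are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def graph_embed_pool(V_in, A_in, V_emb):
    """Pool n_in vertices down to n_out vertices."""
    V_out = softmax(V_emb, axis=0).T @ V_in  # (n_out, channels)
    A_out = V_emb.T @ A_in @ V_emb           # (n_out, n_out)
    return V_out, A_out

rng = np.random.default_rng(0)
n_in, n_out, channels = 5, 2, 3
V_in = rng.normal(size=(n_in, channels))
A_in = np.eye(n_in)
V_emb = rng.normal(size=(n_in, n_out))  # stand-in for SpatialGraphCNN(V_in)
V_out, A_out = graph_embed_pool(V_in, A_in, V_emb)
```

The embedding matrix has one column per output vertex, so both the vertex features and the adjacency matrix shrink from n_in to n_out vertices.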

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]
Parameters
• num_filters (int) – Number of filters to have in the output

• in_layers (list of Layers or tensors) –

[V, A, mask]. V holds the vertex features and must be of shape (batch, vertex, channel)

A holds the adjacency matrices for each graph

mask is optional, to be used when not every graph has the same number of vertices

Returns

• Returns a tf.tensor with a graph convolution applied

• The shape will be (batch, vertex, self.num_filters).

class GraphCNN(*args, **kwargs)[source]

GraphCNN Layer from Robust Spatial Filtering with Graph Convolutional Neural Networks https://arxiv.org/abs/1703.00792

Spatial-domain convolutions can be defined as H = h_0I + h_1A + h_2A^2 + … + h_kA^k, H ∈ R^(N×N)

We approximate it by H ≈ h_0I + h_1A

We can define a convolution as applying multiple of these linear filters over edges of different types (think up, down, left, right, diagonal in images), where each edge type has its own adjacency matrix: H ≈ h_0I + h_1A_1 + h_2A_2 + … + h_(L−1)A_(L−1)

V_out = sum_{c=1}^{C} H^{c} V^{c} + b
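A minimal NumPy sketch of this filter for a single graph follows. It assumes one coefficient per (output filter, input channel, edge type) plus one for the identity term; the function name and the array layouts are hypothetical and are not taken from the DeepChem implementation.

```python
import numpy as np

def graph_cnn_filter(V, A, h, b):
    """Apply V_out = sum_c H^c V^c + b for every output filter.

    V: (N, C) vertex features
    A: (E, N, N) one adjacency matrix per edge type
    h: (F, C, E + 1) filter coefficients; index 0 multiplies the identity
    b: (F,) bias per output filter
    """
    n_vertices = V.shape[0]
    # Stack the identity with the edge-type adjacencies: (E + 1, N, N)
    basis = np.concatenate([np.eye(n_vertices)[None], A], axis=0)
    # H[f, c] = h[f, c, 0] * I + sum_e h[f, c, e] * A_e  -> (F, C, N, N)
    H = np.einsum("fcl,lij->fcij", h, basis)
    # V_out[i, f] = sum_c sum_j H[f, c, i, j] * V[j, c] + b[f]
    return np.einsum("fcij,jc->if", H, V) + b

rng = np.random.default_rng(0)
V = rng.random((5, 3))                              # 5 vertices, 3 channels
A = (rng.random((2, 5, 5)) > 0.5).astype(float)     # 2 edge types
h = rng.random((4, 3, 3))                           # 4 output filters
b = np.zeros(4)
V_out = graph_cnn_filter(V, A, h, b)
print(V_out.shape)
```

The output has one column per filter, matching the (vertex, num_filters) part of the shape documented for this layer.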

__init__(num_filters, **kwargs)[source]
Parameters
• num_filters (int) – Number of filters to have in the output

• in_layers (list of Layers or tensors) –

[V, A, mask]. V holds the vertex features and must be of shape (batch, vertex, channel)

A holds the adjacency matrices for each graph

mask is optional, to be used when not every graph has the same number of vertices

Returns

A tf.tensor with a graph convolution applied. The shape will be (batch, vertex, self.num_filters).

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs –

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs –

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class Highway(*args, **kwargs)[source]

Create a highway layer.

y = H(x) * T(x) + x * (1 - T(x))

H(x) = activation_fn(matmul(W_H, x) + b_H) is the non-linear transformed output.

T(x) = sigmoid(matmul(W_T, x) + b_T) is the transform gate.

Implementation based on paper

Srivastava, Rupesh Kumar, Klaus Greff, and Jürgen Schmidhuber. “Highway networks.” arXiv preprint arXiv:1505.00387 (2015).

This layer expects its input to be a two dimensional tensor of shape (batch size, # input features). Outputs will be in the same shape.
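The highway formula can be sketched in NumPy as below. This is an illustration rather than the layer's code: `highway_forward` and its weight arguments are hypothetical, the activation is fixed to ReLU, and because the input is a batch of row vectors the products are written as `x @ W` instead of matmul(W, x).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_forward(x, w_h, b_h, w_t, b_t):
    """x: (batch, n_features); w_h and w_t: (n_features, n_features)."""
    h = np.maximum(x @ w_h + b_h, 0.0)   # H(x) with a ReLU activation
    t = sigmoid(x @ w_t + b_t)           # T(x), the transform gate in (0, 1)
    return h * t + x * (1.0 - t)         # same shape as x

rng = np.random.default_rng(0)
x = rng.random((4, 8))
w_h = rng.normal(size=(8, 8))
w_t = rng.normal(size=(8, 8))
y = highway_forward(x, w_h, np.zeros(8), w_t, np.zeros(8))
print(y.shape)
```

When the gate bias is strongly negative, T(x) approaches 0 and the layer passes its input through almost unchanged; when it is strongly positive, the output approaches H(x).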

__init__(activation_fn='relu', biases_initializer='zeros', weights_initializer=None, **kwargs)[source]
Parameters
• activation_fn (object) – the TensorFlow activation function to apply to the output

• biases_initializer (callable object) – the initializer for bias values. This may be None, in which case the layer will not include biases.

• weights_initializer (callable object) – the initializer for weight values

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs –

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs –

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class WeaveLayer(*args, **kwargs)[source]

This class implements the core Weave convolution from the Google graph convolution paper [1]_

This model contains atom features and bond features separately. Here, bond features are also called pair features. There are 4 types of transformation that this model implements: atom->atom, atom->pair, pair->atom, and pair->pair.

Examples

This layer expects 4 inputs in a list of the form [atom_features, pair_features, pair_split, atom_to_pair]. We’ll walk through the structure of these inputs. Let’s start with some basic definitions.

>>> import deepchem as dc
>>> import numpy as np


Suppose you have a batch of molecules

>>> smiles = ["CCC", "C"]


Note that there are 4 atoms in total in this system. This layer expects its input molecules to be batched together.

>>> total_n_atoms = 4


Let’s suppose that we have a featurizer that computes n_atom_feat features per atom.

>>> n_atom_feat = 75


Then conceptually, atom_feat is the array of shape (total_n_atoms, n_atom_feat) of atomic features. For simplicity, let’s just go with a random such matrix.

>>> atom_feat = np.random.rand(total_n_atoms, n_atom_feat)


Let’s suppose we have n_pair_feat pairwise features

>>> n_pair_feat = 14


For each molecule, we compute a matrix of shape (n_atoms*n_atoms, n_pair_feat) of pairwise features for each pair of atoms in the molecule. Let’s construct this conceptually for our example.

>>> pair_feat = [np.random.rand(3*3, n_pair_feat), np.random.rand(1*1, n_pair_feat)]
>>> pair_feat = np.concatenate(pair_feat, axis=0)
>>> pair_feat.shape
(10, 14)


pair_split is an index into pair_feat which tells us which atom each row belongs to. In our case, we have

>>> pair_split = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3])


That is, the first 9 entries belong to “CCC” and the last entry to “C”. The final input, atom_to_pair, is more detailed than pair_split and gives the exact pair of atoms each row of pair features belongs to. In our case

>>> atom_to_pair = np.array([[0, 0],
...                          [0, 1],
...                          [0, 2],
...                          [1, 0],
...                          [1, 1],
...                          [1, 2],
...                          [2, 0],
...                          [2, 1],
...                          [2, 2],
...                          [3, 3]])
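For larger batches these two index arrays can be generated from the number of atoms in each molecule. The helper below is a hypothetical sketch (it is not a DeepChem function) that reproduces the hand-written arrays for the “CCC” and “C” batch.

```python
import numpy as np

def build_pair_indices(atoms_per_mol):
    """Return (pair_split, atom_to_pair) for a batch of molecules.

    atoms_per_mol: list with the number of atoms in each molecule.
    Atom indices are global across the batch, in molecule order.
    """
    pair_split, atom_to_pair = [], []
    offset = 0
    for n_atoms in atoms_per_mol:
        for i in range(n_atoms):
            for j in range(n_atoms):
                pair_split.append(offset + i)                   # first atom of the pair
                atom_to_pair.append([offset + i, offset + j])   # both atoms of the pair
        offset += n_atoms
    return np.array(pair_split), np.array(atom_to_pair)

# "CCC" has 3 atoms and "C" has 1, giving 3*3 + 1*1 = 10 pairs.
split, pairs = build_pair_indices([3, 1])
```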


Let’s now define the actual layer

>>> layer = WeaveLayer()


And invoke it

>>> [A, P] = layer([atom_feat, pair_feat, pair_split, atom_to_pair])


The weave layer produces new atom/pair features. Let’s check their shapes

>>> A = np.array(A)
>>> A.shape
(4, 50)
>>> P = np.array(P)
>>> P.shape
(10, 50)


The 4 is total_n_atoms and the 10 is the total number of pairs. Where does 50 come from? It’s from the default values of the arguments n_atom_output_feat and n_pair_output_feat.

References

1

Kearnes, Steven, et al. “Molecular graph convolutions: moving beyond fingerprints.” Journal of computer-aided molecular design 30.8 (2016): 595-608.

__init__(n_atom_input_feat: int = 75, n_pair_input_feat: int = 14, n_atom_output_feat: int = 50, n_pair_output_feat: int = 50, n_hidden_AA: int = 50, n_hidden_PA: int = 50, n_hidden_AP: int = 50, n_hidden_PP: int = 50, update_pair: bool = True, init: str = 'glorot_uniform', activation: str = 'relu', batch_normalize: bool = True, batch_normalize_kwargs: Dict = {'renorm': True}, **kwargs)[source]
Parameters
• n_atom_input_feat (int, optional (default 75)) – Number of features for each atom in input.

• n_pair_input_feat (int, optional (default 14)) – Number of features for each pair of atoms in input.

• n_atom_output_feat (int, optional (default 50)) – Number of features for each atom in output.

• n_pair_output_feat (int, optional (default 50)) – Number of features for each pair of atoms in output.

• n_hidden_AA (int, optional (default 50)) – Number of units (convolution depths) in corresponding hidden layer

• n_hidden_PA (int, optional (default 50)) – Number of units (convolution depths) in corresponding hidden layer

• n_hidden_AP (int, optional (default 50)) – Number of units (convolution depths) in corresponding hidden layer

• n_hidden_PP (int, optional (default 50)) – Number of units (convolution depths) in corresponding hidden layer

• update_pair (bool, optional (default True)) – Whether to compute updated pair features. This can be turned off for the last layer.

• init (str, optional (default 'glorot_uniform')) – Weight initialization for filters.

• activation (str, optional (default 'relu')) – Activation function applied

• batch_normalize (bool, optional (default True)) – If this is turned on, apply batch normalization before applying activation functions on convolutional layers.

• batch_normalize_kwargs (Dict, optional (default {renorm=True})) – Batch normalization is a complex layer which has many potential arguments which change behavior. This layer accepts user-defined parameters which are passed to all BatchNormalization layers in WeaveModel, WeaveLayer, and WeaveGather.

get_config() Dict[source]

Returns config dictionary for this layer.

build(input_shape)[source]

Construct internal trainable weights.

Parameters

input_shape (tuple) – Ignored since we don’t need the input shape to create internal weights.

call(inputs: List) List[source]

Creates weave tensors.

Parameters

inputs (List) – Should contain 4 tensors [atom_features, pair_features, pair_split, atom_to_pair]

class WeaveGather(*args, **kwargs)[source]

Implements the weave-gathering section of weave convolutions.

Implements the gathering layer from [1]_. The weave gathering layer gathers per-atom features to create a molecule-level fingerprint in a weave convolutional network. This layer can also perform Gaussian histogram expansion as detailed in [1]_. Note that the gathering function here is simply addition, as in [1]_.

Examples

This layer expects 2 inputs in a list of the form [atom_features, atom_split]. We’ll walk through the structure of these inputs. Let’s start with some basic definitions.

>>> import deepchem as dc
>>> import numpy as np


Suppose you have a batch of molecules

>>> smiles = ["CCC", "C"]


Note that there are 4 atoms in total in this system. This layer expects its input molecules to be batched together.

>>> total_n_atoms = 4


Let’s suppose that we have n_atom_feat features per atom.

>>> n_atom_feat = 75


Then conceptually, atom_feat is the array of shape (total_n_atoms, n_atom_feat) of atomic features. For simplicity, let’s just go with a random such matrix.

>>> atom_feat = np.random.rand(total_n_atoms, n_atom_feat)


We then need to provide a mapping of indices to the molecules the atoms belong to. In our case this would be

>>> atom_split = np.array([0, 0, 0, 1])


Let’s now define the actual layer

>>> gather = WeaveGather(batch_size=2, n_input=n_atom_feat)
>>> output_molecules = gather([atom_feat, atom_split])
>>> len(output_molecules)
2
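The "gathering by addition" step can be sketched in NumPy as a per-molecule sum. This covers only the summation; the Gaussian histogram expansion the layer can apply is left out, and `sum_atoms_per_molecule` is a hypothetical helper, not part of DeepChem.

```python
import numpy as np

def sum_atoms_per_molecule(atom_feat, atom_split, batch_size):
    """Sum atom feature rows into one row per molecule.

    atom_feat:  (total_n_atoms, n_feat)
    atom_split: (total_n_atoms,) molecule index of each atom
    """
    out = np.zeros((batch_size, atom_feat.shape[1]))
    # Unbuffered add, so atoms sharing a molecule index accumulate correctly.
    np.add.at(out, atom_split, atom_feat)
    return out

feats = np.arange(8.0).reshape(4, 2)      # 4 atoms, 2 features each
split = np.array([0, 0, 0, 1])            # "CCC" then "C"
molecules = sum_atoms_per_molecule(feats, split, batch_size=2)
```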


References

1

Kearnes, Steven, et al. “Molecular graph convolutions: moving beyond fingerprints.” Journal of computer-aided molecular design 30.8 (2016): 595-608.

Note

This class requires tensorflow_probability to be installed.

__init__(batch_size: int, n_input: int = 128, gaussian_expand: bool = True, compress_post_gaussian_expansion: bool = False, init: str = 'glorot_uniform', activation: str = 'tanh', **kwargs)[source]
Parameters
• batch_size (int) – number of molecules in a batch

• n_input (int, optional (default 128)) – number of features for each input molecule

• gaussian_expand (boolean, optional (default True)) – Whether to expand each dimension of atomic features by gaussian histogram

• compress_post_gaussian_expansion (bool, optional (default False)) – If True, compress the results of the Gaussian expansion back to the original dimensions of the input by using a linear layer with specified activation function. Note that this compression was not in the original paper, but was present in the original DeepChem implementation so is left present for backwards compatibility.

• init (str, optional (default 'glorot_uniform')) – Weight initialization for filters if compress_post_gaussian_expansion is True.

• activation (str, optional (default 'tanh')) – Activation function applied for filters if compress_post_gaussian_expansion is True. Should be recognizable by tf.keras.activations.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs: List) List[source]

Creates weave tensors.

Parameters

inputs (List) – Should contain 2 tensors [atom_features, atom_split]

Returns

output_molecules – Each entry in this list is of shape (self.n_inputs,)

Return type

List

gaussian_histogram(x)[source]

Expands input into a set of gaussian histogram bins.

Parameters

x (tf.Tensor) – Of shape (N, n_feat)

Examples

This method uses 11 bins spanning portions of a Gaussian with zero mean and unit standard deviation.

>>> gaussian_memberships = [(-1.645, 0.283), (-1.080, 0.170),
...                         (-0.739, 0.134), (-0.468, 0.118),
...                         (-0.228, 0.114), (0., 0.114),
...                         (0.228, 0.114), (0.468, 0.118),
...                         (0.739, 0.134), (1.080, 0.170),
...                         (1.645, 0.283)]


We construct a Gaussian at gaussian_memberships[i][0] with standard deviation gaussian_memberships[i][1]. Each feature in x is assigned the probability of falling in each Gaussian, and probabilities are normalized across the 11 different Gaussians.

Returns

outputs – Of shape (N, 11*n_feat)

Return type

tf.Tensor
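A NumPy sketch of the expansion described above follows. It assumes that "probability" means the Gaussian density evaluated at each feature value and that the 11 memberships of a feature are laid out contiguously in the output; the actual layer uses tensorflow_probability and may scale or order the bins differently.

```python
import numpy as np

GAUSSIAN_MEMBERSHIPS = [(-1.645, 0.283), (-1.080, 0.170),
                        (-0.739, 0.134), (-0.468, 0.118),
                        (-0.228, 0.114), (0., 0.114),
                        (0.228, 0.114), (0.468, 0.118),
                        (0.739, 0.134), (1.080, 0.170),
                        (1.645, 0.283)]

def gaussian_histogram_sketch(x):
    """Expand x of shape (N, n_feat) into shape (N, 11 * n_feat)."""
    means = np.array([m for m, _ in GAUSSIAN_MEMBERSHIPS])
    stds = np.array([s for _, s in GAUSSIAN_MEMBERSHIPS])
    # Density of every feature value under each Gaussian: (N, n_feat, 11)
    z = (x[..., None] - means) / stds
    density = np.exp(-0.5 * z ** 2) / (stds * np.sqrt(2.0 * np.pi))
    # Normalize across the 11 bins so each feature's memberships sum to 1.
    # Inputs far outside roughly [-2, 2] can underflow to zero in every bin.
    memberships = density / density.sum(axis=-1, keepdims=True)
    return memberships.reshape(x.shape[0], -1)

x = np.array([[0.0, 1.0], [-2.0, 0.5], [1.5, -0.3]])
expanded = gaussian_histogram_sketch(x)
print(expanded.shape)
```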

class DTNNEmbedding(*args, **kwargs)[source]
__init__(n_embedding=30, periodic_table_length=30, init='glorot_uniform', **kwargs)[source]
Parameters
• n_embedding (int, optional) – Number of features for each atom

• periodic_table_length (int, optional) – Length of embedding, 83=Bi

• init (str, optional) – Weight initialization for filters.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

parent layers: atom_number

class DTNNStep(*args, **kwargs)[source]
__init__(n_embedding=30, n_distance=100, n_hidden=60, init='glorot_uniform', activation='tanh', **kwargs)[source]
Parameters
• n_embedding (int, optional) – Number of features for each atom

• n_distance (int, optional) – granularity of distance matrix

• n_hidden (int, optional) – Number of nodes in hidden layer

• init (str, optional) – Weight initialization for filters.

• activation (str, optional) – Activation function applied

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

parent layers: atom_features, distance, distance_membership_i, distance_membership_j

class DTNNGather(*args, **kwargs)[source]
__init__(n_embedding=30, n_outputs=100, layer_sizes=[100], output_activation=True, init='glorot_uniform', activation='tanh', **kwargs)[source]
Parameters
• n_embedding (int, optional) – Number of features for each atom

• n_outputs (int, optional) – Number of features for each molecule (output)

• layer_sizes (list of int, optional(default=[100])) – Structure of hidden layer(s)

• init (str, optional) – Weight initialization for filters.

• activation (str, optional) – Activation function applied

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

parent layers: atom_features, atom_membership

class DAGLayer(*args, **kwargs)[source]

DAG computation layer.

This layer generates a directed acyclic graph for each atom in a molecule. This layer is based on the algorithm from the following paper:

Lusci, Alessandro, Gianluca Pollastri, and Pierre Baldi. “Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules.” Journal of chemical information and modeling 53.7 (2013): 1563-1575.

This layer performs a sort of inward sweep. Recall that for each atom, a DAG is generated that “points inward” to that atom from the undirected molecule graph. Picture this as “picking up” the atom as the vertex and using the natural tree structure that forms from gravity. The layer “sweeps inwards” from the leaf nodes of the DAG upwards to the atom. This is batched so the transformation is done for each atom.

__init__(n_graph_feat=30, n_atom_feat=75, max_atoms=50, layer_sizes=[100], init='glorot_uniform', activation='relu', dropout=None, batch_size=64, **kwargs)[source]
Parameters
• n_graph_feat (int, optional) – Number of features for each node (and the whole graph).

• n_atom_feat (int, optional) – Number of features listed per atom.

• max_atoms (int, optional) – Maximum number of atoms in molecules.

• layer_sizes (list of int, optional(default=[100])) – List of hidden layer size(s): length of this list represents the number of hidden layers, and each element is the width of corresponding hidden layer.

• init (str, optional) – Weight initialization for filters.

• activation (str, optional) – Activation function applied.

• dropout (float, optional) – Dropout probability in hidden layer(s).

• batch_size (int, optional) – number of molecules in a batch.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Construct internal trainable weights.

call(inputs, training=True)[source]

parent layers: atom_features, parents, calculation_orders, calculation_masks, n_atoms

class DAGGather(*args, **kwargs)[source]
__init__(n_graph_feat=30, n_outputs=30, max_atoms=50, layer_sizes=[100], init='glorot_uniform', activation='relu', dropout=None, **kwargs)[source]

DAG vector gathering layer

Parameters
• n_graph_feat (int, optional) – Number of features for each atom.

• n_outputs (int, optional) – Number of features for each molecule.

• max_atoms (int, optional) – Maximum number of atoms in molecules.

• layer_sizes (list of int, optional) – List of hidden layer size(s): length of this list represents the number of hidden layers, and each element is the width of corresponding hidden layer.

• init (str, optional) – Weight initialization for filters.

• activation (str, optional) – Activation function applied.

• dropout (float, optional) – Dropout probability in the hidden layer(s).

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs, training=True)[source]

parent layers: atom_features, membership

class MessagePassing(*args, **kwargs)[source]

General class for MPNN default structures built according to https://arxiv.org/abs/1511.06391

__init__(T, message_fn='enn', update_fn='gru', n_hidden=100, **kwargs)[source]
Parameters
• T (int) – Number of message passing steps

• message_fn (str, optional) – message function in the model

• update_fn (str, optional) – update function in the model

• n_hidden (int, optional) – number of hidden units in the passing phase

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Perform T steps of message passing
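As a rough illustration of what T message passing steps look like, the NumPy sketch below uses a linear message function summed over neighbors and a tanh update. These are simplifications chosen for brevity; they do not reproduce the layer's 'enn' message function or 'gru' update function, and all names here are hypothetical.

```python
import numpy as np

def message_passing_sketch(h, adjacency, w_msg, w_upd, T):
    """Run T rounds of message passing on one graph.

    h:         (n_atoms, n_hidden) initial atom states
    adjacency: (n_atoms, n_atoms) 1.0 where two atoms are bonded
    w_msg:     (n_hidden, n_hidden) weights of the message function
    w_upd:     (n_hidden, n_hidden) weights of the update function
    """
    for _ in range(T):
        # Each atom receives the sum of its neighbors' transformed states.
        messages = adjacency @ (h @ w_msg)
        # Simple tanh update used here in place of a GRU.
        h = np.tanh(h @ w_upd + messages)
    return h

rng = np.random.default_rng(0)
h0 = rng.normal(size=(3, 5))
# Chain of three atoms, as in propane ("CCC").
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
w_msg = rng.normal(size=(5, 5)) * 0.1
w_upd = rng.normal(size=(5, 5)) * 0.1
h_final = message_passing_sketch(h0, adj, w_msg, w_upd, T=3)
print(h_final.shape)
```

After T steps each atom state depends on atoms up to T bonds away, which is why the number of steps is a constructor argument.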

class EdgeNetwork(*args, **kwargs)[source]

Submodule for Message Passing

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class GatedRecurrentUnit(*args, **kwargs)[source]

Submodule for Message Passing

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

This is where the layer’s logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
• inputs

Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

• inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

• NumPy array or Python scalar values in inputs get cast as tensors.

• Layers are built (build(input_shape) method) using shape info from inputs only.

• input_spec compatibility is only checked against inputs.

• Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

• The SavedModel input specification is generated using inputs only.

• Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

• *args – Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

• **kwargs

Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

• training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

• mask: Boolean input mask. If the layer’s call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

class SetGather(*args, **kwargs)[source]

set2set gather layer for graph-based model

Models using this layer must set pad_batches=True.

__init__(M, batch_size, n_hidden=100, init='orthogonal', **kwargs)[source]
Parameters
• M (int) – Number of LSTM steps

• batch_size (int) – Number of samples in a batch (all batches must have the same size)

• n_hidden (int, optional) – number of hidden units in the passing phase

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs)[source]

Perform M steps of set2set gather.

A detailed description is given in https://arxiv.org/abs/1511.06391
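As a rough illustration of what one gather step does, the sketch below shows only the content-based attention readout over a set of vectors. It is a simplification under stated assumptions: the LSTM that updates the query between the M steps is omitted, the function name `attention_readout` is made up for this example, and none of it is DeepChem's implementation.

```python
import numpy as np


def attention_readout(x, q):
    """Content-based attention over a set of vectors.

    x: (n_items, d) set elements; q: (d,) query vector.
    """
    scores = x @ q                      # one scalar score per element
    scores = scores - scores.max()      # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ x                  # weighted sum, shape (d,)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
q = rng.normal(size=4)

readout = attention_readout(x, q)
# Reordering the set elements leaves the readout unchanged.
shuffled = attention_readout(x[rng.permutation(5)], q)
```

The readout is a weighted sum over the elements, which is why it does not depend on the order in which they are presented.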

## Torch Layers¶

class CNNModule(n_tasks: int, n_features: int, dims: int, layer_filters: List[int] = [100], kernel_size: Union[int, Sequence[int]] = 5, strides: Union[int, Sequence[int]] = 1, weight_init_stddevs: Union[float, Sequence[float]] = 0.02, bias_init_consts: Union[float, Sequence[float]] = 1.0, dropouts: Union[float, Sequence[float]] = 0.5, activation_fns: Union[Callable, str, Sequence[Union[Callable, str]]] = 'relu', pool_type: str = 'max', mode: str = 'classification', n_classes: int = 2, uncertainty: bool = False, residual: bool = False, padding: Union[int, str] = 'valid')[source]

A 1, 2, or 3 dimensional convolutional network for either regression or classification.

The network consists of the following sequence of layers:

• A configurable number of convolutional layers

• A global pooling layer (either max pool or average pool)

• A final fully connected layer to compute the output

It can optionally compose the model from pre-activation residual blocks, as described in https://arxiv.org/abs/1603.05027, rather than a simple stack of convolution layers. This often leads to easier training, especially when using a large number of layers. Note that residual blocks can only be used when successive layers have the same output shape. Wherever the output shape changes, a simple convolution layer will be used even if residual=True.

Examples

>>> import torch
>>> model = CNNModule(n_tasks=5, n_features=8, dims=2, layer_filters=[3,8,8,16], kernel_size=3, n_classes = 7, mode='classification', uncertainty=False, padding='same')
>>> x = torch.ones(2, 224, 224, 8)
>>> x = model(x)
>>> for tensor in x:
...    print(tensor.shape)
torch.Size([2, 5, 7])
torch.Size([2, 5, 7])
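The global pooling step in the layer sequence described for this class collapses every spatial dimension and keeps one value per channel. Here is a small NumPy sketch, assuming a channels-last feature map with made-up values. How CNNModule lays out its tensors internally is not shown in this document.

```python
import numpy as np

# Hypothetical feature map after the convolutional layers:
# (batch, height, width, channels), channels-last layout assumed.
features = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)

spatial_axes = (1, 2)
max_pooled = features.max(axis=spatial_axes)    # pool_type='max'
avg_pooled = features.mean(axis=spatial_axes)   # pool_type='average'
```

Both results have shape (batch, channels), so the final fully connected layer sees a fixed-size input regardless of the spatial size.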

__init__(n_tasks: int, n_features: int, dims: int, layer_filters: List[int] = [100], kernel_size: Union[int, Sequence[int]] = 5, strides: Union[int, Sequence[int]] = 1, weight_init_stddevs: Union[float, Sequence[float]] = 0.02, bias_init_consts: Union[float, Sequence[float]] = 1.0, dropouts: Union[float, Sequence[float]] = 0.5, activation_fns: Union[Callable, str, Sequence[Union[Callable, str]]] = 'relu', pool_type: str = 'max', mode: str = 'classification', n_classes: int = 2, uncertainty: bool = False, residual: bool = False, padding: Union[int, str] = 'valid') None[source]

Create a CNN.

Parameters
• n_tasks (int) – number of tasks

• n_features (int) – number of features

• dims (int) – the number of dimensions to apply convolutions over (1, 2, or 3)

• layer_filters (List[int]) – the number of output filters for each convolutional layer in the network. The length of this list determines the number of layers.

• kernel_size (int, tuple, or list) – a list giving the shape of the convolutional kernel for each layer. Each element may be either an int (use the same kernel width for every dimension) or a tuple (the kernel width along each dimension). Alternatively this may be a single int or tuple instead of a list, in which case the same kernel shape is used for every layer.

• strides (int, tuple, or list) – a list giving the stride between applications of the kernel for each layer. Each element may be either an int (use the same stride for every dimension) or a tuple (the stride along each dimension). Alternatively this may be a single int or tuple instead of a list, in which case the same stride is used for every layer.

• weight_init_stddevs (list or float) – the standard deviation of the distribution to use for weight initialization of each layer. The length of this list should equal len(layer_filters)+1, where the final element corresponds to the dense layer. Alternatively this may be a single value instead of a list, in which case the same value is used for every layer.

• bias_init_consts (list or float) – the value to initialize the biases in each layer to. The length of this list should equal len(layer_filters)+1, where the final element corresponds to the dense layer. Alternatively this may be a single value instead of a list, in which case the same value is used for every layer.

• dropouts (list or float) – the dropout probability to use for each layer. The length of this list should equal len(layer_filters). Alternatively this may be a single value instead of a list, in which case the same value is used for every layer

• activation_fns (str or list) – the torch activation function to apply to each layer. The length of this list should equal len(layer_filters). Alternatively this may be a single value instead of a list, in which case the same value is used for every layer, ‘relu’ by default

• pool_type (str) – the type of pooling layer to use, either ‘max’ or ‘average’

• mode (str) – Either ‘classification’ or ‘regression’

• n_classes (int) – the number of classes to predict (only used in classification mode)

• uncertainty (bool) – if True, include extra outputs and loss terms to enable the uncertainty in outputs to be predicted

• residual (bool) – if True, the model will be composed of pre-activation residual blocks instead of a simple stack of convolutional layers.

• padding (str, int or tuple) – the padding to use for convolutional layers, either ‘valid’ or ‘same’
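Several of the parameters above accept either a single value or one value per layer. The sketch below shows one way such a value could be expanded. `per_layer` is a hypothetical helper written for this illustration, not a DeepChem function, and it does not cover kernel_size or strides, where a tuple may itself be a single per-dimension setting.

```python
from collections.abc import Sequence


def per_layer(value, n_layers):
    """Expand a single value to one entry per layer, or validate a given list."""
    if isinstance(value, Sequence) and not isinstance(value, str):
        if len(value) != n_layers:
            raise ValueError(f"expected {n_layers} values, got {len(value)}")
        return list(value)
    return [value] * n_layers


layer_filters = [3, 8, 8, 16]
dropouts = per_layer(0.5, len(layer_filters))
activation_fns = per_layer("relu", len(layer_filters))
# weight_init_stddevs has one extra entry for the final dense layer.
weight_init_stddevs = per_layer(0.02, len(layer_filters) + 1)
```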

forward(inputs: Union[torch.Tensor, Sequence[torch.Tensor]]) List[Any][source]
Parameters

inputs (torch.Tensor or Sequence[torch.Tensor]) – Input tensor(s).

Returns

Output as per use case: regression/classification

Return type

List[Any]

class ScaleNorm(scale: float, eps: float = 1e-05)[source]

Apply Scale Normalization to input.

The ScaleNorm layer first computes the square root of the scale, then computes the matrix/vector norm of the input tensor. The norm value is calculated as sqrt(scale) / matrix norm. Finally, the result is returned as input_tensor * norm value.

This layer can be used instead of LayerNorm when a scaled version of the norm is required. Instead of performing the scaling operation (scale / norm) in a lambda-like layer, we are defining it within this layer to make prototyping more efficient.
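The computation described above can be written out in a few lines of NumPy. This is a sketch under assumptions: the norm is taken over the last axis, and eps is used as a lower bound on the norm. Neither detail is stated in this document.

```python
import numpy as np


def scale_norm(x, scale, eps=1e-5):
    """Return x * sqrt(scale) / ||x||, with the norm taken over the last axis."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    # Assumption: eps guards against division by zero for all-zero rows.
    return x * (np.sqrt(scale) / np.maximum(norm, eps))


x = np.array([[1.269, 39.36], [0.00918, -9.12]])
out = scale_norm(x, scale=0.35)
row_norms = np.linalg.norm(out, axis=-1)
```

Under these assumptions every row of the output has norm sqrt(scale), whatever the magnitude of the input row.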

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import ScaleNorm
>>> scale = 0.35
>>> layer = ScaleNorm(scale)
>>> input_tensor = torch.tensor([[1.269, 39.36], [0.00918, -9.12]])
>>> output_tensor = layer(input_tensor)

__init__(scale: float, eps: float = 1e-05)[source]

Initialize a ScaleNorm layer.

Parameters
• scale (float) – Scale magnitude.

• eps (float) – Epsilon value. Default = 1e-5.

forward(x: torch.Tensor) torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class MATEncoderLayer(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, sa_hsize: int = 1024, sa_dropout_p: float = 0.0, output_bias: bool = True, d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, ff_dropout_p: float = 0.0, encoder_hsize: int = 1024, encoder_dropout_p: float = 0.0)[source]

Encoder layer for use in the Molecular Attention Transformer [1]_.

The MATEncoder layer primarily consists of a self-attention layer (MultiHeadedMATAttention) and a feed-forward layer (PositionwiseFeedForward). This layer can be stacked multiple times to form an encoder.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> from rdkit import Chem
>>> import torch
>>> import deepchem
>>> from deepchem.models.torch_models.layers import MATEmbedding, MATEncoderLayer
>>> input_smile = "CC"
>>> feat = deepchem.feat.MATFeaturizer()
>>> out = feat.featurize(input_smile)
>>> node = torch.tensor(out[0].node_features).float().unsqueeze(0)
>>> dist = torch.tensor(out[0].distance_matrix).float().unsqueeze(0)
>>> mask = torch.sum(torch.abs(node), dim=-1) != 0
>>> layer = MATEncoderLayer()
>>> op = MATEmbedding()(node)

__init__(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, sa_hsize: int = 1024, sa_dropout_p: float = 0.0, output_bias: bool = True, d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, ff_dropout_p: float = 0.0, encoder_hsize: int = 1024, encoder_dropout_p: float = 0.0)[source]

Initialize a MATEncoder layer.

Parameters
• dist_kernel (str) – Kernel activation to be used. Can be either ‘softmax’ for softmax or ‘exp’ for exponential, for the self-attention layer.

• lambda_attention (float) – Constant to be multiplied with the attention matrix in the self-attention layer.

• lambda_distance (float) – Constant to be multiplied with the distance matrix in the self-attention layer.

• h (int) – Number of attention heads for the self-attention layer.

• sa_hsize (int) – Size of dense layer in the self-attention layer.

• sa_dropout_p (float) – Dropout probability for the self-attention layer.

• output_bias (bool) – If True, dense layers will use bias vectors in the self-attention layer.

• d_input (int) – Size of input layer in the feed-forward layer.

• d_hidden (int) – Size of hidden layer in the feed-forward layer.

• d_output (int) – Size of output layer in the feed-forward layer.

• activation (str) – Activation function to be used in the feed-forward layer. Can choose between ‘relu’ for ReLU, ‘leakyrelu’ for LeakyReLU, ‘prelu’ for PReLU, ‘tanh’ for TanH, ‘selu’ for SELU, ‘elu’ for ELU and ‘linear’ for linear activation.

• n_layers (int) – Number of layers in the feed-forward layer.

• ff_dropout_p (float) – Dropout probability in the feed-forward layer.

• encoder_hsize (int) – Size of Dense layer for the encoder itself.

• encoder_dropout_p (float) – Dropout probability for connections in the encoder layer.

forward(x: torch.Tensor, mask: torch.Tensor, adj_matrix: torch.Tensor, distance_matrix: torch.Tensor, sa_dropout_p: float = 0.0) torch.Tensor[source]

Output computation for the MATEncoder layer.

In the MATEncoderLayer initialization, self.sublayer is defined as an nn.ModuleList of 2 layers. The computation is passed through these layers sequentially. nn.ModuleList is subscriptable, so each layer can be accessed by index, for example self.sublayer[0].

Parameters
• x (torch.Tensor) – Input tensor.

• mask (torch.Tensor) – Masks out padding values so that they are not taken into account when computing the attention score.

• adj_matrix (torch.Tensor) – Adjacency matrix of a molecule.

• distance_matrix (torch.Tensor) – Distance matrix of a molecule.

• sa_dropout_p (float) – Dropout probability for the self-attention layer (MultiHeadedMATAttention).

class MultiHeadedMATAttention(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, hsize: int = 1024, dropout_p: float = 0.0, output_bias: bool = True)[source]

First constructs an attention layer tailored to the Molecular Attention Transformer [1]_ and then converts it into Multi-Headed Attention.

In multi-headed attention, the attention mechanism is run multiple times in parallel through the multiple attention heads. Thus, different subsequences of a given sequence can be processed differently. The query, key and value parameters are split multiple ways and each split is passed separately through a different attention head.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> from deepchem.models.torch_models.layers import MultiHeadedMATAttention, MATEmbedding
>>> import deepchem as dc
>>> import torch
>>> input_smile = "CC"
>>> feat = dc.feat.MATFeaturizer()
>>> out = feat.featurize(input_smile)
>>> node = torch.tensor(out[0].node_features).float().unsqueeze(0)
>>> dist = torch.tensor(out[0].distance_matrix).float().unsqueeze(0)
>>> mask = torch.sum(torch.abs(node), dim=-1) != 0
>>> layer = MultiHeadedMATAttention(
...    dist_kernel='softmax',
...    lambda_attention=0.33,
...    lambda_distance=0.33,
...    h=16,
...    hsize=1024,
...    dropout_p=0.0)
>>> op = MATEmbedding()(node)

__init__(dist_kernel: str = 'softmax', lambda_attention: float = 0.33, lambda_distance: float = 0.33, h: int = 16, hsize: int = 1024, dropout_p: float = 0.0, output_bias: bool = True)[source]

Initialize a multi-headed attention layer.

Parameters
• dist_kernel (str) – Kernel activation to be used. Can be either ‘softmax’ for softmax or ‘exp’ for exponential.

• lambda_attention (float) – Constant to be multiplied with the attention matrix.

• lambda_distance (float) – Constant to be multiplied with the distance matrix.

• h (int) – Number of attention heads.

• hsize (int) – Size of dense layer.

• dropout_p (float) – Dropout probability.

• output_bias (bool) – If True, dense layers will use bias vectors.

forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, mask: torch.Tensor, adj_matrix: torch.Tensor, distance_matrix: torch.Tensor, dropout_p: float = 0.0, eps: float = 1e-06, inf: float = 1000000000000.0) torch.Tensor[source]

Output computation for the MultiHeadedMATAttention layer.

Parameters
• query (torch.Tensor) – Standard query parameter for attention.

• key (torch.Tensor) – Standard key parameter for attention.

• value (torch.Tensor) – Standard value parameter for attention.

• mask (torch.Tensor) – Masks out padding values so that they are not taken into account when computing the attention score.

• adj_matrix (torch.Tensor) – Adjacency matrix of the input molecule, returned from dc.feat.MATFeaturizer()

• distance_matrix (torch.Tensor) – Distance matrix of the input molecule, returned from dc.feat.MATFeaturizer()

• dropout_p (float) – Dropout probability.

• eps (float) – Epsilon value

• inf (float) – Value of infinity to be used.
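To show how lambda_attention, lambda_distance and dist_kernel interact, here is a single-head NumPy sketch of the weighting scheme from the referenced Molecule Attention Transformer paper. It rests on assumptions not stated in this document: the adjacency term receives the remaining weight 1 - lambda_attention - lambda_distance, and the kernel is applied to the negated distances. Masking, dropout, eps and inf are left out, and `mat_attention_weights` is a hypothetical name.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mat_attention_weights(query, key, adj_matrix, distance_matrix,
                          lambda_attention=0.33, lambda_distance=0.33,
                          dist_kernel="softmax"):
    """Blend self-attention, distance and adjacency terms for one head."""
    d_k = query.shape[-1]
    p_attn = softmax(query @ key.T / np.sqrt(d_k))
    if dist_kernel == "softmax":
        p_dist = softmax(-distance_matrix)
    else:  # 'exp'
        p_dist = np.exp(-distance_matrix)
    # Assumption: the adjacency term takes whatever weight is left over.
    lambda_adjacency = 1.0 - lambda_attention - lambda_distance
    return (lambda_attention * p_attn + lambda_distance * p_dist +
            lambda_adjacency * adj_matrix)


rng = np.random.default_rng(0)
q = k = rng.normal(size=(2, 4))                 # two atoms, 4 features each
adj = np.array([[0.0, 1.0], [1.0, 0.0]])        # made-up adjacency
dist = np.array([[0.0, 1.5], [1.5, 0.0]])       # made-up distances
weights = mat_attention_weights(q, k, adj, dist)
```

The resulting matrix would then multiply the value tensor, as in standard attention.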

class SublayerConnection(size: int, dropout_p: float = 0.0)[source]

SublayerConnection layer which establishes a residual connection, as used in the Molecular Attention Transformer [1]_.

The SublayerConnection layer is a residual layer which is then passed through Layer Normalization. The residual connection is established by computing the dropout-adjusted layer output of a normalized tensor and adding this to the original input tensor.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import SublayerConnection
>>> layer = SublayerConnection(2, 0.)
>>> input_ar = torch.tensor([[1., 2.], [5., 6.]])
>>> output = layer(input_ar, input_ar)

__init__(size: int, dropout_p: float = 0.0)[source]

Initialize a SublayerConnection Layer.

Parameters
• size (int) – Size of layer.

• dropout_p (float) – Dropout probability.

forward(x: torch.Tensor, output: torch.Tensor) torch.Tensor[source]

Output computation for the SublayerConnection layer.

Takes an input tensor x, then adds the dropout-adjusted sublayer output for normalized x to it. This is done to add a residual connection followed by LayerNorm.

Parameters
• x (torch.Tensor) – Input tensor.

• output (torch.Tensor) – Layer whose normalized output will be added to x.
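The residual computation described above can be sketched in NumPy as x plus the dropout-adjusted, layer-normalized output. This is a minimal sketch under assumptions: the normalization has no learnable scale or shift, dropout uses the usual inverted scaling, and the function names are made up for this illustration.

```python
import numpy as np


def layer_norm(t, eps=1e-5):
    """Normalise over the last axis (no learnable scale or shift)."""
    mean = t.mean(axis=-1, keepdims=True)
    var = t.var(axis=-1, keepdims=True)
    return (t - mean) / np.sqrt(var + eps)


def sublayer_connection(x, output, dropout_p=0.0, rng=None):
    normed = layer_norm(output)
    if dropout_p > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(normed.shape) >= dropout_p
        normed = normed * keep / (1.0 - dropout_p)  # inverted dropout
    return x + normed  # residual connection


x = np.array([[1.0, 2.0], [5.0, 6.0]])
result = sublayer_connection(x, x)
```

With dropout disabled, each row of `output` is normalized to roughly (-1, 1) here, so the result is close to x shifted by those values.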

class PositionwiseFeedForward(d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, dropout_p: float = 0.0, dropout_at_input_no_act: bool = False)[source]

PositionwiseFeedForward is a layer used to define the position-wise feed-forward (FFN) algorithm for the Molecular Attention Transformer [1]_

Each layer in the MAT encoder contains a fully connected feed-forward network which applies two linear transformations and the given activation function. This is done in addition to the SublayerConnection module.

Note: This modified version of the PositionwiseFeedForward class contains a dropout_at_input_no_act condition to facilitate its use in defining the feed-forward (FFN) algorithm for the Directed Message Passing Neural Network (D-MPNN) [2]_

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

2

Analyzing Learned Molecular Representations for Property Prediction https://arxiv.org/pdf/1904.01561.pdf

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import PositionwiseFeedForward
>>> feed_fwd_layer = PositionwiseFeedForward(d_input = 2, d_hidden = 2, d_output = 2, activation = 'relu', n_layers = 1, dropout_p = 0.1)
>>> input_tensor = torch.tensor([[1., 2.], [5., 6.]])
>>> output_tensor = feed_fwd_layer(input_tensor)

__init__(d_input: int = 1024, d_hidden: int = 1024, d_output: int = 1024, activation: str = 'leakyrelu', n_layers: int = 1, dropout_p: float = 0.0, dropout_at_input_no_act: bool = False)[source]

Initialize a PositionwiseFeedForward layer.

Parameters
• d_input (int) – Size of input layer.

• d_hidden (int (same as d_input if d_output = 0)) – Size of hidden layer.

• d_output (int (same as d_input if d_output = 0)) – Size of output layer.

• activation (str) – Activation function to be used. Can choose between ‘relu’ for ReLU, ‘leakyrelu’ for LeakyReLU, ‘prelu’ for PReLU, ‘tanh’ for TanH, ‘selu’ for SELU, ‘elu’ for ELU and ‘linear’ for linear activation.

• n_layers (int) – Number of layers.

• dropout_p (float) – Dropout probability.

• dropout_at_input_no_act (bool) – If true, dropout is applied on the input tensor. For single layer, it is not passed to an activation function.

forward(x: torch.Tensor) torch.Tensor[source]

Output Computation for the PositionwiseFeedForward layer.

Parameters

x (torch.Tensor) – Input tensor.
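A position-wise feed-forward network applies the same two linear transformations to every position independently. The NumPy sketch below illustrates that property with ReLU and random weights. The weights and the function name are assumptions for this example, and the exact ordering of dropout and activation in DeepChem's layer is not reproduced.

```python
import numpy as np


def positionwise_ffn(x, w1, b1, w2, b2):
    """Two linear maps with a ReLU in between, applied to every position alike."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2


rng = np.random.default_rng(0)
d_input, d_hidden, d_output = 2, 4, 2
w1, b1 = rng.normal(size=(d_input, d_hidden)), np.zeros(d_hidden)
w2, b2 = rng.normal(size=(d_hidden, d_output)), np.zeros(d_output)

x = np.array([[1.0, 2.0], [5.0, 6.0]])   # two positions, d_input features each
out = positionwise_ffn(x, w1, b1, w2, b2)
# Processing one position on its own gives the same row of the output.
row0 = positionwise_ffn(x[:1], w1, b1, w2, b2)
```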

class MATEmbedding(d_input: int = 36, d_output: int = 1024, dropout_p: float = 0.0)[source]

Embedding layer to create embedding for inputs.

In an embedding layer, input is taken and converted to a vector representation for each input. In the MATEmbedding layer, an input tensor is processed through a dropout-adjusted linear layer and the resultant vector is returned.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import MATEmbedding
>>> layer = MATEmbedding(d_input = 3, d_output = 3, dropout_p = 0.2)
>>> input_tensor = torch.tensor([1., 2., 3.])
>>> output = layer(input_tensor)

__init__(d_input: int = 36, d_output: int = 1024, dropout_p: float = 0.0)[source]

Initialize a MATEmbedding layer.

Parameters
• d_input (int) – Size of input layer.

• d_output (int) – Size of output layer.

• dropout_p (float) – Dropout probability for layer.

forward(x: torch.Tensor) torch.Tensor[source]

Computation for the MATEmbedding layer.

Parameters

x (torch.Tensor) – Input tensor to be converted into a vector.

class MATGenerator(hsize: int = 1024, aggregation_type: str = 'mean', d_output: int = 1, n_layers: int = 1, dropout_p: float = 0.0, attn_hidden: int = 128, attn_out: int = 4)[source]

MATGenerator defines the linear and softmax generator step for the Molecular Attention Transformer [1]_.

In the MATGenerator, a Generator is defined which performs the Linear + Softmax generation step. Depending on the type of aggregation selected, the attention output layer performs different operations.

References

1

Lukasz Maziarka et al. “Molecule Attention Transformer” Graph Representation Learning workshop and Machine Learning and the Physical Sciences workshop at NeurIPS 2019. 2020. https://arxiv.org/abs/2002.08264

Examples

>>> import torch
>>> from deepchem.models.torch_models.layers import MATGenerator
>>> layer = MATGenerator(hsize = 3, aggregation_type = 'mean', d_output = 1, n_layers = 1, dropout_p = 0.3, attn_hidden = 128, attn_out = 4)
>>> input_tensor = torch.tensor([1., 2., 3.])
>>> mask = torch.tensor([1., 1., 1.])

__init__(hsize: int = 1024, aggregation_type: str = 'mean', d_output: int = 1, n_layers: int = 1, dropout_p: float = 0.0, attn_hidden: int = 128, attn_out: int = 4)[source]

Initialize a MATGenerator.

Parameters
• hsize (int) – Size of input layer.

• aggregation_type (str) – Type of aggregation to be used. Can be ‘grover’, ‘mean’ or ‘contextual’.

• d_output (int) – Size of output layer.

• n_layers (int) – Number of layers in MATGenerator.

• dropout_p (float) – Dropout probability for layer.

• attn_hidden (int) – Size of hidden attention layer.

• attn_out (int) – Size of output attention layer.

Computation for the MATGenerator layer.

Parameters
• x (torch.Tensor) – Input tensor.
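For the ‘mean’ aggregation type, one plausible reading is a masked average over atoms, so that padded positions do not dilute the result. The sketch below illustrates that idea only. The tensor layout, the mask convention (1 for real atoms) and the name `masked_mean` are assumptions, not taken from the MATGenerator source.

```python
import numpy as np


def masked_mean(x, mask):
    """Average atom vectors over the atom axis, ignoring padded positions.

    x: (batch, n_atoms, hsize); mask: (batch, n_atoms) with 1 for real atoms.
    """
    mask = mask[..., None].astype(x.dtype)
    return (x * mask).sum(axis=1) / mask.sum(axis=1)


x = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])  # last atom is padding
mask = np.array([[1, 1, 0]])
pooled = masked_mean(x, mask)
```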

cosine_dist(x, y)[source]

Computes the inner product (cosine similarity) between two tensors.

This assumes that the two input tensors contain rows of vectors where each column represents a different feature. The output tensor will have elements that represent the inner product between pairs of normalized vectors in the rows of x and y. The two tensors need to have the same number of columns, because one cannot take the dot product between vectors of different lengths. For example, in sentence similarity and sentence classification tasks, the number of columns is the embedding size. In these tasks, the rows of the input tensors would be different test vectors or sentences. The input tensors themselves could be different batches. Using vectors or tensors of all 0s should be avoided.

The vectors in the input tensors are first l2-normalized such that each vector has length or magnitude of 1. The inner product (dot product) is then taken between corresponding pairs of row vectors in the input tensors and returned.

Examples

The cosine similarity between two equivalent vectors will be 1. The cosine similarity between two equivalent tensors (tensors where all the elements are the same) will be a tensor of 1s. In this scenario, if the input tensors x and y are each of shape (n,p), where each element in x and y is the same, then the output tensor would be a tensor of shape (n,n) with 1 in every entry.

>>> import numpy as np
>>> import tensorflow as tf
>>> import deepchem.models.layers as layers
>>> x = tf.ones((6, 4), dtype=tf.dtypes.float32, name=None)
>>> y_same = tf.ones((6, 4), dtype=tf.dtypes.float32, name=None)
>>> cos_sim_same = layers.cosine_dist(x,y_same)


x and y_same are the same tensor (equivalent at every element, in this case 1). As such, the pairwise inner product of the rows in x and y will always be 1. The output tensor will be of shape (6,6).

>>> diff = cos_sim_same - tf.ones((6, 6), dtype=tf.dtypes.float32, name=None)
>>> np.allclose(0.0, tf.reduce_sum(diff).numpy(), atol=1e-05)
True
>>> cos_sim_same.shape
TensorShape([6, 6])


The cosine similarity between two orthogonal vectors will be 0 (by definition). If every row in x is orthogonal to every row in y, then the output will be a tensor of 0s. In the following example, each row in the tensor x1 is orthogonal to each row in x2 because they are halves of an identity matrix.

>>> identity_tensor = tf.eye(512, dtype=tf.dtypes.float32)
>>> x1 = identity_tensor[0:256,:]
>>> x2 = identity_tensor[256:512,:]
>>> cos_sim_orth = layers.cosine_dist(x1,x2)


Each row in x1 is orthogonal to each row in x2. As such, the pairwise inner product of the rows in x1 and x2 will always be 0. Furthermore, because both input tensors have shape (256,512), the output tensor will be of shape (256,256).

>>> np.allclose(0.0, tf.reduce_sum(cos_sim_orth).numpy(), atol=1e-05)
True
>>> cos_sim_orth.shape
TensorShape([256, 256])

Parameters
• x (tf.Tensor) – Input Tensor of shape (n, p). The shape of this input tensor should be n rows by p columns. Note that n need not equal m (the number of rows in y).

• y (tf.Tensor) – Input Tensor of shape (m, p). The shape of this input tensor should be m rows by p columns. Note that m need not equal n (the number of rows in x).

Returns

Returns a tensor of shape (n, m), that is, n rows by m columns. Each i,j-th entry of this output tensor is the inner product between the l2-normalized i-th row of the input tensor x and the l2-normalized j-th row of the input tensor y.

Return type

tf.Tensor
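
The return value described above can be reproduced by hand. The following is a minimal pure-Python sketch, not the library's implementation: the helper names `l2_normalize` and `pairwise_cosine` are hypothetical, and rows with zero norm are not handled.

```python
import math


def l2_normalize(row):
    """Scale a row vector to unit Euclidean length (assumes a non-zero row)."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]


def pairwise_cosine(x, y):
    """Return an (n, m) nested list: entry [i][j] is the inner product of the
    l2-normalized i-th row of x and the l2-normalized j-th row of y."""
    xn = [l2_normalize(r) for r in x]
    yn = [l2_normalize(r) for r in y]
    return [[sum(a * b for a, b in zip(rx, ry)) for ry in yn] for rx in xn]


x = [[1.0, 1.0, 1.0, 1.0]] * 3   # n = 3 rows, p = 4 columns
y = [[2.0, 2.0, 2.0, 2.0]] * 2   # m = 2 rows, parallel to the rows of x
sim = pairwise_cosine(x, y)      # nested list with n rows and m columns
orth = pairwise_cosine([[1.0, 0.0]], [[0.0, 1.0]])  # orthogonal rows
```

Note that n and m differ here, which illustrates that the two inputs only need to agree on the number of columns p.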

class GraphNetwork(n_node_features: int = 32, n_edge_features: int = 32, n_global_features: int = 32, is_undirected: bool = True, residual_connection: bool = True)[source]

Graph Networks

A Graph Network [1]_ takes a graph as input and returns an updated graph as output. The output graph has the same structure as the input graph, but with updated node features, edge features and global state features.

Parameters
• n_node_features (int) – Number of features in a node

• n_edge_features (int) – Number of features in an edge

• n_global_features (int) – Number of global features

• is_undirected (bool, optional (default True)) – Directed or undirected graph

• residual_connection (bool, optional (default True)) – If True, the layer uses a residual connection during training

Example

>>> import torch
>>> from deepchem.models.torch_models.layers import GraphNetwork as GN
>>> n_nodes, n_node_features = 5, 10
>>> n_edges, n_edge_features = 5, 2
>>> n_global_features = 4
>>> node_features = torch.randn(n_nodes, n_node_features)
>>> edge_features = torch.randn(n_edges, n_edge_features)
>>> edge_index = torch.tensor([[0, 1, 2, 3, 4], [1, 2, 3, 4, 0]]).long()
>>> global_features = torch.randn(1, n_global_features)
>>> gn = GN(n_node_features=n_node_features, n_edge_features=n_edge_features, n_global_features=n_global_features)
>>> node_features, edge_features, global_features = gn(node_features, edge_index, edge_features, global_features)


References

1

Battaglia et al, Relational inductive biases, deep learning, and graph networks. https://arxiv.org/abs/1806.01261 (2018)

__init__(n_node_features: int = 32, n_edge_features: int = 32, n_global_features: int = 32, is_undirected: bool = True, residual_connection: bool = True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(node_features: torch.Tensor, edge_index: torch.Tensor, edge_features: torch.Tensor, global_features: torch.Tensor, batch: Optional[torch.Tensor] = None) Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]

Output computation for a GraphNetwork

Parameters
• node_features (torch.Tensor) – Input node features of shape $$(|\mathcal{V}|, F_n)$$

• edge_index (torch.Tensor) – Edge indexes of shape $$(2, |\mathcal{E}|)$$

• edge_features (torch.Tensor) – Edge features of the graph, shape: $$(|\mathcal{E}|, F_e)$$

• global_features (torch.Tensor) – Global features of the graph, shape: $$(F_g, 1)$$, where $$|\mathcal{V}|$$ and $$|\mathcal{E}|$$ denote the number of nodes and edges in the graph, and $$F_n$$, $$F_e$$, $$F_g$$ denote the number of node features, edge features and global state features respectively.

• batch (torch.LongTensor (optional, default: None)) – A vector that maps each node to its respective graph identifier. The attribute is used only when more than one graph is batched together during a single forward pass.
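
To make the role of a batch vector concrete, here is a small pure-Python sketch of the convention in which entry i holds the graph identifier of node i. It is only an illustration: the helper `sum_nodes_per_graph` is hypothetical, and how GraphNetwork consumes the vector internally is not shown here.

```python
def sum_nodes_per_graph(node_values, batch):
    """Sum one scalar per node into one total per graph.

    batch[i] is the identifier of the graph that node i belongs to.
    """
    n_graphs = max(batch) + 1
    totals = [0.0] * n_graphs
    for value, graph_id in zip(node_values, batch):
        totals[graph_id] += value
    return totals


# Two graphs batched together: nodes 0-2 belong to graph 0, nodes 3-4 to graph 1.
batch = [0, 0, 0, 1, 1]
node_values = [1.0, 2.0, 3.0, 10.0, 20.0]
totals = sum_nodes_per_graph(node_values, batch)
```

With a single graph, every entry of the vector would be 0, which is why the argument can be omitted in that case.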

class Affine(dim: int)[source]

Class which performs the Affine transformation.

This transformation is based on the affinity of the base distribution with the target distribution. A geometric transformation is applied in which the parameters change the scale and shift of a function (inputs).

Normalizing Flow transformations must be bijective in order to compute the logarithm of the Jacobian's determinant. For this reason, transformations must provide both a forward and an inverse pass.

Example

>>> import deepchem as dc
>>> from deepchem.models.torch_models.layers import Affine
>>> import torch
>>> from torch.distributions import MultivariateNormal
>>> # initialize the transformation layer's parameters
>>> dim = 2
>>> samples = 96
>>> transforms = Affine(dim)
>>> # forward pass based on a given distribution
>>> distribution = MultivariateNormal(torch.zeros(dim), torch.eye(dim))
>>> input = distribution.sample(torch.Size((samples, dim)))
>>> len(transforms.forward(input))
2
>>> # inverse pass based on a distribution
>>> len(transforms.inverse(input))
2

__init__(dim: int) None[source]

Create an Affine transform layer.

Parameters

dim (int) – Value of the Nth dimension of the dataset.

forward(x: Sequence) Tuple[torch.Tensor, torch.Tensor][source]

Performs a transformation between two different distributions. This particular transformation represents the following function: y = x * exp(a) + b, where a is the scale parameter and b performs a shift. This method also returns the logarithm of the Jacobian's determinant, which is useful when inverting a transformation and computing the probability of the transformation.

Parameters

x (Sequence) – Tensor sample with the initial distribution data which will pass into the normalizing flow algorithm.

Returns

• y (torch.Tensor) – Transformed tensor according to Affine layer with the shape of ‘x’.

• log_det_jacobian (torch.Tensor) – Tensor which represents the info about the deviation of the initial and target distribution.

inverse(y: Sequence) Tuple[torch.Tensor, torch.Tensor][source]

Performs a transformation between two different distributions. This transformation represents the backward pass of the function mentioned before. Its mathematical representation is x = (y - b) / exp(a), where “a” is the scale parameter and “b” performs a shift. This method also returns the logarithm of the Jacobian's determinant, which is useful when inverting a transformation and computing the probability of the transformation.

Parameters

y (Sequence) – Tensor sample with transformed distribution data which will be used in the normalizing algorithm inverse pass.

Returns

• x (torch.Tensor) – Transformed tensor according to Affine layer with the shape of ‘y’.

• inverse_log_det_jacobian (torch.Tensor) – Tensor which represents the information of the deviation of the initial and target distribution.
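
The two formulas above, y = x * exp(a) + b and x = (y - b) / exp(a), can be checked against each other with a short pure-Python sketch. This is not the layer's implementation: the function names are hypothetical, and the log-determinant shown (the sum of the scale parameters, since the Jacobian of an elementwise scaling is diagonal with entries exp(a_i)) is the standard mathematical result rather than a statement about the exact tensor the layer returns.

```python
import math


def affine_forward(x, a, b):
    """y_i = x_i * exp(a_i) + b_i, with log|det J| = sum(a_i)."""
    y = [xi * math.exp(ai) + bi for xi, ai, bi in zip(x, a, b)]
    log_det_jacobian = sum(a)
    return y, log_det_jacobian


def affine_inverse(y, a, b):
    """x_i = (y_i - b_i) / exp(a_i), with log|det J^-1| = -sum(a_i)."""
    x = [(yi - bi) / math.exp(ai) for yi, ai, bi in zip(y, a, b)]
    inverse_log_det_jacobian = -sum(a)
    return x, inverse_log_det_jacobian


x = [0.5, -1.0]
a = [0.1, -0.3]   # scale parameters (illustrative values)
b = [1.0, 2.0]    # shift parameters (illustrative values)
y, log_det = affine_forward(x, a, b)
x_back, inv_log_det = affine_inverse(y, a, b)
```

Applying the inverse to the forward output recovers x, and the two log-determinants cancel, which is the bijectivity property the class description relies on.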

Real NVP Transformation Layer

This class is a transformation layer used in a NormalizingFlow model. The Real Non-Volume Preserving (Real NVP) layer is a type of normalizing flow layer whose main advantage is that its inverse pass is easy to compute [1]_, which is useful when learning a target distribution.

Example

>>> import torch
>>> import torch.nn as nn
>>> import torch.nn.functional as F
>>> from torch.distributions import MultivariateNormal
>>> from deepchem.models.torch_models.layers import RealNVPLayer
>>> dim = 2
>>> samples = 96
>>> data = MultivariateNormal(torch.zeros(dim), torch.eye(dim))
>>> tensor = data.sample(torch.Size((samples, dim)))

>>> n_layers = 4
>>> hidden_size = 16
>>> # alternating one-hot masks, one per layer
>>> masks = F.one_hot(torch.tensor([i % 2 for i in range(n_layers)])).float()
>>> layers = nn.ModuleList([RealNVPLayer(mask, hidden_size) for mask in masks])

>>> for layer in layers:
...   _, inverse_log_det_jacobian = layer.inverse(tensor)
...   inverse_log_det_jacobian = inverse_log_det_jacobian.detach().numpy()
>>> len(inverse_log_det_jacobian)
96


References

1

Stimper, V., Schölkopf, B., & Hernández-Lobato, J. M. (2021). Resampling Base

Parameters
• mask (torch.Tensor) – Tensor of zeros and ones whose size depends on the number of layers and dimensions the user requests.

• hidden_size (int) – The size of the outputs and inputs used on the internal nodes of the transformation layer.

forward(x: Sequence) Tuple[torch.Tensor, torch.Tensor][source]

Forward pass.

This particular transformation is represented by the following function: y = x + (1 - x) * exp( s(x)) + t(x), where t and s each need an activation function. This method also returns the logarithm of the Jacobian's determinant, which is useful when inverting a transformation and computing the probability of the transformation.

Parameters

x (Sequence) – Tensor sample with the initial distribution data which will pass into the normalizing algorithm

Returns

• y (torch.Tensor) – Transformed tensor according to Real NVP layer with the shape of ‘x’.

• log_det_jacobian (torch.Tensor) – Tensor which represents the info about the deviation of the initial and target distribution.

inverse(y: Sequence) Tuple[torch.Tensor, torch.Tensor][source]

Inverse pass

This method performs the inverse of the previous method (forward). It also returns the logarithm of the Jacobian's determinant, which is useful for computing the learnable features of the target distribution.

Parameters

y (Sequence) – Tensor sample with transformed distribution data which will be used in the normalizing algorithm inverse pass.

Returns

• x (torch.Tensor) – Transformed tensor according to Real NVP layer with the shape of ‘y’.

• inverse_log_det_jacobian (torch.Tensor) – Tensor which represents the information of the deviation of the initial and target distribution.
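
For intuition about why the inverse pass is cheap, the sketch below implements a masked affine coupling step in the form commonly published for Real NVP, which may differ in detail from the expression given for this layer. Everything here is an assumption made for illustration: the functions `s` and `t` are toy stand-ins for the learned scale and translation networks, and the function names are hypothetical.

```python
import math


def s(v):
    """Toy scale network (assumption): every output depends on all inputs."""
    return [math.tanh(sum(v))] * len(v)


def t(v):
    """Toy translation network (assumption)."""
    return [0.5 * sum(v)] * len(v)


def coupling_forward(x, mask):
    """Masked entries pass through; the others are scaled and shifted."""
    xm = [m * xi for m, xi in zip(mask, x)]
    sv, tv = s(xm), t(xm)
    y = [xmi + (1 - m) * (xi * math.exp(si) + ti)
         for xmi, m, xi, si, ti in zip(xm, mask, x, sv, tv)]
    log_det = sum((1 - m) * si for m, si in zip(mask, sv))
    return y, log_det


def coupling_inverse(y, mask):
    """Undo the forward step; s and t are evaluated, never inverted."""
    ym = [m * yi for m, yi in zip(mask, y)]
    sv, tv = s(ym), t(ym)
    x = [ymi + (1 - m) * (yi - ti) * math.exp(-si)
         for ymi, m, yi, si, ti in zip(ym, mask, y, sv, tv)]
    inv_log_det = -sum((1 - m) * si for m, si in zip(mask, sv))
    return x, inv_log_det


x = [0.3, -1.2]
mask = [1.0, 0.0]   # first entry is passed through unchanged
y, log_det = coupling_forward(x, mask)
x_back, inv_log_det = coupling_inverse(y, mask)
```

Because the masked entries are unchanged by the forward step, the inverse can recompute exactly the same scale and translation values, so no network ever has to be inverted.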

class DMPNNEncoderLayer(use_default_fdim: bool = True, atom_fdim: int = 133, bond_fdim: int = 14, d_hidden: int = 300, depth: int = 3, bias: bool = False, activation: str = 'relu', dropout_p: float = 0.0, aggregation: str = 'mean', aggregation_norm: Union[int, float] = 100)[source]

Encoder layer for use in the Directed Message Passing Neural Network (D-MPNN) [1]_.

The role of the DMPNNEncoderLayer class is to generate molecule encodings in the following steps:

• Message passing phase

• Get new atom hidden states and readout phase

• Concatenate the global features

Let the diagram below represent a molecule containing 5 atoms (nodes) and 4 bonds (edges):

1 — 5
|
2 — 4
|
3

Let the bonds from atoms 1->2 (B[12]) and 2->1 (B[21]) be considered as 2 different bonds. Hence, by considering the same for all atoms, the total number of bonds = 8.

Let:

• atom features : a1, a2, a3, a4, a5

• hidden states of atoms : h1, h2, h3, h4, h5

• bond features : b12, b21, b23, b32, b24, b42, b15, b51

• initial hidden states of bonds : (0)h12, (0)h21, (0)h23, (0)h32, (0)h24, (0)h42, (0)h15, (0)h51

The initial hidden state of every bond is a function of a concatenated feature vector, formed from the features of the bond's initial atom and the features of the bond itself.

Example: (0)h21 = func1(concat(a2, b21))

Note

Here func1 is self.W_i

The Message passing phase

The goal of the message-passing phase is to generate hidden states of all the atoms in the molecule.

The hidden state of an atom is a function of concatenation of atom features and messages (at T depth).

A message is a sum of hidden states of bonds coming to the atom (at T depth).

Note

Depth refers to the number of iterations in the message passing phase (here, T iterations). After each iteration, the hidden states of the bonds are updated.

Example

h1 = func3(concat(a1, m1))

Note

Here func3 is self.W_o.

m1 refers to the message coming to the atom.

m1 = (T-1)h21 + (T-1)h51 (hidden state of bond 2->1 + hidden state of bond 5->1) (at T depth)

For depth T = 2:

• the hidden states of the bonds @ 1st iteration will be => (0)h21, (0)h51

• the hidden states of the bonds @ 2nd iteration will be => (1)h21, (1)h51

The hidden states of the bonds in the 1st iteration are already known. For the hidden states of the bonds in the 2nd iteration, we follow the criterion that:

• hidden state of the bond is a function of initial hidden state of bond and messages coming to that bond in that iteration

Example

(1)h21 = func2( (0)h21 , (1)m21 )

Note

Here func2 is self.W_h.

(1)m21 refers to the messages coming to that bond 2->1 in that 2nd iteration.

The message coming to a bond in an iteration is the sum of the hidden states (from the previous iteration) of the bonds coming into this bond.

Example

(1)m21 = (0)h32 + (0)h42

2 <— 3
^
4

Computing the messages

                       B0      B1      B2      B3      B4      B5      B6      B7      B8
f_ini_atoms_bonds = [(0)h12, (0)h21, (0)h23, (0)h32, (0)h24, (0)h42, (0)h15, (0)h51, h(-1)]


Note

h(-1) is an empty array of the same size as the other bond hidden states.

              B0      B1      B2      B3      B4      B5      B6      B7       B8
mapping = [ [-1,B7] [B3,B5] [B0,B5] [-1,-1] [B0,B3] [-1,-1] [B1,-1] [-1,-1]  [-1,-1] ]


Later, the encoder will map the concatenated features from f_ini_atoms_bonds to mapping in each iteration, up to the T-th iteration.

Next, the encoder will sum up the concatenated features within the same bond index.

                (1)m12           (1)m21           (1)m23              (1)m32          (1)m24           (1)m42           (1)m15          (1)m51            m(-1)
message = [ [h(-1) + (0)h51] [(0)h32 + (0)h42] [(0)h12 + (0)h42] [h(-1) + h(-1)] [(0)h12 + (0)h32] [h(-1) + h(-1)] [(0)h21 + h(-1)] [h(-1) + h(-1)]  [h(-1) + h(-1)] ]


This is how the encoder obtains the messages for the message-passing steps.
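
The mapping and message tables above can be rebuilt from the bond list of the 5-atom example. The following pure-Python sketch is illustrative only: it uses scalar stand-ins for the bond hidden states and plain unpadded lists where the tables use -1 and h(-1) as padding, and the helper `incoming_excluding_reverse` is a hypothetical name.

```python
# Directed bonds B0..B7 as (initial_atom, final_atom), in the order used above.
bonds = [(1, 2), (2, 1), (2, 3), (3, 2), (2, 4), (4, 2), (1, 5), (5, 1)]


def incoming_excluding_reverse(bond_index):
    """Indices of the bonds that arrive at the initial atom of the given bond,
    leaving out the bond's own reverse."""
    src, dst = bonds[bond_index]
    return [i for i, (u, v) in enumerate(bonds) if v == src and u != dst]


mapping = [incoming_excluding_reverse(b) for b in range(len(bonds))]

# Scalar stand-ins for the initial bond hidden states (0)h12, (0)h21, ...
h0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# The message for each bond is the sum of the hidden states it maps to.
messages = [sum(h0[i] for i in incoming) for incoming in mapping]
```

For bond B1 (2->1) the mapping is [3, 5], so its message is the sum of the hidden states of B3 (3->2) and B5 (4->2), matching (1)m21 = (0)h32 + (0)h42.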

Get new atom hidden states and readout phase

Hence now for h1:

h1 = func3(
concat(
a1,
[
func2( (0)h21 , (0)h32 + (0)h42 ) +
func2( (0)h51 , 0               )
]
)
)


Similarly, h2, h3, h4 and h5 are calculated.

Next, all atom hidden states are concatenated to make a feature vector of the molecule:

mol_encodings = [[h1, h2, h3, h4, h5]]

Concatenate the global features

Let, global_features = [[gf1, gf2, gf3]]

This array contains molecule level features. In case of this example, it contains 3 global features.

Hence after concatenation,

mol_encodings = [[h1, h2, h3, h4, h5, gf1, gf2, gf3]]

(Final output of the encoder)

References

1

Analyzing Learned Molecular Representations for Property Prediction https://arxiv.org/pdf/1904.01561.pdf

Examples

>>> from rdkit import Chem
>>> import torch
>>> import deepchem as dc
>>> from deepchem.models.torch_models.layers import DMPNNEncoderLayer
>>> input_smile = "CC"
>>> feat = dc.feat.DMPNNFeaturizer(features_generators=['morgan'])
>>> graph = feat.featurize(input_smile)
>>> from deepchem.models.torch_models.dmpnn import _MapperDMPNN
>>> mapper = _MapperDMPNN(graph[0])
>>> atom_features, f_ini_atoms_bonds, atom_to_incoming_bonds, mapping, global_features = mapper.values
>>> atom_features = torch.from_numpy(atom_features).float()
>>> f_ini_atoms_bonds = torch.from_numpy(f_ini_atoms_bonds).float()
>>> atom_to_incoming_bonds = torch.from_numpy(atom_to_incoming_bonds)
>>> mapping = torch.from_numpy(mapping)
>>> global_features = torch.from_numpy(global_features).float()
>>> layer = DMPNNEncoderLayer(d_hidden=2)
>>> output = layer(atom_features, f_ini_atoms_bonds, atom_to_incoming_bonds, mapping, global_features)

__init__(use_default_fdim: bool = True, atom_fdim: int = 133, bond_fdim: int = 14, d_hidden: int = 300, depth: int = 3, bias: bool = False, activation: str = 'relu', dropout_p: float = 0.0, aggregation: str = 'mean', aggregation_norm: Union[int, float] = 100)[source]

Initialize a DMPNNEncoderLayer layer.

Parameters
• use_default_fdim (bool) – If True, self.atom_fdim and self.bond_fdim are initialized using values from the GraphConvConstants class. If False, self.atom_fdim and self.bond_fdim are initialized from the values provided.

• atom_fdim (int) – Dimension of atom feature vector.

• bond_fdim (int) – Dimension of bond feature vector.

• d_hidden (int) – Size of hidden layer in the encoder layer.

• depth (int) – Number of message passing steps.

• bias (bool) – If True, dense layers will use bias vectors.

• activation (str) – Activation function to be used in the encoder layer. Can choose between ‘relu’ for ReLU, ‘leakyrelu’ for LeakyReLU, ‘prelu’ for PReLU, ‘tanh’ for TanH, ‘selu’ for SELU, and ‘elu’ for ELU.

• dropout_p (float) – Dropout probability for the encoder layer.

• aggregation (str) – Aggregation type to be used in the encoder layer. Can choose between ‘mean’, ‘sum’, and ‘norm’.

• aggregation_norm (Union[int, float]) – Value required if aggregation type is ‘norm’.
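
As a rough guide to how the three aggregation options might differ, the sketch below combines per-atom hidden vectors into a single molecule vector. It is an assumption-laden illustration rather than the layer's code: the `aggregate` helper is hypothetical, and the reading of 'norm' as "sum divided by aggregation_norm" is inferred from the parameter description.

```python
def aggregate(atom_hidden, aggregation="mean", aggregation_norm=100):
    """Combine per-atom hidden vectors into one molecule-level vector."""
    summed = [sum(column) for column in zip(*atom_hidden)]
    if aggregation == "sum":
        return summed
    if aggregation == "mean":
        return [v / len(atom_hidden) for v in summed]
    if aggregation == "norm":
        # Assumed meaning: divide the sum by a fixed constant.
        return [v / aggregation_norm for v in summed]
    raise ValueError(f"unknown aggregation: {aggregation}")


atom_hidden = [[1.0, 2.0], [3.0, 4.0]]   # two atoms, hidden size 2
by_sum = aggregate(atom_hidden, "sum")
by_mean = aggregate(atom_hidden, "mean")
by_norm = aggregate(atom_hidden, "norm", aggregation_norm=4)
```

Under this reading, 'mean' scales with the number of atoms in each molecule, while 'norm' uses the same divisor for every molecule.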

forward(atom_features: torch.Tensor, f_ini_atoms_bonds: torch.Tensor, atom_to_incoming_bonds: torch.Tensor, mapping: torch.Tensor, global_features: torch.Tensor) torch.Tensor[source]

Output computation for the DMPNNEncoderLayer.

Steps:

• Get original bond hidden states from concatenation of initial atom and bond features. (input)

• Get initial messages hidden states. (message)

• Execute message passing step for self.depth - 1 iterations.

• Get atom hidden states using atom features and message hidden states.

• Get molecule encodings.

• Concatenate global molecular features and molecule encodings.
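
The message passing step in the list above can be sketched as a loop over self.depth - 1 iterations. This is a simplified pure-Python illustration with scalar hidden states. The update rule relu(h0 + w_h * m) follows the form given in the D-MPNN paper and is an assumption about this layer, the scalar `w_h` stands in for self.W_h, and dropout and the configurable activation are omitted.

```python
def relu(v):
    return max(0.0, v)


def message_passing(h0, mapping, depth, w_h):
    """Update bond hidden states for depth - 1 iterations.

    mapping[b] lists the bonds arriving at the initial atom of bond b,
    excluding the reverse of b.
    """
    h = list(h0)
    for _ in range(depth - 1):
        # Message for each bond: sum of the incoming bonds' hidden states.
        m = [sum(h[i] for i in incoming) for incoming in mapping]
        # New hidden state from the initial hidden state and the message.
        h = [relu(h0_b + w_h * m_b) for h0_b, m_b in zip(h0, m)]
    return h


# Mapping for the 5-atom example molecule (bonds B0..B7, no padding entries).
mapping = [[7], [3, 5], [0, 5], [], [0, 3], [], [1], []]
h0 = [1.0] * 8
h_depth1 = message_passing(h0, mapping, depth=1, w_h=0.5)
h_depth2 = message_passing(h0, mapping, depth=2, w_h=0.5)
```

With depth 1 no update happens and the initial hidden states are returned unchanged, which is consistent with the step running for depth - 1 iterations.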

Parameters
• atom_features (torch.Tensor) – Tensor containing atoms features.

• f_ini_atoms_bonds (torch.Tensor) – Tensor containing concatenated feature vector which contains concatenation of initial atom and bond features.

• atom_to_incoming_bonds (torch.Tensor) – Tensor containing mapping from atom index to list of indices of incoming bonds.

• mapping (torch.Tensor) – Tensor containing the mapping that maps bond index to ‘array of indices of the bonds’ incoming at the initial atom of the bond (excluding the reverse bonds).

• global_features (torch.Tensor) – Tensor containing molecule features.

Returns

output – Tensor containing the encodings of the molecules.

Return type

torch.Tensor

## Jax Layers¶

class Linear(*args, **kwargs)[source]

Protein folding specific Linear Module.

This differs from the standard Haiku Linear in a few ways:
• It supports inputs of arbitrary rank

• Initializers are specified by strings

This code is adapted from DeepMind’s AlphaFold code release (https://github.com/deepmind/alphafold).

Examples

>>> import deepchem as dc
>>> import haiku as hk
>>> import jax
>>> import jax.numpy as jnp
>>> import deepchem.models.jax_models.layers
>>> def forward_model(x):
...   layer = dc.models.jax_models.layers.Linear(2)
...   return layer(x)
>>> f = hk.transform(forward_model)
>>> rng = jax.random.PRNGKey(42)
>>> x = jnp.ones([8, 28 * 28])
>>> params = f.init(rng, x)
>>> output = f.apply(params, rng, x)

__init__(num_output: int, initializer: str = 'linear', use_bias: bool = True, bias_init: float = 0.0, name: str = 'linear')[source]

Constructs Linear Module.

Parameters
• num_output (int) – number of output channels.

• initializer (str (default 'linear')) – What initializer to use, should be one of {‘linear’, ‘relu’, ‘zeros’}

• use_bias (bool (default True)) – Whether to include trainable bias

• bias_init (float (default 0)) – Value used to initialize bias.

• name (str (default 'linear')) – name of module, used for name scopes.
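
To illustrate what "inputs of arbitrary rank" means in practice, the following pure-Python sketch applies one weight matrix along the last axis of a nested list, whatever the nesting depth. It is only a conceptual stand-in: the `linear` function, the weights and the bias values are hypothetical, and the real module operates on JAX arrays inside a Haiku transform.

```python
def linear(x, weights, bias):
    """Apply y = x @ weights + bias along the last axis of a nested list."""
    if isinstance(x[0], (int, float)):
        # x is a single feature vector: one output per column of weights.
        return [
            sum(xi * w_row[j] for xi, w_row in zip(x, weights)) + bias[j]
            for j in range(len(bias))
        ]
    # Otherwise recurse into the leading axis.
    return [linear(sub, weights, bias) for sub in x]


weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 inputs -> 2 outputs
bias = [0.0, 0.5]
x = [[[1.0, 2.0, 3.0]] * 4] * 2                  # nested shape (2, 4, 3)
y = linear(x, weights, bias)                     # nested shape (2, 4, 2)
```

Only the last axis changes size (from 3 to num_output = 2 here), while all leading axes are carried through untouched.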