# Symbol API

We also highly encourage you to read Symbolic Configuration and Execution in Pictures.

## How to Compose Symbols

The symbolic API provides a way to configure computation graphs. You can configure the graphs either at the level of neural network layer operations or as fine-grained operations.

The following example configures a two-layer neural network.

    >>> import mxnet as mx
    >>> net = mx.symbol.Variable('data')
    >>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
    >>> net = mx.symbol.Activation(data=net, name='relu1', act_type="relu")
    >>> net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
    >>> net = mx.symbol.SoftmaxOutput(data=net, name='out')
    >>> type(net)
    <class 'mxnet.symbol.Symbol'>


The basic arithmetic operators (addition, subtraction, multiplication, and division) are overloaded to perform element-wise operations on symbols.

The following example creates a computation graph that adds two inputs together.

    >>> import mxnet as mx
    >>> a = mx.symbol.Variable('a')
    >>> b = mx.symbol.Variable('b')
    >>> c = a + b
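
To make the idea of operator overloading more concrete, the following toy sketch shows how `+`, `-`, `*`, and `/` can record graph nodes instead of computing values right away. The `Node` class, its `list_arguments` and `evaluate` methods, and the `_OPS` table are hypothetical stand-ins written for illustration only; this is not how MXNet implements `Symbol`.

```python
import operator

# Hypothetical mapping from node type to the Python function that evaluates it.
_OPS = {"add": operator.add, "sub": operator.sub,
        "mul": operator.mul, "div": operator.truediv}


class Node:
    """Toy stand-in for a symbol: records an operation instead of computing it."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name

    # Each overloaded operator returns a new graph node; nothing is computed yet.
    def __add__(self, other):
        return Node("add", (self, other))

    def __sub__(self, other):
        return Node("sub", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))

    def __truediv__(self, other):
        return Node("div", (self, other))

    def list_arguments(self):
        """Return the names of the free variables, in depth-first order."""
        if self.op == "var":
            return [self.name]
        names = []
        for node in self.inputs:
            for arg in node.list_arguments():
                if arg not in names:
                    names.append(arg)
        return names

    def evaluate(self, **values):
        """Evaluate the recorded graph once concrete inputs are supplied."""
        if self.op == "var":
            return values[self.name]
        left, right = (node.evaluate(**values) for node in self.inputs)
        return _OPS[self.op](left, right)


a = Node("var", name="a")
b = Node("var", name="b")
c = a + b  # builds a graph node; no arithmetic happens here
print(c.op, c.list_arguments())
print(c.evaluate(a=1, b=2))
```

The point of the sketch is the separation between describing a computation (`a + b`) and running it (`evaluate`), which is the same split the symbolic API makes between composing symbols and executing them.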


## Symbol Attributes

You can add an attribute to a symbol by providing an attribute dictionary when you create a symbol.

    data = mx.sym.Variable('data', attr={'mood': 'angry'})
    op = mx.sym.Convolution(data=data, name='conv', kernel=(1, 1),
                            num_filter=1, attr={'mood': 'so so'})


For proper communication with the C++ backend, both the keys and the values of the attribute dictionary should be strings. To retrieve the attributes, use attr(key) or list_attr():

    assert data.attr('mood') == 'angry'
    assert op.list_attr() == {'mood': 'so so'}


For a composite symbol, you can retrieve all of the attributes associated with that symbol and its descendants with list_attr(recursive=True). In the returned dictionary, all of the attribute names have the prefix 'symbol_name' + '_' to prevent naming conflicts.

    assert op.list_attr(recursive=True) == {'data_mood': 'angry', 'conv_mood': 'so so',
                                            'conv_weight_mood': 'so so', 'conv_bias_mood': 'so so'}
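
The prefixing rule described above can be sketched in plain Python. The helper `flatten_attrs` and the `nested` dictionary are hypothetical and only illustrate the `'symbol_name' + '_'` naming scheme; they are not part of MXNet.

```python
def flatten_attrs(attrs_by_symbol):
    """Flatten {symbol_name: {key: value}} into {symbol_name + '_' + key: value}."""
    flat = {}
    for symbol_name, attrs in attrs_by_symbol.items():
        for key, value in attrs.items():
            flat[symbol_name + "_" + key] = value
    return flat


# Hypothetical per-symbol attributes mirroring the Convolution example above.
nested = {
    "data": {"mood": "angry"},
    "conv": {"mood": "so so"},
    "conv_weight": {"mood": "so so"},
    "conv_bias": {"mood": "so so"},
}
flat = flatten_attrs(nested)
print(sorted(flat))
```

Because the symbol name is part of every key, two symbols can carry an attribute with the same name without colliding in the flattened dictionary.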


Notice that the mood attribute set for the Convolution operator is copied to conv_weight and conv_bias. They’re symbols that are automatically created by the Convolution operator, and the attributes are automatically copied for them. This is especially useful for annotating context groups in model parallelism. However, if you explicitly specify the weight or bias symbols, the attributes for the host operator are not copied to them:

    weight = mx.sym.Variable('crazy_weight', attr={'size': '5'})
    data = mx.sym.Variable('data', attr={'mood': 'angry'})
    op = mx.sym.Convolution(data=data, weight=weight, name='conv', kernel=(1, 1),
                            num_filter=1, attr={'mood': 'so so'})
    op.list_attr(recursive=True)
    # =>
    # {'conv_mood': 'so so',
    #  'conv_bias_mood': 'so so',
    #  'crazy_weight_size': '5',
    #  'data_mood': 'angry'}


As you can see, the mood attribute is copied to the symbol conv_bias, which was automatically created, but not to the manually created weight symbol crazy_weight.

Another way to attach attributes is to use AttrScope. AttrScope automatically adds the specified attributes to all of the symbols created within that scope. For example:

    data = mx.symbol.Variable('data')
    with mx.AttrScope(group='4', data='great'):
        fc1 = mx.symbol.Activation(data, act_type='relu')
        with mx.AttrScope(init_bias='0.0'):
            fc2 = mx.symbol.FullyConnected(fc1, num_hidden=10, name='fc2')
    assert fc1.attr('data') == 'great'
    assert fc2.attr('data') == 'great'
    assert fc2.attr('init_bias') == '0.0'


Naming convention: We recommend that you choose valid variable names for attribute names. Names with double underscores (e.g., __shape__) are reserved for internal use. The underscore '_' separates a symbol name and its attributes. It’s also the separator between a symbol and a variable that is automatically created by that symbol. For example, the weight variable that is created automatically by a Convolution operator named conv1 is called conv1_weight.

Components that use attributes: More and more components are using symbol attributes to collect useful annotations for the computational graph. Here is a (probably incomplete) list:

• Variable uses attributes to store (optional) shape information for a variable.
• Optimizers read __lr_mult__ and __wd_mult__ attributes for each symbol in a computational graph. This is useful to control per-layer learning rate and decay.
• The model parallelism LSTM example uses the __ctx_group__ attribute to divide the operators into groups that correspond to GPU devices.

## Serialization

There are two ways to save and load symbols. You can use pickle to serialize the Symbol objects, or you can use the mxnet.symbol.Symbol.save and mxnet.symbol.load functions. The advantage of the save and load functions is that they are language agnostic and cloud friendly. The symbol is saved in JSON format. You can also get a JSON string directly using mxnet.symbol.Symbol.tojson.

The following example shows how to save a symbol to an S3 bucket, load it back, and compare two symbols using a JSON string.

    >>> import mxnet as mx
    >>> a = mx.symbol.Variable('a')
    >>> b = mx.symbol.Variable('b')
    >>> c = a + b
    >>> c.save('s3://my-bucket/symbol-c.json')
    >>> c2 = mx.symbol.load('s3://my-bucket/symbol-c.json')
    >>> c.tojson() == c2.tojson()
    True


## Executing Symbols

After you have assembled a set of symbols into a computation graph, the MXNet engine can evaluate them. If you are training a neural network, this is typically handled by the high-level Model class and the fit() function.

For neural networks used in “feed-forward”, “prediction”, or “inference” mode (all terms for the same thing: running a trained network), the input arguments are the input data, and the weights of the neural network that were learned during training.

To manually execute a set of symbols, you need to create an Executor object, which is typically constructed by calling the simple_bind() method on a symbol. For an example of this, see the sample notebook on how to use simple_bind().

## Multiple Outputs

To group the symbols together, use the mxnet.symbol.Group function.

    >>> import mxnet as mx
    >>> net = mx.symbol.Variable('data')
    >>> fc1 = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
    >>> net = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
    >>> net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
    >>> out = mx.symbol.SoftmaxOutput(data=net, name='softmax')
    >>> group = mx.symbol.Group([fc1, out])
    >>> group.list_outputs()
    ['fc1_output', 'softmax_output']


After you get the group, you can bind on group instead. The resulting executor will have two outputs, one for fc1_output and one for softmax_output.

## Symbol Creation API Reference

Symbolic configuration API of MXNet.

class mxnet.symbol.Symbol(handle)

Symbol is the symbolic graph of MXNet.

name

Get the name string of the symbol. This only works for a non-grouped symbol.

Returns: value (str) – The name of this symbol; None for a grouped symbol.

attr(key)

Get the attribute string from the symbol. This function only works for a non-grouped symbol.

Parameters:

• key (str) – The key of the attribute to get.

Returns: value (str) – The attribute value for the key; None if the attribute does not exist.

list_attr(recursive=False)

Get all attributes from the symbol.

Returns: ret (dict of str to str) – A dictionary mapping attribute keys to values.

attr_dict()

Recursively get all attributes from the symbol and its children.

Returns: ret (dict of str to dict) – A dict whose keys are the names of the symbol and its children. The values are dictionaries that map attribute keys to values.

get_internals()

Get a new grouped symbol whose output contains the internal outputs of this symbol.

Returns: sgroup (Symbol) – The internals of the symbol.

get_children()

Get a new grouped symbol whose output contains the inputs to the output nodes of the original symbol.

Returns: sgroup (Symbol or None) – The children of the head node. If the symbol has no inputs, None is returned.

list_arguments()

List all the arguments in the symbol.

Returns: args (list of string) – List of all the arguments.

list_outputs()

List all outputs in the symbol.

Returns: returns (list of string) – List of all the outputs.

list_auxiliary_states()

List all auxiliary states in the symbol.

Returns: aux_states (list of string) – List of the names of the auxiliary states.

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and do not have a gradient, but are still useful for specific operations. A common example of auxiliary states is the moving_mean and moving_variance in BatchNorm. Most operators do not have auxiliary states.

infer_type(*args, **kwargs)

Infer the types of outputs and arguments given the known types of some arguments.

You can pass in the known types either positionally or as keyword arguments. A tuple of Nones is returned if not enough information is passed in. An error is raised if the known types passed in are inconsistent.

Parameters:

• *args – Types of arguments, provided positionally. An unknown type can be marked as None.
• **kwargs – Keyword arguments of known types.

Returns:

• arg_types (list of numpy.dtype or None) – List of types of arguments. The order is the same as that of list_arguments().
• out_types (list of numpy.dtype or None) – List of types of outputs. The order is the same as that of list_outputs().
• aux_types (list of numpy.dtype or None) – List of types of auxiliary states. The order is the same as that of list_auxiliary_states().

infer_shape(*args, **kwargs)

Infer the shapes of outputs and arguments given the known shapes of some arguments.

You can pass in the known shapes either positionally or as keyword arguments. A tuple of Nones is returned if not enough information is passed in. An error is raised if the known shapes passed in are inconsistent.

Parameters:

• *args – Shapes of arguments, provided positionally. An unknown shape can be marked as None.
• **kwargs – Keyword arguments of known shapes.

Returns:

• arg_shapes (list of tuple or None) – List of shapes of arguments. The order is the same as that of list_arguments().
• out_shapes (list of tuple or None) – List of shapes of outputs. The order is the same as that of list_outputs().
• aux_shapes (list of tuple or None) – List of shapes of auxiliary states. The order is the same as that of list_auxiliary_states().

infer_shape_partial(*args, **kwargs)

Partially infer the shape. The same as infer_shape, except that the partial results can be returned.
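
As a rough illustration of what shape inference does, the sketch below derives the argument and output shapes of a single fully connected layer from the shape of its input. The function `infer_fc_shapes` is hypothetical and covers only this one layer type; it assumes the weight is laid out as (num_hidden, input_dim) and that any trailing input dimensions are collapsed into input_dim.

```python
import math


def infer_fc_shapes(data_shape, num_hidden):
    """Return (arg_shapes, out_shapes) for one fully connected layer.

    arg_shapes is ordered as [data, weight, bias]. When the input shape is
    unknown, (None, None) is returned because nothing can be derived.
    """
    if data_shape is None:
        return None, None
    batch_size = data_shape[0]
    # Collapse every dimension after the batch dimension into one.
    input_dim = math.prod(data_shape[1:])
    arg_shapes = [tuple(data_shape), (num_hidden, input_dim), (num_hidden,)]
    out_shapes = [(batch_size, num_hidden)]
    return arg_shapes, out_shapes


arg_shapes, out_shapes = infer_fc_shapes((32, 784), num_hidden=128)
print(arg_shapes)
print(out_shapes)
```

Only the data shape has to be supplied; the weight, bias, and output shapes follow from it and from num_hidden, which is the kind of propagation that infer_shape performs across a whole graph.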

debug_str()

Get a debug string.

Returns: debug_str (string) – Debug string of the symbol.

save(fname)

Save the symbol into a file.

You can also use pickle for this if you only work in Python. The advantage of load/save is that the file is language agnostic, so a file saved using save can be loaded by the other language bindings of MXNet. You can also load from and save to cloud storage (S3, HDFS) directly.

Parameters:

• fname (str) – The name of the file. Examples:
  - s3://my-bucket/path/my-s3-symbol
  - hdfs://my-bucket/path/my-hdfs-symbol
  - /path-to/my-local-symbol

See also: symbol.load() – Used to load a symbol from a file.

tojson()

Save the symbol into a JSON string.

See also: symbol.load_json() – Used to load a symbol from a JSON string.

simple_bind(ctx, grad_req='write', type_dict=None, group2ctx=None, **kwargs)

Bind the current symbol to get an executor, allocating all of the NDArrays needed. Allows specifying data types.

You can pass in NDArrays for the positions you want to bind to; the function automatically allocates the NDArrays for the arguments and auxiliary states that you did not specify explicitly.

Parameters:

• ctx (Context) – The device context on which the generated executor runs.
• grad_req ({'write', 'add', 'null'}, or list of str or dict of str to str, optional) – Specifies how the gradient is written to args_grad. 'write' means the gradient is written to the specified args_grad NDArray every time. 'add' means the gradient is added to the specified NDArray every time. 'null' means no action is taken, and the gradient may not be calculated.
• type_dict (dict of str->numpy.dtype) – Input type dictionary, name->dtype.
• group2ctx (dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
• kwargs (dict of str->shape) – Input shape dictionary, name->shape.

Returns: executor (mxnet.Executor) – The generated Executor.

bind(ctx, args, args_grad=None, grad_req='write', aux_states=None, group2ctx=None, shared_exec=None)

Bind the current symbol to get an executor.

Parameters:

• ctx (Context) – The device context on which the generated executor runs.
• args (list of NDArray or dict of str to NDArray) – Input arguments to the symbol. If the type is a list of NDArray, the order is the same as that of list_arguments. If the type is a dict of str to NDArray, it maps the names of arguments to the corresponding NDArrays. In either case, all of the arguments must be provided.
• args_grad (list of NDArray or dict of str to NDArray, optional) – When specified, args_grad provides NDArrays to hold the gradient values computed in backward. If the type is a list of NDArray, the order is the same as that of list_arguments. If the type is a dict of str to NDArray, it maps the names of arguments to the corresponding NDArrays. With a dict, you only need to provide entries for the argument gradients you need; only the specified argument gradients are calculated.
• grad_req ({'write', 'add', 'null'}, or list of str or dict of str to str, optional) – Specifies how the gradient is written to args_grad. 'write' means the gradient is written to the specified args_grad NDArray every time. 'add' means the gradient is added to the specified NDArray every time. 'null' means no action is taken, and the gradient may not be calculated.
• aux_states (list of NDArray, or dict of str to NDArray, optional) – Input auxiliary states to the symbol. Only needs to be specified when list_auxiliary_states is not empty. If the type is a list of NDArray, the order is the same as that of list_auxiliary_states. If the type is a dict of str to NDArray, it maps the names of auxiliary states to the corresponding NDArrays. In either case, all of the auxiliary states must be provided.
• group2ctx (dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
• shared_exec (mx.executor.Executor) – Executor to share memory with. This is intended for runtime reshaping, variable-length sequences, etc. The returned executor shares state with shared_exec and should not be used in parallel with it.

Returns: executor (Executor) – The generated Executor.

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and do not have a gradient, but are still useful for specific operations. A common example of auxiliary states is the moving_mean and moving_variance in BatchNorm. Most operators do not have auxiliary states, and this parameter can be safely ignored.

You can skip gradients you do not need by passing a dict as args_grad and specifying only the gradients you are interested in.
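
The three grad_req modes can be pictured with a small sketch in plain Python. The function `update_gradient` is hypothetical and operates on lists of floats rather than NDArrays; it only mirrors the 'write', 'add', and 'null' behavior described above.

```python
def update_gradient(buffer, new_grad, grad_req="write"):
    """Return the gradient buffer after one backward pass under grad_req."""
    if grad_req == "write":
        # Overwrite whatever the buffer held before.
        return list(new_grad)
    if grad_req == "add":
        # Accumulate on top of the existing contents.
        return [old + new for old, new in zip(buffer, new_grad)]
    if grad_req == "null":
        # Leave the buffer untouched; no gradient is stored.
        return list(buffer)
    raise ValueError("unknown grad_req: %r" % (grad_req,))


# Two backward passes accumulated with grad_req='add'.
buffer = [0.0, 0.0]
for grad in ([1.0, 2.0], [3.0, 4.0]):
    buffer = update_gradient(buffer, grad, grad_req="add")
print(buffer)
```

Accumulation with 'add' is useful when several backward passes should contribute to one update, whereas 'write' keeps only the most recent gradient.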

grad(wrt)

Get the autodiff of the current symbol.

This function can only be used if the current symbol is a loss function.

Parameters:

• wrt (Array of String) – Names of the arguments of the symbol with respect to which the gradients are taken.

Returns: grad (Symbol) – A gradient Symbol whose outputs are the corresponding gradients.

mxnet.symbol.Variable(name, attr=None, shape=None, lr_mult=None, wd_mult=None, dtype=None, init=None)

Create a symbolic variable with the specified name.

Parameters:

• name (str) – Name of the variable.
• attr (dict of string -> string) – Additional attributes to set on the variable.
• shape (tuple) – Optionally, the shape of the variable. This is used during shape inference. If a different shape is specified for this variable through a keyword argument when calling shape inference, this shape information is ignored.
• lr_mult (float) – Learning rate multiplier for this variable.
• wd_mult (float) – Weight decay multiplier for this variable.
• dtype (str or numpy.dtype) – Similar to shape, the dtype of this variable.
• init (initializer (mxnet.init.*)) – Initializer for this variable, overriding the default initializer.

Returns: variable (Symbol) – The created variable symbol.

mxnet.symbol.Group(symbols)

Create a symbol that groups symbols together.

Parameters:

• symbols (list) – List of symbols to be grouped.

Returns: sym (Symbol) – The created group symbol.

mxnet.symbol.load(fname)

Load a symbol from a JSON file.

You can also use pickle for this if you only work in Python. The advantage of load/save is that the file is language agnostic, so a file saved using save can be loaded by the other language bindings of MXNet. You can also load from and save to cloud storage (S3, HDFS) directly.

Parameters:

• fname (str) – The name of the file. Examples:
  - s3://my-bucket/path/my-s3-symbol
  - hdfs://my-bucket/path/my-hdfs-symbol
  - /path-to/my-local-symbol

Returns: sym (Symbol) – The loaded symbol.

See also: Symbol.save() – Used to save a symbol into a file.

mxnet.symbol.load_json(json_str)

Load a symbol from a JSON string.

Parameters:

• json_str (str) – A JSON string.

Returns: sym (Symbol) – The loaded symbol.

See also: Symbol.tojson() – Used to save a symbol into a JSON string.

mxnet.symbol.pow(base, exp)

Raise base to the power exp.

Parameters:

• base (Symbol or Number)
• exp (Symbol or Number)

Returns: result (Symbol or Number)

mxnet.symbol.maximum(left, right)

Take the maximum of left and right.

Parameters:

• left (Symbol or Number)
• right (Symbol or Number)

Returns: result (Symbol or Number)

mxnet.symbol.minimum(left, right)

Take the minimum of left and right.

Parameters:

• left (Symbol or Number)
• right (Symbol or Number)

Returns: result (Symbol or Number)

mxnet.symbol.hypot(left, right)

Compute the hypotenuse of a right triangle with legs left and right, that is, sqrt(left^2 + right^2).

Parameters:

• left (Symbol or Number)
• right (Symbol or Number)

Returns: result (Symbol or Number)

mxnet.symbol.zeros(shape, dtype=None, **kwargs)

Create a tensor filled with zeros, similar to numpy.zeros.

Parameters:

• shape (int or sequence of ints) – Shape of the new array.
• dtype (str or numpy.dtype, optional) – The value type of the elements; defaults to np.float32.

Returns: out (Symbol) – The created Symbol.

mxnet.symbol.ones(shape, dtype=None, **kwargs)

Create a tensor filled with ones, similar to numpy.ones.

Parameters:

• shape (int or sequence of ints) – Shape of the new array.
• dtype (str or numpy.dtype, optional) – The value type of the elements; defaults to np.float32.

Returns: out (Symbol) – The created Symbol.

mxnet.symbol.arange(start, stop=None, step=1.0, repeat=1, name=None, dtype=None)

Similar to numpy.arange and to the function of the same name in the MXNet ndarray module.

Parameters:

• start (number) – Start of the interval. The interval includes this value. The default start value is 0.
• stop (number, optional) – End of the interval. The interval does not include this value.
• step (number, optional) – Spacing between values.
• repeat (int, optional) – The number of times each element is repeated. For example, with repeat=3, the element a is repeated three times: a, a, a.
• dtype (str or numpy.dtype, optional) – The value type of the elements; defaults to np.float32.

Returns: out (Symbol) – The created Symbol.
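
The effect of the repeat parameter is easiest to see on concrete values. The sketch below is a plain-Python approximation that returns a list rather than a Symbol; the helper name `arange_values` is hypothetical, and it assumes that a single positional argument is treated as stop with start set to 0, as in numpy.arange.

```python
import math


def arange_values(start, stop=None, step=1.0, repeat=1):
    """Return evenly spaced values in [start, stop), each repeated `repeat` times."""
    if stop is None:
        # A single argument is treated as the end of the interval.
        start, stop = 0.0, start
    count = max(0, math.ceil((stop - start) / step))
    values = []
    for i in range(count):
        values.extend([start + i * step] * repeat)
    return values


print(arange_values(3, repeat=2))
print(arange_values(2, 6, step=2))
```

With repeat=2, every element appears twice in a row before the next value starts, so the result is twice as long as the plain range.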
mxnet.symbol.Activation(*args, **kwargs)

Elementwise activation function.

The following activation types are supported (operations are applied element-wise to each scalar of the input tensor):

• relu: Rectified Linear Unit, y = max(x, 0)
• sigmoid: y = 1 / (1 + exp(-x))
• tanh: Hyperbolic tangent, y = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
• softrelu: Soft ReLU, or SoftPlus, y = log(1 + exp(x))

See LeakyReLU for other activations with parameters.

Parameters:

• data (Symbol) – Input data to the activation function.
• act_type ({'relu', 'sigmoid', 'softrelu', 'tanh'}, required) – Activation function to be applied.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

Examples

A one-hidden-layer MLP with ReLU activation:

>>> data = Variable('data')
>>> mlp = FullyConnected(data=data, num_hidden=128, name='proj')
>>> mlp = Activation(data=mlp, act_type='relu', name='activation')
>>> mlp = FullyConnected(data=mlp, num_hidden=10, name='mlp')
>>> mlp
<Symbol mlp>


Checking each activation type against a NumPy reference implementation:

>>> test_suites = [
...     ('relu', lambda x: numpy.maximum(x, 0)),
...     ('sigmoid', lambda x: 1 / (1 + numpy.exp(-x))),
...     ('tanh', lambda x: numpy.tanh(x)),
...     ('softrelu', lambda x: numpy.log(1 + numpy.exp(x)))
... ]
>>> x = test_utils.random_arrays((2, 3, 4))
>>> for act_type, numpy_impl in test_suites:
...     op = Activation(act_type=act_type, name='act')
...     y = test_utils.simple_forward(op, act_data=x)
...     y_np = numpy_impl(x)
...     print('%s: %s' % (act_type, test_utils.almost_equal(y, y_np)))
relu: True
sigmoid: True
tanh: True
softrelu: True

mxnet.symbol.BatchNorm(*args, **kwargs)

Apply batch normalization to the input.

Parameters:

• data (Symbol) – Input data to batch normalization.
• gamma (Symbol) – Gamma matrix.
• beta (Symbol) – Beta matrix.
• eps (float, optional, default=0.001) – Epsilon to prevent division by zero.
• momentum (float, optional, default=0.9) – Momentum for the moving average.
• fix_gamma (boolean, optional, default=True) – Fix gamma while training.
• use_global_stats (boolean, optional, default=False) – Whether to use the global moving statistics instead of the local batch statistics. This forces batch normalization to act as a scale-shift operator.
• output_mean_var (boolean, optional, default=False) – Output the mean and variance in addition to the normalized output.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.
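
To show how eps, momentum, and the moving statistics relate, here is a minimal sketch of batch normalization over a flat list of values in plain Python. The helpers `batch_norm_1d` and `update_moving` are hypothetical, and the moving-average formula shown (momentum times the old value plus one minus momentum times the batch statistic) is the conventional one, assumed here rather than taken from the operator's source.

```python
import math


def batch_norm_1d(values, gamma=1.0, beta=0.0, eps=0.001):
    """Normalize values to zero mean and (almost) unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    # eps keeps the denominator away from zero when the variance is tiny.
    normalized = [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]
    return normalized, mean, var


def update_moving(moving, batch_stat, momentum=0.9):
    """Blend a batch statistic into the running (auxiliary) statistic."""
    return momentum * moving + (1.0 - momentum) * batch_stat


out, mean, var = batch_norm_1d([1.0, 2.0, 3.0, 4.0])
moving_mean = update_moving(0.0, mean)
print(mean, var, moving_mean)
```

The running statistics are not learned by gradient descent, which is why BatchNorm keeps moving_mean and moving_variance as auxiliary states rather than as arguments.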
mxnet.symbol.BilinearSampler(*args, **kwargs)

Apply bilinear sampling to the input feature map, which is the key component of “[NIPS2015] Spatial Transformer Networks”.

    output[batch, channel, y_dst, x_dst] = G(data[batch, channel, y_src, x_src])

where:

• x_dst, y_dst enumerate all spatial locations in output
• x_src = grid[batch, 0, y_dst, x_dst]
• y_src = grid[batch, 1, y_dst, x_dst]
• G() denotes the bilinear interpolation kernel

Points outside the boundary are padded with zeros. (The boundary is defined to be [-1, 1].) The shape of the output is (data.shape[0], data.shape[1], grid.shape[2], grid.shape[3]). The operator assumes that grid has been normalized. If you want to design a CustomOp to manipulate grid, please refer to GridGeneratorOp.

Parameters:

• data (Symbol) – Input data to the BilinearsamplerOp.
• grid (Symbol) – Input grid to the BilinearsamplerOp. grid has two channels: x_src, y_src.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.
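
The sampling rule can be sketched for a single channel in plain Python. The function `bilinear_sample` is hypothetical, and the mapping from the normalized range [-1, 1] to pixel coordinates, (coordinate + 1) * (size - 1) / 2, is an assumption made for this sketch rather than something stated above.

```python
import math


def bilinear_sample(image, x_src, y_src):
    """Sample a 2D list `image` at normalized coordinates (x_src, y_src) in [-1, 1]."""
    height, width = len(image), len(image[0])
    # Assumed mapping from normalized coordinates to pixel coordinates.
    x = (x_src + 1.0) * (width - 1) / 2.0
    y = (y_src + 1.0) * (height - 1) / 2.0
    x0, y0 = math.floor(x), math.floor(y)
    wx, wy = x - x0, y - y0

    def pixel(row, col):
        # Locations outside the image contribute zero (zero padding).
        if 0 <= row < height and 0 <= col < width:
            return image[row][col]
        return 0.0

    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * pixel(y0, x0) + wx * pixel(y0, x0 + 1)
    bottom = (1 - wx) * pixel(y0 + 1, x0) + wx * pixel(y0 + 1, x0 + 1)
    return (1 - wy) * top + wy * bottom


image = [[0.0, 1.0],
         [2.0, 3.0]]
print(bilinear_sample(image, 0.0, 0.0))
```

Sampling at the centre of this 2x2 image blends all four pixels equally, while the corners (-1, -1) and (1, 1) return the corner pixel values unchanged.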
mxnet.symbol.BlockGrad(*args, **kwargs)

Get the output from a symbol and pass a zero gradient back.

From: src/operator/tensor/elemwise_unary_op.cc:31

Parameters:

• data (NDArray) – Source input.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

mxnet.symbol.Cast(*args, **kwargs)

Convert the data type to dtype.

From: src/operator/tensor/elemwise_unary_op.cc:58

Parameters:

• data (NDArray) – Source input.
• dtype ({'float16', 'float32', 'float64', 'int32', 'uint8'}, required) – Output data type.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

mxnet.symbol.Concat(*args, **kwargs)

Concatenate all inputs along the channel dimension (default is 1). This function supports a variable number of positional inputs.

Parameters:

• data (Symbol[]) – List of tensors to concatenate.
• num_args (int, required) – Number of inputs to be concatenated.
• dim (int, optional, default='1') – The dimension along which to concatenate.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

Examples

Concat two (or more) inputs along a specific dimension:

>>> a = Variable('a')
>>> b = Variable('b')
>>> c = Concat(a, b, dim=1, name='my-concat')
>>> c
<Symbol my-concat>
>>> SymbolDoc.get_output_shape(c, a=(128, 10, 3, 3), b=(128, 15, 3, 3))
{'my-concat_output': (128L, 25L, 3L, 3L)}


Note that the input shapes must be the same except along the dimension that is being concatenated.

mxnet.symbol.Convolution(*args, **kwargs)

Apply convolution to the input, then add a bias.

Parameters:

• data (Symbol) – Input data to the ConvolutionOp.
• weight (Symbol) – Weight matrix.
• bias (Symbol) – Bias parameter.
• kernel (Shape(tuple), required) – Convolution kernel size: (h, w) or (d, h, w).
• stride (Shape(tuple), optional, default=()) – Convolution stride: (h, w) or (d, h, w).
• dilate (Shape(tuple), optional, default=()) – Convolution dilation: (h, w) or (d, h, w).
• pad (Shape(tuple), optional, default=()) – Padding for convolution: (h, w) or (d, h, w).
• num_filter (int (non-negative), required) – Number of convolution filters (channels).
• num_group (int (non-negative), optional, default=1) – Number of group partitions. Equivalent to slicing the input into num_group partitions, applying convolution on each, and then concatenating the results.
• workspace (long (non-negative), optional, default=1024) – Maximum temporary workspace allowed for convolution (MB).
• no_bias (boolean, optional, default=False) – Whether to disable the bias parameter.
• cudnn_tune ({None, 'fastest', 'limited_workspace', 'off'}, optional, default='None') – Whether to pick the convolution algorithm by running a performance test. This leads to a higher startup time but may give faster speed. Options are: 'off': no tuning. 'limited_workspace': run the test and pick the fastest algorithm that doesn’t exceed the workspace limit. 'fastest': pick the fastest algorithm and ignore the workspace limit. If set to None (default), the behavior is determined by the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT: 0 for off, 1 for limited workspace (default), 2 for fastest.
• cudnn_off (boolean, optional, default=False) – Turn off cuDNN for this layer.
• layout ({None, 'NCDHW', 'NCHW', 'NDHWC', 'NHWC'}, optional, default='None') – Set the layout for input, output, and weight. Empty for the default layout: NCHW for 2D and NCDHW for 3D.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.
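
How kernel, stride, dilate, and pad combine to determine the output size can be sketched with the standard convolution size relation. The helpers `conv_output_size` and `conv_output_shape` are hypothetical; the sketch assumes a 2D NCHW input and treats an empty stride, dilate, or pad as 1, 1, and 0 respectively.

```python
def conv_output_size(size, kernel, stride=1, pad=0, dilate=1):
    """Output length along one spatial dimension."""
    # A dilated kernel covers dilate * (kernel - 1) + 1 input positions.
    effective_kernel = dilate * (kernel - 1) + 1
    return (size + 2 * pad - effective_kernel) // stride + 1


def conv_output_shape(data_shape, num_filter, kernel,
                      stride=(1, 1), pad=(0, 0), dilate=(1, 1)):
    """Output shape (batch, num_filter, out_h, out_w) for an NCHW input."""
    batch, _, height, width = data_shape
    out_h = conv_output_size(height, kernel[0], stride[0], pad[0], dilate[0])
    out_w = conv_output_size(width, kernel[1], stride[1], pad[1], dilate[1])
    return (batch, num_filter, out_h, out_w)


shape = conv_output_shape((1, 3, 224, 224), num_filter=64,
                          kernel=(7, 7), stride=(2, 2), pad=(3, 3))
print(shape)
```

Note that num_filter alone sets the channel dimension of the output, while the spatial dimensions depend only on the kernel geometry.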
mxnet.symbol.Correlation(*args, **kwargs)

Apply correlation to the inputs.

Parameters:

• data1 (Symbol) – Input data1 to the correlation.
• data2 (Symbol) – Input data2 to the correlation.
• kernel_size (int (non-negative), optional, default=1) – Kernel size for Correlation; must be an odd number.
• max_displacement (int (non-negative), optional, default=1) – Maximum displacement of Correlation.
• stride1 (int (non-negative), optional, default=1) – stride1 quantizes data1 globally.
• stride2 (int (non-negative), optional, default=1) – stride2 quantizes data2 within the neighborhood centered around data1.
• pad_size (int (non-negative), optional, default=0) – Padding for Correlation.
• is_multiply (boolean, optional, default=True) – The operation type is either multiplication or subtraction.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

mxnet.symbol.Crop(*args, **kwargs)

Crop the 2nd and 3rd dimensions of the input data, either to the size given by h_w or to the width and height of the second input symbol. That is, with one input, h_w specifies the crop height and width; otherwise, the size of the second input symbol is used. This function supports a variable number of positional inputs.

Parameters:

• data (Symbol or Symbol[]) – Tensor or list of tensors. The second input is used as the crop_like shape reference.
• num_args (int, required) – Number of inputs for crop. If it equals one, h_w is used for the crop height and width. If it equals two, the height and width of the second input symbol, called crop_like here, are used.
• offset (Shape(tuple), optional, default=(0,0)) – Crop offset coordinate: (y, x).
• h_w (Shape(tuple), optional, default=(0,0)) – Crop height and width: (h, w).
• center_crop (boolean, optional, default=False) – If set to true, a center crop is used; otherwise, the crop uses the shape of crop_like.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

mxnet.symbol.Custom(*args, **kwargs)

Custom operator implemented in the frontend.

Parameters:

• op_type (string) – Type of custom operator. Must be registered first.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

mxnet.symbol.Deconvolution(*args, **kwargs)

Apply deconvolution to the input, then add a bias.

Parameters:

• data (Symbol) – Input data to the DeconvolutionOp.
• weight (Symbol) – Weight matrix.
• bias (Symbol) – Bias parameter.
• kernel (Shape(tuple), required) – Deconvolution kernel size: (y, x).
• stride (Shape(tuple), optional, default=(1,1)) – Deconvolution stride: (y, x).
• pad (Shape(tuple), optional, default=(0,0)) – Padding for deconvolution: (y, x). A good value is (kernel-1)/2. If target_shape is set, pad is ignored and computed automatically.
• adj (Shape(tuple), optional, default=(0,0)) – Adjustment for the output shape: (y, x). If target_shape is set, adj is ignored and computed automatically.
• target_shape (Shape(tuple), optional, default=(0,0)) – Target output shape: (y, x).
• num_filter (int (non-negative), required) – Number of deconvolution filters (channels).
• num_group (int (non-negative), optional, default=1) – Number of group partitions.
• workspace (long (non-negative), optional, default=512) – Temporary workspace for deconvolution (MB).
• no_bias (boolean, optional, default=True) – Whether to disable the bias parameter.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.
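
The roles of stride, pad, and adj can be illustrated with the usual transposed-convolution size relation. The helper `deconv_output_size` is hypothetical, and the formula is the commonly used one, assumed here to match the operator; the sketch does not cover the case where target_shape is set and pad and adj are computed automatically.

```python
def deconv_output_size(size, kernel, stride=1, pad=0, adj=0):
    """Output length along one spatial dimension of a transposed convolution."""
    return stride * (size - 1) + kernel - 2 * pad + adj


# Upsample a 14-pixel dimension by a factor of two.
doubled = deconv_output_size(14, kernel=4, stride=2, pad=1)
print(doubled)
```

Under this relation, adj adds extra output positions at the end, which is useful when several input sizes of the forward convolution would map to the same output size.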
mxnet.symbol.Dropout(*args, **kwargs)

Apply dropout to the input. During training, each element of the input is randomly set to zero with probability p, and the whole tensor is then rescaled by 1/(1-p) to keep the expectation the same as before applying dropout. At test time, this behaves as an identity map.

Parameters:

• data (Symbol) – Input data to dropout.
• p (float, optional, default=0.5) – Fraction of the input that gets dropped out at training time.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

Examples

Apply dropout, setting each input element to zero with probability 0.2:

>>> data = Variable('data')
>>> data_dp = Dropout(data=data, p=0.2)

>>> shape = (100, 100)  # use a larger shape for more statistically stable results
>>> x = numpy.ones(shape)
>>> op = Dropout(p=0.5, name='dp')
>>> # dropout is identity during testing
>>> y = test_utils.simple_forward(op, dp_data=x, is_train=False)
>>> test_utils.almost_equal(x, y, threshold=0)
True
>>> y = test_utils.simple_forward(op, dp_data=x, is_train=True)
>>> # expectation is (approximately) unchanged
>>> numpy.abs(x.mean() - y.mean()) < 0.1
True
>>> set(numpy.unique(y)) == set([0, 2])
True

mxnet.symbol.ElementWiseSum(*args, **kwargs)

Perform an element-wise sum of the inputs. This function supports a variable number of positional inputs.

From: src/operator/tensor/elemwise_sum.cc:56

Parameters:

• args (NDArray[]) – List of input tensors.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

mxnet.symbol.Embedding(*args, **kwargs)

Map integer indices to vector representations (embeddings). The embeddings are learnable parameters. For an input of shape (d1, ..., dK), the output shape is (d1, ..., dK, output_dim). All the input values should be integers in the range [0, input_dim).

From: src/operator/tensor/indexing_op.cc:19

Parameters:

• data (Symbol) – Input data to the EmbeddingOp.
• weight (Symbol) – Embedding weight matrix.
• input_dim (int, required) – Vocabulary size of the input indices.
• output_dim (int, required) – Dimension of the embedding vectors.
• name (string, optional) – Name of the resulting symbol.

Returns: symbol (Symbol) – The result symbol.

Examples

Assume we want to map the 26 letters of the English alphabet to 16-dimensional vector representations.

>>> vocabulary_size = 26
>>> embed_dim = 16
>>> seq_len, batch_size = (10, 64)
>>> input = Variable('letters')
>>> op = Embedding(data=input, input_dim=vocabulary_size, output_dim=embed_dim,
...                name='embed')
>>> SymbolDoc.get_output_shape(op, letters=(seq_len, batch_size))
{'embed_output': (10L, 64L, 16L)}

>>> vocab_size, embed_dim = (26, 16)
>>> batch_size = 12
>>> word_vecs = test_utils.random_arrays((vocab_size, embed_dim))
>>> op = Embedding(name='embed', input_dim=vocab_size, output_dim=embed_dim)
>>> x = numpy.random.choice(vocab_size, batch_size)
>>> y = test_utils.simple_forward(op, embed_data=x, embed_weight=word_vecs)
>>> y_np = word_vecs[x]
>>> test_utils.almost_equal(y, y_np)
True

mxnet.symbol.Flatten(*args, **kwargs)

Flatten the input into 2D by collapsing all the higher dimensions. A (d1, d2, ..., dK) tensor is flattened to a (d1, d2 * ... * dK) matrix.

Parameters:

- data (NDArray) – Input data to reshape.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

Examples

Flatten is usually applied before FullyConnected, to reshape the 4D tensor produced by convolutional layers into a 2D matrix:

>>> data = Variable('data')  # say this is 4D from some conv/pool
>>> flatten = Flatten(data=data, name='flat')  # now this is 2D
>>> SymbolDoc.get_output_shape(flatten, data=(2, 3, 4, 5))
{'flat_output': (2L, 60L)}

>>> test_dims = [(2, 3, 4, 5), (2, 3), (2,)]
>>> op = Flatten(name='flat')
>>> for dims in test_dims:
...     x = test_utils.random_arrays(dims)
...     y = test_utils.simple_forward(op, flat_data=x)
...     # int() keeps the target shape integral even when dims[1:] is empty
...     y_np = x.reshape((dims[0], int(numpy.prod(dims[1:]))))
...     print('%s: %s' % (dims, test_utils.almost_equal(y, y_np)))
(2, 3, 4, 5): True
(2, 3): True
(2,): True

mxnet.symbol.FullyConnected(*args, **kwargs)

Apply matrix multiplication to the input, then add a bias. It maps an input of shape (batch_size, input_dim) to an output of shape (batch_size, num_hidden). Learnable parameters include the weights of the linear transform and an optional bias vector.

Parameters:

- data (Symbol) – Input data to the FullyConnectedOp.
- weight (Symbol) – Weight matrix.
- bias (Symbol) – Bias parameter.
- num_hidden (int, required) – Number of hidden nodes of the output.
- no_bias (boolean, optional, default=False) – Whether to disable the bias parameter.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

Examples

Construct a fully connected operator with target dimension 512.

>>> data = Variable('data')  # or some constructed NN
>>> op = FullyConnected(data=data,
... num_hidden=512,
... name='FC1')
>>> op
<Symbol FC1>
>>> SymbolDoc.get_output_shape(op, data=(128, 100))
{'FC1_output': (128L, 512L)}


A simple 3-layer MLP with ReLU activation:

>>> net = Variable('data')
>>> for i, dim in enumerate([128, 64]):
...     net = FullyConnected(data=net, num_hidden=dim, name='FC%d' % i)
...     net = Activation(data=net, act_type='relu', name='ReLU%d' % i)
>>> # 10-class predictor (e.g. MNIST)
>>> net = FullyConnected(data=net, num_hidden=10, name='pred')
>>> net
<Symbol pred>

>>> dim_in, dim_out = (3, 4)
>>> x, w, b = test_utils.random_arrays((10, dim_in), (dim_out, dim_in), (dim_out,))
>>> op = FullyConnected(num_hidden=dim_out, name='FC')
>>> out = test_utils.simple_forward(op, FC_data=x, FC_weight=w, FC_bias=b)
>>> # numpy implementation of FullyConnected
>>> out_np = numpy.dot(x, w.T) + b
>>> test_utils.almost_equal(out, out_np)
True

mxnet.symbol.GridGenerator(*args, **kwargs)

Generate a sampling grid for bilinear sampling.

Parameters:

- data (Symbol) – Input data to the GridGeneratorOp.
- transform_type ({'affine', 'warp'}, required) – Transformation type. If the transformation type is affine, data is an affine matrix of shape (batch, 6). If the transformation type is warp, data is an optical flow of shape (batch, 2, h, w).
- target_shape (Shape(tuple), optional, default=(0,0)) – If the transformation type is affine, the operator needs a target_shape (H, W). If the transformation type is warp, the operator ignores target_shape.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.IdentityAttachKLSparseReg(*args, **kwargs)

Apply a sparse regularization to the output of a sigmoid activation function.

Parameters:

- data (Symbol) – Input data.
- sparseness_target (float, optional, default=0.1) – The sparseness target.
- penalty (float, optional, default=0.001) – The tradeoff parameter for the sparseness penalty.
- momentum (float, optional, default=0.9) – The momentum for the running average.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.InstanceNorm(*args, **kwargs)

An operator that takes an n-dimensional input tensor (n > 2) and normalizes it using the mean and variance calculated over the spatial dimensions. This is an implementation of the operator described in “Instance Normalization: The Missing Ingredient for Fast Stylization”, D. Ulyanov, A. Vedaldi, V. Lempitsky, 2016 (arXiv:1607.08022v2). This layer is similar to batch normalization, with two differences: first, the normalization is carried out per example (‘instance’), not over a batch. Second, the same normalization is applied at both test and train time. This operation is also known as ‘contrast normalization’.

Parameters:

- data (Symbol) – An n-dimensional tensor (n > 2) of the form [batch, channel, spatial_dim1, spatial_dim2, ...].
- gamma (Symbol) – A vector of length ‘channel’, which multiplies the normalized input.
- beta (Symbol) – A vector of length ‘channel’, which is added to the product of the normalized input and the weight.
- eps (float, optional, default=0.001) – Epsilon to prevent division by 0.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.L2Normalization(*args, **kwargs)

Set the L2 norm of each instance to a constant.

Parameters:

- data (Symbol) – Input data to the L2NormalizationOp.
- eps (float, optional, default=1e-10) – Epsilon to prevent division by 0.
- mode ({'channel', 'instance', 'spatial'}, optional, default='instance') – Normalization mode. If set to instance, this operator computes a norm for each instance in the batch; this is the default mode. If set to channel, this operator computes a cross-channel norm at each position of each instance. If set to spatial, this operator computes a norm for each channel.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

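As a rough illustration of the default instance mode, the sketch below rescales each instance so that its L2 norm is approximately 1. It uses plain Python lists instead of MXNet symbols. The helper name `l2_normalize_instance`, the target norm of 1, and the placement of `eps` under the square root are assumptions made for illustration, not a statement of how the operator is implemented.

```python
import math


def l2_normalize_instance(batch, eps=1e-10):
    """Divide every (flattened) instance by its L2 norm.

    Hypothetical helper: eps is added under the square root, which is an
    assumption about where the operator applies it.
    """
    normalized = []
    for instance in batch:
        norm = math.sqrt(sum(v * v for v in instance) + eps)
        normalized.append([v / norm for v in instance])
    return normalized


# Two instances with two values each.
out = l2_normalize_instance([[3.0, 4.0], [0.0, 2.0]])
print(out)
```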
mxnet.symbol.LRN(*args, **kwargs)

Apply local response normalization (LRN) to the input.

Parameters:

- data (Symbol) – Input data to the LRN operator.
- alpha (float, optional, default=0.0001) – Value of the alpha variance scaling parameter in the normalization formula.
- beta (float, optional, default=0.75) – Value of the beta power parameter in the normalization formula.
- knorm (float, optional, default=2) – Value of the k parameter in the normalization formula.
- nsize (int (non-negative), required) – Normalization window width in elements.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.LeakyReLU(*args, **kwargs)

Apply activation function to input.

Parameters:

- data (Symbol) – Input data to activation function.
- act_type ({'elu', 'leaky', 'prelu', 'rrelu'}, optional, default='leaky') – Activation function to be applied.
- slope (float, optional, default=0.25) – Init slope for the activation. (For leaky and elu only)
- lower_bound (float, optional, default=0.125) – Lower bound of random slope. (For rrelu only)
- upper_bound (float, optional, default=0.334) – Upper bound of random slope. (For rrelu only)
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

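To make the role of `slope` concrete, here is a minimal scalar sketch of the `leaky` and `elu` activation types using their textbook definitions. The function name `leaky_relu` is hypothetical, and the sketch assumes the operator follows these standard formulas. The `prelu` and `rrelu` types are omitted because, as the parameter list indicates, their slope is initialized or drawn at random rather than fixed.

```python
import math


def leaky_relu(x, act_type="leaky", slope=0.25):
    """Scalar reference for two of the activation types (illustrative only)."""
    if x > 0:
        return x                              # positive inputs pass through
    if act_type == "leaky":
        return slope * x                      # small linear response below zero
    if act_type == "elu":
        return slope * (math.exp(x) - 1.0)    # saturates towards -slope
    raise ValueError("unsupported act_type: %r" % act_type)


print(leaky_relu(2.0), leaky_relu(-4.0), leaky_relu(-1.0, "elu", 1.0))
```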
mxnet.symbol.LinearRegressionOutput(*args, **kwargs)

Use linear regression for the final output. This operator is applied to the final output of a network.

Parameters:

- data (Symbol) – Input data to function.
- label (Symbol) – Input label to function.
- grad_scale (float, optional, default=1) – Scale the gradient by a float factor.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.LogisticRegressionOutput(*args, **kwargs)

Use logistic regression for the final output. This operator is applied to the final output of a network. Logistic regression is suitable for binary classification or probability prediction tasks.

Parameters:

- data (Symbol) – Input data to function.
- label (Symbol) – Input label to function.
- grad_scale (float, optional, default=1) – Scale the gradient by a float factor.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.MAERegressionOutput(*args, **kwargs)

Use mean absolute error regression for the final output. This operator is applied to the final output of a network.

Parameters:

- data (Symbol) – Input data to function.
- label (Symbol) – Input label to function.
- grad_scale (float, optional, default=1) – Scale the gradient by a float factor.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.MakeLoss(*args, **kwargs)

Get the output from a symbol and pass a gradient of 1 back. This is used as a terminal loss when unary and binary operators are used to compose a loss with no declaration of backward dependency.

Parameters:

- data (Symbol) – Input data.
- grad_scale (float, optional, default=1) – Gradient scale as a supplement to unary and binary operators.
- valid_thresh (float, optional, default=0) – Regard an element as valid when x > valid_thresh. This is used only in valid normalization mode.
- normalization ({'batch', 'null', 'valid'}, optional, default='null') – If set to null, the op will not normalize the output gradient. If set to batch, the op will normalize the gradient by dividing by the batch size. If set to valid, the op will normalize the gradient by dividing by the number of samples marked as valid.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.Pad(*args, **kwargs)

Pads an n-dimensional input tensor. Allows for precise control of the padding type and how much padding to apply on both sides of a given dimension.

Parameters:

- data (Symbol) – An n-dimensional input tensor.
- mode ({'constant', 'edge'}, required) – Padding type to use. “constant” pads all values with a constant value, the value of which can be specified with the constant_value option. “edge” uses the boundary values of the array as padding.
- pad_width (Shape(tuple), required) – A tuple of padding widths of length 2*r, where r is the rank of the input tensor, specifying the number of values padded to the edges of each axis. (before_1, after_1, ... , before_N, after_N) gives unique pad widths for each axis. Equivalent to pad_width in numpy.pad, but flattened.
- constant_value (double, optional, default=0) – This option is only used when mode is “constant”. This value will be used as the padding value. Defaults to 0 if not specified.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

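The flattened `pad_width` layout can be easy to get wrong, so the following plain-Python sketch shows how a nested, numpy.pad-style specification maps onto it, what output shape results, and what the two modes do along a single axis. All three helper names are hypothetical and exist only for this illustration.

```python
def flatten_pad_width(nested):
    """Turn ((before_1, after_1), ..., (before_N, after_N)) into one flat tuple."""
    return tuple(width for pair in nested for width in pair)


def padded_shape(shape, pad_width):
    """Output shape after padding; pad_width must have length 2 * rank."""
    if len(pad_width) != 2 * len(shape):
        raise ValueError("pad_width must have length 2 * rank")
    return tuple(dim + pad_width[2 * i] + pad_width[2 * i + 1]
                 for i, dim in enumerate(shape))


def pad_1d(values, before, after, mode="constant", constant_value=0):
    """Pad a single axis, mirroring the 'constant' and 'edge' modes."""
    if mode == "constant":
        left, right = constant_value, constant_value
    elif mode == "edge":
        left, right = values[0], values[-1]   # repeat the boundary values
    else:
        raise ValueError("unsupported mode: %r" % mode)
    return [left] * before + list(values) + [right] * after


# Pad only the two spatial axes of a (batch, channel, height, width) tensor.
flat = flatten_pad_width(((0, 0), (0, 0), (1, 1), (2, 2)))
print(flat, padded_shape((1, 3, 4, 5), flat))
```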
mxnet.symbol.Pooling(*args, **kwargs)

Perform spatial pooling on inputs.

Parameters:

- data (Symbol) – Input data to the pooling operator.
- global_pool (boolean, optional, default=False) – Ignore the kernel size and do global pooling based on the current input feature map. This is useful for inputs with different shapes.
- kernel (Shape(tuple), required) – Pooling kernel size: (y, x) or (d, y, x).
- pool_type ({'avg', 'max', 'sum'}, required) – Pooling type to be applied.
- pooling_convention ({'full', 'valid'}, optional, default='valid') – Pooling convention to be applied. valid (kValid) is the default setting of MXNet and rounds down the output pooling size. full (kFull) is compatible with Caffe and rounds up the output pooling size.
- stride (Shape(tuple), optional, default=(1,1)) – Stride for pooling: (y, x) or (d, y, x).
- pad (Shape(tuple), optional, default=(0,0)) – Pad for pooling: (y, x) or (d, y, x).
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

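The difference between the two pooling conventions is only the rounding direction of the output size. The sketch below computes the output length along one spatial axis, assuming the usual formula `(in + 2 * pad - kernel) / stride + 1` with floor for `valid` and ceiling for `full`. The function name `pooled_size` is hypothetical, and the formula is stated here as an assumption rather than quoted from the operator source.

```python
def pooled_size(in_size, kernel, stride=1, pad=0, pooling_convention="valid"):
    """Output length along one axis under the assumed floor/ceil formulas."""
    span = in_size + 2 * pad - kernel
    if pooling_convention == "valid":
        return span // stride + 1          # round down
    if pooling_convention == "full":
        return -(-span // stride) + 1      # round up (integer ceiling)
    raise ValueError("unsupported convention: %r" % pooling_convention)


# A 7-wide input with a 2-wide kernel and stride 2 shows the difference.
print(pooled_size(7, 2, stride=2), pooled_size(7, 2, stride=2, pooling_convention="full"))
```

When the span divides evenly by the stride, both conventions give the same size, so the choice only matters for inputs that do not tile exactly.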
mxnet.symbol.RNN(*args, **kwargs)

Apply a recurrent layer to input.

Parameters:

- data (Symbol) – Input data to RNN.
- parameters (Symbol) – Vector of all RNN trainable parameters, concatenated.
- state (Symbol) – Initial hidden state of the RNN.
- state_cell (Symbol) – Initial cell state for LSTM networks (only for LSTM).
- state_size (int (non-negative), required) – Size of the state for each layer.
- num_layers (int (non-negative), required) – Number of stacked layers.
- bidirectional (boolean, optional, default=False) – Whether to use bidirectional recurrent layers.
- mode ({'gru', 'lstm', 'rnn_relu', 'rnn_tanh'}, required) – The type of RNN to compute.
- p (float, optional, default=0) – Dropout probability, the fraction of the input that gets dropped out at training time.
- state_outputs (boolean, optional, default=False) – Whether to have the states as symbol outputs.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.ROIPooling(*args, **kwargs)

Performs region-of-interest pooling on inputs. Bounding box coordinates are resized by spatial_scale and the input feature maps are cropped accordingly. The cropped feature maps are max-pooled to a fixed-size output indicated by pooled_size. After ROIPooling, batch_size changes to the number of region bounding boxes.

Parameters:

- data (Symbol) – Input data to the pooling operator, a 4D feature map.
- rois (Symbol) – Bounding box coordinates, a 2D array of [[batch_index, x1, y1, x2, y2]]. (x1, y1) and (x2, y2) are the top-left and bottom-right corners of the designated region of interest. batch_index indicates the index of the corresponding image in the input data.
- pooled_size (Shape(tuple), required) – Fixed pooled size: (h, w).
- spatial_scale (float, required) – Ratio of input feature map height (or width) to raw image height (or width). Equals the reciprocal of the total stride in the convolutional layers.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.Reshape(*args, **kwargs)

Reshape input according to a target shape spec. The target shape is a tuple and can be a simple list of dimensions such as (12,3) or it can incorporate special codes that correspond to contextual operations that refer to the input dimensions. The special codes are all expressed as integers less than 1. These codes effectively refer to a machine that pops input dims off the beginning of the input dims list and pushes resulting output dims onto the end of the output dims list, which starts empty. The codes are:

| Code | Name | Behavior |
|------|------|----------|
| 0 | Copy | Pop one input dim and push it onto the output dims |
| -1 | Infer | Push a dim that is inferred later from all other output dims |
| -2 | CopyAll | Pop all remaining input dims and push them onto the output dims |
| -3 | Merge2 | Pop two input dims, multiply them, and push the result |
| -4 | Split2 | Pop one input dim, read the next two target shape specs, and push them both onto the output dims (either can be -1 and will be inferred from the other) |

The exact mathematical behavior of these codes is given in the description of the ‘shape’ parameter. All non-codes (positive integers) just pop a dim off the input dims (if any), throw it away, and then push the specified integer onto the output dims.

Examples:

| Type | Input | Target | Output |
|------|-------|--------|--------|
| Copy | (2,3,4) | (4,0,2) | (4,3,2) |
| Copy | (2,3,4) | (2,0,0) | (2,3,4) |
| Infer | (2,3,4) | (6,1,-1) | (6,1,4) |
| Infer | (2,3,4) | (3,-1,8) | (3,1,8) |
| CopyAll | (9,8,7) | (-2) | (9,8,7) |
| CopyAll | (9,8,7) | (9,-2) | (9,8,7) |
| CopyAll | (9,8,7) | (-2,1,1) | (9,8,7,1,1) |
| Merge2 | (3,4) | (-3) | (12) |
| Merge2 | (3,4,5) | (-3,0) | (12,5) |
| Merge2 | (3,4,5) | (0,-3) | (3,20) |
| Merge2 | (3,4,5,6) | (-3,0,0) | (12,5,6) |
| Merge2 | (3,4,5,6) | (-3,-2) | (12,5,6) |
| Split2 | (12) | (-4,6,2) | (6,2) |
| Split2 | (12) | (-4,2,6) | (2,6) |
| Split2 | (12) | (-4,-1,6) | (2,6) |
| Split2 | (12,9) | (-4,2,6,0) | (2,6,9) |
| Split2 | (12,9,9,9) | (-4,2,6,-2) | (2,6,9,9,9) |
| Split2 | (12,12) | (-4,2,-1,-4,-1,2) | (2,6,6,2) |

From:src/operator/tensor/matrix_op.cc:64

Parameters:

- data (NDArray) – Input data to reshape.
- target_shape (Shape(tuple), optional, default=(0,0)) – (Deprecated! Use shape instead.) Target new shape. One and only one dim can be 0, in which case it will be inferred from the rest of the dims.
- keep_highest (boolean, optional, default=False) – (Deprecated! Use shape instead.) Whether to keep the highest dim unchanged. If set to true, the first dim in target_shape is ignored and always fixed as the input.
- shape (Shape(tuple), optional, default=()) – Target shape, a tuple, t=(t_1,t_2,..,t_m). Let the input dims be s=(s_1,s_2,..,s_n). The output dims u=(u_1,u_2,..,u_p) are computed from s and t. The target shape tuple elements t_i are read in order and used to generate successive output dims u_p:

  | t_i | meaning | behavior |
  |-----|---------|----------|
  | +ve | explicit | u_p = t_i |
  | 0 | copy | u_p = s_i |
  | -1 | infer | u_p = (Prod s_i) / (Prod u_j \| j != p) |
  | -2 | copy all | u_p = s_i, u_p+1 = s_i+1, ... |
  | -3 | merge two | u_p = s_i * s_i+1 |
  | -4,a,b | split two | u_p = a, u_p+1 = b \| a * b = s_i |

  The split directive (-4) in the target shape tuple is followed by two dimensions, one of which can be -1, which means it will be inferred from the other one and the original dimension. There can only be one globally inferred dimension (-1), aside from any -1 occurring in a split directive.
- reverse (boolean, optional, default=False) – Whether to match the shapes from the back. If reverse is true, 0 values in the shape argument are matched starting from the back. For example, if the original shape is (10, 5, 4) and the shape argument is (-1, 0), the new shape is (50, 4) when reverse is true and (40, 5) otherwise.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

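The pop/push machine described above can be sketched in a few lines of plain Python, which may help when checking a target shape by hand. The function name `infer_reshape` is hypothetical. The sketch ignores the `reverse` option, does no error checking, and assumes that the -1 code also consumes one input dim, which is consistent with the (10, 5, 4) with (-1, 0) giving (40, 5) example in the parameter description.

```python
from math import prod  # Python 3.8+


def infer_reshape(in_shape, target):
    """Return the output shape produced by a target shape spec (illustrative)."""
    src = list(in_shape)   # input dims still waiting to be popped
    out = []               # output dims pushed so far
    infer_at = None        # position of the globally inferred dim, if any
    i = 0
    while i < len(target):
        code = target[i]
        if code == 0:                      # Copy
            out.append(src.pop(0))
        elif code == -1:                   # Infer (assumed to consume one input dim)
            if src:
                src.pop(0)
            infer_at = len(out)
            out.append(1)                  # placeholder, fixed up below
        elif code == -2:                   # CopyAll
            out.extend(src)
            src = []
        elif code == -3:                   # Merge2
            out.append(src.pop(0) * src.pop(0))
        elif code == -4:                   # Split2: read the next two specs
            dim = src.pop(0)
            a, b = target[i + 1], target[i + 2]
            i += 2
            if a == -1:
                a = dim // b
            elif b == -1:
                b = dim // a
            out.extend([a, b])
        else:                              # explicit positive dim
            if src:
                src.pop(0)                 # discard one input dim, if any
            out.append(code)
        i += 1
    if infer_at is not None:
        # The placeholder is 1, so prod(out) is the product of the other dims.
        out[infer_at] = prod(in_shape) // prod(out)
    return tuple(out)


print(infer_reshape((12, 12), (-4, 2, -1, -4, -1, 2)))
```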
mxnet.symbol.SVMOutput(*args, **kwargs)

Support Vector Machine based transformation on the input. Backpropagates the L2-SVM objective by default.

Parameters:

- data (Symbol) – Input data to svm.
- label (Symbol) – Label data.
- margin (float, optional, default=1) – Scale the DType(param_.margin) for activation size.
- regularization_coefficient (float, optional, default=1) – Scale the coefficient responsible for balancing coefficient size and error tradeoff.
- use_linear (boolean, optional, default=False) – If set to true, uses the L1-SVM objective function. The default uses the L2-SVM objective.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.SequenceLast(*args, **kwargs)

Takes the last element of a sequence. Takes an n-dimensional tensor of the form [max sequence length, batchsize, other dims] and returns an (n-1)-dimensional tensor of the form [batchsize, other dims]. This operator takes an optional input tensor sequence_length of positive ints of dimension [batchsize] when the use_sequence_length option is set to true. This allows the operator to handle variable-length sequences. If use_sequence_length is false, then each example in the batch is assumed to have the max sequence length.

Parameters:

- data (Symbol) – n-dimensional input tensor of the form [max sequence length, batchsize, other dims].
- sequence_length (Symbol) – Vector of sequence lengths of size batchsize.
- use_sequence_length (boolean, optional, default=False) – If set to true, this layer takes an extra input sequence_length to specify variable-length sequences.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

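A small plain-Python sketch of the indexing involved, with nested lists laid out as `data[t][b]` (time first, then batch) and the "other dims" collapsed to a single value. The function name `sequence_last` is hypothetical, and passing `sequence_length=None` stands in for `use_sequence_length=False`.

```python
def sequence_last(data, sequence_length=None):
    """Pick the last valid time step for every batch entry (illustrative)."""
    max_len = len(data)
    batch_size = len(data[0])
    if sequence_length is None:            # every sequence has the max length
        sequence_length = [max_len] * batch_size
    return [data[sequence_length[b] - 1][b] for b in range(batch_size)]


# Three time steps, batch of two.
data = [[1, 2],
        [3, 4],
        [5, 6]]
print(sequence_last(data), sequence_last(data, [2, 3]))
```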
mxnet.symbol.SequenceMask(*args, **kwargs)

Sets all elements outside the sequence to a constant value. Takes an n-dimensional tensor of the form [max sequence length, batchsize, other dims] and returns a tensor of the same shape. This operator takes an optional input tensor sequence_length of positive ints of dimension [batchsize] when the use_sequence_length option is set to true. This allows the operator to handle variable-length sequences. If use_sequence_length is false, then each example in the batch is assumed to have the max sequence length, and this operator becomes the identity operator.

Parameters:

- data (Symbol) – n-dimensional input tensor of the form [max sequence length, batchsize, other dims].
- sequence_length (Symbol) – Vector of sequence lengths of size batchsize.
- use_sequence_length (boolean, optional, default=False) – If set to true, this layer takes an extra input sequence_length to specify variable-length sequences.
- value (float, optional, default=0) – The value to be used as a mask.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

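The masking rule can be illustrated with nested lists laid out as `data[t][b]`: positions whose time index is at or beyond the sequence length of their batch entry are overwritten with `value`. The function name `sequence_mask` is hypothetical, and `sequence_length=None` stands in for `use_sequence_length=False`, in which case the input is returned unchanged.

```python
def sequence_mask(data, sequence_length=None, value=0):
    """Overwrite steps past each sequence's length with `value` (illustrative)."""
    if sequence_length is None:            # identity when lengths are not used
        return [list(row) for row in data]
    return [[x if t < sequence_length[b] else value
             for b, x in enumerate(row)]
            for t, row in enumerate(data)]


steps = [[1, 2],
         [3, 4],
         [5, 6]]
print(sequence_mask(steps, [1, 3]))
```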
mxnet.symbol.SequenceReverse(*args, **kwargs)

Reverses the elements of each sequence. Takes an n-dimensional tensor of the form [max sequence length, batchsize, other dims] and returns a tensor of the same shape. This operator takes an optional input tensor sequence_length of positive ints of dimension [batchsize] when the use_sequence_length option is set to true. This allows the operator to handle variable-length sequences. If use_sequence_length is false, then each example in the batch is assumed to have the max sequence length.

Parameters:

- data (Symbol) – n-dimensional input tensor of the form [max sequence length, batchsize, other dims].
- sequence_length (Symbol) – Vector of sequence lengths of size batchsize.
- use_sequence_length (boolean, optional, default=False) – If set to true, this layer takes an extra input sequence_length to specify variable-length sequences.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

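The following sketch reverses each sequence along the time axis of a nested list laid out as `data[t][b]`. It assumes that, for sequences shorter than the maximum length, only the first `sequence_length[b]` steps are reversed and the padded tail stays where it is; that detail is an assumption of this sketch, and the function name `sequence_reverse` is hypothetical.

```python
def sequence_reverse(data, sequence_length=None):
    """Reverse the valid prefix of every sequence (illustrative)."""
    max_len = len(data)
    batch_size = len(data[0])
    if sequence_length is None:            # every sequence has the max length
        sequence_length = [max_len] * batch_size
    out = [list(row) for row in data]      # copy, so padding is kept as-is
    for b in range(batch_size):
        n = sequence_length[b]
        for t in range(n):
            out[t][b] = data[n - 1 - t][b]
    return out


frames = [[1, 2],
          [3, 4],
          [5, 6]]
print(sequence_reverse(frames, [2, 3]))
```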
mxnet.symbol.SliceChannel(*args, **kwargs)

Slice the input equally along the specified axis.

Parameters:

- num_outputs (int, required) – Number of outputs to be sliced.
- axis (int, optional, default='1') – Dimension along which to slice.
- squeeze_axis (boolean, optional, default=False) – If true, the dimension will be squeezed. Also, input.shape[axis] must be the same as num_outputs when squeeze_axis is turned on.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

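As an illustration of equal slicing and of the `squeeze_axis` constraint, the sketch below splits a 2D nested list along axis 1. The function name `slice_channel` is hypothetical and the sketch handles only this 2D, axis=1 case.

```python
def slice_channel(rows, num_outputs, squeeze_axis=False):
    """Split each row into num_outputs equal pieces along axis 1 (illustrative)."""
    width = len(rows[0])
    if width % num_outputs != 0:
        raise ValueError("axis length must be divisible by num_outputs")
    step = width // num_outputs
    if squeeze_axis and step != 1:
        raise ValueError("squeeze_axis requires axis length == num_outputs")
    outputs = []
    for k in range(num_outputs):
        part = [row[k * step:(k + 1) * step] for row in rows]
        if squeeze_axis:
            part = [piece[0] for piece in part]   # drop the length-1 axis
        outputs.append(part)
    return outputs


rows = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
print(slice_channel(rows, 2))
print(slice_channel(rows, 4, squeeze_axis=True))
```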
mxnet.symbol.Softmax(*args, **kwargs)

DEPRECATED: Perform a softmax transformation on the input. Please use SoftmaxOutput instead.

Parameters:

- data (Symbol) – Input data to softmax.
- grad_scale (float, optional, default=1) – Scale the gradient by a float factor.
- ignore_label (float, optional, default=-1) – The label value that will be ignored during backward (only works if use_ignore is set to true).
- multi_output (boolean, optional, default=False) – If set to true, for a (n,k,x_1,..,x_n) dimensional input tensor, softmax will generate n*x_1*...*x_n outputs, each with k classes.
- use_ignore (boolean, optional, default=False) – If set to true, the ignore_label value will not contribute to the backward gradient.
- preserve_shape (boolean, optional, default=False) – If true, for a (n_1, n_2, ..., n_d, k) dimensional input tensor, softmax will generate an (n_1, n_2, ..., n_d, k) output, normalizing the k classes as the last dimension.
- normalization ({'batch', 'null', 'valid'}, optional, default='null') – If set to null, the op will do nothing to the output gradient. If set to batch, the op will normalize the gradient by dividing by the batch size. If set to valid, the op will normalize the gradient by dividing by the number of samples not ignored.
- out_grad (boolean, optional, default=False) – Apply weighting from the output gradient.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.SoftmaxActivation(*args, **kwargs)

Apply softmax activation to input. This is intended for internal layers. For output (loss layer) please use SoftmaxOutput. If mode=instance, this operator will compute a softmax for each instance in the batch; this is the default mode. If mode=channel, this operator will compute a num_channel-class softmax at each position of each instance; this can be used for fully convolutional network, image segmentation, etc.

Parameters:

- data (Symbol) – Input data to activation function.
- mode ({'channel', 'instance'}, optional, default='instance') – Softmax mode. If set to instance, this operator will compute a softmax for each instance in the batch; this is the default mode. If set to channel, this operator will compute a num_channel-class softmax at each position of each instance; this can be used for fully convolutional networks, image segmentation, etc.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

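The two modes differ only in which axis is normalized. The sketch below illustrates this with nested lists: in instance mode the input is laid out as `batch[b][k]`, and in channel mode as `batch[b][c][p]`, where `p` is a flattened spatial position. The function names are hypothetical, and subtracting the maximum before exponentiating is a common numerical-stability choice of this sketch, not something the text above specifies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_activation(batch, mode="instance"):
    """Apply softmax per instance or per position across channels (illustrative)."""
    if mode == "instance":                 # batch[b][k]
        return [softmax(row) for row in batch]
    if mode == "channel":                  # batch[b][c][p]
        out = []
        for instance in batch:
            # zip(*instance) walks positions, yielding one score per channel.
            per_position = [softmax(list(col)) for col in zip(*instance)]
            # Transpose back to the [c][p] layout.
            out.append([list(channel) for channel in zip(*per_position)])
        return out
    raise ValueError("unsupported mode: %r" % mode)


print(softmax_activation([[0.0, 0.0]]))
print(softmax_activation([[[0.0, 0.0], [0.0, math.log(3)]]], mode="channel"))
```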
mxnet.symbol.SoftmaxOutput(*args, **kwargs)

Perform a softmax transformation on the input and backpropagate with log loss.

Parameters:

- data (Symbol) – Input data to softmax.
- label (Symbol) – Label data. Can also be probability values with the same shape as data.
- grad_scale (float, optional, default=1) – Scale the gradient by a float factor.
- ignore_label (float, optional, default=-1) – The label value that will be ignored during backward (only works if use_ignore is set to true).
- multi_output (boolean, optional, default=False) – If set to true, for a (n,k,x_1,..,x_n) dimensional input tensor, softmax will generate n*x_1*...*x_n outputs, each with k classes.
- use_ignore (boolean, optional, default=False) – If set to true, the ignore_label value will not contribute to the backward gradient.
- preserve_shape (boolean, optional, default=False) – If true, for a (n_1, n_2, ..., n_d, k) dimensional input tensor, softmax will generate an (n_1, n_2, ..., n_d, k) output, normalizing the k classes as the last dimension.
- normalization ({'batch', 'null', 'valid'}, optional, default='null') – If set to null, the op will do nothing to the output gradient. If set to batch, the op will normalize the gradient by dividing by the batch size. If set to valid, the op will normalize the gradient by dividing by the number of samples not ignored.
- out_grad (boolean, optional, default=False) – Apply weighting from the output gradient.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.SpatialTransformer(*args, **kwargs)

Apply a spatial transformer to the input feature map.

Parameters:

- data (Symbol) – Input data to the SpatialTransformerOp.
- loc (Symbol) – Localisation net. The output dim should be 6 when transform_type is affine. You should initialize the weight and bias with the identity transform.
- target_shape (Shape(tuple), optional, default=(0,0)) – Output shape (h, w) of the spatial transformer: (y, x).
- transform_type ({'affine'}, required) – Transformation type.
- sampler_type ({'bilinear'}, required) – Sampling type.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.SwapAxis(*args, **kwargs)

Swap two axes of the input.

Parameters:

- data (Symbol) – Input data to the SwapAxisOp.
- dim1 (int (non-negative), optional, default=0) – The first axis to be swapped.
- dim2 (int (non-negative), optional, default=0) – The second axis to be swapped.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.UpSampling(*args, **kwargs)

Perform nearest neighbor or bilinear upsampling on the inputs. This function supports a variable number of positional inputs.

Parameters:

- data (Symbol[]) – Array of tensors to upsample.
- scale (int (non-negative), required) – Upsampling scale.
- num_filter (int (non-negative), optional, default=0) – Input filter. Only used by the bilinear sample_type.
- sample_type ({'bilinear', 'nearest'}, required) – Upsampling method.
- multi_input_mode ({'concat', 'sum'}, optional, default='concat') – How to handle multiple inputs. concat means concatenate the upsampled images along the channel dimension. sum means add all images together; this is only available for nearest neighbor upsampling.
- num_args (int, required) – Number of inputs to be upsampled. For nearest neighbor upsampling, this can be 1-N; the size of the output will be (scale*h_0, scale*w_0) and all other inputs will be upsampled to the same size. For bilinear upsampling this must be 2: 1 input and 1 weight.
- workspace (long (non-negative), optional, default=512) – Temporary workspace for deconvolution (MB).
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

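For the `nearest` sample type with a single input, the effect of `scale` on one 2D feature map can be sketched as repeating every value `scale` times along both spatial axes, so that an (h, w) map becomes (scale*h, scale*w). The function name `upsample_nearest` is hypothetical, and the sketch covers neither the bilinear mode nor multiple inputs.

```python
def upsample_nearest(image, scale):
    """Repeat each pixel `scale` times along height and width (illustrative)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(scale)]   # widen the row
        out.extend(list(wide) for _ in range(scale))            # repeat the row
    return out


print(upsample_nearest([[1, 2], [3, 4]], 2))
```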
mxnet.symbol.abs(*args, **kwargs)

Take the absolute value of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:83

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.adam_update(*args, **kwargs)

mxnet.symbol.arccos(*args, **kwargs)

Take the arccos of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:236

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.arccosh(*args, **kwargs)

Take the arccosh of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:308

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.arcsin(*args, **kwargs)

Take the arcsin of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:227

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.arcsinh(*args, **kwargs)

Take the arcsinh of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:299

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.arctan(*args, **kwargs)

Take the arctan of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:245

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.arctanh(*args, **kwargs)

Take the arctanh of the source input.

From:src/operator/tensor/elemwise_unary_op.cc:317

Parameters:

- data (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.argmax(*args, **kwargs)

Compute argmax.

Parameters:

- data (NDArray) – Source input.
- axis (int, optional, default='-1') – Empty or unsigned. The axis to perform the reduction. If left empty, a global reduction will be performed.
- keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.argmax_channel(*args, **kwargs)

Parameters:

- src (NDArray) – Source input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.argmin(*args, **kwargs)

Compute argmin.

Parameters:

- data (NDArray) – Source input.
- axis (int, optional, default='-1') – Empty or unsigned. The axis to perform the reduction. If left empty, a global reduction will be performed.
- keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.argsort(*args, **kwargs)

Returns the indices that would sort an array.

From:src/operator/tensor/ordering_op.cc:89

Parameters:

- src (NDArray) – Source input.
- axis (int or None, optional, default='-1') – Axis along which to sort the input tensor. If not given, the flattened array is used. Default is -1.
- is_ascend (boolean, optional, default=True) – Whether to sort in ascending or descending order.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

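For a one-dimensional input, "the indices that would sort an array" can be illustrated in plain Python as follows. The function name `argsort_1d` is hypothetical, and how the operator orders equal values is not covered by this sketch.

```python
def argsort_1d(values, is_ascend=True):
    """Indices that would sort `values` (illustrative, 1D only)."""
    return sorted(range(len(values)),
                  key=lambda i: values[i],
                  reverse=not is_ascend)


scores = [3.0, 1.0, 2.0]
order = argsort_1d(scores)
print(order, [scores[i] for i in order])
```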
mxnet.symbol.batch_dot(*args, **kwargs)

Calculate the batched dot product of two matrices: (batch, M, K) X (batch, K, N) -> (batch, M, N).

From:src/operator/tensor/matrix_op.cc:269

Parameters:

- lhs (NDArray) – Left input.
- rhs (NDArray) – Right input.
- transpose_a (boolean, optional, default=False) – True if the first matrix is transposed.
- transpose_b (boolean, optional, default=False) – True if the second matrix is transposed.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

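The shape rule (batch, M, K) X (batch, K, N) -> (batch, M, N) amounts to one ordinary matrix product per batch entry. The plain-Python sketch below makes that explicit using nested lists. The helper names are hypothetical, and the `transpose_a` and `transpose_b` flags are interpreted here as "transpose this operand before multiplying", which is an assumption of the sketch.

```python
def transpose(matrix):
    """Swap rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Plain (M, K) x (K, N) matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def batch_dot(lhs, rhs, transpose_a=False, transpose_b=False):
    """One matrix product per batch entry (illustrative)."""
    out = []
    for a, b in zip(lhs, rhs):
        if transpose_a:
            a = transpose(a)
        if transpose_b:
            b = transpose(b)
        out.append(matmul(a, b))
    return out


lhs = [[[1, 2], [3, 4]]]   # shape (1, 2, 2)
rhs = [[[5, 6], [7, 8]]]   # shape (1, 2, 2)
print(batch_dot(lhs, rhs))
```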
mxnet.symbol.batch_take(*args, **kwargs)

Take a scalar value from each vector in a batch of data vectors according to an index vector, i.e. out[i] = a[i, indices[i]]. Out-of-bound indices are clipped to the boundary.

From:src/operator/tensor/indexing_op.cc:100

Parameters:

- a (NDArray) – Input data array.
- indices (NDArray) – Index array.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

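The rule out[i] = a[i, indices[i]] with clipping can be written out directly in plain Python. The function name `batch_take_rows` is hypothetical and the inputs are nested lists rather than NDArrays.

```python
def batch_take_rows(a, indices):
    """Pick one value per row, clipping indices to the valid range (illustrative)."""
    out = []
    for row, idx in zip(a, indices):
        clipped = min(max(idx, 0), len(row) - 1)   # clip to [0, len(row) - 1]
        out.append(row[clipped])
    return out


table = [[1, 2, 3],
         [4, 5, 6]]
print(batch_take_rows(table, [0, 5]))
```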
mxnet.symbol.broadcast_add(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_axis(*args, **kwargs)

Parameters:

- data (NDArray) – Source input.
- axis (Shape(tuple), optional, default=()) – The axes to perform the broadcasting.
- size (Shape(tuple), optional, default=()) – Target sizes of the broadcasting axes.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_div(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_equal(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_greater(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_greater_equal(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_hypot(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_lesser(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_lesser_equal(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_maximum(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_minimum(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_minus(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol

mxnet.symbol.broadcast_mul(*args, **kwargs)

Parameters:

- lhs (NDArray) – First input.
- rhs (NDArray) – Second input.
- name (string, optional) – Name of the resulting symbol.

Returns: symbol – The result symbol.

Return type: Symbol
mxnet.symbol.broadcast_not_equal(*args, **kwargs)
Parameters: lhs (NDArray) – first input rhs (NDArray) – second input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.broadcast_plus(*args, **kwargs)
Parameters: lhs (NDArray) – first input rhs (NDArray) – second input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.broadcast_power(*args, **kwargs)
Parameters: lhs (NDArray) – first input rhs (NDArray) – second input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.broadcast_sub(*args, **kwargs)
Parameters: lhs (NDArray) – first input rhs (NDArray) – second input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.broadcast_to(*args, **kwargs)

Broadcast src to shape. If shape[i] is 0, input size will be preserved for axis i.

Parameters: data (NDArray) – Source input shape (Shape(tuple), optional, default=()) – The shape of the desired array. Set a dimension to zero to keep it the same as in the original. E.g., A = broadcast_to(B, shape=(10, 0, 0)) has the same meaning as A = broadcast_axis(B, axis=0, size=10). name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
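A NumPy sketch of the "zero means keep the input size" convention may help. It assumes `shape` has as many entries as the input has axes; `broadcast_to_reference` is a hypothetical helper, not part of MXNet.

```python
import numpy as np

def broadcast_to_reference(data, shape):
    """Broadcast `data`, treating 0 in `shape` as 'keep this axis size'."""
    data = np.asarray(data)
    target = tuple(d if s == 0 else s for s, d in zip(shape, data.shape))
    return np.broadcast_to(data, target)

b = np.arange(6).reshape(1, 2, 3)
a = broadcast_to_reference(b, (10, 0, 0))
print(a.shape)
```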
mxnet.symbol.cast(*args, **kwargs)

Convert data type to dtype

From:src/operator/tensor/elemwise_unary_op.cc:58

Parameters: data (NDArray) – Source input dtype ({'float16', 'float32', 'float64', 'int32', 'uint8'}, required) – Output data type. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.ceil(*args, **kwargs)

Take ceil of the src

From:src/operator/tensor/elemwise_unary_op.cc:107

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.choose_element_0index(*args, **kwargs)

Choose one element from each line (row for Python, column for R/Julia) in lhs according to the index given by rhs. This function assumes rhs uses 0-based indexing.

Parameters: lhs (NDArray) – Left operand to the function. rhs (NDArray) – Right operand to the function. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.clip(*args, **kwargs)

Clip ndarray elements to range (a_min, a_max)

From:src/operator/tensor/matrix_op.cc:301

Parameters: data (NDArray) – Source input a_min (float, required) – Minimum value a_max (float, required) – Maximum value name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.cos(*args, **kwargs)

Take cos of the src

From:src/operator/tensor/elemwise_unary_op.cc:209

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.cosh(*args, **kwargs)

Take cosh of the src

From:src/operator/tensor/elemwise_unary_op.cc:281

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.crop(*args, **kwargs)

Crop the input tensor and return a new one.

• the input and output (if explicitly given) are of the same data type, and on the same device.

From:src/operator/tensor/matrix_op.cc:145

Parameters: data (NDArray) – Source input begin (, required) – starting coordinates end (, required) – ending coordinates name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.degrees(*args, **kwargs)

Take degrees of the src

From:src/operator/tensor/elemwise_unary_op.cc:254

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.dot(*args, **kwargs)

Calculate the dot product of two matrices or two vectors. If the inputs have more than two dimensions, the dot is taken over the last (or first if transpose_a is true) axis of lhs and the first (or last if transpose_b is true) axis of rhs. The shape of the result array is the remaining axes of lhs and rhs concatenated.

From:src/operator/tensor/matrix_op.cc:243

Parameters: lhs (NDArray) – Left input rhs (NDArray) – Right input transpose_a (boolean, optional, default=False) – True if the first matrix is transposed. transpose_b (boolean, optional, default=False) – True if the second matrix is transposed. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
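The axis rule in the description of dot can be written as a NumPy `tensordot` call. This is an illustrative reading of the text above, with a hypothetical helper name, and is not taken from the MXNet source.

```python
import numpy as np

def dot_reference(lhs, rhs, transpose_a=False, transpose_b=False):
    """Contract one axis of lhs with one axis of rhs, as described above."""
    lhs_axis = 0 if transpose_a else lhs.ndim - 1
    rhs_axis = rhs.ndim - 1 if transpose_b else 0
    # tensordot keeps the remaining lhs axes, then the remaining rhs axes.
    return np.tensordot(lhs, rhs, axes=([lhs_axis], [rhs_axis]))

m = np.arange(6, dtype=np.float64).reshape(2, 3)
nd = dot_reference(np.ones((2, 3, 4)), np.ones((4, 5, 6)))
print(nd.shape)
```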
mxnet.symbol.elemwise_add(*args, **kwargs)
Parameters: lhs (NDArray) – first input rhs (NDArray) – second input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.exp(*args, **kwargs)

Take exp of the src

From:src/operator/tensor/elemwise_unary_op.cc:155

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.expand_dims(*args, **kwargs)

Expand the shape of array by inserting a new axis.

From:src/operator/tensor/matrix_op.cc:124

Parameters: data (NDArray) – Source input axis (int (non-negative), required) – Position (amongst axes) where new axis is to be inserted. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.expm1(*args, **kwargs)

Take exp(x) - 1 in a numerically stable way

From:src/operator/tensor/elemwise_unary_op.cc:200

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.fill_element_0index(*args, **kwargs)

Fill one element of each line (row for Python, column for R/Julia) in lhs according to the index given by rhs and the values given by mhs. This function assumes rhs uses 0-based indexing.

Parameters: lhs (NDArray) – Left operand to the function. mhs (NDArray) – Middle operand to the function. rhs (NDArray) – Right operand to the function. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
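The row-wise behaviour of choose_element_0index and fill_element_0index (in the Python row convention) can be sketched with NumPy fancy indexing. Both helper names below are hypothetical, and the sketch assumes a 2-D lhs.

```python
import numpy as np

def choose_element_0index_reference(lhs, rhs):
    """Return lhs[i, rhs[i]] for every row i."""
    rows = np.arange(lhs.shape[0])
    return lhs[rows, np.asarray(rhs, dtype=np.int64)]

def fill_element_0index_reference(lhs, mhs, rhs):
    """Return a copy of lhs with lhs[i, rhs[i]] replaced by mhs[i]."""
    out = lhs.copy()
    out[np.arange(lhs.shape[0]), np.asarray(rhs, dtype=np.int64)] = mhs
    return out

lhs = np.array([[1, 2, 3], [4, 5, 6]])
chosen = choose_element_0index_reference(lhs, [2, 0])
filled = fill_element_0index_reference(lhs, [-1, -2], [2, 0])
```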
mxnet.symbol.fix(*args, **kwargs)

Take round of the src to integer nearest 0

From:src/operator/tensor/elemwise_unary_op.cc:122

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.flip(*args, **kwargs)

Flip the input tensor along axis and return a new one.

From:src/operator/tensor/matrix_op.cc:225

Parameters: data (NDArray) – Source input axis (int, required) – The dimension to flip name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.floor(*args, **kwargs)

Take floor of the src

From:src/operator/tensor/elemwise_unary_op.cc:112

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.gamma(*args, **kwargs)

Take the gamma function (extension of the factorial function) of the src

From:src/operator/tensor/elemwise_unary_op.cc:326

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.gammaln(*args, **kwargs)

Take gammaln (log of the absolute value of gamma(x)) of the src

From:src/operator/tensor/elemwise_unary_op.cc:335

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.identity(*args, **kwargs)

Identity mapping, copy src to output

From:src/operator/tensor/elemwise_unary_op.cc:15

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.log(*args, **kwargs)

Take log of the src

From:src/operator/tensor/elemwise_unary_op.cc:161

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.log10(*args, **kwargs)

Take base-10 log of the src

From:src/operator/tensor/elemwise_unary_op.cc:167

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.log1p(*args, **kwargs)

Take log(1 + x) in a numerically stable way

From:src/operator/tensor/elemwise_unary_op.cc:191

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.log2(*args, **kwargs)

Take base-2 log of the src

From:src/operator/tensor/elemwise_unary_op.cc:173

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.max(*args, **kwargs)

Compute max along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
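The axis and keepdims conventions shared by the reduction operators (max, min, sum, mean, prod) can be mimicked in NumPy. One difference to keep in mind: an empty axis tuple means a global reduction here, whereas NumPy uses `axis=None` for that, so the hypothetical helper below translates between the two.

```python
import numpy as np

def reduce_reference(data, axis=(), keepdims=False, func=np.max):
    """Apply a NumPy reduction using the axis=() convention described above."""
    np_axis = None if axis == () else axis  # () means "reduce everything"
    return func(data, axis=np_axis, keepdims=keepdims)

x = np.array([[1.0, 5.0, 2.0], [4.0, 0.0, 3.0]])
global_max = reduce_reference(x)
row_max = reduce_reference(x, axis=1)
row_max_kept = reduce_reference(x, axis=1, keepdims=True)
```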
mxnet.symbol.max_axis(*args, **kwargs)

Compute max along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.mean(*args, **kwargs)

Compute mean src along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.min(*args, **kwargs)

Compute min along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.min_axis(*args, **kwargs)

Compute min along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.nanprod(*args, **kwargs)

Compute product of src along axis, ignoring NaN values. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.nansum(*args, **kwargs)

Sum src along axis, ignoring NaN values. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
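What "ignoring NaN values" means for nansum and nanprod can be seen with NumPy's functions of the same names, used here only as an analogy for the behaviour described above.

```python
import numpy as np

x = np.array([[1.0, np.nan], [2.0, 3.0]])

total = np.nansum(x)              # global reduction; the NaN entry is skipped
col_prod = np.nanprod(x, axis=0)  # per-column product; the NaN entry is skipped
plain_sum = np.sum(x)             # an ordinary sum propagates the NaN
```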
mxnet.symbol.negative(*args, **kwargs)

Negate src

From:src/operator/tensor/elemwise_unary_op.cc:77

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.norm(*args, **kwargs)
Parameters: src (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.normal(*args, **kwargs)

Sample a normal distribution

Parameters: loc (float, optional, default=0) – Mean of the distribution. scale (float, optional, default=1) – Standard deviation of the distribution. shape (Shape(tuple), optional, default=()) – The shape of the output ctx (string, optional, default='') – Context of output, in format [cpu|gpu|cpu_pinned](n). Only used for imperative calls. dtype ({'None', 'float16', 'float32', 'float64'}, optional, default='None') – DType of the output. If output given, set to type of output. If output not given and type not defined (dtype=None), set to float32. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.one_hot(*args, **kwargs)

Given an ndarray indices of locations at which to set on_value, and a depth, return an output ndarray of shape (shape(indices), depth). Every location not indicated in indices is set to off_value. If a location in indices is negative or greater than or equal to depth, the assignment of on_value for that location is ignored.

From:src/operator/tensor/indexing_op.cc:120

Parameters: indices (NDArray) – array of locations where to set on_value depth (int, required) – The dimension size at dim = axis. on_value (double, optional, default=1) – The value assigned to the locations represented by indices. off_value (double, optional, default=0) – The value assigned to the locations not represented by indices. dtype ({'float16', 'float32', 'float64', 'int32', 'uint8'},optional, default='float32') – DType of the output name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
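The one_hot rules above, including the treatment of out-of-range locations, can be sketched in NumPy. `one_hot_reference` is a hypothetical helper that follows the description, not the operator's actual code.

```python
import numpy as np

def one_hot_reference(indices, depth, on_value=1.0, off_value=0.0,
                      dtype=np.float32):
    """Build an array of shape indices.shape + (depth,)."""
    indices = np.asarray(indices, dtype=np.int64)
    out = np.full(indices.shape + (depth,), off_value, dtype=dtype)
    # Locations that are negative or >= depth are ignored.
    valid = (indices >= 0) & (indices < depth)
    out[valid, indices[valid]] = on_value
    return out

encoded = one_hot_reference([1, 0, 3, -1], depth=3)
print(encoded)
```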
mxnet.symbol.prod(*args, **kwargs)

Compute product of src along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.radians(*args, **kwargs)

Convert the src from degrees to radians

From:src/operator/tensor/elemwise_unary_op.cc:263

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.repeat(*args, **kwargs)

Repeat elements of an array

From:src/operator/tensor/matrix_op.cc:320

Parameters: data (NDArray) – Input data array repeats (int, required) – The number of repetitions for each element. axis (int or None, optional, default='None') – The axis along which to repeat values. Negative numbers are interpreted as counting from the end. By default, use the flattened input array, and return a flat output array. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
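The parameter description of repeat mirrors NumPy's `np.repeat`, so a NumPy call is a convenient way to see the effect of the axis argument. This is an analogy, assuming the operator follows the same conventions.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

flat = np.repeat(x, 2)           # no axis: flatten, then repeat each element
last = np.repeat(x, 2, axis=-1)  # negative axis counts from the end
```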
mxnet.symbol.rint(*args, **kwargs)

Take round of the src to nearest integer

From:src/operator/tensor/elemwise_unary_op.cc:117

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.rmsprop_update(*args, **kwargs)

Updater function for RMSProp optimizer. The RMSProp code follows the version in http://arxiv.org/pdf/1308.0850v5.pdf Eq(38) - Eq(45) by Alex Graves, 2013.
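For orientation, the update in the cited equations can be written as a few lines of NumPy. This is a sketch of the formulas in the referenced paper under the assumption that the operator follows them directly; the function name, argument names, and default values below are illustrative and are not the operator's signature.

```python
import numpy as np

def rmsprop_graves_step(weight, grad, n, g, delta,
                        lr=1e-4, gamma1=0.95, gamma2=0.9, epsilon=1e-4):
    """One RMSProp step in the style of Graves (2013); names are hypothetical."""
    n = gamma1 * n + (1.0 - gamma1) * grad ** 2   # running mean of squared gradients
    g = gamma1 * g + (1.0 - gamma1) * grad        # running mean of gradients
    # Momentum-style update scaled by the estimated gradient deviation.
    delta = gamma2 * delta - lr * grad / np.sqrt(n - g ** 2 + epsilon)
    return weight + delta, n, g, delta

w = np.array([1.0])
state = (np.zeros(1), np.zeros(1), np.zeros(1))
w_new, n_new, g_new, d_new = rmsprop_graves_step(w, np.array([2.0]), *state)
```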

mxnet.symbol.round(*args, **kwargs)

Take round of the src

From:src/operator/tensor/elemwise_unary_op.cc:101

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.rsqrt(*args, **kwargs)

Take reciprocal square root of the src

From:src/operator/tensor/elemwise_unary_op.cc:145

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.sgd_mom_update(*args, **kwargs)

Updater function for sgd optimizer

mxnet.symbol.sgd_update(*args, **kwargs)

Updater function for sgd optimizer

mxnet.symbol.sign(*args, **kwargs)

Take sign of the src

From:src/operator/tensor/elemwise_unary_op.cc:92

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.sin(*args, **kwargs)

Take sin of the src

From:src/operator/tensor/elemwise_unary_op.cc:182

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.sinh(*args, **kwargs)

Take sinh of the src

From:src/operator/tensor/elemwise_unary_op.cc:272

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.slice(*args, **kwargs)

Crop the input tensor and return a new one.

• the input and output (if explicitly given) are of the same data type, and on the same device.

From:src/operator/tensor/matrix_op.cc:145

Parameters: data (NDArray) – Source input begin (, required) – starting coordinates end (, required) – ending coordinates name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.slice_axis(*args, **kwargs)

Slice the input along a given axis and return a sliced array. The slice is taken from [begin, end). end can be None and axis can be negative.

From:src/operator/tensor/matrix_op.cc:206

Parameters: data (NDArray) – Source input axis (int, required) – The axis to be sliced. Negative axis means to count from the last to the first axis. begin (int, required) – The beginning index to be sliced. Negative values are interpreted as counting from the end. end (int or None, required) – The end index to be sliced. The end can be None, in which case all the remaining elements are used. Negative values are also interpreted as counting from the end. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
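The [begin, end) convention, including negative values and end=None, matches ordinary Python slicing, which the following NumPy sketch relies on. `slice_axis_reference` is a hypothetical helper for illustration.

```python
import numpy as np

def slice_axis_reference(data, axis, begin, end):
    """Slice `data` along one axis using the half-open range [begin, end)."""
    index = [slice(None)] * data.ndim
    index[axis] = slice(begin, end)  # Python slices handle negatives and None
    return data[tuple(index)]

x = np.arange(12).reshape(3, 4)
middle = slice_axis_reference(x, axis=1, begin=1, end=3)
tail = slice_axis_reference(x, axis=-1, begin=-2, end=None)
```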
mxnet.symbol.smooth_l1(*args, **kwargs)

Calculate Smooth L1 Loss(lhs, scalar)

From:src/operator/tensor/elemwise_binary_scalar_op_extended.cc:63

Parameters: data (NDArray) – source input scalar (float) – scalar input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.softmax_cross_entropy(*args, **kwargs)

Calculate cross_entropy(data, one_hot(label))

From:src/operator/loss_binary_op.cc:12

Parameters: data (NDArray) – Input data label (NDArray) – Input label name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
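One way to read cross_entropy(data, one_hot(label)) is sketched below in NumPy. It assumes a softmax over the last axis of a 2-D input and returns one loss per example; how the operator reduces over the batch is not stated above, so the sketch leaves that step out. The helper name is hypothetical.

```python
import numpy as np

def softmax_cross_entropy_reference(data, label):
    """Per-example -log(softmax(data)[label]); reduction over the batch omitted."""
    shifted = data - data.max(axis=1, keepdims=True)  # for numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(data.shape[0])
    return -log_prob[rows, np.asarray(label, dtype=np.int64)]

losses = softmax_cross_entropy_reference(np.zeros((2, 3)), [0, 2])
```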
mxnet.symbol.sort(*args, **kwargs)

Return a sorted copy of an array.

From:src/operator/tensor/ordering_op.cc:59

Parameters: src (NDArray) – Source input axis (int or None, optional, default='-1') – Axis along which to sort the input tensor. If not given, the flattened array is used. Default is -1. is_ascend (boolean, optional, default=True) – Whether to sort in ascending or descending order. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.sqrt(*args, **kwargs)

Take square root of the src

From:src/operator/tensor/elemwise_unary_op.cc:136

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.square(*args, **kwargs)

Take square of the src

From:src/operator/tensor/elemwise_unary_op.cc:127

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.sum(*args, **kwargs)

Sum src along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.sum_axis(*args, **kwargs)

Sum src along axis. If axis is empty, global reduction is performed

Parameters: data (NDArray) – Source input axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed. keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as a dimension with size one. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.take(*args, **kwargs)

Take row vectors from an NDArray according to the indices. For an input of index with shape (d1, ..., dK), the output shape is (d1, ..., dK, row_vector_length). All the input values should be integers in the range [0, column_vector_length).

From:src/operator/tensor/indexing_op.cc:60

Parameters: a (Symbol) – The source array. indices (Symbol) – The indices of the values to extract. axis (int, optional, default='0') – The axis of the data tensor to be taken. mode ({'clip', 'raise', 'wrap'}, optional, default='raise') – Specify how out-of-bound indices behave. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
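The output shape rule (d1, ..., dK, row_vector_length) and the three out-of-bound modes have direct counterparts in `np.take`, so a NumPy sketch can illustrate them. `take_reference` is a hypothetical wrapper and only covers axis 0.

```python
import numpy as np

def take_reference(a, indices, mode="raise"):
    """Gather rows of `a`; `mode` is one of 'raise', 'clip', 'wrap'."""
    return np.take(a, np.asarray(indices, dtype=np.int64), axis=0, mode=mode)

a = np.arange(6).reshape(3, 2)          # three row vectors of length 2
rows = take_reference(a, [[0, 2]])      # index shape (1, 2)
clipped = take_reference(a, [5], mode="clip")
wrapped = take_reference(a, [4], mode="wrap")
```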
mxnet.symbol.tan(*args, **kwargs)

Take tan of the src

From:src/operator/tensor/elemwise_unary_op.cc:218

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.tanh(*args, **kwargs)

Take tanh of the src

From:src/operator/tensor/elemwise_unary_op.cc:290

Parameters: data (NDArray) – Source input name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.tile(*args, **kwargs)

Construct an array by repeating A the number of times given by reps.

From:src/operator/tensor/matrix_op.cc:343

Parameters: data (NDArray) – Input data array reps (Shape(tuple), required) – The number of times for repeating the tensor a. If reps has length d, the result will have dimension of max(d, a.ndim); If a.ndim < d, a is promoted to be d-dimensional by prepending new axes. If a.ndim > d, reps is promoted to a.ndim by pre-pending 1’s to it. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
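The promotion rules for reps are the same as those documented for NumPy's `np.tile`, which the short example below uses as an analogy for the two cases (reps shorter than a.ndim, and reps longer than a.ndim).

```python
import numpy as np

a = np.array([[1, 2]])               # shape (1, 2)

short_reps = np.tile(a, (2,))        # reps promoted to (1, 2)
long_reps = np.tile(a, (2, 1, 3))    # a promoted to shape (1, 1, 2)
```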
mxnet.symbol.topk(*args, **kwargs)

Return the top k elements of an input tensor along a given axis.

From:src/operator/tensor/ordering_op.cc:18

Parameters: src (NDArray) – Source input axis (int or None, optional, default='-1') – Axis along which to choose the top k indices. If not given, the flattened array is used. Default is -1. k (int, optional, default='1') – Number of top elements to select, should be always smaller than or equal to the element number in the given axis. A global sort is performed if set k < 1. ret_typ ({'both', 'indices', 'mask', 'value'},optional, default='indices') – The return type. “value” means returning the top k values, “indices” means returning the indices of the top k values, “mask” means to return a mask array containing 0 and 1. 1 means the top k values. “both” means to return both value and indices. is_ascend (boolean, optional, default=False) – Whether to choose k largest or k smallest. Top K largest elements will be chosen if set to false. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
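The interaction of k, is_ascend, and ret_typ can be sketched with NumPy sorting. The hypothetical helper below covers the 'value', 'indices', and 'both' return types and leaves out 'mask'; it follows the parameter descriptions above rather than the operator's source.

```python
import numpy as np

def topk_reference(src, k=1, axis=-1, ret_typ="indices", is_ascend=False):
    """Select the k largest (or smallest, if is_ascend) entries along `axis`."""
    order = np.argsort(src if is_ascend else -src, axis=axis, kind="stable")
    idx = np.take(order, np.arange(k), axis=axis)
    vals = np.take_along_axis(src, idx, axis=axis)
    if ret_typ == "value":
        return vals
    if ret_typ == "indices":
        return idx
    if ret_typ == "both":
        return vals, idx
    raise ValueError("ret_typ not covered by this sketch: %r" % ret_typ)

src = np.array([[3.0, 1.0, 2.0], [0.0, 5.0, 4.0]])
top2_vals, top2_idx = topk_reference(src, k=2, ret_typ="both")
smallest_idx = topk_reference(src, k=1, is_ascend=True)
```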
mxnet.symbol.transpose(*args, **kwargs)

Transpose the input tensor and return a new one

From:src/operator/tensor/matrix_op.cc:96

Parameters: data (NDArray) – Source input axes (Shape(tuple), optional, default=()) – Target axis order. By default the axes will be inverted. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.uniform(*args, **kwargs)

Sample a uniform distribution

Parameters: low (float, optional, default=0) – The lower bound of distribution high (float, optional, default=1) – The upper bound of distribution shape (Shape(tuple), optional, default=()) – The shape of the output ctx (string, optional, default='') – Context of output, in format [cpu|gpu|cpu_pinned](n). Only used for imperative calls. dtype ({'None', 'float16', 'float32', 'float64'}, optional, default='None') – DType of the output. If output given, set to type of output. If output not given and type not defined (dtype=None), set to float32. name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
mxnet.symbol.where(*args, **kwargs)

Given three ndarrays, condition, x, and y, return an ndarray with the elements from x or y, depending on whether the corresponding elements of condition are true or false. x and y must have the same shape. If condition has the same shape as x, each element in the output array is from x if the corresponding element in condition is true, and from y if false. If condition does not have the same shape as x, it must be a 1D array whose size is the same as x’s first dimension size. Each row of the output array is from x’s row if the corresponding element from condition is true, and from y’s row if false.

From:src/operator/tensor/control_flow_op.cc:21

Parameters: condition (NDArray) – condition array x (NDArray) – y (NDArray) – name (string, optional.) – Name of the resulting symbol. symbol – The result symbol. Symbol
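Both cases in the description of where (element-wise selection and row-wise selection with a 1D condition) can be reproduced with `np.where`. The helper name is hypothetical and the sketch treats any nonzero condition value as true.

```python
import numpy as np

def where_reference(condition, x, y):
    """Select from x where condition is true, otherwise from y."""
    condition = np.asarray(condition) != 0
    if condition.shape != x.shape:
        # 1D condition: reshape so each entry selects a whole row.
        condition = condition.reshape((-1,) + (1,) * (x.ndim - 1))
    return np.where(condition, x, y)

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
elementwise = where_reference([[1, 0], [0, 1]], x, y)
rowwise = where_reference([1, 0], x, y)
```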

## Execution API Reference¶

Symbolic Executor component of MXNet.

class mxnet.executor.Executor(handle, symbol, ctx, grad_req, group2ctx)

Executor is the actual executing object of MXNet.

forward(is_train=False, **kwargs)

Calculate the outputs specified by the bound symbol.

Parameters: is_train (bool, optional) – Whether this forward pass is for training. Use False (the default) for evaluation. **kwargs – Additional specification of input arguments.

Examples

>>> # Run forward by passing the data directly.
>>> texec.forward(is_train=True, data=mydata)
>>> # Run forward without arguments after copying the data into the executor beforehand.
>>> mydata.copyto(texec.arg_dict['data'])
>>> texec.forward(is_train=True)
>>> # Run forward by passing the data and collect the outputs.
>>> outputs = texec.forward(is_train=True, data=mydata)
>>> print(outputs[0].asnumpy())

backward(out_grads=None)

Do backward pass to get the gradient of arguments.

Parameters: out_grads (NDArray or list of NDArray or dict of str to NDArray, optional) – Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
set_monitor_callback(callback)

Install callback.

Parameters: callback (function) – Takes a string and an NDArrayHandle.
arg_dict

Get dictionary representation of argument arrays.

Returns: arg_dict – The dictionary that maps names of arguments to NDArrays. dict of str to NDArray Raises: ValueError – If there are duplicated names in the arguments.
grad_dict

Get dictionary representation of gradient arrays.

Returns: grad_dict – The dictionary that maps name of arguments to gradient arrays. dict of str to NDArray
aux_dict

Get dictionary representation of auxiliary states arrays.

Returns: aux_dict – The dictionary that maps names of auxiliary states to NDArrays. dict of str to NDArray Raises: ValueError – If there are duplicated names in the auxiliary states.
output_dict

Get dictionary representation of output arrays.

Returns: output_dict – The dictionary that maps output names to NDArrays. dict of str to NDArray Raises: ValueError – If there are duplicated names in the outputs.
copy_params_from(arg_params, aux_params=None, allow_extra_params=False)

Copy parameters from arg_params, aux_params into executor’s internal array.

Parameters: arg_params (dict of str to NDArray) – Parameters, dict of name to NDArray of arguments aux_params (dict of str to NDArray, optional) – Parameters, dict of name to NDArray of auxiliary states. allow_extra_params (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor. Raises: ValueError – If there are additional parameters in the dict but allow_extra_params=False
reshape(partial_shaping=False, allow_up_sizing=False, **kwargs)

Return a new executor with the same symbol and shared memory, but different input/output shapes. For runtime reshaping, variable length sequences, etc. The returned executor shares state with the current one, and cannot be used in parallel with it.

Parameters: partial_shaping (bool) – Whether to allow changing the shape of unspecified arguments. allow_up_sizing (bool) – Whether to allow allocating new ndarrays that are larger than the original. kwargs (dict of string to tuple of int) – New shape for arguments. Returns: exec – A new executor that shares memory with self. Executor
debug_str()

Get a debug string about internal execution plan.

Returns: debug_str (string) – Debug string of the executor.

## Testing Utility Reference¶

Tools for testing.

mxnet.test_utils.default_context()

Get default context for regression test.

mxnet.test_utils.set_default_context(ctx)

Set the default context.

mxnet.test_utils.default_dtype()

Get default data type for regression test.

mxnet.test_utils.get_atol(atol=None)

Get the default absolute numerical threshold for regression tests.

mxnet.test_utils.get_rtol(rtol=None)

Get the default relative numerical threshold for regression tests.

mxnet.test_utils.random_arrays(*shapes)

Generate some random numpy arrays.

mxnet.test_utils.np_reduce(dat, axis, keepdims, numpy_reduce_func)

Reduce function compatible with old versions of NumPy.

Parameters:

- dat (np.ndarray) – Same as NumPy.
- axis (None or int or list-like) – Same as NumPy.
- keepdims (bool) – Same as NumPy.
- numpy_reduce_func (function) – A NumPy reducing function such as np.sum or np.max.

mxnet.test_utils.find_max_violation(a, b, rtol=None, atol=None)

Find the location of the maximum violation.

mxnet.test_utils.same(a, b)

Test if two numpy arrays are the same

Parameters: a (np.ndarray) – b (np.ndarray) –
mxnet.test_utils.almost_equal(a, b, rtol=None, atol=None)

Test if two numpy arrays are almost equal.

mxnet.test_utils.assert_almost_equal(a, b, rtol=None, atol=None, names=('a', 'b'))

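Test that two numpy arrays are almost equal. Raises an exception with a message if they are not.

Parameters:

- a (np.ndarray) –
- b (np.ndarray) –
- rtol (None or float) – The relative threshold. The default threshold will be used if set to None.
- atol (None or float) – The absolute threshold. The default threshold will be used if set to None.
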
mxnet.test_utils.almost_equal_ignore_nan(a, b, rtol=None, atol=None)

Test that two numpy arrays are almost equal (ignoring NaN in either array). Combines a relative and an absolute measure of approximate equality. If either the relative or the absolute check passes, the arrays are considered equal. Including an absolute check resolves issues with the relative check where all array values are close to zero.

Parameters:

- a (np.ndarray) –
- b (np.ndarray) –
- rtol (None or float) – The relative threshold. The default threshold will be used if set to None.
- atol (None or float) – The absolute threshold. The default threshold will be used if set to None.

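
The combination of a relative and an absolute check can be illustrated without NumPy. This is a minimal sketch, not the library's code: `close_ignoring_nan` is a hypothetical name, flat lists of floats stand in for arrays, and both the default thresholds and the exact form of the relative measure (difference scaled by the larger magnitude) are assumptions made for illustration.

```python
import math


def close_ignoring_nan(a, b, rtol=1e-5, atol=1e-8):
    """Return True if a and b are approximately equal, skipping NaN positions."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # Drop every position where either input is NaN.
    pairs = [(x, y) for x, y in zip(a, b)
             if not (math.isnan(x) or math.isnan(y))]
    # Relative check: each difference is small compared with the larger magnitude.
    rel_ok = all(abs(x - y) <= rtol * max(abs(x), abs(y)) for x, y in pairs)
    # Absolute check: each difference is small on its own.
    abs_ok = all(abs(x - y) <= atol for x, y in pairs)
    # The inputs count as equal if either check passes.
    return rel_ok or abs_ok


nan = float("nan")
ignores_nan = close_ignoring_nan([1.0, nan], [1.0 + 1e-9, 5.0])
near_zero = close_ignoring_nan([0.0, 1e-12], [1e-12, 0.0])
```

In the `near_zero` case the relative check fails, because the difference is as large as the values themselves, and the absolute check is what accepts the pair. This is the situation the description refers to when all values are close to zero.
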
mxnet.test_utils.assert_almost_equal_ignore_nan(a, b, rtol=None, atol=None, names=('a', 'b'))

Test that two numpy arrays are almost equal (ignoring NaN in either array). Combines a relative and an absolute measure of approximate equality. If either the relative or the absolute check passes, the arrays are considered equal. Including an absolute check resolves issues with the relative check where all array values are close to zero.

Parameters:

- a (np.ndarray) –
- b (np.ndarray) –
- rtol (None or float) – The relative threshold. The default threshold will be used if set to None.
- atol (None or float) – The absolute threshold. The default threshold will be used if set to None.

mxnet.test_utils.retry(n)

Retry a stochastic test case up to n times before failing.
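
A retry helper of this kind is typically written as a decorator factory. The sketch below is one possible shape, written from the one-line description only: it assumes that a failed attempt is signalled by AssertionError and that the last failure is re-raised once the attempts are used up. The real helper may differ in both respects.

```python
import functools


def retry(n):
    """Return a decorator that runs the wrapped test up to n times."""
    if n < 1:
        raise ValueError("n must be at least 1")

    def decorate(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            for attempt in range(n):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError:
                    if attempt == n - 1:
                        raise  # out of attempts: report the last failure
        return wrapper
    return decorate


calls = []


@retry(5)
def flaky_test():
    calls.append(1)
    # Fails on the first two attempts, then passes on the third.
    assert len(calls) >= 3


flaky_test()
```

Here `flaky_test` is run three times in total: two failed attempts are swallowed and the third succeeds, so no exception reaches the caller.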

mxnet.test_utils.simple_forward(sym, ctx=None, is_train=False, **inputs)

A simple forward function for a symbol.

Primarily used in doctests to conveniently test the function of a symbol. Takes numpy arrays as inputs; outputs are also converted to numpy arrays.

Parameters:

- ctx (Context) – If None, the default context is used.
- inputs (keyword arguments) – Mapping of each input name to a numpy array.

Returns: The result as a numpy array. Multiple results are returned as a list of numpy arrays.

mxnet.test_utils.numeric_grad(executor, location, aux_states=None, eps=0.0001, use_forward_train=True)

Calculates a numeric gradient via the finite difference method.

Parameters:

- executor (Executor) – Executor that computes the forward pass.
- location (list of numpy.ndarray or dict of str to numpy.ndarray) – Argument values used as the location at which to compute the gradient. Maps the names of arguments to the corresponding numpy.ndarray. Values for all arguments must be provided.
- aux_states (None or list of numpy.ndarray or dict of str to numpy.ndarray, optional) – Auxiliary state values used as the location at which to compute the gradient. Maps the names of aux_states to the corresponding numpy.ndarray. Values for all auxiliary arguments must be provided.
- eps (float, optional) – Epsilon for the finite difference method.
- use_forward_train (bool, optional) – Whether to use is_train=True in testing.
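
The finite difference idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the library's routine: `finite_difference_grad` is a hypothetical name, the function under test is an ordinary Python callable instead of an executor, and a central difference is used here although the description does not say which variant MXNet applies.

```python
def finite_difference_grad(f, x, eps=1e-4):
    """Approximate the gradient of a scalar function f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        orig = x[i]
        x[i] = orig + eps          # perturb one coordinate upwards
        f_plus = f(x)
        x[i] = orig - eps          # and downwards
        f_minus = f(x)
        x[i] = orig                # restore the input before moving on
        grad.append((f_plus - f_minus) / (2 * eps))
    return grad


def f(x):
    # Analytic gradient: [2*x0 + 3*x1, 3*x0]
    return x[0] ** 2 + 3.0 * x[0] * x[1]


g = finite_difference_grad(f, [2.0, 1.0])
```

At the point (2, 1) the analytic gradient is (7, 6), and `g` agrees with it up to floating-point rounding, since a central difference has no truncation error for a quadratic function.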

mxnet.test_utils.check_numeric_gradient(sym, location, aux_states=None, numeric_eps=0.001, rtol=0.01, atol=None, grad_nodes=None, use_forward_train=True, ctx=None)

Verify an operation by checking the backward pass via the finite difference method.

Parameters:

- sym (Symbol) – Symbol containing the op to test.
- location (list or tuple or dict) – Argument values used as the location at which to compute the gradient. If the type is a list of numpy.ndarray, the inner elements should have the same order as mxnet.sym.list_arguments(). If the type is a dict of str -> numpy.ndarray, it maps the names of arguments to the corresponding numpy.ndarray. In either case, values for all arguments must be provided.
- aux_states (list or tuple or dict, optional) – The auxiliary states required when generating the executor for the symbol.
- numeric_eps (float, optional) – Delta for the finite difference method that approximates the gradient.
- rtol (float, optional) – Relative error tolerance used when comparing the numeric gradient to the symbolic gradient.
- grad_nodes (None or list or tuple or dict, optional) – Names of the nodes to check the gradient on.
- use_forward_train (bool) – Whether to use is_train=True when computing the finite difference.
- ctx (Context, optional) – Check the gradient computation on the specified device.

mxnet.test_utils.check_symbolic_forward(sym, location, expected, rtol=0.0001, atol=None, aux_states=None, ctx=None)

Compare the forward call to the expected value.

Parameters:

- sym (Symbol) – Output symbol.
- location (list of np.ndarray or dict of str to np.ndarray) – The evaluation point. If the type is a list of np.ndarray, it contains all the numpy arrays corresponding to sym.list_arguments(). If the type is a dict of str to np.ndarray, it contains the mapping between argument names and their values.
- expected (list of np.ndarray or dict of str to np.ndarray) – The expected output value. If the type is a list of np.ndarray, it contains arrays corresponding to exe.outputs. If the type is a dict of str to np.ndarray, it contains the mapping between sym.list_output() and exe.outputs.
- rtol (float, optional) – Relative error tolerance to check against.
- aux_states (list of np.ndarray or dict, optional) – If the type is a list of np.ndarray, it contains all the numpy arrays corresponding to sym.list_auxiliary_states. If the type is a dict of str to np.ndarray, it contains the mapping between names of auxiliary states and their values.
- ctx (Context, optional) – Running context.

mxnet.test_utils.check_symbolic_backward(sym, location, out_grads, expected, rtol=1e-05, atol=None, aux_states=None, grad_req='write', ctx=None)

Compare the backward call to the expected value.

Parameters:

- sym (Symbol) – Output symbol.
- location (list of np.ndarray or dict of str to np.ndarray) – The evaluation point. If the type is a list of np.ndarray, it contains all the numpy arrays corresponding to mxnet.sym.list_arguments. If the type is a dict of str to np.ndarray, it contains the mapping between argument names and their values.
- out_grads (None or list of np.ndarray or dict of str to np.ndarray) – Numpy arrays corresponding to sym.outputs for the incoming gradient. If the type is a list of np.ndarray, it contains arrays corresponding to exe.outputs. If the type is a dict of str to np.ndarray, it contains the mapping between mxnet.sym.list_output() and Executor.outputs.
- expected (list of np.ndarray or dict of str to np.ndarray) – Expected gradient values. If the type is a list of np.ndarray, it contains arrays corresponding to exe.grad_arrays. If the type is a dict of str to np.ndarray, it contains the mapping between sym.list_arguments() and exe.outputs.
- rtol (float, optional) – Relative error tolerance to check against.
- aux_states (list of np.ndarray or dict of str to np.ndarray) –
- grad_req (str or list of str or dict of str to str, optional) – Gradient requirements: ‘write’, ‘add’ or ‘null’.
- ctx (Context, optional) – Running context.

mxnet.test_utils.check_speed(sym, location=None, ctx=None, N=20, grad_req=None, typ='whole', **kwargs)

Check the running speed of a symbol.

Parameters:

- sym (Symbol) – Symbol to run the speed test on.
- location (None or dict of str to np.ndarray) – Location at which to evaluate the inner executor.
- ctx (Context) – Running context.
- N (int, optional) – Number of repetitions.
- grad_req (None or str or list of str or dict of str to str, optional) – Gradient requirements.
- typ (str, optional) – “whole” or “forward”. “whole” tests the forward_backward speed; “forward” tests only the forward speed.

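
The role of the repeat count N can be illustrated with a generic timing loop. This is only a sketch of the general measurement pattern: `average_runtime` is a hypothetical helper, the workload is an arbitrary Python callable rather than a bound symbol, and the warm-up call is an assumption about what a fair measurement needs, not a statement about how check_speed is implemented.

```python
import time


def average_runtime(fn, n=20):
    """Return the mean wall-clock time in seconds of n calls to fn."""
    fn()  # warm-up call so one-off setup costs are not measured
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n


def workload():
    # Stand-in for a forward (or forward plus backward) pass.
    return sum(i * i for i in range(1000))


seconds_per_call = average_runtime(workload, n=20)
```

Averaging over several repetitions smooths out scheduling noise. With an asynchronous execution engine, a measurement of this kind would also need to wait for pending work to finish before reading the clock.
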
mxnet.test_utils.check_consistency(sym, ctx_list, scale=1.0, grad_req='write', arg_params=None, aux_params=None, tol=None, raise_on_err=True, ground_truth=None)

Check that a symbol gives the same output in different running contexts.

Parameters:

- sym (Symbol or list of Symbols) – Symbol(s) to run the consistency test on.
- ctx_list (list) – Running contexts. See the examples for more detail.
- scale (float, optional) – Standard deviation of the inner normal distribution. Used in initialization.
- grad_req (str or list of str or dict of str to str) – Gradient requirement.

Examples

    >>> import numpy as np
    >>> import mxnet as mx
    >>> from mxnet.test_utils import check_consistency
    >>> # create the symbol
    >>> sym = mx.sym.Convolution(num_filter=3, kernel=(3,3), name='conv')
    >>> # initialize the running contexts: one entry per (device, dtype) pair
    >>> ctx_list = [
    ...     {'ctx': mx.gpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float64}},
    ...     {'ctx': mx.gpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float32}},
    ...     {'ctx': mx.gpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float16}},
    ...     {'ctx': mx.cpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float64}},
    ...     {'ctx': mx.cpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float32}}]
    >>> check_consistency(sym, ctx_list)
    >>> sym = mx.sym.Concat(name='concat', num_args=2)
    >>> ctx_list = [
    ...     {'ctx': mx.gpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10),
    ...      'type_dict': {'concat_arg0': np.float64, 'concat_arg1': np.float64}},
    ...     {'ctx': mx.gpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10),
    ...      'type_dict': {'concat_arg0': np.float32, 'concat_arg1': np.float32}},
    ...     {'ctx': mx.gpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10),
    ...      'type_dict': {'concat_arg0': np.float16, 'concat_arg1': np.float16}},
    ...     {'ctx': mx.cpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10),
    ...      'type_dict': {'concat_arg0': np.float64, 'concat_arg1': np.float64}},
    ...     {'ctx': mx.cpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10),
    ...      'type_dict': {'concat_arg0': np.float32, 'concat_arg1': np.float32}}]
    >>> check_consistency(sym, ctx_list)