MXNet Python Symbolic API

We also highly encourage you to read Symbolic Configuration and Execution in Pictures.

How to Compose Symbols

The symbolic API provides a way to configure computation graphs. You can configure the graphs either at the level of neural network layer operations or as fine-grained operations.

The following example configures a two-layer neural network.

    >>> import mxnet as mx
    >>> net = mx.symbol.Variable('data')
    >>> net = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
    >>> net = mx.symbol.Activation(data=net, name='relu1', act_type="relu")
    >>> net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
    >>> net = mx.symbol.SoftmaxOutput(data=net, name='out')
    >>> type(net)
    <class 'mxnet.symbol.Symbol'>

The basic arithmetic operators (plus, minus, multiplication, division) are overloaded for element-wise operations on symbols.

The following example creates a computation graph that adds two inputs together.

    >>> import mxnet as mx
    >>> a = mx.symbol.Variable('a')
    >>> b = mx.symbol.Variable('b')
    >>> c = a + b

Symbol Attributes

You can add attributes to symbols by providing an attribute dictionary when creating a symbol.

    data = mx.sym.Variable('data', attr={'mood': 'angry'})
    op   = mx.sym.Convolution(data=data, name='conv', kernel=(1, 1),
                              num_filter=1, attr={'mood': 'so so'})

For proper communication with the C++ back end, both the keys and values of the attribute dictionary should be strings. To retrieve the attributes, use attr(key) or list_attr():

    assert data.attr('mood') == 'angry'
    assert op.list_attr() == {'mood': 'so so'}

For a composite symbol, you can retrieve all of the attributes associated with that symbol and its descendants with list_attr(recursive=True). In the returned dictionary, all of the attribute names have the prefix 'symbol_name' + '_' to prevent naming conflicts.

    assert op.list_attr(recursive=True) == {'data_mood': 'angry', 'conv_mood': 'so so',
                                             'conv_weight_mood': 'so so', 'conv_bias_mood': 'so so'}

Notice that the mood attribute set for the Convolution operator is copied to conv_weight and conv_bias. They’re symbols that are automatically created by the Convolution operator, and the attributes are automatically copied for them. This is especially useful for annotating context groups in model parallelism. However, if you explicitly specify the weight or bias symbols, the attributes for the host operator are not copied to them:

    weight = mx.sym.Variable('crazy_weight', attr={'size': '5'})
    data = mx.sym.Variable('data', attr={'mood': 'angry'})
    op = mx.sym.Convolution(data=data, weight=weight, name='conv', kernel=(1, 1),
                            num_filter=1, attr={'mood': 'so so'})
    op.list_attr(recursive=True)
    # =>
    # {'conv_mood': 'so so',
    #  'conv_bias_mood': 'so so',
    #  'crazy_weight_size': '5',
    #  'data_mood': 'angry'}

As you can see, the mood attribute is copied to the symbol conv_bias, which was automatically created, but not to the manually created weight symbol crazy_weight.

Another way to attach attributes is to use AttrScope. AttrScope automatically adds the specified attributes to all of the symbols created within that scope. For example:

    data = mx.symbol.Variable('data')
    with mx.AttrScope(group='4', data='great'):
        fc1 = mx.symbol.Activation(data, act_type='relu')
        with mx.AttrScope(init_bias='0.0'):
            fc2 = mx.symbol.FullyConnected(fc1, num_hidden=10, name='fc2')
    assert fc1.attr('data') == 'great'
    assert fc2.attr('data') == 'great'
    assert fc2.attr('init_bias') == '0.0'

Naming convention: We recommend that you choose valid variable names for attribute names. Names with double underscores (e.g., __shape__) are reserved for internal use. The underscore '_' separates a symbol name and its attributes. It’s also the separator between a symbol and a variable that is automatically created by that symbol. For example, the weight variable that is created automatically by a Convolution operator named conv1 is called conv1_weight.

Components that use attributes: More and more components are using symbol attributes to collect useful annotations for the computational graph. Here is a (probably incomplete) list:

  • Variable uses attributes to store (optional) shape information for a variable.
  • Optimizers read __lr_mult__ and __wd_mult__ attributes for each symbol in a computational graph. This is useful to control per-layer learning rate and decay.
  • The model parallelism LSTM example uses the __ctx_group__ attribute to divide the operators into groups that correspond to GPU devices.

Serialization

There are two ways to save and load symbols. You can use pickle to serialize Symbol objects, or you can use the mxnet.symbol.Symbol.save and mxnet.symbol.load functions. The advantage of save and load is that they are language agnostic and cloud friendly. The symbol is saved in JSON format. You can also get a JSON string directly by using mxnet.symbol.Symbol.tojson.

The following example shows how to save a symbol to an S3 bucket, load it back, and compare two symbols using a JSON string.

    >>> import mxnet as mx
    >>> a = mx.symbol.Variable('a')
    >>> b = mx.symbol.Variable('b')
    >>> c = a + b
    >>> c.save('s3://my-bucket/symbol-c.json')
    >>> c2 = mx.symbol.load('s3://my-bucket/symbol-c.json')
    >>> c.tojson() == c2.tojson()
    True

Executing Symbols

After you have assembled a set of symbols into a computation graph, the MXNet engine can evaluate those symbols. If you are training a neural network, this is typically handled by the high-level Model class and the fit() function.

For neural networks used in “feed-forward”, “prediction”, or “inference” mode (all terms for the same thing: running a trained network), the input arguments are the input data, and the weights of the neural network that were learned during training.

To manually execute a set of symbols, you need to create an Executor object, which is typically constructed by calling the simple_bind() method on a symbol. For an example, see the sample notebook on how to use simple_bind().

Multiple Outputs

To group the symbols together, use the mxnet.symbol.Group function.

    >>> import mxnet as mx
    >>> net = mx.symbol.Variable('data')
    >>> fc1 = mx.symbol.FullyConnected(data=net, name='fc1', num_hidden=128)
    >>> net = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
    >>> net = mx.symbol.FullyConnected(data=net, name='fc2', num_hidden=64)
    >>> out = mx.symbol.SoftmaxOutput(data=net, name='softmax')
    >>> group = mx.symbol.Group([fc1, out])
    >>> group.list_outputs()
    ['fc1_output', 'softmax_output']

After you get the group, you can bind on group instead. The resulting executor will have two outputs, one for fc1_output and one for softmax_output.

Symbol Creation API Reference

Symbolic configuration API of mxnet.

class mxnet.symbol.Symbol(handle)

Symbol is the symbolic graph class of MXNet.

name

Get the name string of the symbol. This function only works for a non-grouped symbol.

Returns:value – The name of this symbol; returns None for a grouped symbol.
Return type:str
attr(key)

Get an attribute string from the symbol. This function only works for a non-grouped symbol.

Parameters:key (str) – The key of the attribute to get.
Returns:value – The attribute value for the key; returns None if the attribute does not exist.
Return type:str
list_attr(recursive=False)

Get all attributes from the symbol.

Returns:ret – a dictionary mapping attribute keys to values
Return type:dict of str to str
attr_dict()

Recursively get all attributes from the symbol and its children.

Returns:ret – Returns a dict whose keys are names of the symbol and its children. Values of the returned dict are dictionaries that map attribute keys to values
Return type:dict of str to dict
get_internals()

Get a new grouped symbol whose output contains all the internal outputs of this symbol.

Returns:sgroup – The internal of the symbol.
Return type:Symbol
list_arguments()

List all the arguments in the symbol.

Returns:args – List of all the arguments.
Return type:list of string
list_outputs()

List all outputs in the symbol.

Returns:returns – List of all the outputs.
Return type:list of string
list_auxiliary_states()

List all auxiliary states in the symbol.

Returns:aux_states – List the names of the auxiliary states.
Return type:list of string

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and do not have a gradient, but are still useful for specific operations. A common example of an auxiliary state is the moving_mean and moving_variance in BatchNorm. Most operators do not have auxiliary states.

infer_type(*args, **kwargs)

Infer the type of outputs and arguments of given known types of arguments.

You can pass in the known types either positionally or as keyword arguments. A tuple of Nones is returned if not enough information is passed in. An error is raised if an inconsistency is found in the known types passed in.

Parameters:
  • *args – Provide type of arguments in a positional way. Unknown type can be marked as None
  • **kwargs – Provide keyword arguments of known types.
Returns:

  • arg_types (list of numpy.dtype or None) – List of types of arguments. The order is in the same order as list_arguments()
  • out_types (list of numpy.dtype or None) – List of types of outputs. The order is in the same order as list_outputs()
  • aux_types (list of numpy.dtype or None) – List of types of auxiliary states. The order is the same as list_auxiliary_states()

infer_shape(*args, **kwargs)

Infer the shape of outputs and arguments of given known shapes of arguments.

You can pass in the known shapes either positionally or as keyword arguments. A tuple of Nones is returned if not enough information is passed in. An error is raised if an inconsistency is found in the known shapes passed in.

Parameters:
  • *args – Provide shape of arguments in a positional way. Unknown shape can be marked as None
  • **kwargs – Provide keyword arguments of known shapes.
Returns:

  • arg_shapes (list of tuple or None) – List of shapes of arguments. The order is in the same order as list_arguments()
  • out_shapes (list of tuple or None) – List of shapes of outputs. The order is in the same order as list_outputs()
  • aux_shapes (list of tuple or None) – List of shapes of auxiliary states. The order is the same as list_auxiliary_states()

infer_shape_partial(*args, **kwargs)

Partially infer the shape. The same as infer_shape, except that the partial results can be returned.

debug_str()

Get a debug string.

Returns:debug_str – Debug string of the symbol.
Return type:string
save(fname)

Save symbol into file.

You can also use pickle if you only work in Python. The advantage of save/load is that the file is language agnostic, which means a file saved with save can be loaded by other language bindings of MXNet. You also get the benefit of being able to load/save directly from cloud storage (S3, HDFS).

Parameters:fname (str) –

The name of the file, examples:

  • s3://my-bucket/path/my-s3-symbol
  • hdfs://my-bucket/path/my-hdfs-symbol
  • /path-to/my-local-symbol

See also

symbol.load()
Used to load symbol from file.
tojson()

Save symbol into a JSON string.

See also

symbol.load_json()
Used to load symbol from JSON string.
simple_bind(ctx, grad_req='write', type_dict=None, group2ctx=None, **kwargs)

Bind current symbol to get an executor, allocate all the ndarrays needed. Allows specifying data types.

This function lets you specify the inputs you want to bind, and automatically allocates the NDArrays for the arguments and auxiliary states that you do not specify explicitly.

Parameters:
  • ctx (Context) – The device context the generated executor to run on.
  • grad_req (string) – {‘write’, ‘add’, ‘null’}, or list of str, or dict of str to str, optional. Specifies how to update the gradient in args_grad: ‘write’ means the gradient is written to the specified args_grad NDArray each time; ‘add’ means the gradient is added to the specified NDArray each time; ‘null’ means no action is taken and the gradient may not be calculated.
  • type_dict (dict of str->numpy.dtype) – Input type dictionary, name->dtype
  • group2ctx (dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
  • kwargs (dict of str->shape) – Input shape dictionary, name->shape
Returns:

executor – The generated Executor

Return type:

mxnet.Executor

bind(ctx, args, args_grad=None, grad_req='write', aux_states=None, group2ctx=None, shared_exec=None)

Bind current symbol to get an executor.

Parameters:
  • ctx (Context) – The device context the generated executor to run on.
  • args (list of NDArray or dict of str to NDArray) –

    Input arguments to the symbol.

    • If type is list of NDArray, the position is in the same order of list_arguments.
    • If type is dict of str to NDArray, then it maps the name of arguments to the corresponding NDArray.
    • In either case, all the arguments must be provided.
  • args_grad (list of NDArray or dict of str to NDArray, optional) –

    When specified, args_grad provide NDArrays to hold the result of gradient value in backward.

    • If type is list of NDArray, the position is in the same order of list_arguments.
    • If type is dict of str to NDArray, then it maps the name of arguments to the corresponding NDArray.
    • When the type is dict of str to NDArray, users only need to provide the dict for needed argument gradient. Only the specified argument gradient will be calculated.
  • grad_req ({'write', 'add', 'null'}, or list of str or dict of str to str, optional) –

    Specifies how to update the gradient in args_grad.

    • ‘write’ means the gradient is written to the specified args_grad NDArray each time.
    • ‘add’ means the gradient is added to the specified NDArray each time.
    • ‘null’ means no action is taken; the gradient may not be calculated.
  • aux_states (list of NDArray, or dict of str to NDArray, optional) –

    Input auxiliary states to the symbol, only need to specify when list_auxiliary_states is not empty.

    • If type is list of NDArray, the position is in the same order of list_auxiliary_states
    • If type is dict of str to NDArray, then it maps the name of auxiliary_states to the corresponding NDArray,
    • In either case, all the auxiliary_states need to be provided.
  • group2ctx (dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
  • shared_exec (mx.executor.Executor) – Executor to share memory with. This is intended for runtime reshaping, variable length sequences, etc. The returned executor shares state with shared_exec, and should not be used in parallel with it.
Returns:

executor – The generated Executor

Return type:

Executor

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and do not have a gradient, but are still useful for specific operations. A common example of an auxiliary state is the moving_mean and moving_variance in BatchNorm. Most operators do not have auxiliary states, and this parameter can be safely ignored.

You can skip computing some gradients by passing a dict as args_grad and specifying only the gradients you are interested in.

grad(wrt)

Get the autodiff of current symbol.

This function can only be used if current symbol is a loss function.

Parameters:wrt (Array of String) – The names of the arguments with respect to which gradients are taken.
Returns:grad – A gradient Symbol whose outputs are the corresponding gradients.
Return type:Symbol
mxnet.symbol.Variable(name, attr=None, shape=None, lr_mult=None, wd_mult=None, dtype=None)

Create a symbolic variable with specified name.

Parameters:
  • name (str) – Name of the variable.
  • attr (dict of string -> string) – Additional attributes to set on the variable.
  • shape (tuple) – Optionally, the shape of the variable, which will be used during shape inference. If you specify a different shape for this variable via keyword argument when calling shape inference, this stored shape information is ignored.
  • lr_mult (float) – Specify the learning rate multiplier for this variable.
  • wd_mult (float) – Specify the weight decay multiplier for this variable.
  • dtype (str or numpy.dtype) – Similar to shape, we can specify dtype for this variable.
Returns:

variable – The created variable symbol.

Return type:

Symbol

mxnet.symbol.Group(symbols)

Create a symbol that groups symbols together.

Parameters:symbols (list) – List of symbols to be grouped.
Returns:sym – The created group symbol.
Return type:Symbol
mxnet.symbol.load(fname)

Load symbol from a JSON file.

You can also use pickle if you only work in Python. The advantage of load/save is that the file is language agnostic, which means a file saved with save can be loaded by other language bindings of MXNet. You also get the benefit of being able to load/save directly from cloud storage (S3, HDFS).

Parameters:fname (str) –

The name of the file, examples:

  • s3://my-bucket/path/my-s3-symbol
  • hdfs://my-bucket/path/my-hdfs-symbol
  • /path-to/my-local-symbol
Returns:sym – The loaded symbol.
Return type:Symbol

See also

Symbol.save()
Used to save symbol into file.
mxnet.symbol.load_json(json_str)

Load symbol from json string.

Parameters:json_str (str) – A json string.
Returns:sym – The loaded symbol.
Return type:Symbol

See also

Symbol.tojson()
Used to save symbol into json string.
mxnet.symbol.pow(base, exp)

Raise base to the power of exp, element-wise.

Parameters:
Returns:

result

Return type:

Symbol or Number

mxnet.symbol.maximum(left, right)

Element-wise maximum of left and right.

Parameters:
Returns:

result

Return type:

Symbol or Number

mxnet.symbol.minimum(left, right)

Element-wise minimum of left and right.

Parameters:
Returns:

result

Return type:

Symbol or Number

mxnet.symbol.hypot(left, right)

Given the legs of a right triangle, return its hypotenuse, i.e., sqrt(left**2 + right**2), element-wise.

Parameters:
Returns:

result

Return type:

Symbol or Number

mxnet.symbol.zeros(shape, dtype=numpy.float32)

Create a Tensor filled with zeros, similar to numpy.zeros.

See also https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html.
Parameters:
  • shape (int or sequence of ints) – Shape of the new array.
  • dtype (type, optional) – The value type of the NDArray, default to np.float32
Returns:

out – The created Symbol

Return type:

Symbol

mxnet.symbol.ones(shape, dtype=numpy.float32)

Create a Tensor filled with ones, similar to numpy.ones.

See also https://docs.scipy.org/doc/numpy/reference/generated/numpy.ones.html.
Parameters:
  • shape (int or sequence of ints) – Shape of the new array.
  • dtype (type, optional) – The value type of the NDArray, default to np.float32
Returns:

out – The created Symbol

Return type:

Symbol

mxnet.symbol.arange(start, stop=None, step=1.0, repeat=1, name=None, dtype=numpy.float32)

Return evenly spaced values within a given interval, similar to numpy.arange.

See also https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html.
Parameters:
  • start (number) – Start of interval. The interval includes this value. The default start value is 0.
  • stop (number, optional) – End of interval. The interval does not include this value.
  • step (number, optional) – Spacing between values
  • repeat (int, optional) – The number of times to repeat each element. E.g., with repeat=3, the element a is repeated three times: a, a, a.
  • dtype (type, optional) – The value type of the NDArray, default to np.float32
Returns:

out – The created Symbol

Return type:

Symbol

mxnet.symbol.Activation(*args, **kwargs)

Elementwise activation function.

The following activation types are supported (operations are applied element-wise to each scalar of the input tensor):

  • relu: Rectified Linear Unit, y = max(x, 0)
  • sigmoid: y = 1 / (1 + exp(-x))
  • tanh: Hyperbolic tangent, y = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
  • softrelu: Soft ReLU, or SoftPlus, y = log(1 + exp(x))

See LeakyReLU for other activations with parameters.

Parameters:
  • data (Symbol) – Input data to activation function.
  • act_type ({'relu', 'sigmoid', 'softrelu', 'tanh'}, required) – Activation function to be applied.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Examples

A one-hidden-layer MLP with ReLU activation:

>>> data = Variable('data')
>>> mlp = FullyConnected(data=data, num_hidden=128, name='proj')
>>> mlp = Activation(data=mlp, act_type='relu', name='activation')
>>> mlp = FullyConnected(data=mlp, num_hidden=10, name='mlp')
>>> mlp
<Symbol mlp>

All supported activation types, checked against NumPy reference implementations:

>>> test_suites = [
...     ('relu', lambda x: numpy.maximum(x, 0)),
...     ('sigmoid', lambda x: 1 / (1 + numpy.exp(-x))),
...     ('tanh', lambda x: numpy.tanh(x)),
...     ('softrelu', lambda x: numpy.log(1 + numpy.exp(x)))
... ]
>>> x = test_utils.random_arrays((2, 3, 4))
>>> for act_type, numpy_impl in test_suites:
...     op = Activation(act_type=act_type, name='act')
...     y = test_utils.simple_forward(op, act_data=x)
...     y_np = numpy_impl(x)
...     print('%s: %s' % (act_type, test_utils.almost_equal(y, y_np)))
relu: True
sigmoid: True
tanh: True
softrelu: True
mxnet.symbol.BatchNorm(*args, **kwargs)

Apply batch normalization to input.

Parameters:
  • data (Symbol) – Input data to batch normalization
  • eps (float, optional, default=0.001) – Epsilon to prevent division by zero
  • momentum (float, optional, default=0.9) – Momentum for moving average
  • fix_gamma (boolean, optional, default=True) – Fix gamma while training
  • use_global_stats (boolean, optional, default=False) – Whether to use global moving statistics instead of local batch statistics. This forces batch-norm into a scale-shift operator.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.BlockGrad(*args, **kwargs)

Get the output of a symbol and pass a zero gradient back.

From:src/operator/tensor/elemwise_unary_op.cc:30

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Cast(*args, **kwargs)

Cast array to a different data type.

Parameters:
  • data (Symbol) – Input data to cast function.
  • dtype ({'float16', 'float32', 'float64', 'int32', 'uint8'}, required) – Target data type.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Concat(*args, **kwargs)

Perform feature concatenation along the channel dimension (default is 1) over all inputs. This function supports a variable number of positional inputs.

Parameters:
  • data (Symbol[]) – List of tensors to concatenate
  • num_args (int, required) – Number of inputs to be concatenated.
  • dim (int, optional, default='1') – The dimension along which to concatenate.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Examples

Concat two (or more) inputs along a specific dimension:

>>> a = Variable('a')
>>> b = Variable('b')
>>> c = Concat(a, b, dim=1, name='my-concat')
>>> c
<Symbol my-concat>
>>> SymbolDoc.get_output_shape(c, a=(128, 10, 3, 3), b=(128, 15, 3, 3))
{'my-concat_output': (128L, 25L, 3L, 3L)}

Note the shape should be the same except on the dimension that is being concatenated.

mxnet.symbol.Convolution(*args, **kwargs)

Apply convolution to input then add a bias.

Parameters:
  • data (Symbol) – Input data to the ConvolutionOp.
  • weight (Symbol) – Weight matrix.
  • bias (Symbol) – Bias parameter.
  • kernel (Shape(tuple), required) – convolution kernel size: (h, w) or (d, h, w)
  • stride (Shape(tuple), optional, default=()) – convolution stride: (h, w) or (d, h, w)
  • dilate (Shape(tuple), optional, default=()) – convolution dilate: (h, w) or (d, h, w)
  • pad (Shape(tuple), optional, default=()) – pad for convolution: (h, w) or (d, h, w)
  • num_filter (int (non-negative), required) – convolution filter(channel) number
  • num_group (int (non-negative), optional, default=1) – Number of group partitions. Equivalent to slicing input into num_group partitions, apply convolution on each, then concatenate the results
  • workspace (long (non-negative), optional, default=1024) – Maximum tmp workspace allowed for convolution (MB).
  • no_bias (boolean, optional, default=False) – Whether to disable bias parameter.
  • cudnn_tune ({None, 'fastest', 'limited_workspace', 'off'}, optional, default='None') – Whether to pick the convolution algorithm by running a performance test. This leads to higher startup time but may give faster speed. Options are: ‘off’: no tuning; ‘limited_workspace’: run tests and pick the fastest algorithm that does not exceed the workspace limit; ‘fastest’: pick the fastest algorithm and ignore the workspace limit. If set to None (the default), the behavior is determined by the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT: 0 for off, 1 for limited workspace (default), 2 for fastest.
  • cudnn_off (boolean, optional, default=False) – Turn off cudnn for this layer.
  • layout ({None, 'NCDHW', 'NCHW', 'NDHWC', 'NHWC'},optional, default='None') – Set layout for input, output and weight. Empty for default layout: NCHW for 2d and NCDHW for 3d.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Correlation(*args, **kwargs)

Apply correlation to inputs

Parameters:
  • data1 (Symbol) – Input data1 to the correlation.
  • data2 (Symbol) – Input data2 to the correlation.
  • kernel_size (int (non-negative), optional, default=1) – kernel size for Correlation must be an odd number
  • max_displacement (int (non-negative), optional, default=1) – Max displacement of Correlation
  • stride1 (int (non-negative), optional, default=1) – stride1 quantize data1 globally
  • stride2 (int (non-negative), optional, default=1) – stride2 quantize data2 within the neighborhood centered around data1
  • pad_size (int (non-negative), optional, default=0) – pad for Correlation
  • is_multiply (boolean, optional, default=True) – The operation type is either multiplication or subtraction.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Crop(*args, **kwargs)

Crop the 2nd and 3rd dimensions of the input data, either to the size given by h_w or to the width and height of the second input symbol. That is, with one input, h_w is needed to specify the crop height and width; otherwise, the size of the second input symbol is used. This function supports a variable number of positional inputs.

Parameters:
  • data (Symbol or Symbol[]) – Tensor or List of Tensors, the second input will be used as crop_like shape reference
  • num_args (int, required) – Number of inputs for crop. If it equals one, then h_w is used for the crop height and width; if it equals two, then the height and width of the second input symbol, which we name crop_like here, are used.
  • offset (Shape(tuple), optional, default=(0,0)) – crop offset coordinate: (y, x)
  • h_w (Shape(tuple), optional, default=(0,0)) – crop height and width: (h, w)
  • center_crop (boolean, optional, default=False) – If set to true, a center crop is used; otherwise, the crop uses the shape of crop_like.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Custom(*args, **kwargs)

Custom operator implemented in frontend.

Parameters:
  • op_type (string) – Type of custom operator. Must be registered first.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Deconvolution(*args, **kwargs)

Apply deconvolution to input then add a bias.

Parameters:
  • data (Symbol) – Input data to the DeconvolutionOp.
  • weight (Symbol) – Weight matrix.
  • bias (Symbol) – Bias parameter.
  • kernel (Shape(tuple), required) – deconvolution kernel size: (y, x)
  • stride (Shape(tuple), optional, default=(1,1)) – deconvolution stride: (y, x)
  • pad (Shape(tuple), optional, default=(0,0)) – pad for deconvolution: (y, x), a good number is : (kernel-1)/2, if target_shape set, pad will be ignored and will be computed automatically
  • adj (Shape(tuple), optional, default=(0,0)) – adjustment for output shape: (y, x), if target_shape set, adj will be ignored and will be computed automatically
  • target_shape (Shape(tuple), optional, default=(0,0)) – output shape with target shape: (y, x)
  • num_filter (int (non-negative), required) – deconvolution filter(channel) number
  • num_group (int (non-negative), optional, default=1) – number of groups partition
  • workspace (long (non-negative), optional, default=512) – Tmp workspace for deconvolution (MB)
  • no_bias (boolean, optional, default=True) – Whether to disable bias parameter.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Dropout(*args, **kwargs)

Apply dropout to input. During training, each element of the input is randomly set to zero with probability p, and then the whole tensor is rescaled by 1/(1-p) to keep the expectation the same as before applying dropout. At test time, this behaves as an identity map.

Parameters:
  • data (Symbol) – Input data to dropout.
  • p (float, optional, default=0.5) – Fraction of the input that gets dropped out at training time
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Examples

Apply dropout to corrupt input as zero with probability 0.2:

>>> data = Variable('data')
>>> data_dp = Dropout(data=data, p=0.2)
>>> shape = (100, 100)  # take larger shapes to be more statistically stable
>>> x = numpy.ones(shape)
>>> op = Dropout(p=0.5, name='dp')
>>> # dropout is identity during testing
>>> y = test_utils.simple_forward(op, dp_data=x, is_train=False)
>>> test_utils.almost_equal(x, y, threshold=0)
True
>>> y = test_utils.simple_forward(op, dp_data=x, is_train=True)
>>> # expectation is (approximately) unchanged
>>> numpy.abs(x.mean() - y.mean()) < 0.1
True
>>> set(numpy.unique(y)) == set([0, 2])
True
mxnet.symbol.ElementWiseSum(*args, **kwargs)

Perform element sum of inputs

From: src/operator/tensor/elemwise_sum.cc:56. This function supports a variable number of positional inputs.

Parameters:
  • args (NDArray[]) – List of input tensors
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Embedding(*args, **kwargs)

Map integer indices to vector representations (embeddings). These embeddings are learnable parameters. For an input of shape (d1, ..., dK), the output shape is (d1, ..., dK, output_dim). All input values should be integers in the range [0, input_dim).

From:src/operator/tensor/indexing_op.cc:17

Parameters:
  • data (Symbol) – Input data to the EmbeddingOp.
  • weight (Symbol) – Embedding weight matrix.
  • input_dim (int, required) – vocabulary size of the input indices.
  • output_dim (int, required) – dimension of the embedding vectors.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Examples

Assume we want to map the 26 English alphabet letters to 16-dimensional vectorial representations.

>>> vocabulary_size = 26
>>> embed_dim = 16
>>> seq_len, batch_size = (10, 64)
>>> input = Variable('letters')
>>> op = Embedding(data=input, input_dim=vocabulary_size, output_dim=embed_dim,
...                name='embed')
>>> SymbolDoc.get_output_shape(op, letters=(seq_len, batch_size))
{'embed_output': (10L, 64L, 16L)}
>>> vocab_size, embed_dim = (26, 16)
>>> batch_size = 12
>>> word_vecs = test_utils.random_arrays((vocab_size, embed_dim))
>>> op = Embedding(name='embed', input_dim=vocab_size, output_dim=embed_dim)
>>> x = numpy.random.choice(vocab_size, batch_size)
>>> y = test_utils.simple_forward(op, embed_data=x, embed_weight=word_vecs)
>>> y_np = word_vecs[x]
>>> test_utils.almost_equal(y, y_np)
True
mxnet.symbol.Flatten(*args, **kwargs)

Flatten the input into 2D by collapsing all higher dimensions. A (d1, d2, ..., dK) tensor is flattened to a (d1, d2*...*dK) matrix.

Parameters:
  • data (NDArray) – Input data to reshape.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Examples

Flatten is usually applied before FullyConnected, to reshape the 4D tensor produced by convolutional layers to 2D matrix:

>>> data = Variable('data')  # say this is 4D from some conv/pool
>>> flatten = Flatten(data=data, name='flat')  # now this is 2D
>>> SymbolDoc.get_output_shape(flatten, data=(2, 3, 4, 5))
{'flat_output': (2L, 60L)}
>>> test_dims = [(2, 3, 4, 5), (2, 3), (2,)]
>>> op = Flatten(name='flat')
>>> for dims in test_dims:
...     x = test_utils.random_arrays(dims)
...     y = test_utils.simple_forward(op, flat_data=x)
...     y_np = x.reshape((dims[0], numpy.prod(dims[1:])))
...     print('%s: %s' % (dims, test_utils.almost_equal(y, y_np)))
(2, 3, 4, 5): True
(2, 3): True
(2,): True
mxnet.symbol.FullyConnected(*args, **kwargs)

Apply matrix multiplication to input then add a bias. It maps the input of shape (batch_size, input_dim) to the shape of (batch_size, num_hidden). Learnable parameters include the weights of the linear transform and an optional bias vector.

Parameters:
  • data (Symbol) – Input data to the FullyConnectedOp.
  • weight (Symbol) – Weight matrix.
  • bias (Symbol) – Bias parameter.
  • num_hidden (int, required) – Number of hidden nodes of the output.
  • no_bias (boolean, optional, default=False) – Whether to disable bias parameter.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Examples

Construct a fully connected operator with target dimension 512.

>>> data = Variable('data')  # or some constructed NN
>>> op = FullyConnected(data=data,
...                     num_hidden=512,
...                     name='FC1')
>>> op
<Symbol FC1>
>>> SymbolDoc.get_output_shape(op, data=(128, 100))
{'FC1_output': (128L, 512L)}

A simple 3-layer MLP with ReLU activation:

>>> net = Variable('data')
>>> for i, dim in enumerate([128, 64]):
...     net = FullyConnected(data=net, num_hidden=dim, name='FC%d' % i)
...     net = Activation(data=net, act_type='relu', name='ReLU%d' % i)
>>> # 10-class predictor (e.g. MNIST)
>>> net = FullyConnected(data=net, num_hidden=10, name='pred')
>>> net
<Symbol pred>
>>> dim_in, dim_out = (3, 4)
>>> x, w, b = test_utils.random_arrays((10, dim_in), (dim_out, dim_in), (dim_out,))
>>> op = FullyConnected(num_hidden=dim_out, name='FC')
>>> out = test_utils.simple_forward(op, FC_data=x, FC_weight=w, FC_bias=b)
>>> # numpy implementation of FullyConnected
>>> out_np = numpy.dot(x, w.T) + b
>>> test_utils.almost_equal(out, out_np)
True
mxnet.symbol.IdentityAttachKLSparseReg(*args, **kwargs)

Apply a sparse regularization to the output of a sigmoid activation function.

Parameters:
  • data (Symbol) – Input data.
  • sparseness_target (float, optional, default=0.1) – The sparseness target
  • penalty (float, optional, default=0.001) – The tradeoff parameter for the sparseness penalty
  • momentum (float, optional, default=0.9) – The momentum for running average
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.InstanceNorm(*args, **kwargs)

An operator that takes an n-dimensional input tensor (n > 2) and normalizes the input by subtracting the mean and dividing by the standard deviation calculated over the spatial dimensions. This is an implementation of the operator described in “Instance Normalization: The Missing Ingredient for Fast Stylization”, D. Ulyanov, A. Vedaldi, V. Lempitsky, 2016 (arXiv:1607.08022v2). This layer is similar to batch normalization, with two differences: first, the normalization is carried out per example (‘instance’), not over a batch; second, the same normalization is applied at both test and train time. This operation is also known as ‘contrast normalization’.

Parameters:
  • data (Symbol) – A n-dimensional tensor (n > 2) of the form [batch, channel, spatial_dim1, spatial_dim2, ...].
  • gamma (Symbol) – A vector of length ‘channel’, which multiplies the normalized input.
  • beta (Symbol) – A vector of length ‘channel’, which is added to the product of the normalized input and the weight.
  • eps (float, optional, default=0.001) – Epsilon to prevent division by 0.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
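
The normalization can be sketched in NumPy as follows (an illustration of the formula above with assumed variable names, not the MXNet implementation):

```python
import numpy as np

# Instance norm for a (batch, channel, H, W) input: standardize over
# the spatial axes of each (example, channel) pair, then apply the
# per-channel scale (gamma) and shift (beta).
def instance_norm(x, gamma, beta, eps=1e-3):
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma[None, :, None, None] * x_hat + beta[None, :, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5))
y = instance_norm(x, gamma=np.ones(3), beta=np.zeros(3))
```

Each (example, channel) slice of y then has zero mean and, up to the eps term, unit variance.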

mxnet.symbol.L2Normalization(*args, **kwargs)

Set the L2 norm of each instance to a constant.

Parameters:
  • data (Symbol) – Input data to the L2NormalizationOp.
  • eps (float, optional, default=1e-10) – Epsilon to prevent div 0
  • mode ({'channel', 'instance', 'spatial'},optional, default='instance') – Normalization Mode. If set to instance, this operator will compute a norm for each instance in the batch; this is the default mode. If set to channel, this operator will compute a cross channel norm at each position of each instance. If set to spatial, this operator will compute a norm for each channel.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
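
The default 'instance' mode can be sketched in NumPy (an illustration with assumed names, not the MXNet implementation):

```python
import numpy as np

# 'instance' mode: flatten each example, compute its L2 norm,
# and divide so every instance ends up with unit norm.
def l2_normalize_instance(x, eps=1e-10):
    flat = x.reshape(x.shape[0], -1)
    norm = np.sqrt((flat ** 2).sum(axis=1) + eps)
    return x / norm.reshape((-1,) + (1,) * (x.ndim - 1))

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])
y = l2_normalize_instance(x)
```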

mxnet.symbol.LRN(*args, **kwargs)

Apply local response normalization to the input.

Parameters:
  • data (Symbol) – Input data to the normalization operator.
  • alpha (float, optional, default=0.0001) – value of the alpha variance scaling parameter in the normalization formula
  • beta (float, optional, default=0.75) – value of the beta power parameter in the normalization formula
  • knorm (float, optional, default=2) – value of the k parameter in normalization formula
  • nsize (int (non-negative), required) – normalization window width in elements.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
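
The cross-channel normalization can be sketched with the standard AlexNet-style formula (a NumPy illustration; the exact scaling applied to alpha in the MXNet source may differ):

```python
import numpy as np

# Cross-channel LRN sketch:
#   out_c = x_c / (knorm + alpha/nsize * sum_{c' in window} x_{c'}^2) ** beta
def lrn(x, nsize, alpha=1e-4, beta=0.75, knorm=2.0):
    out = np.empty_like(x)
    half = nsize // 2
    channels = x.shape[1]
    for i in range(channels):
        lo, hi = max(0, i - half), min(channels, i + half + 1)
        sq_sum = (x[:, lo:hi] ** 2).sum(axis=1)
        out[:, i] = x[:, i] / (knorm + alpha / nsize * sq_sum) ** beta
    return out

x = np.ones((1, 4, 2, 2))
y = lrn(x, nsize=3)
```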

mxnet.symbol.LeakyReLU(*args, **kwargs)

Apply a leaky or parametric rectified-linear activation function to the input.

Parameters:
  • data (Symbol) – Input data to activation function.
  • act_type ({'elu', 'leaky', 'prelu', 'rrelu'},optional, default='leaky') – Activation function to be applied.
  • slope (float, optional, default=0.25) – Init slope for the activation. (For leaky and elu only)
  • lower_bound (float, optional, default=0.125) – Lower bound of random slope. (For rrelu only)
  • upper_bound (float, optional, default=0.334) – Upper bound of random slope. (For rrelu only)
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
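
A NumPy sketch of the default 'leaky' variant (illustrative, not the MXNet implementation):

```python
import numpy as np

# Leaky ReLU: negative inputs are scaled by a small slope
# instead of being clipped to zero.
def leaky_relu(x, slope=0.25):
    return np.where(x > 0, x, slope * x)

y = leaky_relu(np.array([-4.0, -1.0, 0.0, 2.0]))
```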

mxnet.symbol.LinearRegressionOutput(*args, **kwargs)

Use linear regression for the final output; this is applied to the final output of a net.

Parameters:
  • data (Symbol) – Input data to function.
  • label (Symbol) – Input label to function.
  • grad_scale (float, optional, default=1) – Scale the gradient by a float factor
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
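
The semantics can be sketched in NumPy with illustrative values: the forward pass is the identity, and the backward pass propagates the residual.

```python
import numpy as np

# Forward: the output is just the prediction.
# Backward: the gradient w.r.t. the data is (prediction - label),
# scaled by grad_scale (here 1).
pred = np.array([1.5, 2.0, 0.5])
label = np.array([1.0, 2.0, 1.0])
out = pred
grad = (pred - label) * 1.0
```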

mxnet.symbol.LogisticRegressionOutput(*args, **kwargs)

Use logistic regression for the final output; this is applied to the final output of a net. Logistic regression is suitable for binary classification or probability prediction tasks.

Parameters:
  • data (Symbol) – Input data to function.
  • label (Symbol) – Input label to function.
  • grad_scale (float, optional, default=1) – Scale the gradient by a float factor
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.MAERegressionOutput(*args, **kwargs)

Use mean absolute error regression for the final output; this is applied to the final output of a net.

Parameters:
  • data (Symbol) – Input data to function.
  • label (Symbol) – Input label to function.
  • grad_scale (float, optional, default=1) – Scale the gradient by a float factor
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.MakeLoss(*args, **kwargs)

Get the output from a symbol and pass a gradient of 1 back. This is used as a terminal loss when unary and binary operators are used to compose a loss without declaring a backward dependency.

Parameters:
  • data (Symbol) – Input data.
  • grad_scale (float, optional, default=1) – gradient scale as a supplement to unary and binary operators
  • valid_thresh (float, optional, default=0) – Regard an element as valid when x > valid_thresh; this is used only in 'valid' normalization mode.
  • normalization ({'batch', 'null', 'valid'}, optional, default='null') – If set to null, the op will not normalize the output gradient. If set to batch, the op will normalize the gradient by dividing by the batch size. If set to valid, the op will normalize the gradient by dividing by the number of samples marked as valid.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Pad(*args, **kwargs)

Pads an n-dimensional input tensor. Allows for precise control of the padding type and how much padding to apply on both sides of a given dimension.

Parameters:
  • data (Symbol) – An n-dimensional input tensor.
  • mode ({'constant', 'edge'}, required) – Padding type to use. “constant” pads all values with a constant value, the value of which can be specified with the constant_value option. “edge” uses the boundary values of the array as padding.
  • pad_width (Shape(tuple), required) – A tuple of padding widths of length 2*r, where r is the rank of the input tensor, specifying number of values padded to the edges of each axis. (before_1, after_1, ... , before_N, after_N) unique pad widths for each axis. Equivalent to pad_width in numpy.pad, but flattened.
  • constant_value (double, optional, default=0) – This option is only used when mode is “constant”. This value will be used as the padding value. Defaults to 0 if not specified.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
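
The pad_width convention is numpy.pad's per-axis (before, after) pairs, flattened into one tuple. A 2D NumPy sketch of the correspondence (the MXNet operator itself targets higher-rank tensors):

```python
import numpy as np

# Pad(..., mode='constant', pad_width=(0, 0, 1, 1), constant_value=9)
# corresponds to numpy.pad with ((0, 0), (1, 1)):
x = np.arange(6.0).reshape(2, 3)
y = np.pad(x, ((0, 0), (1, 1)), mode='constant', constant_values=9)
```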

mxnet.symbol.Pooling(*args, **kwargs)

Perform spatial pooling on inputs.

Parameters:
  • data (Symbol) – Input data to the pooling operator.
  • global_pool (boolean, optional, default=False) – Ignore the kernel size and do global pooling based on the current input feature map. This is useful for inputs with different shapes.
  • kernel (Shape(tuple), required) – pooling kernel size: (y, x) or (d, y, x)
  • pool_type ({'avg', 'max', 'sum'}, required) – Pooling type to be applied.
  • pooling_convention ({'full', 'valid'}, optional, default='valid') – Pooling convention to be applied. 'valid' is the default setting in MXNet and rounds down the output pooling size; 'full' is compatible with Caffe and rounds up the output pooling size.
  • stride (Shape(tuple), optional, default=(1,1)) – stride: for pooling (y, x) or (d, y, x)
  • pad (Shape(tuple), optional, default=(0,0)) – pad for pooling: (y, x) or (d, y, x)
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
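
The difference between the two pooling conventions is just the rounding in the output-size arithmetic; a sketch:

```python
import math

# 'valid' (MXNet default) rounds the output size down;
# 'full' (Caffe-compatible) rounds it up.
def pool_out_size(in_size, kernel, stride, pad, convention='valid'):
    span = in_size + 2 * pad - kernel
    if convention == 'valid':
        return span // stride + 1
    return int(math.ceil(span / stride)) + 1

v = pool_out_size(6, kernel=3, stride=2, pad=0, convention='valid')
f = pool_out_size(6, kernel=3, stride=2, pad=0, convention='full')
```

For a length-6 input with a 3-wide kernel and stride 2, 'valid' gives 2 outputs while 'full' gives 3.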

mxnet.symbol.RNN(*args, **kwargs)

Apply a recurrent layer to input.

Parameters:
  • data (Symbol) – Input data to RNN
  • parameters (Symbol) – Vector of all RNN trainable parameters concatenated
  • state (Symbol) – initial hidden state of the RNN
  • state_cell (Symbol) – initial cell state for LSTM networks (only for LSTM)
  • state_size (int (non-negative), required) – size of the state for each layer
  • num_layers (int (non-negative), required) – number of stacked layers
  • bidirectional (boolean, optional, default=False) – whether to use bidirectional recurrent layers
  • mode ({'gru', 'lstm', 'rnn_relu', 'rnn_tanh'}, required) – the type of RNN to compute
  • p (float, optional, default=0) – Dropout probability, fraction of the input that gets dropped out at training time
  • state_outputs (boolean, optional, default=False) – Whether to have the states as symbol outputs.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.ROIPooling(*args, **kwargs)

Performs region-of-interest pooling on inputs. Bounding box coordinates are resized by spatial_scale and the input feature maps are cropped accordingly. The cropped feature maps are max-pooled to a fixed-size output indicated by pooled_size. batch_size will change to the number of region bounding boxes after ROIPooling.

Parameters:
  • data (Symbol) – Input data to the pooling operator, a 4D Feature maps
  • rois (Symbol) – Bounding box coordinates, a 2D array of [[batch_index, x1, y1, x2, y2]]. (x1, y1) and (x2, y2) are the top-left and bottom-right corners of the designated region of interest. batch_index indicates the index of the corresponding image in the input data
  • pooled_size (Shape(tuple), required) – fix pooled size: (h, w)
  • spatial_scale (float, required) – Ratio of input feature map height (or w) to raw image height (or w). Equals the reciprocal of total stride in convolutional layers
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.Reshape(*args, **kwargs)

Reshape input according to a target shape spec. The target shape is a tuple and can be a simple list of dimensions such as (12,3) or it can incorporate special codes that correspond to contextual operations that refer to the input dimensions. The special codes are all expressed as integers less than 1. These codes effectively refer to a machine that pops input dims off the beginning of the input dims list and pushes resulting output dims onto the end of the output dims list, which starts empty. The codes are:

0 Copy Pop one input dim and push it onto the output dims
-1 Infer Push a dim that is inferred later from all other output dims
-2 CopyAll Pop all remaining input dims and push them onto output dims
-3 Merge2 Pop two input dims, multiply them, and push result
-4 Split2 Pop one input dim, read the next two target shape specs, and push them both onto the output dims (either can be -1 and will be inferred from the other)

The exact mathematical behavior of these codes is given in the description of the ‘shape’ parameter. All non-codes (positive integers) just pop a dim off the input dims (if any), throw it away, and then push the specified integer onto the output dims.

Examples:

    Type     Input         Target              Output
    Copy     (2,3,4)       (4,0,2)             (4,3,2)
    Copy     (2,3,4)       (2,0,0)             (2,3,4)
    Infer    (2,3,4)       (6,1,-1)            (6,1,4)
    Infer    (2,3,4)       (3,-1,8)            (3,1,8)
    CopyAll  (9,8,7)       (-2)                (9,8,7)
    CopyAll  (9,8,7)       (9,-2)              (9,8,7)
    CopyAll  (9,8,7)       (-2,1,1)            (9,8,7,1,1)
    Merge2   (3,4)         (-3)                (12)
    Merge2   (3,4,5)       (-3,0)              (12,5)
    Merge2   (3,4,5)       (0,-3)              (3,20)
    Merge2   (3,4,5,6)     (-3,0,0)            (12,5,6)
    Merge2   (3,4,5,6)     (-3,-2)             (12,5,6)
    Split2   (12)          (-4,6,2)            (6,2)
    Split2   (12)          (-4,2,6)            (2,6)
    Split2   (12)          (-4,-1,6)           (2,6)
    Split2   (12,9)        (-4,2,6,0)          (2,6,9)
    Split2   (12,9,9,9)    (-4,2,6,-2)         (2,6,9,9,9)
    Split2   (12,12)       (-4,2,-1,-4,-1,2)   (2,6,6,2)

From:src/operator/tensor/matrix_op.cc:61

Parameters:
  • data (NDArray) – Input data to reshape.
  • target_shape (Shape(tuple), optional, default=(0,0)) – (Deprecated! Use shape instead.) Target new shape. One and only one dim can be 0, in which case it will be inferred from the rest of dims
  • keep_highest (boolean, optional, default=False) – (Deprecated! Use shape instead.) Whether keep the highest dim unchanged.If set to true, then the first dim in target_shape is ignored,and always fixed as input
  • shape (Shape(tuple), optional, default=()) – Target shape, a tuple t=(t_1,t_2,..,t_m). Let the input dims be s=(s_1,s_2,..,s_n). The output dims u=(u_1,u_2,..,u_p) are computed from s and t. The target shape tuple elements t_i are read in order and used to generate successive output dims u_p:

        t_i      meaning      behavior
        +ve      explicit     u_p = t_i
        0        copy         u_p = s_i
        -1       infer        u_p = (Prod s_i) / (Prod u_j | j != p)
        -2       copy all     u_p = s_i, u_p+1 = s_i+1, ...
        -3       merge two    u_p = s_i * s_i+1
        -4,a,b   split two    u_p = a, u_p+1 = b | a * b = s_i

    The split directive (-4) in the target shape tuple is followed by two dimensions, one of which can be -1, meaning it will be inferred from the other one and the original dimension. There can only be one globally inferred dimension (-1), aside from any -1 occurring in a split directive.
  • reverse (boolean, optional, default=False) – Whether to match the shapes from the backward. If reverse is true, 0 values in the shape argument will be searched from the backward. E.g the original shape is (10, 5, 4) and the shape argument is (-1, 0). If reverse is true, the new shape should be (50, 4). Otherwise it will be (40, 5).
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
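
Three of the special codes map directly onto plain NumPy reshapes; a sketch for an input of shape (2, 3, 4):

```python
import numpy as np

x = np.zeros((2, 3, 4))
# 0 copies an input dim: target (4, 0, 2) -> (4, 3, 2)
copied = x.reshape(4, x.shape[1], 2)
# -1 infers a dim: target (6, 1, -1) -> (6, 1, 4)
inferred = x.reshape(6, 1, -1)
# -3 merges two adjacent dims: target (-3, 0) -> (6, 4)
merged = x.reshape(x.shape[0] * x.shape[1], x.shape[2])
```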

mxnet.symbol.SVMOutput(*args, **kwargs)

Support Vector Machine based transformation of the input; backprop uses the L2-SVM objective.

Parameters:
  • data (Symbol) – Input data to svm.
  • label (Symbol) – Label data.
  • margin (float, optional, default=1) – Scale the DType(param_.margin) for activation size
  • regularization_coefficient (float, optional, default=1) – Scale the coefficient responsible for balancing coefficient size and the error tradeoff
  • use_linear (boolean, optional, default=False) – If set to true, uses the L1-SVM objective function. The default uses the L2-SVM objective
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.SequenceLast(*args, **kwargs)

Takes the last element of a sequence. Takes an n-dimensional tensor of the form [max sequence length, batchsize, other dims] and returns an (n-1)-dimensional tensor of the form [batchsize, other dims]. This operator takes an optional input tensor sequence_length of positive ints of dimension [batchsize] when the use_sequence_length option is set to true. This allows the operator to handle variable-length sequences. If use_sequence_length is false, then each example in the batch is assumed to have the max sequence length.

Parameters:
  • data (Symbol) – n-dimensional input tensor of the form [max sequence length, batchsize, other dims]
  • sequence_length (Symbol) – vector of sequence lengths of size batchsize
  • use_sequence_length (boolean, optional, default=False) – If set to true, this layer takes in extra input sequence_length to specify variable length sequence
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
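
The indexing can be sketched in NumPy (an illustration with assumed names, not the MXNet implementation):

```python
import numpy as np

# For [max_len, batch, ...] data, pick the element at position
# (length - 1) of each sequence in the batch.
def sequence_last(data, lengths):
    batch_idx = np.arange(data.shape[1])
    return data[lengths - 1, batch_idx]

data = np.arange(24.0).reshape(4, 3, 2)   # max_len=4, batch=3
lengths = np.array([4, 2, 3])
last = sequence_last(data, lengths)
```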

mxnet.symbol.SequenceMask(*args, **kwargs)

Sets all elements outside the sequence to a constant value. Takes an n-dimensional tensor of the form [max sequence length, batchsize, other dims] and returns a tensor of the same shape. This operator takes an optional input tensor sequence_length of positive ints of dimension [batchsize] when the use_sequence_length option is set to true. This allows the operator to handle variable-length sequences. If use_sequence_length is false, then each example in the batch is assumed to have the max sequence length, and this operator becomes the identity operator.

Parameters:
  • data (Symbol) – n-dimensional input tensor of the form [max sequence length, batchsize, other dims]
  • sequence_length (Symbol) – vector of sequence lengths of size batchsize
  • use_sequence_length (boolean, optional, default=False) – If set to true, this layer takes in extra input sequence_length to specify variable length sequence
  • value (float, optional, default=0) – The value to be used as a mask.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
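
The masking can be sketched in NumPy (an illustration with assumed names, not the MXNet implementation):

```python
import numpy as np

# For [max_len, batch, ...] data, overwrite every position at or
# beyond each sequence's length with `value`; the shape is preserved.
def sequence_mask(data, lengths, value=0.0):
    trailing = (1,) * (data.ndim - 2)
    steps = np.arange(data.shape[0]).reshape((-1, 1) + trailing)
    keep = steps < lengths.reshape((1, -1) + trailing)
    return np.where(keep, data, value)

data = np.ones((3, 2, 2))                 # max_len=3, batch=2
masked = sequence_mask(data, np.array([1, 3]), value=-1.0)
```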

mxnet.symbol.SequenceReverse(*args, **kwargs)

Reverses the elements of each sequence. Takes an n-dimensional tensor of the form [max sequence length, batchsize, other dims] and returns a tensor of the same shape. This operator takes an optional input tensor sequence_length of positive ints of dimension [batchsize] when the use_sequence_length option is set to true. This allows the operator to handle variable-length sequences. If use_sequence_length is false, then each example in the batch is assumed to have the max sequence length.

Parameters:
  • data (Symbol) – n-dimensional input tensor of the form [max sequence length, batchsize, other dims]
  • sequence_length (Symbol) – vector of sequence lengths of size batchsize
  • use_sequence_length (boolean, optional, default=False) – If set to true, this layer takes in extra input sequence_length to specify variable length sequence
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.SliceChannel(*args, **kwargs)

Slice the input equally along the specified axis.

Parameters:
  • num_outputs (int, required) – Number of outputs to be sliced.
  • axis (int, optional, default='1') – Dimension along which to slice.
  • squeeze_axis (boolean, optional, default=False) – If true AND the sliced dimension becomes 1, squeeze that dimension.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
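
A NumPy sketch of the slicing (illustrative; SliceChannel exposes the pieces as separate symbolic outputs):

```python
import numpy as np

# numpy.split along the chosen axis yields the same equal pieces.
x = np.arange(12.0).reshape(2, 6)
parts = np.split(x, 3, axis=1)   # num_outputs=3, axis=1
```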

mxnet.symbol.Softmax(*args, **kwargs)

DEPRECATED: Perform a softmax transformation on the input. Please use SoftmaxOutput.

Parameters:
  • data (Symbol) – Input data to softmax.
  • grad_scale (float, optional, default=1) – Scale the gradient by a float factor
  • ignore_label (float, optional, default=-1) – the label value will be ignored during backward (only works if use_ignore is set to be true).
  • multi_output (boolean, optional, default=False) – If set to true, for a (n,k,x_1,..,x_n) dimensional input tensor, softmax will generate n*x_1*...*x_n outputs, each with k classes
  • use_ignore (boolean, optional, default=False) – If set to true, the ignore_label value will not contribute to the backward gradient
  • preserve_shape (boolean, optional, default=False) – If true, for a (n_1, n_2, ..., n_d, k) dimensional input tensor, softmax will generate (n1, n2, ..., n_d, k) output, normalizing the k classes as the last dimension.
  • normalization ({'batch', 'null', 'valid'}, optional, default='null') – If set to null, the op will do nothing to the output gradient. If set to batch, the op will normalize the gradient by dividing by the batch size. If set to valid, the op will normalize the gradient by dividing by the number of samples not ignored.
  • out_grad (boolean, optional, default=False) – Apply weighting from output gradient
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.SoftmaxActivation(*args, **kwargs)

Apply softmax activation to input. This is intended for internal layers. For output (loss layer) please use SoftmaxOutput. If mode=instance, this operator will compute a softmax for each instance in the batch; this is the default mode. If mode=channel, this operator will compute a num_channel-class softmax at each position of each instance; this can be used for fully convolutional network, image segmentation, etc.

Parameters:
  • data (Symbol) – Input data to activation function.
  • mode ({'channel', 'instance'},optional, default='instance') – Softmax Mode. If set to instance, this operator will compute a softmax for each instance in the batch; this is the default mode. If set to channel, this operator will compute a num_channel-class softmax at each position of each instance; this can be used for fully convolutional network, image segmentation, etc.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.SoftmaxOutput(*args, **kwargs)

Perform a softmax transformation on input, backprop with logloss.

Parameters:
  • data (Symbol) – Input data to softmax.
  • label (Symbol) – Label data, can also be probability value with same shape as data
  • grad_scale (float, optional, default=1) – Scale the gradient by a float factor
  • ignore_label (float, optional, default=-1) – the label value will be ignored during backward (only works if use_ignore is set to be true).
  • multi_output (boolean, optional, default=False) – If set to true, for a (n,k,x_1,..,x_n) dimensional input tensor, softmax will generate n*x_1*...*x_n outputs, each with k classes
  • use_ignore (boolean, optional, default=False) – If set to true, the ignore_label value will not contribute to the backward gradient
  • preserve_shape (boolean, optional, default=False) – If true, for a (n_1, n_2, ..., n_d, k) dimensional input tensor, softmax will generate (n1, n2, ..., n_d, k) output, normalizing the k classes as the last dimension.
  • normalization ({'batch', 'null', 'valid'}, optional, default='null') – If set to null, the op will do nothing to the output gradient. If set to batch, the op will normalize the gradient by dividing by the batch size. If set to valid, the op will normalize the gradient by dividing by the number of samples not ignored.
  • out_grad (boolean, optional, default=False) – Apply weighting from output gradient
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
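
The forward softmax and its log-loss gradient with integer labels can be sketched as follows (illustrative NumPy, one example with three classes):

```python
import numpy as np

def softmax(x):
    # Subtract the row max for numerical stability.
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

x = np.array([[1.0, 2.0, 3.0]])
label = np.array([2])
p = softmax(x)
# Backprop with log loss: grad = softmax(x) - one_hot(label)
grad = p.copy()
grad[np.arange(len(label)), label] -= 1.0
```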

mxnet.symbol.SpatialTransformer(*args, **kwargs)

Apply spatial transformer to input feature map.

Parameters:
  • data (Symbol) – Input data to the SpatialTransformerOp.
  • loc (Symbol) – The localisation net; the output dim should be 6 when transform_type is affine. The name of the loc symbol should preferably start with ‘stn_loc’ so that it is initialized with the identity transform; otherwise you should initialize the weight and bias yourself.
  • target_shape (Shape(tuple), optional, default=(0,0)) – output shape(h, w) of spatial transformer: (y, x)
  • transform_type ({'affine'}, required) – transformation type
  • sampler_type ({'bilinear'}, required) – sampling type
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.SwapAxis(*args, **kwargs)

Swap two axes of the input tensor.

Parameters:
  • data (Symbol) – Input data to the SwapAxisOp.
  • dim1 (int (non-negative), optional, default=0) – the first axis to be swapped.
  • dim2 (int (non-negative), optional, default=0) – the second axis to be swapped.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
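
A NumPy sketch of the semantics (illustrative, not the symbolic API):

```python
import numpy as np

# SwapAxis(dim1=0, dim2=2) corresponds to numpy.swapaxes:
x = np.zeros((2, 3, 4))
y = np.swapaxes(x, 0, 2)
```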

mxnet.symbol.UpSampling(*args, **kwargs)

Perform nearest-neighbor or bilinear upsampling on the inputs. This function supports a variable number of positional inputs.

Parameters:
  • data (Symbol[]) – Array of tensors to upsample
  • scale (int (non-negative), required) – Up sampling scale
  • num_filter (int (non-negative), optional, default=0) – Input filter. Only used by bilinear sample_type.
  • sample_type ({'bilinear', 'nearest'}, required) – upsampling method
  • multi_input_mode ({'concat', 'sum'},optional, default='concat') – How to handle multiple input. concat means concatenate upsampled images along the channel dimension. sum means add all images together, only available for nearest neighbor upsampling.
  • num_args (int, required) – Number of inputs to be upsampled. For nearest-neighbor upsampling this can be 1-N; the size of the output will be (scale*h_0, scale*w_0) and all other inputs will be upsampled to the same size. For bilinear upsampling this must be 2: 1 input and 1 weight.
  • workspace (long (non-negative), optional, default=512) – Tmp workspace for deconvolution (MB)
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.abs(*args, **kwargs)

Take absolute value of the src

From:src/operator/tensor/elemwise_unary_op.cc:67

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.adam_update(*args, **kwargs)

Updater function for the Adam optimizer.

Parameters:
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.arccos(*args, **kwargs)

Take arccos of the src

From:src/operator/tensor/elemwise_unary_op.cc:219

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.arccosh(*args, **kwargs)

Take arccosh of the src

From:src/operator/tensor/elemwise_unary_op.cc:291

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.arcsin(*args, **kwargs)

Take arcsin of the src

From:src/operator/tensor/elemwise_unary_op.cc:210

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.arcsinh(*args, **kwargs)

Take arcsinh of the src

From:src/operator/tensor/elemwise_unary_op.cc:282

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.arctan(*args, **kwargs)

Take arctan of the src

From:src/operator/tensor/elemwise_unary_op.cc:228

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.arctanh(*args, **kwargs)

Take arctanh of the src

From:src/operator/tensor/elemwise_unary_op.cc:300

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.argmax(*args, **kwargs)

Compute argmax

From:src/operator/tensor/broadcast_reduce_op_index.cc:11

Parameters:
  • data (NDArray) – Source input
  • axis (int, optional, default='-1') – Empty or unsigned. The axis to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.argmax_channel(*args, **kwargs)
Parameters:
  • src (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.argmin(*args, **kwargs)

Compute argmin

From:src/operator/tensor/broadcast_reduce_op_index.cc:19

Parameters:
  • data (NDArray) – Source input
  • axis (int, optional, default='-1') – Empty or unsigned. The axis to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.argsort(*args, **kwargs)

Returns the indices that would sort an array.

From:src/operator/tensor/ordering_op.cc:89

Parameters:
  • src (NDArray) – Source input
  • axis (int or None, optional, default='-1') – Axis along which to sort the input tensor. If not given, the flattened array is used. Default is -1.
  • is_ascend (boolean, optional, default=True) – Whether sort in ascending or descending order.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
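
The result is the same index permutation NumPy's argsort produces; a quick NumPy illustration of ascending versus descending order (is_ascend=False can be mimicked by negating the input):

```python
import numpy as np

x = np.array([3.0, 1.0, 2.0])

asc = np.argsort(x)     # ascending: indices that sort x -> [1, 2, 0]
desc = np.argsort(-x)   # descending order via negation
```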

mxnet.symbol.batch_dot(*args, **kwargs)

Calculate batched dot product of two matrices. (batch, M, K) X (batch, K, N) –> (batch, M, N).

From:src/operator/tensor/matrix_op.cc:254

Parameters:
  • lhs (NDArray) – Left input
  • rhs (NDArray) – Right input
  • transpose_a (boolean, optional, default=False) – True if the first matrix is transposed.
  • transpose_b (boolean, optional, default=False) – True if the second matrix is transposed.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
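
The shape contract (batch, M, K) x (batch, K, N) -> (batch, M, N) means one matrix product per batch element. A NumPy sketch of that contract (using einsum as a stand-in for the operator):

```python
import numpy as np

batch, M, K, N = 2, 3, 4, 5
a = np.random.rand(batch, M, K)
b = np.random.rand(batch, K, N)

# One matrix multiply per batch element.
out = np.einsum('bmk,bkn->bmn', a, b)   # shape (batch, M, N)
```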

mxnet.symbol.broadcast_add(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_axis(*args, **kwargs)

Broadcast src along axis

From:src/operator/tensor/broadcast_reduce_op_value.cc:76

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – The axes to perform the broadcasting.
  • size (Shape(tuple), optional, default=()) – Target sizes of the broadcasting axes.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_div(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_equal(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_greater(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_greater_equal(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_hypot(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_lesser(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_lesser_equal(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_maximum(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_minimum(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_minus(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_mul(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_not_equal(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_plus(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_power(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_sub(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.broadcast_to(*args, **kwargs)

Broadcast src to shape

From:src/operator/tensor/broadcast_reduce_op_value.cc:83

Parameters:
  • data (NDArray) – Source input
  • shape (Shape(tuple), optional, default=()) – The shape of the desired array. A dimension can be set to zero if it is the same as the original. E.g., A = broadcast_to(B, shape=(10, 0, 0)) has the same meaning as A = broadcast_axis(B, axis=0, size=10).
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
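
The zero-means-keep-original convention for the target shape can be illustrated with NumPy's broadcast_to (a stand-in for the symbolic operator; the zero-translation step is spelled out explicitly):

```python
import numpy as np

b = np.ones((1, 2, 1))

# A 0 in the target shape means "keep the original size of that
# dimension", so shape=(10, 0, 0) on a (1, 2, 1) input means (10, 2, 1).
requested = (10, 0, 0)
target = tuple(t if t != 0 else s for t, s in zip(requested, b.shape))
a = np.broadcast_to(b, target)
```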

mxnet.symbol.ceil(*args, **kwargs)

Take ceil of the src

From:src/operator/tensor/elemwise_unary_op.cc:90

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.choose_element_0index(*args, **kwargs)

Choose one element from each line (row for Python, column for R/Julia) of lhs according to the index indicated by rhs. This function assumes rhs uses a 0-based index.

Parameters:
  • lhs (NDArray) – Left operand to the function.
  • rhs (NDArray) – Right operand to the function.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
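
For a 2-D input this is a per-row gather; a NumPy sketch of the selection semantics (integer fancy indexing stands in for the operator):

```python
import numpy as np

lhs = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])
rhs = np.array([2, 0])   # one 0-based column index per row

# Pick one element from each row according to rhs.
picked = lhs[np.arange(lhs.shape[0]), rhs]   # array([3., 4.])
```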

mxnet.symbol.clip(*args, **kwargs)

Clip ndarray elements to the range [a_min, a_max]

Parameters:
  • src (NDArray) – Source input
  • a_min (real_t) – Minimum value
  • a_max (real_t) – Maximum value
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
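
The semantics match NumPy's clip: values below a_min are set to a_min and values above a_max to a_max. A quick NumPy illustration:

```python
import numpy as np

x = np.array([-2.0, 0.5, 3.0])
clipped = np.clip(x, -1.0, 1.0)   # values forced into [a_min, a_max]
```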

mxnet.symbol.cos(*args, **kwargs)

Take cos of the src

From:src/operator/tensor/elemwise_unary_op.cc:192

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.cosh(*args, **kwargs)

Take cosh of the src

From:src/operator/tensor/elemwise_unary_op.cc:264

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.crop(*args, **kwargs)

Crop the input tensor and return a new one.

The input and output (if explicitly given) have the same data type and reside on the same device.

From:src/operator/tensor/matrix_op.cc:142

Parameters:
  • data (NDArray) – Source input
  • begin (Shape(tuple), required) – starting coordinates
  • end (Shape(tuple), required) – ending coordinates
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.degrees(*args, **kwargs)

Take degrees of the src

From:src/operator/tensor/elemwise_unary_op.cc:237

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.dot(*args, **kwargs)

Calculate dot product of two matrices or two vectors.

From:src/operator/tensor/matrix_op.cc:228

Parameters:
  • lhs (NDArray) – Left input
  • rhs (NDArray) – Right input
  • transpose_a (boolean, optional, default=False) – True if the first matrix is transposed.
  • transpose_b (boolean, optional, default=False) – True if the second matrix is transposed.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
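
The transpose flags apply to the operands before the product. A NumPy sketch of what transpose_a=True computes for 2-D inputs (NumPy again standing in for the symbolic graph):

```python
import numpy as np

a = np.random.rand(3, 4)
b = np.random.rand(3, 5)

# transpose_a=True: transpose the first operand before multiplying,
# so (4, 3) x (3, 5) -> (4, 5).
out = a.T @ b
```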

mxnet.symbol.elemwise_add(*args, **kwargs)
Parameters:
  • lhs (NDArray) – first input
  • rhs (NDArray) – second input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.exp(*args, **kwargs)

Take exp of the src

From:src/operator/tensor/elemwise_unary_op.cc:138

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.expand_dims(*args, **kwargs)

Expand the shape of array by inserting a new axis.

From:src/operator/tensor/matrix_op.cc:121

Parameters:
  • data (NDArray) – Source input
  • axis (int (non-negative), required) – Position (among the axes) where the new axis is to be inserted.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
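
The behavior mirrors NumPy's expand_dims: the output rank grows by one, with a size-one axis at the requested position. A quick NumPy illustration:

```python
import numpy as np

x = np.zeros((2, 3))
y = np.expand_dims(x, axis=1)   # new size-one axis at position 1
```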

mxnet.symbol.expm1(*args, **kwargs)

Take exp(x) - 1 in a numerically stable way

From:src/operator/tensor/elemwise_unary_op.cc:183

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.fill_element_0index(*args, **kwargs)

Fill one element of each line (row for Python, column for R/Julia) in lhs according to the index indicated by rhs and the values indicated by mhs. This function assumes rhs uses a 0-based index.

Parameters:
  • lhs (NDArray) – Left operand to the function.
  • mhs (NDArray) – Middle operand to the function.
  • rhs (NDArray) – Right operand to the function.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.fix(*args, **kwargs)

Round elements of the src toward zero (truncate toward the integer nearest zero)

From:src/operator/tensor/elemwise_unary_op.cc:105

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
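
This rounds toward zero, unlike round/rint, which round to the nearest integer. A NumPy sketch of the distinction (np.fix has the same round-toward-zero behavior):

```python
import numpy as np

x = np.array([-1.7, -0.2, 0.2, 1.7])

# Round toward zero: -1.7 -> -1, -0.2 -> 0, 0.2 -> 0, 1.7 -> 1.
out = np.fix(x)
```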

mxnet.symbol.flip(*args, **kwargs)

Flip the input tensor along axis and return a new one.

From:src/operator/tensor/matrix_op.cc:216

Parameters:
  • data (NDArray) – Source input
  • axis (int, required) – The dimension to flip
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.floor(*args, **kwargs)

Take floor of the src

From:src/operator/tensor/elemwise_unary_op.cc:95

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.gamma(*args, **kwargs)

Take the gamma function (extension of the factorial function) of the src

From:src/operator/tensor/elemwise_unary_op.cc:309

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.gammaln(*args, **kwargs)

Take gammaln (log of the absolute value of gamma(x)) of the src

From:src/operator/tensor/elemwise_unary_op.cc:318

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.identity(*args, **kwargs)

Identity mapping, copy src to output

From:src/operator/tensor/elemwise_unary_op.cc:14

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.log(*args, **kwargs)

Take log of the src

From:src/operator/tensor/elemwise_unary_op.cc:144

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.log10(*args, **kwargs)

Take base-10 log of the src

From:src/operator/tensor/elemwise_unary_op.cc:150

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.log1p(*args, **kwargs)

Take log(1 + x) in a numerically stable way

From:src/operator/tensor/elemwise_unary_op.cc:174

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.log2(*args, **kwargs)

Take base-2 log of the src

From:src/operator/tensor/elemwise_unary_op.cc:156

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.max(*args, **kwargs)

Compute max along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:57

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
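
The reductions in this family (max, min, sum, prod, ...) share the same axis/keepdims conventions. A NumPy sketch of the three common cases (NumPy standing in for the symbolic operator):

```python
import numpy as np

x = np.array([[1.0, 5.0],
              [3.0, 2.0]])

m_global = np.max(x)                       # axis empty -> global reduction
m_axis = np.max(x, axis=0)                 # per-column maximum
m_kd = np.max(x, axis=0, keepdims=True)    # reduced axis kept with size one
```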

mxnet.symbol.max_axis(*args, **kwargs)

Compute max along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:57

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.min(*args, **kwargs)

Compute min along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:67

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.min_axis(*args, **kwargs)

Compute min along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:67

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.nanprod(*args, **kwargs)

Compute product of src along axis, ignoring NaN values. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:47

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.nansum(*args, **kwargs)

Sum src along axis, ignoring NaN values. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:37

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.negative(*args, **kwargs)

Negate src

From:src/operator/tensor/elemwise_unary_op.cc:61

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.norm(*args, **kwargs)
Parameters:
  • src (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.normal(*args, **kwargs)

Sample a normal distribution

Parameters:
  • loc (float, optional, default=0) – Mean of the distribution.
  • scale (float, optional, default=1) – Standard deviation of the distribution.
  • shape (Shape(tuple), optional, default=()) – The shape of the output
  • ctx (string, optional, default='') – Context of output, in format [cpu|gpu|cpu_pinned](n). Only used for imperative calls.
  • dtype (int, optional, default='0') – DType of the output
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.prod(*args, **kwargs)

Compute product of src along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:27

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.radians(*args, **kwargs)

Take radians of the src

From:src/operator/tensor/elemwise_unary_op.cc:246

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.rint(*args, **kwargs)

Take round of the src to nearest integer

From:src/operator/tensor/elemwise_unary_op.cc:100

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.round(*args, **kwargs)

Take round of the src

From:src/operator/tensor/elemwise_unary_op.cc:85

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.rsqrt(*args, **kwargs)

Take reciprocal square root of the src

From:src/operator/tensor/elemwise_unary_op.cc:128

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sgd_mom_update(*args, **kwargs)

Updater function for sgd optimizer

Parameters:
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sgd_update(*args, **kwargs)

Updater function for sgd optimizer

Parameters:
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sign(*args, **kwargs)

Take sign of the src

From:src/operator/tensor/elemwise_unary_op.cc:76

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sin(*args, **kwargs)

Take sin of the src

From:src/operator/tensor/elemwise_unary_op.cc:165

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sinh(*args, **kwargs)

Take sinh of the src

From:src/operator/tensor/elemwise_unary_op.cc:255

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.slice_axis(*args, **kwargs)

Slice the input along certain axis and return a sliced array.

From:src/operator/tensor/matrix_op.cc:197

Parameters:
  • data (NDArray) – Source input
  • axis (int, required) – The axis to be sliced
  • begin (int, required) – The beginning index to be sliced
  • end (int, required) – The end index to be sliced
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.smooth_l1(*args, **kwargs)

Calculate Smooth L1 Loss(lhs, scalar)

From:src/operator/tensor/elemwise_binary_scalar_op_extended.cc:63

Parameters:
  • data (NDArray) – source input
  • scalar (float) – scalar input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.softmax_cross_entropy(*args, **kwargs)

Calculate cross_entropy(lhs, one_hot(rhs))

From:src/operator/loss_binary_op.cc:12

Parameters:
  • data (NDArray) – Input data
  • label (NDArray) – Input label
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sort(*args, **kwargs)

Return a sorted copy of an array.

From:src/operator/tensor/ordering_op.cc:59

Parameters:
  • src (NDArray) – Source input
  • axis (int or None, optional, default='-1') – Axis along which to sort the input tensor. If not given, the flattened array is used. Default is -1.
  • is_ascend (boolean, optional, default=True) – Whether sort in ascending or descending order.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sqrt(*args, **kwargs)

Take square root of the src

From:src/operator/tensor/elemwise_unary_op.cc:119

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.square(*args, **kwargs)

Take square of the src

From:src/operator/tensor/elemwise_unary_op.cc:110

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sum(*args, **kwargs)

Sum src along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:17

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.sum_axis(*args, **kwargs)

Sum src along axis. If axis is empty, global reduction is performed

From:src/operator/tensor/broadcast_reduce_op_value.cc:17

Parameters:
  • data (NDArray) – Source input
  • axis (Shape(tuple), optional, default=()) – Empty or unsigned or tuple. The axes to perform the reduction. If left empty, a global reduction will be performed.
  • keepdims (boolean, optional, default=False) – If true, the axis which is reduced is left in the result as dimension with size one.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.tan(*args, **kwargs)

Take tan of the src

From:src/operator/tensor/elemwise_unary_op.cc:201

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.tanh(*args, **kwargs)

Take tanh of the src

From:src/operator/tensor/elemwise_unary_op.cc:273

Parameters:
  • data (NDArray) – Source input
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

mxnet.symbol.topk(*args, **kwargs)

Return the top k elements of an input tensor along a given axis.

From:src/operator/tensor/ordering_op.cc:18

Parameters:
  • src (NDArray) – Source input
  • axis (int or None, optional, default='-1') – Axis along which to choose the top k indices. If not given, the flattened array is used. Default is -1.
  • k (int, optional, default='1') – Number of top elements to select; should always be smaller than or equal to the number of elements along the given axis. A global sort is performed if k < 1.
  • ret_typ ({'both', 'indices', 'mask', 'value'},optional, default='indices') – The return type. “value” means returning the top k values, “indices” means returning the indices of the top k values, “mask” means to return a mask array containing 0 and 1. 1 means the top k values. “both” means to return both value and indices.
  • is_ascend (boolean, optional, default=False) – Whether to choose the k smallest (True) or k largest (False) elements. The k largest elements are chosen if set to False.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
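
With the default ret_typ='indices' and is_ascend=False, the result is the indices of the k largest values. A NumPy sketch of those semantics (argsort on the negated input stands in for the operator):

```python
import numpy as np

x = np.array([1.0, 4.0, 2.0, 5.0])
k = 2

idx = np.argsort(-x)[:k]   # indices of the k largest values -> [3, 1]
values = x[idx]            # the corresponding top-k values (ret_typ='value')
```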

mxnet.symbol.transpose(*args, **kwargs)

Transpose the input tensor and return a new one

From:src/operator/tensor/matrix_op.cc:93

Parameters:
  • data (NDArray) – Source input
  • axes (Shape(tuple), optional, default=()) – Target axis order. By default the axes will be inverted.
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol
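
As with NumPy's transpose, an empty axes tuple reverses the axis order, while an explicit tuple gives the target order. A quick NumPy illustration:

```python
import numpy as np

x = np.zeros((2, 3, 4))

inverted = np.transpose(x)                # default: axes reversed -> (4, 3, 2)
custom = np.transpose(x, axes=(0, 2, 1))  # explicit target axis order
```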

mxnet.symbol.uniform(*args, **kwargs)

Sample a uniform distribution

Parameters:
  • low (float, optional, default=0) – The lower bound of distribution
  • high (float, optional, default=1) – The upper bound of distribution
  • shape (Shape(tuple), optional, default=()) – The shape of the output
  • ctx (string, optional, default='') – Context of output, in format [cpu|gpu|cpu_pinned](n). Only used for imperative calls.
  • dtype (int, optional, default='0') – DType of the output
  • name (string, optional.) – Name of the resulting symbol.
Returns:

symbol – The result symbol.

Return type:

Symbol

Execution API Reference

Symbolic Executor component of MXNet.

class mxnet.executor.Executor(handle, symbol, ctx, grad_req, group2ctx)

Executor is the actual executing object of MXNet.

forward(is_train=False, **kwargs)

Calculate the outputs specified by the bound symbol.

Parameters:
  • is_train (bool, optional) – whether this forward pass is run in training mode.
  • **kwargs – Additional specification of input arguments.

Examples

>>> # doing forward by specifying data
>>> texec.forward(is_train=True, data=mydata)
>>> # doing forward by not specifying things, but copy to the executor before hand
>>> mydata.copyto(texec.arg_dict['data'])
>>> texec.forward(is_train=True)
>>> # doing forward by specifying data and get outputs
>>> outputs = texec.forward(is_train=True, data=mydata)
>>> print(outputs[0].asnumpy())
backward(out_grads=None)

Do backward pass to get the gradient of arguments.

Parameters:out_grads (NDArray or list of NDArray or dict of str to NDArray, optional) – Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
set_monitor_callback(callback)

Install callback.

Parameters:callback (function) – Takes a string and an NDArrayHandle.
arg_dict

Get dictionary representation of argument arrays.

Returns:arg_dict – The dictionary that maps name of arguments to NDArrays.
Return type:dict of str to NDArray
Raises:ValueError : if there are duplicated names in the arguments.
grad_dict

Get dictionary representation of gradient arrays.

Returns:grad_dict – The dictionary that maps name of arguments to gradient arrays.
Return type:dict of str to NDArray
aux_dict

Get dictionary representation of auxiliary states arrays.

Returns:aux_dict – The dictionary that maps name of auxiliary states to NDArrays.
Return type:dict of str to NDArray
Raises:ValueError : if there are duplicated names in the auxiliary states.
output_dict

Get dictionary representation of output arrays.

Returns:output_dict – The dictionary that maps name of output names to NDArrays.
Return type:dict of str to NDArray
Raises:ValueError : if there are duplicated names in the outputs.
copy_params_from(arg_params, aux_params=None, allow_extra_params=False)

Copy parameters from arg_params, aux_params into executor’s internal array.

Parameters:
  • arg_params (dict of str to NDArray) – Parameters, dict of name to NDArray of arguments
  • aux_params (dict of str to NDArray, optional) – Parameters, dict of name to NDArray of auxiliary states.
  • allow_extra_params (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If True, no error is thrown when arg_params or aux_params contain extra parameters not needed by the executor.
Raises:

ValueError – If there are additional parameters in the dict but allow_extra_params=False

reshape(partial_shaping=False, allow_up_sizing=False, **kwargs)

Return a new executor with the same symbol and shared memory, but different input/output shapes. Useful for runtime reshaping, variable-length sequences, etc. The returned executor shares state with the current one and cannot be used in parallel with it.

Parameters:
  • partial_shaping (bool) – Whether to allow changing the shape of unspecified arguments.
  • allow_up_sizing (bool) – Whether to allow allocating new NDArrays that are larger than the original.
  • kwargs (dict of string to tuple of int) – new shape for arguments.
Returns:

exec – A new executor that shares memory with self.

Return type:

Executor

debug_str()

Get a debug string about internal execution plan.

Returns:debug_str – Debug string of the executor.
Return type:string

Testing Utility Reference

Tools for testing.

mxnet.test_utils.default_context()

Get default context for regression test.

mxnet.test_utils.set_default_context(ctx)

Set the default context.

mxnet.test_utils.default_dtype()

Get default data type for regression test.

mxnet.test_utils.default_numerical_threshold()

Get default numerical threshold for regression test.

mxnet.test_utils.random_arrays(*shapes)

Generate some random numpy arrays.

mxnet.test_utils.np_reduce(dat, axis, keepdims, numpy_reduce_func)

Compatibility reduce function for older versions of NumPy

Parameters:
  • dat (np.ndarray) – Same as Numpy
  • axis (None or int or list-like) – Same as Numpy
  • keepdims (bool) – Same as Numpy
  • numpy_reduce_func (function) – Numpy reducing function like np.sum or np.max
mxnet.test_utils.print_max_err_loc(a, b, rtol=1e-07, atol=0)

Print the location of the maximum violation.

mxnet.test_utils.same(a, b)

Test if two numpy arrays are the same.

Parameters:
  • a (np.ndarray) –
  • b (np.ndarray) –
mxnet.test_utils.reldiff(a, b)

Calculate the relative difference between two input arrays.

Calculated by \(\frac{|a-b|_1}{|a|_1 + |b|_1}\)

Parameters:
  • a (np.ndarray) –
  • b (np.ndarray) –
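The formula above translates directly to NumPy. A minimal sketch (reldiff_sketch is an illustrative name, not the library function), with the 0/0 case defined as zero:

```python
import numpy as np

def reldiff_sketch(a, b):
    # |a - b|_1 / (|a|_1 + |b|_1), with 0/0 defined as 0.
    diff = np.abs(a - b).sum()
    norm = np.abs(a).sum() + np.abs(b).sum()
    if norm == 0:
        return 0.0
    return diff / norm
```

Identical arrays give 0; the value grows toward 1 as the arrays diverge, which makes it a scale-invariant check.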
mxnet.test_utils.almost_equal(a, b, threshold=None)

Test if two numpy arrays are almost equal.

mxnet.test_utils.assert_almost_equal(a, b, threshold=None)

Test that two numpy arrays are almost equal. Raises an exception with a message if they are not.

Parameters:
  • a (np.ndarray) –
  • b (np.ndarray) –
  • threshold (None or float) – The checking threshold. Default threshold will be used if set to None
mxnet.test_utils.almost_equal_ignore_nan(a, b, rtol=None, atol=None)

Test that two numpy arrays are almost equal, ignoring NaN in either array. Combines a relative and an absolute measure of approximate equality. If either the relative or the absolute check passes, the arrays are considered equal. Including an absolute check resolves issues with the relative check when all array values are close to zero.

Parameters:
  • a (np.ndarray) –
  • b (np.ndarray) –
  • rtol (None or float) – The relative threshold. Default threshold will be used if set to None
  • atol (None or float) – The absolute threshold. Default threshold will be used if set to None
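The "relative OR absolute, with NaNs masked out" logic described above can be sketched in plain NumPy (all names here are illustrative; this is not the library implementation):

```python
import numpy as np

def almost_equal_ignore_nan_sketch(a, b, rtol=1e-5, atol=1e-8):
    """Sketch: drop positions where either array is NaN, then pass if
    EITHER the relative OR the absolute check succeeds."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mask = ~(np.isnan(a) | np.isnan(b))   # keep positions valid in both
    a, b = a[mask], b[mask]
    if a.size == 0:                        # nothing left to compare
        return True
    diff = np.abs(a - b)
    absolute_ok = diff.max() <= atol
    denom = np.abs(a).sum() + np.abs(b).sum()
    relative_ok = denom == 0 or diff.sum() / denom <= rtol
    return bool(absolute_ok or relative_ok)
```

The absolute branch is what rescues near-zero arrays: their relative difference can be large even when every element differs by less than atol.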
mxnet.test_utils.simple_forward(sym, ctx=None, is_train=False, **inputs)

A simple forward function for a symbol.

Primarily used in doctests to conveniently test the function of a symbol. Takes numpy arrays as inputs; outputs are also converted to numpy arrays.

Parameters:
  • ctx (Context) – If None, will take the default context.
  • inputs (keyword arguments) – Mapping each input name to a numpy array.
Returns:

  • The result as a numpy array. Multiple results are returned as a list of numpy arrays.

mxnet.test_utils.numeric_grad(executor, location, aux_states=None, eps=0.0001, use_forward_train=True)

Calculates a numeric gradient via the finite-difference method.

Class based on Theano’s theano.gradient.numeric_grad [1]

Parameters:
  • executor (Executor) – Executor that computes the forward pass.
  • location (list of numpy.ndarray or dict of str to numpy.ndarray) – Argument values used as location to compute the gradient. Maps the names of arguments to the corresponding numpy.ndarray. Values of all the arguments must be provided.
  • aux_states (None or list of numpy.ndarray or dict of str to numpy.ndarray, optional) – Auxiliary state values used as location to compute the gradient. Maps the names of aux_states to the corresponding numpy.ndarray. Values of all the auxiliary arguments must be provided.
  • eps (float, optional) – epsilon for the finite-difference method
  • use_forward_train (bool, optional) – Whether to use is_train=True in testing.

References

.. [1] https://github.com/Theano/Theano/blob/master/theano/gradient.py
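The idea behind the finite-difference method can be shown without MXNet: perturb each input element by eps in both directions and difference the forward outputs. A minimal central-difference sketch for a scalar-valued numpy function (numeric_grad_sketch is an illustrative name, not the library class):

```python
import numpy as np

def numeric_grad_sketch(f, x, eps=1e-4):
    """Central finite-difference gradient of a scalar function f at x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    flat = x.ravel()          # view into x: perturbations hit f's input
    g = grad.ravel()
    for i in range(flat.size):
        orig = flat[i]
        flat[i] = orig + eps
        f_plus = f(x)
        flat[i] = orig - eps
        f_minus = f(x)
        flat[i] = orig        # restore the original value
        g[i] = (f_plus - f_minus) / (2 * eps)
    return grad
```

For f(x) = sum(x**2) the result agrees with the analytic gradient 2*x up to O(eps**2) error, which is why the central difference is preferred over a one-sided difference.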

mxnet.test_utils.check_numeric_gradient(sym, location, aux_states=None, numeric_eps=0.0001, check_eps=0.01, grad_nodes=None, use_forward_train=True, ctx=None)

Verify an operation by checking its backward pass via the finite-difference method.

Based on Theano’s theano.gradient.verify_grad [1]

Parameters:
  • sym (Symbol) – Symbol containing op to test
  • location (list or tuple or dict) –

    Argument values used as location to compute gradient

    • if type is list of numpy.ndarray
      inner elements should have the same order as mxnet.sym.list_arguments().
    • if type is dict of str -> numpy.ndarray
      maps the name of arguments to the corresponding numpy.ndarray.

    In either case, values of all the arguments must be provided.

  • aux_states (list or tuple or dict, optional) – The auxiliary states required when generating the executor for the symbol
  • numeric_eps (float, optional) – Delta for the finite difference method that approximates the gradient
  • check_eps (float, optional) – relative error eps used when comparing numeric grad to symbolic grad
  • grad_nodes (None or list or tuple or dict, optional) – Names of the nodes to check gradient on
  • use_forward_train (bool) – Whether to use is_train=True when computing the finite-difference
  • ctx (Context, optional) – Check the gradient computation on the specified device

References

.. [1] https://github.com/Theano/Theano/blob/master/theano/gradient.py
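The verification pattern is: compute the analytic gradient, estimate a numeric gradient by finite differences, and require their relative error to fall below check_eps. An MXNet-free sketch of the same pattern (all names illustrative), using the relative-difference measure from reldiff above:

```python
import numpy as np

def verify_grad_sketch(f, analytic_grad, x, eps=1e-4, check_eps=1e-2):
    """Compare an analytic gradient of scalar f against a
    central finite-difference estimate at point x."""
    x = np.asarray(x, dtype=float)
    numeric = np.zeros_like(x)
    for i in range(x.size):
        x_plus, x_minus = x.copy(), x.copy()
        x_plus.flat[i] += eps
        x_minus.flat[i] -= eps
        numeric.flat[i] = (f(x_plus) - f(x_minus)) / (2 * eps)
    sym = analytic_grad(x)
    # Relative error: |numeric - sym|_1 / (|numeric|_1 + |sym|_1)
    rel = np.abs(numeric - sym).sum() / (np.abs(numeric).sum() + np.abs(sym).sum())
    assert rel < check_eps, "gradient check failed: rel err %g" % rel
    return rel
```

For example, f(x) = sum(sin(x)) with analytic gradient cos(x) passes the check easily at eps=1e-4.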

mxnet.test_utils.check_symbolic_forward(sym, location, expected, check_eps=0.0001, aux_states=None, ctx=None)

Compare forward call to expected value.

Parameters:
  • sym (Symbol) – output symbol
  • location (list of np.ndarray or dict of str to np.ndarray) –

    The evaluation point

    • if type is list of np.ndarray
      contain all the numpy arrays corresponding to sym.list_arguments()
    • if type is dict of str to np.ndarray
      contain the mapping between argument names and their values
  • expected (list of np.ndarray or dict of str to np.ndarray) –

    The expected output value

    • if type is list of np.ndarray
      contain arrays corresponding to exe.outputs
    • if type is dict of str to np.ndarray
      contain mapping between sym.list_outputs() and exe.outputs
  • check_eps (float, optional) – relative error to check to
  • aux_states (list of np.ndarray or dict of str to np.ndarray, optional) –
    • if type is list of np.ndarray
      contain all the numpy arrays corresponding to sym.list_auxiliary_states()
    • if type is dict of str to np.ndarray
      contain the mapping between names of auxiliary states and their values
  • ctx (Context, optional) – running context
mxnet.test_utils.check_symbolic_backward(sym, location, out_grads, expected, check_eps=1e-05, aux_states=None, grad_req='write', ctx=None)

Compare backward call to expected value.

Parameters:
  • sym (Symbol) – output symbol
  • location (list of np.ndarray or dict of str to np.ndarray) –

    The evaluation point

    • if type is list of np.ndarray
      contain all the numpy arrays corresponding to mxnet.sym.list_arguments
    • if type is dict of str to np.ndarray
      contain the mapping between argument names and their values
  • out_grads (None or list of np.ndarray or dict of str to np.ndarray) –

    numpy arrays corresponding to sym.outputs for the incoming gradient

    • if type is list of np.ndarray
      contains arrays corresponding to exe.outputs
    • if type is dict of str to np.ndarray
      contains mapping between mxnet.sym.list_outputs() and Executor.outputs
  • expected (list of np.ndarray or dict of str to np.ndarray) –

    expected gradient values

    • if type is list of np.ndarray
      contains arrays corresponding to exe.grad_arrays
    • if type is dict of str to np.ndarray
      contains mapping between sym.list_arguments() and exe.grad_arrays
  • check_eps (float, optional) – relative error to check to
  • aux_states (list of np.ndarray or dict of str to np.ndarray) –
  • grad_req (str or list of str or dict of str to str, optional) – gradient requirements. ‘write’, ‘add’ or ‘null’
  • ctx (Context, optional) – running context
mxnet.test_utils.check_speed(sym, location=None, ctx=None, N=20, grad_req=None, typ='whole', **kwargs)

Check the running speed of a symbol.

Parameters:
  • sym (Symbol) – symbol to run the speed test
  • location (None or dict of str to np.ndarray) – location to evaluate the inner executor
  • ctx (Context) – running context
  • N (int, optional) – repeat times
  • grad_req (None or str or list of str or dict of str to str, optional) – gradient requirements
  • typ (str, optional) –

    “whole” or “forward”

    • “whole”
      test the forward_backward speed
    • “forward”
      only test the forward speed
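The timing pattern behind a speed check like this (warm up once, then average N repeated runs of the chosen forward or forward-backward step) can be sketched generically; check_speed_sketch and run_once are illustrative names, not the library API:

```python
import time

def check_speed_sketch(run_once, N=20):
    """Average wall-clock time of N calls to run_once(), after one warm-up."""
    run_once()                        # warm-up run, excluded from timing
    tic = time.perf_counter()
    for _ in range(N):
        run_once()
    toc = time.perf_counter()
    return (toc - tic) / N            # average seconds per run
```

The warm-up run matters in practice: it absorbs one-time costs (memory allocation, kernel compilation, caches) that would otherwise skew the average.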
mxnet.test_utils.check_consistency(sym, ctx_list, scale=1.0, grad_req='write', arg_params=None, aux_params=None, tol=None, raise_on_err=True, ground_truth=None)

Check that a symbol gives the same output for different running contexts.

Parameters:
  • sym (Symbol or list of Symbols) – symbol(s) to run the consistency test
  • ctx_list (list) – running context. See example for more detail.
  • scale (float, optional) – Standard deviation of the inner normal distribution, used in initialization.
  • grad_req (str or list of str or dict of str to str) – gradient requirement.

Examples

>>> # create the symbol
>>> sym = mx.sym.Convolution(num_filter=3, kernel=(3,3), name='conv')
>>> # initialize the running context
>>> ctx_list = [{'ctx': mx.gpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float64}},
...             {'ctx': mx.gpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float32}},
...             {'ctx': mx.gpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float16}},
...             {'ctx': mx.cpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float64}},
...             {'ctx': mx.cpu(0), 'conv_data': (2, 2, 10, 10), 'type_dict': {'conv_data': np.float32}}]
>>> check_consistency(sym, ctx_list)
>>> sym = mx.sym.Concat(name='concat', num_args=2)
>>> ctx_list = [{'ctx': mx.gpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10), 'type_dict': {'concat_arg0': np.float64, 'concat_arg1': np.float64}},
...             {'ctx': mx.gpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10), 'type_dict': {'concat_arg0': np.float32, 'concat_arg1': np.float32}},
...             {'ctx': mx.gpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10), 'type_dict': {'concat_arg0': np.float16, 'concat_arg1': np.float16}},
...             {'ctx': mx.cpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10), 'type_dict': {'concat_arg0': np.float64, 'concat_arg1': np.float64}},
...             {'ctx': mx.cpu(0), 'concat_arg1': (2, 10), 'concat_arg0': (2, 10), 'type_dict': {'concat_arg0': np.float32, 'concat_arg1': np.float32}}]
>>> check_consistency(sym, ctx_list)

Next Steps