gluon.model_zoo.vision
Module for pre-defined neural network models.
This module contains definitions for the following model architectures:
- AlexNet
- DenseNet
- Inception V3
- ResNet V1
- ResNet V2
- SqueezeNet
- VGG
- MobileNet
- MobileNetV2
You can construct a model with random weights by calling its constructor:
from mxnet.gluon.model_zoo import vision
resnet18 = vision.resnet18_v1()
alexnet = vision.alexnet()
squeezenet = vision.squeezenet1_0()
densenet = vision.densenet_161()
We provide pre-trained models for all the listed models.
These models can be constructed by passing pretrained=True:
from mxnet.gluon.model_zoo import vision
resnet18 = vision.resnet18_v1(pretrained=True)
alexnet = vision.alexnet(pretrained=True)
All pre-trained models expect input images normalized in the same way,
i.e. mini-batches of 3-channel RGB images of shape (N x 3 x H x W),
where N is the batch size, and H and W are expected to be at least 224.
The images have to be loaded into a range of [0, 1] and then normalized
using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
The transformation should preferably happen at preprocessing. You can use
mx.image.color_normalize for such a transformation:
import mxnet as mx

# Scale 8-bit pixel values to [0, 1], then normalize each RGB channel.
image = image / 255
normalized = mx.image.color_normalize(image,
                                      mean=mx.np.array([0.485, 0.456, 0.406]),
                                      std=mx.np.array([0.229, 0.224, 0.225]))
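To make the arithmetic concrete, the sketch below applies the same scaling and normalization to a single RGB pixel in plain Python. The helper name normalize_pixel is hypothetical and is not part of MXNet; only the mean and std values come from the text above.

```python
# Per-channel statistics quoted in the text above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb_uint8):
    """Scale an 8-bit RGB pixel to [0, 1], then subtract the mean and
    divide by the std, channel by channel.

    Hypothetical helper for illustration; not an MXNet function.
    """
    scaled = [value / 255 for value in rgb_uint8]
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]


# A pure white and a pure black pixel mark the extremes of the output range.
print(normalize_pixel((255, 255, 255)))
print(normalize_pixel((0, 0, 0)))
```

After normalization the values are no longer confined to [0, 1]; each channel is centered and scaled by the statistics listed above.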
get_model(name, **kwargs)[source]
Returns a pre-defined model by name.
- Parameters
name (str) – Name of the model.
pretrained (bool) – Whether to load the pretrained weights for model.
classes (int) – Number of classes for the output layer.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
- Returns
The model.
- Return type
ResNet
ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.
ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 18, 34 layers.
BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 18, 34 layers.
Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 50, 101, 152 layers.
Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 50, 101, 152 layers.
ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
VGG
VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
AlexNet
AlexNet model from the “One weird trick…” paper.
AlexNet model from the “One weird trick…” paper.
DenseNet
Densenet-BC 121-layer model from the “Densely Connected Convolutional Networks” paper.
Densenet-BC 161-layer model from the “Densely Connected Convolutional Networks” paper.
Densenet-BC 169-layer model from the “Densely Connected Convolutional Networks” paper.
Densenet-BC 201-layer model from the “Densely Connected Convolutional Networks” paper.
Densenet-BC model from the “Densely Connected Convolutional Networks” paper.
SqueezeNet
SqueezeNet 1.0 model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper.
SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
SqueezeNet model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
Inception
Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
MobileNet
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.
MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
API Reference
class AlexNet(classes=1000, **kwargs)[source]
Bases: mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
AlexNet model from the “One weird trick…” paper.
- Parameters
classes (int, default 1000) – Number of classes for the output layer.
apply(fn)
Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
collect_params(select=None)
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ by using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
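The two select patterns above can be tried out with Python's re module alone. This is a minimal sketch that assumes each pattern is matched from the start of the parameter name (as re.match does); the exact matching rule used by collect_params is not stated here, and both select_names and the parameter names are hypothetical.

```python
import re


def select_names(names, select):
    """Return the names matched by the select pattern.

    Assumption: matching starts at the beginning of each name, like re.match.
    Hypothetical helper for illustration; not an MXNet function.
    """
    pattern = re.compile(select)
    return [name for name in names if pattern.match(name)]


# Made-up parameter names for a small network.
names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]

explicit = select_names(names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_names(names, ".*weight|.*bias")
print(explicit)
print(by_suffix)
```

For this set of names both patterns select the same four parameters and skip bn1.gamma.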
export(path, epoch=0, remove_amp_cast=True)
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
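The naming scheme described for the two exported files can be written out as a small helper. This sketch only builds the strings and does not call MXNet; export_filenames is a hypothetical name, and zero-padding the epoch is an assumption drawn from the "4-digit epoch number" wording.

```python
def export_filenames(path, epoch=0):
    """Build the two filenames described above for a given path and epoch.

    Hypothetical helper for illustration; it does not write any files.
    """
    symbol_filename = f"{path}-symbol.json"
    # Assumption: the epoch is zero-padded to four digits.
    params_filename = f"{path}-{epoch:04d}.params"
    return symbol_filename, params_filename


print(export_filenames("resnet18", epoch=7))
# ('resnet18-symbol.json', 'resnet18-0007.params')
```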
hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
infer_shape(*args)
Infers shape of Parameters from inputs.
infer_type(*args)
Infers data type of Parameters from inputs.
initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)
Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
load(prefix)
Load a model saved using the save API.
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID assigned in order (since it is an OrderedDict), and the unique ID is used to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')
Load parameters from dict.
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
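The interplay of allow_missing and ignore_extra can be pictured as a name comparison between what the model expects and what the dictionary provides. The sketch below is a toy model of that description, not MXNet's implementation: check_param_names, the parameter names, and the choice of KeyError are all assumptions made for illustration.

```python
def check_param_names(expected, provided, allow_missing=False, ignore_extra=False):
    """Compare parameter names the way the two flags are described above.

    Hypothetical helper for illustration; not an MXNet function, and the
    exception type is chosen only for this sketch.
    """
    missing = sorted(set(expected) - set(provided))
    extra = sorted(set(provided) - set(expected))
    if missing and not allow_missing:
        raise KeyError(f"missing parameters: {missing}")
    if extra and not ignore_extra:
        raise KeyError(f"unexpected parameters: {extra}")
    return missing, extra


model_names = ["fc.weight", "fc.bias"]
saved = {"fc.weight": [0.1, 0.2], "aux.scale": [1.0]}

# With the strict defaults the mismatch is rejected.
try:
    check_param_names(model_names, saved)
    strict_ok = True
except KeyError:
    strict_ok = False

# With both flags set the mismatch is tolerated and reported.
missing, extra = check_param_names(
    model_names, saved, allow_missing=True, ignore_extra=True
)
print(strict_ok, missing, extra)
```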
load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')
Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
References
optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
property params
Returns this Block’s parameter dictionary (does not include its children’s parameters).
register_forward_hook(hook)
Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
register_forward_pre_hook(hook)
Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
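The call order of the two hook kinds can be illustrated with a toy class in plain Python. TinyBlock below is a hypothetical stand-in and not an MXNet class: it passes a single input value rather than an argument tuple, and it returns a plain removal callable instead of a HookHandle.

```python
class TinyBlock:
    """Toy stand-in for a block with pre- and post-forward hooks."""

    def __init__(self):
        self._pre_hooks = {}
        self._post_hooks = {}
        self._next_id = 0

    def _register(self, registry, hook):
        hook_id = self._next_id
        self._next_id += 1
        registry[hook_id] = hook
        # Return a callable that removes the hook again.
        return lambda: registry.pop(hook_id, None)

    def register_forward_pre_hook(self, hook):
        return self._register(self._pre_hooks, hook)

    def register_forward_hook(self, hook):
        return self._register(self._post_hooks, hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        # Pre-hooks see (block, input) immediately before forward().
        for hook in list(self._pre_hooks.values()):
            hook(self, x)
        out = self.forward(x)
        # Forward hooks see (block, input, output) immediately after forward().
        for hook in list(self._post_hooks.values()):
            hook(self, x, out)
        return out


events = []
block = TinyBlock()
block.register_forward_pre_hook(lambda blk, inp: events.append(("pre", inp)))
remove = block.register_forward_hook(
    lambda blk, inp, out: events.append(("post", inp, out))
)

block(3)   # both hooks fire
remove()   # detach the forward hook only
block(4)   # only the pre-hook fires
print(events)
```

Both hooks only observe values here, in line with the note that a hook should not modify the input or output.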
register_op_hook(callback, monitor_all=False)
Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
reset_ctx(ctx)
This function has been deprecated. Please refer to HybridBlock.reset_device.
reset_device(device)
Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
save(prefix)
Save the model architecture and parameters to load again later.
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it is an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
save_parameters(filename, deduplicate=False)
Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
References
setattr(name, value)
Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
share_parameters(shared)
Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
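The difference between tying a Parameter object and copying its value can be shown with plain Python objects. This is a toy sketch: the Param class and the dictionaries are hypothetical stand-ins for Gluon Parameters and parameter dicts, chosen only to show object sharing versus value copying.

```python
class Param:
    """Toy stand-in for a parameter holding a single value."""

    def __init__(self, value):
        self.value = value


dense0 = {"weight": Param(1.0)}

# Tied: the same Param object is referenced by both models,
# which is how share_parameters is described above.
tied = {"weight": dense0["weight"]}

# Copied: a new Param receives the current value only,
# which is how load_parameters and load_dict are described above.
copied = {"weight": Param(dense0["weight"].value)}

# A later update to dense0's parameter is visible through the tied
# reference but not through the copy.
dense0["weight"].value = 5.0
print(tied["weight"].value, copied["weight"].value)
```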
summary(*inputs)
Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
zero_grad()
Sets all Parameters’ gradient buffer to 0.
class BasicBlockV1(channels, stride, downsample=False, in_channels=0, **kwargs)[source]
Bases: mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
BasicBlock V1 from “Deep Residual Learning for Image Recognition” paper. This is used for ResNet V1 for 18, 34 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
apply(fn)
Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
collect_params(select=None)
Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ by using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
export(path, epoch=0, remove_amp_cast=True)
Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
hybridize(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)
Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
infer_shape(*args)
Infers shape of Parameters from inputs.
infer_type(*args)
Infers data type of Parameters from inputs.
initialize(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)
Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
load(prefix)
Load a model saved using the save API.
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID assigned in order (since it is an OrderedDict), and the unique ID is used to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
load_dict(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')
Load parameters from dict.
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
load_parameters(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')
Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
References
optimize_for(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)
Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
property params
Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
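The calling order of the two hook types can be sketched with a toy stand-in. ToyBlock and ToyHandle below are hypothetical classes written for illustration and are not part of MXNet; they only mimic the documented contract: a pre-hook is called as hook(block, input) before forward(), a forward hook as hook(block, input, output) after it, and registration returns a handle. The detach() method used to remove a hook is an assumption of this sketch.

```python
class ToyHandle:
    """Hypothetical handle that can remove a previously registered hook."""

    def __init__(self, hooks, hook):
        self._hooks, self._hook = hooks, hook

    def detach(self):
        if self._hook in self._hooks:
            self._hooks.remove(self._hook)


class ToyBlock:
    """Hypothetical block that doubles its input and runs registered hooks."""

    def __init__(self):
        self._pre_hooks, self._hooks = [], []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)
        return ToyHandle(self._pre_hooks, hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)
        return ToyHandle(self._hooks, hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        for hook in self._pre_hooks:   # run before forward()
            hook(self, x)
        out = self.forward(x)
        for hook in self._hooks:       # run after forward()
            hook(self, x, out)
        return out


events = []
block = ToyBlock()
block.register_forward_pre_hook(lambda blk, inp: events.append(("pre", inp)))
handle = block.register_forward_hook(
    lambda blk, inp, out: events.append(("post", inp, out)))

result = block(3)   # both hooks fire
handle.detach()     # remove the forward hook only
block(5)            # only the pre-hook fires now
```

Both hooks only observe values here, in line with the advice that hooks should not modify what they are given.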
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
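The effect of deduplicate can be pictured with plain dictionaries. This is a minimal sketch and not the MXNet implementation: params_to_save is a hypothetical helper, lists stand in for parameters, and keeping the first name under which a shared parameter appears is an assumption (the library may keep a different name).

```python
def params_to_save(named_params, deduplicate=False):
    """Return the name -> parameter mapping that would be written out."""
    if not deduplicate:
        # Every name is stored, even if several names point to one object.
        return dict(named_params)
    saved, seen = {}, set()
    for name, param in named_params.items():
        if id(param) in seen:
            continue  # this object was already stored under another name
        seen.add(id(param))
        saved[name] = param
    return saved


shared_weight = [0.1, 0.2]  # one object shared by two sub-blocks
named_params = {
    "encoder.weight": shared_weight,
    "decoder.weight": shared_weight,
    "decoder.bias": [0.0],
}

full = params_to_save(named_params)
compact = params_to_save(named_params, deduplicate=True)
```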
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
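The difference between tying a Parameter object and copying its value can be sketched without MXNet. ToyParameter and ToyDense are hypothetical stand-ins; their share_parameters and load_dict methods only imitate the behaviour described above and are not the library's code.

```python
class ToyParameter:
    def __init__(self, value):
        self.value = value


class ToyDense:
    def __init__(self):
        self.params = {"weight": ToyParameter(0.0), "bias": ToyParameter(0.0)}

    def share_parameters(self, shared):
        # Rebind names to the very same Parameter objects (tying).
        for name, param in shared.items():
            if name in self.params:
                self.params[name] = param
        return self

    def load_dict(self, values):
        # Overwrite stored values only; object identity is unchanged.
        for name, value in values.items():
            self.params[name].value = value


dense0, dense1, dense2 = ToyDense(), ToyDense(), ToyDense()
dense1.share_parameters(dense0.params)           # dense1 is tied to dense0
dense2.load_dict({"weight": 0.0, "bias": 0.0})   # values copied, nothing tied

# Loading into dense0 afterwards is visible through dense1, but not dense2.
dense0.load_dict({"weight": 1.5, "bias": -0.5})
```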
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
BasicBlockV2
(channels, stride, downsample=False, in_channels=0, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only those Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API
load_dict(param_dict[, device, …]) – Load parameters from dict
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
BasicBlock V2 from “Identity Mappings in Deep Residual Networks” paper. Used for ResNet V2 with 18 or 34 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only those Parameters whose names match the given regular expressions.
For example, collect the specified parameters in ['conv1.weight', 'conv1.bias', 'fc.weight', 'fc.bias']:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with 'weight' or 'bias', using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
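How the two select patterns above pick names can be checked with Python's re module alone. This is a sketch under an assumption: that the pattern is matched from the start of each parameter name, as re.match does. The helper select_names and the parameter names are hypothetical.

```python
import re


def select_names(names, select=None):
    """Keep the names matched by the regular expression `select`."""
    if select is None:
        return list(names)
    pattern = re.compile(select)
    # re.match anchors at the start of the string (assumed behaviour).
    return [name for name in names if pattern.match(name)]


names = ["conv1.weight", "conv1.bias", "bn1.gamma",
         "conv2.weight", "fc.weight", "fc.bias"]

explicit = select_names(names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_names(names, ".*weight|.*bias")
```

Note that an unescaped "." in these patterns matches any character, not only a literal dot, so a pattern such as 'conv1.weight' is slightly broader than the exact name.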
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
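The file-naming rule described for path and epoch can be written out as a small helper. export_filenames is a hypothetical function that only reproduces the documented pattern (path-symbol.json and path-xxxx.params with a zero-padded 4-digit epoch); it does not call MXNet or write any file.

```python
def export_filenames(path, epoch=0):
    """Build the two file names that the naming rule above describes."""
    symbol_filename = f"{path}-symbol.json"
    params_filename = f"{path}-{epoch:04d}.params"  # epoch padded to 4 digits
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("resnet18", epoch=7)
default_files = export_filenames("resnet18")
```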
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID assigned in traversal order (children are stored in an OrderedDict), and that unique ID denotes the specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
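The roles of allow_missing and ignore_extra can be illustrated with a key comparison on plain dictionaries. This is a sketch of the idea only: check_keys is a hypothetical helper, and raising ValueError is an assumption made for the example, not necessarily how MXNet reports the mismatch.

```python
def check_keys(model_names, param_dict, allow_missing=False, ignore_extra=False):
    """Compare the names a model expects with the names being loaded."""
    missing = sorted(set(model_names) - set(param_dict))
    extra = sorted(set(param_dict) - set(model_names))
    if missing and not allow_missing:
        raise ValueError(f"missing parameters: {missing}")
    if extra and not ignore_extra:
        raise ValueError(f"unexpected parameters: {extra}")
    return missing, extra


model_names = ["fc.weight", "fc.bias"]
saved = {"fc.weight": [1.0], "head.weight": [2.0]}

# Lenient loading: both kinds of mismatch are tolerated and reported back.
missing, extra = check_keys(model_names, saved,
                            allow_missing=True, ignore_extra=True)

# Strict loading (the defaults): the first mismatch raises.
try:
    check_keys(model_names, saved)
    strict_error = None
except ValueError as err:
    strict_error = str(err)
```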
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
BottleneckV1
(channels, stride, downsample=False, in_channels=0, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only those Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API
load_dict(param_dict[, device, …]) – Load parameters from dict
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
Bottleneck V1 from “Deep Residual Learning for Image Recognition” paper. Used for ResNet V1 with 50, 101, or 152 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only those Parameters whose names match the given regular expressions.
For example, collect the specified parameters in ['conv1.weight', 'conv1.bias', 'fc.weight', 'fc.bias']:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with 'weight' or 'bias', using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID assigned in traversal order (children are stored in an OrderedDict), and that unique ID denotes the specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
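The difference between tying a Parameter object and copying a value can be sketched without MXNet. `Param` and `Layer` below are hypothetical minimal classes, not the library's types; they only model the distinction drawn in the note above.

```python
class Param:
    """Hypothetical minimal parameter holding a single value."""

    def __init__(self, value):
        self.value = value


class Layer:
    """Hypothetical layer with one parameter named 'weight'."""

    def __init__(self, value):
        self.weight = Param(value)

    def share_parameters(self, shared):
        # Tie: reuse the very same Param object.
        self.weight = shared["weight"]
        return self

    def load_dict(self, values):
        # Copy: only the stored value changes, the Param object stays.
        self.weight.value = values["weight"]


dense0 = Layer(1.0)
dense1 = Layer(2.0)
dense1.share_parameters({"weight": dense0.weight})
tied = dense1.weight is dense0.weight

# Loading into dense0 after sharing is visible through dense1 as well.
dense0.load_dict({"weight": 5.0})
value_seen_by_dense1 = dense1.weight.value
```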
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
BottleneckV2
(channels, stride, downsample=False, in_channels=0, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing the Parameters of this Block and all of its children (default); can also return only the Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
Bottleneck V2 from “Identity Mappings in Deep Residual Networks” paper. This is used for ResNet V2 for 50, 101, 152 layers.
- Parameters
channels (int) – Number of output channels.
stride (int) – Stride size.
downsample (bool, default False) – Whether to downsample the input.
in_channels (int, default 0) – Number of input channels. Default is 0, to infer from the graph.
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing the Parameters of this Block and all of its children (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or, using regular expressions, collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
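How a select pattern picks parameter names can be tried out with the standard `re` module alone. The sketch below is not the library's implementation: `select_names` is a hypothetical helper, the parameter names are made up, and it assumes names are matched from their first character (`re.match` semantics).

```python
import re


def select_names(names, select=None):
    """Return the names matched by `select`, or all names if it is None."""
    if select is None:
        return list(names)
    pattern = re.compile(select)
    # Assumption: names are matched from their first character (re.match).
    return [name for name in names if pattern.match(name)]


names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]
explicit = select_names(names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_names(names, ".*weight|.*bias")
```

Both patterns select the same four names here and leave out `bn1.gamma`. Note that an unescaped `.` in a pattern matches any character, not only a literal dot.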
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
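The file naming rule stated for path and epoch can be written out as a small helper. `export_filenames` is a hypothetical function for illustration only; it formats names and does not call export() or write any files.

```python
def export_filenames(path, epoch=0):
    """Build the two file names described for export()."""
    symbol_filename = f"{path}-symbol.json"
    params_filename = f"{path}-{epoch:04d}.params"   # xxxx = 4-digit epoch
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("resnet", epoch=7)
```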
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and a Block’s children are traversed in order (since it’s an OrderedDict), with the unique ID denoting each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
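The roles of allow_missing and ignore_extra can be pictured as two set differences between the names the Block expects and the names found in the file. The helper below is a simplified sketch under that reading: `check_param_names` is hypothetical, and the exception type is an arbitrary choice rather than the library's.

```python
def check_param_names(model_names, file_names, allow_missing=False, ignore_extra=False):
    """Return the names that can be loaded, raising on mismatches unless allowed."""
    model_names, file_names = set(model_names), set(file_names)
    missing = model_names - file_names   # in the Block, absent from the file
    extra = file_names - model_names     # in the file, absent from the Block
    if missing and not allow_missing:
        raise KeyError(f"missing from file: {sorted(missing)}")
    if extra and not ignore_extra:
        raise KeyError(f"not present in Block: {sorted(extra)}")
    return sorted(model_names & file_names)


# Lenient load: both kinds of mismatch are tolerated.
loadable = check_param_names(
    ["fc.weight", "fc.bias"], ["fc.weight", "old.bias"],
    allow_missing=True, ignore_extra=True)

# Strict load (the defaults): a missing parameter is an error.
try:
    check_param_names(["fc.weight", "fc.bias"], ["fc.weight"])
    strict_failed = False
except KeyError:
    strict_failed = True
```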
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
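Why creation order matters for save and load can be seen in a toy version of the nested-dictionary idea. Everything here is a hypothetical illustration: `Node`, `describe`, and the `'<kind>:<id>'` key format are invented for the sketch and do not reflect the actual contents of the saved JSON file.

```python
from collections import OrderedDict


class Node:
    """Hypothetical block: a class-like name plus ordered children."""

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = OrderedDict((str(i), c) for i, c in enumerate(children))


def describe(node, counter=None):
    """Nested dict keyed by '<kind>:<id>', with ids assigned in traversal order."""
    counter = counter if counter is not None else [0]
    key = f"{node.kind}:{counter[0]}"
    counter[0] += 1
    return {key: [describe(child, counter) for child in node.children.values()]}


model_a = Node("Sequential", [Node("Conv2D"), Node("Dense")])
model_b = Node("Sequential", [Node("Dense"), Node("Conv2D")])  # same blocks, other order
layout_a = describe(model_a)
layout_b = describe(model_b)
```

The two layouts differ even though both models contain the same kinds of blocks, which is the situation the determinism warning above is about.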
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
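The effect of setattr, one attribute set to one value on every Parameter, can be mimicked in plain Python. `Param` and `set_on_all` are hypothetical stand-ins; the initial values "write" and 1.0 are assumptions chosen for the sketch.

```python
class Param:
    """Hypothetical parameter with two settable attributes."""

    def __init__(self):
        self.grad_req = "write"
        self.lr_mult = 1.0


def set_on_all(params, name, value):
    """Set attribute `name` to `value` on every parameter in the dict."""
    for param in params.values():
        setattr(param, name, value)


params = {"weight": Param(), "bias": Param()}
set_on_all(params, "grad_req", "null")
set_on_all(params, "lr_mult", 0.5)
```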
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
DenseNet
(num_init_features, growth_rate, block_config, bn_size=4, dropout=0, classes=1000, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing the Parameters of this Block and all of its children (default); can also return only the Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
Densenet-BC model from the “Densely Connected Convolutional Networks” paper.
- Parameters
num_init_features (int) – Number of filters to learn in the first convolution layer.
growth_rate (int) – Number of filters to add each layer (k in the paper).
block_config (list of int) – List of integers for numbers of layers in each pooling block.
bn_size (int, default 4) – Multiplicative factor for the number of bottleneck layers (i.e. bn_size * k features in the bottleneck layer).
dropout (float, default 0) – Rate of dropout after each dense layer.
classes (int, default 1000) – Number of classification classes.
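How num_init_features, growth_rate and block_config interact can be followed with a little arithmetic. The sketch below rests on assumptions not stated in this section: that a transition between dense blocks halves the channel count (as in DenseNet-BC with compression 0.5), and that DenseNet-121 uses 64 initial features, k = 32 and blocks of 6, 12, 24 and 16 layers. `densenet_feature_counts` is a hypothetical helper and builds no network.

```python
def densenet_feature_counts(num_init_features, growth_rate, block_config):
    """Channel count at the end of each dense block (before any transition)."""
    counts = []
    num_features = num_init_features
    for i, num_layers in enumerate(block_config):
        num_features += num_layers * growth_rate   # each layer adds k filters
        counts.append(num_features)
        if i != len(block_config) - 1:
            num_features //= 2                     # assumed transition: halve channels
    return counts


# Assumed DenseNet-121 settings: 64 initial features, k = 32, blocks of 6/12/24/16 layers.
counts_121 = densenet_feature_counts(64, 32, [6, 12, 24, 16])

# With the default bn_size of 4, each bottleneck layer has bn_size * k features.
bottleneck_width = 4 * 32
```

Under these assumptions the last dense block ends with 1024 channels, and each bottleneck layer is 128 features wide.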
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing the Parameters of this Block and all of its children (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or, using regular expressions, collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID, and a Block’s children are traversed in order (since it’s an OrderedDict), with the unique ID denoting each specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
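One way to read cast_dtype and dtype_source is as a rule for which dtype a parameter ends up with after loading. The helper below is a sketch of that reading, not the library's code: `resolve_dtype` is hypothetical, dtypes are plain strings, and the behaviour without casting (a mismatch is an error) is an assumption.

```python
def resolve_dtype(param_dtype, saved_dtype, cast_dtype=False, dtype_source="current"):
    """Return the dtype a parameter would end up with after loading."""
    if dtype_source not in ("current", "saved"):
        raise ValueError("dtype_source must be 'current' or 'saved'")
    if not cast_dtype:
        # Assumption: without casting, the two dtypes have to agree already.
        if param_dtype != saved_dtype:
            raise TypeError(f"dtype mismatch: {param_dtype} vs {saved_dtype}")
        return param_dtype
    # With casting, dtype_source names which side's dtype is kept.
    return param_dtype if dtype_source == "current" else saved_dtype


kept_current = resolve_dtype("float32", "float16", cast_dtype=True)
kept_saved = resolve_dtype("float32", "float16", cast_dtype=True, dtype_source="saved")
```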
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
Inception3
(classes=1000, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the entries whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API
load_dict(param_dict[, device, …]) – Load parameters from dict
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
- Parameters
classes (int, default 1000) – Number of classification classes.
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the entries of that Dict whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or use a regular expression to collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
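The effect of the select pattern can be previewed without building a model. The following is a minimal sketch in plain Python, assuming the pattern is applied to each parameter name with `re.match` (anchored at the start of the name). The helper `select_names` and the list of names are hypothetical and are not part of MXNet.

```python
import re

def select_names(names, select):
    """Keep the names that a collect_params-style `select` pattern matches.

    Assumption: matching is done with re.match, i.e. anchored at the start
    of each name but not at its end.
    """
    pattern = re.compile(select)
    return [name for name in names if pattern.match(name)]

# Hypothetical parameter names, for illustration only.
names = ['conv1.weight', 'conv1.bias', 'bn1.gamma', 'fc.weight', 'fc.bias']

explicit = select_names(names, 'conv1.weight|conv1.bias|fc.weight|fc.bias')
by_suffix = select_names(names, '.*weight|.*bias')
print(explicit)
print(by_suffix)
```

Both patterns keep the four weight and bias entries and drop `bn1.gamma`. Note that an unescaped `.` in the pattern matches any character, so escape it (`conv1\.weight`) if exact matching matters.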
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it is named data. When there is more than one input, the inputs are named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
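The file naming described above can be sketched in plain Python. The helper `export_filenames` is hypothetical and only reproduces the documented pattern (path-symbol.json and path-xxxx.params with a zero-padded 4-digit epoch); it does not call MXNet or write any files.

```python
def export_filenames(path, epoch=0):
    """Return the (symbol, params) filenames that the documented pattern implies."""
    symbol_filename = '%s-symbol.json' % path
    # The epoch is zero-padded to 4 digits, e.g. epoch 3 becomes "0003".
    params_filename = '%s-%04d.params' % (path, epoch)
    return symbol_filename, params_filename

default_names = export_filenames('inception3')
epoch_names = export_filenames('inception3', epoch=12)
print(default_names)
print(epoch_names)
```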
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID assigned in traversal order (children are stored in an OrderedDict), and that unique ID denotes the specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in param_dict.
ignore_extra (bool, default False) – Whether to silently ignore parameters from param_dict that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
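The calling order of the two hook kinds can be illustrated with a plain-Python stand-in. `ToyBlock` below is hypothetical and is not MXNet's implementation: it only mimics the documented hook forms, hook(block, input) before forward() and hook(block, input, output) after it, and it omits the HookHandle that the real methods return. Passing the inputs to the hooks as a tuple is an assumption of this sketch.

```python
class ToyBlock:
    """Hypothetical stand-in that mimics the documented hook call order."""

    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, *args):
        for hook in self._pre_hooks:
            hook(self, args)          # called immediately before forward()
        out = self.forward(*args)
        for hook in self._hooks:
            hook(self, args, out)     # called immediately after forward()
        return out

events = []
block = ToyBlock()
block.register_forward_pre_hook(lambda blk, inp: events.append(('pre', inp)))
block.register_forward_hook(lambda blk, inp, out: events.append(('post', inp, out)))
result = block(21)
print(events)
```

The hooks here only record what they see, in line with the note that a hook should not modify the input or output.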
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
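The difference between sharing a Parameter object and copying its value can be shown with a toy model in plain Python. `ToyParam` and the dictionaries below are hypothetical stand-ins, not MXNet classes; they only illustrate why a value loaded after sharing shows up in every network that holds the tied object.

```python
class ToyParam:
    """Hypothetical stand-in for a Parameter: a mutable holder of one value."""

    def __init__(self, value):
        self.value = value

weight = ToyParam(1.0)

net_a = {'weight': weight}                  # owns the parameter
net_b = {'weight': weight}                  # shares (ties) the same object
net_c = {'weight': ToyParam(weight.value)}  # only copied the value

# Setting the value through net_a plays the role of a later load.
net_a['weight'].value = 5.0

print(net_b['weight'].value)  # follows net_a, same object
print(net_c['weight'].value)  # unchanged, independent object
```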
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
MobileNet
(multiplier=1.0, classes=1000, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the entries whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API
load_dict(param_dict[, device, …]) – Load parameters from dict
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.
- Parameters
multiplier (float, default 1.0) – The width multiplier for controlling the model size. Only multipliers of at least 0.25 are supported. The actual number of channels equals the original channel count multiplied by this multiplier.
classes (int, default 1000) – Number of classes for the output layer.
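The effect of the width multiplier on channel counts can be worked through with a small sketch. The helper `scale_channels` is hypothetical, the base widths are illustrative values, and truncating with `int()` is an assumption about rounding rather than a statement of how the library computes it.

```python
def scale_channels(base_channels, multiplier=1.0):
    """Scale each base channel count by the width multiplier.

    Assumption: results are truncated to integers with int().
    """
    if multiplier < 0.25:
        raise ValueError('only multipliers of at least 0.25 are supported')
    return [int(channels * multiplier) for channels in base_channels]

# Illustrative base widths; not read from the library.
base = [32, 64, 128, 256, 512, 1024]

half = scale_channels(base, 0.5)
quarter = scale_channels(base, 0.25)
print(half)
print(quarter)
```

With a multiplier of 0.5 every layer has half as many channels, so the final 1024-channel stage shrinks to 512.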
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the entries of that Dict whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or use a regular expression to collect all parameters whose names end with ‘weight’ or ‘bias’:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it is named data. When there is more than one input, the inputs are named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID assigned in traversal order (children are stored in an OrderedDict), and that unique ID denotes the specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in param_dict.
ignore_extra (bool, default False) – Whether to silently ignore parameters from param_dict that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
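The roles of allow_missing and ignore_extra can be sketched as a name comparison between the Block and the file. The helper `check_parameter_names` and the parameter names are hypothetical, and raising `ValueError` is a choice made for this sketch; the library's actual error type and messages may differ.

```python
def check_parameter_names(block_names, loaded_names,
                          allow_missing=False, ignore_extra=False):
    """Compare the Block's parameter names with those found in a file.

    Returns (missing, extra): names the Block has but the file lacks, and
    names the file has but the Block lacks.
    """
    missing = sorted(set(block_names) - set(loaded_names))
    extra = sorted(set(loaded_names) - set(block_names))
    if missing and not allow_missing:
        raise ValueError('missing from file: %s' % missing)
    if extra and not ignore_extra:
        raise ValueError('not present in Block: %s' % extra)
    return missing, extra

# Hypothetical names, e.g. a checkpoint saved before the head was replaced.
block_names = ['conv.weight', 'conv.bias', 'fc.weight']
loaded_names = ['conv.weight', 'conv.bias', 'old_head.weight']

report = check_parameter_names(block_names, loaded_names,
                               allow_missing=True, ignore_extra=True)
print(report)
```

With both flags left at False, the same call raises, which is the safer default when the file is expected to match the Block exactly.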
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
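The two files implied by a prefix can be listed ahead of time, for example to check that a checkpoint is complete before loading it. The helper `model_filenames` is hypothetical and only reproduces the documented naming (<prefix>-model.json and <prefix>-model.params); it does not call MXNet.

```python
def model_filenames(prefix):
    """Return the (architecture, parameters) filenames that save(prefix) documents."""
    return '%s-model.json' % prefix, '%s-model.params' % prefix

json_file, params_file = model_filenames('checkpoints/mobilenet')
print(json_file)
print(params_file)
```

The same prefix is then passed to load, which expects both files to be present.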
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
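The difference between tying a Parameter object and merely loading values, described in the note above, can be illustrated with a toy model. The classes ToyParameter and ToyBlock below are hypothetical stand-ins written in plain Python; they are not MXNet classes and only mimic the sharing behaviour.

```python
class ToyParameter:
    """Stand-in for a Parameter: a holder for a value (not MXNet's class)."""
    def __init__(self, value):
        self.value = value


class ToyBlock:
    """Stand-in for a Block that owns a dict of parameters."""
    def __init__(self, weight):
        self.params = {"weight": ToyParameter(weight)}

    def share_parameters(self, shared):
        # Tie: reuse the very same parameter objects.
        for name, param in shared.items():
            if name in self.params:
                self.params[name] = param
        return self

    def load_dict(self, values):
        # Copy: only overwrite the values stored in the existing objects.
        for name, value in values.items():
            self.params[name].value = value


a, b = ToyBlock(1.0), ToyBlock(2.0)
b.share_parameters(a.params)
assert b.params["weight"] is a.params["weight"]  # same object, tied

b.load_dict({"weight": 5.0})
print(a.params["weight"].value)  # 5.0, the load is visible through both blocks
```

Because both blocks hold the same object after sharing, a later load through either block changes what the other one sees.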
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
MobileNetV2
(multiplier=1.0, classes=1000, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
multiplier (float, default 1.0) – The width multiplier for controlling the model size. The actual number of channels is equal to the original channel size multiplied by this multiplier.
classes (int, default 1000) – Number of classes for the output layer.
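To make the effect of the width multiplier concrete, the following sketch scales a list of channel counts. It is plain Python, not the library's code: the function name scale_channels and the base channel list are illustrative, and truncating with int() is an assumption about how fractional results are rounded.

```python
def scale_channels(base_channels, multiplier=1.0):
    """Scale each base channel count by the width multiplier.

    Assumption: the result is truncated to an integer with int().
    """
    return [int(c * multiplier) for c in base_channels]


base = [32, 16, 24, 32, 64]       # illustrative values only
print(scale_channels(base, 1.0))  # [32, 16, 24, 32, 64]
print(scale_channels(base, 0.5))  # [16, 8, 12, 16, 32]
```

A multiplier below 1.0 shrinks every layer's width, which reduces both parameter count and computation.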
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
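How a select pattern narrows down a set of parameter names can be tried out without a model. The sketch below uses only Python's re module; the helper select_names is hypothetical, and matching with re.match (anchored at the start of each name) is an assumption about the matching rule.

```python
import re


def select_names(names, select=None):
    """Return the names kept by a `select` pattern (all names if None).

    Assumption: the pattern is applied with re.match, i.e. anchored at
    the start of each name.
    """
    if select is None:
        return list(names)
    pattern = re.compile(select)
    return [n for n in names if pattern.match(n)]


names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]
print(select_names(names, "conv1.weight|fc.bias"))
# ['conv1.weight', 'fc.bias']
print(select_names(names, ".*weight|.*bias"))
# ['conv1.weight', 'conv1.bias', 'fc.weight', 'fc.bias']
```

Note that an unescaped dot in the pattern matches any character, so write conv1\.weight if a literal dot is required.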
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
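The filename pattern described above can be sketched in a few lines. The helper export_filenames is hypothetical and only mirrors the documented pattern, with the epoch zero-padded to four digits; it does not call MXNet.

```python
def export_filenames(path, epoch=0):
    """Return (symbol_filename, params_filename) for a given path and epoch.

    Hypothetical helper mirroring the documented pattern: the epoch is
    zero-padded to four digits.
    """
    return f"{path}-symbol.json", f"{path}-{epoch:04d}.params"


print(export_filenames("resnet18"))      # ('resnet18-symbol.json', 'resnet18-0000.params')
print(export_filenames("resnet18", 12))  # ('resnet18-symbol.json', 'resnet18-0012.params')
```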
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID; children are traversed in order (since it’s an OrderedDict), and the unique ID denotes that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in param_dict.
ignore_extra (bool, default False) – Whether to silently ignore parameters from param_dict that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
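The interplay of allow_missing and ignore_extra can be illustrated with a toy name check. This is a sketch in plain Python, not MXNet's implementation: the function check_param_names is hypothetical, and raising KeyError is an arbitrary choice for the example.

```python
def check_param_names(model_names, saved_names, allow_missing=False, ignore_extra=False):
    """Return the names that would be loaded, raising on mismatches.

    Toy model of the two flags, not MXNet's implementation.
    """
    model_names, saved_names = set(model_names), set(saved_names)
    missing = model_names - saved_names  # in the model, absent from the file
    extra = saved_names - model_names    # in the file, absent from the model
    if missing and not allow_missing:
        raise KeyError(f"missing parameters: {sorted(missing)}")
    if extra and not ignore_extra:
        raise KeyError(f"unexpected parameters: {sorted(extra)}")
    return sorted(model_names & saved_names)


model = ["conv.weight", "fc.weight", "fc.bias"]
saved = ["conv.weight", "fc.weight", "old.weight"]
print(check_param_names(model, saved, allow_missing=True, ignore_extra=True))
# ['conv.weight', 'fc.weight']
```

With both flags left at False, the same call fails because fc.bias is missing from the file and old.weight is not part of the model.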
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
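The calling order of pre-hooks and forward hooks can be seen in a toy model. ToyBlock below is a hypothetical stand-in written in plain Python, not MXNet's Block: passing the inputs as a tuple is an assumption, and the toy does not return a HookHandle.

```python
class ToyBlock:
    """Minimal stand-in for a Block with forward hooks (not MXNet's class)."""
    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        for hook in self._pre_hooks:
            hook(self, (x,))       # called immediately before forward()
        out = self.forward(x)
        for hook in self._hooks:
            hook(self, (x,), out)  # called immediately after forward()
        return out


events = []
block = ToyBlock()
block.register_forward_pre_hook(lambda blk, inputs: events.append(("pre", inputs)))
block.register_forward_hook(lambda blk, inputs, output: events.append(("post", inputs, output)))
result = block(3)
print(result)  # 6
print(events)  # [('pre', (3,)), ('post', (3,), 6)]
```

Both hooks only observe values here, in line with the rule that a hook should not modify the input or output.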
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
ResNetV1
(block, layers, channels, classes=1000, thumbnail=False, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match the given regular expressions.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
ResNet V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
block (gluon.HybridBlock) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
layers (list of int) – Numbers of layers in each block
channels (list of int) – Numbers of channels in each block. Length should be one larger than layers list.
classes (int, default 1000) – Number of classification classes.
thumbnail (bool, default False) – Enable thumbnail.
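The length relationship between layers and channels can be checked before constructing a model. The sketch below is plain Python with a hypothetical helper, check_resnet_config. Reading the first channel entry as the width entering the first stage, and pairing each stage with consecutive channel entries, is an assumption made for illustration.

```python
def check_resnet_config(layers, channels):
    """Validate that `channels` has exactly one more entry than `layers`."""
    if len(channels) != len(layers) + 1:
        raise ValueError(
            f"expected {len(layers) + 1} channel entries, got {len(channels)}"
        )
    # Pair each stage with its (input, output) channel counts.
    return [
        (num_layers, channels[i], channels[i + 1])
        for i, num_layers in enumerate(layers)
    ]


# An 18-layer-style configuration with four stages of two blocks each.
stages = check_resnet_config([2, 2, 2, 2], [64, 64, 128, 256, 512])
print(stages)
# [(2, 64, 64), (2, 64, 128), (2, 128, 256), (2, 256, 512)]
```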
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using regular expressions:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by its class name and a unique ID; children are traversed in order (since it’s an OrderedDict), and the unique ID denotes that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in param_dict.
ignore_extra (bool, default False) – Whether to silently ignore parameters from param_dict that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be in {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize; afterwards, export can be called or inference can be run. See example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
ResNetV2
(block, layers, channels, classes=1000, thumbnail=False, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match a given regular expression.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
block (gluon.HybridBlock) – Class for the residual block. Options are BasicBlockV2, BottleneckV2.
layers (list of int) – Number of layers in each block.
channels (list of int) – Number of channels in each block. The list should be one element longer than the layers list.
classes (int, default 1000) – Number of classification classes.
thumbnail (bool, default False) – Enable thumbnail.
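The length rule for layers and channels can be checked before a network is built. The sketch below is plain Python and does not call MXNet. The helper name check_resnet_spec is hypothetical, the example values are assumed to be those of an 18-layer configuration, and the reading of channels[0] as the width of the initial stem is an assumption of this sketch.

```python
def check_resnet_spec(layers, channels):
    """Validate a (layers, channels) pair of the form described above.

    Hypothetical helper for illustration only; not part of MXNet.
    """
    if len(channels) != len(layers) + 1:
        raise ValueError(
            f"channels must have {len(layers) + 1} entries, got {len(channels)}"
        )
    # Assumption: channels[0] is the stem width and channels[i + 1]
    # is the width of stage i.
    return list(zip(layers, channels[1:]))


stages = check_resnet_spec([2, 2, 2, 2], [64, 64, 128, 256, 512])
print(stages)
```

Each tuple pairs a stage's layer count with its channel width, which makes a mismatched specification easy to spot.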
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
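To see which names the two patterns above would keep, the selection can be mimicked with the standard re module. This is a sketch, not MXNet's implementation: the helper select_names and the sample parameter names are hypothetical, and it assumes the pattern is applied with re.match, that is, anchored at the start of each name.

```python
import re


def select_names(names, select=None):
    """Return the names a `select` pattern would keep (illustrative helper)."""
    if select is None:
        return list(names)
    pattern = re.compile(select)
    # Assumption: matching is anchored at the start of the name only.
    return [name for name in names if pattern.match(name)]


names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]
explicit = select_names(names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
by_suffix = select_names(names, ".*weight|.*bias")
print(explicit)
print(by_suffix)
```

With these sample names both patterns keep the same four entries and drop bn1.gamma.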
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
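The naming scheme for the two exported files can be reproduced with ordinary string formatting. The helper export_filenames below is hypothetical and only builds the names; it does not write any file or call MXNet.

```python
def export_filenames(path, epoch=0):
    """Build the two filenames described above (illustrative helper)."""
    symbol_filename = f"{path}-symbol.json"
    # The epoch is zero-padded to four digits.
    params_filename = f"{path}-{epoch:04d}.params"
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("resnet", epoch=7)
print(symbol_file, params_file)
```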
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved, because each Block is uniquely identified by its class name and a unique ID assigned in order (children are stored in an OrderedDict), and that unique ID denotes the specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the dict.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the dict that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
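The interplay of allow_missing and ignore_extra can be pictured as a comparison of two sets of names. The following is a plain-Python sketch under stated assumptions: the helper reconcile is hypothetical, and raising ValueError is a choice of this sketch rather than a statement about the exception MXNet raises.

```python
def reconcile(block_names, saved_names, allow_missing=False, ignore_extra=False):
    """Return the names that would be loaded, or raise on a mismatch.

    Illustrative helper only; not MXNet's implementation.
    """
    block_names, saved_names = set(block_names), set(saved_names)
    missing = sorted(block_names - saved_names)  # in the Block, not in the file
    extra = sorted(saved_names - block_names)    # in the file, not in the Block
    if missing and not allow_missing:
        raise ValueError(f"missing from file: {missing}")
    if extra and not ignore_extra:
        raise ValueError(f"not present in Block: {extra}")
    return sorted(block_names & saved_names)


loaded = reconcile(
    ["fc.weight", "fc.bias"],
    ["fc.weight", "old.head"],
    allow_missing=True,
    ignore_extra=True,
)
print(loaded)
```

With both flags left at False, the same call would fail because fc.bias is absent from the saved names.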
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method saves only the parameters, not the model structure. To save the model structure as well, use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike load_parameters or load_dict, share_parameters results in the Parameter object itself being shared (or tied) between the models, whereas load_parameters and load_dict only set the values held by a model’s existing parameters. If you call load_parameters or load_dict after share_parameters, the loaded values will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
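The difference between tying a Parameter and copying its value can be illustrated without MXNet. The classes ToyParameter and ToyLayer below are hypothetical stand-ins written for this sketch; they only model object identity and are not the library's Parameter or Block.

```python
class ToyParameter:
    """Stand-in for a Parameter: a holder of one value (not MXNet's class)."""

    def __init__(self, value):
        self.value = value


class ToyLayer:
    def __init__(self, value):
        self.params = {"weight": ToyParameter(value)}

    def share_parameters(self, shared):
        # Tie: reuse the very same parameter objects.
        for name, param in shared.items():
            if name in self.params:
                self.params[name] = param
        return self

    def load_dict(self, values):
        # Load: overwrite the values held by the existing parameter objects.
        for name, value in values.items():
            self.params[name].value = value


dense0, dense1, dense2 = ToyLayer(0.0), ToyLayer(1.0), ToyLayer(2.0)
dense1.share_parameters(dense0.params)                       # tied to dense0
dense2.load_dict({"weight": dense0.params["weight"].value})  # value copied once

dense0.load_dict({"weight": 5.0})
tied_value = dense1.params["weight"].value
copied_value = dense2.params["weight"].value
print(tied_value, copied_value)
```

After the final load, the tied layer sees the new value because it holds the same object, while the layer that only copied the value keeps what it had.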
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
SqueezeNet
(version, classes=1000, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match a given regular expression.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
SqueezeNet model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper. SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
- Parameters
version (str) – Version of squeezenet. Options are ‘1.0’, ‘1.1’.
classes (int, default 1000) – Number of classification classes.
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match a given regular expression.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files, path-symbol.json and path-xxxx.params, will be created, where xxxx is the 4-digit epoch number. If None, do not export to file but return the Python Symbol object and the corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order as when the model was saved, because each Block is uniquely identified by its class name and a unique ID assigned in order (children are stored in an OrderedDict), and that unique ID denotes the specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the dict.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the dict that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter, if any.
dtype_source (str, default 'current') – Must be one of {‘current’, ‘saved’}. Only valid if cast_dtype=True; specifies the source of the dtype for casting the parameters.
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic-shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
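The call order of the two kinds of hooks can be illustrated with a toy block in plain Python. Everything here is an assumption made for illustration: ToyBlock and ToyHandle are hypothetical classes, the detach method on the toy handle is this sketch's own way of removing a hook, and none of it reflects how MXNet implements hooks internally.

```python
class ToyHandle:
    """Minimal stand-in for a hook handle (not MXNet's HookHandle)."""

    def __init__(self, hooks, hook):
        self._hooks = hooks
        self._hook = hook

    def detach(self):
        # Remove the hook so it is no longer called.
        if self._hook in self._hooks:
            self._hooks.remove(self._hook)


class ToyBlock:
    def __init__(self):
        self._pre_hooks = []
        self._post_hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)
        return ToyHandle(self._pre_hooks, hook)

    def register_forward_hook(self, hook):
        self._post_hooks.append(hook)
        return ToyHandle(self._post_hooks, hook)

    def forward(self, x):
        return x * 2

    def __call__(self, *inputs):
        for hook in list(self._pre_hooks):
            hook(self, inputs)          # form: hook(block, input)
        output = self.forward(*inputs)
        for hook in list(self._post_hooks):
            hook(self, inputs, output)  # form: hook(block, input, output)
        return output


events = []
block = ToyBlock()
pre = block.register_forward_pre_hook(lambda blk, inp: events.append(("pre", inp)))
post = block.register_forward_hook(lambda blk, inp, out: events.append(("post", inp, out)))

result = block(3)
post.detach()  # only the pre-hook remains registered
block(4)
print(events)
```

The hooks only record what they observe and leave the inputs and output untouched, in line with the guidance above.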
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (they are stored in an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model cannot be recreated deterministically, do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method saves only the parameters, not the model structure. To save the model structure as well, use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to:
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike load_parameters or load_dict, share_parameters results in the Parameter object itself being shared (or tied) between the models, whereas load_parameters and load_dict only set the values held by a model’s existing parameters. If you call load_parameters or load_dict after share_parameters, the loaded values will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only mxnet.ndarray.NDArray is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
class
VGG
(layers, filters, classes=1000, batch_norm=False, **kwargs)[source]¶ Bases:
mxnet.gluon.block.HybridBlock
Methods
apply(fn) – Applies fn recursively to every child block as well as self.
collect_params([select]) – Returns a Dict containing this Block’s and all of its children’s Parameters (default); can also return only the Parameters whose names match a given regular expression.
export(path[, epoch, remove_amp_cast]) – Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
forward(x) – Overrides the forward computation.
hybridize([active, partition_if_dynamic, …]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, device, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load(prefix) – Load a model saved using the save API.
load_dict(param_dict[, device, …]) – Load parameters from dict.
load_parameters(filename[, device, …]) – Load parameters from file previously saved by save_parameters.
optimize_for(x, *args[, backend, clear, …]) – Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
register_op_hook(callback[, monitor_all]) – Install op hook for block recursively.
reset_ctx(ctx) – This function has been deprecated.
reset_device(device) – Re-assign all Parameters to other devices.
save(prefix) – Save the model architecture and parameters to load again later.
save_parameters(filename[, deduplicate]) – Save parameters to file.
setattr(name, value) – Set an attribute to a new value for all Parameters.
share_parameters(shared) – Share parameters recursively inside the model.
summary(*inputs) – Print the summary of the model’s output and parameters.
zero_grad() – Sets all Parameters’ gradient buffer to 0.
Attributes
params – Returns this Block’s parameter dictionary (does not include its children’s parameters).
VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
layers (list of int) – Numbers of layers in each feature block.
filters (list of int) – Numbers of filters in each feature block. List length should match the layers.
classes (int, default 1000) – Number of classification classes.
batch_norm (bool, default False) – Use batch normalization.
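To illustrate how layers and filters fit together, the sketch below lists per-block configurations for the four usual VGG depths and checks that each adds up to its advertised layer count. The vgg_spec table, the NUM_DENSE_LAYERS constant and the total_layers helper are assumptions based on the standard VGG layouts, not something stated above.

```python
# Standard VGG configurations: depth -> (conv layers per block, filters per block).
# Assumption: these mirror the usual VGG-11/13/16/19 layouts from the paper.
vgg_spec = {
    11: ([1, 1, 2, 2, 2], [64, 128, 256, 512, 512]),
    13: ([2, 2, 2, 2, 2], [64, 128, 256, 512, 512]),
    16: ([2, 2, 3, 3, 3], [64, 128, 256, 512, 512]),
    19: ([2, 2, 4, 4, 4], [64, 128, 256, 512, 512]),
}

NUM_DENSE_LAYERS = 3  # assumption: two hidden dense layers plus the classifier


def total_layers(layers):
    """Count weighted layers: all convolutions plus the dense layers."""
    return sum(layers) + NUM_DENSE_LAYERS


for depth, (layers, filters) in vgg_spec.items():
    # Both lists describe the same feature blocks, so their lengths must match.
    assert len(layers) == len(filters)
    print(depth, total_layers(layers))
```

Under these assumptions, a 16-layer network would presumably be built as VGG(*vgg_spec[16]), with batch_norm=True giving the batch-normalized variant.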
-
apply
(fn)¶ Applies fn recursively to every child block as well as self.
- Parameters
fn (callable) – Function to be applied to each submodule, of form fn(block).
- Returns
- Return type
this block
-
collect_params
(select=None)¶ Returns a Dict containing this Block’s and all of its children’s Parameters (default). It can also return only the Parameters whose names match the given regular expressions.
For example, collect the specified parameters in [‘conv1.weight’, ‘conv1.bias’, ‘fc.weight’, ‘fc.bias’]:
model.collect_params('conv1.weight|conv1.bias|fc.weight|fc.bias')
or collect all parameters whose names end with ‘weight’ or ‘bias’ using a regular expression:
model.collect_params('.*weight|.*bias')
- Parameters
select (str) – regular expressions
- Returns
- Return type
The selected
Dict
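The selection patterns above can be tried out without a model. This is a minimal sketch that assumes the pattern is applied to each parameter name with re.match (anchored at the start of the name only); the parameter names and the select_names helper are hypothetical.

```python
import re

# Hypothetical parameter names, standing in for the keys of a model's Dict.
param_names = ["conv1.weight", "conv1.bias", "bn1.gamma", "fc.weight", "fc.bias"]


def select_names(names, select):
    """Keep the names that the pattern matches from the start of the string."""
    pattern = re.compile(select)
    return [name for name in names if pattern.match(name)]


# Explicit alternation of full names.
explicit = select_names(param_names, "conv1.weight|conv1.bias|fc.weight|fc.bias")
# Anything containing 'weight' or 'bias' after an arbitrary prefix.
by_suffix = select_names(param_names, ".*weight|.*bias")
# Everything belonging to one (hypothetical) layer.
only_bn = select_names(param_names, "bn1.*")
```

Note that an unescaped `.` matches any character, so a pattern such as `conv1\.weight` is the safer choice when an exact name is required.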
-
export
(path, epoch=0, remove_amp_cast=True)¶ Export HybridBlock to json format that can be loaded by gluon.SymbolBlock.imports or the C++ interface.
Note
When there is only one input, it will be named data. When there is more than one input, they will be named data0, data1, etc.
- Parameters
path (str or None) – Path to save model. Two files path-symbol.json and path-xxxx.params will be created, where xxxx is the 4 digits epoch number. If None, do not export to file but return Python Symbol object and corresponding dictionary of parameters.
epoch (int) – Epoch number of saved model.
remove_amp_cast (bool, optional) – Whether to remove the amp_cast and amp_multicast operators, before saving the model.
- Returns
symbol_filename (str) – Filename to which model symbols were saved, including path prefix.
params_filename (str) – Filename to which model parameters were saved, including path prefix.
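The naming scheme described for path and epoch can be sketched as plain string formatting. The export_filenames helper below is hypothetical and only mirrors the documented pattern; it does not write any files.

```python
def export_filenames(path, epoch=0):
    """Return the (symbol, params) filenames described for export()."""
    symbol_filename = f"{path}-symbol.json"
    # The epoch number is zero-padded to four digits.
    params_filename = f"{path}-{epoch:04d}.params"
    return symbol_filename, params_filename


symbol_file, params_file = export_filenames("vgg16", epoch=7)
```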
-
hybridize
(active=True, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None)¶ Activates or deactivates HybridBlocks recursively. Has no effect on non-hybrid children.
- Parameters
active (bool, default True) – Whether to turn hybrid on or off.
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
-
infer_shape
(*args)¶ Infers shape of Parameters from inputs.
-
infer_type
(*args)¶ Infers data type of Parameters from inputs.
-
initialize
(init=<mxnet.initializer.Uniform object>, device=None, verbose=False, force_reinit=False)¶ Initializes Parameters of this Block and its children.
- Parameters
init (Initializer) – Global default Initializer to be used when Parameter.init() is None. Otherwise, Parameter.init() takes precedence.
device (Device or list of Device) – Keeps a copy of Parameters on one or many device(s).
verbose (bool, default False) – Whether to verbosely print out details on initialization.
force_reinit (bool, default False) – Whether to force re-initialization if parameter is already initialized.
-
load
(prefix)¶ Load a model saved using the save API
Reconfigures a model using the saved configuration. This function does not regenerate the model architecture. It resets each Block’s parameter UUIDs as they were when saved in order to match the names of the saved parameters.
This function assumes the Blocks in the model were created in the same order they were when the model was saved. This is because each Block is uniquely identified by Block class name and a unique ID in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model is not able to be recreated deterministically do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph (Symbol & inputs) and settings are restored if it had been hybridized before saving.
- Parameters
prefix (str) – The prefix to use in filenames for loading this model: <prefix>-model.json and <prefix>-model.params
-
load_dict
(param_dict, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from dict
- Parameters
param_dict (dict) – Dictionary containing model parameters
device (Device, optional) – Device context on which the memory is allocated. Default is mxnet.device.current_device().
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this dict.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
-
load_parameters
(filename, device=None, allow_missing=False, ignore_extra=False, cast_dtype=False, dtype_source='current')¶ Load parameters from file previously saved by save_parameters.
- Parameters
filename (str) – Path to parameter file.
device (Device or list of Device, default cpu()) – Device(s) to initialize loaded parameters on.
allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.
ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.
cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.
dtype_source (str, default 'current') – must be in {‘current’, ‘saved’} Only valid if cast_dtype=True, specify the source of the dtype for casting the parameters
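The effect of allow_missing and ignore_extra can be illustrated with plain dictionaries. This is a toy model only: the load_into helper, the parameter names and the KeyError exception type are all placeholders, and the real method may report mismatches differently.

```python
def load_into(current, saved, allow_missing=False, ignore_extra=False):
    """Toy model of the two flags, using plain dicts in place of parameters.

    current: name -> value held by the block
    saved:   name -> value read from the file
    """
    missing = [name for name in current if name not in saved]
    extra = [name for name in saved if name not in current]
    if missing and not allow_missing:
        raise KeyError(f"parameters missing from file: {missing}")
    if extra and not ignore_extra:
        raise KeyError(f"file has parameters not in block: {extra}")
    # Copy over only the values that both sides know about.
    for name in current:
        if name in saved:
            current[name] = saved[name]
    return current


block_params = {"fc.weight": 0.0, "fc.bias": 0.0}
file_params = {"fc.weight": 1.5, "old.weight": 9.0}

# fc.bias is absent from the file and old.weight is absent from the block,
# so both flags are needed for this load to succeed.
loaded = load_into(block_params, file_params, allow_missing=True, ignore_extra=True)
```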
-
optimize_for
(x, *args, backend=None, clear=False, partition_if_dynamic=True, static_alloc=False, static_shape=False, inline_limit=2, forward_bulk_size=None, backward_bulk_size=None, **kwargs)¶ Partitions the current HybridBlock and optimizes it for a given backend without executing a forward pass. Modifies the HybridBlock in-place.
Immediately partitions a HybridBlock using the specified backend. Combines the work done in the hybridize API with part of the work done in the forward pass without calling the CachedOp. Can be used in place of hybridize, afterwards export can be called or inference can be run. See README.md in example/extensions/lib_subgraph/README.md for more details.
Examples
# partition and then export to file
block.optimize_for(x, backend='myPart')
block.export('partitioned')
# partition and then run inference
block.optimize_for(x, backend='myPart')
block(x)
- Parameters
x (NDArray) – first input to model
*args (NDArray) – other inputs to model
backend (str) – The name of backend, as registered in SubgraphBackendRegistry, default None
backend_opts (dict of user-specified options to pass to the backend for partitioning, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
clear (bool, default False) – clears any previous optimizations
partition_if_dynamic (bool, default True) – Whether to partition the graph when a dynamic shape op exists.
static_alloc (bool, default False) – Statically allocate memory to improve speed. Memory usage may increase.
static_shape (bool, default False) – Optimize for invariant input shapes between iterations. Must also set static_alloc to True. Change of input shapes is still allowed but slower.
inline_limit (optional int, default 2) – Maximum number of operators that can be inlined.
forward_bulk_size (optional int, default None) – Segment size of bulk execution during forward pass.
backward_bulk_size (optional int, default None) – Segment size of bulk execution during backward pass.
**kwargs (The backend options, optional) – Passed on to PrePartition and PostPartition functions of SubgraphProperty
-
property
params
¶ Returns this Block’s parameter dictionary (does not include its children’s parameters).
-
register_forward_hook
(hook)¶ Registers a forward hook on the block.
The hook function is called immediately after forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward hook function of form hook(block, input, output) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
-
register_forward_pre_hook
(hook)¶ Registers a forward pre-hook on the block.
The hook function is called immediately before forward(). It should not modify the input or output.
- Parameters
hook (callable) – The forward pre-hook function of form hook(block, input) -> None.
- Returns
- Return type
mxnet.gluon.utils.HookHandle
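The calling order of the two hook kinds can be shown with a small stand-in class. ToyBlock below is hypothetical and is not the MXNet implementation: it passes the single input directly to each hook and does not return a HookHandle, both of which may differ from the real API.

```python
class ToyBlock:
    """Minimal stand-in for a block; not the MXNet implementation."""

    def __init__(self):
        self._pre_hooks = []
        self._hooks = []

    def register_forward_pre_hook(self, hook):
        self._pre_hooks.append(hook)

    def register_forward_hook(self, hook):
        self._hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        for hook in self._pre_hooks:
            hook(self, x)        # runs immediately before forward()
        out = self.forward(x)
        for hook in self._hooks:
            hook(self, x, out)   # runs immediately after forward()
        return out


events = []
block = ToyBlock()
block.register_forward_pre_hook(lambda blk, inp: events.append(("pre", inp)))
block.register_forward_hook(lambda blk, inp, out: events.append(("post", inp, out)))
result = block(3)
```

Both hooks only record what they see, in keeping with the rule that a hook should not modify the input or output.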
-
register_op_hook
(callback, monitor_all=False)¶ Install op hook for block recursively.
- Parameters
callback (function) – Function called to inspect the values of the intermediate outputs of blocks after hybridization. It takes 3 parameters: the name of the tensor being inspected (str), the name of the operator producing or consuming that tensor (str), and the tensor being inspected (NDArray).
monitor_all (bool, default False) – If True, monitor both input and output, otherwise monitor output only.
-
reset_ctx
(ctx)¶ This function has been deprecated. Please refer to HybridBlock.reset_device.
-
reset_device
(device)¶ Re-assign all Parameters to other devices. If the Block is hybridized, it will reset the _cached_op_args.
- Parameters
device (Device or list of Device, default device.current_device()) – Assign Parameter to given device. If device is a list of Device, a copy will be made for each device.
-
save
(prefix)¶ Save the model architecture and parameters to load again later
Saves the model architecture as a nested dictionary where each Block in the model is a dictionary and its children are sub-dictionaries.
Each Block is uniquely identified by Block class name and a unique ID. We save each Block’s parameter UUID to restore later in order to match the saved parameters.
Recursively traverses a Block’s children in order (since it’s an OrderedDict) and uses the unique ID to denote that specific Block.
Assumes that the model is created in an identical order every time. If the model is not able to be recreated deterministically do not use this set of APIs to save/load your model.
For HybridBlocks, the cached_graph is saved (Symbol & inputs) if it has already been hybridized.
- Parameters
prefix (str) – The prefix to use in filenames for saving this model: <prefix>-model.json and <prefix>-model.params
-
save_parameters
(filename, deduplicate=False)¶ Save parameters to file.
Saved parameters can only be loaded with load_parameters. Note that this method only saves parameters, not model structure. If you want to save model structures, please use HybridBlock.export().
- Parameters
filename (str) – Path to file.
deduplicate (bool, default False) – If True, save shared parameters only once. Otherwise, if a Block contains multiple sub-blocks that share parameters, each of the shared parameters will be separately saved for every sub-block.
-
setattr
(name, value)¶ Set an attribute to a new value for all Parameters.
For example, set grad_req to null if you don’t need gradient w.r.t a model’s Parameters:
model.setattr('grad_req', 'null')
or change the learning rate multiplier:
model.setattr('lr_mult', 0.5)
- Parameters
name (str) – Name of the attribute.
value (valid type for attribute name) – The new value for the attribute.
-
share_parameters
(shared)¶ Share parameters recursively inside the model.
For example, if you want dense1 to share dense0’s weights, you can do:
dense0 = nn.Dense(20)
dense1 = nn.Dense(20)
dense1.share_parameters(dense0.collect_params())
which is equivalent to
dense1.weight = dense0.weight
dense1.bias = dense0.bias
Note that unlike the load_parameters or load_dict functions, share_parameters results in the Parameter object being shared (or tied) between the models, whereas load_parameters or load_dict only set the value of the data dictionary of a model. If you call load_parameters or load_dict after share_parameters, the loaded value will be reflected in all networks that use the shared (or tied) Parameter object.
- Parameters
shared (Dict) – Dict of the shared parameters.
- Returns
- Return type
this block
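The difference between tying a Parameter object and merely copying its value can be seen with ordinary Python objects. The ToyParameter class and the three dictionaries below are hypothetical stand-ins, not MXNet types.

```python
class ToyParameter:
    """Holds one value; stands in for a Parameter object."""

    def __init__(self, value):
        self.value = value


# Three "models", each a dict of name -> ToyParameter.
model_a = {"weight": ToyParameter(1.0)}
model_b = {"weight": ToyParameter(2.0)}
model_c = {"weight": ToyParameter(3.0)}

# Sharing: model_b now refers to the very same object as model_a.
model_b["weight"] = model_a["weight"]

# Loading: model_c only receives a copy of the current value.
model_c["weight"].value = model_a["weight"].value

# A later load into model_a changes the shared object in place,
# so model_b sees the new value while model_c keeps the old one.
model_a["weight"].value = 10.0
```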
-
summary
(*inputs)¶ Print the summary of the model’s output and parameters.
The network must have been initialized, and must not have been hybridized.
- Parameters
inputs (object) – Any input that the model supports. For any tensor in the input, only
mxnet.ndarray.NDArray
is supported.
-
zero_grad
()¶ Sets all Parameters’ gradient buffer to 0.
-
alexnet
(pretrained=False, device=cpu(0), root='/home/jenkins_slave/.mxnet/models', **kwargs)[source]¶ AlexNet model from the “One weird trick…” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
-
densenet121
(**kwargs)[source]¶ Densenet-BC 121-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
densenet161
(**kwargs)[source]¶ Densenet-BC 161-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
densenet169
(**kwargs)[source]¶ Densenet-BC 169-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
densenet201
(**kwargs)[source]¶ Densenet-BC 201-layer model from the “Densely Connected Convolutional Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
get_mobilenet
(multiplier, pretrained=False, device=cpu(0), root='/home/jenkins_slave/.mxnet/models', **kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper.
- Parameters
multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
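As a worked example of the width multiplier, the sketch below scales a few layer widths. The base widths, the truncation with int and the ValueError for small multipliers are assumptions made for illustration; scaled_channels is a hypothetical helper, and the real function may round or validate differently.

```python
def scaled_channels(base_channels, multiplier):
    """Scale each layer width by the multiplier, truncating to an int."""
    # Mirrors the documented restriction; the real function may behave differently.
    if multiplier < 0.25:
        raise ValueError("only multipliers no less than 0.25 are supported")
    return [int(c * multiplier) for c in base_channels]


# Assumption: a selection of layer widths from the original MobileNet.
base = [32, 64, 128, 256, 512, 1024]

quarter = scaled_channels(base, 0.25)
half = scaled_channels(base, 0.5)
full = scaled_channels(base, 1.0)
```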
-
get_mobilenet_v2
(multiplier, pretrained=False, device=cpu(0), root='/home/jenkins_slave/.mxnet/models', **kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
multiplier (float) – The width multiplier for controlling the model size. Only multipliers that are no less than 0.25 are supported. The actual number of channels is equal to the original channel size multiplied by this multiplier.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
-
get_model
(name, **kwargs)[source]¶ Returns a pre-defined model by name
- Parameters
name (str) – Name of the model.
pretrained (bool) – Whether to load the pretrained weights for model.
classes (int) – Number of classes for the output layer.
ctx (Context, default CPU) – The context in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
- Returns
The model.
- Return type
-
get_resnet
(version, num_layers, pretrained=False, device=cpu(0), root='/home/jenkins_slave/.mxnet/models', **kwargs)[source]¶ ResNet V1 model from “Deep Residual Learning for Image Recognition” paper. ResNet V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
version (int) – Version of ResNet. Options are 1, 2.
num_layers (int) – Numbers of layers. Options are 18, 34, 50, 101, 152.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
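The documented options for version and num_layers give ten combinations, which line up with the convenience functions resnet18_v1 through resnet152_v2 listed on this page. The resnet_name helper below is a hypothetical sketch of that correspondence, including the kind of validation one might expect; it does not call MXNet.

```python
VERSIONS = (1, 2)
NUM_LAYERS = (18, 34, 50, 101, 152)


def resnet_name(version, num_layers):
    """Map (version, num_layers) to the matching convenience-function name."""
    if version not in VERSIONS:
        raise ValueError(f"invalid version {version}; options are {VERSIONS}")
    if num_layers not in NUM_LAYERS:
        raise ValueError(f"invalid num_layers {num_layers}; options are {NUM_LAYERS}")
    return f"resnet{num_layers}_v{version}"


# Enumerate every supported combination.
all_names = [resnet_name(v, n) for v in VERSIONS for n in NUM_LAYERS]
```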
-
get_vgg
(num_layers, pretrained=False, device=cpu(0), root='/home/jenkins_slave/.mxnet/models', **kwargs)[source]¶ VGG model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
num_layers (int) – Number of layers for the variant of VGG. Options are 11, 13, 16, 19.
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
-
inception_v3
(pretrained=False, device=cpu(0), root='/home/jenkins_slave/.mxnet/models', **kwargs)[source]¶ Inception v3 model from “Rethinking the Inception Architecture for Computer Vision” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default $MXNET_HOME/models) – Location for keeping the model parameters.
-
mobilenet0_25
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.25.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet0_5
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.5.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet0_75
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 0.75.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet1_0
(**kwargs)[source]¶ MobileNet model from the “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications” paper, with width multiplier 1.0.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet_v2_0_25
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet_v2_0_5
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet_v2_0_75
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
mobilenet_v2_1_0
(**kwargs)[source]¶ MobileNetV2 model from the “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
-
resnet101_v1
(**kwargs)[source]¶ ResNet-101 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet101_v2
(**kwargs)[source]¶ ResNet-101 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet152_v1
(**kwargs)[source]¶ ResNet-152 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet152_v2
(**kwargs)[source]¶ ResNet-152 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet18_v1
(**kwargs)[source]¶ ResNet-18 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet18_v2
(**kwargs)[source]¶ ResNet-18 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet34_v1
(**kwargs)[source]¶ ResNet-34 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet34_v2
(**kwargs)[source]¶ ResNet-34 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet50_v1
(**kwargs)[source]¶ ResNet-50 V1 model from “Deep Residual Learning for Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
resnet50_v2
(**kwargs)[source]¶ ResNet-50 V2 model from “Identity Mappings in Deep Residual Networks” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
squeezenet1_0
(**kwargs)[source]¶ SqueezeNet 1.0 model from the “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
squeezenet1_1
(**kwargs)[source]¶ SqueezeNet 1.1 model from the official SqueezeNet repo. SqueezeNet 1.1 has 2.4x less computation and slightly fewer parameters than SqueezeNet 1.0, without sacrificing accuracy.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg11
(**kwargs)[source]¶ VGG-11 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg11_bn
(**kwargs)[source]¶ VGG-11 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg13
(**kwargs)[source]¶ VGG-13 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg13_bn
(**kwargs)[source]¶ VGG-13 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg16
(**kwargs)[source]¶ VGG-16 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg16_bn
(**kwargs)[source]¶ VGG-16 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg19
(**kwargs)[source]¶ VGG-19 model from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.
-
vgg19_bn
(**kwargs)[source]¶ VGG-19 model with batch normalization from the “Very Deep Convolutional Networks for Large-Scale Image Recognition” paper.
- Parameters
pretrained (bool, default False) – Whether to load the pretrained weights for model.
device (Device, default CPU) – The device in which to load the pretrained weights.
root (str, default '$MXNET_HOME/models') – Location for keeping the model parameters.