tensorfx.training
Model Builder
-
class
tensorfx.training.
ModelArguments
(**kwargs)¶ -
classmethod
build_parser
()¶ Builds the argument parser.
Returns: An argument parser with arguments added.
-
classmethod
default
()¶ Creates an instance of the arguments with default values.
Returns: The model arguments with default values.
-
classmethod
parse
(args=None, parse_job=False)¶ Parses training arguments.
Parameters: - args – the arguments to parse. If unspecified, the process arguments are used.
- parse_job – whether to parse the job related standard arguments.
Returns: The model arguments, and optionally the training output path as well.
-
process
()¶ Processes the parsed arguments to produce any additional objects.
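The build_parser, default and parse classmethods above follow a common argparse pattern. The following is a minimal, stdlib-only sketch of that pattern and is not the tensorfx implementation; the class name `SketchArguments` and the flags `--batch-size` and `--max-steps` are hypothetical.

```python
import argparse


class SketchArguments(object):
    """Hypothetical stand-in illustrating the build_parser/default/parse pattern."""

    def __init__(self, **kwargs):
        # Expose every parsed value as an attribute.
        self.__dict__.update(kwargs)

    @classmethod
    def build_parser(cls):
        # A subclass could override this to add model-specific flags.
        parser = argparse.ArgumentParser()
        parser.add_argument('--batch-size', type=int, default=128)
        parser.add_argument('--max-steps', type=int, default=1000)
        return parser

    @classmethod
    def default(cls):
        # Parsing an empty argument list yields only the default values.
        return cls.parse(args=[])

    @classmethod
    def parse(cls, args=None):
        # With args=None, argparse falls back to the process arguments.
        values, _ = cls.build_parser().parse_known_args(args)
        return cls(**vars(values))


defaults = SketchArguments.default()
custom = SketchArguments.parse(['--batch-size', '32'])
```

Here `defaults` carries only the default values, while `custom` overrides one flag and inherits the default for the other.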
-
class
tensorfx.training.
ModelBuilder
(args, dataset)¶ Builds model graphs for different phases: training, evaluation and prediction.
A model graph is an interface that encapsulates a TensorFlow graph, and references to tensors and ops within that graph.
A ModelBuilder serves as a base class for various models. Each concrete model adds its own logic to build the required TensorFlow graph.
-
args
¶ Retrieves the set of arguments specified for training.
-
build_evaluation
(inputs, outputs)¶ Builds the evaluation graph.
Parameters: - inputs – the dictionary of tensors corresponding to the input.
- outputs – the dictionary containing output tensors.
Returns: The eval metric tensor and the eval op.
-
build_evaluation_graph
()¶ Builds the graph to use for evaluating a model during training.
Returns: The set of tensor and op references required for evaluation.
-
build_graph_interfaces
(config)¶ Builds graph interfaces for training and evaluating a model, and for predicting using it.
A graph interface is an object containing a TensorFlow graph member, as well as members corresponding to various tensors and ops within the graph.
Parameters: config – the training Configuration object.
Returns: A tuple consisting of the training, evaluation and prediction interfaces.
-
build_inference
(inputs, training)¶ Builds the inference sub-graph.
Parameters: - inputs – the dictionary of tensors corresponding to the input.
- training – whether the inference sub-graph is being built for the training graph.
Returns: The inference values.
-
build_init
()¶ Builds the initialization sub-graph.
The default implementation creates two ops: an initialization op that initializes all variables, and a local initialization op that initializes local variables, non-trainable variables and tables.
Initialization is run when the graph is first created, before training. Local initialization is performed after a previously trained model is loaded.
Returns: A tuple containing the init op and local init op to use to initialize the graph.
-
build_input
(source, batch, epochs, shuffle)¶ Builds the input sub-graph.
Parameters: - source – the name of data source to use for input (for training and evaluation).
- batch – the number of instances to read per batch.
- epochs – the number of passes over the data.
- shuffle – whether to shuffle the data.
Returns: A dictionary of tensors keyed by feature names.
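To make the batch, epochs and shuffle parameters concrete, here is a plain-Python sketch of the same semantics. It does not build TensorFlow ops and is not how tensorfx reads data; the function `iterate_batches`, its `seed` parameter and the sample rows are hypothetical.

```python
import random


def iterate_batches(rows, batch, epochs, shuffle, seed=0):
    """Yields dicts mapping feature name to a list of values, `batch` rows at a time."""
    rng = random.Random(seed)
    for _ in range(epochs):
        # Each epoch is one full pass over the data.
        order = list(rows)
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch):
            chunk = order[start:start + batch]
            # Transpose the rows into one list per feature name.
            yield {name: [row[name] for row in chunk] for name in chunk[0]}


rows = [{'x': i, 'y': 2 * i} for i in range(5)]
batches = list(iterate_batches(rows, batch=2, epochs=2, shuffle=False))
```

With five rows and a batch size of two, each epoch produces three batches, the last of which is partial.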
-
build_output
(inferences)¶ Builds the output sub-graph.
Parameters: inferences – the inference values.
Returns: A dictionary consisting of the output prediction tensors.
-
build_prediction_graph
()¶ Builds the graph to use for predictions with the trained model.
Returns: The set of tensor and op references required for prediction.
-
build_training
(global_steps, inputs, inferences)¶ Builds the training sub-graph.
Parameters: - global_steps – the global steps variable to use.
- inputs – the dictionary of tensors corresponding to the input.
- inferences – the inference values.
Returns: The loss tensor, and the training op.
-
build_training_graph
()¶ Builds the graph to use for training a model.
This operates on the current default graph.
Returns: The set of tensor and op references required for training.
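One plausible reading of how the sub-graph builders compose into a training graph is the template-method pattern: the base class fixes the order of the steps and each model overrides the steps. The sketch below uses plain Python values in place of tensors; `SketchBuilder`, `LinearSketch`, the simplified signatures and the call order are assumptions for illustration, not tensorfx code.

```python
class SketchBuilder(object):
    """Hypothetical base class; plain numbers stand in for tensors."""

    def build_training_graph(self):
        # Assumed order of the steps: input, then inference, then training.
        inputs = self.build_input()
        inferences = self.build_inference(inputs, training=True)
        loss = self.build_training(inputs, inferences)
        return {'inputs': inputs, 'inferences': inferences, 'loss': loss}

    def build_input(self):
        raise NotImplementedError()

    def build_inference(self, inputs, training):
        raise NotImplementedError()

    def build_training(self, inputs, inferences):
        raise NotImplementedError()


class LinearSketch(SketchBuilder):
    """A concrete model that supplies its own logic for each step."""

    def build_input(self):
        return {'x': [1.0, 2.0, 3.0], 'y': [2.0, 4.0, 6.0]}

    def build_inference(self, inputs, training):
        # A fixed linear model, y = 2x.
        return [2.0 * x for x in inputs['x']]

    def build_training(self, inputs, inferences):
        # Mean squared error between the inferences and the targets.
        errors = [(p - y) ** 2 for p, y in zip(inferences, inputs['y'])]
        return sum(errors) / len(errors)


graph = LinearSketch().build_training_graph()
```

The returned dictionary plays the role of the "set of tensor and op references" described above: callers get named handles rather than reaching into the graph themselves.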
-
dataset
¶ Retrieves the DataSet being used for training and evaluation data.
-
Training Jobs
-
class
tensorfx.training.
Configuration
(task, cluster, job, env)¶ Contains configuration information for the training process.
-
cluster
¶ Retrieves the cluster definition containing the current node.
This is None if the current node is part of a single node training job.
-
create_device_setter
(args)¶ Creates the device setter, which assigns variables and ops to devices in distributed mode.
Parameters: args – the arguments associated with the current job.
-
create_server
()¶ Creates the TensorFlow server, which is required for distributed training.
-
device
¶ Retrieves the device associated with the current node.
-
distributed
¶ Determines whether the training being performed is distributed or single-node.
Returns: True if the configuration represents distributed training; False otherwise.
-
classmethod
environment
()¶ Creates a Configuration object for single node and distributed training.
This looks up configuration from the TF_CONFIG environment variable, which allows a hosting environment to configure the training process. The variable is expected to hold a JSON-formatted dictionary describing the current task, cluster and job.
Returns: A Configuration instance matching the current environment.
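As an illustration of the lookup described above, the following stdlib-only sketch reads a TF_CONFIG value and derives a few of the flags a Configuration exposes. The top-level keys task, cluster and job come from the description; the helper `read_tf_config`, the nested keys (`type`, `index`, `master`, `ps`, `worker`, `args`) and the host names are assumptions, and this is not the tensorfx implementation.

```python
import json
import os


def read_tf_config(environ=None):
    """Returns (task, cluster, job) parsed from TF_CONFIG; each is None if absent."""
    environ = os.environ if environ is None else environ
    config = json.loads(environ.get('TF_CONFIG', '{}'))
    return config.get('task'), config.get('cluster'), config.get('job')


# Hypothetical value a hosting environment might set for one worker node.
example = {
    'TF_CONFIG': json.dumps({
        'cluster': {
            'master': ['host0:2222'],
            'ps': ['host1:2222'],
            'worker': ['host2:2222'],
        },
        'task': {'type': 'worker', 'index': 0},
        'job': {'args': []},
    })
}

task, cluster, job = read_tf_config(example)
is_distributed = cluster is not None
is_worker = task is not None and task.get('type') == 'worker'

# With no TF_CONFIG set, everything is None, matching single-node training.
local_task, local_cluster, local_job = read_tf_config({})
```

Passing the environment as a dictionary keeps the sketch testable without modifying the real process environment.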
-
job
¶ Retrieves the job definition of the current training job.
-
classmethod
local
()¶ Creates a Configuration object representing single node training in a process.
Returns: A default Configuration instance with simple configuration.
-
master
¶ Retrieves whether the current task is a master task.
-
param_server
¶ Retrieves whether the current task is a parameter server task.
-
task
¶ Retrieves the task definition associated with the current node.
If no job information is provided, this is None.
-
worker
¶ Retrieves whether the current task is a worker task.
-
-
class
tensorfx.training.
ModelTrainer
(config=None)¶ Provides the functionality to train a model during a training job.
-
config
¶ Retrieves the training configuration.
-
train
(model_builder, job_args)¶ Runs the training process to train a model.
Parameters: - model_builder – the ModelBuilder to use to build graphs during training.
- job_args – the arguments for the training job.
Returns: The trained Model. The resulting value is only relevant for master nodes.
-