# Converter command line examples

This page shows how to use the TensorFlow Lite Converter in the command line.

[TOC]

## Command-line tools <a name="tools"></a>

There are two approaches to running the converter in the command line.

*   `tflite_convert`: Starting from TensorFlow 1.9, the command-line tool
    `tflite_convert` is installed as part of the Python package. All of the
    examples below use `tflite_convert` for simplicity.
    *   Example: `tflite_convert --output_file=...`
*   `bazel`: In order to run the latest version of the TensorFlow Lite
    Converter, either install the nightly build using
    [pip](https://www.tensorflow.org/install/pip) or
    [clone the TensorFlow repository](https://www.tensorflow.org/install/source)
    and use `bazel`.
    *   Example: `bazel run //tensorflow/lite/python:tflite_convert -- --output_file=...`

### Converting models prior to TensorFlow 1.9 <a name="pre_tensorflow_1.9"></a>

The recommended approach for using the converter prior to TensorFlow 1.9 is the
[Python API](python_api.md#pre_tensorflow_1.9). If a command line tool is
desired, the `toco` command line tool was available in TensorFlow 1.7. Enter
`toco --help` in Terminal for additional details on the command-line flags
available. There were no command line tools in TensorFlow 1.8.

## Basic examples <a name="basic"></a>

The following section shows examples of how to convert a basic floating-point
model from each of the supported data formats into a TensorFlow Lite
FlatBuffer.

### Convert a TensorFlow GraphDef <a name="graphdef"></a>

The following example converts a basic TensorFlow GraphDef (frozen by
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py))
into a TensorFlow Lite FlatBuffer to perform floating-point inference. Frozen
graphs contain the variables stored in checkpoint files as `Const` ops.
43 44``` 45curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ 46 | tar xzv -C /tmp 47tflite_convert \ 48 --output_file=/tmp/foo.tflite \ 49 --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ 50 --input_arrays=input \ 51 --output_arrays=MobilenetV1/Predictions/Reshape_1 52``` 53 54The value for `input_shapes` is automatically determined whenever possible. 55 56### Convert a TensorFlow SavedModel <a name="savedmodel"></a> 57 58The follow example converts a basic TensorFlow SavedModel into a Tensorflow Lite 59FlatBuffer to perform floating-point inference. 60 61``` 62tflite_convert \ 63 --output_file=/tmp/foo.tflite \ 64 --saved_model_dir=/tmp/saved_model 65``` 66 67[SavedModel](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators) 68has fewer required flags than frozen graphs due to access to additional data 69contained within the SavedModel. The values for `--input_arrays` and 70`--output_arrays` are an aggregated, alphabetized list of the inputs and outputs 71in the [SignatureDefs](../../serving/signature_defs.md) within 72the 73[MetaGraphDef](https://www.tensorflow.org/saved_model#apis_to_build_and_load_a_savedmodel) 74specified by `--saved_model_tag_set`. As with the GraphDef, the value for 75`input_shapes` is automatically determined whenever possible. 76 77There is currently no support for MetaGraphDefs without a SignatureDef or for 78MetaGraphDefs that use the [`assets/` 79directory](https://www.tensorflow.org/guide/saved_model#structure_of_a_savedmodel_directory). 80 81### Convert a tf.Keras model <a name="keras"></a> 82 83The following example converts a `tf.keras` model into a TensorFlow Lite 84Flatbuffer. The `tf.keras` file must contain both the model and the weights. 
85 86``` 87tflite_convert \ 88 --output_file=/tmp/foo.tflite \ 89 --keras_model_file=/tmp/keras_model.h5 90``` 91 92## Quantization 93 94### Convert a TensorFlow GraphDef for quantized inference <a name="graphdef_quant"></a> 95 96The TensorFlow Lite Converter is compatible with fixed point quantization models 97described 98[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/quantize/README.md). 99These are float models with `FakeQuant*` ops inserted at the boundaries of fused 100layers to record min-max range information. This generates a quantized inference 101workload that reproduces the quantization behavior that was used during 102training. 103 104The following command generates a quantized TensorFlow Lite FlatBuffer from a 105"quantized" TensorFlow GraphDef. 106 107``` 108tflite_convert \ 109 --output_file=/tmp/foo.tflite \ 110 --graph_def_file=/tmp/some_quantized_graph.pb \ 111 --inference_type=QUANTIZED_UINT8 \ 112 --input_arrays=input \ 113 --output_arrays=MobilenetV1/Predictions/Reshape_1 \ 114 --mean_values=128 \ 115 --std_dev_values=127 116``` 117 118### Use \"dummy-quantization\" to try out quantized inference on a float graph <a name="dummy_quant"></a> 119 120In order to evaluate the possible benefit of generating a quantized graph, the 121converter allows "dummy-quantization" on float graphs. The flags 122`--default_ranges_min` and `--default_ranges_max` accept plausible values for 123the min-max ranges of the values in all arrays that do not have min-max 124information. "Dummy-quantization" will produce lower accuracy but will emulate 125the performance of a correctly quantized model. 126 127The example below contains a model using Relu6 activation functions. Therefore, 128a reasonable guess is that most activation ranges should be contained in [0, 6]. 
129 130``` 131curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ 132 | tar xzv -C /tmp 133tflite_convert \ 134 --output_file=/tmp/foo.cc \ 135 --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ 136 --inference_type=QUANTIZED_UINT8 \ 137 --input_arrays=input \ 138 --output_arrays=MobilenetV1/Predictions/Reshape_1 \ 139 --default_ranges_min=0 \ 140 --default_ranges_max=6 \ 141 --mean_values=128 \ 142 --std_dev_values=127 143``` 144 145## Specifying input and output arrays 146 147### Multiple input arrays 148 149The flag `input_arrays` takes in a comma-separated list of input arrays as seen 150in the example below. This is useful for models or subgraphs with multiple 151inputs. 152 153``` 154curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \ 155 | tar xzv -C /tmp 156tflite_convert \ 157 --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \ 158 --output_file=/tmp/foo.tflite \ 159 --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \ 160 --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \ 161 --output_arrays=InceptionV1/Logits/Predictions/Reshape_1 162``` 163 164Note that `input_shapes` is provided as a colon-separated list. Each input shape 165corresponds to the input array at the same position in the respective list. 166 167### Multiple output arrays 168 169The flag `output_arrays` takes in a comma-separated list of output arrays as 170seen in the example below. This is useful for models or subgraphs with multiple 171outputs. 
172 173``` 174curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \ 175 | tar xzv -C /tmp 176tflite_convert \ 177 --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \ 178 --output_file=/tmp/foo.tflite \ 179 --input_arrays=input \ 180 --output_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu 181``` 182 183### Specifying subgraphs 184 185Any array in the input file can be specified as an input or output array in 186order to extract subgraphs out of an input graph file. The TensorFlow Lite 187Converter discards the parts of the graph outside of the specific subgraph. Use 188[graph visualizations](#graph_visualizations) to identify the input and output 189arrays that make up the desired subgraph. 190 191The follow command shows how to extract a single fused layer out of a TensorFlow 192GraphDef. 193 194``` 195curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \ 196 | tar xzv -C /tmp 197tflite_convert \ 198 --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \ 199 --output_file=/tmp/foo.pb \ 200 --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \ 201 --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \ 202 --output_arrays=InceptionV1/InceptionV1/Mixed_3b/concat_v2 203``` 204 205Note that the final representation in TensorFlow Lite FlatBuffers tends to have 206coarser granularity than the very fine granularity of the TensorFlow GraphDef 207representation. 
For example, while a fully-connected layer is typically represented as at
least four separate ops in a TensorFlow GraphDef (Reshape, MatMul, BiasAdd,
Relu...), it is typically represented as a single "fused" op (FullyConnected)
in the converter's optimized representation and in the final on-device
representation. As the level of granularity gets coarser, some intermediate
arrays (say, the array between the MatMul and the BiasAdd in the TensorFlow
GraphDef) are dropped.

When specifying intermediate arrays as `--input_arrays` and `--output_arrays`,
it is desirable (and often required) to specify arrays that are meant to
survive in the final form of the graph, after fusing. These are typically the
outputs of activation functions (since everything in each layer until the
activation function tends to get fused).

## Graph visualizations

The converter can export a graph to the Graphviz Dot format for easy
visualization using either the `--output_format` flag or the
`--dump_graphviz_dir` flag. The subsections below outline the use cases for
each.

### Using `--output_format=GRAPHVIZ_DOT` <a name="using_output_format_graphviz_dot"></a>

The first way to get a Graphviz rendering is to pass `GRAPHVIZ_DOT` into
`--output_format`. This results in a plausible visualization of the graph.
This reduces the requirements that exist during conversion from a TensorFlow
GraphDef to a TensorFlow Lite FlatBuffer, which may be useful if the
conversion to TFLite is failing.
238 239``` 240curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ 241 | tar xzv -C /tmp 242tflite_convert \ 243 --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ 244 --output_file=/tmp/foo.dot \ 245 --output_format=GRAPHVIZ_DOT \ 246 --input_shape=1,128,128,3 \ 247 --input_arrays=input \ 248 --output_arrays=MobilenetV1/Predictions/Reshape_1 249``` 250 251The resulting `.dot` file can be rendered into a PDF as follows: 252 253``` 254dot -Tpdf -O /tmp/foo.dot 255``` 256 257And the resulting `.dot.pdf` can be viewed in any PDF viewer, but we suggest one 258with a good ability to pan and zoom across a very large page. Google Chrome does 259well in that respect. 260 261``` 262google-chrome /tmp/foo.dot.pdf 263``` 264 265Example PDF files are viewable online in the next section. 266 267### Using `--dump_graphviz_dir` 268 269The second way to get a Graphviz rendering is to pass the `--dump_graphviz_dir` 270flag, specifying a destination directory to dump Graphviz rendering to. Unlike 271the previous approach, this one retains the original output format. This 272provides a visualization of the actual graph resulting from a specific 273conversion process. 274 275``` 276curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ 277 | tar xzv -C /tmp 278tflite_convert \ 279 --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ 280 --output_file=/tmp/foo.tflite \ 281 --input_arrays=input \ 282 --output_arrays=MobilenetV1/Predictions/Reshape_1 \ 283 --dump_graphviz_dir=/tmp 284``` 285 286This generates a few files in the destination directory. The two most important 287files are `toco_AT_IMPORT.dot` and `/tmp/toco_AFTER_TRANSFORMATIONS.dot`. 288`toco_AT_IMPORT.dot` represents the original graph containing only the 289transformations done at import time. This tends to be a complex visualization 290with limited information about each node. 
It is useful in situations where a conversion command fails.

`toco_AFTER_TRANSFORMATIONS.dot` represents the graph after all
transformations were applied to it, just before it is exported. Typically,
this is a much smaller graph with more information about each node.

As before, these can be rendered to PDFs:

```
dot -Tpdf -O /tmp/toco_*.dot
```

Sample output files can be seen below. Note that it is the same `AveragePool`
node in the top right of each image.

<table><tr>
  <td>
    <a target="_blank" href="https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AT_IMPORT.dot.pdf">
      <img src="../images/convert/sample_before.png"/>
    </a>
  </td>
  <td>
    <a target="_blank" href="https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AFTER_TRANSFORMATIONS.dot.pdf">
      <img src="../images/convert/sample_after.png"/>
    </a>
  </td>
</tr>
<tr><td>before</td><td>after</td></tr>
</table>

### Graph "video" logging

When `--dump_graphviz_dir` is used, one may additionally pass
`--dump_graphviz_video`. This causes a graph visualization to be dumped after
each individual graph transformation, resulting in thousands of files.
Typically, one would then bisect into these files to understand when a given
change was introduced in the graph.

### Legend for the graph visualizations <a name="graphviz_legend"></a>

*   Operators are red square boxes with the following hues of red:
    *   Most operators are
        <span style="background-color:#db4437;color:white;border:1px;border-style:solid;border-color:black;padding:1px">bright
        red</span>.
    *   Some typically heavy operators (e.g. Conv) are rendered in a
        <span style="background-color:#c53929;color:white;border:1px;border-style:solid;border-color:black;padding:1px">darker
        red</span>.
*   Arrays are octagons with the following colors:
    *   Constant arrays are
        <span style="background-color:#4285f4;color:white;border:1px;border-style:solid;border-color:black;padding:1px">blue</span>.
    *   Activation arrays are gray:
        *   Internal (intermediate) activation arrays are
            <span style="background-color:#f5f5f5;border:1px;border-style:solid;border-color:black;padding:1px">light
            gray</span>.
        *   Those activation arrays that are designated as `--input_arrays` or
            `--output_arrays` are
            <span style="background-color:#9e9e9e;border:1px;border-style:solid;border-color:black;padding:1px">dark
            gray</span>.
    *   RNN state arrays are green. Because of the way that the converter
        represents RNN back-edges explicitly, each RNN state is represented by
        a pair of green arrays:
        *   The activation array that is the source of the RNN back-edge (i.e.
            whose contents are copied into the RNN state array after having
            been computed) is
            <span style="background-color:#b7e1cd;border:1px;border-style:solid;border-color:black;padding:1px">light
            green</span>.
        *   The actual RNN state array is
            <span style="background-color:#0f9d58;color:white;border:1px;border-style:solid;border-color:black;padding:1px">dark
            green</span>. It is the destination of the RNN back-edge updating
            it.