# TensorFlow Lite

TensorFlow Lite is a set of tools that enables on-device machine learning by helping developers run their models on mobile, embedded, and edge devices.

### Key features

- *Optimized for on-device machine learning*, by addressing 5 key constraints: latency (there's no round-trip to a server), privacy (no personal data leaves the device), connectivity (internet connectivity is not required), size (reduced model and binary size), and power consumption (efficient inference and a lack of network connections).
- *Multiple platform support*, covering [Android](../android) and [iOS](ios) devices, [embedded Linux](python), and [microcontrollers](../microcontrollers).
- *Diverse language support*, which includes Java, Swift, Objective-C, C++, and Python.
- *High performance*, with [hardware acceleration](../performance/delegates) and [model optimization](../performance/model_optimization).
- *End-to-end [examples](../examples)*, for common machine learning tasks such as image classification, object detection, pose estimation, question answering, and text classification on multiple platforms.

Key Point: The TensorFlow Lite binary is ~1MB when all 125+ supported operators are linked (for 32-bit ARM builds), and less than 300KB when using only the operators needed for supporting the common image classification models InceptionV3 and MobileNet.

## Development workflow

The following guide walks through each step of the workflow and provides links to further instructions:

Note: Refer to the [performance best practices](../performance/best_practices) guide for an ideal balance of performance, model size, and accuracy.

### 1. Generate a TensorFlow Lite model

A TensorFlow Lite model is represented in an efficient portable format known as [FlatBuffers](https://google.github.io/flatbuffers/){:.external} (identified by the *.tflite* file extension). This provides several advantages over TensorFlow's protocol buffer model format, such as reduced size (small code footprint) and faster inference (data is directly accessed without an extra parsing/unpacking step), which enables TensorFlow Lite to execute efficiently on devices with limited compute and memory resources.

A TensorFlow Lite model can optionally include *metadata* that contains a human-readable model description and machine-readable data for automatic generation of pre- and post-processing pipelines during on-device inference. Refer to [Add metadata](../models/convert/metadata) for more details.

You can generate a TensorFlow Lite model in the following ways:

* **Use an existing TensorFlow Lite model:** Refer to [TensorFlow Lite Examples](../examples) to pick an existing model. *Models may or may not contain metadata.*

* **Create a TensorFlow Lite model:** Use the [TensorFlow Lite Model Maker](../models/modify/model_maker) to create a model with your own custom dataset. *By default, all models contain metadata.*

* **Convert a TensorFlow model into a TensorFlow Lite model:** Use the [TensorFlow Lite Converter](../models/convert/) to convert a TensorFlow model into a TensorFlow Lite model. During conversion, you can apply [optimizations](../performance/model_optimization) such as [quantization](../performance/post_training_quantization) to reduce model size and latency with minimal or no loss in accuracy. *By default, all models don't contain metadata.* (See the sketch after this list.)
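As a rough illustration of the converter path above, the Python API can convert a SavedModel roughly as follows. This is a minimal sketch, not a prescribed workflow: `saved_model_dir` and the output file name are placeholders, and the quantization step is optional.

```python
import tensorflow as tf

# Load a TensorFlow SavedModel; "saved_model_dir" is a placeholder path.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Optional: apply default post-training optimizations (dynamic-range
# quantization) to reduce model size and latency.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Convert to the FlatBuffers-based TensorFlow Lite format.
tflite_model = converter.convert()

# Write the .tflite file to disk for deployment on-device.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```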
### 2. Run Inference

*Inference* refers to the process of executing a TensorFlow Lite model on-device to make predictions based on input data.

You can run inference in the following ways based on the model type:

* **Models *without* metadata**: Use the [TensorFlow Lite Interpreter](inference) API (a minimal Python sketch appears at the end of this page). *Supported on multiple platforms and languages such as Java, Swift, C++, Objective-C, and Python.*

* **Models *with* metadata**: You can either leverage the out-of-box APIs using the [TensorFlow Lite Task Library](../inference_with_metadata/task_library/overview) or build custom inference pipelines with the [TensorFlow Lite Support Library](../inference_with_metadata/lite_support). On Android devices, users can automatically generate code wrappers using the [Android Studio ML Model Binding](../inference_with_metadata/codegen#mlbinding) or the [TensorFlow Lite Code Generator](../inference_with_metadata/codegen#codegen). *Supported only in Java (Android); Swift (iOS) and C++ support is a work in progress.*

On Android and iOS devices, you can improve performance using hardware acceleration. On either platform you can use a [GPU Delegate](../performance/gpu). On Android, you can also use either the [NNAPI Delegate](../android/delegates/nnapi) (for newer devices) or the [Hexagon Delegate](../android/delegates/hexagon) (for older devices), and on iOS you can use the [Core ML Delegate](../performance/coreml_delegate). To add support for new hardware accelerators, you can [define your own delegate](../performance/implementing_delegate).

## Get started

You can refer to the following guides based on your target device:

* **Android and iOS:** Explore the [Android quickstart](../android/quickstart) and [iOS quickstart](ios).

* **Embedded Linux:** Explore the [Python quickstart](python) for embedded devices such as [Raspberry Pi](https://www.raspberrypi.org/){:.external} and [Coral devices with Edge TPU](https://coral.withgoogle.com/){:.external}, or the C++ build instructions for [ARM](build_arm).

* **Microcontrollers:** Explore the [TensorFlow Lite for Microcontrollers](../microcontrollers) library for microcontrollers and DSPs that contain only a few kilobytes of memory.

## Technical constraints

* *Not all TensorFlow models can be converted into TensorFlow Lite models*; refer to [Operator compatibility](ops_compatibility).

* *On-device training is not supported*; however, it is on our [Roadmap](roadmap).
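For completeness, here is what driving the [TensorFlow Lite Interpreter](inference) API from Python can look like, as referenced in step 2 above. This is a minimal sketch only: it assumes a converted `model.tflite` file (the name is illustrative) with a single input and a single output tensor.

```python
import numpy as np
import tensorflow as tf

# Load the converted model and allocate its tensors; the file name is illustrative.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Prepare a dummy input with the shape and dtype the model expects.
input_data = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], input_data)

# Run inference and read back the first output tensor.
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]["index"])
print(output_data.shape)
```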