Name

    QCOM_motion_estimation

Name Strings

    GL_QCOM_motion_estimation

Contributors

    Jonathan Wicks
    Sam Holmes
    Jeff Leger

Contacts

    Jeff Leger <jleger@qti.qualcomm.com>

Status

    Complete

Version

    Last Modified Date: March 19, 2020
    Revision: 1.0

Number

    OpenGL ES Extension #326

Dependencies

    Requires OpenGL ES 2.0

    This extension is written against the OpenGL ES 3.2 Specification.

    This extension interacts with OES_EGL_image_external.

Overview

    Motion estimation, also referred to as optical flow, is the process of
    producing motion vectors that convey the 2D transformation from a
    reference image to a target image. Motion estimation has various uses,
    such as frame extrapolation, compression, and object tracking.

    This extension adds support for motion estimation in OpenGL ES by adding
    functions which take the reference and target images and populate an
    output texture containing the corresponding motion vectors.
New Procedures and Functions

    void TexEstimateMotionQCOM(uint ref,
                               uint target,
                               uint output);

    void TexEstimateMotionRegionsQCOM(uint ref,
                                      uint target,
                                      uint output,
                                      uint mask);

New Tokens

    Accepted by the <pname> parameter of GetIntegerv, GetInteger64v, and
    GetFloatv:

        MOTION_ESTIMATION_SEARCH_BLOCK_X_QCOM            0x8C90
        MOTION_ESTIMATION_SEARCH_BLOCK_Y_QCOM            0x8C91

Additions to the OpenGL ES 3.2 Specification

    Add two new rows in Table 21.40 "Implementation Dependent Values"

    Get Value                              Type  Get Command  Minimum Value  Description          Sec.
    -------------------------------------  ----  -----------  -------------  -------------------  ----
    MOTION_ESTIMATION_SEARCH_BLOCK_X_QCOM  Z+    GetIntegerv  1              The block size in X  8.19
    MOTION_ESTIMATION_SEARCH_BLOCK_Y_QCOM  Z+    GetIntegerv  1              The block size in Y  8.19

Additions to Chapter 8 of the OpenGL ES 3.2 Specification

    The commands

        void TexEstimateMotionQCOM(uint ref,
                                   uint target,
                                   uint output);

        void TexEstimateMotionRegionsQCOM(uint ref,
                                          uint target,
                                          uint output,
                                          uint mask);

    are called to perform motion estimation based on the contents of the two
    input textures, <ref> and <target>. The results of the motion estimation
    are stored in the <output> texture.

    The <ref> and <target> textures must either be GL_R8 2D textures, or be
    backed by EGLImages whose underlying format contains a luminance plane.
    The <ref> and <target> dimensions must be identical and must be exact
    multiples of the search block size. While <ref> and <target> can have
    multiple levels, the implementation only reads from the base level.

    The resulting motion vectors are stored in a 2D texture <output> of the
    format GL_RGBA16F, ready to be used by other application shaders and
    stages. While <output> can have multiple levels, the implementation only
    writes to the base level.
    The <output> dimensions must be set as follows so that it can hold one
    vector per search block:

        output.width  = ref.width  / MOTION_ESTIMATION_SEARCH_BLOCK_X_QCOM
        output.height = ref.height / MOTION_ESTIMATION_SEARCH_BLOCK_Y_QCOM

    Each texel in the <output> texture represents the estimated motion in
    pixels, for the supported search block size, from the <ref> texture to
    the <target> texture. Implementations may generate sub-pixel motion
    vectors, in which case the returned vector components may have fractional
    values. The motion vector X and Y components are provided in the R and G
    channels respectively. The B and A components are currently undefined and
    left for future expansion. If no motion is detected for a block, or if
    the <mask> texture indicates that the block should be skipped, then the
    R and G channels will be set to zero, indicating no motion.

    The <mask> texture is used to control the region of interest, which can
    help to reduce the overall workload. The <mask> texture dimensions must
    exactly match those of the <output> texture, and its format must be
    GL_R8UI. While <mask> can have multiple levels, the implementation only
    reads from the base level. For any texel with a value of 0 in the <mask>,
    motion estimation will not be performed for the corresponding block. Any
    non-zero texel value will produce a motion vector result in the <output>
    texture. The <mask> only controls the vector basepoint; therefore it is
    possible for an unmasked block to produce a vector that lands in a masked
    block.

Errors

    INVALID_OPERATION is generated if any of the textures passed in are
    invalid.

    INVALID_OPERATION is generated if the texture types are not TEXTURE_2D or
    TEXTURE_EXTERNAL_OES.

    INVALID_OPERATION is generated if <ref> is not of the format GL_R8, or,
    if backed by an EGLImage, the underlying internal format does not contain
    a luminance plane.
    INVALID_OPERATION is generated if <target> is not of the format GL_R8,
    or, if backed by an EGLImage, the underlying internal format does not
    contain a luminance plane.

    INVALID_OPERATION is generated if the <ref> and <target> textures do not
    have identical dimensions.

    INVALID_OPERATION is generated if the <output> texture is not of the
    format GL_RGBA16F.

    INVALID_OPERATION is generated if the <mask> texture is not of the format
    GL_R8UI.

    INVALID_OPERATION is generated if the <output> or <mask> dimensions are
    not ref.[width/height] / MOTION_ESTIMATION_SEARCH_BLOCK_[X,Y]_QCOM.

Interactions with OES_EGL_image_external

    If OES_EGL_image_external is supported, then the <ref> and/or <target>
    parameters to TexEstimateMotionQCOM and TexEstimateMotionRegionsQCOM may
    be backed by an EGLImage.

Issues

    (1) What should the pixel data of the input textures <ref> and <target>
        contain?

    Resolved: Motion estimation tracks the brightness across the input
    textures. To produce the best results, it is recommended that the texels
    in the <ref> and <target> textures represent some measure of the
    luminance/luma. OpenGL ES does not currently expose a Y8 or Y-plane-only
    format, so GL_R8 can be used. Alternatively, a texture backed by an
    EGLImage, which has an underlying format where luminance is contained in
    a separate plane, can also be used. If starting with an RGBA8 texture,
    one way to convert it to GL_R8 would be to perform a copy and use code
    such as the following:

        fragColor = rgb_2_yuv(texture(tex, texcoord).rgb, itu_601_full_range).r;

    (2) Why use GL_RGBA16F instead of GL_RG16F for storing the motion vector
        output?

    Resolved: While only the R and G channels are currently used, it was
    decided to use a format with more channels for future expansion.
    A floating point format was chosen to support implementations with
    sub-pixel precision without enforcing any particular precision
    requirements other than what can be represented in a 16-bit floating
    point number.

    (3) Why is the motion estimation quality not defined?

    Resolved: The intention of this specification is to estimate the motion
    between the two input textures. Implementations should aim to produce the
    highest quality estimations, but since the results are estimations there
    are no prescribed steps for how the vectors must be generated.

Revision History

    Rev.  Date        Author          Changes
    ----  ----------  --------------  ----------------------------------------
    1.0   03/19/2020  Jonathan Wicks  Initial public version