# Protobuf Editions Design: Features

**Author:** [@haberman](https://github.com/haberman),
[@fowles](https://github.com/fowles)

**Approved:** 2022-10-13

A proposal to use custom options as our way of defining and representing
features.

## Background

The [Protobuf Editions](what-are-protobuf-editions.md) project uses "editions"
to allow Protobuf to safely evolve over time. An edition is formally a set of
"features" with a default value per feature. The set of features or a default
value for a feature can only change with the introduction of a new edition.
Features define the specific points of change and evolution on a per-entity
basis within a `.proto` file (entities being files, messages, fields, or any
other lexical element in the file). The design in this doc supplants an earlier
design which used strings for feature definition.

Protobuf already supports
[custom options](https://protobuf.dev/programming-guides/proto2#customoptions)
and we will leverage these to provide a rich syntax without introducing new
syntactic forms into Protobuf.

## Sample Usage

Here is a small sample usage of features to give a flavor for how it looks:

```
edition = "2023";

package experimental.users.kfm.editions;

import "net/proto2/proto/features_cpp.proto";

option features.repeated_field_encoding = EXPANDED;
option features.enum = OPEN;
option features.(pb.cpp).string_field_type = STRING;
option features.(pb.cpp).namespace = "kfm::proto_experiments";

message Lab {
  // `Mouse` is open as it inherits the file's value.
  enum Mouse {
    UNKNOWN_MOUSE = 0;
    PINKY = 1;
    THE_BRAIN = 2;
  }
  repeated Mouse mice = 1 [features.repeated_field_encoding = PACKED];

  string name = 2;
  string address = 3 [features.(pb.cpp).string_field_type = CORD];
  string function = 4 [features.(pb.cpp).string_field_type = STRING_VIEW];
}

enum ColorChannel {
  // Override the value inherited from the surrounding file.
  option features.enum = CLOSED;

  UNKNOWN_COLOR_CHANNEL = 0;
  RED = 1;
  BLUE = 2;
  GREEN = 3;
  ALPHA = 4;
}
```

## Language-Specific Features

We will use extensions to manage features specific to individual code
generators.

```
// In net/proto2/proto/descriptor.proto:
syntax = "proto2";
package proto2;

message Features {
  ...
  extensions 1000;  // for features_cpp.proto
  extensions 1001;  // for features_java.proto
}
```

This will allow third-party code generators to use editions for their own
evolution as long as they reserve a single extension number in
`descriptor.proto`. Using this from a `.proto` file would look like this:

```
edition = "2023";

import "third_party/protobuf/compiler/cpp/features_cpp.proto";

message Bar {
  optional string str = 1 [features.(pb.cpp).string_field_type = CORD];
}
```

## Inheritance

To support inheritance, we will specify a single `Features` message that is
embedded as a field in every kind of options message:

```
// In net/proto2/proto/descriptor.proto:
syntax = "proto2";
package proto2;

message Features {
  ...
}

message FileOptions {
  optional Features features = ..;
}

message MessageOptions {
  optional Features features = ..;
}
// All the other `*Options` protos.
```

At the implementation level, feature inheritance is exactly the behavior of
`MergeFrom`:

```
// Fills in every feature that `child` leaves unset with the value from
// `parent`; values explicitly set on `child` win.
void InheritFrom(const Features& parent, Features* child) {
  Features tmp(parent);
  tmp.MergeFrom(*child);  // MergeFrom takes a reference, not a pointer.
  child->Swap(&tmp);
}
```

This means that custom backends will be able to faithfully implement
inheritance without difficulty.

## Target Attributes

While inheritance can be useful for minimizing changes or pushing defaults
broadly, it can be overused in ways that would make simple refactoring of
`.proto` files harder. Additionally, not all features are meaningful on all
entities (for example, `features.enum = OPEN` is meaningless on a field).

To avoid these issues, we will introduce "target" attributes on features
(similar in concept to the "target" attribute on Java annotations).

```
enum FeatureTargetType {
  FILE = 0;
  MESSAGE = 1;
  ENUM = 2;
  FIELD = 3;
  ...
};
```

These will restrict the set of entities to which a feature may be attached.

```
message Features {
  ...

  enum EnumType {
    OPEN = 0;
    CLOSED = 1;
  }
  optional EnumType enum = 2 [
    target = ENUM
  ];
}
```

## Retention

To reduce the size of descriptors in protobuf runtimes, features will be
permitted to specify retention rules (again similar in concept to "retention"
attributes on Java annotations).

```
enum FeatureRetention {
  SOURCE = 0;
  RUNTIME = 1;
}
```

## Specification of an Edition

An edition is, effectively, an instance of the `Features` proto which forms the
base for performing inheritance using `MergeFrom`. This allows `protoc` and
specific language generators to leverage existing formats (like text-format) for
specifying the value of features at a given edition.
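
To make the "base for inheritance" idea concrete, the following is a minimal
C++ sketch of how a field's effective features could be resolved by chaining
`InheritFrom` from the edition defaults down through the file, message, and
field. It makes several assumptions: the `Features` struct is a hypothetical
stand-in built on `std::optional` rather than the generated protobuf class, it
models only two features, and `ResolveField` is an illustrative name that does
not appear in this design.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical stand-ins for two fields of the `Features` message.
enum class EnumType { kOpen, kClosed };
enum class RepeatedFieldEncoding { kPacked, kExpanded };

struct Features {
  std::optional<EnumType> enum_type;
  std::optional<RepeatedFieldEncoding> repeated_field_encoding;

  // Mimics proto `MergeFrom` for singular fields: any field that is set in
  // `other` overwrites the corresponding field in `*this`.
  void MergeFrom(const Features& other) {
    if (other.enum_type) enum_type = other.enum_type;
    if (other.repeated_field_encoding) {
      repeated_field_encoding = other.repeated_field_encoding;
    }
  }
};

// Same shape as `InheritFrom` in the Inheritance section: values set on
// `child` win, everything else comes from `parent`.
void InheritFrom(const Features& parent, Features* child) {
  Features tmp(parent);
  tmp.MergeFrom(*child);
  std::swap(*child, tmp);
}

// Resolves the features of a field by inheriting in order:
// edition defaults -> file -> message -> field.
Features ResolveField(const Features& edition_defaults, Features file,
                      Features message, Features field) {
  InheritFrom(edition_defaults, &file);
  InheritFrom(file, &message);
  InheritFrom(message, &field);
  return field;
}
```

Under this sketch, a file that sets `repeated_field_encoding` to `EXPANDED` and
a field that sets it back to `PACKED` resolve to `PACKED` for that field, which
matches the `mice` field in the sample usage.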

Although field defaults might naively seem like the right approach, they do not
quite work, because the default is edition-dependent. Instead, we propose adding
the following to the protoc-provided `features.proto`:

```
message Features {
  // ...
  message EditionDefault {
    optional string edition = 1;
    optional string default = 2;  // Textproto value.
  }

  extend FieldOptions {
    // Ideally this is a map, but map extensions are not permitted...
    repeated EditionDefault edition_defaults = 9001;
  }
}
```

To build the edition defaults for a particular edition `current` in the context
of a particular file `foo.proto`, we execute the following algorithm:

1.  Construct a new `Features feats;`.
2.  For each field in `Features`, take the value of the
    `Features.edition_defaults` option (call it `defaults`), and sort it by the
    value of `edition` (per the total order for edition names; see
    [Life of an Edition](life-of-an-edition.md)).
3.  Binary-search for the latest edition in `defaults` that is earlier than or
    equal to `current`.
    1.  If the field is of singular, scalar type, use that value as the value of
        the field in `feats`.
    2.  Otherwise, the value of the field in `feats` is given by merging all of
        the values less than `current`, starting from the oldest edition.
4.  For the purposes of this algorithm, `Features`'s fields all behave as if
    they were `required`; failure to find a default explicitly via the editions
    default search mechanism should result in a compilation error, because it
    means the file's edition is too old.
5.  For each extension of `Features` that is visible from `foo.proto` via
    imports, perform the same algorithm as above to construct the editions
    default for that extension message, and add it to `feats`.
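
Steps 2 through 4 can be sketched for a single singular, scalar feature as
follows. This is an illustration under stated assumptions rather than the
`protoc` implementation: `EditionDefault` here is a plain struct mirroring the
message above (with `value` in place of `default`, which is a C++ keyword),
`EditionLess` assumes edition names compare lexicographically as a placeholder
for the real total order, and `ResolveDefault` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical mirror of `Features.EditionDefault`.
struct EditionDefault {
  std::string edition;
  std::string value;  // Textproto value, e.g. "OPEN".
};

// Assumption: edition names order lexicographically. The real total order is
// the one defined in "Life of an Edition".
bool EditionLess(const std::string& a, const std::string& b) { return a < b; }

// Returns the default of a singular, scalar feature at edition `current`, or
// std::nullopt when no default is old enough (step 4 turns that case into a
// compilation error).
std::optional<std::string> ResolveDefault(std::vector<EditionDefault> defaults,
                                          const std::string& current) {
  // Step 2: sort the defaults by edition.
  std::sort(defaults.begin(), defaults.end(),
            [](const EditionDefault& a, const EditionDefault& b) {
              return EditionLess(a.edition, b.edition);
            });
  // Step 3: binary-search for the first entry strictly later than `current`;
  // the entry just before it is the latest one that is <= `current`.
  auto it = std::upper_bound(
      defaults.begin(), defaults.end(), current,
      [](const std::string& cur, const EditionDefault& d) {
        return EditionLess(cur, d.edition);
      });
  if (it == defaults.begin()) return std::nullopt;  // Edition is too old.
  return std::prev(it)->value;
}
```

Because the search only looks for the latest entry at or before `current`, an
edition newer than every listed default still resolves to the most recent
default instead of failing.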

This algorithm has the following properties:

*   Language-scoped features are discovered via imports, which is how they
    must be brought into a file in order to be used in the first place.
*   Every value is set explicitly, so we correctly reject too-old files.
*   Files from "the future" will not be rejected out of hand by the algorithm,
    allowing us to provide a flag like `--allow-experimental-editions` for ease
    of allowing backends to implement a new edition.

## Edition Zero Features

Putting the parts together, we can offer a potential `Features` message for
edition zero (see [Edition Zero Features](edition-zero-features.md)):

```
message Features {
  enum FieldPresence {
    EXPLICIT = 0;
    IMPLICIT = 1;
    LEGACY_REQUIRED = 2;
  }
  optional FieldPresence field_presence = 1 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "EXPLICIT"
    }
  ];

  enum EnumType {
    OPEN = 0;
    CLOSED = 1;
  }
  optional EnumType enum = 2 [
    retention = RUNTIME,
    target = ENUM,
    (edition_defaults) = {
      edition: "2023", default: "OPEN"
    }
  ];

  enum RepeatedFieldEncoding {
    PACKED = 0;
    EXPANDED = 1;
  }
  optional RepeatedFieldEncoding repeated_field_encoding = 3 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "PACKED"
    }
  ];

  enum StringFieldValidation {
    REQUIRED = 0;
    HINT = 1;
    SKIP = 2;
  }
  optional StringFieldValidation string_field_validation = 4 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "REQUIRED"
    }
  ];

  enum MessageEncoding {
    LENGTH_PREFIXED = 0;
    DELIMITED = 1;
  }
  optional MessageEncoding message_encoding = 5 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "LENGTH_PREFIXED"
    }
  ];

  extensions 1000;  // for features_cpp.proto
  extensions 1001;  // for features_java.proto
}
```