# Protobuf Editions Design: Features

**Author:** [@haberman](https://github.com/haberman),
[@fowles](https://github.com/fowles)

**Approved:** 2022-10-13

A proposal to use custom options as our way of defining and representing
features.

## Background

The [Protobuf Editions](what-are-protobuf-editions.md) project uses "editions"
to allow Protobuf to safely evolve over time. An edition is formally a set of
"features" with a default value per feature. The set of features or a default
value for a feature can only change with the introduction of a new edition.
Features define the specific points of change and evolution on a per-entity
basis within a .proto file (entities being files, messages, fields, or any other
lexical element in the file). The design in this doc supplants an earlier design
which used strings for feature definition.

Protobuf already supports
[custom options](https://protobuf.dev/programming-guides/proto2#customoptions),
and we will leverage these to provide a rich syntax without introducing new
syntactic forms into Protobuf.

## Sample Usage

Here is a small sample of features in use, to give a flavor of how they look:

```
edition = "2023";

package experimental.users.kfm.editions;

import "net/proto2/proto/features_cpp.proto";

option features.repeated_field_encoding = EXPANDED;
option features.enum = OPEN;
option features.(pb.cpp).string_field_type = STRING;
option features.(pb.cpp).namespace = "kfm::proto_experiments";

message Lab {
  // `Mouse` is open as it inherits the file's value.
  enum Mouse {
    UNKNOWN_MOUSE = 0;
    PINKY = 1;
    THE_BRAIN = 2;
  }
  repeated Mouse mice = 1 [features.repeated_field_encoding = PACKED];

  string name = 2;
  string address = 3 [features.(pb.cpp).string_field_type = CORD];
  string function = 4 [features.(pb.cpp).string_field_type = STRING_VIEW];
}

enum ColorChannel {
  // Override the file-level `features.enum = OPEN` for this enum.
  option features.enum = CLOSED;

  UNKNOWN_COLOR_CHANNEL = 0;
  RED = 1;
  BLUE = 2;
  GREEN = 3;
  ALPHA = 4;
}
```

## Language-Specific Features

We will use extensions to manage features specific to individual code
generators.

```
// In net/proto2/proto/descriptor.proto:
syntax = "proto2";
package proto2;

message Features {
  ...
  extensions 1000;  // for features_cpp.proto
  extensions 1001;  // for features_java.proto
}
```

This will allow third-party code generators to use editions for their own
evolution as long as they reserve a single extension number in
`descriptor.proto`. Using this from a .proto file would look like this:

```
edition = "2023";

import "third_party/protobuf/compiler/cpp/features_cpp.proto";

message Bar {
  optional string str = 1 [features.(pb.cpp).string_field_type = CORD];
}
```

## Inheritance

To support inheritance, we will specify a single `Features` message that is
added as a field to every kind of `*Options` message:

```
// In net/proto2/proto/descriptor.proto:
syntax = "proto2";
package proto2;

message Features {
  ...
}

message FileOptions {
  optional Features features = ..;
}

message MessageOptions {
  optional Features features = ..;
}
// All the other `*Options` protos.
```

At the implementation level, feature inheritance is exactly the behavior of
`MergeFrom`:

```
void InheritFrom(const Features& parent, Features* child) {
  Features tmp(parent);
  tmp.MergeFrom(*child);  // Fields set on the child override the parent's.
  child->Swap(&tmp);
}
```

which means that custom backends will be able to faithfully implement
inheritance without difficulty.
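
To illustrate these merge semantics without depending on the protobuf runtime, here is a minimal, self-contained sketch. `ToyFeatures`, its two fields, and the free function `MergeFrom` are hypothetical stand-ins for the generated `Features` class. `std::optional` models field presence, and the merge copies only fields that are set in the source, which is how `MergeFrom` treats singular scalar fields.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical stand-in for the generated `Features` message.
// std::optional models "field is set" versus "field is unset".
enum class EnumType { kOpen, kClosed };
enum class RepeatedFieldEncoding { kPacked, kExpanded };

struct ToyFeatures {
  std::optional<EnumType> enum_type;
  std::optional<RepeatedFieldEncoding> repeated_field_encoding;
};

// Mimics MergeFrom for singular scalar fields: every field set in `from`
// overwrites the same field in `to`; unset fields leave `to` untouched.
void MergeFrom(const ToyFeatures& from, ToyFeatures* to) {
  if (from.enum_type) to->enum_type = from.enum_type;
  if (from.repeated_field_encoding) {
    to->repeated_field_encoding = from.repeated_field_encoding;
  }
}

// Start from the parent, then let the child's explicit settings win.
void InheritFrom(const ToyFeatures& parent, ToyFeatures* child) {
  assert(child != nullptr);
  ToyFeatures tmp(parent);
  MergeFrom(*child, &tmp);
  std::swap(*child, tmp);
}

// Resolves the features of an enum that overrides one file-level feature.
ToyFeatures ResolveExample() {
  ToyFeatures file;  // file-level options
  file.enum_type = EnumType::kOpen;
  file.repeated_field_encoding = RepeatedFieldEncoding::kExpanded;

  ToyFeatures color_channel;  // sets only the enum feature
  color_channel.enum_type = EnumType::kClosed;

  InheritFrom(file, &color_channel);
  return color_channel;
}
```

After `ResolveExample()`, the enum keeps its own `kClosed` setting and picks up `kExpanded` from the file, since it never set that field itself.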

## Target Attributes

While inheritance can be useful for minimizing changes or pushing defaults
broadly, it can be overused in ways that would make simple refactoring of
`.proto` files harder. Additionally, not all features are meaningful on all
entities (for example `features.enum = OPEN` is meaningless on a field).

To avoid these issues, we will introduce "target" attributes on features
(similar in concept to the "target" attribute on Java annotations).

```
enum FeatureTargetType {
  FILE = 0;
  MESSAGE = 1;
  ENUM = 2;
  FIELD = 3;
  ...
};
```

These will restrict the set of entities to which a feature may be attached.

```
message Features {
  ...

  enum EnumType {
    OPEN = 0;
    CLOSED = 1;
  }
  optional EnumType enum = 2 [
      target = ENUM
  ];
}
```
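
As a rough sketch of how a compiler might enforce such a restriction, the check reduces to a membership test. Everything here is hypothetical: `FeatureSpec` and `MayAttach` are illustrative names, and allowing a feature to list more than one target is an assumption, since the example above shows only a single `target`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the values of FeatureTargetType that are listed above.
enum class FeatureTargetType { kFile, kMessage, kEnum, kField };

// Hypothetical description of one feature: its name and the kinds of
// entity it may be attached to (ASSUMPTION: more than one is allowed).
struct FeatureSpec {
  std::string name;
  std::vector<FeatureTargetType> targets;
};

// Returns true if `feature` may be set on an entity of kind `entity`.
bool MayAttach(const FeatureSpec& feature, FeatureTargetType entity) {
  for (FeatureTargetType t : feature.targets) {
    if (t == entity) return true;
  }
  return false;
}

// Example: the `enum` feature targets enums, so setting it on a field
// would be rejected.
void TargetExample() {
  const FeatureSpec enum_feature{"enum", {FeatureTargetType::kEnum}};
  assert(MayAttach(enum_feature, FeatureTargetType::kEnum));
  assert(!MayAttach(enum_feature, FeatureTargetType::kField));
}
```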

## Retention

To reduce the size of descriptors in protobuf runtimes, features will be
permitted to specify retention rules (again similar in concept to "retention"
attributes on Java annotations).

```
enum FeatureRetention {
  SOURCE = 0;
  RUNTIME = 1;
}
```
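
One way to picture the effect is as a filter applied when a runtime descriptor is built. The sketch below assumes, by analogy with Java annotations, that `SOURCE` features are dropped and `RUNTIME` features are kept. `FeatureSetting` and `StripSourceOnly` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the FeatureRetention enum above.
enum class FeatureRetention { kSource, kRuntime };

// Hypothetical record of one feature value set on an entity.
struct FeatureSetting {
  std::string name;
  std::string value;
  FeatureRetention retention;
};

// Keeps only the settings that must survive into runtime descriptors.
// ASSUMPTION: SOURCE-retention settings are simply omitted at runtime.
std::vector<FeatureSetting> StripSourceOnly(
    const std::vector<FeatureSetting>& settings) {
  std::vector<FeatureSetting> kept;
  for (const FeatureSetting& s : settings) {
    if (s.retention == FeatureRetention::kRuntime) kept.push_back(s);
  }
  return kept;
}

// Example: a RUNTIME feature survives, a SOURCE feature does not.
void RetentionExample() {
  const std::vector<FeatureSetting> all = {
      {"enum", "OPEN", FeatureRetention::kRuntime},
      {"source_only", "X", FeatureRetention::kSource},  // made-up feature
  };
  const std::vector<FeatureSetting> kept = StripSourceOnly(all);
  assert(kept.size() == 1);
  assert(kept[0].name == "enum");
}
```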

## Specification of an Edition

An edition is, effectively, an instance of the `Features` proto which forms the
base for performing inheritance using `MergeFrom`. This allows `protoc` and
specific language generators to leverage existing formats (like text-format) for
specifying the value of features at a given edition.

Field defaults might seem like the natural mechanism, but they do not quite
work, because the default depends on the edition. Instead, we propose adding
the following to the protoc-provided `features.proto`:

```
message Features {
  // ...
  message EditionDefault {
    optional string edition = 1;
    optional string default = 2;  // Textproto value.
  }

  extend FieldOptions {
    // Ideally this is a map, but map extensions are not permitted...
    repeated EditionDefault edition_defaults = 9001;
  }
}
```

To build the edition defaults for a particular edition `current` in the context
of a particular file `foo.proto`, we execute the following algorithm:

1.  Construct a new `Features feats;`.
2.  For each field in `Features`, take the value of the
    `Features.edition_defaults` option (call it `defaults`), and sort it by the
    value of `edition` (per the total order for edition names; see
    [Life of an Edition](life-of-an-edition.md)).
3.  Binary search for the latest edition in `defaults` that is earlier than or
    equal to `current`.
    1.  If the field is of singular, scalar type, use that value as the value of
        the field in `feats`.
    2.  Otherwise, the value of the field in `feats` is given by merging all of
        the values for editions earlier than or equal to `current`, starting
        from the oldest edition.
4.  For the purposes of this algorithm, `Features`'s fields all behave as if
    they were `required`; failure to find a default explicitly via the editions
    default search mechanism should result in a compilation error, because it
    means the file's edition is too old.
5.  For each extension of `Features` that is visible from `foo.proto` via
    imports, perform the same algorithm as above to construct the editions
    default for that extension message, and add it to `feats`.

This algorithm has the following properties:

*   Language-scoped features are discovered via imports, which is how they
    must be made visible to a file in order to be used in the first place.
*   Every value is set explicitly, so we correctly reject too-old files.
*   Files from "the future" will not be rejected out of hand by the algorithm,
    allowing us to provide a flag like `--allow-experimental-editions` for ease
    of allowing backends to implement a new edition.
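
For a singular scalar feature, steps 2 through 4 above can be sketched as follows. This is a minimal illustration, not protoc's implementation. `EditionDefault` is a hypothetical plain struct rather than the generated message, and `EditionLess` assumes that edition names order correctly as plain strings. That holds for four-digit years such as "2023", but it is only a stand-in for the total order defined in Life of an Edition.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical in-memory form of one `EditionDefault` entry.
struct EditionDefault {
  std::string edition;
  std::string value;  // textproto value, e.g. "OPEN"
};

// ASSUMPTION: plain string comparison stands in for the real total order
// on edition names.
bool EditionLess(const std::string& a, const std::string& b) { return a < b; }

// Returns the default from the latest edition that is earlier than or equal
// to `current`, or std::nullopt if there is none. In the latter case the
// caller should report a compilation error (the file's edition is too old).
std::optional<std::string> ResolveDefault(std::vector<EditionDefault> defaults,
                                          const std::string& current) {
  std::sort(defaults.begin(), defaults.end(),
            [](const EditionDefault& a, const EditionDefault& b) {
              return EditionLess(a.edition, b.edition);
            });
  // upper_bound finds the first entry strictly later than `current`;
  // the entry just before it, if any, is the one we want.
  auto it = std::upper_bound(
      defaults.begin(), defaults.end(), current,
      [](const std::string& cur, const EditionDefault& d) {
        return EditionLess(cur, d.edition);
      });
  if (it == defaults.begin()) return std::nullopt;
  return std::prev(it)->value;
}

// Example with a made-up second entry for "2024".
void EditionDefaultExample() {
  const std::vector<EditionDefault> defaults = {{"2024", "CLOSED"},
                                                {"2023", "OPEN"}};
  assert(ResolveDefault(defaults, "2023") == std::optional<std::string>("OPEN"));
  assert(!ResolveDefault(defaults, "2022").has_value());  // too old
}
```

Because the search only looks for the latest known entry at or before `current`, a file from a newer edition still resolves to the most recent default, while a file older than every entry yields no value and should be reported as an error.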

## Edition Zero Features

Putting the parts together, we can offer a potential `Features` message for
edition zero (see [Edition Zero Features](edition-zero-features.md)):

```
message Features {
  enum FieldPresence {
    EXPLICIT = 0;
    IMPLICIT = 1;
    LEGACY_REQUIRED = 2;
  }
  optional FieldPresence field_presence = 1 [
      retention = RUNTIME,
      target = FIELD,
      (edition_defaults) = {
        edition: "2023", default: "EXPLICIT"
      }
  ];

  enum EnumType {
    OPEN = 0;
    CLOSED = 1;
  }
  optional EnumType enum = 2 [
      retention = RUNTIME,
      target = ENUM,
      (edition_defaults) = {
        edition: "2023", default: "OPEN"
      }
  ];

  enum RepeatedFieldEncoding {
    PACKED = 0;
    EXPANDED = 1;
  }
  optional RepeatedFieldEncoding repeated_field_encoding = 3 [
      retention = RUNTIME,
      target = FIELD,
      (edition_defaults) = {
        edition: "2023", default: "PACKED"
      }
  ];

  enum StringFieldValidation {
    REQUIRED = 0;
    HINT = 1;
    SKIP = 2;
  }
  optional StringFieldValidation string_field_validation = 4 [
      retention = RUNTIME,
      target = FIELD,
      (edition_defaults) = {
        edition: "2023", default: "REQUIRED"
      }
  ];

  enum MessageEncoding {
    LENGTH_PREFIXED = 0;
    DELIMITED = 1;
  }
  optional MessageEncoding message_encoding = 5 [
      retention = RUNTIME,
      target = FIELD,
      (edition_defaults) = {
        edition: "2023", default: "LENGTH_PREFIXED"
      }
  ];

  extensions 1000;  // for features_cpp.proto
  extensions 1001;  // for features_java.proto
}
```