# Edition Zero: JSON Handling

**Author:** [@mkruskal-google](https://github.com/mkruskal-google)

**Approved:** 2023-05-10

## Background

Today, proto3 fully validates JSON mappings for uniqueness during parsing, while
proto2 takes a best-effort approach and allows cases that don't have a 1:1
mapping. This is laid out in more detail by *JSON Field Name Conflicts* (not
available externally). While we had hoped to unify these before
[Protobuf editions](what-are-protobuf-editions.md) launched, we ended up blocked
by some internal use-cases. This issue is now blocking the editions launch,
since we can't represent this behavior with the current set of
[Edition Zero features](edition-zero-features.md).

## Overview

Today, by default, we transform each field name to a CamelCase name that will
always be valid, but not necessarily unique in JSON. We also support a
`json_name` field option to override this for JSON parsing/serialization. This
allows conflicts to arise where multiple proto fields map to the same JSON
field. Our JSON handling has the following behaviors:

*   All proto messages can be serialized to JSON
    *   Conflicting mappings will produce JSON with duplicate keys
*   All proto messages can be parsed from JSON
    *   Conflicting mappings lead to undefined behavior. While the behavior is
        deterministic in all of the cases we've encountered, it's inconsistent
        across runtimes and unexpected.
*   By default, the Protobuf compiler will fail to parse any proto3 file in
    which JSON conflicts are detected
    *   Disabled by the `deprecated_legacy_json_field_conflicts` option
*   Proto2 files will only fail to parse if both of the conflicting fields have
    `json_name` set
    *   We will still warn for default JSON mapping conflicts if
        `deprecated_legacy_json_field_conflicts` isn't set

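To make the conflict scenario concrete, here is a minimal Python sketch of how two distinct proto field names can end up with the same JSON name. It assumes the default transform drops each underscore and upper-cases the character after it; the helper names (`to_json_name`, `find_json_conflicts`) and the example fields are hypothetical and are not protoc APIs.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def to_json_name(field_name: str) -> str:
    """Drop each underscore and upper-case the character that follows it."""
    result = []
    capitalize_next = False
    for ch in field_name:
        if ch == "_":
            capitalize_next = True
        elif capitalize_next:
            result.append(ch.upper())
            capitalize_next = False
        else:
            result.append(ch)
    return "".join(result)


def find_json_conflicts(fields: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Group proto field names by JSON name; keep groups with more than one field.

    `fields` maps a proto field name to its explicit json_name, or to None
    when the default transform should be used.
    """
    by_json_name = defaultdict(list)
    for name, json_name in fields.items():
        key = json_name if json_name is not None else to_json_name(name)
        by_json_name[key].append(name)
    return {key: names for key, names in by_json_name.items() if len(names) > 1}


# Hypothetical message: two fields collide through the default mapping,
# and a third collides through an explicit json_name override.
conflicts = find_json_conflicts(
    {"foo_bar": None, "fooBar": None, "baz": "fooBar", "qux": None}
)
print(conflicts)  # {'fooBar': ['foo_bar', 'fooBar', 'baz']}
```
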
The goal here is to unify these behaviors into a future-facing feature as part
of edition zero.

## Recommendation

We recommend adding a new `json_format` feature as part of
[Edition Zero features](edition-zero-features.md). The doc will be updated to
reflect the following details.

JSON format can have three possible states:

*   `ALLOW` - By default, fields will be fully validated during proto parsing.
    Any conflicting JSON mappings will trigger protoc errors, guaranteeing
    uniqueness. This will be consistent with the current proto3 behavior. No
    runtime changes are needed, since we allow JSON parsing/serialization.
*   `DISALLOW` - Alternatively, we will ban JSON encoding and disable all
    validation related to JSON mappings. All runtimes will fail to parse or
    serialize any messages to/from JSON when this feature is set on the
    top-level messages. This is a new mode which provides an alternative to
    `LEGACY_BEST_EFFORT` that doesn't involve any schema changes.
*   `LEGACY_BEST_EFFORT` - Fields will be validated for correctness, but not for
    uniqueness. Any conflicting JSON mappings will trigger protoc warnings, but
    no errors. This will be consistent with the current proto2 behavior, or
    proto3 where `deprecated_legacy_json_field_conflicts` is set. Since this is
    undefined behavior we want to get rid of, a parallel effort will attempt to
    remove this later. No runtime changes are needed, since we allow JSON
    parsing/serialization.

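As a rough illustration of the three states, the following Python sketch models how a compiler might report JSON name conflicts under each one, and whether a runtime would accept JSON for a top-level message. The `JsonFormat` enum, function names, and message strings are assumptions made for this example only; they do not describe protoc's actual implementation.

```python
from enum import Enum
from typing import List, Tuple


class JsonFormat(Enum):
    ALLOW = "ALLOW"
    DISALLOW = "DISALLOW"
    LEGACY_BEST_EFFORT = "LEGACY_BEST_EFFORT"


def check_json_conflicts(
    mode: JsonFormat, conflicting_names: List[str]
) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the given conflicting JSON names."""
    messages = [f"JSON name conflict: {name}" for name in conflicting_names]
    if mode is JsonFormat.ALLOW:
        return messages, []  # conflicts are hard errors
    if mode is JsonFormat.LEGACY_BEST_EFFORT:
        return [], messages  # conflicts only produce warnings
    return [], []  # DISALLOW: JSON mapping validation is skipped


def runtime_accepts_json(top_level_mode: JsonFormat) -> bool:
    """Runtimes only need to inspect the top-level message's setting."""
    return top_level_mode is not JsonFormat.DISALLOW


errors, warnings = check_json_conflicts(JsonFormat.LEGACY_BEST_EFFORT, ["fooBar"])
print(errors, warnings)
```
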
Long-term, we want JSON support to be specified at the proto level. For the
migration from proto2/proto3, we will just migrate everything to either `ALLOW`
or `LEGACY_BEST_EFFORT`, depending on the `syntax` and the value of
`deprecated_legacy_json_field_conflicts`.

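One reading of this migration rule, written as a small Python sketch. The mapping is inferred from the state descriptions above (proto3 without the legacy option matches `ALLOW`; everything else matches `LEGACY_BEST_EFFORT`), and the function name is hypothetical.

```python
def migrated_json_format(syntax: str, legacy_json_field_conflicts: bool) -> str:
    """Pick the json_format value that preserves a file's current behavior."""
    if syntax not in ("proto2", "proto3"):
        raise ValueError(f"unexpected syntax: {syntax!r}")
    if syntax == "proto3" and not legacy_json_field_conflicts:
        # proto3 already rejects conflicting JSON mappings at compile time.
        return "ALLOW"
    # proto2, or proto3 with deprecated_legacy_json_field_conflicts set.
    return "LEGACY_BEST_EFFORT"


for syntax in ("proto2", "proto3"):
    for legacy in (False, True):
        print(syntax, legacy, migrated_json_format(syntax, legacy))
```
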
We will additionally ban any `ALLOW` message from containing a `DISALLOW` type
anywhere in its tree, including extensions. Any such dependency will result in
a compiler error. This has the following benefits:

*   The implementation is a lot simpler, since most of the work is done in
    protoc and parsers only need to check the top level message
*   Runtime failures aren't dependent on the contents of the message being
    serialized/parsed
*   Avoids messy blurring of ownership. If a bug occurs because a `DISALLOW`
    field is sometimes set, is the owner of the child type required to change it
    to `ALLOW`? Or is the owner of the parent type responsible because they
    added the dependency?

    `LEGACY_BEST_EFFORT` will continue to allow serialization/parsing of types
    with `DISALLOW` set.

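The containment rule can be sketched as a graph walk over message types. The Python below is an illustrative model, not protoc code: `children` is a hypothetical map from each message type to the message types of its fields and extensions, and the sketch assumes the walk continues through `LEGACY_BEST_EFFORT` types, since the rule says "anywhere in its tree".

```python
from typing import Dict, List, Optional


def find_disallow_path(
    root: str, children: Dict[str, List[str]], json_format: Dict[str, str]
) -> Optional[List[str]]:
    """Return a path from `root` to a reachable DISALLOW type, or None."""
    stack = [[root]]
    seen = {root}  # guards against recursive message definitions
    while stack:
        path = stack.pop()
        for child in children.get(path[-1], []):
            if json_format.get(child) == "DISALLOW":
                return path + [child]
            if child not in seen:
                seen.add(child)
                stack.append(path + [child])
    return None


def validate(children: Dict[str, List[str]], json_format: Dict[str, str]) -> List[str]:
    """Report every ALLOW type that can reach a DISALLOW type."""
    errors = []
    for name, mode in json_format.items():
        if mode != "ALLOW":
            continue  # LEGACY_BEST_EFFORT and DISALLOW roots are not checked
        path = find_disallow_path(name, children, json_format)
        if path is not None:
            errors.append(" -> ".join(path))
    return errors


# Hypothetical schema: Request and Payload reference each other.
children = {
    "Request": ["Payload"],
    "Payload": ["Internal", "Request"],
    "Legacy": ["Internal"],
}
json_format = {
    "Request": "ALLOW",
    "Payload": "ALLOW",
    "Internal": "DISALLOW",
    "Legacy": "LEGACY_BEST_EFFORT",
}
print(validate(children, json_format))
# ['Request -> Payload -> Internal', 'Payload -> Internal']
```
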
This feature will target messages and enums, but we will also provide it at the
file level for convenience.

Example use-cases for `DISALLOW`:

*   https://github.com/protocolbuffers/protobuf/issues/12525
*   Some projects generate proto descriptors at runtime and use underscores to
    disambiguate field names. They never use JSON format with these protos, but
    currently have to work around our conflict checks

## Alternatives

### Dual State

Instead of a tri-state feature, we could have a simple allow/disallow feature
for JSON format.

#### Pros

*   Simpler conceptually

#### Cons

*   We would end up blocked by many of the protos that we were unable to migrate
    as part of *JSON Field Name Conflicts* (not available externally). While
    some of them could be migrated to `DISALLOW`, others are actually
    **depending** on our current behavior under JSON mapping conflicts (as a
    hack around some limitations in JSON customization).

### Default to DISALLOW

Instead of defaulting to `ALLOW`, we could default to `DISALLOW`.

#### Pros

The majority of internal Google protos are used for binary/text encoding and
don't care about JSON, so this would:

*   Be less noisy for teams who forget to explicitly set `DISALLOW` and may have
    fields with conflicting JSON mappings
*   Decrease our support surface

#### Cons

*   We would need to figure out where `DISALLOW` can be added

### Do Nothing

#### Pros

*   Short-term it saves some trouble and keeps edition zero simpler

#### Cons

*   We'll eventually hit the same issues we hit in *JSON Field Name Conflicts*
    (not available externally)
*   The current proto2/proto3 behaviors are mutually exclusive. There's nothing
    we can migrate to in today's edition zero that won't risk breaking one of
    them.
