# Edition Zero: JSON Handling

**Author:** [@mkruskal-google](https://github.com/mkruskal-google)

**Approved:** 2023-05-10

## Background

Today, proto3 fully validates JSON mappings for uniqueness during parsing, while
proto2 takes a best-effort approach and allows cases that don't have a 1:1
mapping. This is laid out in more detail by *JSON Field Name Conflicts* (not
available externally). While we had hoped to unify these before
[Protobuf editions](what-are-protobuf-editions.md) launched, we ended up blocked
by some internal use-cases. This issue is now blocking the editions launch,
since we can't represent this behavior with the current set of
[Edition Zero features](edition-zero-features.md).

## Overview

Today, by default, we transform each field name to a CamelCase name that will
always be valid, but not necessarily unique in JSON. We also support a
`json_name` field option to override this for JSON parsing/serialization. This
allows conflicts to arise where multiple proto fields map to the same JSON
field. Our JSON handling has the following behaviors:

*   All proto messages can be serialized to JSON
    *   Conflicting mappings will produce JSON with duplicate keys
*   All proto messages can be parsed from JSON
    *   Conflicting mappings lead to undefined behavior. While the behavior is
        deterministic in all of the cases we've encountered, it's inconsistent
        across runtimes and unexpected.
*   By default, the Protobuf compiler will fail to parse any proto3 file in
    which JSON conflicts are detected
    *   Disabled by the `deprecated_legacy_json_field_conflicts` option
*   Proto2 files will only fail to parse if both of the conflicting fields have
    `json_name` set
    *   We will still warn for default JSON mapping conflicts if
        `deprecated_legacy_json_field_conflicts` isn't set

The goal here is to unify these behaviors into a future-facing feature as part
of edition zero.
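
To make the conflict cases above concrete, here is a minimal Python sketch of how two fields can end up with the same JSON name. It only approximates the default mapping (drop each underscore and capitalize the character that follows), and it is not protoc's implementation. The helper names `default_json_name` and `find_json_conflicts` and the example field names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def default_json_name(field_name: str) -> str:
    """Approximate the default mapping: drop '_' and capitalize the next character."""
    out = []
    capitalize_next = False
    for ch in field_name:
        if ch == "_":
            capitalize_next = True
        elif capitalize_next:
            out.append(ch.upper())
            capitalize_next = False
        else:
            out.append(ch)
    return "".join(out)


def find_json_conflicts(fields: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Group proto fields by effective JSON name and keep groups with >1 field.

    `fields` maps a proto field name to its explicit json_name override,
    or None when the default mapping applies.
    """
    by_json_name = defaultdict(list)
    for name, override in fields.items():
        effective = override if override is not None else default_json_name(name)
        by_json_name[effective].append(name)
    return {k: v for k, v in by_json_name.items() if len(v) > 1}


# Hypothetical message: two fields collapse to "fooBar" through the default
# mapping, and a third collides through an explicit json_name override.
example_fields = {"foo_bar": None, "fooBar": None, "baz": "fooBar", "qux": None}
conflicts = find_json_conflicts(example_fields)
```

In this sketch `conflicts` holds a single group keyed by `fooBar`. A strict check, as proto3 does today, would reject the file when any such group exists, while a best-effort check would only warn.
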

## Recommendation

We recommend adding a new `json_format` feature as part of
[Edition Zero features](edition-zero-features.md). The doc will be updated to
reflect the following details.

JSON format can have three possible states:

*   `ALLOW` - By default, fields will be fully validated during proto parsing.
    Any conflicting JSON mappings will trigger protoc errors, guaranteeing
    uniqueness. This will be consistent with the current proto3 behavior. No
    runtime changes are needed, since we allow JSON parsing/serialization.
*   `DISALLOW` - Alternatively, we will ban JSON encoding and disable all
    validation related to JSON mappings. All runtimes will fail to parse or
    serialize any messages to/from JSON when this feature is set on the
    top-level messages. This is a new mode which provides an alternative to
    `LEGACY_BEST_EFFORT` that doesn't involve any schema changes.
*   `LEGACY_BEST_EFFORT` - Fields will be validated for correctness, but not for
    uniqueness. Any conflicting JSON mappings will trigger protoc warnings, but
    no errors. This will be consistent with the current proto2 behavior, or
    proto3 where `deprecated_legacy_json_field_conflicts` is set. Since this is
    undefined behavior we want to get rid of, a parallel effort will attempt to
    remove this later. No runtime changes are needed, since we allow JSON
    parsing/serialization.

Long-term, we want JSON support to be specified at the proto level. For the
migration from proto2/proto3, we will just migrate everything to `ALLOW` or
`LEGACY_BEST_EFFORT`, depending on the `syntax` and the value of
`deprecated_legacy_json_field_conflicts`.

We will additionally ban any `ALLOW` message from containing a `DISALLOW` type
anywhere in its tree (including extensions, which will fail to compile).
Attempting to add this will result in a compiler error.
This has the following benefits:

*   The implementation is a lot simpler, since most of the work is done in
    protoc and parsers only need to check the top-level message
*   Runtime failures aren't dependent on the contents of the message being
    serialized/parsed
*   Avoids messy blurring of ownership. If a bug occurs because a `DISALLOW`
    field is sometimes set, is the owner of the child type required to change it
    to `ALLOW`? Or is the owner of the parent type responsible because they
    added the dependency?

    `LEGACY_BEST_EFFORT` will continue to allow serialization/parsing of types
    with `DISALLOW` set.

This feature will target messages and enums, but we will also provide it at the
file level for convenience.

Example use-cases for `DISALLOW`:

*   https://github.com/protocolbuffers/protobuf/issues/12525
*   Some projects generate proto descriptors at runtime and use underscores to
    disambiguate field names. They never use JSON format with these protos, but
    currently have to work around our conflict checks

## Alternatives

### Dual State

Instead of a tri-state feature, we could have a simple allow/disallow feature
for JSON format.

#### Pros

*   Simpler conceptually

#### Cons

*   We would end up blocked by many of the protos that we were unable to migrate
    as part of *JSON Field Name Conflicts* (not available externally). While
    some of them could be migrated to `DISALLOW`, others are actually
    **depending** on our current behavior under JSON mapping conflicts (as a
    hack around some limitations in JSON customization).

### Default to DISALLOW

Instead of defaulting to `ALLOW`, we could default to `DISALLOW`.

#### Pros

The majority of internal Google protos are used for binary/text encoding and
don't care about JSON, so this would:

*   Be less noisy for teams who forget to explicitly set `DISALLOW` and may have
    fields with conflicting JSON mappings
*   Decrease our support surface

#### Cons

*   We would need to figure out where `DISALLOW` can be added

### Do Nothing

#### Pros

*   Short-term it saves some trouble and keeps edition zero simpler

#### Cons

*   We'll eventually hit the same issues we hit in *JSON Field Name Conflicts*
    (not available externally)
*   The current proto2/proto3 behaviors are mutually exclusive. There's nothing
    we can migrate to in today's edition zero that won't risk breaking one of
    them.
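
For contrast with that last point, the migration described under Recommendation reduces to a small lookup once the tri-state feature exists. The Python sketch below is one reading of that rule, not the migration tooling. The names `JsonFormat` and `migrated_json_format` are hypothetical, and it assumes proto2 files always land on `LEGACY_BEST_EFFORT` regardless of the flag.

```python
from enum import Enum


class JsonFormat(Enum):
    ALLOW = "ALLOW"
    DISALLOW = "DISALLOW"
    LEGACY_BEST_EFFORT = "LEGACY_BEST_EFFORT"


def migrated_json_format(syntax: str, legacy_conflicts_flag: bool) -> JsonFormat:
    """Pick the json_format value that preserves a file's current behavior.

    `legacy_conflicts_flag` stands for deprecated_legacy_json_field_conflicts.
    """
    if syntax == "proto3" and not legacy_conflicts_flag:
        # proto3 already enforces unique JSON mappings, which matches ALLOW.
        return JsonFormat.ALLOW
    if syntax in ("proto2", "proto3"):
        # Assumption: proto2, and proto3 with the legacy flag set, keep
        # best-effort validation.
        return JsonFormat.LEGACY_BEST_EFFORT
    raise ValueError(f"unexpected syntax: {syntax!r}")
```

No migrated file maps to `DISALLOW` here. Under the proposal it is a new mode that owners would opt into after migration.
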