# Editions Feature Visibility

**Authors:** [@mkruskal-google](https://github.com/mkruskal-google)

**Approved:** 2023-09-08

## Background

While [Editions: Life of a FeatureSet](editions-life-of-a-featureset.md) handles
how we propagate features *to* runtimes, what's left under-specified is how the
runtimes should expose features to their users. *Exposing Editions Feature Sets*
(not available externally) was an initial attempt to cover both these topics
(specifically the C++ API section), but much of it has been redesigned since.
This is a much more targeted document laying out how features should be treated
by runtimes.

## Problem Description

There are two main concerns from a runtime's perspective:

1.  **Direct access to resolved features protos** - While runtime decisions
    *should* be made based on the data in these protos, their struct-like nature
    makes them very rigid. Once users start to depend on the proto API, it makes
    it very difficult for us to do internal refactoring. These protos are also
    naturally structured based on how feature *specification* is done in proto
    files, rather than the actual behaviors they represent. This makes it
    difficult to guarantee that complex relationships between features and other
    conditions are being uniformly handled.

2.  **Accidental use of unresolved features** - Unresolved features represent a
    clear foot-gun for users, that could also cause issues for us. Since they
    share the same type as resolved features, it's not always easy to tell the
    two apart. If runtime decisions are made using unresolved features, it's
    very plausible that everything will work as expected in a given edition by
    coincidence. However, when the proto's edition is bumped, it will very
    likely break this code unexpectedly.
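The second concern can be illustrated with a small, self-contained sketch. It is a toy model rather than the protobuf runtime: the edition names, the default values, and the `resolve` helper are all invented for illustration, and real feature resolution also merges features inherited from parent descriptors.

```python
# Hypothetical per-edition defaults. The edition names and values are made up.
EDITION_DEFAULTS = {
    "edition_a": {"field_presence": "EXPLICIT"},
    "edition_b": {"field_presence": "IMPLICIT"},
}


def resolve(edition, unresolved):
    """Merge explicit overrides on top of the edition's defaults (simplified)."""
    resolved = dict(EDITION_DEFAULTS[edition])
    resolved.update(unresolved)
    return resolved


def fragile_is_implicit(unresolved):
    # Reads only what the .proto file spelled out explicitly.
    return unresolved.get("field_presence") == "IMPLICIT"


def robust_is_implicit(edition, unresolved):
    # Reads the fully resolved value.
    return resolve(edition, unresolved)["field_presence"] == "IMPLICIT"


# The same field before and after a behavior-preserving edition bump. In
# edition_b the override became redundant, so the upgrade dropped it.
before = ("edition_a", {"field_presence": "IMPLICIT"})
after = ("edition_b", {})

print(fragile_is_implicit(before[1]), fragile_is_implicit(after[1]))
print(robust_is_implicit(*before), robust_is_implicit(*after))
```

In this sketch the fragile check gives the right answer before the bump only by coincidence and silently flips afterwards, while the check against resolved features is unaffected.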

Some concrete examples to help illustrate these concerns:

*   **Remodeling features** - We've bounced back and forth on how UTF8
    validation should be modeled as a feature. None of the proposals resulted in
    any functional changes, since edition zero preserves all proto2/proto3
    behavior; the question was just about what features should be used to
    control them. While the `.proto` file large-scale change to bump them to the
    next edition containing these changes is unavoidable, we'd like to avoid
    having to update any code simultaneously. If everyone is directly inspecting
    the `utf8_validation` feature, we would need to do both.

*   **Incomplete features** - Looking at a feature like `packed`, it's really
    more of a contextual *suggestion* than a strict rule. If it's set at the
    file level, **all** fields will have the feature even though only packable
    ones will actually respect it. Giving users direct access to this feature
    would be problematic, because they would *also* need to check if it's
    packable before making decisions based on it. Field presence is an even more
    complicated example, where the logic we want people making runtime decisions
    based on is distinct from what's specified in the proto file.

*   **Optimizations** - One of the major considerations in *Exposing Editions
    Feature Sets* (not available externally) was whether or not it would be
    possible to reduce the cost of editions later. Every descriptor is going to
    contain two separate features protos, and it's likely this will end up
    getting expensive as we roll out edition zero. We could decide to optimize
    this by storing them as a custom class with a much more compact memory
    layout. This is similar to other optimizations we've done to descriptor
    classes, where we have the freedom to do so *because* we don't generally
    expose them as protos.

*   **Bumpy Edition Large-scale Change** - The proto team is going to be
    responsible for rolling out the next edition to internal Google repositories
    every year (at least 80% of it per our churn policy). We *expect* that
    people are only making decisions based on resolved features, and therefore
    that Prototiller transformations are behavior-preserving (despite changing
    the unresolved features). If people have easy access to unresolved features
    though, we can expect a lot of Hyrum's law issues slowing down these
    large-scale changes.

## Recommended Solution

We recommend a conservative approach of hiding all `FeatureSet` protos from
public APIs whenever possible. This means that there should be no public
`features()` getter, and that features should be stripped from any descriptor
options. All `options()` getters should have an unset features field. Instead,
helper methods should be provided on the relevant descriptors to encapsulate the
behaviors users care about. This has already been done for edition zero features
(e.g. `has_presence`, `requires_utf8_validation`, etc), and we should continue
this model.

The one notable place where we *can't* completely hide features is in
reflection. Most of our runtimes provide APIs for converting descriptors back to
their original state at runtime (e.g. `CopyTo` and `DebugString` in C++). In
order to give a faithful representation of the original proto file in these
cases, we should include the **unresolved** features here. Given how inefficient
these methods are and how hard the resulting protos are to work with, we expect
misuse of these unresolved features to be rare.

**Note:** While we may need to adjust this approach in the future, this is the
one that gives us the most flexibility to do so. Adding a new API when we have
solid use-cases for it is easy to do. Removing an existing one when we decide we
don't want it has proven to be very difficult.
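As a rough sketch of the shape this recommendation implies, the toy descriptor below keeps resolved features private, exposes behavior-level helpers, strips features from `options()`, and retains the unresolved features only in a reflection-style copy. It is not the protobuf runtime API: the class layout, the dict keys, the `copy_to_proto` name, the set of packable types, and the presence rule are simplifying assumptions.

```python
# Toy model only. Which types count as packable here is an assumption.
PACKABLE_TYPES = {"int32", "int64", "uint32", "uint64", "bool", "float", "double", "enum"}


class FieldDescriptor:
    def __init__(self, name, type_name, repeated, raw_options, resolved_features):
        self.name = name
        self.type_name = type_name
        self.repeated = repeated
        self._raw_options = dict(raw_options)     # as written in the .proto file
        self._resolved = dict(resolved_features)  # never exposed directly

    @property
    def has_presence(self):
        # Simplified rule: repeated fields have no presence, message fields
        # always do, everything else follows the resolved feature.
        if self.repeated:
            return False
        if self.type_name == "message":
            return True
        return self._resolved["field_presence"] != "IMPLICIT"

    @property
    def is_packed(self):
        # The resolved feature alone is not enough: the field must be packable.
        return bool(self.repeated
                    and self.type_name in PACKABLE_TYPES
                    and self._resolved["packed"])

    def options(self):
        # Public options never carry a features entry.
        return {k: v for k, v in self._raw_options.items() if k != "features"}

    def copy_to_proto(self):
        # Reflection keeps the unresolved features to mirror the source file.
        return {"name": self.name, "options": dict(self._raw_options)}


# Both fields inherit the same resolved features, e.g. from a file-level setting.
inherited = {"packed": True, "field_presence": "IMPLICIT"}
tags = FieldDescriptor("tags", "string", True, {}, inherited)
ids = FieldDescriptor("ids", "int32", True,
                      {"deprecated": True, "features": {"packed": True}}, inherited)
count = FieldDescriptor("count", "int32", False, {}, inherited)

print(tags.is_packed, ids.is_packed, count.has_presence)
print(ids.options())
print(ids.copy_to_proto())
```

Callers of this sketch never see a features object, so the packability check and the presence rule live in one place and the internal storage can change without breaking them.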

### Enforcement

While we make the recommendation above, ultimately this decision should be up to
the runtime owners. Outside of Google we can't enforce it, and the cost would be
a worse experience for *their* users (not the entire protobuf ecosystem). Inside
of Google, we should be more diligent about this, since the cost mostly falls on
us.

### μpb

One notable standout here is μpb, which is a runtime *implementation*, but not a
full runtime. Since μpb only provides APIs to the wrapping runtime in a target
language, it's free to expose features anywhere it wants. The wrapping language
should be responsible for stripping them out where appropriate.

#### Pros

*   Prevents any direct access to resolved feature protos

    *   Gives us freedom to do internal refactoring
    *   Allows us to encapsulate more complex relationships
    *   Users don't have to distinguish between resolved/unresolved features

*   Limits access to unresolved features

    *   Accidental usage of these is less likely (especially considering the
        above)

*   This should be easy to loosen in the future if we find a real use-case for
    `features()` getters.

*   More in line with our descriptor APIs, which wrap descriptor protos but
    aren't strictly 1:1 with them. Options are more of an exception here, mostly
    due to the need to expose extensions.

#### Cons

*   There's no precedent for modifying `options()` like this. Up until now it
    represented a faithful clone of what was specified in the proto file.

*   Deciding to loosen this in the future would be a bit awkward for
    `options()`. If we stop stripping it, people will suddenly start seeing a
    new field and Hyrum's law might result in breakages.

*   Requires duplicating high-level feature behaviors across every language. For
    example, `has_presence` will need to be implemented identically in every
    language. We will likely need some kind of conformance test to make sure
    these all agree.

## Considered Alternatives

### Expose Features

This is the simplest implementation, and was the initial approach taken in
prototypes. We would just have public `features()` getters in our descriptor
APIs, and keep the unresolved features in `options()`.

#### Pros

*   Very easy to implement

#### Cons

*   Doesn't solve any of the problems laid out above

*   Difficult to reverse later

### Hide Features in Generated Options

This is a tweak of the recommended solution where we add a hack to the generated
options messages. Instead of just stripping the features out and leaving an
empty field, we could give the `features` fields "package-scoped" visibility
(e.g. access tokens in C++). We would still strip them, but nobody outside of
our runtimes could even access them to see that they're empty. This eliminates
the Hyrum's law concern above.

#### Pros

*   Resolves one of the cons in the recommended approach.

#### Cons

*   We'd have to do this separately for each runtime, meaning specific hacks in
    *every* code generator

*   No clear benefit. This only helps **if** we decide to expose features and
    **if** a bunch of people start depending on the fact that `features` are
    always empty.

### ClangTidy warning Options Features

Similar to the above alternative, but leverages ClangTidy to warn users against
checking `options().features()`.

#### Pros

*   Resolves one of the cons in the recommended approach.

#### Cons

*   Doesn't work in every language

*   Doesn't work in OSS