# -*- coding: utf-8 -*-
# Copyright 2012 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Additional help about object metadata."""

from __future__ import absolute_import

from gslib.help_provider import HelpProvider

_DETAILED_HELP_TEXT = ("""
<B>OVERVIEW OF METADATA</B>
  Objects can have associated metadata, which controls aspects of how
  GET requests are handled, including Content-Type, Cache-Control,
  Content-Disposition, and Content-Encoding (discussed in more detail in
  the subsections below). In addition, you can set custom metadata that
  can be used by applications (e.g., tagging that particular objects possess
  some property).

  There are two ways to set metadata on objects:

  - At upload time you can specify one or more headers to associate with
    objects, using the gsutil -h option.  For example, the following command
    would cause gsutil to set the Content-Type and Cache-Control for each
    of the files being uploaded:

      gsutil -h "Content-Type:text/html" \\
             -h "Cache-Control:public, max-age=3600" cp -r images \\
             gs://bucket/images

    Note that -h is an option on the gsutil command, not the cp sub-command.

  - You can set or remove metadata fields from already uploaded objects using
    the gsutil setmeta command. See "gsutil help setmeta".

  More details about specific pieces of metadata are discussed below.


<B>CONTENT TYPE</B>
  The most commonly set metadata is Content-Type (also known as MIME type),
  which allows browsers to render the object properly.
  gsutil sets the Content-Type automatically at upload time, based on each
  filename extension. For example, uploading files with names ending in .txt
  will set Content-Type to text/plain. If you're running gsutil on Linux or
  MacOS and would prefer to have content type set based on naming plus content
  examination, see the use_magicfile configuration variable in the gsutil/boto
  configuration file (see also "gsutil help config"). In general, using
  use_magicfile is more robust and configurable, but is not available on
  Windows.

  If you specify a Content-Type header with -h when uploading content (like the
  example gsutil command given in the previous section), it overrides the
  Content-Type that would have been set based on filename extension or content.
  This can be useful if the Content-Type detection algorithm doesn't work as
  desired for some of your files.

  You can also completely suppress content type detection in gsutil, by
  specifying an empty string on the Content-Type header:

    gsutil -h 'Content-Type:' cp -r images gs://bucket/images

  In this case, the Google Cloud Storage service will not attempt to detect
  the content type. In general this approach will work better than using
  filename extension-based content detection in gsutil, because the list of
  filename extensions is kept more current in the server-side content detection
  system than in the Python library upon which gsutil content type detection
  depends. (For example, at the time of writing this, the filename extension
  ".webp" was recognized by the server-side content detection system, but
  not by gsutil.)


<B>CACHE-CONTROL</B>
  Another commonly set piece of metadata is Cache-Control, which allows
  you to control whether and for how long browser and Internet caches are
  allowed to cache your objects. Cache-Control only applies to objects with
  a public-read ACL. Non-public data are not cacheable.

  Here's an example of uploading a set of objects to allow caching:

    gsutil -h "Cache-Control:public,max-age=3600" cp -a public-read \\
           -r html gs://bucket/html

  This command would upload all files in the html directory (and subdirectories)
  and make them publicly readable and cacheable, with cache expiration of
  one hour.

  Note that if you allow caching, at download time you may see older versions
  of objects after uploading a newer replacement object. Note also that because
  objects can be cached at various places on the Internet, there is no way to
  force a cached object to expire globally (unlike the way you can force your
  browser to refresh its cache). If you want to prevent caching of publicly
  readable objects, set a Cache-Control:private header on the object.
  You can do this with a command such as:

    gsutil -h Cache-Control:private cp -a public-read file.png gs://your-bucket

  Another use of the Cache-Control header is through the "no-transform" value,
  which instructs Google Cloud Storage not to apply any content transformations
  based on specifics of a download request, such as removing gzip
  content-encoding for incompatible clients.  Note that this parameter is only
  respected by the XML API. The Google Cloud Storage JSON API respects only the
  no-cache and max-age Cache-Control parameters.

  For details about how to set the Cache-Control header see
  "gsutil help setmeta".


<B>CONTENT-ENCODING</B>
  You can specify a Content-Encoding to indicate that an object is compressed
  (for example, with gzip compression) while maintaining its Content-Type.
  You will need to ensure that the files have been compressed using the
  specified Content-Encoding before using gsutil to upload them. Consider the
  following example for Linux:

    echo "Highly compressible text" | gzip > foo.txt
    gsutil -h "Content-Encoding:gzip" -h "Content-Type:text/plain" \\
      cp foo.txt gs://bucket/compressed

  Note that this is different from uploading a gzipped object foo.txt.gz with
  Content-Type: application/x-gzip because most browsers are able to
  dynamically decompress and process objects served with Content-Encoding: gzip
  based on the underlying Content-Type.

  For compressible content, using Content-Encoding: gzip saves network and
  storage costs, and improves content serving performance. However, for content
  that is already inherently compressed (archives and many media formats, for
  instance) applying another level of compression via Content-Encoding is
  typically detrimental to both object size and performance and should be
  avoided.

  Note also that gsutil provides an easy way to cause content to be compressed
  and stored with Content-Encoding: gzip: see the -z option in "gsutil help cp".


<B>CONTENT-DISPOSITION</B>
  You can set Content-Disposition on your objects, to specify presentation
  information about the data being transmitted. Here's an example:

    gsutil -h 'Content-Disposition:attachment; filename=filename.ext' \\
      cp -r attachments gs://bucket/attachments

  Setting the Content-Disposition allows you to control presentation style
  of the content, for example determining whether an attachment should be
  displayed automatically or should require some form of action from the user
  to open it.  See
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.5.1 for more
  details about the meaning of Content-Disposition.


<B>CUSTOM METADATA</B>
  You can add your own custom metadata (e.g., for use by your application)
  to an object by setting a header that starts with "x-goog-meta-", for example:

    gsutil -h x-goog-meta-reviewer:jane cp mycode.java gs://bucket/reviews

  You can add multiple differently named custom metadata fields to each object.


<B>SETTABLE FIELDS; FIELD VALUES</B>
  You can't set some metadata fields, such as ETag and Content-Length. The
  fields you can set are:

  - Cache-Control
  - Content-Disposition
  - Content-Encoding
  - Content-Language
  - Content-MD5
  - Content-Type
  - Any field starting with a matching Cloud Storage Provider
    prefix, such as x-goog-meta- (i.e., custom metadata).

  Header names are case-insensitive.

  x-goog-meta- fields can have data set to arbitrary Unicode values. All
  other fields must have ASCII values.


<B>VIEWING CURRENTLY SET METADATA</B>
  You can see what metadata is currently set on an object by using:

    gsutil ls -L gs://the_bucket/the_object
""")
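

# Illustrative sketch only, not gsutil's actual validation logic: a small
# helper that mirrors the rules listed under "SETTABLE FIELDS; FIELD VALUES"
# in the help text above. The names _SETTABLE_FIELDS, _CUSTOM_METADATA_PREFIX
# and IsSettableHeader are hypothetical, and x-goog-meta- is assumed to be the
# only provider prefix in play.
_SETTABLE_FIELDS = frozenset([
    'cache-control', 'content-disposition', 'content-encoding',
    'content-language', 'content-md5', 'content-type'])
_CUSTOM_METADATA_PREFIX = 'x-goog-meta-'


def IsSettableHeader(name, value):
  """Returns True if the help text above would allow setting this header."""
  lowered = name.lower()  # Header names are case-insensitive.
  if lowered.startswith(_CUSTOM_METADATA_PREFIX):
    return True  # Custom metadata may carry arbitrary Unicode values.
  if lowered not in _SETTABLE_FIELDS:
    return False  # For example, ETag and Content-Length cannot be set.
  # All other settable fields must have ASCII values.
  return all(ord(char) < 128 for char in value)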


class CommandOptions(HelpProvider):
  """Additional help about object metadata."""

  # Help specification. See help_provider.py for documentation.
  help_spec = HelpProvider.HelpSpec(
      help_name='metadata',
      help_name_aliases=[
          'cache-control', 'caching', 'content type', 'mime type', 'mime',
          'type'],
      help_type='additional_help',
      help_one_line_summary='Working With Object Metadata',
      help_text=_DETAILED_HELP_TEXT,
      subcommand_help_text={},
  )
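

# Illustrative sketch only: the CONTENT TYPE section of the help text says
# gsutil picks a Content-Type from each filename extension. Assuming the
# standard-library mimetypes module performs that lookup (an assumption;
# gsutil's real code path may differ), the behavior could be approximated as
# below. GuessContentType is a hypothetical name.
import mimetypes


def GuessContentType(filename):
  """Returns a MIME type guessed from the filename extension, or None."""
  content_type, _ = mimetypes.guess_type(filename)
  return content_type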