1:mod:`!csv` --- CSV File Reading and Writing 2============================================ 3 4.. module:: csv 5 :synopsis: Write and read tabular data to and from delimited files. 6 7.. sectionauthor:: Skip Montanaro <skip.montanaro@gmail.com> 8 9**Source code:** :source:`Lib/csv.py` 10 11.. index:: 12 single: csv 13 pair: data; tabular 14 15-------------- 16 17The so-called CSV (Comma Separated Values) format is the most common import and 18export format for spreadsheets and databases. CSV format was used for many 19years prior to attempts to describe the format in a standardized way in 20:rfc:`4180`. The lack of a well-defined standard means that subtle differences 21often exist in the data produced and consumed by different applications. These 22differences can make it annoying to process CSV files from multiple sources. 23Still, while the delimiters and quoting characters vary, the overall format is 24similar enough that it is possible to write a single module which can 25efficiently manipulate such data, hiding the details of reading and writing the 26data from the programmer. 27 28The :mod:`csv` module implements classes to read and write tabular data in CSV 29format. It allows programmers to say, "write this data in the format preferred 30by Excel," or "read data from this file which was generated by Excel," without 31knowing the precise details of the CSV format used by Excel. Programmers can 32also describe the CSV formats understood by other applications or define their 33own special-purpose CSV formats. 34 35The :mod:`csv` module's :class:`reader` and :class:`writer` objects read and 36write sequences. Programmers can also read and write data in dictionary form 37using the :class:`DictReader` and :class:`DictWriter` classes. 38 39.. seealso:: 40 41 :pep:`305` - CSV File API 42 The Python Enhancement Proposal which proposed this addition to Python. 43 44 45.. _csv-contents: 46 47Module Contents 48--------------- 49 50The :mod:`csv` module defines the following functions: 51 52 53.. index:: 54 single: universal newlines; csv.reader function 55 56.. function:: reader(csvfile, dialect='excel', **fmtparams) 57 58 Return a :ref:`reader object <reader-objects>` that will process 59 lines from the given *csvfile*. A csvfile must be an iterable of 60 strings, each in the reader's defined csv format. 61 A csvfile is most commonly a file-like object or list. 62 If *csvfile* is a file object, 63 it should be opened with ``newline=''``. [1]_ An optional 64 *dialect* parameter can be given which is used to define a set of parameters 65 specific to a particular CSV dialect. It may be an instance of a subclass of 66 the :class:`Dialect` class or one of the strings returned by the 67 :func:`list_dialects` function. The other optional *fmtparams* keyword arguments 68 can be given to override individual formatting parameters in the current 69 dialect. For full details about the dialect and formatting parameters, see 70 section :ref:`csv-fmt-params`. 71 72 Each row read from the csv file is returned as a list of strings. No 73 automatic data type conversion is performed unless the ``QUOTE_NONNUMERIC`` format 74 option is specified (in which case unquoted fields are transformed into floats). 75 76 A short usage example:: 77 78 >>> import csv 79 >>> with open('eggs.csv', newline='') as csvfile: 80 ... spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|') 81 ... for row in spamreader: 82 ... print(', '.join(row)) 83 Spam, Spam, Spam, Spam, Spam, Baked Beans 84 Spam, Lovely Spam, Wonderful Spam 85 86 87.. function:: writer(csvfile, dialect='excel', **fmtparams) 88 89 Return a writer object responsible for converting the user's data into delimited 90 strings on the given file-like object. *csvfile* can be any object with a 91 :meth:`~io.TextIOBase.write` method. If *csvfile* is a file object, it should be opened with 92 ``newline=''`` [1]_. An optional *dialect* 93 parameter can be given which is used to define a set of parameters specific to a 94 particular CSV dialect. It may be an instance of a subclass of the 95 :class:`Dialect` class or one of the strings returned by the 96 :func:`list_dialects` function. The other optional *fmtparams* keyword arguments 97 can be given to override individual formatting parameters in the current 98 dialect. For full details about dialects and formatting parameters, see 99 the :ref:`csv-fmt-params` section. To make it 100 as easy as possible to interface with modules which implement the DB API, the 101 value :const:`None` is written as the empty string. While this isn't a 102 reversible transformation, it makes it easier to dump SQL NULL data values to 103 CSV files without preprocessing the data returned from a ``cursor.fetch*`` call. 104 All other non-string data are stringified with :func:`str` before being written. 105 106 A short usage example:: 107 108 import csv 109 with open('eggs.csv', 'w', newline='') as csvfile: 110 spamwriter = csv.writer(csvfile, delimiter=' ', 111 quotechar='|', quoting=csv.QUOTE_MINIMAL) 112 spamwriter.writerow(['Spam'] * 5 + ['Baked Beans']) 113 spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam']) 114 115 116.. function:: register_dialect(name[, dialect[, **fmtparams]]) 117 118 Associate *dialect* with *name*. *name* must be a string. The 119 dialect can be specified either by passing a sub-class of :class:`Dialect`, or 120 by *fmtparams* keyword arguments, or both, with keyword arguments overriding 121 parameters of the dialect. For full details about dialects and formatting 122 parameters, see section :ref:`csv-fmt-params`. 123 124 125.. function:: unregister_dialect(name) 126 127 Delete the dialect associated with *name* from the dialect registry. An 128 :exc:`Error` is raised if *name* is not a registered dialect name. 129 130 131.. function:: get_dialect(name) 132 133 Return the dialect associated with *name*. An :exc:`Error` is raised if 134 *name* is not a registered dialect name. This function returns an immutable 135 :class:`Dialect`. 136 137.. function:: list_dialects() 138 139 Return the names of all registered dialects. 140 141 142.. function:: field_size_limit([new_limit]) 143 144 Returns the current maximum field size allowed by the parser. If *new_limit* is 145 given, this becomes the new limit. 146 147 148The :mod:`csv` module defines the following classes: 149 150.. class:: DictReader(f, fieldnames=None, restkey=None, restval=None, \ 151 dialect='excel', *args, **kwds) 152 153 Create an object that operates like a regular reader but maps the 154 information in each row to a :class:`dict` whose keys are given by the 155 optional *fieldnames* parameter. 156 157 The *fieldnames* parameter is a :term:`sequence`. If *fieldnames* is 158 omitted, the values in the first row of file *f* will be used as the 159 fieldnames and will be omitted from the results. If 160 *fieldnames* is provided, they will be used and the first row will be 161 included in the results. Regardless of how the fieldnames are determined, 162 the dictionary preserves their original ordering. 163 164 If a row has more fields than fieldnames, the remaining data is put in a 165 list and stored with the fieldname specified by *restkey* (which defaults 166 to ``None``). If a non-blank row has fewer fields than fieldnames, the 167 missing values are filled-in with the value of *restval* (which defaults 168 to ``None``). 169 170 All other optional or keyword arguments are passed to the underlying 171 :class:`reader` instance. 172 173 If the argument passed to *fieldnames* is an iterator, it will be coerced to a :class:`list`. 174 175 .. versionchanged:: 3.6 176 Returned rows are now of type :class:`OrderedDict`. 177 178 .. versionchanged:: 3.8 179 Returned rows are now of type :class:`dict`. 180 181 A short usage example:: 182 183 >>> import csv 184 >>> with open('names.csv', newline='') as csvfile: 185 ... reader = csv.DictReader(csvfile) 186 ... for row in reader: 187 ... print(row['first_name'], row['last_name']) 188 ... 189 Eric Idle 190 John Cleese 191 192 >>> print(row) 193 {'first_name': 'John', 'last_name': 'Cleese'} 194 195 196.. class:: DictWriter(f, fieldnames, restval='', extrasaction='raise', \ 197 dialect='excel', *args, **kwds) 198 199 Create an object which operates like a regular writer but maps dictionaries 200 onto output rows. The *fieldnames* parameter is a :mod:`sequence 201 <collections.abc>` of keys that identify the order in which values in the 202 dictionary passed to the :meth:`~csvwriter.writerow` method are written to file 203 *f*. The optional *restval* parameter specifies the value to be 204 written if the dictionary is missing a key in *fieldnames*. If the 205 dictionary passed to the :meth:`~csvwriter.writerow` method contains a key not found in 206 *fieldnames*, the optional *extrasaction* parameter indicates what action to 207 take. 208 If it is set to ``'raise'``, the default value, a :exc:`ValueError` 209 is raised. 210 If it is set to ``'ignore'``, extra values in the dictionary are ignored. 211 Any other optional or keyword arguments are passed to the underlying 212 :class:`writer` instance. 213 214 Note that unlike the :class:`DictReader` class, the *fieldnames* parameter 215 of the :class:`DictWriter` class is not optional. 216 217 If the argument passed to *fieldnames* is an iterator, it will be coerced to a :class:`list`. 218 219 A short usage example:: 220 221 import csv 222 223 with open('names.csv', 'w', newline='') as csvfile: 224 fieldnames = ['first_name', 'last_name'] 225 writer = csv.DictWriter(csvfile, fieldnames=fieldnames) 226 227 writer.writeheader() 228 writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'}) 229 writer.writerow({'first_name': 'Lovely', 'last_name': 'Spam'}) 230 writer.writerow({'first_name': 'Wonderful', 'last_name': 'Spam'}) 231 232 233.. class:: Dialect 234 235 The :class:`Dialect` class is a container class whose attributes contain 236 information for how to handle doublequotes, whitespace, delimiters, etc. 237 Due to the lack of a strict CSV specification, different applications 238 produce subtly different CSV data. :class:`Dialect` instances define how 239 :class:`reader` and :class:`writer` instances behave. 240 241 All available :class:`Dialect` names are returned by :func:`list_dialects`, 242 and they can be registered with specific :class:`reader` and :class:`writer` 243 classes through their initializer (``__init__``) functions like this:: 244 245 import csv 246 247 with open('students.csv', 'w', newline='') as csvfile: 248 writer = csv.writer(csvfile, dialect='unix') 249 250 251.. class:: excel() 252 253 The :class:`excel` class defines the usual properties of an Excel-generated CSV 254 file. It is registered with the dialect name ``'excel'``. 255 256 257.. class:: excel_tab() 258 259 The :class:`excel_tab` class defines the usual properties of an Excel-generated 260 TAB-delimited file. It is registered with the dialect name ``'excel-tab'``. 261 262 263.. class:: unix_dialect() 264 265 The :class:`unix_dialect` class defines the usual properties of a CSV file 266 generated on UNIX systems, i.e. using ``'\n'`` as line terminator and quoting 267 all fields. It is registered with the dialect name ``'unix'``. 268 269 .. versionadded:: 3.2 270 271 272.. class:: Sniffer() 273 274 The :class:`Sniffer` class is used to deduce the format of a CSV file. 275 276 The :class:`Sniffer` class provides two methods: 277 278 .. method:: sniff(sample, delimiters=None) 279 280 Analyze the given *sample* and return a :class:`Dialect` subclass 281 reflecting the parameters found. If the optional *delimiters* parameter 282 is given, it is interpreted as a string containing possible valid 283 delimiter characters. 284 285 286 .. method:: has_header(sample) 287 288 Analyze the sample text (presumed to be in CSV format) and return 289 :const:`True` if the first row appears to be a series of column headers. 290 Inspecting each column, one of two key criteria will be considered to 291 estimate if the sample contains a header: 292 293 - the second through n-th rows contain numeric values 294 - the second through n-th rows contain strings where at least one value's 295 length differs from that of the putative header of that column. 296 297 Twenty rows after the first row are sampled; if more than half of columns + 298 rows meet the criteria, :const:`True` is returned. 299 300 .. note:: 301 302 This method is a rough heuristic and may produce both false positives and 303 negatives. 304 305An example for :class:`Sniffer` use:: 306 307 with open('example.csv', newline='') as csvfile: 308 dialect = csv.Sniffer().sniff(csvfile.read(1024)) 309 csvfile.seek(0) 310 reader = csv.reader(csvfile, dialect) 311 # ... process CSV file contents here ... 312 313 314.. _csv-constants: 315 316The :mod:`csv` module defines the following constants: 317 318.. data:: QUOTE_ALL 319 320 Instructs :class:`writer` objects to quote all fields. 321 322 323.. data:: QUOTE_MINIMAL 324 325 Instructs :class:`writer` objects to only quote those fields which contain 326 special characters such as *delimiter*, *quotechar* or any of the characters in 327 *lineterminator*. 328 329 330.. data:: QUOTE_NONNUMERIC 331 332 Instructs :class:`writer` objects to quote all non-numeric fields. 333 334 Instructs :class:`reader` objects to convert all non-quoted fields to type *float*. 335 336 337.. data:: QUOTE_NONE 338 339 Instructs :class:`writer` objects to never quote fields. When the current 340 *delimiter* occurs in output data it is preceded by the current *escapechar* 341 character. If *escapechar* is not set, the writer will raise :exc:`Error` if 342 any characters that require escaping are encountered. 343 344 Instructs :class:`reader` objects to perform no special processing of quote characters. 345 346.. data:: QUOTE_NOTNULL 347 348 Instructs :class:`writer` objects to quote all fields which are not 349 ``None``. This is similar to :data:`QUOTE_ALL`, except that if a 350 field value is ``None`` an empty (unquoted) string is written. 351 352 Instructs :class:`reader` objects to interpret an empty (unquoted) field 353 as ``None`` and to otherwise behave as :data:`QUOTE_ALL`. 354 355 .. versionadded:: 3.12 356 357.. data:: QUOTE_STRINGS 358 359 Instructs :class:`writer` objects to always place quotes around fields 360 which are strings. This is similar to :data:`QUOTE_NONNUMERIC`, except that if a 361 field value is ``None`` an empty (unquoted) string is written. 362 363 Instructs :class:`reader` objects to interpret an empty (unquoted) string as ``None`` and 364 to otherwise behave as :data:`QUOTE_NONNUMERIC`. 365 366 .. versionadded:: 3.12 367 368The :mod:`csv` module defines the following exception: 369 370 371.. exception:: Error 372 373 Raised by any of the functions when an error is detected. 374 375.. _csv-fmt-params: 376 377Dialects and Formatting Parameters 378---------------------------------- 379 380To make it easier to specify the format of input and output records, specific 381formatting parameters are grouped together into dialects. A dialect is a 382subclass of the :class:`Dialect` class containing various attributes 383describing the format of the CSV file. When creating :class:`reader` or 384:class:`writer` objects, the programmer can specify a string or a subclass of 385the :class:`Dialect` class as the dialect parameter. In addition to, or instead 386of, the *dialect* parameter, the programmer can also specify individual 387formatting parameters, which have the same names as the attributes defined below 388for the :class:`Dialect` class. 389 390Dialects support the following attributes: 391 392 393.. attribute:: Dialect.delimiter 394 395 A one-character string used to separate fields. It defaults to ``','``. 396 397 398.. attribute:: Dialect.doublequote 399 400 Controls how instances of *quotechar* appearing inside a field should 401 themselves be quoted. When :const:`True`, the character is doubled. When 402 :const:`False`, the *escapechar* is used as a prefix to the *quotechar*. It 403 defaults to :const:`True`. 404 405 On output, if *doublequote* is :const:`False` and no *escapechar* is set, 406 :exc:`Error` is raised if a *quotechar* is found in a field. 407 408 409.. attribute:: Dialect.escapechar 410 411 A one-character string used by the writer to escape the *delimiter* if *quoting* 412 is set to :const:`QUOTE_NONE` and the *quotechar* if *doublequote* is 413 :const:`False`. On reading, the *escapechar* removes any special meaning from 414 the following character. It defaults to :const:`None`, which disables escaping. 415 416 .. versionchanged:: 3.11 417 An empty *escapechar* is not allowed. 418 419.. attribute:: Dialect.lineterminator 420 421 The string used to terminate lines produced by the :class:`writer`. It defaults 422 to ``'\r\n'``. 423 424 .. note:: 425 426 The :class:`reader` is hard-coded to recognise either ``'\r'`` or ``'\n'`` as 427 end-of-line, and ignores *lineterminator*. This behavior may change in the 428 future. 429 430 431.. attribute:: Dialect.quotechar 432 433 A one-character string used to quote fields containing special characters, such 434 as the *delimiter* or *quotechar*, or which contain new-line characters. It 435 defaults to ``'"'``. 436 437 .. versionchanged:: 3.11 438 An empty *quotechar* is not allowed. 439 440.. attribute:: Dialect.quoting 441 442 Controls when quotes should be generated by the writer and recognised by the 443 reader. It can take on any of the :ref:`QUOTE_\* constants <csv-constants>` 444 and defaults to :const:`QUOTE_MINIMAL`. 445 446 447.. attribute:: Dialect.skipinitialspace 448 449 When :const:`True`, spaces immediately following the *delimiter* are ignored. 450 The default is :const:`False`. 451 452 453.. attribute:: Dialect.strict 454 455 When ``True``, raise exception :exc:`Error` on bad CSV input. 456 The default is ``False``. 457 458.. _reader-objects: 459 460Reader Objects 461-------------- 462 463Reader objects (:class:`DictReader` instances and objects returned by the 464:func:`reader` function) have the following public methods: 465 466.. method:: csvreader.__next__() 467 468 Return the next row of the reader's iterable object as a list (if the object 469 was returned from :func:`reader`) or a dict (if it is a :class:`DictReader` 470 instance), parsed according to the current :class:`Dialect`. Usually you 471 should call this as ``next(reader)``. 472 473 474Reader objects have the following public attributes: 475 476.. attribute:: csvreader.dialect 477 478 A read-only description of the dialect in use by the parser. 479 480 481.. attribute:: csvreader.line_num 482 483 The number of lines read from the source iterator. This is not the same as the 484 number of records returned, as records can span multiple lines. 485 486 487DictReader objects have the following public attribute: 488 489.. attribute:: DictReader.fieldnames 490 491 If not passed as a parameter when creating the object, this attribute is 492 initialized upon first access or when the first record is read from the 493 file. 494 495 496 497Writer Objects 498-------------- 499 500:class:`writer` objects (:class:`DictWriter` instances and objects returned by 501the :func:`writer` function) have the following public methods. A *row* must be 502an iterable of strings or numbers for :class:`writer` objects and a dictionary 503mapping fieldnames to strings or numbers (by passing them through :func:`str` 504first) for :class:`DictWriter` objects. Note that complex numbers are written 505out surrounded by parens. This may cause some problems for other programs which 506read CSV files (assuming they support complex numbers at all). 507 508 509.. method:: csvwriter.writerow(row) 510 511 Write the *row* parameter to the writer's file object, formatted according 512 to the current :class:`Dialect`. Return the return value of the call to the 513 *write* method of the underlying file object. 514 515 .. versionchanged:: 3.5 516 Added support of arbitrary iterables. 517 518.. method:: csvwriter.writerows(rows) 519 520 Write all elements in *rows* (an iterable of *row* objects as described 521 above) to the writer's file object, formatted according to the current 522 dialect. 523 524Writer objects have the following public attribute: 525 526 527.. attribute:: csvwriter.dialect 528 529 A read-only description of the dialect in use by the writer. 530 531 532DictWriter objects have the following public method: 533 534 535.. method:: DictWriter.writeheader() 536 537 Write a row with the field names (as specified in the constructor) to 538 the writer's file object, formatted according to the current dialect. Return 539 the return value of the :meth:`csvwriter.writerow` call used internally. 540 541 .. versionadded:: 3.2 542 .. versionchanged:: 3.8 543 :meth:`writeheader` now also returns the value returned by 544 the :meth:`csvwriter.writerow` method it uses internally. 545 546 547.. _csv-examples: 548 549Examples 550-------- 551 552The simplest example of reading a CSV file:: 553 554 import csv 555 with open('some.csv', newline='') as f: 556 reader = csv.reader(f) 557 for row in reader: 558 print(row) 559 560Reading a file with an alternate format:: 561 562 import csv 563 with open('passwd', newline='') as f: 564 reader = csv.reader(f, delimiter=':', quoting=csv.QUOTE_NONE) 565 for row in reader: 566 print(row) 567 568The corresponding simplest possible writing example is:: 569 570 import csv 571 with open('some.csv', 'w', newline='') as f: 572 writer = csv.writer(f) 573 writer.writerows(someiterable) 574 575Since :func:`open` is used to open a CSV file for reading, the file 576will by default be decoded into unicode using the system default 577encoding (see :func:`locale.getencoding`). To decode a file 578using a different encoding, use the ``encoding`` argument of open:: 579 580 import csv 581 with open('some.csv', newline='', encoding='utf-8') as f: 582 reader = csv.reader(f) 583 for row in reader: 584 print(row) 585 586The same applies to writing in something other than the system default 587encoding: specify the encoding argument when opening the output file. 588 589Registering a new dialect:: 590 591 import csv 592 csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE) 593 with open('passwd', newline='') as f: 594 reader = csv.reader(f, 'unixpwd') 595 596A slightly more advanced use of the reader --- catching and reporting errors:: 597 598 import csv, sys 599 filename = 'some.csv' 600 with open(filename, newline='') as f: 601 reader = csv.reader(f) 602 try: 603 for row in reader: 604 print(row) 605 except csv.Error as e: 606 sys.exit('file {}, line {}: {}'.format(filename, reader.line_num, e)) 607 608And while the module doesn't directly support parsing strings, it can easily be 609done:: 610 611 import csv 612 for row in csv.reader(['one,two,three']): 613 print(row) 614 615 616.. rubric:: Footnotes 617 618.. [1] If ``newline=''`` is not specified, newlines embedded inside quoted fields 619 will not be interpreted correctly, and on platforms that use ``\r\n`` linendings 620 on write an extra ``\r`` will be added. It should always be safe to specify 621 ``newline=''``, since the csv module does its own 622 (:term:`universal <universal newlines>`) newline handling. 623