/*!
The `csv` crate provides a fast and flexible CSV reader and writer, with
support for Serde.

The [tutorial](tutorial/index.html) is a good place to start if you're new to
Rust.

The [cookbook](cookbook/index.html) will give you a variety of complete Rust
programs that do CSV reading and writing.

# Brief overview

**If you're new to Rust**, you might find the
[tutorial](tutorial/index.html)
to be a good place to start.

The primary types in this crate are
[`Reader`](struct.Reader.html)
and
[`Writer`](struct.Writer.html),
for reading and writing CSV data respectively.
Correspondingly, to support CSV data with custom field or record delimiters
(among many other things), you should use either a
[`ReaderBuilder`](struct.ReaderBuilder.html)
or a
[`WriterBuilder`](struct.WriterBuilder.html),
depending on whether you're reading or writing CSV data.

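For instance, here is a quick sketch of reading semicolon-delimited data with
a `ReaderBuilder` (the in-memory input is purely for illustration):

```
let data = "city;country;pop
Boston;United States;4628910
";
let mut rdr = csv::ReaderBuilder::new()
    .delimiter(b';')
    .from_reader(data.as_bytes());
for result in rdr.records() {
    // Each item is a Result<StringRecord, Error>.
    println!("{:?}", result.expect("a valid CSV record"));
}
```
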
Unless you're using Serde, the standard CSV record types are
[`StringRecord`](struct.StringRecord.html)
and
[`ByteRecord`](struct.ByteRecord.html).
`StringRecord` should be used when you know your data to be valid UTF-8.
For data that may be invalid UTF-8, `ByteRecord` is suitable.

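As a small sketch of the `ByteRecord` side (again, in-memory bytes purely for
illustration), iteration looks the same, except no UTF-8 validation is done:

```
// The second record contains bytes that are not valid UTF-8.
let data: &[u8] = b"name,city\nrobert,Boston\n\xCC\xDD,unknown\n";
let mut rdr = csv::Reader::from_reader(data);
for result in rdr.byte_records() {
    // Fields are accessed as raw bytes (`&[u8]`) rather than `&str`.
    let record = result.expect("a valid CSV record");
    println!("{:?}", record);
}
```
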
Finally, the set of errors is described by the
[`Error`](struct.Error.html)
type.

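For example, a record with the wrong number of fields surfaces as an
[`Error`](struct.Error.html) whose [`ErrorKind`](enum.ErrorKind.html) can be
inspected; a contrived sketch with in-memory data:

```
let data = "header1,header2\nonly-one-field\n";
let mut rdr = csv::Reader::from_reader(data.as_bytes());
for result in rdr.records() {
    if let Err(err) = result {
        // With the default (non-flexible) configuration, a record with a
        // different number of fields than the header row is an error.
        println!("CSV error: {:?}", err.kind());
    }
}
```
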
The rest of the types in this crate mostly correspond to more detailed errors,
position information, configuration knobs or iterator types.

# Setup

Add this to your `Cargo.toml`:

```toml
[dependencies]
csv = "1.1"
```

If you want to use Serde's custom derive functionality on your custom structs,
then add this to your `[dependencies]` section of `Cargo.toml`:

```toml
[dependencies]
serde = { version = "1", features = ["derive"] }
```

# Example

This example shows how to read CSV data from stdin and print each record to
stdout.

There are more examples in the [cookbook](cookbook/index.html).

```no_run
use std::{error::Error, io, process};

fn example() -> Result<(), Box<dyn Error>> {
    // Build the CSV reader and iterate over each record.
    let mut rdr = csv::Reader::from_reader(io::stdin());
    for result in rdr.records() {
        // The iterator yields Result<StringRecord, Error>, so we check the
        // error here.
        let record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
```

The above example can be run like so:

```ignore
$ git clone git://github.com/BurntSushi/rust-csv
$ cd rust-csv
$ cargo run --example cookbook-read-basic < examples/data/smallpop.csv
```

# Example with Serde

This example shows how to read CSV data from stdin into your own custom struct.
By default, the member names of the struct are matched with the values in the
header record of your CSV data.

```no_run
use std::{error::Error, io, process};

#[derive(Debug, serde::Deserialize)]
struct Record {
    city: String,
    region: String,
    country: String,
    population: Option<u64>,
}

fn example() -> Result<(), Box<dyn Error>> {
    let mut rdr = csv::Reader::from_reader(io::stdin());
    for result in rdr.deserialize() {
        // Notice that we need to provide a type hint for automatic
        // deserialization.
        let record: Record = result?;
        println!("{:?}", record);
    }
    Ok(())
}

fn main() {
    if let Err(err) = example() {
        println!("error running example: {}", err);
        process::exit(1);
    }
}
```

The above example can be run like so:

```ignore
$ git clone git://github.com/BurntSushi/rust-csv
$ cd rust-csv
$ cargo run --example cookbook-read-serde < examples/data/smallpop.csv
```

*/

#![deny(missing_docs)]

use std::result;

use serde::{Deserialize, Deserializer};

pub use crate::{
    byte_record::{ByteRecord, ByteRecordIter, Position},
    deserializer::{DeserializeError, DeserializeErrorKind},
    error::{
        Error, ErrorKind, FromUtf8Error, IntoInnerError, Result, Utf8Error,
    },
    reader::{
        ByteRecordsIntoIter, ByteRecordsIter, DeserializeRecordsIntoIter,
        DeserializeRecordsIter, Reader, ReaderBuilder, StringRecordsIntoIter,
        StringRecordsIter,
    },
    string_record::{StringRecord, StringRecordIter},
    writer::{Writer, WriterBuilder},
};

mod byte_record;
pub mod cookbook;
mod debug;
mod deserializer;
mod error;
mod reader;
mod serializer;
mod string_record;
pub mod tutorial;
mod writer;

/// The quoting style to use when writing CSV data.
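///
/// # Example
///
/// A minimal sketch of applying a quoting style via
/// [`WriterBuilder::quote_style`](struct.WriterBuilder.html#method.quote_style),
/// writing to an in-memory `Vec<u8>` purely for illustration:
///
/// ```
/// use std::error::Error;
///
/// # fn main() { example().unwrap(); }
/// fn example() -> Result<(), Box<dyn Error>> {
///     let mut wtr = csv::WriterBuilder::new()
///         .quote_style(csv::QuoteStyle::Always)
///         .from_writer(vec![]);
///     wtr.write_record(&["a", "b"])?;
///     let data = String::from_utf8(wtr.into_inner()?)?;
///     assert_eq!(data, "\"a\",\"b\"\n");
///     Ok(())
/// }
/// ```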
#[derive(Clone, Copy, Debug)]
pub enum QuoteStyle {
    /// This puts quotes around every field. Always.
    Always,
    /// This puts quotes around fields only when necessary.
    ///
    /// They are necessary when fields contain a quote, delimiter or record
    /// terminator. Quotes are also necessary when writing an empty record
    /// (which is indistinguishable from a record with one empty field).
    ///
    /// This is the default.
    Necessary,
    /// This puts quotes around all fields that are non-numeric. Namely, when
    /// writing a field that does not parse as a valid float or integer,
    /// quotes will be used even if they aren't strictly necessary.
    NonNumeric,
    /// This *never* writes quotes, even if it would produce invalid CSV data.
    Never,
    /// Hints that destructuring should not be exhaustive.
    ///
    /// This enum may grow additional variants, so this makes sure clients
    /// don't count on exhaustive matching. (Otherwise, adding a new variant
    /// could break existing code.)
    #[doc(hidden)]
    __Nonexhaustive,
}

impl QuoteStyle {
    /// Convert this to the csv_core type of the same name.
    fn to_core(self) -> csv_core::QuoteStyle {
        match self {
            QuoteStyle::Always => csv_core::QuoteStyle::Always,
            QuoteStyle::Necessary => csv_core::QuoteStyle::Necessary,
            QuoteStyle::NonNumeric => csv_core::QuoteStyle::NonNumeric,
            QuoteStyle::Never => csv_core::QuoteStyle::Never,
            _ => unreachable!(),
        }
    }
}

impl Default for QuoteStyle {
    fn default() -> QuoteStyle {
        QuoteStyle::Necessary
    }
}

/// A record terminator.
///
/// Use this to specify the record terminator while parsing CSV. The default is
/// CRLF, which treats `\r`, `\n` or `\r\n` as a single record terminator.
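///
/// # Example
///
/// A small sketch of parsing `;`-terminated records via
/// [`ReaderBuilder::terminator`](struct.ReaderBuilder.html#method.terminator)
/// (in-memory data with headers disabled, purely for illustration):
///
/// ```
/// use std::error::Error;
///
/// # fn main() { example().unwrap(); }
/// fn example() -> Result<(), Box<dyn Error>> {
///     let data = "foo,bar;baz,quux";
///     let mut rdr = csv::ReaderBuilder::new()
///         .has_headers(false)
///         .terminator(csv::Terminator::Any(b';'))
///         .from_reader(data.as_bytes());
///     let records = rdr.records().collect::<Result<Vec<_>, _>>()?;
///     assert_eq!(records.len(), 2);
///     assert_eq!(&records[0][0], "foo");
///     Ok(())
/// }
/// ```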
#[derive(Clone, Copy, Debug)]
pub enum Terminator {
    /// Parses `\r`, `\n` or `\r\n` as a single record terminator.
    CRLF,
    /// Parses the byte given as a record terminator.
    Any(u8),
    /// Hints that destructuring should not be exhaustive.
    ///
    /// This enum may grow additional variants, so this makes sure clients
    /// don't count on exhaustive matching. (Otherwise, adding a new variant
    /// could break existing code.)
    #[doc(hidden)]
    __Nonexhaustive,
}

impl Terminator {
    /// Convert this to the csv_core type of the same name.
    fn to_core(self) -> csv_core::Terminator {
        match self {
            Terminator::CRLF => csv_core::Terminator::CRLF,
            Terminator::Any(b) => csv_core::Terminator::Any(b),
            _ => unreachable!(),
        }
    }
}

impl Default for Terminator {
    fn default() -> Terminator {
        Terminator::CRLF
    }
}

/// The whitespace preservation behaviour when reading CSV data.
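///
/// # Example
///
/// A brief sketch of trimming both headers and fields via
/// [`ReaderBuilder::trim`](struct.ReaderBuilder.html#method.trim)
/// (in-memory data purely for illustration):
///
/// ```
/// use std::error::Error;
///
/// # fn main() { example().unwrap(); }
/// fn example() -> Result<(), Box<dyn Error>> {
///     let data = " name , place \n Boston , United States ";
///     let mut rdr = csv::ReaderBuilder::new()
///         .trim(csv::Trim::All)
///         .from_reader(data.as_bytes());
///     let headers = rdr.headers()?.clone();
///     assert_eq!(headers.get(0), Some("name"));
///     let record = rdr.records().next().expect("one record")?;
///     assert_eq!(record.get(1), Some("United States"));
///     Ok(())
/// }
/// ```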
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Trim {
    /// Preserves fields and headers. This is the default.
    None,
    /// Trim whitespace from headers.
    Headers,
    /// Trim whitespace from fields, but not headers.
    Fields,
    /// Trim whitespace from fields and headers.
    All,
    /// Hints that destructuring should not be exhaustive.
    ///
    /// This enum may grow additional variants, so this makes sure clients
    /// don't count on exhaustive matching. (Otherwise, adding a new variant
    /// could break existing code.)
    #[doc(hidden)]
    __Nonexhaustive,
}

impl Trim {
    /// Returns true if fields should be trimmed.
    fn should_trim_fields(&self) -> bool {
        self == &Trim::Fields || self == &Trim::All
    }

    /// Returns true if headers should be trimmed.
    fn should_trim_headers(&self) -> bool {
        self == &Trim::Headers || self == &Trim::All
    }
}

impl Default for Trim {
    fn default() -> Trim {
        Trim::None
    }
}

/// A custom Serde deserializer for possibly invalid `Option<T>` fields.
///
/// When deserializing CSV data, it is sometimes desirable to simply ignore
/// fields with invalid data. For example, there might be a field that is
/// usually a number, but will occasionally contain garbage data that causes
/// number parsing to fail.
///
/// You might be inclined to use, say, `Option<i32>` for fields such as this.
/// By default, however, `Option<i32>` will either capture *empty* fields with
/// `None` or valid numeric fields with `Some(the_number)`. If the field is
/// non-empty and not a valid number, then deserialization will return an error
/// instead of using `None`.
///
/// This function allows you to override this default behavior. Namely, if
/// `Option<T>` is deserialized with non-empty but invalid data, then the value
/// will be `None` and the error will be ignored.
///
/// # Example
///
/// This example shows how to parse CSV records with numerical data, even if
/// some numerical data is absent or invalid. Without the
/// `serde(deserialize_with = "...")` annotations, this example would return
/// an error.
///
/// ```
/// use std::error::Error;
///
/// #[derive(Debug, serde::Deserialize, Eq, PartialEq)]
/// struct Row {
///     #[serde(deserialize_with = "csv::invalid_option")]
///     a: Option<i32>,
///     #[serde(deserialize_with = "csv::invalid_option")]
///     b: Option<i32>,
///     #[serde(deserialize_with = "csv::invalid_option")]
///     c: Option<i32>,
/// }
///
/// # fn main() { example().unwrap(); }
/// fn example() -> Result<(), Box<dyn Error>> {
///     let data = "\
/// a,b,c
/// 5,\"\",xyz
/// ";
///     let mut rdr = csv::Reader::from_reader(data.as_bytes());
///     if let Some(result) = rdr.deserialize().next() {
///         let record: Row = result?;
///         assert_eq!(record, Row { a: Some(5), b: None, c: None });
///         Ok(())
///     } else {
///         Err(From::from("expected at least one record but got none"))
///     }
/// }
/// ```
pub fn invalid_option<'de, D, T>(de: D) -> result::Result<Option<T>, D::Error>
where
    D: Deserializer<'de>,
    Option<T>: Deserialize<'de>,
{
    Option::<T>::deserialize(de).or_else(|_| Ok(None))
}