Skip to content
Julian Halliwell edited this page Nov 14, 2023 · 12 revisions

Reads a csv file and processes and/or returns the data.

Performance-optimized for large files and offering a wide range of options available via the Apache Commons CSV library.

By default the method returns an object to which further optional calls can be chained before finally calling execute() to process the file and (if required) return the data.

readCsv( filepath )[...optional configuration calls].execute();

Reading a csv file into an array

filepath = "c:/temp/my.csv";
result = readCsv( filepath )
  .intoAnArray()
  .execute();

Return value

A struct with 2 keys:

  1. columns: array of column names
  2. data: array of row value arrays

Specifying predefined formats

Predefined Formats are preset Commons CSV configuration combinations optimized for for different contexts, such as tab-delimited data.

filepath = "c:/temp/myTabDelimited.csv";
myArray = readCsv( filepath )
  .intoAnArray()
  .withPredefinedFormat( "TDF" )
  .execute();

If not specified the DEFAULT predefined format will be used.

Other configuration options

withAllowMissingColumnNames( required boolean )

Whether to allow missing column names in the header line. Commons CSV documentation

withAutoFlush( required boolean )

Whether to flush on close. Commons CSV documentation

withCommentMarker( required string marker )

Sets the comment start marker to the specified character. Commons CSV documentation

withDelimiter( required string delimiter )

Sets the delimiter character. Commons CSV documentation To set tab as the delimiter you can use any of the following values as parameters: "#Chr( 9 )#", "\t", "tab", "TAB"

withDuplicateHeaderMode( required string value )

Sets the duplicate header names behavior. Possible values: "ALLOW_ALL", "ALLOW_EMPTY", "DISALLOW". Commons CSV documentation

withEscapeCharacter( required string character )

Sets the escape character. Commons CSV documentation

withFirstRowIsHeader( boolean state )

Whether to use the first row as the header and exclude it from being processed as part of the data. Defaults to true if no state is specified.

withHeader( required array header )

Manually sets the row values which will be detected as the header. To auto-detect the header from the first row, use withFirstRowIsHeader() (see above) Commons CSV documentation

withIgnoreEmptyLines( required boolean )

Sets the empty line skipping behavior, true to ignore the empty lines between the records, false to translate empty lines to empty records. Commons CSV documentation

withIgnoreHeaderCase( required boolean )

Sets the parser case mapping behavior, true to access name/values, false to leave the mapping as is. Commons CSV documentation

withIgnoreSurroundingSpaces( required boolean )

Sets the parser trimming behavior, true to remove the surrounding spaces, false to leave the spaces as is. Commons CSV documentation

withNullString( required string value )

Converts strings equal to the given nullString to null when reading records. Commons CSV documentation

withQuoteCharacter( required string character )

Sets the quote character. Commons CSV documentation

withSkipFirstRows( required numeric numberOfRowsToSkip )

Ignore the specified number of rows at the start of the file. Should be a positive integer.

withSkipHeaderRecord( required boolean )

Sets whether to skip the header record. Commons CSV documentation

withTrailingDelimiter( required boolean )

Sets whether to add a trailing delimiter. Commons CSV documentation

withTrim( required boolean )

Sets whether to trim leading and trailing blanks. Commons CSV documentation

Filtering out rows

[Details to come]

Passing a UDF/Closure to process rows

[Details to come]

Clone this wiki locally