Skip to content

sdkn104/VBA-CSV

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

79 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

VBA-CSV

VBA-CSV provides CSV (Comma-Separated Values) parsers and writer as VBA functions. The CSV parsers read CSV text and return Collection or Array of the CSV table contents. The CSV writer converts 2-dimensional array to CSV text.

  • The parsers and writer are compliant with the CSV format defined in RFC4180, which allows commas, line breaks, and double-quotes included in the fields.
  • Function test procedure, performance test procedure and examples are included.
  • The parser takes about 2.2 sec. for 8MB CSV, 8000 rows x 100 columns. (on Core i5-3470 CPU @ 3.2GHz, 4GB RAM)
  • The writer takes about 1.2 sec. for 8MB CSV, 8000 rows x 100 columns. (on Core i5-3470 CPU @ 3.2GHz, 4GB RAM)
  • The parsers do not fully check the syntax error (they parse correctly if the CSV has no syntax error).

Also includes VBScript version in VBScript folder.

Usage and Examples

ParseCSVToCollection

Function ParseCSVToCollection( csvText As String, 
                               Optional allowVariableNumOfFields As Boolean = False ) As Collection
[example]
    Dim csv As Collection
    Dim rec As Collection, fld As Variant

    Set csv = ParseCSVToCollection("aaa,bbb,ccc" & vbCr & "xxx,yyy,zzz")
    If csv Is Nothing Then
        Debug.Print Err.Number & " (" & Err.Source & ") " & Err.Description
    End If
    
    Debug.Print csv(1)(3) '----> ccc
    Debug.Print csv(2)(1) '----> xxx
    For Each rec In csv
      For Each fld In rec
        Debug.Print fld
      Next
    Next

ParseCSVToCollection() returns a Collection of records, and the record is a collection of fields. If error occurs, it returns Nothing and the error information is set in Err object. Optional boolean argument allowVariableNumOfFields specifies whether variable number of fields in records is allowed or handled as error.

ParseCSVToArray

Function ParseCSVToArray( csvText As String, 
                          Optional allowVariableNumOfFields As Boolean = False ) As Variant
[example]
    Dim csv As Variant
    Dim i As Long, j As Variant

    csv = ParseCSVToArray("aaa,bbb,ccc" & vbCr & "xxx,yyy,zzz")
    If IsNull(csv) Then
        Debug.Print Err.Number & " (" & Err.Source & ") " & Err.Description
    End If
    
    Debug.Print csv(1, 3) '----> ccc
    Debug.Print csv(2, 1) '----> xxx
    For i = LBound(csv, 1) To UBound(csv, 1)
      For j = LBound(csv, 2) To UBound(csv, 2)
        Debug.Print csv(i, j)
      Next
    Next

ParseCSVToArray() returns a Variant that contains 2-dimensional array - String(1 To recordCount, 1 To fieldCount). If error occurs, it returns Null and the error information is set in Err object. If input text is zero-length (""), it returns empty array - String(0 To -1). Optional boolean argument allowVariableNumOfFields specifies whether variable number of fields in records is allowed or handled as error.

ConvertArrayToCSV

Function ConvertArrayToCSV( inArray As Variant,
                            Optional fmtDate As String = "yyyy/m/d",
                            Optional quoting As CSVUtilsQuote = CSVUtilsQuote.MINIMAL,
                            Optional recordSeparator As String = vbCrLf ) As String
[example]
    Dim csv As String
    Dim a(1 To 2, 1 To 2) As Variant
    a(1, 1) = DateSerial(1900, 4, 14)
    a(1, 2) = "Exposition Universelle de Paris 1900"
    a(2, 1) = DateSerial(1970, 3, 15)
    a(2, 2) = "Japan World Exposition, Osaka 1970"
    
    csv = ConvertArrayToCSV(a, "yyyy/mm/dd")
    If Err.Number <> 0 Then
        Debug.Print Err.Number & " (" & Err.Source & ") " & Err.Description
    End If
    
    Debug.Print csv

ConvertArrayToCSV() reads 2-dimensional array inArray and return CSV text. If error occurs, it return the string "", and the error information is set in Err object. fmtDate is used as the argument of text formatting function Format if an element of the array is Date type. The optional argument quoting specifies what type of fields to be quoted:

  • MINIMAL: Quoting only if it is necessary (the field includes double-quotes, comma, line breaks).
  • ALL: Quoting all the fields.
  • NONNUMERIC: Similar to MINIMAL, but quoting also all the String type fields.

The optional arugment recordSeparator specifies record separator (line terminator), default is CRLF.

SetCSVUtilsAnyErrorIsFatal

SetCSVUtilsAnyErrorIsFatal(value As Boolean)
[example]
    SetCSVUtilsAnyErrorIsFatal True
    SetCSVUtilsAnyErrorIsFatal False

This function changes error handling mode for CSV parsers and writer.

False (default) - When run-time error occurs, the parser function returns special value (Nothing, Null, etc.), and the error information is set to properties of Err object.
True - Any run-time error that occurs is fatal (an error message is displayed and execution stops).

ParseCSVToDictionary

Public Function ParseCSVToDictionary(ByRef csvText As String, Optional ByRef keyColumn As Long = 1, 
                                     Optional ByRef allowVariableNumOfFields As Boolean = False) As Object
[example]
    Dim csv As String
    Dim csvd As Object

    csv = "key,val1, val2" & vbCrLf & "name1,v11,v12" & vbCrLf & "name2,v21,v22"
    Set csvd = ParseCSVToDictionary(csv, 1)
    Debug.Print csvd("name1")(2)  ' --> val11
    Debug.Print csvd("name1")(3)  ' --> val12
    Debug.Print csvd("name2")(2)  ' --> val21

ParseCSVToDictionary() returns a Dictionary (Scripting.Dictionary) of records; the records are Collections of fields. In default, the first field of each record is the key of the dictionary. The column number of the key field can be specified by keyColumn, whose default value is 1. If there are multiple records whose key fields are the same, the value for the key is set to the last record among them. If error occurs, it returns Nothing and the error information is set in Err object. Optional boolean argument allowVariableNumOfFields specifies whether variable number of fields in records is allowed or handled as error.

GetFieldDictionary

Public Function GetFieldDictionary(ByRef csvText As String) As Object
[example]
    Dim csv As String
    Dim csva
    Dim field As Object

    csv = "key,val1, val2" & vbCrLf & "name1,v11,v12" & vbCrLf & "name2,v21,v22"
    Set field = GetFieldDictionary(csv)
    csva = ParseCSVToArray(csv)
    Debug.Print csva(2, field("key"))  ' --> name1
    Debug.Print csva(3, field("val1"))  ' --> v21

GetFieldDictionary() returns a Dictionary (Scripting.Dictionary) of field names, whose keys are the field values of the first records and whose values are the column numbers of the fields. If there are multiple fields of the same value in the first record, the value for the key is set to the largest column number among the fields. If error occurs, it returns Nothing and the error information is set in Err object.

Installation

  1. Download the latest release.
  2. Import CSVUtils.bas (and other *.bas) into your project (Open VBA Editor, Alt + F11; File > Import File)

Tested in

  • MS Excel 2000 on Windows 10
  • MS Excel 2013 on Windows 7

The CSV File format

There is no definitive standard for CSV (Comma-separated values) file format, however the most commonly accepted definition is RFC4180. VBA-CSV is compliant with RFC 4180, while still allowing some flexibility where CSV text deviate from the definition. The followings are the rules of CSV format such that VBA-CSV can handle correctly. (The rules indicated by italic characters don't exists in RFC4180)

  1. Each record is located on a separate line, delimited by a line break (CRLF, CR, or LF).

    aaa,bbb,ccc CRLF
    zzz,yyy,xxx CRLF
    
  2. The last record in the file may or may not have an ending line break.    The CSV file containing nothing (= "") is recognized as empty (it has no record nor fields).

    aaa,bbb,ccc CRLF
    zzz,yyy,xxx
    
  3. Within each record, there may be one or more fields, separated by commas.

    aaa,bbb,ccc
    
  4. Each record should contain the same number of fields throughout the file.

  5. Each field may or may not be enclosed in double quotes.

    "aaa","bbb","ccc" CRLF
    zzz,yyy,xxx
    
  6. Fields containing line breaks, double quotes, and commas should be enclosed in double-quotes.

    "aaa","b CRLF
    bb","ccc" CRLF
    zzz,yyy,xxx
    
  7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.

    "aaa","b""bb","ccc"
    
  8. Spaces (including tabs) are considered part of a field and should not be ignored. If fields are enclosed with double quotes, then leading and trailing spaces outside of double quotes are ignored.

    " aaa", "bbb", ccc
    
  9. The special quotation expression (="CONTENT") is allowed inside of the double-quotes. CONTENT (field content) must not include any double-quote ("). MS Excel can read this.

    aaa,"=""bbb""",ccc
    

Author

sdkn104

License

This software is released under the MIT License.