When rolling your own csv files, you have to account for commas in the text. Extra commas will create extra columns and you don’t want that. Let’s start with this data:
Two of the names have commas and two don’t. By ignoring those commas, extra columns are created because a comma delimits a column. In this code, no account is made for commas in the text.
Sub MakeCSV1()
Dim rCell As Range
Dim rRow As Range
Dim lFnum As Long
Dim sFname As String
Dim sText As String
lFnum = FreeFile
sFname = “C:MyCSV1.csv”
Open sFname For Output As lFnum
For Each rRow In Sheet1.UsedRange.Rows
For Each rCell In rRow.Cells
sText = sText & rCell.Text & “,”
Next rCell
sText = Left$(sText, Len(sText) – 1) ‘remove comma
sText = sText & vbNewLine ‘add line break
Next rRow
Print #lFnum, sText
Close lFnum
End Sub
Dim rCell As Range
Dim rRow As Range
Dim lFnum As Long
Dim sFname As String
Dim sText As String
lFnum = FreeFile
sFname = “C:MyCSV1.csv”
Open sFname For Output As lFnum
For Each rRow In Sheet1.UsedRange.Rows
For Each rCell In rRow.Cells
sText = sText & rCell.Text & “,”
Next rCell
sText = Left$(sText, Len(sText) – 1) ‘remove comma
sText = sText & vbNewLine ‘add line break
Next rRow
Print #lFnum, sText
Close lFnum
End Sub
When you open MyCSV1.csv in Excel, you get this:
To keep that name field as one, test for the existence of a comma and enclose the field text in double quotes (Chr$(34) is used here).
Sub MakeCSV2()
Dim rCell As Range
Dim rRow As Range
Dim lFnum As Long
Dim sFname As String
Dim sText As String
lFnum = FreeFile ‘get availabe file number
sFname = “C:MyCSV2.csv”
Open sFname For Output As lFnum
For Each rRow In Sheet1.UsedRange.Rows
For Each rCell In rRow.Cells
If InStr(1, sText, “,”) > 0 Then
sText = sText & Chr$(34) & rCell.Text & Chr$(34) & “,”
Else
sText = sText & rCell.Text & “,”
End If
Next rCell
sText = Left$(sText, Len(sText) – 1) ‘remove comma
sText = sText & vbNewLine ‘add line break
Next rRow
Print #lFnum, sText ‘write to csv file
Close lFnum
End Sub
Dim rCell As Range
Dim rRow As Range
Dim lFnum As Long
Dim sFname As String
Dim sText As String
lFnum = FreeFile ‘get availabe file number
sFname = “C:MyCSV2.csv”
Open sFname For Output As lFnum
For Each rRow In Sheet1.UsedRange.Rows
For Each rCell In rRow.Cells
If InStr(1, sText, “,”) > 0 Then
sText = sText & Chr$(34) & rCell.Text & Chr$(34) & “,”
Else
sText = sText & rCell.Text & “,”
End If
Next rCell
sText = Left$(sText, Len(sText) – 1) ‘remove comma
sText = sText & vbNewLine ‘add line break
Next rRow
Print #lFnum, sText ‘write to csv file
Close lFnum
End Sub
That keeps everything nice and neat.
Dick, isn’t it easier just to format the names column as text? Then when you save the sheet as a.CSV file, Excel puts the quotes round the names for you, and it loads correctly. No VBA required.
Hi,
What about the other way around?
Often data is exported from an application as CSV.
When I import it into excel, commas in text fields cause the same problem. However, I have no control over the export functionality (it is built into the source application).
So, is there any way to get around that using VBA (say)? I can’t think of one, since there is no way to programmatically tell a delimiter from puntuation – or is there?
Has anyone addressed this using VBA?
Thanks,
Alan.
More a question then a comment.
Has anyone ever moved to tab delimited? As we all have issues with commas in .CVS files at some time and I have never seen a spreadsheet with a tab in a string. Admittedly I have never seen a tab delimited file used, just wondering if anyone else had any thoughts.
That sounds easier to me. :)
Hi Dick,
I haven’t had much cause to work with CSV files so I don’t know why one would roll one’s own. But given that you are, I’d suggest the test Application.WorksheetFunction.IsText(rCell.Value) rather than testing for a comma. The IsText() check would also ‘protect’ fields that contain text that might be misinterpreted as a number (zip codes with a leading zero or a 16 character credit card, etc.)
I’ve been burnt by code that copied a VBA array into a worksheet for sorting with all sorts of unintended consequences!
Regular expressions are a well-established way to deal with CSV parsing. Even then, you need to take care that you understand by what conventions the file was created, and whether it’s at all likely to change. I frequently get “CSV” file saved from German Excel – since they use commas where we have decimal points, the values are delimited by semicolons. Fun.
I don’t know why, but I really don’t like that common “concatenate commas then strip off the last one” code style, it just looks clumsy. So I tend to clamber through hoops to get my data into a form where I can call Join().
@Mike Woodhouse
You wrote:
I frequently get “CSV” file saved from German Excel – since they use commas where we have decimal points, the values are delimited by semicolons. Fun.
As I am a german, I can tell you lot of funny stories about commas and semicolons. E.g. our DSC-Software exports the data with . as decimal separator and ; as colum separator. For this reasons I am frequently changing my language settings to avoid problems.
Here is another approach to export data (tab as delimiter):
Dim var As Variant
Dim row As Long, col As Long
‘kein Set verwenden!!!
var = ActiveSheet.UsedRange
Open “C:Export.txt” For Output As #1
For row = 1 To UBound(rng, 1)
For col = 1 To UBound(rng, 2) – 1
‘Daten schreiben und nächste Spalte
Print #1, Format(var(row, col), “0.00”); Tab;
Next
Print #1, Format(var(row, col), “0.00”)
Next
Close #1
End Sub
This is useful.
The main reason for “rolling your own” CSV files is for when you have to transfer information to another package over which you have no control (IE: you have to match the required input specification because you cannot modify the code).
I have one app. which requires some number fields as numbers, others always as text (to allow $NA). All text fields have to be surrounded by double quotes. [it’s not on a PC]
An alternative approach I’ve used is a pipe: | (the character above
the backslash) instead of the comma as a delimiter. The pipe is a
relatively rare character in most text, so the comma-within-text
problem is avoided and the file can still be saved as a “.csv”.
If the receiving application can have its delimiter re-set to the
pipe, all’s well: the receiving application will parse on the pipe
character. If not, then a second process can be used to search-and-
replace the pipe with a quote-comma-quote (to take care of any
comma-within-text instances), followed by a single quote at the
beginning or each line and another single quote at the end of each
line (replace the
or any other special character that indicates
the end of a line and/or carriage return).
So, best case is a single, simple pass through the fields to add the
pipe without any need for regular expressions. Worst case is three
passes, but regular expressions only needed for passes two (changing pipes to quoted commas) and three (adding beginning- and ending-of-
line quotes).
The real problems occur when you get CSV files. They are so bad it sours you on the file format. In addition to the comma problem, don’t forget quotation marks in text fields, e.g. “Bank of .. The “Honest” Bankers” or “7’6? ceilings”.
[…] Handling CSV files when the content has genuine commas […]
I too have had to deal with the “I have no control over this incoming CSV file, but we need to use it in Excel… can you please import it for me?”. I have to remind everyone (and myself too sometimes) that the CSV file is nothing more than a text file. Open it up with a text editor, do a find & replace and begone with the offending stuff.
I regularly use tab separated values files as that’s the format currently exported by the ERP system I use. I assume that the discussed challenges with CSV are main the reason behind that ERP system’s shift from csv to tsv.
Of course, for anyone finding themselves fighting with data translations in general, I highly recommend having a look at Monarch.
When we say CSV it is not just a file with comma delimited(comedy limited :-) ) fields in the record.
It is a special file format. Let Excel to do the job.
We can get desireable result using the following:
Public Sub XL2CSV()
ThisWorkbook.SaveAs Filename:=ThisWorkbook.Path & “TestCSV.csv”, FileFormat:=xlCSV
End Sub
In the TestCSV.csv we will have:
1234,”Kusleika, Dick”,82
5678,”Smith, John”,95
9012,Cher,66
Hi,
I am new to VBA. How easy is it to save 4 columns (random) from the fist sheet of an excel file having around 45,000 rows and 6 columns. Would appreciate if some one can point to some sample code.
thanks in advance.
Venu
Mike: “I don’t know why, but I really don’t like that common “concatenate commas then strip off the last one” code style, it just looks clumsy.”
A more elegant approach is to add the comma at the start of every field (except the first, of course), and the return characters at the start of every record (again, except the first).
Additionally, string building/concatenation in VBA is very slow. If your CSV file is going to be very large, it’s worth creating a huge empty string, filling it using mid() and then trimming the excess off the end.
Sorry “huge empty string” doesn’t really make sense, does it? What I meant was create a long string using either of the string() or space() functions.
@Rob Bruce
I’ve compared the elapsed time when exporting data using the “SaveAs csv” and the “print”-function. I don’t think there is a big difference in time necessary. More important is how you get the data out of the excel sheet.
Here are the three approaches:
1. Using “SaveAs csv”
2. Reading the data in a variant -> use print
3. Reading the data in a range -> use print
And that’s the result:
1. 5.2 seconds
2. 4.7 seconds
3. 1.2 seconds
(for the setup see here: http://www.vba-blog.de/?q=content/datenexport-textdatei-mit-print – Sorry! It’s only a german website, but I’m still working on my english. )
If there are a lot of formulas in the workbook, the “SaveAs csv” will take even more time. So I would never recommend using the “SaveAs csv”-macro.
I will try to implement your “huge empty string” and compare it with the others.
Nice weekend!
Tobi
Damn! Error! I confused the numbers! Here’s the correct version:
Here are the three approaches:
1. Using “SaveAs csv”
2. Reading the data in a range -> use print
3. Reading the data in a variant -> use print
And that’s the result:
1. 5.2 seconds
2. 4.7 seconds
3. 1.2 seconds
I stick to tab delimited files whenever possible – they just work better.
Leonid,
The example you gave shows the downfall of the SaveAs CSV method – Excel’s inconsistent use of quote marks. There are many programs that cannot read this style of CSV properly, including (ridiculously) Microsoft Access. That’s why we often need to roll our own CSVs.
sir
i have excel file which was converted to .csv file in that one row have 14 characters number field but once i save and open it is giving 1.33333+4 error. Actually i want to upload a .csv file to our software because of this error i cannot do
In you code comma Chr(44) can be replaced by CHR(130).
Chr (130) looks just like normal comma, but doesn’t split the field in CSV
I have been trying to use MakeCSV2 and the following error occurs when I run it
Run-time error ’75’:
Path/File access error
When I click on the Debug button the following line is highlited
Any help would be appreciated
Have you defined sFname? and lFnum?
Thanks for the reply Jon.
As I am just trying to get my head around VBA can you please explain how to ‘define sFname and lFnum’
David – if you have copied and pasted the code without change the most likely problem is you don’t have write access to C:. If you change C: in the code to the full path to your Documents folder (or any other folder you know you have write access to), it should work.
David –
I think you need to find some VBA tutorials. You could buy VBA for Dummies, or you could Google for VBA tutorial.