A One Line CSV Parser

Filed under Code Garage, Regular Expressions, Utilities, VB Feng Shui

Parsing up Quote Comma delimited text is a pretty common thing to do, and it seems trivial enough till you realize all the little gotcha’s that come with the problem (like doubled quotes, commas in quotes, etc, etc). Then it becomes just another laborious exercise in boring coding.

I came across a regex some time ago that makes the process literally one line of code. I can no longer find the original author but the code I found (what little there was of it) was C# and actually split up into a few lines of code, so this is converted to the equivalent VB.net:

    Public Function QCSplit(ByVal Args As String) As String()

        Return (New System.Text.RegularExpressions.Regex(",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))")).Split(args)
    End Function

The regex used here doesn’t actually match on the contents, it matches on the commas that split things up, and it then uses the SPLIT function to actually split those things up.

I’ve used this a while now and it works great, but I’ve seen a few interesting alternatives since.

The most interesting thus far is this one by Daniel Einspanjer. Not interesting enough yet for me to switch to it, but the regexlib.com website is quite nice as a great repository of good regex recipes.

Regular Expression Tester

And speaking of Regular Expressions, the guys over at RAD Software, have a free Regular expression tester for .net style regular expressions that works fantastically. If you’re just starting out in regex’s (and seriously, who isn’t <G>), you owe it to yourself to pick up a decent regex test tool, and this one is as good as I’ve seen so far.

image

As far as I can tell, it can handle all the various options for regex’s, and dynamically shows match results, etc. Very handy for trying out expressions without actually running them in .net.

Add it to your External Tools menu in VS and it’ll be right there, good to go.

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*