Using the String.Split method with multiple separator characters in PowerShell

This post is about what I thought was odd behaviour when calling the .NET String.Split method with multiple separator characters from PowerShell. I first came across this myself but didn’t pay much attention to it. Only after reading about it again over on Tommy Maynard’s blog did I decide to find out more.
Let’s have a look at an example first:

#using String.Split with one separator character works as expected
'This is a test'.Split('e')
#using multiple characters not so much
'c:\\test'.Split('\\')
'c:\\test'.Split('\\').Count

Running the second example, which tries to split a string on a double backslash, returns an array of three strings (‘c:’, an empty string, and ‘test’) instead of two. Let’s try to see why this is happening by retrieving the specific overload definition we are using:

#get the overload definition of the method we are using
''.Split.OverloadDefinitions[0]
#string[] Split(Params char[] separator)
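
Before looking at the explanation, the three-element result can be made visible with a small sketch. The explicit [char[]] cast is an addition here (it is not in the original example); it is assumed to select the character array overload regardless of which other overloads the underlying .NET version offers.

```powershell
# Cast the separator explicitly so the char[] overload is the one being called
$parts = 'c:\\test'.Split([char[]]'\\')

# Wrap each element in brackets so the empty string in the middle becomes visible
$parts | ForEach-Object { "[$_]" }
# [c:]
# []
# [test]

$parts.Count
# 3
```

The empty element sits between the two backslashes, because each backslash is treated as a separator of its own.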

OK, it looks like this overload of the Split method expects a character array for the separator parameter. That is why we saw an additional split: the string argument ‘\\’ is converted to a character array, and each of its characters is treated as a separator of its own. Let’s see if String.Split has other overload definitions that accept a String as the separator argument:

''.Split.OverloadDefinitions | Select-String 'string[] separator' -SimpleMatch
<#
string[] Split(string[] separator, System.StringSplitOptions options)
string[] Split(string[] separator, int count, System.StringSplitOptions options)
#>

Indeed, there are two overloads that accept a String array argument instead. Let’s use the first one. We don’t need the StringSplitOptions parameter in this case and can therefore use a value of ‘None’ for the argument.

#this doesn't work since we need a String array
'c:\\test'.Split('\\', 'None')
#finally we get only two parts back
'c:\\test'.Split(@('\\'), 'None')
'c:\\test'.Split(@('\\'), 'None').Count
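
To make the difference between the two overload families explicit, the separator can be cast to the intended type. This is only a sketch: the variable names are made up for illustration, and the explicit casts and the use of the [System.StringSplitOptions] enum instead of the string ‘None’ are stylistic choices, not something the original example requires.

```powershell
$path = 'c:\\test'

# char[] separator: every character in the array is a separator on its own
$byChars = $path.Split([char[]]'\\')

# string[] separator: the whole string '\\' is a single separator
$byString = $path.Split([string[]]'\\', [System.StringSplitOptions]::None)

$byChars.Count     # 3
$byString.Count    # 2

# RemoveEmptyEntries drops the empty element that the char[] split produces
$noEmpty = $path.Split([char[]]'\\', [System.StringSplitOptions]::RemoveEmptyEntries)
$noEmpty.Count     # 2
```

RemoveEmptyEntries is worth knowing whenever a separator sits at the start or end of the input, since that also yields an empty element.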

We could have used the -split operator in the first place, but that would have been too easy, right ;-). Furthermore, with the String.Split method we can also split a string by multiple strings in just one go:

#with the -split operator each \ has to be escaped by doubling it, since the separator is treated as a regular expression
'c:\\test' -split '\\\\'
#splitting by two strings
'split by xx and yy in one go'.Split(('xx','yy'),'None')
#can be done also with -split using a scriptBlock
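
As a sketch of the -split side of this comparison (not the author's original example), regex alternation gives the same result as the two-string String.Split call. A script block is evaluated once per character, so it is a more natural fit for single-character separators than for strings like ‘xx’. The variable names below are made up for illustration.

```powershell
$text = 'split by xx and yy in one go'

# Regex alternation: split wherever 'xx' or 'yy' occurs
$viaOperator = $text -split 'xx|yy'

# The same separators through String.Split, for comparison
$viaMethod = $text.Split([string[]]('xx', 'yy'), [System.StringSplitOptions]::None)

$viaOperator.Count   # 3
$viaMethod.Count     # 3

# [regex]::Escape builds a safe pattern from a literal separator such as '\\'
$pattern = [regex]::Escape('\\')    # yields four backslashes
$escaped = 'c:\\test' -split $pattern
$escaped.Count       # 2
```

Using [regex]::Escape avoids doubling the backslashes by hand when the separator is meant literally.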

In conclusion, PowerShell provides a lot of options when it comes to splitting strings. Looking only at the separator parameter, we have five options:

  1. Using String.Split’s first overload with a character array
  2. Using one of String.Split’s overloads that accept a string array
  3. Using the -split operator which accepts a string for the separator parameter (the string is actually interpreted as a regular expression)
  4. Using the -split operator with a ScriptBlock that determines where to split. One can do a lot of things this way: within the ScriptBlock, $_ represents the current character, $args[0] the entire string, and $args[1] the current position within the entire string
  5. Finally there is also the .NET Regex.Split method with even more options but very similar to the -split operator
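
Options 4 and 5 from the list above can be sketched as follows. The inputs and variable names are made up for illustration, and the use of $args[1] relies on the behaviour described in option 4.

```powershell
# Option 4: the script block runs once per character; returning $true marks a separator
$byPredicate = 'a,b;c' -split { $_ -eq ',' -or $_ -eq ';' }
$byPredicate.Count   # 3

# Using the position: treat only the character at index 3 (the second comma) as a separator
$byIndex = 'a,b,c' -split { $args[1] -eq 3 }
$byIndex.Count       # 2

# Option 5: Regex.Split with a character class that covers both separators
$viaRegex = [regex]::Split('a,b;c', '[,;]')
$viaRegex.Count      # 3
```

The script block form is the only one of the five that can take the position within the string into account.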



Photo Credit: Matiluba via Compfight cc


3 thoughts on “Using the String.Split method with multiple separator characters in PowerShell”

  1. your site took out my extra spaces in “it’s really ‘first.last [there are spaces here] ‘. I assume that’s a tab at the end.”


  2. $string = “first.last.priv”
    $string.split(@(‘.priv’),’none’)

    result appears to be “first.last”, but if I out-gridview and paste it into notepad, it’s really “first.last “. I assume that’s a tab at the end. I get the same result if I don’t use the entire “.priv” as the splitter. this of course breaks me feeding $string to get-aduser. am I doing something wrong?


    1. Your example will actually return an array with two elements, one of which is empty. If you copy and paste the whole output, you will get “first.last” plus a carriage return and line feed (“`r`n” in PowerShell). You can verify this with “$string.split(@(‘.priv’),’none’).Count”

