
Painlessly parse a string with PowerShell 5

For an administrator who works with unruly output from a legacy application, PowerShell 5 can help bring clarity to the situation.

One of the benefits of PowerShell over other shell languages is its ability to handle objects. Everything is an object in PowerShell. Objects are a structured way to represent information through various properties and methods.

However, not everything is a nicely structured object. There are times when administrators have to deal with unstructured data such as a big string that was output from a legacy program. The administrator has to convert this data into some kind of structured format.

Administrators will like many of the new PowerShell 5 features Microsoft has developed, particularly the ones that deal with unstructured data. One of the most popular is the ConvertFrom-String cmdlet, which greatly reduces the time needed to convert an unstructured string into an object. The cmdlet, originally developed by Microsoft Research for the FlashExtract project, parses a string based on either a delimiter or a template. Let's first cover how ConvertFrom-String can parse a string into objects with a delimiter.

Using the Delimiter parameter

With the Delimiter parameter, we can specify a regular expression that splits the string into separate elements. For example, say you have a string with groups of letters delimited by two spaces. To retrieve the groups, you could use a delimiter of two spaces and pass the string as input to the ConvertFrom-String cmdlet.

Figure: Using the Delimiter parameter to split the unstructured string.
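The delimiter approach can be sketched roughly as follows. The sample string is an assumption chosen for illustration, and the sketch assumes Windows PowerShell 5.x, where ConvertFrom-String is available.

```powershell
# Illustrative input: three groups of letters separated by two spaces.
$text = 'abc  def  ghi'

# -Delimiter is treated as a regular expression; here it is a literal run of two spaces.
$result = $text | ConvertFrom-String -Delimiter '  '

# Show the automatically named properties.
$result | Format-List P1, P2, P3
```

Because the delimiter is a regular expression, a pattern such as '\s{2,}' could be used instead if the amount of whitespace between groups varies.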

You can see that PowerShell returned an object with three properties: P1, P2 and P3. You can then use these properties as necessary. However, the names P1, P2 and P3 say little about what data the properties hold. To label them, use the PropertyNames parameter to assign something more descriptive, such as Group 1, Group 2 and Group 3.
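A short sketch of the PropertyNames parameter, again using an assumed sample string. The names here are written without spaces (Group1 rather than Group 1) purely so they can be read with simple dot notation; that is a choice made for this example.

```powershell
# Illustrative input: three groups of letters separated by two spaces.
$text = 'abc  def  ghi'

# Replace the default P1, P2, P3 names with descriptive ones.
$result = $text | ConvertFrom-String -Delimiter '  ' -PropertyNames Group1, Group2, Group3

# Access a value by its descriptive name.
$result.Group2
```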

Although this is a nice feature, PowerShell has long offered similar functionality with ConvertFrom-StringData, which converts key=value pairs into a hash table. The more interesting component of ConvertFrom-String is the ability to parse a string based on a template.
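For comparison, here is a minimal sketch of ConvertFrom-StringData. The key names and values are hypothetical sample data; note that this cmdlet expects key=value lines and returns a hash table rather than a custom object.

```powershell
# Hypothetical key=value pairs, one per line.
$pairs = @'
Group1 = abc
Group2 = def
Group3 = ghi
'@

# ConvertFrom-StringData turns the pairs into a hash table.
$table = ConvertFrom-StringData -StringData $pairs

# Look up a value by key.
$table['Group2']
```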

Template-based parsing works by supplying ConvertFrom-String with a template: a sample of the pattern contained in the string, marked up to show which parts to extract. From this pattern, the cmdlet can understand the structure and make better decisions on how to parse it. For example, consider a file that contains this type of text:

Name: Craig Trudeau

Name: Merle Baldridge

Name: Adam Bertram

Name: George Lucas

You can tell the ConvertFrom-String cmdlet where the first and last names are represented by specifying a template such as this:

Name:{FirstName*:Adam} {LastName:Bertram}

Name:{FirstName*:Merdfdfle} {LastName:Baldfdfd}

This would result in four objects with a FirstName and a LastName property.

Figure: The output when specifying the first and last name with a template.

Given two examples that resemble the rest of the data, the cmdlet determined how to parse the remaining lines.

Every line starts with "Name:" but I don't want that captured, so I include it in the template as literal text outside the curly brackets. "Name:" represents a string of characters the ConvertFrom-String cmdlet needs to find. Next, you'll see text grouped in curly brackets; whatever is recognized inside those brackets is what ends up in the output. I don't have to use the PropertyNames parameter this time to get orderly output because I can specify the property names inside the template. I also have to use an asterisk if the string contains multiple records that match the template. The asterisk signifies the start of a new group.

The values in the template do not have to match names in the data exactly (the second template line uses made-up text); you just need a representation of what the string is going to look like. The cmdlet will figure out the rest. To test this, you can use the code block below to build a text file, read it into memory and then parse it with ConvertFrom-String.

Figure: As an example of the cmdlet's parsing abilities, this code builds a text file, which is then deconstructed with ConvertFrom-String.
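The build, read and parse steps might look something like the sketch below. It is not the article's original code: the file path is an assumption, and the template here adds a space after "Name:" and uses two names taken from the data so that it mirrors the sample lines as closely as possible. It assumes Windows PowerShell 5.x.

```powershell
# Location of the sample file; the path is an assumption for illustration.
$path = Join-Path $env:TEMP 'names.txt'

# Build the text file from the sample data.
@'
Name: Craig Trudeau
Name: Merle Baldridge
Name: Adam Bertram
Name: George Lucas
'@ | Set-Content -Path $path

# Two worked examples. The asterisk on FirstName marks the start of each record.
$template = @'
Name: {FirstName*:Adam} {LastName:Bertram}
Name: {FirstName*:Merle} {LastName:Baldridge}
'@

# Read the file into memory as one string and parse it against the template.
$people = Get-Content -Path $path -Raw |
    ConvertFrom-String -TemplateContent $template

# Each parsed record should carry FirstName and LastName properties.
$people | Format-Table FirstName, LastName
```

If the cmdlet misses or merges records, adding another example line to the template is usually the first thing to try. The template can also be kept in its own file and passed with the TemplateFile parameter instead of TemplateContent.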

Working with more complicated data sets will most likely require some tinkering. This cmdlet is capable of parsing just about any kind of string if you're able to structure the template correctly. For more examples and a deeper look at this cmdlet, check out Stephen Owen's blog post on advanced parsing with ConvertFrom-String.
