(2013-02-08) Parsing A CSExport Generated XML File Into A Scoped CSV File
Posted by Jorge on 2013-02-08
I needed to research the CSExport output. However, that’s an XML file and not really helpful in diagnosing the data in it. Converting it to a CSV, so that it can be loaded into Excel is much much better. So after some google magic, I found Carol Wapshere’s script. However, that script had two issues I needed to solve. At first it does not support multi-values and I needed to diagnose the proxyAddresses attribute. And secondly, because the Get-Content CMDlet is being used to read the XML file it chokes if the XML file size is too large. The error can be seen below.
Figure 1: Error Reading An XML File File That’s Too Large With The Get-Content CMDlet
-
I added logic to the script to also process multi-valued attributes when applicable and instead of using the Get-Content CMDlet, I read the XML file as shown in the script below. I was not able to specify the “MaxCharactersInDcoument” property when using the Get-Content CMDlet. The XML file I was reading was about 500 MB and it contained about 1000000000 characters. The method I use below does not have a limit and allows the configuration “MaxCharactersInDcoument” property if needed.
# Original Script By Carol Wapsphere (http://www.wapshere.com/missmiis/using-powershell-to-parse-a-csexport-file) # Script Rewrite By Jorge de Almeida Pinto (http://jorgequestforknowledge.wordpress.com/) # # Takes an XML file created by CSEXPORT, and produces a CSV file more suitable for opening in Excel. # Supports both single-valued attributes and multi-valued attributes # # Before using the script: # * Define the attributes of interest # * Define the object types of interest. * means every object Param( [Parameter(Mandatory=$true)] [string] $sourceXML, [string] $targetCSV ) # List Of Attributes To Put In The CSV File. CHANGE THIS AS NEEDED! # The First 5 Attributes Are Available For All Connector Spaces $csvHeaderColumns = @("dn","connector-state","connector-type","mv-guid","object-type","givenName","sn","displayName","mail","mailNickname","legacyExchangeDN","proxyAddresses") # Object Types Of Interest $objectTypes = @("user") # Read The Source XML File [System.Xml.XmlDocument] $xmlCSExportDoc = New-Object System.Xml.XmlDocument $xmlCSExportDoc.load($sourceXML) # Check If CSV File Already Exists If (Test-Path $targetCSV) { Remove-Item -Path $targetCSV -Force } # Write The CSV Header To The CSV File ForEach ($csvHeaderColumn In $csvHeaderColumns) { If ($csvHeader -eq "" -Or $csvHeader -eq $null) { $csvHeader = $csvHeaderColumn } Else { $csvHeader = $csvHeader + "," + $csvHeaderColumn } } Add-Content $targetCSV $csvHeader # Get The Information For The Scoped Objects ForEach ($csObject In $xmlCSExportDoc."cs-objects"."cs-object") { If ($objectTypes -Contains $csObject."object-type" -Or $objectTypes -Contains "*") { $csObjectHashTable = @{} $csObjectHashTable.Add("dn",$csObject."cs-dn") $csObjectHashTable.Add("connector-state",$csObject."connector-state") $csObjectHashTable.Add("object-type",$csObject."object-type") If ($csObject.connector -eq "0") { $csObjectHashTable.Add("connector-type","disconnector") $csObjectHashTable.Add("mv-guid","") ForEach ($csObjectAttribute In $csObject."unapplied-export-hologram".entry.attr) { If ($csObjectAttribute.multivalued -eq "false") { $csObjectHashTable.Add($csObjectAttribute.name,$csObjectAttribute.value) } If ($csObjectAttribute.multivalued -eq "true" -And $csObjectAttribute.type -ne "binary") { $multivaluedAttrValues = "" If ($csObjectAttribute.value -ne "" -And $csObjectAttribute.value -ne $null) { ForEach ($value in $csObjectAttribute.value) { If ($multivaluedAttrValues -eq "" -Or $multivaluedAttrValues -eq $null) { $multivaluedAttrValues = $value } Else { $multivaluedAttrValues += ";" + $value } } $csObjectHashTable.Add($csObjectAttribute.name,$multivaluedAttrValues) } Else { $csObjectHashTable.Add($csObjectAttribute.name,"") } } } } Else { $csObjectHashTable.Add("connector-type","connector") $csObjectHashTable.Add("mv-guid",$csObject."mv-link"."#text") ForEach ($csObjectAttribute In $csObject."synchronized-hologram".entry.attr) { If ($csObjectAttribute.multivalued -eq "false") { $csObjectHashTable.Add($csObjectAttribute.name,$csObjectAttribute.value) } If ($csObjectAttribute.multivalued -eq "true" -And $csObjectAttribute.type -ne "binary") { $multivaluedAttrValues = "" If ($csObjectAttribute.value -ne "" -And $csObjectAttribute.value -ne $null) { ForEach ($value in $csObjectAttribute.value) { If ($multivaluedAttrValues -eq "" -Or $multivaluedAttrValues -eq $null) { $multivaluedAttrValues = $value } Else { $multivaluedAttrValues += ";" + $value } } $csObjectHashTable.Add($csObjectAttribute.name,$multivaluedAttrValues) } Else { $csObjectHashTable.Add($csObjectAttribute.name,"") } } } } $csvLine = "" ForEach ($csvHeaderColumn in $csvHeaderColumns) { If ($csObjectHashTable.Contains($csvHeaderColumn)) { If ($csvLine -eq "") { $csvLine = "`"" + $csObjectHashTable.Item($csvHeaderColumn) + "`"" } Else { $csvLine += "," + "`"" + $csObjectHashTable.Item($csvHeaderColumn) + "`"" } } Else { If ($csvLine -eq "") { $csvLine = "," } Else { $csvLine += "," } } } Add-Content $targetCSV $csvline } }
-
When using the script you need to make adjustments to it before actually using it. This is shown below.
Figure 2: Configuring The Object Types And The Attributes To Be Parsed Into The CSV File
-
[1] you need to define the attribute list you want to parse in an array. The first 5 attributes are available for all object types in every connector.
[2] you need to define the object types you want to parse in an array.
To execute the script, see below.
Figure 3: Using The Script
-
.\FIM-Sync-Engine-Parse-CSExport.ps1 -sourceXML <Source XML> -targetCSV <Target CSV>
-
The CSV file in Excel should then look similar to
Figure 4: The CSV File In Excel
-
Happy diagnosing!
-
Cheers,
Jorge
———————————————————————————————
* This posting is provided "AS IS" with no warranties and confers no rights!
* Always evaluate/test yourself before using/implementing this!
* DISCLAIMER: http://jorgequestforknowledge.wordpress.com/disclaimer/
———————————————————————————————
############### Jorge’s Quest For Knowledge #############
######### http://JorgeQuestForKnowledge.wordpress.com/ ########
———————————————————————————————


Brian Desmond said
Why not use Export-CSV?