(2013-02-08) Parsing A CSExport Generated XML File Into A Scoped CSV File
Posted by Jorge on 2013-02-08
I needed to research the CSExport output. However, that’s an XML file and not really helpful in diagnosing the data in it. Converting it to a CSV, so that it can be loaded into Excel is much much better. So after some google magic, I found Carol Wapshere’s script. However, that script had two issues I needed to solve. At first it does not support multi-values and I needed to diagnose the proxyAddresses attribute. And secondly, because the Get-Content CMDlet is being used to read the XML file it chokes if the XML file size is too large. The error can be seen below.
Figure 1: Error Reading An XML File File That’s Too Large With The Get-Content CMDlet
–
I added logic to the script to also process multi-valued attributes when applicable and instead of using the Get-Content CMDlet, I read the XML file as shown in the script below. I was not able to specify the “MaxCharactersInDcoument” property when using the Get-Content CMDlet. The XML file I was reading was about 500 MB and it contained about 1000000000 characters. The method I use below does not have a limit and allows the configuration “MaxCharactersInDcoument” property if needed.
# Original Script By Carol Wapsphere (http://www.wapshere.com/missmiis/using-powershell-to-parse-a-csexport-file) # Script Rewrite By Jorge de Almeida Pinto (https://jorgequestforknowledge.wordpress.com/) # # Takes an XML file created by CSEXPORT, and produces a CSV file more suitable for opening in Excel. # Supports both single-valued attributes and multi-valued attributes # # Before using the script: # * Define the attributes of interest # * Define the object types of interest. * means every object Param( [Parameter(Mandatory=$true)] [string] $sourceXML, [string] $targetCSV ) # List Of Attributes To Put In The CSV File. CHANGE THIS AS NEEDED! # The First 5 Attributes Are Available For All Connector Spaces $csvHeaderColumns = @("dn","connector-state","connector-type","mv-guid","object-type","givenName","sn","displayName","mail","mailNickname","legacyExchangeDN","proxyAddresses") # Object Types Of Interest $objectTypes = @("user") # Read The Source XML File [System.Xml.XmlDocument] $xmlCSExportDoc = New-Object System.Xml.XmlDocument $xmlCSExportDoc.load($sourceXML) # Check If CSV File Already Exists If (Test-Path $targetCSV) { Remove-Item -Path $targetCSV -Force } # Set The Starting Value For.. $csvHeader = $null # Write The CSV Header To The CSV File ForEach ($csvHeaderColumn In $csvHeaderColumns) { If ($csvHeader -eq $null) { $csvHeader = $csvHeaderColumn } Else { $csvHeader = $csvHeader + "," + $csvHeaderColumn } } Add-Content $targetCSV $csvHeader # Get The Information For The Scoped Objects ForEach ($csObject In $xmlCSExportDoc."cs-objects"."cs-object") { If ($objectTypes -Contains $csObject."object-type" -Or $objectTypes -Contains "*") { $csObjectHashTable = @{} $csObjectHashTable.Add("dn",$csObject."cs-dn") $csObjectHashTable.Add("connector-state",$csObject."connector-state") $csObjectHashTable.Add("object-type",$csObject."object-type") If ($csObject.connector -eq "0") { $csObjectHashTable.Add("connector-type","disconnector") $csObjectHashTable.Add("mv-guid","") ForEach ($csObjectAttribute In $csObject."unapplied-export-hologram".entry.attr) { If ($csObjectAttribute.multivalued -eq "false") { $csObjectHashTable.Add($csObjectAttribute.name,$csObjectAttribute.value) } If ($csObjectAttribute.multivalued -eq "true" -And $csObjectAttribute.type -ne "binary") { $multivaluedAttrValues = "" If ($csObjectAttribute.value -ne "" -And $csObjectAttribute.value -ne $null) { ForEach ($value in $csObjectAttribute.value) { If ($multivaluedAttrValues -eq "" -Or $multivaluedAttrValues -eq $null) { $multivaluedAttrValues = $value } Else { $multivaluedAttrValues += ";" + $value } } $csObjectHashTable.Add($csObjectAttribute.name,$multivaluedAttrValues) } Else { $csObjectHashTable.Add($csObjectAttribute.name,"") } } } } Else { $csObjectHashTable.Add("connector-type","connector") $csObjectHashTable.Add("mv-guid",$csObject."mv-link"."#text") ForEach ($csObjectAttribute In $csObject."synchronized-hologram".entry.attr) { If ($csObjectAttribute.multivalued -eq "false") { $csObjectHashTable.Add($csObjectAttribute.name,$csObjectAttribute.value) } If ($csObjectAttribute.multivalued -eq "true" -And $csObjectAttribute.type -ne "binary") { $multivaluedAttrValues = "" If ($csObjectAttribute.value -ne "" -And $csObjectAttribute.value -ne $null) { ForEach ($value in $csObjectAttribute.value) { If ($multivaluedAttrValues -eq "" -Or $multivaluedAttrValues -eq $null) { $multivaluedAttrValues = $value } Else { $multivaluedAttrValues += ";" + $value } } $csObjectHashTable.Add($csObjectAttribute.name,$multivaluedAttrValues) } Else { $csObjectHashTable.Add($csObjectAttribute.name,"") } } } } $csvLine = "" ForEach ($csvHeaderColumn in $csvHeaderColumns) { If ($csObjectHashTable.Contains($csvHeaderColumn)) { If ($csvLine -eq "") { $csvLine = "`"" + $csObjectHashTable.Item($csvHeaderColumn) + "`"" } Else { $csvLine += "," + "`"" + $csObjectHashTable.Item($csvHeaderColumn) + "`"" } } Else { If ($csvLine -eq "") { $csvLine = "," } Else { $csvLine += "," } } } Add-Content $targetCSV $csvline } }
–
When using the script you need to make adjustments to it before actually using it. This is shown below.
Figure 2: Configuring The Object Types And The Attributes To Be Parsed Into The CSV File
–
[1] you need to define the attribute list you want to parse in an array. The first 5 attributes are available for all object types in every connector.
[2] you need to define the object types you want to parse in an array.
To execute the script, see below.
Figure 3: Using The Script
–
.\FIM-Sync-Engine-Parse-CSExport.ps1 -sourceXML <Source XML> -targetCSV <Target CSV>
–
The CSV file in Excel should then look similar to
Figure 4: The CSV File In Excel
–
Happy diagnosing!
–
Cheers,
Jorge
———————————————————————————————
* This posting is provided "AS IS" with no warranties and confers no rights!
* Always evaluate/test yourself before using/implementing this!
* DISCLAIMER: https://jorgequestforknowledge.wordpress.com/disclaimer/
———————————————————————————————
############### Jorge’s Quest For Knowledge #############
######### http://JorgeQuestForKnowledge.wordpress.com/ ########
———————————————————————————————
Brian Desmond said
Why not use Export-CSV?
LikeLike
sean said
for some reason $csvheader was filled with extra stuff causing my final header to be 3 times as long as it should be. placing
$csvHeader=””
right before the foreach $csvheadercolumn line cleaned up the output. i’m not sure what was causing this behavior.
LikeLike
Jorge said
thank you for your feedback. I have adjusted the script to include your feedback. I added the following BOLD lines:
#######################################
# Set The Starting Value For..
$csvHeader = $null
# Write The CSV Header To The CSV File
ForEach ($csvHeaderColumn In $csvHeaderColumns) {
If ($csvHeader -eq $null) {
$csvHeader = $csvHeaderColumn
} Else {
$csvHeader = $csvHeader + “,” + $csvHeaderColumn
}
}
#######################################
LikeLike
Momen said
Hello, thanks for the good work, how can I add column to the excel sheet to include old value and new value?
LikeLike
David said
Check out this tool. Parses the XML file created by the csexport tool and has robust filtering capabilities with ability to generate HTML, CSV, and native Excel files for reporting.
https://github.com/FIMTooler/csReporter
LikeLike