Разбор вложенного сложного XML в CSV с помощью powershell

Добрый день,

Я новичок в PowerShell, и я пытаюсь разобрать сложный xml в CSV:

Вот xml-код

      <LoyaltyCustomer Action="E">
    <Retailer Id="1">
<HouseHold BuyingUnitInternalKey="2" HouseHoldExternalId="-1" SendEmail="false">
            <Members>
                <Member MemberInternalKey="2" MemberExternalId="-1" IsMainMember="true" LastName="Internal" FirstName="Use" StartDate="2012-10-02T12:42:00" RedemptionPrivileges="0" MemberStatus="0" AdressNormalizationUpdate="N">
                    <Cards>
                        <Card Id="-1" CardStatus="" IssueDate="2012-10-02T12:42:00" ExpirationDate="2056-12-31T00:00:00" />
                    </Cards>
                    <Stores>
                        <Store Id="26" StoreTypeId="1" IsHomeStore="true" />
                    </Stores>
                </Member>
            </Members>
        </HouseHold>
        <HouseHold BuyingUnitInternalKey="3" HouseHoldExternalId="244003000001" Country="11" State="223" City="Calgary" Street1="Main St" StreetNum="203" PostalCode="R4C 3R1" POBox="999999" HomePhone="438-439-1246" EMailAddress="zabana@zabana.ca" SendEmail="true">
            <Accounts>
                <Account Id="1"  EarnValue="6422.0000"   RedeemValue="6049.0000" Balance="373.0000" LastUpdate="2013-05-10" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="3"  EarnValue="10.0000"     RedeemValue="8.0000" Balance="2.0000" LastUpdate="2013-02-20" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="4"  EarnValue="10.0000"     RedeemValue="7.0000" Balance="3.0000" LastUpdate="2013-02-20" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="5"  EarnValue="8.0000"      RedeemValue="5.0000" Balance="3.0000" LastUpdate="2013-04-18" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="6"  EarnValue="9.0000"      RedeemValue="8.0000" Balance="1.0000" LastUpdate="2013-02-20" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="7"  EarnValue="7028.0000"   RedeemValue="6500.0000" Balance="528.0000" LastUpdate="2017-07-10" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="8"  EarnValue="269319.0000" RedeemValue="269000.0000" Balance="319.0000" LastUpdate="2019-07-10" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
                <Account Id="10" EarnValue="688968.0000" RedeemValue="682641.0000" Balance="6327.0000" LastUpdate="2019-07-10" HousekeepingBalance="0.0000" HousekeepingTotalAccumulated="0.0000" HousekeepingTotalRedeemed="0.0000" />
            </Accounts>
            <HouseHoldSegments>
                <Segment Id="100" Status="1" AttachmentSourceId="1" />
                <Segment Id="500" Status="1" AttachmentSourceId="4" />
                <Segment Id="502" Status="1" AttachmentSourceId="8" />
                <Segment Id="531" Status="1" AttachmentSourceId="1" />
            </HouseHoldSegments>
            <Members>
                <Member MemberInternalKey="3" MemberExternalId="244003000001" IsMainMember="true" LastName="zabana" FirstName="Mike" BirthDate="1970-11-04" DriversLicense="drvlic" NationalInsuranceNumber="socsec" Remarks="Test Account" MobilePhoneNumber="438-439-1246" Gender="1" Title="1" StartDate="2013-01-21T14:28:00" EffectiveDate="2013-02-08T13:58:00" RedemptionPrivileges="0" LanguageId="0" NumberOfFamilyMembers="4" Anonimity="0" MemberStatus="1" ReceiptLayoutId="1" AdressNormalizationUpdate="N" UpdatedDate="2019-07-20T14:47:00" CommercialDriversLicense="comdrvlic">
                    <Cards>
                        <Card Id="244003000001" CardStatus="1" IssueDate="2013-01-21T00:00:00" ExpirationDate="2056-12-31T23:59:00" />
                    </Cards>
                    <Stores>
                        <Store Id="24" StoreTypeId="1" IsHomeStore="true" />
                    </Stores>
                    <MemberAttributes>
                        <Attribute Id="10004" Value="zabana MANAGEMENTS" />
                    </MemberAttributes>
                    <MemberAdditionalAddress />
                    <FamilyMembers>
                        <FamilyMember FamilyMemberId="1" TypeId="2" Name="Kathy" />
                        <FamilyMember FamilyMemberId="2" TypeId="1" Name="Melissa" BirthdayDate="1997-06-06" Gender="2" />
                        <FamilyMember FamilyMemberId="3" TypeId="1" Name="Trent" BirthdayDate="2000-08-29" Gender="1" />
                    </FamilyMembers>
                </Member>
                <Member MemberInternalKey="23612" MemberExternalId="244003000002" IsMainMember="false" LastName="zabana" FirstName="Kathy" BirthDate="1970-07-02" Remarks="Test Account for Family testing" Gender="2" Title="3" StartDate="2013-03-11T13:16:00" RedemptionPrivileges="0" LanguageId="0" PostOption="1" NumberOfFamilyMembers="1" Anonimity="0" MemberStatus="1" AdressNormalizationUpdate="N" UpdatedDate="2019-07-15T19:44:00">
                </Member>
                <Member MemberInternalKey="33421" MemberExternalId="244003000003" IsMainMember="false" LastName="zabana" FirstName="Trent" BirthDate="2000-08-29" Remarks="Test account to see how a different address and email affect the member export" Gender="1" Title="1" StartDate="2013-03-27T14:41:00" EffectiveDate="2017-06-26T13:55:00" RedemptionPrivileges="3" LanguageId="0" PostOption="1" NumberOfFamilyMembers="1" Anonimity="0" MemberStatus="1" AdressNormalizationUpdate="N" UpdatedDate="2019-07-05T18:07:00">
                </Member>
            </Members>
        </HouseHold>
    </Retailer>
</LoyaltyCustomer>

Я пытаюсь получить следующий результат:

      HouseHoldExternalId     IsMainMember    MemberExternalId    FirstName   LastName        StartDate           Id      Balance
-------------------     ------------    ----------------    ---------   --------        --------------      --      -------
-1                      TRUE            -1                  Use         Internal        10/2/2012 12:42     
244003000001            TRUE            244003000001        Mike        Zabana          1/21/2013 14:28     
244003000001            FALSE           244003000002        Kathy       zabana          3/11/2013 13:16     
244003000001            FALSE           244003000003        Trent       zabana          3/27/2013 14:41     
244003000001                                                                                                1       373
244003000001                                                                                                3       2
244003000001                                                                                                4       3
244003000001                                                                                                5       3
244003000001                                                                                                6       1
244003000001                                                                                                7       528
244003000001                                                                                                8       319
244003000001                                                                                                10      6327

Я использую этот сценарий ниже в PowerShell:

      $data = New-Object xml;
$data.load("C:\temp\parsing\mehdi.xml")

$mehdi = [pscustomobject]@{
"HouseHoldExternalId" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold").HouseHoldExternalId |Out-String
"MemberExternalId" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Members/Member").MemberExternalId |Out-String
"IsMainMember" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Members/Member").IsMainMember |Out-String
"LastName" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Members/Member").LastName |Out-String
"FirstName" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Members/Member").FirstName |Out-String
"StartDate" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Members/Member").StartDate |Out-String
"Id" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Accounts/Account").Id |Out-String
"Balance" = $data.SelectNodes("/LoyaltyCustomer/Retailer/HouseHold/Accounts/Account").Balance |Out-String }
$mehdi | ConvertTo-Csv -NoTypeInformation -Delimiter ";" | Set-Content -Path C:\temp\parsing\test2.csv -Encoding UTF8

и я получаю, что все данные находятся в столбце HouseholdExternalID.

Как я могу получить такой же результат? Буду признателен за вашу помощь и советы. Спасибо

2 ответа

Похоже, это должно работать, чтобы преобразовать ваш XML в object[]который вы можете экспортировать в CSV после. По сути, ваш ожидаемый результат представляет собой комбинацию трех массивов:

  • $xml.LoyaltyCustomer.Retailer.Householdчтобы получить HouseHoldExternalId

  • $xml.LoyaltyCustomer.Retailer.Household.Members.Memberчтобы получить IsMainMember, MemberExternalId, FirstName, LastName и StartDate

  • $xml.LoyaltyCustomer.Retailer.Household.Accounts.Accountполучить идентификатор и баланс

      $xml = [xml]::new()
$xml.Load("C:\temp\parsing\mehdi.xml")

$psobjectOut = {
    param($element, $subelement)

    [pscustomobject]@{
        HouseHoldExternalId = $element.HouseHoldExternalId
        IsMainMember = $subelement.IsMainMember
        MemberExternalId = $subelement.MemberExternalId
        FirstName = $subelement.FirstName
        LastName = $subelement.LastName
        StartDate = '{0}' -f $subelement.StartDate -as [datetime]
        Id = $subelement.Id
        Balance = $subelement.Balance
    }
}

$result = foreach($element in $xml.LoyaltyCustomer.Retailer.Household)
{
    foreach($subelement in $element.Members.Member)
    {
        & $psobjectOut -element $element -subelement $subelement
    }

    foreach($subelement in $element.Accounts.Account)
    {
        & $psobjectOut -element $element -subelement $subelement
    }
}

Осмотр в $result:

      PS /> $result | Format-Table


HouseHoldExternalId IsMainMember MemberExternalId FirstName LastName StartDate             Id Balance
------------------- ------------ ---------------- --------- -------- ---------             -- -------
-1                  true         -1               Use       Internal 10/2/2012 12:42:00 PM    
244003000001        true         244003000001     Mike      zabana   1/21/2013 2:28:00 PM     
244003000001        false        244003000002     Kathy     zabana   3/11/2013 1:16:00 PM     
244003000001        false        244003000003     Trent     zabana   3/27/2013 2:41:00 PM     
244003000001                                                                               1  373.0000
244003000001                                                                               3  2.0000
244003000001                                                                               4  3.0000
244003000001                                                                               5  3.0000
244003000001                                                                               6  1.0000
244003000001                                                                               7  528.0000
244003000001                                                                               8  319.0000
244003000001                                                                               10 6327.0000

В дополнение к прекрасному ответу Сантьяго я хотел атаковать это с помощью xPath. Итак, я построил его работу с помощью и придумал следующую настройку t

      [XML]$XML = [XML]( Get-Content 'C:\Scripts\Example.xml' )

$HouseHolds = Select-Xml -Xml $Xml -XPath '/LoyaltyCustomer/Retailer/HouseHold'

$Output =
For($i = 0; $i -lt $HouseHolds.Count; ++$i ) {
    $HouseHoldID    = $HouseHolds[$i].Node.HouseHoldExternalId
    $HouseHoldXpath = "/LoyaltyCustomer/Retailer/HouseHold[$($i+1)]"

    Select-Xml -Xml $XML -XPath "$HouseHoldXpath/Members/Member|$HouseHoldXpath/Accounts/Account" |    
    ForEach-Object{
        [PSCustomObject]@{
            HouseHoldExternalId = $HouseHoldID
            IsMainMember        = $_.Node.IsMainMember
            MemberExternalId    = $_.Node.MemberExternalId
            FirstName           = $_.Node.FirstName
            LastName            = $_.Node.LastName
            StartDate           = $_.Node.StartDate -as [DateTime]
            Id                  = $_.Node.Id
            Balance             = $_.Node.Balance
        }
    }
    
}

$Output | Format-Table -AutoSize

Возвращает:

      HouseHoldExternalId IsMainMember MemberExternalId FirstName LastName StartDate             Id Balance
------------------- ------------ ---------------- --------- -------- ---------             -- -------  
-1                  true         -1               Use       Internal 10/2/2012 12:42:00 PM
244003000001                                                                               1  373.0000
244003000001                                                                               3  2.0000
244003000001                                                                               4  3.0000
244003000001                                                                               5  3.0000
244003000001                                                                               6  1.0000
244003000001                                                                               7  528.0000
244003000001                                                                               8  319.0000
244003000001                                                                               10 6327.0000
244003000001        true         244003000001     Mike      zabana   1/21/2013 2:28:00 PM
244003000001        false        244003000002     Kathy     zabana   3/11/2013 1:16:00 PM
244003000001        false        244003000003     Trent     zabana   3/27/2013 2:41:00 PM

Из-за комбинированного оператора xPath требуется только 1 внутренний цикл. Поэтому предварительная упаковка выражения для вывода PSCustomObject на самом деле не требуется. Тем не менее, порядок немного отличается, по-видимому, из-за того, как выполняется запрос xPath. Хотя все результаты есть.

Более ранняя версия, которая не использовала xPath так агрессивно, имела правильный порядок:

      [XML]$XML = [XML]( Get-Content 'C:\Scripts\Example.xml' )

$HouseHolds = Select-Xml -Xml $Xml -XPath '/LoyaltyCustomer/Retailer/HouseHold'

$ObjectExpression = {
    [PSCustomObject]@{
        HouseHoldExternalId = $HouseHoldID
        IsMainMember        = $_.Node.IsMainMember
        MemberExternalId    = $_.Node.MemberExternalId
        FirstName           = $_.Node.FirstName
        LastName            = $_.Node.LastName
        StartDate           = $_.Node.StartDate -as [DateTime]
        Id                  = $_.Node.Id
        Balance             = $_.Node.Balance
    }
}

$Output =
For($i = 0; $i -lt $HouseHolds.Count; ++$i ) {
    $HouseHoldID    = $HouseHolds[$i].Node.HouseHoldExternalId
    $HouseHoldXpath = "{0}{1}" -f $HouseHolds[$i].Pattern, "[$($i+1)]"
    
    Select-Xml -Xml $XML -XPath "$HouseHoldXpath/Members/Member" |
    ForEach-Object{ & $ObjectExpression }
    
    Select-Xml -Xml $XML -XPath "$HouseHoldXpath/Accounts/Account" |
    ForEach-Object{ & $ObjectExpression }
} 

$Output | Format-Table -AutoSize
Другие вопросы по тегам