Grouping across child elements

I am processing IDML files using XSLT. In the IDML exported from InDesign, consecutive paragraphs of the same style are run together, separated by <Br/> tags, like this (my input XML):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Story>
    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/para">
        <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
            <Content>All rights reserved. No part of this publication may be reproduced in any material form (including photocopying or storing it in any medium by electronic means and whether or not transiently or incidentally to some other use of this publication) without the written permission of the copyright owner except in accordance with the provisions of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the </Content>
        </CharacterStyleRange>
        <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/italic">
            <Content>Copyright Licensing Agency Ltd, Saffron House, 6-10 Kirby Street, London, EC1N 8TS England</Content>
        </CharacterStyleRange>
        <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
            <Content>. Applications for the copyright owner’s written permission to reproduce any part of this publication should be addressed to the publisher.</Content>
            <Br/>
            <Content>Warning: The doing of an unauthorised act in relation to a copyright work may result in both a civil claim for damages and criminal prosecution.</Content>
            <Br/>
            <Content>Crown copyright material is reproduced with the permission of the Controller of HMSO and the Queen’s Printer for Scotland.</Content>
            <Br/>
        </CharacterStyleRange>
    </ParagraphStyleRange>
</Story>

Now I need to turn this into XML that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<story> 
    <para>All rights reserved. No part of this publication may be reproduced in any material form (including photocopying or storing it in any medium by electronic means and whether or not transiently or incidentally to some other use of this publication) without the written permission of the copyright owner except in accordance with the provisions of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the <italic>Copyright Licensing Agency Ltd, Saffron House, 6-10 Kirby Street, London, EC1N 8TS England</italic>. Applications for the copyright owner’s written permission to reproduce any part of this publication should be addressed to the publisher.</para>
    <para>Warning: The doing of an unauthorised act in relation to a copyright work may result in both a civil claim for damages and criminal prosecution.</para>
    <para>Crown copyright material is reproduced with the permission of the Controller of HMSO and the Queen’s Printer for Scotland.</para>
</story>

My XSL looks like this:

<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/Story">
        <story>
            <xsl:apply-templates/>
        </story>
    </xsl:template>

    <xsl:template match="ParagraphStyleRange">
        <xsl:apply-templates>
            <xsl:with-param name="para_style_name" select="replace(./@AppliedParagraphStyle, 'ParagraphStyle/', '')"/>
        </xsl:apply-templates>
    </xsl:template>

    <xsl:template match="CharacterStyleRange">
        <xsl:param name="para_style_name"/>
        <xsl:variable name="char_style_name" select="replace(./@AppliedCharacterStyle, 'CharacterStyle/', '')"/>
        <xsl:for-each-group select="*" group-ending-with="Br">
            <xsl:element name="{$para_style_name}">
                <xsl:choose>
                    <xsl:when test="$char_style_name = '$ID/[No character style]'">
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:element name="{$char_style_name}">
                            <xsl:apply-templates select="current-group()"/>
                        </xsl:element>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:element>
        </xsl:for-each-group>
    </xsl:template>

    <xsl:template match="Content">
        <xsl:value-of select="."/>
    </xsl:template>

    <xsl:template match="Br"/>

</xsl:stylesheet>

This almost works, except that the first para is split at the italic part:

<?xml version="1.0" encoding="UTF-8"?>
<story> 
    <para>All rights reserved. No part of this publication may be reproduced in any material form (including photocopying or storing it in any medium by electronic means and whether or not transiently or incidentally to some other use of this publication) without the written permission of the copyright owner except in accordance with the provisions of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the </para>
    <para><italic>Copyright Licensing Agency Ltd, Saffron House, 6-10 Kirby Street, London, EC1N 8TS England</italic></para>
    <para>. Applications for the copyright owner’s written permission to reproduce any part of this publication should be addressed to the publisher.</para>
    <para>Warning: The doing of an unauthorised act in relation to a copyright work may result in both a civil claim for damages and criminal prosecution.</para>
    <para>Crown copyright material is reproduced with the permission of the Controller of HMSO and the Queen’s Printer for Scotland.</para>
</story>

As expected, my grouping method does not work when a paragraph spans several CharacterStyleRange elements in the input. Is there a way of grouping that can handle this? Or would I be better off with a different approach, such as closing CharacterStyleRange and ParagraphStyleRange at each Br and reopening them, as an intermediate step to make processing easier?

1 Answer

Solution

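You are almost there. Move the grouping one level up and do it in the template matching ParagraphStyleRange. Selecting CharacterStyleRange/* there puts the Content and Br children of all character ranges into one sequence, so a group can span several CharacterStyleRange elements, and the character style is then resolved per Content element from its parent: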

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">

<xsl:output indent="yes"/>

<xsl:template match="Story">
    <story>
        <xsl:apply-templates select="ParagraphStyleRange"/>
    </story>
</xsl:template>

<xsl:template match="ParagraphStyleRange">
    <xsl:variable name="para_style_name" select="replace(@AppliedParagraphStyle, 'ParagraphStyle/', '')"/>
    <xsl:for-each-group select="CharacterStyleRange/*" group-ending-with="Br">
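        <!-- xsl:element (not xs:element) creates the result element; the name comes from the $para_style_name variable -->
        <xsl:element name="{$para_style_name}">
            <xsl:apply-templates select="current-group()[self::Content]"/>
        </xsl:element>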
    </xsl:for-each-group>
</xsl:template>

<xsl:template match="Content">
    <xsl:variable name="char_style_name" select="replace(../@AppliedCharacterStyle, 'CharacterStyle/', '')"/>
    <xsl:choose>
        <xsl:when test="$char_style_name = '$ID/[No character style]'">
            <xsl:value-of select="."/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:element name="{$char_style_name}">
                <xsl:value-of select="."/>
            </xsl:element>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>
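
To see in isolation why grouping at the parent level helps, here is a minimal sketch that is separate from the IDML case. The element names doc, run, t and group are hypothetical and chosen only for illustration; the sample input in the comment is likewise made up. It assumes an XSLT 2.0 processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output indent="yes"/>

<!-- Hypothetical input:
     <doc>
         <run><t>a</t></run>
         <run><t>b</t><Br/><t>c</t><Br/></run>
     </doc>
-->
<xsl:template match="doc">
    <doc>
        <!-- run/* selects the t and Br children of every run in document order,
             so one group can contain children of several run elements -->
        <xsl:for-each-group select="run/*" group-ending-with="Br">
            <group>
                <!-- Concatenate the text of the t elements in this group; Br is skipped -->
                <xsl:value-of select="current-group()[self::t]" separator=""/>
            </group>
        </xsl:for-each-group>
    </doc>
</xsl:template>

</xsl:stylesheet>
```

For the sample input in the comment, the population is t(a), t(b), Br, t(c), Br, so this should produce two group elements, the first containing "ab" and the second containing "c". The first group crosses the boundary between the two run elements, which is the behaviour needed for the first para in the question.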