Class AbstractXmlParser

java.lang.Object
org.apache.maven.doxia.parser.AbstractParser
org.apache.maven.doxia.parser.AbstractXmlParser
All Implemented Interfaces:
LogEnabled, Markup, XmlMarkup, Parser
Direct Known Subclasses:
DocBookParser, FmlParser, Xhtml5BaseParser, XhtmlBaseParser

public abstract class AbstractXmlParser extends AbstractParser implements XmlMarkup
An abstract class that defines some convenience methods for XML parsers.
Since:
1.0
  • Field Details

    • PATTERN_ENTITY_1

      private static final Pattern PATTERN_ENTITY_1
      Entity pattern for a named HTML entity, e.g. &nbsp;: "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&[a-zA-Z]{2,6};)(\\s)*\"(\\s)*>"
      see http://www.w3.org/TR/REC-xml/#NT-EntityDecl.
    • PATTERN_ENTITY_2

      private static final Pattern PATTERN_ENTITY_2
      Entity pattern for a Unicode entity, e.g. &#38;: "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&(#x?[0-9a-fA-F]{1,5};)*)(\\s)*\"(\\s)*>"
      see http://www.w3.org/TR/REC-xml/#NT-EntityDecl.
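      To illustrate what the two patterns accept, the following sketch compiles the regular expressions quoted above with java.util.regex and applies them to a small local doctype. The class EntityPatternDemo and the helper findEntities are hypothetical names used only for this illustration; they are not part of Doxia, and this is not the parser's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityPatternDemo {
    // Regular expressions copied from the two field descriptions above.
    static final Pattern PATTERN_ENTITY_1 = Pattern.compile(
            "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&[a-zA-Z]{2,6};)(\\s)*\"(\\s)*>");
    static final Pattern PATTERN_ENTITY_2 = Pattern.compile(
            "<!ENTITY(\\s)+([^>|^\\s]+)(\\s)+\"(\\s)*(&(#x?[0-9a-fA-F]{1,5};)*)(\\s)*\"(\\s)*>");

    // Hypothetical helper: maps entity name (group 2) to replacement text (group 5)
    // for every declaration the given pattern finds in the doctype text.
    static Map<String, String> findEntities(Pattern pattern, String doctype) {
        Map<String, String> found = new LinkedHashMap<>();
        Matcher matcher = pattern.matcher(doctype);
        while (matcher.find()) {
            found.put(matcher.group(2), matcher.group(5));
        }
        return found;
    }

    public static void main(String[] args) {
        String doctype = "<!DOCTYPE foo [\n"
                + "  <!ENTITY space \"&nbsp;\">\n"
                + "  <!ENTITY bar \"&#x160;\">\n"
                + "]>";
        // The first pattern only sees the named entity, the second only the numeric one.
        System.out.println(findEntities(PATTERN_ENTITY_1, doctype)); // prints {space=&nbsp;}
        System.out.println(findEntities(PATTERN_ENTITY_2, doctype)); // prints {bar=&#x160;}
    }
}
```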
    • ignorableWhitespace

      private boolean ignorableWhitespace
    • collapsibleWhitespace

      private boolean collapsibleWhitespace
    • trimmableWhitespace

      private boolean trimmableWhitespace
    • entities

      private Map<String,String> entities
    • validate

      private boolean validate
  • Constructor Details

    • AbstractXmlParser

      public AbstractXmlParser()
  • Method Details

    • parse

      public void parse(Reader source, Sink sink, String reference) throws ParseException
      Parses the given source model and emits Doxia events into the given sink.
      Specified by:
      parse in interface Parser
      Parameters:
      source - a non-null reader that provides the source document. You can use the newReader methods of ReaderFactory.
      sink - A sink that consumes the Doxia events.
      reference - the reference
      Throws:
      ParseException - if the model could not be parsed.
    • initXmlParser

      protected void initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Initializes the parser with custom entities or other options.
      Parameters:
      parser - A parser, not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem initializing the parser
    • getType

      public final int getType()
      The parser type value could be Parser.UNKNOWN_TYPE, Parser.TXT_TYPE or Parser.XML_TYPE.
      Specified by:
      getType in interface Parser
      Overrides:
      getType in class AbstractParser
      Returns:
      an int
    • getAttributesFromParser

      protected SinkEventAttributeSet getAttributesFromParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser)
      Converts the attributes of the current start tag of the given parser to a SinkEventAttributeSet.
      Parameters:
      parser - A parser, not null.
      Returns:
      a SinkEventAttributeSet or null if the current parser event is not a start tag.
      Since:
      1.1
    • parseXml

      private void parseXml(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
      Parse the model from the XmlPullParser into the given sink.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
      MacroExecutionException - if there's a problem executing a macro
    • handleStartTag

      protected abstract void handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
      Goes through the possible start tags.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
      MacroExecutionException - if there's a problem executing a macro
    • handleEndTag

      protected abstract void handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
      Goes through the possible end tags.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
      MacroExecutionException - if there's a problem executing a macro
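      The division of labour between the parse loop and the abstract handleStartTag and handleEndTag hooks can be pictured with the sketch below. It is an analogy only: it uses the JDK's StAX XMLStreamReader instead of the Plexus XmlPullParser, and a plain list of strings instead of a Doxia Sink. MiniXmlParser, ParagraphParser and run are hypothetical names introduced for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical stand-in for the abstract parser: the loop is fixed,
// subclasses only decide what each start and end tag means.
abstract class MiniXmlParser {
    protected abstract void handleStartTag(XMLStreamReader reader, List<String> sink);

    protected abstract void handleEndTag(XMLStreamReader reader, List<String> sink);

    // Default text handling: non-empty text is emitted as a text event.
    protected void handleText(XMLStreamReader reader, List<String> sink) {
        String text = reader.getText();
        if (!text.isEmpty()) {
            sink.add("text:" + text);
        }
    }

    public final void parse(String xml, List<String> sink) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Deliver adjacent character data as a single event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    handleStartTag(reader, sink);
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    handleEndTag(reader, sink);
                    break;
                case XMLStreamConstants.CHARACTERS:
                    handleText(reader, sink);
                    break;
                default:
                    break; // other event types are ignored in this sketch
            }
        }
    }
}

// Hypothetical subclass that only understands <p> elements.
class ParagraphParser extends MiniXmlParser {
    @Override
    protected void handleStartTag(XMLStreamReader reader, List<String> sink) {
        if ("p".equals(reader.getLocalName())) {
            sink.add("paragraph");
        }
    }

    @Override
    protected void handleEndTag(XMLStreamReader reader, List<String> sink) {
        if ("p".equals(reader.getLocalName())) {
            sink.add("paragraph_");
        }
    }

    // Convenience entry point that wraps the checked exception.
    static List<String> run(String xml) {
        List<String> events = new ArrayList<>();
        try {
            new ParagraphParser().parse(xml, events);
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run("<p>Hello</p>"));
    }
}
```

      In a real subclass the hooks would receive the XmlPullParser and the Sink described above and would call Sink methods instead of appending strings.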
    • handleText

      protected void handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles text events.

      This is a default implementation: if the parser points to a non-empty text element, the text is emitted as a text event into the specified sink.

      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
    • handleCdsect

      protected void handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles CDATA sections.

      This is a default implementation: all data is emitted as text events into the specified sink.

      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
    • handleComment

      protected void handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles comments.

      This is a default implementation: all data is emitted as comment events into the specified sink.

      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
    • handleEntity

      protected void handleEntity(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles entities.

      This is a default implementation: all entities are resolved and emitted as text events into the specified sink, except:

      • the entities with names #160, nbsp and #x00A0 are emitted as nonBreakingSpace() events.
      Parameters:
      parser - A parser, not null.
      sink - the sink to receive the events. Not null.
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
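      The rule described for handleEntity can be sketched as follows. MiniSink and RecordingSink are hypothetical stand-ins for the two Sink callbacks mentioned above, and dispatch is an illustrative helper, not the Doxia implementation; it assumes the entity name is passed without the surrounding & and ; characters.

```java
import java.util.ArrayList;
import java.util.List;

class EntityDispatchDemo {
    // Hypothetical stand-in for the two Sink callbacks this rule involves.
    interface MiniSink {
        void text(String text);

        void nonBreakingSpace();
    }

    // Records events as strings so the outcome can be inspected.
    static class RecordingSink implements MiniSink {
        final List<String> events = new ArrayList<>();

        @Override
        public void text(String text) {
            events.add("text:" + text);
        }

        @Override
        public void nonBreakingSpace() {
            events.add("nonBreakingSpace");
        }
    }

    // name is the entity name, resolved is its replacement text.
    static void dispatch(String name, String resolved, MiniSink sink) {
        if ("#160".equals(name) || "nbsp".equals(name) || "#x00A0".equals(name)) {
            sink.nonBreakingSpace(); // the three non-breaking-space spellings
        } else {
            sink.text(resolved); // every other entity becomes plain text
        }
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        dispatch("nbsp", "\u00A0", sink);
        dispatch("amp", "&", sink);
        System.out.println(sink.events); // prints [nonBreakingSpace, text:&]
    }
}
```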
    • handleUnknown

      protected void handleUnknown(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, int type)
      Handles an unknown event.

      This is a default implementation: all events are emitted as unknown events into the specified sink.

      Parameters:
      parser - the parser to get the event from.
      sink - the sink to receive the event.
      type - the tag event type. This should be one of HtmlMarkup.TAG_TYPE_SIMPLE, HtmlMarkup.TAG_TYPE_START, HtmlMarkup.TAG_TYPE_END or HtmlMarkup.ENTITY_TYPE. It will be passed as the first element of the required parameters to the Sink.unknown(String, Object[], org.apache.maven.doxia.sink.SinkEventAttributes) method.
    • isIgnorableWhitespace

      protected boolean isIgnorableWhitespace()

      Returns whether whitespace is ignored.

      Returns:
      true if whitespace will be ignored, false otherwise.
      Since:
      1.1
    • setIgnorableWhitespace

      protected void setIgnorableWhitespace(boolean ignorable)
      Specifies that whitespace will be ignored. For example:
      <tr> <td/> </tr>
      is equivalent to
      <tr><td/></tr>
      Parameters:
      ignorable - true to ignore whitespace, false otherwise.
      Since:
      1.1
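      One way to picture the effect of this flag is as a filter over the text chunks found between tags: chunks that consist only of whitespace are dropped, everything else is kept. The sketch below illustrates that reading with plain Java collections; IgnorableWhitespaceDemo and filterText are hypothetical names, and the real parser decides this on parser events rather than on a list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IgnorableWhitespaceDemo {
    // Returns the chunks that survive: when ignorable is true, chunks made up
    // entirely of whitespace are skipped; all other chunks are kept unchanged.
    static List<String> filterText(List<String> chunks, boolean ignorable) {
        List<String> kept = new ArrayList<>();
        for (String chunk : chunks) {
            if (ignorable && chunk.trim().isEmpty()) {
                continue;
            }
            kept.add(chunk);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Text chunks as they might appear in "<tr> <td>cell</td> </tr>".
        List<String> chunks = Arrays.asList(" ", "cell", " ");
        System.out.println(filterText(chunks, true).size());  // prints 1
        System.out.println(filterText(chunks, false).size()); // prints 3
    }
}
```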
    • isCollapsibleWhitespace

      protected boolean isCollapsibleWhitespace()

      Returns whether whitespace in text is collapsed.

      Returns:
      true if whitespace in text will be collapsed, false otherwise.
      Since:
      1.1
    • setCollapsibleWhitespace

      protected void setCollapsibleWhitespace(boolean collapsible)
      Specifies that whitespace in text will be collapsed. For example:
      Text   Text
      is equivalent to
      Text Text
      Parameters:
      collapsible - true to allow collapsible text, false otherwise.
      Since:
      1.1
    • isTrimmableWhitespace

      protected boolean isTrimmableWhitespace()

      Returns whether text is trimmed.

      Returns:
      true if text will be trimmed, false otherwise.
      Since:
      1.1
    • setTrimmableWhitespace

      protected void setTrimmableWhitespace(boolean trimmable)
      Specifies that text will be trimmed. For example:
      <p> Text </p>
      is equivalent to
      <p>Text</p>
      Parameters:
      trimmable - true to allow trimmable text, false otherwise.
      Since:
      1.1
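      The trimming and collapsing equivalences given in the setter descriptions above can be reproduced with plain string operations. The following is a minimal sketch under the assumption that trimming removes leading and trailing whitespace and collapsing replaces each run of whitespace with one space; WhitespaceNormalizeDemo and normalize are hypothetical names, not Doxia API.

```java
class WhitespaceNormalizeDemo {
    // Applies the two text-level settings in turn:
    // trimmable strips leading/trailing whitespace,
    // collapsible squeezes each whitespace run into a single space.
    static String normalize(String text, boolean trimmable, boolean collapsible) {
        String result = text;
        if (trimmable) {
            result = result.trim();
        }
        if (collapsible) {
            result = result.replaceAll("\\s+", " ");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize(" Text ", true, false) + "]");      // prints [Text]
        System.out.println("[" + normalize("Text   Text", false, true) + "]"); // prints [Text Text]
    }
}
```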
    • getText

      protected String getText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser)

      Returns the text of the current parser event.

      Parameters:
      parser - A parser, not null.
      Returns:
      the value of XmlPullParser.getText(), adjusted according to the trimmable and collapsible whitespace settings.
      Since:
      1.1
    • getLocalEntities

      protected Map<String,String> getLocalEntities()
      Returns the entities defined in a local doctype. For example:
       <!DOCTYPE foo [
         <!ENTITY bar "&#x160;">
         <!ENTITY bar1 "&#x161;">
       ]>
       
      Returns:
      a map of the defined entities in a local doctype.
      Since:
      1.1
    • isValidate

      public boolean isValidate()

      Returns whether the XML content is validated.

      Returns:
      true if XML content will be validated, false otherwise.
      Since:
      1.1
    • setValidate

      public void setValidate(boolean validate)
      Specifies whether the XML content is validated.
      Parameters:
      validate - true to validate the XML content, false otherwise.
      Since:
      1.1
    • addEntity

      private void addEntity(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, String entityName, String entityValue) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Adds an entity given by entityName and entityValue to entities.
      The predefined XML entities &amp;, &lt;, &gt;, &quot; and &apos; are excluded.
      Parameters:
      parser - not null
      entityName - not null
      entityValue - not null
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if any
      See Also:
      • XmlPullParser.defineEntityReplacementText(String, String)
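      The exclusion rule can be sketched with a plain map. EntityRegistryDemo is a hypothetical class for illustration; it assumes the entity name is passed without & and ; and it stores the value in a map, whereas the method described above registers it with the parser through XmlPullParser.defineEntityReplacementText(String, String).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class EntityRegistryDemo {
    // The five entities predefined by the XML specification.
    static final Set<String> PREDEFINED =
            new HashSet<>(Arrays.asList("amp", "lt", "gt", "quot", "apos"));

    final Map<String, String> entities = new LinkedHashMap<>();

    // Stores the entity unless its name is one of the predefined XML entities.
    // Returns true when the entity was stored.
    boolean addEntity(String entityName, String entityValue) {
        if (PREDEFINED.contains(entityName)) {
            return false;
        }
        entities.put(entityName, entityValue);
        return true;
    }

    public static void main(String[] args) {
        EntityRegistryDemo registry = new EntityRegistryDemo();
        registry.addEntity("amp", "&");       // skipped: predefined
        registry.addEntity("bar", "&#x160;"); // stored
        System.out.println(registry.entities); // prints {bar=&#x160;}
    }
}
```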
    • addLocalEntities

      private void addLocalEntities(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, String text) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles entities defined in a local doctype, such as the following:
       <!DOCTYPE foo [
         <!ENTITY bar "&#x160;">
         <!ENTITY bar1 "&#x161;">
       ]>
       
      Parameters:
      parser - not null
      text - not null
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if any
    • addDTDEntities

      private void addDTDEntities(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, String text) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
      Handles entities defined in external doctypes, such as the following:
       <!DOCTYPE foo [
         <!-- These are the entity sets for ISO Latin 1 characters for the XHTML -->
         <!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN"
                "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
         %HTMLlat1;
       ]>
       
      Parameters:
      parser - not null
      text - not null
      Throws:
      org.codehaus.plexus.util.xml.pull.XmlPullParserException - if any