Development of the Perl SAX 2.1 API is in progress. Comments should be sent to the Perl-XML mailing list (perl-xml@listserv.ActiveState.com).
Open issues: none
Change C1
XMLVersion and Encoding fields added to document locator
(as in Locator2 interface of SAX 2.0 Ext. 1.1)
Change C2
[resolves issue I4]
The definition of parse() has been unified across the Basic and
Advanced documents, and parse_uri() has been added. The new definition matches
the current XML::SAX::Base implementation (as of XML::SAX v0.12).
Change C3
[resolves issue I6]
Changes in attribute_decl(): ValueDefault has been renamed to Mode.
The new name is less confusing and corresponds to the SAX Java API.
Change C4
[resolves issue I12]
The following text has been added to define what the document locator is
supposed to return: "If possible, a Perl SAX driver should provide the line
and column position of the last character of the text associated with the
current document event. The first line is line 1; the first column in each
line is column 1."
Change C5
[resolves issue I11]
The spec defines that the stream argument passed to the parse_file()
method can be a file handle, a glob reference, or an IO::Handle subclass.
Change C6
[resolves issue I7]
A new section, "Namespace Processing", has been added. It describes
the behavior of a parser with namespace processing turned off:
Element/attribute hash keys are always present, but
NamespaceURI, Prefix and LocalName are undef. Attribute keys
are prefixed with {}. Namespace declarations are treated as ordinary
attributes. start_prefix_mapping and end_prefix_mapping are
never called.
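As an illustration of the structure described above, the following is a minimal sketch of the data a parser might pass to start_element when namespace processing is off. The helper name element_without_ns and the sample names are hypothetical; only the hash keys and the {} prefix come from the text.

```perl
use strict;
use warnings;

# Hypothetical helper: build start_element data for a parser whose
# namespace processing is turned off.
sub element_without_ns {
    my ($qname, %attrs) = @_;
    my %attributes;
    for my $aname (keys %attrs) {
        # Attribute keys are prefixed with {} (empty namespace part).
        $attributes{"{}$aname"} = {
            Name         => $aname,
            Value        => $attrs{$aname},
            NamespaceURI => undef,
            Prefix       => undef,
            LocalName    => undef,
        };
    }
    return {
        Name         => $qname,
        NamespaceURI => undef,
        Prefix       => undef,
        LocalName    => undef,
        Attributes   => \%attributes,
    };
}

my $el = element_without_ns(
    'pfx:lname',
    'pfx:attr'  => 'v',
    'xmlns:pfx' => 'http://example.org/ns',   # reported as an ordinary attribute
);
print join(', ', sort keys %{ $el->{Attributes} }), "\n";
```

The namespace declaration shows up among the attributes under the key {}xmlns:pfx, and no prefix mapping events would be generated for it.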
Change C7
[resolves issue I8]
The spec defines explicitly that values of callback argument hashrefs are
Unicode strings (scalars with UTF-8 flag on).
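A small sketch of what this means for a driver that reads raw bytes: the bytes are decoded into a Perl character string before being placed in the callback hash. The sample input and the Data key are used for illustration only; the core Encode module does the decoding.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as they might arrive from a UTF-8 encoded document (assumed input).
my $bytes = "caf\xC3\xA9";

# Decode to a Perl character string before passing it to a handler.
my $data = { Data => decode('UTF-8', $bytes) };

printf "length=%d flagged=%d\n",
    length $data->{Data},
    utf8::is_utf8($data->{Data}) ? 1 : 0;
```

After decoding, the string has four characters rather than five bytes, and utf8::is_utf8 reports the flag as set.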
Change C8
[resolves issue I1]
The section "Features" defines a read-only 'http://xmlns.perl.org/sax/version-2.1'
feature which returns 1 for a driver supporting Perl SAX 2.1.
Change C9
[resolves issue I2]
LexicalHandler and DeclHandler are set using the
parser options of the same name. Two read-only features
(http://xmlns.perl.org/sax/lexicalHandler, declHandler) return
0 or 1 to indicate whether the parser supports these two interfaces.
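The following sketch shows how an application might query the version and handler features. My::Driver is a hypothetical stand-in, not a real module; the full URI of the declHandler feature is assumed by analogy with lexicalHandler, and returning 0 for unknown features is a choice made for this sketch only.

```perl
use strict;
use warnings;

package My::Driver;   # hypothetical stand-in for a Perl SAX 2.1 driver

my %FEATURES = (
    'http://xmlns.perl.org/sax/version-2.1'    => 1,
    'http://xmlns.perl.org/sax/lexicalHandler' => 1,
    'http://xmlns.perl.org/sax/declHandler'    => 0,   # assumed URI
);

sub new { return bless {}, shift }

# Unknown features report 0 in this sketch; a real driver may throw instead.
sub get_feature {
    my ($self, $name) = @_;
    return $FEATURES{$name} // 0;
}

package main;

my $parser = My::Driver->new;
for my $f (qw(version-2.1 lexicalHandler declHandler)) {
    printf "%-15s %d\n", $f,
        $parser->get_feature("http://xmlns.perl.org/sax/$f");
}
```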
Change C10
[resolves issue I13]
The spec states explicitly that the value of the input source's Encoding property
takes priority over the encoding specified in an XML declaration.
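A minimal sketch of this rule follows. Both helper names are hypothetical, and the regular expression is a simplification that only recognizes a well-formed XML declaration at the very start of the string.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the encoding name out of an XML declaration.
sub declared_encoding {
    my ($xml) = @_;
    return $xml =~ /\A<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][\w.-]*)["']/
        ? $1
        : undef;
}

# The Encoding property of the input source wins when it is defined.
sub effective_encoding {
    my ($source, $xml) = @_;
    return $source->{Encoding} // declared_encoding($xml);
}

my $xml = '<?xml version="1.0" encoding="ISO-8859-1"?><doc/>';
print effective_encoding({ Encoding => 'UTF-8' }, $xml), "\n";
print effective_encoding({}, $xml), "\n";
```

The first call yields the value of the Encoding property; the second falls back to the encoding named in the declaration.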
Change C11
[resolves issue I9]
The following text has been added to the InputSource definition:
"String - The character or byte string of this input source.
The SAX parser will ignore this if there is also a byte stream or
a character stream specified, but it will use the string in preference
to opening a URI connection itself. If the UTF-8 flag of the string is turned
on, the effect is as if the Encoding property is set to UTF-8."
The order in which the properties are checked agrees with the current XML::SAX::Base
implementation (v0.12): CharacterStream, ByteStream, String, SystemId.
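The checking order can be sketched as a simple loop over the property names. The function name pick_source_property and the sample values are hypothetical; this is not the XML::SAX::Base code, only an illustration of the stated order.

```perl
use strict;
use warnings;

# Return the name of the first input source property that is defined,
# checked in the order stated above.
sub pick_source_property {
    my ($source) = @_;
    for my $prop (qw(CharacterStream ByteStream String SystemId)) {
        return $prop if defined $source->{$prop};
    }
    return;   # nothing usable was supplied
}

my $chosen = pick_source_property({
    String   => '<doc/>',
    SystemId => 'file:///tmp/doc.xml',   # ignored: String comes first
});
print "$chosen\n";
```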
Change C12
[resolves issue I10]
In addition to streams, the parse_file() method also accepts system paths
to prevent possible confusion arising from the name of this method.
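One way an implementation might tell the accepted parse_file() argument types apart is sketched below, using only the core Scalar::Util and IO::Handle modules. The helper name and the returned labels are hypothetical, and treating any other plain scalar as a system path is an assumption of this sketch.

```perl
use strict;
use warnings;
use IO::Handle;
use Scalar::Util qw(blessed openhandle reftype);

# Hypothetical helper: report which kind of parse_file() argument was given.
sub classify_parse_file_arg {
    my ($arg) = @_;
    return 'unsupported'         if !defined $arg;
    return 'IO::Handle subclass' if blessed($arg) && $arg->isa('IO::Handle');
    return 'glob reference'      if ref($arg) && reftype($arg) eq 'GLOB';
    return 'file handle'         if defined openhandle($arg);
    return 'system path'         if !ref $arg;   # any other plain scalar
    return 'unsupported';
}

my @args = (IO::Handle->new, \*STDIN, *STDIN, 'doc.xml', [1, 2]);
print classify_parse_file_arg($_), "\n" for @args;
```

The blessed check comes first because IO::Handle objects are themselves glob references and would otherwise be reported as such.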
Change C13
The Features section has changed. All features have values of 1 or 0; the
section lists some common features.
Issue I1
status: closed, resolution: applied [resolved as change C8]
A parser should advertise the SAX version it supports. This could be
a new method ($parser->get_sax_version()) or a read-only feature
(http://xmlns.perl.org/sax/version). The feature should also be
introduced retroactively to Perl SAX 2.0 to distinguish between 1.0, 2.0
and 2.1 drivers.
Suggestion: the read-only feature.
Issue I2
status: closed, resolution: applied [resolved as change C9]
Currently, the "http://xml.org/sax/handlers/LexicalHandler" feature on the parser
must be set to the object that is to receive lexical events. If
the reader does not support lexical events, it will throw
an XML::SAX::Exception::NotRecognized or an
XML::SAX::Exception::NotSupported when you attempt to register the
handler. DeclHandler works in the same way. In practice this is only theory:
XML::SAX::Base doesn't currently implement it.
This approach is very different from the common PerlSAX mechanism:
look for a specific handler, then look for a handler method on the
default handler, and ignore the callback when neither is found. It would be more
'perlish' to apply this simple mechanism to LexicalHandler and
DeclHandler too. If we want these two to be extension handlers (compliant
2.1 parsers are not required to support them), there could be read-only
features to let applications know whether the extension handlers are supported
or not (http://xmlns.perl.org/sax/LexicalHandler, DeclHandler).
Suggestion: LexicalHandler and DeclHandler are set using the
parser options with the same name. The two read-only features,
(http://xmlns.perl.org/sax/LexicalHandler, DeclHandler) return
0 or 1 to indicate whether the parser supports these two interfaces.
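The common PerlSAX lookup described in this issue (specific handler first, then the default handler, otherwise ignore the callback) can be sketched as follows. My::Lexical, find_handler and fire are hypothetical names used for illustration; this is not the XML::SAX::Base implementation.

```perl
use strict;
use warnings;

package My::Lexical;      # hypothetical handler that only knows comments
sub new     { return bless { seen => [] }, shift }
sub comment { my ($self, $c) = @_; push @{ $self->{seen} }, $c->{Data}; return }

package main;

# Find the object that should receive $event: the specific handler first,
# then the default Handler; return nothing if neither implements it.
sub find_handler {
    my ($opts, $specific, $event) = @_;
    for my $h ($opts->{$specific}, $opts->{Handler}) {
        return $h if defined $h && $h->can($event);
    }
    return;
}

# Deliver the event if a handler was found; otherwise ignore it silently.
sub fire {
    my ($opts, $specific, $event, $data) = @_;
    my $h = find_handler($opts, $specific, $event) or return 0;
    $h->$event($data);
    return 1;
}

my $lex  = My::Lexical->new;
my %opts = (LexicalHandler => $lex);

fire(\%opts, 'LexicalHandler', 'comment',     { Data => ' note ' });
fire(\%opts, 'LexicalHandler', 'start_cdata', {});   # no such method: ignored
print scalar(@{ $lex->{seen} }), "\n";
```

With this scheme no exception is needed when a handler lacks a callback, which is the behavior the suggestion above aims for.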
Issue I3
status: closed, resolution: denied
SAX 2.0 Ext. 1.1 has a new Attributes2 interface which extends
attributes with new properties (Declared, Specified) to distinguish
between attributes specified in an XML document and those declared in
the DTD. This could be introduced into Perl SAX 2.1 as an optional
extension (advertised by a feature).
Suggestion: not to apply.
Issue I4
status: closed, resolution: applied [resolved as change C2]
The parse() method is defined in different ways in the Basic and
Advanced documents.
Proposed solution: to add an explicit parse_uri() method; parse()
would then call either parse_uri(), parse_string(), or parse_stream()
based on InputSource.
Issue I5
status: closed, resolution: denied
All hash references could be replaced with blessed objects. The benefits
of such a change would need to be clarified.
Suggestion: not to apply.
Issue I6
status: closed, resolution: applied [resolved as change C3]
Changes in attribute_decl
eName, aName, Type, Mode (was ValueDefault), Value
Issue I7
status: closed, resolution: applied [resolved as change C6]
The effect of turning off namespace processing is unclear in Perl SAX.
The spec should state that all namespace-related processing is skipped,
and no namespace-related information is made available.
Suggestion: All node keys are always present, but NamespaceURI, Prefix and
LocalName are undef. Attribute keys are prefixed with {} (for example
{}pfx:lname). NS declarations are treated as common attributes.
Issue I8
status: closed, resolution: applied [resolved as change C7]
Perl SAX should require explicitly all event data to be Unicode strings
(to have the UTF-8 flag on).
Issue I9
status: closed, resolution: applied [resolved as change C11]
Input sources don't have a String property defined, even though the
parse_string() method exists and uses it. The current XML::SAX::Base version
(0.12) already implements the String property. The properties are checked in
this order: CharacterStream, ByteStream, String, SystemId.
Suggested solution: To add the following paragraph:
String - The character or byte string for this input source.
If there is a string specified, the SAX parser will ignore any byte
or character stream and will not attempt to open a URI connection to
the system identifier.
If the UTF-8 flag of the string is turned on, the effect is as if
the Encoding property is set to UTF-8.
The order of properties to be checked has to be determined.
Issue I10
status: closed, resolution: applied [resolved as change C12]
parse_file() is meant to accept streams in Perl SAX, while other
modules (such as XML::LibXML and XML::Parser) accept system paths for
this method.
Suggested solution:
To change parse_file() so that it accepts a system path in addition to the
currently supported types.
Issue I11
status: closed, resolution: applied [resolved as change C5]
The specification should state explicitly what is meant by "streams" and
which types are supported by parse_file(): file handles,
glob references, IO::Handle subclasses, ...
Suggested solution:
To support all of the above-mentioned types.
Issue I12
status: closed, resolution: applied [resolved as change C4]
The spec should be more explicit about what a document locator is
supposed to return. It could for example read:
If possible, a Perl SAX driver should provide the line and column position
of the first character after the text associated with the document event.
The first line is line 1; the first column in each line is column 1.
Issue I13
status: closed, resolution: applied [resolved as change C10]
The value of the Encoding property of input sources should be able to override
the encoding specified in the XML declaration. If so, the spec should state this
explicitly.
Suggestion: to add the following text: If available, the value
of Encoding property has a higher priority than encoding
specified in an XML declaration.
Issue I14
status: closed, resolution: denied
Parsers report an error when the parse() method is called and no input
source (Source option) is provided, regardless of the method's argument.
Suggestion: not to apply. It will be sufficient to provide a better error
message from XML::SAX::Base (e.g. to suggest that one most likely wants to
call parse_uri() or parse_string() instead of parse()).