class MARC::UnsafeXMLWriter
UnsafeXMLWriter bypasses real XML handlers like REXML or Nokogiri and just concatenates strings to produce the XML document. It makes no guarantee of validity if the MARC record you're encoding isn't valid, and it won't do things like entity expansion, but it does escape text using Ruby's String#encode(xml: :text). It is also much faster: 4-5 times faster than using Nokogiri, and 15-20 times faster than the REXML version.
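The escaping step mentioned above can be tried on its own. This is a minimal sketch using only core Ruby; the sample string and the variable names are made up for illustration and are not part of the library.

```ruby
# String#encode(xml: :text) escapes &, < and > for use as element content.
# It does not escape quotes and does not add any surrounding markup.
raw = "Tom & Jerry <1940>"
escaped = raw.encode(xml: :text)
puts escaped

# Build an element by plain string concatenation, in the same spirit as the writer.
element = String.new
element << "<subfield code=\"a\">" << escaped << "</subfield>"
puts element
```

Because the markup is assembled by hand, anything that is not passed through `encode(xml: :text)` ends up in the output exactly as given.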
Constants
- COLLECTION
- NS_ATTRS
- NS_COLLECTION
- NS_RECORD
- RECORD
- XML_HEADER
Public Class Methods
Take a record and turn it into a valid MARC-XML string. Note that this is an XML snippet, without an XML header or <collection> enclosure.
@param [MARC::Record] record The record to encode to XML
@return [String] The XML snippet of the record in MARC-XML
# File lib/marc/unsafe_xmlwriter.rb, line 58
def encode(record, include_namespace: true)
  xml = open_record(include_namespace: include_namespace).dup

  # MARCXML only allows alphanumerics or spaces in the leader
  lead = fix_leader(record.leader)

  xml << "<leader>" << lead.encode(xml: :text) << "</leader>"
  record.each do |f|
    if f.instance_of?(MARC::DataField)
      xml << open_datafield(f.tag, f.indicator1, f.indicator2)
      f.each do |sf|
        xml << open_subfield(sf.code) << sf.value.encode(xml: :text) << "</subfield>"
      end
      xml << "</datafield>"
    elsif f.instance_of?(MARC::ControlField)
      xml << open_controlfield(f.tag) << f.value.encode(xml: :text) << "</controlfield>"
    end
  end
  xml << "</record>"
  xml.force_encoding("utf-8")
end
Open the `collection` tag, with or without the namespace
# File lib/marc/unsafe_xmlwriter.rb, line 26
def open_collection(include_namespace: true)
  if include_namespace
    NS_COLLECTION
  else
    COLLECTION
  end
end

# File lib/marc/unsafe_xmlwriter.rb, line 88
def open_controlfield(tag)
  "<controlfield tag=\"#{tag}\">"
end

# File lib/marc/unsafe_xmlwriter.rb, line 80
def open_datafield(tag, ind1, ind2)
  "<datafield tag=\"#{tag}\" ind1=\"#{ind1}\" ind2=\"#{ind2}\">"
end

# File lib/marc/unsafe_xmlwriter.rb, line 34
def open_record(include_namespace: true)
  if include_namespace
    NS_RECORD
  else
    RECORD
  end
end

# File lib/marc/unsafe_xmlwriter.rb, line 84
def open_subfield(code)
  "<subfield code=\"#{code}\">"
end
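The tag-opening helpers interpolate tags, indicators, and subfield codes directly into attribute values, so only element text goes through `encode(xml: :text)`. The sketch below shows what that means for an unusual subfield code. `escaped_open_subfield` is a hypothetical helper written for this comparison, not something the library provides; it relies on core Ruby's `String#encode(xml: :attr)`, which escapes the value and wraps it in double quotes.

```ruby
# The interpolating form, copied from the listing above.
def open_subfield(code)
  "<subfield code=\"#{code}\">"
end

# Hypothetical alternative (an assumption, not library code): escape the
# attribute value. encode(xml: :attr) escapes &, <, > and " and adds the
# surrounding double quotes itself.
def escaped_open_subfield(code)
  "<subfield code=#{code.to_s.encode(xml: :attr)}>"
end

plain = open_subfield("\"")
escaped = escaped_open_subfield("\"")

puts plain    # the stray quote is copied through, so the tag is not well-formed
puts escaped  # the quote is written as &quot;
```

For ordinary records, where tags and codes are plain alphanumerics, both forms produce the same output. The difference only shows up when a record carries unexpected characters in those positions.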
Produce an XML string with a single document in a collection
@param [MARC::Record] record
@param [Boolean] include_namespace Whether to namespace the resulting XML

# File lib/marc/unsafe_xmlwriter.rb, line 45
def single_record_document(record, include_namespace: true)
  xml = XML_HEADER.dup
  xml << open_collection(include_namespace: include_namespace)
  xml << encode(record, include_namespace: false)
  xml << "</collection>".freeze
  xml
end
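Note that single_record_document always encodes the inner record with include_namespace: false, so a namespace declaration can appear only on the enclosing collection element. The following sketch mimics that layout with stand-ins. The `SKETCH_*` constants and `sketch_*` methods are hypothetical, and their values are assumptions, since the real constant values are not shown in this document.

```ruby
# Hypothetical stand-ins for the class constants (assumed values).
SKETCH_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'
SKETCH_NS = 'xmlns="http://www.loc.gov/MARC21/slim"'

# Open an element with or without the namespace declaration.
def sketch_open(name, include_namespace:)
  include_namespace ? "<#{name} #{SKETCH_NS}>" : "<#{name}>"
end

# Same shape as single_record_document: header, collection, one record.
def sketch_document(record_body, include_namespace: true)
  xml = SKETCH_HEADER.dup
  xml << sketch_open("collection", include_namespace: include_namespace)
  # The record never repeats the declaration; it inherits it from the collection.
  xml << sketch_open("record", include_namespace: false)
  xml << record_body << "</record>"
  xml << "</collection>"
  xml
end

doc = sketch_document("<leader>00000nam a2200000 a 4500</leader>")
puts doc
```

In XML, a default namespace declared on a parent element applies to its children, which is why declaring it once on the outer element is enough.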
Public Instance Methods
Write the record to the target
@param [MARC::Record] record

# File lib/marc/unsafe_xmlwriter.rb, line 20
def write(record)
  @fh.write(self.class.encode(record))
end