class MARC::UnsafeXMLWriter

UnsafeXMLWriter bypasses XML libraries such as REXML or Nokogiri and builds the XML document by concatenating strings. The output is not guaranteed to be valid if the MARC record being encoded is not valid, and the writer does not do things like entity expansion. It does escape text with Ruby's String#encode(xml: :text), and it is much faster: 4-5 times faster than Nokogiri and 15-20 times faster than the REXML version.
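To make the escaping step concrete, here is a minimal sketch of what String#encode(xml: :text) does to character data and how the result is spliced into a tag by concatenation. The helper names `escape_text` and `subfield_xml` are hypothetical and are not part of the library.

```ruby
# String#encode with the xml: option is Ruby core; nothing to require.

# Escape character data the same way the writer does for field values.
# Only &, < and > are replaced; double quotes are left untouched.
def escape_text(value)
  value.encode(xml: :text)
end

# Build one element by plain string concatenation (hypothetical helper).
def subfield_xml(code, value)
  +"<subfield code=\"#{code}\">" << escape_text(value) << "</subfield>"
end

puts escape_text("Smith & Sons <Publishers>")
# Smith &amp; Sons &lt;Publishers&gt;

puts subfield_xml("b", "Smith & Sons")
# <subfield code="b">Smith &amp; Sons</subfield>
```

Because only the text content is escaped, anything else that ends up in the output (tags, codes, indicators) is trusted as-is.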

Constants

COLLECTION
NS_ATTRS
NS_COLLECTION
NS_RECORD
RECORD
XML_HEADER

Public Class Methods

encode(record, include_namespace: true)

Take a record and turn it into a valid MARC-XML string. Note that this is an XML snippet, without an XML header or <collection> enclosure.

@param [MARC::Record] record The record to encode to XML
@return [String] The XML snippet of the record in MARC-XML

# File lib/marc/unsafe_xmlwriter.rb, line 58
def encode(record, include_namespace: true)
  xml = open_record(include_namespace: include_namespace).dup

  # MARCXML only allows alphanumerics or spaces in the leader
  lead = fix_leader(record.leader)

  xml << "<leader>" << lead.encode(xml: :text) << "</leader>"
  record.each do |f|
    if f.instance_of?(MARC::DataField)
      xml << open_datafield(f.tag, f.indicator1, f.indicator2)
      f.each do |sf|
        xml << open_subfield(sf.code) << sf.value.encode(xml: :text) << "</subfield>"
      end
      xml << "</datafield>"
    elsif f.instance_of?(MARC::ControlField)
      xml << open_controlfield(f.tag) << f.value.encode(xml: :text) << "</controlfield>"
    end
  end
  xml << "</record>"
  xml.force_encoding("utf-8")
end
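
The shape of the output can be illustrated without the marc gem. The following is a simplified sketch of the same concatenation approach, using Struct stand-ins (`ControlStub`, `DataStub`, `SubfieldStub`, all hypothetical) in place of MARC::ControlField, MARC::DataField and their subfields. Leader handling and namespaces are left out.

```ruby
# Stand-ins for the real MARC field classes (hypothetical, for illustration).
ControlStub  = Struct.new(:tag, :value)
SubfieldStub = Struct.new(:code, :value)
DataStub     = Struct.new(:tag, :indicator1, :indicator2, :subfields)

# Concatenate one field into MARC-XML, escaping only the text content.
def field_xml(field)
  case field
  when ControlStub
    +"<controlfield tag=\"#{field.tag}\">" << field.value.encode(xml: :text) << "</controlfield>"
  when DataStub
    xml = +"<datafield tag=\"#{field.tag}\" ind1=\"#{field.indicator1}\" ind2=\"#{field.indicator2}\">"
    field.subfields.each do |sf|
      xml << "<subfield code=\"#{sf.code}\">" << sf.value.encode(xml: :text) << "</subfield>"
    end
    xml << "</datafield>"
  else
    "" # any other field type is skipped, as in the listing above
  end
end

# Wrap all fields in an un-namespaced <record> element.
def record_xml(fields)
  fields.each_with_object(+"<record>") { |f, xml| xml << field_xml(f) } << "</record>"
end

fields = [
  ControlStub.new("001", "12345"),
  DataStub.new("245", "1", "0", [SubfieldStub.new("a", "Cats & dogs")])
]
puts record_xml(fields)
```

The result is a single line with no indentation or line breaks between elements, which is part of why this approach is cheap.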

open_collection(include_namespace: true)

Open the `collection` tag, with or without the namespace.

# File lib/marc/unsafe_xmlwriter.rb, line 26
def open_collection(include_namespace: true)
  if include_namespace
    NS_COLLECTION
  else
    COLLECTION
  end
end

open_controlfield(tag)

# File lib/marc/unsafe_xmlwriter.rb, line 88
def open_controlfield(tag)
  "<controlfield tag=\"#{tag}\">"
end

open_datafield(tag, ind1, ind2)

# File lib/marc/unsafe_xmlwriter.rb, line 80
def open_datafield(tag, ind1, ind2)
  "<datafield tag=\"#{tag}\" ind1=\"#{ind1}\" ind2=\"#{ind2}\">"
end
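
Note that the tag and indicators are interpolated into attribute values without escaping. For well-formed MARC data this is harmless, but it is one place where an invalid record can yield invalid XML. If that is a concern, a caller could escape attribute values with String#encode(xml: :attr), which also adds the surrounding double quotes. The `safe_open_datafield` helper below is a hypothetical variant and is not provided by the library.

```ruby
# Hypothetical variant of open_datafield that escapes attribute values.
# encode(xml: :attr) replaces &, <, > and " and wraps the result in quotes,
# so the template must not add its own quotes around each value.
def safe_open_datafield(tag, ind1, ind2)
  "<datafield tag=#{tag.encode(xml: :attr)} " \
    "ind1=#{ind1.encode(xml: :attr)} ind2=#{ind2.encode(xml: :attr)}>"
end

puts safe_open_datafield("245", "1", "0")
# <datafield tag="245" ind1="1" ind2="0">

puts safe_open_datafield("245", "1", "\"")
# <datafield tag="245" ind1="1" ind2="&quot;">
```

The extra encode calls cost some of the speed that motivates this class, so this is a trade-off rather than a free fix.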

open_record(include_namespace: true)

# File lib/marc/unsafe_xmlwriter.rb, line 34
def open_record(include_namespace: true)
  if include_namespace
    NS_RECORD
  else
    RECORD
  end
end

open_subfield(code)

# File lib/marc/unsafe_xmlwriter.rb, line 84
def open_subfield(code)
  "<subfield code=\"#{code}\">"
end

single_record_document(record, include_namespace: true)

Produce an XML string with a single record in a collection.

@param [MARC::Record] record
@param [Boolean] include_namespace Whether to namespace the resulting XML

# File lib/marc/unsafe_xmlwriter.rb, line 45
def single_record_document(record, include_namespace: true)
  xml = XML_HEADER.dup
  xml << open_collection(include_namespace: include_namespace)
  xml << encode(record, include_namespace: false)
  xml << "</collection>".freeze
  xml
end
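
The method encodes the record with include_namespace: false because the namespace, when requested, is declared once on the enclosing collection element and inherited by the record inside it. A minimal sketch of that wrapping step follows. The constant values here are assumptions for illustration (a standard XML declaration and the MARC21 slim namespace); the library's actual XML_HEADER and NS_COLLECTION are not shown on this page and may differ.

```ruby
# Assumed values; the library's real constants may carry more attributes.
SKETCH_HEADER     = %(<?xml version="1.0" encoding="UTF-8"?>)
SKETCH_NS_OPEN    = %(<collection xmlns="http://www.loc.gov/MARC21/slim">)
SKETCH_PLAIN_OPEN = "<collection>"

# Wrap an already-encoded, un-namespaced <record> snippet in a collection.
# The namespace is declared only on the outer element.
def wrap_in_collection(record_xml, include_namespace: true)
  xml = SKETCH_HEADER.dup
  xml << (include_namespace ? SKETCH_NS_OPEN : SKETCH_PLAIN_OPEN)
  xml << record_xml << "</collection>"
end

puts wrap_in_collection("<record><leader></leader></record>")
puts wrap_in_collection("<record><leader></leader></record>", include_namespace: false)
```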

Public Instance Methods

write(record)

Write the record to the target.

@param [MARC::Record] record

# File lib/marc/unsafe_xmlwriter.rb, line 20
def write(record)
  @fh.write(self.class.encode(record))
end