Extracting Data from a Fixed-Length Record to XML

To let a computer place student names into different forms, the names must first be in machine-readable form.  The new kid on the block for machine-readable data is XML.

What is XML?

XML is a specification that lets developers create a tagged-text vocabulary specific to a particular situation.  It is similar to HTML, except that the tags are not predefined.  The developer specifies his or her own tag set to describe the data to be manipulated.

Student Name Tag Set

The tag set defined for translating student names is:

      <document SchoolYear="2001-2002" Semester="Fall 2002">
        <Students>
          <Student>    !student may repeat more than once
            <FirstName>
            <LastName>
            <CallBy>
            <ContactAddress>
            <e-Mail>
            <ContactPhone>
            <Classes>
              <class>   !class may repeat more than once for any one student
                <Subject>
                <Period>
                <Teacher>
              </class>
            </Classes>
          </Student>
        </Students>
      </document>

Because the student names will be in this specific form, an XML parser can parse the XML document into a tree.  Then a transform engine (XSLT), with the help of a style sheet, can rearrange the tree form of the XML data into any other vocabulary required.  In my case the new vocabulary will be the several HTML forms I use at school where student names are required.  Two such forms that we looked at were the Class Participation sheet and the Mastery Record sheet.
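
To make the tree idea concrete, here is a minimal sketch in Python.  It is not the XSLT engine described above and not part of the original program.  It parses a small, made-up instance of the vocabulary with the standard library's xml.etree.ElementTree, walks the resulting tree to collect student names, and then writes them out as an HTML list as a stand-in for what a style sheet would do.  The sample data and the function names student_names and names_as_html_list are assumptions for illustration only.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated instance of the tag set; all values are made up.
SAMPLE = """\
<document SchoolYear="2001-2002" Semester="Fall 2002">
  <Students>
    <Student>
      <FirstName>Pat</FirstName>
      <LastName>Example</LastName>
      <CallBy>Pat</CallBy>
      <Classes>
        <class>
          <Subject>Algebra</Subject>
          <Period>3</Period>
          <Teacher>Sample</Teacher>
        </class>
      </Classes>
    </Student>
  </Students>
</document>
"""


def student_names(xml_text):
    """Parse the XML text into a tree and return (last, first) name pairs."""
    root = ET.fromstring(xml_text)
    names = []
    # iter() visits every <Student> element anywhere below the root.
    for student in root.iter("Student"):
        first = student.findtext("FirstName", default="")
        last = student.findtext("LastName", default="")
        names.append((last, first))
    return names


def names_as_html_list(xml_text):
    """Rearrange the names into a small HTML list (a stand-in for XSLT)."""
    items = [
        "  <li>%s, %s</li>" % (html.escape(last), html.escape(first))
        for last, first in student_names(xml_text)
    ]
    return "\n".join(["<ul>"] + items + ["</ul>"])


if __name__ == "__main__":
    print(names_as_html_list(SAMPLE))
```

A real style sheet would describe the target HTML form declaratively; the point here is only that once the names sit in a tree, any output layout can be produced from the same data.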

Getting Machine Readable Data

The first step in getting student names into a machine-readable format, without my typing them in, is to export the names from a school-maintained database.  The database that I have access to is the program that is used to print labels for mailers to the students or their parents.  It is that exported document that I transformed into the above XML vocabulary using REXX.

Exported File Format

The exported document consists of fixed-length fields (41 characters per field).  The names of the fields are:

  1. First Name
  2. Last Name
  3. Address
  4. City
  5. ZIP Code
  6. Home Phone
  7. Emergency Numbers
  8. Parents/Guardians
  9. Parent Emergency Numbers
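
The layout above can be illustrated with a short sketch.  It is written in Python, not the REXX of the actual program, and it assumes things the text does not state: that each record holds the nine fields back to back in the listed order, that short values are padded with spaces, and that there are no separators between fields.  The key names and the helper make_record are hypothetical.

```python
FIELD_WIDTH = 41  # characters per field, as stated in the text

# Hypothetical key names for the nine exported fields, in listed order.
FIELD_NAMES = [
    "FirstName",
    "LastName",
    "Address",
    "City",
    "ZipCode",
    "HomePhone",
    "EmergencyNumbers",
    "ParentsGuardians",
    "ParentEmergencyNumbers",
]


def split_record(record):
    """Cut one exported record into its 41-character fields."""
    fields = {}
    for index, name in enumerate(FIELD_NAMES):
        start = index * FIELD_WIDTH
        # Slice out the field and drop the assumed space padding.
        fields[name] = record[start:start + FIELD_WIDTH].strip()
    return fields


def make_record(values):
    """Build a sample record by padding each value to the field width."""
    return "".join(value.ljust(FIELD_WIDTH) for value in values)


if __name__ == "__main__":
    # Made-up sample data, not taken from any real export.
    sample = make_record([
        "Pat", "Example", "1 Main St", "Anytown", "00000",
        "555-0100", "555-0101", "Lee Example", "555-0102",
    ])
    for name, value in split_record(sample).items():
        print(name, "=", value)
```

With nine fields of 41 characters, a full record under these assumptions is 369 characters long, and field n starts at offset (n - 1) * 41.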

The REXX Code

The REXX code is broken down into two major sections:

  1. The parser of the Export file, and
  2. The emitter of XML tags and data

The Export file parser is nothing more than REXX's extraordinarily powerful PARSE instruction.  The result of the parse is temporarily stored in a stem variable.

The XML emitter is very much like the HTML emitters discussed earlier.
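
As a rough illustration of what such an emitter does, here is a minimal sketch in Python rather than REXX.  It takes already-parsed records and writes the tags of the student vocabulary around the data, escaping characters such as & that would otherwise break the XML.  The function name emit_students, the dictionary keys, and the mapping of exported fields onto tags (for example, joining address, city, and ZIP code into ContactAddress) are assumptions, not the original program's choices.  Tags with no matching exported field, such as CallBy and Classes, are left out.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def emit_students(records, school_year, semester):
    """Return an XML document (as text) for a list of parsed records."""
    lines = [
        "<document SchoolYear=%s Semester=%s>"
        % (quoteattr(school_year), quoteattr(semester)),
        "  <Students>",
    ]
    for rec in records:
        # Assumed mapping: the three address fields become one element.
        address = "%s, %s %s" % (rec["Address"], rec["City"], rec["ZipCode"])
        lines.append("    <Student>")
        lines.append("      <FirstName>%s</FirstName>" % escape(rec["FirstName"]))
        lines.append("      <LastName>%s</LastName>" % escape(rec["LastName"]))
        lines.append("      <ContactAddress>%s</ContactAddress>" % escape(address))
        lines.append("      <ContactPhone>%s</ContactPhone>" % escape(rec["HomePhone"]))
        lines.append("    </Student>")
    lines.append("  </Students>")
    lines.append("</document>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up record; the keys are hypothetical names for the exported fields.
    records = [{
        "FirstName": "Pat",
        "LastName": "Example",
        "Address": "1 Main St",
        "City": "Anytown",
        "ZipCode": "00000",
        "HomePhone": "555-0100",
    }]
    text = emit_students(records, "2001-2002", "Fall 2002")
    ET.fromstring(text)  # raises ParseError if the output is not well-formed
    print(text)
```

Parsing the emitted text back, as the last lines do, is a cheap check that every opened tag was closed and every special character was escaped.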

The result of the transformation is the XML file.  I'll show you that file by running the REXX program.