Professional XML Databases, part 6

Chapter 12: Flat Files

Delimited

First, let's see how to read a delimited file and save the results to an XML document. We've already seen examples of the delimited file and of the XML document that we'll be transforming. In this example, the delimited file is called ch12_ex1.txt:

    Kevin Williams,744 Evergreen Terrace,Springfield,KY,12345,12/01/2000,12/04/2000,1,blue 2 in. grommet,17,0.10,silver 3 in. widget,22,0.20,,0,0.00,,0,0.00,,0,0.00
    Homer Simpson,742 Evergreen Terrace,Springfield,KY,12345,12/02/2000,12/05/2000,2,red 1 in. sprocket,13,0.30,blue 2 in. grommet,11,0.10,,0,0.00,,0,0.00,,0,0.00

This maps to the file ch12_ex1.xml:

    <InvoiceData>
       <Invoice customerIDREF="c1" orderDate="12/01/2000"
                shipDate="12/04/2000" shipMethod="UPS">
          <LineItem partIDREF="p1" quantity="17" price="0.10" />
          <LineItem partIDREF="p2" quantity="22" price="0.20" />
       </Invoice>
       <Invoice customerIDREF="c2" orderDate="12/02/2000"
                shipDate="12/05/2000" shipMethod="USPS">
          <LineItem partIDREF="p3" quantity="13" price="0.30" />
          <LineItem partIDREF="p1" quantity="11" price="0.10" />
       </Invoice>
       <Part partID="p1" name="grommet" size="2 in." color="blue" />
       <Part partID="p2" name="widget" size="3 in." color="silver" />
       <Part partID="p3" name="sprocket" size="1 in." color="red" />
       <Customer customerID="c1" name="Kevin Williams"
                 address="744 Evergreen Terrace" city="Springfield"
                 state="KY" postalCode="12345" />
       <Customer customerID="c2" name="Homer Simpson"
                 address="742 Evergreen Terrace" city="Springfield"
                 state="KY" postalCode="12345" />
    </InvoiceData>

First, we need a map from the delimited file to the XML document. We built one a couple of pages earlier, when we discussed how transform maps are created. Armed with this information, we can use the following VBScript code to open the flat file, break it apart, and use the DOM to construct the XML equivalent.
The entire listing can be found in the file ch12_ex1.vbs, but we'll analyze it here section by section:

    Dim fso, ts, sLine
    Dim el, dom, root
    Dim sField(23)
    Dim sThisName, sThisSize, sThisColor
    Dim sSize, sColor, sName, sAddress
    Dim iLineItem
    Dim invoiceBucket, partBucket, customerBucket
    Dim li, nl, iCust, iPart, cust, part, sDelimit

    iCust = 1
    iPart = 1
    sDelimit = Chr(44)   ' This sets the delimiter to be a comma

We're using the iCust and iPart variables to keep track of the customers and parts that we create, so that we can generate a unique ID for each element created.

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set dom = CreateObject("Microsoft.XMLDOM")
    Set root = dom.createElement("InvoiceData")
    dom.appendChild root

We create a FileSystemObject for reading the flat file and writing the XML output, and a DOM object for building the XML output. We also create the root element for the DOM output and add it to the document tree.

    Set invoiceBucket = dom.createDocumentFragment()
    Set partBucket = dom.createDocumentFragment()
    Set customerBucket = dom.createDocumentFragment()

This is a common technique when building up an XML document with ordered elements. As we parse the invoices from the flat file, we'll be generating Part and Customer elements as we need them. The variables above act as buckets that hold the invoice, part, and customer information in three separate places, so that each group can be output consecutively. Sorting the elements into groups with XMLDocumentFragment objects while the flat file is being parsed makes the final output easy to build, because our output structure requires us to write every Invoice element, then all the Part elements, and then all the Customer elements to the document, in that order.
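The bucket technique described above is not specific to VBScript or MSXML. As a rough illustration only, here is a Python sketch using the standard library's xml.dom.minidom, which also provides createDocumentFragment(). The function name build_grouped_document and the (kind, attrs) input shape are assumptions made for this example; they are not part of the chapter's listing.

```python
from xml.dom.minidom import getDOMImplementation

GROUP_ORDER = ("Invoice", "Part", "Customer")  # required output order


def build_grouped_document(records):
    """Build an InvoiceData document with its children grouped by type.

    records: iterable of (kind, attrs) pairs in any order, where kind is
    one of GROUP_ORDER and attrs is a dict of attribute names to values
    (a hypothetical input shape chosen for this sketch).
    """
    dom = getDOMImplementation().createDocument(None, "InvoiceData", None)
    root = dom.documentElement

    # One document fragment ("bucket") per element type.
    buckets = {kind: dom.createDocumentFragment() for kind in GROUP_ORDER}

    for kind, attrs in records:
        el = dom.createElement(kind)
        for name, value in attrs.items():
            el.setAttribute(name, value)
        buckets[kind].appendChild(el)

    # Appending a fragment moves its children into the target node, so
    # the root ends up with all Invoices, then Parts, then Customers.
    for kind in GROUP_ORDER:
        root.appendChild(buckets[kind])
    return dom
```

Elements can arrive in any order; only the final three appendChild calls decide the grouping, which is why the technique suits a single pass over the flat file.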
    ' The listing opens invoicedelim.txt; the sample file shown earlier
    ' is named ch12_ex1.txt, so adjust the name to match your input.
    Set ts = fso.OpenTextFile("invoicedelim.txt")
    Do While ts.AtEndOfStream <> True
       s = ts.ReadLine
       For iField = 1 To 22
          sField(iField) = Left(s, InStr(s, sDelimit) - 1)
          s = Mid(s, InStr(s, sDelimit) + 1)
       Next
       sField(23) = s

We know we have a comma-delimited file, so we break each line apart into fields based on the delimiting character we expect. Naturally, if we can't do this (for example, if a record contains too few or too many delimiters), we would add error handling to report the problem to the user. At this point, the sField() array contains all of the fields found in one record of the flat file.

       Set el = dom.createElement("Invoice")

Since each record in our flat file corresponds to one invoice, we can create that Invoice element now. Later, we add it to our Invoice document fragment so that we can write all the invoices out as a group to the main document at the end of the process.

       ' check to see if we have this customer yet
       el.setAttribute "customerIDREF", "NOTFOUND"
       Set nl = customerBucket.childNodes
       For iNode = 0 To nl.length - 1
          sName = nl.item(iNode).getAttribute("name")
          sAddress = nl.item(iNode).getAttribute("address")
          If sName = sField(1) And sAddress = sField(2) Then
             ' we presume we have this one already
             el.setAttribute "customerIDREF", _
                nl.item(iNode).getAttribute("customerID")
          End If
       Next

Here, we're examining all the customers in our Customer document fragment to see if we already have the customer referenced in this invoice. Since our XML document normalizes customers together, we want to reuse a customer that matches the customer on this invoice if possible. For the purposes of this analysis, we assume that a customer with the same name and address is a match. If we find a match for the customer, we simply set the customerIDREF attribute of the invoice to point to it.
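The listing leaves the delimiter-count check as something "we would add". As a hedged sketch of what that check and the customer lookup could look like, here is a Python analogue. The names split_record and find_customer_id, and the dictionary keyed on (name, address), are assumptions made for this illustration and do not come from the chapter.

```python
EXPECTED_FIELDS = 23  # field count of the chapter's record layout


def split_record(line, delimiter=","):
    """Split one delimited record and verify that it has 23 fields."""
    fields = line.rstrip("\r\n").split(delimiter)
    if len(fields) != EXPECTED_FIELDS:
        raise ValueError(
            f"expected {EXPECTED_FIELDS} fields, found {len(fields)}")
    return fields


def find_customer_id(customers, name, address):
    """Return the ID of a known customer, or None if there is no match.

    customers: dict mapping (name, address) -> customerID (hypothetical
    shape). As in the text, equal name and address count as a match.
    """
    return customers.get((name, address))
```

A dictionary lookup replaces the linear scan over the fragment's child nodes; the matching rule (same name and same address) is the one the text states.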
       If el.getAttribute("customerIDREF") = "NOTFOUND" Then
          ' we need to create a new customer
          Set cust = dom.createElement("Customer")
          cust.setAttribute "customerID", "CUST" & iCust
          cust.setAttribute "name", sField(1)
          cust.setAttribute "address", sField(2)
          cust.setAttribute "city", sField(3)
          cust.setAttribute "state", sField(4)
          cust.setAttribute "postalCode", sField(5)
          customerBucket.appendChild cust
          el.setAttribute "customerIDREF", "CUST" & iCust
          iCust = iCust + 1
       End If

If we didn't find the customer, we create one and add it to the Customer document fragment, assigning a new ID to the element. We then reference that ID from the customerIDREF attribute of the Invoice element we are creating.

       el.setAttribute "orderDate", sField(6)
       el.setAttribute "shipDate", sField(7)
       If sField(8) = 1 Then el.setAttribute "shipMethod", "USPS"
       If sField(8) = 2 Then el.setAttribute "shipMethod", "UPS"
       If sField(8) = 3 Then el.setAttribute "shipMethod", "FedEx"
       invoiceBucket.appendChild el

We continue to set attributes on the Invoice element, translating the values provided in the flat file into their XML equivalents (elements or attributes, depending on the design) as necessary. Once we've done this, we append the element to our Invoice document fragment.

       For iLineItem = 1 To 5
          If sField(6 + iLineItem * 3) > "" Then
             ' this line item exists

Here, we iterate through each of the three-field sets that represent a line item. We know that if the description for the item is present, the line item exists and needs to be represented in our XML target.

             Set li = dom.createElement("LineItem")
             li.setAttribute "quantity", sField(6 + iLineItem * 3 + 1)
             li.setAttribute "price", sField(6 + iLineItem * 3 + 2)

We create our LineItem element and set its attributes based on the contents of the flat file.
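The index arithmetic in the loop above (description at 6 + iLineItem * 3, then quantity and price) is easy to get wrong by one. The following Python sketch shows the same walk over the three-field sets, together with the "color size name" split that the listing performs next. The helper names line_items and split_description are assumptions for this example, and the indexes are 0-based here, so sField(9) in the listing corresponds to fields[8].

```python
def split_description(description):
    """Split 'color size name', where the size may contain spaces.

    The first word is the color, the last word is the name, and
    everything in between is the size (for example '2 in.').
    """
    words = description.split()
    if len(words) < 3:
        raise ValueError(f"cannot split description: {description!r}")
    return {"color": words[0],
            "size": " ".join(words[1:-1]),
            "name": words[-1]}


def line_items(fields, first=8, count=5):
    """Yield (description, quantity, price) for each populated line item.

    fields: the 23 fields of one record, 0-based. Line item i occupies
    indexes first + 3*i .. first + 3*i + 2.
    """
    for i in range(count):
        start = first + 3 * i
        description, quantity, price = fields[start:start + 3]
        if description.strip():  # a blank description means no line item
            yield description, quantity, price
```

For the first sample record this yields two line items, and split_description("blue 2 in. grommet") gives color "blue", size "2 in." and name "grommet".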
             ' break apart the description field
             sWork = sField(6 + iLineItem * 3)
             sThisColor = Left(sWork, InStr(sWork, " ") - 1)
             sWork = Mid(sWork, InStr(sWork, " ") + 1)
             sThisSize = ""
             While InStr(sWork, " ") > 0
                sThisSize = sThisSize + Left(sWork, InStr(sWork, " "))
                sWork = Mid(sWork, InStr(sWork, " ") + 1)
             Wend
             sThisSize = Left(sThisSize, Len(sThisSize) - 1)
             sThisName = sWork

Here, we've decomposed the description of the part provided in the flat file into the name, size, and color data points our XML document needs.

             Set nl = partBucket.childNodes
             li.setAttribute "partIDREF", "NOTFOUND"
             For iNode = 0 To nl.length - 1
                sName = nl.item(iNode).getAttribute("name")
                sSize = nl.item(iNode).getAttribute("size")
                sColor = nl.item(iNode).getAttribute("color")
                If sThisName = sName And sThisSize = sSize And _
                   sThisColor = sColor Then
                   ' we presume we have this one already
                   li.setAttribute "partIDREF", _
                      nl.item(iNode).getAttribute("partID")
                End If
             Next
             If li.getAttribute("partIDREF") = "NOTFOUND" Then
                ' we need to create a new part
                Set part = dom.createElement("Part")
                part.setAttribute "partID", "PART" & iPart
                part.setAttribute "name", sThisName
                part.setAttribute "size", sThisSize
                part.setAttribute "color", sThisColor
                partBucket.appendChild part
                li.setAttribute "partIDREF", "PART" & iPart
                iPart = iPart + 1
             End If

This code is similar to the corresponding code for Customer: we check whether the part is already in our part list, and if not, we add it with a new ID. For parts, we assume that a part sharing the same name, size, and color with a part already created is the same part.

             el.appendChild li

Finally, we add the line item to the Invoice element we created earlier.

          End If
       Next
    Loop
    ts.Close
    root.appendChild invoiceBucket
    root.appendChild partBucket
    root.appendChild customerBucket

Once the entire flat file has been processed and we have created all the appropriate elements, we need to add them back to the document we are creating.
We can do this by appending the document fragments for each element type to the root element of the XML document.

    Set ts = fso.CreateTextFile("ch12_ex1.xml", True)
    ts.Write dom.xml
    ts.Close
    Set ts = Nothing

Finally, we flush the XML to the output file, and we're done. The output of the preceding script, when run against the sample delimited file we saw earlier, should be equivalent to the XML file shown earlier, without the whitespace. Note that the script as listed generates IDs of the form CUST1 and PART1 rather than c1 and p1, and maps ship-method code 1 to USPS and 2 to UPS, so those values will differ from the sample document.

Fixed-width

You'll recall the fixed-width example we looked at earlier. Below is a similar example, but with all of the information from the last example included and more realistic spacing between fields.

    Kevin Williams                744 Evergreen Terrace         Springfield         KY12345     12/01/200012/04/2000UPS  blue 2 in. grommet            0001700000.10silver 3 in. widget           0002200000.20                              0000000000.00                              0000000000.00                              0000000000.00
    Homer Simpson                 742 Evergreen Terrace         Springfield         KY12345     12/02/200012/05/2000USPS red 1 in. sprocket            0001300000.30blue 2 in. grommet            0001100000.10                              0000000000.00                              0000000000.00                              0000000000.00

The first thing we need to do is to map the contents of this file. Here is the map:

    Value               Details
    record[N].field1    Data type: String. Position: 1-30. Description: The name of the customer on invoice N.
    record[N].field2    Data type: String. Position: 31-60. Description: The address of the customer on invoice N.
    record[N].field3    Data type: String. Position: 61-80. Description: The city of the customer on invoice N.
    record[N].field4    Data type: String. Position: 81-82. Description: The state of the customer on invoice N.
    record[N].field5    Data type: String. Position: 83-92. Description: The postal code of the customer on invoice N.
    record[N].field6    Data type: Datetime. Position: 93-102. Format: MM/DD/YYYY. Description: The order date for invoice N.
    record[N].field7    Data type: Datetime. Position: 103-112. Format: MM/DD/YYYY. Description: The ship date for invoice N.
    record[N].field8    Data type: Enumerated value. Position: 113-117. Values: UPS: United Parcel Service; USPS: United States Postal Service; FedEx: Federal Express. Description: The shipping method used to ship the parts ordered on invoice N.
    record[N].field9    Data type: String. Position: 118-147. Format: No more than 30 characters. Description: The description of the part ordered in the first line item of invoice N, in the form color size name.
    record[N].field10   Data type: Numeric. Position: 148-152. Format: #####. Description: The quantity of the part ordered in the first line item of invoice N.
    record[N].field11   Data type: Numeric. Position: 153-160. Format: #####.##. Description: The price of the part ordered in the first line item of invoice N.
    record[N].field12   Data type: String. Position: 161-190. Description: The description of the part ordered in the second line item of invoice N, in the form color size name.
    record[N].field13   Data type: Numeric. Position: 191-195. Format: #####. Description: The quantity of the part ordered in the second line item of invoice N.
    record[N].field14   Data type: Numeric. Position: 196-203. Format: #####.##. Description: The price of the part ordered in the second line item of invoice N.
    record[N].field15   Data type: String. Position: 204-233. Format: No more than 30 characters. Description: The description of the part ordered in the third line item of invoice N, in the form color size name.
    record[N].field16   Data type: Numeric. Position: 234-238. Format: #####. Description: The quantity of the part ordered in the third line item of invoice N.
    record[N].field17   Data type: Numeric. Position: 239-246. Format: #####.##. Description: The price of the part ordered in the third line item of invoice N.
    record[N].field18   Data type: String. Position: 247-276. Description: The description of the part ordered in the fourth line item of invoice N, in the form color size name.
    record[N].field19   Data type: Numeric. Position: 277-281. Format: #####. Description: The quantity of the part ordered in the fourth line item of invoice N.
    record[N].field20   Data type: Numeric. Position: 282-289. Format: #####.##. Description: The price of the part ordered in the fourth line item of invoice N.
    record[N].field21   Data type: String. Position: 290-319. Description: The description of the part ordered in the fifth line item of invoice N, in the form color size name.
    record[N].field22   Data type: Numeric. Position: 320-324. Format: #####. Description: The quantity of the part ordered in the fifth line item of invoice N.
    record[N].field23   Data type: Numeric. Position: 325-332. Format: #####.##. Description: The price of the part ordered in the fifth line item of invoice N.

Next, we map the fields back to the XML fields:

    Source              Target                                              Comments
    record[N].field1    InvoiceData.Invoice[N]->Customer.Name               Create a new customer and link back to it from the Invoice record created
    record[N].field2    InvoiceData.Invoice[N]->Customer.Address
    record[N].field3    InvoiceData.Invoice[N]->Customer.City
    record[N].field4    InvoiceData.Invoice[N]->Customer.State
    record[N].field5    InvoiceData.Invoice[N]->Customer.PostalCode
    record[N].field6    InvoiceData.Invoice[N].orderDate
    record[N].field7    InvoiceData.Invoice[N].shipDate
    record[N].field8    InvoiceData.Invoice[N].shipMethod
    record[N].field9    InvoiceData.Invoice[N].LineItem[1]->Part.Name,
                        InvoiceData.Invoice[N].LineItem[1]->Part.Size,
                        InvoiceData.Invoice[N].LineItem[1]->Part.Color
    record[N].field10   InvoiceData.Invoice[N].LineItem[1].quantity
    record[N].field11   InvoiceData.Invoice[N].LineItem[1].price
    record[N].field12   InvoiceData.Invoice[N].LineItem[2]->Part.Name,      Blank (all spaces) if no line 2 appears on the invoice.
                        InvoiceData.Invoice[N].LineItem[2]->Part.Size,
                        InvoiceData.Invoice[N].LineItem[2]->Part.Color
    record[N].field13   InvoiceData.Invoice[N].LineItem[2].quantity
    record[N].field14   InvoiceData.Invoice[N].LineItem[2].price
    record[N].field15   InvoiceData.Invoice[N].LineItem[3]->Part.Name,      Blank (all spaces) if no line 3 appears on the invoice.
                        InvoiceData.Invoice[N].LineItem[3]->Part.Size,
                        InvoiceData.Invoice[N].LineItem[3]->Part.Color
    record[N].field16   InvoiceData.Invoice[N].LineItem[3].quantity
    record[N].field17   InvoiceData.Invoice[N].LineItem[3].price
    record[N].field18   InvoiceData.Invoice[N].LineItem[4]->Part.Name,      Blank (all spaces) if no line 4 appears on the invoice.
                        InvoiceData.Invoice[N].LineItem[4]->Part.Size,
                        InvoiceData.Invoice[N].LineItem[4]->Part.Color

[...]

    ... trim(mid(s, 83, 10))
    sField(6) = trim(mid(s, 93, 10))
    sField(7) = trim(mid(s, 103, 10))
    sField(8) = trim(mid(s, 113, 5))
    sField(9) = trim(mid(s, 118, 30))
    sField(10) = trim(mid(s, 148, 5))
    sField(11) = trim(mid(s, 153, 8))
    sField(12) = trim(mid(s, 161, 30))
    sField(13) = trim(mid(s, 191, 5))
    sField(14) = trim(mid(s, 196, 8))
    sField(15) = trim(mid(s, 204, 30))
    sField(16) = trim(mid(s, 234, 5))
    ...
       el.setAttribute "orderDate", sField(6)
       el.setAttribute "shipDate", sField(7)
       el.setAttribute "shipMethod", sField(8)
       invoiceBucket.appendChild el
       For iLineItem = 1 To 5
          If trim(sField(6 + iLineItem * 3)) > "" Then
             ' this line item exists
             Set li = dom.createElement("LineItem")
             li.setAttribute "quantity", sField(6 + iLineItem * 3 + 1)
             li.setAttribute "price", sField(6 + iLineItem * 3 + 2)
             ' break apart...
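The positions in the field map above translate directly into string slices. As an illustration only, here is a Python sketch that builds the 23-entry layout from the map (the five line items repeat every 43 characters, starting at position 118) and uses it both to parse and to produce a record. FIELD_LAYOUT, parse_fixed_width and format_fixed_width are names assumed for this example; they are not taken from the chapter's ch12_ex2 listing.

```python
# (start, length) pairs, with 1-based starts copied from the field map.
FIELD_LAYOUT = [(1, 30), (31, 30), (61, 20), (81, 2), (83, 10),
                (93, 10), (103, 10), (113, 5)]
for item in range(5):
    base = 118 + 43 * item  # description (30) + quantity (5) + price (8)
    FIELD_LAYOUT += [(base, 30), (base + 30, 5), (base + 35, 8)]

RECORD_LENGTH = 332  # last field ends at position 332


def parse_fixed_width(line):
    """Slice one record into 23 trimmed fields (like trim(mid(s, a, n)))."""
    line = line.rstrip("\r\n")
    return [line[start - 1:start - 1 + length].strip()
            for start, length in FIELD_LAYOUT]


def format_fixed_width(values):
    """Pad 23 values with trailing spaces to build one record."""
    if len(values) != len(FIELD_LAYOUT):
        raise ValueError("expected %d values" % len(FIELD_LAYOUT))
    parts = []
    for value, (_, length) in zip(values, FIELD_LAYOUT):
        if len(value) > length:
            raise ValueError(f"value too long for field: {value!r}")
        parts.append(value.ljust(length))
    return "".join(parts)
```

Keeping the layout in one table means a change to the map only has to be made in one place, instead of in 23 separate mid() calls.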
    ... fso.CreateTextFile("ch12_ex2.xml", True)
    ts.Write dom.xml
    ts.Close
    Set ts = Nothing

This code is virtually identical to the code used to create an XML document from a delimited file, so we won't drill into it in too much depth here. One interesting thing to point out is the code used to create the sField() array:

    sField(1) = trim(mid(s, 1, 30))
    sField(2) = trim(mid(s, 31, 30))
    sField(3) = trim(mid(s, 61, 20))
    sField(4)...
    ... = trim(mid(s, 83, 10))
    sField(6) = trim(mid(s, 93, 10))
    sField(7) = trim(mid(s, 103, 10))
    sField(8) = trim(mid(s, 113, 5))
    sField(9) = trim(mid(s, 118, 30))
    sField(10) = trim(mid(s, 148, 5))
    sField(11) = trim(mid(s, 153, 8))
    sField(12) = trim(mid(s, 161, 30))
    sField(13) = trim(mid(s, 191, 5))
    sField(14) = trim(mid(s, 196, 8))
    sField(15) = trim(mid(s, 204, 30))
    sField(16) = trim(mid(s, 234, 5))
    sField(17)...

... on. The result of this code, when applied to our sample, is again the same as the previous XML.

Transforming from XML to Flat Files

The other type of transformation, from XML to flat files, calls for a significantly different approach. Let's look at the different programming techniques that may be used to transform an XML document to a flat file, and then see some examples of our strategy in action.

Programming...

    ... Then
             ' raise an error
          End If
       End If
       Loop
    Loop
    ts.Close
    root.appendChild invoiceBucket
    root.appendChild partBucket
    root.appendChild customerBucket
    Set ts = fso.CreateTextFile("ch12_ex3.xml", True)
    ts.Write dom.xml
    ts.Close

This code is very similar to code we have already seen. The major difference is that elements are created as necessary while the document is read, so when an invoice line...

    Value                        Details
    ... N
    Record[N].Customer.field2    Data type: String. Position: 32-61. Description: The address of the customer on invoice N.
    Record[N].Customer.field3    Data type: String. Position: 62-81. Description: The city of the customer on invoice N.
    Record[N].Customer.field4    Data type: String. Position: 82-83. Description: The state of the customer on invoice N.
    Record[N].Customer.field5    Data type:...

... can tackle the conversion of XML to flat files. The most obvious one is to parse the document and serialize it out to a flat file. However, another approach that works a little better is to use XSLT to transform the XML document into the required output format. Let's look at the advantages and disadvantages of each approach.

Parse and Serialize

One strategy would be to parse the XML document, using either...

... for an XML document, all you have to do is to add another style sheet. We'll be using XSLT for our examples.

Handling Different File Types

Let's take a look at an example of each of our different file types, and how we would go about producing it from our XML document. These samples have all been tested using James Clark's XT. The MS Windows executable can be downloaded from:

    ftp://ftp.jclark.com/pub/xml/xt-win32.zip

The homepage can be found at:

    http://www.jclark.com/xml/xt.html

Here is the input XML document we have been, and will continue, using:

[...]
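The chapter's own examples take the XSLT route. For comparison, the parse-and-serialize strategy mentioned under "Parse and Serialize" could look roughly like the following Python sketch, which reads a document shaped like ch12_ex1.xml and writes one delimited line per invoice. This is an assumption-laden illustration rather than the book's code: the function name invoices_to_delimited is hypothetical, the ship method is written out as its name rather than a numeric code, and values that contain the delimiter are not escaped.

```python
import xml.etree.ElementTree as ET


def invoices_to_delimited(xml_text, delimiter=",", max_items=5):
    """Return one delimited line per Invoice element in xml_text."""
    root = ET.fromstring(xml_text)

    # Index the normalized Customer and Part elements by their IDs so
    # that the IDREF attributes on each invoice can be resolved.
    customers = {c.get("customerID"): c for c in root.findall("Customer")}
    parts = {p.get("partID"): p for p in root.findall("Part")}

    lines = []
    for invoice in root.findall("Invoice"):
        cust = customers[invoice.get("customerIDREF")]
        fields = [cust.get(key) for key in
                  ("name", "address", "city", "state", "postalCode")]
        fields += [invoice.get("orderDate"), invoice.get("shipDate"),
                   invoice.get("shipMethod")]

        items = invoice.findall("LineItem")
        if len(items) > max_items:
            raise ValueError("too many line items for the flat layout")
        for item in items:
            part = parts[item.get("partIDREF")]
            description = " ".join(
                part.get(key) for key in ("color", "size", "name"))
            fields += [description, item.get("quantity"), item.get("price")]

        # Pad unused line-item slots the way the sample file does.
        for _ in range(max_items - len(items)):
            fields += ["", "0", "0.00"]
        lines.append(delimiter.join(fields))
    return lines
```

The two lookup dictionaries undo the normalization that the flat-file-to-XML scripts introduced: each line item's partIDREF is expanded back into a "color size name" description.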

Date posted: 13/08/2014, 12:21
