Validating XML Data with an XML Schema

(2)

Contents

1. XML Validation Concepts
   a. Concepts
   b. Errors
   c. Resources

2. Example: Validation with XMLSpy
   a. Downloading XMLSpy
   b. Creating a new XMLSpy Project
   c. Associate the homestead XML Schema with a folder
   d. Open the file in XMLSpy
   e. Add the active file to the folder
   f. Click the "Validate" button

3. Example: Manipulating Large XML Data Sets with Ant & Eclipse
   a. Tools for Records and Metadata vs. Tools for Data
   b. Apache Ant – DOS command line
   c. Eclipse – GUI interface

(3)

Disclaimer

• The information and examples in this document are for demonstration purposes only.

• They are provided to help counties work with and validate XML datasets against Minnesota Revenue XML schemas.

• The Minnesota Department of Revenue does not endorse or support any products mentioned in this presentation. It is beyond the scope of the mission of the Property Tax Division to support tools within each

(4)

XML Validation Concepts

[Diagram: an XML file and an XML Schema are fed to an XML validator, which reports either "Validates" or validation errors.]

If you have:

1) A valid XML file, and

2) a well-defined XML Schema, you can

3) check that the file is well-formed XML and contains all the tags required by the schema, using any standard XML validation program.

(5)

XML Validation Concepts

• An XML file is a text file in which well-defined tags surround each data value.

• An XML Schema describes which tags are needed and where they must appear in a particular file.

Tag example: <Zip_Code>55101</Zip_Code>

<xs:element name="Zip_Code">
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{5}"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>
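To see what the pattern facet on Zip_Code accepts, the same rule can be tried outside a validator. The sketch below is written in Python rather than XML Schema, and `is_valid_zip_code` is a hypothetical helper name, not part of any schema tool. It uses `re.fullmatch` because an XML Schema pattern has to match the entire value; for a pattern this simple the two regular-expression dialects should agree, although that does not hold for every pattern.

```python
import re

# Same expression as the xs:pattern facet on Zip_Code.
ZIP_CODE_PATTERN = re.compile(r"[0-9]{5}")


def is_valid_zip_code(value):
    """Return True when the whole value is exactly five ASCII digits."""
    return ZIP_CODE_PATTERN.fullmatch(value) is not None


# Try one good value and three that break the rule in different ways.
for candidate in ["55101", "5510", "551011", "5510A"]:
    print(candidate, is_valid_zip_code(candidate))
```

This is only a way to experiment with the rule; a real validation run should still go through a schema-aware validator.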

(6)

XML Validation Errors

If you have:

1) An invalid XML file: The validator reports an invalid XML, malformed XML or content error. Examples are missing tag brackets or other syntax errors.

2) A valid XML file with tag errors: The validator reports a list of the tags that are inconsistent with the specific XML Schema being validated against.
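The first kind of error can be reproduced without a schema at all, because it is caught by the XML parser itself. The sketch below assumes Python's standard `xml.etree.ElementTree` parser; `well_formedness_error` and the `Record` element are hypothetical names used only for illustration. It does not cover the second kind of error, which needs a schema-aware validator.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return the parser's message if the text is not well-formed XML, else None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None


good = "<Record><Zip_Code>55101</Zip_Code></Record>"
# The closing Zip_Code tag is missing its final bracket.
missing_bracket = "<Record><Zip_Code>55101</Zip_Code</Record>"

print(well_formedness_error(good))
print(well_formedness_error(missing_bracket))
```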


(7)

XML Validation Errors for XML Escape Characters

Five characters are used in XML syntax and cannot be used directly in a data value. They must be “escaped” by representing the character using its ampersand representation.

Name            Escape    Character
ampersand       &amp;     &
greater than    &gt;      >
less than       &lt;      <
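When data values are written by a program, escaping is usually left to a library rather than done by hand. The sketch below assumes Python's standard `xml.sax.saxutils` module; `escape_value` and the sample owner name are hypothetical. Besides the ampersand, less-than and greater-than characters, it also covers the double quote (`&quot;`) and the apostrophe (`&apos;`), which are the remaining two of the five predefined XML entities.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters are
# passed in explicitly so that all five are covered.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_value(text):
    """Escape the five special characters in a data value."""
    return escape(text, QUOTE_ENTITIES)


owner = 'Smith & Sons <"Homestead">'
escaped = escape_value(owner)
print(escaped)

# Reverse mapping, to show that nothing is lost in the round trip.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == owner)
```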

(8)

10 Common XML Transmission Errors

1. Malformed XML
2. Missing namespace declarations
3. Invalid document structure
4. Missing required element
5. Missing data in element
6. Invalid document type code values
7. Invalid property type code value
8. Invalid character values
9. Incorrect number of repeating fields
10. Incorrect tax year

(9)

XML & Validation Resources

W3C XML Standards Page – http://www.w3.org/XML/

OASIS XML Cover Pages – http://xml.coverpages.org/xml.html#xmlValResources (lots of references)

XML.com – http://www.xml.com (up-to-date XML information)

XML.com Schema Tools – http://www.xml.com/pub/a/2000/12/13/schematools.html (older list of schema tools)

(10)

Example: Validating a Homestead File with XMLSpy

(11)

Validating with XMLSpy Steps

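1. Download XMLSpy (30-day free evaluation) and the homestead zip file
2. Create a new XMLSpy Project
3. Associate the homestead XML Schema with a folder
4. Open the file in XMLSpy
5. Add the active file to the folder
6. Click the "Validate" button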

(12)

Download XMLSpy

• http://www.altova.com/products/xmlspy/xml_editor.html

(13)

Download Homestead Files

(14)

Start XMLSpy

• Double-click the XMLSpy icon

(15)

New Project Window

• Note: if the project window is not visible, use the Window/Project menu to show it

(16)

Set the Properties of the XML Folder

• Right-click (not left-click) the XML files folder in the project view

(17)
(18)

Browse… to homestead schema

• Click OK, then double-click the XML data file you want to validate

(19)

Add this file to your project

• Right-click and select "Add Active File"

(20)
(21)

View Results in Validation View

• If your file is valid, a green check will appear in the validation view

(22)

File Size Limitations

• XMLSpy tends to have problems validating files over about 25MB on a system with 1GB of RAM

• Use Apache Ant and/or Eclipse if you want to validate larger files

(23)

Example: Manipulating Large XML Data Sets with Ant & Eclipse

(24)

Agenda

• Tools for Records and Metadata vs. Tools for Data
• Apache Ant
  – DOS command line
• Eclipse
  – GUI interface
• V – The File Viewer
  – Viewing large files
• XML databases

(25)

Records vs. Databases

• XML file viewers (like XMLSpy) are ideal for viewing single records and metadata (XML Schemas)

• Visual editing tools tend to stop working when file sizes exceed about 25MB (given 2GB of RAM). For example, we don't use MS Word to edit 100,000 records in a database.

(26)

In Memory vs. Streaming

• There are several different approaches to checking large files
  – Load the entire file into memory (DOM)
  – Stream the file through memory (SAX)
  – Page only relevant sections into memory (Chunking – used in V-The-File-Viewer)
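The difference between the first two approaches can be sketched in a few lines. The example below assumes Python's standard `xml.etree.ElementTree` module; its `iterparse` function is an incremental parser in the same spirit as SAX rather than a SAX API itself. The function names and the `Records`/`Record` element names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_elements_in_memory(source, tag):
    """Build the whole tree first, then count (the DOM-style approach)."""
    root = ET.parse(source).getroot()
    return sum(1 for _ in root.iter(tag))


def count_elements_streaming(source, tag):
    """Count elements as they are read, discarding each one once counted."""
    count = 0
    # An "end" event fires when an element's closing tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the record's children and text; only the emptied
            # element itself stays attached to its parent.
            elem.clear()
    return count


sample = (
    b"<Records>"
    b"<Record><Zip_Code>55101</Zip_Code></Record>"
    b"<Record><Zip_Code>55102</Zip_Code></Record>"
    b"</Records>"
)
print(count_elements_in_memory(io.BytesIO(sample), "Record"))
print(count_elements_streaming(io.BytesIO(sample), "Record"))
```

Both functions return the same count. The streaming version keeps only the record currently being read in full, which is why this style suits files too large to load at once.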

(27)

Apache Ant

• Open source build manager
• The user gives Ant a high-level description of a task
• Ant executes the task using dependency analysis (only validate after extract)
• Called from a shell (DOS or UNIX)
• Called from an Integrated Development Environment (IDE)

(28)
(29)
(30)

Adding tools.jar

• Apache Ant needs one missing jar file called "tools.jar" that is free with Sun's Software Development Tools

• It is freely available from the Java download as part of the Java SDK 1.4+ (but not the JRE)

• A temporary copy of the file is on the Java Open Source User Group (JOSUG) web site (www.josug.org/tools.jar)

• The file is about 6MB!

(31)

Apache Ant 1.7

• Many new features
• Simple <schemavalidate> task
• Faster execution

<schemavalidate
    noNamespaceFile="homestead-data_v0.28.xsd"

(32)

Ant From DOS Command Line

1. Download Apache Ant version 1.7.0
2. Copy the build.xml into a directory
3. Change file locations in properties of the build file to match your local files

build.xml

<?xml version="1.0" encoding="UTF-8"?>
<project default="validate-homestead">
    <!-- Change these to match your local system -->
    <property name="SrcDir" value="C:/homestead/stress-test"/>
    <property name="SchemaDir" value="C:/homestead/schemas"/>
    <target name="validate-homestead">
        <schemavalidate noNamespaceFile="${SchemaDir}/homestead-data_v0.28.xsd"
                        file="${SrcDir}/100MB-test.xml">
        </schemavalidate>
    </target>
</project>

(33)

Apache Ant Tasks

schemavalidate
  – New Ant 1.7 optional task just for XML Schema

xmlvalidate
  – very general Ant 1.6 task for validation of XML files
  – check for well-formed files
  – check for validation against an XML Schema

(34)

schemavalidate options

(35)
(36)

Sample Ant 1.6 Validate Script

(37)

Eclipse

• Open source integrated development environment originally sponsored by IBM
• "GUI" front end to Apache Ant

(38)
(39)

Complete Ant 1.7 Build File

<?xml version="1.0" encoding="UTF-8"?>
<project default="validate-homestead">
    <property name="DataDir" value="C:/homestead/data-files"/>
    <property name="SchemaDir" value="C:/homestead/schemas"/>
    <target name="validate-homestead">
        <schemavalidate noNamespaceFile="${SchemaDir}/homestead-data_v0.28.xsd"
                        file="${DataDir}/my-data-file.xml">
        </schemavalidate>
    </target>
</project>

(40)

GUI "Point and Click" UI

• Sample "point and click" GUI interface
• Alt+Shift+X, Q to run a task

(41)

XML Transform

• View a homestead record of a specific parcel ID

[Diagram: Big File (Gigabytes) -> XML Transform With Matching Rules -> Very Small File on a match; nothing is output when there is no match.]

(42)

Sample XML Transform

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:mn="http://data.state.mn.us"
    xmlns:c="http://niem.gov/niem/common/1.0"
    xmlns:u="http://niem.gov/niem/universal/1.0"
    xmlns:mnr="http://revenue.state.mn.us"
    xmlns:mnr-ptx="http://propertytax.state.mn.us"
    exclude-result-prefixes="mn mnr c u mnr-ptx">

    <xsl:output indent="yes"/>

    <!-- only display the homestead record for this parcel ID -->
    <xsl:template
        match="/HomesteadRecordsDocument/CountyHomesteadRecord/HomesteadParcels/HomesteadParcel/CountyPropertyTaxStatement[mn:ParcelID='1234567']">
        <!-- copy the CountyHomesteadRecord that matched this parcel ID to the output -->
        <xsl:copy-of select="../../.."/>
    </xsl:template>

    <!-- do not output anything else -->
    <xsl:template match="@*|node()">
        <xsl:apply-templates select="@*|node()"/>
    </xsl:template>

</xsl:stylesheet>
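The same "big file in, very small file out" idea can also be sketched with a streaming parser instead of a stylesheet. The example below is a swapped-in technique, not the transform from the slide: it assumes Python's standard `xml.etree.ElementTree.iterparse`, takes the element names and the `mn` namespace from the match expression above, and uses a hypothetical `find_record` function with made-up sample data. As a simplification, it matches any `mn:ParcelID` found inside a `CountyHomesteadRecord` rather than following the full path.

```python
import io
import xml.etree.ElementTree as ET

# Namespace bound to the mn: prefix in the stylesheet.
MN = "{http://data.state.mn.us}"


def find_record(source, parcel_id):
    """Return the first CountyHomesteadRecord containing parcel_id, or None."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "CountyHomesteadRecord":
            ids = [node.text for node in elem.iter(MN + "ParcelID")]
            if parcel_id in ids:
                return ET.tostring(elem, encoding="unicode")
            # Not the record we want: release its content before reading on.
            elem.clear()
    return None


def make_record(parcel_id):
    """Build one made-up record with the nesting used in the match expression."""
    return (
        "<CountyHomesteadRecord><HomesteadParcels><HomesteadParcel>"
        "<CountyPropertyTaxStatement><mn:ParcelID>" + parcel_id + "</mn:ParcelID>"
        "</CountyPropertyTaxStatement></HomesteadParcel></HomesteadParcels>"
        "</CountyHomesteadRecord>"
    )


sample = (
    '<HomesteadRecordsDocument xmlns:mn="http://data.state.mn.us">'
    + make_record("7654321")
    + make_record("1234567")
    + "</HomesteadRecordsDocument>"
).encode("utf-8")

print(find_record(io.BytesIO(sample), "1234567"))
```

The serialized record may use a generated namespace prefix in place of `mn`; the element names and values are unchanged.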

(43)

V-The File Viewer

• $20 application (less in quantity)

• Easily allows viewing of files greater than 1GB (uses file "chunking" technology)

(44)

Use Goto Function

• Goto is (Ctrl-G) or

(45)

XML Databases

• XML databases store XML in its native format

• You can associate a column in your database or a "collection" with the homestead XML Schema

(46)

Example of XML Databases

• IBM DB2 version 9 "PureXML"
  – free and low-cost "express" versions for development and testing
• eXist (open source)
  – native XML database with XML Schema validation
• Over 50 other free and low-cost solutions with

(47)

DB2

• IBM DB2 version 9 supports fast searches on complex XML data sets

• Load records into XML datatype

• Records are quickly validated using an XML Schema

(48)

eXist

• Open source
• Built-in web administration
• Easy to set up and configure
• Allows data to be validated on insert
• Fast searches

(49)

Microsoft SQL Server 2005

• Supports native XML datatype
• Supports fast indexing
• Add SOAP services to XML documents
• Support for XQuery and XQuery updates

(50)

Ant Book

(51)

References
