D. BUSINESS INTEREST IN AN ALTERNATIVE XML FORMAT

2. Arguments against a Binary XML Format

The Microsoft Corporation (2003) opposed the development of a standardized binary format for XML. Citing point 3 of the “XML in 10 points” document, they stress that XML is text: text helps people learn, text is simple, and there are many tools to optimize text parsing. Additionally, native XML is well understood and is used as is throughout the IT industry. They state that a new standard would add complexity to the XML space, requiring vendors to support two distinct versions of XML instead of one, at a time when both vendors and users have already suffered the growing pains of the ever-increasing families of XML languages.

They reject the two predominant arguments for a new standard:

1. XML verbosity: They claim that if compression is the goal of a binary XML format, then GZip or XMill already achieves compression good enough for low-bandwidth environments. Both of these compression techniques operate on text formats and do show significant compression of XML files.

2. Processing intensity: They state that if the entire XML document is in binary, then everything would have to be parsed and processed, adding complexity and defeating the benefits sought for small handheld devices: “it becomes a tradeoff between smaller memory footprint and higher parsing cost, which consumes more power.”
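
The compression claim in point 1 can be checked informally. The following is a minimal Python sketch, assuming the standard-library gzip module and a made-up, highly repetitive sample document. The record structure and the helper names build_sample_xml and gzip_ratio are hypothetical and do not come from Microsoft's position paper, and XMill is not covered. The ratio it prints depends heavily on how repetitive the document is.

```python
import gzip


def build_sample_xml(record_count: int) -> bytes:
    """Build a repetitive XML document (hypothetical structure, for illustration only)."""
    rows = "".join(
        f"<record><id>{i}</id><name>item{i}</name><status>active</status></record>"
        for i in range(record_count)
    )
    header = '<?xml version="1.0" encoding="UTF-8"?>'
    return f"{header}<records>{rows}</records>".encode("utf-8")


def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


xml_bytes = build_sample_xml(1000)
compressed = gzip.compress(xml_bytes)
restored = gzip.decompress(compressed)  # the receiver gets the original text back
ratio = gzip_ratio(xml_bytes)

print(f"original={len(xml_bytes)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.2f}")
```

Note that the receiver must still decompress the stream and then parse the resulting text, so this approach addresses document size only and leaves the parsing cost discussed in point 2 unchanged.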

Microsoft also notes that a large number of XML 1.0 applications are already deployed, and that changing this installed base would be extremely costly and difficult.

The Oracle Corporation (2003), a leader in database technologies, argues that the standard nonproprietary method of XML processing is the Document Object Model (DOM), which loads an entire document into memory. Bringing an entire document into memory can quickly exhaust an application's memory as the XML document grows in size. The other method is the Simple API for XML (SAX), an event-driven method that saves memory but has to rely on sequential XML processing and proprietary methods. Oracle suggests a need for an XML compression that enables DOM-like XML processing, to ensure platform independence when working in the Internet or direct-communication domains. Oracle contends that the purpose of the compression depends on where in the processing chain the document is operating.

The requirements for processing and displaying XML on a mobile device are different from those of a Web service application or an enterprise server. Oracle remains skeptical as to whether a single compression scheme can meet the unique needs of all domains, and did not yet support an alternative XML format.
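
The DOM and SAX trade-off that Oracle describes can be illustrated with a minimal Python sketch, assuming the standard-library xml.dom.minidom and xml.sax modules. The sample document and the names XML_TEXT, TotalHandler, sum_totals_dom, and sum_totals_sax are hypothetical and are not taken from Oracle's position paper. Both functions compute the same sum; the DOM version holds the whole tree in memory, while the SAX version sees elements one at a time, in document order.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document: five orders with totals 0, 10, 20, 30, 40.
XML_TEXT = (
    "<orders>"
    + "".join(f"<order id='{i}'><total>{i * 10}</total></order>" for i in range(5))
    + "</orders>"
)


def sum_totals_dom(xml_text: str) -> int:
    """DOM: parse the entire document into an in-memory tree, then query it."""
    document = minidom.parseString(xml_text)
    totals = document.getElementsByTagName("total")
    return sum(int(node.firstChild.data) for node in totals)


class TotalHandler(xml.sax.ContentHandler):
    """SAX: handle parse events sequentially; no document tree is retained."""

    def __init__(self):
        super().__init__()
        self.in_total = False
        self.buffer = []
        self.total = 0

    def startElement(self, name, attrs):
        if name == "total":
            self.in_total = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self.in_total:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "total":
            self.total += int("".join(self.buffer))
            self.in_total = False


def sum_totals_sax(xml_text: str) -> int:
    """Run the event-driven handler over the document and return the sum."""
    handler = TotalHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.total


print(sum_totals_dom(XML_TEXT))  # 100
print(sum_totals_sax(XML_TEXT))  # 100
```

With DOM, any element can be revisited after parsing, at the cost of memory that grows with document size. With SAX, the application must keep its own state (here, the in_total flag and buffer) because earlier events cannot be revisited.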

Computer Engineering and Networks Laboratory (2003) is a Swiss company focused on computer communications and distributed networks. While Computer Engineering and Networks Laboratory remains interested in a “good idea” and in the efforts toward a binary XML format, they believe this needs to be a starting point rather than an end state. Their recommendations are to reconsider the foundations of XML and to define a binary representation based on existing well-known algorithms rather than on a new compression technique.

XimpleWare’s (2003) business model is the delivery of SOA tools for enterprises around the world. They are not firm believers that a binary XML format will help the XML information set, claiming that the often-mentioned verbosity and performance issues of XML are seldom the true limiting factors. They do acknowledge that the verbosity of XML is a concern in low-bandwidth environments, and that processing is a concern regardless of bandwidth. They believe that XML is its own problem, claiming that processes running on XML always underperform their non-XML counterparts.

They acknowledge that while an alternative format is likely to deliver some improvements, adopting one runs contrary to XML itself: it would “…give up the luxury of reading the wire format…back to the dark ages,” that is, revert to a legacy style of data formatting from before human readability became widespread. Reverting to the “dark ages” in terms of XML would mandate a persistent schema, the same kind of rigid-format problem whose avoidance originally allowed loosely formatted XML to succeed over fixed file structures. Their solution lies not in changing the format of XML but in changing how it is processed: keeping the entire document in memory, much like a database, to enable exceptionally fast queries and restructuring.

IBM (2003), a maker of all things digital, has worked with XML as its dominant data exchange format for years, and has made a number of IBM-specific attempts to improve XML's verbosity and performance. IBM points to four successes of XML:

1. There is only one standard XML format, with a limited set of character encodings (UTF-8 or UTF-16), yet it can encompass any number of specific characters through escape features.

2. XML is human-readable text, not binary…the industry is focusing more on text than on binary.

3. XML is a flexible format that can extend to nearly any domain.

4. XML is self-describing.

The successes of XML are also the roots of XML’s problems. IBM, however, claims that any alteration to these keys to XML’s success will undermine XML’s interoperability, and views “…binary XML proposals with a healthy reluctance to tinker with the formula that has successfully carried XML so far.” They do believe that an alternative XML format would make the XML family better, but also believe that the diversity of opinions and needs will not converge. Moving forward without standardization and convergence will lead to failure. IBM concludes that each domain might have to develop its own binary XML format in order to achieve domain-specific goals, but that this inhibits inter-domain operability, which would be an overall failure.

Rick Marshall (2003), an independent software developer with 25 years of experience in database domains and XML representation, claims that databases are the most efficient format, and that a binary XML format is not worth the effort or the trouble of decoding. He states that while XML is a human-readable format, once namespaces are introduced into an XML document its readability is questionable even though it is text.

Marshall does note that XML tags could use improvement, as they are part of the reason for XML bloat. In addition, he notes that for many documents finding the end tag is complicated, so a tag-detection mechanism would be a benefit. He further claims that compression is highly dependent on the use-case domain, and that an efficient universal compression “silver bullet” applicable to all domains is not likely ever to be achieved. His final claim is that Moore’s law will “take care” of the processing problem; optimization of the hardware, not of the XML format itself, should be pursued as the real long-term solution to XML’s problems.

Software AG (2003) is a developer of database management systems with customers around the world, but with a primary focus on Europe. Like the other database companies, they see the problems of XML not as a function of verbosity but of efficiency in processing XML. They present solutions external to the XML format:

• The Moore’s law argument: because “…processors are under-utilized and Moore’s Law predicts processing speed and memory capacity will double every couple of years…”, XML’s processing problem will cease to be a problem in the near future.

• Address the XML problems with a hardware-specific solution similar to a graphics card, rather than with a redesign of XML. The justification is that if XML is going to be prevalent, specially optimized hardware is warranted; a few companies have already developed hardware that can process XML faster than a general-purpose CPU.

• Optimize XML’s tags for faster querying, improving processing efficiency.

Software AG’s analysis doubts that any one binary format can meet the processing needs of both XML with a well-defined schema and schema-less XML while at the same time addressing the verbosity issue.

E. SYNOPSIS OF INTEREST FOR AND AGAINST AN ALTERNATIVE