"We didn't start the fire ... it was always burning since the world's been turning ..." [Billy Joel 1989]. Is SOA the "Same Old Architecture?" or is it "Simply Over Ambitious?" Let's apply SOA's arsenal:: XML, BPM, Services, SOAP, Web Services - to the real world and find out. Let's put out some fires.

Turbocharge your SOA Infrastructure with XML Appliances - Part III (Some numbers from Intel)

In the last two blog posts, I talked about using XML appliances to turbocharge your SOA infrastructure. In this post, I share some numbers published by Intel. Most other vendors do not seem to have such numbers in a public document. The features described here are available from many different vendors, and I expect the other products in this space to be equally good, with comparable performance numbers.

Intel's appliance is not actually an appliance but a software product called SOA Expressway. It runs on standard Linux servers, providing the "benefits of having an appliance, with the flexibility of a server, while using commodity hardware".

The benchmark numbers are impressive, and are for a dual CPU Intel box. I wish the benchmarks had used Java 1.5, which is approved in many enterprises, rather than Java 1.6, which is not. But I do not think the numbers would be dramatically different.

Want to protect your SOA infrastructure from being flooded by badly written XML that does not conform to your schema? This could be an external denial-of-service attack or, more likely, an error at the XML producer's end. SOA and agile development bring the advantage of incremental development, but also the risk of error-prone code reaching your production or performance/staging environment. Intel SOA Expressway can validate XML against a schema at a rate of 507 MB/second on a dual CPU box.

Want to validate incoming data using simple XSLT-based rules? Intel SOA Expressway provides XSLT processing at very high speeds. For example, specific fields could have valid ranges, or be constrained by the values of other fields.

An interesting consequence of the SOA Expressway architecture is that, because it is not a pure appliance, validations that are not XSLT-based can also be written in Java or C++ and run against the XML data. As pointed out in this excellent presentation on parleys.com (see slides 20 and 21), there is always something that is very difficult, if not impossible, to do in XML.

Want to convert about 1 GB of flat file into XML? It should take about 30 minutes or less (roughly 0.6 MB per second) with SOA Expressway running on a dual CPU Intel Linux server. Flat file to XML conversion is comparatively slow, but it is still fast enough for many real workloads: a past engagement showed a data volume of 50 MB per hour exchanged with an external application service provider. It would have been very convenient to use a product like Intel SOA Expressway to validate, transform and import/export that data.
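
To make the flat-file-to-XML step concrete, here is a minimal sketch in Python using only the standard library. It is an illustration of the transformation itself, not of how SOA Expressway implements it. It assumes a comma-delimited file with a header row whose column names are valid XML element names; the tag names `records` and `record` and the sample data are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def flat_file_to_xml(text, root_tag="records", row_tag="record"):
    """Convert delimited text with a header row into an XML string.

    Assumption: every header name is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(text)):
        record = ET.SubElement(root, row_tag)
        for field, value in row.items():
            # Each column becomes a child element named after its header.
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-row flat file.
sample = "id,amount\n1,10.50\n2,7.25\n"
xml_out = flat_file_to_xml(sample)
print(xml_out)
```

This version builds the whole document in memory, which is fine for a sketch but would not suit a 1 GB input; a production conversion would write each record out as it is read.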

XML to XML conversions run a lot faster. The benchmark paper mentioned earlier shows XML to XML conversions running at between 500 KB per second and 41 MB per second on a dual CPU box. This is not the "wirespeed" that most appliance vendors hype, but it is enough to meet the needs of many enterprise portals, and even consumer-facing ones. For example, consider a WML page of 4 KB on a cellphone, refreshed every 60 seconds. At a transformation rate of 20 MB per second, a dual CPU server can transform about 5,000 such pages per second, or 300,000 per minute. Even allowing a tenfold margin for peaks and other processing, that supports around 30,000 users.
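
The arithmetic behind this kind of estimate is easy to check. The sketch below is a back-of-the-envelope calculation, assuming 1 MB = 1000 KB and that transformation throughput is the only bottleneck; the function name `max_users` is made up for the example. At 20 MB per second the raw ceiling comes out at 300,000 page refreshes per minute, so a figure of 30,000 users leaves roughly a tenfold margin for peaks and other processing.

```python
def max_users(throughput_mb_per_s, page_kb, refresh_s):
    """Upper bound on users whose page can be transformed once per refresh.

    Assumes 1 MB = 1000 KB and that transformation is the only bottleneck.
    """
    pages_per_second = (throughput_mb_per_s * 1000) / page_kb
    return int(pages_per_second * refresh_s)


# Figures from the example: 20 MB/s, 4 KB WML page, 60 second refresh.
ceiling = max_users(20, 4, 60)
print(ceiling)  # 300000

# The low end of the benchmark range, 500 KB/s.
low_end = max_users(0.5, 4, 60)
print(low_end)  # 7500
```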

XPath execution speeds in the benchmark paper ranged from 400,000 to 1.5 million expressions per second. Using XPath to perform rule-based validation is thus very scalable with Intel SOA Expressway. An incoming XML document can be validated for consistency, such as the "to date" being later than the "from date".
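
A consistency rule like the date check above can be sketched in a few lines. This is a minimal illustration in Python, using the limited XPath subset supported by the standard library, and not SOA Expressway's own rule syntax. The element names `period`, `fromDate`, and `toDate` are hypothetical, and dates are assumed to be in ISO format (YYYY-MM-DD).

```python
import xml.etree.ElementTree as ET
from datetime import date


def dates_consistent(xml_text):
    """Return True when toDate is strictly later than fromDate.

    Assumption: both elements exist and hold ISO dates (YYYY-MM-DD).
    """
    doc = ET.fromstring(xml_text)
    # Paths are relative to the document's root element.
    from_date = date.fromisoformat(doc.findtext("./period/fromDate"))
    to_date = date.fromisoformat(doc.findtext("./period/toDate"))
    return to_date > from_date


good = ("<booking><period><fromDate>2008-07-01</fromDate>"
        "<toDate>2008-07-15</toDate></period></booking>")
bad = ("<booking><period><fromDate>2008-07-15</fromDate>"
       "<toDate>2008-07-01</toDate></period></booking>")

print(dates_consistent(good))  # True
print(dates_consistent(bad))   # False
```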

Why not use Xalan instead? The manageability features of SOA Expressway are critical in an enterprise environment. These include basics such as SNMP trap creation and email notification, as well as more advanced features such as application-level end-point support and scriptable management through a CLI.

A common problem in many SOA projects is creating a "sub-set" XML document from a given XML document. For example, the XML provided to you has 250 fields, but your application needs only 10 of them. A standard pattern is to read the large XML document and parse it to discover the 10 needed elements. The memory consumed by this parsing can be large, causing frequent garbage collections. Performing this operation on Intel SOA Expressway should be a lot faster than in the application server. Moreover, you are performing the transformation on a server that is comparatively cheaper: a fully loaded dual CPU J2EE portal server can easily cost upwards of $100,000. In one portal project, about 95% of the portlets called web services on the backend and transformed the resulting XML to generate the HTML markup. The XML transformation load could be a large fraction of the CPU use on your expensive portal server.
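
For comparison, here is a minimal sketch of the sub-setting pattern done in application code, written in Python with the standard library's incremental parser. It is only an illustration of the technique, with hypothetical element names, and it assumes the wanted fields are leaf elements with unique tag names. Clearing each element as soon as it has been seen is one common way to reduce the memory pressure described above, though it does not remove it entirely.

```python
import io
import xml.etree.ElementTree as ET


def extract_fields(xml_bytes, wanted):
    """Collect the text of the wanted leaf elements from an XML document.

    Assumption: wanted tags are leaf elements with unique names.
    """
    found = {}
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag in wanted:
            found[elem.tag] = elem.text
        # Drop the element's text and children once it has been processed.
        elem.clear()
        if len(found) == len(wanted):
            # Stop reading as soon as every wanted field has been seen.
            break
    return found


# Hypothetical document; a real one might carry 250 fields.
sample_doc = (b"<order><id>42</id><customer>ACME</customer>"
              b"<notes>long free text</notes><total>99.90</total></order>")
subset = extract_fields(sample_doc, {"id", "total"})
print(subset)
```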

Within a single project, considering an XML appliance may be difficult, but if XML transformation and validation are provided as standard services, this technology can percolate into the mainstream of your enterprise.

Comments

Thanks Kevin for the highly informative series, looking forward to the subsequent ones! I was wondering if there are scenarios in typical projects where using an XML appliance would be overkill, yet performance woes warrant some packaged solution.

I really enjoyed reading this blog. Intel has done another public performance test with PushToTest which can be obtained through requesting more info on SOA Expressway at http://www.intel.com/software/soae

Also, the pure software form factor of SOA Expressway makes it flexible to deploy in scenarios where a separate XML hardware device is not practical due to things like cost, network, or policy considerations. More details on these cases, and on how to resolve XML performance issues with a "soft appliance", are at my blog:
http://softwareblogs.intel.com/author/joseph-natoli/

Sudeep: The Intel appliance is quite affordable if implemented as a shared service for the entire enterprise. An individual project cannot always justify the cost. I am not sure if I can quote the list price here (I do not work for Intel)! Most application servers provide ESB-based mediation. In the end, it is about thinking in XML or thinking in Oracle/DB2 or thinking in Java. I will post a blog entry with a discussion about all this.

Kevin, Joe

What are your thoughts on using language-specific parsing techniques to build highly specialized parsers for focused grammars, such as a grammar derived from a specific XML schema? Some recent research has shown good performance, e.g.
Wei Zhang & Robert van Engelen
“A Table-Driven Streaming XML Parsing Methodology for High-Performance Web Services”, this paper won the best paper award at the SCC conference 2006
