I’ve moved the blog back to my old domain. If you’re subscribing to the Feedburner URL you shouldn’t even notice this in your RSS reader, but all new content will be published over at http://www.richardhallgren.com.
15 Jan
Posted by Richard as BizTalk 2006, Architecture
There was a question in the Microsoft BizTalk forums the other day about how one could implement a Scatter and Gather pattern in a more loosely coupled fashion.
Most examples of this pattern in BizTalk use the Call Shape functionality available in BizTalk orchestrations. This however creates a hard coupling between the “Scatter orchestration” and its “partner orchestrations”. The downside is that when one adds or removes a partner the whole solution has to be recompiled and redeployed.
If one however could use the publish-subscribe architecture in the MessageBox to route messages between the “Gather orchestration” and its partners, it’d be possible to add partners without having to worry about the rest of the solution. This post shows an example of how to implement a solution like that.
NOTE: Notice the difference between the PartnersRequest, the PartnersResponse, the PartnerRequest and the PartnerResponse messages. The names are unfortunately very similar.
The PartnersRequest and PartnersResponse messages are used for communication between the Scatter and the Gather orchestrations. It’s also a PartnersRequest message that activates the process.
PartnerResponse and PartnerRequest are used for communicating between the Scatter and the Gather orchestration and all the partner orchestrations.
An example of the implementation can be downloaded from here.
The solution contains five different schemas.
Five orchestrations
When the solution is built and deployed one needs to set up and bind two ports: one outgoing port and one incoming port (this could also be a Request-Response port by changing the port type in the Scatterer orchestration). That’s it!
Enlist and start everything, then test by dropping a PartnersRequest test message (you’ll find one among the zipped files) in the incoming folder. A PartnersResponse message should then be published in the outgoing folder containing a calculated price from all the Partner orchestrations.
Test message and a binding file are part of the zipped solution.
I’ve made some major simplifications in this example to make it easy to set up and test the concept. These would be very different in a “real” solution.
Partner Services
The Partner orchestrations are very simple. They actually don’t communicate with the outside world at all. All they do is set a hard coded price and post a response. In a real solution these would not be part of the same solution as the Scatter and Gather orchestrations (otherwise we would be forced to redeploy when adding a Partner orchestration to the dll).
The Partner orchestrations would also communicate with some sort of external source, for example a web service or a database. This would however complicate the setup, so I’ve skipped that part in the example.
Managing partners
One of the benefits of a loosely coupled implementation is the possibility to add and remove Partner orchestrations without having to redeploy the rest of the solution. Using this implementation the Gatherer orchestration needs to know how many Partner responses it should wait for before timing out. This requires that value to be set in a config file or something similar. In this example the number of Partners is hard coded into the Gather orchestration (it’s set to 3 Partners) to simplify the setup.
Knowing how to create loosely coupled solutions like this is good knowledge to have. It’s my own and others’ belief that this architecture makes it possible to create more robust and separated solutions that one can update without having to do a lot of work and disturb the current processes. It’s however not the best solution performance-wise as it adds a lot of extra hits on the MessageBox database and generates more work for the MessageAgent.
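To make the idea concrete, here is a minimal sketch of the same flow outside BizTalk. It is plain Python, not BizTalk code: the `MessageBox` class, `make_partner`, `Gatherer` and `run_demo` are made-up names for illustration, the prices are invented, and the timeout is left out entirely. It only shows how partners can be added by subscribing, while the gatherer needs nothing but the expected count.

```python
from collections import defaultdict


class MessageBox:
    """Toy in-memory stand-in for publish-subscribe routing (not BizTalk's API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, message_type, handler):
        self._subscribers[message_type].append(handler)

    def publish(self, message_type, body):
        # Iterate over a copy so handlers may add subscriptions while we loop.
        for handler in list(self._subscribers[message_type]):
            handler(body)


def make_partner(box, name, price):
    """A partner answers every PartnerRequest with a hard coded price."""
    def on_request(request):
        box.publish("PartnerResponse",
                    {"id": request["id"], "partner": name, "price": price})
    box.subscribe("PartnerRequest", on_request)


class Gatherer:
    """Collects PartnerResponse messages until the expected count is reached."""

    def __init__(self, box, expected_partners):
        self.box = box
        self.expected = expected_partners  # would come from config in a real solution
        self.responses = []
        box.subscribe("PartnerResponse", self.on_response)

    def on_response(self, response):
        self.responses.append(response)
        if len(self.responses) == self.expected:
            self.box.publish("PartnersResponse", {"prices": list(self.responses)})


def run_demo():
    box = MessageBox()
    results = []
    box.subscribe("PartnersResponse", results.append)
    # Scatterer: the activating PartnersRequest becomes one PartnerRequest.
    box.subscribe("PartnersRequest",
                  lambda msg: box.publish("PartnerRequest", {"id": msg["id"]}))
    for name, price in [("A", 100), ("B", 120), ("C", 90)]:
        make_partner(box, name, price)
    Gatherer(box, expected_partners=3)
    box.publish("PartnersRequest", {"id": 1})
    return results
```

Adding a fourth partner here is one more `make_partner` call plus a change to the expected count, which is the same trade-off as described above.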
There are also a few things to watch out for:
Eternal loops
It’s easy to end up in a situation where you’re subscribing to the same message as you’re posting to the MessageBox. That’ll create an endless loop and cause a lot of disturbance before you find it. Think through and document your subscriptions!
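If the subscriptions are documented in a simple table, a check for the most obvious kind of loop can be automated. The sketch below is only an illustration in Python: it assumes subscriptions can be reduced to message type names (real BizTalk subscriptions are filters on promoted properties), and the `DOCUMENTED` table, including the deliberately broken entry, is hypothetical.

```python
def find_self_subscriptions(orchestrations):
    """Return the names of orchestrations that publish a message type
    they also subscribe to (the simplest form of an endless loop)."""
    return sorted(
        name
        for name, (subscribes, publishes) in orchestrations.items()
        if set(subscribes) & set(publishes)
    )


# Hypothetical documentation of who subscribes to and publishes what.
DOCUMENTED = {
    "Scatterer": ({"PartnersRequest"}, {"PartnerRequest"}),
    "PartnerA": ({"PartnerRequest"}, {"PartnerResponse"}),
    "Gatherer": ({"PartnerResponse"}, {"PartnersResponse"}),
    "Broken": ({"PartnerResponse"}, {"PartnerResponse"}),
}
```

Calling `find_self_subscriptions(DOCUMENTED)` flags only the `Broken` entry. Loops that go through several orchestrations would need a proper cycle search, which this sketch does not attempt.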
Correlations for promoting values
When doing a direct post in BizTalk most properties are not promoted. To force your properties to be promoted you’ll have to initialize a correlation on the property as you send it. I can’t say I like this. There should be another way of saying that one wants it promoted.
A couple of other useful articles on the subject as well:
Download the example orchestration and let me know how you used it and what your solution looks like!
Whoho! I’ll be speaking at Developer Summit 2008. Developer Summit is a conference hosted by Cornerstone. It’s being held in Stockholm between the 9th and 11th of April 2008.
I’ll be doing a talk on Master Data Management (MDM) for the Enterprise using BizTalk 2006. Basically I’ll present some basic theory behind the MDM concept and how it relates to SOA. Then I’ll relate all that to a customer case I recently worked on, solving an MDM requirement. Finally I’ll be showing a short demo where I publish some Master Data and update all subscribers to it. In the same demo I’ll also demonstrate how to monitor the process using Business Activity Monitoring.
Sound interesting? Something specific you think should be in the presentation? Let me know!
Hope to see you there!
Way too often I get a request to deliver a message from BizTalk without any XML namespace. Comments like the one below aren’t that rare.
Why do you put those “ns0” in front of every tag!? We can’t read the XML when it’s written like that! All our XPath and XSLT seem to fail.
We don’t want <ns0:OutMessage xmlns:ns0="Acme.Messages.OutMessage/1.0"> … We expect the messages from you to be like <OutMessage> …
Is that something BizTalk specific?
I always try to explain what XML namespaces are, and why it’s a good idea to use them (especially when it comes to versioning of messages). Sometimes it’s just impossible to get people to understand the advantages of using it and to persuade them to change their solutions to handle XML namespaces. It’s in these cases it’ll be up to the implementation in BizTalk to remove the namespace.
There are a couple of ways of achieving this. We can use .NET code that we call in a pipeline or in an orchestration, but we can also handle this using XSL - and that’s what I’ll show in this post. The XSL stylesheet below will remove all XML namespaces while transforming the message. Basically it just copies the nodes, the attributes and their values, using the local-name() function to ignore the XML namespaces.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:copy>
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@* | node()" />
    </xsl:element>
  </xsl:template>
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
  <xsl:template match="text() | processing-instruction() | comment()">
    <xsl:copy />
  </xsl:template>
</xsl:stylesheet>
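For anyone who wants to try the idea outside BizTalk, the same local-name trick can be sketched in a few lines of Python using the standard library. This is not the stylesheet itself, just an illustration of what it does; the function names are made up, and unlike the XSL above, ElementTree’s default parser drops comments and processing instructions.

```python
import xml.etree.ElementTree as ET


def local_name(qualified):
    """Strip ElementTree's '{namespace-uri}' prefix from a tag or attribute name."""
    return qualified.rsplit("}", 1)[-1]


def strip_namespaces(xml_text):
    """Return the document with every element and attribute reduced to its local name."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = local_name(element.tag)
        attributes = {local_name(k): v for k, v in element.attrib.items()}
        element.attrib.clear()
        element.attrib.update(attributes)
    return ET.tostring(root, encoding="unicode")
```

Note the same caveat applies as for the stylesheet: two attributes or elements that only differ by namespace will collide once the namespaces are gone.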
We can use this XSL stylesheet in an ordinary BizTalk map on the send port using the Custom XSL Path property in the BizTalk Mapper. The result is that the XSL we usually generate in the mapping tool will be overridden by our own XSL stylesheet. The figure below shows how we use the property window of the grid in the BizTalk Mapper to set the property and point the Mapper to our XSLT document.
But what if we already have a map on the send port and it’s that already transformed message we’d like to remove the namespace from? One possibility is to use the XSLT Transform pipeline component that comes with the BizTalk 2006 SDK. It’s usually located at C:\Program Files\Microsoft BizTalk Server 2006\SDK\Samples\Pipelines\XslTransformComponent\XslTransform on your development machine. I’ve written about this sample component before here, where I used it in another scenario.
The figure below shows how we use the property windows of the XSLT Transform Component in a pipeline in the Pipeline designer tool to set the path to our XSLT stylesheet.
The XSLT Transform component is far from perfect and the obvious problem is of course that the component loads the whole message into memory using the XmlDocument class to read the message. That means that this solution isn’t for those scenarios where you’ll have huge messages coming in by the thousands. But for those cases where you have normal sized messages and you have an idea of the traffic you receive, it’s a quick and easy solution.
Any comments on the pros and cons of this solution and how you usually solve this scenario will be appreciated!
cXML (commerce eXtensible Markup Language) is an XML-based standard for communication of data related to electronic commerce. The problem from a BizTalk perspective is that they don’t publish any XML schemas (XSD), only Document Type Definitions (DTD).
When trying to generate a schema based on a DTD using the functionality in BizTalk (via Add Generated Items) one ends up with a schema split into three files that really doesn’t make any sense (XmlSpy doesn’t do a very good job either …). So after a while I found Nick Heppleston’s schema repository! After some tweaking I actually had a cXML Order schema in the version I was looking for! Thanks Nick!
The next set of problems was to handle the lack of an XML namespace and the DOCTYPE declaration that messages validated against a DTD carry at the top.
<?xml version="1.0" standalone="no"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd">
<cXML xml:lang="en-US" payloadID="2007117.25919@Contempus" timestamp="2007-11-07T11:06:16+01:00">
To handle these two issues I set up a receive pipeline that looked like the one below.
First I created a pipeline component to remove the DOCTYPE node. It’s simple code using a regular expression to find the DOCTYPE node, replace it with an empty string and return the message.
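// Needs System.IO, System.Text and System.Text.RegularExpressions, plus the BizTalk
// namespaces Microsoft.BizTalk.Component.Interop and Microsoft.BizTalk.Message.Interop.
public IBaseMessage Execute(IPipelineContext pc, IBaseMessage inmsg)
{
    // Read the whole body into a string (fine for small messages only).
    string messageString = new StreamReader(inmsg.BodyPart.Data).ReadToEnd();

    // Remove the DOCTYPE node. "." does not match newlines, so this assumes
    // the declaration sits on a single line.
    Regex doctypePattern = new Regex("<!DOCTYPE.+?>");
    messageString = doctypePattern.Replace(messageString, string.Empty);

    // Hand the cleaned message back as a new stream positioned at the start.
    MemoryStream memStream = new MemoryStream();
    byte[] data = Encoding.UTF8.GetBytes(messageString);
    memStream.Write(data, 0, data.Length);
    memStream.Seek(0, SeekOrigin.Begin);
    inmsg.BodyPart.Data = memStream;
    return inmsg;
}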
Secondly I used Richard Seroter’s post on how to change the SetNSForMsg component to add an XML namespace. That’s the second component showing in the decode stage of the pipeline.
Arrow number 3 shows how the SetMsgNS exposes a property that allows us to set the namespace that we can configure per pipeline. In this case I’ve set it to http://schemas.modhul.com/cXML/1.2.014/OrderRequest which is the namespace of the cXML schema I’m currently working against.
In the end we’ll have a message with the following declaration and root node.
<?xml version="1.0" encoding="utf-16" standalone="no"?>
<cXML xml:lang="en-US" payloadID="2007117.25919@Contempus" xmlns="http://schemas.modhul.com/cXML/1.2.014/OrderRequest" timestamp="2007-11-07T11:06:16+01:00">
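The two decode steps can be tried out on a sample message without a pipeline. The Python sketch below is only an illustration of the text manipulation involved, not the pipeline components themselves: the function names are made up, the DOCTYPE pattern assumes a declaration without an internal subset, and the namespace is inserted with a plain regular expression on the root element name.

```python
import re
import xml.etree.ElementTree as ET

# Matches a DOCTYPE declaration without an internal subset, even across lines.
DOCTYPE = re.compile(r"<!DOCTYPE[^>\[]*>")


def remove_doctype(xml_text):
    """Step 1: drop the DOCTYPE declaration."""
    return DOCTYPE.sub("", xml_text, count=1)


def add_default_namespace(xml_text, root_name, namespace):
    """Step 2: insert xmlns="..." right after the root element's name."""
    root_start = re.compile(r"<" + re.escape(root_name) + r"(?=[\s>/])")
    replacement = '<%s xmlns="%s"' % (root_name, namespace)
    return root_start.sub(lambda match: replacement, xml_text, count=1)


def prepare_cxml(xml_text, namespace):
    return add_default_namespace(remove_doctype(xml_text), "cXML", namespace)
```

Because the namespace is added as a default namespace on the root, every child element ends up in that namespace too, which is what a schema with a target namespace expects.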
Now we’re ready to start mapping!
Ordered Delivery is a great feature of BizTalk 2006. The problem is however that as soon as one introduces an orchestration in the process it doesn’t work. The Ordered Delivery option on ports in BizTalk 2006 ensures two things today:
Fine. But if we introduce an orchestration, each message being persisted in the MessageBox will start its own orchestration instance. These instances could (and in many cases will) finish in a different order than they started. This means that we lose the correct order of the messages once they are persisted back to the MessageBox from the orchestration instance.
The above “limitation” is well known and the solution to this has always been a Sequential Orchestration pattern (also called Singleton Orchestration). Basically this correlates messages from, for example, a specific Receive Port or of a particular Message Type. This ensures that only one instance of the orchestration will be started and we can keep the order of the messages. We’ve experienced a lot of problems using this kind of solution for Ordered Delivery - everything from Zombies to orchestrations using huge amounts of memory.
Now it finally looks like this problem is about to be solved! Microsoft recently released a white paper (download it here - found it via Richard Seroter’s blog) that solves these problems in another way. Basically, a pipeline stamps the messages with a sequence number as they enter BizTalk (the messages are still in order at that point as we’re using Ordered Delivery on the port). Then one can have as many orchestrations as one likes processing these messages and publishing them back to the MessageBox (now possibly out of order but with a sequence number stamped on them). The last business orchestration sets a “destination” property that routes the message to a resequence orchestration. This orchestration resequences the messages: by checking the last sequence number it sent out, it decides if the current message should be sent out or put back on the queue.
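The resequencing step itself is easy to sketch. The Python class below is a toy illustration of the idea only, not the white paper’s implementation: the class name, the in-memory `pending` dictionary and the assumption that sequence numbers start at 1 without gaps are all mine.

```python
class Resequencer:
    """Holds back out-of-order messages and releases them in sequence."""

    def __init__(self, first_sequence=1):
        self.next_sequence = first_sequence  # the number we are waiting for
        self.pending = {}                    # messages that arrived too early
        self.sent = []                       # messages released, in order

    def receive(self, sequence, message):
        self.pending[sequence] = message
        # Release the waiting message and any queued successors.
        while self.next_sequence in self.pending:
            self.sent.append(self.pending.pop(self.next_sequence))
            self.next_sequence += 1
```

Feeding it messages 2, 3 and then 1 releases nothing until message 1 arrives, after which all three go out in order. The sketch also shows why the queue can grow: if one message is slow or lost, everything behind it stays in `pending`.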
My biggest concern with this solution is that it is still based on a singleton. We’ve had cases where the send procedure has been extremely slow (for example when we used SOAP and had a slow Web Service at the other end), and the orchestration then built up in memory as it queued messages internally. However it looks like this solution is well thought through with a “flush queue” functionality, a stop message, some ideas on how to handle errors (remember that if the singleton fails and you don’t handle this your processes stop) and so on.
Read it and please let me know what you think!
I’ve written about the ESB concept before and what I think an ESB architecture is. In that post’s comments there was some discussion about whether BizTalk is an ESB or not. And if not - why not?
I think this article does a great job of explaining and discussing this subject. Basically it says that the main reason BizTalk doesn’t qualify as an ESB today (I know about the ESB Guidance - we’ll get there ;)) is its “all-or-nothing” packaging. What that means is that its different pieces of functionality cannot be deployed separately across a bus structure. For example, the scenario of having the transformation functionality in one place and the routing functionality in another just isn’t possible in today’s architecture. Today you install the full product in one place.
I think I’ve personally learnt to live with this limitation. On the other hand I can see that the possibility of splitting parts up definitely changes things such as the possibility of reuse, single point of failure problems and so on.
I still haven’t had as much time as I’d like to examine the ESB Guidance but I look forward to seeing how they worked around the problem described above.
Just looking at this architecture image shows that they’ve split things up in a new way and that each part is accessible through services - nice! Could this be the future architecture of BizTalk Server? What do you think?
I’ll try and install the ESB project as soon as I get some more time on my hands. In the meantime I’d love some tips and comments on articles and other reading on the experiences of the ESB Guidance project.
I’ve often been told that one of the biggest mistakes people make when implementing a service oriented architecture is that they don’t re-architect their current architecture to become service oriented. I’ve never really understood what they meant by that until I read this article. SOA is not about adding a service based call to expose your current procedure calls as a service - it’s so much more. It’s about enabling ease of change and creating a more agile architecture.
Each of these assumptions exist in a Remote Procedure Call. They are forms of coupling, pure and simple. They fly in the face of SOA.
What do you think? Does it make sense in your world?
I’ve blogged about the Visual Studio extension XPathmania before. It’s a very simple little tool that lets you write and test XPath inside of Visual Studio 2005. No big deal if you already have tools like XMLSpy or XML Notepad but still. I like not having to start another application, opening the XML document I’m working with and so on. Doing stuff inside of Visual Studio just feels right and saves some time anyway.
One of the latest episodes of dnrTV hosted Don Demsak (Don XML), the creator of XPathmania. The show is 30% about XPathmania and 70% about XPath and XPath syntax in general. It’s kind of basic XPath but I think it can be useful for someone who feels they haven’t got full control of the language.
During the show they touch on XML namespaces and XML default namespaces. However they don’t really explain the difference between them and how it affects the document. That’s sad, as I feel that XML namespaces (and especially default namespaces) are something most people haven’t fully understood.
Anyway, if you’ve got some spare time, watch it or forward it to someone you think should watch it.
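A small experiment shows the difference. The sketch below uses Python’s standard ElementTree rather than XPathmania, and the two documents and the `urn:example:orders` namespace are made up for illustration. Both documents mean exactly the same thing: one binds the namespace to a prefix, the other declares it as the default namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: same namespace, declared with a prefix and as a default.
PREFIXED = '<o:Order xmlns:o="urn:example:orders"><o:Id>1</o:Id></o:Order>'
DEFAULT = '<Order xmlns="urn:example:orders"><Id>1</Id></Order>'


def find_id(xml_text, path, namespaces=None):
    """Return the text of the first node matching path, or None if nothing matches."""
    node = ET.fromstring(xml_text).find(path, namespaces)
    return None if node is None else node.text


ORDERS = {"x": "urn:example:orders"}  # the prefix used in the path is our own choice
```

With these definitions `find_id(DEFAULT, "Id")` finds nothing, because the unprefixed `Id` in the path means “no namespace” while the element in the document belongs to the default namespace. `find_id(DEFAULT, "x:Id", ORDERS)` and `find_id(PREFIXED, "x:Id", ORDERS)` both return the value, even though neither document uses the prefix `x`.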