A situation that comes up regularly is receiving multiple individual entities in a tabular (not hierarchical) format, such as CSV or other flat file formats.
In this situation, the entity breaks can’t be inferred by structure, instead, it has to be done by value.
A popular way of addressing this is regrouping in XSL using the Muenchian Method or similar technique. This is perfectly fine and does work, but one can credibly argue that XSL is not the best language for this and in the BizTalk Stack, we definitely have a serious challenger, namely SQL.
In this article, we will demonstrate how to regroup an untagged tabular dataset coming from CSV into individual entities using SQL instead of XSL. The sample project can be found in the MSDN Code Gallery at: BizTalk: Sorting and Grouping Flat File Data In SQL…Instead of XSL
We receive Purchase Order data from an external source in CSV in the format:
PONumber1,2018-02-02T15:29:49.480,1,ItemID01,2
PONumber1,2018-02-02T15:29:49.480,2,ItemID02,2
PONumber2,2018-02-02T15:29:49.480,1,ItemID01,2
While it’s easily readable that there are two POs, the lack of line tags means the Flat File Disassembler cannot distinguish where PO 1 ends and PO 2 begins.
The lack of line tags is a problem for the Flat File Disassembler, but not for SQL.
To have SQL Server regroup the data, we need nothing more than a Stored Procedure called by the WCF SQL Adapter using a User Defined Table Type. No permanent storage or other SQL artefacts are necessary.
First, we create the User Defined Table Type to represent CSV data:
CREATE
TYPE [dbo].[POCSV]
AS
TABLE
(
[PONumber] [
varchar
](50)
NULL
,
[PODate] [datetime]
[POLineNumber] [
int
]
[ItemID] [
[ItemQuantity] [
)
GO
The sample is very simple, but it contains all the fields we would need to group by PO and Lines.
Next, we use a FOR XML Stored Procedure taking the User Defined Table Type as a parameter.
PROCEDURE
[dbo].[GroupPOsFromCSV]
-- Add the parameters for the stored procedure here
@POCSV
as
POCSV READONLY
BEGIN
SET
NOCOUNT
ON
;
SELECT
DISTINCT
PurchaseOrder.PONumber, PurchaseOrder.PODate, POLine.ItemID, POLine.ItemQuantity
FROM
PONumber, PODate
@POCSV)
PurchaseOrder
LEFT
JOIN
POLine
PurchaseOrder.PONumber = POLine.PONumber
FOR
XML AUTO, ELEMENTS
END
In the FOR XML query, the grouping by PO and Lines is a natural product of the LEFT JOIN. Compared to an XSL method, this is substantially less code and in nearly all cases, more readable and maintainable.
There is nothing special or unusual in the BizTalk app to use SQL for grouping. The app contains some regular BizTalk artefacts:
User Defined Table Types is fully supported by the WCF sqlBinding and appear in the generated Schemas in a predictable pattern. Here, is the Map from the CSV to the Stored Procedure:
<
>
PONumber
>PONumber1</
PODate
>2018-02-02T15:29:49.480</
ItemID
>ItemID01</
ItemQuantity
>2</
</
>ItemID02</
The sample project can be found in the MSDN Code Gallery at BizTalk: Sorting and Grouping Flat File Data In SQL…Instead of XSL