This morning I set out on a journey to convert an InfoPath form into a PDF document programmatically. The need came up in a Visual Studio workflow: after a user submits the associated InfoPath form, the workflow takes that form, converts it to a PDF, and then archives the PDF in a separate site collection. This turned out to be a neat little business process when all was said and done.
As far as generating the PDF goes, I decided to enlist a little help from a third party. I ran into Winnovative PDF, which has a tool that converts HTML into a PDF. While this isn't the silver bullet I had hoped for (one that takes the form itself, rather than HTML), I knew it wasn't that hard to convert an InfoPath form into HTML. Then the tool can do the dirty work of building the PDF for me.
Convert an InfoPath form to HTML
The basic steps are to grab the SPFile object that is the InfoPath form you want to convert to a PDF, then open a Stream over the XSL that transforms the form data (XML) into HTML. You can get the XSL for an InfoPath form out of the XSN file itself. The XSN file behind an InfoPath form is actually just a CAB that holds many kinds of files, including our XSL. You'll see one XSL file for every view you've built in the InfoPath form; the one you want is the XSL for the view you intend to render.
After you have the XSL in your Visual Studio project as an embedded resource, use the following code to transform the form XML data into HTML:
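Since the XSN is a CAB archive, one way to get at the view stylesheets is to unpack it with the expand.exe utility that ships with Windows. What follows is only a rough sketch of that idea: the XsnExtractor class and its method names are made up for illustration, and it assumes expand.exe is on the PATH of the machine running it.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper (not part of InfoPath or SharePoint).
public static class XsnExtractor
{
    // Builds the expand.exe arguments: source archive, "-F:*" (all files), destination folder.
    public static string BuildExpandArguments(string xsnPath, string outputDir)
    {
        return string.Format("\"{0}\" -F:* \"{1}\"", xsnPath, outputDir);
    }

    // Unpacks the XSN (a CAB archive) and returns the paths of the XSL files found in it.
    public static string[] ExtractViewStylesheets(string xsnPath, string outputDir)
    {
        // expand.exe expects the destination folder to exist already
        Directory.CreateDirectory(outputDir);

        ProcessStartInfo startInfo = new ProcessStartInfo(
            "expand.exe", BuildExpandArguments(xsnPath, outputDir));
        startInfo.UseShellExecute = false;
        startInfo.CreateNoWindow = true;

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "expand.exe exited with code " + process.ExitCode);
            }
        }

        // one XSL file per view in the form template
        return Directory.GetFiles(outputDir, "*.xsl");
    }
}
```

Calling XsnExtractor.ExtractViewStylesheets with the path to your XSN and a scratch folder should leave the view XSL files in that folder, ready to add to the project. Pass the folder without a trailing backslash, since a backslash right before the closing quote would confuse the argument parsing.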
// namespaces used: System.IO, System.Reflection, System.Xml,
// System.Xml.Xsl, Microsoft.SharePoint
string html = "";
using (SPSite site = new SPSite("http://contoso.com/acceptance"))
{
    // RootWeb is owned by the SPSite, so it is not disposed separately
    SPWeb web = site.RootWeb;
    // grab the folder that contains the form
    SPFolder folder = web.Folders["InfoPath Forms"];
    // grab the InfoPath form itself
    SPFile file = folder.Files["SomeForm.xml"];
    XslCompiledTransform transform = new XslCompiledTransform();
    // embedded resource file name (this is your XSL document
    // you pulled out of the XSN/CAB)
    string resourceName = @"AssemblyName.SomeForm.xsl";
    // load the resource out of the DLL and into a stream
    using (Stream res =
        Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName))
    {
        // GetManifestResourceStream returns null when the name does not match
        if (res == null)
            throw new FileNotFoundException("Embedded stylesheet not found", resourceName);
        // load stylesheet into the transformer
        using (XmlTextReader stylesheet = new XmlTextReader(res))
        {
            transform.Load(stylesheet);
        }
    }
    // read the contents (XML) out of the InfoPath form
    using (StringWriter sWriter = new StringWriter())
    using (XmlReader reader = new XmlTextReader(file.OpenBinaryStream()))
    {
        // perform the transformation; writing to a TextWriter lets the
        // stylesheet's own xsl:output settings control the serialization
        transform.Transform(reader, null, sWriter);
        // the transformation leaves the resulting HTML in the StringWriter
        html = sWriter.ToString();
    }
}
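If you want to try a view stylesheet without a SharePoint site handy, the transform step can be pulled out on its own. Here is a minimal sketch of that, assuming the form XML and the XSL are already in memory as strings; the FormTransformer class and TransformToHtml method are hypothetical names, not anything from the SharePoint or InfoPath APIs.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper: the XSLT step on its own, with no SharePoint objects involved.
public static class FormTransformer
{
    // Applies the view stylesheet to the form XML and returns the resulting markup.
    public static string TransformToHtml(string formXml, string viewXsl)
    {
        XslCompiledTransform transform = new XslCompiledTransform();

        // compile the stylesheet from its text
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(viewXsl)))
        {
            transform.Load(stylesheet);
        }

        using (StringWriter output = new StringWriter())
        using (XmlReader input = XmlReader.Create(new StringReader(formXml)))
        {
            // no XSLT arguments are passed, hence the null
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

For example, passing a tiny document such as <form><name>Phil</name></form> together with a stylesheet that emits the name inside a paragraph should hand back HTML containing that name, which makes it easy to eyeball a view before wiring it into the workflow.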
Convert HTML into a PDF
For this second stage in the process, I used a tool from Winnovative to do the heavy lifting. It's a great tool that integrates into Visual Studio and comes in at an inexpensive $350. Their API is rather intuitive, as you can see:
PdfConverter pdfConverter = new PdfConverter();
// set the license key (this is the public trialware license)
pdfConverter.LicenseKey = "Q2hzY3Jjc2N3bXNjcHJtcnFtenp6eg==";
// set the converter options
pdfConverter.PdfDocumentOptions.PdfPageSize = PdfPageSize.A4;
pdfConverter.PdfDocumentOptions.PdfCompressionLevel = PdfCompressionLevel.Normal;
pdfConverter.PdfDocumentOptions.PdfPageOrientation = PDFPageOrientation.Portrait;
pdfConverter.PdfDocumentOptions.ShowHeader = false;
pdfConverter.PdfDocumentOptions.ShowFooter = false;
// set to generate selectable pdf or a pdf with embedded image
pdfConverter.PdfDocumentOptions.GenerateSelectablePdf = true;
// perform the conversion and get the PDF document bytes
byte[] pdfBytes = pdfConverter.GetPdfBytesFromHtmlString(html);
At the end of this code block, you're left with a byte array that contains all the bytes that make up the PDF file. Let's now take those bytes and write them back into SharePoint, but this time as a PDF!
// open SPWeb/SPFolder that will be taking the PDF
SPFolder archive = someSPWeb.Folders["Archived Forms"];
// grab the SPFile that is the InfoPath form, and use the
// name for the new PDF's name, but change the extension
// (Path.ChangeExtension, from System.IO, only touches the extension)
string filename = Path.ChangeExtension(file.Name, ".pdf");
// Add the file to the Files collection and commit to database
SPFile archivedFile = archive.Files.Add(
    archive.Url + "/" + filename, pdfBytes, true);
archive.Update();
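Before handing the bytes to archive.Files.Add, it can be worth a quick check that the converter really produced a PDF. PDF files begin with the ASCII signature "%PDF-", so a small guard like the one sketched here would catch an empty or bogus result. PdfSanity and LooksLikePdf are hypothetical names of my own, not part of the Winnovative API.

```csharp
using System.Text;

// Hypothetical guard: checks for the "%PDF-" signature at the start of the bytes.
public static class PdfSanity
{
    public static bool LooksLikePdf(byte[] bytes)
    {
        byte[] signature = Encoding.ASCII.GetBytes("%PDF-");

        // null or too-short input cannot be a PDF
        if (bytes == null || bytes.Length < signature.Length)
        {
            return false;
        }

        // compare the leading bytes against the signature
        for (int i = 0; i < signature.Length; i++)
        {
            if (bytes[i] != signature[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

In the workflow this would sit right after the conversion, for example skipping the archive step (or logging an error) when PdfSanity.LooksLikePdf(pdfBytes) comes back false.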
SWEET! Now we have an archived InfoPath form that's in a PDF format! Optionally, you could write the file to the file system:
string filename = Path.ChangeExtension(file.Name, ".pdf");
// write the document to a folder on the server's hard drive
using (FileStream s = new FileStream("c:\\somefolder\\onserver\\harddrive\\" + filename, FileMode.Create))
{
    s.Write(pdfBytes, 0, pdfBytes.Length);
}
What CAN'T you do with code? And ain't workflows sweet? I'm such a geek (and a pseudo hick) :)
Phil