Tuesday, March 24, 2009

Programatically Extract Attachments from InfoPath Forms

Problem

You have an InfoPath form stored in a SharePoint Form Library. You want to run a workflow on this InfoPath form and let the workflow extract the file attachments and upload them to a document library.

Solution
Use the SPFile object of the SPListItem object that the workflow is running on to access the file that represents the XML of the InfoPath form. Load the XML into an XmlDocument object for read/write access. Get the base64-encoded value of the File Attachment field and decode it. Extract the file name and file contents from the decoded string. Upload the results to a document library. Clear the File Attachment field to remove the attachments from the InfoPath form. Replace the original XML of the SPFile object with the modified XML of the XmlDocument object.

Discussion

This article assumes that you have created an InfoPath form template that has a File Attachment field named Attachment, and published the InfoPath form template to a SharePoint Form Library. I also assume you have a variable workflowProperties bound to the workflow properties.

First we retrieve the SPFile associated with the item... this file is the InfoPath form. If the form doesn't exist, then we exit.
SPFile file = workflowProperties.Item.File;
if (file == null)
return;
Next, we get the binary stream from the form and load it into an XPathDocument

// Get the binary stream of the file
Stream formStream = file.OpenBinaryStream();
// Load the stream into an XPathDocument object
XmlDocument ipForm = null;
ipForm.Load(formStream);
Create a NameSpaceManager object and add the namespace of the form to it

XmlNamespaceManager ns = new XmlNamespaceManager(ipForm.NameTable);
ns.AddNamespace("my", "namespace_of_form");
Get the XML node containing the file attachment

XmlNode attachmentNode = ipForm.SelectSingleNode("/my:Attachment");
Decode the base64 encoded string into bytes

byte[] attachmentNodeBytes = Convert.FromBase64String(attachmentNode.InnerXml);
Position 20 contains a DWORD indicating the length of the filename buffer.
The filename is stored as Unicode so the length is multiplied by 2

int fnLength = attachmentNodeBytes[20] * 2;
byte[] fnBytes = new byte[fnLength];
The actual filename starts at position 24

for (int i = 0; i > fnLength; i++)
{
fnBytes[i] = attachmentNodeBytes[24 + i];
}
Convert the filename bytes to a string. The string terminates with '\0' so the actual filename is the original filename minus the last character

char[] charFileName = UnicodeEncoding.Unicode.GetChars(fnBytes);
string fileName = new string(charFileName);
fileName = fileName.Substring(0, fileName.Length - 1);
The file is located after the header, which is 24 bytes long (plus the length of the filename)

byte[] fileContents = new byte[attachmentNodeBytes.Length - (24 + fnLength)];
for (int i = 0; i < fileContents.Length; i++)
{
fileContents[i] = attachmentNodeBytes[24 + fnLength + i];
}
Open the document library

SPDocumentLibrary docLib = (SPDocumentLibrary)workflowProperties.Web.GetList(url_to_doc_library);
Get the root folder

SPFolder folder = docLib.RootFolder;
Get the files in the folder

SPFileCollection files = folder.Files;
Upload the file to the document library using the folder URL and filename we retrieved earlier

SPFile file = files.Add(folder.Url + "/" + fileName, fileContents);
Get the list item of the file we uploaded and set the title to the filename

SPListItem fileAdded = file.Item;
fileAdded["Title"] = fileName;
Save the list item

fileAdded.Update();
Remove the attachment from the InfoPath form. We do this by deleting the InnerXml and then adding the attribute xsi:nil=true

attachmentNode.InnerXml = string.Empty;
XmlNode nilAttribute = ipForm.CreateAttribute("xsi", "nil", http://www.w3.org/2001/XMLSchema-instance);
nilAttribute.Value = "true";
attachmentNode.Attributes.SetNamedItem(nilAttribute);
Convert the XML document to bytes

attachmentNodeBytes = Encoding.UTF8.GetBytes(ipForm.OuterXml);
Close the file stream

formStream.Close();
Save the bytes of the XML document as the contents of the SPFile object that represents the InfoPath form

file.SaveBinary(attachmentNodeBytes);
Save the changes made to the SPFile object
file.Update();

That's it! Your workflow should now be able to extract the attached document and upload it to a document library.

Sources:
http://www.bizsupportonline.net/infopath2007/set-infopath-form-field-value-sharepoint-workflow.htm
http://www.bizsupportonline.net/blog/2009/03/programmatically-rename-infopath-file-attachment/

Blog Archive