22 January 2012

Media from Sharepoint to the Cloud

In my last post I talked about a media processing component developed for Sharepoint 2010. Besides the processing features (video encoding, thumbnail generation, validations, etc.), there was an asset storage manager that let us store the files in a configurable location, inside or outside Sharepoint. I said we initially started with three storage flavors: a Sharepoint library, an FTP server, or a shared folder/virtual directory.

My co-worker Dwight Goins has extended the options by adding a cloud-based storage manager. In this case he uses the Amazon Simple Storage Service (Amazon S3), which resulted in a well-performing storage solution. In his post, he digs deeper into the issues of storing BLOBs (Binary Large Objects) in Sharepoint and the solution that Sharepoint provides, SQL's Remote BLOB Storage (RBS). Then he talks about our solution and the details of the Amazon S3 client. Check it out!

12 January 2012

Media Processing Component for Sharepoint

I have recently been working on building a media processing component for the Sharepoint project I am involved in. The requirements for the component are more or less the following:

  • The Site Content Creators must be able to upload media assets (images, audio and video) to a Sharepoint list, and these assets must satisfy certain validation rules.
  • The final destination where the assets are saved after upload must be configurable and extensible. To begin with, we are supporting saving to a Sharepoint library, a network share or an FTP server.
  • The videos must be encoded to MP4 format, and thumbnail and poster images must be generated. The encoding process must be run asynchronously and the user must be notified by email when it is finished.
Today I want to share the design of the component and the key pieces of code. I will focus on the video upload process, which is the most complex one because of the encoding. Audio and image uploads are more straightforward.
The main parts that build the solution are:
  1. Custom Upload Process: This is the front end of the solution. It consists of a custom list with a custom upload form. The list holds the link to the media file plus more metadata fields (title, author, date, keywords, etc.). When you click to create a new item on the list, the custom upload form opens and you can browse for a file to upload. The form contains the required validation logic and saves the assets to the configured location, which can be a Sharepoint library or an external location, like the file system or an FTP server. When the upload finishes you are redirected to the list item edit form so you can enter the metadata. The experience is similar to uploading a file to a Sharepoint document library.
  2. Media Processing Backend Process: This consists of a timer job that queries the Media Assets list for items to process. It encodes the videos, generates thumbnail and poster images and uploads everything to the final destination. Finally, it notifies the user of the result of the process by email. For the video encoding we used the Microsoft Expression Encoder SDK. As I will explain later, this SDK cannot be used inside a Sharepoint process, so it runs in a separate process that is invoked from the timer job.
  3. Storage Manager: This is a flexible and extensible component that abstracts the logic of saving (and deleting) a file at its final location, depending on the flavor chosen through configuration (File System, Sharepoint library or FTP). This component is used both by the front end upload mechanism and the back end media processing job.
Here is a diagram of the overall design for the video processing:
MediaProcessing
Now I will explain in a little more detail each component:

1. Custom Upload Process

The Media Assets List
This is a Sharepoint list that stores the metadata of the media assets, but not the assets themselves (the assets are stored in the final storage, which can be a Sharepoint assets library, a network shared folder, or an FTP server). The list is based on three custom content types, WebVideo, WebAudio and WebImage, all three inheriting from a base MediaAsset content type. This content type has the fields required for saving the asset metadata. The most important ones for the processing are:
  • Location: the URL of the asset in its final location (in the example in the picture it is a Sharepoint library called MediaAssetsLib).
  • Temp Location: as videos need asynchronous processing, they are saved to a temporary location on upload; the timer job uploads them to the final location after encoding. The temp location is a shared folder on the network.
  • Processing Status: Success for assets successfully uploaded to the final storage, Pending for assets waiting for encoding in the back end process, and Error if the encoding fails.
MediaAssetsList
The list has an event receiver attached that deletes the assets from the final destination or the temporary folder when items are deleted from the list.
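The receiver override can be sketched roughly like this; the class name, the field names and the "Media" config category are assumptions for illustration, not the actual project code:

```csharp
//Hypothetical sketch of the list event receiver; field names are assumptions.
public class MediaAssetEventReceiver : SPItemEventReceiver
{
    public override void ItemDeleting(SPItemEventProperties properties)
    {
        SPListItem item = properties.ListItem;
        string location = item["Location"] as string;
        string tempLocation = item["TempLocation"] as string;

        //delete the asset from its final destination through the Storage Manager
        if (!String.IsNullOrEmpty(location))
        {
            IAssetStorageManager storage =
                AssetStorageFactory.GetStorageManager("Media", properties.WebUrl);
            storage.Delete(location);
        }

        //delete the temporary copy, in case the video was still pending processing
        if (!String.IsNullOrEmpty(tempLocation) && File.Exists(tempLocation))
            File.Delete(tempLocation);
    }
}
```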
To achieve the storage flexibility, a custom upload form was developed and hooked to the MediaAssets list. When you click the "Add new item" link in the picture above, the custom upload form is launched. The form is attached to the base content type definition in the Elements file like this:
<?xml version="1.0" encoding="utf-8"?>
<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
  <!-- Parent ContentType: Item (0x01) -->
  <ContentType ID="0x01004e4f21afc14c487892253cb129dd5001"
               Name="MyMediaAsset" Group="MyContent Types"
               Description="My Media Asset" Inherits="TRUE"
               Version="0">
    <FieldRefs>    </FieldRefs>
    <XmlDocuments>
      <XmlDocument NamespaceURI="http://schemas.microsoft.com/sharepoint/v3/contenttype/forms/url">
        <FormUrls xmlns="http://schemas.microsoft.com/sharepoint/v3/contenttype/forms/url">
          <New>_layouts/MyMedia/Upload.aspx</New>
        </FormUrls>
      </XmlDocument>
    </XmlDocuments>
  </ContentType>
</Elements>
The Upload Form
The form was created as an Application Page (Upload.aspx) in the Layouts folder. It contains the browse control to upload the file.

MediaAssetUploadForm

A useful tip here is how to achieve the same look and feel as the Sharepoint OOB forms. The InputFormSection, InputFormControl and ButtonSection controls were used for that purpose.

In order to use these controls, you need to register the namespace on the top of the page:

<%@ Register TagPrefix="wssuc" TagName="ButtonSection" Src="/_controltemplates/ButtonSection.ascx" %>
<%@ Register TagPrefix="wssuc" TagName="InputFormSection" Src="/_controltemplates/InputFormSection.ascx" %>
<%@ Register TagPrefix="wssuc" TagName="InputFormControl" Src="/_controltemplates/InputFormControl.ascx" %>
And then include them in the page like this:

<wssuc:InputFormSection ID="InputFormSection1" runat="server"
      Title="Upload Document" Description="Browse to the media asset you intend to upload." >
  <template_inputformcontrols>
    <wssuc:InputFormControl runat="server" LabelText="" >
      <Template_Control>                   
        <div class="ms-authoringcontrols">
          Name: <br />
          <input id="FileToUpload" class="ms-fileinput" size="35" type="file" runat="server">    
          <asp:RequiredFieldValidator id="RequiredFieldValidator" runat="server" ErrorMessage="You must specify a value for the required field." ControlToValidate="FileToUpload"></asp:RequiredFieldValidator>
          <br />
          <asp:RegularExpressionValidator id="FileExtensionValidator" runat="server" ErrorMessage="Invalid file name." ControlToValidate="FileToUpload"></asp:RegularExpressionValidator>
                                                                                   
        </div>
      </Template_Control>
    </wssuc:InputFormControl>
  </template_inputformcontrols>
</wssuc:InputFormSection>

You can read more about how to use these controls here.

So, what happens in the code behind?

On the OK button submit handler, the file to upload is processed. The logic differs depending on the asset type. Images are copied to the final destination using the Storage Manager, and a thumbnail is also generated and uploaded for them. Duration is calculated for audio and video files (the Microsoft.WindowsAPICodePack API is used for that). Audio files are also copied to the final destination. Videos, instead, are left in temporary storage (a network shared folder), because they need to be processed later by the timer job.

In all three cases, a list item is created with the asset metadata and inserted into the MediaAssets list. Then the user is redirected to the list item Edit form, so they can fill in the rest of the metadata.


MediaAssetsEditForm

Since the upload process may take a long time, all this happens in the context of an SPLongOperation. This is a Sharepoint class provided to display the “Processing…” dialog with the rotating gear image in it.

Here is part of the code:

protected void btnOk_Click(object sender, EventArgs e)
{
    if (FileToUpload.PostedFile == null || String.IsNullOrEmpty(FileToUpload.PostedFile.FileName))
        return;     //FileToUpload is the HtmlInputFile control
    var originFileInfo = new FileInfo(FileToUpload.PostedFile.FileName);
    SPWeb web = SPContext.Current.Web;
    try
    {
        //create a MediaAsset object to save all asset metadata
        MediaAsset asset = MediaAsset.FromFile(originFileInfo, web.Url, mediaConfig);
        //start long operation to show the user the "Processing..." message
        using (SPLongOperation longOperation = new SPLongOperation(this.Page))
        {
            longOperation.Begin();

            string newFileUniqueName = String.Concat(Guid.NewGuid().ToString(), originFileInfo.Extension);

            SPSecurity.RunWithElevatedPrivileges(delegate()
            {
                //save to file system. Need to elevate privileges for that
                var tempFileInfo = new FileInfo(Path.Combine(mediaConfig.TempLocationFolder, newFileUniqueName));
                FileToUpload.PostedFile.SaveAs(tempFileInfo.FullName);

                asset.TempLocation = tempFileInfo.FullName;
                asset.Duration = MediaLengthCalculator.GetMediaLength(tempFileInfo);
                ...
            });
            
            var list = web.Lists[mediaConfig.MediaAssetsListName];
            int id;
            string contentTypeId;
            //insert new item in the MediaAssets list
            mediaRepository.Insert(list, asset, out id, out contentTypeId);
            //build url of Edit Form to redirect 
            string url = String.Format("{0}?ID={1}&ContentTypeId={2}", list.DefaultEditFormUrl, id, contentTypeId);
            //long operation ends, redirecting to the Edit Form of the new list item
            longOperation.End(url);
        }
    }
    catch (ThreadAbortException) { /* Thrown when redirected */}
    catch (Exception ex)
    {
        logger.LogToOperations(ex, Categories.Media, EventSeverity.Error,
            "Error uploading file to MediaAssets list. FileName: '{0}'.", originFileInfo.Name);

        SPUtility.TransferToErrorPage(ex.Message);
    }
}


Ok, now let’s see how the asynchronous processing part works.

2. Media Processing Backend Process


The backend process consists of a Sharepoint Timer Job that orchestrates the video processing and a console application that performs the actual encoding and generates the images. The console application is invoked by the timer job.

Encoder Console Application
The tool chosen for encoding videos was Microsoft Expression Encoder 4. We used the Pro version (paid), which includes support for H.264 (meaning it can encode videos to the MP4 format required by our client).

The encoder comes with an SDK, so you can encode your videos programmatically. The catch is that this API depends upon .Net Framework 4.0, and it is also 32-bit only. This is incompatible with a Sharepoint process (either the web application or a timer job), since Sharepoint relies upon .Net 3.5 and runs in 64-bit. Hence the need to build a separate process outside of the Sharepoint Timer Job. A console application was a simple solution, and it could be configured to target .Net Framework 4.0 and the x86 platform.

This application expects four input parameters: the path to the original video, the desired path for the generated thumbnail image, the desired path for the generated poster image, and the desired path for the encoded video.
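The entry point of the console application is then just a thin wrapper around the encoder calls. Here is a sketch; the VideoEncoder wrapper class and the image sizes are hypothetical, not the actual project code:

```csharp
//Sketch of the encoder console application entry point.
//Argument order matches the four parameters described above.
static int Main(string[] args)
{
    if (args.Length != 4)
    {
        Console.Error.WriteLine(
            "Usage: Encoder.exe <inVideo> <outThumbnail> <outPoster> <outVideo>");
        return 1;
    }

    try
    {
        var encoder = new VideoEncoder(); //hypothetical wrapper over the Encoder SDK calls shown below
        encoder.GenerateVideoImage(new FileInfo(args[0]), args[1], 120, 90);  //thumbnail
        encoder.GenerateVideoImage(new FileInfo(args[0]), args[2], 640, 480); //poster
        encoder.EncodeVideo(new FileInfo(args[0]), new FileInfo(args[3]));
        return 0;
    }
    catch (Exception ex)
    {
        //the timer job reads standard error when the exit code is non-zero
        Console.Error.WriteLine(ex.Message);
        return 1;
    }
}
```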

The Encoder SDK provides full flexibility for setting the encoding parameters (like format, size, bitrate, etc.). It also provides a set of presets that lets you implement the encoding very easily. For example, here is the code for encoding using the H264VimeoSD preset:

public void EncodeVideo(FileInfo inputFile, FileInfo outputFile)
{
    Microsoft.Expression.Encoder.MediaItem mediaItem = new MediaItem(inputFile.FullName);
    int bitrate = GetBitrate(mediaItem);

    using (Microsoft.Expression.Encoder.Job job = new Job())
    {
        job.OutputDirectory = outputFile.Directory.FullName;
        job.CreateSubfolder = false;
        job.MediaItems.Add(mediaItem);
        //H264VimeoSD preset settings: Output Format: MP4. Container: MP4. Video Codec: H.264 - Main. 
        //Video size: 640, 480. Video Bitrate: 2500 Kbps. Video Encoding: CBR SinglePass. 
        //Audio Codec: AAC. Audio Channels: Stereo. Audio Bitrate: 128 Kbps. Audio Encoding: CBR Single Pass
        job.ApplyPreset(Microsoft.Expression.Encoder.Presets.H264VimeoSD);
        job.Encode();
    }
}

And here is the code for generating the thumbnail or poster images:

public void GenerateVideoImage(FileInfo mediaFile, string imageFilePath, int width, int height)
{
    var video = new MediaItem(mediaFile.FullName);
    using (var bitmap = video.MainMediaFile.GetThumbnail(
        new TimeSpan(0, 0, 5),
        new System.Drawing.Size(width, height)))
    {
        bitmap.Save(imageFilePath, ImageFormat.Jpeg);
    }
}
Media Processing Timer Job
Sharepoint supports asynchronous processing of data through Timer Jobs. These jobs run in the context of a Windows service, and are easily managed and deployed using the same tools as any other Sharepoint solution.

As the requirement was to run the job only in one of the application servers, it inherits from SPServerJobDefinition. Here is the Timer Job code:

public class MediaProcessingTimerJob : SPServerJobDefinition
{
    private Logger logger = new Logger();

    public MediaProcessingTimerJob() : base()
    {
    }

    public MediaProcessingTimerJob(string name, SPServer server) : base(name, server)
    {
        this.Title = "MediaProcessingTimerJob";
    }

    public override void Execute(SPJobState jobState)
    {
        string webUrl = String.Empty;
        try
        {
            webUrl = this.Properties["webUrl"].ToString();
            var mediaProcessor = new MediaProcessor(webUrl, MediaConfig.FromConfigRepository(webUrl));
            mediaProcessor.ProcessMedia();
        }
        catch (Exception ex)
        {
            logger.LogToOperations(ex, Categories.Media, EventSeverity.Error,
                "Error executing MediaProcessingTimerJob in web '{0}'", webUrl);
        }
    }
}

During the deployment process, the MediaProcessingTimerJob is installed on the required server. The URL of the website whose assets it processes is passed through the job properties.

Here is part of the code of the helper tool that installs the job and sets it to run every 15 minutes:

private static void CreateMediaJob(string webUrl,SPServer server)
{
    var job = new MediaProcessingTimerJob("my-job-media-processing", server);
            
    job.Properties.Add("webUrl", webUrl);
            
    var schedule = new SPMinuteSchedule();
    schedule.BeginSecond = 0;
    schedule.EndSecond = 59;
    schedule.Interval = 15;
    job.Schedule = schedule;
    job.Update();
}

The logic of the timer job resides in the MediaProcessor::ProcessMedia method. It essentially queries the Media Assets List for assets in the "Pending" status; for each of these items it invokes the Encoder process, uploads the resulting mp4 video and the generated thumbnail and poster images to the final destination, and finally notifies the user of the result by email.
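In outline, ProcessMedia looks something like the following sketch. The field names, the output path variables and the NotifyUserByEmail helper are assumptions for illustration, and error handling is simplified:

```csharp
public void ProcessMedia()
{
    SPList list = web.Lists[config.MediaAssetsListName];
    //CAML query for items still waiting to be encoded
    var query = new SPQuery
    {
        Query = "<Where><Eq><FieldRef Name='ProcessingStatus'/>" +
                "<Value Type='Text'>Pending</Value></Eq></Where>"
    };

    foreach (SPListItem item in list.GetItems(query))
    {
        try
        {
            string tempVideoPath = (string)item["TempLocation"];
            //invoke the encoder console application (output paths derived from the temp file name)
            ExecuteMediaProcess(tempVideoPath, thumbnailPath, posterPath, encodedVideoPath);

            //upload the results to the final destination through the Storage Manager
            IAssetStorageManager storage = AssetStorageFactory.GetStorageManager("Media", web.Url);
            item["Location"] = storage.Save(new FileInfo(encodedVideoPath));
            item["ProcessingStatus"] = "Success";
        }
        catch (MediaProcessingException)
        {
            item["ProcessingStatus"] = "Error";
        }
        item.Update();
        NotifyUserByEmail(item); //hypothetical helper that emails the result to the uploader
    }
}
```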

This is the code that the job uses to call the console application. Since the job tells the console application the output paths (the paths for the images and the encoded video), it doesn't need to read any output from the console application; it only reads standard error in case the console application fails.

private void ExecuteMediaProcess(string inVideoPath, string outThumbnailPath, string outPosterPath, string outVideoPath)
{
    string args = String.Format("\"{0}\" \"{1}\" \"{2}\" \"{3}\"", inVideoPath, outThumbnailPath, outPosterPath, outVideoPath);

    ProcessStartInfo startInfo = new ProcessStartInfo(config.EncoderExePath);
    startInfo.Arguments = args;
    startInfo.CreateNoWindow = true;
    startInfo.UseShellExecute = false;
    //only standard error is redirected: redirecting standard output without consuming it
    //could fill the buffer and block the child process
    startInfo.RedirectStandardError = true;

    Process process = new Process();
    process.StartInfo = startInfo;
    process.Start();
    string error = process.StandardError.ReadToEnd();
    if (!process.WaitForExit(MaxWaitingProcessMillisecs))
    {
        process.Kill();
        throw new MediaProcessingException("Video encoder process timed out.");
    }

    if (process.ExitCode != 0)
    {
        //the application failed, get the error message from standard error
        throw new MediaProcessingException(String.Format("Video encoder process returned with exit code '{0}'. Error was: '{1}'",
            process.ExitCode, error));
    }
}

3. Storage Manager

The Storage Manager is the piece of code used by both the upload media form and the backend process. It is simply a file manager that abstracts away the actual destination of the files, which is configurable. As I said, we started by supporting saving assets to the file system, to a Sharepoint library, or to an FTP server, but this can be further extended to support other places, like storage in the Cloud.

The need for flexibility in where the files are stored may come from bandwidth or space limitations, from a need to share assets with other applications, or from a need to manage a centralized file store.

Anyway, the interface is very simple:

public interface IAssetStorageManager
{
    void Delete(string fileUrl);
    string Save(System.IO.FileInfo file);
    string Save(string fileName, System.IO.Stream fileStream);
}
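For example, a minimal file system implementation of the interface could look like this. It is a sketch under the assumption that the files in the shared folder are served from a base URL; it is not the exact project code:

```csharp
//Sketch of a file system storage manager; constructor parameter names are assumptions.
public class FileSystemAssetStorageManager : IAssetStorageManager
{
    private readonly string storageFolderPath;   //UNC path of the shared folder
    private readonly string storageBaseAddress;  //base URL the files are served from

    public FileSystemAssetStorageManager(string storageFolderPath, string storageBaseAddress)
    {
        this.storageFolderPath = storageFolderPath;
        this.storageBaseAddress = storageBaseAddress;
    }

    public string Save(FileInfo file)
    {
        string destination = Path.Combine(storageFolderPath, file.Name);
        file.CopyTo(destination, true);
        return storageBaseAddress + file.Name;
    }

    public string Save(string fileName, Stream fileStream)
    {
        string destination = Path.Combine(storageFolderPath, fileName);
        using (FileStream output = File.Create(destination))
        {
            //manual buffer copy (Stream.CopyTo is not available on .Net 3.5)
            var buffer = new byte[32768];
            int read;
            while ((read = fileStream.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
        }
        return storageBaseAddress + fileName;
    }

    public void Delete(string fileUrl)
    {
        //map the public URL back to the file path
        string fileName = fileUrl.Substring(storageBaseAddress.Length);
        File.Delete(Path.Combine(storageFolderPath, fileName));
    }
}
```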

There is a factory that gives you the particular storage manager implementation depending on configuration (all configuration is saved in a Sharepoint list, and the Sharepoint Config Store is used to retrieve it). Here is part of the factory code:

public class AssetStorageFactory
{
    …
    static public IAssetStorageManager GetStorageManager(string configCategory, string webUrl)
    {
        var configHelper = new ConfigHelper(webUrl);
        string storageMethod = configHelper.GetValue(configCategory, StorageMethodConfigKey);
        if ("SPLibrary".Equals(storageMethod, StringComparison.InvariantCultureIgnoreCase))
        {
            return new SPLibraryAssetStorageManager(webUrl, mediaLibraryName);
        }
        else if ("FileSystem".Equals(storageMethod, StringComparison.InvariantCultureIgnoreCase))
        {
            return new FileSystemAssetStorageManager(storageFolderPath, storageBaseAddress);
        }
        else if ("FTP".Equals(storageMethod, StringComparison.InvariantCultureIgnoreCase))
        {
            return new FTPAssetStorageManager(ftpServerUrl, ftpServerPullAdress, ftpServerUsername, ftpServerPassword);
        }
        throw new ArgumentException(String.Format("Incorrect configuration value '{0}' in ConfigStore for category '{1}' and key '{2}'. Supported options are: '{3}'",
            storageMethod, configCategory, StorageMethodConfigKey, "FileSystem|FTP|SPLibrary"));
    }
}

The implementation for a particular flavor is simple. For example, this is how the FTPAssetStorageManager saves a file stream:

public string Save(string fileName, System.IO.Stream fileStream)
{
    string fileUrl = ftpServerUrl + fileName;
    FtpWebRequest request = (FtpWebRequest)WebRequest.Create(fileUrl);
    request.Method = WebRequestMethods.Ftp.UploadFile;
    request.Credentials = new NetworkCredential(username, password);
    using (Stream ftpStream = request.GetRequestStream())
    {
        FileUtils.CopyStream(fileStream, ftpStream);
    }
    return pullBaseAddress + fileName;
}
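The matching Delete for the FTP flavor follows the same pattern. Again, this is a sketch rather than the exact project code; the field names match the Save method above:

```csharp
public void Delete(string fileUrl)
{
    //fileUrl is the public (pull) address returned by Save; map it back to the FTP address
    string fileName = fileUrl.Substring(pullBaseAddress.Length);
    FtpWebRequest request = (FtpWebRequest)WebRequest.Create(ftpServerUrl + fileName);
    request.Method = WebRequestMethods.Ftp.DeleteFile;
    request.Credentials = new NetworkCredential(username, password);
    using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
    {
        //disposing the response completes the FTP conversation
    }
}
```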

And this is how the storage manager is invoked from the Upload form or the Timer Job:

//save to final location
IAssetStorageManager storage = AssetStorageFactory.GetStorageManager("Media", web.Url);
asset.Location = storage.Save(newFileUniqueName, fileInputStream);

Conclusion

Having covered the most important parts of the Media Processing Component for Sharepoint, I think I'm done here. There is too much code to show everything in a post, but I've chosen the most important parts. I might dig deeper in some other post. I will probably write about how to display videos in a web page, too.

The interesting part is that even if this whole component doesn't apply to another project, some of its pieces can be reused, like the video encoding, or the custom upload form and storage manager for saving files outside Sharepoint. So I hope it proves useful to someone else!