CSV ValueProvider for model binding to collections

TL;DR

I have written a ValueProvider for binding model collections from records in a CSV file. The source code is available on Bitbucket.

Model binding in ASP.NET MVC is responsible for binding values to the models that form the parameter arguments to the controller action matched to the route. But where do model binders get these values from? That’s where Value Providers come in: a ValueProvider provides values for model binding, and there are ValueProviders for providing values from form fields, route data, the query string and posted files, among other sources.

Design Patterns in action

The interface for a ValueProvider is straightforward, comprising only two methods:

// Copyright (c) Microsoft Open Technologies, Inc. All rights reserved. See License.txt in the project root for license information.

namespace System.Web.Mvc
{
    public interface IValueProvider
    {
        bool ContainsPrefix(string prefix);
        ValueProviderResult GetValue(string key);
    }
}

ContainsPrefix determines whether the provider contains a value with the passed prefix and GetValue returns a ValueProviderResult for a given key (or null if the provider cannot provide a result for the key).

A provider is constructed by a… ValueProviderFactory! The factory method pattern is used to construct a provider, with each framework provider constructed by a different factory class deriving from ValueProviderFactory. The MVC framework has a place to register new factories: the static ValueProviderFactories.Factories property, which is an instance of ValueProviderFactoryCollection. This collection has a GetValueProvider method that returns a ValueProviderCollection, a composite provider that implements IValueProvider and internally works with the IValueProvider instances constructed by each registered ValueProviderFactory. That’s a lot of detail, and my reason for describing it is an important consequence of this implementation:

The first instance of ValueProviderResult returned from a GetValue call on an IValueProvider will be the result used in model binding. The order of IValueProvider implementations in ValueProviderFactories.Factories is important if more than one can provide a value for model binding.

We'll see a little later why this is important.
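
To make the "first result wins" behaviour concrete, here is a minimal sketch that runs outside ASP.NET. The types ISimpleValueProvider, DictionaryProvider and CompositeProvider are hypothetical stand-ins invented for illustration; they are not the framework's IValueProvider or ValueProviderCollection, and they return plain strings rather than ValueProviderResult.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for System.Web.Mvc.IValueProvider.
public interface ISimpleValueProvider
{
    bool ContainsPrefix(string prefix);
    string GetValue(string key); // null means "I cannot provide this key"
}

// A provider backed by a case-insensitive dictionary.
public class DictionaryProvider : ISimpleValueProvider
{
    private readonly Dictionary<string, string> _values;

    public DictionaryProvider(IDictionary<string, string> values)
    {
        _values = new Dictionary<string, string>(values, StringComparer.OrdinalIgnoreCase);
    }

    public bool ContainsPrefix(string prefix)
    {
        // Simplified prefix rule: exact match, or prefix followed by '.' or '['.
        return _values.Keys.Any(k =>
            k.Equals(prefix, StringComparison.OrdinalIgnoreCase) ||
            k.StartsWith(prefix + ".", StringComparison.OrdinalIgnoreCase) ||
            k.StartsWith(prefix + "[", StringComparison.OrdinalIgnoreCase));
    }

    public string GetValue(string key)
    {
        string value;
        return _values.TryGetValue(key, out value) ? value : null;
    }
}

// A composite that asks each provider in registration order.
public class CompositeProvider : ISimpleValueProvider
{
    private readonly List<ISimpleValueProvider> _providers;

    public CompositeProvider(params ISimpleValueProvider[] providers)
    {
        _providers = providers.ToList();
    }

    public bool ContainsPrefix(string prefix)
    {
        return _providers.Any(p => p.ContainsPrefix(prefix));
    }

    public string GetValue(string key)
    {
        // The first non-null answer is used; later providers are never consulted.
        return _providers.Select(p => p.GetValue(key)).FirstOrDefault(v => v != null);
    }
}

public static class OrderingDemo
{
    public static void Run()
    {
        var form = new DictionaryProvider(new Dictionary<string, string> { { "id", "1" } });
        var query = new DictionaryProvider(new Dictionary<string, string> { { "id", "2" }, { "page", "3" } });

        Console.WriteLine(new CompositeProvider(form, query).GetValue("id")); // prints "1"
        Console.WriteLine(new CompositeProvider(query, form).GetValue("id")); // prints "2"
    }
}
```

Swapping the order of the two providers changes which value is returned for "id", while "page" is found either way because only one provider knows about it.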

Providing values for model binding from a CSV file

CSV (comma/character separated values) is a file format for passing tabular data around in structured plain text and is regularly used to upload collections of records to web applications. In ASP.NET MVC, the HttpFileCollectionValueProvider provides access to uploaded files by specifying the model property to bind to as HttpPostedFileBase. The problem that I find with this approach, however, is that the responsibility for reading the file, transforming it into something more meaningful (such as a collection of types to work with) and handling validation errors now usually falls to one of two places:

  • the controller action
  • a custom model binder

If your application allows users to upload CSV files for creating multiple records for different purposes, this can be a lot of repetition. If we know that a file contains CSV data, it would be good if these values were provided for model binding; this way, the mechanism could be reused wherever CSV files are used and, more importantly, by being exposed as values for model binding, the values will also be validated against the data annotations on the model. Let’s go ahead and implement a CSV Value Provider.

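using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Web;
using System.Web.Mvc;
using CsvHelper;
using CsvHelper.Configuration;

public class CsvValueProvider : IValueProvider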
{
    public static readonly CsvConfiguration Configuration = new CsvConfiguration { AllowComments = true };

    private readonly Lazy<PrefixContainer> _prefixContainer;
    private readonly NameValueCollection _values = new NameValueCollection();

    public CsvValueProvider(ControllerContext controllerContext)
    {
        _prefixContainer = new Lazy<PrefixContainer>(() => new PrefixContainer(_values.AllKeys), true);

        var files = controllerContext.HttpContext.Request.Files;

        if (files.Count == 0)
        {
            return;
        }

        var mapping = new List<KeyValuePair<string, HttpPostedFileBase>>();
        for (var i = 0; i < files.Count; i++)
        {
            var key = files.AllKeys[i];
            if (key != null)
            {
                HttpPostedFileBase file = PostedFileHelper.ChooseFileOrNull(files[i]);

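                // we only care about csv files; also skip missing uploads so that
                // a null file is never added to the mapping and dereferenced later
                if (file == null || !file.FileName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }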

                mapping.Add(new KeyValuePair<string, HttpPostedFileBase>(key, file));
            }
        }

        var fileGroups = mapping.GroupBy(el => el.Key, el => el.Value, StringComparer.OrdinalIgnoreCase);

        foreach (var fileGroup in fileGroups)
        {
            foreach (var file in fileGroup)
            {
                using (var reader = new CsvReader(new StreamReader(file.InputStream), Configuration))
                {
                    int index = 0;
                    long previousCharPosition = 0;
                    while (reader.Read())
                    {
                        long charPosition = reader.Parser.CharPosition;

                        for (var j = 0; j < reader.CurrentRecord.Length; j++)
                        {
                            var subPropertyName = reader.FieldHeaders[j].Trim();
                            var value = reader.CurrentRecord[j];
                            _values.Add(string.Format("{0}[{1}].{2}", fileGroup.Key, index, subPropertyName), value);
                        }

                        // naive way of determining if the file is *really* a csv file.
                        // the csv parser's character position does not change when it can't read
                        // the file as a csv.
                        if (charPosition == previousCharPosition)
                        {
                            return;
                        }

                        previousCharPosition = charPosition;
                        index++;
                    }
                }
            }
        }
    }

    public bool ContainsPrefix(string prefix)
    {
        return _prefixContainer.Value.ContainsPrefix(prefix);
    }

    public ValueProviderResult GetValue(string key)
    {
        if (key == null)
        {
            throw new ArgumentNullException("key");
        }

        if (!_values.AllKeys.Contains(key, StringComparer.OrdinalIgnoreCase))
        {
            return null;
        }

        var rawValue = _values.GetValues(key);
        var attemptedValue = _values[key];
        return new ValueProviderResult(rawValue, attemptedValue, CultureInfo.InvariantCulture);
    }
}

The provider uses the great CsvHelper NuGet package from Josh Close. It looks at the incoming files on the request and, for any file with the .csv extension, opens it and attempts to read the records out of it. For each field of each record in the CSV, it adds the value to a NameValueCollection under a key made up of the name the file was posted under in the request, an indexer indicating the position of the record, and the column's header name, for example Users[0].FullName. It uses a NameValueCollection so that multiple columns with the same header name can be specified, thereby allowing binding to simple property collections on a model.
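
The key format can be seen in isolation with a small sketch. This is not the provider's code: CsvKeySketch and BuildKeys are hypothetical names, and a naive Split(',') stands in for CsvHelper here (so quoted fields and comments are not handled), purely to show which keys come out of a given file.

```csharp
using System;
using System.Collections.Generic;

public static class CsvKeySketch
{
    // Builds model-binding keys of the form "prefix[rowIndex].Header" from CSV text.
    // Assumption: fields contain no quoted commas, so a plain Split(',') is enough.
    public static List<KeyValuePair<string, string>> BuildKeys(string prefix, string csv)
    {
        var result = new List<KeyValuePair<string, string>>();
        var lines = csv.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0)
        {
            return result;
        }

        // The first line holds the column headers, which map to property names.
        var headers = lines[0].Split(',');

        for (var row = 1; row < lines.Length; row++)
        {
            var fields = lines[row].Split(',');
            for (var col = 0; col < fields.Length && col < headers.Length; col++)
            {
                // Record indexes are zero-based, so the first data row is index 0.
                var key = string.Format("{0}[{1}].{2}", prefix, row - 1, headers[col].Trim());
                result.Add(new KeyValuePair<string, string>(key, fields[col]));
            }
        }

        return result;
    }

    public static void Demo()
    {
        var csv = "FullName,Roles,Roles\nAda Lovelace,Admin,Editor";
        foreach (var pair in BuildKeys("Users", csv))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

With the sample above, the repeated Roles header produces two entries under the same key, Users[0].Roles, which is the situation the NameValueCollection in the provider is there to accommodate.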

In order to create the provider, a factory is required:

public class CsvValueProviderFactory : ValueProviderFactory
{
    public static void AddToValueProviderFactoryCollection()
    {
        AddToValueProviderFactoryCollection(ValueProviderFactories.Factories);
    }

    public static void AddToValueProviderFactoryCollection(ValueProviderFactoryCollection collection)
    {
        var postedFileValueProviderFactory =
            collection.SingleOrDefault(x => x is System.Web.Mvc.HttpFileCollectionValueProviderFactory);

        if (postedFileValueProviderFactory != null)
        {
            var index = collection.IndexOf(postedFileValueProviderFactory);
            collection.Insert(index, new CustomHttpFileCollectionValueProviderFactory());
            collection.Remove(postedFileValueProviderFactory);
        }

        collection.Add(new CsvValueProviderFactory());
    }

    public override IValueProvider GetValueProvider(ControllerContext controllerContext)
    {
        return new CsvValueProvider(controllerContext);
    }
}

The factory is fairly straightforward, with the addition of a couple of static methods that can be used to add the provider to the factories collection.
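
Hooking the factory up is then a single call at application start-up. The snippet below is a sketch that assumes the conventional MvcApplication class in Global.asax.cs; the class name and the rest of your start-up code will depend on your project.

```csharp
using System.Web;
using System.Web.Mvc;

public class MvcApplication : HttpApplication
{
    protected void Application_Start()
    {
        // Replaces the built-in posted-file factory with the custom one
        // and appends the CSV factory to ValueProviderFactories.Factories.
        CsvValueProviderFactory.AddToValueProviderFactoryCollection();
    }
}
```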

Fightin’ the Framework

The static methods on CsvValueProviderFactory are there not only to add the factory to the factories collection, but also to remove the System.Web.Mvc.HttpFileCollectionValueProviderFactory and replace it with our own custom factory for a value provider that gets values from posted files. Why do we need to do this, I hear you ask? Well, let’s go back to the point raised earlier.

The first instance of ValueProviderResult returned from a GetValue call on an IValueProvider will be the result used in model binding. The order of IValueProvider implementations in ValueProviderFactories.Factories is important if more than one can provide a value for model binding.

Let’s imagine we have a view model with a collection of users that we wish to bind, which looks like so:

using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class UsersModel
{
    public UsersModel()
    {
        Users = new List<User>();
    }

    public IList<User> Users { get; set; } 
}

public class User
{
    public User()
    {
        Roles = new string[0];
    }

    [Required]
    public string FullName { get; set; }

    [Required]
    public string Username { get; set; }

    [Required]
    [EmailAddress]
    public string Email { get; set; }

    [Required]
    public int? Age { get; set; }

    public string[] Roles { get; set; }
}

Now, we have a controller where users can submit a CSV file:

public class HomeController : Controller
{
    [HttpGet]
    public ActionResult Index()
    {
        return View(new UsersModel());
    }

    [HttpPost]
    public ActionResult Index(UsersModel model)
    {
        return View(model);
    }
}

and a corresponding view with the following form:

@using (Html.BeginForm("Index", "Home", FormMethod.Post, new { enctype = "multipart/form-data" }))
{
    @Html.TextBoxFor(m => m.Users, new { type = "file" })
    <button type="submit">Submit</button>
} 

Here’s what happens when we have the CsvValueProviderFactory hooked up in addition to System.Web.Mvc.HttpFileCollectionValueProviderFactory:

  1. A user submits a file for the Users property on UsersModel
  2. The HttpFileCollectionValueProvider finds the file in the request and puts it into a dictionary against the string key “Users”
  3. The CsvValueProvider finds the file in the request, determines it’s a csv file and reads the records out
  4. For each record, the provider creates keys named “Users[index].FieldHeader”, where Users is the name the file was posted under, index is the record row index and FieldHeader is a column in the csv file that corresponds to a property on the User type
  5. When model binding is taking place, the default model binder asks the value providers if they contain the prefix “model” first (this is the parameter name on the action; none of the providers have a value for this) and then asks if they contain the prefix “Users” (which both HttpFileCollectionValueProvider and CsvValueProvider will return true for)
  6. Binding calls GetValue on each of the ValueProviders, passing in the key “Users”
  7. The HttpFileCollectionValueProvider returns an instance of HttpPostedFileBase for “Users” that will fail to be bound to the IList<User> type for the property Users on the UsersModel
  8. Binding will not ask the ValueProviders for any keys further down the property chain for “Users”, and so the values from CsvValueProvider will not be used in model binding

On the basis of this knowledge, all that the CustomHttpFileCollectionValueProvider constructed by CustomHttpFileCollectionValueProviderFactory does is ignore csv files, so that they cannot be provided as posted-file values for model binding and do not interfere with the function of CsvValueProvider.

Where did the errors happen?

I’ve put up the source code and a demo web project on Bitbucket for those interested. What remains is to provide more meaningful error messages back to the user for any rows in the CSV file that fail model validation; at the moment, we can do something simple in the view like:

for (int i = 0; i < Model.Users.Count; i++)
{
    if (!ViewContext.ViewData.ModelState.IsValidField("Users[" + i + "]"))
    {
        <p class="field-validation-error">Invalid values in row @i</p>
        @Html.ValidationMessageFor(m => m.Users[i].Age)
        @Html.ValidationMessageFor(m => m.Users[i].Email)
        @Html.ValidationMessageFor(m => m.Users[i].FullName)
        @Html.ValidationMessageFor(m => m.Users[i].Username)
        @Html.ValidationMessageFor(m => m.Users[i].Roles)        
    }
}

which will tell the user which row was invalid and write out the corresponding error messages. Note that the row number shown is the zero-based index of the record. I’ll look to make something more generic that can be reused for other models and update the source on Bitbucket.