Casting to generic interfaces that use covariance or contravariance is two orders of magnitude slower than normal casts in C#. Because IEnumerable<T> is covariant, casting to it is two orders of magnitude slower too. I did not expect this result.

In this post, I investigate the cost of casting to implementations, interfaces, generic interfaces, covariant interfaces, and contravariant interfaces. I delve into the IL code to see if it holds any answers. Finally, I demonstrate that these results are not merely theoretical and that they also apply to IEnumerable<T>.

Background

My previous post on micro-benchmarking the three ways to cast safely made Mike "curious about the cost of casting a result from a dictionary that stores objects in a different type than what is required". He went on to do his own benchmarks and found that "casting is EXPENSIVE!". Mike's results show that accessing a value from a dictionary takes 21ns, casting it takes 63ns, and doing both takes 86ns.

Mike's results made me curious: is casting really that much more expensive than a dictionary lookup? To investigate, I repeated Mike's experiments and obtained similar results. This was very surprising: in my experience with optimizing tight loops, I've often seen dictionary lookups dominate the cost, but I've never seen cast operators dominate it.

I proceeded to reimplement Mike's code and found that casting was now a negligible part of the cost. What was the relevant difference between Mike's code and my code? The answer is contravariance. Mike's code was casting to an interface of type IInterface<in T>, while I was casting to an interface of type IInterface<T>.
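To make the distinction concrete, here is a minimal sketch of the two kinds of declaration. Mike's actual types are not reproduced in this post, so IInvariant<T>, IContravariantHandler<in T>, ObjectHandler, and VarianceCastSketch are hypothetical names. The declarations differ only in the `in` modifier, but that modifier changes which casts the runtime has to accept:

```csharp
using System;

// Invariant: only the exact instantiation a type implements is a valid cast target.
public interface IInvariant<T> { }

// Contravariant: the 'in' modifier also lets an instantiation over a more general
// reference type stand in for one over a more specific reference type.
public interface IContravariantHandler<in T> { }

// Hypothetical implementation, for illustration only.
public class ObjectHandler : IInvariant<object>, IContravariantHandler<object> { }

public static class VarianceCastSketch
{
    // Returns true if the explicit cast succeeds, false if it throws.
    public static bool CanCastToInvariantOfString(object candidate)
    {
        try { return (IInvariant<string>)candidate != null; }
        catch (InvalidCastException) { return false; }
    }

    // Same check, but against the contravariant interface.
    public static bool CanCastToContravariantOfString(object candidate)
    {
        try { return (IContravariantHandler<string>)candidate != null; }
        catch (InvalidCastException) { return false; }
    }
}
```

With an ObjectHandler instance, the invariant cast fails because ObjectHandler implements IInvariant<object>, not IInvariant<string>, while the contravariant cast succeeds. This only illustrates the language-level difference; it does not by itself explain the timings measured below.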

Cost of Casting

To investigate the cost of casting, I used BenchmarkDotNet to micro-benchmark the cost of casting to the implementation, an interface, a generic interface, a covariant interface, and a contravariant interface.

Here is the code I used:

using BenchmarkDotNet.Attributes;

// The class name is illustrative; only its members matter for the results.
public class CastBenchmarks
{
    // Stored as object so that each benchmark has to cast at run time.
    private readonly object value = new Implementation();

    [Benchmark]
    public object ObjectCast() => (object)value;

    [Benchmark(Baseline = true)]
    public Implementation ImplementationCast() => (Implementation)value;

    [Benchmark]
    public IInterface InterfaceCast() => (IInterface)value;

    [Benchmark]
    public IGeneric<int> GenericCast() => (IGeneric<int>)value;

    [Benchmark]
    public ICovariant<int> CovariantCast() => (ICovariant<int>)value;

    [Benchmark]
    public IContravariant<int> ContravariantCast() => (IContravariant<int>)value;
}

public class Implementation : IInterface, IGeneric<int>, ICovariant<int>, IContravariant<int> {}
public interface IInterface {}
public interface IGeneric<T> {}
public interface ICovariant<out T> {}
public interface IContravariant<in T> {}

I ran all the benchmarks in this post on both 64-bit with RyuJIT and 32-bit with LegacyJIT. While the absolute results differed, the relative performance was very similar. Therefore, I'll just present the results on 64-bit with RyuJIT:

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7 CPU 970 3.20GHz, ProcessorCount=12
Frequency=3128907 Hz, Resolution=319.6004 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0

             Method |        Mean |    StdErr |    StdDev | Scaled | Scaled-StdDev |
------------------- |------------ |---------- |---------- |------- |-------------- |
         ObjectCast |   0.0001 ns | 0.0001 ns | 0.0003 ns |   0.00 |          0.00 |
 ImplementationCast |   0.6011 ns | 0.0005 ns | 0.0018 ns |   1.00 |          0.00 |
      InterfaceCast |   2.6979 ns | 0.0003 ns | 0.0011 ns |   4.49 |          0.01 |
        GenericCast |   3.5961 ns | 0.0005 ns | 0.0018 ns |   5.98 |          0.02 |
      CovariantCast | 120.3516 ns | 0.0063 ns | 0.0242 ns | 200.21 |          0.59 |
  ContravariantCast | 139.3340 ns | 0.0188 ns | 0.0702 ns | 231.79 |          0.69 |

These results show that the cost of casting to the implementation is tiny, the cost of casting to an interface is higher, and the cost of casting to a generic interface is higher again. These results are as you would expect.

What is shocking is the cost of casting to a covariant or contravariant interface. At roughly 200 and 232 times the cost of casting to the implementation, these casts are more than two orders of magnitude slower.

IL Code for Casting

At the IL level there are three distinct situations for the six benchmarked methods.

ObjectCast doesn't involve any casting at all, as the value is already of the required type, so no cast instruction appears in the IL code.

In ImplementationCast and InterfaceCast, the target type does not involve generics, so the cast appears as castclass in the IL code.

In GenericCast, CovariantCast, and ContravariantCast, the target type involves generics, so the cast appears as castclass class in the IL code.

Unfortunately, the IL code holds no answers for why casts involving covariant or contravariant interfaces are so slow. The answer probably lies at the JIT level.
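The three situations can be summarised as a sketch. The comments show only the cast instruction, abbreviated in the way a disassembler such as ildasm typically displays it; the exact text depends on the tool and on namespaces, and the class name IlSketch is hypothetical. The point is that the covariant cast looks no different from the plain generic cast at this level:

```csharp
public interface IInterface { }
public interface IGeneric<T> { }
public interface ICovariant<out T> { }
public class Implementation : IInterface, IGeneric<int>, ICovariant<int> { }

public class IlSketch
{
    private readonly object value = new Implementation();

    // IL: no cast instruction; the field is loaded and returned as is.
    public object ObjectCast() => (object)value;

    // IL: castclass Implementation
    public Implementation ImplementationCast() => (Implementation)value;

    // IL: castclass IInterface
    public IInterface InterfaceCast() => (IInterface)value;

    // IL: castclass class IGeneric`1<int32>
    public IGeneric<int> GenericCast() => (IGeneric<int>)value;

    // IL: castclass class ICovariant`1<int32>
    public ICovariant<int> CovariantCast() => (ICovariant<int>)value;
}
```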

Real World: Casting to IEnumerable<T> is Slow

This doesn't just apply to covariant and contravariant interfaces that you define. It also applies to those defined by libraries and the .NET framework. For example, IEnumerable<T> is covariant, and therefore casting to IEnumerable<T> is slow.

Here is some code that benchmarks casting to IEnumerable<int>:

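using System.Collections;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;

// The class name is illustrative; only its members matter for the results.
public class EnumerableCastBenchmarks
{
    // Stored as object so that each benchmark has to cast at run time.
    private readonly object value = new List<int>();

    [Benchmark]
    public object ObjectCast() => (object)value;

    [Benchmark(Baseline = true)]
    public List<int> GenericListCast() => (List<int>)value;

    [Benchmark]
    public IList ListInterfaceCast() => (IList)value;

    [Benchmark]
    public IEnumerable<int> IEnumerableCast() => (IEnumerable<int>)value;
}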

And here are the results:

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7 CPU 970 3.20GHz, ProcessorCount=12
Frequency=3128907 Hz, Resolution=319.6004 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0

            Method |        Mean |    StdErr |    StdDev | Scaled | Scaled-StdDev |
------------------ |------------ |---------- |---------- |------- |-------------- |
        ObjectCast |   0.0001 ns | 0.0000 ns | 0.0002 ns |   0.00 |          0.00 |
   GenericListCast |   0.8998 ns | 0.0003 ns | 0.0010 ns |   1.00 |          0.00 |
 ListInterfaceCast |   6.8934 ns | 0.0003 ns | 0.0012 ns |   7.66 |          0.01 |
   IEnumerableCast | 120.0963 ns | 0.0184 ns | 0.0713 ns | 133.46 |          0.16 |

These results show that just like casting to a covariant or contravariant interface, the cost of casting to IEnumerable<T> is more than two orders of magnitude higher than casting to the implementation.

Practical Implications

In typical real-world code, you're unlikely to encounter this at all. Normally, you have an implementation of IEnumerable<T> and you need to call a method that requires IEnumerable<T> or you need to return an IEnumerable<T>. In both cases, there is no need to cast at all, and therefore, no cost.

In the odd case where you really do need to cast to IEnumerable<T>, the cost is not particularly significant. At roughly 120 ns per cast, you can still cast to IEnumerable<T> about eight million times per second.

The one case you should watch out for is repeated casting in a tight loop. When that happens, look out for casts involving covariant or contravariant interfaces, for example when looking up values in a Dictionary<Type, object> and casting them to IEnumerable<T>. In Mike's results, the cast (63ns) costs approximately three times as much as the dictionary lookup (21ns).
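When the dictionary key does not change inside the loop, one way to avoid paying for the cast on every iteration is to do the lookup and the cast once, before the loop. The following is a minimal sketch under that assumption, not code from Mike's benchmark; the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class TightLoopCasts
{
    // One dictionary lookup and one covariant cast on every iteration.
    public static long SumCastingEveryIteration(Dictionary<Type, object> store, int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
        {
            var items = (IEnumerable<int>)store[typeof(int)];
            foreach (int item in items) total += item;
        }
        return total;
    }

    // The lookup and the cast are hoisted out of the loop, so each happens once.
    public static long SumCastingOnce(Dictionary<Type, object> store, int iterations)
    {
        var items = (IEnumerable<int>)store[typeof(int)];
        long total = 0;
        for (int i = 0; i < iterations; i++)
        {
            foreach (int item in items) total += item;
        }
        return total;
    }
}
```

Both methods return the same total; only the number of casts differs. If the concrete type is known, casting to it instead (for example List<int>) is another option, since the benchmark above measured that cast at under 1 ns.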

Conclusion

The cost of casting to generic interfaces that use covariance or contravariance is two orders of magnitude higher than normal casts in C#. This also affects library and framework types like IEnumerable<T>.

While unlikely to impact your code, there are situations where it can become a bottleneck. So be wary of casting to covariant and contravariant interfaces in tight loops.

There is nothing in the IL code to indicate why casting to covariant and contravariant interfaces is so much slower. If you know why, please share in the comments.

Addendum - Implicit Casting is Free

Update (14th April 2017): Mike has experimented further and found that implicit casting has the same performance as not casting. That's because implicit casting doesn't involve a runtime cast at all. If you check the IL code that corresponds to Mike's code, you'll find that neither Direct nor Implicit contains a castclass instruction, whereas Explicit does.

This means that if you call a method that expects an ICovariant<object> with a more specific ICovariant<string>, there is no need to cast and therefore, no cost. Similarly, if you call a method that expects an IContravariant<string> with a more general IContravariant<object>, there is no need to cast and therefore, no cost. This is further evidence that you're unlikely to encounter the cost of casting covariant and contravariant interfaces in real-world code.
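The two cases can be sketched as follows. ICovariant<out T> and IContravariant<in T> are declared as in the benchmark; StringSource, ObjectSink, and the methods are hypothetical names introduced for illustration. Neither call site contains a cast expression, because the compiler accepts the variance conversions implicitly:

```csharp
public interface ICovariant<out T> { }
public interface IContravariant<in T> { }

// Hypothetical implementations, for illustration only.
public class StringSource : ICovariant<string> { }
public class ObjectSink : IContravariant<object> { }

public static class ImplicitVarianceSketch
{
    // Expects the more general covariant instantiation.
    public static bool AcceptCovariant(ICovariant<object> source) => source != null;

    // Expects the more specific contravariant instantiation.
    public static bool AcceptContravariant(IContravariant<string> sink) => sink != null;

    public static bool Demo()
    {
        ICovariant<string> source = new StringSource();
        IContravariant<object> sink = new ObjectSink();

        // No cast expression is needed at either call site.
        return AcceptCovariant(source) && AcceptContravariant(sink);
    }
}
```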