How can a struct with a single object field be faster than a raw object?

I have a struct that holds a single object field to make working with the object easier. I wanted to test the performance (I expected some degradation), but the results are very surprising. The struct version is actually faster:

Without box: 8.08 s

With box: 7.76 s

How is this possible?

Below is the complete test code to reproduce the results.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication68
{
    partial class Program
    {
        private const int Iterations = 100000000;

        static void Main(string[] args)
        {
            // Force JIT compilation.

            TimeWithoutBox(new MyObject());
            TimeWithoutBox(7);
            TimeBox(new MyObject());
            TimeBox(7);

            // The tests.

            var withoutBox = new TimeSpan();
            var box = new TimeSpan();

            for (int i = 0; i < 10; i++)
            {
                withoutBox += TimeWithoutBox(new MyObject());
                withoutBox += TimeWithoutBox(7);
                box += TimeBox(new MyObject());
                box += TimeBox(7);
            }

            Console.WriteLine("Without box: " + withoutBox);
            Console.WriteLine("With box: " + box);

            Console.ReadLine();
        }

        private static TimeSpan TimeBox(object value)
        {
            var box = new MyBox(value);

            var stopwatch = Stopwatch.StartNew();

            for (int i = 0; i < Iterations; i++)
            {
                TestBox(box);
            }

            return stopwatch.Elapsed;
        }

        private static TimeSpan TimeWithoutBox(object value)
        {
            var stopwatch = Stopwatch.StartNew();

            for (int i = 0; i < Iterations; i++)
            {
                TestWithoutBox(value);
            }

            return stopwatch.Elapsed;
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static void TestBox(MyBox box)
        {
            if (box.IsDouble)
                TakeDouble((double)box.Value);
            else if (box.IsObject)
                TakeObject((MyObject)box.Value);
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static void TestWithoutBox(object box)
        {
            if (box.GetType() == typeof(double))
                TakeDouble((double)box);
            else if (box.GetType() == typeof(MyObject))
                TakeObject((MyObject)box);
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static void TakeDouble(double value)
        {
            // Empty method to force consuming the cast.
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static void TakeObject(MyObject value)
        {
            // Empty method to force consuming the cast.
        }
    }

    struct MyBox
    {
        private readonly object _value;

        public object Value
        {
            get { return _value; }
        }

        public MyBox(object value)
        {
            _value = value;
        }

        public bool IsDouble
        {
            get { return _value.GetType() == typeof(double); }
        }

        public bool IsObject
        {
            get { return _value.GetType() == typeof(MyObject); }
        }
    }

    class MyObject
    {
    }
}

EDIT:

I've changed the IsDouble and IsObject tests to use the same statements as the other test. After re-running the application, the resulting times are exactly the same.

EDIT2:

This code was tested using a Release build compiled as 32-bit, without the debugger attached, on .NET 4.5 and Visual Studio 2012. Compiling it as 64-bit gives drastically different results; on my machine:

Without box: 8.23 s

With box: 16.99 s


I copied the exact code and ran it as a Release build without the debugger attached (both are important!) on x64. Results:

Without box: 00:00:07.9650541
With box: 00:00:16.0958162
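
Since the bitness and the debugger both change the outcome, it can help to print the run conditions next to the timings. Here is a minimal sketch, assuming a hypothetical helper class `RunConditions` that is not part of the original test. It only reports the process bitness and whether a debugger is attached; it cannot tell a Debug build from a Release build.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original benchmark.
static class RunConditions
{
    // Reports the two runtime conditions that influence the timings:
    // process bitness (x86 vs x64 JIT) and debugger attachment.
    public static string Describe()
    {
        string bitness = Environment.Is64BitProcess ? "x64" : "x86";
        string debugger = Debugger.IsAttached ? "attached" : "not attached";
        return "Process: " + bitness + ", debugger: " + debugger;
    }
}
```

Calling `Console.WriteLine(RunConditions.Describe());` at the top of `Main` is enough to record which configuration produced a given set of numbers.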

Changing the test to:

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void TestBox(MyBox box)
    {
        if (box.Value.GetType() == typeof(double))
            TakeDouble((double)box.Value);
        else if (box.Value.GetType() == typeof(MyObject))
            TakeObject((MyObject)box.Value);
    }

makes the run times almost equal:

Without box: 00:00:07.9488281
With box: 00:00:08.6084029

Why? Because the JIT decides not to inline IsDouble, and inlining it manually helps. That is strange, because it is such a small function. The call at line 13 of the disassembly listing is this call.

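
An alternative to inlining by hand is to ask the JIT to inline the accessors. The sketch below applies `MethodImplOptions.AggressiveInlining` (available since .NET 4.5) to the getters of `MyBox`. This is only a hint, and whether the x64 JIT honors it for this struct is an assumption that was not measured here.

```csharp
using System;
using System.Runtime.CompilerServices;

class MyObject
{
}

struct MyBox
{
    private readonly object _value;

    public MyBox(object value)
    {
        _value = value;
    }

    public object Value
    {
        // Inlining hint only; the JIT may still emit a call.
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return _value; }
    }

    public bool IsDouble
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return _value.GetType() == typeof(double); }
    }

    public bool IsObject
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return _value.GetType() == typeof(MyObject); }
    }
}
```

The behavior of the struct is unchanged, so the original `TestBox` can stay as it is. Only the timings and the disassembly will show whether the hint took effect.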

Now, why is there still some performance difference? The .NET JIT is not the best compiler out there; some of the generated instructions are probably slightly different. You can find out by comparing the disassembly of the two versions. I won't have time for that, because I expect the difference to be quite uninteresting.

I'd expect a C compiler to get this right. The struct should behave like the single object member that it contains, and small methods should be inlined. This is definitely doable with today's compiler technology. Let's hope that the next-generation JIT and NGEN can do this. A new JIT (RyuJIT) is being developed at the moment, and optimizations are being moved from the VC backend to NGEN (this was recently announced).
