Friday, 10 August 2012

C#, F# Algorithm: Faster than QuickSort?

Status: Completed
Competition start date: 2012.08.10
Competition end date: 2012.08.24
Download associated files (Version 1.5 with results)

Please make sure to read the Welcome page for general competition guidelines.
If you would like to participate in this competition please register an account on vWorker.com

Update:

2012.08.10: Each competitor is allowed 5 entries.
2012.08.11: Notes on SD at the end of the post.
2012.08.12: More notes on SD at the end of the post.
2012.08.17: New ranking method explained at end of post.
2012.08.30: Tentative Leader Board posted. Follow the discussion on Google Groups.
2012.08.30: Updated download files with code for each participant
2012.10.03: Winner is "Zaher". Congratulations! The source code is now available on GitHub.

Final Leader Board:

1 - Zaher, AvgRatio: 18.167, SD: 0.804, WorstCase: 16.136
2 - Renze, AvgRatio: 16.415, SD: 0.036, WorstCase: 16.343
3 - V_Tom_R, AvgRatio: 14.073, SD: 0.114, WorstCase: 13.919
4 - SoftwareBender, AvgRatio: 12.829, SD: 0.058, WorstCase: 12.739
5 - c6c, AvgRatio: 12.526, SD: 0.069, WorstCase: 12.380
6 - ErwinReid, AvgRatio: 11.452, SD: 0.036, WorstCase: 11.372
7 - Mihai, AvgRatio: 10.425, SD: 3.642, WorstCase: 5.725
8 - MoustafaS, AvgRatio: 10.024, SD: 0.186, WorstCase: 9.761
9 - aus1, AvgRatio: 7.972, SD: 0.186, WorstCase: 7.489
10 - bawr, AvgRatio: 5.317, SD: 0.240, WorstCase: 4.835

There is quite a difference between the tentative leader board and the final leader board due to the way the various submissions were run. Thanks go to Renze for pointing this out and providing the code to fix it.

Final tentative Leader Board:

1. bawr - AvgRatio: 23.303, SD: 0.062, WorstCase: 23.201
2. Zaher - AvgRatio: 17.245, SD: 0.556, WorstCase: 16.822
3. c6c - AvgRatio: 11.495, SD: 0.413, WorstCase: 10.428
4. SoftwareBender - AvgRatio: 10.259, SD: 0.033, WorstCase: 10.204
5. ErwinReid - AvgRatio: 9.879, SD: 0.052, WorstCase: 9.785
6. V_Tom_R - AvgRatio: 9.713, SD: 0.022, WorstCase: 9.666
7. Mihai - AvgRatio: 9.656, SD: 0.097, WorstCase: 9.425
8. Renze - AvgRatio: 9.219, SD: 0.303, WorstCase: 8.438
9. MoustafaS - AvgRatio: 9.069, SD: 0.368, WorstCase: 8.362
10. aus1 - AvgRatio: 5.025, SD: 0.025, WorstCase: 4.961

New Leader Board:

1. Zaher Al-Sibai - ratio: 13.846 (SD: 0.075) (Worst case: 13.742) entry: 1
2. Renze - ratio: 11.870 (SD: 0.081) (Worst case: 11.768) entry: 5
3. bawr - ratio: 10.213 (SD: 0.070) (Worst case: 10.088) entry: 1
4. Software_bender - ratio: 9.738 (SD: 0.068) (Worst case: 9.601) entry: 2
5. Renze - ratio: 9.129 (SD: 0.074) (Worst case: 8.994) entry: 4
6. MoustafaS - ratio: 9.107 (SD: 0.096) (Worst case: 8.996) entry: 4
7. Erwin Reid - ratio: 7.957 (SD: 0.113) (Worst case: 7.787) entry: 2
8. aus1 - ratio: 4.956 (SD: 0.085) (Worst case: 4.765) entry: 4
9. V_Tom_R - ratio: 3.643 (SD: 0.062) (Worst case: 3.498) entry: 2

Old Leader Board:

1. Software_bender - ratio: 7.165 (SD: 0.130) (Worst case: 7.008) entry: 2
2. MoustafaS - ratio: 7.085 (SD: 0.113) (Worst case: 6.950) entry: 3
3. Renze - ratio: 6.548 (SD: 0.063) (Worst case: 6.435) entry: 3
4. Renze - ratio: 5.907 (SD: 0.156) (Worst case: 5.731) entry: 2
5. aus1 - ratio: 4.100 (SD: 0.145) (Worst case: 3.891) entry: 4
6. MoustafaS - ratio: 3.833 (SD: 0.094) (Worst case: 3.744) entry: 2
7. V_Tom_R - ratio: 3.392 (SD: 0.108) (Worst case: 3.203) entry: 2
8. V_Tom_R - ratio: 3.271 (SD: 0.129) (Worst case: 3.164) entry: 1
9. Software_bender - ratio: 2.754 (SD: 0.121) (Worst case: 2.658) entry: 1
10. aus1 - ratio: 2.537 (SD: 0.080) entry: 3
11. aus1 - ratio: 2.475 (SD: 0.076) entry: 2
12. MoustafaS - ratio: 2.396 (SD: 0.086)
13. Renze - ratio: 2.340 (SD: 0.074)
14. aus1 - ratio: 2.04 (SD: 0.11) entry: 1

Introduction:

The purpose of this competition is to extract the start and end times from a pre-sorted list of DateRanges and return them as a sorted DateTime list.

A DateRange can be thought of as a period of time with a well-defined start and end time. (Please have a look at the Orcomp.Entities.DateRange class to get a better understanding of how it works.)

The DateRange class implements the IComparable Interface, such that the DateRanges with earliest start times come first.

If two DateRanges have the same start time, then the DateRange with the shortest duration comes first.
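
As a rough sketch, that ordering could be implemented like this. (This is not the actual Orcomp.Entities.DateRange source; StartTime and EndTime are assumed property names.)

using System;

public class DateRange : IComparable<DateRange>
{
    public DateTime StartTime { get; set; }
    public DateTime EndTime { get; set; }

    public int CompareTo(DateRange other)
    {
        // DateRanges with the earliest start times come first.
        int result = StartTime.CompareTo(other.StartTime);

        // If two DateRanges share a start time, the shortest duration comes first.
        if (result == 0)
        {
            result = (EndTime - StartTime).CompareTo(other.EndTime - other.StartTime);
        }

        return result;
    }
}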

A pictorial representation of a sorted list of DateRanges may look like this (each line represents a DateRange's start and end time):

+--------------------------------------+
          +--------+
          +-------------+
          +---------------------+
                +--+
                    +---------------------------------------------------------+
                               +------+

The simplest way to get a sorted list of DateTimes from a list of DateRanges is to use the following code:
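
A minimal sketch of that code (assuming DateRange exposes StartTime and EndTime properties; the exact member names may differ):

using System;
using System.Collections.Generic;

public static class DateRangeCollectionExtensions
{
    public static List<DateTime> GetSortedDateTimes(this IEnumerable<DateRange> dateRanges)
    {
        var dateTimes = new List<DateTime>();

        // Collect every start and end time...
        foreach (var dateRange in dateRanges)
        {
            dateTimes.Add(dateRange.StartTime);
            dateTimes.Add(dateRange.EndTime);
        }

        // ...and simply sort the lot.
        dateTimes.Sort();

        return dateTimes;
    }
}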


In the example shown above, the Sort() method is used to order the DateTime values. Internally, Sort() uses a QuickSort implementation.

This approach does not take advantage of the fact that we already have a list of pre-ordered DateRanges.

Therefore the aim of this competition is to implement the GetSortedDateTimes() method in the Orcomp.Extensions.DateRangeCollectionExtensions.cs file, taking this fact into account.
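
To make the idea concrete, here is a sketch of one possible direction (not any particular entry, and StartTime/EndTime are assumed property names): because the input is sorted by start time, the start times are already in order, so only the end times still need sorting, followed by a linear merge of the two sorted sequences.

using System;
using System.Collections.Generic;

public static class MergeBasedExtensions
{
    public static List<DateTime> GetSortedDateTimes(this IList<DateRange> dateRanges)
    {
        int n = dateRanges.Count;
        var starts = new DateTime[n];   // already sorted, thanks to the DateRange ordering
        var ends = new DateTime[n];

        for (int i = 0; i < n; i++)
        {
            starts[i] = dateRanges[i].StartTime;
            ends[i] = dateRanges[i].EndTime;
        }

        Array.Sort(ends);               // only half the values still need sorting

        // Standard linear merge of two sorted arrays.
        var result = new List<DateTime>(2 * n);
        int s = 0, e = 0;
        while (s < n && e < n)
        {
            result.Add(starts[s] <= ends[e] ? starts[s++] : ends[e++]);
        }
        while (s < n) result.Add(starts[s++]);
        while (e < n) result.Add(ends[e++]);

        return result;
    }
}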

The implementation must also pass the unit test defined in the Orcomp.Tests project defined in the DateRangeCollectionExtensionsTest.cs file.

Finally, once you have a working implementation, you can run the Orcomp.Benchmarks project to find out whether your code runs faster or slower than the standard implementation.

Ratios greater than one indicate your code runs faster than the standard implementation.

Rules:

- There are none. As long as your code passes the unit test, it is simply a matter of coming up with the fastest implementation.
- Your code will be benchmarked against the standard algorithm defined in the Orcomp.Benchmarks project.
- The fastest code wins.

Notes:

- A ranking of submitted solutions will be kept up to date at the top of this post. If you want to stay anonymous, please let me know when you submit your code.
- Please only submit the DateRangeCollectionExtensions.cs file.
- As a guide, the standard deviation after running the benchmark should be less than 0.15.
- If you are using F#, please send me an F# file containing the implementation.

Notes on SD:

- The Standard Deviation (SD) is simply a number indicating the consistency, or reliability, of your benchmark run. If the time results for all 20 iterations of the benchmark are very similar, the SD will approach 0. If, on the other hand, the time results vary a lot, the SD will approach 1. (A sketch of the calculation follows this list.)
- The SD is influenced by other processes running on your computer while you run the benchmark. You will get a different SD value every time you run it, so just run the benchmark as many times as needed to obtain an average with an SD below 0.15. (The lower the SD, the better.)
- The average ratio value that people are getting on their computers differs from what I am getting on mine. I am not sure why, but please do not be alarmed by this. Towards the end of the competition I will compare the top entries against each other.
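
For reference, a consistency figure like the reported SD can be computed from the per-iteration ratios along these lines (a sketch of the usual population formula; the actual Orcomp.Benchmarks code may differ):

using System;
using System.Collections.Generic;
using System.Linq;

static class BenchmarkStatistics
{
    // Population standard deviation of the per-iteration ratios:
    // close to 0 when the 20 iterations agree, larger when they scatter.
    public static double StandardDeviation(IList<double> ratios)
    {
        double mean = ratios.Average();
        double sumOfSquares = ratios.Sum(r => (r - mean) * (r - mean));
        return Math.Sqrt(sumOfSquares / ratios.Count);
    }
}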

More notes on SD:

- Even though forcing garbage collection is generally not recommended, doing so before each stopwatch.Reset() helps to obtain more consistent results (see the sketch after this list).
- As the submitted code gets faster and faster, it becomes harder to keep the SD below 0.15, so the worst case ratio is also included in the ranking to give an appreciation of the spread between the times of the various runs.
- I have also noticed that the ratios fluctuate most at the beginning of a benchmark and become much more reliable towards the end, so I might change the calculation to include only the last 10 loops of the benchmark.
- Another approach may be to forget the SD and accept a benchmark run only when:
worst case ratio > average ratio * 0.95
best case ratio < average ratio * 1.05
(i.e. we have a 5% window). This falls short of a full-blown confidence or tolerance interval analysis, but is probably good enough for what we are trying to achieve.
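
The garbage collection step mentioned above, as a sketch (the GC and Stopwatch calls are standard .NET APIs; the surrounding benchmark structure is assumed):

using System;
using System.Diagnostics;

static class BenchmarkTimer
{
    // Force a full collect before timing so the heap starts in roughly
    // the same state for every iteration, which keeps the SD down.
    public static TimeSpan TimeRun(Action runImplementation)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        var stopwatch = Stopwatch.StartNew();
        runImplementation();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}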

New ranking method:

The benchmark for 1,000,000 date ranges is run 20 times.
The QuickSort implementation times are recorded for each loop and the last 10 results are selected. Outliers are removed and the average is calculated.
The competitor's implementation times are also recorded for each loop and the last 10 results are selected. Outliers are removed and the ratio (using the average value for the QuickSort) is calculated.
The benchmark is run with the debugger detached (Ctrl + F5).
The benchmark data has NOT changed.
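
In code, the calculation could look roughly like this (the outlier rule of dropping the minimum and maximum follows the filtering mentioned in the comments below; the actual benchmark code may differ):

using System.Collections.Generic;
using System.Linq;

static class Ranking
{
    // Keep the last 10 of the 20 loop times, drop the outliers
    // (min and max), and average what remains.
    static double AverageOfStableRuns(List<double> times)
    {
        var lastTen = times.Skip(times.Count - 10).ToList();
        lastTen.Remove(lastTen.Max());
        lastTen.Remove(lastTen.Min());
        return lastTen.Average();
    }

    // Ratio of the QuickSort reference average to the competitor's
    // average: values above 1 mean the competitor is faster.
    public static double Ratio(List<double> quickSortTimes, List<double> competitorTimes)
    {
        return AverageOfStableRuns(quickSortTimes) / AverageOfStableRuns(competitorTimes);
    }
}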



12 comments:

  1. This comment has been removed by the author.

  2. This comment has been removed by the author.

  3. Hello,

    .NET introduces two factors besides other processes: JIT compilation and garbage collection.

    JIT compilation will probably not influence more than one run, which will be discarded by filtering out the min and max ratio (as it is done in the benchmark). Probably, because I am not sure whether .NET will rejit parts of the program during runs.

    Garbage collection, however, will influence the difference in run times. I ran a small test, and the SD goes down significantly if the benchmark does a garbage collect before calling the reference implementation and before calling the submitted implementation (outside the area measured by the stopwatches). This way the heap will start in more or less the same state for each program.

    Maybe it is even better to put in a closing garbage collection within the area of the stopwatches, so that the garbage collection of the heap (as caused by the algorithm) is included in the execution time...

    Kind regards,

    Renze.

  4. Thanks Renze. I added GC.Collect() before each stopwatch.Reset() call and the results became a lot more stable.

    I don't want to include the GC in the timing at this stage, as I don't want memory to be an issue when designing the sort algorithm.

  5. Hi, I got a question on the actual benchmark used for rating:
    Is it exactly the same as the one provided in the solution? The GetDateRanges() method provides an even data set, where all end dates are sorted (as well as all start dates). This doesn't look like good benchmark data. Perhaps I'm missing something?

    Replies
    1. Hi,

      Yes, the benchmark data is exactly the same as the one provided in the solution.
      You haven't missed anything. The benchmark data is not particularly good; however, your code still has to pass the unit test.

  6. Replies
    1. Please refer to the first rule under the "Rules" heading. In short, yes, it is allowed.

  7. Ouch! Looks like I'm too late with my entry just as I finally optimized it to perform 14x better than QuickSort :( These time zones!

    Would it be possible to test my submission just for fun, outside the contest?

  8. Hi orc,

    Do you think it is fair to come up with an updated benchmark application after the contest was over?

    Other contestants, including me, were not able to use the new benchmark, even though I was at the top of the list.

    Besides, the benchmark result should show the ratio of the new implementation against QuickSort, so there should be no big difference in the benchmark results between different computers. But this is what I'm actually getting on my computer with the new benchmark:

    1 - bawr, AvgRatio: 10.740, SD: 0.098, WorstCase: 10.594
    2 - Zaher, AvgRatio: 8.431, SD: 0.063, WorstCase: 8.319
    3 - Renze, AvgRatio: 5.799, SD: 0.037, WorstCase: 5.713
    4 - SoftwareBender, AvgRatio: 4.535, SD: 0.021, WorstCase: 4.503
    5 - Mihai, AvgRatio: 4.333, SD: 0.046, WorstCase: 4.267
    6 - V_Tom_R, AvgRatio: 4.317, SD: 0.047, WorstCase: 4.245
    7 - c6c, AvgRatio: 3.789, SD: 0.036, WorstCase: 3.744
    8 - ErwinReid, AvgRatio: 3.663, SD: 0.042, WorstCase: 3.566
    9 - aus1, AvgRatio: 2.831, SD: 0.058, WorstCase: 2.771
    10 - MoustafaS, AvgRatio: 2.184, SD: 0.047, WorstCase: 2.111

    The previous benchmark gave the same result on my computer and your computer.

    Kind regards,

    Zaher.

    Replies
    1. Hi Zaher,

      I haven't changed any of the benchmark data. I have re-organised some of the code to make it easier to run all competition entries in one go.

      The only changes I have made to the project are:
      - Changed build from "Debug" to "Release" mode
      - Run the benchmark with the debugger detached ("Ctrl + F5")

      I agree that the ratios should be the same between computers but that is not the feedback I have been getting, so I am curious to see what others are getting.

      Could we please continue this discussion on https://groups.google.com/d/forum/orcomps

      Cheers,

      Ben

  9. Perhaps this helps: I noticed one of the projects in the solution was being compiled for x86. When I changed that, the benchmark results changed dramatically.

    ReplyDelete