A number of issues with my real-time raytracer have presented themselves. Some only showed up when testing a debug build, such as multiple threads all contending for a single static variable. I also made the mistake of creating variables on the stack and then passing them to each thread.
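To illustrate the two threading mistakes, here is a minimal sketch of one way to avoid them. It is not the raytracer's actual code: `RenderJob`, `renderRows` and `renderFrame` are hypothetical names, and I am assuming standard C++11 `std::thread` rather than whatever threading API the original uses. Each worker gets its own job record instead of a shared static, and the records live in a container that outlives every thread.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-thread job record; not taken from the original raytracer.
struct RenderJob {
    int firstRow;
    int lastRow;    // exclusive
    long rowsDone;  // per-thread counter, used instead of a shared static
};

// Each worker touches only its own job, so no two threads write the same variable.
void renderRows(RenderJob& job) {
    for (int y = job.firstRow; y < job.lastRow; ++y) {
        ++job.rowsDone;  // stand-in for tracing one scanline
    }
}

// Splits `height` rows across `threadCount` workers and returns the rows processed.
long renderFrame(int height, int threadCount) {
    // The vector is sized up front and every thread is joined before it is
    // destroyed, so no worker ever holds a reference to dead or moved storage.
    std::vector<RenderJob> jobs(threadCount);
    std::vector<std::thread> workers;
    int rowsPerThread = (height + threadCount - 1) / threadCount;
    for (int i = 0; i < threadCount; ++i) {
        int last = std::min(height, (i + 1) * rowsPerThread);
        int first = std::min(last, i * rowsPerThread);
        jobs[i] = RenderJob{first, last, 0};
        workers.emplace_back(renderRows, std::ref(jobs[i]));
    }
    for (auto& w : workers) w.join();

    long total = 0;
    for (const auto& j : jobs) total += j.rowsDone;
    return total;
}
```

Calling `renderFrame(480, 4)` hands each of the four workers 120 rows. The important part is the lifetime: the job data is owned by the function that also joins the threads, so it cannot disappear while a worker is still reading it.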
A friend from work also ported the code over to Linux and suggested some ways to improve performance using the SSE optimisations available through GCC, which is awesome.
I have decided to post a revised version of the code with all of these corrections and fixes. Also thrown into the mix are Fresnel attenuation for the mirrored surface and light falloff, making things look prettier still.
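For anyone unfamiliar with the two shading terms, here is a rough sketch of how they are commonly computed. These are assumptions for illustration, not necessarily the formulas in the revised code: the Fresnel term uses Schlick's approximation, and the falloff is a simple damped inverse-square curve. The function names and the `f0` and `k` parameters are hypothetical.

```cpp
#include <algorithm>

// Schlick's approximation to Fresnel reflectance (assumed here for illustration).
// cosTheta: cosine of the angle between the view direction and the surface normal.
// f0: reflectance when looking straight at the surface (hypothetical material parameter).
float fresnelSchlick(float cosTheta, float f0) {
    float c = std::max(0.0f, std::min(1.0f, cosTheta));  // keep within [0, 1]
    float m = 1.0f - c;
    return f0 + (1.0f - f0) * m * m * m * m * m;  // f0 + (1 - f0) * (1 - cos)^5
}

// Light falloff with distance: 1 / (1 + k * d^2).
// k is a hypothetical tuning constant; the +1 avoids a blow-up as d approaches 0.
float lightFalloff(float distance, float k) {
    return 1.0f / (1.0f + k * distance * distance);
}
```

The reflection colour from the mirror would be scaled by `fresnelSchlick`, so it reflects weakly when viewed head-on and strongly at grazing angles, while each light's contribution would be scaled by `lightFalloff` so that distant surfaces are lit more dimly.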