Swift Arrays are not Threadsafe

Swift arrays are not necessarily stored as a contiguous block of memory, although they sometimes are, which is why the same code can work in some cases and fail in others. Some operations can trigger a conversion to contiguous storage, and I believe that conversion is what particularly causes problems when the array is accessed from multiple threads.

[Update - This post is more popular than I expected (thanks Reddit). I've added some additional notes at the bottom. Also the previous post about optimisation is probably more interesting and valuable.]

Even the code described here may not work when multiple reads or writes touch the same elements of the array; I haven't tested that case at all. All the code I used either read from a constant array (safe from multiple threads/queues) or wrote into non-overlapping ranges with no reads occurring at the same time, and the latter only seems safe with the technique described here.

.withUnsafeMutableBufferPointer
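
As a rough illustration of the non-overlapping-writes pattern, here is a minimal sketch rather than the app's actual solver code. The function name parallelSquares and its chunking scheme are made up for this example, and it uses the modern DispatchQueue.concurrentPerform API (with the Swift 5 language mode assumed) rather than whatever dispatch call the original code used. Each task writes only to its own slice of the buffer, and nothing reads the array while the writes are in flight.

```swift
import Dispatch

// Hypothetical helper for illustration: fills an array with i * i,
// splitting the work into `chunks` tasks that each own a disjoint range.
func parallelSquares(count: Int, chunks: Int) -> [Int] {
    precondition(count >= 0 && chunks > 0)
    var result = [Int](repeating: 0, count: count)
    // Round up so the chunks together cover every index.
    let chunkSize = (count + chunks - 1) / chunks

    result.withUnsafeMutableBufferPointer { buffer in
        // Inside this closure the storage is a single contiguous block.
        // Take the raw base pointer once; it may be nil for an empty array.
        guard let base = buffer.baseAddress else { return }

        // concurrentPerform blocks until every iteration has finished,
        // so the pointer never outlives the enclosing closure.
        DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
            let start = chunk * chunkSize
            let end = min(start + chunkSize, count)
            guard start < end else { return }
            // Only this task ever touches indices start..<end.
            for i in start..<end {
                base[i] = i * i
            }
        }
    }
    return result
}

let squares = parallelSquares(count: 10, chunks: 3)
```

The point of the sketch is that the array itself is never indexed from more than one thread; all concurrent access goes through the raw pointer, and the ranges do not overlap. Whether this is safe for overlapping reads and writes is exactly the untested case mentioned above.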

From 41 frames per second to 1560 - Full app 38x speedup

Having spent a couple of evenings back-porting Async to iOS 7 and OS X Mavericks (10.9) and releasing it as Async.legacy, I've gone back to trying to squeeze some more performance out of the GrayScott Cellular Automata app that Simon Gladman presented at the last London Swift Meetup.

For me this was an interesting case to see how fast something that is almost entirely CPU and memory bound can be made, and it gave me a chance to play. Not many things need optimising if they are well structured, but this was a case where it could clearly be relevant.

Simon's original code calculates about 10fps in debug mode and displays many of those frames. Built with optimisations, it calculates about 41fps but very rarely updates the screen, because it relies on a timing mechanism rather than calling back to the main thread. All this was done on a 70 pixel square calculation.

Running the latest code on the same 70 pixel square calculation, it calculates between 1550 and 1600 frames in most seconds, a speedup of roughly 38 times over the optimised original (1560 / 41). It is also displaying far more frames (strictly, it is assigning more images to the image property of the imageView; the actual screen frame rate is far lower).

This post focusses on making the main solving work multi-threaded for performance and on optimising the inner loop. We are now past the point where optimising means improving the style, purity and immutability of the code. Some of the changes (inlining simple functions) go directly against good style and should only be done in inner loops. Parallelising the main solver also makes the code less clean and tidy, as does folding the pixelData generation into the main loop.