Swift 1.2 Update (Xcode 6.3 beta 2) - Performance

Apple have shipped another major update (release notes - registered devs only) only two weeks after shipping beta 1. There are major updates to Swift Playgrounds, additional syntax for a more flexible `if let`, a new `zip` function, various other tweaks and a tonne of bug fixes. This is all on top of the major changes in 1.2 beta 1 that I discussed at Swift London.

Erica Sadun has already blogged about the Playgrounds and `if let` changes, and I'm sure there will be plenty more over the next week. I don't intend to go over that ground; instead I will discuss the performance changes in the Swift 1.2 betas. [Update: I should also have mentioned Jameson Quave's post that covers the new `zip` function.]

The performance jumps in Swift 1.2 are very significant, and in Beta 2 in particular the steps necessary to optimise your code have changed significantly. This post covers the improvements and how to get the best out of the Swift compiler. In most cases I have encountered, performance is now very close to C/C++ code and may at times be faster.

Swift 1.2 beta 2 Big Performance News

In Xcode 6.3 beta 2, high performance can be achieved without the code changes that were previously required to get close. Specifically, the performance gains from the following changes seem to have been largely eliminated:

  1. Marking methods/properties `final`, which used to give a massive gain (although I still recommend it where possible for design/safety reasons).
  2. Using `UnsafeMutableBufferPointer` instead of an array (at least in -Ounchecked builds).
  3. Moving code into the same file to allow the compiler to inline (given the -whole-module-optimization build option, new in Xcode 6.3 beta 2).
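
For readers who have not used these techniques, the first two items can be sketched roughly as follows. This is a minimal illustration written in current Swift syntax rather than Swift 1.2, and `Accumulator`, `sumWithFinalClass` and `doubleInPlace` are hypothetical names, not code from the benchmarks discussed in this post.

```swift
// `final` tells the compiler that `add` can never be overridden,
// so calls can be dispatched statically and potentially inlined.
final class Accumulator {
    private(set) var total = 0
    func add(_ value: Int) { total += value }
}

func sumWithFinalClass(_ values: [Int]) -> Int {
    let accumulator = Accumulator()
    for value in values { accumulator.add(value) }
    return accumulator.total
}

// Working through an UnsafeMutableBufferPointer instead of
// subscripting the array directly.
func doubleInPlace(_ values: inout [Int]) {
    values.withUnsafeMutableBufferPointer { buffer in
        for index in buffer.indices {
            buffer[index] *= 2
        }
    }
}

var numbers = [1, 2, 3, 4]
doubleInPlace(&numbers)
let total = sumWithFinalClass(numbers)
```

The point of the list above is that, with the new compiler, the plain versions of code like this should perform about as well as the hand-tuned ones.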

There also seems to have been a slight general performance gain (about 5% above beta 1 from my initial tests).

Is Swift Fast Yet?

Yes. I've always believed that the language had the potential to be as fast as or faster than C because of strict compile time type checking, immutability and a lack of pointer aliasing concerns. With the Xcode 6.3 betas that promise is being delivered on. I'm sure the work is continuing and there is more to come, but it is already fast in most cases. My thanks to the Apple team for making a fast platform, but you have just taken away my ability to look impressive by speeding up code significantly with a few simple changes - you bastards.

Health Warning

Xcode 6.3 beta 1 was very buggy (this isn't a complaint; it was a first beta and the Swift team was very responsive to reports), and I haven't yet used beta 2 enough to know whether I can recommend it as a main development platform. If you need to ship a release in the next couple of months I would keep main development on Swift 1.1 for now, as it is likely that you won't be able to release products to the App Store until the final release is made.

It is, however, worth branching your code, running the migrator and spending a little time getting it building. That way you can find any issues that trip up your code and file bug reports as appropriate, you don't get a nasty shock when the release is final, and you have an opportunity to influence the language with feedback before then.

You may also want to postpone or cancel Swift 1.1 optimisation work unless you have an urgent need for a speedup.

Performance

-whole-module-optimization (new in Xcode 6.3 Beta 2)

This new build option disables the incremental build feature introduced in Beta 1, so it comes at a significant cost in build times. What it does is build all the files in the module together, so that optimisations, including inlining of functions and methods, are possible across files.

This was the main issue in the Geekbench FFT performance test that put even Beta 1 ten times slower than C++. I optimised by moving the key functions into the same file and got the speed to within about 10% of the C++ version. With this option enabled in Beta 2 there is no need to rearrange your source, and I think the speed reaches parity with C++. On this FFT benchmark the compile setting makes a difference of about a factor of eight.

You may want to consider whether you need different build configurations or targets for development, so that you don't pay the price in build time on every run, but if performance is critical you should performance test and release with this option enabled.

-Ounchecked

The -Ounchecked option has always been available in Swift (although it was called -Ofast in some of the initial betas), but I haven't generally seen much gain (maybe 10%) over the -O (fast) optimisation setting; that is, until the Geekbench tests. There, the SGEMM test runs twice as fast with -Ounchecked as with -O.

I'm not yet sure of the costs or risks of running in this mode (traps on overflow seem to be disabled, but I will try to investigate whether there is anything else). However, running in this mode (combined with whole module optimisation) brought the original code to performance similar to the result of my optimisations and close to C++ speed for Mandelbrot and FFT.
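
Since overflow traps seem to go away in this mode, code that can genuinely overflow may be safer spelling out the behaviour it wants rather than relying on the trap. A minimal sketch in current Swift syntax (not Swift 1.2); `checkedSum` is a hypothetical helper used only for illustration:

```swift
// Returns nil instead of trapping when the running total overflows Int32.
func checkedSum(_ values: [Int32]) -> Int32? {
    var total: Int32 = 0
    for value in values {
        let (partial, overflowed) = total.addingReportingOverflow(value)
        if overflowed { return nil }
        total = partial
    }
    return total
}

let small = checkedSum([1, 2, 3])
let tooBig = checkedSum([Int32.max, 1])

// The &+ operator explicitly wraps around on overflow rather than trapping.
let wrapped = Int32.max &+ 1
```

Neither `addingReportingOverflow` nor `&+` depends on a runtime trap, so their results do not change with the optimisation setting.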

Geekbench SGEMM Comparison

While the FFT and Mandelbrot tests are pretty close to being level with the C++ versions, the SGEMM test leaves Swift a long way behind. The view of the Apple engineers is that the C++ version's use of the -ffast-math build setting (which trades accuracy for speed) gives it the speed advantage, because Swift doesn't yet support a fast-math mode, which is needed to vectorise this algorithm. It would be really interesting to see what the C++ performance would be like with accurate maths (Geekbench update please).
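
To illustrate why a fast-math mode matters for vectorisation: floating-point addition is not associative, so a compiler that must preserve exact results cannot freely regroup a sum into the independent partial sums that vector instructions need. A small sketch in current Swift syntax, with values chosen purely for illustration:

```swift
let a: Float = 100_000_000   // 1e8, exactly representable as a Float
let b: Float = -100_000_000
let c: Float = 1

// Left to right: a and b cancel exactly first, then 1 is added.
let leftToRight = (a + b) + c

// Regrouped: b + c rounds back to -1e8 because neighbouring Float
// values near 1e8 are 8 apart, so the 1 is lost before a is added.
let regrouped = a + (b + c)

print(leftToRight, regrouped)
```

A fast-math mode lets the compiler treat both groupings as interchangeable, which is the accuracy-for-speed trade described above.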

What optimisation can still be done?

I think that is an investigation for another day, but there might be room to use the Accelerate framework on the SGEMM test. Realistically that would stop it being a language test, and the same optimisation could be applied to the C++ version if it increased the speed there, but that is the low level we are now getting down to.
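
As a rough idea of what handing SGEMM to Accelerate could look like, here is a sketch in current Swift syntax. It is not the Geekbench code and has not been benchmarked here; `multiply` is a hypothetical wrapper, the matrices are assumed to be stored row-major in flat arrays, and a naive loop is used where Accelerate is unavailable. It uses the classic `cblas_sgemm` interface, for which newer SDKs may emit a deprecation warning.

```swift
#if canImport(Accelerate)
import Accelerate
#endif

/// Multiplies an m×k matrix `a` by a k×n matrix `b` (both row-major)
/// and returns the m×n product.
func multiply(_ a: [Float], _ b: [Float], m: Int, n: Int, k: Int) -> [Float] {
    precondition(a.count == m * k && b.count == k * n)
    var c = [Float](repeating: 0, count: m * n)
    #if canImport(Accelerate)
    // C = 1.0 * A * B + 0.0 * C, computed by the BLAS in Accelerate.
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                Int32(m), Int32(n), Int32(k),
                1.0, a, Int32(k),
                b, Int32(n),
                0.0, &c, Int32(n))
    #else
    // Fallback: straightforward triple loop.
    for row in 0..<m {
        for column in 0..<n {
            var sum: Float = 0
            for inner in 0..<k {
                sum += a[row * k + inner] * b[inner * n + column]
            }
            c[row * n + column] = sum
        }
    }
    #endif
    return c
}

let product = multiply([1, 2, 3, 4], [5, 6, 7, 8], m: 2, n: 2, k: 2)
```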

I still need to confirm that using immutable values, `for in` loops and `stride` doesn't have a significant effect, but I don't expect much of a gain or a loss here.
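
The kind of comparison meant here can be sketched as two equivalent loops, one with a mutable index and one using `stride` with an immutable loop variable. This is written in current Swift syntax, and the function names are hypothetical:

```swift
let samples = [3, 1, 4, 1, 5, 9, 2, 6]

// Mutable index with a manual while loop.
func sumEvenIndicesWhile(_ values: [Int]) -> Int {
    var total = 0
    var index = 0
    while index < values.count {
        total += values[index]
        index += 2
    }
    return total
}

// Immutable loop variable driven by stride.
func sumEvenIndicesStride(_ values: [Int]) -> Int {
    var total = 0
    for index in stride(from: 0, to: values.count, by: 2) {
        total += values[index]
    }
    return total
}

let viaWhile = sumEvenIndicesWhile(samples)
let viaStride = sumEvenIndicesStride(samples)
```

Both produce the same result; whether one is measurably faster than the other is exactly what would need timing on real code.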

Performance thoughts and Background to my Experiments

As I always note when discussing performance, the exact results are highly workload dependent, and if performance matters you need to benchmark and test your own code. Also, most code is not performance critical in a way that the language significantly affects; it is waiting on user input, network or other IO, or is simply spending the bulk of its time in library calls rather than in application code. The biggest wins are often related to algorithms and data structures rather than micro-optimising.
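
For benchmarking your own code, even a very small timing harness is a useful start. The following is a minimal sketch in current Swift syntax, not the harness used for the figures in this post; `measure` is a hypothetical helper and the workload is a placeholder:

```swift
import Dispatch

/// Runs `work` `runs` times and returns the fastest wall-clock time in seconds
/// together with the last result, so the caller can check and use it.
func measure<T>(runs: Int = 5, _ work: () -> T) -> (seconds: Double, result: T) {
    precondition(runs > 0)
    var best = Double.infinity
    var result = work()  // warm-up run, not timed
    for _ in 0..<runs {
        let start = DispatchTime.now().uptimeNanoseconds
        result = work()
        let end = DispatchTime.now().uptimeNanoseconds
        best = min(best, Double(end - start) / 1_000_000_000)
    }
    return (best, result)
}

let input = Array(1...100_000)
let timing = measure { input.reduce(0, +) }
print("best of 5: \(timing.seconds)s, sum = \(timing.result)")
```

Timings only mean something when the harness is built with the same optimisation settings as the release build, since debug and optimised builds differ so widely.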

My approach to investigating performance has largely been to see what I can do with other people's projects, particularly where they have come up short against their expectations in Swift code or where Swift has lost out in comparisons with other languages. I prefer this to developing my own synthetic benchmarks, as it allows me to start optimising from what someone else regards as complete rather than having a very blurred line between the basic development and the optimisation.

The performance comparisons here are all for CPU bound code, which is where the difference will be made. In places I give general estimates, which are improvements that I have seen across several different code samples.

Both Beta 1 and Beta 2 have included large leaps forward in the performance of Swift. Even without code changes, improvements in beta 1 of about 3x in optimised builds and about 10x in unoptimised debug builds are fairly typical for CPU bound code. Beta 2 enables further big gains on normal code and removes the need for most of the optimisation work that I was doing.

5 responses
-Ounchecked disables all uses of precondition() and preconditionFailure(), in addition to disabling overflow/underflow testing on arithmetic. This includes things like array bounds testing and any other place where invalid input would be expected to (safely) abort. Note that the optimizer still assumes that the condition used in precondition() is true (and that any instances of preconditionFailure() are unreachable). This means that you could get some very surprising behavior if you violate the precondition with -Ounchecked.