Swift Arrays - Too Swift and flexible for Our Good

UPDATE 2: I've added a new post about the changes in Beta 3 (I like them)

UPDATE - Apple have indicated that the array behaviour in Swift is going to change - see this post

Update 3: Please note that this article remains for the historical record and to ensure that links in other articles aren't broken but that the behaviour of Swift arrays was radically changed in Beta 3 and it resolves most if not all of the issues that I had with Swift arrays making the following essentially redundant. Update 2 above links to my follow up article reviewing the changes. You can also read several of my more recent articles about Swift. [This presentation about optimising Swift is particularly recommended and this article about mutating Swift arrays from multiple threads is on topic with this post. - Update 4]

I really like Apple's new Swift language. It brings in some of the best features of many functional languages including ML, to Haskell such as type inference, strong types, generic types and combines them with with an object system that can interoperate with the existing frameworks while still being fast and efficient. It encourages immutable values over variables and will eliminate whole classes of errors related to null pointers and improper handling of them.

The main thing that I don't like about it is the handling of "immutable" arrays. For performance reasons when an array is declared with "let" it means only that the array length is immutable and other arrays cannot be assigned to the same variable. You can however change any of the values in the array. I think that almost whatever the performance gains this is not worth the confusing behaviour that results. [Edit: Follow up post explaining the current behaviour is now available.]

In the post I will mix speculation and a little investigation of the behaviour to understand what the current behaviour is and the rationale for it. I also intend to propose a way out of the situation by which the behaviour could be changed with minimum disruption. Apple has indicated that source compatibility is not guaranteed to be maintained and I think automatic conversion would be possible to migrate code.

Current Behaviour

I'm not going to go too much into the oddities in the current behaviour. They initially cam to my attention in this post that makes a number of other mistakes but is right about arrays. It was discussed on Hacker News and there is a good discussion on the Apple Developer forum (registered Apple Devs only).

For me there are plenty of gotchas to catch the unwary from the fact that the sort function modifies and returns the array given as argument to the odd copy semantics.

To be clear the current behaviour is as documented rather than implementation bugs. The Swift Programming Language book has this to say:

“Mutability of Collections

Arrays and dictionaries store multiple values together in a single collection. If you create an array or a dictionary and assign it to a variable, the collection that is created will be mutable. This means that you can change (or mutate) the size of the collection after it is created by adding more items to the collection, or by removing existing items from the ones it already contains. Conversely, if you assign an array or a dictionary to a constant, that array or dictionary is immutable, and its size cannot be changed.

For dictionaries, immutability also means that you cannot replace the value for an existing key in the dictionary. An immutable dictionary’s contents cannot be changed once they are set.

Immutability has a slightly different meaning for arrays, however. You are still not allowed to perform any action that has the potential to change the size of an immutable array, but you are allowed to set a new value for an existing index in the array. This enables Swift’s Array type to provide optimal performance for array operations when the size of an array is fixed.

The mutability behavior of Swift’s Array type also affects how array instances are assigned and modified. For more information, see Assignment and Copy Behavior for Collection Types.”

Excerpt From: Apple Inc. “The Swift Programming Language.” iBooks. https://itun.es/gb/jEUH0.l”

And it has this to say about Assignment and Copy Behaviour for Arrays:
“If you assign an Array instance to a constant or variable, or pass an Array instance as an argument to a function or method call, the contents of the array are not copied at the point that the assignment or call takes place. Instead, both arrays share the same sequence of element values. When you modify an element value through one array, the result is observable through the other.

For arrays, copying only takes place when you perform an action that has the potential to modify the length of the array. This includes appending, inserting, or removing items, or using a ranged subscript to replace a range of items in the array. If and when array copying does take place, the copy behavior for an array’s contents is the same as for a dictionary’s keys and values, as described in Assignment and Copy Behavior for Dictionaries.”

Excerpt From: Apple Inc. “The Swift Programming Language.” iBooks. https://itun.es/gb/jEUH0.l
This behaviour does not feel at all natural to me and will need constantly considering when using arrays in Swift.

Why Did Apple Choose This (Speculation)

Apple state performance reasons for choosing to allow modifications of an array declared with let and I'm sure that is right but it may not be obvious why there are performance advantages.

The performance gains are not going to be compared to read operations on a truly immutable array because the compiler has sufficient information to identify attempts to write to constant arrays so there would be no run time hit using constant arrays per se but there would be costs imposed by having to use fully flexible arrays when the fixed length but mutable one would do. There may also have to be additional copies made on assignment and in function calls.

Why is Array Performance Critical

Arrays are used everywhere. The performance is most critical when dealing with large amounts of data such as images and video data. Uncompressed images are a great example of things that may be a fixed size object requiring processing and changing of lots of values (e.g. an image filter or compositing of different images) and that require high performance.

Having said that most application programmer's arrays will be fairly small often with just a few items in them and they will not generally be performance critical.

Costs of fully Mutable Arrays

The cost of maintaining fully mutable arrays will depend on the implementation and how growing the arrays is handled. The two ways that seem obvious to me are:

  1. Every time the array needs to grow allocate a new space big enough for the existing array and additional space for more items, copy all the items into the new space and then free the original space.
  2. Maintain more than one block of space and track how they are arranged.

Option 1 will allow very quick read and modify in place operations (probably one pointer read worse than a constant fixed array) at the cost of very expensive operations on the occasions when the array needs to grow the space available to it (it would reserve chunks at a time so it wouldn't be every append).

Option 2 will probably need to walk an index array to find the correct location in the non-contiguous areas of memory that the array is split between on every operation. Every read adding a real but fairly small cost to every operation including read and write in place operations.

Apple is generally careful to avoid "stop the world" operations so I strongly suspect in the absence of other evidence and not having checked the assembler that they would have chose Option 2 or some variation thereof. It is also the explanation for the performance benefits of having a mutable but fixed size array. For such an array can always be allocated contiguously and require only a bounds check and a direct offset calculated from the index without any risk of a pointer chase.

How would I fix it?

I would make arrays properly obey let semantics and be properly constant. For the performance critical tasks such as image or video processing or otherwise dealing with large amounts of mutable fixed length data I would create a new type of Fixed Length Array. Any developer that needed the performance could opt into the different semantics of the fixed length mutable array and the complications on assignment for the particular object that required it.

What about code that has already been written?

It should be possible to do an automatic source conversion process.

  1. Evaluate the code and if it still compiles it is fine to leave it as truly constant arrays where relevant.
  2. In cases where the code won't work replace the array declarations with classes insert the new Fixed Length Array declarations and you should get the same behaviour as the original code.

Next Steps and Summary

Someone should (and I might if I have time) dig into the behaviour of the Swift compiler by checking the memory location of a test object and possibly also the generated assembler for the two categories of current arrays (fixed length and fully mutable.

Please Apple, fix this wart on your lovely new language.

Discuss on Hacker News