At WWDC 2021, amongst the other async talks, was the Meet AsyncSequence talk, which demonstrated (amongst other things) new APIs for getting async sequences of bytes from files or URLs. I was curious how they would perform, so I ran a few tests. Note that the FileHandle and URL bytes properties do not seem to work in Xcode 13 beta 1, but the version on URLSession does work and can be used for files - see the updates on my last post.
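For context, the three byte-streaming APIs from the session look roughly like this. This is a minimal sketch, not the test code from this post; the local file path is a placeholder, and availability of each property varies across the betas as noted above.

```swift
import Foundation

// Sketch of the three AsyncSequence byte APIs shown at WWDC 2021.
// The file path here is a placeholder for illustration.
func byteStreamExamples() async throws {
    let fileURL = URL(fileURLWithPath: "/tmp/data.bin")

    // 1. FileHandle.bytes - async sequence of UInt8 from an open file handle
    //    (not working for me in Xcode 13 beta 1)
    let handle = try FileHandle(forReadingFrom: fileURL)
    for try await byte in handle.bytes {
        _ = byte
    }

    // 2. URL.resourceBytes - bytes of a local or remote resource
    //    (also not working for me in beta 1)
    for try await byte in fileURL.resourceBytes {
        _ = byte
    }

    // 3. URLSession.bytes(from:) - the variant that did work in beta 1,
    //    and which also accepts file URLs
    let (bytes, _) = try await URLSession.shared.bytes(from: fileURL)
    for try await byte in bytes {
        _ = byte
    }
}
```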
This is a very first pass with some rough numbers to get a ballpark feel. This is iOS 15 beta 1 (on an iPhone XS) and Xcode 13 beta 1. Testing was done with the -O (speed) optimisation setting. The tests themselves are at the end of this post. I configured each test to run just once because first-run performance was consistently slower, and I therefore thought it more representative of reading a file that is not already cached.
As a baseline I included a pure synchronous read into a Data object. The AsyncSequence approach should not get close to this performance: it has to do a great deal more context switching, or at least many more function calls, whereas the synchronous read is straight-line code on a single thread.
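The shape of the comparison is roughly the following. This is an illustrative sketch rather than the exact test code (which appears at the end of the post): the synchronous baseline reads the whole file in one call, while the async version suspends and resumes its way through the sequence byte by byte.

```swift
import Foundation

// Baseline: a plain synchronous read of the whole file into Data.
// Straight-line code on a single thread.
func syncRead(_ url: URL) throws -> Data {
    try Data(contentsOf: url)
}

// Async counterpart: accumulate the AsyncBytes into an array, paying
// for the per-byte iteration through the async sequence machinery.
// Uses URLSession.bytes(from:) since that was the variant working in beta 1.
func asyncRead(_ url: URL) async throws -> [UInt8] {
    var buffer: [UInt8] = []
    let (bytes, _) = try await URLSession.shared.bytes(from: url)
    for try await byte in bytes {
        buffer.append(byte)
    }
    return buffer
}
```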
Results
Conclusion - for now
A 16x slowdown to get the cooperative async behaviour and have the possibility to make other async calls with the data isn't a bad trade-off in many scenarios. With really big files it will also be more memory efficient in many cases. That said, iterating over the AsyncBytes just to make an array of the data is likely not the best option. And the compiler isn't yet saving you from inefficient reduce calls.
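To illustrate the reduce point: the helpers below are hypothetical (not from the tests in this post) and contrast a reduce that copies its accumulator on every byte with a linear append loop. The `expectedCount` parameter is an assumption for illustration.

```swift
import Foundation

// Easy to write, but `partial + [byte]` copies the whole accumulator on
// every element - quadratic work the compiler will not optimise away.
func collectWithReduce(_ bytes: URLSession.AsyncBytes) async throws -> [UInt8] {
    try await bytes.reduce([UInt8]()) { partial, byte in
        partial + [byte]
    }
}

// Linear alternative: reserve capacity once (expectedCount is a
// hypothetical hint) and append in place.
func collectWithAppend(_ bytes: URLSession.AsyncBytes,
                       expectedCount: Int) async throws -> [UInt8] {
    var buffer: [UInt8] = []
    buffer.reserveCapacity(expectedCount)
    for try await byte in bytes {
        buffer.append(byte)
    }
    return buffer
}
```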
I will try to revisit this in future betas and perhaps refine the testing too. This was a quick, rough experiment, so if you think I've missed anything important please let me know. If I get the time I might look at different file sizes to see how the results scale, at some more involved use cases that should level the field a little, and possibly at randomly generated file data so that caching can be removed from the equation to a greater extent.
Follow up
Honestly a little better than I expected. The buffer size is likely quite too small still, and we’re only just starting to tune the runtime for it. Also note that perf will be very different when the consumer is on the main actor vs a random one bc main actor is thread confined.
— David Smith (@Catfish_Man) June 17, 2021