The Half-Baked Neural Net APIs of iOS 10

iOS 10 contains two sets of APIs relating to Artificial Neural Nets and Deep Learning, aka The New New Thing: the BNNS routines in the Accelerate framework and the convolutional neural network support in Metal Performance Shaders. Unfortunately, both APIs are bizarrely incomplete: they allow you to specify the topology of the neural net, but have no facility for training.

I say this is "bizarre" for two reasons:

  • Topology and the results of training are inextricably linked; and
  • Topology is static

Training a neural net is, ultimately, just setting the weighting factors for the elements in the network topology: every connection in the network gets some weighting factor. A network topology without weights is useless, and a training run produces weights for that specific topology.
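To make that concrete, here is a minimal fully connected layer in plain Swift (no Apple APIs, all names my own): the struct's dimensions are the topology, the arrays are what training produces, and inference is nothing more than applying the one to the other.

```swift
struct DenseLayer {
    // Topology: how many inputs and outputs this layer connects.
    let inputCount: Int
    let outputCount: Int
    // Training results: one weight per connection, one bias per output.
    var weights: [[Float]]   // outputCount rows, each with inputCount entries
    var biases: [Float]

    // Inference: weighted sum plus bias, then a nonlinearity (ReLU here).
    func apply(_ input: [Float]) -> [Float] {
        precondition(input.count == inputCount, "input does not match topology")
        return (0..<outputCount).map { o in
            let sum = zip(weights[o], input).reduce(biases[o]) { $0 + $1.0 * $1.1 }
            return max(0, sum)
        }
    }
}
```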

Topologies are static: neural nets do not modify their topologies at runtime. (Topologies are not generally modified even during training; instead, the experimenter uses their intuition to create a topology that they then train.) The topology of a neural net ought to be declarative and probably ought to be loaded from a configuration file, along with the weights that result from training.
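Something like the following is what I have in mind: the topology described declaratively, with each layer pointing at the resource that holds its trained weights. The file layout and field names here are invented for illustration; nothing like this ships in iOS 10.

```swift
import Foundation

struct LayerSpec: Codable {
    let kind: String             // e.g. "fullyConnected", "convolution"
    let inputSize: Int
    let outputSize: Int
    let activation: String       // e.g. "relu", "sigmoid"
    let weightsResource: String  // resource holding this layer's trained weights
}

struct NetworkSpec: Codable {
    let name: String
    let layers: [LayerSpec]
}

// Load the declarative topology (plus pointers to its weights) from a JSON file.
func loadNetworkSpec(from url: URL) throws -> NetworkSpec {
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode(NetworkSpec.self, from: data)
}
```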

When I first saw the iOS 10 APIs, I thought it was possible that Apple was going to reveal a high-level tool for defining and training ANNs: something like Quartz Composer, but for Neural Networks. Or, perhaps, some kind of iCloud-based service for doing the training. But instead, in the WWDC sessions they said that the model was to develop and train your networks in something like Theano and then use these APIs at runtime.

This is how it works:

  • Do all of your development using non-Apple tools, but make sure that your results stay within the runtime capabilities of the Apple neural APIs.
  • When you're done, you'll have two things: a network graph and a weight for each connection in that graph.
  • In your code, use the Apple neural APIs to recreate the network graph.
  • Ship the weights as a resource (downloaded or loaded from a file).
  • Back in your code, stitch together the weights and the graph (sketched below). One mistake and you're toast. If you discover a new, more efficient topology, you'll have to change your binary.
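Concretely, here is what those last few steps look like, sketched against the DenseLayer type from earlier rather than the real BNNS or Metal Performance Shaders calls (which are similar in spirit: each layer is handed its pre-trained weights at construction time). The resource names, the layer sizes, and the flat-Float32 blob format are all assumptions for illustration.

```swift
import Foundation

// Load the shipped weights: a flat blob of Float32 values in a bundle resource.
func loadWeights(named name: String) throws -> [Float] {
    guard let url = Bundle.main.url(forResource: name, withExtension: "weights") else {
        throw CocoaError(.fileNoSuchFile)
    }
    let data = try Data(contentsOf: url)
    return data.withUnsafeBytes { Array($0.bindMemory(to: Float.self)) }
}

// Stitch weights and graph together. The layer sizes are hard-coded, so a new
// topology means new code, a new binary, and new weight blobs.
func buildNetwork() throws -> [DenseLayer] {
    let hiddenFlat = try loadWeights(named: "hidden")   // expects 784 * 128 values
    let outputFlat = try loadWeights(named: "output")   // expects 128 * 10 values

    // Reshape a flat blob into rows of the given width.
    func rows(_ flat: [Float], width: Int) -> [[Float]] {
        return stride(from: 0, to: flat.count, by: width).map { Array(flat[$0..<$0 + width]) }
    }

    // Biases elided for brevity; in reality they'd be loaded the same way.
    let hidden = DenseLayer(inputCount: 784, outputCount: 128,
                            weights: rows(hiddenFlat, width: 784),
                            biases: Array(repeating: 0, count: 128))
    let output = DenseLayer(inputCount: 128, outputCount: 10,
                            weights: rows(outputFlat, width: 128),
                            biases: Array(repeating: 0, count: 10))
    return [hidden, output]
}
```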

This is my prediction: Anyone who uses these APIs is going to instantly write a higher-level API that combines the definition of the topology with the setting of the weights. I mean: Duh.
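Here is the sort of thing I mean, sketched against the invented NetworkSpec, loadWeights, and DenseLayer pieces above: one call that takes the declarative topology plus the weight resources and hands back a ready-to-run network, checking along the way that the blobs actually match the declared graph.

```swift
import Foundation

// From declarative spec to runnable network in one call.
// (Activation and layer-kind handling elided; this sketch only builds dense layers.)
func loadNetwork(specURL: URL) throws -> [DenseLayer] {
    let spec = try loadNetworkSpec(from: specURL)
    return try spec.layers.map { layer in
        let flat = try loadWeights(named: layer.weightsResource)
        precondition(flat.count == layer.inputSize * layer.outputSize,
                     "\(layer.kind) weight blob does not match the declared topology")
        let rows = stride(from: 0, to: flat.count, by: layer.inputSize)
            .map { Array(flat[$0..<$0 + layer.inputSize]) }
        return DenseLayer(inputCount: layer.inputSize, outputCount: layer.outputSize,
                          weights: rows,
                          biases: Array(repeating: 0, count: layer.outputSize))
    }
}
```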

Now, to be fair, you could implement your own training algorithm on the device and modify the weights of a pre-existing neural network based on device-specific results. That makes sense if you're Apple and want to do as much of the Siri / Image recognition / Voice recognition heavy lifting on the device as possible while still allowing a certain amount of runtime flexibility. That is, you do the vast majority of the training during development, download the very complex topology and weight resources, but allow the device to modify the weights by a few percent. But even in that case, either your topology stays static or you build it from a declarative configuration file, which means that whichever route you choose, you're still talking about a half-baked API.
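For what it's worth, that kind of on-device adjustment could look something like the following sketch: a plain gradient-descent nudge, clamped so each shipped weight stays within a few percent of its original value. This is purely illustrative; it isn't anything Apple provides.

```swift
// Nudge the shipped weights of a DenseLayer using locally computed gradients,
// never letting any weight drift more than a few percent from its original value.
func fineTune(layer: inout DenseLayer,
              weightGradients: [[Float]],
              learningRate: Float = 0.001,
              maxRelativeChange: Float = 0.05) {
    for o in 0..<layer.outputCount {
        for i in 0..<layer.inputCount {
            let original = layer.weights[o][i]
            let proposed = original - learningRate * weightGradients[o][i]
            // Keep the on-device weight within a few percent of the shipped one.
            let bound = abs(original) * maxRelativeChange
            layer.weights[o][i] = min(max(proposed, original - bound), original + bound)
        }
    }
}
```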

Bizarre.