ARKit Tutorial in Swift 4 for Xcode 9 using SceneKit

In this tutorial I’m going to show you how to work with ARKit, the new framework from Apple that allows us to easily create Augmented Reality experiences in our iOS apps. The first thing to know about ARKit is that it can be used in three major ways:

  • Inside of Xcode writing Swift or Obj-C Code
  • Inside of the Unity game engine
  • Inside of the Unreal game engine

In this tutorial, we’ll be working with the first option.

We will be writing the app shown above in Swift 4 and Xcode 9 Beta in this tutorial. If you are a game developer and comfortable working with Unity or Unreal, this tutorial is not intended for you. I recommend following Unity and Epic’s documentation respectively if you are working in one of those environments. However, if you want to learn how ARKit works under the hood, this still may be useful in understanding what’s happening when you work with the plugins provided. At the end of the day, your game engine is still wrapping the native Swift or Obj-C calls to ARKit.

Getting started

First off, create a new Xcode project and use the Augmented Reality template. (Xcode 9 Beta and above only)

From here you can run the app immediately on your device and you’ll see a spaceship floating in space. As you walk around with the phone, it should track your movements and simulate the AR effect with the ship.

From here, we’re going to replace this model with a custom one we download. Then, I’ll go over how to use ARKit’s hit-test and anchor features to place this custom model on real-world objects.

If you’re already familiar with SceneKit and 3D models, I recommend skipping ahead to Section 2.

Finding a model to work with

I like to use TurboSquid to get models to play around with when I’m prototyping games, and working with AR will be similar. So you can find free (or cheap) 3D models over here. One thing to be aware of though is that some models here are meant for non-realtime rendering, so the poly counts may be high. Any time you are planning to grab a model from TurboSquid, always check that the poly count is not too high. Anything over 10k is pretty high poly for a mobile app. As another reference point, very high end games designed to be played on cutting edge gaming PCs generally have between 50k and 150k polygons for a main character model. If you see a poly count that high it should be for an extremely detailed model, and would ideally be one of very few items in the scene. What you really want for mobile is something low poly, and if it’s a detailed model you should expect a normal map to be included. A normal map will create the illusion that there are more polys than there actually are, without a significant performance impact.

SceneKit has support for DAE models. (If you have a model in another file format that you want to work with, I recommend getting a copy of Blender and converting it to DAE by importing and then exporting as DAE within the application.)

TurboSquid’s file-format filter does not include DAE, but it works fine to just type “DAE” into the search. Using this filter will help you skip the process of converting models and instead focus on the code. As a shortcut, you can use this direct link to find DAE models that are free and less than 10k polys. Every model on this list should be appropriate for use in SceneKit.

I like this tree model. We could create an AR experience where users can “plant” trees in AR and walk around through them. You can download the tree model here after signing up for a free account.

Import the DAE model into the Xcode project

So download the DAE version of the model, then drag the file into the Xcode navigator, into the art.scnassets folder that was created by the AR app template. Once you have added the asset to your Xcode project, you should be able to preview it within Xcode; click and drag to rotate around your mesh. If you don’t see anything at this step, there may be an issue with your model.

Section 2

Load the model into your scene

Now that the model is in your project you can open up the ViewController.swift file created by the template and swap out the spaceship for your model. Find this line of code:

let scene = SCNScene(named: "art.scnassets/ship.scn")!

Change it to point to your downloaded model. In my case it’s Lowpoly_tree_sample.dae.

let scene = SCNScene(named: "art.scnassets/Lowpoly_tree_sample.dae")!

When running the app with this change you will not be able to see the model, because it is much too large and not positioned in front of the camera. An easy way to check the size of a model is to cut and paste the spaceship model from the default scn file into the dae file using the SceneKit editor in Xcode. This will tell you the relative sizes, and in this case the tree is around 100 times too large.

Select the tree model in the SceneKit editor view by clicking it, and then on the right-hand side pane select the Node inspector. This is the tab with the cube icon. Here we can set the x, y, and z scale to 0.01, reducing the size of the tree by 99%. You can also set the scale in code, as well as position and rotation.
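
If you prefer to set these in code rather than in the editor, here’s a minimal sketch. It assumes the treeNode reference we won’t actually obtain until the next section, so treat it as a preview rather than something to paste in right now.

// Scale the tree to 1% of its original size, just like the editor change above
treeNode?.scale = SCNVector3Make(0.01, 0.01, 0.01)
// Position and rotation can be set the same way
treeNode?.position = SCNVector3Make(0, 0, -1)
treeNode?.eulerAngles.y = Float.pi / 4 // rotate 45 degrees around the vertical axis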

You may notice that in the default position, the tree is not visible. This is because its origin is at position (0, 0, 0), and the camera is actually inside the tree at that location. If you start up the AR app on a device and walk around a bit, you’ll see the tree when you turn around. But let’s just move it out a bit in code. What we’ll do is change the position to be 1 unit in front of the camera in its default location. In order to do this we’ll need to find the tree as a node within the scene. Going back to the SceneKit editor (select the DAE file within Xcode), we can click on the tree model itself, and then again in the Node inspector there should be a name. This is the name of the selected mesh, in this case Tree_lp_11. If you’re using a different model the name may be different, or empty. Fortunately we can just type in our own name as needed.

Now that we know the name of the node, we can access it within our code.

let scene = SCNScene(named: "art.scnassets/Lowpoly_tree_sample.dae")!
let treeNode = scene.rootNode.childNode(withName: "Tree_lp_11", recursively: true)

The second line above searches the child nodes of the scene object created from the DAE file we imported, and returns a node with the name specified. Since 3D models can have deeply nested nodes, it’s often useful to search recursively through the nested hierarchy to find the mesh object, so we opt to search recursively, even though it is not necessary for this particular model.

From here, we can simply reposition the node by moving it forward a bit. That means going in the negative direction along the z axis, as the default camera faces down negative Z. Or in other words, it’s looking at the point (0, 0, -Inf).

So to make the tree visible, let’s move it back 1 unit in the z direction. The easiest way to do this is just set the new z to -1 on the position object.

let scene = SCNScene(named: "art.scnassets/Lowpoly_tree_sample.dae")!
let treeNode = scene.rootNode.childNode(withName: "Tree_lp_11", recursively: true)
treeNode?.position.z = -1

This will work, but in practice what is more common is to create a new position object from scratch and assign that as the new position. This is based on taste, but it’s a common approach to avoid mutating positions of 3D objects, and instead to replace them. To do this we need to construct a SCNVector3 object with the position (0,0,-1). This code has the exact same effect, but is a better practice:

let scene = SCNScene(named: "art.scnassets/Lowpoly_tree_sample.dae")!
let treeNode = scene.rootNode.childNode(withName: "Tree_lp_11", recursively: true)
treeNode?.position = SCNVector3Make(0, 0, -1)

In order to modify the treeNode later, let’s keep an instance reference to it. Outside of any functions, but inside the ViewController class add an optional reference to treeNode:

class ViewController: UIViewController, ARSCNViewDelegate {
  var treeNode: SCNNode?
  ...
}

In the next step we’re going to want a reference to the treeNode again, so rather than doing the lookup every time, it is useful to cache the reference to the treeNode. To do this, I’ll modify the childNode call we just added inside of viewDidLoad so that it sets this to an instance variable treeNode as opposed to just the local treeNode variable:

let scene = SCNScene(named: "art.scnassets/Lowpoly_tree_sample.dae")!
self.treeNode = scene.rootNode.childNode(withName: "Tree_lp_11", recursively: true)
self.treeNode?.position = SCNVector3Make(0, 0, -1)

Although AR is not supported in the simulator, SceneKit is, and this is just a SceneKit model. So if you run the simulator you’ll find you can now see the tree model with a black background.

ARKit model rendering in iPhone Simulator

Using HitTest

Next, let’s modify the app so that when the user taps, it places a new copy of the tree model wherever they tapped. This is a complex operation under the hood, because a tap on the 2D screen needs to be projected based on the estimated shape of the 3D scene the camera is seeing. ARKit makes this complicated process fairly simple by providing the hit-test API.

So first, we implement an override of touchesBegan. This is called any time the user taps on the screen. From it we can retrieve the first touch and perform a hit test in the AR scene. We’ll look for results that are feature points: points in space that ARKit has identified and is tracking, in other words, points on the surface of some real-world object.

Once we get a result from the hitTest, we can position the treeNode to the same location as the anchor. The hit test comes back with a 4×4 matrix containing the scale, rotation, and position data. The 4th row of this matrix is the position. We can reconstruct the position using this row by accessing m41, m42, and m43 as x, y, and z respectively. Setting the position to a new SCNVector3 object with these coordinates should move the tree to that location.

override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
  guard let touch = touches.first else { return }
  let results = sceneView.hitTest(touch.location(in: sceneView), types: [ARHitTestResult.ResultType.featurePoint])
  guard let hitFeature = results.last else { return }
  let hitTransform = SCNMatrix4(hitFeature.worldTransform)
  let hitPosition = SCNVector3Make(hitTransform.m41,
                                   hitTransform.m42,
                                   hitTransform.m43)
  treeNode?.position = hitPosition
}

Give this a try and you’ll find you can teleport the tree to locations in the real-world when you tap!

Xcode 9 Beta 1 vs Xcode 9 Beta 2

Note: Some users have reported that the SCNMatrix4 initializer call causes a compiler error. This affects users on Xcode 9 Beta 1; the new SCNMatrix4 initializer was added in Xcode 9 Beta 2. If you run into this problem because you are still on Beta 1, just replace the SCNMatrix4 initializer with the SCNMatrix4FromMat4 function and this should fix the issue.
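
For clarity, here are both variants of that line; the second is only needed on Beta 1, shown commented out so the snippet stays valid as one block:

let hitTransform = SCNMatrix4(hitFeature.worldTransform)
// On Xcode 9 Beta 1, use the C-style conversion function instead:
// let hitTransform = SCNMatrix4FromMat4(hitFeature.worldTransform)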

Similarly, we can make clones of the tree as well so we can plant our forest 🙂

override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
  guard let touch = touches.first else { return }
  let results = sceneView.hitTest(touch.location(in: sceneView), types: [ARHitTestResult.ResultType.featurePoint])
  guard let hitFeature = results.last else { return }
  let hitTransform = SCNMatrix4(hitFeature.worldTransform)
  let hitPosition = SCNVector3Make(hitTransform.m41,
                                   hitTransform.m42,
                                   hitTransform.m43)
  let treeClone = treeNode!.clone()
  treeClone.position = hitPosition
  sceneView.scene.rootNode.addChildNode(treeClone)
}

From here you can imagine some of the interesting things that could be done by adding interactions, animations, and more to the scene. To proceed with your own project, the thing to learn next is how to work with SceneKit, as those fundamentals will apply to most of your ARKit-based apps.

You can find the complete source code to this tutorial on github here: https://github.com/jquave/ARTrees

Questions? Comments? Hate mail? Let me know what you thought of this tutorial on my contact page. I’ll be making the video version of this tutorial soon, so check back or subscribe on my YouTube channel to keep up to date.


Core NFC Tutorial for NFC on iOS Devices

With the release of iOS 11, third-party developers are for the first time able to use the NFC reader on the iPhone 7 and newer devices. This could be used for passing along identification information and a whole host of other data-exchange applications, from door locks to subway passes.

The technology used on iOS 11 is called Core NFC, and I’m going to go over how to use it in this tutorial in Swift 4.

Core NFC Device communications on iOS

Because Core NFC is currently read-only, functionality such as contactless payments will not be possible out of the box. There are, however, other applications we can use the Core NFC reading capabilities for. So let me show you how it works.

The first step to work with NFC is to enable it for your App ID in the Apple Developer Center. Create a new App ID and enable “NFC Tag Reading” as a capability for the app.

NFC Tag Reading App Service

After you do this, I recommend creating a development/distribution provisioning profile specifically for this app ID, so that the NFC reading capability will be present when you try to build.

Next, in your Xcode project add the entitlement to your projectName.entitlements file. You’ll need to right click the file and select “Open As Source Code” to manually enter this key as shown:

<key>com.apple.developer.nfc.readersession.formats</key>
<array>
  <string>NDEF</string>
</array>

If you do not have an entitlements file, you can manually create one and point to it in the settings of the project. Under “Build Settings” look for the record “Code Signing Entitlements”, and punch in the relative path to your entitlements file. In my case it’s “CoreNFC-Tutorial/CoreNFC-Tutorial.entitlements” because my project files are inside a sub-folder called “CoreNFC-Tutorial”.

Next, we need to add the usage string to your Xcode project. Open your Info.plist file and add a new record; start typing and allow it to autocomplete “Privacy – NFC Scan Usage Description”. This message will be shown to users when NFC is used, so for the value enter something useful for the user, such as “NFC is needed to unlock doors.”.
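
If you prefer to edit the plist as source code, the same way we edited the entitlements file, the raw key behind that autocompleted entry is NFCReaderUsageDescription:

<key>NFCReaderUsageDescription</key>
<string>NFC is needed to unlock doors.</string>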

Next, in our code we want to import the CoreNFC module.

import CoreNFC

Note: Core NFC is completely unavailable on the iOS simulator, and even importing the module will fail. So Core NFC is for device-only testing!
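
If you still want the project to build for the simulator while working on other parts of the app, one workaround (a sketch, not something this sample project strictly needs) is to guard the import and any NFC code behind an architecture check, since the simulator runs on x86 while devices run ARM:

#if !(arch(i386) || arch(x86_64))
import CoreNFC
#endif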

I created an NFCHelper.swift file to store all my NFC-related calls, and put everything in an NFCHelper class. In the init method of the class I created a session. Core NFC requires that you use the NFCNDEFReaderSession class in order to start listening for NFC communications. (Note that the NFCReaderSession class is abstract and should not be used directly.)

class NFCHelper {
  init() {
    let session =
      NFCNDEFReaderSession(delegate: self,
                           queue: nil,
                           invalidateAfterFirstRead: true)
    session.begin()
  }
}

Here we create the session, and pass in a nil dispatch queue. Doing so will cause NFCNDEFReaderSession to automatically create a serial dispatch queue.

When creating a session, we specify a delegate for our NFCNDEFReaderSession instance. I would like to use the NFCHelper class as the delegate, so we must first adopt the delegate protocol, NFCNDEFReaderSessionDelegate. This protocol is based on Objective-C, so NFCHelper must also inherit from NSObject. NFCNDEFReaderSessionDelegate has two delegate methods we must implement:

func readerSession(_ session: NFCNDEFReaderSession, didInvalidateWithError error: Error)
 
func readerSession(_ session: NFCNDEFReaderSession, didDetectNDEFs messages: [NFCNDEFMessage])

These two callbacks are called when an NFC session is invalidated with an error, or when NFC activity is detected. How you use the messages will depend on your specific application, but everything you need to know is delivered through the didDetectNDEFs callback in the form of records inside of messages. To get started, you can log the contents of each message in a loop. These are NFCNDEFPayload objects, which contain an identifier, payload, type, and typeNameFormat.

func readerSession(_ session: NFCNDEFReaderSession, didDetectNDEFs messages: [NFCNDEFMessage]) {
  print("Did detect NDEFs.")
  // Loop through all messages
  for message in messages {
    for record in message.records {
      print(record.identifier)
      print(record.payload)
      print(record.type)
      print(record.typeNameFormat)
    }
  }
}

To clean this up a little so I can integrate with the front end of the app, I created a callback specific to my application; you may want to do something similar. I added a callback variable that the implementing view can work with, and I call it whenever I get a payload from an NFC tag, or an error:

class NFCHelper: NSObject, NFCNDEFReaderSessionDelegate {
  ...
  var onNFCResult: ((Bool, String) -> ())?
  ...
}

I also broke out my init function to open up a session using a method, so I can restart the session from a button on my ViewController. My final code for my NFCHelper.swift file is as follows:

//
//  NFCHelper.swift
//  CoreNFC-Tutorial
//
//  Created by Jameson Quave on 6/6/17.
//  Copyright © 2017 Jameson Quave. All rights reserved.
//
 
import Foundation
import CoreNFC
 
class NFCHelper: NSObject, NFCNDEFReaderSessionDelegate {
 
  var onNFCResult: ((Bool, String) -> ())?
 
  func restartSession() {
    let session =
      NFCNDEFReaderSession(delegate: self,
                           queue: nil,
                           invalidateAfterFirstRead: true)
    session.begin()
  }
  // MARK: NFCNDEFReaderSessionDelegate
  func readerSession(_ session: NFCNDEFReaderSession, didInvalidateWithError error: Error) {
    guard let onNFCResult = onNFCResult else {
      return
    }
    onNFCResult(false, error.localizedDescription)
  }
  func readerSession(_ session: NFCNDEFReaderSession, didDetectNDEFs messages: [NFCNDEFMessage]) {
    guard let onNFCResult = onNFCResult else {
      return
    }
    for message in messages {
      for record in message.records {
        if(record.payload.count > 0) {
          if let payloadString = String.init(data: record.payload, encoding: .utf8) {
              onNFCResult(true, payloadString)
          }
        }
      }
    }
  }
 
}

I also set up a simple UI in my ViewController to demonstrate usage of this class:

//
//  ViewController.swift
//  CoreNFC-Tutorial
//
//  Created by Jameson Quave on 6/6/17.
//  Copyright © 2017 Jameson Quave. All rights reserved.
//
 
import UIKit
 
class ViewController: UIViewController {
  var helper: NFCHelper?
  var payloadLabel: UILabel!
  var payloadText = ""
  override func viewDidLoad() {
    super.viewDidLoad()
    // Add a detect button
    let button = UIButton(type: .system)
    button.setTitle("Read NFC", for: .normal)
    button.titleLabel?.font = UIFont(name: "Helvetica", size: 28.0)
    button.isEnabled = true
    button.addTarget(self, action: #selector(didTapReadNFC), for: .touchUpInside)
    button.frame = CGRect(x: 60, y: 200, width: self.view.bounds.width - 120, height: 80)
    self.view.addSubview(button)
    // Add a label to display the payload in
    payloadLabel = UILabel(frame: button.frame.offsetBy(dx: 0, dy: 220))
    payloadLabel.text = "Press Read to see payload data."
    payloadLabel.numberOfLines = 100
    self.view.addSubview(payloadLabel)
  }
  // Called by NFCHelper when a tag is read, or fails to read
  func onNFCResult(success: Bool, message: String) {
    if success {
      payloadText = "\(payloadText)\n\(message)"
    }
    else {
      payloadText = "\(payloadText)\n\(message)"
    }
    // Update UI on main thread
    DispatchQueue.main.async {
      self.payloadLabel.text = self.payloadText
    }
 
  }
  // Called when user taps Read NFC Button
  @objc func didTapReadNFC() {
    if helper == nil {
      helper = NFCHelper()
      helper?.onNFCResult = self.onNFCResult(success:message:)
    }
    payloadText = ""
    helper?.restartSession()
  }
}

From here you can integrate the rest of your app with your intended use-case. Whether it’s to identify visitors to an event, check out the stats on your Amiibo, or even process payments, the Core NFC API from Apple has finally opened up the possibilities for NFC integration on these new devices. What kind of NFC product are you working on? Let me know at jquave@gmail.com.

Full Source Code

Video Tutorial (YouTube)


ARKit on iOS 11

ARKit is the “largest AR platform in the world” according to Apple’s latest keynote address from WWDC 2017. So what can it do? Well, as we saw in the demos, ARKit enables surface tracking and depth perception, and integrates with existing libraries from the game development world. Note: If you’re here to learn how to make ARKit apps, I’ve got a detailed tutorial over here: ARKit Tutorial.

Most significantly, Apple showed that both the Unreal and Unity game engines will integrate with ARKit to support Pokemon Go-esque apps. Multiple objects can interact with each other, casting shadows on one another and even reacting with physics. Some say this development means Apple will soon be releasing a HoloLens-style device. I personally think they just want to enable better experiences like what Snapchat offers today.

On the other hand, ARKit will also give developers of photo editing apps a better way to dive deep into the contents of users’ photos in order to enable new and improved photo editing experiences. Deep-learning-powered photo editing will be sure to make mobile photo (and video) editing a much better experience.

ARKit has support for:

  • Fast, stable motion tracking
  • Plane estimation with basic boundaries
  • Ambient lighting estimation
  • Scale estimation
  • Support for Unity, Unreal, SceneKit
  • Xcode app templates


Core ML for iOS Apps in iOS 11

Core ML, announced at WWDC 2017, is a new set of APIs built by Apple for devices running iOS 11 or higher. With Core ML, developers can incorporate machine learning models into their mobile apps and have the inference accelerated using the Metal APIs. This means the processing of models will be significantly faster than using other systems such as TensorFlow or Caffe2.

Notably during the WWDC keynote, it was mentioned that Core ML will support models such as those coming from Keras or Caffe. So many existing models can be converted to work on-device with the new acceleration.

Core ML supports the following key features:
– Deep neural networks
– Recurrent neural networks
– Convolutional neural networks
– Support vector machines
– Tree ensembles
– Linear models

This is all done with on-device processing, and supports iOS, macOS, watchOS, and tvOS.

Under the Core ML banner are a Model Converter, a Natural Language API, and a Vision API.

Once the APIs are made public, I’ll collect the references here for the official Core ML documentation. Follow along with me and let’s start learning!

For now, we know that the APIs will support Face tracking, Face detection, Landmarks, Text detection, Rectangle detection, Barcode detection, Object tracking, and Image registration.


Caffe2 on iOS – Deep Learning Tutorial

At this year’s F8 conference, Facebook’s annual developer event, Facebook announced Caffe2 in collaboration with Nvidia. This framework gives developers yet another tool for building deep learning networks for machine learning. But I am super pumped about this one, because it is specifically designed to operate on mobile devices! So I couldn’t resist digging in immediately.

I’m still learning, but I want to share my journey in working with Caffe2. So, in this tutorial I’m going to show you step by step how to take advantage of Caffe2 to start embedding deep learning capabilities into your iOS apps. Sound interesting? Thought so… let’s roll 🙂

Building Caffe2 for iOS

The first step here is to just get Caffe2 built. Mostly their instructions are adequate, so I won’t repeat too much of them here. You can learn how to build Caffe2 for iOS here.

The last step of their iOS install process is to run build_ios.sh, but that’s about where the instructions leave off. So from here, let’s take a look at the build artifacts. The core library for Caffe2 on iOS is located inside the caffe2 folder:

  • caffe2/libCaffe2_CPU.a

And in the root folder:

  • libCAFFE2_NNPACK.a
  • libCAFFE2_PTHREADPOOL.a

NNPACK is sort of like cuDNN for mobile, in that it accelerates neural network operations on mobile CPUs. pthreadpool is a thread pool library.

Create an Xcode project

Now that the library is built, I created a new iOS app project in Xcode with a single-view template. From here I drag and drop the libCaffe2_CPU.a file into my project hierarchy along with the other two libs, libCAFFE2_NNPACK.a and libCAFFE2_PTHREADPOOL.a. Select ‘Copy’ when prompted. The file is located at caffe2/build_ios/caffe2/libCaffe2_CPU.a. This pulls a copy of the library into my project and tells Xcode I want to link against the library. We need to do the same thing with protobuf, which is located at caffe2/build_ios/third_party/protobuf/cmake/libprotobuf.a.

In my case I wanted to also include OpenCV2, which has its own setup requirements. You can learn how to install OpenCV2 on their site. The main problem I ran into with OpenCV2 was figuring out that I needed to create a Prefix.h file, and then, in the project settings, set the Prefix Header to MyAppsName/Prefix.h. In my example project I called the project DayMaker, so for me it was DayMaker/Prefix.h. Then I could put the following in the Prefix.h file so that OpenCV2 would get included before any Apple headers:

#ifdef __cplusplus
    #import <opencv2/opencv.hpp>
    #import <opencv2/stitching/detail/blenders.hpp>
    #import <opencv2/stitching/detail/exposure_compensate.hpp>
#endif

Prefix Headers for Caffe2 in Xcode

Include the Caffe2 headers

In order to actually use the library, we’ll need to pull in the right headers. Assuming you have a directory structure where your caffe2 files are one level above your project (I cloned caffe2 into ~/Code/caffe2 and set up my project in ~/Code/DayMaker), you’ll need to add the following User Header Search Paths in your project settings:

$(SRCROOT)/../caffe2
$(SRCROOT)/../caffe2/build_ios

You’ll also need to add the following to “Header Search Paths”:

$(SRCROOT)/../caffe2/build_host_protoc/include
$(SRCROOT)/../caffe2/third_party/eigen

Now you can also try importing some Caffe2 C++ headers to confirm it’s all working as expected. I created a new Objective-C class to wrap the Caffe2 C++ API. To follow along, create a new Objective-C class called Caffe2. Then rename the Caffe2.m file it creates to Caffe2.mm. This causes the compiler to treat it as Objective-C++ instead of plain Objective-C, a requirement for making this all work.

Next, I added some Caffe2 headers to the .mm file. At this point this is my entire Caffe2.mm file:

#import "caffe2/core/context.h"
#import "caffe2/core/operator.h"
#import "Caffe2.h"
 
@implementation Caffe2
 
@end

According to this GitHub issue, a reasonable place to start with a C++ interface to the Caffe2 library is the standalone predictor_verifier.cc app. So let’s expand the Caffe2.mm file to include some of this stuff and see if everything works on-device.

With a few tweaks we can make a class that loads up the Caffe2 environment and loads in a set of init/predict net files. I’ll pull in the files from SqueezeNet on the Model Zoo. Copy these into the project hierarchy, and we’ll load them up just like any iOS binary asset…

//
//  Caffe2.mm
//  DayMaker
//
//  Created by Jameson Quave on 4/22/17.
//  Copyright © 2017 Jameson Quave. All rights reserved.
//
 
#import "Caffe2.h"
 
// Caffe2 Headers
#include "caffe2/core/flags.h"
#include "caffe2/core/init.h"
#include "caffe2/core/predictor.h"
#include "caffe2/utils/proto_utils.h"
 
// OpenCV
#import <opencv2/opencv.hpp>
 
namespace caffe2 {
    void run(const string& net_path, const string& predict_net_path) {
        caffe2::NetDef init_net, predict_net;
        CAFFE_ENFORCE(ReadProtoFromFile(net_path, &init_net));
        CAFFE_ENFORCE(ReadProtoFromFile(predict_net_path, &predict_net));
 
        // Can be large due to constant fills
        VLOG(1) << "Init net: " << ProtoDebugString(init_net);
        LOG(INFO) << "Predict net: " << ProtoDebugString(predict_net);
        auto predictor = caffe2::make_unique<Predictor>(init_net, predict_net);
        LOG(INFO) << "Checking that a null forward-pass works";
        Predictor::TensorVector inputVec, outputVec;
        predictor->run(inputVec, &outputVec);
        NSLog(@"outputVec size: %lu", outputVec.size());
        NSLog(@"Done running caffe2");
    }
}
 
@implementation Caffe2
 
- (instancetype) init {
    self = [super init];
    if(self != nil) {
        [self initCaffe];
    }
    return self;
}
 
- (void) initCaffe {
    int argc = 0;
    char** argv;
    caffe2::GlobalInit(&argc, &argv);
 
    NSString *net_path = [NSBundle.mainBundle pathForResource:@"exec_net" ofType:@"pb"];
    NSString *predict_net_path = [NSBundle.mainBundle pathForResource:@"predict_net" ofType:@"pb"];
 
    caffe2::run([net_path UTF8String], [predict_net_path UTF8String]);
    // This is to allow us to use memory leak checks.
    google::protobuf::ShutdownProtobufLibrary();
}
 
@end

Next, we can just instantiate this from the AppDelegate to test it out… (Note: you’ll need to import Caffe2.h in your bridging header if you’re using Swift, like me.)

#import "Caffe2.h"

In AppDelegate.swift:

func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
 
    // Instantiate caffe2 wrapper instance
    let caffe2 = Caffe2()
 
    return true
}

This for me produced some linker errors from clang:

[F operator.h:469] You might have made a build error: the Caffe2 library does not seem to be linked with whole-static library option. To do so, use -Wl,-force_load (clang) or -Wl,--whole-archive (gcc) to link the Caffe2 library.

Adding -force_load DayMaker/libCaffe2_CPU.a as an additional linker flag corrected this issue, but then it presented another issue: not being able to find OpenCV. The DayMaker part will be your project name, or just whatever folder your libCaffe2_CPU.a file is located in. This will show up as two flags; just make sure they’re in the right order and Xcode should perform the right concatenation of the flags.

Linker flags

Building and running the app crashes immediately with this output:

libc++abi.dylib: terminating with uncaught exception of type caffe2::EnforceNotMet: [enforce fail at conv_op_impl.h:24] X.ndim() == filter.ndim(). 1 vs 4 Error from operator:
input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" type: "Conv" arg { name: "stride" i: 2 } arg { name: "pad" i: 0 } arg { name: "kernel" i: 3 }

Success! I mean, it doesn’t look like success just yet, but this is an error coming from Caffe2. The issue here is just that we never set anything for the input. So let’s fix that by providing data from an image.

Caffe2 on an iOS device

Loading up some image data

Here you can add a cat jpg to the project or some similar image to work with, and load it in:

UIImage *image = [UIImage imageNamed:@"cat.jpg"];

I refactored this a bit and moved my logic out into a predictWithImage method, as well as creating the predictor in a separate function:

namespace caffe2 {
 
    void LoadPBFile(NSString *filePath, caffe2::NetDef *net) {
        NSURL *netURL = [NSURL fileURLWithPath:filePath];
        NSData *data = [NSData dataWithContentsOfURL:netURL];
        const void *buffer = [data bytes];
        int len = (int)[data length];
        CAFFE_ENFORCE(net->ParseFromArray(buffer, len));
    }
 
    Predictor *getPredictor(NSString *init_net_path, NSString *predict_net_path) {
        caffe2::NetDef init_net, predict_net;
        LoadPBFile(init_net_path, &init_net);
        LoadPBFile(predict_net_path, &predict_net);
        auto predictor = new caffe2::Predictor(init_net, predict_net);
        init_net.set_name("InitNet");
        predict_net.set_name("PredictNet");
        return predictor;
    }
}

The predictWithImage method uses OpenCV to get the BGR data from the image, then I load that into Caffe2 as the input vector. Most of the work here is actually done in OpenCV with the cvtColor line…

- (NSString*)predictWithImage: (UIImage *)image predictor:(caffe2::Predictor *)predictor {
    cv::Mat src_img, bgr_img;
    UIImageToMat(image, src_img);
    // needs to convert to BGR because the image loaded from UIImage is in RGBA
    cv::cvtColor(src_img, bgr_img, CV_RGBA2BGR);
 
    size_t height = CGImageGetHeight(image.CGImage);
    size_t width = CGImageGetWidth(image.CGImage);
 
    caffe2::TensorCPU input;
 
    // Reasonable dimensions to feed the predictor.
    const int predHeight = 256;
    const int predWidth = 256;
    const int crops = 1;
    const int channels = 3;
    const int size = predHeight * predWidth;
    const float hscale = ((float)height) / predHeight;
    const float wscale = ((float)width) / predWidth;
    const float scale = std::min(hscale, wscale);
    std::vector<float> inputPlanar(crops * channels * predHeight * predWidth);
 
    // Scale down the input to a reasonable predictor size.
    for (auto i = 0; i < predHeight; ++i) {
        const int _i = (int) (scale * i);
        printf("+\n");
        for (auto j = 0; j < predWidth; ++j) {
            const int _j = (int) (scale * j);
            inputPlanar[i * predWidth + j + 0 * size] = (float) bgr_img.data[(_i * width + _j) * 3 + 0];
            inputPlanar[i * predWidth + j + 1 * size] = (float) bgr_img.data[(_i * width + _j) * 3 + 1];
            inputPlanar[i * predWidth + j + 2 * size] = (float) bgr_img.data[(_i * width + _j) * 3 + 2];
        }
    }
 
    input.Resize(std::vector<int>({crops, channels, predHeight, predWidth}));
    input.ShareExternalPointer(inputPlanar.data());
 
    caffe2::Predictor::TensorVector input_vec{&input};
    caffe2::Predictor::TensorVector output_vec;
    predictor->run(input_vec, &output_vec);
 
    float max_value = 0;
    int best_match_index = -1;
    for (auto output : output_vec) {
        for (auto i = 0; i < output->size(); ++i) {
            float val = output->template data<float>()[i];
            if(val > 0.001) {
                printf("%i: %s : %f\n", i, imagenet_classes[i], val);
                if(val>max_value) {
                    max_value = val;
                    best_match_index = i;
                }
            }
        }
    }
 
    return [NSString stringWithUTF8String: imagenet_classes[best_match_index]];
}

The imagenet_classes are defined in a new file, classes.h. It’s just a copy from the Android example repo here.

Most of this logic was pulled and modified from bwasti’s github repo for the Android example.

With these changes I was able to simplify the initCaffe method as well:

- (void) initCaffe {
 
    NSString *init_net_path = [NSBundle.mainBundle pathForResource:@"exec_net" ofType:@"pb"];
    NSString *predict_net_path = [NSBundle.mainBundle pathForResource:@"predict_net" ofType:@"pb"];
 
    caffe2::Predictor *predictor = caffe2::getPredictor(init_net_path, predict_net_path);
 
    UIImage *image = [UIImage imageNamed:@"cat.jpg"];
    NSString *label = [self predictWithImage:image predictor:predictor];
    NSLog(@"Identified: %@", label);
 
    // This is to allow us to use memory leak checks.
    google::protobuf::ShutdownProtobufLibrary();
}

So you’ll notice I’m pulling in the cat.jpg here. I used this cat pic:

Cat

The output when running on iPhone 7:

Identified: tabby, tabby cat

Hooray! It works on a device!

I’m going to keep working on this and publishing what I learn. If that sounds like something you want to follow along with you can get new posts in your email, just join my mobile development newsletter. I’ll never spam you, just keep you up-to-date with deep learning and my own work on the topic.

Thanks for reading! Leave a comment or contact me if you have any feedback 🙂

Side-note: Compiling on Mac OS Sierra with CUDA

When compiling for Sierra as a target (not the iOS build script, just running make) I ran into a problem in protobuf that is related to this issue. This will only be a problem if you are building against CUDA. I suppose it’s somewhat unusual to do so, because most Mac computers do not have NVIDIA chips in them, but in my case I have a 2013 MBP with an NVIDIA chip that I can use CUDA with.

To resolve the problem in the most hacky way possible, I applied the changes found in that issue’s pull request. Just updating protobuf to the latest version by building from source would probably also work… but this just seemed faster. I opened my own copy of the file at /usr/local/Cellar/protobuf/3.2.0_1/include/google/protobuf/stubs/atomicops.h and manually commented out lines 198 through 205:

// Apple.
/*
#elif defined(GOOGLE_PROTOBUF_OS_APPLE)
#if __has_feature(cxx_atomic) || _GNUC_VER >= 407
#include <google/protobuf/stubs/atomicops_internals_generic_c11_atomic.h>
#else  // __has_feature(cxx_atomic) || _GNUC_VER >= 407
#include <google/protobuf/stubs/atomicops_internals_macosx.h>
#endif  // __has_feature(cxx_atomic) || _GNUC_VER >= 407
*/

I’m not sure what the implications of this are, but it seems to be what they did in the official repo, so it must not do much harm. With this change I’m able to make the Caffe2 project with CUDA support enabled. In the official version of protobuf used by tensorflow, you can see this bit is actually just removed, so it seems to be the right thing to do until protobuf v3.2.1 is released, where this is fixed using the same approach.



Swift 3 Tutorial – Fundamentals

In this Swift 3 tutorial, we’ll focus on how a complete beginner can work up to a basic grasp of Swift, and we’ll be working with Swift 3. We chose to write this tutorial because newcomers will find many tutorials out there that are out of date, so it’s not appropriate to simply write a “what’s new in Swift 3” post. Instead, we’ll approach this from the perspective of someone who has some programming experience in at least one other language, and we’ll teach you what’s unique about Swift and how to use it effectively. Ready? Let’s go!


Constants and Variables

Any Swift variable is either a constant or not. Not to be confused with the type of the variable, such as Float or Int, this just describes mutability: variables hold a value that can change (they are mutable), while constants cannot be changed once set (because they are immutable).

To define a constant, use the let keyword.

Example:

let name = "Jameson"

If you were to try and change the value of name, you would be unable to do so, and the Swift compiler would produce an error.

let name = "Jameson"
name = "Bob"
error: cannot assign to value: 'name' is a 'let' constant
name = "Bob"
~~~~ ^

On the other hand, by using the var keyword, we define a variable that can change:

var name = "Jameson"
name = "Bob"

This code does not produce an error.

In general, you should always default to using the let keyword unless you know you need a var. This leads to code that is ultimately safer. If you define a constant and later attempt to modify it, you will get an error, and at that point you can decide whether to switch to the var keyword, or whether the error is a hint that you should rethink the current logic flow. In general, immutability is preferred over mutability; it simply leads to fewer programmer errors and makes it easier to reason about your code.


Basic Types

In Swift, a type is indicated by declaring a variable, then putting a colon, followed by the type name. For example, to declare an integer, which is of Swift type Int, you could use the following:

let age: Int = 5

Or similarly, if you want to declare a string:

let name: String = "Jameson"

Swift supports type inference, so you can usually omit the type annotation and let the compiler determine what the type should be based on the initial value.

let age = 5
let name = "Jameson"

The types for age and name are still Int and String respectively, but we can skip the type annotation, because it is obvious that 5 is an Int, and "Jameson" is a String.

Remember, the let keyword simply makes these values constant. If we expected the value of age to change, but not name, we might write these declarations like so:

var age = 5
let name = "Jameson"

Now if we want to update the age value, it’s possible to do:

var age = 5
let name = "Jameson"
age = 25
print(age)
25

Working with Strings

It’s frequently useful to print a message to the console, or otherwise build a String containing other variables. For example, I might want to form a sentence with my variables age and name and print it to the console. I can do this using the + operator between each String.

let age = "15"
let name = "Robb"
 
let sentence = name + " is " + age
print(sentence)
Robb is 15

A shortcut for this is to write your String as you normally would without the + operator separating each string, and put each variable inside of a set of parentheses, preceded by a backslash \.

let sentence = "\(name) is \(age)"
print(sentence)
Robb is 15

As you can see this has the same effect, but is much easier to read and compose sentences.

One thing you may have noticed is that age is now of type String, because it was assigned the value "15" instead of just 15 without the quotes. This is because concatenating a String and an Int will not automatically convert the Int to a String, which is a necessary step before concatenation is possible.

Or in other words, this code will produce an error:

let age = 15
let name = "Robb"
 
let sentence = name + " is " + age
print(sentence)
Error: Binary operator '+' cannot be applied to operands of type 'String' and 'Int'

So what we have to do is convert age to a String before using it here. This is done by casting the variable: simply call the String initializer and pass in the Int value:

let age = 15
let name = "Robb"
 
let stringAge = String(age)
 
let sentence = name + " is " + stringAge
print(sentence)
Robb is 15

We created a new variable called stringAge here, but we also could have performed the cast in place, because each expression is evaluated separately; the same goes for the contents of the parentheses when interpolating strings that way:

let age = 15
let name = "Robb"
 
let sentence = name + " is " + String(age)
print(sentence)
print("\(name) enjoys being \(String(age))")
Robb is 15
Robb enjoys being 15

Optionals

In Swift, there is also the concept of the optional. An optional is just a variable that can be nil, that is, not set to any value. In general, you can think of any variable in most other programming languages as being an optional. The “optionality” of a variable is declared by appending a question mark (?) onto the end of the type name in a type annotation. So, continuing the example above, where we know age and name will always be set, we might want to add another variable that could be nil, because it is possible that it just isn’t present. Let’s use favoriteColor as an example. Many people have a favorite color, but it’s possible someone doesn’t, or we just don’t have the data. We would declare this variable as an optional, and not assign it to any value.

var favoriteColor: String?

Implicit in the declaration of an optional with no value set, is the assignment to nil. We can verify this by examining the value of favoriteColor after declaring it as an optional by printing it to the console using the print() function.

var favoriteColor: String?
print(favoriteColor)
nil

We can later assign something to favoriteColor and see that it is no longer nil.

var favoriteColor: String?
favoriteColor = "Blue"
print(favoriteColor)
Optional("Blue")

Note that instead of just getting the string "Blue", we get Optional("Blue"). This is because the value is still “wrapped” inside of the optional.

You can think of optionals like a birthday gift. The box the gift comes in, wrapped up in some paper with pretty balloons on it, could actually be empty. A rather cruel gift to give someone on their birthday, but it’s possible to do. It could also have something inside of it. But either way, if we pick it up and look at it, what we have in our hands is not the thing inside of it, but just the wrapped box itself.

If we want to get at the thing inside, we need to unwrap the gift first. This is the same way optionals work. When we pass around optional variables and interact with them, we’re really working with a container that may or may not have anything inside of it. Similar to our gift, the optional must be unwrapped before it can be used.

Declaring our optional with no value is valid Swift and will compile just fine. But, if we tried to declare this variable without the optional syntax, it would present an error.

There are also variables in Swift that are not optional. They always have a value. If you tried to assign nil to a non-optional variable, you will get a compiler error:

var favoriteColor = "Blue"
favoriteColor = nil
error: nil cannot be assigned to type 'String'

Similarly, non-optional values can not be assigned to nil during their declaration:

var favoriteColor: String
error: variables must have an initial value

Unwrapping Optionals

So we know what optionals are, and that they allow for a variable to be nil, and we know that optionals are containers rather than values themselves. So, in our programs, when we want to access the contents of an optional, how do we do it? There are several ways, so let’s go over them now.

First, and most commonly, you will unwrap optionals using optional binding. In optional binding, you will assign a new variable to the value of an optional within an if statement. If the optional contains a value, this variable will be set, and the code block following this statement will be executed.

Let’s look at an example. Here we will declare two optionals, one called favoriteAnimal, which is set to Fox, and one called favoriteSong, which we will not set (it will remain nil).

var favoriteAnimal: String?
var favoriteSong: String?
 
favoriteAnimal = "Fox"

Let’s employ optional binding to discover if each variable is set, and if so we’ll print a sentence containing the value to the console. First we’ll do it with favoriteAnimal.

if let unwrappedFavoriteAnimal = favoriteAnimal {
    print("Favorite animal is: " + unwrappedFavoriteAnimal)
}
Favorite animal is: Fox

In the event that the value is not set, we simply will trigger whatever is in the else block, or nothing at all if an else block isn’t specified.

if let unwrappedFavoriteSong = favoriteSong {
    print("Favorite song is: " + unwrappedFavoriteSong)
}
else {
    print("I don't know what your favorite song is!")
}
I don't know what your favorite song is!

If we need to unwrap multiple optionals, and we require all of them to proceed with a bit of logic, we need to check each one:

var favoriteAnimal: String?
var favoriteSong: String?
 
favoriteAnimal = "Fox"
favoriteSong = "Shake it Off"
 
if let unwrappedFavoriteSong = favoriteSong {
    if let unwrappedFavoriteAnimal = favoriteAnimal {
        print(unwrappedFavoriteSong + " " + unwrappedFavoriteAnimal)
    }
}

This gets kind of messy kind of fast, so Swift offers a shortcut to unwrap multiple variables at once:

var favoriteAnimal: String?
var favoriteSong: String?
 
favoriteAnimal = "Fox"
favoriteSong = "Shake it Off"
 
if let unwrappedFavoriteSong = favoriteSong,
    let unwrappedFavoriteAnimal = favoriteAnimal {
    print(unwrappedFavoriteSong + " " + unwrappedFavoriteAnimal)
}

Collections

Swift has multiple collection types, the most common of which being arrays, sets, and dictionaries.

Array

Let’s take a look at an example of an Array

let starks: [String] = ["Eddard", "Catelyn", "Robb", "Sansa"]

Here we have a basic Array, which is of type [String]. The square brackets indicate that this is an array of String objects, rather than just being a single String. As usual, Swift can infer this type data too, just by examining the initial assignment:

let starks = ["Robb", "Sansa", "Arya", "Jon"]

We can access elements of this array in a variety of ways, such as using an Int index or calling the various collection methods.

let starks = ["Robb", "Sansa", "Arya", "Jon"]
 
print( starks[0] )
print( starks[2] )
print( starks.first! )
Robb
Arya
Robb

You’ll note that arrays are zero-indexed, so the first element in the array "Robb" is accessed using starks[0]

Additionally, you may notice that while the first method returns an optional (and therefore is being force-unwrapped with the ! symbol), the indexed accessor does not return an optional. If you try to access an index in an array that is not present, your program will fail at runtime! So always check the length of arrays when accessing by index:

if starks.count >= 4 {
    print( starks[3] )
}

There are ways to make this type of checking automated, but it is not done by default for performance reasons.
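
For example, one way to automate it yourself (a hypothetical helper, not part of the standard library) is to check the array’s indices before subscripting:

// A small helper that returns nil instead of crashing on an out-of-bounds index
extension Array {
    func element(at index: Int) -> Element? {
        return indices.contains(index) ? self[index] : nil
    }
}
 
let starks = ["Robb", "Sansa", "Arya", "Jon"]
print(starks.element(at: 3) ?? "no such index")
print(starks.element(at: 9) ?? "no such index")
Jon
no such index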

Hashable Types / Dictionary

Dictionaries are able to store values based on a key; typically the key is of type String, but it can be any hashable Swift type. In this example we create a basic Dictionary with String keys and Int values for the age of each person:

let ages = ["Robb": 15, "Sansa": 12, "Arya": 10, "Jon": 15]

We can access these values by their String keys:

print( ages["Arya"]! )
print( ages["Jon"]! )
10
15

Note that we’re force-unwrapping these because they are optional values, and could potentially be nil. It is generally safer to use optional binding to unwrap the value from a Dictionary, especially if you have reason to believe the value could often be nil.

if let aryasAge = ages["Arya"] {
    print("Arya is \(aryasAge) years old")
}
Arya is 10 years old

We can also store arrays inside of dictionaries, or dictionaries inside of arrays, or a mix of both.

let families = [
    "Stark": ["Robb": 15, "Sansa": 12, "Arya": 10, "Jon": 15],
    "Baratheon": ["Joffrey": 13, "Tommen": 8]
]
let tommensAge = families["Baratheon"]!["Tommen"]!
print("Tommen is \(tommensAge) years old")
Tommen is 8 years old

The type of families here would be [String: [String: Int]].
Or in other words, it is a dictionary with String keys whose values are of type [String: Int], another dictionary with String keys and Int values.

Set

A Swift 3 Set is similar to an Array, except the values in a Set are unique and unordered.

Initializing a Set looks almost exactly like initializing an Array; the only difference is the type:

let colors: Set<String> = ["Blue", "Red", "Orange", "Blue"]

This code creates a new Set of String values. The angle brackets < and > are used to indicate Swift generic type parameters, such as the element type of a Set as shown here.

You’ll notice we included "Blue" twice in our list, but if we print out the contents of colors, we only see it once:

let colors: Set<String> = ["Blue", "Red", "Orange", "Blue"]
print(colors)
["Orange", "Red", "Blue"]

You may also notice that the ordering is inconsistent. Sets do not maintain any particular order for their contents.

We can not access members of a Set using indexes as we can with arrays, but instead we use the methods built-in to the Set type to add and remove objects. We can also call the contains method to check if the Set includes something.

var colors: Set<String> = ["Blue", "Red", "Orange", "Blue"]
colors.insert("Black")
colors.insert("Black")
colors.remove("Red")
print(colors)
print(colors.contains("Black"))
print(colors.contains("Red"))
["Black", "Orange", "Blue"]
true
false

Constructing sets of objects is a common way to catalogue what is included or excluded in a list of things, as long as there is no need to order or have duplicates of the objects.

There are many methods I have not mentioned, and I would encourage you to read through Apple’s documentation on each of these three classes to further familiarize yourself with them.

Tuples

Tuples are not technically a collection, but rather several values that can be passed around with a single identifier.

let fullName = ("Jameson", "Quave")

The type of the tuple here is (String, String), and we can manually access each numbered tuple element using dot-syntax, followed by the index:

let fullName = ("Jameson", "Quave")
print(fullName.1)
print(fullName.0)
Quave
Jameson

Tuples can also be deconstructed into new variable names:

let (first, last) = ("Jameson", "Quave")
print(first)
Jameson

Since we’re not using the last name here, we could just ignore that value by using an underscore _ and still deconstruct the first name:

let (first, _) = ("Jameson", "Quave")
print(first)
Jameson

Tuples are useful when you have a method that you want to return multiple values.
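
For example, a function can return a pair of related values as a single named tuple (a quick illustrative sketch):

// Returns both the youngest and oldest age from a list as one tuple
func ageRange(of ages: [Int]) -> (youngest: Int, oldest: Int) {
    return (ages.min() ?? 0, ages.max() ?? 0)
}
 
let (youngest, oldest) = ageRange(of: [15, 12, 10, 15])
print("Youngest: \(youngest), oldest: \(oldest)")
Youngest: 10, oldest: 15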

Control Flow

Control flow in Swift looks pretty similar to other languages. At the most basic level we have the if and else statements.

if 10 > 5 {
  print("10 is greater than 5.")
}
else {
    print("10 is not greater than five.")
}
10 is greater than 5.

You can alternatively put the condition for an if statement in parentheses:

if (10 > 5) {
...

Swift also supports the switch statement, and checks at compile time whether or not you have exhaustively covered all possibilities. If you do not (or cannot) specifically handle all possibilities, you can use the default: case to handle everything not explicitly matched.

let name = "Jameson"
switch(name) {
case "Joe":
  print("Name is Joe")
case "Jameson":
  print("This is Jameson")
default:
  print("I don't know of this person!")
}
This is Jameson

Here, because the value of name is "Jameson", we match the second case and execute the line

...
  print("This is Jameson")
...

If however we set the name to be something not present in our list of cases, such as "Jason", the switch would fall back to the default case.

let name = "Jason"
switch(name) {
case "Joe":
  print("Name is Joe")
case "Jameson":
  print("This is Jameson")
default:
  print("I don't know of this person!")
}
I don't know of this person!

Loops and Collections

Swift 3 does not support the classic C-style for loop you may be used to, and instead opts for enumeration and for-each style loops using the for element in array syntax.

For example, if we have an array names and we want to print each one separately, we can do so with a for loop:

let names = ["Robb", "Sansa", "Arya", "Jon"]
 
for name in names {
    print("Name: \(name)")
}
Name: Robb
Name: Sansa
Name: Arya
Name: Jon

This is great if you happen to want to loop over an array, but without C-style for loops, how would we loop over a series of numbers? The answer comes in the form of Swift’s Range and stride. Let’s say we wanted to count to 10 by threes; we could do that by using a Range from 1 to 10 with the syntax 1...10, then only printing each number that is evenly divisible by three, using the modulo operator % and checking for a remainder of 0.

for i in 1...10 {
    if i % 3 == 0 {
        print(i)
    }
}
3
6
9

There is another option, however, to iterate only every third item (or any arbitrary step), known as a stride. A stride can be created using a variety of methods, but the most common is stride(from:to:by:), where the from value is where the stride starts, to is where it ends (exclusive), and by is how much each value changes on the way to the to. If that sounds a little confusing, just look at this code sample:

let byThrees = stride(from: 3, to: 10, by: 3)
for n in byThrees {
    print(n)
}
3
6
9

It’s almost readable as English: you might say you are “counting” from 3 to 10 by 3. Here we create a stride and store it in a variable named byThrees, but we could use it directly in the loop as well:

for n in stride(from: 3, to: 10, by: 3) {
    print(n)
}
3
6
9

Collections also all have an indices property that can be used in loops. This returns the valid indices of the collection, which is useful for iterating over some, but not all, of a collection. For example, back in our names collection we may want only the first three names, which we can retrieve like so:

let names = ["Robb", "Sansa", "Arya", "Jon"]
for nameIndex in names.indices {
    if(nameIndex < 3) {
        print(names[nameIndex])
    }
}
Robb
Sansa
Arya

There is also the enumerated() method on collections, which allows you to get both the index and the value from an array as you loop over it:

let names = ["Robb", "Sansa", "Arya", "Jon"]
for (index, name) in names.enumerated() {
    print("\(index): \(name)")
}
0: Robb
1: Sansa
2: Arya
3: Jon

There are still more ways to enumerate and loop over collections in Swift 3, but these are the most commonly used.

You may notice that in our for loop we are assigning to two variables at once, both index and name. These are separated by commas and surrounded by parentheses in order to indicate that there are two named variables we expect returned from the enumerated() method. These are technically deconstructed tuples.
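
If you prefer, you can also keep that tuple intact rather than deconstructing it. As a small sketch of the same loop, the tuple produced by enumerated() exposes offset and element members you can access with dot-syntax:

let names = ["Robb", "Sansa", "Arya", "Jon"]
for entry in names.enumerated() {
    // entry is a tuple; .offset is the index and .element is the value
    print("\(entry.offset): \(entry.element)")
}
0: Robb
1: Sansa
2: Arya
3: Jon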

This concludes the fundamentals portion of the Swift 3 tutorial. Next, I’ll show you how to use what you’ve learned here in real-world scenarios. Keep up with these posts by subscribing to the newsletter. You’ll be emailed when I make new posts, post video tutorials, and share other neat stuff; never spam 🙂



SiriKit Resolutions with Swift 3 and iOS 10 – SiriKit Tutorial (Part 2)

SiriKit Resolutions with Swift 3 in iOS 10 – SiriKit Tutorial (Part 2)

This tutorial was written on June 20th, 2016 using Xcode 8 Beta 1 and the Swift 3.0 toolchain.

This post is a follow-up in a multi-part SiriKit tutorial. If you have not read part 1 yet, I recommend starting there.

Resolving requests from SiriKit

In order to make our Siri integration more useful, we can help fill out the content of our message using callback methods from the INSendMessageIntentHandling protocol. Investigating this protocol, you can see these show up as optional methods:

resolveRecipients(forSendMessage intent: INSendMessageIntent, with completion: ([INPersonResolutionResult]) -> Swift.Void)
 
resolveContent(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Swift.Void)
 
resolveGroupName(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Swift.Void)
 
resolveServiceName(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Swift.Void)
 
resolveSender(forSendMessage intent: INSendMessageIntent, with completion: (INPersonResolutionResult) -> Swift.Void)

So we can provide SiriKit with further information by implementing as many of these resolutions as we wish, effectively enabling us to provide information about the recipients, content, group name, service name, or sender. These should be relatively self-explanatory.

Let’s try providing some static data for our title and content, to demonstrate how resolutions work.

First, let’s add the resolution for the content of the message, by implementing the resolveContent protocol method.

func resolveContent(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Void) {
    let message = "My message body!"
    let response = INStringResolutionResult.success(with: message)
    completion(response)
}

Here we create a string resolution result, and call the success function. This is the simplest way to proceed, but we also have the option of returning a disambiguation, confirmationRequired, or unsupported response. We’ll get to those later, but first let’s actually use the data Siri is providing us.
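
As a quick, hedged sketch of what one of those alternatives could look like (double-check the exact class method names against the INStringResolutionResult documentation), you could ask Siri to confirm a proposed value instead of accepting it outright:

func resolveContent(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Void) {
    // Ask the user to confirm this proposed message body before Siri proceeds
    let proposed = "My message body!"
    let response = INStringResolutionResult.confirmationRequired(with: proposed)
    // Alternatively, disambiguation(with: [String]) presents multiple choices,
    // and unsupported() tells Siri we can't handle the value at all.
    completion(response)
}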

Siri will send in its own transcription of our message in the intent object. We’re interested in the content property, so let’s take that and embed it inside of a string.

func resolveContent(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Void) {
    let message = "Dictated text: \(intent.content!)"
    let response = INStringResolutionResult.success(with: message)
 
    completion(response)
}

The content property is an optional, and as such we need to make sure Siri actually provided a transcription. If no transcription was provided, the message won’t be very useful, so we need to tell Siri that this information is missing and that we need a value. We can do this by returning a resolution result created with the needsValue class method on INStringResolutionResult.

func resolveContent(forSendMessage intent: INSendMessageIntent, with completion: (INStringResolutionResult) -> Void) {
    if let content = intent.content {
        let message = "Dictated text: \(content)"
        let response = INStringResolutionResult.success(with: message)
        completion(response)
    }
    else {
        let response = INStringResolutionResult.needsValue()
        completion(response)
    }
}

SiriKit requesting additional information

Now SiriKit knows that when we try to send a message, the content value is a requirement. We should implement the same type of thing for the recipients. In this case, recipients can have multiple values, and we can look them up in a variety of ways. If you have a messaging app, you would need to take the INPerson objects passed in on the intent and try to determine which of your own users the message is intended for.
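
As a very rough, hypothetical sketch of the shape this could take (the matchContacts(for:) helper is made up here; a real messaging app would look recipients up in its own user database):

func resolveRecipients(forSendMessage intent: INSendMessageIntent, with completion: ([INPersonResolutionResult]) -> Void) {
    guard let recipients = intent.recipients, !recipients.isEmpty else {
        // No recipients were mentioned, so ask Siri to prompt for one
        completion([INPersonResolutionResult.needsValue()])
        return
    }
    let results = recipients.map { person -> INPersonResolutionResult in
        // matchContacts(for:) is a hypothetical lookup into your own contacts/users
        let matches = matchContacts(for: person)
        switch matches.count {
        case 0:
            return INPersonResolutionResult.unsupported()
        case 1:
            return INPersonResolutionResult.success(with: matches[0])
        default:
            // More than one plausible match, so ask the user which one they meant
            return INPersonResolutionResult.disambiguation(with: matches)
        }
    }
    completion(results)
}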

This goes outside the scope of this Siri tutorial, so I’ll leave it up to you to implement your own application logic for the resolveRecipients method. If you want to see an example implementation, Apple has released some sample code here.

More iOS 10 Tutorials

We’ll be continuing to investigate iOS 10 and publish more free tutorials in the future. If you want to follow along be sure to subscribe to our newsletter and follow me on Twitter.

Thanks,
Jameson



Siri Integration in iOS 10 with Swift – SiriKit Tutorial (Part 1)

Siri integration on iOS 10 – Swift Tutorial

This tutorial was written on June 13th, 2016 using Xcode 8 Beta 1 and the Swift 3.0 toolchain.

Get Xcode 8 set up for iOS 10 and Swift 3 compilation.

If you have not yet downloaded Xcode 8 Beta 1, please do so here.

(Optional) Compiling from the command line

To opt in to the Swift 3.0 toolchain you shouldn’t need to change anything unless you want to build from the command line. If you plan to build from the command line, open Xcode-beta and from the OS menu bar select Xcode > Preferences. Then select the Locations tab. At the bottom of the page here you will see “Command Line Tools”. Make sure this is set to Xcode 8.0.

Now, if you navigate to the project directory containing the .xcodeproj file, you can optionally compile your project by calling xcodebuild from the command line.

(Optional) Migrating from an existing Swift 2 app

If you are working with an existing Swift 2 project and want to add Siri integration with Swift 3.0, click on the root of your project and select Build Settings. Under Swift Compiler – Version, find the field labeled Use Legacy Swift Language Version and set it to No. This will most likely lead to compiler errors that you will need to fix throughout your project, but it’s a step I recommend to keep up with Swift’s ever-changing semantics.

Getting started with SiriKit

First, in your app (or in a new single-view Swift app template if you are starting fresh), switch to the general view by selecting the root of your project. Under this tab, click the (+) icon in the lower left-hand corner of the side pane. From the dropdown that appears, select iOS > Application Extension, and then select Intents Extension.

Select Intents Extension

This adds a new intent to the project, and we’ll use it to listen for Siri commands. The product name should be something similar to your app so it’s easy to identify, for example if your app is called MusicMatcher, you could call the Product Name of this intent MusicMatcherSiriIntent. Make sure to also check the checkbox to Include UI Extension. We will need this later in the tutorial, and it’s easiest to just include the additional extension now.

Intents Extension Options

What we’ve created are two new targets, as you can see in the project hierarchy. Let’s jump into the boilerplate code and take a look at the example in the IntentHandler.swift file inside of the Intent extension folder. By default this will be populated with some sample code for the workout intent, allowing a user to say commands such as “Start my workout using MusicMatcher”, where MusicMatcher is the name of our app.

IntentHandler.swift

Run the Template App as-is

It’s helpful at this point to compile this code as-is and try out the command on an actual iOS device. So go ahead and build the app target by selecting the app MusicMatcher from the Scheme dropdown, and with the target device set to your test iOS device, press the Build & Run button.

Select the MusicMatcher target

You should see a blank app appear, and in the background your extensions will also be loaded into the device’s file system. Now you can use the Stop button in Xcode to kill the app.

Then, switch your scheme to select the Intent target, and press build & run again.

Select the MusicMatcherSiriIntent target

This will prompt you to choose an app to attach to; just select the app you just ran, MusicMatcher. This will present the app again on your device (most likely a white screen/blank app), but this time the debugger will be attached to the Intent extension.

Select the app to run with the extension

You can now exit to the home screen by pressing the home button, or the app may exit on its own since you are running the Intent and not the app itself. (This is not a crash!)

Enable the extension

The extension should now be in place, but as an iOS device user we still may need to enable the extension in our Siri settings. On your test device, open the Settings app. Select the Siri menu, and near the bottom you should see MusicMatcher listed as a Siri App. Make sure the app is enabled so that Siri can pick up the intents from the sample app.

Testing our first Siri command!

Try the Siri command. Activate Siri either by long-pressing the Home button, or by saying “Hey Siri!” (note that the “Hey Siri!” feature must be enabled in Settings first).

Try out the command “Start my workout using MusicMatcher”.

“Sorry, you’ll need to continue in the app.”

If you’re like me, this will bail with an error saying “Sorry, you’ll need to continue in the app.” (For some reason this occasionally was not a problem. Ghosts?)

In the console you may see something like this:

dyld: Library not loaded: @rpath/libswiftCoreLocation.dylib
  Referenced from: /private/var/containers/Bundle/Application/CC815FA3-EB04-4322-B2BB-8E3F960681A0/LockScreenWidgets.app/PlugIns/JQIntentWithUI.appex/JQIntentWithUI
  Reason: image not found
Program ended with exit code: 1

We need to add the CoreLocation library to our main project, to make sure it gets copied in with our compiled Swift code.

Select the project root again and then select your main MusicMatcher target. Here, under General, you’ll find an area for Linked Frameworks and Libraries. Click the (+) symbol and add CoreLocation.framework. Now you can rebuild and run your app on the device, then follow the same steps as above to rebuild and run your intent target.

Finally, you can activate Siri again from your home screen.

“Hey Siri!”
“Start my workout using MusicMatcher”

Siri should finally respond, “OK. exercise started on MusicMatcher” and a UI will appear saying “Workout Started”

MusicMatcher Workout Started

How does it work?

The IntentHandler class defined in the template uses a laundry list of protocols:

First and foremost is INExtension, which is what allows us to use the class as an intent extension in the first place. The remaining protocols are all intent handler types that we want to get callbacks for in our class:

INStartWorkoutIntentHandling
INPauseWorkoutIntentHandling
INResumeWorkoutIntentHandling
INCancelWorkoutIntentHandling
INEndWorkoutIntentHandling

The first one is the one we just tested, INStartWorkoutIntentHandling.

If you command-click this protocol name you’ll see in the Apple docs this documentation:

/*!
 @brief Protocol to declare support for handling an INStartWorkoutIntent
 @abstract By implementing this protocol, a class can provide logic for resolving, confirming and handling the intent.
 @discussion The minimum requirement for an implementing class is that it should be able to handle the intent. The resolution and confirmation methods are optional. The handling method is always called last, after resolving and confirming the intent.
 */

Or in other words, this protocol tells SiriKit that we’re prepared to handle the English phrase “Start my workout with AppName Here.”
This will vary based on the language spoken by the user, but the intent will always be to start a workout. The INStartWorkoutIntentHandling protocol calls for several more methods, which are implemented in the sample code. I’ll leave you to explore them if you want to build a workout app; what I’d rather do in the remainder of this tutorial is add a new intent handler for sending messages.

Let’s Add a New Message Intent

Now that we’ve confirmed that works, let’s move on to adding a new type of intent for sending messages. The docs here show the following:

Send a message
Handler:INSendMessageIntentHandling protocol
Intent:INSendMessageIntent
Response:INSendMessageIntentResponse

So let’s add the INSendMessageIntentHandling protocol to our class. First we’ll just specify we want to use it by appending it to the list of protocols our class adheres to in IntentHandler.swift. Since I don’t actually want the workout intents, I’ll also remove those, leaving us with just this for the class declaration:

class IntentHandler: INExtension, INSendMessageIntentHandling {
    ...

If we just left it at that, we wouldn’t be able to compile our code, since we still need to implement the required methods from the INSendMessageIntentHandling protocol.

Again, if you ever need to check what those methods are, just command+click the text INSendMessageIntentHandling and take a look at what method signatures are present that are not marked with the optional keyword.

In this case we find only one required method:

/*!
 @brief handling method
 
 @abstract Execute the task represented by the INSendMessageIntent that's passed in
 @discussion This method is called to actually execute the intent. The app must return a response for this intent.
 
 @param  sendMessageIntent The input intent
 @param  completion The response handling block takes a INSendMessageIntentResponse containing the details of the result of having executed the intent
 
 @see  INSendMessageIntentResponse
 */
public func handle(sendMessage intent: INSendMessageIntent, completion: (INSendMessageIntentResponse) -> Swift.Void)

Adhering to the new Message Intent protocol

So back in our IntentHandler.swift, let’s add a line separator (useful for navigating code with the jump bar):

// MARK: - INSendMessageIntentHandling

Underneath this MARK, we can implement the function. I find it’s most useful with Xcode 8 to simply begin typing the method name, and let autocomplete take it from there, choosing the relevant option.

Fill out the handle method with autocomplete

In our handler, we’ll need to construct an INSendMessageIntentResponse in order to call back the completion handler. We’ll just assume all messages are successful here and return a success value for the user activity in the INSendMessageIntentResponse constructor, similar to how this is done in the template app. We’ll also add a print statement so we can see when this handle method is triggered by a Siri event:

func handle(sendMessage intent: INSendMessageIntent, completion: (INSendMessageIntentResponse) -> Void) {
    print("Message intent is being handled.")
    let userActivity = NSUserActivity(activityType: NSStringFromClass(INSendMessageIntent.self))
    let response = INSendMessageIntentResponse(code: .success, userActivity: userActivity)
    completion(response)
}

Adding the intent type to the Info.plist

Before this app will be capable of handling INSendMessageIntent, we need to add the value to our Info.plist. Think of this as something like an app entitlement.

In the Info.plist file of the intent extension, find and expand the NSExtension key. Then expand NSExtensionAttributes, and then IntentsSupported under that. Here we need to add a new row for INSendMessageIntent to allow the app to process message intents.

Add support for INSendMessageIntent aka messages intent

Testing the new intent

Now that we’ve got our new intent set up, let’s give it a try. Recall that you must build the app, run it on the device, and then run the extension in order to debug the extension. If you don’t run them in this order, the extension will either not work or it will not log to the Xcode console.

Try calling upon our intent in Siri, and you will now see a new message window appear! The window is pretty empty, and there isn’t much logic to tie in to our app just yet. We need to implement the remaining callbacks and add some of our app’s messaging logic to provide a better experience. We’ll cover that in Part 2, which is available now. If you want me to email you about it when other tutorials come out as well, sign up for my newsletter to get the scoop.



Mobile First & Declining Usability

Mobile App Design is Fashion

Mobile app design moves very quickly, and with “mobile-first” thinking becoming the mantra of designers of all kinds, it is a leading indicator for the design of just about everything else. So I think it’s important to take a step back and analyze how apps are designed today. In this post I break down the most recent mobile app design trends that I believe are actually hurting us, and making our software worse.

The way mobile apps are designed is very much a fashion-of-the-week situation, where the target changes on a nearly daily basis. What was cool a month ago is stale in comparison to today; I use the term “fashion” to describe this reality. Before we dig into the specifics of modern app design, let’s talk a little about fashion itself.

Fashion is irrelevant

I recently found myself reading an essay by Milton Glaser, one of the world’s most celebrated designers. In the essay he talks about “The Bull” by Pablo Picasso. In the work, Picasso renders 11 different variations of a bull, each in a differing style. The work ranges from a realistic rendering, to a cartoon rendering, then to a more abstract cubist rendering, and finally simple line art.

Picasso: The Bull

As Glaser puts it, “What is clear just from looking at this single print is that style is irrelevant. In every one of these cases, from extreme abstraction to acute naturalism they are extraordinary regardless of the style.” – Ten things I have learned, Milton Glaser

I thought this line of thinking was very interesting. Glaser makes the point that “Style is not to be trusted.” Another way to think about that is that style is fashion, and it comes and goes with the tide. In demonstrating that the style of the work is actually irrelevant to its quality, we could draw the conclusion that style in app design (the aesthetic) is also irrelevant, except for one small problem…

Fashion is of utmost importance

Back in the early days of smartphones, the designers at Apple had a clear design goal: familiarize people with touchscreens. So the style was glossy buttons with shadows, and controls that look like they want to be touched. If there was a real-world analogue that could be used to symbolically represent a 3D interactive object, they would use it. This is known as skeuomorphism in design communities, and I won’t rehash here why everyone decided they hate it over the past few years. What I will say, however, is that there was value in those old, ugly interfaces. What we were able to do with skeuomorphism is train users to use our apps through visual language. The best example I can think of from those days is the picker view (called a UIPickerView in programmer speak).

UIPickerView

Nothing about the design of the picker view makes much sense, really. The only thing it does well is that it looks like a spinning wheel, which encourages you to spin it like a contestant on The Price is Right. It’s kind of fun, in fact. But this interface actually kind of sucks to use. You can only see about 4 or 5 options at any given time, and selecting a specific option requires a fine-tuning sort of interaction where you slowly and carefully align a value with the center of the view. A much simpler approach would be to just put the options in a big fullscreen list, which is actually what Apple ended up using more often in their own apps. (Also known as a UITableViewController.)

UITableViewController

While these types of interfaces have many drawbacks, I have to wonder if my elderly friends, or my infant nephews, would have been able to use smart devices so readily were they not using these metaphors for reality. Would smartphones have become as successful as they are now without skeuomorphism?

It wasn’t just this picker either; skeuomorphism informed the iBooks interface, designed to look like an actual bookshelf:

iBooks

It inspired the Calendar interface, to look like an actual calendar:

Calendar

And just for fun: designer Meng To posted this skeuomorphic version of Facebook, which looks like an actual book, on dribbble.com:

Facebook by Meng To

2016

Fast-forwarding to 2016, we have completely abandoned the idea that people don’t know what to do with touchscreens. I think we have actually gone too far. Have you ever tried to use Snapchat? It’s like they have made it intentionally confusing to use. Everything is hidden behind gestures and invisible buttons you are supposed to know to tap and hold, and other weird things like that. Some of the app’s most interesting features are completely hidden from view.

There was a lot of positive reaction to the app Clear from a few years ago. It was a huge hit because of its unique UI animations. But here’s my question: how many of those downloads translated into regular users? I don’t want to sit here and hate on the Clear app, because I think it was a brilliant design in terms of being unique and fashionable. What it was not, however, is usable.

Clear

Okay, so I swipe to the right to finish a task… Oops wrong one, how do I undo that? Do I just re-add it? Wait, how do I add a task again? Oh, I take two fingers and separate two existing tasks, and that makes a new one appear in between my existing ones? That’s odd… what if I don’t have two tasks? Do I just sort of “unpinch” the negative space where my tasks would be if I had two?

Yep.

That’s exactly what you’re supposed to do, and that sucks…

To clear a task swipe to the left, to finish a task swipe to the right. Wait, what’s the difference? I’ve cleared some of my tasks, but deleted others. Either way they get removed from my list…

If you go look at the app store reviews for Clear, you’ll see a lot of people complaining about the same thing, and I can’t even fully explain it to you because I don’t understand it either. But apparently if you perform a swipe gesture up (or down?) it will delete the entire list. How’s that for a fun and wacky gesture! So fun! You just lost your entire task list, which is the entire point of the app! That dismiss animation was sick though, so the app must be great, right?

Clear Reviews

The tutorial for Clear says I should “Pinch together vertically” to collapse my current level and navigate up. So instead of tapping a back button to navigate up, I have to perform a two-handed unpinch gesture for that too? I thought that was the gesture for creating a new task? HOW DO I USE THIS APP I DONT UNDERSTAND!? ARGGGH!!

deletes app…

Fashion sucks

As disappointing as it is, this seems to be the new fashion in apps; I summarize it basically as intentionally terrible user experience. I was talking to a prospect recently about an iOS app we were planning to help them build. They showed me a mockup that involved quite a few hidden buttons and gestures to control the UI. I carefully explained that users were unlikely to find these gestures and that we should move these things into buttons that are more obvious. As a compromise, I suggested we at least needed a quick tutorial to show users these features. The response I received was surprising, to say the least:

We want this to be a secret

If we were designing a video game level, this would maybe make some sense. It’s typical to “hide” some areas of play so that it’s rewarding when they are discovered. This is basically what they were going for. But for a mobile app, this is just a poor UI decision that is going to leave people confused. This wasn’t some kind of special hidden easter-egg feature; this was basic functionality for switching to a friends list in the app.

You want to hide the friends list from users? This kind of thinking is now fashionable. Whether we like it or not, fashion influences the way we write software. As someone who is going to spend hours of my life building out these features, I hate the idea that most users simply will never find them. That’s why I declined that particular project. I left money on the table, but saved myself from spending time on work I didn’t find meaningful.

It’s our responsibility to fix it

I could sit around critiquing popular apps’ designs all day (and I might), but this post wouldn’t be much use to anyone if that was all I did. So, let’s talk solutions…

1. Recognize that you can fix things

First, I think it’s important to accept that application developers have the final say in what their work ends up producing. You may think the decisions are simply made by your client, your boss, your partner, or your dog. Blame who you want, but your apps are the output of your efforts. If you are being boxed into a corner and asked to do something that sucks, you should stand up against it. You should learn to say “no” more often. At the end of the day, if you deliver what you think is best to solve the problem at hand, and you do your best work, no one can complain about that. If they do, then they’re probably not worth working with anyway.

2. Do hallway testing

Hallway testing is the kind of testing you do when you grab some random person “out of the hallway” to test your app. You may have NDAs or similar agreements that make this a challenge, which is why it’s important to make sure agreements like this permit hallway testing. You may not be able to place an ad and have dozens of testers come to your office, but you can at least get a family member or friend to try things out.

There are two very important aspects to hallway testing that are required for it to work. The first is that you absolutely can not explain things to the user while testing. Your first reaction will be to want to explain away any rough edges, defend your work, or try to help move things along to avoid embarrassment. But that no longer represents a real hallway test; it represents what it would be like if every user of your app was accompanied by you and your explanations, which obviously is not happening.

The second thing you must do is take notes of any stumbles. If you watch a user struggle with a component of your app for several minutes and then finally figure it out, you could very easily write it off and say “oh well, they eventually figured it out, so it’s not a priority”. WRONG! Your testers will have infinitely more patience than real-world users. If a tester is struggling, you need to make a note of it. Make several notes, in fact; you’re going to forget if you don’t, and then those issues won’t be addressed.

3. Don’t hide anything inside of non-obvious gestures

If you are making a feature that involves swiping, pinching, or any other gesture, you should make sure it actually is intuitive. For example, panning on a map (like Apple Maps) is very intuitive: it’s obvious that you would want to scroll the view, and it’s obvious how you would go about doing that.

You’ll have to use your judgement on this one, but one way to confirm that your gestures are clear is by doing hallway testing, as mentioned above. You can trust your instinct with this stuff, but you also need to confirm you were right. Often something that seems intuitive to you will not be obvious to your testers.

4. Don’t be clever

There’s a famous quote often cited in software development projects.

Everyone knows that debugging is twice as hard as writing a program in the first place.
So if you’re as clever as you can be when you write it, how will you ever debug it?
Brian Kernighan and P. J. Plauger, “The Elements of Programming Style”, 2nd edition, Chapter 2

The same thing applies to user interface design. If it seems like a really clever idea, you should be wary. Often the dead-simple, obvious solution is the best one. This is basically Occam’s Razor restated, which is often quoted for a reason: it’s true.

5. You tell me

What are the other takeaways here? How do you make sure your apps are highly usable? Tell me in the comments or on Twitter.

P.S. As an interesting side note, “The Bull” is also reportedly the subject of Apple’s training materials on their design thinking, for entirely different reasons.



Your Parse backend was always a bad idea.

So if you haven’t heard, Facebook is shutting down Parse, the backend-as-a-service (BaaS) it acquired a little while ago. Lots of developers are feeling a little lost, and even betrayed by Facebook. I tweeted this screen cap someone made of the Parse homepage before the shutdown, and it pretty much says it all:

I didn’t need to add the emphasis there; they already did that. Thousands of developers TRUST US. You can see from the way they presented themselves why developers are feeling so betrayed. Why would anyone continue using React Native, React JS, HHVM, Relay, or any other Facebook technology knowing that they may just randomly decide to pull the plug on it?

Sure, these are open source and the community can take over, but open source projects need maintainers, and corporate backers are a huge boon. Facebook has proven that we can’t trust them, but this shouldn’t be that surprising to anyone who has worked with Facebook APIs in the past, or any third-party social media API for that matter. I’m going to get into that later, but let’s completely change the subject for a second to talk about the other elephant in the room: Twitter… and more importantly Twitter Fabric, which now owns Crashlytics and has integrated a bunch of the amazing work done by Felix Krause.

But to understand how Twitter has treated its development community in the past, I think we should talk about a little app called Meerkat. I promise we’ll get back to talking about Parse and Facebook, but this story falls under the same umbrella, so bear with me.

Meerkat

So a little backstory: I live in Austin, TX, which means every year at SXSW I get a front-row seat to the startups that are going to be big over the next year. Twitter, Foursquare, GameSalad, and even The 4-Hour Workweek were all launched at SXSW. These are some of the bigger successes, but every year tons of wide-eyed founders show up in Austin to present their work and hope it takes off with people at the festival. In 2015 there was an extremely clear SXSW winner called Meerkat.

 

Meerkat is basically a live streaming P2P platform for people to stream directly from their iPhones to other users, and it took SXSW by storm. Just walking around Austin last year at SXSW you would see Meerkat shirts everywhere. Everyone who was anyone was streaming live music, SXSW sessions, their lunch, or just about anything they were doing. Then, suddenly… it all stopped, for a very specific reason:

At the height of Meerkat’s launch, Twitter yanked API access right out from under their feet with only 2 hours notice.

If you’re familiar with the iOS App Store process, you might understand how this is sort of an issue: even if the Meerkat folks could somehow rewrite their entire app to not rely on the Twitter APIs, they could not get an updated app approved and on the App Store within 2 hours… it would actually take more like 3 weeks.


Personally, I was not that surprised, but many people were wondering why Twitter would pull such a move… Did Meerkat violate the terms of the API agreement? Were they doing something illegal?

Well, no…

Actually it turns out the whole reason Twitter decided to handicap the most successful Twitter-based app in years is because they had their own competitor in the pipeline, called Periscope, which I refuse to link to.

So here we are, with Twitter once again asking for the trust of the development community. Sigh…

Parse

See, I told you we would get back to Parse!

Seeing as how Twitter and Facebook are basically two sides of the same social media coin, it seems to me that trusting either of them has similar implications. When I wonder how a tech company will proceed into the future, I always repeat the mantra, “Follow The Money”. This generally tells you how large companies will behave well into the future, especially publicly traded ones (both TWTR and FB are public). American public companies answer to their shareholders regarding quarterly earnings reports, sometimes to the detriment of their customers and/or partners. When I saw Parse had such a huge threshold for their “Free Tier”, it really worried me. It seems to me 99% of apps (or more) would never cross that threshold, and if they did, they would only do so by a few cents. What exactly are Facebook’s motivations when it comes to their developer tools, and in particular the backend as a service? I think the answer is pretty simple: they wanted your data, but it turned out to be worthless. So they decided Parse wasn’t worth their time any more. They can’t turn a profit by providing a free backend to hundreds of thousands of developers, so they shut it down. In Facebook’s own words:

We’re proud that we’ve been able to help so many of you build great mobile apps, but we need to focus our resources elsewhere.

Translation: “You aren’t making us enough money”

Facebook generates all of its bottom line from advertising, just like Twitter, and just like Google, which is now the most valuable company in the world, surpassing Apple. In fact, there is only one major platform vendor that doesn’t make the majority of its profit from advertising, and it’s Apple.

Following The Money

I tweeted this earlier today:

It’s true: you really can’t trust these social media companies with your backend so blindly. You have to Follow The Money to find the motivations of the parties involved. If their motivation is not to provide you with a great service that benefits their bottom line, it’s unlikely the service will stick around for very long. This is also a great way to analyze your own business if you are a startup founder or CEO. When you work with anyone, you must be certain their financial motivations align with yours; otherwise there will always be a disconnect. This applies to employees, co-founders, partners, and vendors alike.

The Facebook API

Back in the early days of the Facebook API, you could easily retrieve a list of a user’s contacts. This is what led to the mass spamming from Facebook games, and the rise of FarmVille. But Facebook decided they didn’t like that, so they yanked that privilege, to the detriment of many apps, and of Zynga’s stock. Seriously, check out what that did to Zynga’s stock:

If your business depends on an app, then your backend is an extremely important business asset that you absolutely must control. It took almost a decade for the likes of Salesforce and other cloud-based enterprise companies to make their way into large corporations, and even today most of them are using on-site hosted versions of the software. The reason is that in a well-run business, you own anything that is mission-critical. This increases the maintenance cost as well as the cost to initially deploy, but without this control your business is dependent on the whims of some shady figures who are mining your data to serve ads. Is that who you want in control of your server? How much do you trust Facebook, Twitter, and Google?

Build Your Own Damn Backend

The only real answer to providing your mobile app with a stable backend that you control is to build it yourself. I know this sounds hard, but it really isn’t that difficult to use Ruby on Rails or Node.js to produce a simple API to power your mobile apps. Frankly, the Parse backend, with its JavaScript-based events, is not all that different from writing a Node.js app in Express, grabbing some npm modules for easy API delivery, and backing the whole thing with a MongoDB database. If this sounds really hard, just spend a few hours reading some tutorials online, and you will realize how easy this all is to do yourself. Alternatively, you can just hire my company, which does this routinely *end shameless plug*.

If you do hire a vendor to build your backend, make sure you are getting the source code, and the tools necessary to load it up on whatever server you need. Docker is a nice way to contain all the environmental requirements for an app, and services like Heroku make deploying Rails apps easy.

Fun Fact: About 95% of the way through writing this post, my ironic Twitter embeds trashed all the formatting and I had to reformat the entire thing. ^_^

