
Improving Build Performance of LINE for iOS with Bazel

Doan TruongThi, 2020-07-09

iOS Engineer at LINE.

Background

Over the years, the LINE for iOS source tree has grown to hundreds of modules. As of late 2019, the project consisted of more than 1.4 million lines of code and showed no sign of slowing down. This, in turn, dragged down build times for every developer working on the project. As the project grew, we also started to see more problems that were hard to reproduce: for instance, builds that worked locally but not on CI, and vice versa. We took a step back and thought about how we could improve both the performance and the reproducibility of our builds.

Dependencies

First, let's take a look at the dependencies. At the end of 2012, we adopted CocoaPods as our dependency manager. CocoaPods is a great piece of open-source software. It integrates very well with Xcode. You get to see and debug the source code of your dependencies as if it were your own code. The only downside is that it builds and cleans alongside the project: once you clean your project, you have to rebuild everything. This shouldn't be a big problem if most of your dependencies are written in Objective-C.

Our team and project scaled, and then along came Swift. We had been using Swift since version 1.0 in our own code, but had managed to avoid any Swift dependency for about three years. Eventually we had no choice but to add Carthage as our first dependency manager for Swift, alongside CocoaPods. Unlike CocoaPods, Carthage has little to no configuration. It pre-builds your dependencies as frameworks, and you are responsible for integrating them into your project. We ended up with two dependency managers.

You can have Carthage download prebuilt artifacts from GitHub or build the dependencies locally. For security reasons, we don't want to download arbitrary binaries from GitHub. For app performance reasons, we want to build all of our dependencies as static frameworks. So we had our developers build every single dependency locally on their own machines. This often took quite a long time, typically 15-20 minutes depending on how many dependencies were updated. Behind the scenes, Carthage invokes Xcode to build your dependencies as fat binaries for every architecture that you support. If you still support iOS 10, that means a total of 4 architectures, which literally means building each dependency 4 times. Additionally, Xcode's archive action, apparently by design, always involves a full clean build.

Caching builds

It would be a great waste of resources if everyone had to build the same thing over and over again. The good news is that we were able to cache Carthage's build artifacts between builds and between machines. Rome, another open-source tool, was great for that. At the time we adopted Rome, it supported caching to a local directory on your machine and/or remotely on AWS S3. Beyond S3 itself, Rome works very well with most S3-compatible object storage services, one of which our server infrastructure provides. Using our own infrastructure for the remote cache turned out to work better than an external service like S3 would have, because the network between our offices and our data centers is extremely fast.

Carthage and Rome had served us pretty well. The only major shortcoming that concerned us was the inability to verify the correctness of the cache. Since cache poisoning wouldn't be acceptable for QA testing and release builds, we decided to rebuild everything every time from scratch for these particular types of builds.

Long-term approach

Dependencies are not everything. Although we were able to eliminate the rebuilding of dependencies, we still had to build our own code. Other than the dependencies, most of our code is built with Xcode. We observed that Xcode sometimes decided to rebuild everything when nothing much had changed. In addition, when developers run into trouble building their code after pulling the latest changes, they often resort to cleaning, which blows away the build artifacts of targets that rarely change.

Without a doubt, the long-term approach is to avoid rebuilding any part of your code that hasn't changed. The approach we took was to split our codebase into modules, then build and cache each of them. This wouldn't be possible with Carthage, because:

You have to split your local targets into a separate repository, version them for every change, and update the version you're referencing in your main repository.
You lose the ability to debug your local targets alongside the main project, since they are now just prebuilt binaries.

It would be a big regression in development productivity if you chose to do it that way. Carthage and Rome somewhat solved the external dependencies problem. What about your own code? At LINE, the vast majority of our code is developed in-house. Unless you accept losing the ability to debug part of your code by having it prebuilt, this caching approach isn't going to scale well.

Bazel

One open-source tool known for its advanced caching features is Bazel. Initially, Bazel didn't have an established track record for building on Apple platforms, especially iOS. Bazel was originally developed for building a large monorepo, so it comes with conventions that differ from the needs of most companies. It also lacks some features that matter on Apple platforms in general, notably support for header maps, Clang modules, and mixed-language targets.

Fortunately, Bazel is very extensible. While Bazel itself can only build C-family languages and Java, you can write your own build rules to extend it to build pretty much anything. With Starlark, we extended Bazel's official build rules to address some of the constraints mentioned above. Among them, one of the major roadblocks was supporting mixed-language targets in Bazel. For a module named "MyModule", the initial approach was:

Compile the underlying Objective-C module under the name MyModuleObjc.
Compile the Swift module under the name MyModule. In this module, re-export the Objective-C module using Swift's internal @_exported attribute.

ExportObjcModule.swift

 @_exported import MyModuleObjc

This worked for most utility modules, where the Objective-C part and the Swift part of a module don't have to import each other's declarations, but wouldn't work otherwise.

We had been using Xcode to build our mixed-language framework targets for a while. In order for Xcode to build a mixed-language framework target, first we need to provide a module map for the underlying Objective-C module, and assign it to the MODULEMAP_FILE build setting of the target.

FoundationLineUtils.modulemap

 framework module FoundationLineUtils {
    requires objc
    umbrella "Headers"
    exclude header "FoundationLineUtils-Swift.h"
}

Xcode will then compile the Swift module by importing the provided Objective-C module, extend the provided module with the compiled Swift module, then compile the Objective-C module.

module.modulemap

 framework module FoundationLineUtils {
    requires objc
    umbrella "Headers"
    exclude header "FoundationLineUtils-Swift.h"
}

module FoundationLineUtils.Swift {
    requires objc
    header "FoundationLineUtils-Swift.h"
}
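The relationship between the two module maps can be made concrete with a small sketch. The Python below is illustrative only: the function names are hypothetical, and this is not how Xcode produces the files. It renders the unextended map for a given module name, then appends the Swift submodule to get the extended one.

```python
def unextended_module_map(module: str) -> str:
    """Module map for the underlying Objective-C module only."""
    return (
        f"framework module {module} {{\n"
        "    requires objc\n"
        '    umbrella "Headers"\n'
        f'    exclude header "{module}-Swift.h"\n'
        "}\n"
    )


def extended_module_map(module: str) -> str:
    """The unextended map plus a Swift submodule for the generated header."""
    return unextended_module_map(module) + (
        "\n"
        f"module {module}.Swift {{\n"
        "    requires objc\n"
        f'    header "{module}-Swift.h"\n'
        "}\n"
    )


if __name__ == "__main__":
    print(extended_module_map("FoundationLineUtils"))
```

The extended map always starts with the unextended one, which is why the same module would be defined twice if both files were visible at once.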

There's an additional trick that Xcode uses to prevent duplicate module definitions. While compiling Swift, it uses a VFS overlay to substitute the unextended module map of the underlying Objective-C module for the final module map. Once compilation finishes, the overlay is no longer used, and there is only one module.

unextended-module-overlay.yaml

 {
  'version': 0,
  'case-sensitive': 'false',
  'roots': [{
    'type': 'directory',
    'name': "<REDACTED>/Products/Debug-iphonesimulator/FoundationLineUtils.framework/Modules",
    'contents': [{
      'type': 'file',
      'name': "module.modulemap",
      'external-contents': "<REDACTED>/FoundationLineUtils.build/unextended-module.modulemap",
    }]
  }]
}
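Note that an overlay like this one embeds absolute paths. A cache that derives its key from the contents of an action's inputs will therefore see a different key in every clone of the source tree. The sketch below is a simplified model, not Bazel's actual action digest, and the paths and function names are hypothetical. It compares a key computed with an overlay against one computed with a workspace-relative module map passed via -fmodule-map-file.

```python
import hashlib


def action_key(arguments, inputs):
    """Toy cache key: a digest over the command line and the input files."""
    digest = hashlib.sha256()
    for argument in arguments:
        digest.update(argument.encode() + b"\0")
    for path in sorted(inputs):
        digest.update(path.encode() + b"\0" + inputs[path] + b"\0")
    return digest.hexdigest()


def overlay_body(modules_dir):
    """A stripped-down overlay that names the directory it redirects."""
    return f"{{'roots': [{{'name': '{modules_dir}'}}]}}".encode()


def key_with_overlay(clone_root):
    # The overlay file's contents depend on where the clone lives.
    overlay = overlay_body(f"{clone_root}/Products/Modules")
    return action_key(["swiftc", "Foo.swift"], {"overlay.yaml": overlay})


def key_with_relative_module_map(clone_root):
    # clone_root is deliberately unused: nothing in the action mentions it.
    arguments = ["swiftc", "-Xcc", "-fmodule-map-file=Foo/Foo.modulemap", "Foo.swift"]
    return action_key(arguments, {"Foo/Foo.modulemap": b"module Foo {}"})


# Two hypothetical clones of the same source tree.
alice, bob = "/Users/alice/line-ios", "/Users/bob/src/line-ios"
print(key_with_overlay(alice) == key_with_overlay(bob))                          # False
print(key_with_relative_module_map(alice) == key_with_relative_module_map(bob))  # True
```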

It wasn't very straightforward to mimic this compilation model with Bazel. First, Bazel doesn't build a target into a framework, but into a static library. So instead of finding modules through the framework search paths, we have to tell the compiler where the module map of each module is defined using the "-fmodule-map-file" flag. Additionally, we can't use a VFS overlay to hide the underlying Objective-C module, because the overlay needs to know the absolute path of the module map. Since those paths differ between source tree clones and between machines, this wouldn't be remote-cache friendly. The compilation steps, translated to Bazel, were:

Generate a module map for the underlying Objective-C module.
Instantiate a swift_library target to build the Swift code with the generated module map. The generated module map is added to the target's swiftc_inputs, but not to its list of dependencies.
Finally, instantiate an objc_library target to build the Objective-C code. This needs to depend on the swift_library target. Any target that depends on this mixed-language target only needs to refer to this final objc_library target.
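The steps above can be sketched as a small model. The Python below is neither Starlark nor LINE's actual rules: `Target`, `mixed_library`, and the generated file name are hypothetical stand-ins, and only the attribute roles (compile inputs versus dependencies) mirror the description. It shows that a module map passed as a compile input stays local to the Swift target, while module maps exported by ordinary dependencies are inherited by dependents.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    kind: str
    inputs: list = field(default_factory=list)       # files for this compile action only
    module_maps: list = field(default_factory=list)  # module maps exported to dependents
    deps: list = field(default_factory=list)         # Targets, followed transitively


def propagated_module_maps(target: Target) -> set:
    """Module maps exported by `target` and by its transitive dependencies."""
    seen = set(target.module_maps)
    for dep in target.deps:
        seen |= propagated_module_maps(dep)
    return seen


def mixed_library(name: str) -> Target:
    # Step 1: generate a module map for the underlying Objective-C module.
    objc_module_map = f"{name}.objc.modulemap"
    # Step 2: the Swift library takes the map as a compile input, not as a dep.
    swift = Target(f"{name}_swift", "swift_library", inputs=[objc_module_map])
    # Step 3: the Objective-C library depends on the Swift library and is the
    # only target that dependents refer to.
    return Target(name, "objc_library", deps=[swift])


base = Target("Base", "objc_library", module_maps=["Base.modulemap"])
mixed = mixed_library("MyModule")
app = Target("App", "swift_library", deps=[base, mixed])

print(mixed.deps[0].inputs)                 # ['MyModule.objc.modulemap']
print(sorted(propagated_module_maps(app)))  # ['Base.modulemap']
```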

The trick here is that we declare the Objective-C module map as an input of the Swift compilation, but not as a module dependency. This way, the underlying module is only available while compiling this very module and isn't propagated up the dependency graph. If you have a mixed-source project and are looking to adopt Bazel, please take a look at the apple_library and mixed_static_framework rules in LINE's Apple rules for Bazel.

Targeting SDK developers, Bazel allows you to build an iOS module into a static framework that can be used for third-party distribution. This helped us while we were considering Bazel as a replacement for Carthage for our project. Using our custom rules, we prebuilt our external dependencies into static frameworks using Bazel, and integrated them manually into our Xcode project. This workflow is similar to Carthage's, with some advantages:

The flexibility to build only what you want. Carthage, at the time we were using it, built all the predefined build schemes of each dependency, which can take a very long time on your first build.
The built-in remote cache feature. With a simple setup of a remote cache server, we were able to share the build caches between builds and between machines.

We started to build some of our internal modules with Bazel. Since our modules change from time to time, we maintained a pre-build script in our Xcode project that determined whether it should invoke Bazel to rebuild the modules that had been modified. The script then copied them into a predefined directory where we told Xcode to look for frameworks. This mechanism worked, but you have to be very careful in determining which files to update and which to leave alone; otherwise Xcode may decide to rebuild something unnecessarily. Or worse, it may fail to rebuild something that depends on the changes.
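A minimal sketch of the two decisions such a script has to make, in Python. Everything here is assumed for illustration: the function names, the stamp file, and the choice to fingerprint sources by content. The actual script isn't shown in this post, and the Bazel invocation itself is left out.

```python
import hashlib
import shutil
from pathlib import Path


def fingerprint(source_dir: Path) -> str:
    """Digest of every file path and file content under source_dir."""
    digest = hashlib.sha256()
    for path in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(source_dir).as_posix().encode() + b"\0")
        digest.update(path.read_bytes() + b"\0")
    return digest.hexdigest()


def needs_rebuild(source_dir: Path, stamp_file: Path) -> bool:
    """True if the module was never built or its sources changed since.

    The stamp file must live outside source_dir, or it would change the digest.
    """
    return not stamp_file.exists() or stamp_file.read_text() != fingerprint(source_dir)


def record_build(source_dir: Path, stamp_file: Path) -> None:
    """Remember the source state that the current build output corresponds to."""
    stamp_file.write_text(fingerprint(source_dir))


def copy_if_changed(built: Path, destination: Path) -> bool:
    """Copy a build output only when its bytes differ from the existing copy.

    Leaving an identical file untouched keeps its timestamp stable, so the
    consuming build has no reason to treat it as modified.
    """
    if destination.exists() and destination.read_bytes() == built.read_bytes():
        return False
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(built, destination)
    return True
```

Even in this reduced form, the script is tracking inputs, outputs, and staleness on its own, which is the job of a build system.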

At this point, we realized that this mechanism isn't that much different from implementing a build system ourselves. It was becoming more and more difficult to keep things in sync when we started to convert higher-level targets in the build graph to Bazel. So we decided to stop migrating this way, and instead focused on making the whole app build with Bazel.

Best of both worlds

Since we had already solved the hard part of the migration, converting the whole project to Bazel at this stage mostly involved writing BUILD files for the remaining targets. This part took us just a few days, because we had already reorganized the project to follow a few conventions:

We had one Xcode project per target. Each original project was generated by XcodeGen, so we had one project.yml file per target, and we just needed to convert each project.yml to a BUILD file.
We had only 3 types of targets: Objective-C family targets (some containing C, C++, and Objective-C++), Swift targets, and mixed Objective-C and Swift targets.
We had one target per directory, with the directory name being the module name. Following this convention brings some conveniences later on. For instance, you get to use #import <Module/Module-Swift.h> for free in mixed-language targets.

When our developers add a new target, now in addition to the project.yml, they have to create a BUILD file to declare their new target. This process isn’t automated now, but with the predefined XcodeGen templates and Bazel rules, the project.yml and BUILD files tend to be very simple. In most cases, you just need to care about the module name.

To leverage the remote cache for pull requests on CI, we switched the CI builds to Bazel. At this point, if someone changed a BUILD file and forgot to make the corresponding change in the project.yml of the same target, the Xcode builds would fail. Verifying the builds with both Xcode and Bazel on CI would solve that problem. However, it would also mean that the time it takes before a PR can land wouldn't improve. The resources required for each build would also double, while the number of build workers we have is limited. Since Bazel's dependency graph is always correct, we can rely on the BUILD files as the source of truth. We created a script that synchronizes the dependencies declared in each BUILD file to the respective project.yml file. The script runs as a git pre-commit hook, so whenever someone changes a BUILD file, the project.yml is updated automatically.
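A minimal sketch of what such a synchronization step could look like, in Python. The real script isn't shown in this post, so the details are assumptions: the function names, the load path and labels in the sample BUILD file, the label-to-module mapping (which leans on the one-target-per-directory convention), and the shape of the emitted project.yml fragment are all hypothetical. It relies on BUILD files being syntactically close enough to Python for the standard `ast` module to parse them.

```python
import ast


def deps_from_build(build_text: str, target_name: str) -> list:
    """Return the `deps` labels declared for `target_name` in a BUILD file.

    Only literal lists are handled; a computed `deps` expression would make
    ast.literal_eval raise ValueError.
    """
    for node in ast.walk(ast.parse(build_text)):
        if not isinstance(node, ast.Call):
            continue
        keywords = {kw.arg: kw.value for kw in node.keywords}
        name = keywords.get("name")
        if isinstance(name, ast.Constant) and name.value == target_name:
            deps = keywords.get("deps")
            return [] if deps is None else list(ast.literal_eval(deps))
    raise KeyError(target_name)


def module_name(label: str) -> str:
    """Map //Modules/Foo, //Modules/Foo:Foo, or :Foo to Foo."""
    if ":" in label:
        return label.rsplit(":", 1)[-1]
    return label.rstrip("/").rsplit("/", 1)[-1]


def project_yml_dependencies(labels: list) -> str:
    """Render the labels as a dependencies fragment (assumed layout)."""
    lines = ["dependencies:"]
    lines += [f"  - target: {module_name(label)}" for label in labels]
    return "\n".join(lines) + "\n"


# Hypothetical BUILD file; the load path and labels are made up.
BUILD = '''
load("//rules:defs.bzl", "apple_library")

apple_library(
    name = "Chat",
    deps = ["//Modules/FoundationLineUtils", ":ChatModels"],
)
'''

print(project_yml_dependencies(deps_from_build(BUILD, "Chat")))
```

A pre-commit hook would run something like this for each staged BUILD file and stage the resulting project.yml alongside it.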

Results

Beta Build Times	Xcode	Bazel
Minimum	28.40	4.40
Maximum	35.42	26.53
Average	30.96	14.53

After switching to Bazel, we were able to achieve a huge improvement in the build times. This brought a significant improvement in the turn-around time during a QA period. Distributing a new build to our testers no longer means another hour waiting for building and testing.

Where to go from here

Although we are very happy with the results, some problems remain that we're still working on. Despite the help of remote caching, a few large targets in the build graph are still the bottleneck of the builds in many cases. There's still a lot to be done in terms of modularization to better leverage the remote cache.

We’re hiring! If you are an iOS engineer who gets excited about building amazing developer experience and contributing to the iOS ecosystem, come join the team!
