Compiling on release mode #1189

Because of the errors with linking S4TF to an arbitrary Swift executable (https://github.com/tensorflow/swift-apis/issues/1185#issuecomment-1008034396), I am currently very constrained in how I can test code that imports S4TF. For now, my only option is to replace the Swift package tests with the custom code I want to execute. Having to rebuild S4TF repeatedly is a bottleneck in my workflow.

I profiled S4TF build times on Google Colab (dual-core x64) and found some interesting results. Running `swift test` always re-compiles your code, even if you previously compiled it via `swift build`. There is only one exception: when both `swift build` and `swift test` are in debug mode, the redundant re-compilation is avoided. This speedup does not apply when both are in `-Onone` release mode, which is otherwise the configuration that compiles most quickly.
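For reference, the four pre-build/test combinations can be written out as command lines. This is a minimal sketch that only assembles the argument lists and prints them. It assumes that "release (`-Onone`)" means passing `-c release -Xswiftc -Onone` to SwiftPM, which the text above does not state explicitly; the helper name `swiftpm_command` is hypothetical.

```python
def swiftpm_command(action, release=False):
    """Return the argv list for `swift build` or `swift test`.

    Assumption: unoptimized release is requested with
    `-c release -Xswiftc -Onone`. Debug is SwiftPM's default configuration,
    so no extra flags are added for it.
    """
    if action not in ("build", "test"):
        raise ValueError("action must be 'build' or 'test'")
    argv = ["swift", action]
    if release:
        argv += ["-c", "release", "-Xswiftc", "-Onone"]
    return argv


# Enumerate every (pre-build, test) configuration pair.
for pre_release in (True, False):
    for test_release in (True, False):
        pre = " ".join(swiftpm_command("build", pre_release))
        test = " ".join(swiftpm_command("test", test_release))
        print(f"{pre} && {test}")
```

Each printed line corresponds to one inner bullet of the timing list: the first command is the pre-build and the second is the test build that follows it.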

- Pre-build as release (`-Onone`) (excluding tests): 1 min 51 sec
  - Build tests as release (`-Onone`): 2 min 29 sec (everything)
    - Extrapolated time if excluding tests: 1 min 50 sec
  - Build tests as debug: 3 min 50 sec
    - Extrapolated time if excluding tests: 3 min 0 sec
- Pre-build as debug (excluding tests): 3 min 0 sec
  - Build tests as release (`-Onone`): 2 min 48 sec (everything)
    - Extrapolated time if excluding tests: 2 min 7 sec
  - Build tests as debug: 57 sec
    - Extrapolated time if excluding tests: 0 sec
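The measurements above can be combined to compare the scenarios end to end. The sketch below copies the numbers from the list and derives two quantities per scenario: the time attributable to the tests alone (measured test build minus the extrapolated non-test portion) and the total of pre-build plus test build. Reading the difference as "tests only" is an interpretation of the extrapolated figures, not something the measurements state directly.

```python
def seconds(minutes, secs):
    """Convert a 'min sec' measurement to seconds."""
    return minutes * 60 + secs


# (pre-build config, test config) ->
#   (pre-build time, full test build time, extrapolated time excluding tests)
measurements = {
    ("release", "release"): (seconds(1, 51), seconds(2, 29), seconds(1, 50)),
    ("release", "debug"): (seconds(1, 51), seconds(3, 50), seconds(3, 0)),
    ("debug", "release"): (seconds(3, 0), seconds(2, 48), seconds(2, 7)),
    ("debug", "debug"): (seconds(3, 0), seconds(0, 57), seconds(0, 0)),
}


def summarize(data):
    """Derive tests-only and end-to-end times (in seconds) for each scenario."""
    rows = {}
    for key, (pre, test_total, non_test) in data.items():
        rows[key] = {
            "tests_only": test_total - non_test,
            "end_to_end": pre + test_total,
        }
    return rows


summary = summarize(measurements)
for (pre_cfg, test_cfg), row in summary.items():
    total = row["end_to_end"]
    print(
        f"{pre_cfg} pre-build, {test_cfg} tests: "
        f"tests only {row['tests_only']} s, "
        f"end to end {total // 60} min {total % 60} s"
    )
```

With these numbers, the tests themselves account for roughly 39 to 57 seconds in every scenario. Debug followed by debug has the shortest end-to-end time (3 min 57 sec, against 4 min 20 sec for release followed by release), while the release (`-Onone`) pre-build is the fastest single step at 1 min 51 sec.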

If I can find a way to import S4TF outside of its tests, compiling with unoptimized release seems to be the wisest option. That would take around 2 minutes. I could add a special command to [Swift-Colab](https://github.com/philipturner/swift-colab) that caches the Swift package build products folder. When you restart the runtime (I do that often), it would link against the build products instead of re-compiling. It would also cache the x10 binaries so you only download them from the network once. This Colab command would be implemented once there is a Swift toolchain that both runs S4TF and has the Python LLDB API.
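The caching idea could look roughly like the following. This is only a sketch of the copy step, not the Swift-Colab command itself: the function names and the cache location are hypothetical, and the only detail taken from SwiftPM is that build products live in the package's `.build` folder by default. Whether SwiftPM treats a restored folder as up to date after a runtime restart is untested here.

```python
import shutil
from pathlib import Path


def save_build_products(package_dir, cache_dir):
    """Copy the package's `.build` folder into a persistent cache directory."""
    source = Path(package_dir) / ".build"
    if not source.is_dir():
        raise FileNotFoundError(f"no build products at {source}")
    # copytree uses copy2 by default, so file modification times are preserved.
    shutil.copytree(source, Path(cache_dir) / ".build", dirs_exist_ok=True)


def restore_build_products(package_dir, cache_dir):
    """Copy cached build products back into the package.

    Returns True if a cache was found and restored, False otherwise.
    """
    cached = Path(cache_dir) / ".build"
    if not cached.is_dir():
        return False
    shutil.copytree(cached, Path(package_dir) / ".build", dirs_exist_ok=True)
    return True
```

After a runtime restart, a notebook cell would call `restore_build_products` first and fall back to a full build followed by `save_build_products` only when it returns `False`. The `dirs_exist_ok` argument requires Python 3.8 or later.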

I previously heard that there were some performance concerns with not compiling S4TF with full optimization. There are tight loops where using debug mode could cause a bottleneck, but where do these loops happen? If they are in CTensorFlow, then it doesn't matter how S4TF is compiled because CTensorFlow is pre-compiled in the x10 binary.

When I tried compiling S4TF in fully optimized release mode, I hit the compiler crash caused by BatchNorm, which is still unresolved. The crash logs are in the Colab notebooks attached below. The crash did not happen in release mode when the `-Onone` flag was set. Does that behavior reveal anything new about the bug?
[crash_no_tests.ipynb.zip](https://github.com/tensorflow/swift-apis/files/7835801/crash_no_tests.ipynb.zip)
[crash_with_tests.ipynb.zip](https://github.com/tensorflow/swift-apis/files/7835802/crash_with_tests.ipynb.zip)

I am compiling with the 2021-11-12 toolchain instead of the newest one (2022-01-06). Newer toolchains (starting with 2021-12-23, or possibly earlier) introduce a bug that prevents S4TF from compiling even in debug mode (https://github.com/tensorflow/swift-apis/pull/1184#issuecomment-1008006067).
