This repository was archived by the owner on Jul 1, 2023. It is now read-only.

Compiling on release mode #1189

@philipturner

Because of the errors with linking S4TF to an arbitrary Swift executable (#1185 (comment)), I am currently very constrained in how I can test code that imports S4TF. For now, my only option is to replace the Swift package tests with the custom code I want to execute. Having to rebuild S4TF repeatedly is a bottleneck in my workflow.

I profiled S4TF build times on Google Colab (dual-core x64) and found some interesting results. Running swift test always re-compiles your code, even if you previously compiled it via swift build. There is only one exception: when both swift build and swift test are in debug mode, the redundant re-compilation is avoided. This speedup does not apply when both are in release mode with -Onone, which is otherwise the option that compiles most quickly.
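For reference, here is a minimal sketch of how the four build configurations compared here could be spelled as SwiftPM command lines. The exact invocations used for the measurements are not recorded in this issue, so passing -Onone through -Xswiftc is an assumption, and swiftpm_command is a hypothetical helper that only builds the argument list without running anything.

```python
def swiftpm_command(action, configuration, unoptimized=False):
    """Return the argv list for a SwiftPM invocation (nothing is executed).

    Hypothetical helper: the issue does not record the exact commands used,
    so forwarding -Onone via -Xswiftc is an assumption.
    """
    if action not in ("build", "test"):
        raise ValueError(f"unsupported action: {action}")
    if configuration not in ("debug", "release"):
        raise ValueError(f"unsupported configuration: {configuration}")
    argv = ["swift", action, "-c", configuration]
    if unoptimized:
        # Forward -Onone to the Swift compiler to disable optimization.
        argv += ["-Xswiftc", "-Onone"]
    return argv


# Print the pre-build and test-build commands for both configurations.
for action in ("build", "test"):
    print(" ".join(swiftpm_command(action, "debug")))
    print(" ".join(swiftpm_command(action, "release", unoptimized=True)))
```
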

  • Pre-build as release (-Onone) (excluding tests): 1 min 51 sec
    • Build tests as release (-Onone): 2 min 29 sec (everything)
      • Extrapolated time if excluding tests: 1 min 50 sec
    • Build tests as debug: 3 min 50 sec
      • Extrapolated time if excluding tests: 3 min 0 sec
  • Pre-build as debug (excluding tests): 3 min 0 sec
    • Build tests as release (-Onone): 2 min 48 sec (everything)
      • Extrapolated time if excluding tests: 2 min 7 sec
    • Build tests as debug: 57 sec
      • Extrapolated time if excluding tests: 0 sec
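To compare the four combinations end to end, the measured times above can be added up. The sketch below only does arithmetic on the numbers listed in this issue; the names to_seconds, PRE_BUILD, and TEST_BUILD are made up for illustration.

```python
import re


def to_seconds(text):
    """Convert a duration such as '1 min 51 sec' or '57 sec' to seconds."""
    match = re.fullmatch(r"(?:(\d+) min )?(\d+) sec", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + int(match.group(2))


# Measured pre-build times (tests excluded), keyed by pre-build mode.
PRE_BUILD = {"release": "1 min 51 sec", "debug": "3 min 0 sec"}

# Measured test-build times, keyed by (pre-build mode, test-build mode).
TEST_BUILD = {
    ("release", "release"): "2 min 29 sec",
    ("release", "debug"): "3 min 50 sec",
    ("debug", "release"): "2 min 48 sec",
    ("debug", "debug"): "57 sec",
}

# Total wall time for a pre-build followed by a test build.
totals = {
    combo: to_seconds(PRE_BUILD[combo[0]]) + to_seconds(duration)
    for combo, duration in TEST_BUILD.items()
}

for (pre, test), seconds in sorted(totals.items(), key=lambda item: item[1]):
    print(f"pre-build {pre:7} + tests {test:7}: {seconds // 60} min {seconds % 60} sec")
```

With tests included, debug followed by debug is the cheapest pairing (3 min 57 sec in total), because it is the only one that reuses the pre-build. The pre-build alone is still fastest as release with -Onone (1 min 51 sec versus 3 min 0 sec).
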

If I can find a way to import S4TF outside of its tests, compiling in unoptimized release mode seems to be the wisest option. That would take around 2 minutes. I could add a special command to Swift-Colab that caches the Swift package's build products folder. When you restart the runtime (which I do often), it would link against the cached build products instead of re-compiling. It would also cache the x10 binaries so that you only download them from the network once. This Colab command would be implemented once there is a Swift toolchain that both runs S4TF and has the Python LLDB API.
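A rough sketch of what such a caching step might look like, assuming the build products live in the package's default SwiftPM .build directory. This is not an existing Swift-Colab command: save_build_products and restore_build_products are hypothetical names, and whether SwiftPM actually reuses products restored this way has not been verified here.

```python
import shutil
from pathlib import Path


def _replace_tree(src, dst):
    """Copy src over dst, keeping symlinks and file timestamps."""
    if not src.is_dir():
        return False
    if dst.exists():
        shutil.rmtree(dst)
    # copytree uses copy2 by default, which preserves modification times.
    shutil.copytree(src, dst, symlinks=True)
    return True


def save_build_products(package_dir, cache_dir):
    """Copy <package_dir>/.build into the cache (hypothetical layout)."""
    return _replace_tree(Path(package_dir) / ".build", Path(cache_dir) / "build")


def restore_build_products(package_dir, cache_dir):
    """Copy cached build products back into <package_dir>/.build."""
    return _replace_tree(Path(cache_dir) / "build", Path(package_dir) / ".build")
```

The cache directory would need to be somewhere that survives a runtime restart. The same pattern could cover the downloaded x10 binaries, although their location is not specified in this issue.
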

I previously heard that there were performance concerns about compiling S4TF without full optimization. There are tight loops where debug mode could cause a bottleneck, but where do these loops occur? If they are in CTensorFlow, then it doesn't matter how S4TF is compiled, because CTensorFlow is pre-compiled in the x10 binary.

When I tried compiling S4TF in fully optimized release mode, I got the compiler crash caused by BatchNorm, which is still unresolved. The crash logs are in the Colab notebooks attached below. The crash did not happen in release mode when the -Onone flag was set. Does that behavior reveal anything new about the bug?
crash_no_tests.ipynb.zip
crash_with_tests.ipynb.zip

I am compiling using the 2021-11-12 toolchain instead of the newest toolchain (2022-01-06). Newer toolchains (starting with 2021-12-23 or earlier) introduce a bug that prevents S4TF from compiling even in debug mode (#1184 (comment)).
