# Bridging Python and C/C++ Functions

Developers frequently need to incorporate custom operators into a machine learning framework. These operators implement new models, optimizers, data-processing functions, and more. Custom operators in particular are often implemented in C/C++ to achieve optimal performance, while also exposing Python interfaces so that developers can integrate them with existing machine learning workflows written in Python. This section delves into the implementation details of this process.

The Python interpreter, being implemented in C, enables the invocation of C and C++ functions within Python. Contemporary machine learning frameworks such as TensorFlow, PyTorch, and MindSpore rely on pybind11 to automatically generate Python functions from underlying C and C++ functions. This mechanism is known as *Python binding*. Before the advent of pybind11, Python binding was accomplished using one of the following approaches:

1. **Python C-API**: This approach requires including `Python.h` in C++ programs and using Python's C-API to perform Python operations. To work effectively with the C-API, developers must possess a solid understanding of Python's internals, such as reference counting.

2. **Simplified Wrapper and Interface Generator (SWIG)**: SWIG serves as a bridge between C/C++ code and Python, and it played a significant role in the early development of TensorFlow. Using SWIG involves crafting intricate interface declarations and relying on SWIG to automatically generate C code that interfaces with Python's C-API. Because the generated code is hard to read, its maintenance costs tend to be high.

3. **Python `ctypes` module**: This module provides the full range of C data types and allows direct invocation of dynamic link libraries. However, its heavy reliance on native C types leaves it with insufficient support for custom types.

4. **Cython**: In basic terms, Cython can be described as the fusion of Python syntax with static C types. It retains Python's syntax while compiling Cython functions into C/C++ code, enabling developers to seamlessly call C/C++ functions from Cython code.

5. **Boost.Python (a C++ library)**: Boost.Python exposes C++ functions as Python functions. It operates on principles similar to Python's C-API but provides a more user-friendly interface. However, its reliance on the Boost library introduces a significant third-party dependency, a potential drawback.
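
As a brief sketch of the `ctypes` approach, the snippet below calls the C standard library's `sqrt` directly from Python. It assumes a POSIX system where `ctypes.util.find_library` can locate the math library; it is an illustration added here, not one of the book's numbered code examples.

```python
import ctypes
import ctypes.util

# Locate and load the C math library (platform-dependent lookup)
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double)
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(9.0))  # 3.0
```

Note that every argument and return type must be spelled out with native C types such as `ctypes.c_double`; this is exactly the limitation described in approach 3 above.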

In comparison with the above Python binding approaches, pybind11 matches Boost.Python in simplicity and usability, but it stands out for its focus on supporting C++11 and for eliminating the dependency on Boost. As a lightweight library, pybind11 is particularly suitable for exposing numerous Python functions in complex C++ projects such as the machine learning system discussed in this book. The combination of Code `ch02/code2.5.1` and Code `ch02/code2.5.2` is an example of adding a custom operator to PyTorch by integrating C++ and Python.

In C++:
**ch02/code2.5.1**

```cpp
// custom_add.cpp
#include <torch/extension.h>
#include <pybind11/pybind11.h>

torch::Tensor custom_add(torch::Tensor a, torch::Tensor b) {
    return a + b;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
    m.def("custom_add", &custom_add, "A custom add function");
}
```
In Python:

**ch02/code2.5.2**

```python
import torch
from torch.utils.cpp_extension import load

# Compile and load the C++ extension at runtime
custom_extension = load(
    name='custom_extension',
    sources=['custom_add.cpp'],
    verbose=True
)

# Use the custom add function
a = torch.randn(10)
b = torch.randn(10)
c = custom_extension.custom_add(a, b)
```
# Functional Programming

In the following, we will discuss the reasons behind the growing trend of incorporating functional programming into the design of machine learning frameworks.

## Benefits of Functional Programming

Training is the most critical phase in machine learning, and how training is expressed hinges significantly on the optimizer algorithm. Contemporary machine learning tasks predominantly use first-order optimizers, favored for their ease of use. With machine learning advancing rapidly, and software and hardware incessantly updated to keep pace, a growing number of researchers are investigating higher-order optimizers, noted for their superior convergence. Frequently used second-order optimizers, such as the Newton method, quasi-Newton methods, and AdaHessian, require computing a Hessian matrix that carries second-order derivative information. Two considerable challenges arise from this computation: 1) how to manage such a heavy computational load efficiently; 2) how to express higher-order derivatives in a programming language.
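
As a one-dimensional illustration of why second-order information matters, Newton's method rescales the gradient step by the second derivative. The `newton_minimize` helper below is a hypothetical sketch in plain Python, not taken from any framework discussed in this book:

```python
def newton_minimize(f_prime, f_double_prime, x0, steps=10):
    # Newton's method for minimization: x <- x - f'(x) / f''(x)
    x = x0
    for _ in range(steps):
        x -= f_prime(x) / f_double_prime(x)
    return x

# Minimize f(x) = (x - 3)^2 + 1, so f'(x) = 2(x - 3) and f''(x) = 2
x_star = newton_minimize(lambda x: 2 * (x - 3), lambda x: 2.0, x0=0.0)
print(x_star)  # 3.0: a quadratic converges in a single Newton step
```

In many dimensions `f_double_prime` becomes the Hessian matrix, which is exactly where the computational and expressiveness challenges above arise.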

In recent years, numerous large AI models have been introduced, including (with parameter counts in parentheses) OpenAI's GPT-3 (175B) in 2020; PanGu (100B), PanGu-$\alpha$ (200B), Google's Switch Transformer (1.6T), and WuDao (1.75T) in 2021; and Meta's NLLB-200 (54B) in 2022. The demand for ultra-large model training keeps escalating, and data parallelism alone cannot meet it; model parallelism, conversely, demands manual model partitioning, a time-consuming and laborious process. Consequently, the main challenge future machine learning frameworks must overcome is how to realize automatic parallelism. At its core, a machine learning model is a representation of a mathematical model. Hence, the ability to represent machine learning models succinctly has become a key concern in the design of programming paradigms for machine learning frameworks.

Recognizing these practical challenges in implementing machine learning frameworks, researchers have identified functional programming as a promising solution. In computer science, functional programming is a paradigm that treats computation as the evaluation of mathematical functions, actively avoiding state changes and mutable data. This paradigm harmonizes well with mathematical reasoning. Neural networks are composed of interconnected nodes, each performing basic mathematical operations. Functional programming languages let developers express these operations in a form that closely mirrors the underlying mathematics, enhancing the readability and maintainability of programs. At the same time, functions in functional languages are isolated from shared state, simplifying the management of concurrency and parallelism.
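
As a tiny illustration of the no-mutation style described above, the snippet below computes a sum of squares twice: imperatively, by mutating an accumulator, and functionally, as a single pure expression. This is a plain-Python sketch added for illustration, not from the original text:

```python
from functools import reduce

xs = [1.0, 2.0, 3.0]

# Imperative style: repeatedly mutates the accumulator `total`
total = 0.0
for x in xs:
    total += x * x

# Functional style: one pure expression, no variable is mutated
total_fn = reduce(lambda acc, x: acc + x * x, xs, 0.0)

print(total, total_fn)  # 14.0 14.0
```

Because the functional version has no mutable state, independent pieces of such a computation can be evaluated concurrently without coordination.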

In summary, functional programming is anticipated to confer the following benefits on machine learning frameworks:

1. It suits machine learning scenarios where higher-order derivatives are needed.

2. It simplifies the development of parallel programming interfaces.

3. It yields a more concise code representation.
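
To see benefit 1 in miniature: when a derivative is itself an ordinary function that can be passed around, higher-order derivatives fall out by simple composition. The `derivative` helper below is a hypothetical sketch using central finite differences, not an API of any framework discussed here:

```python
def derivative(f, h=1e-4):
    # Return a new function approximating f' via central differences.
    # Because the result is again a plain function, the transform composes.
    return lambda x: (f(x + h) - f(x - h)) / (2.0 * h)

f = lambda x: x ** 3
df = derivative(f)               # approximates 3x^2
d2f = derivative(derivative(f))  # approximates 6x, by composition

print(df(2.0), d2f(2.0))  # approximately 12.0 and 12.0
```

Frameworks such as JAX apply the same function-to-function idea with exact automatic differentiation (`jax.grad(jax.grad(f))`) rather than numerical approximation.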

## Framework Support for Functional Programming

Machine learning frameworks increasingly support functional programming. In 2018, Google rolled out JAX. In contrast to traditional machine learning frameworks, JAX unifies neural network computation and numerical computation. Its interfaces are compatible with Python's native data science libraries, such as NumPy and SciPy. Moreover, JAX provides distribution, vectorization, higher-order differentiation, and hardware acceleration in a functional programming style, characterized by lambda closures and the absence of side effects.

In 2020, Huawei introduced MindSpore, whose functional differential programming architecture allows users to concentrate on the native mathematical expression of machine learning models. In 2022, taking inspiration from Google's JAX, PyTorch launched functorch, a library providing composable vmap (vectorization) and autodiff transforms that work with PyTorch modules and PyTorch autograd while preserving good eager-mode performance. Functorch thereby moves toward meeting the requirements of distributed parallelism with PyTorch static graphs. Code `ch02/code2.4` gives an example of functorch.
**ch02/code2.4**

```python
from functorch import combine_state_for_ensemble, vmap

# MLP, device, data, and num_models are assumed to be defined elsewhere
minibatches = data[:num_models]
models = [MLP().to(device) for _ in range(num_models)]
fmodel, params, buffers = combine_state_for_ensemble(models)
# Vectorize the stateless model across the ensemble of parameter sets
predictions1_vmap = vmap(fmodel, out_dims=1)(params, buffers, minibatches)
```

Functorch introduces *vmap*, which stands for "vectorized map". Its role is to adapt functions designed for individual inputs so that they can handle batches of inputs, thereby enabling efficient vectorized computation. Unlike the batch processing capabilities of standard PyTorch modules, vmap can make any operation batch-aware without altering the operation's original structure. Moreover, vmap offers greater flexibility over batch dimensions, allowing users to specify which dimension should be treated as the batch dimension (via the `out_dims` argument), in contrast to standard PyTorch's default behavior of treating the first dimension as the batch dimension.
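
To make this concrete without requiring functorch, the hypothetical helper below emulates the idea with NumPy: it lifts a function written for a single input into one that maps over a chosen batch axis and stacks the results along a chosen output axis, loosely mirroring vmap's `in_dims`/`out_dims` arguments:

```python
import numpy as np

def vmap_like(f, in_axis=0, out_axis=0):
    # Lift f (written for one example) to operate over a batch axis
    def batched(x):
        outs = [f(xi) for xi in np.moveaxis(x, in_axis, 0)]
        return np.stack(outs, axis=out_axis)
    return batched

dot_self = lambda v: v @ v        # defined for a single vector
batch = np.ones((4, 3))           # a batch of 4 vectors
result = vmap_like(dot_self)(batch)
print(result)  # [3. 3. 3. 3.]
```

The real vmap achieves the same effect without the Python-level loop, compiling the per-example function into a genuinely vectorized operation.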

Tracing the development of machine learning frameworks makes it evident that the functional programming paradigm has become increasingly popular. This can be attributed to functional programming's ability to express machine learning models intuitively and its convenience for implementing automatic differentiation, higher-order derivation, and parallel execution. Consequently, future machine learning frameworks are likely to adopt layered frontend interfaces that are not exclusively designed for machine learning scenarios. Instead, they will primarily offer differential programming in their abstraction designs, making gradient-based software easy to develop for various applications.
