[v1.1][ISSUE-274] refine ResNet and MiniGo demo and resize pictures (#288)

zigzagcai · web-flow · commit c176a12aeb67 · 2023-04-26T15:27:47.000-05:00
* refine

* refine

* refine

* refine
diff --git a/demo/builtin/minigo/MiniGo_DEMO.ipynb b/demo/builtin/minigo/MiniGo_DEMO.ipynb
@@ -12,32 +12,16 @@
   {
    "attachments": {},
    "cell_type": "markdown",
-   "id": "7e90b29d",
+   "id": "a358045c",
    "metadata": {},
    "source": [
-    "# Content\n",
-    "* [Overview](#Overview)\n",
-    "    * [Model Architecture](#Model-Architecture)\n",
-    "    * [Optimizations](#Optimizations)\n",
-    "    * [Performance](#Performance)\n",
-    "* [Getting Started](#Getting-Started)\n",
-    "    * [1. Environment Setup](#1.-Environment-Setup)\n",
-    "    * [2. Workflow Prepare](#2.-Workflow-Prepare)\n",
-    "    * [3. Data Prepare](#3.-Data-Prepare)\n",
-    "    * [4. Train](#4.-Train)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "60ba275a",
-   "metadata": {},
-   "source": [
-    "## MiniGo"
+    "## MiniGo DEMO"
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
-   "id": "b004c1af",
+   "id": "0ed0530e",
    "metadata": {},
    "source": [
     "MiniGo is an opensource minimalist Go engine modeled after AlphaGo Zero, which is a system that learns how to play Go at a superhuman level given only the rules of the game.\n",
@@ -50,6 +34,24 @@
     "Reference: https://www.newyorker.com/science/elements/how-the-artificial-intelligence-program-alphazero-mastered-its-games"
    ]
   },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "id": "7e90b29d",
+   "metadata": {},
+   "source": [
+    "# Content\n",
+    "* [Overview](#Overview)\n",
+    "    * [Model Architecture](#Model-Architecture)\n",
+    "    * [Optimizations](#Optimizations)\n",
+    "    * [Performance](#Performance)\n",
+    "* [Getting Started](#Getting-Started)\n",
+    "    * [1. Environment Setup](#1.-Environment-Setup)\n",
+    "    * [2. Workflow Prepare](#2.-Workflow-Prepare)\n",
+    "    * [3. Data Prepare](#3.-Data-Prepare)\n",
+    "    * [4. Train](#4.-Train)"
+   ]
+  },
   {
    "attachments": {},
    "cell_type": "markdown",
@@ -59,13 +61,13 @@
     "# Overview\n",
     "## Model Architecture\n",
     "### Overall train loop architecture of MiniGo\n",
-    "<img src=\"./img/minigo_model_arch.png\" width=\"800\"/><figure>MiniGo Model Architecture</figure>\n",
+    "<img src=\"./img/minigo_model_arch.png\" width=\"600\"/><figure>MiniGo Model Architecture</figure>\n",
     "\n",
     "### Deep network architecture based on residual blocks\n",
-    "<img src=\"./img/minigo_nn_arch.png\" width=\"800\"/><figure>Neural Network Architecture</figure>\n",
+    "<img src=\"./img/minigo_nn_arch.png\" width=\"600\"/><figure>Neural Network Architecture</figure>\n",
     "\n",
     "### MCTS architecture (Select->Expand->Evaluate->Backup/Backpropagation)\n",
-    "<img src=\"./img/minigo_mcts_arch.png\" width=\"800\"/><figure>Monte Carlo Tree Search in MiniGo</figure>"
+    "<img src=\"./img/minigo_mcts_arch.png\" width=\"600\"/><figure>Monte Carlo Tree Search in MiniGo</figure>"
    ]
   },
   {
@@ -283,6 +285,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "92f794fe",
    "metadata": {},
@@ -292,7 +295,7 @@
     "Remark:\n",
     "* Use rsync to synchronize signal, trained model, and mcts generated datset\n",
     "* Enable numa binding to fully utilize all physical cores in the cluster\n",
-    "<img src=\"./img/minigo_optimized_system_arch.JPG\" width=\"800\"/><figure>Optimized MiniGo System Architecture</figure>\n",
+    "<img src=\"./img/minigo_optimized_system_arch.JPG\" width=\"600\"/><figure>Optimized MiniGo System Architecture</figure>\n",
     "\n",
     "### Enable early stop during the train loop to leverage fast converge\n",
     "Remark:\n",
@@ -311,9 +314,9 @@
     "** num_readouts=600\n",
     "** fastplay_readouts=60\n",
     "* MCTS performance without finetune:\n",
-    "<img src=\"./img/mcts_baseline_speed.JPG\" width=\"800\"/><figure>Baseline MCTS</figure>\n",
+    "<img src=\"./img/mcts_baseline_speed.JPG\" width=\"600\"/><figure>Baseline MCTS</figure>\n",
     "* MCTS performance with finetune:\n",
-    "<img src=\"./img/mcts_tuned_speed.JPG\" width=\"800\"/><figure>Finetuned MCTS</figure>\n"
+    "<img src=\"./img/mcts_tuned_speed.JPG\" width=\"600\"/><figure>Finetuned MCTS</figure>\n"
    ]
   },
   {
@@ -324,7 +327,7 @@
    "source": [
     "## Performance\n",
     "\n",
-    "<img src=\"./img/minigo_perf.png\" width=\"900\"/>\n",
+    "<img src=\"./img/minigo_perf.png\" width=\"600\"/>\n",
     "\n",
     "* Distributed training with HW scaling delivered 3.57 speedup from 1 node to 4 nodes\n",
     "* Parallel selfplay and enable early stop delivered 2.50x speedup, and 8.92x speedup over baseline\n",
@@ -364,6 +367,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
     "%%bash\n",
+    "# Note: MiniGo runs only in a bare-metal environment, so no Docker option is provided\n",
     "# prepare model codes\n",
     "bash workflow_prepare_minigo.sh\n",
@@ -484,6 +488,8 @@
    "metadata": {},
    "source": [
     "## 4. Train\n",
+    "Note: The performance results below use a sample dataset and a small number of iterations to demonstrate functionality. For the actual performance results, refer to the [Performance section](#Performance).\n",
+    "\n",
     "Edit config file to control SDA process"
    ]
   },
diff --git a/demo/builtin/resnet/RESNET_DEMO.ipynb b/demo/builtin/resnet/RESNET_DEMO.ipynb
@@ -9,10 +9,15 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# RESNET Demo"
+    "# RESNET Demo\n",
+    "ResNet, short for Residual Network, is a type of neural network introduced in the paper “Deep Residual Learning for Image Recognition”.\n",
+    "\n",
+    "* original source\n",
+    "    * Source repo: https://github.com/mlcommons/training_results_v1.0/tree/master/Intel/benchmarks/resnet/2-nodes-16s-8376H-tensorflow"
    ]
   },
   {
@@ -32,15 +37,6 @@
     "    * [4. Train](#4.-Train)"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Image Classification\n",
-    "* Classify different images into different categories to achieve the smallest classification error. Image classification is a supervised learning problem: define a set of target classes (objects to identify in images), and train a model to recognize them using labeled example photos.\n",
-    "![image_classification.png](./img/image_classification.png)\n"
-   ]
-  },
   {
    "attachments": {},
    "cell_type": "markdown",
@@ -50,19 +46,22 @@
     "\n",
     "## Model Architecture\n",
     "\n",
+    "### Image Classification\n",
+    "* Classify different images into different categories to achieve the smallest classification error. Image classification is a supervised learning problem: define a set of target classes (objects to identify in images), and train a model to recognize them using labeled example photos.\n",
+    "<div><img src=\"./img/image_classification.png\" alt=\"image_classification.png\" width=\"600\"></div>\n",
+    "\n",
     "### VGG Neural Networks\n",
     "* VGG stands for Visual Geometry Group; it is a standard deep Convolutional Neural Network (CNN) architecture with multiple layers. The “deep” refers to the number of layers with VGG-16 or VGG-19 consisting of 16 and 19 convolutional layers. \n",
     "* The VGG architecture is the basis of ground-breaking object recognition models. Developed as a deep neural network, the VGGNet also surpasses baselines on many tasks and datasets beyond ImageNet. Moreover, it is now still one of the most popular image recognition architectures.\n",
     "\n",
-    "\n",
-    "![resnet.png](./img/resnet.png)\n",
+    "<div><img src=\"./img/resnet.png\" alt=\"resnet.png\"></div>\n",
     "### Deep Residual Learning for Image Recognition(ResNet)\n",
     "Deep residual networks like the popular ResNet-50 model is a convolutional neural network (CNN) that is 50 layers deep. A Residual Neural Network (ResNet) is an Artificial Neural Network (ANN) of a kind that stacks residual(shown as below image) blocks on top of each other to form a network.\n",
     "\n",
     "A residual network is a stack of many residual blocks. Regular design, like VGG: each residual block has two 3x3 conv. \n",
     "The Network is divided into stages: the first block of each stage halves the resolution (with stride-2 conv) and doubles the number of channels\n",
     "\n",
-    "![residual.png](./img/residual.png)"
+    "<div><img src=\"./img/residual.png\" alt=\"residual.png\"></div>"
    ]
   },
   {
@@ -122,7 +121,7 @@
    "source": [
     "## Performance\n",
     "\n",
-    "<img src=\"./img/resnet_perf.png\" width=\"900\"/>\n",
+    "<img src=\"./img/resnet_perf.png\" width=\"600\"/>\n",
     "\n",
     "* Distributed training with HW scaling delivered 3.84x speedup from 1 node to 4 nodes\n",
     "* HPO (hyper parameter optimization) with SDA (a component of AIOK, smart democratization advisor) delivered 1.21x speedup, and 4.63x speedup over baseline\n",
@@ -152,7 +151,7 @@
    "metadata": {},
    "source": [
     "## 1. Environment Setup\n",
-    "### Option 1 Setup Environment with Pip\n",
+    "### (Option 1) Use Pip Install\n",
     "pre-work: move e2eAIOK source code to /home/vmagent/app/e2eaiok."
    ]
   },
@@ -178,7 +177,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Option 2 Setup Environment with Docker\n",
+    "### (Option 2) Use Docker\n",
     "\n",
     "Step1. prepare code\n",
     "``` bash\n",
@@ -267,6 +266,8 @@
    "metadata": {},
    "source": [
     "## 4. Train\n",
+    "Note: The performance results below use a sample dataset and a small number of iterations to demonstrate functionality. For the actual performance results, refer to the [Performance section](#Performance).\n",
+    "\n",
     "Edit config file to control SDA process"
    ]
   },