|
16 | 16 | **2025.8.12**: Implemented an inference-only llama3 (6-layer Transformer, vocab-size=32000), based on the NumPy implementation and dataset from [here](https://github.com/likejazz/llama3.np). Download the dataset into the `llama` folder to run it: |
17 | 17 |
|
18 | 18 | ```bash |
19 | | ->>> python .\llama\infer.py |
| 19 | +>>> python -m llama.infer |
20 | 20 | There was a boy named Timmy. He loved to play with hi toy and run around outside. One day, Timmy' mom asked him to help her with the laundry. Timmy didn't want to help because he wanted to play. But hi mom said, "Timmy, you need to help me. It' important to help out." |
21 | 21 | Timmy didn't want to help, but he knew he had to. So, he put on hi shoe and went outside to help hi mom. A they were folding the clothe, Timmy saw a big pile of laundry on the floor. He wanted to help, so he started to pick it up. But then, he accidentally knocked over a pile of clothe and they fell on him. Timmy wa okay, but he felt bad. |
22 | 22 | Hi mom saw what happened and said, "Timmy, you need to be more careful. You could have hurt yourself." Timmy felt bad and said sorry. Hi mom hugged him and said, "It' okay, accident happen. Let' clean up the laundry together." Timmy learned that it' important to be careful and help out when you need it. |
@@ -54,58 +54,58 @@ python setup.py install |
54 | 54 |
|
55 | 55 | ## Example |
56 | 56 |
|
57 | | -The [examples](./examples/) folder contains some examples. Run `python examples/XXX.py` to try them: |
| 57 | +[examples/pydynet](./examples/pydynet) contains some examples, and [examples/pytorch](./examples/pytorch) provides equivalent PyTorch implementations. Run `python examples.pydynet.xxx` to try them: |
58 | 58 |
|
59 | 59 | ### AutoDiff |
60 | 60 |
|
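61 | | -[autodiff1d.py](examples/autodiff1d.py) uses automatic differentiation to run gradient descent on a one-dimensional convex function: |
| 61 | +[autodiff1d.py](examples/pydynet/autodiff1d.py) uses automatic differentiation to run gradient descent on a one-dimensional convex function: |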
62 | 62 |
|
63 | 63 | <img src="imgs/ad1d.png" alt="ad1" style="zoom:67%;" /> |
64 | 64 |
|
65 | | -And an example with a multivariate convex function: [autodiff2d.py](examples/autodiff2d.py) |
| 65 | +And an example with a multivariate convex function: [autodiff2d.py](examples/pydynet/autodiff2d.py) |
66 | 66 |
|
67 | 67 | <img src="imgs/ad2d.png" alt="ad2" style="zoom:67%;" /> |
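The idea behind the AutoDiff examples (build a computation graph, backpropagate, take a gradient step) can be sketched in plain Python. This is a minimal illustration only: the `Value` class, the `minimize` helper, and the test function `f` are hypothetical names introduced here and are not PyDyNet's API.

```python
class Value:
    """Scalar node in a computation graph (hypothetical; not PyDyNet's tensor type)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents      # nodes this value was computed from
        self._grad_fns = grad_fns    # maps upstream grad to each parent's grad

    @staticmethod
    def _wrap(x):
        return x if isinstance(x, Value) else Value(x)

    def __add__(self, other):
        other = Value._wrap(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = Value._wrap(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def __sub__(self, other):
        return self + Value._wrap(other) * -1.0

    def backward(self):
        # Post-order DFS gives a topological order; walk it in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, fn in zip(node._parents, node._grad_fns):
                parent.grad += fn(node.grad)


def minimize(f, x0, lr=0.1, steps=100):
    """Plain gradient descent on a scalar function built from Value ops."""
    x = Value(x0)
    for _ in range(steps):
        x.grad = 0.0          # clear the gradient from the previous step
        f(x).backward()       # fills x.grad with df/dx
        x.data -= lr * x.grad
    return x.data


def f(x):
    # Convex test function (an assumption for illustration): (x - 3)^2 + 1
    d = x - 3.0
    return d * d + 1.0


x_min = minimize(f, x0=-2.0)
```

With this step size each iteration shrinks the distance to the minimizer by a constant factor, so `x_min` ends up very close to 3.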
68 | 68 |
|
69 | 69 | ### MLP & LeNet |
70 | 70 |
|
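71 | | -[mlp_cnn.py](examples/mlp_cnn.py) classifies MNIST with a fully connected network (three layers + residual connections) and LeNet. Training and test accuracy: |
| 71 | +[mlp_cnn.py](examples/pydynet/mnist.py) classifies MNIST with an MLP and LeNet. Training and test accuracy: |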
72 | 72 |
|
73 | 73 | <img src="imgs/mlp_cnn.png" alt="dnn" style="zoom:67%;" /> |
74 | 74 |
|
75 | 75 | ### Dropout & BN |
76 | 76 |
|
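77 | | -[mlp_dropout_bn.py](examples/mlp_dropout_bn.py) classifies the `fetch_olivetti_faces` face dataset (64×64 images) with three networks and compares their performance: |
| 77 | +[mlp_dropout_bn.py](examples/pydynet/dropout_bn.py) classifies the `fetch_olivetti_faces` face dataset (64×64 images) with three networks and compares their performance: |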
78 | 78 |
|
79 | 79 | 1. Three-layer MLP; |
80 | 80 | 2. Three-layer MLP + Dropout; |
81 | 81 | 3. Three-layer MLP + BatchNormalization. |
82 | 82 |
|
83 | 83 | Comparison of learning performance: |
84 | 84 |
|
85 | | -<img src="imgs/dropout_BN.png" alt="cnn" style="zoom:67%;" /> |
| 85 | +<img src="imgs/dropout_bn.png" alt="cnn" style="zoom:67%;" /> |
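For readers unfamiliar with the two techniques being compared, the forward passes of (inverted) dropout and batch normalization can be sketched in plain Python. This is a rough illustration under stated assumptions: the function names `dropout` and `batch_norm` are hypothetical, the inputs are plain lists rather than tensors, and the running statistics that a real BatchNorm layer keeps for inference are omitted.

```python
import random


def dropout(x, p=0.5, training=True, rng=random):
    """Inverted dropout on a list of activations.

    During training each value is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p), so the expected activation is
    unchanged. At inference time the input passes through untouched.
    """
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) of a batch to zero mean, unit variance.

    `batch` is a list of rows; gamma and beta are scalar scale and shift
    parameters here for simplicity (per-feature in real layers).
    """
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n
                 for col, m in zip(columns, means)]
    return [
        [gamma * (v - m) / (var + eps) ** 0.5 + beta
         for v, m, var in zip(row, means, variances)]
        for row in batch
    ]


# Usage: two features on very different scales end up comparable.
normalized = batch_norm([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
dropped = dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=random.Random(0))
```

Dropout regularizes by randomly silencing units during training, while batch normalization rescales activations per feature; the figure above compares how each affects learning on this dataset.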
86 | 86 |
|
87 | 87 | ### RNN |
88 | 88 |
|
89 | | -[rnn_sin.py](examples/ts_prediction.py) is an example of time-series prediction with a GRU: |
| 89 | +[rnn_sin.py](examples/pydynet/ts_prediction.py) is an example of time-series prediction with a GRU: |
90 | 90 |
|
91 | 91 | <img src="imgs/rnn.png" alt="RNN" style="zoom:67%;" /> |
92 | 92 |
|
93 | 93 | ### Transformer |
94 | 94 |
|
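95 | | -[transformer.py](examples/transformer.py) is an example of training a text classification model with a Transformer. Training results: |
| 95 | +[transformer.py](examples/pydynet/transformer.py) is an example of training a text classification model with a Transformer. Training results: |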
96 | 96 |
|
97 | 97 | <img src="imgs/transformer.png" alt="transformer" style="zoom:67%;" /> |
98 | 98 |
|
99 | 99 | > Dataset (CoLA) link: <https://nyu-mll.github.io/CoLA/cola_public_1.1.zip> |
100 | 100 |
|
101 | 101 | ## CUDA acceleration |
102 | 102 |
|
103 | | -Training speed on CPU and GPU with a training batch size of 128 and a test batch size of 512: |
| 103 | +Training speed on CPU and GPU with a training batch size of 256 and a test batch size of 1024: |
104 | 104 |
|
105 | | -| Net | Dataset | CPU time (s) per Epoch | GPU time (s) per Epoch | |
| 105 | +| Network structure | Dataset | CPU time (s) per epoch | GPU time (s) per epoch | |
106 | 106 | | :-----------------: | :---------------: | :--------------------: | :--------------------: | |
107 | | -| ResidualMLP | MNIST (80000×574) | 20.256±0.138 | 2.903±.018 | |
108 | | -| LeNet | MNIST (80000×574) | 239.664±2.108 | 10.148±0.026 | |
109 | | -| 1-layer Transformer | CoLA (8551×45×64) | 17.503±0.251 | 1.125±0.002 | |
| 107 | +| 3-layer MLP | MNIST (80000×574) | 7.256±0.138 | 1.203±.0181 | |
| 108 | +| LeNet | MNIST (80000×574) | 239.664±2.108 | 2.841±0.026 | |
| 109 | +| 1-layer Transformer (dim=512, head=4) | CoLA (8551×45×64) | 17.503±0.251 | 1.075±0.002 | |
110 | 110 |
|
111 | | -Device: Nvidia GeForce RTX 3090. |
| 111 | +Device: Nvidia GeForce RTX 4090. |
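Per-epoch figures in the mean±std form used in the table can be collected with a small timing helper. The sketch below is an assumption about how such numbers might be gathered, not the script used for the table: `time_epochs`, `format_timing`, and the `run_epoch` callable are hypothetical names. On a GPU, kernel launches are typically asynchronous, so `run_epoch` would need to synchronize the device before returning for the measurement to be meaningful.

```python
import statistics
import time


def time_epochs(run_epoch, n_epochs=5):
    """Run `run_epoch` n_epochs times; return (mean, sample std) in seconds."""
    durations = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        run_epoch()  # hypothetical callable that trains for one epoch
        durations.append(time.perf_counter() - start)
    # statistics.stdev needs at least two samples.
    return statistics.mean(durations), statistics.stdev(durations)


def format_timing(mean, std):
    """Format a timing pair as mean±std with three decimals."""
    return f"{mean:.3f}±{std:.3f}"


# Usage with a stand-in workload in place of a real training epoch.
mean_s, std_s = time_epochs(lambda: sum(range(10_000)), n_epochs=3)
summary = format_timing(mean_s, std_s)
```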