This repository was archived by the owner on Jan 24, 2024. It is now read-only.
1768 commits
44ebb91
fix op rand
qq332982511 Sep 3, 2018
faefdf1
add cast cpu op
chenjiaoAngel Sep 3, 2018
82d4945
Merge pull request #76 from PaddlePaddle/dev_v2
chenjiaoAngel Sep 3, 2018
ff27f79
Merge pull request #370 from yiicy/dev_v2_unpool
Sep 3, 2018
fc81733
Merge branch 'dev_v2' into dev_v2
Sep 3, 2018
c2c2e95
Merge branch 'dev_v2' into mvn_x86
Sep 3, 2018
30623ab
Merge branch 'dev_v2' into dev_v2
MyPandaShaoxiang Sep 3, 2018
5400f4d
Merge branch 'dev_v2' into dev2
Sep 3, 2018
fcdc0bb
Merge branch 'dev_v2' into dev_v3
Sep 3, 2018
b276f85
update x86 jit conv
Sep 3, 2018
fb398d2
Merge branch 'dev_v2' into intel_conv
Sep 3, 2018
d4f4962
add x86 power impl
MyPandaShaoxiang Sep 3, 2018
2c37ac9
add nv deconv
qq332982511 Sep 3, 2018
b368bb8
Merge pull request #386 from mengkai94/dev_v2
Sep 3, 2018
1a06f25
Merge branch 'dev_v2' into intel_conv
Sep 3, 2018
177ac62
Merge branch 'dev_v2' of https://github.com/PaddlePaddle/Anakin into …
qq332982511 Sep 3, 2018
9a39887
add x86_normalize impl
MyPandaShaoxiang Sep 3, 2018
cabcca9
landmark: improve in global memory management.
pangge Sep 3, 2018
2054005
fix x86_saber_normalize.h
MyPandaShaoxiang Sep 3, 2018
6ddc170
fix x86_saber_power.h
MyPandaShaoxiang Sep 3, 2018
dde71a8
Merge branch 'dev_v2' into dev_v2
MyPandaShaoxiang Sep 3, 2018
4163372
add x86_permute impl
MyPandaShaoxiang Sep 3, 2018
5bd9244
Merge branch 'dev_v2' into dev2
chenjiaoAngel Sep 3, 2018
4d1940a
Merge branch 'dev_v2' into dev_v2
chenjiaoAngel Sep 3, 2018
70446b6
Merge branch 'dev_v2' into dev_v3
chenjiaoAngel Sep 3, 2018
a987087
add x86 permute_power impl
MyPandaShaoxiang Sep 3, 2018
89adbed
fix x86 permute
MyPandaShaoxiang Sep 3, 2018
6fbc3a6
fix some saber conv bugs
Sep 3, 2018
636e422
add eltwise act
qq332982511 Sep 3, 2018
1115e3c
add crf_decoding gpu op
chenjiaoAngel Sep 3, 2018
8430c57
Merge branch 'dev_v2' into dev_v4
chenjiaoAngel Sep 4, 2018
10b53bd
Merge pull request #7 from throneclay/new_pangge
pangge Sep 4, 2018
41b6178
add permute_power x86 impl
MyPandaShaoxiang Sep 4, 2018
b12b51c
deconv only for gpu
qq332982511 Sep 4, 2018
3c14cfe
trigger CI
qq332982511 Sep 4, 2018
9e59f04
add pooling_with_index x86 impl
MyPandaShaoxiang Sep 4, 2018
265b3ae
fix x86 permute_power.cpp
MyPandaShaoxiang Sep 4, 2018
592aa53
add reshape x86 impl
MyPandaShaoxiang Sep 4, 2018
af5c3cd
milestone: successfully running on yolo_camera
pangge Sep 4, 2018
ca6eefc
rm useless files.
pangge Sep 4, 2018
f87eabf
add support changed weights used in saved graph.
pangge Sep 4, 2018
b58abb5
axpy
chenjiaoAngel Sep 5, 2018
af7b99b
milestone: improve some designation
pangge Sep 5, 2018
9167290
Merge branch 'dev_v2' into dev_v2
pangge Sep 5, 2018
784ff9a
Merge branch 'dev_v2' of https://github.com/PaddlePaddle/Anakin into …
pangge Sep 5, 2018
d96f712
Merge branch 'dev_v2' of https://github.com/pangge/Anakin into dev_v2
pangge Sep 5, 2018
08d042c
Merge branch 'dev_v2' into mvn_x86
xyoungli Sep 5, 2018
cd7483c
Merge pull request #395 from MyPandaShaoxiang/dev_v2
cyj1986 Sep 5, 2018
7fff43c
Merge remote-tracking branch 'upstream/dev_v2' into x86_normalize
MyPandaShaoxiang Sep 5, 2018
6bf88e8
fix saber_normalize.h
MyPandaShaoxiang Sep 5, 2018
5b73e54
Merge remote-tracking branch 'upstream/dev_v2' into x86_permute
MyPandaShaoxiang Sep 5, 2018
3f8feb2
Merge remote-tracking branch 'upstream/dev_v2' into x86_permute_power
MyPandaShaoxiang Sep 5, 2018
3e2f1e9
fix x86_permute_power impl'
MyPandaShaoxiang Sep 5, 2018
8e3f6a2
Merge remote-tracking branch 'upstream/dev_v2' into x86_pooling_with_…
MyPandaShaoxiang Sep 5, 2018
f0c17f4
Merge remote-tracking branch 'upstream/dev_v2' into x86_power
MyPandaShaoxiang Sep 5, 2018
503a0bf
Merge remote-tracking branch 'upstream/dev_v2' into x86_reshape
MyPandaShaoxiang Sep 5, 2018
b3cd55d
Merge branch 'dev_v2' into layer_norm_x86
Jayoprell Sep 5, 2018
7e0100e
Merge branch 'dev_v2' into mvn_x86
Jayoprell Sep 5, 2018
e904464
Merge branch 'dev_v2' into im2sequence_x86
Jayoprell Sep 5, 2018
07b8c3f
update im2col conv
Sep 5, 2018
9bb0e76
Merge branch 'dev_v2' into intel_conv
Sep 5, 2018
da47ff9
Merge pull request #408 from throneclay/intel_conv
cyj1986 Sep 6, 2018
c1f027f
element merge chaowen
qq332982511 Sep 6, 2018
1016a09
Merge branch 'dev_v2' into dev_v2_scale
Sep 6, 2018
07f4df3
element merge chaowen
qq332982511 Sep 6, 2018
6176789
change int16 to half
qq332982511 Sep 6, 2018
0b6bec4
Merge branch 'dev_v2' into x86_power
Sep 6, 2018
7bbb4f1
Merge branch 'dev_v2' into dev_v3
Sep 6, 2018
429bd41
Merge branch 'dev_v2' into dev2
Sep 6, 2018
4799767
remove saber deconv
qq332982511 Sep 6, 2018
651133f
remove change in calibrate
qq332982511 Sep 6, 2018
3b99e55
format
qq332982511 Sep 6, 2018
5a9f231
Merge pull request #403 from yiicy/dev_v2_scale
xyoungli Sep 6, 2018
c1870f6
Merge branch 'dev_v2' into dev_v2_reverse_op
Sep 6, 2018
4062db2
Merge branch 'dev_v2' into x86_reshape
Sep 6, 2018
d850fa5
remove test base modify
qq332982511 Sep 6, 2018
7c59257
Merge branch 'dev_v2' into x86_pooling_with_index
Sep 6, 2018
126ee4f
Merge branch 'dev_v2' into dev2
chenjiaoAngel Sep 6, 2018
d6b8bf6
Merge branch 'dev_v2' into dev_v3
chenjiaoAngel Sep 6, 2018
090ac64
Merge branch 'dev_v2' into dev_v2
chenjiaoAngel Sep 6, 2018
5146a5b
Merge branch 'dev_v2' into dev_v4
chenjiaoAngel Sep 6, 2018
ce89d94
update for resolving bugs in model save
pangge Sep 6, 2018
5028a84
add optimized gpu crf_Decoding
chenjiaoAngel Sep 6, 2018
1e8ac73
Merge branch 'dev_v4' of https://github.com/chenjiaoAngel/Anakin into…
chenjiaoAngel Sep 6, 2018
ce45d42
fix conv trans logic
Sep 6, 2018
d4b5f3c
Merge pull request #426 from MyPandaShaoxiang/x86_reshape
xyoungli Sep 6, 2018
ea747ad
Merge branch 'dev_v2' into x86_power
Sep 6, 2018
b1143f2
Merge branch 'dev_v2' into x86_permute
MyPandaShaoxiang Sep 6, 2018
ad3c3b0
Merge branch 'dev_v2' into x86_pooling_with_index
MyPandaShaoxiang Sep 6, 2018
6ed29eb
Merge branch 'dev_v2' into x86_permute_power
MyPandaShaoxiang Sep 6, 2018
d5bb5fd
Merge branch 'dev_v2' into x86_normalize
MyPandaShaoxiang Sep 6, 2018
f402f60
Merge pull request #8 from qq332982511/dev_v2_element_chaowen
pangge Sep 6, 2018
5bf6051
Merge pull request #9 from throneclay/pangge_new
pangge Sep 6, 2018
cbe41d9
temp changes.
pangge Sep 6, 2018
ec2dcb5
Merge branch 'dev_v2' of https://github.com/PaddlePaddle/Anakin into …
pangge Sep 6, 2018
c2b74ad
Merge branch 'dev_v2' of https://github.com/pangge/Anakin into dev_v2
pangge Sep 6, 2018
81ca901
Merge pull request #424 from MyPandaShaoxiang/x86_pooling_with_index
Sep 6, 2018
4692d92
Merge pull request #398 from chenjiaoAngel/dev2
Sep 6, 2018
b61bf82
Merge pull request #390 from Jayoprell/im2sequence_x86
Sep 6, 2018
ca3e492
Merge pull request #407 from chenjiaoAngel/dev_v3
Sep 6, 2018
bbf297d
Merge pull request #389 from Jayoprell/layer_norm_x86
Sep 6, 2018
2da0b34
Merge pull request #387 from Jayoprell/mvn_x86
Sep 6, 2018
eeebd09
BM updates to dev_v2 (#334)
guangzhixie Sep 6, 2018
99f5f73
Merge branch 'dev_v2' of https://github.com/guangzhixie/Anakin into d…
guangzhixie Sep 6, 2018
6deecef
Merge remote-tracking branch 'upstream/dev_v2' into dev_v2
guangzhixie Sep 6, 2018
4da247b
fix slice op bug (#420)
MyPandaShaoxiang Sep 6, 2018
9a1609c
dev_v2 add sequence pool op and ut (#405)
yiicy Sep 6, 2018
454cf68
Merge branch 'dev_v2' of https://github.com/PaddlePaddle/Anakin into …
qq332982511 Sep 6, 2018
7105840
Merge branch 'dev_v2' into x86_normalize
Sep 6, 2018
b636243
Merge branch 'dev_v2' into x86_permute
Sep 6, 2018
510daa4
Merge branch 'dev_v2' into x86_permute_power
Sep 6, 2018
a0e370a
Merge pull request #425 from MyPandaShaoxiang/x86_power
Sep 6, 2018
9157061
change default iter to 10 in test base
qq332982511 Sep 6, 2018
f6545cb
Merge pull request #423 from MyPandaShaoxiang/x86_permute_power
Sep 6, 2018
c64f8fd
Merge pull request #422 from MyPandaShaoxiang/x86_permute
Sep 6, 2018
d4cd056
Merge pull request #421 from MyPandaShaoxiang/x86_normalize
Sep 6, 2018
8e3fc4f
Merge branch 'dev_v2' into dev_v2_eltwise
Sep 6, 2018
b3141eb
Merge branch 'dev_v2' into dev_v4
Sep 6, 2018
e75e8c5
Merge pull request #400 from qq332982511/dev_v2_reverse_op
Sep 6, 2018
eba9cdd
Merge branch 'dev_v2' into dev_v2
Sep 6, 2018
a9ce35b
Merge pull request #397 from chenjiaoAngel/dev_v2
Sep 6, 2018
c870809
Merge branch 'dev_v2' into dev_v2_deconv
Sep 6, 2018
73b52e1
Merge branch 'dev_v2' into dev_v2_eltwise
Sep 6, 2018
2340f3b
Merge pull request #412 from qq332982511/dev_v2_eltwise
Sep 6, 2018
8b9c45c
Merge branch 'dev_v2' into dev_v2_deconv
Sep 6, 2018
ebd6314
Update shape.h
Sep 6, 2018
7402b00
Merge branch 'dev_v2' into dev_v2_cudnn_gru
Sep 6, 2018
b4c7edf
Merge branch 'dev_v2' into dev_v4
Sep 6, 2018
fd7df7d
Merge pull request #410 from qq332982511/dev_v2_deconv
Sep 6, 2018
15b3320
Merge branch 'dev_v2' into dev_v2_cudnn_gru
qq332982511 Sep 6, 2018
d49a442
Merge branch 'dev_v2' into dev_v4
chenjiaoAngel Sep 6, 2018
9f431c4
update macro
Sep 6, 2018
d9df6fa
update resize test
Sep 6, 2018
bd4428f
Merge pull request #431 from throneclay/conv_macro
Sep 6, 2018
25d81dc
Merge pull request #413 from chenjiaoAngel/dev_v4
Sep 6, 2018
f7720f8
Merge branch 'dev_v2' into dev_v2_cudnn_gru
Sep 6, 2018
aa0683b
Merge pull request #399 from qq332982511/dev_v2_cudnn_gru
Sep 7, 2018
d73e0c7
fix conflicts.
pangge Sep 7, 2018
4c9409b
Merge branch 'dev_v2' into dev_master
guangzhixie Sep 7, 2018
30e64c6
Revert "Revert "Use BM Kernel instead of BMDNN""
guangzhixie Sep 7, 2018
de1fdbe
bm kernel implementation for saber op
guangzhixie Sep 7, 2018
eaf7302
Update bmkernel_api_base
guangzhixie Sep 7, 2018
c1c061c
Add namespace
guangzhixie Sep 7, 2018
a8a83a6
revert namespace first
guangzhixie Sep 7, 2018
156dfa8
switch for bm kernel op
guangzhixie Sep 7, 2018
27dbdda
switch for bm kernel op
guangzhixie Sep 7, 2018
c588fac
update deconv
Sep 7, 2018
f942ae0
delete useless impl
Sep 7, 2018
9411c89
Merge pull request #10 from throneclay/conv_macro
pangge Sep 7, 2018
8502de8
fix compile error
Sep 7, 2018
b064642
Merge pull request #11 from throneclay/new_pange_fix
pangge Sep 7, 2018
fffeed0
update: add basic components for weights manage
pangge Sep 7, 2018
afc072a
Use enum for bm op type
guangzhixie Sep 7, 2018
6ce0724
Add BM conv implementation
guangzhixie Sep 7, 2018
4c4086a
add deconv trans api
Sep 7, 2018
b7926f9
Merge pull request #12 from throneclay/new_pangge_deconv
pangge Sep 7, 2018
514a52c
fix conv trans weights bug
Sep 7, 2018
2af1441
Merge pull request #13 from throneclay/new_pangge_deconv
pangge Sep 7, 2018
23e46fc
basic changes.
pangge Sep 7, 2018
13b58e8
Merge branch 'dev_v2' of https://github.com/pangge/Anakin into dev_v2
pangge Sep 7, 2018
fce70bd
add conv_unpadding_padding
qq332982511 Sep 7, 2018
4a24a96
Merge branch 'dev_v2' of https://github.com/PaddlePaddle/Anakin into …
qq332982511 Sep 7, 2018
2a6787e
fix some bugs.
pangge Sep 7, 2018
9f3cd3f
shutdown rpc.
pangge Sep 7, 2018
27bbdd2
Merge pull request #419 from pangge/dev_v2
LittleMaer Sep 7, 2018
aa66002
Merge pull request #433 from qq332982511/dev_v2_unpadding_padding
LittleMaer Sep 7, 2018
9bbf7cf
fix format
Sep 10, 2018
a9192a0
Merge pull request #440 from qq332982511/dev_v2_remove_bm_activation_…
Sep 10, 2018
10ec524
update bm bin path
guangzhixie Sep 10, 2018
eff245e
host bm kernel bin at bm root system directory
guangzhixie Sep 10, 2018
25b2f1a
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 10, 2018
62301c6
Update bm kernel bin path
guangzhixie Sep 10, 2018
ed296e3
Merge remote-tracking branch 'upstream/dev_v2' into dev_v2
guangzhixie Sep 10, 2018
256dd90
Merge branch 'dev_v2' into dev_master
guangzhixie Sep 10, 2018
c5ddf4b
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 10, 2018
b957a81
Cleanup after merge
guangzhixie Sep 11, 2018
8d0b900
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 11, 2018
59476a1
comment out configure_file first
guangzhixie Sep 11, 2018
b07a5dc
Comment out code with issue
guangzhixie Sep 11, 2018
895d0e2
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 11, 2018
a0a9618
uncomment bm conv test
guangzhixie Sep 11, 2018
7beca13
test
guangzhixie Sep 13, 2018
0e45e32
Revert "test"
guangzhixie Sep 13, 2018
c3ac7c9
Fix BM bin compilation issue
guangzhixie Sep 13, 2018
3e7048d
Fix issue for BM bin
guangzhixie Sep 14, 2018
bafff06
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 14, 2018
179d6f4
scripting permission
guangzhixie Sep 14, 2018
dcd473c
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 14, 2018
bf6e712
Fix bm bin compilation issue
guangzhixie Sep 14, 2018
b49e9de
Merge branch 'dev_master' into bmk_conv
guangzhixie Sep 14, 2018
46aa154
BM conv host implementation
guangzhixie Sep 18, 2018
60decef
BM conv device implementation
guangzhixie Sep 19, 2018
73a9326
Refactor
guangzhixie Sep 19, 2018
e72aa26
Update to new version to bm kernel APIs
guangzhixie Oct 1, 2018
81467b2
Update BM Kernel dependencies
guangzhixie Oct 1, 2018
015cd28
Implement conv with new version of BM Kernel
guangzhixie Oct 1, 2018
6473856
Remove redundancy
guangzhixie Oct 1, 2018
e89f3d5
Refactor
guangzhixie Oct 1, 2018
97afe14
refactor
guangzhixie Oct 5, 2018
e9f6bb6
refactor
guangzhixie Oct 8, 2018
c297105
[cmodel] test
guangzhixie Oct 10, 2018
faf5664
[cmodel] test
guangzhixie Oct 10, 2018
d7535b2
[cmodel] test
guangzhixie Oct 10, 2018
12e240e
[cmodel] test
guangzhixie Oct 10, 2018
2fa86ca
[cmodel] test
guangzhixie Oct 10, 2018
23be961
refactor
guangzhixie Oct 10, 2018
bf97549
[cmodel] refine location
guangzhixie Oct 10, 2018
bdbeedd
[cmodel] fix issue
guangzhixie Oct 11, 2018
76be09b
[cmodel] debug
guangzhixie Oct 11, 2018
1a6973c
[cmodel] debug
guangzhixie Oct 11, 2018
66bd4f3
test
guangzhixie Oct 11, 2018
a550333
fix issue
guangzhixie Oct 11, 2018
be2d669
debug
guangzhixie Oct 12, 2018
72a78f7
debug
guangzhixie Oct 12, 2018
a390326
test getcwd
guangzhixie Oct 18, 2018
a743451
update getcwd test
guangzhixie Oct 18, 2018
672cef3
test
guangzhixie Oct 18, 2018
149bc4d
Revert "test"
guangzhixie Oct 18, 2018
7738a69
Add ways to deinit handle
guangzhixie Oct 25, 2018
619595a
Fix issue
guangzhixie Oct 25, 2018
6ff7738
Merge device
guangzhixie Oct 26, 2018
a633117
fix issue
guangzhixie Oct 26, 2018
29c3cba
Issue fix
guangzhixie Oct 26, 2018
d6a7b5d
Use env to init/deinit BM handle
guangzhixie Oct 26, 2018
56f0241
Issue fix
guangzhixie Oct 26, 2018
7f41dbe
issue fix
guangzhixie Oct 26, 2018
7b48c88
Issue fix
guangzhixie Oct 26, 2018
4e408d7
Use context to manage BM handle
guangzhixie Oct 26, 2018
fe80796
Revert "Use context to manage BM handle"
guangzhixie Oct 30, 2018
331b0bb
Revert "Issue fix"
guangzhixie Oct 30, 2018
5c848e9
Revert "issue fix"
guangzhixie Oct 30, 2018
d5a23f4
Revert "Issue fix"
guangzhixie Oct 30, 2018
70a1344
Revert "Use env to init/deinit BM handle"
guangzhixie Oct 30, 2018
da4fffd
Revert "Issue fix"
guangzhixie Oct 30, 2018
90e1a11
Revert "fix issue"
guangzhixie Oct 30, 2018
64ce35b
Revert "Merge device"
guangzhixie Oct 30, 2018
670f59b
Revert "Fix issue"
guangzhixie Oct 30, 2018
92f12ce
Revert "Add ways to deinit handle"
guangzhixie Oct 30, 2018
5fb51b1
Implementation to deinit BM handle when finish
guangzhixie Oct 30, 2018
9e0a338
Add declarations in target wrapper
guangzhixie Oct 30, 2018
44404db
Add std namespace
guangzhixie Oct 30, 2018
2df5a61
Add env class declaration
guangzhixie Oct 30, 2018
26d7041
test
guangzhixie Oct 30, 2018
d3b9284
test
guangzhixie Oct 30, 2018
1f69d92
remove declaration
guangzhixie Oct 30, 2018
016ada5
c_model
Nov 15, 2018
3e81a02
asic
Nov 20, 2018
90ab86e
asic
Nov 21, 2018
77e140a
asic done
Nov 22, 2018
32086b1
update
Nov 27, 2018
3 changes: 3 additions & 0 deletions .gitignore
@@ -37,3 +37,6 @@ android_build
ios_build
gpu_build
output

.idea
.vscode
6 changes: 4 additions & 2 deletions .travis.yml
@@ -7,8 +7,8 @@ os:
env:
- JOB="-p NVIDIA-GPU -o Centos"
- JOB="-p NVIDIA-GPU -o Ubuntu"
#- JOB="-p AMD_GPU -o Centos"
#- JOB="-p AMD_GPU -o Ubuntu"
#- JOB="-p AMD-GPU -o Centos"
#- JOB="-p AMD-GPU -o Ubuntu"
#- JOB="-p X86-ONLY -o Centos"
#- JOB="-p X86-ONLY -o Ubuntu"
#- JOB="-p ARM -o Centos"
@@ -31,6 +31,8 @@ branches:
only:
- master
- developing
- AMD
- dev_v2

notifications:
email:
27 changes: 27 additions & 0 deletions AUTHORS.md
@@ -0,0 +1,27 @@
| Github account | name |
|---|---|
| chenjiaoAngel | Jiao Chen |
| cyj1986 | Yujuan Cheng |
| feifei14119 | Fei Wang |
| jackyh | Chengjie He |
| Jayoprell | Xiaocheng Luo |
| jjsbear | Jingsong Ji |
| LittleMaer | Yi Zhuang |
| mengkai94 | Kai Meng |
| micytw | Michael Wu |
| pangge | Chaowen Cui |
| perchbird | Xiaokun Yu |
| PeterJkPeng | Junyi Peng |
| qq332982511 | Junjie Liu |
| Shixiaowei02 | Xiaowei Shi |
| sogalin | Soga Lin |
| throneclay | Shuai Zhang |
| vin-huang | Vin Huang |
| wgy0804 | Guoya Wang |
| xklnono | Kailu Xu |
| xyoungli | Xiaoyang Li |
| yanan1112 | Yanan Liu |
| yao-matrix | Weifeng Yao |
| zdcocnftcp10 | Dachuan Zhao |
| zhouhuan2009 | Huan Zhou |
| zoooooooyuan | Yuan Zu |
128 changes: 80 additions & 48 deletions CMakeLists.txt
@@ -1,23 +1,32 @@
# ----------------------------------------------------------------------------
# Copyright (c) 2016 Baidu.com, Inc. All Rights Reserved
# @file root cmakefile
# @auther cuichaowen
# @date 2017-10-24
# ----------------------------------------------------------------------------
# Copyright (c) 2018 Anakin Authors, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

include(cmake/thirdparty_version.cmake)
cmake_minimum_required(VERSION ${MIN_CMAKE_V} FATAL_ERROR)
project(ANAKIN C CXX)
project(ANAKIN C CXX)
include(cmake/msg_color.cmake)
include(cmake/utils.cmake)
include(cmake/statistic.cmake)

# ----------------------------------------------------------------------------
# section: global anakin version and lib name
# ----------------------------------------------------------------------------
# global anakin version 2.0.1
set(VERSION_MAJOR "2")
set(VERSION_MINOR "0")
set(VERSION_PATCH "1")
cmake_minimum_required(VERSION ${MIN_CMAKE_V} FATAL_ERROR)

# global anakin version 0.1.0
set(VERSION_MAJOR "0")
set(VERSION_MINOR "1")
set(VERSION_PATCH "0")
set(VERSION "${VERSION_MAJOR}.${VERSION_MINOR}.${VERSION_PATCH}")

# anakin lib name and global directories
@@ -28,12 +37,15 @@ set(ANAKIN_ROOT ${PROJECT_SOURCE_DIR})
include_directories(${ANAKIN_ROOT})

set(ANAKIN_FRAMEWORK ${ANAKIN_ROOT}/framework)
set(ANAKIN_THIRD_PARTY_PATH ${CMAKE_BINARY_DIR}/third-party)
set(ANAKIN_LITE ${ANAKIN_FRAMEWORK}/lite)
set(ANAKIN_UTILS ${ANAKIN_ROOT}/utils)
set(ANAKIN_THIRD_PARTY_PATH ${ANAKIN_ROOT}/third-party)
set(ANAKIN_MODEL_PARSER ${ANAKIN_FRAMEWORK}/model_parser)
set(ANAKIN_SERVICE ${ANAKIN_FRAMEWORK}/service)
set(ANAKIN_SABER ${ANAKIN_ROOT}/saber)
set(ANAKIN_UNIT_TEST ${ANAKIN_ROOT}/test)
set(ANAKIN_EXAMPLES ${ANAKIN_ROOT}/examples)


# ----------------------------------------------------------------------------
# section: options for anakin
@@ -48,12 +60,13 @@ anakin_option(ANAKIN_TYPE_INT8 "define the INT8 for data precision." NO)
anakin_option(USE_GPU_PLACE "Select the build mode for GPU place." YES)
anakin_option(USE_X86_PLACE "Select the build mode for X86 place." YES)
anakin_option(USE_ARM_PLACE "Select the build mode for ARM place." NO)
anakin_option(USE_BM_PLACE "Select the build mode for BM place." NO)

# plantfrom details
anakin_option(NVIDIA_GPU "Use NVIDIA GPU place." YES if USE_GPU_PLACE)
anakin_option(AMD_GPU "Use AMD GPU place." NO if USE_GPU_PLACE AND NOT NVIDIA_GPU)
anakin_option(TARGET_ANDROID "" NO if USE_ARM_PLACE)
anakin_option(TARGET_IOS "" NO if USE_ARM_PLACE)
anakin_option(TARGET_ANDROID "build for android" YES if USE_ARM_PLACE)
anakin_option(TARGET_IOS "not supported now" YES if USE_ARM_PLACE AND NOT TARGET_ANDROID)

# compile options for NVIDIA_GPU place
anakin_option(USE_CUDA "Use Cuda libs." YES if NVIDIA_GPU)
@@ -64,60 +77,52 @@ anakin_option(USE_CUDNN "Use Cudnn libs." YES if USE_CUDA)
anakin_option(BUILD_CROSS_PLANTFORM "Build anakin lib for any nvidia device plantform." YES if USE_CUDA)
anakin_option(BUILD_FAT_BIN "Build anakin cuda fat-bin lib for all device plantform" NO if BUILD_CROSS_PLANTFORM)

# compile options for BM place
#anakin_option(USE_BM "Use Cuda libs." YES if NVIDIA_GPU)
#anakin_option(USE_CUBLAS "Use Cublas libs." YES if USE_BM)
#anakin_option(USE_CURAND "Use Curand libs." YES if USE_BM)
#anakin_option(USE_CUFFT "Use CuFFT libs." YES if USE_BM)
#anakin_option(USE_CUDNN "Use Cudnn libs." YES if USE_BM)
#anakin_option(BUILD_CROSS_PLANTFORM "Build anakin lib for any nvidia device plantform." YES if USE_BM)


if(USE_CUDA)
# Select gpu target arch for local high performance implement sass code . Now we have checked on sm_61 sm_50 and it works well.
set(SELECTED_SASS_TARGET_ARCH "61")
elseif(USE_BM)
# Select gpu target arch for local high performance implement sass code . Now we have checked on sm_61 sm_50 and it works well.
#set(SELECTED_SASS_TARGET_ARCH "61")
endif()
if((NOT BUILD_FAT_BIN) AND (NOT BUILD_CROSS_PLANTFORM) AND USE_CUDA)
# Select the only nvidia gpu arch you want to be built on
set(TARGET_GPUARCH 6.1)
set(TARGET_GPUARCH 6.1)
endif()

# build options for cuda.
anakin_option(BUILD_CUBIN "BUILD with the -cubin option in Device mode" NO if USE_CUDA)
anakin_option(COMPILE_PTX "Returns a list of PTX files generated from src." NO if USE_CUDA)

# build options for BM.
anakin_option(BUILD_CUBIN "BUILD with the -cubin option in Device mode" NO if USE_BM)
anakin_option(COMPILE_PTX "Returns a list of PTX files generated from src." NO if USE_BM)


# common build options
anakin_option(ENABLE_DEBUG "Enable DEBUG(default) mode." NO)
anakin_option(ENABLE_DEBUG "Enable DEBUG(default) mode." YES)
anakin_option(ENABLE_VERBOSE_MSG "Enable verbose=1 : compile msg during make." NO)
anakin_option(DISABLE_ALL_WARNINGS "Disable all the warning msg during compile." YES)
anakin_option(ENABLE_NOISY_WARNINGS "Enable noisy warning msg during compile." NO if DISABLE_ALL_WARNINGS)

# using 3rd party libs
anakin_option(USE_GLOG "Build Glog components." NO)
anakin_option(USE_LOGGER "Build native logger components." YES)
anakin_option(USE_GLOG "Build Glog components." NO if NOT USE_LOGGER)
anakin_option(USE_PROTOBUF "Build Google protobuf components." YES)
anakin_option(USE_OPENCV "Use static opencv libs." NO)
anakin_option(USE_BOOST "Use static BOOST libs." NO)
anakin_option(USE_OPENMP "Use Openmp when in andriod environment." YES if TARGET_ANDROID)
anakin_option(USE_OPENMP "Use Openmp when in android environment." YES if TARGET_ANDROID)
anakin_option(USE_GTEST "Use googletest libs." NO if BUILD_WITH_UNIT_TEST)
anakin_option(USE_PYTHON "Generate py wrappers." NO)
anakin_option(USE_OPENCL "Use OpenCL ." NO)
anakin_option(USE_OPENCL "Use OpenCL ." YES if AMD_GPU)
anakin_option(USE_GFLAGS "Build Google gflags components." NO)
anakin_option(USE_MKL "Use mkl libs." NO if USE_X86_PLACE)
anakin_option(USE_MKLML "Use MKLML libs." YES if USE_X86_PLACE)
anakin_option(USE_XBYAK "Use XBYAK libs." YES if USE_X86_PLACE)
anakin_option(USE_OPENMP "Use Openmp when in andriod environment." YES if TARGET_ANDROID)
anakin_option(USE_OPENMP "Use Openmp when in android environment." YES if TARGET_ANDROID)

# build components
anakin_option(BUILD_WITH_UNIT_TEST "Build anakin unit test components." YES)

anakin_option(BUILD_WITH_FRAMEWORK "Build anakin framework" YES)

anakin_option(BUILD_RPC "Build anakin rpc service components." NO if BUILD_WITH_FRAMEWORK)
anakin_option(BUILD_WITH_LITE "Build anakin lite components." YES if USE_GPU_PLACE AND BUILD_WITH_FRAMEWORK)

# build examples
anakin_option(BUILD_EXAMPLES "build detection and classification examples" NO)

# build target
anakin_option(BUILD_SHARED "Build anakin shared lib." YES)
anakin_option(BUILD_STATIC "Build anakin static lib." YES if NOT BUILD_SHARED)
@@ -127,19 +132,25 @@ anakin_option(ENABLE_OP_TIMER "Enable op timer mode." NO)
# ----------------------------------------------------------------------------
# section: anakin compiler and linker options
# ----------------------------------------------------------------------------
set(CMAKE_BUILD_TYPE Debug FORCE)
if(ENABLE_DEBUG)
set(CMAKE_BUILD_TYPE Debug FORCE)
set(CMAKE_BUILD_TYPE Debug FORCE)
else()
set(CMAKE_BUILD_TYPE Release FORCE)
set(CMAKE_BUILD_TYPE Release FORCE)
endif()

if(USE_LOGGER)
anakin_option(ENABLE_STACKTRACES "If enable local logger with stacktrace." YES if NOT USE_ARM_PLACE)
anakin_option(SUPPORT_PTHREADS "If enable local logger with supporting pthreads. " YES)
endif()

# ----------------------------------------------------------------------------
# section:configure a header file to pass some of the CMake settings to the source
# code
# ----------------------------------------------------------------------------
configure_file (
"${PROJECT_SOURCE_DIR}/cmake/config/anakin_config.h.in"
"${PROJECT_BINARY_DIR}/anakin_config.h"
"${PROJECT_SOURCE_DIR}/cmake/config/anakin_config.h.in"
"${PROJECT_BINARY_DIR}/anakin_config.h"
)
# add the binary tree to the search path so that anakin will find ak_config.h
include_directories(${PROJECT_BINARY_DIR})
@@ -157,10 +168,6 @@ if(USE_CUDA)
include(cmake/cuda.cmake)
endif()

if(USE_BM)
#include(cmake/cuda.cmake)
endif()

if(USE_X86_PLACE)
set(ANAKIN_TEMP_THIRD_PARTY_PATH ${CMAKE_BINARY_DIR}/third-party)
if(USE_MKLML)
@@ -172,6 +179,10 @@ if(USE_X86_PLACE)
#include(cmake/external/mkldnn.cmake)
endif()

if(AMD_GPU)
include(cmake/amd.cmake)
endif()

# gather all the config options to anakin
include(cmake/gather.cmake)

@@ -181,14 +192,35 @@ include(cmake/gather.cmake)
# ----------------------------------------------------------------------------
# add source sub_directory whick holds the cmake build module
# fetch files of model_parser
add_subdirectory(${ANAKIN_MODEL_PARSER})


add_subdirectory(${ANAKIN_SABER})
add_subdirectory(${ANAKIN_FRAMEWORK})

if(USE_BM_PLACE)
add_subdirectory(${ANAKIN_SABER}/funcs/impl/bm)
endif()

if(BUILD_WITH_FRAMEWORK)
add_subdirectory(${ANAKIN_MODEL_PARSER})
if(BUILD_RPC)
add_subdirectory(${ANAKIN_SERVICE})
endif()
if(BUILD_WITH_LITE)
add_subdirectory(${ANAKIN_LITE})
endif()
add_subdirectory(${ANAKIN_FRAMEWORK})
endif()

if(BUILD_WITH_UNIT_TEST)
add_subdirectory(${ANAKIN_UNIT_TEST})
endif()

if (BUILD_EXAMPLES)
if(BUILD_WITH_FRAMEWORK)
add_subdirectory(${ANAKIN_EXAMPLES})
endif()
endif()

anakin_print_statistic()
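
Reviewer note (a sketch, not part of this PR): many options in this file use the conditional form `anakin_option(<name> "<description>" <default> if <condition>)`. The macro itself is not shown in this diff; it is presumably defined in one of the included `cmake/` modules. To illustrate how such a macro could behave, here is a minimal stand-in. The name `demo_option` is hypothetical, and setting the variable to `OFF` when the condition fails is an assumption, not necessarily what Anakin does.

```cmake
cmake_minimum_required(VERSION 3.10)
project(option_demo NONE)

# Hypothetical stand-in for anakin_option(); NOT the project's real macro.
# Usage: demo_option(<var> "<description>" <YES|NO> [if <condition...>])
macro(demo_option variable description value)
  set(__condition "")
  set(__seen_if FALSE)
  # Collect every argument that follows the literal keyword "if".
  foreach(arg ${ARGN})
    if("${arg}" STREQUAL "if" OR "${arg}" STREQUAL "IF")
      set(__seen_if TRUE)
    elseif(__seen_if)
      list(APPEND __condition ${arg})
    endif()
  endforeach()
  # No condition given: treat the option as unconditional.
  if("${__condition}" STREQUAL "")
    set(__condition TRUE)
  endif()
  if(${__condition})
    option(${variable} "${description}" ${value})
  else()
    # Assumption: an option whose condition is false is simply switched off.
    set(${variable} OFF)
  endif()
endmacro()

demo_option(USE_GPU_PLACE "Select the build mode for GPU place." YES)
demo_option(NVIDIA_GPU "Use NVIDIA GPU place." YES if USE_GPU_PLACE)
demo_option(AMD_GPU "Use AMD GPU place." NO if USE_GPU_PLACE AND NOT NVIDIA_GPU)
message(STATUS "NVIDIA_GPU=${NVIDIA_GPU} AMD_GPU=${AMD_GPU}")
```

Under this reading, declaration order matters: a condition such as `USE_GPU_PLACE AND NOT NVIDIA_GPU` is evaluated when the macro is called, so the options it names have to be declared earlier in the file, as they are in this diff.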


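
Reviewer note (a sketch, not part of this PR): the build-type lines above call `set(CMAKE_BUILD_TYPE Debug FORCE)` without the `CACHE` keyword. In that form `FORCE` is not treated as a keyword, so the variable becomes the two-element list `Debug;FORCE`, which is probably not what was intended. A possible fix, assuming the goal is to force the cached build type from `ENABLE_DEBUG`, could look like this; the plain `option()` call and the docstring text are placeholders.

```cmake
cmake_minimum_required(VERSION 3.10)
project(build_type_demo NONE)

# Placeholder for the anakin_option(ENABLE_DEBUG ...) declaration in this file.
option(ENABLE_DEBUG "Enable DEBUG mode." YES)

# FORCE is only meaningful together with CACHE <type> <docstring>.
if(ENABLE_DEBUG)
  set(CMAKE_BUILD_TYPE Debug CACHE STRING "Build type" FORCE)
else()
  set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
endif()

message(STATUS "CMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}")
```

`CMAKE_BUILD_TYPE` only affects single-configuration generators such as Makefiles or Ninja; multi-configuration generators select the configuration at build time.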
48 changes: 25 additions & 23 deletions README.md
@@ -1,4 +1,4 @@
# Anakin
# Anakin12

[![Build Status](https://travis-ci.org/PaddlePaddle/Anakin.svg?branch=developing)](https://travis-ci.org/PaddlePaddle/Anakin)
[![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE)
@@ -7,63 +7,65 @@

Welcome to the Anakin GitHub.

Anakin is an cross-platform, high-performance inference engine, which is originally
Anakin is a cross-platform, high-performance inference engine, which is originally
developed by Baidu engineers and is a large-scale application of industrial products.

Please refer to our [release announcement]() to track the latest feature of Anakin.
Please refer to our [release announcement](https://github.com/PaddlePaddle/Anakin/releases) to track the latest feature of Anakin.

## Features

- **Flexibility**

Anakin supports a wide range of neural network architectures and
diffrent hardware platform. It is easy to run Anakin at GPU/x86/ARM platform.
different hardware platforms. It is easy to run Anakin on GPU / x86 / ARM platform.

- **High performance**

In order to giving full play to the performance of hardware, we optimize the
forward prediction at diffrent levels.
- Automatic graph fusion. The goal of all performance optimization under a
given algorithm is to make ALU as busy as possible, Operator fusion
can effectively reduce memory access and keep ALU busy.
- Memory reuse. Forward prediction is a one-way calculation. We reuse
the memory between the input and output of different operators, thus
In order to give full play to the performance of hardware, we optimized the
forward prediction at different levels.
- Automatic graph fusion. The goal of all performance optimizations under a
given algorithm is to make the ALU as busy as possible. Operator fusion
can effectively reduce memory access and keep the ALU busy.

- Memory reuse. Forward prediction is a one-way calculation. We reuse
the memory between the input and output of different operators, thus
reducing the overall memory overhead.

- Assembly level optimization. Saber is Anakin's underlying DNN library, which
- Assembly level optimization. Saber is a underlying DNN library for Anakin, which
is deeply optimized at assembly level. Performance comparison between Anakin, TensorRT
and Tensorflow-lite, please refer to the benchmark tests.
and Tensorflow-lite, please refer to the [benchmark tests](benchmark/README.md).


## Installation

It is recommended to check out the
[Docker installation guide](docker/README.md).
[docker installation guide](docker/README.md).
before looking into the
[build from source guide](docs/Manual/INSTALL_en.md).

For ARM, please refer [run on arm](docs/Manual/run_on_arm_en.md).

## Benchmark
It is recommended to check out the [Benchmark Readme](benchmark/README.md)
It is recommended to check out the [readme of benchmark](benchmark/README.md).

## Documentation

We provide [English](docs/Manual/Tutorial_en.md) and
[Chinese](docs/Manual/Tutorial_ch.md) documentation.
We provide [English](docs/Manual/Tutorial_en.md) and [Chinese](docs/Manual/Tutorial_ch.md) documentation.

- [Anakin developer guide]()
- Developer guide

You might want to know more details of Anakin and make it better.
You might want to know more details of Anakin and make it better. Please refer to [how to add custom devices](docs/Manual/addCustomDevice.md) and [how to add custom device operators](docs/Manual/addCustomOp.md).

- [C++ API]()
- User guide

Python API is under-developing.
You can get the working principle of the project, C++ interface description and code examples from [here](docs/Manual/Tutorial_ch.md). You can also learn about the model converter [here](docs/Manual/Converter_ch.md).

- [How to Contribute]()
- [How to Contribute](docs/Manual/Contribution_ch.md)

We appreciate your contributions!



## Ask Questions

You are welcome to submit questions and bug reports as [Github Issues](https://github.com/PaddlePaddle/Anakin/issues).