Zhe Work Report
2018.02.28
This report covers:
- Model comparison tool.
- Non-blocking multi-node communication in Chainer master v3.
- Source code on developer01.
- Device label.
Model Comparison Tool
The model comparison tool helps developers compare two models that come from different frameworks. It supports Caffe, TensorFlow, MXNet, Chainer, and PyTorch. It mainly compares variable node shapes and function node types, and it also compares the computation graphs and the padding strategies.
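One reason the padding strategy has to be compared explicitly: Caffe-style layers carry an explicit pad value, while TensorFlow layers name a mode (SAME or VALID), so the same kernel and stride can still lead to different padding or output shapes. The sketch below uses the standard output-size formulas for a convolution along one dimension. The function names are hypothetical and are not taken from the tool's source.

```python
import math


def explicit_pad_out_size(in_size, kernel, stride, pad):
    """Output length with explicit symmetric padding (Caffe-style convolution)."""
    return (in_size + 2 * pad - kernel) // stride + 1


def tf_out_size(in_size, kernel, stride, mode):
    """Output length for TensorFlow's named padding modes."""
    if mode == "SAME":
        return math.ceil(in_size / stride)
    if mode == "VALID":
        return math.ceil((in_size - kernel + 1) / stride)
    raise ValueError("unknown padding mode: %s" % mode)


def tf_same_total_pad(in_size, kernel, stride):
    """Total padding (both sides together) that SAME mode adds along one dimension."""
    out = math.ceil(in_size / stride)
    return max((out - 1) * stride + kernel - in_size, 0)
```

For a 224-wide input with a 7-wide kernel and stride 2, explicit padding of 3 per side and SAME mode both give 112 outputs, but SAME adds only 5 padded elements in total instead of 6. Two graphs can therefore match in shape while still differing in padding.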
We also developed a web version of the tool. Users only need to upload their model files to see a visual graph comparison. The Design Detail section below describes the design of the tool, the Web Model Comparison Tool section describes the web version, and the Deploy Method section covers deployment.
Design Detail
- The whole tool is based on the architecture shown below.

- First, the tool loads the graph from each framework. The table below lists the loader that corresponds to each framework.

| Framework | Input files | Function file | Detail |
| --- | --- | --- | --- |
| Caffe | train_val.prototxt | load_caffe_model.py | Reads model parameters through caffe.proto.caffe_pb2. |
| TensorFlow | saved model files | load_tf_model.py | Reads the computation graph from the stored model file, so the user must save the graph beforehand. |
| Chainer | model class .py files | load_chainer_model.py | Gets all FunctionNode objects starting from the output. |
| MXNet | model JSON file | load_mxnet_model.py | Reads the JSON file and parameters. |
| SSD | SSD model | load_ssd_*.py | SSD has special layers. |

- Next, the tool converts each graph to a unified data format and adds the different padding strategies, as the figure below shows. Later versions add the pw, ph, sh, sw, kh, and kw parameters. The source code is in rank_multi_port.py.

- Graph comparison is a difficult problem, so we convert each graph to an ordered list and then find all differences between the two lists. The source code is in rank_multi_port.py.
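The flatten-then-compare idea can be sketched as follows. This is a minimal illustration and not the code in rank_multi_port.py: the `Node` record, the `to_ordered_list` and `diff_models` helpers, and the use of Python's `difflib` are all assumptions made for the example.

```python
from collections import namedtuple
from difflib import SequenceMatcher

# Hypothetical unified record: function type, output shape, and input names.
Node = namedtuple("Node", ["name", "op", "shape", "inputs"])


def to_ordered_list(nodes):
    """Topologically sort nodes so that every node comes after its inputs."""
    by_name = {n.name: n for n in nodes}
    ordered, seen = [], set()

    def visit(node):
        if node.name in seen:
            return
        seen.add(node.name)
        for parent in node.inputs:
            visit(by_name[parent])
        ordered.append(node)

    for n in nodes:
        visit(n)
    return ordered


def diff_models(nodes_a, nodes_b):
    """Return the non-matching spans between two ordered node lists."""
    # Compare on (op, shape) only, because node names differ between frameworks.
    key_a = [(n.op, n.shape) for n in to_ordered_list(nodes_a)]
    key_b = [(n.op, n.shape) for n in to_ordered_list(nodes_b)]
    matcher = SequenceMatcher(None, key_a, key_b, autojunk=False)
    return [(tag, key_a[i1:i2], key_b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


# Two small made-up models whose convolution output shapes disagree.
model_a = [
    Node("data", "Input", (1, 3, 8, 8), []),
    Node("conv1", "Convolution", (1, 4, 8, 8), ["data"]),
    Node("relu1", "ReLU", (1, 4, 8, 8), ["conv1"]),
]
model_b = [
    Node("x", "Input", (1, 3, 8, 8), []),
    Node("c1", "Convolution", (1, 4, 6, 6), ["x"]),
    Node("r1", "ReLU", (1, 4, 6, 6), ["c1"]),
]
differences = diff_models(model_a, model_b)
```

Here `differences` holds one "replace" span covering the convolution and ReLU nodes, since their shapes disagree while the input nodes match. Sorting first makes the result independent of the order in which each framework lists its nodes.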
Web Model Comparison Tool
For a better user experience, we developed a website. Users upload the input files and get a visual comparison result with detailed parameters. The system combines a Django back end with a Bootstrap front end and uses D3.js to visualize the model graph. All source code is in the web-model-comparison-tool folder.

The website has two pages. On page 1, users upload their model files. Page 2 shows the graph comparison result.


Deploy Method
The web version is deployed on the developer01 server. Log in with username modeltools, password abc110.
ssh to the web server.
    ssh modeltools@developer01
cd to the work folder.
    cd ~/web-model-comparison-tool/model_comparison_tool/
Start the uwsgi server.
    uwsgi --socket /tmp2/model_comparison_tool.sock --module model_comparison_tool.wsgi --chmod-socket=777 --uid=www-data --gid=www-data
Restart the nginx service.
    sudo /etc/init.d/nginx restart
Now you can browse to the site at the server's local IP, port 8000, to use the tool.
To learn more, click HELP.
Non-blocking Multi Node In Chainer Master V3
ChainerMN master uses blocking communication to implement multi-node training. In Chainer, multi-node training uses data parallelism. The weights are updated as shown in the figure below.
With blocking communication, each iteration performs the allreduce for all layers together. With non-blocking communication, the weight allreduce is performed layer by layer, so communication can be hidden behind computation, as shown below.

Our code is based on the master_v3 branch of Intel Chainer. All of the differences are in the patch files.
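The benefit of hiding communication behind computation can be estimated with a simple timeline model. The sketch below is only an illustration and is not taken from the patch: it assumes a single communication channel, assumes that the blocking allreduce costs the sum of the per-layer allreduce costs, and uses made-up per-layer timings in milliseconds.

```python
def blocking_time(backward_ms, allreduce_ms):
    """All gradients are reduced only after the whole backward pass finishes."""
    return sum(backward_ms) + sum(allreduce_ms)


def non_blocking_time(backward_ms, allreduce_ms):
    """Each layer's allreduce starts as soon as its gradient is ready.

    Assumes one communication channel, so allreduces run one at a time,
    in the order the backward pass produces the gradients.
    """
    compute_done = 0.0
    comm_done = 0.0
    for compute, comm in zip(backward_ms, allreduce_ms):
        compute_done += compute
        # An allreduce waits for its gradient and for the previous allreduce.
        comm_done = max(comm_done, compute_done) + comm
    return comm_done


# Made-up timings, listed in the order the backward pass visits the layers.
backward_ms = [4, 3, 5]
allreduce_ms = [2, 6, 1]
blocking = blocking_time(backward_ms, allreduce_ms)
non_blocking = non_blocking_time(backward_ms, allreduce_ms)
```

With these timings the blocking schedule takes 21 ms and the non-blocking schedule takes 14 ms, because most of the communication runs while later layers are still computing their gradients. Only the allreduce of the last layer can never be overlapped in this model.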
Source code in developer01
I stored backup files on the developer01 server. The model comparison tool is in the ~/MCT folder, and the non-blocking Chainer code is in the ./NBC folder. The tool is also kept in the dl_framework-dl_tools git repo.
Device
I have one desktop. Its NUM is BB26468.