
How to deploy to a C++ project? Thank you! #220

Open
zoufangyu1987 opened this issue Jul 23, 2019 · 89 comments

Comments

@zoufangyu1987

There is already a trained PyTorch model, and the CenterNet code is all Python. My project needs C++. How do I deploy it?
Note: my Python skills are weak.

@kakaluote
kakaluote commented Jul 23, 2019

One option is to run the Python code as a server and have the C++ side send detection requests over a socket.
Still, native C++ inference would be preferable.

@zoufangyu1987
Author

My project contains other modules besides detection, so it has to be C++. Looking forward to guidance!

@xingyizhou
Owner

I honestly have no experience in c++ deployment ...

@zoufangyu1987
Author

@xingyizhou
Thank you anyway.

@Markusgami

Convert the model to Caffe and run it there.

@zoufangyu1987
Author

@Markusgami
That would work. The main task is porting the forward-pass code to C++. Is there any usable C++ code available? Thank you!

@kunyao2015

Following this thread. Any good solutions yet? Use libtorch?

@zoufangyu1987
Author

Waiting online, it's urgent!

@jnulzl
jnulzl commented Jul 29, 2019

@zoufangyu1987 @kunyao2015
Steps:
1. Convert the model to a caffemodel;
2. Implement the pre- and post-processing yourself in C++;
3. Done!
Tested and working. Good luck!

@zoufangyu1987
Author

@jnulzl
Can you share your C++ code? Thank you!

@jnulzl
jnulzl commented Jul 29, 2019

@zoufangyu1987
Sorry, not for now.

@zoufangyu1987
Author

@jnulzl
I want to cry!

@BokyLiu
BokyLiu commented Jul 30, 2019

Convert the model with trace, then deploy with libtorch. Tested and working.

@wangshankun

@zoufangyu1987 @kunyao2015
Steps:
1. Convert the model to a caffemodel;
2. Implement the pre- and post-processing yourself in C++;
3. Done!
Tested and working. Good luck!

What about DCNv2 when converting to Caffe? Caffe doesn't support it natively.

@zoufangyu1987
Author

I have converted the dlav0_34 PyTorch model (no DCN layers) to a caffemodel. Over the past two days I worked through the Python pre- and post-processing code of the CenterNet demo; it is quite involved. Anyone who has this working, please share the C++ code. Many thanks!

@Fighting-JJ

Convert the model with trace, then deploy with libtorch. Tested and working.

Did trace work for you? Which arch did you train?
@BokyLiu

@BokyLiu
BokyLiu commented Aug 8, 2019

Did trace work for you? Which arch did you train?
@BokyLiu

res18.

@Fighting-JJ

@zoufangyu1987 @kunyao2015
Steps:
1. Convert the model to a caffemodel;
2. Implement the pre- and post-processing yourself in C++;
3. Done!
Tested and working. Good luck!

What about DCNv2 when converting to Caffe? Caffe doesn't support it natively.

Have you solved the DCNv2 deployment problem?

@zoufangyu1987
Author

@Fighting-JJ
No. I haven't found code for converting DCNv2 layers from PyTorch to caffemodel, so for now I use dlav0_34 and drop DCN. I verified that the caffemodel's outputs match exactly, but only in Python so far; the C++ side isn't done yet. The workload is heavy and there are many pitfalls. Some people have succeeded but won't share their source, so there's nothing to do but work through it step by step.

@zoufangyu1987
Author

Once I successfully deploy on C++, I will definitely share the source code with everyone.

@zoufangyu1987
Author

I've stripped out PyTorch entirely and got the pipeline running in Python with only NumPy. Next step is C++. I found that NumPy has a C++ counterpart, "NumCpp". It's exhausting; I hope there are fewer pitfalls ahead!
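For reference, the heatmap "pseudo-NMS" step of CenterNet's post-processing (sigmoid, then keep only 3x3 local maxima) can be sketched in plain NumPy. Function names here are illustrative, not from the repo:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def maxpool3x3(hm):
    """3x3 max pool, stride 1, pad 1, on a 2D heatmap (pure NumPy)."""
    h, w = hm.shape
    padded = np.pad(hm, 1, mode="constant", constant_values=-np.inf)
    out = np.empty_like(hm)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].max()
    return out

def heatmap_nms(hm_logits):
    """Keep only local maxima: zero out every non-peak location."""
    hm = sigmoid(hm_logits)
    keep = (maxpool3x3(hm) == hm).astype(hm.dtype)
    return hm * keep
```

Each surviving nonzero entry is a candidate center; thresholding or top-k selection then yields the detections.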

@Fighting-JJ

I've stripped out PyTorch entirely and got the pipeline running in Python with only NumPy. Next step is C++...

You can use jit.trace to trace the model, then deploy it in C++ with libtorch, which is a C++ library.
After that, only the post-processing is left.
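A minimal sketch of that export step. The model below is a stand-in single conv layer, not an actual CenterNet architecture:

```python
import torch

# Stand-in model; a real export would trace the CenterNet network instead.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, kernel_size=3, padding=1))
model.eval()

example = torch.rand(1, 3, 64, 64)        # dummy input with the deploy-time shape
traced = torch.jit.trace(model, example)  # record the forward pass as TorchScript
traced.save("model.pt")                   # load from C++ with torch::jit::load("model.pt")
```

Note that trace records one concrete execution path, which is why data-dependent control flow (and custom ops like DCNv2, discussed later) can break it.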

@chenjx1005

I've stripped out PyTorch entirely and got the pipeline running in Python with only NumPy. Next step is C++...

There's no need for NumCpp. You can read the image into a cv::Mat with OpenCV in C++ and convert the Mat to a caffe::Blob.

@zoufangyu1987
Author
zoufangyu1987 commented Aug 14, 2019

@chenjx1005
I also think OpenCV alone would be enough, but I've already combined NumCpp with OpenCV. I'll try that first, and drop NumCpp if it doesn't work out.

@hexiangquan

https://github.com/hexiangquan/CenterNetCPP

@zoufangyu1987
Author

@hexiangquan
Deeply grateful!

@zoufangyu1987
Author

The NumCpp-based C++ version now loads the caffemodel successfully, and the results match. Thanks, everyone!

@BokyLiu
BokyLiu commented Aug 20, 2019

The NumCpp-based C++ version now loads the caffemodel successfully, and the results match. Thanks, everyone!

Looking forward to your write-up.

@zoufangyu1987
Author

@BokyLiu
Work has been busy these past few days. As soon as I have some free time I'll organize the whole pipeline and related files and share them.
@hexiangquan above has already shared C++ code for pre-processing, forward, and post-processing. Thanks to him!

@Fighting-JJ

You can use jit.trace to trace the model, then deploy it in C++ with libtorch, which is a C++ library. After that, only the post-processing is left.

Did the dcn_v2 part trace successfully for you? @Fighting-JJ

I didn't use DCN; I used DLA034 or resnet.

@121649982

@zoufangyu1987 If it is possible could you please share the C++ pre-processing and post-processing code? I have managed to make the pt file for the hourglass model and was able to load it in Windows using libtorch. The only things left are the pre-processing and post-processing steps. I was not able to access the baidu link you shared. Is there any other way you can share the code?

How can you convert your model to pt?

@121649982

@ALL Another approach worth trying: call the Python-side PyTorch model directly from C++ (tested and working).
See this blog post: https://blog.csdn.net/u011681952/article/details/92765549

That requires Python, though.

@VishnuPJ
VishnuPJ commented Dec 4, 2019

@zoufangyu1987 If it is possible could you please share the C++ pre-processing and post-processing code? ... How can you convert your model to pt?

This worked for me:
#414 (comment)

@zoufangyu1987
Author

@BokyLiu
@Fighting-JJ
Why does my model use far more GPU memory under libtorch than under PyTorch? PyTorch uses 750 MB while libtorch uses 1300 MB. Why is that?

@121649982

libtorch is poorly optimized and quite slow; I'd recommend TensorRT instead.

@Dantju
Dantju commented Jun 22, 2020

@zoufangyu1987 I cannot find the file for Alg_VIR, so I cannot build the CenterNet C++ project. Can you share this file?

@zoufangyu1987
Author

@Dantju
That file isn't needed. You can remove the included header.

@Dantju
Dantju commented Jun 22, 2020

@zoufangyu1987 But centernet.cpp defines some parameters of that kind.

@zoufangyu1987
Author

@Dantju
Just ignore it; it's leftover from my own project that I didn't delete at the time.

@Dantju
Dantju commented Jun 24, 2020

@zoufangyu1987 The sVIRInput struct is used in the later code, so could you share its header file?

sVIRInput virInput;
//sVIROutput virOutput;
//std::string img_name;
//#undef GPU
// Caffe::set_mode(Caffe::GPU);
// Caffe::SetDevice(0);
//#define GPU

Caffe::set_mode(Caffe::CPU);

const std::string modelPath;
init(modelPath);

#if 01
std::vector<std::string> img_name_path;
std::vector<std::string> Img_info;
cv::Mat imgin;
// process one image at a time
while (getline(filelist, line))
{
    // get the image path and image info
    virInput.vInImg.clear();
    img_name_path.clear();

@Dantju
Dantju commented Jun 24, 2020

@zoufangyu1987 Was this code compiled on Windows? I keep hitting "syntax error: missing ';' before '}'".

@zoufangyu1987
Author

@Dantju
Linux.

@Dantju
Dantju commented Jun 24, 2020

@zoufangyu1987 OK. Then could you provide the header for sVIRInput?

@zoufangyu1987
Author

@Dantju
That's just a struct defined in my own project for loading images. Delete it and create your own image handling instead; a small change is enough. Just read through the code; it's easy to follow.

@Dantju
Dantju commented Jun 24, 2020

Has anyone got this running on Windows? Could you share the code?

@xiaowk5516

Thanks for sharing the code. May I ask whether you optimized it afterwards? How did you do it?

@zoufangyu1987
Author

@xiaowk5516
I didn't optimize the post-processing part; my abilities there are limited.
One sort function is very slow in debug builds, taking 50-70 ms (hardware dependent), but about 10 ms in release mode.
I did optimize processing speed, mainly in the image-processing part, by using the Simd image parallel acceleration library to replace resize, copy, and similar operations. The speed-up is significant, especially in image pre-processing. See: https://www.jianshu.com/p/5b272f108ed2
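On the slow sort: a full sort is not actually needed to pick the top-K detections from the heatmap. A partial selection is linear-time, sorting only the K winners. A NumPy sketch, with an illustrative function name:

```python
import numpy as np

def topk_scores(heat, k):
    """Return the k largest scores (descending) and their flat indices.

    np.argpartition is O(n); a full sort (like the slow sort above)
    is O(n log n), and only the k selected entries get sorted here.
    """
    flat = heat.ravel()
    idx = np.argpartition(flat, -k)[-k:]    # the k largest, unordered
    idx = idx[np.argsort(flat[idx])[::-1]]  # order just those k, descending
    return flat[idx], idx
```

The flat indices can be converted back to (row, col) center coordinates with np.unravel_index.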

@xiaowk5516

I didn't optimize the post-processing part... the Simd image parallel acceleration library... See: https://www.jianshu.com/p/5b272f108ed2

Thanks for your answer. I'll look into it. Thanks again.

@leilaShen

Convert the model with trace, then deploy with libtorch. Tested and working.

I converted the model with trace, but the results produced when calling it from C++ differ from Python; trace still seems inaccurate for the forward pass. script would be the alternative, but it's too hard to use and I couldn't convert the model with it.

@ShihuaiXu

I have shared it here: link: https://pan.baidu.com/s/1m6zdWSKE8soSMwXRbU1aeg extraction code: yntr

Your code uses a C++ NumPy library; can it run on an embedded board?

@LaserLV52

Convert the model with trace, then deploy with libtorch. Tested and working.

Hi, do you still have the code for converting to libtorch? My network uses DLA. Doesn't trace error out when the network contains DCN? How did you solve that?

@zoufangyu1987
Author

@ShihuaiXu
Hi! I haven't tried it on a board; you could give it a try. I used the NumPy-style library back then mainly because libtorch wasn't mature enough and I didn't know it well. It seems libtorch now has equivalents for all those processing functions.

@zoufangyu1987
Author

@LaserLV52
Hi! Here's how I "solved" DCN back then: I simply didn't use the dla_dcn version and used the version without DCN layers, dlav0 I believe. Don't laugh at me, haha.

@LaserLV52

@LaserLV52 Hi! Here's how I "solved" DCN back then: I simply didn't use the dla_dcn version and used the version without DCN layers, dlav0 I believe. Don't laugh at me, haha.

We're all learning from each other, haha. By the way, after switching to the network without DCN layers, did you compare its performance against the DCN version? Is the gap large?

@zoufangyu1987
Author

@LaserLV52
It's slightly worse, but negligibly so. Not a big deal.

@LaserLV52

@LaserLV52 It's slightly worse, but negligibly so. Not a big deal.

OK, I'll give it a try.

@LaserLV52

@LaserLV52 It's slightly worse, but negligibly so. Not a big deal.

I'm back again. I've now trained a model with the dlav0 network, so no DCN. Exporting a TorchScript via trace succeeded, and forward works fine in C++. The output format is also right: the three parts hm, wh, reg. But the result type is torch::jit::IValue, and I don't know how to work with it. I tried output.toTuple()->elements()[0].toTensor() as others suggested, but it throws. How did you solve this?

@zoufangyu1987
Author
zoufangyu1987 commented Feb 8, 2022

@LaserLV52

    auto img_tensor = torch::CPU(torch::kFloat32).tensorFromBlob(floatImg1.data, {1, 608, 608, 3}); // cv::Mat -> tensor, shape 1,608,608,3
    img_tensor = img_tensor.permute({0, 3, 1, 2});  // reorder NHWC -> NCHW: 1,3,608,608
    auto img_var = torch::autograd::make_variable(img_tensor, false);  // no gradient needed
    inputs.clear();
    inputs.emplace_back(img_var.to(at::kCUDA));  // move the preprocessed image to the GPU
    cudaDeviceSynchronize();

    //struct timeval t1_, t2_;
    //double timeuse_;
    //gettimeofday(&t1_, NULL);

    //torch::Tensor output = module->forward(inputs).toTuple()->elements()[0].toTensor();
    c10::intrusive_ptr<c10::ivalue::Tuple> output = module->forward(inputs).toTuple();

    //cudaDeviceSynchronize();
    //gettimeofday(&t2_, NULL);
    //timeuse_ = t2_.tv_sec - t1_.tv_sec + (t2_.tv_usec - t1_.tv_usec)/1000000.0;
    //printf("forward:%f\n", timeuse_);

    torch::Tensor output_c = output->elements()[0].toTensor();
    torch::Tensor output_w = output->elements()[1].toTensor();
    torch::Tensor output_h = output->elements()[2].toTensor();

    // 3x3 max pool (stride 1, pad 1) + sigmoid: keep only local maxima on the heatmap
    torch::Tensor output_maxpool = torch::max_pool2d(output_c, {3,3}, {1,1}, {1,1});
    output_c = torch::sigmoid_(output_c);
    output_maxpool = torch::sigmoid_(output_maxpool);
    torch::Tensor keep = (output_maxpool == output_c).to(torch::kFloat32);
    torch::Tensor heat = output_c * keep;

This is how I did it; give it a try.

@LaserLV52

@LaserLV52 (quoting the libtorch snippet above) This is how I did it; give it a try.

I poked around some more yesterday and it works now: you just have to modify the network's outputs in the Python code. But there's a new problem. When I inspect the network's three outputs in C++, hm, wh, reg, they differ from Python. It turns out the Python pre-processing applies an affine transform and then mean/std normalization, while my C++ only does a resize, so the outputs don't match. You presumably hit this too? How did you implement the affine transform and mean/std normalization in your C++ pre-processing? NumPy's broadcasting makes this easy in Python, but I'm not comfortable in C++ and don't know what to do with the data.
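For the normalization half of that question: the demo scales pixels to [0, 1], subtracts a per-channel mean, divides by a per-channel std, and transposes to NCHW. A NumPy sketch of that step; the mean/std values below mirror the COCO defaults in the repo's dataset files, but double-check against the values your model was trained with:

```python
import numpy as np

# Per-channel statistics; verify against your training config.
MEAN = np.array([0.408, 0.447, 0.470], dtype=np.float32)
STD = np.array([0.289, 0.274, 0.278], dtype=np.float32)

def preprocess(img_u8):
    """uint8 HWC image -> normalized 1,C,H,W float32 tensor."""
    x = img_u8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD            # broadcasts over H and W
    x = x.transpose(2, 0, 1)[None]  # HWC -> 1,C,H,W
    return x
```

The affine transform is a separate geometric step (get_affine_transform in the repo): it maps the image into the network's input resolution while preserving aspect ratio, so replacing it with a plain resize changes the geometry the model sees, which fits the output mismatch described above.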
