Issue on training (bash train_pose.sh) #212

Open
changkk opened this issue Dec 31, 2018 · 1 comment
changkk commented Dec 31, 2018

Hi, I am following the training instructions exactly.
I installed caffe_train, downloaded the 189 GB LMDB file, and launched train_pose.sh, which was generated by set_layer.py.
However, I got this error:


[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
I1230 22:03:59.697592 6184 upgrade_proto.cpp:52] Attempting to upgrade input file specified using deprecated V1LayerParameter: /home/changkoo/Realtime_Multi-Person_Pose_Estimation_ck/training/dataset/COCO/train/VGG_ILSVRC_19_layers.caffemodel
I1230 22:04:00.144183 6184 upgrade_proto.cpp:60] Successfully upgraded file specified using deprecated V1LayerParameter
I1230 22:04:00.156769 6184 upgrade_proto.cpp:66] Attempting to upgrade input file specified using deprecated input fields: /home/changkoo/Realtime_Multi-Person_Pose_Estimation_ck/training/dataset/COCO/train/VGG_ILSVRC_19_layers.caffemodel
I1230 22:04:00.156787 6184 upgrade_proto.cpp:69] Successfully upgraded file specified using deprecated input fields.
W1230 22:04:00.156791 6184 upgrade_proto.cpp:71] Note that future Caffe releases will only support input layers and not input fields.
I1230 22:04:00.156970 6184 net.cpp:761] Ignoring source layer pool1
I1230 22:04:00.157120 6184 net.cpp:761] Ignoring source layer pool2
I1230 22:04:00.158727 6184 net.cpp:761] Ignoring source layer pool3
I1230 22:04:00.161192 6184 net.cpp:761] Ignoring source layer conv4_3
I1230 22:04:00.161202 6184 net.cpp:761] Ignoring source layer relu4_3
I1230 22:04:00.161206 6184 net.cpp:761] Ignoring source layer conv4_4
I1230 22:04:00.161207 6184 net.cpp:761] Ignoring source layer relu4_4
I1230 22:04:00.161208 6184 net.cpp:761] Ignoring source layer pool4
I1230 22:04:00.161211 6184 net.cpp:761] Ignoring source layer conv5_1
I1230 22:04:00.161214 6184 net.cpp:761] Ignoring source layer relu5_1
I1230 22:04:00.161216 6184 net.cpp:761] Ignoring source layer conv5_2
I1230 22:04:00.161219 6184 net.cpp:761] Ignoring source layer relu5_2
I1230 22:04:00.161222 6184 net.cpp:761] Ignoring source layer conv5_3
I1230 22:04:00.161226 6184 net.cpp:761] Ignoring source layer relu5_3
I1230 22:04:00.161227 6184 net.cpp:761] Ignoring source layer conv5_4
I1230 22:04:00.161231 6184 net.cpp:761] Ignoring source layer relu5_4
I1230 22:04:00.161233 6184 net.cpp:761] Ignoring source layer pool5
I1230 22:04:00.161237 6184 net.cpp:761] Ignoring source layer fc6
I1230 22:04:00.161238 6184 net.cpp:761] Ignoring source layer relu6
I1230 22:04:00.161240 6184 net.cpp:761] Ignoring source layer drop6
I1230 22:04:00.161244 6184 net.cpp:761] Ignoring source layer fc7
I1230 22:04:00.161247 6184 net.cpp:761] Ignoring source layer relu7
I1230 22:04:00.161249 6184 net.cpp:761] Ignoring source layer drop7
I1230 22:04:00.161252 6184 net.cpp:761] Ignoring source layer fc8
I1230 22:04:00.161254 6184 net.cpp:761] Ignoring source layer prob
I1230 22:04:00.198676 6184 caffe.cpp:251] Starting Optimization
I1230 22:04:00.198695 6184 solver.cpp:279] Solving
I1230 22:04:00.198698 6184 solver.cpp:280] Learning Rate Policy: step
1adfadsf 0xaf9f4e0first 0xaf9ca60second 0x20second 0x21
1adfadsf 0xafa3d40first 0xaf9cc60second 0x20second 0x21
1adfadsf 0x1aa62190first 0xaf9f660second 0x20second 0x21
F1230 22:04:00.317441 6184 eltwise_layer.cpp:35] Check failed: bottom[i]->shape() == bottom[0]->shape()
*** Check failure stack trace: ***
@ 0x7f79d032b5cd google::LogMessage::Fail()
@ 0x7f79d032d433 google::LogMessage::SendToLog()
@ 0x7f79d032b15b google::LogMessage::Flush()
@ 0x7f79d032de1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f79d075d404 caffe::EltwiseLayer<>::Reshape()
@ 0x7f79d082e708 caffe::Net<>::ForwardFromTo()
@ 0x7f79d082eab7 caffe::Net<>::Forward()
@ 0x7f79d0852690 caffe::Solver<>::Step()
@ 0x7f79d08532d9 caffe::Solver<>::Solve()
@ 0x40cccf train()
@ 0x4086c0 main
@ 0x7f79cf22d830 __libc_start_main
@ 0x408ed9 _start
@ (nil) (unknown)

The error seems to come from eltwise_layer.cpp, so I dissected that file: the check fails because two bottom blobs of one of the Eltwise layers have different shapes. I am not sure why this happens, since I haven't changed anything in the prototxt files or in set_layer.py other than the source path.
Hasn't anyone else hit this error using this repo?
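To help whoever picks this up, here is a minimal debugging sketch (my own, not from the repo) that steps the net one layer at a time and prints each Eltwise layer's bottom blob shapes just before it runs. The shapes only diverge once the data layer has loaded a real batch, which is why the net has to be stepped rather than just constructed. It assumes the generated train prototxt is named pose_train_test.prototxt; adjust to whatever set_layer.py wrote for you.

```python
import sys
import caffe

caffe.set_mode_cpu()
# Build the train-phase net; the data layer opens the LMDB, so run this
# from the same directory as train_pose.sh.
net = caffe.Net('pose_train_test.prototxt', caffe.TRAIN)

for idx, name in enumerate(net._layer_names):
    if net.layers[idx].type == 'Eltwise':
        # Eltwise requires all bottoms to share one shape; the last line
        # printed before the CHECK aborts names the offending layer.
        # (net.bottom_names may be absent in very old pycaffe builds.)
        shapes = [tuple(net.blobs[b].shape) for b in net.bottom_names[name]]
        print('%s %s' % (name, shapes))
        sys.stdout.flush()  # make sure output survives the glog abort
    net.forward(start=name, end=name)  # advance exactly one layer
```

Whichever Eltwise prints mismatched shapes points at the branch whose spatial size drifted, e.g. a stride or padding difference somewhere upstream.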

I tried to unbold the error text, but I don't know how. Sorry about that!

Thanks!


changkk commented Dec 31, 2018

I just found a similar issue among the closed issues. Could this be related to the batch size? Should I use a batch size greater than 1?
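To check what batch size the run is actually using, here is a minimal sketch that parses the generated prototxt with Caffe's protobuf bindings. The file name and the CPMData layer type (with batch_size under data_param) are assumptions about what set_layer.py produces in this fork.

```python
from caffe.proto import caffe_pb2
from google.protobuf import text_format

# Parse the generated train prototxt into a NetParameter message.
net_param = caffe_pb2.NetParameter()
with open('pose_train_test.prototxt') as f:
    text_format.Merge(f.read(), net_param)

# Report the batch size of every data layer found.
for layer in net_param.layer:
    if layer.type in ('Data', 'CPMData'):
        print('%s batch_size = %d' % (layer.name, layer.data_param.batch_size))
```

Though, if I understand the check correctly, both bottoms of an Eltwise layer come from the same forward pass and share the batch dimension, so the batch size alone shouldn't make their shapes differ.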
