
Some question about training progress #10

Closed
wangjiyuan9 opened this issue Jun 5, 2023 · 6 comments

Comments

@wangjiyuan9

Dear @Dwawayu
Thank you for your amazing work!

  1. I'm now trying to retrain your model with only one RTX 3090. For the first stage I use: CUDA_VISIBLE_DEVICES=0 python -u train.py --png --model_name ** --use_denseaspp --use_mixture_loss --plane_residual --flip_right and delete all the code like dist.get_rank() == 0: etc. Are there any other details I need to pay attention to?
  2. I would like to see what your training output looks like. I've noticed that my training loss is often negative, and after about 20 epochs the val abs_rel is around 0.11. Is this normal?
  3. I notice that here you record the best model, but at HRfinetune you use --load_weights_folder ./log/ResNet/exp1/last_models. Why don't you use the best model?
  4. In the paper:
    [image: results table from the paper]
    I'm wondering: are all of your results from the 50th epoch (last_model), or from the best_model? And all three lines use the HRfinetune, right?

Thank you for your time and help.

@Dwawayu
Member

Dwawayu commented Jun 5, 2023

Dear Jiyuan,
Hi! Thank you again for your kind words and attention. Here are some suggestions:

  1. Instead of modifying the source code, I recommend changing only the launch command. To run the code on a single GPU, you can use:
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 train.py [--args]
  2. The mixture Laplace loss is the negative log-likelihood of the distribution, so a negative loss is perfectly normal. Since we validate the model at 640x192 during the first stage to save as much memory as possible, the val abs_rel may be slightly worse, but it should drop below 0.1 after about 20 epochs.
  3. I think loading either the last or the best model is fine. Since I'm continuing training rather than testing results, I personally prefer loading the last model. You can load the best_model, since it may avoid overfitting and yield better results.
  4. All three results are from the best_model after HRfinetune and self-distillation.
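The negative loss mentioned above can be illustrated with a minimal sketch (the function names and values here are my own, not from the repository): a negative-log-likelihood loss dips below zero whenever the mixture density exceeds 1, which happens when a predicted Laplace scale is small relative to the residual.

```python
import math

def laplace_pdf(x, mu, b):
    # Laplace density: exp(-|x - mu| / b) / (2 b)
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

def mixture_laplace_nll(x, components):
    # components: list of (weight, mu, b) tuples; weights sum to 1
    density = sum(w * laplace_pdf(x, mu, b) for w, mu, b in components)
    return -math.log(density)

# A tight component (small b) near the residual pushes the mixture
# density above 1, so the negative log-likelihood drops below zero.
loss = mixture_laplace_nll(0.01, [(0.7, 0.0, 0.05), (0.3, 0.0, 0.5)])
print(loss)  # negative
```

So a negative training loss is a property of continuous likelihoods, not a sign of a bug.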

I hope these help, please let me know if you have any further questions.

Best,
Ruoyu Wang

@wangjiyuan9
Author

Helpful! Thanks! :)

@wangjiyuan9
Author

Dear Ruoyu,
I have some questions about the results presented in the paper. Specifically, in the Stage 1 phase, my results on the kitti_raw_test dataset show an RMSE of 4.413 and an absolute relative difference (abs_rel) of 0.103. After applying HRfinetune, the RMSE is reduced to 4.05, but abs_rel increases to 0.9569. In contrast, the paper reports an abs_rel of 0.085 and an RMSE of 4.023.

I am unsure whether the difference in the abs_rel result is caused by an error in my implementation or by a mistake somewhere else. Moreover, I have observed that the paper refers to the self-distillation method only in relation to the third line of results (marked with a † symbol, which was obtained by using your self-distillation method for post-processing).
[image: results table from the paper]

And you said:

4. All three results are from the best_model after HRfinetune and self-distillation.

Does this mean that the first two lines of results also include self-distillation, or did I misunderstand something? (If so, I am also curious about the difference between the self-distillation used during training and the post-processing self-distillation for generating labels. Could you please elaborate on this?)

Thank you in advance for your help and clarification.

@wangjiyuan9 wangjiyuan9 reopened this Jun 9, 2023
@wangjiyuan9
Author

Also, here, why do you use color_aug to predict the pose instead of color?

@Dwawayu
Member

Dwawayu commented Jun 9, 2023

Dear Jiyuan,
Hi! For the first question, the model after the first stage should perform better. Specifically, it should have an abs_rel of approximately 0.090 and an RMSE of around 4.180, so there might be mistakes in the first stage or evaluation. Have you checked whether the model was tested at 1280*384? Furthermore, things became more perplexing after the HRfinetuning, as it showed a good RMSE but a completely inaccurate abs_rel. Considering the difference between these two metrics, there might be a very small GT depth occurring somewhere. A visualization of the predicted depth or abs_rel error of each image may help us to find the mistake.

For the second question, self-distillation enhances the raw predictions, so it can both generate labels during training and improve predictions at test time. Our best model was obtained after self-distillation training, and the three results are the raw prediction, the pp (post-processing) of the raw prediction, and the sd (self-distillation) of the raw prediction when testing that best model.
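For readers unfamiliar with "pp": it typically refers to flip-based post-processing, which this sketch assumes (horizontal-flip averaging as commonly used in self-supervised depth code; this is my own minimal version, not the repository's implementation).

```python
import numpy as np

def flip_postprocess(disp, disp_from_flipped):
    # disp: disparity predicted for the original image (H x W).
    # disp_from_flipped: disparity predicted for the horizontally
    # mirrored image. Re-flip it back and average the two.
    return 0.5 * (disp + disp_from_flipped[:, ::-1])

d = np.array([[1.0, 2.0, 3.0]])
d_flip = np.array([[3.0, 2.0, 1.0]])   # prediction on mirrored input
out = flip_postprocess(d, d_flip)
```

Averaging the two views suppresses occlusion artifacts near the image borders at the cost of a second forward pass.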

@Dwawayu
Member

Dwawayu commented Jun 9, 2023

Oh! Regarding the new question: the augmented images are used as inputs to all networks to enrich the training data.
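For context, the usual convention in self-supervised depth pipelines (e.g. Monodepth2-style code; this sketch and its names are my own, not the repository's) is that augmented frames feed the networks while the photometric loss still targets the raw frames, so color jitter never corrupts the supervision signal:

```python
import numpy as np

def photometric_step(color, color_aug, net):
    # The network consumes the augmented frame...
    pred = net(color_aug)
    # ...but the loss compares against the raw (non-augmented) frame.
    return np.abs(pred - color).mean()

identity_net = lambda x: x        # stand-in for depth/pose networks
img = np.zeros((4, 4))
img_aug = img + 0.1               # simulated color jitter
loss = photometric_step(img, img_aug, identity_net)
```

Feeding color_aug to the pose network is therefore a data-augmentation choice, not a change to the loss target.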
