add ADE20K dataset #3853
Conversation
Thank you @walkerlala ! Mainly style nits.
FLAGS = tf.app.flags.FLAGS
flags = tf.app.flags

tf.app.flags.DEFINE_string(
flags?
Ah.. just some ancient code. I will remove it ;-)
import random
import string
import sys
from PIL import Image
This is legacy code. I will remove it.
    RuntimeError: If loaded image and label have different shape.
  """

  img_names = glob.glob(os.path.join(dataset_dir, '*.jpg'))
maybe tf.gfile.Glob(...)?
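For illustration, a minimal sketch of the image/segmentation pairing this part of the diff performs, using a hypothetical helper name. `glob.glob` and `tf.gfile.Glob` take the same glob pattern, so the suggestion is a drop-in change that additionally supports paths handled by TensorFlow's file system layer (e.g. GCS buckets):

```python
import glob
import os


def list_image_seg_pairs(dataset_dir, dataset_label_dir):
  """Pairs each .jpg image with its .png segmentation map by base name.

  Hypothetical helper sketching the logic around this diff; swap
  glob.glob for tf.gfile.Glob to also read from GCS-style paths.
  """
  img_names = sorted(glob.glob(os.path.join(dataset_dir, '*.jpg')))
  seg_names = []
  for f in img_names:
    # ADE20K labels share the image's base name with a .png extension.
    basename = os.path.splitext(os.path.basename(f))[0]
    seg_names.append(os.path.join(dataset_label_dir, basename + '.png'))
  return img_names, seg_names
```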
    seg_names.append(seg)

  num_images = len(img_names)
  num_per_shard = int(math.ceil(num_images) / float(_NUM_SHARDS))
math.ceil(num_images / float(_NUM_SHARDS)) ?
Thanks!
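The misplaced parenthesis matters: `math.ceil(num_images)` is a no-op on an integer, so the original line is plain floor division and the last shard can lose images. A minimal sketch of the corrected ceiling division:

```python
import math

_NUM_SHARDS = 4  # as in the diff above


def images_per_shard(num_images, num_shards=_NUM_SHARDS):
  """Ceiling division: num_shards shards always cover all images."""
  return int(math.ceil(num_images / float(num_shards)))
```

With the 20210 training images and 4 shards, this gives 5053 per shard (4 × 5053 = 20212 ≥ 20210), whereas the buggy expression gives 5052 and silently drops the last two images.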
CURRENT_DIR=$(pwd)
WORK_DIR="./ADE20K"
mkdir -p ${WORK_DIR}
"${WORK_DIR}", please also fix variables below.
I am just trying to "stay consistent with existing code ;-)". Anyway, I will fix those.
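A minimal sketch of why the quoting matters: an unquoted `${WORK_DIR}` undergoes word splitting, so a path containing whitespace would be passed to `mkdir` as two arguments. The directory name here is illustrative:

```shell
#!/bin/sh
# Quoted expansions survive whitespace in the path; unquoted ones
# would be split into separate words before mkdir sees them.
CURRENT_DIR=$(pwd)
WORK_DIR="./ADE20K"
mkdir -p "${WORK_DIR}"

# The same rule applies to every later use of the variables, e.g.:
cd "${WORK_DIR}"
cd "${CURRENT_DIR}"
```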
Thanks, walkerlala!
Overall looks great. A few suggestions. Please take a look.
_NUM_SHARDS = 4


def _convert_dataset(dataset_split, dataset_dir, dataset_label_dir):
  """ Convert the ADE20k dataset into into tfrecord format (SSTable).
"" Convert -> """Converts
""" Convert the ADE20k dataset into into tfrecord format (SSTable). | ||
|
||
Args: | ||
dataset_split: dataset split (e.g., train, val) |
Make first word capital.
And put a period (i.e., '.') at the end of each sentence.
splits_to_sizes = {
    'train': 20210,  # num of samples in images/training
    'val': 2000,  # num of samples in images/validation
    'eval': 2,
What is this split? Add comment for it? It seems that this split is not generated?
Oh yes, this split is used for my own testing. I just forgot to remove it. I will remove it.
@@ -99,7 +99,7 @@ def get_model_init_fn(train_logdir,
   tf.logging.info('Initializing model from path: %s', tf_initial_checkpoint)

   # Variables that will not be restored.
-  exclude_list = ['global_step']
+  exclude_list = ['global_step', 'logits']
There are cases where you do want to restore the `logits` weights (e.g., further fine-tuning them on the validation set).
My proposal will be: Could you please
- Modify get_extra_layer_scopes in model.py as follows.

  def get_extra_layer_scopes(last_layers_contain_logits_only=False):
    """Gets the scopes for extra layers.

    Args:
      last_layers_contain_logits_only: Boolean, True if only consider logits
        as the last layers (i.e., exclude ASPP module, decoder module and so on).

    Returns:
      A list of scopes for extra layers.
    """
    if last_layers_contain_logits_only:
      return [_LOGITS_SCOPE_NAME]
    else:
      return [
          _LOGITS_SCOPE_NAME,
          _IMAGE_POOLING_SCOPE,
          _ASPP_SCOPE,
          _CONCAT_PROJECTION_SCOPE,
          _DECODER_SCOPE,
      ]

- Add a flag in train.py.

  flags.DEFINE_boolean('last_layers_contain_logits_only', False,
                       'Only consider logits as last layers or not.')

- Modify line 295 in train.py as follows.

  last_layers = model.get_extra_layer_scopes(
      FLAGS.last_layers_contain_logits_only)
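To make the interaction concrete, here is a dependency-free sketch of how the proposed flag would drive the exclude list. The scope-name strings and the `build_exclude_list` helper are illustrative stand-ins, not the actual constants or code in model.py / train_utils.py:

```python
# Hypothetical stand-ins for the scope constants defined in model.py.
_LOGITS_SCOPE_NAME = 'logits'
_IMAGE_POOLING_SCOPE = 'image_pooling'
_ASPP_SCOPE = 'aspp'
_CONCAT_PROJECTION_SCOPE = 'concat_projection'
_DECODER_SCOPE = 'decoder'


def get_extra_layer_scopes(last_layers_contain_logits_only=False):
  """Mirrors the proposal above: logits only, or all extra layers."""
  if last_layers_contain_logits_only:
    return [_LOGITS_SCOPE_NAME]
  return [
      _LOGITS_SCOPE_NAME,
      _IMAGE_POOLING_SCOPE,
      _ASPP_SCOPE,
      _CONCAT_PROJECTION_SCOPE,
      _DECODER_SCOPE,
  ]


def build_exclude_list(initialize_last_layer, last_layers):
  """Sketch: always skip the step counter; skip the last layers only
  when they are meant to be trained from scratch on the new dataset."""
  exclude_list = ['global_step']
  if not initialize_last_layer:
    exclude_list.extend(last_layers)
  return exclude_list
```

This keeps the hard-coded `'logits'` out of train_utils.py entirely, so models that do want to restore those weights are unaffected.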
research/deeplab/g3doc/ade20k.md
fine_tune_batch_norm = False.

2. User should fine tune the `min_resize_value` and `max_resize_value` to get
   better result. Note that `resize_factor` has to equals to `output_stride`.
has to equals -> has to be equal
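For readers of the doc, a hedged example of what the constraint might look like on the command line. The numeric values are placeholders for illustration only, not tuned recommendations; the point is that `resize_factor` matches `output_stride`:

```shell
# Illustrative flag values only; tune min/max_resize_value per the doc.
# The constraint: --resize_factor must equal --output_stride.
python train.py \
  --dataset=ade20k \
  --min_resize_value=350 \
  --max_resize_value=500 \
  --resize_factor=16 \
  --output_stride=16
```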
@YknZhu @aquariusjay All fixed. Please review.
Looks great! Please wait for @aquariusjay's review.
Looks great! Thank you, walkerlala!
Hello @walkerlala, could you share a script similar to local_test.sh with all the steps to download, convert, and retrain a DeepLab model on the ADE20K dataset, instead of the PASCAL dataset as in local_test.sh?
This PR provides scripts and documents for downloading and converting the ADE20K dataset and training DeepLabv3 on it. One particular thing to note is the `exclude_list` variable in `utils/train_utils.py`. I think it is OK to do this, but it might hurt performance of other models.