change reference_data.py to use tf.gfile #3921
I am curious why we read and write the JSON file in binary mode. What is the content of the JSON file? The TF version should just be text.
My contention is that we should adopt one absolutely safe way to dump JSON and use it everywhere, which unfortunately means serializing first and then writing to the file. I understand the position that sometimes it's overkill, but that way the code is robust and you never have to think about whether it will ever see anything other than ASCII.
Please add lots of comments explaining why this is necessary, to keep the next helpful engineer from thinking, gee, this could be simpler if we...
As we discussed offline, I think for now it's better to skip the UTF-8 encode/decode, since most of the data you dump to JSON is just text. Using "wb" and "rb" confuses people.
f3c519b to 3f29187
Alright, everything should be good now. The tests were also throwing warnings due to a small change in batch_norm, so I updated them.
b3a2443 to 7f6373d
"Superficial change in batch_norm" in theory shouldn't require an update here, right? I thought we were going to be robust to that? I worry that we're about to introduce flaky tests.
We are robust to that. If you don't change the reference files, the tests still pass; you just get warnings that the op graph differs from what is expected. So once we confirm that the change is innocuous, we can update the reference files at our convenience.
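The warn-but-pass behavior described above could look roughly like the sketch below. It is not the code from reference_data.py: the function name, its arguments, and the assumption that numerical outputs (rather than the op graph) decide pass or fail are all hypothetical.

```python
import warnings


def check_against_reference(expected_ops, actual_ops,
                            expected_values, actual_values, tol=1e-6):
    """Hypothetical reference check: graph drift warns, value drift fails."""
    # A differing op graph is reported but does not fail the test.
    if list(expected_ops) != list(actual_ops):
        warnings.warn("Op graph differs from the reference files; "
                      "confirm the change is innocuous before updating them.")

    # Assumption: the numerical results are what must match the reference.
    if len(expected_values) != len(actual_values):
        raise AssertionError("Number of output values changed.")
    for expected, actual in zip(expected_values, actual_values):
        if abs(expected - actual) > tol:
            raise AssertionError(
                "Value mismatch: expected {}, got {}".format(expected, actual))
```

With a check shaped like this, a renamed op only produces a warning, and the reference files can be regenerated later once the change has been reviewed.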
* change reference_data.py to use tf.gfile
* simplify json treatment
* Update reference files to account for a superficial change in batch_norm
Reference tests are failing on TAP because they use open(). Due to some quirks with gfile (in py2 it defaults to bytes and in py3 it defaults to string, but it is illegal to pass "rt" or "rb") it is necessary to only interact with files as bytes. This leads to some awkward encode/decode calls with json, but those calls keep the code identical between py2 and py3.
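A minimal sketch of the bytes-only pattern described above, for readers who want to see the encode/decode calls in context. The helper names are hypothetical, and `io.BytesIO` stands in for a file handle obtained through gfile so the example runs without TensorFlow; the actual code in reference_data.py may differ.

```python
import io
import json


def dump_json_bytes(obj, file_handle):
    """Serialize to a JSON string first, then write it as UTF-8 bytes."""
    serialized = json.dumps(obj, sort_keys=True)
    file_handle.write(serialized.encode("utf-8"))


def load_json_bytes(file_handle):
    """Read raw bytes, decode them as UTF-8, then parse the JSON text."""
    return json.loads(file_handle.read().decode("utf-8"))


# BytesIO is an assumption standing in for a bytes-mode file handle.
buf = io.BytesIO()
dump_json_bytes({"op": "batch_norm", "shape": [1, 2]}, buf)
buf.seek(0)
restored = load_json_bytes(buf)
```

Because the helpers only ever pass bytes to and from the handle, they never depend on whether the handle defaults to bytes or to text, which is the point of the awkward-looking encode/decode pair.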