Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add custom x axes to TensorBoard #38

Merged
merged 7 commits into from
Dec 4, 2019
Merged

Add custom x axes to TensorBoard #38

merged 7 commits into from
Dec 4, 2019

Conversation

naeioi
Copy link
Member

@naeioi naeioi commented Nov 23, 2019

This PR adds support for custom x-axes to the TensorBoard scalar plot. The axis names are specified in the tensorboard output constructor.

class TensorBoardOutput(LogOutput):
    def __init__(self,
                 log_dir,
                 x_axes=None,
                 flush_secs=120,
                 histogram_samples=1e3): 

When x_axes is None, it falls back to use iterations as the x-axis.
If any x_axis is not present in the scalar tabular. A warning will be logged to the console.
If all x_axes are not present, it falls back to use iteration as the x-axis.

Screenshots of an experiment with Epoch and TotalEnvSteps as x-axes.
Screenshot from 2019-11-22 20-43-24
Screenshot from 2019-11-22 20-43-43

I will open a separate PR in garage repo to set TotalEnvSteps as the default x-axis.

@naeioi naeioi requested a review from a team as a code owner November 23, 2019 04:48
@codecov
Copy link

codecov bot commented Nov 23, 2019

Codecov Report

Merging #38 into master will decrease coverage by 0.07%.
The diff coverage is 93.93%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #38      +/-   ##
==========================================
- Coverage    94.2%   94.13%   -0.08%     
==========================================
  Files           7        7              
  Lines         328      358      +30     
  Branches       48       58      +10     
==========================================
+ Hits          309      337      +28     
  Misses         12       12              
- Partials        7        9       +2
Impacted Files Coverage Δ
src/dowel/logger.py 93.67% <ø> (-0.08%) ⬇️
src/dowel/tensor_board_output.py 95.55% <93.93%> (-1.06%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ae0b7c...b76c6db. Read the comment docs.

@ryanjulian
Copy link
Member

If any x_axis is not present in the scalar tabular. A warning will be logged to the console.

I think this should be an error -- you told me to expect a key and then didn't give it to me.

@ryanjulian
Copy link
Member

I'm actually willing to endure the API breakage of not implicitly counting itrs in TensorBoardOutput -- because I think that it's a bad pattern.

@krzentner WDYT?

src/dowel/logger.py Outdated Show resolved Hide resolved
for axis in self._x_axes:
if axis not in nonexist_axes and key is not axis:
x = data.as_dict[axis]
self._record_kv('{}/{}'.format(axis, key), value, x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you were going to do this, you should use the prefix feature.

I recommend the following pattern, though, which I think will lead to better TensorBoard displays:

  • Default x-axis (either specific or just x_axes[0]) gets unmodified keys, such as Policy/Entropy
  • Additional x-axes get keys pre- or post-fixed with some non-/ char and their x-axis

e.g.:

  • post: : Policy/Entropy@Itr
  • pre: Itr@Policy/Entropy

I recommend you try both and see which is most intuitive for someone who doesn't know how to write a regex.

These are intended to:

  • Make it easy to filter the TB display to show only one kind of x-axis using regexes
  • Preserve the sections in the web-app, which are defined by the first level of / tokens

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default x-axis (either specific or just x_axes[0]) gets unmodified keys, such as Policy/Entropy

This is probably not a good idea, because the axis name is not displayed on tensorboard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • post: : Policy/Entropy@Itr
  • pre: Itr@Policy/Entropy

I like the post one. The pre one will double the number of sections if there are two axes, which is visually distracting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: defaults
Most users will be using one x-axis key, and the default should not be ugly. Perhaps you can accept both a str and list for x_axes as courtesy to users and only apply labels in the list case. I still think this isn't quite right, because even users of more than one x-axis have a default in mind which they will look at 90% of the time, and only use the others for debugging specific issues.

Alternatively, you could change the API to have 2 args: x_axis and additional_x_axes, and only label the for additional_x_axes. This is a little less magical than what I suggested with using x_axes[0].

As of today, we also don't have axis labels and the default is implicitly Itr but noted nowhere, so we already have the problem you mentioned today. AFAIK nobody's ever asked me what our x-axis represents. If you are scared of not exposing necessary information, you could always output a Text summary which tells the user how TensorBoardOutput is configured. In practice, I doubt that. This is a basic flaw in the TensorBoard interface, so we're not going to be able to solve it fully.

re:pre- vs post-fix
It's all about which distraction you prefer. It might be better to have all of the plots of one x-axis type grouped into their own section, so you can look at just those without using a regex.

Perhaps you can make examples of both for us to play with?

Copy link
Member Author

@naeioi naeioi Nov 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: defaults
When the axis name is not displayed on tensorboard, printing this name to console helps a bit, but will easily be omitted. There are simply too many messages on the console and people will only scrutinize them when the program fails.

TensorBoard has the flaw of not displaying the x-axis name, but your suggestion of using @axis is an excellent workaround to hint people what they are reading.

Plus, the default x-axis is specified in experiment_wrapper.py, which is a file that most regular users will not touch. Don't expect anyone to know what is x-axis.

Printing this name to TensorBoard will add some redundancy, but I think it wins by being much easier to understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@axis is nice for non-defaults but it makes the default case worse, which is the case which 90% of users will only ever experience. this is pretty unpythonic. see: https://www.python.org/dev/peps/pep-0008/#a-foolish-consistency-is-the-hobgoblin-of-little-minds

this intention here is to establish that the default x-axis in garage is always TotalEnvSteps. we can put it in the documentation and all over the code. beating users over the head with it over and over again in the TensorBoard is consistent in the way that makes programmers feel "correct" and all warm and fuzzy inside, and makes life harder for everyone else for no reason.

i'll add more reviewers to ask what they think. if you really want to convince people your way is better then make some demos. you are proposing changing the status quo, so the burden is on you to convince everyone.

Copy link
Member Author

@naeioi naeioi Nov 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take your suggestion of using x_axis + additional_x_axes as TensorboardOutput's interface.

Meanwhile, I want to point out that TotalEnvSteps as x-axis is not the status quo in open-source frameworks' support for Tensorboard. Most frameworks use iterations, just like what garage is doing now. To list some of the most popular ones,

Dopamine
tf-agents
ray-rllib, code1, code2
keras-rl

So after this PR, garage will be one of the few frameworks, if not the only one, that support using TotalEnvSteps as the x-axis in TensorBoard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and IMO it will be a great differentiator. One of the great hassles of using RL libraries is that the x-axis of the output logs differ greatly between libraries when the mathematically-valid x-axis is always TotalEnvSteps (or perhaps TotalGradSteps).

Publications use TotalEnvSteps almost universally to compare algorithms. Determining the actual number of TotalEnvSteps, if it's not explicitly logged, can itself be a challenge which takes hours to figure out in an unfamiliar codebase, depending on how obfuscated the code is.

btw, tf_agents uses the global gradient step counter (e.g. TotalGradSteps), not Itr as defined by garage. This can make a lot sense sometimes in off-policy algorithms, but is mostly a hold-over from the practices used for supervised learning. The relevant publication benchmark is still TotalEnvSteps.

@ryanjulian
Copy link
Member

If you want to share a sample with us, you can use https://tensorboard.dev

src/dowel/logger.py Outdated Show resolved Hide resolved
@naeioi
Copy link
Member Author

naeioi commented Nov 27, 2019

I found that allowing missing axes is inevitable. For example, DDPG calls logger.dump() every epoch, but the scalar table is empty for the beginning several epochs. So I keep this as a warning but not an error.

I also found that '@' in the name will be replaced with underline by tensorboard. The followings are candidate naming formats.

post_slash: https://tensorboard.dev/experiment/sydhawaaQr2BFlCnqzF2qQ/#scalars
post_underline: https://tensorboard.dev/experiment/wQsEzHaGQkekfnB5GqSz9g/#scalars
pre_slash: https://tensorboard.dev/experiment/zZi3idwRRGOq4SVOybs2Nw/#scalars
pre_underline: https://tensorboard.dev/experiment/Bt1o8m20S6Op5nXInfQgvA/#scalars
pre_full_axes_name: https://tensorboard.dev/experiment/jUElLlA3TQu7d7AskiBuvw/#scalars

@ryanjulian
Copy link
Member

Ah, let me take a look at those. What a bummer that TF removes the @

@ryanjulian ryanjulian requested review from krzentner and removed request for zhanpenghe and krzentner December 3, 2019 01:14
@naeioi naeioi requested a review from avnishn December 4, 2019 00:27
Copy link
Member

@avnishn avnishn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ryanjulian ryanjulian merged commit 7b9fed2 into master Dec 4, 2019
@ryanjulian ryanjulian deleted the custom_x_axes branch December 4, 2019 02:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants