This is a simple demonstration of XProf's capabilities on an example training workload running on Cloud TPUs.
# Install the stable version of XProf
pip install -U xprof
# Update protobuf version in the environment
pip install -U protobuf
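Before continuing, it can help to confirm that both packages are visible to the notebook's Python environment. The snippet below is a minimal sketch that uses only the standard-library `importlib.metadata` module; the helper name `installed_version` is hypothetical and is not part of XProf.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what the current environment has for the two packages installed above.
for name in ("xprof", "protobuf"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line prints `not installed`, the `pip install` commands probably ran against a different environment than the one the notebook kernel uses.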
# git clone the xprof repo so we have access to the demo data there
git clone http://github.com/openxla/xprof
Cloning into 'xprof'...
warning: redirecting to https://github.com/openxla/xprof/
remote: Enumerating objects: 15835, done.
remote: Counting objects: 100% (527/527), done.
remote: Compressing objects: 100% (342/342), done.
remote: Total 15835 (delta 287), reused 190 (delta 185), pack-reused 15308 (from 2)
Receiving objects: 100% (15835/15835), 79.33 MiB | 41.49 MiB/s, done.
Resolving deltas: 100% (11905/11905), done.
# Load the TensorBoard notebook extension.
%load_ext tensorboard
# Launch TensorBoard and navigate to the Profile tab to view the performance profile.
%tensorboard --logdir=xprof/demo
Once TensorBoard loads the profile data, use the Tools dropdown to select the tool you want to explore. Please see the tool-specific documentation pages for explanations of each tool's outputs.
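If the Profile tab comes up empty, a quick check of the log directory can show whether any profile data is there to load. The sketch below assumes the conventional TensorBoard profile layout, `<logdir>/plugins/profile/<run>/<host>.xplane.pb`; that layout, and the helper name `find_profile_runs`, are assumptions for illustration and have not been verified against the contents of `xprof/demo`.

```python
from pathlib import Path


def find_profile_runs(logdir):
    """Map each run name to its trace files under <logdir>/plugins/profile/.

    Assumes the conventional layout <logdir>/plugins/profile/<run>/*.xplane.pb.
    Returns an empty dict if the directory is missing or holds no traces.
    """
    profile_root = Path(logdir) / "plugins" / "profile"
    if not profile_root.is_dir():
        return {}
    runs = {}
    for run_dir in sorted(p for p in profile_root.iterdir() if p.is_dir()):
        traces = sorted(f.name for f in run_dir.glob("*.xplane.pb"))
        if traces:
            runs[run_dir.name] = traces
    return runs


# Same directory that was passed to --logdir above.
runs = find_profile_runs("xprof/demo")
if not runs:
    print("No profile runs found under xprof/demo/plugins/profile")
for run, traces in runs.items():
    print(f"{run}: {', '.join(traces)}")
```

An empty result only means that no files matched the assumed layout; it does not by itself indicate a problem with the installation.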