After using both CoLab GPUs and TPUs for almost a month, I must significantly revise my previous opinion. Even for a Keras model that was not written or optimized for TPUs, TPUs are much faster with some minimal configuration changes: at least twice the speed. Apart from making sure that all operations are TPU compatible, the only major configuration change required is multiplying the batch size by 8. At first I experimented with the batch size, but I realized that this was unnecessary. A TPU has 8 shards (cores) and each batch is split across them, so you simply multiply the GPU batch size by 8 and that should be a good baseline.
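The batch-size rule above can be sketched as a small helper. The function names and the `num_shards` parameter are illustrative only, not part of Keras or TensorFlow, and the default of 8 simply mirrors the shard count mentioned above.

```python
def tpu_batch_size(gpu_batch_size, num_shards=8):
    """Scale a GPU batch size to a TPU baseline (hypothetical helper)."""
    if gpu_batch_size <= 0 or num_shards <= 0:
        raise ValueError("batch size and shard count must be positive")
    return gpu_batch_size * num_shards


def per_shard_batch_size(global_batch_size, num_shards=8):
    """Return how many examples each shard sees per step (hypothetical helper)."""
    if global_batch_size % num_shards != 0:
        raise ValueError("global batch size must be divisible by the shard count")
    return global_batch_size // num_shards


print(tpu_batch_size(32))         # 256
print(per_shard_batch_size(256))  # 32
```

With this scaling, each shard ends up processing the same number of examples per step as the GPU did, which is probably why the GPU batch size times 8 works as a baseline.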
The model I am currently training on a TPU and a GPU simultaneously trains 3-4x faster on the TPU than on the GPU, and the code is exactly the same. I have this block of code:
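import os
import tensorflow as tf  # TF 1.x only: tf.contrib was removed in TF 2.x

use_tpu = True
# If we are using the TPU, convert the Keras model (model, defined earlier) to a TPU model.
if use_tpu:
    TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    # Create the TPU model from the existing Keras model.
    tpu_model = tf.contrib.tpu.keras_to_tpu_model(
        model, strategy=tf.contrib.tpu.TPUDistributionStrategy(
            tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)))
    # Multiply the GPU batch size by 8, one share per TPU shard.
    BATCH_SIZE = BATCH_SIZE * 8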
The model is created with Keras and the only change I make is setting use_tpu to True on the TPU instance.
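If you would rather not flip the flag by hand, one option is to derive it from the environment. This is a minimal sketch and not what I actually run: it assumes that `COLAB_TPU_ADDR` is only set on TPU instances, and `tpu_worker_address` is a made-up helper name.

```python
import os


def tpu_worker_address(env=None):
    """Return the gRPC address of the TPU worker, or None if no TPU is attached.

    Hypothetical helper: assumes COLAB_TPU_ADDR is present only on TPU instances.
    """
    env = os.environ if env is None else env
    addr = env.get('COLAB_TPU_ADDR')
    return 'grpc://' + addr if addr else None


# True on a TPU instance, False otherwise (under the assumption above).
use_tpu = tpu_worker_address() is not None
```

The same notebook could then run unchanged on either instance type, with the TPU conversion and the batch size scaling applied only when `use_tpu` is True.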
One other thing I thought I would mention is that CoLab creates separate instances for GPU, TPU and CPU, so you can run multiple notebooks without sharing RAM or processor if you give each one a different type.