How to use a TensorFlow Hub module with the TensorFlow Dataset API

I want to use the TensorFlow Dataset API to build my dataset with the help of TensorFlow Hub, using a mapping function to convert my text data into embeddings. My TensorFlow version is 1.14.

Since I am using the ELMo v2 module, which converts an array of sentences into their word embeddings, I wrote the following code:

import tensorflow as tf
import tensorflow_hub as hub
sentences_array = load_sentences()
# sentences_array = ["I love Python", "python is a good PL"]
def parse(sentences):
    elmo = hub.Module("./ELMO")
    embeddings = elmo([sentences], signature="default", as_dict=True) 
    return embeddings
dataset = tf.data.Dataset.from_tensor_slices(sentences_array)
dataset = dataset.apply(tf.data.experimental.map_and_batch(
    map_func=parse, batch_size=batch_size))

I want the embeddings of the text array to have shape [batch_size, max_words_in_batch, embedding_size], but I got this error message:

"NotImplementedError: Using TF-Hub module within a TensorFlow defined 
 function is currently not supported."

How can I get the expected results?


Unfortunately, this is not supported in TensorFlow 1.x.

It is, however, supported in TensorFlow 2.0. If you can upgrade to TensorFlow 2 and choose one of the text embedding modules currently available for TF 2, you can call the module inside your dataset pipeline. Something like this:

import tensorflow as tf
import tensorflow_hub as hub

embedder = hub.load("")

def parse(sentences):
    embeddings = embedder([sentences])
    return embeddings

dataset = tf.data.TextLineDataset("text.txt")
dataset = dataset.map(parse)

If you are tied to 1.x or tied to ELMo (which I don't think is available in the new format yet), then the only option I can see for embedding at the preprocessing stage is to first run your dataset through a simple embedding model and save the results, then use the saved vectors for the downstream task as a separate step. (I appreciate this is less than ideal.)
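
A rough sketch of that two-step workaround, under a few assumptions: the helper names (make_elmo_embedder, precompute_embeddings, load_embeddings) and the .npz storage layout are made up for illustration, the "./ELMO" path is taken from the question, and "elmo" is assumed to be the output key that holds the [batch_size, max_words_in_batch, embedding_size] tensor. The module is run in an ordinary TF 1.x session, outside any Dataset map function, and each embedded batch is stored under its own key because the padded length differs between batches.

```python
import numpy as np


def make_elmo_embedder(module_path="./ELMO"):
    """Hypothetical TF 1.x helper: returns a function that maps a list of
    strings to a [batch, max_words_in_batch, embedding_size] NumPy array."""
    import tensorflow as tf  # imported lazily; this part assumes TF 1.x
    import tensorflow_hub as hub

    graph = tf.Graph()
    with graph.as_default():
        sentences_ph = tf.placeholder(tf.string, shape=[None])
        elmo = hub.Module(module_path)
        outputs = elmo(sentences_ph, signature="default", as_dict=True)
        init = [tf.global_variables_initializer(), tf.tables_initializer()]
    sess = tf.Session(graph=graph)
    sess.run(init)
    # "elmo" is assumed to be the per-word embedding output of the module.
    return lambda batch: sess.run(outputs["elmo"], {sentences_ph: list(batch)})


def precompute_embeddings(sentences, embed_batch, batch_size, out_path):
    """Embed sentences batch by batch and save every batch to one .npz file.
    out_path should end in ".npz". Returns the number of batches written."""
    arrays = {}
    for i, start in enumerate(range(0, len(sentences), batch_size)):
        batch = sentences[start:start + batch_size]
        arrays["batch_%d" % i] = np.asarray(embed_batch(batch))
    np.savez(out_path, **arrays)
    return len(arrays)


def load_embeddings(path):
    """Load the saved batches back, in their original order."""
    with np.load(path) as data:
        return [data["batch_%d" % i] for i in range(len(data.files))]
```

The preprocessing run would then look like precompute_embeddings(sentences_array, make_elmo_embedder(), batch_size, "embeddings.npz"), and the downstream job would read the arrays back with load_embeddings("embeddings.npz") and feed them to the model, for example through tf.data.Dataset.from_generator.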

Posted by Stewart_R