This is a tool to visualize the distribution of attention in a text-based sequence-to-sequence task such as summarization. As you hover your mouse over the decoded words, the tool shows a heatmap of attention over the source words. A demo of the original source code, from See et al., can be found here.
Additionally, for pointer-generator networks such as the one described in this paper, the tool displays the generation probability of each decoded word. This tool was designed to work with the TensorFlow code for the paper.
To run the visualizer, run
python -m http.server
from this directory and open http://localhost:8000/ (or whichever port the server was started on). The visualizer will show some example data.
Add your images to the images folder, along with the corresponding JSON file in the jsons folder. The JSON file can be produced by this TensorFlow code, or by our modified model. Each JSON file should contain the following fields:
article_lst: the article (or source text) as a list of words.
decoded_lst: the decoded (machine-generated) summary as a list of words.
abstract_str: the reference summary as a single string.
attn_dists: a list with the same length as decoded_lst, where each entry is a list of probabilities of length "attention length". The attention length must be less than or equal to the length of article_lst. For example, the article may have 500 words, but if only the first 200 words are fed into the model, the attention will be of length 200. In this case, the visualizer marks the truncation point in the article.
p_gens: a list containing the generation probability for each word in decoded_lst.
url: the URL, or file name, of the infographic being decoded.
positions: a list of bounding boxes, one for each word in decoded_lst. The list contains 8 values per word, for a total of 8 × len(decoded_lst) values. The 8 values are the coordinates of the corners of the bounding box, going clockwise from the top left: top_left_x, top_left_y, top_right_x, top_right_y, bottom_right_x, bottom_right_y, bottom_left_x, bottom_left_y, where x is the horizontal and y is the vertical distance from the top-left corner of the infographic.
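The field requirements above can be checked before a file is placed in the jsons folder. The following is a minimal sketch, not part of the visualizer itself: the function name validate_record and the example values are hypothetical, and it assumes that positions is a single flat list of 8 numbers per decoded word and that p_gens holds one value per decoded word.

```python
import json

REQUIRED_FIELDS = (
    "article_lst", "decoded_lst", "abstract_str",
    "attn_dists", "p_gens", "url", "positions",
)


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        return ["missing fields: " + ", ".join(missing)]

    problems = []
    n_article = len(record["article_lst"])
    n_decoded = len(record["decoded_lst"])

    # One attention distribution per decoded word.
    if len(record["attn_dists"]) != n_decoded:
        problems.append("attn_dists must have one entry per decoded word")
    # The attention length may be shorter than the article, never longer.
    for i, dist in enumerate(record["attn_dists"]):
        if len(dist) > n_article:
            problems.append(f"attn_dists[{i}] is longer than article_lst")

    # One generation probability per decoded word (assumed layout).
    if len(record["p_gens"]) != n_decoded:
        problems.append("p_gens must have one entry per decoded word")

    # Assumption: positions is flat, 8 coordinates per decoded word.
    if len(record["positions"]) != 8 * n_decoded:
        problems.append("positions must contain 8 values per decoded word")

    return problems


# Hypothetical example record; the values are made up for illustration.
example = {
    "article_lst": ["sales", "rose", "in", "2020"],
    "decoded_lst": ["sales", "rose"],
    "abstract_str": "sales went up",
    "attn_dists": [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],  # attention length 3
    "p_gens": [0.9, 0.6],
    "url": "example.png",
    "positions": [0, 0, 10, 0, 10, 5, 0, 5,
                  12, 0, 20, 0, 20, 5, 12, 5],
}

problems = validate_record(example)
serialized = json.dumps(example)  # the text that would be written to the JSON file
```

If your model writes positions as nested lists (one list of 8 values per word), the last check would need to be adjusted accordingly.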
WARNING: Make sure that none of the strings in article_lst, decoded_lst, or abstract_str contain <angled brackets>. These will interfere with the HTML and can result in text not being displayed.
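One way to catch this before loading a file is to scan the three text fields for "<" and ">". The sketch below is an illustration only: the helper names find_angle_brackets and strip_angle_brackets are hypothetical, and dropping the bracket characters is just one possible fix (for example, a token such as "<unk>" would become "unk").

```python
def find_angle_brackets(record):
    """Return (field, index, text) for every string containing '<' or '>'."""
    offenders = []
    for field in ("article_lst", "decoded_lst"):
        for i, word in enumerate(record.get(field, [])):
            if "<" in word or ">" in word:
                offenders.append((field, i, word))
    abstract = record.get("abstract_str", "")
    if "<" in abstract or ">" in abstract:
        # abstract_str is a single string, so it has no word index.
        offenders.append(("abstract_str", None, abstract))
    return offenders


def strip_angle_brackets(record):
    """Return a copy of the record with '<' and '>' removed from text fields."""
    def clean(text):
        return text.replace("<", "").replace(">", "")

    cleaned = dict(record)
    for field in ("article_lst", "decoded_lst"):
        if field in cleaned:
            cleaned[field] = [clean(word) for word in cleaned[field]]
    if "abstract_str" in cleaned:
        cleaned["abstract_str"] = clean(cleaned["abstract_str"])
    return cleaned


# Hypothetical record containing a bracketed token.
record = {
    "article_lst": ["sales", "rose"],
    "decoded_lst": ["sales", "<unk>"],
    "abstract_str": "sales went up",
}
offenders = find_angle_brackets(record)
cleaned = strip_angle_brackets(record)
```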