Generation of natural language from structured table data - Table To Text Generation

gortibaldik/TTTGen

Generating text from structured data

This repository contains all the code, as well as the text, of my thesis at Charles University in Prague. Resources that helped me along the way, both in learning TensorFlow peculiarities and in learning about different neural network architectures, are collected in the documentation file.

Abstract

In this thesis we examine ways of conditionally generating document-scale natural language text given structured input data. Specifically, we train Deep Neural Network models on the RotoWire dataset, which contains statistical data about basketball matches paired with descriptive summaries. First, we analyse the dataset and propose several preprocessing methods (e.g. Byte Pair Encoding). Next, we train a baseline model based on the Encoder-Decoder architecture on the preprocessed dataset. We discuss several problems of the baseline and explore advanced Deep Neural Network architectures that aim to solve them (Copy Attention, Content Selection, Content Planning). We hypothesize that our models are not able to learn the structure of the input data, and we propose a method for reducing its complexity. Our best model, trained on the simplified data, outperforms the baseline by more than 5 BLEU points.
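
To give a feel for the Byte Pair Encoding preprocessing mentioned in the abstract, here is a minimal toy sketch of the general technique. It is not the code used in this repository: the function names `learn_bpe` and `apply_bpe` and the `</w>` end-of-word marker are assumptions made for illustration only.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Merge the most frequent pair everywhere it occurs.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(list(symbols), best))] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


merges = learn_bpe(["abab", "abab"], num_merges=1)
print(merges)
print(apply_bpe("abab", merges))
```

Frequent character sequences end up as single tokens while rare words are split into smaller pieces, which keeps the vocabulary compact.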

Example of a text generated by the best scoring method

The Detroit Pistons defeated the Miami Heat , 107 - 84 , at bankers life fieldhouse on tuesday. The Pistons ( 7 - 9 ) came in to tuesday ’s contest with a 28 - point second quarter , and the Pistons ( 7 - 10 ) opened the game on a 62 run from the field ( 28 percent ) from the field and 27 percent from three - point range . Detroit was led by Kentavious Caldwell-Pope , who tallied 22 points on 7 - of - 13 shooting from the field . Andre Drummond and Rodney McGruder each turned in a pair of 11 - point efforts , with the former adding 5 rebounds , 2 assists , 2 steals and a block , and the latter posting 7 boards , 2 assists , 2 steals and 2 blocks. Dion Waiters and Rodney McGruder supplied matching 8 - point efforts , with the former adding 5 rebounds , 2 assists , 2 steals and 2 blocks , and the latter posting 3 rebounds , 2 assists , 2 steals and 2 blocks . Jon Leuer led the second unit with 15 points , 3 rebounds , 3 assists and 1 block . Kentavious Caldwell-Pope led the way for Detroit with 22 points , 5 rebounds , 2 assists and 1 steal . Andre Drummond followed with a 18 - point , 15 - rebound double - double that also featured 4 steals and 4 blocks . Tobias Harris matched Tobias Harris ’ scoring total and added 3 rebounds , an assist and a block . Jon Leuer led the second unit with 11 points , 3 rebounds , 2 assists and 1 block . The Heat head back home to faceoff the Oklahoma City Thunder on monday evening , while the Pistons remain home to face off with the Miami Heat on sunday evening as well .

Structure of repository
