GitHub - Shandilya21/summarization: A stacked LSTM based Network for Text Summarization Using Keras

Extractive Summarization Using Stacked RNN

About The Project

The approaches to text summarization vary depending on the number of input documents (single or multiple), purpose (generic, domain specific, or query-based) and output (extractive or abstractive).

Extractive summarization identifies the important sections of a text and reproduces them verbatim, so the summary is a subset of the original sentences. Abstractive summarization instead interprets and examines the text, then restates the important material in new wording, using advanced natural language techniques to generate a shorter text that conveys the most critical information from the original.
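To make the extractive idea concrete, here is a minimal sketch that scores sentences by average word frequency and returns the best ones verbatim. It is only an illustration of "select a subset of the original sentences"; it is not the stacked-LSTM model in this repo, and the function names and the frequency heuristic are assumptions made for the example.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, num_sentences=2):
    """Return the top-scoring sentences verbatim, in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document act as a crude importance signal.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Use the average so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Rank sentence indices by score, keep the best, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return [sentences[i] for i in keep]
```

Every sentence in the output appears unchanged in the input, which is the defining property of an extractive summary.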

Why do we need this?

A summary is meant to inform your reader—who has not read the text or seen the presentation—of what the text is about. It describes its purpose or main idea, and summarizes the supporting arguments that develop that idea.

Built With

Getting Started

Below are the basic steps to reproduce the results with a few commands.

Language: Python 3.0+

Clone the repository

git clone https://github.com/Shandilya21/extractive_summarization.git

Prerequisites

pip install -r requirement.txt

Before running experiments, set the data path in the config: in config.py, change DATA_PATH to the location of your data.
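As a rough sketch, the relevant part of config.py might look like the following. Only DATA_PATH is named in this README; TRAIN_DIR and GLOVE_DIR are hypothetical helper names added for illustration, so check the actual config.py before relying on them.

```python
# config.py (illustrative sketch; only DATA_PATH is named in this README)
import os

# Root folder that holds the dataset; change this to your own location.
DATA_PATH = os.path.join(os.getcwd(), "data")

# Hypothetical helper constants derived from DATA_PATH.
TRAIN_DIR = os.path.join(DATA_PATH, "train")
GLOVE_DIR = os.path.join(DATA_PATH, "embeddings", "glove")
```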

Dataset: The data can be downloaded from here (Raw Documents) and (Summary). Create the data/train folder and place the data inside it. You may also create a test set from the split (defined in the code) to check the performance of the model.

Pretrained Weights: Download the pretrained weights from here (GloVe) and save the file inside data/embeddings/glove. Otherwise, feel free to save it anywhere.
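GloVe text files store one token per line, followed by its vector components separated by spaces. The sketch below shows one way to read such a file into a dictionary. The function name load_glove is hypothetical and this is not necessarily how the scripts in this repo load the embeddings.

```python
def load_glove(path):
    """Parse a GloVe text file: one token per line, then its vector components."""
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            word, values = parts[0], parts[1:]
            embeddings[word] = [float(v) for v in values]
    return embeddings


# Example call (the file name is an assumption; use whichever GloVe file you downloaded):
# vectors = load_glove("data/embeddings/glove/glove.6B.100d.txt")
```

Plain Python lists keep the example dependency-free; converting the values to an array type is a natural next step before feeding them to an embedding layer.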

Usage

This repo provides 3-word-window and 5-word-window architectures for extractive text summarization. You can also produce results for similar hyperparameters. The default number of epochs is 5; feel free to change it in run.sh.
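The README does not spell out how the word windows are built. One common reading, assumed here, is a fixed-size window centred on each word and padded at the edges. The sketch below illustrates that reading only; the function name word_windows and the padding token are hypothetical, and 3_win_word.py and 5_win_word.py may do this differently.

```python
def word_windows(tokens, size, pad="<pad>"):
    """Return one window of `size` tokens centred on each position (size must be odd)."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    half = size // 2
    # Pad both ends so the first and last words also get a full window.
    padded = [pad] * half + list(tokens) + [pad] * half
    return [padded[i:i + size] for i in range(len(tokens))]
```

For example, word_windows(["the", "cat", "sat"], 3) yields three windows, one per input word, each containing the word and its immediate neighbours.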

chmod +x run.sh
bash run.sh

Results

Model	Ratio	Accuracy (%)
1	0.31	71.24
3	0.37	77.88
5	0.41	80.08

Roadmap

See the open issues for a list of proposed features (and known issues). Also, if you have any issue, feel free to open a new issue.

Contributing

Contributions are what make the project such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b build/newfeature)
Commit your Changes (git commit -m 'Add some newfeature')
Push to the Branch (git push origin build/newfeature)
Open a Pull Request

Contact

Arunav Shandilya - arunavshandilya96@gmail.com

