KNKY > Yandex-Zen-Parser

A Python 3 script for parsing pages from the Yandex.Zen service. The script uses the BeautifulSoup4 library for HTML parsing.

Yandex-Zen-Parser downloads the article images and returns a clean HTML file without any class or style attributes.

Valid tags

Only articles with text and images are parsed. The supported tags are ['p', 'h2', 'h3', 'img']. Video tags are not supported yet.
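The tag filtering described above can be sketched as follows. This is a minimal illustration, not the code from ZenParser.py: the function name `clean_html` and the choice to unwrap unsupported tags (keeping their text) are assumptions.

```python
from bs4 import BeautifulSoup

# Tag list taken from the "Valid tags" section above.
VALID_TAGS = ["p", "h2", "h3", "img"]


def clean_html(raw_html):
    """Hypothetical helper: keep only valid tags and drop class/style."""
    soup = BeautifulSoup(raw_html, "html.parser")
    for tag in soup.find_all(True):  # True matches every tag
        if tag.name in VALID_TAGS:
            # Remove presentation attributes; others (e.g. src) are kept.
            tag.attrs.pop("class", None)
            tag.attrs.pop("style", None)
        else:
            # Replace an unsupported tag with its children.
            tag.unwrap()
    return str(soup)


if __name__ == "__main__":
    sample = '<div class="a"><p class="b" style="c">Hi <span class="x">there</span></p></div>'
    print(clean_html(sample))
```

Note that unwrapping keeps the text of unsupported tags, so a real parser would probably also need to discard elements such as scripts or video containers entirely.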

Libraries required

Before running the script, install BeautifulSoup4 by running the following command in the command line:

pip install beautifulsoup4

Input URLs

Paste your URLs into input.json and put this file into the script's root folder.

input.json structure

{
  "urls": [
    "https://zen.yandex.ru/...",
    "https://zen.yandex.ru/...",
    "https://zen.yandex.ru/..."
  ]
}
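For reference, a file with this structure could be read as shown below. This is only a sketch: the helper name `load_urls` and its validation rules are assumptions and are not taken from ZenParser.py.

```python
import json
from pathlib import Path


def load_urls(path="input.json"):
    """Hypothetical helper: return the list of URLs stored under "urls"."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    urls = data.get("urls", [])
    if not isinstance(urls, list):
        raise ValueError('"urls" must be a JSON array of strings')
    # Skip empty or non-string entries instead of failing on them.
    return [u.strip() for u in urls if isinstance(u, str) and u.strip()]


if __name__ == "__main__":
    for url in load_urls():
        print(url)
```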

Output HTML and images

Each article is stored in its own folder inside ./output/. The article's JPEG images are stored in the same folder as its HTML file.

./output/
--------/article-alias/
----------------------/article-alias-1.jpg
----------------------/article-alias-2.jpg
----------------------/article-alias-3.jpg
----------------------/article-alias-4.jpg
----------------------/article-alias.html
--------/another-article-alias/
...
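The layout above can be expressed with pathlib as in the sketch below. The function name `output_paths` is hypothetical, and the article alias is passed in as an argument because how the script derives it from a URL is not described here.

```python
from pathlib import Path


def output_paths(alias, image_count, root="output"):
    """Hypothetical helper: build the HTML path and image paths for an article."""
    folder = Path(root) / alias
    html_path = folder / f"{alias}.html"
    # Images are numbered from 1, matching the layout shown above.
    image_paths = [folder / f"{alias}-{i}.jpg" for i in range(1, image_count + 1)]
    return html_path, image_paths


if __name__ == "__main__":
    html_path, image_paths = output_paths("article-alias", 4)
    # Create the article folder before writing any files into it.
    html_path.parent.mkdir(parents=True, exist_ok=True)
    print(html_path)
    for path in image_paths:
        print(path)
```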

Start parsing

Run the script from the command line:

python ZenParser.py

This project can be improved

This script is provided as is. If you notice a bug or can offer an improvement, contributions are welcome!

KNKY.RU

Please visit our website for more information: KNKY.RU

