Adds function for returning all sections with given title #57

Merged
merged 8 commits into from
Dec 16, 2022
4 changes: 3 additions & 1 deletion .deepsource.toml
@@ -5,10 +5,12 @@ test_patterns = [
]

exclude_patterns = [

]

[[analyzers]]
name = 'python'
enabled = true
runtime_version = '3.x.x'
max_line_length = 90
type_checker = "mypy"
3 changes: 3 additions & 0 deletions .flake8
@@ -0,0 +1,3 @@
[flake8]
max-line-length = 90
extend-ignore = E203, W503
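
The two ignored codes are the ones commonly relaxed when flake8 runs alongside Black, which this pull request also adds as a pre-commit hook. A small sketch of the layouts involved, assuming that is the motivation here; the variable names are purely illustrative:

```python
values = list(range(10))
lower, upper, offset = 1, 5, 2

# Black puts spaces around ':' in complex slices;
# pycodestyle reports E203 (whitespace before ':') for this layout.
window = values[lower + offset : upper + offset]

# When an expression must be split, Black breaks before binary operators;
# pycodestyle reports W503 (line break before binary operator) for this layout.
total = (
    values[0]
    + values[1]
    + values[2]
)
```

Without the `extend-ignore` entry, flake8 would flag code that Black itself produced, so the two tools would disagree on every run.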
4 changes: 4 additions & 0 deletions .isort.cfg
@@ -0,0 +1,4 @@
[settings]
profile = "google"
multi_line_output = 3
include_trailing_comma = True
4 changes: 4 additions & 0 deletions .mypy.ini
@@ -0,0 +1,4 @@
[mypy]
python_version = 3.10
warn_return_any = True
warn_unused_configs = True
62 changes: 62 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,62 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks

exclude: original

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: check-case-conflict
      - id: check-merge-conflict
      - id: check-docstring-first
      - id: check-executables-have-shebangs
      - id: check-yaml
      - id: trailing-whitespace
      - id: end-of-file-fixer
      # - id: double-quote-string-fixer
      - id: check-yaml
      # - id: check-added-large-files
      - id: requirements-txt-fixer
      - id: name-tests-test
        args: ["--django"]

  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: "v2.7.1"
    hooks:
      - id: prettier

  # - repo: https://github.com/pre-commit/mirrors-autopep8
  #   rev: "v1.5.6"
  #   hooks:
  #     - id: autopep8

  - repo: https://github.com/pycqa/isort
    rev: 5.8.0
    hooks:
      - id: isort
        args: ["--profile", "google", "--filter-files"]

  # - repo: https://github.com/asottile/pyupgrade
  #   rev: v3.2.2
  #   hooks:
  #     - id: pyupgrade
  #       args: [--py36-plus]

  - repo: https://github.com/pycqa/flake8
    rev: 6.0.0
    hooks:
      - id: flake8
        additional_dependencies: [flake8-bugbear]

  - repo: https://github.com/psf/black
    rev: 22.10.0
    hooks:
      - id: black

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.991
    hooks:
      - id: mypy
        additional_dependencies:
          ["types-requests", "types-cachetools", "types-aiofiles"]
4 changes: 3 additions & 1 deletion API.rst
@@ -15,7 +15,8 @@ WikipediaPage
* ``text`` - returns text of the page
* ``sections`` - list of all sections (list of ``WikipediaPageSection``)
* ``langlinks`` - language links to other languages ({lang: ``WikipediaLangLink``})
* ``section_by_title(name)`` - finds section by title (``WikipediaPageSection``)
* ``section_by_title(name)`` - finds last section by title (``WikipediaPageSection``)
* ``sections_by_title(name)`` - finds all sections by title (list of ``WikipediaPageSection``)
* ``links`` - links to other pages ({title: ``WikipediaPage``})
* ``categories`` - all categories ({title: ``WikipediaPage``})
* ``displaytitle``
@@ -45,6 +46,7 @@ WikipediaPageSection
* ``level``
* ``text``
* ``sections``
* ``section_by_title(title)``

ExtractFormat
-------------
7 changes: 7 additions & 0 deletions CHANGES.rst
@@ -1,6 +1,13 @@
Changelog
=========

0.5.5
-----

* Adds support for retrieving all sections with given name - `Issue 39`_

.. _Issue 39: https://github.com/martin-majlis/Wikipedia-API/issues/39

0.5.4
-----

36 changes: 33 additions & 3 deletions README.rst
@@ -143,13 +143,45 @@ To get all top level sections of page, you have to use property ``sections``. It


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page Section By Title
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the last section of a page with a given title, use the function ``section_by_title``.
It returns the last ``WikipediaPageSection`` with this title.

.. code-block:: python

    section_history = page_py.section_by_title('History')
    print("%s - %s" % (section_history.title, section_history.text[0:40]))

    # History - Python was conceived in the late 1980s b
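
A minimal sketch of the "last match wins" behaviour described above. ``Section`` and ``TitleIndex`` are hypothetical stand-ins, not the library's own classes. The sketch assumes that sections are indexed by title in document order and that an unknown title yields ``None``; neither assumption is confirmed by this pull request.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Section:
    """Hypothetical stand-in for a page section."""

    title: str
    text: str


class TitleIndex:
    """Hypothetical index mapping a title to its sections in document order."""

    def __init__(self) -> None:
        self._by_title: Dict[str, List[Section]] = defaultdict(list)

    def add(self, section: Section) -> None:
        # Sections are appended in the order they appear on the page.
        self._by_title[section.title].append(section)

    def section_by_title(self, title: str) -> Optional[Section]:
        # Assumption: the last section stored under this title is returned,
        # and a title that was never added yields None.
        matches = self._by_title.get(title)
        return matches[-1] if matches else None


index = TitleIndex()
index.add(Section("January", "Events"))
index.add(Section("January", "Births"))
index.add(Section("January", "Deaths"))

last = index.section_by_title("January")
missing = index.section_by_title("Smarch")
```

If the real function can also return ``None``, callers should check the result before reading ``.title`` or ``.text``.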

How To Get All Page Sections By Title
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get all sections of a page with a given title, use the function ``sections_by_title``.
It returns all ``WikipediaPageSection`` objects with this title.

.. code-block:: python

    page_1920 = wiki_wiki.page('1920')
    sections_january = page_1920.sections_by_title('January')
    for s in sections_january:
        print("* %s - %s" % (s.title, s.text[0:40]))

    # * January - January 1
    # Polish–Soviet War in 1920: The
    # * January - January 2
    # Isaac Asimov, American author
    # * January - January 1 – Zygmunt Gorazdowski, Polish
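
The same title can appear under several parent sections, which is why a list is returned. The sketch below shows one way to collect such matches from a nested section tree. ``Node`` and ``find_all`` are hypothetical names for illustration, and the library may gather its results differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical section node with nested subsections."""

    title: str
    text: str = ""
    sections: List["Node"] = field(default_factory=list)


def find_all(sections: List[Node], title: str) -> List[Node]:
    """Collect every section titled `title`, depth-first, in document order."""
    found: List[Node] = []
    for section in sections:
        if section.title == title:
            found.append(section)
        # Descend into subsections so nested matches are included.
        found.extend(find_all(section.sections, title))
    return found


year = [
    Node("Events", sections=[Node("January", "event text")]),
    Node("Births", sections=[Node("January", "birth text")]),
    Node("Deaths", sections=[Node("January", "death text")]),
]

january = find_all(year, "January")
texts = [s.text for s in january]
```

In this model, an unknown title yields an empty list, so a ``for`` loop over the result needs no extra guard.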

How To Get Page In Other Languages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -437,5 +469,3 @@ Other Pages
.. |libraries-io-dependent-repos| image:: https://img.shields.io/librariesio/dependent-repos/pypi/Wikipedia-API.svg
:target: https://libraries.io/pypi/Wikipedia-API
:alt: Libraries.io - Dependent Repos


20 changes: 20 additions & 0 deletions tests/extract_html_format_test.py
@@ -75,6 +75,26 @@ def test_subsubsection(self):
        )
        self.assertEqual(len(section.sections), 0)

    def test_subsection_by_title_return_last(self):
        page = self.wiki.page('Test_Nested')
        section = page.section_by_title('Subsection B')
        self.assertEqual(section.title, 'Subsection B')
        self.assertEqual(section.text, '<p><b>Text for section 3.B</b>\n\n\n</p>')
        self.assertEqual(len(section.sections), 0)

    def test_subsections_by_title(self):
        page = self.wiki.page('Test_Nested')
        sections = page.sections_by_title('Subsection B')
        self.assertEqual(len(sections), 3)
        self.assertEqual(
            [s.text for s in sections],
            [
                '<p><b>Text for section 1.B</b>\n\n\n</p>',
                '<p><b>Text for section 2.B</b>\n\n\n</p>',
                '<p><b>Text for section 3.B</b>\n\n\n</p>'
            ]
        )

    def test_text(self):
        page = self.wiki.page('Test_1')
        self.maxDiff = None
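
The expected count of three ``Subsection B`` matches follows from the headings in the ``Test_Nested`` fixture added by this pull request. The sketch below checks that independently of the library by scanning an abbreviated copy of the fixture with the standard ``re`` module. The regular expressions are an illustrative assumption and do not show how the library parses extracts.

```python
import re
from collections import defaultdict

# Abbreviated copy of the Test_Nested extract; only the headings matter here.
extract = (
    "<h2>Section 1</h2>\n"
    '<h3><span id="s1.1">Subsection A</span></h3>\n'
    "<h3>Subsection B</h3>\n"
    '<h2><span id="s2">Section 2</span></h2>\n'
    '<h3><span id="s2.1">Subsection A</span></h3>\n'
    "<h3>Subsection B</h3>\n"
    "<h2><span id='s3'>Section 3</span></h2>\n"
    '<h3><span id="s3.1">Subsection A</span></h3>\n'
    "<h3>Subsection B</h3>\n"
)

# Match <hN>...</hN>, then strip any inner tags to get the plain title.
heading = re.compile(r"<h([1-6])>(.*?)</h\1>")
tag = re.compile(r"<[^>]+>")

# Map each plain title to the heading levels at which it occurs.
levels_by_title = defaultdict(list)
for level, inner in heading.findall(extract):
    levels_by_title[tag.sub("", inner)].append(int(level))

count_b = len(levels_by_title["Subsection B"])
```

Each of the three top-level sections contributes one ``Subsection B``. That is why ``sections_by_title`` is expected to return three items, and ``section_by_title`` the one under ``Section 3``.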
44 changes: 44 additions & 0 deletions tests/mock_data.py
@@ -134,6 +134,50 @@ def wikipedia_api_request(page, params):
}
}
},
'en:action=query&prop=extracts&titles=Test_Nested&': {
"batchcomplete": "",
"warnings": {
"extracts": {
"*": "\"exlimit\" was too large for a whole article extracts request, lowered to 1."
}
},
"query": {
"normalized": [
{
"from": "Test_Nested",
"to": "Test Nested"
}
],
"pages": {
"4": {
"pageid": 14,
"ns": 0,
"title": "Test Nested",
"extract": (
"<p><b>Summary</b> text\n\n</p>\n" +
"<h2>Section 1</h2>\n" +
"<p>Text for section 1</p>\n\n\n" +
"<h3><span id=\"s1.1\">Subsection A</span></h3>\n" +
"<p><b>Text for section 1.A</b>\n\n\n</p>" +
"<h3>Subsection B</h3>\n" +
"<p><b>Text for section 1.B</b>\n\n\n</p>" +
"<h2><span id=\"s2\">Section 2</span></h2>\n" +
"<p><b>Text for section 2</b>\n\n\n</p>" +
"<h3><span id=\"s2.1\">Subsection A</span></h3>\n" +
"<p><b>Text for section 2.A</b>\n\n\n</p>" +
"<h3>Subsection B</h3>\n" +
"<p><b>Text for section 2.B</b>\n\n\n</p>" +
"<h2><span id=\'s3\'>Section 3</span></h2>\n" +
"<p><b>Text for section 3</b>\n\n\n</p>" +
"<h3><span id=\"s3.1\">Subsection A</span></h3>\n" +
"<p><b>Text for section 3.A</b>\n\n\n</p>" +
"<h3>Subsection B</h3>\n" +
"<p><b>Text for section 3.B</b>\n\n\n</p>"
)
}
}
}
},
'en:action=query&prop=extracts&titles=Test_Edit&': {
"batchcomplete": "",
"warnings": {