archivebox.parsers.generic_html

Nick Sweeting edited this page Nov 13, 2024 · 2 revisions

{py:mod}archivebox.parsers.generic_html

:allowtitles:

Module Contents

Classes

:class: autosummary longtable
:align: left

* - {py:obj}`HrefParser <archivebox.parsers.generic_html.HrefParser>`
  -

Functions

:class: autosummary longtable
:align: left

* - {py:obj}`parse_generic_html_export <archivebox.parsers.generic_html.parse_generic_html_export>`
  - ```{autodoc2-docstring} archivebox.parsers.generic_html.parse_generic_html_export
    :summary:
    ```
* - {py:obj}`did_urljoin_misbehave <archivebox.parsers.generic_html.did_urljoin_misbehave>`
  - ```{autodoc2-docstring} archivebox.parsers.generic_html.did_urljoin_misbehave
    :summary:
    ```
* - {py:obj}`fix_urljoin_bug <archivebox.parsers.generic_html.fix_urljoin_bug>`
  - ```{autodoc2-docstring} archivebox.parsers.generic_html.fix_urljoin_bug
    :summary:
    ```

Data

:class: autosummary longtable
:align: left

* - {py:obj}`KEY <archivebox.parsers.generic_html.KEY>`
  - ```{autodoc2-docstring} archivebox.parsers.generic_html.KEY
    :summary:
    ```
* - {py:obj}`NAME <archivebox.parsers.generic_html.NAME>`
  - ```{autodoc2-docstring} archivebox.parsers.generic_html.NAME
    :summary:
    ```
* - {py:obj}`PARSER <archivebox.parsers.generic_html.PARSER>`
  - ```{autodoc2-docstring} archivebox.parsers.generic_html.PARSER
    :summary:
    ```

API

:canonical: archivebox.parsers.generic_html.HrefParser

Bases: {py:obj}`html.parser.HTMLParser`

````{py:method} handle_starttag(tag, attrs)
:canonical: archivebox.parsers.generic_html.HrefParser.handle_starttag

```{autodoc2-docstring} archivebox.parsers.generic_html.HrefParser.handle_starttag
```

````
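
The page gives no docstring for `handle_starttag`, so the following is only a minimal sketch of how an `html.parser.HTMLParser` subclass can collect link targets through that hook. The class name `HrefCollector`, its `urls` attribute, and the helper `collect_hrefs` are hypothetical names chosen for illustration. They are not taken from the ArchiveBox source, and the real `HrefParser` may behave differently.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical stand-in for HrefParser: gathers href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []  # assumed attribute name, not confirmed by the docs

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this hook.
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)


def collect_hrefs(html):
    """Feed an HTML string to the collector and return the hrefs in document order."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.urls
```

Calling `collect_hrefs` on an HTML string returns the raw `href` values exactly as written in the markup, so relative links would still need to be resolved against a base URL in a separate step.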

:canonical: archivebox.parsers.generic_html.parse_generic_html_export

```{autodoc2-docstring} archivebox.parsers.generic_html.parse_generic_html_export
```
:canonical: archivebox.parsers.generic_html.KEY
:value: >
   'html'

```{autodoc2-docstring} archivebox.parsers.generic_html.KEY
```

:canonical: archivebox.parsers.generic_html.NAME
:value: >
   'Generic HTML'

```{autodoc2-docstring} archivebox.parsers.generic_html.NAME
```

:canonical: archivebox.parsers.generic_html.PARSER
:value: >
   None

```{autodoc2-docstring} archivebox.parsers.generic_html.PARSER
```

:canonical: archivebox.parsers.generic_html.did_urljoin_misbehave

```{autodoc2-docstring} archivebox.parsers.generic_html.did_urljoin_misbehave
```
:canonical: archivebox.parsers.generic_html.fix_urljoin_bug

```{autodoc2-docstring} archivebox.parsers.generic_html.fix_urljoin_bug
```