add new plugin xdxf_css (XdxfCss) based on PR #570 by @soshial
- use CSS for all XDXF elements (decreases the size of the output dict file)
- create a list (ol / ul) for nested `<def>`s
- support loading of `<abbreviations>` data
- show JS tooltips for abbreviations
- fix showing of hidden `<gr>`
- fix loss of spaces before XDXF tags
- add braces to `<co>`
- use CSS instead of inlined HTML tags for: `<ex>`, `<pos>`, `<abbr>`, `<k>`, `<gr>`, `<mrkd>`
ilius committed Sep 2, 2024
1 parent 7f64af5 commit aa6765b
Showing 8 changed files with 990 additions and 10 deletions.
1 change: 1 addition & 0 deletions doc/p/__index__.md
@@ -44,6 +44,7 @@
| WordNet | Wordnet | [wordnet.md](./wordnet.md) |
| Wordset.org JSON directory | Wordset | [wordset.md](./wordset.md) |
| XDXF (.xdxf) | Xdxf | [xdxf.md](./xdxf.md) |
| XDXF with CSS and JS | XdxfCss | [xdxf_css.md](./xdxf_css.md) |
| XDXF Lax (.xdxf) | XdxfLax | [xdxf_lax.md](./xdxf_lax.md) |
| Yomichan (.zip) | Yomichan | [yomichan.md](./yomichan.md) |
| Zim (.zim, for Kiwix) | Zim | [zim.md](./zim.md) |
28 changes: 28 additions & 0 deletions doc/p/xdxf_css.md
@@ -0,0 +1,28 @@
## XDXF with CSS and JS

### General Information

| Attribute | Value |
| --------------- | -------------------------------------------------------------------------------------------------------------- |
| Name | XdxfCss |
| snake_case_name | xdxf_css |
| Description | XDXF with CSS and JS |
| Extensions | |
| Read support | Yes |
| Write support | No |
| Single-file | Yes |
| Kind | 📝 text |
| Sort-on-write | default_no |
| Sort key | (`headword_lower`) |
| Wiki | [XDXF](https://en.wikipedia.org/wiki/XDXF) |
| Website | [XDXF standard - @soshial/xdxf_makedict](https://github.com/soshial/xdxf_makedict/tree/master/format_standard) |

### Dependencies for reading

PyPI Links: [lxml](https://pypi.org/project/lxml)

To install, run:

```sh
pip3 install lxml
```
20 changes: 20 additions & 0 deletions plugins-meta/index.json
@@ -1823,6 +1823,26 @@
"lzma"
]
},
{
"module": "xdxf_css",
"lname": "xdxf_css",
"name": "XdxfCss",
"description": "XDXF with CSS and JS",
"extensions": [],
"singleFile": true,
"optionsProp": {},
"canRead": true,
"canWrite": false,
"readOptions": {},
"readDepends": {
"lxml": "lxml"
},
"readCompressions": [
"gz",
"bz2",
"lzma"
]
},
{
"module": "xdxf_lax",
"lname": "xdxf_lax",
13 changes: 3 additions & 10 deletions pyglossary/plugins/xdxf/__init__.py
@@ -31,6 +31,8 @@

from pyglossary.lxml_types import Element

+from lxml import etree as ET

from pyglossary.compression import (
compressionOpen,
stdCompressions,
@@ -79,7 +81,6 @@
),
}


"""
new format
<xdxf ...>
@@ -110,9 +111,7 @@
</xdxf>
"""


if TYPE_CHECKING:

class TransformerType(typing.Protocol):
def transform(self, article: "Element") -> str: ...

@@ -154,7 +153,6 @@ def makeTransformer(self) -> None:

def open(self, filename: str) -> None: # noqa: PLR0912
# <!DOCTYPE xdxf SYSTEM "http://xdxf.sourceforge.net/xdxf_lousy.dtd">
-from lxml import etree as ET

self._filename = filename
if self._html:
@@ -220,9 +218,6 @@ def __len__(self) -> int:
return 0

def __iter__(self) -> "Iterator[EntryType]":
-from lxml import etree as ET
-from lxml.etree import tostring

context = ET.iterparse( # type: ignore
self._file,
events=("end",),
@@ -238,7 +233,7 @@ def __iter__(self) -> "Iterator[EntryType]":
if len(words) == 1:
defi = self._re_span_k.sub("", defi)
else:
-b_defi = cast(bytes, tostring(article, encoding=self._encoding))
+b_defi = cast(bytes, ET.tostring(article, encoding=self._encoding))
defi = b_defi[4:-5].decode(self._encoding).strip()
defiFormat = "x"

@@ -265,8 +260,6 @@ def close(self) -> None:
def tostring(
elem: "Element",
) -> str:
-from lxml import etree as ET

return (
ET.tostring(
elem,
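Note on the `__iter__` hunk in `pyglossary/plugins/xdxf/__init__.py`: each article is serialized with `ET.tostring` and the enclosing tag is cut off with `b_defi[4:-5]`. The sketch below illustrates that pattern only and is not the plugin's code. As assumptions, it uses the standard library's `xml.etree.ElementTree` as a stand-in for `lxml.etree`, the sample XDXF data is made up, and `iter_articles` is a hypothetical helper name. The slice is only correct when the article tag is exactly `<ar>` with no attributes.

```python
import io
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Hypothetical two-article XDXF fragment, made up for this sketch.
SAMPLE = b"""<xdxf>
<ar><k>home</k>a place to live</ar>
<ar><k>cat</k><k>kitty</k>a small pet</ar>
</xdxf>"""


def iter_articles(stream, encoding="utf-8"):
    """Yield (headwords, inner_markup) for each <ar> element in the stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "ar":
            continue
        # Headwords are the direct <k> children of the article.
        words = [k.text for k in elem.findall("k")]
        # Keep whitespace that follows </ar> out of the serialized bytes.
        elem.tail = None
        b_defi = ET.tostring(elem, encoding=encoding)
        # Drop the 4-byte "<ar>" prefix and the 5-byte "</ar>" suffix,
        # mirroring the b_defi[4:-5] slice in the diff.
        defi = b_defi[4:-5].decode(encoding).strip()
        yield words, defi
        # Free the subtree once the article has been consumed.
        elem.clear()


articles = list(iter_articles(io.BytesIO(SAMPLE)))
```

Importing `ET` once at module level, as this commit does, lets `open`, `__iter__` and the module-level `tostring` helper share one name. The trade-off is that `lxml` must be importable as soon as the plugin module is loaded, not only when a file is actually read.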