feat: import books from Wikisource #9674

Merged
161 commits
19d9874
first draft
pidgezero-one Aug 1, 2024
f23baad
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 1, 2024
34667e7
linting
pidgezero-one Aug 1, 2024
17575bf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 1, 2024
dca89fb
use a class for imports
pidgezero-one Aug 1, 2024
147784a
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 1, 2024
df423ca
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 1, 2024
604a5d8
mypy fixes
pidgezero-one Aug 1, 2024
34fe390
merge
pidgezero-one Aug 1, 2024
a307eca
more linting
pidgezero-one Aug 1, 2024
1fe4e84
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 1, 2024
d7f4065
is this deprecated too?
pidgezero-one Aug 1, 2024
bcdcdb1
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 1, 2024
c504f6c
is this deprecated too?
pidgezero-one Aug 1, 2024
a996e64
is this deprecated too?
pidgezero-one Aug 1, 2024
a3c299e
improved data model
pidgezero-one Aug 2, 2024
0f2b113
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 2, 2024
e18ba52
reformat name formatter
pidgezero-one Aug 2, 2024
4bdf428
ruff fix
pidgezero-one Aug 2, 2024
57388b4
improve infobox fetching
pidgezero-one Aug 2, 2024
31b2a7f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 2, 2024
9492403
uncomment
pidgezero-one Aug 2, 2024
dacef92
merge
pidgezero-one Aug 2, 2024
1186696
remove unnecessary print
pidgezero-one Aug 2, 2024
dd82901
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 2, 2024
47b57c8
uncomment imports
pidgezero-one Aug 2, 2024
3c82121
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 2, 2024
5e58373
better template check:
pidgezero-one Aug 2, 2024
2c268b2
publishers?
pidgezero-one Aug 2, 2024
be0d1a8
fix array
pidgezero-one Aug 2, 2024
d52c109
unused import
pidgezero-one Aug 2, 2024
df683a1
different wiki markup strip
pidgezero-one Aug 2, 2024
8e7cb38
reduce image calls
pidgezero-one Aug 2, 2024
66744ef
unstash
pidgezero-one Aug 2, 2024
5c51bcc
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 2, 2024
b5e4319
fix newlines
pidgezero-one Aug 2, 2024
34da18d
undo comments
pidgezero-one Aug 2, 2024
76a1724
logger name
pidgezero-one Aug 2, 2024
2ce0e17
fix array typing
pidgezero-one Aug 2, 2024
3645e9e
more cleanup
pidgezero-one Aug 2, 2024
0e9650b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 2, 2024
9663789
dry run outputs to a jsonl file in a gitignored folder
pidgezero-one Aug 6, 2024
8f0810d
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 6, 2024
2a3e617
add this directory
pidgezero-one Aug 6, 2024
0268018
.
pidgezero-one Aug 6, 2024
99f3c93
Merge branch 'master' into 9671/feat/add-wikisource-import-script
pidgezero-one Aug 6, 2024
5390b54
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 6, 2024
b06b60d
.
pidgezero-one Aug 6, 2024
d7561f8
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 6, 2024
3e10c5d
unicode
pidgezero-one Aug 6, 2024
4bd8e54
remove dry run flag
pidgezero-one Aug 6, 2024
e445276
this produces around 500 records
pidgezero-one Aug 13, 2024
d633c62
wikisource API gives better image results. this script now gets most …
pidgezero-one Aug 13, 2024
4053014
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 13, 2024
e1bd982
undo comments
pidgezero-one Aug 13, 2024
62d1798
clearer comments
pidgezero-one Aug 13, 2024
25d9243
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 13, 2024
bbe242f
formatting
pidgezero-one Aug 13, 2024
15aa2b2
formatting
pidgezero-one Aug 13, 2024
cae28f9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 13, 2024
cb2a959
condense
pidgezero-one Aug 13, 2024
5cb9dc9
more cleanup
pidgezero-one Aug 13, 2024
5370e41
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 13, 2024
b96a07d
more cleanup
pidgezero-one Aug 13, 2024
4d4d091
precommit
pidgezero-one Aug 13, 2024
da00d4e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 13, 2024
d116d6e
aaaaa
pidgezero-one Aug 13, 2024
8d83830
more false positives, letter filter literally does not work for reaso…
pidgezero-one Aug 14, 2024
f7e61c0
this is annoying
pidgezero-one Aug 14, 2024
3018a5f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 14, 2024
878051c
ruff
pidgezero-one Aug 14, 2024
9135ff5
ruff
pidgezero-one Aug 14, 2024
113e6a7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 14, 2024
42c05d0
cmt
pidgezero-one Aug 14, 2024
e9fef40
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 14, 2024
5193267
filters
pidgezero-one Aug 15, 2024
2075f4e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 15, 2024
dbac756
fix
pidgezero-one Aug 16, 2024
916c3ae
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
d1683a5
ruff
pidgezero-one Aug 16, 2024
287bfe0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
4ad37de
aint no way precommit thinks 'pleas' is a typo
pidgezero-one Aug 16, 2024
e8fb019
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 16, 2024
d8452d8
comment clarity
pidgezero-one Aug 16, 2024
0d15c54
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
db5c4a9
fix publishers
pidgezero-one Aug 16, 2024
3c46c9a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
31d84a7
fix WS-side category filtering
pidgezero-one Aug 16, 2024
3dbe7b9
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Aug 16, 2024
d6d303e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
aa73b6b
ruff
pidgezero-one Aug 16, 2024
a8fdcfb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
916dabc
clean up some re-request loops
pidgezero-one Aug 16, 2024
bb3c30c
clean up some re-request loops
pidgezero-one Aug 16, 2024
e28c8cd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 16, 2024
f6f6c99
addresses most PR comments
pidgezero-one Sep 29, 2024
beaf68c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 29, 2024
e6fe169
precommit
pidgezero-one Sep 29, 2024
408da50
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Sep 29, 2024
e7f714b
fetches more author info, not sure how to format it yet
pidgezero-one Sep 29, 2024
8f01b5b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 29, 2024
77b23d8
brackets in wrong placE
pidgezero-one Sep 29, 2024
2aca912
Merge branch 'master' into 9671/feat/add-wikisource-import-script
pidgezero-one Oct 12, 2024
7bfc39b
format that works with /import/api
pidgezero-one Oct 13, 2024
461b02a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
147fab3
wip
pidgezero-one Oct 13, 2024
b73b8d1
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 13, 2024
92065ed
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
0750cb2
?
pidgezero-one Oct 13, 2024
5842e13
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 13, 2024
83a00f8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
bc31426
precommit errors
pidgezero-one Oct 13, 2024
2e99560
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 13, 2024
cb14b14
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
8d4dac0
support author id matching
pidgezero-one Oct 13, 2024
7d85a0f
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 13, 2024
f1b0edd
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
5b57217
can't get it to work
pidgezero-one Oct 13, 2024
0bd52e8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
05f3644
unnecessary changes
pidgezero-one Oct 13, 2024
12b96f4
idk
pidgezero-one Oct 13, 2024
6a8234c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 13, 2024
b9c36e4
it works
pidgezero-one Oct 14, 2024
2fd4569
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 14, 2024
5a8ea7a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 14, 2024
268f055
wip: support more identifiers
pidgezero-one Oct 14, 2024
60e5d00
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 14, 2024
b6b68f3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 14, 2024
e02ae78
fix remote ids
pidgezero-one Oct 14, 2024
d7dd818
fix remote ids
pidgezero-one Oct 14, 2024
ee00b9a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 14, 2024
0ecba61
comment
pidgezero-one Oct 14, 2024
f952c4f
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 14, 2024
0e2c510
fix wd condition
pidgezero-one Oct 14, 2024
1265681
remote_ids will never be empty in script
pidgezero-one Oct 14, 2024
216cb50
attempt unit tests
pidgezero-one Oct 14, 2024
eb8d3d4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 14, 2024
d298d39
irrelevant
pidgezero-one Oct 14, 2024
8c17a14
update comment
pidgezero-one Oct 14, 2024
8917ae8
unnecessary change
pidgezero-one Oct 14, 2024
a174609
Update openlibrary/components/AuthorIdentifiers.vue
pidgezero-one Oct 15, 2024
ad263aa
Update openlibrary/components/AuthorIdentifiers.vue
pidgezero-one Oct 15, 2024
ab624f6
Update openlibrary/components/AuthorIdentifiers.vue
pidgezero-one Oct 15, 2024
320be8b
suggested rename
pidgezero-one Oct 16, 2024
f07e5fb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 16, 2024
3bfe2d9
why did changing remote_ids to identifiers break tests?
pidgezero-one Oct 16, 2024
a0aaf40
Merge branch '9671/feat/add-wikisource-import-script' of https://gith…
pidgezero-one Oct 16, 2024
d97921d
fix import key
pidgezero-one Oct 16, 2024
565853b
identifiers
pidgezero-one Oct 16, 2024
d84c5fd
identifiers
pidgezero-one Oct 16, 2024
f0a1103
move this code
pidgezero-one Nov 27, 2024
8d3fe1f
/
pidgezero-one Nov 27, 2024
eea9276
unused imports
pidgezero-one Nov 27, 2024
aca95e0
this got moved out by accident
pidgezero-one Nov 28, 2024
d3b4767
wtf?
pidgezero-one Nov 28, 2024
2545c12
add another exclusion
pidgezero-one Dec 3, 2024
eb3cb40
Merge branch 'master' into 9671/feat/add-wikisource-import-script
pidgezero-one Dec 3, 2024
88648e0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 3, 2024
e04b80c
ruff fix
pidgezero-one Dec 3, 2024
c55261b
just get birth and death dates for authors to use in import matching
pidgezero-one Dec 3, 2024
abb913c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 3, 2024
3 changes: 3 additions & 0 deletions requirements.txt
@@ -17,6 +17,8 @@ isbnlib==3.10.14
 luqum==0.11.0
 lxml==4.9.4
 multipart==0.2.4
+mwparserfromhell==0.6.6
+nameparser==1.1.3
 Pillow==10.4.0
 psycopg2==2.9.6
 pydantic==2.4.0
@@ -30,3 +32,4 @@ sentry-sdk==2.8.0
 simplejson==3.19.1
 statsd==4.0.1
 validate_email==1.3
+wikitextparser==0.56.1