However, the context is very different: Quarry has already mapped VARBINARY(M) and other types to UTF-8, whereas Superset hasn't; binary fields aren't handled as well in that Quarry export as they are in Superset; etc.
Hi @fredster33 - sorry this slipped under the collective radar when you posted it. We're trying to get better about these things.
Are you still interested in contributing to this? I think we would welcome a SIP or a PR to discuss, if so. It does indeed seem useful... we just need to make sure that as we add more and more export formats, they're kept discrete and maintainable - maybe even pluggable!
In any case, normally, I'd close this as stale, and because it's not a bug, but meanwhile I'll move it to a GitHub Discussion as an Ideas thread in case you (or anyone) want to pick it up. Let us know!
Discussed in #24455
Originally posted by stuartyeates June 20, 2023
"Download to wikitable" parallel to "Download to CSV" in sqllab
Motivation
The Wikimedia Foundation (WMF) currently runs an in-house SQL explorer called Quarry (https://quarry.wmcloud.org/). The plan is to move to Superset; see https://phabricator.wikimedia.org/T169452. A very heavily used feature of Quarry is export as a wikitable (i.e. markup that can be included in a Wikimedia wiki page). See, for example, https://quarry.wmcloud.org/query/74483
Since there are many tens of thousands of Wikimedia wiki installs around the world, such a feature would likely be of use to others as well as ourselves.
Note: I'm a WMF volunteer, not a WMF staffer. I do not / cannot speak for the WMF.
Proposed Change
Clone everything under /api/v1/sqllab/export/ to /api/v1/sqllab/exportwiki/ (or a similar name) and change the output. Keeping the two separate should limit the impact on the existing code. The core of the CSV export is https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html but there is no direct equivalent for a wikitable, so I'll have to write my own, based partly on the logic at https://github.com/pandas-dev/pandas/blob/main/pandas/io/formats/csvs.py
The documentation on the output format is at https://www.mediawiki.org/wiki/Help:Tables
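To make the target format concrete, here is a minimal sketch of what such a serializer might look like, using only the basic syntax from Help:Tables. The names `escape_cell` and `to_wikitable` are hypothetical and are not part of Superset or pandas. The escaping is an assumption on my part: it only neutralises literal pipes and embedded newlines, and does not try to handle other wikitext (templates, links, HTML) that may appear in cell values.

```python
from typing import Any, Iterable, Sequence


def escape_cell(value: Any) -> str:
    """Render one value as text that should stay inside its table cell."""
    if value is None:
        return ""
    text = str(value)
    # A literal pipe would otherwise be read as a cell or attribute separator.
    text = text.replace("|", "&#124;")
    # Cells are emitted one per line, so embedded newlines become explicit breaks.
    return text.replace("\r\n", "\n").replace("\n", "<br />")


def to_wikitable(columns: Sequence[str], rows: Iterable[Sequence[Any]]) -> str:
    """Serialize tabular data using the basic Help:Tables syntax (hypothetical helper)."""
    lines = ['{| class="wikitable"', "|-"]
    # One header cell per line avoids having to escape the "!!" separator.
    lines.extend(f"! {escape_cell(name)}" for name in columns)
    for row in rows:
        lines.append("|-")
        # The space after the pipe keeps values such as "-5" from being read as "|-".
        lines.extend(f"| {escape_cell(value)}" for value in row)
    lines.append("|}")
    return "\n".join(lines) + "\n"
```

With a pandas DataFrame `df`, this could be called as `to_wikitable(list(df.columns), df.itertuples(index=False))`. Emitting one cell per line is a deliberate choice here: it is more verbose than the `||` form but leaves fewer separators to escape.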
https://quarry.wmcloud.org/ (which has dozens of users working in dozens of natural languages) would be the test server for this. I'll have my own DEV install.
I believe I understand how to do most of this work. I've already had one pull request merged to master (see #24440).
New or Changed Public Interfaces
Addition of /api/v1/sqllab/exportwiki/
Migration Plan and Compatibility
Migration: none.
Compatibility: The plan is to use the basic definition at https://www.mediawiki.org/wiki/Help:Tables rather than the richer definition at https://en.wikipedia.org/wiki/Help:Table in order to keep the output as portable as possible, both across wikis and over time.
It is likely that binary fields will be dropped (or values swapped with placeholders), because there is no reliable way to represent them in wikitables. UTF-8 will have to be un-escaped (I believe); there are plenty of WMF users querying non-Roman-script wikis to test with.
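One possible policy for the binary and UTF-8 handling described above, as a rough sketch: treat byte values as UTF-8 text when they decode cleanly, and swap in a placeholder otherwise. The function name `normalize_value` and the placeholder string are hypothetical, and this is only one reading of "un-escaped". Note that arbitrary binary data can occasionally happen to be valid UTF-8, so a real implementation would probably also want to consult the column type.

```python
from typing import Any

BINARY_PLACEHOLDER = "(binary)"  # hypothetical placeholder text


def normalize_value(value: Any) -> Any:
    """Decode byte values as UTF-8 where possible, else return a placeholder."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        raw = bytes(value)
        try:
            # Text stored in binary columns is assumed to be UTF-8 encoded.
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: no reliable wikitable representation.
            return BINARY_PLACEHOLDER
    # Non-binary values pass through unchanged.
    return value
```

Each value in a result row could be passed through this before it is written out as a table cell.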