Commit

Merge branch 'master' into qt/add_policy
qtomlinson authored Sep 27, 2024
2 parents 0419fd3 + a394fde commit 20d440b
Showing 14 changed files with 738 additions and 68 deletions.
37 changes: 37 additions & 0 deletions .github/workflows/build-and-deploy-dev.yml
@@ -0,0 +1,37 @@
# This workflow will build a docker image, push it to ghcr.io, and deploy it to an Azure WebApp.
name: Build and Deploy -- DEV

on:
  workflow_dispatch:
  # TODO: once tested run on master push
  # push:
  #   branches: [master]

jobs:
  upload-package-lock-json:
    name: Upload package-lock.json from this repo
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4.1.1

      - name: Upload package-lock.json
        uses: actions/upload-artifact@v4
        with:
          name: package-lock.json
          path: package-lock.json

  build-and-deploy:
    name: Build and Deploy
    needs: upload-package-lock-json
    uses: clearlydefined/operations/.github/workflows/app-build-and-deploy.yml@v2.0.0
    secrets:
      AZURE_CREDENTIALS: ${{ secrets.AZURE_CREDENTIALS }}
      AZURE_WEBAPP_PUBLISH_PROFILE: ${{ secrets.AZURE_WEBAPP_PUBLISH_PROFILE_DEV }}
      DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}
      PRODUCTION_DEPLOYERS: ${{ secrets.PRODUCTION_DEPLOYERS }}
    with:
      deploy-env: dev
      application-type: worker
      azure-app-base-name: cdcrawler
      azure-app-name-postfix: -dev
18 changes: 12 additions & 6 deletions providers/fetch/pypiFetch.js
@@ -79,18 +79,24 @@ class PyPiFetch extends AbstractFetch {
     for (const classifier in classifiers) {
       if (classifiers[classifier].includes('License :: OSI Approved ::')) {
         const lastColon = classifiers[classifier].lastIndexOf(':')
-        const rawLicense = classifiers[classifier].slice(lastColon + 1)
-        return spdxCorrect(rawLicense)
+        return classifiers[classifier].slice(lastColon + 1)
       }
     }
     return null
   }

   _extractDeclaredLicense(registryData) {
-    const licenseFromClassifiers = this._extractLicenseFromClassifiers(registryData)
-    if (licenseFromClassifiers) return licenseFromClassifiers
-    const license = get(registryData, 'info.license')
-    return license && spdxCorrect(license)
+    const licenseInMetadata = get(registryData, 'info.license')
+    const hasVersionInMeta = /\d+/.test(licenseInMetadata)
+    const licenseInClassifiers = this._extractLicenseFromClassifiers(registryData)
+    const hasVersionInClassifier = /\d+/.test(licenseInClassifiers)
+
+    let licenses = [licenseInMetadata, licenseInClassifiers]
+    if (hasVersionInClassifier && !hasVersionInMeta) licenses = [licenseInClassifiers, licenseInMetadata]
+    for (const rawLicense of licenses) {
+      const parsed = rawLicense && spdxCorrect(rawLicense)
+      if (parsed) return parsed
+    }
   }

   async _getPackage(spec, registryData, destination) {
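
Note on the hunk above: the new `_extractDeclaredLicense` appears to prefer `info.license` by default and to try the classifier first only when the classifier text contains a digit (a version hint) and the metadata text does not. The following is a minimal standalone sketch of that ordering rule, not the crawler's actual code. The function names `hasVersion`, `licenseFromClassifiers` and `orderLicenseCandidates` are hypothetical, the `trim()` call is an addition of this sketch, and the `spdxCorrect` step is left out so the example runs without dependencies.

```javascript
// Hypothetical helper: true when the text contains at least one digit.
const hasVersion = value => /\d+/.test(value || '')

// Text after the last colon of the first OSI-approved classifier, or null.
// (trim() is an assumption of this sketch; the diff returns the raw slice.)
function licenseFromClassifiers(classifiers = []) {
  for (const classifier of classifiers) {
    if (classifier.includes('License :: OSI Approved ::')) {
      return classifier.slice(classifier.lastIndexOf(':') + 1).trim()
    }
  }
  return null
}

// Metadata first, unless only the classifier carries a version number.
function orderLicenseCandidates(licenseInMetadata, licenseInClassifiers) {
  if (hasVersion(licenseInClassifiers) && !hasVersion(licenseInMetadata)) {
    return [licenseInClassifiers, licenseInMetadata]
  }
  return [licenseInMetadata, licenseInClassifiers]
}

// Values taken from the two fixtures added in this commit.
console.log(orderLicenseCandidates(
  'BSD-3-Clause',
  licenseFromClassifiers(['License :: OSI Approved :: BSD License'])
)) // metadata stays first: it has a version, the classifier does not

console.log(orderLicenseCandidates(
  'LGPL',
  licenseFromClassifiers([
    'License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)'
  ])
)) // classifier moves first: it has a version, the metadata does not
```

In the real method each candidate is then passed through `spdxCorrect` in that order, and the first one that parses is returned.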
2 changes: 1 addition & 1 deletion providers/process/pypiExtract.js
@@ -13,7 +13,7 @@ class PyPiExtract extends AbstractClearlyDefinedProcessor {
   }

   get toolVersion() {
-    return '1.1.1'
+    return '1.2.1'
   }

   canHandle(request) {
43 changes: 43 additions & 0 deletions test/fixtures/pypi/registryData-info_bsd3.json
@@ -0,0 +1,43 @@
{
"info": {
"author": "Joel Nothman",
"author_email": "joel.nothman@gmail.com",
"bugtrack_url": null,
"classifiers": [
"Intended Audience :: Science/Research",
"License :: OSI Approved :: BSD License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.6",
"Topic :: Scientific/Engineering :: Visualization"
],
"description": "UpSetPlot documentation\n============================\n\n|version| |licence| |py-versions|\n\n|issues| |build| |docs| |coverage|\n\nThis is another Python implementation of UpSet plots by Lex et al. [Lex2014]_.\nUpSet plots are used to visualise set overlaps; like Venn diagrams but\nmore readable. Documentation is at https://upsetplot.readthedocs.io.\n\nThis ``upsetplot`` library tries to provide a simple interface backed by an\nextensible, object-oriented design.\n\nThere are many ways to represent the categorisation of data, as covered in\nour `Data Format Guide <https://upsetplot.readthedocs.io/en/stable/formats.html>`_.\n\nOur internal input format uses a `pandas.Series` containing counts\ncorresponding to subset sizes, where each subset is an intersection of named\ncategories. The index of the Series indicates which rows pertain to which\ncategories, by having multiple boolean indices, like ``example`` in the\nfollowing::\n\n >>> from upsetplot import generate_counts\n >>> example = generate_counts()\n >>> example\n cat0 cat1 cat2\n False False False 56\n True 283\n True False 1279\n True 5882\n True False False 24\n True 90\n True False 429\n True 1957\n Name: value, dtype: int64\n\nThen::\n\n >>> from upsetplot import plot\n >>> plot(example) # doctest: +SKIP\n >>> from matplotlib import pyplot\n >>> pyplot.show() # doctest: +SKIP\n\nmakes:\n\n.. image:: http://upsetplot.readthedocs.io/en/latest/_images/sphx_glr_plot_generated_001.png\n :target: ../auto_examples/plot_generated.html\n\nAnd you can save the image in various formats::\n\n >>> pyplot.savefig(\"/path/to/myplot.pdf\") # doctest: +SKIP\n >>> pyplot.savefig(\"/path/to/myplot.png\") # doctest: +SKIP\n\nThis plot shows the cardinality of every category combination seen in our data.\nThe leftmost column counts items absent from any category. 
The next three\ncolumns count items only in ``cat1``, ``cat2`` and ``cat3`` respectively, with\nfollowing columns showing cardinalities for items in each combination of\nexactly two named sets. The rightmost column counts items in all three sets.\n\nRotation\n........\n\nWe call the above plot style \"horizontal\" because the category intersections\nare presented from left to right. `Vertical plots\n<http://upsetplot.readthedocs.io/en/latest/auto_examples/plot_vertical.html>`__\nare also supported!\n\n.. image:: http://upsetplot.readthedocs.io/en/latest/_images/sphx_glr_plot_vertical_001.png\n :target: http://upsetplot.readthedocs.io/en/latest/auto_examples/plot_vertical.html\n\nDistributions\n.............\n\nProviding a DataFrame rather than a Series as input allows us to expressively\n`plot the distribution of variables\n<http://upsetplot.readthedocs.io/en/latest/auto_examples/plot_diabetes.html>`__\nin each subset.\n\n.. image:: http://upsetplot.readthedocs.io/en/latest/_images/sphx_glr_plot_diabetes_001.png\n :target: http://upsetplot.readthedocs.io/en/latest/auto_examples/plot_diabetes.html\n\nLoading datasets\n................\n\nWhile the dataset above is randomly generated, you can prepare your own dataset\nfor input to upsetplot. A helpful tool is `from_memberships`, which allows\nus to reconstruct the example above by indicating each data point's category\nmembership::\n\n >>> from upsetplot import from_memberships\n >>> example = from_memberships(\n ... [[],\n ... ['cat2'],\n ... ['cat1'],\n ... ['cat1', 'cat2'],\n ... ['cat0'],\n ... ['cat0', 'cat2'],\n ... ['cat0', 'cat1'],\n ... ['cat0', 'cat1', 'cat2'],\n ... ],\n ... data=[56, 283, 1279, 5882, 24, 90, 429, 1957]\n ... 
)\n >>> example\n cat0 cat1 cat2\n False False False 56\n True 283\n True False 1279\n True 5882\n True False False 24\n True 90\n True False 429\n True 1957\n dtype: int64\n\nSee also `from_contents`, another way to describe categorised data, and\n`from_indicators` which allows each category to be indicated by a column in\nthe data frame (or a function of the column's data such as whether it is a\nmissing value).\n\nInstallation\n------------\n\nTo install the library, you can use `pip`::\n\n $ pip install upsetplot\n\nInstallation requires:\n\n* pandas\n* matplotlib >= 2.0\n* seaborn to use `UpSet.add_catplot`\n\nIt should then be possible to::\n\n >>> import upsetplot\n\nin Python.\n\nWhy an alternative to py-upset?\n-------------------------------\n\nProbably for petty reasons. It appeared `py-upset\n<https://github.com/ImSoErgodic/py-upset>`_ was not being maintained. Its\ninput format was undocumented, inefficient and, IMO, inappropriate. It did not\nfacilitate showing plots of each subset's distribution as in Lex et al's work\nintroducing UpSet plots. Nor did it include the horizontal bar plots\nillustrated there. It did not support Python 2. I decided it would be easier to\nconstruct a cleaner version than to fix it.\n\nReferences\n----------\n\n.. [Lex2014] Alexander Lex, Nils Gehlenborg, Hendrik Strobelt, Romain Vuillemot, Hanspeter Pfister,\n *UpSet: Visualization of Intersecting Sets*,\n IEEE Transactions on Visualization and Computer Graphics (InfoVis '14), vol. 20, no. 12, pp. 1983–1992, 2014.\n doi: `doi.org/10.1109/TVCG.2014.2346248 <https://doi.org/10.1109/TVCG.2014.2346248>`_\n\n\n.. |py-versions| image:: https://img.shields.io/pypi/pyversions/upsetplot.svg\n :alt: Python versions supported\n\n.. |version| image:: https://badge.fury.io/py/UpSetPlot.svg\n :alt: Latest version on PyPi\n :target: https://badge.fury.io/py/UpSetPlot\n\n.. 
|build| image:: https://github.com/jnothman/upsetplot/actions/workflows/test.yml/badge.svg\n :alt: Github Workflows CI build status\n :scale: 100%\n :target: https://github.com/jnothman/UpSetPlot/actions/workflows/test.yml\n\n.. |issues| image:: https://img.shields.io/github/issues/jnothman/UpSetPlot.svg\n :alt: Issue tracker\n :target: https://github.com/jnothman/UpSetPlot\n\n.. |coverage| image:: https://coveralls.io/repos/github/jnothman/UpSetPlot/badge.svg\n :alt: Test coverage\n :target: https://coveralls.io/github/jnothman/UpSetPlot\n\n.. |docs| image:: https://readthedocs.org/projects/upsetplot/badge/?version=latest\n :alt: Documentation Status\n :scale: 100%\n :target: https://upsetplot.readthedocs.io/en/latest/?badge=latest\n\n.. |licence| image:: https://img.shields.io/badge/Licence-BSD-blue.svg\n :target: https://opensource.org/licenses/BSD-3-Clause\n",
"description_content_type": "",
"docs_url": null,
"download_url": "",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://upsetplot.readthedocs.io",
"keywords": "",
"license": "BSD-3-Clause",
"maintainer": "",
"maintainer_email": "",
"name": "UpSetPlot",
"package_url": "https://pypi.org/project/UpSetPlot/",
"platform": null,
"project_url": "https://pypi.org/project/UpSetPlot/",
"project_urls": {
"Homepage": "https://upsetplot.readthedocs.io"
},
"release_url": "https://pypi.org/project/UpSetPlot/0.9.0/",
"requires_dist": null,
"requires_python": "",
"summary": "Draw Lex et al.'s UpSet plots with Pandas and Matplotlib",
"version": "0.9.0",
"yanked": false,
"yanked_reason": null
}
}
55 changes: 55 additions & 0 deletions test/fixtures/pypi/registryData-info_chardet-5.1.0.json
@@ -0,0 +1,55 @@
{
"info": {
"author": "Mark Pilgrim",
"author_email": "mark@diveintomark.org",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"License :: OSI Approved :: GNU Lesser General Public License v2 or later (LGPLv2+)",
"Operating System :: OS Independent",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
"Topic :: Software Development :: Libraries :: Python Modules",
"Topic :: Text Processing :: Linguistic"
],
"description": "Chardet: The Universal Character Encoding Detector\n--------------------------------------------------\n\n.. image:: https://img.shields.io/travis/chardet/chardet/stable.svg\n :alt: Build status\n :target: https://travis-ci.org/chardet/chardet\n\n.. image:: https://img.shields.io/coveralls/chardet/chardet/stable.svg\n :target: https://coveralls.io/r/chardet/chardet\n\n.. image:: https://img.shields.io/pypi/v/chardet.svg\n :target: https://warehouse.python.org/project/chardet/\n :alt: Latest version on PyPI\n\n.. image:: https://img.shields.io/pypi/l/chardet.svg\n :alt: License\n\n\nDetects\n - ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)\n - Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)\n - EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)\n - EUC-KR, ISO-2022-KR, Johab (Korean)\n - KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)\n - ISO-8859-5, windows-1251 (Bulgarian)\n - ISO-8859-1, windows-1252, MacRoman (Western European languages)\n - ISO-8859-7, windows-1253 (Greek)\n - ISO-8859-8, windows-1255 (Visual and Logical Hebrew)\n - TIS-620 (Thai)\n\n.. 
note::\n Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily\n disabled until we can retrain the models.\n\nRequires Python 3.7+.\n\nInstallation\n------------\n\nInstall from `PyPI <https://pypi.org/project/chardet/>`_::\n\n pip install chardet\n\nDocumentation\n-------------\n\nFor users, docs are now available at https://chardet.readthedocs.io/.\n\nCommand-line Tool\n-----------------\n\nchardet comes with a command-line script which reports on the encodings of one\nor more files::\n\n % chardetect somefile someotherfile\n somefile: windows-1252 with confidence 0.5\n someotherfile: ascii with confidence 1.0\n\nAbout\n-----\n\nThis is a continuation of Mark Pilgrim's excellent original chardet port from C, and `Ian Cordasco <https://github.com/sigmavirus24>`_'s\n`charade <https://github.com/sigmavirus24/charade>`_ Python 3-compatible fork.\n\n:maintainer: Dan Blanchard\n",
"description_content_type": "",
"docs_url": null,
"download_url": "",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/chardet/chardet",
"keywords": "encoding,i18n,xml",
"license": "LGPL",
"maintainer": "Daniel Blanchard",
"maintainer_email": "dan.blanchard@gmail.com",
"name": "chardet",
"package_url": "https://pypi.org/project/chardet/",
"platform": null,
"project_url": "https://pypi.org/project/chardet/",
"project_urls": {
"Documentation": "https://chardet.readthedocs.io/",
"GitHub Project": "https://github.com/chardet/chardet",
"Homepage": "https://github.com/chardet/chardet",
"Issue Tracker": "https://github.com/chardet/chardet/issues"
},
"release_url": "https://pypi.org/project/chardet/5.1.0/",
"requires_dist": null,
"requires_python": ">=3.7",
"summary": "Universal encoding detector for Python 3",
"version": "5.1.0",
"yanked": false,
"yanked_reason": null
}
}