feat: support 'chevron' library for templating as jinja alternative #11617

mistercrunch · 2020-11-09T06:12:08Z

SUMMARY

chevron is a Python implementation of mustache (the templating language behind mustache.js) and, by scope, seems much safer than jinja2 could ever be (the project tagline is "Logic-less templates"!).

While the library is not super active lately, it appears to be feature complete and is easy to review from a security standpoint as it's a few short-ish modules. Overall the whole library is <1k lines.

This PR adds support for the chevron library behind a feature flag.

SHORTCOMINGS

One shortcoming is the HTML escaping, which we have to work around using triple curlies, i.e. {{{ ... }}}. I submitted a PR to allow turning this off in our context: "allow bypassing html_escaping", noahmorrison/chevron#81.
Accessing a position in an array using the undocumented {{mylist.0}} mustache feature seems impossible in the current release, but it works in chevron's master, which highlights the fact that the lib hasn't been released to PyPI in a long time.
Given the security risks here, we should do a full audit of chevron's [small] codebase and pin the lib. Bumping the lib should force a deep analysis of the changelog/changeset to make sure there are no security regressions.
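
To illustrate the escaping shortcoming above, here is a minimal stdlib-only sketch. `render_toy` is a hypothetical toy renderer, not chevron's implementation, and `html.escape` stands in for whatever escaping the real library applies. It only shows why a double-curly tag can corrupt a SQL fragment while a triple-curly tag leaves it intact.

```python
import html
import re

# Matches either {{{ name }}} (raw) or {{ name }} (escaped) in a single pass.
_TAG = re.compile(r"\{\{\{\s*(\w+)\s*\}\}\}|\{\{\s*(\w+)\s*\}\}")


def render_toy(template: str, data: dict) -> str:
    """Toy mustache-style substitution (an illustration, NOT chevron)."""

    def substitute(match: "re.Match") -> str:
        raw_key, escaped_key = match.groups()
        if raw_key is not None:
            # Triple curlies: insert the value untouched.
            return str(data.get(raw_key, ""))
        # Double curlies: HTML-escape the value before inserting it.
        return html.escape(str(data.get(escaped_key, "")), quote=False)

    return _TAG.sub(substitute, template)


data = {"cond": "amount < 100"}
escaped_sql = render_toy("SELECT * FROM t WHERE {{cond}}", data)
raw_sql = render_toy("SELECT * FROM t WHERE {{{cond}}}", data)
```

In this sketch `escaped_sql` contains `&lt;` in place of `<`, which is no longer valid SQL, while `raw_sql` keeps the comparison operator as written.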

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

[chevron](https://github.com/noahmorrison/chevron) is a python implementation of mustache.js, and seems much safer by scope than jinja could ever be. While the library is not super active lately, it appears to be feature complete and is easy to review from a security standpoint as it's a few short-ish modules. This PR adds support for the chevron library behind a feature flag.

codecov-io · 2020-11-09T06:34:57Z

Codecov Report

Merging #11617 (ed78947) into master (92a9acd) will increase coverage by 0.42%.
The diff coverage is 60.00%.

@@            Coverage Diff             @@
##           master   #11617      +/-   ##
==========================================
+ Coverage   62.26%   62.69%   +0.42%     
==========================================
  Files         873      873              
  Lines       42238    43307    +1069     
  Branches     3959     4079     +120     
==========================================
+ Hits        26301    27151     +850     
- Misses      15757    15976     +219     
  Partials      180      180

Flag	Coverage Δ
javascript	`62.94% <ø> (+0.09%)`	⬆️
python	`62.51% <60.00%> (+0.60%)`	⬆️

Impacted Files	Coverage Δ
superset/config.py	`90.11% <ø> (ø)`
superset/jinja_context.py	`79.83% <60.00%> (-1.99%)`	⬇️
...dashboard/components/FiltersBadge/DetailsPanel.tsx	`17.80% <0.00%> (-0.80%)`	⬇️
...set-frontend/src/views/CRUD/welcome/EmptyState.tsx	`87.14% <0.00%> (-0.67%)`	⬇️
...ntend/src/views/CRUD/annotation/AnnotationList.tsx	`81.74% <0.00%> (-0.40%)`	⬇️
...ontend/src/components/ListViewCard/ImageLoader.tsx	`86.20% <0.00%> (-0.16%)`	⬇️
superset-frontend/src/components/Menu/SubMenu.tsx	`100.00% <0.00%> (ø)`
...-frontend/src/common/components/common.stories.tsx	`0.00% <0.00%> (ø)`
.../src/components/dataViewCommon/TableCollection.tsx	`100.00% <0.00%> (ø)`
superset/db_engine_specs/kylin.py	`94.73% <0.00%> (+0.61%)`	⬆️
... and 10 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

craig-rueda · 2020-11-09T17:21:59Z

superset/jinja_context.py

@@ -338,6 +347,8 @@ def get_template_processor(
        template_processor = template_processors.get(
            database.backend, BaseTemplateProcessor
        )
+    elif feature_flag_manager.is_feature_enabled("CHEVRON_TEMPLATE_PROCESSING"):


It might be better to nest this conditional under if ENABLE_TEMPLATE_PROCESSING, as a sub-option available once the user turns template processing on.

Maybe we should rename ENABLE_TEMPLATE_PROCESSING to JINJA_TEMPLATE_PROCESSING instead (albeit a breaking change)?

Right, clearly there's some cleanup to be done here, but we also need to maintain backward compatibility. I think that's the easy portion of this PR; the real question is around the viability of chevron as a secure templating backend.

As a solution to this particular topic though I'd like to add JINJA_TEMPLATE_PROCESSING as True by default and nest the condition under ENABLE_TEMPLATE_PROCESSING.

robdiciuccio · 2020-11-09T19:14:45Z

This looks good upon cursory review. I'll look into the lib a bit more deeply.

villebro

Looks good and no harm in supporting multiple template processors. I'm slightly concerned by the staleness of the library, but that's a different topic (not a problem before chevron is potentially enabled by default).

villebro · 2020-11-10T08:21:32Z

superset/jinja_context.py

@@ -338,6 +347,8 @@ def get_template_processor(
        template_processor = template_processors.get(
            database.backend, BaseTemplateProcessor
        )
+    elif feature_flag_manager.is_feature_enabled("CHEVRON_TEMPLATE_PROCESSING"):


I would propose calling the feature flag TEMPLATE_PROCESSING, and then adding a config parameter TEMPLATE_PROCESSOR: Optional[TemplateProcessorType] = TemplateProcessorType.CHEVRON. Similar to how other feature flagged options do (see e.g. THUMBNAIL_SELENIUM_USER, THUMBNAIL_CACHE_CONFIG etc).
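
A rough sketch of how the proposed configuration could fit together. `TemplateProcessorType`, its members, and `pick_template_processor` are illustrative assumptions; only the `TEMPLATE_PROCESSING` and `TEMPLATE_PROCESSOR` names come from the proposal above, and this is not Superset's actual code.

```python
from enum import Enum
from typing import Dict, Optional


class TemplateProcessorType(str, Enum):
    """Hypothetical enum of supported template backends."""

    JINJA = "jinja"
    CHEVRON = "chevron"


def pick_template_processor(
    feature_flags: Dict[str, bool],
    template_processor: Optional[TemplateProcessorType],
) -> Optional[TemplateProcessorType]:
    """Return the backend to use, or None when templating is off."""
    # The feature flag gates templating as a whole.
    if not feature_flags.get("TEMPLATE_PROCESSING", False):
        return None
    # The config parameter then selects the backend (None means no backend).
    return template_processor


# Example wiring: the flag is on and the config value selects chevron.
selected = pick_template_processor(
    {"TEMPLATE_PROCESSING": True}, TemplateProcessorType.CHEVRON
)
```

Keeping the on/off switch in the feature flag and the backend choice in a config parameter means a new backend can be added without introducing another flag.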

mistercrunch · 2020-11-10T17:40:46Z

FYI, offered help to maintain chevron here: noahmorrison/chevron#82

ktmud · 2020-11-10T18:25:11Z

On another note, have you looked into JinjaSQL? I've used it in some Python scripts before, but not sure whether it is safer than the raw jinja.

robdiciuccio · 2020-11-10T19:00:16Z

@ktmud Thanks for the suggestion! It seems with JinjaSQL "You can use the full power of Jinja templates," which is what we're trying to move away from (the ability to execute arbitrary python code in user-generated input).

mistercrunch · 2020-11-11T04:21:50Z

@ktmud didn't spend much time but that lib seems to be doing very little on top of jinja as highlighted in the single, very simple <200 LoC module here:
https://github.com/hashedin/jinjasql/blob/master/jinjasql/core.py

ktmud · 2020-11-11T04:41:20Z

It seems the only thing it does is to parameterize SQL queries, which at least reduces the risk of SQL injections (if there's ever such risk within Superset).

ktmud · 2020-11-12T01:42:14Z

Not sure if we really want to move away from Jinja. Other than the unintentional exposure of unsafe objects, what are other potential risks with Jinja?

One thing we could do is to be more careful with what we expose to the Jinja context, and the first thing we should probably do is to always expose raw functions instead of class objects (i.e., replace hive.latest_partition and presto.latest_partition with latest_partition).
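
A minimal sketch of the difference, under stated assumptions: `FakeEngineSpec`, `internal_settings`, and `resolve` are hypothetical names, and `resolve` is a toy dotted-path lookup that only mimics a sandbox by refusing underscore-prefixed attributes. It is not Jinja's sandbox.

```python
class FakeEngineSpec:
    """Hypothetical stand-in for a DB engine spec class."""

    # A public class attribute the template author was never meant to see.
    internal_settings = {"password": "not-for-templates"}

    @classmethod
    def latest_partition(cls, table_name: str) -> str:
        return f"{table_name}/ds=2020-11-12"


def latest_partition(table_name: str) -> str:
    """Plain function wrapper: a template can call it, nothing more."""
    return FakeEngineSpec.latest_partition(table_name)


def resolve(context: dict, dotted_path: str):
    """Toy dotted lookup that refuses underscore-prefixed attributes."""
    head, *rest = dotted_path.split(".")
    obj = context[head]
    for name in rest:
        if name.startswith("_"):
            raise AttributeError(f"access to {name!r} is blocked")
        obj = getattr(obj, name)
    return obj


# Exposing the class lets a template walk to any public attribute on it.
unsafe_context = {"hive": FakeEngineSpec}
leaked = resolve(unsafe_context, "hive.internal_settings")

# Exposing only the function leaves nothing public to chain from.
safe_context = {"latest_partition": latest_partition}
partition = resolve(safe_context, "latest_partition")("logs")
```

With the class in the context, `hive.internal_settings` resolves without touching any underscore attribute, so an underscore filter alone does not help. With only the function exposed, the same chained access fails with `AttributeError`.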

robdiciuccio · 2020-11-12T16:24:36Z

One concern with chevron is that it allows loading of "partials" from the filesystem. ex:

>>> import chevron
>>> chevron.render('Config: {{> superset_config }}', {}, '.', 'py')
'Config: import os\nfrom superset.stats_logger import DummyStatsLogger\nfrom cachelib.file...'

While this can be mitigated since we control how chevron.render is called, it does not appear that it can be disabled.
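
One possible caller-side mitigation, sketched with the stdlib only: reject any template containing a partial tag before it ever reaches the renderer. `assert_no_partials` is a hypothetical helper, and the sketch assumes the default `{{ }}` delimiters; it also rejects set-delimiter tags (`{{= ... =}}`), since changing delimiters would otherwise let a partial tag slip past the check.

```python
import re

# "{{>" introduces a partial; "{{=" changes the delimiters.
_FORBIDDEN_TAGS = re.compile(r"\{\{\s*[>=]")


def assert_no_partials(template: str) -> str:
    """Return the template unchanged, or raise if it uses a forbidden tag."""
    if _FORBIDDEN_TAGS.search(template):
        raise ValueError("partials and delimiter changes are not allowed")
    return template


# An ordinary variable tag passes through untouched.
checked = assert_no_partials("SELECT {{ col }} FROM {{{ table }}}")
```

This is deliberately conservative and is not a substitute for being able to disable partials in the library itself.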

robdiciuccio · 2020-11-12T16:37:54Z

@ktmud The intention is to prevent security issues like CVE-2020-13948. We should absolutely move away from exposing class objects to the Jinja context, but this doesn't prevent access to Python internals. We're relying heavily on Jinja's sandboxing to prevent RCE, which feels very brittle.

ktmud · 2020-11-12T16:57:13Z

> @ktmud The intention is to prevent security issues like CVE-2020-13948. We should absolutely move away from exposing class objects to the Jinja context, but this doesn't prevent access to Python internals. We're relying heavily on Jinja's sandboxing to prevent RCE, which feels very brittle.

I think CVE-2020-13948 has the same root cause as the other security bug we had, which is exposing class objects or modules that allow unsafe chained access to things we don't want to expose. As long as we stop doing that, is there anything else we should worry about?

A lot of OSS projects use Jinja as their template engine, including Airflow and dbt. Is there anything we can learn from them to make Jinja more secure?

robdiciuccio · 2020-11-12T19:22:54Z

@ktmud The difference here is that we're processing user-generated input, while Airflow and dbt are operating on templates defined in code. Unclear how the hosted versions of these apps (Astronomer and dbt Cloud) handle Jinja security.

ktmud · 2020-11-12T19:46:20Z

To be fair, dbt is also very much user-generated input. Anyone can log in to the dbt UI and write templated code in the IDE: https://docs.getdbt.com/docs/running-a-dbt-project/using-the-dbt-ide

mistercrunch · 2020-11-12T21:43:02Z

Looks like jinja changed their narrative around the sandboxed mode recently (?). I'm pretty sure they used to advise against using Jinja to execute untrusted strings not that long ago.

mistercrunch · 2020-11-12T21:44:47Z

Out of curiosity we could look at DBT's code to see how they secure their sandbox.

mistercrunch · 2020-11-12T21:45:43Z

Clearly the compatibility with DBT AND Airflow through Jinja is a really important factor for many. People want to be able to iterate and go back and forth between these tools.

robdiciuccio · 2020-11-13T04:07:20Z

I'll take a stab at making the jinja implementation more secure in a separate branch.

robdiciuccio · 2020-11-14T02:54:47Z

Attempt at making Jinja more secure: #11704

mistercrunch · 2020-11-15T20:43:33Z

Superseded by #11704

superset-github-bot bot added the preset-io label Nov 9, 2020

pull-request-size bot added the size/S label Nov 9, 2020

craig-rueda reviewed Nov 9, 2020

improve feature flags

ed78947

villebro reviewed Nov 10, 2020

pull-request-size bot added size/M and removed size/S labels Nov 11, 2020

mistercrunch closed this Nov 15, 2020

robdiciuccio mentioned this pull request Nov 17, 2020

feat(templating): Safer Jinja template processing #11704

Merged

7 tasks

alanyee mentioned this pull request Nov 27, 2020

[Feature Request] Ability to disable loading partials from the filesystem noahmorrison/chevron#84

Closed
