Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error-prone SQL queries #15828

Merged
merged 9 commits into from
Jan 23, 2022
Merged

Conversation

ClearlyClaire
Copy link
Contributor

@ClearlyClaire ClearlyClaire commented Mar 3, 2021

Following the news of the compromise of one infamous Mastodon fork, I have been reviewing Mastodon's source code for SQL injection vulnerabilities.

So far, I have found none, but I have found a few queries that had me scratch my heard pretty hard and that could easily introduce vulnerabilities in the future if we are not careful.

This PR should make them more robust.

@ClearlyClaire ClearlyClaire requested a review from Gargron March 4, 2021 21:17
@ClearlyClaire
Copy link
Contributor Author

Removed unneeded space characters around the generated tsquery, I do not think they serve any purpose, I tried a bunch of tsqueries terms, and encoding them with or without spaces always result in the same tsquery.

@stale
Copy link

stale bot commented Jul 8, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/wontfix This will not be worked on label Jul 8, 2021
@ClearlyClaire
Copy link
Contributor Author

I still think this is a worthy change.

@stale stale bot removed the status/wontfix This will not be worked on label Jul 9, 2021
Copy link
Contributor

@mohe2015 mohe2015 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know ruby that much but this should really get merged. You need to rebase though.

@@ -96,12 +96,12 @@ class Status < ApplicationRecord
scope :not_excluded_by_account, ->(account) { where.not(account_id: account.excluded_from_timeline_account_ids) }
scope :not_domain_blocked_by_account, ->(account) { account.excluded_from_timeline_domains.blank? ? left_outer_joins(:account) : left_outer_joins(:account).where('accounts.domain IS NULL OR accounts.domain NOT IN (?)', account.excluded_from_timeline_domains) }
scope :tagged_with_all, ->(tag_ids) {
Array(tag_ids).reduce(self) do |result, id|
Array(tag_ids).map(&:to_i).reduce(self) do |result, id|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know ruby but the previous here looks quite dangerous as far as I can tell. Are the tag_ids sanitized somewhere else or is there another reason why this is not an sql injection?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ids are not passed from user input

Copy link
Contributor

@mohe2015 mohe2015 Jul 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't fully understood this query but couldn't this check in sql tag_id IN (:tag_ids)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. That's what tagged_with does (returning statuses that are tagged with any of the given tags). tagged_with_all needs to return tagged with all the given tags, there might be better way to query that, but right now I don't know which, and tag_id IN (:tag_ids) isn't one of them. Though tagged_with_none can most probably be rewritten in a better way using tag_id IN (:tag_ids).

Copy link
Contributor Author

@ClearlyClaire ClearlyClaire Jul 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rewritten tagged_with_none in a way that seems slightly more efficient and does not involve creating names from any input. However, tagged_with_all remains unchanged for the moment.

@ClearlyClaire ClearlyClaire force-pushed the fixes/atrocious-sql branch 4 times, most recently from 4c6e682 to 98ac12e Compare July 10, 2021 15:43
@ClearlyClaire ClearlyClaire force-pushed the fixes/atrocious-sql branch 2 times, most recently from e941d6a to a9c6646 Compare August 8, 2021 05:56
@stale
Copy link

stale bot commented Jan 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status/wontfix This will not be worked on label Jan 8, 2022
While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.
Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.
This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.
@stale stale bot removed the status/wontfix This will not be worked on label Jan 10, 2022
@ClearlyClaire
Copy link
Contributor Author

ClearlyClaire commented Jan 10, 2022

I still think this is a good change. Rebased.

Again, this PR:

  • slightly improves readability of Account.search_for
  • slightly improves performance, and greatly improves readability of Status.tagged_with_none
  • fixes a bug when both Status.tagged_with_all and Status.tagged_with_none are used at the same time
  • while it does not fix any existing security vulnerability, it fixes 4 brakeman warnings and improve error-prone code, including some (Status.tagged_with_all / Status.tagged_with_none) that could be called in an inappropriate way without brakeman catching it

terms = unsanitized_terms.gsub(DISALLOWED_TSQUERY_CHARACTERS, ' ')

# The final ":*" is for prefix search.
"'#{terms}':*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mentioned this before: This does not yield the same results as ''' ' || #{terms} || ' ''' || ':*'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which different results? ''' ' || #{terms} || ' ''' || ':*' (when #{terms} get substituted by a string surrounded by single-quotes and not containing some characters) is a way to write the string ' #{terms} ':* in SQL (it's tricky because of the single-quotes), "' #{terms} ':*" results in the same string, which cannot be pasted directly in SQL, but it's ok because it's used as a parameter to a parameterized query instead.

So the difference between the two strings is just the spaces surrounding #{terms}. I understand the query is from https://github.com/Casecommons/pg_search/blob/master/lib/pg_search/features/tsearch.rb#L100-L138 but I don't understand what those extra spaces would mean, I could not understand it from PostgreSQL's documentation and did not observe any difference on testing data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I could not understand the reason for the surrounding spaces in the built query, I ended up comparing the generated tsqueries (at the SQL level) using the following code:

DISALLOWED_TSQUERY_CHARACTERS = /['?\\:‘’]/.freeze

def compare_tsqueries(unsanitized_terms)
  terms = unsanitized_terms.gsub(DISALLOWED_TSQUERY_CHARACTERS, ' ')
  inline_terms = Arel.sql(ActiveRecord::Base.connection.quote(terms))
  inline_tsquery = "to_tsquery('simple', ''' ' || #{inline_terms} || ' ''' || ':*')"
  tsquery_source = "'#{terms}':*"

  sql = "SELECT #{inline_tsquery} = to_tsquery('simple', $1)"
  ActiveRecord::Base.connection.select_all(sql, nil, [[nil, tsquery_source]]).rows[0][0]
end

def random_ascii_string(length)
  value = ''
  length.times do
    value  << (32 + rand(126-32)).chr
  end
  value
end

1000000.times.map { random_ascii_string(10) }.filter { |x| !compare_tsqueries(x) }

While I don't understand that result, it seems to_tsquery('simple', query) works slightly differently if the first term of query starts with ./, ../ or .. . I don't know how to explain that change, so I'm introducing the spaces back, even though I do not think these discrepancies (one case searching for a leading . less) would cause any issue at all.

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
@Gargron Gargron merged commit 0a120d8 into mastodon:main Jan 23, 2022
ClearlyClaire added a commit that referenced this pull request Feb 3, 2022
* Fix error-prone SQL queries in Account search

While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.

* Harden code for Status#tagged_with_all and Status#tagged_with_none

Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.

* Remove unneeded spaces surrounding tsquery term

* Please CodeClimate

* Move advanced_search_for SQL template to its own function

This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.

* Add tests covering tagged_with, tagged_with_all and tagged_with_none

* Rewrite tagged_with_none to avoid multiple joins and make it more robust

* Remove obsolete brakeman warnings

* Revert "Remove unneeded spaces surrounding tsquery term"

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
ClearlyClaire added a commit that referenced this pull request Feb 3, 2022
* Fix error-prone SQL queries in Account search

While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.

* Harden code for Status#tagged_with_all and Status#tagged_with_none

Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.

* Remove unneeded spaces surrounding tsquery term

* Please CodeClimate

* Move advanced_search_for SQL template to its own function

This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.

* Add tests covering tagged_with, tagged_with_all and tagged_with_none

* Rewrite tagged_with_none to avoid multiple joins and make it more robust

* Remove obsolete brakeman warnings

* Revert "Remove unneeded spaces surrounding tsquery term"

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
koba-lab added a commit to koba-lab/mastodon that referenced this pull request Feb 8, 2022
* commit '637c7d464b2876765370d1143b7ba6441efb730b': (698 commits)
  Bump version to 3.3.2
  Fix spurious errors when receiving an Add activity for a private post
  disable legacy XSS filtering (mastodon#17289)
  Change mastodon:webpush:generate_vapid_key task to not require functional env (mastodon#17338)
  Fix response_to_recipient? CTE
  Fix insufficient sanitization of report comments
  Fix compacted JSON-LD possibly causing compatibility issues on forwarding
  Compact JSON-LD signed incoming activities
  Fix error-prone SQL queries (mastodon#15828)
  Change docker-compose.yml to specifically tag v3.3.1 images
  Bump to version 3.3.1
  Save bundle config as local (mastodon#17188)
  Add manual GitHub Actions runs (mastodon#17000)
  Change workflow to push to Docker Hub (mastodon#16980)
  Build container image by GitHub Actions (mastodon#16973)
  Add more advanced migration tests
  Fix edge case in migration helpers that caused crash because of PostgreSQL quirks (mastodon#17398)
  Fix some old migration scripts (mastodon#17394)
  Fix filtering DMs from non-followed users (mastodon#17042)
  Fix upload of remote media with OpenStack Swift sometimes failing (mastodon#16998)
  ...

# Conflicts:
#	CHANGELOG.md
#	Gemfile.lock
#	app/controllers/auth/sessions_controller.rb
#	app/controllers/concerns/sign_in_token_authentication_concern.rb
#	app/controllers/concerns/signature_verification.rb
#	app/controllers/concerns/two_factor_authentication_concern.rb
#	app/javascript/mastodon/components/status_action_bar.js
#	app/javascript/mastodon/features/getting_started/index.js
#	app/javascript/mastodon/locales/ja.json
#	app/javascript/styles/mastodon/boost.scss
#	app/lib/activitypub/activity/announce.rb
#	app/lib/activitypub/activity/create.rb
#	app/lib/formatter.rb
#	app/lib/webfinger.rb
#	app/models/user.rb
#	app/services/fan_out_on_write_service.rb
#	app/services/resolve_account_service.rb
#	app/views/statuses/_detailed_status.html.haml
#	app/views/statuses/_simple_status.html.haml
#	chart/Chart.yaml
#	chart/values.yaml.template
#	db/migrate/20200620164023_add_fixed_lowercase_index_to_accounts.rb
#	lib/cli.rb
#	lib/mastodon/maintenance_cli.rb
#	lib/paperclip/response_with_limit_adapter.rb
#	package.json
#	spec/controllers/auth/sessions_controller_spec.rb
#	spec/services/resolve_account_service_spec.rb
koba-lab added a commit to koba-lab/mastodon that referenced this pull request Feb 8, 2022
* stable-3.4: (666 commits)
  Fix insufficient sanitization of report comments (mastodon#17430)
  Bump version to 3.4.6
  disable legacy XSS filtering (mastodon#17289)
  Change mastodon:webpush:generate_vapid_key task to not require functional env (mastodon#17338)
  Fix response_to_recipient? CTE
  Fix insufficient sanitization of report comments
  Fix compacted JSON-LD possibly causing compatibility issues on forwarding
  Compact JSON-LD signed incoming activities
  Fix error-prone SQL queries (mastodon#15828)
  Fix spurious errors when receiving an Add activity for a private post (mastodon#17425)
  Bump version to 3.4.5
  Add more advanced migration tests (mastodon#17393)
  Fix followers synchronization mechanism not working when URI has empty path (mastodon#16510)
  Add manual GitHub Actions runs (mastodon#17000)
  Change workflow to push to Docker Hub (mastodon#16980)
  Build container image by GitHub Actions (mastodon#16973)
  Bump ruby-saml from 1.11.0 to 1.13.0 (mastodon#16723)
  Save bundle config as local (mastodon#17188)
  Fix edge case in migration helpers that caused crash because of PostgreSQL quirks (mastodon#17398)
  Fix some old migration scripts (mastodon#17394)
  ...

# Conflicts:
#	.github/workflows/build-image.yml
#	CHANGELOG.md
#	Dockerfile
#	Gemfile.lock
#	app/lib/activitypub/activity/announce.rb
#	app/lib/activitypub/activity/create.rb
#	app/models/account.rb
#	chart/values.yaml
#	config/brakeman.ignore
#	config/environments/production.rb
#	config/locales/ja.yml
#	config/locales/simple_form.ja.yml
#	docker-compose.yml
#	lib/mastodon/maintenance_cli.rb
#	lib/mastodon/migration_helpers.rb
#	lib/mastodon/version.rb
#	lib/terrapin/multi_pipe_extensions.rb
#	yarn.lock
jesseplusplus pushed a commit to jesseplusplus/decodon that referenced this pull request Feb 10, 2022
* Fix error-prone SQL queries in Account search

While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.

* Harden code for Status#tagged_with_all and Status#tagged_with_none

Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.

* Remove unneeded spaces surrounding tsquery term

* Please CodeClimate

* Move advanced_search_for SQL template to its own function

This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.

* Add tests covering tagged_with, tagged_with_all and tagged_with_none

* Rewrite tagged_with_none to avoid multiple joins and make it more robust

* Remove obsolete brakeman warnings

* Revert "Remove unneeded spaces surrounding tsquery term"

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
rararwg pushed a commit to rararwg/mastodon that referenced this pull request May 9, 2022
* Fix error-prone SQL queries in Account search

While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.

* Harden code for Status#tagged_with_all and Status#tagged_with_none

Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.

* Remove unneeded spaces surrounding tsquery term

* Please CodeClimate

* Move advanced_search_for SQL template to its own function

This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.

* Add tests covering tagged_with, tagged_with_all and tagged_with_none

* Rewrite tagged_with_none to avoid multiple joins and make it more robust

* Remove obsolete brakeman warnings

* Revert "Remove unneeded spaces surrounding tsquery term"

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
jesseplusplus pushed a commit to jesseplusplus/decodon that referenced this pull request May 18, 2022
* Fix error-prone SQL queries in Account search

While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.

* Harden code for Status#tagged_with_all and Status#tagged_with_none

Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.

* Remove unneeded spaces surrounding tsquery term

* Please CodeClimate

* Move advanced_search_for SQL template to its own function

This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.

* Add tests covering tagged_with, tagged_with_all and tagged_with_none

* Rewrite tagged_with_none to avoid multiple joins and make it more robust

* Remove obsolete brakeman warnings

* Revert "Remove unneeded spaces surrounding tsquery term"

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
jesseplusplus pushed a commit to jesseplusplus/decodon that referenced this pull request May 18, 2022
* Fix error-prone SQL queries in Account search

While this code seems to not present an actual vulnerability, one could
easily be introduced by mistake due to how the query is built.

This PR parameterises the `to_tsquery` input to make the query more robust.

* Harden code for Status#tagged_with_all and Status#tagged_with_none

Those two scopes aren't used in a way that could be vulnerable to an SQL
injection, but keeping them unchanged might be a hazard.

* Remove unneeded spaces surrounding tsquery term

* Please CodeClimate

* Move advanced_search_for SQL template to its own function

This avoids one level of indentation while making clearer that the SQL template
isn't build from all the dynamic parameters of advanced_search_for.

* Add tests covering tagged_with, tagged_with_all and tagged_with_none

* Rewrite tagged_with_none to avoid multiple joins and make it more robust

* Remove obsolete brakeman warnings

* Revert "Remove unneeded spaces surrounding tsquery term"

The two queries are not strictly equivalent.

This reverts commit 86f16c5.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants