Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make revocation of access tokens optional #1318

Conversation

NobodysNightmare
Copy link
Contributor

Summary

As discussed in #1314, the automatic revocation of old access tokens can lead to problems when multiple physical machines (clients) share the same set of credentials (e.g. multiple web servers behind a load balancer).

However, the revocation was implemented, to make the fetching of tokens more efficient, since it did not perform well for large amounts of tokens.

Therefore this pull request implements two changes:

Considerations

I decided to disable automatic revocation by default. My reasoning:

  • It was a breaking change introduced in 5.2, so it was a surprise for anyone upgrading from 5.1 and below
  • I am not sure if there is real demand for that feature or whether it was purely out of performance considerations... But I can at least imagine the demand

Fetching based on TTL is not working when you set a custom_access_token_expires_in, since we could not guarantee that the tokens being filtered out would be considered as expired by the custom strategy. I am pretty sad about introducing this piece of "works different depending on configuration" and I am not sure whether we should at least point it out in the documentation.

@doorkeeper-bot
Copy link

doorkeeper-bot commented Oct 17, 2019

1 Warning
⚠️ Please squash all your commits to a single one

Generated by 🚫 Danger

@NobodysNightmare NobodysNightmare changed the title Efficient authorized tokens for Make revocation of access tokens optional Oct 17, 2019
@NobodysNightmare
Copy link
Contributor Author

In #1314 I proposed that we also adapt the changelog for 5.2, to note that the automatic revocation might be an [IMPORTANT] change.

Should we do this? in this PR?

@NobodysNightmare
Copy link
Contributor Author

Please squash all your commits to a single one

Is this desirable here? I feel that the changes belong together (one PR), but are distinct enough to justify documenting them in different commits.

Should I split this pull request into two requests?

@NobodysNightmare NobodysNightmare force-pushed the efficient_authorized_tokens_for branch 2 times, most recently from e396db2 to 305efd6 Compare October 17, 2019 09:29
@NobodysNightmare NobodysNightmare force-pushed the efficient_authorized_tokens_for branch 2 times, most recently from 92270c4 to ca5c7d6 Compare October 17, 2019 10:38
@nbulaj
Copy link
Member

nbulaj commented Oct 18, 2019

It was a breaking change introduced in 5.2, so it was a surprise for anyone upgrading from 5.1 and below

Actually agree.

I proposed that we also adapt the changelog for 5.2, to note that the automatic revocation might be an [IMPORTANT] change.

Also agree. Could you please add it? Also it would be great to add an entry to the wiki in Upgrading section.

Is this desirable here? I feel that the changes belong together (one PR), but are distinct enough to justify documenting them in different commits.

I can do a squash merge after we finish with this PR so you can don't mind about it. Btw, there is no reason to have a full history for any particular PR - we have everything discussed in the PR / commit description / etc.

lib/doorkeeper/models/access_token_mixin.rb Outdated Show resolved Hide resolved
lib/doorkeeper/models/access_token_mixin.rb Outdated Show resolved Hide resolved
lib/doorkeeper/oauth/client_credentials/creator.rb Outdated Show resolved Hide resolved
@NobodysNightmare NobodysNightmare force-pushed the efficient_authorized_tokens_for branch 2 times, most recently from 7aeba43 to 53e36d9 Compare October 22, 2019 07:38
Copy link
Member

@nbulaj nbulaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nbulaj
Copy link
Member

nbulaj commented Oct 24, 2019

Let's summarize if this PR breaks anything within previous releases and do we need to updagrate Wiki guides ?

Sorry I'm too busy and can't currently re-check everything, so any help would be appreciated 🤝

Comment on lines +105 to +136
# When enabling this option, make sure that you do not expect multiple processes
# using the same credentials at the same time (e.g. web servers spanning
# multiple machines and/or processes).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this comment. Aren't most of the clients nowadays multi-thread/process/machine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say yes. On the other hand I can imagine doorkeeper being deployed in small-scale setups, where you probably don't even think about redundancy. E.g. if I was using doorkeeper for my home projects, that would probably be on a single raspberry-pi or vServer and not in a Kubernetes cluster...

Full disclosure: I don't quite understand the use case of the feature there. I just realized that it was an unexpected breaking change for a multi-server setup. Therefore I made it optional and added the hint to not enable it for multi-machine setups.

@NobodysNightmare
Copy link
Contributor Author

Let's summarize if this PR breaks anything within previous releases and do we need to updagrate Wiki guides ?

Sorry I'm too busy and can't currently re-check everything, so any help would be appreciated

In essence there is only a slightly breaking change towards 5.2.0, because we disable something by default that (IMO) was a breaking change before.

This change is marked in the README with [IMPORTANT].

Updating wiki guides: To be honest, I don't know, I do not use the wiki that much. The breaking change would probably not need to be documented there. Maybe the new configuration revoke_previous_client_credentials_token needs an explanation, but I feel like I am the wrong person to explain this, since I just made it optional, but did not build it. So I can't explain the use case very well.

Sorry for my late response as well.

valid_from = Doorkeeper.configuration.access_token_expires_in.seconds.ago
end

tokens = ordered_by(:created_at, :desc)
Copy link

@hugopeixoto hugopeixoto Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in #1314 (comment), I think using time math would handle every scenario.

We have expires_in stored in the database, so something like this could work:

tokens
  .where(
    application_id: application_id,
    resource_owner_id: resource_owner_id,
    revoked_at: nil,
  )
  .where("created_at + expires_in * interval '1 second' > ?", Time.now.utc)

I'm not sure if this syntax covers every supported database.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this syntax covers every supported database.

We must support all the dbs that are supported by ActiveRecord. So either implement such query our self in the adapters , either .. I don't know..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double checked. This syntax won't work in sqlite3. Unsure how to do this. Creating the expires_at column would sidestep this whole issue, but I see why we'd want to avoid the redundancy. I'm guessing we're storing expires_in because that's the recommended way of reporting the expiration in oauth (https://tools.ietf.org/html/rfc6749#section-4.2.2) but maybe expires_in should be calculated from expires_at, instead of the other way round? Maybe this has been discussed before.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this may work (have not tested):

tokens
  .where(
    application_id: application_id,
    resource_owner_id: resource_owner_id,
    revoked_at: nil,
  )
  .where(
    if ActiveRecord::Base.connection.adapter_name == "SQLite"
      "datetime(created_at, '+' || expires_in || ' seconds') > ?"
    else
      "created_at + expires_in * interval '1 second' > ?"
    end,
    Time.now.utc,
  )

Thoughts?

Copy link
Member

@nbulaj nbulaj Jan 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EugeneFractal

It would work, yep. But I'm afraid of timezones used by databases (Postgres, Mysql, etc). Any project could set their own local timezones both for application and DBMS so we could have invalid queries. We can try to convert expressions like created_at + expires_in * interval '1 second' to UTC in the database, but every database has it's own syntax for this.

Also I don't sure how it works with nil values in expires_in (endless tokens)

Copy link
Member

@nbulaj nbulaj Jan 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something generic but very dangerous would be:

def support_adapter?
  %w[
    sqlite
    sqlite3
    postgis
    postgresql
    mysql
    mysql2
    sqlserver
    oracleenhanced
  ].include?(ActiveRecord::Base.connection.adapter_name.downcase)
end

def not_expired_for(application_id, resource_owner_id)
  # TODO: don't forget about endless tokens with expires_at.nil?
  relation = where(
      application_id: application_id,
      resource_owner_id: resource_owner_id,
      revoked_at: nil,
    )

  if support_adapter?
    relation.where("#{expiration_date_sql_expression} > ?", Time.now.utc)
  else
    relation.select { ... } # for other cases
  end
end

def expiration_date_sql_expression
  case ActiveRecord::Base.connection.adapter_name.downcase
  when "sqlite", "sqlite3"
    Arel.sql("DATETIME(created_at, '+' || expires_in || ' SECONDS')")
  when "postgis", "postgresql"
    Arel.sql("created_at + expires_in * INTERVAL '1 SECOND' AT TIME ZONE 'UTC'")
  when "mysql", "mysql2"
    Arel.sql("created_at + expires_in * INTERVAL '1 SECOND'")
  when "sqlserver"
    Arel.sql("DATEADD(second, expires_in, created_at) AT TIME ZONE 'UTC'")
  when "oracleenhanced"
    Arel.sql("created_at + INTERVAL to_char(expires_in) second")
  else
    raise "Unsupported database!"
  end
end

@NobodysNightmare
Copy link
Contributor Author

NobodysNightmare commented Dec 11, 2019 via email

@hugopeixoto
Copy link

Remember the custom expiry setting that accepts a block. IIRC this stopped me from going completely on time comparison.

I think the block is used to calculate the expiry date upon token creation, and it gets stored in expires_in. I don't think it's used every time we need to check the expiration.

@ksengers
Copy link

Is there any progress in this PR?

@NobodysNightmare
Copy link
Contributor Author

NobodysNightmare commented Jan 21, 2020 via email

@ksengers
Copy link

Whoops. I was thinking this already got merged..apparently not. Anything I need to do? koensengers notifications@github.com schrieb am Di., 21. Jan. 2020, 11:34:

Is there any progress in this PR? — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#1318?email_source=notifications&email_token=AADOXUDEPINIBRSUIOLSXV3Q63FU3A5CNFSM4JBVJ6I2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJPIQZQ#issuecomment-576620646>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADOXUHGE7QI4XL335YSUP3Q63FU3ANCNFSM4JBVJ6IQ .

Update branch with master and thats it

Tokens whose TTL has expired are per se not authorized anymore,
so we can filter them out early, which should reduce the amount
of tokens fetched from the database.

Limitation: This is only possible, when no custom expiration
has been set, since we can't apply custom rules while fetching from
the database, but those rules might extend the TTL of any given token.
Revocation of access tokens obtained via client credentials
was (re-)introduced in Doorkeeper 5.2. However, this is incompatible
with clients that are distributed over multiple physical machines,
that use the same set of credentials. In this case it could happen
that one machine invalidates the credentials fetched by another
machine.

Since there seems to be a use case/demand for the automated revocation,
I decided to turn it into a configuration option.
@NobodysNightmare
Copy link
Contributor Author

NobodysNightmare commented Jan 22, 2020

Update branch with master and thats it

Easier said than done. I now rebased onto master. Essential conflicts I solved:

  • Adding the created_since scope is now done in the new mixin for access tokens
  • I dropped the order statement from my change to authorized_tokens_for, since it was dropped from that method upstream as well
  • I kept the where clause in my change to authorized_tokens_for, because it was added equally upstream (this was irritating me the most, I did not yet check which MR introduced it) (apparently it was there the whole time and I just forgot and misread the diff)

revoked_at: nil)

if valid_from
tokens.created_since(valid_from)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have specs for this case? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping here, I think you missed it 😬

# by already filtering out tokens whose TTL has expired.
# We can't do that if custom expirations times have been configured, though.
unless Doorkeeper.configuration.option_defined?(:custom_access_token_expires_in)
valid_from = Doorkeeper.configuration.access_token_expires_in.seconds.ago
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we use UTC everywhere, does this .ago respects it?

@NobodysNightmare
Copy link
Contributor Author

NobodysNightmare commented Jan 22, 2020 via email

@nbulaj
Copy link
Member

nbulaj commented Jan 29, 2020

Hi @NobodysNightmare ! Could you please rebase PR to resolve the conflicts? I'll do other changes myself then. Thanks for your contribution!

@nbulaj
Copy link
Member

nbulaj commented Jan 29, 2020

Or just let me know and I'll do all myself

nbulaj added a commit that referenced this pull request Jan 29, 2020
Originally #1318, but without optimizations for existing token lookup +
code refactoring.
@nbulaj
Copy link
Member

nbulaj commented Jan 29, 2020

I have some plans to release a new version on this week, so I've decided to clone some of the work from this PR to a new one with fixes (currently we can't use optimization with created_since API). Check it here #1353

Thanks @NobodysNightmare for awesome work <3

@NobodysNightmare
Copy link
Contributor Author

Nice to see this being integrated (even partially).

Do I still need to rebase onto master or should I just close the MR?

@nbulaj
Copy link
Member

nbulaj commented Jan 29, 2020

@NobodysNightmare you can close this PR as most of the changes from it already in master.

existing token lookup need to be refactored and introduced in a new PR. I posted sample solution that uses time math fo different databases, but I still don't sure about it.

Btw thanks for your work!

UPD: done it myself :)

@nbulaj nbulaj closed this Jan 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants