Merge remote-tracking branch 'upstream/master' into feat/s3-delete-files
lbeckman314 committed Jul 31, 2024
2 parents 41b20eb + 58f8164 commit 2bf992a
Showing 43 changed files with 1,502 additions and 768 deletions.
6 changes: 3 additions & 3 deletions .secrets.baseline
Original file line number Diff line number Diff line change
@@ -268,14 +268,14 @@
"filename": "tests/conftest.py",
"hashed_secret": "1348b145fa1a555461c1b790a2f66614781091e9",
"is_verified": false,
-"line_number": 1561
+"line_number": 1569
},
{
"type": "Base64 High Entropy String",
"filename": "tests/conftest.py",
"hashed_secret": "227dea087477346785aefd575f91dd13ab86c108",
"is_verified": false,
-"line_number": 1583
+"line_number": 1593
}
],
"tests/credentials/google/test_credentials.py": [
@@ -422,5 +422,5 @@
}
]
},
-"generated_at": "2024-03-16T00:09:27Z"
+"generated_at": "2024-07-25T17:19:58Z"
}
561 changes: 35 additions & 526 deletions README.md

Large diffs are not rendered by default.

6 changes: 4 additions & 2 deletions clear_prometheus_multiproc
@@ -4,6 +4,8 @@
set -ex

rm -Rf $1
-mkdir $1
+mkdir -p $1
chmod 755 $1
-chown 100:101 $1
+if id -u nginx &>/dev/null; then
+chown $(id -u nginx):$(id -g nginx) $1
+fi
8 changes: 8 additions & 0 deletions docs/additional_documentation/authorization.md
@@ -0,0 +1,8 @@

## Access Control / Authz

Fence currently works with another Gen3 service,
[arborist](https://github.com/uc-cdis/arborist), to implement attribute-based access
control for commons users. The YAML file of access control information (see
[#create-user-access-file](setup.md#create-user-access-file)) contains an `authz` section, whose contents are sent to
arborist to set up the access control model.
22 changes: 22 additions & 0 deletions docs/additional_documentation/data_access.md
@@ -0,0 +1,22 @@
## Accessing Data

Fence provides several mechanisms for accessing data. Access to data is
controlled through authorization information in a User Access File.

Users can be granted specific privileges on projects in the User Access
File. A `project` is identified by a unique authorization identifier, also known as its `auth_id`.

A `project` can be associated with various storage backends that store
object data for that given `project`. You can assign `read-storage` and `write-storage`
privileges to users who should have access to that stored object data. `read` and
`write` allow access to the data stored in a graph database.
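
As a minimal sketch of how such privileges might be looked up, assume the User Access File has already been parsed into a Python dictionary. The `users` / `projects` / `auth_id` / `privilege` layout and the `has_privilege` helper below are assumptions made for illustration; they are not Fence's own code.

```python
# Hypothetical, already-parsed User Access File content (assumed layout).
user_access = {
    "users": {
        "alice@example.org": {
            "projects": [
                {"auth_id": "test-project", "privilege": ["read", "read-storage"]},
            ]
        }
    }
}


def has_privilege(access, username, auth_id, privilege):
    """Return True if `username` holds `privilege` on the project `auth_id`."""
    user = access.get("users", {}).get(username, {})
    for project in user.get("projects", []):
        if project.get("auth_id") == auth_id:
            return privilege in project.get("privilege", [])
    return False


# alice can read stored objects for test-project but cannot write them.
print(has_privilege(user_access, "alice@example.org", "test-project", "read-storage"))
print(has_privilege(user_access, "alice@example.org", "test-project", "write-storage"))
```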

Depending on the backend, Fence can be configured to provide users access to
the data in different ways.


### Signed URLs

Temporary signed URLs are supported in all major commercial clouds. Signed URLs are the most 'cloud agnostic' way to allow users to access data located in different platforms.

Fence lets a user request a specific file by its GUID (globally unique identifier) and receive a temporary signed URL for object data in AWS or GCP that provides direct access to that object.
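
A minimal sketch of that request, from a client's point of view, could look like the following. The `/data/download/<guid>` path, the `expires_in` query parameter, and the `{"url": ...}` response shape are assumptions for this example; check the API documentation of your deployment before relying on them. Both helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode


def build_download_request_url(fence_url, guid, expires_in=None):
    """Build the URL used to ask Fence for a signed URL for one GUID.

    The endpoint path and the `expires_in` parameter are assumed here.
    """
    url = f"{fence_url.rstrip('/')}/data/download/{quote(guid)}"
    if expires_in is not None:
        url += "?" + urlencode({"expires_in": expires_in})
    return url


def extract_signed_url(response_body):
    """Read the signed URL from a JSON body shaped like {"url": "..."} (assumed)."""
    return json.loads(response_body)["url"]


print(build_download_request_url("https://fence.example", "abc-123", expires_in=600))
```

The request itself would also need to carry the user's access token; that part is omitted to keep the sketch free of network calls.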
File renamed without changes.
19 changes: 19 additions & 0 deletions docs/additional_documentation/default_expiration_times.md
@@ -0,0 +1,19 @@
## Default Expiration Times in Fence

The table below lists the artifacts in fence that have temporary lifetimes, along with their default values.

> NOTE: "SA" in the table below stands for Service Account

| Name | Lifetime | Extendable? | Maximum Lifetime | Details |
|-------------------------------------|--------------|-------------|-----------------------|-----------------------------------------------------------------------------------------------------------------------------|
| Access Token | 20 minutes | TRUE | Life of Refresh Token | |
| Refresh Token | 30 days | FALSE | N/A | |
| User's SA Account Access | 7 days | TRUE | N/A | Access to data (e.g. length it stays in the proxy group). Can optionally provide an expiration less than 7 days |
| User's Google Account Access | 1 day | TRUE | N/A | After AuthN, how long we associate a Google email with the given user. Can optionally provide an expiration less than 1 day |
| User's Google Account Linkage | Indefinite | N/A | N/A | |
| Google Signed URL | Up to 1 hour | FALSE | N/A | Can optionally provide an expiration less than 1 hour |
| AWS Signed URL | Up to 1 hour | FALSE | N/A | Can optionally provide an expiration less than 1 hour |
| Client SA (for User) Key | 10 days | FALSE | N/A | Obtained by an oauth client through /credentials/google |
| User Primary SA Key | 10 days | FALSE | N/A | Obtained by the user themselves for temp access. Can optionally provide an expiration less than 10 days |
| User Primary SA Key for URL Signing | 30 days | FALSE | N/A | Used for Google URL signing |
| Sliding Session Window | 15 minutes | TRUE | 8 hours | access_token cookies get generated automatically when expired if session is still active |
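
Several rows note that a caller "can optionally provide an expiration less than" the default. A minimal sketch of that rule, assuming the default acts as an upper bound on any requested lifetime, is shown below. The dictionary keys and the `effective_expiration` helper are hypothetical names for illustration, not Fence configuration keys.

```python
from datetime import datetime, timedelta, timezone

# Defaults taken from the table above, in seconds (key names are hypothetical).
DEFAULT_LIFETIMES = {
    "sa_account_access": 7 * 24 * 60 * 60,   # 7 days
    "google_account_access": 24 * 60 * 60,   # 1 day
    "google_signed_url": 60 * 60,            # up to 1 hour
}


def effective_expiration(kind, requested_seconds=None, now=None):
    """Return the expiry time, never exceeding the default for `kind`."""
    if requested_seconds is not None and requested_seconds <= 0:
        raise ValueError("requested lifetime must be positive")
    now = now or datetime.now(timezone.utc)
    default = DEFAULT_LIFETIMES[kind]
    lifetime = default if requested_seconds is None else min(requested_seconds, default)
    return now + timedelta(seconds=lifetime)


# A two-hour request for a signed URL is capped at the one-hour default.
print(effective_expiration("google_signed_url", requested_seconds=7200))
```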
File renamed without changes.
126 changes: 126 additions & 0 deletions docs/additional_documentation/fence_create.md
@@ -0,0 +1,126 @@
## fence-create: Automating common tasks with a command line interface

fence-create is a command line utility that is bundled with fence and allows you to automate some common tasks within fence. For the most up-to-date list of commands and options, run `fence-create --help`.

WARNING: fence-create directly modifies the database in some cases and may circumvent security checks (most of these utilities are used for testing). BE CAREFUL when you're running these commands and make sure you know what they're doing.


### Register Internal OAuth Client

As a Gen3 commons administrator, if you want to create an OAuth client that skips the user consent step, use the following command:

```bash
fence-create client-create --client CLIENT_NAME --urls OAUTH_REDIRECT_URL --username USERNAME --auto-approve (--expires-in 30)
```

The optional `--expires-in` parameter allows specifying the number of days until this client expires.

### Register an Implicit OAuth Client

As a Gen3 commons administrator, if you want to create an implicit OAuth client for a webapp:

```bash
fence-create client-create --client fancywebappname --urls 'https://betawebapp.example/fence
https://webapp.example/fence' --public --username fancyapp --grant-types authorization_code refresh_token implicit
```

If there is more than one URL to add, separate them with spaces like this:

```bash
fence-create client-create --urls 'https://url1/' 'https://url2/' --client ...
```

To specify allowed scopes, use the `--allowed-scopes` argument:

```bash
fence-create client-create ... --allowed-scopes openid user data
```

### Register an OAuth Client for a Client Credentials flow

The OAuth2 Client Credentials flow is used for machine-to-machine communication and scenarios in which typical authentication schemes like username + password do not make sense. The system authenticates and authorizes the app rather than a user. See the [OAuth2 specification](https://www.rfc-editor.org/rfc/rfc6749#section-4.4) for more details.

As a Gen3 commons administrator, if you want to create an OAuth client for a client credentials flow:

```bash
fence-create client-create --client CLIENT_NAME --grant-types client_credentials (--expires-in 30)
```

This command will return a client ID and client secret, which you can then use to obtain an access token:

```bash
curl --request POST https://FENCE_URL/oauth2/token?grant_type=client_credentials -d scope="openid user" --user CLIENT_ID:CLIENT_SECRET
```
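
The same token request can be sketched in Python using only the standard library. This mirrors the `curl` call above (HTTP Basic authentication with the client ID and secret, `scope` in the form body); `build_token_request` is a hypothetical helper name, and `https://fence.example` is a placeholder host.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(fence_url, client_id, client_secret, scope="openid user"):
    """Build (but do not send) a client_credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        # Equivalent of curl's --user CLIENT_ID:CLIENT_SECRET
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    url = f"{fence_url.rstrip('/')}/oauth2/token?grant_type=client_credentials"
    body = urlencode({"scope": scope}).encode("ascii")  # equivalent of -d scope=...
    return Request(url, data=body, headers=headers, method="POST")


request = build_token_request("https://fence.example", "CLIENT_ID", "CLIENT_SECRET")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Per the OAuth2 specification, a successful response is a JSON document containing an `access_token` field.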

The optional `--expires-in` parameter allows specifying the number of *days* until this client expires. The recommendation is to rotate credentials with the `client_credentials` grant at least once a year (see [Rotate client credentials](#rotate-client-credentials) section).

NOTE: In Gen3, you can grant specific access to a client the same way you would to a user. See the [user.yaml guide](https://github.com/uc-cdis/fence/blob/master/docs/user.yaml_guide.md) for more details.

NOTE: Client credentials tokens are not linked to a user (unlike other tokens, their claims contain no `sub` or `context.user.name`). Some Gen3 endpoints, namely those that assume the token is linked to a user or whose logic requires a user, do not support them. For an example of how to adapt an endpoint to support client credentials tokens, see [here](https://github.com/uc-cdis/requestor/commit/a5078fae27fa258ac78045cf2bb89cb2104f53cf). For an example of how to explicitly reject client credentials tokens, see [here](https://github.com/uc-cdis/requestor/commit/0f4974c25343d2185c7cdb48dcdeb58f97800672).
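
To illustrate the note above, the following sketch inspects a token's claims and applies that rule: no `sub` and no `context.user.name` suggests a client credentials token. It decodes the JWT payload without verifying the signature, so it is suitable for debugging only and must not be used to make authorization decisions. Both function names are hypothetical.

```python
import base64
import json


def decode_claims_unverified(token):
    """Decode a JWT payload WITHOUT verifying its signature (inspection only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def is_client_credentials_token(claims):
    """Heuristic: the claims carry neither `sub` nor `context.user.name`."""
    has_user_name = "name" in claims.get("context", {}).get("user", {})
    return "sub" not in claims and not has_user_name
```

An endpoint that needs a user could call such a check early and reject the request with a clear error rather than failing later on a missing `sub`.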

### Modify OAuth Client

```bash
fence-create client-modify --client CLIENT_NAME --urls http://localhost/api/v0/oauth2/authorize
```

That command should output any modifications to the client. As with `client-create`, multiple URLs are
allowed here.

Add the `--append` argument to add new callback URLs or allowed scopes to an existing client (instead of replacing them), using `--append --urls` or `--append --allowed-scopes`:

```bash
fence-create client-modify --client CLIENT_NAME --urls http://localhost/api/v0/new/oauth2/authorize --append (--expires-in 30)
```

### Rotate client credentials

Use the `client-rotate` command to receive a new set of credentials (client ID and secret) for a client. The old credentials are NOT deactivated and must be deleted or expired separately (see [Delete Expired OAuth Clients](#delete-expired-oauth-clients) section). This allows for a rotation without downtime.

```bash
fence-create client-rotate --client CLIENT_NAME (--expires-in 30)
```

Note that the `usersync` job must be run after rotating the credentials so that the new client ID is granted the same access as the old one.

### Delete OAuth Client

```bash
fence-create client-delete --client CLIENT_NAME
```
That command should output the result of the deletion attempt.

### Delete Expired OAuth Clients

```bash
fence-create client-delete-expired
```

To post a warning in Slack about any clients that expired or are about to expire:

```bash
fence-create client-delete-expired --slack-webhook <url> --warning-days <default 7: only post about clients expiring in under 7 days>
```
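
The effect of `--warning-days` can be pictured with the sketch below: given a list of clients and their expiration times, select those that have already expired or will expire within the warning window. This is an illustration of the idea only, not the logic `fence-create` actually runs; `clients_to_warn_about` and the `(name, expires_at)` tuples are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def clients_to_warn_about(clients, warning_days=7, now=None):
    """Return names of clients that expired or expire within `warning_days`.

    `clients` is an iterable of (name, expires_at) pairs, where `expires_at`
    is a timezone-aware datetime, or None for clients that never expire.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now + timedelta(days=warning_days)
    return [
        name
        for name, expires_at in clients
        if expires_at is not None and expires_at <= cutoff
    ]
```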


### List OAuth Clients

```bash
fence-create client-list
```
That command should output the full records for any registered OAuth clients.

### Set up for External Buckets on Google

```bash
fence-create link-external-bucket --bucket-name demo-bucket
fence-create link-bucket-to-project --bucket_id demo-bucket --bucket_provider google --project_auth_id test-project
```

The `link-external-bucket` command returns the email of a Google group, which needs to be granted access to the bucket `demo-bucket`.

### Notify users who are blocking service account registration

```bash
fence-create notify-problem-users --emails ex1@gmail.com ex2@gmail.com --auth_ids test --google_project_id test-google
```

`notify-problem-users` emails users in the provided list (each entry can be a fence user email or a linked Google email) who do not have access to any of the auth_ids provided. It also accepts a `check_linking` flag to check that each user has linked their Google account.
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@ The `/login/shib` endpoint accepts the query parameter `shib_idp`. Fence checks

After the user logs in and is redirected to `/login/shib/login`, we get the `eppn` (EduPerson Principal Name) from the request headers to use as username. If the `eppn` is not available, we use the `persistent-id` (or `cn`) instead.
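
The fallback described above can be sketched as a simple ordered lookup. The header names come from the paragraph; trying `persistent-id` before `cn` is an assumption, and `username_from_shib_headers` is a hypothetical helper rather than Fence's own function.

```python
def username_from_shib_headers(headers):
    """Pick a username from Shibboleth headers: `eppn` first, then fallbacks."""
    for key in ("eppn", "persistent-id", "cn"):
        value = headers.get(key)
        if value:  # skip missing or empty headers
            return value
    return None


print(username_from_shib_headers({"eppn": "", "persistent-id": "abc123"}))
```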

-![Shibboleth Login Flow](images/seq_diagrams/shibboleth_flow.png)
+![Shibboleth Login Flow](../images/seq_diagrams/shibboleth_flow.png)

Notes about the NIH login implementation:
- NIH login is used as the default when the `idp` is fence and no `shib_idp` is specified (for backwards compatibility).
@@ -32,7 +32,7 @@ Notes about the NIH login implementation:

### In the multi-tenant Fence instance

-The [Shibboleth dockerfile](../DockerfileShib) image is at https://quay.io/repository/cdis/fence-shib and is NOT compatible yet with python 3/the latest Fence (for now, use Fence 2.7.x).
+The [Shibboleth dockerfile](../../DockerfileShib) image is at https://quay.io/repository/cdis/fence-shib and is NOT compatible yet with python 3/the latest Fence (for now, use Fence 2.7.x).

The deployment only includes `revproxy` and `fenceshib`. The Fence configuration enables the `shibboleth` provider:

Original file line number Diff line number Diff line change
@@ -25,19 +25,19 @@ References:

This shows external DRS Client(s) communicating with Gen3 Framework Services (as a GA4GH DRS Server) and how G3FS interacts with Passport Brokers to validate and verify JWTs.

-![Passport and Visa JWT Handling](images/ga4gh/passport_jwt_handling.png)
+![Passport and Visa JWT Handling](../images/ga4gh/passport_jwt_handling.png)

## G3FS: Configurable Roles for Data Access

Gen3 Framework Services are capable of acting in many different roles. As data repositories (or DRS Servers in GA4GH terminology), as authorization decision makers (GA4GH Claims Clearinghouses), and/or as token issuers (GA4GH Passport Brokers). G3FS is also capable of being a client to other Passport Brokers. G3FS must be a client to an upstream Identity Provider (IdP) as it does not ever store user passwords but relies on authentication from another trusted source.

In order to describe the role of the passport in these various configurations, the following diagrams may help.

-![Gen3 as DRS Server](images/ga4gh/gen3_as_drs.png)
+![Gen3 as DRS Server](../images/ga4gh/gen3_as_drs.png)

-![Gen3 as Client](images/ga4gh/gen3_as_client.png)
+![Gen3 as Client](../images/ga4gh/gen3_as_client.png)

-![Gen3 as Both](images/ga4gh/gen3_as_client_and_drs_server.png)
+![Gen3 as Both](../images/ga4gh/gen3_as_client_and_drs_server.png)

## Performance Improvements

@@ -52,22 +52,22 @@ We added a number of things to mitigate the performance impact on researchers' w

To illustrate the need for such a cache, see the images below for before and after.

-![Before Caching](images/ga4gh/caching_before.png)
+![Before Caching](../images/ga4gh/caching_before.png)

-![After Caching](images/ga4gh/caching_after.png)
+![After Caching](../images/ga4gh/caching_after.png)

## User Identities

Different GA4GH Visas may refer to the same subject differently. In order to maintain the known mappings between different representations of the same identity, we are creating an Issuer+Subject to User mapping table. The primary key on this table is the combination of the `iss` and `sub` from JWTs.
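
As a rough in-memory analogue of that mapping table, the sketch below keys a dictionary on the `(iss, sub)` pair, so the same subject string from two different issuers never collides. The real mapping lives in a database table; the function names, issuer URLs, and user IDs here are hypothetical.

```python
# (iss, sub) -> internal user ID; the tuple plays the role of the composite primary key.
identity_map = {}


def link_identity(iss, sub, user_id):
    """Record that the subject `sub` from issuer `iss` is the user `user_id`."""
    identity_map[(iss, sub)] = user_id


def lookup_user(iss, sub):
    """Return the mapped user ID, or None if this identity is unknown."""
    return identity_map.get((iss, sub))


link_identity("https://broker-a.example", "12345", user_id=1)
link_identity("https://broker-b.example", "alice@example.org", user_id=1)
link_identity("https://broker-b.example", "12345", user_id=2)

print(lookup_user("https://broker-a.example", "12345"))
```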

-![User Identities](images/ga4gh/users.png)
+![User Identities](../images/ga4gh/users.png)

## Backend Updates and Expiration

In order to ensure the removal of access at the right time, the cronjobs we have are updated based on the figure and notes below. We are requiring movement away from the deprecated, legacy, limited Fence authorization support in favor of the new policy engine (which allows expiration of policies out of the box).

There is an argument here for event-based architecture, but Gen3 does not currently support such an architecture. We are instead extending the support of our cronjobs to ensure expirations occur at the right time.

-![Cronjobs and Expirations](images/ga4gh/expiration.png)
+![Cronjobs and Expirations](../images/ga4gh/expiration.png)

> _All diagrams are originally from an **internal** CTDS Document. The link to that document is [here](https://lucid.app/lucidchart/5c52b868-5cd2-4c6e-b53b-de2981f7da98/edit?invitationId=inv_9a757cb1-fc81-4189-934d-98c3db06d2fc) for internal people who need to edit the above diagrams._