Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support error/info log desensitization #18566

Closed
kolbe opened this issue Jul 14, 2020 · 19 comments · Fixed by #19392 or #19409
Closed

Support error/info log desensitization #18566

kolbe opened this issue Jul 14, 2020 · 19 comments · Fixed by #19392 or #19409
Labels
feature/accepted This feature request is accepted by product managers priority/P0 The issue has P0 priority. type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@kolbe
Copy link
Contributor

kolbe commented Jul 14, 2020

Feature Request

Is your feature request related to a problem? Please describe:

Sensitive data can be written to the error log in many cases. This is problematic for environments where strict controls must be placed on sensitive, identifying, or financial data.

Describe the feature you'd like:

We should ensure that sensitive data is not written to error and other informational logs. These are different from logs that are explicitly enabled by the user (the general query log, slow query log, etc.), for which desensitization procedures would be different.

To resolve this issue will require an extensive audit of how we write log entries. There are many cases where the contents of SQL statements are written to the log (permission failures, syntax errors, etc.), and all of those need to be dealt with to resolve this task.

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

@kolbe kolbe added the type/feature-request Categorizes issue or PR as related to a new feature. label Jul 14, 2020
@ghost
Copy link

ghost commented Jul 14, 2020

I think the only way you can really be sure to do this safely, is the error log can not output any user input directly. It has to be the prepared statement form of the query.

The slow query log needs to be off by default, and it should state that it outputs raw user data in an unencrypted form.

@kolbe
Copy link
Contributor Author

kolbe commented Jul 14, 2020

In particular, server/conn.go is often responsible for printing the full query text if there is an execution problem. This would be a good place to start with masking query content. Perhaps a system variable to control whether error log output is desensitized (default ON) and some changes to func (cc getLastStmtInConn) String() string (https://github.com/pingcap/tidb/blob/release-4.0/server/conn.go#L1604) would be a good start.

@nolouch
Copy link
Member

nolouch commented Jul 31, 2020

closed by #18578

@nolouch nolouch closed this as completed Jul 31, 2020
@zz-jason
Copy link
Member

zz-jason commented Aug 6, 2020

when tidb_log_desensitization is set to 1, the duplicated key is printed to the log. I'm going to reopen this since it has not been fixed totally.

Here is an example:

create table t(a bigint primary key, b bigint);
insert into t values(1, 1);
insert into t values(1, 1); -- returns: ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

the log printed for the last insert statement:

[2020/08/06 08:52:21.984 +08:00] [WARN] [session.go:471] ["can not retry txn"] [conn=3] [label=general] [error="[kv:1062]Duplicate entry '1' for key 'PRIMARY'"] [IsBatchInsert=false] [IsPessimistic=false] [InRestrictedSQL=false] [tidb_retry_limit=10] [tidb_disable_txn_auto_retry=true]
[2020/08/06 08:52:21.984 +08:00] [WARN] [session.go:487] ["commit failed"] [conn=3] ["finished txn"="Txn{state=invalid}"] [error="[kv:1062]Duplicate entry '1' for key 'PRIMARY'"]
[2020/08/06 08:52:21.984 +08:00] [ERROR] [conn.go:744] ["command dispatched failed"] [conn=3] [connInfo="id:3, addr:127.0.0.1:55984 status:10, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:0, autocommit:1"] [sql="insert into t values ( ... )"] [txn_mode=OPTIMISTIC] [err="[kv:1062]Duplicate entry '1' for key 'PRIMARY'"]
[2020/08/06 08:52:21.985 +08:00] [INFO] [2pc.go:739] ["2PC clean up done"] [conn=3] [txnStartTS=418558808419729408]

@zz-jason zz-jason reopened this Aug 6, 2020
@zz-jason zz-jason added the feature/accepted This feature request is accepted by product managers label Aug 6, 2020
@zz-jason
Copy link
Member

zz-jason commented Aug 6, 2020

@zanmato1984 PTAL, thanks.

@zz-jason
Copy link
Member

zz-jason commented Aug 6, 2020

maybe we can write a tool to monitor the log file. report a security issue if it prints data in the table or from the query.

@zz-jason
Copy link
Member

zz-jason commented Aug 7, 2020

when tidb_log_desensitization is set to 1, the duplicated key is printed to the log. I'm going to reopen this since it has not been fixed totally.

Here is an example:

create table t(a bigint primary key, b bigint);
insert into t values(1, 1);
insert into t values(1, 1); -- returns: ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

the log printed for the last insert statement:

[2020/08/06 08:52:21.984 +08:00] [WARN] [session.go:471] ["can not retry txn"] [conn=3] [label=general] [error="[kv:1062]Duplicate entry '1' for key 'PRIMARY'"] [IsBatchInsert=false] [IsPessimistic=false] [InRestrictedSQL=false] [tidb_retry_limit=10] [tidb_disable_txn_auto_retry=true]
[2020/08/06 08:52:21.984 +08:00] [WARN] [session.go:487] ["commit failed"] [conn=3] ["finished txn"="Txn{state=invalid}"] [error="[kv:1062]Duplicate entry '1' for key 'PRIMARY'"]
[2020/08/06 08:52:21.984 +08:00] [ERROR] [conn.go:744] ["command dispatched failed"] [conn=3] [connInfo="id:3, addr:127.0.0.1:55984 status:10, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:0, autocommit:1"] [sql="insert into t values ( ... )"] [txn_mode=OPTIMISTIC] [err="[kv:1062]Duplicate entry '1' for key 'PRIMARY'"]
[2020/08/06 08:52:21.985 +08:00] [INFO] [2pc.go:739] ["2PC clean up done"] [conn=3] [txnStartTS=418558808419729408]

It exposes another issue: we can reduce the logs of Duplicate entry error. #19052 was created to track this log expansion.

@zz-jason
Copy link
Member

zz-jason commented Aug 7, 2020

I noticed that it's a P0 sub-task in #18084, also we'd better backport this security improvement to release 4.0

@zz-jason zz-jason added the priority/P0 The issue has P0 priority. label Aug 7, 2020
@zz-jason
Copy link
Member

zz-jason commented Aug 7, 2020

we also need to consider the debug level logs.

@zz-jason
Copy link
Member

zz-jason commented Aug 7, 2020

the privilege of turning the global variable tidb_log_desensitization also needs careful design. for example, only users with super privilege can turn on or turn off this variable.

@ghost
Copy link

ghost commented Aug 7, 2020

we also need to consider the debug level logs.

I think it might be more practical to document that it is only supported for INFO and higher levels? Debug implies verbose, so as a user I don't have a full expectation I can have both. But for safety, it might be useful to handle this in config check, and refuse to allow DEBUG if tidb_log_desensitization is enabled warn when attempting to enable tidb_log_desensitization w/DEBUG see comment here.

@nolouch
Copy link
Member

nolouch commented Aug 7, 2020

cc @crazycs520

@kolbe
Copy link
Contributor Author

kolbe commented Aug 7, 2020

@nullnotnil I agree. I like the idea of disallowing DEBUG logs if the tidb_log_desensitization is set.

@zz-jason only a user with SUPER can change the value of a GLOBAL variable, so this is already the case, but good point and it's important to mention!

@crazycs520
Copy link
Contributor

I noticed that it's a P0 sub-task in #18084, also we'd better backport this security improvement to release 4.0

Already cherry-pick to release-4.0: #18581

@crazycs520
Copy link
Contributor

we also need to consider the debug level logs.

I think it might be more practical to document that it is only supported for INFO and higher levels? Debug implies verbose, so as a user I don't have a full expectation I can have both. But for safety, it might be useful to handle this in config check, and refuse to allow DEBUG if tidb_log_desensitization is enabled.

This check is added in #19103

@SunRunAway
Copy link
Contributor

After #20436 merged, we are now working together to review all errors which may have sensitive information and redact them. See https://docs.google.com/document/d/1dOVzkaahH-P8fTYgtuBdKRTRlT8BEV7DM8LJPQwZaAQ/edit#heading=h.x2vdsn83mi0d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/accepted This feature request is accepted by product managers priority/P0 The issue has P0 priority. type/feature-request Categorizes issue or PR as related to a new feature.
Projects
None yet
7 participants