-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support read and write lock in table level to reduce lock competition #3775
Conversation
ref #3743 |
fe/src/main/java/org/apache/doris/alter/MaterializedViewHandler.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @caiconghui , first of all, this change is worth it.
But I don't think this is a good way to achieve it. In the previous implementation, it was not a good design to expose the database lock outside, and there have been multiple deadlocks caused by lock nesting problems.
When the lock granularity is refined to the table level, this way of exposing locks will bring more error-proneness. And so many changes to locks in one PR are difficult to review.
I suggest to refer to the implementation of kudu MetadataGroupLock to abstract the operation of the lock and then gradually modify the relevant code.
I just reviewed a little, and submitted review accidentally. |
+1. |
It is hard to review the changed files at one time, so I classify the modified files into different groups. Please Pay more attention to Alter, Catalog, Load theme, which could be quite error prone. And now I don't change httpv2 code, because it is not used now, so we can modify it later. Stmt 10 DescribeStmt.java Backup 21 BackupHandler.java Catalog 24 Catalog.java Colocate 26 ColocateTableIndex.java Meta 28 Database.java MetaManager 34 TabletStatMgr.java Proc 39 EsPartitionsProcDir.java Check 47 CheckConsistencyJob.java Rest 49 GetDdlStmtAction.java Load System 71 ReportHandler.java Task 74 HadoopLoadPendingTask.java TransactionMgr 79 DatabaseTransactionMgr.java Rpc 82 FrontendServiceImpl.java Test Looking forwards to lively discussions. |
2c247dc
to
5c76212
Compare
f6a0178
to
2300dce
Compare
0aed452
to
d8f4bcd
Compare
Main Modification and Some Lock rule to discuss.
|
304942c
to
45ef112
Compare
45ef112
to
1585a01
Compare
db.writeUnlock(); | ||
} | ||
String tableName = stmt.getTableName().getTbl(); | ||
OlapTable olapTable = (OlapTable) db.getTableOrThrowException(tableName, TableType.OLAP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing if (olapTable.getState() != OlapTableState.NORMAL) {
check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
processDropMaterializedView has check table state operation
} | ||
} finally { | ||
db.writeUnlock(); | ||
switch (table.getType()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just lock the table outside the switch
, avoid call lock/unlock everywhere inside the processAlterOlapTable
and processAlterExternalTable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some function need db lock and table lock but some function only need table lock, what't more, If the nesting function is too deep, it is easy to repeat locking, if we lock small scope for modificaion, which will be more clear although we may write many lock code
break; | ||
} else if (alterClause instanceof ColumnRenameClause) { | ||
Catalog.getCurrentCatalog().renameColumn(db, table, (ColumnRenameClause) alterClause); | ||
db.writeLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better put these locks inside the renameTable
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
break; | ||
} else { | ||
Preconditions.checkState(false); | ||
table.writeLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put lock inside the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
} | ||
} | ||
|
||
private void processRename(Database db, Table table, List<AlterClause> alterClauses) throws DdlException { | ||
for (AlterClause alterClause : alterClauses) { | ||
if (alterClause instanceof TableRenameClause) { | ||
Catalog.getCurrentCatalog().renameTable(db, table, (TableRenameClause) alterClause); | ||
db.writeLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put lock inside the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -14,6 +14,22 @@ | |||
// KIND, either express or implied. See the License for the | |||
// specific language governing permissions and limitations | |||
// under the License. | |||
// Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this license is different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an incorrect modification.
} | ||
|
||
lock(dbs); | ||
List<Table> tables = Lists.newArrayList(tableMap.values()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this can get table list in order?
I think it is more safe and clear to change tableMap
to tableList
, and sort the tableList
explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, tableMap is a treeMap which is used to for previous analysis, the treeMap will keep table id in In ascending order,
} else { | ||
LOG.warn("failed to update backend report version, db {} does not exist", dbId); | ||
} | ||
atomicLong.set(newReportVersion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In original implementation, we use db locks to ensure mutual exclusion and order.
So here we may still need to add table locks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
atomicLong is AtomicLong,so I think it is meanless for add table lock or db lock?
Map<String, EtlPartitionConf> etlPartitions = createEtlPartitions(); | ||
Preconditions.checkNotNull(etlPartitions); | ||
taskConf.setEtlPartitions(etlPartitions); | ||
EtlTaskConf taskConf = new EtlTaskConf(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Table's lock should be held to call method like createEtlPartitions()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add table lock in createEtlPartitions
continue; | ||
} | ||
OlapTable olapTable = (OlapTable) table; | ||
olapTable.readLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it more safe to lock all tables at once outside the for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the final result is same if the table is dropped before loop or in loop, we just make sure that it is thread safe, and we has already add committed txn check if user want to drop table during loading data
6144347
to
1485ab0
Compare
fe/fe-core/src/main/java/org/apache/doris/common/util/MetaLockUtils.java
Show resolved
Hide resolved
@@ -265,17 +265,16 @@ private void executeDynamicPartition() { | |||
String tableName; | |||
boolean skipAddPartition = false; | |||
OlapTable olapTable; | |||
db.readLock(); | |||
olapTable = (OlapTable) db.getTable(tableId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When drop a table from database, a db lock will be held;
Why not get a db read lock here when get a table from database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wangbo
because when drop table, we need to prevent that other thread drop table or create table too. but for get table, we just need to ensure that the get table operation is thread safe, and the final result is ok, we don't purse the strict consistency here for better performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another consideration is to avoid deadlock, all lock sequence is db lock -> table lock -> other lock, and if we has get table lock and sometimes need to get table again from db, there may cause dead lock, so db get table operation not get db read lock anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get, you mean Performance is ahead of NPE
.
For the second point about lock sequence
.
I think it should become programming specifications for Doris.
1 If try to lock multiple tables or dbs, it should use getXXXInOrderMethod.
2 If try to lock different type locks, lock sequence should be guaranteed.
3 The appearance of nested locks of the same type should be avoided.
17c0d45
to
1a02c13
Compare
29ac965
to
56b8d31
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…apache#3775) This PR is to reduce lock competition by supporting read and write lock in table level. When we modify or read table's meta, we don't need to get database lock, just get table write or read lock. And when we get database lock, that means meta directly in db cannot be modified by other thread. Database lock only protect meta in Database class, while table lock protect meta in Table class. Change-Id: Icec6f39f58708950c786059edaeb474a8aa0e324
This PR is to reduce lock competition by supporting read and write lock in table level. When we modify or read table's meta, we don't need to get db lock, just get table write or read lock. And when we get db lock, that means meta directly in db cannot be modified by other thread. Db lock only protect meta in Database class, while table lock protect meta in Table class.