Ensure caches are not used unsafely #10691
Conversation
Resolved review threads:
- core/trino-main/src/main/java/io/trino/server/security/oauth2/OAuth2TokenExchange.java
- plugin/trino-hive/src/main/java/io/trino/plugin/hive/CachingDirectoryLister.java
- ...-raptor-legacy/src/main/java/io/trino/plugin/raptor/legacy/storage/ShardRecoveryManager.java
- ...-raptor-legacy/src/main/java/io/trino/plugin/raptor/legacy/storage/ShardRecoveryManager.java
Force-pushed from 1ea184f to 6eb77b1
```diff
@@ -64,21 +61,12 @@
     private final BigQuery bigQuery;
     private final ViewMaterializationCache materializationCache;
     private final boolean caseInsensitiveNameMatching;
-    private final Cache<String, Optional<RemoteDatabaseObject>> remoteDatasets;
-    private final Cache<TableId, Optional<RemoteDatabaseObject>> remoteTables;
```
cc @hashhar, see "Remove unused caches in BigQueryClient".
🤦‍♂️ Indeed. It has been present since the initial commit. It computes a fresh mapping near `// explicitly cache the information if the requested dataset doesn't exist` but doesn't put it into the cache at all.
I'd prefer fixing this instead of removing it, though (in a separate PR).
> I'd prefer fixing this instead of removing it, though (in a separate PR).

Understood, except that I didn't want to spend time fixing this particular one. I can drop the change. Note that this is probably not the only problem: what about cache invalidation?
> Remove unused caches in BigQueryClient

Dropped.
@grantatspothero is both fixing this to actually populate the cache and converting it to use `EvictableCache` to prevent leaking keys that are already being loaded.
Note that we don't have a procedure to clear the entire cache, so it's a better situation than the JDBC connectors.
Let's remove the cache for now. We ran into some issues: bulk loading not playing well with loading caches, no invalidation logic when we make changes via Trino, etc.
@findepi Can you restore your commit? Sorry for the back and forth.
> We ran into some issues: bulk loading not playing well with loading caches, no invalidation logic when we make changes via Trino, etc.

@hashhar @grantatspothero bulk list + `cache.put` is inherently not playing well with invalidation. In JDBC, we "solved" this problem by not having invalidation (`CachingIdentifierMapping`).
cc @kokosing
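The hazard described above can be sketched with plain JDK types standing in for the Guava cache (all names here are hypothetical, not Trino code): a bulk listing takes a snapshot of the remote state, an invalidation triggered by a concurrent drop lands in between, and the subsequent bulk `put` reinstates the stale entry. The interleaving is simulated sequentially to make it deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "bulk list + cache.put" vs. invalidation race.
public class BulkPutRace
{
    static final Map<String, String> remote = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static Map<String, String> bulkList()
    {
        return new HashMap<>(remote); // snapshot of the remote catalog
    }

    public static void main(String[] args)
    {
        remote.put("t1", "meta1");

        // Thread A: bulk-lists the remote and holds the snapshot...
        Map<String, String> snapshot = bulkList();

        // Thread B: drops t1 via Trino and invalidates the cache entry.
        remote.remove("t1");
        cache.remove("t1"); // invalidation

        // Thread A: ...and only now writes its (stale) snapshot into the cache.
        cache.putAll(snapshot);

        // The invalidation was lost: the cache again claims t1 exists.
        System.out.println(cache.containsKey("t1") && !remote.containsKey("t1")); // prints true
    }
}
```

This is why "not having invalidation" sidesteps the problem: with no `remove` step, the stale window simply cannot occur, at the cost of never reflecting remote changes.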
> Can you restore your commit?
Force-pushed from 6eb77b1 to 17fa8e1
```diff
@@ -496,6 +497,8 @@ public void removeOldTasks()
     try {
         DateTime endTime = taskInfo.getStats().getEndTime();
         if (endTime != null && endTime.isBefore(oldestAllowedTask)) {
+            // The removal here is concurrency safe with respect to any concurrent loads: the cache has no expiration,
+            // the taskId is in the cache, so there mustn't be an ongoing load.
             tasks.asMap().remove(taskId);
```
Would it make sense to assert that the task was actually removed, to follow what the comment above says?
Maybe, but I don't feel comfortable adding assertions in this class. Do you want to address this as a followup?
@findepi A pattern that often comes up with the connectors is that we want a cache that also keeps track of negative cache entries. For example, in the BigQuery connector we can only bulk list tables in the BigQuery API to populate the table name cache, so if someone requests reading a table that does not exist, we have to list all the tables in the schema. It would be nice to keep track of this nonexistent table so we do not keep hitting the BigQuery API. Is there a cache implementation we should be using for bulk loads that handles concurrency and invalidation of a negative cache without races?
Force-pushed from 17fa8e1 to 194cdef
Thanks!
Resolved review threads:
- lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/cache/InvalidateAllMode.java (outdated)
- lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/cache/InvalidateAllMode.java (outdated)
- lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/cache/SafeCaches.java
Force-pushed from 194cdef to 2b2c3b4
Bulk list is not bulk load: you don't ask a cache for multiple potentially existing entries at the same time.
Not sure about "negative cache". From the cache's perspective, there is no such thing, and "negativeness" is handled by using a marker, right? Let's continue BigQuery caching needs under the BigQuery caching PR.

Oh, I thought there was a PR. Let's talk under the issue #10697.
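The marker approach mentioned above can be sketched with JDK types (hypothetical names, a `ConcurrentHashMap` standing in for a Guava cache): a missing table is cached as `Optional.empty()`, so repeated lookups of a nonexistent table don't re-hit the remote API.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: "negative caching" via an Optional.empty() marker.
public class NegativeCache
{
    static final AtomicInteger remoteCalls = new AtomicInteger();
    static final Map<String, Optional<String>> cache = new ConcurrentHashMap<>();

    static Optional<String> lookup(String table)
    {
        // computeIfAbsent caches the absence marker too, so the second
        // lookup of a nonexistent table never reaches the remote API.
        return cache.computeIfAbsent(table, t -> {
            remoteCalls.incrementAndGet();
            // Pretend the remote side only has table t1.
            return t.equals("t1") ? Optional.of("meta1") : Optional.empty();
        });
    }

    public static void main(String[] args)
    {
        lookup("missing");
        lookup("missing"); // served from cache, no second remote call
        System.out.println(remoteCalls.get()); // prints 1
    }
}
```

From the cache's point of view this is an ordinary entry; the "negativeness" lives entirely in the `Optional` value, which is exactly the marker idea in the comment above.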
They were never written to, so always empty. This also removes `bigquery.case-insensitive-name-matching.cache-ttl` config property, which was accepted, but didn't do anything.
The cache has no `refreshAfterWrite`, so `asyncReloading` does not change anything.
`UniformNodeSelectorFactory` and `TopologyAwareNodeSelectorFactory` attempt to warn only once per 30 seconds about a node being inaccessible. Fix a race around that which allowed multiple threads to issue a warning at once.
The `.getIfPresent()`, compute, `.put()` sequence can be replaced with a single `.get(key, loader)`.
The cache has no expiration, so it's equivalent to `ConcurrentHashMap` with `Map.computeIfAbsent`.
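The equivalence claimed in the commit message can be illustrated with JDK types (a hypothetical example, not the actual Trino code): for a cache with no expiration and no eviction, `ConcurrentHashMap.computeIfAbsent` gives the same run-the-loader-once-per-key behavior.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a never-expiring cache.get(key, loader) collapses
// to ConcurrentHashMap.computeIfAbsent, which also runs the loader at
// most once per key, even under concurrent access.
public class NoExpirationCache
{
    static final AtomicInteger loads = new AtomicInteger();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static String get(String key)
    {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet(); // the "loader", executed once per key
            return "value-for-" + k;
        });
    }

    public static void main(String[] args)
    {
        get("a");
        get("a"); // cache hit, loader not re-run
        System.out.println(loads.get()); // prints 1
    }
}
```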
Guava's `Cache` and `LoadingCache` have concurrency issues around invalidation and ongoing loads. Ensure that
- code uses `EvictableCache` or `EvictableLoadingCache`, which fix the problem, or
- code uses the safety wrappers, `NonEvictableCache` and `NonEvictableLoadingCache`, which fail when unsafe invalidation is called.

Additionally, the interfaces have the unimplemented methods marked as `@Deprecated`, to signal the problem as early as possible.
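The safety-wrapper idea can be sketched with JDK types (a hypothetical `SafeMapCache`, not the actual `NonEvictableCache` from trino-plugin-toolkit): reads and loads pass through, while the unsafe invalidation operation fails fast instead of silently racing with ongoing loads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a "non-evictable" safety wrapper.
public class SafeMapCache<K, V>
{
    private final Map<K, V> delegate = new ConcurrentHashMap<>();

    public V get(K key, Function<K, V> loader)
    {
        return delegate.computeIfAbsent(key, loader);
    }

    public void invalidate(K key)
    {
        // Fail fast: invalidation on this cache is not concurrency safe.
        throw new UnsupportedOperationException("invalidation is not safe on this cache");
    }

    public static void main(String[] args)
    {
        SafeMapCache<String, Integer> cache = new SafeMapCache<>();
        System.out.println(cache.get("x", k -> 42)); // prints 42
        try {
            cache.invalidate("x");
        }
        catch (UnsupportedOperationException e) {
            System.out.println("invalidate rejected"); // unsafe call surfaces immediately
        }
    }
}
```

Failing fast turns a subtle stale-data race into an immediate, debuggable error, which matches the `@Deprecated` marking described above: the goal is to surface unsafe usage as early as possible.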
Force-pushed from 2b2c3b4 to 9c5c238
Rebased on top of #10725.
Follows #10512