This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

Prevent linear memory growth of typechecking cache / correct cache purging #194

Merged
merged 1 commit into from
Jun 14, 2017

Conversation

emidoots
Member

@emidoots emidoots commented Jun 14, 2017

Overview

Typechecking used to cause linear memory growth; now it doesn't. In the docker repository, editing and hovering 5 times (to trigger typechecking) produces:

Without this change:

  1. 719MB
  2. 1.21GB
  3. 1.89GB
  4. 2.24GB
  5. 2.51GB

With this change:

  1. 655MB
  2. 776MB
  3. 787MB
  4. 787MB
  5. 787MB

That is, we used to grow by roughly 500MB per typecheck operation; now memory stays flat at around 790MB.

Why didn't I make this change earlier?

I had actually planned to remove the cache entirely, but after a change in direction in how I will tackle the slowness of go-to-definition and hover, removing the cache entirely is no longer possible. This is the next best thing, and it is still a HUGE improvement.

Explanation

Each `LangHandler` has its own cache (see `(*LangHandler).resetCaches`), but they are all backed by the same LRU cache (see `cache.go`). Every cache that wraps the LRU cache (see `newTypecheckCache`) uses a global cache ID to avoid key conflicts:

```go
// cacheID is used to prevent key conflicts between different
// LangHandlers in the same process.
cacheID int64
```

That is, we share one backing LRU cache but keep each handler's data separate. The intent, I presume, is to keep memory usage lower (two separate LRU caches would hold up to twice as many objects overall). At first glance this seems like a good idea, but issues crop up when it comes to purging the cache. For example:

```go
func (c *boundedCache) Purge() {
	// c.c is a process level cache. Since c.id is part of the cache keys,
	// we can just change its values to make it seem like we have purged
	// the cache.
	c.mu.Lock()
	c.id = nextCacheID()
	c.mu.Unlock()
}
```

This code assumes that we can purge the cache by incrementing the cache ID. And that's true -- data stored under the old cache ID will never be returned. However, it doesn't actually purge the old cache contents from memory!

In the case of sourcegraph.com, not purging things from memory is OK (we expect our process to hold on to all of its memory, basically). But for local editors using the language server, it isn't. Every time the user modifies a file, the cache is purged (because the typechecking data is now invalid). Since purging doesn't actually clear contents from memory, we end up storing (by default) up to 500 typechecked copies of the user's program. For something like docker, these cache objects can be up to 500MB each.

Given this is the case, we should really purge the cache contents instead of just invalidating them via ID incrementing. This will actually help lower memory consumption on sourcegraph.com, too, so it's a win-win situation.
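For illustration, here is a minimal, self-contained sketch of that idea: purging by deleting this handler's entries from the shared cache, rather than only bumping the ID. It uses a map-backed stand-in for the real LRU, and the names (`sharedCache`, `cacheKey`) are hypothetical, not the repo's actual types:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// globalCacheID mimics the process-wide counter behind nextCacheID.
var globalCacheID int64

func nextCacheID() int64 { return atomic.AddInt64(&globalCacheID, 1) }

// cacheKey namespaces entries by handler ID, like the real cache keys.
type cacheKey struct {
	id  int64
	key string
}

// sharedCache is a map-backed stand-in for the process-level LRU.
type sharedCache struct {
	mu sync.Mutex
	m  map[cacheKey]interface{}
}

func newSharedCache() *sharedCache {
	return &sharedCache{m: map[cacheKey]interface{}{}}
}

// boundedCache is one handler's view onto the shared cache.
type boundedCache struct {
	mu sync.Mutex
	id int64
	c  *sharedCache
}

func (c *boundedCache) Set(k string, v interface{}) {
	c.mu.Lock()
	id := c.id
	c.mu.Unlock()
	c.c.mu.Lock()
	c.c.m[cacheKey{id, k}] = v
	c.c.mu.Unlock()
}

// Purge deletes this handler's entries from the shared cache so the
// memory is actually released, then rotates the ID as before.
func (c *boundedCache) Purge() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.c.mu.Lock()
	defer c.c.mu.Unlock()
	for k := range c.c.m {
		if k.id == c.id {
			delete(c.c.m, k)
		}
	}
	c.id = nextCacheID()
}

func main() {
	shared := newSharedCache()
	a := &boundedCache{id: nextCacheID(), c: shared}
	b := &boundedCache{id: nextCacheID(), c: shared}
	a.Set("pkg", "typecheck data A")
	b.Set("pkg", "typecheck data B")
	a.Purge()
	fmt.Println(len(shared.m)) // prints 1: b's entry survives, a's is gone
}
```

The key difference from the old `Purge` is the delete loop: the ID bump alone only made old entries unreachable, while the loop actually drops them from the backing store.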

Helps microsoft/vscode-go#853
Helps #178

@emidoots emidoots assigned sqs and neelance and unassigned sqs Jun 14, 2017
@neelance

Nice explanation. Question: Do you expect any issues by the cleanup loop holding the lock for so long?

@emidoots
Member Author

> Do you expect any issues by the cleanup loop holding the lock for so long?

Good question. I don't expect any issues to come from that, no.

  1. It's always going to be a small loop (e.g. the default cache length is 500 objects) so it's not really doing that much work (but I didn't measure).
  2. I believe for the case of sourcegraph.com, the cache is only ever purged if the file is modified in the VFS, which we don't allow currently AFAIK.

@neelance

Okay, then this should be fine.

@emidoots emidoots merged commit fe8fc70 into master Jun 14, 2017
@emidoots emidoots deleted the sg/proper-cache-purge branch June 14, 2017 22:13