
feat: order chunks on demand #4773

Closed · wants to merge 15 commits

Conversation

@naalit naalit commented Jun 17, 2021

Contains

You'll probably notice that chunks near you often generate after chunks far away from you, especially if you're moving. Right now, the ChunkProcessingPipeline orders chunks by storing them in a priority queue indexed by their distance to a player. The problem is that the priority queue caches priorities when chunks are inserted, so when the player moves, a chunk that used to be far away and is now close still has a low priority.
In extreme cases, if you move right next to a chunk that used to be at the edge of your view distance, all of the chunks that are newly in range will generate before the one right next to you.

There are multiple ways to fix this problem, but what this PR does is replace the priority queue with a list that's updated and re-sorted every time the relevant chunks change. Then, instead of queueing large numbers of chunks for generation, the chunk generator threads pop elements off the list.

That involves changing more than it sounds like; there's a new InitialChunkProvider which the chunk generator threads query for the next chunk to generate, with one implementation for generating chunks around the player and another, backed by a priority queue, for loading chunks that have been sent from the server.
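To make the pull model concrete, here is a hypothetical sketch (the class and method names are invented, not the PR's actual code): pending chunk positions live in one list that is re-sorted against the player's current position whenever relevance changes, so stale cached priorities can't accumulate.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the on-demand ordering described above.
class NearestFirstChunkList {
    private final List<int[]> positions = new ArrayList<>(); // chunk coords {x, y, z}
    private int[] playerPos = {0, 0, 0};

    synchronized void add(int x, int y, int z) {
        positions.add(new int[]{x, y, z});
    }

    // Called whenever the relevant chunks change (e.g. the player moved):
    // the whole list is re-sorted against the *current* player position.
    synchronized void onPlayerMoved(int x, int y, int z) {
        playerPos = new int[]{x, y, z};
        positions.sort(Comparator.comparingLong(this::distanceSquared));
    }

    // Worker threads pop the nearest pending chunk on demand.
    synchronized Optional<int[]> next() {
        return positions.isEmpty() ? Optional.empty() : Optional.of(positions.remove(0));
    }

    private long distanceSquared(int[] p) {
        long dx = p[0] - playerPos[0];
        long dy = p[1] - playerPos[1];
        long dz = p[2] - playerPos[2];
        return dx * dx + dy * dy + dz * dz;
    }
}
```

Because workers take one element at a time, a re-sort between two pops immediately changes which chunk generates next — the behavior the queue-based version couldn't provide.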

How to test

Start a world and fly around; the generated chunks should follow you pretty well.

Outstanding before merging

@skaldarnar suggested looking into the impact on LOD chunks.

This is a fairly big change, involving multiple new classes, so it's important that people agree that this is the right way to go.

@github-actions github-actions bot added the Type: Improvement Request for or addition/enhancement of a feature label Jun 17, 2021

naalit commented Jun 17, 2021

The last commit slows down my game quite a bit, because it's taking so long to process generated chunks. A temporary solution is to reduce the number of chunk processing threads, but hopefully some of that work can be moved off the main thread.


class LocalChunkProviderTest {

private static final int WAIT_CHUNK_IS_READY_IN_SECONDS = 30;
private static final int WAIT_CHUNK_IS_READY_IN_SECONDS = 5;
Member:

This looks like a way more reasonable time to wait for chunks to be ready 👍

return chunk;
});
/**
* Load a chunk from disk or generate it, so that it's ready for processing in the chunk processing pipeline.
Member:

👍

private static final Logger logger = LoggerFactory.getLogger(ChunkProcessingPipeline.class);

private final InitialChunkProvider initialChunkProvider;
private final Object waitForNewChunks = new Object();
Member:

What is this Object used for? I think I've never seen a pure Object used anywhere before ... 👀

edit: ah, I see: https://www.baeldung.com/java-wait-notify

Contributor Author (@naalit, Jun 17, 2021):

Yeah, it's just for a signal. It would probably be clearer to use an actual Condition.
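The Condition-based alternative naalit mentions could look like this (a sketch with invented names, not the PR's code): the lock/Condition pair replaces `synchronized`/`wait`/`notify` on a bare Object, and the boolean guards against spurious wakeups.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative "new chunks are available" signal using an explicit Condition.
class NewChunkSignal {
    private final Lock lock = new ReentrantLock();
    private final Condition newChunks = lock.newCondition();
    private boolean available = false;

    void signalNewChunks() {
        lock.lock();
        try {
            available = true;
            newChunks.signalAll(); // wake all waiting worker threads
        } finally {
            lock.unlock();
        }
    }

    // Returns true once new chunks were signaled, false on timeout.
    boolean awaitNewChunks(long timeoutMillis) throws InterruptedException {
        lock.lock();
        try {
            while (!available) { // loop guards against spurious wakeups
                if (!newChunks.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false; // timed out
                }
            }
            available = false; // consume the signal
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```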

private static final Logger logger = LoggerFactory.getLogger(ChunkProcessingPipeline.class);

private final InitialChunkProvider initialChunkProvider;
private final Object waitForNewChunks = new Object();
private final Thread[] poolThreads = new Thread[NUM_TASK_THREADS];
Member:

Just call this threadPool? (I think "thread pool" is quite a common term, and that's probably also what I'd be searching/looking for when debugging this class).

Member:

It's such a common term that I get suspicious when I see us implementing our own like this.

Contributor Author (@naalit, Jun 18, 2021):

There used to also be a ThreadPoolExecutor, but now each worker thread fetches its own work. We actually don't need to store the threads in an array like this anymore, now that it doesn't interrupt them to stop them, so I removed the array entirely in the latest commit.
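The "each worker thread fetches its own work" model can be illustrated like this (names invented for the sketch): workers pull from a shared source until it's drained, instead of having tasks pushed into an executor, and they exit on their own rather than being interrupted.

```java
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative pull-based worker loop.
class PullWorkers {
    static int drain(Queue<Integer> work, int threadCount) throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                while (work.poll() != null) {     // pull the next task, if any
                    processed.incrementAndGet();  // stand-in for chunk processing
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // workers stop on their own once the source is empty
        }
        return processed.get();
    }
}
```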

}

private void runPoolThread() {
Preconditions.checkState(!stages.isEmpty(), "ChunkProcessingPipeline must to have at least one stage");
Member:

Suggested change
Preconditions.checkState(!stages.isEmpty(), "ChunkProcessingPipeline must to have at least one stage");
Preconditions.checkState(!stages.isEmpty(), "ChunkProcessingPipeline must have at least one stage");

Contributor Author:

👍

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;

public class ReceivedChunkInitialProvider implements InitialChunkProvider {
Member:

I'd appreciate some Javadoc similar to the interface itself. What does "received" stand for?

And should it be named

Suggested change
public class ReceivedChunkInitialProvider implements InitialChunkProvider {
public class ReceivedInitialChunkProvider implements InitialChunkProvider {

Contributor Author:

Does the javadoc added in the latest commit look good?

private BlockingQueue<Chunk> chunkQueue;

public ReceivedChunkInitialProvider(Comparator<Chunk> comparator) {
chunkQueue = new PriorityBlockingQueue<>(800, comparator);
Member:

I guess the 800 is any arbitrary number? Or is there more math behind it?

Contributor Author:

That's the initial capacity that was used for the priority queue in the previous ChunkProcessingPipeline; it probably doesn't actually need to be that high here.


@Override
public synchronized Chunk next(Set<Vector3ic> currentlyGenerating) {
return chunkQueue.poll();
Member:

Here we don't make use of currentlyGenerating - is this on purpose?

Member:

As far as I can tell, that parameter is the only thing that distinguishes this class from the Queue<Chunk> that is backing it.

Member:

oh, I see the other implementation of this interface does make use of it, though this does not.

Contributor Author:

In general, the client doesn't receive duplicate chunks, so it doesn't need to check; and if it did receive the same chunk from the server again before it's done processing, the new one is more up-to-date, so it should keep the new one.

Comment on lines 66 to 67
initialProvider = new ReceivedChunkInitialProvider(new LocalPlayerRelativeChunkComparator(localPlayer));
loadingPipeline = new ChunkProcessingPipeline(this::getChunk, initialProvider);
Member:

How does this even work? Do we have multiple RemoteChunkProviders, one per player? Or how are other players (clients) taken into account?

Contributor Author:

There's one instance of RemoteChunkProvider on each client, which receives chunks from the server and submits them to the pipeline. That's in contrast to the LocalChunkProvider, which runs on the server and generates chunks itself.

Comment on lines 32 to 36
relevanceSystem.neededChunks().forEach(chunkPos -> {
if (!chunksInRange.contains(chunkPos)) {
chunksInRange.add(chunkPos);
}
});
Member:

can neededChunks() contain duplicates? Otherwise, we could sort out the positions already contained in a filter step beforehand:

Suggested change
relevanceSystem.neededChunks().forEach(chunkPos -> {
if (!chunksInRange.contains(chunkPos)) {
chunksInRange.add(chunkPos);
}
});
relevanceSystem.neededChunks()
.filter(pos -> !chunksInRange.contains(pos))
.forEach(chunksInRange::add);

Contributor Author:

Good idea, that's a lot cleaner.

@skaldarnar skaldarnar added the Topic: WorldGen Requests, Issues and Changes related to facets, rasterizers, etc. label Jun 17, 2021
Member @keturn left a comment:

Wow, you weren't kidding when you wrote "this is a fairly big change."

requestCreatingOrLoadingArea(chunkPosition).get(WAIT_CHUNK_IS_READY_IN_SECONDS, TimeUnit.SECONDS);
requestCreatingOrLoadingArea(chunkPosition);
chunkProvider.notifyRelevanceChanged();
Thread.sleep(WAIT_CHUNK_IS_READY_IN_SECONDS * 1000);
Member:

We'll want to find a way to write these tests without Thread.sleep.

Member:

The difference between future.get(WAIT_…) and Thread.sleep(WAIT_…) is that the get will take at most that much time to execute, but the sleep will take at least that much time.
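The reviewer's distinction can be demonstrated with a toy future (not the chunk pipeline's code): `get(timeout)` treats the timeout as an upper bound and returns as soon as the result exists, while `Thread.sleep` always pays the full duration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// get(30s) on an already-completing future returns in milliseconds.
class WaitComparison {
    static long timedGetMillis() throws Exception {
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(() -> "chunk ready"); // completes almost instantly
        long start = System.nanoTime();
        future.get(30, TimeUnit.SECONDS); // waits at MOST 30s, here returns almost immediately
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

A `Thread.sleep(30_000)` in the same spot would block the test for the full 30 seconds regardless of when the chunk became ready.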

Contributor Author:

Yeah, it's annoying that they take so long now. Unfortunately there aren't any futures to wait on anymore, but I guess we could add an object to wait on in the ChunkProcessingPipeline, or use a loop.

Member:

I was going to ask about that: Why did you remove the futures?

My guess would have been because you only call the methods involved from a thread dedicated to doing that work, so they're all blocking and don't return until the result is available. But the number of times the tests rely on Thread.sleep suggests that's not the case.

Contributor Author:

You're right, there's just no need for futures currently. Everything is just scheduled for after the chunk is generated via stages. Futures are only needed in the tests, because the rest of the code just submits chunks for generation asynchronously.

Contributor Author:

I added another object which is notified when generation finishes, which fixes the tests.
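A common shape for "an object which is notified when generation finishes" in tests is a CountDownLatch (illustrative sketch, not the PR's actual class): the test awaits with an upper-bound timeout instead of sleeping a fixed interval.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical completion signal a test could await on.
class GenerationSignal {
    private final CountDownLatch done = new CountDownLatch(1);

    void onChunkGenerated() {
        done.countDown(); // called by the pipeline when the chunk is ready
    }

    // Returns false only if the timeout elapsed before generation finished.
    boolean awaitGenerated(long timeout, TimeUnit unit) throws InterruptedException {
        return done.await(timeout, unit);
    }
}
```

Like `future.get(timeout)`, `await` returns as soon as the work completes, so fast machines don't pay the worst-case wait.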


public final class ChunkProcessingInfo {
public final ReentrantLock lock = new ReentrantLock();
Member:

Adds a field with a lock, but the lock isn't used by any of this class's methods?

Contributor Author:

It's used by the ChunkProcessingPipeline worker threads so that they're not working on the same chunk at the same time. I should probably add a javadoc comment for that.
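How a per-chunk ReentrantLock keeps two workers off the same chunk can be sketched like this (method names invented for illustration): a worker that fails `tryLock()` skips the chunk instead of blocking on it.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative per-chunk guard: at most one worker processes a chunk at a time.
class ChunkWorkGuard {
    final ReentrantLock lock = new ReentrantLock();

    // Returns false if another worker already owns this chunk.
    boolean tryProcess(Runnable work) {
        if (!lock.tryLock()) {
            return false; // skip; some other worker is on this chunk right now
        }
        try {
            work.run(); // stand-in for running a pipeline stage on the chunk
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```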

Comment on lines 111 to 112
} catch (InterruptedException e) {
if (!executor.isTerminated()) {
logger.error("Reactor thread was interrupted", e);
}
reactor.interrupt();
Thread.currentThread().interrupt();
Member:

What InterruptedException would we be catching here? Something from synchronized?

What's the difference between catching it and calling .interrupt() again versus leaving it un-caught?

Contributor Author:

This should actually never happen now that it doesn't interrupt the worker threads to stop them anymore, but it used to be there for them to shut down gracefully.

Member:

What was the problem with using interrupt on the worker threads? That sounds like it was working as intended?

Contributor Author:

It would sometimes interrupt the threads while they were doing IO, which would at least cause some scary error messages. It wasn't actually simpler, and now that the waiting logic is correct, it's not needed anyway.

.map(pos -> getChunkBy(chunkProcessingInfo.getChunkTaskProvider(), pos))
.filter(Objects::nonNull)
.collect(Collectors.toSet());
if (providedChunks.size() == chunkTask.getRequirements().size()) {
Member:

What is this size comparison checking?

Contributor Author:

It's checking whether all the requirements are satisfied. This probably isn't the most readable way to do that, but I just moved this section of code from somewhere else.
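The size comparison passes only when no required position was filtered out as null, i.e. every requirement resolved to a chunk. A more self-explanatory form (illustrative, not the PR's code) checks that property directly:

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Strings stand in for chunk positions; the lookup returns null for missing chunks.
class RequirementsCheck {
    static boolean allRequirementsSatisfied(List<String> requirements,
                                            Function<String, Object> getChunk) {
        return requirements.stream()
                .map(getChunk)
                .allMatch(Objects::nonNull); // every required chunk must resolve
    }
}
```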

Comment on lines 16 to 22
boolean hasNext();

/**
* @param currentlyGenerating the set of chunks which are currently being processed.
* This lets the InitialChunkProvider discard those chunks before trying to generate them at all.
*/
Chunk next(Set<Vector3ic> currentlyGenerating);
Member:

next requires a parameter but hasNext does not?

Contributor Author:

The parameter on next is so that it can check whether a chunk is already processing before generating it; it doesn't really matter for hasNext, but because they're separate methods, hasNext can return true and then next can still return null. It might be better to just have one method which can return null if there are no new chunks, give both methods the same parameter, or make next return a Producer<Chunk> or something.

Contributor Author:

Okay, I just removed hasNext and made next return an Optional, which is simpler.
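The Optional-returning shape described here can be sketched as follows (the generic names are invented for illustration): a single method that is empty when nothing is pending, so a separate hasNext can never disagree with next.

```java
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.PriorityBlockingQueue;

// Illustrative single-method provider interface.
interface InitialChunkSource<C, P> {
    Optional<C> next(Set<P> currentlyGenerating);
}

// Toy queue-backed implementation (strings stand in for chunks/positions).
class QueueBackedSource implements InitialChunkSource<String, String> {
    private final PriorityBlockingQueue<String> queue = new PriorityBlockingQueue<>();

    void submit(String chunk) {
        queue.add(chunk);
    }

    @Override
    public Optional<String> next(Set<String> currentlyGenerating) {
        // poll() returns null when the queue is empty; Optional makes that explicit
        return Optional.ofNullable(queue.poll());
    }
}
```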



keturn commented Jun 18, 2021

At the fifty thousand foot level:

± I'm worried that the problem of "we have more work to do than we can finish quickly" is answered with "re-order the work so it looks like we finished quickly, but we actually have a lot to catch up on before anything else needs these CPU resources."

If that bit of smoke and mirrors improves the player experience and the maintenance cost is low, sure, go for it. But I don't think it's worth adding a lot of complexity to support.

+ Changing the "which chunk to load next" supplier from a queue to a lazily-evaluated function that figures out the answer on demand sounds like a good idea. It should remove that thorny question of how to manage that queue as relevance changes. As long as the time it takes to do that calculation is much less than the time it takes to generate a chunk, there should be no downside.

Regarding the implementation as it is at the moment:

Despite my optimism in the previous paragraph, the diff isn't convincing me that it's removing complexity.


keturn commented Jun 18, 2021

[note: this was written before your last three commits]

I'm sensing a theme here, which I'll summarize as: Take advantage of existing interfaces!

Want to have a function that does its work asynchronously and then do something with a result? Return a Future! (And yes, "it makes testing the code easier" is a valid consideration in interface design.)

Want to stop a thread in progress? Call its interrupt method! The fact that it does not wait for I/O to complete is a feature here, not a bug.

Want some threads to run tasks, taking new entries from a queue as they're available? Use a ThreadPoolExecutor!

Having an appropriate configuration for the size of the thread pool, and the ability to shut down threads when a user leaves the game, are the types of things we don't want each subsystem to make up its own interface for.


Having said that, it is true that ExecutorService is not designed as a pull interface. If you don't have events pushing tasks on to a queue, an Executor doesn't make sense. I think this is the part where we wish we were using a language with coroutines, or some sort of channels interface with controls for backpressure.
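For contrast with the pull-based worker loop, the standard push model keturn advocates looks like this (a minimal toy, not the engine's code): tasks are submitted to an ExecutorService, which owns the work queue, the pool size, and shutdown.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Push model: callers submit tasks; the executor schedules them on its pool.
class PushModelDemo {
    static int runTasks(int taskCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            int sum = 0;
            for (int i = 0; i < taskCount; i++) {
                Future<Integer> result = pool.submit(() -> 1); // push one task
                sum += result.get();
            }
            return sum;
        } finally {
            pool.shutdown(); // lets queued tasks finish, then stops the threads
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }
}
```

As keturn notes, this only fits when something is pushing tasks onto the queue; a source that must be polled on demand doesn't map onto it cleanly.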


naalit commented Jun 18, 2021

I'm worried that the problem of "we have more work to do than we can finish quickly" is answered with "re-order the work so it looks like we finished quickly, but we actually have a lot to catch up on before anything else needs these CPU resources."

If that bit of smoke and mirrors improves the player experience and the maintenance cost is low, sure, go for it. But I don't think it's worth adding a lot of complexity to support.

In general, I agree that real performance is a better goal. But at the same time, it's just not feasible to generate chunks fast enough that we can generate the whole view distance by the time the player gets to the edge of the world. In general, if the player is moving at all, chunk generation never finishes even for a second; that's true in other voxel games as well. So in my opinion, something like this is important.

In terms of complexity - I actually think this reduces net complexity. It could probably be even simpler, but the previous implementation was pretty complex.


Want to have a function that does its work asynchronously and then do something with a result? Return a Future! (And yes, "it makes testing the code easier" is a valid consideration in interface design.)

I actually don't think a Future makes sense here at all, because there's no function which would return it. There's no "submit chunk for processing" function, and in a real game worker threads start and stop on their own.

Want to stop a thread in progress? Call its interrupt method! The fact that it does not wait for I/O to complete is a feature here, not a bug.

What if a stage writes chunk data to disk? Now if the thread is interrupted, the file will probably be corrupted. That doesn't happen now and probably shouldn't, but even if interrupt would work, I don't think there's any reason to use it. It wouldn't actually simplify any code, because worker threads still have to check the condition variable since the interrupt doesn't do anything if they're not waiting.

In general: IMO the existing interfaces don't really match this use case, and even if they did, they wouldn't actually remove any complexity because the uses of those things here are very simple and don't contain much generalizable code. The thread pool, for example, can't be simplified at all because it still needs to specify how to get tasks and what to do with them, and the runPoolThread function just wraps that in a loop.


I hope that wasn't too long or too antagonistic; I do think your feedback is valuable, and I see where you're coming from.


DarkWeird commented Jun 18, 2021

@keturn

Having said that, it is true that ExecutorService is not designed as a pull interface. If you don't have events pushing tasks on to a queue, an Executor doesn't make sense. I think this is the part where we wish we were using a language with coroutines, or some sort of channels interface with controls for backpressure.

RxJava, or Java Quasar (coroutines/fibers/etc.)


keturn commented Jun 19, 2021

In general, if the player is moving at all, chunk generation never finishes even for a second; that's true in other voxel games as well.

Checking in to see if there's a terminology miscommunication lurking behind this: Are we talking about first-time generation of previously unexplored area, or the way we get chunks more generally, whether it's new generation or loading from a save?

I'm assuming that new generation speed << loading speed. There are a few weak spots in that assumption:

  • Is generation speed really slower? We have 3D chunks, so generating a chunk does not mean you need to generate the entire vertical from the bedrock to the heavens. Maybe a lot of chunks generate just as quickly as you could deserialize them.
  • Is loading speed really "fast enough"? IMHO loading speed should be >> player movement speed, but maybe that's not the case.

I actually think this reduces net complexity. It could probably be even simpler, but the previous implementation was pretty complex.

Yeah, that's part of the challenge of reviewing this. A lot of what's showing up in the diff is probably the status quo, and it's not fair to blame the PR for that, but I haven't sorted out which parts are the real changes yet.


keturn commented Jun 19, 2021

[side discussion on threads]

What if a stage writes chunk data to disk? Now if the thread is interrupted, the file will probably be corrupted.

Being most fluent in Python and having survived mostly by avoiding threads as much as possible, there are some details I'm fuzzy on. I think the root question I have here is: Is thread.interrupt() ever called by code we haven't explicitly written ourselves?

For example, if the application gets a SIGINT or SIGTERM in Linux (if I recall correctly, these signals in Linux can only be addressed to the process, not individual threads), how does it try to shutdown the threads?

If it does that with interrupt(), then we need to be sure we're handling interrupts safely in any case. And as long as we're handling them safely, we might as well use that mechanism instead of introducing a parallel one.

Admittedly "we need to safely handle interrupts during writes" is doing a lot of hand-waving there. How does that work? I'm searching, but as far as I can tell so far,

  • there's no mechanism to guard an InterruptableChannel from interruptions,
  • but Thread.interrupt often does not, in practice, interrupt I/O, because the JVM implementation may not have any way to interrupt a synchronous write() syscall on that operating system.
  • this chapter about thread interruption has both the "set a variable" and "use interrupt" techniques, but it's still not clear to me why you'd need your own variable if you can use interrupt. I think it's past lunchtime and my blood sugar is too low for this. (I got to that link as a google result, but I wouldn't be surprised if it's paywalled if we link directly to it here. There's certainly a paywall when I try to click through to the "interrupt will be useful with Chapter 10" at the end.)

worker threads still have to check the condition variable since the interrupt doesn't do anything if they're not waiting.

Can they use Thread.interrupted for that?


keturn commented Jun 19, 2021

(DarkWeird's recent comment about wanting to consolidate to a single thread pool is part of why I'm trying to nudge toward a more standard model.)


naalit commented Jun 20, 2021

Checking in to see if there's a terminology miscommunication lurking behind this: Are we talking about first-time generation of previously unexplored area, or the way we get chunks more generally, whether it's new generation or loading from a save?

I'm assuming that new generation speed << loading speed.

I was talking about new generation, and I'm also assuming that, although I haven't actually checked. I mean that when the player is exploring a new area, order can matter more than overall speed: even if generation is really fast, the game won't actually be able to generate all the chunks in range unless the player stops moving.

it's still not clear to me why you'd need your own variable if you can use interrupt.

Can they use Thread.interrupted for that?

I didn't realize Thread.interrupted worked like that! When you interrupt a thread that isn't waiting on anything, it doesn't stop the thread; but I didn't know that it does actually mark it as interrupted in a way that the thread can check. So yes, interrupt would actually work and avoid the condition variable.
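The flag-polling loop being described can be sketched like this (names invented for illustration): interrupt() sets a status flag that a running thread can poll, so the interrupt status itself serves as the stop condition instead of a separate variable.

```java
// Toy worker loop that stops when its thread's interrupt flag is set.
class InterruptibleWorker {
    static int countUntilInterrupted(int limit) {
        int iterations = 0;
        while (!Thread.currentThread().isInterrupted() && iterations < limit) {
            iterations++; // stand-in for one unit of chunk work
        }
        return iterations;
    }
}
```

Note that isInterrupted() leaves the flag set, while the static Thread.interrupted() checks and clears it — a worker that catches InterruptedException usually calls Thread.currentThread().interrupt() to restore the flag for its caller.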

As for whether it's safe, and whether non-user code can interrupt threads, I'm still not really sure. Based on this SO answer it sounds like the JVM won't interrupt threads on shutdown, it will just stop running them, so if they were in the middle of writing data that would probably still corrupt it. Regardless, switching to interrupt would work right now, since chunk processing threads don't currently do any writing.


naalit commented Jul 3, 2021

This PR manually spawns some threads and has some ad-hoc concurrency, so I'm planning on refactoring it to use RxJava once we figure out how that's integrated into the engine (#4798), and waiting on it for now.


naalit commented Jul 14, 2021

I've rewritten this PR using Reactor: #4822. It's a lot simpler and the tests should no longer fail due to hard-to-reproduce concurrency bugs!

Edit: I spoke too soon. That PR also has failing tests on CI that reliably pass on my computer. Oh well, I'll try to fix it tomorrow.

@naalit naalit closed this Jul 19, 2021
@naalit naalit deleted the feat/order-chunks-on-demand branch July 19, 2021 18:31