Infinite behavior of collect(countfrom()) #11749

Closed
jdlangs opened this issue Jun 18, 2015 · 19 comments

Comments

@jdlangs
Contributor

jdlangs commented Jun 18, 2015

I was curious to see what would happen if I tried to collect an infinite iterator. I was annoyed to discover it doesn't just run forever, but completely locked up my system. Would it be appropriate to add error methods to prevent this for Count and the other infinite iterators?

@JeffBezanson
Member

Hopefully we'd eventually give an out-of-memory error, but OSes are not always cooperative here.

@ScottPJones
Contributor

What about a settable per-process limit to the amount of (virtual) memory that a julia process can allocate? (not relying on any OS settings) When you are trying to run 1000s of processes, limiting the amount of each can be important...

@StefanKarpinski
Member

I do love the way you have to tell the JVM how much memory you're going to use. It's almost as good as the old Mac OS behavior where you had to commit to a fixed amount of memory for each process up front. Having an opt-in memory limit might be a good thing though.

@carnaval
Contributor

What does virtual memory have to do with this? We definitely could limit physical memory used with an option, but virtual memory is used liberally in the allocator (on 64-bit architectures) to have nice flat spaces.

@ScottPJones
Contributor

For people just running things pretty much by themselves on a machine, the current situation is acceptable, but an opt-in memory limit is a requirement if you want to make sure that one out-of-control process doesn't bring the entire (hospital) system to a grinding halt.

@ScottPJones
Contributor

And what happens when you run out of swap space then, @carnaval? Nothing pleasant, I can assure you!

@carnaval
Contributor

The (large) slack virtual memory we use is not committed, so I doubt the OS will try to swap it.

@StefanKarpinski
Member

Are you aware of ulimit?

@jdlangs
Contributor Author

jdlangs commented Jun 18, 2015

Memory issues aside, would it be acceptable to make this call error immediately as it can't ever be legitimate?

@yuyichao
Contributor

I don't think there's a way to detect that automatically, and adding special cases like these feels wrong to me...

@ScottPJones
Contributor

It's not the size of the space I'm talking about, but the amount of VM actually used by the processes... The OS will want real physical swap space available for any dirtied pages.

@jdlangs
Contributor Author

jdlangs commented Jun 18, 2015

@yuyichao The implementation I was testing was to give Count, Cycle, and Repeated an abstract supertype and add a collect method that errors when given one of them. So at least there's some sort of interface there and not just special cases.

But I agree trying to guard against all possible bad programming is a pretty grey area.
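
The proposal above could look roughly like the following. This is a minimal sketch, not the implementation that was tested: `InfiniteIterator` and `MyCount` are hypothetical names (Base's own `Count`, `Cycle`, and `Repeated` cannot be re-parented from user code), and it is written against the Julia 1.x `iterate` protocol rather than the `start`/`next`/`done` protocol in use when this issue was opened.

```julia
# Hypothetical names throughout; none of these types exist in Base.
abstract type InfiniteIterator end

# A stand-in for Base's Count, made a subtype of the shared supertype.
struct MyCount <: InfiniteIterator
    start::Int
end

# Julia 1.x iteration protocol: never returns `nothing`, so it never ends.
Base.iterate(c::MyCount, state::Int = c.start) = (state, state + 1)
Base.IteratorSize(::Type{<:InfiniteIterator}) = Base.IsInfinite()
Base.eltype(::Type{MyCount}) = Int

# One method covers every subtype: fail fast instead of exhausting memory.
Base.collect(it::InfiniteIterator) =
    throw(ArgumentError("cannot collect an infinite iterator of type $(typeof(it))"))

# A bounded view still collects normally.
first_three = collect(Iterators.take(MyCount(1), 3))
```

With these definitions `collect(MyCount(1))` throws immediately, while wrapping the iterator in `Iterators.take` keeps working because `Take` is not an `InfiniteIterator`.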

@ScottPJones
Contributor

Duh... Snide remarks again? Or at least so it seems...
'ulimit' doesn't give you the flexibility you need.

@yuyichao
Contributor

It's special cases in the sense that you will not be able to catch all of them. It is pretty easy to define iterators that are infinite depending on their values and even finite iterators can be very long and effectively infinite.

There are also more methods than collect that iterate the whole iterator, so even if you want to special-case a few definitely infinite types, you would still need to extend the iterator interface. I personally don't think it's worth the effort to have such an interface that catches some uncommon problems only half of the time.
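
Both points can be illustrated with a short sketch in Julia 1.x syntax. The helper `definitely_infinite` is a hypothetical name introduced here; the `Base.IteratorSize` trait and the `Base.Iterators` functions are real, but they postdate this discussion.

```julia
using Base.Iterators: countfrom, cycle, take

# Hypothetical helper: true only when the iterator's type declares itself infinite.
definitely_infinite(itr) = Base.IteratorSize(itr) isa Base.IsInfinite

# The type says "infinite", and it is.
a = definitely_infinite(countfrom(1))

# Never terminates in practice, but the trait only reports SizeUnknown,
# so a type-based guard would not catch it.
b = definitely_infinite(Iterators.filter(isodd, countfrom(1)))

# The type says "infinite", yet cycling an empty collection yields nothing:
# whether it is infinite depends on the value, not the type.
c = definitely_infinite(cycle(Int[]))
empty_cycle_is_done = iterate(cycle(Int[])) === nothing

# collect is not the only consumer: every reduction walks the whole iterator.
# These are bounded with take so that they actually return.
s = sum(take(countfrom(1), 4))
m = maximum(take(countfrom(1), 4))
```

A guard on `collect` alone would leave `sum`, `maximum`, and every other full traversal unprotected, which is the gap described above.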

@StefanKarpinski
Member

@ScottPJones, there's nothing snide about asking if you're aware of ulimit. I don't expect you to be omniscient despite your widely known 29 years of industry experience and 35 years of C programming.

I don't disagree that it would be helpful to have an opt-in ability to limit how much physical memory a Julia process will allocate. That will always be a best-effort limit, however, since C libraries can allocate as much as they want "under Julia's nose" and circumvent that limit. So it would never be a total solution; you'd need to use ulimit in any case. Which makes sense: only the OS can really enforce this.

@jdlangs
Contributor Author

jdlangs commented Jun 18, 2015

@yuyichao Thanks, that was just the type of discussion I was hoping to get. My main uncertainty was what the general policy was for code that could never be correct (e.g., the function call in the title). Obviously breakage that depends on run-time values is a whole different set of circumstances.

But your point regarding other methods also causing the infinite behavior is well taken. I hadn't thought enough to realize even stuff like sum would break. I'll go ahead and close the issue then.

jdlangs closed this as completed Jun 18, 2015

@tkelman
Contributor

tkelman commented Jun 19, 2015

Speaking of ulimit, see #10390 and #11201 - could still use a fallback there

@ScottPJones
Contributor

@StefanKarpinski That's actually 42 years of programming, 35 with C/Unix, 34 professionally... 😀
If you'd said something like "Why didn't you use ulimit?" I wouldn't have taken it to be catty at all...
but to ask somebody with too many years of Unix experience if they are aware of something like ulimit...

It actually isn't the physical memory that really needs to be limited... where the OSes tend to crash and burn is not when they run out of physical memory, but when they've run out of swap space... (they will thrash and get really slow if they don't have enough physical memory, but tend to keep running)
@phobon commented in the opening message:

was annoyed to discover it doesn't just run forever, but completely locked up my system.

That sort of "locking up the system" isn't acceptable to most customers... so if we would all like julia to be usable in non-stop settings that demand five-nines reliability, then some work will need to be done.

For the issues of a process chewing through CPU, I wonder if in julia setting some sort of timer to cause an exception if more than a certain time had passed before a function finished would be useful?
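
One cooperative way to approximate that idea is to check a deadline inside the loop. This is only a sketch under stated assumptions: `collect_with_deadline` is a hypothetical helper, not a Julia API, and because the check runs between items it cannot interrupt code that never returns control to the loop.

```julia
# Hypothetical helper: collect `itr`, but give up once `limit` seconds have elapsed.
function collect_with_deadline(itr, limit::Real)
    deadline = time() + limit
    out = Vector{eltype(itr)}()
    for x in itr
        # Cooperative check: runs once per item, before the item is stored.
        if time() > deadline
            error("collect_with_deadline: exceeded $(limit) s after $(length(out)) items")
        end
        push!(out, x)
    end
    return out
end

# A finite iterator finishes well within the limit.
small = collect_with_deadline(1:5, 1.0)

# An infinite one is cut off with an exception instead of running forever.
timed_out = try
    collect_with_deadline(Iterators.countfrom(1), 0.01)
    false
catch err
    err isa ErrorException
end
```

The same loop could count items or bytes instead of seconds, which would address the memory side of the problem rather than the CPU side.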

For the memory issues, I still have to learn more about how julia works with addprocs (one of our developers has been trying to use that, and ran into performance issues with starting lots of jobs, but maybe that's been fixed recently), but it's also important to look at overall system memory usage.

What I have found useful is having settable thresholds on a per-process basis (potentially with a security system to control whether a process can raise its limits), so that the language (such as julia) can give feedback to the application, which can then decide to release some cached structures if the system is running out of memory, or to allocate smaller buffers... lots of things are possible.

@ScottPJones
Contributor

Oh, also, yes, with julia you may be more likely to suffer from the effects of memory and resources allocated outside of julia's control (at least, until the big libraries are all rewritten in Julia 😀), but it's still better to give as much control as possible to the application developer.


7 participants