-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPARK-1478: Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-1915 #566
Conversation
Can one of the admins verify this patch? |
Jenkins, test this please. |
@@ -153,3 +181,15 @@ class FlumeReceiver( | |||
|
|||
override def preferredLocation = Some(host) | |||
} | |||
|
|||
private[streaming] | |||
class CompressionChannelPipelineFactory() extends ChannelPipelineFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for ()
when no parameters are present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Except a few nits, it looks good to me. However, since its so late in the process of Spark 1.0, I am little extra afraid of breaking something. If possible, can you run this one a cluster with real data transfer from producer to see if this works? |
OK I have reviewed the commits and I will be making changes this morning. Thank tdas. |
Jenkins, this okay to test. |
Hey @tmalaska, I pondered about the code a bit more, especially about the lazy vals. The lazy val in this case is probably not a good idea. The receivers are now (after #300) are designed to be restartable multiple times. So onStart() + onStop() could be called multiple times if the receiver decides to restart itself (to handle exceptions). In which case, start() will be called on the netty server after it has been closed. I am not sure that is possible. So its best to create a new NettyServer every time a onStart() is called, rather than lazy initialize and use the netty server. So its probably best to do something like this.
|
Will do. I will start tomorrow. Shouldn't take long. |
Let me know if the changes are ok. The only difference from what you told me to do was I made a check to prevent a double start. Let me know if you want me to take it out. If so I can make the change very fast.
|
aah, right, makes sense. Please go ahead with it, and test it as well. I am still hopeful that we can squeeze this in for Spark 1.0 :) |
I already updated the code and tested it. Feel free to commit unless you see anything wrong. If you commit it in the next couple hours. I can start on SPARK-1642 tonight or tomorrow morning. |
Hey tdas, How is this Jira looking. Is there anything I need to do to get it passed? |
Got side tracked, will take a look asap!
|
LOL tdas, how it going. Just pinging. |
new MLlib documentation for optimization, regression and classification new documentation with tex formulas, hopefully improving usability and reproducibility of the offered MLlib methods. also did some minor changes in the code for consistency. scala tests pass. this is the rebased branch, i deleted the old PR jira: https://spark-project.atlassian.net/browse/MLLIB-19 Author: Martin Jaggi <m.jaggi@gmail.com> Closes apache#566 and squashes the following commits: 5f0f31e [Martin Jaggi] line wrap at 100 chars 4e094fb [Martin Jaggi] better description of GradientDescent 1d6965d [Martin Jaggi] remove broken url ea569c3 [Martin Jaggi] telling what updater actually does 964732b [Martin Jaggi] lambda R() in documentation a6c6228 [Martin Jaggi] better comments in SGD code for regression b32224a [Martin Jaggi] new optimization documentation d5dfef7 [Martin Jaggi] new classification and regression documentation b07ead6 [Martin Jaggi] correct scaling for MSE loss ba6158c [Martin Jaggi] use d for the number of features bab2ed2 [Martin Jaggi] renaming LeastSquaresGradient
@tdas this seems pretty useful - could you take a look? |
Yeah, starting to look at all pending PRs now. On Wed, Jun 4, 2014 at 4:20 PM, Patrick Wendell notifications@github.com
|
Hey tdas, I was going to do 1642 tonight, but I noticed these changes are not in the code yet. What should I do? Thanks |
Jenkins, test this again. |
Let me know if there is anything I can do to help this go through. Thanks tdas On Fri, Jun 20, 2014 at 4:38 PM, Tathagata Das notifications@github.com
|
class CompressionChannelPipelineFactory extends ChannelPipelineFactory { | ||
|
||
def getPipeline() = { | ||
val pipeline = Channels.pipeline() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting issue. 2 space indents required.
import org.jboss.netty.channel.ChannelPipelineFactory | ||
import java.util.concurrent.Executors | ||
import org.jboss.netty.channel.Channels | ||
import org.jboss.netty.handler.codec.compression.ZlibDecoder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please dedup, and sort. see import style in https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide
Sorry Ted, that this has been sitting here for so long. Will get this in ASAP. |
No worries. I'm starting to free up so I would love to do more work. I will finish this one up then the Flume encryption one. Then if you have anything else. Let me at it. Thanks |
I'm going to have to make a new pull request, because I had drop the repo that belonged to this pull request. I will update the ticket with the information when it's ready |
New Pull request #1168 |
new MLlib documentation for optimization, regression and classification new documentation with tex formulas, hopefully improving usability and reproducibility of the offered MLlib methods. also did some minor changes in the code for consistency. scala tests pass. this is the rebased branch, i deleted the old PR jira: https://spark-project.atlassian.net/browse/MLLIB-19 Author: Martin Jaggi <m.jaggi@gmail.com> Closes apache#566 and squashes the following commits: 5f0f31e [Martin Jaggi] line wrap at 100 chars 4e094fb [Martin Jaggi] better description of GradientDescent 1d6965d [Martin Jaggi] remove broken url ea569c3 [Martin Jaggi] telling what updater actually does 964732b [Martin Jaggi] lambda R() in documentation a6c6228 [Martin Jaggi] better comments in SGD code for regression b32224a [Martin Jaggi] new optimization documentation d5dfef7 [Martin Jaggi] new classification and regression documentation b07ead6 [Martin Jaggi] correct scaling for MSE loss ba6158c [Martin Jaggi] use d for the number of features bab2ed2 [Martin Jaggi] renaming LeastSquaresGradient Conflicts: mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala
… ExternalShuffleBlockHandler (apache#566) More context on https://issues.apache.org/jira/browse/SPARK-27773. Basically gives us a rough indicator of health of the external shuffle service / metric that we can monitor and alert on.
* Update Go version for 1.13 conformance job Release 1.13 of Kubernetes supports Go 1.12 https://github.com/kubernetes/kubernetes/blob/release-1.13/Godeps/Godeps.json#L3 * Update tag Application:Go
…apache#566) * AL-4757 when refresh InMemoryFileIndex, if recursiveFileLookup is true, use recursiveDirChildrenFiles * AL-4757 add UT
…apache#566) * AL-4757 when refresh InMemoryFileIndex, if recursiveFileLookup is true, use recursiveDirChildrenFiles * AL-4757 add UT
…apache#566) * AL-4757 when refresh InMemoryFileIndex, if recursiveFileLookup is true, use recursiveDirChildrenFiles * AL-4757 add UT
…apache#566) * AL-4757 when refresh InMemoryFileIndex, if recursiveFileLookup is true, use recursiveDirChildrenFiles * AL-4757 add UT
No description provided.