METRON-1860 new developer option for ansible in docker to deploy to vagrant #1261
…docker ( and a copy of the metron codebase ) are required to run the Metron full-dev vm with its default setup. This is the initial work; there will be refactorings.
Valiant effort @ottobackwards . I am just wondering how much easier this really makes it? Have you thought about just publishing a Metron Demo image to Vagrant Cloud? Does that scratch the same itch?
|
@ottobackwards thanks for the submission. Per recent community comments, it sounds like we could use some improvements to how our build/deploy dependency versions interact with other tooling that may require different versions. Are these the correct dependency listings for each host/container? The Docker container will be pre-configured with:
And the developer host machine would now need to manage versions of:
The deployment goes through these steps:
At the end, you have an ephemeral Docker instance + Vagrant instance that has the running Metron instance? |
@nickwallen I did not think of that. I was improving the process that stands today. I think in a world where the posted image exists, we would still want the ability to try the latest ( to verify a fix pre-release etc ). |
@mmiklavc That is basically correct. Except that the ansible version is 2.5, since it only applies to this build, and allows for the yaml log formatting. Also, in the latest version, the ansible once again does the clean and build as opposed to the script. I had a lot of problems getting the c++ picked up from ansible and moved the build out of it for the time, but the idea was always to have ansible run the metron_build, and that has returned. The reasoning for the prompts to build the vagrant box and the docker ->
|
The integration test failure has to do with the profiler tests and seems unrelated. |
If we go this approach, why not just replace the existing "Full Dev" environments (both centos and ubuntu) rather than add new environments to support, test, and keep in-sync? |
How does the build time compare to the current approach ( How does the time it takes to get Full Dev running compare (whatever is equivalent to |
I'm going to take a stab at a further look next week. For now I gave it a quick run-up and it was successful. |
@nickwallen That is an option, but not something I would pick as the goal from the outset if you know what I mean. |
If we'd want to replace full dev we would need to get skip tags passed in appropriately, I use that a lot. That said I'm not 100% that we need to do that all at once. |
We could also use more tags, for example I may want to skip building the java, but not skip building the RPMs. Think of a dev flow -> I make my change, run my local tests and want to spin up full dev. It is already built, but needs the rpms, I should be able to make ansible skip the compile/package of java and still do the rpms/debs |
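A minimal sketch of the finer-grained skipping described above, assuming the playbook tasks carried separate tags for the Java build and the package build. The tag names (`java-build`, `unit-tests`) and the playbook file name are hypothetical, not taken from this PR; `--skip-tags` is the standard `ansible-playbook` option. The command is only printed here, not executed.

```shell
#!/bin/sh
# Join the given tag names into the comma-separated form ansible expects.
join_tags() {
  out=""
  for tag in "$@"; do
    if [ -z "$out" ]; then
      out="$tag"
    else
      out="${out},${tag}"
    fi
  done
  printf '%s' "$out"
}

# Hypothetical tag names: skip the Java compile/package, keep the RPM/DEB build.
SKIP=$(join_tags "java-build" "unit-tests")

# Print the command that would be run (playbook.yml is a placeholder name).
echo "ansible-playbook playbook.yml --skip-tags ${SKIP}"
```

With tags split this way, the "already built, just need the RPMs" flow becomes a matter of which names are passed to `join_tags`.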
anyone have any ideas of the best way to time these things? |
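One possible way to time the individual steps, sketched under the assumption that each phase can be wrapped as a single command. The phase names and the `sleep`/`true` commands are placeholders standing in for the real build and deploy steps; `time_phase` and `PHASE_REPORT` are illustrative names.

```shell
#!/bin/sh
# Accumulates one line per phase: "<name>: <seconds>s (exit <status>)".
PHASE_REPORT=""

# Run a command, record its wall-clock duration, and pass through its exit status.
time_phase() {
  name="$1"
  shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  PHASE_REPORT="${PHASE_REPORT}${name}: $((end - start))s (exit ${status})
"
  return "$status"
}

# Placeholder phases; real usage would wrap the build and deploy commands.
time_phase "build" sleep 1
time_phase "deploy" true

printf '%s' "$PHASE_REPORT"
```

Wrapping the whole script in the shell's `time` gives the total, while a helper like this gives the per-phase breakdown for comparing runs.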
…sts file, and some like john zeolla have a train wreck
…software like microsnitch
In the interest of pure (and almost certainly incredibly ignorant) speculation, along the lines of the vagrant cloud, is it theoretically possible to save off a base image that has the Hadoop and non-Metron installs done? Then if said base image is missing (or possibly has changed properties, e.g. HDP version) rebuild it, and if found just build Metron and install mpack? Or alternatively, don't load the end result to Vagrant Cloud, just upload the base image we install on top of. Then only update it on a base image update. What I'm getting at is that it would be nice to be able to build off of master, in a way that doesn't require an external dependency, and still lets us cache off the majority of the install (setting up HDP and Ambari). At that point, full dev would essentially be (for most builds) "build metron, build RPMs, run up image, do Metron install and setup". Which is probably 20 minutes faster and would make me a substantially happier person. |
It is possible to imagine a number of scenarios, including that, but also needing to build with new hadoop versions ( can't lose build from scratch ). There are a number of things we can do down the road. I think this work is going to help people enough in the near term to land it, while we discuss longer term refactoring and workflow. |
If you create an issue for your vagrant base machine with our hadoop / ambari already in it, you can assign it to me. @justinleet |
hey @ottobackwards - I am still wrapping my head around this, but one small nit is that I don't like all the confirmation prompts. With the prompts I have to constantly check back to ensure it is not stuck on another prompt waiting for me to do something. As much as possible it should just be fire-and-forget, so I can run it, work on something else, and hopefully come back some time later with a functioning Metron install. I think you could get the same flexibility, if you either used command-line switches or if you moved all the prompts to the beginning of the script. |
@nickwallen, yeah, I did prompts as I went along debugging. I was thinking that folks may not like them. |
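A rough sketch of the command-line-switch idea, assuming the two decisions are whether to build the Vagrant box and whether to build the Docker image. The flag letters (`-b`/`-B`, `-d`/`-D`) and the function names are hypothetical and are not part of the script in this PR. Anything left undecided on the command line could be asked once, up front, through `resolve`.

```shell
#!/bin/sh
# "ask" means the choice was not given on the command line.
BUILD_BOX="ask"
BUILD_DOCKER="ask"

# Parse hypothetical flags: -b/-B build or skip the Vagrant box,
# -d/-D build or skip the Docker image.
parse_args() {
  OPTIND=1
  while getopts "bBdD" opt; do
    case "$opt" in
      b) BUILD_BOX="yes" ;;
      B) BUILD_BOX="no" ;;
      d) BUILD_DOCKER="yes" ;;
      D) BUILD_DOCKER="no" ;;
      *) echo "usage: $0 [-b|-B] [-d|-D]" >&2; return 2 ;;
    esac
  done
}

# Print the final yes/no for one choice, prompting only when it is still "ask".
resolve() {
  if [ "$1" = "ask" ]; then
    printf '%s [y/N] ' "$2" >&2
    read -r answer || answer=""
    case "$answer" in
      [Yy]*) echo "yes" ;;
      *) echo "no" ;;
    esac
  else
    echo "$1"
  fi
}

parse_args "$@"
```

Calling `resolve "$BUILD_BOX" "Build the vagrant box?"` and its Docker counterpart at the very start of the run would keep every question at the beginning, so the rest can proceed unattended.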
Doubt it. CPUs: 6, Mem: 12 GB, Swap 3 GB Is it running sufficiently fast for you @JonZeolla ? |
I was thinking about this the other day actually. Since with these changes we no longer rely on Vagrant to kick-off Ansible, we could alter the steps so that it first completes the build and packaging and then only after that is done, it launches the VM. That way we don't have the two fighting for memory at the same time. First off, does that sound like a reasonable thing to do @ottobackwards ? If so, is it a heavy lift we should hold off on for a follow-on PR? Or is it just a simple re-ordering of some of the actions in Edit: We should probably just get the basics working here before going on changing anything else drastically. Ignore what I said. |
./metron-up.sh 6.30s user 3.09s system 0% cpu 1:20:47.77 total @nickwallen I think that is worth doing definitely! I'll get on it and let you know what it will take |
Can we do it as a follow-on PR? EDIT: Looks like it took about 80 minutes for you as well. Hmm. |
I think it will take me a couple of hours to make the change. |
Ok, I'm going to do that as a follow on |
Hey @ottobackwards - What are your thoughts on the duration of the build? Do you have any thoughts on how we could improve that? |
You mean other than not run the vm? |
I don't know where this stands. Do you think this should be merged in spite of the time it takes to spin-up? Or are you looking at ways to improve the build time? What is the path forward here? |
I have a secondary branch and I'm working through building without the vm running |
Some thoughts for next steps.
|
@ottobackwards Thanks for that comparison info, that makes it really easy to compare. It's a fractionally small part of the build, but what's going on with clean taking 8x as long? Those time drops are pretty substantial - is that purely resource contention? Thinking through this build process a bit more, just curious, are you sharing local m2 repo from the host? I'm assuming this doesn't pull every dep down local to the Docker container. That's probably worth adding to the diagram for clarity. |
I believe the issue with clean is due to slowness with docker doing writes to osx hosted volumes. |
DOCKER_CMD="bash"
DOCKER_CMD_BASE[0]="docker run -d -t --name MetronBuild "
DOCKER_CMD_BASE[1]="-v \"${VAGRANT_PATH}/../../..:/root/metron\" "
DOCKER_CMD_BASE[2]="-v ~/.m2:/root/.m2 "
👍
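For readers following the volume discussion, here is a sketch of how the same `docker run` arguments could be assembled as a positional-parameter list rather than as string fragments with embedded quotes. This is an alternative illustration, not the PR's code: the image name and the `METRON_SRC`/`M2_REPO` variables are assumptions, and the function only prints the arguments instead of invoking Docker.

```shell
#!/bin/sh
# Placeholder locations; the PR derives the source path from VAGRANT_PATH.
METRON_SRC="${METRON_SRC:-$PWD}"
M2_REPO="${M2_REPO:-$HOME/.m2}"
IMAGE="${IMAGE:-metron-build-example}"   # hypothetical image name

# Build the argument list with `set --` so paths containing spaces stay
# intact as single arguments, then print one argument per line.
build_docker_args() {
  set -- run -d -t --name MetronBuild \
    -v "${METRON_SRC}:/root/metron" \
    -v "${M2_REPO}:/root/.m2" \
    "${IMAGE}" bash
  printf '%s\n' "$@"
}

build_docker_args
```

Replacing the `printf` line with `docker "$@"` would run the container with the source tree and the host's local Maven repository mounted, which is the sharing of `~/.m2` asked about above.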
@mmiklavc Someone else needs to try it, obviously |
@mmiklavc as I stated above, if we need to, another change would be to create a docker layer that was a copy of the source, such that we didn't need to use the volumes at all.... but I'd have to do some research on that as a follow on |
@ottobackwards I ran your latest code up a few times and I am not able to replicate your results unfortunately. This is the overall duration to go from source code to a fully deployed dev environment.
Your results differ significantly from mine...
(1) Any ideas why we are getting such different results? I have allocated Docker 12G of RAM, 6 cores, and 3G of swap. Are there any other settings that you have adjusted on your build machine? (2) Which logs did you use to break out the build, clean, package times? Since Ansible runs on Docker, we no longer have that log persisted on the host. Once the Docker container is shut-down the logs are lost, unless I am missing something. |
In this PR, the ansible log goes to the /logs directory |
@ottobackwards - I want to spend some time in the next week resurrecting this and taking another look to see where we're at. |
@mmiklavc I think we are doing docker wrong, wrt building. |
Hello, may I ask you a question? For testing purposes, I compiled metron according to the source code on github.
Reference links are as follows: https://github.com/apache/metron/tree/master/metron-deployment/packaging/docker/ansible-docker.
The last command I executed was mvn clean package -DskipTests.
The result is successful, showing build success.
Excuse me, how many jar packages are generated by this operation?
How to start metron in docker? Or can we say something about the next operation? I can't find duying in the community anymore. Thank you very much!
@luozhenwei - it's generally best to ask these questions on the user or dev list, but here's the list of jars we depend on in our full dev (e.g. https://github.com/apache/metron/tree/master/metron-deployment/development/centos6) env:
17 jars. The Docker deploy you linked to hasn't been maintained/updated in quite some time, so I'm not entirely sure what its current state is. @merrimanr may have more detail on this. If you're just trying to explore Metron, I would run up full dev on centos6 via the instructions in the link I provided above. |
The goal of this PR is to provide a new "full_dev" option for new and old users that does not require as much setup and version matching to try Metron's full dev environment.
Currently, the vagrant up command runs ansible locally, on the host machine, to build and deploy metron. This means that the user must not only have Vagrant, Virtual Box and Docker, but must also have all the tools necessary to build metron ( maven, java, c++ 11 etc ) and run ansible ( python and others ). It has been a common source of problems for new users to get started with Metron because of version or setup problems.
This PR introduces a new metron-deployment/development option which tries to address this problem, and make it possible for the user to only have Vagrant, VirtualBox and Docker ( along with a local copy of the source ) to be able to run full dev.
The new option starts the Vagrant VM, but does not run ansible in it. Instead it runs a docker container which contains all the tools/versions necessary, and that container is what runs ansible.
## Testing
Have the correct versions of vagrant, virtual box and docker installed and running
Answer yes to building the vagrant box.
Answer yes to building the docker machine
Go grab a coffee.
The end result should be full dev running in the vagrant instance.
The logs directory will have a log for each run.
If you run a second time, you can say no to building the docker machine.
Differences
For all changes:
For code changes:
Have you included steps to reproduce the behavior or problem that is being changed or addressed?
Have you included steps or a guide to how the change may be verified and tested manually?
[-] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
[-] Have you written or updated unit tests and or integration tests to verify your changes?
[-] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
For documentation related changes:
site-book/target/site/index.html