METRON-1860 new developer option for ansible in docker to deploy to vagrant #1261

ottobackwards · 2018-11-13T17:45:18Z

The goal of this PR is to provide a new "full_dev" option for new and old users that does not require as much setup and version matching to try Metron's full dev environment.

Currently, the vagrant up command runs Ansible locally, on the host machine, to build and deploy Metron. This means that the user must not only have Vagrant, VirtualBox and Docker, but must also have all the tools necessary to build Metron (Maven, Java, a C++11 compiler, etc.) and run Ansible (Python and others). Version and setup mismatches among these tools have been a common source of problems for new users getting started with Metron.

This PR introduces a new metron-deployment/development option that tries to address this problem by making it possible to run full dev with only Vagrant, VirtualBox and Docker (along with a local copy of the source).

The new option starts the Vagrant VM, but does not run ansible in it. Instead it runs a docker container which contains all the tools/versions necessary, and that container is what runs ansible.

## Testing
Have the correct versions of Vagrant, VirtualBox and Docker installed and running.

cd $METRON_SRC_ROOT/metron-deployment/development/centos6_docker_build
./build_and_run.sh

Answer yes to building the vagrant box.
Answer yes to building the docker machine.
Go grab a coffee.

The end result should be full dev running in the vagrant instance.
The logs directory will have a log for each run.

If you run a second time, you can say no to building the docker machine.

## Differences

This does not support skip tags passed on the cli
This does not support provision

For all changes:

Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

Have you included steps to reproduce the behavior or problem that is being changed or addressed?
Have you included steps or a guide to how the change may be verified and tested manually?
[-] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
```
mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
```
[-] Have you written or updated unit tests and or integration tests to verify your changes?
[-] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

[-] Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify the changes via site-book/target/site/index.html:

nickwallen · 2018-11-13T18:33:49Z

Valiant effort @ottobackwards . I am just wondering how much easier this really makes it?

Have you thought about just publishing a Metron Demo image to Vagrant Cloud? Does that scratch the same itch?

Everything is already pre-installed (removes one source of potential problems for new users.)
All a user needs is Vagrant and VirtualBox.

mmiklavc · 2018-11-13T21:09:25Z

@ottobackwards thanks for the submission. Per recent community comments, it sounds like we could use some improvements to how our build/deploy dep versions interact with other tooling that may require other versions of things.

Are these the correct dependency listings for each host/container?

The Docker container will be pre-configured with:

Java 8
Ansible 2.4.0+
Python 2.7
Maven 3.3.9
C++11 compliant compiler, like GCC

And the developer host machine would now need to manage versions of:

Vagrant 2.0+
Vagrant Hostmanager Plugin
Virtualbox 5.0+
Docker

The deployment goes through these steps:

You mount m2 repo and your code dir in the Docker instance.
build_and_run optionally spins up Vagrant in VirtualBox.
build_and_run optionally creates Docker instance with pre-reqs for building and deploying Metron.
build_and_run runs the build and then calls the Ansible deployment scripts from within Docker.
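
As a rough illustration of the first step, the two mounts could be assembled as below. This is a dry-run sketch, not the PR's script: it only prints the command instead of running it. The container name and mount targets follow the script excerpt quoted later in this thread; `METRON_SRC` and the image name `metron-build-image` are placeholders.

```shell
#!/usr/bin/env bash
# Dry run: assemble the docker invocation as an array and print it.
# METRON_SRC and IMAGE are placeholders, not values taken from the PR.
METRON_SRC="${METRON_SRC:-$PWD}"
IMAGE="${IMAGE:-metron-build-image}"

DOCKER_ARGS=(
  run -d -t --name MetronBuild
  -v "${METRON_SRC}:/root/metron"   # source tree shared from the host
  -v "${HOME}/.m2:/root/.m2"        # host Maven repository, so dependencies are reused
  "${IMAGE}" bash
)

# Print a shell-quoted version of the command that would be run.
printf 'docker'
printf ' %q' "${DOCKER_ARGS[@]}"
printf '\n'
```

Keeping the arguments in an array avoids the nested quoting that a single command string needs when a path contains spaces.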

At the end, you have an ephemeral Docker instance + Vagrant instance that has the running Metron instance?

ottobackwards · 2018-11-14T14:42:54Z

@nickwallen I did not think of that. I was improving the process that stands today. I think in a world where the posted image exists, we would still want the ability to try the latest ( to verify a fix pre-release etc ).

ottobackwards · 2018-11-14T14:47:56Z

@mmiklavc That is basically correct. Except that the ansible version is 2.5, since it only applies to this build, and allows for the yaml log formatting.

Also, in the latest version, Ansible once again does the clean and build, rather than the script. I had a lot of problems getting the C++ compiler picked up from Ansible and moved the build out of it for a time, but the idea was always to have Ansible run the metron_build, and that has returned.

The reasoning for the prompts to build the vagrant box and the docker machine:

If you are using this during development, i.e. we are working ON ansible or ON docker, you may fail in the docker or ansible stage without modifying the vm, and thus not need to vagrant up again.
Likewise, you may not need to rebuild the docker machine if you have not made changes, or you may in fact need to. I added these flags as I developed.

ottobackwards · 2018-11-14T14:50:52Z

The integration test failure has to do with the profiler tests and seems unrelated.

nickwallen · 2018-11-14T15:12:13Z

If we go with this approach, why not just replace the existing "Full Dev" environments (both centos and ubuntu) rather than add new environments to support, test, and keep in-sync?

nickwallen · 2018-11-14T15:14:52Z

How does the build time compare to the current approach (mvn clean install -DskipTests)? Does it change at all since it is now run in a Docker container?

How does the time it takes to get Full Dev running compare (whatever is equivalent to vagrant up)?

JonZeolla · 2018-11-14T15:22:25Z

I'm going to take a stab at a further look next week. For now I gave it a quick run-up and it was successful.

ottobackwards · 2018-11-14T15:24:05Z

@nickwallen That is an option, but not something I would pick as the goal from the outset if you know what I mean.

JonZeolla · 2018-11-14T15:24:12Z

If we want to replace full dev, we would need to get skip tags passed in appropriately; I use that a lot. That said, I'm not 100% sure that we need to do that all at once.

ottobackwards · 2018-11-14T15:26:05Z

We could also use more tags. For example, I may want to skip building the java, but not skip building the RPMs. Think of a dev flow: I make my change, run my local tests and want to spin up full dev. It is already built, but needs the rpms. I should be able to make ansible skip the compile/package of java and still do the rpms/debs.

ottobackwards · 2018-11-14T15:27:18Z

Does anyone have any ideas on the best way to time these things?
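
One possible approach, sketched here under assumptions, is to wrap each stage in a small helper that records wall-clock seconds to a log. The helper name `time_stage`, the `TIMING_LOG` variable, and the stand-in commands are all hypothetical; a real run would wrap the build, package, and deploy steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a named stage and append its duration to a log.
# TIMING_LOG defaults to a temp file so repeated runs do not mix results.
TIMING_LOG="${TIMING_LOG:-$(mktemp)}"

time_stage() {
  local name="$1"; shift
  local start end status
  start=$(date +%s)
  "$@"                      # run the stage command with its arguments
  status=$?
  end=$(date +%s)
  printf '%s\t%ss\texit=%s\n' "$name" "$((end - start))" "$status" >> "$TIMING_LOG"
  return "$status"
}

# Stand-in commands for illustration only.
time_stage "build" sleep 1
time_stage "deploy" true

cat "$TIMING_LOG"
```

Because the log is a plain file on the host, the per-stage numbers survive after any container or VM involved has been shut down.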

justinleet · 2018-11-21T17:30:06Z

In the interest of pure (and almost certainly incredibly ignorant) speculation, along the lines of the vagrant cloud, is it theoretically possible to save off a base image that has the Hadoop and non-Metron installs done? Then if said base image is missing (or possibly has changed properties, e.g. HDP version) rebuild it, and if found just build Metron and install mpack? Or alternatively, don't load the end result to Vagrant Cloud, just upload the base image we install on top of. Then only update it on a base image update.

What I'm getting at is that it would be nice to be able to build off of master, in a way that doesn't require an external dependency, and still lets us cache off the majority of the install (setting up HDP and Ambari). At that point, full dev would essentially be (for most builds) "build metron, build RPMs, run up image, do Metron install and setup". Which is probably 20 minutes faster and would make me a substantially happier person.

ottobackwards · 2018-11-21T18:18:32Z

It is possible to imagine a number of scenarios, including that, but also needing to build with new hadoop versions ( can't lose build from scratch ).

There are a number of things we can do down the road.

I think this work is going to help people enough in the near term to land it, while we discuss longer term refactoring and workflow.

ottobackwards · 2018-11-21T18:19:49Z

If you create an issue for your vagrant base machine with our hadoop / ambari already in it, you can assign it to me. @justinleet

nickwallen · 2018-11-21T19:23:32Z

hey @ottobackwards - I am still wrapping my head around this, but one small nit is that I don't like all the confirmation prompts. With the prompts I have to constantly check back to ensure it is not stuck on another prompt waiting for me to do something.

As much as possible it should just be fire-and-forget, so I can run it, work on something else, and hopefully come back some time later with a functioning Metron install.

I think you could get the same flexibility, if you either used command-line switches or if you moved all the prompts to the beginning of the script.

ottobackwards · 2018-11-21T19:43:42Z

@nickwallen, yeah, I did prompts as I went along debugging. I was thinking that folks may not like them.
I'll parameterize things.
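
A minimal sketch of what replacing the prompts with command-line switches might look like. The flag names below are hypothetical and are not the options the PR's scripts actually accept.

```shell
#!/usr/bin/env bash
# Hypothetical flags for illustration; defaults keep the run fire-and-forget.
BUILD_DOCKER=true
START_VAGRANT=true
SKIP_TAGS=""

parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --skip-docker-build) BUILD_DOCKER=false ;;
      --skip-vagrant-up)   START_VAGRANT=false ;;
      --skip-tags=*)       SKIP_TAGS="${1#*=}" ;;   # keep the text after '='
      -h|--help)
        echo "usage: $0 [--skip-docker-build] [--skip-vagrant-up] [--skip-tags=a,b]"
        exit 0 ;;
      *)
        echo "unknown option: $1" >&2
        return 1 ;;
    esac
    shift
  done
}

parse_args "$@" || exit 1
echo "build docker: ${BUILD_DOCKER}, start vagrant: ${START_VAGRANT}, skip tags: ${SKIP_TAGS:-none}"
```

With all decisions taken from the command line up front, nothing blocks on input partway through a long run.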

nickwallen · 2019-05-01T16:55:09Z

@JonZeolla: Is it your CPU/Memory preferences in docker?

Doubt it. CPUs: 6, Mem: 12 GB, Swap 3 GB

Is it running sufficiently fast for you @JonZeolla ?

nickwallen · 2019-05-01T17:01:31Z

@mmiklavc: ...and it takes 10x as long for me when I've got a Vagrant instance running along with a bunch of browser tabs.

I was thinking about this the other day actually. Since with these changes we no longer rely on Vagrant to kick-off Ansible, we could alter the steps so that it first completes the build and packaging and then only after that is done, it launches the VM. That way we don't have the two fighting for memory at the same time.

First off, does that sound like a reasonable thing to do @ottobackwards ? If so, is it a heavy lift we should hold off on for a follow-on PR? Or is it just a simple re-ordering of some of the actions in metron-up.sh?

Edit: We should probably just get the basics working here before going on changing anything else drastically. Ignore what I said.

ottobackwards · 2019-05-01T17:25:49Z

./metron-up.sh 6.30s user 3.09s system 0% cpu 1:20:47.77 total

@nickwallen I think that is worth doing definitely! I'll get on it and let you know what it will take

nickwallen · 2019-05-01T17:51:53Z

Can we do it as a follow-on PR?

EDIT: Looks like it took about 80 minutes for you also. Hmm.

ottobackwards · 2019-05-01T17:56:53Z

I think it will take me a couple of hours to make the change.

ottobackwards · 2019-05-01T18:19:29Z

Ok, I'm going to do that as a follow on

ottobackwards · 2019-05-01T19:32:50Z

https://issues.apache.org/jira/browse/METRON-2098

nickwallen · 2019-05-03T13:55:43Z

Hey @ottobackwards - What are your thoughts on the duration of the build? Do you have any thoughts on how we could improve that?

ottobackwards · 2019-05-03T14:52:32Z

You mean other than not run the vm?

nickwallen · 2019-05-03T16:01:46Z

I don't know where this stands. Do you think this should be merged in spite of the time it takes to spin-up? Or are you looking at ways to improve the build time? What is the path forward here?

ottobackwards · 2019-05-03T18:11:37Z

I have a secondary branch and I'm working through building without the vm running
I'll have time comparisons soon, let's see how that pans out

ottobackwards · 2019-05-06T19:51:26Z

This latest merge contains a refactoring that makes the resource use much better, by delaying the vagrant machine start until after the builds are done.

This involves refactoring the playbooks and the scripts.
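
The reordering can be sketched as follows. This shows the ordering only: each function is a hypothetical stand-in that records its name rather than invoking Docker, Ansible, or Vagrant, and none of the names come from the PR's scripts.

```shell
#!/usr/bin/env bash
# Ordering sketch only. With 'set -e', a failed build stops the script
# before the VM is ever started.
set -e
STEPS=()

build_in_container() { STEPS+=("build"); }       # stand-in: build and package inside the container
start_vm()           { STEPS+=("vagrant-up"); }  # stand-in: start the VM only after the build is done
deploy_to_vm()       { STEPS+=("deploy"); }      # stand-in: deploy from the container to the VM

build_in_container
start_vm
deploy_to_vm

printf '%s\n' "${STEPS[@]}"
```

The point of the ordering is that the build and the VM never compete for memory at the same time.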

ottobackwards · 2019-05-06T19:53:57Z

Some thoughts for next steps.

try setting volume sync options to :delegated ( I tried and it didn't make much difference, but we could look more )
build the docker image, but add a last layer with a COPY of all the source into the image. That way no volume penalty, and caching will make the docker build fast.
new vagrant base image that has the cluster already installed, just without metron to make deploy faster

mmiklavc · 2019-05-07T14:09:24Z

@ottobackwards Thanks for that comparison info, that makes it really easy to compare.

It's a fractionally small part of the build, but what's going on with clean taking 8x as long?

Those time drops are pretty substantial - is that purely resource contention?

Thinking through this build process a bit more, just curious, are you sharing local m2 repo from the host? I'm assuming this doesn't pull every dep down local to the Docker container. That's probably worth adding to the diagram for clarity.

ottobackwards · 2019-05-07T14:34:39Z

I believe the issue with clean is due to slowness with docker doing writes to osx hosted volumes.
I think resource contention is a big part of this on my machine at least.
I do share the repo.
https://github.com/apache/metron/pull/1261/files#diff-80771b861babcdefd8d7bbc81c412ff1R110

mmiklavc · 2019-05-07T14:36:14Z

metron-deployment/development/centos6/host_scripts/docker_run_build_container.sh

+DOCKER_CMD="bash"
+DOCKER_CMD_BASE[0]="docker run -d -t --name MetronBuild "
+DOCKER_CMD_BASE[1]="-v \"${VAGRANT_PATH}/../../..:/root/metron\" "
+DOCKER_CMD_BASE[2]="-v ~/.m2:/root/.m2 "


ottobackwards · 2019-05-07T14:47:58Z

@mmiklavc Someone else needs to try it, obviously

ottobackwards · 2019-05-07T14:50:16Z

@mmiklavc as I stated above, if we need to, another change would be to create a docker layer that was a copy of the source, such that we didn't need to use the volumes at all.... but I'd have to do some research on that as a follow on

nickwallen · 2019-05-11T17:02:57Z

@ottobackwards I ran your latest code up a few times and I am not able to replicate your results unfortunately.

This is the overall duration to go from source code to a fully deployed dev environment.

Master; time vagrant up = 65 - 68 minutes
#1261; time ./metron-up.sh = 85 - 90 minutes

Your results differ significantly from mine...

Master takes you 103 minutes, which is 35 minutes longer than my experience.
#1261 takes you 67 minutes, which is 18 minutes less than my experience.

(1) Any ideas why we are getting such different results? I have allocated Docker 12G of RAM, 6 cores, and 3G of swap. Are there any other settings that you have adjusted on your build machine?

(2) Which logs did you use to break out the build, clean, package times? Since Ansible runs on Docker, we no longer have that log persisted on the host. Once the Docker container is shut-down the logs are lost, unless I am missing something.

ottobackwards · 2019-05-12T14:55:16Z

In this PR, the ansible log goes to the /logs directory

mmiklavc · 2019-07-25T14:46:47Z

@ottobackwards - I want to spend some time in the next week resurrecting this and taking another look to see where we're at.

ottobackwards · 2019-07-25T15:00:27Z

@mmiklavc I think we are doing docker wrong, wrt building.
I think we may want to have the build docker be at the root of the source and use a .dockerignore to set that up; that way the src is automagically in the image etc.

luozhenwei

Hello, may I ask you a question? For testing purposes, I compiled metron according to the source code on github.
Reference links are as follows: https://github.com/apache/metron/tree/master/metron-deployment/packaging/docker/ansible-docker.
The last command to execute is mvn clean package -DskipTests.
The result is successful, showing build success.
Excuse me, how many jar packages are generated by this operation?
How to start metron in docker? Or can we say something about the next operation? I can't find duying in the community anymore. Thank you very much!

mmiklavc · 2019-09-03T15:32:00Z

@luozhenwei - it's generally best to ask these questions on the user or dev list, but here's the list of jars we depend on in our full dev (e.g. https://github.com/apache/metron/tree/master/metron-deployment/development/centos6) env:

[root@node1 ~]# ls -1 /usr/metron/0.7.2/lib
metron-common-0.7.2.jar
metron-data-management-0.7.2.jar
metron-elasticsearch-storm-0.7.2-uber.jar
metron-enrichment-common-0.7.2-uber.jar
metron-enrichment-storm-0.7.2-uber.jar
metron-maas-service-0.7.2-uber.jar
metron-management-0.7.2.jar
metron-parsers-0.7.2-uber.jar
metron-parsers-common-0.7.2-uber.jar
metron-parsing-storm-0.7.2-uber.jar
metron-pcap-backend-0.7.2.jar
metron-performance-0.7.2.jar
metron-profiler-repl-0.7.2.jar
metron-profiler-spark-0.7.2.jar
metron-profiler-storm-0.7.2-uber.jar
metron-rest-0.7.2.jar
metron-solr-storm-0.7.2-uber.jar

17 jars. The Docker deploy you linked to hasn't been maintained/updated in quite some time, so I'm not entirely sure what its current state is. @merrimanr may have more detail on this. If you're just trying to explore Metron, I would run up full dev on centos6 via the instructions in the link I provided above.

ottobackwards added 4 commits November 13, 2018 00:08

Initial commit - This provides an environment where only vagrant and …

1ac44c0

…docker ( and a copy of the metron codebase ) are required to run the Metron full-dev vm with it's default setup. This is the initial work, there will be refactorings

refactored locations

f54cc49

do not have to install java

868c351

do not have to install ansible

1725afc

ottobackwards added 9 commits November 15, 2018 09:15

inject hosts entry for node1

8c5b86c

refactor how we get the ip address, since not everyone has a clean ho…

615b133

…sts file, and some like john zeolla have a train wreck

future indent proof the regex

e605ef0

get the current directory for the script the proper way

57e159b

get current script directory here too

c048944

do not use pwd when we already have the path

800df29

fix sed statement, add missing command

39053de

skip tags support

b0b755b

disable audio in vm so it does not grab the mic and trigger security …

e1adc0b

…software like microsnitch

Merge branch 'master' of https://github.com/apache/metron into vagran…

f973559

…t-docker

ottobackwards added 2 commits May 3, 2019 19:31

do not start vm until after build

601a87b

refactor

80dd1f0

shellcheck fixes

15a32c1

mmiklavc reviewed May 7, 2019

luozhenwei reviewed Sep 3, 2019

ottobackwards mentioned this pull request Sep 13, 2019

METRON-2246 rpm-docker - minimise use of bind mounts due to performance #1501

Open

6 tasks

ottobackwards closed this Feb 27, 2020
