LXC Linux Containers. Lightweight isolation. Create more hadoop clusters on a set of machines
http://en.wikipedia.org/wiki/LXC
LXC has really only become production worthy in the last year? It seems better than in the 2010-12 era. Here's a 2013 article talking up container tech: http://www.linuxjournal.com/content/containers%E2%80%94not-virtual-machines%E2%80%94are-future-cloud
(This is the same as Docker technology)
LXC (Linux Containers) is an operating system–level virtualization method for running multiple isolated Linux systems (containers) on a single control host.
The Linux kernel provides cgroups for resource limiting (CPU, memory, block I/O, network, etc.) without requiring any virtual machines to be started. Separately, the kernel's namespaces isolate an application's view of the operating environment, including process trees, network interfaces, user ids and mounted file systems.
LXC combines cgroups and namespace support to provide an isolated environment for applications. Docker can also use LXC as one of its execution drivers, enabling image management and providing deployment services.
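Those namespaces are visible per-process under /proc; a quick way to see the set the kernel provides (a container's init process gets its own copy of each):

```shell
# Each entry is a namespace the current process belongs to
# (net, pid, mnt, uts, ipc, and user on modern kernels).
ls /proc/self/ns
```

To inspect a container's namespaces instead, get its init PID from lxc-info and list /proc/&lt;pid&gt;/ns the same way.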
LXC provides operating system-level virtualization through a virtual environment that has its own process and network space, instead of creating a full-fledged virtual machine. LXC relies on the Linux kernel cgroups functionality that was released in version 2.6.24. It also relies on other kinds of namespace-isolation functionality, which were developed and integrated into the mainline Linux kernel.
Oracle has some nice user level documentation here: http://docs.oracle.com/cd/E37670_01/E37355/html/ol_containers.html
some deeper stuff with vagrant I might explore http://containerops.org/2013/11/19/lxc-networking/
More powerful use of lxc https://www.stgraber.org/2013/12/21/lxc-1-0-your-second-container/
I'm going to try attaching raw disk devices for use in mapr. Not sure if it will work. Like this (but with your partition names). The container has to be running:
sudo lxc-device -n p1 add /dev/sdb /dev/sdb
sudo lxc-device -n p1 add /dev/sdb5 /dev/sdb5
sudo lxc-device -n p1 add /dev/sdb6 /dev/sdb6
sudo lxc-device -n p1 add /dev/sdb7 /dev/sdb7
gparted sees them if I do the above, but can't seem to find a superblock? Maybe that won't hurt MapR.
https://github.com/lxc/lxc/blob/master/src/lxc/lxc-device
Didn't seem to work with MapR? With libvirt you get a disk attach rather than a device attach.
This page seems to show a method with mounts thru fstab http://it.randomthemes.com/2012/07/16/how-to-mount-disk-to-lxc-container/ Normal nfs mounts to the host would work I guess.
but we need it to be a block device..from http://s3hh.wordpress.com/2012/10/22/easily-making-a-blockdev-available-to-a-container/
But he says lxc-device now should work?
screens
apt-get install byobu
# to run
byobu
apt-get install clusterssh
# to run
clusterssh -o "-X" -l root 192.168.1.171 192.168.1.172 192.168.1.173 192.168.1.174 192.168.1.175 192.168.1.176 192.168.1.177 192.168.1.178 192.168.1.179 192.168.1.180
I typically add -X for X11 forwarding, so I can get X applications back on my local machine if needed.
byobu: nicely, if you detach, you can ssh back, run byobu again, and reattach to the old session.
I rarely ssh directly to the container IPs, but you can since the setup below makes them public just like any other machine
I use clusterssh to ssh to all machines, then run byobu. Then I press F2 to create a new set of screens, and run lxc-stop -n cntr1 and lxc-start -n cntr1 on all the machines. Then I can log in. Then I have all ten of the virtual machines, and I can press F3 to shift byobu back to the ten host machines.
Warning: keep track of what machines you're on! It's really easy to trash the wrong machines.
All of the container machines I create end in -cntr, so mr-0x1-cntr1 is on mr-0x1.
Here's ifconfig on the host with a running container. The lxcbr0 is left over, because I didn't delete it as described above. It doesn't hurt anyone. Note that the eth0 doesn't get the ip address, the br0 does.
The veth* is the ethernet device for the container.
I left in tun0, which is my vpn tunnel from home. I had to use vpnc-connect for the vpn, because the ubuntu network-manager gets disabled when it sees I modified /etc/network/interfaces, so I can't use the gui for the vpn start. But with vpnc, I can vpn even with the bridged eth0.
I show all this, to show that it's robust even in a more complicated case.
br0 Link encap:Ethernet HWaddr d4:3d:7e:18:db:22
inet addr:192.168.0.34 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::d63d:7eff:fe18:db22/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
...
eth0 Link encap:Ethernet HWaddr d4:3d:7e:18:db:22
inet6 addr: fe80::d63d:7eff:fe18:db22/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
...
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
...
lxcbr0 Link encap:Ethernet HWaddr 2e:2c:9c:54:39:a3
inet addr:10.0.3.1 Bcast:10.0.3.255 Mask:255.255.255.0
inet6 addr: fe80::2c2c:9cff:fe54:39a3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
...
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:192.168.1.226 P-t-P:192.168.1.226 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1412 Metric:1
...
vethL7EZtF Link encap:Ethernet HWaddr fa:8f:c0:f6:36:f7
inet6 addr: fe80::f88f:c0ff:fef6:36f7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
...
Links are included for additional research if you're curious. What I actually use is not exactly described at any one site, so treat these as research notes.
https://www.digitalocean.com/community/tutorials/getting-started-with-lxc-on-an-ubuntu-13-04-vps
I use the latest backport from the developers for lxc install. I need the lxc-include functionality in the config, for network config maintenance. That wasn't in the normal ubuntu apt-get install. Here's the ppa:
This PPA contains backports of stable version of LXC for all supported Ubuntu releases. Note this is NOT the lts release, which I think is 1.0
I think the stable release is lxc 1.1.x? https://launchpad.net/~ubuntu-lxc/+archive/ubuntu/lxc-stable
lts release https://launchpad.net/~ubuntu-lxc/+archive/ubuntu/lxc-lts
to setup the ppa for the lts release as root:
apt-get update
add-apt-repository ppa:ubuntu-lxc/lts
apt-get update
apt-get upgrade
That creates entries for stable first, then lts, in /etc/apt/sources.list.d/ubuntu-lxc-stable-precise.list. It's good to be careful and pay attention on any apt upgrade that updates lxc. You may want to take the new configuration files and modify them to include your old changes, rather than just keeping your old ones (when prompted).
I had to go back to the lts release. I couldn't get /proc/meminfo, cpuinfo, top and ps working in the container (not sure how the 1.1 lxc is doing things there).
deb http://ppa.launchpad.net/ubuntu-lxc/lts/ubuntu precise main
deb-src http://ppa.launchpad.net/ubuntu-lxc/lts/ubuntu precise main
Here are the versions that get installed as of 4/18/15 from stable that I couldn't use; I had to apt-get purge and reinstall:
Setting up libseccomp2 (2.1.1-1~ubuntu12.04.1~ppa1) ...
Setting up liblxc1 (1.1.2-0ubuntu3~ubuntu12.04.1~ppa1) ...
Setting up python3-lxc (1.1.2-0ubuntu3~ubuntu12.04.1~ppa1) ...
Setting up lxc (1.1.2-0ubuntu3~ubuntu12.04.1~ppa1) ...
Installing new version of config file /etc/apparmor.d/lxc/lxc-default ...
Installing new version of config file /etc/apparmor.d/usr.bin.lxc-start ...
Installing new version of config file /etc/default/lxc ...
Installing new version of config file /etc/init/lxc-net.conf ...
Installing new version of config file /etc/init/lxc.conf ...
Preserving user changes to /etc/dnsmasq.d-available/lxc (renamed from /etc/dnsmasq.d/lxc)...
The dnsmasq configuration has been migrated twice, fixing it.
Setting up lxc dnsmasq configuration.
Setting up lxc-templates (1.1.2-0ubuntu3~ubuntu12.04.1~ppa1) ...
Setting up lxcfs (0.7-0ubuntu2~ubuntu12.04.1~ppa1) ...
Note that lxc-shutdown is gone with the latest 1.1 lxc. lxc-stop is what you want: a clean shutdown, followed by a kill if that doesn't work. It has --nokill if you want to avoid the kill.
http://man7.org/linux/man-pages/man1/lxc-stop.1.html
apt-get install lxc lxctl bridge-utils lxc-templates
lxctl and lxc-templates might not be needed. I believe the above installs the lxc web panel; nothing extra is needed if you want to use it (I don't).
also see:
http://sylvain.fankhauser.name/setting-up-lxc-containers-in-30-minutes-debian-wheezy.html
To check:
lxc-checkconfig
Old note (I don't have this issue any more): I think the dnsmasq that's running for the containers' lxcbr0 was messing with my external dns server setup. I pkill'ed it, and my dns started working. Ended up having to put the dns server in the container resolv.conf manually:
sudo vi /etc/resolvconf/resolv.conf.d/base
nameserver 172.16.0.200
LXC config files. I have to sort out why there are so many now. Some say they are autogenerated. Maybe I'm not restarting the LXC service when I should.
http://bj0z.wordpress.com/2011/08/19/howto-build-a-base-lxc-container-in-ubuntu-11-04/
edit these files:
/etc/default/lxc
# update: apparently these files arrived with latest lxc?
/etc/init/lxc-net.conf
/etc/default/lxc-net
change this line
USE_LXC_BRIDGE="true"
to
USE_LXC_BRIDGE="false"
and restart lxc (or reboot?)
On a running system, after you've done everything below, you can remove the (unused) lxcbr0 bridge if you forgot this step (although if you don't do the above, it will be recreated on reboot?)
The bridge-utils package provides brctl and the bridge_ports extension to /etc/network/interfaces
apt-get install bridge-utils
Some say libvirt should be used instead, because of suspected bug in bridge-utils when used with some other VM environments? I've not had any issues with bridge-utils, and don't have another VM environment.
Removing existing lxcbr0 bridge (not critical)
ip link set lxcbr0 down
brctl delbr lxcbr0
ifconfig
Before booting a new kernel, you can check its configuration
CONFIG=/path/to/config /usr/bin/lxc-checkconfig
https://help.ubuntu.com/12.04/serverguide/lxc.html
I have /home2 on 171-175 and /home3 on 176-180 and 181-190, partitioned for use by lxc. It's just a normal filesystem, so it's visible from the host. It's not like the hidden mapr partitions.
I replace the normal lxc directories with symbolic links. I could make it home3 and sit on home2, but that can be confusing and create hidden files if home2 isn't mounted.
NEWDIR=home2
sudo mkdir /$NEWDIR/lxclib /$NEWDIR/lxccache
NEWDIR=home3
sudo mkdir /$NEWDIR/lxclib /$NEWDIR/lxccache
# don't do these if the directories already exist above! assume these links were created
sudo rm -rf /var/lib/lxc /var/cache/lxc
sudo ln -s /$NEWDIR/lxclib /var/lib/lxc
sudo ln -s /$NEWDIR/lxccache /var/cache/lxc
My current setup only requires /etc/hostname to be changed (because it wants to be globally unique) and the my_cntr1_network file to be modified with ip/mac info. The mac is randomly generated on lxc-create, and you should carry it over into my_cntr1_network (cut it out of the generated config).
as root:
cd /var/lib/lxc
lxc-create -n cntr1 -t ubuntu
or (change subsequent names to trusty1 if this container is used)
lxc-create -t download -n trusty1 -- --dist ubuntu --release trusty --arch amd64
cd cntr1
# edit config. Cut/save these lines out to ../my_cntr1_network. Make it one dir above so not lost if lxc-destroy
# Network configuration. first 3 lines are unique. kbn 9/9/14
lxc.utsname = mr-0x10-cntr1
lxc.network.hwaddr = 00:16:3e:8d:77:2b
lxc.network.ipv4 = 172.16.2.211/16 172.16.255.255
lxc.network.ipv4.gateway = 172.16.0.1
lxc.network.type = veth
lxc.network.link = br0
# up should be last
lxc.network.flags = up
Then add this to cntr1/config (remember you have the latest lxc installed via ppa, to use lxc.include)
lxc.include = /var/lib/lxc/my_cntr1_network
Don't mistype it as lxc-include. There's no error detection; it will just be ignored.
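With one include file per container, stamping them out by hand gets tedious. A sketch of a generator (the gen_cntr_network name and /tmp output path are mine; the /16 subnet, broadcast, and gateway match my setup above and need adjusting for yours):

```shell
# Write a per-container network include file from name/mac/ip.
# Assumes the 172.16/16 network used elsewhere in these notes.
gen_cntr_network() {
  name=$1; hwaddr=$2; ipv4=$3; out=$4
  cat > "$out" <<EOF
# Network configuration. first 3 lines are unique.
lxc.utsname = $name
lxc.network.hwaddr = $hwaddr
lxc.network.ipv4 = $ipv4/16 172.16.255.255
lxc.network.ipv4.gateway = 172.16.0.1
lxc.network.type = veth
lxc.network.link = br0
# up should be last
lxc.network.flags = up
EOF
}

# Example: regenerate the file shown above (real target would be
# /var/lib/lxc/my_cntr1_network, one dir above the container).
gen_cntr_network mr-0x10-cntr1 00:16:3e:8d:77:2b 172.16.2.211 /tmp/my_cntr1_network
```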
Host eth0 is changed to be a bridge. I had problems adding a separate bridge; it requires NAT forwarding and maybe promiscuous mode, which has performance issues? This seems robust (tried on both 12.04 LTS and hwe trusty, kernels 3.2 and 3.13).
edit /etc/network/interfaces on the host. Best to reboot after, but /etc/init.d/networking restart is sometimes enough. MAKE SURE THIS IS EXACTLY RIGHT (substitute the correct addresses for your machine) and that bridge-utils is installed. Otherwise bridge_ports won't work, you'll have no network access, and you'll need to plug in a console to fix it! Hard to do if remote with no IPMI!
<on host>
sudo vi /etc/network/interfaces
auto lo
iface lo inet loopback
# double check that your device is eth0! it might be eth1, eth2 or eth3 due to renaming.
# Use ifconfig.
auto eth0
iface eth0 inet manual
auto br0
iface br0 inet static
# double check that your device is eth0! it might be eth1, eth2 or eth3 due to renaming
bridge_ports eth0
bridge_fd 0
bridge_maxwait 0
bridge_stp off
address 172.16.2.180
# Note my network uses a supernetted netmask here. Adjust as necessary
netmask 255.255.0.0
# inline comments might break things. Don't do. These are not needed.
# network 172.16.0.0
# broadcast 172.16.255.255
gateway 172.16.0.1
dns-nameservers 172.16.0.200
Heads up: if you do this on your home machine, the network-manager won't be usable anymore, so you can't start vpn with it. I installed vpnc which is nice, and use vpnc-connect and vpnc-disconnect. vpnc uses a config file, search for "vpnc package" on this page https://help.ubuntu.com/community/VPNClient for instructions (or google).
So at home I can test LXC and still have a VPN connected, even though network-manager disables itself because it saw /etc/network/interfaces was in use (the default Ubuntu /etc/NetworkManager/NetworkManager.conf has managed=true)
The 0xdata machines have static ips, and network-manager removed
apt-get purge network-manager
Some of that might be okay with default settings, but I include all for clarity. See resulting ifconfig above
/etc/init.d/networking restart
Might be enough, but to be sure, you should reboot.
lxc-create -n cntr1 -t ubuntu
With the latest lxc stuff, you'll see error messages from some services being terminated. That's okay.
lxc-start -n cntr1
The output looks like this. Because of dhcp delays? you might have to wait 30 secs to log in.
root@mr-0x5:/var/lib/lxc/cntr1# lxc-start -n cntr1
<4>init: hwclock main process (7) terminated with status 77
<4>init: ureadahead main process (8) terminated with status 5
<4>init: udev-fallback-graphics main process (53) terminated with status 1
<4>init: setvtrgb main process (71) terminated with status 1
<4>init: console-setup main process (100) terminated with status 1
<30>udevd[149]: starting version 175
Stopping (this does a clean stop, and kills if necessary. This is in the new lxc; you don't need the two older commands)
lxc-stop -n cntr1
lxc-restart exists in the old lxc, which combines stop and start. But the release I've installed above doesn't have it. Don't install lxc with apt-get if prompted. You want to use lxc-stop and lxc-start instead.
I never use this console attach
lxc-console -n cntr1
I do use this to get to the command line in the container if networking is broken there and there's no ssh. You run it from the host, and it gets you to the container command line:
lxc-attach -n cntr1
UPDATE: I'm having problems with the network config when I clone. I now copy the original config (cntr1/config) to the clone (cntr2/config) and then edit it with s/cntr1/cntr2. The cloned config looks very different from the original; I suspect the latest lxc has some issues there? This hand copy/edit of the config seems to make the clone work. I sometimes also have to add the network stuff to /etc/network/interfaces inside the clone and stop/start. Not sure if I always need that, yet.
For rapid provisioning, you may wish to customize a canonical container according to your needs and then make multiple copies of it. This can be done with the lxc-clone program. Given an existing container called C1, a new container called C2 can be created using
# but I have to hardwire the IP's correctly? and hostname maybe
sudo lxc-clone -o C1 -n C2
from http://docs.oracle.com/cd/E37670_01/E37355/html/ol_shutdown_containers.html
To display the containers that are configured, use the lxc-ls (or lxc-ls -f) command on the host.
[root@host ~]# lxc-ls
ol6ctr1
ol6ctr2
To display the containers that are running on the host system, specify the --active option.
[root@host ~]# lxc-ls --active
ol6ctr1
To display the state of a container, use the lxc-info command on the host.
[root@host ~]# lxc-info -n ol6ctr1
state: RUNNING
pid: 10171
To view the state of the processes in the container from the host, either run ps -ef --forest and look for the process tree below the lxc-start process or use the lxc-attach command to run the ps command in the container.
[root@host ~]# ps -ef --forest
If you were logged into the container, the output from the ps -ef command would look similar to the following.
[root@ol6ctr1 ~]# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 07:58 ? 00:00:00 /sbin/init
root 183 1 0 07:58 ? 00:00:00 /sbin/dhclient -H ol6ctr1 ...
root 206 1 0 07:58 ? 00:00:00 /sbin/rsyslogd -i ...
root 247 1 0 07:58 ? 00:00:00 /usr/sbin/sshd
root 254 1 0 07:58 lxc/console 00:00:00 /sbin/mingetty /dev/console
root 258 1 0 07:58 ? 00:00:00 login -- root
root 260 1 0 07:58 lxc/tty2 00:00:00 /sbin/mingetty /dev/tty2
root 262 1 0 07:58 lxc/tty3 00:00:00 /sbin/mingetty /dev/tty3
root 264 1 0 07:58 lxc/tty4 00:00:00 /sbin/mingetty /dev/tty4
root 268 258 0 08:04 lxc/tty1 00:00:00 -bash
root 279 268 0 08:04 lxc/tty1 00:00:00 ps -ef
Note that the process numbers differ from those of the same processes on the host, and that they all descend from the process 1, /sbin/init, in the container.
To suspend or resume the execution of a container, use the lxc-freeze and lxc-unfreeze commands on the host.
[root@host ~]# lxc-freeze -n ol6ctr1
[root@host ~]# lxc-unfreeze -n ol6ctr1
remove this line from the /etc/hosts
127.0.1.1 cntr1
The container's /etc/network/interfaces looks like this. Leave it with dhcp. It will get the static ip from the lxc config, but get the dns server (192.168.1.200) from dhcp:
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
UPDATE: I've been just changing it to this, and it works. Have to not race against dhcp? (first time seems fine; 2nd not, without this)
auto eth0
iface eth0 inet manual
Old recommendation: check the eth0 ip with ifconfig. I have to manually set /etc/network/interfaces in the container for some reason. Sometimes it seems like the only reliable way to get static ips (regardless of the outside settings) and dns nameservers. Like this:
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 172.16.2.110 # change
# Note my network uses a supernetted netmask here. Adjust as necessary
netmask 255.255.0.0
# inline comments might break things. don't do that. These are unneeded.
# network 172.16.0.0
# broadcast 172.16.255.255
gateway 172.16.0.1
dns-nameservers 172.16.0.200
Then /etc/init.d/networking restart
make hostname unique.
hostname <hostname>
vi /etc/hostname
In sudo sh, use passwd to give root a password.
apt-get install vim
edit ~/.vimrc
:set expandtab
au BufEnter * set tabstop=4 shiftwidth=4
au BufEnter *.java set tabstop=2 shiftwidth=2
set timezone with
sudo dpkg-reconfigure tzdata
or command line only (PST)
echo "America/Los_Angeles" | sudo tee /etc/timezone
sudo dpkg-reconfigure --frontend noninteractive tzdata
gives
Current default time zone: 'America/Los_Angeles'
Local time is now: Wed Oct 8 21:18:11 PDT 2014.
Universal Time is now: Thu Oct 9 04:18:11 UTC 2014.
Can install ntp if you want
apt-get install ntp
Because of typing 'y' or 'yes' to lots of parallel machines, and getting wedged if a machine was not in sync with the others and didn't expect 'y' or 'yes' (yes repeatedly outputs y), I add this to the .bashrc for root and maybe kevin and jenkins:
alias y=/bin/ls
alias yes=/bin/ls
I shorten the failsafe timeout to get fast boot http://tech.pedersen-live.com/2012/05/disable-waiting-for-network-configuration-messages-on-ubuntu-boot/
vi /etc/init/failsafe.conf
change the first sleep 20 to sleep 5
change the sleep 40 to sleep 15
change the sleep 59 to sleep 15
This is about waiting for the network to be 'configured'
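The three sleep edits above can be done non-interactively with sed. A sketch that operates on a copy in /tmp with a stand-in stanza; point it at the real /etc/init/failsafe.conf (after taking a backup) to apply it:

```shell
# Stand-in for the relevant lines of /etc/init/failsafe.conf.
cat > /tmp/failsafe.conf <<'EOF'
sleep 20
sleep 40
sleep 59
EOF

# Shorten the failsafe sleeps: 20 -> 5, 40 -> 15, 59 -> 15.
# (sed replaces every match; in the real file only the first
# "sleep 20" matters, per the note above.)
sed -i -e 's/sleep 20$/sleep 5/' \
       -e 's/sleep 40$/sleep 15/' \
       -e 's/sleep 59$/sleep 15/' /tmp/failsafe.conf
cat /tmp/failsafe.conf
```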
need showmount
apt-get install nfs-common
so I can do this
showmount -e mr-0xs3
Export list for mr-0xs3:
/mnt/mr-0xs3-pool/hdp2.1_hdfs_datasets (everyone)
Install java with ppa per http://www.webupd8.org/2012/01/install-oracle-java-jdk-7-in-ubuntu-via.html
ubuntu 12.04?
apt-get update
sudo apt-get install python-software-properties
ubuntu 14.04
apt-get update
sudo apt-get install software-properties-common
both:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default
sudo apt-get install oracle-java7-set-default
other things I like for our 0xdata "standard machines"
ssh:
apt-get install openssh-server
copy things in /root from mr-0xe1 (some shell scripts) and run some of them. Set up a vnc server? Maybe, yeah. Set up user passwords.
http://askubuntu.com/questions/246323/why-does-sshs-password-prompt-take-so-long-to-appear I turn off reverse dns lookup on ssh login
There are several things that can go wrong. Add -vvv to make ssh print a detailed trace of what it's doing, and see where it's pausing.
I turn off reverse DNS lookup in /etc/ssh/sshd_config. And GSSAPI while you're at it.
UseDNS no
GSSAPIAuthentication no
then
service ssh restart
It's possible that the initial lxc container setup, then changing the dns, leaves a bad dns entry or something in /etc/resolv.conf or somewhere? Not sure.
I used to install autofs. Maybe not on newer machines. (and not on LXC machines?) I did have it on the host machines.
copy /etc/ssh/sshd_config from an existing 0xdata machine (details about max starts/sessions)
in case you need to get stuff from s3
apt-get install s3cmd
latest from source forge..as of 6/15 at http://sourceforge.net/projects/s3tools/files/s3cmd/1.5.2/
wget http://sourceforge.net/projects/s3tools/files/s3cmd/1.5.2/s3cmd-1.5.2.tar.gz
tar -xvzf s3cmd*tar.gz
cd s3cmd*
python ./setup.py install
s3cmd --version
aptitude is good
apt-get install aptitude
aptitude update -y
path resolution
apt-get install realpath
monitoring tools:
apt-get install htop
apt-get install iotop
apt-get install saidar
in /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
/opt/who.sh #custom script to look for h2o processes
For R/numpy/scipy
sudo apt-get install libcurl4-openssl-dev
Probably want these for liblinear etc
sudo apt-get install libblas3gf -y
sudo apt-get install libblas-doc -y
sudo apt-get install libblas-dev -y
sudo apt-get install liblapack3gf -y --reinstall
sudo apt-get install liblapack3gf-base -y --reinstall
sudo apt-get install liblapack-doc -y
sudo apt-get install liblapack-dev -y --reinstall
sudo ln -s /usr/lib/liblapack.so.3gf /usr/lib/liblapack.so.3
sudo ln -s /usr/lib/libblas.so.3gf /usr/lib/libblas.so.3
I seem to require these links; I can't find something that installs them right, and some R packages look for them.
If I copy /usr/local/lib/R/site-library and try to do library("LiblineaR") or others in R, sometimes it can't find libRblas.so (and maybe others: libRlapack.so).
It seems like if I copy these two files to /usr/local/lib/R/lib and create links to them in /usr/lib, things are okay. But I'm not sure why my install didn't set them up right. Do I have old copies of packages, and would this go right if I install.packages("..") them in R rather than copying the site-library? Or ??
cd /usr/local/lib/R
scp -p -r <another machine's copy>/usr/local/lib/R/lib .
cd /usr/lib
ln -s /usr/local/lib/R/lib/libRlapack.so
ln -s /usr/local/lib/R/lib/libRblas.so
add to /etc/apt/sources.list
deb http://cran.stat.ucla.edu/bin/linux/ubuntu precise/
deb-src http://cran.stat.ucla.edu/bin/linux/ubuntu precise/
then
apt-get update
you get
GPG error: http://cran.stat.ucla.edu precise/ Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 51716619E084DAB9
reload the missing key
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
then
apt-get update
apt-get install r-base
The setuptools install is platform dependent; see here: https://pypi.python.org/pypi/setuptoo
I suppose on ubuntu
apt-get install python-setuptools
should work. But not sure what version you get. Better to do
wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo python
platform dependent ways of installing pip are at https://pip.pypa.io/en/latest/installing.html
on ubuntu
sudo apt-get install python-pip
pip install pip --upgrade
Weird: that leaves pip not visible, because it changed paths. Need to start a new shell to see the new path.
# pip install pip --upgrade
bash: /usr/bin/pip: No such file or directory
this fixes it
apt-get install python-pip --reinstall
# you can't do --force-reinstall with the version installed above.. just upgrade first
pip install pip --upgrade
# complains about /usr/bin/pip. It's at /usr/local/bin/pip ..force the path to make it work
# probably need to start a new shell to get path right
bash
/usr/local/bin/pip install pip --upgrade --force-reinstall
To check the version: apparently you want to update distribute too. Got this from pip list after the above:
Warning: cannot find svn location for distribute===0.6.24dev-r0
So I make sure that's the most recent too. But this wipes out setuptools, and means distribute can't install again:
pip install distribute --upgrade --force-reinstall
so just do
pip install distribute --upgrade
# pip list | egrep '(distribute|pip|setuptools)'
distribute (0.7.3)
pip (7.0.3)
setuptools (17.0)
I'm thinking maybe we want people to install the latest setuptools also, doing both the pip install and the upgrade. We should warn that with the default platform builds (like apt-get install python-pip), you don't get the latest pip and have to upgrade it yourself.
I like the reinstall, because it makes sure everything's clean and you can see the version. And it's all different if you have virtualenv or a private place for python packages?
Also, we should note that this is all just for python 2.7; python 3 instructions are different.
as they say " On Linux, pip will generally be available for the system install of python using the system package manager, although often the latest version will be unavailable."
# sudo apt-get install python-numpy python-scipy
sudo apt-get install python-matplotlib python-nose
sudo apt-get install python-dev
# this should have been done above
# sudo apt-get install python-setuptools
# easy_install pip
# newer versions?
pip install -U numpy scipy scikit-learn statsmodels
# upgrades?
# already done above
# pip install -U pip
pip install -U numpy scipy statsmodels
other python packages
pip install requests simplejson paramiko psutil
remove libreoffice stuff
apt-get remove libreoffice*
Current R packages we use can be copied from /usr/local/lib/R/site-library on another machine. Maybe need to install.packages("LiblineaR") in R.
If rgl didn't install because of GL/gl.h in ubuntu, do this install first
sudo apt-get install r-base-dev xorg-dev libglu1-mesa-dev mesa-common-dev
sudo apt-get build-dep r-cran-rgl
probably need to add universe and multiverse repos first /etc/apt/sources.list
deb http://us.archive.ubuntu.com/ubuntu/ precise universe
deb-src http://us.archive.ubuntu.com/ubuntu/ precise universe
deb http://us.archive.ubuntu.com/ubuntu/ precise-updates universe
deb-src http://us.archive.ubuntu.com/ubuntu/ precise-updates universe
apt-get update
apt-get install r-cran-rgl
Then go into R and just do install.packages("rgl") and see that it completes without error
apt-get install iputils-tracepath
# should be bigger than 1500 by default
tracepath localhost
add this to /etc/network/interfaces (under localhost) and reboot
auto lo
iface lo inet loopback
# see explanation of 1500 mtu issue at https://0xdata.atlassian.net/wiki/pages/viewpage.action?pageId=31916232
post-up /sbin/ethtool --offload lo tso off
post-up /sbin/ethtool --offload lo ufo off
post-up /sbin/ethtool --offload lo gso off
post-up /sbin/ethtool --offload lo gro off
post-up /sbin/ifconfig lo mtu 1500
test after reboot. should see 1500
tracepath localhost
Open a browser to http://localhost:5000 (username: admin, password: admin).
Web panel config file: /srv/lwp/lwp.conf
current info
http://man7.org/linux/man-pages/man1/lxc-autostart.1.html
says to set lxc.start.auto == 1 in the config
can check with lxc-ls --fancy
old info: To make a container autostart, you simply need to symlink its config file into the /etc/lxc/auto directory:
(I've not checked whether they really restart on reboot with this)
# with the new lxc, have to create the auto directory?
mkdir /etc/lxc/auto
ln -s /var/lib/lxc/cntr1/config /etc/lxc/auto/cntr1.conf
LXC 0.8 allows
lxc.network.ipv4.gateway = 172.16.0.1
so you can specify the gateway.
You can specify the broadcast if necessary on the same line as the ipv4
lxc.network.ipv4 = 172.16.2.112/16 172.16.2.255
https://gist.github.com/gionn/7585324 How to enable bind mount inside lxc container
When mount is returning:
STDERR: mount: block device /srv/database-data/postgres is write-protected, mounting read-only
mount: cannot mount block device /srv/database-data/postgres read-only
and dmesg shows:
[ 6944.194280] type=1400 audit(1385049795.420:32): apparmor="DENIED" operation="mount" info="failed type match" error=-13 parent=6631 profile="lxc-container-default" name="/var/lib/postgresql/9.1/main/" pid=6632 comm="mount" srcname="/srv/database-data/postgres/" flags="rw, bind"
AppArmor is blocking mount -o bind inside the LXC container.
To enable it add in /etc/apparmor.d/lxc/lxc-default:
profile lxc-container-default flags=(attach_disconnected,mediate_deleted) {
...
mount options=(rw, bind),
...
Reload apparmor:
# /etc/init.d/apparmor reload
To ensure read-only mounts work, you'll want mount options to be:
mount options=(rw, bind, ro),
Rather than this, I put hard mounts in /etc/fstab for /mnt things to the underlying system's /mnt, which is actually autofs. That seemed to work. How to do that is described in another section lower down.
Using NFS/Autofs in a LXC container http://bridge.grumpy-troll.org/2014/03/lxc-routed-on-ubuntu/ You can mount NFS from outside the container, which is the approach I use with NAT’d containers, although then the container is unaware of the mount-point and you’re not using the same uid space.
To mount NFS inside the container, you need to tell AppArmor to allow this
# vi /etc/apparmor.d/abstractions/lxc/container-base
...
# service apparmor restart
The rules I add are:
# allow NFS
mount fstype=nfs,
mount fstype=nfs4,
mount fstype=rpc_pipefs,
You can then just add NFS mount-points to the /etc/fstab inside the container’s rootfs.
This is in /var/lib/lxc/precise1/fstab
home-0xdiag-datasets was bind'ed to /exports for nfs mount reasons, so we can reuse it here (rather than /home/0xdiag/home-0xdiag-datasets). Does it matter?
To create the directory (for the mount) automatically in the container, you can also add the create=dir option in the fstab :
/exports/home-0xdiag-datasets /var/lib/lxc/precise1/rootfs/home/0xdiag/home-0xdiag-datasets none bind,create=dir
This is specific to LXC. https://lists.linuxcontainers.org/pipermail/lxc-devel/2013-December/006444.html
Just like we already had "optional", this adds two new LXC-specific mount flags:
create=dir (will do a mkdir_p on the path)
create=file (will do a mkdir_p on the dirname + a fopen on the path)
This was motivated by some of the needed bind-mounts for the unprivileged containers.
cribbed from digitalocean, thanks https://www.digitalocean.com/community/tutorials/getting-started-with-lxc-on-an-ubuntu-13-04-vps
Besides just isolation, another massive benefit of using LXC is its ability to apply cgroup limits to the processes within a container.
Limits for a container are defined in its config file, which for our container can be found at /var/lib/lxc/test-container/config.
Memory limits can be used to set a maximum RAM usage for a container. In this case, we'll limit our container to 50MB of memory:
lxc.cgroup.memory.limit_in_bytes = 50000000
CPU limits are defined slightly differently; unlike with memory, where physical limits are defined, CPU limits operate with CPU 'shares':
lxc.cgroup.cpu.shares = 100
These shares are not linked to any physical quantity; they represent relative allocations of CPU resources, so a container with more shares gets higher CPU priority. The numbers themselves are arbitrary: giving one container 10 shares and another 20 is the same as giving them 1000 and 2000 respectively, since all it expresses is that the second container has twice the CPU priority. Just make sure you're consistent with your scale across containers.
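To make the proportionality concrete, here's a quick sketch (plain shell arithmetic, not an LXC command) of what fraction of contended CPU two hypothetical containers would get from their shares:

```shell
# cpu.shares only matters under contention; the split is proportional.
# Hypothetical containers: A with 10 shares, B with 20 shares.
a=10
b=20
total=$((a + b))
echo "A gets $((100 * a / total))%"   # 33%
echo "B gets $((100 * b / total))%"   # 66%
```

Scaling both values by the same factor (100/200, 1000/2000) produces the same split.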
Once you've changed the cgroup limits in the config file, you'll need to shutdown and restart the container for the changes to take effect.
Alternatively, limits can be set temporarily on a running container with the lxc-cgroup command:
lxc-cgroup -n test-container cpu.shares 100
It is often the case that you'll want the containers to autostart after a reboot, particularly if they are hosting services. By default, containers will not be started after a reboot, even if they were running prior to the shutdown.
To make a container autostart, you simply need to symlink its config file into the /etc/lxc/auto directory:
ln -s /var/lib/lxc/test-container/config /etc/lxc/auto/test-container.conf
Now running lxc-ls -f again will show that our container is set up to autostart:
# lxc-ls -f
NAME STATE IPV4 IPV6 AUTOSTART
----------------------------------------------------
test-container RUNNING 10.0.3.143 - YES
I just had a headache with a copied container "working" but then ssh getting interrupted and ntpd's ipv6 binding not working. Also note that apparently you can't disable ipv6, or things stop working. It seems my problem was reusing a MAC address from another container on the new, copied container; something in my network didn't like that (even though I think all my IPs and MACs were unique). Things were fine after I incremented the MAC address by one. The lesson: treat MAC + container as one-use. Whenever you create a new container, create a new MAC to go with it. I set the MAC in the LXC config on the host (ipv4 only), so my container's /etc/network/interfaces now just says "manual".
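One way to guarantee a fresh MAC per container (a sketch; 00:16:3e is the Xen OUI that LXC's templates use for randomly generated addresses) is to generate one each time and paste it into the container's config as lxc.network.hwaddr:

```shell
# Generate a random MAC in the 00:16:3e prefix for a new container.
# Put the result in /var/lib/lxc/<name>/config as:
#   lxc.network.hwaddr = <generated MAC>
mac=$(printf '00:16:3e:%02x:%02x:%02x' \
    $((RANDOM % 256)) $((RANDOM % 256)) $((RANDOM % 256)))
echo "$mac"
```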
I was using lxc.network.ipv4.gateway = auto and it stopped working: 'ip route' showed the default route going to the host machine instead of my desired 172.16.0.1 gateway.
It looks like I have to specify both the IP address and the gateway in the container's config, not just in /etc/network/interfaces.
I also specify the broadcast address, just to be safe, i.e. for 172.16.2.211:
lxc.network.ipv4 = 172.16.2.211/16 172.16.255.255
lxc.network.ipv4.gateway = 172.16.0.1
I also have it correct in the /etc/network/interfaces inside the container
now it works
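Putting it together, the working combination looks like this on the host side (the hwaddr value is a made-up example; the addresses are the ones from above):

```
# host side: /var/lib/lxc/<container>/config
lxc.network.hwaddr = 00:16:3e:12:34:56
lxc.network.ipv4 = 172.16.2.211/16 172.16.255.255
lxc.network.ipv4.gateway = 172.16.0.1
```

Inside the container, /etc/network/interfaces then either repeats the same static settings or just marks eth0 as manual and lets the LXC config drive it (both setups appear in these notes).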
Docker drops LXC as default execution environment by Chris Swan on Mar 13, 2014
With the release of version 0.9 Docker.io have dropped LXC as the default execution environment, replacing it with their own libcontainer. At the same time Docker now supports a much broader range of isolation tools through the use of execution drivers, which include: OpenVZ, systemd-nspawn, libvirt-lxc, libvirt-sandbox, qemu/kvm, BSD Jails, Solaris Zones, and chroot.
Libcontainer is a library written in Go that provides direct access for Docker to Linux container APIs:
Docker out of the box can now manipulate namespaces, control groups, capabilities, apparmor profiles, network interfaces and firewalling rules - all in a consistent and predictable way, and without depending on LXC or any other userland package. This drastically reduces the number of moving parts, and insulates Docker from the side-effects introduced across versions and distributions of LXC. In fact, libcontainer delivered such a boost to stability that we decided to make it the default. In other words, as of Docker 0.9, LXC is now optional.
LXC itself recently announced the release of version 1.0. Whilst Docker can still be used in combination with LXC it’s likely that most users will run with the new default that omits LXC.
trusty kernel backport. I do this on the host machine only, not the container. CHECK FIRST! If apt-get update after a new install is already pulling trusty packages, and uname -r says 3.13.x, you don't need/want this. I did a 12.04.5 install on a haswell-e box, and it installed a 3.13.x kernel for an Ubuntu 12.04.5 LTS install from flash (iso).
sudo apt-get install xserver-xorg-lts-precise
hwe-support-status --verbose
sudo apt-get install linux-generic-lts-trusty xserver-xorg-lts-trusty libgl1-mesa-glx-lts-trusty linux-image-generic-lts-trusty
# or this?
# apt-get install --install-recommends linux-generic-lts-trusty xserver-xorg-lts-trusty libgl1-mesa-glx-lts-trusty
reboot
After uninstalling and reinstalling gdm and lightdm, I was dead in the water on my haswell-e box until I did this:
sudo apt-get install xserver-xorg-lts-precise
apt-get install --install-recommends linux-generic-lts-trusty xserver-xorg-lts-trusty libgl1-mesa-glx-lts-trusty
after apt-get install gdm
useful for switching between:
sudo dpkg-reconfigure gdm
or
sudo dpkg-reconfigure lightdm
picking gdm as default display manager
service gdm start
had the right effect of restarting the display then
service lightdm start
restarted me back to the login, which was good.
When I removed gdm and went back to lightdm, a restart didn't work well, although
startx
fixed that. The second time, service lightdm restart worked (after getting to a CTRL-ALT-F1 terminal).
Seems better with gdm though..ah! now I got the gdm login greeter
disabled the user list at login with
apt-get install gconf-editor
gconf-editor
apps -> gdm -> simple-greeter
check the 'disable user list' box
(look at the ubuntu install notes on confluence under network infrastructure for more details about how to get this in lightdm conf correctly on ubuntu 14.04)
To get the full text to see why things are delayed: You would need to edit the file /etc/default/grub. In this file you'll find an entry called GRUB_CMDLINE_LINUX_DEFAULT. This entry must be edited to control the display of the splash screen.
The presence of the word splash in this entry enables the splash screen, with condensed text output. Adding quiet as well, results in just the splash screen; which is the default for the desktop edition since 10.04 (Lucid Lynx). In order to enable the "normal" text start up, you would remove both of these.
So, the default for the desktop, (i.e. splash screen only):
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
For the traditional, text display:
GRUB_CMDLINE_LINUX_DEFAULT=""
After editing the file, you need to run update-grub.
sudo update-grub
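A scripted version of that edit, as a sketch operating on a throwaway copy (on a real system you'd run the sed against /etc/default/grub itself, then sudo update-grub):

```shell
# Clear the splash/quiet options to get the traditional text boot display.
f=$(mktemp)
printf 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$f"
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT=""/' "$f"
grep '^GRUB_CMDLINE_LINUX_DEFAULT' "$f"   # prints GRUB_CMDLINE_LINUX_DEFAULT=""
rm -f "$f"
```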
To reduce the network timeout delays for dhcp (why can't I disable dhcp when I'm static everywhere??), you can try setting these in /etc/dhcp/dhclient.conf:
timeout 10;
backoff-cutoff 0;
initial-interval 0;
retry 15;
See dhclient.conf manpage (man dhclient.conf) for reference.
msr-tools only on host machine (for rdmsr/wrmsr, i.e. turbo mode state)
apt-get install msr-tools
tmpreaper
apt-get install tmpreaper
Copy /etc/tmpreaper.conf from existing machine
turbostat only on the host machine
sudo apt-get install linux-tools-common
sudo modprobe msr
sudo turbostat
tools. only do these on the host machine
add-apt-repository -y ppa:yannubuntu/boot-repair
apt-get update
apt-get install boot-repair
apt-get install smartmontools
apt-get install hddtemp
apt-get install hdparm
This PPA contains the latest release of Grub Customizer.
sudo add-apt-repository ppa:danielrichter2007/grub-customizer
sudo apt-get update
sudo apt-get install grub-customizer
ipmi. I only do this on the host machine
apt-get install freeipmi-tools
apt-get install ipmitool
modprobe ipmi_devintf
modprobe ipmi_si
You can add these to /etc/modules to have them loaded automatically (just list the module names):
ipmi_devintf
ipmi_si
sensors. I only do this on the host machine
apt-get install lm-sensors
sensors-detect
sensors
edac-util. I only do this on the host machine
apt-get install edac-utils
not working on the haswell system? mcelog is the replacement? (does mcelog work at the same time as edac-utils on older systems?)
apt-get install mcelog
Only do these on the host machine
apt-get install dstat
apt-get install cpufrequtils