Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

finit errors on boot & trouble with logging #100

Closed
villains opened this issue Mar 1, 2018 · 22 comments
Closed

finit errors on boot & trouble with logging #100

villains opened this issue Mar 1, 2018 · 22 comments
Milestone

Comments

@villains
Copy link
Contributor

villains commented Mar 1, 2018

I am trying finit 3.1 and I am having two issues. Hopefully I am just doing something wrong. :)

First, when finit starts up, the following messages appear in the log:

Jan  1 00:10:13 igep00x0 syslog.info syslogd started: BusyBox v1.19.4
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/lo/exist)
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/lo/up)
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/lo/running)
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/eth0/exist)
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/eth0/up)
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/eth0/running)
Jan  1 00:10:15 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/route/default)
Jan  1 00:10:16 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/eth0/exist)
Jan  1 00:10:16 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/eth0/up)
Jan  1 00:10:16 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/var/run/finit/cond/net/eth0/running)
Jan  1 00:10:16 igep00x0 auth.info login[412]: root login on 'ttyO2'

I realize that BusyBox is rather old, but that is what I am stuck with for this project.

The error message appears to be coming from the file reconf missing from the directory /var/run/finit/cond/ (line 75 in cond-w.c).

Next, I wanted to test out the logging features.

I tried putting the following in finit.conf:

task [2345] log:/var/log/test1 uname -a
task [2345] log uname -a
task [2345] log:/var/log/test2 <svc/sbin/syslogd> uname -a

The files appear in /var/log/ except they are empty. Also, no output from the task appear in the syslog messages.
I was expecting to see the following:

The same is true if I change task to service in the above snippet.

Could you suggest some steps to help troubleshoot these issues?

I am using the bootmisc and netlink plugins on this test setup. The hardware is an IGEP0020 board running a custom Yocto with Linux kernel 2.6.37. Thanks very much!

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

Sorry, I realized I didn't include my complete finit.conf file. Here it is:

user root
host test
runlevel 2

# Serial port console
tty [12345] /dev/ttyO2 115200 nowait noclear linux

# Set up logging to /var/log/messages
log size=100k count=10
service [S12345] /sbin/syslogd -n -b 10 -D

# Test logging
task [2345] log:/var/log/test1 uname -a
task [2345] log uname -a
task [2345] log:/var/log/test2 <svc/sbin/syslogd> uname -a

@troglobit
Copy link
Owner

Hi!

OK, first of all we need to get a mount point for /run. The bootmisc plugin can set up a compat symlink in /var/run to /run. The condition subsystem relies on this to set up properly. Without that <svc/sbin/syslogd> will not work.

The log redirect should work, not really sure why it doesn't work for you. Maybe try adding --debug to the kernel cmdline at boot? This enables a lot of debug messages to the console (and dmesg), which can help pinpoint this issue.

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

Thanks. I do have both the /var and /run directories, and finit seems to recognize that they are there, because directories appear in both.

root@igep00x0:~# ls -al /var
drwxrwxrwt   12 root     root           280 Jan  1 00:00 .
drwxr-xr-x   16 root     root          1056 Jan  1 00:00 ..
drwxr-xr-x    2 root     root            40 Jan  1 00:00 cache
drwxr-xr-x    2 root     root            40 Jan  1 00:00 db
drwxr-xr-x    2 root     root            40 Jan  1 00:00 empty
drwxr-xr-x    2 root     root            40 Jan  1 00:00 games
drwxr-xr-x    5 root     root           100 Jan  1 00:00 lib
lrwxrwxrwx    1 root     root             9 Jan  1 00:00 lock -> /run/lock
drwxr-xr-x    3 root     root           160 Jan  1 00:00 log
drwxr-xr-x    2 root     root            40 Jan  1 00:00 mail
drwxr-xr-x    2 root     root            40 Jan  1 00:00 opt
lrwxrwxrwx    1 root     root             4 Jan  1 00:00 run -> /run
drwxr-xr-x    3 root     root            60 Jan  1 00:00 spool
drwxr-xr-x    2 root     root            40 Jan  1 00:00 tmp
root@igep00x0:~# ls -al /run
drwxr-xr-x    9 root     root           280 Jan  1 00:00 .
drwxr-xr-x   16 root     root          1056 Jan  1 00:00 ..
drwxr-xr-x    3 root     root            60 Jan  1 00:00 finit
srwx------    1 root     root             0 Jan  1 00:00 finit.sock
-rw-r--r--    1 root     root            16 Jan  1 00:00 ifstate
drwxr-xr-x    2 root     root            40 Jan  1 00:00 lldpd
d-wxrw---t    2 root     root            40 Jan  1 00:00 lock
drwxr-xr-x    2 root     root            40 Jan  1 00:00 network
drwxr-xr-x    2 root     root            40 Jan  1 00:00 pluto
drwxr-xr-x    2 root     root            40 Jan  1 00:00 quagga
lrwxrwxrwx    1 root     root             8 Jan  1 00:00 shm -> /dev/shm
drwxr-xr-t    2 root     root            40 Jan  1 00:00 sshd
-rw-r--r--    1 root     root             4 Jan  1 00:00 syslogd.pid
-rw-r--r--    1 root     utmp          1152 Jan  1 00:00 utmp

Here is the contents of /etc/fstab:

rootfs               /                    auto       defaults              1  1
proc                 /proc                proc       defaults              0  0
devpts               /dev/pts             devpts     mode=0620,gid=5       0  0
tmpfs                /media/ram           tmpfs      defaults              0  0
tmpfs                /var                 tmpfs      defaults              0  0
tmpfs                /run                 tmpfs      defaults              0  0

When I added --debug to the kernel command line, finit produced the following output in /var/log/messages:

root@igep00x0:~# cat /var/log/messages
Jan  1 00:01:22 igep00x0 syslog.info syslogd started: BusyBox v1.19.4
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):   halted disabled/clean   cond:on
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: service_step():       /sbin/syslogd( 358): ->  running
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: service_step():       /sbin/syslogd( 358):  running  enabled/clean   cond:on
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):   halted disabled/clean   cond:on
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: sm_step():state: running, runlevel: 0, newlevel: -1, teardown: 0, reload: 0
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: api_init():Setting up external API socket ...
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: main():Starting bootstrap finalize timer ...
Jan  1 00:01:22 igep00x0 daemon.debug finit[1]: main():Entering main loop ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_bootstrap_cb():Checking completion of all run/task ... 10
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_bootstrap_cb():All run/task have completed, resuming bootstrap.
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: finalize():Change to default runlevel, start all services ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: networking():Setting up networking ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: run():Started ifup and ended OK: 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: networking():Calling all network up hooks ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_oneshot():s => hook/net/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():hook/net/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):   halted disabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():state: running, runlevel: 0, newlevel: 2, teardown: 0, reload: 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():state: runlevel/change, runlevel: 0, newlevel: 2, teardown: 0, reload: 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():Setting new runlevel --> 2 <-- previous 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():Stopping services services not allowed in new runlevel ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():       /sbin/syslogd( 358):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):   halted  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0): ->    ready
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):    ready  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():state: runlevel/wait, runlevel: 2, newlevel: -1, teardown: 1, reload: 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():All services have been stoppped, calling runlevel change hooks ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):    ready  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():Starting services services new to this runlevel ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():       /sbin/syslogd( 358):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):    ready  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_start():Starting uname: uname -a
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395): ->  running
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():state: running, runlevel: 2, newlevel: -1, teardown: 0, reload: 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: finalize():Clean up all bootstrap-only tasks/services ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: finalize():Running svc up hooks ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_oneshot():s => hook/svc/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():hook/svc/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():       /sbin/syslogd( 358):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: finalize():Calling all system up hooks ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_oneshot():s => hook/sys/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():hook/sys/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():       /sbin/syslogd( 358):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: finalize():Going silent ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: finalize():Launching all getty services ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: tty_start():/dev/ttyO2: Starting built-in getty ...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: generic_io_cb():Calling I/O netlink from runloop...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: nl_link():lo: New link, flags 0x10049, change 0x1
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/lo/exist
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/lo/exist
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/lo/exist)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/lo/exist
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/lo/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/lo/up
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/lo/up)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/lo/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/lo/running
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/lo/running
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/lo/running)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/lo/running
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: generic_io_cb():Calling I/O netlink from runloop...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: nl_link():eth0: New link, flags 0x11043, change 0x1
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/exist
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/exist
Jan  1 00:01:23 igep00x0 daemon.debug finit[395]: Calling /bin/sh uname -a
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/exist)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/exist
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/up
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/up)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/up
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/running
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/running
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/running)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/running
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sigchld_cb():Collected child 395
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname( 395):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_monitor():collected uname(395)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):  running  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0): ->   active
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):   active  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0): ->     done
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):     done  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sm_step():state: running, runlevel: 2, newlevel: -1, teardown: 0, reload: 0
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: sigchld_cb():Collected child 397
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_step():               uname(   0):     done  enabled/clean   cond:on
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: service_monitor():collected unknown PID 397
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: generic_io_cb():Calling I/O netlink from runloop...
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set():net/route/default
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/route/default
Jan  1 00:01:23 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/route/default)
Jan  1 00:01:23 igep00x0 daemon.debug finit[1]: cond_update():net/route/default
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: generic_io_cb():Calling I/O netlink from runloop...
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: nl_link():eth0: New link, flags 0x1003, change 0x0
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/exist
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/exist
Jan  1 00:01:25 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/exist)
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/exist
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/up
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/up
Jan  1 00:01:25 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/up)
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/up
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_clear():net/eth0/running
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/running
Jan  1 00:01:25 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/running)
Jan  1 00:01:25 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/running
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: generic_io_cb():Calling I/O netlink from runloop...
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: nl_link():eth0: New link, flags 0x11043, change 0x0
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/exist
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/exist
Jan  1 00:01:26 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/exist)
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/exist
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/up
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/up
Jan  1 00:01:26 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/up)
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/up
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_set():net/eth0/running
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_set_path():/run/finit/cond/net/eth0/running
Jan  1 00:01:26 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/running)
Jan  1 00:01:26 igep00x0 daemon.debug finit[1]: cond_update():net/eth0/running
Jan  1 00:01:34 igep00x0 auth.info login[396]: root login on 'ttyO2'

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

It looks like the directory /run/finit/cond/net/eth0 is not being created automatically.

The eth0 interface is up and working correctly (able to ping other computers on the LAN, etc.).

@troglobit
Copy link
Owner

troglobit commented Mar 1, 2018

Yup, it seems Finit currently doesn't check if /run is to be mounted by the system /etc/fstab. So /run is mounted twice in your case, and the file /run/finit/cond/reconf is set in the first mount. Hence the problems.

This might explain why nothing else works either, although the non-conditionalized uname -a tasks should run fine in runlevel 2, well after syslogd has started.

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

In that case, should I remove /run from /etc/fstab to avoid the double mount?

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

So I just tried removing /run from /etc/fstab and it doesn't seem to help. I still get the Unable to read configuration generation errors related to missing /run/finit/cond/net/* stuff.

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

It looks like finit isn't creating lo and eth0 items in /run/finit/cond/net/ in response to netlink events?

@troglobit
Copy link
Owner

troglobit commented Mar 1, 2018

So, removing the double mount doesn't help and you get no directories for the network interfaces. Do you have the /run/finit/cond/reconf file, or is that still missing? That file should be first created before mount -na is called at boot, and after each initctl reload. All conditions need to read that file to set their "generation id".

Double check that /run isn't mounted twice, try unmounting it and check ls /run ... and mount commands.

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

Will check and report back. Thank you for your support!

@villains
Copy link
Contributor Author

villains commented Mar 1, 2018

Would it help if I sent you a .tar.gz of the whole rootfs? It is only 3MB.

@troglobit
Copy link
Owner

troglobit commented Mar 1, 2018

Probably not, I'd likely have to be able to run it to see what's going on at runtime. Also, I wouldn't be able to look at it until tomorrow evening (CET) at the earliest.

A minor detail, that shouldn't affect things, the following two lines of your /etc/fstab can also be removed:

proc                 /proc                proc       defaults              0  0
devpts               /dev/pts             devpts     mode=0620,gid=5       0  0

I'm sorry I cannot be of any more help :-/

@villains
Copy link
Contributor Author

villains commented Mar 2, 2018

No problem, I appreciate the time you're spending on this already. :)

I will probably get back to working on this next week, and hopefully I will report back here with a solution.

@villains
Copy link
Contributor Author

villains commented Mar 2, 2018

Quick update.

My /etc/fstab is now just this:

tmpfs                /media/ram           tmpfs      defaults              0  0
tmpfs                /var                 tmpfs      defaults              0  0

Still, finit does not seem to be happy:

Jan  1 00:28:03 igep00x0 syslog.info syslogd started: BusyBox v1.19.4
Jan  1 00:28:03 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/wlan0/exist)
Jan  1 00:28:03 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/wlan0/up)
Jan  1 00:28:03 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/wlan0/running)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/lo/exist)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/lo/up)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/lo/running)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/exist)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/up)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/running)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/route/default)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/wlan0/exist)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/wlan0/up)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/wlan0/running)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/exist)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/up)
Jan  1 00:28:13 igep00x0 daemon.err finit[1]: cond_set_path():Unable to read configuration generation (/run/finit/cond/net/eth0/running)
Jan  1 00:28:15 igep00x0 authpriv.notice login[478]: ROOT LOGIN  on '/dev/ttyO2' from '2.6.37+'

Contents of finit.conf:

user root
host test
runlevel 2

tty [12345] /dev/ttyO2 115200 nowait noclear linux

# Set up logging to /var/log/messages
log size=100k count=10
service [S12345] /sbin/syslogd -n -b 10 -D

Mounted items:

root@igep00x0:~# mount
rootfs on / type rootfs (rw)
ubi0:rootfs on / type ubifs (rw,relatime)
devtmpfs on /dev type devtmpfs (rw,relatime,size=257740k,nr_inodes=64435,mode=755)
none on /proc type proc (rw,relatime)
none on /sys type sysfs (rw,relatime)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620)
shm on /dev/shm type tmpfs (rw,relatime)
tmpfs on /run type tmpfs (rw,nodev,relatime,size=51568k,mode=755)
tmpfs on /media/ram type tmpfs (rw,relatime)
tmpfs on /var type tmpfs (rw,relatime)

Per your questions previously:

So, removing the double mount doesn't help and you get no directories for the network interfaces. Do you have the /run/finit/cond/reconf file, or is that still missing? That file should be first created before mount -na is called at boot, and after each initctl reload. All conditions need to read that file to set their "generation id".

Contents of /run/finit/cond:

root@igep00x0:~# ls -al /run/finit/cond
drwxr-xr-x    3 root     root            60 Jan  1 00:30 .
drwxr-xr-x    3 root     root            60 Jan  1 00:30 ..
drwxr-xr-x    6 root     root           120 Jan  1 00:30 hook

Looks like the reconf file is still missing.

Double check that /run isn't mounted twice, try unmounting it and check ls /run ... and mount commands.

Here are the results of that:

root@igep00x0:~# umount /run
root@igep00x0:~# ls -al /run/finit/cond
ls: /run/finit/cond: No such file or directory
root@igep00x0:~# mount
rootfs on / type rootfs (rw)
ubi0:rootfs on / type ubifs (rw,relatime)
devtmpfs on /dev type devtmpfs (rw,relatime,size=257740k,nr_inodes=64435,mode=755)
none on /proc type proc (rw,relatime)
none on /sys type sysfs (rw,relatime)
devpts on /dev/pts type devpts (rw,relatime,gid=5,mode=620)
shm on /dev/shm type tmpfs (rw,relatime)
tmpfs on /media/ram type tmpfs (rw,relatime)

The run directory itself is empty.

I think I've made this little system about as simple & primitive as I can possibly make it. Finit still does not like it. :(

I'm starting to carefully read the finit code to see about the missing reconf file. Any other suggestions to try?

Thanks again for your support so far!

@villains
Copy link
Contributor Author

villains commented Mar 8, 2018

After some more testing and poking around, I learned the following:

  • finit does not seem to work with a /var directory mounted as tmpfs
  • If the freshly installed rootfs has an empty /var directory (not mounted as tmpfs) then finit conditions will not work on the first boot (same errors about Unable to read configuration generation will appear). On the second boot (presumably after the bootmisc plugin has managed to create the FHS directories?) finit conditions seem to work fine (running the command initctl cond dump displays what I expect).

@villains
Copy link
Contributor Author

villains commented Mar 8, 2018

Digging a bit further, it looks like the problems originate from the following #define from the Linux paths.h header:

#define _PATH_VARRUN    "/var/run/"

It is used throughout finit to establish paths to items in /run by assuming that /var/run will be a symlink to /run. It looks to me that if both the /var and /run directories are mounted as tmpfs and are empty at boot, then assuming that /var/run exists and is a symlink to /run would cause some issues.

@troglobit
Copy link
Owner

At work we run systems with /var and /run mounted as tmpfs, the latter mounted by Finit, it works as expected. Not sure what's going on here, but a colleague of mine got a similar issue with Unable to read configuration generation during bring-up of a new system, so there might be something here to look into.

The /var/run confusion in paths.h is supposed to be handled by pid_runpath() in pid.h.

I'll try to get to the bottom of this during the weekend. My intention was to have an example system for you to compare with, but TroglOS isn't really that usable atm. So I'm preparing a basic rootfs + kernel for running with Qemu.

troglobit added a commit to troglobit/tinyroot that referenced this issue Mar 12, 2018
This seems to be necessary early at boot due to troglobit/finit#100

Signed-off-by: Joachim Nilsson <troglobit@gmail.com>
@troglobit
Copy link
Owner

troglobit commented Mar 12, 2018

OK, this took a while longer than I expected, sorry. Finally got around to it this weekend, created tinyroot to be able to confirm your issue.

So far this is the only workaround I have: troglobit/tinyroot@9035168 -- create a symlink in /var/run to /run in the root fiile system, so Finit can reference /var/run before mount -a has run.

@villains
Copy link
Contributor Author

Thanks, troglobit.

For now, the system works well enough with /var being a regular directory on the UBIFS partition. I had some ideas to use the functions from mntent.h to check whether /var will be mounted as tmpfs and if so, to mount it early and create the /var/run symlink to point to /run, before cond_init() is invoked. Hopefully I will find time to prototype it in the next couple of weeks. If it works I'll post a patch. Any thoughts on that approach in general?

@villains
Copy link
Contributor Author

Forgot to say: thanks for all your support & for taking some of your weekend to look into this! Very much appreciated.

@troglobit
Copy link
Owner

No problem, really wanted to get to the bottom of this myself. :)

The root cause of it all is the /var/run to /run migration that I've managed so poorly. Early mount of /var may be one way to attack this. But I think the COND_RECONF define should be wrapped with pid_runpath() everywhere, like in the cond_bump_reconf() function where all the pain starts ...

Also, cond_init() should not be called if /run isn't mounted, e.g. if the lines prior to that call fail. In one of my bring-up attempts of tinyroot I had forgotten to include CONFIG_TMPFS support in the kernel, and as you've pointed out, bring-up is the hardest part of trying out Finit so that needs to be greatly improved upon.

Again, sorry for this mess!

I'll see if I find some time to address these issues over the next couple of weeks. Unless you're still interested in helping out and beat me to it :)

@troglobit
Copy link
Owner

@villains I've just pushed a fix to the dev branch, which seems to work as intended on my system. I have a few other issues to address before the next release, but please let me know if the problem persists for you, if so we'll reopen this bug report.

@troglobit troglobit added this to the v3.2 milestone Mar 18, 2018
troglobit added a commit that referenced this issue Mar 22, 2018
We cannot allow pid_runpath() to return its 'file' argument because that
may be a stack variable in a helper function.

Signed-off-by: Joachim Nilsson <troglobit@gmail.com>
@troglobit troglobit modified the milestones: v3.3, v3.2 Apr 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants