RELEASE NOTES FOR COLLECTL
INSTALLATION
Installing the rpm
    rpm -ihv collectl-x.y.z.noarch.rpm
Installing from source
    unpack the tarball, which you've obviously done
    follow the instructions in the README, which basically say to run INSTALL
Configure to start on boot
    In both cases, collectl will not be configured to start on boot
    but can easily be set to do so with the command:
        chkconfig collectl on
KNOWN PROBLEMS/RESTRICTIONS
- There is a known problem with older perl Time::HiRes modules, newer
versions of glibc and collectl intervals of 1 second or greater (see
http://collectl.sourceforge.net/HiResTime.html for more details) that
can result in 'setitimer' messages being logged at system startup when
collectl has been configured to run as a daemon. These messages appear
to be benign, but be sure to let someone know if that proves not to be the
case. If collectl determines your system has this mismatch, it will
report it as a warning in collectl's message file in /var/log/collectl
every time it starts as a daemon. If you choose, you can easily turn
off the checking by editing the entry at the bottom of /etc/collectl.conf
named TimeHiResCheck and setting it to 0.
- if run as a non-privileged user, network speeds will NOT be recorded
in file headers and the default speed specified by DefNetSpeed in the conf
file will be used to determine if any network stats are bogus
- if system time is changed by more than the log rolling frequency after
collectl starts, multiple log files will be created during the next polling
cycle(s)
- added 2 new fields to verbose cpu Summary Stats - Run Total and Blocked Total
- added VmSwap to process/memory display
CHANGES
3.6.0-3 Oct 17, 2011
- added dirty memory to lexpr
3.6.0-2
- support for numa
- split anon pages into separate field in verbose mode as well as plot format
- changed the memory header for -sm to SUMMARY rather than STATISTICS as the
latter is currently used to indicate detail data, something that didn't exist
for memory prior to numa support
- added --xopts i to be consistent with --dskopts and --netopts. did NOT add
such a switch for lustre
- expanded error checking with perfquery to catch 'Failed to open' errors
during initialization
- discovered and removed reading of /proc/stat during -sm, which was there to support
2.4 kernel fields that have since been moved
- changed collectl-debian start script to use /bin/sh instead of bash
- removed ".B collectl" at start of collectl man page for debian/lintian compliance
- made width of number of dentries in -si --verbose 7 instead of 6 digits wide
3.6.0-1
- do NOT call derived() when playing back rawp files or you'll get an uninit var
warning for $memUsedLast.
- need to include non-numeric type interrupt counts in interrupt totals
- fixed a few problems with environmental data and interpretation of --envopts
- was not allowed to use with -P and only 'M' should have been restricted
- was only honoring C/F when temp name started with Temp rather than anywhere in the string
- was not correctly overriding default ipmi devices with user-defined options
- fixed formatting/calculations for interactive memory subtotals generated when
RETURN is typed in conjunction with --memopts R in brief mode
- added new section to FAQ called 'gottchas' as a place to describe the perils of
round-off error and normalization
- when printing verbose data in import modules, need to clear $$lineref or the last
line that mainline collectl reports (if any) will be repeated. this was fixed
in hello.ph and atigpu.ph
- new switch: --dskopts z, which when specified filters out disk details lines of all 0s
- added switch examples to start scripts for clarification of use
- added support for 'vd' disks [thanks gavin]
- since kernel 2.6 is compatible with 3.0 and 2.4 is sooo old, 2.4 support officially dropped!
[thanks for the push, tony]
- dropped support for collectl data generated by versions of collectl older than 2.0
- need to set $cpusEnabled to 0 when playing back interrupts in plot format w/o -sC, since the
code that normally does that has already been executed and 'C' not yet added to $subsys. subtle...
- filled in some missing ; in nvidia.ph in PrintPlot routine
- fixed problem writing plot files with --import
- added 'i' to both dskopts and netopts which will cause i/o sizes to be displayed in
brief mode like --iosize except in this case independent of each other
- do not include virtual networks in network summary [thanks hank]
- in newLog() need to use gettimeofday for current time when hires::time is used, otherwise you'll
occasionally get a time 1 second earlier and new file names are wrong! [thanks hank]
- exclude vlan from network totals to avoid duplicate counts [thanks andrey]
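The --dskopts z filter listed above can be pictured with a short Python sketch. This is only an illustration, not collectl's Perl code: the row layout and the name filter_zero_rows are assumptions made for the example.

```python
def filter_zero_rows(rows):
    """Keep only rows that have at least one non-zero counter.

    Each row is a (device_name, counters) pair; this layout is hypothetical.
    """
    return [(name, counters) for name, counters in rows if any(counters)]


sample = [
    ("sda", [12, 0, 340, 5]),
    ("sdb", [0, 0, 0, 0]),  # idle disk: every counter is zero
    ("vda", [0, 0, 0, 1]),
]

kept = filter_zero_rows(sample)
print([name for name, _ in kept])
```

Only the row whose counters are all zero is dropped; a row with a single non-zero counter is still shown.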
3.5.1-1 May 23, 2011
- change expression used to find CPU count in /sys since -P isn't necessarily
built into all greps
- instead of only getting the platform name when -sE, always try to get it
- forgot to include 'T' as valid --envopts
- check for failure of 'ipmitool sdr dump' command
- need to ignore interval checks with --showcolhead and -sE
- fix bug in checkSubSys() because while it could find newer subsys it couldn't find
dropped ones
- needed to clear nethostflag outside conditional that looks at prefix changed, which
was incorrectly preventing consecutive files on the same day from being identified
- added new routine pushmsg() that allowed one to stack up messages generated BEFORE
'beginning execution' message and then play them afterward, making log easier to read
- changed several calls from logmsg() to pushmsg()
- added support for files that cross midnight and ability to play them back in full
see updated Playback.html
- remove duplicate message in sexpr
- have found an instance where the number of networks in the header didn't match the ones
listed (some were dropped!) and so added a check to take care of this
- renamed $active, $inactive and $dirty to $memAct, $memInact and $memDirty for better
consistency with other memory variable names. Didn't bother with older V2.4 mem variables
- new switch --memopts R: display memory info as changes/interval, similar to sar's -R switch
- logic to clear '$sameColsFlag' in verbose mode and --import was wrong
- --showcolheaders and -sE requires root
- added support for nvidia driver V270.41.19 which has different output format. highly
probable other versions will behave differently as well
3.5.0-3 Feb 12, 2011
- expanded interrupt details to include non-numeric interrupts
- new import module added for GPUs: nvidia.ph
- added getExec type 0 to support new import
- updated version of gexpr, with new switches to control using default ganglia
variable names
- bug fix: wasn't sending E and F type messages to syslog
- wasn't initializing enough 'last' vars for latest nfs V4
- only allow -sT with -P or -f
- added new switch --tworaw as a synonym for --group which makes more sense
- if an imported module returned -1 in its init routine, disable it. return
1 for success
- new --procopt: R causes real-time priorities to be displayed rather than RT,
at the cost of 2 extra columns in the display [thanks lee]
- added optional callback GetHeader to --import API, if not defined not called
- change error handling when playing back files with no selected subsystems to be
non-fatal, skip the file and continue processing
- added dl585-g7 to envrules.txt
- allow -s-all to remove ALL L subsystems when you only wanted --import data played
back. I actually forgot to add this to release notes until V3.5.1
3.5.0-2 Jan 09, 2011
- turned utime into a mask, so we can control the granularity of micro-logging
to include /proc time with/without process accesses
3.5.0-1 Jan 09, 2011
- renamed --showplotheaders to --showcolheader since it now applies to ALL headers
for single header line output (will only show cpu for -scd --verbose)
- fixed ALL verbose and detail output formats to include date/time headers
- newer kernels added additional files to /sys/devices/system/cpu/ which messed
up the way total CPUs were being calculated
- added 2 new variables to lexpr: cputotals.num and cputotals.total [thanks chris]
- removed unused switch --pidfile from collectl -x
- file processing push/pop code wasn't handling data change correctly
- added new flag to show host changed since THAT was what was needed in 'consecutive'
file identification processing
- found problem with playing back multiple files with --thru for different hosts!
needed to 'undef $newSeconds[$rawPFlag]' whenever hostname changed
- new netopts values
e - show errors in brief mode and explicit types everywhere else
E - only print lines that have non-zero network errors in them
- new diagnostic switch --utime, causes periodic micro-timestamps to be written into
raw file at different points in time for finer grained measurements of operation times
3.4.4-3 Dec 9, 2010
- if -s during playback, at least ONE requested subsys must be in recorded file.
if c recorded, C would cause error message because pattern match didn't have 'i'
- add requirement for STDOUT to be connected to a terminal as a condition to call resize
- change to collectl.conf - roll logs at exactly midnight, not 1 minute past
- new --envopts value of T to truncate values to integers
- ignore 'Fan Redundant' in env data for dl160g6
- if ipmi data field is blank, ignore it
- fixed filtering of ipmi data AND renamed 'c' option to 'p', for power
- include THRD in -P format for processes
- only turn off echo when in brief mode AND not playing back a file
- if data collected w/o HIRES and display requests msec, set default to '000' instead of 0
- discovered only --ssh in help so removed -S
3.4.4-2 Nov 10, 2010
- base36() needs to do an int() on values <10 so their fraction not
included in output string
- reduced printing of headers for -sf --verbose to one call to printText()
per line. otherwise one hostname prepended to each line of socket call.
- fixed a problem with --procfilt C: it was trying to match whole process
name rather than just the beginning of it [thanks gary]
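The --procfilt C fix above is the difference between a whole-name match and a prefix match. A minimal Python sketch of the two behaviors follows; the function names are hypothetical, and collectl's real Perl filter may use pattern matching rather than plain string comparison.

```python
def matches_whole_name(filter_text, process_name):
    """Old behavior as described: the filter must equal the entire name."""
    return process_name == filter_text


def matches_name_prefix(filter_text, process_name):
    """Fixed behavior as described: the filter only has to match the start."""
    return process_name.startswith(filter_text)


for name in ("collectl", "collectl-daemon", "mycollectl"):
    print(name, matches_whole_name("collectl", name), matches_name_prefix("collectl", name))
```

With the prefix rule, a filter of "collectl" also selects "collectl-daemon" but still rejects "mycollectl".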
3.4.4-1 Nov 09, 2010
- vmstat not handling date/time correctly, needed $dateTime[0]
- need to call export module's init routine in playback mode
- lustre 1.8.4 module location moved, check expanded [thanks Frederik]
- new top sort options, pid and cpu, which don't make a lot of sense
unless used with filters
- do NOT include hostname in RECORD printing routine with -A
- CPU verbose output should not right shift 1st header line with -oT
- removed printing of extra '$line' at end of NFS DETAIL header
- incorrectly setting recSubsys to [YZ] if user specifies --top even
if -s specified too! They should be merged [thanks mats]
- don't write to a socket if shutting down in which case $doneFlag set
- don't report socket errors if not in server mode
- added 'ProLiant DL160se G6' to envrules.std
- disableSubsys should ONLY remove subsystems from export option 's='
was also clearing KFlag rather than LFlag [thanks chris]
- new process sort option 'thread', sorts by thread count
- changed start/stop in initd scripts from "$network +openibd" to "$all"
so collectl will start after everything else
3.4.3-3 August 19, 2010
- added --netfilt
- very rare: if playing back CPU data but none collected, be sure to
set $cpusEnabled to number of CPUs or else you'll get warning that
one or more disabled
- pattern match wrong for 'emcpower' disks [thanks lewis]
- changed disk details to use 'cvt()' for reporting number of I/Os since DM numbers
can be more than 4 digits
- change --umask behavior. default is to do nothing unless explicitly set
AND user is 'root'
- 2 new process sort fields: pid and cpu
3.4.3-2 August 16, 2010
- only look at $cpuDisabledFlag when processing CPU data
- perfquery in OFED 1.5 can report warnings in its output stream which need to be ignored
- if you try to playback a file and specify -s with no existing subsystems you'll
get an error
3.4.3-1 August 02, 2010
- perfquery checks problems
- version finding code not working correctly for ofed 1.5
- disabling -sl by mistake when perfquery not found
- when errors detected during initialization not skipping subsequent checks
3.4.2-5 July 21, 2010
- changed INSTALL to only execute commands like chkconfig OR update-rc
when $DESTDIR is / [thanks mike]
3.4.2-4 July 09, 2010
- added --dskfilt
- added check for client-side OST uuid status 'DEACTIVATED', which seems to
have showed up somewhere in the 1.6 timeframe but not sure when, thanks Heiko
3.4.2-3 June 25, 2010
- new memory field 'SUnreclaim' ONLY available in plot format and lexpr,
just not enough room in terminal based output [thanks seb/fred]
- misc now considers uptime, mhz and mounts as 'lightweight' counters and will
sample every standard interval. Only logins, which is heavy-weight, will be
sampled based on "i=" or the default of 60 seconds. Further, all lightweight
samples will be returned every interval by lexpr whereas the heavy-weight ones
will only be returned when sampled. In order to keep sexpr/gexpr formats constant
(primarily because I don't know the effect of not doing so), they will report
all counters every interval.
- support for CPUs dynamically changing stats and going off/on-line
- NOTE -- can't detect this during interrupt processing unless also
monitoring CPU data, which people typically do anyways
3.4.2-2 June 15, 2010
- not correctly handling discovery of new disks during playback
- new feature: select process by UID range [thanks mark]
- fixed bug in --procfilt u/U processing while testing
- added systot and usertot to lexpr to report totals for all system and user
counters
- changed error message processing when trying to playback a file with process
when there isn't any or slabs data, etc. Rather than only show the message
when -m, which could result in only a 'no files processed' message they will
be unconditionally displayed as they should
3.4.2-1 May 21, 2010
- change default umask to 133 so that colplot can read files since webserver
doesn't have privs
- now that raw files are always compressed, the message about disabling it
with -oz when no compression no longer makes sense so the message has been
clarified to use --quiet with raw files and -oz with plot files
- added README-WINDOWS to src tarball
- cleaned up code that still expected [com] in $lustOpts instead of $lustreSvcs
- more cleanup and bug fixes to INSTALL for debian support. thanks bernd
- change to /bin/sh
- do not use ANY explicit paths
- minor changes to man pages, also for debian restrictions
- wasn't reading NfsFilter correctly from header on playback
- save perfquery version and use it to drive the skipping of 'field 13' rather than
OFED versions which isn't always available
- do not issue 'stty' if !PC, running on terminal and !background. missed a couple...
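The effect of the umask change above can be checked with a little arithmetic. The Python sketch below assumes files are requested with mode 0666, which is the usual case for ordinary file creation but is an assumption here, and compares the new default of 133 with the 0137 default introduced with --umask in 3.4.1-3.

```python
import stat


def mode_after_umask(umask, requested=0o666):
    """Return the permission bits left after the umask clears its bits."""
    return requested & ~umask


for mask in (0o133, 0o137):
    mode = mode_after_umask(mask)
    # stat.filemode renders the bits the way 'ls -l' would show them
    print(f"umask {mask:04o} -> {mode:04o} {stat.filemode(stat.S_IFREG | mode)}")
```

A umask of 133 leaves 0644, so other users such as a web server account can read the files; 0137 leaves 0640, which they cannot.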
3.4.1-5 Mar 30, 2010
- new env options F/T converts temps to C or F
3.4.1-4 Mar 29, 2010
- new switch --whatsnew prints a summary of changes, a mini-release notes
3.4.1-3 Mar 23, 2010
- added Fusion-IO card to list of valid disks: fio
- gexpr, lexpr and misc weren't honoring internal interval counter.
- if a secondary/tertiary interval specified gexpr/lexpr didn't process
it correctly
- new switch: --envfilt allows you to specify filters
- if you specify a " in DaemonCommands it gets passed along in the variable itself
(not a problem for ') so we have to remove them
- added new section 'Filters' to header. Added EnvFilt and moved NfsFilt to it
- added new switch --envremap, which allows for renaming one or more output field names
- added new feature switch to lexpr. if x=file is specified, that file will be loaded
via require and a corresponding function name called after every print cycle, allowing
one to do modified, custom output
- new switch, --umask to control output file protections, see man umask. default is 0137
- new environmental option - if you include a device number with --envopts use THAT as a
device number with -d when running ipmitool. for some systems the default device is
the slower one and this will have an impact on how fast ipmitool will run, possibly
slowing down collectl
- added 'use 5.008000', which should have probably been there years ago
3.4.1-2 Mar 16, 2010
- do not allow -oA in verbose mode
- consolidated all code that disables -s subsystems on a conflict into
disableSubsys, which ALSO disables them in the s= clause of --export
- removed code to disable s= in all the ph export modules since now redundant
- support for DESTDIR env variable in INSTALL/UNINSTALL [thanks Bernd]
- Voltaire changes output of ofed_info so we have to process IB version
slightly differently
- change lustre message about needing -L to --lustsvc
- changes to lexpr to include processes in run queue and to change prefix
for proc creates/runs to 'proc'
- changes to misc.ph to ALWAYS report latest values in --export as well if 'a'
parameter, noting the default is to only report them when sampled. collection
still defaults to 1 minute, overridable via 'i='.
- since loading formatit.ph moved in a recent release, any calls to error()
before it's loaded fail since it needs a routine internal to formatit. so now
only call printText() from error() if formatit loaded.
3.4.1-1 Feb 22, 2010
- when printing plot data to files, wasn't putting headers on subsequent days' files
3.4.1-0 Jan 10, 2010
- make sure all major release settings in RELEASE-collectl have dates
- remove blank line in all collectl start scripts right before 'END INIT INFO'
since debian doesn't like it and we should be consistent
3.4.0-4 Jan 04, 2010
- updated envrules to include additional parsing rules for dl185 [thanks evan]
- changed envrules header for dl585 G1 to G5
- if running an ofed >= 1.5, ignore 'CounterSelect2' field, which is right in the middle
- send errors in getExec() to /dev/null because perfquery for > ofed 1.4 is braindead
- was incorrectly using 256 to print IB debugging info instead of 2
3.4.0-3 Dec 14, 2009
- was not clearing right variable for CPU Detail Totals in sexpr.ph
- fixed typo on QLogic HCA name from qlib to qib
3.4.0-2 Dec 13, 2009
- fixed typo of HugePages from HughPages [thanks Frederic]
- fixed typo of 'openib' in start script LSB headers to 'openibd'
- clarified help and man page for --all to indicate ONLY summary data will
be reported, meaning NO process or detail data either
3.4.0-1
- restructure installation directories to be more standard
- pid was not properly set for suse flush command
3.3.7-1 Nov 26, 2009
- added support for psv [polyserve] disks
- added support for QLogic IB HCA
- changes to INSTALL/UNINSTALL to handle gentoo and to restructure 'generic'
distro processing for more flexibility in the future
- 3 'standard' tools turned out not to be standard on gentoo and so:
- limit checking for ethtool to writing to log file OR --showhead
- if can't find lspci during -sx processing (and -sx IS a daemon default),
disable -sx rather than throw a hard error.
- only use dmidecode if -sE and if not found, set product name to 'Unknown'
- creating /var/log/collectl in INSTALL so when installed this way the
daemon writes logs into that directory instead of /var/log. this now
matches what an RPM install does
- if required include files can't be found in same directory as collectl, look
in ReqDir which is initially set to /usr/share/collectl. This can be
changed in collectl.conf
- when exiting due to a fatal error, be sure to exit(1) and not just exit.
- some process I/O counters found to be missing on CentOS 4.8 and so had to
initialize to 0 in case not found
- wasn't catching 'ioall' as invalid --top option
3.3.6-2 Sep 16, 2009
- if printing interrupts in brief mode, Cpu headers have to be changed as the number
of cpus increases to 2 or 3 digits. [thanks Aron]
3.3.6-1 Aug 19, 2009
- changed error message about missing ethtool or lspci to just ethtool since
missing lspci was already caught and reported
- change location of collectl to /usr/bin in collectl-debian
- make -P honor --hr which it currently does not [thanks giles]
3.3.5-4 Jul 20, 2009
- performance optimizations in dataAnalyze()
- check process/slabs first whenever type is proc/slab. then in a separate clause
look at subsys, thereby preventing parsing of type in other checks
- always include test of subsys and do it first. found to be completely missing
in lustre tests
3.3.5-3 Jul 17, 2009
- expanded meaning of -G to include slabs in 'rawp' files and to add 'g' to the Flags
in the header, which also uncovered a number of bugs in the way batches of files for
different hosts/dates were selected/handled even before slabs were added
- drop support for -sy in brief mode since it really doesn't make much sense and if you
do specify -sy it now forces verbose mode. see Slab documentation for more on playing
back files generated with -G
- if can't find an ofed utility AND rpm isn't on system, don't use it [thanks seb]
- fixed some problems with -oA processing
- removed a couple of error checks for switches that don't apply to a particular option
since they are silently ignored already, making it easier to recall a command and add
switches rather than having to remove those that don't apply
- flush STDIN at startup in case someone typed extra CRs
- added col2tlviz to kit
- changes to --export processing broke --vmstat so moved call to setFlags() from right
before playback code (which sets them itself) to right after call to $expName init routine
- changed start scripts so that you can specify "start/restart {[extension] switches]"
making easier to use/document. the old syntax which put the switches 1st meant you had to
use "" if you didn't want to change them AND it didn't work with redhat's 'service' command
3.3.5-2 June 30, 2009
- added client.pl to examples/ and moved readS to /examples
- added new switch --procstate, which allows you to limit process displays to
only show those processes in one or more explicit states
- incorrectly looking for 'LustreVersion' in header instead of 'CfsVersion'
- when dropped SubOpts from header it broke pattern matching for subsys in
header during playback
- only calculate disk detail stats using CPU time when hires not available
- when reporting a lustre server that is both an MDS and OST in brief mode,
the 2nd line column headers are reversed for the types of server
- removed obsolete switches (and warnings) -b, -e, -oP, -Y, -Z, -O, --subopts and -sLL
- changed buddyinfo headers in verbose, plot and detail files being sure to include
name/zone after : in details [thanks bayard]
- use mergeSubsys() everywhere $userSubsys is used to reset value of $subsys
- changed some instances of local variable $file to begin sorting out of local variables
with the same name as the global one
- if newlog starts and NOT an interval 2 interval, we don't record correct slab data so
only clear $newRawSlabFlag (also renamed for clarification) during interval 2
3.3.5-1 June 19, 2009
- print load averages to 2 decimal places in plot format to match interactive format,
which also required adding to lexpr and allowing it to deal with fractions
[thanks stevef]
- when disk order changes, error message was not reporting correct old maj/min
numbers [thanks philippe]
- code for including >ignore< stanza in envrules was causing uninitialized variable errors
- do not make sure ipmi available when running with --envtest
- do not include ':' in lexpr network name string
- re-enable sending startup and E/F messages to syslog
3.3.4-5 June 14, 2009
- old redhat distros don't recognize the -p switch on the start script so check
first before using it
3.3.4-4
- make sure all LSB headers the same and only contain "$network +openib" for
services so that collectl can run diskless and not require ntp
3.3.4-3
- fixed a few things with gexpr.ph
- incorrectly used ' instead of " for detail counters variable names [thanks evan]
- using wrong variable name for interrupt totals by CPU
- changed way lustre OST names are parsed so that they handle embedded _s correctly
- include LSB comments in start script headers
- make SubsysCore in collectl.conf match real subsys core, even though just a comment
3.3.4-2
- changed all hardcoded occurrences of /etc/collectl.conf to $configFile even in
error messages, in case someone ran with -C [thanks philippe, for this and others]
- added DiskMaxValue to collectl.conf, with default of -1. If >0 and a disk
read/write rate is greater, reset all stats for this disk to 0 because something
reset them and they're probably all bogus
- moved code that initialized disk names to separate subroutine and added logic to
save disk major/minor numbers so it can also be called later if disks are reordered
- if DiskFilter specified in collectl.conf, use that string for disk filtering. if
not specified continue to use separate if statements for tests in getProc() since
they're slightly more efficient
- if diskremap.ph exists, call internal remapDisk() routine when disk array is being
initialized in initDisk()
- newLog() was clearing $printHeaders instead of $headersPrinted
- if playing back multiple files for same day with -sD and disk config changes, generate
an error if not -ou because mixing the data in the same detail file will make it impossible
to interpret
- remove unused variable '$intFlag'
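The DiskMaxValue rule described above (a value of -1 disables the check; a positive value is a ceiling on the read/write rate) can be sketched in Python as follows. The field names and the function sanitize_disk_stats are hypothetical; collectl's own Perl implementation is not shown in these notes.

```python
def sanitize_disk_stats(stats, disk_max_value=-1):
    """Zero every stat for a disk when its read or write rate is implausible.

    stats is a dict of counters for one disk; 'read_kb' and 'write_kb'
    are assumed names for the read and write rates.
    """
    if disk_max_value > 0 and (
        stats["read_kb"] > disk_max_value or stats["write_kb"] > disk_max_value
    ):
        # Something reset the counters, so treat the whole sample as bogus.
        return {key: 0 for key in stats}
    return stats


sample = {"read_kb": 9_000_000_000, "write_kb": 120, "reads": 42, "writes": 7}
print(sanitize_disk_stats(sample))             # default -1: check disabled
print(sanitize_disk_stats(sample, 5_000_000))  # ceiling exceeded: all zeros
```

Zeroing the whole row rather than just the offending field follows the reasoning in the note: once one rate is impossible, the other values in that sample are probably wrong too.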
3.3.4-1
- added "ProLiant BL490c G6" to envrules as a 'standard' system since there
is nothing special to do to parse the data
- changed lustreMDS data for sexpr, lexpr and gexpr to be consistent with what
is being reported. this wasn't done when lustre 1.6 support was added and
should have been
- fixed a typo in a lustre ost variable name in gexpr
- don't just report ETH traffic in -sn brief mode, use same numbers as --verbose
- added [ignore] stanza to envrules to allow ignoring anything that matches
- only call loadEnvRule if -sE or debugging with --envtest
- rewrote formatting code for g/G option because it wasn't working correctly for
all situations
3.3.3-1 April 28, 2009
- forgot to include misc.ph in INSTALL
3.3.2.1 April 28, 2009
- screwed up $rootFlag and set to 0 after it was initialized correctly
- fixed a couple of problems in INSTALL: added 'q' to gzip, added gexpr/envrules.std
- added DL385G5 to envrules.std
3.3.1-10 April 27, 2009
- If root, add product name from 'dmidecode' to header
- If !root, don't allow -sE because ipmitool will fail
- When running -sE and no --envrules, look in 'envrules.std' for matching product rules
- remove '.' from ipmi device names before applying parsing rules (screws up =~//)
- change ipmi value of 'no reading' to -1
3.3.1-9 April 24, 2009
- When splitting off the daemon options, needed to include ',2' in the split
or any *expr options get screwed up since they can have their own =
- removed 'C' from -s in daemon command string since no longer needed
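The ',2' fix above limits the split to two fields so that any later '=' stays inside the value. The Python sketch below shows the same idea with maxsplit=1, which plays the role of a limit of 2 in Perl's split. The configuration line is made up for illustration and is not taken from a real collectl.conf.

```python
# Hypothetical collectl.conf-style line; the option values are made up.
line = "DaemonCommands = -f /var/log/collectl --export lexpr,f=/tmp/lexpr"

# Splitting on every '=' also cuts the value at the embedded 'f='.
naive_parts = [part.strip() for part in line.split("=")]

# maxsplit=1 yields at most two fields: the name and the untouched value.
name, value = [part.strip() for part in line.split("=", 1)]

print(naive_parts)
print(name)
print(value)
```

The naive split produces three pieces and loses the export option's own '=', while the limited split keeps the full option string intact.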
3.3.1-8 April 22, 2009
- renamed cmuextras to misc and renamed all variables accordingly
- added inactive memory to lexpr
- set default interval for 'misc.ph' to 60 seconds
- a couple sets of data names in gexpr (for cpu and disk detail) were framed in
single quotes and needed to use doubles
- wrong variable name for $intrptTot
- removed check for CPU data in presence of -sD since always there
- -sL --lustopts O not properly parsing read/write bytes for CFS/SUN release
- accidentally left some debugging code in that changed 'sd' disks to 'xvd' disks
- added support for disk types of 'emcpower'
- when running with -P and --rawtoo, collectl only wrote to the raw file but still created an
empty prc file. Now it doesn't create that empty file. Also added reason to FAQ
3.3.1-7
- removed memhuge from cmuextras and added to core memory stats as well as
gexpr, lexpr and sexpr
- cleaned up a couple bugs in gexpr for i= processing
- silently remove 'x' from 's=' in gexpr, lexpr and sexpr if not part of -s since it
could have been disabled. this allows one to specify -sx as well as s=x without
fear of getting a hard error from the *expr
3.3.1-6
- updated collectl-debian
- added avg/min/max options to gexpr and lexpr
- added import 'cmuextras.ph' to kit
- removed line that set $message to 'unexpected perfquey error' which was
clearly the wrong thing to be doing
- in 3.2.1-6 added 'unexpected message' for perfquery failures that was wrong so
removed it
3.3.1-5 Apr 09, 2009
- need to include command switches when changing process name
- rewrite of all the start scripts (collectl, -generic, -debian and -suse) to support
multiple daemons. In the process fixed a bug where debian wouldn't restart correctly.
Added --retry 2 to start-stop-daemon and that seemed to fix it.
- added type 4 to getExec()
3.3.1-4 Apr 06, 2009
- changed interface to sexpr and lexpr to more closely reflect gexpr dir/file naming,
updated documentation and also changed lexpr to include only sending changes and
handling TTL, mainly by stealing a lot of code from gexpr.
- got rid of --expdir since that now handled with 'f=' option to all 3
- Had to move calling of ${export}Init to after initRecord()
- Reporting incorrect variables for -si with all 'expr' routines. Had changed
inode data a long time ago but apparently nobody uses 'expr' or -si or both
- Needed to add -sC with -sj in sexpr
- Added SwapFree to *expr even though it can be derived
- new switch: --pname name, tells collectl to run as a different process name
and use a different pid file with that name, which in conjunction with hacking
up another init.d/collectl file will allow you to run a second instance of a
daemon with a different name
- reset $interval2SecsReal to 1 at same time as $interval2Secs when $i2Secs is 0
3.3.1-1
- when writing to plot files not including new headers on subsequent days
- typo on major fault display string in lexpr.ph
- if only logging plot detail data, was getting errors trying to print to
unopened tab file
- API for --import allows custom data collection, includes example hello.ph
- had to allow for playing of file with blank Subsys field
3.2.1-6 March 03, 2009
- added --nfsopts z to filter lines of 0 in -sF mode
- if collectl.conf is not writeable (eg in R/O filesystem), do not try to add
IB paths dynamically
- wrong logic for handling --nfsopts z
- minor formatting changes to column positions in brief format and slab detail
- wasn't including CPU type, speed, cores and siblings when converting to plot files
- dropped inode info from header which was dropped from collectl awhile back
- don't report open failures on nfs data since not always there
- add support for XEN xvd disk types [thanks brian]
3.2.1-5
- incremented $nfsCommit instead of $nfsCommitTot
- wasn't handling --nfsfilt correctly on playback of 3.2.1-4 files
- don't set $sockFlag until after socket opened otherwise we can't report socket errors
on terminal
- if read & write fields for an nfs version are both zero assume not active and don't
report in detail format
- make nfs one of the default subsystems to collect data for
- UNINSTALL wasn't removing link to start script on Debian
- file selection logic for playback wasn't working correctly for multiple hosts with
multiple files on same date
- fixed preprocessPlayback() to deal with +/- when -s specified
- fixed very subtle bug involving playing back multiple files for same day, the first having
-sy and the second having -sY and -s overridden with -s+. caused print on unopened filehandle
3.2.1-4
- always write client/server nfs data, using nfsc- and nfss- as prefix
- added --nfsfilt to control details output
- other misc stuff for support of ALL nfs data in raw file at once
- dropped SubOpts and NfsOpts from header
- added NfsFilt to header
3.2.1-3
- do not allow -O any more, must use --nfsopts and --lustopts
- support for nfs V4. will now collect ALL data in /proc but still only report
on 1 type either interactively or during playback, based on --nfsopts
- only turn echo back on in error() if not a PC
- only look for passwd file when recording/playing back process data
- when playing back a file with a prefix in front of the host name and specifying
multiple directories the destination was not being correctly resolved.
3.2.1-2
- only set $nfsOpts from header during playback if -s wasn't specified OR it
was and contained an 'f'
- do not exit on broken pipe if "-A server"
- --vmstat wasn't respecting --hr 0 or 1
3.2.1-1
- fixed a couple of bugs in INSTALL
- init.d scripts and release notes copied to wrong directory
- added Passwd to collectl.conf which if defined will point to default passwd file
- changed the way /proc/vmstat read to get more data
- added swap in/out and page faults to verbose memory display
- added page faults to tab file
- when running interactively over multiple days with -P, headers were not being
included in subsequent files
- changed some verbose summary headers to mixed case
3.1.3-1 January 23, 2009
- output for '--procopts i' off by one column near accutim
- if RETURN entered in brief mode before 1st interval reported, ignore
it because we'll get a divide by 0 error
- add +openibd to sles startup script so collectl will start after IB
- fixed problem processing data from different time zones with new
--from/--thru processing
- fatal bug in playing back process data was missed before release
- another fatal bug in --procanalyze. if looking at a process which was only
there for a single interval, when calculating the % of cpu which takes into
account the process lifetime (in this case 0), you get a divide by 0 error!
the fix is to set the duration to 1.
- not all files were opened if -s specified with + and --procanal/--slabinfo
so added restriction against doing so
- when playing back interrupt data in plot format you have to include -sC and this
was too confusing so just silently (unless -m) adding it in and documenting in FAQ.
- if --slabanal or --procanal but no -sY/Z, don't write to slb or prc file
- allow --passwd for ALL situations since /etc/passwd not valid for NIS. also add
to help output
- selection of task by UID wasn't working
- if uid can't be translated to a username, report the UID instead of ???
- fixed problem with divide by 0 errors if proc/slab analysis on multiple host/days
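Note on the --procanalyze divide by 0 fix above: a minimal sketch in Python (collectl itself is Perl) of clamping a zero-length process lifetime to 1 before computing the cpu percentage. The function and argument names are hypothetical.

```python
def cpu_percent(cpu_secs, first_seen, last_seen):
    """Percent of cpu used over the process lifetime, in seconds."""
    duration = last_seen - first_seen
    if duration == 0:
        # process only seen in a single interval, so treat lifetime as 1
        duration = 1
    return 100.0 * cpu_secs / duration


print(cpu_percent(0.5, 100, 100))
print(cpu_percent(3, 100, 130))
```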
3.1.2-4 January 20, 2009
- bug fixes to handling of interval times
- -sm --verbose needs 1 extra line with --top
- when exiting from --top, move cursor to bottom of display
- if playing back files for same host, don't reset header counters between them
- ignore parent process when looking for duplicate instances of -sx [thanks kaya]
3.1.2-3
- support for allowing multiple clients to connect when in server mode
- new documentation page: Generating Plottable Files
- dropped support for data files generated by pre V1.3 version
- when rolling logs, write a timestamp onto end of last file
- in playback mode, if last timestamp of previous file matches first timestamp of
new file, treat as contiguous data which results in no 'holes' in output stream
3.1.2-2
- check for nfsopts/playback in checkSubsysOpts was incorrectly looking at
$plotFlag when it should have been looking at $playback
- added Power Meter (ipmitool sdr type current) to env data when available
- added all environmental data to lexpr and sexpr
- added swap total/used to lexpr and sexpr
- building incorrect symlinks to collectl-suse and collectl-debian in INSTALL
- also wrong in collectl.spec
- for IB monitoring, when couldn't find ofed_info was still trying to run it
- need to initialize $interval2SecsReal to i2 first time when 0
- do NOT report process/slab data for the first interval with data in it
3.1.2-1
- more cleanup to INSTALL to give world read access to ARTISTIC, COPYING and GPL
and set a few more protections on other files
- change to --from/--thru processing since error messages implied you could use
dates too, so now you can. see man page or web documentation on playback for
details
3.1.1-5 November 5, 2008
- two new fields added to slab data to show changes in total allocation between samples
- when mixing --procanalyze with other subsystems, the non-process data wasn't getting
written
- new switch: --slabanalyze
- in header for process data change 'faults are ...' to 'counters are ...' since
we're now including I/O counters as well
- wasn't printing process i/o headers with --procanalyze output. thanks Sven
- when using -on, cpu % needs to divide by the real interval and not 1. thanks to Sven again!
- added percent CPU utilization for process I/O format as well as prc and prcs files
- also --procanalyze now honors -om for msec level times
- new process option: c. will include cpu times of any child processes (not threads)
that have since died
3.1.1-4 October 29, 2008
- fixed a rounding problem with numbers between 1000M and 1024M
that were getting printed as 0G (thanks Marko)
- found error in conversions to K, M, etc where in some cases dividing by
1000 instead of 1024! Specifically: i/o sizes for disk, networks, lustre and infiniband.
Also lustre BRW states and some of the KBs fields processes for I/O and memory usage
in the default format - the detailed memory format had it right.
brief formats for: disk, network, quadrics, IB, lustre
But also note these only come into play when values being reported exceed the default
field widths and so usually aren't tripped.
- changed -oF to --procopts f
- limit username in process display to 8 chars
- make sure terminal echo turned on when falling through error()
- if processes AccumTime>999 minutes in Process Summary, which is pretty rare, drop
fractional seconds resulting in a different format
- included sort-type in top process display
- added ioall to --showtopopts menu
- allow a numeric width to be included with --procopts w
- removed restriction for considering -sl ambiguous and it will be assumed to mean lustre
subsystem rather than a typo for -slab
- misspelled RSys/WSys as RSYS/WSYS in procanalyze code
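Note on the K/M/G conversion fixes above: a rough Python sketch (not the real Perl routine) of scaling a value by 1024 per step once it exceeds the field width, and of showing one decimal instead of '0G' when the scaled value drops below 1. The name cvt mirrors the routine mentioned elsewhere in these notes, but the signature, the width default and the one-decimal rule are assumptions made for illustration.

```python
UNITS = ["", "K", "M", "G", "T"]


def cvt(value, width=4):
    """Scale value by 1024 until its integer part fits in `width` digits."""
    v = float(value)
    idx = 0
    while len(str(int(v))) > width and idx < len(UNITS) - 1:
        v /= 1024.0          # 1024, not 1000
        idx += 1
    if idx and v < 1:
        # avoid printing something like 0G for values just under a unit
        return f"{v:.1f}{UNITS[idx]}"
    return f"{int(v)}{UNITS[idx]}"


print(cvt(999))
print(cvt(123456))
print(cvt(1010 * 1024 ** 2, width=3))
```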
3.1.1 October 8, 2008
- missing leading space before 'sd' when determining disk names during initialization
can result in wrong devices being listed with -sD and the header if they contain
an embedded 'sd'
- fixed problem with --slabopts s and/or S with -P or in playback
- fixed --top checking to verify ALL different I/O related types
- generate an error message for mixing lustre client option O with M or R
- allow printing details in --top mode, BUT user needs to control top part of
display with --hr
- playback of environmental data in plot format was printing values every interval rather
than just during interval3
- some ipmi values may be '' and so report them as 0 to make sure gnuplot can handle it
- make a few changes to INSTALL for debian-based installations
- added 'AccuTime' to top I/O display format
- new feature: top slabs! same switch as top processes, --top, but include names of
slab column to sort by. see --showtopopts
- filtering for old slabs now matches beginning of slab name just like slub
- lustre OST/B data wasn't shifting headers when -oT included
3.1.0 September 3, 2008
- fixed 2 problems in INSTALL (thanks sebastien)
- forgot to copy collectl.conf to BINDIR/etc
- forgot to set protections on collectl and init.d script
- cleaned up interval header printing
- new feature: environmental monitoring via ipmitool
- added environments to daemon defaults in collectl.conf
- changed default interval3 monitoring interval to 2 minutes
- 1st line of brief headers were 1 column too narrow for -t,m,h&f
- when reading lustre MDS stats, don't tell getProc to skip over anything and
save everything that starts with 'mds_'
- extended lustre MDS data reporting
- added I/O size to lustre Client/OST verbose/detail output and make it honor
--iosize in brief mode
- fixed a 1 column formatting shift when using -oT with some lustre client/ost output
- fixed problem in which --hr 1 wasn't causing a new header every interval for
detail data of same type
- increased size of KBBytes for lustre/interconnect data to 7 digits
- very minor, but if user specified -s+l and --lustsvc and lustre
disabled, only looking at subsys in checkSubsysOpts was generating an error
so now it looks at '$userSubsys' too.
- make $filename local in getSys()
- another pair of switches: -X, --helpall lists ALL help making it possible to
grep for something if you can't remember where it is
- added --grep which allows printing all entries in raw file as timestamped
lines. may mix with other playback switches
- if filtering processes and no data initially collected, interval2Secs will be
0 first time and flt/sec will generate illegal division error so set i2 to 1
- was calling procAnalyze even if no data processed during an interval and as a result
the last pid seen was being credited for that interval when it shouldn't have been
- added parent pid to top i/o display
- when looking for collectl processes with -sx, be sure to ignore those instances
where the command is 'ssh'
- discovered '$lastInt2Secs' not getting reset when a new set of prefixes
were being played back. This meant the denominator for first line of
process/slab rate data would be wrong, but most people probably wouldn't
have even seen this
- a couple of fixes to correct --procanalyze reporting errors
- removed extra space from --procopts i header.
- significantly expanded --top sort types
- "waiting for..." message will now honor --quiet
- if more than one file played back with interrupt data AND latter one had more
CPUs $intrptLast{}->[$cpu] wasn't getting initialized
- allow commas in addition to spaces to separate files in 'playback' list
- discovered a user app can modify contents of /proc/pid/cmdline and so cannot
assume it will always end in null (see test of $cmd1)
- change test of !$slubinfoFlag to $slabinfoFlag since both may be missing
3.0.0-4 July 1, 2008
- major switch cleanup
- completed cut-over from -O to xxxopts started in V2.6.4 by creating
--nfsopts/--lustopts. -O kept around for backwards compatibility
for nfs and lustre
- a couple of switch changes to reduce complexity of -o and to clarify new
meaning of handling time offsets and from/thru times for playback
- replaced -ot with --home
- replaced -oP with --passwd
- replaced -t/--timezone with --offsettime which now takes a time in seconds
- replaced -b/-e with --from/--thru
- new switch: --procanalyze will produce space separated process summary file
(extension = prcs) that summarizes process data for each unique process
- big enhancement for --top. now when -s specified prints a scrolling window
showing histories (-oT recommended but not required) if in brief OR verbose
and all lines the same. note - this mode does NOT support detail subsystem data
- also now identifying the parent who created the thread correctly
- output format cleanup to make things more concise. no changes to plot format
- changed order of columns for brief lustre client to be consistent with all
other brief fields
- changed order of I/O related verbose subsystems (disk, network, infiniband
and lustre) to be more consistent with brief mode. in other words, all input
stats precede output stats and KBs precede I/Os. NOTE - the order of the
fields for plot data have not been touched.
- reformatted help to make more readable (I hope) and fit in 80 columns too!
- nfs got inadvertently dropped as a valid subsystem in V2.6.3 and it's now back
- wrong logic for verifying --procopts Z only allowed in --top mode
- -oA was calling printMini1Counters() instead of printBriefCounters()
- renamed printVerbose() to printTerm() because it makes more sense
- when reading diskstats, make sure leading space before 'sd' as there is
with $diskFilter in formatit.ph
- fixed printing process data that got broken in plot format
- made brief fields 1 column wider for lustre/infiniband in brief mode
- lustre client names didn't make it into header when -sLL was specified using
old option format
- discovered the cvt() routine wasn't being used everywhere in printBrief()
- found/fixed bug that's been there almost forever! if you play back a file
recorded with -sZc but force collectl to only process -sZ it got fatal errors.
Just goes to show how many combinations of conditions there really are!
- fixed problem (I hope) where extra 'RECORD' separators were getting printed
for empty intervals
- fixed code that checks for another instance using IB since it wasn't dealing
with -s using both + and - in it such as a daemon that has -s+YZ-x
- couldn't play back process data on a PC without --passwd since /etc/passwd
not there
- wasn't dividing lustre client OST details by 1024
- discovered/fixed file header entry for switch options which only showed switches
and not options. since a read-only field it shouldn't have hurt anything.
2.6.4 June 11, 2008
- fixed references to gzerror() to be in string context and so error text correct
- miscellaneous documentation changes, mainly to support code changes
- do not report /proc/pid open failures since they happen often enough to be
a nuisance
- changed order of options for --top to be type,num and if no num use the screen
size
- dropped --procio and --procmem replacing them with --procopts i and m
- new options for --procopts: r and z
- broke --vmstat when changed $cls to $clscr
- removed inline code for vmstats since now done via vmstat.ph
- collectl --top generating uninitialized variable message when blank line in
/etc/passwd was fixed
- wasn't honoring -ot for a single subsystem in --verbose mode (sheesh) and now it is
- remove special code that removes collectl from --top display unless explicitly
requested. this will help make users more aware of collectl overhead
- found at least one system that returned different format from 'resize' and so
changed pattern match to make it more general
2.6.3 May 12, 2008
- added a README, INSTALL and UNINSTALL to the tarball to aid in manual
installation and removal
- changed --procopts to --procfilt and --slabopt to --slabfilt because I want
to differentiate between options and filters.
- enhanced socket error handling
- new I/O output data for disks, networks and interconnect
- i/o sizes will always be included in verbose output
- new switch --iosize will add to brief displays
- NOTE this data is not written to tab file since it can be derived
- changes to -si (inode data)
- removed info from header and will get it from proc instead
- changed what is reported as some fields no longer valid and added 'number' of
dentry noting that the values for 'unused', which increase as files are
created makes no sense to me. also including file handles and inode counts
in brief format.
- as a result of adding -si to brief format, --all results in brief output for
everything and so you'll need to include --verbose to see verbose form
- several new options for --procopts (thanks for the push Matt)
s: will add read/write system calls to process stats
t: will force collectl to look/display threads for ALL processes
note that this can be a lot of overhead if there are a lot of threads on
your system. All your threads can also be seen via 'ps -eLf'
w: will make display wider by including arguments to process names
- you can now request what to sort on for --top (cpu, io or page faults)
- you can now include --procfilt with --top and it will only consider
those processes that match for display
- you can now use --top in playback mode
2.6.2 Apr 29, 2008
- forgot to rename call to resetMini1Counters() in collectl.pl
- do NOT clear $miniDateTime when --export
- added swapin/sec and swapout/sec to [MEM] data in tab file
2.6.1 Apr 24, 2008
- for perl version checks, use 2 digit minor/patch levels (thanks devzero)
- report zlib and HiRes versions in collectl -v output
- grab ALL of /proc/meminfo for non-2.4 kernels even though we're not
processing all of it
- added the number of active lustre file systems seen by the client for lexpr/sexpr
- was incorrectly restricting -A to -P or --export and that was wrong
- allow --export in playback mode, making it possible to use --vmstat as well
- extended --top to allow -s to be included along with proc stats. not that pretty
but very useful
- renamed printTerm() to printVerbose(), briefFormat() to printBrief() and other
associated printMin1 routines
- when changed syswrite() in writeData in last version, lost trailing \n and
so put it back
- ibcheck was redefining global $port so reopening socket in 'server' mode failed!
- when --export added forgot to handle writeData() conditional correctly for
process and slab data
2.6.0 Apr 03, 2008
- lustre
- typo for lustre readahead 'not consecutive' variable!
- added 2 new readahead variables for 'failed grab...' and 'wrong page...'
- extended meaning of --headerrepeat and added a synonym of --hr for it
a value of -1 means never display a header and 0 means only display it once,
eliminating the need for -oH and -oh which are still supported but not shown
in help. They will be eliminated in a future release.
- bug in regex prevented gzclose on zipped tab file
- cleaned up code (finally) that deals with displaying headers such as how often
and when to skip entirely. this included dropping the -oh option which
predates --verbose mode
- if we can find 'resize', use it to get number of lines in display and use for
default. This can still be overridden in collectl.conf
- slight change to -ot behavior. only erase screen one time and then just
overwrite what's there as it's softer on the eyes
- fixed format error in 's-expr rate' for disk summary stats
- modification to the way --custom is used to make it work with -f, -P, sockets
and --rawtoo just like --sexpr. In fact, sexpr and vmstat code has been
removed from formatit.ph and are now standalone include files named sexpr.ph
and vmstat.ph respectively. See documentation for more details.
- renamed --custom and --custdir to --export and --expdir to better reflect
that the main purpose of these is to export data to a file or over a socket
- had to move subsys/interval initialization code around to happen before
calling --export
- changed init.d file for SuSE as it couldn't detect collectl running
when pids was 5 digits
- added code to handle write of partial data over socket
- based on popular demand, --all has been provided to show all summary stats.
be sure to try it with -ot
- added CPU number to process detail report. since this data has actually been
collected all along, you can play back older raw file and now get them
- changed default socket port to 2655
2.5.1 Mar 21, 2008
- added OFED 1.3 location for perfquery to collectl.conf
- added new constant for ofed_info to collectl.conf
- if can't find perfquery and/or ofed_info, ask rpm and if there update collectl.conf
- redefined debug flag of 8 for lustre checks and leave 2 for interconnect only
- adding more debugging details for infiniband initialization
- changed daemon startup switches to include -sC. this will NOT generate any extra
load on collectl but will cause CPU details to be generated in plot format which
will include interrupts/cpu
- make sure user has privileges to run perfquery
- moved location of --sexpr with -sj check
- for lustre versions < 1.6 don't limit BRW stats to being in directory with
MNT in its name, which was certainly the case for HP-SFS
- changed headers for lustre rpc buffers to 'P' rather than 'K'
- changed directory on MDS that we look in for stats from .../MDT/mds/stats for
older versions of lustre to ...MDS/mds/stats for versions >= 1.6
- need to check lustre version BEFORE calling lustreCheck() routines
- in lustreCheckClt(), only do OST level tests if really a client
2.5.0 Feb 29, 2008
- if HCA present but IB stack not completely loaded, the cat of /sys/class/infiniband/*
fails and reports error. redirecting STDERR suppresses that error
- added support for reporting interrupts by CPU
- removed all but the collectl and collectl-data man pages, moving their content
to the collectl web site at sourceforge AND to /opt/hp/collectl/docs
- when installing in a brand new ROCKS environment /bin/rm not there yet so make
conditional in %pre section of spec file [thanks roy]
- modified spec file to add build level to release so I can keep the release number
the same [thanks again, roy]
2.4.3 Feb 04, 2008
- cpu percentages calculations need to include iowait in denominator
- memory stats: include AnonPages in mapped memory
- fixed pattern match for IB device number to properly select mlx4_ adapter
- was incorrectly including network bond stats with total network stats
- wasn't printing date/time for --vmstat when requested
- added IbDupCheckFlag to collectl.conf to allow disabling the check for
duplicate instances both trying to read IB counters
- removed a couple of spaces from default output so now <80 columns wide
- when someone creates a new logical disk after collectl has been started, we need
to add that disk to the list of valid disk names
- changed the algorithm used to check for bogus network data. you can also
disable these checks by setting DefNetMax to a negative value in
collectl.conf
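Note on the iowait fix above: a minimal Python sketch of computing cpu percentages from per-interval jiffy deltas with iowait counted in the denominator. The field set is simplified (real /proc/stat has more columns) and the names are hypothetical, not collectl's own.

```python
def cpu_percentages(user, nice, system, idle, iowait):
    """Return percent per state; total includes iowait."""
    total = user + nice + system + idle + iowait
    if total == 0:
        total = 1  # guard against an empty interval
    deltas = {"user": user, "nice": nice, "sys": system,
              "idle": idle, "wait": iowait}
    return {name: 100.0 * val / total for name, val in deltas.items()}


pct = cpu_percentages(user=20, nice=0, system=10, idle=50, iowait=20)
print(pct)
```

If iowait were left out of the total here, the same sample would report user as 25% rather than 20%.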
2.4.2 Jan 16, 2008
- changed purge algorithm to explicitly purge any files in the logging directory
that match hostname, contain date/time stamp and do NOT end in 'log'. Before only
raw files were purged and this was clearly not the intent.
- on a lustre MDS, the mds_sync counter has moved as well as others added so pull more
of them. even though the newer ones won't be reported on, they'll be in the 'raw'
file for reference via tools like grep.
- bogus network record processing changed as follows: