<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>HPCC Service Status</title>
<meta name="description" content="We'll post information about ICER's system downtimes, updates, new features, and other information for the ICER user community here.
">
<link rel="stylesheet" href="/css/main.css">
<link rel="canonical" href="http://blog.icer.msu.edu//">
<link rel="alternate" type="application/rss+xml" title="HPCC Service Status" href="http://blog.icer.msu.edu//feed.xml">
</head>
<body>
<header class="site-header">
<div class="wrapper">
<a class="site-title" href="/">HPCC Service Status</a>
<nav class="site-nav">
<a href="#" class="menu-icon">
<svg viewBox="0 0 18 15">
<path fill="#424242" d="M18,1.484c0,0.82-0.665,1.484-1.484,1.484H1.484C0.665,2.969,0,2.304,0,1.484l0,0C0,0.665,0.665,0,1.484,0 h15.031C17.335,0,18,0.665,18,1.484L18,1.484z"/>
<path fill="#424242" d="M18,7.516C18,8.335,17.335,9,16.516,9H1.484C0.665,9,0,8.335,0,7.516l0,0c0-0.82,0.665-1.484,1.484-1.484 h15.031C17.335,6.031,18,6.696,18,7.516L18,7.516z"/>
<path fill="#424242" d="M18,13.516C18,14.335,17.335,15,16.516,15H1.484C0.665,15,0,14.335,0,13.516l0,0 c0-0.82,0.665-1.484,1.484-1.484h15.031C17.335,12.031,18,12.696,18,13.516L18,13.516z"/>
</svg>
</a>
<div class="trigger">
<a class="page-link" href="/about/">About</a>
</div>
</nav>
</div>
</header>
<div class="page-content">
<div class="wrapper">
<div class="home">
<h1 class="page-heading">Posts</h1>
<ul class="post-list">
<li>
<span class="post-meta">Dec 4, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/12/04/Winter-Break-Hours">Winter Break Limited Coverage</a>
</h2>
<p>There will be limited coverage while MSU observes winter break from December 24, 2024 through January 1, 2025. The system will continue to run jobs and be monitored for emergency issues. Tickets will be sorted by priority on January 2 when our team returns to work after the holiday break.
If you have any questions, <a href="https://contact.icer.msu.edu/">please contact us</a></p>
</li>
<li>
<span class="post-meta">Dec 3, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/12/03/Winter-Maintenance">HPCC Scheduled Downtime - RESOLVED 12/19/2024</a>
</h2>
<p>RESOLVED: Maintenance is complete, thank you for your patience. Job submissions will continue to run after 5PM on 12/19. Please note that as the intel14 cluster has been retired, the <code class="language-plaintext highlighter-rouge">intel14</code> constraint must be removed from any jobs.</p>
</li>
<li>
<span class="post-meta">Nov 19, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/11/19/intel16-outage">Intel16 Cluster Currently Offline - RESOLVED 11/19/2024</a>
</h2>
<p>RESOLVED: 11/19/2024 12:10PM - On 11/18/2024 ITS performed maintenance on a number of switches in the data center that required rebooting critical network infrastructure. After these reboots, several links connecting to the intel16 cluster did not recover. During this time, you may have also noticed brief pauses in OnDemand and on Gateway nodes. This morning we were able to work with ITS to re-establish connectivity to all intel16 nodes, and the intel16 cluster, along with all other nodes, is now back in production and running jobs via Slurm.</p>
</li>
<li>
<span class="post-meta">Oct 31, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/10/31/MATLAB-License">MATLAB License issue - RESOLVED 10/31/2024</a>
</h2>
<p>RESOLVED: 10/31/2024 5:15PM - The issue is resolved on development and compute nodes.</p>
</li>
<li>
<span class="post-meta">Oct 31, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/10/31/module_software-fileserver-restart">Shared Module and Software Server Restart - RESOLVED 11/1/2024</a>
</h2>
<p>RESOLVED: 11/1/2024 6:15 AM - The system restart is complete and all services should be online.</p>
</li>
<li>
<span class="post-meta">Oct 30, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/10/30/software-fileserver-restart">Shared Software File Server Restart - RESOLVED 12:50 10/30/2024</a>
</h2>
<p>RESOLVED: 1250 10/30/2024 - The system restart is complete and all services should be online.</p>
</li>
<li>
<span class="post-meta">Oct 28, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/10/28/ICER-Web-App-Login-Error">ICER Web Application Login Error - RESOLVED 10/29/2024</a>
</h2>
<p>UPDATE: 10/29/2024 - Logins to RT, OpenOnDemand, and Contact forms look to be fully functional again. Values might be cached and you might need to clear your cache. You can test by opening a private browser window. Email general@rt.hpcc.msu.edu if you still experience problems.</p>
</li>
<li>
<span class="post-meta">Oct 25, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/10/25/UserID-Information-Lookup-Error">ICER Contact Form UserID Information Lookup Error RESOLVED 10/29/2024</a>
</h2>
<p>The <a href="https://contact.icer.msu.edu/">ICER contact form</a> is currently experiencing a technical error retrieving userID information for some MSU accounts. This error may result in your inability to log new account or new research space requests. While we continue to troubleshoot this error, please use the <a href="https://contact.icer.msu.edu/contact">general contact form</a> to submit your requests. This post will continue to be updated as we have more information.</p>
</li>
<li>
<span class="post-meta">Oct 24, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/10/24/gateway-node-os-upgrade">Gateway Node Operating System Upgrades</a>
</h2>
<p>Starting on Monday 10/28/2024 and over the next few weeks, we will be upgrading the operating systems on the gateway nodes. If you experience a timeout while attempting to connect to the HPCC during this time, please try again after a short delay or use our <a href="https://ondemand.hpcc.msu.edu">Open OnDemand instance</a>. If you continue to have difficulty logging in to HPCC resources, please let us know by submitting a ticket through our <a href="https://contact.icer.msu.edu/contact">Contact Forms</a>.</p>
</li>
<li>
<span class="post-meta">Oct 23, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/10/23/development-node-reboot">2024-10-24 Development node reboots - RESOLVED 2024-10-24 0715</a>
</h2>
<p>RESOLVED: 10/24/2024 - All reboots are complete and the development nodes should be available. Please report any issues through our <a href="https://contact.icer.msu.edu">contact forms</a></p>
</li>
<li>
<span class="post-meta">Oct 17, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/10/17/Firewall-Maintenance">2024-10-29 Nondisruptive firewall update</a>
</h2>
<p>Between 7 PM and 9 PM on October 29th, ITS will perform updates to the ICER firewall. We do not anticipate any impact to users as the firewall is configured
with full redundancy, but please <a href="https://contact.icer.msu.edu/">open a ticket</a> if you notice any issues.</p>
</li>
<li>
<span class="post-meta">Sep 27, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/09/27/File-System-Performance-Update">File System Performance - RESOLVED 9/27/2024</a>
</h2>
<p>RESOLVED: 9/27/2024 - ICER has completed the migration of data from the old home and research file system to the new file system. This should resolve the occasional slowdowns that have occurred since the start of the project this past spring. Home and research file system operations have returned to normal. This includes disaster recovery replication and our file system quota processes. Thank you for your patience during this transition.</p>
</li>
<li>
<span class="post-meta">Sep 26, 2024</span>
<h2>
<a class="post-link" href="/announcement/issue/2024/09/26/Illegal-Instructions">'Illegal instruction (core dumped)' Errors - RESOLVED 10/14/2024</a>
</h2>
<p>RESOLVED: 10/14/2024 - We have applied a fix that we believe has solved the issue. If you are still experiencing problems, please <a href="https://contact.icer.msu.edu">contact ICER support</a> with a description and steps to reproduce the issue.</p>
</li>
<li>
<span class="post-meta">Sep 16, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/09/16/OnDemand-Update">OnDemand Portal Update on Friday 9/20 - RESOLVED 9/23/24</a>
</h2>
<p>At 9:00PM on Friday, September 20th, ICER’s OnDemand portal will undergo an update from version 3.0.1 to version 3.1.7. The most notable change to the portal following this update will be Globus integration. When browsing files in the updated OnDemand portal, a ‘Globus’ button will be available that will open the current directory inside of Globus. A full list of changes made by this update can be viewed <a href="https://github.com/OSC/ondemand/compare/v3.0.1...v3.1.7">here</a>. If you have any questions about this update or encounter any issues with the OnDemand portal following the update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Sep 4, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/09/04/Development-node-dev-intel18-maintenance">Development node dev-intel18 maintenance - RESOLVED</a>
</h2>
<p>RESOLVED: The maintenance on dev-intel18 is complete as of 11:20 AM, September 6, 2024 and the node should be available for use.</p>
</li>
<li>
<span class="post-meta">Sep 3, 2024</span>
<h2>
<a class="post-link" href="/announcement/change/2024/09/03/Change-to-Loading-Modules-in-SLURM-Scripts">Change to Loading Modules in SLURM Scripts</a>
</h2>
<p>In one week, ICER will make a small change to the way modules are loaded in SLURM scripts. Please make sure that all SLURM scripts you submit load modules in scripts before you use them! For more information and also how this affects workflow managers like Nextflow and Snakemake, please <a href="https://docs.icer.msu.edu/2024-08-28_Change_to_Modules_in_SLURM_Jobs/">see our documentation</a>.</p>
</li>
<li>
<span class="post-meta">Aug 29, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/08/29/log-ins-down">OnDemand and Contact form login issues - RESOLVED</a>
</h2>
<p>RESOLVED: ITS has resolved the login issue and all systems are accessible as normal.</p>
</li>
<li>
<span class="post-meta">Aug 9, 2024</span>
<h2>
<a class="post-link" href="/announcement/bug/2024/08/09/Filesystem-slowdown">Filesystem Slowdown and User Creation Pause - RESOLVED</a>
</h2>
<p>UPDATE 8/13/2024: The recovery processes have finished running, and Home filesystem performance has now returned to normal.</p>
</li>
<li>
<span class="post-meta">Aug 6, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/08/06/Scheduled-Maintenance">August 6, 2024: HPCC Scheduled Downtime and Transition of Remaining CentOS Nodes (Completed 8/6/2024)</a>
</h2>
<p>Updates:
05:00PM - Upgrades are complete and we are in the process of moving the system to production. This process takes about 30 minutes. HPCC should be available by 5:30PM or shortly after. The Home and Research filesystems are a little slow while snapshots catch up. The slowdown will clear later this evening. If you notice problems, <a href="https://contact.icer.msu.edu">contact us</a>.</p>
</li>
<li>
<span class="post-meta">Jul 22, 2024</span>
<h2>
<a class="post-link" href="/announcement/bug/2024/07/22/Scavenger-Queue-Issues">RESOLVED 7/31/24 Scavenger Queue jobs not starting</a>
</h2>
<p>The scavenger queue is operating normally now that the buyin node OS transition has been completed.</p>
</li>
<li>
<span class="post-meta">Jul 12, 2024</span>
<h2>
<a class="post-link" href="/announcement/bug/2024/07/12/Minor-data-machine-issue">Data machine nodes not showing up in scontrol - UPDATED</a>
</h2>
<p>On July 12th, it was discovered that the data machine nodes are not properly responding to diagnostic commands. However, these nodes are still available and scheduling jobs.</p>
</li>
<li>
<span class="post-meta">Jul 11, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/07/11/MPI-Rebuild">Rebuilding default OpenMPI, may cause login issues - RESOLVED</a>
</h2>
<p>On July 11th, 2024 from 5:30-6:00PM Michigan time, we will be rebuilding the default OpenMPI module, <code class="language-plaintext highlighter-rouge">OpenMPI/4.1.5-GCC-12.3.0</code>. This will result in errors from the module system when logging in, as the module needs to be deleted to be rebuilt. This will <em>not</em> affect running jobs, and will be isolated to development nodes only. The rebuild should be complete by 6:00PM at which time this blog post will be updated.</p>
</li>
<li>
<span class="post-meta">Jul 1, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/07/01/Details-About-Current-Issues">Update details about current filesystem and OnDemand issues</a>
</h2>
<p><strong><em>OnDemand:</em></strong> OnDemand is periodically losing connection to our gateway nodes. This makes home and scratch unavailable. We are still investigating the cause. <strong><em>Home directories</em></strong>: The home file system underwent diagnostics from 6/24-6/28. This caused slowdowns for logging in and using the HPCC. We restarted our backup process after the scan ended on the evening of 6/28, and users may see pauses as the file system catches up. <strong><em>NewOS:</em></strong> We upgraded our operating system to Ubuntu 22.04 in mid June. This included a reinstallation of all software modules. Please read our documentation <a href="https://docs.icer.msu.edu/OS_Upgrade/">here</a> for more details about the upgrade, and <a href="https://contact.icer.msu.edu/contact">contact us</a> if you are having issues not covered by this documentation. <strong><em>Please click the title of this post for more detailed information and our planned timeline.</em></strong> <strong>Updated: 7/10</strong> at the end.</p>
</li>
<li>
<span class="post-meta">Jun 27, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/06/27/Current-Issues">Current system issues</a>
</h2>
<p>We are aware of two issues affecting the system at this time: slow response to commands/slow login, and OnDemand scratch space missing.
The system slowdowns are caused by diagnostics on the home filesystem as part of our upgrade to a new home filesystem.
We do not currently have an estimate for when these diagnostics will complete.
The OnDemand scratch space connection is also being diagnosed and addressed with our storage vendor.
Please check back for updates as we have them.</p>
</li>
<li>
<span class="post-meta">Jun 24, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/06/24/Home_filesystem_issues_affecting_OnDemand">Home filesystem issues affecting OnDemand</a>
</h2>
<p>OnDemand functionality has been partially recovered. Users should be able to log in, connect, and access their home and research spaces, as well as interactive app sessions. Scratch remains unavailable at this time. Please report access issues at https://contact.icer.msu.edu/contact</p>
</li>
<li>
<span class="post-meta">Jun 17, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/06/17/Filesystem_issues">Home filesystem issues affecting OnDemand - Resolved</a>
</h2>
<p>At approximately 12:00 PM on 6/17/2024 we started experiencing an outage with the Home filesystem. This outage primarily affects OnDemand, but may be apparent on other nodes as well.</p>
</li>
<li>
<span class="post-meta">Jun 17, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/06/17/OS_Upgrade">Compute Operating system upgrades (complete)</a>
</h2>
<p>On 17 June, 2024 the primary operating system on HPCC resources is being changed from CentOS 7 to Ubuntu 22.04.
Please review <a href="https://docs.icer.msu.edu/OS_Upgrade/">our operating system upgrade documentation</a> for details.</p>
</li>
<li>
<span class="post-meta">May 16, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/05/16/Samba-connectivity">Samba connectivity issues</a>
</h2>
<p>UPDATE 3:45pm 5/16/24 Samba file sharing is now back online. Please submit a ticket at https://contact.icer.msu.edu/contact if you continue to experience issues.</p>
</li>
<li>
<span class="post-meta">May 13, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/05/13/HomeDirectory_issues">Home filesystem issues - update</a>
</h2>
<p>At approximately 6:15PM on 5/13/2024, users began reporting issues accessing their home directory on HPCC. We are aware of the issue and are working with our vendors to address it.</p>
</li>
<li>
<span class="post-meta">May 10, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/05/10/Filesystem_issues">Home filesystem issues causing login problems</a>
</h2>
<p>At approximately 11:10 AM on 5/10/2024 we experienced a transient outage while conducting upgrades and hardware refresh of our Home filesystem. This outage may have caused login issues or stale filemounts. Services were restored after approximately 15 minutes and home directories should be available again. If you continue to experience issues with your home directory, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">May 6, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/05/06/System-Reboots">System Reboots Thursday May 9</a>
</h2>
<p>On Thursday May 9 the following systems will be rebooted from 10-12am:</p>
</li>
<li>
<span class="post-meta">May 3, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/05/03/Home-Directory-Problems">Home filesystem issue UPDATED 5/3/2024 5:00PM</a>
</h2>
<p>UPDATE (5/3/2024 5:00 pm) - The issue has been resolved and all services should be available. If you encounter any additional issues, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">May 2, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/05/02/Filesystem_issues">Home filesystem issue causing login problems - UPDATED 5/2/2024 12:30 pm</a>
</h2>
<p>UPDATE (5/2/2024 12:30 pm) - File system and connectivity issues have been resolved.</p>
</li>
<li>
<span class="post-meta">Mar 18, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/03/18/SLURM_Controller_Reboot">Scheduler Reboot at 10:00AM on 3/19/24</a>
</h2>
<p>At 10:00AM on Tuesday, March 19th, the SLURM scheduling server will go offline for a reboot. This reboot is necessary to apply updates to the underlying hardware that hosts the scheduler. The scheduler is expected to be offline for roughly 15 minutes. During this time, jobs may not be submitted and scheduler specific client commands will not work (e.g. squeue, sbatch, etc). Running jobs will not be affected. If you have any questions about this outage, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Mar 15, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/03/15/SLURM_Controller_Reboot">Scheduler Reboot at 10:00AM on 3/18/24</a>
</h2>
<p>At 10:00AM on Monday, March 18th, the SLURM scheduling server will go offline for a reboot. This reboot is necessary to apply updates to the underlying hardware that hosts the scheduler. The scheduler is expected to be offline for roughly 15 minutes. During this time, jobs may not be submitted and scheduler specific client commands will not work (e.g. squeue, sbatch, etc). Running jobs will not be affected. If you have any questions about this outage, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Mar 1, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/03/01/Scratch_OnDemand_Problem">Scratch space not accessible via OnDemand</a>
</h2>
<p>UPDATE (3/1/2024) - Access to scratch via OnDemand has been restored</p>
</li>
<li>
<span class="post-meta">Feb 1, 2024</span>
<h2>
<a class="post-link" href="/announcement/2024/02/01/VSCode_Update_Problem">VSCode updates will break access</a>
</h2>
<p>This post applies to users of VS Code that SSH into the ICER HPCC from their own copy of VS Code.</p>
<p>Error message:
“This machine does not meet Visual Studio Code Server’s prerequisites, expected either…: - find GLIBC >= v2.28.0 (but found v2.17.0 instead) for GNU environments”</p>
<p>Details
Microsoft recently updated Visual Studio Code to version 1.86, and it is no longer compatible with the operating system we use at ICER. The release note that lists the change is here: https://code.visualstudio.com/updates/v1_86#_engineering (scroll down to “Linux minimum requirements update”). Although we plan to upgrade our operating system this year, in the meantime there are two solutions to this incompatibility.</p>
<p>Solutions</p>
<p>1) Use our code server app in OnDemand (Interactive Apps -> Code Server (beta)) You can request compute nodes to work on for a specified amount of time, and use VS Code in your browser.</p>
<p>2) Downgrade to the previous 1.85 version of VS Code and disable automatic updates. You can access the previous version here https://code.visualstudio.com/updates/v1_85 (see the Downloads section for a version for your PC or Mac)</p>
</li>
<li>
<span class="post-meta">Jan 5, 2024</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2024/01/05/Minor-SLURM-Update">Minor SLURM Update on 01/11/24</a>
</h2>
<p>On Thursday, January 11th, we will be deploying a minor update to the SLURM scheduling software. This update will bring ICER to the latest minor revision of SLURM 23.02. Running and queued jobs should not be affected. No interruptions are expected to client command functionality (e.g. squeue, sbatch, sacct). If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Dec 13, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/12/13/Winter-Break-Hours">Winter Break Limited Coverage</a>
</h2>
<p>There will be limited coverage while MSU observes winter break from December 22, 2023 through January 2, 2024. The system will continue to run jobs and be monitored for emergency issues. Tickets will be sorted by priority on January 3 when our team returns to work after the holiday break.
If you have any questions, <a href="https://contact.icer.msu.edu/">please contact us</a></p>
</li>
<li>
<span class="post-meta">Dec 4, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/12/04/Retirement-dev-intel14">Retirement of dev-intel14 and dev-intel14-k20 on 12/14/23</a>
</h2>
<p>On Thursday, December 14th, we will be retiring the dev-intel14 and dev-intel14-k20 nodes. After this date, the dev-intel14 and dev-intel14-k20 nodes will no longer be available for use as development nodes. Users should connect to the remaining active development nodes for any development node tasks. If you have any questions about this change, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Nov 30, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/11/30/Minor-SLURM-Update">Minor SLURM Update on 12/05/23</a>
</h2>
<p>On Tuesday, December 5th, we will be deploying a minor update to the SLURM scheduling software. Running and queued jobs should not be affected. No interruptions are expected to client command functionality (e.g. squeue, sbatch, sacct). If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Nov 26, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/11/26/Winter-Maintenance">HPCC Scheduled Downtime - Completed</a>
</h2>
<p>The HPCC will be unavailable on Wednesday, December 20th for our regularly scheduled maintenance. No jobs will run during this time. Jobs that will not be completed before December 20th will not begin until after maintenance is complete. For example, if you submit a four day job three days before the maintenance outage, your job will be postponed and will not begin to run until after maintenance is completed.</p>
</li>
<li>
<span class="post-meta">Nov 15, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/11/15/RT-problems-resolved">RT Ticketing system problem last night 11/15/23</a>
</h2>
<p>The RT/Ticketing systems had problems after an upgrade last night. The time of the problem was from 9:00 pm 11-14-23 to 9:00 am 11-15-23. If you had problems during that timeframe please try again now. If you experience problems again please clear your browser cache. Thank You.</p>
</li>
<li>
<span class="post-meta">Nov 6, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/11/06/Minor-SLURM-Update">Minor SLURM Update on 11/09/23</a>
</h2>
<p>On Thursday, November 9th, we will be deploying a minor update to the SLURM scheduling software. This update will improve the efficiency of our SLURM controller's application logs. Running and queued jobs should not be affected. No interruptions are expected to client command functionality (e.g. squeue, sbatch, sacct). If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Oct 27, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/10/27/Jobs-Requeued-On-Prolog-Failure">Jobs Now Always Automatically Requeued On Prolog Failure</a>
</h2>
<p>As of Thursday, October 26th, jobs that fail to start due to a prolog script error will always be requeued.</p>
</li>
<li>
<span class="post-meta">Oct 24, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/10/24/Home-Directory-Problems">Performance problem on home system - UPDATED 10/24/2023</a>
</h2>
<p>UPDATE (10/24/2023) - The performance issues with the home directory system have now been resolved.</p>
</li>
<li>
<span class="post-meta">Oct 18, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/10/18/Minor-SLURM-Update">Minor SLURM Update on 10/23/23</a>
</h2>
<p>On Monday, October 23rd, we will be deploying a minor update to the SLURM scheduling software. This update brings our installation to the latest release and includes many bug fixes. Running and queued jobs should not be affected. No interruptions are expected to client command functionality (e.g. squeue, sbatch, sacct). If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Oct 10, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/10/10/Minor-Singularity-Update">Minor Singularity Update on 10/12/23</a>
</h2>
<p>On Thursday, October 12th, we will be deploying a minor update to the Singularity container software. This update will bring the HPCC from version 3.11.4 to the latest 3.11.5. A handful of bug fixes and new features are available in this version. For a full list of changes, please refer to <a href="https://github.com/sylabs/singularity/releases">Singularity’s release notes on GitHub</a>. If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a></p>
</li>
<li>
<span class="post-meta">Oct 2, 2023</span>
<h2>
<a class="post-link" href="/announcement/outage/2023/10/02/HPCC-connectivity-issues">HPCC Connectivity Issues - UPDATED 10/2/23 </a>
</h2>
<p>UPDATE (10/2/2023): We experienced an issue at 0930 this morning with home directories that prevented user logins. All services are now recovered and login should again be successful. Please let us know if you continue to experience issues.</p>
</li>
<li>
<span class="post-meta">Sep 28, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/09/28/Minor-SLURM-Update">Minor SLURM Update on 10/09/23</a>
</h2>
<p>On Monday, October 9th, we will be deploying a minor update to the SLURM scheduling software. Running and queued jobs should not be affected. No interruptions are expected to client command functionality (e.g. squeue, sbatch, sacct). If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Sep 20, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/09/20/contact-gateway-connectivity">Contact and Gateway Issues on 9/19/23</a>
</h2>
<p>Due to a failure of a supporting service, gateway-02 and the contact forms were unavailable at around 6 PM this evening. Staff have restored these services.</p>
</li>
<li>
<span class="post-meta">Sep 18, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/09/18/Minor-SLURM-Update">Minor SLURM Update on 9/21/23</a>
</h2>
<p>On Thursday, September 21st, we will be deploying a minor update to the SLURM scheduling software. This update is built against newer NVIDIA drivers to support scheduling of multi-instance GPUs. If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Aug 31, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/08/31/Home-Directory-Performance-Issues-Updated">Performance problem on home system - resolved</a>
</h2>
<p>UPDATE (8/31/2023): The cause of the system slowdowns was identified on 8/29/2023 as jobs saturating the storage I/O. Please see the <a href="https://docs.icer.msu.edu/2023-08-30_LabNotebook_SlowdownIncidentReport/">lab notebook</a> for details and best practices to prevent this from happening again.</p>
</li>
<li>
<span class="post-meta">Aug 16, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/08/16/Globus-Restored">Globus Restored to Service - 8/16/2023</a>
</h2>
<p>8/16/2023:</p>
</li>
<li>
<span class="post-meta">Aug 15, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/08/15/Summer-Maintenance">HPCC Scheduled Downtime - UPDATED 8/15/2023</a>
</h2>
<p>UPDATE (8/15/2023):
All scheduled updates for the 8/15/2023 summer maintenance have been completed.</p>
</li>
<li>
<span class="post-meta">Aug 2, 2023</span>
<h2>
<a class="post-link" href="/announcement/outage/2023/08/02/HPCC-connectivity-issues">HPCC Connectivity Issues</a>
</h2>
<p>Update:
Network problems in the data center were fixed by 3pm.<br />
Stability of home directories and gateways was restored by 5:30pm.
File a ticket if you notice any other issues. We will continue to monitor closely this evening.</p>
</li>
<li>
<span class="post-meta">Jul 26, 2023</span>
<h2>
<a class="post-link" href="/announcement/outage/2023/07/26/intermittent-performance-issues">Intermittent HPCC Performance Issues</a>
</h2>
<p>We are experiencing sporadic episodes of slowness with logging in to the gateways and/or interactive work on the development nodes. We’re in the process of tracking down this issue. If you are experiencing this issue and/or have any other comments or questions, please feel free to file a ticket with us here: https://contact.icer.msu.edu/contact</p>
</li>
<li>
<span class="post-meta">Jul 11, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/07/11/SLURM-Controller-Outage">Scheduler Outage on 7/25/23 at 6:00PM</a>
</h2>
<p>Starting at 6:00PM on Tuesday, July 25th, the SLURM scheduler will go offline so that it can be migrated. This migration is necessary to complete routine maintenance on the scheduler's underlying compute resources. The outage is expected to last up to 30 minutes. During this time, SLURM client commands (sbatch, squeue, etc.) will be unavailable and no new jobs will be started. Queued and running jobs will not be affected. If you have any questions about this outage, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Jul 6, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/07/06/Minor-SLURM-Update">Minor SLURM Update on 7/10/23</a>
</h2>
<p>On Monday, July 10th, we will be deploying a minor update to the SLURM scheduling software. This update contains a patch designed to address a bug experienced with some large jobs (>50 nodes) that causes job processes to persist past a job’s end time. If you have any questions about this update or you experience issues following this update, please contact us at https://contact.icer.msu.edu/.</p>
</li>
<li>
<span class="post-meta">Jun 29, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/06/29/Minor-Singularity-Update">Minor Singularity Update on 7/3/23</a>
</h2>
<p>On Monday, July 3rd, we will be deploying a minor update to the Singularity container software. This update will bring the HPCC from version 3.11.2 to the latest 3.11.4. Several bug fixes and new features are available in this version. For a full list of changes, please refer to <a href="https://github.com/sylabs/singularity/releases">Singularity’s release notes on GitHub</a>. If you have any questions about this update or you experience issues following this update, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Jun 27, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/06/27/Minor-SLURM-Update">Minor SLURM Update on 6/28/23</a>
</h2>
<p>On Wednesday, June 28th, we will be deploying a minor update to the SLURM scheduling software. This update contains minor bug fixes and should not impact HPCC users. If you have any questions about this update or you experience issues following this update, please contact us at https://contact.icer.msu.edu/.</p>
</li>
<li>
<span class="post-meta">Jun 23, 2023</span>
<h2>
<a class="post-link" href="/announcement/outage/2023/06/23/HPCC-connectivity-issues">HPCC Connectivity Issues - UPDATED</a>
</h2>
<p>UPDATE: Network connectivity has been restored, and all ICER services are operational.</p>
</li>
<li>
<span class="post-meta">Jun 21, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/06/21/Server-Maintenance-06-22-At-0530AM">Server Maintenance on June 22 at 5:30AM - UPDATED</a>
</h2>
<p>UPDATE: This maintenance work is now complete.</p>
</li>
<li>
<span class="post-meta">Jun 15, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/06/15/Network-Maintenance-Planned-June-19-2023">Network Maintenance Planned for June 19, 2023 at 6:30PM - UPDATED</a>
</h2>
<p>UPDATE: Scheduled HPCC network maintenance is now complete.</p>
</li>
<li>
<span class="post-meta">Jun 5, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/06/05/Email-Delivery-Delay-Updated">Email Delivery Delays - June 5, 2023 - UPDATED</a>
</h2>
<p>UPDATE: Email to ICER is now functioning again without errors or delays.</p>
</li>
<li>
<span class="post-meta">Jun 1, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/06/01/XprizeSlowDown">Temporary Service Slowdown Possible - June 3, 2023</a>
</h2>
<p>ICER users may notice slow network speeds from June 3 to June 7, 2023. In support of the XPRIZE competition, ICER will share significant HPCC resources during this timeframe, which may result in service slowdowns.</p>
</li>
<li>
<span class="post-meta">May 23, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/05/23/GFPSUpdates">Scheduled Home Filesystem Update - Tuesday, May 30, 2023</a>
</h2>
<p>On Tuesday, May 30th at 10am EDT, we will be performing a minor version upgrade of our home filesystem. This process will take approximately two hours. While we will be performing the update with the filesystem online, there is a possibility that the cluster may briefly lose connection to the storage. Please take this into consideration for any jobs which will be running during this time.</p>
</li>
<li>
<span class="post-meta">May 2, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/05/02/New-Local-File-Quotas">Local File Quotas Are Now Set for /tmp and /var on All Nodes</a>
</h2>
<p>Beginning May 3, 2023, user quotas will be in place on all nodes for the /tmp and /var directories. Each user account will be limited to 95% of the total /tmp partition space available on a particular node and to 5GB on the /var partition. If a user account exceeds a quota, a 2-hour grace period will be allowed before the account is no longer able to write to the /tmp or /var directory.</p>
</li>
<li>
<span class="post-meta">Apr 3, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/04/03/Intel14-Dedicated-to-Ondemand">Intel14 Nodes Now Dedicated to OnDemand</a>
</h2>
<p>Intel14 nodes have been removed from general queues and repurposed. A combined total of 2468 CPU cores and 15.66TB of memory has been dedicated to running jobs submitted through ICER’s installation of <a href="https://docs.icer.msu.edu/Open_OnDemand">Open OnDemand</a>. Dedicating these resources will help to reduce the amount of time users have to wait to launch interactive jobs through OnDemand.</p>
</li>
<li>
<span class="post-meta">Mar 28, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/03/28/SLURM-Node-Updates">SLURM Node Updates on Thursday, March 30th</a>
</h2>
<p>On Thursday, March 30th, at 10:00AM, SLURM clients will be updated to the latest version. This update will bring the node and user components of SLURM to the same version as our SLURM controller and database. Most client commands (e.g. squeue, sbatch, sacct) should work seamlessly through this update. New jobs can be queued as normal and running jobs should not be affected. During these updates, nodes will appear as offline and no new jobs will start. Please note that pending srun/salloc commands may fail to start after this update is complete. If you have a job submitted through srun/salloc that fails after this update, <a href="https://contact.icer.msu.edu/">please contact us</a>. We can boost the priority of your job after resubmission.</p>
</li>
<li>
<span class="post-meta">Mar 23, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/03/23/MPI-Performance">MPI Performance Issues Following SLURM Controller Update - Updated</a>
</h2>
<p><em>UPDATE:</em> We applied a patch from the software vendor that eliminates the performance issue.</p>
</li>
<li>
<span class="post-meta">Mar 21, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/03/21/Intel14-Removal">Intel14 nodes to be removed from general queues - Updated</a>
</h2>
<p><em>UPDATE:</em> Intel14 nodes have been removed from general queues.</p>
</li>
<li>
<span class="post-meta">Mar 13, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/03/13/SLURM-Controller-Update">SLURM Scheduler Update at 5:00PM on 3/16/23 - Updated</a>
</h2>
<p><em>UPDATE:</em> The scheduler is back online and functioning normally.</p>
</li>
<li>
<span class="post-meta">Mar 10, 2023</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2023/03/10/SLURM-Database-Server-Update">SLURM Database Outage at 10:00AM on 3/9/23 - UPDATED</a>
</h2>
<p><em>UPDATE:</em> The database upgrade is complete. The sacct command will now function as expected.</p>
</li>
<li>
<span class="post-meta">Feb 1, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/02/01/Scratch-Purge">Scratch purge of 45 day old files</a>
</h2>
<p>Starting on February 15th, files on /mnt/scratch (/mnt/gs21) that have not been modified within the last 45 days will be deleted. Due to technical issues, this purge has not been running and older files have not been regularly removed from scratch/gs21. This issue has been fixed and automatic deletion will resume on February 15th. Users should ensure that any data older than 45 days on scratch/gs21 that they wish to keep has been moved to persistent storage (home/research spaces or external storage).</p>
</li>
<li>
<span class="post-meta">Jan 5, 2023</span>
<h2>
<a class="post-link" href="/announcement/2023/01/05/Winter-Maintenance">HPCC Scheduled Downtime</a>
</h2>
<p>Update (1/5/2023):
All updates were completed by 3pm on 1/4/2023. Globus had problems and was brought back online on 1/5/2023.
If you experience any problems, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Dec 22, 2022</span>
<h2>
<a class="post-link" href="/maintenance/2022/12/22/Rsync">Resolved: Rsync gateway issues</a>
</h2>
<p>RESOLVED 12/22/22: The issue with the rsync gateway is resolved and file transfers are fully functional.</p>
</li>
<li>
<span class="post-meta">Dec 12, 2022</span>
<h2>
<a class="post-link" href="/maintenance/2022/12/12/Rsync-gateway-issues">Resolved: Rsync gateway issues</a>
</h2>
<p>RESOLVED 12/13/22: The issue with the rsync gateway is resolved and file transfers are fully functional.</p>
</li>
<li>
<span class="post-meta">Dec 7, 2022</span>
<h2>
<a class="post-link" href="/announcement/2022/12/07/Winter-Break-Hours">Winter Break Limited Coverage</a>
</h2>
<p>There will be limited coverage while MSU observes winter break from December 23, 2022 through January 2, 2023. The system will continue to run jobs and will be monitored for emergency issues. Tickets will be sorted by priority on January 3 when our team returns to work after the holiday break.
If you have any questions, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Nov 18, 2022</span>
<h2>
<a class="post-link" href="/announcement/2022/11/18/Scavenger-Queue-Limits">New Limits on Scavenger Queue</a>
</h2>
<p>We have implemented a new limit of 520 running jobs per user and 1000 submitted jobs per user in the scavenger queue. We have put this limit in place to ensure that the scheduler is able to evaluate all the jobs in the queue during its regular scheduling cycles. This matches our general queue limits. Please see <a href="https://docs.icer.msu.edu">our documentation</a> for more information about our <a href="https://docs.icer.msu.edu/job_policies/">scheduler policy</a> and <a href="https://docs.icer.msu.edu/Scavenger_Queue/">scavenger queue</a>. If you have any questions regarding this change, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</li>
<li>
<span class="post-meta">Nov 15, 2022</span>
<h2>
<a class="post-link" href="/announcement/2022/11/15/Login-errors">Resolved: Login issue - Stale file handle</a>
</h2>
<p>We are currently experiencing a login issue with our gateway nodes, which report <code class="language-plaintext highlighter-rouge">/mnt/home/&lt;username&gt;/.bash_profile: Stale file handle</code>. We are working to resolve this issue.</p>
</li>
<li>
<span class="post-meta">Nov 1, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/11/01/Scheduler-Outage">Scheduler Outage on November 1st at 8PM</a>
</h2>
<p>On November 1st at 8PM the scheduler will be offline momentarily in order to add additional computing resources to the machine that hosts the scheduling software. If you have any questions or concerns regarding this outage, please <a href="https://contact.icer.msu.edu">contact us.</a></p>
</li>
<li>
<span class="post-meta">Oct 26, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/10/26/RequestTracker-Outage">Resolved: Request Tracker rt.hpcc.msu.edu outage.</a>
</h2>
<p>From about 4 AM to 9 AM this morning (10-26) RT was unavailable due to a configuration management issue. It has been resolved but please let us know if you have any issues.</p>
</li>
<li>
<span class="post-meta">Oct 12, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/10/12/ondemand-acm-error">Resolved: OnDemand failing when job is scheduled on a new acm node.</a>
</h2>
<p>RESOLVED 10/14/2022: OnDemand Desktop now works on the amd22 cluster.</p>
</li>
<li>
<span class="post-meta">Oct 10, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/10/10/Service-Availability-Issues">Service availability issues 10/10</a>
</h2>
<p>At about 12:20 PM on October 10th, a bad git merge for our configuration management software caused old configurations to be pushed out to all nodes, which broke a number of services (including the contact forms and job submission on some nodes). This was reverted by 1:08 PM, but due to caching some nodes may have received this configuration through 2 PM. All nodes and services should be back to normal functionality by 3 PM on October 10th.</p>
</li>
<li>
<span class="post-meta">Oct 7, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/10/07/RequestTracker-Upgrade">Resolved: Request Tracker and Contact Forms outage on 10/11</a>
</h2>
<p>Update 10/11 8 AM: Maintenance on RT has completed. Please let us know if you have any issues.</p>
</li>
<li>
<span class="post-meta">Oct 4, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/update/2022/10/04/scratch-issues">HPCC Scratch filesystem issues - Resolved</a>
</h2>
<p>The HPCC scratch filesystem is currently experiencing an issue. Users may have seen issues as early as 7:30 AM this morning. We are working to identify the cause and correct the issue and will post updates here as they become available.</p>
</li>
<li>
<span class="post-meta">Sep 27, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/09/27/Password-logins-to-rsync-gateway-disabled">Password logins to the rsync gateway will be disabled on 10/12/22</a>
</h2>
<p>UPDATE (10/14): This has been implemented. Users using sshfs on Windows should <a href="https://contact.icer.msu.edu/">contact the ICER help desk</a> for help using public key authentication with rsync.hpcc.msu.edu.</p>
</li>
<li>
<span class="post-meta">Aug 31, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/update/2022/08/31/new-gs21-scratch">New Scratch gs21 availability and gs18/ls15 retirement - UPDATED</a>
</h2>
<p>We are excited to announce the general release of our new gs21 scratch system, now available at /mnt/gs21/scratch on all user systems, including gateways, development nodes, and the compute cluster. The new scratch system provides 3 PB of space for researchers and allows us to continue to maintain 50 TB quotas for our growing community. The new system also includes 200 TB of high-speed flash. You may begin to utilize the new scratch system immediately. Please read on for more information about the transition to this space.
</p>
</li>
<li>
<span class="post-meta">Aug 31, 2022</span>
<h2>
<a class="post-link" href="/maintenance/outage/2022/08/31/File-Transfer-Service-Migration">File Transfer Service Network Migration - Resolved</a>
</h2>
<p><em>UPDATE:</em> The rsync service (rsync.hpcc.msu.edu) is available (8-31). A reminder that the rsync service node should only be used for file transfers.</p>
</li>
<li>
<span class="post-meta">Aug 17, 2022</span>
<h2>
<a class="post-link" href="/announcement/maintenance/2022/08/17/SLURM-Configuration-update">Brief Scheduler Outage at 8:00PM 8/18/22 - UPDATED</a>
</h2>