<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>HPCC Service Status</title>
<description>We'll post information about ICER's system downtimes, updates, new features, and other information for the ICER user community here.
</description>
<link>http://blog.icer.msu.edu//</link>
<atom:link href="http://blog.icer.msu.edu//feed.xml" rel="self" type="application/rss+xml"/>
<pubDate>Thu, 19 Dec 2024 21:05:08 +0000</pubDate>
<lastBuildDate>Thu, 19 Dec 2024 21:05:08 +0000</lastBuildDate>
<generator>Jekyll v4.2.2</generator>
<item>
<title>Winter Break Limited Coverage</title>
<description><p>There will be limited coverage while MSU observes winter break from December 24, 2024 through January 1, 2025. The system will continue to run jobs and be monitored for emergency issues. Tickets will be sorted by priority on January 2 when our team returns to work after the holiday break.
If you have any questions, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</description>
<pubDate>Wed, 04 Dec 2024 19:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/2024/12/04/Winter-Break-Hours</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/2024/12/04/Winter-Break-Hours</guid>
<category>announcement</category>
</item>
<item>
<title>HPCC Scheduled Downtime - RESOLVED 12/19/2024</title>
<description><p>RESOLVED: Maintenance is complete. Thank you for your patience. Submitted jobs will resume running after 5 PM on 12/19. Please note that because the intel14 cluster has been retired, the <code class="language-plaintext highlighter-rouge">intel14</code> constraint must be removed from any jobs.</p>
<p>The HPCC will be unavailable on Thursday, December 19th for our regularly scheduled maintenance. No jobs will run during this time. Jobs that will not be completed before December 19th will not begin until after maintenance is complete. For example, if you submit a four-day job three days before the maintenance outage, your job will be postponed and will not begin to run until after maintenance is completed.</p>
<p>A high-level overview of the changes:</p>
<ul>
<li>Gateway host key replacement</li>
<li>Cluster OS updates to improve stability and security</li>
<li>Firmware and driver upgrades</li>
<li>Final GPFS updates for summer storage replacement</li>
<li>Slurm upgrade</li>
<li>Network upgrades for new cluster</li>
<li>Bug fixes and other improvements</li>
</ul>
<p>If you have any questions, <a href="https://contact.icer.msu.edu/">please contact us</a>.</p>
</description>
<pubDate>Tue, 03 Dec 2024 16:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/2024/12/03/Winter-Maintenance</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/2024/12/03/Winter-Maintenance</guid>
<category>announcement</category>
</item>
<item>
<title>Intel16 Cluster Currently Offline - RESOLVED 11/19/2024</title>
<description><p>RESOLVED: 11/19/2024 12:10 PM - On 11/18/2024, ITS performed maintenance on a number of switches in the data center that required rebooting critical network infrastructure. After these reboots, several links connecting to the intel16 cluster did not recover. During this time, you may have also noticed brief pauses in OnDemand and on Gateway nodes. This morning we were able to work with ITS to re-establish connectivity to all intel16 nodes, and the intel16 cluster, along with all other nodes, is now back in production and running jobs via Slurm.</p>
<p>Around 8:40 PM on 11/18/2024, the intel16 cluster went offline due to a network outage and remains offline. Currently no jobs requesting to run on intel16 will be scheduled. We are troubleshooting this network outage with IT Services and will provide more information when it is available via an update to this blog post.</p>
</description>
<pubDate>Tue, 19 Nov 2024 13:40:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/2024/11/19/intel16-outage</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/2024/11/19/intel16-outage</guid>
<category>announcement</category>
</item>
<item>
<title>MATLAB License issue - RESOLVED 10/31/2024</title>
<description><p>RESOLVED: 10/31/2024 5:15PM - The issue is resolved on development and compute nodes.</p>
<p>UPDATE: 10/31/2024 4:00PM - The issue is resolved on development nodes. The fix is still being synced to compute nodes.</p>
<p>ICER is investigating an issue with MATLAB licensing. When starting MATLAB, you may see an error message:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>License checkout failed.
License Manager Error -97
License Manager cannot start.
Check that the ports specified in the license file are not already in use.
Restarting your machine may clear the ports.
Troubleshoot this issue by visiting:
https://www.mathworks.com/support/lme/97
Diagnostic Information:
Feature: MATLAB
License path: /mnt/home/grosscra/.matlab/R2023b_licenses:/opt/software-current/2023.06/x86_64/generic/software/MATLAB/2023b/licenses/license.dat:/opt/software-current/2023.06/x86_64/generic/software/MATLAB/2023b/licenses/network.lic
Licensing error: -97,121.
</code></pre></div></div>
<p>This issue is due to problems on our side, which we are working to resolve. We will update this blog post accordingly.</p>
</description>
<pubDate>Thu, 31 Oct 2024 15:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/2024/10/31/MATLAB-License</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/2024/10/31/MATLAB-License</guid>
<category>announcement</category>
</item>
<item>
<title>Shared Module and Software Server Restart - RESOLVED 11/1/2024</title>
<description><p>RESOLVED: 11/1/2024 6:15 AM - The system restart is complete and all services should be online.</p>
<p>At 6:00 AM on November 1, 2024, we will restart our shared module and software server to improve performance. Users may experience a delay or failure when logging into development nodes while this system is restarting. Slurm jobs should continue to run, but active connections to development nodes may experience momentary delays accessing the module system. This restart should be completed within approximately 15 minutes, and updates will be posted here upon completion.</p>
</description>
<pubDate>Thu, 31 Oct 2024 13:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/maintenance/2024/10/31/module_software-fileserver-restart</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/maintenance/2024/10/31/module_software-fileserver-restart</guid>
<category>announcement</category>
<category>maintenance</category>
</item>
<item>
<title>Shared Software File Server Restart - RESOLVED 12:50 10/30/2024</title>
<description><p>RESOLVED: 12:50 10/30/2024 - The system restart is complete and all services should be online.</p>
<p>UPDATE: 11:00 10/30/2024 - The restart of the system has been rescheduled for 12:30 PM on 10/30/2024.</p>
<p>At 11:00 AM on October 30, 2024, we will restart our shared software server to improve performance. Users may experience a delay or failure when logging into development nodes while this system is restarting. This restart should be completed within approximately 15 minutes, and updates will be posted here upon completion.</p>
</description>
<pubDate>Wed, 30 Oct 2024 13:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/maintenance/2024/10/30/software-fileserver-restart</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/maintenance/2024/10/30/software-fileserver-restart</guid>
<category>announcement</category>
<category>maintenance</category>
</item>
<item>
<title>ICER Web Application Login Error - RESOLVED 10/29/2024</title>
<description><p>UPDATE: 10/29/2024 - Logins to RT, Open OnDemand, and the contact forms appear to be fully functional again. Old values might be cached, so you may need to clear your browser cache. You can test by opening a private browser window. Email general@rt.hpcc.msu.edu if you still experience problems.</p>
<p>Due to a change earlier today in how the MSU ID office provides identity information to the CILogon service that ICER uses to authenticate userIDs, some users may be unable to authenticate to ICER web applications, including OnDemand and the ICER contact forms. The issue with the information provided to CILogon has been resolved; however, the MSU single sign-on services must now update all of their records, which may take several hours. If you receive the error message “user is not assigned the application” when attempting to log in to ICER web services, you will need to wait for these records to update. SSH access to the HPCC is unaffected and will remain available during this time.</p>
</description>
<pubDate>Mon, 28 Oct 2024 17:15:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/2024/10/28/ICER-Web-App-Login-Error</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/2024/10/28/ICER-Web-App-Login-Error</guid>
<category>announcement</category>
</item>
<item>
<title>ICER Contact Form UserID Information Lookup Error - RESOLVED 10/29/2024</title>
<description><p>The <a href="https://contact.icer.msu.edu/">ICER contact form</a> is currently experiencing a technical error retrieving userID information for some MSU accounts. This error may prevent you from submitting new account or new research space requests. While we continue to troubleshoot this error, please use the <a href="https://contact.icer.msu.edu/contact">general contact form</a> to submit your requests. This post will be updated as we have more information.</p>
<p>UPDATE 10/29/2024: Details are in a <a href="https://blog.icer.msu.edu/announcement/2024/10/28/ICER-Web-App-Login-Error">later blog post</a>.</p>
</description>
<pubDate>Fri, 25 Oct 2024 18:50:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/2024/10/25/UserID-Information-Lookup-Error</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/2024/10/25/UserID-Information-Lookup-Error</guid>
<category>announcement</category>
</item>
<item>
<title>Gateway Node Operating System Upgrades</title>
<description><p>Starting on Monday, 10/28/2024, and over the next few weeks, we will be upgrading the operating systems on the gateway nodes. If you experience a timeout while attempting to connect to the HPCC during this time, please try again after a short delay or use our <a href="https://ondemand.hpcc.msu.edu">Open OnDemand instance</a>. If you continue to have difficulty logging in to HPCC resources, please let us know by submitting a ticket through our <a href="https://contact.icer.msu.edu/contact">Contact Forms</a>.</p>
</description>
<pubDate>Thu, 24 Oct 2024 12:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/maintenance/2024/10/24/gateway-node-os-upgrade</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/maintenance/2024/10/24/gateway-node-os-upgrade</guid>
<category>announcement</category>
<category>maintenance</category>
</item>
<item>
<title>2024-10-24 Development node reboots - RESOLVED 2024-10-24 0715</title>
<description><p>RESOLVED: 10/24/2024 - All reboots are complete and the development nodes should be available. Please report any issues through our <a href="https://contact.icer.msu.edu">contact forms</a>.</p>
<p>All development nodes will be rebooted to enable a configuration change starting at 6:00 AM on 10/24/2024. It is anticipated that the downtime for each node will be less than 15 minutes and all maintenance should be completed by 8:00 AM on 10/24/2024.</p>
</description>
<pubDate>Wed, 23 Oct 2024 18:00:00 +0000</pubDate>
<link>http://blog.icer.msu.edu//announcement/maintenance/2024/10/23/development-node-reboot</link>
<guid isPermaLink="true">http://blog.icer.msu.edu//announcement/maintenance/2024/10/23/development-node-reboot</guid>
<category>announcement</category>
<category>maintenance</category>
</item>
</channel>
</rss>