ARIS

Messages

Wed Nov 6 14:01 EET 2024

Dear ARIS users,

We have powered off ARIS after a serious incident that took place on 29/10/2024 in the building infrastructure where the system is installed. Our initial estimate is that computing and storage have not been affected.

Currently, we are working to restore the safety of the area and replace the affected equipment (electrical, cooling and others).

This will take at least 8 weeks. After that, the system will go live again.

Whenever new information about the progress is available, it will be announced on the ARIS page: https://doc.aris.grnet.gr/

Furthermore, due to the incident, the results of the 17th production call, as well as of any applications for preparatory projects, will be announced after the system resumes operation.

The projects that expired during the downtime period will be extended accordingly.

Thank you for your understanding.

Wed Oct 30 EET 2024

Since 2024-10-29 17:31 the ARIS system has been shut down as a precaution due to maintenance issues with the building facilities. Maintenance staff are working to restore conditions and, as soon as it is safe, ARIS will resume operation. We will provide updates.

Tue Oct 29 17:31 EET 2024

   Cooling system problem. System powered off. Running jobs killed.

Mon Sep 02 18:54 EEST 2024

   System Back in production

Mon Sep 02 17:03 EEST 2024

   Cooling system problem. All compute nodes powered 
   off. Running jobs killed.

Tue Aug 13 13:49 EEST 2024

   System in production.

Mon Aug 12 13:55 EEST 2024

   Due to power grid instability, all compute nodes
   powered off. Running jobs killed; they will start
   again when the system is powered on.

Tue Jun 11 19:13 EEST 2024

System back to production

Thu Jun 06 13:28 EEST 2024

     System will be unavailable in the period         
     2024-06-10 22:00 - 2024-06-11 20:00 for power    
     infrastructure management.                       
     Jobs pending with reason ReqNodeNotAvail have a
     requested time limit that overlaps the downtime period.

Tue Apr 30 23:34 EEST 2024

System in full production.

Mon Apr 08 16:30 EEST 2024

System will be unavailable in the period
2024-04-30 12:00 - 2024-05-01 01:00 for building power management.

Fri Apr 05 15:45 EEST 2024

Brief network interruptions are possible on 09/04/2024 between 11:00 and 13:00.

Fri Mar 15 18:29 EET 2024

Storage and login available.

Queues will be enabled gradually to ensure storage stability.

Fri Mar 15 06:23 EET 2024

Storage failure. System unavailable.

All running jobs killed.

Sun Jan 14 11:02 EET 2024

System back in full production.

Sun Jan 14 02:15 EET 2024

Power Outage, system went down. All running jobs killed.

Tue Sep 05 12:23:00 EEST 2023

System back in production

Fri Sep 01 10:23 EEST 2023

System will be unavailable in the period
2023-09-05 00:00 - 2023-09-05 17:00 for area power management.

ReqNodeNotAvail means that the walltime request of your job 
extends into the upcoming scheduled cluster maintenance and 
therefore the scheduler will not run your job before maintenance.
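
The rule described above can be illustrated with a small check. This is only a sketch of the overlap logic, not the scheduler's actual implementation; the function name `would_be_held` and the example submission time are hypothetical, while the maintenance window is the one announced in this message.

```python
from datetime import datetime, timedelta


def would_be_held(earliest_start, walltime, maint_start, maint_end):
    """Return True if a job starting at earliest_start and running for its
    full requested walltime would overlap the maintenance window."""
    expected_end = earliest_start + walltime
    return earliest_start < maint_end and expected_end > maint_start


# Maintenance window announced in this message.
maint_start = datetime(2023, 9, 5, 0, 0)
maint_end = datetime(2023, 9, 5, 17, 0)

# Hypothetical job that could start on 2023-09-04 at 12:00.
start = datetime(2023, 9, 4, 12, 0)

print(would_be_held(start, timedelta(hours=24), maint_start, maint_end))  # True
print(would_be_held(start, timedelta(hours=6), maint_start, maint_end))   # False
```

In other words, requesting a shorter walltime that ends before the maintenance begins may allow a job to start earlier.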

Tue Aug 29 13:00 EEST 2023

System back in production

Tue Aug 29 2023

Compute nodes will be unavailable at 2023-08-31 02:00 - 16:00

Wed May 10 13:02:00 EEST 2023

System back in production

Thu Apr 27 2023

System will be UNAVAILABLE in the period
2023-05-09 23:00 - 2023-05-10 16:00 for area power maintenance

Fri Mar 24 22:00:00 EET 2023

System back in production

Fri Mar 17 2023

System will be unavailable in the period
2023-03-24 16:00 - 24:00 for building power maintenance

Fri Oct 06 2022

System will be fully unavailable in the period
2022-10-14 22:00 - 2022-10-15 22:00 for building power management.

Tue Aug 09 02:21:00 EEST 2022

Area power outage. All jobs running at that time were killed; they were restarted once power was restored, after 2022-08-09 04:25.

Sun Apr 10 2022

System will be fully unavailable in the period 2022-05-04 20:00 - 2022-05-05 10:00 for area power management

Fri Feb 04 13:25:00 EET 2022

Power maintenance canceled. System in production.

Thu Feb 03 21:50:00 EET 2022

Possible power maintenance on 2022-02-05 10:00 - 12:00. Jobs with an expected end time after 2022-02-05 09:00 will not start (Reason: ReqNodeNotAvail).
Login nodes and storage are not affected.
Pending jobs will start after 2022-02-05 12:00.

Tue Jan 25 13:30:00 EET 2022

Compute nodes back in production.

Tue Jan 25 04:20:00 EET 2022

Storage and login nodes available, compute nodes
powered off for investigation.

Tue Jan 25 02:40:00 EET 2022

Problems with power/cooling. System Powered off.

Wed Jul 22 2020

System will be unavailable in the period 2020-07-23 09:00 - 18:00 for
power maintenance.

Wed May 13 2020

System will be unavailable in the period
2020-05-16 23:00 - 2020-05-17 23:00
for area power maintenance.

Fri May 08 2020

mmlsquota command is now available.

Wed May 06 13:00:00 EEST 2020

The network problems have been resolved. The system is operating normally. Jobs that were queued or running in the period
06.05.2020 05:00 - 06.05.2020 13:00 have failed.
Please resubmit them.

Wed May 06 10:00:00 EEST 2020

There are problems with the InfiniBand network
switch. Until they are resolved, the Slurm queues are in
drain state. GPFS filesystems may become unstable.

Fri Mar 20 2020

mmlsquota command will be unavailable until further notice.
As a workaround, you can see your quota usage in the file:
/users/quota/$USER
This file will be updated every 2 hours.
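
As a convenience, the workaround above could be scripted. This is a minimal sketch: the helper names `quota_file_for` and `read_quota` are hypothetical, and because the layout of the quota file is not described here, its contents are printed as-is rather than parsed.

```python
import os
from pathlib import Path


def quota_file_for(user, base="/users/quota"):
    """Build the path of the per-user quota file named in the message above."""
    return Path(base) / user


def read_quota(user=None, base="/users/quota"):
    """Return the raw text of the quota file, or None if it cannot be read."""
    user = user or os.environ.get("USER")
    if not user:
        return None
    try:
        return quota_file_for(user, base).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    text = read_quota()
    print(text if text is not None else "Quota file not available.")
```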

Thu Mar 12 2020

GRNET put in place measures for all staff members in the effort to
slow the spread of COVID-19.
Starting on Thursday, 12 March, all HPC staff is working remotely.
For all technical requests please continue to contact us via email.
Contact details at https://hpc.grnet.gr/en/contact/

Thu Jan 2 10:54:24 EET 2020

System will be unavailable on 02-Jan-2020 from 10:00 to 18:00.

Mon May 20 19:01:49 EET 2019

Scheduled system outage (power outage in area) at 2019-05-26.
Jobs that are expected to finish after 2019-05-25 00:00:00 will start after power outage.

Wed Oct 24 13:21:56 EEST 2018

Connections to login.aris.grnet.gr will not be possible on 2018-10-25 between 16:00 and 17:00 because of network maintenance.
The rest of the system and the running jobs will not be affected.

2018-06-27

Scheduled system outage (power outage in area) at 2018-06-28
Jobs that are expected to finish after 2018-06-28 00:00:00 will start after power outage.

System will be unavailable at 2018-05-31 after 10:00 for urgent maintenance.

Thu May 24 17:28:10 EEST 2018

The compute nodes will be unavailable between 15.00 and 20.00 on Friday 25 May 2018 because of a power outage.

Wed Mar 21 17:55:36 EET 2018

All systems are operational

Wed Mar 21 14:11:16 EET 2018

No queues are available for job submission due to storage problems.

Wed Feb 21 15:54:28 EET 2018

Unscheduled power outage in the building. All compute nodes are down.

Fri May 12 14:33:47 EEST 2017

ARIS will be down in the period 2017-05-12 15:00 - 18:00 due to unscheduled area power maintenance. Running jobs will be killed.

Tue May 9 17:04:55 EEST 2017

ARIS will be unavailable due to area power maintenance in the periods 2017-05-19 11:00 - 21:00 and 2017-05-26 22:00 - 2017-05-27 22:00.

Wed Dec 14 16:15:55 EET 2016

System will be unavailable in the period 2016-12-16 20:00 - 2016-12-17 20:00

Mon Jul 18 14:57:10 EEST 2016

SLURM is back in production.

Sun Jul 17 20:05:01 EEST 2016

SLURM is unavailable

Mon May 23 10:27:28 EEST 2016

System is reserved for maintenance in the period:
2016-05-26 00:00:00 - 2016-05-29 23:59:59
If a job’s estimated end time falls in the period, job execution will start after maintenance period.

Fri Apr 15 14:00:58 EEST 2016

The system will be unavailable between 15.30 and 19.30 today
(Friday April 15 2016) because of a power outage.

Tue Mar 22 19:29:38 EET 2016

System Back in Production.

System will be unavailable at 2016-03-22 08:00 - 20:00 for System Maintenance and Upgrade.
Please schedule your jobs accordingly

Wed Mar 16 14:41:35 EET 2016

All the compute nodes have been powered on. The system is back in production

Wed Mar 16 12:21:55 EET 2016

Because of a problem with the /work (/dev/gpfs0) filesystem, all systems including the login nodes are currently down. We apologise for any inconvenience.

Tue Mar 15 19:25:14 EET 2016

All the compute nodes have been powered on. The system is back in production

Tue Mar 15 17:33:03 EET 2016

THERE IS A POWER OUTAGE. ALL COMPUTE NODES HAVE BEEN SHUT OFF.

Tue Nov 17 18:30:08 EET 2015

ARIS cooling system problem resolved, ARIS is back in production since 2015-11-17 18:30. Thank you for your understanding.

Mon Nov 16 13:08:08 EET 2015

There is an ambient temperature problem because of an A/C failure; all compute nodes have been shut down. We apologise for any inconvenience.