Storage system failure causes tech services shutdown

Many tech systems and services at the University of Connecticut went down Tuesday due to issues with the university’s storage system.

A message on the UConn Information Technology (UITS) website attributed the problems to the university’s storage environment, which acts like the “hard disk” for UConn’s systems and affects authentication and local services.

“All of our locally-provided systems and services rely on storage either directly or indirectly and so they were all impacted,” Vice Provost for Information Technology and Chief Information Officer Michael R. Mundrane said. “Some services were more tolerant of these issues and, while affected, continued to operate. Others were less tolerant and at times went down.”

Mundrane said an underlying problem with the system has manifested as intermittent issues that began yesterday and continued today, disrupting services such as printing, mail and HuskyCT.

“Many, if not most, locally-provided services can be impacted,” Mundrane said. “IT Services at a university are interdependent. This allows them to work in concert, but this also means that one failure can have a cascade effect.”

The storage performance issues shut down the NetID system, which is used to sign in to most of the tech services that UConn students, faculty and staff use on a daily basis.

Mundrane said UITS has worked to reset significant portions of the storage environment over the past 24 hours. UITS is working with its storage vendor to understand the problem’s root cause and to find a more permanent solution, he said.

“We have declared this as a high-priority service outage and have escalated the problem through their support channels,” Mundrane said.

UITS is focused on fixing the services that the UConn community uses most, according to Mundrane.

“UITS will prioritize restoration of community interfacing services first,” Mundrane said. “This should fix the issue from their perspective more quickly.”

While UITS has had IT outages before, Mundrane said this problem with the primary storage environment is one he has never encountered during his four years at UConn.

Mundrane said UITS will work to better understand the underlying problems that caused the cascade of service failures and to develop strategies that can be built into the existing structure to mitigate the impact of such failures in the future.

“Foundational systems, such as storage and authentication, are designed and operated with robustness as the primary design consideration. Unfortunately, no system can be implemented that might never break,” Mundrane said. “Our goal is to minimize the likelihood of outages, and when they occur, to make service restoration our top priority to limit the consequences.”


Anna Zarra Aldrich is a staff writer for The Daily Campus. She can be reached via email at anna.aldrich@uconn.edu. She tweets @ZarraAnna.