Just as you wouldn’t use a sledgehammer to hang a picture, university IT must find the right tools for the analytics challenge.
Data analytics has become commonplace on college campuses as universities have developed into repositories for all sorts of data, covering everything from student success and recruitment to enrollment management, faculty retention, finance and budgeting. And those are merely the administrative records. Universities also house some of the largest research institutions in the country, which create tremendous amounts of data on a daily basis.
However, storage can be the most prohibitive obstacle in creating a data analytics system. As the application of data grows, so too will the volume collected. University IT departments will have to find ways to handle the massive amounts of data that need to be stored and accessed.
But in a time of shrinking budgets, universities are hard-pressed to find ways to effectively manage this deluge.
Know the data first
There are many types of data. It can be categorized as:
- Structured data, which resides in clearly defined fields within a database. Structured data is normally associated with academic or administrative offices. The data behind the student information system or other administrative ERP systems is often structured.
- Unstructured data, which is information that doesn’t fit into a traditional database format.
- Semi-structured data, which has components of both.
Unstructured and semi-structured data, such as video or social media, is where the data explosion is really taking place at the university level. Colleges need to be able to store this data and provide high-speed access to it from multiple sources in order to turn it into actionable information.
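As a rough illustration of what "knowing the data" might look like in practice, the sketch below tallies a list of file names into the three categories above. The suffix-to-category mapping and the function names `categorize` and `inventory` are hypothetical, chosen only for this example; a real inventory would need far richer rules than file extensions.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical mapping from file suffix to data category (an assumption
# made for illustration, not a standard classification).
CATEGORY_BY_SUFFIX = {
    ".csv": "structured",
    ".json": "semi-structured",
    ".xml": "semi-structured",
    ".mp4": "unstructured",
    ".jpg": "unstructured",
}


def categorize(filename: str) -> str:
    """Return the assumed category for a file, or 'unknown' if unmapped."""
    suffix = PurePath(filename).suffix.lower()
    return CATEGORY_BY_SUFFIX.get(suffix, "unknown")


def inventory(filenames) -> Counter:
    """Count how many files fall into each category."""
    return Counter(categorize(name) for name in filenames)
```

Calling `inventory` on a directory listing gives a first, coarse picture of how much of a collection falls into each category before any storage decision is made.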
Three methods are most often used for data storage: flash storage, traditional mechanical disk storage and cloud storage. Much as you wouldn’t use a sledgehammer to hang a picture in your living room, you need to be sure to use the right storage tool for the job.
Flash Storage
When it comes to data storage, flash storage is synonymous with speed and performance. Solid state flash drives have no moving parts; information is stored in microchips. This lack of moving parts is what makes solid state drives so much faster than hard disk drives.
However, this speed comes at a higher cost. When universities analyze their storage needs, they should look critically at which data is accessed most frequently and by the most people. Solid state drives are more costly upfront, but they produce results much faster, which can cut costs in the long run. Mission-critical applications that are sensitive to performance should be deployed on flash. Some storage vendors also use flash as a caching tier to speed up workloads on traditional mechanical disk storage.
Traditional Mechanical Disk Storage
Considered by many to be the faithful workhorse of the storage landscape, hard disk drives are a cost-effective way to handle routine data for many workloads. While slower than solid state drives, hard disk drives offer greater capacity, so they can store far more information. They are a good home for data that is not in active use, or whose use does not depend on speed and access from multiple sources. Some vendors make this storage more efficient with technologies like compression and deduplication. Virtual copies, or snapshots, can also simplify data protection and make backup and recovery very fast.
Cloud Storage
While universities will want to keep their most sensitive data on-premises, unstructured and semi-structured data is especially well suited to cloud storage, with significant advantages:
- Cost reduction: Data analytics can require a significant amount of computing resources to analyze and process large volumes of data. Using the cloud reduces costs to the university, particularly in a pay-per-use, utility pricing model.
- Reduced overhead: By taking advantage of cloud technologies, IT teams can reallocate dollars normally spent on physical hardware. Where private on-premises data centers are at capacity, using the cloud can be a strategy to extend the life of an existing data center.
- Provisioning and scaling: The cloud makes it easy to scale as required by the amount and type of data being processed. This is especially helpful for research, because routine data no longer has to rely on high performance computers (HPCs), freeing them up for research efforts.
- Data sovereignty: Institutions can keep data in their own private data center while using public cloud compute resources. This model lets them maintain complete sovereign control over their data, avoid cloud lock-in and keep a strong negotiating position with their cloud vendor.
Data analytics allows university faculty and administrators to innovate, support, and enhance the college experience for students and staff alike. In most university settings, the right storage tool for the job is a combination of all three. Taking the time to thoroughly analyze the data you collect and how it needs to be accessed allows IT teams to determine the right combination of storage options to suit each institution’s needs. Providing the proper support on the storage side can help smooth the way for the continued growth of big data on college campuses for years to come.
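The matching of data to storage described in this article can be sketched as a simple rule set. This is a minimal illustration, not a sizing tool: the `Dataset` fields, the `choose_tier` function and the `HOT_READS_PER_DAY` threshold are all hypothetical, and a real assessment would also weigh cost, capacity and compliance requirements.

```python
from dataclasses import dataclass

# Hypothetical cutoff for "frequently accessed" data; an assumption
# for illustration only.
HOT_READS_PER_DAY = 1000


@dataclass
class Dataset:
    name: str
    reads_per_day: int
    latency_sensitive: bool   # performance-sensitive, mission-critical use
    must_stay_on_prem: bool   # e.g. the institution's most sensitive data
    structure: str            # "structured", "semi-structured" or "unstructured"


def choose_tier(ds: Dataset) -> str:
    """Pick 'flash', 'disk' or 'cloud' using the article's rules of thumb."""
    # Performance-sensitive or heavily accessed data goes to flash.
    if ds.latency_sensitive or ds.reads_per_day >= HOT_READS_PER_DAY:
        return "flash"
    # Sensitive or routine structured data stays on on-premises disk.
    if ds.must_stay_on_prem or ds.structure == "structured":
        return "disk"
    # Remaining unstructured and semi-structured data is a cloud candidate.
    return "cloud"
```

For example, under these assumptions a lightly used archive of lecture video with no residency constraints would be a cloud candidate, while a heavily queried registration database would land on flash.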
Matt Lawson is principal architect for state, local government and education at NetApp.