Data Warehouse Modernization Initiative

 

The primary project goal is to improve our ability to maintain, and increase the quality of, the University’s custom and historical data assets to better support critical reporting, decision making, and predictive analytics.

Following an extensive RFP process, Plante Moran has been selected to help the University implement a solution leveraging Informatica and Snowflake that will:

  • Implement modern tools for data processing and data warehouse management.
  • Provide tools to strategically manage our data as we would any other high value asset.
  • Develop a data catalog of definitions and robust data warehouse contents.


 
Evolution of the Data Warehouse  (Events leading up to the project.)

1996 - Data Warehouse Created

The Data Warehouse was initially created as, and is still known as, the Decision Support Database (DSD). It was built to generate and maintain the critical data sets required to support Institutional Research and reporting.

2017 - Data Warehouse Experiences End-of-Life Support Issues
  • Production of consistent, timely data became limited by out-of-date, non-upgradable technology.
  • Institutional knowledge was lost over time regarding data definitions, changes in how data was entered, and how data is or was used.
  • Hand-coded warehouse management methods became complex and unsustainable.
2020 - Data Warehouse Adopts Shared Services Model

Executive Leadership committed to long-term support for the creation of a distinct Data Warehouse team, managed by a shared services governance model.

  • Permanent staffing was increased from 1 to 2 FTE via broad system office reorganization.
  • The Institutional Research Shared Services Committee (IRSSC) was formed, with the four IR Directors participating equally in guiding and supporting the Data Warehouse functions.
  • High-level goals were drafted:
    • Continue to maintain the frozen data sets required by the four Institutional Research offices to generate reports and insights used in decision-making.
    • Optimize resources by increasing common data sets, increasing automation, and centralizing the warehousing tasks common to all four Institutional Research offices.
    • Promote consistency by upholding common, official, custom data definitions and applying data quality standards.
2021 - Needs Assessment and Gap Analysis Performed

IRSSC and the new Data Warehousing team began a gap analysis, examining our assets and needs compared to industry best practices.

  • Industry trends in data warehouse methods and technology have seen dramatic improvements, offering significantly more advanced capabilities than the current system can support.
  • Demand for data in standardized, refreshed, and ad-hoc formats exceeds current system capacity, straining the University’s reporting resources in both technology and personnel.
  • Authority and stewardship for data are unclear, and policies and procedures are lacking.
  • The growing number of data systems in use across the University increases the complexity of our data ecosystem and makes it significantly harder to orchestrate vast amounts of data in a unified and meaningful way.
2022 - Data Warehouse Modernization Project Initiated

The ETL, Data Warehouse and Data Catalog Project was initiated. 

Research identified the following needs:

  • A modern data Extraction, Transformation, and Loading (ETL) tool to help bring data from various sources into one location.
  • A robust Data Warehouse management tool to ensure strong security, access, and process automation, and to enhance data quality.
  • A comprehensive Data Catalog to capture both shared and unique data definitions and the various critical reporting metrics across the University.

 

2023 - RFP Launched

An RFP was issued seeking a technology platform to better manage our data assets across disparate data systems and silos, and to accommodate the University’s common and distinct business needs.

With support from OIT, each MAU IR office, and Procurement, the Data Warehouse RFP was launched in September 2023.

  • We received “a historically unprecedented number of inquiries” and “an unexpectedly large number of proposals” in response.
  • The project scoring team was committed to thorough examination and comparison of the diverse options proposed.
    • In the first quarter of 2024, the scoring committee worked to identify a smaller, top-ranked group of 2-5 proposals.
    • The committee scheduled a first round of demonstrations that were technical in nature and focused on the ETL and data warehousing capabilities.
    • The committee solicited feedback from the broader University community, then invited its members to attend a second round of demonstrations focused on business-oriented elements of the data catalog and related reporting features.

 

2024 - Solution Selected

Following an extensive RFP process, Plante Moran has been selected to help the University implement a Data Warehouse & Data Catalog solution leveraging Informatica and Snowflake.  Implementation will likely involve:

  • Redesigning the new Data Warehouse to include MAU-specific tables that reflect the local business practices of each University. University-level data can then be used to inform aggregated system-level tables.
  • Focusing on value-added data utility while addressing current challenges, such as increasing the clarity and context of our data definitions, demarcating historical and new warehouse data, and providing a common framework to support future system-level initiatives.
  • Managing dependencies: the implementation relies on the capabilities and limits of the selected vendor product, the level of support and resources invested by the University, and our ability to balance current workload demands with the work required to stand up a new system.

 

2025 and Beyond

Plante Moran will perform significant pre-assessment and discovery activities to solidify scope, guide implementation, and establish a roadmap prioritizing the iteration phases and areas for potential future expansion.

 



6 Phases of Implementation

Phase 1 - Preliminary
This phase is complete!

This phase included the initial kickoff and related logistics, such as VPN and server access and security forms, as well as an initial review of any existing documentation for the system.

Phase 2 - Install and Configuration
This phase is currently in progress.

This phase includes installation and configuration of the software environments for Snowflake and Informatica, security conversations with University staff, and mentoring sessions.

Phase 3 - Discovery
This phase is currently in progress.

This is a highly interactive phase in which Plante Moran meets with the core project team and selected business areas to ensure a complete understanding of the current data environment.

Phase 4 - Plan

During the Plan phase we'll begin to brainstorm what the build phases will look like, in particular agreeing on the scope and deliverables of the development iterations.  The Plan phase will also include a foundational data governance workshop as well as the initial setup and overview of the data catalog.
Phase 5 - Build Iterations

Build iterations are all about building and delivering the subject areas agreed to during the Plan phase. Each iteration will include analysis of the data, design architecture and documentation, data table setup within Snowflake, and ETL development and testing for each of the Snowflake zones. Some of the builds may also include multiple fact tables as well as aggregates. Each of the loads will be productionalized, optimized, and scheduled. Data catalogs and feeds will be created and updated as we go to ensure the business and technical definitions are captured within the system.
Phase 6 - Training and Support

The final phase will ensure that the handover of the system to the University is seamless and that the University is well placed to continue the journey forward. Support hours are also available for future endeavors.



Looking for more?

This information is intended for University employees only; access it with University SSO credentials.

Members and Sponsors

We want to hear from you!

University employees are encouraged to submit questions, comments, and feedback.
