HOME SOLUTIONS SERVICES NEWS ABOUT US  
 
 
 
 


ETL Highlights

The ETL (or Extract Transform Load) modules is one of the most challenging aspects of a data warehousing project.  Many data warehousing projects stall at this stage.  ZogoTech's approach is much simpler than others and helps ensures a successful deployment.

Extraction

In this phase, the Estudias ETL Manager retrieves information from your student information system (Datatel Colleague, SCT Banner, PeopleSoft, etc) and puts it into text files used in the Load and Transform stages.

  • Estudias only requires that your student information system be able to generate text files (CSV or tab separated) in certain formats.

    • For systems that implement ODBC, queries can be written fairly easily to extract the data into text files
    • For UniData (Datatel Colleague) systems, ZogoTech has an extraction tool to retrieve the information in the correct formats (details)
  • If your student information system changes, you will only need to change how those files are extracted.
  • Extracts are small and fast, composed of a few fundamental building blocks.

    For example, Estudias does not download GPA (often a slow, calculated field on the student information system).  Instead the Estudias ETL Manager downloads grade points, GPA credits to compute the GPA, allowing users to analyze metrics such as the GPAs of students in types of classes, or the average GPA of groups of students.

  • Users have fine-grained control on how to schedule and partition extracts.

Transformation (Pre-load)

In this stage, minor cleanups (such as converting the semester formats to a common format) are applied to the downloaded files.  Most transformations are actually done after the information is loaded, so this is sometimes called ELT rather than ETL. 

  • Files are merged to remove duplicates and to make sure that more recent records take precedence over older records.

  • The database schema is examined and any missing schema elements (domains, function, tables, and indexes) are created.

  • Record level cleansing is performed and CSV files that can be bulk-loaded are created.

Load


At this stage, information is loaded into the Estudias database.

  • Preprocesses common queries to improve performance and scalabilit

  • Table and database level cleansing is performed via a series of SQL statements.

  • Creates, designs and processes OLAP Cubes.

Transformation (Post-load)


Running the transformations after the database is loaded means it's easy to add new transformations using standard SQL rather than a proprietary language.

  • Involves a multi-pass update stage to handle dependencies between the transformations.

  • Over 100 fields are calculated by Estudias in this phase such as calculations of

    • Starting and ending cohorts

    • Retaking classes

    • Stop-ins / stop-outs

    • Transfer-ins / transfer-outs (via National Student Clearinghouse integration)

    • Term Indices so that cohorts can be compared to each other (enabling users to compare how 2000 Fall cohorts did in their first three semesters compared to 2001 Fall cohorts, for example)

Also see: Detailed look at ZogoTech's Open Architecture