HOME SOLUTIONS SERVICES NEWS ABOUT US  
 
 
 
 

Open Architecture

Estudias is built on a very open architecture with standard tools and protocols. This allows a high degree of customization at every point in the process. The architecture also allows power users to access the data in various stages.

Important points

  • The entire process is atomic so that if there are any problems, the transactions are rolled back and the database is not corrupted.
  • Detailed log file
  • Separating the processes and creating intermediate CSV files gives institutions tremendous flexibility. It means that the ETL process can be started and stopped at any point in the process. For example, institutions can write their own Extract program and then call the Estudias ETL process to complete the transformations and loading. Institutions also have a lot of power to stage things at different times (i.e. the extraction process could happen at one time, the transformation at another time, and loading at still another).

  Extraction -- The Estudias extraction engine automatically downloads information at a scheduled time (via SSL, SSH, ODBC, etc). The extraction tool supports partitioned extracts and round-robin extract windows (i.e. if a download can't finish by a specified cutoff time, the extraction tool will abort and restart the next day where it left off). Institutions have used this engine to automatically download other information (even if it's not related to Estudias).

Extract from multiple databases. Specifying the tables, fields

  Extracted CSV Files: As part of ZogoTech's Open Architecture, information is extracted into comma separated value files, so that power users (IR, IT) can access the raw data directly. These files can easily be imported into Excel, SPSS, or SAS. Third party applications can be added to further manipulate the downloaded .CSV files or use them as input.

Alternatively, these files can be encrypted for security or written to an encrypted virtual disk. In this case, the data remains encrypted throughout the ETL process and is never written to disk unencrypted.

  Pre-Load Transformations: Estudias supports pre- and post-load transformations. It is often easier to do simple text transformations here (i.e. date conversion) as opposed to in SQL Server (step 6). In the pre-load transformations, the CSV files are manipulated and metrics are automatically computed to improve performance and scalability. At this point other pre-defined data sources (i.e. National Student Clearinghouse) are combined with institutional data to compute transfer-outs, transfer-ins, etc.

  Transformed CSV Files: Transformed Files are available for power users which, at this point, have metrics not available in files in step 2 (i.e. cohorts, stop-ins, stop-outs, transfer outs). Again, these files can be easily manipulated using Excel, SPSS, SAS. As in Step 2, institutions may write their own programs to customize these files, although it is usually easier to use SQL as in step 5.

  Database Preparation: Database is automatically created or upgraded if necessary. In case of an upgrade, the schema is migrated using ZogoTech's Seamless Schema Migration (SSM) algorithms, automatically creating or changing missing schema elements. The database is backed up before any changes are made. Users are disconnected to protect the integrity of the data and prevent disruptions to the load process. Permissions are applied. Security is sever-based and integrated with Active Directory / LDAP so end users can access the data warehouse via any tool.

 Load and Post-Load Transformations: The .CSV files are bulk loaded into the database. Remaining transformations are done on the database using standard SQL statements for easy extensibility. Once information is on the server, users can access it via any ODBC connection (i.e. Excel, SPSS, SAS).

Because many transformations are quite complex, Estudias supports arbitrarily deep hierarchical transformations. ZogoTech's Automated Dependency Calculation (ADC) Engine ensures that dependencies between the transformations are automatically computed and processed in the correct order (i.e. calculation of term indices cannot be calculated until starting cohorts are calculated which in turn cannot be calculated until term enrollment and FTIC is calculated).

 Data Quality Checks: Warnings are created if various conditions are met: i.e. all rows in a table field are empty or the number of rows in a table has significantly increased or decreased since the last run. In one institution, Estudias caught a subtle problem in their ERP upgrade by noticing that a field had a significant number of empty values compared to before the run before the ERP upgrade.

 OLAP Cube Design and Processing: Fact and dimension tables (star and snowflake schema) are created and OLAP cubes are created (if they don't exist) and processed. Users can access these cubes from any OLAP-compliant tool (i.e. MapPoint, Excel, SPSS, SAS).