Open Source Analytics in the Cloud Support National Course Outcomes
Improving Course Outcomes with Open Source Analytics on AWS
Jisc Effective Learning Analytics Project
For over two years Unicon has been a core partner in the development of Jisc’s Learning Analytics Service for the UK Higher Education and Further Education sectors. During the R&D Learning Analytics Project, Unicon provided infrastructure and a blend of on-site and remote consulting support for the development of an open source predictive data model. Unicon also deployed an open source case management tool (Student Success Plan - SSP) to enable visualization of ‘at risk’ indicators from the predictive data model. These indicators assisted tutors in identifying students and expediting interventions (via SSP) to support student attainment, progression, or retention. As the R&D Learning Analytics Project moves towards being a beta Jisc service, Unicon is currently providing transitional support to the Jisc technical team for the remainder of 2017.
Jisc member institution stakeholders identified that the effective use of learning analytics and student data has the potential to improve student retention, degree attainment, and student satisfaction. Retention and attainment are key issues in the UK. With 10% of students expected to leave higher education without a degree[1] and student loss costing millions of pounds per year[2], improving retention has clear financial value. Beyond just the direct costs, reduced degree attainment has long-term effects on the employability of individuals as well as broader economic effects of a lower-skilled workforce[3][4].
Jisc’s Effective Learning Analytics Project addresses these core education issues by establishing an infrastructure that tracks student learning activity and other student data to identify students at risk of not completing a course. Jisc’s solution includes tools to support interventions that improve course completion, retention, and attainment[5][6][7].
Challenges
Data analytics environments often have complex infrastructure and integrations, large storage needs, and periodically consume large amounts of compute resource. Additionally, analytics environments commonly require high degrees of agility to allow researchers to deploy and experiment with new analytics algorithms.
In the case of learning analytics, there is highly sensitive/restricted personally identifiable information (PII) regarding student performance and demographics. Within the overall environment, individual identifiers need to be preserved so that student interventions can be made; therefore, anonymization is not an option. Moreover, Jisc’s member institutions operate in the EU, with stringent data privacy requirements. Given the privacy requirements and sensitive nature of the data, a strong security infrastructure along with a set of governance/operating controls must be established.
Faster Innovation
Building a dedicated physical infrastructure for learning analytics would easily consume months as well as substantial budget just in labor. AWS supported substantially faster “time to value,” enabling Jisc to move quickly to the proof of concept stage while providing a flexible environment where analytics staff could execute, evaluate, and finalize the predictive models needed to identify at-risk students. The rich Amazon Elastic MapReduce (EMR) ecosystem provides the tooling for bringing in data from source systems, centralizing it in a learning record store, and depositing it in S3 buckets, from which AWS Data Pipeline moves the source data to HDFS. Data is then extracted from the HDFS data store into Apache Hive, with Oozie orchestrating the workflow and executing the predictive model jobs. Finally, the model outputs (at-risk scores) are pushed back to the learning record store and the case management solution.
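The data flow described above can be read as a small dependency graph. The following is a minimal sketch, not the project's actual Oozie workflow definition: the stage names are illustrative labels derived from the paragraph, and Python's standard graphlib module merely stands in for the ordering that an orchestrator such as Oozie would enforce.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# Stage names are hypothetical labels for the steps described in the text.
PIPELINE = {
    "collect_in_learning_record_store": set(),
    "deposit_in_s3": {"collect_in_learning_record_store"},
    "move_to_hdfs": {"deposit_in_s3"},
    "extract_into_hive": {"move_to_hdfs"},
    "run_predictive_model": {"extract_into_hive"},
    "push_scores_to_learning_record_store": {"run_predictive_model"},
    "push_scores_to_case_management": {"run_predictive_model"},
}


def execution_order(pipeline):
    """Return one valid order in which the stages can run."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for step, stage in enumerate(execution_order(PIPELINE), start=1):
        print(f"{step}. {stage}")
```

The two final stages both depend only on the model run, so either may come first; every earlier stage has exactly one possible position.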
Reduced Costs
Provisioning AWS resources only when they are needed, and leveraging the TCO advantages of the more “fixed” resources for the case management solution, substantially reduces overall costs. The EMR cluster is on-demand: AWS Data Pipeline uses CloudFormation templates to provision the cluster when the predictive model needs to be run and de-provisions it when model runs are completed.
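The provision-run-de-provision pattern can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's implementation: provision_cluster and deprovision_cluster are hypothetical stand-ins for the work done by AWS Data Pipeline and CloudFormation, and they only record events so the sketch runs without an AWS account.

```python
from contextlib import contextmanager

# Hypothetical stand-ins for the real provisioning calls; in the project these
# steps were carried out by AWS Data Pipeline and CloudFormation templates.
events = []


def provision_cluster(name):
    events.append(f"provision {name}")
    return name


def deprovision_cluster(name):
    events.append(f"deprovision {name}")


@contextmanager
def on_demand_cluster(name):
    """Provision a cluster for the duration of a block, then tear it down."""
    cluster = provision_cluster(name)
    try:
        yield cluster
    finally:
        # Runs even if the model job raises, so the cluster never lingers.
        deprovision_cluster(cluster)


def run_predictive_model(cluster):
    events.append(f"run model on {cluster}")


if __name__ == "__main__":
    with on_demand_cluster("analytics-emr") as cluster:
        run_predictive_model(cluster)
    print(events)
```

Tying teardown to a finally clause reflects the cost argument: a failed model run should not leave a billable cluster behind.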
The case management solution is the open source Student Success Plan (SSP). SSP relies on a classic web application architecture, with the front end running on EC2 and the back end on RDS. Running web applications in AWS can reduce TCO by as much as 80%[7]. The Jisc project environment includes dev, test, and production, so cost savings quickly compound.
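To make the multi-environment effect concrete, the arithmetic can be sketched as follows. Every figure here is a hypothetical placeholder except the 80% rate, which is the upper bound quoted above; the project's real costs are not stated in this document.

```python
# Hypothetical figures for illustration only; none come from the project.
ON_PREMISES_ANNUAL_COST = 50_000  # assumed cost per environment, in GBP
SAVINGS_RATE = 0.80               # the "as much as 80%" upper bound cited above
ENVIRONMENTS = ["dev", "test", "production"]


def annual_savings(on_prem_cost, savings_rate, environments):
    """Upper-bound yearly savings when every environment saves at the same rate."""
    return on_prem_cost * savings_rate * len(environments)


if __name__ == "__main__":
    total = annual_savings(ON_PREMISES_ANNUAL_COST, SAVINGS_RATE, ENVIRONMENTS)
    print(f"Upper-bound annual savings: GBP {total:,.0f}")
```

Because each environment saves independently, the total scales with the number of environments, which is why three environments matter more than one.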
Security
To address jurisdictional mandates related to the Data Protection Act (DPA) and the EU General Data Protection Regulation (GDPR), the Unicon team established a secure architecture for the run-time environments that would ensure all data and data processing could be confined to AWS regions within the EU.
The architecture, based on AWS best practices, constrained all data to a single region, and strict processes controlled any movement of data outside of that region. The run-time architecture was implemented in a tiered structure, with carefully controlled access to public-facing services, contained within a virtual private cloud populated with isolated subnets. Sensitive data resided in the most protected tier, and this structure was replicated across several availability zones within the region for fault tolerance.
Encryption was leveraged throughout (EBS volumes, RDS instances, EMR clusters), and AWS key management services were employed to simplify key rotation and allow increased rotation frequency. Amazon WorkSpaces were used in the appropriate environments, enabling the team to interact with the data while adhering to the tenets of the DPA/DPD. In addition, automated processes were created to perform quarterly audits of accounts across all instances, with special emphasis on privileges and user access. Both CloudWatch and CloudTrail monitored system-level activity for audit and forensic purposes.
By combining automation, AWS services aligned with AWS best practices, and Unicon’s internal policies and procedures, the team adhered to the strict data governance policies imposed within the UK and delivered a secure, flexible, and verifiably compliant solution to the client. Unicon was able to bring its experience and technical expertise to the Jisc Learning Analytics Project, particularly in terms of addressing challenges, the need for faster innovation, reduced costs, and robust security.
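The single-region and encrypt-everything rules in this section lend themselves to an automated check. The sketch below is a minimal illustration, not the project's tooling: the Resource record, the inventory entries, and the choice of eu-west-1 as the permitted region are all assumptions, since the text does not name the region or describe how resources were inventoried.

```python
from dataclasses import dataclass

ALLOWED_REGION = "eu-west-1"  # assumption: the text does not name the region


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str        # e.g. "ebs", "rds", "emr"
    region: str
    encrypted: bool


def find_violations(resources, allowed_region=ALLOWED_REGION):
    """Return (resource name, reason) pairs for every rule a resource breaks."""
    violations = []
    for res in resources:
        if res.region != allowed_region:
            violations.append((res.name, "outside allowed region"))
        if not res.encrypted:
            violations.append((res.name, "not encrypted"))
    return violations


if __name__ == "__main__":
    # Hypothetical inventory; names are made up for the example.
    inventory = [
        Resource("ssp-db", "rds", "eu-west-1", True),
        Resource("scratch-volume", "ebs", "eu-west-1", False),
        Resource("stray-cluster", "emr", "us-east-1", True),
    ]
    for name, reason in find_violations(inventory):
        print(f"{name}: {reason}")
```

In practice the inventory would come from the cloud provider's own APIs; the point of the sketch is only that both rules reduce to simple, repeatable predicates.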
Meeting the Challenges: Unicon Services for AWS
With experience in many AWS services, including EC2, RDS, CloudFormation, EMR, and Data Pipeline, Unicon is able to help clients fully realize time-to-market advantages, high levels of reliability and scalability, and demand-based sizing and costs. Unicon staff hold current AWS certifications, including the Professional-level AWS Certified Solutions Architect and DevOps Engineer certifications.
Unicon’s team of AWS professionals developed and deployed components of Jisc’s Effective Learning Analytics Project to the cloud to meet stringent data privacy requirements, increase agility, and satisfy storage and compute resource needs. This secure cloud architecture enabled Jisc to realize faster innovation, reduced cost, and enhanced security for its data analytics environment.
Unicon is an Advanced Consulting Partner in the AWS Partner Network (APN). Combined with deep expertise in application development and in deploying and operating applications on AWS, this allows Unicon to leverage AWS to its fullest potential.
Results
Faster Innovation
- The rich Amazon Elastic MapReduce (EMR) ecosystem supported substantially faster “time to value,” with the elasticity to process varying amounts of data using Hive for ETL and Spark for machine learning algorithms
- Workflows could be orchestrated using Oozie, extracting data from the HDFS data store into Apache Hive and ultimately channeling predictions into other systems such as learning record stores and case management solutions
Reduced Costs
- The EMR cluster runs on-demand: AWS Data Pipeline uses CloudFormation templates to provision the cluster when the predictive model needs to be run and de-provisions it when model runs are completed
- Cost savings compounded across the dev, test, and production environments, each running a classic web application architecture on EC2 and RDS
Security
- The run-time architecture was implemented in a tiered structure with carefully controlled access to public-facing services contained within a virtual private cloud populated with isolated subnets. Sensitive data resided in the most protected tier and this structure was replicated across several availability zones within the region for fault tolerance
- Encryption was leveraged throughout (EBS volumes, RDS instances, EMR clusters) and AWS key management services were employed to simplify key rotation and allow increased rotation frequency
- Amazon WorkSpaces were used in the appropriate environments, enabling the team to interact with the data while adhering to the tenets of the DPA/DPD
- Automated processes were created to perform quarterly audits of all accounts across the instances with special emphasis on privileges and user access
- CloudWatch and CloudTrail services monitored system level activity for audit and forensic purposes
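The account audits listed above could take roughly the following shape. This is a sketch under loud assumptions: the 90-day inactivity threshold, the privilege names, and the account records are all hypothetical, because the text says only that audits ran quarterly with emphasis on privileges and user access.

```python
from datetime import date, timedelta

# Hypothetical thresholds and privilege names; the text specifies neither.
MAX_IDLE = timedelta(days=90)
ELEVATED = {"admin", "root"}


def audit_accounts(accounts, today):
    """Flag accounts that hold elevated privileges or have gone unused."""
    findings = {}
    for acct in accounts:
        reasons = []
        if ELEVATED & set(acct["privileges"]):
            reasons.append("elevated privileges")
        if today - acct["last_login"] > MAX_IDLE:
            reasons.append("inactive")
        if reasons:
            findings[acct["user"]] = reasons
    return findings


if __name__ == "__main__":
    # Made-up accounts for the example.
    accounts = [
        {"user": "analyst1", "privileges": ["read"], "last_login": date(2017, 9, 1)},
        {"user": "ops1", "privileges": ["admin"], "last_login": date(2017, 9, 20)},
        {"user": "former", "privileges": ["read"], "last_login": date(2017, 3, 1)},
    ]
    print(audit_accounts(accounts, today=date(2017, 10, 1)))
```

Passing the audit date in as a parameter keeps the check deterministic, which makes each quarterly run reproducible for later review.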
[1] HESA. Non-continuation rates summary: UK Performance Indicators 2015/16. March 2017.
[2] Webb. Learning analytics, organising the organisation. June 2017.
[3] Newman and Beetham. Student digital experience tracker 2017: the voice of 22,000 UK learners. Jisc. June 2017.
[4] https://www.jisc.ac.uk/rd/projects/effective-learning-analytics
[5] Sclater. Using Learning Analytics to Enhance the Curriculum. Jisc. June 2017.
[6] Sclater. Planning Interventions with At Risk Students. Jisc. June 2017.
[7] Varia. The Total Cost of (Non) Ownership of Web Applications in the Cloud. Amazon Web Services. August 2012.
Jisc (www.jisc.ac.uk) is the UK’s expert member organisation for digital technology and digital resources in higher education, further education, skills and research. Jisc’s vision is to make the UK the most digitally advanced education and research nation in the world.
Jisc plays a pivotal role in the development, adoption and use of technology by UK universities and colleges, supporting them to improve learning, teaching, the student experience and institutional efficiency, as well as enabling more powerful research.
At the heart of Jisc’s support is Janet – the UK’s world-class National Research and Education Network (NREN). Owned, managed and operated by Jisc, Janet comprises a secure, state-of-the-art network infrastructure spanning all four nations of the UK.
Further details about Jisc's R&D Learning Analytics Project and the new beta Learning Analytics Service are available from https://www.jisc.ac.uk/rd/projects/effective-learning-analytics and https://analytics.jiscinvolve.org/wp/.