Background
In 2019, TICAS revamped College Insight (CIS), a unique online resource that provides user-friendly college profiles as well as detailed, research-level higher education data. Whether you’re a research analyst, policy advisor, or school staff member, or are simply interested in learning more about college access, the CIS website provides tools for searching colleges and exploring multiple levels of data aggregation. Our contractor, Wide Eye, maintains the website interfaces and query tools. We provide them with CSV files containing these data, which are then uploaded to the website and accessed through an SQL database. Users can also download a technical dataset that includes all the variables featured on the site (and more) across all levels of aggregation and data years.
The same data are analyzed to write the annual Student Debt Report (SDR). Since 2005, we have published this report to reduce the burden of student loan debt and increase public understanding of debt and its implications for families, the economy, and society. The report features several key statistics on student debt, such as the average debt of college graduates, and summarizes variations and trends in debt over time and by state and college. This analysis is produced from data files that are separate from the CIS data files, but both sets of files are produced in parallel during the same data process.
Data come from multiple sources with different structures and formats. We currently use STATA DO files (120+, referenced by a master DO file) to process the data and output files for both CIS and SDR. We typically take about one month to process the data, and running the code successfully is a process of trial and error. Adjustments are needed as data sources change variables and as we make improvements to the report analysis and website. The current data process is likely sub-optimal for the inputs and outcomes involved, including the SQL infrastructure and website maintained by Wide Eye.
Project tasks
- Advise TICAS in identifying a strategy to improve the data process for combining and aggregating data for the SDR and CIS.
- Advise TICAS on the pros and cons of continuing to use STATA to process data versus other programming languages.
- Redesign and recode the data process for the Wide Eye data files and SDR analyses.
- Redesign and recode the SQL infrastructure on CIS to query data more efficiently.
- Work collaboratively with TICAS staff to run the data process, refine code, and verify that outputs are equivalent to data in previous reports and the current CIS website.
- Train TICAS staff and contractors to maintain the redesigned data process and infrastructure, including periodic data updates for the report and website.
Desired qualifications and skills
- Bachelor’s degree in computer science, information technology, applied math, or related field, or an engineering certification, such as IBM Certified Data Engineer or Google’s Certified Professional.
- Advanced expertise in identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Ability to build processes supporting data transformation, data structures, metadata, dependency management, and workload management.
- Advanced expertise in assembling large, complex data sets that have multiple levels of data aggregation and that are used both for research and to power public-facing web tools and queries.
- A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
- Advanced expertise in both object-oriented and statistical programming languages (e.g., Python, R) to combine and aggregate data from multiple sources into a standardized format.
- Proficiency in SQL and ability to build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources.
- Proficiency in STATA and ability to translate code into data flows and other programming languages.
- Strong project management and organizational skills.
- Ability to work with diverse stakeholders to assist with data-related technical issues and support their data infrastructure needs.
- Preferred: Experience in higher education data analysis or research.
Application
Interested candidates should submit a résumé and a cover letter describing a few examples of products involving data processing and back-end development of web data tools that make you the right fit for this position. Please submit your application at this link.
This position will remain open until filled. TICAS is an equal opportunity employer and is committed to diversity; diverse candidates are encouraged to apply.