Data Science Courses


Data Science

DATA 110 Explorations in Data Science (4)
Data Science is at the forefront of the big data revolution. Governments, companies, nonprofits, and health care providers are collecting, storing, and analyzing vast amounts of data to extract information about us and make predictions about our lives. The mathematical and technological aspects of data science have been central to its success, yet they cannot exist in isolation. The context in which data is collected and used, and potentially misused, shapes the impact on individuals and society as a whole. Therefore, the study of issues involving data collection, analysis, and communication from multiple contexts will be a central theme of this class. (WCore: WCSAM, QE)
DATA 150 Data and Society (4)
Quantitative literacy is increasingly important in our world of information. The primary goal of this course is to learn about data and how it’s used. Along the way, we will learn how to develop basic tools to analyze and visualize data, read and evaluate research claims, and report research findings in honest and ethical ways. (This course may not be taken for credit if a student already has credit for DATA 220.) (WCore: QE)
DATA 220 Introduction to Statistics (4)
Statistics is the study of data. This course will develop tools for analyzing data from a variety of fields. We follow the process from data gathering (sampling methods and experimental design) to exploratory data analysis (graphs, tables, charts, and summary statistics) to inferential statistics (hypothesis tests and confidence intervals) using simulation and sampling distributions. A key component of the course is the introduction of the statistical language R for analysis and R Markdown for the presentation of statistical analysis. (WCore: QE)
DATA 300 Special Topics in Data Science (1-4)
Special courses offered when there is sufficient demand.
DATA 307 Databases for Data Science (2)
A study of the application of relational databases to information collection and extraction. SQL queries are studied in depth. Prerequisite: CMPT 190.
DATA 350 Statistical Modeling (4)
The general linear model is a powerful framework for modeling relationships in data analysis. This course establishes the theory and application of regression models from simple and multiple regression through ANOVA and logistic regression. In addition to building models, we will also learn to diagnose model fit and handle a wide range of possible complications. We will use the statistical language R for analysis and R Markdown for the presentation of statistical analysis.
DATA 360 Data Science with Python (4)
Python is currently the top programming language for data science. It’s a flexible and efficient language that’s relatively easy to learn and use, with an extensive set of packages for data wrangling, visualization, statistics, and machine learning. In this course we will supplement basic programming skills by exploring data formats and storage, data cleaning and wrangling, and exploratory data analysis using industry-standard Python packages. The goal of this course is to take a more programmatic and Pythonic view of data science. Much of our work will be in the Jupyter notebook environment with some exposure to the command line and scripting. We will also cover basic SQL queries for interacting with databases. Students will learn reproducible research techniques and skills for working with big data in Python.
DATA 370 Statistical Learning (4)
Statistical learning is a broad term that refers to any statistical technique that seeks to estimate the relationships among data. Modern advances in computational power allow us to use technology to build a wide array of models to analyze increasingly complex data sets. This course will explore the theory and application of statistical learning techniques such as clustering, regression, discriminant analysis, resampling, regularization, splines, generalized additive models, and Bayesian inference. We will use the statistical language R for analysis and R Markdown for the presentation of statistical analysis.
DATA 470 Capstone Project (1)
The capstone project is an opportunity for students to apply the knowledge gained throughout the Data Science minor to an interesting data problem, preferably in conjunction with a research project in their major. The students in the course will work with a mentor in their field of interest as well as the faculty member running the Data Science capstone project to develop a research plan to analyze one or more data sets addressing a topic of interest. All capstone students will meet together one hour a week to share ideas and take advantage of interdisciplinary collaboration. The capstone experience will culminate in a paper and a presentation.