Data Science

DATA - Data Science

DATA-110: Explorations in Data Science (Credits: 4)

Data Science is at the forefront of the Big Data Revolution. Governments, companies, nonprofits, and health care providers are collecting, storing, and analyzing vast amounts of data to extract information about us and make predictions about our lives. The mathematical and technological aspects of data science have been central to its success, yet they cannot exist in isolation. The context in which data is collected and used, and potentially misused, shapes the impact on individuals and society as a whole. Therefore, a central theme of this class is the study of data collection, analysis, and communication in contexts drawn from different disciplines, including but not limited to economics, psychology, sociology, biology, medicine, and chemistry. (WCore: WCSAM, QE)

DATA-150: Data and Society (Credits: 4)

Quantitative literacy is increasingly important in our world of information. The primary goal of this course is to learn about data and how it is used. Along the way, we will learn how to develop basic tools to analyze and visualize data, read and evaluate research claims, and report research findings in honest and ethical ways. (This course may not be taken for credit if a student already has credit for DATA-220.) (WCore: QE)

DATA-220: Introduction to Statistics (Credits: 4)

Statistics is the study of data. This course will develop tools for analyzing data from a variety of fields. We follow the process from data gathering (sampling methods and experimental design) to exploratory data analysis (graphs, tables, charts, and summary statistics) to inferential statistics (hypothesis tests and confidence intervals) using simulation and sampling distributions. A key component of the course is the introduction of the statistical language R for analysis and R Markdown for the presentation of statistical analysis. (WCore: QE)

DATA-300: Special Topics in Data Science (Credits: 1 to 4)

Covers special topics not normally offered in the regular Data Science curriculum.

DATA-300EE: Bayesian Data Analysis (Credits: 2)

Bayes' Theorem is a powerful tool from probability theory that allows us to analyze data and make inferences about a population. More traditional statistical methods, like null hypothesis significance testing with its confidence intervals and p-values, reason indirectly and can be difficult to interpret. Bayesian inference has a direct probabilistic interpretation. In this class, we will learn some of the mathematical theory behind Bayesian statistics and apply our knowledge to the analysis of real data.

DATA-300NNN: Structural Equation Modeling (Credits: 2)

Structural Equation Modeling (SEM) refers to a family of statistical methods for modeling the relationships between variables. As a research tool, SEM integrates and extends features of analysis of variance (ANOVA) procedures, multiple linear regression, and factor analysis by allowing the testing of predictive and causal relationships among continuous and categorical variables, both observed and unobserved (latent). This course will provide a conceptual as well as an applied, hands-on understanding of SEM assumptions, analyses, and the interpretation of data. It is recommended for students, faculty, and staff who are interested in learning how SEM can be used to model various forms of quantitative data, using null-hypothesis testing and Bayesian approaches. Statistical programs, such as R, will be highlighted as a way to learn about and apply SEM to a wide variety of research questions and hypotheses in the sciences, business, and beyond.

DATA-307: Databases for Data Science (Credits: 2)

A study of the application of relational databases to information collection and extraction. SQL queries are studied in depth.

DATA-350: Statistical Modeling (Credits: 4)

The general linear model is a powerful framework for modeling relationships in data analysis. This course establishes the theory and application of regression models from simple and multiple regression through ANOVA and logit/probit models. In addition to building models, we will also learn to diagnose model fit and handle a wide range of possible complications. We will use the statistical language R for analysis and R Markdown for the presentation of statistical analysis.

DATA-360: Data Science With Python (Credits: 4)

Python is currently one of the most widely used programming languages for data science. It is a flexible and efficient language that is relatively easy to learn and use, with an extensive set of packages for data wrangling, visualization, statistics, and machine learning. In this course, we will supplement basic programming skills by exploring data formats and storage, data cleaning and wrangling, and exploratory data analysis using industry-standard Python packages. The goal of this course is to take a more programmatic and Pythonic view of data science. Much of our work will be in the Jupyter notebook environment, with some exposure to the command line and scripting. We will also cover basic SQL queries for interacting with databases. Students will learn reproducible research techniques and skills for working with big data in Python.

DATA-370: Statistical Learning (Credits: 4)

Statistical learning is a broad term that refers to any statistical technique that seeks to estimate the relationships among data. Modern advances in computational power allow us to use technology to build a wide array of models to analyze increasingly complex data sets. This course will explore the theory and application of statistical learning techniques such as clustering, regression, discriminant analysis, resampling, regularization, splines, generalized additive models, and Bayesian inference. We will use the statistical language R for analysis and R Markdown for the presentation of statistical analysis.

DATA-401: Directed Study (Credits: 1 to 4)

A tutorial-based course used only for student-initiated proposals for intensive individual study of topics not otherwise offered in the Data Science Program. Requires consent of instructor and school dean. This course is repeatable for credit.

DATA-440: Internship (Credits: 1 to 8)

Offers students the opportunity to integrate classroom knowledge with practical experience. Prerequisites: junior or senior standing (for transfer students, at least 15 hours completed at Westminster or permission of instructor), minimum 2.5 GPA, completion of the Career Center Internship Workshop, and consent of program director and Career Center Internship Coordinator. REGISTRATION NOTE: Registration for internships is initiated through the Career Center website and is finalized upon completion of required paperwork and approvals. More info: 801-832-2590

DATA-470: Capstone Project (Credit: 1)

The capstone project is an opportunity for students to apply the knowledge gained throughout the Data Science minor to an interesting data problem, preferably in conjunction with a research project in their major. Each student will work with a mentor in their field of interest, as well as with the faculty member running the Data Science capstone project, to develop a research plan for analyzing one or more data sets that address a topic of interest. All capstone students will meet for one hour a week to share ideas and take advantage of interdisciplinary collaboration. The capstone experience will culminate in a paper and a presentation.