Think Stats: Probability and Statistics for Programmers
Title:  Think Stats: Probability and Statistics for Programmers


Name(s): 
Downey, Allen B., author 

Type of Resource:  Textbook  
Date Issued:  2011  
Abstract/Description:  Think Stats: Probability and Statistics for Programmers is a textbook for a new kind of introductory prob-stat class. It emphasizes the use of statistics to explore large datasets. It takes a computational approach, which has several advantages: Students write programs as a way of developing and testing their understanding. For example, they write functions to compute a least squares fit, residuals, and the coefficient of determination. Writing and testing this code requires them to understand the concepts and implicitly corrects misunderstandings. Students run experiments to test statistical behavior. For example, they explore the Central Limit Theorem (CLT) by generating samples from several distributions. When they see that the sum of values from a Pareto distribution doesn’t converge to normal, they remember the assumptions the CLT is based on. Some ideas that are hard to grasp mathematically are easy to understand by simulation. For example, we approximate p-values by running Monte Carlo simulations, which reinforces the meaning of the p-value. Using discrete distributions and computation makes it possible to present topics like Bayesian estimation that are not usually covered in an introductory class. For example, one exercise asks students to compute the posterior distribution for the “German tank problem,” which is difficult analytically but surprisingly easy computationally. Because students work in a general-purpose programming language (Python), they are able to import data from almost any source. They are not limited to data that has been cleaned and formatted for a particular statistics tool. The book lends itself to a project-based approach. In my class, students work on a semester-long project that requires them to pose a statistical question, find a dataset that can address it, and apply each of the techniques they learn to their own data.
Identifier:  STA2023_10 (IID)  
Note(s): 
Statistics, STA2023 

Subject(s):  Statistics  
Links:  https://greenteapress.com/thinkstats/html/index.html  
Use and Reproduction:  Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-NonCommercial 3.0 Unported License, which is available at http://creativecommons.org/licenses/by-nc/3.0/.  
Host Institution:  FLVC  
Is Part Of: 
