This database is a list of tasks that students of data science may want to know how to accomplish, all phrased as “How to” questions. The table below lists all tasks in the database. To see them categorized, check out the topics page.

| Task | Solutions |
| --- | --- |
| How to add a polynomial term to a model | In Python: using sklearn; In R: solution |
| How to add a transformed term to a model | In Python: using NumPy and sklearn; In R: solution |
| How to add an interaction term to a model | In R: solution |
| How to add details to a plot | In Python: using Matplotlib; In R: solution |
| How to analyze the sample means of different treatment conditions | In Python: using Matplotlib and Seaborn; In R: using gplots and emmeans |
| How to change axes, ticks, and scale in a plot | In Python: using Matplotlib |
| How to check the assumptions of a linear model | In Python: using NumPy, SciPy, sklearn, Matplotlib and Seaborn; In R: solution |
| How to choose the sample size in a study with two population means | In Python: using statsmodels; In R: solution |
| How to compare two nested linear models | In Python: using statsmodels; In R: solution |
| How to compute a confidence interval for a mean difference (matched pairs) | In Python: using NumPy and SciPy; In R: solution |
| How to compute a confidence interval for a population mean | In Python: using SciPy; In R: solution; In Julia: solution |
| How to compute a confidence interval for a population mean using z-scores | In Python: using SciPy; In R: solution |
| How to compute a confidence interval for a regression coefficient | In Python: using statsmodels; In R: solution |
| How to compute a confidence interval for a single population variance | In Python: using SciPy; In R: solution |
| How to compute a confidence interval for the difference between two means when both population variances are known | In Python: using NumPy and SciPy; In R: solution |
| How to compute a confidence interval for the difference between two means when population variances are unknown | In Python: using NumPy and SciPy; In R: solution |
| How to compute a confidence interval for the difference between two proportions | In Python: using SciPy; In R: solution |
| How to compute a confidence interval for the expected value of a response variable | In Python: using statsmodels and sklearn; In R: solution |
| How to compute a confidence interval for the population proportion | In Python: using SciPy; In R: solution |
| How to compute a confidence interval for the ratio of two population variances | In Python: using SciPy; In R: solution |
| How to compute adjusted R-squared | In Python: using statsmodels; In R: solution |
| How to compute covariance and correlation coefficients | In Python: using pandas and NumPy; In R: solution |
| How to compute Fisher’s confidence intervals | In R: solution |
| How to compute probabilities from a distribution | In Python: using SciPy; In R: solution; In Excel: solution; In Julia: solution |
| How to compute R-squared for a simple linear model | In Python: using SciPy; In R: solution; In Julia: solution |
| How to compute summary statistics | In Python: using pandas and NumPy; In R: solution; In Excel: solution; In Julia: solution |
| How to compute the derivative of a function | In Python: using SymPy; In R: solution |
| How to compute the domain of a function | In Python: using SymPy |
| How to compute the error bounds on a Taylor approximation | In Python: using SymPy |
| How to compute the limit of a function | In Python: using SymPy |
| How to compute the power of a test comparing two population means | In Python: using statsmodels; In R: solution |
| How to compute the residuals of a linear model | In Python: using statsmodels; In R: solution |
| How to compute the standard error of the estimate for a model | In Python: using statsmodels; In R: solution |
| How to compute the Taylor series for a function | In Python: using SymPy |
| How to conduct a mixed designs ANOVA | In Python: using pandas and pingouin; In R: solution |
| How to conduct a repeated measures ANOVA | In Python: using pandas and pingouin; In R: using rstatix and tidyr and car |
| How to convert a text column into dates | In Python: using pandas; In R: solution |
| How to create a box (and whisker) plot | In Python: using Matplotlib; In R: solution |
| How to create a data frame from scratch | In Python: solution; In R: solution |
| How to create a histogram | In Python: using Matplotlib; In R: solution |
| How to create a QQ-plot | In Python: using SciPy, using statsmodels; In R: solution |
| How to create basic plots | In Python: using Matplotlib; In R: solution |
| How to create bivariate plots to compare groups | In Python: using Matplotlib and Seaborn; In R: using lattice and gplots |
| How to create symbolic variables | In Python: using SymPy |
| How to define a mathematical sequence | In Python: using SymPy |
| How to define a mathematical series | In Python: using SymPy |
| How to do a goodness of fit test for a multinomial experiment | In Python: using SciPy; In R: solution |
| How to do a hypothesis test for a mean difference (matched pairs) | In Python: using SciPy; In R: solution |
| How to do a hypothesis test for a population proportion | In Python: using SciPy; In R: solution |
| How to do a hypothesis test for population variance | In R: solution |
| How to do a hypothesis test for the difference between means when both population variances are known | In Python: using SciPy; In R: solution |
| How to do a hypothesis test for the difference between two proportions | In Python: using SciPy; In R: solution |
| How to do a hypothesis test for the mean with known standard deviation | In Python: using SciPy; In R: solution |
| How to do a hypothesis test for the ratio of two population variances | In Python: using SciPy; In R: solution |
| How to do a hypothesis test of a coefficient’s significance | In R: solution |
| How to do a Kruskal-Wallis test | In Python: using SciPy; In R: solution |
| How to do a one-sided hypothesis test for two sample means | In Python: using SciPy; In R: solution |
| How to do a one-way analysis of variance (ANOVA) | In Python: using SciPy; In R: solution; In Julia: solution |
| How to do a Spearman rank correlation test | In Python: using SciPy; In R: solution |
| How to do a test of joint significance | In Python: using statsmodels; In R: solution |
| How to do a two-sided hypothesis test for a sample mean | In Python: using SciPy; In R: solution; In Julia: solution |
| How to do a two-sided hypothesis test for two sample means | In Python: using SciPy; In R: solution; In Julia: solution |
| How to do a two-way ANOVA test with interaction | In Python: using statsmodels; In R: solution |
| How to do a two-way ANOVA test without interaction | In Python: using statsmodels; In R: solution |
| How to do a Wilcoxon rank-sum test | In Python: using SciPy; In R: solution |
| How to do a Wilcoxon signed-rank test | In Python: using SciPy; In R: solution |
| How to do a Wilcoxon signed-rank test for matched pairs | In Python: using SciPy; In R: solution |
| How to do basic mathematical computations | In Python: using NumPy, using SymPy, solution; In R: solution; In Excel: solution; In Julia: solution |
| How to do implicit differentiation | In Python: using SymPy |
| How to find critical values and p-values from the normal distribution | In R: solution; In Julia: solution |
| How to find critical values and p-values from the t-distribution | In R: solution; In Julia: solution |
| How to find the critical numbers of a function | In Python: using SymPy |
| How to find the critical points of a multivariate function | In Python: using SymPy |
| How to fit a linear model to two columns of data | In Python: using SciPy, using statsmodels; In R: solution; In Julia: solution |
| How to fit a multivariate linear model | In Python: using statsmodels; In R: solution |
| How to generate random values from a distribution | In Python: using SciPy; In R: solution; In Excel: solution; In Julia: solution |
| How to graph a two-variable function as a surface | In Python: using SymPy |
| How to graph curves that are not functions | In Python: using SymPy |
| How to graph mathematical functions | In Python: using NumPy and Matplotlib, using SymPy; In R: solution |
| How to graph mathematical sequences | In Python: using SymPy and Matplotlib |
| How to isolate one variable in an equation | In Python: using SymPy |
| How to perform a chi-squared test on a contingency table | In Python: using SciPy; In R: solution; In Julia: solution |
| How to perform a planned comparison test | In R: using gmodels |
| How to perform an analysis of covariance (ANCOVA) | In Python: using pingouin; In R: solution |
| How to perform pairwise comparisons | In Python: using statsmodels; In R: solution |
| How to perform post-hoc analysis with Tukey’s HSD test | In Python: using statsmodels and Matplotlib; In R: using agricolae, solution |
| How to plot continuous probability distributions | In Python: using SciPy; In R: solution; In Excel: solution; In Julia: solution |
| How to plot discrete probability distributions | In Python: using SciPy; In R: solution; In Julia: solution |
| How to plot interaction effects of treatments | In Python: using Matplotlib and Seaborn; In R: using ggpubr |
| How to predict the response variable in a linear model | In Python: using statsmodels; In R: solution |
| How to quickly load some sample data | In Python: solution; In R: solution; In Julia: solution |
| How to solve an ordinary differential equation | In Python: using SymPy |
| How to solve symbolic equations | In Python: using SymPy |
| How to substitute a value for a symbolic variable | In Python: using SymPy |
| How to summarize a column | In Python: solution; In R: solution; In Excel: solution |
| How to summarize and compare data by groups | In Python: solution; In R: solution |
| How to test data for normality with Pearson’s chi-squared test | In R: solution |
| How to test data for normality with the D’Agostino-Pearson test | In Python: using SciPy |
| How to test data for normality with the Jarque-Bera test | In Python: using SciPy |
| How to test for a treatment effect in a single factor design | In Python: using SciPy and statsmodels; In R: using perm |
| How to use Bonferroni’s Correction method | In R: solution |
| How to write a piecewise-defined function | In Python: using SymPy |
| How to write an ordinary differential equation | In Python: using SymPy |
| How to write and evaluate definite integrals | In Python: using SymPy |
| How to write and evaluate indefinite integrals | In Python: using SymPy |
| How to write and evaluate Riemann sums | In Python: using SymPy |
| How to write symbolic equations | In Python: using SymPy |