Data Smart: Using Data Science to Transform Information into Insight

Table of Contents

Title Page
Copyright
Contents
Chapter 1 Everything You Ever Needed to Know about Spreadsheets but Were Too Afraid to Ask
Some Sample Data
Moving Quickly with the Control Button
Copying Formulas and Data Quickly
Formatting Cells
Paste Special Values
Inserting Charts
Locating the Find and Replace Menus
Formulas for Locating and Pulling Values
Using VLOOKUP to Merge Data
Filtering and Sorting
Using PivotTables
Using Array Formulas
Solving Stuff with Solver
OpenSolver: I Wish We Didn’t Need This, but We Do
Wrapping Up
Chapter 2 Cluster Analysis Part I: Using K-Means to Segment Your Customer Base
Girls Dance with Girls, Boys Scratch Their Elbows
Getting Real: K-Means Clustering Subscribers in E-mail Marketing
Joey Bag O’ Donuts Wholesale Wine Emporium
The Initial Dataset
Determining What to Measure
Start with Four Clusters
Euclidean Distance: Measuring Distances as the Crow Flies
Distances and Cluster Assignments for Everybody!
Solving for the Cluster Centers
Making Sense of the Results
Getting the Top Deals by Cluster
The Silhouette: A Good Way to Let Different K Values Duke It Out
How about Five Clusters?
Solving for Five Clusters
Getting the Top Deals for All Five Clusters
Computing the Silhouette for 5-Means Clustering
K-Medians Clustering and Asymmetric Distance Measurements
Using K-Medians Clustering
Getting a More Appropriate Distance Metric
Putting It All in Excel
The Top Deals for the 5-Medians Clusters
Wrapping Up
Chapter 3 Naïve Bayes and the Incredible Lightness of Being an Idiot
When You Name a Product Mandrill, You’re Going to Get Some Signal and Some Noise
The World’s Fastest Intro to Probability Theory
Totaling Conditional Probabilities
Joint Probability, the Chain Rule, and Independence
What Happens in a Dependent Situation?
Bayes Rule
Using Bayes Rule to Create an AI Model
High-Level Class Probabilities Are Often Assumed to Be Equal
A Couple More Odds and Ends
Let’s Get This Excel Party Started
Removing Extraneous Punctuation
Splitting on Spaces
Counting Tokens and Calculating Probabilities
And We Have a Model! Let’s Use It
Wrapping Up
Chapter 4 Optimization Modeling: Because That “Fresh Squeezed” Orange Juice Ain’t Gonna Blend
Why Should Data Scientists Know Optimization?
Starting with a Simple Trade-Off
Representing the Problem as a Polytope
Solving by Sliding the Level Set
The Simplex Method: Rooting around the Corners
Working in Excel
There’s a Monster at the End of This Chapter
Fresh from the Grove to Your Glass…with a Pit Stop Through a Blending Model
You Use a Blending Model
Let’s Start with Some Specs
Coming Back to Consistency
Putting the Data into Excel
Setting Up the Problem in Solver
Lowering Your Standards
Dead Squirrel Removal: The Minimax Formulation
If-Then and the “Big M” Constraint
Multiplying Variables: Cranking Up the Volume to 11
Modeling Risk
Normally Distributed Data
Wrapping Up
Chapter 5 Cluster Analysis Part II: Network Graphs and Community Detection
What Is a Network Graph?
Visualizing a Simple Graph
Brief Introduction to Gephi
Gephi Installation and File Preparation
Laying Out the Graph
Node Degree
Pretty Printing
Touching the Graph Data
Building a Graph from the Wholesale Wine Data
Creating a Cosine Similarity Matrix
Producing an r-Neighborhood Graph
How Much Is an Edge Worth? Points and Penalties in Graph Modularity
What’s a Point and What’s a Penalty?
Setting Up the Score Sheet
Let’s Get Clustering!
Split Number 1
Split 2: Electric Boogaloo
And…Split 3: Split with a Vengeance
Encoding and Analyzing the Communities
There and Back Again: A Gephi Tale
Wrapping Up
Chapter 6 The Granddaddy of Supervised Artificial Intelligence—Regression
Wait, What? You’re Pregnant?
Don’t Kid Yourself
Predicting Pregnant Customers at RetailMart Using Linear Regression
The Feature Set
Assembling the Training Data
Creating Dummy Variables
Let’s Bake Our Own Linear Regression
Linear Regression Statistics: R-Squared, F Tests, t Tests
Making Predictions on Some New Data and Measuring Performance
Predicting Pregnant Customers at RetailMart Using Logistic Regression
First You Need a Link Function
Hooking Up the Logistic Function and Reoptimizing
Baking an Actual Logistic Regression
Model Selection—Comparing the Performance of the Linear and Logistic Regressions
For More Information
Wrapping Up
Chapter 7 Ensemble Models: A Whole Lot of Bad Pizza
Using the Data from Chapter 6
Bagging: Randomize, Train, Repeat
Decision Stump Is an Unsexy Term for a Stupid Predictor
Doesn’t Seem So Stupid to Me!
You Need More Power!
Let’s Train It
Evaluating the Bagged Model
Boosting: If You Get It Wrong, Just Boost and Try Again
Training the Model—Every Feature Gets a Shot
Evaluating the Boosted Model
Wrapping Up
Chapter 8 Forecasting: Breathe Easy; You Can’t Win
The Sword Trade Is Hopping
Getting Acquainted with Time Series Data
Starting Slow with Simple Exponential Smoothing
Setting Up the Simple Exponential Smoothing Forecast
You Might Have a Trend
Holt’s Trend-Corrected Exponential Smoothing
Setting Up Holt’s Trend-Corrected Smoothing in a Spreadsheet
So Are You Done? Looking at Autocorrelations
Multiplicative Holt-Winters Exponential Smoothing
Setting the Initial Values for Level, Trend, and Seasonality
Getting Rolling on the Forecast
And…Optimize!
Please Tell Me We’re Done Now!!!
Putting a Prediction Interval around the Forecast
Creating a Fan Chart for Effect
Wrapping Up
Chapter 9 Outlier Detection: Just Because They’re Odd Doesn’t Mean They’re Unimportant
Outliers Are (Bad?) People, Too
The Fascinating Case of Hadlum v. Hadlum
Tukey Fences
Applying Tukey Fences in a Spreadsheet
The Limitations of This Simple Approach
Terrible at Nothing, Bad at Everything
Preparing Data for Graphing
Creating a Graph
Getting the k Nearest Neighbors
Graph Outlier Detection Method 1: Just Use the Indegree
Graph Outlier Detection Method 2: Getting Nuanced with k-Distance
Graph Outlier Detection Method 3: Local Outlier Factors Are Where It’s At
Wrapping Up
Chapter 10 Moving from Spreadsheets into R
Getting Up and Running with R
Some Simple Hand-Jamming
Reading Data into R
Doing Some Actual Data Science
Spherical K-Means on Wine Data in Just a Few Lines
Building AI Models on the Pregnancy Data
Forecasting in R
Looking at Outlier Detection
Wrapping Up
Conclusion
Where Am I? What Just Happened?
Before You Go-Go
Get to Know the Problem
We Need More Translators
Beware the Three-Headed Geek-Monster: Tools, Performance, and Mathematical Perfection
You Are Not the Most Important Function of Your Organization
Get Creative and Keep in Touch!
Index

Additional information

Select product: E-book (to own)
