MARC details
000 -LEADER |
fixed length control field |
09108nam a22002057a 4500 |
005 - DATE AND TIME OF LATEST TRANSACTION |
control field |
20190904105646.0 |
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
fixed length control field |
190904b ||||| |||| 00| 0 eng d |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER |
International Standard Book Number |
9788126546145 |
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER |
Classification number |
006.31 |
Item number |
FOR |
100 ## - MAIN ENTRY--PERSONAL NAME |
Personal name |
Foreman, John W. |
245 ## - TITLE STATEMENT |
Title |
Data smart: using data science to transform information into insight |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) |
Place of publication, distribution, etc. |
New Delhi |
Name of publisher, distributor, etc. |
Wiley India Pvt. Ltd. |
Date of publication, distribution, etc. |
2018 |
300 ## - PHYSICAL DESCRIPTION |
Extent |
xx, 409 p. |
365 ## - TRADE PRICE |
Price type code |
INR |
Price amount |
799.00 |
504 ## - BIBLIOGRAPHY, ETC. NOTE |
Bibliography, etc. note |
TABLE OF CONTENTS<br/>Introduction xiii<br/><br/>1 Everything You Ever Needed to Know about Spreadsheets but Were Too Afraid to Ask 1<br/><br/>Some Sample Data 2<br/><br/>Moving Quickly with the Control Button 2<br/><br/>Copying Formulas and Data Quickly 4<br/><br/>Formatting Cells 5<br/><br/>Paste Special Values 7<br/><br/>Inserting Charts 8<br/><br/>Locating the Find and Replace Menus 9<br/><br/>Formulas for Locating and Pulling Values 10<br/><br/>Using VLOOKUP to Merge Data 12<br/><br/>Filtering and Sorting 13<br/><br/>Using PivotTables 16<br/><br/>Using Array Formulas 19<br/><br/>Solving Stuff with Solver 20<br/><br/>OpenSolver: I Wish We Didn’t Need This, but We Do 26<br/><br/>Wrapping Up 27<br/><br/>2 Cluster Analysis Part I: Using K-Means to Segment Your Customer Base 29<br/><br/>Girls Dance with Girls, Boys Scratch Their Elbows 30<br/><br/>Getting Real: K-Means Clustering Subscribers in E-mail Marketing 35<br/><br/>Joey Bag O’ Donuts Wholesale Wine Emporium 36<br/><br/>The Initial Dataset 36<br/><br/>Determining What to Measure 38<br/><br/>Start with Four Clusters 41<br/><br/>Euclidean Distance: Measuring Distances as the Crow Flies 41<br/><br/>Distances and Cluster Assignments for Everybody! 44<br/><br/>Solving for the Cluster Centers 46<br/><br/>Making Sense of the Results 49<br/><br/>Getting the Top Deals by Cluster 50<br/><br/>The Silhouette: A Good Way to Let Different K Values Duke It Out 53<br/><br/>How about Five Clusters? 
60<br/><br/>Solving for Five Clusters 60<br/><br/>Getting the Top Deals for All Five Clusters 61<br/><br/>Computing the Silhouette for 5-Means Clustering 64<br/><br/>K-Medians Clustering and Asymmetric Distance Measurements 66<br/><br/>Using K-Medians Clustering 66<br/><br/>Getting a More Appropriate Distance Metric 67<br/><br/>Putting It All in Excel 69<br/><br/>The Top Deals for the 5-Medians Clusters 70<br/><br/>Wrapping Up 75<br/><br/>3 Naïve Bayes and the Incredible Lightness of Being an Idiot 77<br/><br/>When You Name a Product Mandrill, You’re Going to Get Some Signal and Some Noise 77<br/><br/>The World’s Fastest Intro to Probability Theory 79<br/><br/>Totaling Conditional Probabilities 80<br/><br/>Joint Probability, the Chain Rule, and Independence 80<br/><br/>What Happens in a Dependent Situation? 81<br/><br/>Bayes Rule 82<br/><br/>Using Bayes Rule to Create an AI Model 83<br/><br/>High-Level Class Probabilities Are Often Assumed to Be Equal 84<br/><br/>A Couple More Odds and Ends 85<br/><br/>Let’s Get This Excel Party Started 87<br/><br/>Removing Extraneous Punctuation 87<br/><br/>Splitting on Spaces 88<br/><br/>Counting Tokens and Calculating Probabilities 92<br/><br/>And We Have a Model! Let’s Use It 94<br/><br/>Wrapping Up 98<br/><br/>4 Optimization Modeling: Because That “Fresh Squeezed” Orange Juice Ain’t Gonna Blend Itself 101<br/><br/>Why Should Data Scientists Know Optimization? 
102<br/><br/>Starting with a Simple Trade-Off 103<br/><br/>Representing the Problem as a Polytope 103<br/><br/>Solving by Sliding the Level Set 105<br/><br/>The Simplex Method: Rooting around the Corners 106<br/><br/>Working in Excel 108<br/><br/>There’s a Monster at the End of This Chapter 117<br/><br/>Fresh from the Grove to Your Glass with a Pit Stop Through a Blending Model 118<br/><br/>You Use a Blending Model 119<br/><br/>Let’s Start with Some Specs 119<br/><br/>Coming Back to Consistency 121<br/><br/>Putting the Data into Excel 121<br/><br/>Setting Up the Problem in Solver 124<br/><br/>Lowering Your Standards 126<br/><br/>Dead Squirrel Removal: The Minimax Formulation 131<br/><br/>If-Then and the “Big M” Constraint 133<br/><br/>Multiplying Variables: Cranking Up the Volume to 11 137<br/><br/>Modeling Risk 144<br/><br/>Normally Distributed Data 145<br/><br/>Wrapping Up 154<br/><br/>5 Cluster Analysis Part II: Network Graphs and Community Detection 155<br/><br/>What Is a Network Graph? 156<br/><br/>Visualizing a Simple Graph 157<br/><br/>Brief Introduction to Gephi 159<br/><br/>Gephi Installation and File Preparation 160<br/><br/>Laying Out the Graph 162<br/><br/>Node Degree 165<br/><br/>Pretty Printing 166<br/><br/>Touching the Graph Data 168<br/><br/>Building a Graph from the Wholesale Wine Data 170<br/><br/>Creating a Cosine Similarity Matrix 172<br/><br/>Producing an r-Neighborhood Graph 174<br/><br/>How Much Is an Edge Worth? Points and Penalties in Graph Modularity 178<br/><br/>What’s a Point and What’s a Penalty? 179<br/><br/>Setting Up the Score Sheet 183<br/><br/>Let’s Get Clustering! 185<br/><br/>Split Number 1 185<br/><br/>Split 2: Electric Boogaloo 190<br/><br/>And…Split 3: Split with a Vengeance 192<br/><br/>Encoding and Analyzing the Communities 193<br/><br/>There and Back Again: A Gephi Tale 197<br/><br/>Wrapping Up 202<br/><br/>6 The Granddaddy of Supervised Artificial Intelligence—Regression 205<br/><br/>Wait, What? You’re Pregnant? 
205<br/><br/>Don’t Kid Yourself 206<br/><br/>Predicting Pregnant Customers at RetailMart Using Linear Regression 207<br/><br/>The Feature Set 207<br/><br/>Assembling the Training Data 209<br/><br/>Creating Dummy Variables 210<br/><br/>Let’s Bake Our Own Linear Regression 213<br/><br/>Linear Regression Statistics: R-Squared, F Tests, t Tests 221<br/><br/>Making Predictions on Some New Data and Measuring Performance 230<br/><br/>Predicting Pregnant Customers at RetailMart Using Logistic Regression 239<br/><br/>First You Need a Link Function 240<br/><br/>Hooking Up the Logistic Function and Reoptimizing 241<br/><br/>Baking an Actual Logistic Regression 244<br/><br/>Model Selection—Comparing the Performance of the Linear and Logistic Regressions 245<br/><br/>For More Information 248<br/><br/>Wrapping Up 249<br/><br/>7 Ensemble Models: A Whole Lot of Bad Pizza 251<br/><br/>Using the Data from Chapter 6 252<br/><br/>Bagging: Randomize, Train, Repeat 254<br/><br/>Decision Stump Is an Unsexy Term for a Stupid Predictor 254<br/><br/>Doesn’t Seem So Stupid to Me! 255<br/><br/>You Need More Power! 257<br/><br/>Let’s Train It 258<br/><br/>Evaluating the Bagged Model 267<br/><br/>Boosting: If You Get It Wrong, Just Boost and Try Again 272<br/><br/>Training the Model—Every Feature Gets a Shot 272<br/><br/>Evaluating the Boosted Model 280<br/><br/>Wrapping Up 283<br/><br/>8 Forecasting: Breathe Easy; You Can’t Win 285<br/><br/>The Sword Trade Is Hopping 286<br/><br/>Getting Acquainted with Time Series Data 286<br/><br/>Starting Slow with Simple Exponential Smoothing 288<br/><br/>Setting Up the Simple Exponential Smoothing Forecast 290<br/><br/>You Might Have a Trend 296<br/><br/>Holt’s Trend-Corrected Exponential Smoothing 299<br/><br/>Setting Up Holt’s Trend-Corrected Smoothing in a Spreadsheet 300<br/><br/>So Are You Done? 
Looking at Autocorrelations 306<br/><br/>Multiplicative Holt-Winters Exponential Smoothing 313<br/><br/>Setting the Initial Values for Level, Trend, and Seasonality 315<br/><br/>Getting Rolling on the Forecast 319<br/><br/>And Optimize! 324<br/><br/>Please Tell Me We’re Done Now!!! 326<br/><br/>Putting a Prediction Interval around the Forecast 327<br/><br/>Creating a Fan Chart for Effect 331<br/><br/>Wrapping Up 333<br/><br/>9 Outlier Detection: Just Because They’re Odd Doesn’t Mean They’re Unimportant 335<br/><br/>Outliers Are (Bad?) People, Too 335<br/><br/>The Fascinating Case of Hadlum v Hadlum 336<br/><br/>Tukey Fences 337<br/><br/>Applying Tukey Fences in a Spreadsheet 338<br/><br/>The Limitations of This Simple Approach 340<br/><br/>Terrible at Nothing, Bad at Everything 341<br/><br/>Preparing Data for Graphing 342<br/><br/>Creating a Graph 345<br/><br/>Getting the k Nearest Neighbors 347<br/><br/>Graph Outlier Detection Method 1: Just Use the Indegree 348<br/><br/>Graph Outlier Detection Method 2: Getting Nuanced with k-Distance 351<br/><br/>Graph Outlier Detection Method 3: Local Outlier Factors Are Where It’s At 353<br/><br/>Wrapping Up 358<br/><br/>10 Moving from Spreadsheets into R 361<br/><br/>Getting Up and Running with R 362<br/><br/>Some Simple Hand-Jamming 363<br/><br/>Reading Data into R 370<br/><br/>Doing Some Actual Data Science 372<br/><br/>Spherical K-Means on Wine Data in Just a Few Lines 372<br/><br/>Building AI Models on the Pregnancy Data 378<br/><br/>Forecasting in R 385<br/><br/>Looking at Outlier Detection 389<br/><br/>Wrapping Up 394<br/><br/>Conclusion 395<br/><br/>Where Am I? What Just Happened? 395<br/><br/>Before You Go-Go 395<br/><br/>Get to Know the Problem 396<br/><br/>We Need More Translators 397<br/><br/>Beware the Three-Headed Geek-Monster: Tools, Performance, and Mathematical Perfection 397<br/><br/>You Are Not the Most Important Function of Your Organization 400<br/><br/>Get Creative and Keep in Touch! 
400<br/><br/>Index 401 |
520 ## - SUMMARY, ETC. |
Summary, etc. |
DESCRIPTION<br/>The book provides nine tutorials on optimization, machine learning, data mining, and forecasting, all within the confines of a spreadsheet. Each tutorial uses a real-world problem, and the author guides the reader through the questions the reader might ask about how to craft a solution using the correct data science technique. The nine spreadsheets need to be hosted for download so that the reader can work the problems along with the book.<br/><br/>Important topics covered by the book:<br/><br/>Linear and integer programming<br/>K-nearest neighbors graphs and clustering<br/>Logistic regression<br/>Demand forecasting with seasonal adjustments<br/>Price sensitivity, revenue optimization, and price-sensitive forecasting<br/>Naïve Bayes classification<br/>Outlier detection using graphs and Local Outlier Factors<br/>Multi-criteria decision analysis |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name as entry element |
Data mining |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM |
Topical term or geographic name as entry element |
Web usage mining |
942 ## - ADDED ENTRY ELEMENTS (KOHA) |
Source of classification or shelving scheme |
Dewey Decimal Classification |
Koha item type |
Book |