Data Mining for Business Analytics: Concepts, Techniques, and Applications in R presents an applied approach to data mining concepts and methods, using R software for illustration. Readers will learn how to implement a variety of popular data mining algorithms in R (free and open-source software) to tackle business problems and opportunities. This is the fifth version of this successful text, and the first using R. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining and network analysis. It also includes:
• Two new co-authors, Inbal Yahav and Casey Lichtendahl, who bring both expertise teaching business analytics courses using R, and data mining consulting experience in business and government
• Updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma and executive courses, and from their students
• More than a dozen case studies demonstrating applications for the data mining techniques described
• End-of-chapter exercises that help readers gauge and expand their comprehension and competency of the material presented
• A companion website with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions: www.dataminingbook.com
Data Mining for Business Analytics: Concepts, Techniques, and Applications in R is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.
“This book has by far the most comprehensive review of business analytics methods that I have ever seen, covering everything from classical approaches such as linear and logistic regression, through to modern methods like neural networks, bagging and boosting, and even much more business-specific procedures such as social network analysis and text mining. If not the bible, it is at the least a definitive manual on the subject.” Gareth M. James, University of Southern California and co-author (with Witten, Hastie and Tibshirani) of the best-selling book An Introduction to Statistical Learning, with Applications in R
Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University’s Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 publications including books.
Peter C. Bruce is President and Founder of the Institute for Statistics Education at Statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective (Wiley) and co-author of Practical Statistics for Data Scientists: 50 Essential Concepts (O’Reilly).
Inbal Yahav, PhD, is Professor at the Graduate School of Business Administration at Bar-Ilan University, Israel. She teaches courses in social network analysis, advanced research methods, and software quality assurance. Dr. Yahav received her PhD in Operations Research and Data Mining from the University of Maryland, College Park.
Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years.
Kenneth C. Lichtendahl, Jr., PhD, is Associate Professor at the University of Virginia. He is the Eleanor F. and Phillip G. Rust Professor of Business Administration and teaches MBA courses in decision analysis, data analysis and optimization, and managerial quantitative analysis. He also teaches executive education courses in strategic analysis and decision-making, and managing the corporate aviation function.
Concepts, Techniques, and Applications in R
Author: Galit Shmueli, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, Kenneth C. Lichtendahl, Jr.
Publisher: John Wiley & Sons
Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition presents an applied approach to data mining and predictive analytics with clear exposition, hands-on exercises, and real-life case studies. Readers will work with all of the standard data mining methods using the Microsoft® Office Excel® add-in XLMiner® to develop predictive models and learn how to obtain business value from Big Data. Featuring updated topical coverage on text mining, social network analysis, collaborative filtering, ensemble methods, uplift modeling and more, the Third Edition also includes:
• Real-world examples to build a theoretical and practical understanding of key data mining methods
• End-of-chapter exercises that help readers better understand the presented material
• Data-rich case studies to illustrate various applications of data mining techniques
• Completely new chapters on social network analysis and text mining
• A companion site with additional data sets, instructor materials that include solutions to exercises and case studies, and Microsoft PowerPoint® slides: https://www.dataminingbook.com
• Free 140-day license to use XLMiner for Education software
Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition is an ideal textbook for upper-undergraduate and graduate-level courses as well as professional programs on data mining, predictive modeling, and Big Data analytics. The new edition is also a unique reference for analysts, researchers, and practitioners working with predictive analytics in the fields of business, finance, marketing, computer science, and information technology.
Praise for the Second Edition
"…full of vivid and thought-provoking anecdotes... needs to be read by anyone with a serious interest in research and marketing." – Research Magazine
"Shmueli et al. have done a wonderful job in presenting the field of data mining - a welcome addition to the literature."
– ComputingReviews.com "Excellent choice for business analysts...The book is a perfect fit for its intended audience." – Keith McCormick, Consultant and Author of SPSS Statistics For Dummies, Third Edition and SPSS Statistics for Data Analysis and Visualization Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University’s Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, The Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 journal articles, books, textbooks and book chapters. Peter C. Bruce is President and Founder of the Institute for Statistics Education at www.statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective, also published by Wiley. Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad for 15 years.
Concepts, Techniques, and Applications with XLMiner
Author: Galit Shmueli, Peter C. Bruce, Nitin R. Patel
Publisher: John Wiley & Sons
Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® presents an applied and interactive approach to data mining. Featuring hands-on applications with JMP Pro®, a statistical package from the SAS Institute, the book uses engaging, real-world examples to build a theoretical and practical understanding of key data mining methods, especially predictive models for classification and prediction. Topics include data visualization, dimension reduction techniques, clustering, linear and logistic regression, classification and regression trees, discriminant analysis, naive Bayes, neural networks, uplift modeling, ensemble models, and time series forecasting. Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® also includes:
• Detailed summaries that supply an outline of key topics at the beginning of each chapter
• End-of-chapter examples and exercises that allow readers to expand their comprehension of the presented material
• Data-rich case studies to illustrate various applications of data mining techniques
• A companion website with over two dozen data sets, exercises and case study solutions, and slides for instructors: www.dataminingbook.com
Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® is an excellent textbook for advanced undergraduate and graduate-level courses on data mining, predictive analytics, and business analytics. The book is also a one-of-a-kind resource for data scientists, analysts, researchers, and practitioners working with analytics in the fields of management, finance, marketing, information technology, healthcare, education, and any other data-rich field.
Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University’s Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan.
Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 journal articles, books, textbooks, and book chapters, including Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, also published by Wiley.
Peter C. Bruce is President and Founder of the Institute for Statistics Education at www.statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective and co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, both published by Wiley.
Mia Stephens is Academic Ambassador at JMP®, a division of SAS Institute. Prior to joining SAS, she was an adjunct professor of statistics at the University of New Hampshire and a founding member of the North Haven Group LLC, a statistical training and consulting company. She is the co-author of three other books, including Visual Six Sigma: Making Data Analysis Lean, Second Edition, also published by Wiley.
Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years. He is co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, also published by Wiley.
Concepts, Techniques, and Applications with JMP Pro
Author: Galit Shmueli, Peter C. Bruce, Mia L. Stephens, Nitin R. Patel
Publisher: John Wiley & Sons
Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner
Author: Galit Shmueli, Nitin R. Patel, Peter C. Bruce
Publisher: John Wiley & Sons
Collecting, analyzing, and extracting valuable information from a large amount of data requires easily accessible, robust, computational and analytical tools. Data Mining and Business Analytics with R utilizes the open source software R for the analysis, exploration, and simplification of large high-dimensional data sets. As a result, readers are provided with the needed guidance to model and interpret complicated data and become adept at building powerful models for prediction and classification. Highlighting both underlying concepts and practical computational skills, Data Mining and Business Analytics with R begins with coverage of standard linear regression and the importance of parsimony in statistical modeling. The book includes important topics such as penalty-based variable selection (LASSO); logistic regression; regression and classification trees; clustering; principal components and partial least squares; and the analysis of text and network data. In addition, the book presents:
• A thorough discussion and extensive demonstration of the theory behind the most useful data mining tools
• Illustrations of how to use the outlined concepts in real-world situations
• Readily available additional data sets and related R code allowing readers to apply their own analyses to the discussed materials
• Numerous exercises to help readers with computing skills and deepen their understanding of the material
Data Mining and Business Analytics with R is an excellent graduate-level textbook for courses on data mining and business analytics. The book is also a valuable reference for practitioners who collect and analyze data in the fields of finance, operations management, marketing, and the information sciences.
Author: Johannes Ledolter
Publisher: John Wiley & Sons
Customer and Business Analytics: Applied Data Mining for Business Decision Making Using R explains and demonstrates, via the accompanying open-source software, how advanced analytical tools can address various business problems. It also gives insight into some of the challenges faced when deploying these tools. Extensively classroom-tested, the text is ideal for students in customer and business analytics or applied data mining as well as professionals in small- to medium-sized organizations. The book offers an intuitive understanding of how different analytics algorithms work. Where necessary, the authors explain the underlying mathematics in an accessible manner. Each technique presented includes a detailed tutorial that enables hands-on experience with real data. The authors also discuss issues often encountered in applied data mining projects and present the CRISP-DM process model as a practical framework for organizing these projects. Showing how data mining can improve the performance of organizations, this book and its R-based software provide the skills and tools needed to successfully develop advanced analytics capabilities.
Applied Data Mining for Business Decision Making Using R
Author: Daniel S. Putler, Robert E. Krider
Publisher: CRC Press
Category: Business & Economics
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
with Applications in R
Author: Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
Publisher: Springer Science & Business Media
Data Mining Applications with R is a great resource for researchers and professionals to understand the wide use of R, a free software environment for statistical computing and graphics, in solving different problems in industry. R is widely used in leveraging data mining techniques across many different industries, including government, finance, insurance, medicine, scientific research and more. This book presents 15 different real-world case studies illustrating various techniques in rapidly growing areas. It is an ideal companion for data mining researchers in academia and industry looking for ways to turn this versatile software into a powerful analytic tool. R code, data, and color figures for the book are provided at the RDataMining.com website.
• Helps data miners to learn to use R in their specific area of work and see how R can apply in different industries
• Presents various case studies in real-world applications, which will help readers to apply the techniques in their work
• Provides code examples and sample data for readers to easily learn the techniques by running the code by themselves
Author: Yanchang Zhao, Yonghua Cen
Publisher: Academic Press
Expert guidance on information management for optimum customer intelligence processes
Providing essential guidance for information management, this book helps you understand the basics of information management, how to design and launch customer intelligence campaigns, and optimize existing customer intelligence processes.
• How to align information management with company strategy
• Examines how to get, grow, and retain valuable customers
• Discusses how to optimize existing customer intelligence processes
Showing you how to make extensive use of data, statistical, and quantitative analysis, explanatory and predictive modeling, and fact-based management to drive decision making, Business Analytics for Customer Intelligence provides you with the tools your business needs to optimize your data-driven processes.
How to Compete in the Information Age
Author: Gert H. N. Laursen
Publisher: John Wiley & Sons
Category: Business & Economics
R for Business Analytics looks at some of the most common tasks performed by business analysts and helps the user navigate the wealth of information in R and its 4000 packages. With this information the reader can select the packages that can help process the analytical tasks with minimum effort and maximum usefulness. The use of Graphical User Interfaces (GUI) is emphasized in this book to further cut down and bend the famous learning curve in learning R. This book aims to help you kick-start your analytics work, with chapters on data visualization, code examples on web analytics and social media analytics, clustering, regression models, text mining, data mining models and forecasting. The book tries to expose the reader to a breadth of business analytics topics without burying the user in needless depth. The included references and links allow the reader to pursue business analytics topics. This book is aimed at business analysts with basic programming skills who want to use R for business analytics. Note that the scope of the book is neither statistical theory nor graduate-level research in statistics; rather, it is for business analytics practitioners. Business analytics (BA) refers to the field of exploration and investigation of data generated by businesses. Business Intelligence (BI) is the seamless dissemination of information through the organization, which primarily involves business metrics both past and current for the use of decision support in businesses. Data Mining (DM) is the process of discovering new patterns from large data using algorithms and statistical methods. To differentiate between the three: BI is mostly current reports, BA is models to predict and strategize, and DM matches patterns in big data. The R statistical software is the fastest growing analytics platform in the world, and is established in both academia and corporations for robustness, reliability and accuracy.
The book follows Albert Einstein’s famous remark about making things as simple as possible, but no simpler. This book will blow away the last remaining doubts in your mind about using R in your business environment. Even non-technical users will enjoy the easy-to-use examples. The interviews with creators and corporate users of R make the book very readable. The author firmly believes Isaac Asimov was a better writer at spreading science than any textbook or journal author.
Author: A Ohri
Publisher: Springer Science & Business Media
Category: Business & Economics
Data Mining: Concepts and Techniques provides the concepts and techniques for processing gathered data or information for use in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data, a process also referred to as knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods of getting to know, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining.
• Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects
• Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields
• Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data
Author: Jiawei Han, Jian Pei, Micheline Kamber
Handbook of Statistical Analysis and Data Mining Applications, Second Edition, is a comprehensive professional reference book that guides business analysts, scientists, engineers and researchers, both academic and industrial, through all stages of data analysis, model building and implementation. The handbook helps users discern technical and business problems, understand the strengths and weaknesses of modern data mining algorithms and employ the right statistical methods for practical application. This book is an ideal reference for users who want to address massive and complex datasets with novel statistical approaches and be able to objectively evaluate analyses and solutions. It has clear, intuitive explanations of the principles and tools for solving problems using modern analytic techniques and discusses their application to real problems in ways accessible and beneficial to practitioners across several areas, from science and engineering to medicine, academia and commerce.
• Includes input by practitioners for practitioners
• Includes tutorials in numerous fields of study that provide step-by-step instruction on how to use supplied tools to build models
• Contains practical advice from successful real-world implementations
• Brings together, in a single resource, all the information a beginner needs to understand the tools and issues in data mining to build successful data mining solutions
• Features clear, intuitive explanations of novel analytical tools and techniques, and their practical applications
Author: Robert Nisbet, Gary Miner, Ken Yale
Concise, thoroughly class-tested primer that features basic statistical concepts in the context of analytics, resampling, and the bootstrap
A uniquely developed presentation of key statistical topics, Introductory Statistics and Analytics: A Resampling Perspective provides an accessible approach to statistical analytics, resampling, and the bootstrap for readers with various levels of exposure to basic probability and statistics. Originally class-tested at one of the first online learning companies in the discipline, www.statistics.com, the book primarily focuses on applications of statistical concepts developed via resampling, with a background discussion of mathematical theory. This feature stresses statistical literacy and understanding, which demonstrates the fundamental basis for statistical inference and demystifies traditional formulas. The book begins with illustrations that have the essential statistical topics interwoven throughout before moving on to demonstrate the proper design of studies.
Meeting all of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) requirements for an introductory statistics course, Introductory Statistics and Analytics: A Resampling Perspective also includes:
• Over 300 “Try It Yourself” exercises and intermittent practice questions, which challenge readers at multiple levels to investigate and explore key statistical concepts
• Numerous interactive links designed to provide solutions to exercises and further information on crucial concepts
• Linkages that connect statistics to the rapidly growing field of data science
• Multiple discussions of various software systems, such as Microsoft Office Excel®, StatCrunch, and R, to develop and analyze data
• Areas of concern and/or contrasting points-of-view indicated through the use of “Caution” icons
Introductory Statistics and Analytics: A Resampling Perspective is an excellent primary textbook for courses in preliminary statistics as well as a supplement for courses in upper-level statistics and related fields, such as biostatistics and econometrics. The book is also a general reference for readers interested in revisiting the value of statistics.
A Resampling Perspective
Author: Peter C. Bruce
Publisher: John Wiley & Sons
Learn methods of data analysis and their application to real-world data sets
This updated second edition serves as an introduction to data mining methods and models, including association rules, clustering, neural networks, logistic regression, and multivariate analysis. The authors apply a unified “white box” approach to data mining methods and models. This approach is designed to walk readers through the operations and nuances of the various methods, using small data sets, so readers can gain an insight into the inner workings of the method under review. Chapters provide readers with hands-on analysis problems, representing an opportunity for readers to apply their newly-acquired data mining expertise to solving real problems using large, real-world data sets. Data Mining and Predictive Analytics, Second Edition:
• Offers comprehensive coverage of association rules, clustering, neural networks, logistic regression, multivariate analysis, and R statistical programming language
• Features over 750 chapter exercises, allowing readers to assess their understanding of the new material
• Provides a detailed case study that brings together the lessons learned in the book
• Includes access to the companion website, www.dataminingconsultant.com, with exclusive password-protected instructor content
Data Mining and Predictive Analytics, Second Edition will appeal to computer science and statistics students, as well as students in MBA programs, and chief executives.
Author: Daniel T. Larose, Chantal D. Larose
Publisher: John Wiley & Sons
Addresses the impacts of data mining on education and reviews applications in educational research, teaching, and learning
This book discusses the insights, challenges, issues, expectations, and practical implementation of data mining (DM) within educational mandates. The initial series of chapters offers a general overview of DM, Learning Analytics (LA), and data collection models in the context of educational research, while also defining and discussing data mining’s four guiding principles: prediction, clustering, rule association, and outlier detection. The next series of chapters showcases the pedagogical applications of Educational Data Mining (EDM) and features case studies drawn from Business, Humanities, Health Sciences, Linguistics, and Physical Sciences education that serve to highlight the successes and some of the limitations of data mining research applications in educational settings. The remaining chapters focus exclusively on EDM’s emerging role in helping to advance educational research, from identifying at-risk students and closing socioeconomic gaps in achievement to aiding in teacher evaluation and facilitating peer conferencing. This book features contributions from international experts in a variety of fields.
• Includes case studies where data mining techniques have been effectively applied to advance teaching and learning
• Addresses applications of data mining in educational research, including: social networking and education; policy and legislation in the classroom; and identification of at-risk students
• Explores Massive Open Online Courses (MOOCs) to study the effectiveness of online networks in promoting learning and understanding the communication patterns among users and students
• Features supplementary resources including a primer on foundational aspects of educational mining and learning analytics
Data Mining and Learning Analytics: Applications in Educational Research is written for both scientists in EDM and educators interested in using and integrating DM and LA to improve education and advance educational research.
Applications in Educational Research
Author: Samira ElAtia, Donald Ipperciel, Osmar R. Zaïane
Publisher: John Wiley & Sons
Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today. Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
• Understand how data science fits in your organization, and how you can use it for competitive advantage
• Treat data as a business asset that requires careful investment if you’re to gain real value
• Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
• Learn general concepts for actually extracting knowledge from data
• Apply data science principles when interviewing data science job candidates
What You Need to Know about Data Mining and Data-Analytic Thinking
Author: Foster Provost,Tom Fawcett
Publisher: "O'Reilly Media, Inc."
This book is a complete introduction to the power of R for marketing research practitioners. The text describes statistical models from a conceptual point of view with a minimal amount of mathematics, presuming only an introductory knowledge of statistics. Hands-on chapters accelerate the learning curve by asking readers to interact with R from the beginning. Core topics include the R language, basic statistics, linear modeling, and data visualization, which is presented throughout as an integral part of analysis. Later chapters cover more advanced topics yet are intended to be approachable for all analysts. These sections examine logistic regression, customer segmentation, hierarchical linear modeling, market basket analysis, structural equation modeling, and conjoint analysis in R. The text uniquely presents Bayesian models with a minimally complex approach, demonstrating and explaining Bayesian methods alongside traditional analyses for analysis of variance, linear models, and metric and choice-based conjoint analysis. With its emphasis on data visualization, model assessment, and development of statistical intuition, this book provides guidance for any analyst looking to develop or improve skills in R for marketing applications.
Author: Chris Chapman,Elea McDonnell Feit
Category: Business & Economics
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Data Mining, Inference, and Prediction
Author: Trevor Hastie,Robert Tibshirani,Jerome Friedman
Publisher: Springer Science & Business Media
The fourth edition of this popular graduate textbook, like its predecessors, presents a balanced and comprehensive treatment of both time and frequency domain methods with accompanying theory. Numerous examples using nontrivial data illustrate solutions to problems such as discovering natural and anthropogenic climate change, evaluating pain perception experiments using functional magnetic resonance imaging, and monitoring a nuclear test ban treaty. The book is designed as a textbook for graduate level students in the physical, biological, and social sciences and as a graduate level text in statistics. Some parts may also serve as an undergraduate introductory course. Theory and methodology are separated to allow presentations on different levels. In addition to coverage of classical methods of time series regression, ARIMA models, spectral analysis and state-space models, the text includes modern developments including categorical time series analysis, multivariate spectral methods, long memory series, nonlinear models, resampling techniques, GARCH models, ARMAX models, stochastic volatility, wavelets, and Markov chain Monte Carlo integration methods. This edition includes R code for each numerical example in addition to Appendix R, which provides a reference for the data sets and R scripts used in the text in addition to a tutorial on basic R commands and R time series. An additional file is available on the book’s website for download, making all the data sets and scripts easy to load into R.
With R Examples
Author: Robert H. Shumway,David S. Stoffer
Master text-taming techniques and build effective text-processing applications with R. About This Book: Develop all the relevant skills for building text-mining apps with R with this easy-to-follow guide. Gain in-depth understanding of the text mining process with lucid implementation in the R language. Example-rich guide that lets you gain high-quality information from text data. Who This Book Is For: If you are an R programmer, analyst, or data scientist who wants to gain experience in performing text data mining and analytics with R, then this book is for you. Exposure to working with statistical methods and language processing would be helpful. What You Will Learn: Get acquainted with some of the highly efficient R packages such as OpenNLP and RWeka to perform various steps in the text mining process. Access and manipulate data from different sources such as JSON and HTTP. Process text using regular expressions. Get to know the different approaches of tagging texts, such as POS tagging, to get started with text analysis. Explore different dimensionality reduction techniques, such as Principal Component Analysis (PCA), and understand their implementation in R. Discover the underlying themes or topics that are present in an unstructured collection of documents, using common topic models such as Latent Dirichlet Allocation (LDA). Build a baseline sentence-completing application. Perform entity extraction and named entity recognition using R. In Detail: Text Mining (or text data mining or text analytics) is the process of extracting useful and high-quality information from text by devising patterns and trends. R provides an extensive ecosystem to mine text through its many frameworks and packages.
Starting with basic information about the statistics concepts used in text mining, this book will teach you how to access, cleanse, and process text using the R language and will equip you with the tools and the associated knowledge about different tagging, chunking, and entailment approaches and their usage in natural language processing. Moving on, this book will teach you different dimensionality reduction techniques and their implementation in R. Next, we will cover pattern recognition in text data utilizing classification mechanisms, perform entity recognition, and develop an ontology learning framework. By the end of the book, you will develop a practical application from the concepts learned, and will understand how text mining can be leveraged to analyze the massively available data on social media. Style and approach: This book takes a hands-on, example-driven approach to the text mining process with lucid implementation in R.
Author: Ashish Kumar,Avinash Paul
Publisher: Packt Publishing Ltd