UC BUSINESS ANALYTICS R PROGRAMMING GUIDE

CREATING TEXT FEATURES WITH BAG-OF-WORDS, N-GRAMS, PARTS-OF-SPEECH AND MORE
02 Oct 2018

Historically, data has been available to us in the form of numeric features (e.g. customer age, income, household size) and categorical features (e.g. region, department, gender). However, as organizations collect new forms of information such as unstructured text, images, and social media posts, we need to understand how to convert this information into structured features for data science tasks such as customer segmentation or prediction. In this post, we explore a few fundamental feature engineering approaches for converting unstructured text into structured features.

MULTIVARIATE ADAPTIVE REGRESSION SPLINES
08 Sep 2018

Several previous tutorials (e.g. linear regression, logistic regression, regularized regression) discussed algorithms that are intrinsically linear. Many of these models can be adapted to nonlinear patterns in the data by manually adding model terms (e.g.
squared terms, interaction effects); however, to do so you must know the specific nature of the nonlinearity a priori. Alternatively, there are numerous algorithms that are inherently nonlinear. When using these models, the exact form of the nonlinearity does not need to be known explicitly or specified prior to model training; rather, these algorithms search for, and discover, the nonlinearities in the data that help maximize predictive accuracy. This tutorial discusses multivariate adaptive regression splines (MARS), an algorithm that essentially creates a piecewise linear model and provides an intuitive stepping stone into nonlinearity once you have grasped linear regression and other intrinsically linear models.

INTERPRETING MACHINE LEARNING MODELS WITH THE IML PACKAGE
01 Aug 2018

With machine learning interpretability growing in importance, several R packages designed to provide this capability are gaining in popularity. In recent blog posts I assessed lime for model-agnostic local interpretability and DALEX for both local and global machine learning explanation plots. This tutorial examines the iml package and assesses its machine learning interpretability functionality to help you determine whether it should become part of your preferred machine learning toolbox.

MODEL INTERPRETABILITY WITH DALEX
11 Jul 2018

As advanced machine learning algorithms gain acceptance across many organizations and domains, machine learning interpretability is growing in importance, helping to extract insight and clarity regarding how these algorithms perform and why one prediction is made over another. There are many methodologies for interpreting machine learning results (e.g. variable importance via permutation, partial dependence plots, local interpretable model-agnostic explanations), and many machine learning R packages implement their own versions of one or more of them.
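As a minimal sketch of one such methodology, permutation-based variable importance, the snippet below uses the DALEX package (assumed installed; the linear model and built-in mtcars data are illustrative stand-ins for any fitted model, and the function names follow the current DALEX API):

```r
# Minimal sketch: model-agnostic permutation variable importance with DALEX
# (assumes the DALEX package is installed; mtcars is a built-in dataset)
library(DALEX)

# Fit any model -- DALEX is agnostic to the underlying algorithm
fit <- lm(mpg ~ ., data = mtcars)

# Wrap the model, its feature data, and the target in an explainer object
explainer <- explain(fit,
                     data  = mtcars[, -1],
                     y     = mtcars$mpg,
                     label = "linear model")

# Permutation importance: shuffle each feature and measure the
# resulting drop in model performance
vi <- model_parts(explainer)
plot(vi)
```

Because the explainer only needs a predict method, data, and the target, the same two calls work unchanged for a random forest, a GBM, or any other model.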
However, some recent R packages focus purely on ML interpretability, agnostic to any specific ML algorithm, and are gaining popularity. One such package is DALEX, and this tutorial covers what the package does (and does not do) so that you can determine whether it should become part of your preferred machine learning toolbox.

GRADIENT BOOSTING MACHINES
14 Jun 2018

Gradient boosting machines (GBMs) are an extremely popular machine learning algorithm that has proven successful across many domains and is one of the leading methods for winning Kaggle competitions. Whereas random forests build an ensemble of deep, independent trees, GBMs build an ensemble of shallow, weak, successive trees, with each tree learning from and improving on the previous one. Combined, these many weak successive trees produce a powerful “committee” that is often hard to beat with other algorithms. This tutorial covers the fundamentals of GBMs for regression problems.
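As a minimal sketch of these fundamentals, assuming the gbm package is installed, the following fits a boosted regression on the built-in mtcars data; the hyperparameter values here are illustrative only, not tuned recommendations:

```r
# Minimal sketch: a gradient boosting machine for regression with the
# gbm package (assumed installed), using the built-in mtcars data
library(gbm)

set.seed(123)
fit <- gbm(
  mpg ~ .,
  data              = mtcars,
  distribution      = "gaussian",  # squared-error loss for regression
  n.trees           = 500,         # number of shallow successive trees
  interaction.depth = 2,           # each tree is kept weak and shallow
  shrinkage         = 0.05,        # learning rate
  n.minobsinnode    = 5,           # small data, so allow small leaf nodes
  cv.folds          = 5
)

# Pick the number of trees that minimizes cross-validated error,
# then predict with that many trees
best <- gbm.perf(fit, method = "cv", plot.it = FALSE)
pred <- predict(fit, mtcars, n.trees = best)
```

The shrinkage and tree-count settings trade off against each other: a smaller learning rate generally needs more trees, which is why cross-validation is used to choose the stopping point.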