www.projectpro.io Open in urlscan Pro
52.21.40.75  Public Scan

Submitted URL: https://www.projectpro.io/recipes/explain-use-of-explode-and-lateral-view-hive#:~:text=Explode%20returns%20a%20single%20co...
Effective URL: https://www.projectpro.io/recipes/explain-use-of-explode-and-lateral-view-hive
Submission: On June 15 via manual from US — Scanned from DE

Form analysis 0 forms found in the DOM

Text Content

Project Library

Data Science Projects
Big Data Projects
Learning Paths
Machine Learning Projects Data Science Projects Keras Projects NLP Projects
Neural Network Projects Deep Learning Projects Tensorflow Projects Banking and
Finance Projects
Apache Spark Projects PySpark Projects Apache Hadoop Projects Apache Hive
Projects AWS Projects Microsoft Azure Projects Apache Kafka Projects Spark SQL
Projects
Machine Learning MLOps Computer Vision Deep Learning Apache Spark Apache Hadoop
AWS NLP
Browse all Data Science Projects Show all Projects Browse all Big Data Projects
Show all Projects Browse all Learning Paths Show all Projects
 * Reviews
 * ExpertsNew

 * Project Path
   
   * Data Science Project Path
   * Big Data Project Path

 * Recipes
   
   * All Recipes
   * Recipes By Tag
   * Recipes By Company
 * Sign In
 * Start Learning


EXPLAIN THE USE OF EXPLODE AND LATERAL VIEW IN THE HIVE

The recipe explains how to use lateral view explode in the Hive.
Last Updated: 11 Apr 2023

Get access to Data Science projects View all Data Science projects
DATA SCIENCE PROJECTS IN PYTHON DATA CLEANING PYTHON DATA MUNGING MACHINE
LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS



TABLE OF CONTENTS

 * Explode and Lateral View in Hive Recipe Objective 
 * What is Explode Function in Hive?
 * What is the Lateral View in Hive? 
 * Explode vs. Lateral View in Hive 
 * Lateral View Explode in Hive
 * Lateral View Explode SQL
 * Lateral View Explode vs. Lateral View Outer Explode
 * A Step-by-Step Guide on the Use of Explode and Lateral View in Hive 
 * Examples of Using Explode and Lateral View in Hive
 * FAQs 


EXPLODE AND LATERAL VIEW IN HIVE RECIPE OBJECTIVE 

Explode and Lateral View are two essential functions in Hive used to manipulate
complex data types such as arrays, maps, and structs. These functions help
transform data into a format more suitable for analysis and processing. Let's
look at these functions and how they can be used in Hive queries.


WHAT IS EXPLODE FUNCTION IN HIVE?

Explode is a built-in Hive function that takes an array or a map as input and
returns a new row for each element in the array or key-value pair in the map. In
other words, it "explodes" the input into multiple rows. Let's look at an
example:

SELECT explode(array(1, 2, 3));

This query will return three rows, each containing a single integer value:

1

2

3

Similarly, if we have a map with key-value pairs, we can use explode to split it
into separate rows:

SELECT explode(map('a', 1, 'b', 2, 'c', 3));

This query will return three rows, each containing a single key-value pair:

a 1

b 2

c 3


WHAT IS THE LATERAL VIEW IN HIVE? 

Lateral View is another built-in Hive function that allows you to create new
rows from complex data types such as arrays and maps. It's often used in
conjunction with the Explode function to create multiple rows from a single
input row. Let's look at an example:

SELECT *

FROM mytable

LATERAL VIEW explode(myarray) exploded_array_table;

In this query, we're using the Lateral View function with the Explode function
to split an array column called "myarray" into multiple rows. The Lateral View
function allows us to treat the output of the Explode function as a table, which
we've given the alias "exploded_array_table." 




EXPLODE VS. LATERAL VIEW IN HIVE 

While both Explode, and Lateral View can be used to split arrays and maps into
separate rows, they have some key differences:

 * Explode returns a single column, whereas Lateral View returns multiple
   columns.

 * Explode can only be used in the SELECT clause, whereas Lateral View can be
   used in any clause that supports table aliases (such as FROM and JOIN).

 * Lateral View allows you to join the output of one table or function with
   another table or function.

There are several additional functions in Hive, including Lateral View Explode
in Hive, Lateral View Explode SQL, and Lateral View Outer Explode. Let's examine
these functions in more detail.


LATERAL VIEW EXPLODE IN HIVE

Lateral View Explode is another function in Hive that is used to split a column,
but instead of creating multiple rows, it creates multiple columns. This
function is beneficial when working with maps. It allows us to split a map
column into multiple columns, each containing one key-value pair from the
map.   


LATERAL VIEW EXPLODE SQL

Lateral View Explode is a SQL function that splits a column into multiple
columns in Hive. This function is used when dealing with complex data types such
as arrays and maps. It allows us to flatten the data and make it easier to
analyze. To use Lateral View Explode in SQL, we use the Lateral View keyword
followed by the Explode function. We then specify the column that we want to
explode. 


LATERAL VIEW EXPLODE VS. LATERAL VIEW OUTER EXPLODE

Lateral View Explode, and Lateral View Outer Explode are two similar functions
in Hive that are used to split a column into multiple columns. The main
difference between them is how they handle null values.

Lateral View Explode ignores null values and only creates a row for each
non-null element in the array or map. On the other hand, Lateral View Outer
Explode includes null values and creates a row with null values for each null
element in the array or map.

Practice makes a man perfect! Start working on these projects in data science
using Python and excel in your data science career.


A STEP-BY-STEP GUIDE ON THE USE OF EXPLODE AND LATERAL VIEW IN HIVE 


SYSTEM REQUIREMENTS :

 * Install ubuntu in the virtual machine click here

 * Install single node hadoop machine click here

 * Install apache hive click here


STEP 1 : PREPARE THE DATASET

Here we are using the employee related comma separated values (csv) dataset for
the create hive table in local.

Data of Output looks as follows:



Before create a table open the hive shell and we need to create a database as
follows :Open the hive shell as below



To create database using below queries :

Create database dezyre_db; use dezyre_db;

As follows below:




STEP 2 : CREATE A HIVE TABLE AND LOAD THE DATA INTO THE TABLE AND VERIFY THE
DATA

Here we are going create a hive table for loading the data from this table to
created bucketed tables, Use below to create a hive table:

CREATE TABLE employee ( employee_id int, company_id int, seniority int, salary
int, join_date string, quit_date string, dept string ) ROW FORMAT DELIMITED
fields terminated by ',' TBLPROPERTIES ("skip.header.line.count"="1");

the above query runs as follows :



Loading the data into hive table and verifying the data

load data local inpath '/home/bigdata/Downloads/empdata.csv' into table
employee;

Verifying the data by running the select query as follows



EXPLANATION OF VIEWS IN HIVE

Views are logical data structures that can be used to simplify queries by either
hiding the complexities such as joins, subqueries, and filters or by flattening
the data. Hive views do not store data. Once the Hive view is created, its
schema is frozen immediately. Subsequent changes to the underlying tables will
not be reflected in the view’s schema.


STEP 3 : CREATING A VIEW

Here we are creating a view using select statement from table employee which
employees has greater than 1 lakh salary

hive> CREATE VIEW emp_100000 AS SELECT * FROM employee WHERE salary>100000;

Output of the above query :



We can verify using below queries whether the view is created or not and format
of the view as below:

hive> show create table emp_100000;

Output of the above query as follows :



hive> desc formatted emp_100000;

Output of the above query as follows :




STEP 4 : ALTER A VIEW PROPERTIES USING ALTER STATEMENT

hive> ALTER VIEW emp_100000 SET TBLPROPERTIES ('comment' = 'This is my view
table');

Output of the above query as follows :




STEP 5 : DROP A VIEW DROP STATEMENT

Here we are going to drop the view to drop a view run the below query is as
follows:

hive> DROP VIEW emp_100000;

Output of the above query as follows :




STEP 6 : USAGE OF EXPLODE AND LATERAL VIEW

The LATERAL VIEW statement is used with user-defined table generating functions
such as EXPLODE() to flatten the map or array type of a column.The explode
function can be used on both ARRAY and MAP with LATERAL VIEW.

Before performing exploring and lateral view functions we will create table and
insert data into it To create a table in hive with array data type column is as
follows

hive> create table std_course_details( std_id int, stud_name string, location
string, course array);

Query runs as follows below:



To insert data into above table using insert statement

hive> INSERT INTO TABLE std_course_details VALUES
(1,'vamshi','hyd',array('hive','hadoop','spark')),(2,'chandana','bangalore',array('reactjs','javascript')),(3,'Divya','pune',array('python','pyspark','airflow','spark')),(4,'srikanth','pune',array('java','spring
boot')),(5,'sreethan','pune',array('c','c++'));

To verify the insert data into table



 * Explode function:

The explode function explodes an array to multiple rows. Returns a row-set with
a single column (col), one row for each element from the array.

Explore More Data Science and Machine Learning Projects for Practice. Fast-Track
Your Career Transition with ProjectPro

Here we are going to split array column values into rows by running the below
query :

hive > select explode(course) from std_course_details;

the above query runs as follows



 * Lateral View :

Lateral view explodes the array data into multiple rows. In other words, lateral
view expands the array into rows.

When you use a lateral view along with the explode function, you will get the
result something like below.

hive> select std_id,stud_name,location,courses from std_course_details LATERAL
VIEW explode(course) courses_list as courses;

Output of the above query :




EXAMPLES OF USING EXPLODE AND LATERAL VIEW IN HIVE

Let's look at examples of using Explode and Lateral View in Hive.


EXAMPLE 1: HIVE LATERAL VIEW EXPLODE EXAMPLE 

Suppose we have a Hive table that contains information about website visitors
and their browsing history. The browsing_history column contains a map of
website URLs and the corresponding timestamps when they were visited. We want to
extract the browsing history into separate rows to perform analysis on the
website level. Here's how we can use lateral view explode to achieve this:

SELECT visitor_id, website_url, timestamp 

FROM visitor 

LATERAL VIEW EXPLODE(browsing_history) websiteTable AS website_url, timestamp;

The lateral view explode function converts the browsing_history column into
separate rows in the above query. The website URLs and corresponding timestamps
are extracted into new columns called website_url and timestamp, respectively.
The result set is then filtered to include only the visitor_id, website_url, and
timestamp columns.


EXAMPLE 2: USING EXPLODE TO SPLIT AN ARRAY COLUMN

Suppose we have a table called "mytable" with an array column called "myarray".
We can use the Explode function to split the array into multiple rows: 

SELECT explode(myarray)

FROM mytable;

This query will return one row for each element in the "myarray" column.


EXAMPLE 3: USING LATERAL VIEW WITH EXPLODE TO SPLIT AN ARRAY COLUMN

Suppose we have the same table as before but want to split the array column and
include others in the output. We can use the Lateral View function to achieve
this:

SELECT t.col1, t.col2, exploded_array_table.*

FROM mytable t

LATERAL VIEW explode(myarray) exploded_array_table;

In this query, we're selecting the columns "col1" and "col2" from the "mytable"
table, and then using Lateral View with Explode to split the "myarray" column
into multiple rows. We've given the output of the Lateral View function the
alias "exploded_array_table", which we can then use to select all columns using
the "*" wildcard.


EXAMPLE 3: USING HIVE LATERAL VIEW EXPLODE WITH WHERE CLAUSE

Suppose we have a table with a map column called "mymap", and we only want to
select rows where the value associated with the key "key1" is greater than 10.
We can use Lateral View with a Where clause to achieve this:

SELECT t.*

FROM mytable t

LATERAL VIEW explode(mymap) exploded_map_table AS k, v

WHERE k = 'key1' AND v > 10;

In this query, we're using Lateral View with Explode to split the "mymap" column
into key-value pairs and then giving the output the aliases "k" and "v." We're
then using a Where clause to filter the output only to include rows where the
key is "key1" and the value is greater than 10.

Looking for end to end solved machine learning projects? Check out ProjectPro's
repository of solved Machine Learning Projects in R with source code!


EXAMPLE 4: USING LATERAL VIEW EXPLODE WITH MULTIPLE COLUMNS 

In addition to splitting a single column into multiple columns, the Lateral View
function in Hive can also be used to split multiple columns into multiple rows.
Here's an example query that uses Lateral View with two columns:

SELECT t.col1, t.col2, exploded_array_table.*

FROM mytable t

LATERAL VIEW explode(t.myarray) exploded_array_table

LATERAL VIEW explode(t.myotherarray) exploded_other_array_table;

In this query, we're selecting the columns "col1" and "col2" from the "mytable"
table, and then using Lateral View with Explode to split both the "myarray" and
"myotherarray" columns into multiple rows. We've given the output of each
Lateral View function a separate alias, which we can then use to select all
columns using the "*" wildcard.

Learn more about using explode and lateral view in Hive by working on several
real-world industry-grade projects by ProjectPro. Explore ProjectPro Repository
to access over 270+ projects based on data science and big data. 


FAQS 


1. WHAT IS THE USE OF LATERAL VIEW?

In Apache Hive, the Lateral View is a table operator that processes complex data
types like arrays and maps. It allows users to combine the columns of a table
with the elements of a complex data type and generate a new table for further
analysis. The Lateral View function helps process nested data structures like
JSON.


2. WHAT IS THE DIFFERENCE BETWEEN EXPLODE AND INLINE IN HIVE?

In Hive, both Explode and Inline are used to unnest data structures like arrays
and maps. However, the main difference is that Explode generates a new row for
each element in the input array or map. In contrast, Inline generates a single
row with multiple columns.


3. HOW TO EXPLODE DATA IN HIVE?

To use the Explode function in Hive, you can follow these steps:

 * First, create a table with a column containing a nested data structure like
   an array or map.

 * Next, use the Lateral View operator to combine the table's columns with the
   data structure's elements.

 * Finally, apply the Explode function to generate a new row for each element in
   the data structure.


4. HOW DOES EXPLODE WORK IN SQL?

Explode is not a built-in function in standard SQL. However, some SQL database
systems, like Apache Hive and Apache Spark SQL, have implemented the Explode
function to unnest data structures. In these systems, Explode works by
generating a new row for each element in a nested data structure like an array
or map.


5. WHAT IS THE USE OF EXPLODE?

The Explode function is used to unnest data structures like arrays and maps and
generate a new row for each element in the data structure. This is useful for
processing nested data structures and further analyzing the individual elements.
The Explode function is commonly used in data processing frameworks like Apache
Hive and Apache Spark.



 

Join Millions of Satisfied Developers and Enterprises to Maximize Your
Productivity and ROI with ProjectPro - Read ProjectPro Reviews Now!





What Users are saying..



RAY HAN

Tech Leader | Stanford / Yale University


I think that they are fantastic. I attended Yale and Stanford and have worked at
Honeywell,Oracle, and Arthur Andersen(Accenture) in the US. I have taken Big
Data and Hadoop,NoSQL, Spark, Hadoop... Read More

RELEVANT PROJECTS

MACHINE LEARNING PROJECTS

DATA SCIENCE PROJECTS

PYTHON PROJECTS FOR DATA SCIENCE

DATA SCIENCE PROJECTS IN R

MACHINE LEARNING PROJECTS FOR BEGINNERS

DEEP LEARNING PROJECTS

NEURAL NETWORK PROJECTS

TENSORFLOW PROJECTS

NLP PROJECTS

KAGGLE PROJECTS

IOT PROJECTS

BIG DATA PROJECTS

HADOOP REAL-TIME PROJECTS EXAMPLES

SPARK PROJECTS

DATA ANALYTICS PROJECTS FOR STUDENTS

YOU MIGHT ALSO LIKE

DATA SCIENCE TUTORIAL

DATA SCIENTIST SALARY

HOW TO BECOME A DATA SCIENTIST

DATA ANALYST VS DATA SCIENTIST

DATA SCIENTIST RESUME

DATA SCIENCE PROJECTS FOR BEGINNERS

MACHINE LEARNING ENGINEER

MACHINE LEARNING PROJECTS FOR BEGINNERS

DATASETS

PANDAS DATAFRAME

MACHINE LEARNING ALGORITHMS

REGRESSION ANALYSIS

MNIST DATASET

DATA SCIENCE INTERVIEW QUESTIONS

PYTHON DATA SCIENCE INTERVIEW QUESTIONS

SPARK INTERVIEW QUESTIONS

HADOOP INTERVIEW QUESTIONS

DATA ANALYST INTERVIEW QUESTIONS

MACHINE LEARNING INTERVIEW QUESTIONS

AWS VS AZURE

HADOOP ARCHITECTURE

SPARK ARCHITECTURE

RELEVANT PROJECTS

BUILD REGRESSION MODELS IN PYTHON FOR HOUSE PRICE PREDICTION

In this Machine Learning Regression project, you will build and evaluate various
regression models in Python for house price prediction.
View Project Details

--------------------------------------------------------------------------------

LOAN ELIGIBILITY PREDICTION USING GRADIENT BOOSTING CLASSIFIER

This data science in python project predicts if a loan should be given to an
applicant or not. We predict if the customer is eligible for loan based on
several factors like credit score and past history.
View Project Details

--------------------------------------------------------------------------------

DEPLOY TRANSFORMER BART MODEL FOR TEXT SUMMARIZATION ON GCP

Learn to Deploy a Machine Learning Model for the Abstractive Text Summarization
on Google Cloud Platform (GCP)
View Project Details

--------------------------------------------------------------------------------

NATURAL LANGUAGE PROCESSING CHATBOT APPLICATION USING NLTK FOR TEXT
CLASSIFICATION

In this NLP AI application, we build the core conversational engine for a
chatbot. We use the popular NLTK text classification library to achieve this.
View Project Details

--------------------------------------------------------------------------------

BUILD A LOGISTIC REGRESSION MODEL IN PYTHON FROM SCRATCH

Regression project to implement logistic regression in python from scratch on
streaming app data.
View Project Details

--------------------------------------------------------------------------------

PYCARET PROJECT TO BUILD AND DEPLOY AN ML APP USING STREAMLIT

In this PyCaret Project, you will build a customer segmentation model with
PyCaret and deploy the machine learning application using Streamlit.
View Project Details

--------------------------------------------------------------------------------

BUILD A MULTI CLASSTEXT CLASSIFICATION MODEL USING NAIVE BAYES

Implement the Naive Bayes Algorithm to build a multi class text classification
model in Python.
View Project Details

--------------------------------------------------------------------------------

AWS PROJECT TO BUILD AND DEPLOY LSTM MODEL WITH SAGEMAKER

In this AWS Sagemaker Project, you will learn to build a LSTM model on Sagemaker
for sales forecasting while analyzing the impact of weather conditions on Sales.
View Project Details

--------------------------------------------------------------------------------

A/B TESTING APPROACH FOR COMPARING PERFORMANCE OF ML MODELS

The objective of this project is to compare the performance of BERT and
DistilBERT models for building an efficient Question and Answering system. Using
A/B testing approach, we explore the effectiveness and efficiency of both models
and determine which one is better suited for Q&A tasks.
View Project Details

--------------------------------------------------------------------------------

BUILD MULTI CLASS TEXT CLASSIFICATION MODELS WITH RNN AND LSTM

In this Deep Learning Project, you will use the customer complaints data about
consumer financial products to build multi-class text classification models
using RNN and LSTM.
View Project Details

--------------------------------------------------------------------------------

SUBSCRIBE TO RECIPES

×
Please share your company email to get customized projects




Get Access CONTINUE

SIGN UP TO VIEW RECIPE

×
Please share your company email to get customized projects




View Recipe CONTINUE

Trending Project Categories

 * Machine Learning Projects
 * Data Science Projects
 * Deep Learning Projects
 * Big Data Projects
 * Apache Hadoop Projects
 * Apache Spark Projects
 * Show more
   NLP Projects IoT Projects Neural Network Projects Tensorflow Projects PySpark
   Projects Spark Streaming Projects Python Projects for Data Science Microsoft
   Azure Projects GCP Projects AWS Projects Show less

Trending Projects

 * Walmart Sales Forecasting Data Science Project
 * BigMart Sales Prediction ML Project
 * Music Recommender System Project
 * Credit Card Fraud Detection Using Machine Learning
 * Resume Parser Python Project for Data Science
 * Time Series Forecasting Projects
 * Show more
   Twitter Sentiment Analysis Project Credit Score Prediction Machine Learning
   Retail Price Optimization Algorithm Machine Learning Store Item Demand
   Forecasting Deep Learning Project Human Activity Recognition ML Project
   Visualize Website Clickstream Data Handwritten Digit Recognition Code Project
   Anomaly Detection Projects PySpark Data Pipeline Project Show less

Trending Blogs

 * Machine Learning Projects for Beginners with Source Code
 * Data Science Projects for Beginners with Source Code
 * Big Data Projects for Beginners with Source Code
 * IoT Projects for Beginners with Source Code
 * Data Analyst vs Data Scientist
 * Data Science Interview Questions and Answers
 * Show more
   Hadoop Interview Questions and Answers Spark Interview Questions and Answers
   AWS vs Azure Types of Analytics Hadoop Architecture Spark Architecture
   Machine Learning Algorithms Data Partitioning in Spark Datasets for Machine
   Learning Big Data Tools Comparison Compare The Best Big Data Tools Show less

Trending Recipes

 * Search for a Value in Pandas DataFrame
 * Pandas Create New Column based on Multiple Condition
 * LSTM vs GRU
 * Plot ROC Curve in Python
 * Python Upload File to Google Drive
 * Optimize Logistic Regression Hyper Parameters
 * Show more
   Drop Out Highly Correlated Features in Python How to Split Data and Time in
   Python Pandas Replace Multiple Values Convert Categorical Variable to Numeric
   Pandas Classification Report Python RandomizedSearchCV Grid Search Decision
   Tree Catboost Hyperparameter Tuning Pandas Normalize Column Show less

Trending Tutorials

 * PCA in Machine Learning Tutorial
 * PySpark Tutorial
 * Hive Commands Tutorial
 * MapReduce in Hadoop Tutorial
 * Apache Hive Tutorial -Tables
 * Linear Regression Tutorial
 * Show more
   Apache Spark Tutorial Evaluate Performance Metrics for Machine Learning
   Models K-Means Clustering Tutorial Sqoop Tutorial R Import Data From Website
   Install Spark on Linux Data.Table Packages in R Apache ZooKeeper Hadoop
   Tutorial Hadoop Tutorial Show less

PROJECTPRO

© 2023 Iconiq Inc.

About us

Contact us

Privacy policy

User policy

Write for ProjectPro