Data Visualization in R - GeeksforGeeks (2024)

Data visualization is the technique of delivering insights from data using visual cues such as graphs, charts, and maps. It makes large quantities of data easier to understand intuitively and thereby supports better decisions.

Data Visualization in R Programming Language

Popular data visualization tools include Tableau, Plotly, R, Google Charts, Infogram, and Kibana. These platforms differ in capabilities, functionality, and use cases, and each requires a different skill set. This article discusses the use of R for data visualization.

R is a language designed for statistical computing, graphical data analysis, and scientific research. It is often preferred for data visualization because its packages offer flexibility while requiring little code.

Consider the following airquality data set for visualization in R:

Ozone  Solar.R  Wind  Temp  Month  Day
41     190      7.4   67    5      1
36     118      8.0   72    5      2
12     149      12.6  74    5      3
18     313      11.5  62    5      4
NA     NA       14.3  56    5      5
28     NA       14.9  66    5      6
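
The rows above are the first six observations of the data set. A minimal sketch for reproducing them, assuming the copy of airquality that ships with base R in the datasets package:

```r
# airquality ships with base R in the 'datasets' package
data(airquality)

# First six rows, matching the table above
print(head(airquality))

# Number of rows and columns
print(dim(airquality))

# Number of missing values (NA) in each column
print(colSums(is.na(airquality)))
```

Checking for NA values early is useful because several of the plotting functions used later silently skip them.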

Types of Data Visualizations

Some of the various types of visualizations offered by R are:

Bar Plot

There are two types of bar plots, horizontal and vertical, which represent data points as horizontal or vertical bars whose lengths are proportional to the value of the data item. They are generally used to plot a continuous variable against a categorical one. Setting the horiz parameter of barplot() to TRUE or FALSE gives a horizontal or vertical bar plot, respectively.

Example 1:

R

# Horizontal Bar Plot for
# Ozone concentration in air
barplot(airquality$Ozone,
        main = 'Ozone Concentration in air',
        xlab = 'ozone levels', horiz = TRUE)

Output: [Figure 1: horizontal bar plot of ozone concentration]

Example 2:

R

# Vertical Bar Plot for
# Ozone concentration in air
# (the values run along the y-axis, so the label goes in ylab)
barplot(airquality$Ozone, main = 'Ozone Concentration in air',
        ylab = 'ozone levels', col = 'blue', horiz = FALSE)

Output: [Figure 2: vertical bar plot of ozone concentration]

Bar plots are used for the following scenarios:

  • To perform a comparative study between the various data categories in the data set.
  • To analyze the change of a variable over time in months or years.
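
The first scenario, comparing categories, can be sketched as follows. This is an illustrative example rather than part of the original walkthrough: it assumes that Month is treated as the category and that the mean ozone reading is the quantity being compared.

```r
data(airquality)

# Mean ozone per month, ignoring missing readings
mean_ozone <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)

# One bar per month (5 = May ... 9 = September)
barplot(mean_ozone,
        names.arg = month.abb[as.integer(names(mean_ozone))],
        main = "Mean ozone concentration by month",
        xlab = "Month", ylab = "Ozone (ppb)", col = "steelblue")
```

Aggregating first gives one bar per category, which is usually easier to read than one bar per observation.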

Histogram

A histogram resembles a bar chart in that it uses bars of varying height to represent the distribution of data. In a histogram, however, continuous values are grouped into consecutive intervals called bins, and the size of these bins can be varied.

Example:

R

# Histogram for Maximum Daily Temperature
data(airquality)
hist(airquality$Temp,
     main = "La Guardia Airport's\nMaximum Temperature (Daily)",
     xlab = "Temperature (Fahrenheit)",
     xlim = c(50, 125), col = "yellow",
     freq = TRUE)

Output: [Figure 3: histogram of daily maximum temperature]

For a histogram, the xlim parameter specifies the range of x-axis values to display.
The freq parameter controls the y-axis: when set to TRUE, the bars show the frequency (count) of values in each bin; when set to FALSE, they show probability densities, scaled so that the total area of the histogram adds up to one.
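
The difference between the two settings of freq can be checked numerically. The sketch below assumes the default binning chosen by hist() and uses plot = FALSE to get the computed bins without drawing anything.

```r
data(airquality)

# Compute the histogram without drawing it
h <- hist(airquality$Temp, plot = FALSE)

# With freq = TRUE the bar heights are counts
total_count <- sum(h$counts)

# With freq = FALSE the bar heights are densities;
# height times bin width, summed over all bins, gives the total area
total_area <- sum(h$density * diff(h$breaks))

print(total_count)  # number of temperature readings
print(total_area)   # equals 1 up to floating-point error

# Density-scaled version of the histogram
hist(airquality$Temp, freq = FALSE, col = "yellow",
     main = "Daily maximum temperature",
     xlab = "Temperature (Fahrenheit)")
```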

Histograms are used in the following scenarios:

  • To verify an equal and symmetric distribution of the data.
  • To identify deviations from expected values.

Box Plot

The statistical summary of the given data is presented graphically using a boxplot. A boxplot depicts information like the minimum and maximum data point, the median value, first and third quartile, and interquartile range.

Example:

R

# Box plot for average wind speed
data(airquality)
boxplot(airquality$Wind,
        main = "Average wind speed\nat La Guardia Airport",
        xlab = "Miles per hour", ylab = "Wind",
        col = "orange", border = "brown",
        horizontal = TRUE, notch = TRUE)

Output: [Figure 4: horizontal notched box plot of wind speed]

Multiple box plots can also be generated at once through the following code:

Example:

R

# Multiple Box plots, each representing
# an Air Quality Parameter
# (R indexes from 1, so columns 1 to 4 are selected)
boxplot(airquality[, 1:4],
        main = 'Box Plots for Air Quality Parameters')

Output: [Figure 5: box plots of Ozone, Solar.R, Wind, and Temp]

Box plots are used in the following scenarios:

  • To give a comprehensive statistical description of the data through a visual cue.
  • To identify outliers, that is, points that lie beyond the whiskers (by default, more than 1.5 times the interquartile range away from the box).
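
The numbers behind a box plot can also be read off directly. A minimal sketch, assuming the boxplot.stats() function from the grDevices package (part of base R) and its default whisker rule:

```r
data(airquality)

# boxplot.stats() returns the numbers a box plot is drawn from
# (missing values are dropped)
s <- boxplot.stats(airquality$Ozone)

# stats: lower whisker, lower hinge, median, upper hinge, upper whisker
print(s$stats)

# out: observations that lie beyond the whiskers
print(s$out)
```

Printing s$out is a quick way to list the outlier values instead of estimating them from the plot.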

Scatter Plot

A scatter plot is composed of many points on a Cartesian plane. Each point denotes the value taken by two parameters and helps us easily identify the relationship between them.

Example:

R

# Scatter plot for Ozone Concentration per month
data(airquality)
plot(airquality$Ozone, airquality$Month,
     main = "Scatterplot Example",
     xlab = "Ozone Concentration in parts per billion",
     ylab = "Month of observation", pch = 19)

Output: [Figure 6: scatter plot of ozone concentration against month]

Scatter Plots are used in the following scenarios:

  • To show whether an association exists between bivariate data.
  • To measure the strength and direction of such a relationship.
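
Strength and direction can be quantified alongside the plot. The sketch below is an illustration under assumptions not made in the original: it pairs Ozone with Temp and uses the Pearson correlation coefficient as the measure.

```r
data(airquality)

# Pearson correlation between ozone and temperature,
# using only the rows where both values are present
r <- cor(airquality$Ozone, airquality$Temp, use = "complete.obs")
print(r)

# The sign of r gives the direction, its magnitude the strength
plot(airquality$Temp, airquality$Ozone,
     main = "Ozone against temperature",
     xlab = "Temperature (Fahrenheit)",
     ylab = "Ozone (ppb)", pch = 19)
```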

Heat Map

A heatmap is a graphical representation of data that uses colors to visualize the values of a matrix. The heatmap() function is used to plot a heatmap.

Syntax: heatmap(data)

Parameters: data: a numeric matrix whose rows and columns hold the values to plot

Return: This function draws a heatmap.

R

# Set seed for reproducibility
set.seed(110)
# Create example data (25 values fill the 5 x 5 matrix)
data <- matrix(rnorm(25, 0, 5), nrow = 5, ncol = 5)
# Column and row names
colnames(data) <- paste0("col", 1:5)
rownames(data) <- paste0("row", 1:5)
# Draw a heatmap
heatmap(data)

Output: [Figure 7: heatmap of the 5 x 5 example matrix]
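
heatmap() is not limited to random matrices. As a rough sketch, assuming the airquality data used in the earlier sections, the correlation matrix of its four measurement columns can be drawn the same way. The choice of a correlation matrix is an illustration and not part of the original example.

```r
data(airquality)

# Pairwise correlations of the four measurement columns,
# skipping missing values pair by pair
cm <- cor(airquality[, 1:4], use = "pairwise.complete.obs")
print(round(cm, 2))

# symm = TRUE treats the matrix as symmetric;
# scale = "none" plots the correlations as they are
heatmap(cm, symm = TRUE, scale = "none",
        main = "Correlations in airquality")
```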

Map visualization in R

Here we use the maps package to visualize and display geographical maps in R.

install.packages("maps")

Link of the dataset: worldcities.csv

R

# Read dataset and convert it into
# Dataframe
data <- read.csv("worldcities.csv")
df <- data.frame(data)
# Load the required libraries
library(maps)
map(database = "world")
# marking points on map
# (x is the longitude, y is the latitude)
points(x = df$lng[1:500], y = df$lat[1:500], col = "Red")

Output: [Figure 8: world map with the first 500 cities marked in red]

3D Graphs in R

Here we use the persp() function, which creates 3D surfaces in perspective view by drawing perspective plots of a surface over the x-y plane.

Syntax: persp(x, y, z)

Parameters: x and y are vectors defining the locations of the grid lines along the x- and y-axes, and z is a matrix holding the height of the surface at each (x, y) location.

Return Value: persp() returns the viewing transformation matrix for projecting 3D coordinates (x, y, z) into the 2D plane using homogeneous 4D coordinates (x, y, z, t).

R

# Height of a cone above the x-y plane
cone <- function(x, y){
  sqrt(x ^ 2 + y ^ 2)
}
# prepare variables.
x <- y <- seq(-1, 1, length.out = 30)
z <- outer(x, y, cone)
# plot the 3D surface
# Adding Titles and Labeling Axes to Plot
persp(x, y, z,
      main = "Perspective Plot of a Cone",
      zlab = "Height",
      theta = 30, phi = 15,
      col = "orange", shade = 0.4)

Output: [Figure 9: perspective plot of a cone]
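
The viewing transformation matrix that persp() returns can be used to add further elements to the plot. A minimal sketch, assuming the trans3d() helper from the grDevices package (part of base R), that marks the tip of the cone:

```r
# Same cone surface as in the example above
x <- y <- seq(-1, 1, length.out = 30)
z <- outer(x, y, function(x, y) sqrt(x ^ 2 + y ^ 2))

# Keep the viewing transformation matrix returned by persp()
pmat <- persp(x, y, z, theta = 30, phi = 15,
              col = "orange", shade = 0.4)

# Project the 3D point (0, 0, 0), the tip of the cone,
# into the 2D coordinates of the current plot
tip <- trans3d(0, 0, 0, pmat)

# Mark the projected point on the existing plot
points(tip, col = "red", pch = 19)
```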

Advantages of Data Visualization in R:

R has the following advantages over other tools for data visualization:

  • R offers a broad collection of visualization libraries along with extensive online guidance on their usage.
  • R also offers data visualization in the form of 3D models and multipanel charts.
  • Through R, we can easily customize our data visualization by changing axes, fonts, legends, annotations, and labels.
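
The multipanel and customization points can be illustrated with base graphics alone. The following is a sketch under the assumption that a simple one-row, two-column layout is wanted; the titles, labels, and colors are illustrative choices.

```r
data(airquality)

# Split the plotting device into 1 row and 2 columns,
# saving the previous settings so they can be restored
old_par <- par(mfrow = c(1, 2))

# Left panel: histogram with custom title, axis label, and color
hist(airquality$Temp, main = "Temperature",
     xlab = "Fahrenheit", col = "yellow")

# Right panel: box plot with a custom axis label and a legend
boxplot(airquality$Wind, main = "Wind speed",
        ylab = "Miles per hour", col = "orange")
legend("topright", legend = "Wind", fill = "orange")

# Restore the previous layout
par(old_par)
```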

Disadvantages of Data Visualization in R:

R also has the following disadvantages:

  • R is only preferred for data visualization when done on an individual standalone server.
  • Data visualization using R is slow for large amounts of data as compared to other counterparts.

Application Areas:

  • Presenting analytical conclusions of the data to the non-analyst departments of your company.
  • Health monitoring devices use data visualization to track any anomaly in blood pressure, cholesterol and others.
  • To discover repeating patterns and trends in consumer and marketing data.
  • Meteorologists use data visualization for assessing prevalent weather changes throughout the world.
  • Real-time maps and geo-positioning systems use visualization for traffic monitoring and estimating travel time.



Last Updated : 26 Apr, 2022




FAQs

How is R used in data visualization?

R offers a broad collection of visualization libraries in addition to extensive online guidance on their usage. R also offers data visualization in the form of 3D models and multipanel charts. Through R, we can easily customize our data visualization by changing axes, fonts, legends, annotations, and labels.

What is the popular data visualization library in R?

Esquisse

Esquisse is a data visualization tool in R that allows you to create detailed data visualizations using the ggplot2 package. You can create all sorts of scatter plots, histograms, line charts, bar charts, pie charts, error bars, box plots, multiple axes, sparklines, dendrograms, 3-D charts, etc.

Is R better than Python for visualization?

If you're passionate about the statistical calculation and data visualization portions of data analysis, R could be a good fit for you. If, on the other hand, you're interested in becoming a data scientist and working with big data, artificial intelligence, and deep learning algorithms, Python would be the better fit.

Which R package should you use for data visualization?

  • ggplot2. ggplot2 is a widely used package for creating elegant and customizable data visualizations. ...
  • plotly. plotly is a powerful package for creating interactive and dynamic data visualizations in R. ...
  • lattice. ...
  • ggvis. ...
  • Highcharter. ...
  • leaflet. ...

Is R good for data visualization?

R is only preferred for data visualization when done on an individual standalone server. Data visualization using R is slow for large amounts of data as compared to other counterparts.

What is the most common data visualization tool?

Top 11 Data Visualization Tools
  • Microsoft Excel.
  • Microsoft Power BI.
  • Google Charts.
  • Tableau.
  • Zoho Analytics.
  • Datawrapper.
  • Qlik Sense.
  • FusionCharts.

What are the two libraries used for data visualization?

Matplotlib and Seaborn are python libraries that are used for data visualization. They have inbuilt modules for plotting different graphs. While Matplotlib is used to embed graphs into applications, Seaborn is primarily used for statistical graphs.

Why is R preferred over Python?

Both languages excel in different areas; Python is known for its versatility and robustness, making it suitable for a wide range of applications beyond statistics, while R is specifically designed for statistical computing and offers a plethora of specialized packages tailored for data analysis, such as R for data ...

Why should I use R instead of Python?

Python is much more straightforward, using syntax closer to written English to execute commands. However, R makes it easier to visualize and manipulate data if you have other languages under your belt. It's statistics-based, so the syntax here is more straightforward for analysis.

Can Python do everything R can?

R can't be used in production code because of its focus on research, while Python, a general-purpose language, can be used both for prototyping and as a product itself. Python also runs faster than R, despite its GIL problems.

Is R Shiny a data visualization tool?

R Shiny is a web application framework that allows you to build interactive dashboards, data visualizations, and analytical tools using the R programming language.

What are the three advantages of using R for graphical visualizations?

Advantages of Data Visualization in R
  • Understanding. It can be more attractive to look at the business. And, it is easier to understand through graphics and charts than a written document with text and numbers. ...
  • Efficiency. Its applications allow us to display a lot of information in a small space. ...
  • Location.

Why is R used in data visualization?

Some of the specific benefits of using R for data visualization include: Wide range of options: R has a wide range of libraries and packages for creating different types of visualizations, including static plots, interactive plots, and maps.

What is R in graphical analysis?

Graphical Data Analysis (GDA) is a powerful tool that helps us to visualize and explore complex data sets. R is a popular programming language for GDA as it has a wide range of built-in functions for producing high-quality visualizations.

Is R used for data modeling?

The R language has built-in support for data modeling and graphics. The following example shows how R can generate and plot a linear model with residuals.
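
A minimal sketch of such an example, assuming the airquality data used earlier and a simple model of ozone as a linear function of temperature (the choice of variables is illustrative):

```r
data(airquality)

# Fit a linear model; rows with missing values are dropped by default
fit <- lm(Ozone ~ Temp, data = airquality)
print(coef(fit))

# Show the data with the fitted line next to the residuals
old_par <- par(mfrow = c(1, 2))

plot(airquality$Temp, airquality$Ozone,
     main = "Ozone against temperature",
     xlab = "Temperature (Fahrenheit)", ylab = "Ozone (ppb)")
abline(fit, col = "red")

plot(fitted(fit), resid(fit),
     main = "Residuals against fitted values",
     xlab = "Fitted ozone", ylab = "Residual")
abline(h = 0, lty = 2)

par(old_par)
```

A visible pattern in the residual panel would suggest that a straight line does not fully describe the relationship.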


Author: Pres. Lawanda Wiegand
