ggraph: ggplot for graphs


A graph, a collection of nodes connected by edges, is just data. Whether it’s a social network (where nodes are people, and edges are friend relationships), or a decision tree (where nodes are branch criteria or values, and edges are decisions), the nature of the graph is easily represented in a data object. It might be represented as a matrix (where rows and columns are nodes, and elements mark whether an edge between them is present) or as a data frame (where each row is an edge, with columns representing the pair of connected nodes).
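As a rough illustration of those two representations, here is a minimal base R sketch. The four-person network and all of its names are made up for this example; it simply builds the edge-list data frame, derives the matching adjacency matrix, and recovers the edges again.

```r
# Hypothetical four-person friendship network (names are illustrative only)
edges <- data.frame(
  from = c("Ann", "Ann", "Bob", "Cat"),
  to   = c("Bob", "Cat", "Cat", "Dan"),
  stringsAsFactors = FALSE
)

# Adjacency matrix: rows and columns are nodes, a 1 marks an edge
nodes <- sort(unique(c(edges$from, edges$to)))
adj <- matrix(0L, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))
adj[cbind(edges$from, edges$to)] <- 1L
adj[cbind(edges$to, edges$from)] <- 1L  # friendships are undirected

# Recover the edge list from the upper triangle of the matrix
idx <- which(adj == 1L & upper.tri(adj), arr.ind = TRUE)
recovered <- data.frame(from = nodes[idx[, "row"]],
                        to   = nodes[idx[, "col"]])
```

Either form carries the same information; the data frame is usually the more convenient one when edges have attributes of their own.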

The trick comes in how you represent a graph visually; there are many different options, each with strengths and weaknesses when it comes to interpretation. A graph with many nodes and edges may become an unintelligible hairball without careful arrangement, and including directionality or other attributes of edges or nodes can reveal insights about the data that wouldn’t be apparent otherwise. There are many R packages for creating and displaying graphs (igraph is a popular one, and this CRAN task view lists many others) but that’s a problem in its own right: an important part of the data exploration process is trying and comparing different visualization options, and the myriad packages and interfaces make that process difficult for graph data.

Now, there’s the new ggraph package, recently published to CRAN by author Thomas Lin Pedersen, which promises to make exploring graph data easier. Unlike other graphing packages, ggraph uses the grammar of graphics paradigm of the ggplot2 package, unifying the data structures and attributes associated with graphics. It also includes a wide range of visual representations of graphs (layouts) and makes it easy to switch between them. The basic “mesh” visualization of nodes and edges provides 11 different options for arranging the nodes:


Other types of visualizations are supported, too: hive plots, dendrograms, treemaps, and circle plots, to name just a few. Note that only static graphs are available, though: unlike igraph and some other packages, you can’t rearrange the location of the nodes or otherwise manipulate the graphics with a mouse.

For the R programmer, most of the work is done by the ggraph function. It’s analogous to the ggplot function, except that you don’t provide data for the locations of the nodes; their position is selected by an algorithm. (Similarly, layout choices are automatically made for visualization types other than the mesh.) There are also various themes suited to graphs you can use to style your chart: goodbye gridlines and axes; hello labels, annotations and edge arrows.
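To make the "switch layouts, keep everything else" idea concrete, here is a small sketch. It assumes the igraph and ggraph packages are installed; the ring graph and the three layout names chosen ('kk', 'fr', 'circle') are illustrative picks, not taken from the post.

```r
library(igraph)
library(ggraph)

# A small illustrative graph (not from the original post)
g <- make_ring(10)

# Same data and same geoms; only the layout argument changes
layouts <- c("kk", "fr", "circle")
plots <- lapply(layouts, function(l) {
  ggraph(g, layout = l) +
    geom_edge_link() +
    geom_node_point() +
    ggtitle(paste("layout =", l))
})

plots[[1]]  # print the first variant
```

Because each result is an ordinary ggplot object, the variants can be compared side by side or restyled with a theme afterwards.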

The ggraph package is available on CRAN now, and works with R version 2.10 and later. For more on the ggraph package, see the announcement blog post linked below.

Data Imaginist: Announcing ggraph: A grammar of graphics for relational data

Source: ggraph: ggplot for graphs | R-bloggers

[Package] hrbrmstr/hrbrthemes: typography-centric ggplot2 themes and theme components


hrbrthemes : Additional Themes and Theme Components for ‘ggplot2’

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

This is a very focused package that provides typography-centric themes and theme components for ggplot2. It’s an extract/riff of hrbrmisc created by request.

The core theme: theme_ipsum (“ipsum” is Latin for “precise”) uses Arial Narrow which should be installed on practically any modern system, so it’s “free”-ish. This font is condensed, has solid default kerning pairs and geometric numbers. That’s what I consider the “font trifecta” must-have for charts. An additional quality for fonts for charts is that they have a diversity of weights. Arial Narrow (the one on most systems, anyway) does not have said diversity but this quality is not (IMO) a “must have”.

The following functions are implemented/objects are exported:

  • theme_ipsum : Arial Narrow-based theme
  • theme_ipsum_rc : Roboto Condensed-based theme
  • gg_check: Spell check ggplot2 plot labels
  • update_geom_font_defaults: Update matching font defaults for text geoms (the default is — unsurprisingly — Arial Narrow)
  • scale_x_comma / scale_y_comma : Comma format for axis text and expand=c(0,0) (you need to set limits)
  • scale_x_percent / scale_y_percent : Percent format for axis text and expand=c(0,0) (you need to set limits)
  • scale_color_ipsum / scale_fill_ipsum / ipsum_pal : A muted discrete color palette with 9 colors
  • font_an: a short global alias for “Arial Narrow”
  • font_rc: a short global alias for “Roboto Condensed”
  • font_rc_light: a short global alias for “Roboto Condensed Light”





# current version
packageVersion("hrbrthemes")
## [1] '0.1.0'

Base theme (Arial Narrow)

ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  labs(x="Fuel efficiency (mpg)", y="Weight (tons)",
       title="Seminal ggplot2 scatterplot example",
       subtitle="A plot that is only useful for demonstration purposes",
       caption="Brought to you by the letter 'g'") +
  theme_ipsum()

Roboto Condensed

ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  labs(x="Fuel efficiency (mpg)", y="Weight (tons)",
       title="Seminal ggplot2 scatterplot example",
       subtitle="A plot that is only useful for demonstration purposes",
       caption="Brought to you by the letter 'g'") +
  theme_ipsum_rc()

Scales (Color/Fill)

ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(color=factor(carb))) +
  labs(x="Fuel efficiency (mpg)", y="Weight (tons)",
       title="Seminal ggplot2 scatterplot example",
       subtitle="A plot that is only useful for demonstration purposes",
       caption="Brought to you by the letter 'g'") +
  scale_color_ipsum() +
  theme_ipsum_rc()

Scales (Axis)

count(mpg, class) %>% 
  mutate(pct=n/sum(n)) %>% 
  ggplot(aes(class, pct)) +
  geom_col() +
  scale_y_percent() +
  labs(x="Fuel efficiency (mpg)", y="Weight (tons)",
       title="Seminal ggplot2 column chart example with percents",
       subtitle="A plot that is only useful for demonstration purposes",
       caption="Brought to you by the letter 'g'") +
  theme_ipsum(grid="Y")

ggplot(uspopage, aes(x=Year, y=Thousands, fill=AgeGroup)) + 
  geom_area() +
  scale_fill_ipsum() +
  scale_x_continuous(expand=c(0,0)) +
  scale_y_comma() +
  labs(title="Age distribution of population in the U.S., 1900-2002",
       subtitle="Example data from the R Graphics Cookbook",
       caption="Source: R Graphics Cookbook") +
  theme_ipsum_rc(grid="XY") +
  theme(axis.text.x=element_text(hjust=c(0, 0.5, 0.5, 0.5, 1)))


count(mpg, class) %>% 
  mutate(n=n*2000) %>% 
  arrange(n) %>% 
  mutate(class=factor(class, levels=class)) %>% 
  ggplot(aes(class, n)) +
  geom_col() +
  geom_text(aes(label=scales::comma(n)), hjust=0, nudge_y=2000) +
  scale_y_comma(limits=c(0,150000)) +
  coord_flip() +
  labs(x="Fuel efficiency (mpg)", y="Weight (tons)",
       title="Seminal ggplot2 column chart example with commas",
       subtitle="A plot that is only useful for demonstration purposes, esp since you'd never\nreally want direct labels and axis labels",
       caption="Brought to you by the letter 'g'") +
  theme_ipsum(grid="X")

Spellcheck ggplot2 labels

df <- data.frame(x=c(20, 25, 30), y=c(4, 4, 4), txt=c("One", "Two", "Three"))

ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  labs(x="This is some txt", y="This is more text",
       title="Thisy is a titlle",
       subtitle="This is a subtitley",
       caption="This is a captien") +
  theme_ipsum_rc(grid="XY") -> gg

gg_check(gg)

## Possible misspelled words in [title]: (Thisy, titlle)
## Possible misspelled words in [subtitle]: (subtitley)
## Possible misspelled words in [caption]: (captien)

Test Results


## [1] "Mon Feb 27 07:03:41 2017"

## testthat results ========================================================================================================
## OK: 10 SKIPPED: 0 FAILED: 0
## DONE ===================================================================================================================

The ggraph package: a grammar of graphics for relational data


Announcing ggraph: A grammar of graphics for relational data

FEBRUARY 23, 2017

I am absolutely thrilled to announce that ggraph has finally been released on CRAN. ggraph is my most ambitious package to date and its very early genesis has been described in a prior post. If any mention of ggraph is completely new to you, then in short terms ggraph is an extension of the ggplot2 API to support relational data such as networks and trees. I feel fairly confident in saying that ggraph is the most powerful way to create static network based visualizations in R. Leading up to the release, the three main concepts of ggraph have been described in detail in their own blog posts (layouts, nodes, and edges) so this will not be reiterated here. Instead I’ll talk a bit about the philosophy behind the package as well as show off some of the features that do not fall into any of the three main concepts.

The Philosophy

There is no shortage of software for creating network visualizations and there is no shortage of said visualizations themselves. Often though, the visualizations are more impressive than informative and it is easy to feel that their main task is to show that we are really dealing with some complex data. All of this has led to a certain disdain for classic network visualizations perfectly encapsulated in the nickname hairballs. It does not have to be like this! The greatness of ggplot2 lies in how it allows users to quickly iterate over visualization approaches, thus better ensuring that the best visualization approach is reached. If this was extended to relational data it is my belief that users would be more likely to try to make plots that are more meaningful. After all we all want interpretability, right? Consider having to try out 7 different network visualization packages with different APIs versus just mixing and matching layouts and geoms in an iterative process — I know which way I prefer.

The goal of ggraph is thus clear — provide everything related to visualizations of relational data in a ggplot2-like API to lessen the cognitive load on experimenting with different visual representations. I’m not there yet, but I feel the current version represents a solid foundation where most users will not feel many limitations — on the contrary I believe most users will feel like the chains have come off and they are set free.

Future focus

As I pointed out, ggraph is far from done. I’ll try to keep my development focus in the open by putting things on the road-map as GitHub issues. Honorable mentions include matrix, d3-force and sankey layout, expanded support for edge endings (more choices than grid::arrow() provides), edge routing (avoid node collision), and textbox nodes. I welcome all suggestions as the world of network visualizations is moving fast and I cannot keep on top of everything.

Features besides layouts, nodes, and edges

Understanding the node and edge geoms along with how layouts are defined will get you a long way towards visualizing networks. Still, ggraph has more to offer, some of which will be discussed here:


Consider the following plot:

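library(ggraph)  # provides the highschool data set
library(igraph)  # provides graph_from_data_frame()

graph <- graph_from_data_frame(highschool)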

p <- ggraph(graph, layout = 'kk') + 
    geom_edge_link(aes(colour = factor(year))) + 
    geom_node_point() + 
    ggtitle('An example')


While the ggplot2 heritage clearly shows due to the grey background with white grid lines, the whole concept of x and y axes is often redundant in network visualizations and is just a distraction. ggraph provides its own theme optimized for network visualizations called theme_graph(), that facilitates clean and beautiful visualizations:

p + theme_graph()


theme_graph(), besides removing axes, grids, and border, changes the font to Arial Narrow (this can be overridden). Furthermore, it makes it easy to change the coloring of the plot:

p + theme_graph(background = 'grey20', text_colour = 'white')


Adding the same theme to every plot is tedious and ggraph provides a way to avoid this. Using set_graph_style() the theme_graph() is set as default. As an extra benefit all text-based geoms get their defaults updated so the text automatically uses the same style as the theme.





A powerful but underutilized way of gaining insight into networks is by using small multiples. This technique can reduce edge over-plotting in a very meaningful way by spreading nodes and edges out based on their attributes. The benefits of small multiples are not unique to relational data, as the popularity of ggplot2’s facetting functionality shows. The base facetting functions provided by ggplot2 are a bad fit for networks though, as we are working with two very distinct types of data. If you facet on a node attribute, all edges would be plotted in all panels, despite the terminal nodes not being present, which is not what you expect. Because of this ggraph comes with its own set of facetting functions tailored to network data:

facet_nodes() and facet_edges()

These two functions are equivalent to facet_wrap() in functionality, but they only address node and edge data respectively. When using facet_nodes() edges are only drawn in a panel if both terminal nodes are present there. When using facet_edges() nodes are always drawn in all panels even if the node data contains an attribute named the same as the one used for the edge facetting.
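The rule for node facetting (an edge appears in a panel only when both of its terminal nodes are there) can be sketched in plain base R. This is only an illustration of the rule as described above, using made-up nodes and edges; it is not ggraph's internal implementation.

```r
# Hypothetical nodes with a class attribute, and edges between them
nodes <- data.frame(name  = c("a", "b", "c", "d"),
                    class = c("x", "x", "y", "y"))
edges <- data.frame(from = c("a", "a", "c"),
                    to   = c("b", "c", "d"))

# One panel per class: keep an edge only if both ends are in the panel
edges_in_panel <- lapply(split(nodes$name, nodes$class), function(members) {
  edges[edges$from %in% members & edges$to %in% members, ]
})

edges_in_panel$x  # only the a-b edge
edges_in_panel$y  # only the c-d edge; a-c spans two panels and is dropped
```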

# Assign each node to a random class
V(graph)$class <- sample(letters[1:4], gorder(graph), TRUE)
# Make year a character
E(graph)$year <- as.character(E(graph)$year)

p <- ggraph(graph, layout = 'kk') + 
    geom_edge_fan(aes(alpha = ..index.., colour = year)) + 
    geom_node_point(aes(shape = class)) + 
    scale_edge_alpha(guide = 'none')

p + facet_edges(~year)


Often, when working with small multiples it is nice to have some visual separation between each plot — setting a foreground color in theme_graph() will add strip background and border (you can also use the th_foreground() helper for this):

p + facet_nodes(~class) + th_foreground(foreground = 'grey80', border = TRUE)


# Let's not have to add this every time
set_graph_style(foreground = 'grey80')


Facetting on two variables simultaneously is very powerful and something that is supported in ggplot2 with facet_grid(). In ggraph the same is possible using facet_graph() that takes the behavior of facet_nodes() and facet_edges() and combines them:

p + facet_graph(year ~ class)


As with facet_grid() marginal plots are supported as well:

p + facet_graph(year ~ class, margins = TRUE)


While the default is to facet the rows on edges and the columns on nodes, this is free to change using the row_type and col_type arguments. There is nothing stopping you from facetting on the same type in each dimension either:

# Facet edge by the class of their start node as well as year
p + facet_graph(year ~ node1.class, col_type = 'edge')


I hope I have convinced you that facetting in the context of relational data is both very easy and extremely powerful. Avoiding the hairball is one of the prime goals of network visualizations and using small multiples is a fantastic way of cutting down on the number of nodes and edges while still getting the full picture.


Source: Data Imaginist – Announcing ggraph: A grammar of graphics for relational data

[Package] bayesplot: a package for plotting Bayesian models




An R package providing a library of plotting functions for use after fitting Bayesian models (typically with MCMC). The idea is not only to provide convenient functionality for users, but also a common set of functions that can be easily used by developers working on a variety of packages for Bayesian modeling, particularly (but not necessarily) those powered by RStan.



The plots created by bayesplot are ggplot objects, which means that after a plot is created it can be further customized using the various functions for modifying ggplot objects provided by the ggplot2 package.
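A minimal sketch of that workflow, assuming bayesplot and ggplot2 are installed. The draws here are simulated with rnorm() purely as a stand-in for real MCMC output, and the parameter names alpha and beta are hypothetical.

```r
library(bayesplot)
library(ggplot2)

# Simulated draws standing in for real MCMC output (illustration only)
set.seed(123)
draws <- matrix(rnorm(2000), ncol = 2,
                dimnames = list(NULL, c("alpha", "beta")))

# A bayesplot plot is a ggplot object ...
p <- mcmc_hist(draws)

# ... so ordinary ggplot2 functions can modify it afterwards
p + ggtitle("Simulated draws") +
  theme(plot.title = element_text(face = "bold"))
```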


  • Install from CRAN:
install.packages("bayesplot")
  • Install latest development version from GitHub (requires devtools package):
if (!require("devtools"))
  install.packages("devtools")

devtools::install_github("stan-dev/bayesplot", dependencies = TRUE, build_vignettes = TRUE)

If you are not using the RStudio IDE and you get an error related to “pandoc” you will either need to remove the argument build_vignettes=TRUE (to avoid building the vignettes) or install pandoc (e.g., brew install pandoc) and probably also pandoc-citeproc (e.g., brew install pandoc-citeproc). If you have the rmarkdown R package installed then you can check if you have pandoc by running the following in R:

rmarkdown::pandoc_available()



Some quick examples using MCMC draws obtained from the rstanarm and rstan packages.


library("bayesplot")
library("rstanarm")
library("ggplot2")

fit <- stan_glm(mpg ~ ., data = mtcars)
posterior <- as.matrix(fit)

plot_title <- ggtitle("Posterior distributions",
                      "with medians and 80% intervals")
mcmc_areas(posterior,
           pars = c("cyl", "drat", "am", "wt"),
           prob = 0.8) + plot_title

ppc_dens_overlay(y = fit$y, 
                 yrep = posterior_predict(fit, draws = 50))

# also works nicely with piping
library("dplyr")  # provides %>%
fit %>% 
  posterior_predict(draws = 500) %>%
  ppc_stat_grouped(y = mtcars$mpg, 
                   group = mtcars$carb, 
                   stat = "median")

# with rstan demo model
library("rstan")  # provides stan_demo() and extract()
fit2 <- stan_demo("eight_schools", warmup = 300, iter = 700)
posterior2 <- extract(fit2, inc_warmup = TRUE, permuted = FALSE)

p <- mcmc_trace(posterior2,  pars = c("mu", "tau"), n_warmup = 300,
                facet_args = list(nrow = 2, labeller = label_parsed))
p + facet_text(size = 15)

np <- nuts_params(fit2)
mcmc_nuts_energy(np, merge_chains = FALSE) + ggtitle("NUTS Energy Diagnostic")

# another example with rstanarm

fit <- stan_glmer(mpg ~ wt + (1|cyl), data = mtcars)
ppc_intervals(
  y = mtcars$mpg,
  yrep = posterior_predict(fit),
  x = mtcars$wt,
  prob = 0.5
) +
  labs(
    x = "Weight (1000 lbs)",
    y = "MPG",
    title = "50% posterior predictive intervals \nvs observed miles per gallon",
    subtitle = "by vehicle weight"
  ) +
  panel_bg(fill = "gray95", color = NA) +
  grid_lines(color = "white")

[Package] The shinystan R package and ShinyStan GUI




ShinyStan provides immediate, informative, customizable visual and numerical summaries of model parameters and convergence diagnostics for MCMC simulations. The ShinyStan interface is coded primarily in R using the Shiny web application framework and is available via the shinystan R package.




  • Install from CRAN:
install.packages("shinystan")

If this fails, try adding the arguments type='source' and/or repos=''.

  • Install from GitHub (requires devtools package):
if (!require("devtools"))
  install.packages("devtools")
devtools::install_github("stan-dev/shinystan", build_vignettes = TRUE)


After installing, run

library("shinystan")
launch_shinystan_demo()



About ShinyStan

Applied Bayesian data analysis is primarily implemented through the MCMC algorithms offered by various software packages. When analyzing a posterior sample obtained by one of these algorithms the first step is to check for signs that the chains have converged to the target distribution and also for signs that the algorithm might require tuning or might be ill-suited for the given model. There may also be theoretical problems or practical inefficiencies with the specification of the model.

ShinyStan provides interactive plots and tables helpful for analyzing a posterior sample, with particular attention to identifying potential problems with the performance of the MCMC algorithm or the specification of the model. ShinyStan is powered by RStudio’s Shiny web application framework and works with the output of MCMC programs written in any programming language (and has extended functionality for models fit using RStan and the No-U-Turn sampler).

Saving and deploying (sharing)

The shinystan package allows you to store the basic components of an entire project (code, posterior samples, graphs, tables, notes) in a single object. Users can save many of the plots as ggplot2 objects for further customization and easy integration in reports or post-processing for publication.

shinystan also provides the deploy_shinystan function, which lets you easily deploy your own ShinyStan apps online using RStudio’s ShinyApps service for any of your models. Each of your apps (each of your models) will have a unique url and is compatible with Safari, Firefox, Chrome, and most other browsers.

Get help or submit bug report


The shinystan R package and ShinyStan interface are open source licensed under the GNU Public License, version 3 (GPLv3).

Scatterplot3d: 3D graphics – R software and data visualization – Easy Guides


There are many packages in R (RGL, car, lattice, scatterplot3d, …) for creating 3D graphics.

This tutorial describes how to generate a scatter plot in the 3D space using R software and the package scatterplot3d.

scatterplot3d is very simple to use and it can be easily extended by adding supplementary points or regression planes into an already generated graphic.

It can be easily installed, as it requires only an installed version of R.

3d scatter plot

Install and load scatterplot3d

install.packages("scatterplot3d") # Install
library("scatterplot3d") # load

Prepare the data

The iris data set will be used:

data(iris)
head(iris)

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

The iris data set gives the measurements of the variables sepal length and width, petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.

The function scatterplot3d()

A simplified format is:

scatterplot3d(x, y=NULL, z=NULL)

x, y, z are the coordinates of points to be plotted. The arguments y and z can be optional depending on the structure of x.

In which cases are y and z optional?

  • Case 1 : x is a formula of type zvar ~ xvar + yvar. xvar, yvar and zvar are used as x, y and z variables
  • Case 2 : x is a matrix containing at least 3 columns corresponding to x, y and z variables, respectively

Basic 3D scatter plots

# Basic 3d graphics
scatterplot3d(iris[,1:3])

# Change the angle of point view
scatterplot3d(iris[,1:3], angle = 55)


Change the main title and axis labels

scatterplot3d(iris[,1:3],
              main="3D Scatter Plot",
              xlab = "Sepal Length (cm)",
              ylab = "Sepal Width (cm)",
              zlab = "Petal Length (cm)")

Change the shape and the color of points

The argument pch and color can be used:

scatterplot3d(iris[,1:3], pch = 16, color="steelblue")


Read more on the different point shapes available in R : Point shapes in R

Change point shapes by groups

shapes = c(16, 17, 18) 
shapes <- shapes[as.numeric(iris$Species)]
scatterplot3d(iris[,1:3], pch = shapes)


Change point colors by groups

colors <- c("#999999", "#E69F00", "#56B4E9")
colors <- colors[as.numeric(iris$Species)]
scatterplot3d(iris[,1:3], pch = 16, color=colors)


Read more about colors in R: colors in R

Change the global appearance of the graph

The arguments below can be used:

  • grid: a logical value. If TRUE, a grid is drawn on the plot.
  • box: a logical value. If TRUE, a box is drawn around the plot

Remove the box around the plot

scatterplot3d(iris[,1:3], pch = 16, color = colors,
              grid=TRUE, box=FALSE)


Note that the argument grid = TRUE plots only the grid on the xy plane. In the next section, we’ll see how to add grids on the other facets of the 3D scatter plot.

Add grids on scatterplot3d

This section describes how to add xy-, xz- and yz-grids to scatterplot3d graphics.

We’ll use a custom function named addgrids3d(). The source code is available here : addgrids3d.r. The function is inspired from the discussion on this forum.

A simplified format of the function is:

addgrids3d(x, y=NULL, z=NULL, grid = TRUE,
           col.grid = "grey", lty.grid=par("lty"))
  • x, y, and z are numeric vectors specifying the x, y, z coordinates of points. x can be a matrix or a data frame containing 3 columns corresponding to the x, y and z coordinates. In this case the arguments y and z are optional
  • grid specifies the facet(s) of the plot on which grids should be drawn. Possible values are the combination of “xy”, “xz” or “yz”. Example: grid = c(“xy”, “yz”). The default value is TRUE to add grids only on xy facet.
  • col.grid, lty.grid: the color and the line type to be used for grids

Add grids on the different facets of scatterplot3d graphics:

# 1. Source the function
# 2. 3D scatter plot
scatterplot3d(iris[, 1:3], pch = 16, grid=FALSE, box=FALSE)
# 3. Add grids
addgrids3d(iris[, 1:3], grid = c("xy", "xz", "yz"))


The problem on the above plot is that the grids are drawn over the points.

In the R code below, we’ll put the points in the foreground using the following steps:

  1. An empty scatterplot3d graphic is created and the result of scatterplot3d() is assigned to s3d
  2. The function addgrids3d() is used to add grids
  3. Finally, the function s3d$points3d is used to add points on the 3D scatter plot
# 1. Source the function
# 2. Empty 3D scatter plot using pch=""
s3d <- scatterplot3d(iris[, 1:3], pch = "", grid=FALSE, box=FALSE)
# 3. Add grids
addgrids3d(iris[, 1:3], grid = c("xy", "xz", "yz"))
# 4. Add points
s3d$points3d(iris[, 1:3], pch = 16)


The function points3d() is described in the next sections.

Add bars

The argument type = “h” is used. This is useful to see very clearly the x-y location of points.

scatterplot3d(iris[,1:3], pch = 16, type="h")

Modification of scatterplot3d output

scatterplot3d returns a list of function closures which can be used to add elements to an existing plot.

The returned functions are :

  • xyz.convert(): to convert 3D coordinates to the 2D parallel projection of the existing scatterplot3d. It can be used to add arbitrary elements, such as legend, into the plot.
  • points3d(): to add points or lines into the existing plot
  • plane3d(): to add a plane into the existing plot
  • box3d(): to add or refresh a box around the plot

Add legends

Specify the legend position using xyz.convert()

  1. The result of scatterplot3d() is assigned to s3d
  2. The function s3d$xyz.convert() is used to specify the coordinates for legends
  3. The function legend() is used to add legends to plots
s3d <- scatterplot3d(iris[,1:3], pch = 16, color=colors)
legend(s3d$xyz.convert(7.5, 3, 4.5), legend = levels(iris$Species),
      col =  c("#999999", "#E69F00", "#56B4E9"), pch = 16)


It’s also possible to specify the position of legends using the following keywords: “bottomright”, “bottom”, “bottomleft”, “left”, “topleft”, “top”, “topright”, “right” and “center”.

Read more about legend in R: legend in R.

Specify the legend position using keywords

# "right" position
s3d <- scatterplot3d(iris[,1:3], pch = 16, color=colors)
legend("right", legend = levels(iris$Species),
      col =  c("#999999", "#E69F00", "#56B4E9"), pch = 16)


# Use the argument inset
s3d <- scatterplot3d(iris[,1:3], pch = 16, color=colors)
legend("right", legend = levels(iris$Species),
  col = c("#999999", "#E69F00", "#56B4E9"), pch = 16, inset = 0.1)


What does the argument inset mean in the R code above?

The argument inset sets the distance(s) from the margins, as a fraction of the plot region, when the legend is positioned by keyword (see ?legend in R). You can play with the inset argument using negative or positive values.

# "bottom" position
s3d <- scatterplot3d(iris[,1:3], pch = 16, color=colors)
legend("bottom", legend = levels(iris$Species),
      col = c("#999999", "#E69F00", "#56B4E9"), pch = 16)


Using keywords to specify the legend position is very simple. However, sometimes, there is an overlap between some points and the legend box or between the axis and legend box.

Is there any solution to avoid this overlap?

Yes, there are several solutions using the combination of the following arguments for the function legend():

  • bty = “n” : to remove the box around the legend. In this case the background color of the legend becomes transparent and the overlapping points become visible.
  • bg = “transparent”: to change the background color of the legend box to transparent color (this is only possible when bty != “n”).
  • inset: to modify the distance(s) between plot margins and the legend box.
  • horiz: a logical value; if TRUE, set the legend horizontally rather than vertically
  • xpd: a logical value; if TRUE, it enables the legend items to be drawn outside the plot.

Customize the legend position

# Custom point shapes
s3d <- scatterplot3d(iris[,1:3], pch = shapes)
legend("bottom", legend = levels(iris$Species),
       pch = c(16, 17, 18), 
      inset = -0.25, xpd = TRUE, horiz = TRUE)


# Custom colors
s3d <- scatterplot3d(iris[,1:3], pch = 16, color=colors)
legend("bottom", legend = levels(iris$Species),
      col =  c("#999999", "#E69F00", "#56B4E9"), pch = 16, 
      inset = -0.25, xpd = TRUE, horiz = TRUE)


# Custom shapes/colors
s3d <- scatterplot3d(iris[,1:3], pch = shapes, color=colors)
legend("bottom", legend = levels(iris$Species),
      col =  c("#999999", "#E69F00", "#56B4E9"), 
      pch = c(16, 17, 18), 
      inset = -0.25, xpd = TRUE, horiz = TRUE)


In the R code above, you can play with the arguments inset, xpd and horiz to see the effects on the appearance of the legend box.

Add point labels

The function text() is used as follows:

s3d <- scatterplot3d(iris[,1:3], pch = 16, color=colors)
text(s3d$xyz.convert(iris[, 1:3]), labels = rownames(iris),
     cex= 0.7, col = "steelblue")


Add regression plane and supplementary points

  1. The result of scatterplot3d() is assigned to s3d
  2. A linear model is calculated as follows: lm(zvar ~ xvar + yvar). Assumption: zvar depends on xvar and yvar
  3. The function s3d$plane3d() is used to add the regression plane
  4. Supplementary points are added using the function s3d$points3d()

The data set trees will be used:

data(trees)
head(trees)

  Girth Height Volume
1   8.3     70   10.3
2   8.6     65   10.3
3   8.8     63   10.2
4  10.5     72   16.4
5  10.7     81   18.8
6  10.8     83   19.7

This data set provides measurements of the girth, height and volume for black cherry trees.

3D scatter plot with the regression plane:

# 3D scatter plot
s3d <- scatterplot3d(trees, type = "h", color = "blue",
    angle=55, pch = 16)
# Add regression plane
my.lm <- lm(trees$Volume ~ trees$Girth + trees$Height)
s3d$plane3d(my.lm)
# Add supplementary points
s3d$points3d(seq(10, 20, 2), seq(85, 60, -5), seq(60, 10, -10),
    col = "red", type = "h", pch = 8)
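To see which plane is being drawn, it can help to inspect the fitted coefficients directly. The sketch below uses only base R; it refits the same model with the data = trees form (a substitution for the trees$... form above) so that predict() can be applied to new observations. The names fit, new_trees and pred, and the values in new_trees, are made up for illustration.

```r
# Same model as above, written with a data argument
fit <- lm(Volume ~ Girth + Height, data = trees)

# Intercept and slopes of the plane: Volume = b0 + b1 * Girth + b2 * Height
coef(fit)

# Hypothetical new trees (illustrative values) and their predicted volumes
new_trees <- data.frame(Girth = c(10, 15), Height = c(70, 80))
pred <- predict(fit, newdata = new_trees)
pred
```

The coefficients come out in the order intercept, Girth, Height, which matches the x and y axes of scatterplot3d(trees).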


Source: Scatterplot3d: 3D graphics – R software and data visualization – Easy Guides – Wiki – STHDA

Introducing R Tools for Visual Studio 1.0


After more than a year in preview, R Tools for Visual Studio, the open-source extension to the Visual Studio IDE for R programming, is nearing its official release. RTVS Release Candidate 1 is now available for download, giving you the opportunity to try out the new features ahead of the official announcement.


Preview: R Tools for Visual Studio 1.0


We’ll cover the features in detail with the general availability release of RTVS 1.0, but in summary the new features include:

  • Remote Execution: type R code in your local RTVS instance, but have the computations performed on a remote R server. You can also switch between local and remote workspaces at will.
  • SQL Server Integration: work with database connections and SQL queries, and create stored procedures with embedded R code.
  • Enhanced R Graphics Support: multiple floating and dockable plot windows, each with plot history.

RTVS works with all flavours of R on Windows: CRAN R, Microsoft R Open, and Microsoft R Client & Server. It requires Visual Studio 2015 (including the free Community edition). The RTVS team welcomes your feedback: you can report issues or offer suggestions via the RTVS GitHub repository. To get started with RTVS, follow the link below.

R Tools for Visual Studio: Welcome to R Tools for Visual Studio Preview!


Source: Preview: R Tools for Visual Studio 1.0 | R-bloggers

More Package Picks for January 2017


In a recent post, I highlighted several new packages that arrived on CRAN in January that provided R users with access to data. In this post, I present additional selections for interesting January packages, organized into the categories Miscellaneous, Machine Learning, Statistics and Utilities.

Miscellaneous
rcss v1.2: Provides functions for Solving Control Problems with Linear State Dynamics.

stormwindmodel v0.1.0: Provides functions to calculate wind speeds for hurricanes and tropical storms in the North American Atlantic basin. One vignette describes the package and another shows how to use it.

Machine Learning
crisp v1.0.0: Implements the convex regression with interpretable partitions (CRISP) method of predicting an outcome variable on the basis of two covariates.

BayesS5 v1.22: Implements Bayesian Variable Selection Using Shotgun Stochastic Search with Screening (S5), useful in settings where p >> n. For details, see the paper.

classifierplots v1.3.2: Provides functions to generate a grid of binary classifier and diagnostic plots with a single function call. See the README for details.

eclust v0.1.0: Provides an algorithm for clustering high-dimensional data that can be affected by an environmental factor. See the paper for details.

EnsCat v1.1: Implements various clustering methods for categorical data. See the website for examples and the paper for the details.

MAVE v0.1.7: Implements the MAVE (Minimum Average Variance Estimation) method of dimension reduction. Look here for the math and here for examples.

mfe v0.1.0: Provides functions to extract meta-features from datasets to support the design of recommendation systems. The vignette provides examples.

rsparkling v0.1.0: Extends sparklyr with an interface to the H2O Sparkling Water machine learning library. The README explains how to use the package.

Statistics
confSAM v0.1: Contains a function that computes estimates and confidence bounds for the false discovery proportion in a multiple testing environment. The vignette describes the theory and provides examples.

pdSpecEst v1.0.0: Implements a non-parametric, geometric wavelet method to estimate the autocovariance matrix of a time series while preserving the positive-definiteness of the estimator. This preserves the interpretability of the estimate as a covariance matrix and helps with computational issues. The paper describes the theory and the vignette provides an example.

tsdecomp v0.2: Implements ARIMA model-based decompositions for quarterly and monthly time series data. The vignette describes the math.

TSeriesMMA v0.1.1: Provides a function to calculate the Hurst surface for a time series. Multiscale multifractal analysis (MMA) is described in a paper by Gieraltowski et al.

Utilities
awsjavasdk v0.2.0: Provides a boilerplate of classes used to access the Amazon Web Services Java Software Development Kit via package rJava. The vignette shows how to use the package.

colr v0.1.900: Provides functions that use Perl regular expressions to select and rename columns in dataframes, lists and numeric types. The vignette contains examples.

flifo v0.1.4: Provides functions to create and manipulate FIFO (First In First Out), LIFO (Last In First Out), and NINO (Not In or Never Out) stacks in R. See the vignette for examples.

fst v0.7.2: Provides functions to read and write data frames at high speed, and compress data with type-optimized algorithms that allow random access of stored data frames.

manipulateWidget v0.5.1: Provides helper functions to add controls like sliders, pickers, checkboxes, etc. to interactive charts created with package htmlwidgets. The animated vignette will get you started.

msgtools v0.2.4: Provides utilities for error, warning, and other messages in R packages, including consistency checks across messages, spell-checking, and message translations for localization. See the vignette for examples.

padr v0.2.0: Provides functions to transform datetime data into a format ready for analysis, including aggregating data to a higher level interval (thicken) and imputing records where observations were absent (pad). There is an Introduction.
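The two ideas named in the padr entry, "thicken" and "pad", can be illustrated with base R alone. This sketch does not use the padr API; it is a hand-rolled approximation on a made-up event log, and the names events, daily, all_days and padded are illustrative.

```r
# Hypothetical irregular event log (illustrative data)
events <- data.frame(
  time  = as.POSIXct(c("2017-01-01 09:15:00", "2017-01-01 17:40:00",
                       "2017-01-03 08:05:00"), tz = "UTC"),
  value = c(2, 3, 5)
)

# "Thicken": move to a coarser interval (day) and aggregate
events$day <- as.Date(events$time)
daily <- aggregate(value ~ day, data = events, FUN = sum)

# "Pad": add a row for every day in the range, leaving NA where nothing was observed
all_days <- data.frame(day = seq(min(daily$day), max(daily$day), by = "day"))
padded <- merge(all_days, daily, by = "day", all.x = TRUE)
padded
```

In the result, 2017-01-02 appears with a missing value, which makes the gap in the observations explicit before any further analysis.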

pbdRPC v0.1-1: Implements light, yet secure remote procedure calls with a unified interface via ssh (OpenSSH) or plink/plink.exe (PuTTY). The vignette provides examples.

reprex v0.1.1: Provides a way to send code snippets with rendered output to sites like Stack Overflow and GitHub. The README shows examples.

restfulr v0.0.8: Models a RESTful service as if it were a list.

sys v1.1: A replacement for base system2 with consistent behavior across platforms. Supports interruption, background tasks, and full control over STDOUT / STDERR binary or text streams. The README provides some details.

textclean v0.3.0: Provides tools to clean and process text, such as replacing or removing substrings that are not optimal for analysis. The README shows how to use them.

tidyxl v0.2.1: Imports non-tabular data from Excel into R. The vignette shows how.

unpivotr v0.1.0: Provides tools for converting data from complex or irregular layouts into a columnar structure. There is one vignette showing how to unpivot pivot tables from a spreadsheet, and another that shows how to work with multiple, similar tables.

WVPlots v0.2.2: Provides examples of ggplot plots that can be generated from a standard calling interface. Here is the explanation of the concept, and here are some nice examples.

By Joseph Rickert

Source: More January Package Picks | R-bloggers

Test Calendar