networkD3: D3 JavaScript Network Graphs from R

CRAN Version Dev-version: 0.4

 About

This started as a port of Christopher Gandrud’s R package d3Network for creating D3 network graphs to the htmlwidgets framework. The htmlwidgets framework greatly simplifies the package’s syntax for exporting the graphs and improves integration with RStudio’s Viewer Pane, RMarkdown, and Shiny web apps. See below for examples.

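It currently supports the following types of network graphs:

  • Force directed networks with simpleNetwork and forceNetwork
  • Sankey diagrams with sankeyNetwork
  • Radial and diagonal tree networks with radialNetwork and diagonalNetwork
  • Dendrograms with dendroNetwork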

 Install

networkD3 works very well with the most recent version of RStudio (>= v0.99). When you use this version of RStudio, graphs will appear in the Viewer Pane. Not only does this give you a handy way of seeing and tweaking your graphs, but you can also export the graphs to the clipboard or to a PNG/JPEG/TIFF/etc. file.

The package can be downloaded from CRAN.

 Usage

For a full set of examples for each of the functions see this page.

Note: You are probably used to R’s 1-based numbering (i.e. counting in R starts from 1). However, networkD3 plots are created using JavaScript, which is 0-based. So, your data links will need to start from 0. See this data set for example. You can also use igraph to build your graph data and then use the igraph_to_networkD3 function to convert this data to a suitable object for networkD3 plotting.
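As a minimal sketch of the index shift described in the note, assume a hypothetical edge list whose source and target columns were built with R's usual 1-based row numbers. The objects nodes, links_1based and src_names are illustrative and not part of the package:

```r
# Hypothetical node table and edge list with 1-based node indices
nodes <- data.frame(name = c("A", "B", "C", "D"), stringsAsFactors = FALSE)
links_1based <- data.frame(source = c(1, 1, 2, 3),
                           target = c(2, 3, 4, 4),
                           value  = c(1, 1, 1, 1))

# Shift both index columns down by one so that the first node is 0
links_0based <- links_1based
links_0based$source <- links_1based$source - 1
links_0based$target <- links_1based$target - 1

# If the links are stored as node names instead, match() returns
# 1-based row positions in `nodes`, so subtract 1 in the same way
src_names <- c("A", "A", "B", "C")
source_from_names <- match(src_names, nodes$name) - 1

# Sanity check: every index must point at a row of `nodes` (0 .. nrow - 1)
stopifnot(all(links_0based$source >= 0),
          all(links_0based$target <= nrow(nodes) - 1))
```

The shifted links_0based and nodes data frames have the shape that forceNetwork expects for its Links and Nodes arguments.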

> simpleNetwork

For very basic force directed network graphics you can use simpleNetwork. For example:

# Load package
library(networkD3)

# Create fake data
src <- c("A", "A", "A", "A",
        "B", "B", "C", "C", "D")
target <- c("B", "C", "D", "J",
            "E", "F", "G", "H", "I")
networkData <- data.frame(src, target)

# Plot
simpleNetwork(networkData)

> forceNetwork

Use forceNetwork to have more control over the appearance of the force directed network and to plot more complicated networks. Here is an example:

# Load data
data(MisLinks)
data(MisNodes)

# Plot
forceNetwork(Links = MisLinks, Nodes = MisNodes,
            Source = "source", Target = "target",
            Value = "value", NodeID = "name",
            Group = "group", opacity = 0.8)

From version 0.1.3 you can also allow scroll-wheel zooming by setting zoom = TRUE.

> sankeyNetwork

You can also create Sankey diagrams with sankeyNetwork. Here is an example using downloaded JSON data:

# Load energy projection data
URL <- paste0(
        "https://cdn.rawgit.com/christophergandrud/networkD3/",
        "master/JSONdata/energy.json")
Energy <- jsonlite::fromJSON(URL)
# Plot
sankeyNetwork(Links = Energy$links, Nodes = Energy$nodes, Source = "source",
             Target = "target", Value = "value", NodeID = "name",
             units = "TWh", fontSize = 12, nodeWidth = 30)

> radialNetwork

From version 0.2, tree diagrams can be created using radialNetwork or diagonalNetwork.

URL <- paste0(
        "https://cdn.rawgit.com/christophergandrud/networkD3/",
        "master/JSONdata/flare.json")

## Convert to list format
Flare <- jsonlite::fromJSON(URL, simplifyDataFrame = FALSE)

# Use subset of data for more readable diagram
Flare$children = Flare$children[1:3]

radialNetwork(List = Flare, fontSize = 10, opacity = 0.9)
diagonalNetwork(List = Flare, fontSize = 10, opacity = 0.9)

> dendroNetwork

From version 0.2, it is also possible to create dendrograms using dendroNetwork.

hc <- hclust(dist(USArrests), "ave")

dendroNetwork(hc, height = 600)

Interacting with igraph

You can use igraph to create network graph data that can be plotted with networkD3. The igraph_to_networkD3 function converts igraph graphs to lists that work well with networkD3. For example:

# Load igraph
library(igraph)

# Use igraph to make the graph and find membership
karate <- make_graph("Zachary")
wc <- cluster_walktrap(karate)
members <- membership(wc)

# Convert to object suitable for networkD3
karate_d3 <- igraph_to_networkD3(karate, group = members)

# Create force directed network plot
forceNetwork(Links = karate_d3$links, Nodes = karate_d3$nodes, 
             Source = 'source', Target = 'target', 
             NodeID = 'name', Group = 'group')

 Output

Saving to an external standalone HTML file

Use saveNetwork to save a network to a standalone HTML file:

library(magrittr)

simpleNetwork(networkData) %>%
saveNetwork(file = 'Net1.html')

Including in an RMarkdown file

It is simple to include a networkD3 graphic in an RMarkdown file. Place the code that creates the graph in a code chunk, the same way you would for any other plot. Check out this simple example.

Including in Shiny web apps

You can also easily include networkD3 graphs in Shiny web apps.

In the server.R file create the graph by placing the function inside of render*Network, where the * is either Simple, Force, or Sankey depending on the graph type. For example:

output$force <- renderForceNetwork({
forceNetwork(Links = MisLinks, Nodes = MisNodes,
            Source = "source", Target = "target",
            Value = "value", NodeID = "name",
            Group = "group", opacity = input$opacity)
})

In the shinyUI part of your app.R file (for single-file Shiny apps) include *NetworkOutput (with * as before, but starting with a lowercase letter). The argument placed in this function should be the element specified with output, e.g.:

forceNetworkOutput("force")

You can run a simple example with the following code:

shiny::runGitHub('christophergandrud/networkD3-shiny-example')

Full source code for this example can be found here.

Saving as static PNG image

You can use RStudio to save static images of networkD3 plots as PNG files. Simply create your plot as usual in RStudio. The output should appear in the Viewer pane. Then click Export > Save as Image…. A new window will appear. You can use this window to manipulate the plot, resize it, and save the result as a PNG file.


thomasp85/patchwork: The Composer of ggplots

patchwork


The goal of patchwork is to make it ridiculously simple to combine separate ggplots into the same graphic. As such it tries to solve the same problem as gridExtra::grid.arrange() but using an API that incites exploration and iteration.

Installation

You can install patchwork from github with:

# install.packages("devtools")
devtools::install_github("thomasp85/patchwork")

Example

The usage of patchwork is simple: just add plots together!

library(ggplot2)
library(patchwork)

p1 <- ggplot(mtcars) + geom_point(aes(mpg, disp))
p2 <- ggplot(mtcars) + geom_boxplot(aes(gear, disp, group = gear))

p1 + p2

You are of course free to also add the plots together as part of the same plotting operation:

ggplot(mtcars) +
  geom_point(aes(mpg, disp)) +
  ggplot(mtcars) + 
  geom_boxplot(aes(gear, disp, group = gear))

Layouts can be specified by adding a plot_layout() call to the assemble. This lets you define the dimensions of the grid and how much space to allocate to the different rows and columns:

p1 + p2 + plot_layout(ncol = 1, heights = c(3, 1))

If you need to add a bit of space between your plots, you can use plot_spacer() to fill a cell in the grid with nothing:

p1 + plot_spacer() + p2

You can make nested plot layouts by wrapping part of the plots in parentheses (or braces, as in the code that follows); in these cases the layout is scoped to the different nesting levels:

p3 <- ggplot(mtcars) + geom_smooth(aes(disp, qsec))
p4 <- ggplot(mtcars) + geom_bar(aes(carb))

p4 + {
  p1 + {
    p2 +
      p3 +
      plot_layout(ncol = 1)
  }
} +
  plot_layout(ncol = 1)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Advanced features

In addition to adding plots and layouts together, patchwork defines some other operators that might be of interest. / will behave like + but put the left and right side in the same nesting level (as opposed to putting the right side into the left side's nesting level). Observe:

(p1 + p2) + p3 + plot_layout(ncol = 1)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

This is basically the same as without the parentheses (just like standard arithmetic): the plots are added sequentially to the same nesting level. Now look:

(p1 + p2) / p3 + plot_layout(ncol = 1)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Now p1 + p2 and p3 are on the same level…

There are two additional operators that are used for a slightly different purpose, namely to reduce code repetition. Consider the case where you want to change the theme for all plots in an assemble. Instead of modifying all plots individually you can use * or ^ to add elements to all subplots. The two differ in that * will only affect the plots on the current nesting level:

(p1 + (p2 + p3) + p4 + plot_layout(ncol = 1)) * theme_bw()
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

whereas ^ will recurse into nested levels:

(p1 + (p2 + p3) + p4 + plot_layout(ncol = 1)) ^ theme_bw()
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

This is all it does for now, but stay tuned as more functionality is added, such as collapsing guides, etc…

 

Source: thomasp85/patchwork: The Composer of ggplots

rOpenSci | Magick 1.6: clipping, geometries, fonts, fuzz, and a bit of history

Magick 1.6: clipping, geometries, fonts, fuzz, and a bit of history

  Jeroen Ooms   | DECEMBER 5, 2017

This week magick 1.6 appeared on CRAN. This release is a big all-round maintenance update with lots of tweaks and improvements across the package.

The NEWS file gives an overview of changes in this version. In this post we highlight some changes.

library(magick)
stopifnot(packageVersion('magick') >= '1.6')

If you are new to magick, check out the vignette for a quick introduction.

Perfect Graphics Rendering

I have fixed a few small rendering imperfections in the graphics device. The native magick graphics device image_graph() now renders images of identical or better quality than the base R bitmap devices png, jpeg, etc.

One issue was that sometimes magick graphics would show a 1px black border around the image. It turned out this is caused by rounding of clipping coordinates.

When R calculates clipping area it often ends up at non-whole values. It is then up to the graphics device to decide what to do with the pixel that is partially clipped. Let’s show clipping in action:

testplot <- function(title = ""){
  plot(1, main = title)
  abline(0, 1, col = "blue", lwd = 2, lty = "solid")
  abline(0.1, 1, col = "red", lwd = 3, lty = "dotted")
  abline(0.2, 1, col = "green", lwd = 4, lty = "twodash")
  abline(0.3, 1, col = "black", lwd = 5, lty = "dotdash")
  abline(0.4, 1, col = "purple", lwd = 6, lty = "dashed")
  abline(0.5, 1, col = "yellow", lwd = 7, lty = "longdash")
  abline(-0.1, 1, col = "blue", lwd = 10, lend = "round", lty = "dashed")
  abline(-0.2, 1, col = "blue", lwd = 10, lend = "butt", lty = "dashed")
  abline(-0.3, 1, col = "blue", lwd = 10, lend = "square", lty = "dashed")
}

Now we run it with and without clipping:

img2 <- magick::image_graph(clip = FALSE)
testplot("Without clipping")
dev.off()

noclip.png

img1 <- magick::image_graph(clip = TRUE)
testplot("With clipping")
dev.off()

clip.png

As we can see the latter image is now perfectly clipped. The colored lines are truncated exactly at the pixel where the axis starts. This is not always the case in base R 😉

Font Families

In magick there are two ways to render text on an image. You can either open the image or graphic in the magick graphics device and then use the base R text() function to print text. Alternatively there is image_annotate(), which is a simpler way to print some text on an image.

Wherever text rendering is involved, two major headaches arise: encoding and fonts. The latter is tricky because different operating systems have different fonts with different names. In addition, a font can be specified as a name, a family name, or an alias.

Below is a simple test that I use to quickly inspect if fonts are working on different systems:

img <- image_graph(width = 800, height = 500, pointsize = 20, res = 96)
graphics::plot.new()
graphics::par(mar = c(0,0,3,0))
graphics::plot.window(xlim = c(0, 20), ylim = c(-.5, 8))
title(expression(Gamma %prop% sum(x[alpha], i==1, n) * sqrt(mu)), expression(hat(x)))

# Standard families as supported by other devices
text(0.95, 7, "abcdefg  - Helvetica", pos = 4, family = "helvetica")
text(0.95, 6, "abcdefg  - Sans (Arial)", pos = 4, family = "sans")
text(0.95, 5, "abcdefg - Serif (Times)", pos = 4, family = "serif")
text(0.95, 4, "abcdefg - Monospace (Courier New)", pos = 4, family = "mono")
text(0.95, 3, "abcdefg - Symbol Face", pos = 4, font = 5)
text(0.95, 2, "abcdefg  - Comic Sans", pos = 4, family = "Comic Sans")
text(0.95, 1, "abcdefg - Georgia Serif", pos = 4, family = "Georgia")
text(0.95, 0, "abcdefg - Courier", pos = 4, family = "Courier")
dev.off()
img <- image_border(img, 'red', geometry = '2x2')

families

R requires that a graphics device supports at least 4 font types: serif, sans, mono and symbol. The latter is a special 8-bit font with some Greek letters and other characters needed for rendering math. This set of fonts corresponds to the original 13 base fonts from the 1984 PostScript standard:

  • 4x Courier (Regular, Oblique, Bold, Bold Oblique)
  • 4x Helvetica (Regular, Oblique, Bold, Bold Oblique)
  • 4x Times (Roman, Italic, Bold, Bold Italic)
  • Symbol

Below is a photo of the 1985 Apple LaserWriter, which was the first laser printer to use the PostScript language and support all these fonts! Not much later, PostScript graphics devices were adopted by R’s predecessor “The New S” (The New S Language, 1988).

printers

Geometry Helpers

Another major improvement in this release is the introduction of helper functions for geometry and option strings. Many functions in magick require a special geometry syntax to specify a size, area, or point. For example to resize an image you need to specify a size:

image_resize(img, "50%")
image_resize(img, "300x300")
image_resize(img, "300x300!")

Or to crop you need to specify an area which consists of a size and offset:

image_crop(img, "300x300+100+100")

We added a few handy helper functions, documented under ?geometry, to generate proper geometry syntax.
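To make the geometry syntax itself concrete, here is a base-R sketch of how such strings are put together. The functions size_px() and area_px() are hypothetical illustrations and are not the magick helpers; the package's own functions are the ones listed under ?geometry.

```r
# Hypothetical helper: a size string such as "300x300", with a trailing "!"
# as in the image_resize() example above
size_px <- function(width, height, exact = FALSE) {
  paste0(width, "x", height, if (exact) "!" else "")
}

# Hypothetical helper: an area string, i.e. a size followed by signed x and y offsets
area_px <- function(width, height, x_off = 0, y_off = 0) {
  sprintf("%dx%d%+d%+d",
          as.integer(width), as.integer(height),
          as.integer(x_off), as.integer(y_off))
}

size_px(300, 300)
size_px(300, 300, exact = TRUE)
area_px(300, 300, 100, 100)
```

The three calls reproduce the strings passed to image_resize() and image_crop() in the examples above.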

geometries

Magick Options

A lot of the power in ImageMagick is contained in the hundreds of built-in filters, colorspaces, compose operators, disposal types, convolution kernels, noise types and what not. These are specified simply as a string in the function.

For example in our previous post about Image Convolution we discussed a few kernel types:

# Gaussian Kernel
img %>% image_convolve('Gaussian:0x5', scaling = '60,40%')

# Sobel Kernel
img %>% image_convolve('Sobel')

# Difference of Gaussians
img %>% image_convolve('DoG:0,0,2')

Supported values for each option are described in the online ImageMagick documentation. We have now added functions to the magick package that list all values for each option. This should make it easier to see what is supported and to harness the full power of the built-in ImageMagick algorithms.

options

So we can now easily list e.g. supported kernel types:

> kernel_types()
 [1] "Undefined"     "Unity"         "Gaussian"      "DoG"          
 [5] "LoG"           "Blur"          "Comet"         "Binomial"     
 [9] "Laplacian"     "Sobel"         "FreiChen"      "Roberts"      
[13] "Prewitt"       "Compass"       "Kirsch"        "Diamond"      
[17] "Square"        "Rectangle"     "Disk"          "Octagon"      
[21] "Plus"          "Cross"         "Ring"          "Peaks"        
[25] "Edges"         "Corners"       "Diagonals"     "ThinDiagonals"
[29] "LineEnds"      "LineJunctions" "Ridges"        "ConvexHull"   
[33] "ThinSe"        "Skeleton"      "Chebyshev"     "Manhattan"    
[37] "Octagonal"     "Euclidean"     "User Defined" 

That’s a lot of kernels.

Fuzz Scaling

Finally one more (breaking) change: several functions in magick use a fuzz parameter to specify the max distance between two colors to be considered similar.

For example the flood fill algorithm (the paint-bucket button in ms-paint) changes the color of a given starting pixel, and then recursively all adjacent pixels that have the same color. However sometimes neighboring pixels are not precisely the same color, but nearly the same. The fuzz parameter allows the fill to continue when pixels are not the same but similar color.

# Paint the shirt orange
frink <- image_read("https://jeroen.github.io/images/frink.png") %>%
  image_fill("orange", point = "+100+200", fuzz = 25)

frink

What has changed in this version is that the fuzz parameter has been rescaled to a percentage. Hence you should always provide a value between 0 and 100. Previously it was the absolute distance between colors, but this depends on the type and color depth of the image at hand, which was very confusing.

Source: rOpenSci | Magick 1.6: clipping, geometries, fonts, fuzz, and a bit of history

ggplot2 bar graphs

Bar graphs

Here is another ggplot problem for today. ggplot's geom_bar() is used to make stacked bar plots. For example, using the acs data from the moonBook package, a bar plot by sex and smoking status can be drawn as follows.

require(ggplot2)
require(moonBook)

ggplot(data=acs,aes(x=sex,fill=smoking)) +geom_bar()

plot of chunk unnamed-chunk-8

However, setting position = "fill" produces a proportional stacked bar plot.

ggplot(data=acs,aes(x=sex,fill=smoking)) +geom_bar(position="fill")

plot of chunk unnamed-chunk-9

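This graph distorts the information considerably. The numbers of men and women differ, yet the bars have the same width, so the totals for men and women can appear to be equal. mosaicplot, which is included in base R, resolves this distortion.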

mosaicplot(sex~smoking,data=acs)

plot of chunk unnamed-chunk-10

The spineplot() function also draws a spineplot.

spineplot(factor(smoking)~factor(sex),data=acs)

plot of chunk unnamed-chunk-11

Problem 1) A spineplot using ggplot

Bringing the spineplot concept into ggplot lets you draw a figure like the following. It would be even better if the column-wise ratios were attached as labels.

plot of chunk unnamed-chunk-12

It would also be nice to be able to draw position = "dodge" or "stack" variants, as with geom_bar().

plot of chunk unnamed-chunk-13

spinogram: an extension of the histogram

Applying the spineplot() function to a continuous variable gives a spinogram.

spineplot(factor(sex)~age,data=acs)

plot of chunk unnamed-chunk-14

Problem 2) A spinogram using ggplot

Drawing this spinogram with ggplot gives the following.

plot of chunk unnamed-chunk-15

It would also be nice to be able to draw position = "dodge" or "stack" variants, as with geom_bar().

plot of chunk unnamed-chunk-16

Try writing a function that can do this: in other words, a ggplot version of the spineplot() function that is included in base R.
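One possible starting point, offered as a sketch and not as the quiz's intended answer, is to compute the rectangle coordinates of a spineplot in base R. The built-in mtcars data (cyl on the x axis, gear as the fill variable) stands in for the acs data here, which is an assumption made only to keep the example self-contained.

```r
# Cross-tabulate the x variable (rows) against the fill variable (columns)
tab <- table(x = mtcars$cyl, fill = mtcars$gear)

# Bar widths are proportional to the number of observations per x category
col_totals <- rowSums(tab)
xmax <- cumsum(col_totals) / sum(tab)   # right edge of each bar on a 0-1 axis
xmin <- xmax - col_totals / sum(tab)    # left edge of each bar

# Segment heights are the within-category ratios (each row sums to 1)
ratio  <- prop.table(tab, margin = 1)
ymax_m <- t(apply(ratio, 1, cumsum))    # top of each stacked segment

ratio_v <- as.vector(ratio)             # column-major: x varies fastest
ymax_v  <- as.vector(ymax_m)

# One row per rectangle, with the column-wise ratio formatted as a label
rects <- data.frame(
  x     = rep(rownames(tab), times = ncol(tab)),
  fill  = rep(colnames(tab), each = nrow(tab)),
  xmin  = rep(xmin, times = ncol(tab)),
  xmax  = rep(xmax, times = ncol(tab)),
  ymin  = ymax_v - ratio_v,
  ymax  = ymax_v,
  label = sprintf("%.1f%%", 100 * ratio_v)
)
```

A data frame of this shape could then be handed to ggplot2's geom_rect(), which takes xmin, xmax, ymin and ymax aesthetics. Replacing the cumulative heights with raw counts would be one route to the "stack" and "dodge" variants.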

Source: rquiz/quiz.md at master · cardiomoon/rquiz

Using tidyquant for Korean stocks: tqk

ChanYub Park

2017-11-22


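Overview

This project started because tidyquant's tq_get() has limitations when fetching Korean data. First fetch the stock codes with code_get(), then use tqk_get() to obtain data in the same format as tq_get(). After that, all of tidyquant's features can be used with Korean data.

Preparation

Load the tidyquant and tqk packages.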

library(tidyquant)
library(tqk)

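tidyquant: stock data the tidy way

tidyquant is an important package that provides the key functions of quantmod and other packages aimed mainly at stock analysis. It offers data handling based on the tidy data concept, powerful charting integrated with ggplot, easy retrieval of data from Yahoo (the default), Google, and other independent data sources, and performance analysis functions.

Fetching stock indices

tidyquant gets its information from Yahoo Finance. If you want to change the data source, you can choose where to fetch from; tq_get_options() shows the available candidates.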

if (!require(tidyquant)) install.packages("tidyquant", verbose = F)
library(tidyquant)
tq_get_options()
##  [1] "stock.prices"       "stock.prices.japan" "financials"        
##  [4] "key.stats"          "key.ratios"         "dividends"         
##  [7] "splits"             "economic.data"      "exchange.rates"    
## [10] "metal.prices"       "quandl"             "quandl.datatable"

The symbols for KOSPI and KOSDAQ are ^KS11 and ^KOSDAQ, respectively. Let's fetch each of them.

tq_get("^KS11")
## # A tibble: 2,666 x 7
##          date    open    high     low   close volume adjusted
##        <date>   <dbl>   <dbl>   <dbl>   <dbl>  <dbl>    <dbl>
##  1 2007-01-02 1438.89 1439.71 1430.06 1435.26 147800  1435.26
##  2 2007-01-03 1436.42 1437.79 1409.31 1409.35 203200  1409.35
##  3 2007-01-04 1410.55 1411.12 1388.50 1397.29 241200  1397.29
##  4 2007-01-05 1398.60 1400.59 1372.36 1385.76 277200  1385.76
##  5 2007-01-08 1376.76 1384.65 1366.48 1370.81 177600  1370.81
##  6 2007-01-09 1376.71 1381.99 1367.74 1374.34 216800  1374.34
##  7 2007-01-10 1372.52 1372.52 1345.08 1355.79 225400  1355.79
##  8 2007-01-11 1357.57 1375.31 1355.63 1365.31 211800  1365.31
##  9 2007-01-12 1379.00 1389.00 1372.87 1388.37 213800  1388.37
## 10 2007-01-15 1396.87 1397.64 1385.81 1390.96 163800  1390.96
## # ... with 2,656 more rows
tq_get("^KOSDAQ")
## # A tibble: 1 x 7
##         date  open   high   low close volume adjusted
##       <date> <dbl>  <dbl> <dbl> <dbl>  <dbl>    <dbl>
## 1 2017-09-15 662.9 671.31 662.9 671.3 645613    671.3

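To fetch an individual company's stock price, you need to know its stock code. The format is the stock code followed by .KS. For the stock codes, we will use information taken from the Sejong company data.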

library(readr)
url<-"https://github.com/mrchypark/sejongFinData/raw/master/codeData.csv"
download.file(url,destfile = "./codeData.csv")
codeData<-read_csv("./codeData.csv")
head(codeData)
## # A tibble: 6 x 2
##   종목번호     종목명
##      <chr>      <chr>
## 1   005930   삼성전자
## 2   000660 SK하이닉스
## 3   005935 삼성전자우
## 4   005380     현대차
## 5   035420      NAVER
## 6   015760   한국전력

Let's fetch Samsung Electronics (삼성전자).

tar<-paste0(codeData[grep("^삼성전자$",codeData$`종목명`),1],".KS")
tq_get(tar)
## # A tibble: 2,666 x 7
##          date     open     high      low  close volume adjusted
##        <date>    <dbl>    <dbl>    <dbl>  <dbl>  <dbl>    <dbl>
##  1 2007-01-02 700392.6 708300.3 695874.0 626000 352146 554146.3
##  2 2007-01-03 708300.3 709430.0 690225.7 614000 393530 543523.7
##  3 2007-01-04 690225.8 691355.4 681188.4 606000 365178 536441.9
##  4 2007-01-05 686836.7 687966.3 672151.0 595000 568008 526704.6
##  5 2007-01-08 668761.9 671021.3 654076.3 584000 661631 516967.2
##  6 2007-01-09 663113.8 671021.5 657465.5 586000 402197 518737.5
##  7 2007-01-10 657465.4 659724.7 649557.7 578000 515126 511655.8
##  8 2007-01-11 654076.4 664243.4 652946.7 583000 596787 516081.9
##  9 2007-01-12 666502.8 684577.4 660854.4 606000 806921 536441.9
## 10 2007-01-15 691355.3 694744.3 685707.0 612000 673506 541753.2
## # ... with 2,656 more rows

You can also specify dates.

tq_get(tar, from="2016-01-01", to="2016-05-05")
## # A tibble: 84 x 7
##          date    open    high     low   close volume adjusted
##        <date>   <dbl>   <dbl>   <dbl>   <dbl>  <dbl>    <dbl>
##  1 2016-01-04 1288562 1288562 1232315 1205000 306939  1178290
##  2 2016-01-05 1229247 1245610 1212885 1208000 216002  1181224
##  3 2016-01-06 1235383 1235383 1194477 1175000 366752  1148955
##  4 2016-01-07 1192431 1209816 1177091 1163000 282388  1137221
##  5 2016-01-08 1189363 1212885 1189363 1171000 257763  1145044
##  6 2016-01-11 1182205 1192431 1171978 1152000 241277  1126465
##  7 2016-01-12 1174023 1192431 1169932 1146000 206283  1120598
##  8 2016-01-13 1179137 1185273 1174023 1148000 143316  1122554
##  9 2016-01-14 1156638 1167887 1156638 1138000 209022  1112775
## 10 2016-01-15 1165842 1178114 1149479 1132000 209464  1106908
## # ... with 74 more rows

Dividend information is available through the dividends option.

tq_get(tar, get = "dividends")
## # A tibble: 22 x 2
##          date dividends
##        <date>     <dbl>
##  1 2007-06-28       500
##  2 2007-12-27      7500
##  3 2008-06-27       500
##  4 2008-12-29      5000
##  5 2009-06-29       500
##  6 2009-12-29      7500
##  7 2010-06-29      4230
##  8 2010-12-29      5000
##  9 2011-06-29       500
## 10 2011-12-28      5000
## # ... with 12 more rows

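Because Yahoo Finance is the data source, it is hard to say that all the information is there. tqk was started to cover that gap.

Getting stock codes

The tidyquant package fetches stock data using a symbol as the argument (for example, AAPL for Apple). Korean stocks each have a code, so you need to be able to look up data that matches codes to stock names. The tqk package does this through the code_get() function.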

library(tqk)
code<-code_get()
code
## # A tibble: 2,273 x 3
##      code                    name category
##     <chr>                   <chr>    <chr>
##  1 060310                      3S   KOSDAQ
##  2 095570              AJ네트웍스    KOSPI
##  3 068400                AJ렌터카    KOSPI
##  4 006840                AK홀딩스    KOSPI
##  5 054620               APS홀딩스   KOSDAQ
##  6 211270                  AP위성   KOSDAQ
##  7 152100             ARIRANG 200      ETF
##  8 222170 ARIRANG S&P한국배당성장      ETF
##  9 161490      ARIRANG 경기방어주      ETF
## 10 161500      ARIRANG 경기주도주      ETF
## # ... with 2,263 more rows

Getting stock data

tqk_get() is designed to fetch data by stock code.

ss_prices  <- tqk_get(code[grep("삼성전자", code[,2]),1]
                       , from="2017-01-01")
## [1] "please wait for getting data using internet."
## [1] "close and adjusted are same now."
ss_prices
## # A tibble: 193 x 7
##          date  open  high   low close volume adjusted
##        <date> <int> <int> <int> <int>  <dbl>    <int>
##  1 2017-01-02  2820  3035  2790  3000 478672     3000
##  2 2017-01-03  3070  3070  2900  3030 309507     3030
##  3 2017-01-04  3030  3095  2980  3090 248971     3090
##  4 2017-01-05  3090  3300  3050  3090 928979     3090
##  5 2017-01-06  3090  3095  3015  3015 202702     3015
##  6 2017-01-09  3045  3090  2995  3010 125422     3010
##  7 2017-01-09  3045  3090  2995  3010 125422     3010
##  8 2017-01-10  3010  3030  2900  2930 247850     2930
##  9 2017-01-11  2970  2985  2920  2930  92280     2930
## 10 2017-01-12  2935  2970  2880  2880 160546     2880
## # ... with 183 more rows

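The goal is to support all of the major data sites (referred to here as N, D, and P); the package is currently written against P, chosen for ease of implementation.

Quandl

Quandl is a data vendor that holds and serves a vast amount of economic and stock information. You can use the dedicated Quandl package on its own, or use it through tidyquant, which has it built in.

Time series data with the tidyverse

Until now, stock-related packages could not be used with the pipe operator %>%; tidyquant solves that problem. By adding the two important functions below, it can be used together with dplyr and tidyr functions.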

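  • tq_transmute(): builds the data from the calculated columns only.
  • tq_mutate(): adds the calculated columns to the data.

Functions available for calculation in tq_

The tq_transmute_fun_options() function shows the list of functions that can be used from each underlying package. In total it supports functions from five packages: zoo, xts, quantmod, TTR, and PerformanceAnalytics.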

tq_transmute_fun_options() %>% str
## List of 5
##  $ zoo                 : chr [1:14] "rollapply" "rollapplyr" "rollmax" "rollmax.default" ...
##  $ xts                 : chr [1:27] "apply.daily" "apply.monthly" "apply.quarterly" "apply.weekly" ...
##  $ quantmod            : chr [1:25] "allReturns" "annualReturn" "ClCl" "dailyReturn" ...
##  $ TTR                 : chr [1:62] "adjRatios" "ADX" "ALMA" "aroon" ...
##  $ PerformanceAnalytics: chr [1:7] "Return.annualized" "Return.annualized.excess" "Return.clean" "Return.cumulative" ...

zoo functions

tq_transmute_fun_options()$zoo
##  [1] "rollapply"          "rollapplyr"         "rollmax"           
##  [4] "rollmax.default"    "rollmaxr"           "rollmean"          
##  [7] "rollmean.default"   "rollmeanr"          "rollmedian"        
## [10] "rollmedian.default" "rollmedianr"        "rollsum"           
## [13] "rollsum.default"    "rollsumr"
  • Rolling functions:
    • A generic function for applying a function to rolling margins.
    • Form: rollapply(data, width, FUN, ..., by = 1, by.column = TRUE, fill = if (na.pad) NA, na.pad = FALSE, partial = FALSE, align = c("center", "left", "right"), coredata = TRUE).
    • Options include rollmax, rollmean, rollmedian, rollsum, etc.
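The rolling idea can be sketched in base R as follows. roll_apply() is a hypothetical stand-in written for illustration and is not zoo's rollapply(): it handles only complete windows and ignores the fill, align and by arguments. The prices are the first five ^KS11 closing values from the table earlier in this post.

```r
# Hypothetical helper: apply FUN to every complete window of `width` consecutive values
roll_apply <- function(x, width, FUN) {
  stopifnot(width >= 1, width <= length(x))
  n_windows <- length(x) - width + 1
  vapply(seq_len(n_windows),
         function(i) FUN(x[i:(i + width - 1)]),
         numeric(1))
}

# First five ^KS11 closing prices shown earlier
close <- c(1435.26, 1409.35, 1397.29, 1385.76, 1370.81)

roll_mean3 <- roll_apply(close, width = 3, FUN = mean)  # 3-day rolling mean
roll_max3  <- roll_apply(close, width = 3, FUN = max)   # 3-day rolling maximum
```

Five observations with a window of three give three results, one per complete window.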

xts functions

tq_transmute_fun_options()$xts
##  [1] "apply.daily"     "apply.monthly"   "apply.quarterly"
##  [4] "apply.weekly"    "apply.yearly"    "diff.xts"       
##  [7] "lag.xts"         "period.apply"    "period.max"     
## [10] "period.min"      "period.prod"     "period.sum"     
## [13] "periodicity"     "to.daily"        "to.hourly"      
## [16] "to.minutes"      "to.minutes10"    "to.minutes15"   
## [19] "to.minutes3"     "to.minutes30"    "to.minutes5"    
## [22] "to.monthly"      "to.period"       "to.quarterly"   
## [25] "to.weekly"       "to.yearly"       "to_period"
  • Period apply functions:
    • Apply a function to a time segment (e.g. max, min, mean, etc.).
    • Form: apply.daily(x, FUN, ...).
    • Options include apply.daily, weekly, monthly, quarterly, yearly.
  • To-period functions:
    • Convert a time series to a time series of lower periodicity (e.g. convert daily to monthly periodicity).
    • Form: to.period(x, period = 'months', k = 1, indexAt, name = NULL, OHLC = TRUE, ...).
    • Options include to.minutes, hourly, daily, weekly, monthly, quarterly, yearly.
    • Note: the return structure differs between to.period and the to.monthly (to.weekly, to.quarterly, etc.) forms. to.period returns a date, while to.months returns a MON YYYY character. If you want to work with the time series through lubridate, it is best to use to.period.
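To illustrate what converting to a lower periodicity means, here is a base-R sketch that reduces a daily series to one row per month by keeping the last observation of each month. The dates and prices are toy values invented for the example, and the code does not call xts; it only mirrors the idea behind the to-period functions.

```r
# Hypothetical daily closing prices (toy values), sorted by date
daily <- data.frame(
  date  = as.Date(c("2017-01-02", "2017-01-03", "2017-01-31",
                    "2017-02-01", "2017-02-28")),
  close = c(100, 102, 105, 104, 108)
)

# Group the row numbers by calendar month and keep the last row of each group
month_id <- format(daily$date, "%Y-%m")
last_row <- as.integer(tapply(seq_len(nrow(daily)), month_id, max))

monthly <- daily[last_row, ]
rownames(monthly) <- NULL

# A character label per month, in the spirit of the MON YYYY labels mentioned
# in the note; the month abbreviation depends on the session locale
monthly$label <- format(monthly$date, "%b %Y")
```

Keeping the date column as a Date, as in monthly$date, is what makes later date arithmetic straightforward, which matches the recommendation in the note.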

quantmod functions

tq_transmute_fun_options()$quantmod
##  [1] "allReturns"      "annualReturn"    "ClCl"           
##  [4] "dailyReturn"     "Delt"            "HiCl"           
##  [7] "Lag"             "LoCl"            "LoHi"           
## [10] "monthlyReturn"   "Next"            "OpCl"           
## [13] "OpHi"            "OpLo"            "OpOp"           
## [16] "periodReturn"    "quarterlyReturn" "seriesAccel"    
## [19] "seriesDecel"     "seriesDecr"      "seriesHi"       
## [22] "seriesIncr"      "seriesLo"        "weeklyReturn"   
## [25] "yearlyReturn"
  • Percentage change (Delt) and lag functions:
    • Delt: Delt(x1, x2 = NULL, k = 0, type = c("arithmetic", "log"))
      • Variations of Delt: ClCl, HiCl, LoCl, LoHi, OpCl, OpHi, OpLo, OpOp
      • Form: OpCl(OHLC)
    • Lag: Lag(x, k = 1) / Next: Next(x, k = 1) (dplyr::lag and dplyr::lead can also be used)
  • Period return functions:
    • Get the arithmetic or logarithmic returns for various periodicities, including daily, weekly, monthly, quarterly, and yearly.
    • Form: periodReturn(x, period = 'monthly', subset = NULL, type = 'arithmetic', leading = TRUE, ...)
  • Series functions:
    • Return values that describe the series. Options include describing increases/decreases, acceleration/deceleration, and highs/lows.
    • Forms: seriesHi(x), seriesIncr(x, thresh = 0, diff. = 1L), seriesAccel(x)
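To make the Delt, Lag, and Next forms listed above concrete, here is a small sketch on a plain numeric vector. It assumes quantmod is installed; the prices and variable names are invented for the example.

```r
library(quantmod)

# Made-up closing prices, for illustration only
close <- c(100, 110, 121, 108.9)

# One-period percentage changes; the first value is NA because there is no prior price
chg_arith <- Delt(close, k = 1)
chg_log   <- Delt(close, k = 1, type = "log")

# Lag shifts the series back by k periods, Next shifts it forward
prev_close <- Lag(close, k = 1)
next_close <- Next(close, k = 1)
```

Inside a tidy pipeline the same functions would typically be passed by name to tq_transmute or tq_mutate rather than called directly.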

TTR Functions

tq_transmute_fun_options()$TTR
##  [1] "adjRatios"          "ADX"                "ALMA"              
##  [4] "aroon"              "ATR"                "BBands"            
##  [7] "CCI"                "chaikinAD"          "chaikinVolatility" 
## [10] "CLV"                "CMF"                "CMO"               
## [13] "DEMA"               "DonchianChannel"    "DPO"               
## [16] "DVI"                "EMA"                "EMV"               
## [19] "EVWMA"              "GMMA"               "growth"            
## [22] "HMA"                "KST"                "lags"              
## [25] "MACD"               "MFI"                "momentum"          
## [28] "OBV"                "PBands"             "ROC"               
## [31] "rollSFM"            "RSI"                "runCor"            
## [34] "runCov"             "runMAD"             "runMax"            
## [37] "runMean"            "runMedian"          "runMin"            
## [40] "runPercentRank"     "runSD"              "runSum"            
## [43] "runVar"             "SAR"                "SMA"               
## [46] "SMI"                "SNR"                "stoch"             
## [49] "TDI"                "TRIX"               "ultimateOscillator"
## [52] "VHF"                "VMA"                "volatility"        
## [55] "VWAP"               "VWMA"               "wilderSum"         
## [58] "williamsAD"         "WMA"                "WPR"               
## [61] "ZigZag"             "ZLEMA"
  • Welles Wilder's Directional Movement Index:
    • ADX(HLC, n = 14, maType, ...)
  • Bollinger Bands:
    • BBands(HLC, n = 20, maType, sd = 2, ...): Bollinger Bands
  • Rate of change / momentum:
    • ROC(x, n = 1, type = c("continuous", "discrete"), na.pad = TRUE): Rate of Change
    • momentum(x, n = 1, na.pad = TRUE): Momentum
  • Moving averages (maType):
    • SMA(x, n = 10, ...): Simple Moving Average
    • EMA(x, n = 10, wilder = FALSE, ratio = NULL, ...): Exponential Moving Average
    • DEMA(x, n = 10, v = 1, wilder = FALSE, ratio = NULL): Double Exponential Moving Average
    • WMA(x, n = 10, wts = 1:n, ...): Weighted Moving Average
    • EVWMA(price, volume, n = 10, ...): Elastic, Volume-Weighted Moving Average
    • ZLEMA(x, n = 10, ratio = NULL, ...): Zero Lag Exponential Moving Average
    • VWAP(price, volume, n = 10, ...): Volume-Weighted Average Price
    • VMA(x, w, ratio = 1, ...): Variable-Length Moving Average
    • HMA(x, n = 20, ...): Hull Moving Average
    • ALMA(x, n = 9, offset = 0.85, sigma = 6, ...): Arnaud Legoux Moving Average
  • MACD oscillator:
    • MACD(x, nFast = 12, nSlow = 26, nSig = 9, maType, percent = TRUE, ...)
  • Relative Strength Index:
    • RSI(price, n = 14, maType, ...)
  • runFun:
    • runSum(x, n = 10, cumulative = FALSE): returns sums over an n-period moving window.
    • runMin(x, n = 10, cumulative = FALSE): returns minimums over an n-period moving window.
    • runMax(x, n = 10, cumulative = FALSE): returns maximums over an n-period moving window.
    • runMean(x, n = 10, cumulative = FALSE): returns means over an n-period moving window.
    • runMedian(x, n = 10, non.unique = "mean", cumulative = FALSE): returns medians over an n-period moving window.
    • runCov(x, y, n = 10, use = "all.obs", sample = TRUE, cumulative = FALSE): returns covariances over an n-period moving window.
    • runCor(x, y, n = 10, use = "all.obs", sample = TRUE, cumulative = FALSE): returns correlations over an n-period moving window.
    • runVar(x, y = NULL, n = 10, sample = TRUE, cumulative = FALSE): returns variances over an n-period moving window.
    • runSD(x, n = 10, sample = TRUE, cumulative = FALSE): returns standard deviations over an n-period moving window.
    • runMAD(x, n = 10, center = NULL, stat = "median", constant = 1.4826, non.unique = "mean", cumulative = FALSE): returns median/mean absolute deviations over an n-period moving window.
    • wilderSum(x, n = 10): returns a Welles Wilder style weighted sum over an n-period moving window.
  • Stochastic Oscillator / Stochastic Momentum Index:
    • stoch(HLC, nFastK = 14, nFastD = 3, nSlowD = 3, maType, bounded = TRUE, smooth = 1, ...): Stochastic Oscillator
    • SMI(HLC, n = 13, nFast = 2, nSlow = 25, nSig = 9, maType, bounded = TRUE, ...): Stochastic Momentum Index
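A few of the functions above can be tried directly on a numeric vector. This is only a sketch, assuming TTR is installed; the price vector and variable names are made up.

```r
library(TTR)

# Made-up prices, for illustration only
price <- c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19)

sma3 <- SMA(price, n = 3)                      # simple moving average; first n - 1 values are NA
ema3 <- EMA(price, n = 3)                      # exponential moving average
sum3 <- runSum(price, n = 3)                   # rolling 3-period sum
roc1 <- ROC(price, n = 1, type = "discrete")   # one-period discrete rate of change
```

Each result has the same length as the input, padded with NA where the window is not yet full, which is what lets tq_mutate attach it as a new column.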

PerformanceAnalytics Functions

tq_transmute_fun_options()$PerformanceAnalytics
## [1] "Return.annualized"        "Return.annualized.excess"
## [3] "Return.clean"             "Return.cumulative"       
## [5] "Return.excess"            "Return.Geltner"          
## [7] "zerofill"

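  • Return.annualized and Return.annualized.excess: take period returns and consolidate them into annualized returns.
  • Return.clean: removes outliers from returns.
  • Return.excess: subtracts the risk-free rate from the returns to yield returns in excess of the risk-free rate.
  • zerofill: replaces NA values with zeros.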
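The arithmetic behind these helpers can be sketched in base R. This is not the package code: it assumes geometric compounding and monthly data (12 periods per year), and the return values, the risk-free rate, and the variable names are invented for the example.

```r
# Made-up monthly returns with one missing value
r  <- c(0.01, 0.02, NA, -0.01)
rf <- 0.001                          # assumed monthly risk-free rate

r_filled <- ifelse(is.na(r), 0, r)   # zerofill-style cleaning: NA becomes 0
excess   <- r_filled - rf            # return in excess of the risk-free rate

# Geometric cumulative return over the whole sample
cumulative <- prod(1 + r_filled) - 1

# Annualized return, assuming 12 periods per year
annualized <- prod(1 + r_filled)^(12 / length(r_filled)) - 1
```

In practice you would call the PerformanceAnalytics functions through tq_transmute or tq_performance instead of re-deriving them; the sketch only shows what quantity each one reports.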

Charting with ggplot2

ggplot2 is the most popular R package for drawing charts. "gg" stands for Grammar of Graphics, a set of rules for building a graphic. On top of ggplot2, tidyquant provides the following additional features:

  • Chart types: two chart type visualizations are available using geom_barchart and geom_candlestick.
  • Moving averages: seven moving average visualizations are available using geom_ma.
  • Bollinger Bands: Bollinger Bands can be visualized using geom_bbands. The BBands moving average can be any of the seven available for moving averages.
  • Zooming in on date ranges: two coord functions (coord_x_date and coord_x_datetime) prevent data loss when zooming in on a specific region of a chart. This is important when using the moving average and Bollinger Band geoms.

A Quick Look

Use tqk_get to fetch the data. We will use the built-in SHANK data set together with Samsung and Naver as examples.

library(tqk)
data(SHANK)

SS <- tqk_get(code[grep("^삼성전자$",code$name),1], to = "2016-12-31")
## [1] "please wait for getting data using internet."
## [1] "close and adjusted are same now."
NVR <- tqk_get(code[grep("^NHN$",code$name),1], to = "2016-12-31")
## [1] "please wait for getting data using internet."
## [1] "close and adjusted are same now."

The end variable is used to set date limits throughout the examples.

end <- as_date("2016-12-31")

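Chart Types

Line Chart

Before visualizing bar charts and candlestick charts with the tidyquant geoms, let's plot the stock price as a simple line chart to get a feel for the grammar of graphics. This is done with geom_line from the ggplot2 package. We start with the stock data and use the pipe operator (%>%) to send it to the ggplot() function.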

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_line() +
    labs(title = "SamSung Line Chart", y = "Closing Price", x = "") + 
    theme_tq()

Bar Chart

To draw a bar chart, replace geom_line with geom_barchart. Supplying the open, high, low, and close mappings in aes() is all that is needed.

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_barchart(aes(open = open, high = high, low = low, close = close)) +
    labs(title = "SamSung Bar Chart", y = "Closing Price", x = "") + 
    theme_tq()

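We zoom in on a specific section using coord_x_date, whose xlim and ylim arguments are specified as c(start, end) to focus on a particular region of the chart. For xlim, we use lubridate to convert a character date to the Date class and then subtract six weeks with the weeks() function. For ylim, we zoom in on prices between 1,600,000 and 1,800,000.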

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_barchart(aes(open = open, high = high, low = low, close = close)) +
    labs(title = "SamSung Bar Chart", 
         subtitle = "Zoomed in using coord_x_date",
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(6), end),
                 ylim = c(1600000, 1800000)) + 
    theme_tq()

The colors can be modified with the color_up and color_down arguments, and parameters such as size control the appearance.

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_barchart(aes(open = open, high = high, low = low, close = close),
                     color_up = "darkgreen", color_down = "darkred", size = 1) +
    labs(title = "SamSung Bar Chart", 
         subtitle = "Zoomed in, Experimenting with Formatting",
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(6), end),
                 ylim = c(1600000, 1800000)) + 
    theme_tq()

Candlestick Chart

Drawing a candlestick chart is almost the same as drawing a bar chart.

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    labs(title = "SamSung Candlestick Chart", y = "Closing Price", x = "") +
    theme_tq()

The line colors can be adjusted with color_up and color_down, while fill_up and fill_down fill the rectangles.

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close),
                        color_up = "darkgreen", color_down = "darkred", 
                        fill_up  = "darkgreen", fill_down  = "darkred") +
    labs(title = "SamSung Candlestick Chart", 
         subtitle = "Zoomed in, Experimenting with Formatting",
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(6), end),
                 ylim = c(1600000, 1800000)) + 
    theme_tq()

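Charting Multiple Stocks

We can use facet_wrap to visualize several stocks at once. By adding a group aesthetic in the aes() of ggplot() and combining it with facet_wrap() at the end of the ggplot workflow, all of the stocks in SHANK can be viewed simultaneously.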

start <- end - weeks(6)
SHANK %>%
    filter(date >= start - days(2 * 15)) %>%
    ggplot(aes(x = date, y = close, group = symbol)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    labs(title = "SHANK Candlestick Chart", 
         subtitle = "Experimenting with Multiple Stocks",
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(start, end)) +
    facet_wrap(~ symbol, ncol = 2, scales = "free_y") + 
    theme_tq()

Visualizing Trends

Moving averages are critical to evaluating time-series trends. tidyquant includes geoms to enable “rapid prototyping” to quickly visualize signals using moving averages and Bollinger bands.

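Moving Averages

tidyquant provides a variety of moving average functions.

Moving averages are applied as an added layer on a chart with the geom_ma function. The geom is a wrapper for the underlying moving average functions from the TTR package: SMA, EMA, WMA, DEMA, ZLEMA, VWMA, and EVWMA.

Example 1: Charting the 50-day and 200-day simple moving averages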

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_candlestick(aes(open = open, high = high, low = low, close = close)) +
    geom_ma(ma_fun = SMA, n = 50, linetype = 5, size = 1.25) +
    geom_ma(ma_fun = SMA, n = 200, color = "red", size = 1.25) + 
    labs(title = "SamSung Candlestick Chart", 
         subtitle = "50 and 200-Day SMA", 
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(24), end),
                 ylim = c(1500000, 1850000)) + 
    theme_tq()

Example 2: Charting exponential moving averages

SS %>%
    ggplot(aes(x = date, y = close)) +
    geom_barchart(aes(open = open, high = high, low = low, close = close)) +
    geom_ma(ma_fun = EMA, n = 50, wilder = TRUE, linetype = 5, size = 1.25) +
    geom_ma(ma_fun = EMA, n = 200, wilder = TRUE, color = "red", size = 1.25) + 
    labs(title = "SamSung Bar Chart", 
         subtitle = "50 and 200-Day EMA", 
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(24), end),
                 ylim = c(1500000, 1850000)) + 
    theme_tq()

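Bollinger Bands

Bollinger Bands (https://en.wikipedia.org/wiki/Bollinger_Bands) are used to visualize volatility by plotting a range around a moving average, typically two standard deviations above and below it. Because they are built on a moving average, the geom_bbands function works almost identically to geom_ma, and the same seven moving averages are compatible. The main differences are the sd argument, the number of standard deviations (2 by default), and the high, low, and close aesthetics that must be added to aes() to calculate the bands.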
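To see what the bands contain numerically, the underlying TTR function can be called on its own. This is a minimal sketch, assuming TTR is installed; the random-walk prices are made up, and a single price series is passed in place of a full high/low/close object.

```r
library(TTR)

set.seed(1)
close <- 100 + cumsum(rnorm(60))   # made-up closing prices

# 20-period simple moving average with bands two standard deviations away
bb <- BBands(close, n = 20, sd = 2)

colnames(bb)   # lower band, moving average, upper band, and %B
tail(bb, 3)
```

The upper and lower bands sit at equal distances from the moving average, which is the range geom_bbands draws as a ribbon.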

Example 1: Applying BBands using an SMA

Let's walk through a basic example that adds Bollinger Bands using a simple moving average.

SS %>%
    ggplot(aes(x = date, y = close, open = open,
               high = high, low = low, close = close)) +
    geom_candlestick() +
    geom_bbands(ma_fun = SMA, sd = 2, n = 20) +
    labs(title = "SamSung Candlestick Chart", 
         subtitle = "BBands with SMA Applied", 
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(24), end),
                 ylim = c(1500000, 1850000)) + 
    theme_tq()

Example 2: Modifying the appearance of Bollinger Bands

The appearance can be modified with the color_ma, color_bands, alpha, and fill arguments. Here is the same plot as in Example 1 with new formatting applied to the BBands.

SS %>%
    ggplot(aes(x = date, y = close, open = open,
               high = high, low = low, close = close)) +
    geom_candlestick() +
    geom_bbands(ma_fun = SMA, sd = 2, n = 20, 
                linetype = 4, size = 1, alpha = 0.2, 
                fill        = palette_light()[[1]], 
                color_bands = palette_light()[[1]], 
                color_ma    = palette_light()[[2]]) +
    labs(title = "SamSung Candlestick Chart", 
         subtitle = "BBands with SMA Applied, Experimenting with Formatting", 
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(end - weeks(24), end),
                 ylim = c(1500000, 1850000)) + 
    theme_tq()

Example 3: Adding BBands to multiple stocks

start <- end - weeks(12)
SHANK %>%
    filter(date >= start - days(2 * 20)) %>%
    ggplot(aes(x = date, y = close, 
               open = open, high = high, low = low, close = close, 
               group = symbol)) +
    geom_barchart() +
    geom_bbands(ma_fun = SMA, sd = 2, n = 20, linetype = 5) +
    labs(title = "SHANK Bar Chart", 
         subtitle = "BBands with SMA Applied, Experimenting with Multiple Stocks", 
         y = "Closing Price", x = "") + 
    coord_x_date(xlim = c(start, end)) +
    facet_wrap(~ symbol, ncol = 2, scales = "free_y") + 
    theme_tq()

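ggplot2 Functions

Base ggplot2 has many features that are useful for analyzing financial data. We'll go through a few simple examples using Naver (NVR).

Example 1: Log scale with scale_y_log10

ggplot2 has the scale_y_log10() function to put the y-axis on a logarithmic scale. This is very useful because it tends to expose linear trends that can then be analyzed.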

Continuous Scale:

NVR %>%
    ggplot(aes(x = date, y = adjusted)) +
    geom_line(color = palette_light()[[1]]) + 
    scale_y_continuous() +
    labs(title = "Naver Line Chart", 
         subtitle = "Continuous Scale", 
         y = "Closing Price", x = "") + 
    theme_tq()

Log Scale:

NVR %>%
    ggplot(aes(x = date, y = adjusted)) +
    geom_line(color = palette_light()[[1]]) + 
    scale_y_log10() +
    labs(title = "Naver Line Chart", 
         subtitle = "Log Scale", 
         y = "Closing Price", x = "") + 
    theme_tq()
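Why a log scale exposes linear trends can be checked with a little arithmetic. The sketch below uses a made-up series that grows by a constant 5% per period; on the raw scale the steps keep growing, while on the log10 scale every step is the same size, so the series plots as a straight line.

```r
# Made-up price series growing 5% per period
price <- 100 * 1.05^(0:9)

raw_steps <- diff(price)          # absolute steps get larger each period
log_steps <- diff(log10(price))   # steps on the log10 scale are constant

raw_steps
log_steps
```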

Example 2: Regression trend lines with geom_smooth

We can apply a trend line quickly by adding the geom_smooth() function to the workflow. The function offers several fitting methods, including linear ("lm") and loess ("loess").

Linear:

NVR %>%
    ggplot(aes(x = date, y = adjusted)) +
    geom_line(color = palette_light()[[1]]) + 
    scale_y_log10() +
    geom_smooth(method = "lm") +
    labs(title = "Naver Line Chart", 
         subtitle = "Log Scale, Applying Linear Trendline", 
         y = "Adjusted Closing Price", x = "") + 
    theme_tq()

Example 3: Charting volume with geom_segment

We can use the geom_segment() function to chart daily volume; it takes xy points for the start and end of each line. Using aes(), we color the segments by the value of volume to make these data stand out.

NVR %>%
    ggplot(aes(x = date, y = volume)) +
    geom_segment(aes(xend = date, yend = 0, color = volume)) + 
    geom_smooth(method = "loess", se = FALSE) +
    labs(title = "Naver Volume Chart", 
         subtitle = "Charting Daily Volume", 
         y = "Volume", x = "") +
    theme_tq() +
    theme(legend.position = "none") 

We can also zoom in on a specific region. scale_color_gradient lets us quickly visualize the high and low points, and geom_smooth shows the trend.

start <- end - weeks(24)
NVR %>%
    filter(date >= start - days(50)) %>%
    ggplot(aes(x = date, y = volume)) +
    geom_segment(aes(xend = date, yend = 0, color = volume)) +
    geom_smooth(method = "loess", se = FALSE) +
    labs(title = "Naver Bar Chart", 
         subtitle = "Charting Daily Volume, Zooming In", 
         y = "Volume", x = "") + 
    coord_x_date(xlim = c(start, end)) +
    scale_color_gradient(low = "red", high = "darkblue") +
    theme_tq() + 
    theme(legend.position = "none") 

Themes

The tidyquant package comes with three themes to help you quickly customize financial charts.

  • Light: theme_tq() + scale_color_tq() + scale_fill_tq()
  • Dark: theme_tq_dark() + scale_color_tq(theme = "dark") + scale_fill_tq(theme = "dark")
  • Green: theme_tq_green() + scale_color_tq(theme = "green") + scale_fill_tq(theme = "green")

Dark

n_mavg <- 50 # Number of periods (days) for moving average
SHANK %>%
    filter(date >= start - days(2 * n_mavg)) %>%
    ggplot(aes(x = date, y = close, color = symbol)) +
    geom_line(size = 1) +
    geom_ma(n = 15, color = "darkblue", size = 1) + 
    geom_ma(n = n_mavg, color = "red", size = 1) +
    labs(title = "Dark Theme",
         x = "", y = "Closing Price") +
    coord_x_date(xlim = c(start, end)) +
    facet_wrap(~ symbol, scales = "free_y") +
    theme_tq_dark() +
    scale_color_tq(theme = "dark")

Source: tidyquant with tqk

How to Make a Histogram with ggvis in R (article) – DataCamp

Learn how to make a histogram with ggvis. Go from the very basics to creating interactive graphs with shiny to display distributions.

The two previous posts described how you can make histograms with basic R and the ggplot2 package. This third and last part of our histograms tutorial looks at ggvis. This package is similar to ggplot2, as it is also based on “the grammar of graphics”. However, ggvis uses slightly different expressions and adds features that make your plots interactive. Want to learn more? Discover the DataCamp tutorials.

Step One. Get The ggvis Package into RStudio

To start plotting histograms in ggvis, you first need to load the ggvis package:

library(ggvis)

If ggvis is not yet installed on your system, you’ll have to do this first. This can easily be done through

install.packages("ggvis")

Step Two. Your Data

It might seem silly, but do not forget about your data! You can use a data set that is built into R already, or get your own data set. In this example, we will continue with the data from the previous blog post on histograms: the chol data set.

If you’re new to this tutorial, you can load the data set through the url() function, embedded in the read.table() function:

187 41 179 74 221 nonsmo a alive
188 49 161 61 268 pipe b alive
189 35 176 73 234 sigare o alive
190 37 173 67 259 sigare o alive
191 49 160 74 191 nonsmo a alive
192 34 179 78 189 nonsmo o alive
193 31 166 68 200 sigare a alive
194 37 159 82 256 nonsmo a alive
195 43 175 80 219 sigare o alive
196 35 174 57 222 pipe a alive
197 38 172 91 227 nonsmo b alive
198 26 170 60 167 sigare a alive
199 39 165 74 259 sigare o alive
200 49 178 81 275 pipe b alive
>

Step Three. Start Plotting Basic Histograms

To make a basic histogram of the “AGE” column from the “chol” data frame, you can use the following line of code:

> # Simple histogram with `AGE` data
> chol %>%
ggvis(~AGE) %>%
layer_histograms()
Guessing width = 1 # range / 40
>
basic histogram in R

Study the line of code printed above a bit more closely: you see that ggvis uses the operator %>%, also known as the pipe operator, from the magrittr package. This operator passes the result of its left-hand side as the first argument of the function on its right-hand side. So f(x) %>% g(y) is actually a shortcut for g(f(x), y).

As an extra example, consider the two following R commands, that are completely equivalent:

option1 <- sum(abs(-3:3))
option2 <- -3:3 %>% abs() %>% sum()

Step Four. Prettify Your Histograms

As with the hist() function and the ggplot2 functions, you can adapt the visualization of your histogram by extending the original code. In this case, the original line of code could be extended as follows to draw a histogram of the “AGE” column from the “chol” data set that fills the bins with a red color, uses bins of width 5 positioned so that 35 is a bin center, and labels the x-axis “Age” and the y-axis “Count”:

# Histogram of `AGE`
chol %>%
ggvis(~AGE) %>%
layer_histograms(width = 5, center = 35, fill := "#E74C3C") %>%
add_axis("x", title = "Age") %>%
add_axis("y", title = "Count")
complete histogram

Just like with the hist() function and ggplot2, you can break up this rather large chunk of code to see what each small piece contributes to the visualization during the plotting of the histogram:

Bins

You can easily adjust the width of the bins by changing the width argument inside the layer_histograms() function:

chol %>%
ggvis(~AGE) %>%
…………….(width = 5)
R histogram binwidth

The width argument already set the bin width to 5, but where do bins start and where do they end?

You can use the center or boundary argument for this. center should refer to one of the bins’ center value, which automatically determines the other bins location. The boundary argument specifies the boundary value of one of the bins. Here again, specifying a single value fixes the location of all bins. As these two arguments specify the same thing in a different way, you should set at most one of center or boundary.

As an example, compare the previous histogram (with for example 35 as a bin center) to the following histogram, where center is set to 36:

chol %>%
ggvis(~AGE) %>%
layer_histograms(width = 5, ………..)
histogram R center

You can enforce the same plot by setting the boundary argument to 33.5. This number equals 36 minus half the bin width:

chol %>%
ggvis(~AGE) %>%
layer_histograms(width = 5, ……………)
histogram boundary

Note that the boundary and center may be outside the range of the data. In that case ggvis is smart enough to extrapolate what you meant and will decide on a location for the bins. Experiment with the arguments described above to see how they influence the interpretation of the histogram.
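The relationship between center and boundary is plain arithmetic and can be checked in base R. The sketch below is not ggvis code; it only lists the bin edges implied by each setting for a bin width of 5, using made-up variable names.

```r
width    <- 5
center   <- 36
boundary <- center - width / 2            # 36 minus half the bin width

k <- -4:5                                  # a handful of neighbouring bins

# Left edges of bins whose centers are 36, 41, 46, ... and 31, 26, ...
edges_from_center   <- (center + k * width) - width / 2

# Edges obtained by stepping one bin width at a time from the boundary value
edges_from_boundary <- boundary + k * width

boundary
isTRUE(all.equal(edges_from_center, edges_from_boundary))
```

Because both settings produce the same set of edges, specifying either one fixes the location of every bin, which is why at most one of them should be set.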

Names/colors

By using the pipe operator in addition to add_axis(), you can specify which axis you want to give a certain title or label. In this case, we put the title “Age” on the x-axis:

chol %>%
ggvis(~AGE) %>%
layer_histograms(width = 5, center = 35) %>%
………("x", title = "Age")
histogram with x axis

Similarly, you can also label the y-axis:

chol %>%
ggvis(~AGE) %>%
layer_histograms(width = 5, center = 35) %>%
……..("x", title = "Age") %>%
……..("y", title = "Bin Count")
histogram with y axis

You can fill up the bins with any color that you would like by using fill :=.

 

chol %>%
ggvis(~AGE) %>%
layer_histograms(width = 5, center = 35, …. := "#E74C3C") %>%
add_axis("x", title = "Age") %>%
add_axis("y", title = "Bin Count")
histogram with fill

Note the use of :=, which sets a property to a specific color (or size, width, etc.), as opposed to mapping it, as the = operator does.
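The difference between setting and mapping is easiest to see side by side. The following sketch assumes ggvis is installed and uses the built-in mtcars data rather than the chol data from this tutorial; the object names are made up.

```r
library(ggvis)

# Setting with := gives every point the same fixed colour
p_set <- mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_points(fill := "red")

# Mapping with = derives the colour from a data column through a scale
p_map <- mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_points(fill = ~factor(cyl))
```

Printing p_set shows uniformly red points, while p_map shows one colour per cylinder count together with a legend.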

Tip: you can find more information about this in DataCamp’s ggvis course.

Step Five. Adding Basic Interactivity To Your Histograms

There are many advantages of working with ggvis, but one that definitely stands out is the fact that you can actually make your histograms interactive! This allows readers of your reports to change the parameters of your plots and see the results of these changes instantaneously. ggvis makes use of shiny for this. Let’s go into the basics of the interactivity that you can add to your histograms.

For example, you can add an input slider that lets you decide on the width of your bins. Here’s how:

chol %>%
ggvis(~AGE) %>%
layer_histograms(….. = input_slider(1, 10, step = 1, label = "Bin Width"),
center = 35,
fill := "#E74C3C") %>%
……..("x", title = "Age") %>%
add_axis("y", title = "Bin Count")
interactive histogram 1

Note that the figure that is shown in this post is a static version of the histogram. To check the interactive plot, that is backed with an R process to change the graphics on the fly, visit the histogram on our Shiny server.

The input values that are given to input_slider are those that you can also see when you execute the code in the RStudio console. In this case, you would see a slider that ranges from values 1 to 10 with steps of 1 and that has a label Bin Width.

Next, you can also add a different type of user input, a select box, to determine the fill color of the bins:

chol %>%
ggvis(~AGE) %>%
…………….(width = input_slider(1, 10, step = 1, label = "Bin Width"),
center = 35,
fill := input_select(choices = c("red", "green", "blue", "yellow"),
selected = "blue", label = "Fill Color")) %>%
add_axis("x", title = "Age") %>%
add_axis("y", title = "Bin Count")
interactive histogram 1

The interactive version of this histogram can again be found on our Shiny server. Here, the fill property is set to one of four choices, specified in the choices argument inside input_select(). The selected argument specifies that the default color is blue, while label shows the label Fill Color in the interactive plot.

Note that the shiny package is needed for these interactive visualizations. Normally, RStudio comes with this package by default. If you are not working in RStudio, install shiny by executing install.packages("shiny").

Remember to keep in mind what you want to achieve with your histogram and how you want to achieve this! Sometimes, a static representation might be better than a dynamic one.

Step Six. Using ggvis For So Much More

This section was just the tip of the iceberg! Believe us when we say that you can do so much more with this package.

Tip: If you would like to have a more detailed and broader understanding of data visualization in R, you might be interested in the Data Visualization with R skill track, in which you’ll learn how to communicate the most important features of your data by creating beautiful visualizations with ggplot2 and base R graphics.

Conclusion

There are a lot of options to make histograms in R. The option you should choose is really rather a trade-off between what you want to accomplish and how fast you want to accomplish this. But of course, this trade-off is subject to your own programming experience with R and other languages: either option would go fast for any well-trained programmer.

However, if you are just starting with R, it might be a good idea to keep this tutorial at hand. Additionally, we encourage you to go and check out the additional information that you can find in the tutorial and below. This way, you will gradually be submerged into the field of data visualization, which will give you the time to get fluent very quickly.

So, in the end, the only thing that remains of the original trade-off and that probably is the most important thing to take into account is the question “What do you want to achieve with your histogram?”.



Source: How to Make a Histogram with ggvis in R (article) – DataCamp

Creating custom Sankey diagrams using R

I have previously shown how Sankey diagrams can easily be used to visualize response patterns in surveys and to display decision trees. Following on from these posts, I will now be getting a bit more technical, and describe how to create custom Sankey diagrams in R. I will start by explaining the basics of Sankey diagrams, and then provide examples of automatically created and manually controlled layouts.

 


 

The elements of a Sankey diagram

A Sankey diagram consists of three sets of elements: the nodes, the links, and the instructions which determine their positions.

To begin with, there are the nodes. In the diagram above, a node is wherever the lines change direction. However, in the example below, boxes represent the four nodes.

The second element of a Sankey diagram is the links (or edges) that connect the nodes together. Each link has a value associated with it, which is represented by the thickness of the link. In the example below, the first link, which connects Node A with Node B, is half the width of the second link, which connects A with C. The link from B to D is bigger again, and the largest link is from C to D.

Lastly, instructions specify where the nodes should appear in relation to each other. There are two strategies for positioning the nodes. One is to give specific coordinates. This is what is illustrated in the example above: the position of the nodes reflects places in France, Russia, and Poland. Alternatively, the nodes can be placed automatically using an algorithm (most commonly, a variant of the force-directed graph layout algorithm is used).

 


 

Using R

I created the example above from within Displayr.

It is created using the following R code:

library(networkD3)
nodes = data.frame("name"=
 c("Node A", # Node 0
 "Node B", # Node 1
 "Node C", # Node 2
 "Node D"))# Node 3
links = as.data.frame(matrix(c(
 0, 1, 10, # Each row represents a link. The first number
 0, 2, 20, # represents the node being connected from.
 1, 3, 30, # The second number represents the node connected to.
 2, 3, 40),# The third number is the value of the link.
 byrow = TRUE, ncol = 3))
names(links) = c("source", "target", "value")
sankeyNetwork(Links = links, Nodes = nodes,
 Source = "source", Target = "target",
 Value = "value", NodeID = "name",
 fontSize= 12, nodeWidth = 30)

 

Some aspects of this code to note:

  • Line 1 is loading a package (networkD3).
  • Lines 2 to 6 are creating a data frame that contains a single variable, called name. It contains four nodes, which I have creatively named A, B, C, and D.
  • Lines 7 to 11 specify the links. Line 8, for example, shows that the link from node 0 (i.e., A) to node 1 (i.e., B), has a value of 10.
  • The final lines call the sankeyNetwork function.

If you want to adapt this example, you only need to modify the nodes (lines 3 to 6 in this example), and the links (lines 8 to 11). Additionally, you can play around with, and modify, the example live in Displayr by clicking here. Clicking on any of the examples in Displayr will show you the R code. Modify the code first, and then run it by pressing Calculate.
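When your own data refers to nodes by name rather than by position, the 0-based indices used above have to be derived first. The following is one possible sketch in base R; the edge list and its column names (from, to) are hypothetical, and the final call is only attempted if networkD3 is installed.

```r
# Hypothetical edge list that refers to nodes by name
edges <- data.frame(
  from  = c("Node A", "Node A", "Node B", "Node C"),
  to    = c("Node B", "Node C", "Node D", "Node D"),
  value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# One row per distinct node, in order of first appearance
nodes <- data.frame(name = unique(c(edges$from, edges$to)),
                    stringsAsFactors = FALSE)

# match() returns 1-based positions; subtract 1 to get the 0-based indices
links <- data.frame(
  source = match(edges$from, nodes$name) - 1,
  target = match(edges$to,   nodes$name) - 1,
  value  = edges$value
)

if (requireNamespace("networkD3", quietly = TRUE)) {
  networkD3::sankeyNetwork(Links = links, Nodes = nodes,
                           Source = "source", Target = "target",
                           Value = "value", NodeID = "name",
                           fontSize = 12, nodeWidth = 30)
}
```

This reproduces the same four nodes and four links as the hand-written example, so the two approaches can be compared directly.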

 


 

Sankey diagrams using automated layout

You can also use Sankey diagrams to create conversion funnels, illustrated in the next example. Following this, another one shows data on load energy projections.  My first post on Sankey diagrams also features this latter example.

 


 

Sankey diagrams with manual layout

In Minard’s classic Sankey diagram of the march of Napoleon to Moscow and back, the thickness of the line shows the size of Napoleon’s army. The nodes are where the line changes direction. Automatic placement determined the position of the nodes in the previous examples, whereas here, the nodes represent the locations of places in Europe.

Below you can see Minard’s visualization reproduced in R. The code used to create this example has basically the same structure as in the previous examples, except that coordinates are provided for the nodes and the color is set explicitly.




 

Acknowledgements

The final example uses January Weiner’s riverplot package for R. All the other examples use a modified version of networkD3, created by Kenton Russell (timelyportfolio/networkD3@feature/responsive). networkD3 is an HTMLwidget version of Mike Bostock’s D3 Sankey diagram code, which is inspired by Tom Counsell’s Sankey library. The load energy flow example is from networkD3, which is a reworking of a Sankey library example, using data from the UK’s Department of Energy & Climate Change.

 

Source: Displayr | Creating custom Sankey diagrams using R

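Package | packcircles 0.2.0

Version 0.2.0 of the packcircles package has just been published on CRAN. The package provides functions for finding non-overlapping arrangements of circles in bounded and unbounded areas.

The package has a new circleProgressiveLayout function. It uses an efficient deterministic algorithm that arranges circles by consecutively placing each one externally tangent to two previously placed circles while avoiding overlaps.

Here is a small example of the new function, taken from the package vignette.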

library(packcircles)
library(ggplot2)

t <- theme_bw() +
theme(panel.grid = element_blank(),
axis.text=element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank())

theme_set(t)

# circle areas
areas <- 1:1000

# arrange circles from small to large
packing1 <- circleProgressiveLayout(areas)
dat1 <- circleLayoutVertices(packing1)

# arrange same circles from large to small
packing2 <- circleProgressiveLayout( rev(areas) )
dat2 <- circleLayoutVertices(packing2)

dat <- rbind(
cbind(dat1, set = 1),
cbind(dat2, set = 2) )

ggplot(data = dat, aes(x, y)) +
  geom_polygon(aes(group = id, fill = -id),
               colour = "black", show.legend = FALSE) +
  scale_fill_distiller(palette = "RdGy") +
  coord_equal() +
  facet_wrap(~set,
             labeller = as_labeller(
               c('1' = "small circles first",
                 '2' = "big circles first"))
  )

The vignette for the new function has many more examples, including how to use ggplot2 and ggiraph to create an SVG image of a circle layout with predefined colours and dynamic labels.

Source: packcircles version 0.2.0 released | R-bloggers

slickR: a JavaScript package you can use from R

Let's use the slick JavaScript library from R. This tool helps review multiple outputs in an efficient manner and saves much needed space in documents and Shiny applications, while creating a user friendly experience. These carousels can be used directly from the R console, from RStudio, in Shiny apps and R Markdown documents. Installation […]

ggedit 0.2.0 is now on CRAN

ggedit

 

You can install it as follows:

install.packages('ggedit')

The source version is still tracked on GitHub, which has been reorganized to be easier to navigate. Install it as follows:

devtools::install_github('metrumresearchgroup/ggedit')

What is ggedit?

ggedit is an R package that is used to facilitate ggplot formatting. With ggedit, R users of all experience levels can easily move from creating ggplots to refining aesthetic details, all while maintaining portability for further reproducible research and collaboration.

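ggedit is run from the R console or as a reactive object in a Shiny application. The user inputs a ggplot object or a list of objects. The application launches Bootstrap modals and populates them with all of the elements found in each layer, scale, and theme of the ggplot objects. The user can then edit these elements and interact with the plot as changes occur. During editing, a comparison of the script is logged, which can be copied directly and shared. The application output is a nested list containing the edited layers, scales, and themes in both object and script form, so the edited objects can be applied independently of the original plot using regular ggplot2 grammar.

Why does this matter? ggedit promotes efficient collaboration. You can share your plots with team members so they can make formatting changes, and they can then send any objects they have edited back to you for implementation.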

Updates in ggedit 0.2.0:

  • The layer modal (popups) elements have been reorganized for less clutter and easier navigation.
  • The S3 method written to plot and compare themes has been removed from the package, but can still be found on the repo, see plot.theme.

Deploying

  • call from the console: ggedit(p)
  • call from the addin toolbar: highlight script of a plot object on the source editor window of RStudio and run from toolbar.
  • call as part of Shiny: use the Shiny module syntax to call the ggEdit UI elements.
    • server: callModule(ggEdit,'pUI',obj=reactive(p))
    • ui: ggEditUI('pUI')
  • if you have installed the package you can see an example of a Shiny app by executing runApp(system.file('examples/shinyModule.R',package = 'ggedit'))
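The Shiny module calls listed above can be put together into a small app. This is a minimal sketch under stated assumptions rather than the package's own example: the iris scatter plot and the names ui, server, edited and app are illustrative, while ggEditUI('pUI') and callModule(ggEdit, 'pUI', obj = reactive(p)) are taken from the list above.

```r
library(shiny)
library(ggplot2)
library(ggedit)

# Illustrative plot to edit; any ggplot object could be used here
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
  geom_point()

# UI side: place the ggEdit module's UI elements under the id 'pUI'
ui <- fluidPage(
  ggEditUI("pUI")
)

# Server side: connect the module to the same id and pass the plot
# as a reactive, following the call shown in the list above
server <- function(input, output, session) {
  edited <- callModule(ggEdit, "pUI", obj = reactive(p))
}

# Build the app object; start it in an interactive session with runApp(app)
app <- shinyApp(ui = ui, server = server)
```

The id string passed to ggEditUI and callModule has to match, since that is how Shiny links a module's UI to its server logic.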

Outputs

ggedit returns a list containing 8 elements, either to the global environment or as a reactive output in Shiny.

  • updatedPlots
    • List containing updated ggplot objects
  • updatedLayers
    • For each plot a list of updated layers (ggproto) objects
    • Portable object
  • updatedLayersElements
    • For each plot a list of elements and their values in each layer
    • Can be used to update the new values in the original code
  • updatedLayerCalls
    • For each plot a list of scripts that can be run directly from the console to create a layer
  • updatedThemes
    • For each plot a list of updated theme objects
    • Portable object
    • If the user doesn’t edit the theme updatedThemes will not be returned
  • updatedThemeCalls
    • For each plot a list of scripts that can be run directly from the console to create a theme
  • updatedScales
    • For each plot a list of updated scales (ggproto) objects
    • Portable object
  • updatedScaleCalls
    • For each plot a list of scripts that can be run directly from the console to create a scale


Jonathan Sidi joined Metrum Research Group in 2016 after working for several years on problems in applied statistics, financial stress testing and economic forecasting in both industrial and academic settings. To learn more about additional open-source software packages developed by Metrum Research Group please visit the Metrum website. Contact: For questions and comments, feel free to email me at: yonis@metrumrg.com or open an issue for bug fixes or enhancements at github.

 

Source: ggedit 0.2.0 is now on CRAN – R-posts.com