## [패키지] forecast 8.0

In what is now a roughly annual event, the forecast package has been updated on CRAN with a new version, this time 8.0. A few of the more important new features are described below. Check residuals…

In what is now a roughly annual event, the forecast package has been updated on CRAN with a new version, this time 8.0.

A few of the more important new features are described below.

### Check residuals

A common task when building forecasting models is to check that the residuals satisfy some assumptions (that they are uncorrelated, normally distributed, etc.). The new function checkresiduals makes this very easy: it produces a time plot, an ACF, a histogram with super-imposed normal curve, and does a Ljung-Box test on the residuals with appropriate number of lags and degrees of freedom.

 fit <- auto.arima(WWWusage) checkresiduals(fit)

##
##  Ljung-Box test
##
## data:  residuals
## Q* = 7.8338, df = 8, p-value = 0.4499
##
## Model df: 2.   Total lags used: 10

This should work for all the modelling functions in the package, as well as some of the time series modelling functions in the stats package.

### Different types of residuals

Usually, residuals are computed as the difference between observations and the corresponding one-step forecasts. But for some models, residuals are computed differently; for example, a multiplicative ETS model or a model with a Box-Cox transformation. So the residuals() function now has an additional argument to deal with this situation.

“Innovation residuals”” correspond to the white noise process that drives the evolution of the time series model. “Response residuals” are the difference between the observations and the fitted values (as with GLMs). For homoscedastic models, the innovation residuals and the one-step response residuals are identical. “Regression residuals” are also available for regression models with ARIMA errors, and are equal to the original data minus the effect of the regression variables. If there are no regression variables, the errors will be identical to the original series (possibly adjusted to have zero mean).

 library(ggplot2) fit <- ets(woolyrnq) res <- cbind(Residuals = residuals(fit), Response.residuals = residuals(fit, type='response')) autoplot(res, facets=TRUE)

### Some new graphs

The geom_histogram() function in the ggplot2 package is nice, but it does not have a good default bandwidth. So I added the gghistogram function which provides a quick histogram with good defaults. You can also overlay a normal density curve or a kernel density estimate.

 gghistogram(lynx)

The ggseasonplot function is useful for studying seasonal patterns and how they change over time. It now has a polar argument to create graphs like this.

 ggseasonplot(USAccDeaths, polar=TRUE)

I often want to add a time series line to an existing plot. Base graphics has line() which works well when a time series is passed as an argument. So I added autolayer which is similar (but more general). It is an S3 method like autoplot, and adds a layer to an existing ggplot object. autolayer will eventually form part of the next release of ggplot2, but for now it is available in the forecast package. There are methods provided for ts and forecast objects:

 WWWusage %>% ets %>% forecast(h=20) -> fc autoplot(WWWusage, series="Data") + autolayer(fc, series="Forecast") + autolayer(fitted(fc), series="Fitted")

### Cross-validation

The tsCV and CVar functions have been added. These were discussed in a previous post.

### Bagged ETS

The baggedETS function has been added, which implements the procedure discussed in Bergmeir et al (2016) for bagging ETS forecasts.

### head and tail of time series

I’ve long found it annoying that head and tail do not work on multiple time series. So I added some functions to the package so they now work.

### Imports and Dependencies

The pipe operator from the magrittr package is now imported. So you don’t need to load the magrittr package to use it.

There are now no packages that are loaded with forecast – everything required is imported. This makes the start up much cleaner (no more annoying messages from all those packages being loaded). Instead, some random tips are occasionally printed when you load the forecast package (much like ggplot2 does).

There is quite a bit more — see the Changelog for a list.

## 시각적 자극에 대한 뇌 반응 모델링 – superheat 패키지

이 사례 연구는 UC Berkeley의 Gallant 신경 과학 연구소 (Gallant lab 관련 논문 참조)에서 한 개인에 대해 수행한 기능적 자기 공명 영상 (fMRI) 실험에서 수집 한 데이터를 검사합니다. edu / mlpapers / paper_files / NIPS2008_0963.pdf “> 참조1 참조2 ).

fMRI는 신경 활동의 간접적인 척도로 간주 될 수있는 뇌의 산소가 공급 된 혈액 흐름을 측정합니다 (두 과정은 상호 연관성이 높음). fMRI 실험에서 얻은 측정치는 뇌의 큐브와 같은 화소 내에서 수십만개 뉴런의 총 응답과 일치합니다. 두뇌를 3D 화소로 세분화하는 것은 이미지를 2D 픽셀로 세분화하는 것과 유사합니다.

### The data

이 데이터에는 1,750 가지의 이미지 (예 : 아기, 집 또는 그림의 영상)에 대한 반응으로 한 개인의 시각 피질에있는 약 1,300개 화소에 대한 fMRI 측정 값 입니다.

진행 상황을 시각화하기 위해 아래 그림은 fMRI 시스템에서 개별 이미지를 보는 모습입니다.

각각의 이미지는 길이 1282 = 16,3841282 = 16,384의 벡터로 표현 될 수 있지만 가보 (Gabor) 웨이블릿 변환을 통해 길이 10,92110,921로 감소 될 수있는 128 × 128128 × 128 픽셀 그레이 스케일 이미지이다.

원시 데이터는 Computational Neuroscience (CRCNS) 데이터 공유 저장소의 협업 연구에 저장되며 여기에서 찾을 수 있습니다. 데이터에 액세스하려면 데이터로 수행 할 작업을 설명하는 CRCNS 계정을 요청해야하지만 이는 매우 간단합니다.

Installing the superheat package from github is easy, assuming you have the devtools package installed in R. Simply type the following command:

# install devtools if you don't have it already
install.packages("devtools")
# install the development version of superheat
devtools::install_github("rlbarter/superheat")

Assuming that you didn’t run into any unfortunate errors when installing the package, you can load the package into R in the normal way.

library(superheat)

### Viewing the raw images

# some useful libraries
library(ggplot2)
library(dplyr)
library(gridExtra)
library(knitr)
library(RColorBrewer)

It is helpful to have an idea of what kind of images the subject was viewing. The raw images are contained in the Stimuli.mat file, and is separated into a set of 1,750 training images (used to train our models) and 120 validation images (used to evaluate model performance).

The code below loads in the data and extracts the training and validation images

# a library for loading matlab files
library(R.matlab)
# load in the stimuli images
# extract training stimuli array
train.stimuli <- stimuli[[1]]
# extract validation stimuli array
val.stimuli <- stimuli[[2]]
# remove the original stimuli object
rm(stimuli)

We display four of the training images below using our superheat package.

# view some of the images
im1 <- superheat(train.stimuli[1, 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = F)
im2 <- superheat(train.stimuli[3, 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = F)
im3 <- superheat(train.stimuli[10, 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = F)
im4 <- superheat(train.stimuli[15, 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = F)
# place the images in a 2x2 grid
grid.arrange(im1$plot, im2$plot, im3$plot, im4$plot, ncol = 2)

### Preparing the data for analysis

Despite best efforts, sadly not all of the data is publicly available. The Gabor wavelet filters, for example, are not available on the CRCNS website (see above), but the dedicated reader can try to generate their own Gabor wavelets using the raw images contained in Stimuli.mat (see above).

### Gabor wavelet features

I, however, have access to a file, fMRIdata.RData which contains the Gabor wavelet features for the training and validation set images. These Gabor feature matrices are contained in the fit_feat and val_feat objects from the fMRIdata.Rdata file (again, this is unfortunately not publicly available).

# load in the gabor wavelet filters
load("processed_data/fMRIdata.RData") # not available in github
# fit_feat contains the gabor wavelet features for the 1750 training images
train.feat <- fit_feat
dim(train.feat)
## [1]  1750 10921
# val_feat contains the gabor wavelet features for the 120 validation images
val.feat <- val_feat
dim(val.feat)
## [1]   120 10921
# Remove the other objects that correspond to the responses of a subset of the voxels
rm(fit_feat)
rm(val_feat)
rm(resp_dat)
rm(loc_dat)

### The voxel responses

The voxel responses to each image was collected for two subjects, “S1” and “S2”. We will restrict our analysis to predicting the voxel response in the V1 region for Subject 1 only (see the image below taken from Matthew Schmolesky’s Webvision).

Loading in the voxel responses (from the EstimatedResponses.mat file) involves first converting the .mat file to a format readable by R. Specifically I had to convert the data to version 6 MATLAB file in Octave:

# In Octave, run:
>> save("-V6", "EstimatedResponsesV6.mat", "dat")

Having converted the original MATLAB file to version 6 MATLAB file, we can load it into R using the R.matlab package.

# load in the Version 6 matlab file
voxel.response <- readMat("processed_data/EstimatedResponsesV6.mat")$dat dimnames(voxel.response)[[1]] ## [1] "dataTrnS1" "dataTrnS2" "dataValS1" "dataValS2" "roiS1" "roiS2" ## [7] "voxIdxS1" "voxIdxS2" We can then filter through the objects contained in this file to extract only the responses for the 1331 V1 voxels to the training and validation images. # extract useful objects # the columns of train.resp are the 1,750 training images # the rows are the 25,915 voxels train.resp <- voxel.response[[1]] # the columns of val.resp are the 120 training images # the rows are the 25,915 voxels val.resp <- voxel.response[[3]] # extract the V1 voxels V1.vox.index <- which(voxel.response[[5]][, 1] == 1) # there are 1331 V1 voxels length(V1.vox.index) ## [1] 1331 # filter train.resp and val.resp to the V1 voxels only train.resp <- train.resp[V1.vox.index, ] val.resp <- val.resp[V1.vox.index, ] # remove the remaining data rm(voxel.response) ### Cleaning the data The final data objects in our workspace are ls() ## [1] "im1" "im2" "im3" "im4" ## [5] "train.feat" "train.resp" "train.stimuli" "V1.vox.index" ## [9] "val.feat" "val.resp" "val.stimuli" "voxel.response" where • train.stimuli: a 1750×128×1281750×128×128 array corresponding to the 1,750 raw training images each of dimension 128×128128×128. • val.stimuli: a 120×128×128120×128×128 array corresponding to the 120 raw validation images each of dimension 128×128128×128. • train.feat: a 1750×109211750×10921 matrix corresponding to the 10,921 Gabor wavelet features for each of the 1,750 training images. • val.feat: a 120×10921120×10921 matrix corresponding to the 10,921 Gabor wavelet features for each of the 120 validation images. • train.resp: a 1331×17501331×1750 matrix containing the responses of the 1,331 voxels in the V1 region to each of the 1,750 training images. • val.resp: a 1331×1201331×120 matrix containing the responses of the 1,331 voxels in the V1 region to each of the 120 validation images. #### Missing values Note that of the 1,331 voxels in the V1 region, 37 of them have at least 40% missing responses. So we will remove these voxels. # identify the proportion of missing values for each voxel missing.variables <- apply(train.resp, 1, function(x) sum(is.na(x)) / length(x)) # print the proportion of missingness for the voxels with at # least some missingness. length(missing.variables[missing.variables > 0]) ## [1] 37 # remove these voxels from the training and validation sets train.resp <- t(train.resp[which(missing.variables == 0), ]) val.resp <- t(val.resp[which(missing.variables == 0), ]) save(train.feat, train.resp, file = "processed_data/fmri_training_data.RData") save(val.feat, val.resp, file = "processed_data/fmri_validation_data.RData") # number of voxels remaining after removing missing values ncol(train.resp) ## [1] 1294 We now thus have 1,294 voxels in the V1 region. ### Modeling For each of our models, our goal is to predict the response to the viewing of an image for each of the 1,294 voxels (cubic region in the brain). That is, we have 1,294 separate models (one for each voxel), where the predictors/variables correspond to the 10,921 Gabor features from the training images. #### Feature selection To make the problem computationally feasible, we decided to filter the 10,921 Gabor features to the 500 that were most correlated with each voxel. To identify the correlation of each Gabor feature with each voxel response, we ran the following code (also contained in the code/voxel_cor.R file that can be found here) on the statistics department computer cluster at UC Berkeley. You could probably run it on your laptop, but I would advise against it unless you are incredibly patient. library(parallel) nCores <- 24 # to set manually cl <- makeCluster(nCores) # export the necessary variables clusterExport(cl, c("train.feat", "train.resp", "glmnet"), envir=environment()) # calcualte the correlation of each variable with each voxel cor.vox <- parLapply(cl, data.frame(train.resp), function(voxel) { apply(train.feat, 2, function(feature) cor(voxel, feature)) }) # save the results save(cor.vox, file = "results/voxel_cor.RData") Next, to identify the 500 Gabor features that are most correlated with each voxel, we can simply load the correlation data saved above and run the following code. load("results/voxel_cor.RData") # identify the 500 most correlated features for each variable top.features <- lapply(cor.vox, function(cor) { order(cor, decreasing = TRUE)[1:500] }) ### Lasso + OLS model We now fit a lasso linear model to predict the voxel responses to the image stimuli. The 1750×5001750×500 design matrix (feature matrix) can be represented as follows: X=⎡⎣⎢⎢⎢⎢x1,1x2,1x1750,1x1,2x2,2x1750,2............x1,500x2,500x1750,500⎤⎦⎥⎥⎥⎥X=[x1,1×1,2…x1,500×2,1×2,2…x2,500⋮⋮…⋮x1750,1×1750,2…x1750,500] where xi,jxi,j is the jjth Gabor wavelet feature value for the iith image. We have 1,294 response vectors (as we fit 1,294 separate models) each of which contains the response for an individual voxel to the 17501750 training images. The response vector for voxel v{1,...,1294}v∈{1,…,1294} can be written as: yv=⎡⎣⎢⎢⎢⎢⎢yv1yv2yv1750⎤⎦⎥⎥⎥⎥⎥yv=[y1vy2v⋮y1750v] To fit these 1,294 models, the following code was run using the Statistics Department cluster at UC Berkeley. You will not be able to run this code on your laptop. library(glmnet) library(parallel) # set up the cluster for parallelization nCores <- 24 # to set manually cl <- makeCluster(nCores) # load the CV parameter selection function source("code/select_lambda.R") # fit the lasso model to each voxel using the 1000 most correlated features # for the voxel. clusterExport(cl, c("train.feat", "train.resp", "glmnet", "top.features"), envir=environment()) lasso.list <- parLapply(cl, 1:ncol(train.resp), function(voxel) { glmnet(x = train.feat[, top.features[[voxel]]], y = as.vector(train.resp[ , voxel])) }) # extract the lambda sequence from each voxel # we want to avoid the ends of the sequence # note that some sequences terminate early so providing a # universal endpoint for the vector sometimes produces an error lambda.seq <- lapply(lasso.list, function(model) { model$lambda[5:length(model$lambda)] }) # use CV to select the best lambda for each voxel-model # export the necessary variables clusterExport(cl, c("train.feat", "train.resp", "top.features", "cv.glmnet", "lambda.seq"), envir=environment()) # the ith entry of selected.lambda corresponds to the # best lambda value for voxel i selected.lambda <- selectLambda(cl = cl, x.mat = train.feat, y.mat = train.resp, features = top.features, lambda.seq = lambda.seq) # save the results save(lasso.list, selected.lambda, top.features, file = "results/lasso_results_top500.RData") # stop the cluster stopCluster(cl) Having run the above code on the cluster and saved the results in the file lasso_results_top500.RData, we can load the results into the current R session and calculate the corresponding voxel response predictions. First, we can extract the model coefficients for each model (since we saved the results for each possible lambda, we need to filter to the model with the “best” lambda). load("results/lasso_results_top500.RData") # extract the best model from lasso.list (above) corresponding to the best lambda beta <- lapply(1:length(lasso.list), function(voxel) { model.index <- which(lasso.list[[voxel]]$lambda == selected.lambda[voxel])
as.matrix(lasso.list[[voxel]]$beta)[, model.index] }) We can also identify which feature was selected (had a non-zero lasso coefficient) by each voxel. # identify the selected features for each voxel selected.features.list <- lapply(1:length(top.features), function(voxel) { # idenitfy which features were selected for the voxel # start with the features used to train the lasso model voxel.features <- top.features[[voxel]] # obtain the coefficients for these feautures voxel.features.selected <- beta[[voxel]] != 0 # filter the features index and beta vector to the non-zero coefficients voxel.features <- voxel.features[voxel.features.selected] }) Next, for each voxel we re-fit a classical un-regularized OLS model for each voxel based on the selected (non-zero) coefficients from the corresponding lasso model. # fit an OLS model for each voxel based on the selected features ols.list <- lapply(1:ncol(train.resp), function(voxel) { voxel.features <- selected.features.list[[voxel]] # filter the feature matrix to these selected features voxel.train <- as.data.frame(train.feat[, voxel.features]) colnames(voxel.train) <- paste0("X", 1:ncol(voxel.train)) voxel.train$y <- train.resp[, voxel]
# fit an OLS model
lm(y ~ ., data = voxel.train)
})

#### Evaluating the model on the validation set

The code below extracts the predicted voxel response for the validation set.

# calculate the predicted voxel responses for each image in the TRAINING set
predictions <- lapply(1:ncol(val.resp), function(voxel) {
# extract the selected Gabor features for the current voxel
voxel.val <- as.data.frame(val.feat[, selected.features.list[[voxel]]])
colnames(voxel.val) <- paste0("X", 1:ncol(voxel.val))
# predict the voxel response for the validation images
predict.lm(ols.list[[voxel]], voxel.val)
})

Below we calculate and plot the correlation of the true voxel responses with the predicted voxel responses for each of the 1,294 voxels.

# calculate the correlation between the predictions and the true responses for each voxel
prediction.cor <- sapply(1:ncol(val.resp), function(voxel) {
cor(predictions[[voxel]], val.resp[ ,voxel])
})
# convert to data frame
prediction.cor <- data.frame(voxel = 1:length(prediction.cor), cor = prediction.cor)
# plot a histogram of the correlations between the true and predicted voxel responses
ggplot(prediction.cor) +
geom_histogram(aes(x = cor), col = "white", binwidth = 0.02) +
scale_y_continuous(name = "Number of voxels") +
scale_x_continuous(name = "Correlation of predicted and true voxel response") 

We see that there appear to be two groups of voxels: those whose whose true and predicted correlations are quite low (around 0.2), and another group whose correlations are around 0.6.

#### Visualizing voxel performance accross all voxels

We will now use superheat to simultaneously evaluate the performance of the lasso models for predicting the responses to the validation images of each of the 1294 voxels.

First we will cluster the images and voxels into two groups for visualization purposes.

set.seed(384653)
# calculate row clusters
image.clusters <- kmeans(val.resp, centers = 2)$cluster # calcualte column clusters voxel.clusters <- kmeans(t(val.resp), centers = 2)$cluster
save(voxel.clusters, file = "results/voxel_clusters.RData")

Below we plot some examples of images from each cluster clusters. The following presents four images from the first cluster

set.seed(23847)
four.images.cl1 <- sample(which(image.clusters == 1), 4)
cl1.im1 <- superheat(val.stimuli[four.images.cl1[1], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
cl1.im2 <- superheat(val.stimuli[four.images.cl1[2], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
cl1.im3 <- superheat(val.stimuli[four.images.cl1[3], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
cl1.im4 <- superheat(val.stimuli[four.images.cl1[4], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
grid.arrange(cl1.im1$plot, cl1.im2$plot, cl1.im3$plot, cl1.im4$plot, ncol = 2)

While the next four images are from the second cluster.

set.seed(23847)
four.images.cl2 <- sample(which(image.clusters == 2), 4)
cl2.im1 <- superheat(val.stimuli[four.images.cl2[1], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
cl2.im2 <- superheat(val.stimuli[four.images.cl2[2], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
cl2.im3 <- superheat(val.stimuli[four.images.cl2[3], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
cl2.im4 <- superheat(val.stimuli[four.images.cl2[4], 128:1, ],
heat.pal = c("black", "white"),
legend = FALSE,
print.plot = FALSE)
grid.arrange(cl2.im1$plot, cl2.im2$plot, cl2.im3$plot, cl2.im4$plot, ncol = 2)

Next, we plot a heatmap of the validation response matrix, that is, the response of each voxel to each validation image. Above each column in the heatmap (each column corresponds to a voxel), we plot the correlation of the observed voxel response with the predicted voxel response.

superheat(val.resp,
yt = prediction.cor$cor, yt.axis.name = "Correlation between\npredicted and true\nvoxel responses", heat.pal = brewer.pal(5, "RdBu"), yt.obs.col = rep("slategray4", ncol(val.resp)), yt.point.alpha = 0.6, yt.axis.name.size = 12, yt.plot.size = 0.7, yt.point.size = 2, membership.rows = image.clusters, membership.cols = voxel.clusters, left.label = "none", bottom.label = "none", grid.hline.col = "white", grid.vline.col = "white", grid.hline.size = 2, grid.vline.size = 2, row.title = "Validation images (120)", column.title = "Voxels (1,294)", title = "(a)") The heatmap above is very noisy as a lot of information is being crammed into a small number of pixels. It is thus often much easier to “smooth” the heatmap within its clusters to highlight the “big picture”. superheat(val.resp, X.text = matrix(c("lower\nvoxel\nresponse", "higher\nvoxel\nresponse", "neutral\nvoxel\nresponse", "neutral\nvoxel\nresponse"), ncol = 2), X.text.col = matrix(c("white", "white", "black", "black"), ncol = 2), smooth.heat = T, heat.pal = brewer.pal(5, "RdBu"), yt = prediction.cor$cor,
yt.axis.name = "Correlation between\npredicted and true\nvoxel responses",
yt.plot.type = "boxplot",
yt.cluster.col = "slategray4",
yt.axis.name.size = 12,
yt.plot.size = 0.7,

membership.rows = image.clusters,
membership.cols = voxel.clusters,

left.label = "none",
bottom.label = "none",
grid.hline.col = "white",
grid.vline.col = "white",
grid.hline.size = 2,
grid.vline.size = 2,

row.title = "Validation images (120)",
column.title = "Voxels (1,294)",
title = "(b)")
## [1] -0.30 -0.20 -0.04  0.09  0.20

What we can see is that the two clusters of voxels behave very differently:

• the first cluster of voxels is much more active in response to viewings of images from the second cluster of images than to images from the first cluster.
• the second cluster of voxels does not appear to exhibit a strong difference in response to the two clusters of images.

Furthermore, the predicted responses for the first cluster of voxels are much more correlated with the true responses for these voxels, with an average correlation of around 0.5. The predicted responses to the second cluster of voxels, on the other hand, have approximately zero correlation with the true responses exhibited by these voxels.

### Exploring the voxel clusters

Notice that we found two clusters of voxels that respond to the visual stimuli differently: the first cluster of voxels is highly sensitive to visual stimuli, whereas the second cluster is not.

Our goal is to explore the physical locations of the voxels. We can load in the locations data that can be found on Yuval Benjamini’s website.

load("raw_data/v1_locations.RData")
# v1 locations
dim(v1_locations)
## [1] 1331    3
##      [,1] [,2] [,3]
## [1,]   34   22    3
## [2,]   34   23    3
## [3,]   34   24    3
## [4,]   34   25    3
## [5,]   34   26    3
## [6,]   34   20    4

Note that v1_locations appears to hold the (x, y, z)-locations of each V1 voxel. However, recall that we had a number of voxels with mostly missing values. We remove these from our location data frame below.

v1.locations <- as.data.frame(v1_locations[which(missing.variables == 0), ])
colnames(v1.locations) <- c("x", "y", "z")
v1.locations\$cluster <- factor(voxel.clusters)
rm(v1_locations)

Next, we can plot the voxels in space.

library(plotly)
voxel.clusters <- factor(paste("cluster", voxel.clusters))
plot_ly(v1.locations, x = ~x, y = ~y, z = ~z,
color = ~cluster,
type = "scatter3d",
mode = "markers",
colors = c('#e41a1c', '#377eb8'),
marker = list(line = list(color = 'black', width = 1.5))) %>%
layout(scene = list(xaxis = list(title = 'x'),
yaxis = list(title = 'y'),
zaxis = list(title = 'z')))

library(plotly)
voxel.clusters <- factor(paste("cluster", voxel.clusters))
plot_ly(v1.locations, x = ~x, y = ~y, z = ~z,
color = ~cluster,
type = "scatter3d",
mode = "markers",
colors = c('#e41a1c', '#377eb8'),
marker = list(line = list(color = 'black', width = 1.5))) %>%
layout(scene = list(xaxis = list(title = 'x'),
yaxis = list(title = 'y'),
zaxis = list(title = 'z')))

소스: Superheat

## 빅데이터 인포그래픽스

빅데이터를 한눈에 파악해 볼 수 있는 코너인 “빅데이터 인포그래픽스” 입니다. 각종 데이터나 정보를 시각적으로 표현해서 빅데이터에 대한 이해를 좀더 쉽게 했습니다.