[Coursera] Course 7. Data Analysis with R programming

WEEK 1 - Programming and data analytics
WEEK 2 - Programming using RStudio
WEEK 3 - Working with data in R
WEEK 4 - More about visualizations, aesthetics, and annotations

[Definition]

-Programming languages: The words and symbols we use to write instructions for computers to follow

-R: A programming language frequently used for statistical analysis, visualization, and other data analysis

-Open source: Code that is freely available and may be modified and shared by the people who use it

-Integrated Development Environment (IDE): A software application that brings together all the tools you may want to use in a single place

-CRAN (Comprehensive R Archive Network): An online archive with R packages, source code, manuals, and documentation

-Packages (R): Units of reproducible R code

-Tidyverse (R): A system of packages in R with a common design philosophy for data manipulation, exploration, and visualization

-ggplot2 (R): A package for creating a variety of data visualizations by applying different visual properties to the variables in your data

-tidyr (R): A package used for data cleaning to make tidy data

-readr (R): A package used for importing data

-dplyr (R): Offers a consistent set of functions that help you complete some common data manipulation tasks

-Mapping (R): Matching up a specific variable in your dataset with a specific aesthetic

[The basic concepts of R]

  • Functions: A body of reusable code used to perform specific tasks in R
  • Comments: Describe or explain what's going on in the code
  • Variables: A representation of a value in R that can be stored for use later during programming
  • Data types: The kind of value a variable holds, such as logical, integer, double, or character
  • Vectors: A group of data elements of the same type
  • Pipes: A tool in R for expressing a sequence of multiple operations, represented with "%>%"
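
As a minimal sketch of how these concepts fit together, the snippet below uses a made-up vector of flipper lengths and a hypothetical helper function, `mm_to_cm()`. Neither comes from the course material. It assumes the magrittr package, which is installed with the tidyverse, is available to provide `%>%`.

```r
# Assumption: magrittr (installed with the tidyverse) is available for %>%.
library(magrittr)

# Variable holding a vector: a group of elements of the same type (double)
flipper_lengths <- c(181, 186, 195, 193)

# Data types
typeof(TRUE)             # "logical"
typeof(5L)               # "integer"
typeof(flipper_lengths)  # "double"
typeof("Adelie")         # "character"

# Function: reusable code for a specific task (hypothetical helper)
mm_to_cm <- function(mm) {
  mm / 10
}

# Pipe: pass the result of one step into the next
mean_cm <- flipper_lengths %>%
  mm_to_cm() %>%
  mean()

mean_cm
```

Every line starting with `#` is a comment, so R ignores it when the code runs.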

[The most common data structures in R]

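  • Vector: A one-dimensional sequence of data elements. Can only contain a single data type.
  • List: A one-dimensional sequence whose elements can be of any type.
  • Array: Like a matrix, but can have more than two dimensions. Can only contain a single data type.
  • Data frame: A collection of columns of equal length, similar to a spreadsheet. Each column can hold a different data type.
  • Matrix: A two-dimensional (rows and columns) collection of data elements. Can only contain a single data type.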
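
The structures above can be created in base R as follows. This is an illustrative sketch, and the values are made up for the example.

```r
# Vector: one dimension, one data type
species <- c("Adelie", "Gentoo", "Chinstrap")

# List: elements can be of any type
record <- list("Adelie", 181L, TRUE)

# Matrix: two dimensions, one data type
m <- matrix(1:6, nrow = 2)

# Array: like a matrix, but can have more than two dimensions
a <- array(1:24, dim = c(2, 3, 4))

# Data frame: columns of equal length, each with its own type
df <- data.frame(
  species = species,
  flipper_length_mm = c(181, 217, 195)
)

str(df)   # shows each column with its type
dim(m)    # 2 3
dim(a)    # 2 3 4
```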

[What Packages include]

  • Reusable R functions
  • Documentation about the functions
  • Sample datasets
  • Tests for checking your code

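[Functions for summarizing a data frame]

  • skim_without_charts(): A detailed summary of every column (skimr package)
  • glimpse(): A compact overview of column names, types, and the first few values (dplyr package)
  • head(): The first rows of the data frame (six by default)
  • select(): Picks specific columns to preview; it subsets the data rather than summarizing it (dplyr package)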
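
A short sketch of these functions in use, assuming the dplyr, skimr, and palmerpenguins packages are installed. The `penguins` dataset is taken from palmerpenguins, and the variable name `preview` is only for illustration.

```r
# Assumptions: dplyr, skimr, and palmerpenguins are installed.
library(dplyr)
library(skimr)
library(palmerpenguins)

skim_without_charts(penguins)   # detailed summary of every column
glimpse(penguins)               # column names, types, and first values
head(penguins)                  # first six rows

# select() keeps only the named columns
preview <- penguins %>%
  select(species, flipper_length_mm) %>%
  head(3)

preview
```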

[Operators]

1. Arithmetic +, -, *, /, %%, %/%, ^
2. Relational <, >, <=, >=, ==, !=
3. Logical &, &&, |, ||, ! 
4. Assignment <-, <<-, =, ->, ->>
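
A few of the less obvious operators are sketched below in base R. The variable names are arbitrary and chosen only for this example.

```r
# Arithmetic
7 %% 2     # modulus (remainder): 1
7 %/% 2    # integer division: 3
2 ^ 3      # exponent: 8

# Relational
5 >= 3     # TRUE
5 != 5     # FALSE

# Logical: & and | work element by element on vectors;
# && and || compare single values
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)
x & y                  # TRUE FALSE FALSE
x | y                  # TRUE TRUE TRUE
!x                     # FALSE TRUE FALSE
(5 > 3) && (2 > 1)     # TRUE

# Assignment
a <- 10      # leftward (the most common style)
20 -> b      # rightward
c_val = 30   # = also assigns at the top level

# <<- assigns to a variable that lives outside the function
counter <- 0
increment <- function() {
  counter <<- counter + 1
}
increment()
counter      # 1
```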

[Core concepts in ggplot2]

  • Aesthetics: A visual property of an object in your plot
  • Geoms: The geometric object used to represent your data
  • Facets: Let you display smaller groups, or subsets of your data
  • Labels and annotations: Let you customize your plot

ex) ggplot(data = penguins) + geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))

[Facet functions]

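  • facet_wrap(): Facets the plot by a single variable
  • facet_grid(): Facets the plot by two variables, arranged in rows and columns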
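
The sketch below extends the penguins scatter plot with labels, an annotation, and both facet functions. It assumes the ggplot2 and palmerpenguins packages are installed. The title text, annotation text, and annotation position are made up for illustration.

```r
# Assumptions: ggplot2 and palmerpenguins are installed.
library(ggplot2)
library(palmerpenguins)

# Base plot: map variables to the x, y, and color aesthetics
p_base <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(
    title = "Body mass vs. flipper length",
    x = "Flipper length (mm)",
    y = "Body mass (g)"
  )

# Annotation: add a text note at a chosen position on the plot
p_annotated <- p_base +
  annotate("text", x = 220, y = 3500, label = "Example annotation")

# facet_wrap(): one panel per value of a single variable
p_wrap <- p_base + facet_wrap(~species)

# facet_grid(): rows by one variable, columns by another
p_grid <- p_base + facet_grid(sex ~ species)

p_annotated
p_wrap
p_grid
```

Rows with missing measurements are dropped from the plot with a warning, which is expected for this dataset.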

[Resources for R]

  • Tutorials (ourcodingclub.github.io): https://ourcodingclub.github.io/tutorials.html
  • rOpenSci, open tools for open science: https://ropensci.org/
  • CRAN Task Views: https://cran.r-project.org/web/views/
  • The Tidyverse Cookbook: https://rstudio-education.github.io/tidyverse-cookbook/
  • R for Data Science: https://r4ds.had.co.nz/index.html
  • Visualizing data with ggplot2 (datacarpentry.org): https://datacarpentry.org/dc_zurich/R-ecology/05-visualisation-ggplot2.html
  • Data Visualization with ggplot2: https://viz-ggplot2.rsquaredacademy.com/index.html
  • prettydoc, Creating Pretty HTML From R Markdown: https://github.com/yixuan/prettydoc/