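14.12.26 Friday Computing Lab

Programming, Statistics / R (Computing Lab) 2014. 12. 26. 14:46

##Class notes

p.53 Reading external data

cyworld.com/biicii

> Download the ㅎㅎ1 and ㅎㅎ2 data files from the board

#A matrix holds only one type of value: all numeric or all character

#A data frame can hold numeric and character columns together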
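
A small sketch of the matrix vs. data frame difference noted above. The values and the names `m` and `df` are made up for illustration; they are not from the class data.

```r
# A matrix has a single type, so mixing numbers and text coerces everything to character.
m <- cbind(c(1, 2), c("a", "b"))
print(is.character(m))       # TRUE: the numbers were turned into "1" and "2"

# A data frame keeps one type per column.
df <- data.frame(num = c(1, 2), chr = c("a", "b"), stringsAsFactors = FALSE)
print(sapply(df, class))     # num is "numeric", chr is "character"
```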

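p.39 attach function

rbind (combine two data sets by rows)

**Chapter 2 exercises 9 and 10: report!** Collect them and hand them in on the midterm exam day (January 2)

##p.74,75 Math and statistics functions: memorize all of them

##p.75 Character functions

##p.82 Exam topic >> 3.1.5 Variable transformation (Recoding)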
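
Since recoding is flagged as an exam topic, here is a minimal sketch of two common ways to recode a numeric variable into groups. The values, the cut points (60, 70, 75) and the labels are arbitrary assumptions, and the textbook's section 3.1.5 may use a different approach.

```r
# Made-up life expectancy values, only for illustration
lifeexpf <- c(44, 75, 80, 53)

# Two groups with ifelse(): condition, value if TRUE, value if FALSE
group2 <- ifelse(lifeexpf >= 70, "high", "low")
print(group2)

# Three groups with cut(): intervals are (0,60], (60,75], (75,Inf]
group3 <- cut(lifeexpf, breaks = c(0, 60, 75, Inf),
              labels = c("low", "mid", "high"))
print(group3)
```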

program1.R

# 14.12.26 Friday ================================
install.packages("xlsx")
library(xlsx)

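world95_1<-read.xlsx("D:/R/world95_1.xlsx", 1,header=TRUE) #first file, sheet 1

world95_2<-read.xlsx("D:/R/world95_2.xlsx", 1,header=TRUE) #second file (original note: pasted into sheet 2 and loaded); stored as world95_2 for the later rbind()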

length(world95_1) # number of variables (columns)
[1] 26

dim(world95_1) # dimensions: rows, columns
[1] 49 26

str(world95_1)    ## str shows the structure of an object: 49 observations, 26 variables
'data.frame':   49 obs. of 26 variables:
$ country : Factor w/ 49 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
$ populatn: num 20500 33900 3700 17800 8000 7400 600 125000 256 10300 ...
$ density : num 25 12 126 2.3 94 86 828 800 605 50 ...
$ urban   : Factor w/ 38 levels ".","12","15",..: 5 35 26 34 21 19 33 4 14 25 ...
$ religion: Factor w/ 8 levels "Animist","Buddhist",..: 4 3 5 6 3 4 4 4 6 5 ...
$ lifeexpf: num 44 75 75 80 79 75 74 53 78 76 ...
$ lifeexpm: num 45 68 68 74 73 67 71 53 73 66 ...
$ literacy: Factor w/ 28 levels ".","100","18",..: 6 25 27 2 28 27 16 7 28 28 ...
$ pop_incr: num 2.8 1.3 1.4 1.38 0.2 1.4 2.4 2.4 0.21 0.32 ...
$ babymort: num 168 25.6 27 7.3 6.7 35 25 106 20.3 19 ...
$ gdp_cap : num 205 3408 5000 16848 18396 ...
$ region : num 3 6 5 1 1 5 5 3 6 2 ...
$ calories: Factor w/ 34 levels ".","1667","1916",..: 1 23 1 24 30 1 1 6 1 1 ...
$ aids    : Factor w/ 46 levels ".","0","1","10",..: 2 27 14 32 6 1 8 3 28 4 ...
$ birth_rt: num 53 20 23 15 12 23 29 35 16 13 ...
$ death_rt: num 22 9 6 8 11 7 4 11 8.4 11 ...
$ aids_rt : Factor w/ 47 levels ".","0","0.000857632933104631",..: 2 18 6 32 24 1 27 3 23 7 ...
$ log_gdp : num 2.31 3.53 3.7 4.23 4.26 ...
$ lg_aidsr: Factor w/ 47 levels ".","0","0.24359046806108",..: 2 25 6 34 28 1 16 3 47 7 ...
$ b_to_d : num 2.41 2.22 3.83 1.88 1.09 ...
$ fertilty: Factor w/ 39 levels ".","1.4","1.47",..: 38 18 21 11 4 18 24 27 7 10 ...
$ log_pop : num 4.31 4.53 3.57 4.25 3.9 ...
$ cropgrow: Factor w/ 30 levels ".","1","10","12",..: 4 30 8 24 8 9 10 26 28 16 ...
$ lit_male: Factor w/ 27 levels ".","100","28",..: 7 25 2 2 1 2 10 8 27 2 ...
$ lit_fema: Factor w/ 25 levels ".","100","14",..: 3 23 2 2 1 2 12 6 25 2 ...
$ climate : Factor w/ 9 levels ".","1","3","4",..: 3 8 1 3 8 3 3 5 5 8 ...
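
Several columns in the `str()` output above (`urban`, `literacy`, `calories`, ...) came in as factors with a "." level, which suggests that "." marks a missing value in the Excel file. That reading is an assumption. A minimal sketch of turning such a column back into numbers, using a made-up toy vector rather than the real data:

```r
# Toy stand-in for a column like `urban`: numbers stored as text, "." for missing
urban <- factor(c("18", "86", ".", "54"))

# Pitfall: as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(urban)

# Go through character first, and turn "." into NA before converting
urban_chr <- as.character(urban)
urban_chr[urban_chr == "."] <- NA
urban_num <- as.numeric(urban_chr)

print(urban_num)                     # 18 86 NA 54
print(mean(urban_num, na.rm = TRUE)) # mean of the three non-missing values
```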

#lifeexpf: female life expectancy, lifeexpm: male life expectancy

summary(world95_1$lifeexpf) #summary of the lifeexpf variable in world95_1
   Min. 1st Qu. Median    Mean 3rd Qu.    Max.
44.00   64.00   75.00   69.76   78.00   82.00
summary(world95_1$lifeexpm) #summary of male life expectancy
   Min. 1st Qu. Median    Mean 3rd Qu.    Max.
41.00   59.00   67.00   64.41   73.00   76.00

attach(world95_1) # after attach, variables can be used without typing world95_1$
summary(lifeexpf)
   Min. 1st Qu. Median    Mean 3rd Qu.    Max.
44.00   64.00   75.00   69.76   78.00   82.00

plot(lifeexpf,lifeexpm)

detach(world95_1) #stop using world95_1 without the prefix
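
As a side note, `with()` gives the same shorthand as attach/detach for a single expression, without having to remember the `detach()` call. A minimal sketch with a made-up data frame (`df` and its values are illustrative, not the class data):

```r
# Made-up stand-in for world95_1 with only the two life expectancy columns
df <- data.frame(lifeexpf = c(44, 75, 80), lifeexpm = c(45, 68, 74))

# Columns are visible by name only inside the with() call
print(with(df, summary(lifeexpf)))

# Female minus male life expectancy, row by row
gap <- with(df, lifeexpf - lifeexpm)
print(gap)   # -1 7 6
```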

world95<-rbind(world95_1,world95_2) #rbind stacks world95_1 and world95_2 by rows

str(world95) ## structure of the combined object: 109 observations, 26 variables
'data.frame':   109 obs. of 26 variables:
$ country : Factor w/ 109 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 10 ...
$ populatn: num 20500 33900 3700 17800 8000 7400 600 125000 256 10300 ...
$ density : num 25 12 126 2.3 94 86 828 800 605 50 ...
$ urban   : Factor w/ 38 levels ".","12","15",..: 5 35 26 34 21 19 33 4 14 25 ...
$ religion: Factor w/ 11 levels "Animist","Buddhist",..: 4 3 5 6 3 4 4 4 6 5 ...
$ lifeexpf: num 44 75 75 80 79 75 74 53 78 76 ...
$ lifeexpm: num 45 68 68 74 73 67 71 53 73 66 ...
$ literacy: Factor w/ 46 levels ".","100","18",..: 6 25 27 2 28 27 16 7 28 28 ...
$ pop_incr: num 2.8 1.3 1.4 1.38 0.2 1.4 2.4 2.4 0.21 0.32 ...
$ babymort: num 168 25.6 27 7.3 6.7 35 25 106 20.3 19 ...
$ gdp_cap : num 205 3408 5000 16848 18396 ...
$ region : num 3 6 5 1 1 5 5 3 6 2 ...
$ calories: Factor w/ 75 levels ".","1667","1916",..: 1 23 1 24 30 1 1 6 1 1 ...
$ aids    : Factor w/ 94 levels ".","0","1","10",..: 2 27 14 32 6 1 8 3 28 4 ...
$ birth_rt: num 53 20 23 15 12 23 29 35 16 13 ...
$ death_rt: chr "22" "9" "6" "8" ...
$ aids_rt : Factor w/ 105 levels ".","0","0.000857632933104631",..: 2 18 6 32 24 1 27 3 23 7 ...
$ log_gdp : num 2.31 3.53 3.7 4.23 4.26 ...
$ lg_aidsr: Factor w/ 105 levels ".","0","0.24359046806108",..: 2 25 6 34 28 1 16 3 47 7 ...
$ b_to_d : chr "2.40909090909091" "2.22222222222222" "3.83333333333333" "1.875" ...
$ fertilty: Factor w/ 85 levels ".","1.4","1.47",..: 38 18 21 11 4 18 24 27 7 10 ...
$ log_pop : num 4.31 4.53 3.57 4.25 3.9 ...
$ cropgrow: Factor w/ 40 levels ".","1","10","12",..: 4 30 8 24 8 9 10 26 28 16 ...
$ lit_male: Factor w/ 45 levels ".","100","28",..: 7 25 2 2 1 2 10 8 27 2 ...
$ lit_fema: Factor w/ 52 levels ".","100","14",..: 3 23 2 2 1 2 12 6 25 2 ...
$ climate : Factor w/ 10 levels ".","1","3","4",..: 3 8 1 3 8 3 3 5 5 8 ...
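
In the output above, `death_rt` and `b_to_d` changed from num to chr after the `rbind`. A likely cause, though this is only a guess from the output, is that those columns were read as text in the second file. The toy example below (made-up data frames `a` and `b`) reproduces the effect and shows one way to avoid it:

```r
# a: death_rt is numeric; b: death_rt was read as text, with "." for missing
a <- data.frame(country = c("A", "B"), death_rt = c(22, 9), stringsAsFactors = FALSE)
b <- data.frame(country = c("C", "D"), death_rt = c("8", "."), stringsAsFactors = FALSE)

# Stacking numeric on top of character turns the whole column into character
ab <- rbind(a, b)
print(class(ab$death_rt))    # "character"

# Convert the text column first, then combine
b$death_rt[b$death_rt == "."] <- NA
b$death_rt <- as.numeric(b$death_rt)
ab2 <- rbind(a, b)
print(class(ab2$death_rt))   # "numeric"
```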

write.xlsx(world95,"D:/R/world95_R.xlsx") ## save the combined data

write.xlsx(world95,"D:/R/world95_R2.xlsx", row.names=FALSE) ## save without the row names

##Chapter 3 p.70 Vectors
x<-c(1,2,3,4)

x
[1] 1 2 3 4
x<-c(x,5)
x
[1] 1 2 3 4 5
y<-c(6,7,8)
x<-c(x,y)
x
[1] 1 2 3 4 5 6 7 8

seq(0,5,by=0.5) # from 0 to 5 in steps of 0.5
[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

seq(0,5,length=3)
[1] 0.0 2.5 5.0
rep(1, times=5)
[1] 1 1 1 1 1
rep(c(1,3,5),2)
[1] 1 3 5 1 3 5
seq(0,5,length=5)
[1] 0.00 1.25 2.50 3.75 5.00
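
A few more variations on `seq` and `rep`, as a small sketch. `length` in the calls above is a shortened form of the argument `length.out`; the variable names `a`, `b`, `d` are only for illustration.

```r
# Same as seq(0, 5, length = 3), with the argument name written out in full
a <- seq(0, 5, length.out = 3)
print(a)   # 0.0 2.5 5.0

# each = repeats every element in place instead of repeating the whole vector
b <- rep(c(1, 3, 5), each = 2)
print(b)   # 1 1 3 3 5 5

# times = can also be a vector: one repeat count per element
d <- rep(c(1, 3, 5), times = c(1, 2, 3))
print(d)   # 1 3 3 5 5 5
```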

x<-c(7,8,9,10)
y<-c(1,2,3,4)

x+y
[1] 8 10 12 14

x-y
[1] 6 6 6 6

x*y
[1] 7 16 27 40

x/y
[1] 7.0 4.0 3.0 2.5

x^y
[1] 7 64 729 10000
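
The operations above work element by element on vectors of the same length. When the lengths differ, R reuses (recycles) the shorter vector. A short sketch, with `doubled` and `recycled` as illustrative names:

```r
x <- c(7, 8, 9, 10)

# A single number is reused for every element
doubled <- x * 2
print(doubled)    # 14 16 18 20

# A length-2 vector is reused as 1, 2, 1, 2
recycled <- x + c(1, 2)
print(recycled)   # 8 10 10 12
```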

x<-c(1,2,3,3.75,5.9,0.00347,10,2.3025)
x
[1] 1.00000 2.00000 3.00000 3.75000 5.90000 0.00347 10.00000 2.30250
abs(x)
[1] 1.00000 2.00000 3.00000 3.75000 5.90000 0.00347 10.00000 2.30250
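
For practice with the math and statistics functions, here are a few more base R functions applied to the same `x`. Which functions the textbook actually lists on p.74-75 is an assumption; these are just common ones.

```r
x <- c(1, 2, 3, 3.75, 5.9, 0.00347, 10, 2.3025)

print(floor(x))      # round down: 1 2 3 3 5 0 10 2
print(ceiling(x))    # round up:   1 2 3 4 6 1 10 3
print(round(x, 2))   # round to 2 decimal places
print(sqrt(x))       # square root of each element

# Functions that reduce the whole vector to one number
print(sum(x))
print(mean(x))
print(max(x))        # 10
```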

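substr("Statistics",1,4) #take characters 1 through 4
[1] "Stat"

y<-c("정보통계학과", "응용통계학과") #"Department of Information Statistics", "Department of Applied Statistics"
substr(y,3,6) #take characters 3 through 6 of each string
[1] "통계학과" "통계학과"
substr(y,c(1,3),c(2,6)) #first string: characters 1-2; second string: characters 3-6
[1] "정보" "통계학과"

cities<-c("New York, NY", "Ann Arbor, MI", "Chicago, IL")
states<-substr(cities, nchar(cities)-1,nchar(cities));states #take only the last two characters
[1] "NY" "MI" "IL"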
[1] "NY" "MI" "IL"

city<-strsplit(cities,split=",") #split each string at the comma
city
[[1]]
[1] "New York" " NY"

[[2]]
[1] "Ann Arbor" " MI"

[[3]]
[1] "Chicago" " IL"
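
Note that the second piece of each element keeps the space that followed the comma (" NY"). A possible follow-up, sketched here with the illustrative names `city_names` and `state_codes`, pulls the pieces out of the list and trims that space:

```r
cities <- c("New York, NY", "Ann Arbor, MI", "Chicago, IL")
city <- strsplit(cities, split = ",")

# Take the first piece of every list element
city_names <- vapply(city, function(p) p[1], character(1))
print(city_names)    # "New York" "Ann Arbor" "Chicago"

# Take the second piece and remove the leading space with trimws()
state_codes <- trimws(vapply(city, function(p) p[2], character(1)))
print(state_codes)   # "NY" "MI" "IL"
```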
