Common Data Frame Operations

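This post collects common operations on an R data.frame: creating one, inspecting it, querying rows, taking subsets with subset(), and adding columns.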

1. Creating the data

> name <- c("Layton","Ben","Paul","Peter","Haley","Jump")
> age <- c(25,18,45,60,24,35)
> sex <- c("M","M","M","M","F","M")
> weight <- c(120,150,125,140,115,110)
> data <- data.frame(name,age,sex,weight)
> data
    name age sex weight
1 Layton  25   M    120
2    Ben  18   M    150
3   Paul  45   M    125
4  Peter  60   M    140
5  Haley  24   F    115
6   Jump  35   M    110

2. Checking the type

> class(data)
[1] "data.frame"

> str(data)
'data.frame': 6 obs. of 4 variables:
$ name : Factor w/ 6 levels "Ben","Haley",..: 4 1 5 6 2 3
$ age : num 25 18 45 60 24 35
$ sex : Factor w/ 2 levels "F","M": 2 2 2 2 1 2
$ weight: num 120 150 125 140 115 110
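
The Factor columns in the str() output above come from R versions before 4.0.0, where data.frame() converted strings to factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the same code reports name and sex as chr. A minimal sketch that makes the choice explicit (the variable names as_factor and as_char are only illustrative):

```r
name <- c("Layton", "Ben", "Paul", "Peter", "Haley", "Jump")
sex  <- c("M", "M", "M", "M", "F", "M")

# Explicitly request factors (the default before R 4.0.0)
as_factor <- data.frame(name, sex, stringsAsFactors = TRUE)

# Explicitly keep strings as character vectors (the default since R 4.0.0)
as_char <- data.frame(name, sex, stringsAsFactors = FALSE)

print(class(as_factor$name))
print(class(as_char$name))
```

Passing stringsAsFactors explicitly keeps the result the same regardless of the R version in use.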

3. Number of rows and columns

> nrow(data)
[1] 6
> ncol(data)
[1] 4

4. Querying

> data[which(data$sex=="M"),]
    name age sex weight
1 Layton  25   M    120
2    Ben  18   M    150
3   Paul  45   M    125
4  Peter  60   M    140
6   Jump  35   M    110

> data[which(data$sex=="M"),"age"]
[1] 25 18 45 60 35

> data[which(data$sex=="M"),c("age","weight")]
  age weight
1  25    120
2  18    150
3  45    125
4  60    140
6  35    110
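
The queries above wrap the condition in which(). With complete data this gives the same result as plain logical indexing, but the two differ when the condition contains NA. A small sketch of that difference, using a hypothetical data frame df with a missing sex value (not part of the original example):

```r
# Hypothetical data with one missing value in sex
df <- data.frame(
  name = c("Layton", "Haley", "Jump"),
  sex  = c("M", NA, "M")
)

# Logical indexing: an NA in the condition produces a row filled with NA
by_logical <- df[df$sex == "M", ]

# which() returns only the positions where the condition is TRUE
by_which <- df[which(df$sex == "M"), ]

print(nrow(by_logical))
print(nrow(by_which))
```

This is one reason to prefer which() (or subset()) when a column may contain missing values.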

5. Subsets

> subset(data,age<30)
    name age sex weight
1 Layton  25   M    120
2    Ben  18   M    150
5  Haley  24   F    115

> subset(data,age<30&sex=="F")
   name age sex weight
5 Haley  24   F    115

> subset(data,age<30,select=name)
    name
1 Layton
2    Ben
5  Haley

> # This call selects the height column, which is added in section 7 below; run that step first.
> subset(data,age>20&sex=="M"&weight>120,select=c("name","height"))
   name height
3  Paul   1.60
4 Peter   1.72
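
The calls above follow the general form subset(x, subset, select): x is the data frame, subset is a logical condition on the rows, and select names the columns to keep. The sketch below rebuilds the example data so it runs on its own and shows a few select variants; the result names (young, young_idx, no_weight, name_to_sex) are illustrative only.

```r
# Rebuild the example data so this block is self-contained
data <- data.frame(
  name   = c("Layton", "Ben", "Paul", "Peter", "Haley", "Jump"),
  age    = c(25, 18, 45, 60, 24, 35),
  sex    = c("M", "M", "M", "M", "F", "M"),
  weight = c(120, 150, 125, 140, 115, 110)
)

# General form: subset(x, subset = <row condition>, select = <columns>)
young <- subset(data, subset = age < 30, select = c(name, age))

# The same selection written with bracket indexing
young_idx <- data[data$age < 30, c("name", "age")]

# select also accepts a negative selection or a column range
no_weight   <- subset(data, sex == "M", select = -weight)
name_to_sex <- subset(data, sex == "M", select = name:sex)

print(young)
print(names(no_weight))
print(names(name_to_sex))
```

Inside subset(), column names can be written without quotes or the data$ prefix, which keeps interactive queries short.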

6. Printing rows and columns

> print(name)
[1] "Layton" "Ben" "Paul" "Peter" "Haley" "Jump"
> data[2,]
  name age sex weight
2  Ben  18   M    150
> data[,2]
[1] 25 18 45 60 24 35

7. Adding a column

> data$height <- c(1.7,1.65,1.6,1.72,1.74,1.66)
> data
    name age sex weight height
1 Layton  25   M    120   1.70
2    Ben  18   M    150   1.65
3   Paul  45   M    125   1.60
4  Peter  60   M    140   1.72
5  Haley  24   F    115   1.74
6   Jump  35   M    110   1.66
  • Author: 括囊无誉
  • Link: Linux/dataframe/
  • Copyright: All articles on this blog are original works; please credit the source when reposting.