count of entries in data frame in R

I'm looking to get a count for the following data frame:

> Santa
   Believe Age Gender Presents Behaviour
1    FALSE   9   male       25   naughty
2     TRUE   5   male       20      nice
3     TRUE   4 female       30      nice
4     TRUE   4   male       34   naughty

of the number of children who believe. What command would I use to get this?

(The actual data frame is much bigger. I've just given you the first four rows...)



You could use table:

R> x <- read.table(textConnection("
   Believe Age Gender Presents Behaviour
1    FALSE   9   male       25   naughty
2     TRUE   5   male       20      nice
3     TRUE   4 female       30      nice
4     TRUE   4   male       34   naughty
"), header=TRUE)

R> table(x$Believe)

FALSE  TRUE 
    1     3 
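
To pull a single count out of that table, it can be indexed by level name. A minimal sketch, assuming the four sample rows are rebuilt with data.frame (the names counts and n_believers are only illustrative):

```r
# Rebuild the four sample rows from the question
x <- data.frame(
  Believe   = c(FALSE, TRUE, TRUE, TRUE),
  Age       = c(9, 5, 4, 4),
  Gender    = c("male", "male", "female", "male"),
  Presents  = c(25, 20, 30, 34),
  Behaviour = c("naughty", "nice", "nice", "naughty")
)

counts <- table(x$Believe)       # counts for every value of Believe
n_believers <- counts[["TRUE"]]  # keep only the count for TRUE
n_believers
```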

I think of this as a two-step process:

  1. subset the original data frame according to the filter supplied (Believe==FALSE); then

  2. get the row count of this subset

For the first step, the subset function is a good way to do this (just an alternative to ordinary index or bracket notation).

For the second step, I would use dim or nrow.

One advantage of using subset: you don't have to parse the result it returns to get the value you need; just call nrow on it directly.

so in your case:

v = nrow(subset(Santa, Believe==FALSE))     #  subset  returns a data.frame

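or wrapped in a small helper function (the column name is passed as a string, because a bare column name is not found when subset is called from inside a function):

> fnx = function(fac, lev){nrow(Santa[which(Santa[[fac]] == lev), ])}

> fnx("Believe", TRUE)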

Aside from nrow, dim will also do the job. This function returns the dimensions of a data frame (rows, cols) so you just need to supply the appropriate index to access the number of rows:

v = dim(subset(Santa, Believe==FALSE))[1] 
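
Applied to the question as asked (children who do believe), the two steps might look like the sketch below. It assumes the four sample rows stand in for the full Santa data frame; believers, n_nrow and n_dim are illustrative names.

```r
# The four sample rows from the question stand in for the full data frame
Santa <- data.frame(
  Believe   = c(FALSE, TRUE, TRUE, TRUE),
  Age       = c(9, 5, 4, 4),
  Gender    = c("male", "male", "female", "male"),
  Presents  = c(25, 20, 30, 34),
  Behaviour = c("naughty", "nice", "nice", "naughty")
)

# Step 1: keep only the rows where Believe is TRUE
believers <- subset(Santa, Believe == TRUE)

# Step 2: count the rows, once with nrow and once with dim
n_nrow <- nrow(believers)
n_dim  <- dim(believers)[1]
```

Both n_nrow and n_dim hold the same row count, so the choice between them is a matter of taste.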

An earlier answer to this question uses a contingency table. I don't like that approach for the general problem as stated in the question, and here's the reason. Granted, the general problem "how many rows in this data frame have value x in column C?" can be answered with a contingency table as well as with a "filtering" scheme (as in my answer here). If you want row counts for all values of a given factor variable (column), then a contingency table (calling table and passing in the column(s) of interest) is the most sensible solution; however, the question asks for the count of one particular value in a factor variable, not counts across all values. Tabulating every value carries a performance hit (it might be big, it might be trivial; that depends on the size of the data frame and the processing pipeline in which this function resides). And of course, once the call to table returns, you still have to extract from that result just the count that you want.

So that's why, to me, this is a filtering rather than a cross-tab problem.


You can do summary(Santa$Believe) and you will get the count for TRUE and FALSE.

dplyr makes this really easy.
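
A minimal sketch of what this could look like, assuming the dplyr package is installed and the four sample rows stand in for Santa (believer_count is an illustrative name, and this is one plausible pipeline rather than necessarily the one originally posted):

```r
library(dplyr)

# The four sample rows from the question
Santa <- data.frame(
  Believe   = c(FALSE, TRUE, TRUE, TRUE),
  Age       = c(9, 5, 4, 4),
  Gender    = c("male", "male", "female", "male"),
  Presents  = c(25, 20, 30, 34),
  Behaviour = c("naughty", "nice", "nice", "naughty")
)

# Keep the believers, then count the remaining rows
believer_count <- Santa %>%
  filter(Believe) %>%
  summarise(n = n())

believer_count
```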


If you wanted to count by a group, for instance how many males vs. females believe, just add a group_by:
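
A sketch of the grouped version, under the same assumptions (dplyr installed, the four sample rows standing in for Santa; by_gender is an illustrative name):

```r
library(dplyr)

# The four sample rows from the question
Santa <- data.frame(
  Believe   = c(FALSE, TRUE, TRUE, TRUE),
  Age       = c(9, 5, 4, 4),
  Gender    = c("male", "male", "female", "male"),
  Presents  = c(25, 20, 30, 34),
  Behaviour = c("naughty", "nice", "nice", "naughty")
)

# Keep the believers, then count them separately for each Gender
by_gender <- Santa %>%
  filter(Believe) %>%
  group_by(Gender) %>%
  summarise(n = n())

by_gender
```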


A one-line solution with data.table could be
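
One such call, sketched under the assumption that the data.table package is installed and the sample rows are held in a data.table (counts is an illustrative name); grouping with .N by Believe gives a result of the shape shown next:

```r
library(data.table)

# The four sample rows from the question, stored as a data.table
Santa <- data.table(
  Believe   = c(FALSE, TRUE, TRUE, TRUE),
  Age       = c(9, 5, 4, 4),
  Gender    = c("male", "male", "female", "male"),
  Presents  = c(25, 20, 30, 34),
  Behaviour = c("naughty", "nice", "nice", "naughty")
)

# .N is the number of rows in each group defined by `by`
counts <- Santa[, .N, by = Believe]
counts
```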

   Believe N
1:   FALSE 1
2:    TRUE 3

Using sqldf also fits here:

library(sqldf)
sqldf("SELECT Believe, COUNT(1) AS N FROM Santa
       GROUP BY Believe")

