getSetSizesByDegree.Rd
Count the number of items in each set by degree of overlap. The degree of overlap of an item is the number of sets to which it belongs.
getSetSizesByDegree(df, setNames, idName, maxDegree = 4)
Argument | Description |
---|---|
df | A data frame indicating set membership |
setNames | A character vector of set names |
idName | A string specifying the name of the ID variable for each item |
maxDegree | A number giving the upper limit on degree; items with a higher degree are counted at maxDegree |
A data frame with variables:
Variable | Description |
---|---|
set | indicating set |
degree | indicating degree of overlap (maximum of maxDegree) |
degreeLabel | a factor labeling the degree variable |
N | number of items |
The input data frame should contain a row for each item and a binary variable
for each set indicating the membership of each item. The setNames
input should correspond to the binary indicator columns in the data frame.
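To illustrate the expected input layout, the sketch below builds a small toy data frame and counts items per set and degree with base R. The data, the column names (itemId, A, B, C) and the helper countByDegree are hypothetical and not part of the package. This is not the implementation of getSetSizesByDegree, only a way to see what the degree of overlap means, assuming membership is coded as 0/1.

```r
# Toy input: one row per item, one 0/1 indicator column per set (hypothetical data)
toy <- data.frame(
  itemId = 1:6,
  A = c(1, 1, 1, 0, 1, 0),
  B = c(0, 1, 1, 1, 1, 0),
  C = c(0, 0, 1, 1, 1, 1)
)
toySets <- c("A", "B", "C")

# Hypothetical helper: count the items in each set by (capped) degree of overlap
countByDegree <- function(df, setNames, maxDegree = 4) {
  # Degree = number of sets an item belongs to, capped at maxDegree
  degree <- pmin(rowSums(df[, setNames, drop = FALSE]), maxDegree)
  counts <- lapply(setNames, function(s) {
    member <- df[[s]] == 1          # items that belong to set s
    tab <- table(degree[member])    # tally those items by degree
    data.frame(
      set = rep(s, length(tab)),
      degree = as.numeric(names(tab)),
      N = as.vector(tab)
    )
  })
  do.call(rbind, counts)
}

countByDegree(toy, toySets)
countByDegree(toy, toySets, maxDegree = 2)
```

In the toy data, items 3 and 5 belong to all three sets, so they are counted at degree 3 in each set by default and at degree 2 when maxDegree = 2.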
# Define set names
data("movieSets")
setNames <- colnames(movieSets[,-c(1:8)])

# Calculate set sizes
getSetSizesByDegree(movieSets, setNames, "movieId")
#> # A tibble: 76 x 4
#>    set       degree     N   prop
#>    <fct>      <dbl> <dbl>  <dbl>
#>  1 Action         1   178 0.0506
#>  2 Action         2   862 0.245
#>  3 Action         3  1402 0.398
#>  4 Action         4  1078 0.306
#>  5 Adventure      1    80 0.0343
#>  6 Adventure      2   469 0.201
#>  7 Adventure      3   889 0.382
#>  8 Adventure      4   891 0.383
#>  9 Animation      1    83 0.0808
#> 10 Animation      2   231 0.225
#> # ... with 66 more rows

# Calculate set sizes with max degree 3
getSetSizesByDegree(movieSets, setNames, "movieId", maxDegree = 3)
#> # A tibble: 57 x 4
#>    set       degree     N   prop
#>    <fct>      <dbl> <dbl>  <dbl>
#>  1 Action         1   178 0.0506
#>  2 Action         2   862 0.245
#>  3 Action         3  2480 0.705
#>  4 Adventure      1    80 0.0343
#>  5 Adventure      2   469 0.201
#>  6 Adventure      3  1780 0.764
#>  7 Animation      1    83 0.0808
#>  8 Animation      2   231 0.225
#>  9 Animation      3   713 0.694
#> 10 Children       1    24 0.0211
#> # ... with 47 more rows
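The effect of maxDegree can be checked against the printed output: with maxDegree = 3, the degree-3 row for a set pools the degree-3 and degree-4 counts from the default call. The sketch below uses only numbers copied from the Action rows of that output; the variable names are illustrative. The prop column is not described in the value list, but in the output it appears to equal N divided by the total for the set.

```r
# Counts copied from the example output for the Action set
actionDefault <- c(`1` = 178, `2` = 862, `3` = 1402, `4` = 1078)  # maxDegree = 4
actionCapped  <- c(`1` = 178, `2` = 862, `3` = 2480)              # maxDegree = 3

# Degrees 3 and 4 are pooled into degree 3 when maxDegree = 3
pooled <- sum(actionDefault[c("3", "4")])
pooled == actionCapped[["3"]]

# Each count divided by the set total, rounded to 3 significant digits,
# for comparison with the printed prop column
signif(actionDefault / sum(actionDefault), 3)
```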