Tables • cancR

Introduction

Tables are essential for presenting the study population and the distribution of important covariates. The tablR function in the cancR package is useful for this task.

Tables with only one group

We use redcap_df as an example. Let us create a simple table of the whole population that presents the covariates age, sex and tumor type.

tablR(redcap_df,
      vars = c(age, sex, type))
#>                        Overall (N=500)
#> 1 Age                                 
#> 2    Median (Q1, Q3) 49.7 (34.1, 64.5)
#> 3    Range              10.8 - 88.8   
#> 4 Sex                                 
#> 5    Median (Q1, Q3)  1.0 (1.0, 2.0)  
#> 6    Range               1.0 - 2.0    
#> 7 Type                                
#> 8    Median (Q1, Q3)  1.0 (1.0, 2.0)  
#> 9    Range               0.0 - 2.0

We can see immediately that sex and type are incorrectly formatted: both are stored as numeric codes, so tablR() summarizes them with a median and a range, whereas we want counts and percentages. We convert the variables to factors using factR() and pipe the result into tablR(). Notice that when we pipe, we do not need to specify the data frame in tablR().

redcap_df %>% 
  factR(c(sex, type)) %>% 
  tablR(vars = c(age, sex, type))
#>                         Overall (N=500)
#> 1  Age                                 
#> 2     Median (Q1, Q3) 49.7 (34.1, 64.5)
#> 3     Range              10.8 - 88.8   
#> 4  Sex                                 
#> 5     1                   256 (51%)    
#> 6     2                   244 (49%)    
#> 7  Type                                
#> 8     1                   254 (51%)    
#> 9     2                   133 (27%)    
#> 10    0                   113 (23%)
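
Why the conversion matters can be illustrated with base R alone. The sketch below uses a small hypothetical vector (not redcap_df) and the base functions factor(), table() and prop.table(); it is not how factR() or tablR() are implemented, only an illustration of the difference between a numeric code and a categorical variable.

```r
# Hypothetical toy data: a numerically coded variable, as in a raw export
sex <- c(1, 2, 1, 1, 2)

# Treated as numeric, only numeric summaries such as the median are available
num_median <- median(sex)

# Converted to a factor, the codes become categories that can be counted
sex_f <- factor(sex)
counts <- table(sex_f)
percent <- round(100 * prop.table(counts))

print(num_median)
print(counts)
print(percent)
```

Once the variable is a factor, counts and percentages are the natural summary, which is what the second table above shows for sex and type.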

Tables with multiple groups

If we want to compare multiple groups, we use the group argument. Let us compare tumor types.

redcap_df %>% 
  factR(c(sex, type)) %>% 
  tablR(group = type,
        vars = c(age, sex))
#>                              1 (N=254)         2 (N=133)         0 (N=113)
#> 1 Age                                                                     
#> 2    Median (Q1, Q3) 49.6 (34.9, 66.0) 50.4 (34.8, 63.4) 48.3 (31.8, 64.2)
#> 3    Range              10.8 - 88.8       11.2 - 86.3       12.0 - 84.6   
#> 4 Sex                                                                     
#> 5    1                   131 (52%)         59 (44%)          66 (58%)     
#> 6    2                   123 (48%)         74 (56%)          47 (42%)
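
In the grouped table the percentages are calculated within each group, so every column sums to 100% (for example 131 of the 254 patients in group 1 gives 52%). A minimal base R sketch of the same calculation, using a hypothetical toy data frame rather than redcap_df, could look like this:

```r
# Hypothetical toy data with a grouping variable and a categorical covariate
toy <- data.frame(
  type = factor(c("0", "0", "1", "1", "1", "2")),
  sex  = factor(c("1", "2", "1", "1", "2", "2"))
)

# Cross-tabulate the covariate (rows) against the group (columns)
tab <- table(sex = toy$sex, type = toy$type)

# margin = 2 gives proportions within each column, i.e. within each group
col_pct <- round(100 * prop.table(tab, margin = 2))

print(tab)
print(col_pct)
```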

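We can also add a total column and test for differences in the distributions between groups. Here we also use the levels argument of factR() so that the tumor types appear in the order 0, 1, 2.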

redcap_df %>% 
  factR(c(sex, type),
        levels = list("type" = c("0", "1", "2"))) %>% 
  tablR(group = type,
        vars = c(age, sex),
        test = TRUE,
        total = TRUE)
#>                              0 (N=113)         1 (N=254)         2 (N=133)
#> 1 Age                                                                     
#> 2    Median (Q1, Q3) 48.3 (31.8, 64.2) 49.6 (34.9, 66.0) 50.4 (34.8, 63.4)
#> 3    Range              12.0 - 84.6       10.8 - 88.8       11.2 - 86.3   
#> 4 Sex                                                                     
#> 5    1                   66 (58%)          131 (52%)         59 (44%)     
#> 6    2                   47 (42%)          123 (48%)         74 (56%)     
#>       Total (N=500)  P-value
#> 1                   p = 0.94
#> 2 49.7 (34.1, 64.5)         
#> 3    10.8 - 88.8            
#> 4                   p = 0.09
#> 5     256 (51%)             
#> 6     244 (49%)
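
This page does not state which statistical tests lie behind the P-value column, so the tablR() documentation should be consulted for that. Purely as an illustration of the kind of comparison involved, the sketch below runs two common choices in base R on simulated (hypothetical) data: a Kruskal-Wallis test for a continuous variable and a chi-squared test for a categorical one. These are assumptions made for the example, not a description of what tablR() does.

```r
# Simulated toy data; the values are hypothetical and unrelated to redcap_df
set.seed(1)
toy <- data.frame(
  type = factor(sample(c("0", "1", "2"), 150, replace = TRUE)),
  age  = runif(150, min = 10, max = 90),
  sex  = factor(sample(c("1", "2"), 150, replace = TRUE))
)

# Continuous variable across three groups: Kruskal-Wallis rank sum test
p_age <- kruskal.test(age ~ type, data = toy)$p.value

# Categorical variable across groups: Pearson's chi-squared test
p_sex <- chisq.test(table(toy$sex, toy$type))$p.value

print(round(c(age = p_age, sex = p_sex), 2))
```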

Customizing tables

Most of the content of the table can be customized in tablR().

Changing labels

It is possible to rename three types of labels:
- Group names: labs.groups
- Variable names: labs.headings
- Variable labels/levels: labs.subheadings

redcap_df %>%
  factR(c(type, sex)) %>%
  tablR(
    group = type,
    vars = c(age, sex),
    labs.groups = list("benign" = "0",
                       "in situ" = "1",
                       "malignant" = "2"),
    labs.headings = list("Age at Debut" = "age"),
    labs.subheadings = list("sex" = list("Female" = "2",
                                         "Male" = "1")))
#>                         benign (N=113)   in situ (N=254) malignant (N=133)
#> 1 Age at Debut                                                            
#> 2    Median (Q1, Q3) 48.3 (31.8, 64.2) 49.6 (34.9, 66.0) 50.4 (34.8, 63.4)
#> 3    Range              12.0 - 84.6       10.8 - 88.8       11.2 - 86.3   
#> 4 Sex                                                                     
#> 5    Female              47 (42%)          123 (48%)         74 (56%)     
#> 6    Male                66 (58%)          131 (52%)         59 (44%)
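
The mapping from codes to display labels is the same idea as relabelling factor levels in base R. The sketch below uses a hypothetical vector and the labels argument of factor(); it only illustrates the concept and makes no claim about how labs.groups or labs.subheadings work internally.

```r
# Hypothetical codes as they might appear in a raw data export
type <- c("0", "1", "2", "1")

# levels gives the codes, labels gives the text shown for each code
type_labelled <- factor(type,
                        levels = c("0", "1", "2"),
                        labels = c("benign", "in situ", "malignant"))

print(levels(type_labelled))
print(table(type_labelled))
```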

Changing orders

The order of the groups and variable levels can be specified with the reference and levels arguments.

redcap_df %>%
  factR(c(type, sex, localisation)) %>%
  tablR(
    group = type,
    vars = c(age, sex, localisation),
    labs.groups = list("benign" = "0",
                       "in situ" = "1",
                       "malignant" = "2"),
    labs.headings = list("Age at Debut" = "age"),
    labs.subheadings = list("sex" = list("Female" = "2",
                                         "Male" = "1"),
                            "localisation" = list("Neck" = "0",
                                                  "Head" = "1",
                                                  "Trunk" = "2",
                                                  "Upper Extremity" = "3",
                                                  "Lower Extremity" = "4",
                                                  "Unspecified" = "5")),
    reference = list("sex" = c("Female")))
#>                          benign (N=113)   in situ (N=254) malignant (N=133)
#> 1  Age at Debut                                                            
#> 2     Median (Q1, Q3) 48.3 (31.8, 64.2) 49.6 (34.9, 66.0) 50.4 (34.8, 63.4)
#> 3     Range              12.0 - 84.6       10.8 - 88.8       11.2 - 86.3   
#> 4  Sex                                                                     
#> 5     Female              47 (42%)          123 (48%)         74 (56%)     
#> 6     Male                66 (58%)          131 (52%)         59 (44%)     
#> 7  Localisation                                                            
#> 8     Neck                6 (5.3%)          8 (3.1%)          1 (0.8%)     
#> 9     Head                9 (8.0%)          31 (12%)          24 (18%)     
#> 10    Trunk               47 (42%)          74 (29%)          46 (35%)     
#> 11    Upper Extremity     30 (26%)          94 (37%)          35 (26%)     
#> 12    Lower Extremity     20 (18%)          45 (18%)          23 (17%)     
#> 13    Unspecified         1 (0.9%)          2 (0.8%)          4 (3.0%)
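
The order in which levels are displayed follows the order of the factor levels. As a rough base R analogue (using hypothetical vectors, and not a description of the reference argument's implementation), relevel() moves one level to the front, while the levels argument of factor() sets the complete order:

```r
# Hypothetical data; by default factor levels are sorted alphabetically
sex <- factor(c("Male", "Female", "Female", "Male", "Male"))
print(levels(sex))

# relevel() moves the chosen reference level to the first position
sex_ref <- relevel(sex, ref = "Male")
print(levels(sex_ref))

# The full display order can be set explicitly with the levels argument
loc <- factor(c("Trunk", "Head", "Neck"),
              levels = c("Neck", "Head", "Trunk"))
print(levels(loc))
```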

Changing summary measures

The default summary measures for continuous variables are the median with the first and third quartiles (Q1, Q3) and the range. Other measures can be requested with the numeric argument, here the mean and standard deviation.

redcap_df %>%
  factR(c(type, sex, localisation)) %>%
  tablR(
    group = type,
    vars = c(age, sex, localisation),
    labs.groups = list("benign" = "0",
                       "in situ" = "1",
                       "malignant" = "2"),
    labs.headings = list("Age at Debut" = "age"),
    labs.subheadings = list("sex" = list("Female" = "2",
                                         "Male" = "1"),
                            "localisation" = list("Neck" = "0",
                                                  "Head" = "1",
                                                  "Trunk" = "2",
                                                  "Upper Extremity" = "3",
                                                  "Lower Extremity" = "4",
                                                  "Unspecified" = "5")),
    reference = list("sex" = c("Female")),
    numeric = c("mean", "sd"))
#>                       benign (N=113) in situ (N=254) malignant (N=133)
#> 1  Age at Debut                                                       
#> 2     Mean                    49.0           49.9               49.7  
#> 3     SD                      19.2           19.1               18.5  
#> 4  Sex                                                                
#> 5     Female                47 (42%)       123 (48%)          74 (56%)
#> 6     Male                  66 (58%)       131 (52%)          59 (44%)
#> 7  Localisation                                                       
#> 8     Neck                  6 (5.3%)       8 (3.1%)           1 (0.8%)
#> 9     Head                  9 (8.0%)       31 (12%)           24 (18%)
#> 10    Trunk                 47 (42%)       74 (29%)           46 (35%)
#> 11    Upper Extremity       30 (26%)       94 (37%)           35 (26%)
#> 12    Lower Extremity       20 (18%)       45 (18%)           23 (17%)
#> 13    Unspecified           1 (0.9%)       2 (0.8%)           4 (3.0%)
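
For reference, the summary measures mentioned above correspond to standard base R functions. The sketch below computes them for a small hypothetical vector of ages; the quartiles use the default method of quantile(), which may differ from the method tablR() applies.

```r
# Hypothetical ages, unrelated to redcap_df
age <- c(23.5, 41.0, 49.7, 58.2, 77.1)

# Measures shown by default in the tables above
age_median <- median(age)
age_q1q3   <- quantile(age, probs = c(0.25, 0.75), names = FALSE)
age_range  <- range(age)

# Alternative measures requested with numeric = c("mean", "sd")
age_mean <- mean(age)
age_sd   <- sd(age)

print(c(median = age_median, q1 = age_q1q3[1], q3 = age_q1q3[2]))
print(c(min = age_range[1], max = age_range[2]))
print(round(c(mean = age_mean, sd = age_sd), 1))
```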

Customizing tables

All levels and labels can be set manually. Furthermore, the table can be exported as a flextable object for a nicer layout. In this example we also collapse variables of a similar kind into shared headings with the simplify argument. This is useful when many variables share a yes/no, 1/0 or positive/negative coding.

# redcap_df %>%
#   mutate(margins = sample(c("0", "1"), nrow(redcap_df), replace = TRUE)) %>%
#   factR(c(type, sex, localisation, cd10, sox10, ck, margins, necrosis)) %>%
#   tablR(group = type,
#         vars = c(age, sex, localisation, cd10, sox10, ck, necrosis, margins),
#         labs.groups = list("benign" = "0",
#                            "in situ" = "1",
#                            "malignant" = "2"),
#         reverse = TRUE,
#         labs.headings = list("Age at Debut" = "age",
#                              "Gender" = "sex",
#                              "CD10" = "cd10"),
#         labs.subheadings = list("sex" = list("Female" = "2",
#                                              "Male" = "1"),
#                                 "localisation" = list("Neck" = "0",
#                                                       "Head" = "1",
#                                                       "Trunk" = "2",
#                                                       "Upper Extremity" = "3",
#                                                       "Lower Extremity" = "4",
#                                                       "Unspecified" = "5")),
#         reference = list("sex" = c("Male")),
#         simplify = list("Immunohistochemistry" = c("cd10", "sox10", "ck"),
#                         "Tumor" = c("necrosis", "margins")),
#         censur = TRUE,
#         numeric = c("mean", "sd", "q1q3", "iqr"),
#         test = TRUE,
#         total = TRUE)