
The basic narrative type is descriptive statistics: it looks for outliers or the biggest contributors to the total volumes in the data set. These narratives are quite useful and can help you look deeper into the data set hierarchy.


Simple Table

We start with the simplest table, which has only one dimension and one measure. Here the overall sales volume and the outlying Regions are analyzed.

df_one <- sales %>%
  group_by(Region) %>%
  summarise(Sales = sum(Sales, na.rm = TRUE)) %>%
  arrange(desc(Sales))

kable(df_one)
Region     Sales
NA      18079736
EMEA    13555413
ASPAC    3919261
LATAM    3236068

narrate_descriptive(df_one)
#> $`Total Sales`
#> Total Sales across all Regions is 38790478.4.
#> 
#> $`Region by Sales`
#> Outlying Regions by Sales are NA (18079736.4, 46.6 %) and EMEA (13555412.7, 34.9 %).
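
The figures in this narrative can be checked by hand. The sketch below is a base R illustration only, not the package's internal implementation: it rebuilds the aggregated table from the rounded values printed above and recomputes the total and each Region's share of it. Note that "NA" here is a Region label (a character string), not a missing value.

```r
# Aggregated table, copied from the rounded kable() output above.
# "NA" is a Region label (a string), not R's missing value.
df_check <- data.frame(
  Region = c("NA", "EMEA", "ASPAC", "LATAM"),
  Sales  = c(18079736, 13555413, 3919261, 3236068),
  stringsAsFactors = FALSE
)

# Total across all Regions
total_sales <- sum(df_check$Sales)

# Share of each Region in the total, in percent, rounded to one decimal
share_pct <- round(df_check$Sales / total_sales * 100, 1)
names(share_pct) <- df_check$Region

total_sales
share_pct
```

The shares for NA and EMEA come out as 46.6 and 34.9, matching the percentages in the narrative. The total differs from the narrated 38790478.4 only in the decimals, because the table values are rounded.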

Summarization

The data frame can be aggregated in several ways, controlled by the summarization argument, which can be "sum", "count" or "average".

sales %>%
  narrate_descriptive(
    measure = "Sales", 
    dimensions = "Region",
    summarization = "count"
  )
#> $`Total Sales`
#> Total Sales across all Regions is 9026.
#> 
#> $`Region by Sales`
#> Outlying Regions by Sales are NA (3821, 39.4 %) and EMEA (2883, 29.8 %).
sales %>%
  narrate_descriptive(
    measure = "Sales", 
    dimensions = "Region", 
    summarization = "average"
  )
#> $`Average Sales`
#> Average Sales across all Regions is 3879.
#> 
#> $`Region by Sales`
#> Outlying Regions by Sales are LATAM (2303.3, -40.6 % vs average Sales) and ASPAC (2398.6, -38.2 % vs average Sales).
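
To make the three options concrete, here is a minimal base R sketch of what each aggregation computes per dimension level. It uses hypothetical toy data rather than the sales data set, and it assumes that the overall average is the mean over all rows. The package may compute these differently internally.

```r
# Hypothetical toy data, not the sales data set used in this article
toy <- data.frame(
  Region = c("EMEA", "EMEA", "EMEA", "LATAM", "LATAM"),
  Sales  = c(100, 300, 200, 50, 150),
  stringsAsFactors = FALSE
)

# The three aggregations, computed per Region
by_sum     <- tapply(toy$Sales, toy$Region, sum)
by_count   <- tapply(toy$Sales, toy$Region, length)
by_average <- tapply(toy$Sales, toy$Region, mean)

# For "average", each level is compared with the overall average
# (assumption: overall average = mean over all rows)
overall_average <- mean(toy$Sales)
vs_average_pct  <- round((by_average / overall_average - 1) * 100, 1)

# Same arithmetic applied to the LATAM figures printed above
latam_check <- round((2303.3 / 3879 - 1) * 100, 1)  # -40.6
```

The last line reproduces the "-40.6 % vs average Sales" reported for LATAM, which suggests the percentage is the relative deviation of the level average from the overall average.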

Multiple Dimensions

df_two <- sales %>%
  filter(Region %in% c("NA", "EMEA")) %>%
  group_by(Region, Product) %>%
  summarise(Sales = sum(Sales, na.rm = TRUE)) %>%
  arrange(desc(Sales))

kable(df_two)
Region  Product              Sales
NA      Food & Beverage  7392821.0
EMEA    Food & Beverage  5265113.2
NA      Electronics      3789132.7
EMEA    Electronics      3182803.4
NA      Home             2165764.5
NA      Tools            2054959.1
EMEA    Home             1633026.4
NA      Baby             1521544.7
EMEA    Tools            1499974.6
NA      Clothing         1155514.4
EMEA    Baby             1146743.8
EMEA    Clothing          827751.3

narrate_descriptive(df_two)
#> $`Total Sales`
#> Total Sales across all Regions is 31635149.1.
#> 
#> $`Region by Sales`
#> Outlying Region by Sales is NA (18079736.4, 57.2 %).
#> 
#> $`NA by Product`
#> In NA, significant Products by Sales are Food & Beverage (7392821, 40.9 %) and Electronics (3789132.7, 21 %).
#> 
#> $`Product by Sales`
#> Outlying Products by Sales are Food & Beverage (12657934.2, 40 %) and Electronics (6971936.1, 22 %).
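
With two dimensions, the narrative reports three kinds of shares: each Region in the total, each Product within an outlying Region, and each Product in the total. The following base R sketch recomputes them from the table above. It only illustrates the arithmetic and is not the package's implementation.

```r
# Table values copied from the kable() output above
products <- c("Food & Beverage", "Electronics", "Home",
              "Tools", "Baby", "Clothing")
df_check <- data.frame(
  Region  = rep(c("NA", "EMEA"), each = 6),
  Product = rep(products, times = 2),
  Sales   = c(7392821.0, 3789132.7, 2165764.5, 2054959.1, 1521544.7, 1155514.4,
              5265113.2, 3182803.4, 1633026.4, 1499974.6, 1146743.8, 827751.3),
  stringsAsFactors = FALSE
)

total_sales <- sum(df_check$Sales)

# Share of each Region in the total
region_totals    <- tapply(df_check$Sales, df_check$Region, sum)
region_share_pct <- round(region_totals / total_sales * 100, 1)

# Share of each Product within the NA Region only
na_rows      <- df_check[df_check$Region == "NA", ]
na_share_pct <- round(na_rows$Sales / sum(na_rows$Sales) * 100, 1)
names(na_share_pct) <- na_rows$Product

# Share of each Product in the total, across both Regions
product_totals    <- tapply(df_check$Sales, df_check$Product, sum)
product_share_pct <- round(product_totals / total_sales * 100, 1)
```

The within-NA shares use the NA total as the denominator, which is why Food & Beverage is 40.9 % in "NA by Product" but 40 % in "Product by Sales".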

Depth

Narration depth can be controlled with the narration_depth argument. To get summary narratives only, set narration_depth = 1.

narrate_descriptive(
  df_two, 
  narration_depth = 1
)
#> $`Total Sales`
#> Total Sales across all Regions is 31635149.1.
#> 
#> $`Region by Sales`
#> Outlying Region by Sales is NA (18079736.4, 57.2 %).
#> 
#> $`Product by Sales`
#> Outlying Products by Sales are Food & Beverage (12657934.2, 40 %) and Electronics (6971936.1, 22 %).

Coverage

The key argument for all narratives is coverage. It limits the narrative to the most important items instead of simply looping through every dimension level.

By default, coverage is set to 0.5, which means that narration stops as soon as the cumulative share of the total reaches the 50 % mark. With increased coverage, additional narrative is returned.

df_three <- sales %>%
  group_by(Product) %>%
  summarise(Sales = sum(Sales, na.rm = TRUE)) %>%
  arrange(desc(Sales))

df_three %>%
  mutate(
    Share = round(Sales/sum(Sales)*100, 1),
    Cumulative = cumsum(Share)) %>%
  kable()
Product             Sales  Share  Cumulative
Food & Beverage  15543470   40.1        40.1
Electronics       8608963   22.2        62.3
Home              4599371   11.9        74.2
Tools             4404197   11.4        85.6
Baby              3256835    8.4        94.0
Clothing          2377643    6.1       100.1

narrate_descriptive(df_three)
#> $`Total Sales`
#> Total Sales across all Products is 38790478.4.
#> 
#> $`Product by Sales`
#> Outlying Products by Sales are Food & Beverage (15543469.7, 40.1 %) and Electronics (8608962.8, 22.2 %).
narrate_descriptive(df_three, coverage = 0.7)
#> $`Total Sales`
#> Total Sales across all Products is 38790478.4.
#> 
#> $`Product by Sales`
#> Outlying Products by Sales are Food & Beverage (15543469.7, 40.1 %), Electronics (8608962.8, 22.2 %) and Home (4599370.9, 11.9 %).
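
The coverage rule described above can be sketched in a few lines of base R. The helper covered_levels() is hypothetical and written only for illustration. It is not part of the package, which may apply additional rules. It assumes coverage is a fraction between 0 and 1 and keeps the largest levels until their cumulative share first reaches it.

```r
# Aggregated table, copied from the rounded output above
df_check <- data.frame(
  Product = c("Food & Beverage", "Electronics", "Home",
              "Tools", "Baby", "Clothing"),
  Sales   = c(15543470, 8608963, 4599371, 4404197, 3256835, 2377643),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of the package: return the top levels
# whose cumulative share of the total first reaches `coverage`
covered_levels <- function(df, coverage = 0.5) {
  df <- df[order(df$Sales, decreasing = TRUE), ]
  cum_share <- cumsum(df$Sales) / sum(df$Sales)
  # First position where the cumulative share reaches the threshold;
  # fall back to all rows if it is never reached
  n_keep <- min(which(cum_share >= coverage), nrow(df))
  df$Product[seq_len(n_keep)]
}

covered_levels(df_check)                  # Food & Beverage, Electronics
covered_levels(df_check, coverage = 0.7)  # adds Home
```

With the default of 0.5, Food & Beverage alone covers about 40 %, so Electronics is added and the cumulative share passes 62 %. Raising coverage to 0.7 requires Home as well. The sketch cumulates unrounded shares, which also avoids the 100.1 seen in the table, where already rounded shares were summed.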