Purpose
Bellabeat is a health-focused smart products company aimed at empowering women with knowledge about their own health and habits. Founded in 2013 with a drive to develop beautifully designed technology for its core market, it has grown rapidly and stood out within its segment.
Bellabeat wants to analyze smart device usage in order to gain insight into how consumers use non-Bellabeat smart devices and get high-level recommendations that can inform marketing strategies.
Data
The analysis uses the publicly available FitBit Fitness Tracker Data set from Kaggle, whose records were collected via Amazon Mechanical Turk. Participants are identified across the data sets by a unique Id.
Cleaning and Initial Insights
The data set contains 18 CSVs, but the analysis is limited to those with enough unique users to provide relevant insight. Some of the CSVs are also simply more granular forms of others, e.g., Minute Intensities vs Hourly Intensities vs Daily Intensities. The three chosen were:
- Daily Activity
- Daily Intensities
- Daily Sleep
Creation of data frames:
daily_activity <- read.csv("dailyActivity_merged.csv")
daily_intensities <- read.csv("dailyIntensities_merged.csv")
daily_sleep <- read.csv("sleepDay_merged.csv")
Number of unique users per set:
library(dplyr) # provides n_distinct()
df_list <- Filter(function(x) is(x, "data.frame"), mget(ls()))
lapply(df_list, function(x) {n_distinct(x$Id)})
## $daily_activity
## [1] 33
##
## $daily_intensities
## [1] 33
##
## $daily_sleep
## [1] 24
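The same kind of count can be made with base R alone. The following is a minimal sketch on toy data: the two small data frames and their Id values are invented for illustration and only stand in for the real CSVs. It also shows one way to list which participants are missing from a smaller data set.

```r
# Toy stand-ins for the real data frames (Ids are invented for illustration)
toy_activity <- data.frame(Id = c(1, 1, 2, 3, 3),
                           TotalSteps = c(100, 200, 300, 400, 500))
toy_sleep <- data.frame(Id = c(1, 1, 3),
                        TotalMinutesAsleep = c(400, 420, 390))

toy_list <- list(toy_activity = toy_activity, toy_sleep = toy_sleep)

# Count distinct participants per data set using base R only
unique_users <- sapply(toy_list, function(x) length(unique(x$Id)))
print(unique_users)

# Ids present in the activity data but absent from the sleep data
missing_from_sleep <- setdiff(toy_activity$Id, toy_sleep$Id)
print(missing_from_sleep)
```

Applied to the real data frames, a comparison like this would show which of the 33 participants have no sleep records.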
Check for missing data:
lapply(df_list, function(x) {apply(is.na(x), 2, sum)})
## $daily_activity
## Id ActivityDate TotalSteps
## 0 0 0
## TotalDistance TrackerDistance LoggedActivitiesDistance
## 0 0 0
## VeryActiveDistance ModeratelyActiveDistance LightActiveDistance
## 0 0 0
## SedentaryActiveDistance VeryActiveMinutes FairlyActiveMinutes
## 0 0 0
## LightlyActiveMinutes SedentaryMinutes Calories
## 0 0 0
##
## $daily_intensities
## Id ActivityDay SedentaryMinutes
## 0 0 0
## LightlyActiveMinutes FairlyActiveMinutes VeryActiveMinutes
## 0 0 0
## SedentaryActiveDistance LightActiveDistance ModeratelyActiveDistance
## 0 0 0
## VeryActiveDistance
## 0
##
## $daily_sleep
## Id SleepDay TotalSleepRecords TotalMinutesAsleep
## 0 0 0 0
## TotalTimeInBed
## 0
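As a side note, the per-column NA count can also be written with `colSums()`. This is a small sketch on an invented data frame with two deliberately missing values, not on the real data, where every count above is zero.

```r
# Toy data frame with two missing values (invented for illustration)
toy <- data.frame(
  Id         = c(1, 2, 3),
  TotalSteps = c(1000, NA, 3000),
  Calories   = c(2000, 2100, NA)
)

# Number of NA values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Number of rows with no missing values at all
complete_rows <- sum(complete.cases(toy))
print(complete_rows)
```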
Summaries of data sets:
lapply(df_list, function(x) {summary(x)})
## $daily_activity
## Id ActivityDate TotalSteps TotalDistance
## Min. :1.504e+09 Length:940 Min. : 0 Min. : 0.000
## 1st Qu.:2.320e+09 Class :character 1st Qu.: 3790 1st Qu.: 2.620
## Median :4.445e+09 Mode :character Median : 7406 Median : 5.245
## Mean :4.855e+09 Mean : 7638 Mean : 5.490
## 3rd Qu.:6.962e+09 3rd Qu.:10727 3rd Qu.: 7.713
## Max. :8.878e+09 Max. :36019 Max. :28.030
## TrackerDistance LoggedActivitiesDistance VeryActiveDistance
## Min. : 0.000 Min. :0.0000 Min. : 0.000
## 1st Qu.: 2.620 1st Qu.:0.0000 1st Qu.: 0.000
## Median : 5.245 Median :0.0000 Median : 0.210
## Mean : 5.475 Mean :0.1082 Mean : 1.503
## 3rd Qu.: 7.710 3rd Qu.:0.0000 3rd Qu.: 2.053
## Max. :28.030 Max. :4.9421 Max. :21.920
## ModeratelyActiveDistance LightActiveDistance SedentaryActiveDistance
## Min. :0.0000 Min. : 0.000 Min. :0.000000
## 1st Qu.:0.0000 1st Qu.: 1.945 1st Qu.:0.000000
## Median :0.2400 Median : 3.365 Median :0.000000
## Mean :0.5675 Mean : 3.341 Mean :0.001606
## 3rd Qu.:0.8000 3rd Qu.: 4.782 3rd Qu.:0.000000
## Max. :6.4800 Max. :10.710 Max. :0.110000
## VeryActiveMinutes FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes
## Min. : 0.00 Min. : 0.00 Min. : 0.0 Min. : 0.0
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.:127.0 1st Qu.: 729.8
## Median : 4.00 Median : 6.00 Median :199.0 Median :1057.5
## Mean : 21.16 Mean : 13.56 Mean :192.8 Mean : 991.2
## 3rd Qu.: 32.00 3rd Qu.: 19.00 3rd Qu.:264.0 3rd Qu.:1229.5
## Max. :210.00 Max. :143.00 Max. :518.0 Max. :1440.0
## Calories
## Min. : 0
## 1st Qu.:1828
## Median :2134
## Mean :2304
## 3rd Qu.:2793
## Max. :4900
##
## $daily_intensities
## Id ActivityDay SedentaryMinutes LightlyActiveMinutes
## Min. :1.504e+09 Length:940 Min. : 0.0 Min. : 0.0
## 1st Qu.:2.320e+09 Class :character 1st Qu.: 729.8 1st Qu.:127.0
## Median :4.445e+09 Mode :character Median :1057.5 Median :199.0
## Mean :4.855e+09 Mean : 991.2 Mean :192.8
## 3rd Qu.:6.962e+09 3rd Qu.:1229.5 3rd Qu.:264.0
## Max. :8.878e+09 Max. :1440.0 Max. :518.0
## FairlyActiveMinutes VeryActiveMinutes SedentaryActiveDistance
## Min. : 0.00 Min. : 0.00 Min. :0.000000
## 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.:0.000000
## Median : 6.00 Median : 4.00 Median :0.000000
## Mean : 13.56 Mean : 21.16 Mean :0.001606
## 3rd Qu.: 19.00 3rd Qu.: 32.00 3rd Qu.:0.000000
## Max. :143.00 Max. :210.00 Max. :0.110000
## LightActiveDistance ModeratelyActiveDistance VeryActiveDistance
## Min. : 0.000 Min. :0.0000 Min. : 0.000
## 1st Qu.: 1.945 1st Qu.:0.0000 1st Qu.: 0.000
## Median : 3.365 Median :0.2400 Median : 0.210
## Mean : 3.341 Mean :0.5675 Mean : 1.503
## 3rd Qu.: 4.782 3rd Qu.:0.8000 3rd Qu.: 2.053
## Max. :10.710 Max. :6.4800 Max. :21.920
##
## $daily_sleep
## Id SleepDay TotalSleepRecords TotalMinutesAsleep
## Min. :1.504e+09 Length:413 Min. :1.000 Min. : 58.0
## 1st Qu.:3.977e+09 Class :character 1st Qu.:1.000 1st Qu.:361.0
## Median :4.703e+09 Mode :character Median :1.000 Median :433.0
## Mean :5.001e+09 Mean :1.119 Mean :419.5
## 3rd Qu.:6.962e+09 3rd Qu.:1.000 3rd Qu.:490.0
## Max. :8.792e+09 Max. :3.000 Max. :796.0
## TotalTimeInBed
## Min. : 61.0
## 1st Qu.:403.0
## Median :463.0
## Mean :458.6
## 3rd Qu.:526.0
## Max. :961.0
Presenting selected summary figures as tables makes them easier to read and simpler to export to a presentation.
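The code that builds these tables is not shown, so the following is only a sketch of one possible approach: take the column means and pair them with readable labels. The toy values are invented. The column names match those in the `daily_activity` output above.

```r
# Toy stand-in for daily_activity (values invented for illustration)
toy_activity <- data.frame(
  TotalSteps               = c(4000, 8000, 12000),
  TotalDistance            = c(2.5, 5.5, 8.5),
  LoggedActivitiesDistance = c(0, 0, 0.3),
  Calories                 = c(1800, 2200, 2600)
)

# Pair each readable label with the rounded mean of its column
activity_summary <- data.frame(
  measure = c("Average Daily Steps",
              "Average Daily Miles",
              "Average Logged Activities Miles",
              "Average Daily Calories Burned"),
  value = round(c(mean(toy_activity$TotalSteps),
                  mean(toy_activity$TotalDistance),
                  mean(toy_activity$LoggedActivitiesDistance),
                  mean(toy_activity$Calories)), 2)
)
print(activity_summary)
```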
measure | value |
---|---|
Average Daily Steps | 7637.91 |
Average Daily Miles | 5.49 |
Average Logged Activities Miles | 0.11 |
Average Daily Calories Burned | 2303.61 |
One thing that stands out here is the number of calories burned. A commonly cited reference value for daily calorie intake is 2,000. Participants burned about 2,304 calories per day on average (median 2,134), so if they ate roughly that reference amount, most were burning more than they consumed. The tracker may therefore have helped keep users aware of their calorie use, though this data alone cannot confirm it.
Since so many participants met or exceeded the 2,000-calorie mark, it may be worthwhile to let users share when they hit milestones. This need not involve direct calorie-to-calorie comparisons; friendly notifications of how many people in their circle have hit their daily, weekly, or monthly goals would suffice.
measure | value |
---|---|
Average Sedentary Minutes | 991.21 |
Average Light Active Minutes | 192.81 |
Average Moderately Active Minutes | 13.56 |
Average Very Active Minutes | 21.16 |
Average Light Active Distance | 3.34 |
Average Moderately Active Distance | 0.57 |
Average Very Active Distance | 1.50 |
measure | value |
---|---|
Average Daily Sleep Sessions | 1.12 |
Average Daily Sleep Minutes | 419.47 |
Average Daily Sleep Hours | 6.99 |
Average Daily Minutes In Bed | 458.64 |
Average Daily Hours In Bed | 7.64 |
Summary of Tables
On average participants spent:
- 3.2 hours per day engaged in light activity
- about 14 minutes per day on moderate activity and about 21 minutes per day engaged in very heavy activity
- about 16.5 hours per day sedentary, including 7.64 hours of total time in bed, 6.99 of which were spent asleep.
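The hour figures in this summary come from dividing the table averages by 60. A short worked check, using only the averages reported in the tables above (the vector names are chosen here for illustration):

```r
# Average minutes per day, copied from the summary tables above
avg_minutes <- c(
  sedentary   = 991.21,
  light       = 192.81,
  moderate    = 13.56,
  very_active = 21.16,
  asleep      = 419.47,
  in_bed      = 458.64
)

# Convert minutes to hours, rounded to two decimals
avg_hours <- round(avg_minutes / 60, 2)
print(avg_hours)
```

This gives roughly 16.52 hours sedentary, 3.21 hours of light activity, 6.99 hours asleep, and 7.64 hours in bed.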
Other findings:
- Participants didn’t appear to take advantage of the Logged Activities feature. It may be worth investigating whether potential users don’t desire such a feature or simply need to be made aware of it.
- Average Daily Sleep Sessions may indicate that few participants have time for or choose to nap daily. Helping to encourage a regular sleep pattern through Bellabeat wearables and related apps could be a useful strategy to test.
Additional Points of Interest
When are users most active? The answer could reveal useful information about potential customers who would be interested in fitness trackers.
Average Intensity peaks from 5-7PM with a lower peak from 12-2PM. Perhaps combining this type of data further with personas derived from customer profiles that align with others in their segment who achieved measurable results could provide positive outcomes for new users. Data like this could easily be updated daily or weekly for wearers to help them see how they’re performing and if they’re on track for goals. This would also help customers see if they’re keeping a regular schedule from day to day or week to week to sustain their momentum.
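The code behind the hourly intensity figures is not shown here. As a rough sketch of how peak hours could be found, the example below averages intensity by hour of day on toy data. The column names `ActivityHour` and `TotalIntensity`, the 24-hour timestamp format, and all values are assumptions made for illustration, so the real Hourly Intensities file may need a different `format` string.

```r
# Toy hourly records for two users (all values invented for illustration)
toy_hourly <- data.frame(
  Id = rep(c(1, 2), each = 4),
  ActivityHour = rep(c("04/12/2016 08:00:00", "04/12/2016 12:00:00",
                       "04/12/2016 18:00:00", "04/12/2016 23:00:00"),
                     times = 2),
  TotalIntensity = c(10, 20, 30, 2, 12, 18, 34, 4)
)

# Parse the timestamp and extract the hour of day (0-23)
parsed <- as.POSIXct(toy_hourly$ActivityHour,
                     format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
toy_hourly$hour <- as.integer(format(parsed, "%H"))

# Mean intensity for each hour of day across all users
hourly_mean <- aggregate(TotalIntensity ~ hour, data = toy_hourly, FUN = mean)
print(hourly_mean)

# Hour of day with the highest mean intensity
peak_hour <- hourly_mean$hour[which.max(hourly_mean$TotalIntensity)]
print(peak_hour)
```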
Do those who are more active burn more calories regardless of intensity level? Health professionals and fitness instructors recommend that people be active daily even if that activity isn’t strenuous. Perhaps the tracker data can show a simple relationship: more movement of any kind, including daily steps taken, goes with more calories burned.
Let’s look at the correlation coefficients from three different tests, then a scatter plot of calories burned against steps taken.
cor.test(daily_activity$TotalSteps, daily_activity$Calories, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: daily_activity$TotalSteps and daily_activity$Calories
## t = 22.472, df = 938, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5483688 0.6316184
## sample estimates:
## cor
## 0.5915681
cor.test(daily_activity$TotalSteps, daily_activity$Calories, method = "kendall")
##
## Kendall's rank correlation tau
##
## data: daily_activity$TotalSteps and daily_activity$Calories
## z = 18.179, p-value < 2.2e-16
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## 0.3974441
cor.test(daily_activity$TotalSteps, daily_activity$Calories, method = "spearman")
##
## Spearman's rank correlation rho
##
## data: daily_activity$TotalSteps and daily_activity$Calories
## S = 61010776, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.5592679
All three tests for correlation, as well as the scatter plot with a fitted regression line, show a positive relationship between the number of steps taken and total calories burned. The Pearson coefficient of about 0.59 indicates a moderate positive association, and p-values well below 0.05 indicate that it is statistically significant. One would expect that someone who simply takes more steps per day would likely show more total calories burned than someone who took fewer. Simple is a great starting point when it comes to getting people to take up and stay consistent with a daily activity routine.
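The plotting code is not included in this write-up, so the following is only a sketch of how a scatter plot with a straight least-squares line could be drawn in base R. The toy values and variable names are invented, and the choice of a linear fit is an assumption rather than the method used for the original figure.

```r
# Toy data invented for illustration; not taken from the FitBit data set
toy <- data.frame(
  steps    = c(2000, 4000, 6000, 8000, 10000, 12000),
  calories = c(1700, 1900, 2000, 2300, 2400, 2700)
)

# Least-squares straight line: calories as a linear function of steps
fit <- lm(calories ~ steps, data = toy)
slope <- coef(fit)[["steps"]]

# Pearson correlation between the two variables
r <- cor(toy$steps, toy$calories, method = "pearson")

# Scatter plot with the fitted line overlaid
draw_plot <- function() {
  plot(toy$steps, toy$calories,
       xlab = "Total steps", ylab = "Calories burned")
  abline(fit, col = "blue")
}
if (interactive()) draw_plot()
```

With the real data, `daily_activity$TotalSteps` and `daily_activity$Calories` would take the place of the toy columns.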
Recommendations
Bellabeat can combine the above findings with their own user-informed research to develop a marketing strategy focused on their target audience of female customers. Collecting similar data that provides better insights for their users could be useful to offer feedback that actively engages each individual.
Specific Recommendations Summary:
- Create ad campaigns as well as device & app-related designs that help create awareness of core features users might not otherwise note. For instance, display information on activity tracking and offer optional “get on the move”/sedentary duration reminders during device setup. Conduct A/B tests to see if such campaigns & feature awareness provide better results and more consistent device use.
- Consider surveying customers about their sleep habits and trial interest and engagement with “sleep companion” features. Present napping recommendations for users whose activity patterns suggest the need.
- Highlight users’ most active hours to further drive engagement and help them establish activity routines that align with their goals.
- Provide shared highlights via available devices & apps. Create features that encourage a “better together” ideal of moving forward as a group focused on common goals and milestones, without a scoreboard mentality.
Photos by Bellabeat, Matt Heaton, Jennifer Coffin-Grey, Levi Guzman