1.0.1 - Beginner - Introduction to Statistics using R

introduction
review
econ 226
econ 227
descriptive statistics
beginner
importing data
data wrangling
t-test
inferential statistics
R
This notebook is an introduction to basic statistics using Jupyter and R, and some fundamental data analysis. It is a high-level review of the most important applied tools from earlier units.
Author

COMET Team
William Co, Anneke Dresselhuis, Jonathan Graves, Emrul Hasan, Jonah Heyl, Mridul Manas, Shiming Wu.

Published

8 May 2023

Outline

Prerequisites

  • Introduction to Jupyter
  • Introduction to R

References

  • Esteban Ortiz-Ospina and Max Roser (2018) - “Economic inequality by gender”. Published online at OurWorldInData.org. Retrieved from: https://ourworldindata.org/economic-inequality-by-gender [Online Resource]

Outcomes

In this notebook, you will learn how to:

  • Import data from the Survey of Financial Security (Statistics Canada, 2019)
  • Wrangle, reshape and visualize SFS_data as part of an Exploratory Data Analysis (EDA)
  • Run statistical tests, such as the \(t\)-test, to compare mean income of male-led vs. female-led households
  • Generate summary statistics tables and other data-representations of the data using group_by()
  • Optional: Run a formal two sample t-test to check for heterogeneity in how gender affects income and compare the returns to education

Part 1: Import Data into R

The data we use comes from the 2019 Survey of Financial Security released by Statistics Canada 1.

# run this cell to load necessary packages for this tutorial
# install.packages('vtable')
# install.packages('viridis')
library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.4.3
Warning: package 'tibble' was built under R version 4.4.2
Warning: package 'tidyr' was built under R version 4.4.2
Warning: package 'readr' was built under R version 4.4.2
Warning: package 'purrr' was built under R version 4.4.3
Warning: package 'dplyr' was built under R version 4.4.3
Warning: package 'stringr' was built under R version 4.4.2
Warning: package 'forcats' was built under R version 4.4.2
Warning: package 'lubridate' was built under R version 4.4.2
library(haven)
Warning: package 'haven' was built under R version 4.4.2
library(dplyr)
library(vtable)
Warning: package 'kableExtra' was built under R version 4.4.3
library(viridis)
Warning: package 'viridisLite' was built under R version 4.4.2
source("beginner_intro_to_statistics2_tests.r")
Warning: package 'digest' was built under R version 4.4.2
source("beginner_intro_to_statistics2_functions.r")
[1] "Loaded!"
# warning messages are okay

The tidyverse is a collection of R packages developed by Hadley Wickham and his colleagues as a cohesive set of tools for data manipulation, visualization, and analysis. In a tidy data set, each variable forms a column and each observation forms a row. tidyverse packages such as tidyr and dplyr are recommended for cleaning and transforming your data into a tidy format.
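
To make the idea of tidy data concrete, here is a minimal sketch. The toy table and its column names (household, Y2018, Y2019) are made up for illustration and are not part of the SFS data. The year is a variable, so it should live in its own column rather than be spread across column names.

```r
library(tidyr)

# Hypothetical "wide" table: one income column per year (not tidy)
wide <- data.frame(
  household = c("A", "B"),
  Y2018 = c(50000, 72000),
  Y2019 = c(52000, 70000)
)

# pivot_longer() stacks the year columns so that each row is one
# household-year observation and each variable has its own column
long <- pivot_longer(
  wide,
  cols = c(Y2018, Y2019),
  names_to = "year",
  names_prefix = "Y",   # strip the leading "Y" from the old column names
  values_to = "income"
)

long
```

After reshaping, the table has one row per household and year, which is the shape that group_by() and ggplot() expect.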

Let’s import the .dta file from Statistics Canada using the read_dta function.

# if this is your first time using Jupyter Lab, the shortcut to run a cell is `Shift + Enter`
SFS_data <- read_dta("../datasets_beginner/SFS_2019_Eng.dta")

Here are some of the common file extensions and import functions in R:

  • .dta and read_dta() for STATA files
  • .csv and read_csv() for data stored as comma-separated values
  • .Rda (or .RData) and load() for objects saved in R’s own data format
head(SFS_data, 5)
# A tibble: 5 × 100
  pefamid pweight pagemieg PAS1MRAG  PAS1MRG1 PAS1MRG2 PASR1MFA PASR1MR pasrbuyg
  <chr>     <dbl> <chr>    <dbl+lbl> <dbl+lb> <dbl+lb> <dbl+lb> <dbl+l> <dbl+lb>
1 00001     4415. 06       NA        NA       NA       NA        2 [No]  7 [200…
2 00002      805. 14       NA        NA       NA       NA       NA      NA      
3 00003     1393. 12       NA        NA       NA       NA       NA       4 [198…
4 00004      217. 11       NA        NA       NA       NA       NA       5 [199…
5 00005      455. 11       NA        NA       NA       NA       NA       5 [199…
# ℹ 91 more variables: pasrcnmg <dbl+lbl>, pasrcon <dbl+lbl>,
#   pasrcst <dbl+lbl>, pasrcurg <dbl+lbl>, PASRDPO1 <dbl+lbl>,
#   PASRDPO2 <dbl+lbl>, PASRDPO3 <dbl+lbl>, PASRDPO4 <dbl+lbl>,
#   PASRDPO5 <dbl+lbl>, pasrdwng <dbl+lbl>, pasrfnmg <dbl+lbl>,
#   pasrint <dbl+lbl>, pasrintg <chr>, pasrmoag <dbl+lbl>, pasrmpfg <dbl+lbl>,
#   pasrmryg <dbl+lbl>, pasrrntg <dbl+lbl>, pasrskp <dbl+lbl>,
#   pattcrc <dbl+lbl>, pattcrlm <dbl+lbl>, pattcrr <dbl+lbl>, …

Note: head(df, n) displays the first n rows of the data frame. Other popular functions for previewing data include glimpse() and print().
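
As a quick comparison of preview functions, the sketch below uses a small made-up data frame (toy is hypothetical, not the SFS data). head() prints rows, while dplyr's glimpse() prints one line per column, which is often easier to read when a data set has many variables.

```r
library(dplyr)

# Hypothetical toy data frame, used only to illustrate the preview functions
toy <- data.frame(
  id = c("00001", "00002", "00003"),
  weight = c(4415, 805, 1393),
  gender = c(1, 1, 2)
)

first_two <- head(toy, 2)  # keeps the first two rows
first_two

glimpse(toy)               # one line per column: name, type, first values
```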

# you can read the documentation for a given function by adding a question-mark before its name
?head
starting httpd help server ... done

Part 2: Exploratory Data Analysis in R

There is a routine set of steps you should generally follow as part of your Exploratory Data Analysis (EDA). Normally, you would analyze and visualize the variation, correlation, and distribution of your variables of interest. We do this to gain an intuitive understanding of the data before we undertake any formal hypothesis tests or model-fitting.

Let’s think of our key variables of interest. We’re interested in estimating the effect of gender on differences in earnings.

  • Independent variable: gender of the highest income earner

  • Variable of interest: income after tax for each individual

  • Variable of interest: income before tax for each individual

  • Control: wealth for the household

  • Control: level of education

Cleaning and Reshaping SFS_data

For now, it’d be convenient to work with a new data frame containing only the key variables (columns) listed above. Moreover, the columns need to be renamed so they are easier for the reader to remember.

# rename columns
SFS_data <- SFS_data %>%
    rename(income_before_tax = pefmtinc) %>% 
    rename(income_after_tax = pefatinc) %>%
    rename(wealth = pwnetwpg) %>%
    rename(gender = pgdrmie) %>%
    rename(education = peducmie) 

# drop rows where income before tax is missing (NA); this column was called pefmtinc before renaming
SFS_data <- filter(SFS_data, !is.na(income_before_tax))

keep <- c("pefamid", "gender", "education", "wealth", "income_before_tax", "income_after_tax")

# new df with chosen columns
df_gender_on_wealth <- SFS_data[keep]

# preview
head(df_gender_on_wealth, 5)
# A tibble: 5 × 6
  pefamid gender    education  wealth income_before_tax income_after_tax
  <chr>   <dbl+lbl> <chr>       <dbl> <dbl+lbl>         <dbl+lbl>       
1 00001   1 [Male]  3          511750 210000            126650          
2 00002   1 [Male]  3           18250      0             23000          
3 00003   1 [Male]  2         8806000 250000            171200          
4 00004   1 [Male]  2         2019000 290000            233850          
5 00005   1 [Male]  3         4080100 170000            138500          

Note: This is another tidy representation of the original data but with fewer variables. The original data set is still stored as SFS_data.

Ensuring correct data-types

Notice that education is stored as chr, but we want to store it as a factor. The variable education is encoded with values from the set {1, 2, 3, 4, 9}, each of which represents a level of education obtained.

df_gender_on_wealth <- df_gender_on_wealth %>%
         mutate(education = as.factor(education), 
         gender = as.factor(gender),
         income_before_tax = as.numeric(income_before_tax),
         income_after_tax = as.numeric(income_after_tax))

head(df_gender_on_wealth, 2)
# A tibble: 2 × 6
  pefamid gender education wealth income_before_tax income_after_tax
  <chr>   <fct>  <fct>      <dbl>             <dbl>            <dbl>
1 00001   1      3         511750            210000           126650
2 00002   1      3          18250                 0            23000

All good! Let’s use descriptive statistics to understand how each of the codes in the set {1, 2, 3, 4, 9} represents an individual’s educational background.
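
If readable labels are preferred over numeric codes, factor() accepts a labels argument. The sketch below runs on a short made-up vector of codes. The labels for codes 1 to 4 follow the comments used later in this notebook; the label for code 9 is an assumption, since its meaning is not stated here.

```r
# Made-up example codes, stored as text like the education column
education_codes <- c("3", "3", "2", "4", "1", "9")

education_labelled <- factor(
  education_codes,
  levels = c("1", "2", "3", "4", "9"),
  labels = c(
    "Less than high school",
    "High school",
    "Non-university post-secondary",
    "University",
    "Not stated"   # assumption: the meaning of code 9 is not given in this notebook
  )
)

table(education_labelled)  # counts per labelled level
```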

Computing Descriptive Statistics using vtable in R

Let’s calculate the summary statistics of our dataset.

Note: the sumtable function from the vtable package can display the table in different formats, including LaTeX, HTML, and data.frame.

# out = "kable" tells it to return a knitr::kable()
# replace "kable" with "latex" and see what happens!
sumtbl <- sumtable(df_gender_on_wealth, out = "kable")
sumtbl
Summary Statistics
Variable               N     Mean  Std. Dev.      Min  Pctl. 25  Pctl. 75       Max
gender             10201
... 1               6132      60%
... 2               4069      40%
education          10201
... 1               1216      12%
... 2               2279      22%
... 3               2819      28%
... 4               3827      38%
... 9                 60       1%
wealth             10201  1030591    1744118  -875500    110750   1270250  30055000
income_before_tax  10201    95625     125418   -47000     27000    130000   3300000
income_after_tax   10201    86570      80987  -479300     39000    114250   2159100

This is like having a bird’s-eye view of our data. As researchers, we should take note of outliers and other irregularities and ask how those issues might affect the validity of our models and tests.

Note: see Appendix for a common method to remove outliers using Z-score thresholds.
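
As a rough illustration of the Z-score idea, the sketch below uses made-up numbers. The data, the column name income, and the cutoff of 3 are all assumptions for illustration, and the Appendix may implement the method differently.

```r
library(dplyr)

# Hypothetical toy incomes: 20 ordinary values plus one extreme value
toy <- data.frame(income = c(seq(30, 125, by = 5), 5000))

z_cutoff <- 3  # assumed threshold; 3 is a common convention, not a rule

toy_trimmed <- toy %>%
  mutate(z = (income - mean(income)) / sd(income)) %>%  # standardize each value
  filter(abs(z) <= z_cutoff)                            # keep values within the cutoff

nrow(toy)          # rows before trimming
nrow(toy_trimmed)  # rows after trimming
```

Because the mean and standard deviation are themselves pulled by extreme values, it is worth checking which rows are dropped before relying on a trimmed sample.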

Grouping observations

Wouldn’t it be neat to see what mean and median incomes look like for male-led and female-led households, broken down by the level of education obtained by the main income-earner?

by_gender_education <- df_gender_on_wealth %>%
  group_by(gender, education) %>%
  summarise(mean_income = mean(income_before_tax, na.rm = TRUE),
            median_income = median(income_before_tax, na.rm = TRUE),
            mean_wealth = mean(wealth, na.rm = TRUE),
            median_wealth = median(wealth, na.rm = TRUE))
`summarise()` has regrouped the output.
ℹ Summaries were computed grouped by gender and education.
ℹ Output is grouped by gender.
ℹ Use `summarise(.groups = "drop_last")` to silence this message.
ℹ Use `summarise(.by = c(gender, education))` for per-operation grouping
  (`?dplyr::dplyr_by`) instead.
by_gender_education
# A tibble: 10 × 6
# Groups:   gender [2]
   gender education mean_income median_income mean_wealth median_wealth
   <fct>  <fct>           <dbl>         <dbl>       <dbl>         <dbl>
 1 1      1              47141.         28000     587458.       263850 
 2 1      2              81191.         65000     840157.       425275 
 3 1      3              99336.         87500     901916.       494250 
 4 1      4             157819.        120000    1680511.      1049500 
 5 1      9              60547.         49250     599164.       238375 
 6 2      1              23477.          9500     448291.       129210 
 7 2      2              51472.         33000     702158.       308000 
 8 2      3              65724.         47000     747951.       314675 
 9 2      4             108537.         90000    1213004.       771992.
10 2      9              26093.         15750     234826.        79600 

Note: this is again a tidy representation of SFS_data. Grouping observations by gender and education makes it a bit easier to make comparisons across groups.
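
The message printed by summarise() above is informational: the result is still grouped by gender. One way to get an ungrouped result and silence the message is the .groups argument. The sketch below uses a made-up data frame (toy and its values are hypothetical).

```r
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  gender = c(1, 1, 2, 2, 2),
  education = c("2", "4", "2", "4", "4"),
  income = c(60, 120, 40, 90, 110)
)

by_group <- toy %>%
  group_by(gender, education) %>%
  summarise(
    mean_income = mean(income),
    n = n(),            # number of observations in each group
    .groups = "drop"    # return an ungrouped tibble
  )

by_group
```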

We can take this line of reasoning further and generate a heatmap using the ggplot2 package.

library(ggplot2)
library(viridis)

# Create the heatmap with an accessible color palette
heatmap_plot <- ggplot(by_gender_education, aes(x = education, y = gender, fill = mean_income)) +
  geom_tile() +
  scale_fill_viridis_c(option = "plasma", na.value = "grey", name = "Mean Income") +
  labs(x = "Education", y = "Gender")

# Display the heatmap
heatmap_plot

Note: we use ggplot2’s scale_fill_viridis_c(), which applies the viridis color maps, so that the palette is perceptually uniform and remains readable for viewers with color-vision deficiencies.

Now, what does this tell you about how male-led households (gender = 1) compare with female-led households in terms of mean household income? Does it tell us whether education widens the income gap between male-led and female-led households with the same level of education?

We can infer from the visualization that, at the same level of education, female-led households have different mean incomes than male-led households. This smells of heterogeneity, and we can explore regression and other empirical methods to formally test this claim.

However, we shouldn’t yet draw any firm conclusions about the relationships between gender (of the main income earner), income, education and other variables such as wealth.

As researchers, we should ask if the differences in the mean or median incomes for the two groups are significant at all. We can then go a bit further and test if education indeed widens the gap or not.

Think Deeper: how would you specify the null and alternative hypotheses?

Test your knowledge

Match the function with the appropriate description. Enter your answer as a long string with the letter choices in order.

  1. Order rows using column values
  2. Keep distinct/unique rows
  3. Keep rows that match a condition
  4. Get a glimpse of your data
  5. Create, modify, and delete columns
  6. Keep or drop columns using their names and types
  7. Count the observations in each group
  8. Group by one or more variables
  9. A general vectorised if-else
  A. mutate()
  B. glimpse()
  C. filter()
  D. case_when()
  E. select()
  F. group_by()
  G. distinct()
  H. arrange()
  I. count()

Note: it’s fine if you don’t know all those functions yet! Match the functions you know and run code to figure out the rest.

# Enter your answer as a long string ex: if you think the matches are 1-B, 2-C, 3-A, enter answer as "BCA"
answer_1 <- ""

test_1()

Part 3: Running \(t\)-tests in R

Let’s run a t-test for a comparison of means.

# performs a t-test for means comparison
t_test_result <- t.test(income_before_tax ~ gender, data = df_gender_on_wealth)
print(t_test_result)

    Welch Two Sample t-test

data:  income_before_tax by gender
t = 15.057, df = 10167, p-value < 2.2e-16
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
 30119.32 39135.46
sample estimates:
mean in group 1 mean in group 2 
      109437.54        74810.15 

The 95% confidence interval does not include 0, so the gap is statistically significant at the 5% level: on average, male-led households (group 1) have a higher income before tax than female-led households (group 2).
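
The output header reads “Welch Two Sample t-test” because t.test() does not assume equal variances by default. For reference, the Welch statistic and its approximate degrees of freedom are given below. The notation is ours, not the notebook’s: \(\bar{x}_i\), \(s_i^2\) and \(n_i\) denote the sample mean, sample variance and sample size of group \(i\).

```latex
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}
\]
```

The non-integer df values in the test outputs come from this approximation for \(\nu\).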

Let’s now run a test to compare the medians of both groups.

# perform a Mann-Whitney U test for median comparison
mannwhitneyu_test_result <- wilcox.test(income_before_tax ~ gender, data = df_gender_on_wealth)
print(mannwhitneyu_test_result)

    Wilcoxon rank sum test with continuity correction

data:  income_before_tax by gender
W = 15307599, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0

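This p-value is again highly significant, so we reject the null hypothesis of no location shift between the two groups. Strictly speaking, the Wilcoxon rank-sum (Mann-Whitney U) test compares the two distributions rather than the medians themselves; if the distributions have a similar shape, the result indicates that the median incomes are not equal.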

Our variable of interest is income, and so far, we have provided statistical evidence for the case that the gender of the main income-earner is correlated with the household’s income.

We are however more interested in the causal mechanisms through which education and wealth determine how gender affects household income.

Think Deeper: According to Ortiz-Ospina and Roser (2018), women are overrepresented in low-paying jobs and are underrepresented in high-paying ones. What role does the attainment of education play in sorting genders into high vs. low-paying jobs? Can we test this formally with the data?

Studying how wealth and education might impact the income-gap

There are multiple reasons to study the links between wealth and the income gap. For instance, we might want to answer whether having more wealth affects an individual’s income.

We can use some of the methods we have learned in R to analyze and visualize relationships between income, gender, education and wealth.

Let’s see if having a university degree widens the gender income gap.

SFS_data <- SFS_data %>% 
            mutate(university = case_when(     # create a new variable with mutate
                            education == "4" ~ "Yes",    # use case_when and the ~ operator to apply if-else conditions
                            TRUE ~ "No")) %>% 
            mutate(university = as_factor(university)) #remember, it's a factor!

head(SFS_data$university, 10)
 [1] No  No  No  No  No  Yes No  Yes Yes Yes
Levels: No Yes

Let’s visualize how the mean wealth compares for male-led vs. female-led households, conditional on whether the main-income earner went to university.

results <- SFS_data %>%
           group_by(university,gender) %>%
           summarize(m_wealth = mean(wealth), sd_wealth = sd(wealth))
`summarise()` has regrouped the output.
ℹ Summaries were computed grouped by university and gender.
ℹ Output is grouped by university.
ℹ Use `summarise(.groups = "drop_last")` to silence this message.
ℹ Use `summarise(.by = c(university, gender))` for per-operation grouping
  (`?dplyr::dplyr_by`) instead.
results 
# A tibble: 4 × 4
# Groups:   university [2]
  university gender     m_wealth sd_wealth
  <fct>      <dbl+lbl>     <dbl>     <dbl>
1 No         1 [Male]    817055.  1459054.
2 No         2 [Female]  668021.  1447673.
3 Yes        1 [Male]   1680511.  2346053.
4 Yes        2 [Female] 1213004.  1563383.
f <- ggplot(data = SFS_data, aes(x = gender, y = wealth)) + xlab("Gender") + ylab("Wealth")    # label and define our x and y axis
f <- f + geom_bar(stat = "summary", fun = "mean", fill = "lightblue")    # produce a summary statistic, the mean
f <- f + facet_grid(. ~ university)    # split the plot into panels by university status

f

It smells like the wealth gap between the two types of households widens for groups that have obtained a university degree.

Similarly, let’s look at the wealth gap in percentage terms. We compute it directly from SFS_data, grouped by university; the group means are the same ones shown in the results table from the previous cell (the \(4 \times 4\) table). We need to load the package scales to use the function percent.

library(scales)
Warning: package 'scales' was built under R version 4.4.2
percentage_table <- SFS_data %>%
                    group_by(university) %>%
                    group_modify(~ data.frame(wealth_gap = mean(filter(., gender == 2)$wealth)/mean(filter(., gender == 1)$wealth) - 1)) %>%
                    mutate(wealth_gap = scales::percent(wealth_gap))

percentage_table
# A tibble: 2 × 2
# Groups:   university [2]
  university wealth_gap
  <fct>      <chr>     
1 No         -18%      
2 Yes        -28%      

Notice the signs are both negative. Hence, on average, female-led households have less wealth regardless of whether they have a university degree or not.

More importantly, based on our data, female-led households with university degrees have, on average, 28% less wealth than male-led households with university degrees. Among households without university degrees, the gap is smaller: 18%.

So the gap is about 10 percentage points wider among households whose main income earner has a university degree.
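
The arithmetic behind these percentages can be checked by hand. The sketch below copies the four group means from the results table above (rounded to whole dollars) into a named vector; the object names are ours.

```r
# Mean wealth by university status and gender, copied from the results table
mean_wealth <- c(
  no_male = 817055, no_female = 668021,
  yes_male = 1680511, yes_female = 1213004
)

# Relative gap: female-led mean divided by male-led mean, minus 1
gap_no  <- mean_wealth[["no_female"]]  / mean_wealth[["no_male"]]  - 1
gap_yes <- mean_wealth[["yes_female"]] / mean_wealth[["yes_male"]] - 1

# Gaps in percent, rounded to one decimal place
round(100 * c(no_university = gap_no, university = gap_yes), 1)

# Difference between the two gaps, in percentage points
round(100 * (gap_yes - gap_no), 1)
```

The last line is a difference between two percentages, which is why it is measured in percentage points rather than percent.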

Let’s test this further by creating sub-samples of “university degree” and “no university degree” respectively and then running a formal two sample t-test.

university_data <- filter(SFS_data, university == "Yes") # university only data 
nuniversity_data <- filter(SFS_data, university == "No") # non university data

t2 = t.test(
       x = filter(university_data, gender == 1)$wealth,
       y = filter(university_data, gender == 2)$wealth,
       alternative = "two.sided",
       mu = 0,
       conf.level = 0.95)

t2  # test for the wealth gap in university data

    Welch Two Sample t-test

data:  filter(university_data, gender == 1)$wealth and filter(university_data, gender == 2)$wealth
t = 7.3872, df = 3782.4, p-value = 1.836e-13
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 343428.9 591586.0
sample estimates:
mean of x mean of y 
  1680511   1213004 
round(t2$estimate[1] - t2$estimate[2],2) # rounds our estimate
mean of x 
 467507.5 
t3 = t.test(
       x = filter(nuniversity_data, gender == 1)$wealth,
       y = filter(nuniversity_data, gender == 2)$wealth,
       alternative = "two.sided",
       mu = 0,
       conf.level = 0.95)

t3 # test for the wealth gap in non-university data

    Welch Two Sample t-test

data:  filter(nuniversity_data, gender == 1)$wealth and filter(nuniversity_data, gender == 2)$wealth
t = 3.9821, df = 5192, p-value = 6.925e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  75662.69 222403.86
sample estimates:
mean of x mean of y 
 817054.8  668021.5 
round(t3$estimate[1] - t3$estimate[2],2) # rounds our estimate
mean of x 
 149033.3 

In both tests, the p-values are very small, so we reject the null hypothesis of equal mean wealth at conventional significance levels. Neither 95% confidence interval contains 0, and each gives a range of plausible values for the difference in means.

Based on these results, there is a statistically significant difference in mean wealth between male-led and female-led households regardless of university status, with male-led households having higher mean wealth in both sub-samples.
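
The pieces of a t.test() result can also be pulled out by name instead of being read off the printed output. The sketch below uses simulated data (group_a and group_b are made up, not the SFS sub-samples) to show the components that are available.

```r
set.seed(1)  # simulated data, for illustration only
group_a <- rnorm(200, mean = 10, sd = 2)
group_b <- rnorm(200, mean = 9, sd = 2)

res <- t.test(group_a, group_b)  # Welch two-sample t-test by default

res$statistic   # t statistic
res$parameter   # Welch degrees of freedom
res$p.value     # two-sided p-value
res$conf.int    # 95% confidence interval for the difference in means

# The 95% interval excludes 0 exactly when the two-sided p-value is below 0.05
excludes_zero <- res$conf.int[1] > 0 || res$conf.int[2] < 0
excludes_zero == (res$p.value < 0.05)
```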

Optional: Returns to HS diploma

Next, we examine whether returns to education differ between genders. For our purposes, we define returns to education as the difference in average income before tax between two consecutive education levels.

The following t-tests estimate the returns to a high school diploma for male-led households (retHS) and for female-led households (retHSF).

# Returns to education: High school diploma

less_than_high_school_data <- filter(SFS_data, education == 1) # Less than high school
high_school_data <- filter(SFS_data, education == 2) # High school
post_secondary_data <- filter(SFS_data, education == 3) # Non-university post-secondary
university_data <- filter(SFS_data, education == 4) # University


retHS = t.test(
       x = filter(high_school_data, gender == 1)$income_before_tax,
       y = filter(less_than_high_school_data, gender == 1)$income_before_tax,
       alternative = "two.sided",
       mu = 0,
       conf.level = 0.95)
retHS_ans=round(retHS$estimate[1] - retHS$estimate[2],2)

retHSF = t.test(
       x = filter(high_school_data, gender == 2)$income_before_tax,
       y = filter(less_than_high_school_data, gender == 2)$income_before_tax,
       alternative = "two.sided",
       mu = 0,
       conf.level = 0.95)

retHS

    Welch Two Sample t-test

data:  filter(high_school_data, gender == 1)$income_before_tax and filter(less_than_high_school_data, gender == 1)$income_before_tax
t = 11.44, df = 1968.1, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 28213.26 39888.30
sample estimates:
mean of x mean of y 
 81191.49  47140.72 
retHSF

    Welch Two Sample t-test

data:  filter(high_school_data, gender == 2)$income_before_tax and filter(less_than_high_school_data, gender == 2)$income_before_tax
t = 7.0649, df = 846.02, p-value = 3.359e-12
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 20217.37 35772.52
sample estimates:
mean of x mean of y 
 51472.26  23477.31 
retHS_ans=round(retHS$estimate[1] - retHS$estimate[2],2)
retHSF_ans=round(retHSF$estimate[1] - retHSF$estimate[2],2)

We have found statistically significant evidence for the case that returns to graduating with a high school diploma are indeed positive for individuals living in both male-led and female-led households.
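
The two estimated returns can be compared directly. The sketch below copies the sample means from the two test outputs above; the object names are ours.

```r
# Mean income before tax, copied from the retHS and retHSF outputs
male_hs     <- 81191.49  # male-led, high school
male_less   <- 47140.72  # male-led, less than high school
female_hs   <- 51472.26  # female-led, high school
female_less <- 23477.31  # female-led, less than high school

ret_male   <- male_hs - male_less      # return to a HS diploma, male-led households
ret_female <- female_hs - female_less  # return to a HS diploma, female-led households

c(male = ret_male, female = ret_female)
ret_male - ret_female                  # difference in returns between the two groups
```

Note that the two 95% confidence intervals reported above overlap, so this comparison of point estimates alone does not establish that the returns differ by gender.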

Test your knowledge

As an exercise, create a copy of the cell above and try to calculate the returns to a university degree for males and females.

# your code here

Now let’s work with a simulated dataset of mutual fund performance. Interpret the data below as the yearly returns for a sample of 300 mutual funds from 2010 to 2015.

fund_performance
    fund       Y2010        Y2011       Y2013        Y2014       Y2015
1      1 0.033185731 0.0285427344 0.082220368 0.0195765748 0.068595502
2      2 0.043094675 0.0274193310 0.049179591 0.0262605836 0.027274695
3      3 0.096761249 0.0218438389 0.049000090 0.0589878105 0.075545740
4      4 0.052115252 0.0184246016 0.004517971 0.0991715573 0.027562101
5      5 0.053878632 0.0368852140 0.073711560 0.0825385103 0.068907195
6      6 0.101451950 0.0599353752 0.043677975 0.0312629758 0.082899849
7      7 0.063827486           NA 0.030297712 0.0747776870 0.020346712
8      8 0.012048163 0.0563594130 0.007639226 0.0485429494 0.083239851
9      9 0.029394414 0.0871002514 0.041007125 0.0590394096 0.035314014
10    10 0.036630141 0.1111272205 0.024528166 0.0578108447 0.058830602
11    11 0.086722454 0.0890352798 0.038089084 0.1272634929 0.056055124
12    12 0.060794415 0.0727032429 0.013472000 0.0144413357 0.037184108
13    13 0.062023144           NA 0.100627684 0.0530275958 0.058043086
14    14 0.053320481 0.0319547988 0.049519924           NA 0.013087072
15    15 0.033324766 0.0394386063 0.082248352 0.0676950777 0.045915894
16    16 0.103607394 0.0711057171          NA 0.0828982542 0.074773725
17    17 0.064935514 0.0468298600 0.036404065 0.0933698672          NA
18    18          NA 0.0122405412 0.029735531           NA 0.005362214
19    19 0.071040677 0.1005330712 0.013312215 0.0623830849 0.015141873
20    20 0.035816258 0.0773417388 0.096398275 0.0978010985 0.002327309
21    21 0.017965289 0.0571229082 0.007541542 0.0375795241 0.062587491
22    22 0.043460753 0.0865432583 0.059551708 0.0436354840 0.020212150
23    23 0.019219867 0.0098367714 0.075393089 0.0489038833          NA
24    24 0.028133263 0.0698246089 0.055345706 0.0609505625 0.030872937
25    25 0.031248822 0.0343126287 0.023742336 0.0699547963 0.038280942
26    26          NA 0.0705123657 0.078234974 0.0895346265 0.075703564
27    27 0.075133611 0.0481753414 0.055117642 0.0471353723 0.016887436
28    28 0.054601194 0.0689888214 0.018095063 0.0558883414 0.084838678
29    29 0.015855892 0.0900655285 0.008358529 0.1246399363 0.061950881
30    30 0.087614448 0.0502187027 0.112601523 0.0629329678 0.060870565
31    31 0.062793927 0.0805267591 0.029644906 0.0556625933 0.024424230
32    32 0.041147856 0.0143469789          NA 0.0097327062 0.108610036
33    33 0.076853770 0.0283518668 0.065997781 0.0500856814 0.045071875
34    34 0.076344005 0.0955765313 0.059306908 0.0433602154          NA
35    35 0.074647432 0.0613216392 0.009384970 0.0496686251 0.043884306
36    36 0.070659208           NA          NA 0.0327374708          NA
37    37 0.066617530 0.0090788764 0.046510924 0.0293955304 0.040684697
38    38 0.048142649 0.0439765695 0.084181889 0.0283767910 0.037333190
39    39 0.040821120 0.0759733821 0.069083721 0.0435648645 0.070454891
40    40 0.038585870 0.0469435023 0.035211877 0.0910439794 0.080284886
41    41 0.029158791 0.0687256242 0.024974353 0.0814725988 0.028216851
42    42 0.043762482 0.0787701613 0.058132003 0.0392007465 0.074183266
43    43 0.012038109 0.1001316449 0.054720600           NA 0.092729693
44    44 0.115068679 0.0516805020 0.068891352 0.0246624971 0.026475680
45    45 0.086238860 0.0484405428 0.038126061 0.0362671840 0.030427869
46    46 0.016306743           NA 0.076980622 0.0531091401 0.069523351
47    47 0.037913455 0.0529798278 0.025075654 0.0301217817 0.055491439
48    48 0.036000339 0.0328444983 0.040083659 0.1102004207 0.066463249
49    49 0.073398954 0.0207797125 0.072224436 0.0418319740 0.092140529
50    50 0.047498928 0.0446028131 0.079699148 0.0135816659 0.061612494
51    51 0.057599555 0.0804482952          NA 0.0457621473 0.081551038
52    52 0.049143597           NA 0.053215712 0.0198386726 0.068687164
53    53 0.048713886 0.0371816214 0.068263370 0.0546846712 0.063008612
54    54 0.091058069 0.0534991185 0.006475271 0.0570090084 0.061582533
55    55 0.043226870 0.0232037729 0.064418768 0.0606676283 0.088739699
56    56 0.095494118 0.0600170883 0.025154772 0.0013442522 0.019932204
57    57 0.003537416 0.0623428976 0.080607590 0.0566213387 0.016844518
58    58 0.067538412 0.0490089152 0.066154461 0.0593135024 0.067758380
59    59 0.053715627           NA 0.073071569 0.0073667466 0.046409310
60    60 0.056478247 0.1271437444 0.053621580 0.0786609692 0.052220156
61    61 0.061389184 0.0438410223 0.075909453 0.0735251264 0.072238321
62    62 0.034930296 0.0695357984 0.091415436 0.1189885808 0.072598851
63    63 0.040003778 0.0582129947 0.108987440 0.0547010896 0.042119885
64    64 0.019442739 0.0807401970 0.049148148 0.0514020058 0.040623684
65    65 0.017846263 0.0745297834          NA 0.0528975750 0.052207958
66    66 0.059105859 0.0437062049 0.050945780 0.0520929869 0.081890534
67    67 0.063446293 0.0613450332 0.056166836           NA 0.062780615
68    68 0.051590127 0.0216377351 0.045339640           NA 0.092990225
69    69 0.077668024 0.0757076903 0.067048659 0.0476738310 0.049770894
70    70 0.111502541 0.0361688498 0.080320339 0.0325679786 0.083770028
71    71 0.035269065 0.1225032006 0.034460527 0.0516420958 0.076490069
72    72          NA 0.0004685331 0.041177140           NA 0.068362504
73    73 0.080172156 0.0360803827 0.061935266 0.0050390524 0.062441021
74    74 0.028723977 0.0747613959 0.033493288 0.0169554968 0.041603528
75    75 0.029359742 0.0653039764 0.052738022 0.0795817466 0.046728875
76    76 0.080767141 0.0323155688          NA 0.0170452998 0.056881865
77    77 0.041456810 0.0200965777 0.016403008 0.0260145814 0.051466667
78    78 0.013378469 0.0543342711 0.010167336 0.0523962146 0.078296734
79    79 0.055439104 0.0495707776 0.024391289 0.0403176091 0.046720486
80    80 0.045833259           NA 0.029200864 0.0543925154 0.047888692
81    81 0.050172926 0.0510365320 0.061469154 0.1191518595 0.035470427
82    82 0.061558412 0.0557069095 0.079463390 0.0162618899 0.045849910
83    83 0.038880199 0.0552417919 0.028178494 0.0408359108 0.047937031
84    84 0.069331296 0.0183494887 0.020094831 0.0344972165          NA
85    85 0.043385403 0.0642839983 0.018749334 0.0953718628 0.009055049
86    86 0.059953459 0.0913571041 0.037562338 0.0269154523 0.047825393
87    87 0.082905170 0.0636870921 0.042829128 0.0475373929 0.042041487
88    88 0.063055445 0.0159323459 0.064508526 0.0736140084 0.013973920
89    89 0.040222052 0.0369306359 0.040360255 0.0182422839          NA
90    90 0.084464229 0.0603831086          NA 0.0996552745 0.039368923
91    91 0.079805116 0.0305886311 0.047256972 0.0702728725 0.069604873
92    92 0.066451909           NA 0.085615604 0.0177738017 0.103197159
93    93 0.057161952 0.0765275246 0.085748038 0.0636373343 0.048846296
94    94 0.031162818 0.0251156717 0.026331103 0.0436007857 0.094795545
95    95 0.090819573 0.0327931919 0.003566704 0.0593968632 0.052490665
96    96 0.031992212 0.0951170183 0.123741815 0.0473007441 0.053465963
97    97 0.115619990 0.0267756521 0.045127342 0.0821154811 0.059744759
98    98 0.095978319 0.0753719462 0.047076462 0.0094669884 0.023882683
99    99 0.042928989 0.0121795136 0.062617226 0.0343214991 0.048448454
100  100 0.019207373 0.0393637279 0.001578816 0.0425242797 0.077253431
101  101 0.028687803 0.0477933194 0.028153427 0.0201260383 0.027508226
102  102 0.057706511 0.0149404573 0.003786728 0.0188013487 0.040351818
103  103 0.042599244 0.0309575521 0.029207162 0.0494605928 0.015566877
104  104 0.039573722 0.0491347534 0.053565483 0.0460347460 0.060630566
105  105 0.021451443 0.0701208791 0.009058716           NA 0.062743993
106  106 0.048649168 0.0004836037 0.067699480 0.0812172037 0.069450421
107  107 0.026452866 0.0395073728 0.058680321 0.0574917721 0.013405699
108  108          NA 0.0726921932 0.022873549 0.1224862212 0.053217051
109  109 0.038593204 0.0338357252 0.056789748 0.0705559471 0.021678269
110  110 0.077569898 0.0568187577 0.072442435 0.0365912207 0.049988461
111  111 0.032739591 0.0647668571 0.081832858 0.1339217344 0.090278718
112  112 0.068238930 0.0580350505 0.043614552 0.1349667807 0.034894241
113  113 0.001463519 0.0695977304 0.047190896 0.0134386455 0.071500500
114  114 0.048333141 0.0463187402 0.047398576 0.0640709587 0.027509942
115  115 0.065582216 0.0375897046 0.093243853 0.0436625924 0.035644154
116  116 0.059034601           NA 0.083752157 0.0556115344 0.063161653
117  117 0.053170286 0.0472117694 0.075032047 0.0568262819 0.029626632
118  118 0.030778820 0.0629085409 0.041379776 0.0121429861          NA
119  119 0.024508870 0.0660619652 0.061197243 0.0585676874 0.087955053
120  120 0.019276136 0.0333416495 0.062098710 0.1024774209 0.060810717
121  121 0.053529398 0.1033850873 0.018749801 0.0450772999 0.032490817
122  122 0.021575762 0.0585927326          NA 0.0451121987          NA
123  123 0.035283277 0.0537894758 0.069254901 0.0919571602 0.107066293
124  124 0.042317234 0.0881680034 0.004120684 0.0769518872 0.151711125
125  125 0.105315860 0.0284460134 0.050050511 0.0005451554 0.056224412
126  126 0.030441503 0.0364898413 0.057507435 0.0568567092 0.075494199
127  127 0.057061597 0.1219235744 0.066916022 0.0996064170 0.086736809
128  128 0.052338825 0.0503338756 0.055682787 0.0924582905 0.028945867
129  129 0.021144301 0.0990070526 0.028014386 0.0625985481 0.039464113
130  130 0.047860757 0.0068448007 0.079590976 0.0716366242          NA
131  131 0.093336526 0.0442844959 0.102159013 0.0140919436 0.027902653
132  132 0.063545122 0.0613527171 0.076435364 0.0590039470 0.068672293
133  133 0.051236988 0.0590011564          NA 0.0213665319 0.041278520
134  134 0.037325095 0.0198309122 0.091987286 0.0362594580 0.043573654
135  135          NA 0.0505777782 0.048318322 0.0780681105 0.046623213
136  136 0.083940116 0.0176773804 0.065747428 0.0158932066          NA
137  137 0.006180798 0.0713810998 0.068660997 0.0580075475 0.075128898
138  138 0.072198425 0.0825432527 0.047099418 0.0628499612 0.006695213
139  139 0.107273107           NA 0.047742104 0.0516473591 0.043742895
140  140 0.006683205 0.0870708039 0.080574712 0.1046656647 0.036843096
141  141 0.071053530 0.0127686651 0.071348058 0.0193295801 0.043442185
142  142 0.042134075 0.0636430781 0.079707867 0.0681839078 0.093798978
143  143 0.002835675 0.0697970791 0.121487801 0.0473320830 0.032538202
144  144 0.004559970 0.0440033052 0.069932476 0.0421750327 0.026507072
145  145 0.001953915 0.0306465813 0.056221435 0.0639227370 0.004410380
146  146 0.034072804 0.0549596306          NA 0.0193879823 0.025829058
147  147 0.006147332 0.0631645610 0.130751420 0.0105964725 0.015014459
148  148 0.070637503 0.0764990846 0.035519695 0.0351655737 0.062238386
149  149 0.113003268           NA 0.121242041 0.1025527145 0.024109873
150  150 0.011389086 0.0009086220 0.061239307 0.0516729432 0.059121261
151  151 0.073632165 0.0929120702 0.096152906 0.0599430319 0.045607175
152  152 0.073071267 0.0813988654 0.046708690 0.0443046009 0.006993135
153  153 0.059966077 0.0630586685 0.065344123 0.0641147820 0.026281766
154  154 0.019748702 0.0714553522 0.056418739 0.0214496137 0.076553374
155  155 0.046416422 0.0775152475 0.044416379 0.0847373140 0.077092283
156  156 0.041588140           NA 0.046388185 0.0675411578 0.110167198
157  157 0.066889686 0.0833083129 0.080385030 0.0258064153 0.049892591
158  158 0.038826837 0.0354503721 0.043956256 0.0516365974 0.005125196
159  159 0.079309202 0.0569185049          NA 0.0714899486 0.026947489
160  160 0.038762574 0.0411452660 0.044123323 0.0667319295 0.062254655
161  161 0.081581344 0.0761589486 0.066193718 0.0944580205 0.107004090
162  162 0.018524690 0.0395458265 0.068493671 0.0316103674 0.053300274
163  163 0.012195343 0.0655551130 0.068497035 0.0834840985 0.084211605
164  164 0.147231198 0.0382794506          NA 0.0810964403 0.073042439
165  165 0.037494272 0.0172163837 0.061062262 0.0451255060 0.014957251
166  166 0.058946828 0.0863003153 0.079035776 0.0207221992 0.044866620
167  167 0.069097090 0.0722270003 0.088297360 0.0173256443 0.089157846
168  168 0.035486581 0.1017278672 0.043251162 0.0637336087 0.076282883
169  169 0.065505861 0.0519546180 0.040343222 0.0478661980 0.063913884
170  170 0.061068936 0.0837500824 0.094635135 0.1033730800 0.064313427
171  171 0.043538585 0.1092625716          NA 0.0660541388 0.035257841
172  172 0.051958791 0.0415555365 0.036895101 0.0388416537 0.010418441
173  173 0.048977982 0.0103114666 0.063723862 0.0192337325 0.088862774
174  174 0.113853557 0.0428194530 0.001466787 0.0325279498 0.007393415
175  175 0.027759917 0.0435787628 0.058388836 0.0602866518 0.021833122
176  176 0.017120112 0.0545504151 0.106335921 0.0364719606 0.068868950
177  177 0.051133652 0.1013691493 0.049878180 0.0654269036 0.012134164
178  178 0.059314422 0.0402156832 0.041646379 0.0399698584 0.033443888
179  179 0.063095704 0.0611901397 0.064247351 0.0468332027 0.014516015
180  180 0.036249040 0.0431694781 0.041627835 0.0280847098 0.068619907
181  181 0.018100216 0.0506135213 0.074402011 0.1071513075 0.063389390
182  182 0.087895555 0.0594217299 0.077133064 0.0599786519 0.062656541
183  183 0.039510488 0.0898464409 0.050080750 0.0569190092 0.063273943
184  184 0.024034614 0.0536395513 0.014699235           NA 0.066717372
185  185 0.042911613 0.0713852696 0.010453378 0.0697937570 0.069180695
186  186 0.044084723 0.0733658009 0.032210079 0.0192912923          NA
187  187 0.083297609 0.0774431981 0.073921415 0.0232543528 0.045535509
188  188 0.052542119 0.0327681634          NA 0.0775502351 0.053373914
189  189 0.072621614 0.0988064364          NA 0.0364189806 0.071740286
190  190 0.035021239 0.0385712978 0.030386605           NA 0.014375418
191  191 0.056433359 0.0468264750 0.061831845 0.1030971233 0.035011994
192  192 0.040259423 0.0921215080 0.022593019           NA 0.017790710
193  193 0.052837506 0.0888225172 0.076602471 0.0671843459 0.081717206
194  194 0.023139099 0.0173002438 0.060001099 0.0805174775 0.088372177
195  195 0.010675954 0.0238078700 0.044880811 0.0310709640 0.073630302
196  196 0.109916402 0.0092576282 0.074564844 0.0633286115 0.013327899
197  197 0.068021265 0.0554554158 0.061650955 0.0631739117 0.063558564
198  198 0.012461859 0.0549452260 0.036621949 0.0812186946 0.084513476
199  199 0.031665022 0.0609234406 0.056933448 0.0645229816 0.055038229
200  200 0.014435597 0.0665647314 0.069425401 0.0426534866 0.033016720
201  201 0.115964310 0.0319432146 0.060688500 0.0774797617 0.017416453
202  202 0.089372389 0.0201890423 0.030259694 0.0740186707 0.030040916
203  203 0.042045648 0.0808035517 0.075656066 0.0219029290 0.071445451
204  204 0.066295822 0.0725318391 0.084588087 0.0079763770 0.037050167
205  205 0.037569802 0.0047250039 0.058288237 0.0548083262 0.056828448
206  206 0.035712593 0.0471455765 0.054323140 0.0417811288 0.088848374
207  207 0.026341915 0.0231215653 0.047731248 0.0204338266 0.067350048
208  208 0.032161482           NA 0.114842475 0.0525179204 0.090940183
209  209 0.099527224 0.0545036039 0.058289466 0.0104001042          NA
210  210 0.048379156 0.0476236487 0.045251179 0.0548367905 0.041579712
211  211 0.053577357 0.0470789220          NA 0.0312521484 0.051952041
212  212 0.057310623 0.0564845763 0.003041547 0.0787149282 0.067357679
213  213 0.086974276 0.0764739549 0.047669804 0.1227346742 0.014923801
214  214 0.034518085 0.0561679251 0.056188821 0.0225206227 0.074185546
215  215 0.020224785 0.0315069247 0.058306174 0.0817299251 0.059221702
216  216 0.100270908 0.0279560224 0.074645203 0.0747544918 0.057914180
217  217 0.036765103 0.0460459162 0.044175428 0.0478941733 0.065254544
218  218 0.028308021 0.0593005096 0.086437664 0.0363906088 0.046509247
219  219 0.012911806 0.0188095894 0.022354519 0.0972592312 0.077766383
220  220 0.011458528 0.0444707339 0.013746718           NA 0.069446893
221  221 0.032780796 0.0790180178 0.013130415 0.0307041563 0.045493719
222  222 0.068539575 0.0467515973 0.072268911 0.0068946967 0.081211311
223  223 0.083295444 0.0290473800 0.047512402 0.0918594032 0.058776761
224  224 0.071227651 0.0417216449 0.073694538 0.0442788970 0.070062542
225  225 0.039090281 0.0834394564 0.041968807 0.0342598640 0.032174671
226  226 0.051792498 0.0665013188 0.032243237 0.1455213342 0.097412955
227  227 0.028862106 0.0871002740 0.038949423 0.0484988820 0.049880332
228  228 0.028483455 0.0541729357          NA 0.0366875206 0.075435283
229  229 0.076539515 0.0623082529 0.014911542 0.0589959575 0.046996504
230  230 0.019532223 0.0332462926 0.006738961 0.0029472614 0.041611103
231  231 0.108658819 0.0681611201 0.081629668 0.0647090793 0.073533147
232  232 0.047290412 0.0348099937 0.032080097 0.0471151040 0.002461501
233  233 0.056436165 0.0073830349 0.073683796 0.0640557537 0.064350984
234  234 0.027844169 0.0538397890 0.095494718 0.0205288809 0.061806991
235  235 0.032768339 0.1083755365 0.044246756 0.0193104847          NA
236  236 0.010489516 0.0740274302 0.058516367 0.0291975601 0.061051320
237  237 0.044512238 0.0849576017          NA 0.0269603128          NA
238  238 0.062569472 0.0607656717 0.025439907 0.0889714990 0.069794131
239  239 0.059729130 0.0317432847 0.051686445 0.0973743669 0.036382588
240  240 0.026553905 0.0439327743 0.058972607 0.0452932414 0.029151895
241  241 0.026341341 0.0418025568 0.027218056 0.0392319032 0.049794611
242  242 0.034934038 0.0359390066 0.130545770 0.0401288351 0.091191561
243  243 0.094881820 0.0711250185 0.036248296 0.0520770943 0.030940308
244  244 0.015880891 0.0140790949 0.051927307 0.0529071270 0.066743099
245  245 0.044628452 0.0759909840 0.069493756 0.0587010316 0.060234736
246  246 0.107070855 0.0759245746 0.049219441 0.0275996318 0.014614441
247  247 0.046970753 0.0140413292 0.030692978 0.0245931083          NA
248  248 0.009204779 0.0691847599 0.081359170 0.0859123299          NA
249  249 0.030056917 0.1229067996 0.098466360 0.0335411792 0.066538226
250  250 0.064563799 0.0332835355 0.049109181 0.0590913709 0.048957738
251  251 0.038731914 0.0753471272 0.066868020 0.0482908840 0.105517151
252  252 0.033143709 0.0265339446 0.047077625 0.0212645182 0.067210253
253  253 0.039682483 0.0833213425 0.080493657 0.0677318573 0.075490877
254  254 0.052714899 0.0574947416 0.015314978 0.0551931462 0.090031508
255  255 0.097955263 0.0995574617 0.119625807 0.0919935007 0.034978427
256  256 0.047343047 0.0062308781 0.031894063 0.0535237875 0.065302938
257  257 0.082423985 0.0484610634 0.006234518 0.0400536273 0.076063798
258  258 0.068922623 0.0341922447 0.039472465 0.0583488474 0.091080551
259  259 0.046590803 0.0440820539 0.054401255 0.0144322505 0.072879534
260  260 0.004012940 0.0311126377 0.098708636 0.0249231784 0.062634415
261  261 0.034366480 0.0249846926 0.077336290 0.0653081975 0.023953279
262  262 0.035303886 0.0673616712 0.054273753 0.0400063730 0.071886811
263  263 0.051414633 0.0173725786 0.008315494 0.0480211716 0.065007976
264  264 0.089005960 0.0945209279 0.024018868 0.0465433487 0.069027508
265  265 0.118792369 0.0144138024 0.045101452 0.0304846214 0.062709351
266  266 0.096427432 0.0530323745 0.126590783           NA 0.043944859
267  267 0.046005471 0.0659896786          NA 0.0604650491 0.047694023
268  268          NA 0.0676020602 0.083931640 0.0728491852 0.070620923
269  269 0.038336604 0.0409476001 0.034182972 0.0113385129 0.055148945
270  270 0.052676217 0.0523850599 0.099979727 0.0944720816 0.025096743
271  271 0.075350390 0.0788379246 0.015823981 0.0615546447 0.041295226
272  272 0.078875839 0.0063060225 0.054308697 0.0902492088 0.010426228
273  273 0.070529283 0.0265478087 0.017013472 0.0212848859 0.020989043
274  274 0.008141770 0.0596120694 0.077105493 0.0550034387 0.045661668
275  275 0.075489291 0.0366565406 0.094513385 0.0469995812          NA
276  276 0.036603284 0.0911001198 0.108521630 0.0730552230          NA
277  277 0.055244081 0.0701976159 0.073928020 0.0327242130 0.083076956
278  278 0.052236535 0.0521650026 0.105297988 0.0496970698 0.032701432
279  279 0.062845003 0.0047672805 0.087392717           NA          NA
280  280 0.050740249 0.0507830068 0.046043753 0.0266713567 0.046614103
281  281          NA 0.0405075240 0.064311117 0.0537510165 0.089632078
282  282 0.072094879 0.0469296046 0.020840174 0.0288103552 0.069867629
283  283 0.061580797 0.0145532232 0.044443935 0.0486929154 0.063241496
284  284 0.042030451 0.0649597413 0.086628911 0.0359622208 0.085512377
285  285 0.053544335 0.0188313068 0.066238524 0.0682079041 0.026854957
286  286 0.054021159 0.0432133406 0.063720720 0.0850546493 0.071890676
287  287 0.056630584 0.0614427749 0.018856069 0.0253249576 0.032387432
288  288 0.099225385 0.0264945263 0.031864603 0.0407889032 0.050022926
289  289 0.043428489 0.0674897422 0.027061820 0.0931928378 0.116433960
290  290 0.055041962 0.0105046880 0.061858876           NA 0.079083032
291  291 0.085051516           NA 0.020284771 0.0404048662 0.073040231
292  292 0.081625431 0.0639490397 0.066861242 0.1119411285 0.016750163
293  293 0.084357893 0.0752161948 0.016507508 0.1158077021 0.026412922
294  294 0.032675960 0.0414246373 0.104855914 0.0546978597 0.118523494
295  295 0.110074482 0.0651237876 0.063817740 0.0240917315 0.017200977
296  296 0.052001026 0.0153225042 0.028969892 0.0549637226 0.056434381
297  297 0.106005555 0.0461855418 0.057231378 0.0304167681 0.076777132
298  298 0.009472919           NA 0.039426404 0.0935845185 0.080562739
299  299 0.050629508 0.0854354267 0.061134439 0.0258055203 0.082673360
300  300 0.087497437 0.1057973258 0.057305982 0.0611873479 0.045106130

Create a subset of the data that keeps the fund identifier and the returns for 2015. Rename the returns column to investment_returns. Store your answer in fp_15.

# feel free to use this cell if you need it
fp_15 <- fund_performance %>%
          ...(fund, ...)

answer_2 <- fp_15

test_2()
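
If you have not combined selecting and renaming before, here is a minimal sketch on a made-up data frame. The names toy_funds, Y2019, Y2020 and yearly_return are hypothetical and are not part of this notebook's data; the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data, not the notebook's fund_performance
toy_funds <- data.frame(
  fund  = 1:4,
  Y2019 = c(0.02, 0.05, NA, 0.01),
  Y2020 = c(0.03, NA, 0.07, 0.04)
)

# select() keeps only the listed columns;
# writing new_name = old_name renames a column while selecting it
toy_20 <- toy_funds %>%
  select(fund, yearly_return = Y2020)

names(toy_20)
```

Note that the missing value in Y2020 is carried over unchanged: selecting and renaming does not drop any rows.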

Calculate the mean and median return of funds in 2015. Store your answers in mean_ret and median_ret, respectively.

# feel free to use this cell if you need it
mean_ret <- ...(...)

answer_3 <- mean_ret

test_3()

median_ret <- ...(...)

answer_4 <- median_ret

test_4()
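
One thing to keep in mind when summarizing a column that contains NAs: by default, R's mean() and median() return NA if any value is missing. The sketch below illustrates this on a made-up vector (toy_returns is hypothetical, not the notebook's data). Whether you should drop missing values in the exercise above is for you to decide.

```r
# Hypothetical toy returns with one missing value
toy_returns <- c(0.03, NA, 0.07, 0.04, 0.10)

# With the default na.rm = FALSE, a single NA makes the result NA
mean_with_na <- mean(toy_returns)

# na.rm = TRUE drops missing values before computing the statistic
toy_mean   <- mean(toy_returns, na.rm = TRUE)
toy_median <- median(toy_returns, na.rm = TRUE)

print(mean_with_na)
print(toy_mean)
print(toy_median)
```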

Let’s suppose the market return (the average return of the investments available) was 5%. Run a two-sided one-sample t-test at the 95% confidence level on the returns in fp_15 to find out whether the funds' mean return differs from the market return. Complete the code below.

t_stat <- ...(
       ...,
       mu = ...,
       alternative = "two.sided",
       conf.level = ...)

answer_5 <- t_stat$conf.int

test_5()
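
To see how a confidence interval is read against a benchmark, here is a minimal sketch of a one-sample t-test on made-up data. The vector toy_returns, the benchmark of 3% and the 90% confidence level are all assumptions chosen for illustration; they are deliberately different from the values in the exercise.

```r
# Hypothetical toy returns, not the notebook's data
toy_returns <- c(0.062, 0.048, 0.071, 0.055, 0.066, 0.059, 0.043, 0.068)
benchmark   <- 0.03   # assumed benchmark for this illustration only

# One-sample, two-sided t-test of H0: mean(toy_returns) == benchmark
toy_test <- t.test(toy_returns,
                   mu = benchmark,
                   alternative = "two.sided",
                   conf.level = 0.90)

ci <- toy_test$conf.int

# For a two-sided test, the benchmark lies outside the interval
# exactly when the p-value is below 1 - conf.level
benchmark_outside <- benchmark < ci[1] || benchmark > ci[2]

print(ci)
print(toy_test$p.value)
print(benchmark_outside)
```

If the benchmark falls outside the interval, the direction of the difference can be read from which side of the interval it lies on.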

Do we have statistical evidence to believe that the funds outperformed the market?

  A. Yes
  B. No

# enter your answer as either "A" or "B"
answer_6 <- ""

test_6()

But wait! Do you notice anything interesting about this dataset? Investigate the dataset with special attention to the NAs. Do you notice a pattern?
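
Scanning hundreds of rows by eye is hard, so it can help to count the missing values instead. The sketch below shows one possible way to do this on a made-up data frame (toy_funds and its columns are hypothetical); it is a starting point for your own investigation, not the answer to the question above.

```r
# Hypothetical toy data, not the notebook's fund_performance
toy_funds <- data.frame(
  fund  = 1:5,
  Y2019 = c(0.02, 0.05, NA, 0.01, 0.04),
  Y2020 = c(0.03, NA, NA, 0.04, 0.06)
)

# is.na() returns a logical matrix; colSums() counts the TRUEs per column
na_counts <- colSums(is.na(toy_funds))

# colMeans() gives the share of missing values per column
na_share <- colMeans(is.na(toy_funds))

# Compare an earlier year's mean return between rows whose later year
# is missing (TRUE) and rows whose later year is observed (FALSE)
by_missing <- tapply(toy_funds$Y2019, is.na(toy_funds$Y2020), mean, na.rm = TRUE)

print(na_counts)
print(na_share)
print(by_missing)
```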

fund_performance
    fund       Y2010        Y2011       Y2013        Y2014       Y2015
1      1 0.033185731 0.0285427344 0.082220368 0.0195765748 0.068595502
2      2 0.043094675 0.0274193310 0.049179591 0.0262605836 0.027274695
3      3 0.096761249 0.0218438389 0.049000090 0.0589878105 0.075545740
4      4 0.052115252 0.0184246016 0.004517971 0.0991715573 0.027562101
5      5 0.053878632 0.0368852140 0.073711560 0.0825385103 0.068907195
6      6 0.101451950 0.0599353752 0.043677975 0.0312629758 0.082899849
7      7 0.063827486           NA 0.030297712 0.0747776870 0.020346712
8      8 0.012048163 0.0563594130 0.007639226 0.0485429494 0.083239851
9      9 0.029394414 0.0871002514 0.041007125 0.0590394096 0.035314014
10    10 0.036630141 0.1111272205 0.024528166 0.0578108447 0.058830602
11    11 0.086722454 0.0890352798 0.038089084 0.1272634929 0.056055124
12    12 0.060794415 0.0727032429 0.013472000 0.0144413357 0.037184108
13    13 0.062023144           NA 0.100627684 0.0530275958 0.058043086
14    14 0.053320481 0.0319547988 0.049519924           NA 0.013087072
15    15 0.033324766 0.0394386063 0.082248352 0.0676950777 0.045915894
16    16 0.103607394 0.0711057171          NA 0.0828982542 0.074773725
17    17 0.064935514 0.0468298600 0.036404065 0.0933698672          NA
18    18          NA 0.0122405412 0.029735531           NA 0.005362214
19    19 0.071040677 0.1005330712 0.013312215 0.0623830849 0.015141873
20    20 0.035816258 0.0773417388 0.096398275 0.0978010985 0.002327309
21    21 0.017965289 0.0571229082 0.007541542 0.0375795241 0.062587491
22    22 0.043460753 0.0865432583 0.059551708 0.0436354840 0.020212150
23    23 0.019219867 0.0098367714 0.075393089 0.0489038833          NA
24    24 0.028133263 0.0698246089 0.055345706 0.0609505625 0.030872937
25    25 0.031248822 0.0343126287 0.023742336 0.0699547963 0.038280942
26    26          NA 0.0705123657 0.078234974 0.0895346265 0.075703564
27    27 0.075133611 0.0481753414 0.055117642 0.0471353723 0.016887436
28    28 0.054601194 0.0689888214 0.018095063 0.0558883414 0.084838678
29    29 0.015855892 0.0900655285 0.008358529 0.1246399363 0.061950881
30    30 0.087614448 0.0502187027 0.112601523 0.0629329678 0.060870565
31    31 0.062793927 0.0805267591 0.029644906 0.0556625933 0.024424230
32    32 0.041147856 0.0143469789          NA 0.0097327062 0.108610036
33    33 0.076853770 0.0283518668 0.065997781 0.0500856814 0.045071875
34    34 0.076344005 0.0955765313 0.059306908 0.0433602154          NA
35    35 0.074647432 0.0613216392 0.009384970 0.0496686251 0.043884306
36    36 0.070659208           NA          NA 0.0327374708          NA
37    37 0.066617530 0.0090788764 0.046510924 0.0293955304 0.040684697
38    38 0.048142649 0.0439765695 0.084181889 0.0283767910 0.037333190
39    39 0.040821120 0.0759733821 0.069083721 0.0435648645 0.070454891
40    40 0.038585870 0.0469435023 0.035211877 0.0910439794 0.080284886
41    41 0.029158791 0.0687256242 0.024974353 0.0814725988 0.028216851
42    42 0.043762482 0.0787701613 0.058132003 0.0392007465 0.074183266
43    43 0.012038109 0.1001316449 0.054720600           NA 0.092729693
44    44 0.115068679 0.0516805020 0.068891352 0.0246624971 0.026475680
45    45 0.086238860 0.0484405428 0.038126061 0.0362671840 0.030427869
46    46 0.016306743           NA 0.076980622 0.0531091401 0.069523351
47    47 0.037913455 0.0529798278 0.025075654 0.0301217817 0.055491439
48    48 0.036000339 0.0328444983 0.040083659 0.1102004207 0.066463249
49    49 0.073398954 0.0207797125 0.072224436 0.0418319740 0.092140529
50    50 0.047498928 0.0446028131 0.079699148 0.0135816659 0.061612494
51    51 0.057599555 0.0804482952          NA 0.0457621473 0.081551038
52    52 0.049143597           NA 0.053215712 0.0198386726 0.068687164
53    53 0.048713886 0.0371816214 0.068263370 0.0546846712 0.063008612
54    54 0.091058069 0.0534991185 0.006475271 0.0570090084 0.061582533
55    55 0.043226870 0.0232037729 0.064418768 0.0606676283 0.088739699
56    56 0.095494118 0.0600170883 0.025154772 0.0013442522 0.019932204
57    57 0.003537416 0.0623428976 0.080607590 0.0566213387 0.016844518
58    58 0.067538412 0.0490089152 0.066154461 0.0593135024 0.067758380
59    59 0.053715627           NA 0.073071569 0.0073667466 0.046409310
60    60 0.056478247 0.1271437444 0.053621580 0.0786609692 0.052220156
61    61 0.061389184 0.0438410223 0.075909453 0.0735251264 0.072238321
62    62 0.034930296 0.0695357984 0.091415436 0.1189885808 0.072598851
63    63 0.040003778 0.0582129947 0.108987440 0.0547010896 0.042119885
64    64 0.019442739 0.0807401970 0.049148148 0.0514020058 0.040623684
65    65 0.017846263 0.0745297834          NA 0.0528975750 0.052207958
66    66 0.059105859 0.0437062049 0.050945780 0.0520929869 0.081890534
67    67 0.063446293 0.0613450332 0.056166836           NA 0.062780615
68    68 0.051590127 0.0216377351 0.045339640           NA 0.092990225
69    69 0.077668024 0.0757076903 0.067048659 0.0476738310 0.049770894
70    70 0.111502541 0.0361688498 0.080320339 0.0325679786 0.083770028
71    71 0.035269065 0.1225032006 0.034460527 0.0516420958 0.076490069
72    72          NA 0.0004685331 0.041177140           NA 0.068362504
73    73 0.080172156 0.0360803827 0.061935266 0.0050390524 0.062441021
74    74 0.028723977 0.0747613959 0.033493288 0.0169554968 0.041603528
75    75 0.029359742 0.0653039764 0.052738022 0.0795817466 0.046728875
76    76 0.080767141 0.0323155688          NA 0.0170452998 0.056881865
77    77 0.041456810 0.0200965777 0.016403008 0.0260145814 0.051466667
78    78 0.013378469 0.0543342711 0.010167336 0.0523962146 0.078296734
79    79 0.055439104 0.0495707776 0.024391289 0.0403176091 0.046720486
80    80 0.045833259           NA 0.029200864 0.0543925154 0.047888692
81    81 0.050172926 0.0510365320 0.061469154 0.1191518595 0.035470427
82    82 0.061558412 0.0557069095 0.079463390 0.0162618899 0.045849910
83    83 0.038880199 0.0552417919 0.028178494 0.0408359108 0.047937031
84    84 0.069331296 0.0183494887 0.020094831 0.0344972165          NA
85    85 0.043385403 0.0642839983 0.018749334 0.0953718628 0.009055049
86    86 0.059953459 0.0913571041 0.037562338 0.0269154523 0.047825393
87    87 0.082905170 0.0636870921 0.042829128 0.0475373929 0.042041487
88    88 0.063055445 0.0159323459 0.064508526 0.0736140084 0.013973920
89    89 0.040222052 0.0369306359 0.040360255 0.0182422839          NA
90    90 0.084464229 0.0603831086          NA 0.0996552745 0.039368923
91    91 0.079805116 0.0305886311 0.047256972 0.0702728725 0.069604873
92    92 0.066451909           NA 0.085615604 0.0177738017 0.103197159
93    93 0.057161952 0.0765275246 0.085748038 0.0636373343 0.048846296
94    94 0.031162818 0.0251156717 0.026331103 0.0436007857 0.094795545
95    95 0.090819573 0.0327931919 0.003566704 0.0593968632 0.052490665
96    96 0.031992212 0.0951170183 0.123741815 0.0473007441 0.053465963
97    97 0.115619990 0.0267756521 0.045127342 0.0821154811 0.059744759
98    98 0.095978319 0.0753719462 0.047076462 0.0094669884 0.023882683
99    99 0.042928989 0.0121795136 0.062617226 0.0343214991 0.048448454
100  100 0.019207373 0.0393637279 0.001578816 0.0425242797 0.077253431
101  101 0.028687803 0.0477933194 0.028153427 0.0201260383 0.027508226
102  102 0.057706511 0.0149404573 0.003786728 0.0188013487 0.040351818
103  103 0.042599244 0.0309575521 0.029207162 0.0494605928 0.015566877
104  104 0.039573722 0.0491347534 0.053565483 0.0460347460 0.060630566
105  105 0.021451443 0.0701208791 0.009058716           NA 0.062743993
106  106 0.048649168 0.0004836037 0.067699480 0.0812172037 0.069450421
107  107 0.026452866 0.0395073728 0.058680321 0.0574917721 0.013405699
108  108          NA 0.0726921932 0.022873549 0.1224862212 0.053217051
109  109 0.038593204 0.0338357252 0.056789748 0.0705559471 0.021678269
110  110 0.077569898 0.0568187577 0.072442435 0.0365912207 0.049988461
111  111 0.032739591 0.0647668571 0.081832858 0.1339217344 0.090278718
112  112 0.068238930 0.0580350505 0.043614552 0.1349667807 0.034894241
113  113 0.001463519 0.0695977304 0.047190896 0.0134386455 0.071500500
114  114 0.048333141 0.0463187402 0.047398576 0.0640709587 0.027509942
115  115 0.065582216 0.0375897046 0.093243853 0.0436625924 0.035644154
116  116 0.059034601           NA 0.083752157 0.0556115344 0.063161653
117  117 0.053170286 0.0472117694 0.075032047 0.0568262819 0.029626632
118  118 0.030778820 0.0629085409 0.041379776 0.0121429861          NA
119  119 0.024508870 0.0660619652 0.061197243 0.0585676874 0.087955053
120  120 0.019276136 0.0333416495 0.062098710 0.1024774209 0.060810717
121  121 0.053529398 0.1033850873 0.018749801 0.0450772999 0.032490817
122  122 0.021575762 0.0585927326          NA 0.0451121987          NA
123  123 0.035283277 0.0537894758 0.069254901 0.0919571602 0.107066293
124  124 0.042317234 0.0881680034 0.004120684 0.0769518872 0.151711125
125  125 0.105315860 0.0284460134 0.050050511 0.0005451554 0.056224412
126  126 0.030441503 0.0364898413 0.057507435 0.0568567092 0.075494199
127  127 0.057061597 0.1219235744 0.066916022 0.0996064170 0.086736809
128  128 0.052338825 0.0503338756 0.055682787 0.0924582905 0.028945867
129  129 0.021144301 0.0990070526 0.028014386 0.0625985481 0.039464113
130  130 0.047860757 0.0068448007 0.079590976 0.0716366242          NA
131  131 0.093336526 0.0442844959 0.102159013 0.0140919436 0.027902653
132  132 0.063545122 0.0613527171 0.076435364 0.0590039470 0.068672293
133  133 0.051236988 0.0590011564          NA 0.0213665319 0.041278520
134  134 0.037325095 0.0198309122 0.091987286 0.0362594580 0.043573654
135  135          NA 0.0505777782 0.048318322 0.0780681105 0.046623213
136  136 0.083940116 0.0176773804 0.065747428 0.0158932066          NA
137  137 0.006180798 0.0713810998 0.068660997 0.0580075475 0.075128898
138  138 0.072198425 0.0825432527 0.047099418 0.0628499612 0.006695213
139  139 0.107273107           NA 0.047742104 0.0516473591 0.043742895
140  140 0.006683205 0.0870708039 0.080574712 0.1046656647 0.036843096
141  141 0.071053530 0.0127686651 0.071348058 0.0193295801 0.043442185
142  142 0.042134075 0.0636430781 0.079707867 0.0681839078 0.093798978
143  143 0.002835675 0.0697970791 0.121487801 0.0473320830 0.032538202
144  144 0.004559970 0.0440033052 0.069932476 0.0421750327 0.026507072
145  145 0.001953915 0.0306465813 0.056221435 0.0639227370 0.004410380
146  146 0.034072804 0.0549596306          NA 0.0193879823 0.025829058
147  147 0.006147332 0.0631645610 0.130751420 0.0105964725 0.015014459
148  148 0.070637503 0.0764990846 0.035519695 0.0351655737 0.062238386
149  149 0.113003268           NA 0.121242041 0.1025527145 0.024109873
150  150 0.011389086 0.0009086220 0.061239307 0.0516729432 0.059121261
151  151 0.073632165 0.0929120702 0.096152906 0.0599430319 0.045607175
152  152 0.073071267 0.0813988654 0.046708690 0.0443046009 0.006993135
153  153 0.059966077 0.0630586685 0.065344123 0.0641147820 0.026281766
154  154 0.019748702 0.0714553522 0.056418739 0.0214496137 0.076553374
155  155 0.046416422 0.0775152475 0.044416379 0.0847373140 0.077092283
156  156 0.041588140           NA 0.046388185 0.0675411578 0.110167198
157  157 0.066889686 0.0833083129 0.080385030 0.0258064153 0.049892591
158  158 0.038826837 0.0354503721 0.043956256 0.0516365974 0.005125196
159  159 0.079309202 0.0569185049          NA 0.0714899486 0.026947489
160  160 0.038762574 0.0411452660 0.044123323 0.0667319295 0.062254655
161  161 0.081581344 0.0761589486 0.066193718 0.0944580205 0.107004090
162  162 0.018524690 0.0395458265 0.068493671 0.0316103674 0.053300274
163  163 0.012195343 0.0655551130 0.068497035 0.0834840985 0.084211605
164  164 0.147231198 0.0382794506          NA 0.0810964403 0.073042439
165  165 0.037494272 0.0172163837 0.061062262 0.0451255060 0.014957251
166  166 0.058946828 0.0863003153 0.079035776 0.0207221992 0.044866620
167  167 0.069097090 0.0722270003 0.088297360 0.0173256443 0.089157846
168  168 0.035486581 0.1017278672 0.043251162 0.0637336087 0.076282883
169  169 0.065505861 0.0519546180 0.040343222 0.0478661980 0.063913884
170  170 0.061068936 0.0837500824 0.094635135 0.1033730800 0.064313427
171  171 0.043538585 0.1092625716          NA 0.0660541388 0.035257841
172  172 0.051958791 0.0415555365 0.036895101 0.0388416537 0.010418441
173  173 0.048977982 0.0103114666 0.063723862 0.0192337325 0.088862774
174  174 0.113853557 0.0428194530 0.001466787 0.0325279498 0.007393415
175  175 0.027759917 0.0435787628 0.058388836 0.0602866518 0.021833122
176  176 0.017120112 0.0545504151 0.106335921 0.0364719606 0.068868950
177  177 0.051133652 0.1013691493 0.049878180 0.0654269036 0.012134164
178  178 0.059314422 0.0402156832 0.041646379 0.0399698584 0.033443888
179  179 0.063095704 0.0611901397 0.064247351 0.0468332027 0.014516015
180  180 0.036249040 0.0431694781 0.041627835 0.0280847098 0.068619907
181  181 0.018100216 0.0506135213 0.074402011 0.1071513075 0.063389390
182  182 0.087895555 0.0594217299 0.077133064 0.0599786519 0.062656541
183  183 0.039510488 0.0898464409 0.050080750 0.0569190092 0.063273943
184  184 0.024034614 0.0536395513 0.014699235           NA 0.066717372
185  185 0.042911613 0.0713852696 0.010453378 0.0697937570 0.069180695
186  186 0.044084723 0.0733658009 0.032210079 0.0192912923          NA
187  187 0.083297609 0.0774431981 0.073921415 0.0232543528 0.045535509
188  188 0.052542119 0.0327681634          NA 0.0775502351 0.053373914
189  189 0.072621614 0.0988064364          NA 0.0364189806 0.071740286
190  190 0.035021239 0.0385712978 0.030386605           NA 0.014375418
191  191 0.056433359 0.0468264750 0.061831845 0.1030971233 0.035011994
192  192 0.040259423 0.0921215080 0.022593019           NA 0.017790710
193  193 0.052837506 0.0888225172 0.076602471 0.0671843459 0.081717206
194  194 0.023139099 0.0173002438 0.060001099 0.0805174775 0.088372177
195  195 0.010675954 0.0238078700 0.044880811 0.0310709640 0.073630302
196  196 0.109916402 0.0092576282 0.074564844 0.0633286115 0.013327899
197  197 0.068021265 0.0554554158 0.061650955 0.0631739117 0.063558564
198  198 0.012461859 0.0549452260 0.036621949 0.0812186946 0.084513476
199  199 0.031665022 0.0609234406 0.056933448 0.0645229816 0.055038229
200  200 0.014435597 0.0665647314 0.069425401 0.0426534866 0.033016720
201  201 0.115964310 0.0319432146 0.060688500 0.0774797617 0.017416453
202  202 0.089372389 0.0201890423 0.030259694 0.0740186707 0.030040916
203  203 0.042045648 0.0808035517 0.075656066 0.0219029290 0.071445451
204  204 0.066295822 0.0725318391 0.084588087 0.0079763770 0.037050167
205  205 0.037569802 0.0047250039 0.058288237 0.0548083262 0.056828448
206  206 0.035712593 0.0471455765 0.054323140 0.0417811288 0.088848374
207  207 0.026341915 0.0231215653 0.047731248 0.0204338266 0.067350048
208  208 0.032161482           NA 0.114842475 0.0525179204 0.090940183
209  209 0.099527224 0.0545036039 0.058289466 0.0104001042          NA
210  210 0.048379156 0.0476236487 0.045251179 0.0548367905 0.041579712
211  211 0.053577357 0.0470789220          NA 0.0312521484 0.051952041
212  212 0.057310623 0.0564845763 0.003041547 0.0787149282 0.067357679
213  213 0.086974276 0.0764739549 0.047669804 0.1227346742 0.014923801
214  214 0.034518085 0.0561679251 0.056188821 0.0225206227 0.074185546
215  215 0.020224785 0.0315069247 0.058306174 0.0817299251 0.059221702
216  216 0.100270908 0.0279560224 0.074645203 0.0747544918 0.057914180
217  217 0.036765103 0.0460459162 0.044175428 0.0478941733 0.065254544
218  218 0.028308021 0.0593005096 0.086437664 0.0363906088 0.046509247
219  219 0.012911806 0.0188095894 0.022354519 0.0972592312 0.077766383
220  220 0.011458528 0.0444707339 0.013746718           NA 0.069446893
221  221 0.032780796 0.0790180178 0.013130415 0.0307041563 0.045493719
222  222 0.068539575 0.0467515973 0.072268911 0.0068946967 0.081211311
223  223 0.083295444 0.0290473800 0.047512402 0.0918594032 0.058776761
224  224 0.071227651 0.0417216449 0.073694538 0.0442788970 0.070062542
225  225 0.039090281 0.0834394564 0.041968807 0.0342598640 0.032174671
226  226 0.051792498 0.0665013188 0.032243237 0.1455213342 0.097412955
227  227 0.028862106 0.0871002740 0.038949423 0.0484988820 0.049880332
228  228 0.028483455 0.0541729357          NA 0.0366875206 0.075435283
229  229 0.076539515 0.0623082529 0.014911542 0.0589959575 0.046996504
230  230 0.019532223 0.0332462926 0.006738961 0.0029472614 0.041611103
231  231 0.108658819 0.0681611201 0.081629668 0.0647090793 0.073533147
232  232 0.047290412 0.0348099937 0.032080097 0.0471151040 0.002461501
233  233 0.056436165 0.0073830349 0.073683796 0.0640557537 0.064350984
234  234 0.027844169 0.0538397890 0.095494718 0.0205288809 0.061806991
235  235 0.032768339 0.1083755365 0.044246756 0.0193104847          NA
236  236 0.010489516 0.0740274302 0.058516367 0.0291975601 0.061051320
237  237 0.044512238 0.0849576017          NA 0.0269603128          NA
238  238 0.062569472 0.0607656717 0.025439907 0.0889714990 0.069794131
239  239 0.059729130 0.0317432847 0.051686445 0.0973743669 0.036382588
240  240 0.026553905 0.0439327743 0.058972607 0.0452932414 0.029151895
241  241 0.026341341 0.0418025568 0.027218056 0.0392319032 0.049794611
242  242 0.034934038 0.0359390066 0.130545770 0.0401288351 0.091191561
243  243 0.094881820 0.0711250185 0.036248296 0.0520770943 0.030940308
244  244 0.015880891 0.0140790949 0.051927307 0.0529071270 0.066743099
245  245 0.044628452 0.0759909840 0.069493756 0.0587010316 0.060234736
246  246 0.107070855 0.0759245746 0.049219441 0.0275996318 0.014614441
247  247 0.046970753 0.0140413292 0.030692978 0.0245931083          NA
248  248 0.009204779 0.0691847599 0.081359170 0.0859123299          NA
249  249 0.030056917 0.1229067996 0.098466360 0.0335411792 0.066538226
250  250 0.064563799 0.0332835355 0.049109181 0.0590913709 0.048957738
251  251 0.038731914 0.0753471272 0.066868020 0.0482908840 0.105517151
252  252 0.033143709 0.0265339446 0.047077625 0.0212645182 0.067210253
253  253 0.039682483 0.0833213425 0.080493657 0.0677318573 0.075490877
254  254 0.052714899 0.0574947416 0.015314978 0.0551931462 0.090031508
255  255 0.097955263 0.0995574617 0.119625807 0.0919935007 0.034978427
256  256 0.047343047 0.0062308781 0.031894063 0.0535237875 0.065302938
257  257 0.082423985 0.0484610634 0.006234518 0.0400536273 0.076063798
258  258 0.068922623 0.0341922447 0.039472465 0.0583488474 0.091080551
259  259 0.046590803 0.0440820539 0.054401255 0.0144322505 0.072879534
260  260 0.004012940 0.0311126377 0.098708636 0.0249231784 0.062634415
261  261 0.034366480 0.0249846926 0.077336290 0.0653081975 0.023953279
262  262 0.035303886 0.0673616712 0.054273753 0.0400063730 0.071886811
263  263 0.051414633 0.0173725786 0.008315494 0.0480211716 0.065007976
264  264 0.089005960 0.0945209279 0.024018868 0.0465433487 0.069027508
265  265 0.118792369 0.0144138024 0.045101452 0.0304846214 0.062709351
266  266 0.096427432 0.0530323745 0.126590783           NA 0.043944859
267  267 0.046005471 0.0659896786          NA 0.0604650491 0.047694023
268  268          NA 0.0676020602 0.083931640 0.0728491852 0.070620923
269  269 0.038336604 0.0409476001 0.034182972 0.0113385129 0.055148945
270  270 0.052676217 0.0523850599 0.099979727 0.0944720816 0.025096743
271  271 0.075350390 0.0788379246 0.015823981 0.0615546447 0.041295226
272  272 0.078875839 0.0063060225 0.054308697 0.0902492088 0.010426228
273  273 0.070529283 0.0265478087 0.017013472 0.0212848859 0.020989043
274  274 0.008141770 0.0596120694 0.077105493 0.0550034387 0.045661668
275  275 0.075489291 0.0366565406 0.094513385 0.0469995812          NA
276  276 0.036603284 0.0911001198 0.108521630 0.0730552230          NA
277  277 0.055244081 0.0701976159 0.073928020 0.0327242130 0.083076956
278  278 0.052236535 0.0521650026 0.105297988 0.0496970698 0.032701432
279  279 0.062845003 0.0047672805 0.087392717           NA          NA
280  280 0.050740249 0.0507830068 0.046043753 0.0266713567 0.046614103
281  281          NA 0.0405075240 0.064311117 0.0537510165 0.089632078
282  282 0.072094879 0.0469296046 0.020840174 0.0288103552 0.069867629
283  283 0.061580797 0.0145532232 0.044443935 0.0486929154 0.063241496
284  284 0.042030451 0.0649597413 0.086628911 0.0359622208 0.085512377
285  285 0.053544335 0.0188313068 0.066238524 0.0682079041 0.026854957
286  286 0.054021159 0.0432133406 0.063720720 0.0850546493 0.071890676
287  287 0.056630584 0.0614427749 0.018856069 0.0253249576 0.032387432
288  288 0.099225385 0.0264945263 0.031864603 0.0407889032 0.050022926
289  289 0.043428489 0.0674897422 0.027061820 0.0931928378 0.116433960
290  290 0.055041962 0.0105046880 0.061858876           NA 0.079083032
291  291 0.085051516           NA 0.020284771 0.0404048662 0.073040231
292  292 0.081625431 0.0639490397 0.066861242 0.1119411285 0.016750163
293  293 0.084357893 0.0752161948 0.016507508 0.1158077021 0.026412922
294  294 0.032675960 0.0414246373 0.104855914 0.0546978597 0.118523494
295  295 0.110074482 0.0651237876 0.063817740 0.0240917315 0.017200977
296  296 0.052001026 0.0153225042 0.028969892 0.0549637226 0.056434381
297  297 0.106005555 0.0461855418 0.057231378 0.0304167681 0.076777132
298  298 0.009472919           NA 0.039426404 0.0935845185 0.080562739
299  299 0.050629508 0.0854354267 0.061134439 0.0258055203 0.082673360
300  300 0.087497437 0.1057973258 0.057305982 0.0611873479 0.045106130

There are no funds with negative performance in the dataset! It’s likely that the NAs have replaced the observations with negative returns. How might that affect our analysis of fund performance? Think about the biases that could have been introduced in our mean and statistical test calculations.
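To see the direction of this bias, here is a minimal simulation sketch. It does not use the fund dataset: the return distribution (mean 2%, standard deviation 4%), the sample size, and the names `true_returns` and `observed_returns` are all assumptions made for illustration.

```r
set.seed(123)  # make the simulation reproducible

# Hypothetical fund returns (assumed values, not taken from the dataset)
true_returns <- rnorm(1000, mean = 0.02, sd = 0.04)

# Mimic the suspected data issue: negative returns are recorded as NA
observed_returns <- ifelse(true_returns < 0, NA, true_returns)

# The mean of the surviving observations overstates the true mean
mean_true <- mean(true_returns)
mean_observed <- mean(observed_returns, na.rm = TRUE)
c(true = mean_true, observed = mean_observed)

# A one-sample t-test of H0: mean = 0 also looks stronger than it should
# (t.test() drops NA values before computing the statistic)
t_true <- unname(t.test(true_returns, mu = 0)$statistic)
t_observed <- unname(t.test(observed_returns, mu = 0)$statistic)
c(true = t_true, observed = t_observed)
```

In this sketch, dropping the negative returns raises the sample mean and shrinks the sample variance at the same time, so both the point estimate and the test statistic are pushed away from zero.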

Appendix

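Removing outliers is a common practice in data analysis. The code below removes observations whose Z-score exceeds a chosen threshold in absolute value.

Note: the threshold used here, 1.645, is the 95th percentile of the standard normal distribution. Because the filter is applied to the absolute Z-score, it would drop roughly 10% of observations (5% in each tail) if the variable were normally distributed. In practice, you should first visualize your data with box plots and then choose a threshold that suits the variables of interest.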

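# function to remove outliers based on z-score
remove_outliers_zscore <- function(data, variable, threshold) {
  # scale() returns a one-column matrix; convert it to a plain numeric vector
  z_scores <- as.numeric(scale(data[[variable]]))
  # keep rows whose absolute z-score is within the threshold
  # (rows with a missing value in `variable` are dropped)
  keep <- !is.na(z_scores) & abs(z_scores) <= threshold
  data[keep, ]
}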

# set the threshold for z-score outlier removal
zscore_threshold <- 1.645    # Adjust as needed

# remove outliers based on z-score for the desired variable
df_filtered <- remove_outliers_zscore(df_gender_on_wealth, "wealth", zscore_threshold)

df_filtered
# A tibble: 9,775 × 6
   pefamid gender education  wealth income_before_tax income_after_tax
   <chr>   <fct>  <fct>       <dbl>             <dbl>            <dbl>
 1 00001   1      3          511750            210000           126650
 2 00002   1      3           18250                 0            23000
 3 00004   1      2         2019000            290000           233850
 4 00006   1      4          160550            100000           120150
 5 00007   2      2         2473000            160000           120675
 6 00008   2      4          290225             23000            36775
 7 00009   2      4         2332255            370000           226875
 8 00010   1      4         1908750             80000            82075
 9 00011   1      2           88650            195000           160625
10 00012   1      3         1241255            205000           182750
# ℹ 9,765 more rows
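Before settling on a threshold, it can help to check how many rows a given rule actually drops. The sketch below compares the Z-score rule with the 1.5 × IQR rule that box plots use to flag points. It runs on a hypothetical right-skewed `toy` data frame (log-normal draws), not on the SFS data, and the names `kept_z` and `kept_iqr` are introduced here for illustration only.

```r
set.seed(42)

# Hypothetical right-skewed wealth values (assumed, not the SFS data)
toy <- data.frame(wealth = rlnorm(5000, meanlog = 12, sdlog = 1))

# Z-score rule with the same threshold as above
z <- as.numeric(scale(toy$wealth))
kept_z <- toy[abs(z) <= 1.645, , drop = FALSE]

# Box-plot style rule: keep values within 1.5 * IQR of the quartiles
q <- quantile(toy$wealth, probs = c(0.25, 0.75))
fence <- 1.5 * (q[2] - q[1])
in_fence <- toy$wealth >= q[1] - fence & toy$wealth <= q[2] + fence
kept_iqr <- toy[in_fence, , drop = FALSE]

# Share of rows dropped by each rule
share_dropped <- c(
  zscore = 1 - nrow(kept_z) / nrow(toy),
  iqr    = 1 - nrow(kept_iqr) / nrow(toy)
)
share_dropped
```

With skewed variables such as wealth, the two rules can disagree noticeably, which is one reason to look at a plot (for example with `boxplot(toy$wealth)`) before choosing a cutoff.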

Footnotes

  1. Statistics Canada, Survey of Financial Security, 2019, 2021. Reproduced and distributed on an “as is” basis with the permission of Statistics Canada. Adapted from Statistics Canada, Survey of Financial Security, 2019, 2021. This does not constitute an endorsement by Statistics Canada of this product.

  • Creative Commons License. See details.
  • The COMET Project and the UBC Vancouver School of Economics are located on the traditional, ancestral and unceded territory of the xʷməθkʷəy̓əm (Musqueam) and Sḵwx̱wú7mesh (Squamish) peoples.