This site was created to investigate the education system in British Columbia, Canada. Although test scores can be quite informative, I believe they can be misleading: the difficulty of tests can fluctuate dramatically, which causes student grades to fluctuate as well. As such, our metric of choice was grade-to-grade transition rates. These rates are not a precise measure of the quality of education, but they are a very good indicator of how much support the British Columbian government is providing to its youth. Our questions are as follows:
How has the British Columbian Government’s educational support (as indicated by grade-to-grade transition rates) in the past 30 years differed in different kinds of schools, as well as for different types of British Columbian students (i.e., Indigenous? Special needs?)? Are there any other specific attributes that are strongly related to a lower/higher quality of educational support?
Note that this is quite a complex question, so a lot of data is required to answer it. Thus, this website uses various open datasets from British Columbia. Our main dataset of interest is the 1992-2021 data on grade-to-grade transition rates published by British Columbia Education Analytics. This dataset contains information at the provincial, district, and school levels. Additional information was gathered from supplementary datasets. A dataset on class sizes from 2006 to 2021 provides information at all three levels, but here we mainly use it at the school level. For the district level, we used a dataset on each district’s office and a dataset on British Columbian teachers in each district.
If you wish to see the GitHub repository for this site, you can find it here.
If you wish to see a PDF version of this report, you can find a copy here. The report was created after this website, so it shares much of the same content. However, do note that there are some slight differences: for instance, the PDF report features some maps that are not found here, while the site features interactive figures that are not found there.
All files that were used are in the ‘.csv’ format. Our main data came in the form of 3 files containing information on BC grade-to-grade transitions at the provincial, district, and school levels (one file for each of 3 different decades). It is worth noting that few of the files were tidy: many of them had some rows that recorded data on “all students” while other rows recorded data on “Indigenous students”, so a row should not be treated as its own independent observation. As such, we had to be careful when working with them. We merged and filtered the data from all of the files into three data tables: one for the provincial level, one for the district level, and one for the school level. Rather than imputing, we removed entries with missing data, as I felt that imputation would make the figures misleading.
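The filtering step described above can be sketched roughly as follows. This is only an illustration using Python's standard library, not the code behind this site: the column names, the sample values, and the `Msk` marker for a masked cell are all hypothetical.

```python
import csv
import io

# Hypothetical excerpt: column names and values are made up for illustration.
RAW = """\
year,level,sub_population,grade,transition_rate
2015,Province,All Students,8,97.1
2015,Province,Indigenous,8,93.4
2015,Province,All Students,9,
2016,Province,All Students,8,97.5
2016,Province,Indigenous,8,Msk
"""


def parse_rate(text):
    """Return the rate as a float, or None when the cell is blank or non-numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def clean(rows, sub_population):
    """Keep a single sub-population and drop rows whose rate is missing."""
    kept = []
    for row in rows:
        # "All Students" rows overlap with the subgroup rows, so never mix them.
        if row["sub_population"] != sub_population:
            continue
        rate = parse_rate(row["transition_rate"])
        if rate is None:
            continue  # drop the row rather than impute a value
        kept.append({"year": int(row["year"]), "grade": int(row["grade"]), "rate": rate})
    return kept


rows = list(csv.DictReader(io.StringIO(RAW)))
all_students = clean(rows, "All Students")
indigenous = clean(rows, "Indigenous")
print(all_students)
print(indigenous)
```

Selecting one sub-population at a time keeps overlapping rows (such as “all students” and “Indigenous students”) from being counted as separate observations of the same students.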
Before doing anything meaningful, we should first do some exploratory data analysis. Based on the structure of the data, I felt as though it would be most appropriate to look at the situation from all three levels. We begin by making interactive figures to understand the situation from a provincial and district level. We will also perform basic model fitting on a school level. By mimicking the structure of our data, we can make the most out of it! Our data exploration is primarily for the purpose of answering our first question: How has the British Columbian Government’s educational support (as indicated by grade-to-grade transition rates) in the past 30 years differed in different kinds of schools, as well as for different types of British Columbian students (i.e., Indigenous? Special needs?)?
After doing some basic exploratory data analysis, we will fit some machine learning models (namely regression trees, bagging, random forests, boosting, and extreme gradient boosting) at the school level, as there are not enough features or observations at the district and provincial levels to fit these models in a meaningful way. The primary focus of this section is to answer our second question: Are there any other specific attributes that are strongly related to a lower/higher quality of educational support?
Let us begin with some exploratory data analysis.
You can see the plots and analysis that were made for the provincial, district and school levels by clicking on the corresponding tabs.
Below is an interactive plot that shows the grade-to-grade transition rates over time. Using the dropdown menus located in the lower right corner of the grid, we can change our population of interest. The axis is fixed for ease of comparison, but you can always zoom in. We can observe various things by adjusting the settings of the plot. Below are just a few of the discoveries I made.
With the “Province-Total” and “All Students” settings, we can see that overall, as time went by, the percentage of students who successfully transitioned to the next grade increased. This is most prominent for students in grades 8-11. It is also worth noting that the percentage of students in grades 10/11 who successfully transition to the next grade is significantly lower than the percentage of students in the lower grades who successfully transition.
Keeping “All Students” but altering the top dropdown between “BC Public School” and “BC Independent School”, we can see that the rates of grade transitions for students of BC public schools are lower than those of BC independent schools, but the increase in the rate of successful grade transitions is significantly higher in public schools. The gap in favour of private schools (i.e., independent schools) is especially large for grade levels 8 and above. This does make sense, as private schools are often more expensive, so the students attending them tend to come from more affluent families.
Keeping “Province-Total” but altering the bottom dropdown between “Indigenous” and “Non Indigenous”, we can see that the rate of successful grade transitions is significantly lower for Indigenous students than for their non-Indigenous counterparts. We can also see that the increase in the rates of grade transitions over the years is higher for Indigenous students than for non-Indigenous students.
Keeping “Province-Total” but altering the bottom dropdown between “Non Special Needs” and “Special Needs”, we can see that the rate of successful grade transitions is significantly lower for special needs students than for students without special needs.
Below are two interactive plots. Figure 2 is an interactive heatmap that displays all the data we have regarding the transition rates for each group, district, and year. Below is a short explanation of the abbreviations used.
(NSN) is an abbreviation for “Non Special Needs”
(SN) is an abbreviation for “Special Needs”
(NI) is an abbreviation for “Non Indigenous”
(I) is an abbreviation for “Indigenous”
Figure 3 consists of an interactive boxplot that summarizes the data for all the districts for each year. The axis is fixed for ease of comparison, but you can always zoom in (or hover your mouse to see more details). There are many interesting things you can find with the interactive plots. Below are just a few findings from our figures:
I ended up fitting three linear models that predicted grade-to-grade transition rates (i.e., the grade-to-grade transition rate of a specific school, grade level, and population). The first model further split the groups into Indigenous and non-Indigenous students, while the second model split them into students with and without special needs. The third model simply dealt with all students (i.e., we did not split into subgroups). The predictors I used were the year, the grade level of interest, and whether the school was public or private. I also used the sub-population of interest as a predictor in my first two models.
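As a rough sketch, the first model could be written as below. The exact specification is an assumption: it treats the year as numeric and the grade as categorical, uses non-Indigenous students and private schools as reference levels, and the coefficient symbols are chosen only for illustration.

\[
\text{rate} = \beta_0 + \beta_1\,\text{year} + \sum_{g} \gamma_g\,\mathbb{1}[\text{grade} = g] + \delta\,\mathbb{1}[\text{public}] + \theta\,\mathbb{1}[\text{Indigenous}] + \varepsilon
\]

Under the same assumptions, the second model would replace the Indigenous indicator with a special needs indicator, and the third model would drop the sub-population term entirely.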
Note that the point of creating these models was not future prediction; we will do that in the machine learning section (which is below this section) instead. The models created here were purely for interpretability. As such, although my first, second, and third models had \(R^2\) values of 0.2164, 0.1542 and 0.1551, respectively, which are very low \(R^2\) values, I was not bothered.
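For reference, and assuming the reported values are the usual coefficient of determination, \(R^2\) compares the model's squared errors with those of a baseline that always predicts the mean rate \(\bar{y}\), where \(\hat{y}_i\) is the fitted rate for observation \(i\):

\[
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\]

A low value therefore says that the predictors leave most of the variation in transition rates unexplained. It does not by itself say that the estimated group differences are unreliable.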
All of the predictors had extremely small p-values (less than 1e-16), which is strong evidence that grade-to-grade transition rates differ between students of different grades, between Indigenous and non-Indigenous students, between students with and without special needs, and between students of public and private/independent schools. The coefficients of our models tell us more about these differences. The first model suggests that Indigenous students had a 2.502% lower mean rate of successful transitions than their non-Indigenous counterparts, while the second model suggests that students with special needs had a 1.126% lower mean rate than their counterparts without special needs. Looking at the grade coefficients in all of our models, we can also see that students in high school (BC high school starts at grade 8) are significantly less likely to transition to the next grade than their elementary school counterparts. One possible explanation is the riskier behavior of certain high school students, which may result in them getting into all sorts of trouble. Lastly, our models suggest that the transition rate for public schools is smaller than that of private schools, but this difference is fairly negligible. This provides a nice answer to our first question at the school level.
As we will be training a machine learning model on a dataset at the school level, it is worth doing some exploratory data analysis on that dataset. Note that the dataset we will use for the machine learning model is the result of merging several datasets, and some of these datasets cover different ranges of years, so the machine learning dataset only covers the years from 2006 to 2016.
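The effect of merging on year coverage can be illustrated with a small sketch. Only the 1992-2021 span of the transition data and the 2006-2021 span of the class-size data come from the text above; the coverage assigned to the teacher dataset is an assumption made purely so the example runs.

```python
# Year coverage of each source dataset.
transition_years = set(range(1992, 2022))  # 1992-2021, as stated for the main dataset
class_size_years = set(range(2006, 2022))  # 2006-2021, as stated for class sizes
teacher_years = set(range(2000, 2017))     # ASSUMED coverage, for illustration only

# Merging on year keeps only the years present in every dataset,
# i.e. the intersection of the individual coverages.
common_years = transition_years & class_size_years & teacher_years
print(min(common_years), max(common_years))
```

The merged table can only be as long as the shortest overlap, which is why the machine learning dataset spans fewer years than the main transition-rate data.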
Below is an interactive plot showing how the number of full-time educators in a district and the average class size for a grade group in a school may affect the grade transition rates. You can change the year and the grade of interest through the dropdown menu. One observation we can make is that the lower grade transition rates are often located near the bottom of the plot (i.e., where the number of students for a grade group is relatively low) and in relatively high grades, like grade 11.