Are electric cars a worse investment?

This post examines the depreciation of electric vehicles (EVs) in the Montreal area and how the Quebec government's decision to phase out the Roulez Vert program may affect their value.


Electric vehicles (EVs) are generally more expensive than their gasoline counterparts. Yet adoption has been relatively high thanks to government incentives and the low cost of electricity in the province of Quebec. Now that the Quebec government's Roulez Vert program is being phased out, it is worth examining the financial impact of EVs more closely. The program’s initial goal was to promote EVs with monetary incentives (up to $7,000 CAD for new cars and $3,500 CAD for used cars). Removing this program means that every EV purchased starting on January 1, 2025, will be more expensive.

This presents a unique opportunity: an EV acquired this year may depreciate less than would traditionally be expected, because the cost of EVs will slowly increase over the next few years.

My understanding of EVs is that they should hold their value longer than a typical gasoline vehicle. The drivetrain has fewer parts, and battery capacity can remain relatively stable over a large number of recharge cycles. Brake pads need replacing less often thanks to regenerative braking. The main disadvantage is that battery capacity has been increasing significantly with every new generation of EVs. Owners therefore have an incentive to upgrade, which saturates the used EV market, lowers prices, and makes the cars depreciate faster.

Before making any assumptions, it is best to refer to data.

Acquiring used vehicle market data

I scraped some used vehicle market data using the Web Scraper Chrome extension. I then used the US car models GitHub repository to parse the vehicle make and model out of the text description for each vehicle.

import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import glob
import numpy as np
import re

# Load the US car models reference data (one CSV file per model year)
us_car_models_df = pd.concat(
    [pd.read_csv(file) for file in glob.glob("data/us-car-models/*.csv")],
    ignore_index=True,
)

all_cars_models = sorted(us_car_models_df.model.unique().tolist(), key=len, reverse=True)

# Parse scraped vehicle data
evs = pd.read_csv("data/electric_cars.csv")
gas = pd.read_csv("data/gasoline_cars.csv")
evs["engine_type"] = "electric"
gas["engine_type"] = "gasoline"
df = pd.concat([evs, gas]).drop_duplicates()

df["vehicle-name"] = df["vehicle-name"].str.strip()
df["year"] = df["vehicle-name"].str[:4].astype(float)
df["price"] = (
    df["price"].str.extract(r"([\d,]+)", expand=False).str.replace(",", "").astype(float)
)
df = df[df["year"] > 2010]
df["make"] = df["vehicle-name"].str.extract(r"\d+\s+([A-Za-z\-]+)\s", expand=False)

def find_model_in_paragraph(text):
    reg = re.compile('[^a-z0-9]')
    for model in all_cars_models:
        if reg.sub('', model.lower()) in reg.sub('', text.lower()):
            return model
    return None

df["model"] = df["vehicle-name"].apply(find_model_in_paragraph)
df["mileage"] = df.mileage.str.replace("[km,]", "", regex=True).astype(float)
df["vehicle_age"] = 2025 - df.year
df.drop(
    ["vehicle-name", "vehicle-desc", "pagination", "web-scraper-order", "web-scraper-start-url"],
    axis=1, inplace=True
)
df.dropna(inplace=True)
df.year = df.year.astype(int)
df["MakeModel"] = (df["make"] + " " + df["model"]).astype("category")
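
The model lookup above depends on two details: names are normalized before comparison, and the reference list is sorted longest first. The sketch below illustrates why the ordering matters, using a small hypothetical list of models and a made-up listing title instead of the real reference data; `normalize` and `find_model` are illustrative names, not part of the original script.

```python
import re

NON_ALNUM = re.compile(r"[^a-z0-9]")

def normalize(text):
    """Lowercase the text and drop everything except letters and digits."""
    return NON_ALNUM.sub("", text.lower())

def find_model(text, models):
    """Return the first model whose normalized name occurs in the normalized text."""
    haystack = normalize(text)
    for model in models:
        if normalize(model) in haystack:
            return model
    return None

# Hypothetical reference list; the post builds the real one from the us-car-models CSVs.
models = ["Ioniq", "Ioniq 5", "Leaf"]
longest_first = sorted(models, key=len, reverse=True)

listing = "2022 Hyundai IONIQ 5 Preferred AWD"  # made-up listing title
print(find_model(listing, models))         # Ioniq
print(find_model(listing, longest_first))  # Ioniq 5
```

Substring matching of this kind can still produce false positives: a very short model name may match an unrelated part of the listing, such as the year, so the parsed models are worth spot-checking.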

Data visualization

We can start our analysis with some visualization of this newly created dataset.

fig = make_subplots(rows=2, cols=2)

ages, age_counts = np.unique(df.vehicle_age, return_counts=True)
fig.add_trace(go.Bar(x=ages, y=age_counts, name='Vehicle age'), row=1, col=1)

engine_types, engine_counts = np.unique(df.engine_type, return_counts=True)
fig.add_trace(go.Bar(x=engine_types, y=engine_counts, name='Engine type'), row=1, col=2)

bin_size = 5000
bins, counts = np.unique(df.mileage.astype(int) // bin_size, return_counts=True)
fig.add_trace(go.Bar(x=bins * bin_size / 1000, y=counts, name='Mileage'), row=2, col=1)

makes, make_counts = np.unique(df.make, return_counts=True)
top_idx = np.argsort(make_counts)[::-1][:15]
fig.add_trace(go.Bar(x=makes[top_idx], y=make_counts[top_idx], name='Top 15 makes'), row=2, col=2)

fig.update_layout(showlegend=False)
fig.update_xaxes(title_text='Vehicle age (years)', row=1, col=1)
fig.update_yaxes(title_text='Count', row=1, col=1)
fig.update_xaxes(title_text='Engine type', row=1, col=2)
fig.update_xaxes(title_text='Mileage (thousand km)', row=2, col=1)
fig.update_yaxes(title_text='Count', row=2, col=1)
fig.update_xaxes(tickangle=-40, row=2, col=2)
fig.show()
Histogram for each column of the dataset.

To examine the effect of engine type on vehicle price depreciation, we can calculate both the average and median prices for vehicles of varying ages.

avg_price_by_age = df.groupby(['vehicle_age', 'engine_type']).price.mean().reset_index()
med_price_by_age = df.groupby(['vehicle_age', 'engine_type']).price.median().reset_index()

fig = make_subplots(rows=1, cols=2, shared_yaxes=True)

for engine_type in ['electric', 'gasoline']:
    avg = avg_price_by_age[avg_price_by_age.engine_type == engine_type]
    med = med_price_by_age[med_price_by_age.engine_type == engine_type]
    fig.add_trace(go.Scatter(x=avg.vehicle_age, y=avg.price, mode='lines+markers', name=f'{engine_type} (avg)'), row=1, col=1)
    fig.add_trace(go.Scatter(x=med.vehicle_age, y=med.price, mode='lines+markers', name=f'{engine_type} (median)', line=dict(dash='dot')), row=1, col=2)

fig.update_xaxes(title_text='Vehicle age (years)')
fig.update_yaxes(title_text='Price ($ CAD)', tickformat=',.0f', row=1, col=1)
fig.show()
Average and median vehicle price by engine type.
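
Reporting both statistics is useful because listing prices tend to be skewed: a handful of expensive vehicles can pull the average up while leaving the median almost unchanged. A minimal illustration with hypothetical prices (not taken from the scraped dataset):

```python
from statistics import mean, median

# Hypothetical listing prices in CAD; the last one plays the role of a luxury outlier.
prices = [18_000, 19_500, 21_000, 22_500, 95_000]

print(f"mean:   {mean(prices):,.0f}")    # mean:   35,200
print(f"median: {median(prices):,.0f}")  # median: 21,000
```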

Modeling

To check what the plots suggest, I fitted a simple linear mixed-effects regression model that predicts vehicle price from mileage, age, and engine type, with each make and model combination treated as a group. Some sources of bias remain unaccounted for, such as the options added to the vehicle and the MSRP. We can still get a rough estimate by inspecting the parameters of the fitted model.

import statsmodels.formula.api as smf

model = smf.mixedlm(
    "price ~ engine_type*mileage + engine_type*vehicle_age",
    df, groups="MakeModel"
).fit()

print(model.summary())
Number of listings per make/model in the dataset (top 50). Used as grouping variable in the mixed-effects model.
Mixed-effects model coefficient summary.
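
For reference, the formula passed to `mixedlm` can be written out as follows. This is my reading of the formula, assuming the default treatment coding (with electric as the reference level) and the default random intercept per group; the symbol names are introduced here for illustration only.

```latex
% i indexes listings, j indexes MakeModel groups
% g_{ij} = 1 for gasoline, 0 for electric (reference level)
% m_{ij} = mileage in km, a_{ij} = vehicle age in years
\begin{aligned}
\text{price}_{ij} ={}& \beta_0 + \beta_1\, g_{ij} + \beta_2\, m_{ij} + \beta_3\, a_{ij} \\
 &+ \beta_4\, g_{ij}\, m_{ij} + \beta_5\, g_{ij}\, a_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Under this reading, $\beta_2$ and $\beta_3$ are the per-km and per-year slopes for EVs, while $\beta_4$ and $\beta_5$ are the amounts by which those slopes differ for gasoline vehicles.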

What’s most interesting about the results are the coefficients. As expected, every km driven (mileage) decreases the value of the vehicle by $0.097 on average, and every additional year decreases it by $3,327. The model uses EVs as the baseline, and the interaction terms show that gasoline vehicles lose $0.021 less per km and $635 less per year, or roughly $0.076 per km and $2,692 per year. These numbers are inexact, since the relationship between these variables is most likely not linear. The sign of the coefficients is nonetheless clear: gasoline vehicles hold their value longer than their electric counterparts.
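
To make the interaction terms concrete, the sketch below combines the coefficients quoted above into per-engine-type slopes and applies them to a hypothetical vehicle (60,000 km, 4 years old). It considers only the mileage and age slopes and ignores the intercept, the engine type main effect, and the per-model random effects, so it is a rough illustration and not a price prediction.

```python
# Coefficients quoted in the text, in CAD. Main effects are the EV baseline;
# interaction terms are the offsets that apply to gasoline vehicles.
ev_per_km = -0.097
ev_per_year = -3327.0
gas_offset_per_km = 0.021
gas_offset_per_year = 635.0

# Gasoline slopes are the baseline slopes plus the interaction offsets.
gas_per_km = ev_per_km + gas_offset_per_km        # about -0.076
gas_per_year = ev_per_year + gas_offset_per_year  # -2692.0

def predicted_loss(per_km, per_year, km, years):
    """Value lost (as a positive number) from mileage and age alone."""
    return -(per_km * km + per_year * years)

km, years = 60_000, 4  # hypothetical usage, chosen only for illustration
ev_loss = predicted_loss(ev_per_km, ev_per_year, km, years)
gas_loss = predicted_loss(gas_per_km, gas_per_year, km, years)

print(f"EV loss:       ${ev_loss:,.0f}")             # EV loss:       $19,128
print(f"Gasoline loss: ${gas_loss:,.0f}")            # Gasoline loss: $15,328
print(f"Difference:    ${ev_loss - gas_loss:,.0f}")  # Difference:    $3,800
```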

Conclusion

In this dataset, EVs depreciate faster than gasoline cars. However, this difference appears primarily in vehicles older than 5 years. The popularity of EVs in Canada rose significantly about five years ago, notably with the release of the Tesla Model 3. The technology has advanced considerably since then, so it is not surprising that early EV models have fared poorly on the used market and are resold at lower values.