import polars as pl
import numpy as np
import statsmodels.formula.api as smf
from marginaleffects import *
happiness = pl.read_csv("../data/happiness.csv", null_values = "NA")
happiness = happiness.with_columns(
    happiness["latin_america"].cast(pl.Int32),
    happiness["happiness_binary"].cast(pl.Int32),
    happiness["region"].cast(pl.Categorical),
    happiness["income"].cast(pl.Categorical)
)
{marginaleffects} playground
Sliders and switches
model_slider = smf.ols(
    "happiness_score ~ life_expectancy",
    data = happiness).fit()

model_slider.summary()
Dep. Variable: | happiness_score | R-squared: | 0.552 |
Model: | OLS | Adj. R-squared: | 0.549 |
Method: | Least Squares | F-statistic: | 188.5 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 1.83e-28 |
Time: | 00:59:46 | Log-Likelihood: | -179.26 |
No. Observations: | 155 | AIC: | 362.5 |
Df Residuals: | 153 | BIC: | 368.6 |
Df Model: | 1 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | -2.2146 | 0.556 | -3.983 | 0.000 | -3.313 | -1.116 |
life_expectancy | 0.1055 | 0.008 | 13.729 | 0.000 | 0.090 | 0.121 |
Omnibus: | 6.034 | Durbin-Watson: | 1.935 |
Prob(Omnibus): | 0.049 | Jarque-Bera (JB): | 3.244 |
Skew: | -0.101 | Prob(JB): | 0.198 |
Kurtosis: | 2.321 | Cond. No. | 647. |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
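Because this model is a straight line, the slope is identical everywhere and equals the coefficient. A quick check with avg_slopes() (a sketch):

avg_slopes(model_slider, variables = "life_expectancy")  # estimate ≈ 0.1055, same as the coefficient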
= "life_expectancy") plot_predictions(model_slider, condition
model_switch = smf.ols(
    "happiness_score ~ latin_america",
    data = happiness).fit()

model_switch.summary()
Dep. Variable: | happiness_score | R-squared: | 0.075 |
Model: | OLS | Adj. R-squared: | 0.069 |
Method: | Least Squares | F-statistic: | 12.38 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 0.000572 |
Time: | 00:59:46 | Log-Likelihood: | -235.45 |
No. Observations: | 155 | AIC: | 474.9 |
Df Residuals: | 153 | BIC: | 481.0 |
Df Model: | 1 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | 5.2438 | 0.096 | 54.360 | 0.000 | 5.053 | 5.434 |
latin_america | 0.9009 | 0.256 | 3.518 | 0.001 | 0.395 | 1.407 |
Omnibus: | 4.411 | Durbin-Watson: | 1.917 |
Prob(Omnibus): | 0.110 | Jarque-Bera (JB): | 3.644 |
Skew: | 0.273 | Prob(JB): | 0.162 |
Kurtosis: | 2.485 | Cond. No. | 2.93 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
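For a switch, the effect is just the difference in average predictions between the two groups, which again matches the coefficient. A quick check (a sketch):

avg_comparisons(model_switch, variables = "latin_america")  # estimate ≈ 0.9009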
= "latin_america") plot_predictions(model_switch, condition
model_mixer = smf.ols(
    "happiness_score ~ life_expectancy + school_enrollment + C(region)",
    data = happiness.to_pandas()).fit()

model_mixer.summary()
Dep. Variable: | happiness_score | R-squared: | 0.565 |
Model: | OLS | Adj. R-squared: | 0.528 |
Method: | Least Squares | F-statistic: | 15.28 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 3.68e-14 |
Time: | 00:59:47 | Log-Likelihood: | -111.78 |
No. Observations: | 103 | AIC: | 241.6 |
Df Residuals: | 94 | BIC: | 265.3 |
Df Model: | 8 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | -2.7896 | 1.358 | -2.055 | 0.043 | -5.485 | -0.094 |
C(region)[T.Middle East & North Africa] | 0.1580 | 0.248 | 0.636 | 0.526 | -0.335 | 0.651 |
C(region)[T.South Asia] | -0.2803 | 0.411 | -0.682 | 0.497 | -1.096 | 0.536 |
C(region)[T.Latin America & Caribbean] | 0.7005 | 0.221 | 3.175 | 0.002 | 0.262 | 1.138 |
C(region)[T.Sub-Saharan Africa] | 0.2946 | 0.371 | 0.793 | 0.430 | -0.443 | 1.032 |
C(region)[T.East Asia & Pacific] | -0.0315 | 0.255 | -0.123 | 0.902 | -0.538 | 0.475 |
C(region)[T.North America] | 1.0827 | 0.544 | 1.989 | 0.050 | 0.002 | 2.163 |
life_expectancy | 0.1025 | 0.017 | 5.894 | 0.000 | 0.068 | 0.137 |
school_enrollment | 0.0077 | 0.010 | 0.785 | 0.435 | -0.012 | 0.027 |
Omnibus: | 7.101 | Durbin-Watson: | 2.050 |
Prob(Omnibus): | 0.029 | Jarque-Bera (JB): | 3.602 |
Skew: | -0.219 | Prob(JB): | 0.165 |
Kurtosis: | 2.195 | Cond. No. | 2.24e+03 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.24e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
Basic examples
Happiness and life expectancy
model_mixer = smf.ols(
    "happiness_score ~ life_expectancy + school_enrollment + C(region)",
    data = happiness.to_pandas()).fit()

avg_comparisons(model_mixer, variables = "life_expectancy")
term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|
str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"life_expectancy" | "+1" | 0.102474 | 0.017387 | 5.893645 | 3.7777e-9 | 27.979848 | 0.068396 | 0.136552 |
Difference between North America and South Asia
avg_comparisons(
    model_mixer,
    variables = {"region": ["North America", "South Asia"]}
)
term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|
str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"region" | "South Asia - North America" | -1.362978 | 0.670415 | -2.033036 | 0.042049 | 4.571788 | -2.676967 | -0.048988 |
You can get the same result from a regular regression table, but it takes a lot more work: subtract the North America coefficient from the South Asia coefficient by hand (−0.2803 − 1.0827 = −1.363), and the table gives you no standard error for that difference.
model_mixer.summary()
Dep. Variable: | happiness_score | R-squared: | 0.565 |
Model: | OLS | Adj. R-squared: | 0.528 |
Method: | Least Squares | F-statistic: | 15.28 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 3.68e-14 |
Time: | 00:59:47 | Log-Likelihood: | -111.78 |
No. Observations: | 103 | AIC: | 241.6 |
Df Residuals: | 94 | BIC: | 265.3 |
Df Model: | 8 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | -2.7896 | 1.358 | -2.055 | 0.043 | -5.485 | -0.094 |
C(region)[T.Middle East & North Africa] | 0.1580 | 0.248 | 0.636 | 0.526 | -0.335 | 0.651 |
C(region)[T.South Asia] | -0.2803 | 0.411 | -0.682 | 0.497 | -1.096 | 0.536 |
C(region)[T.Latin America & Caribbean] | 0.7005 | 0.221 | 3.175 | 0.002 | 0.262 | 1.138 |
C(region)[T.Sub-Saharan Africa] | 0.2946 | 0.371 | 0.793 | 0.430 | -0.443 | 1.032 |
C(region)[T.East Asia & Pacific] | -0.0315 | 0.255 | -0.123 | 0.902 | -0.538 | 0.475 |
C(region)[T.North America] | 1.0827 | 0.544 | 1.989 | 0.050 | 0.002 | 2.163 |
life_expectancy | 0.1025 | 0.017 | 5.894 | 0.000 | 0.068 | 0.137 |
school_enrollment | 0.0077 | 0.010 | 0.785 | 0.435 | -0.012 | 0.027 |
Omnibus: | 7.101 | Durbin-Watson: | 2.050 |
Prob(Omnibus): | 0.029 | Jarque-Bera (JB): | 3.602 |
Skew: | -0.219 | Prob(JB): | 0.165 |
Kurtosis: | 2.195 | Cond. No. | 2.24e+03 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.24e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
Are these significantly different from each other?
avg_comparisons(
    model_mixer,
    variables = {"region": ["North America", "South Asia"]},
    hypothesis = 0
)
term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|
str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"region" | "South Asia - North America" | -1.362978 | 0.670415 | -2.033036 | 0.042049 | 4.571788 | -2.676967 | -0.048988 |
Is it significantly different from -1?
avg_comparisons(
    model_mixer,
    variables = {"region": ["North America", "South Asia"]},
    hypothesis = -1
)
term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|
str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"region" | "South Asia - North America" | -1.362978 | 0.670415 | -0.541422 | 0.588217 | 0.765581 | -2.676967 | -0.048988 |
Squared life expectancy
model_poly = smf.ols(
    "happiness_score ~ life_expectancy + I(life_expectancy**2) + school_enrollment + latin_america",
    data = happiness.to_pandas()).fit()

model_poly.summary()
Dep. Variable: | happiness_score | R-squared: | 0.589 |
Model: | OLS | Adj. R-squared: | 0.572 |
Method: | Least Squares | F-statistic: | 35.08 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 3.66e-18 |
Time: | 00:59:47 | Log-Likelihood: | -108.92 |
No. Observations: | 103 | AIC: | 227.8 |
Df Residuals: | 98 | BIC: | 241.0 |
Df Model: | 4 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | 15.9100 | 5.220 | 3.048 | 0.003 | 5.550 | 26.270 |
life_expectancy | -0.4251 | 0.150 | -2.832 | 0.006 | -0.723 | -0.127 |
I(life_expectancy ** 2) | 0.0037 | 0.001 | 3.498 | 0.001 | 0.002 | 0.006 |
school_enrollment | 0.0053 | 0.009 | 0.592 | 0.555 | -0.013 | 0.023 |
latin_america | 0.8022 | 0.197 | 4.077 | 0.000 | 0.412 | 1.193 |
Omnibus: | 7.729 | Durbin-Watson: | 1.984 |
Prob(Omnibus): | 0.021 | Jarque-Bera (JB): | 5.183 |
Skew: | -0.402 | Prob(JB): | 0.0749 |
Kurtosis: | 2.251 | Cond. No. | 4.21e+05 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 4.21e+05. This might indicate that there are
strong multicollinearity or other numerical problems.
avg_comparisons(
    model_poly,
    newdata = datagrid(life_expectancy = [60, 80]),
    by = "life_expectancy",
    variables = "life_expectancy"
)
life_expectancy | term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|---|
f64 | str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
60.0 | "life_expectancy" | "+1" | 0.021869 | 0.025088 | 0.871712 | 0.383366 | 1.383207 | -0.027302 | 0.071041 |
80.0 | "life_expectancy" | "+1" | 0.170868 | 0.024183 | 7.065503 | 1.6003e-12 | 39.184817 | 0.12347 | 0.218267 |
plot_predictions(
    model_poly,
    condition = "life_expectancy"
)
Are those two slopes significantly different from each other?
avg_comparisons(
    model_poly,
    newdata = datagrid(life_expectancy = [60, 80]),
    by = "life_expectancy",
    variables = "life_expectancy",
    hypothesis = "difference ~ pairwise"
)
term | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|
str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"(80.0) - (60.0)" | 0.148999 | 0.04259 | 3.498484 | 0.000468 | 11.061478 | 0.065525 | 0.232473 |
Interaction terms
model_poly_int = smf.ols(
    "happiness_score ~ life_expectancy * latin_america + I(life_expectancy**2) * latin_america + school_enrollment",
    data = happiness.to_pandas()
).fit()

model_poly_int.summary()
Dep. Variable: | happiness_score | R-squared: | 0.590 |
Model: | OLS | Adj. R-squared: | 0.565 |
Method: | Least Squares | F-statistic: | 23.05 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 1.10e-16 |
Time: | 00:59:47 | Log-Likelihood: | -108.73 |
No. Observations: | 103 | AIC: | 231.5 |
Df Residuals: | 96 | BIC: | 249.9 |
Df Model: | 6 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | 15.7272 | 5.317 | 2.958 | 0.004 | 5.172 | 26.282 |
life_expectancy | -0.4196 | 0.153 | -2.739 | 0.007 | -0.724 | -0.116 |
latin_america | 58.5005 | 98.312 | 0.595 | 0.553 | -136.647 | 253.648 |
life_expectancy:latin_america | -1.5484 | 2.636 | -0.587 | 0.558 | -6.782 | 3.685 |
I(life_expectancy ** 2) | 0.0037 | 0.001 | 3.388 | 0.001 | 0.002 | 0.006 |
I(life_expectancy ** 2):latin_america | 0.0104 | 0.018 | 0.587 | 0.558 | -0.025 | 0.045 |
school_enrollment | 0.0052 | 0.009 | 0.568 | 0.571 | -0.013 | 0.023 |
Omnibus: | 7.750 | Durbin-Watson: | 1.996 |
Prob(Omnibus): | 0.021 | Jarque-Bera (JB): | 4.988 |
Skew: | -0.381 | Prob(JB): | 0.0826 |
Kurtosis: | 2.238 | Cond. No. | 7.98e+06 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 7.98e+06. This might indicate that there are
strong multicollinearity or other numerical problems.
plot_predictions(
    model_poly_int,
    condition = ["life_expectancy", "latin_america"]
)
avg_comparisons(
    model_poly_int,
    newdata = datagrid(life_expectancy = [60, 80], latin_america = [0, 1]),
    by = ["life_expectancy", "latin_america"],
    variables = "life_expectancy"
)
life_expectancy | latin_america | term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|---|---|
f64 | i64 | str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
60.0 | 0 | "life_expectancy" | "+1" | 0.022809 | 0.025418 | 0.897368 | 0.369523 | 1.436265 | -0.027009 | 0.072627 |
60.0 | 1 | "life_expectancy" | "+1" | -0.280753 | 0.51962 | -0.540305 | 0.588987 | 0.763693 | -1.299189 | 0.737682 |
80.0 | 0 | "life_expectancy" | "+1" | 0.170263 | 0.024856 | 6.849983 | 7.3859e-12 | 36.978369 | 0.121546 | 0.21898 |
80.0 | 1 | "life_expectancy" | "+1" | 0.281633 | 0.200314 | 1.405955 | 0.159737 | 2.646226 | -0.110976 | 0.674242 |
Synthetic data
predictions(
    model_poly_int,
    newdata = datagrid(life_expectancy = [60, 80], latin_america = [0, 1])
)
life_expectancy | latin_america | rowid | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high | happiness_score | school_enrollment |
---|---|---|---|---|---|---|---|---|---|---|---|
f64 | i64 | i32 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
60.0 | 0 | 0 | 4.29733 | 0.195685 | 21.9605 | 0.0 | inf | 3.913796 | 4.680865 | 5.707689 | 91.344333 |
60.0 | 1 | 1 | 7.240226 | 3.769334 | 1.920824 | 0.054754 | 4.190894 | -0.147533 | 14.627984 | 5.707689 | 91.344333 |
80.0 | 0 | 2 | 6.228052 | 0.109109 | 57.081022 | 0.0 | inf | 6.014202 | 6.441901 | 5.707689 | 91.344333 |
80.0 | 1 | 3 | 7.249024 | 0.52484 | 13.811885 | 0.0 | inf | 6.220357 | 8.277691 | 5.707689 | 91.344333 |
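Every row shows the same happiness_score and school_enrollment because datagrid() holds unspecified columns at typical values (means for numeric columns). You can build the grid on its own to see this; a sketch, assuming datagrid's model argument:

datagrid(model = model_poly_int, life_expectancy = [60, 80], latin_america = [0, 1])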
Logistic regression
model_logit = smf.logit(
    formula = "happiness_binary ~ life_expectancy + school_enrollment + latin_america",
    data = happiness.to_pandas()
).fit()

model_logit.summary()
Optimization terminated successfully.
Current function value: 0.380007
Iterations 7
Dep. Variable: | happiness_binary | No. Observations: | 103 |
Model: | Logit | Df Residuals: | 99 |
Method: | MLE | Df Model: | 3 |
Date: | Thu, 17 Jul 2025 | Pseudo R-squ.: | 0.3788 |
Time: | 00:59:47 | Log-Likelihood: | -39.141 |
converged: | True | LL-Null: | -63.004 |
Covariance Type: | nonrobust | LLR p-value: | 2.436e-10 |
coef | std err | z | P>|z| | [0.025 | 0.975] |
Intercept | -19.6938 | 4.777 | -4.123 | 0.000 | -29.056 | -10.331 |
life_expectancy | 0.2490 | 0.068 | 3.637 | 0.000 | 0.115 | 0.383 |
school_enrollment | 0.0226 | 0.048 | 0.470 | 0.638 | -0.071 | 0.117 |
latin_america | 1.1725 | 0.848 | 1.383 | 0.167 | -0.489 | 2.835 |
Ew log odds ↑
Kinda okay odds ratios ↓
np.exp(model_logit.params)
Intercept 2.799432e-09
life_expectancy 1.282772e+00
school_enrollment 1.022811e+00
latin_america 3.230127e+00
dtype: float64
plot_predictions(
    model_logit,
    condition = ["life_expectancy", "latin_america"]
)
Probability scale predictions
predictions(
    model_logit,
    newdata = datagrid(life_expectancy = [60, 80], latin_america = [0, 1])
)
life_expectancy | latin_america | rowid | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high | school_enrollment | happiness_binary |
---|---|---|---|---|---|---|---|---|---|---|---|
f64 | i64 | i32 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
60.0 | 0 | 0 | 0.063438 | 0.05717 | 1.109638 | 0.267155 | 1.904251 | -0.048613 | 0.175488 | 91.344333 | 1.0 |
60.0 | 1 | 1 | 0.179515 | 0.184336 | 0.973847 | 0.330132 | 1.598884 | -0.181777 | 0.540807 | 91.344333 | 1.0 |
80.0 | 0 | 2 | 0.907904 | 0.046455 | 19.543888 | 0.0 | inf | 0.816855 | 0.998954 | 91.344333 | 1.0 |
80.0 | 1 | 3 | 0.969553 | 0.025765 | 37.631279 | 0.0 | inf | 0.919055 | 1.02005 | 91.344333 | 1.0 |
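These estimates are just the inverse logit of the linear predictor. A hand-check of the third row (latin_america = 0 drops out; school_enrollment is held at its mean of ≈ 91.344):

b = model_logit.params
xb = b["Intercept"] + b["life_expectancy"] * 80 + b["school_enrollment"] * 91.344333
1 / (1 + np.exp(-xb))  # ≈ 0.908, matching the estimate above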
Probability scale slopes!
comparisons(
    model_logit,
    newdata = datagrid(life_expectancy = [60, 80], latin_america = [0, 1]),
    by = ["life_expectancy", "latin_america"],
    variables = "life_expectancy"
)
life_expectancy | latin_america | term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|---|---|
f64 | i64 | str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
60.0 | 0 | "life_expectancy" | "+1" | 0.01482 | 0.008675 | 1.708342 | 0.087573 | 3.513373 | -0.002183 | 0.031823 |
60.0 | 1 | "life_expectancy" | "+1" | 0.036689 | 0.022376 | 1.639665 | 0.101075 | 3.306505 | -0.007167 | 0.080546 |
80.0 | 0 | "life_expectancy" | "+1" | 0.020849 | 0.005693 | 3.662306 | 0.00025 | 11.966046 | 0.009691 | 0.032006 |
80.0 | 1 | "life_expectancy" | "+1" | 0.007367 | 0.005437 | 1.354877 | 0.175457 | 2.510813 | -0.00329 | 0.018024 |
Poisson regression
equality = pl.read_csv("../data/equality.csv")
equality = equality.with_columns(
    equality["historical"].cast(pl.Categorical),
    equality["region"].cast(pl.Categorical),
    equality["state"].cast(pl.Categorical)
)
model_poisson = smf.poisson(
    "laws ~ percent_urban + historical",
    data = equality.to_pandas()).fit()

model_poisson.summary()
Optimization terminated successfully.
Current function value: 3.722404
Iterations 6
Dep. Variable: | laws | No. Observations: | 49 |
Model: | Poisson | Df Residuals: | 45 |
Method: | MLE | Df Model: | 3 |
Date: | Thu, 17 Jul 2025 | Pseudo R-squ.: | 0.4263 |
Time: | 00:59:47 | Log-Likelihood: | -182.40 |
converged: | True | LL-Null: | -317.91 |
Covariance Type: | nonrobust | LLR p-value: | 1.851e-58 |
coef | std err | z | P>|z| | [0.025 | 0.975] |
Intercept | 0.2080 | 0.270 | 0.769 | 0.442 | -0.322 | 0.738 |
historical[T.swing] | 0.9053 | 0.143 | 6.348 | 0.000 | 0.626 | 1.185 |
historical[T.dem] | 1.5145 | 0.135 | 11.194 | 0.000 | 1.249 | 1.780 |
percent_urban | 0.0163 | 0.004 | 4.558 | 0.000 | 0.009 | 0.023 |
np.exp(model_poisson.params)
Intercept 1.231184
historical[T.swing] 2.472684
historical[T.dem] 4.547026
percent_urban 1.016391
dtype: float64
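These exponentiated coefficients are multiplicative: Democratic-leaning states are predicted to have about 4.5 times as many laws as GOP-leaning states (the reference level), holding urbanization constant. To get a count-scale difference instead, the same avg_comparisons() pattern from the region example works here too (a sketch):

avg_comparisons(model_poisson, variables = {"historical": ["gop", "dem"]})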
plot_predictions(
    model_poisson,
    condition = ["percent_urban", "historical"]
)
avg_slopes(
    model_poisson,
    newdata = datagrid(percent_urban = [45, 85], historical = ["dem", "gop", "swing"]),
    by = ["percent_urban", "historical"],
    variables = "percent_urban"
)
percent_urban | historical | term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|---|---|
f64 | enum | str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
45.0 | "dem" | "percent_urban" | "dY/dX" | 0.189168 | 0.018391 | 10.285996 | 0.0 | inf | 0.153122 | 0.225213 |
45.0 | "gop" | "percent_urban" | "dY/dX" | 0.041603 | 0.007211 | 5.769223 | 7.9638e-9 | 26.903898 | 0.027469 | 0.055736 |
45.0 | "swing" | "percent_urban" | "dY/dX" | 0.10287 | 0.013163 | 7.815058 | 5.5511e-15 | 47.356144 | 0.077071 | 0.128669 |
85.0 | "dem" | "percent_urban" | "dY/dX" | 0.362473 | 0.090267 | 4.015584 | 0.000059 | 14.041638 | 0.185554 | 0.539393 |
85.0 | "gop" | "percent_urban" | "dY/dX" | 0.079717 | 0.024479 | 3.256552 | 0.001128 | 9.792347 | 0.031739 | 0.127694 |
85.0 | "swing" | "percent_urban" | "dY/dX" | 0.197114 | 0.050732 | 3.885366 | 0.000102 | 13.256657 | 0.09768 | 0.296548 |
Your turn
Create a model that explains variation in penguin weight (body_mass_g) based on flipper length, squared flipper length, bill length, and species interacted with flipper length:

body_mass_g ~ flipper_length_mm * species + I(flipper_length_mm**2) * species + bill_length_mm
penguins = pl.read_csv("../data/penguins.csv")
penguins = penguins.with_columns(
    penguins["species"].cast(pl.Categorical),
    penguins["island"].cast(pl.Categorical),
    penguins["sex"].cast(pl.Categorical)
)
model_penguins = smf.ols(
    "body_mass_g ~ flipper_length_mm * species + I(flipper_length_mm**2) * species + bill_length_mm",
    data = penguins.to_pandas()
).fit()

model_penguins.summary()
Dep. Variable: | body_mass_g | R-squared: | 0.830 |
Model: | OLS | Adj. R-squared: | 0.825 |
Method: | Least Squares | F-statistic: | 174.7 |
Date: | Thu, 17 Jul 2025 | Prob (F-statistic): | 2.00e-118 |
Time: | 00:59:48 | Log-Likelihood: | -2405.5 |
No. Observations: | 333 | AIC: | 4831. |
Df Residuals: | 323 | BIC: | 4869. |
Df Model: | 9 |
Covariance Type: | nonrobust |
coef | std err | t | P>|t| | [0.025 | 0.975] |
Intercept | -3693.2681 | 1.58e+04 | -0.234 | 0.815 | -3.48e+04 | 2.74e+04 |
species[T.Gentoo] | -6.227e+04 | 3.52e+04 | -1.771 | 0.078 | -1.31e+05 | 6922.145 |
species[T.Chinstrap] | 3.005e+04 | 2.73e+04 | 1.099 | 0.273 | -2.38e+04 | 8.39e+04 |
flipper_length_mm | 29.0297 | 165.768 | 0.175 | 0.861 | -297.092 | 355.151 |
flipper_length_mm:species[T.Gentoo] | 561.1021 | 332.142 | 1.689 | 0.092 | -92.332 | 1214.536 |
flipper_length_mm:species[T.Chinstrap] | -311.5523 | 282.129 | -1.104 | 0.270 | -866.595 | 243.490 |
I(flipper_length_mm ** 2) | -0.0115 | 0.436 | -0.026 | 0.979 | -0.868 | 0.845 |
I(flipper_length_mm ** 2):species[T.Gentoo] | -1.2578 | 0.789 | -1.594 | 0.112 | -2.810 | 0.295 |
I(flipper_length_mm ** 2):species[T.Chinstrap] | 0.7879 | 0.727 | 1.083 | 0.279 | -0.643 | 2.219 |
bill_length_mm | 59.1176 | 7.385 | 8.006 | 0.000 | 44.590 | 73.646 |
Omnibus: | 5.547 | Durbin-Watson: | 2.272 |
Prob(Omnibus): | 0.062 | Jarque-Bera (JB): | 5.494 |
Skew: | 0.314 | Prob(JB): | 0.0641 |
Kurtosis: | 3.018 | Cond. No. | 9.57e+07 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 9.57e+07. This might indicate that there are
strong multicollinearity or other numerical problems.
Look at the raw coefficients and try to interpret what happens to body mass as flipper length increases by one mm (and despair)
Answer these questions
Plot the predicted values of body mass across the full range of flipper lengths and across the three species
- Quantity: Predictions
- Grid: Predictions for each row
- Aggregation: Display as smoothed line
plot_predictions(
    model_penguins,
    condition = ["flipper_length_mm", "species"]
)
On average, how does body mass increase in relation to small increases in flipper length across species?
- Quantity: Comparison—a one-mm increase in flipper length (slope)
- Grid: Predictions for six synthetic values (short and long flippers × three species), with all other values held at the mean
- Aggregation: Report values for predictions; no collapsing
comparisons(
    model_penguins,
    newdata = datagrid(flipper_length_mm = [190, 220], species = ["Adelie", "Chinstrap", "Gentoo"]),
    by = ["flipper_length_mm", "species"],
    variables = "flipper_length_mm"
)
flipper_length_mm | species | term | contrast | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|---|---|---|
f64 | enum | str | str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
190.0 | "Adelie" | "flipper_length_mm" | "+1" | 24.677659 | 4.411169 | 5.594358 | 2.2144e-8 | 25.428511 | 16.031926 | 33.323391 |
190.0 | "Chinstrap" | "flipper_length_mm" | "+1" | 12.527043 | 9.015424 | 1.389512 | 0.164677 | 2.602289 | -5.142863 | 30.196949 |
190.0 | "Gentoo" | "flipper_length_mm" | "+1" | 107.812769 | 37.803434 | 2.851931 | 0.004345 | 7.846276 | 33.719399 | 181.906138 |
220.0 | "Adelie" | "flipper_length_mm" | "+1" | 23.990488 | 26.28641 | 0.912657 | 0.361423 | 1.468241 | -27.529928 | 75.510905 |
220.0 | "Chinstrap" | "flipper_length_mm" | "+1" | 59.11383 | 28.892588 | 2.045986 | 0.040758 | 4.616782 | 2.485398 | 115.742262 |
220.0 | "Gentoo" | "flipper_length_mm" | "+1" | 31.657121 | 5.743241 | 5.512066 | 3.5465e-8 | 24.749046 | 20.400576 | 42.913666 |
Is there a significant difference in the average effect of flipper length between Adelies and Chinstraps?
- Quantity: Comparison—a one-mm increase in flipper length (slope)
- Grid: Predictions for two synthetic values (two species), with all other values held at the mean
- Aggregation: Report values for predictions; no collapsing
- Test: Difference in means
avg_comparisons(
    model_penguins,
    newdata = datagrid(species = ["Adelie", "Chinstrap"]),
    by = ["flipper_length_mm", "species"],
    variables = "flipper_length_mm",
    hypothesis = "difference ~ pairwise"
)
term | estimate | std_error | statistic | p_value | s_value | conf_low | conf_high |
---|---|---|---|---|---|---|---|
str | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"(Chinstrap) - (Adelie)" | 5.131115 | 13.134894 | 0.390648 | 0.696058 | 0.522721 | -20.612804 | 30.875034 |