```
::install_github('tsostarics/contrastable')
devtools::install_github('tsostarics/contrastable') remotes
```

I’ve been working on a package called `contrastable`

on and off for the past year or so. The package’s goal is to provide a tidy approach to setting factor contrasts for regression analysis. While this can be done with repeated `contrasts<-`

calls, this workflow is tedious when working with multiple factors and especially error-prone when manually specifying contrast matrices to use. In this latter case, the user would need to be careful to specify the correct fractions in the correct order with the correct signs, which can be a lot to keep track of. These issues quickly become apparent when the number of factor levels is greater than 2. In this post I will:

- Run through an example of a typical contrast coding workflow using
`contrasts<-`

. I will give an example of an error that can arise due to a typo, and show how to diagnose what this error actually reflects by checking the hypothesis matrix. - Show how the
`contrastable`

package can be used to sidestep mistakes caused by error-prone and tedious calls to`contrasts<-`

. - Briefly link to other packages on contrasts and level comparisons.

You can find the `contrastable`

package on my Github at this repo: tsostarics/contrastable You can install it like so:

## Contrasts overview

Contrast coding refers to assigning numeric values to levels of a categorical variable for use in regression analyses. Depending on the numbers used, different comparisons can be made between the group means of a variable. These comparisons can correspond to particular null hypotheses that a researcher might have, and particular combinations of numbers can encode high-level questions like “Are there differences between levels when compared to a common reference level?” or “Does each level differ from the levels that came before it?” Critically, the contrasts used don’t impact the model fit but do impact the coefficient estimates that are used to make inferences about the data. For example, you might conclude that there’s an overall effect of some factor when in reality the effect (shown by the coefficient estimate) is an effect that only holds for one particular group!

Consider an example where you have two groups of listeners where English is their native (L1) or non-native (L2) language. You might be interested in whether reading times are slower or faster in two different syntactic conditions, such as active vs passive constructions. Two possible research questions might be whether there’s an *main* effect of syntax on reading times on the one hand or whether there’s a *simple* effect such that reading times in the passive construction are only slower for L2 speakers.^{1} These are similar, but different, research questions and more importantly, the interpretation of one coefficient depends on how *other* variables are coded. Many researchers realize (or are starting to at least) that the default^{2} “0/1 contrasts” (aka treatment or dummy coding) will only give them the simple effect of structure, but if what you’re interested in is that *main* effect, then your statistics has not yet answered your question! To rectify this, researchers will opt for “the +.5/-.5 contrasts” to obtain main effects.

The name for this contrast scheme is *not consistent*, especially in the 2-level case where the values are +.5/-.5. I’ve seen it called sum coding, simple coding, effects coding, scaled sum coding, helmert coding, difference coding, contrast coding, sum-to-zero coding, and +.5/-.5 coding. See Brehm and Alday (2022) for concerns about transparent descriptions of contrast coding schemes.

While researchers may differ on what they call +.5/-.5, for 2 levels the result is nonetheless the same. But, a researcher using “helmert coding” and a researcher using “scaled sum coding” for a factor with 3 or more levels will be using very different contrast matrices, and thus address very different research questions about the data. Let’s use some functions from `contrastable`

to look at how these contrast matrices differ.

`library(contrastable)`

`helmert_code(2) |> MASS::fractions()`

```
[,1]
[1,] 1/2
[2,] -1/2
```

`scaled_sum_code(2) |> MASS::fractions()`

```
[,1]
1 1/2
2 -1/2
```

`helmert_code(3) |> MASS::fractions()`

```
[,1] [,2]
[1,] 2/3 0
[2,] -1/3 1/2
[3,] -1/3 -1/2
```

`scaled_sum_code(3) |> MASS::fractions()`

```
[,1] [,2]
1 2/3 -1/3
2 -1/3 2/3
3 -1/3 -1/3
```

`helmert_code(5) |> MASS::fractions()`

```
[,1] [,2] [,3] [,4]
[1,] 4/5 0 0 0
[2,] -1/5 3/4 0 0
[3,] -1/5 -1/4 2/3 0
[4,] -1/5 -1/4 -1/3 1/2
[5,] -1/5 -1/4 -1/3 -1/2
```

`scaled_sum_code(5) |> MASS::fractions()`

```
[,1] [,2] [,3] [,4]
1 4/5 -1/5 -1/5 -1/5
2 -1/5 4/5 -1/5 -1/5
3 -1/5 -1/5 4/5 -1/5
4 -1/5 -1/5 -1/5 4/5
5 -1/5 -1/5 -1/5 -1/5
```

Note that I use the term *scaled sum coding* for the “pairwise comparisons with main effects” contrast scheme. I opt for this term for three reasons.

- First, I see sum coding used more frequently in statistics and econometrics to refer to +1/-1; this is also what
`contr.sum`

in R returns. - Second, the salient part of going from sum coding to scaled sum coding, especially in the 2-level case, is that there’s some kind of division or scaling operation involved; I frequently see people use
`contr.sum(2)/2`

, although importantly`contr.sum(3)/3`

does not yield the expected result. - Third, “simple” coding is counterintuitive to me since we’re trying to
*avoid*“simple effects;” “effects coding” and “contrast coding” are largely meaningless as all coding schemes will encode some kind of effect, and setting*any*contrast matrix is an instance of contrast coding.

So, for the researcher trying to remember “I need to use *those* contrasts where they’re divided to get the main effects”, it (to me) seems easy to reach for a tool where *scaled* is in the name and is clearly distinguished from sum coding. ^{3}

## Typical approach to contrast coding

Typically when I see people in Linguistics set contrasts, they do something like the following, using the `palmerpenguins`

dataset as an example.

## Code

```
library(dplyr) # Data wrangling
library(purrr) # Mapping functions
library(palmerpenguins) # Dataset
<- penguins
penguins_with_contrasts
# Default treatment/dummy coding for a 2 and 3 level factor
contrasts(penguins_with_contrasts$sex)
```

```
male
female 0
male 1
```

## Code

`contrasts(penguins_with_contrasts$species)`

```
Chinstrap Gentoo
Adelie 0 0
Chinstrap 1 0
Gentoo 0 1
```

## Code

```
# Easy enough for 2 levels, -contr.sum(2)/2 is also used a lot
contrasts(penguins_with_contrasts$sex) <- c(-.5, .5)
# Not so fun for three levels!
contrasts(penguins_with_contrasts$species) <- matrix(c(-1/3, 2/3, -1/3,
-1/3, -1/3, 2/3),
nrow = 3)
```

The chance of making a mistake increases when including more and more categorical variables. Catching these mistakes can be very difficult, in part because this workflow erases the labels in the regression output. This means you have to keep track of what `1`

and `2`

in the regression coefficients correspond to.

While the `dimnames`

argument can be used to set the labels, anecdotally I rarely see people use this in their analyses when perusing code on the osf. Winter (2019, 127) notes that “Using the ‘1’ after the predictor name is a notational convention for representing the slopes of sum-coded predictors in R” but this is slightly incorrect; in the absence of `dimnames`

being set, R will use the numeric indices of the contrast matrix’s columns (no matter what the scheme is).

Below, the two sets of coefficients represent pairwise comparisons to the `Adelie`

baseline, but the intercepts differ due to how the contrasts are set, with the first using treatment coding and the second using scaled sum coding. I’ll start with a case that only considers the categorical variable, but will include an additional continuous independent variable later on.

## Code

```
# Compare the default treatment coding with the penguins dataset
# with the contrasts we specified in penguins_with_contrasts
<- coef(lm(bill_length_mm ~ species,
treatment_coefs data = penguins))
<- coef(lm(bill_length_mm ~ species,
scaledsum_coefs data = penguins_with_contrasts))
# I'm using list() to print and caption results side by side, purely aesthetic
list("(Default) Treatment Coding" = treatment_coefs,
"(Manual) Scaled Sum Coding" = scaledsum_coefs)
```

```
$`(Default) Treatment Coding`
(Intercept) speciesChinstrap speciesGentoo
38.791391 10.042433 8.713487
$`(Manual) Scaled Sum Coding`
(Intercept) species1 species2
45.043364 10.042433 8.713487
```

The model coefficients for the scaled sum coding shows the same pairwise comparisons as the model using treatment coding, but the intercepts differ. We can check what they correspond to manually:

## Code

```
<-
group_means |>
penguins ::group_by(species) |>
dplyr::summarize(mean_length = mean(bill_length_mm, na.rm = TRUE)) |>
dplyr::pluck('mean_length') |>
purrr`names<-`(c('Adelie', 'Chinstrap', 'Gentoo'))
list("Group means"= group_means,
"Grand mean" = mean(group_means))
```

```
$`Group means`
Adelie Chinstrap Gentoo
38.79139 48.83382 47.50488
$`Grand mean`
[1] 45.04336
```

So the intercept for the treatment coded model is the mean of the `Adelie`

group while the scaled sum coded model is the *grand mean*, or the mean of group means. But, typing in the scaled sum contrast matrix was a bit obnoxious with all the `-1/3`

we typed. If we had made a slight mistake while typing the matrix out, what would have happened to our model? Would our coefficients reflect the averages and differences we were expecting? As an example, let’s see what happens when we change a `2/3`

to `1/3`

:

## Code

```
# What if we accidentally typed 1/3 instead of 2/3?
contrasts(penguins_with_contrasts$species) <- matrix(c(-1/3, 1/3, -1/3,
-1/3, -1/3, 2/3),
nrow = 3)
<- coef(lm(bill_length_mm ~ species,
mistake_coefs data = penguins_with_contrasts))
list("(Current) Mistaken Scaled Sum Coding:" = mistake_coefs,
"(Previous) Correct Scaled Sum Coding:" = scaledsum_coefs)
```

```
$`(Current) Mistaken Scaled Sum Coding:`
(Intercept) species1 species2
46.717103 15.063649 8.713487
$`(Previous) Correct Scaled Sum Coding:`
(Intercept) species1 species2
45.043364 10.042433 8.713487
```

Here we can see that the intercept and the value for `species1`

have increased in magnitude. In particular, the new reported effect of `species1`

is much larger than it previously was. If we stopped at this point, we would conclude that the difference in bill length between the Chinstrap and Adelie groups is a whopping 15mm (remember we originally calculated it to be about 10). If we were interested in whether there was a positive or negative difference that was significant or not, we’d still make that conclusion, but any claims about the magnitude of the effect would be misguided. This problem opens up a related question though: What does this new inflated-in-magitude coefficient estimate represent?

### Diagnosing our mistake

To check what these numbers correspond to, we have to check the *hypothesis matrix* that corresponds to our *contrast matrix*. The process of obtaining the hypothesis matrix has been referred to as finding the generalized inverse of the contrast matrix (see Schad et al. 2020 for details).

## Code

```
matrix(c(1, 1, 1, # Add a column of 1s for the intercept
-1/3, 1/3, -1/3,
-1/3, -1/3, 2/3),
nrow = 3,
dimnames = list(NULL, c('Intercept', 'species1', 'species2'))) |>
t() |>
solve() |>
::fractions() # This function just shows numbers as fractions MASS
```

```
Intercept species1 species2
[1,] 1/6 -3/2 -1
[2,] 1/2 3/2 0
[3,] 1/3 0 1
```

Here the intercept is represented by the weighted sum of each group mean, where the weights are shown in the intercept column. In most cases, the intercept should reflect the grand mean, or the mean of the group means, and so would usually have equal weights (i.e., `1/3`

here) for the levels. In this case, we see the fractional weights are not the same. We can verify this by calculating the weighted mean ourselves:

## Code

```
list("Grand Mean" = mean(group_means),
"Weighted mean" = weighted.mean(group_means, c(1/6, 1/2, 1/3)))
```

```
$`Grand Mean`
[1] 45.04336
$`Weighted mean`
[1] 46.7171
```

Similarly, the coefficient for `species1`

shows the difference between the group means of levels 1 and 2 (i.e., mean of Chinstrap - mean of Adelie) but times a factor of `3/2`

. Crucially, if our goal is to evaluate the difference between the means of these two levels, then our mistake in coding the hypothesis matrix will give us a larger estimate (~15 vs 10). Consider a similar setup where the larger estimate was 5 instead of 0; if we were relying on null hypothesis testing it’s possible we’d get a significant effect when really we shouldn’t have.

## Code

```
list("Mistaken Scaled Sum Coding" = mistake_coefs,
"Correct Scaled Sum Coding" = scaledsum_coefs,
"Computed Chinstrap-Adelie Difference with 3/2 scaling" =
3/2 * group_means[['Chinstrap']]) - (3/2 * group_means[['Adelie']]),
("Actual Chinstrap-Adelie Difference" =
'Chinstrap']] - group_means[['Adelie']]) group_means[[
```

```
$`Mistaken Scaled Sum Coding`
(Intercept) species1 species2
46.717103 15.063649 8.713487
$`Correct Scaled Sum Coding`
(Intercept) species1 species2
45.043364 10.042433 8.713487
$`Computed Chinstrap-Adelie Difference with 3/2 scaling`
[1] 15.06365
$`Actual Chinstrap-Adelie Difference`
[1] 10.04243
```

Point being: we made an honest mistake of typing `1/3`

instead of `2/3`

but this had ramifications for the coefficients in our model output that we use to make inferences. In practice, because we did the multiple `contrasts<-`

calls, we would likely assume that what we did was correct in the absence of any errors.

## Tidy approach to contrasts

Here I’ll show a different approach using the `contrastable`

package. This package takes a tidy approach to take care of the overhead of labels and reference levels involved when using common contrast coding schemes. Specifically, this package provides a series of functions that use a special formula implementation that assigns specific meanings to each operator. The left hand side of the formula is the factor column whose contrasts you want to change. The right hand side consists of (at minimum) a function to generate contrast matrices such as `contr.treatment`

or `treatment_code`

. Additional operators provide extra optional functionality:

`+ x`

: Set reference level to level`x`

`* x`

: Set intercept to be the mean of`x`

`- 3:4`

: For polynomial contrasts only, drop trends`3`

and`4`

`| c("A-B", "A-C")`

: Set the comparison labels to`A-B`

and`A-C`

(must be the last operator if used)

Recall that in many cases researchers want pairwise comparisons while retaining main effects, and so the choice of reference level for the comparisons is very important. By default, R uses the first level alphabetically as the reference level, but sometimes we want to change this manually ourselves. Here’s an example where we set the `sex`

and `species`

factors to the two contrast schemes we manually set before. The `set_contrasts`

function will show a message if it detects additional factor variables in the dataframe that the user did not provide contrasts for.

## Code

```
# library(contrastable) was loaded earlier
<-
penguins_df |>
penguins set_contrasts(sex ~ scaled_sum_code + "male", # Set reference level with +
~ scaled_sum_code + 'Adelie') species
```

`Expect contr.treatment or contr.poly for unset factors: island`

## Code

`contrasts(penguins_df$species) |> MASS::fractions()`

```
Chinstrap Gentoo
Adelie -1/3 -1/3
Chinstrap 2/3 -1/3
Gentoo -1/3 2/3
```

## Code

`contrasts(penguins_df$sex) |> MASS::fractions()`

```
female
female 1/2
male -1/2
```

`penguins_df`

now has its contrasts set, and we can run our model as usual. Note that we didn’t have to type out any matrices ourselves, but we got the correct contrasts that we needed.

## Code

`coef(lm(bill_length_mm ~ species + bill_depth_mm, data = penguins_df))`

```
(Intercept) speciesChinstrap speciesGentoo bill_depth_mm
20.997115 9.938955 13.403279 1.394011
```

If we wanted to change the labels to better reflect the comparisons being made, we could do that in the formula too with the `|`

operator.

## Code

```
<-
penguins_df |>
penguins_df set_contrasts(species ~ scaled_sum_code + 'Adelie' |
c('Chinstrap-Ad', 'Gentoo-Ad'))
coef(lm(bill_length_mm ~ species, data = penguins_df))
```

```
(Intercept) speciesChinstrap-Ad speciesGentoo-Ad
45.043364 10.042433 8.713487
```

### Additional functions

Typically when I use this package in my analyses the `set_contrasts`

function is all I really need, but there are other functions that follow the same syntax that provide other information. To avoid retyping things, I’ll usually keep the contrasts in a list assigned to a separate variable and pass that to functions.

The `glimpse_contrasts`

function can show information about the factors in a dataset along with the contrast schemes that have been assigned to each factor.

## Code

```
<-
my_contrasts list(
~ scaled_sum_code + 'female',
sex ~ helmert_code
species
)
glimpse_contrasts(penguins_df, my_contrasts) |> gt::gt()
```

factor | n_levels | level_names | scheme | reference | intercept | orthogonal | centered | dropped_trends | explicitly_set |
---|---|---|---|---|---|---|---|---|---|

sex | 2 | female, male | scaled_sum_code | female | grand mean | NA | TRUE | NA | TRUE |

species | 3 | Adelie, Chinstrap, Gentoo | helmert_code | Gentoo | grand mean | TRUE | TRUE | NA | TRUE |

island | 3 | Biscoe, Dream, Torgersen | contr.treatment | Biscoe | mean(Biscoe) | FALSE | FALSE | NA | FALSE |

The `enlist_contrasts`

function does the same thing as `set_contrasts`

, but returns a list of contrast matrices that can be used in the `contrasts`

argument of some model-fitting functions.^{4} It also provides an easy way to show the contrast matrices in an appendix or supplementary material.

## Code

`enlist_contrasts(penguins_df, my_contrasts) |> purrr::map(MASS::fractions)`

`Expect contr.treatment or contr.poly for unset factors: island`

```
$sex
male
female -1/2
male 1/2
$species
>Adelie >Chinstrap
Adelie 2/3 0
Chinstrap -1/3 1/2
Gentoo -1/3 -1/2
```

### Available contrast schemes

Here are the different contrast functions this package currently provides.

## Code

```
# = contr.treatment
treatment_code(5) |> MASS::fractions()
```

```
2 3 4 5
1 0 0 0 0
2 1 0 0 0
3 0 1 0 0
4 0 0 1 0
5 0 0 0 1
```

## Code

```
# = contr.sum
sum_code(5) |> MASS::fractions()
```

```
[,1] [,2] [,3] [,4]
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
5 -1 -1 -1 -1
```

## Code

```
# = contr.sum
scaled_sum_code(5) |> MASS::fractions()
```

```
[,1] [,2] [,3] [,4]
1 4/5 -1/5 -1/5 -1/5
2 -1/5 4/5 -1/5 -1/5
3 -1/5 -1/5 4/5 -1/5
4 -1/5 -1/5 -1/5 4/5
5 -1/5 -1/5 -1/5 -1/5
```

## Code

```
# NOT = contr.helmert, which is unscaled
helmert_code(5) |> MASS::fractions()
```

```
[,1] [,2] [,3] [,4]
[1,] 4/5 0 0 0
[2,] -1/5 3/4 0 0
[3,] -1/5 -1/4 2/3 0
[4,] -1/5 -1/4 -1/3 1/2
[5,] -1/5 -1/4 -1/3 -1/2
```

## Code

`reverse_helmert_code(5) |> MASS::fractions()`

```
[,1] [,2] [,3] [,4]
[1,] -1/2 -1/3 -1/4 -1/5
[2,] 1/2 -1/3 -1/4 -1/5
[3,] 0 2/3 -1/4 -1/5
[4,] 0 0 3/4 -1/5
[5,] 0 0 0 4/5
```

## Code

`forward_difference_code(5) |> MASS::fractions()`

```
[,1] [,2] [,3] [,4]
[1,] 4/5 3/5 2/5 1/5
[2,] -1/5 3/5 2/5 1/5
[3,] -1/5 -2/5 2/5 1/5
[4,] -1/5 -2/5 -3/5 1/5
[5,] -1/5 -2/5 -3/5 -4/5
```

## Code

`backward_difference_code(5) |> MASS::fractions()`

```
[,1] [,2] [,3] [,4]
[1,] -4/5 -3/5 -2/5 -1/5
[2,] 1/5 -3/5 -2/5 -1/5
[3,] 1/5 2/5 -2/5 -1/5
[4,] 1/5 2/5 3/5 -1/5
[5,] 1/5 2/5 3/5 4/5
```

## Code

```
# = contr.poly, poly(1:n, degree = n-1, raw = FALSE)
orth_polynomial_code(5) |> MASS::fractions()
```

```
.L .Q .C ^4
[1,] -265/419 929/1738 -2026009/6406803 4018/33617
[2,] -2026009/6406803 -809/3027 191/302 -246481/515552
[3,] 0 -6263/11717 0 1042/1453
[4,] 12484830/39480499 -809/3027 -265/419 -246481/515552
[5,] 191/302 929/1738 10458821/33073696 4018/33617
```

## Code

```
# = poly(1:n, degree = n-1, raw = TRUE)
raw_polynomial_code(5) |> MASS::fractions()
```

```
1 2 3 4
[1,] 1 1 1 1
[2,] 2 4 8 16
[3,] 3 9 27 81
[4,] 4 16 64 256
[5,] 5 25 125 625
```

## Other packages and resources

This package is not the first package made for contrast coding, though to my knowledge it is the first to take a “tidy” approach to it.

The `hypr`

package (Rabe et al. 2020) takes a different approach, where the focus is on considering the hypothesis matrix and declaring *specifically* which comparisons you want to make, then the package can provide a corresponding matrix. I like `hypr`

a lot actually, but I find it a bit tedious when I know what the contrast matrix should look like but I have to type out the comparisons; still better than `matrix`

calls though.

The `emmeans`

package (Lenth 2022) is extremely useful for making pairwise comparisons, but is capable of a lot more as well. You can see its vignette on contrasts here.

The `multcomp`

package (Hothorn, Bretz, and Westfall 2008) is useful for simultaneous inference, which seeks to extend workflows for multiple comparisons.

I haven’t used the `contrasts`

package (O’Callaghan 2021) very much, but judging from its vignette here it seems like it extends the `rms`

package’s `contrast`

function (Harrell Jr 2022). It seems useful for calculating different comparisons after a model is run, but its usage isn’t very transparent to me on first glance.

While not a package, this page from UCLA pops up a lot when people discuss contrast coding. It’s very useful, and I used it as a starting point for implementing different contrast functions. However, I will note that I don’t follow its naming conventions.

## Code

`sessionInfo()`

```
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] palmerpenguins_0.1.1 purrr_1.0.0 dplyr_1.1.0
[4] contrastable_0.1.0
loaded via a namespace (and not attached):
[1] pillar_1.8.1 compiler_4.2.2 tools_4.2.2 digest_0.6.31
[5] jsonlite_1.8.4 evaluate_0.20 lifecycle_1.0.3 tibble_3.1.8
[9] gtable_0.3.1 pkgconfig_2.0.3 rlang_1.0.6 cli_3.5.0
[13] rstudioapi_0.14 yaml_2.3.7 xfun_0.37 fastmap_1.1.0
[17] withr_2.5.0 stringr_1.5.0 knitr_1.42 sass_0.4.5
[21] generics_0.1.3 vctrs_0.5.2 htmlwidgets_1.6.1 grid_4.2.2
[25] tidyselect_1.2.0 glue_1.6.2 R6_2.5.1 fansi_1.0.3
[29] rmarkdown_2.20 ggplot2_3.4.0 magrittr_2.0.3 scales_1.2.1
[33] htmltools_0.5.4 MASS_7.3-58.1 gt_0.8.0 colorspace_2.0-3
[37] utf8_1.2.2 stringi_1.7.8 munsell_0.5.0 crayon_1.5.2
```

## References

*Journal of Memory and Language*125: 104334. https://doi.org/10.1016/j.jml.2022.104334.

*Rms: Regression Modeling Strategies*. https://CRAN.R-project.org/package=rms.

*Biometrical Journal*50 (3): 346–63. https://cran.r-project.org/package=multcomp.

*Emmeans: Estimated Marginal Means, Aka Least-Squares Means*. https://CRAN.R-project.org/package=emmeans.

*Contrast: A Collection of Contrast Methods*. https://CRAN.R-project.org/package=contrast.

*Journal of Open Source Software*5 (48): 2134. https://doi.org/10.21105/joss.02134.

*Journal of Memory and Language*110: 104038. https://doi.org/10.1016/j.jml.2019.104038.

*Statistics for Linguists: An Introduction Using r*. Routledge.

## Footnotes

One might also see main effects referred to as

*marginal effects*and simple effects referred to as*conditional effects*. For the latter, the idea is that, for example, the “effect of syntax” is the effect of syntax*given*the reference level of the other categorical predictor. The main effect of syntax would be the average of the syntax effect for the L1 group and the syntax effort for the L2 group.↩︎You can view your default contrasts using

`options('contrasts')`

, where the unordered default is typically`contr.treatment`

and the ordered default is typically`contr.poly`

. I do not recommend changing these defaults because even though you may save some time for yourself, if you forget to explicitly mention that you changed the defaults then your analyses won’t be reproducible by the vast majority of others who have not changed the deafults.↩︎If it were really up to me, I would call it “centered pairwise coding” since you get pairwise comparisons with the intercept centered on the grand mean, but it doesn’t quite roll off the tongue and it’s a bit wordy to type out.↩︎

Not all modeling functions use the

`contrasts`

argument,`brms::brm`

is one example. In these cases you*must*set the contrast matrix to the factor, hence`set_contrasts`

is typically more useful for the actual modeling part while`enlist_contrasts`

is useful for showing matrices.↩︎

## Citation

```
@online{sostarics2022,
author = {Sostarics, Thomas},
title = {Contrastable},
date = {2022-07-13},
url = {https://tsostaricsblog.netlify.app/posts/contrastable/},
langid = {en}
}
```