To investigate the effects of a novel drug candidate, we conducted an in vivo experiment; a basic study in immunology. The raw data, collected from animals treated with different doses of the drug, underwent quality control procedures to ensure accuracy and consistency. Following some preanalysis to get a flavor, we proceeded with statistical analysis based on key assumptions and data transformation. A mixed ANOVA model was employed to assess the impact of the drug dose on the observed outcomes, while accounting for any potential random effects present in the data.
In this study, animals have been treated with different doses of the drug candidate:
group | dose (µg) |
---|---|
A | untreated |
B | 1.6 |
C | 8 |
D | 40 |
E | 200 |
At termination the scientists recovered the lymphoid organs, i.e., the spleen, encoded sp
in the rest of the study, and lymph nodes (ln
), and prepared single-cell suspensions for an ex vivo restimulation with:
medium
(negative/baseline control),PC
(positive control),CMP1
, andCMP2
.Finally, IFN-$\gamma$ release was measured by ELISA.
The plates were prepared in duplicates for lymph node samples, and triplicates for splenocytes. For some data points, different dilution factors were applied, e.g., 1/20, 1/100 and 1/200. Viability and cell count was measured. Additional flow cytometry was performed on the splenocytes only to measure viability, %CD4+, %CD8+, %CD19+ and %HLA-DR+.
The dataset was totally pseudonymised for this educational case study.
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import pingouin as pg
# delete boring Future warnings from pingouin
import warnings
warnings.simplefilter(action='ignore')
For reproduciblity, it's a good practice to list the version of the most sensive packages/libraries.
import sys
print("Python:", sys.version)
print("numpy:", np.__version__)
print("pandas:", pd.__version__)
print("seaborn:", sns.__version__)
Python: 3.12.2 (tags/v3.12.2:6abddd9, Feb 6 2024, 21:26:36) [MSC v.1937 64 bit (AMD64)] numpy: 1.26.4 pandas: 2.2.1 seaborn: 0.13.2
!pip show pingouin
Name: pingouin Version: 0.5.3 Summary: Pingouin: statistical package for Python Home-page: https://pingouin-stats.org/index.html Author: Raphael Vallat Author-email: raphaelvallat9@gmail.com License: GPL-3.0 Location: C:\Users\Sébastien\Documents\second_scight\website\analysis_animal_exp\.env\Lib\site-packages Requires: matplotlib, numpy, outdated, pandas, pandas-flavor, scikit-learn, scipy, seaborn, statsmodels, tabulate Required-by:
To improve readability, we're only showing a limited number of rows from the DataFrame in this notebook.
# Set the maximum number of rows to display
pd.set_option('display.max_rows', 15)
To begin our analysis, we'll retrieve the data from an Excel file containing the information entered by the researchers
# import the raw data
data = pd.read_csv('./ifng_data.csv', sep=";", na_values='na')
To gain deeper insights from the data, we will undertake a process of data organization. This process involves mapping individual data points to groups with shared characteristics. By doing so, we can identify patterns and trends more effectively. Additionally, we will assign labels to each group to streamline the analysis in later statistical analysis stages.
# mapping the dose to the group label
map_groups = {
'a': 0,
'b': 1.6,
'c': 8,
'd': 40,
'e': 200,
}
data['dose_µg'] = data['group'].map(map_groups)
# creating a new variable based on the group id and animal number
data['animal_id'] = data['group'] + '|' + data['group_#'].apply(str)
data = data.drop(columns=['group', 'group_#'])
# pseudonymisation
stim_map = {
'aaaa': 'CMP1',
'bbbb': 'CMP2',
'cccc': 'PC',
'medium': 'medium',
}
data['stim'] = data['stim'].map(stim_map)
# we also encode the stimulation condition for the ANOVA
data['stim_id'] = data['stim'] + '|' + data['conc_µg_ml-1'].apply(str)
data
sample | stim | conc_µg_ml-1 | ifng_pg_ml-1 | dose_µg | animal_id | stim_id | |
---|---|---|---|---|---|---|---|
0 | sp | medium | 0.0 | 59.405353 | 0.0 | a|1 | medium|0.0 |
1 | sp | CMP1 | 10.0 | 5671.825154 | 0.0 | a|1 | CMP1|10.0 |
2 | sp | CMP1 | 1.0 | 2680.813757 | 0.0 | a|1 | CMP1|1.0 |
3 | sp | CMP1 | 0.1 | 795.581544 | 0.0 | a|1 | CMP1|0.1 |
4 | sp | CMP2 | 10.0 | 27495.080920 | 0.0 | a|1 | CMP2|10.0 |
... | ... | ... | ... | ... | ... | ... | ... |
651 | ln | CMP1 | 0.1 | 288.879208 | 200.0 | e|9 | CMP1|0.1 |
652 | ln | CMP2 | 10.0 | 3699.793797 | 200.0 | e|9 | CMP2|10.0 |
653 | ln | CMP2 | 1.0 | 1954.964786 | 200.0 | e|9 | CMP2|1.0 |
654 | ln | CMP2 | 0.1 | 282.778393 | 200.0 | e|9 | CMP2|0.1 |
655 | ln | PC | NaN | 16477.947980 | 200.0 | e|9 | PC|nan |
656 rows × 7 columns
Due to known issues with lymphocyte responsiveness from animal 'A5', its corresponding ln
data points have been excluded from the original data file, so that no value was recorded. We decide to exclude this animal from the whole study.
ix = data['animal_id'] == 'a|5'
data[ix]
sample | stim | conc_µg_ml-1 | ifng_pg_ml-1 | dose_µg | animal_id | stim_id | |
---|---|---|---|---|---|---|---|
32 | sp | medium | 0.0 | 21.736263 | 0.0 | a|5 | medium|0.0 |
33 | sp | CMP1 | 10.0 | 148.652715 | 0.0 | a|5 | CMP1|10.0 |
34 | sp | CMP1 | 1.0 | 126.871397 | 0.0 | a|5 | CMP1|1.0 |
35 | sp | CMP1 | 0.1 | 113.452095 | 0.0 | a|5 | CMP1|0.1 |
36 | sp | CMP2 | 10.0 | 582.270884 | 0.0 | a|5 | CMP2|10.0 |
... | ... | ... | ... | ... | ... | ... | ... |
99 | ln | CMP1 | 0.1 | NaN | 0.0 | a|5 | CMP1|0.1 |
100 | ln | CMP2 | 10.0 | NaN | 0.0 | a|5 | CMP2|10.0 |
101 | ln | CMP2 | 1.0 | NaN | 0.0 | a|5 | CMP2|1.0 |
102 | ln | CMP2 | 0.1 | NaN | 0.0 | a|5 | CMP2|0.1 |
103 | ln | PC | NaN | NaN | 0.0 | a|5 | PC|nan |
16 rows × 7 columns
# drop animal A5 from the whole analysis
data.drop(data[ix].index, inplace=True)
Finally we check the cleaned dataset. While we identified some NaN values for the restimulation concentration conc_µg_ml-1
, these specifically correspond to the concentration of positive control PC
. Since these missing values are expected, we've opted to keep them in the data.
data.info()
<class 'pandas.core.frame.DataFrame'> Index: 640 entries, 0 to 655 Data columns (total 7 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 sample 640 non-null object 1 stim 640 non-null object 2 conc_µg_ml-1 560 non-null float64 3 ifng_pg_ml-1 640 non-null float64 4 dose_µg 640 non-null float64 5 animal_id 640 non-null object 6 stim_id 640 non-null object dtypes: float64(3), object(4) memory usage: 40.0+ KB
data[data['conc_µg_ml-1'].isna()]
sample | stim | conc_µg_ml-1 | ifng_pg_ml-1 | dose_µg | animal_id | stim_id | |
---|---|---|---|---|---|---|---|
7 | sp | PC | NaN | 2357.286815 | 0.0 | a|1 | PC|nan |
15 | sp | PC | NaN | 4831.622047 | 0.0 | a|2 | PC|nan |
23 | sp | PC | NaN | 7404.615045 | 0.0 | a|3 | PC|nan |
31 | sp | PC | NaN | 6741.175581 | 0.0 | a|4 | PC|nan |
47 | sp | PC | NaN | 7632.145795 | 0.0 | a|6 | PC|nan |
... | ... | ... | ... | ... | ... | ... | ... |
623 | ln | PC | NaN | 14867.005840 | 200.0 | e|5 | PC|nan |
631 | ln | PC | NaN | 890.046823 | 200.0 | e|6 | PC|nan |
639 | ln | PC | NaN | 9169.993506 | 200.0 | e|7 | PC|nan |
647 | ln | PC | NaN | 10161.485230 | 200.0 | e|8 | PC|nan |
655 | ln | PC | NaN | 16477.947980 | 200.0 | e|9 | PC|nan |
80 rows × 7 columns
To ensure data quality and identify potential issues, we conducted a thorough analysis of the positive control PC
samples.
data_PC = data[data['stim'] == 'PC']
While not intended for general use, the following function provides a tailored approach to creating box plots optimized for the visualization needs of this specific study.
def draw_boxnstripplot(
x='dose_µg',
y='ifng_pg_ml-1',
data=data,
hue='sample',
palette='Accent',
):
"""
Draw a stripplot and a customized boxplot, with hue on sample (SP vs. LN).
"""
ax = sns.boxplot(
x=x,
y=y,
data=data,
hue=hue,
showcaps=False,
showfliers=False,
boxprops=dict(linewidth=2, edgecolor='k', facecolor='white',),
whiskerprops=dict(linewidth=2, color='k',),
capprops=dict(linewidth=2, color='k',),
medianprops=dict(linewidth=2, color='darkgrey',),
)
sns.stripplot(
x=x,
y=y,
data=data,
hue=hue,
dodge=True,
palette=palette,
size=5,
alpha=0.85,
ax=ax,
)
sns.despine()
# generate a nicer legend
handles, labels = ax.get_legend_handles_labels()
n_items = len(labels)//2
plt.legend(
handles[n_items:], labels[n_items:],
fontsize=8,
title_fontsize=9,
fancybox=False,
)
We present a summary boxplot depicting the ELISA response to restimulation in the positive control group. The response is visualized at various drug concentrations, with data further categorized by lymphoid organ.
plt.figure(figsize=(6,3.5))
draw_boxnstripplot(data=data_PC, palette='Dark2')
plt.ylim((5, 2e5))
plt.xlabel("dose of drug (µg)")
plt.ylabel("[IFN-$\gamma$](pg/mL)")
plt.yscale("log")
plt.title("restimulation with Positive Control")
plt.gca().get_legend().set_title("sample")
plt.tight_layout();
We complement the boxplot with a table containing descriptive statistics, providing a detailed breakdown of the ELISA response to restimulation for all combinations of drug concentration and lymphoid organ in the positive control group. An ANOVA test and a post-hoc analysis are subsequently performed to statistically evaluate the significance of any observed differences between these groups.
data_PC.groupby(['sample','dose_µg']).describe()['ifng_pg_ml-1']
count | mean | std | min | 25% | 50% | 75% | max | ||
---|---|---|---|---|---|---|---|---|---|
sample | dose_µg | ||||||||
ln | 0.0 | 7.0 | 15701.805526 | 6739.609791 | 7496.349090 | 11174.950206 | 15170.899100 | 18821.843260 | 27251.803560 |
1.6 | 8.0 | 9536.057609 | 4977.493491 | 2711.783721 | 6669.294884 | 9774.753105 | 11924.078515 | 17960.493060 | |
8.0 | 8.0 | 9316.513183 | 6739.574502 | 1758.995678 | 5108.393259 | 8511.964890 | 10960.736607 | 23807.761460 | |
40.0 | 8.0 | 10142.228410 | 7894.074925 | 3422.978052 | 4126.810321 | 7569.684480 | 13610.679860 | 26195.707890 | |
200.0 | 9.0 | 14583.158003 | 7392.203196 | 890.046823 | 10161.485230 | 14867.005840 | 17700.191070 | 25559.543020 | |
sp | 0.0 | 7.0 | 6521.857562 | 2285.459666 | 2357.286815 | 5786.398814 | 7237.272611 | 7518.380420 | 9448.885043 |
1.6 | 8.0 | 2477.304483 | 1093.774661 | 390.289530 | 2118.675463 | 2810.950628 | 3172.089082 | 3726.386554 | |
8.0 | 8.0 | 2615.217522 | 4216.415500 | 463.763273 | 653.130086 | 1158.616933 | 1949.494963 | 12950.361210 | |
40.0 | 8.0 | 3554.253922 | 4883.213734 | 398.796385 | 774.024033 | 1927.244999 | 3383.356395 | 15098.226550 | |
200.0 | 9.0 | 3724.867523 | 2389.856297 | 641.664264 | 2640.331036 | 2940.849040 | 4762.757852 | 8277.768827 |
# 2-way ANOVA
data_PC.anova(
dv="ifng_pg_ml-1",
between=['sample', 'dose_µg'],
).round(3)
Source | SS | DF | MS | F | p-unc | np2 | |
---|---|---|---|---|---|---|---|
0 | sample | 1.318432e+09 | 1.0 | 1.318432e+09 | 45.823 | 0.000 | 0.396 |
1 | dose_µg | 3.062497e+08 | 4.0 | 7.656243e+07 | 2.661 | 0.040 | 0.132 |
2 | sample * dose_µg | 5.961853e+07 | 4.0 | 1.490463e+07 | 0.518 | 0.723 | 0.029 |
3 | Residual | 2.014061e+09 | 70.0 | 2.877230e+07 | NaN | NaN | NaN |
Our analysis reveals a significant effect of sample type on group means for the IFN-$\gamma$ response, with a P value close to zero. However, the impact of the drug dose is less evident (P value = 0.04). Although there seems to be some influence on the response to the positive control stimulation, the data suggests a possible biological plateau effect, at least in the lymph node samples. This plateau might be masking any dose-dependent variations in the IFN-$\gamma$ response.
pg.pairwise_tests(
data=data_PC,
dv="ifng_pg_ml-1",
between=['sample', 'dose_µg'],
).round(3)
Contrast | sample | A | B | Paired | Parametric | T | dof | alternative | p-unc | BF10 | hedges | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | sample | - | ln | sp | False | True | 6.573 | 78.000 | two-sided | 0.000 | 1.818e+06 | 1.456 |
1 | dose_µg | - | 0.0 | 1.6 | False | True | 2.311 | 23.776 | two-sided | 0.030 | 2.369 | 0.840 |
2 | dose_µg | - | 0.0 | 8.0 | False | True | 2.122 | 27.018 | two-sided | 0.043 | 1.774 | 0.758 |
3 | dose_µg | - | 0.0 | 40.0 | False | True | 1.669 | 27.822 | two-sided | 0.106 | 0.967 | 0.592 |
4 | dose_µg | - | 0.0 | 200.0 | False | True | 0.762 | 29.492 | two-sided | 0.452 | 0.422 | 0.260 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
26 | sample * dose_µg | sp | 1.6 | 40.0 | False | True | -0.609 | 14.000 | two-sided | 0.552 | 0.485 | -0.288 |
27 | sample * dose_µg | sp | 1.6 | 200.0 | False | True | -1.409 | 11.486 | two-sided | 0.185 | 0.808 | -0.624 |
28 | sample * dose_µg | sp | 8.0 | 40.0 | False | True | -0.412 | 14.000 | two-sided | 0.687 | 0.453 | -0.195 |
29 | sample * dose_µg | sp | 8.0 | 200.0 | False | True | -0.657 | 10.798 | two-sided | 0.525 | 0.487 | -0.313 |
30 | sample * dose_µg | sp | 40.0 | 200.0 | False | True | -0.090 | 9.905 | two-sided | 0.930 | 0.421 | -0.043 |
31 rows × 12 columns
This analysis revealed a significant influence of the drug dose. When spleen and lymph node samples are mixed together, we observed a trend towards reduced sensitivity to the positive control with increasing drug dose (the p-unc
value increases when the concentration increases in column B
). Interestingly, the IFN-$\gamma$ signal remained high across all lymph node groups, suggesting a possible ceiling effect. Note that the ELISA values are presented on a linear scale.
While this study focused on the immediate effects of the drug on IFN-$\gamma$ response, the richness of the collected data opens doors for further exploration. By combining the ELISA data with animal characteristics and the flow cytometry results, we could delve deeper into understanding the drug's potential impact on various aspects of the immune system. This section will offer a glimpse into potential future directions by examining some preliminary correlations.
Beyond the current scope, the study provides valuable data on various animal characteristics (age, sex), housing details (cage of origin), and operator information. These parameters could potentially influence the observed results and warrant further investigation in future studies.
To gain preliminary insights, we will load this dataset and explore its summary statistics. This initial analysis will provide a springboard for identifying potential correlations that warrant further investigation in a future follow-up study.
animals = pd.read_csv("./animals.csv", sep=';', comment='!')
day_exp = 44887 # day of the study
animals['age_days'] = day_exp - animals['day_of_birth'].astype(float)
animals['animal_id'] = animals['group'] + '|' + animals['group_#'].apply(str)
animals.drop(columns=['group_#', 'id', 'day_of_birth'], inplace=True)
animals.head()
group | sex | cage | termination | age_days | animal_id | |
---|---|---|---|---|---|---|
0 | a | m | 12 | ES | 59.0 | a|1 |
1 | a | m | 12 | ES | 59.0 | a|2 |
2 | a | m | 16 | SA | 94.0 | a|3 |
3 | a | m | 16 | SA | 94.0 | a|4 |
4 | a | m | 17 | SA | 63.0 | a|5 |
animals.groupby(['group', 'sex'])['age_days'].describe()
count | mean | std | min | 25% | 50% | 75% | max | ||
---|---|---|---|---|---|---|---|---|---|
group | sex | ||||||||
a | f | 1.0 | 85.000000 | NaN | 85.0 | 85.00 | 85.0 | 85.00 | 85.0 |
m | 7.0 | 71.428571 | 15.714719 | 59.0 | 61.00 | 63.0 | 81.00 | 94.0 | |
b | f | 2.0 | 85.000000 | 0.000000 | 85.0 | 85.00 | 85.0 | 85.00 | 85.0 |
m | 6.0 | 72.666667 | 16.524729 | 62.0 | 62.00 | 62.0 | 86.00 | 94.0 | |
c | f | 2.0 | 85.000000 | 0.000000 | 85.0 | 85.00 | 85.0 | 85.00 | 85.0 |
m | 6.0 | 71.500000 | 12.243366 | 62.0 | 63.00 | 68.0 | 73.75 | 94.0 | |
d | f | 2.0 | 60.500000 | 2.121320 | 59.0 | 59.75 | 60.5 | 61.25 | 62.0 |
m | 6.0 | 72.833333 | 2.401388 | 68.0 | 73.25 | 74.0 | 74.00 | 74.0 | |
e | f | 2.0 | 65.000000 | 4.242641 | 62.0 | 63.50 | 65.0 | 66.50 | 68.0 |
m | 7.0 | 69.571429 | 5.593363 | 62.0 | 65.00 | 73.0 | 74.00 | 74.0 |
plt.figure(figsize=(3,2))
sns.boxplot(
x='group',
y='age_days',
data=animals,
hue='sex',
showfliers=True,
);
# plt.tight_layout();
pg.anova(data=animals, dv='age_days', between=['sex', 'group'])
Source | SS | DF | MS | F | p-unc | np2 | |
---|---|---|---|---|---|---|---|
0 | sex | 90.280589 | 1.0 | 90.280589 | 0.729665 | 0.399545 | 0.022996 |
1 | group | 332.166006 | 4.0 | 83.041501 | 0.671157 | 0.616959 | 0.079699 |
2 | sex * group | 833.096395 | 4.0 | 208.274099 | 1.683310 | 0.178957 | 0.178443 |
3 | Residual | 3835.595238 | 31.0 | 123.728879 | NaN | NaN | NaN |
# merge this data with the full ELISA data set
data_animals = data.merge(animals, on='animal_id')
data_animals.head(3)
sample | stim | conc_µg_ml-1 | ifng_pg_ml-1 | dose_µg | animal_id | stim_id | group | sex | cage | termination | age_days | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | sp | medium | 0.0 | 59.405353 | 0.0 | a|1 | medium|0.0 | a | m | 12 | ES | 59.0 |
1 | sp | CMP1 | 10.0 | 5671.825154 | 0.0 | a|1 | CMP1|10.0 | a | m | 12 | ES | 59.0 |
2 | sp | CMP1 | 1.0 | 2680.813757 | 0.0 | a|1 | CMP1|1.0 | a | m | 12 | ES | 59.0 |
sns.relplot(
x='age_days',
y='ifng_pg_ml-1',
style='sex',
hue='conc_µg_ml-1',
size='dose_µg',
sizes=[1, 8, 40, 90, 250],
data=data_animals.fillna(10), # concentration for PC
col='stim',
row='sample',
height=2.5,
aspect=1,
legend='auto',
alpha=.75,
palette='nipy_spectral',
)
plt.yscale('log')
# plt.tight_layout();
Despite achieving a well-balanced experimental design, the ANOVA test did not detect statistically significant differences in age distribution between treatment groups. This balanced design is advantageous for our upcoming mixed ANOVA analysis, which allows us to incorporate additional independent variables like sex, age, and cage number to potentially explain any observed variations in the data.
We also have access to the corresponding flow cytometry data, which holds promise for a future, in-depth analysis. For now, let's get a glimpse of the information it contains.
cytometry = pd.read_csv("cytometry.csv", sep=';', comment='!')
cytometry['animal_id'] = cytometry['group'] + '|' + cytometry['group_#'].apply(str)
data_cytometry = data.merge(cytometry, on='animal_id')
data_cytometry.head()
sample | stim | conc_µg_ml-1 | ifng_pg_ml-1 | dose_µg | animal_id | stim_id | group | group_# | viability_pct | hladr_pct | cd19_pct | cd8_pct | cd4_pct | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | sp | medium | 0.0 | 59.405353 | 0.0 | a|1 | medium|0.0 | a | 1 | 88.9 | 23.3 | 19.3 | 9.83 | 12.0 |
1 | sp | CMP1 | 10.0 | 5671.825154 | 0.0 | a|1 | CMP1|10.0 | a | 1 | 88.9 | 23.3 | 19.3 | 9.83 | 12.0 |
2 | sp | CMP1 | 1.0 | 2680.813757 | 0.0 | a|1 | CMP1|1.0 | a | 1 | 88.9 | 23.3 | 19.3 | 9.83 | 12.0 |
3 | sp | CMP1 | 0.1 | 795.581544 | 0.0 | a|1 | CMP1|0.1 | a | 1 | 88.9 | 23.3 | 19.3 | 9.83 | 12.0 |
4 | sp | CMP2 | 10.0 | 27495.080920 | 0.0 | a|1 | CMP2|10.0 | a | 1 | 88.9 | 23.3 | 19.3 | 9.83 | 12.0 |
We can check globally the viability of the cells used in the study as another quality point.
plt.figure(figsize=(3,2))
sns.boxplot(
x='group',
y='viability_pct',
data=cytometry,
)
plt.ylim((80, 100));
We will illustrate the analysis using CD8 (top panel) and CD8 (bottom panel) T cells as an example. This methodology can be effectively replicated to analyze data for any cell type obtained in the experiment.
# for celltype in ('hladr_pct', 'cd19_pct', 'cd8_pct', 'cd4_pct'):
for celltype in ('cd8_pct', 'cd4_pct'):
sns.relplot(
x=celltype,
y='ifng_pg_ml-1',
hue='conc_µg_ml-1',
style='sample',
size='dose_µg',
sizes=[1, 6, 30, 80, 200],
data=data_cytometry.fillna(10), # fake concentration for PC
col='stim',
height=3,
aspect=.75,
legend='auto',
alpha=.70,
palette='Set1',
)
plt.yscale('log');
# filtering data either CMP1 or medium control
data_CMP1 = data[(data['stim'] == 'CMP1') | (data['conc_µg_ml-1'] == 0)]
data_CMP1.groupby(['dose_µg', 'sample', 'stim', 'conc_µg_ml-1']).describe(percentiles=[.5])['ifng_pg_ml-1']
count | mean | std | min | 50% | max | ||||
---|---|---|---|---|---|---|---|---|---|
dose_µg | sample | stim | conc_µg_ml-1 | ||||||
0.0 | ln | CMP1 | 0.1 | 7.0 | 817.140381 | 796.610850 | 75.714679 | 577.098512 | 2400.309755 |
1.0 | 7.0 | 2210.753717 | 2540.936318 | 115.467230 | 1872.883475 | 7678.569123 | |||
10.0 | 7.0 | 3771.122427 | 3054.144372 | 304.119215 | 3519.635227 | 9707.649227 | |||
medium | 0.0 | 7.0 | 34.623566 | 18.816274 | 11.949131 | 28.705061 | 72.093906 | ||
sp | CMP1 | 0.1 | 7.0 | 960.868304 | 223.934137 | 795.581544 | 917.903459 | 1456.012620 | |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
200.0 | ln | medium | 0.0 | 9.0 | 72.109845 | 55.625232 | 12.998240 | 57.308153 | 173.686927 |
sp | CMP1 | 0.1 | 9.0 | 362.844449 | 304.948962 | 50.678500 | 294.487313 | 910.011775 | |
1.0 | 9.0 | 1073.192492 | 591.718820 | 170.867993 | 898.653612 | 2132.258257 | |||
10.0 | 9.0 | 1948.155756 | 1257.182240 | 554.271580 | 1753.861420 | 3902.806026 | |||
medium | 0.0 | 9.0 | 49.876183 | 23.371521 | 22.478801 | 45.144443 | 103.555885 |
40 rows × 6 columns
plt.figure(figsize=(10,4))
for i, sampl_ in enumerate(('sp', 'ln')):
plt.subplot(1,2,i+1)
draw_boxnstripplot(
x='dose_µg',
data=data_CMP1[data_CMP1['sample'] == sampl_],
hue='conc_µg_ml-1',
palette='tab20b_r',
)
plt.yscale('linear')
# plt.ylim((0,25000))
plt.xlabel('dose of drug (µg)')
plt.ylabel('[IFN$\gamma$](pg/mL)')
plt.title(f"influence of drug dose on {sampl_} samples restim with CMP1", size=10, fontweight='bold')
plt.gca().get_legend().set_title("concentration")
plt.tight_layout();
The results suggest a potential trend: the stronger the concentration used to restimulate the cells, the higher the cytokine response. However, it's unclear if the initial drug treatment dose itself has an effect. Additionally, there seem to be a few outlier data points, particularly in the ln
samples restimulated with the highest CMP1 concentrations. Let's now focus on the data from cells restimulated with CMP2.
data_CMP2 = data[(data['stim'] == 'CMP2') | (data['conc_µg_ml-1'] == 0)]
data_CMP2.groupby(['dose_µg', 'sample', 'stim', 'conc_µg_ml-1']).describe(percentiles=[.5])['ifng_pg_ml-1']
count | mean | std | min | 50% | max | ||||
---|---|---|---|---|---|---|---|---|---|
dose_µg | sample | stim | conc_µg_ml-1 | ||||||
0.0 | ln | CMP2 | 0.1 | 7.0 | 4105.579009 | 4129.743499 | 218.355170 | 3108.493300 | 12614.761430 |
1.0 | 7.0 | 17519.735045 | 25335.639807 | 1111.065441 | 8779.501359 | 72953.894280 | |||
10.0 | 7.0 | 37922.774760 | 45267.067137 | 1666.928359 | 21093.249890 | 135413.978600 | |||
medium | 0.0 | 7.0 | 34.623566 | 18.816274 | 11.949131 | 28.705061 | 72.093906 | ||
sp | CMP2 | 0.1 | 7.0 | 5095.828155 | 3617.860716 | 2086.342701 | 3041.426623 | 10341.507490 | |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
200.0 | ln | medium | 0.0 | 9.0 | 72.109845 | 55.625232 | 12.998240 | 57.308153 | 173.686927 |
sp | CMP2 | 0.1 | 9.0 | 718.298342 | 554.521045 | 128.962498 | 483.514389 | 1814.468536 | |
1.0 | 9.0 | 2699.733319 | 1800.224535 | 901.143809 | 2218.807523 | 5999.015402 | |||
10.0 | 9.0 | 8944.243014 | 8225.876811 | 1701.793320 | 6359.198704 | 23246.583770 | |||
medium | 0.0 | 9.0 | 49.876183 | 23.371521 | 22.478801 | 45.144443 | 103.555885 |
40 rows × 6 columns
plt.figure(figsize=(10,4))
for i, sampl_ in enumerate(('sp', 'ln')):
plt.subplot(1,2,i+1)
draw_boxnstripplot(
x='dose_µg',
data=data_CMP2[data_CMP2['sample'] == sampl_],
hue='conc_µg_ml-1',
palette='tab20b_r',
)
plt.yscale('linear')
# plt.ylim((0,25000))
plt.xlabel('dose of drug (µg)')
plt.ylabel('[IFN$\gamma$](pg/mL)')
plt.title(f"influence drug dose on {sampl_} samples restim with CMP2", size=10, fontweight='bold')
plt.gca().get_legend().set_title("concentration")
plt.tight_layout();
We see similar trends for CMP2, but with generally stronger cytokine responses. Since outliers are expected to be uncommon, these high values might be due to limitations of the linear scale we're using so far, rather than actual outliers in the data. We'll explore this possibility in the next section.
Before computing mixed ANOVA test, we need to perform some preliminary tests to check if the assumptions are met:
The distribution of the INF-$\gamma$ values can be confusing at first sight, and normality is crucial for the various statistical tests used.
Using the linear scale, we can see decently symetrical boxplots, with globally medians at the center of the boxes, indicating a centered distribution, though, many 'outliers' are observed, i.e., in at 10 µg/mL CMP2 in the two highest dose groups (2 outliers in each group) in the splenocytes, and one outlier in each of the untreated, low and 80-µg dose groups in the lymph node samples, with pretty high signals compared to PC stimulation (see graph in previous section).
It is therefore highly recommanded to have another look at the raw data, to detect any technical deviation in the duplicate/replicate, and applying some stringent QC filters if necessary, and scale transformation, before reanalyzing the data set.
To get a better view of the IFN-$\gamma$ levels, let's switch to a log scale for the y-axis.
plt.figure(figsize=(10,7))
i=1
for stim_ in ('CMP1', 'CMP2'):
for sampl_ in ('sp', 'ln'):
plt.subplot(2,2,i)
draw_boxnstripplot(
x='conc_µg_ml-1', data=data[
(
(data['stim'] == stim_)
|
(data['conc_µg_ml-1'] == 0)
)
&
(data['sample'] == sampl_)
],
hue='dose_µg',
palette=sns.color_palette('muted', 5)
)
plt.yscale('log')
plt.ylim(5, 2e5)
plt.xlabel(f"[{stim_}](µg/mL)")
plt.ylabel('[IFN$\gamma$](pg/mL), $\log_{10}$')
plt.title(f"{sampl_} samples restimulated with {stim_}", size=10, fontweight='bold')
plt.gca().get_legend().set_title("dose (µg)")
i+=1
plt.tight_layout();
Looking at the values on a logarithmic scale, more dyssymmetry appears, with a tendency to skew downwards, especially from lymph node samples, which does not indicate a priori the need to transform the values. Nevertheless, there is no extreme outliers observed here (don't forget that we are looking at linear values on a log scale, not at log-transformed values), and normality seems to be achievable after log-transformation.
Furthermore, the appearance of a top plateau is more evident in this view, with an impression that the splenocytes cannot give more than $8\times10^4$ pg/mL, a value reached in the group at 1.6 µg from an CMP2 concentration of 1.0 µg/mL. This could be the signal that CMP2 concentrations greater than 1.0 µg/mL would not be necessary or even useless or even counterproductive in this assay.
data[(data['sample'] == 'sp') & (data['conc_µg_ml-1'] == 1)]['ifng_pg_ml-1'].max()
79350.82434
We calculate the Shapiro-Wilk statistic and the associated P value for each box plots or combinations of factor levels as follows.
pd.set_option("display.max_rows", None, "display.max_columns", None)
res_norm = data.groupby(['sample', 'stim', 'conc_µg_ml-1', 'dose_µg'])['ifng_pg_ml-1'].apply(pg.normality)
res_norm.sample(6)
W | pval | normal | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
sample | stim | conc_µg_ml-1 | dose_µg | |||||||
sp | medium | 0.0 | 1.6 | sp | medium | 0.0 | 1.6 | 0.938105 | 0.592538 | True |
ln | CMP1 | 1.0 | 0.0 | ln | CMP1 | 1.0 | 0.0 | 0.743625 | 0.010879 | False |
medium | 0.0 | 8.0 | ln | medium | 0.0 | 8.0 | 0.935566 | 0.568048 | True | |
0.0 | ln | medium | 0.0 | 0.0 | 0.875648 | 0.207823 | True | |||
sp | CMP2 | 0.1 | 8.0 | sp | CMP2 | 0.1 | 8.0 | 0.905743 | 0.325061 | True |
ln | CMP1 | 1.0 | 8.0 | ln | CMP1 | 1.0 | 8.0 | 0.789583 | 0.022120 | False |
# we just count the proportion of `True` normality
pct_normal_distributed_lin = 100*res_norm['normal'].mean()
print(f"normal distribution founded in {pct_normal_distributed_lin:.1f}% of the box plots (original/untransformed)")
normal distribution founded in 65.7% of the box plots (original/untransformed)
To address the potential limitations of the linear scale, we will now explore the data after applying a base-10 logarithmic transformation ($\log_{10}$) to the IFN-$\gamma$ levels.
data['log_ifng'] = data['ifng_pg_ml-1'].apply(np.log10)
res_norm_log = data.groupby(['sample', 'stim', 'conc_µg_ml-1', 'dose_µg'])['log_ifng'].apply(pg.normality)
print(f"normal distribution founded in {100*res_norm_log['normal'].mean():.1f}% of the box plots (log-transformed)")
normal distribution founded in 92.9% of the box plots (log-transformed)
The application of a base-10 logarithmic transformation to the IFN-$\gamma$ levels has demonstrably improved the normality of the data distribution. This is confirmed by the Shapiro-Wilk test, which indicates a statistically more normal distribution after transformation compared to the untransformed data. This suggests that the log scaling provides a more suitable representation for subsequent statistical analyses that rely on the assumption of normality.
Visual inspection of the log-transformation's effect on normality is possible with QQ plots of the entire dataset, i.e., including medium and PC controls to the set.
plt.figure(figsize=(6,3))
plt.subplot(121)
pg.qqplot(data['ifng_pg_ml-1'])
plt.title("original")
plt.subplot(122)
pg.qqplot(data['log_ifng'])
plt.title("log-transformed")
plt.tight_layout();
The QQ plot clearly indicates a better fit between the log-transformed data points and the theoretical quantiles (right panel), compared to the untransformed values (left panel). To examine this further, let's look at individual QQ plots for each group (excluding negative and positive controls). These control groups tend to skew the data towards the extreme left and right sides of the chart, making it harder to assess normality within the treatment groups.
data_filtered=(
data
.dropna() # remove PC
[data['conc_µg_ml-1'] != 0] # remove medium control
)
data_filtered.head()
sample | stim | conc_µg_ml-1 | ifng_pg_ml-1 | dose_µg | animal_id | stim_id | log_ifng | |
---|---|---|---|---|---|---|---|---|
1 | sp | CMP1 | 10.0 | 5671.825154 | 0.0 | a|1 | CMP1|10.0 | 3.753723 |
2 | sp | CMP1 | 1.0 | 2680.813757 | 0.0 | a|1 | CMP1|1.0 | 3.428267 |
3 | sp | CMP1 | 0.1 | 795.581544 | 0.0 | a|1 | CMP1|0.1 | 2.900685 |
4 | sp | CMP2 | 10.0 | 27495.080920 | 0.0 | a|1 | CMP2|10.0 | 4.439255 |
5 | sp | CMP2 | 1.0 | 9137.546979 | 0.0 | a|1 | CMP2|1.0 | 3.960830 |
# let's save now the data in a csv file for later use in R
data_filtered.to_csv("./data_filtered.csv")
plt.figure(figsize=(9,6))
i=1
for stim_ in ('CMP1', 'CMP2'):
for sampl_ in ('sp', 'ln'):
for conc_ in (0.1, 1.0, 10):
for dose_ in data['dose_µg'].unique():
data_ = data[
(data['stim'] == stim_)
&
(data['sample'] == sampl_)
&
(data['conc_µg_ml-1'] == conc_)
&
(data['dose_µg'] == dose_)
]
ax_ = plt.subplot(6,10,i)
pg.qqplot(data_['ifng_pg_ml-1'], confidence=False, ax=ax_)
plt.xlabel('')
plt.ylabel('')
for text_ in ax_.texts: text_.set_fontsize(5) # decrease size of R² text
plt.title(f"{stim_}|{sampl_}|{conc_}|{dose_}", fontsize=6)
plt.xticks([],[]) #remove x-axis ticks
plt.yticks([],[]) #remove y-axis ticks
i+=1
plt.suptitle("individual QQ-plots for untransformed values")
plt.tight_layout();
plt.figure(figsize=(9,6))
i=1
for stim_ in ('CMP1', 'CMP2'):
for sampl_ in ('sp', 'ln'):
for conc_ in (0.1, 1.0, 10):
for dose_ in data['dose_µg'].unique():
data_ = data[
(data['stim'] == stim_)
&
(data['sample'] == sampl_)
&
(data['conc_µg_ml-1'] == conc_)
&
(data['dose_µg'] == dose_)
]
ax_ = plt.subplot(6,10,i)
pg.qqplot(data_['log_ifng'], confidence=False, ax=ax_)
plt.xlabel('')
plt.ylabel('')
for text_ in ax_.texts: text_.set_fontsize(5) # decrease size of R² text
plt.title(f"{stim_}|{sampl_}|{conc_}|{dose_}", fontsize=6)
plt.xticks([],[]) #remove x-axis ticks
plt.yticks([],[]) #remove y-axis ticks
i+=1
plt.suptitle("individual QQ-plots for log-transformed values")
plt.tight_layout();
Together with the results of the normality tests, QQ plots using original and log-transformed values also confirmed that log-transformed values are more adequate for the rest of the analysis, as most of the points fall approximately along the reference line.
The variance of the outcome variable should be equal between the groups of the between-subjects factors, which here is the drug dose. This can be assessed using the Levene's test for equality of variances after grouping the data by sample and stimulus|concentration
, so that we compare between the different treatment dose groups.
Of note, we take advantage of the encoded "stimulation identifier" stim_id
created at the very beginning of the notebook, that encompasses the nature of the stimumus (CMP1 or CMP2) and its concentration, and we remove the negative and positive control from this and later analysis.
# untransformed data
data_filtered.groupby(['sample', 'stim_id']).apply(pg.homoscedasticity, dv='ifng_pg_ml-1', group='dose_µg')
W | pval | equal_var | |||
---|---|---|---|---|---|
sample | stim_id | ||||
ln | CMP1|0.1 | levene | 0.642492 | 0.635804 | True |
CMP1|1.0 | levene | 0.697553 | 0.598806 | True | |
CMP1|10.0 | levene | 0.679567 | 0.610761 | True | |
CMP2|0.1 | levene | 3.054415 | 0.029305 | False | |
CMP2|1.0 | levene | 1.995815 | 0.116646 | True | |
CMP2|10.0 | levene | 1.577794 | 0.201878 | True | |
sp | CMP1|0.1 | levene | 3.933091 | 0.009698 | False |
CMP1|1.0 | levene | 6.944668 | 0.000317 | False | |
CMP1|10.0 | levene | 3.387082 | 0.019177 | False | |
CMP2|0.1 | levene | 13.245935 | 0.000001 | False | |
CMP2|1.0 | levene | 6.974131 | 0.000307 | False | |
CMP2|10.0 | levene | 3.756314 | 0.012069 | False |
# log-transformed data
data_filtered.groupby(['sample', 'stim_id']).apply(pg.homoscedasticity, dv='log_ifng', group='dose_µg')
W | pval | equal_var | |||
---|---|---|---|---|---|
sample | stim_id | ||||
ln | CMP1|0.1 | levene | 0.447543 | 0.773420 | True |
CMP1|1.0 | levene | 0.431301 | 0.785027 | True | |
CMP1|10.0 | levene | 0.469472 | 0.757717 | True | |
CMP2|0.1 | levene | 0.217637 | 0.926834 | True | |
CMP2|1.0 | levene | 0.171234 | 0.951649 | True | |
CMP2|10.0 | levene | 0.128958 | 0.970873 | True | |
sp | CMP1|0.1 | levene | 1.763441 | 0.158338 | True |
CMP1|1.0 | levene | 2.119820 | 0.099076 | True | |
CMP1|10.0 | levene | 1.318188 | 0.282389 | True | |
CMP2|0.1 | levene | 0.635545 | 0.640552 | True | |
CMP2|1.0 | levene | 0.276042 | 0.891461 | True | |
CMP2|10.0 | levene | 0.460545 | 0.764112 | True |
We observe a homogeneity of variances in all stimulation conditions tested using log-transformed values, but not with untransformed original values, with all P values > 0.05 as assessed by Levene's test.
Note: if group sample sizes are (approximately) equal, it is still possible to run the three-way mixed ANOVA because it is somewhat robust to heterogeneity of variance in these circumstances.
Imagine each group in this study is a separate box. Homogeneity of covariance means the way variables relate to each other (how much they change together) should be similar across all these boxes. In other terms, for the ANOVA to be valid, the way different variables tend to change together (their covariance) should be consistent across all groups defined by the between-subjects factors.
pg.box_m(data_filtered, dvs=['ifng_pg_ml-1'], group='dose_µg')
Chi2 | df | pval | equal_cov | |
---|---|---|---|---|
box | 490.345738 | 4.0 | 8.203819e-105 | False |
pg.box_m(data_filtered, dvs=['log_ifng'], group='dose_µg')
Chi2 | df | pval | equal_cov | |
---|---|---|---|---|
box | 6.322671 | 4.0 | 0.176312 | True |
In the untransformed data, the covariances are not statistically homogenous across the groups. This issue is resolved after applying the log transformation.
In 3-way ANOVA, sphericity, which refers to equal variances across all group comparisons, is crucial to ensure the validity of the test statistic and avoid inflated Type I error rates (false positives). In other terms, sphericity is a more specific concept related to the equality of variances of differences in repeated-measures ANOVAs.
The pingouin.mixed_ANOVA
function automatically calculates Mauchly's sphericity test and any necessary corrections. It returns the test statistic W-spher
, the corresponding p-value (p-spher
), and a boolean value (sphericity) indicating whether the data meets the sphericity assumption. We will see this later in the notebook.
Having explored the data and confirmed key assumptions, let's now proceed with the mixed ANOVA analysis.
This section focuses on integrating the results obtained from R libraries into our Jupyter Notebook. We will perform a three-way mixed ANOVA using the anova_test
function from the R package rstatix. The analysis will explore the interaction effects between three factors:
dose_µg
: between-subjects factor representing different drug dose levelssample
: within-subjects factor referring to the sample type (e.g., spleen, lymph node)stim_id
: another within-subjects factor representing the stimulation condition as encoded earlierlibrary(tidyverse)
library(tidyr)
library(ggpubr)
df<-read_csv("data_filtered.csv")
bxp <- ggboxplot(
df, x = "dose_µg", y = "log_ifng",
color = "stim_id", palette = "jco",
facet.by = "sample", short.panel.labs = FALSE
)
bxp
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ── ✔ dplyr 1.1.4 ✔ readr 2.1.5 ✔ forcats 1.0.0 ✔ stringr 1.5.1 ✔ ggplot2 3.5.0 ✔ tibble 3.2.1 ✔ lubridate 1.9.3 ✔ tidyr 1.3.1 ✔ purrr 1.0.2 ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ── ✖ dplyr::filter() masks stats::filter() ✖ dplyr::lag() masks stats::lag() ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors New names: • `` -> `...1` Rows: 480 Columns: 9 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "," chr (4): sample, stim, animal_id, stim_id dbl (5): ...1, conc_µg_ml-1, ifng_pg_ml-1, dose_µg, log_ifng ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
library(rstatix)
res.aov <- anova_test(
data = df, dv = log_ifng, wid = animal_id,
between = dose_µg, within = c(sample, stim_id)
)
get_anova_table(res.aov)
Attachement du package : 'rstatix' L'objet suivant est masqué depuis 'package:stats': filter
Effect | DFn | DFd | F | p | p<.05 | ges | |
---|---|---|---|---|---|---|---|
<chr> | <dbl> | <dbl> | <dbl> | <dbl> | <chr> | <dbl> | |
1 | dose_µg | 4.00 | 35.00 | 7.345 | 2.10e-04 | * | 0.374 |
2 | sample | 1.00 | 35.00 | 13.879 | 6.85e-04 | * | 0.054 |
3 | stim_id | 2.48 | 86.74 | 399.037 | 1.00e-47 | * | 0.523 |
4 | dose_µg:sample | 4.00 | 35.00 | 0.160 | 9.57e-01 | 0.003 | |
5 | dose_µg:stim_id | 9.91 | 86.74 | 6.417 | 2.73e-07 | * | 0.066 |
6 | sample:stim_id | 3.25 | 113.76 | 3.634 | 1.30e-02 | * | 0.005 |
7 | dose_µg:sample:stim_id | 13.00 | 113.76 | 2.609 | 3.00e-03 | * | 0.014 |
The three-way interaction between the drug dose, analyzed sample, and stimulation type/concentration is statistically significant ($F(13, 113.76) = 2.609, p = 0.003$). This indicates that the effect of the drug dose on the response depends on both the type of sample analyzed (e.g., spleen vs. lymph node) and the specific stimulation condition.
For completeness, we report the two-way interaction terms even though the three-way interaction is significant. These interactions may be explored further in studies where a three-way interaction is not observed. Here, we see a significant interaction between sample and stimulation type/concentration ($p = 0.013$) and between drug dose and stimulation type/concentration ($p < 0.0001$). The interaction between drug dose and sample was not statistically significant ($p = 0.957$).
Note: If a three-way interaction is not significant, researchers typically follow up by examining two-way interactions and potentially conducting simple main effects analyses or pairwise comparisons between groups. For more details on interpreting mixed ANOVA results in R, refer to the article mixed ANOVA in R.
When a three-way interaction is statistically significant, it indicates that the effect of one factor depends on the levels of the other two factors. To further understand this complex interaction, we can employ several post-hoc analysis techniques:
To gain a deeper understanding of the three-way interaction, we will specifically explore the interaction between the drug dose and stimulation type within each sample type (spleen and lymph node). This analysis is particularly relevant because it directly addresses the impact of the drug dose on the response to different stimulations, depending on whether the sample originates from the spleen or lymph node.
We will achieve this by performing separate simple two-way ANOVAs. In each analysis, we will group the data by sample type (spleen or lymph node) and examine the interaction between drug dose and stimulation type, i.e., dose_µg:stim_id
interaction, within that specific sample
group."
# Two-way ANOVA at each dose group level
two.way <- df %>%
group_by(sample) %>%
anova_test(dv = log_ifng, between = dose_µg, wid = animal_id, within = stim_id)
get_anova_table(two.way)
sample | Effect | DFn | DFd | F | p | p<.05 | ges |
---|---|---|---|---|---|---|---|
<chr> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <chr> | <dbl> |
ln | dose_µg | 4.00 | 35.00 | 4.951 | 3.00e-03 | * | 0.325 |
ln | stim_id | 2.73 | 95.62 | 202.300 | 7.97e-40 | * | 0.464 |
ln | dose_µg:stim_id | 10.93 | 95.62 | 5.270 | 1.97e-06 | * | 0.083 |
sp | dose_µg | 4.00 | 35.00 | 8.042 | 1.04e-04 | * | 0.444 |
sp | stim_id | 2.62 | 91.83 | 393.074 | 4.22e-50 | * | 0.599 |
sp | dose_µg:stim_id | 10.50 | 91.83 | 4.968 | 7.05e-06 | * | 0.070 |
# same in Python
data_filtered.groupby(['sample']).apply(
pg.mixed_anova,
dv='log_ifng',
between='dose_µg',
within='stim_id',
subject='animal_id',
effsize='ng2', # generalized eta-squared
correction=False
)
Source | SS | DF1 | DF2 | MS | F | p-unc | ng2 | eps | ||
---|---|---|---|---|---|---|---|---|---|---|
sample | ||||||||||
ln | 0 | dose_µg | 24.135926 | 4 | 35 | 6.033981 | 4.951279 | 2.861356e-03 | 0.324774 | NaN |
1 | stim_id | 42.580516 | 5 | 175 | 8.516103 | 198.006717 | 4.519549e-70 | 0.459037 | 0.398419 | |
2 | Interaction | 4.533131 | 20 | 175 | 0.226657 | 5.269959 | 2.972874e-10 | 0.082853 | NaN | |
sp | 0 | dose_µg | 24.234661 | 4 | 35 | 6.058665 | 8.042147 | 1.042409e-04 | 0.443506 | NaN |
1 | stim_id | 45.646615 | 5 | 175 | 9.129323 | 395.360919 | 2.452373e-93 | 0.600177 | 0.486179 | |
2 | Interaction | 2.294174 | 20 | 175 | 0.114709 | 4.967656 | 1.448528e-09 | 0.070152 | NaN |
Our analysis revealed a statistically significant interaction between the drug dose and stimulation type for both lymph node ($F(20, 175) = 5.27, p < 0.0001$) and spleen samples ($F(20, 175) = 4.97, p < 0.0001$). This indicates that the effect of the drug dose on the response to stimulation depends on the type of sample analyzed (spleen vs. lymph node).
There can be subtle differences in the calculation of degrees of freedom (DFn, DFd), F-statistic, and p-value between rstatix::anova_test
in R and pingouin::mixed_anova
in Python, even though they are both designed to perform mixed ANOVAs.
Both packages might offer options for handling missing data or unequal variances. rstatix::anova_test might use the Satterthwaite approximation by default to adjust for these issues when calculating the denominator degrees of freedom (DFd), while pingouin::mixed_anova might use the Kenward-Roger adjustment by default for DFd calculations.
The way random effects are treated in the model can also influence the F-statistic and p-value.
There might be slight variations in how these packages handle specific situations or edge cases within the mixed ANOVA framework. These variations can have minimal effects on the results but can contribute to minor discrepancies.
It's important to note that these potential differences are usually minor and may not significantly affect the overall interpretation of the results. However, if high precision is crucial for the analysis, it's advisable to stick with one software package or carefully consider the potential impact of these discrepancies on your conclusions.
Nevertheless, please consult the documentation for both rstatix::anova_test
and pingouin::mixed_anova
to understand their default settings for handling missing data, unequal variances, and random effects estimation. If possible, try to specify the same methods for handling these issues in both packages to minimize discrepancies. Finally, when reporting your ANOVA results, mention the specific software package used and the methods employed for handling potential issues.
To further understand the significant interaction between drug dose and stimulation type, we can employ a technique called "simple main effects" analysis. This approach allows us to examine the effect of one factor (in this case, stimulation type stim_id
) at each level of the other factor (drug dose dose_µg
).
We can achieve this by performing separate ANOVAs for each stimulation type. In each separate ANOVA, we will analyze the effect of drug dose on the cytokine secretion (log_ifng
). Alternatively, we could perform separate ANOVAs for each drug dose level, examining the effect of stimulation type on cytokine secretion at each dose level as follows.
df %>%
group_by(sample, stim_id) %>%
anova_test(dv=log_ifng, wid=animal_id, between=dose_µg) %>%
get_anova_table()
sample | stim_id | Effect | DFn | DFd | F | p | p<.05 | ges | |
---|---|---|---|---|---|---|---|---|---|
<chr> | <chr> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <chr> | <dbl> | |
1 | ln | CMP1|0.1 | dose_µg | 4 | 35 | 1.497 | 2.24e-01 | 0.146 | |
2 | ln | CMP1|1.0 | dose_µg | 4 | 35 | 1.679 | 1.77e-01 | 0.161 | |
3 | ln | CMP1|10.0 | dose_µg | 4 | 35 | 2.256 | 8.30e-02 | 0.205 | |
4 | ln | CMP2|0.1 | dose_µg | 4 | 35 | 8.748 | 5.27e-05 | * | 0.500 |
5 | ln | CMP2|1.0 | dose_µg | 4 | 35 | 6.715 | 4.04e-04 | * | 0.434 |
6 | ln | CMP2|10.0 | dose_µg | 4 | 35 | 7.137 | 2.60e-04 | * | 0.449 |
7 | sp | CMP1|0.1 | dose_µg | 4 | 35 | 3.433 | 1.80e-02 | * | 0.282 |
8 | sp | CMP1|1.0 | dose_µg | 4 | 35 | 4.999 | 3.00e-03 | * | 0.364 |
9 | sp | CMP1|10.0 | dose_µg | 4 | 35 | 4.479 | 5.00e-03 | * | 0.339 |
10 | sp | CMP2|0.1 | dose_µg | 4 | 35 | 14.479 | 4.52e-07 | * | 0.623 |
11 | sp | CMP2|1.0 | dose_µg | 4 | 35 | 11.821 | 3.52e-06 | * | 0.575 |
12 | sp | CMP2|10.0 | dose_µg | 4 | 35 | 7.063 | 2.80e-04 | * | 0.447 |
# same in python
data_filtered.groupby(['sample', 'stim_id']).apply(
pg.anova,
dv='log_ifng',
between='dose_µg',
)
Source | ddof1 | ddof2 | F | p-unc | np2 | |||
---|---|---|---|---|---|---|---|---|
sample | stim_id | |||||||
ln | CMP1|0.1 | 0 | dose_µg | 4 | 35 | 1.497435 | 2.241239e-01 | 0.146128 |
CMP1|1.0 | 0 | dose_µg | 4 | 35 | 1.678824 | 1.769133e-01 | 0.160979 | |
CMP1|10.0 | 0 | dose_µg | 4 | 35 | 2.256068 | 8.281843e-02 | 0.204984 | |
CMP2|0.1 | 0 | dose_µg | 4 | 35 | 8.748103 | 5.269404e-05 | 0.499946 | |
CMP2|1.0 | 0 | dose_µg | 4 | 35 | 6.714988 | 4.036000e-04 | 0.434206 | |
CMP2|10.0 | 0 | dose_µg | 4 | 35 | 7.136827 | 2.597158e-04 | 0.449229 | |
sp | CMP1|0.1 | 0 | dose_µg | 4 | 35 | 3.432916 | 1.809816e-02 | 0.281781 |
CMP1|1.0 | 0 | dose_µg | 4 | 35 | 4.999110 | 2.706360e-03 | 0.363595 | |
CMP1|10.0 | 0 | dose_µg | 4 | 35 | 4.479431 | 4.995939e-03 | 0.338596 | |
CMP2|0.1 | 0 | dose_µg | 4 | 35 | 14.478872 | 4.523632e-07 | 0.623314 | |
CMP2|1.0 | 0 | dose_µg | 4 | 35 | 11.820793 | 3.522519e-06 | 0.574640 | |
CMP2|10.0 | 0 | dose_µg | 4 | 35 | 7.062799 | 2.804004e-04 | 0.446651 |
The analysis revealed a statistically significant effect of the drug dose on IFN-$\gamma$ ($\log_{10}$ scale) release following stimulation with CMP2. This effect was observed for both lymph node and spleen samples at all concentrations tested. Interestingly, for CMP1 stimulation, a significant effect of the drug dose was only seen in spleen samples, and not in lymph node samples across any of the three tested CMP1 concentrations.
Since we observed a significant interaction between drug dose and stimulation type, we can further pinpoint specific differences between groups using pairwise comparisons. This analysis will help us identify which drug dose levels lead to statistically significant variations in IFN-$\gamma$ release for each sample type (spleen/lymph node) and stimulation condition (CMP1/CMP2). To ensure reliable results while conducting multiple comparisons, we will employ pairwise t-tests with Bonferroni adjustment. This adjustment accounts for the increased risk of false positives when performing numerous comparisons:
df %>%
group_by(sample, stim_id) %>%
pairwise_t_test(
log_ifng ~ dose_µg, paired = FALSE,
p.adjust.method = "bonferroni"
)
sample | stim_id | .y. | group1 | group2 | n1 | n2 | p | p.signif | p.adj | p.adj.signif | |
---|---|---|---|---|---|---|---|---|---|---|---|
<chr> | <chr> | <chr> | <chr> | <chr> | <int> | <int> | <dbl> | <chr> | <dbl> | <chr> | |
1 | ln | CMP1|0.1 | log_ifng | 0 | 1.6 | 7 | 8 | 0.2850 | ns | 1.000 | ns |
2 | ln | CMP1|0.1 | log_ifng | 0 | 8 | 7 | 8 | 0.9580 | ns | 1.000 | ns |
3 | ln | CMP1|0.1 | log_ifng | 1.6 | 8 | 8 | 8 | 0.2930 | ns | 1.000 | ns |
4 | ln | CMP1|0.1 | log_ifng | 0 | 40 | 7 | 8 | 0.2470 | ns | 1.000 | ns |
5 | ln | CMP1|0.1 | log_ifng | 1.6 | 40 | 8 | 8 | 0.0251 | * | 0.251 | ns |
6 | ln | CMP1|0.1 | log_ifng | 8 | 40 | 8 | 8 | 0.2110 | ns | 1.000 | ns |
7 | ln | CMP1|0.1 | log_ifng | 0 | 200 | 7 | 9 | 0.5930 | ns | 1.000 | ns |
8 | ln | CMP1|0.1 | log_ifng | 1.6 | 200 | 8 | 9 | 0.0953 | ns | 0.953 | ns |
9 | ln | CMP1|0.1 | log_ifng | 8 | 200 | 8 | 9 | 0.5420 | ns | 1.000 | ns |
10 | ln | CMP1|0.1 | log_ifng | 40 | 200 | 8 | 9 | 0.4920 | ns | 1.000 | ns |
11 | ln | CMP1|1.0 | log_ifng | 0 | 1.6 | 7 | 8 | 0.5260 | ns | 1.000 | ns |
12 | ln | CMP1|1.0 | log_ifng | 0 | 8 | 7 | 8 | 0.8770 | ns | 1.000 | ns |
13 | ln | CMP1|1.0 | log_ifng | 1.6 | 8 | 8 | 8 | 0.4150 | ns | 1.000 | ns |
14 | ln | CMP1|1.0 | log_ifng | 0 | 40 | 7 | 8 | 0.1530 | ns | 1.000 | ns |
15 | ln | CMP1|1.0 | log_ifng | 1.6 | 40 | 8 | 8 | 0.0364 | * | 0.364 | ns |
16 | ln | CMP1|1.0 | log_ifng | 8 | 40 | 8 | 8 | 0.1860 | ns | 1.000 | ns |
17 | ln | CMP1|1.0 | log_ifng | 0 | 200 | 7 | 9 | 0.2200 | ns | 1.000 | ns |
18 | ln | CMP1|1.0 | log_ifng | 1.6 | 200 | 8 | 9 | 0.0558 | ns | 0.558 | ns |
19 | ln | CMP1|1.0 | log_ifng | 8 | 200 | 8 | 9 | 0.2670 | ns | 1.000 | ns |
20 | ln | CMP1|1.0 | log_ifng | 40 | 200 | 8 | 9 | 0.7960 | ns | 1.000 | ns |
21 | ln | CMP1|10.0 | log_ifng | 0 | 1.6 | 7 | 8 | 0.6940 | ns | 1.000 | ns |
22 | ln | CMP1|10.0 | log_ifng | 0 | 8 | 7 | 8 | 0.7970 | ns | 1.000 | ns |
23 | ln | CMP1|10.0 | log_ifng | 1.6 | 8 | 8 | 8 | 0.5020 | ns | 1.000 | ns |
24 | ln | CMP1|10.0 | log_ifng | 0 | 40 | 7 | 8 | 0.0480 | * | 0.480 | ns |
25 | ln | CMP1|10.0 | log_ifng | 1.6 | 40 | 8 | 8 | 0.0160 | * | 0.160 | ns |
26 | ln | CMP1|10.0 | log_ifng | 8 | 40 | 8 | 8 | 0.0722 | ns | 0.722 | ns |
27 | ln | CMP1|10.0 | log_ifng | 0 | 200 | 7 | 9 | 0.1570 | ns | 1.000 | ns |
28 | ln | CMP1|10.0 | log_ifng | 1.6 | 200 | 8 | 9 | 0.0627 | ns | 0.627 | ns |
29 | ln | CMP1|10.0 | log_ifng | 8 | 200 | 8 | 9 | 0.2290 | ns | 1.000 | ns |
30 | ln | CMP1|10.0 | log_ifng | 40 | 200 | 8 | 9 | 0.4990 | ns | 1.000 | ns |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
91 | sp | CMP2|0.1 | log_ifng | 0 | 1.6 | 7 | 8 | 4.90e-02 | * | 4.90e-01 | ns |
92 | sp | CMP2|0.1 | log_ifng | 0 | 8 | 7 | 8 | 1.79e-01 | ns | 1.00e+00 | ns |
93 | sp | CMP2|0.1 | log_ifng | 1.6 | 8 | 8 | 8 | 1.18e-03 | ** | 1.18e-02 | * |
94 | sp | CMP2|0.1 | log_ifng | 0 | 40 | 7 | 8 | 8.46e-04 | *** | 8.46e-03 | ** |
95 | sp | CMP2|0.1 | log_ifng | 1.6 | 40 | 8 | 8 | 1.08e-06 | **** | 1.08e-05 | **** |
96 | sp | CMP2|0.1 | log_ifng | 8 | 40 | 8 | 8 | 2.41e-02 | * | 2.41e-01 | ns |
97 | sp | CMP2|0.1 | log_ifng | 0 | 200 | 7 | 9 | 1.46e-04 | *** | 1.46e-03 | ** |
98 | sp | CMP2|0.1 | log_ifng | 1.6 | 200 | 8 | 9 | 1.30e-07 | **** | 1.30e-06 | **** |
99 | sp | CMP2|0.1 | log_ifng | 8 | 200 | 8 | 9 | 5.53e-03 | ** | 5.53e-02 | ns |
100 | sp | CMP2|0.1 | log_ifng | 40 | 200 | 8 | 9 | 5.99e-01 | ns | 1.00e+00 | ns |
101 | sp | CMP2|1.0 | log_ifng | 0 | 1.6 | 7 | 8 | 3.85e-02 | * | 3.85e-01 | ns |
102 | sp | CMP2|1.0 | log_ifng | 0 | 8 | 7 | 8 | 4.88e-01 | ns | 1.00e+00 | ns |
103 | sp | CMP2|1.0 | log_ifng | 1.6 | 8 | 8 | 8 | 5.61e-03 | ** | 5.61e-02 | ns |
104 | sp | CMP2|1.0 | log_ifng | 0 | 40 | 7 | 8 | 6.18e-03 | ** | 6.18e-02 | ns |
105 | sp | CMP2|1.0 | log_ifng | 1.6 | 40 | 8 | 8 | 7.71e-06 | **** | 7.71e-05 | **** |
106 | sp | CMP2|1.0 | log_ifng | 8 | 40 | 8 | 8 | 2.81e-02 | * | 2.81e-01 | ns |
107 | sp | CMP2|1.0 | log_ifng | 0 | 200 | 7 | 9 | 7.58e-04 | *** | 7.58e-03 | ** |
108 | sp | CMP2|1.0 | log_ifng | 1.6 | 200 | 8 | 9 | 5.41e-07 | **** | 5.41e-06 | **** |
109 | sp | CMP2|1.0 | log_ifng | 8 | 200 | 8 | 9 | 4.01e-03 | ** | 4.01e-02 | * |
110 | sp | CMP2|1.0 | log_ifng | 40 | 200 | 8 | 9 | 4.74e-01 | ns | 1.00e+00 | ns |
111 | sp | CMP2|10.0 | log_ifng | 0 | 1.6 | 7 | 8 | 4.27e-01 | ns | 1.00e+00 | ns |
112 | sp | CMP2|10.0 | log_ifng | 0 | 8 | 7 | 8 | 2.75e-01 | ns | 1.00e+00 | ns |
113 | sp | CMP2|10.0 | log_ifng | 1.6 | 8 | 8 | 8 | 5.56e-02 | ns | 5.56e-01 | ns |
114 | sp | CMP2|10.0 | log_ifng | 0 | 40 | 7 | 8 | 6.10e-03 | ** | 6.10e-02 | ns |
115 | sp | CMP2|10.0 | log_ifng | 1.6 | 40 | 8 | 8 | 4.75e-04 | *** | 4.75e-03 | ** |
116 | sp | CMP2|10.0 | log_ifng | 8 | 40 | 8 | 8 | 6.93e-02 | ns | 6.93e-01 | ns |
117 | sp | CMP2|10.0 | log_ifng | 0 | 200 | 7 | 9 | 1.58e-03 | ** | 1.58e-02 | * |
118 | sp | CMP2|10.0 | log_ifng | 1.6 | 200 | 8 | 9 | 9.39e-05 | **** | 9.39e-04 | *** |
119 | sp | CMP2|10.0 | log_ifng | 8 | 200 | 8 | 9 | 2.33e-02 | * | 2.33e-01 | ns |
120 | sp | CMP2|10.0 | log_ifng | 40 | 200 | 8 | 9 | 6.60e-01 | ns | 1.00e+00 | ns |
While the Bonferroni method corrects for multiple comparisons by controlling the chance of false positives, another approach exists that might be more suitable in some cases. This method, developed by Benjamini and Hochberg, focuses on controlling the False Discovery Rate (FDR).
The FDR represents the proportion of statistically significant results (rejected null hypotheses) that are actually false positives. Intuitively, we want to keep the FDR at a low level.
The Benjamini-Hochberg procedure offers a more powerful way to adjust for multiple comparisons compared to Bonferroni correction. Here's a simplified overview of how it works:
By employing the BH procedure, we can potentially identify more true positives while controlling the FDR at a specific level.
pwc = data_filtered.groupby(['sample', 'stim_id']).apply(
pg.pairwise_tests,
dv="log_ifng",
between='dose_µg',
padjust='fdr_bh',
parametric=True,
# effsize='none',
)
pwc.round(4).head(10) # note that we use the values from this table in the final figure
Contrast | A | B | Paired | Parametric | T | dof | alternative | p-unc | p-corr | p-adjust | BF10 | hedges | |||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
sample | stim_id | ||||||||||||||
ln | CMP1|0.1 | 0 | dose_µg | 0.0 | 1.6 | False | True | -0.9633 | 11.3797 | two-sided | 0.3554 | 0.5924 | fdr_bh | 0.593 | -0.4773 |
1 | dose_µg | 0.0 | 8.0 | False | True | -0.0503 | 9.8972 | two-sided | 0.9608 | 0.9608 | fdr_bh | 0.437 | -0.0253 | ||
2 | dose_µg | 0.0 | 40.0 | False | True | 0.9640 | 12.5812 | two-sided | 0.3532 | 0.5924 | fdr_bh | 0.593 | 0.4709 | ||
3 | dose_µg | 0.0 | 200.0 | False | True | 0.5076 | 9.4795 | two-sided | 0.6234 | 0.6926 | fdr_bh | 0.469 | 0.2568 | ||
4 | dose_µg | 1.6 | 8.0 | False | True | 1.2052 | 14.0000 | two-sided | 0.2481 | 0.5924 | fdr_bh | 0.69 | 0.5697 | ||
5 | dose_µg | 1.6 | 40.0 | False | True | 2.1453 | 14.0000 | two-sided | 0.0500 | 0.3707 | fdr_bh | 1.755 | 1.0142 | ||
6 | dose_µg | 1.6 | 200.0 | False | True | 1.9379 | 13.3181 | two-sided | 0.0741 | 0.3707 | fdr_bh | 1.385 | 0.9069 | ||
7 | dose_µg | 8.0 | 40.0 | False | True | 1.2584 | 14.0000 | two-sided | 0.2288 | 0.5924 | fdr_bh | 0.72 | 0.5949 | ||
8 | dose_µg | 8.0 | 200.0 | False | True | 0.7909 | 14.6821 | two-sided | 0.4416 | 0.6309 | fdr_bh | 0.519 | 0.3653 | ||
9 | dose_µg | 40.0 | 200.0 | False | True | -0.6826 | 11.7134 | two-sided | 0.5081 | 0.6352 | fdr_bh | 0.492 | -0.3231 |
Having completed the quality control steps, assumption checks, and statistical analysis, we can now finalize the data visualization. As a first step, we will define a function to incorporate significance levels into the figure.
def asterix(ax=None, sample=None, stim=None, ticks=False, data=data):
"""
Function
--------
draw_significance_line(data, pairwise_comparisons, ticks=False)
Purpose
-------
This function automatically generates horizontal lines on a graph
to indicate statistically significant differences between groups
in a box or strip plot.
Parameters
----------
data
A data structure containing the data for the box/strip plots
(format depends on your plotting library).
pairwise_comparisons
A table containing the results of pairwise comparisons between
groups. This table should specify which comparisons are
statistically significant.
ticks (optional)
A boolean value indicating whether to add small ticks at the
ends of the horizontal lines (default: False).
Functionality
-------------
The function iterates through all combinations of sample and stim
in the data. For each combination, it identifies statistically
significant differences based on the provided pairwise_comparisons
table. If a significant difference is found, the function draws a
horizontal line between the two groups on the graph. The height of
the line is positioned slightly above the highest data point in the
compared groups. Additionally, the function displays the level of
significance (number of asterisks) next to the line.
Overall, this function simplifies the process of visually highlighting
significant differences in box/strip plots.
"""
def n_stars(pval):
"""
Encodes the number of asterices corresponding to P value.
"""
if pval <= 1e-4 : return '*'*4
elif pval <= 1e-3 : return '*'*3
elif pval <= 1e-2 : return '*'*2
elif pval <= .05 : return '*'
else : return 'ns'
def plot_significant(x1, x2, y, pval=1):
"""
Actually plot the line and asterices.
"""
ax.plot(
[x1, x1, x2, x2],
[y-(.04*ticks), y, y, y-(.04*ticks)],
lw=1,
color='k'
)
ax.text(
(x1 + x2)*.5,
y-.03,
n_stars(pval),
ha='center',
va='bottom',
color='red',
fontdict={'size': 9},
)
# map the treatment/dose group to the x offset for each boxplot group
x_offset = {
0 : -.33,
1.6: -.16,
8 : 0,
40 : .16,
200: .33
}
# main loop of the function that goes thought each combination of
# stimulus and concentration, to extract and plot all the signficant differences
for conc in (.1, 1, 10):
y = round(data[data['conc_µg_ml-1'] == conc]['log_ifng'].max(), 1) + .15 # initial height of the line
# extraction of all significant differences from the previous pairwise comparisons
stim_id = data['stim_id'].unique()[0]
pwc_ = pwc.loc[(sample, stim_id)]
sign_ = pwc_[pwc_['p-corr'] < .05]
# loop over the list and plot
for idx in sign_.index:
plot_significant(
x1 = math.log10(conc) + 1 + x_offset[sign_.loc[idx, 'A']], # 0.1 on the axis is x=0
x2 = math.log10(conc) + 1 + x_offset[sign_.loc[idx, 'B']], # 1.0 on the axis is x=1, etc.
y=y,
pval=sign_.loc[idx, 'p-corr']
)
y+=.2 # next lines will be higher
plt.figure(figsize=(10,8))
i=1
for stim_ in ('CMP1', 'CMP2'):
for sampl_ in ('sp', 'ln'):
data_ = data[(data['stim'] == stim_) & (data['sample'] == sampl_)]
ax_ = plt.subplot(2,2,i)
draw_boxnstripplot(
x='conc_µg_ml-1',
y='log_ifng',
data=data_,
hue='dose_µg',
palette=sns.color_palette('Set2', 5)
)
plt.xlabel(f"[{stim_.upper()}](µg/mL)")
plt.ylabel('[IFN$\gamma$](pg/mL), $\log_{10}$')
plt.title(f"{sampl_.upper()} restimulated with {stim_.upper()}", size=10, fontweight='bold')
ax_.get_legend().set_title("dose (µg)")
# plot the significant differences for the active sample|stim combination and all concentrations
asterix(ax=ax_, sample=sampl_, stim=stim_, data=data_, ticks=True)
i+=1
plt.suptitle("3-way mixed design ANOVA, $F(13, 113.76)=2.609,P=0.003,\eta_g^2=0.014$", fontweight='bold', color='darkblue')
plt.annotate(
text="pwc: unpaired T tests; p.adjust: Benjamini/Hochberg FDR",
xy=(1,-.2),
xycoords='axes fraction',
color='darkred',
horizontalalignment='right',
verticalalignment='top')
plt.tight_layout();