The interaction of genetics and physical activity in the pathogenesis of metabolic dysfunction associated liver disease
Study cohort
The UK Biobank (UKB) is a large-scale prospective cohort study (n > 500,000) with participants aged 40–69 years recruited during 2006–2010. The biobank contains extensive phenotypic and genotypic data on the participants and has been open to researchers since 2012¹⁸. The UKB has ethical approval from the North West Multi-Centre Research Ethics Committee (ref: 11/NW/0382), and informed written consent was obtained from all participants prior to the study. The current study was further approved by the Swedish central ethics committee (diary number 2019-03073).
A subset of the UKB cohort, including individuals with neck-to-knee MRI scans, was used when liver fat content and liver volume were considered outcomes. A larger subset of the UKB cohort including all participants with data on hospital health outcomes was included in the analyses considering MASLD and CLD as outcomes. All data-fields used in the present study are reported in Table S1.
Measurement of liver fat and liver volume
A large-scale multi-modal imaging study (n = 100,000) is ongoing in the UKB¹⁸. In the current study, we included a subset of the UKB cohort (n = 32,323) who had undergone neck-to-knee MRI scans, acquired with a Siemens 1.5 T MAGNETOM Aera using a dual-echo Dixon technique, resulting in water-fat volumes covering large parts of the body. Reference measurements of liver fat content, based on proton-density fat fraction (PDFF) maps, were available for only 9,893 subjects at the time of analysis¹⁹. The PDFF maps were based on a single transverse slice of the liver, acquired with a Siemens 1.5 T MAGNETOM Aera using a three-point Dixon technique¹⁹.
A neural network strategy was established and trained by Langner et al.¹⁹. First, the neural network was trained on the neck-to-knee MR images of those with reference measurements for regression of liver fat values. After tenfold cross-validation, the trained neural network was applied to the rest of the neck-to-knee MRI cohort to infer liver fat content. The liver fat content estimates generated by the neural network correlated well with the reference PDFF method (R² = 0.94)¹⁹.
A similar approach was applied to generate liver volume measures. Neck-to-knee MR images (only abdominal stations were considered) were used to computationally estimate liver volume using a deep learning approach originally established for kidney segmentation by Langner et al.²⁰. Briefly, abdominal water-signal images were used in the segmentation process. Liver volume was estimated by multiplying the number of segmented voxels by the size of one voxel (liver volume = number of segmented voxels × size of 1 voxel). A segmentation model was trained on manual segmentations from 97 subjects and was used to predict liver volume in all subjects with neck-to-knee MRI scans. The model showed high accuracy against the manual segmentations (R² = 0.86).
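The voxel-counting step can be illustrated with a minimal Python sketch. The toy mask, the voxel dimensions and the function name `liver_volume_litres` are hypothetical and chosen only for illustration; the segmentation network and the actual image resolution are those of Langner et al. and are not reproduced here.

```python
def liver_volume_litres(mask, voxel_size_mm):
    """Volume = number of segmented voxels x volume of one voxel.

    mask: nested lists (slices -> rows -> voxels), truthy where liver was segmented.
    voxel_size_mm: (dx, dy, dz) edge lengths of one voxel in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    voxel_volume_mm3 = dx * dy * dz
    n_voxels = sum(1 for plane in mask for row in plane for voxel in row if voxel)
    return n_voxels * voxel_volume_mm3 / 1e6  # 1 litre = 1e6 mm^3


# Toy 2 x 2 x 2 mask with four segmented voxels (illustrative values only).
toy_mask = [
    [[0, 1], [1, 1]],
    [[0, 0], [1, 0]],
]
toy_volume = liver_volume_litres(toy_mask, (2.0, 2.0, 3.0))
```

In practice the mask would be the binary output of the segmentation model, and the voxel size would be read from the image header.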
Among UKB participants, individuals who were deemed unrelated and had passed extensive quality control were included, as described by Bycroft et al.²¹. We further filtered for ethnic background, and only Caucasians were included in the analyses. For the current analyses, liver fat content (n = 27,243) and liver volume (n = 24,752) measures were available for UKB participants with Caucasian ancestry.
Genetic risk score
Data collection, genotyping and quality control in the UKB have been described in detail elsewhere²¹. All UKB participants have been genotyped. Briefly, genotyping was performed on blood samples using one of two purpose-designed arrays (UK BiLEVE Axiom Array and UK Biobank Axiom Array) that share 95% of their genetic markers. The genotypes were further imputed using the Haplotype Reference Consortium and the UK10K haplotype resource²¹. In the present study, version 3 of the imputed genotypes was used.
Recently, Liu et al. performed a GWAS (n = 32,858) of abdominal MRI-derived phenotypes in the UKB and discovered 12 genetic variants associated with liver fat content and 12 genetic variants associated with liver volume⁵. Only independent genetic variants were included in the current study; a similar strategy has been used previously²². Ten liver fat content-associated and 11 liver volume-associated genetic variants were thus extracted from the UKB genetic data and recoded based on the liver fat content/volume-increasing alleles (Table S2).
Unweighted genetic risk scores were determined for each study participant by aligning trait-increasing alleles, both for liver fat content (GRSLF) and for liver volume (GRSLV), and summing the total number of risk alleles, using methods that have previously been described²³. The GRSLF ranged from 2 to 18 and the GRSLV from 0 to 14 among the study participants. For the genetic risk scores, imputed genetic variants with a genotype dosage of < 0.5 were recoded as 0, those with a dosage of > 0.5 to ≤ 1.50 as 1, and those with a dosage of > 1.50 as 2. We found that two of the liver fat content-associated variants were in strong LD with liver volume-associated variants (rs4665985 and rs1260326, R² = 0.34; rs58542926 and rs58489806, R² = 0.80); these were therefore excluded from the GRSLF and GRSLV in the sensitivity analyses.
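The dosage recoding and allele counting can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function names and the example dosages are hypothetical, and a dosage of exactly 0.5, which the thresholds above do not cover, is assigned to 1 here.

```python
def align_to_risk_allele(dosage, counted_allele_is_risk):
    """Flip the dosage when the counted allele is not the trait-increasing one."""
    return dosage if counted_allele_is_risk else 2.0 - dosage


def dosage_to_allele_count(dosage):
    """Recode an imputed dosage (0 to 2) to a hard allele count of 0, 1 or 2."""
    if dosage < 0.5:
        return 0
    if dosage <= 1.5:  # assumption: a dosage of exactly 0.5 is counted as 1
        return 1
    return 2


def unweighted_grs(risk_allele_dosages):
    """Unweighted score: total number of trait-increasing alleles."""
    return sum(dosage_to_allele_count(d) for d in risk_allele_dosages)


# Hypothetical risk-allele dosages for one participant at ten variants.
example_dosages = [0.02, 0.97, 1.98, 1.40, 0.49, 2.00, 1.51, 0.00, 1.00, 0.60]
example_grs = unweighted_grs(example_dosages)
```

With ten variants the score is bounded by 0 and 20, which is consistent with the observed GRSLF range of 2 to 18.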
Self-reported levels of physical activity
Questions derived from the validated International Physical Activity Questionnaire (IPAQ) short form were used to assess weekly walking, moderate physical activity (PA) and vigorous PA. For each type of PA, the participants were asked to estimate on how many days in a typical week they spent at least 10 min doing the activity. Participants who reported a frequency of at least one day/week were further asked to report how many minutes they spent doing the activity on a typical day. Participants reporting a frequency of zero days/week for an activity were given a duration of zero for that activity.
The participants were asked to include walking at work, walking to and from work and walking for sports/leisure when estimating frequency and time of walking. For moderate and vigorous PA, the participants were asked to include activities performed for work, leisure, travel and around the house.
The questionnaire data were handled according to the IPAQ guidelines²⁴. Individual IPAQ scores in metabolic equivalent of task (MET)-minutes/week were calculated for each type of PA by multiplying the frequency (days/week) by the typical duration (minutes/day) and an activity-specific MET value. Total MET-minutes/week scores were calculated for each participant by summing the weekly MET-minutes for walking, moderate PA and vigorous PA. The time variables were truncated at 180 min²⁴.
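The MET-minutes calculation can be sketched in Python as below. The MET values (3.3 for walking, 4.0 for moderate PA and 8.0 for vigorous PA) are those of the published IPAQ short-form scoring protocol and are not stated in the text above, so they should be read as an assumption; the function names and the example participant are hypothetical.

```python
# MET values from the IPAQ short-form scoring protocol (assumed, not stated above).
MET_VALUES = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}
MAX_MINUTES_PER_DAY = 180  # time variables are truncated at 180 min


def met_minutes_per_week(activity, days_per_week, minutes_per_day):
    """MET value x days/week x (truncated) minutes/day for one activity type."""
    minutes = min(minutes_per_day, MAX_MINUTES_PER_DAY)
    return MET_VALUES[activity] * days_per_week * minutes


def total_met_minutes(responses):
    """Sum weekly MET-minutes over walking, moderate and vigorous PA.

    responses: dict mapping activity -> (days_per_week, minutes_per_day).
    """
    return sum(met_minutes_per_week(a, d, m) for a, (d, m) in responses.items())


# Hypothetical participant; the 240 min of moderate PA is truncated to 180 min.
example_participant = {"walking": (5, 30), "moderate": (2, 240), "vigorous": (3, 20)}
example_total = total_met_minutes(example_participant)
```

For this participant the three components are 495, 1,440 and 480 MET-minutes/week, respectively.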
Participants who answered “do not know” or “prefer not to answer” to either question were excluded. Those reporting “unable to walk”, lacking data on either frequency or duration, or with a total activity time exceeding 960 min were also excluded. Further, reported durations of less than 10 min were recoded to zero²⁴. After these exclusions, 23,080 and 20,986 participants were included in the liver fat and liver volume cohorts, respectively.
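These cleaning rules can be sketched in Python as follows. The data layout, the answer labels and the function name `clean_participant` are hypothetical, and the total activity time is interpreted here as the sum of the daily durations for walking, moderate PA and vigorous PA, which is an assumption about how the 960 min limit was applied.

```python
EXCLUDING_ANSWERS = {"do not know", "prefer not to answer", "unable to walk"}
MAX_TOTAL_MINUTES = 960  # upper limit on total reported activity time
MIN_BOUT_MINUTES = 10    # shorter durations are recoded to zero


def clean_participant(responses):
    """Return cleaned responses, or None if the participant is excluded.

    responses: dict mapping activity -> (days_per_week, minutes_per_day),
    where a value may be None (missing) or one of the answer labels above.
    """
    cleaned = {}
    for activity, (days, minutes) in responses.items():
        if days in EXCLUDING_ANSWERS or minutes in EXCLUDING_ANSWERS:
            return None
        if days is None:
            return None  # missing frequency
        if days == 0:
            minutes = 0  # zero days/week implies zero duration
        elif minutes is None:
            return None  # missing duration
        if minutes < MIN_BOUT_MINUTES:
            minutes = 0
        cleaned[activity] = (days, minutes)
    if sum(m for _, m in cleaned.values()) > MAX_TOTAL_MINUTES:
        return None
    return cleaned


# Hypothetical participants.
kept = clean_participant({"walking": (5, 30), "moderate": (2, 5), "vigorous": (0, None)})
dropped = clean_participant({"walking": (7, 400), "moderate": (7, 400), "vigorous": (7, 200)})
```

In the first example the 5 min of moderate PA is recoded to zero and the participant is retained; in the second the total of 1,000 min exceeds the limit and the participant is excluded.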
The participants were divided into three PA groups (low, moderate and high) based on the categorical score criteria stated in the IPAQ guidelines (Text S1)²⁴. All exclusions are presented in Figs. S1 and S2.
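For orientation, the categorical criteria of the published IPAQ short-form scoring protocol can be sketched in Python as below. This is an assumption-laden illustration: the criteria actually applied are those given in Text S1, which may differ in detail, and summing days across activity types is one common way of operationalizing "days of any combination". The function name and input layout are hypothetical.

```python
def ipaq_category(days, minutes, met_minutes):
    """Classify one participant as 'low', 'moderate' or 'high'.

    days, minutes, met_minutes: dicts keyed by 'walking', 'moderate', 'vigorous'
    holding days/week, minutes/day and MET-minutes/week, respectively.
    """
    total_met = sum(met_minutes.values())
    total_days = sum(days.values())  # assumption: days are summed across activities

    # High: vigorous PA on >= 3 days with >= 1500 MET-min/week, or
    # >= 7 days of any combination with >= 3000 MET-min/week.
    if (days["vigorous"] >= 3 and total_met >= 1500) or (
        total_days >= 7 and total_met >= 3000
    ):
        return "high"

    # Moderate: >= 3 days of vigorous PA of >= 20 min/day, or
    # >= 5 days of moderate PA and/or walking of >= 30 min/day, or
    # >= 5 days of any combination with >= 600 MET-min/week.
    days_30min = sum(days[a] for a in ("walking", "moderate") if minutes[a] >= 30)
    if (
        (days["vigorous"] >= 3 and minutes["vigorous"] >= 20)
        or days_30min >= 5
        or (total_days >= 5 and total_met >= 600)
    ):
        return "moderate"
    return "low"


# Hypothetical participant: walks 30 min on 5 days/week and reports no other PA.
example_group = ipaq_category(
    days={"walking": 5, "moderate": 0, "vigorous": 0},
    minutes={"walking": 30, "moderate": 0, "vigorous": 0},
    met_minutes={"walking": 495.0, "moderate": 0.0, "vigorous": 0.0},
)
```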
Other lifestyle measures
Age at baseline/recruitment was truncated to whole years. The sex of each participant was acquired from the central registry at recruitment. Information on genotyping array and population substructure (first 20 genetic principal components) was obtained from the UKB genomic data. BMI was calculated from height and weight measured during the initial Assessment Centre visit. If either height or weight was missing, no BMI value was calculated.
The participants were asked to report their baseline smoking status (never, previous, current or prefer not to answer) in a touchscreen questionnaire. Participants were also asked to report their alcohol consumption status and to estimate their current alcohol intake frequency. Individuals reporting an alcohol intake frequency of at least once or twice a week were asked to estimate their average weekly consumption of a variety of alcoholic beverages. These measures were used to estimate an average weekly alcohol intake, from which an estimated daily consumption was derived²⁵. The Townsend deprivation index was calculated prior to recruitment to the UKB based on the preceding national census output areas. Each participant received a score based on the geographical area determined by their postcode. All analyses were performed as complete-case analyses; the covariates were not imputed, and the number of individuals included in each model is reported in the respective tables.
Assessment of MASLD and chronic liver disease
Cases of MASLD and CLD were defined based on hospital health outcome codes (ICD-9 and ICD-10 codes) (Table S3). Diagnoses from all of the participants’ hospital inpatient records were part of the dataset, including diagnoses made both before and after imaging data collection, i.e., prevalent and incident cases.
Self-reported cases of CLD from a nurse’s interview were also included (Table S3)²⁵. If the participants were uncertain about their illness, the interviewer, a trained nurse, tried to classify it based on their description. Any illness that the nurse could not code was recorded as a free-text description, which was later reviewed by a doctor who either coded the illness or marked it as “unclassifiable”. The MASLD/CLD cohort included 239,308 individuals, among whom 172 cases of MASLD and 371 cases of CLD were identified.
Statistical analyses
All statistical analyses were performed using Stata (version 15, StataCorp, College Station, TX, USA). Both liver fat content and liver volume variables had skewed distributions and were transformed using rank-based inverse-normal transformation. Linear regression analyses were performed to assess the association of GRS and PA with liver fat content/volume, assuming an additive effect. Interaction analyses for GRSs and PA were performed by introducing an interaction term (GRS × PA) in the regression models, along with the main effect terms. All analyses including genotype as a variable were performed while adjusting for (a) basic model covariates: age, sex, genotyping array, and population substructure (first 20 principal components), (b) main model covariates, i.e., the basic model covariates as well as smoking status (never/previous/current), alcohol consumption (g/day), and Townsend deprivation index, and (c) main model covariates and BMI. Analyses of the effect of PA on liver fat content/liver volume (lacking genotype as a variable) were adjusted for main model covariates except genotyping array and population substructure.
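The rank-based inverse-normal transformation can be illustrated with a short Python sketch using only the standard library. The analyses themselves were run in Stata; the Blom offset (c = 3/8), the averaging of tied ranks, the function name and the example values are assumptions made for illustration, since the exact variant used is not stated.

```python
from statistics import NormalDist


def rank_inverse_normal(values, offset=0.375):
    """Map values to normal scores via their ranks (Blom offset by default)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values and give each the mean of their ranks.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


# Hypothetical, right-skewed liver fat values (%).
liver_fat_percent = [1.8, 2.4, 3.1, 4.0, 5.5, 9.7, 21.3]
transformed = rank_inverse_normal(liver_fat_percent)
```

The transformed values depend only on the ranks, so the long right tail of the original distribution no longer dominates the regression.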
To evaluate the effect of PA on GRSLF and GRSLV in relation to MASLD and CLD, linear regression analyses with robust standard errors were performed. We considered this the primary analysis, rather than the more standard logistic regression, because we wanted to measure interaction on an additive scale. This scale is more relevant from a public health perspective when investigating which subgroups would benefit most from PA²⁶. Beta coefficients and 95% confidence interval (CI) limits were multiplied by 100 so that the results are expressed as changes in percentage points.
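The meaning of interaction on the additive scale can be illustrated with a simplified Python sketch. It assumes a dichotomized GRS (low/high) and dichotomized PA (inactive/active), whereas the study models used the full scores; in that simplified, unadjusted case the interaction coefficient of a linear regression on the binary outcome equals the difference between two risk differences. All counts and names below are invented for illustration and are not study results.

```python
def risk(cases, total):
    """Proportion of participants with the outcome."""
    return cases / total


def additive_interaction_pp(counts):
    """Difference in the PA risk difference between high and low GRS groups.

    counts: dict mapping (grs_high, active) -> (cases, total), each flag 0 or 1.
    Returns the interaction contrast in percentage points.
    """
    r = {group: risk(*cell) for group, cell in counts.items()}
    pa_effect_low_grs = r[(0, 1)] - r[(0, 0)]
    pa_effect_high_grs = r[(1, 1)] - r[(1, 0)]
    return 100 * (pa_effect_high_grs - pa_effect_low_grs)


# Invented counts: (cases, total) per group.
toy_counts = {
    (0, 0): (20, 10000),  # low GRS, inactive
    (0, 1): (10, 10000),  # low GRS, active
    (1, 0): (60, 10000),  # high GRS, inactive
    (1, 1): (20, 10000),  # high GRS, active
}
toy_interaction = additive_interaction_pp(toy_counts)
```

In this toy example PA is associated with a 0.1 percentage point lower risk in the low GRS group and a 0.4 percentage point lower risk in the high GRS group, giving an additive interaction of -0.3 percentage points.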
Logistic regression analyses were also performed to assess the effect of PA on GRSLF and GRSLV in predicting MASLD and CLD. Interaction analyses were performed by introducing an interaction term (GRS × PA) in the logistic regression models, along with the main effect terms. The analyses were adjusted according to the models listed above.
Ethics approval statement
The UKB has ethical approval from the North West Multi-Centre Research Ethics Committee (ref: 11/NW/0382), and informed written consent was obtained from all participants prior to the study. The current study was further approved by the Swedish central ethics committee (diary number 2019-03073).
Patient consent statement
All UK Biobank participants gave consent at recruitment.
