Materials and Methods

Sampling

For the present study, 889 Italian Large White pigs were used. The animals were purebred pigs included in the Italian sib test genetic evaluation scheme run by the Associazione Nazionale Allevatori Suini (ANAS; www.anas.it) and were reared under the same environmental conditions at the genetic test station with a quasi ad libitum feeding level (60% of the pigs were able to ingest the entire supplied ration). At about 150 kg of live weight the animals were transported to a commercial abattoir located about 25 km from the test station, in accordance with Council Regulation (EC) No. 1/2005 regarding the protection of animals during transport and related operations. At the slaughterhouse, the pigs were electrically stunned and bled in a supine position in agreement with Council Regulation (EC) No. 1099/2009 regarding the protection of animals at the time of slaughter. All slaughter procedures were monitored by the veterinary team appointed by the Italian Ministry of Health. Carcass weight was recorded, and backfat thickness was measured with a Fat-O-Meter (FOM) at 8 cm off the midline of the carcass, between the third and fourth last ribs. Samples of blood and backfat were collected, immediately frozen in liquid nitrogen and stored at -20°C.

Determination of backfat fatty acid composition

The backfat tissue samples collected after slaughter were stored at -20°C until processed. Backfat fatty acid (FA) composition was determined by direct trans-esterification, following the protocol reported by Murrieta et al. (2003). For each sample, 50 mg of frozen backfat was used for total lipid extraction, and 0.5 mg of C19:0 methyl ester in hexane was added to each tube as internal standard. Gas chromatography was performed on a GC-2010 Plus High-end Gas Chromatograph (Shimadzu Corporation, Tokyo, Japan), using an SP™-2560 Capillary GC Column (Sigma-Aldrich, Merck, Darmstadt, Germany). Backfat FA composition was expressed as the ratio between each FA and the total FA. Due to missing information in the pedigree or in the measured phenotypes, 798 animals were retained for the following steps. The means and standard deviations of the phenotypic traits measured in the 798 animals used for the genome-wide association study (GWAS) are reported in Table 1.
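As a minimal illustration of expressing composition as a ratio to the total, the sketch below normalises a set of FA amounts for one sample. The function name, the example values and the choice to exclude the C19:0 internal standard from the total are assumptions made for illustration only; they are not taken from the protocol.

```python
def fa_proportions(amounts, internal_standard="C19:0"):
    """Return each fatty acid as a fraction of the total.

    amounts: dict mapping fatty acid name -> measured amount (any common unit).
    The internal standard is dropped before normalising (an assumption).
    """
    fa = {name: value for name, value in amounts.items()
          if name != internal_standard}
    total = sum(fa.values())
    if total <= 0:
        raise ValueError("total fatty acid amount must be positive")
    return {name: value / total for name, value in fa.items()}


# Hypothetical amounts for one sample (illustrative values only).
example = {"C16:0": 24.0, "C18:0": 13.0, "C18:1": 43.0,
           "C18:2": 20.0, "C19:0": 5.0}
proportions = fa_proportions(example)
```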

Genotyping data quality control and imputation of the missing genotypes

DNA was extracted from the blood samples. The animals were then genotyped using the Illumina PorcineSNP60 v2 BeadChip (Illumina Inc., San Diego, CA, USA), which contains 61,565 SNP markers distributed across the whole genome (Ramos et al., 2009). Quality control of the high-density SNP data was carried out in PLINK (Purcell et al., 2007): SNPs with more than 10% missing genotypes, a minor allele frequency below 0.01, or a deviation from Hardy-Weinberg equilibrium with a p-value below 0.001 were filtered out. After quality control the data set included 49,662 markers. All individuals had a call rate greater than 0.90 and passed the quality control. Markers were then mapped using the pig genome assembly Sus scrofa build 10.2, and unmapped SNPs were excluded. Markers located on the sex chromosomes were also excluded from the study. Finally, missing genotypes were imputed using Beagle version 3.3.2 (Browning and Browning, 2009). The final number of SNPs included in the GWAS was 45,704.
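The three SNP-level filters described above can be sketched for a single marker as follows. This is not the PLINK implementation: the function name and the 0/1/2 genotype coding are hypothetical, and the Hardy-Weinberg check uses a 1-degree-of-freedom chi-square approximation as a simplification, whereas PLINK's own test may differ.

```python
import math


def snp_passes_qc(genotypes, max_missing=0.10, min_maf=0.01, hwe_p_min=0.001):
    """Apply missingness, MAF and HWE filters to one SNP.

    genotypes: list with 0, 1 or 2 (copies of one allele) or None (missing).
    """
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n_total == 0 or n == 0:
        return False

    # 1) Discard SNPs with more than 10% missing genotypes.
    if (n_total - n) / n_total > max_missing:
        return False

    # 2) Discard SNPs with minor allele frequency below 0.01.
    p = sum(called) / (2 * n)
    if min(p, 1 - p) < min_maf:
        return False

    # 3) Discard SNPs deviating from Hardy-Weinberg equilibrium (p < 0.001).
    q = 1 - p
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi_sq / 2))  # chi-square tail, 1 df
    return p_value >= hwe_p_min


# Hypothetical markers for illustration.
balanced = [0] * 25 + [1] * 50 + [2] * 25      # in HWE proportions
all_het = [1] * 100                            # strong HWE deviation
sparse = [0] * 20 + [1] * 40 + [2] * 20 + [None] * 20
```

In this sketch a marker is kept only if it passes all three checks, mirroring the "filtered out if any criterion fails" rule stated in the text.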

Genome-wide association study

Before performing the GWAS, the phenotypes of the 798 animals were adjusted for carcass weight (as a covariate), slaughter day (27 slaughtering batches), sex (castrate or female), animal (using a pedigree with 2,301 individuals) and litter effects (393 litters) using an animal model.

The associations between the genotypes and the adjusted phenotypes were assessed using the Bayes B approach as implemented in the GenSel software (Fernando and Garrick, 2014). The model was the following:

$$\mathbf{y} = \sum_{i=1}^{k} \mathbf{z}_i \alpha_i \delta_i + \mathbf{e}$$

where $\mathbf{y}$ was the vector of the adjusted phenotypes for each trait; $\mathbf{z}_i$ was the vector of the coded genotypes for a SNP at locus i (i = 1 to k, where k is the number of SNPs); $\alpha_i$ was the effect of the allele substitution of the SNP at locus i; $\delta_i$ was a random 0/1 variable representing the absence (0) or the presence (1) of SNP i in the model for a certain iteration of the Markov chain Monte Carlo (MCMC) procedure; and $\mathbf{e}$ was the vector of normally distributed random residuals. Alternate homozygous genotypes were coded as -10 and 10, and heterozygotes as 0. The prior probability (π) that the SNPs had no effect ($\delta_i = 0$) on the adjusted phenotypes was fixed at π = 0.985, and consequently the prior probability of the markers having an effect on the adjusted phenotypes ($\delta_i = 1$) was 1 − π = 0.015. Thus, the model fitted approximately 745 SNPs in each iteration. A total of 500,000 iterations were run, with a burn-in of 100,000.
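To make the model terms concrete, the sketch below evaluates the sum of coded genotype × substitution effect × inclusion indicator for one animal in one MCMC sample, and the expected number of SNPs fitted under the prior. It is not GenSel code: the function names, the example values and the choice of which homozygote receives -10 are assumptions for illustration.

```python
def code_genotype(n_copies):
    """Map allele counts 0/1/2 to the -10/0/10 coding described in the text.

    Which homozygote is coded -10 is arbitrary here (an assumption).
    """
    return {0: -10, 1: 0, 2: 10}[n_copies]


def fitted_value(allele_counts, alpha, delta):
    """Sum over loci of z_i * alpha_i * delta_i for one animal."""
    return sum(code_genotype(g) * a * d
               for g, a, d in zip(allele_counts, alpha, delta))


def expected_snps_in_model(k, pi):
    """Expected number of SNPs with delta_i = 1 under the prior."""
    return k * (1 - pi)


# Hypothetical animal with four loci; only loci 1 and 4 are in the model.
value = fitted_value([0, 1, 2, 2], [0.02, 0.05, -0.01, 0.03], [1, 0, 0, 1])
n_expected = expected_snps_in_model(49662, 0.985)
```

Numerically, (1 − 0.985) × 49,662 ≈ 745, which matches the figure quoted above if k is taken as the marker count after quality control.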

A Bayes Factor was calculated for each locus i ($\mathrm{BF}_i$) to evaluate the statistical relevance of the association between each SNP and the adjusted phenotypes. The Bayes Factor was

$$\mathrm{BF}_i = \frac{\hat{p}_i / (1 - \hat{p}_i)}{(1 - \pi) / \pi}$$

where $\hat{p}_i$ is the posterior probability of marker i being included in the model at a given iteration of the Markov chain Monte Carlo procedure, and π is the prior probability that a SNP has no effect. Generally, the association of a marker with the trait is considered substantial for BF above 3.2, strong for BF between 10 and 100, and decisive for BF > 100 (Kass and Raftery, 1995).
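A short sketch of this calculation follows, assuming the Bayes Factor is the ratio of the posterior odds of inclusion, p̂/(1 − p̂), to the prior odds of inclusion, (1 − π)/π. The function names and the label used for BF at or below 3.2 are hypothetical; the other thresholds follow the Kass and Raftery (1995) categories quoted above.

```python
def bayes_factor(p_hat, pi=0.985):
    """Posterior odds of inclusion divided by prior odds of inclusion."""
    if not 0.0 <= p_hat < 1.0:
        raise ValueError("p_hat must be in [0, 1)")
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must be in (0, 1)")
    posterior_odds = p_hat / (1.0 - p_hat)
    prior_odds = (1.0 - pi) / pi
    return posterior_odds / prior_odds


def evidence_category(bf):
    """Classify a Bayes Factor using the thresholds given in the text."""
    if bf > 100:
        return "decisive"
    if bf >= 10:
        return "strong"
    if bf > 3.2:
        return "substantial"
    return "weak"  # label for BF <= 3.2 is an assumption


# A marker included in half of the retained iterations (hypothetical value).
bf_half = bayes_factor(0.5)
```

A marker whose posterior inclusion probability equals the prior (0.015) gives BF = 1, that is, the data have not changed the odds of inclusion.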

Furthermore, for each trait we estimated the collective genetic variance of the SNPs included in consecutive non-overlapping 1-Mb windows, based on the marker positions in Sus scrofa assembly build 10.2. This approach made it possible to consider the combined effects of SNPs that are closely located and could be in linkage disequilibrium (LD). Contiguous 1-Mb windows that each explained at least 0.5% of the total genetic variance were merged and considered together, as in Ros-Freixedes et al. (2016). Merging also made it possible to account for LD between markers located in neighbouring windows or spanning more than 1 Mb. LD in candidate regions was evaluated using Haploview software (Barrett et al., 2005).
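The window-merging rule can be sketched as below. The data structure (a dictionary keyed by chromosome and window index, where window n spans n to n + 1 Mb), the function name and the example values are assumptions for illustration, not the authors' implementation.

```python
def merge_windows(window_variance, threshold=0.005):
    """Merge contiguous 1-Mb windows that each reach the variance threshold.

    window_variance: dict mapping (chromosome, window_index) -> proportion
    of total genetic variance explained by that window.
    Returns a list of (chromosome, start_mb, end_mb) regions.
    """
    selected = sorted(key for key, var in window_variance.items()
                      if var >= threshold)
    regions = []
    for chrom, idx in selected:
        last = regions[-1] if regions else None
        if last and last[0] == chrom and last[2] == idx:
            # Window starts where the previous region ends: extend it.
            regions[-1] = (chrom, last[1], idx + 1)
        else:
            regions.append((chrom, idx, idx + 1))
    return regions


# Hypothetical window variances (illustrative values only).
windows = {
    (8, 118): 0.007, (8, 119): 0.012, (8, 120): 0.002,
    (8, 121): 0.006, (14, 121): 0.009,
}
regions = merge_windows(windows)
```

In this example the two adjacent windows on chromosome 8 form one 2-Mb region, while the window at 121 Mb stays separate because the window between them falls below the threshold.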

Functional characterization of the genes mapped in the most relevant regions

Candidate genes in the most associated regions were identified through Ensembl (EMBL-EBI), using the BioMart tool (Guberman et al., 2011) (url: http://www.ensembl.org/biomart/). Their functional annotation was then analysed using Enrichr (Kuleshov et al., 2016) (url: http://amp.pharm.mssm.edu/Enrichr/), with the aim of identifying pathways of genes involved in the genetic determinism of the studied traits. Among the results obtained from Enrichr, the pathways from both the Reactome Pathway Database (Fabregat et al., 2016) and the KEGG PATHWAY database (Kanehisa et al., 2016) were taken into account, in order to obtain more complete information about the pathways related to the most associated regions. One of the regions associated with backfat FA composition comprised microRNAs. To identify genes potentially targeted by these microRNAs, the corresponding human (hsa) miRNAs were queried on PicTar (url: http://pictar.mdc-berlin.de/), utilising the algorithm for the predictions in vertebrates (Krek et al., 2005).

The associated regions obtained were also compared with the known QTLs reported in QTLdb (Hu et al., 2016) (url: http://www.animalgenome.org/cgi-bin/QTLdb/index).

Estimation of the genotypic effects for the most relevant markers

For the most interesting genes according to the GWAS results and the functional characterization analysis, we selected a tag SNP in order to further evaluate its effect. Estimated means and differences between genotypes were assessed using a model that also included carcass weight (as a covariate), slaughter day, sex, and litter. The program Rabbit (Rabbit programme, 2012) was used.

In document Genomics and New Approaches to Study Complex Traits in Pigs and Other Livestock Species. A Focus on the Investigation of Gene Networks Related to Fat Quality and Deposition in Pigs and Preliminary Research to Study Factors Related to Performances in Pigle (Page 53-57)