Abstract
Galectins are a family of carbohydrate-binding proteins with diverse functions in a wide range of cellular processes. Several galectins play roles in tumorigenesis and cancer progression, and altered expression of various galectins has been observed in numerous malignancies. Because promoter polymorphisms have been linked to expression differences, the goal of this thesis was to use public genome databases to locate single nucleotide polymorphisms (SNPs) in human galectin promoters, particularly those overlapping putative transcription factor binding sites and methylation sites, and to investigate any possible association with cancer susceptibility. It was hypothesized that galectin promoter SNPs overlapping methylation and transcription factor binding sites may be associated with various cancers exhibiting altered galectin expression. To test this hypothesis, the project had the following objectives:
1: Parse the Oncomine database for expression differences in galectins 1-4 and 7-10 between cancerous and normal tissue.
2: Locate SNPs in upstream regulatory regions of the genes coding for the above galectins using the International HapMap Project database.
3: Screen SNP sequences for putative overlapping transcription factor binding sites in upstream regulatory regions using the fSNP, Consite, and TESS (TRANSFAC and IMD) search platforms.
4: Locate individual CpG sites overlapping upstream SNPs. Search for known CpG islands coinciding with the SNP sites using the UCSC Genome Browser, and inspect sequences for putative CpG islands using CpG Island Searcher, CpG Plot, and CpG Island Explorer.
5: Screen Illumina whole-genome SNP data archived at the Gene Expression Omnibus for association between the presence of galectin promoter SNPs and cancers showing significant galectin expression differences.
A possible association between the rs3763959 polymorphism upstream of the galectin-9 start site and human breast carcinoma was observed, along with a more tenuous but still statistically significant association of the galectin-1 upstream polymorphism rs4820294 with a single melanoma dataset.
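As an illustration of the fourth objective, the check for whether a SNP position falls within a CpG dinucleotide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the thesis: the function names `cpg_sites_overlapping` and `snp_affects_cpg` are hypothetical, and the flanking sequences are made-up placeholders rather than real galectin promoter sequence.

```python
def cpg_sites_overlapping(sequence: str, snp_index: int) -> list[int]:
    """Return 0-based start indices of CpG dinucleotides that include snp_index.

    A position can belong to a CpG either as the C (CpG starts at snp_index)
    or as the G (CpG starts at snp_index - 1).
    """
    seq = sequence.upper()
    if not 0 <= snp_index < len(seq):
        raise ValueError("snp_index is outside the sequence")
    hits = []
    for start in (snp_index - 1, snp_index):
        if start >= 0 and seq[start:start + 2] == "CG":
            hits.append(start)
    return hits


def snp_affects_cpg(flank5: str, alleles: list[str], flank3: str) -> dict[str, bool]:
    """For each allele, report whether the SNP position lies in a CpG site.

    flank5 and flank3 are the sequences immediately upstream and downstream
    of the SNP (hypothetical inputs for this sketch).
    """
    result = {}
    for allele in alleles:
        seq = flank5 + allele + flank3
        result[allele] = bool(cpg_sites_overlapping(seq, len(flank5)))
    return result


if __name__ == "__main__":
    # Placeholder flanks: the C allele creates a CpG, the T allele removes it.
    print(snp_affects_cpg("AAT", ["C", "T"], "GTT"))
```

A SNP for which the alleles give different answers creates or abolishes a potential methylation site, which is the kind of candidate the fourth objective is meant to flag. Detecting whether the site also lies within a CpG island requires the dedicated tools named in that objective.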