MedicalResearch.com Interview with:
Dr Mandy Peffers BSc MPhil PhD BVetMed FRCVS
Wellcome Trust Clinical Intermediate Fellow
Institute of Ageing & Chronic Disease
Faculty of Health & Life Sciences
University of Liverpool, Liverpool, UK
MedicalResearch.com: What is the background for this study?
Response: The project was an extension of Louise Pease’s MSc research project in bioinformatics, which aimed to re-analyse existing RNA-seq data to determine age-related changes in gene expression in musculoskeletal tissues that may lead to the development of degenerative diseases. From the existing literature we identified that degenerative diseases such as osteoarthritis and tendinitis were more prevalent in females and became more frequent following menopause. We looked at the biology of the cohort we were trying to assess and discovered a gender imbalance; we hypothesised that this was why few results had been obtained from the original analysis. So we developed a research proposal that detailed extending the existing data with publicly available data and merging the experiments, to increase the number of replicates available and balance the experimental design. We conducted multiple analyses and discovered that splitting samples by age and gender obtained the most significant results, and that whilst in a lot of cases the same genes were being differentially expressed, they were changing in opposite directions. Louise remembered that her statistics lecturer, Gerard Cowburn (Ged), had taught her about the assumptions of statistical tests; in particular, that covariance analysis (which has previously been used to show that age and gender do not affect gene expression) assumes that, under the conditions being tested, data points are not opposites. Realising that this assumption had been violated by the data, she began to think about what other assumptions we were working with and how to test their validity.
We checked the distribution of the data when male and female samples were mixed, and it was a Poisson distribution. Ged had also taught Louise that a Poisson distribution could conceal two negative binomial distributions, so she checked again with the samples separated by sex. Separating by sex put the data into the negative binomial distribution, explaining the increase in significant results. Louise had been struck by the fact that, during the first analysis, female transcripts were all identified as isoforms, and remarked to co-author Simon Cockell, “What are the chances!” We discussed the human genome project and my hypothesis that males and females are globally genetically different, and he suggested that de novo assemblies of male and female transcriptomes could indicate whether this was the case. We re-analysed with males and females in completely separate pipelines with sex-specific transcriptome assemblies. When the analysis was completed, the number of transcripts identified as significantly different had more than tripled. Fewer, larger transcripts were identified, which was evidence of an increase in the efficiency and accuracy of transcriptome assemblies and potential evidence that males and females are globally genetically different.
MedicalResearch.com: What are the main findings?
Response: In summary, the main findings were that males and females are transcriptionally different and that gene expression in aged cells moves in opposite directions. Separate analysis of male and female data reduced variance and improved data distribution measures, dramatically increasing statistical power with a relatively low number of replicates. Sex-specific transcriptome assemblies further boosted statistical power by reducing the number of transcripts against which p-values were corrected, leading to the identification of 19,816 significantly differentially expressed transcripts in females. The results showed that only one gene, CRABP2, which is involved in retinoic acid binding and signalling, was decreased in expression in both males and females. In males this was associated with an increase in cell cycle, and in females a decrease in cell cycle was implicated by the gene expression changes. Therefore it could be that the development of tendinopathy is higher in females because damaged cells cannot be replaced, and this may be linked with reductions in hormone concentrations that occur following menopause. In males there was evidence of a reduction in immune signalling, which is associated with the immune system targeting cells for destruction. This fits with mechanisms implicated in the development of rheumatoid arthritis, a degenerative musculoskeletal disease most prevalent in aged males.
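To illustrate why correcting p-values against fewer transcripts can increase the number of significant results, the sketch below applies a Benjamini-Hochberg adjustment to the same three raw p-values, first on their own and then alongside seven additional non-significant tests. The interview does not say which correction method the study used, so Benjamini-Hochberg is an assumption here, and the p-values and the function name `bh_adjust` are invented for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values for three transcripts of interest.
raw = [0.001, 0.004, 0.03]

small = bh_adjust(raw)              # corrected against 3 tests
large = bh_adjust(raw + [0.5] * 7)  # corrected against 10 tests (7 hypothetical extras)

print("3 tests: ", [round(p, 4) for p in small])
print("10 tests:", [round(p, 4) for p in large[:3]])
```

With only three tests, all three adjusted p-values stay below 0.05. With ten tests, the third transcript's adjusted p-value rises above 0.05 even though its raw p-value is unchanged.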
MedicalResearch.com: What should clinicians and patients take away from your report?
Response: When Louise was growing up her father had a great saying, “assumption is the mother of all screw ups”, which he regularly quoted to her. In the early days of the omics era the experiments and technologies were criticised for being hypothesis free, but I always thought this was their greatest strength, because a hypothesis-free experiment can be an assumption-free experiment. RNA-seq was conceived and developed to overcome the limitations and assumptions of microarrays; it is, to this day, one of the most powerful technologies available to biologists. RNA-seq is not dependent on processing steps that can introduce biases, and RNA-seq data can provide a comprehensive functional view of the transcriptome which can be used to unpick the functions of the genome. The data produced from an RNA-seq experiment are highly complex and multi-dimensional, and can be affected by a multitude of factors. Consequently the inferences that can be made from RNA-seq experiments depend on experimental design, assumptions made during data processing, assumptions made during the development of tools and technologies, and assumptions made during the collection of data on which your experiment and analysis rely. We would like readers to take from this report that males and females are not biologically, metabolically, genetically or transcriptionally the same; these always were assumptions, and they were wrong. As scientists we can never prove anything definitively; we can only find evidence for or against hypotheses, and we must always review and test our assumptions in light of new evidence. We must always consider the underlying assumptions of our statistical tests, technologies and experiments, and how our methods may influence our results and their interpretation.
MedicalResearch.com: What recommendations do you have for future research as a result of this study?
Response: We would recommend that biologists and bioinformaticians ensure that they conduct well-planned, gender-balanced studies. Scientists should record as much detail as possible about the phenotypes of their test subjects, experimental conditions, and anomalies observed during the course of the experiments. These data should be deposited alongside any molecular data in repositories to ensure they can be considered in all subsequent analyses. Most importantly, apart from the condition being tested, all experimental conditions should be kept the same. If something physically looks like it may be different (larger cells, different shape, gender, age group etc.), treat it as if it is different; this is especially important in genomic studies due to their sensitivity. For many years sex differences have been identified in biological data and adjusted for using statistical smoothing and scaling techniques. Do not adjust for differences, because we don’t really know what they are or the extent to which they may impact on results. If you find unexpected differences in your data, consider this a good thing; use those differences to generate a new comparison, a differential equation that can be solved. Try to limit, or at least test, assumptions wherever possible, and always consider the impacts of your methods. Always consult and listen to a bioinformatician prior to conducting omics experiments. A good bioinformatician will focus your question and ensure your experiment is well planned so you can answer the question at hand.
MedicalResearch.com: Is there anything else you would like to add?
Response: We would like to emphasise that omics experiments must be designed to answer one specific question. The sensitivity, depth and power of omics technologies mean that you cannot “throw things into the mix” and expect to draw conclusions from your results. Set out to answer one question well, ensure you have enough replicates to answer that question, and process all experimental samples together. Be critical in your evaluation of your own work and do not disregard or fail to report evidence that goes against your own or any other existing hypotheses; remember, they are hypotheses. Do not be afraid to prove yourself wrong. We did many times during this project; so many things were the opposite of what we expected to see and of what we would have hypothesised.
Louise would like to express her gratitude to all of her teachers, specifically her PhD supervisor Ian Singleton, for the scholarship, for always asking the right questions, and for the important lessons he taught her (some of which are detailed in the previous paragraph), and Ged, who taught her much of the statistical theory she utilised in this project. Louise is also indebted to her co-authors and supervisors, who have been incredibly supportive of her throughout the course of her MSc and this project. In terms of disclosures, it is true we have held back some information, for now. Louise stated, ‘I will say, when I was given the data and identified the gender imbalance I took to social media in a strop stating “I am a bioinformatician not a bioinformagician”, but that was an assumption’.
MedicalResearch.com: Thank you for your contribution to the MedicalResearch.com community.
Louise I. Pease, Peter D. Clegg, Carole J. Proctor, Daryl J. Shanley, Simon J. Cockell, Mandy J. Peffers. Cross platform analysis of transcriptomic data identifies ageing has distinct and opposite effects on tendon in males and females. Scientific Reports, 2017; 7 (1) DOI: 10.1038/s41598-017-14650-z