pwilmart/Start_Here


Start_Here

Brief descriptions and links to my repositories and other content.

  • Start_Here - This repository. Updated November 26, 2025.

Latest Content:

  • mouse_liver_TMT-reanalysis - Reanalysis of some very large-scale TMT mouse liver data from the Gygi Lab published in 2015 and 2021. There are a lot of steps in analyzing quantitative proteomics data. These large, multi-plex TMT experiments are about as complicated as the data gets. Detailed discussions of processing bottom-up TMT-labeled quantitative proteomics data are provided along with how best to combine multiple TMT plexes.

  • power_of_proteomes - Classic biochemistry characterizes one protein at a time. We can now measure thousands of proteins simultaneously; however, presentation and interpretation of results still seem to be one protein at a time. The power of looking at entire proteomes is demonstrated with examples of measured proteomes having contaminating proteomes. How to recognize contaminating proteomes and how to handle mixed proteome data are discussed.

  • detecting_deamidation - Your search engine probably has deamidation as a variable modification turned on by default. Do you know why? Is it just a poor way to pick up identifications when the precursor mass determination algorithm accidentally picked the first isotopic peak instead of the monoisotopic peak? See how poorly search engines deal with this common post-translational modification.

  • quantitative_proteomics_comparison - A comparison between DIA, spectral counting, and TMT isobaric labeling using similar composition proteomes (animal lenses). It is a measurement science comparison so the focus is on data quality/characteristic metrics. Result lists are of little use in evaluating these techniques.

  • mouse_lens_development_Khan2018_reanalysis - A thorough and detailed re-analysis of a TMT-labeled bottom-up quantitative proteomics study. The experiment tracks the developing mouse lens proteome at two embryonic ages (E15 and E18, in days) and four postnatal ages (P0, P3, P6, and P9). The salient points are:

    • doing quantitative proteomics without using ratios
    • combining multi-plex TMT experiments
    • understanding samples with a few highly abundant proteins
    • understanding how data normalization and statistical testing results are coupled
    • preparing results in ways that facilitate data exploration and discovery
  • PXD030990_human-tear_re-analysis - A re-analysis of human tear samples characterized in a single-shot experimental design. Tear has a few highly abundant proteins that make deep proteome profiling without fractionation impossible. Single-shot experimental designs have gained popularity but they are much more limiting than seems to be realized. Proteomic depth is a case of getting out what you put in. A single LC run won't get you much. Short gradient single LC runs will get you even less.

  • Human_rhesus_TMT - Analysis discussion of a multi-sample, multi-fraction, multi-kit, multi-species TMTpro experiment. Details how to analyze an experiment with 21 rhesus samples and 24 human samples (45 samples total) in 3 plexes labeled with TMTpro 18-plex reagents, using 17 channels per plex (15 samples plus 2 pooled standards).

  • quantitative_proteomics_data_cleaning - A discussion of basic data cleaning concepts for quantitative proteomics data and some useful notebook quality control (QC) metrics.

  • TMT_channel_cross_talk - A short discussion of TMT channel cross talk for current TMT tags with N- and C-series reporter ions. Answers the questions of how much of a given tag's signal ends up in other tags (and which ones) and whether it is worthwhile to try to do corrections.

  • Mammalian_sperm_PXD003164 - How do you characterize the same proteome (sperm) across a series of mammals? There are several issues: what series of FASTA files do you use? Where do you get them? How do you know if they are up to the task? How would you compare a series of different (but similar) proteomes?

Table of Contents:

Blogs

Website Blogs

README Blogs

GitHub markdown (and the auto rendering of repository README.md files as nice webpages) creates a fast way to do technical blogging. Supporting files and images are easier to add to a repository than to a formal website. Repositories can also be great for sharing presentations (meeting content or training resources).

  • IRS_validation - Notebooks demonstrating how Internal Reference Scaling (IRS) in multiplex TMT experiments works. (Jan. 2019)
  • talk_to_repo_example - Tutorial on turning talks and posters into GitHub content. (Nov. 2019)
  • PRIDE_submission_tutorial - A guide to submitting PAW pipeline results to PRIDE. (May 2020)
  • precursor_mass_corrections - Is monoisotopic peak picking for MS2 scans a problem that needs solving? (April 2021)
  • score_distributions_FDR - Get your annoying tail out of my good scores! (April 2021)
  • Mammalian_sperm_PXD003164 - Reanalysis of mammalian sperm samples from a variety of species. Illustrates good and poor FASTA file choices. (May 2021)
  • Installing R kernel in Jupyter notebooks - How to add an R kernel to Jupyter notebooks. (Aug. 2021)
  • Gene-set-enrichment_STRING-DB - Short tutorial on doing gene set enrichment with STRING-DB. (Sep. 2021)
  • human_tear_references - A summary of quantitative tear proteomics references up to April 2022. Stimulated tearing confounds (probably) all these studies. (April 2022)
  • TMT_PAW_pipeline - Details about how TMT labeling is handled in the PAW pipeline. (Oct. 2022)
  • TMT_channel_cross_talk - A deeper dive on adjacent channel cross talk for TMTpro 18-plex. Covers how large the effect is and some pros and cons of correction. (Dec. 2022)
  • Human-plasma_DIA-vs-TMT - An apples-to-aardvarks comparison of human plasma proteomes from DIA versus TMT. (Feb. 2023)
  • PXD011691_reanalysis - Reanalysis of data from PXD011691 - another DIA versus TMT experiment. (Feb. 2023)
  • quantitative_proteomics_data_cleaning - A discussion of basic data cleaning concepts for quantitative proteomics data and some useful notebook quality control (QC) metrics. (April 2023)
  • Human_rhesus_TMT - Analysis discussion of a multi-sample, multi-fraction, multi-kit, multi-species TMTpro experiment. Details how to analyze an experiment with 21 rhesus samples and 24 human samples (45 samples total) in 3 plexes labeled with TMTpro 18-plex reagents, using 17 channels per plex (15 samples plus 2 pooled standards). (Oct. 2023)
  • PXD030990_human-tear_re-analysis - A re-analysis of human tear samples characterized in a single-shot experimental design. Tear has a few highly abundant proteins that make deep proteome profiling without fractionation impossible. Single-shot experimental designs have gained popularity but they are much more limiting than seems to be realized. Proteomic depth is a case of getting out what you put in. A single LC run won't get you much. Short gradient single LC runs will get you even less. (Nov. 2023)
  • mouse_lens_development_Khan2018_reanalysis - A thorough and detailed re-analysis of a TMT-labeled bottom-up quantitative proteomics study. The experiment tracks the developing mouse lens proteome at two embryonic ages (E15 and E18, in days) and four postnatal ages (P0, P3, P6, and P9). The salient points are:
    • doing quantitative proteomics without using ratios
    • combining multi-plex TMT experiments
    • understanding samples with a few highly abundant proteins
    • understanding how data normalization and statistical testing results are coupled
    • preparing results in ways that facilitate data exploration and discovery
      (Jan. 2024)
  • quantitative_proteomics_comparison - Comparison of DIA and DDA quantitative methods used in a few animal eye lens studies. Raises questions about single-shot quantitative experimental designs and whether DIA actually lives up to its hype. (June 2025)
  • detecting_deamidation - Deamidation (conversion of amides to acids) is a post-translational modification that occurs in long lived proteins and in the test tube during sample preps. It is commonly specified as a variable modification in search engines. Can deamidation be reliably detected in the presence of isotopic peaks? (June 2025)
  • power_of_proteomes - Thinking about proteomics data as entire proteomes rather than one protein at a time has powerful advantages. Understanding contaminating proteomes is explored to illustrate the power of proteomes. (Oct. 2025)
  • mouse_liver_TMT_reanalysis - Reanalysis of some large-scale mouse liver quantitative TMT-labeling proteomic studies. The data is from Gygi Lab publications in 2015 and 2021. Detailed discussion of how to analyze TMT data and how to combine data from multiple TMT plexes.

Software

  • PAW_pipeline - The PAW/Comet proteomics pipeline

  • fasta_utilities - Utilities for downloading and prepping FASTA files

  • utilities - Some miscellaneous utility scripts

  • annotations - Scripts for adding UniProt annotations to results lists

  • PAW_BLAST - Scripts for BLAST ortholog matching

  • Z-score_GUI - Script for sliding-window Z-score analyses


Analyses

Internal_Reference_Scaling

Real_Time_Search

PAW_TMT

Other_TMT

MS2_TMT

Spectral_Counting


Meetings


Other_Repositories

Forked Repositories
