ProteinGPS: Simulate, Visualize and Learn

The data are re-simulated every time a parameter on the right changes. The "real" activities referred to below are simply activities assigned to clones from an additive model, with random weights drawn from a Gaussian distribution. These activities are hidden from the modeling, which only "sees" observed activities. Observed activities are subject to noise and to the correlation between the assay and the activity we care about.
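As a rough illustration of the simulation described above, the sketch below draws Gaussian weights, assigns each variant a random set of substitutions, and adds measurement noise. It is a minimal sketch, not the tool's actual code: the function name `simulate`, its parameters, the weight spread of 0.3, and the choice to add weights to a WT activity of 1 are all assumptions, and the surrogate-correlation step is left out.

```python
import random

def simulate(n_positions=10, subs_per_variant=3, n_variants=50,
             noise_sd=0.2, seed=0):
    """Return (weights, designs, real, observed) for a toy additive model.

    All names and default values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Hidden "real" weight of each substitution, drawn from a Gaussian.
    weights = [rng.gauss(0.0, 0.3) for _ in range(n_positions)]
    designs, real, observed = [], [], []
    for _ in range(n_variants):
        # Each variant carries a random subset of the substitutions.
        subs = sorted(rng.sample(range(n_positions), subs_per_variant))
        # Additive model: start from WT activity 1 and add the weights.
        activity = 1.0 + sum(weights[i] for i in subs)
        designs.append(subs)
        real.append(activity)
        # Modeling only ever sees this noisy measurement.
        observed.append(activity + rng.gauss(0.0, noise_sd))
    return weights, designs, real, observed
```

Calling `simulate(noise_sd=0.0)` returns observed activities identical to the real ones, which is a convenient sanity check before turning the noise up.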
Figure 1: Distribution of Measured Activities. Activities below the level of detection (LOD) are treated as zero. In this tool, the activity of the WT (reference) protein is set to 1, and any clone with activity below 0.1 is considered "dead". Watch this distribution change as you adjust the parameters on the right.
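The thresholding rules in the Figure 1 caption can be sketched as follows. The 0.1 "dead" cutoff and the WT activity of 1 come from the caption; the LOD value of 0.05, the label "active", and the function names are hypothetical, since the text does not give them.

```python
LOD = 0.05          # hypothetical level of detection (not stated in the text)
DEAD_CUTOFF = 0.1   # from the caption: below 0.1 relative to WT is "dead"

def apply_lod(measured, lod=LOD):
    """Replace every measurement below the level of detection with zero."""
    return [m if m >= lod else 0.0 for m in measured]

def classify(activity, dead_cutoff=DEAD_CUTOFF):
    """Label a clone by its activity relative to WT (WT = 1)."""
    return "dead" if activity < dead_cutoff else "active"
```

For example, `apply_lod([0.02, 0.5])` zeroes the first value and keeps the second, and `classify(0.0)` labels the zeroed clone as "dead".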
Figure 2: Real vs Observed (Measured) activities (the top 5 "real" clones are in green). True activities of proteins are "unknown" in the real world. In this hypothetical scenario, we compare them with what we can measure, which includes noise in the measurements (white and black). If a surrogate assay is chosen, it is important to know the correlation between what we care about and what we can measure.
Figure 3: Real Weights vs Model Weights (the top 10 real weights are in green). Again, "real" weights are never known, but they can be inferred from Sequence-Activity Models. Such models tolerate even noise in the assay. ProteinGPS technology is based on top substitutions instead of top clones. See how the top substitutions remain in the top half after noise is introduced.
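One way to see how weights can be inferred from measured activities is an ordinary least-squares fit of the additive model. The sketch below is a stand-in only: the text does not say which Sequence-Activity Model the tool fits, and `fit_weights`, `solve`, and the list-of-indices design format are assumptions made for illustration. WT activity is fixed at 1 rather than fitted.

```python
def fit_weights(designs, measured, n_positions, wt_activity=1.0):
    """Least-squares estimate of one additive weight per substitution."""
    # Normal equations (X^T X) w = X^T y, where X[v][i] = 1 if variant v
    # carries substitution i, and y is the activity relative to WT.
    xtx = [[0.0] * n_positions for _ in range(n_positions)]
    xty = [0.0] * n_positions
    for subs, activity in zip(designs, measured):
        y = activity - wt_activity
        for i in subs:
            xty[i] += y
            for j in subs:
                xtx[i][j] += 1.0
    return solve(xtx, xty)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design does not determine every weight")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x
```

Ranking substitutions by fitted weight, for instance with `sorted(range(n_positions), key=lambda i: fitted[i], reverse=True)`, gives the "top substitutions" view of the data rather than the "top clones" view.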
Figure 4: Actual vs Predicted activities (the top 5 actual are in green). Models from Figure 3 are in turn used to predict the activity of clones. This process of transforming measured activities allows us to determine outliers and also to deconvolute the noise. See how, even with added noise, modeling tightens the correlation with "real" activity (try large noise with a lot of clones).
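The prediction and outlier steps described for Figure 4 might look like the sketch below. It assumes the same additive model with WT activity 1; the function names and the rule of flagging residuals larger than `k` standard deviations are illustrative choices, not something the text specifies.

```python
import statistics

def predict(subs, weights, wt_activity=1.0):
    """Predicted activity of a variant under the additive model."""
    return wt_activity + sum(weights[i] for i in subs)

def flag_outliers(designs, measured, weights, k=2.0):
    """Indices of variants whose residual exceeds k standard deviations."""
    residuals = [m - predict(d, weights) for d, m in zip(designs, measured)]
    sd = statistics.pstdev(residuals)
    if sd == 0:
        return []  # measurements match the model exactly
    return [v for v, r in enumerate(residuals) if abs(r) > k * sd]
```

A flagged variant is one whose measurement disagrees with what its substitutions predict, which may point to a bad measurement or to an effect the additive model does not capture.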

Design Characteristics:

Number of Varying Positions
Substitutions per Variant
Number of Variants

Protein Characteristics:

Deleterious Mutations (%)
Avg. Effect of Substitutions

Assay Characteristics:

Assay Noise (%)
Surrogate Correlation