EDA: High Target Gradients
SageWorks EDS
The SageWorks toolkit a set of plots that show EDA results, it also has a flexible plugin architecture to expand, enhance, or even replace the current set of web components Dashboard.
The SageWorks framework has a broad range of Exploratory Data Analysis (EDA) functionality. Each time a DataSource or FeatureSet is created that data is run through a full set of EDA techniques:
 TBD
 TBD2
One of the latest EDA techniques we've added is the addition of a concept called High Target Gradients
 Definition: For a given data point (x_i) with target value (y_i), and its neighbor (x_j) with target value (y_j), the target gradient (G_{ij}) can be defined as:
[G_{ij} = \frac{y_i  y_j}{d(x_i, x_j)}]
where (d(x_i, x_j)) is the distance between (x_i) and (x_j) in the feature space. This equation gives you the rate of change of the target value with respect to the change in features, similar to a slope in a twodimensional space.
 Max Gradient for Each Point: For each data point (x_i), you can compute the maximum target gradient with respect to all its neighbors:
[G_{i}^{max} = \max_{j \neq i} G_{ij}]
This gives you a scalar value for each point in your training data that represents the maximum rate of change of the target value in its local neighborhood.

Usage: You can use (G_{i}^{max}) to identify and filter areas in the feature space that have high target gradients, which may indicate potential issues with data quality or feature representation.

Visualization: Plotting the distribution of (G_{i}^{max}) values or visualizing them in the context of the feature space can help you identify regions or specific points that warrant further investigation.
Additional Resources
 Consulting Available: SuperCowPowers LLC