datascience.tables.Table.scatter

Table.scatter(column_for_x, select=None, overlay=True, fit_line=False, colors=None, labels=None, **vargs)

Creates scatterplots, optionally adding a line of best fit.

Each plot uses the values in column_for_x for horizontal positions. One plot is produced for every other column as y (or for the columns designated by select).

Every selected column must be numerical.
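
The column-selection rule described above can be sketched in plain Python. The helper y_columns below is hypothetical and is not part of the datascience library; it only illustrates which columns end up as y-values, and it assumes select is given as a list of labels.

```python
def y_columns(labels, column_for_x, select=None):
    """Return the labels that would be plotted as y-values (illustrative only)."""
    if select is None:
        # Default: every column other than the x column.
        return [label for label in labels if label != column_for_x]
    # Otherwise only the columns named in select (assumed here to be a list).
    return [label for label in select if label != column_for_x]


print(y_columns(['x', 'y', 'z'], 'x'))                # ['y', 'z']
print(y_columns(['x', 'y', 'z'], 'x', select=['z']))  # ['z']
```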

Args:
column_for_x (str): The name of the column to use for the x-axis
values of the scatter plots.
Kwargs:
select: The columns to plot as y-values; by default, every column
other than column_for_x.

overlay (bool): if True, create a single chart with one color per
data column; if False, display each data column in a separate chart.

fit_line (bool): draw a line of best fit for each set of points.

vargs: Additional arguments that get passed into plt.scatter.
See http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.scatter for the arguments that can be passed into vargs; examples include marker and norm.

colors: A column of colors (labels or numeric values)

labels: A column of text labels to annotate dots

>>> table = Table().with_columns([
...     'x', [9, 3, 3, 1],
...     'y', [1, 2, 2, 10],
...     'z', [3, 4, 5, 6]])
>>> table
x    | y    | z
9    | 1    | 3
3    | 2    | 4
3    | 2    | 5
1    | 10   | 6
>>> table.scatter('x') 
<scatterplot of values in y and z on x>
>>> table.scatter('x', overlay=False) 
<scatterplot of values in y on x>
<scatterplot of values in z on x>
>>> table.scatter('x', fit_line=True) 
<scatterplot of values in y and z on x with lines of best fit>
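
For intuition about fit_line=True, the sketch below computes a line through the x and y columns of the example table in plain Python. It assumes the line of best fit is an ordinary least-squares straight line, which the documentation above does not state; best_fit_line is a hypothetical helper, not a datascience function.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Columns 'x' and 'y' from the example table.
slope, intercept = best_fit_line([9, 3, 3, 1], [1, 2, 2, 10])
print(round(slope, 3), round(intercept, 3))  # -0.806 6.972
```

The negative slope reflects that y falls as x rises in the example data; under the same assumption, the z column gives a slope of about -0.333.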