Visualization — scatter

When to use scatter

Scatter is the right choice when the question is whether two continuous measures move together across a population — margin vs revenue across brands, conversion rate vs traffic across pages, latency vs throughput across endpoints. Each row of the query becomes one point.

Use line instead when the x-axis is ordered (typically time) and you care about trend. Use heatmap when the data is dense and you need to see distribution rather than individual points.

Mapping

  • mapping.x — required. Numeric x-coordinate field. Rows where this value is non-finite are dropped silently.
  • mapping.y — required. Numeric y-coordinate field. Same finite-only filter.
  • mapping.label — optional. Field name used as the per-point tooltip label and as the cross-filter event payload when emission is enabled.
  • mapping.series — optional. Categorical field name. When present the chart switches to multi-series mode: every distinct value becomes its own colored group of points with a legend entry, all sharing the same x/y axes. One row is still one point — the field only decides which group (and color) the point belongs to.
  • mapping.size — optional. Numeric field encoded as point diameter, which turns the scatter into a bubble chart: x × y position plus a third measure as size. Works in both single-series and multi-series mode; bubble sizes use one global domain across all series so cohorts stay comparable. Rows with non-finite or negative size values are dropped, same as rows with bad x/y coordinates. Mutually exclusive with chart.symbol_size and chart.large_threshold; validation rejects the combination.
mapping:
  x: revenue
  y: margin_pct
  label: brand
  series: cohort      # optional — one colored group per distinct value
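
The mutual-exclusion rule on mapping.size is easy to trip over, so here is a hedged sketch of the two sizing modes side by side. The field name units_sold is a placeholder, not a field from a real model; the keys are the ones documented on this page, and cross_filter is switched off only to keep the fragment short.

```yaml
# Mode 1: bubble chart, point size driven by data
mapping:
  x: revenue
  y: margin_pct
  size: units_sold        # hypothetical numeric field
chart:
  size_range: [8, 40]     # documented default, written out for clarity
  cross_filter: false     # no click emission, so cross_filter_emit is not needed

# Mode 2: unsized scatter, one static point size
# mapping:
#   x: revenue
#   y: margin_pct
# chart:
#   symbol_size: 12       # allowed only because mapping.size is absent
#   cross_filter: false

# Rejected by validation: mapping.size together with
# chart.symbol_size or chart.large_threshold.
```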

chart shortcuts

The chart block is typed and closed: every key has a fixed type, and keys outside the schema are rejected by validation.

  • chart.point_color — base color for the points (hex), used in single-series mode. The emphasis block (see below) overrides this for the highlighted point. In multi-series mode use chart.series_colors instead.
  • chart.series_colors — multi-series only. A map of series value → color, e.g. { Champions: "#6c47ff", Rest: "#94a3b8" }. Any value not listed falls back to the default palette in order. Use hex/CSS colors, not Tailwind brand class names.
  • chart.symbol_size — base point size in pixels for unsized scatter (no mapping.size). Default depends on the chart density. Cannot be combined with mapping.size.
  • chart.size_range — sized mode only (mapping.size set). Min and max point diameter in pixels; the size field's domain is scaled into this range. Default [8, 40]. Inert when mapping.size is absent.
  • chart.size_scale — sized mode only. sqrt (default) makes bubble area proportional to the value — the perceptually correct encoding. linear maps value to diameter directly (over-emphasizes large values). Inert without mapping.size.
  • chart.large_threshold — multi-series only, default 2000. When a single series has more points than this, it switches to a faster bulk-draw mode; the trade-off is that per-point hover/emphasis is turned off for that series only. Smaller groups keep full hover. Raise it if you need hover on a big group and can afford the slower draw; lower it to keep very large groups responsive. Cannot be combined with mapping.size — bulk draw ignores per-point sizing.
  • chart.cross_filter — boolean, default true. Set to false to disable click emission entirely; in that case cross_filter_emit is not required.
  • chart.cross_filter_emit — "label", "x", or "series". Picks which value is emitted on click; "series" emits the clicked point's group value and is only meaningful in multi-series mode. Required whenever chart.cross_filter is not explicitly false; the schema rejects a chart block that has neither cross_filter_emit nor cross_filter: false.
  • chart.legend — set legend.show: true to display the series legend (the cohort names) in multi-series mode.
  • chart.height — pixel height of the viz container.
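
Why chart.size_scale defaults to sqrt can be seen with a little geometry. The derivation below is a simplification: it ignores the offset introduced by the minimum of chart.size_range and writes k for an arbitrary scaling constant. Both are assumptions made for illustration, not the renderer's exact formula. With v the size value, d the point diameter, and A the point area:

```latex
\begin{aligned}
\text{sqrt:}\quad   & d = k\sqrt{v} &&\Rightarrow\quad A = \pi\left(\tfrac{d}{2}\right)^{2} = \tfrac{\pi k^{2}}{4}\, v \\
\text{linear:}\quad & d = k\,v      &&\Rightarrow\quad A = \pi\left(\tfrac{d}{2}\right)^{2} = \tfrac{\pi k^{2}}{4}\, v^{2}
\end{aligned}
```

Under sqrt, doubling the value doubles the bubble area. Under linear, doubling the value quadruples the area, which is why linear over-emphasizes large values.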

Legend & tooltip

chart.legend and chart.tooltip share the same shape as on bar. For scatter the tooltip is most useful with trigger: item — hovering reveals the label and both coordinates of one point at a time.
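
As a rough sketch, a multi-series scatter with a visible legend and per-point tooltips might configure the two blocks as follows. Only keys that appear elsewhere on this page are used; the full shape is the one documented for bar, and cross_filter is turned off here only to keep the fragment short.

```yaml
chart:
  cross_filter: false   # no click emission, so cross_filter_emit is not needed
  legend:
    show: true          # show one legend entry per series value
  tooltip:
    trigger: item       # one point at a time: label plus both coordinates
```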

Axes

chart.x_axis and chart.y_axis share the same shape (with one extra on x_axis):

  • name — axis title. Setting both is recommended for scatter so the audience can read the relationship.
  • name_location, name_gap.
  • axis_label.show, axis_label.rotate, axis_label.interval, axis_label.color, axis_label.font_size, axis_label.font_weight, axis_label.formatter, axis_label.max_chars.
  • x_axis.visible_window — integer ≥ 1. Restricts the visible x range.
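
A sketch of how these keys might nest, assuming the dotted names above map to nested YAML keys the same way x_axis.name does in the worked examples. All values are illustrative; the units of name_gap and the exact effect of visible_window are not specified in this section, so treat the numbers as placeholders.

```yaml
chart:
  cross_filter: false       # keeps the fragment short; not related to axes
  x_axis:
    name: Revenue
    name_gap: 32            # placeholder value
    visible_window: 20      # x_axis only; integer >= 1
    axis_label:
      rotate: 30
      font_size: 11
      max_chars: 12
  y_axis:
    name: Margin
    axis_label:
      show: true
      color: "#475569"
```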

format

  • format.x or format[<x_field_name>] — pattern for x-axis labels and tooltip x value.
  • format.y or format[<y_field_name>] — pattern for y-axis labels and tooltip y value.
  • format.size or format[<size_field_name>] — pattern for the size value in the tooltip when mapping.size is set.
  • format at the root — fallback.
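
The two addressing styles can be sketched as below, shown as alternatives. The field names assume mapping.x is revenue and mapping.y is margin_pct, and the patterns are the ones used in the worked examples on this page.

```yaml
# Style 1: address the pattern by role
format:
  x: "$#,##0"
  y: "#,##0.00%"

# Style 2: address the pattern by field name
# format:
#   revenue: "$#,##0"
#   margin_pct: "#,##0.00%"
```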

Cross-filter behavior

  • Clicking a point cross-filters the rest of the dashboard by the value selected with chart.cross_filter_emit: the point's label, series, or x value.
  • The clicked field must be declared as a parameter in at least one model used by the dashboard, or the click is silently ignored.
  • The top-level emphasis block can declaratively highlight the matching point inside the same viz when a related cross-filter is active.
  • Disable per viz with chart.cross_filter: false.
emphasis:
  field: brand
  value_from_param: highlight_brand
  marker_color: "#6c47ff"
  marker_size: 18

The point whose brand equals the runtime value of highlight_brand is rendered in marker_color at marker_size (unsized scatter) or with a relative size boost (bubble / sized scatter); the rest stay at chart.point_color / chart.symbol_size.
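
For a viz that should only display data and never drive the rest of the dashboard, a minimal sketch of the opt-out looks like this. The color and size values are placeholders.

```yaml
chart:
  point_color: "#94a3b8"
  symbol_size: 10
  cross_filter: false   # click emission off; cross_filter_emit can be omitted
```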

Worked examples

Margin vs revenue across brands:

id: brand_margin_vs_revenue
title: Margin vs Revenue by Brand
query: "models/ec_revenue.malloy::by_brand"
type: scatter
mapping:
  x: revenue
  y: margin_pct
  label: brand
chart:
  height: 360
  cross_filter_emit: label   # required unless chart.cross_filter is false
  point_color: "#0f766e"
  symbol_size: 12
  x_axis:
    name: Revenue
  y_axis:
    name: Margin
  tooltip:
    trigger: item
    formatter: "{b} — {c0} / {c1}"
format:
  revenue: "$#,##0"
  margin_pct: "#,##0.00%"
published: true

With emphasis from a dashboard pill:

type: scatter
mapping:
  x: revenue
  y: margin_pct
  label: brand
chart:
  point_color: "#94a3b8"
  symbol_size: 10
  cross_filter_emit: label   # required unless chart.cross_filter is false
emphasis:
  field: brand
  value_from_param: highlight_brand
  marker_color: "#6c47ff"
  marker_size: 18

Multi-series — comparing two customer cohorts on one frequency × ticket plane. Each row is one customer; series splits them into colored groups with a shared scale, so the two cohorts are directly comparable in a single viz instead of two side-by-side charts:

id: cohorts_freq_vs_ticket
title: Frequency vs Ticket by cohort
query: "models/rfm.malloy::cohort_points"
type: scatter
mapping:
  x: order_frequency
  y: avg_ticket
  label: customer_id
  series: cohort          # e.g. "Champions" vs "Rest"
chart:
  height: 360
  series_colors:
    Champions: "#6c47ff"
    Rest: "#94a3b8"
  large_threshold: 2000   # the large "Rest" group draws fast; small "Champions" keeps hover
  legend:
    show: true
  cross_filter: true
  cross_filter_emit: series
  x_axis:
    name: Order frequency
  y_axis:
    name: Avg ticket
format:
  order_frequency: "#,##0"
  avg_ticket: "$#,##0.00"
published: true

Bubble chart — frequency × ticket × lifetime revenue per customer, anchored on the public BigQuery ecommerce dataset. Add mapping.size to encode a third measure as point diameter; the tooltip shows x, y, and size:

id: customer_value_bubbles
title: Frequency × Ticket × Lifetime value
query: "models/<workspace_slug>/customer_value.malloy::value_points"
type: scatter
mapping:
  x: order_frequency
  y: avg_ticket
  size: lifetime_revenue
  label: customer_id
  series: segment             # optional — multi-series bubble
chart:
  size_range: [8, 40]
  size_scale: sqrt            # default — area proportional to value
  series_colors:
    Champions: "#6c47ff"
  legend:
    show: true
  cross_filter: true
  cross_filter_emit: series
  x_axis:
    name: Order frequency
  y_axis:
    name: Avg ticket
format:
  order_frequency: "#,##0"
  avg_ticket: "$#,##0.00"
  size: "$#,##0"
published: true

The Malloy model should query bigquery-public-data.thelook_ecommerce (or a view derived from it). Sized scatters work best up to a few thousand points — beyond that, bubbles overlap and readability drops; for very large cohorts use unsized scatter with large_threshold instead.

Common pitfalls

  • Too many overlapping points. Scatter loses signal when there are thousands of points in the same area. large_threshold keeps a big group responsive but does not declutter it — for readability, pre-aggregate in the Malloy query (e.g. group by bucket and use a heatmap), or filter to top-N by some interesting measure. Bubble charts (mapping.size) overlap even faster; keep point counts in the low thousands or switch to unsized scatter.
  • Mixing bubble sizing with bulk draw. mapping.size cannot be combined with chart.symbol_size or chart.large_threshold — validation rejects it. Pick one mode per viz: data-driven sizes, or static size + adaptive bulk draw for huge cohorts.
  • Negative or extreme outliers compress the rest. Filter outliers in the query, or use a log scale by formatting the field. Rows with negative mapping.size values are dropped silently.
  • Axes are unnamed. Always set chart.x_axis.name and chart.y_axis.name — scatter is the viz where the audience most needs the labels to read the relationship.
  • Cross-filter clicks have no effect. The clicked field (label / series / x) must be declared as a parameter in at least one model used by the dashboard.