
Sources

Sources are step one of content reliability

Models depend on source aliases. If the alias, credential path, or dataset reference is wrong, every model, query, and visualization built on that source fails. Configure and validate runtime/sources.runtime.yml before writing any models.

If you have not set up your GCP service account yet, do that first. See BigQuery Dataset Access.

Working example: public BigQuery dataset

This is the exact configuration used in the ecommerce-showcase workspace. Copy it as your starting point and replace the project ID and the credentials path:

sources:
  ecommerce:
    name: The Look Ecommerce
    type: bigquery
    project_id: my-gcp-billing-project
    credentials_file: /workspace/secrets/my-workspace-bq.json
    datasets:
      - bigquery-public-data.thelook_ecommerce

Field by field:

  • ecommerce: the alias your Malloy models will use. Pick something short and stable, because changing it later breaks every model that references it.
  • project_id: the GCP project that pays for query costs. This is your billing project, not necessarily where the data lives.
  • credentials_file: the absolute path to the credentials file, written with the /workspace/ prefix. It should always point into secrets/.
  • datasets: the dataset locations the runtime can query. Here the data is in bigquery-public-data, a different GCP project from the billing project. That is normal and expected.
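
The field rules above can be checked mechanically before a push. What follows is a minimal Python sketch, not a looky feature: check_source is a hypothetical helper, it assumes the YAML has already been parsed into a dict, and it assumes every dataset entry is written as project.dataset, as in the example.

```python
def check_source(alias, cfg):
    """Return a list of problems found in one source entry (hypothetical helper)."""
    problems = []

    # project_id is the billing project and must be present.
    if not cfg.get("project_id"):
        problems.append(f"{alias}: project_id is missing")

    # credentials_file must be an absolute path under /workspace/secrets/.
    cred = cfg.get("credentials_file", "")
    if not cred.startswith("/workspace/secrets/"):
        problems.append(f"{alias}: credentials_file must start with /workspace/secrets/")

    # Assumption: each dataset is written as <project>.<dataset>.
    datasets = cfg.get("datasets") or []
    if not datasets:
        problems.append(f"{alias}: no datasets listed")
    for ds in datasets:
        parts = ds.split(".")
        if len(parts) != 2 or not all(parts):
            problems.append(f"{alias}: dataset {ds!r} is not in project.dataset form")

    return problems


# The ecommerce entry from the example, as a parsed dict.
ecommerce = {
    "name": "The Look Ecommerce",
    "type": "bigquery",
    "project_id": "my-gcp-billing-project",
    "credentials_file": "/workspace/secrets/my-workspace-bq.json",
    "datasets": ["bigquery-public-data.thelook_ecommerce"],
}
print(check_source("ecommerce", ecommerce))  # prints []
```

A check like this only catches shape errors. It cannot tell whether the credentials actually grant access, so it complements the validation commands and does not replace them.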

Working example: your own private dataset

When the data lives in your own GCP project, project_id and the dataset's project are the same:

sources:
  sales:
    name: Sales Data
    type: bigquery
    project_id: my-company-gcp-project
    credentials_file: /workspace/secrets/my-workspace-bq.json
    datasets:
      - my-company-gcp-project.sales_warehouse

You can define multiple sources in the same file:

sources:
  sales:
    name: Sales Data
    type: bigquery
    project_id: my-company-gcp-project
    credentials_file: /workspace/secrets/my-workspace-bq.json
    datasets:
      - my-company-gcp-project.sales_warehouse
  marketing:
    name: Marketing Data
    type: bigquery
    project_id: my-company-gcp-project
    credentials_file: /workspace/secrets/my-workspace-bq.json
    datasets:
      - my-company-gcp-project.marketing_events

Each alias becomes independently referenceable in models.

Source validation workflow

  1. Edit runtime/sources.runtime.yml from your workspace root.
  2. Run:
    looky sources list
    looky sources diff
    looky validate
  3. Fix any alias, credential, or dataset errors before touching models.

looky sources list shows the currently registered runtime aliases. looky sources diff shows what would change on push. looky validate checks structural consistency across the whole workspace.
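
If you want the three commands to run as a single gate, for example before a commit, a small wrapper can stop at the first failure. This is a sketch under one assumption that the text above does not confirm: that looky exits with a non-zero status when a command fails. run_checks is a hypothetical name, and the run parameter exists only so the function can be exercised without looky installed.

```python
import subprocess

# The three commands from the workflow, in the order given above.
CHECKS = [
    ["looky", "sources", "list"],
    ["looky", "sources", "diff"],
    ["looky", "validate"],
]


def run_checks(run=subprocess.run):
    """Run each check in order; return the first failing command, or None."""
    for cmd in CHECKS:
        # Assumption: a non-zero exit status means the check failed.
        result = run(cmd)
        if result.returncode != 0:
            return cmd
    return None
```

Calling run_checks() from the workspace root returns None when every command succeeds, and otherwise returns the command that failed so you know where to start fixing.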

Common source mistakes

  • Alias mismatch: a model references sales but the runtime defines ecommerce. Every query using that alias fails.
  • Wrong credentials path: the path must use the /workspace/secrets/ prefix, not a path relative to the repo root.
  • Dataset not in list: if a model queries a table whose dataset is not listed under datasets:, the query will be rejected at runtime.
  • Wrong billing project: if the service account has not been granted BigQuery Job User on the project named in project_id, queries will fail even if the data is readable.
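
The first and third mistakes can be approximated with two small checks. This Python sketch is illustrative only: missing_aliases and dataset_listed are hypothetical helpers, the table references are examples, and the matching rule (strip the table name, then compare against datasets) is an assumption about how the runtime decides, not its actual logic.

```python
def missing_aliases(model_aliases, sources):
    """Aliases referenced by models but not defined in the runtime file."""
    return sorted(set(model_aliases) - set(sources))


def dataset_listed(table_ref, source_cfg):
    """True if a project.dataset.table reference belongs to a listed dataset."""
    # Drop the final component (the table) to get project.dataset.
    dataset = table_ref.rsplit(".", 1)[0]
    return dataset in source_cfg.get("datasets", [])


# Parsed form of a runtime file that defines only the ecommerce alias.
sources = {
    "ecommerce": {"datasets": ["bigquery-public-data.thelook_ecommerce"]},
}

# A model that references sales has no matching alias.
print(missing_aliases(["sales"], sources))  # prints ['sales']

# A table inside the listed dataset passes; one outside it does not.
print(dataset_listed("bigquery-public-data.thelook_ecommerce.orders", sources["ecommerce"]))  # prints True
print(dataset_listed("bigquery-public-data.samples.shakespeare", sources["ecommerce"]))  # prints False
```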

Never work around source problems inside the model. Fix the alias and runtime config once, and keep models clean.