Understanding Queries
Queries in Cube Alchemy bring together metrics and dimensions to answer specific questions.
A query consists of three key components:
-
Query name: A unique identifier for referencing the query
-
Dimensions: Columns to group by (the "by what" in your analysis)
-
Metrics: Measures to calculate (the "what" in your analysis)
# Define metrics
cube.define_metric(name='Revenue', expression='[qty] * [price]', aggregation='sum')
cube.define_metric(name='Order Count', expression='[order_id]', aggregation='count')
# Define a query by region and product category
cube.define_query(
name="regional_sales",
dimensions={'region', 'category'},
metrics=['Revenue', 'Order Count']
)
# Execute the query
result = cube.query("regional_sales")
Execution Pipeline
When you run a query, Cube Alchemy processes your data in a clear, ordered pipeline:
flowchart LR
A["Apply Context State Filters"] --> B["Fetch Dimensions"]
B --> C["Calculate Metrics (Aggregations)"]
C --> D["Compute Post-Aggregation Metrics"]
D --> E["Apply HAVING Filter"]
E --> F["Apply SORT"]
F --> G["Return Final Result"]
Query Types
Queries must contain either dimensions, metrics, or both (a query cannot lack both).
Dimension-Only Queries
When you only need to see what unique dimension combinations:
# Define a query with only dimensions
cube.define_query(
name="dimension_combinations",
dimensions={'region', 'category'}
)
# Get all unique region/category combinations
combinations = cube.query("dimension_combinations")
Metric-Only Queries
When you need to calculate global aggregates:
# Define metrics
cube.define_metric(name='Total Revenue', expression='[qty] * [price]', aggregation='sum')
cube.define_metric(name='Total Orders', expression='[order_id]', aggregation='count')
# Define a query with no dimensions
cube.define_query(
name="global_totals",
metrics=['Total Revenue', 'Total Orders']
)
# Execute the query
global_results = cube.query("global_totals")
Computed Metrics and HAVING
Computed metrics are calculated after base metrics are aggregated. Define them once, then reference by name in queries.
# Define base metrics
cube.define_metric(name='Cost', expression='[cost]', aggregation='sum')
cube.define_metric(name='Margin', expression='[qty] * [price] - [cost]', aggregation='sum')
# Define a computed metric (post-aggregation)
cube.define_computed_metric(
name='Margin %',
expression='[Margin] / [Cost] * 100',
fillna=0
)
# Use computed metric by name and add a HAVING filter
cube.define_query(
name='margin_by_product',
dimensions={'product'},
metrics=['Margin', 'Cost'],
computed_metrics=['Margin %'],
having='[Margin %] >= 20'
)
df = cube.query('margin_by_product')
Notes:
-
Use [Column] syntax in expressions; registered functions are available as @name.
-
Computed metrics reference columns present in the aggregated result (metrics and dimensions).
Working with Filters
Queries automatically respect all active filters on your hypercube, allowing you to:
- Define a query once
- Apply different filters
- Execute the same query to see different filtered views of your data
# Define your query
cube.define_query(
name="product_sales",
dimensions={'region', 'product_category'},
metrics=['Revenue', 'Order Count']
)
# Get unfiltered results
unfiltered_results = cube.query("product_sales")
# Apply filters
cube.filter({'product_type': ['Electronics', 'Home']})
# Get filtered results using the same query
filtered_results = cube.query("product_sales")