Description

  • The Clickstream dataset is a user and event-level dataset that reports on every Page View and Conversion event tracked via the Rockerbox pixel on your site.
  • These events are attributed back to click-based marketing touchpoints, with the Rockerbox tier structure and spend keys applied for standardization with other Rockerbox conversion datasets.

Table Creation

  • This table is automatically created upon activation of the clickstream feature.
  • Once the Clickstream dataset has been enabled by Rockerbox, the table will appear in your data warehouse as ON_SITE_EVENTS_ALL_PAGES.

Partition Keys

  • date
💡 Note: Leverage partition keys when querying the table to improve query efficiency.

Logical Primary Key

While data warehouses do not enforce primary key constraints, the event_id functions as the logical primary key for the table.
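Because the constraint is not enforced by the warehouse, it can be worth checking uniqueness yourself after exporting rows. The following is a minimal sketch, assuming rows have been loaded into Python dictionaries keyed by column name; the helper name duplicate_event_ids and the sample rows are hypothetical.

```python
from collections import Counter

def duplicate_event_ids(rows):
    """Return event_id values that appear on more than one row."""
    counts = Counter(row["event_id"] for row in rows)
    return sorted(eid for eid, n in counts.items() if n > 1)

# Hypothetical sample rows; only event_id matters for this check.
sample_rows = [
    {"event_id": "e1", "action": "view"},
    {"event_id": "e2", "action": "view"},
    {"event_id": "e2", "action": "purchase"},
]
duplicates = duplicate_event_ids(sample_rows)
```

An empty result means event_id behaved as a unique key for the rows checked.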

Field Reference

| # | Name | Description | Type |
|---|------|-------------|------|
| 1 | action | Name of the raw pixel event. This includes events for all conversion segments configured in Rockerbox, plus page views. | str |
| 2 | advertiser | Rockerbox Account ID | str |
| 3 | base_id | Primary User ID | str |
| 4 | date | Date when the action occurred | date |
| 5 | engaged_session | Binary 0 or 1 indicating a session lasting > 10 seconds (session_max - session_min > 10). | int |
| 6 | event_id | A unique identifier for each action. Can be used as the primary key. | str |
| 7 | hash_ip_events | Hashed IP address of the user for a particular action | str |
| 8 | identifier | Advertiser-specific identifier | str |
| 9 | marketing_type | Type of marketing touchpoint. This will always be onsite, as this dataset reflects click-based marketing events only. | str |
| 10 | onsite_count | The total number of actions seen against a given user within a given session. | int |
| 11 | original_url | URL of the landing page | str |
| 12 | rb_sync_id | Identifier used by Rockerbox to sync the dataset to your warehouse | int |
| 13 | report | The name of the report | str |
| 14 | request_referrer | Page referrer (the previous site the user came from) | str |
| 15 | session_id | Identifier for a session, expressed as a timestamp. A unique user session is the combination of session_id and uid, or can be identified using session_start. | str |
| 16 | session_max | Timestamp of the last time a user was seen on site during a session | timestamp |
| 17 | session_min | Timestamp of the first time a user was seen on site during a session | timestamp |
| 18 | session_start | Binary field indicating the first event of a session. Can be used for session / visitor analysis by filtering for session_start = 1. | int |
| 19 | spend_key | The ID used to pull spend from an advertising platform. This is typically the Ad ID, but may differ based on your account setup. | str |
| 20 | tier_1 | Aligns to the 5-tiered categorization structure available in the Rockerbox UI. tier_1 = broadest categorization (level 1). | str |
| 21 | tier_2 | Aligns to the 5-tiered categorization structure available in the Rockerbox UI. tier_2 = level 2 categorization (more granular than level 1). | str |
| 22 | tier_3 | Aligns to the 5-tiered categorization structure available in the Rockerbox UI. tier_3 = level 3 categorization. | str |
| 23 | tier_4 | Aligns to the 5-tiered categorization structure available in the Rockerbox UI. tier_4 = level 4 categorization. | str |
| 24 | tier_5 | Aligns to the 5-tiered categorization structure available in the Rockerbox UI. tier_5 = level 5 categorization. | str |
| 25 | timestamp_action | Timestamp of when the action occurred | timestamp |
| 26 | timestamp_event | Timestamp of when the marketing touchpoint occurred. Populated only on the first action within a session, since that is the only action with a marketing touchpoint. On that row, timestamp_event and timestamp_action match. | timestamp |
| 27 | transform_table_id | ID associated with the Rockerbox table used to apply mappings and spend. Closed beta feature only. | int |
| 28 | uid | Rockerbox User ID cookie | str |
| 29 | updated_at | Time the cache record was most recently updated | timestamp |
| 30 | user_agent | Web identifier that includes characteristics like browser, device, operating system, and application | str |
| 31 | utm_campaign | utm_campaign value parsed from the landing page URL, if present | str |
| 32 | utm_content | utm_content value parsed from the landing page URL, if present | str |
| 33 | utm_id | utm_id value parsed from the landing page URL, if present | str |
| 34 | utm_medium | utm_medium value parsed from the landing page URL, if present | str |
| 35 | utm_source | utm_source value parsed from the landing page URL, if present | str |
| 36 | utm_term | utm_term value parsed from the landing page URL, if present | str |

Clickstream FAQ

How does Rockerbox handle a session that spans two dates (UTC)?

Rockerbox will break a session (generating a new session_id and date) if a user’s session is active across two days. This may be common if users are active around UTC midnight.

How does Rockerbox define a session? How can I make sure a session is unique when I query against the dataset?

  • The session_id used by Rockerbox is a timestamp of the first event in a session, rather than the session_id cached in your browser. This allows Rockerbox to maintain the same session_id when a user opens a new tab.
  • Because session_id is defined as a timestamp, it’s not unique per user. To identify unique user sessions, combine session_id with uid (session_id|uid) or filter for session_start = 1 in your query.
  • A session “expires” after 30 minutes of inactivity. If the user is seen as active again, the session_id will be reset.
  • A session is NOT reset if new source information is provided (for example, a UTM on an internal page).
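The session rules above, together with the UTC date break from the previous question, can be illustrated with a small sketch. This is not Rockerbox's implementation. It assumes events arrive as (uid, timestamp) pairs with timezone-aware UTC datetimes, that session_id is represented as epoch seconds, and that a gap of exactly 30 minutes does not yet expire a session. The names assign_sessions and session_key are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity window stated in the FAQ

def assign_sessions(events):
    """Label (uid, UTC datetime) events with session_id and session_start."""
    labeled = []
    last_seen = {}      # uid -> timestamp of that user's previous event
    current_start = {}  # uid -> timestamp that opened the current session
    for uid, ts in sorted(events):
        prev = last_seen.get(uid)
        new_session = (
            prev is None
            or ts - prev > SESSION_TIMEOUT  # expired after inactivity (assumed strict >)
            or ts.date() != prev.date()     # session crossed UTC midnight
        )
        if new_session:
            current_start[uid] = ts
        last_seen[uid] = ts
        session_id = int(current_start[uid].timestamp())  # assumed format: epoch seconds
        labeled.append({
            "uid": uid,
            "timestamp_action": ts,
            "session_id": session_id,
            "session_key": f"{session_id}|{uid}",  # unique per user session
            "session_start": 1 if new_session else 0,
        })
    return labeled

utc = timezone.utc
sample = [
    ("u1", datetime(2024, 1, 1, 23, 50, tzinfo=utc)),
    ("u1", datetime(2024, 1, 2, 0, 5, tzinfo=utc)),   # crosses UTC midnight: new session
    ("u1", datetime(2024, 1, 2, 0, 20, tzinfo=utc)),  # 15 minute gap: same session
    ("u1", datetime(2024, 1, 2, 1, 0, tzinfo=utc)),   # 40 minute gap: new session
    ("u2", datetime(2024, 1, 2, 0, 5, tzinfo=utc)),   # same timestamp as a u1 session start
]
sessions = assign_sessions(sample)
unique_sessions = {row["session_key"] for row in sessions}
session_starts = sum(row["session_start"] for row in sessions)
```

In this sample, u1 and u2 each start a session at the same timestamp, so counting distinct session_id values alone would undercount sessions. Counting session_key values or rows with session_start = 1 gives the same, correct total.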

How do Rockerbox sessions compare to other session sources?

  • Rockerbox’s source data is pixel-based events, with session logic layered on top of the source data to group disparate actions into connected sessions. Source pixel data may differ from session source data from other providers like Shopify, GA4, Amplitude, etc.
  • Rockerbox’s session definition (described above) may differ from other sources' session definitions and cannot be customized to match the definitions of other data providers.

How does Rockerbox handle bot traffic?

  • Today, the only filtering performed on top of your raw site data is to remove any uids seen > 200x in the same day. Additional bot filtering is not applied by default, knowing that brands often prefer a custom approach to this type of filtering.
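As an illustration of the threshold rule described above, and a starting point for a custom filter of your own, here is a minimal sketch. It assumes rows are Python dictionaries with uid and date keys and that the rule is applied per uid per day. The function name drop_high_volume_uids is hypothetical, and this is not Rockerbox's implementation.

```python
from collections import Counter

MAX_EVENTS_PER_DAY = 200  # threshold stated in the FAQ

def drop_high_volume_uids(rows, limit=MAX_EVENTS_PER_DAY):
    """Remove rows for any (uid, date) pair seen more than `limit` times."""
    per_day = Counter((row["uid"], row["date"]) for row in rows)
    return [row for row in rows if per_day[(row["uid"], row["date"])] <= limit]

# Hypothetical sample: one uid with 201 events in a day, one with 3.
rows = (
    [{"uid": "bot", "date": "2024-01-02"} for _ in range(201)]
    + [{"uid": "human", "date": "2024-01-02"} for _ in range(3)]
)
kept = drop_high_volume_uids(rows)
```

Passing a lower limit applies a stricter custom threshold to the same data.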

Why are some fields for a given row in the dataset blank?

  • Not all actions will have associated marketing context. Typically, only the first action in a session will pass along click-based marketing context in the URL. In your Rockerbox conversion data, the marketing context for each conversion event is carried over via this first action with marketing context. In this dataset, only the events that carry marketing context will have relevant fields populated to avoid any duplication.

Can I see non-click marketing context?

  • Non-click marketing context like view-based data from Linear TV, OTT, Display, and Social as well as other marketing context like promo code attribution or direct mail matching is not currently available in this dataset.

How can I identify a repeat visitor?

  • When a visitor returns to the site, they may or may not have the same Rockerbox cookie ID (uid). To string together a user path when the uid is NOT the same, Rockerbox applies an identity resolution process to our conversion datasets. This is not currently available for this dataset.
  • Users with > 200 events / day are excluded from this dataset under the assumption that these are admin users or server-side cookie IDs.

What is an engaged session?

  • An engaged session is a session in which the difference between the session_max and session_min timestamps is > 10 seconds.
  • It follows that a session with only 1 action cannot be an engaged session, since its session_min and session_max timestamps will be the same.
  • The engaged_session flag carries through all events within a given session (e.g., a session with 4 events that lasts > 10 seconds will have the engaged_session flag on each row). To compare engaged session starts to overall session starts, filter for session_start = 1 in your queries.
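The definition and the session_start filter described above can be sketched as follows. This assumes rows are Python dictionaries with session_start and engaged_session keys. The helper names is_engaged and engaged_start_rate and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta

ENGAGED_THRESHOLD = timedelta(seconds=10)  # threshold stated in the FAQ

def is_engaged(session_min, session_max):
    """Return 1 when the session lasted strictly more than 10 seconds, else 0."""
    return 1 if session_max - session_min > ENGAGED_THRESHOLD else 0

def engaged_start_rate(rows):
    """Share of session starts that belong to an engaged session."""
    starts = [row for row in rows if row["session_start"] == 1]
    if not starts:
        return 0.0
    return sum(row["engaged_session"] for row in starts) / len(starts)

t0 = datetime(2024, 1, 2, 12, 0, 0)
# Session A: two events 45 seconds apart. Session B: a single event.
flag_a = is_engaged(t0, t0 + timedelta(seconds=45))
flag_b = is_engaged(t0, t0)
rows = [
    {"session_start": 1, "engaged_session": flag_a},
    {"session_start": 0, "engaged_session": flag_a},  # flag repeats on every row
    {"session_start": 1, "engaged_session": flag_b},
]
rate = engaged_start_rate(rows)
```

Filtering to session_start = 1 first keeps multi-event sessions from being counted once per row, which would otherwise inflate the engaged share.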

Why does the Clickstream channel attribution differ from GA4?

  • Rockerbox applies custom channel rules per advertiser, using logic and parameters beyond UTMs, which can lead to differences in how a session is categorized.
  • Most advertisers have “Last Non-Direct Click Attribution” applied in GA4, meaning Direct attribution may be overwritten by the last non-Direct marketing channel seen against the same user.

How can I join the Clickstream dataset to the Clickstream Event Parameters dataset?

  • Each unique event_id in the Clickstream dataset will have multiple rows in the Event Parameters dataset, reflecting the individual parameters passed on the pixel. To retrieve specific query_param_name and value details, most advertisers will:
  • Join on the event_id and date
  • Filter by a specific query_param_name
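The join and filter steps above can be sketched in Python. This is a minimal sketch, assuming both datasets have been loaded as lists of dictionaries and that each event has at most one value per query_param_name. The function name join_event_params and the parameter name order_id are hypothetical examples, not fields Rockerbox guarantees.

```python
def join_event_params(clickstream_rows, param_rows, param_name):
    """Inner-join clickstream rows to one named parameter on (event_id, date)."""
    # Filter the parameter rows first, then index them by the join key.
    wanted = {
        (p["event_id"], p["date"]): p["value"]
        for p in param_rows
        if p["query_param_name"] == param_name
    }
    joined = []
    for row in clickstream_rows:
        key = (row["event_id"], row["date"])
        if key in wanted:
            joined.append({**row, param_name: wanted[key]})
    return joined

# Hypothetical sample data.
clickstream = [
    {"event_id": "e1", "date": "2024-01-02", "action": "purchase"},
    {"event_id": "e2", "date": "2024-01-02", "action": "view"},
]
params = [
    {"event_id": "e1", "date": "2024-01-02", "query_param_name": "order_id", "value": "A-100"},
    {"event_id": "e1", "date": "2024-01-02", "query_param_name": "revenue", "value": "59.00"},
    {"event_id": "e2", "date": "2024-01-02", "query_param_name": "page_title", "value": "Home"},
]
result = join_event_params(clickstream, params, "order_id")
```

Filtering by query_param_name before joining keeps the result at one row per event. Joining without the filter would fan each event out to one row per parameter.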

What is each query_param_name?

  • The query_param_name and value fields are parsed directly from your on-site pixels with no further modifications applied. While certain parameters are required for the Rockerbox implementation, in many cases additional parameters are also passed. Questions about which values are passed and what each one means will likely need to be investigated by your team, as the experts on your implementation and data layer, rather than by the Rockerbox team.
  • If you don’t see a certain query_param_name, check the name of the field in your implementation (for example, in GTM). Otherwise, check whether the query_param_name is passed anywhere on your pixel.