Profiles enable you to join customer profile data from your data warehouse with existing behavioral product data already in Amplitude.
This feature is currently in an open beta.
Profiles act as standalone properties, in that they aren't associated with specific events and are instead associated with a user profile. They're different from traditional user properties and offer the opportunity to conduct more expansive analyses.
Profiles always display the most current data synced from your warehouse.
Regardless of whether you're using Snowflake or Databricks, Change Data Capture (CDC) doesn't support replacing existing tables. Instead, you must use incremental modeling. If the table you integrate with drops and replaces data, the connection breaks.
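As a minimal sketch of what incremental modeling can look like, the statement below upserts rows into an existing table instead of dropping and recreating it. The table names `analytics.user_profiles` and `analytics.user_profiles_staging` and the columns `title` and `seniority` are hypothetical; substitute your own. This is one common pattern, not the only way to model incrementally.

```sql
-- Avoid patterns such as CREATE OR REPLACE TABLE ... AS SELECT ...,
-- which drop and replace the table that the connection depends on.

-- Instead, merge new and changed rows into the existing table.
-- Table and column names here are hypothetical placeholders.
MERGE INTO analytics.user_profiles t
USING analytics.user_profiles_staging s
    ON t.user_id = s.user_id
WHEN MATCHED THEN
    UPDATE SET t.title = s.title, t.seniority = s.seniority
WHEN NOT MATCHED THEN
    INSERT (user_id, title, seniority)
    VALUES (s.user_id, s.title, s.seniority);
```

Because the target table is modified in place, its change history stays intact between syncs.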
If this is your first time importing data from this table, set a data retention time and enable change tracking in Snowflake with the following commands:
ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET DATA_RETENTION_TIME_IN_DAYS = 7;

ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET CHANGE_TRACKING = TRUE;
On Snowflake Standard Edition plans, the maximum retention time is one day. If you’re on this plan, you should set the frequency to 12 hours in later steps.
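To confirm that the settings took effect, one option is to inspect the table metadata in Snowflake. The sketch below reuses the example table name from the commands above; the `retention_time` and `change_tracking` columns in the output should reflect the values you set.

```sql
-- List metadata for the example table; check the retention_time and
-- change_tracking columns in the result.
SHOW TABLES LIKE 'PROFILES_PROPERTIES_TABLE_1' IN SCHEMA DATAPL_DB_STAG.PUBLIC;
```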
Follow these instructions to enable change tracking in Databricks:
If you're working with a new table, set the table property delta.enableChangeDataFeed = true in the CREATE TABLE command:
CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.enableChangeDataFeed = true)
Also set spark.databricks.delta.properties.defaults.enableChangeDataFeed = true for all new tables.
If you're working with an existing table, set the table property delta.enableChangeDataFeed = true in the ALTER TABLE command:
ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
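A quick way to check the result is to read the property back. This sketch assumes the `myDeltaTable` example name used above, and it also shows the session-level default mentioned earlier written as a SQL `SET` statement.

```sql
-- Read back the property on the example table; the value should be true.
SHOW TBLPROPERTIES myDeltaTable ('delta.enableChangeDataFeed');

-- Enable the change data feed by default for tables created
-- later in this session.
SET spark.databricks.delta.properties.defaults.enableChangeDataFeed = true;
```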
Set a data retention period. This must be at least one day, but in most cases you should set this period to seven days or longer. If your retention period is too short, the import process can fail.
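The source doesn't name the exact setting for the retention period. As a sketch only, and assuming the period maps to Delta Lake's log and deleted-file retention table properties, a seven-day retention on the `myDeltaTable` example could look like the following. Confirm against your Databricks configuration before relying on it.

```sql
-- Assumption: retention is controlled through these Delta table properties.
ALTER TABLE myDeltaTable SET TBLPROPERTIES (
    'delta.logRetentionDuration' = 'interval 7 days',
    'delta.deletedFileRetentionDuration' = 'interval 7 days'
);
```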
To set up a profile in Amplitude, follow these steps:
In Amplitude Data, navigate to Connections Overview. Then in the Sources panel, click Add More. Scroll down until you find the Snowflake tile and click it.
On the Set Up Connection tab, connect Amplitude to your data warehouse by filling in all the relevant fields under Snowflake Credentials, which are outlined in the Snowflake Data Import guide. You can either create a new connection, or reuse an existing one. Click Next when you're done.
You can see a list of your tables under Select Table. To begin column mapping, click the table you're interested in.
In the list of required fields under Column Mapping, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click + Add field.
On the Select Data tab, select the profiles data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the Select Import Strategy dropdown.
When you're done, click Test Mapping to verify your mapping information. Then click Next.
Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. You should set the frequency to 12 hours if you are on Snowflake Standard Edition.
To set up a profile in Amplitude, follow these steps:
In Amplitude Data, navigate to Connections Overview. Then in the Sources panel, click Add More. Scroll down until you find the Databricks tile and click it.
In the Set Up Connection tab, connect Amplitude to your data warehouse. Have the following information ready:
Click Next when you're done.
You can see a list of your tables under Select Table. To begin column mapping, click the table you're interested in.
In the list of required fields under Column Mapping, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click + Add field.
In the Data Selection tab, select the profiles data type.
When you're done, click Test Mapping to verify your mapping information. Then click Next.
Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. The default frequency is 12 hours, but you can change it.
Profiles supports:
A user_id must go with each profile.
Field | Description | Example |
---|---|---|
user_id | Identifier for the user. Must have a minimum length of 5 characters. | |
Profile Property 1 | Profile property set at the user level. The value of this field is the value in the customer's source as of the last sync. | |
Profile Property 2 | Profile property set at the user level. The value of this field is the value in the customer's source as of the last sync. | |
Example:
{
  "user_id": 12345,
  "number of purchases": 10,
  "title": "Data Engineer"
}
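Before importing, it can help to check the source for rows that would break the user_id requirements described above. The query below is a sketch that assumes the source column is literally named `user_id` and uses the placeholder table name `DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME`; adjust both to match your warehouse.

```sql
-- Find rows with a missing user_id or one shorter than 5 characters.
-- The column and table names are placeholders.
SELECT user_id
FROM DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME
WHERE user_id IS NULL
   OR LENGTH(CAST(user_id AS STRING)) < 5;
```

An empty result means every row has a user_id that meets the minimum length.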
See this article for information on Snowflake profiles.
SELECT
    AS "user_id",
    AS "profile_property_1",
    AS "profile_property_2"
FROM DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME
When you remove profile values in your data warehouse, those values sync to Amplitude during the next sync operation. You can also use Amplitude Data to remove unused property fields from users in Amplitude.
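As an illustration, one way to remove a profile value in the warehouse is to set it to NULL on the affected row. This is a hedged sketch: the table `prod_users.demo_data`, the columns `uid` and `title`, and the ID value are example names, and treating a NULL as a removed value is an assumption rather than documented behavior.

```sql
-- Clear the title for a single user in place, without replacing the table.
-- Table, column, and ID values are illustrative only.
UPDATE prod_users.demo_data
SET title = NULL
WHERE uid = '12345';
```

Updating the row in place keeps the change visible to change tracking, so it can be picked up by the next sync.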
SELECT
    user_id as "user_id",
    upgrade_propensity_score as "Upgrade Propensity Score",
    user_model_version as "User Model Version"
FROM
    ml_models.prod_propensity_scoring

SELECT
    m.uid as "user_id",
    m.title as "Title",
    m.seniority as "Seniority",
    m.dma as "DMA"
FROM
    prod_users.demo_data m
October 1st, 2024