Last month I had the opportunity to contribute to the Salesforce.org community sprint and worked on the Test Data Generation project. I mainly worked with Patrick McNeal on a sample Snowfakery recipe for higher education.

Snowfakery is an open source tool from Salesforce.org that creates fake data with relationships and can handle large data volumes. It differs from other data generation tools in that it was built to address Salesforce-specific problems, such as managing relationships between records and inserting a high volume of related records into a Salesforce org.

This post is an overview of some of the capabilities with links to resources for using the tool to generate data in Salesforce orgs.

Basics

Output Types

Snowfakery can be used to generate data in CSV, SQL bulk insert, JSON and a variety of other formats. See the “Outputs” section of the Snowfakery documentation for the full list. The easiest way to get started is to try out some of the different data formats locally. Go through the steps in the “Install Snowfakery” section of these instructions to get started quickly. With the recipe included in these setup steps you can try out different output formats.

Data Generation Features

Snowfakery recipes are written as YAML files. Snowfakery includes a few features for managing relationships. Many parent-child relationships can be defined using the relationships feature. The friends feature adds options to specify the number of child records.
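Here is a minimal sketch of what a parent-child recipe can look like. The School and Student objects and their field names are hypothetical, chosen only for illustration; `count`, `fake`, `friends` and `reference` are Snowfakery recipe keywords.

```yaml
# Hypothetical objects: School is the parent, Student is the child.
- object: School
  count: 2
  fields:
    name:
      fake: company
  friends:
    # Child records generated alongside each School row.
    - object: Student
      count: 3
      fields:
        name:
          fake: name
        # Points each Student back at its parent School.
        school:
          reference: School
```

With this layout, the `count` under `friends` is where you control how many child records are created for each parent.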

Functions, variables and formulas provide a rich set of tools for creating fake data that is precise enough to serve as a realistic dataset for your use case. One of the most significant functions is the fake function, which generates fake data using the Python Faker library. See the list of options in the Faker docs.
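As a rough illustration, the recipe below combines a variable, a couple of functions and a formula. The Student object, its field names and the `example.edu` domain are assumptions made for this sketch; `first_name` and `last_name` are provider names from the Faker library.

```yaml
# A variable that later formulas can reuse (hypothetical value).
- var: school_domain
  value: example.edu

- object: Student
  count: 5
  fields:
    # Functions: fake pulls values from the Faker library.
    first_name:
      fake: first_name
    last_name:
      fake: last_name
    # Formula: combines earlier fields and the variable.
    email: ${{first_name}}.${{last_name}}@${{school_domain}}
    # Function: a random integer in the given range.
    credits:
      random_number:
        min: 0
        max: 120
```

Formulas are written inside `${{ }}` and can refer to fields defined earlier in the same object, which is handy for keeping related values consistent.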

Snowfakery comes with a few additional plugins installed, including advanced math, which brings capabilities of Python’s math module into Snowfakery recipes; and external datasets, which allows you to load data from a spreadsheet or database if you want to use previously created fake data.
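The sketch below shows how both plugins might be declared and used in one recipe. The Course object, the `courses.csv` file and its `Title` column are hypothetical, and the plugin paths are as I understand them from the Snowfakery documentation, so check them against the version you have installed.

```yaml
# Plugins are declared at the top of the recipe.
- plugin: snowfakery.standard_plugins.Math
- plugin: snowfakery.standard_plugins.datasets.Dataset

- object: Course
  count: 3
  fields:
    # Fields starting with __ are hidden from the output.
    # courses.csv is a hypothetical file with a "Title" column.
    __row:
      Dataset.iterate:
        dataset: courses.csv
    title: ${{__row.Title}}
    seats:
      random_number:
        min: 10
        max: 200
    # Math plugin: number of 10-seat rows needed, rounded up.
    rows: ${{Math.ceil(seats / 10)}}
```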

Loading Generated Data into Salesforce

CumulusCI

Snowfakery is included with CumulusCI, which is required to load the generated data into a Salesforce org. The Generate Data in a Salesforce Org section walks through the steps of loading a Snowfakery recipe in a scratch org and a sandbox.
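A recipe intended for loading into an org names its objects and fields after Salesforce API names. The following is a small sketch using the standard Account and Contact objects; the counts are arbitrary, and the exact CumulusCI task used to load it depends on your CumulusCI version.

```yaml
# Standard Salesforce objects; field names are Salesforce API names.
- object: Account
  count: 5
  fields:
    Name:
      fake: company
  friends:
    - object: Contact
      count: 2
      fields:
        FirstName:
          fake: first_name
        LastName:
          fake: last_name
        # Lookup to the parent Account created above.
        AccountId:
          reference: Account
```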

Sample Snowfakery Recipes

The following are links to recipes that you can copy and adapt to your own use case.

Resources
