Ben's Bites

Automatically clean data files for analysis

Create an automated process to combine and deduplicate two CSVs with Julius AI workflows.

intermediate pro
Tool: Julius AI Topic: Data & Reporting

2024-12-20

Data cleaning is an important step in preparing data for analysis, but it’s often tedious and time-consuming. If you’ve ever found yourself mind-numbingly scrolling through spreadsheets and editing data, you’re going to be excited about this tutorial. We’re going to guide you through using Julius AI to create an automated workflow for combining and cleaning your CSV files, making your data analysis process more efficient, consistent, and straightforward.

What’s great about Julius AI workflows is that you can set up an automated process once and then share it with your team or stakeholders, so they can run a complex, multi-step data cleaning process in a fully or semi-automated fashion.

Steps we’ll follow in this tutorial:

  • Create your data-cleaning workflow
  • Test your workflow
  • Edit and share your workflow

Let’s dive in.

Step 1: Create your data-cleaning workflow

To get started, navigate to the Julius AI website and create a free account.

You’ll then land on the Julius AI dashboard. Click on the “My Workflows” tab in the left-side navigation.

💡 Tip: You can explore community-made workflows via the Explore Workflows page; however, in this tutorial, we’ll be creating our own data workflow from scratch.

Then, click the “New Workflow” button at the top of the page.

Next, we’ll name our workflow. We’re going to call this one “Combine Two CSVs”. You can also add an optional description. Then, we’ll create the workflow itself: provide a simple prompt, and Julius AI will generate all the steps needed to complete the workflow, both the steps the AI performs and the steps the human analyst performs.

Sample prompt:

Combine two uploaded CSVs of data into one. Deduplicate any duplicate data.

Julius AI will then generate all of the steps needed to perform this process in the tool. For instance, for our process, the first two steps are uploading the first CSV file and then uploading the second CSV file.

You can keep scrolling down the workflow on the right side of the page to see all of the steps Julius AI has generated to complete the data-cleaning workflow.

💡 Tip: If there are components you want to change, you can rearrange, delete, or add steps using the arrow buttons, the delete button, or the plus button between steps. In this example, we won’t make any edits to the generated workflow.

Step 2: Test your workflow

Once the workflow outline is to your liking, you can test it by clicking the “Run Workflow” button at the top of the page.

This will take you to the live workflow experience, which Julius AI calls a “Thread”. The middle window is where you interact with the workflow, and the right-side window provides a step-by-step outline of the workflow.

For this example, the first step is uploading our first CSV file, which we’ll do by clicking the “Load File” button and uploading a CSV from our computer.

💡 Tip: This is the same interface that users of your workflow will see when they run it.

After we upload our first CSV file, we’ll move to step two, where we’ll upload our second CSV file.

Julius AI will then automatically proceed through the AI steps of reading the files, combining them into one DataFrame, deduplicating the data, and outputting a new, combined CSV file.
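
To make those AI steps concrete, here is a minimal sketch of the same combine-and-deduplicate logic. It is not the code Julius AI generates: it uses Python’s standard csv module as a stand-in for the DataFrame operations, the function name and sample data are hypothetical, and it assumes both files share the same header row and that “duplicate” means a fully identical row.

```python
import csv
import io

def combine_and_dedupe(first_csv, second_csv):
    """Stack the rows of two CSV texts and drop exact duplicate rows.

    Assumes both inputs are non-empty and share the same header row.
    """
    first = list(csv.reader(io.StringIO(first_csv)))
    second = list(csv.reader(io.StringIO(second_csv)))
    header = first[0]
    if second[0] != header:
        raise ValueError("CSV headers do not match")

    seen = set()
    unique_rows = []
    for row in first[1:] + second[1:]:
        key = tuple(row)
        if key not in seen:  # keep only the first occurrence of each row
            seen.add(key)
            unique_rows.append(row)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(unique_rows)
    return out.getvalue()

# Hypothetical sample data standing in for the two uploaded files.
first_file = "email,name\na@example.com,Ana\nb@example.com,Ben\n"
second_file = "email,name\nb@example.com,Ben\nc@example.com,Cy\n"

combined = combine_and_dedupe(first_file, second_file)
print(combined)
```

If your own data should be deduplicated on a single column (for example, an email address) rather than on whole rows, it helps to say so explicitly in the workflow prompt.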

Julius AI will show its work every step of the way, so if there are any issues, debugging your workflow is much easier.

Occasionally, some steps require human input. For instance, one of the last steps of our workflow asks us to submit a name for the new CSV. If steps like this aren’t necessary, you can remove them in the workflow editor.

Finally, Julius AI will output the combined CSV file, which we can click to download.

Step 3: Edit and share your workflow

If you want to change anything along the way, navigate back to the workflow builder, where you can add steps at the top.

You can also insert, move, or remove steps within the workflow. For instance, if you wanted to remove the user step of naming the CSV file, you could change it to an AI step that automatically generates the CSV file name.
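
As an illustration of what an automatically generated file name could look like, here is a small sketch. How Julius AI actually names files is not specified in this tutorial; the function name, the naming pattern, and the sample file names below are all hypothetical.

```python
from datetime import date
import re

def make_output_name(first_name, second_name, on=None):
    """Build an output name from the two input file names plus a date."""
    on = on or date.today()
    stems = []
    for name in (first_name, second_name):
        stem = name.rsplit(".", 1)[0]  # drop the file extension
        # Replace runs of non-alphanumeric characters with underscores.
        stem = re.sub(r"[^A-Za-z0-9]+", "_", stem).strip("_").lower()
        stems.append(stem)
    return f"{stems[0]}_{stems[1]}_combined_{on.isoformat()}.csv"

# Hypothetical input file names and a fixed date for a repeatable result.
print(make_output_name("Customers.csv", "Orders 2024.csv", date(2024, 12, 20)))
# prints "customers_orders_2024_combined_2024-12-20.csv"
```

Describing a pattern like this in the AI step gives everyone who runs the workflow consistently named output files.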

Once you’re happy with your workflow, click the “Share” button in the top-right corner to share it, either via a link or with all Julius AI users on the Explore page.

This tutorial was created by Garrett.
