# Python

Metaforms' Data Validation module helps QA and data processing teams verify that survey data matches the questionnaire design. Rather than manually writing and running validation scripts, the platform uses AI to automatically generate checks that catch programming errors, data mismatches, and logic failures — at scale and with far less room for human error.

> **Important:** Data Validation focuses on **data correctness checks**, not data cleaning. It verifies whether the survey was programmed correctly and whether the collected data is consistent with the questionnaire. Features like identifying speeders, straight-liners, or nonsensical open-ended responses are part of the product roadmap and are not included in the current release.

Data Validation with Python uses an uploaded **.SAV** file (SPSS format) containing both the data and metadata from the programmed survey. Scripts are generated and executed directly within the platform, and you can export a validation report.

***

### 1. Setting Up a Data Validation Project

This section covers everything from creating your project to generating validation scripts — the foundational setup before you begin reviewing or running checks.

#### Step 1: Create a New Project

From the Metaforms home screen, enter a project name and click **Create Project**. This creates the workspace where your questionnaire, data files, and validation scripts will live.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/27305a84-1788-4c73-b72b-5f350b3d8efb.png)

#### Step 2: Open the Data Validation Tab

From the left-hand menu inside your project, select **Data Validation** to enter the module.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/2ba86579-a443-45e3-99b0-59daeff302d7.png)

#### Step 3: Upload Your Questionnaire

Click the upload area to select your questionnaire file (PDF or DOCX). Metaforms will parse and analyze the document, extracting question texts, response options, and routing logic. This typically takes a few seconds.

> **Tip:** You can also upload additional reference files (such as brand lists, option lists, or routing specifications) alongside your questionnaire. The AI reads these files to build more accurate validation logic for complex survey designs.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FVU1c9agVeAundvuQ6lvz%2FScreenshot%202026-04-08%20at%201.15.31%E2%80%AFPM.png?alt=media&#x26;token=2163a793-85ea-4027-971d-f996c075d1bb" alt=""><figcaption></figcaption></figure>

#### Step 4: Select a Scripting Method and Provide Your Data File

* Choose **Python** as your scripting language, then provide the path to your data file (.SAV).
* Click **Start Validation** to begin.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/77c96677-c605-43b1-b659-2239eb5bdb4e.png)

#### Step 5: AI Generates Validation Scripts

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/42396535-f098-40c4-a202-2ba2c982942d.png)

Once started, the AI agent takes over. On the right side of the screen, you can watch the agent's progress as it works through three stages:

1. **Metadata extraction** — The agent reads the SAV file and extracts all variable definitions, question structures, and coding schemes.
2. **Questionnaire comparison** — It compares the extracted metadata against your uploaded questionnaire to identify any mismatches in question text, response options, or variable codes. Questions where the metadata aligns with the questionnaire are marked **valid**; those with discrepancies are marked **invalid** and flagged for your attention.
3. **Script generation** — For each valid question, the agent writes a Python validation script. These scripts check for conditions like:
   * **Range checks** — Whether response values fall within the expected set of options (e.g., options 1–5 for a single-select question, flagging any unexpected 6th or 7th value).
   * **Single-select vs. multi-select enforcement** — Whether a question marked as single-select truly has only one response per respondent, or whether a multi-select question is being treated as single-select.
   * **Termination logic** — Whether respondents who should have been terminated (e.g., answering "No" to a screening question) were actually routed out of the survey.
   * **Skip and routing logic** — Whether respondents were correctly shown or skipped past questions based on their prior answers and the questionnaire's base conditions.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FIKuUcz4Yk9hgGutwLuQF%2Fimage.png?alt=media&#x26;token=9e86bd9b-bb93-498e-82ea-f39c59fc81c7" alt=""><figcaption></figcaption></figure>

This process typically takes **5–15 minutes** depending on the number of questions. For a 50-question survey, expect roughly 12–15 minutes; a 36-question survey may complete in 10–15 minutes.
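The generated scripts are specific to each questionnaire, but the general shape of a range check, a termination check, and a skip-logic check can be sketched in pandas. This is not the platform's generated code — the variables here (a screener `S1`, a single-select `Q1` coded 1–5, and `Q3` asked only when `Q1` equals 2) and the four respondents are invented:

```python
import pandas as pd

# Hypothetical respondent data: S1 is a screener (1=Yes, 2=No; "No" should
# terminate), Q1 is single-select coded 1-5, Q3 is asked only when Q1 == 2.
df = pd.DataFrame({
    "resp_id": [101, 102, 103, 104],
    "S1":      [1,    1,    2,    1],     # 103 answered "No" but has later data
    "Q1":      [3,    7,    2,    2],     # 102 holds an out-of-range value
    "Q3":      [None, None, 1,    2],     # present only when Q1 == 2
})

# Range check: Q1 must fall within the coded options 1-5.
bad_range = df[~df["Q1"].isin([1, 2, 3, 4, 5])]

# Termination logic: anyone who said "No" at S1 should have no Q1 data.
not_terminated = df[(df["S1"] == 2) & df["Q1"].notna()]

# Skip logic: Q3 should be answered if and only if Q1 == 2 (XOR flags both
# "shown but should have been skipped" and "skipped but should have been shown").
skip_violations = df[(df["Q1"] == 2) ^ df["Q3"].notna()]

print(bad_range["resp_id"].tolist())        # [102]
print(not_terminated["resp_id"].tolist())   # [103]
print(skip_violations["resp_id"].tolist())  # []
```

Single-select vs. multi-select enforcement is omitted here because it depends on how the .SAV file codes multi-selects (commonly one binary column per option).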

***

### 2. Understanding the Interface

Once script generation is complete, the Data Validation module presents two primary views: **Checks** and **Respondents**. Together, they give you full visibility into the validation logic and the underlying data.

#### Checks View

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/68756712-7c47-49ac-b663-27bcbfdceb01.png)

The Checks view is the main workspace for reviewing and managing your validation scripts. Each block corresponds to a question from the questionnaire. You can also use the navigation on the left to scroll to the scripts for specific questions.

> Every check is accompanied by a comment that describes its intent.

#### Respondents View

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/b5a1c62f-10da-4833-a7c3-b3d9b5d6e0f8.png)

The Respondents tab provides a raw tabular view of the data from your SAV file. Each row represents a respondent, and each column corresponds to a variable.

You can use column filters to narrow down the view to specific variables or respondent subsets. This is useful for spot-checking data before running validation, or for examining the raw responses of respondents flagged by a particular check.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/61dafa53-8f70-43db-92d9-8cb8ef82b782.png)

After checks are run, the Respondents view also supports **validation filters** — letting you display only respondents who failed validation on specific questions, so you can quickly drill into problem areas.

***

### 3. Reviewing and Editing Validation Scripts

Before running your checks, it is important to review the AI-generated scripts to ensure they match your expectations. The AI acts as a copilot — it handles the bulk of script generation, but human review is essential to catch edge cases, confirm business logic, and ensure accuracy.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FpOxgiUY6yrKcqgFTipAv%2Fimage.png?alt=media&#x26;token=503b546b-7b34-474f-aea5-1a1b64091148" alt=""><figcaption></figcaption></figure>

#### Review the Generated Code

Expand any check in the Checks view to see the full validation script. Each script is self-contained Python code that targets a single question. Read through the logic to confirm it aligns with the questionnaire's intended behavior.

Once you are satisfied that a script is correct, you can mark it as **Reviewed**. This helps you track progress across a large survey — you can quickly see which checks have been vetted and which still need attention.

#### Make Manual Edits

If the generated script needs adjustments, you can edit it directly in the built-in code editor. Changes are saved within the project. Metaforms respects your manual edits — any code you write or modify will be preserved, even if you later ask the AI to regenerate other parts of the project.

This is useful for adding custom checks that go beyond the standard validations (e.g., cross-question consistency rules) or correcting a check where the AI misinterpreted the questionnaire logic.
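As an illustration, a custom cross-question consistency rule might look like the sketch below. It assumes a common SPSS coding in which a multi-select (awareness question `Q5`) is stored as one binary column per brand; all variable names and respondents are hypothetical:

```python
import pandas as pd

# Hypothetical rule: the brand rated in Q6 must be one the respondent said
# they were aware of in Q5 (multi-select stored as Q5_1..Q5_3, one binary
# column per brand -- a common SPSS layout for multi-selects).
df = pd.DataFrame({
    "resp_id": [201, 202, 203],
    "Q5_1":    [1,   0,   1],   # aware of brand 1
    "Q5_2":    [0,   1,   1],   # aware of brand 2
    "Q5_3":    [1,   0,   0],   # aware of brand 3
    "Q6":      [3,   3,   2],   # brand selected for rating
})

def rated_brand_not_aware(row):
    # Fails when the brand code in Q6 was not marked 1 in the matching Q5_ column.
    return row[f"Q5_{int(row['Q6'])}"] != 1

flagged = df[df.apply(rated_brand_not_aware, axis=1)]
print(flagged["resp_id"].tolist())   # respondent 202 rated a brand without awareness
```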

#### Use the AI Chat to Modify Scripts

Instead of editing code directly, you can use the **AI chat interface** to request changes in natural language. For example:

* *"Add a check that Q3 should be skipped if Q1 equals 2."*
* *"The termination condition for S1 should flag respondents who answered 'No' but were not terminated."*
* *"Change the valid range for Q7 from 1–5 to 1–7."*

The AI will update the script accordingly. If your instruction is ambiguous, it will ask clarifying questions before making changes. This makes it accessible even if you are not comfortable writing Python code yourself.
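For instance, after the third request above, the regenerated check might reduce to something like this sketch (the function name and structure are hypothetical, not the platform's generated code):

```python
# Hypothetical result of the chat request "Change the valid range for Q7
# from 1-5 to 1-7": the regenerated check simply widens the allowed set.
VALID_Q7 = set(range(1, 8))   # codes 1-7 inclusive

def check_q7_range(value):
    """Range check for Q7: True when the value is within the coded options."""
    return value in VALID_Q7

assert check_q7_range(6)       # 6 is now valid under the widened range
assert not check_q7_range(8)   # still out of range
```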

***

### 4. Running Checks and Reviewing Results

Once you have reviewed your scripts and are confident they are correct, it is time to execute them against the dataset.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/1b9ee3a6-4dbb-40d9-806e-022e77e0bdc4.png)

#### Step 1: Run All Checks

Click **Run All Checks** to execute every validation script in the project. The platform runs all checks against your respondent data and returns results within seconds, regardless of dataset size.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FliJKXPrNB8eK33TkAQ1M%2Fimage.png?alt=media&#x26;token=4bf106d3-6d79-439a-94c1-7d2f2b4a8b55" alt=""><figcaption></figcaption></figure>

Once the checks have run, a yellow dot next to a question indicates that respondents were flagged by one or more of that question's checks.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FpRRtAcml5guVvKNQ6Z9h%2Fimage.png?alt=media&#x26;token=65d050dc-8beb-4936-881c-746dd32c193b" alt=""><figcaption></figcaption></figure>

#### Step 2: View Flagged Respondents

For any check that failed, click the **View** button next to it. This takes you directly to the Respondents tab, automatically filtered to show only the respondents who failed that specific check. The columns are also filtered to display only the relevant variables, so you can immediately see the problematic data without sifting through the full dataset.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FxeDiVdI9e5Ok6GQoE8pM%2FScreenshot%202026-04-08%20at%201.41.37%E2%80%AFPM.png?alt=media&#x26;token=685e3c2c-416e-44bf-8620-c96bd24600ea" alt=""><figcaption></figcaption></figure>

This makes it straightforward to understand exactly which respondents were flagged and why.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FePrSEnJsLkDYhIS7roj1%2FScreenshot%202026-04-08%20at%201.42.50%E2%80%AFPM.png?alt=media&#x26;token=bfb0ead0-7f90-4fa0-a8b1-0cd5aa7bbe33" alt=""><figcaption></figcaption></figure>

#### Step 3: Export the Validation Report

Once you have reviewed the results, click **Export** to download a comprehensive validation report.

<figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FbwewrPwvbvTDpL5msmPm%2Fimage.png?alt=media&#x26;token=a9c2d0dc-0c32-4e50-b7fe-9e2cb6dd9069" alt=""><figcaption></figcaption></figure>

The report is an Excel file that includes:

* **Summary sheet** — Overall pass rate, total respondents, number of checks run, and counts of flagged respondents.<br>

  <figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FxJushMPBp63RqaCoKMlR%2Fimage.png?alt=media&#x26;token=0da44d3e-a273-4e2c-b098-147dd05d15b8" alt=""><figcaption></figcaption></figure>

* **Detailed findings** — Each failed check listed with the question text, the validation condition, the number of respondents who failed, and their respondent IDs.<br>

  <figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FP635CRkXikp99V4GuMhD%2Fimage.png?alt=media&#x26;token=384b3aad-feb1-4320-bca5-6b83f91595ef" alt=""><figcaption></figcaption></figure>

* **Per-respondent breakdown** — Grouped by question and check, showing exactly which respondents failed which checks.<br>

  <figure><img src="https://1402057010-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgtHd8o9ldznZ1hKHiZvc%2Fuploads%2FmLE5YIiTCojhEPsaWKgW%2Fimage.png?alt=media&#x26;token=630773e3-76b0-45de-b13b-b9a3e1ed6bb9" alt=""><figcaption></figcaption></figure>

This report can be shared with your programming team, QA team, or clients to communicate data quality findings and drive corrective action.
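If you need to post-process the findings or replicate part of the report programmatically, a multi-sheet Excel file of the same general shape can be assembled with pandas. A hypothetical sketch — all numbers are invented, the platform's actual exporter may differ, and writing .xlsx requires the `openpyxl` package:

```python
import os
import tempfile

import pandas as pd

# Invented example figures, shaped like the Summary and Detailed Findings
# sheets described above (391 of 400 respondents passing -> 97.75%).
summary = pd.DataFrame([{
    "total_respondents": 400, "checks_run": 52,
    "flagged_respondents": 9, "pass_rate": "97.75%",
}])
findings = pd.DataFrame([{
    "question": "Q7", "condition": "value in 1-7",
    "failed_count": 2, "respondent_ids": "214, 377",
}])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "validation_report.xlsx")
    with pd.ExcelWriter(path) as writer:
        summary.to_excel(writer, sheet_name="Summary", index=False)
        findings.to_excel(writer, sheet_name="Detailed Findings", index=False)

    # Read one sheet back to confirm the round trip.
    back = pd.read_excel(path, sheet_name="Summary")

print(back["checks_run"].iloc[0])
```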
