# Step 1: Setting Up a Data Validation Project

This section covers everything from creating your project to generating validation scripts — the foundational setup before you begin reviewing or running checks.

### Step 1: Create a New Project

From the Metaforms home screen, enter a project name and click **Create Project**. This creates the workspace where your questionnaire, data files, and validation scripts will live.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/27305a84-1788-4c73-b72b-5f350b3d8efb.png)

### Step 2: Open the Data Validation Tab

From the left-hand menu inside your project, select **Data Validation** to enter the module.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/2ba86579-a443-45e3-99b0-59daeff302d7.png)

### Step 3: Upload Your Questionnaire

Click the upload area to select your questionnaire file (PDF or DOCX). Metaforms will parse and analyze the document, extracting question texts, response options, and routing logic. This typically takes a few seconds.

> **Tip:** You can also upload additional reference files (such as brand lists, option lists, or routing specifications) alongside your questionnaire. The AI reads these files to build more accurate validation logic for complex survey designs.

### Step 4: Select a Scripting Method and Provide Your Data File

* Select **Python** as your scripting language, then provide the path to your data file (.SAV).
* Click **Start Validation** to begin.

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/77c96677-c605-43b1-b659-2239eb5bdb4e.png)
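Before entering the data file path, it can help to confirm locally that the file exists and has the expected extension. The helper below is a minimal sketch and is not part of Metaforms; the function name `check_sav_path` and the example file name are hypothetical.

```python
from pathlib import Path


def check_sav_path(path_str: str) -> list[str]:
    """Return a list of problems with a candidate data file path (empty if none)."""
    problems = []
    path = Path(path_str)
    # The extension comparison is case-insensitive, so .sav and .SAV both pass.
    if path.suffix.lower() != ".sav":
        problems.append(f"expected a .sav file, got suffix {path.suffix!r}")
    if not path.is_file():
        problems.append(f"file not found: {path}")
    return problems


# Hypothetical file name, used only for illustration.
print(check_sav_path("survey_data.sav"))
```

An empty list means the path passed both checks; any other result lists what to fix before clicking **Start Validation**.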

### Step 5: AI Generates Validation Scripts

![](https://usercontent.in.prod.clueso.io/743138ba-7933-4a1a-90b2-26ddeb90ccbd/f92caeff-fe49-434b-a35f-fb2b01d97783/f5a37142-3fd8-412d-9b9c-82bb70a8fc4d/images/42396535-f098-40c4-a202-2ba2c982942d.png)

Once started, the AI agent takes over. On the right side of the screen, you can watch the agent's progress as it works through three stages:

1. **Metadata extraction** — The agent reads the SAV file and extracts all variable definitions, question structures, and coding schemes.
2. **Questionnaire comparison** — It compares the extracted metadata against your uploaded questionnaire to identify any mismatches in question text, response options, or variable codes. Questions where the metadata aligns with the questionnaire are marked **valid**; those with discrepancies are marked **invalid** and flagged for your attention.
3. **Script generation** — For each valid question, the agent writes a Python validation script. These scripts check for conditions like:
   * **Range checks** — Whether response values fall within the expected set of options (e.g., options 1–5 for a single-select question, flagging any unexpected 6th or 7th value).
   * **Single-select vs. multi-select enforcement** — Whether a question marked as single-select truly has only one response per respondent, or whether a multi-select question is being treated as single-select.
   * **Termination logic** — Whether respondents who should have been terminated (e.g., answering "No" to a screening question) were actually routed out of the survey.
   * **Skip and routing logic** — Whether respondents were correctly shown or skipped past questions based on their prior answers and the questionnaire's base conditions.
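To make the questionnaire comparison stage concrete, the sketch below marks a question **valid** or **invalid** by comparing question text and response codes. It illustrates the idea only: the agent's actual matching logic is not documented here, and the dictionary layout, the `compare_question` function, and the sample questions are all assumptions.

```python
def compare_question(meta: dict, spec: dict) -> list[str]:
    """Return mismatch descriptions between file metadata and the questionnaire."""
    issues = []
    # Compare question text, ignoring case and surrounding whitespace.
    if meta["label"].strip().lower() != spec["text"].strip().lower():
        issues.append("question text differs")
    meta_codes = set(meta["value_labels"])
    spec_codes = set(spec["options"])
    if meta_codes != spec_codes:
        missing = sorted(spec_codes - meta_codes)
        extra = sorted(meta_codes - spec_codes)
        issues.append(f"codes differ (missing={missing}, extra={extra})")
    else:
        # Codes match, so compare the label attached to each code.
        for code in sorted(meta_codes):
            meta_label = meta["value_labels"][code].strip().lower()
            spec_label = spec["options"][code].strip().lower()
            if meta_label != spec_label:
                issues.append(f"label for code {code} differs")
    return issues


# Hypothetical metadata (as if extracted from the SAV file) and questionnaire content.
metadata = {
    "Q1": {"label": "What is your gender?",
           "value_labels": {1: "Male", 2: "Female"}},
    "Q2": {"label": "How satisfied are you?",
           "value_labels": {1: "Low", 2: "Medium", 3: "High", 4: "Very high"}},
}
questionnaire = {
    "Q1": {"text": "What is your gender?",
           "options": {1: "Male", 2: "Female"}},
    "Q2": {"text": "How satisfied are you?",
           "options": {1: "Low", 2: "Medium", 3: "High"}},
}

status = {}
for name, spec in questionnaire.items():
    issues = compare_question(metadata[name], spec)
    status[name] = "invalid" if issues else "valid"

print(status)
```

In this sample, Q2 carries a fourth code in the data file that the questionnaire does not define, so it would be flagged for review rather than passed to script generation.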

<figure><img src="/files/yuKkiE49d4T78Iq8QMGa" alt=""><figcaption></figcaption></figure>
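The four check types listed under script generation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code Metaforms generates: the variable names (`S1`, `Q1`, `Q2`, `Q3`), the record layout, the base condition for `Q3`, and all four function names are hypothetical.

```python
# Hypothetical respondent records. None means "not answered";
# a list holds the selected codes of a multi-select response.
respondents = [
    {"id": 1, "S1": 1, "Q1": 5, "Q2": [1, 2], "Q3": 2},
    {"id": 2, "S1": 2, "Q1": None, "Q2": None, "Q3": None},  # terminated at S1
    {"id": 3, "S1": 1, "Q1": 7, "Q2": [2], "Q3": None},      # Q1 out of range
    {"id": 4, "S1": 2, "Q1": 2, "Q2": [1], "Q3": None},      # should have terminated
    {"id": 5, "S1": 1, "Q1": [1, 3], "Q2": [1], "Q3": None}, # two answers to Q1
    {"id": 6, "S1": 1, "Q1": 2, "Q2": [3], "Q3": 1},         # Q3 answered outside base
]


def range_check(rows, var, allowed):
    """IDs whose single answer to var is outside the allowed codes."""
    return [r["id"] for r in rows
            if r[var] is not None
            and not isinstance(r[var], list)
            and r[var] not in allowed]


def single_select_check(rows, var):
    """IDs with more than one response recorded for a single-select var."""
    return [r["id"] for r in rows
            if isinstance(r[var], list) and len(r[var]) > 1]


def termination_check(rows, screener, terminate_code, later_vars):
    """IDs who gave the terminating answer but still have later answers."""
    return [r["id"] for r in rows
            if r[screener] == terminate_code
            and any(r[v] is not None for v in later_vars)]


def skip_check(rows, var, in_base):
    """IDs where var is answered outside its base, or unanswered inside it."""
    return [r["id"] for r in rows if (r[var] is not None) != in_base(r)]


# Assumed base condition: Q3 is shown only to qualified respondents with Q1 of 4 or 5.
def q3_base(r):
    return r["S1"] == 1 and r["Q1"] in (4, 5)


findings = {
    "range": range_check(respondents, "Q1", {1, 2, 3, 4, 5}),
    "single_select": single_select_check(respondents, "Q1"),
    "termination": termination_check(respondents, "S1", 2, ["Q1", "Q2", "Q3"]),
    "skip": skip_check(respondents, "Q3", q3_base),
}
print(findings)
```

Each function returns the respondent IDs that fail one check, which mirrors the idea of flagging individual records rather than rejecting a whole question.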

This process typically takes **5–15 minutes**, depending on the number of questions. For example, a 36-question survey may complete in 10–15 minutes, and a 50-question survey in roughly 12–15 minutes.

