Building DataDiluvium: A Data Generation Tool – Part 1: Prerequisites and Project Overview

(To read up on how to use the site, check out the previous post Effortless Data Generation for Developers.)

DataDiluvium is a web-based tool I’ve built to help developers, database administrators, and data engineers generate realistic test data from SQL schema definitions. The tool takes SQL table definitions as input and produces sample data in several formats, making it easier to populate development and testing environments with meaningful data.

Project Overview

The core functionality of DataDiluvium includes:

  • SQL schema parsing and validation
  • Customizable data generation rules per column
  • Support for foreign key relationships
  • Multiple export formats (JSON, CSV, XML, Plain Text, SQL Inserts)
  • Real-time preview of generated data
  • Dark mode support
  • Responsive design
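To make the export formats a bit more concrete, here is a minimal sketch of how generated rows might be turned into CSV, SQL inserts, and JSON. It isn't the actual DataDiluvium implementation: the names `GeneratedRow`, `toCsv`, `toSqlInserts`, and the sample `users` data are all assumptions made up for illustration.

```typescript
// Hypothetical row shape: one generated record, keyed by column name.
type CellValue = string | number | boolean | null;
type GeneratedRow = Record<string, CellValue>;

// Quote a CSV field only when it contains a comma, double quote, or newline.
function csvField(value: CellValue): string {
  const text = value === null ? "" : String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Render rows as CSV with a header line taken from the column list.
function toCsv(columns: string[], rows: GeneratedRow[]): string {
  const header = columns.map(csvField).join(",");
  const lines = rows.map((row) =>
    columns.map((c) => csvField(row[c] ?? null)).join(","),
  );
  return [header, ...lines].join("\n");
}

// Render a value as a SQL literal; single quotes are escaped by doubling them.
function sqlLiteral(value: CellValue): string {
  if (value === null) return "NULL";
  if (typeof value === "number") return String(value);
  if (typeof value === "boolean") return value ? "TRUE" : "FALSE";
  return `'${value.replace(/'/g, "''")}'`;
}

// Render rows as one INSERT statement per row.
function toSqlInserts(table: string, columns: string[], rows: GeneratedRow[]): string {
  return rows
    .map((row) => {
      const values = columns.map((c) => sqlLiteral(row[c] ?? null)).join(", ");
      return `INSERT INTO ${table} (${columns.join(", ")}) VALUES (${values});`;
    })
    .join("\n");
}

// Made-up sample data, chosen to exercise the quoting rules.
const sampleColumns = ["id", "full_name", "active"];
const sampleRows: GeneratedRow[] = [
  { id: 1, full_name: "Ada O'Neil", active: true },
  { id: 2, full_name: "Lin, Wei", active: false },
];

const csvOutput = toCsv(sampleColumns, sampleRows);
const sqlOutput = toSqlInserts("users", sampleColumns, sampleRows);
const jsonOutput = JSON.stringify(sampleRows, null, 2);
```

Keeping each format as a plain function over the same row array means the real-time preview and the file export could share one code path, though that is a design guess on my part rather than a description of the finished tool.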

Prerequisites and Setup

Development Environment

  1. Node.js and npm

  2. Git

  3. Code Editor

    • VS Code (or Cursor)
    • Extensions:
      • ESLint
      • Prettier
      • Tailwind CSS IntelliSense
      • TypeScript and JavaScript Language Features

Core Technologies

  1. Next.js 15.1.3

    • React framework for production
    • App Router architecture
    • Server-side rendering capabilities
    • Installation: npx create-next-app@latest
  2. TypeScript

    • For type-safe JavaScript
    • Included with Next.js setup
    • Version: 5.x
  3. Tailwind CSS

    • Utility-first CSS framework
    • For responsive design
    • Dark mode support
    • Installation: Included with Next.js setup

Additional Dependencies

  1. SQL Parser

    • For parsing SQL schema definitions
    • We’ll use node-sql-parser for SQL parsing capabilities
  2. Data Generation Libraries

    • @faker-js/faker for generating realistic data
    • uuid for unique identifier generation
    • date-fns for date manipulation

Project Structure

datadiluvium/
├── src/
│   ├── app/                    # Next.js app router pages
│   ├── components/             # Reusable React components
│   ├── lib/                    # Core functionality
│   │   ├── generators/        # Data generation logic
│   │   ├── parsers/          # SQL parsing logic
│   │   └── types/            # TypeScript type definitions
│   └── styles/                # Global styles
├── public/                    # Static assets
└── package.json              # Project dependencies
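To give a feel for how the pieces under lib/ might fit together, here is a rough sketch of some shared types and a tiny row generator that honors foreign keys. Everything in it is an assumption for illustration only: `ColumnDefinition`, `TableDefinition`, `ColumnRule`, `defaultRules`, and `generateRows` are hypothetical names, and the real code in later parts of the series may look quite different.

```typescript
// Hypothetical shapes that could live in src/lib/types.
interface ColumnDefinition {
  columnName: string;
  sqlType: string;
  // Optional foreign key: values are copied from another table's column.
  references?: { table: string; column: string };
}

interface TableDefinition {
  tableName: string;
  columns: ColumnDefinition[];
}

type FieldValue = string | number;
type DataRow = Record<string, FieldValue>;

// A rule produces one value for a column, given the zero-based row index.
type ColumnRule = (rowIndex: number) => FieldValue;

// Hypothetical default rules keyed by SQL type (could live in src/lib/generators).
const defaultRules: Record<string, ColumnRule> = {
  INT: (i) => i + 1,
  VARCHAR: (i) => `value_${i + 1}`,
};

function generateRows(
  table: TableDefinition,
  count: number,
  generated: Record<string, DataRow[]> = {}, // rows already built for parent tables
  overrides: Record<string, ColumnRule> = {}, // per-column custom rules
): DataRow[] {
  const rows: DataRow[] = [];
  for (let i = 0; i < count; i++) {
    const row: DataRow = {};
    for (const col of table.columns) {
      const ref = col.references;
      const parentRows = ref ? generated[ref.table] : undefined;
      if (ref && parentRows && parentRows.length > 0) {
        // Cycle through parent rows so every foreign key value exists in the parent.
        const value = parentRows[i % parentRows.length]?.[ref.column];
        if (value === undefined) {
          throw new Error(`Missing ${ref.table}.${ref.column}`);
        }
        row[col.columnName] = value;
      } else {
        const rule = overrides[col.columnName] ?? defaultRules[col.sqlType];
        if (!rule) throw new Error(`No rule for column ${col.columnName}`);
        row[col.columnName] = rule(i);
      }
    }
    rows.push(row);
  }
  return rows;
}

// Made-up schema: orders.user_id references users.id.
const usersTable: TableDefinition = {
  tableName: "users",
  columns: [
    { columnName: "id", sqlType: "INT" },
    { columnName: "email", sqlType: "VARCHAR" },
  ],
};
const ordersTable: TableDefinition = {
  tableName: "orders",
  columns: [
    { columnName: "id", sqlType: "INT" },
    { columnName: "user_id", sqlType: "INT", references: { table: "users", column: "id" } },
  ],
};

const userRows = generateRows(usersTable, 2, {}, {
  email: (i) => `user${i + 1}@example.com`,
});
const orderRows = generateRows(ordersTable, 3, { users: userRows });
```

In this sketch parent tables have to be generated before their children so the foreign key values exist, which is one simple way to support relationships; a production version would likely use @faker-js/faker for the rules instead of the placeholder values shown here.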

Initial Setup Steps

  1. Create a new Next.js project:

    npx create-next-app@latest datadiluvium --typescript --tailwind --eslint
    
  2. Navigate to the project directory:

    cd datadiluvium
    
  3. Install additional dependencies:

    npm install node-sql-parser @faker-js/faker uuid date-fns
    
  4. Configure TypeScript and ESLint:

    • The Next.js setup will create these configurations
    • Additional rules may be added as needed
  5. Set up Tailwind CSS:

    • Configuration will be created by Next.js
    • Custom theme settings can be added later

Development Guidelines

  1. Code Style

    • Use TypeScript for all new files
    • Follow ESLint and Prettier configurations
    • Use functional components with hooks
    • Implement proper error handling
  2. Git Workflow

    • Use feature branches
    • Write meaningful commit messages
    • Follow conventional commits format
  3. Testing Strategy

    • Unit tests for data generation logic
    • Integration tests for SQL parsing
    • Component tests for UI elements

NOTE: I’m going to try to follow each of these guidelines. It’s always tough to get 100% consistency across a codebase!
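As an illustration of the "unit tests for data generation logic" item in the testing strategy, here is a small sketch of what such a test could check. It assumes a seeded random source so the output is repeatable; `seededRandom`, `generateIntegers`, and the hand-rolled `check` helper are hypothetical stand-ins, and in the actual project a test framework would normally replace the helper.

```typescript
// Small seeded PRNG (mulberry32) so generated data is repeatable in tests.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical generator under test: `count` integers in [min, max].
function generateIntegers(
  rng: () => number,
  count: number,
  min: number,
  max: number,
): number[] {
  const values: number[] = [];
  for (let i = 0; i < count; i++) {
    values.push(min + Math.floor(rng() * (max - min + 1)));
  }
  return values;
}

// Minimal assertion helper standing in for a real test framework.
function check(label: string, condition: boolean): void {
  if (!condition) throw new Error(`Test failed: ${label}`);
}

// Runs the checks and returns the labels of the ones that passed.
function runGeneratorTests(): string[] {
  const passed: string[] = [];
  const first = generateIntegers(seededRandom(42), 20, 1, 10);
  const second = generateIntegers(seededRandom(42), 20, 1, 10);

  check("same seed gives same data", JSON.stringify(first) === JSON.stringify(second));
  passed.push("same seed gives same data");

  check("row count is respected", first.length === 20);
  passed.push("row count is respected");

  check(
    "values stay within range",
    first.every((v) => Number.isInteger(v) && v >= 1 && v <= 10),
  );
  passed.push("values stay within range");

  return passed;
}

const passedTests = runGeneratorTests();
```

Testing properties (repeatability, count, range) rather than exact values keeps the tests stable even if the underlying random source changes later.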

Next Steps

In the next part of this series, we’ll dive into implementing the core SQL parsing functionality and the schema input interface. We’ll explore how to:

  • Parse SQL schema definitions
  • Validate table structures
  • Create a user-friendly interface for schema input
  • Implement real-time validation and error handling

Stay tuned for Part 2, where we’ll begin building the foundation of the DataDiluvium app!