Adding Time Range Generation to Data Diluvium

Following up on my previous posts about adding humidity and temperature data generation to Data Diluvium, I’m now adding a Time Range generator. I decided this would be a nice addition to give any graphing of the data a good look. This will complete the trio of generators I needed for my TimeScale DB setup. While humidity and temperature provide the environmental data, the Time Range generator ensures we have properly spaced time points for our time-series analysis.

Why Time Range Generation?

When working with TimeScale DB for time-series data, having evenly spaced time points is crucial for accurate analysis. I’ve found that many of my experiments require data points that are:

  • Evenly distributed across a time window
  • Properly spaced for consistent analysis
  • Flexible enough to handle different sampling rates
  • Random in their starting point to avoid bias

The Implementation

I created a Time Range generator that produces timestamps based on a 2-hour window. Here’s what I considered:

  • Default 2-hour time window
  • Even distribution of points across the window
  • Random starting point within a reasonable range
  • Support for various numbers of data points

Here’s how I implemented this in Data Diluvium:

Continue reading “Adding Time Range Generation to Data Diluvium”

Data Diluvium Utility, Schema, Notes, & Wrap Up

In my previous three posts, I covered the core functionality of DataDiluvium. In this follow-up post, I’ll explore the additional features, utilities, and implementation details that I’ve added to enhance the application’s functionality and developer experience.

SQL Sample Files

1. Sample Schema Repository

I’ve included a collection of sample SQL schemas in the sql-samples directory. These samples serve multiple purposes:

  1. Documentation examples
  2. Testing scenarios
  3. Quick-start templates for users

Here’s an example from the users table:

CREATE TABLE users (
    id INT PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Continue reading “Data Diluvium Utility, Schema, Notes, & Wrap Up”

Building DataDiluvium: A Data Generation Tool – Part 3: Data Generation and Final Implementation

In Parts 1 and 2, I set up the development environment and implemented the schema parsing functionality. Now, I’ll explore the data generation system and final implementation details that make DataDiluvium a complete solution.

Data Generation System

1. Generator Registry

I’ve implemented the generator system in src/app/lib/generators/registry.ts. This registry manages different types of data generators:

import { Generator } from './types';
import { SequentialNumberGenerator } from './basic';
import { UsernameGenerator } from './basic';
import { DateGenerator } from './basic';
import { ForeignKeyGenerator } from './basic';

export const generatorRegistry = new Map<string, Generator>();

// Register built-in generators
generatorRegistry.set('Sequential Number', new SequentialNumberGenerator());
generatorRegistry.set('Username', new UsernameGenerator());
generatorRegistry.set('Date', new DateGenerator());
generatorRegistry.set('Foreign Key', new ForeignKeyGenerator());
Continue reading “Building DataDiluvium: A Data Generation Tool – Part 3: Data Generation and Final Implementation”

Building DataDiluvium: A Data Generation Tool – Part 2: Core Implementation and Schema Handling

In Part 1, we set up our development environment and outlined the project structure. Now, we’ll dive into implementing the core functionality of DataDiluvium, starting with SQL schema parsing and the user interface for schema input.

SQL Schema Parsing Implementation

1. SQL Parser Setup

The SQL parser is implemented in src/app/lib/parsers/sqlParser.ts. This module is responsible for:

  1. Parsing SQL schema definitions using the node-sql-parser library
  2. Extracting table and column information
  3. Mapping SQL data types to appropriate data generators

The parser works by:

  1. Taking a SQL string input
  2. Converting it to an Abstract Syntax Tree (AST)
  3. Traversing the AST to extract:
    • Table names
    • Column names
    • Data types
    • Default values
    • Foreign key relationships

The SchemaColumn type defines the structure of our parsed columns:

export type SchemaColumn = {
  tableName: string;
  columnName: string;
  dataType: string;
  defaultValue: string | null;
  generator?: string;
  referencedTable?: string;
  referencedColumn?: string;
};
Continue reading “Building DataDiluvium: A Data Generation Tool – Part 2: Core Implementation and Schema Handling”

Building DataDiluvium: A Data Generation Tool – Part 1: Prerequisites and Project Overview

(To read up on how to use the site, check out the previous post Effortless Data Generation for Developers.)

DataDiluvium is a web-based tool I’ve built designed to help developers, database administrators, and data engineers generate realistic test data based on SQL schema definitions. The tool takes SQL table definitions as input and produces sample data in various formats, making it easier to populate development and testing environments with meaningful data.

Project Overview

The core functionality of DataDiluvium includes:

  • SQL schema parsing and validation
  • Customizable data generation rules per column
  • Support for foreign key relationships
  • Multiple export formats (JSON, CSV, XML, Plain Text, SQL Inserts)
  • Real-time preview of generated data
  • Dark mode support
  • Responsive design
Continue reading “Building DataDiluvium: A Data Generation Tool – Part 1: Prerequisites and Project Overview”