# CLAUDE.md This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. ## Project Overview **pgm** is a lightweight, type-safe PostgreSQL query builder for Go, built on top of jackc/pgx. It generates Go code from SQL schema files, enabling compile-time safety for database queries without the overhead of traditional ORMs. **Core Philosophy:** - Schema defined in SQL (not Go) - Minimal code generation (only table/column definitions) - Users provide their own models (no forced abstractions) - Type-safe query building with fluent API - Zero reflection for maximum performance ## Development Commands ### Building ```bash # Build CLI with version from git tags make build # Build with specific version make build VERSION=v1.2.3 # Install to GOPATH/bin make install # Check current version make version pgm -version ``` ### Testing ```bash # Run all tests in playground make test # Run benchmarks for SELECT queries make bench-select # Run code generator on playground schema make run ``` ### Code Generation ```bash # Generate Go code from SQL schema pgm -o ./db ./schema.sql # The generator expects a SQL schema file and outputs: # - One package per table in the output directory # - Each package contains table and column definitions # Example output: db/users/users.go, db/posts/posts.go ``` ## Architecture ### Core Components #### 1. Query Builder System (qry_*.go files) The query builder uses a **mutable, stateful design** for conditional query building: - **qry_select.go**: SELECT queries with joins, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, OFFSET - **qry_insert.go**: INSERT queries with RETURNING and UPSERT (ON CONFLICT) support - **qry_update.go**: UPDATE queries with conditional WHERE clauses - **qry_delete.go**: DELETE queries with WHERE conditions **Important:** Query builders accumulate state with each method call. They are designed for conditional building within a single query but should NOT be reused across multiple separate queries. ```go // ✅ CORRECT - Conditional building query := users.User.Select(users.ID, users.Email) if nameFilter != "" { query = query.Where(users.Name.Like("%" + nameFilter + "%")) } err := query.First(ctx, &id, &email) // ❌ WRONG - Reusing builder across separate queries baseQuery := users.User.Select(users.ID) baseQuery.Where(users.ID.Eq(1)).First(ctx, &id1) // Adds ID=1 baseQuery.Where(users.Status.Eq(2)).First(ctx, &id2) // Has BOTH conditions! ``` Query builders are NOT thread-safe. Each goroutine must create its own query instance. #### 2. Field System (pgm_field.go) Fields are type-safe column references with: - Comparison operators: `Eq()`, `NotEq()`, `Gt()`, `Lt()`, `Gte()`, `Lte()` - Pattern matching: `Like()`, `ILike()`, `LikeFold()`, `EqFold()` - NULL checks: `IsNull()`, `IsNotNull()` - Array operations: `Any()`, `NotAny()` - Aggregate functions: `Count()`, `Sum()`, `Avg()`, `Min()`, `Max()` - String functions: `Lower()`, `Upper()`, `Trim()`, `StringEscape()` - Special functions: `ConcatWs()`, `StringAgg()`, `DateTrunc()`, `RowNumber()` **Security:** Functions like `ConcatWs()`, `StringAgg()`, and `DateTrunc()` validate inputs and escape SQL strings. They use allowlists for parameters like date truncation levels. #### 3. Connection Pool (pgm.go) Global connection pool using pgxpool with atomic pointer for thread safety: - `InitPool(Config)`: Initialize once at startup (panics if called multiple times or with invalid config) - `GetPool()`: Retrieve pool instance (panics if not initialized) - `ClosePool()`: Graceful shutdown - `BeginTx(ctx)`: Start transactions The pool is stored in an `atomic.Pointer[pgxpool.Pool]` for lock-free concurrent access. #### 4. Code Generator (cmd/) Parses SQL schema files and generates Go code: - **cmd/main.go**: CLI entry point with version flag support - **cmd/parse.go**: Regex-based SQL parser (has known limitations) - **cmd/generate.go**: Code generation with go/format for proper formatting - **cmd/version.go**: Version string generation from build-time ldflags **Parser Limitations:** - No multi-line comments (`/* */`) - Limited support for complex data types (arrays, JSON, JSONB) - No advanced PostgreSQL features (PARTITION BY, INHERITS) - Some constraints (CHECK, EXCLUDE) not parsed Generated files include a header comment with version and timestamp: ```go // Code generated by code.patial.tech/go/pgm/cmd v1.2.3 on 2025-11-16 04:05:43 DO NOT EDIT. ``` #### 5. Table System (pgm_table.go) Minimal table metadata: ```go type Table struct { Name string // Table name FieldCount int // Number of columns PK []string // Primary key columns DerivedTable Query // For subqueries/CTEs } ``` Provides factory methods: `Select()`, `Insert()`, `Update()`, `Delete()` ### String Builder Pool Uses `sync.Pool` for efficient string building (qry.go): ```go var sbPool = sync.Pool{ New: func() any { return new(strings.Builder) } } ``` All query builders use `getSB()` / `putSB()` to reduce allocations. ### Error Handling Custom errors in pgm.go: - `ErrConnStringMissing`: Connection string validation - `ErrInitTX`: Transaction initialization failure - `ErrCommitTX`: Transaction commit failure - `ErrNoRows`: Wrapper for pgx.ErrNoRows Use `pgm.IsNotFound(err)` to check for no rows errors. ### Generated Code Structure For a schema with `users` and `posts` tables: ``` db/ ├── schema.go # Table definitions and DerivedTable helper ├── user/ │ └── users.go # User table columns as constants ├── post/ │ └── posts.go # Post table columns as constants └── ... ``` Each table file exports constants like: ```go const ( All pgm.Field = "users.*" ID pgm.Field = "users.id" Email pgm.Field = "users.email" // ... all columns ) ``` ### Naming Conventions - **Table pluralization**: `pluralToSingular()` in cmd/generate.go handles irregular plurals (people→person, children→child) and common patterns (-ies→-y, -ves→-fe, -es→-e, -s→"") - **Field naming**: Snake_case columns converted to PascalCase (first_name → FirstName) - **ID suffix**: Fields ending in `_id` become `ID` not `Id` (user_id → UserID) ## Key Implementation Details ### Query Building Strategy 1. **Pre-allocation**: Queries estimate final string length via `averageLen()` methods to reduce allocations 2. **Conditional accumulation**: All clauses stored in slices, built on demand 3. **Parameterized queries**: Uses PostgreSQL numbered parameters (`$1`, `$2`, etc.) 4. **Builder methods return interfaces**: Enforces correct method call sequences at compile time Example type progression: ```go SelectClause → WhereClause → AfterWhere → OrderByClause → Query → First/All ``` ### Transaction Handling Pattern used throughout: ```go tx, err := pgm.BeginTx(ctx) if err != nil { return err } defer tx.Rollback(ctx) // Safe to call after commit // ... operations using ExecTx, FirstTx, AllTx methods ... return tx.Commit(ctx) ``` All query execution methods have `Tx` variants that accept `pgx.Tx`. ### Full-Text Search Helpers Helper functions for PostgreSQL tsvector queries: - `TsAndQuery()`: AND operator between terms - `TsPrefixAndQuery()`: AND with prefix matching `:*` - `TsOrQuery()`: OR operator - `TsPrefixOrQuery()`: OR with prefix matching Used with tsvector columns via `Field.TsQuery()`. ## Testing Tests are in the `playground/` directory: - **playground/schema.sql**: Test database schema - **playground/db/**: Generated code from schema - **playground/*_test.go**: Integration tests for SELECT, INSERT, UPDATE, DELETE - **playground/local_select_test.go**: Additional SELECT test cases Tests require a running PostgreSQL instance. The playground uses the generated code to verify the query builder works correctly. ## Version Management Version is injected at build time via ldflags: ```bash go build -ldflags "-X main.version=v1.2.3" ./cmd ``` The Makefile automatically extracts version from git tags: ```makefile VERSION ?= $(shell git describe --tags --always --dirty 2>/dev/null || echo "dev") ``` This version appears in: - CLI `pgm -version` output - Generated file headers - Version string formatting in cmd/version.go ## Common Patterns ### Connection Pool Initialization Always initialize once at startup and defer cleanup: ```go func main() { pgm.InitPool(pgm.Config{ ConnString: os.Getenv("DATABASE_URL"), MaxConns: 25, MinConns: 5, }) defer pgm.ClosePool() // ... } ``` ### Safe Query Execution Check for no rows using the helper: ```go err := users.User.Select(users.Email).Where(users.ID.Eq(id)).First(ctx, &email) if pgm.IsNotFound(err) { // Handle not found case } if err != nil { // Handle other errors } ``` ### Validation Requirements - UPDATE requires at least one `Set()` call - INSERT requires at least one `Set()` call - DELETE without `Where()` deletes ALL rows (dangerous!) All execution methods validate these requirements and return descriptive errors. ## Performance Considerations - **sync.Pool**: Reuses string builders across queries - **Pre-allocation**: Queries pre-calculate buffer sizes - **Zero reflection**: Direct field access, no runtime type inspection - **pgxpool**: Leverages jackc/pgx's efficient connection pooling - **Direct scanning**: Users scan into their own types, no intermediate mapping ## Security Notes - All queries use parameterized statements (PostgreSQL numbered parameters) - Field validation in functions like `DateTrunc()` uses allowlists - SQL string escaping in `escapeSQLString()` for literal values - Identifier validation via regex in `validateSQLIdentifier()` - Connection pool configuration validates against negative values and invalid ranges