Swimm - Mainframe Preparation Guide

Overview

This guide will help you prepare your codebase and documentation to get maximum value from Swimm. Unlike pure AI solutions, Swimm uses static code analysis to build a comprehensive knowledge base of your systems, then applies AI to generate human-readable documentation. This hybrid approach ensures accuracy and reliability - the documentation is grounded in actual code structure, not AI guesswork.

By following these recommendations, you'll ensure that Swimm can deliver the most comprehensive and accurate documentation for your COBOL/mainframe systems.

What You'll Get from Swimm

Accurate, reliable documentation - Built on static code analysis, not AI hallucinations
Living documentation that stays synchronized with your code
Business logic clarity - AI-enhanced explanations grounded in actual code structure
Visual system maps - Automatically generated dependency diagrams and flow charts
Searchable knowledge base - Find answers across your entire mainframe estate
Modernization readiness - Technical specifications ready for your transformation initiatives
Faster onboarding - New team members can understand your systems in days, not months

Why Swimm is Different

Swimm's static analysis engine first parses and understands your code structure - programs, data flows, dependencies, and business logic. This creates a factual knowledge base of your system. AI is then used to make this technical information accessible and understandable. This approach delivers reliability that pure AI solutions simply cannot match.

How to Use This Guide

We've organized the artifacts into three priority levels to help you focus your efforts:

🔴 Required - Start here! These are essential for Swimm to analyze your systems
🟡 Recommended - Providing these will significantly improve the quality and depth of your documentation
🟢 Optional - These enhance the documentation with additional context and insights

Pro tip: Even if you can't provide everything at once, you can always add more artifacts later to enrich your documentation over time.

Your Artifact Collection Checklist

Priority	Artifact Type	What to Provide	What You'll Get
SOURCE CODE COMPONENTS
🔴 Required	COBOL Programs	• All in-scope COBOL source files (.cbl, .cob, .cobol) • Both batch and online programs • Unexpanded source code (not pre-compiled) • Note: System libraries are not needed	Swimm's static analysis engine parses each program to extract business logic, calculations, and process flows. AI then generates clear explanations anchored in actual program structure
🔴 Required	Copybooks	• Each copybook as a separate file • All referenced copybooks (data structures, common routines) • CICS-related copybooks if applicable	Swimm analyzes copybook structures and traces their usage across programs, showing exactly how data structures are defined, used, and flow through your system
🔴 Required (if using CICS)	CICS Assets	• BMS map source files • CICS control tables (if customized) • Transaction definitions	Documentation of screen layouts and user interaction flows, making it easy to understand how users navigate through online transactions
🟡 Recommended	Shared Utilities	• Common subroutines and utilities • Error handling modules • Date/time routines	Documentation of reusable components and enterprise standards, helping your team leverage these shared resources consistently
DATABASE & DATA DEFINITIONS
🔴 Required (if using DB2)	DB2 Definitions	• DDL statements for all referenced tables • Table/column definitions, data types, lengths • Primary/foreign key constraints • Indexes and views	Swimm's analysis engine maps database schemas to program usage, showing exactly which programs access which tables and how data relationships work - factual mapping, not AI inference
🔴 Required (if using IMS)	IMS Definitions	• Database Descriptions (DBDs) with segment layouts • Program Specification Blocks (PSBs) if available • Segment field definitions	Documentation of hierarchical data structures and how programs access IMS data, making the complex IMS hierarchy understandable
🟡 Recommended	Data Mapping Docs	• Field-level input-to-output mappings • Transformation rules and calculations • Validation logic documentation • Business rule descriptions	Swimm traces data transformations automatically; your mappings enhance this with business context, creating complete data lineage that's both technically accurate and business-meaningful
JOB CONTROL & ORCHESTRATION
🔴 Required (batch systems)	JCL Files	• JCLs that execute in-scope programs • JCLs for upstream/downstream jobs • PROC definitions • Parameter descriptions if available	Clear documentation of batch execution sequences, job dependencies, and runtime parameters - essential for understanding when and how programs run
🟡 Recommended	Scheduling Definitions	• Control-M, Autosys, or other scheduler definitions • Run calendars and frequencies • Dependency chains	Documentation of operational schedules, showing when jobs run, how often, and what depends on what - valuable for operations and planning
BUSINESS CONTEXT
🔴 Required	Program Flow Docs	• Programs grouped by business function/process flow • High-level description of each program • Cross-program call relationships	Swimm traces actual program-to-program calls through static analysis and combines this with your business context - accurate process maps, not AI-generated guesses about your architecture
🟡 Recommended	Technical Specifications	• Existing technical design documents • Program specifications • Interface agreements • Descriptions of system libraries (if available) • Note: Flag any source code embedded in Word/PDF	Richer documentation including original design intent and business context - the "why" behind the code that isn't visible in programs themselves
🟢 Optional	Sample Data	• Non-production, sanitized sample records (no PII!) • Example input/output files • Test scenarios with expected results	More accurate documentation with concrete examples showing how data actually flows and transforms through your system
SUPPORTING DOCUMENTATION
🟢 Optional	Visual Documentation	• Entity Relationship Diagrams (ERDs) • System architecture diagrams • Process flow charts • Screen mockups or captures	Swimm generates its own visual diagrams from static analysis (dependency graphs, flow charts, etc.). Your existing diagrams provide valuable context to validate the analysis and reveal business perspectives
🟢 Optional	Environment-Specific Components	• PCTS configurations (if applicable) • DBT models (if applicable) • Environment-specific settings files	Documentation of your unique implementation details and configurations that differentiate your environment

How to Organize Your Files

Repository Format

Your files should be organized in a repository format (such as a Git repository) that can be analyzed directly from VS Code. This allows Swimm to integrate seamlessly with your development environment and maintain version control of your documentation alongside your code.

Example Structure

While Swimm is flexible, organizing your files this way will help streamline the documentation process:

/project-root/
├── /programs/
│   ├── /batch/
│   │   ├── PROG001.cbl
│   │   └── PROG002.cbl
│   └── /online/
│       └── PROG003.cbl
├── /copybooks/
│   ├── COPY001.cpy
│   └── COPY002.cpy
├── /jcl/
│   ├── JOB001.jcl
│   └── PROC001.prc
├── /database/
│   ├── /db2/
│   │   └── tables.ddl
│   └── /ims/
│       ├── database.dbd
│       └── program.psb
├── /documentation/
│   ├── program-flows.xlsx
│   └── technical-specs.pdf
└── /maps/
    └── MAP001.bms

File Format Tips

Source code: Provide the original source, not compiled or expanded versions
System libraries: Not needed - only provide your custom application code
Separate files: Keep each program and copybook in its own file
Naming: Use your original file names - this helps with traceability
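
Before handing over a repository, it can help to see what it actually contains. The following is a minimal sketch, not part of Swimm, that counts files by artifact category. The extension-to-category mapping is an assumption drawn from the extensions used in this guide's checklist and example structure; adjust it to your own naming conventions. The names `CATEGORY_BY_EXTENSION` and `inventory` are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Assumed mapping from file extension to artifact category. The extensions
# are the ones used in this guide's examples; your conventions may differ.
CATEGORY_BY_EXTENSION = {
    ".cbl": "COBOL program",
    ".cob": "COBOL program",
    ".cobol": "COBOL program",
    ".cpy": "Copybook",
    ".jcl": "JCL",
    ".prc": "JCL PROC",
    ".ddl": "DB2 definition",
    ".dbd": "IMS DBD",
    ".psb": "IMS PSB",
    ".bms": "BMS map",
}


def inventory(root):
    """Count files under root by artifact category, keyed on file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Extensions are compared case-insensitively; anything unknown
            # (documents, spreadsheets, and so on) is counted as "Other".
            category = CATEGORY_BY_EXTENSION.get(path.suffix.lower(), "Other")
            counts[category] += 1
    return counts
```

Calling `inventory("/project-root")` returns a `Counter`. A category that is missing or unexpectedly small (for example, no copybooks at all) is a hint that something was left out of the collection.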

Quick Verification Checklist

Take a moment to verify:

All in-scope programs are included
Copybooks are provided as separate files (not embedded in programs)
Database definitions cover the tables your programs use
JCLs are included for your batch processes
Program flow documentation is provided (even if it's informal)
Production credentials have been removed
Sample data contains no PII
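
The "all in-scope programs are included" check can be partly automated by comparing the programs your JCL executes against the source files you collected. The sketch below is an illustration under stated assumptions, not a Swimm feature: it assumes JCL and PROC files use the `.jcl` and `.prc` extensions from the example structure, and that each program's file name (without extension) matches the name used on `EXEC PGM=`. The function names are hypothetical.

```python
import re
from pathlib import Path

# Matches "EXEC PGM=NAME" on a JCL statement line.
EXEC_PGM = re.compile(r"\bEXEC\s+PGM=([A-Z0-9@#$]+)", re.IGNORECASE)
PROGRAM_EXTENSIONS = {".cbl", ".cob", ".cobol"}
JCL_EXTENSIONS = {".jcl", ".prc"}  # assumed extensions, as in the example tree


def programs_run_by_jcl(jcl_text):
    """Return the set of program names named on EXEC PGM= statements."""
    names = set()
    for line in jcl_text.splitlines():
        if line.startswith("//*"):  # JCL comment line
            continue
        match = EXEC_PGM.search(line)
        if match:
            names.add(match.group(1).upper())
    return names


def programs_without_source(root):
    """List programs executed by JCL that have no matching source file."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    # Assumption: the source file stem equals the executed program name.
    provided = {p.stem.upper() for p in files
                if p.suffix.lower() in PROGRAM_EXTENSIONS}
    executed = set()
    for p in files:
        if p.suffix.lower() in JCL_EXTENSIONS:
            executed |= programs_run_by_jcl(p.read_text(errors="replace"))
    return sorted(executed - provided)
```

System utilities such as `IEFBR14` will appear in the result. That is expected, since system libraries are not needed; the names worth reviewing are your own application programs.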

Common Pitfalls (Easy to Avoid!)

Missing copybooks - Include all copybooks your programs reference, even common ones
Compiled code instead of source - We need the source code, not compiled or expanded versions
Incomplete database definitions - Make sure to include all tables your programs actually use
No process flow info - Even a simple spreadsheet showing which programs work together helps enormously
Source code hidden in documents - If you have code embedded in Word/PDF files, please let us know

Pro tip: When in doubt, include it! Extra information is easy to work with; missing pieces slow everyone down.
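
The missing-copybooks pitfall is one of the easier ones to catch with a script. The following is a rough sketch, not a Swimm tool: it scans COBOL sources for `COPY` statements and reports names with no matching copybook file. It assumes fixed-format source (comment indicator in column 7), copybooks stored with a `.cpy` extension as in the example structure, and file names that match the copybook names. It is a simple text heuristic, so it does not handle `EXEC SQL INCLUDE` or the word COPY inside string literals. The function names are hypothetical.

```python
import re
from pathlib import Path

# COPY followed by a copybook name, optionally quoted. The lookbehind avoids
# matching data names that merely end in COPY, such as WS-COPY.
COPY_STATEMENT = re.compile(
    r"(?<![A-Z0-9-])COPY\s+['\"]?([A-Z0-9][A-Z0-9-]*)", re.IGNORECASE
)
PROGRAM_EXTENSIONS = {".cbl", ".cob", ".cobol"}


def copybooks_referenced(source_text):
    """Return copybook names used in COPY statements, skipping comment lines."""
    names = set()
    for line in source_text.splitlines():
        # Fixed-format COBOL: '*' or '/' in column 7 marks a comment line.
        if len(line) > 6 and line[6] in "*/":
            continue
        for match in COPY_STATEMENT.finditer(line):
            names.add(match.group(1).upper())
    return names


def missing_copybooks(root):
    """List referenced copybooks that have no .cpy file under root."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    # Assumption: the copybook file stem equals the name on the COPY statement.
    provided = {p.stem.upper() for p in files if p.suffix.lower() == ".cpy"}
    referenced = set()
    for p in files:
        if p.suffix.lower() in PROGRAM_EXTENSIONS:
            referenced |= copybooks_referenced(p.read_text(errors="replace"))
    return sorted(referenced - provided)
```

An empty list from `missing_copybooks("/project-root")` suggests every referenced copybook was collected, within the limits of the heuristic. Any names it does return are worth checking against your copybook libraries before submitting.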
