Swimm - Mainframe Preparation Guide
Overview
This guide will help you prepare your codebase and documentation to get maximum value from Swimm. Unlike pure AI solutions, Swimm uses static code analysis to build a comprehensive knowledge base of your systems, then applies AI to generate human-readable documentation. This hybrid approach ensures accuracy and reliability - the documentation is grounded in actual code structure, not AI guesswork.
By following these recommendations, you'll ensure that Swimm can deliver the most comprehensive and accurate documentation for your COBOL/mainframe systems.
What You'll Get from Swimm
- Accurate, reliable documentation - Built on static code analysis, not AI hallucinations
- Living documentation that stays synchronized with your code
- Business logic clarity - AI-enhanced explanations grounded in actual code structure
- Visual system maps - Automatically generated dependency diagrams and flow charts
- Searchable knowledge base - Find answers across your entire mainframe estate
- Modernization readiness - Technical specifications ready for your transformation initiatives
- Faster onboarding - New team members can understand your systems in days, not months
Why Swimm is Different
Swimm's static analysis engine first parses and understands your code structure - programs, data flows, dependencies, and business logic. This creates a factual knowledge base of your system. AI is then used to make this technical information accessible and understandable. Because the AI layer works from this parsed knowledge base rather than from raw text alone, its explanations stay tied to what the code actually does.
How to Use This Guide
We've organized the artifacts into three priority levels to help you focus your efforts:
- 🔴 Required - Start here! These are essential for Swimm to analyze your systems
- 🟡 Recommended - Providing these will significantly improve the quality and depth of your documentation
- 🟢 Optional - These enhance the documentation with additional context and insights
Pro tip: Even if you can't provide everything at once, you can always add more artifacts later to enrich your documentation over time.
Your Artifact Collection Checklist
| Priority | Artifact Type | What to Provide | What You'll Get |
|---|---|---|---|
| SOURCE CODE COMPONENTS | | | |
| 🔴 Required | COBOL Programs | • All in-scope COBOL source files (.cbl, .cob, .cobol) • Both batch and online programs • Unexpanded source code (not pre-compiled) • Note: System libraries are not needed | Swimm's static analysis engine parses each program to extract business logic, calculations, and process flows. AI then generates clear explanations anchored in actual program structure |
| 🔴 Required | Copybooks | • Each copybook as a separate file • All referenced copybooks (data structures, common routines) • CICS-related copybooks if applicable | Swimm analyzes copybook structures and traces their usage across programs, showing exactly how data structures are defined, used, and flow through your system |
| 🔴 Required (if using CICS) | CICS Assets | • BMS map source files • CICS control tables (if customized) • Transaction definitions | Documentation of screen layouts and user interaction flows, making it easy to understand how users navigate through online transactions |
| 🟡 Recommended | Shared Utilities | • Common subroutines and utilities • Error handling modules • Date/time routines | Documentation of reusable components and enterprise standards, helping your team leverage these shared resources consistently |
| DATABASE & DATA DEFINITIONS | | | |
| 🔴 Required (if using DB2) | DB2 Definitions | • DDL statements for all referenced tables • Table/column definitions, data types, lengths • Primary/foreign key constraints • Indexes and views | Swimm's analysis engine maps database schemas to program usage, showing exactly which programs access which tables and how data relationships work - factual mapping, not AI inference |
| 🔴 Required (if using IMS) | IMS Definitions | • Database Descriptions (DBDs) with segment layouts • Program Specification Blocks (PSBs) if available • Segment field definitions | Documentation of hierarchical data structures and how programs access IMS data, making the complex IMS hierarchy understandable |
| 🟡 Recommended | Data Mapping Docs | • Field-level input-to-output mappings • Transformation rules and calculations • Validation logic documentation • Business rule descriptions | Swimm traces data transformations automatically; your mappings enhance this with business context, creating complete data lineage that's both technically accurate and business-meaningful |
| JOB CONTROL & ORCHESTRATION | | | |
| 🔴 Required (batch systems) | JCL Files | • JCLs that execute in-scope programs • JCLs for upstream/downstream jobs • PROC definitions • Parameter descriptions if available | Clear documentation of batch execution sequences, job dependencies, and runtime parameters - essential for understanding when and how programs run |
| 🟡 Recommended | Scheduling Definitions | • Control-M, AutoSys, or other scheduler definitions • Run calendars and frequencies • Dependency chains | Documentation of operational schedules, showing when jobs run, how often, and what depends on what - valuable for operations and planning |
| BUSINESS CONTEXT | | | |
| 🔴 Required | Program Flow Docs | • Programs grouped by business function/process flow • High-level description of each program • Cross-program call relationships | Swimm traces actual program-to-program calls through static analysis and combines this with your business context - accurate process maps, not AI-generated guesses about your architecture |
| 🟡 Recommended | Technical Specifications | • Existing technical design documents • Program specifications • Interface agreements • Descriptions of system libraries (if available) • Note: Flag any source code embedded in Word/PDF | Richer documentation including original design intent and business context - the "why" behind the code that isn't visible in programs themselves |
| 🟢 Optional | Sample Data | • Non-production, sanitized sample records (no PII!) • Example input/output files • Test scenarios with expected results | More accurate documentation with concrete examples showing how data actually flows and transforms through your system |
| SUPPORTING DOCUMENTATION | | | |
| 🟢 Optional | Visual Documentation | • Entity Relationship Diagrams (ERDs) • System architecture diagrams • Process flow charts • Screen mockups or captures | Swimm generates its own visual diagrams from static analysis (dependency graphs, flow charts, etc.). Your existing diagrams provide valuable context to validate the analysis and reveal business perspectives |
| 🟢 Optional | Environment-Specific Components | • PCTS configurations (if applicable) • DBT models (if applicable) • Environment-specific settings files | Documentation of your unique implementation details and configurations that differentiate your environment |
How to Organize Your Files
Repository Format
Organize your files in a repository (such as a Git repository) that can be opened and analyzed directly from VS Code. This lets Swimm integrate with your development environment and keeps your documentation under version control alongside your code.
Example Structure
While Swimm is flexible, organizing your files this way will help streamline the documentation process:
/project-root/
├── /programs/
│   ├── /batch/
│   │   ├── PROG001.cbl
│   │   └── PROG002.cbl
│   └── /online/
│       └── PROG003.cbl
├── /copybooks/
│   ├── COPY001.cpy
│   └── COPY002.cpy
├── /jcl/
│   ├── JOB001.jcl
│   └── PROC001.prc
├── /database/
│   ├── /db2/
│   │   └── tables.ddl
│   └── /ims/
│       ├── database.dbd
│       └── program.psb
├── /documentation/
│   ├── program-flows.xlsx
│   └── technical-specs.pdf
└── /maps/
    └── MAP001.bms
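If you want a quick sanity check of what you have collected so far, a small script can count files per artifact category. The sketch below is not a Swimm tool. The `inventory` function and the `CATEGORIES` map are hypothetical names, and the extension-to-category mapping simply mirrors the example structure above, so adjust it to the extensions your shop actually uses.

```python
from collections import Counter
from pathlib import Path

# Assumption: extensions follow the example layout in this guide.
CATEGORIES = {
    ".cbl": "programs", ".cob": "programs", ".cobol": "programs",
    ".cpy": "copybooks",
    ".jcl": "jcl", ".prc": "jcl",
    ".ddl": "database", ".dbd": "database", ".psb": "database",
    ".bms": "maps",
}


def inventory(root):
    """Count files under root per category; unrecognized extensions count as 'other'."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[CATEGORIES.get(path.suffix.lower(), "other")] += 1
    return counts


# Example usage (the path is a placeholder):
# for category, n in sorted(inventory("/path/to/project-root").items()):
#     print(f"{category:10} {n}")
```

A category that shows zero files, such as no copybooks at all, is usually a sign that something still needs to be collected. Note that the walk includes every file under the root, so repository metadata will show up under "other".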
File Format Tips
- Source code: Provide the original source, not compiled or expanded versions
- System libraries: Not needed - only provide your custom application code
- Separate files: Keep each program and copybook in its own file
- Naming: Use your original file names - this helps with traceability
Quick Verification Checklist
Take a moment to verify:
- All in-scope programs are included
- Copybooks are provided as separate files (not embedded in programs)
- Database definitions cover the tables your programs use
- JCLs are included for your batch processes
- Program flow documentation is provided (even if it's informal)
- Production credentials have been removed
- Sample data contains no PII
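The copybook item on this checklist can be partly automated. The following is a minimal sketch, not part of Swimm, that scans COBOL sources for `COPY` statements and reports names with no matching copybook file. It rests on several assumptions: sources are fixed-format (indicator in column 7, code in columns 8 to 72), programs use the `.cbl`, `.cob`, or `.cobol` extensions, and each copybook is a `.cpy` file named after its member. The function names are hypothetical, and the regex is a heuristic that does not handle `EXEC SQL INCLUDE` or the word COPY inside string literals.

```python
import re
from pathlib import Path

# Heuristic: COPY followed by an optionally quoted member name.
# The lookbehind avoids matching data names such as WS-COPY.
COPY_RE = re.compile(r"(?<![A-Z0-9-])COPY\s+['\"]?([A-Z0-9#@$-]+)", re.IGNORECASE)
SOURCE_EXTS = {".cbl", ".cob", ".cobol"}  # assumption: extensions from this guide
COPYBOOK_EXTS = {".cpy"}                  # assumption: example layout in this guide


def referenced_copybooks(source_text):
    """Return upper-cased copybook names referenced by COPY statements."""
    names = set()
    for line in source_text.splitlines():
        # Fixed format: '*' or '/' in column 7 marks a comment line.
        if len(line) > 6 and line[6] in "*/":
            continue
        code = line[7:72]  # columns 8-72 hold the code
        names.update(name.upper() for name in COPY_RE.findall(code))
    return names


def missing_copybooks(root):
    """Map each source file name to the copybooks it references but that are absent."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    available = {p.stem.upper() for p in files if p.suffix.lower() in COPYBOOK_EXTS}
    missing = {}
    for src in files:
        if src.suffix.lower() in SOURCE_EXTS:
            absent = referenced_copybooks(src.read_text(errors="replace")) - available
            if absent:
                missing[src.name] = sorted(absent)
    return missing


# Example usage (the path is a placeholder):
# print(missing_copybooks("/path/to/project-root"))
```

Treat the result as a list of things to double-check, not a definitive report. System-supplied copybooks will show up as missing even though they are not needed.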
Common Pitfalls (Easy to Avoid!)
- Missing copybooks - Include all copybooks your programs reference, even common ones
- Compiled code instead of source - We need the source code, not compiled or expanded versions
- Incomplete database definitions - Make sure to include all tables your programs actually use
- No process flow info - Even a simple spreadsheet showing which programs work together helps enormously
- Source code hidden in documents - If you have code embedded in Word/PDF files, please let us know
Pro tip: When in doubt, include it! Extra information is easy to work with; missing pieces slow everyone down.
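If you have no process flow information at all, a rough starting point for the "which programs work together" spreadsheet can be drafted from the code itself. The sketch below is an illustration only and is not how Swimm performs its analysis. It lists static `CALL 'PROGRAM'` literals per source file, assuming fixed-format COBOL and the source extensions mentioned in this guide. The function names are hypothetical. Dynamic calls through a data name and CICS LINK/XCTL transfers are not detected.

```python
import re
from pathlib import Path

# Heuristic: CALL followed by a quoted program name (static calls only).
CALL_RE = re.compile(r"(?<![A-Z0-9-])CALL\s+['\"]([A-Z0-9#@$-]+)['\"]", re.IGNORECASE)
SOURCE_EXTS = {".cbl", ".cob", ".cobol"}  # assumption: extensions from this guide


def static_calls(source_text):
    """Return upper-cased program names that appear as CALL literals."""
    callees = set()
    for line in source_text.splitlines():
        # Fixed format: '*' or '/' in column 7 marks a comment line.
        if len(line) > 6 and line[6] in "*/":
            continue
        callees.update(name.upper() for name in CALL_RE.findall(line[7:72]))
    return callees


def call_pairs(root):
    """Return sorted (caller, callee) pairs for every COBOL source under root."""
    pairs = []
    for src in sorted(Path(root).rglob("*")):
        if src.is_file() and src.suffix.lower() in SOURCE_EXTS:
            for callee in sorted(static_calls(src.read_text(errors="replace"))):
                pairs.append((src.stem.upper(), callee))
    return pairs


# Example usage: write the pairs to a CSV you can open as a spreadsheet.
# import csv, sys
# writer = csv.writer(sys.stdout)
# writer.writerow(["caller", "callee"])
# writer.writerows(call_pairs("/path/to/project-root"))
```

The output is only a draft. Adding a column with the business function of each program is what turns it into useful process flow documentation.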