Swimm - Mainframe Preparation Guide

Overview

This guide will help you prepare your codebase and documentation to get maximum value from Swimm. Unlike pure AI solutions, Swimm uses static code analysis to build a comprehensive knowledge base of your systems, then applies AI to generate human-readable documentation. This hybrid approach ensures accuracy and reliability - the documentation is grounded in actual code structure, not AI guesswork.

By following these recommendations, you'll ensure that Swimm can deliver the most comprehensive and accurate documentation for your COBOL/mainframe systems.

What You'll Get from Swimm

  • Accurate, reliable documentation - Built on static code analysis, not AI hallucinations
  • Living documentation that stays synchronized with your code
  • Business logic clarity - AI-enhanced explanations grounded in actual code structure
  • Visual system maps - Automatically generated dependency diagrams and flow charts
  • Searchable knowledge base - Find answers across your entire mainframe estate
  • Modernization readiness - Technical specifications ready for your transformation initiatives
  • Faster onboarding - New team members can understand your systems in days, not months

Why Swimm is Different

Swimm's static analysis engine first parses and understands your code structure - programs, data flows, dependencies, and business logic. This creates a factual knowledge base of your system. AI is then used to make this technical information accessible and understandable. This approach delivers reliability that pure AI solutions simply cannot match.


How to Use This Guide

We've organized the artifacts into three priority levels to help you focus your efforts:

  • 🔴 Required - Start here! These are essential for Swimm to analyze your systems
  • 🟡 Recommended - Providing these will significantly improve the quality and depth of your documentation
  • 🟢 Optional - These enhance the documentation with additional context and insights

Pro tip: Even if you can't provide everything at once, you can always add more artifacts later to enrich your documentation over time.


Your Artifact Collection Checklist

Each entry below lists the priority, the artifact type, what to provide, and what you'll get.

SOURCE CODE COMPONENTS

🔴 Required: COBOL Programs
What to provide:
  • All in-scope COBOL source files (.cbl, .cob, .cobol)
  • Both batch and online programs
  • Unexpanded source code (not pre-compiled)
Note: System libraries are not needed
What you'll get: Swimm's static analysis engine parses each program to extract business logic, calculations, and process flows. AI then generates clear explanations anchored in actual program structure.

🔴 Required: Copybooks
What to provide:
  • Each copybook as a separate file
  • All referenced copybooks (data structures, common routines)
  • CICS-related copybooks if applicable
What you'll get: Swimm analyzes copybook structures and traces their usage across programs, showing exactly how data structures are defined, used, and flow through your system.

🔴 Required (if using CICS): CICS Assets
What to provide:
  • BMS map source files
  • CICS control tables (if customized)
  • Transaction definitions
What you'll get: Documentation of screen layouts and user interaction flows, making it easy to understand how users navigate through online transactions.

🟡 Recommended: Shared Utilities
What to provide:
  • Common subroutines and utilities
  • Error handling modules
  • Date/time routines
What you'll get: Documentation of reusable components and enterprise standards, helping your team leverage these shared resources consistently.

DATABASE & DATA DEFINITIONS

🔴 Required (if using DB2): DB2 Definitions
What to provide:
  • DDL statements for all referenced tables
  • Table/column definitions, data types, lengths
  • Primary/foreign key constraints
  • Indexes and views
What you'll get: Swimm's analysis engine maps database schemas to program usage, showing exactly which programs access which tables and how data relationships work - factual mapping, not AI inference.

🔴 Required (if using IMS): IMS Definitions
What to provide:
  • Database Descriptions (DBDs) with segment layouts
  • Program Specification Blocks (PSBs) if available
  • Segment field definitions
What you'll get: Documentation of hierarchical data structures and how programs access IMS data, making the complex IMS hierarchy understandable.

🟡 Recommended: Data Mapping Docs
What to provide:
  • Field-level input-to-output mappings
  • Transformation rules and calculations
  • Validation logic documentation
  • Business rule descriptions
What you'll get: Swimm traces data transformations automatically; your mappings enhance this with business context, creating complete data lineage that's both technically accurate and business-meaningful.

JOB CONTROL & ORCHESTRATION

🔴 Required (batch systems): JCL Files
What to provide:
  • JCLs that execute in-scope programs
  • JCLs for upstream/downstream jobs
  • PROC definitions
  • Parameter descriptions if available
What you'll get: Clear documentation of batch execution sequences, job dependencies, and runtime parameters - essential for understanding when and how programs run.

🟡 Recommended: Scheduling Definitions
What to provide:
  • Control-M, Autosys, or other scheduler definitions
  • Run calendars and frequencies
  • Dependency chains
What you'll get: Documentation of operational schedules, showing when jobs run, how often, and what depends on what - valuable for operations and planning.

BUSINESS CONTEXT

🔴 Required: Program Flow Docs
What to provide:
  • Programs grouped by business function/process flow
  • High-level description of each program
  • Cross-program call relationships
What you'll get: Swimm traces actual program-to-program calls through static analysis and combines this with your business context - accurate process maps, not AI-generated guesses about your architecture.

🟡 Recommended: Technical Specifications
What to provide:
  • Existing technical design documents
  • Program specifications
  • Interface agreements
  • Descriptions of system libraries (if available)
Note: Flag any source code embedded in Word/PDF
What you'll get: Richer documentation including original design intent and business context - the "why" behind the code that isn't visible in programs themselves.

🟢 Optional: Sample Data
What to provide:
  • Non-production, sanitized sample records (no PII!)
  • Example input/output files
  • Test scenarios with expected results
What you'll get: More accurate documentation with concrete examples showing how data actually flows and transforms through your system.

SUPPORTING DOCUMENTATION

🟢 Optional: Visual Documentation
What to provide:
  • Entity Relationship Diagrams (ERDs)
  • System architecture diagrams
  • Process flow charts
  • Screen mockups or captures
What you'll get: Swimm generates its own visual diagrams from static analysis (dependency graphs, flow charts, etc.). Your existing diagrams provide valuable context to validate the analysis and reveal business perspectives.

🟢 Optional: Environment-Specific Components
What to provide:
  • PCTS configurations (if applicable)
  • DBT models (if applicable)
  • Environment-specific settings files
What you'll get: Documentation of your unique implementation details and configurations that differentiate your environment.

How to Organize Your Files

Repository Format

Organize your files in a repository (such as a Git repository) that can be opened and analyzed directly from VS Code. This allows Swimm to integrate seamlessly with your development environment and keep your documentation under version control alongside your code.

Example Structure

While Swimm is flexible, organizing your files this way will help streamline the documentation process:

/project-root/
├── /programs/
│   ├── /batch/
│   │   ├── PROG001.cbl
│   │   └── PROG002.cbl
│   └── /online/
│       └── PROG003.cbl
├── /copybooks/
│   ├── COPY001.cpy
│   └── COPY002.cpy
├── /jcl/
│   ├── JOB001.jcl
│   └── PROC001.prc
├── /database/
│   ├── /db2/
│   │   └── tables.ddl
│   └── /ims/
│       ├── database.dbd
│       └── program.psb
├── /documentation/
│   ├── program-flows.xlsx
│   └── technical-specs.pdf
└── /maps/
    └── MAP001.bms
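
To see how a repository compares with the example structure, a short script can count files by artifact type. The following is a minimal sketch and not part of Swimm: the names `classify` and `inventory` are hypothetical, and the extension mapping simply mirrors the suffixes used in this guide. Adjust it if your site uses different suffixes or none at all.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed mapping, taken from the suffixes shown in this guide.
CATEGORIES = {
    ".cbl": "COBOL program",
    ".cob": "COBOL program",
    ".cobol": "COBOL program",
    ".cpy": "copybook",
    ".jcl": "JCL",
    ".prc": "PROC",
    ".ddl": "DB2 DDL",
    ".dbd": "IMS DBD",
    ".psb": "IMS PSB",
    ".bms": "BMS map",
}


def classify(path):
    """Return the artifact category for a file path, or "other" if unknown."""
    suffix = PurePosixPath(path).suffix.lower()
    return CATEGORIES.get(suffix, "other")


def inventory(paths):
    """Count files per artifact category."""
    return Counter(classify(p) for p in paths)
```

One way to use it is to pass every file path in a checkout, for example `inventory(p.as_posix() for p in Path(".").rglob("*") if p.is_file())` after `from pathlib import Path`. A large "other" count may point to source files that lack the expected suffixes.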

File Format Tips

  • Source code: Provide the original source, not compiled or expanded versions
  • System libraries: Not needed - only provide your custom application code
  • Separate files: Keep each program and copybook in its own file
  • Naming: Use your original file names - this helps with traceability

Quick Verification Checklist

Take a moment to verify:

  • All in-scope programs are included
  • Copybooks are provided as separate files (not embedded in programs)
  • Database definitions cover the tables your programs use
  • JCLs are included for your batch processes
  • Program flow documentation is provided (even if it's informal)
  • Production credentials have been removed
  • Sample data contains no PII
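
Two of these checks, program coverage and JCL coverage, can be partly automated. The sketch below is an illustration only, not a Swimm feature: it assumes steps written as `EXEC PGM=` in JCL, and the helper names `programs_in_jcl` and `missing_sources` are hypothetical. It reports programs that JCL executes but for which no source name was supplied. System utilities such as IEFBR14 or IDCAMS will appear in the result and can be ignored, and programs started indirectly (for example through a PROC or a TSO batch step) are not detected.

```python
import re

# Matches "EXEC PGM=NAME"; the name stops at the first comma or blank.
EXEC_PGM = re.compile(r"\bEXEC\s+PGM=([A-Z0-9$#@]+)", re.IGNORECASE)


def programs_in_jcl(jcl_text):
    """Return the set of program names executed by EXEC PGM= steps."""
    names = set()
    for line in jcl_text.splitlines():
        if line.startswith("//*"):  # JCL comment line
            continue
        match = EXEC_PGM.search(line)
        if match:
            names.add(match.group(1).upper())
    return names


def missing_sources(jcl_text, source_names):
    """Return programs referenced in the JCL that have no supplied source."""
    available = {name.upper() for name in source_names}
    return programs_in_jcl(jcl_text) - available
```

The `source_names` argument could be built from file stems, for example `{p.stem for p in Path("programs").rglob("*.cbl")}`.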

Common Pitfalls (Easy to Avoid!)

  1. Missing copybooks - Include all copybooks your programs reference, even common ones
  2. Compiled code instead of source - We need the source code, not compiled or expanded versions
  3. Incomplete database definitions - Make sure to include all tables your programs actually use
  4. No process flow info - Even a simple spreadsheet showing which programs work together helps enormously
  5. Source code hidden in documents - If you have code embedded in Word/PDF files, please let us know
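
Pitfall 1 is the easiest to check mechanically. The following sketch is an illustration under stated assumptions, not a description of how Swimm resolves copybooks: it assumes fixed-format COBOL source (indicator in column 7, code in columns 8 to 72), and the helper names are hypothetical. It scans for `COPY` statements and reports names with no matching copybook. `EXEC SQL INCLUDE` members are not handled, and the word COPY inside a string literal would be picked up as a false positive.

```python
import re

# COPY followed by a (possibly quoted) member name. The lookbehind avoids
# matching data names that merely end in "-COPY".
COPY_STMT = re.compile(
    r"(?<![A-Za-z0-9-])COPY\s+[\"']?([A-Za-z0-9#@$-]+)", re.IGNORECASE
)


def copybooks_referenced(source_text):
    """Return the set of copybook names named in COPY statements."""
    names = set()
    for line in source_text.splitlines():
        # Column 7 holds '*' or '/' on comment lines in fixed format.
        if len(line) > 6 and line[6] in "*/":
            continue
        code_area = line[7:72]
        for match in COPY_STMT.finditer(code_area):
            names.add(match.group(1).upper())
    return names


def missing_copybooks(source_text, provided_names):
    """Return referenced copybooks that are not in provided_names."""
    provided = {name.upper() for name in provided_names}
    return copybooks_referenced(source_text) - provided
```

Running `missing_copybooks` over each program, with `provided_names` taken from the file stems in your copybook directory, gives a list of copybooks to locate before handing over the repository.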

Pro tip: When in doubt, include it! Extra information is easy to work with; missing pieces slow everyone down.