Import Command Features
This guide covers advanced features and optimizations in the code import command.
Overview
The code import command has been optimized for large codebases and includes several features to improve reliability, performance, and user experience:
- Progress Reporting: Real-time progress bars for long-running operations
- Feature Validation: Automatic validation of existing features when resuming imports
- Early Save Checkpoint: Features saved immediately after analysis to prevent data loss
- Performance Optimizations: Pre-computed caches for 5-15x faster processing
- Re-validation Flag: Force re-analysis of features even if files haven't changed
Progress Reporting
The import command now provides detailed progress reporting for all major operations:
Feature Analysis Progress
During the initial codebase analysis, you'll see:
Analyzing codebase...
Found 3156 features
Detected themes: API, Async, Database, ORM, Testing
Total stories: 5604
Source File Linking Progress
When linking source files to features, a progress bar shows:
Linking 3156 features to source files...
[████████████████████] 100% (3156/3156 features)
This is especially useful for large codebases where linking can take several minutes.
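As a rough illustration of how a line like the one above can be produced, here is a minimal sketch of a text progress bar. The function name `render_progress` and the `#`/`-` glyphs are assumptions made for this example, not the command's actual rendering code.

```python
def render_progress(done: int, total: int, width: int = 20) -> str:
    """Return a one-line text progress bar for `done` out of `total` features."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))      # clamp so the bar never overflows
    filled = width * done // total       # number of filled cells
    bar = "#" * filled + "-" * (width - filled)
    percent = 100 * done // total
    return f"[{bar}] {percent}% ({done}/{total} features)"


if __name__ == "__main__":
    print(render_progress(1578, 3156))
    print(render_progress(3156, 3156))
```

Integer division keeps the bar from showing 100% before the last feature has actually been linked.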
Contract Extraction Progress
During OpenAPI contract extraction, progress is shown for each feature being processed.
Feature Validation
When you restart an import on an existing bundle, the command automatically validates existing features:
Automatic Validation
# First import
specfact code import my-project --repo .
# Later, restart import (validates existing features automatically)
specfact code import my-project --repo .
Validation Results
The command reports validation results:
Validating existing features...
All 3156 features validated successfully (source files exist)
Or, if issues are found:
Feature validation found issues: 3100/3156 valid, 45 orphaned, 11 invalid
Orphaned features (all source files missing):
- FEATURE-1234 (3 missing files)
- FEATURE-5678 (2 missing files)
...
Invalid features (some files missing or structure issues):
- FEATURE-9012 (1 missing file)
...
Tip: Use --revalidate-features to re-analyze features and fix issues
What Gets Validated
- Source file existence: Checks that all referenced implementation and test files still exist
- Feature structure: Validates that features have required fields (key, title, stories)
- Orphaned features: Detects features whose source files have been deleted
- Invalid features: Identifies features with missing files or structural issues
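The checks above can be sketched as a small classifier. This is an illustration under stated assumptions, not the command's implementation: features are modeled as plain dictionaries, and the `source_files` field, `ValidationReport`, and `validate_features` are hypothetical names introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path

# Required fields as listed in the guide.
REQUIRED_FIELDS = ("key", "title", "stories")


@dataclass
class ValidationReport:
    valid: list[str] = field(default_factory=list)
    orphaned: list[str] = field(default_factory=list)
    invalid: list[str] = field(default_factory=list)


def validate_features(features: list[dict], repo_root: Path) -> ValidationReport:
    """Classify each feature as valid, orphaned, or invalid."""
    report = ValidationReport()
    for feature in features:
        name = str(feature.get("key", "<missing key>"))
        files = feature.get("source_files", [])  # hypothetical field name
        missing = [f for f in files if not (repo_root / f).is_file()]
        has_fields = all(feature.get(k) for k in REQUIRED_FIELDS)
        if files and len(missing) == len(files):
            report.orphaned.append(name)   # every referenced file is gone
        elif missing or not has_fields:
            report.invalid.append(name)    # some files missing or bad structure
        else:
            report.valid.append(name)
    return report
```

In this sketch a feature counts as orphaned only when all of its files are missing, which matches the "all source files missing" wording in the sample output.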
Early Save Checkpoint
Features are saved immediately after the initial codebase analysis, before expensive operations like source tracking and contract extraction.
Benefits
- Resume capability: If the import is interrupted, you can restart without losing the initial analysis
- Data safety: Features are persisted early, reducing risk of data loss
- Faster recovery: No need to re-run the full codebase scan if interrupted
Example
# Start import
specfact code import my-project --repo .
# Output shows:
# Found 3156 features
# Saving features (checkpoint)...
# Features saved (can resume if interrupted)
# If you press Ctrl+C during source linking, you can restart:
specfact code import my-project --repo .
# The command will detect existing features and resume from checkpoint
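One common way to implement this kind of checkpoint is an atomic write followed by a "load if present" check on restart. The sketch below assumes a JSON file and uses hypothetical helper names (`save_checkpoint`, `load_checkpoint`, `import_features`); the real bundle format and resume logic may differ.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path
from typing import Callable


def save_checkpoint(features: list[dict], path: Path) -> None:
    """Write features atomically so a crash never leaves a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(features, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic rename over the final name
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_checkpoint(path: Path) -> list[dict] | None:
    """Return saved features, or None when no checkpoint exists yet."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None


def import_features(path: Path, analyze: Callable[[], list[dict]]) -> list[dict]:
    """Run the expensive analysis only when no checkpoint is available."""
    cached = load_checkpoint(path)
    if cached is not None:
        return cached            # resume: skip the full codebase scan
    features = analyze()
    save_checkpoint(features, path)  # persist before any later, slower stages
    return features
```

Writing to a temporary file and renaming it means an interruption such as Ctrl+C leaves either the old checkpoint or the new one on disk, never a truncated file.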
Performance Optimizations
The import command has been optimized for large codebases (3000+ features):
Pre-computed Caches
- AST Parsing: All files are parsed once before parallel processing
- File Hashes: All file hashes are computed once and cached
- Function Mappings: Function names are extracted once per file
Performance Improvements
- Before: ~34 features/minute (515 of 3156 features processed in 15 minutes)
- After: 200-500+ features/minute (5-15x faster)
- Large codebases: 3000+ features processed in 6-15 minutes (down from 90+ minutes)
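The figures above are consistent with each other, as this short check shows. It uses only the numbers quoted in this section; the variable and function names are illustrative.

```python
TOTAL_FEATURES = 3156

# Baseline: 515 features were processed in 15 minutes.
before_rate = 515 / 15  # features per minute


def minutes_at(rate: float, total: int = TOTAL_FEATURES) -> float:
    """Time in minutes to process `total` features at `rate` features/minute."""
    return total / rate


before_minutes = minutes_at(before_rate)   # full run at the old rate
after_slow = minutes_at(200)               # optimized, low end of the range
after_fast = minutes_at(500)               # optimized, high end of the range
speedup_low = 200 / before_rate
speedup_high = 500 / before_rate

if __name__ == "__main__":
    print(f"before: {before_rate:.1f} features/min, {before_minutes:.0f} min total")
    print(f"after:  {after_fast:.1f} to {after_slow:.1f} min total")
    print(f"speedup: {speedup_low:.1f}x to {speedup_high:.1f}x")
```

At the old rate a full run of 3156 features takes roughly 92 minutes; at 200 to 500 features per minute it takes roughly 6 to 16 minutes, a speedup of about 5.8x to 14.6x.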
How It Works
- Pre-computation phase: Single pass through all files to build caches
- Parallel processing: Uses cached results (no file I/O or AST parsing)
- Thread-safe: Read-only caches during parallel execution
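The two phases can be sketched as follows. This is a simplified illustration, not the command's code: `FileInfo`, `build_cache`, `link_feature`, and `link_all` are hypothetical names, and linking a feature by matching function names is an assumption made to keep the example small.

```python
from __future__ import annotations

import ast
import hashlib
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import Path
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class FileInfo:
    sha256: str
    functions: frozenset[str]


def build_cache(files: list[Path]) -> Mapping[Path, FileInfo]:
    """Phase 1: read, hash, and parse every file exactly once."""
    cache: dict[Path, FileInfo] = {}
    for path in files:
        data = path.read_bytes()
        try:
            tree = ast.parse(data)
        except (SyntaxError, ValueError):
            continue  # skip files that are not valid Python
        functions = frozenset(
            node.name
            for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        )
        cache[path] = FileInfo(hashlib.sha256(data).hexdigest(), functions)
    return MappingProxyType(cache)  # read-only view, safe to share across threads


def link_feature(wanted: frozenset[str], cache: Mapping[Path, FileInfo]) -> list[str]:
    """Phase 2 worker: cache lookups only, no file I/O and no parsing."""
    return sorted(str(p) for p, info in cache.items() if info.functions & wanted)


def link_all(
    features: dict[str, frozenset[str]],
    cache: Mapping[Path, FileInfo],
    workers: int = 4,
) -> dict[str, list[str]]:
    """Link every feature in parallel against the shared read-only cache."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda fns: link_feature(fns, cache), features.values())
        return dict(zip(features.keys(), results))
```

Because workers only read from the cache, no locks are needed. The read-only mapping and frozen dataclass make accidental mutation during the parallel phase fail loudly instead of corrupting shared state.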
Re-validation Flag
Use --revalidate-features to force re-analysis even if source files haven't changed.
When to Use
- Analysis improvements: When the analysis logic has been improved
- Confidence changes: When you want to re-evaluate features with a different confidence threshold
- File changes outside repo: When files were moved or renamed outside the repository
- Validation issues: When validation reports orphaned or invalid features
Example
# Re-analyze all features even if files unchanged
specfact code import my-project --repo . --revalidate-features
# Output shows:
# --revalidate-features enabled: Will re-analyze features even if files unchanged
What Happens
- Forces full codebase analysis regardless of incremental change detection
- Re-computes all feature mappings and source tracking
- Updates feature confidence scores based on current analysis logic
- Regenerates all contracts and relationships
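Conceptually, the flag bypasses the incremental change check. The sketch below assumes that change detection compares stored and current file hashes, which this guide does not state explicitly; `files_to_reanalyze` is a hypothetical helper.

```python
from __future__ import annotations


def files_to_reanalyze(
    current_hashes: dict[str, str],
    stored_hashes: dict[str, str],
    revalidate: bool = False,
) -> list[str]:
    """Return the files that need analysis in this run.

    Without the flag, only new or changed files are selected.
    With the flag, every file is selected regardless of its hash.
    """
    if revalidate:
        return sorted(current_hashes)
    return sorted(
        path
        for path, digest in current_hashes.items()
        if stored_hashes.get(path) != digest
    )


if __name__ == "__main__":
    stored = {"a.py": "h1", "b.py": "h2"}
    current = {"a.py": "h1", "b.py": "changed", "c.py": "h3"}
    print(files_to_reanalyze(current, stored))
    print(files_to_reanalyze(current, stored, revalidate=True))
```

This is why the flag helps after the analysis logic itself has improved: the files are unchanged, so a hash comparison alone would skip them.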
Best Practices
Large Codebases
For codebases with 1000+ features:
- Use partial analysis: Start with --entry-point to analyze one module at a time
- Monitor progress: Watch the progress bars to estimate completion time
- Use checkpoints: Let the early save checkpoint work for you; don't worry about interruptions
- Re-validate periodically: Use --revalidate-features after major code changes
Resuming Interrupted Imports
- Don't delete the bundle: The checkpoint is stored in the bundle directory
- Run the same command: Just re-run the import command; it will detect existing features
- Check validation: Review validation results to see if any features need attention
- Use re-validation if needed: If validation shows issues, use --revalidate-features
Performance Tips
- Exclude tests if not needed: Use --exclude-tests for faster processing (if test analysis isn't critical)
- Use entry points: For monorepos, analyze one project at a time with --entry-point
- Adjust confidence: Lower confidence (0.3-0.5) for faster analysis, higher (0.7-0.9) for more accurate results
Troubleshooting
Slow Linking
If source file linking is slow:
- Check file count: Large numbers of files (10,000+) will take longer
- Monitor progress: The progress bar shows current status
- Use entry points: Limit scope with --entry-point for faster processing
Validation Issues
If validation reports many orphaned features:
- Check file paths: Ensure source files haven't been moved
- Use re-validation: Run with --revalidate-features to fix mappings
- Review feature keys: Some features may need manual adjustment
Interrupted Imports
If the import is interrupted:
- Don't delete the bundle: The checkpoint is in .specfact/projects/<bundle-name>/
- Restart the command: Run the same import command; it will resume
- Check progress: Validation will show what was completed
Related Documentation
- Command Reference - Complete command documentation
- Quick Examples - Quick command examples
- Brownfield Engineer Guide - Complete brownfield workflow
- Common Tasks - Common import scenarios
Happy importing!