FAA Registry Ingestion CLI

Command-line tooling for the FAA aircraft registration ingestion pipeline. It covers dataset acquisition, normalization, repository persistence, export workflows, and search index synchronization.


Execution Pipeline

faa ingest
      ↓
Dataset Acquisition (Archive Download)
      ↓
Archive Extraction
      ↓
Record Parsing and Normalization
      ↓
Dataset Enrichment
      ↓
Repository Persistence (DynamoDB / Local JSON / S3)
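
The stage ordering above can be sketched as a chain of functions, each consuming the previous stage's output. This is a minimal illustration, not the AircraftOps implementation: every function name and the two-column sample data are hypothetical, and the download and extraction stages are stubbed with in-memory bytes so the sketch runs without network access.

```python
from typing import Callable, Iterable

# Hypothetical stand-ins for the pipeline stages. None of these names
# come from the AircraftOps codebase, and the sample rows are made up.

def acquire() -> bytes:
    """Stub for the archive download: returns raw bytes instead of fetching."""
    return b"N-NUMBER,NAME\n100  ,ACME AVIATION LLC   \n200  ,JANE DOE            \n"

def extract(archive: bytes) -> list[str]:
    """Stub for archive extraction: decode the payload and split it into lines."""
    return archive.decode("utf-8").splitlines()

def parse_and_normalize(lines: list[str]) -> list[dict]:
    """Parse comma-delimited rows and strip the padding from each field."""
    header, *rows = lines
    keys = [k.strip().lower().replace("-", "_") for k in header.split(",")]
    return [dict(zip(keys, (v.strip() for v in row.split(",")))) for row in rows]

def enrich(records: list[dict]) -> list[dict]:
    """Illustrative enrichment: derive a tail number with an N prefix."""
    return [{**r, "tail_number": "N" + r["n_number"]} for r in records]

def persist(records: list[dict], sinks: Iterable[Callable[[list[dict]], None]]) -> int:
    """Hand the processed records to every selected destination."""
    for sink in sinks:
        sink(records)
    return len(records)

def run_pipeline(sinks: Iterable[Callable[[list[dict]], None]]) -> int:
    return persist(enrich(parse_and_normalize(extract(acquire()))), sinks)

stored: list[dict] = []          # in-memory stand-in for a destination
count = run_pipeline([stored.extend])
```

Keeping each stage a plain function means a replay command could, in principle, reuse everything after the acquisition step with a different source.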

FAA CLI Command Reference

The FAA CLI module provides ingestion, replay, purge, and indexing commands for managing aircraft registration datasets within the AircraftOps ingestion framework.


ingest

Run the full FAA ingestion pipeline.

pipenv run python cli -m faa ingest [--save-dynamodb] [--save-local] [--save-s3]
  • Downloads aircraft registry datasets
  • Parses and normalizes registration records
  • Runs enrichment workflows
  • Writes processed datasets to selected destinations
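
As an illustration of how the three destination flags could map to a list of selected destinations, here is a minimal sketch using the standard-library argparse module. The parser layout, the `selected_destinations` helper, and the destination names are assumptions; this document does not show how the real CLI parses its arguments.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the flags listed for "ingest".
    parser = argparse.ArgumentParser(prog="faa")
    sub = parser.add_subparsers(dest="command", required=True)
    ingest = sub.add_parser("ingest", help="Run the full FAA ingestion pipeline")
    ingest.add_argument("--save-dynamodb", action="store_true")
    ingest.add_argument("--save-local", action="store_true")
    ingest.add_argument("--save-s3", action="store_true")
    return parser

def selected_destinations(args: argparse.Namespace) -> list[str]:
    """Return the destinations whose flag was passed, in a fixed order."""
    flags = {
        "dynamodb": args.save_dynamodb,
        "local": args.save_local,
        "s3": args.save_s3,
    }
    return [name for name, enabled in flags.items() if enabled]

args = build_parser().parse_args(["ingest", "--save-local", "--save-s3"])
destinations = selected_destinations(args)
```

In this sketch, passing no flag yields an empty list, so the pipeline would run without persisting anything. Whether the real command behaves that way is not stated here.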

ingest-local

Replay ingestion using locally stored FAA registry datasets.

pipenv run python cli -m faa ingest-local [--save-dynamodb] [--save-s3]

ingest-s3

Replay ingestion using registry datasets stored in object storage.

pipenv run python cli -m faa ingest-s3 [--save-dynamodb] [--save-local]

purge

Delete all FAA registry records from the primary datastore.

pipenv run python cli -m faa purge
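
Assuming the primary datastore is DynamoDB, a purge has to delete in batches, because the BatchWriteItem operation accepts at most 25 put or delete requests per call. The sketch below only builds the request batches and makes no AWS calls. The `n_number` key attribute and both helper names are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_BATCH = 25  # BatchWriteItem limit on requests per call

def chunked(keys: Iterable[dict], size: int = MAX_BATCH) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` keys."""
    iterator = iter(keys)
    while batch := list(islice(iterator, size)):
        yield batch

def build_delete_requests(batch: list[dict]) -> list[dict]:
    """Wrap each key in the DeleteRequest shape that BatchWriteItem expects."""
    return [{"DeleteRequest": {"Key": key}} for key in batch]

# 60 made-up primary keys, as a table scan might return them.
keys = ({"n_number": {"S": str(i)}} for i in range(60))
batches = [build_delete_requests(b) for b in chunked(keys)]
```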

typesense-sync

Scan FAA registry datasets and enqueue indexing jobs into the search indexing pipeline.

pipenv run python cli -m faa typesense-sync
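
The scan-and-enqueue pattern described above might look like the following sketch, which uses an in-memory `queue.Queue` as a stand-in for the real job queue. The job payload shape, the `enqueue_index_jobs` name, and the batch size are assumptions. The producer only enqueues work; a separate consumer would write to the search index.

```python
import json
import queue
from typing import Iterable

def enqueue_index_jobs(records: Iterable[dict], job_queue: queue.Queue,
                       batch_size: int = 2) -> int:
    """Group scanned records into jobs and put each job on the queue."""
    jobs = 0
    batch: list[str] = []
    for record in records:
        batch.append(record["n_number"])
        if len(batch) == batch_size:
            job_queue.put(json.dumps({"action": "upsert", "ids": batch}))
            jobs += 1
            batch = []
    if batch:  # flush the final partial batch
        job_queue.put(json.dumps({"action": "upsert", "ids": batch}))
        jobs += 1
    return jobs

q: queue.Queue = queue.Queue()
scanned = [{"n_number": n} for n in ("100", "200", "300")]  # made-up records
job_count = enqueue_index_jobs(scanned, q)
```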

Repository Layer

Persistence operations go through the AircraftOps repository layer, which each model exposes as a managed repository object:

FAARegistrationModel.objects_sync

The repository layer provides:

  • Batch persistence operations
  • Conditional updates and upserts
  • Partition-aware dataset scanning
  • Bulk deletion workflows
  • Pagination-aware record traversal
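
To illustrate pagination-aware traversal, the sketch below follows a cursor from page to page until the datastore stops returning one, in the style of DynamoDB's `LastEvaluatedKey`. It is not the `objects_sync` API, whose methods are not documented here. `iter_records` and `fake_fetch` are hypothetical, and the page fetcher is an in-memory fake.

```python
from typing import Callable, Iterator, Optional

def iter_records(fetch_page: Callable[[Optional[int]], dict]) -> Iterator[dict]:
    """Yield every record, requesting the next page until no cursor is returned."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["Items"]
        cursor = page.get("LastEvaluatedKey")
        if cursor is None:
            break

# Made-up data and a fake paginated backend that serves two records per page.
DATA = [{"n_number": str(n)} for n in range(5)]

def fake_fetch(cursor: Optional[int], page_size: int = 2) -> dict:
    start = cursor or 0
    end = start + page_size
    page = {"Items": DATA[start:end]}
    if end < len(DATA):
        page["LastEvaluatedKey"] = end  # more pages remain
    return page

records = list(iter_records(fake_fetch))
```

Because the traversal is a generator, callers can process one page's worth of records at a time instead of loading the whole registry into memory.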

Operational Notes

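  • A full registry ingestion can involve hundreds of thousands of records
  • Replay ingestion (ingest-local, ingest-s3) rebuilds the dataset deterministically from stored registry files
  • Search indexing is asynchronous: typesense-sync enqueues jobs, and the indexing pipeline processes them via queue-based fanout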
  • All ingestion workflows emit structured operational logs for audit tracking
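
As one possible shape for structured operational logs, the sketch below emits each log entry as a single JSON object using the standard-library logging module. The logger name, the event name, and the `context` field are assumptions, not the format AircraftOps actually emits.

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            **getattr(record, "context", {}),  # optional structured fields
        }
        return json.dumps(payload, sort_keys=True)

stream = io.StringIO()  # captured in memory here; a real handler would write elsewhere
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("faa.ingest.example")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("ingest.completed", extra={"context": {"records": 3, "destination": "local"}})
entry = json.loads(stream.getvalue())
```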