Normalizers Monitoring

Real-Time Base Offer Processing Status

How Normalization Works

Architecture, data flow, and supplier-specific processing

Data Flow & Collections

  • Input Collection: {supplier_code}_original_records (MongoDB)
    Raw supplier data as ingested from API/files
  • Processing Pipeline: Kafka Event → normalisation_worker.py → Supplier Normalizer
    Topic: atlas.raw.ingested
  • Success Output: {supplier_code}_god_instances
    Normalized hotel data (GOD schema)
  • Error Output: {supplier_code}_ingestion_errors
    Failed records with error details
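
As a minimal sketch of the naming convention above, the three per-supplier collection names can be derived from a single supplier code. The helper name `collection_names` and the dictionary keys are hypothetical and used only for illustration; only the name patterns themselves come from this page.

```python
def collection_names(supplier_code: str) -> dict[str, str]:
    """Derive the per-supplier collection names from the patterns listed above.

    The function name and the keys ("input", "success", "errors") are
    illustrative assumptions, not part of the actual service.
    """
    return {
        "input": f"{supplier_code}_original_records",
        "success": f"{supplier_code}_god_instances",
        "errors": f"{supplier_code}_ingestion_errors",
    }


names = collection_names("supplier_001")
print(names["input"])    # supplier_001_original_records
print(names["success"])  # supplier_001_god_instances
print(names["errors"])   # supplier_001_ingestion_errors
```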

Worker Orchestration

  • Main Worker: normalisation_worker.py
    services/normalization/src/
  • Kafka Topic: atlas.raw.ingested
    Listens for new ingested records
  • Factory Pattern: NormalizerFactory
    Selects normalizer by supplier_id
  • Parallelization: Multiple Workers
    High throughput processing

1. Kafka consumer receives event
2. NormalizerFactory.get_normalizer(supplier_id)
3. Normalizer.transform(raw_record)
4. Write to god_instances or ingestion_errors
5. Acknowledge Kafka message
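
The five steps above can be sketched as a single event handler. This is an illustration under stated assumptions, not the code in normalisation_worker.py: the event shape (`supplier_id` and `raw_record` keys), the `register` method, the in-memory `store` dictionary standing in for MongoDB, and the `ack` callback standing in for the Kafka acknowledgement are all hypothetical. It also assumes that supplier_id and supplier_code refer to the same value. Only `NormalizerFactory.get_normalizer(supplier_id)` and `transform(raw_record)` are named on this page.

```python
from typing import Any, Callable, Protocol


class Normalizer(Protocol):
    def transform(self, raw_record: dict[str, Any]) -> dict[str, Any]: ...


class NormalizerFactory:
    # Hypothetical registry; the real factory's internals are not shown here.
    _registry: dict[str, Callable[[], Normalizer]] = {}

    @classmethod
    def register(cls, supplier_id: str, ctor: Callable[[], Normalizer]) -> None:
        cls._registry[supplier_id] = ctor

    @classmethod
    def get_normalizer(cls, supplier_id: str) -> Normalizer:
        if supplier_id not in cls._registry:
            raise LookupError(f"no normalizer registered for {supplier_id!r}")
        return cls._registry[supplier_id]()


class UppercaseNameNormalizer:
    """Toy normalizer used only for this demonstration."""

    def transform(self, raw_record: dict[str, Any]) -> dict[str, Any]:
        return {"name": raw_record["name"].upper()}


def handle_event(
    event: dict[str, Any],
    store: dict[str, list],
    ack: Callable[[], None],
) -> None:
    """Process one event (step 1 is the caller receiving it from the consumer)."""
    supplier_id = event["supplier_id"]
    raw_record = event["raw_record"]
    try:
        normalizer = NormalizerFactory.get_normalizer(supplier_id)  # step 2
        god_instance = normalizer.transform(raw_record)             # step 3
        # Step 4, success path: write to the god_instances collection.
        store.setdefault(f"{supplier_id}_god_instances", []).append(god_instance)
    except Exception as exc:
        # Step 4, failure path: keep the failed record with its error details.
        store.setdefault(f"{supplier_id}_ingestion_errors", []).append(
            {"raw_record": raw_record, "error": repr(exc)}
        )
    ack()  # step 5: acknowledge only after the write has happened


NormalizerFactory.register("supplier_001", UppercaseNameNormalizer)
```

Acknowledging after the write, on both the success and the error path, means a crash mid-processing leaves the message unacknowledged so it can be redelivered. Whether the real worker handles redelivery this way is not stated on this page.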

Feed-Specific Quirks

supplier_001 Expedia (Rapid API)
  • Simple JSON structure with direct field mapping
  • Native image URLs, minimal transformation needed
  • Multi-language descriptions in single response
  • Small payloads, so processing is fast: ~3.5 KB per hotel on average
supplier_009 DOTW (TravelgateX)
  • Complex nested XML → JSON structure
  • Requires TGX unified code mapping (1.2M+ records)
  • Multiple translation steps for descriptions
  • Largest normalizer: 26 KB of transformation logic
supplier_hotelbeds Hotelbeds
  • Multiple API endpoints merged (hotels, content, images)
  • Rich amenity data requires OTA code mapping
  • Large payloads: typically 14 KB or more per hotel
  • Comprehensive coverage: ~800K hotels globally

Shared Utilities (10 Modules)

Core Extractors

  • description_extractor.py
    Multi-language description processing
  • images_extractor.py
    Image URL normalization & GridFS prep
  • nearby_locations_extractor.py
    POI and landmark extraction
  • policy_extractor.py
    Cancellation, payment policies

Field Mappers

  • amenity_mapper.py
    OTA amenity code standardization
  • location_mapper.py
    Geographic coordinate validation
  • contact_mapper.py
    Phone, email, website normalization

Orchestration & Building

  • field_orchestrator.py
    Field-level transformation pipeline coordinator
  • model_builder.py
    GOD instance schema construction
  • basic_fields.py
    Simple field copying utilities
All normalizers compose these utilities to transform raw supplier data into standardized GOD instances.
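
To illustrate what composing these utilities might look like, here is a small sketch in which a coordinator runs a list of per-field steps and drops fields that fail validation. Every function name, field name, and validation rule below is a hypothetical stand-in; the actual interfaces of field_orchestrator.py, location_mapper.py, contact_mapper.py, and basic_fields.py are not shown on this page.

```python
from typing import Any, Callable, Optional

RawRecord = dict[str, Any]
Extractor = Callable[[RawRecord], Optional[Any]]


def copy_name(raw: RawRecord) -> Optional[str]:
    """Simple field copy (in the spirit of basic_fields.py)."""
    name = raw.get("name")
    if isinstance(name, str) and name.strip():
        return name.strip()
    return None


def map_location(raw: RawRecord) -> Optional[dict[str, float]]:
    """Coordinate validation (in the spirit of location_mapper.py)."""
    try:
        lat, lon = float(raw["lat"]), float(raw["lon"])
    except (KeyError, TypeError, ValueError):
        return None
    if -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0:
        return {"latitude": lat, "longitude": lon}
    return None


def map_contact(raw: RawRecord) -> Optional[dict[str, str]]:
    """Email normalization (in the spirit of contact_mapper.py)."""
    email = raw.get("email")
    if isinstance(email, str) and "@" in email:
        return {"email": email.strip().lower()}
    return None


# Ordered (output field, extractor) pairs; a supplier normalizer could
# supply its own list to account for feed-specific quirks.
STEPS: list[tuple[str, Extractor]] = [
    ("name", copy_name),
    ("location", map_location),
    ("contact", map_contact),
]


def orchestrate(raw: RawRecord, steps: list[tuple[str, Extractor]] = STEPS) -> dict[str, Any]:
    """Run each step and keep only the fields that produced a value."""
    fields: dict[str, Any] = {}
    for field_name, extractor in steps:
        value = extractor(raw)
        if value is not None:
            fields[field_name] = value
    return fields


print(orchestrate({"name": " Hotel A ", "lat": "48.85", "lon": "2.35"}))
```

Keeping each extractor a plain function from raw record to optional value lets supplier normalizers reuse, reorder, or replace individual steps without touching the coordinator.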
