Normalizers Monitoring
Real-Time Base Offer Processing Status
How Normalization Works
Architecture, data flow, and supplier-specific processing
Data Flow & Collections
Input Collection
{supplier_code}_original_records (MongoDB)
Raw supplier data as ingested from API/files
Processing Pipeline
Kafka Event → normalisation_worker.py → Supplier Normalizer
Topic: atlas.raw.ingested
Success Output
{supplier_code}_god_instances
Normalized hotel data (GOD schema)
Error Output
{supplier_code}_ingestion_errors
Failed records with error details
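The collection layout above can be sketched in a few lines. This is only an illustration: the collection name suffixes come from this page, while `collection_names`, `route_result`, and the shape of the error document are hypothetical and not taken from the actual worker code.

```python
from __future__ import annotations

from typing import Any, Callable


def collection_names(supplier_code: str) -> dict[str, str]:
    """Build the three per-supplier collection names listed on this page."""
    return {
        "input": f"{supplier_code}_original_records",
        "success": f"{supplier_code}_god_instances",
        "error": f"{supplier_code}_ingestion_errors",
    }


def route_result(
    supplier_code: str,
    raw_record: dict[str, Any],
    transform: Callable[[dict[str, Any]], dict[str, Any]],
) -> tuple[str, dict[str, Any]]:
    """Return (target collection name, document to write) for one raw record.

    A successful transform goes to the god_instances collection; any
    exception is captured and routed to ingestion_errors instead.
    """
    names = collection_names(supplier_code)
    try:
        return names["success"], transform(raw_record)
    except Exception as exc:
        # Assumed error document shape, for illustration only.
        error_doc = {
            "raw_record": raw_record,
            "error_type": type(exc).__name__,
            "error_message": str(exc),
        }
        return names["error"], error_doc
```

In a real deployment the returned name would be used to pick the MongoDB collection to insert into; the sketch stops short of any database call so it can run on its own.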
Worker Orchestration
Main Worker
normalisation_worker.py
services/normalization/src/
Kafka Topic
atlas.raw.ingested
Listens for new ingested records
Factory Pattern
NormalizerFactory
Selects normalizer by supplier_id
Parallelization
Multiple Workers
High throughput processing
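A minimal sketch of how a worker might tie these pieces together, assuming a registry-based factory. Only `NormalizerFactory.get_normalizer(supplier_id)` and `transform(raw_record)` are named on this page; the `register` decorator, `ExpediaNormalizer`, `handle_event`, and the event shape are hypothetical. Plain lists stand in for the MongoDB collections, and the `ack` callback stands in for acknowledging the Kafka message.

```python
from __future__ import annotations

from typing import Callable


class NormalizerFactory:
    """Maps a supplier_id to a normalizer class (registry is an assumption)."""

    _registry: dict[str, type] = {}

    @classmethod
    def register(cls, supplier_id: str):
        def decorator(normalizer_cls: type) -> type:
            cls._registry[supplier_id] = normalizer_cls
            return normalizer_cls
        return decorator

    @classmethod
    def get_normalizer(cls, supplier_id: str):
        try:
            return cls._registry[supplier_id]()
        except KeyError:
            raise ValueError(f"No normalizer registered for {supplier_id!r}") from None


@NormalizerFactory.register("supplier_001")
class ExpediaNormalizer:
    """Hypothetical normalizer with a direct field mapping."""

    def transform(self, raw_record: dict) -> dict:
        return {"supplier_id": "supplier_001", "name": raw_record["name"]}


def handle_event(
    event: dict,
    god_instances: list,
    ingestion_errors: list,
    ack: Callable[[], None],
) -> None:
    """Process one ingested-record event, then acknowledge it."""
    try:
        normalizer = NormalizerFactory.get_normalizer(event["supplier_id"])
        god_instances.append(normalizer.transform(event["raw_record"]))
    except Exception as exc:
        # Failures are recorded rather than re-raised, so the message
        # is still acknowledged and does not block the partition.
        ingestion_errors.append({"event": event, "error": repr(exc)})
    ack()
```

Acknowledging after the write (rather than before) means a crash mid-transform leads to redelivery instead of a lost record; whether the real worker makes the same choice is not stated here.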
1. Kafka consumer receives event
2. NormalizerFactory.get_normalizer(supplier_id)
3. Normalizer.transform(raw_record)
4. Write to god_instances or ingestion_errors
5. Acknowledge Kafka message
Feed-Specific Quirks
supplier_001 Expedia (Rapid API)
- Simple JSON structure with direct field mapping
- Native image URLs, minimal transformation needed
- Multi-language descriptions in single response
- Fast processing: small records, ~3.5 KB per hotel on average
supplier_009 DOTW (TravelgateX)
- Complex nested XML → JSON structure
- Requires TGX unified code mapping (1.2M+ records)
- Multiple translation steps for descriptions
- Largest normalizer: 26 KB of transformation logic
supplier_hotelbeds Hotelbeds
- Multiple API endpoints merged (hotels, content, images)
- Rich amenity data requires OTA code mapping
- Large file sizes: typically 14 KB+ per hotel
- Comprehensive coverage: ~800K hotels globally
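As one illustration of a feed-specific step, the Hotelbeds merge of several endpoints could look roughly like the sketch below. The function name `merge_hotel_feeds` and the field names (`code`, `description`, `url`) are assumptions chosen for the example and are not taken from the Hotelbeds API or the actual normalizer.

```python
def merge_hotel_feeds(hotels, content, images):
    """Join three endpoint responses into one record per hotel code.

    Each argument is a list of dicts carrying a shared "code" key
    (hypothetical field name). Content and image rows that reference an
    unknown hotel code are ignored.
    """
    merged = {}
    for hotel in hotels:
        merged[hotel["code"]] = {**hotel, "content": None, "images": []}
    for item in content:
        if item["code"] in merged:
            merged[item["code"]]["content"] = item.get("description")
    for image in images:
        if image["code"] in merged:
            merged[image["code"]]["images"].append(image["url"])
    return merged
```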
Shared Utilities (10 Modules)
Core Extractors
- description_extractor.py: Multi-language description processing
- images_extractor.py: Image URL normalization & GridFS prep
- nearby_locations_extractor.py: POI and landmark extraction
- policy_extractor.py: Cancellation, payment policies
Field Mappers
- amenity_mapper.py: OTA amenity code standardization
- location_mapper.py: Geographic coordinate validation
- contact_mapper.py: Phone, email, website normalization
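Geographic coordinate validation of the kind attributed to location_mapper.py might look like the sketch below. The function name `validate_coordinates` and its return convention are hypothetical; the latitude and longitude bounds are the standard ones.

```python
import math
from typing import Optional, Tuple


def validate_coordinates(lat, lon) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) as floats if both are usable, otherwise None.

    Accepts numbers or numeric strings, since supplier feeds often
    deliver coordinates as text.
    """
    try:
        lat_f, lon_f = float(lat), float(lon)
    except (TypeError, ValueError):
        return None
    # Reject NaN and infinity, which float() accepts from strings.
    if not (math.isfinite(lat_f) and math.isfinite(lon_f)):
        return None
    if -90.0 <= lat_f <= 90.0 and -180.0 <= lon_f <= 180.0:
        return lat_f, lon_f
    return None
```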
Orchestration & Building
- field_orchestrator.py: Field-level transformation pipeline coordinator
- model_builder.py: GOD instance schema construction
- basic_fields.py: Simple field copying utilities
All normalizers compose these utilities to transform raw supplier data into standardized GOD instances
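The composition described above can be sketched as a list of small field steps applied in order. Everything in this example is hypothetical (the step functions, `run_field_pipeline`, and the output field names); it only illustrates the pattern, not the actual field_orchestrator.py or the GOD schema.

```python
from typing import Callable, Dict, List

# A field step takes the raw supplier record and returns a partial result.
FieldStep = Callable[[dict], dict]


def copy_basic_fields(raw: dict) -> dict:
    return {"name": raw.get("name"), "supplier_hotel_id": raw.get("id")}


def extract_description(raw: dict) -> dict:
    return {"descriptions": {"en": raw.get("description", "")}}


def extract_images(raw: dict) -> dict:
    # Keep only values that look like absolute HTTP(S) URLs.
    urls = [u for u in raw.get("images", []) if u.startswith(("http://", "https://"))]
    return {"images": urls}


def run_field_pipeline(raw: dict, steps: List[FieldStep]) -> Dict[str, object]:
    """Apply each step to the raw record and merge the partial results."""
    instance: Dict[str, object] = {}
    for step in steps:
        instance.update(step(raw))
    return instance
```

A supplier normalizer would then differ mainly in which steps it lists and in what order, which keeps supplier-specific code small.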