What does the architecture of a production-grade talent intelligence platform actually look like? This post breaks it down layer by layer — from the user-facing channels all the way to containerized cloud infrastructure.
Modern talent platforms face a unique intersection of engineering challenges: high-cardinality search across millions of candidate profiles, real-time recruiter dashboards, multi-channel data ingestion from ATS systems and social platforms, and the need to match job descriptions against resumes with semantic precision.
01 — Application Architecture Overview
The platform is structured around four clean horizontal layers. Each layer builds strictly on the one below, with higher layers consuming services provided by lower ones — a discipline that keeps domain logic separate from shared infrastructure.
A critical design decision is the explicit split within Application Services between platform-specific components and reusable components. Taxonomy Manager, Connectors, and Profiles are architected as standalone building blocks — meaning they could be extracted and deployed in a different product context without modification.
02 — Technical Architecture
At the core sit three components: a Flask-based application server, Elasticsearch serving as both search engine and primary database, and a pluggable resume/job description parsing engine.
Data Flow at a Glance
[Diagram: data flow. Surviving labels: Taxonomy, App Server, Index, Data, Storage.]
Elasticsearch as the Primary Database
This is the boldest architectural choice: the platform uses Elasticsearch not as a secondary search layer bolted on top of a relational database, but as the primary database. This works because talent matching is fundamentally a search problem — filtering candidates by skills, matching job descriptions to profiles, and navigating hierarchical taxonomies all map naturally to ES's inverted index.
The tradeoff is real: ES lacks the multi-document transactional guarantees of a relational database. For a read-heavy, search-intensive workload, though, the performance and scalability gains outweigh that cost.
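To make the "matching is a search problem" point concrete, here is a minimal sketch of how a candidate search might be expressed as an Elasticsearch bool query. The function name and the field names (skills, resume_text) are assumptions for illustration; the post does not describe the platform's actual index mapping.

```python
from typing import Any, Dict, List


def build_candidate_query(
    required_skills: List[str],
    job_description: str,
    size: int = 10,
) -> Dict[str, Any]:
    """Build an Elasticsearch query body for candidate matching.

    Hypothetical mapping: "skills" is a keyword field and
    "resume_text" is an analyzed full-text field.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter clauses are exact, unscored matches: a candidate
                # must have every required skill.
                "filter": [
                    {"term": {"skills": skill}} for skill in required_skills
                ],
                # The must clause contributes the relevance score, ranking
                # candidates by how well their resume text matches the JD.
                "must": [{"match": {"resume_text": job_description}}],
            }
        },
    }
```

A body like this would typically be handed to a search call in an Elasticsearch client. Hard requirements go in the filter clause and ranking goes in the scored clause, so both are served by the same index.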
External Integrations
- Taleo
- Jobscience
- Other Sources (Adaptable)
Parsing & Analytics
- Sovren Resume/JD Parser
- Parser Integration (Replaceable)
- Kibana Dashboards
- Azure Blob (file storage)
03 — Deployment Pipeline
The platform follows a three-stage deployment model (Development → Staging → Production), fully containerized with Docker and orchestrated via Azure Kubernetes Service (AKS).
Development
Local development uses Docker Compose to orchestrate all services (the Flask app, Elasticsearch, and Kibana) in a coordinated local environment. A single docker-compose up brings up the same set of services that runs in production, which greatly reduces environment drift.
Production on AKS
Production runs on Azure Kubernetes Service within a VPN/NSG-protected network boundary. Two container groups form the production cluster:
Elasticsearch Containers
- Multiple ES nodes (horizontally scalable)
- Kibana on whitelisted IPs only
- Azure Disk persistent storage
API / Frontend Servers
- Flask + uWSGI app containers
- Nginx as reverse proxy
- Azure Load Balancer (public-facing)
By containerizing every component — from the app server to Elasticsearch — the platform achieves environment parity across all three deployment stages. What runs locally is exactly what runs in production.
04 — Authentication & Security
The platform supports username/password login and external OAuth (Google, LinkedIn, Facebook), issuing HS256-signed JWTs with configurable validity windows. Tokens are stateless: no session state is stored server-side, so any app server replica holding the signing secret can validate any request. A decoded token payload looks like this:
{
"exp": 1515044478, // token expiration timestamp
"iat": 1515044178, // issued at
"nbf": 1515044178, // not valid before
"identity": "AWC_qrkjqSWdEnD3N7H_" // opaque user ID
}
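The sample payload above has a 300-second validity window (exp minus iat). The following is a minimal, standard-library-only sketch of how an HS256 token with those claims could be issued and verified. It is illustrative rather than the platform's actual code: the function names, the 300-second default, and the optional now parameter are assumptions, and a production service would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(identity: str, secret: bytes,
                validity_seconds: int = 300,
                now: Optional[int] = None) -> str:
    """Issue a token carrying the exp/iat/nbf/identity claims shown above."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "exp": issued_at + validity_seconds,
        "iat": issued_at,
        "nbf": issued_at,
        "identity": identity,
    }
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes,
                 now: Optional[int] = None) -> Dict[str, Any]:
    """Return the claims if the token is authentic and currently valid.

    Raises ValueError otherwise. Needs only the shared secret, so any
    replica can run it without a session store.
    """
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = segments
    expected = _sign(header_b64 + "." + payload_b64, secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload_b64))
    current = int(time.time()) if now is None else now
    if current < claims["nbf"] or current >= claims["exp"]:
        raise ValueError("token not valid at this time")
    return claims
```

Because verification depends only on the token and the secret, adding app server replicas needs no coordination beyond distributing that secret.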
05 — Technology Stack
Backend
- Python 3.6
- Flask 0.12.2
- uWSGI + Nginx
- Sovren (parsing)
Frontend
- React 16.2.0
- Single-page application
- Web + Mobile responsive
Data Layer
- Elasticsearch 6.1.1
- Kibana 6.1.1
- Azure Disk (persistence)
- Azure Blob (files)
Infrastructure
- Docker + Docker Compose
- Azure Kubernetes Service (production)
- Azure Container Registry
06 — Key Design Principles
Search-first data modeling. When your core workload is search — filtering, matching, ranking — consider whether Elasticsearch as your primary store makes more sense than adding it as a secondary layer on top of a relational database.
Replaceability by design. Marking integration points as "Replaceable" isn't just documentation — it's an architectural contract that signals to every future engineer that these boundaries are extension points.
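One way such a contract could be expressed in code is an abstract interface that application code depends on, with each parser vendor hidden behind its own adapter. The sketch below is an assumption-laden illustration: ResumeParser, KeywordParser, and index_document are hypothetical names, and the toy keyword matcher merely stands in for a real parsing engine such as Sovren.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class ResumeParser(ABC):
    """Hypothetical extension point: every parser adapter implements this."""

    @abstractmethod
    def parse(self, raw_text: str) -> Dict[str, List[str]]:
        """Return structured fields extracted from raw resume text."""


class KeywordParser(ResumeParser):
    """Toy stand-in for a real engine: matches tokens against known skills."""

    def __init__(self, known_skills: Iterable[str]) -> None:
        self._known = {skill.lower() for skill in known_skills}

    def parse(self, raw_text: str) -> Dict[str, List[str]]:
        # Normalize tokens by stripping common punctuation and lowercasing.
        tokens = {tok.strip(".,;:()").lower() for tok in raw_text.split()}
        return {"skills": sorted(tokens & self._known)}


def index_document(parser: ResumeParser, raw_text: str) -> Dict[str, List[str]]:
    """Application code sees only the interface, never a concrete vendor."""
    return parser.parse(raw_text)
```

Swapping the parsing engine then means writing one new ResumeParser subclass and changing which instance is passed in. The calling code stays untouched.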
Reusable components as first-class citizens. Separating reusable components from platform-specific ones means the next product built on this foundation gets those capabilities for free.
Stateless authentication at scale. JWT-based auth with configurable validity windows is a clean, horizontally scalable approach that requires no shared session store across replicas.
The most interesting architectural bets here are using Elasticsearch as a primary database and containerizing everything from day one. Both are tradeoffs — but for a search-heavy platform deployed on cloud infrastructure, they're the right tradeoffs to make.