TQPro Multi-Tenancy Architecture Plan¶
Context¶
TQPro is currently a single-tenant application. All users, regardless of company, share the same database, same schemas, same caches, same external system credentials, and same file storage. The only hint of organizational grouping is CRegUser.companyId (mapped to Odoo's customer_account_id), but nothing enforces data isolation based on it.
Goal: Enable multiple independent travel agencies (potentially competitors) to use the same TQPro platform with full logical data isolation — each tenant's customers, bookings, products, documents, and conversations are invisible to other tenants.
Current State Summary¶
| Concern | Status | Key File |
|---|---|---|
| Tenant identity in JWT | Missing | tqapi/.../oidc/ValidatedToken.java |
| Tenant in request context | Missing | tqcommon/.../util/RequestContext.java |
| Tenant filtering in DB queries | None | tqapp/.../nts/service/NTSEntityReadService.java |
| Database isolation | Single DB tlinq, 5 schemas | tqapp/.../nts/db/NTSDBSession.java |
| Cache isolation | Global singletons | tqapp/.../cache/StaticMapCache.java |
| S3 document paths | No tenant prefix | tqapp/.../media/BookingDocumentStorageService.java |
| External system creds | Global singletons | tqodoo/.../OdooClientConfig.java, tqamds/.../AmadeusClientConfig.java |
| WhatsApp conversations | Keyed by phone only, no tenant | config/db-changes/0070-conversation-messages.sql |
| Keycloak realm | Single realm, no tenant claim | config/tlinqapi.properties |
| API authorization | Global roles (guest/agent/admin) | config/api-roles.properties |
| Configuration management | All configs are global singletons loaded from files | AppConfig, NTSClientConfig, OdooClientConfig, etc. |
Current Database Schema Layout (single DB tlinq)¶
| Schema | Purpose | JPA Entities |
|---|---|---|
| nts | Core business data (80+ tables: bookings, cruises, hotels, groups, visa, marketing, tripmaker, offline tickets) | Hardcoded in NTSDBSession; no @Table(schema=...) (uses default) |
| amadeus | Airport cache | @Table(name="airport", schema="amadeus") in AirportEntity |
| goglobal | GoGlobal hotel/city/country data | @Table(schema="goglobal", catalog="tlinq") in GG* entities |
| tiqets | Tiqets orders, products, tags | @Table(schema="tiqets", catalog="tlinq") in Tiqets* entities |
| tqwa | WhatsApp messages, conversations, broadcasts | Used by Python service; separate DB user wa_service |
Configuration Landscape¶
The platform loads configuration from files under TLINQ_HOME via several singleton classes. Every one of these singletons is loaded once at startup and shared globally — there is no per-tenant config resolution.
Configuration Loading Hierarchy¶
TLINQ_HOME/
├── tourlinq-config.xml ← ClientConfig singleton (JAXB)
│ └── entities/*.xml ← 19 XInclude entity mapping files
├── tourlinq.properties ← AppConfig singleton (Properties)
├── properties.d/ ← AppConfig (alphabetically merged overrides)
│ ├── erp-booking.properties
│ └── messaging.properties
├── nts-client.xml ← NTSClientConfig singleton (JAXB)
├── odoo-client.properties ← OdooClientConfig singleton
├── odoo-server.properties ← OdooClientConfig singleton
├── amadeus-client.xml ← AmadeusClientConfig singleton
├── amadeus.idfile ← Amadeus credentials
├── goglobal-client.xml ← GoGlobalClientConfig singleton
├── tiqets-client.xml ← TiqetsClientConfig singleton
├── rayna-client.xml ← RaynaClientConfig singleton
├── googleflights-client.xml ← GoogleFlightsClientConfig singleton
├── tlinqapi.properties ← TQProApiServer (API ports, auth, CORS)
├── api-roles.properties ← ApiRoleManager (endpoint→role mappings)
├── hazelcast.xml ← TlinqClusterCache (cluster config)
└── log.properties ← Java logging
Configuration Classification: Shared vs. Per-Tenant¶
| Config File | Loader Class | Classification | Reason |
|---|---|---|---|
| PLATFORM-LEVEL (shared across all tenants) | | | |
| tourlinq-config.xml: entity definitions, factory mappings | ClientConfig | Shared | Entity model & service wiring is identical for all tenants |
| config/entities/*.xml: 19 entity mapping files | ClientConfig (XInclude) | Shared | Same DB schema structure per tenant DB |
| nts-client.xml: service class mappings | NTSClientConfig | Shared | Service implementations are code, not data |
| odoo-client.properties: Odoo service-to-model mappings | OdooClientConfig | Shared | Structural mappings are the same; credentials differ |
| api-roles.properties: endpoint→role authorization | ApiRoleManager | Shared | Same API surface for all tenants |
| tlinqapi.properties: server ports, OIDC issuer, CORS | TQProApiServer | Shared | Single JVM serves all tenants |
| hazelcast.xml: cluster topology | TlinqClusterCache | Shared | One cluster; tenant isolation via key prefixing |
| log.properties: logging format | Java Logging | Shared | Tenant context added via MDC/RequestContext |
| TENANT-LEVEL (must differ per tenant) | | | |
| tourlinq.properties: API keys, company info, mail, payment gateway | AppConfig | Per-tenant | Contains credentials, company identity, business rules |
| properties.d/messaging.properties: Twilio SID/token, WA config | AppConfig | Per-tenant | Each tenant has own Twilio account, WA phone |
| properties.d/erp-booking.properties: ERP defaults, agency info, charge rates | AppConfig | Per-tenant | Agency name, payment channels, service charge rates |
| odoo-server.properties: Odoo URL, DB, user, password | OdooClientConfig | Per-tenant | Each tenant has own Odoo instance |
| amadeus-client.xml: Amadeus API credentials | AmadeusClientConfig | Per-tenant | Each tenant has own Amadeus contract |
| amadeus.idfile: Amadeus credential file | Amadeus plugin | Per-tenant | Credential file per tenant |
| goglobal-client.xml: GoGlobal API URL, agency ID, credentials | GoGlobalClientConfig | Per-tenant | Each tenant has own GoGlobal account |
| tiqets-client.xml: Tiqets API key, JWT key file | TiqetsClientConfig | Per-tenant | Each tenant has own Tiqets agreement |
| rayna-client.xml: Rayna agent ID, API tokens | RaynaClientConfig | Per-tenant | Each tenant has own Rayna B2B account |
| googleflights-client.xml: RapidAPI key | GoogleFlightsClientConfig | Per-tenant | API key is per-account |
| tourlinq-config.xml: database connection URLs/credentials | ClientConfig | Per-tenant | Each tenant points to own DB |
| hibernate.cfg.xml: JDBC URL (hardcoded localhost:5432/tlinq) | Hibernate | Per-tenant | Different DB per tenant |
Specific Per-Tenant Properties (from tourlinq.properties + properties.d/)¶
Identity & Branding:
- company.code, tqpro.company.name, tqpro.company.logo.url, tqpro.company.contact.1/2
- tqpro.agency.name, tqpro.agency.phone, tqpro.agency.email
Mail/SMTP:
- mail.server, mail.port, mail.user, mail.password, mail.from, mail.name, mail.usetls
Payment Gateway (Telr):
- telr.pgw-url, telr.auth-key, telr.store-id, telr.test-mode
- telr.link-success, telr.link-cancel, telr.link-declined
- pgw.callback-base, pgw.public-site-url
Messaging/Twilio/WhatsApp:
- twilio.sid, twilio.token, twilio.sms.sender, twilio.wa.sender
- broadcast.wa.provider, broadcast.wa.media.cdn.prefix, broadcast.wa.shortlink
- whatsapp.python.service.url, whatsapp.python.service.api-key
- manager.sms, admin.sms
3rd Party API Keys:
- rapidapi.visa.key, tiqets.api.key, goglobal.api.password, gf.rapidapi.key
- pexels.api.key, ai.api.key
- ryb2b.server, ryb2b.token
ERP/Business Rules:
- erp.payment.channel.*, erp.default.product.*
- service.charge.* (per payment method rates)
- hotel.margin, service.default.hotel.checkInTime, etc.
Storage Paths:
- content.directory, content.urlprefix, content.cdn-prefix
- offline-tickets.storage-path
Database:
- tlinq.dbname, tlinq.dbpass
Recommended Strategy: Database-Per-Tenant (Same PG Server)¶
Why not shared-DB with tenant_id column (row-level)?¶
- 100+ JPA entities with a generic, reflection-based query framework (EntityFacade + NTSClientService.querySearchS()). Injecting AND tenant_id = ? into every dynamically-built query path is fragile: one missed filter = cross-tenant data leak.
Why not schema-per-tenant?¶
- The DB already has 5 distinct schemas (nts, amadeus, goglobal, tiqets, tqwa) with hardcoded @Table(schema=...) annotations in JPA entities. Duplicating all 5 schemas per tenant and rewriting annotations is messy and error-prone.
Why database-per-tenant is the best fit:¶
- Each tenant gets a complete copy of the DB with all 5 schemas intact: the nts, amadeus, goglobal, tiqets, tqwa schema names stay identical
- Zero JPA annotation changes: all @Table(schema="goglobal") references work as-is
- Strongest isolation: a programming error cannot leak data across databases; PostgreSQL enforces DB boundaries at the protocol level
- Clean tenant provisioning: pg_dump/pg_restore of the template DB creates a new tenant database in one scripted step
- Per-tenant DB users: each tenant's DB can have its own credentials, enabling PG-level access control
- Hibernate supports this natively: MultiTenancyStrategy.DATABASE with MultiTenantConnectionProvider
- Same PG server: no network overhead; managed PG services (RDS, Cloud SQL) support multiple DBs on one instance
- Connection pool sizing: 5-50 tenants with 2-5 connections each = 10-250 connections total per application node. The upper end exceeds PostgreSQL's default max_connections of 100, so that setting must be raised as the tenant count grows
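The pool arithmetic above can be checked with a small helper. This is a minimal sketch, not part of the plan: the class name PoolBudget, the appNodes multiplier, and the reserved-connection parameter are assumptions introduced for illustration.

```java
/** Hypothetical helper for sanity-checking the connection budget. */
final class PoolBudget {
    private PoolBudget() {}

    /** Worst case: every tenant pool on every application node is at its maximum size. */
    static int worstCase(int tenants, int maxPoolSizePerTenant, int appNodes) {
        return Math.multiplyExact(Math.multiplyExact(tenants, maxPoolSizePerTenant), appNodes);
    }

    /** True if the worst case fits under max_connections after reserving some for other clients. */
    static boolean fits(int worstCase, int pgMaxConnections, int reservedConnections) {
        return worstCase <= pgMaxConnections - reservedConnections;
    }
}
```

With 50 tenants and a maximum pool size of 5 on one node, worstCase returns 250, which does not fit under a max_connections of 100.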
Phased Implementation¶
Phase 0: Tenant Identity Infrastructure (Foundation)¶
Effort: ~2-3 weeks | Risk: Low
Goal: Establish tenant identity flowing from JWT through the entire request lifecycle to the DB session, without changing any business logic. Deploy with a single tenant to validate the plumbing.
0.1 Tenant Registry Database & Table¶
Create a platform-level database tqplatform with a tenant registry:
CREATE TABLE tenant (
    tenant_id              VARCHAR(36) PRIMARY KEY,       -- UUID
    tenant_code            VARCHAR(50) UNIQUE NOT NULL,   -- e.g. "acme-travel" (also the Keycloak realm name)
    tenant_name            VARCHAR(200) NOT NULL,
    db_name                VARCHAR(63) NOT NULL,          -- e.g. "tlinq_acme"
    db_user                VARCHAR(63),                   -- optional per-tenant DB user
    db_pass                VARCHAR(200),                  -- encrypted
    kc_realm               VARCHAR(63) NOT NULL,          -- Keycloak realm name (= tenant_code)
    kc_admin_client_secret VARCHAR(200),                  -- encrypted, per-realm service account secret
    status                 VARCHAR(20) DEFAULT 'ACTIVE',  -- ACTIVE, SUSPENDED, DEPROVISIONED
    config                 JSONB DEFAULT '{}',            -- per-tenant config overrides
    created_at             TIMESTAMP DEFAULT NOW()
);
-- First tenant = existing realm & database
INSERT INTO tenant (tenant_id, tenant_code, tenant_name, db_name, kc_realm)
VALUES ('00000000-0000-0000-0000-000000000001', 'tqpro-adm', 'Default Agency', 'tlinq', 'tqpro-adm');
The config JSONB column holds per-tenant overrides (Odoo creds, Amadeus keys, S3 prefix, WA phone ID, branding, etc.) — extensible without DDL changes.
0.2 Keycloak: Realm-Per-Tenant Architecture¶
Why realm-per-tenant (not single realm with tenant_id attribute):
- Keycloak users belong to a realm. A single shared realm means all tenants' users are in one pool — Keycloak's admin console would expose users across tenants, and user attribute-based isolation is fragile.
- Competing agencies need complete user isolation: independent password policies, independent login branding/themes, independent identity providers (e.g., tenant A uses Google SSO, tenant B uses SAML).
- The staff management plan (doc/plans/management/staff-management-plan.md) has the tenant admin managing users from inside TQPro via the Keycloak Admin REST API. With realm-per-tenant, the admin API calls are scoped to the tenant's realm — no risk of cross-tenant user visibility.
Architecture:
Keycloak Server
├── master realm (platform admin only)
│ └── Platform service account: tqpro-platform-admin
│ (has create-realm permission — used ONLY for tenant provisioning)
│
├── tqpro-adm realm (first/default tenant — existing)
│ ├── Client: tqweb-adm (public, OIDC login for frontend)
│ ├── Client: tqpro-admin-api (confidential, service account for user CRUD)
│ ├── Realm roles: guest, agent, manager, finance, admin
│ └── Users: existing users
│
├── acme-travel realm (tenant 2 — created at onboarding)
│ ├── Client: tqweb-adm (same config, different realm)
│ ├── Client: tqpro-admin-api (confidential, service account)
│ ├── Realm roles: guest, agent, manager, finance, admin
│ └── Users: acme-travel's users
│
└── ... more tenant realms
Tenant identity derivation from JWT:
The iss (issuer) claim in every JWT contains the realm: https://auth.tourlinq.com/realms/acme-travel. The tenant is resolved by extracting the realm name from the issuer URL and looking it up in tqplatform.tenant.kc_realm.
No tenant_id user attribute needed — the realm IS the tenant boundary.
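Realm extraction from the issuer can be sketched as below. The class name IssuerRealm and the strict prefix check against the Keycloak base URL are assumptions for illustration; the plan only states that the realm name is taken from the issuer URL.

```java
import java.util.Optional;

/** Hypothetical helper: derives the realm name from a JWT issuer claim. */
final class IssuerRealm {
    private IssuerRealm() {}

    /**
     * Returns the realm if the issuer is exactly {keycloakBaseUrl}/realms/{realm}, otherwise empty.
     * The prefix check rejects issuers hosted on any other server.
     */
    static Optional<String> extract(String issuer, String keycloakBaseUrl) {
        if (issuer == null || keycloakBaseUrl == null) {
            return Optional.empty();
        }
        String base = keycloakBaseUrl.endsWith("/")
                ? keycloakBaseUrl.substring(0, keycloakBaseUrl.length() - 1)
                : keycloakBaseUrl;
        String prefix = base + "/realms/";
        if (!issuer.startsWith(prefix)) {
            return Optional.empty();
        }
        String realm = issuer.substring(prefix.length());
        // The realm must be a single, non-empty path segment.
        if (realm.isEmpty() || realm.contains("/") || realm.contains("?") || realm.contains("#")) {
            return Optional.empty();
        }
        return Optional.of(realm);
    }
}
```

The returned realm name would then be looked up in tqplatform.tenant.kc_realm; an empty result means the token should be rejected before any JWKS is fetched.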
0.3 Multi-Realm JWT Validation¶
The current JWTValidator is hardcoded to a single OIDCConfig (one issuer, one JWKS endpoint). For realm-per-tenant, it needs to accept tokens from any registered tenant realm.
tqapi/src/main/java/com/perun/tlinq/oidc/JWTValidator.java — refactor:
public class JWTValidator {
    // One processor per realm, lazily created
    private final ConcurrentHashMap<String, ConfigurableJWTProcessor<SecurityContext>> processors
            = new ConcurrentHashMap<>();
    private final String keycloakBaseUrl; // e.g. "https://auth.tourlinq.com"
    private final String clientId;        // same client ID across all realms

    public ValidatedToken validateToken(String token) throws TokenValidationException {
        // parse() and process() throw checked exceptions that must be mapped to TokenValidationException
        SignedJWT signedJWT = SignedJWT.parse(token);
        JWTClaimsSet unverifiedClaims = signedJWT.getJWTClaimsSet();
        String issuer = unverifiedClaims.getIssuer();
        // Extract realm from issuer: "https://auth.tourlinq.com/realms/acme-travel"
        String realmName = extractRealmFromIssuer(issuer);
        // The issuer is still unverified here: only accept it if it points at our own Keycloak server
        String expectedIssuer = keycloakBaseUrl + "/realms/" + realmName;
        if (!expectedIssuer.equals(issuer)) throw new TokenValidationException("Untrusted issuer: " + issuer, ...);
        // Verify realm is a known tenant
        TenantInfo tenant = TenantRegistry.instance().getByRealm(realmName);
        if (tenant == null) throw new TokenValidationException("Unknown realm: " + realmName, ...);
        // Get or create processor for this realm (caches JWKS per realm)
        ConfigurableJWTProcessor<SecurityContext> processor = processors.computeIfAbsent(
                realmName, r -> createProcessorForRealm(r, expectedIssuer));
        JWTClaimsSet claims = processor.process(signedJWT, null);
        // ... extract userId, email, name, roles as before ...
        return new ValidatedToken(userId, email, name, roles, expiresAt, token, tenant.getTenantId());
    }
}
tqapi/src/main/java/com/perun/tlinq/oidc/ValidatedToken.java:
- Add private final String tenantId; field + getter
tqapi/src/main/java/com/perun/tlinq/oidc/OIDCConfig.java:
- Keep as base config with keycloakBaseUrl and clientId
- Remove single issuer requirement; each realm has its own issuer
tqapi/src/main/java/com/perun/tlinq/oidc/JWKSManager.java:
- Refactor to support multiple JWKS sources (one per realm)
- Map<String, JWKSource<SecurityContext>> keyed by realm
0.4 Propagate tenant through RequestContext¶
tqcommon/src/main/java/com/perun/tlinq/util/RequestContext.java:
- Add private final String tenantId; field + getter
- Update constructor signature
tqapi/src/main/java/com/perun/tlinq/AuthenticationFilter.java:
- Extract tenantId from:
  - JWT (Tier 1): ValidatedToken.getTenantId() (resolved from issuer realm)
  - Headers (Tier 2): X-Tenant-ID header
  - Dev mode (Tier 3a): from dev-tenant-id property in tlinqapi.properties
  - Internal API (Tier 0): X-Tenant-ID header (required for cross-service calls)
- Pass to new RequestContext(userId, userName, userEmail, correlationId, tenantId)
- Reject authenticated (non-guest) requests without a valid tenant → 403
0.5 Frontend Login — Realm-Aware OIDC¶
The frontend (tqweb-adm) currently initializes its OIDC client with a hardcoded realm URL. With realm-per-tenant:
Option A (subdomain routing, recommended):
- Each tenant gets a subdomain: acme.tourlinq.com, bravo.tourlinq.com
- A lightweight resolver (nginx, or a /auth/config endpoint) maps subdomain → realm
- Frontend calls GET /auth/config?tenant=acme → returns { issuer: "https://auth.tourlinq.com/realms/acme-travel", clientId: "tqweb-adm" }
- Frontend initializes OIDC with tenant-specific issuer
Option B (login page with tenant selector):
- Single URL, tenant selected before login
- Less clean but works if subdomains are not feasible
The existing AuthApi.java endpoint POST /auth/config returns OIDC config to the frontend — this needs to be tenant-aware.
0.5 TenantRegistry and TenantConfig (tqcommon)¶
New class: tqcommon/src/main/java/com/perun/tlinq/util/TenantRegistry.java
Loads tenant records from tqplatform.tenant at startup. Caches in ConcurrentHashMap<String, TenantInfo>. Provides:
- getDbName(tenantId) → database name
- getDbUrl(tenantId) → full JDBC URL
- getTenantCode(tenantId) → short code
- isValid(tenantId) → tenant exists and ACTIVE
- refresh() → reload from DB
Connects to tqplatform DB using a dedicated JDBC connection (separate from tenant DBs).
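A minimal in-memory sketch of the registry lookups follows. It leaves out the JDBC load from tqplatform.tenant; the class name TenantRegistrySketch, the TenantInfo fields, and the jdbcPrefix constructor argument are assumptions for illustration, not the planned API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for TenantRegistry (no database access). */
final class TenantRegistrySketch {
    /** Immutable view of one tqplatform.tenant row (subset of columns). */
    static final class TenantInfo {
        final String tenantId;
        final String tenantCode;
        final String dbName;
        final String kcRealm;
        final String status;

        TenantInfo(String tenantId, String tenantCode, String dbName, String kcRealm, String status) {
            this.tenantId = tenantId;
            this.tenantCode = tenantCode;
            this.dbName = dbName;
            this.kcRealm = kcRealm;
            this.status = status;
        }
    }

    private final Map<String, TenantInfo> byId = new ConcurrentHashMap<>();
    private final Map<String, TenantInfo> byRealm = new ConcurrentHashMap<>();
    private final String jdbcPrefix; // e.g. "jdbc:postgresql://localhost:5432/"

    TenantRegistrySketch(String jdbcPrefix) {
        this.jdbcPrefix = jdbcPrefix;
    }

    void register(TenantInfo tenant) {
        byId.put(tenant.tenantId, tenant);
        byRealm.put(tenant.kcRealm, tenant);
    }

    TenantInfo getByRealm(String realm) {
        return realm == null ? null : byRealm.get(realm);
    }

    /** A tenant is valid only if it exists and its status is ACTIVE. */
    boolean isValid(String tenantId) {
        TenantInfo tenant = tenantId == null ? null : byId.get(tenantId);
        return tenant != null && "ACTIVE".equals(tenant.status);
    }

    String getDbUrl(String tenantId) {
        if (!isValid(tenantId)) {
            throw new IllegalStateException("Unknown or inactive tenant: " + tenantId);
        }
        return jdbcPrefix + byId.get(tenantId).dbName;
    }
}
```

Failing fast in getDbUrl for suspended or unknown tenants keeps a bad tenant ID from ever producing a connection URL.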
New class: tqcommon/src/main/java/com/perun/tlinq/util/TenantConfig.java
This is the key abstraction for per-tenant configuration. It replaces direct calls to AppConfig.getInstance().getProp(key) with a tenant-aware lookup:
public class TenantConfig {
    // Platform-wide defaults loaded once from files (tourlinq.properties + properties.d/)
    private static Properties platformDefaults;
    // Per-tenant overrides loaded from tqplatform.tenant.config JSONB
    private static final ConcurrentHashMap<String, Properties> tenantOverrides = new ConcurrentHashMap<>();

    /**
     * Get a config property for the current tenant.
     * Lookup order: tenant override → platform default → null
     */
    public static String get(String key) {
        String tenantId = RequestContext.current() != null
                ? RequestContext.current().getTenantId() : null;
        if (tenantId != null) {
            Properties overrides = tenantOverrides.get(tenantId);
            if (overrides != null && overrides.containsKey(key)) {
                return overrides.getProperty(key);
            }
        }
        return platformDefaults.getProperty(key);
    }

    /** Get config for explicit tenant (for background jobs) */
    public static String get(String tenantId, String key) { ... }
}
Migration path: Existing AppConfig.getInstance().getProp(key) calls are gradually replaced with TenantConfig.get(key). Both can coexist during transition — AppConfig continues to work for platform-level properties.
What goes in the tenant.config JSONB column (stored in tqplatform.tenant):
{
"company.code": "ACME01",
"tqpro.company.name": "Acme Travel LLC",
"tqpro.company.logo.url": "https://acme.example.com/logo.png",
"tqpro.agency.name": "Acme Travel LLC",
"tqpro.agency.phone": "+1-555-0100",
"tqpro.agency.email": "info@acmetravel.com",
"mail.server": "smtp.acmetravel.com",
"mail.port": "587",
"mail.user": "bookings@acmetravel.com",
"mail.password": "encrypted:...",
"mail.from": "bookings@acmetravel.com",
"mail.name": "Acme Travel",
"telr.auth-key": "encrypted:...",
"telr.store-id": "29876",
"pgw.callback-base": "https://api.acmetravel.com/tlinq-api",
"twilio.sid": "ACxxxxxxxxxx",
"twilio.token": "encrypted:...",
"twilio.sms.sender": "MGxxxxxxxxxx",
"twilio.wa.sender": "MGxxxxxxxxxx",
"odoo.server": "https://erp.acmetravel.com",
"odoo.db": "acme_prod",
"odoo.user": "admin",
"odoo.password": "encrypted:...",
"amadeus.api.key": "encrypted:...",
"amadeus.api.secret": "encrypted:...",
"tiqets.api.key": "encrypted:...",
"goglobal.api.password": "encrypted:...",
"ryb2b.token": "encrypted:...",
"hotel.margin": "8",
"erp.default.product.HOTEL": "LHSTAY",
"service.charge.CARD_ONLINE": "percent,3.0,Card Processing Fee",
"content.cdn-prefix": "https://cdn.acmetravel.com/",
"s3.path.prefix": "tenants/acme/"
}
Only values that differ from platform defaults need to be specified. The TenantConfig.get() method falls back to platformDefaults (loaded from the file-based configs) for any key not overridden.
0.6 Tenant-Aware DB Sessions (THE critical change)¶
tqapp/src/main/java/com/perun/tlinq/client/nts/db/NTSDBSession.java:
Replace the single static SessionFactory with a tenant-keyed factory map:
public class NTSDBSession {
    // tenant_id of the first tenant, as inserted into tqplatform.tenant in 0.1
    private static final String DEFAULT_TENANT_ID = "00000000-0000-0000-0000-000000000001";
    private static final Map<String, SessionFactory> factories = new ConcurrentHashMap<>();
    private static Configuration baseConfiguration; // built once, cloned per tenant

    static {
        // Build base configuration (entity registrations, Hibernate settings)
        // but do NOT set connection URL yet
        baseConfiguration = new Configuration();
        baseConfiguration.configure();
        // ... register all annotated classes (lines 69-204 stay the same) ...
        // Initialize factory for default/first tenant
        String defaultDb = TenantRegistry.instance().getDbName(DEFAULT_TENANT_ID);
        factories.put(DEFAULT_TENANT_ID, buildFactory(defaultDb));
    }

    private static SessionFactory buildFactory(String dbName) {
        // copy() is a placeholder: if Configuration offers no copy method, rebuild the base config here
        Configuration cfg = baseConfiguration.copy(); // or rebuild
        cfg.setProperty("hibernate.connection.url",
                ClientConfig.instance().getDB(dbName));
        cfg.setProperty("hibernate.connection.username", ...);
        cfg.setProperty("hibernate.connection.password", ...);
        cfg.setProperty("hibernate.hikari.poolName", "NTS-" + dbName);
        cfg.setProperty("hibernate.hikari.minimumIdle", "2");
        cfg.setProperty("hibernate.hikari.maximumPoolSize", "5");
        return cfg.buildSessionFactory();
    }

    public static Session getSession() {
        String tenantId = RequestContext.current() != null
                ? RequestContext.current().getTenantId() : null;
        // No tenant in context: fall back to the first tenant (Phase 7 tightens this)
        if (tenantId == null) tenantId = DEFAULT_TENANT_ID;
        SessionFactory factory = factories.computeIfAbsent(tenantId, tid -> {
            String dbName = TenantRegistry.instance().getDbName(tid);
            return buildFactory(dbName);
        });
        return factory.openSession();
    }
}
Key design decisions:
- Lazy factory creation via computeIfAbsent — new tenant pools spin up on first request
- Small per-tenant pools (2-5 connections) keep total connection count manageable
- Base configuration (entity class registrations) is shared; only JDBC URL differs
- HikariCP pool names are tenant-prefixed for monitoring
Same pattern for other DB sessions (if they exist):
- AmdDBSession in tqamds (Amadeus schema)
- Any GoGlobal/Tiqets DB sessions
0.7 Safe deployment checkpoint¶
- Deploy with existing tlinq DB as the single default tenant
- All traffic routes to the same DB as before
- Zero behavioral change: this validates the entire plumbing end-to-end
Phase 1: Tenant Provisioning & Second Tenant¶
Effort: ~1-2 weeks | Risk: Medium
1.1 Database Cloning Script¶
# Create new tenant database from template
pg_dump -Fc tlinq > /tmp/tlinq_template.dump
createdb tlinq_acme
pg_restore -d tlinq_acme /tmp/tlinq_template.dump
# Optionally: create tenant-specific DB user
createuser tlinq_acme_user
# GRANT is SQL, not a shell command, so run it through psql
psql -d postgres -c "GRANT ALL PRIVILEGES ON DATABASE tlinq_acme TO tlinq_acme_user;"
Wrap this in a provisioning script/CLI that:
1. Creates the database from template
2. Creates optional DB user
3. Inserts tenant row into tqplatform.tenant
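4. Creates the tenant's Keycloak realm and initial admin user (see 1.3); no tenant_id user attribute is needed because the realm identifies the tenant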
5. Clears the new DB of any data (keeps DDL only) or seeds with demo data
1.2 DDL Migration Management¶
Each tenant DB must stay at the same schema version. Options:
- Simple approach: Loop over all tenant DBs from the registry and apply config/db-changes/ scripts to each
- Better: Adopt Flyway with a multi-database wrapper that iterates over tqplatform.tenant
- Add a schema_version column to tqplatform.tenant to track per-tenant migration state
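The simple loop approach reduces to computing, per tenant, which scripts are newer than its recorded schema_version. The sketch below assumes scripts in config/db-changes/ carry a numeric prefix (as in 0070-conversation-messages.sql) and that schema_version stores that number; the class TenantMigrationPlanner and the other script names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical planner: decides which migration scripts a tenant DB still needs. */
final class TenantMigrationPlanner {
    private TenantMigrationPlanner() {}

    /** Parses the numeric prefix of a script name such as "0070-conversation-messages.sql". */
    static int versionOf(String scriptName) {
        int dash = scriptName.indexOf('-');
        if (dash <= 0) {
            throw new IllegalArgumentException("No numeric prefix: " + scriptName);
        }
        return Integer.parseInt(scriptName.substring(0, dash));
    }

    /** Returns scripts with a version above currentVersion, ordered by version. */
    static List<String> pending(List<String> allScripts, int currentVersion) {
        List<String> result = new ArrayList<>();
        for (String script : allScripts) {
            if (versionOf(script) > currentVersion) {
                result.add(script);
            }
        }
        result.sort((a, b) -> Integer.compare(versionOf(a), versionOf(b)));
        return Collections.unmodifiableList(result);
    }
}
```

A wrapper would call pending once per row in tqplatform.tenant, apply the scripts to that tenant's database, and then update its schema_version.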
1.3 Automated Keycloak Realm Provisioning¶
This is the key automation for self-onboarding. When a new tenant signs up, the platform creates a fully configured Keycloak realm automatically.
New class: tqapp/src/main/java/com/perun/tlinq/entity/tenant/KeycloakRealmProvisioner.java
Uses the Keycloak Admin REST API via java.net.http.HttpClient (no new dependencies). Authenticates using a platform-level service account in the master realm with create-realm permission.
Provisioning sequence (called during tenant onboarding):
1. POST /admin/realms → Create new realm
2. POST /admin/realms/{realm}/roles → Create roles: guest, agent, manager, finance, admin
3. POST /admin/realms/{realm}/clients → Create client "tqweb-adm" (public, OIDC)
4. POST /admin/realms/{realm}/clients → Create client "tqpro-admin-api" (confidential, service account)
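5. GET /admin/realms/{realm}/clients/{id}/service-account-user
→ Look up the service account user of "tqpro-admin-api"
POST /admin/realms/{realm}/users/{userId}/role-mappings/clients/{realmManagementClientId}
→ Assign realm-management client roles to the service account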
6. POST /admin/realms/{realm}/users → Create initial tenant admin user
7. PUT /admin/realms/{realm}/users/{id}/execute-actions-email
→ Send welcome/set-password email to tenant admin
Realm configuration template (applied programmatically):
{
"realm": "acme-travel",
"enabled": true,
"displayName": "Acme Travel",
"loginTheme": "tqpro", // shared custom theme
"sslRequired": "external",
"registrationAllowed": false, // users created by tenant admin only
"resetPasswordAllowed": true,
"accessTokenLifespan": 300, // 5 min (same as current)
"ssoSessionIdleTimeout": 1800
}
Client "tqweb-adm" template (public, for frontend OIDC):
{
"clientId": "tqweb-adm",
"publicClient": true,
"standardFlowEnabled": true,
"directAccessGrantsEnabled": false,
"rootUrl": "https://acme.tourlinq.com",
"redirectUris": ["https://acme.tourlinq.com/*"],
"webOrigins": ["https://acme.tourlinq.com"],
"defaultClientScopes": ["openid", "profile", "email", "roles"]
}
Client "tqpro-admin-api" template (confidential, for staff management):
{
"clientId": "tqpro-admin-api",
"publicClient": false,
"serviceAccountsEnabled": true,
"clientAuthenticatorType": "client-secret",
"standardFlowEnabled": false,
"directAccessGrantsEnabled": false
}
The generated client_secret for tqpro-admin-api is stored in tqplatform.tenant.kc_admin_client_secret (encrypted).
Integration with staff management plan:
The KeycloakAdminClient from doc/plans/management/staff-management-plan.md (M1) becomes tenant-aware:
- Instead of a single hardcoded realm, it reads the realm name from TenantRegistry.getByTenantId(tenantId).getKcRealm()
- Instead of a single client secret, it reads from tenant.kc_admin_client_secret
- The service account in each tenant's realm has manage-users, view-users, manage-realm roles
- When a tenant admin calls POST /admin/user/create, the KeycloakAdminClient targets that tenant's realm
- The CStaffMember entity lives in the tenant's own DB, so cross-tenant staff isolation is automatic
Platform-level service account setup (one-time manual):
- In Keycloak master realm, create client tqpro-platform-admin
- Service account enabled, client_credentials grant
- Assign role admin in master realm (grants create-realm and manage-realm for all realms)
- Store credentials in platform config (env var or tlinqapi.properties)
- This account is used only during tenant provisioning, not for day-to-day operations
1.4 Complete Tenant Onboarding Flow¶
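Whether triggered by self-service signup or manual admin action, the onboarding runs these steps as a single unit. PostgreSQL and Keycloak cannot share a transaction, so atomicity is approximated with compensating rollback: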
┌─────────────────────────────────────────────────────┐
│ 1. Validate tenant code (unique, safe for realm name) │
│ 2. Create PostgreSQL database (pg_dump/pg_restore) │
│ 3. Create Keycloak realm (KeycloakRealmProvisioner) │
│ - Realm + roles + clients + service account │
│ 4. Create initial tenant admin user in Keycloak │
│ 5. Insert tenant row into tqplatform.tenant │
│ - db_name, kc_realm, kc_admin_client_secret, config│
│ 6. Send welcome email to tenant admin │
│ - Contains login URL: https://{tenant}.tourlinq.com│
│ - Keycloak sends set-password email │
│ 7. Refresh TenantRegistry cache │
└─────────────────────────────────────────────────────┘
If any step fails, previous steps are rolled back (delete realm, drop database, remove tenant row).
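The rollback rule can be sketched as a list of steps that each know how to undo themselves. This is an illustrative pattern only: OnboardingSaga, Step, and the logging helper are hypothetical names, and real steps would call the database, Keycloak, and the registry.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical runner: applies steps in order and undoes completed ones on failure. */
final class OnboardingSaga {
    interface Step {
        void apply() throws Exception;
        void undo();
    }

    private OnboardingSaga() {}

    static void run(List<Step> steps) throws Exception {
        Deque<Step> completed = new ArrayDeque<>();
        try {
            for (Step step : steps) {
                step.apply();
                completed.push(step);
            }
        } catch (Exception failure) {
            // Undo in reverse order; keep going even if one undo fails.
            while (!completed.isEmpty()) {
                try {
                    completed.pop().undo();
                } catch (RuntimeException undoFailure) {
                    failure.addSuppressed(undoFailure);
                }
            }
            throw failure;
        }
    }

    /** Demo step that records what happened instead of touching real systems. */
    static Step logging(String name, List<String> log, boolean failOnApply) {
        return new Step() {
            @Override
            public void apply() throws Exception {
                if (failOnApply) {
                    throw new Exception(name + " failed");
                }
                log.add("do:" + name);
            }

            @Override
            public void undo() {
                log.add("undo:" + name);
            }
        };
    }
}
```

If the registry insert fails after the database and realm were created, the realm is deleted first and the database dropped second, mirroring the reverse of the creation order.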
Self-service onboarding endpoint (future, requires public-facing signup page):
POST /platform/tenant/signup
Body: { tenantCode, tenantName, adminEmail, adminName }
→ Runs the onboarding flow above
→ Returns: { tenantId, loginUrl, message: "Check your email to set your password" }
This endpoint would be rate-limited and require email verification before the realm is created.
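Because tenantCode arrives from a public form and becomes a realm name and subdomain, step 1 of the flow (validate tenant code) deserves a strict allow-list. The pattern below is an assumption chosen to fit tenant_code VARCHAR(50) and DNS labels, not a rule taken from Keycloak; TenantCodeValidator is a hypothetical name.

```java
import java.util.regex.Pattern;

/** Hypothetical validator for tenant codes submitted at signup. */
final class TenantCodeValidator {
    // Lowercase letter first, then letters/digits/hyphens, ending in a letter or digit; 3 to 50 chars.
    private static final Pattern SAFE = Pattern.compile("[a-z][a-z0-9-]{1,48}[a-z0-9]");

    private TenantCodeValidator() {}

    static boolean isSafe(String code) {
        return code != null
                && SAFE.matcher(code).matches()
                && !code.contains("--")
                && !"master".equals(code); // reserved: Keycloak's platform admin realm
    }
}
```

Uniqueness still has to be checked against tqplatform.tenant, since a pattern match says nothing about existing tenants.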
1.5 Onboard Second Tenant — Proof of Concept¶
- Run provisioning (manual CLI or admin API)
- Verify: new Keycloak realm exists with correct clients and roles
- Verify: tenant admin can log in via https://{tenant}.tourlinq.com
- Verify: tenant admin can create staff users from the User Management page (staff-management-plan M4)
- Verify: data isolation — bookings created by Tenant A are invisible to Tenant B
Phase 2: Cache Partitioning¶
Effort: ~1-2 weeks | Risk: Low
All in-memory caches must be tenant-scoped. Simplest approach: prefix cache keys with tenant ID.
| Cache | File | Change |
|---|---|---|
| StaticMapCache | tqapp/.../cache/StaticMapCache.java | Key = tenantId::cacheName |
| PricelistCache | tqapp/.../cache/PricelistCache.java | Delegate to tenant-prefixed StaticMapCache |
| ProductCache | tqapp/.../cache/ProductCache.java | Same |
| SupplierCache | tqapp/.../cache/SupplierCache.java | Same |
| CartHolder | tqapp/.../cart/CartHolder.java | Key = tenantId::sessionId |
| TlinqClusterCache | tqcommon/.../cache/TlinqClusterCache.java | Hazelcast map name = tenantId::mapName |
Utility class: TenantCacheKey.of(name) → RequestContext.current().getTenantId() + "::" + name
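A sketch of the key builder follows, with the tenant ID passed explicitly so the example does not depend on RequestContext. Refusing to build a key without a tenant is an assumption added here; the plan only specifies the tenantId::name format.

```java
/** Hypothetical sketch of the tenant-prefixed cache key utility. */
final class TenantCacheKey {
    static final String SEPARATOR = "::";

    private TenantCacheKey() {}

    static String of(String tenantId, String name) {
        if (tenantId == null || tenantId.isEmpty()) {
            // Never fall back to an unprefixed key: that would be a shared, cross-tenant entry.
            throw new IllegalStateException("No tenant in context for cache key: " + name);
        }
        if (tenantId.contains(SEPARATOR)) {
            throw new IllegalArgumentException("Tenant ID must not contain " + SEPARATOR);
        }
        return tenantId + SEPARATOR + name;
    }
}
```

In the real utility the single-argument of(name) would read the tenant from RequestContext.current() and delegate to a method like this one.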
Phase 3: Configuration & Credential Isolation¶
Effort: ~2-3 weeks | Risk: Medium
Each tenant has its own credentials, business rules, and branding. All per-tenant values are stored in tqplatform.tenant.config JSONB and accessed via TenantConfig.get(key) (introduced in Phase 0.5).
3.1 Plugin Config Classes — Tenant-Aware Credential Lookup¶
| Plugin | Current Singleton | Change |
|---|---|---|
| Odoo | OdooClientConfig.getInstance() | Server URL, DB, user, password → TenantConfig.get("odoo.*") |
| Amadeus | AmadeusClientConfig.getInstance() | API key/secret → TenantConfig.get("amadeus.*") |
| Rayna B2B | RaynaClientConfig singleton | Token, server → TenantConfig.get("ryb2b.*") |
| Tiqets | TiqetsClientConfig singleton | API key → TenantConfig.get("tiqets.*") |
| GoGlobal | GoGlobalClientConfig singleton | Password, agency ID → TenantConfig.get("goglobal.*") |
| Google Flights | GoogleFlightsClientConfig singleton | RapidAPI key → TenantConfig.get("gf.*") |
Important: The structural parts of plugin configs (service class mappings in nts-client.xml, odoo-client.properties) remain shared — loaded once from files. Only credentials and connection URLs become tenant-specific.
3.2 Other Tenant-Specific Settings¶
| Category | Config Keys | Notes |
|---|---|---|
| Mail/SMTP | mail.server, mail.port, mail.user, mail.password, mail.from, mail.name | Each tenant sends from own domain |
| Payment Gateway | telr.auth-key, telr.store-id, pgw.callback-base | Own merchant account |
| Twilio/WhatsApp | twilio.sid, twilio.token, twilio.sms.sender, twilio.wa.sender | Own Twilio account |
| Company Branding | tqpro.company.name, tqpro.company.logo.url, tqpro.agency.* | Tenant identity |
| Business Rules | hotel.margin, erp.default.product.*, service.charge.* | Agency-specific pricing |
| Content/CDN | content.cdn-prefix, s3.path.prefix | Tenant-specific storage paths |
| AI Integration | ai.api.key | Separate API key per tenant (optional) |
All accessed via TenantConfig.get(key) with platform defaults as fallback.
3.3 AppConfig Migration Path¶
- TenantConfig loads platform defaults from the same files AppConfig uses
- Gradually replace AppConfig.getInstance().getProp(key) → TenantConfig.get(key) in tenant-sensitive code
- AppConfig remains for truly platform-level properties (TLINQ_HOME, config path resolution)
- A grep of AppConfig.getInstance().getProp identifies all ~N call sites to migrate
3.4 Credential Encryption¶
Sensitive values stored with encrypted: prefix in JSONB. TenantConfig.get() transparently decrypts using application-level AES-256 key from TQPRO_ENCRYPTION_KEY env var.
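One possible shape for the transparent decryption is sketched below. The plan fixes only the encrypted: prefix and an AES-256 key; the AES-GCM mode, the 12-byte IV prepended to the ciphertext, Base64 encoding, and the class name ConfigCrypto are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper for "encrypted:" config values (assumed layout: Base64(IV || ciphertext+tag)). */
final class ConfigCrypto {
    static final String PREFIX = "encrypted:";
    private static final int IV_LENGTH = 12;
    private static final int TAG_BITS = 128;

    private ConfigCrypto() {}

    static String encrypt(String plain, byte[] key) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return PREFIX + Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    /** Returns plain values unchanged and decrypts values carrying the prefix. */
    static String resolve(String stored, byte[] key) {
        if (stored == null || !stored.startsWith(PREFIX)) {
            return stored;
        }
        try {
            byte[] in = Base64.getDecoder().decode(stored.substring(PREFIX.length()));
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, in, 0, IV_LENGTH));
            byte[] plain = cipher.doFinal(in, IV_LENGTH, in.length - IV_LENGTH);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }
}
```

The 32-byte key would be decoded from the TQPRO_ENCRYPTION_KEY env var; GCM's authentication tag means a tampered or wrongly keyed value fails loudly instead of yielding garbage credentials.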
Phase 4: S3 Document Storage Isolation¶
Effort: ~1 week | Risk: Low
- tqapp/.../media/BookingDocumentStorageService.java: Prefix all S3 paths with tenant path prefix
- Current: bookings/{bookingId}/documents/{file}
- New: {tenantCode}/bookings/{bookingId}/documents/{file}
- MediaS3Config singleton is fine (single AWS account); only paths change
- Tenant prefix comes from TenantRegistry.getTenantCode(tenantId)
- Existing files: one-time migration to move under tenant prefix
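The new key layout can be sketched as a pure function. The class TenantS3Paths and its segment validation are assumptions for illustration; the validation exists so that a crafted booking ID or file name cannot escape the tenant prefix.

```java
/** Hypothetical builder for tenant-prefixed S3 object keys. */
final class TenantS3Paths {
    private TenantS3Paths() {}

    static String bookingDocumentKey(String tenantCode, String bookingId, String fileName) {
        return segment(tenantCode) + "/bookings/" + segment(bookingId) + "/documents/" + segment(fileName);
    }

    /** Rejects empty segments and anything that could change the path structure. */
    private static String segment(String value) {
        if (value == null || value.isEmpty() || value.contains("/") || value.equals("..") || value.equals(".")) {
            throw new IllegalArgumentException("Invalid path segment: " + value);
        }
        return value;
    }
}
```

For tenant code acme-travel, booking B123, and file voucher.pdf this yields acme-travel/bookings/B123/documents/voucher.pdf.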
Phase 5: WhatsApp/Messaging Isolation¶
Effort: ~2 weeks | Risk: Medium
The Python WhatsApp service (tqwhatsapp/) connects to the same PG server. With database-per-tenant:
- Java → Python internal API calls: Include X-Tenant-ID header (from RequestContext)
- Python service: Accept X-Tenant-ID, resolve tenant DB name from registry, connect to that DB's tqwa schema
- No schema cloning needed: each tenant's DB already has its own tqwa schema
- WhatsApp Business Account: Each tenant gets own phone number ID from tenant.config->'wa_phone_id'
- Conversation isolation is automatic: queries hit the tenant's DB, so phone number collisions across tenants are impossible
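On the Java side, attaching the header might look like the sketch below, which uses the JDK's java.net.http client. The class name, the /internal/messages path, and the timeout are placeholders; the plan does not name the internal endpoints or say which HTTP client TQPro uses.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical request factory for Java-to-Python internal calls. */
final class WhatsAppInternalRequests {
    static final String TENANT_HEADER = "X-Tenant-ID";

    /** Builds a POST that always carries the caller's tenant; refuses to build without one. */
    static HttpRequest sendMessage(URI baseUri, String tenantId, String jsonBody) {
        if (tenantId == null || tenantId.isBlank()) {
            throw new IllegalStateException("No tenant in context; refusing internal call");
        }
        return HttpRequest.newBuilder(baseUri.resolve("/internal/messages")) // placeholder path
                .timeout(Duration.ofSeconds(10))
                .header(TENANT_HEADER, tenantId)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

Centralizing request construction in one factory means a call site cannot forget the header; the Python service can then reject any internal request that arrives without it.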
Phase 6: Frontend Tenant Awareness¶
Effort: ~1 week | Risk: Low
- Minimal changes: JWT carries tenant_id; API isolation is entirely server-side
- Add tenant name/branding to the UI header (new /auth/tenant-info endpoint)
- Optional: per-tenant logo, color theme, company name from tenant.config
- api-roles.properties: No change; roles are the same across tenants
Phase 7: Defense-in-Depth Safety¶
Effort: ~1 week | Risk: Low
TenantAssert utility (new class in tqcommon):
- requireTenant() — throws if no tenant in RequestContext
- requireDbMatch(Session) — verifies session's JDBC URL matches expected tenant DB
Insert assertions into:
- NTSEntityWriteService.create(), .write(), .delete()
- NTSEntityReadService.read(), .search()
Background jobs utility: TenantScope.run(tenantId, Runnable) — sets RequestContext for code running outside JAX-RS requests (scheduled maintenance in BookingMaintenanceApi, Hazelcast listeners, etc.)
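The two utilities above can be sketched together. TenantContext is a ThreadLocal stand-in for RequestContext, whose real API is not shown in this plan, and requireDbMatch takes the JDBC URL as a string rather than a Hibernate Session so the sketch stays self-contained. The database name tlinq_acme in the usage note is illustrative.

```java
import java.net.URI;

/** Stand-in for RequestContext (assumption): holds the current tenant per thread. */
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();
    static String get() { return CURRENT.get(); }
    static void set(String tenantId) { CURRENT.set(tenantId); }
    static void clear() { CURRENT.remove(); }
}

final class TenantAssert {
    /** Throws if no tenant is bound to the current thread. */
    static String requireTenant() {
        String tenantId = TenantContext.get();
        if (tenantId == null || tenantId.isBlank()) {
            throw new IllegalStateException("No tenant in request context");
        }
        return tenantId;
    }

    /** Verifies that a URL like jdbc:postgresql://host:5432/db?opts points at the expected DB. */
    static void requireDbMatch(String jdbcUrl, String expectedDbName) {
        if (!jdbcUrl.startsWith("jdbc:")) {
            throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
        }
        String path = URI.create(jdbcUrl.substring("jdbc:".length())).getPath();
        String actual = (path == null) ? "" : path.replaceFirst("^/", "");
        if (!actual.equals(expectedDbName)) {
            throw new IllegalStateException(
                    "Session is bound to '" + actual + "', expected '" + expectedDbName + "'");
        }
    }
}

final class TenantScope {
    /** Runs the task with tenantId bound, then restores whatever was bound before. */
    static void run(String tenantId, Runnable task) {
        String previous = TenantContext.get();
        TenantContext.set(tenantId);
        try {
            task.run();
        } finally {
            if (previous == null) TenantContext.clear();
            else TenantContext.set(previous);
        }
    }
}
```

A scheduled job would call TenantScope.run("acme", () -> ...) once per tenant. Restoring the previous value in finally matters on pooled threads, where a leaked tenant would silently apply to the next task.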
Startup health check: On boot, verify all tenant DBs are reachable and at the expected schema version.
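A sketch of the boot-time check, assuming the expected schema version per tenant comes from the registry and that a probe function reports the version actually found in each tenant DB. TenantHealthCheck and the probe signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical startup check over all registered tenants. */
final class TenantHealthCheck {
    /**
     * @param expectedVersions tenant code to expected schema version
     * @param probe returns the schema version found in the tenant DB, or empty if unreachable
     * @return one line per failing tenant, in tenant-code order; empty if all are healthy
     */
    static List<String> check(Map<String, Integer> expectedVersions,
                              Function<String, OptionalInt> probe) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Integer> e : new TreeMap<>(expectedVersions).entrySet()) {
            OptionalInt actual;
            try {
                actual = probe.apply(e.getKey());
            } catch (RuntimeException ex) {
                actual = OptionalInt.empty();  // treat probe failures as unreachable
            }
            if (!actual.isPresent()) {
                problems.add(e.getKey() + ": unreachable");
            } else if (actual.getAsInt() != e.getValue()) {
                problems.add(e.getKey() + ": schema version " + actual.getAsInt()
                        + ", expected " + e.getValue());
            }
        }
        return problems;
    }
}
```

Collecting all problems instead of failing on the first lets the boot log show every unhealthy tenant at once; whether a non-empty result aborts startup or only disables the affected tenants is a policy decision the plan leaves open.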
Key Risks & Mitigations¶
| Risk | Mitigation |
|---|---|
| Connection pool explosion (many tenants) | Small per-tenant pools (2-5); lazy initialization; pool eviction for idle tenants |
| Missed getSession() call without tenant context | TenantAssert.requireTenant() safety net; grep audit of all call sites |
| Background jobs without RequestContext | TenantScope.run(tenantId, Runnable) wraps all async/scheduled code |
| Schema DDL drift across tenant DBs | Migration wrapper iterates all tenant DBs; schema_version tracking in registry |
| Hibernate Configuration per tenant | Build from scratch per tenant (entity class list is deterministic); cache the SessionFactory instances |
| Keycloak realm sprawl | Realm count = tenant count; Keycloak handles hundreds of realms efficiently. Monitor realm count. Deprovisioned tenants → disable realm. |
| JWTValidator accepting tokens from unknown realms | validateToken() checks TenantRegistry.getByRealm(realmName) before processing — rejects tokens from unknown/deprovisioned tenants |
| Realm provisioning failure mid-onboarding | Transactional onboarding with rollback (delete realm, drop DB, remove tenant row) if any step fails |
| Staff management plan compatibility | KeycloakAdminClient (from staff-management-plan M1) becomes tenant-aware: reads realm name and service account secret per tenant from TenantRegistry. No architectural conflict — same API, scoped to tenant's realm. |
| Per-tenant Odoo/Amadeus outages | Natural blast radius isolation — one tenant's external system being down doesn't affect others |
| Config migration completeness | Grep all AppConfig.getInstance().getProp() calls to identify every property access point. Missing a migration means a tenant gets the platform default — safe but possibly wrong. Track migration in a checklist. |
| Shared file-based configs accidentally treated as per-tenant | Clear separation: file-based configs = shared structural/platform; tqplatform.tenant.config JSONB = per-tenant overrides. TenantConfig.get() provides the layered lookup. |
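The first mitigation in the table (lazy initialization plus eviction of idle tenants) can be sketched generically. TenantResourceCache is a hypothetical name, and the resource type is left abstract: in TQPro it would wrap a per-tenant SessionFactory or connection pool. Time is passed in explicitly so the eviction rule is easy to test.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical lazily populated per-tenant cache with idle eviction. */
final class TenantResourceCache<R extends AutoCloseable> {
    private static final class Slot<R> {
        final R resource;
        long lastUsedNanos;
        Slot(R resource, long now) { this.resource = resource; this.lastUsedNanos = now; }
    }

    private final Map<String, Slot<R>> slots = new HashMap<>();
    private final Function<String, R> factory;

    TenantResourceCache(Function<String, R> factory) { this.factory = factory; }

    /** Creates the tenant's resource on first use and records the access time. */
    synchronized R get(String tenantId, long nowNanos) {
        Slot<R> slot = slots.get(tenantId);
        if (slot == null) {
            slot = new Slot<>(factory.apply(tenantId), nowNanos);
            slots.put(tenantId, slot);
        }
        slot.lastUsedNanos = nowNanos;
        return slot.resource;
    }

    /**
     * Closes and removes resources idle for longer than maxIdle.
     * Caveat: a caller may still hold an evicted resource; a production version
     * would need reference counting or a drain period before close().
     */
    synchronized int evictIdle(Duration maxIdle, long nowNanos) {
        int evicted = 0;
        Iterator<Map.Entry<String, Slot<R>>> it = slots.entrySet().iterator();
        while (it.hasNext()) {
            Slot<R> slot = it.next().getValue();
            if (nowNanos - slot.lastUsedNanos > maxIdle.toNanos()) {
                it.remove();
                try { slot.resource.close(); } catch (Exception ignored) { /* best effort */ }
                evicted++;
            }
        }
        return evicted;
    }

    synchronized int size() { return slots.size(); }
}
```

A periodic task would call evictIdle(Duration.ofMinutes(30), System.nanoTime()); the 30-minute threshold is only an example and would need tuning against real tenant traffic.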
Verification Plan¶
- Phase 0 smoke test: Deploy with single tenant (existing DB), run full regression — zero behavioral change expected
- Phase 1 isolation test: Onboard second tenant, create bookings/customers in both, verify via direct DB queries that data is in separate databases
- Cross-tenant penetration test: Use Tenant A's JWT to call APIs — verify only Tenant A's data is returned. Attempt to pass Tenant B's entity IDs — should get "not found"
- Cache isolation test: Populate caches for Tenant A, switch to Tenant B context, verify caches are empty/independent
- External system test: Configure different Odoo instances per tenant, verify correct routing
- Document access test: Upload documents for Tenant A, verify Tenant B can't access them
- Connection pool test: Monitor HikariCP metrics under multi-tenant load — verify pool sizes stay within limits
- Config isolation test: Set different mail.from, hotel.margin, tqpro.agency.name per tenant; verify each tenant's emails, pricing, and branding use correct values
- Config fallback test: Omit a property from tenant config; verify it falls back to platform default from tourlinq.properties
- Migration audit test: Verify all AppConfig.getInstance().getProp() call sites handling tenant-specific data have been migrated to TenantConfig.get()