
TQPro Multi-Tenancy Architecture Plan

Context

TQPro is currently a single-tenant application. All users, regardless of company, share the same database, same schemas, same caches, same external system credentials, and same file storage. The only hint of organizational grouping is CRegUser.companyId (mapped to Odoo's customer_account_id), but nothing enforces data isolation based on it.

Goal: Enable multiple independent travel agencies (potentially competitors) to use the same TQPro platform with full logical data isolation — each tenant's customers, bookings, products, documents, and conversations are invisible to other tenants.


Current State Summary

| Concern | Status | Key File |
|---|---|---|
| Tenant identity in JWT | Missing | tqapi/.../oidc/ValidatedToken.java |
| Tenant in request context | Missing | tqcommon/.../util/RequestContext.java |
| Tenant filtering in DB queries | None | tqapp/.../nts/service/NTSEntityReadService.java |
| Database isolation | Single DB tlinq, 5 schemas | tqapp/.../nts/db/NTSDBSession.java |
| Cache isolation | Global singletons | tqapp/.../cache/StaticMapCache.java |
| S3 document paths | No tenant prefix | tqapp/.../media/BookingDocumentStorageService.java |
| External system creds | Global singletons | tqodoo/.../OdooClientConfig.java, tqamds/.../AmadeusClientConfig.java |
| WhatsApp conversations | Keyed by phone only, no tenant | config/db-changes/0070-conversation-messages.sql |
| Keycloak realm | Single realm, no tenant claim | config/tlinqapi.properties |
| API authorization | Global roles (guest/agent/admin) | config/api-roles.properties |
| Configuration management | All configs are global singletons loaded from files | AppConfig, NTSClientConfig, OdooClientConfig, etc. |

Current Database Schema Layout (single DB tlinq)

| Schema | Purpose | JPA Entities |
|---|---|---|
| nts | Core business data (80+ tables: bookings, cruises, hotels, groups, visa, marketing, tripmaker, offline tickets) | Hardcoded in NTSDBSession; no @Table(schema=...) (uses default) |
| amadeus | Airport cache | @Table(name="airport", schema="amadeus") in AirportEntity |
| goglobal | GoGlobal hotel/city/country data | @Table(schema="goglobal", catalog="tlinq") in GG* entities |
| tiqets | Tiqets orders, products, tags | @Table(schema="tiqets", catalog="tlinq") in Tiqets* entities |
| tqwa | WhatsApp messages, conversations, broadcasts | Used by Python service; separate DB user wa_service |

Configuration Landscape

The platform loads configuration from files under TLINQ_HOME via several singleton classes. Every one of these singletons is loaded once at startup and shared globally — there is no per-tenant config resolution.

Configuration Loading Hierarchy

TLINQ_HOME/
├── tourlinq-config.xml          ← ClientConfig singleton (JAXB)
│   └── entities/*.xml           ← 19 XInclude entity mapping files
├── tourlinq.properties          ← AppConfig singleton (Properties)
├── properties.d/                ← AppConfig (alphabetically merged overrides)
│   ├── erp-booking.properties
│   └── messaging.properties
├── nts-client.xml               ← NTSClientConfig singleton (JAXB)
├── odoo-client.properties       ← OdooClientConfig singleton
├── odoo-server.properties       ← OdooClientConfig singleton
├── amadeus-client.xml           ← AmadeusClientConfig singleton
├── amadeus.idfile               ← Amadeus credentials
├── goglobal-client.xml          ← GoGlobalClientConfig singleton
├── tiqets-client.xml            ← TiqetsClientConfig singleton
├── rayna-client.xml             ← RaynaClientConfig singleton
├── googleflights-client.xml     ← GoogleFlightsClientConfig singleton
├── tlinqapi.properties          ← TQProApiServer (API ports, auth, CORS)
├── api-roles.properties         ← ApiRoleManager (endpoint→role mappings)
├── hazelcast.xml                ← TlinqClusterCache (cluster config)
└── log.properties               ← Java logging

Configuration Classification: Shared vs. Per-Tenant

| Config File | Loader Class | Classification | Reason |
|---|---|---|---|
| PLATFORM-LEVEL (shared across all tenants) | | | |
| tourlinq-config.xml: entity definitions, factory mappings | ClientConfig | Shared | Entity model & service wiring is identical for all tenants |
| config/entities/*.xml: 19 entity mapping files | ClientConfig (XInclude) | Shared | Same DB schema structure per tenant DB |
| nts-client.xml: service class mappings | NTSClientConfig | Shared | Service implementations are code, not data |
| odoo-client.properties: Odoo service-to-model mappings | OdooClientConfig | Shared | Structural mappings are the same; credentials differ |
| api-roles.properties: endpoint→role authorization | ApiRoleManager | Shared | Same API surface for all tenants |
| tlinqapi.properties: server ports, OIDC issuer, CORS | TQProApiServer | Shared | Single JVM serves all tenants |
| hazelcast.xml: cluster topology | TlinqClusterCache | Shared | One cluster; tenant isolation via key prefixing |
| log.properties: logging format | Java Logging | Shared | Tenant context added via MDC/RequestContext |
| TENANT-LEVEL (must differ per tenant) | | | |
| tourlinq.properties: API keys, company info, mail, payment gateway | AppConfig | Per-tenant | Contains credentials, company identity, business rules |
| properties.d/messaging.properties: Twilio SID/token, WA config | AppConfig | Per-tenant | Each tenant has own Twilio account, WA phone |
| properties.d/erp-booking.properties: ERP defaults, agency info, charge rates | AppConfig | Per-tenant | Agency name, payment channels, service charge rates |
| odoo-server.properties: Odoo URL, DB, user, password | OdooClientConfig | Per-tenant | Each tenant has own Odoo instance |
| amadeus-client.xml: Amadeus API credentials | AmadeusClientConfig | Per-tenant | Each tenant has own Amadeus contract |
| amadeus.idfile: Amadeus credential file | Amadeus plugin | Per-tenant | Credential file per tenant |
| goglobal-client.xml: GoGlobal API URL, agency ID, credentials | GoGlobalClientConfig | Per-tenant | Each tenant has own GoGlobal account |
| tiqets-client.xml: Tiqets API key, JWT key file | TiqetsClientConfig | Per-tenant | Each tenant has own Tiqets agreement |
| rayna-client.xml: Rayna agent ID, API tokens | RaynaClientConfig | Per-tenant | Each tenant has own Rayna B2B account |
| googleflights-client.xml: RapidAPI key | GoogleFlightsClientConfig | Per-tenant | API key is per-account |
| tourlinq-config.xml: database connection URLs/credentials | ClientConfig | Per-tenant | Each tenant points to own DB |
| hibernate.cfg.xml: JDBC URL (hardcoded localhost:5432/tlinq) | Hibernate | Per-tenant | Different DB per tenant |

Specific Per-Tenant Properties (from tourlinq.properties + properties.d/)

Identity & Branding: - company.code, tqpro.company.name, tqpro.company.logo.url, tqpro.company.contact.1/2 - tqpro.agency.name, tqpro.agency.phone, tqpro.agency.email

Mail/SMTP: - mail.server, mail.port, mail.user, mail.password, mail.from, mail.name, mail.usetls

Payment Gateway (Telr): - telr.pgw-url, telr.auth-key, telr.store-id, telr.test-mode - telr.link-success, telr.link-cancel, telr.link-declined - pgw.callback-base, pgw.public-site-url

Messaging/Twilio/WhatsApp: - twilio.sid, twilio.token, twilio.sms.sender, twilio.wa.sender - broadcast.wa.provider, broadcast.wa.media.cdn.prefix, broadcast.wa.shortlink - whatsapp.python.service.url, whatsapp.python.service.api-key - manager.sms, admin.sms

3rd Party API Keys: - rapidapi.visa.key, tiqets.api.key, goglobal.api.password, gf.rapidapi.key - pexels.api.key, ai.api.key - ryb2b.server, ryb2b.token

ERP/Business Rules: - erp.payment.channel.*, erp.default.product.* - service.charge.* (per payment method rates) - hotel.margin, service.default.hotel.checkInTime, etc.

Storage Paths: - content.directory, content.urlprefix, content.cdn-prefix - offline-tickets.storage-path

Database: - tlinq.dbname, tlinq.dbpass


Why not shared-DB with tenant_id column (row-level)?

  • 100+ JPA entities with a generic, reflection-based query framework (EntityFacade + NTSClientService.querySearchS()). Injecting AND tenant_id = ? into every dynamically-built query path is fragile — one missed filter = cross-tenant data leak.

Why not schema-per-tenant?

  • The DB already has 5 distinct schemas (nts, amadeus, goglobal, tiqets, tqwa) with hardcoded @Table(schema=...) annotations in JPA entities. Duplicating all 5 schemas per tenant and rewriting annotations is messy and error-prone.

Why database-per-tenant is the best fit:

  • Each tenant gets a complete copy of the DB with all 5 schemas intact — nts, amadeus, goglobal, tiqets, tqwa schema names stay identical
  • Zero JPA annotation changes — all @Table(schema="goglobal") references work as-is
  • Strongest isolation — a programming error cannot leak data across databases; PostgreSQL enforces DB boundaries at the protocol level
  • Clean tenant provisioning: pg_dump / pg_restore of the template DB creates a new tenant database in one scripted step
  • Per-tenant DB users — each tenant's DB can have its own credentials, enabling PG-level access control
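  • Hibernate supports this natively: MultiTenancyStrategy.DATABASE with MultiTenantConnectionProvider (in Hibernate 6 the MultiTenancyStrategy enum was removed and registering a MultiTenantConnectionProvider is sufficient)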
  • Same PG server — no network overhead; managed PG services (RDS, Cloud SQL) support multiple DBs on one instance
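  • Connection pool sizing: 5-50 tenants with 2-5 connections each means 10-250 connections in total. The upper end exceeds PostgreSQL's default max_connections of 100, so the server limit has to be raised (or a pooler such as PgBouncer added) as the tenant count grows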
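
The pool arithmetic above can be checked with a few lines of Java. This is a back-of-the-envelope sketch: the class and method names are hypothetical, and the server limit passed in is whatever max_connections the PostgreSQL instance is configured with (100 is the stock default, not a value taken from the TQPro server).

```java
/** Hypothetical helper: worst-case connection count for per-tenant pools. */
final class ConnectionBudget {

    /** Connections in use when every tenant pool has grown to its maximum size. */
    static int worstCase(int tenants, int maxPoolSizePerTenant) {
        return tenants * maxPoolSizePerTenant;
    }

    /**
     * True if the worst case still leaves `reserved` connections free
     * for admin sessions and other services on the same server.
     */
    static boolean fits(int tenants, int maxPoolSizePerTenant,
                        int serverMaxConnections, int reserved) {
        return worstCase(tenants, maxPoolSizePerTenant) <= serverMaxConnections - reserved;
    }
}
```

With 50 tenants and a pool maximum of 5, `worstCase` returns 250, so `fits(50, 5, 100, 10)` is false while `fits(5, 5, 100, 10)` is true.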

Phased Implementation

Phase 0: Tenant Identity Infrastructure (Foundation)

Effort: ~2-3 weeks | Risk: Low

Goal: Establish tenant identity flowing from JWT through the entire request lifecycle to the DB session, without changing any business logic. Deploy with a single tenant to validate the plumbing.

0.1 Tenant Registry Database & Table

Create a platform-level database tqplatform with a tenant registry:

CREATE TABLE tenant (
    tenant_id       VARCHAR(36) PRIMARY KEY,      -- UUID
    tenant_code     VARCHAR(50) UNIQUE NOT NULL,   -- e.g. "acme-travel" (also the Keycloak realm name)
    tenant_name     VARCHAR(200) NOT NULL,
    db_name         VARCHAR(63) NOT NULL,          -- e.g. "tlinq_acme"
    db_user         VARCHAR(63),                   -- optional per-tenant DB user
    db_pass         VARCHAR(200),                  -- encrypted
    kc_realm        VARCHAR(63) NOT NULL,          -- Keycloak realm name (= tenant_code)
    kc_admin_client_secret VARCHAR(200),           -- encrypted, per-realm service account secret
    status          VARCHAR(20) DEFAULT 'ACTIVE',  -- ACTIVE, SUSPENDED, DEPROVISIONED
    config          JSONB DEFAULT '{}',            -- per-tenant config overrides
    created_at      TIMESTAMP DEFAULT NOW()
);
-- First tenant = existing realm & database
INSERT INTO tenant (tenant_id, tenant_code, tenant_name, db_name, kc_realm)
VALUES ('00000000-0000-0000-0000-000000000001', 'tqpro-adm', 'Default Agency', 'tlinq', 'tqpro-adm');

The config JSONB column holds per-tenant overrides (Odoo creds, Amadeus keys, S3 prefix, WA phone ID, branding, etc.) — extensible without DDL changes.

0.2 Keycloak: Realm-Per-Tenant Architecture

Why realm-per-tenant (not single realm with tenant_id attribute):

  • Keycloak users belong to a realm. A single shared realm puts all tenants' users in one pool: Keycloak's admin console would expose users across tenants, and isolation based on user attributes is fragile.
  • Competing agencies need complete user isolation: independent password policies, independent login branding/themes, independent identity providers (e.g., tenant A uses Google SSO, tenant B uses SAML).
  • The staff management plan (doc/plans/management/staff-management-plan.md) has the tenant admin managing users from inside TQPro via the Keycloak Admin REST API. With realm-per-tenant, the admin API calls are scoped to the tenant's realm, so there is no risk of cross-tenant user visibility.

Architecture:

Keycloak Server
├── master realm (platform admin only)
│   └── Platform service account: tqpro-platform-admin
│       (has create-realm permission — used ONLY for tenant provisioning)
├── tqpro-adm realm (first/default tenant — existing)
│   ├── Client: tqweb-adm (public, OIDC login for frontend)
│   ├── Client: tqpro-admin-api (confidential, service account for user CRUD)
│   ├── Realm roles: guest, agent, manager, finance, admin
│   └── Users: existing users
├── acme-travel realm (tenant 2 — created at onboarding)
│   ├── Client: tqweb-adm (same config, different realm)
│   ├── Client: tqpro-admin-api (confidential, service account)
│   ├── Realm roles: guest, agent, manager, finance, admin
│   └── Users: acme-travel's users
└── ... more tenant realms

Tenant identity derivation from JWT: The iss (issuer) claim in every JWT contains the realm: https://auth.tourlinq.com/realms/acme-travel. The tenant is resolved by extracting the realm name from the issuer URL and looking it up in tqplatform.tenant.kc_realm.

No tenant_id user attribute needed — the realm IS the tenant boundary.
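
The realm extraction described above might look like the following. This is a minimal sketch rather than the project's code: the class IssuerRealm and its method are hypothetical, and the allowed realm-name pattern is an assumption chosen to fit names such as acme-travel. The issuer must start with the platform's own Keycloak base URL, so a token naming a foreign host never resolves to a tenant.

```java
import java.util.Optional;

/** Hypothetical helper: derives the Keycloak realm name from a JWT issuer URL. */
final class IssuerRealm {

    /**
     * Returns the realm only if the issuer is exactly
     * "{keycloakBaseUrl}/realms/{realm}" with a single, safe realm segment.
     */
    static Optional<String> extractRealm(String keycloakBaseUrl, String issuer) {
        if (issuer == null) {
            return Optional.empty();
        }
        String prefix = keycloakBaseUrl + "/realms/";
        if (!issuer.startsWith(prefix)) {
            return Optional.empty();           // different host or path
        }
        String realm = issuer.substring(prefix.length());
        // Assumed naming rule: lowercase letters, digits, hyphens, max 63 chars
        if (!realm.matches("[a-z0-9][a-z0-9-]{0,62}")) {
            return Optional.empty();           // extra path segments, odd characters
        }
        return Optional.of(realm);
    }
}
```

The returned name is still only a candidate: it has to be looked up in tqplatform.tenant.kc_realm before the token's signature is checked against that realm's keys.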

0.3 Multi-Realm JWT Validation

The current JWTValidator is hardcoded to a single OIDCConfig (one issuer, one JWKS endpoint). For realm-per-tenant, it needs to accept tokens from any registered tenant realm.

tqapi/src/main/java/com/perun/tlinq/oidc/JWTValidator.java — refactor:

public class JWTValidator {
    // One processor per realm, lazily created
    private final ConcurrentHashMap<String, ConfigurableJWTProcessor<SecurityContext>> processors
        = new ConcurrentHashMap<>();
    private final String keycloakBaseUrl;  // e.g. "https://auth.tourlinq.com"
    private final String clientId;         // same client ID across all realms

    public ValidatedToken validateToken(String token) throws TokenValidationException {
        // NOTE: parse() and process() throw checked Nimbus exceptions (ParseException,
        // BadJOSEException, JOSEException); wrap them in TokenValidationException
        SignedJWT signedJWT = SignedJWT.parse(token);
        JWTClaimsSet unverifiedClaims = signedJWT.getJWTClaimsSet();
        String issuer = unverifiedClaims.getIssuer();

        // Extract realm from issuer: "https://auth.tourlinq.com/realms/acme-travel"
        String realmName = extractRealmFromIssuer(issuer);

        // The issuer is unverified at this point. Accept it only if it is exactly our
        // own Keycloak URL for that realm, so a forged issuer on another host cannot
        // make the processor fetch keys from a foreign JWKS endpoint
        String expectedIssuer = keycloakBaseUrl + "/realms/" + realmName;
        if (!expectedIssuer.equals(issuer)) throw new TokenValidationException("Untrusted issuer: " + issuer, ...);

        // Verify realm is a known tenant
        TenantInfo tenant = TenantRegistry.instance().getByRealm(realmName);
        if (tenant == null) throw new TokenValidationException("Unknown realm: " + realmName, ...);

        // Get or create processor for this realm (caches JWKS per realm)
        ConfigurableJWTProcessor<SecurityContext> processor = processors.computeIfAbsent(
            realmName, r -> createProcessorForRealm(r, expectedIssuer));

        JWTClaimsSet claims = processor.process(signedJWT, null);
        // ... extract userId, email, name, roles as before ...

        return new ValidatedToken(userId, email, name, roles, expiresAt, token, tenant.getTenantId());
    }
}

tqapi/src/main/java/com/perun/tlinq/oidc/ValidatedToken.java: - Add private final String tenantId; field + getter

tqapi/src/main/java/com/perun/tlinq/oidc/OIDCConfig.java: - Keep as base config with keycloakBaseUrl and clientId - Remove single issuer requirement; each realm has its own issuer

tqapi/src/main/java/com/perun/tlinq/oidc/JWKSManager.java: - Refactor to support multiple JWKS sources (one per realm) - Map<String, JWKSource<SecurityContext>> keyed by realm

0.4 Propagate tenant through RequestContext

tqcommon/src/main/java/com/perun/tlinq/util/RequestContext.java: - Add private final String tenantId; field + getter - Update constructor signature

tqapi/src/main/java/com/perun/tlinq/AuthenticationFilter.java:

  • Extract tenantId from:
      • JWT (Tier 1): ValidatedToken.getTenantId() (resolved from issuer realm)
      • Headers (Tier 2): X-Tenant-ID header
      • Dev mode (Tier 3a): from dev-tenant-id property in tlinqapi.properties
      • Internal API (Tier 0): X-Tenant-ID header (required for cross-service calls)
  • Pass to new RequestContext(userId, userName, userEmail, correlationId, tenantId)
  • Reject authenticated (non-guest) requests without a valid tenant with a 403
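
The per-tier lookup can be sketched as a small pure function. Everything here is illustrative: TenantResolver, the AuthTier enum and the parameter names are hypothetical stand-ins for whatever AuthenticationFilter already uses. One design choice is an assumption worth stating: for JWT requests the X-Tenant-ID header is ignored, so a caller cannot override the tenant that its token's realm implies.

```java
import java.util.Optional;

/** Hypothetical sketch of tenant resolution per authentication tier. */
final class TenantResolver {

    enum AuthTier { INTERNAL_API, JWT, HEADERS, DEV_MODE }

    /** Picks the tenant source that belongs to the tier; blank values count as missing. */
    static Optional<String> resolve(AuthTier tier, String tokenTenantId,
                                    String headerTenantId, String devTenantId) {
        String candidate;
        switch (tier) {
            case JWT:
                candidate = tokenTenantId;   // header deliberately ignored
                break;
            case INTERNAL_API:
            case HEADERS:
                candidate = headerTenantId;  // X-Tenant-ID
                break;
            case DEV_MODE:
                candidate = devTenantId;     // dev-tenant-id property
                break;
            default:
                candidate = null;
        }
        return (candidate == null || candidate.isBlank())
            ? Optional.empty() : Optional.of(candidate);
    }

    /** 403 for an authenticated (non-guest) caller without a tenant, otherwise 200. */
    static int statusFor(boolean guest, Optional<String> tenantId) {
        return (!guest && tenantId.isEmpty()) ? 403 : 200;
    }
}
```

The resolved ID would still need a TenantRegistry.isValid check before it is placed in RequestContext.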

0.5 Frontend Login — Realm-Aware OIDC

The frontend (tqweb-adm) currently initializes its OIDC client with a hardcoded realm URL. With realm-per-tenant:

Option A: Subdomain routing (recommended):

  • Each tenant gets a subdomain: acme.tourlinq.com, bravo.tourlinq.com
  • A lightweight resolver (nginx, or a /auth/config endpoint) maps subdomain → realm
  • Frontend calls GET /auth/config?tenant=acme → returns { issuer: "https://auth.tourlinq.com/realms/acme-travel", clientId: "tqweb-adm" }
  • Frontend initializes OIDC with tenant-specific issuer

Option B: Login page with tenant selector:

  • Single URL, tenant selected before login
  • Less clean but works if subdomains are not feasible

The existing AuthApi.java endpoint POST /auth/config returns OIDC config to the frontend — this needs to be tenant-aware.
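
For the subdomain option, the server-side mapping from Host header to issuer could be as small as the sketch below. It is an assumption-laden illustration: OidcIssuerResolver is a hypothetical class, the subdomain-to-realm map is passed in directly (in the plan it would come from the tenant registry), and only the issuer is returned because the client ID tqweb-adm is the same in every realm.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical resolver: maps a request Host header to the tenant's OIDC issuer. */
final class OidcIssuerResolver {
    private final String keycloakBaseUrl;               // e.g. "https://auth.tourlinq.com"
    private final String baseDomain;                    // e.g. "tourlinq.com"
    private final Map<String, String> realmBySubdomain; // e.g. "acme" -> "acme-travel"

    OidcIssuerResolver(String keycloakBaseUrl, String baseDomain,
                       Map<String, String> realmBySubdomain) {
        this.keycloakBaseUrl = keycloakBaseUrl;
        this.baseDomain = baseDomain;
        this.realmBySubdomain = Map.copyOf(realmBySubdomain);
    }

    Optional<String> issuerForHost(String host) {
        if (host == null) {
            return Optional.empty();
        }
        String h = host.toLowerCase(Locale.ROOT);
        int colon = h.indexOf(':');
        if (colon >= 0) {
            h = h.substring(0, colon);                  // drop ":port"
        }
        String suffix = "." + baseDomain;
        if (!h.endsWith(suffix)) {
            return Optional.empty();                    // not one of our hosts
        }
        String subdomain = h.substring(0, h.length() - suffix.length());
        if (subdomain.isEmpty()) {
            return Optional.empty();
        }
        String realm = realmBySubdomain.get(subdomain);
        return realm == null
            ? Optional.empty()
            : Optional.of(keycloakBaseUrl + "/realms/" + realm);
    }
}
```

Unknown subdomains yield an empty result, which the endpoint can turn into a 404 instead of falling back to a default realm.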

0.5 TenantRegistry and TenantConfig (tqcommon)

New class: tqcommon/src/main/java/com/perun/tlinq/util/TenantRegistry.java

Loads tenant records from tqplatform.tenant at startup and caches them in a ConcurrentHashMap<String, TenantInfo>. Provides:

  • getByTenantId(tenantId) / getByRealm(realm) → TenantInfo (used by JWTValidator and KeycloakAdminClient)
  • getDbName(tenantId) → database name
  • getDbUrl(tenantId) → full JDBC URL
  • getTenantCode(tenantId) → short code
  • isValid(tenantId) → tenant exists and ACTIVE
  • refresh() → reload from DB

Connects to tqplatform DB using a dedicated JDBC connection (separate from tenant DBs).
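
A stripped-down version of the registry, without the JDBC part, could look like this. It is a sketch under stated assumptions: TenantRegistrySketch and its nested TenantInfo are hypothetical, rows are handed to refresh() instead of being read from tqplatform.tenant, and the JDBC URL is assumed to be a fixed prefix plus the database name. It also swaps the ConcurrentHashMap mentioned above for immutable snapshot maps, so that readers never observe a half-reloaded registry.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for TenantRegistry. */
final class TenantRegistrySketch {

    static final class TenantInfo {
        final String tenantId, tenantCode, dbName, kcRealm, status;

        TenantInfo(String tenantId, String tenantCode, String dbName,
                   String kcRealm, String status) {
            this.tenantId = tenantId;
            this.tenantCode = tenantCode;
            this.dbName = dbName;
            this.kcRealm = kcRealm;
            this.status = status;
        }
    }

    private final String jdbcPrefix; // assumed form: "jdbc:postgresql://host:5432/"
    private volatile Map<String, TenantInfo> byId = Map.of();
    private volatile Map<String, TenantInfo> byRealm = Map.of();

    TenantRegistrySketch(String jdbcPrefix) {
        this.jdbcPrefix = jdbcPrefix;
    }

    /** Builds fresh lookup maps and publishes them with two volatile writes. */
    void refresh(Collection<TenantInfo> rows) {
        Map<String, TenantInfo> ids = new HashMap<>();
        Map<String, TenantInfo> realms = new HashMap<>();
        for (TenantInfo t : rows) {
            ids.put(t.tenantId, t);
            realms.put(t.kcRealm, t);
        }
        byId = Map.copyOf(ids);
        byRealm = Map.copyOf(realms);
    }

    TenantInfo getByRealm(String realm) {
        return realm == null ? null : byRealm.get(realm);
    }

    boolean isValid(String tenantId) {
        TenantInfo t = tenantId == null ? null : byId.get(tenantId);
        return t != null && "ACTIVE".equals(t.status);
    }

    String getDbUrl(String tenantId) {
        TenantInfo t = tenantId == null ? null : byId.get(tenantId);
        if (t == null) {
            throw new IllegalArgumentException("Unknown tenant: " + tenantId);
        }
        return jdbcPrefix + t.dbName;
    }
}
```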

New class: tqcommon/src/main/java/com/perun/tlinq/util/TenantConfig.java

This is the key abstraction for per-tenant configuration. It replaces direct calls to AppConfig.getInstance().getProp(key) with a tenant-aware lookup:

public class TenantConfig {
    // Platform-wide defaults loaded once from files (tourlinq.properties + properties.d/)
    private static Properties platformDefaults;
    // Per-tenant overrides loaded from tqplatform.tenant.config JSONB
    private static final ConcurrentHashMap<String, Properties> tenantOverrides = new ConcurrentHashMap<>();

    /**
     * Get a config property for the current tenant.
     * Lookup order: tenant override → platform default → null
     */
    public static String get(String key) {
        String tenantId = RequestContext.current() != null 
            ? RequestContext.current().getTenantId() : null;
        if (tenantId != null) {
            Properties overrides = tenantOverrides.get(tenantId);
            if (overrides != null && overrides.containsKey(key)) {
                return overrides.getProperty(key);
            }
        }
        return platformDefaults.getProperty(key);
    }

    /** Get config for explicit tenant (for background jobs) */
    public static String get(String tenantId, String key) { ... }
}

Migration path: Existing AppConfig.getInstance().getProp(key) calls are gradually replaced with TenantConfig.get(key). Both can coexist during transition — AppConfig continues to work for platform-level properties.

What goes in the tenant.config JSONB column (stored in tqplatform.tenant):

{
  "company.code": "ACME01",
  "tqpro.company.name": "Acme Travel LLC",
  "tqpro.company.logo.url": "https://acme.example.com/logo.png",
  "tqpro.agency.name": "Acme Travel LLC",
  "tqpro.agency.phone": "+1-555-0100",
  "tqpro.agency.email": "info@acmetravel.com",

  "mail.server": "smtp.acmetravel.com",
  "mail.port": "587",
  "mail.user": "bookings@acmetravel.com",
  "mail.password": "encrypted:...",
  "mail.from": "bookings@acmetravel.com",
  "mail.name": "Acme Travel",

  "telr.auth-key": "encrypted:...",
  "telr.store-id": "29876",
  "pgw.callback-base": "https://api.acmetravel.com/tlinq-api",

  "twilio.sid": "ACxxxxxxxxxx",
  "twilio.token": "encrypted:...",
  "twilio.sms.sender": "MGxxxxxxxxxx",
  "twilio.wa.sender": "MGxxxxxxxxxx",

  "odoo.server": "https://erp.acmetravel.com",
  "odoo.db": "acme_prod",
  "odoo.user": "admin",
  "odoo.password": "encrypted:...",

  "amadeus.api.key": "encrypted:...",
  "amadeus.api.secret": "encrypted:...",

  "tiqets.api.key": "encrypted:...",
  "goglobal.api.password": "encrypted:...",
  "ryb2b.token": "encrypted:...",

  "hotel.margin": "8",
  "erp.default.product.HOTEL": "LHSTAY",
  "service.charge.CARD_ONLINE": "percent,3.0,Card Processing Fee",

  "content.cdn-prefix": "https://cdn.acmetravel.com/",
  "s3.path.prefix": "tenants/acme/"
}

Only values that differ from platform defaults need to be specified. The TenantConfig.get() method falls back to platformDefaults (loaded from the file-based configs) for any key not overridden.

0.6 Tenant-Aware DB Sessions (THE critical change)

tqapp/src/main/java/com/perun/tlinq/client/nts/db/NTSDBSession.java:

Replace the single static SessionFactory with a tenant-keyed factory map:

public class NTSDBSession {
    // ID of the first tenant, as seeded in tqplatform.tenant (section 0.1)
    private static final String DEFAULT_TENANT_ID = "00000000-0000-0000-0000-000000000001";
    private static final Map<String, SessionFactory> factories = new ConcurrentHashMap<>();

    static {
        // Initialize factory for default/first tenant
        String defaultDb = TenantRegistry.instance().getDbName(DEFAULT_TENANT_ID);
        factories.put(DEFAULT_TENANT_ID, buildFactory(defaultDb));
    }

    // Entity registrations and Hibernate settings shared by every tenant.
    // Hibernate's Configuration has no copy(), so it is rebuilt for each factory;
    // the connection URL is NOT set here
    private static Configuration newBaseConfiguration() {
        Configuration cfg = new Configuration();
        cfg.configure();
        // ... register all annotated classes (lines 69-204 stay the same) ...
        return cfg;
    }

    private static SessionFactory buildFactory(String dbName) {
        Configuration cfg = newBaseConfiguration();
        cfg.setProperty("hibernate.connection.url",
            ClientConfig.instance().getDB(dbName));
        cfg.setProperty("hibernate.connection.username", ...);
        cfg.setProperty("hibernate.connection.password", ...);
        cfg.setProperty("hibernate.hikari.poolName", "NTS-" + dbName);
        cfg.setProperty("hibernate.hikari.minimumIdle", "2");
        cfg.setProperty("hibernate.hikari.maximumPoolSize", "5");
        return cfg.buildSessionFactory();
    }

    public static Session getSession() {
        String tenantId = RequestContext.current() != null
            ? RequestContext.current().getTenantId() : null;
        // No tenant in context: use the first tenant
        if (tenantId == null) tenantId = DEFAULT_TENANT_ID;

        SessionFactory factory = factories.computeIfAbsent(tenantId, tid -> {
            String dbName = TenantRegistry.instance().getDbName(tid);
            return buildFactory(dbName);
        });
        return factory.openSession();
    }
}

Key design decisions:

  • Lazy factory creation via computeIfAbsent: new tenant pools spin up on first request
  • Small per-tenant pools (2-5 connections) keep total connection count manageable
  • Entity class registrations are identical for every tenant; only the JDBC URL and credentials differ
  • HikariCP pool names carry the tenant's DB name (NTS-{dbName}) for monitoring

Same pattern for other DB sessions (if they exist): - AmdDBSession in tqamds (Amadeus schema) - Any GoGlobal/Tiqets DB sessions

0.7 Safe deployment checkpoint

  • Deploy with existing tlinq DB as the single default tenant
  • All traffic routes to the same DB as before
  • Zero behavioral change — this validates the entire plumbing end-to-end

Phase 1: Tenant Provisioning & Second Tenant

Effort: ~1-2 weeks | Risk: Medium

1.1 Database Cloning Script

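# Create new tenant database from template
pg_dump -Fc tlinq > /tmp/tlinq_template.dump
createdb tlinq_acme
pg_restore -d tlinq_acme /tmp/tlinq_template.dump
# Optionally: create tenant-specific DB user
createuser tlinq_acme_user
# GRANT is SQL, so it has to run through psql rather than the shell.
# It covers database-level privileges only; schema and table privileges
# must be granted separately.
psql -d tlinq_acme -c "GRANT ALL ON DATABASE tlinq_acme TO tlinq_acme_user;"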

Wrap this in a provisioning script/CLI that:

  1. Creates the database from template
  2. Creates optional DB user
  3. Clears the new DB of any data (keeps DDL only) or seeds with demo data
  4. Creates the tenant's Keycloak realm and initial admin user (section 1.3); with realm-per-tenant no tenant_id user attribute is needed
  5. Inserts tenant row into tqplatform.tenant

1.2 DDL Migration Management

Each tenant DB must stay at the same schema version. Options:

  • Simple approach: Loop over all tenant DBs from the registry and apply config/db-changes/ scripts to each
  • Better: Adopt Flyway with a multi-database wrapper that iterates over tqplatform.tenant
  • Add a schema_version column to tqplatform.tenant to track per-tenant migration state
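
For the simple approach, the core of the loop is deciding which scripts a given tenant still needs. The sketch below assumes two things that are not confirmed by the plan: that every script in config/db-changes/ starts with a numeric prefix followed by a hyphen (as 0070-conversation-messages.sql does), and that schema_version stores the prefix of the last applied script as an integer. PendingMigrations is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: selects the migration scripts a tenant DB has not applied yet. */
final class PendingMigrations {

    /** Numeric prefix of a script name such as "0070-conversation-messages.sql". */
    static int versionOf(String scriptName) {
        int dash = scriptName.indexOf('-');
        if (dash <= 0) {
            throw new IllegalArgumentException("No version prefix: " + scriptName);
        }
        return Integer.parseInt(scriptName.substring(0, dash));
    }

    /** Scripts newer than the tenant's recorded version, in ascending version order. */
    static List<String> pendingFor(int tenantSchemaVersion, List<String> scripts) {
        List<String> pending = new ArrayList<>();
        for (String script : scripts) {
            if (versionOf(script) > tenantSchemaVersion) {
                pending.add(script);
            }
        }
        pending.sort(Comparator.comparingInt(PendingMigrations::versionOf));
        return pending;
    }
}
```

The wrapper would call pendingFor once per tenant row, apply the returned scripts to that tenant's database, and update schema_version after each successful script so that a failed run can resume.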

1.3 Automated Keycloak Realm Provisioning

This is the key automation for self-onboarding. When a new tenant signs up, the platform creates a fully configured Keycloak realm automatically.

New class: tqapp/src/main/java/com/perun/tlinq/entity/tenant/KeycloakRealmProvisioner.java

Uses the Keycloak Admin REST API via java.net.http.HttpClient (no new dependencies). Authenticates using a platform-level service account in the master realm with create-realm permission.

Provisioning sequence (called during tenant onboarding):

1. POST /admin/realms                          → Create new realm
2. POST /admin/realms/{realm}/roles            → Create roles: guest, agent, manager, finance, admin
3. POST /admin/realms/{realm}/clients          → Create client "tqweb-adm" (public, OIDC)
4. POST /admin/realms/{realm}/clients          → Create client "tqpro-admin-api" (confidential, service account)
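5. GET  /admin/realms/{realm}/clients/{id}/service-account-user
                                               → Look up the service account's user ID
   POST /admin/realms/{realm}/users/{userId}/role-mappings/clients/{realmManagementClientId}
                                               → Assign realm-management client roles to service account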
6. POST /admin/realms/{realm}/users            → Create initial tenant admin user
7. PUT  /admin/realms/{realm}/users/{id}/execute-actions-email
                                               → Send welcome/set-password email to tenant admin
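
Step 1 of the sequence, built with java.net.http as the plan proposes, might look like the sketch below. RealmRequests is a hypothetical helper; the admin token is assumed to have been obtained beforehand via the client_credentials grant of the platform service account, and the realm JSON is passed in as a ready-made string. The request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for Keycloak Admin REST API requests used during provisioning. */
final class RealmRequests {

    /** POST {keycloakBaseUrl}/admin/realms with the realm representation as JSON body. */
    static HttpRequest createRealm(String keycloakBaseUrl, String adminToken, String realmJson) {
        return HttpRequest.newBuilder()
            .uri(URI.create(keycloakBaseUrl + "/admin/realms"))
            .header("Authorization", "Bearer " + adminToken)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(realmJson))
            .build();
    }
}
```

Sending it is a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and checking for a 201 status; the remaining steps follow the same pattern with different paths and bodies.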

Realm configuration template (applied programmatically):

{
  "realm": "acme-travel",
  "enabled": true,
  "displayName": "Acme Travel",
  "loginTheme": "tqpro",           // shared custom theme
  "sslRequired": "external",
  "registrationAllowed": false,     // users created by tenant admin only
  "resetPasswordAllowed": true,
  "accessTokenLifespan": 300,       // 5 min (same as current)
  "ssoSessionIdleTimeout": 1800
}

Client "tqweb-adm" template (public, for frontend OIDC):

{
  "clientId": "tqweb-adm",
  "publicClient": true,
  "standardFlowEnabled": true,
  "directAccessGrantsEnabled": false,
  "rootUrl": "https://acme.tourlinq.com",
  "redirectUris": ["https://acme.tourlinq.com/*"],
  "webOrigins": ["https://acme.tourlinq.com"],
  "defaultClientScopes": ["openid", "profile", "email", "roles"]
}

Client "tqpro-admin-api" template (confidential, for staff management):

{
  "clientId": "tqpro-admin-api",
  "publicClient": false,
  "serviceAccountsEnabled": true,
  "clientAuthenticatorType": "client-secret",
  "standardFlowEnabled": false,
  "directAccessGrantsEnabled": false
}

The generated client_secret for tqpro-admin-api is stored in tqplatform.tenant.kc_admin_client_secret (encrypted).

Integration with staff management plan: The KeycloakAdminClient from doc/plans/management/staff-management-plan.md (M1) becomes tenant-aware:

  • Instead of a single hardcoded realm, it reads the realm name from TenantRegistry.getByTenantId(tenantId).getKcRealm()
  • Instead of a single client secret, it reads from tenant.kc_admin_client_secret
  • The service account in each tenant's realm has the manage-users, view-users and manage-realm roles, which are client roles of the realm's built-in realm-management client
  • When a tenant admin calls POST /admin/user/create, the KeycloakAdminClient targets that tenant's realm
  • The CStaffMember entity lives in the tenant's own DB, so cross-tenant staff isolation is automatic

Platform-level service account setup (one-time manual):

  • In Keycloak master realm, create client tqpro-platform-admin
  • Service account enabled, client_credentials grant
  • Assign role admin in master realm (grants create-realm and manage-realm for all realms)
  • Store credentials in platform config (env var or tlinqapi.properties)
  • This account is used only during tenant provisioning, not for day-to-day operations

1.4 Complete Tenant Onboarding Flow

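Whether triggered by self-service signup or manual admin action, the onboarding runs these steps in order as a single all-or-nothing operation (no transaction can span PostgreSQL and Keycloak, so a failure is handled by explicit cleanup):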

┌─────────────────────────────────────────────────────┐
│ 1. Validate tenant code (unique, safe for realm name) │
│ 2. Create PostgreSQL database (pg_dump/pg_restore)    │
│ 3. Create Keycloak realm (KeycloakRealmProvisioner)   │
│    - Realm + roles + clients + service account        │
│ 4. Create initial tenant admin user in Keycloak       │
│ 5. Insert tenant row into tqplatform.tenant           │
│    - db_name, kc_realm, kc_admin_client_secret, config│
│ 6. Send welcome email to tenant admin                 │
│    - Contains login URL: https://{tenant}.tourlinq.com│
│    - Keycloak sends set-password email                │
│ 7. Refresh TenantRegistry cache                       │
└─────────────────────────────────────────────────────┘

If any step fails, previous steps are rolled back (delete realm, drop database, remove tenant row).
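
The rollback rule can be implemented with a stack of compensating actions: each completed step registers how to undo itself, and a failure unwinds the stack in reverse order. The sketch below shows the pattern only; OnboardingSaga and its Step interface are hypothetical, and the real steps (create database, create realm, insert row) would be supplied by the provisioning code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical runner for onboarding steps with compensating rollback. */
final class OnboardingSaga {

    /** A provisioning step that may fail with a checked exception. */
    interface Step {
        void run() throws Exception;
    }

    // Undo actions of the steps completed so far, most recent first
    private final Deque<Runnable> compensations = new ArrayDeque<>();

    /** Runs a step and, only if it succeeded, remembers how to undo it. */
    void step(Step action, Runnable undo) throws Exception {
        action.run();
        compensations.push(undo);
    }

    /** Undoes completed steps in reverse order; one failing undo does not stop the rest. */
    void rollback() {
        while (!compensations.isEmpty()) {
            Runnable undo = compensations.pop();
            try {
                undo.run();
            } catch (RuntimeException e) {
                // A real implementation would log this for manual cleanup
            }
        }
    }
}
```

A caller wraps the sequence in try/catch and invokes rollback() from the catch block, so a failure while inserting the tenant row deletes the realm first and then drops the database.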

Self-service onboarding endpoint (future, requires public-facing signup page):

POST /platform/tenant/signup
Body: { tenantCode, tenantName, adminEmail, adminName }
→ Runs the onboarding flow above
→ Returns: { tenantId, loginUrl, message: "Check your email to set your password" }

This endpoint would be rate-limited and require email verification before the realm is created.

1.5 Onboard Second Tenant — Proof of Concept

  • Run provisioning (manual CLI or admin API)
  • Verify: new Keycloak realm exists with correct clients and roles
  • Verify: tenant admin can log in via https://{tenant}.tourlinq.com
  • Verify: tenant admin can create staff users from the User Management page (staff-management-plan M4)
  • Verify: data isolation — bookings created by Tenant A are invisible to Tenant B

Phase 2: Cache Partitioning

Effort: ~1-2 weeks | Risk: Low

All in-memory caches must be tenant-scoped. Simplest approach: prefix cache keys with tenant ID.

| Cache | File | Change |
|---|---|---|
| StaticMapCache | tqapp/.../cache/StaticMapCache.java | Key = tenantId::cacheName |
| PricelistCache | tqapp/.../cache/PricelistCache.java | Delegate to tenant-prefixed StaticMapCache |
| ProductCache | tqapp/.../cache/ProductCache.java | Same |
| SupplierCache | tqapp/.../cache/SupplierCache.java | Same |
| CartHolder | tqapp/.../cart/CartHolder.java | Key = tenantId::sessionId |
| TlinqClusterCache | tqcommon/.../cache/TlinqClusterCache.java | Hazelcast map name = tenantId::mapName |

Utility class: TenantCacheKey.of(name) → RequestContext.current().getTenantId() + "::" + name
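
A sketch of that utility follows. To keep it self-contained the tenant ID is passed in explicitly instead of being read from RequestContext, which is a deviation from the one-argument of(name) described above. The check that the tenant ID itself contains no "::" is an added safeguard, not something the plan specifies.

```java
/** Hypothetical sketch of the tenant-scoped cache key utility. */
final class TenantCacheKey {
    private static final String SEPARATOR = "::";

    /** Builds "tenantId::name"; fails fast instead of producing an unscoped key. */
    static String of(String tenantId, String name) {
        if (tenantId == null || tenantId.isEmpty()) {
            throw new IllegalStateException("No tenant in context for cache key " + name);
        }
        if (tenantId.contains(SEPARATOR)) {
            // Would make keys ambiguous, e.g. "a::b" + "c" vs "a" + "b::c"
            throw new IllegalArgumentException("Tenant ID must not contain " + SEPARATOR);
        }
        return tenantId + SEPARATOR + name;
    }
}
```

Throwing when the tenant is missing matters more than the key format: a silent fallback to an unprefixed key would put one tenant's data back into a shared cache entry.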


Phase 3: Configuration & Credential Isolation

Effort: ~2-3 weeks | Risk: Medium

Each tenant has its own credentials, business rules, and branding. All per-tenant values are stored in tqplatform.tenant.config JSONB and accessed via TenantConfig.get(key) (introduced in Phase 0.5).

3.1 Plugin Config Classes — Tenant-Aware Credential Lookup

| Plugin | Current Singleton | Change |
|---|---|---|
| Odoo | OdooClientConfig.getInstance() | Server URL, DB, user, password → TenantConfig.get("odoo.*") |
| Amadeus | AmadeusClientConfig.getInstance() | API key/secret → TenantConfig.get("amadeus.*") |
| Rayna B2B | RaynaClientConfig singleton | Token, server → TenantConfig.get("ryb2b.*") |
| Tiqets | TiqetsClientConfig singleton | API key → TenantConfig.get("tiqets.*") |
| GoGlobal | GoGlobalClientConfig singleton | Password, agency ID → TenantConfig.get("goglobal.*") |
| Google Flights | GoogleFlightsClientConfig singleton | RapidAPI key → TenantConfig.get("gf.*") |

Important: The structural parts of plugin configs (service class mappings in nts-client.xml, odoo-client.properties) remain shared — loaded once from files. Only credentials and connection URLs become tenant-specific.

3.2 Other Tenant-Specific Settings

| Category | Config Keys | Notes |
|---|---|---|
| Mail/SMTP | mail.server, mail.port, mail.user, mail.password, mail.from, mail.name | Each tenant sends from own domain |
| Payment Gateway | telr.auth-key, telr.store-id, pgw.callback-base | Own merchant account |
| Twilio/WhatsApp | twilio.sid, twilio.token, twilio.sms.sender, twilio.wa.sender | Own Twilio account |
| Company Branding | tqpro.company.name, tqpro.company.logo.url, tqpro.agency.* | Tenant identity |
| Business Rules | hotel.margin, erp.default.product.*, service.charge.* | Agency-specific pricing |
| Content/CDN | content.cdn-prefix, s3.path.prefix | Tenant-specific storage paths |
| AI Integration | ai.api.key | Separate API key per tenant (optional) |

All accessed via TenantConfig.get(key) with platform defaults as fallback.

3.3 AppConfig Migration Path

  1. TenantConfig loads platform defaults from the same files AppConfig uses
  2. Gradually replace AppConfig.getInstance().getProp(key) with TenantConfig.get(key) in tenant-sensitive code
  3. AppConfig remains for truly platform-level properties (TLINQ_HOME, config path resolution)
  4. A grep of AppConfig.getInstance().getProp identifies all ~N call sites to migrate

3.4 Credential Encryption

Sensitive values are stored in the JSONB with an encrypted: prefix. TenantConfig.get() decrypts them transparently, using an application-level AES-256 key read from the TQPRO_ENCRYPTION_KEY env var.
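
A minimal sketch of the prefix-based scheme, using only the JDK's javax.crypto. Several details are assumptions not fixed by the plan: AES in GCM mode, a random 12-byte IV stored in front of the ciphertext, Base64 encoding of the stored value, and a Base64-encoded key in the env var. The class name ConfigCryptoSketch is hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;

/** Encrypts and decrypts config values marked with an "encrypted:" prefix. */
final class ConfigCryptoSketch {
    static final String PREFIX = "encrypted:";
    private static final int IV_BYTES = 12;   // assumed: standard GCM nonce size
    private static final int TAG_BITS = 128;  // assumed: full-length auth tag
    private final SecretKeySpec key;

    ConfigCryptoSketch(byte[] rawKey) {
        if (rawKey.length != 32) {
            throw new IllegalArgumentException("AES-256 needs a 32-byte key");
        }
        this.key = new SecretKeySpec(rawKey, "AES");
    }

    /** Assumes the env var holds the key Base64-encoded. */
    static ConfigCryptoSketch fromEnv() {
        String encoded = System.getenv("TQPRO_ENCRYPTION_KEY");
        if (encoded == null) {
            throw new IllegalStateException("TQPRO_ENCRYPTION_KEY is not set");
        }
        return new ConfigCryptoSketch(Base64.getDecoder().decode(encoded));
    }

    String encrypt(String plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return PREFIX + Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    /** Returns plain values unchanged; decrypts values carrying the prefix. */
    String reveal(String stored) {
        if (stored == null || !stored.startsWith(PREFIX)) {
            return stored;
        }
        try {
            byte[] in = Base64.getDecoder().decode(stored.substring(PREFIX.length()));
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }
}
```

GCM is chosen here because it authenticates the ciphertext, so a tampered or wrong-key value fails loudly instead of yielding garbage credentials.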


Phase 4: S3 Document Storage Isolation

Effort: ~1 week | Risk: Low

  • tqapp/.../media/BookingDocumentStorageService.java: Prefix all S3 paths with tenant path prefix
  • Current: bookings/{bookingId}/documents/{file}
  • New: {tenantCode}/bookings/{bookingId}/documents/{file}
  • MediaS3Config singleton is fine (single AWS account) — only paths change
  • Tenant prefix comes from TenantRegistry.getTenantCode(tenantId)
  • Existing files: one-time migration to move under tenant prefix
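
The path change can be sketched as a small key builder. The class name TenantS3Keys, the long type for bookingId, and the validation rules are assumptions; only the two key layouts come from the plan.

```java
/** Builds tenant-prefixed S3 object keys. Hypothetical helper, not the existing service. */
final class TenantS3Keys {

    /** New layout: {tenantCode}/bookings/{bookingId}/documents/{file}. */
    static String bookingDocumentKey(String tenantCode, long bookingId, String fileName) {
        requireSegment(tenantCode, "tenantCode");
        requireSegment(fileName, "fileName");
        return tenantCode + "/bookings/" + bookingId + "/documents/" + fileName;
    }

    /** Target key for the one-time migration of a file stored under the old layout. */
    static String migratedKey(String tenantCode, String legacyKey) {
        requireSegment(tenantCode, "tenantCode");
        return tenantCode + "/" + legacyKey;
    }

    /** Rejects values that could escape the tenant's prefix. */
    private static void requireSegment(String value, String name) {
        if (value == null || value.isBlank() || value.contains("/")
                || value.equals(".") || value.equals("..")) {
            throw new IllegalArgumentException("Invalid " + name + ": " + value);
        }
    }
}
```

Validating each segment keeps a crafted file name from producing a key that lands under another tenant's prefix.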

Phase 5: WhatsApp/Messaging Isolation

Effort: ~2 weeks | Risk: Medium

The Python WhatsApp service (tqwhatsapp/) connects to the same PG server. With database-per-tenant:

  • Java → Python internal API calls: Include X-Tenant-ID header (from RequestContext)
  • Python service: Accept X-Tenant-ID, resolve tenant DB name from registry, connect to that DB's tqwa schema
  • No schema cloning needed — each tenant's DB already has its own tqwa schema
  • WhatsApp Business Account: Each tenant gets own phone number ID from tenant.config->'wa_phone_id'
  • Conversation isolation is automatic — queries hit the tenant's DB, so phone number collisions across tenants are impossible
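
On the Java side, attaching the tenant to an internal call might look like the sketch below, which uses the JDK's java.net.http API. The /messages path, the class name WhatsAppRequests, and passing the tenant ID as a parameter (rather than reading it from RequestContext) are assumptions for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds internal requests to the WhatsApp service with the tenant header set. */
final class WhatsAppRequests {
    static final String TENANT_HEADER = "X-Tenant-ID";

    /** Hypothetical endpoint: POST {serviceBase}/messages with a JSON body. */
    static HttpRequest sendMessage(URI serviceBase, String tenantId, String jsonBody) {
        if (tenantId == null || tenantId.isBlank()) {
            // Refuse to call the service without a tenant: it could not pick a DB.
            throw new IllegalStateException("No tenant available for WhatsApp call");
        }
        return HttpRequest.newBuilder(serviceBase.resolve("/messages"))
                .header(TENANT_HEADER, tenantId)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The request would then be sent with an HttpClient; the Python service is expected to reject any call that lacks the header instead of falling back to a default database.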

Phase 6: Frontend Tenant Awareness

Effort: ~1 week | Risk: Low

  • Minimal changes: JWT carries tenant_id; API isolation is entirely server-side
  • Add tenant name/branding to the UI header (new /auth/tenant-info endpoint)
  • Optional: per-tenant logo, color theme, company name from tenant.config
  • api-roles.properties: No change — roles are the same across tenants

Phase 7: Defense-in-Depth Safety

Effort: ~1 week | Risk: Low

TenantAssert utility (new class in tqcommon):

  • requireTenant(): throws if no tenant in RequestContext
  • requireDbMatch(Session): verifies session's JDBC URL matches expected tenant DB

Insert assertions into:

  • NTSEntityWriteService.create(), .write(), .delete()
  • NTSEntityReadService.read(), .search()
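
A minimal sketch of the two assertions. To stay self-contained it replaces RequestContext with a ThreadLocal stand-in (TenantContextStub) and takes the JDBC URL as a string instead of a Hibernate Session; both substitutions, and the assumption of PostgreSQL URLs of the form jdbc:postgresql://host:port/dbname, are illustrative only.

```java
/** Stand-in for RequestContext: holds the current tenant per thread. Hypothetical. */
final class TenantContextStub {
    private static final ThreadLocal<String> TENANT = new ThreadLocal<>();
    static String currentTenant() { return TENANT.get(); }
    static void set(String tenantId) { TENANT.set(tenantId); }
    static void clear() { TENANT.remove(); }
}

/** Sketch of the TenantAssert safety checks. */
final class TenantAssertSketch {

    /** Throws if the current thread carries no tenant; otherwise returns it. */
    static String requireTenant() {
        String tenant = TenantContextStub.currentTenant();
        if (tenant == null || tenant.isBlank()) {
            throw new IllegalStateException("No tenant in request context");
        }
        return tenant;
    }

    /** Throws if the JDBC URL does not point at the expected tenant database. */
    static void requireDbMatch(String jdbcUrl, String expectedDbName) {
        String actual = databaseName(jdbcUrl);
        if (!expectedDbName.equals(actual)) {
            throw new IllegalStateException(
                    "Session is bound to '" + actual + "', expected '" + expectedDbName + "'");
        }
    }

    /** Extracts the database name: the last path segment, without query parameters. */
    static String databaseName(String jdbcUrl) {
        String withoutParams = jdbcUrl.split("\\?", 2)[0];
        return withoutParams.substring(withoutParams.lastIndexOf('/') + 1);
    }
}
```

A URL in an unexpected format yields a name that does not match, so the check fails closed rather than letting the operation proceed.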

Background jobs utility: TenantScope.run(tenantId, Runnable) — sets RequestContext for code running outside JAX-RS requests (scheduled maintenance in BookingMaintenanceApi, Hazelcast listeners, etc.)
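
One way TenantScope.run could behave is sketched below: set the tenant for the duration of the task and always restore the previous state, even when the task throws. The ThreadLocal here stands in for RequestContext, and the class name TenantScopeSketch is hypothetical.

```java
/** Runs a task with a tenant bound to the current thread, then restores prior state. */
final class TenantScopeSketch {
    // Stand-in for the tenant slot in RequestContext.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static String currentTenant() {
        return CURRENT.get();
    }

    static void run(String tenantId, Runnable task) {
        if (tenantId == null || tenantId.isBlank()) {
            throw new IllegalArgumentException("tenantId is required");
        }
        String previous = CURRENT.get();
        CURRENT.set(tenantId);
        try {
            task.run();
        } finally {
            // Restore rather than blindly clear, so nested scopes behave correctly.
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }
}
```

A scheduled job would then loop over tenants and wrap each unit of work, for example `TenantScopeSketch.run(tenantId, () -> runMaintenance())`, where runMaintenance is a placeholder. A plain ThreadLocal does not propagate to threads the task itself starts, so work handed off to an executor needs its own scope.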

Startup health check: On boot, verify all tenant DBs are reachable and at the expected schema version.


Key Risks & Mitigations

| Risk | Mitigation |
| --- | --- |
| Connection pool explosion (many tenants) | Small per-tenant pools (2-5); lazy initialization; pool eviction for idle tenants |
| Missed getSession() call without tenant context | TenantAssert.requireTenant() safety net; grep audit of all call sites |
| Background jobs without RequestContext | TenantScope.run(tenantId, Runnable) wraps all async/scheduled code |
| Schema DDL drift across tenant DBs | Migration wrapper iterates all tenant DBs; schema_version tracking in registry |
| Hibernate Configuration per tenant | Build from scratch per tenant (entity class list is deterministic); cache the SessionFactory instances |
| Keycloak realm sprawl | Realm count = tenant count; Keycloak handles hundreds of realms efficiently. Monitor realm count. Deprovisioned tenants → disable realm. |
| JWTValidator accepting tokens from unknown realms | validateToken() checks TenantRegistry.getByRealm(realmName) before processing and rejects tokens from unknown/deprovisioned tenants |
| Realm provisioning failure mid-onboarding | Transactional onboarding with rollback (delete realm, drop DB, remove tenant row) if any step fails |
| Staff management plan compatibility | KeycloakAdminClient (from staff-management-plan M1) becomes tenant-aware: reads realm name and service account secret per tenant from TenantRegistry. No architectural conflict: same API, scoped to tenant's realm. |
| Per-tenant Odoo/Amadeus outages | Natural blast radius isolation: one tenant's external system being down doesn't affect others |
| Config migration completeness | Grep all AppConfig.getInstance().getProp() calls to identify every property access point. Missing a migration means a tenant gets the platform default, which is safe but possibly wrong. Track migration in a checklist. |
| Shared file-based configs accidentally treated as per-tenant | Clear separation: file-based configs = shared structural/platform; tqplatform.tenant.config JSONB = per-tenant overrides. TenantConfig.get() provides the layered lookup. |

Verification Plan

  1. Phase 0 smoke test: Deploy with single tenant (existing DB), run full regression — zero behavioral change expected
  2. Phase 1 isolation test: Onboard second tenant, create bookings/customers in both, verify via direct DB queries that data is in separate databases
  3. Cross-tenant penetration test: Use Tenant A's JWT to call APIs — verify only Tenant A's data is returned. Attempt to pass Tenant B's entity IDs — should get "not found"
  4. Cache isolation test: Populate caches for Tenant A, switch to Tenant B context, verify caches are empty/independent
  5. External system test: Configure different Odoo instances per tenant, verify correct routing
  6. Document access test: Upload documents for Tenant A, verify Tenant B can't access them
  7. Connection pool test: Monitor HikariCP metrics under multi-tenant load — verify pool sizes stay within limits
  8. Config isolation test: Set different mail.from, hotel.margin, tqpro.agency.name per tenant; verify each tenant's emails, pricing, and branding use correct values
  9. Config fallback test: Omit a property from tenant config; verify it falls back to platform default from tourlinq.properties
  10. Migration audit test: Verify all AppConfig.getInstance().getProp() call sites handling tenant-specific data have been migrated to TenantConfig.get()