TQPro Amadeus Client Observability Implementation Guide
Prometheus + Grafana Monitoring Setup
Status: Implementation Planned
Target Module: tqamds
Created: 2025-11-23
Objective: Implement comprehensive observability for the TQPro Amadeus client, covering remote API and database metrics
Table of Contents
- Overview
- Prerequisites
- Phase 1: Infrastructure Setup
- Phase 2: Code Implementation - Core Metrics
- Phase 3: Code Implementation - API Instrumentation
- Phase 4: Code Implementation - Database Instrumentation
- Phase 5: Prometheus Configuration
- Phase 6: Grafana Dashboard Setup
- Phase 7: Testing & Validation
- Metrics Reference
- Troubleshooting
Overview
This guide implements observability for the TQPro Amadeus Client module (tqamds) to track:
- Remote API Metrics - Amadeus API call success/failure rates, response times, error classifications
- Database Metrics - Query execution times, record counts, transaction durations
- Service Health - Service availability, connection pool status, cache hit rates
Architecture
┌──────────────────┐      ┌──────────────┐      ┌─────────────┐
│  TQPro Amadeus   │─────>│  Prometheus  │─────>│   Grafana   │
│  Client Services │ HTTP │  (Scraper)   │ HTTP │ (Dashboard) │
│  (tqamds)        │      │              │      │             │
└──────────────────┘      └──────────────┘      └─────────────┘
         │
         ├─> Amadeus API (external)
         └─> PostgreSQL DB (cache)
Module Structure
tqamds is the Amadeus API client integration that:
- Wraps the Amadeus Java SDK for flight and hotel operations
- Uses Hibernate ORM for local database caching (airports, hotels)
- Implements a service factory pattern for creating API service instances
- Has 7 service classes handling different API operations
Key Services:
- AmdFlightSearchService - Flight search and pricing (16 remote API calls, 3 DB operations)
- AmadeusFlightOrderService - Flight booking operations (4 remote API calls)
- AmdHotelOfferSearchService - Hotel search (2 remote API calls)
- AmdHotelRefDataService - Hotel reference data
- AmdRefDataService - Airlines and location reference data (3 remote API calls)
Technology Stack
- Metrics Library: Micrometer (vendor-neutral facade)
- Metrics Format: Prometheus
- Storage: Prometheus TSDB
- Visualization: Grafana
- ORM: Hibernate 6.x
- API Client: Amadeus Java SDK
Prerequisites
Required Software
- JDK 11+ (already installed)
- Docker & Docker Compose (for Prometheus/Grafana)
- Gradle (already configured in project)
- PostgreSQL (for Hibernate metrics)
Required Knowledge
- Understanding of Amadeus API operations
- Hibernate ORM and session management
- Prometheus query language (PromQL)
- Java reflection and dynamic proxies (for instrumentation)
Estimated Time
- Phase 1 (Infrastructure): 1-2 hours
- Phase 2 (Core Metrics): 2-3 hours
- Phase 3 (API Instrumentation): 4-5 hours
- Phase 4 (DB Instrumentation): 3-4 hours
- Phase 5 (Prometheus Config): 1 hour
- Phase 6 (Grafana Dashboards): 3-4 hours
- Phase 7 (Testing): 2-3 hours
Total: 16-22 hours
Phase 1: Infrastructure Setup
Step 1.1: Add Micrometer Dependencies
File: tqamds/build.gradle.kts
Location: Lines 12-19 (dependencies block)
dependencies {
testImplementation(platform("org.junit:junit-bom:5.10.0"))
testImplementation("org.junit.jupiter:junit-jupiter")
implementation(project(":tqapp"))
implementation(project(":tqcommon"))
// ADD: Micrometer metrics dependencies
implementation("io.micrometer:micrometer-core:1.12.0")
implementation("io.micrometer:micrometer-registry-prometheus:1.12.0")
// ADD: Hibernate metrics integration
implementation("org.hibernate.orm:hibernate-micrometer:6.4.0.Final")
// ADD: Prometheus exposition format
implementation("io.prometheus:simpleclient_httpserver:0.16.0")
}
Rationale:
- micrometer-core - Core metrics API
- micrometer-registry-prometheus - Prometheus format support
- hibernate-micrometer - Automatic Hibernate session metrics
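- simpleclient_httpserver - Prometheus' standalone HTTP exporter. MetricsHttpServer in Step 2.2 uses the JDK's built-in com.sun.net.httpserver instead, so this dependency is only needed if you switch to the Prometheus exporter.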
Step 1.2: Rebuild Project
Expected Output:
Step 1.3: Docker Compose for Prometheus & Grafana
File: tqamds/docker-compose.observability.yml (NEW FILE)
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:v2.48.0
    container_name: tqamds-prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      # Rule file referenced by rule_files in prometheus.yml (created in Step 5.2)
      - ./alerts.yml:/etc/prometheus/alerts.yml
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    networks:
      - monitoring
  grafana:
    image: grafana/grafana:10.2.0
    container_name: tqamds-grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SERVER_ROOT_URL=http://localhost:3000
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    networks:
      - monitoring
    depends_on:
      - prometheus
volumes:
  prometheus-data:
  grafana-data:
networks:
  monitoring:
    driver: bridge
Step 1.4: Start Infrastructure
cd tqamds
docker compose -f docker-compose.observability.yml up -d
Verify Services:
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (admin/admin)
Phase 2: Code Implementation - Core Metrics
Step 2.1: Create Metrics Manager Singleton
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/metrics/AmadeusMetricsManager.java (NEW FILE)
/*
* Copyright (c) 2025. Perun Consulting Services FZ LLE
*/
package com.perun.tlinq.client.amadeus.metrics;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;
/**
* Singleton manager for Amadeus client metrics.
* Provides centralized access to Micrometer registry and metric creation.
*/
public class AmadeusMetricsManager {
private static final Logger logger = Logger.getLogger(AmadeusMetricsManager.class.getName());
private static volatile AmadeusMetricsManager instance;
private final PrometheusMeterRegistry registry;
private final ConcurrentHashMap<String, Counter> counterCache = new ConcurrentHashMap<>();
private final ConcurrentHashMap<String, Timer> timerCache = new ConcurrentHashMap<>();
private AmadeusMetricsManager() {
this.registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
logger.info("AmadeusMetricsManager initialized with Prometheus registry");
}
public static AmadeusMetricsManager getInstance() {
if (instance == null) {
synchronized (AmadeusMetricsManager.class) {
if (instance == null) {
instance = new AmadeusMetricsManager();
}
}
}
return instance;
}
/**
* Get the underlying Micrometer registry.
* Used for integration with Hibernate metrics and custom instrumentation.
*/
public MeterRegistry getRegistry() {
return registry;
}
/**
* Scrape metrics in Prometheus text format.
* Called by HTTP endpoint to expose metrics.
*/
public String scrape() {
return registry.scrape();
}
/**
* Increment a counter metric.
* Automatically creates counter if it doesn't exist (cached).
*
* @param name Metric name (e.g., "amadeus.api.requests.total")
* @param tags Tag pairs: key1, value1, key2, value2, ...
*/
public void incrementCounter(String name, String... tags) {
String cacheKey = buildCacheKey(name, tags);
Counter counter = counterCache.computeIfAbsent(cacheKey, k ->
Counter.builder(name)
.tags(tags)
.register(registry)
);
counter.increment();
}
/**
* Increment a counter by a specific amount.
*
* @param name Metric name
* @param amount Amount to increment
* @param tags Tag pairs
*/
public void incrementCounter(String name, double amount, String... tags) {
String cacheKey = buildCacheKey(name, tags);
Counter counter = counterCache.computeIfAbsent(cacheKey, k ->
Counter.builder(name)
.tags(tags)
.register(registry)
);
counter.increment(amount);
}
/**
* Start a timer sample.
* Use with recordTimer() to measure duration.
*
* @return Timer.Sample to be stopped later
*/
public Timer.Sample startTimer() {
return Timer.start(registry);
}
/**
* Record a timer sample completion.
*
* @param sample The timer sample started earlier
* @param name Timer metric name
* @param tags Tag pairs
*/
public void recordTimer(Timer.Sample sample, String name, String... tags) {
if (sample == null) {
logger.warning("Attempted to record null timer sample for metric: " + name);
return;
}
String cacheKey = buildCacheKey(name, tags);
Timer timer = timerCache.computeIfAbsent(cacheKey, k ->
Timer.builder(name)
.tags(tags)
.publishPercentiles(0.5, 0.95, 0.99)
.publishPercentileHistogram()
.register(registry)
);
sample.stop(timer);
}
/**
* Record a specific duration in milliseconds.
*
* @param name Timer metric name
* @param durationMs Duration in milliseconds
* @param tags Tag pairs
*/
public void recordDuration(String name, long durationMs, String... tags) {
String cacheKey = buildCacheKey(name, tags);
Timer timer = timerCache.computeIfAbsent(cacheKey, k ->
Timer.builder(name)
.tags(tags)
.publishPercentiles(0.5, 0.95, 0.99)
.publishPercentileHistogram()
.register(registry)
);
timer.record(java.time.Duration.ofMillis(durationMs));
}
/**
* Build a cache key from metric name and tags.
*/
private String buildCacheKey(String name, String... tags) {
StringBuilder sb = new StringBuilder(name);
if (tags != null) {
for (String tag : tags) {
sb.append(":").append(tag);
}
}
return sb.toString();
}
}
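Two details of the manager are easy to trip over. Micrometer's tag varargs are key/value pairs, so an odd number of elements is rejected when the meter is built, and buildCacheKey joins everything with ":", so tag values that themselves contain ":" can map two different meters to one cache entry. The following is a minimal stdlib-only sketch of a stricter key builder; MetricKeys and its separator choice are hypothetical and not part of tqamds.

```java
import java.util.StringJoiner;

/** Hypothetical helper, not part of tqamds: builds collision-resistant cache keys. */
final class MetricKeys {
    private MetricKeys() {}

    /** Builds a cache key from the metric name and key/value tag pairs. */
    static String cacheKey(String name, String... tags) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("metric name must not be empty");
        }
        if (tags == null) {
            return name;
        }
        if (tags.length % 2 != 0) {
            // Fail early with a clear message instead of deep inside meter registration.
            throw new IllegalArgumentException(
                    "tags must be key/value pairs, got " + tags.length + " elements");
        }
        // The unit separator (U+001F) is unlikely to occur in tag values, unlike ':'.
        StringJoiner joiner = new StringJoiner("\u001F");
        joiner.add(name);
        for (String tag : tags) {
            joiner.add(String.valueOf(tag));
        }
        return joiner.toString();
    }
}
```

With this builder, the tag lists ("a:b", "c") and ("a", "b:c") produce different keys, whereas plain ":" concatenation would merge them.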
Step 2.2: Create Metrics HTTP Endpoint
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/metrics/MetricsHttpServer.java (NEW FILE)
/*
* Copyright (c) 2025. Perun Consulting Services FZ LLE
*/
package com.perun.tlinq.client.amadeus.metrics;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.logging.Logger;
/**
* Simple HTTP server exposing Prometheus metrics endpoint.
* Runs on port 8081 by default.
*/
public class MetricsHttpServer {
private static final Logger logger = Logger.getLogger(MetricsHttpServer.class.getName());
private static final int DEFAULT_PORT = 8081;
private static volatile MetricsHttpServer instance;
private HttpServer server;
private final int port;
private MetricsHttpServer(int port) {
this.port = port;
}
public static MetricsHttpServer getInstance() {
return getInstance(DEFAULT_PORT);
}
public static MetricsHttpServer getInstance(int port) {
if (instance == null) {
synchronized (MetricsHttpServer.class) {
if (instance == null) {
instance = new MetricsHttpServer(port);
}
}
}
return instance;
}
/**
* Start the HTTP server exposing /metrics endpoint.
*/
public void start() throws IOException {
if (server != null) {
logger.warning("Metrics HTTP server already started");
return;
}
server = HttpServer.create(new InetSocketAddress(port), 0);
server.createContext("/metrics", new MetricsHandler());
server.createContext("/health", new HealthHandler());
server.setExecutor(null); // Use default executor
server.start();
logger.info("Metrics HTTP server started on port " + port);
logger.info("Metrics endpoint: http://localhost:" + port + "/metrics");
logger.info("Health endpoint: http://localhost:" + port + "/health");
}
/**
* Stop the HTTP server.
*/
public void stop() {
if (server != null) {
server.stop(0);
server = null;
logger.info("Metrics HTTP server stopped");
}
}
/**
* Handler for /metrics endpoint.
*/
static class MetricsHandler implements HttpHandler {
@Override
public void handle(HttpExchange exchange) throws IOException {
if (!"GET".equals(exchange.getRequestMethod())) {
exchange.sendResponseHeaders(405, -1);
return;
}
try {
String metrics = AmadeusMetricsManager.getInstance().scrape();
byte[] response = metrics.getBytes(StandardCharsets.UTF_8);
exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4");
exchange.sendResponseHeaders(200, response.length);
try (OutputStream os = exchange.getResponseBody()) {
os.write(response);
}
} catch (Exception e) {
logger.severe("Error scraping metrics: " + e.getMessage());
exchange.sendResponseHeaders(500, -1);
}
}
}
/**
* Handler for /health endpoint.
*/
static class HealthHandler implements HttpHandler {
@Override
public void handle(HttpExchange exchange) throws IOException {
String response = "{\"status\":\"UP\"}";
byte[] responseBytes = response.getBytes(StandardCharsets.UTF_8);
exchange.getResponseHeaders().set("Content-Type", "application/json");
exchange.sendResponseHeaders(200, responseBytes.length);
try (OutputStream os = exchange.getResponseBody()) {
os.write(responseBytes);
}
}
}
}
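Once the server is running, a quick programmatic check of the endpoint can catch wiring mistakes before Prometheus is involved. The sketch below uses only the JDK 11 HTTP client; MetricsSmokeCheck is a hypothetical class, and startStub exists only so the example can run without the plugin by serving a fixed body on a free port.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Hypothetical smoke check, not part of tqamds. */
final class MetricsSmokeCheck {
    private MetricsSmokeCheck() {}

    /** Fetches the given URL and returns the body, failing on any non-200 status. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode() + " from " + url);
        }
        return response.body();
    }

    /** Starts a throwaway server on a free port that serves a fixed body at /metrics. */
    static HttpServer startStub(String body) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/metrics", exchange -> {
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(bytes);
            }
        });
        server.start();
        return server;
    }
}
```

Against a running plugin, the call would be MetricsSmokeCheck.fetch("http://localhost:8081/metrics"), assuming the default port from this step.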
Step 2.3: Initialize Metrics in Plugin
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/framework/AmadeusPlugin.java
Modification: Update the initializePlugin() method
Location: Lines 24-36
Original Code:
@Override
public void initializePlugin() {
logger.info("Amadeus plugin initializing.");
if(acc != null) {
try {
AmadeusServiceFactory asf = AmadeusServiceFactory.getInstance();
} catch (TlinqClientException ex) {
logger.severe("Cannot initialize Amadeus service factory.");
}
} else {
logger.severe("Amadeus plugin not initialized - error reading configuration.");
}
}
Replace with:
@Override
public void initializePlugin() {
logger.info("Amadeus plugin initializing.");
// Initialize metrics manager
try {
AmadeusMetricsManager.getInstance();
logger.info("Amadeus metrics manager initialized");
// Start metrics HTTP server
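// acc may be null when the configuration could not be read, so fall back to the default port
int metricsPort = acc != null ? Integer.parseInt(acc.getProperty("metrics.port", "8081")) : 8081;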
MetricsHttpServer.getInstance(metricsPort).start();
logger.info("Amadeus metrics endpoint started on port " + metricsPort);
} catch (Exception ex) {
logger.severe("Failed to initialize metrics: " + ex.getMessage());
}
if(acc != null) {
try {
AmadeusServiceFactory asf = AmadeusServiceFactory.getInstance();
} catch (TlinqClientException ex) {
logger.severe("Cannot initialize Amadeus service factory.");
}
} else {
logger.severe("Amadeus plugin not initialized - error reading configuration.");
}
}
Add imports:
import com.perun.tlinq.client.amadeus.metrics.AmadeusMetricsManager;
import com.perun.tlinq.client.amadeus.metrics.MetricsHttpServer;
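Integer.parseInt throws NumberFormatException on a malformed metrics.port value, and the catch block in initializePlugin() then disables metrics entirely. A more forgiving option is to fall back to the default port. The sketch below assumes that behavior is acceptable; MetricsPort is a hypothetical helper, not part of tqamds.

```java
/** Hypothetical helper, not part of tqamds. */
final class MetricsPort {
    static final int DEFAULT_PORT = 8081;

    private MetricsPort() {}

    /** Returns the configured port, or DEFAULT_PORT when the value is missing or invalid. */
    static int parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_PORT;
        }
        try {
            int port = Integer.parseInt(raw.trim());
            // Valid TCP ports are 1..65535; anything else falls back to the default.
            return (port >= 1 && port <= 65535) ? port : DEFAULT_PORT;
        } catch (NumberFormatException ex) {
            return DEFAULT_PORT;
        }
    }
}
```

In initializePlugin() this would wrap the property lookup, for example MetricsPort.parse(acc.getProperty("metrics.port", "8081")).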
Phase 3: Code Implementation - API Instrumentation
Step 3.1: Create API Metrics Interceptor
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/metrics/ApiMetricsInterceptor.java (NEW FILE)
/*
* Copyright (c) 2025. Perun Consulting Services FZ LLE
*/
package com.perun.tlinq.client.amadeus.metrics;
import com.amadeus.exceptions.NotFoundException;
import com.amadeus.exceptions.ResponseException;
import io.micrometer.core.instrument.Timer;
import java.util.logging.Logger;
/**
* Interceptor for tracking Amadeus API call metrics.
* Tracks: request counts, duration, success/failure, error types.
*/
public class ApiMetricsInterceptor {
private static final Logger logger = Logger.getLogger(ApiMetricsInterceptor.class.getName());
private final AmadeusMetricsManager metrics = AmadeusMetricsManager.getInstance();
/**
* Record a successful API call.
*
* @param serviceName Service class name (e.g., "AmdFlightSearchService")
* @param methodName Method name (e.g., "getFlightOffers")
* @param durationMs Duration in milliseconds
* @param endpoint Amadeus API endpoint (optional)
*/
public void recordSuccess(String serviceName, String methodName, long durationMs, String endpoint) {
// Increment success counter
metrics.incrementCounter(
"amadeus.api.requests.total",
"service", serviceName,
"method", methodName,
"status", "success"
);
// Record duration
metrics.recordDuration(
"amadeus.api.request.duration.seconds",
durationMs,
"service", serviceName,
"method", methodName,
"status", "success"
);
if (endpoint != null && !endpoint.isEmpty()) {
metrics.incrementCounter(
"amadeus.api.endpoint.requests.total",
"endpoint", endpoint,
"status", "success"
);
}
}
/**
* Record a failed API call.
*
* @param serviceName Service class name
* @param methodName Method name
* @param durationMs Duration in milliseconds
* @param ex The exception that occurred
* @param endpoint Amadeus API endpoint (optional)
*/
public void recordFailure(String serviceName, String methodName, long durationMs,
Exception ex, String endpoint) {
String errorType = classifyError(ex);
String httpStatus = extractHttpStatus(ex);
// Increment failure counter
metrics.incrementCounter(
"amadeus.api.requests.total",
"service", serviceName,
"method", methodName,
"status", "failure"
);
// Increment error counter with classification
metrics.incrementCounter(
"amadeus.api.errors.total",
"service", serviceName,
"method", methodName,
"error_type", errorType,
"http_status", httpStatus
);
// Record duration
metrics.recordDuration(
"amadeus.api.request.duration.seconds",
durationMs,
"service", serviceName,
"method", methodName,
"status", "failure"
);
if (endpoint != null && !endpoint.isEmpty()) {
metrics.incrementCounter(
"amadeus.api.endpoint.requests.total",
"endpoint", endpoint,
"status", "failure"
);
}
logger.fine(String.format("API failure recorded: %s.%s - %s (%s)",
serviceName, methodName, errorType, httpStatus));
}
/**
* Classify exception into error type categories.
*/
private String classifyError(Exception ex) {
if (ex instanceof NotFoundException) {
return "not_found";
} else if (ex instanceof ResponseException) {
ResponseException rex = (ResponseException) ex;
int code = rex.getResponse() != null ? rex.getResponse().getStatusCode() : 0;
if (code >= 400 && code < 500) {
return "client_error";
} else if (code >= 500) {
return "server_error";
} else {
return "response_error";
}
} else if (ex instanceof java.net.SocketTimeoutException) {
return "timeout";
} else if (ex instanceof java.io.IOException) {
return "network_error";
} else {
return "unknown";
}
}
/**
* Extract HTTP status code from exception.
*/
private String extractHttpStatus(Exception ex) {
if (ex instanceof NotFoundException) {
return "404";
} else if (ex instanceof ResponseException) {
ResponseException rex = (ResponseException) ex;
int code = rex.getResponse() != null ? rex.getResponse().getStatusCode() : 0;
return code > 0 ? String.valueOf(code) : "unknown";
}
return "unknown";
}
}
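One caveat with classifyError(): it inspects only the exception it is handed. The service methods in this guide wrap SDK exceptions in TlinqClientException before rethrowing (see getFlightOffers() in Step 3.4), so when invokeMethod() in Step 3.2 reports a failure, the classifier sees the wrapper and returns "unknown". Walking the cause chain avoids that. The sketch below shows the idea with JDK exception types only; CauseClassifier is hypothetical, and a real version would check the Amadeus exception types in the same loop.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

/** Hypothetical sketch, not part of tqamds: classify by the first recognized cause. */
final class CauseClassifier {
    private CauseClassifier() {}

    static String classify(Throwable ex) {
        // Walk the cause chain; the depth limit guards against cyclic causes.
        Throwable current = ex;
        for (int depth = 0; current != null && depth < 16; depth++) {
            // SocketTimeoutException is an IOException, so it must be checked first.
            if (current instanceof SocketTimeoutException) {
                return "timeout";
            }
            if (current instanceof IOException) {
                return "network_error";
            }
            current = current.getCause();
        }
        return "unknown";
    }
}
```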
Step 3.2: Create Instrumented Base Service
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/service/InstrumentedAmadeusClientService.java (NEW FILE)
/*
* Copyright (c) 2025. Perun Consulting Services FZ LLE
*/
package com.perun.tlinq.client.amadeus.service;
import com.perun.tlinq.client.amadeus.metrics.ApiMetricsInterceptor;
import com.perun.tlinq.entity.RemoteEntityI;
import com.perun.tlinq.util.TlinqClientException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
/**
* Base class that extends AmadeusClientService with metrics instrumentation.
* All service classes should extend this instead of AmadeusClientService.
*/
public abstract class InstrumentedAmadeusClientService extends AmadeusClientService {
protected final ApiMetricsInterceptor metricsInterceptor = new ApiMetricsInterceptor();
@Override
public <T> T invokeMethod(String methodName, Class<T> returnClass, RemoteEntityI inputData)
throws TlinqClientException {
String serviceName = this.getClass().getSimpleName();
String actualMethodName = methodName == null ? this.getMethodName() : methodName;
long startTime = System.currentTimeMillis();
try {
// Call parent implementation
T result = super.invokeMethod(actualMethodName, returnClass, inputData);
// Record success
long duration = System.currentTimeMillis() - startTime;
metricsInterceptor.recordSuccess(serviceName, actualMethodName, duration, null);
return result;
} catch (Exception ex) {
// Record failure
long duration = System.currentTimeMillis() - startTime;
metricsInterceptor.recordFailure(serviceName, actualMethodName, duration, ex, null);
// Re-throw
if (ex instanceof TlinqClientException) {
throw (TlinqClientException) ex;
} else {
throw new TlinqClientException("Error invoking method: " + ex.getMessage(), ex);
}
}
}
}
Step 3.3: Update Service Classes to Use Instrumented Base
For each of the 5 service classes, change the parent class from AmadeusClientService to InstrumentedAmadeusClientService:
Files to modify:
1. AmdFlightSearchService.java - Line 33
2. AmadeusFlightOrderService.java - Line 30
3. AmdHotelOfferSearchService.java - Line 29
4. AmdHotelRefDataService.java - Similar pattern
5. AmdRefDataService.java - Line 21
Example for AmdFlightSearchService.java:
Original (Line 33):
public class AmdFlightSearchService extends AmadeusClientService implements EntitySearchServiceI {
Replace with:
public class AmdFlightSearchService extends InstrumentedAmadeusClientService implements EntitySearchServiceI {
Step 3.4: Add Detailed Metrics to Critical API Calls
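For methods that call the Amadeus API directly (via getProxy()), add more detailed tracking that also records the endpoint. If such a method is also dispatched through invokeMethod(), the base class from Step 3.2 records the same call a second time under the same service and method tags, so amadeus.api.requests.total would count it twice; in that case keep only one of the two recordings.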
Example: AmdFlightSearchService.java - getFlightOffers() method
Location: Lines 159-172
Original Code:
public List getFlightOffers(RemoteEntityI notUsed) throws TlinqClientException {
Params prm = buildParams();
FlightOfferSearch[] fos;
try {
fos = getProxy().shopping.flightOffersSearch.get(prm);
} catch (ResponseException ex) {
TlinqClientException newEx = new TlinqClientException(ex.getMessage(), ex);
newEx.setErrorCode(TlinqErr.REMOTE_ERROR);
throw newEx;
}
return buildFlightOfferList(fos);
}
Replace with:
public List getFlightOffers(RemoteEntityI notUsed) throws TlinqClientException {
Params prm = buildParams();
FlightOfferSearch[] fos;
long startTime = System.currentTimeMillis();
String endpoint = "/v2/shopping/flight-offers";
try {
fos = getProxy().shopping.flightOffersSearch.get(prm);
// Record success
long duration = System.currentTimeMillis() - startTime;
metricsInterceptor.recordSuccess(
this.getClass().getSimpleName(),
"getFlightOffers",
duration,
endpoint
);
} catch (ResponseException ex) {
// Record failure
long duration = System.currentTimeMillis() - startTime;
metricsInterceptor.recordFailure(
this.getClass().getSimpleName(),
"getFlightOffers",
duration,
ex,
endpoint
);
TlinqClientException newEx = new TlinqClientException(ex.getMessage(), ex);
newEx.setErrorCode(TlinqErr.REMOTE_ERROR);
throw newEx;
}
return buildFlightOfferList(fos);
}
Repeat this pattern for other critical API methods:
- AmdFlightSearchService.getFlightOfferPricing() - Lines 289-297
- AmadeusFlightOrderService.createFlightOrder() - Lines 81-88
- AmadeusFlightOrderService.getBooking() - Lines 105-115
- AmdHotelOfferSearchService.searchHotelOffers() - Lines 87-96
- AmdRefDataService.getAirlines() - Lines 26-36
- AmdRefDataService.getLocations() - Lines 43-55
Total: ~11 methods to instrument
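Repeating the start-time, success, and failure bookkeeping in every method is error-prone. One way to factor it out is a small helper that times a call and reports the outcome. This is a sketch under assumptions: TimedCall and its Recorder interface are hypothetical, and in tqamds the recorder would delegate to metricsInterceptor.recordSuccess() and recordFailure().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of tqamds: times a call and reports the outcome. */
final class TimedCall {
    /** Receives the outcome; failure is null when the call succeeded. */
    interface Recorder {
        void record(String method, long durationMs, Exception failure);
    }

    private TimedCall() {}

    static <T> T run(String method, Recorder recorder, Callable<T> call) throws Exception {
        // nanoTime is monotonic, so the measurement is unaffected by wall-clock adjustments.
        long start = System.nanoTime();
        T result;
        try {
            result = call.call();
        } catch (Exception ex) {
            recorder.record(method, elapsedMs(start), ex);
            throw ex; // the caller still sees the original exception
        }
        // Recorded outside the try block so a recorder error is not reported as a call failure.
        recorder.record(method, elapsedMs(start), null);
        return result;
    }

    private static long elapsedMs(long startNanos) {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }
}
```

Each instrumented method then reduces to a single TimedCall.run(...) around the getProxy() call, with the existing ResponseException handling kept in the caller.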
Phase 4: Code Implementation - Database Instrumentation
Step 4.1: Create Database Metrics Interceptor
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/metrics/DbMetricsInterceptor.java (NEW FILE)
/*
* Copyright (c) 2025. Perun Consulting Services FZ LLE
*/
package com.perun.tlinq.client.amadeus.metrics;
import java.util.logging.Logger;
/**
* Interceptor for tracking database operation metrics.
* Tracks: query counts, execution times, record counts, transaction durations.
*/
public class DbMetricsInterceptor {
private static final Logger logger = Logger.getLogger(DbMetricsInterceptor.class.getName());
private final AmadeusMetricsManager metrics = AmadeusMetricsManager.getInstance();
/**
* Record a database query execution.
*
* @param entityName Entity name (e.g., "AirportEntity")
* @param operation Operation type (e.g., "SELECT", "INSERT", "UPDATE")
* @param durationMs Query duration in milliseconds
* @param recordCount Number of records affected/returned
*/
public void recordQuery(String entityName, String operation, long durationMs, int recordCount) {
// Increment query counter
metrics.incrementCounter(
"amadeus.db.queries.total",
"entity", entityName,
"operation", operation
);
// Record query duration
metrics.recordDuration(
"amadeus.db.query.duration.seconds",
durationMs,
"entity", entityName,
"operation", operation
);
// Record record count
metrics.incrementCounter(
"amadeus.db.records.total",
recordCount,
"entity", entityName,
"operation", operation
);
logger.fine(String.format("DB query recorded: %s %s - %dms, %d records",
operation, entityName, durationMs, recordCount));
}
/**
* Record a database transaction.
*
* @param entityName Entity name
* @param durationMs Transaction duration in milliseconds
* @param success Whether transaction succeeded
*/
public void recordTransaction(String entityName, long durationMs, boolean success) {
String status = success ? "committed" : "rolled_back";
// Increment transaction counter
metrics.incrementCounter(
"amadeus.db.transactions.total",
"entity", entityName,
"status", status
);
// Record transaction duration
metrics.recordDuration(
"amadeus.db.transaction.duration.seconds",
durationMs,
"entity", entityName,
"status", status
);
}
/**
* Record a database connection acquisition.
*
* @param durationMs Time to acquire connection in milliseconds
* @param success Whether connection was acquired successfully
*/
public void recordConnectionAcquisition(long durationMs, boolean success) {
String status = success ? "success" : "failure";
metrics.incrementCounter(
"amadeus.db.connections.acquired.total",
"status", status
);
if (success) {
metrics.recordDuration(
"amadeus.db.connection.acquisition.duration.seconds",
durationMs
);
}
}
}
Step 4.2: Integrate Hibernate Metrics with Micrometer
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/db/AmdDBSession.java
Modification: Update the static initializer to register Hibernate metrics
Location: Lines 21-47
Add after line 42 (after _factory = configuration.buildSessionFactory();):
// Register Hibernate metrics with Micrometer
try {
    SessionFactory factory = _factory;
    MeterRegistry meterRegistry = AmadeusMetricsManager.getInstance().getRegistry();
    // Register Hibernate session factory metrics
    if (factory.getStatistics() != null) {
        // Statistics must be enabled before the binder is created
        factory.getStatistics().setStatisticsEnabled(true);
        // Bind Hibernate metrics to Micrometer using the binder from hibernate-micrometer
        new HibernateMetrics(
            factory,
            "amadeus_db", // Session factory name, reported as a tag value (not a metric name prefix)
            java.util.Collections.emptyList()
        ).bindTo(meterRegistry);
        logger.info("Hibernate metrics registered with Micrometer");
    }
} catch (Exception ex) {
    logger.warning("Failed to register Hibernate metrics: " + ex.getMessage());
}
Add imports at top of file:
import com.perun.tlinq.client.amadeus.metrics.AmadeusMetricsManager;
import io.micrometer.core.instrument.MeterRegistry;
import org.hibernate.stat.HibernateMetrics;
Step 4.3: Instrument Database Operations
File: tqamds/src/main/java/com/perun/tlinq/client/amadeus/service/AmdFlightSearchService.java
Method: loadAirportNames() - Lines 109-157
Add instrumentation:
After line 109 (method start):
private HashMap<String, String> loadAirportNames(List<String> airports) throws TlinqClientException{
DbMetricsInterceptor dbMetrics = new DbMetricsInterceptor();
long sessionStartTime = System.currentTimeMillis();
Session session = AmdDBSession.getSession();
long sessionDuration = System.currentTimeMillis() - sessionStartTime;
dbMetrics.recordConnectionAcquisition(sessionDuration, true);
After line 113 (query execution):
long queryStartTime = System.currentTimeMillis();
Query<AirportEntity> qry = session.createQuery("FROM AirportEntity e where e.iatacode in (:airports)", AirportEntity.class);
qry.setParameterList("airports", airports);
List<AirportEntity> res = qry.getResultList();
long queryDuration = System.currentTimeMillis() - queryStartTime;
// Record query metrics
dbMetrics.recordQuery("AirportEntity", "SELECT", queryDuration, res.size());
After line 119 (transaction begin):
long transactionStartTime = System.currentTimeMillis();
After line 153 (transaction commit):
session.getTransaction().commit();
long transactionDuration = System.currentTimeMillis() - transactionStartTime;
// Record transaction metrics
int insertCount = airports.size() - res.size();
dbMetrics.recordTransaction("AirportEntity", transactionDuration, true);
if (insertCount > 0) {
dbMetrics.recordQuery("AirportEntity", "INSERT", transactionDuration, insertCount);
}
Add import:
import com.perun.tlinq.client.amadeus.metrics.DbMetricsInterceptor;
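The instrumentation in this step only ever calls recordTransaction(..., true), so the rolled_back status that the HighTransactionRollbackRate alert in Step 5.2 depends on would never be emitted. Recording the outcome in a finally block covers both paths. The sketch below illustrates the pattern with hypothetical types: TransactionTimer, Work, and Recorder are not part of tqamds, and Recorder merely mirrors the shape of DbMetricsInterceptor.recordTransaction().

```java
/** Hypothetical sketch, not part of tqamds: records commit and rollback outcomes. */
final class TransactionTimer {
    /** Unit of work that may fail; stands in for the Hibernate calls inside the transaction. */
    interface Work {
        void run() throws Exception;
    }

    /** Mirrors the shape of DbMetricsInterceptor.recordTransaction(entity, durationMs, success). */
    interface Recorder {
        void recordTransaction(String entityName, long durationMs, boolean success);
    }

    private TransactionTimer() {}

    static void execute(String entityName, Recorder recorder, Work work) throws Exception {
        long start = System.currentTimeMillis();
        boolean committed = false;
        try {
            work.run();        // begin, persist, and commit happen in here
            committed = true;  // only reached when no exception was thrown
        } finally {
            // Runs on both paths, so failed transactions are counted as well as commits.
            recorder.recordTransaction(entityName, System.currentTimeMillis() - start, committed);
        }
    }
}
```

The rollback itself (session.getTransaction().rollback()) would still be handled inside the unit of work or by the caller's existing error handling.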
Phase 5: Prometheus Configuration
Step 5.1: Create Prometheus Configuration
File: tqamds/prometheus.yml (NEW FILE)
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'tqamds-monitor'
    environment: 'development'
# Alerting configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: []
# Load rules once and periodically evaluate them
rule_files:
  - 'alerts.yml'
# Scrape configurations
scrape_configs:
  # TQPro Amadeus Client metrics
  - job_name: 'tqamds'
    static_configs:
      - targets: ['host.docker.internal:8081']
        labels:
          # Not named 'service': a target label with that name would clash with the
          # 'service' tag on the metrics, which Prometheus then renames to exported_service
          application: 'tqamds'
          module: 'amadeus-client'
    metrics_path: '/metrics'
    scrape_interval: 10s
    scrape_timeout: 5s
Step 5.2: Create Alert Rules
File: tqamds/alerts.yml (NEW FILE)
groups:
  - name: amadeus_api_alerts
    interval: 30s
    rules:
      # High API error rate
      - alert: HighAmadeusAPIErrorRate
        expr: |
          rate(amadeus_api_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
          component: amadeus-api
        annotations:
          summary: "High Amadeus API error rate detected"
          description: "Service {{ $labels.service }} method {{ $labels.method }} has error rate of {{ $value }} errors/sec"
      # API response time degradation
      - alert: SlowAmadeusAPIResponse
        expr: |
          histogram_quantile(0.95,
            rate(amadeus_api_request_duration_seconds_bucket[5m])
          ) > 5
        for: 10m
        labels:
          severity: warning
          component: amadeus-api
        annotations:
          summary: "Slow Amadeus API response detected"
          description: "Service {{ $labels.service }} p95 latency is {{ $value }}s"
      # Specific error types
      - alert: AmadeusAPIServerErrors
        expr: |
          rate(amadeus_api_errors_total{error_type="server_error"}[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
          component: amadeus-api
        annotations:
          summary: "Amadeus API server errors detected"
          description: "Service {{ $labels.service }} experiencing server errors (5xx)"
      # API timeouts
      - alert: AmadeusAPITimeouts
        expr: |
          rate(amadeus_api_errors_total{error_type="timeout"}[5m]) > 0.02
        for: 5m
        labels:
          severity: warning
          component: amadeus-api
        annotations:
          summary: "Amadeus API timeouts detected"
          description: "Service {{ $labels.service }} method {{ $labels.method }} experiencing timeouts"
  - name: amadeus_database_alerts
    interval: 30s
    rules:
      # Slow database queries
      - alert: SlowDatabaseQueries
        expr: |
          histogram_quantile(0.95,
            rate(amadeus_db_query_duration_seconds_bucket[5m])
          ) > 1
        for: 10m
        labels:
          severity: warning
          component: database
        annotations:
          summary: "Slow database queries detected"
          description: "Entity {{ $labels.entity }} {{ $labels.operation }} p95 latency is {{ $value }}s"
      # Database connection issues
      - alert: DatabaseConnectionFailures
        expr: |
          rate(amadeus_db_connections_acquired_total{status="failure"}[5m]) > 0
        for: 2m
        labels:
          severity: critical
          component: database
        annotations:
          summary: "Database connection failures detected"
          description: "Unable to acquire database connections"
      # High transaction rollback rate
      # Both sides are aggregated by entity so the division matches across status values;
      # without the aggregation only rolled_back series match themselves and the ratio is always 1.
      - alert: HighTransactionRollbackRate
        expr: |
          sum by (entity) (rate(amadeus_db_transactions_total{status="rolled_back"}[5m]))
            /
          sum by (entity) (rate(amadeus_db_transactions_total[5m])) > 0.1
        for: 10m
        labels:
          severity: warning
          component: database
        annotations:
          summary: "High database transaction rollback rate"
          description: "{{ $value | humanizePercentage }} of transactions are being rolled back"
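The latency alerts above rely on histogram_quantile, which estimates a quantile from cumulative bucket counts by linear interpolation inside the bucket that contains the target rank. The estimate is therefore only as precise as the bucket boundaries. The following is a simplified illustration of that calculation, assuming at least two buckets with the last one being +Inf; QuantileSketch is hypothetical and omits the edge cases the real function handles.

```java
/** Hypothetical illustration, not part of tqamds or Prometheus. */
final class QuantileSketch {
    private QuantileSketch() {}

    /**
     * @param q           quantile in [0, 1]
     * @param upperBounds bucket upper bounds (le), ascending, last one Double.POSITIVE_INFINITY
     * @param cumulative  cumulative observation counts per bucket, same length as upperBounds
     */
    static double quantile(double q, double[] upperBounds, double[] cumulative) {
        double total = cumulative[cumulative.length - 1];
        if (total == 0) {
            return Double.NaN;
        }
        double rank = q * total;
        // Find the first bucket whose cumulative count reaches the rank.
        int i = 0;
        while (i < cumulative.length - 1 && cumulative[i] < rank) {
            i++;
        }
        if (i == cumulative.length - 1) {
            // The rank falls in the +Inf bucket: report the highest finite bound.
            return upperBounds[upperBounds.length - 2];
        }
        double lower = (i == 0) ? 0.0 : upperBounds[i - 1];
        double below = (i == 0) ? 0.0 : cumulative[i - 1];
        double inBucket = cumulative[i] - below;
        // Linear interpolation within the bucket.
        return lower + (upperBounds[i] - lower) * ((rank - below) / inBucket);
    }
}
```

For example, with bounds 1 s, 5 s, +Inf and cumulative counts 80, 95, 100, the median interpolates inside the first bucket, while any quantile above 0.95 is capped at 5 s because the remaining observations sit in the +Inf bucket.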
Phase 6: Grafana Dashboard Setup
Step 6.1: Create Grafana Provisioning Directory
mkdir -p tqamds/grafana/provisioning/datasources tqamds/grafana/provisioning/dashboards
Step 6.2: Configure Prometheus Datasource
File: tqamds/grafana/provisioning/datasources/prometheus.yml (NEW FILE)
apiVersion: 1
datasources:
  - name: Prometheus-TQAMDS
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true
    jsonData:
      timeInterval: "10s"
      queryTimeout: "60s"
Step 6.3: Configure Dashboard Provisioning
File: tqamds/grafana/provisioning/dashboards/dashboards.yml (NEW FILE)
apiVersion: 1
providers:
  - name: 'TQAmadeus'
    orgId: 1
    folder: 'TQPro Amadeus'
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: /etc/grafana/provisioning/dashboards
Step 6.4: Create Amadeus API Dashboard¶
File: tqamds/grafana/provisioning/dashboards/amadeus-api-dashboard.json (NEW FILE)
This is a large JSON file. The condensed structure below shows the key panels. Dashboards provisioned from files are read as the bare dashboard model, so the title and panels sit at the top level; the {"dashboard": ...} wrapper is only used in HTTP API payloads.
{
  "title": "TQPro Amadeus API Metrics",
  "tags": ["tqpro", "amadeus", "api"],
  "timezone": "browser",
  "panels": [
    {
      "id": 1,
      "title": "API Request Rate",
      "type": "graph",
      "targets": [{
        "expr": "rate(amadeus_api_requests_total[5m])",
        "legendFormat": "{{service}}.{{method}} - {{status}}"
      }]
    },
    {
      "id": 2,
      "title": "API Response Times (p95)",
      "type": "graph",
      "targets": [{
        "expr": "histogram_quantile(0.95, rate(amadeus_api_request_duration_seconds_bucket[5m]))",
        "legendFormat": "{{service}}.{{method}} p95"
      }]
    },
    {
      "id": 3,
      "title": "API Error Rate by Type",
      "type": "graph",
      "targets": [{
        "expr": "rate(amadeus_api_errors_total[5m])",
        "legendFormat": "{{error_type}} - {{http_status}}"
      }]
    },
    {
      "id": 4,
      "title": "Requests by Endpoint",
      "type": "piechart",
      "targets": [{
        "expr": "sum by (endpoint) (rate(amadeus_api_endpoint_requests_total[5m]))",
        "legendFormat": "{{endpoint}}"
      }]
    },
    {
      "id": 5,
      "title": "Success vs Failure Rate",
      "type": "stat",
      "targets": [{
        "expr": "sum(rate(amadeus_api_requests_total{status='success'}[5m])) / sum(rate(amadeus_api_requests_total[5m])) * 100",
        "legendFormat": "Success Rate %"
      }]
    }
  ]
}
Step 6.5: Create Database Metrics Dashboard¶
File: tqamds/grafana/provisioning/dashboards/amadeus-database-dashboard.json (NEW FILE)
{
  "title": "TQPro Amadeus Database Metrics",
  "tags": ["tqpro", "amadeus", "database"],
  "timezone": "browser",
  "panels": [
    {
      "id": 1,
      "title": "Database Query Rate",
      "type": "graph",
      "targets": [{
        "expr": "rate(amadeus_db_queries_total[5m])",
        "legendFormat": "{{entity}} - {{operation}}"
      }]
    },
    {
      "id": 2,
      "title": "Query Duration (p95)",
      "type": "graph",
      "targets": [{
        "expr": "histogram_quantile(0.95, rate(amadeus_db_query_duration_seconds_bucket[5m]))",
        "legendFormat": "{{entity}} - {{operation}}"
      }]
    },
    {
      "id": 3,
      "title": "Records Fetched/Inserted",
      "type": "graph",
      "targets": [{
        "expr": "rate(amadeus_db_records_total[5m])",
        "legendFormat": "{{entity}} - {{operation}}"
      }]
    },
    {
      "id": 4,
      "title": "Transaction Duration",
      "type": "graph",
      "targets": [{
        "expr": "rate(amadeus_db_transaction_duration_seconds_sum[5m]) / rate(amadeus_db_transaction_duration_seconds_count[5m])",
        "legendFormat": "{{entity}} avg"
      }]
    },
    {
      "id": 5,
      "title": "Connection Pool Status",
      "type": "stat",
      "targets": [{
        "expr": "rate(amadeus_db_connections_acquired_total{status='success'}[5m])",
        "legendFormat": "Connections/sec"
      }]
    },
    {
      "id": 6,
      "title": "Hibernate Session Factory",
      "type": "graph",
      "targets": [{
        "expr": "amadeus_db_session_factory_sessions_opened_total",
        "legendFormat": "Sessions Opened"
      }]
    }
  ]
}
Step 6.6: Access Grafana and Import Dashboards¶
- Open Grafana: http://localhost:3000
- Login with admin/admin
- Navigate to Dashboards → Browse
- Verify auto-imported dashboards in "TQPro Amadeus" folder
Phase 7: Testing & Validation¶
Step 7.1: Unit Tests for Metrics¶
File: tqamds/src/test/java/com/perun/tlinq/client/amadeus/metrics/AmadeusMetricsManagerTest.java (NEW FILE)
package com.perun.tlinq.client.amadeus.metrics;

import io.micrometer.core.instrument.Timer;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.*;

class AmadeusMetricsManagerTest {

    private AmadeusMetricsManager metricsManager;

    @BeforeEach
    void setUp() {
        metricsManager = AmadeusMetricsManager.getInstance();
    }

    @Test
    void testSingletonInstance() {
        AmadeusMetricsManager instance1 = AmadeusMetricsManager.getInstance();
        AmadeusMetricsManager instance2 = AmadeusMetricsManager.getInstance();
        assertSame(instance1, instance2, "Should return same singleton instance");
    }

    @Test
    void testIncrementCounter() {
        metricsManager.incrementCounter("test.counter", "tag1", "value1");
        String scraped = metricsManager.scrape();
        // Prometheus output replaces the dots in meter names with underscores
        assertTrue(scraped.contains("test_counter"), "Scraped output should contain the exported counter name");
    }

    @Test
    void testTimerRecording() {
        Timer.Sample sample = metricsManager.startTimer();
        assertNotNull(sample, "Timer sample should not be null");
        // Simulate work
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        metricsManager.recordTimer(sample, "test.timer", "operation", "test");
        String scraped = metricsManager.scrape();
        // Same naming rule as above: test.timer is exported with a test_timer prefix
        assertTrue(scraped.contains("test_timer"), "Scraped output should contain the exported timer name");
    }

    @Test
    void testScrapeOutputFormat() {
        metricsManager.incrementCounter("test.metric", "label", "value");
        String scraped = metricsManager.scrape();
        assertNotNull(scraped, "Scrape output should not be null");
        assertTrue(scraped.length() > 0, "Scrape output should not be empty");
        assertTrue(scraped.contains("# HELP"), "Should contain Prometheus HELP comments");
        assertTrue(scraped.contains("# TYPE"), "Should contain Prometheus TYPE comments");
    }
}
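Assertions against scrape output need the exported metric name, not the dotted Micrometer name. The sketch below approximates that translation for use in tests; it is a simplified stand-in (the class and method names are hypothetical), and Micrometer's own naming convention additionally appends type-specific suffixes such as _total for counters and _seconds for timers, so matching on the prefix is the safer check.

```java
/** Approximates how a dotted meter name appears in Prometheus text output. */
final class PrometheusNameSketch {

    /** Replaces every character that is not legal in a Prometheus metric name with an underscore. */
    static String toPrometheusName(String meterName) {
        return meterName.replaceAll("[^a-zA-Z0-9_:]", "_");
    }

    public static void main(String[] args) {
        System.out.println(toPrometheusName("amadeus.api.requests.total"));
        // prints amadeus_api_requests_total
    }
}
```

A test could then assert `scraped.contains(PrometheusNameSketch.toPrometheusName("test.counter"))` instead of hard-coding the underscore form.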
Step 7.2: Integration Test¶
File: tqamds/src/test/java/com/perun/tlinq/client/amadeus/service/MetricsIntegrationTest.java (NEW FILE)
package com.perun.tlinq.client.amadeus.service;

import org.junit.jupiter.api.Test;

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import static org.junit.jupiter.api.Assertions.*;

class MetricsIntegrationTest {

    @Test
    void testMetricsEndpointAvailable() throws IOException, InterruptedException {
        // Note: this test does not start the metrics server itself;
        // it assumes the server is already running on port 8081
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/metrics"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        assertEquals(200, response.statusCode(), "Metrics endpoint should return 200 OK");
        assertTrue(response.body().length() > 0, "Metrics response should not be empty");
        assertTrue(response.body().contains("# TYPE"), "Should be Prometheus format");
    }

    @Test
    void testHealthEndpointAvailable() throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/health"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        assertEquals(200, response.statusCode(), "Health endpoint should return 200 OK");
        assertTrue(response.body().contains("UP"), "Health should be UP");
    }
}
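Because the integration test above depends on a server already listening on port 8081, it can fail in CI for reasons unrelated to the metrics code. One possible way to decouple it, sketched here with the JDK's built-in HTTP server, is to serve a fixed body on an OS-assigned port. The class name, method names, and sample body are hypothetical, and the stub does not exercise the real tqamds metrics server.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Throwaway /metrics endpoint for tests; not the tqamds metrics server. */
final class StubMetricsServerSketch {

    /** Starts a server on a free port that answers /metrics with the given body. */
    static HttpServer start(String body) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
            server.createContext("/metrics", exchange -> {
                byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
                exchange.sendResponseHeaders(200, bytes.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(bytes);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Performs a blocking GET against the stub and returns the full response. */
    static HttpResponse<String> fetch(int port, String path) {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:" + port + path))
                .GET()
                .build();
        try {
            return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching " + path, e);
        }
    }
}
```

A test would call `start(...)`, read the port from `server.getAddress().getPort()`, run its assertions, and call `server.stop(0)` in a finally block.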
Step 7.3: Manual Testing Checklist¶
Test 1: Metrics Endpoint
curl http://localhost:8081/metrics
Expected: Prometheus-formatted metrics output
Test 2: Health Endpoint
curl http://localhost:8081/health
Expected: {"status":"UP"}
Test 3: Prometheus Scraping
1. Open http://localhost:9090
2. Navigate to Status → Targets
3. Verify tqamds target is UP
4. Run query: rate(amadeus_api_requests_total[5m])
5. Should see metrics data
Test 4: Grafana Dashboards
1. Open http://localhost:3000
2. Navigate to Dashboards → TQPro Amadeus folder
3. Open "TQPro Amadeus API Metrics" dashboard
4. Verify panels display data
Test 5: Trigger API Calls
// Example test to generate metrics
AmdFlightSearchService service = new AmdFlightSearchService();
// ... configure service
service.getFlightOffers(null);
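To confirm that a call like the one above actually moved a counter, the scrape output can be compared before and after the call. The following is a minimal sketch of extracting a counter's value from Prometheus text output; the parser is deliberately simplified (it does not handle values such as +Inf), and the class and method names are hypothetical.

```java
/** Reads sample values out of Prometheus text-format output; a simplified parser for test use. */
final class ScrapeValueSketch {

    /** Sums the values of all series whose metric name equals metricName exactly. */
    static double sumSamples(String scrape, String metricName) {
        double sum = 0.0;
        for (String raw : scrape.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            int brace = line.indexOf('{');
            int space = line.indexOf(' ');
            String name;
            String rest;
            if (brace >= 0 && (space < 0 || brace < space)) {
                int close = line.lastIndexOf('}');
                if (close < brace) {
                    continue; // malformed label set
                }
                name = line.substring(0, brace);
                rest = line.substring(close + 1).trim();
            } else if (space > 0) {
                name = line.substring(0, space);
                rest = line.substring(space + 1).trim();
            } else {
                continue;
            }
            if (name.equals(metricName) && !rest.isEmpty()) {
                // The value is the first token; an optional timestamp may follow.
                sum += Double.parseDouble(rest.split("\\s+")[0]);
            }
        }
        return sum;
    }
}
```

Calling `sumSamples(scrape, "amadeus_api_requests_total")` on the scrape text before and after the API call and checking that the second value is larger gives a quick check that does not depend on which label combination was incremented.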
Test 6: Check Metrics After API Call
curl -s http://localhost:8081/metrics | grep amadeus_api_requests_total
Expected: Counter values incremented
Metrics Reference¶
API Metrics¶
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| amadeus.api.requests.total | Counter | service, method, status | Total API requests |
| amadeus.api.request.duration.seconds | Histogram | service, method, status | API request duration |
| amadeus.api.errors.total | Counter | service, method, error_type, http_status | API errors by type |
| amadeus.api.endpoint.requests.total | Counter | endpoint, status | Requests by Amadeus endpoint |
Names are listed as registered in Micrometer; in Prometheus output and PromQL the dots become underscores (for example, amadeus_api_requests_total).
Label Values:
- service: Service class name (e.g., "AmdFlightSearchService")
- method: Method name (e.g., "getFlightOffers")
- status: "success" or "failure"
- error_type: "not_found", "client_error", "server_error", "timeout", "network_error", "unknown"
- http_status: HTTP status code or "unknown"
- endpoint: Amadeus API path (e.g., "/v2/shopping/flight-offers")
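The error_type and http_status values above imply a classification step at the point where a failed call is recorded. The sketch below shows one way such a mapping could look. It is an assumption for illustration: the class name, the method names, and the choice of JDK exception types are hypothetical, and a mapping from Amadeus SDK exception classes is not shown.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.net.http.HttpTimeoutException;

/** Maps a failed call to the error_type and http_status label values listed above. */
final class ErrorLabelSketch {

    /**
     * @param httpStatus HTTP status code, or 0 when no response was received
     * @param cause      exception raised by the call, may be null
     */
    static String errorType(int httpStatus, Throwable cause) {
        if (httpStatus == 404) {
            return "not_found";
        }
        if (httpStatus >= 400 && httpStatus < 500) {
            return "client_error";
        }
        if (httpStatus >= 500 && httpStatus < 600) {
            return "server_error";
        }
        // No usable status: fall back to the exception type, timeouts first
        // because both timeout classes are also IOExceptions.
        if (cause instanceof SocketTimeoutException || cause instanceof HttpTimeoutException) {
            return "timeout";
        }
        if (cause instanceof IOException) {
            return "network_error";
        }
        return "unknown";
    }

    /** Returns the status code as a label value, or "unknown" when none is available. */
    static String httpStatusLabel(int httpStatus) {
        return httpStatus > 0 ? Integer.toString(httpStatus) : "unknown";
    }
}
```

Keeping the mapping in one place limits error_type to the six documented values, which also keeps the label's cardinality fixed.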
Database Metrics¶
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| amadeus.db.queries.total | Counter | entity, operation | Total database queries |
| amadeus.db.query.duration.seconds | Histogram | entity, operation | Query execution time |
| amadeus.db.records.total | Counter | entity, operation | Records affected/returned |
| amadeus.db.transactions.total | Counter | entity, status | Database transactions |
| amadeus.db.transaction.duration.seconds | Histogram | entity, status | Transaction duration |
| amadeus.db.connections.acquired.total | Counter | status | Connection acquisitions |
| amadeus.db.connection.acquisition.duration.seconds | Histogram | - | Connection acquisition time |
Label Values:
- entity: Entity name (e.g., "AirportEntity", "HotelEntity")
- operation: "SELECT", "INSERT", "UPDATE", "DELETE"
- status: "success", "failure", "committed", "rolled_back"
Hibernate Metrics (auto-registered)¶
| Metric Name | Type | Description |
|---|---|---|
| amadeus_db_session_factory_sessions_opened_total | Counter | Total sessions opened |
| amadeus_db_session_factory_sessions_closed_total | Counter | Total sessions closed |
| amadeus_db_second_level_cache_hit_count | Counter | Second-level cache hits |
| amadeus_db_second_level_cache_miss_count | Counter | Second-level cache misses |
| amadeus_db_query_execution_count | Counter | Total queries executed |
Useful PromQL Queries¶
API Success Rate (last 5 minutes):
sum(rate(amadeus_api_requests_total{status="success"}[5m])) /
sum(rate(amadeus_api_requests_total[5m])) * 100
API Error Rate by Service:
sum by (service) (rate(amadeus_api_errors_total[5m]))
P95 API Response Time:
histogram_quantile(0.95, rate(amadeus_api_request_duration_seconds_bucket[5m]))
Database Query Rate:
rate(amadeus_db_queries_total[5m])
Cache Hit Ratio:
rate(amadeus_db_second_level_cache_hit_count[5m]) /
(rate(amadeus_db_second_level_cache_hit_count[5m]) +
rate(amadeus_db_second_level_cache_miss_count[5m]))
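The cache hit ratio query divides one rate by the sum of two rates, so it yields no usable value while the cache is idle. The arithmetic can be sketched in Java as follows. This is a simplification under stated assumptions: the class name and sample readings are hypothetical, and Prometheus's rate() also extrapolates to the window edges, which is omitted here.

```java
/** Worked arithmetic behind the cache hit ratio query; a simplification of PromQL's rate(). */
final class CacheHitRatioSketch {

    /** Per-second increase between two counter readings, treating a decrease as a counter reset. */
    static double rate(double earlier, double later, double windowSeconds) {
        // After a restart the counter begins again from zero, so the later value is the increase.
        double increase = later >= earlier ? later - earlier : later;
        return increase / windowSeconds;
    }

    /** Hits divided by hits plus misses; NaN when there was no cache traffic at all. */
    static double hitRatio(double hitRate, double missRate) {
        double total = hitRate + missRate;
        return total == 0.0 ? Double.NaN : hitRate / total;
    }

    public static void main(String[] args) {
        double hits = rate(100, 1000, 300);  // 900 hits in 5 minutes = 3.0 per second
        double misses = rate(40, 340, 300);  // 300 misses in 5 minutes = 1.0 per second
        System.out.println(hitRatio(hits, misses)); // prints 0.75
    }
}
```

The NaN case mirrors what the PromQL expression returns for 0/0, which is why a panel built on this query can appear empty until the cache has seen traffic.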
Troubleshooting¶
Issue 1: Metrics Endpoint Returns 404¶
Symptoms: curl http://localhost:8081/metrics returns 404
Possible Causes:
- Metrics HTTP server not started
- Port conflict on 8081
- Plugin not initialized
Solutions:
1. Check logs for "Metrics HTTP server started on port 8081"
2. Verify port is not in use: lsof -i :8081
3. Check plugin initialization in AmadeusPlugin.initializePlugin()
Issue 2: No Metrics Data in Prometheus¶
Symptoms: Prometheus shows target as UP but no metrics data
Possible Causes:
- No API calls made yet (metrics not generated)
- Scrape configuration incorrect
- Firewall blocking Prometheus
Solutions:
1. Make test API call to generate metrics
2. Verify Prometheus scrape config targets host.docker.internal:8081
3. Check Prometheus logs: docker logs tqamds-prometheus
Issue 3: Hibernate Metrics Not Appearing¶
Symptoms: API metrics work but database metrics missing
Possible Causes:
- Hibernate statistics not enabled
- Micrometer Hibernate integration not registered
- No database operations performed yet
Solutions:
1. Verify factory.getStatistics().setStatisticsEnabled(true) in AmdDBSession
2. Check for "Hibernate metrics registered" log message
3. Execute query to trigger metrics: test loadAirportNames()
Issue 4: High Memory Usage¶
Symptoms: Application memory usage increases over time
Possible Causes:
- Metric cardinality too high (too many unique label combinations)
- Counter/timer cache not bounded
- Metrics not cleaned up
Solutions:
1. Review metric labels - avoid high-cardinality labels (user IDs, timestamps)
2. Consider implementing metric cache eviction
3. Use Micrometer's meter filters to limit cardinality
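One way to keep label cardinality bounded, independent of any library feature, is to cap the number of distinct values a label may take and fold the rest into a catch-all value. The sketch below illustrates the idea; the class name, the "other" fallback, and the limit are assumptions, and it is not a description of how Micrometer's meter filters are implemented.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Caps the number of distinct values a single label may take. */
final class BoundedLabelSketch {

    private final int maxDistinct;
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    BoundedLabelSketch(int maxDistinct) {
        this.maxDistinct = maxDistinct;
    }

    /** Returns the value unchanged while the budget lasts, otherwise the catch-all "other". */
    String bound(String value) {
        if (seen.contains(value)) {
            return value; // fast path for values that were already admitted
        }
        synchronized (seen) {
            if (seen.contains(value)) {
                return value;
            }
            if (seen.size() >= maxDistinct) {
                return "other";
            }
            seen.add(value);
            return value;
        }
    }
}
```

A label such as endpoint could be passed through `bound(...)` before it is used as a tag value, so that an unexpected flood of distinct paths produces one extra series instead of thousands.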
Issue 5: Performance Impact¶
Symptoms: API response times increased after adding metrics
Possible Causes:
- Synchronous metric recording blocking requests
- Timer overhead on every API call
- Database proxy adding latency
Solutions:
1. Profile application to identify bottleneck
2. Consider async metric recording for high-volume APIs
3. Use sampling for database operations (record every Nth query)
4. Disable percentile histograms if not needed
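Recording only every Nth query, as suggested above, can be done with a shared call counter. The following is a minimal, thread-safe sketch; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Decides whether the current call should be measured, selecting one call out of every n. */
final class EveryNthSamplerSketch {

    private final long n;
    private final AtomicLong calls = new AtomicLong();

    EveryNthSamplerSketch(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        this.n = n;
    }

    /** Returns true for the 1st, (n+1)th, (2n+1)th, ... call. */
    boolean shouldRecord() {
        // floorMod keeps the result non-negative even if the counter ever wraps around.
        return Math.floorMod(calls.getAndIncrement(), n) == 0;
    }
}
```

Sampling suits timers, where a subset of observations still gives a usable latency distribution. Counters such as amadeus.db.queries.total would need to be incremented on every call, otherwise the reported query rate is understated by a factor of n.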
Issue 6: Grafana Dashboard Shows "No Data"¶
Symptoms: Dashboard panels empty despite metrics in Prometheus
Possible Causes:
- Time range doesn't match data availability
- Query syntax error
- Datasource not configured correctly
Solutions:
1. Set time range to "Last 15 minutes"
2. Verify query in Prometheus UI first
3. Check datasource URL in Grafana settings
4. Review Grafana logs: docker logs tqamds-grafana
Debugging Commands¶
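Check metrics endpoint:
curl http://localhost:8081/metrics
Check specific metric:
curl -s http://localhost:8081/metrics | grep amadeus_api_requests_total
Check Prometheus scrape status:
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job=="tqamds")'
Test Prometheus query:
curl 'http://localhost:9090/api/v1/query?query=amadeus_api_requests_total'
View Grafana logs:
docker logs tqamds-grafana
View Prometheus logs:
docker logs tqamds-prometheus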
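Queries that contain brackets or parentheses, such as rate(amadeus_api_requests_total[5m]), need to be URL-encoded before they are sent to the Prometheus HTTP API. The sketch below builds such a URL in Java; the class and method names are hypothetical, and the base URL is the local Prometheus address used elsewhere in this guide.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds an instant-query URL for the Prometheus HTTP API. */
final class PromQueryUrlSketch {

    /** Encodes the PromQL expression and appends it to the /api/v1/query endpoint. */
    static URI instantQuery(String baseUrl, String promql) {
        String encoded = URLEncoder.encode(promql, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/v1/query?query=" + encoded);
    }

    public static void main(String[] args) {
        System.out.println(instantQuery("http://localhost:9090", "rate(amadeus_api_requests_total[5m])"));
        // prints http://localhost:9090/api/v1/query?query=rate%28amadeus_api_requests_total%5B5m%5D%29
    }
}
```

The resulting URI can be passed directly to `HttpRequest.newBuilder().uri(...)` in a smoke test that checks whether Prometheus has ingested a given metric.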
Next Steps¶
After completing this implementation:
- Production Deployment
  - Configure Prometheus with persistent storage
  - Set up Grafana with external database
  - Configure alerting (email, Slack, PagerDuty)
  - Implement metric retention policies
- Advanced Metrics
  - Add circuit breaker metrics for API resilience
  - Track cache effectiveness (hit rates, eviction rates)
  - Monitor connection pool exhaustion
  - Add custom business metrics (bookings per minute, revenue tracking)
- Distributed Tracing
  - Integrate OpenTelemetry for request tracing
  - Track requests across tqapi → tqamds → Amadeus API
  - Correlate logs with traces using trace IDs
- Documentation
  - Create runbook for on-call engineers
  - Document metric thresholds and alert responses
  - Create training materials for dashboard usage
- Continuous Improvement
  - Review dashboard usage and refine panels
  - Adjust alert thresholds based on real traffic
  - Add new metrics as services evolve
  - Regular review of metric cardinality
Document Version: 1.0 Last Updated: 2025-11-23 Author: TQPro Observability Team Status: Ready for Implementation