1.1
| Version | Date | Description of Change |
|---|---|---|
| 1.0 | 2025-10-26 | Initial design document |
| 1.1 | 2026-03-23 | Added precision controls, sampling strategy, CSN drill-down, improved output format details, updated WebUI and CLI sections |
The Directory Server Replication Lag Analyzer Tool analyzes replication performance in 389 Directory Server deployments. It processes access logs from multiple directory servers, calculates replication lag times, and generates reports in several formats (Charts, CSV, and, on Fedora only, HTML and PNG). The system is available both as a command-line tool and through a web-based interface in the 389 DS Cockpit WebUI.
The tool focuses on two key metrics:
The system consists of three main components:
- DSLogParser: Parses directory server access logs
- ReplicationLogAnalyzer: Coordinates log analysis and report generation
- VisualizationHelper: Handles data visualization and report formatting

Log Directories: List of paths to server log directories. Each directory represents one server in the topology.
- suffixes: List of DN suffixes to analyze
- time_range: Optional start/end datetime range
- lag_time_lowest: Minimum lag threshold
- etime_lowest: Minimum operation execution time
- anonymous: Hide server names in reports
- only_fully_replicated: Show only changes reaching all servers
- only_not_replicated: Show only incomplete replication
- utc_offset: Timezone handling
- analysis_precision: Controls sampling aggressiveness (fast, balanced, full)
- max_chart_points: Maximum data points across all chart series (overrides precision preset)
- sampling_mode: auto (default) or none; controls whether downsampling is applied

The analyzer supports configurable precision modes that balance analysis speed against data fidelity. This is critical for large deployments where access logs can contain millions of entries.
| Preset | Max Chart Points | Description |
|---|---|---|
| fast | 2,000 | Quick preview with aggressive sampling. Suitable for initial investigation of large datasets |
| balanced | 6,000 | Default. Good trade-off between speed and detail for most deployments |
| full | None (unlimited) | No sampling cap. Processes all data points. Very large datasets may still trigger auto-sampling if they exceed the auto-sampling threshold |
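The relationship between the presets and an explicit point limit can be sketched as follows. This is a minimal illustration, not the analyzer's actual code: the names `PRECISION_PRESETS` and `resolve_max_chart_points` are hypothetical, and only the preset values and the "explicit limit overrides the preset" rule come from this document.

```python
from typing import Optional

# Preset limits as listed in the table above; None means "no cap".
PRECISION_PRESETS = {"fast": 2000, "balanced": 6000, "full": None}


def resolve_max_chart_points(precision: str = "balanced",
                             max_chart_points: Optional[int] = None) -> Optional[int]:
    """Return the effective chart-point cap for one analysis run."""
    if precision not in PRECISION_PRESETS:
        raise ValueError(f"unknown precision preset: {precision!r}")
    if max_chart_points is not None:
        # An explicitly supplied limit takes priority over the preset.
        return max_chart_points
    return PRECISION_PRESETS[precision]
```

For instance, `resolve_max_chart_points("full", 15000)` yields an effective cap of 15000 even though the `full` preset itself is uncapped.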
When datasets exceed the configured limits, the analyzer applies uniform sampling to reduce data volume while preserving the statistical shape of the data:
fast/balanced modes

The --max-chart-points CLI parameter allows direct override of the preset limit for fine-tuned control.
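As an illustration of uniform sampling, the sketch below keeps evenly spaced items from an ordered series and always retains the first and last point. The document does not specify the analyzer's exact algorithm, so this evenly-spaced index selection and the function name `uniform_sample` are assumptions.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def uniform_sample(points: Sequence[T], limit: int) -> List[T]:
    """Keep at most `limit` evenly spaced items, including both endpoints."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    n = len(points)
    if n <= limit:
        # Nothing to reduce: return the data unchanged.
        return list(points)
    if limit == 1:
        return [points[0]]
    # Integer arithmetic maps i = 0..limit-1 onto indices 0..n-1.
    # Because n > limit, consecutive indices are always distinct.
    return [points[(i * (n - 1)) // (limit - 1)] for i in range(limit)]
```

Selecting by position rather than at random keeps the time ordering of the series intact, which matters for line charts.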
For the JSON output drill-down feature, the analyzer stores detailed per-CSN propagation data. To prevent excessive memory usage:
Reports include sampling metadata so consumers know whether data was reduced:
{
  "applied": true,
  "mode": "auto",
  "samplingMode": "auto",
  "precision": "balanced",
  "maxChartPoints": 6000,
  "originalTotalPoints": 15000,
  "reducedTotalPoints": 6000
}
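A report consumer could use this metadata roughly as shown below. The helper name `describe_sampling` is hypothetical; the field names are the ones in the example above.

```python
def describe_sampling(meta: dict) -> str:
    """Summarize whether, and by how much, chart data was reduced."""
    if not meta.get("applied"):
        return "no sampling applied"
    original = meta["originalTotalPoints"]
    reduced = meta["reducedTotalPoints"]
    return f"sampled {reduced} of {original} points ({reduced / original:.0%})"
```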
Server Logs → DSLogParser → Parsed Events
Parsed Events → ReplicationLogAnalyzer → Lag Calculations
Lag Calculations → Precision Controls → Sampled Data
Sampled Data → VisualizationHelper → Reports (JSON/CSV/HTML/PNG)
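The lag calculation stage can be pictured with the sketch below. It assumes that each parsed event is a `(csn, server, timestamp)` tuple and that the global lag of a change is the time between its earliest and latest appearance across servers. Both the event shape and that definition are assumptions made for illustration, not the analyzer's confirmed internals.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (csn, server, timestamp).
Event = Tuple[str, str, datetime]


def global_lags(events: Iterable[Event]) -> Dict[str, float]:
    """Map each CSN to the seconds between its first and last arrival."""
    arrivals: Dict[str, List[datetime]] = defaultdict(list)
    for csn, _server, timestamp in events:
        arrivals[csn].append(timestamp)
    return {
        csn: (max(times) - min(times)).total_seconds()
        for csn, times in arrivals.items()
    }
```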
replication_analysis_summary.json: Always generated alongside other formats. Contains aggregate statistics:
{
  "analysis_summary": {
    "total_servers": 3,
    "configured_log_dirs": ["/var/log/dirsrv/slapd-supplier1", "..."],
    "processed_log_dirs": ["/var/log/dirsrv/slapd-supplier1", "..."],
    "skipped_log_dirs": [],
    "analyzed_logs": 4521,
    "total_updates": 12843,
    "minimum_lag": 0.001,
    "maximum_lag": 45.230,
    "average_lag": 2.150,
    "minimum_hop_lag": 0.001,
    "maximum_hop_lag": 12.450,
    "average_hop_lag": 1.030,
    "total_hops": 8922,
    "updates_by_suffix": {"dc=example,dc=com": 12843},
    "time_range": {"start": "2025-01-01 00:00:00", "end": "2025-01-31 23:59:59"}
  }
}
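A script consuming the summary file might look like the following sketch. The helper names `load_summary` and `format_summary` are hypothetical; the file name and keys are taken from the example above.

```python
import json
from pathlib import Path


def load_summary(report_dir: str) -> dict:
    """Read the summary JSON from a report directory."""
    path = Path(report_dir) / "replication_analysis_summary.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)["analysis_summary"]


def format_summary(summary: dict) -> str:
    """Render the headline lag figures as a single line."""
    return (
        f"{summary['total_updates']} updates across "
        f"{summary['total_servers']} servers; lag min/avg/max = "
        f"{summary['minimum_lag']:.3f}/{summary['average_lag']:.3f}/"
        f"{summary['maximum_lag']:.3f} s"
    )
```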
replication_analysis.json: Designed for the Cockpit WebUI’s PatternFly chart components. Top-level structure:
{
  "replicationLags": {
    "title": "Global Replication Lag Over Time",
    "yAxisLabel": "Lag Time (seconds)",
    "xAxisLabel": "Time",
    "series": [
      {
        "datapoints": [
          {
            "name": "supplier1",
            "x": "2025-01-15T10:30:00+00:00",
            "y": 2.150,
            "duration": 0.003,
            "hoverInfo": "Timestamp: ...<br>CSN: ...<br>...",
            "csnId": "5a5b6c7d000000010000"
          }
        ],
        "legendItem": {"name": "supplier1 (dc=example,dc=com)"},
        "color": "#0066cc"
      }
    ]
  },
  "hopLags": {
    "title": "Per-Hop Replication Lags",
    "yAxisLabel": "Hop Lag Time (seconds)",
    "xAxisLabel": "Time",
    "series": [
      {
        "datapoints": [
          {
            "name": "supplier1 → consumer1",
            "x": "2025-01-15T10:30:00+00:00",
            "y": 1.050,
            "hoverInfo": "...",
            "csnId": "5a5b6c7d000000010000"
          }
        ],
        "legendItem": {"name": "supplier1 → consumer1"},
        "color": "#ff6600"
      }
    ]
  },
  "csnDetails": {
    "5a5b6c7d000000010000": {
      "csn": "5a5b6c7d000000010000",
      "targetDn": "uid=user1,ou=people,dc=example,dc=com",
      "suffix": "dc=example,dc=com",
      "globalLag": 2.150,
      "originServer": "supplier1",
      "originTime": "2025-01-15T10:30:00+00:00",
      "arrivals": [
        {
          "server": "supplier1",
          "timestamp": "2025-01-15T10:30:00+00:00",
          "relativeDelay": 0.0,
          "duration": 0.003
        },
        {
          "server": "consumer1",
          "timestamp": "2025-01-15T10:30:01.05+00:00",
          "relativeDelay": 1.050,
          "duration": 0.002,
          "hopFrom": "supplier1",
          "hopLag": 1.050
        }
      ],
      "hops": [
        {"from": "supplier1", "to": "consumer1", "lag": 1.050}
      ],
      "totalHops": 1,
      "serverCount": 2,
      "replicatedToAll": true
    }
  },
  "metadata": {
    "totalServers": 3,
    "configuredLogDirs": ["..."],
    "processedLogDirs": ["..."],
    "skippedLogDirs": [],
    "analyzedLogs": 4521,
    "totalUpdates": 12843,
    "timeRange": {"start": "...", "end": "..."},
    "timezone": "UTC",
    "sampling": { "...sampling metadata..." }
  }
}
The csnDetails map enables click-through drill-down in the WebUI — clicking any chart point reveals the full propagation path for that CSN across all servers.
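The drill-down data can also be consumed outside the WebUI. The sketch below takes one `csnDetails` entry, shaped like the example above, and derives an ordered propagation path and the slowest hop. The function names are hypothetical and this is not the WebUI's own code.

```python
from typing import List, Optional


def propagation_path(csn_detail: dict) -> List[str]:
    """List servers in arrival order with their delay from the origin."""
    arrivals = sorted(csn_detail["arrivals"], key=lambda a: a["relativeDelay"])
    return [f"{a['server']} (+{a['relativeDelay']:.3f}s)" for a in arrivals]


def slowest_hop(csn_detail: dict) -> Optional[dict]:
    """Return the hop with the largest lag, or None if no hops were recorded."""
    hops = csn_detail.get("hops", [])
    if not hops:
        return None
    return max(hops, key=lambda h: h["lag"])
```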
replication_analysis.csv: Tabular format with columns: timestamp, CSN, server, lag_time, duration, target_dn, suffix, and hop lag information. Suitable for spreadsheet analysis and external tooling.
replication_analysis.html: Standalone interactive Plotly visualization with 3 subplots:
Supports hover info, range selection, and zoom controls. Requires the python3-lib389-repl-reports package (Plotly dependency).
replication_analysis.png: Static matplotlib export with 2 subplots (global lags and operation durations). Requires the python3-lib389-repl-reports package (matplotlib dependency). No per-hop subplot due to matplotlib’s limitations with many series.
The Replication Log Analyzer is accessible via Monitor → Log Analyser in the 389 DS Cockpit WebUI. The interface provides a form-based configuration system with real-time validation and integrated file browsing capabilities.
The UI is organized into card-based sections with an expandable help section explaining the analysis process. Form validation occurs in real-time with error highlighting and helper text for invalid inputs.
The page opens with an expandable “About Replication Log Analysis” section that gives an overview of the analysis process. It serves as an interactive guide to the five steps: selecting server log directories, specifying suffixes, adjusting filters, choosing report formats, and generating the report.
File Browser Integration: Modal dialog for directory selection opens to /var/log/dirsrv by default. Supports navigation via path input or folder browsing with checkbox-based multi-selection.
Directory Management: Selected directories display in a DataList component with folder icons and remove buttons. The interface validates directory accessibility before allowing selection.
Input Field: Text input with real-time DN validation using the valid_dn() function. Invalid DNs trigger immediate error display.
Chip Display: Selected suffixes appear as removable PatternFly chips. Interface pre-populates with existing replicated suffixes from server configuration.
Display Options:
Time Range Controls:
Threshold Configuration:
Analysis Precision: Radio button selector controlling backend sampling strategy:
The selected precision value is passed to the backend via the --precision CLI parameter.
Format Options:
python3-lib389-repl-reports package

Package Detection: On mount, the interface runs rpm -q python3-lib389-repl-reports to check package availability. If missing, HTML and PNG checkboxes are disabled with explanatory tooltips.
Output Directory: Defaults to /tmp with optional custom directory selection via the file browser. Each report run creates a subdirectory — either using a custom report name or an auto-generated name with ISO 8601 timestamp and random suffix (e.g., repl_report_2025-01-15T14-23-45_a3f2b1).
Report Naming: Optional custom report names; defaults to timestamp-based naming.
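The naming rule for report subdirectories can be sketched as below. The WebUI itself is not written in Python, so this only illustrates the documented pattern (custom name, or timestamp plus a six-character random suffix); the function name `report_dir_name` is hypothetical.

```python
import secrets
from datetime import datetime
from typing import Optional


def report_dir_name(custom_name: Optional[str] = None,
                    now: Optional[datetime] = None) -> str:
    """Return a custom name if given, else a timestamped auto-generated one."""
    if custom_name:
        return custom_name
    now = now or datetime.now()
    # Colons are replaced by dashes so the name is safe as a directory name.
    stamp = now.strftime("%Y-%m-%dT%H-%M-%S")
    # token_hex(3) yields six hexadecimal characters, e.g. "a3f2b1".
    return f"repl_report_{stamp}_{secrets.token_hex(3)}"
```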
Process Flow:
Command Construction: Builds dsconf replication lag-report command with all configured parameters, including log directories, suffixes, time ranges, and output formats.
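Command construction of this kind can be sketched as follows. The function `build_lag_report_cmd` is hypothetical and covers only a subset of options; every flag it emits is one documented in the CLI section of this document.

```python
from typing import List, Optional, Sequence


def build_lag_report_cmd(
    instance: str,
    log_dirs: Sequence[str],
    suffixes: Sequence[str],
    output_dir: str,
    output_formats: Sequence[str] = ("json",),
    start_time: Optional[str] = None,
    end_time: Optional[str] = None,
    utc_offset: Optional[str] = None,
    precision: Optional[str] = None,
    max_chart_points: Optional[int] = None,
    anonymous: bool = False,
) -> List[str]:
    """Build the argv list for a lag-report run (no shell quoting needed)."""
    cmd = [
        "dsconf", instance, "replication", "lag-report",
        "--log-dirs", *log_dirs,
        "--suffixes", *suffixes,
        "--output-dir", output_dir,
        "--output-format", *output_formats,
    ]
    if start_time:
        cmd += ["--start-time", start_time]
    if end_time:
        cmd += ["--end-time", end_time]
    if utc_offset:
        # The "=" form keeps a leading "-" from being read as a new option.
        cmd.append(f"--utc-offset={utc_offset}")
    if precision:
        cmd += ["--precision", precision]
    if max_chart_points is not None:
        cmd += ["--max-chart-points", str(max_chart_points)]
    if anonymous:
        cmd.append("--anonymous")
    return cmd
```

Passing the result as a list to a process runner avoids quoting problems with suffixes such as `dc=example,dc=com`.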
Tabbed Interface: LagReportModal dialog with tabs dynamically adapting to available report formats. Each tab loads its data independently and asynchronously, with load tokens preventing race conditions.
CSNDetailModal showing the full propagation path for that CSN: origin server, arrival timeline with arrows showing hop-by-hop propagation, and detailed timing information for each server

PNG Tab (when PNG available): Static image display. PNG is read as binary data (up to 64 MiB), converted to a data URL via Blob/FileReader, and rendered as an <img> element
CSV Tab (when CSV available): Shows a text preview of the first 20 lines of CSV data in a <pre> code block
Report Discovery: “Choose Existing Report” button opens ChooseLagReportModal that scans the configured output directory for existing reports. Discovery logic:
replication_analysis.json, _summary.json, .html, .csv, .png)
stat or by parsing the directory name timestamp

Report Table: Displays report metadata sorted by creation time (newest first) with format availability indicators (checkmarks/X marks) and “View Report” actions that open the same LagReportModal used for new reports.
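Report discovery along these lines could be approximated with the sketch below. It assumes each report lives in its own subdirectory containing one or more of the five report files named in this document, and it uses the directory's modification time as a stand-in for creation time. The function name `discover_reports` and the returned dictionary layout are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# Report file names as used elsewhere in this document.
REPORT_FILES: Dict[str, str] = {
    "json": "replication_analysis.json",
    "summary": "replication_analysis_summary.json",
    "html": "replication_analysis.html",
    "csv": "replication_analysis.csv",
    "png": "replication_analysis.png",
}


def discover_reports(output_dir: str) -> List[dict]:
    """Return report subdirectories, newest first, with available formats."""
    reports = []
    for sub in Path(output_dir).iterdir():
        if not sub.is_dir():
            continue
        formats = {fmt: (sub / name).is_file() for fmt, name in REPORT_FILES.items()}
        if not any(formats.values()):
            # Not a report directory: none of the known files are present.
            continue
        reports.append({
            "name": sub.name,
            "created": sub.stat().st_mtime,  # approximation of creation time
            "formats": formats,
        })
    reports.sort(key=lambda r: r["created"], reverse=True)
    return reports
```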
The replication lag analyzer is also available as a CLI tool through dsconf:
dsconf INSTANCE replication lag-report [options]
--log-dirs: List of log directories to analyze. Each directory represents one server in the replication topology.
--log-dirs /var/log/dirsrv/slapd-supplier1 /var/log/dirsrv/slapd-consumer1
--suffixes: List of suffixes (naming contexts) to analyze.
--suffixes "dc=example,dc=com" "dc=test,dc=com"
--output-dir: Directory where analysis reports will be written.
--output-dir /tmp/repl_analysis
--output-format: Specify one or more output formats. Options: html, json, png, csv. Default: html.
--output-format json csv png
--json: Output results as JSON for programmatic use or UI integration.
Replication Status Filters (mutually exclusive):
Threshold Filters:
--start-time: Start time for analysis in YYYY-MM-DD HH:MM:SS format. Default: 1970-01-01 00:00:00
--end-time: End time for analysis in YYYY-MM-DD HH:MM:SS format. Default: 9999-12-31 23:59:59
--utc-offset: UTC offset in ±HHMM format for timezone handling (e.g., -0400, +0530)
--anonymous: Anonymize server names in reports (replaces with generic identifiers)
--precision: Analysis precision vs speed. Choices: fast, balanced (default), full. Controls the sampling aggressiveness when generating chart data. See Performance & Precision Controls for details.
--max-chart-points: Maximum total data points to include across all chart series. Sampling is applied if exceeded. Default depends on --precision setting. Overrides the precision preset when specified explicitly.
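The time and offset formats above can be validated before a run with a sketch like this one. The helper names `parse_utc_offset` and `parse_report_time` are hypothetical, and this is not the tool's own parsing code.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_utc_offset(value: str) -> timezone:
    """Convert a ±HHMM string such as "-0400" into a timezone object."""
    match = re.fullmatch(r"([+-])(\d{2})(\d{2})", value)
    if not match:
        raise ValueError(f"expected ±HHMM, got {value!r}")
    sign, hours, minutes = match.groups()
    if int(hours) > 23 or int(minutes) > 59:
        raise ValueError(f"offset out of range: {value!r}")
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return timezone(-delta if sign == "-" else delta)


def parse_report_time(value: str, tz: timezone) -> datetime:
    """Parse a YYYY-MM-DD HH:MM:SS string and attach the given offset."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
```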
Basic analysis:
dsconf supplier1 replication lag-report \
--log-dirs /var/log/dirsrv/slapd-supplier1 /var/log/dirsrv/slapd-consumer1 \
--suffixes "dc=example,dc=com" \
--output-dir /tmp/repl_report
Advanced analysis with filtering:
dsconf supplier1 replication lag-report \
--log-dirs /var/log/dirsrv/slapd-supplier1 /var/log/dirsrv/slapd-consumer1 \
--suffixes "dc=example,dc=com" \
--output-dir /tmp/repl_report \
--output-format json csv png \
--lag-time-lowest 1.0 \
--only-fully-replicated \
--start-time "2025-01-01 00:00:00" \
--end-time "2025-01-31 23:59:59" \
--utc-offset "-0500"
Fast preview of a large dataset:
dsconf supplier1 replication lag-report \
--log-dirs /var/log/dirsrv/slapd-supplier1 /var/log/dirsrv/slapd-consumer1 /var/log/dirsrv/slapd-consumer2 \
--suffixes "dc=example,dc=com" \
--output-dir /tmp/repl_report \
--output-format json \
--precision fast
Full precision with custom chart point limit:
dsconf supplier1 replication lag-report \
--log-dirs /var/log/dirsrv/slapd-supplier1 /var/log/dirsrv/slapd-consumer1 \
--suffixes "dc=example,dc=com" \
--output-dir /tmp/repl_report \
--output-format json csv \
--precision full \
--max-chart-points 15000
Simon Pichugin (@droideck)