DBridge

DB Bridge

An ETL batch engine for large-scale data migration between disparate DBMS.

v1.0.0 · Updated 2026-04 · Multi-process ETL Engine · 6 DB Adapters

Architecture

By security policy, the Source DB and Target DB are not connected directly; all traffic goes through a central hub engine. A multi-process worker pool executes SELECT, Transform, and INSERT in parallel based on stage dependencies, while a Meta DB manages progress and resumption.
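The stage-dependency behaviour described above can be sketched as follows. This is a minimal illustration, not DBridge's scheduler: `run_stages` and `run_task` are hypothetical names, a thread pool stands in for the multi-process pool so the sketch runs anywhere, and it assumes the simplest cancellation rule (any failure in a stage cancels every later stage).

```python
from concurrent.futures import ThreadPoolExecutor


def run_stages(stages, run_task, max_workers=4):
    """Run stages in order; tasks inside one stage run in parallel.

    stages: list of lists of task names (list index = stage number).
    run_task: callable(task_name) that raises on failure.
    Returns a dict mapping task name to "success", "failed" or "canceled".
    """
    status = {}
    failed = False
    for tasks in stages:
        if failed:
            # A precursor stage failed: cancel everything downstream.
            for name in tasks:
                status[name] = "canceled"
            continue
        # Leaving the with-block waits for every task of this stage.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {name: pool.submit(run_task, name) for name in tasks}
        for name, fut in futures.items():
            if fut.exception() is None:
                status[name] = "success"
            else:
                status[name] = "failed"
                failed = True
    return status
```

A real engine would track dependencies per task rather than per stage and would persist each status to the Meta DB so that a run can be resumed.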

The Problem We Solve

Whether for one-off migrations or periodic loads, moving data between disparate DBMS easily bottlenecks on a single core — held back by single-connection round-trips, client-side memory limits, and long-running transactions. DBridge logically partitions tables into N pieces, stabilizes memory using vendor-specific Server-Side Cursors, and maximizes throughput up to the hardware limits through multi-processing.
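As an illustration of logical partitioning, the sketch below builds WHERE predicates that split one table into N disjoint slices, one per worker. The helper names are hypothetical and the page does not show how DBridge generates its predicates; the sketch assumes an integer key column with known minimum and maximum values.

```python
def range_partitions(min_id, max_id, n):
    """Split the closed interval [min_id, max_id] into n contiguous ranges."""
    total = max_id - min_id + 1
    size, extra = divmod(total, n)
    bounds, lo = [], min_id
    for i in range(n):
        # The first `extra` ranges take one additional key each.
        hi = lo + size + (1 if i < extra else 0) - 1
        bounds.append((lo, hi))
        lo = hi + 1
    return bounds


def range_predicates(column, min_id, max_id, n):
    """RANGE style: each worker reads one contiguous key interval."""
    return [f"{column} BETWEEN {lo} AND {hi}"
            for lo, hi in range_partitions(min_id, max_id, n)]


def mod_predicates(column, n):
    """MOD style: each worker reads the keys with one remainder.

    MOD() exists in Oracle, PostgreSQL and MySQL; MSSQL uses the % operator.
    """
    return [f"MOD({column}, {n}) = {i}" for i in range(n)]


print(range_predicates("id", 1, 10, 3))
# ['id BETWEEN 1 AND 4', 'id BETWEEN 5 AND 7', 'id BETWEEN 8 AND 10']
```

RANGE slices suit indexed, evenly distributed keys; MOD slices spread skewed key ranges more evenly but usually prevent an index range scan.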

Supported DBMS & Datastores (6 Types)

PostgreSQL (Source · Target): Server-Side Cursor (Default)
Oracle (Source · Target): Server-Side Cursor · CLOB Processing
MySQL · MariaDB (Source · Target): Server-Side Cursor
MSSQL (Source · Target): Client-Side (Partitioning recommended for large volumes)
Memgraph (Primarily Target): Graph DB, Node/Relationship load (batch_sep=G/R)
MeiliSearch (Target): Search Indexing (Write-only)

4 Relational DBs + Graph (Memgraph) + Search (MeiliSearch). Source/Target adapters are implemented separately to match each vendor's characteristics — "we don't treat every database the same."
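The memory benefit of cursor-based reads comes from pulling rows in bounded batches instead of loading a whole result set. The sketch below shows that pattern for any DB-API 2.0 cursor. It is not DBridge's adapter code: `iter_batches` is a hypothetical helper, and sqlite3 stands in for the source database so the example runs without a server. With psycopg2, for instance, passing `name=` to `connection.cursor()` creates a server-side cursor, and the same loop then fetches each batch from the server on demand.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from a DB-API 2.0 cursor, batch_size at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# sqlite3 is only a stand-in for the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO src (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(2500)],
)
cur = conn.execute("SELECT id, name FROM src ORDER BY id")
batch_sizes = [len(batch) for batch in iter_batches(cur, batch_size=1000)]
print(batch_sizes)  # [1000, 1000, 500]
```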

Components

Multi-process Worker Pool: Multiple worker pools. Workers are dynamically allocated to the Stage 0 background pool and the main pool.
Logical Partitioning: RANGE / MOD / ROWID / HASH. Divides a single table into N segments for parallel processing.
Stage Dependency: Sequential between stages, parallel within stages. Dependent tasks are automatically canceled if precursors fail.
Server-Side Cursor: Memory stability + network efficiency. Vendor-specific implementations for Oracle, PostgreSQL, and MySQL.
Double Lock: PID + DB double lock + auto-cleanup of stale locks. Multi-instance safe.
ErrorDebugger: Automatically traces problematic rows using binary search on failure.
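The binary-search idea behind ErrorDebugger can be sketched as follows. This is an illustration of the technique, not the component's actual code: `find_bad_row` and `try_insert` are hypothetical names, and the sketch assumes a failure is caused by an individual row (not by an interaction between rows) and that a failed attempt leaves nothing behind, for example because it is rolled back.

```python
def find_bad_row(rows, try_insert):
    """Return the index of the first row that makes try_insert fail, or None.

    try_insert(batch) must return True on success and False on failure.
    """
    if not rows or try_insert(rows):
        return None
    lo, hi = 0, len(rows)  # invariant: rows[lo:hi] contains a bad row
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if try_insert(rows[lo:mid]):
            lo = mid  # left half is clean, so the fault is on the right
        else:
            hi = mid  # left half already fails
    return lo


# Usage: row 37 is the only one the (simulated) target rejects.
rows = list(range(100))
bad_index = find_bad_row(rows, lambda batch: 37 not in batch)
print(bad_index)  # 37
```

For a failed batch of n rows this needs about log2(n) extra insert attempts instead of n single-row retries.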

ETL/ELT Model

DBridge delegates Transform to the Target DBMS's SQL. Column mappings, filters, aggregations, and Joins are defined in the SELECT queries within the Meta Query Definitions, while the engine solely handles data type conversion, NULL normalization, and LOB processing.

This follows the ELT pattern, using raw SQL instead of visual mapper GUIs. It preserves the target DB optimizer's execution plan and pushes highly variable business rules down to the SQL layer, leaving the engine untouched. This keeps the "no project-specific hardcoding" principle intact at the core.
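The engine-side work named above (type conversion, NULL normalization, LOB processing) could look roughly like the sketch below. Every rule in it is an assumption chosen for illustration, not DBridge's documented behaviour: LOB handles are treated as any object with a `read()` method, binary values are converted to `bytes`, and empty strings are optionally mapped to NULL (Oracle itself treats an empty string as NULL).

```python
def normalize_value(value, empty_as_null=True):
    """Convert one fetched value into a plain, insertable Python value."""
    # LOB handles expose read(); materialize their content first.
    if hasattr(value, "read"):
        value = value.read()
    # Binary buffers become immutable bytes.
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value)
    # Hypothetical NULL rule: an empty string is stored as NULL.
    if isinstance(value, str) and value == "" and empty_as_null:
        return None
    return value


def normalize_row(row, empty_as_null=True):
    """Apply normalize_value to every column of a row tuple."""
    return tuple(normalize_value(v, empty_as_null) for v in row)
```

Business rules such as column mapping or filtering stay out of this function; per the model above, they belong in the SQL.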

Measured Throughput

Single Process (INSERT): ~10,000 rows/sec (Network bottleneck)
10 Parallel Processes (Split INSERT): ~80,000 rows/sec
100M row migration (10 Parallel): Approx. 16 min
100M row CSV Export (10 Parallel): Approx. 2 min (manual import not included)

Source: Internal measurement (Oracle 19c / PostgreSQL 17, single-box environment, standard INSERT). Results can vary depending on the target, its indexes, and its trigger configuration.

Operations / Reliability

Daily folder logs (app/log/YYYYMMDD/batch_main.log)
1-minute progress interval — Success / Failure / Running / Queued counts
Dependency failure propagation — Auto-cancels dependent tasks on precursor failure
TRUNCATE → INSERT idempotent design (Safe to re-run)
Auto-cleanup of Stale Locks (PID + DB double)
Closed network assumption — No external network/CDN dependencies
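The TRUNCATE → INSERT item in the list above is what makes a re-run safe: the target table is emptied before every load, so running the same task twice ends in the same state. The sketch below demonstrates the idea with sqlite3, which has no TRUNCATE statement, so `DELETE FROM` stands in for it. The table, its columns, and `reload_table` are hypothetical.

```python
import sqlite3


def reload_table(conn, table, rows):
    """Empty the target table, then insert rows, in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(f"DELETE FROM {table}")  # stand-in for TRUNCATE TABLE
        conn.executemany(
            f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt (id INTEGER PRIMARY KEY, name TEXT)")
rows = [(1, "a"), (2, "b"), (3, "c")]
reload_table(conn, "tgt", rows)
reload_table(conn, "tgt", rows)  # re-run: same end state, no duplicates
count = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
print(count)  # 3
```

On Oracle and MySQL, TRUNCATE is DDL with an implicit commit and cannot be rolled back together with the INSERT, whereas PostgreSQL's TRUNCATE is transactional. A failed load on the former leaves an empty or partial table until the task is re-run.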

Operations Console Example

An example (MOCKUP) of an operations UI layout that brings batch progress, failed rows, and throughput trends onto a single screen. We tailor it to fit how your project actually runs.

DBridge Console — Operations MOCKUP

Status counters: Running · Queued · Success 1,284 · Failed

Throughput (rows/sec) · Last 60 min peak 84,210

Running Tasks · Showing 5 / Total 49

Task ID	Status	Stage	Progress	Elapsed
`BT_USER_HIST_2024`	running	2 / 4	6.4M / 12.0M	04:21
`BT_TXN_LOG_RANGE_03`	running	3 / 4	1.2M / 3.0M	01:08
`BT_DOC_INDEX_MEILI`	queued	0 / 2	—	—
`BT_GRAPH_REL_LOAD`	success	4 / 4	8.9M / 8.9M	12:47
`BT_LEGACY_HWPX_BLOB`	failed	2 / 3	0.8M / 4.5M	03:12

※ Mockup — Actual screens may differ from the mockup.

Specifications

Version: v1.0.0
License: Private (Internal project)
Runtime: Multi-process ETL Engine · 6 DB Adapters
Meta Storage: PostgreSQL 18 (Meta tables for batch rules, execution history, etc.)
Deployment: Docker compose · Single box · Closed network assumption
Entry Point: CLI Batch (`app/main.py`) — External cron integration via exit code 0/1
Ops UI: Planned (Mockup — see Ops Console above)
Last Updated: 2026-04
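The exit-code contract of the CLI entry point can be sketched as follows. This is an assumed shape, not the content of `app/main.py`: `run_batch` is a hypothetical callable standing in for the engine, and the mapping of "any failed task" to exit code 1 is an assumption.

```python
import sys


def main(run_batch):
    """Return 0 if the batch finished cleanly, 1 otherwise.

    run_batch is a hypothetical callable that returns the number of
    failed tasks, or raises on a fatal error.
    """
    try:
        failed = run_batch()
    except Exception as exc:
        print(f"batch aborted: {exc}", file=sys.stderr)
        return 1
    return 0 if failed == 0 else 1


# A script entry point would typically end with: sys.exit(main(run_batch))
```

A cron wrapper can then branch on the process exit status to decide whether to alert or to start a follow-up job.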

Additional metrics (Memory footprint, CPU usage, concurrent batch limits, etc.) are specified above in Measured Throughput along with the measurement environment.

Security & Compliance

License: Private (README "Internal use only")
Operating Premise: Closed Network — No external network dependencies, single box deployment
Data Safety: Source DB is Read-Only, Target DB is INSERT-Only — Source modification prohibited
Locking & Recovery: PID + DB Double Lock, Auto-cleanup of Stale Locks, Multi-tier Timeout
Failure Isolation: Stage-level retry, failed rows loaded separately
Tech Support: A dedicated technical team is assigned to each project, providing incident monitoring and a support channel.
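The PID half of the double lock listed under Locking & Recovery, including stale-lock cleanup, can be sketched as follows. This is a POSIX-only illustration with hypothetical function names; it is not DBridge's implementation and it omits the DB-side lock, which is what would cover the small race between removing a stale file and recreating it.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def acquire_pid_lock(path):
    """Create the lock file atomically; clear it first if its owner is dead."""
    for _ in range(2):
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            with open(path) as f:
                text = f.read().strip()
            if text.isdigit() and pid_alive(int(text)):
                return False  # another live instance holds the lock
            os.remove(path)  # stale lock: the owner is gone
            continue
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))
        return True
    return False
```

The caller would remove the lock file on normal shutdown; the stale-lock path only matters after a crash or a kill.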

Getting Started

Requirement Review — Source/Target DBMS types, table sizes, migration windows, network
Mapping Definition — Register stages, tables, and partitioning policies in meta tables (Batch rules, Meta Query definitions)
Rehearsal → Main Migration — Partial to Full, transitioning to production after verification

Considering Cubiware for your organization?

We will guide you through setup and rollout tailored to your requirements and operating environment. Reach out for a demo or a proposal.