Moving data between PostgreSQL databases with conflicting keys

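Situation

I have 2 databases that were once direct copies of each other, but they now contain new, different data.

What I want to do

I want to move data from database "SOURCE" to database "TARGET". The problem is that the tables use auto-incremented keys, and since both databases are in use at the same time, many of the IDs are already taken in TARGET, so I cannot just identity-insert the data coming from SOURCE.

In theory, though, we could skip identity insert altogether and let the database take care of assigning new IDs.

What makes it harder is that we have about 50 tables, all connected to each other by foreign keys. Obviously the foreign keys have to change as well, otherwise they will no longer reference the correct rows.

Let's look at a very simplified example: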

table Human {
  id integer NOT NULL PK AutoIncremented
  name varchar NOT NULL
  parentId integer NULL FK -> Human.id 
}

table Pet {
  id integer NOT NULL PK AutoIncremented
  name varchar NOT NULL
  ownerId integer NOT NULL FK -> Human.id 
}

SOURCE Human
Id      name      parentId
==========================
1       Aron      null
2       Bert      1
3       Anna      2

SOURCE Pet
Id      name      ownerId
==========================
1       Frankie   1
2       Doggo     2    

TARGET Human
Id      name      parentId
==========================
1       Armin     null
2       Cecil     1

TARGET Pet
Id      name      ownerId
==========================
1       Gatto     2

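Let's say I want to move Aron, Bert, Anna, Frankie and Doggo to the TARGET database.

But if we simply insert them without caring about the original IDs, the foreign keys get garbled: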

TARGET Human
Id      name      parentId
==========================
1       Armin     null
2       Cecil     1
3       Aron      null
4       Bert      1
5       Anna      2

TARGET Pet
Id      name      ownerId
==========================
1       Gatto     2 
2       Frankie   1
3       Doggo     2

Anna's parent is now Cecil, and Doggo's owner is Cecil instead of Bert. Bert's parent is Armin instead of Aron.

What I want it to look like is:

TARGET Human
Id      name      parentId
==========================
1       Armin     null
2       Cecil     1
3       Aron      null
4       Bert      3
5       Anna      4

TARGET Pet
Id      name      ownerId
==========================
1       Gatto     2 
2       Frankie   3
3       Doggo     4

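Imagine about 50 tables like these with 1000 rows each, so the solution has to be automated.

Questions

Is there a specific tool I can use?

Is there some simple SQL logic that does precisely this?

Do I need to roll my own software to do this (e.g. a service that connects to both databases, reads everything with all its relations through EF, and saves it to the other database)? I fear there are too many gotchas and that it would be time consuming.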

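Answer

Is there a specific tool? Not as far as I know.
Is there some simple SQL? Not exactly simple, but not all that complex either.
Do you need to roll your own? Maybe; it depends on whether you decide to use the SQL (below).

I guess there is no direct path. The problem, as you pointed out, is reassigning the FK values. The following adds a column to every table that can be used to match rows across the table sets. For this I would use a uuid. You can then copy from one table set to the other, leaving out the FK columns. After copying, you can join on the uuid to fill in the FKs.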

-- establish a reference field unique across databases.
-- (gen_random_uuid() is built in from PostgreSQL 13; older versions need the pgcrypto extension)
 alter table target_human add sync_id uuid default gen_random_uuid ();
 alter table target_pet   add sync_id uuid default gen_random_uuid ();
 alter table source_human add sync_id uuid default gen_random_uuid ();
 alter table source_pet   add sync_id uuid default gen_random_uuid ();  
 
-- copy the source rows into the target, leaving parentid null for now
 insert into target_human(name,sync_id)
   select name,sync_id 
     from source_human;
 
-- fill in parentid: find each child's parent in source, then map it to the copied parent's new id via sync_id
with conv (sync_parent,sync_child,new_parent) as 
     ( select h2p.sync_id sync_parent,h2c.sync_id sync_child,h1.id new_parent
         from source_human h2c
         join source_human h2p on h2c.parentid = h2p.id
         join target_human h1  on h1.sync_id = h2p.sync_id 
     ) 
update target_human  h1
   set parentid = c.new_parent
  from conv c 
 where h1.sync_id = c.sync_child;
----------------------------------------------------------------------------------------------- 
-- pets: temporarily allow a null ownerId so rows can be copied before the owners are remapped
alter table target_pet alter column ownerId drop not null; 

insert into target_pet(name,sync_id) 
  select name,sync_id 
    from source_pet ;
    
with conv ( sync_pet,new_owner) as 
     ( select p2.sync_id,h1.id 
         from source_pet p2 
         join source_human h2  on p2.ownerid = h2.id
         join target_human h1  on h2.sync_id = h1.sync_id
     )  
update target_pet  p1
   set ownerid = c.new_owner 
  from conv c 
 where p1.sync_id = c.sync_pet; 
 
alter table target_pet alter column ownerId set not null;
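
Before moving on to the next table set, it may be worth checking that the remapped FKs point at the right rows. The query below is only a sketch of one such check, using the example table and column names from above: for every copied pet it compares the owner's sync_id in the source with the owner's sync_id in the target. If the remap worked, it should return no rows.

```sql
-- Sanity check (a sketch; names follow the example above).
-- Lists every copied pet whose owner in the target is not the
-- copy of its owner in the source. An empty result is the goal.
select sp.name as pet,
       sh.name as source_owner,
       th.name as target_owner
  from source_pet   sp
  join source_human sh on sh.id = sp.ownerid
  join target_pet   tp on tp.sync_id = sp.sync_id
  join target_human th on th.id = tp.ownerid
 where sh.sync_id is distinct from th.sync_id;
```

The same pattern should carry over to the self-referencing parentid column by joining the human tables to themselves.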

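See demo. You can now reverse the source and target table definitions to complete the other side of the sync. You can then drop the uuid columns if you wish, but you may want to keep them: if the databases got out of sync once, it will probably happen again.

You could even go a step further and make the UUID your PK/FK. Then you simply copy the data and the keys remain correct, but that might involve updating the apps to the revised DB structure.

This does not address the communication across databases, but I assume you already have that handled. You will need to repeat the steps for each table set; perhaps you can write a script to generate them. Further, I would guess this has fewer gotchas and is less time consuming than rolling your own. It is basically 5 queries per table set, and for cleaning up the current mess even 500 queries is not that many.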
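
As one possible starting point for generating the repeated statements, the catalog can produce them for you. The query below is a minimal sketch that emits the "add sync_id" statement for every base table in a schema. It assumes all of your tables live in the public schema and that none of them already has a sync_id column; both are assumptions, so adjust to your setup.

```sql
-- Sketch: emit one ALTER TABLE statement per base table.
-- Assumption: the tables to sync are all in the 'public' schema.
select format(
         'alter table %I.%I add column sync_id uuid default gen_random_uuid();',
         table_schema, table_name) as ddl
  from information_schema.tables
 where table_schema = 'public'
   and table_type   = 'BASE TABLE'
 order by table_name;
```

Review the generated statements before running them. In psql, ending the query with \gexec instead of a semicolon should execute each generated row as a statement. The copy and update statements depend on each table's foreign keys, so generating those would need the FK metadata as well (for example from pg_constraint), which is not shown here.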