Vertica SQL: overwriting data on insert


How do you overwrite a table each time an INSERT statement runs in Vertica?

Consider:

INSERT INTO table1 VALUES ('My Value');

This gives, say:

| MyCol  |
----------
My Value

How can I overwrite the same table on the next INSERT statement, say:

INSERT INTO table1 VALUES ('My Value2');

so that the table then contains only:

| MyCol  |
----------
My Value2

Solution

You can either DELETE or TRUNCATE your table. Vertica has no overwrite mode for INSERT. Use TRUNCATE, since you want one and only one value.


INSERT INTO table1 VALUES ('My Value');
TRUNCATE TABLE table1;
INSERT INTO table1 VALUES ('My Value2');

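Or use DELETE inside a transaction (if the connection is lost before the COMMIT, none of it takes effect).

Rollback

An individual statement returns an ERROR message. In this case, Vertica rolls back the statement.

DDL errors, systemic failures, deadlocks, and resource constraints return a ROLLBACK message. In this case, Vertica rolls back the entire transaction.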

INSERT INTO table1 VALUES ('My Value');
DELETE FROM table1
WHERE MyCol != 'My Value2';
INSERT INTO table1 VALUES ('My Value2');
COMMIT;
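
A simplified sketch of the same transactional pattern, assuming the goal is simply to leave exactly one row in table1: delete whatever is there, insert the new value, and commit both together. Nothing is visible to other sessions until the COMMIT, and a lost connection before that point leaves the old row in place. This differs from TRUNCATE TABLE, which Vertica commits immediately and which cannot be rolled back.

```sql
-- Assumes table1 has a single column, as in the question.
DELETE FROM table1;                        -- remove the current row(s); still uncommitted
INSERT INTO table1 VALUES ('My Value2');   -- add the replacement row
COMMIT;                                    -- both changes become visible together
```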

I might suggest that you not do this at all.

The simplest method is to populate the table with a single row, perhaps:

insert into table1 (value)
    values (null);

Then use UPDATE, not INSERT:

update table1
    set value = ?;

This would solve your problem.
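
Put together, the seed-once-then-update pattern might look like the sketch below. It assumes table1 has a single column named value, as in this answer, and uses literal strings where the answer uses a `?` placeholder.

```sql
-- One-time setup: seed the table with a single placeholder row.
INSERT INTO table1 (value) VALUES (NULL);
COMMIT;

-- Every later "overwrite" is an update of that one row.
UPDATE table1 SET value = 'My Value';
COMMIT;

UPDATE table1 SET value = 'My Value2';
COMMIT;

SELECT value FROM table1;  -- the table still holds exactly one row
```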

If you insist on using INSERT, you can add an identity column to the table and use a view to get the most recent value:

create table table1 (
    table1_id identity(1,1),
    value varchar(255)
);

Then access the table through a view:

create view v_table1 as
    select value
    from table1
    order by table1_id desc
    limit 1;
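
A short usage sketch for the table and view above. Writers keep issuing plain INSERT statements, naming only the value column so that Vertica generates table1_id itself, and readers query the view instead of the table.

```sql
-- Writers: every new value is just another row.
INSERT INTO table1 (value) VALUES ('My Value');
INSERT INTO table1 (value) VALUES ('My Value2');
COMMIT;

-- Readers: the view returns only the row with the highest table1_id.
SELECT value FROM v_table1;
```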

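If the view becomes inefficient, you can periodically empty the table.

One advantage of this approach is that the table is never empty and is never locked for long, so it is generally available. Deleting rows and then inserting rows can be tricky in this respect.

If you really like triggers, you can use a table as above, then use a trigger to update the row in another table that holds a single row. This also maximizes availability, without the overhead of fetching the most recent value.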

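If it is a single-row table, there is no risk at all in filling it with a single row that can be NULL, as @Gordon Linoff suggests.

Internally, you should be aware that Vertica always implements an UPDATE as a DELETE, by adding a delete vector for the row, followed by an INSERT.

That is no problem with a single-row table. The Tuple Mover (a background daemon that wakes up every 5 minutes to, put simply, defragment the internal storage) will create a single Read Optimized Storage (ROS) container out of: the previous value; the delete vector pointing to that previous value, which invalidates it; and the newly inserted value it was updated to. So: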

CREATE TABLE table1 (
  mycol VARCHAR(16)
) UNSEGMENTED ALL NODES; -- a small table, replicate it across all nodes
-- now you have an empty table
-- for the following scenario, I assume you commit the changes every time, as other connected
-- processes will want to see the data you changed
-- then, only once:
INSERT INTO table1 VALUES (NULL::VARCHAR(16));
-- now, you get a ROS container for one row.
-- Later:
UPDATE table1 SET mycol='first value';
-- a DELETE vector is created to mark the initial "NULL" value as invalid
-- a new row is added to the ROS container with the value "first value"
-- Then, before 5 minutes have elapsed, you go:
UPDATE table1 SET mycol='second value';
-- another DELETE vector is created, in a new delete-vector-ROS-container,
-- to mark "first value" as invalid
-- another new row is added to a new ROS container, containing "second value"
-- Now 5 minutes have elapsed since the start, the Tuple Mover sees there's work to do,
-- and:
-- - it reads the ROS containers containing "NULL" and "first value"
-- - it reads the delete-vector-ROS containers marking both "NULL" and "first value"
--    as invalid
-- - it reads the last ROS container containing "second value"
-- --> and it finally merges all into a brand new ROS container, to only contain
--     "second value", and, at the end, the four other ROS containers are deleted.

With a single-row table, this approach works fine. Don't do it with a billion rows.
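
To watch the containers and delete vectors described above come and go, one option is to query Vertica's monitoring tables between updates. This is a sketch only: it assumes the projections of table1 have names starting with "table1" (the usual auto-generated naming), which may not hold in every database.

```sql
-- ROS containers currently backing table1's projections.
SELECT node_name, projection_name, total_row_count, deleted_row_count
FROM v_monitor.storage_containers
WHERE projection_name ILIKE 'table1%';   -- assumed projection naming

-- Delete vectors that have not yet been merged away.
SELECT node_name, projection_name, deleted_row_count
FROM v_monitor.delete_vectors
WHERE projection_name ILIKE 'table1%';   -- assumed projection naming
```

Running both queries right after an UPDATE and again after the Tuple Mover has run should show the extra containers and delete vectors disappearing.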