Postgres数据库truncate表无有效备份恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Postgres数据库truncate表无有效备份恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

创建一个Postgres表,并插入数据

postgres=# CREATE TABLE "PeisInterfaceLog"(
postgres(#         "PeisInterfaceLogId" text,
postgres(#         "PeisDepartmentId" text,
postgres(#         "PeisDepartmentName" text,
postgres(#         "PeisInterfaceSubjectType" text,
postgres(#         "PeisInterfaceSubjectId" text,
postgres(#         "PeisInterfaceNo" text,
postgres(#         "PeisInterfaceName" text,
postgres(#         "PeisInterfaceDirection" text,
postgres(#         "PeisInterfaceCallAddress" text,
postgres(#         "PeisInterfaceLogStartTime" timestamp,
postgres(#         "PeisInterfaceInputContent" json,
postgres(#         "PeisInterfaceInputTranscodeContent" json,
postgres(#         "PeisInterfaceOutContent" json,
postgres(#         "PeisInterfaceOutTranscodeContent" json,
postgres(#         "PeisInterfaceSuccessSign" int4,
postgres(#         "PeisInterfaceLogStatusCode" text,
postgres(#         "PeisInterfaceLogNote" text,
postgres(#         "PeisInterfaceLogTimestamp" timestamp,
postgres(#         "PeisInterfaceLogInfo" text,
postgres(#         "PeisPatientRegisterId" text
postgres(# );
CREATE TABLE
postgres=# \i /postgres/COPY/public_copy.sql 
SET
COPY 722957
postgres=# select count(1) from "PeisInterfaceLog";
 count  
--------
 722957
(1 row)

验证truncate操作,引起Postgres中oid的变化

postgres=# checkpoint;
CHECKPOINT
postgres=#  show data_directory;
 data_directory 
----------------
 /pgdata
(1 row)
postgres=# select oid, datname from pg_database ;
  oid  |  datname  
-------+-----------
 13676 | postgres
     1 | template1
 13675 | template0
(3 rows)
postgres=# select relname, relowner, relfilenode from pg_class where relowner = 10 and relname like '%PeisInterfaceLog%';
     relname      | relowner | relfilenode 
------------------+----------+-------------
 PeisInterfaceLog |       10 |       16384
(1 row)

postgres=# truncate table "PeisInterfaceLog";
TRUNCATE TABLE
postgres=# select count(1) from  "PeisInterfaceLog";
 count 
-------
     0
(1 row)

postgres=# select relname, relowner, relfilenode from pg_class where relowner = 10 and relname like '%PeisInterfaceLog%';
     relname      | relowner | relfilenode 
------------------+----------+-------------
 PeisInterfaceLog |       10 |       16394
(1 row)

使用工具进行初始化字典信息

PDU.public=# b;

开始初始化...
 -pg_database:</pgdata/global/1262>

数据库:postgres 
      -pg_schema:</pgdata/base/13676/2615>
      -pg_class:</pgdata/base/13676/1259> 共88行
      -pg_attribute:</pgdata/base/13676/1249> 共2950行
      模式:
        ▌ public 1张表

PDU.public=# use postgres;

┌────────────────────────────────────────┐
│          模式             │  表数量    │
├────────────────────────────────────────┤
│    public                 │  1         │
└────────────────────────────────────────┘
postgres.public=# set public;

┌──────────────────────────────────────────────────┐
│               表名                  │  表大小    │
├──────────────────────────────────────────────────┤
│    PeisInterfaceLog                 │  0         │
└──────────────────────────────────────────────────┘
        仅显示表大小排名前 1 的表名
postgres.public=# \d PeisInterfaceLog;

┌──────────────────────────────────────────────────────────────┐
│                            列类型                            │
└──────────────────────────────────────────────────────────────┘
text,text,text,text,text,text,text,text,text,timestamp,json,json,json,json,int4,text,text,timestamp,text,text

配置软件磁盘扫描操作(pdu.ini中配置)

#dropScan需要扫描的磁盘
DISK_PATH=/data/test.dd
#dropScan时跳跃的数据块数量,数值越小覆盖磁盘越全面,速度越慢
BLOCK_INTERVAL=5

启用磁盘扫描操作

PDU.public=# p idxmode off;

┌─────────────────────────────────────────────────────────────────┐
│              参数                │             当前值           │
├─────────────────────────────────────────────────────────────────┤
│    startwal                      │                              │
│    endwal                        │                              │
│    startlsnt                     │                              │
│    endlsnt                       │                              │
│    starttime                     │                              │
│    endtime                       │                              │
│    resmode(Data Restore Mode)    │              TIME            │
│    exmode(Data Export Mode)      │              CSV             │
│    encoding                      │              UTF8            │
│    restype(Data Restore Type)    │              DELETE          │
          ----------------------DropScan----------------------
│    dsoff(DropScan startOffset)   │              0               │
│    blkiter(Block Intervals)      │              5               │
│    itmpcsv(Items Per Csv)        │              100             │
│    idxmode                       │              off             │
└─────────────────────────────────────────────────────────────────┘
PDU.public=# ds;

 ▌全量扫描恢复模式 

 ▌数据文件扫描 
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
          表名           │                                   结果                                    
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
 PeisInterfaceLog           99.976 %(21469593600)   数据页: 71158      成功: 722947    (疑似乱码: 2809      ) 失败: 0 

耗时 15.28 秒
└────────────────────────────────────────────────────────────────────────────────────────────────────┘
扫描完毕,文件目录如下: 
        restore/dropscan/PeisInterfaceLog
PDU.public=# ds copy;
已导出: 
/restore/dropscan/PeisInterfaceLog/COPY.sql
PDU.public=# 

[root@xifenfei PeisInterfaceLog]# more COPY.sql 
[root@xifenfei PeisInterfaceLog]# cat /restore/dropscan/PeisInterfaceLog/COPY.sql
COPY PeisInterfaceLog FROM '/restore/dropscan/PeisInterfaceLog/09-28-21:29:55_226738176_32760blks_336787items.csv';
COPY PeisInterfaceLog FROM '/restore/dropscan/PeisInterfaceLog/09-28-21:30:01_595968000_32767blks_330416items.csv';
COPY PeisInterfaceLog FROM '/restore/dropscan/PeisInterfaceLog/09-28-21:30:04_1116061696_5631blks_55744items.csv';

把恢复数据导入到Postgres数据中

[root@xifenfei ~]# su - postgres 
[postgres@xifenfei ~]$ psql
psql (12.8)
Type "help" for help.

postgres=# \i /restore/dropscan/PeisInterfaceLog/COPY.sql
COPY 336787
COPY 330416
COPY 55744
postgres=# select count(1) from "PeisInterfaceLog";
 count  
--------
 722947
(1 row)

经过上述扫描测试证明该工具实现了在Postgres中truncate数据的绝大部分数据恢复(这里的乱码不是由于没有扫描到数据,主要是由于个别字符串由于类型判断关系识别不对导致乱码抛弃).
如果你遇到Postgres 数据库由于drop/truncate等误操作,而且无有效备份进行恢复,面临数据丢失风险,请第一时间保护现场(数据文件所在分区尽可能不要有写入操作),联系我们提供专业恢复技术支持:
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

由于空间满导致PostgreSQL数据库异常处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:由于空间满导致PostgreSQL数据库异常处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友和我反馈pg库异常,通过查看日志确认最初故障是由于磁盘空间满,导出出现类似:无法扩展文件 “pg_tblspc/16394/PG_13_202007201/16395/5055912.143″: No space left on device错误

2025-07-28 09:06:02.703 HKT [15352] 错误:  无法扩展文件 "pg_tblspc/16394/PG_13_202007201/16395/5055912.143": No space left on device
2025-07-28 09:06:02.703 HKT [15352] 提示:  检查空闲磁盘控件.
2025-07-28 09:06:02.703 HKT [15352] 语句:  insert into 语句
2025-07-28 09:06:02.703 HKT [576] 错误:  无法扩展文件 "pg_tblspc/16394/PG_13_202007201/16395/4477723.73": No space left on device
2025-07-28 09:06:02.703 HKT [576] 提示:  检查空闲磁盘控件.
2025-07-28 09:06:02.703 HKT [576] 语句:  update 语句
2025-07-28 09:06:02.706 HKT [11940] 错误:  无法扩展文件 "pg_tblspc/16394/PG_13_202007201/16395/5055912.143": No space left on device
2025-07-28 09:06:02.706 HKT [11940] 提示:  检查空闲磁盘控件.
2025-07-28 09:06:02.706 HKT [11940] 语句:  insert into 语句

pg-nofree


后续对d盘空间进行了清理,pg报无法打开文件”pg_tblspc/16394/PG_13_202007201/16395/4477706.16″(目标数据块5544906): Permission denied

2025-07-28 10:53:10.435 HKT [11920] 语句:  insert into语句
2025-07-28 10:53:10.435 HKT [18300] 错误:  无法打开文件"pg_tblspc/16394/PG_13_202007201/16395/4477706.16"(目标数据块5544906): Permission denied

和报:警告: 无法写入pg_tblspc/16394/PG_13_202007201/16395/4477706的块5545387错误

2025-07-28 11:05:11.008 HKT [5380] 警告:  无法写入pg_tblspc/16394/PG_13_202007201/16395/4477706的块5545387
2025-07-28 11:05:11.008 HKT [5380] 详细信息:  多次失败 --- 写错误可能是永久性的
2025-07-28 11:05:11.015 HKT [12072] 错误:  无法打开文件"pg_tblspc/16394/PG_13_202007201/16395/4477706.17"(目标块5545398):上一段只有952个块
2025-07-28 11:05:11.015 HKT [12072] 上下文:  写入关系pg_tblspc/16394/PG_13_202007201/16395/4477706的块5545398
2025-07-28 11:05:11.015 HKT [12072] 语句:  select 语句
2025-07-28 11:05:11.016 HKT [12072] 警告:  无法写入pg_tblspc/16394/PG_13_202007201/16395/4477706的块5545398
2025-07-28 11:05:11.016 HKT [12072] 详细信息:  多次失败 --- 写错误可能是永久性的

查看4477706相关文件情况,初步看部分文件大小不对,结合上述报错,可以断定该表异常
pgsize


通过查询pg字典确认该表具体名称

xifenfei=# select oid,relname,relnamespace from pg_class where  relfilenode=4477706;
  oid  |     relname      | relnamespace
-------+------------------+--------------
 16880 | xifenfei01_log   |         2200
(1 行记录)

尝试pg_dump导入数据报错:多次失败 — 写错误可能是永久性的

这次运气比较好,损坏的对象是一个日志表,可以直接清理掉,对该日志表进行清理,然后正常导出数据,如果遇到的损坏表是业务需要的表,可以使用pdu(PostgreSQL表文件损坏恢复—pdu恢复损坏的表文件)进行恢复

pg_control丢失/损坏处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:pg_control丢失/损坏处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

-l :设置下一个wal日志名
可以通过 $PGDATA/pg_wal 目录中查找数值最新的WAL段的文件名+1

[postgres@localhost pg_wal]$ ls
00000001000000000000008E  archive_status

-l 00000001000000000000008F

-m:设置下一个和最旧的事务ID
mxid1:下一个多事务ID的安全值可以通过在 $PGDATA/pg_multixact/offsets目录中查找数值最大的文件名+1,然后乘以65536(0×10000)来确定。
mxid2:最旧的事务ID的安全值可以通过$PGDATA/pg_multixact/offsets目录中数字最小的文件名+1,乘以65536(0×10000)来确定。文件名是十六进制

[postgres@localhost pg_wal]$ cd ../pg_multixact
[postgres@localhost pg_multixact]$ ls
members  offsets
[postgres@localhost pg_multixact]$ cd offsets/
[postgres@localhost offsets]$ ls
0000

-m 0x10000,0x10000

-O:设置下一个多事务处理偏移量
安全值可以通过在 $PGDATA/pg_multixact/members 目录中查找数值最大的文件名,+1,然后乘以52352(0xCC80)来确定。文件名为十六进制

[postgres@localhost members]$ ls
0000

-O 0xCC80

-x:设置下一个事务ID
安全值可以通过在$PGDATA/pg_xact目录中查找数值最大的文件名,+1,然后乘以1048576(0×100000)来确定。请注意,文件名是十六进制

[postgres@localhost pg_xact]$ ls -ltr
total 504
-rw------- 1 postgres postgres 262144 Mar  8 01:53 0000
-rw------- 1 postgres postgres 253952 Mar  9 10:54 0001

-x 0x200000

删除postmaster.pid文件
如果正常关闭库,该文件会被自动删除,异常关闭的才需要处理

[postgres@localhost pg_wal]$ pg_resetwal -l 00000001000000000000008F -m 0x10000,0x10000 -O 0xCC80 -x 0x200000 -f $PGDATA
pg_resetwal: error: lock file "postmaster.pid" exists
pg_resetwal: hint: Is a server running?  If not, delete the lock file and try again.
[postgres@localhost pg_wal]$ cd ../
[postgres@localhost data]$ ls -l postmaster.pid
-rw------- 1 postgres postgres 75 Mar  9 11:02 postmaster.pid
[postgres@localhost data]$ rm -rf postmaster.pid

touch pg_control文件

[postgres@localhost data]$ pg_resetwal -l 00000001000000000000008F -m 0x10000,0x10000 -O 0xCC80 -x 0x200000 -f $PGDATA
pg_resetwal: error: could not open file "global/pg_control" for reading: No such file or directory
pg_resetwal: hint: If you are sure the data directory path is correct, execute
  touch global/pg_control
and try again.
[postgres@localhost data]$ touch global/pg_control

重建pg_control

[postgres@localhost data]$ pg_resetwal -l 00000001000000000000008F -m 0x10000,0x10000 -O 0xCC80 -x 0x200000 -f $PGDATA
pg_resetwal: warning: pg_control exists but is broken or wrong version; ignoring it
Write-ahead log reset
[postgres@localhost data]$ pg_ctl start
waiting for server to start....
 done
server started

pg启动报invalid checkpoint record处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:pg启动报invalid checkpoint record处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

pg库启动报PANIC: could not locate a valid checkpoint record错误

2025-03-09 10:59:10.365 EDT [73013] LOG:  starting PostgreSQL 16.8 on x86_64-pc-linux-gnu, 
                                          compiled by gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), 64-bit
2025-03-09 10:59:10.365 EDT [73013] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2025-03-09 10:59:10.365 EDT [73013] LOG:  listening on IPv6 address "::", port 5432
2025-03-09 10:59:10.367 EDT [73013] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-03-09 10:59:10.401 EDT [73018] LOG:  database system was interrupted; last known up at 2025-03-09 10:54:43 EDT
2025-03-09 10:59:11.506 EDT [73018] LOG:  invalid checkpoint record
2025-03-09 10:59:11.508 EDT [73018] PANIC:  could not locate a valid checkpoint record
2025-03-09 10:59:12.004 EDT [73013] LOG:  startup process (PID 73018) was terminated by signal 6: Aborted
2025-03-09 10:59:12.004 EDT [73013] LOG:  aborting startup due to startup process failure
2025-03-09 10:59:12.006 EDT [73013] LOG:  database system is shut down

从报错信息中看,是有无法读取到有效的checkpoint记录导致,初步怀疑是wal异常,检查pg_wal目录,发现wal日志为空

[root@localhost pg_wal]# ls -ltr
total 16388
drwx------  2 postgres postgres        6 Mar  6 08:10 archive_status
[root@localhost pg_wal]# 

进一步检查系统操作命令rm删除wal日志命令

cd pg_wal
rm -rf 0000000*

初步确认是由于wal日志被意外删除导致pg库无法启动,尝试重置wal,由于库不是干净关闭,无法直接重置成功

[postgres@localhost log]$ pg_resetwal $PGDATA
The database server was not shut down cleanly.
Resetting the write-ahead log might cause data to be lost.
If you want to proceed anyway, use -f to force reset.

使用pg_resetwal -f强制重置

[postgres@localhost log]$ pg_resetwal -f $PGDATA
Write-ahead log reset

尝试启动pg库成功

[postgres@localhost log]$ pg_ctl start 
waiting for server to start....2025-03-09 11:02:02.609 EDT [73088] LOG:  
redirecting log output to logging collector process
2025-03-09 11:02:02.609 EDT [73088] HINT:  Future log output will appear in directory "log".
 done
server started
2025-03-09 11:02:02.609 EDT [73088] LOG:  starting PostgreSQL 16.8 on x86_64-pc-linux-gnu, 
                                          compiled by gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), 64-bit
2025-03-09 11:02:02.609 EDT [73088] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2025-03-09 11:02:02.609 EDT [73088] LOG:  listening on IPv6 address "::", port 5432
2025-03-09 11:02:02.610 EDT [73088] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-03-09 11:02:02.645 EDT [73092] LOG:  database system was shut down at 2025-03-09 11:01:53 EDT
2025-03-09 11:02:02.650 EDT [73088] LOG:  database system is ready to accept connections

pg误删除数据恢复(PostgreSQL delete数据恢复)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:pg误删除数据恢复(PostgreSQL delete数据恢复)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

应用维护人员通过应用程序误删除了一些数据,希望对其进行恢复.通过咨询确认是PostgreSQL数据,wal和arch文件都在,取客户删除数据时间点相关wal文件
QQ20250306-231208


然后让客户提供具体涉及误删除的相关的库和表信息
QQ20250306-230846

通过工具解析wal日志,比较顺利的获取到了undo sql语句

Switch wal to 0000000300000525000000B0 on time 2025-03-05 22:23:14.263661+08
Switch wal to 0000000300000525000000B1 on time 2025-03-05 22:23:14.423082+08
Switch wal to 0000000300000525000000B2 on time 2025-03-05 22:23:15.983833+08
Switch wal to 0000000300000525000000B3 on time 2025-03-05 22:23:17.802107+08
Switch wal to 0000000300000525000000B4 on time 2025-03-05 22:23:18.942125+08
Switch wal to 0000000300000525000000B5 on time 2025-03-05 22:23:20.293585+08
Switch wal to 0000000300000525000000B6 on time 2025-03-05 22:23:21.531484+08
Switch wal to 0000000300000525000000B7 on time 2025-03-05 22:23:24.217501+08
Switch wal to 0000000300000525000000B8 on time 2025-03-05 22:23:27.504164+08
Switch wal to 0000000300000525000000B9 on time 2025-03-05 22:23:29.260754+08
Switch wal to 0000000300000525000000BA on time 2025-03-05 22:23:33.544486+08
Switch wal to 0000000300000525000000BB on time 2025-03-05 22:23:35.568116+08
Switch wal to 0000000300000525000000BC on time 2025-03-05 22:23:37.329659+08
Switch wal to 0000000300000525000000BD on time 2025-03-05 22:23:38.971834+08
Switch wal to 0000000300000525000000BE on time 2025-03-05 22:23:39.922353+08
Switch wal to 0000000300000525000000BF on time 2025-03-05 22:23:40.897294+08

QQ20250306-231657


然后把这些sql语句反向插入到生产库,完成这些误操作数据的恢复
QQ20250306-231935

如果此类的数据库(oracle,mysql,sql server,PostgreSQL)等恢复请求,需要专业恢复技术支持,请联系我们:
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com