删除redo导致ORA-00313 ORA-00312故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:删除redo导致ORA-00313 ORA-00312故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户由于误操作直接rm 删除了redo文件,导致数据库启动报ORA-00313 ORA-00312错

2025-03-07T14:49:16.325723+08:00
ALTER DATABASE OPEN
2025-03-07T14:50:00.124620+08:00
Ping without log force is disabled:
  instance mounted in exclusive mode.
2025-03-07T14:50:00.198907+08:00
Crash Recovery excluding pdb 2 which was cleanly closed.
2025-03-07T14:50:00.238450+08:00
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
 Thread 1: Recovery starting at checkpoint rba (logseq 2966 block 74686), scn 0
2025-03-07T14:50:00.325246+08:00
Started redo scan
2025-03-07T14:50:00.341193+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2681.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
2025-03-07T14:50:00.372632+08:00
Slave encountered ORA-10388 exception during crash recovery
…………
2025-03-07T14:50:00.385698+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-03-07T14:50:00.388594+08:00
Aborting crash recovery due to error 313
2025-03-07T14:50:00.388739+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2681.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
2025-03-07T14:50:00.389243+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2681.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7
ORA-313 signalled during: ALTER DATABASE OPEN...

然后客户把历史的redo文件拷贝过来,尝试恢复数据库,报ORA-00314 ORA-00312错误

2025-03-07T15:07:30.784759+08:00
ALTER DATABASE OPEN
Ping without log force is disabled:
  instance mounted in exclusive mode.
2025-03-07T15:07:30.808497+08:00
Crash Recovery excluding pdb 2 which was cleanly closed.
2025-03-07T15:07:30.838664+08:00
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
 Thread 1: Recovery starting at checkpoint rba (logseq 2966 block 74686), scn 0
2025-03-07T15:07:30.897547+08:00
Started redo scan
2025-03-07T15:07:30.898222+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4106.trc:
ORA-00314: log 2 of thread 1, expected sequence# 2966 doesn't match 1646
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'
2025-03-07T15:07:30.930089+08:00
Slave encountered ORA-10388 exception during crash recovery
…………
2025-03-07T15:07:30.940051+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-03-07T15:07:30.942274+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_mz00_4138.trc:
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/orcl/redo01.log'
2025-03-07T15:07:30.945509+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-03-07T15:07:30.945512+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-03-07T15:07:30.948369+08:00
Aborting crash recovery due to error 314
2025-03-07T15:07:30.948488+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4106.trc:
ORA-00314: log 2 of thread 1, expected sequence# 2966 doesn't match 1646
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'
2025-03-07T15:07:30.949390+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4106.trc:
ORA-00314: log 2 of thread 1, expected sequence# 2966 doesn't match 1646
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'
ORA-314 signalled during: ALTER DATABASE OPEN...

使用Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本收集信息之后数据文件头状态和所需要redo信息
df_header


数据库需要sequence#为2966的redo日志,但是当前已经被删除,基于当前情况,只能进行强制非一致性恢复,尝试强制打开库

SQL> recover database;                 
ORA-00283: recovery session canceled due to errors
ORA-00314: log 2 of thread 1, expected sequence# 2966 doesn't match 1646
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log'

QL> select group#,status,sequence# from v$log;

	  GROUP# STATUS 		 SEQUENCE#
---------------- ---------------- ----------------
	       1 UNUSED 			 0
	       3 CURRENT		      2967
	       2 ACTIVE 		      2966

SQL> 
SQL> 
SQL> recover database until cancel;
ORA-00279: change 163033183 generated at 03/07/2025 14:04:20 needed for thread 1
ORA-00289: suggestion : /u01/app/oracle/recovery_area/orcl/archivelog/2025_03_08/o1_mf_1_2966_%u_.arc
ORA-00280: change 163033183 for thread 1 is in sequence #2966


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/orcl/system01.dbf'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;

Database altered.

运气不错,直接打开数据库成功,然后逻辑导出数据,完成此处恢复.这个让我想起来了一些类似案例:
Oracle 23ai rm redo*.log恢复
清空redo,导致ORA-27048: skgfifi: file header information is invalid
由于默认情况下oracle的redo文件扩展名是.log,然后被当做是不重要文件从而被清理导致数据库故障,在oracle服务器上清理数据之前建议查询v$datafile,v$logfile,v$tempfile,v$controlfile来确认是否是数据库文件

Navicat连接postgresql时出现column “datlastsysoid” does not exist错误解决

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Navicat连接postgresql时出现column “datlastsysoid” does not exist错误解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

使用Navicat Premium 15访问PostgreSQL 16版本的库,报如下错误

ERROR:  column "datlastsysoid" does not exist
LINE 1: SELECT DISTINCT datlastsysoid FROM pg_database;

QQ20250308-105201
QQ20250308-105222


登录pg库查看

postgres=# SELECT DISTINCT datlastsysoid FROM pg_database;
2025-03-07 22:09:03.569 EST [62553] ERROR:  column "datlastsysoid" does not exist at character 17
2025-03-07 22:09:03.569 EST [62553] STATEMENT:  SELECT DISTINCT datlastsysoid FROM pg_database;
ERROR:  column "datlastsysoid" does not exist
LINE 1: SELECT DISTINCT datlastsysoid FROM pg_database;
                        ^
postgres=# \d pg_database;
               Table "pg_catalog.pg_database"
     Column     |   Type    | Collation | Nullable | Default 
----------------+-----------+-----------+----------+---------
 oid            | oid       |           | not null | 
 datname        | name      |           | not null | 
 datdba         | oid       |           | not null | 
 encoding       | integer   |           | not null | 
 datlocprovider | "char"    |           | not null | 
 datistemplate  | boolean   |           | not null | 
 datallowconn   | boolean   |           | not null | 
 datconnlimit   | integer   |           | not null | 
 datfrozenxid   | xid       |           | not null | 
 datminmxid     | xid       |           | not null | 
 dattablespace  | oid       |           | not null | 
 datcollate     | text      | C         | not null | 
 datctype       | text      | C         | not null | 
 daticulocale   | text      | C         |          | 
 daticurules    | text      | C         |          | 
 datcollversion | text      | C         |          | 
 datacl         | aclitem[] |           |          | 
Indexes:
    "pg_database_oid_index" PRIMARY KEY, btree (oid), tablespace "pg_global"
    "pg_database_datname_index" UNIQUE CONSTRAINT, btree (datname), tablespace "pg_global"
Tablespace: "pg_global"

确认没有了datlastsysoid列,通过查看文档确认从Postgres 15版本开始 pg_database表中删除了 datlastsysoid,但是可以通过查询dattablespace替代,使用二进制工具修改libcc.dll文件中的SELECT DISTINCT datlastsysoid部分语句为:SELECT DISTINCT dattablespace
QQ20250308-105427
QQ20250308-105455


然后重新启动Navicat,访问pg库正常
QQ20250308-111555

当然这个问题如果使用高版本的Navicat或者访问低版本的PostgreSQL库不会出现

aix磁盘损坏oracle数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:aix磁盘损坏oracle数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户aix环境硬盘异常导致系统无法启动,初步判断是数据文件存放在本地磁盘的空间中(本地两个盘都异常,系统无法启动),通过硬件恢复厂商镜像出来,但是通过aix文件系统直接挂载提示需要fsck,但是做fsck之后,提示大量文件丢失(最关键的数据文件和备份文件都被自动删除)
dmp-remove
fsck-remove


基于这种情况,采用镜像主机挂载的方式肯定不行,考虑直接采用软件直接解析,能够看到软件,可惜由于大量的文件系统元数据损坏,解析出来的数据文件和dmp也不可用(大量损坏和空块)
QQ20250307-122939

基于上述情况,只能采用碎片级别恢复出来数据文件
QQ20250307-123123

然后使用dul工具把数据恢复到表中,实现最大限度抢救客户数据
QQ20250307-123653

对于数据库级别恢复,这个是理论上的终极恢复方法

pg误删除数据恢复(PostgreSQL delete数据恢复)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:pg误删除数据恢复(PostgreSQL delete数据恢复)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

应用维护人员通过应用程序误删除了一些数据,希望对其进行恢复.通过咨询确认是PostgreSQL数据,wal和arch文件都在,取客户删除数据时间点相关wal文件
QQ20250306-231208


然后让客户提供具体涉及误删除的相关的库和表信息
QQ20250306-230846

通过工具解析wal日志,比较顺利的获取到了undo sql语句

Switch wal to 0000000300000525000000B0 on time 2025-03-05 22:23:14.263661+08
Switch wal to 0000000300000525000000B1 on time 2025-03-05 22:23:14.423082+08
Switch wal to 0000000300000525000000B2 on time 2025-03-05 22:23:15.983833+08
Switch wal to 0000000300000525000000B3 on time 2025-03-05 22:23:17.802107+08
Switch wal to 0000000300000525000000B4 on time 2025-03-05 22:23:18.942125+08
Switch wal to 0000000300000525000000B5 on time 2025-03-05 22:23:20.293585+08
Switch wal to 0000000300000525000000B6 on time 2025-03-05 22:23:21.531484+08
Switch wal to 0000000300000525000000B7 on time 2025-03-05 22:23:24.217501+08
Switch wal to 0000000300000525000000B8 on time 2025-03-05 22:23:27.504164+08
Switch wal to 0000000300000525000000B9 on time 2025-03-05 22:23:29.260754+08
Switch wal to 0000000300000525000000BA on time 2025-03-05 22:23:33.544486+08
Switch wal to 0000000300000525000000BB on time 2025-03-05 22:23:35.568116+08
Switch wal to 0000000300000525000000BC on time 2025-03-05 22:23:37.329659+08
Switch wal to 0000000300000525000000BD on time 2025-03-05 22:23:38.971834+08
Switch wal to 0000000300000525000000BE on time 2025-03-05 22:23:39.922353+08
Switch wal to 0000000300000525000000BF on time 2025-03-05 22:23:40.897294+08

QQ20250306-231657


然后把这些sql语句反向插入到生产库,完成这些误操作数据的恢复
QQ20250306-231935

如果此类的数据库(oracle,mysql,sql server,PostgreSQL)等恢复请求,需要专业恢复技术支持,请联系我们:
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

PostgreSQL表文件损坏恢复—pdu恢复损坏的表文件

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:PostgreSQL表文件损坏恢复—pdu恢复损坏的表文件

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在某些情况下,由于PostgreSQL表文件损坏导致无法正常访问,可以通过pdu把好的block中的数据恢复出来
准备一张测试表,里面有97条记录

his5_dms=# \d hiscrm.t_sys_oper_log;
                                            Table "hiscrm.t_sys_oper_log"
       Column       |          Type          | Collation | Nullable |Default                     
--------------------+------------------------+-----------+----------+------------
 id                 | bigint                 |           | not null | 
 module             | character varying(50)  |           |          | 
 title              | character varying(50)  |           |          | 
 alias              | character varying(50)  |           |          | 
 business_type      | integer                |           |          | 0
 method             | character varying(200) |           |          | 
 request_method     | character varying(10)  |           |          | 
 operator_type      | integer                |           |          | 0
 oper_name          | character varying(50)  |           |          | 
 dept_name          | character varying(50)  |           |          | 
 oper_url           | character varying(255) |           |          | 
 oper_ip            | character varying(50)  |           |          | 
 oper_location      | character varying(255) |           |          | 
 oper_param         | text                   |           |          | 
 json_result        | text                   |           |          | 
 status             | integer                |           |          | 0
 error_msg          | text                   |           |          | 
 oper_time          | date                   |           |          | 
 create_id          | bigint                 |           |          | 
 create_time        | bigint                 |           |          | 
 clinic_id          | bigint                 |           |          | 
 group_id           | bigint                 |           |          | 
 patient_id         | bigint                 |           |          | 
 is_patient_related | integer                |           |          | 
 business_content   | json                   |           |          | 

his5_dms=# select count(1) from hiscrm.t_sys_oper_log;
 count 
-------
    97
(1 row)

查询表对应的具体文件

his5_dms=# SELECT oid,relfilenode FROM pg_class WHERE relname='t_sys_oper_log';
  oid  | relfilenode 
-------+-------------
 16850 |       16850
(1 row)

his5_dms=# SELECT pg_relation_filepath('hiscrm.t_sys_oper_log');
 pg_relation_filepath 
----------------------
 base/16386/16850
(1 row)

his5_dms=# SHOW data_directory;
     data_directory     
------------------------
 /var/lib/pgsql/12/data
(1 row)

使用dd对文件进行破坏

[postgres@xifenfeidg ~]$ ls -l  /var/lib/pgsql/12/data/base/16386/16850
-rw-------. 1 postgres postgres 90112 Sep  5 20:26 /var/lib/pgsql/12/data/base/16386/16850
[postgres@xifenfeidg ~]$ dd if=/dev/zero of=/var/lib/pgsql/12/data/base/16386/16850 bs=512 count=1 conv=notrunc
1+0 records in
1+0 records out
512 bytes copied, 0.000158756 s, 3.2 MB/s

重启pg库

[postgres@xifenfeidg bin]$ ./pg_ctl -m fast -D /var/lib/pgsql/12/data/ stop
waiting for server to shut down.... done
server stopped
[postgres@xifenfeidg bin]$ ./pg_ctl -D /var/lib/pgsql/12/data/ start
waiting for server to start....2025-03-02 19:02:11.395 HKT [64515] LOG:  
  starting PostgreSQL 12.20 on x86_64-pc-linux-gnu, 
  compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
2025-03-02 19:02:11.396 HKT [64515] LOG:  listening on IPv6 address "::1", port 5432
2025-03-02 19:02:11.396 HKT [64515] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2025-03-02 19:02:11.396 HKT [64515] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-02 19:02:11.397 HKT [64515] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-03-02 19:02:11.403 HKT [64515] LOG:  redirecting log output to logging collector process
2025-03-02 19:02:11.403 HKT [64515] HINT:  Future log output will appear in directory "log".
 done
server started

查询数据报错

[postgres@xifenfeidg bin]$ psql
psql (16.8, server 12.20)
Type "help" for help.

postgres=# \c his5_dms;
psql (16.8, server 12.20)
You are now connected to database "his5_dms" as user "postgres".
his5_dms=#  select count(1) from hiscrm.t_sys_oper_log;
ERROR:  invalid page in block 0 of relation base/16386/16850

通过pdu进行恢复
跳过了坏块,把好的block中数据均恢复出来

his5_dms.hiscrm=# unload tab t_sys_oper_log;
正在解析表 <t_sys_oper_log>. 已解析数据页: 0, 已解析数据: 0 条
        |-块号0 空页面或页面已损坏,已跳过
正在解析表 <t_sys_oper_log>. 已解析数据页: 11, 已解析数据: 86 条
表名<t_sys_oper_log>-</var/lib/pgsql/12/data/base/16386/16850> 
    解析完成, 11 个数据页 ,共计 86 条数据. 成功 86 条; 失败【0】条 
COPY文件路径为:<his5_dms/hiscrm/t_sys_oper_log.csv>

导入pg库中

his5_dms=# truncate table hiscrm.t_sys_oper_log;
TRUNCATE TABLE
his5_dms=# \i his5_dms/COPY/hiscrm_copy.sql
SET
COPY 86
his5_dms=#