1.5T MySQL数据库完美恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:1.5T MySQL数据库完美恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户MySQL数据库异常无法正常启动,需要提供恢复支持,当时提供的错误日志信息为:log sequence number xxxx is in the future
Q15


2026-06-03T13:35:02.368514Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery.
2026-06-03T13:35:02.369669Z 0 [ERROR] InnoDB: Page [page id: space=0, page number=521127] log sequence number 15319315659882 is in the future! Current system log sequence number 6712970192343.
从头分析mysql的日志,发现最初情况为:

---TRANSACTION 8424429306, ACTIVE 259 sec truncating table
mysql tables in use 1, locked 1
0 lock struct(s), heap size 1136, 0 row lock(s)
MySQL thread id 4513911, OS thread handle 21996, query id 4194188849 localhost 127.0.0.1 root System lock
TRUNCATE TABLE xxxx
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: complete io for buf page (read thread)
I/O thread 3 state: wait Windows aio (read thread)
I/O thread 4 state: wait Windows aio (read thread)
I/O thread 5 state: complete io for buf page (read thread)
I/O thread 6 state: wait Windows aio (write thread)
I/O thread 7 state: wait Windows aio (write thread)
I/O thread 8 state: wait Windows aio (write thread)
I/O thread 9 state: wait Windows aio (write thread)
Pending normal aio reads: [2, 0, 0, 3] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
908141629 OS file reads, 8774070813 OS file writes, 2977363738 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
InnoDB: ###### Diagnostic info printed to the standard error stream
2026-06-03T06:37:25.679003Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 17216 has waited at btr0sea.ic line 128 for 258  seconds the semaphore:
S-lock on RW-latch at 0000017F19122E18 created in file btr0sea.cc line 195
a writer (thread id 8340) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file g:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\include\btr0sea.ic line 90
2026-06-03T06:37:25.681739Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 28160 has waited at btr0sea.ic line 128 for 241  seconds the semaphore:
S-lock on RW-latch at 0000017F19123598 created in file btr0sea.cc line 195
a writer (thread id 13620) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file G:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\btr\btr0cur.cc line 3874
2026-06-03T06:37:25.684495Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 23052 has waited at btr0sea.ic line 128 for 253  seconds the semaphore:
S-lock on RW-latch at 0000017F19123598 created in file btr0sea.cc line 195
a writer (thread id 13620) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file G:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\btr\btr0cur.cc line 3874
2026-06-03T06:37:25.687586Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 28480 has waited at btr0sea.ic line 128 for 272  seconds the semaphore:
S-lock on RW-latch at 0000017F19122E18 created in file btr0sea.cc line 195
a writer (thread id 8340) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file g:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\include\btr0sea.ic line 90
2026-06-03T06:37:25.689857Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 2868 has waited at buf0flu.cc line 1209 for 262  seconds the semaphore:
SX-lock on RW-latch at 00000179B38E0DC0 created in file buf0buf.cc line 1460
a writer (thread id 1008) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: f0000000
Last time read locked in file ibuf0ibuf.cc line 4552
Last time write locked in file G:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\ibuf\ibuf0ibuf.cc line 406
…………………………
2026-06-03T06:37:56.919054Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 13620 has waited at btr0sea.ic line 90 for 303  seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0000017F19123598 created in file btr0sea.cc line 195
a writer (thread id 13620) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file G:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\btr\btr0cur.cc line 3874
2026-06-03T06:37:56.921090Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 24268 has waited at btr0sea.ic line 128 for 252  seconds the semaphore:
S-lock on RW-latch at 0000017F19122E18 created in file btr0sea.cc line 195
a writer (thread id 8340) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file g:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\include\btr0sea.ic line 90
2026-06-03T06:37:56.923175Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 15984 has waited at btr0sea.ic line 128 for 302  seconds the semaphore:
S-lock on RW-latch at 0000017F19122CD8 created in file btr0sea.cc line 195
a writer (thread id 5420) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffff
Last time read locked in file btr0sea.ic line 128
Last time write locked in file G:\ade\build\sb_0-34537258-1560180832.84\mysql-5.7.27\storage\innobase\btr\btr0cur.cc line 3874
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0
InnoDB: ###### Diagnostic info printed to the standard error stream
2026-06-03T07:04:50.616385Z 0 [Note] MySQL: Normal shutdown

2026-06-03T07:04:50.617251Z 0 [Note] Giving 97 client threads a chance to die gracefully
2026-06-03T07:54:31.982035Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use 
--explicit_defaults_for_timestamp server option (see documentation for more details).
2026-06-03T07:54:31.983047Z 0 [Warning] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' 
sql modes should be used with strict mode. They will be merged with strict mode in a future release.
2026-06-03T07:54:31.983057Z 0 [Warning] 'NO_AUTO_CREATE_USER' sql mode was not set.
2026-06-03T07:54:31.983098Z 0 [Note] --secure-file-priv is set to NULL. Operations related to importing and exporting data are disabled
2026-06-03T07:54:31.985272Z 0 [Note] MySQL (mysqld 5.7.27) starting as process 832 ...
2026-06-03T07:54:32.054771Z 0 [Note] InnoDB: Mutexes and rw_locks use Windows interlocked functions
2026-06-03T07:54:32.055435Z 0 [Note] InnoDB: Uses event mutexes
2026-06-03T07:54:32.055728Z 0 [Note] InnoDB: _mm_lfence() and _mm_sfence() are used for memory barrier
2026-06-03T07:54:32.056143Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2026-06-03T07:54:32.057581Z 0 [Note] InnoDB: Number of pools: 1
2026-06-03T07:54:32.058963Z 0 [Note] InnoDB: Not using CPU crc32 instructions
2026-06-03T07:54:32.063255Z 0 [Note] InnoDB: Initializing buffer pool, total size = 40G, instances = 8, chunk size = 128M
2026-06-03T07:54:34.619212Z 0 [Note] InnoDB: Completed initialization of buffer pool
2026-06-03T07:54:35.420695Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2026-06-03T07:54:36.121477Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 15319438590791
2026-06-03T07:54:36.449855Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319443833344
2026-06-03T07:54:36.847204Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319449076224
2026-06-03T07:54:37.263455Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319454319104
2026-06-03T07:54:37.544475Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319459561984
2026-06-03T07:54:37.678504Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319463095246
2026-06-03T07:54:37.681262Z 0 [Note] InnoDB: Database was not shutdown normally!
2026-06-03T07:54:37.681755Z 0 [Note] InnoDB: Starting crash recovery.
2026-06-03T07:54:38.088953Z 0 [Note] InnoDB: 2 transaction(s) which must be rolled back or cleaned up in total 1 row operations to undo
2026-06-03T07:54:38.089913Z 0 [Note] InnoDB: Trx id counter is 8424438528
2026-06-03T07:54:38.090288Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 …… 91 92 93 94 95 96 97 98 99 
2026-06-03T07:54:42.043861Z 0 [Note] InnoDB: Apply batch completed
2026-06-03T07:54:42.295726Z 0 [Note] InnoDB: Rolling back trx with id 8424429306, 0 rows to undo
2026-06-03T07:54:42.298212Z 0 [Note] InnoDB: Rollback of trx with id 8424429306 completed
2026-06-03T07:55:19.432722Z 0 [Note] InnoDB: Completing truncate for table with id (8714) residing in file-per-table tablespace with id (6005)
2026-06-03T07:55:30.963664Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90593936, which exceeds the log group capacity 90593280.
2026-06-03T07:55:47.442228Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90602582, which exceeds the log group capacity 90593280.
2026-06-03T07:56:05.422243Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90609452, which exceeds the log group capacity 90593280.
2026-06-03T07:56:27.643550Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90621740, which exceeds the log group capacity 90593280.
2026-06-03T07:56:47.845714Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90627886, which exceeds the log group capacity 90593280.
2026-06-03T07:57:04.714691Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90639662, which exceeds the log group capacity 90593280.
2026-06-03T07:57:26.028889Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90645808, which exceeds the log group capacity 90593280.
2026-06-03T07:57:42.796901Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90655262, which exceeds the log group capacity 90593280.
2026-06-03T07:58:04.513110Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90670384, which exceeds the log group capacity 90593280.
2026-06-03T07:58:20.675070Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90689842, which exceeds the log group capacity 90593280.
2026-06-03T07:58:39.362158Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90699949, which exceeds the log group capacity 90593280.
2026-06-03T07:58:57.146181Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90707243, which exceeds the log group capacity 90593280.
2026-06-03T07:59:15.226281Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90731065, which exceeds the log group capacity 90593280.
2026-06-03T07:59:32.902269Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90751406, which exceeds the log group capacity 90593280.
2026-06-03T07:59:55.020833Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use 
--explicit_defaults_for_timestamp server option (see documentation for more details).
2026-06-03T07:59:55.022839Z 0 [Warning] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' 
sql modes should be used with strict mode. They will be merged with strict mode in a future release.
2026-06-03T07:59:55.022856Z 0 [Warning] 'NO_AUTO_CREATE_USER' sql mode was not set.
2026-06-03T07:59:55.022918Z 0 [Note] --secure-file-priv is set to NULL. Operations related to importing and exporting data are disabled
2026-06-03T07:59:55.028372Z 0 [Note] MySQL (mysqld 5.7.27) starting as process 2532 ...
2026-06-03T07:59:55.078201Z 0 [Note] InnoDB: Mutexes and rw_locks use Windows interlocked functions
2026-06-03T07:59:55.078998Z 0 [Note] InnoDB: Uses event mutexes
2026-06-03T07:59:55.079491Z 0 [Note] InnoDB: _mm_lfence() and _mm_sfence() are used for memory barrier
2026-06-03T07:59:55.080092Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2026-06-03T07:59:55.084872Z 0 [Note] InnoDB: Number of pools: 1
2026-06-03T07:59:55.090793Z 0 [Note] InnoDB: Not using CPU crc32 instructions
2026-06-03T07:59:55.096910Z 0 [Note] InnoDB: Initializing buffer pool, total size = 40G, instances = 8, chunk size = 128M
2026-06-03T07:59:57.966267Z 0 [Note] InnoDB: Completed initialization of buffer pool
2026-06-03T07:59:58.778551Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2026-06-03T07:59:59.348188Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 15319447826657
2026-06-03T07:59:59.942491Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319453069312
2026-06-03T08:00:00.233703Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319458312192
2026-06-03T08:00:00.470153Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319463555072
2026-06-03T08:00:01.158338Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319468797952
2026-06-03T08:00:01.852263Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319474040832
2026-06-03T08:00:02.555075Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319479283712
2026-06-03T08:00:03.289043Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319484526592
2026-06-03T08:00:04.031381Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319489769472
2026-06-03T08:00:04.754714Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319495012352
2026-06-03T08:00:05.533650Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319500255232
2026-06-03T08:00:06.253542Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319505498112
2026-06-03T08:00:06.883691Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319510740992
2026-06-03T08:00:07.603988Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319515983872
2026-06-03T08:00:08.268531Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319521226752
2026-06-03T08:00:08.967662Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319526469632
2026-06-03T08:00:09.323566Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 15319529184727
2026-06-03T08:00:09.328240Z 0 [Note] InnoDB: Database was not shutdown normally!
2026-06-03T08:00:09.328844Z 0 [Note] InnoDB: Starting crash recovery.
2026-06-03T08:00:09.730171Z 0 [Note] InnoDB: 1 transaction(s) which must be rolled back or cleaned up in total 1 row operations to undo
2026-06-03T08:00:09.731034Z 0 [Note] InnoDB: Trx id counter is 8424439040
2026-06-03T08:00:09.731402Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 …… 94 95 96 97 98 99 
2026-06-03T08:00:17.587182Z 0 [Note] InnoDB: Apply batch completed
2026-06-03T08:01:02.988241Z 0 [Note] InnoDB: Completing truncate for table with id (8714) residing in file-per-table tablespace with id (6005)
2026-06-03T08:01:07.965314Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90593580, which exceeds the log group capacity 90593280.
2026-06-03T08:01:25.945332Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90599727, which exceeds the log group capacity 90593280.
2026-06-03T08:01:43.117294Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90610481, which exceeds the log group capacity 90593280.
2026-06-03T08:02:05.338558Z 0 [ERROR] InnoDB: The age of the last checkpoint is 90622769, which exceeds the log group capacity 90593280.

从这里看,最初是truncate table xxxx,然后由于被阻塞了无法truncate成功,可以就关闭了mysql服务,然后启动库就没有成功,然后就是加上了innodb_force_recovery出现了上述截图的错误.尝试进行强制拉库,遭遇以下错误

2026-06-04T07:05:59.924315Z 0 [Note] MySQL (mysqld 5.7.27) starting as process 8764 ...
2026-06-04T07:05:59.944187Z 0 [Warning] option 'innodb-purge-threads': unsigned value 0 adjusted to 1
2026-06-04T07:05:59.947611Z 0 [Note] InnoDB: Started in read only mode
2026-06-04T07:05:59.948012Z 0 [Note] InnoDB: Mutexes and rw_locks use Windows interlocked functions
2026-06-04T07:05:59.948485Z 0 [Note] InnoDB: Uses event mutexes
2026-06-04T07:05:59.948825Z 0 [Note] InnoDB: _mm_lfence() and _mm_sfence() are used for memory barrier
2026-06-04T07:05:59.949329Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2026-06-04T07:05:59.950087Z 0 [Note] InnoDB: Number of pools: 1
2026-06-04T07:05:59.950587Z 0 [Note] InnoDB: Not using CPU crc32 instructions
2026-06-04T07:05:59.950987Z 0 [Note] InnoDB: Disabling background log and ibuf IO write threads.
2026-06-04T07:05:59.952835Z 0 [Note] InnoDB: Initializing buffer pool, total size = 512M, instances = 1, chunk size = 128M
2026-06-04T07:05:59.980631Z 0 [Note] InnoDB: Completed initialization of buffer pool
2026-06-04T07:06:00.012856Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2026-06-04T07:06:00.013649Z 0 [Note] InnoDB: The user has set SRV_FORCE_NO_LOG_REDO on, skipping log redo
2026-06-04T07:06:00.019299Z 0 [Note] InnoDB: Completing truncate for table with id (8714) residing in file-per-table tablespace with id (6005)
07:06:00 UTC - mysqld got exception 0xc0000005 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=200
thread_count=0
connection_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 87429 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
7ff67eaba97e    mysqld.exe!std::operator<<<std::char_traits<char> >()[ostream:791]
7ff67f1ffd49    mysqld.exe!os_file_create_subdirs_if_needed()[os0file.cc:2013]
7ff67f22e967    mysqld.exe!fil_ibd_create()[fil0fil.cc:3530]
7ff67f2afdd7    mysqld.exe!truncate_t::fixup_tables_in_non_system_tablespace()[row0trunc.cc:2274]
7ff67f1c563a    mysqld.exe!innobase_start_or_create_for_mysql()[srv0start.cc:2346]
7ff67f12ea40    mysqld.exe!innobase_init()[ha_innodb.cc:4077]
7ff67e9b783e    mysqld.exe!ha_initialize_handlerton()[handler.cc:840]
7ff67eab7228    mysqld.exe!plugin_initialize()[sql_plugin.cc:1229]
7ff67eab8790    mysqld.exe!plugin_register_builtin_and_init_core_se()[sql_plugin.cc:1589]
7ff67e972de3    mysqld.exe!init_server_components()[mysqld.cc:4080]
7ff67e977da5    mysqld.exe!win_main()[mysqld.cc:4773]
7ff67e975705    mysqld.exe!mysql_service()[mysqld.cc:5226]
7ff80e734da7    MSVCR120.dll!_beginthread()
7ff80e734e60    MSVCR120.dll!_endthread()
7ff84e4d84d4    KERNEL32.DLL!BaseThreadInitThunk()
7ff850cb1791    ntdll.dll!RtlUserThreadStart()
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

在启动过程中需要去完成truncate操作,但是由于强制拉库是只读状态导致无法完成,直接启动失败.如果非只读状态拉库,启动过程包InnoDB: Corruption of an index tree: table `innodb_change_buffer` index `CLUST_IND`, father ptr page no 111415, child page no 517749异常

2026-06-04T13:35:46.447160Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace 
but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html 
for information about forcing recovery.
2026-06-04T13:35:46.448525Z 0 [ERROR] InnoDB: Corruption of an index tree: table `innodb_change_buffer` index `CLUST_IND`, 
father ptr page no 111415, child page no 517749
PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 0
 0: len 4; hex 00001775; asc    u;;
 1: len 1; hex 00; asc  ;;
 2: len 4; hex 00e78555; asc    U;;
 3: len 16; hex 000400010c0f0190002d860800088000; asc          -      ;;
 4: len 29; hex 323032362d30332d31345f3132325f315f323032363033313432303136; asc 2026-03-14_122_1_202603142016;;
 5: len 8; hex 800000001b26cad6; asc      &  ;;
2026-06-04T13:35:46.450738Z 0 [Note] InnoDB: n_owned: 0; heap_no: 2; next rec: 211
PHYSICAL RECORD: n_fields 7; 1-byte offsets; info bits 0
 0: len 4; hex 00001775; asc    u;;
 1: len 1; hex 00; asc  ;;
 2: len 4; hex 00e78555; asc    U;;
 3: len 16; hex 000400010c0f0190002d860800088000; asc          -      ;;
 4: len 29; hex 323032362d30332d31345f3132325f315f323032363033313432303136; asc 2026-03-14_122_1_202603142016;;
 5: len 8; hex 800000001b26cad6; asc      &  ;;
 6: len 4; hex 0001b337; asc    7;;
2026-06-04T13:35:46.452647Z 0 [Note] InnoDB: n_owned: 0; heap_no: 147; next rec: 14554
2026-06-04T13:35:46.453038Z 0 [ERROR] [FATAL] InnoDB: You should dump + drop + reimport the table to fix the corruption. 
If the crash happens at database startup. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html 
for information about forcing recovery. Then dump + drop + reimport.
2026-06-04 21:35:46 0x828  InnoDB: Assertion failure in thread 2088 in file ut0ut.cc line 910
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
13:35:46 UTC - mysqld got exception 0x80000003 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=200
thread_count=0
connection_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 87429 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
7ff7c7794312    mysqld.exe!my_sigabrt_handler()[my_thr_init.c:449]
7fffaf31ec9d    MSVCR120.dll!raise()
7fffaf324874    MSVCR120.dll!abort()
7ff7c78b07e4    mysqld.exe!ut_dbg_assertion_failed()[ut0dbg.cc:67]
7ff7c78b09c1    mysqld.exe!ib::fatal::~fatal()[ut0ut.cc:910]
7ff7c790a794    mysqld.exe!btr_page_get_father_node_ptr_func()[btr0btr.cc:799]
7ff7c790a28e    mysqld.exe!btr_page_get_father()[btr0btr.cc:854]
7ff7c7906cca    mysqld.exe!btr_compress()[btr0btr.cc:3577]
7ff7c7913a3e    mysqld.exe!btr_cur_compress_if_useful()[btr0cur.cc:5068]
7ff7c7917f7c    mysqld.exe!btr_cur_pessimistic_delete()[btr0cur.cc:5403]
7ff7c79583f9    mysqld.exe!ibuf_delete_rec()[ibuf0ibuf.cc:4385]
7ff7c795805e    mysqld.exe!ibuf_delete_for_discarded_space()[ibuf0ibuf.cc:4833]
7ff7c78d51a8    mysqld.exe!fil_recreate_tablespace()[fil0fil.cc:2265]
7ff7c794fe06    mysqld.exe!truncate_t::fixup_tables_in_non_system_tablespace()[row0trunc.cc:2297]
7ff7c786563a    mysqld.exe!innobase_start_or_create_for_mysql()[srv0start.cc:2346]
7ff7c77cea40    mysqld.exe!innobase_init()[ha_innodb.cc:4077]
7ff7c705783e    mysqld.exe!ha_initialize_handlerton()[handler.cc:840]
7ff7c7157228    mysqld.exe!plugin_initialize()[sql_plugin.cc:1229]
7ff7c7158790    mysqld.exe!plugin_register_builtin_and_init_core_se()[sql_plugin.cc:1589]
7ff7c7012de3    mysqld.exe!init_server_components()[mysqld.cc:4080]
7ff7c7017da5    mysqld.exe!win_main()[mysqld.cc:4773]
7ff7c7015705    mysqld.exe!mysql_service()[mysqld.cc:5226]
7fffaf2d4da7    MSVCR120.dll!_beginthread()
7fffaf2d4e60    MSVCR120.dll!_endthread()
7fffd0f784d4    KERNEL32.DLL!BaseThreadInitThunk()
7fffd3b91791    ntdll.dll!RtlUserThreadStart()
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

基于这种两种情况:
2026-06-04T13:35:46.448525Z 0 [ERROR] InnoDB: Corruption of an index tree: table `innodb_change_buffer` index `CLUST_IND`, father ptr page no 111415, child page no 517749和InnoDB: Completing truncate for table with id (8714) residing in file-per-table tablespace with id (6005)异常形成了相互死循环,无法直接强制拉库.
这个库有1.5T如果通过工具提取效率有点低
data


对于这样的情况,使用ibd的discard+import功能进行处理,参考相关文章:
frm和ibd文件数据库恢复
运气不错,这个客户的所有库通过这种方法导入ibd文件(部分temp结尾的临时表数据可以不用恢复,占据空间较大)之后,然后通过mysqldump顺利导出所有数据,完成本次恢复任务
sql

WARNING: detected duplicate paths to the same disk导致crs无法正常启动故障解决

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:WARNING: detected duplicate paths to the same disk导致crs无法正常启动故障解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近一周之内,两个客户由于错误修改asm_diskstring=’/dev/dm*’,’/dev/mapper/*’,导致后续集群无法启动成功,查看asm的alert日志发现类似错误(为了说明类似问题,实际中情况比现在复杂)

WARNING: detected duplicate paths to the same disk:
  '/dev/mapper/mpathi' and
  '/dev/dm-3'
ERROR: diskgroup CRS was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "CRS" cannot be mounted
ORA-15040: diskgroup is incomplete

由于asm的vote磁盘组无法mount,从而使得crs无法正常启动,进而导致集群无法正常启动和工作.而在系统中mpathi和dm-3是同一块磁盘

[root@rac1 mapper]# ls -ltr mpathi
lrwxrwxrwx 1 root root       7 May 24 13:09 mpathi -> ../dm-3

GPnP profile中关于asm_diskstring配置

[grid@rac1 trace]$  gpnptool get -o-
……
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/dm*,/dev/mapper/mpath*" 
SPFile="+DATA/cimp-cluster/asmparameterfile/registry.253.889969697"/>
……

对于这样的情况,需要把profile中值修改了才行,通过重建asm spfile来实现

---pfile
large_pool_size= 12M
instance_type= 'asm'
remote_login_passwordfile= 'EXCLUSIVE'
asm_diskstring= '/dev/mapper/*'
asm_power_limit= 1
diagnostic_dest= '/opt/app/grid'

创建spfile(create spfile=’+CRS’ from pfile=’/tmp/pfile’)

NOTE: Attempting voting file refresh on diskgroup CRS
NOTE: Refresh completed on diskgroup CRS. No voting file found.
NOTE: Voting file relocation is required in diskgroup CRS
NOTE: Attempting voting file relocation on diskgroup CRS
NOTE: voting file deletion on grp 1 disk CRS_0000
NOTE: voting file deletion on grp 1 disk CRS_0001
NOTE: voting file deletion on grp 1 disk CRS_0002
NOTE: No voting file found on diskgroup CRS
Sun May 24 13:23:25 2026
NOTE: updated gpnp profile ASM SPFILE to 
NOTE: updated gpnp profile ASM diskstring: /dev/mapper/*
NOTE: updated gpnp profile ASM diskstring: /dev/mapper/*
NOTE: updated gpnp profile ASM SPFILE to +CRS/cluster/asmparameterfile/registry.253.1234099407

主要注意在create spfile的过程中voting file的信息被删除,使用kfed进行验证

[root@rac1 dbs]# kfed read /dev/asm_mpathi|grep vf
kfdhdb.vfstart:                       0 ; 0x0ec: 0x00000000
kfdhdb.vfend:                         0 ; 0x0f0: 0x00000000
[root@rac1 dbs]# kfed read /dev/asm_mpatha|grep vf
kfdhdb.vfstart:                       0 ; 0x0ec: 0x00000000
kfdhdb.vfend:                         0 ; 0x0f0: 0x00000000
[root@rac1 dbs]# kfed read /dev/asm_mpathd|grep vf
kfdhdb.vfstart:                       0 ; 0x0ec: 0x00000000
kfdhdb.vfend:                         0 ; 0x0f0: 0x00000000

需要执行replace votedisk

[root@rac1 dbs]# crsctl replace votedisk +CRS
Successful addition of voting disk 99df870869c64f78bf944b40bfff8644.
Successful addition of voting disk e6db086a74d64f49bf0aff02944bcac5.
Successful addition of voting disk e88a412fe04b4fe6bf1ca067bca779b8.
Successful deletion of voting disk 615579e778684f54bf31c0ce83709f37.
Successful deletion of voting disk 9e48b89f95084fefbfb7648897508684.
Successful deletion of voting disk 274b913b3b874f7cbfb6b025028b4eaa.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced

这样操作完成之后,再次查看GPnP profile中关于asm_diskstring配置

[grid@rac1 trace]$  gpnptool get -o-
……
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm_*" 
SPFile="+DATA/cimp-cluster/asmparameterfile/registry.253.1234099407"/>
……

然后再次重启集群,一切恢复正常,解决了asm_diskstring=’/dev/dm*’,’/dev/mapper/*’误操作引起的WARNING: detected duplicate paths to the same disk:故障

asm dd 10M导致system文件部分坏块修复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:asm dd 10M导致system文件部分坏块修复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户的数据库由于不当操作导致asm磁盘头损坏,进行的操作命令类似

dd if=/dev/zero of=/dev/dm-29 bs=1024K count=10

asm磁盘组无法mount,提示ORA-15042

SQL> ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:1712:2} */ 
2026-05-19T21:57:16.517284-04:00
NOTE: cache registered group DATA 5/0xAE103296
NOTE: cache began mount (first) of group DATA 5/0xAE103296
NOTE: Assigning number (5,10) to disk (/dev/dm-29)
…………
NOTE: Assigning number (5,18) to disk (/dev/dm-15)
NOTE: Assigning number (5,0) to disk (/dev/dm-7)
NOTE: Assigning number (5,15) to disk (/dev/dm-5)
2026-05-19T21:57:16.650481-04:00
cluster guid (b216d47e2bf86f6aff34a36119b0c161) generated for PST Hbeat for instance 1
2026-05-19T21:57:22.659262-04:00
NOTE: GMON heartbeating for grp 5 (DATA)
GMON querying group 5 at 28 for pid 42, osid 12996
2026-05-19T21:57:22.661447-04:00
NOTE: Assigning number (5,14) to disk ()
2026-05-19T21:57:22.662476-04:00
GMON querying group 5 at 29 for pid 42, osid 12996
2026-05-19T21:57:22.663144-04:00
NOTE: cache dismounting (clean) group 5/0xAE103296 (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 12996, image: oracle@dlycdb1 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 5/0xAE103296 (DATA)
NOTE: cache ending mount (fail) of group DATA number=5 incarn=0xae103296
NOTE: cache deleting context for group DATA 5/0xae103296
2026-05-19T21:57:22.720673-04:00
GMON dismounting group 5 at 30 for pid 42, osid 12996
2026-05-19T21:57:22.721055-04:00
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
…………
NOTE: Disk DATA_0013 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0015 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0016 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0017 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0018 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "14" is missing from group number "5" 

2026-05-19T21:57:22.745140-04:00
ERROR: ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:1712:2} */

由于客户这个是19c版本,直接使用备份au还原,然后mount磁盘组成功

SQL> alter diskgroup data mount 
2026-05-19T22:40:59.137657+08:00
NOTE: cache registered group DATA 2/0xB126DA41
NOTE: cache began mount (first) of group DATA 2/0xB126DA41
NOTE: Assigning number (2,14) to disk (/dev/dm-29)
NOTE: Assigning number (2,8) to disk (/dev/dm-28)
…………
NOTE: Assigning number (2,2) to disk (/dev/dm-7)
NOTE: Assigning number (2,0) to disk (/dev/dm-6)
NOTE: Assigning number (2,1) to disk (/dev/dm-3)
2026-05-19T22:40:59.303908+08:00
cluster guid (b216d47e2bf86f6aff34a36119b0c161) generated for PST Hbeat for instance 2
2026-05-19T22:41:05.312792+08:00
NOTE: GMON heartbeating for grp 2 (DATA)
GMON querying group 2 at 80 for pid 35, osid 53051
2026-05-19T22:41:05.314925+08:00
NOTE: cache is mounting group DATA created on 2024/11/28 17:55:45
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/dm-6
NOTE: 05/20/26 16:41:04 DATA.F1X0 found on disk 0 au 10 fcn 0.0 datfmt 1
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/dm-3
…………
NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/dm-18
NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/dm-29
NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/dm-16
NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/dm-17
NOTE: cache opening disk 17 of grp 2: DATA_0017 path:/dev/dm-20
NOTE: cache opening disk 18 of grp 2: DATA_0018 path:/dev/dm-21
2026-05-19T22:41:05.317307+08:00
NOTE: cache mounting (first) external redundancy group 2/0xB126DA41 (DATA)
2026-05-19T22:41:05.522191+08:00
NOTE: attached to recovery domain 2
2026-05-19T22:41:05.558136+08:00
validate pdb 2, flags x4, valid 0, pdb flags x204 
* validated domain 2, flags = 0x200
NOTE: cache recovered group 2 to fcn 0.46336611
NOTE: redo buffer size is 512 blocks (2105344 bytes)
2026-05-19T22:41:05.569479+08:00
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 110.6507 lock domain=0 inc#=0 instnum=1
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
2026-05-19T22:41:05.574499+08:00
mntstmp=2026/05/20 16:41:05.572000
2026-05-19T22:41:05.574854+08:00
NOTE: cache mounting group 2/0xB126DA41 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0xb126da41
2026-05-19T22:41:05.616227+08:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2026-05-19T22:41:05.616754+08:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2026-05-19T22:41:05.617932+08:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
2026-05-19T22:41:05.618303+08:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
2026-05-19T22:41:05.643844+08:00
SUCCESS: diskgroup DATA was mounted

虽然磁盘组mount成功,但是asm依旧在报错

这个客户胆子也真够大的,这样mount起来的磁盘组,还往里面加入磁盘出发Rebalance操作
1
2026-05-20T09:18:34.292103-04:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_60220.trc:
ORA-15196: invalid ASM block header [kfc.c:30747] [endian_kfbh] [2147483662] [1014] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:30747] [endian_kfbh] [2147483662] [1014] [0 != 1]
NOTE: cache repaired a corrupt block: group=2(DATA) dsk=14 blk=1014 on disk 14 from disk=14 (DATA_0014) incarn=4042690154 au=11 blk=1014 count=1
2026-05-20T09:18:36.721669-04:00
WARNING: cache read a corrupt block: group=2(DATA) dsk=14 blk=1015 disk=14 (DATA_0014) incarn=4042690154 au=0 blk=1015 count=1
2026-05-20T09:18:36.721982-04:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_60220.trc:
ORA-15196: invalid ASM block header [kfc.c:30747] [endian_kfbh] [2147483662] [1015] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_60220.trc
WARNING: cache read (retry) a corrupt block: group=2(DATA) dsk=14 blk=1015 disk=14 (DATA_0014) incarn=4042690154 au=0 blk=1015 count=1
2026-05-20T09:18:36.724279-04:00
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_60220.trc:
ORA-15196: invalid ASM block header [kfc.c:30747] [endian_kfbh] [2147483662] [1015] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:30747] [endian_kfbh] [2147483662] [1015] [0 != 1]
NOTE: cache repaired a corrupt block: group=2(DATA) dsk=14 blk=1015 on disk 14 from disk=14 (DATA_0014) incarn=4042690154 au=11 blk=1015 count=1
2026-05-20T09:44:20.469792-04:00
NOTE: Starting expel slave for group 2/0xcd67eb4 (DATA)
2026-05-20T09:44:20.473880-04:00
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=2
2026-05-20T09:44:20.530101-04:00
NOTE: membership refresh pending for group 2/0xcd67eb4 (DATA)
2026-05-20T09:44:20.534097-04:00
GMON querying group 2 at 178 for pid 27, osid 70130
2026-05-20T09:44:20.557651-04:00
SUCCESS: refreshed membership for 2/0xcd67eb4 (DATA)
2026-05-20T09:44:23.489304-04:00
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
2026-05-20T11:20:49.804413-04:00
NOTE: stopping process ARB0
NOTE: stopping process ARBA
2026-05-20T11:20:51.538777-04:00
SUCCESS: rebalance completed for group 2/0xcd67eb4 (DATA)
2026-05-20T23:20:51.541462+08:00
SUCCESS: ALTER DISKGROUP DATA ADD  DISK '/dev/dm-20' SIZE 4194304M
 REBALANCE WAIT

算幸运,由于ORA-15196: invalid ASM block header [kfc.c:30747] [endian_kfbh] 错误导致rebalance没有真正运行起来,从而该磁盘组没有dismount(19c这个方面确实增强不少,如果以前版本大概率会直接dismount掉)

客户在这样mount的磁盘组上尝试启动库,报ORA-01578错误,无法启动成功

2026-05-20T17:46:07.060794+08:00
Error attempting to elevate LMHB's priority: no further priority changes will be attempted for this process
2026-05-20T17:46:07.647114+08:00
Undo initialization recovery: Parallel FPTR complete: start:26091908 end:26096229 diff:4321 ms (4.3 seconds)
Undo initialization recovery: err:0 start: 26091907 end: 26096229 diff: 4322 ms (4.3 seconds)
[50880] Successfully onlined Undo Tablespace 5.
Undo initialization online undo segments: err:0 start: 26096229 end: 26096576 diff: 347 ms (0.3 seconds)
Undo initialization finished serial:0 start:26091907 end:26096591 diff:4684 ms (4.7 seconds)
Database Characterset is AL32UTF8
No Resource Manager plan active
2026-05-20T17:46:08.734113+08:00

Corrupt block relative dba: 0x004030ee (file 1, block 12526)
Completely zero block found during buffer read

Reread (file 1, block 12526) found same corrupt data (no logical check)
2026-05-20T17:46:08.811455+08:00
Corrupt Block Found
         TIME STAMP (GMT) = 05/20/2026 17:46:07
         CONT = 0, TSN = 0, TSNAME = SYSTEM
         RFN = 1, BLK = 12526, RDBA = 4206830
         OBJN = 37, OBJD = 37, OBJECT = I_OBJ2, SUBOBJECT = 
         SEGMENT OWNER = SYS, SEGMENT TYPE = Index Segment
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc  (incident=800708):
ORA-01578: ORACLE data block corrupted (file # 1, block # 12526)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T05:46:09.047772-04:00
ALTER SYSTEM SET remote_listener=' xff-scan:11521' SCOPE=MEMORY SID='xff2';
2026-05-20T05:46:09.049615-04:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='xff2';
2026-05-20T17:46:09.812271+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************

Corrupt block relative dba: 0x004030de (file 1, block 12510)
Completely zero block found during buffer read

Reread (file 1, block 12510) found same corrupt data (no logical check)
2026-05-20T17:46:10.350573+08:00
Corrupt Block Found
         TIME STAMP (GMT) = 05/20/2026 17:46:09
         CONT = 0, TSN = 0, TSNAME = SYSTEM
         RFN = 1, BLK = 12510, RDBA = 4206814
         OBJN = 83, OBJD = 83, OBJECT = DEPENDENCY$, SUBOBJECT = 
         SEGMENT OWNER = SYS, SEGMENT TYPE = Table Segment
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc  (incident=800709):
ORA-01578: ORACLE data block corrupted (file # 1, block # 12510)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T17:46:10.694365+08:00

Corrupt block relative dba: 0x00403000 (file 1, block 12288)

Completely zero block found during validation
Reread of blocknum=12288, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12288, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12288, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12288, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12288, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data

Corrupt block relative dba: 0x00403001 (file 1, block 12289)

Completely zero block found during validation
Reread of blocknum=12289, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12289, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12289, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12289, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12289, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
………………
Corrupt block relative dba: 0x004030ff (file 1, block 12543)

Completely zero block found during validation
Reread of blocknum=12543, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12543, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12543, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12543, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data
Reread of blocknum=12543, file=+DATA/xff/DATAFILE/system.257.1186720165. found same corrupt data

Corrupt block relative dba: 0x0040301d (file 1, block 12317)
Completely zero block found during buffer read

Reread (file 1, block 12317) found same corrupt data (no logical check)
2026-05-20T17:46:12.183545+08:00
Corrupt Block Found
         TIME STAMP (GMT) = 05/20/2026 17:46:11
         CONT = 0, TSN = 0, TSNAME = SYSTEM
         RFN = 1, BLK = 12317, RDBA = 4206621
         OBJN = 37, OBJD = 37, OBJECT = I_OBJ2, SUBOBJECT = 
         SEGMENT OWNER = SYS, SEGMENT TYPE = Index Segment
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc  (incident=800710):
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
Incident details in: /u01/app/oracle/diag/rdbms/xff/xff2/incident/incdir_800710/xff2_ora_50880_i800710.trc
2026-05-20T05:46:12.371040-04:00
ALTER SYSTEM SET remote_listener=' xff-scan:11521' SCOPE=MEMORY SID='xff2';
2026-05-20T05:46:12.372924-04:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='xff2';
2026-05-20T17:46:13.814346+08:00
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T17:46:13.814407+08:00
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T17:46:13.814442+08:00
Error 604 happened during db open, shutting down database
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc  (incident=800711):
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'

Corrupt block relative dba: 0x00403097 (file 1, block 12439)
Completely zero block found during buffer read

Reread (file 1, block 12439) found same corrupt data (no logical check)
2026-05-20T17:46:14.259044+08:00
Corrupt Block Found
         TIME STAMP (GMT) = 05/20/2026 17:46:13
         CONT = 0, TSN = 0, TSNAME = SYSTEM
         RFN = 1, BLK = 12439, RDBA = 4206743
         OBJN = 18, OBJD = 18, OBJECT = OBJ$, SUBOBJECT = 
         SEGMENT OWNER = SYS, SEGMENT TYPE = Table Segment
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_gen0_48604.trc  (incident=798852):
ORA-01578: ORACLE data block corrupted (file # 1, block # 12439)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'

2026-05-20T17:46:13.814346+08:00
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T17:46:13.814407+08:00
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T17:46:13.814442+08:00
Error 604 happened during db open, shutting down database
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_50880.trc  (incident=800711):
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 12317)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'

Corrupt block relative dba: 0x00403097 (file 1, block 12439)
Completely zero block found during buffer read

Reread (file 1, block 12439) found same corrupt data (no logical check)
2026-05-20T17:46:14.259044+08:00
Corrupt Block Found
         TIME STAMP (GMT) = 05/20/2026 17:46:13
         CONT = 0, TSN = 0, TSNAME = SYSTEM
         RFN = 1, BLK = 12439, RDBA = 4206743
         OBJN = 18, OBJD = 18, OBJECT = OBJ$, SUBOBJECT = 
         SEGMENT OWNER = SYS, SEGMENT TYPE = Table Segment
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_gen0_48604.trc  (incident=798852):
ORA-01578: ORACLE data block corrupted (file # 1, block # 12439)
ORA-01110: data file 1: '+DATA/xff/DATAFILE/system.257.1186720165'
2026-05-20T05:46:14.436597-04:00
ALTER SYSTEM SET remote_listener=' xff-scan:11521' SCOPE=MEMORY SID='xff2';
2026-05-20T05:46:14.438492-04:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='xff2';
2026-05-20T17:46:15.486758+08:00
opiodr aborting process unknown ospid (50880) as a result of ORA-603
2026-05-20T17:46:15.498707+08:00
ORA-603 : opitsk aborting process
License high water mark = 423
USER(prelim) (ospid: 50880): terminating the instance due to ORA error 604
2026-05-20T17:46:15.536740+08:00
opiodr aborting process unknown ospid (69547) as a result of ORA-1092
2026-05-20T05:46:16.321597-04:00
ORA-1092 : opitsk aborting process

这里基本上可以看出来是由于在数据库启动过程中递归调用一些sql,但是由于遭遇到坏块导致启动失败,通过dbv检查system数据文件发现256个坏块
dbv-zero


256个连续的全0坏块,怀疑是2M的数据被dd全空覆盖,这样的情况,也就是怀疑是au=2的后面2M被覆盖(ausize为4M),分析system的数据分布情况
au

这里可以确认system的第24个au(从0开始)在14号盘au 2 上面,也就是数据块起始损坏为block:12288-12543(24M*4/8K[有block 0 需要考虑]),对于这种彻底损坏而且比较靠前的system中block,通过人工构造出来这些block的方式进行修复,在自研的Oracle Recovery Tools和obet工具都有该功能.运气不错,通过这个修复之后,直接expdp导出数据没有大问题,比较完美的恢复了这个故障.

Oracle 19c 202604补丁(RUs+OJVM)-19.31

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle 19c 202604补丁(RUs+OJVM)-19.31

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

Release Database Update GI Update Windows Bundle Patch
APR2026 (19.31.0.0.0) 39034528 - 38818049
JAN2026 (19.31.0.0.0) - 39036936 -
JAN2026 (19.30.0.0.0) 38632161 38629535 38597735
OCT2025 (19.29.0.0.0) 38291812 38298204 38111211
JUL2025 (19.28.0.0.0) 37960098 37957391 37962957
APR2025 (19.27.0.0.0) 37642901 37641958 37532350
JAN2025 (19.26.0.0.0) 37260974 37257886 37486199
OCT2024 (19.25.0.0.0) 36912597 36916690 36878821
JUL2024 (19.24.0.0.0) 36582781 36582629 36521936
APR2024 (19.23.0.0.0) 36233263 36233126 36219938
JAN2024 (19.22.0.0.0) 35943157 35940989 35962832
OCT2023 (19.21.0.0.0) 35643107 35642822 35681552
JUL2023 (19.20.0.0.0) 35320081 35319490 35348034
APR2023 (19.19.0.0.0) 35042068 35037840 35046439
JAN2023 (19.18.0.0.0) 34765931 34762026 34750795
Oct2022 (19.17.0.0.0) 34419443 34416665 34468114
JUL2022 (19.16.0.0.0) 34133642 34130714 34110685
APR2022 (19.15.0.0.0) 33806152 33803476 33829175
JAN2022 (19.14.0.0.0) 33515361 33509923 33575656
OCT2021(19.13.0.0.0) 33192793 33182768 33155330
JUL2021 (19.12.0.0.0) 32904851 32895426 32832237
APR2021 (19.11.0.0.0) 32545013 32545008 32409154
JAN2021 (19.10.0.0.0) 32218454 32226239 32062765
OCT2020 (19.9.0.0.0) 31771877 31750108 31719903
JUL2020 (19.8.0.0.0) 31281355 31305339 31247621
APR2020 (19.7.0.0.0) 30869156 30899722 30901317
JAN2020 (19.6.0.0.0) 30557433 30501910 30445947
OCT2019 (19.5.0.0.0) 30125133 30116789 30151705
JUL2019 (19.4.0.0.0) 29834717 29708769 -
APR2019 (19.3.0.0.0) 29517242 29517302 -
Release OJVM Update OJVM + DB Update OJVM + GI Update
APR2026 (19.31.0.0.260421) 38906621 39062931 39062956
JAN2026 (19.30.0.0.260120) 38523609 38658587 38658588
OCT2025 (19.29.0.0.251021) 38194382 38273545 38273558
JUL2025 (19.28.0.0.250715) 37847857 37952354 37952382
APR2025 (19.27.0.0.250415) 37499406 37591483 37591516
JAN2025 (19.26.0.0.250121) 37102264 37262172 37262208
OCT2024 (19.25.0.0.241015) 36878697 36866623 36866740
JUL2024 (19.24.0.0.240716) 36414915 36522340 36522439
APR2024 (19.23.0.0.240416) 36199232 36209492 36209493
JAN2024 (19.22.0.0.240116) 35926646 36031426 36031453
OCT2023 (19.21.0.0.231017) 35648110 35742413 35742441
JUL2023 (19.20.0.0.230718) 35354406 35370174 35370167
APR2023 (19.19.0.0.230418) 35050341 35058163 35058172
JAN2023 (19.18.0.0.230117) 34786990 34773489 34773504
OCT2022 (19.17.0.0.221018) 34411846 34449114 34449117
JUL2022 (19.16.0.0.220719) 34086870 34160831 34160854
APR2022 (19.15.0.0.220419) 33808367 33859194 33859214
JAN2022 (19.14.0.0.220118) 33561310 33567270 33567274
OCT2021 (19.13.0.0.211019) 33192694 33248420 33248471
JUL2021 (19.12.0.0.210720) 32876380 32900021 32900083
APR2021 (19.11.0.0.210420) 32399816 32578972 32578973
JAN2021 (19.10.0.0.210119) 32067171 32126828 32126842
OCT2020 (19.9.0.0.201020) 31668882 31720396 31720429
JUL2020 (19.8.0.0.200714) 31219897 31326362 31326369
APR2020 (19.7.0.0.200414) 30805684 30783543 30783556
JAN2020 (19.6.0.0.200114) 30484981 30463595 30463609
OCT2019 (19.5.0.0.191015) 30128191 30133124 30133178
JUL2019 (19.4.0.0.190716) 29774421 29699079 29699097
APR2019 (19.3.0.0.190416) 29548437 29621253 29621299

参考:Assistant: Download Reference for Oracle Database/GI Update, Revision, PSU, SPU(CPU), Bundle Patches, Patchsets and Base Releases KA958

Oracle故障第一现场被恢复混乱的数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle故障第一现场被恢复混乱的数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户数据库断电异常之后,由于第三方进行了一系列的恢复,现场比较混乱,无法判断最初情况,通过Oracle Database Recovery Check收集的结果进行初步判断
1. 有三个文件处于丢失状态,并且数据库在故障之后被人强制resetlogs拉过库
file-missing


2. 数据文件头scn不一致,而且相差的日志序列还比较大(该数据库为非归档模式)
seq

3. 该库多次重建ctl(alert日志中也有相关记录)
rectl

现在恢复这个库需要做的几件事情:
1. 由于没有任何原始故障之后的控制文件,需要从服务器上找出来所有故障之时的数据文件,担心被人重建ctl使用了错误的数据文件
2. 对于三个file missing的进行分析,并确认磁盘上是否存在,是否是好的,如果是好的需要和现在的文件一起作为一个整体进行恢复,并打开库
3. 打开数据库过程可能遇到的错误处理

通过obet中近期增加的get_dbinfo功能来解析所有可能的数据文件头(obet官方说明),结合文件头的信息判断,发现磁盘上名称dbf结尾的文件号重复
1

这样的情况下,我们取filesize大,(scn大不一定正确,可能由于被强制resetlogs导致scn比正确的文件大),同时也结合这个收集的信息,确认三个丢失的文件中两个为undotbs1表空间文件,另外一个为112k的数据文件,这里让我学习到了新知识,oracle的数据文件最小可以多少个block(通过试验测试,最小可以16个block,文件大小即为:16+1(block 0)*block_size)

C:\Users\XFF>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on 星期三 5月 13 22:05:52 2026

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create tablespace tbs datafile 'e:/tbs01.dbf' size 8k;
create tablespace tbs datafile 'e:/tbs01.dbf' size 8k
*
第 1 行出现错误:
ORA-03214: 指定的文件大小小于所需的最小值


SQL> create tablespace tbs datafile 'e:/tbs01.dbf' size 80k;
create tablespace tbs datafile 'e:/tbs01.dbf' size 80k
*
第 1 行出现错误:
ORA-03214: 指定的文件大小小于所需的最小值


SQL> create tablespace tbs datafile 'e:/tbs01.dbf' size 96k;

表空间已创建。

确认好相关信息之后,然后对于三个file missing状态的文件进行dbv检测确认undotbs01.dbf(file 3)基本上全部损坏(大量全0块),另外两个文件正常
obet_dbv


对于正常的文件通过obet修改相关scn信息
obet_resetlogs_scn

然后重建控制文件(丢弃undotbs01.dbf文件),由于确认undo已经异常,直接设置undo为manual管理方式并屏蔽回滚段,然后屏蔽一致性,强制打开数据库,结果报ORA-600 2662错误
ora-600-2662

使用Patch_SCN工具修改数据库scn(Patch_SCN工具说明)
patch_scn

然后数据库顺利打开,重建新undo,增加temp,删除老undo,导出数据完成本次恢复任务