pg_filedump是在pg数据库极端情况下直接解析数据库文件的利器,但是由于是开源软件,本身难以实现批量处理,通过对底层基表分析,可以实现批量处理功能
分析PostgreSQL库中数据库信息
--数据库查询结果
postgres=# select oid,datname,datdba,dattablespace from pg_database;
oid | datname | datdba | dattablespace
-------+-------------+--------+---------------
14187 | postgres | 10 | 1663
16403 | db_xff | 10 | 1663
1 | template1 | 10 | 1663
14186 | template0 | 10 | 1663
16407 | db_xifenfei | 16405 | 16406
(5 rows)
--通过dump 该文件解析数据
<Data> -----
Item 1 -- Length: 0 Offset: 5 (0x0005) Flags: REDIRECT
Item 2 -- Length: 0 Offset: 6 (0x0006) Flags: REDIRECT
Item 3 -- Length: 260 Offset: 7320 (0x1c98) Flags: NORMAL
COPY: 14187 postgres
Item 4 -- Length: 260 Offset: 7056 (0x1b90) Flags: NORMAL
COPY: 16403 db_xff
Item 5 -- Length: 297 Offset: 7888 (0x1ed0) Flags: NORMAL
COPY: 1 template1
Item 6 -- Length: 297 Offset: 7584 (0x1da0) Flags: NORMAL
COPY: 14186 template0
Item 7 -- Length: 260 Offset: 6792 (0x1a88) Flags: NORMAL
COPY: 16407 db_xifenfei
分析PostgreSQL 表空间信息
--sql查询表空间信息
postgres=# select * from pg_tablespace;
oid | spcname | spcowner | spcacl | spcoptions
-------+--------------+----------+--------+------------
1663 | pg_default | 10 | |
1664 | pg_global | 10 | |
16406 | tbs_xifenfei | 16405 | |
(3 rows)
--通过dump 该文件解析数据
<Data> -----
Item 1 -- Length: 96 Offset: 8096 (0x1fa0) Flags: NORMAL
COPY: 1663 pg_default
Item 2 -- Length: 96 Offset: 8000 (0x1f40) Flags: NORMAL
COPY: 1664 pg_global
Item 3 -- Length: 96 Offset: 7904 (0x1ee0) Flags: NORMAL
COPY: 16406 tbs_xifenfei
分析PostgreSQL 对象id、name、path对应关系
--对象信息查询
postgres=# select oid ,relname,relnamespace,reltype,reloftype,relowner,relam,relfilenode,
reltablespace from pg_class where relname like 't_t%' or relname like 't_x%';
oid | relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace
-------+------------+--------------+---------+-----------+----------+-------+-------------+---------------
16387 | t_xifenfei | 2200 | 16389 | 0 | 10 | 2 | 16390 | 0
16391 | t_xff | 2200 | 16393 | 0 | 10 | 2 | 16391 | 0
16394 | t_xff2 | 2200 | 16396 | 0 | 10 | 2 | 16394 | 0
16397 | t_xff3 | 2200 | 16399 | 0 | 10 | 2 | 16397 | 0
16400 | t_xff4 | 2200 | 16402 | 0 | 10 | 2 | 16400 | 0
16408 | t_tbs | 2200 | 16410 | 0 | 10 | 2 | 16408 | 16406
(6 rows)
--通过dump 该文件解析数据(显示部分)
COPY: 16394 t_xff2 2200 16396 0 10 2 16394 0 0 0.000000000000 0 0 f f p r
Item 29 -- Length: 0 Offset: 31 (0x001f) Flags: REDIRECT
Item 30 -- Length: 172 Offset: 2592 (0x0a20) Flags: NORMAL
COPY: 16397 t_xff3 2200 16399 0 10 2 16397 0 0 0.000000000000 0 0 f f p r
Item 31 -- Length: 205 Offset: 3376 (0x0d30) Flags: NORMAL
COPY: 12093 pg_shadow 11 12094 0 10 0 0 0 0 0.000000000000 0 0 f f p v
Item 32 -- Length: 172 Offset: 2416 (0x0970) Flags: NORMAL
COPY: 16400 t_xff4 2200 16402 0 10 2 16400 0 0 0.000000000000 0 0 f f p r
这个里面获取到pg_class.reltablespace是表空间的id值,根据自定义表空间的规则:在pgdata/pg_tblspc创建link指向创建表空间时候的文件夹路径
-bash-4.2$ pwd
/var/lib/pgsql/12/data/pg_tblspc
-bash-4.2$ ls -ltr
total 0
lrwxrwxrwx 1 postgres postgres 30 Apr 15 20:13 16406 -> /var/lib/pgsql/12/data/tbs_xff
结合上述的pg_database,pg_tablespace,pg_class信息,可以获取到每个表对应实际的存储路径
分析PostgreSQL 模式信息
--sql查询模式信息
postgres=# select * from pg_namespace;
oid | nspname | nspowner | nspacl
-------+--------------------+----------+-------------------------------------
99 | pg_toast | 10 |
12314 | pg_temp_1 | 10 |
12315 | pg_toast_temp_1 | 10 |
11 | pg_catalog | 10 | {postgres=UC/postgres,=U/postgres}
2200 | public | 10 | {postgres=UC/postgres,=UC/postgres}
13887 | information_schema | 10 | {postgres=UC/postgres,=U/postgres}
16404 | u_xifenfei | 10 |
(7 rows)
--通过dump 该文件解析数据
<Data> -----
Item 1 -- Length: 0 Offset: 6 (0x0006) Flags: REDIRECT
Item 2 -- Length: 96 Offset: 8096 (0x1fa0) Flags: NORMAL
COPY: 99 pg_toast
Item 3 -- Length: 0 Offset: 7 (0x0007) Flags: REDIRECT
Item 4 -- Length: 96 Offset: 8000 (0x1f40) Flags: NORMAL
COPY: 12314 pg_temp_1
Item 5 -- Length: 96 Offset: 7904 (0x1ee0) Flags: NORMAL
COPY: 12315 pg_toast_temp_1
Item 6 -- Length: 141 Offset: 7760 (0x1e50) Flags: NORMAL
COPY: 11 pg_catalog
Item 7 -- Length: 141 Offset: 7616 (0x1dc0) Flags: NORMAL
COPY: 2200 public
Item 8 -- Length: 0 Offset: 9 (0x0009) Flags: REDIRECT
Item 9 -- Length: 141 Offset: 7472 (0x1d30) Flags: NORMAL
COPY: 13887 information_schema
Item 10 -- Length: 96 Offset: 7376 (0x1cd0) Flags: NORMAL
COPY: 16404 u_xifenfei
通过pg_namespace,pg_class信息,可以获取到对象所属的模式关系,基于上述汇总,可以获取到某个模式下面,所有表id和实际存储路径,现在使用pg_filedump进行恢复,还缺少表的列类型信息,通过pg_type和pg_attribute来获取。
获取PostgreSQL表的列名称和类型[编号]信息
--sql查询列信息
postgres=# \d t_tbs
Table "public.t_tbs"
Column | Type | Collation | Nullable | Default
----------+------+-----------+----------+---------
oid | oid | | |
spcname | name | | |
spcowner | oid | | |
Tablespace: "tbs_xifenfei"
postgres=# select attrelid,attname,atttypid,attstattarget,attlen,attnum from pg_attribute where attrelid=16408;
attrelid | attname | atttypid | attstattarget | attlen | attnum
----------+----------+----------+---------------+--------+--------
16408 | tableoid | 26 | 0 | 4 | -6
16408 | cmax | 29 | 0 | 4 | -5
16408 | xmax | 28 | 0 | 4 | -4
16408 | cmin | 29 | 0 | 4 | -3
16408 | xmin | 28 | 0 | 4 | -2
16408 | ctid | 27 | 0 | 6 | -1
16408 | oid | 26 | -1 | 4 | 1
16408 | spcname | 19 | -1 | 64 | 2
16408 | spcowner | 26 | -1 | 4 | 3
(9 rows)
--dump 内容(截取部分)
Item 11 -- Length: 144 Offset: 1424 (0x0590) Flags: NORMAL
COPY: 16408 oid 26 -1 4 1
Item 12 -- Length: 144 Offset: 1280 (0x0500) Flags: NORMAL
COPY: 16408 spcname 19 -1 64 2
Item 13 -- Length: 144 Offset: 1136 (0x0470) Flags: NORMAL
COPY: 16408 spcowner 26 -1 4 3
Item 14 -- Length: 144 Offset: 992 (0x03e0) Flags: NORMAL
COPY: 16408 ctid 27 0 6 -1
Item 15 -- Length: 144 Offset: 848 (0x0350) Flags: NORMAL
COPY: 16408 xmin 28 0 4 -2
Item 16 -- Length: 144 Offset: 704 (0x02c0) Flags: NORMAL
COPY: 16408 cmin 29 0 4 -3
Item 17 -- Length: 144 Offset: 560 (0x0230) Flags: NORMAL
COPY: 16408 xmax 28 0 4 -4
Item 18 -- Length: 144 Offset: 416 (0x01a0) Flags: NORMAL
COPY: 16408 cmax 29 0 4 -5
PostgreSQL获取类型编号和实际类型名称对应关系
--查询类型编号和实际类型关系
postgres=# select oid,typname from pg_type;
oid | typname
-------+---------------------------------------
16 | bool
17 | bytea
18 | char
19 | name
20 | int8
21 | int2
22 | int2vector
23 | int4
24 | regproc
25 | text
26 | oid
27 | tid
28 | xid
29 | cid
……
--dump 内容(截取部分)
Item 1 -- Length: 176 Offset: 8016 (0x1f50) Flags: NORMAL
COPY: 16 bool
Item 2 -- Length: 176 Offset: 7840 (0x1ea0) Flags: NORMAL
COPY: 17 bytea
Item 3 -- Length: 176 Offset: 7664 (0x1df0) Flags: NORMAL
COPY: 18 char
Item 4 -- Length: 176 Offset: 7488 (0x1d40) Flags: NORMAL
COPY: 19 name
Item 5 -- Length: 176 Offset: 7312 (0x1c90) Flags: NORMAL
COPY: 20 int8
Item 6 -- Length: 176 Offset: 7136 (0x1be0) Flags: NORMAL
COPY: 21 int2
Item 7 -- Length: 176 Offset: 6960 (0x1b30) Flags: NORMAL
COPY: 22 int2vector
Item 8 -- Length: 176 Offset: 6784 (0x1a80) Flags: NORMAL
COPY: 23 int4
Item 9 -- Length: 176 Offset: 6608 (0x19d0) Flags: NORMAL
COPY: 24 regproc
Item 10 -- Length: 176 Offset: 6432 (0x1920) Flags: NORMAL
COPY: 25 text
Item 11 -- Length: 176 Offset: 6256 (0x1870) Flags: NORMAL
COPY: 26 oid
Item 12 -- Length: 176 Offset: 6080 (0x17c0) Flags: NORMAL
COPY: 27 tid
Item 13 -- Length: 176 Offset: 5904 (0x1710) Flags: NORMAL
COPY: 28 xid
Item 14 -- Length: 176 Offset: 5728 (0x1660) Flags: NORMAL
COPY: 29 cid
Item 15 -- Length: 176 Offset: 5552 (0x15b0) Flags: NORMAL
COPY: 30 oidvector
通过pg_class、pg_type和pg_attribute可以获取对象的表的列名称,数据类型等信息。通过以上几个对象,即可获取到pg_filedmp处理所需要的所有信息,然后可以实现批量处理