oracle字符集扫描工具csscan介绍

2011-05-24  张林 

oracle字符集扫描工具csscan介绍
1. oracle csscan工具使用 (Character Set Scanner utility)

csscan与Oracle的exp、imp等工具类似,具体如下:

$ pwd
/oracle/app/oracle/product/9.2.0/bin
$ ls -l cs*
-rwxr-xr-x 1 oracle oinstall 13786434 Apr 26 2002 csscan
-rwxr-xr-x 1 oracle oinstall 0 May 02 2002 csscanO

$ csscan help=y

Character Set Scanner v1.1 : Release 9.2.0.1.0 - Production on Mon Dec 7 14:33:32 2009

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

You can let Scanner prompt you for parameters by entering the CSSCAN
command followed by your username/password:

Example: CSSCAN SYSTEM/MANAGER

Or, you can control how Scanner runs by entering the CSSCAN command
followed by various parameters. To specify parameters, you use keywords:

Example: CSSCAN SYSTEM/MANAGER FULL=y TOCHAR=utf8 ARRAY=102400 PROCESS=3

Keyword Default Prompt Description
---------- ------- ------ -------------------------------------------------
USERID yes username/password
FULL N yes scan entire database
USER yes user name of the table to scan
TABLE yes list of tables to scan
EXCLUDE list of tables to exclude from scan
TOCHAR yes new database character set name
FROMCHAR current database character set name
TONCHAR new NCHAR character set name
FROMNCHAR current NCHAR character set name
ARRAY 10240 yes size of array fetch buffer
PROCESS 1 yes number of scan process
MAXBLOCKS split table if larger than MAXBLOCKS
CAPTURE N capture convertible data
SUPPRESS suppress error log by N per table
FEEDBACK feedback progress every N rows
BOUNDARIES list of column size boundaries for summary report
LASTRPT N generate report of the last database scan
LOG scan base name of log files
PARFILE parameter file name
PRESERVE N preserve existing scan results
HELP N show help screen (this screen)
---------- ------- ------ -------------------------------------------------
Scanner terminated successfully.
2.csscan工具安装

csscan在使用前,需要创建相关表,其脚本在$ORACLE_HOME/rdbms/admin/csminst.sql

$ sqlplus

SQL*Plus: Release 9.2.0.7.0 - Production on Mon Dec 7 14:39:56 2009
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Enter user-name: sys as sysdba
Enter password:
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 - Production
SQL> @?/rdbms/admin/csminst.sql
User created.

Grant succeeded.

Grant succeeded.

Grant succeeded.

User altered.

1 row created.

1 row updated.
Connected.
......
Commit complete.
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 - Production
3.修改csmig的默认表空间

如果在做大量扫描时,可能会需要大量的表空间,因此建议设置单独表空间,可以在步骤2中进行修改

4.用csscan扫描指定用户的数据

$ csscan userid=system/”system*” user=train tochar=’UTF8′

Character Set Scanner v1.1 : Release 9.2.0.1.0 – Production on Mon Dec 7 14:47:02 2009

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.7.0 – 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 – Production

Enter array fetch buffer size: 10240 >

Enter number of scan processes to utilize(1..32): 1 >

Enumerating tables to scan…

. process 1 scanning TRAIN.TRAIN_OBJECTS[AAAODqAArAAAArBAAA]
. process 1 scanning TRAIN.LY_SB_TEST[AAAQ8gAArAAAAuBAAA]
. process 1 scanning TRAIN.COLUMN_DROP_DEMO[AAAE3/AArAAAAZBAAA]
. process 1 scanning TRAIN.UDI_DEMO_TMP[AAAE1RAArAAAASJAAA]
. process 1 scanning TRAIN.STUDENTS[AAADQAAAqAAAAAhAAA]
. process 1 scanning TRAIN.HOT_STUDENTS[AAAE37AArAAAAWBAAA]
. process 1 scanning TRAIN.TESTAA[AAAQynAArAAAAtZAAA]
. process 1 scanning TRAIN.UDI_DEMO[AAAEyjAArAAAAHZAAA]
. process 1 scanning TRAIN.T3[AAAQ/XAArAAACmZAAA]
. process 1 scanning TRAIN.PLAN_TABLE[AAAQ/rAArAAAAJBAAA]
. process 1 scanning TRAIN.CLASSES[AAADP8AAqAAAAAJAAA]
. process 1 scanning TRAIN.DM_CZRY[AAAE89AArAAAAcZAAA]
….
. process 1 scanning TRAIN.GRADES[AAADQHAAqAAAAAxAAA]
. process 1 scanning TRAIN.COURSES[AAADP+AAqAAAAARAAA]

Creating Database Scan Summary Report…

Creating Individual Exception Report…

Scanner terminated successfully.

5.查看生成报告

scan.err展示转换会出错的表和字段名
scan.out展示执行csscan的步骤
scan.txt显示数据库概要信息

$
$ ls -l scan*
-rw-r–r– 1 oracle oinstall 2928 Dec 07 14:47 scan.err
-rw-r–r– 1 oracle oinstall 2998 Dec 07 14:47 scan.out
-rw-r–r– 1 oracle oinstall 8377 Dec 07 14:47 scan.txt
$ cat scan.err

Database Scan Individual Exception Report

[Database Scan Parameters]

Parameter Value
------------------------------ ------------------------------------------------
Scan type User tables
User name train
Scan CHAR data? YES
Current database character set ZHS16GBK
New database character set UTF8
Scan NCHAR data? NO
Array fetch buffer size 10240
Number of processes 1
Capture convertible data? NO
------------------------------ ------------------------------------------------

[Data Dictionary individual exceptions]

[Application data individual exceptions]

User : TRAIN
Table : DM_CZRY
Column: CZRY_MC
Type : VARCHAR2(10)
Number of Exceptions : 10
Max Post Conversion Data Size: 14

ROWID Exception Type Size Cell Data(first 30 bytes)
------------------ ------------------ ----- ------------------------------
AAAE89AArAAAAcaAAG exceed column size 12 四呼少虹
AAAE89AArAAAAcbAAH exceed column size 12 巫罗伟强
AAAE89AArAAAAcbAAb exceed column size 12 南山查询
AAAE89AArAAAAcbAAn exceed column size 14 计征福田14
AAAE89AArAAAAcbAAp exceed column size 14 计征福田17
AAAE89AArAAAAcbAAq exceed column size 14 计征福田15
AAAE89AArAAAAcbAAr exceed column size 14 计征福田16
AAAE89AArAAAAcbAAs exceed column size 14 计征福田18
AAAE89AArAAAAccABJ exceed column size 12 储蓄扣税
AAAE89AArAAAAccABN exceed column size 12 信息中心
------------------ ------------------ ----- ------------------------------

User : TRAIN
Table : TRACE_DDL
Column: SQL_TEXT
Type : VARCHAR2(2000)
Number of Exceptions : 4
Max Post Conversion Data Size: 1966

ROWID Exception Type Size Cell Data(first 30 bytes)
------------------ ------------------ ----- ------------------------------
AAAOAzAATAAIMVlAAN lossy conversion CREATE OR REPLACE PROCEDURE DE
AAAOAzAATAAIMVmAAA lossy conversion CREATE OR REPLACE PROCEDURE SE
AAAOAzAATAAIMVnAAC lossy conversion CREATE OR REPLACE PROCEDURE DE
AAAOAzAAUAAFtLTAAC lossy conversion CREATE OR REPLACE PROCEDURE SE
------------------ ------------------ ----- ------------------------------

$ cat scan.out

Character Set Scanner v1.1 : Release 9.2.0.1.0 - Production on Mon Dec 7 14:47:02 2009

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 - Production

Enter array fetch buffer size: 10240 >
Enter number of scan processes to utilize(1..32): 1 >
Enumerating tables to scan...

. process 1 scanning TRAIN.TRAIN_OBJECTS[AAAODqAArAAAArBAAA]
. process 1 scanning TRAIN.LY_SB_TEST[AAAQ8gAArAAAAuBAAA]
. process 1 scanning TRAIN.COLUMN_DROP_DEMO[AAAE3/AArAAAAZBAAA]
. process 1 scanning TRAIN.UDI_DEMO_TMP[AAAE1RAArAAAASJAAA]
. process 1 scanning TRAIN.STUDENTS[AAADQAAAqAAAAAhAAA]
. process 1 scanning TRAIN.HOT_STUDENTS[AAAE37AArAAAAWBAAA]
. process 1 scanning TRAIN.TESTAA[AAAQynAArAAAAtZAAA]
. process 1 scanning TRAIN.UDI_DEMO[AAAEyjAArAAAAHZAAA]
. process 1 scanning TRAIN.T3[AAAQ/XAArAAACmZAAA]
. process 1 scanning TRAIN.PLAN_TABLE[AAAQ/rAArAAAAJBAAA]
. process 1 scanning TRAIN.CLASSES[AAADP8AAqAAAAAJAAA]
. process 1 scanning TRAIN.DM_CZRY[AAAE89AArAAAAcZAAA]
. process 1 scanning TRAIN.MV_STUDENT_SCORES[AAAE85AArAAAAdZAAA]
. process 1 scanning TRAIN.TRACE_DDL[AAAOAzAAUAAFtLRAAA]
. process 1 scanning TRAIN.TRACE_DDL[AAAOAzAATAAIMVhAAA]
. process 1 scanning TRAIN.TRACE_LOGON[AAAN7+AArAAAAqRAAA]
. process 1 scanning TRAIN.TRAIN_ERRORS_A[AAAN7fAArAAAAeBAAA]
. process 1 scanning TRAIN.TABLE_TRACE[AAAN7eAArAAAAd5AAA]
. process 1 scanning TRAIN.TRAIN_ERRORS_BAK[AAAN7dAArAAAAdBAAA]
..
. process 1 scanning TRAIN.MLOG$_SCORES[AAAE8xAArAAAAdRAAA]
. process 1 scanning TRAIN.FND_USERS[AAAE3DAArAAAATxAAA]
. process 1 scanning TRAIN.SALARY[AAAE3JAArAAAAThAAA]
. process 1 scanning TRAIN.MULIT_INSERT_DEMO1[AAAEylAArAAAAHxAAA]
. process 1 scanning TRAIN.MULIT_INSERT_DEMO[AAAEykAArAAAAHpAAA]
. process 1 scanning TRAIN.GRADES[AAADQHAAqAAAAAxAAA]
. process 1 scanning TRAIN.COURSES[AAADP+AAqAAAAARAAA]

Creating Database Scan Summary Report...

Creating Individual Exception Report...

Scanner terminated successfully.

$ cat scan.txt

Database Scan Summary Report

Time Started : 2009-12-07 14:47:15
Time Completed: 2009-12-07 14:47:25

Process ID Time Started Time Completed
---------- -------------------- --------------------
1 2009-12-07 14:47:19 2009-12-07 14:47:24
---------- -------------------- --------------------

[Database Size]

Tablespace Used(MB) Free(MB) Total(MB)
------------------------------ --------------- --------------- ---------------
SYSTEM 380.000 2,696.000 3,076.000
UNDOTBS1 613.000 9,627.000 10,240.000
TEMP .000 .000 .000
sid_IDX 43,650.000 18,214.000 61,864.000
sid_DAT 73,249.000 8,671.000 81,920.000
sid_FP_DAT 6,675.000 7,661.000 14,336.000
sid_JK_DAT 2,098.000 8,142.000 10,240.000
sid_SB_DAT 23,349.000 3,963.000 27,312.000
sid_DAT_2002 96.000 1,028.000 1,124.000
sid_DAT_2003 169.000 2,903.000 3,072.000
sid_DAT_2004 269.000 81.000 350.000
sid_DAT_2006 395.000 605.000 1,000.000
sid_DAT_2007 7,080.000 944.000 8,024.000
sid_DAT_2008 7,911.000 1,113.000 9,024.000
sid_DAT_9999 209.000 915.000 1,124.000
sid_MAINT 1.000 5,119.000 5,120.000
sid_CX_TMP 99.000 1,949.000 2,048.000
sid_DAT_2005 8,205.000 963.000 9,168.000
TBS_TRAIN 8.000 2,040.000 2,048.000
TBS_TRAIN_IDX 84.000 1,964.000 2,048.000
DW 114,343.000 29,017.000 143,360.000
DWINDX 48,072.000 13,368.000 61,440.000
SGY 6,953.000 3,287.000 10,240.000
SGYINDX 578.000 446.000 1,024.000
TBS_SKY 533.000 19,947.000 20,480.000
sid_DAT_GD 9,505.000 21,215.000 30,720.000
------------------------------ --------------- --------------- ---------------
Total 354,521.000 165,881.000 520,402.000

[Database Scan Parameters]

Parameter Value
------------------------------ ------------------------------------------------
Scan type User tables
User name train
Scan CHAR data? YES
Current database character set ZHS16GBK
New database character set UTF8
Scan NCHAR data? NO
Array fetch buffer size 10240
Number of processes 1
Capture convertible data? NO
------------------------------ ------------------------------------------------

[Scan Summary]

Some character type application data are not convertible to the new character set

[Data Dictionary Conversion Summary]

Datatype Changeless Convertible Exceptional Total
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2
CHAR
LONG
CLOB
--------------------- ---------------- ---------------- ---------------- ----------------
Total

[Application Data Conversion Summary]

Datatype Changeless Convertible Exceptional Total
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2 2,706,457 2,286 14 2,708,757
CHAR 1,500 0 0 1,500
LONG 10 0 0 10
CLOB
--------------------- ---------------- ---------------- ---------------- ----------------
Total 2,707,967 2,286 14 2,710,267

[Distribution of Convertible Data per Table]

USER.TABLE Convertible Exceptional
------------------------------------------------------- ---------------- ----------------
TRAIN.CLASSES 100 0
TRAIN.COURSES 28 0
TRAIN.GRADES 6 0
TRAIN.DM_CZRY 290 10
TRAIN.MV_STUDENT_SCORES 900 0
TRAIN.MVJOB_STUDENT_SCORES 900 0
TRAIN.TRAIN_ERRORS 8 0
TRAIN.TRAIN_ERRORS_BAK 2 0
TRAIN.TRAIN_ERRORS_A 5 0
TRAIN.TRACE_DDL 36 4
TRAIN.SYS_PARAMETERS 1 0
TRAIN.TABLE1 6 0
TRAIN.DIC1 4 0
------------------------------------------------------- ---------------- ----------------

[Distribution of Convertible Data per Column]

USER.TABLE|COLUMN Convertible Exceptional
------------------------------------------------------- ---------------- ----------------
TRAIN.CLASSES|CLASS_NAME 100 0
TRAIN.COURSES|COURSE_NAME 28 0
TRAIN.GRADES|GRADE_NAME 6 0
TRAIN.DM_CZRY|CZRY_MC 290 10
TRAIN.MV_STUDENT_SCORES|CLASS_NAME 450 0
TRAIN.MV_STUDENT_SCORES|GRADE_NAME 450 0
TRAIN.MVJOB_STUDENT_SCORES|CLASS_NAME 450 0
TRAIN.MVJOB_STUDENT_SCORES|GRADE_NAME 450 0
TRAIN.TRAIN_ERRORS|SQLERRM_DESC 8 0
TRAIN.TRAIN_ERRORS_BAK|SQLERRM_DESC 2 0
TRAIN.TRAIN_ERRORS_A|SQLERRM_DESC 5 0
TRAIN.TRACE_DDL|SQL_TEXT 36 4
TRAIN.SYS_PARAMETERS|PARAMETER_DESC 1 0
TRAIN.TABLE1|DIC 6 0
TRAIN.DIC1|DIC_NAME 4 0
------------------------------------------------------- ---------------- ----------------
[Indexes to be Rebuilt]
USER.INDEX on USER.TABLE(COLUMN)
-----------------------------------------------------------------------------------------
6.与csscan相关数据字典

$ sqlplus

SQL*Plus: Release 9.2.0.7.0 - Production on Mon Dec 7 14:56:58 2009

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Enter user-name: sys as sysdba
Enter password:

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 - Production
SQL> column object_name format a30
SQL> select owner,object_name,OBJECT_TYPE from dba_objects where owner='CSMIG';
OWNER OBJECT_NAME OBJECT_TYPE
------------------------------ ------------------------------ ------------------
CSMIG CSM$COLUMNS TABLE
CSMIG CSM$CONSTRAINTS TABLE
CSMIG CSM$ERRORS TABLE
CSMIG CSM$EXTABLES TABLE
CSMIG CSM$INDEXES TABLE
CSMIG CSM$PARAMETERS TABLE
CSMIG CSM$TABLES TABLE
CSMIG CSM$TRIGGERS TABLE
CSMIG CSMV$COLUMNS VIEW
CSMIG CSMV$CONSTRAINTS VIEW
CSMIG CSMV$ERRORS VIEW
OWNER OBJECT_NAME OBJECT_TYPE
------------------------------ ------------------------------ ------------------
CSMIG CSMV$INDEXES VIEW
CSMIG CSMV$TABLES VIEW
CSMIG CSMV$TRIGGERS VIEW
7.Changing the Database Character Set of an Existing Database

Database character set migration has two stages: data scanning and data conversion. Before you change the database character set, you need to identify possible database character set conversion problems and truncation of data. This step is called data scanning.
Data scanning identifies the amount of effort required to migrate data into the new character encoding scheme before changing the database character set. Some examples of what may be found during a data scan are the number of schema objects where the column widths need to be expanded and the extent of the data that does not exist in the target character repertoire. This information helps to determine the best approach for converting the database character set. There are three approaches to converting data from one database character set to another if the database does not contain any of the inconsistencies described in “Character Set Conversion Issues” on page 10-4. A description of methods to migrate databases with such inconsistencies is out of the scope of this documentation. For more information, contact Oracle Consulting Services for assistance.
The approaches are:
1.Migrating Character Data Using a Full Export and Import
2.Migrating Character Data Using the ALTER DATABASE CHARACTER SET Statement
3.Migrating Character Data Using the ALTER DATABASE CHARACTER SET Statement and Selective Imports

Migrating Character Data Using a Full Export and Import
In most cases, a full export and import is recommended to properly convert all data to a new character set. It is important to be aware of data truncation issues, because columns with character datatypes may need to be extended before the import to handle an increase in size. Existing PL/SQL code should be reviewed to ensure that all byte-based SQL functions such as LENGTHB, SUBSTRB, and INSTRB, and PL/SQL CHAR and VARCHAR2 declarations are still valid.

Migrating Character Data Using the ALTER DATABASE CHARACTER SET Statement

The ALTER DATABASE CHARACTER SET statement is the fastest way to migrate a character set, but it can be used only under special circumstances. The ALTER DATABASE CHARACTER SET statement does not perform any data conversion, so it can be used if and only if the new character set is a strict superset of the current character set. The new character set is a strict superset of the current character set if:

Each and every character in the current character set is available in the new character set.

Each and every character in the current character set has the same code point value in the new character set. For example, US7ASCII is a strict subset of many character sets.

Another restriction of the ALTER DATABASE CHARACTER SET statement is that it can be used only when the character set migration is between two single-byte character sets or between two multibyte character sets. If the planned character set
migration is from a single-byte character set to a multibyte character set, then use the Export and Import utilities.
This restriction on using the ALTER DATABASE CHARACTER SET statement arises because of CLOB data. In Oracle9i, some internal fields in the data dictionary are stored in CLOB columns. Customers may also store data in CLOB fields. When the database character set is multibyte, CLOB data in Oracle9i is stored as UCS-2 data (two-byte, fixed-width Unicode). When the database character set is single-byte, CLOB data is stored using the database character set. Because the ALTER DATABASE CHARACTER SET statement does not convert data, CLOB columns remain in the original database character set encoding when the database character set is migrated from single-byte to multibyte. This introduces data inconsistency in the CLOB columns.
The syntax of the ALTER DATABASE CHARACTER SET statement is as follows:
ALTER DATABASE [db_name] CHARACTER SET new_character_set;
db_name is optional. The character set name should be specified without quotes.

修改数据库字符集示例

For example:
ALTER DATABASE CHARACTER SET AL32UTF8;
To change the database character set, perform the following steps:
1. Shut down the database, using either a SHUTDOWN IMMEDIATE or a SHUTDOWN
NORMAL statement.
2. Do a full backup of the database because the ALTER DATABASE CHARACTER SET statement cannot be rolled back.
3. Complete the following statements:
STARTUP MOUNT;
ALTER SYSTEM ENABLE RESTRICTED SESSION;
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
ALTER SYSTEM SET AQ_TM_PROCESSES=0;
ALTER DATABASE OPEN;
ALTER DATABASE CHARACTER SET new_character_set;

SHUTDOWN IMMEDIATE; — or SHUTDOWN NORMAL;
STARTUP;

Using the ALTER DATABASE CHARACTER SET Statement in an Oracle9i Real
Application Clusters Environment
In a Oracle9i Real Application Clusters environment, ensure that no other Oracle background processes are running, with the exception of the background processes associated with the instance through which a user is connected, before attempting to issue the ALTER DATABASE CHARACTER SET statement. Use the following SQL
statement to verify the environment:
SELECT SID, SERIAL#, PROGRAM FROM V$SESSION;
Set the CLUSTER_DATABASE initialization parameter to FALSE to allow the character set change to be completed. This is required in an Oracle9i Real Application Cluster environment; an exclusive startup is not sufficient.

Migrating Character Data Using the ALTER DATABASE CHARACTER SET Statement
and Selective Imports
Another approach to migrating character data is to perform an ALTER DATABASE CHARACTER SET statement followed by selective imports. This method is best suited for a known distribution of convertible data that is stored within a small
number of tables. A full export and import is too expensive in this scenario. For example, suppose you have a 100GB database with over 300 tables, but only 3 tables require character set conversions. The rest of the data is of the same encoding as the destination character set. The 3 tables can be exported and imported back to the new database after issuing the ALTER DATABASE CHARACTER SET statement. Incorrect data conversion can lead to data corruption, so perform a full backup of the database before attempting to migrate the data to a new character set.

http://www.idb-stock.net/idb/2011/05/10/73.html
http://www.ningoo.net/html/2007/language_and_character_set_file_scanner.html
422°/4213 人阅读/1 条评论 发表评论

杨一  2011-05-25

可惜没看懂哦


登录 后发表评论