This
feature introduced in SQL Server 2008. With this feature we can capture the
changes that are being done to any specific table. The changes made are stored
in new tables that will get created automatically after enabling this feature.
Till before SQL Server 2008 we can capture the changes made to tables with the
help of AFTER TRIGGERS.
In order to
use this feature first we have to enable this at database level and afterwards
we can enable it for respective tables. Once we enable CDC, 2 new jobs gets
created (explained later) these jobs serves as monitors of CDC process. They
keep monitoring the data changes of respective tables and will update the
repository tables which gets created automatically. The repository tables will
have information of all DML operations performed on particular tables for which
CDC has been enabled along with old data.
In order to
use this feature first we need to enable this feature at database level and
then we need to enable for specific table\tables for which we want to capture
the data changes.
To enable
this feature we need to run below command:
EXEC sys.sp_cdc_enable_db
This
procedure must be executed for a database before any tables can be enabled for
change data capture in that database. Change data capture records all insert,
update, and delete (DML)activity
applied to
enabled tables, making the details of the changes available in an easily
consumed relational format. Change data capture feature is available only in
SQL Server 2008 Enterprise, Developer and Evaluation editions.
Note: Change data capture cannot be
enabled on system databases or distribution
databases.
sys.sp_cdc_enable_db creates the change data capture
objects that have database wide scope, including meta data tables and DDL
triggers. It also creates the cdc schema and cdc database user and
sets the is_cdc_enabled column for the database entry in the
sys.databases catalog view to 1.
sys.sp_cdc_disable_db
this SP disables change data capture for the current database. This SP
disables change data capture for all tables in the database currently enabled.
All system objects related to change data capture, such as change tables, jobs,
stored procedures and functions are dropped. The is_cdc_enabled column
for the database entry in the sys.databases catalog view is set to 0.
We can use
below query to confirm whether CDC is enabled or not:
USE master
GO
SELECT [name], database_id, is_cdc_enabled
FROM sys.databases
GO
SELECT [name], database_id, is_cdc_enabled
FROM sys.databases
sys.sp_cdc_enable_table
this SP Enables
change data capture for the specified source table in the current database.
When a table is enabled for change data capture, a record of each data
manipulation language (DML) operation applied to the table is written to the
transaction log. The change data capture process retrieves information from the
log and writes it to change tables that are accessed by using a set of
functions.
To enable
CDC for specific table we have to use below command:
EXEC
sys.sp_cdc_enable_table
@source_schema = N'Administrators',
@source_name = N'Shift',
@role_name = NULL
GO
@source_schema = N'Administrators',
@source_name = N'Shift',
@role_name = NULL
GO
ü Administrators – Is
the schema to which the table belongs.
ü Shift – Is
the table name.
ü @role_name – Is the database role used to gate access to
change data. The purpose of the named role is to control access to the change
data. If explicitly set to NULL, no gating role is used to limit access to the
change data.
When change
data capture is enabled for a table, a change table and one or two query
functions are generated. The change table serves as a repository for the source
table changes extracted from the transaction log by the capture process. The
query functions are used to extract data from the change table. The names of
these functions are derived from the capture_instance parameter in the
following ways:
All changes function: cdc.fn_cdc_get_all_changes_<capture_instance>
Net changes function: cdc.fn_cdc_get_net_changes_<capture_instance>
cdc.fn_cdc_get_all_changes_<capture_instance>
Returns one row for
each change applied to the source table within the specified log sequence
number (LSN) range. If a source row had multiple changes during the interval,
each change is represented in the returned result set.
cdc.fn_cdc_get_net_changes_<capture_instance> Returns
one net change row for each source row changed within the specified LSN range.
sys.sp_cdc_enable_table also creates the capture and
cleanup jobs for the database if the source table is the first table in the
database to be enabled for CDC.
To verify
for which tables CDC is enabled we can run below query:
SELECT [name], is_tracked_by_cdc
FROM sys.tables
FROM sys.tables
sys.sp_cdc_disable_table drops the change data capture
change table and system functions associated with the specified source table.
It deletes any rows associated with the specified capture instance from the
change data capture system tables. Metadata that describes the configuration
details of the capture instance is retained in the change data capture metadata
tables cdc.change_tables, cdc.index_columns, and cdc.captured_columns.
EXEC sys.sp_cdc_disable_table
@source_schema = N'HumanResources',
@source_name = N'Shift',
@role_name = NULL
Note: If
change data capture is enabled on a table with an existing primary key, and the
@index_name parameter is not used to identify an alternative unique index, the
change data capture feature will use the primary key. Subsequent changes to the
primary key will not be allowed without first disabling change data
capture for the table.
After
enabling CDC we will notice new table for the schema ‘cdc’ got created
automatically. Name resembles like cdc.HumanResources_Shift_CT.
This table
contains all the data changes along with the old values. This table will have 5
additional columns:
ü $start_lsn
ü $end_lsn
ü $seqval
ü $operation
ü $update_mask
Each of
these new columns will have information like
$start_lsn
– This will have the starting of log sequence number.
$end_lsn -- This is the ending log sequence number.
$seqval -- This will have sequence of values to order
the row changes within the transaction.
$operation
– This contains value as per the DML operation performed.
Like 1 –
for Delete statement
2 – for Insert statement
3 – value before Update statement
4 – value after Update statement
$update_mask
-- is a bit mask where every column that was changed is set to 1.
Below 2
DMV’s can be used to monitor the CDC process.
sys.dm_cdc_log_scan_sessions:
The
sys.dm_cdc_log_scan_sessions management view contains one row for each log scan
session in the current database. The last row represents the current session.
The view also contains a row with a session ID of 0, which has aggregate
information about all the sessions since the instance of SQL Server was last
started.
sys.dm_cdc_errors:
The
sys.dm_cdc_errors management view contains one row for each error that is
encountered during the change data capture log scan session.
Note:
• Computed Columns are not tracked in
CDC. The column will appear in the change table with the appropriate type, but
will have a value of NULL.
• Changes to individual XML elements
are not tracked.• Timestamp data type in the change table is converted to binary.
Example:
Enabling
CDC for the database:
USE TestDB
GO EXEC sys.sp_cdc_enable_db
Verifying:
USE master
GO SELECT [name], database_id, is_cdc_enabled
FROM sys.databases
Enabling
CDC for table:
USE TestDB
GO EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'HotelDetail',
@role_name = NULL
Verifying:
USE TestDB
GOSELECT [name], is_tracked_by_cdc
FROM sys.tables
where is_tracked_by_cdc = 1
Querying
DMV’s:
USE TestDB
GO select * from sys.dm_cdc_log_scan_sessions
Go
select * from sys.dm_cdc_errors
Disabling
CDC for a table:
EXEC
sys.sp_cdc_disable_table
@source_schema
= N'dbo', @source_name = N'HotelDetail',
@capture_instance = N'dbo_HotelDetail‘
Disabling
CDC for a database:
EXEC
sys.sp_cdc_disable_db
Thanks!!!