SQL Server Index vs Statistic

What are the differences between CREATE INDEX and CREATE STATISTICS, and when should I use each?

sql-server index statistics

— Scott

Indexes store actual data (data pages or index pages, depending on the type of index), while statistics store the distribution of the data. So CREATE INDEX is the DDL that creates an index (clustered, nonclustered, etc.), and CREATE STATISTICS is the DDL that creates statistics on columns within a table.

I recommend reading up on these aspects of relational data. Below are a few introductory articles for beginners. These are very broad topics, so the material on them can go very wide and very deep. Read through the references below for the general idea, and ask more specific questions as they come up.

BOL reference on Table and Index Organization
BOL reference on Clustered Index Structures
BOL reference on Nonclustered Index Structures
SQL Server Central on Introduction to Indexes
BOL reference on Statistics

Here is a working example to see these two pieces in action (commented to explain):

use testdb;
go

create table MyTable1
(
    id int identity(1, 1) not null,
    my_int_col int not null
);
go

insert into MyTable1(my_int_col)
values(1);
go 100

-- this statement will create a clustered index
-- on MyTable1.  The index key is the id field
-- but due to the nature of a clustered index
-- it will contain all of the table data
create clustered index MyTable1_CI
on MyTable1(id);
go


-- by default, SQL Server creates a statistics object
-- for this index.  Here is proof: we see a stat
-- with the same name as the index, and its column
-- is the index key column
select
    s.name as stats_name,
    c.name as column_name
from sys.stats s
inner join sys.stats_columns sc
on s.object_id = sc.object_id
and s.stats_id = sc.stats_id
inner join sys.columns c
on sc.object_id = c.object_id
and sc.column_id = c.column_id
where s.object_id = object_id('MyTable1');


-- here is a standalone statistics object on a single column
create statistics MyTable1_MyIntCol
on MyTable1(my_int_col);
go

-- now look at the statistics that exist on the table.
-- we have an additional statistics object that does
-- not correspond to any index
select
    s.name as stats_name,
    c.name as column_name
from sys.stats s
inner join sys.stats_columns sc
on s.object_id = sc.object_id
and s.stats_id = sc.stats_id
inner join sys.columns c
on sc.object_id = c.object_id
and sc.column_id = c.column_id
where s.object_id = object_id('MyTable1');


-- what does a stat look like?  run DBCC SHOW_STATISTICS
-- to get a better idea of what is stored
dbcc show_statistics('MyTable1', 'MyTable1_CI');
go

Here is what a sample of statistics output can look like:

[image: sample DBCC SHOW_STATISTICS output]

Notice that statistics hold the data distribution. They help SQL Server determine an optimal plan. A good analogy: imagine you are about to lift a heavy object. If you knew how much it weighed because there was a weight marking on it, you would work out the best way to lift it and which muscles to use. That is roughly what SQL Server does with statistics.
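
As a rough way to inspect what the optimizer has to work with, the sketch below reads the metadata and the histogram of the standalone statistics created earlier. It assumes the MyTable1_MyIntCol statistics object from the example above still exists, and that the instance is recent enough to expose sys.dm_db_stats_properties (SQL Server 2008 R2 SP2, 2012 SP1, or later).

```sql
-- row counts, sampling, and staleness information for every
-- statistics object on MyTable1
select
    s.name as stats_name,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.steps,
    sp.modification_counter
from sys.stats s
cross apply sys.dm_db_stats_properties(s.object_id, s.stats_id) sp
where s.object_id = object_id('MyTable1');

-- only the histogram portion of one statistics object
-- (assumes MyTable1_MyIntCol was created as shown earlier)
dbcc show_statistics('MyTable1', 'MyTable1_MyIntCol') with histogram;
go
```

A modification_counter that is large relative to rows is a hint that the statistics may no longer reflect the data well.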

-- create a nonclustered index
-- with the key column as my_int_col
create index IX_MyTable1_MyIntCol
on MyTable1(my_int_col);
go

-- let's look at this index
select
    object_name(object_id) as object_name,
    name as index_name,
    index_id,
    type_desc,
    is_unique,
    fill_factor
from sys.indexes
where name = 'IX_MyTable1_MyIntCol';

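-- now let's see some physical aspects
-- of this particular index
-- (look up index_id by name rather than hardcoding it,
-- since the value can differ between systems)
declare @index_id int =
(
    select index_id
    from sys.indexes
    where object_id = object_id('MyTable1')
    and name = 'IX_MyTable1_MyIntCol'
);

select *
from sys.dm_db_index_physical_stats
(
    db_id('TestDB'),
    object_id('MyTable1'),
    @index_id,
    null,
    'detailed'
);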

From the example above we can see that the index actually contains data (what the leaf pages hold depends on the type of index).
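
One way to see the contrast, as a sketch: indexes own pages, so each one shows up in sys.dm_db_partition_stats with a row count and a page count, while a statistics-only object such as MyTable1_MyIntCol has no entry there. The query assumes the objects created in the example above.

```sql
-- one row per index (per partition) on MyTable1;
-- statistics that are not backed by an index do not appear
select
    i.name as index_name,
    i.type_desc,
    ps.row_count,
    ps.used_page_count
from sys.indexes i
inner join sys.dm_db_partition_stats ps
on i.object_id = ps.object_id
and i.index_id = ps.index_id
where i.object_id = object_id('MyTable1');
```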

This post has given only a very brief overview of these two big aspects of SQL Server. Each could fill chapters and books. Read some of the references and you will have a better understanding.

— Thomas Stringer

I know this is an old post, but I think it is worth noting that creating an index will (in most cases) automatically generate statistics for that index. The reverse does not hold: creating statistics does not create an index.
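
A minimal sketch of how to check this, assuming the MyTable1 objects from the answer above: for statistics that belong to an index, sys.stats.stats_id matches sys.indexes.index_id, so a left join exposes the statistics that have no backing index.

```sql
-- index_name is null for statistics that exist on their own
-- (created by CREATE STATISTICS or auto-created by the optimizer)
select
    s.name as stats_name,
    s.stats_id,
    s.auto_created,
    s.user_created,
    i.name as index_name
from sys.stats s
left join sys.indexes i
on s.object_id = i.object_id
and s.stats_id = i.index_id
where s.object_id = object_id('MyTable1');
```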

— Steve Mangiameli