Thursday 27 December 2012

List all heap tables of a given database

-- List all heap tables 
-- Try this with tempdb and see how many temp tables or table variables have been created there, you will be amazed with large olap databases.  
SELECT s.name + '.' + t.name AS TableName

FROM sys.tables t INNER JOIN sys.schemas s

ON t.schema_id = s.schema_id INNER JOIN sys.indexes i

ON t.object_id = i.object_id

AND i.type = 0 -- = Heap

ORDER BY TableName

Database external fragmentation overview

use MyDatabase




go


select top 50

avg_fragmentation_in_percent*s.page_count/100.00 fragmented_page_count

,o.name as TableName

,i.name as IndexName

,s.partition_number

,s.avg_fragmentation_in_percent

,s.page_count

,s.index_depth

,s.index_level

,s.index_type_desc

,d.name as DataBaseName

,s.*

from sys.dm_db_index_physical_stats(db_id(), null, null, null, 'limited') s

--sys.dm_db_index_physical_stats(db_id(), object_id('MyTableInCurrentDB'), null, null, 'limited') s

inner join sys.databases d on d.database_id=s.database_id

inner join sys.objects o on o.object_id=s.object_id

left join sys.indexes i on i.object_id=s.object_id and i.index_id=s.index_id

where -- have large external fragmentation

s.avg_fragmentation_in_percent>10

-- ignore small table/index

and s.page_count>1

-- ignore heap table

and s.index_id>0

order by avg_fragmentation_in_percent*s.page_count desc

Tuesday 18 December 2012

Checking SQL server stats

-- use dm_db_stats_properties to get stats info
-- for SQL 2012 and above
SELECT
 OBJECT_NAME([sp].[object_id]) AS "Table"
 ,[sp].[stats_id] AS "Statistic ID"
 ,[s].[name] AS "Statistic"
 ,[sp].[rows_sampled]
 ,[sp].[modification_counter] AS "Modifications"
 ,[sp].[last_updated] AS "Last Updated"
 ,[sp].[rows]
 ,[sp].[unfiltered_rows]
 FROM [sys].[stats] AS [s]
   OUTER APPLY sys.dm_db_stats_properties ([s].[object_id],[s].[stats_id]) AS [sp]
 --WHERE [s].[object_id] = OBJECT_ID(N'Sales.TestSalesOrderDetail');
 ORDER BY 1,2

--
-- quick way to check which stats objects are used by query
DBCC FREEPROCCACHE
 
SELECT 
    p.Name,
    total_quantity = SUM(th.Quantity)
FROM AdventureWorks.Production.Product AS p
JOIN AdventureWorks.Production.TransactionHistory AS th ON
    th.ProductID = p.ProductID
WHERE
    th.ActualCost >= $5.00
    AND p.Color = N'Red'
GROUP BY
    p.Name
ORDER BY
    p.Name
OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 9292,
    QUERYTRACEON 9204
)

--
-- checking auto created stats which are never updated or updated before 1 week
DECLARE @NumOfDays int, @StateCheckDateTimeDeadLine datetime
SET @NumOfDays=7
SET @StateCheckDateTimeDeadLine=DATEADD(hour, -24*@NumOfDays, getdate())
--SELECT @StateCheckDateTimeDeadLine

SELECT object_name(object_id) as TableName
,name AS StatisticsName
,STATS_DATE(object_id, stats_id) AS StatisticsUpdatDate
,auto_created AS IsAutoCreated
FROM sys.stats
WHERE auto_created=1
-- never updated or updated before dead line
AND (STATS_DATE(object_id, stats_id) IS NULL
OR STATS_DATE(object_id, stats_id)<=@StateCheckDateTimeDeadLine)
order by 1,2

-- after this check detail of each stat
DBCC SHOW_STATISTICS ( table_or_indexed_view_name , target )
[ WITH [ NO_INFOMSGS ] < option > [ , n ] ]
< option > :: =
    STAT_HEADER | DENSITY_VECTOR | HISTOGRAM | STATS_STREAM

DBCC SHOW_STATISTICS ("MyTable", MyPrimayKey);
DBCC SHOW_STATISTICS ("MyTable", MyStatsName);

reference:
1)
http://sqlblog.com/blogs/paul_white/archive/2011/09/21/how-to-find-the-statistics-used-to-compile-an-execution-plan.aspx
2)
http://www.sqlskills.com/blogs/erin/understanding-when-statistics-will-automatically-update/

Thursday 13 December 2012

Sizing database/partition/partitioned table with [dm_db_partition_stats]

--
-- whole database size
SELECT   
        (SUM(reserved_page_count)*8192)/1024000 AS [TotalSpace(MB)]
        ,(SUM([used_page_count])*8192)/1024000 AS [UsedSpace(MB)]
        ,(SUM([row_count])) AS [RowCount]
        FROM sys.dm_db_partition_stats

--
-- size of each partition scheme       
select    s.name as [partition_scheme]
        ,sum(ps.[reserved_page_count])*8192/10240000  AS [TotalSpace(MB)]
        ,sum(ps.[used_page_count])*8192/10240000 AS [UsedSpace(MB)]
        ,sum([row_count]) AS [RowCount]
    from    sys.indexes i inner join sys.partition_schemes s
            on i.data_space_id = s.data_space_id 
            inner join
                (
                SELECT   
                    [object_id]
                    ,[index_id]
                    ,sum([used_page_count]) as [used_page_count]
                    ,sum([reserved_page_count]) as [reserved_page_count]
                    ,sum([row_count]) as [row_count]
                    FROM [sys].[dm_db_partition_stats]
                    group by [object_id],[index_id]
                ) ps
            on ps.object_id=i.object_id and ps.index_id=i.index_id
    group by s.name
    order by 1,2

--
-- size of partitioned table
select
        object_schema_name(i.object_id) as [schema]
        ,object_name(i.object_id) as [object]
        ,i.name as [index]
        ,ps.[used_page_count]
        ,ps.[reserved_page_count]
        ,ps.[row_count]
        ,s.name as [partition_scheme]
        ,f.name as [patition_function]
    from    sys.indexes i inner join sys.partition_schemes s
            on i.data_space_id = s.data_space_id 
            inner join sys.partition_functions f
            on f.function_id = s.function_id
            inner join
                (
                SELECT   
                    [object_id]
                    ,[index_id]
                    ,sum([used_page_count]) as [used_page_count]
                    ,sum([reserved_page_count]) as [reserved_page_count]
                    ,sum([row_count]) as [row_count]
                    FROM [sys].[dm_db_partition_stats]
                    group by [object_id],[index_id]
                ) ps
            on ps.object_id=i.object_id and ps.index_id=i.index_id
    order by 1,2,3

Saturday 8 December 2012

Inmon vs Kimball – Architecture Comparison and Data Valut Architect


1) A good slide show for reference:
http://www.slideshare.net/kgraziano/introduction-to-data-vault-modeling#btnNext


Inmon states that the data warehouse is:
Subject-oriented
The data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together.
Non-volatile
Data in the data warehouse are never over-written or deleted — once committed, the data are static, read-only, and retained for future reporting.
Integrated
The data warehouse contains data from most or all of an organization's operational systems and these data are made consistent.
Time-variant
For An operational system, the stored data contains the current value.


Kimball: DW is where we publish used data.

Data Valut by Dan:
Hub: Business Key
Link: represent 3nf many to many relation for Business key
Satellite: description for business key

2)
From:

http://oracledwbi.wordpress.com/2010/03/25/inmon-vs-kimball-architecture-comparison/

Inmon – Hub & SpokeKimball – Bus Architecture
Methodology & Architecture
Overall approachTop-downBottom-up
Architectural structureEnterprise wide (atomic) data warehouse “feeds” departmental databasesData marts model a single business process; enterprise consistency achieved through data bus and conformed dimensions
Complexity of the methodQuite complexFairly simple
Comparison with established development methodologiesDerived from the spiral methodologyFour-step process; a departure from RDBMS methods
Discussion of physical designFairly thoroughFairly light
ScalabilityGrowing scope and changing requirements are criticalNeed to adapt to highly volatile needs within a limited scope
ApplicationStrategic solutionTactical solution
Information interdependenceSupportedSupported
Deployment costHigher start-up costs, with lower subsequent project development costsLower start-up costs, with each subsequent project costing about the same
Time to deployLonger start-up timeQuick start-up time
Development skills requiredLarger team(s) of specialistsSmall teams of generalists
Persistency of dataHigh rate of change from source systemsSource systems are relatively stable
Data modelling
Data orientationSubject or data-drivenProcess oriented
ToolsTraditional (ERDs, DISs)Dimensional modelling
End-user accessibilityLowHigh
Philosophy
Primary audienceIT professionalsEnd users
Place in the organisationCorporate wide facilityGrowth subject to departmental demand
ObjectiveDeliver a sound technical solution based on proven database methods and technologiesDeliver a solution that makes it easy for end users to directly query the data and still get reasonable response times