The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling

Author Ralph Kimball

Publisher Wiley Professional Development (P&T)

Format Page Fidelity

Print ISBN 9781118530801

Edition 3

Publication year 2013

6.390 kr.

Table of Contents

  • Title Page
  • Copyright
  • Contents
  • 1 Data Warehousing, Business Intelligence, and Dimensional Modeling Primer
  • Different Worlds of Data Capture and Data Analysis
  • Goals of Data Warehousing and Business Intelligence
  • Publishing Metaphor for DW/BI Managers
  • Dimensional Modeling Introduction
  • Star Schemas Versus OLAP Cubes
  • Fact Tables for Measurements
  • Dimension Tables for Descriptive Context
  • Facts and Dimensions Joined in a Star Schema
  • Kimball’s DW/BI Architecture
  • Operational Source Systems
  • Extract, Transformation, and Load System
  • Presentation Area to Support Business Intelligence
  • Business Intelligence Applications
  • Restaurant Metaphor for the Kimball Architecture
  • Alternative DW/BI Architectures
  • Independent Data Mart Architecture
  • Hub-and-Spoke Corporate Information Factory Inmon Architecture
  • Hybrid Hub-and-Spoke and Kimball Architecture
  • Dimensional Modeling Myths
  • Myth 1: Dimensional Models are Only for Summary Data
  • Myth 2: Dimensional Models are Departmental, Not Enterprise
  • Myth 3: Dimensional Models are Not Scalable
  • Myth 4: Dimensional Models are Only for Predictable Usage
  • Myth 5: Dimensional Models Can’t Be Integrated
  • More Reasons to Think Dimensionally
  • Agile Considerations
  • Summary
  • 2 Kimball Dimensional Modeling Techniques Overview
  • Fundamental Concepts
  • Gather Business Requirements and Data Realities
  • Collaborative Dimensional Modeling Workshops
  • Four-Step Dimensional Design Process
  • Business Processes
  • Grain
  • Dimensions for Descriptive Context
  • Facts for Measurements
  • Star Schemas and OLAP Cubes
  • Graceful Extensions to Dimensional Models
  • Basic Fact Table Techniques
  • Fact Table Structure
  • Additive, Semi-Additive, Non-Additive Facts
  • Nulls in Fact Tables
  • Conformed Facts
  • Transaction Fact Tables
  • Periodic Snapshot Fact Tables
  • Accumulating Snapshot Fact Tables
  • Factless Fact Tables
  • Aggregate Fact Tables or OLAP Cubes
  • Consolidated Fact Tables
  • Basic Dimension Table Techniques
  • Dimension Table Structure
  • Dimension Surrogate Keys
  • Natural, Durable, and Supernatural Keys
  • Drilling Down
  • Degenerate Dimensions
  • Denormalized Flattened Dimensions
  • Multiple Hierarchies in Dimensions
  • Flags and Indicators as Textual Attributes
  • Null Attributes in Dimensions
  • Calendar Date Dimensions
  • Role-Playing Dimensions
  • Junk Dimensions
  • Snowflaked Dimensions
  • Outrigger Dimensions
  • Integration via Conformed Dimensions
  • Conformed Dimensions
  • Shrunken Dimensions
  • Drilling Across
  • Value Chain
  • Enterprise Data Warehouse Bus Architecture
  • Enterprise Data Warehouse Bus Matrix
  • Detailed Implementation Bus Matrix
  • Opportunity/Stakeholder Matrix
  • Dealing with Slowly Changing Dimension Attributes
  • Type 0: Retain Original
  • Type 1: Overwrite
  • Type 2: Add New Row
  • Type 3: Add New Attribute
  • Type 4: Add Mini-Dimension
  • Type 5: Add Mini-Dimension and Type 1 Outrigger
  • Type 6: Add Type 1 Attributes to Type 2 Dimension
  • Type 7: Dual Type 1 and Type 2 Dimensions
  • Dealing with Dimension Hierarchies
  • Fixed Depth Positional Hierarchies
  • Slightly Ragged/Variable Depth Hierarchies
  • Ragged/Variable Depth Hierarchies with Hierarchy Bridge Tables
  • Ragged/Variable Depth Hierarchies with Pathstring Attributes
  • Advanced Fact Table Techniques
  • Fact Table Surrogate Keys
  • Centipede Fact Tables
  • Numeric Values as Attributes or Facts
  • Lag/Duration Facts
  • Header/Line Fact Tables
  • Allocated Facts
  • Profit and Loss Fact Tables Using Allocations
  • Multiple Currency Facts
  • Multiple Units of Measure Facts
  • Year-to-Date Facts
  • Multipass SQL to Avoid Fact-to-Fact Table Joins
  • Timespan Tracking in Fact Tables
  • Late Arriving Facts
  • Advanced Dimension Techniques
  • Dimension-to-Dimension Table Joins
  • Multivalued Dimensions and Bridge Tables
  • Time Varying Multivalued Bridge Tables
  • Behavior Tag Time Series
  • Behavior Study Groups
  • Aggregated Facts as Dimension Attributes
  • Dynamic Value Bands
  • Text Comments Dimension
  • Multiple Time Zones
  • Measure Type Dimensions
  • Step Dimensions
  • Hot Swappable Dimensions
  • Abstract Generic Dimensions
  • Audit Dimensions
  • Late Arriving Dimensions
  • Special Purpose Schemas
  • Supertype and Subtype Schemas for Heterogeneous Products
  • Real-Time Fact Tables
  • Error Event Schemas
  • 3 Retail Sales
  • Four-Step Dimensional Design Process
  • Step 1: Select the Business Process
  • Step 2: Declare the Grain
  • Step 3: Identify the Dimensions
  • Step 4: Identify the Facts
  • Retail Case Study
  • Step 1: Select the Business Process
  • Step 2: Declare the Grain
  • Step 3: Identify the Dimensions
  • Step 4: Identify the Facts
  • Dimension Table Details
  • Date Dimension
  • Product Dimension
  • Store Dimension
  • Promotion Dimension
  • Other Retail Sales Dimensions
  • Degenerate Dimensions for Transaction Numbers
  • Retail Schema in Action
  • Retail Schema Extensibility
  • Factless Fact Tables
  • Dimension and Fact Table Keys
  • Dimension Table Surrogate Keys
  • Dimension Natural and Durable Supernatural Keys
  • Degenerate Dimension Surrogate Keys
  • Date Dimension Smart Keys
  • Fact Table Surrogate Keys
  • Resisting Normalization Urges
  • Snowflake Schemas with Normalized Dimensions
  • Outriggers
  • Centipede Fact Tables with Too Many Dimensions
  • Summary
  • 4 Inventory
  • Value Chain Introduction
  • Inventory Models
  • Inventory Periodic Snapshot
  • Inventory Transactions
  • Inventory Accumulating Snapshot
  • Fact Table Types
  • Transaction Fact Tables
  • Periodic Snapshot Fact Tables
  • Accumulating Snapshot Fact Tables
  • Complementary Fact Table Types
  • Value Chain Integration
  • Enterprise Data Warehouse Bus Architecture
  • Understanding the Bus Architecture
  • Enterprise Data Warehouse Bus Matrix
  • Conformed Dimensions
  • Drilling Across Fact Tables
  • Identical Conformed Dimensions
  • Shrunken Rollup Conformed Dimension with Attribute Subset
  • Shrunken Conformed Dimension with Row Subset
  • Shrunken Conformed Dimensions on the Bus Matrix
  • Limited Conformity
  • Importance of Data Governance and Stewardship
  • Conformed Dimensions and the Agile Movement
  • Conformed Facts
  • Summary
  • 5 Procurement
  • Procurement Case Study
  • Procurement Transactions and Bus Matrix
  • Single Versus Multiple Transaction Fact Tables
  • Complementary Procurement Snapshot
  • Slowly Changing Dimension Basics
  • Type 0: Retain Original
  • Type 1: Overwrite
  • Type 2: Add New Row
  • Type 3: Add New Attribute
  • Type 4: Add Mini-Dimension
  • Hybrid Slowly Changing Dimension Techniques
  • Type 5: Mini-Dimension and Type 1 Outrigger
  • Type 6: Add Type 1 Attributes to Type 2 Dimension
  • Type 7: Dual Type 1 and Type 2 Dimensions
  • Slowly Changing Dimension Recap
  • Summary
  • 6 Order Management
  • Order Management Bus Matrix
  • Order Transactions
  • Fact Normalization
  • Dimension Role Playing
  • Product Dimension Revisited
  • Customer Dimension
  • Deal Dimension
  • Degenerate Dimension for Order Number
  • Junk Dimensions
  • Header/Line Pattern to Avoid
  • Multiple Currencies
  • Transaction Facts at Different Granularity
  • Another Header/Line Pattern to Avoid
  • Invoice Transactions
  • Service Level Performance as Facts, Dimensions, or Both
  • Profit and Loss Facts
  • Audit Dimension
  • Accumulating Snapshot for Order Fulfillment Pipeline
  • Lag Calculations
  • Multiple Units of Measure
  • Beyond the Rearview Mirror
  • Summary
  • 7 Accounting
  • Accounting Case Study and Bus Matrix
  • General Ledger Data
  • General Ledger Periodic Snapshot
  • Chart of Accounts
  • Period Close
  • Year-to-Date Facts
  • Multiple Currencies Revisited
  • General Ledger Journal Transactions
  • Multiple Fiscal Accounting Calendars
  • Drilling Down Through a Multilevel Hierarchy
  • Financial Statements
  • Budgeting Process
  • Dimension Attribute Hierarchies
  • Fixed Depth Positional Hierarchies
  • Slightly Ragged Variable Depth Hierarchies
  • Ragged Variable Depth Hierarchies
  • Shared Ownership in a Ragged Hierarchy
  • Time Varying Ragged Hierarchies
  • Modifying Ragged Hierarchies
  • Alternative Ragged Hierarchy Modeling Approaches
  • Advantages of the Bridge Table Approach for Ragged Hierarchies
  • Consolidated Fact Tables
  • Role of OLAP and Packaged Analytic Solutions
  • Summary
  • 8 Customer Relationship Management
  • CRM Overview
  • Operational and Analytic CRM
  • Customer Dimension Attributes
  • Name and Address Parsing
  • International Name and Address Considerations
  • Customer-Centric Dates
  • Aggregated Facts as Dimension Attributes
  • Segmentation Attributes and Scores
  • Counts with Type 2 Dimension Changes
  • Outrigger for Low Cardinality Attribute Set
  • Customer Hierarchy Considerations
  • Bridge Tables for Multivalued Dimensions
  • Bridge Table for Sparse Attributes
  • Bridge Table for Multiple Customer Contacts
  • Complex Customer Behavior
  • Behavior Study Groups for Cohorts
  • Step Dimension for Sequential Behavior
  • Timespan Fact Tables
  • Tagging Fact Tables with Satisfaction Indicators
  • Tagging Fact Tables with Abnormal Scenario Indicators
  • Customer Data Integration Approaches
  • Master Data Management Creating a Single Customer Dimension
  • Partial Conformity of Multiple Customer Dimensions
  • Avoiding Fact-to-Fact Table Joins
  • Low Latency Reality Check
  • Summary
  • 9 Human Resources Management
  • Employee Profile Tracking
  • Precise Effective and Expiration Timespans
  • Dimension Change Reason Tracking
  • Profile Changes as Type 2 Attributes or Fact Events
  • Headcount Periodic Snapshot
  • Bus Matrix for HR Processes
  • Packaged Analytic Solutions and Data Models
  • Recursive Employee Hierarchies
  • Change Tracking on Embedded Manager Key
  • Drilling Up and Down Management Hierarchies
  • Multivalued Skill Keyword Attributes
  • Skill Keyword Bridge
  • Skill Keyword Text String
  • Survey Questionnaire Data
  • Text Comments
  • Summary
  • 10 Financial Services
  • Banking Case Study and Bus Matrix
  • Dimension Triage to Avoid Too Few Dimensions
  • Household Dimension
  • Multivalued Dimensions and Weighting Factors
  • Mini-Dimensions Revisited
  • Adding a Mini-Dimension to a Bridge Table
  • Dynamic Value Banding of Facts
  • Supertype and Subtype Schemas for Heterogeneous Products
  • Supertype and Subtype Products with Common Facts
  • Hot Swappable Dimensions
  • Summary
  • 11 Telecommunications
  • Telecommunications Case Study and Bus Matrix
  • General Design Review Considerations
  • Balance Business Requirements and Source Realities
  • Focus on Business Processes
  • Granularity
  • Single Granularity for Facts
  • Dimension Granularity and Hierarchies
  • Date Dimension
  • Degenerate Dimensions
  • Surrogate Keys
  • Dimension Decodes and Descriptions
  • Conformity Commitment
  • Design Review Guidelines
  • Draft Design Exercise Discussion
  • Remodeling Existing Data Structures
  • Geographic Location Dimension
  • Summary
  • 12 Transportation
  • Airline Case Study and Bus Matrix
  • Multiple Fact Table Granularities
  • Linking Segments into Trips
  • Related Fact Tables
  • Extensions to Other Industries
  • Cargo Shipper
  • Travel Services
  • Combining Correlated Dimensions
  • Class of Service
  • Origin and Destination
  • More Date and Time Considerations
  • Country-Specific Calendars as Outriggers
  • Date and Time in Multiple Time Zones
  • Localization Recap
  • Summary
  • 13 Education
  • University Case Study and Bus Matrix
  • Accumulating Snapshot Fact Tables
  • Applicant Pipeline
  • Research Grant Proposal Pipeline
  • Factless Fact Tables
  • Admissions Events
  • Course Registrations
  • Facility Utilization
  • Student Attendance
  • More Educational Analytic Opportunities
  • Summary
  • 14 Healthcare
  • Healthcare Case Study and Bus Matrix
  • Claims Billing and Payments
  • Date Dimension Role Playing
  • Multivalued Diagnoses
  • Supertypes and Subtypes for Charges
  • Electronic Medical Records
  • Measure Type Dimension for Sparse Facts
  • Freeform Text Comments
  • Images
  • Facility/Equipment Inventory Utilization
  • Dealing with Retroactive Changes
  • Summary
  • 15 Electronic Commerce
  • Clickstream Source Data
  • Clickstream Data Challenges
  • Clickstream Dimensional Models
  • Page Dimension
  • Event Dimension
  • Session Dimension
  • Referral Dimension
  • Clickstream Session Fact Table
  • Clickstream Page Event Fact Table
  • Step Dimension
  • Aggregate Clickstream Fact Tables
  • Google Analytics
  • Integrating Clickstream into Web Retailer’s Bus Matrix
  • Profitability Across Channels Including Web
  • Summary
  • 16 Insurance
  • Insurance Case Study
  • Insurance Value Chain
  • Draft Bus Matrix
  • Policy Transactions
  • Dimension Role Playing
  • Slowly Changing Dimensions
  • Mini-Dimensions for Large or Rapidly Changing Dimensions
  • Multivalued Dimension Attributes
  • Numeric Attributes as Facts or Dimensions
  • Degenerate Dimension
  • Low Cardinality Dimension Tables
  • Audit Dimension
  • Policy Transaction Fact Table
  • Heterogeneous Supertype and Subtype Products
  • Complementary Policy Accumulating Snapshot
  • Premium Periodic Snapshot
  • Conformed Dimensions
  • Conformed Facts
  • Pay-in-Advance Facts
  • Heterogeneous Supertypes and Subtypes Revisited
  • Multivalued Dimensions Revisited
  • More Insurance Case Study Background
  • Updated Insurance Bus Matrix
  • Detailed Implementation Bus Matrix
  • Claim Transactions
  • Transaction Versus Profile Junk Dimensions
  • Claim Accumulating Snapshot
  • Accumulating Snapshot for Complex Workflows
  • Timespan Accumulating Snapshot
  • Periodic Instead of Accumulating Snapshot
  • Policy/Claim Consolidated Periodic Snapshot
  • Factless Accident Events
  • Common Dimensional Modeling Mistakes to Avoid
  • Mistake 10: Place Text Attributes in a Fact Table
  • Mistake 9: Limit Verbose Descriptors to Save Space
  • Mistake 8: Split Hierarchies into Multiple Dimensions
  • Mistake 7: Ignore the Need to Track Dimension Changes
  • Mistake 6: Solve All Performance Problems with More Hardware
  • Mistake 5: Use Operational Keys to Join Dimensions and Facts
  • Mistake 4: Neglect to Declare and Comply with the Fact Grain
  • Mistake 3: Use a Report to Design the Dimensional Model
  • Mistake 2: Expect Users to Query Normalized Atomic Data
  • Mistake 1: Fail to Conform Facts and Dimensions
  • Summary
  • 17 Kimball DW/BI Lifecycle Overview
  • Lifecycle Roadmap
  • Roadmap Mile Markers
  • Lifecycle Technology Track
  • Technical Architecture Design
  • Product Selection and Installation
  • Lifecycle Data Track
  • Dimensional Modeling
  • Physical Design
  • ETL Design and Development
  • Lifecycle BI Applications Track
  • BI Application Specification
  • BI Application Development
  • Lifecycle Wrap-up Activities
  • Deployment
  • Maintenance and Growth
  • Common Pitfalls to Avoid
  • Summary
  • 18 Dimensional Modeling Process and Tasks
  • Modeling Process Overview
  • Get Organized
  • Identify Participants, Especially Business Representatives
  • Review the Business Requirements
  • Leverage a Modeling Tool
  • Leverage a Data Profiling Tool
  • Leverage or Establish Naming Conventions
  • Coordinate Calendars and Facilities
  • Design the Dimensional Model
  • Reach Consensus on High-Level Bubble Chart
  • Develop the Detailed Dimensional Model
  • Review and Validate the Model
  • Finalize the Design Documentation
  • Summary
  • 19 ETL Subsystems and Techniques
  • Round Up the Requirements
  • Business Needs
  • Compliance
  • Data Quality
  • Security
  • Data Integration
  • Data Latency
  • Archiving and Lineage
  • BI Delivery Interfaces
  • Available Skills
  • Legacy Licenses
  • The 34 Subsystems of ETL
  • Extracting: Getting Data into the Data Warehouse
  • Subsystem 1: Data Profiling
  • Subsystem 2: Change Data Capture System
  • Subsystem 3: Extract System
  • Cleaning and Conforming Data
  • Improving Data Quality Culture and Processes
  • Subsystem 4: Data Cleansing System
  • Subsystem 5: Error Event Schema
  • Subsystem 6: Audit Dimension Assembler
  • Subsystem 7: Deduplication System
  • Subsystem 8: Conforming System
  • Delivering: Prepare for Presentation
  • Subsystem 9: Slowly Changing Dimension Manager
  • Subsystem 10: Surrogate Key Generator
  • Subsystem 11: Hierarchy Manager
  • Subsystem 12: Special Dimensions Manager
  • Subsystem 13: Fact Table Builders
  • Subsystem 14: Surrogate Key Pipeline
  • Subsystem 15: Multivalued Dimension Bridge Table Builder
  • Subsystem 16: Late Arriving Data Handler
  • Subsystem 17: Dimension Manager System
  • Subsystem 18: Fact Provider System
  • Subsystem 19: Aggregate Builder
  • Subsystem 20: OLAP Cube Builder
  • Subsystem 21: Data Propagation Manager
  • Managing the ETL Environment
  • Subsystem 22: Job Scheduler
  • Subsystem 23: Backup System
  • Subsystem 24: Recovery and Restart System
  • Subsystem 25: Version Control System
  • Subsystem 26: Version Migration System
  • Subsystem 27: Workflow Monitor
  • Subsystem 28: Sorting System
  • Subsystem 29: Lineage and Dependency Analyzer
  • Subsystem 30: Problem Escalation System
  • Subsystem 31: Parallelizing/Pipelining System
  • Subsystem 32: Security System
  • Subsystem 33: Compliance Manager
  • Subsystem 34: Metadata Repository Manager
  • Summary
  • 20 ETL System Design and Development Process and Tasks
  • ETL Process Overview
  • Develop the ETL Plan
  • Step 1: Draw the High-Level Plan
  • Step 2: Choose an ETL Tool
  • Step 3: Develop Default Strategies
  • Step 4: Drill Down by Target Table
  • Develop the ETL Specification Document
  • Develop One-Time Historic Load Processing
  • Step 5: Populate Dimension Tables with Historic Data
  • Step 6: Perform the Fact Table Historic Load
  • Develop Incremental ETL Processing
  • Step 7: Dimension Table Incremental Processing
  • Step 8: Fact Table Incremental Processing
  • Step 9: Aggregate Table and OLAP Loads
  • Step 10: ETL System Operation and Automation
  • Real-Time Implications
  • Real-Time Triage
  • Real-Time Architecture Trade-Offs
  • Real-Time Partitions in the Presentation Server
  • Summary
  • 21 Big Data Analytics
  • Big Data Overview
  • Extended RDBMS Architecture
  • MapReduce/Hadoop Architecture
  • Comparison of Big Data Architectures
  • Recommended Best Practices for Big Data
  • Management Best Practices for Big Data
  • Architecture Best Practices for Big Data
  • Data Modeling Best Practices for Big Data
  • Data Governance Best Practices for Big Data
  • Summary
  • Index
  • Advertisement
Additional information

Select product

E-book, purchase to own
