Shared Flashcard Set

Details

ISDS 2001 Test 3
Chapter 2
55
Software
Undergraduate 2
03/11/2013

Additional Software Flashcards

 


 

Cards

Term
Data Warehouse
Definition

Database used for reporting and data analysis.
Foundation of most BI.
Allows accessing and integrating relevant data in a form that is consistent and readily available. 

Single version of the truth.

 

 

 

Term
4 Major Characteristics of Data Warehousing
Definition
  1. Subject-Oriented
  2. Integrated
  3. Time-Varient
  4. Nonvolatile
Term

Characteristic of Data Warehousing: 

Subject-Oriented

Definition

Organized by topics

Best for providing a more comprehensive view of the organization.

Says why a business is operating, not just how. 

Ex: Sales, products, customers, etc.

Term

Characteristic of Data Warehousing:

Integrated

Definition

Different sources stored in one consistent format. 

Gives clarity. 

 

Term

Characteristic of Data Warehousing:

Time Varient

Definition

Provides data at various points in time. 

Daily, weekly, monthly, etc.

Both historic and current data so you can analyze trends.

Every data warehouse should have a time variable.

Ex: LSU enrollment, retention, etc.

Term

Characteristic of Data Warehousing:

Nonvolatile

Definition
Users cannot change the data once entered into the DW. Only used for reference, but unnecessary info can be deleted
Term
Additional Chacteristics of DW
Definition
  1. Designed for web-based usage
  2. Has relational/multidimensional structure
  3. Uses client/server layout to provide easy access to end-user
  4. For newer DWs, allows for real-time and active data access and analysis
  5. Contains metadata which is info that describes your data (data about data)

 

Term

3 Main Types of DW's:

Data Mart


(2 Subsets)

Definition

Usually consists of one subject area

Ex: Sales

 

  1. Dependent Data Mart- created directly from the dw. Ensures that the user is viewing the same data available to all users. EDW must be constructed first.
  2. Independent Data Mart- Small warehouse designed for a department or strategic business unit (SBU). It's source is not an EDW. 
Term

3 Main Types of DW's:

Operational Data Stores

(ODS)

Definition

Type of database used as a staging area for a dw, especially for customer info. files (CIF). 

Data are updated frequently as opposed to the static contents of the dw. (short-term memory)

Term

3 Main Types of DW's:

Enterprise Data Warehouses

(EDW)

Definition

Large-scale dw used across the company for decision support. 

Integrates data in a standard format from many sources. 

Ex: DirecTV and Enterprise Rental use EDW

Term
Metadata
Definition

Data about data.

Describes the contents/structure of a dw and why it's used.

(field name, data type, ect.)

 

Term

Major Component of the DW process:

Data Sources

Definition
Transactional data (OLTP) such as CRM, external dta (ex: census data), Access, etc..
Term

Major Component of the DW process:

ETL Process

(Extraction, Transformation, Load)

Definition
Data is taken from external sources, maintained in a staging area where transformed and integrated, then loaded into the DW or DM (data mart).
Term

Major Component of the DW process:

Comprehensive Database

Definition
The EDW used to support all decision analyses.
Term

Major Component of the DW process:

Metadata

Definition

Maintained so it can be used by IT users.

Includes software programs with rules for organizing data that can be indexed and searched. 

Term

Major Component of the DW process:

Middleware Tools

Definition

Tools that access the contents of dw. 

Ex: data mining, queries, predictive analysis, ect. 

Term

DW Architectures:

Server

Definition
Computer hardware that provides a specific service
Term

DW Architectures:

Application Server

Definition
Computer hardware responsible for the efficient execution of procedures (programs)
Term

DW Architectures:

Database Server

Definition

Sometimes referred to as the "back-end"

Holds the DW

Term

DW Architectures:

Client Software

Definition
Allows users to request a server's content or function.
Term

Two-Tiered Architecture

(Includes Pros and Cons)

Definition
  1. Client Workstation- allows end user to request both the application functions and data content from 1 server. 
  2. Application Server and DW run on the same server

Pros- more economical than three-tiered

Cons- Can have performance problems for large dw using complicated applications

(Picture in notes)

Term

Three-Tiered Architecture

(Includes Pros and Cons)

Definition
  1. Client Workstation- allows the end user to request application functions (ex: Access) and request data content (ex: Access database files)
  2. Application Server- responsible for execution of programs
  3. Database Server- houses the database or data warehouse

Pros: Seperates the application and database functions, allows for greater capacity and performance of the respective servers

Cons: Increased cost due to more hardware requirements

Term
Issues Considered When Deciding on the Architecture to use (4)
Definition
  1. Which Database Management System (DBMS) should be used? 
  2. Will parallel processing/partitioning be used? (Parallel partitioning referes to splitting data tables into smaller tables for efficient access)
  3. Will data migration tools be used to load the data warehouse?
  4. What tools will be used to support data retrieval and analysis?
Term

Alternative DW Architectures:

Independent Data Mart

Definition
Contains data coming directly from the operational data sources
Term

Alternative DW Architectures:

Data Mart Bus
Definition

Tightly integrated data marts that get their power from conformed dimensions and fact tables.

Conformed dimension is defined one time and used everywhere

Term

Alternative DW Architectures:

Hub-and-Spoke
Definition
Term

Alternative DW Architectures:

Centralized Enterprise DW
Definition
Collaboration of the company data extracts.
Term

Alternative DW Architectures:

Federated DW
Definition
Term
10 Factors that Potentially Affect the Architecture Selection Decision
Definition
  1. Info interdependence between organizational units
  2. Upper management's info needs
  3. Urgency of need for a dw
  4. Nature of end-user tasks
  5. Constraints on resources
  6. Strategic view of the data warehouse prior to implementation
  7. Compatibility with existing systems
  8. Percieved ability of the in-house IT staff
  9. Technical issues
  10. Social/political factors
Term
Teradata Corporation
Definition
Supports the central DW architecture
Term
3 Major Processes of Data Integration
Definition

Data Integration- combining data from different sources, providing users with a unified view of the data. 

  1. Data access
  2. Data federation
  3. Change capture

 

Term

3 Major Processes of Data Integration:

Data Access

Definition
Refers to the ability to access and extract data form any data source
Term

3 Major Processes of Data Integration:

Data Federation

Definition
Exists when one application is able to treat multiple data stores as one entity
Term

3 Major Processes of Data Integration:

Change Capture

Definition
Process of capturing changes made at the data source and applying them throughout the entire enterprise. Ensures data consistency. 
Term
ETL
Definition

Major technical side of DW process. 

Takes up 70% of time used to build DW.

DW would not exist without ETL, ensures data quality. 

Includes Extraction, Transformation, and Load

Term

ETL:

Extraction

Definition

Reading data from one or more databases. 

ex: spreadsheets

Term

ETL:

Transformation

Definition

Converting data from their original form to whatever form the DW needs. 

Removes errors, splits fields, recodes, etc. 

Term

ETL:

Loading

Definition
Putting the transformed data into the DW
Term
What a company must do to ensure a successful DW implementation proccess
Definition
  1. Define the plan (business objectives and strategies)
  2. Gather support from managers and end-users
  3. Set reasonable time frames and budgets
  4. Manage expectations
Term

The Imnom Model

(DW Approach)

Definition

Ultimately results in an EDW

Created by Bill Inmon (Father of DW)

Top-down approach, the enterprise has one dw and the data marts are created from the dw. 

Term

Kimball Model

(Data Mart Approach)

Definition

Results in an EDW

Created by Ralph Kimble

Bottom-up approach, the dw is a collection of all data marts across the enterprise

built one dm at a time

Data mart is subject-oriented and focuses on the departmental level (ex: sales, inventory, etc.)

Term
Additional DW Developmental Considerations
Definition

Outsourcing to have a seperate company maintain the DW

security/privacy concerns

Term

The Star Schema 

(Representation of data in a dw)

Definition

Design is based upon a concept called

dimensional modeling (retrieval-based system that allows for querying data tables, usually in terms of location, time period, etc)

The schema consists of a central FACT table surrounded by several DIMENSION tables

The FACT table is your data to be analyzed. Contains foreign keys (FK) to be linked to the dimensional tables

The DIMENSION table contains classification information used to define how your facts/data are to be summarized

-dimension tables have one-to-many relationships with rows in the fact table

Examples in notes

Term
OLAP and OLTP
Definition

OLAP: Captures and stores day-to-day business operations

OLTP: Can perform ad-hoc analysis and complex multidimensional queries 

Term

OLAP Operations

Slice

Definition

A subset of a multidimensional array of corresponding to a single value on ONE of the dimensions. 

ex: (in notes) The single value on one dimension was alcohol involvement= yes

 

Term

OLAP Operations:

Dice

Definition

A slice on more than 2 dimensions 

(same as slicing the slice of a slice)

Term

OLAP Operations:

Drill Down/Up

Definition
Technique where the user navigaes among levels of data ranging from the most summarized (up) to the most detailed (down)
Term

OLAP Operations:

Roll Up


Definition
Involves computing totals across all levels of a dimension (or multiple dimensions)
Term

OLAP Operations:

Pivot

Definition

Changing the dimensional orientation of a cube 

(ex: rotating the cube)

Term
Granularity
Definition

Refers to the highest level of detail. 

ex: sales by region vs sales by state

If granularity is too high, then the dw may not support requests to drill down into the data.

ex: if sales are only stored by department, you cannot request data by sales rep 

Term
Scalability
Definition
The ability to accommodate when informtion increases and maintain performance.
Term

Real-time DW (RDW)

(aka-active data warehouse (ADW))

Definition

Proccess of loading and providing data by way of a dw as they become more available

Data used for decision making is updated on an ongoing basis

Becoming more useful for companies who interact directly with customers and suppliers and need up-to-date info

Term
Evolutionary Process of Increased Requirement of the DW
Definition
  1. At a very basic level, the DW says what happened in terms of reports. (ex: daily sales reports)
  2. Next level, DW analyzes why something happened (ex: sales dropped due to defective billboards)
  3. As business increases, DW predict what will happen. 
  4. Technology advances then to describe what is happening now. Allows for companies to react quickly
Term

DW Administrator (DWA)

 

Definition

Maintains the DW and has skills that surpass a traditional database administrator (DBA). Besides technical skills, they should have:

  • Know high-performance hardware, software, and networking technologies
  • Solid business insight, to understand the DW purpose
  • Familiarity with business decision-making processes to understand how the DW will be use
  • Excellent communication skills
Term
4 Main Areas of DW Security
Definition
  1. Establishing effective corporate and security policies and procedures
  2. Implementing logical security procedures to restrict access (ex: authenication, encryption, etc)
  3. Limiting physical access to the data center environment
  4. Establishing an effective internal review process for security and privacy
Supporting users have an ad free experience!