Shared Flashcard Set

Details

BIT 4512 Exam 3
Database 3 exam notes
61
Business
Undergraduate 4
04/18/2012

Cards

Term
Database Transaction
Definition
Any (possibly multi-step) action that reads from and/or writes to a database.
Term
Successful Transaction
Definition
One in which all of the SQL statements are completed successfully.
Term
Consistent Database State
Definition
One in which all data integrity constraints are satisfied.
Term
Properties of a transaction
Definition
Atomicity, Consistency, Isolation, Durability (ACID)
Term
Atomicity
Definition
All transaction operations must be completed; if any one fails, the entire transaction is aborted.
Term
Consistency
Definition
When a database transaction is completed, the database must be in a consistent state.
Term
Isolation
Definition
Data used during the execution of a transaction cannot be used by a second transaction until the first one is completed.
Term
Durability
Definition
Once transaction changes are committed, they cannot be undone or lost due to subsequent failure.
Term
COMMIT
Definition
Permanently records all changes in the database.
Term
ROLLBACK
Definition

Aborts all uncommitted changes

Database is rolled back to its previous state
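
A minimal sketch of the COMMIT and ROLLBACK cards, using Python's built-in `sqlite3` module. The `account` table and its values are made up for illustration; they are not from the course notes.

```python
import sqlite3

# Hypothetical in-memory database and table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()  # COMMIT: the two starting rows are now permanent

# Start a transfer but abandon it halfway through.
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.rollback()  # ROLLBACK: the uncommitted update is discarded

balances_after_rollback = conn.execute(
    "SELECT balance FROM account ORDER BY id").fetchall()

# Run the full transfer and commit it.
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
conn.commit()

balances_after_commit = conn.execute(
    "SELECT balance FROM account ORDER BY id").fetchall()
print(balances_after_rollback, balances_after_commit)
```

The rolled-back update leaves the balances as they were, which is also what atomicity asks for: a half-finished transfer never becomes visible.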

Term
Transaction Log
Definition
Maintained by a DBMS to support recovery to a consistent state.
Term
Concurrency Control
Definition
The process of managing simultaneous operations on the database without having them interfere with one another.
Term
Lost Updates
Definition
Occurs when a successfully completed update is overwritten by another transaction.
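
A small simulation of a lost update, assuming two hypothetical transactions T1 and T2 that both read a quantity before either one writes. The numbers are invented for the example.

```python
# Hypothetical shared value standing in for a row in the database.
stock = {"qty": 35}

# Both transactions read before either writes (the problematic interleaving).
t1_read = stock["qty"]        # T1 reads 35
t2_read = stock["qty"]        # T2 reads 35
stock["qty"] = t1_read + 100  # T1 writes 135 (a purchase of 100 units)
stock["qty"] = t2_read - 30   # T2 writes 5, overwriting T1's update

lost_update_result = stock["qty"]

# Serial execution of the same two transactions, for comparison.
stock = {"qty": 35}
stock["qty"] = stock["qty"] + 100  # T1 runs to completion
stock["qty"] = stock["qty"] - 30   # then T2 runs
serial_result = stock["qty"]

print(lost_update_result, serial_result)
```

The interleaved run ends at 5 while the serial run ends at 105, so this interleaving is not a serializable schedule.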
Term
Uncommitted Data
Definition
Occurs when a transaction accesses the intermediate results of another transaction before they are committed, and that other transaction is then rolled back.
Term
Inconsistent Retrievals
Definition
Occurs when a transaction reads several values, but a different transaction updates some of them in the midst of this process.
Term
Serializable Schedule
Definition
A schedule of a transaction's operations in which the interleaved execution of all active transactions yields the same results as if those transactions were executed in serial order.
Term
Lock Granularity
Definition

The size of the locked resource:

Database-level, table-level, page-level, row-level

Term
Exclusive Lock
Definition
Prohibits other users from reading the locked resource.
Term
Shared Lock
Definition
Allows other users to read the locked resource, but they cannot update it.
Term
Optimistic Locking
Definition

Assumes that no transaction conflict(s) will occur.

The DBMS processes a transaction to a temporary file, then checks whether a conflict occurred.
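
One way the "process first, check for conflict afterwards" idea could look in code. The version counter used for the conflict check is an assumption for this sketch (the card does not say how the check is done), and `read`, `commit`, and `ConflictError` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when another transaction changed the row first."""


# Hypothetical row with a version counter used for the conflict check.
row = {"price": 10, "version": 1}


def read(row):
    # Work on a private copy instead of locking the row.
    return dict(row)


def commit(row, copy, new_price):
    # The conflict check happens only at commit time.
    if row["version"] != copy["version"]:
        raise ConflictError("row changed since it was read")
    row["price"] = new_price
    row["version"] += 1


t1 = read(row)
t2 = read(row)
commit(row, t1, 12)       # T1 commits first: succeeds

try:
    commit(row, t2, 15)   # T2 read stale data: conflict detected
    t2_committed = True
except ConflictError:
    t2_committed = False  # T2 would be repeated with fresh data
```

No locks are held while the transactions work; the cost is that T2 has to be repeated when the check fails.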

Term
Pessimistic Locking
Definition

Assumes conflict(s) will occur:

Locks are issued before a transaction is processed, and then the locks are released.

Term
Two-phase locking
Definition

Guarantees serializability

One of the most common techniques used to achieve this

Transactions are allowed to obtain as many locks as necessary (growing phase)

Once the first lock is released (shrinking phase), no additional locks can be obtained

Two-phase locking doesn't prevent deadlocks
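
A toy sketch of the growing and shrinking phases. `TwoPhaseTransaction` is a hypothetical class that only enforces the two-phase rule on one transaction's own locks; it does not model conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Toy transaction that enforces the two-phase rule on its own locks."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # flips to True at the first release

    def lock(self, resource):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {resource!r} in shrinking phase")
        self.held.add(resource)

    def unlock(self, resource):
        self.shrinking = True
        self.held.discard(resource)


t = TwoPhaseTransaction("T1")
t.lock("row_A")    # growing phase
t.lock("row_B")    # growing phase
t.unlock("row_A")  # first release: shrinking phase begins

try:
    t.lock("row_C")  # not allowed once a lock has been released
    violation_blocked = False
except RuntimeError:
    violation_blocked = True
```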

Term
Deadlocks
Definition
An impasse that may result when two (or more) transactions are waiting for locks held by the other to be released.
Term
Deadlock Prevention
Definition

Abort a transaction if there is a possibility of deadlock

Reschedule transaction for later execution

Term
Deadlock Detection
Definition

DBMS periodically tests database for deadlocks

If found, one transaction ("victim") is rolled back
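
A sketch of how the periodic test might work, assuming the DBMS keeps a "waits-for" map from each transaction to the transactions whose locks it needs. That data structure, the function name `find_deadlock`, and the way the victim is picked are all assumptions for illustration.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for maps each transaction to the transactions whose locks it needs.
    """
    def visit(node, path):
        if node in path:
            return path[path.index(node):]  # cycle found
        for nxt in waits_for.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# T1 waits for T2 and T2 waits for T1: a deadlock. T3 is merely waiting.
waits_for = {"T1": ["T2"], "T2": ["T1"], "T3": ["T1"]}
cycle = find_deadlock(waits_for)
victim = cycle[-1]  # arbitrary choice here; a real DBMS uses its own policy

no_deadlock = find_deadlock({"T1": ["T2"], "T2": []})
```

Rolling back the victim releases its locks, which lets the other transaction in the cycle proceed.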

Term
Deadlock Avoidance
Definition
Transactions obtain all needed locks before execution
Term
Timestamp
Definition
A unique identifier created by DBMS that indicates the relative starting time of a transaction.
Term
Stored Procedure
Definition
A subroutine available to applications accessing a relational database system. A stored procedure (sproc or SP) is actually stored in the database.
Term
Stored procedures can take as input and return?
Definition
Parameters, results
Term
Stored Procedures can be called from?
Definition

Standard languages (Java, C#)

Scripting languages (JavaScript, VBScript, PHP)

SQL Command prompt (SQL*Plus)

Term
Advantages of Stored Procedures
Definition

Performance

Compiled once

Server-side computation

Executable code is cached and shared

Grouping SQL statements allows for single-call execution

Term
Persistent Stored Modules
Definition

SQL itself does not support control statements such as looping operations

The SQL-99 standard defines the use of persistent stored modules (PSM) to add them

Term
Triggers
Definition

A procedure that is automatically executed by the RDBMS when a given data manipulation event occurs.

Often used to enforce referential integrity.
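
A minimal sketch of a trigger that enforces a referential-integrity rule, written against SQLite through Python's `sqlite3` module. The `customer` and `orders` tables and the trigger name are hypothetical, and trigger syntax differs between RDBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")

# The trigger fires automatically before every DELETE on customer.
conn.execute("""
    CREATE TRIGGER keep_customers_with_orders
    BEFORE DELETE ON customer
    BEGIN
        SELECT RAISE(ABORT, 'customer still has orders')
        WHERE EXISTS (SELECT 1 FROM orders WHERE customer_id = OLD.id);
    END
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob')")
conn.execute("INSERT INTO orders VALUES (10, 1)")
conn.commit()

try:
    conn.execute("DELETE FROM customer WHERE id = 1")  # Ann has an order
    delete_blocked = False
except sqlite3.DatabaseError:
    delete_blocked = True

conn.execute("DELETE FROM customer WHERE id = 2")      # no orders: allowed
remaining = conn.execute("SELECT name FROM customer").fetchall()
```

The application never calls the trigger; the DELETE statement (the data manipulation event) is what sets it off.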

Term
Business Intelligence (BI)
Definition
A set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information.
Term
Who coined the term Business Intelligence and when?
Definition
Gartner Group -> Early 1990s
Term
Typical Major Components of Business Intelligence
Definition

Data extraction, transformation, and loading tools

Data store (data warehouse or Data mart)

Data Query and analysis tool (OLAP)

Data presentation and visualization tools (dashboard)

Term
Major Software Vendors of BI
Definition
Microsoft, IBM, Oracle, SAP
Term
Operational Data
Definition

(Transactional databases)

Stored in highly normalized tables in a relational database

 

Dynamically updated

 

Focus on traditional information systems

Term
Decision Support Data
Definition

(Data Warehouses)

Stored in formats that facilitate data extraction, data analysis, and decision making

Often aggregated

Often with redundancies.

Term
General Steps of BI
Definition

1. Collecting and storing operational data

2. Aggregating the operational data into decision support data

3. Analyzing the decision support data to generate information

4. Presenting the information to the end user to support decision-making

5. Making business decisions (and generating more data)

6. Monitoring results to evaluate outcomes of the business decisions

Term
Data Warehouse
Definition
A database optimized for data analysis and read-only query processing.
Term
Master Data Management (MDM)
Definition

Provides for a comprehensive and consistent definition of all data in an organization

Ensures uniform and consistent views of all data

Supports proper Governance

For controlling and monitoring business health

Creates accountability

 
Term
Data Warehouse Characteristics
Definition

Integrated - Consistent format and meaning.

Subject-oriented - Organized to answer questions.

Time-variant - captures and represents the flow of data over time.

Nonvolatile - Once the data enters the warehouse, it's never removed.

~1 to 3 years to implement

Term
Data Mart
Definition
A small, single-subject data warehouse subset that provides decision support to a small group of people.
Term
Data Mart Characteristics
Definition

Less organizational commitment

Lower Cost

Shorter implementation time

~6 months - 1 year

Term
Online Analytical Processing (OLAP)
Definition

Graphical User Interface

Analytic processing logic

Data-processing logic

Capacity for multi-dimensional analysis

Used with both transactional databases and data warehouses

 

Term
Multi-dimensional DBMS
Definition

Data is stored in multi-dimensional arrays

Typically visualized as being stored as a Data Cube

Term
Data Cube
Definition

Data retrieval is much quicker than with standard relational databases

Provides opportunity to "Slice and dice" data

Foundation for multi-dimensional OLAP

Term
Fact Table
Definition

Associated with a particular type of (aggregated) data.

Ex. Sales Table

Term
Dimension Tables
Definition

Attributes provide descriptive information about the facts within a given dimension.

Ex. Product, time, and location tables.
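
A small sketch of a fact table joined to dimension tables, again using `sqlite3`. The table layout and rows are invented for illustration; to keep it short, the time dimension is reduced to a `year` column on the fact table rather than a separate table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimension tables: descriptive attributes.
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE location (location_id INTEGER PRIMARY KEY, region TEXT)")
# Fact table: numeric measures keyed by the dimensions.
conn.execute("""CREATE TABLE sales (
    product_id INTEGER, location_id INTEGER, year INTEGER, amount INTEGER)""")

conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO location VALUES (?, ?)",
                 [(1, "East"), (2, "West")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    (1, 1, 2011, 100), (1, 2, 2011, 150),
    (2, 1, 2011, 200), (2, 1, 2012, 250),
])

# "Slice" on year = 2011, then aggregate the facts by region.
sales_by_region_2011 = conn.execute("""
    SELECT l.region, SUM(s.amount)
    FROM sales s JOIN location l ON l.location_id = s.location_id
    WHERE s.year = 2011
    GROUP BY l.region
    ORDER BY l.region
""").fetchall()
print(sales_by_region_2011)
```

Swapping `location` for `product` in the join and GROUP BY gives a different view of the same facts, which is the idea behind slicing and dicing.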

Term
Data Mining
Definition
Non-trivial extraction of implicit, previously unknown and potentially useful information from data.
Term
Motivation of Data Mining
Definition
Ideas come from many disciplines including machine learning/AI, pattern recognition, statistics, and database systems
Term
Supervised algorithms (Classification)
Definition

Learning by example

Use of training data which has correct answers

Create a model by running the algorithm on the training data

Term
Unsupervised algorithms (Clustering)
Definition

Does not use training data

Classes may not be known in advance

Term
Classification
Definition

Given a collection of records

Each record contains a set of attributes, one of the attributes is the dependent variable/class

Find a model to predict the class attribute as a function of the values of the other attributes

Goal: previously unseen records should be assigned to a class as accurately as possible

 
Term
K-Nearest Neighbor
Definition

Basic idea:

Look at characteristics / attributes

“If it walks like a duck and quacks like a duck, then it’s probably a duck”

 
Term
Nearest-Neighbor Classifier
Definition

Requires three things

-The set of stored records

-Distance Metric to compute the distance between records

-The value of k, the number of nearest neighbors to retrieve

To classify an unknown record:

-Compute distance to other training records

-Identify k nearest neighbors 

-Use class labels of nearest neighbors to determine the class label of unknown record (e.g., by taking majority vote, weighted distance)
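
The classification steps above can be sketched as follows, assuming Euclidean distance as the metric and a simple majority vote. The function name `knn_classify` and the training records are made up for the example.

```python
import math
from collections import Counter


def knn_classify(records, unknown, k):
    """records: list of (attributes, label); unknown: attribute tuple."""
    # 1. Compute the distance from the unknown record to every stored record.
    by_distance = sorted(records, key=lambda r: math.dist(r[0], unknown))
    # 2. Keep the k nearest neighbors.
    nearest = by_distance[:k]
    # 3. Majority vote over their class labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


training = [
    ((1.0, 1.0), "duck"), ((1.2, 0.8), "duck"), ((0.9, 1.1), "duck"),
    ((5.0, 5.0), "goose"), ((5.2, 4.9), "goose"),
]

prediction = knn_classify(training, (1.1, 1.0), k=3)

# Same unknown record near the geese, classified with two values of k.
small_k = knn_classify(training, (4.5, 4.5), k=1)
large_k = knn_classify(training, (4.5, 4.5), k=5)
```

With `k=5` the neighborhood covers the whole training set, so the three ducks outvote the two nearby geese; this is the "k too large" problem in miniature.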

 
Term
Choosing the value of k
Definition

If k is too small, the model is sensitive to noise

If k is too large, neighborhood may include too many points from other classes

 
Term
Neural Networks
Definition

An artificial neural network (ANN), usually called neural network (NN), is a mathematical model or computational model that is inspired by the structure and/or functional aspects of biological neural networks.

They are usually used to model complex relationships between inputs and outputs or to find patterns in data.

 

Term
Clustering
Definition

Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that

Data points in one cluster are more similar to one another

Data points in separate clusters are less similar to one another

Similarity Measures:

Euclidean Distance (if attributes are continuous)

Other Problem-specific Measures
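
A tiny clustering sketch using Euclidean distance as the similarity measure. The card does not name an algorithm; k-means is used here only as one familiar choice, and the points and starting centroids are invented.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Tiny k-means: assign to nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Euclidean distance is the similarity measure.
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                # Move the centroid to the mean of its cluster.
                new_centroids.append(
                    tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        centroids = new_centroids
    return clusters, centroids


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 2.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
clusters, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

No class labels are supplied anywhere, which is what makes this unsupervised: the two groups emerge from the distances alone.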

 
Term
Document Clustering
Definition

Clustering Points:  Twitter feeds / blog comments

Similarity Measure:  How many words are common in these “documents” (after some word filtering)

 

Applications:  

Identify issues with a product more quickly and with greater detail

Identify the occurrence of / details about a disaster event as it is in the process of occurring (used for flooding in Oklahoma and North Dakota)
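
A rough sketch of the similarity measure from this card: count the words two "documents" share after filtering. The stop-word list, the punctuation handling, and the sample tweets are assumptions made for the example.

```python
# Hypothetical stop-word list used for the "word filtering" step.
STOP_WORDS = {"the", "is", "a", "in", "on", "and", "my", "of", "to"}


def words(text):
    # Lowercase, strip basic punctuation, and drop stop words.
    cleaned = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in cleaned if w and w not in STOP_WORDS}


def common_words(doc_a, doc_b):
    """Similarity = number of distinct words the two documents share."""
    return len(words(doc_a) & words(doc_b))


tweet_1 = "Flooding on Main Street, the bridge is closed!"
tweet_2 = "Main Street bridge closed, water rising"
tweet_3 = "My new phone battery is great"

sim_12 = common_words(tweet_1, tweet_2)
sim_13 = common_words(tweet_1, tweet_3)
```

Tweets 1 and 2 share several words and would land in the same cluster, while tweet 3 shares none with tweet 1.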

 