Term
|
Definition
| a term referring to the huge amount of data available today, which is often too big and unstructured to utilize with conventional database software |
|
|
Term
| the amount of data on corporate hard drives doubles ______ |
|
Definition
| the amount of data on corporate hard drives doubles every 6 months |
|
|
Term
| business intelligence (BI) |
|
Definition
| a term combining aspects of reporting, data exploration and ad hoc queries, and sophisticated data modeling and analysis |
|
|
Term
|
Definition
| the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions |
|
|
Term
| why is moving early key in establishing competitive advantage in using data to create models? |
|
Definition
| there's no monopoly on math; advantages based on capabilities and data that others can acquire will be short-lived. |
|
|
Term
| what is key in establishing operationally effective data to gain true strategic positioning? |
|
Definition
| differentiation is key in distinguishing operationally effective data use from true strategic positioning |
|
|
Term
| what is a huge limiting factor of BI? |
|
Definition
| getting data into a form where it can be used, analyzed and turned into information is a limiting factor of BI |
|
|
Term
|
Definition
| outdated information systems that were not designed to share data, aren't compatible with newer technologies, and aren't aligned with the firm's current business needs. |
|
|
Term
| what's the issue with most transactional databases? |
|
Definition
| most transactional databases aren't set up to be simultaneously accessed for reporting and analytics, so this forces the firm to export the data to a warehouse or data mart |
|
|
Term
|
Definition
| a set of databases designed to support decision making in an organization. structured for fast online queries and exploration, may aggregate enormous amounts of data from many different operational systems |
|
|
Term
|
Definition
| a database focused on addressing the concerns of a specific problem like increasing customer retention or improving product quality |
|
|
Term
| what needs to happen before a firm tackles changing its system? |
|
Definition
| a firm needs to have its business goals clearly defined before it can begin to design, develop, deploy, and maintain its system. |
|
|
Term
| what questions should be asked when planning to change a system? |
|
Definition
| after establishing a clear goal, a business needs to address questions concerning data relevance, sourcing, quantity, quality, hosting, and governance. |
|
|
Term
|
Definition
| an open source project that was created to analyze massive amounts of raw information better than traditional, highly structured databases. it consists of some half dozen separate software pieces. |
|
|
Term
| four primary advantages of hadoop: |
|
Definition
1. flexibility: can absorb any type of data from any source 2. scalability: can start with just one machine, but allows for others to join and combine to work together for storage and analysis 3. cost effectiveness: open source 4. fault tolerance: no single point of failure |
|
|
Term
|
Definition
| identifying and retrieving relevant electronic info to support litigation efforts, something a firm should account for in archiving and data storage plans. |
|
|
Term
| main problem with creating large data warehouses? |
|
Definition
| large data warehouses are complex, costly, and can take years to build. |
|
|
Term
| query and reporting tools |
|
Definition
| designed to present users with a subset of requested data that has been selected, sorted, ordered, calculated, and compared as needed. these tools help managers to see what's happening inside their organizations |
|
|
Term
|
Definition
| provide regular summaries of information in a predetermined format. often developed by IS staff and can be difficult to alter |
|
|
Term
|
Definition
| tools that put users in control so they can create custom reports on an as-needed basis |
|
|
Term
|
Definition
| a heads-up display of critical indicators, letting managers get a graphical glance at key performance metrics. |
|
|
Term
| online analytical processing (OLAP) |
|
Definition
| a method of querying and reporting that takes data from standard relational databases, calculates and summarizes it, and then stores it in a data cube. this makes it extremely fast and allows users to 'slice and dice' their data, exploring and comparing it across multiple factors to uncover new insights |
|
|
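Study note: the 'slice and dice' idea can be sketched in a few lines. This is only an illustration, not how a real OLAP product stores its cube. The dimensions (region, product, quarter), the sample rows, and the function names `build_cube` and `slice_total` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical rows pulled from a relational database:
# (region, product, quarter, sales amount)
sales_rows = [
    ("east", "tv", "Q1", 100),
    ("east", "tv", "Q2", 150),
    ("east", "radio", "Q1", 30),
    ("west", "tv", "Q1", 80),
    ("west", "radio", "Q2", 20),
]


def build_cube(rows):
    """Summarize the rows once, keyed by every combination of dimensions."""
    cube = defaultdict(int)
    for region, product, quarter, amount in rows:
        cube[(region, product, quarter)] += amount
    return dict(cube)


def slice_total(cube, region=None, product=None, quarter=None):
    """Total the cells that match the fixed dimensions; None means 'any'."""
    wanted = (region, product, quarter)
    return sum(
        amount
        for key, amount in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


cube = build_cube(sales_rows)
east_total = slice_total(cube, region="east")
tv_q1_total = slice_total(cube, product="tv", quarter="Q1")
```

Fixing one dimension (`region="east"`) is a slice; fixing several at once (`product="tv", quarter="Q1"`) is a dice.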
Term
|
Definition
| a special database used in OLAP |
|
|
Term
|
Definition
| non-trivial discovery of novel, valid, comprehensible and potentially useful patterns from data. the process of using computers to identify hidden patterns and to build models from large data sets. |
|
|
Term
| key areas where businesses are leveraging data mining: |
|
Definition
1. customer segmentation 2. marketing and promotion targeting 3. market basket analysis 4. collaborative filtering 5. customer churn 6. fraud detection 7. financial modeling 8. hiring and promotion |
|
|
Term
|
Definition
| determining which customers are likely to leave and what tactics can help the firm avoid this |
|
|
Term
| what two conditions need to be present in order for data mining to work? |
|
Definition
1. the organization must have clean, consistent data 2. the events in that data should reflect current and future trends |
|
|
Term
|
Definition
| deceiving your system leads to bad data, and bad data creates bad models, which lead to bad estimates |
|
|
Term
| problem of historical consistency: |
|
Definition
| computer models are blind when faced with black swans |
|
|
Term
|
Definition
creating a model with so many variables that 1. the solution arrived at might only work on the subset of data you've created 2. you might be looking at a meaningless statistical fluke |
|
|
Term
| how do you test to see if you're looking at a random occurrence? |
|
Definition
| to test if you're looking at a random occurrence, divide your data. use one portion to build your model and the other to verify your results. |
|
|
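Study note: a minimal sketch of the divide-your-data test described above. The 70/30 split, the fixed seed, and the function name `holdout_split` are assumptions chosen for illustration; the card does not specify any of them.

```python
import random


def holdout_split(rows, build_fraction=0.7, seed=0):
    """Shuffle the rows, then cut them into a build set and a verify set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * build_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 record ids instead of real customer records.
build_rows, verify_rows = holdout_split(range(100))
```

The model is fitted on `build_rows` only. If a pattern found there does not show up in `verify_rows`, it may be a statistical fluke.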
Term
| three critical skills needed by an effective data mining and business analytics team |
|
Definition
1. information technology- understanding how to pull data together 2. statistics- to build models and interpret strength and validity of results 3. business knowledge- to help set goals, requirements, offer deeper insight into what the data is really saying about the business environment |
|
|
Term
|
Definition
| an AI network that hunts down and exposes patterns and builds models to exploit findings |
|
|
Term
|
Definition
| AI systems that leverage experts' knowledge to create if/then rules in order to perform a task in a way that mimics applied human expertise. they improve decision making in non-experts |
|
|
Term
|
Definition
| model building techniques where computers examine many potential solutions to a problem, modifying various models and comparing the models to look for the best alternative. |
|
|
Term
|
Definition
| Walmart's proprietary system that records sales and automatically triggers inventory reordering, scheduling and delivery. main reason for walmart's incredible inventory turnover rate of 8.5 (selling its entire inventory roughly every 6 weeks) |
|
|
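Study note: the "roughly every 6 weeks" figure follows from the 8.5 turnover rate with quick arithmetic. The sketch assumes a 52-week year, and the function name `weeks_to_sell_inventory` is made up for the example.

```python
WEEKS_PER_YEAR = 52  # assumption: a plain 52-week year


def weeks_to_sell_inventory(turnover_rate):
    """Turnover is how many times inventory sells through per year."""
    return WEEKS_PER_YEAR / turnover_rate


weeks = weeks_to_sell_inventory(8.5)  # a little over 6 weeks
```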
Term
|
Definition
| too little or too much inventory |
|
|
Term
| walmart uses data mining to: |
|
Definition
1. keep product mix right in varying conditions (pop tarts during hurricanes) 2. make operational forecasts (how many cashiers are needed) |
|
|
Term
| how does walmart use hadoop |
|
Definition
| walmart leverages its hadoop-based data to support its social media data mining efforts |
|
|
Term
| who does walmart share their data with? |
|
Definition
| walmart gives suppliers access to their products' walmart performance across metrics |
|
|
Term
| how does walmart keep data competitors off their trail? |
|
Definition
| walmart custom builds large portions of its information systems and closely guards its infrastructure |
|
|
Term
| what is a current challenge for walmart |
|
Definition
| walmart is reaching a plateau; it needs to find huge markets or dramatic cost savings in order to boost profits and continue to move its stock price higher |
|
|
Term
|
Definition
too aggressive and big: 1. subpar wages 2. poor labor conditions at some of their suppliers 3. catch-22 for suppliers: miss out on retail sales, or accept such low prices that they end up cannibalizing their own sales with other retailers 4. threatening mom and pop stores |
|
|
Term
| problems with operational data |
|
Definition
1. data is in too many places 2. data is dirty/missing values 3. data is not maintained consistently 4. data is hard to retrieve from legacy systems 5. too much data |
|
|
Term
|
Definition
| integrate data from multiple sources and process it. the results are then formatted into reports used to improve decision making |
|
|
Term
|
Definition
| data from operational databases, internal/external sources go to a data extraction/preparation program, and then to a warehouse |
|
|
Term
|
Definition
| reporting systems, data-mining systems, knowledge management systems, expert systems |
|
|
Term
| business intelligence systems use __________ to provide reporting and analysis for organizational _______ |
|
Definition
| business intelligence systems use data created by other systems to provide reporting and analysis for organizational decision making |
|
|
Term
| how do data mining systems work? |
|
Definition
| data mining systems use statistical techniques such as regression and decision tree analysis to look for patterns and relationships in order to predict outcomes. |
|
|
Term
|
Definition
| dads that go shopping on thursday-saturday will stock up on diapers and beer |
|
|
Term
| knowledge management systems |
|
Definition
| used to share human knowledge and thus gain value from intellectual capital. used to foster innovation and increase company organizational responsiveness |
|
|
Term
|
Definition
| how recently a customer ordered, how frequently they order, and how much they spend per order. |
|
|
Term
|
Definition
| customers sorted by date of most recent purchase and then separated into fifths. the most recent fifth receiving a 1 and the least recent earning a 5 |
|
|
Term
|
Definition
| customers are arranged by frequency, and split into fifths. most frequent fifth receives a 1, least frequent a 5 |
|
|
Term
|
Definition
| customers are arranged by amount, and split into fifths. most expensive fifth receives a 1, least expensive a 5 |
|
|
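Study note: the three scoring cards above can be sketched together as follows. The customer names, the field names (`days_since_last_order`, `orders`, `avg_order_value`), and the helper names are invented for illustration. With only five sample customers, each fifth holds exactly one person.

```python
def quintile_scores(values, best_is_high=True):
    """Rank customers, split them into fifths, score 1 (best) to 5 (worst)."""
    ranked = sorted(values, key=values.get, reverse=best_is_high)
    n = len(ranked)
    return {name: (position * 5) // n + 1 for position, name in enumerate(ranked)}


def rfm_scores(customers):
    """Return {customer: (recency, frequency, money)} scores."""
    # Recency: fewer days since the last order is better, so low values rank first.
    r = quintile_scores(
        {n: c["days_since_last_order"] for n, c in customers.items()},
        best_is_high=False,
    )
    f = quintile_scores({n: c["orders"] for n, c in customers.items()})
    m = quintile_scores({n: c["avg_order_value"] for n, c in customers.items()})
    return {name: (r[name], f[name], m[name]) for name in customers}


# Hypothetical sample data.
customers = {
    "ann": {"days_since_last_order": 3, "orders": 20, "avg_order_value": 50.0},
    "bob": {"days_since_last_order": 40, "orders": 2, "avg_order_value": 300.0},
    "cam": {"days_since_last_order": 10, "orders": 12, "avg_order_value": 20.0},
    "dee": {"days_since_last_order": 90, "orders": 1, "avg_order_value": 10.0},
    "eli": {"days_since_last_order": 25, "orders": 7, "avg_order_value": 120.0},
}
scores = rfm_scores(customers)
```

Here `scores["ann"]` comes out as `(1, 1, 3)`: she ordered recently and often, but her orders are mid-sized.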
Term
|
Definition
| data mining technique for determining sales patterns. shows products customers buy together. based on association rules of probability, support, confidence, and lift |
|
|
Term
|
Definition
| associations and/or correlations among a large set of data items. provided in "if then" statements, and the rules are probabilistic |
|
|
Term
|
Definition
| likelihood that two items will be purchased together |
|
|
Term
|
Definition
| frequency product appears in a transaction database |
|
|
Term
|
Definition
| likelihood that a person buying product A will also buy product B |
|
|
Term
|
Definition
| how much more likely a person who buys product A is to also buy product B, compared with the likelihood that anyone who walks into the store will buy product B |
|
|
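Study note: a minimal sketch of the four measures on the cards above, using their definitions as written (other textbooks use "support" for the joint frequency of A and B). The five transactions and the function names are made up for the example.

```python
# Hypothetical transaction database: each set is one shopping basket.
transactions = [
    {"diapers", "beer"},
    {"diapers", "beer", "chips"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk"},
]


def support(item):
    """Fraction of transactions in which the item appears."""
    return sum(item in basket for basket in transactions) / len(transactions)


def joint_probability(a, b):
    """Fraction of transactions in which both items appear together."""
    return sum(a in basket and b in basket for basket in transactions) / len(transactions)


def confidence(a, b):
    """Likelihood that a basket containing a also contains b."""
    return joint_probability(a, b) / support(a)


def lift(a, b):
    """How much more likely b is among buyers of a than among all shoppers."""
    return confidence(a, b) / support(b)


diaper_beer_lift = lift("diapers", "beer")
```

A lift above 1 suggests buying A makes B more likely; a lift near 1 suggests the two purchases are unrelated.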
Term
| why do companies risk lawsuits over privacy infringement? |
|
Definition
| the cost of a lawsuit is completely surpassed by the profits gained from leveraging this private information |
|
|
Term
| what's the solution to the data dilemma? |
|
Definition
| the solution to the data dilemma is data warehouses |
|
|
Term
|
Definition
| provide information for improving decision making. include: reporting, data-mining, knowledge management, expert systems |
|
|
Term
|
Definition
| look at data for patterns with human eye, build association rules, analyze associations by looking at things like confidence and lift |
|
|