none
Data warehousing /Business intelligence

    Question

  • Does anyone have a list of data warehousing /Business intelligence interview questions?
    or can share the Data warehousing/BI interview experience with me?
    Wednesday, August 20, 2014 12:39 PM

Answers

  • OLTP-type applications are interactive: you press a button and expect the system to respond within a few seconds at most. That presents a challenge if the application is a multi-user system with hundreds of users working at the same time. It is not uncommon for an OLTP system to crunch tens of thousands, even millions of database operations per minute. And because of that these operations need to be extremely fast.

    Therefore OLTP systems are specifically engineered for a low latency in conditions of high volume of elementary database operations.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:03 PM
    Wednesday, August 20, 2014 1:02 PM
  • See the below link

    http://www.dwbiconcepts.com/tutorial/24-interview-questions/6-top-50-dwbi-interview-questions-with-answers.html

    It's a nice one


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    Wednesday, August 20, 2014 12:44 PM
  • Online transaction processing (OLTP), workloads are characterized by small, interactive transactions that generally require sub-second response times. It is common for OLTP systems to have high concurrency requirements, with a read/write ratio ranging from 60/40 to as low as 98/2. Modifications are predominantly singleton statements, and most queries are constrained to simple joins. While limiting joins to as few tables as possible is desirable, a significant number of application systems do join many tables. Standard practices call for indexing strategies in OLTP systems to target an increase in concurrency versus query support; however, more indexes have to be created than is desired to reach acceptable query performance. The lower the proportion of write operations is in the system, the higher the level of indexing that can be tolerated, unless the timing of specific write operations is critical. Database plans generally start with third normal form (3NF) enforced with referential integrity (RI) constraints, and then selectively deviate to second normal form (2NF) when necessary to enhance performance.

    http://technet.microsoft.com/en-us/library/hh393556(v=sql.110).aspx


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 12:52 PM
    Wednesday, August 20, 2014 12:47 PM

  • OLTP is designed to serve real-time transactional business needs.  The
    idea is get in/get out quickly and efficiently.  OLAP is designed to
    have large batch windows that aggregate and summarize data into
    reporting schemas.  The two are dynamically opposed.  The processing of
    the OLAP portion of the database kills the processing and memory
    utilization needed to keep the OLTP system active and "lively".

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 12:52 PM
    Wednesday, August 20, 2014 12:49 PM
  • OLPT and OLAP are complementing technologies. You can't live without OLTP: it runs your business day by day. So, using getting strategic information from OLTP is usually first “quick and dirty” approach, but can become limiting later.

    This post explores key differences between two technologies.

    OLTP stands for On Line Transaction Processing and is a data modeling approach typically used to facilitate and manage usual business applications. Most of applications you see and use are OLTP based.

    OLAP stands for On Line Analytic Processing and is an approach to answer multi-dimensional queries. OLAP was conceived for Management Information Systems and Decision Support Systems but is still widely underused: every day I see too much people making out business intelligence from OLTP data!

    With the constant growth of data analysis and business intelligence applications (now even in small business) understanding OLAP nuances and benefits is a must if you want provide valid and useful analytics to management.

    Ref.http://www.cbsolution.net/techniques/ontarget/olap_vs_oltp_what_makes


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 12:51 PM
    Wednesday, August 20, 2014 12:51 PM
  • This schema is used in data warehouse models where one centralized fact table references number of dimension tables so as the keys (primary key) from all the dimension tables flow into the fact table (as foreign key) where measures are stored. This entity-relationship diagram looks like a star, hence the name. Star-Schema

    Consider a fact table that stores sales quantity for each product and customer on a certain time. Sales quantity will be the measure here and keys from customer, product and time dimension tables will flow into the fact table.

    Pleases see the link : http://www.dwbiconcepts.com/tutorial/24-interview-questions/6-top-50-dwbi-interview-questions-with-answers.html


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:01 PM
    Wednesday, August 20, 2014 12:55 PM
  • Is it possible to collect data of Data warehouses usually from multiple systems ?

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:11 PM
    Wednesday, August 20, 2014 1:05 PM
  • Yes it is.Data warehouses usually source data from multiple systems. For example, sales data, financial data, inventory, vendors, etc. may come from different geographically distributed platforms which otherwise don’t talk to each other. Having all this information in one data warehouse enables business users to get reports such as profits on sales of certain goods per supplier.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:14 PM
    Wednesday, August 20, 2014 1:06 PM
  • 1. Big data is images/photos/video/music streams (and as files too),
    unstructured text (documents, emails), web site log files, outputs
    (streams) from scientific/industrial instruments and sensors such as
    thermometers, digital pneumatic/pressure gauges.

    2. data warehouse
    is a system that retrieves and consolidates data periodically from the
    source systems into a dimensional or normalized data store. It usually
    keeps years of history and is queried for business intelligence or other
    analytical activities.

    3.Data mining is the process of exploring
    data to find the patterns and relationships that describe the data and
    to predict the unknown or future values of the data. The key value of
    data mining is the ability to understand why some things happened in the
    past and the ability to predict what will happen in the future.

    Ref:http://dwbi1.wordpress.com/2012/07/14/what-is-big-data-data-warehouse-data-mining/


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:15 PM
    Wednesday, August 20, 2014 1:12 PM
  • I've included links to them.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/


    Wednesday, August 20, 2014 1:26 PM
  • Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything such as the sale of widgets. Once the activity is complete, there is data about it. 

    Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly. In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:46 PM
    Wednesday, August 20, 2014 1:31 PM
  • OLTP systems usually handles operational activity of the company and serves high volume of short identical queries and short transactions. Data Warehouse blueprint described reporting type system with low volume of long transactions and complex queries. In OLTP systems data is constantly changing, in Data Warehouse systems it rarely the case – data usually updates on the batches based on some schedule.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:46 PM
    Wednesday, August 20, 2014 1:35 PM
  • No-consistency in terms of transaction isolation levels mean read uncommitted. Either directly as “set transaction isolation level” or with (NOLOCK) hints. There are some cases when you can decide to use that isolation level – for example, in our system we have a few transaction entities where data has been inserting to our system and never ever updating/deleting after that. Same time, clients are constantly downloading the data and we don’t really care if clients get the data from uncommitted transaction.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:45 PM
    Wednesday, August 20, 2014 1:37 PM
  • Data Warehouse type systems – use them. I don’t think about any single reason why you would like to avoid them in such systems. For OLTP – consider them. If you can live with performance overhead – it could be the good choice. Be careful – don’t forget about extra 14 bytes and don’t use fillfactor = 100 though.

    So, the bottom line. For Data Warehouse – use optimistic isolation levels whenever possible. Only case with Data Warehouse systems when I would suggest to consider different options is when data in the system updates on the real time. And even in such case give optimistic isolation levels the try. For OLTP – if you can use optimistic isolation levels – use them (start with read committed snapshot). If not, use read committed and optimize the queries. Don’t use read uncommitted and (nolock) hints unless you don’t care at all about consistency. And definitely don’t use read uncommitted to reduce blocking. This is the bad choice.


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:45 PM
    Wednesday, August 20, 2014 1:44 PM

All replies

  • See the below link

    http://www.dwbiconcepts.com/tutorial/24-interview-questions/6-top-50-dwbi-interview-questions-with-answers.html

    It's a nice one


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    Wednesday, August 20, 2014 12:44 PM
  • Why are OLTP database designs not generally a good idea for a Data Warehouse?
    Wednesday, August 20, 2014 12:44 PM
  • Online transaction processing (OLTP), workloads are characterized by small, interactive transactions that generally require sub-second response times. It is common for OLTP systems to have high concurrency requirements, with a read/write ratio ranging from 60/40 to as low as 98/2. Modifications are predominantly singleton statements, and most queries are constrained to simple joins. While limiting joins to as few tables as possible is desirable, a significant number of application systems do join many tables. Standard practices call for indexing strategies in OLTP systems to target an increase in concurrency versus query support; however, more indexes have to be created than is desired to reach acceptable query performance. The lower the proportion of write operations is in the system, the higher the level of indexing that can be tolerated, unless the timing of specific write operations is critical. Database plans generally start with third normal form (3NF) enforced with referential integrity (RI) constraints, and then selectively deviate to second normal form (2NF) when necessary to enhance performance.

    http://technet.microsoft.com/en-us/library/hh393556(v=sql.110).aspx


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 12:52 PM
    Wednesday, August 20, 2014 12:47 PM
  • Why should you put your data warehouse on a different system than your OLTP system?
    Wednesday, August 20, 2014 12:48 PM

  • OLTP is designed to serve real-time transactional business needs.  The
    idea is get in/get out quickly and efficiently.  OLAP is designed to
    have large batch windows that aggregate and summarize data into
    reporting schemas.  The two are dynamically opposed.  The processing of
    the OLAP portion of the database kills the processing and memory
    utilization needed to keep the OLTP system active and "lively".

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 12:52 PM
    Wednesday, August 20, 2014 12:49 PM
  • OLPT and OLAP are complementing technologies. You can't live without OLTP: it runs your business day by day. So, using getting strategic information from OLTP is usually first “quick and dirty” approach, but can become limiting later.

    This post explores key differences between two technologies.

    OLTP stands for On Line Transaction Processing and is a data modeling approach typically used to facilitate and manage usual business applications. Most of applications you see and use are OLTP based.

    OLAP stands for On Line Analytic Processing and is an approach to answer multi-dimensional queries. OLAP was conceived for Management Information Systems and Decision Support Systems but is still widely underused: every day I see too much people making out business intelligence from OLTP data!

    With the constant growth of data analysis and business intelligence applications (now even in small business) understanding OLAP nuances and benefits is a must if you want provide valid and useful analytics to management.

    Ref.http://www.cbsolution.net/techniques/ontarget/olap_vs_oltp_what_makes


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 12:51 PM
    Wednesday, August 20, 2014 12:51 PM
  • Thanks .

    Can you tell me "What is Star-schema?"

    Wednesday, August 20, 2014 12:54 PM
  • This schema is used in data warehouse models where one centralized fact table references number of dimension tables so as the keys (primary key) from all the dimension tables flow into the fact table (as foreign key) where measures are stored. This entity-relationship diagram looks like a star, hence the name. Star-Schema

    Consider a fact table that stores sales quantity for each product and customer on a certain time. Sales quantity will be the measure here and keys from customer, product and time dimension tables will flow into the fact table.

    Pleases see the link : http://www.dwbiconcepts.com/tutorial/24-interview-questions/6-top-50-dwbi-interview-questions-with-answers.html


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:01 PM
    Wednesday, August 20, 2014 12:55 PM
  • Why the OLTP systems are specifically engineered for a low latency?
    Wednesday, August 20, 2014 1:01 PM
  • OLTP-type applications are interactive: you press a button and expect the system to respond within a few seconds at most. That presents a challenge if the application is a multi-user system with hundreds of users working at the same time. It is not uncommon for an OLTP system to crunch tens of thousands, even millions of database operations per minute. And because of that these operations need to be extremely fast.

    Therefore OLTP systems are specifically engineered for a low latency in conditions of high volume of elementary database operations.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:03 PM
    Wednesday, August 20, 2014 1:02 PM
  • Is it possible to collect data of Data warehouses usually from multiple systems ?

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:11 PM
    Wednesday, August 20, 2014 1:05 PM
  • Yes it is.Data warehouses usually source data from multiple systems. For example, sales data, financial data, inventory, vendors, etc. may come from different geographically distributed platforms which otherwise don’t talk to each other. Having all this information in one data warehouse enables business users to get reports such as profits on sales of certain goods per supplier.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:14 PM
    Wednesday, August 20, 2014 1:06 PM
  • What is Big Data, Data Warehouse, Data Mining?


    Wednesday, August 20, 2014 1:10 PM
  • 1. Big data is images/photos/video/music streams (and as files too),
    unstructured text (documents, emails), web site log files, outputs
    (streams) from scientific/industrial instruments and sensors such as
    thermometers, digital pneumatic/pressure gauges.

    2. data warehouse
    is a system that retrieves and consolidates data periodically from the
    source systems into a dimensional or normalized data store. It usually
    keeps years of history and is queried for business intelligence or other
    analytical activities.

    3.Data mining is the process of exploring
    data to find the patterns and relationships that describe the data and
    to predict the unknown or future values of the data. The key value of
    data mining is the ability to understand why some things happened in the
    past and the ability to predict what will happen in the future.

    Ref:http://dwbi1.wordpress.com/2012/07/14/what-is-big-data-data-warehouse-data-mining/


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:15 PM
    Wednesday, August 20, 2014 1:12 PM
  • Can you tell me the Indexing Strategies for OLTP Databases?
    Wednesday, August 20, 2014 1:25 PM
  • I've included links to them.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/


    Wednesday, August 20, 2014 1:26 PM
  • What is real time data-warehousing?
    Wednesday, August 20, 2014 1:30 PM
  • Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything such as the sale of widgets. Once the activity is complete, there is data about it. 

    Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly. In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:46 PM
    Wednesday, August 20, 2014 1:31 PM
  • When I start to think what isolation level should you use in system?
    Wednesday, August 20, 2014 1:33 PM
  • OLTP systems usually handles operational activity of the company and serves high volume of short identical queries and short transactions. Data Warehouse blueprint described reporting type system with low volume of long transactions and complex queries. In OLTP systems data is constantly changing, in Data Warehouse systems it rarely the case – data usually updates on the batches based on some schedule.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:46 PM
    Wednesday, August 20, 2014 1:35 PM
  • How much data consistency do I really need?”. And even if the answer “I don’t need any consistency” is quite popular, it rarely the case in the real life. So let’s dive a little bit more in “no-consistency” mode.
    Wednesday, August 20, 2014 1:37 PM
  • No-consistency in terms of transaction isolation levels mean read uncommitted. Either directly as “set transaction isolation level” or with (NOLOCK) hints. There are some cases when you can decide to use that isolation level – for example, in our system we have a few transaction entities where data has been inserting to our system and never ever updating/deleting after that. Same time, clients are constantly downloading the data and we don’t really care if clients get the data from uncommitted transaction.

    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:45 PM
    Wednesday, August 20, 2014 1:37 PM
  • What is optimistic isolation levels ?

    Wednesday, August 20, 2014 1:43 PM
  • Data Warehouse type systems – use them. I don’t think about any single reason why you would like to avoid them in such systems. For OLTP – consider them. If you can live with performance overhead – it could be the good choice. Be careful – don’t forget about extra 14 bytes and don’t use fillfactor = 100 though.

    So, the bottom line. For Data Warehouse – use optimistic isolation levels whenever possible. Only case with Data Warehouse systems when I would suggest to consider different options is when data in the system updates on the real time. And even in such case give optimistic isolation levels the try. For OLTP – if you can use optimistic isolation levels – use them (start with read committed snapshot). If not, use read committed and optimize the queries. Don’t use read uncommitted and (nolock) hints unless you don’t care at all about consistency. And definitely don’t use read uncommitted to reduce blocking. This is the bad choice.


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    • Marked as answer by David Gutter Wednesday, August 20, 2014 1:45 PM
    Wednesday, August 20, 2014 1:44 PM
  • Thanks ahsan for your support
    Wednesday, August 20, 2014 1:47 PM