Big Data (MongoDB, HBase and Cassandra)

In: Computers and Technology

Submitted By popkid
Words 3463
Pages 14
Big Data

[Name of Writer]

[Name of Institution]

Introduction

The term Big Data is gaining followers and popularity. Despite this evident trend, not all organizations are clear about how to face the challenge of storing, organizing, displaying and analyzing large volumes of data. Several database storage technologies can hold petabytes, exabytes and perhaps zettabytes of data. Among these options are Cassandra, MongoDB and HBase. We will discuss them one by one, following a proper research method, and compare them in order to contrast their differences and efficiency.

Research Background

One problem in understanding the phenomenon is that the volume of these data sets greatly exceeds what a data warehouse handles. A plane collects 10 terabytes of sensor information for every 30 minutes of flight, while the New York Stock Exchange collects 1 TB of structured information per day.

In the context of Big Data, volumes are reaching petabytes and exabytes, and soon zettabytes. For instance, Apple has just announced that 7 trillion notifications are sent daily to iOS devices. The explosion of information in social networks, blogs, and emails is characterized by the presence of "unstructured" and "semi-structured" data, in contrast with the "structured" data that is commonly handled in the data warehouse.
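To put these units in perspective, the following is a minimal Python sketch that converts between them and applies the rates quoted above. It assumes decimal (SI) units, where each step is a factor of 1,000, and a constant collection rate with no growth; the function names `convert` and `days_to_accumulate` are illustrative and do not come from the paper.

```python
# Decimal (SI) byte units. Binary units (TiB, PiB, ...) would use powers of 1024 instead.
BYTES_PER_UNIT = {
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}


def convert(value, from_unit, to_unit):
    """Convert a data volume between two units listed in BYTES_PER_UNIT."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


def days_to_accumulate(target, target_unit, daily, daily_unit):
    """Days needed to reach `target` at a constant daily rate (assumption: no growth)."""
    return convert(target, target_unit, daily_unit) / daily


# Rates quoted in the text: 1 TB per day, and 10 TB per 30 minutes of flight.
tb_per_flight_hour = 10 * (60 / 30)

print("TB in one PB:", convert(1, "PB", "TB"))
print("Days for a 1 TB/day source to reach 1 PB:", days_to_accumulate(1, "PB", 1, "TB"))
print("TB collected per flight hour:", tb_per_flight_hour)
```

Under these assumptions, a source that collects 1 TB per day needs about 1,000 days to accumulate a single petabyte, which gives a sense of how far apart the units are.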

However, the concept of Big Data makes sense only once not just the volume but also the speed and variety of the data exceed the capacity of traditional IT systems to turn it into information of value for decision making. This last feature, value, is the key…...

Similar Documents

Big Data

...McKinsey Global Institute June 2011 Big data: The next frontier for innovation, competition, and productivity The McKinsey Global Institute The McKinsey Global Institute (MGI), established in 1990, is McKinsey & Company’s business and economics research arm. MGI’s mission is to help leaders in the commercial, public, and social sectors develop a deeper understanding of the evolution of the global economy and to provide a fact base that contributes to decision making on critical management and policy issues. MGI research combines two disciplines: economics and management. Economists often have limited access to the practical problems facing senior managers, while senior managers often lack the time and incentive to look beyond their own industry to the larger issues of the global economy. By integrating these perspectives, MGI is able to gain insights into the microeconomic underpinnings of the long-term macroeconomic trends affecting business strategy and policy making. For nearly two decades, MGI has utilized this “micro-to-macro” approach in research covering more than 20 countries and 30 industry sectors. MGI’s current research agenda focuses on three broad areas: productivity, competitiveness, and growth; the evolution of global financial markets; and the economic impact of technology. Recent research has examined a program of reform to bolster growth and renewal in Europe and the United States through accelerated productivity growth; Africa’s economic......

Words: 60035 - Pages: 241

Big Data

...A New Era for Big Data COMP 440 1/12/13 Big Data Big Data is a type of new era that will help the competition of companies to capture and analyze huge volumes of data. Big data can come in many forms. For example, the data can be transactions for online stores. Online buying has been a big hit over the last few years, and people have begun to find it easier to buy their resources. When the transactions go through, the company collects logs of data to help the company increase their marketing production line. These logs help predict buying patterns, age of the buyer, and when to have a product go on sale. According to Martin Courtney, “there are three V's of big data which are: high volume, high variety, high velocity and high veracity.” There are other sites that use big volumes of data as well. Social networking sites such as Facebook, Twitter, and YouTube are among the few. There are many sites that you can share objects to various sources. On Facebook we can post audio, video, and photos to share amongst our friends. To get the best out of these sites, the companies are always doing some type of updating to keep users wanting to use their network to interact with their friends or community. Data is changing all the time. Developers for these companies and other software have to come up with new ways of how to support new hardware to adapt. With all the data in the world, there is a better chance to help make decision making better. More and more information...

Words: 474 - Pages: 2

Big Data

...BUS211f(2) ANALYZING BIG DATA I1 Spring 2014, MW 8:00–9:20 am Location: Sachar 116 (International Hall) Prof. Bharatendra Rai 313-282-8309 (mobile) brai@brandeis.edu Office: Sachar 1C Hours: MW, 9:30 – 10:15 and by appointment TA: TBD This is a two credit module that examines the opportunities and industry disruption in an era of massive, high velocity, unstructured data and new developments in data analytics. We treat some strategic, ethical, and technical dimensions of big data. The technical foci of the course include data structures, data warehousing, Structured Query Language (SQL), and high-impact visual displays. The principal objective of the course is to help students build understanding of data as an essential competitive resource, and acquire advanced computer skills through cases and hands-on applications. Assignments and classroom time will be devoted both to analysis of current developments in analytics and to gaining experience with current tools. Davenport, Thomas H. and Harris, Jeanne G. Competing on Analytics: The New Science of Winning. Cambridge: Harvard Business School Press, 2007. ISBN 978-1422103326. Available for purchase at the bookstore. There is a required on-line course pack available for purchase at the Harvard Business Publishing website at this URL: http://cb.hbsp.harvard.edu/cbmp/access/23455671 This link is also available on LATTE. See last page of Syllabus for course pack contents. Other readings as posted on LATTE site. Learning......

Words: 2130 - Pages: 9

Big Data

...examine the definition of big data. It also seeks to examine the components of a Unified Data Architecture and its ability to facilitate the analysis of big data. 2 WHAT IS BIG DATA Cuzzocrea, Song and Davis (2011) defined big data in part as being “enormous amounts of unstructured data produced by high-performance applications falling in a wide and heterogeneous family of application scenarios”. In recent years there has been an increasing interest and focus on big data. Many and varied definitions have been proposed but without a consensus on a single definition. The MIT Technology Review (2014), brought attention to the work of Ward and Barker (2014) which examined a number of definitions of big data that have attracted some general ICT industry support from leading ICT industry analysts and organisations such as Gartner, Oracle and Microsoft. In their work they proposed to provide a “concise definition of an otherwise ambiguous term”. The author having just attended a digital government conference with a large proportion of big data tagged presentations also noted that no single definition was offered. There was however a common content theme that supported the Ward and Barker definition of: “Big data is a term describing the storage and analysis of large and or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning.” 3 UNIFIED DATA ARCHITECTURE 3.1 WHAT IS THE UNIFIED DATA ARCHITECTURE? The......

Words: 579 - Pages: 3

Big Data

...The Big Data Challenges By Jamia Yant April 19th, 2012 Introduction When Volvo separated from Ford in 2010, it was breaking free from an IT infrastructure that consisted of a tangle of different systems and licenses. The need was there to develop a new stand alone IT infrastructure that could provide better Business Intelligence, boost communication capabilities and enrich collaborations. Volvo Car Corporation Integrates the Cloud into Its Networks The ability to collectively harness the wealth of data being mined was invaluable. Volvo collects terabytes of data from embedded sensors in their cars, from their customer relationship management (CRM) systems, from dealerships, product development and design systems and from their production/factory floors. Volvo then, via the cloud, transfers and archives this Big Data to its Volvo Data Warehouse where it can be stored for Long Term Archival and Retrieval or it can be accessed by Volvo’s employees. In 2010, Volvo stretched across eight main business units and twelve support areas with production plants in 19 countries. The platform used to link employees at the business units, support and production plants together are done via Volvo’s cloud with Saas software as a user interface and display. They have employee web portals, as well as supplier and vendor web portals to improve collaboration. Volvo has a high-performance infrastructure that includes parallel multi-processing, high-speed networking,......

Words: 945 - Pages: 4

Big Data

...What is Big Data? Why Big Data? When is Big Data really a problem? 'Big-data' is similar to 'Small-data', but bigger. Having bigger data consequently requires different approaches (techniques, tools and architectures) to solve new problems, and old problems in a better way. From “Understanding Big Data” by IBM. Key enablers for the growth of “Big Data” are: increase of storage capacities; increase of processing power; availability of data. NoSQL databases: MongoDB, CouchDB, Cassandra, Redis, BigTable, HBase, Hypertable, Voldemort, Riak, ZooKeeper. MapReduce: Hadoop, Hive, Pig, Cascading, Cascalog, mrjob, Caffeine, S4, MapR, Acunu, Flume, Kafka, Azkaban, Oozie, Greenplum. Storage: S3, Hadoop Distributed File System. Servers: EC2, Google App Engine, Elastic Beanstalk, Heroku. Processing: R, Yahoo! Pipes, Mechanical Turk, Solr/Lucene, ElasticSearch, Datameer, BigSheets, Tinkerpop. Big Data is really a problem when the operations on data are complex: simple counting, for example, is not a complex problem, but modeling and reasoning with data of different kinds can get extremely complex. Good news about big data: often, because of the vast amount of data, modeling techniques can get simpler (e.g. smart counting can replace complex model-based analytics), as long as we deal with the scale. Research areas (such as IR, KDD, ML, NLP, SemWeb, …) are subcubes within the data......

Words: 754 - Pages: 4

Big Data

...Introduction to Big Data Every day, 2.5 quintillion bytes of complex, ever-changing data are generated. (IBM) Data comes from social sites, digital images, transaction records, and countless unknown resources. The amount of data we generate daily is enormous, and the rate at which it is being generated is accelerating. As we head into a future where technology dominates the global market, this pace will only continue to accelerate. Businesses and other entities are aware of this data and its power. In a survey taken by Capgemini and the Economist, over 600 global business leaders identified their companies as data driven and identified data analytics as an integral part of their business. Big Data solutions are considered the answer for handling this data and converting it into useful information. According to the O'Reilly Radar Team (Big Data Now), Big Data consists of three variables: size, velocity and variety. Data is considered big if conventional systems cannot handle its size. It is not only the size of Big Data that matters, but also the volume of transactions that come with it. The second issue is how fast the data is generated and how fast it changes (velocity). New and updated data is constantly generated, and it must be processed and analyzed quickly to create real value for an organization. The final issue is data structure (variety). Data is typically collected in raw form, unstructured, from a variety of sources. To acquire useful information, data needs to be......

Words: 2909 - Pages: 12

Big Data

...era of ‘big data’? Brad Brown, Michael Chui, and James Manyika Radical customization, constant experimentation, and novel business models will be new hallmarks of competition as companies capture and analyze huge volumes of data. Here’s what you should know. The top marketing executive at a sizable US retailer recently found herself perplexed by the sales reports she was getting. A major competitor was steadily gaining market share across a range of profitable segments. Despite a counterpunch that combined online promotions with merchandizing improvements, her company kept losing ground. When the executive convened a group of senior leaders to dig into the competitor’s practices, they found that the challenge ran deeper than they had imagined. The competitor had made massive investments in its ability to collect, integrate, and analyze data from each store and every sales unit and had used this ability to run myriad real-world experiments. At the same time, it had linked this information to suppliers’ databases, making it possible to adjust prices in real time, to reorder hot-selling items automatically, and to shift items from store to store easily. By constantly testing, bundling, synthesizing, and making information instantly available across the organization— from the store floor to the CFO’s office—the rival company had become a different, far nimbler type of business. What this executive team had witnessed first hand was the gamechanging effects of big data. Of......

Words: 3952 - Pages: 16

Big Data

...2014 SUBJECT: “Big Data” Introduction The purpose of this report is to present the technology issue of big data. In this memo we shall discuss what exactly big data is, how it applies to the accounting field, why it’s an issue for concern, and our recommendations as to how best to respond to the issue. What is Big Data? A truly succinct definition of big data, one that encompasses the entirety of the issue and that everyone can agree on, is something that right now doesn’t exist. Many people have many slightly different ways of describing just what big data is, so in order to get an accurate idea of the entire scope of what this term means you’d need to look to multiple sources to get the whole picture. According to information gathered from the SAS Institute and Forbes magazine, the following definition is formed: * Big Data is a collection of both unstructured and structured data gathered from traditional, non-traditional, digital, numerical, and many other such sources inside and outside the company. All of this data forthcoming from the sources listed represents a source for ongoing discovery and analysis. (Arthur, L. 2013 and SAS Institute, Inc. n.d.) Big data comes in many different forms including video, audio, or simple text. In fact, even social media content like someone’s tweets on Twitter is included under the banner of big data. To further define the broad discussion of what big data is, many industry experts have looked to the “Three Vs of big data”:......

Words: 1416 - Pages: 6

Big Data and Analytics Developer

...Mansour Big Data and Analytics Developer at OMS ahmedelmasry_60311@hotmail.com Summary Working in Big Data & Analytics (2014 - Present). Working in Business Intelligence (IBM Cognos) (2013 - Present). Working in ERP & Data manipulation (Oracle & Asp.net) (2011 - 2013). Skills (Pivotal HD (Hadoop),Oracle, Sql Server, MongoDB, Asp.net, JavaScript, Node.js, C#). Training (Pivotal HD Hadoop training). Master's Degree in Informatics at Nile University (2014-2016) Graduated from Faculty of Science, Cairo University (2011). Awarded (YIA) The Young Innovator Award (2010). Experience Big Data and Analytics Developer at OMS April 2015 - Present (1 month) Developing and analysis Big Data using Hadoop framework (Pivotal HD & Hawq), Hadoop Eco-System Co-Founder and Data Analyst at AlliSootak September 2010 - Present (4 years 8 months) Developing and Researcher Senior Software Developer at Fifth Dimension (5d) October 2014 - April 2015 (7 months) Senior Software Developer at Bizware August 2013 - October 2014 (1 year 3 months) Developing 2 recommendations available upon request Director of Special Projects at CIT Support May 2012 - January 2014 (1 year 9 months) Ensure that the client's requirements are met, the project is completed on time and within budget and that everyone else is doing their job properly. Senior Software Developer at I-Axiom Cloud ERP Solutions November 2011 - August 2013 (1 year 10 months) Developing Certifications The Data......

Words: 840 - Pages: 4

Big Data

...Summary This article “Big Data” is talking about how data could improve the company’s performance and competition. That means we should measure the business with more data at first, and then involve the knowledge to improve our decision, which I do agree with. Big data gives traditional businesses chances to transform their old way to newer, profitable way, like on-line business, which is much more powerful than the past. And analyzing of big data will also change long-standing ideas about the value of experience, the nature of expertise, and the practice of management, which is very meaningful for developing business in the future---the high technology world. What’s more, the article points out that there are three major differences between analytics and Big data. Firstly, Big data’s volume is much bigger than analytics, which will give companies information about the decision. Secondly, the speed of spreading of data in these days is a more important criterion for a company to be much more agile than its competitors. Finally, Big data could spread through a lot different ways, such as social networks, sensors, and even cell phones. All the data in each way will update every second to everywhere. In sum, these differences make big data become a huge, fast and variety resource for business, even everyday life. So we should take big data seriously. This article also figures out that the importance to be a data-driven company, which will perform better on objective measures......

Words: 911 - Pages: 4

Big Data

...The Situation of Big Data Technology Yu Liu International American University BUS 530: Management Information Systems Matthew Keogh 2015 Summer 2 - Section C Introduction In this paper, I will list the main technologies related to big data. According to the life cycle of the data processing, big data technology can be divided into data collection and pre-processing, data storage and management, data analysis and data mining, data visualization and data privacy and security, and so on. The reason I select topic about big data My major is computer science and I have taken a few courses about data mining before. Nowadays more and more job positions about big data are showing at job seeking website, such as Monster.com. I am planning to learn some mainstream big data technologies like Hadoop. Therefore, I choose big data as my midterm paper topic. Big data in Google Google's big data analytics intelligence applications include customer sentiment analysis, risk analysis, product recommendations, message routing, customer losing prediction, the classification of the legal copy, email content filtering, political tendency forecast, species identification and other aspects. It is said that big data will generate $23 million every day for Google. Some typical applications are as follows: Based on MapReduce, Google's traditional applications include data storage, data analysis, log analysis, search quality and other data analytical applications. Based on Dremel......

Words: 1405 - Pages: 6

Big Data

...Big Data is Scaling BI and Analytics How the information surge is changing the way organizations use business intelligence and analytics Information Management Magazine, Sept/Oct 2011 Shawn Rogers The explosive growth in the amount of data created in the world continues to accelerate and surprise us in terms of sheer volume, though experts could see the signposts along the way. Gordon Moore, co-founder of Intel and the namesake of Moore's law, first forecast that the number of transistors that could be placed on an integrated circuit would double year over year. Since 1965, this "doubling principle" has been applied to many areas of computing and has more often than not been proven correct. When applied to data, not even Moore's law seems to keep pace with the exponential growth of the past several years. Recent IDC research on digital data indicates that in 2010, the amount of digital information in the world reached beyond a zettabyte in size. That's one trillion gigabytes of information. To put that in perspective, a blogger at Cisco Systems noted that a zettabyte is roughly the size of 125 billion 8GB iPods fully loaded. As the overall digital universe has expanded, so has the world of enterprise data. The good news for data management professionals is that our working data won't reach zettabyte scale for......

Words: 2481 - Pages: 10

Big Data

...No one can deny how important big data is to our business world right now. Big data is transforming the way individuals within organizations work together. It is turning out to be a cultural mindset, not just a technology tool, in which business and IT decision makers must join forces to realize the maximum value from all data. Outcomes and insights from big data can enable all individuals in a corporation to make better decisions through real-time data analytics, which deepens customer engagement by adding value for end customers, optimizing technical and non-technical operations, preventing possible threats and fraud, and capitalizing on new sources of revenue. Under the umbrella of the world’s globalization and glocalization, escalating demand requires fundamentally new, high-quality data handling and analytics approaches to architecture, tools and practices. Through some job experience I had recently in the field of supply chain management, I found out that big data is the backbone of a successful SCM network. Big data is optimizing supply chain networks with greater data accuracy, clarity, and insights, leading to more visionary contextual intelligence shared across supply chains. It’s a fact that manufacturers nowadays have to orchestrate 80% or more of their supplier network activity outside their silo functional walls, using big data and cloud-based technologies to get beyond the constraints of old legacy Enterprise Resource Planning (ERP) and Supply Chain......

Words: 385 - Pages: 2

Big Data

...I. Big data emerging factor in IT area A. World’s notice for big data An appearance of tablet PC and social media was the hottest issue in IT market in last year. There are some successful global companies that go along the trends although it is not that long period since they appeared in the world, such as Apple, Google, Facebook, and Twitter. They have something in common. That is, they are based on ‘Big Data’ technology. As a result of using ‘big data’, the amount of stored data by their big data system during 2012 is much more than that of data which had been produced and stored until 2011. It helps to solve several problems in the company. Due to the geometrical increase of the amount of data, the important of big data will be continuous. Big data is selected as one of noticeable keyword in 2013 IT area with mobility, social, and cloud. It will be main factor of growth of IT infrastructure in the medium to longer term and is expected to provide new strategic superiority for many companies. It is highly acclaimed at the domestic market and also the foreign market. Several successful cases of applying big data shows that it can be positive factor helping to recover global economy. Moreover, it is not limited to IT-related business but the introduction in various areas will create value. B. Background of emerging big data In fact, there are many efforts to extract meaningful information through collection and analysis of huge amount of data. Through this......

Words: 2394 - Pages: 10