NYC Data Science Corporate blog

What Interested Us the Week of February 16, 2015

Written by manager | Feb 21, 2015 5:56:28 PM

Big Data: Telling The Story Of Falling Oil Prices

Jiayu Peng, 2/20/15

The fall of oil prices is one of the most prominent topics in the world today, so any data enthusiast will be curious to see what big data can tell us about the oil market.

Recently, the Rebaie Analytics Group analyzed thousands of news articles that mention oil and gas, published over a six-month period spanning the time before and after the fall of oil prices. They used data from GDELT, which monitors broadcast, print, and web news around the world.

Based on these data, the research group constructed a network diagram that shows the “communities” of conversation around “Oil & Gas”. Using network data-mining techniques, they identified significant influencers on the oil price and the people with whom they are most closely connected. These results provide evidence for further interpretation.
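To make the idea of a conversation network concrete, here is a minimal sketch in Python. It is not the Rebaie Analytics Group's method, which the article does not describe in detail. It assumes a toy, hypothetical list of articles, links any two names mentioned in the same article, uses connected components as a crude stand-in for "communities", and ranks "influencers" simply by how many distinct people they are linked to (degree centrality).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each set holds the names mentioned together in one article.
articles = [
    {"Analyst A", "Minister B", "CEO C"},
    {"Minister B", "Trader D"},
    {"Reporter E", "Reporter F"},
]

def build_graph(articles):
    """Link every pair of names that appear in the same article."""
    graph = defaultdict(set)
    for names in articles:
        for a, b in combinations(sorted(names), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def communities(graph):
    """Group names into connected components (a crude proxy for communities)."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups

def influencers(graph):
    """Rank names by number of distinct connections, most connected first."""
    return sorted(graph, key=lambda name: (-len(graph[name]), name))

graph = build_graph(articles)
groups = communities(graph)
ranking = influencers(graph)

print("Communities:", groups)
print("Most connected:", ranking[0])
```

A real analysis would work on far larger data and would likely use a dedicated graph library with proper community-detection and centrality algorithms. The sketch only shows the shape of the computation.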

Reference:
Big Data: Telling the Story of Falling Oil Prices

IBM, G.E. and Others Create Big Data Alliance

Jiayu Peng, 2/20/15

A key element of the big data business is getting what much of computer technology secretly craves: Normality.

A new big data alliance, named the "Open Data Platform", has formed to develop products based on a common core of Hadoop's key components. Its members, including GE, Hortonworks, IBM, Infosys, Pivotal, and SAS, announced a common set of standards for Hadoop.

Hadoop is perhaps the most widespread framework for distributing, managing and processing big data. However, the technology has been somewhat difficult to use, and there are concerns that the growing use of different variants of Hadoop, even ones that differ only slightly, could slow down the market. It is therefore good news that large companies have teamed up and signed on to common standards for Hadoop.

Reference:

IBM, G.E. and Others Create Big Data Alliance

Tech Companies Unite Open Data Platform

 

Oracle's New Products Aim to Combine Big Data from Multiple Sources

Jiayu Peng, 2/20/15

Oracle announced four new products on Thursday, targeting one of the core challenges in big data efforts: combining data from multiple sources.

Oracle Big Data Discovery, for example, is designed to serve as the "visual face of Hadoop" for business users. With an interface intended to offer an experience as familiar as shopping online, it lets users not just find and explore data from across multiple sources but also analyze it and share the results, all from a single tool.

Another new product is called "GoldenGate for Big Data", a Hadoop-based tool that allows users to stream real-time, unstructured data from heterogeneous transactional systems into big-data systems including Apache Hadoop, Apache Hive, Apache HBase and Apache Flume.

"Oracle gives customers an integrated platform that helps simplify access to all their data, discover new insights, predict outcomes in real time, and keep all their data governed and secure," said Neil Mendelson, vice president of big data at Oracle.

Reference:
Oracle Steps Up Its Big Data Push with New Products

 

Internet of DNA: Medicine’s Next Great Advance

Jiayu Peng, 2/20/15

In January, programmers in Toronto began testing a system for trading genetic information with other hospitals. These facilities, in locations including Miami, Baltimore, and Cambridge, U.K., also treat children with so-called Mendelian disorders, which are caused by a rare mutation in a single gene. The system, called MatchMaker Exchange, represents something new: a way to automate the comparison of DNA from sick people around the world.

Connecting DNA databases would be beneficial: if a global network of millions of genomes were established, everyone's medical treatment could draw on the experiences of millions of others. However, technical obstacles still prevent genomic data from being shared across the web. For example, there are no standard protocols, application programming interfaces (APIs), or file formats for DNA.

Fortunately, scientists are targeting these issues, and the MatchMaker Exchange system is a breakthrough. If successfully built, the Internet of DNA could be medicine’s next great advance.

Reference:
Internet of DNA

Governments Must Embrace IoT for Smart Cities

Bob Violino,  2/18/15

Mr. Violino closes his article with its central message, a quote from Ruthbea Yesner Clarke, director of the Smart Cities Strategies program at IDC Government Insights: "The Internet of Things is an emerging reality, and U.S. cities and states cannot avoid the ramifications of new IP-enabled and connected devices and their potential impact on the delivery of government services and on the quality of life of citizens."
 
The idea behind smart cities is to improve quality of life in a number of areas. Smart cities can mean less traffic, better EMS response, reduced greenhouse gas emissions, and generally better service to the community. Getting there would involve a combination of technologies, including cloud, mobile, social networks and big data/analytics. Cities can realize a return on their investment in the form of lower costs as well as an increase in public good.
 
The main stumbling block is a lack of awareness of what IoT can mean for a city, along with a lack of experience in this area. "Many department leaders have specific problems they’d like to solve that would be a fit for IoT, but they’re not clear on what IoT means in practical terms, what are specific use cases, and what other cities have already tested and tried," Mr. Violino writes. IoT is coming to cities, but it will take a while to catch on.
 
