Australian Innovation Shines at BigInsights Data Innovation Awards

Australia’s Federal Assistant Minister for Industry, Innovation & Science, the Hon Craig Laundy MP, and the NSW Minister for Innovation & Better Regulation, the Hon Victor Dominello, presented the 'BigInsights Data Innovation Awards 2016' on December 6, 2016, at the University of Sydney.

Attended by 150 industry professionals, the BigInsights Data Innovation Awards recognise teams & end users that are doing ground-breaking work using Data Analytics & IoT to deliver business outcomes. Hitachi Consulting and University of Technology Sydney were the key sponsors.

The Awards attracted interest from over 35 organisations, from which 18 entries were received. The entries were judged by a team of independent industry experts, who conferred six Awards across the five categories on entries that clearly demonstrated best practices in developing and deploying Analytics or IoT techniques.

The winners of the 2016 BigInsights Data Innovation Awards were:

Best Industry Application of Data Analytics - Ambiata Pty Ltd

Scalable Machine Learning and Experimentation System for Personalised Advertising

Best Industry Application of IoT - Internet of Light Technologies Pty Ltd

Light Net - creating intelligent buildings and smart cities through a new global communications network of connected intelligent lighting

Best Industry Application of AI/Cognitive - Strategos Pty Ltd

Stratejos - an artificial team assistant that helps teams know the ...

Read More on Datafloq
Yep, I’m Writing a Book on Modern Data Management Platforms

Over the past couple of years, I have spent lots of time talking with vendors, users, consultants, and other analysts, as well as plenty of people from the data management community about the wave of new technologies and continued efforts aimed at finding the best software solutions to address the increasing number of issues associated with managing enterprise data. In this way, I have gathered much insight on ways to exploit the potential value of enterprise data through efficient analysis for the purpose of “gathering important knowledge that informs better decisions.”

Many enterprises have had much success in deriving value from data analysis, but a larger number of these efforts have failed to achieve many, if any, useful results. Still other users are struggling to find the right software solution for their business data analysis needs, perhaps confused by the myriad solutions emerging nearly every single day.

It is precisely in this context that I’ve decided to launch this new endeavor and write a book that offers a practical perspective on those new data platform deployments that have been successful, as well as practical use cases and plausible design blueprints for your organization or data management project. The information, insight, and guidance that I will provide are based on lessons I’ve learned through research projects and other efforts examining robust and solid data management platform solutions for many organizations.

In the following months, I will be working hard to deliver a book that serves as a practical guide for the implementation of a successful modern data management platform.
The resources for this project will require crowdfunding efforts, and here is where your collaboration will be extremely valuable.
There are several ways in which you can participate:

  • Participating in our Data Management Platforms survey (to obtain a nice discount right off the bat)
  • Pre-ordering the book (soon, I’ll provide you with details on how to pre-order your copy, but in the meantime, you can show your interest by signing up at the link below)
  • Providing us with information about your own successful enterprise use case, which we may use in the book

To let us know which of these options best fits with your spirit of collaboration, and to receive the latest updates on this book, as well as other interesting news, you just need to sign up to our email list here. Needless to say, the information you provide will be kept confidential and used only for the purpose of developing this book.

In the meantime, I’d like to leave you with a brief synopsis of the contents of this book, with more details to come in the near future:

New Data Management Platforms

Discovering Architecture Blueprints

About the Book

What Is This Book About?

This book is the result of a comprehensive study into the improvement, expansion, and modernization of different types of architectures, solutions, and platforms to address the need for better and more effective ways of dealing with increasing and more complex volumes of data.

In conducting his research for the book, the author has made every effort to analyze in detail a number of successful modern data management deployments as well as the different types of solutions proposed by software providers, with the aim of providing guidance and establishing practical blueprints for the adoption and/or modernization of existing data management platforms.
These new platforms have the capability of expanding the ability of enterprises to manage new data sources—from ingestion to exposure—more accurately and efficiently, and with increased speed.

The book is the result of extensive research conducted by the author examining a wide range of real-world, modern data management use cases and the plethora of software solutions offered by various software providers that have been deployed to address them. Taking a software vendor‒agnostic viewpoint, the book analyzes what companies in different business areas and industries have done to achieve success in this endeavor, and infers general architecture footprints that may be useful to those enterprises looking to deploy a new data management platform or improve an already existing one.

Inside Industry 4.0: What’s Driving The Fourth Industrial Revolution?

Industry 4.0 is much more than just another buzzword.

The term Industry 4.0 refers to the fourth industrial revolution and comprises growing trends in automation, the internet of things, big data and cloud computing technologies. Just like the steam power, electricity and digital automation of the past, cyber-physical systems will create the factory of the future: the smart factory.

Originating in Germany as part of a government strategy for the computerization of factories, this is a revolution that will spread across industries globally. It is predicted that the adoption of Industry 4.0 will benefit production due to increased connectivity across entire businesses as manual factories are transformed into smart factories.

Cyber-physical systems provide factories with increased connectivity between management level and the production floor. They’re able to monitor the manufacturing process in real time and make decentralized decisions based on data fed back through these networked machines. The autonomous networking of machines and systems, along with the inclusion of big data analytics, could help predict maintenance issues or system failures and react to them accordingly, saving valuable time and money for companies. These technologies are also revolutionising the way things are designed, the demand for mass production and even product lifecycles.
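
As a rough illustration of how data fed back from networked machines might be used to spot a maintenance issue early, here is a minimal sketch. It is only one simple approach (a rolling z-score check against the recent baseline), not a method described in this article. The function name, the window size, the threshold and the vibration readings are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate strongly from the recent baseline.

    Hypothetical helper: compares each reading with the mean and standard
    deviation of the `window` readings that came just before it.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # A perfectly flat baseline has no spread, so skip it to avoid dividing by zero.
        if sigma == 0:
            continue
        z_score = abs(readings[i] - mu) / sigma
        if z_score > threshold:
            flagged.append(i)
    return flagged

# Hypothetical vibration readings from one machine, with a spike at index 7.
vibration = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.0, 4.8, 1.1, 1.0]
print(flag_anomalies(vibration))
```

In a real smart factory the flagged index would feed a maintenance workflow rather than a print statement, and the threshold would need tuning per machine.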

But what ...

How Is Big Data Helping Humanitarian Crises

The world today is more connected than ever before. 46.1% of the world population is online; that’s 3.4 billion internet users. Each of these users typically has 5.54 social media accounts. We generate on average 303 million tweets a day and 60 billion messages a day on Facebook Messenger and WhatsApp. You’ll agree that these numbers are staggering, and we’re constantly adding to this torrent of big data.

We interact with the world around us through our devices and we’ve essentially used big data to make our lives more comfortable: where to eat, what movies to watch, the best routes from A to B. But what if this can be leveraged not only to recommend the best coffee shop on the block, but to show people where there is safe water to drink, or the nearest refuge centre?

And there is more. Could we use the unprecedented levels of data we’ve generated to actually predict crises and prevent violence and conflict in the worst-affected areas?

The use of big data in response to humanitarian and social crises is becoming increasingly important for understanding and tackling the growing number of these issues today. Disasters – both natural and conflict-driven – are chaotic ...

What Has Pokémon Go Got To Do With Big Data?

Last weekend I went to the park to take my baby daughter for a walk, but I knew that it would be a walk with an extra level of intrigue above the usual cooing, dribbling, and occasional uncontrollable bawling. On the previous day I had downloaded Pokémon Go onto my phone, and although I am far “too busy” to be “down with the kids”, I was curious enough to see what all this augmented reality fuss was about.

I wasn’t particularly keen on Pokémon as a child, and I was wondering how it could possibly be so addictive. So (and you all know the story here by now) I started playing…. Do you know, it was a revelation, and not only from a gaming point of view. I could suddenly see the future of Big Data, and it was immersive and immediate.

Two important factors with Big Data are relevance and size. You will only gain quality insights if there is a relevant audience engaging in a relevant activity in a comparable way. The problem with much of Big Data is that it is mostly historical, and much of this “relevant” information is lost. Are we really comparing apples with apples? Was the ...

The Skills and Knowledge You Need to Become a Data Scientist

Want to be part of the country’s hottest new field? Big data is helping businesses all over the world work smarter, not harder, and a lot of people are looking to get in on the action. If you’re one of them, then you’re probably wondering what kinds of skills and training you’ll need to land your dream job. Data scientists aren’t the only players in the big data field, but the position is very appealing for its high salary and interesting description. Is data scientist on your list? There’s a shortage of talent right now, so it’s a great time to jump in—but not everyone’s the right fit. Here are some of the most essential skills and traits you’ll need before starting the job hunt.

An Advanced Degree

Not all, but most, data scientists have degrees beyond a Bachelor’s. Most have at least a Master’s, and often, a PhD. This is due to the complex skills required to be successful in the role. Data scientists’ majors vary, but the most common are Mathematics and Statistics, Computer Science, and Engineering.

Algorithm & General Mathematics Knowledge

Big data is built on algorithms, and data scientists not only need to understand algorithms, they’ll need to be able to create and manipulate ...

Blog 12: Statistics Denial Myth #12, Publications Straw Man

Von Moltke the Elder said, "No plan of battle survives the first contact with the enemy." Similarly, no plan of analysis survives the first contact with the data. — Doug Samuelson

In theory there is no difference between theory and practice. In practice there is. — Yogi Berra

Myth #12: Statistics is defined by the recent publishing activities of statistics professors

Implication: There are these nonstatistical techniques for data analysis, which are needed to cover gaps in statistics

The myth is that the limits of statistics are defined by its academic publications and that there is a massive gap in applied statistical capabilities, which conveniently needs to be addressed by those spreading this myth. This mischaracterization is typified by blogs such as "Data science [analysis] without statistics is possible, even desirable," a claim that could only be made true if data analysis were excluded from data science, and they are not going to do that. Such claimants go on to be utterly incapable of listing anything other than statistics for analyzing data. Their claims are like a dentist writing that we should stop using mathematics because now we can use 'multiplication.'

The harm is that this statistics denial leads to another round of adulterating statistics by excluding Best ...

The 5G Revolution Is Coming — What to Know Before It’s Here

It doesn’t seem long ago that telecom companies were prepping their networks for the next wireless standard of the time: 4G. Here we are on the eve of 5G. About 100 telecom operators around the world have already started preparing for the adoption of 5G technology.

This information comes from the Ericsson Mobility Report, which also estimates there will be 28 billion connected devices around the world by 2021. Nearly 16 billion of that total will be IoT-enabled devices. If the estimates are correct, we will certainly need the new wireless format to keep up. The report also claims that standardization for 5G has already begun, and if things remain on schedule, it will be completed by 2020. But just what is 5G? What does it mean for the rest of us?

5G Will Change the Game

The “G” in 4G and 5G, if you haven’t already figured it out, stands for “generation.” In the 1980s, when commercial wireless phone technology first appeared, the first generation began. The second generation, or 2G, started when phones were able to send text messages between devices.

Eventually, telecom providers moved on to the third generation — or 3G — which allowed people to do just about anything with ...

3 Data Science Methods and 10 Algorithms for Big Data Experts

One of the hottest questions in Information Management now is how to deal with Big Data in all its applications: how to gather, store, secure, and – possibly most importantly – interpret what we collect. Organizations that are able to apply effective data analysis to massive amounts of data gain significant competitive advantages in their industries. 

Organizations no longer question the value of gathering and storing such data but are far more heavily focused on methods to make sense of all the valuable information that data represents. Although security and storage remain critical issues for IT departments, organizations are finding that their commitment to Big Data can’t stop there – they must be able to make sense of their data, to know what data is valid, relevant, and usable, as well as how to use it.

The more data an organization has, the more difficult it is to process, store, and analyze, but conversely, the more data the organization has, the more accurate its predictions can be. Big data also comes with big responsibility: it requires military-grade encryption keys to keep information safe and confidential.

This is where data science comes in. Many organizations, faced with the problem of being ...

How Big Data Enables a Successful Implementation of Demonetization in India

On the evening of 8th November, the people of India were on the receiving end of a big surprise – a surprise with positive intentions, but a big one nevertheless. The Government of India enacted a policy that made ₹1000 and ₹500 notes no longer valid as legal tender.

Just after the announcement, ATMs and cash deposit machines were crowded with people either depositing or taking out their money. After that came the long queues in front of banks and again at ATMs, due to the severe cash crunch and the fixed cap on the amount that could be withdrawn or deposited.

The government made its intentions clear: this was the best way to curb the black money present in the economy and also root out the corruption and terrorist activities that were being boosted by this black money. It is believed that 85% of the currency notes in circulation will be replaced, and this has led to differing views. Here, I’ll let the experts in economics do the talking. But one thing is for sure: without technology this would have been a very hard thing to accomplish.

New age technologies like e-wallets, smartphones, internet of ...

How to Protect Corporate Cloud Storage From Hackers

Some high-profile hacks have come to light recently. It was revealed last month that the private details of over 412 million users of the adult dating and pornography company Friend Finder Networks were hacked in October, making it the largest data breach ever recorded, while it came to light in September that Yahoo! experienced a hack in 2014 in which the accounts of around 500 million users were compromised. These incidents bring back into focus the importance of businesses ensuring their customers’ data is always protected.

More and more businesses are now using the cloud to store their customers’ data. As with all new technologies though, hackers will look to exploit any security vulnerabilities they can find. While the attack on Friend Finder Networks does not appear to have been a cloud-based attack, a number of high-profile attacks have taken place against cloud storage systems in recent years.

In this article, David Midgley, Head of Operations at payment gateway and merchant services provider Total Processing, examines what has made cloud storage vulnerable to attack and how to protect the cloud as much as possible from hackers going forward.

All of the big supermarkets now let you order your groceries online and websites ...

Cloud Data Warehousing for Dummies

If you, like me, are a data warehousing or BI professional, you have probably been wondering how this all fits in the cloud world. You may have even heard of data warehousing "in the cloud". But what does that really mean? What is a cloud data warehouse?
HPE takes aim at customer needs for speed and agility in age of IoT, hybrid everything

A leaner, more streamlined Hewlett Packard Enterprise (HPE) advanced across several fronts at HPE Discover 2016 in London, making inroads into hybrid IT, Internet of Things (IoT), and on to the latest advances in memory-based computer architecture. All the innovations are designed to help customers address the age of digital disruption with speed, agility, and efficiency.

Addressing a Discover audience for the first time since HPE announced spinning off many software lines to Micro Focus, Meg Whitman, HPE President and CEO, said that the company is not only committed to those assets, becoming a major owner of Micro Focus in the deal, but is also building its software investments.

"HPE is not getting out of software but doubling-down on the software that powers the apps and data workloads of hybrid IT," she said Tuesday at London's ExCel exhibit center.

"Massive compute resources need to be brought to the edge, powering the Internet of Things (IoT). ... We are in a world now where everything computes, and that changes everything," said Whitman, who has now been at the helm of HPE and HP for five years.

HPE's new vision: To be the leading provider of hybrid IT, to run today's data centers, and then bridge the move to multi-cloud and empower the intelligent edge, said Whitman. "Our goal is to make hybrid IT simple and to harness the intelligent edge for real-time decisions" to allow enterprises of all kinds to win in the marketplace, she said.

Hyper-converged systems

To that aim, the company this week announced an extension of HPE Synergy's fully programmable infrastructure to HPE's multi-cloud platform and hyper-converged systems, enabling IT operators to deliver software-defined infrastructure as quickly as customers' businesses demand. The new solutions include:
  • HPE Synergy with HPE Helion CloudSystem 10 -- This brings full composability across compute, storage and fabric to HPE's OpenStack technology-based hybrid cloud platform to enable customers to run bare metal, virtualized, containerized and cloud-native applications on a single infrastructure and dynamically compose and recompose resources for unmatched agility and efficiency.
  • HPE Hyper Converged Operating Environment -- The software update leverages composable technologies to deliver new capabilities to the HPE Hyper Converged 380, including new workspace controls that allow IT managers to compose and recompose virtualized resources for different lines of business, making it easier and more efficient for IT to act as an internal service provider to their organization.
This move delivers a full-purpose composable infrastructure platform, treating infrastructure as code, enabling developers to accelerate application delivery, says HPE. HPE Synergy has nearly 100 early access customers across a variety of industries, and is now broadly available. [Disclosure: HPE is a sponsor of BriefingsDirect podcasts.]

This year's HPE Discover was strong on showcasing the ecosystem approach to creating and maintaining hybrid IT. Heavy hitters from Microsoft Azure, Arista, and Docker joined Whitman on stage to show their allegiance to HPE's offerings -- along with their own -- as essential ingredients to Platform 3.0 efficiency.

See more of my HPE Discover analysis on The Cube.

HPE also announced plans to expand Cloud28+, an open community of commercial and public sector organizations with the common goal of removing barriers to cloud adoption. Supported by HPE's channel program, Cloud28+ unites service providers, solution providers, ISVs, system integrators, and government entities to share knowledge, resources and services aimed at helping customers build and consume the right mix of cloud solutions for their needs.

Internet of Things

Discover 2016 also saw new innovations designed to help organizations rapidly, securely, and cost-effectively deploy IoT devices in wide area, enterprise and industrial deployments. These solutions include:
"Cost-prohibitive economics and the lack of a holistic solution are key barriers for mass adoption of IoT," said Keerti Melkote, Senior Vice President and General Manager, HPE. "By approaching IoT with innovations to expand our comprehensive framework built on edge infrastructure solutions, software platforms, and technology ecosystem partners, HPE is addressing the cost, complexity and security concerns of organizations looking to enable a new class of services that will transform workplace and operational experiences."

As organizations integrate IoT into mainstream operations, the onboarding and management of IoT devices remain costly and inefficient, particularly at large scale. Concurrently, the diverse variations of IoT connectivity, protocols and security prevent organizations from easily aggregating data across a heterogeneous fabric of connected things.

To improve the economies of scale for massive IoT deployments over wide area networks, HPE announced the new HPE Mobile Virtual Network Enabler (MVNE) and enhancements to the HPE Universal IoT (UIoT) Platform.

As the amount of data generated from smart “things” grows and the frequency at which it is collected increases, so will the need for systems that can acquire and analyze the data in real-time. Real-time analysis is enabled through edge computing and the close convergence of data capture and control systems in the same box.

HPE Edgeline Converged Edge Systems converge real-time analog data acquisition with data center-level computing and manageability, all within the same rugged open standards chassis. Benefits include higher performance, lower energy, reduced space, and faster deployment times.

"The intelligent edge is the new frontier of the hybrid computing world," said Whitman. "The edge of the network is becoming a very crowded place, but these devices need to be made more useful."

This means that the equivalent of a big data crunching data center needs to be brought to the edge affordably.

Biggest of big data

"IoT is the biggest of big data," said Tom Bradicich, HPE Vice President and General Manager, Servers and IoT Systems. "HPE EdgeLine and [partner company] PTC help bridge the digital and physical worlds for IoT and augmented reality (AR) for fully automated assembly lines."

IoT and data analysis at the edge help companies finally predict the future and head off failures and maintenance needs in advance. And the ROI on edge computing will be easy to prove when factory downtime can be greatly reduced using IoT, data analysis and AR at the edge everywhere.

Along these lines, Citrix, together with HPE, has developed a new architecture around HPE Edgeline EL4000 with XenApp, XenDesktop and XenServer to allow graphically rich, high-performance applications to be deployed right at the edge. They're now working together on next-generation IoT solutions that bring together the HPE Edge IT and Citrix Workspace IoT strategies.

In related news, SUSE has entered into an agreement with HPE to acquire technology and talent that will expand SUSE's OpenStack infrastructure-as-a-service (IaaS) solution and accelerate SUSE's entry into the growing Cloud Foundry platform-as-a-service (PaaS) market.

The acquired OpenStack assets will be integrated into SUSE OpenStack Cloud, and the acquired Cloud Foundry and PaaS assets will enable SUSE to bring to market a certified, enterprise-ready SUSE Cloud Foundry PaaS solution for all customers and partners in the SUSE ecosystem.

As part of the transaction, HPE has named SUSE as its preferred open source partner for Linux, OpenStack IaaS, and Cloud Foundry PaaS.

HPE also put force behind its drive to make high performance computing (HPC) a growing part of enterprise data centers and private clouds. Hot on the heels of buying SGI, HPE has recognized that public clouds leave little room for those workloads that do not perform best in virtual machines.

Indeed, if all companies buy their IT from public clouds, they have little performance advantage over one another. But many companies want to gain the best systems with the best performance for the workloads that give them advantage, and which run the most complex -- and perhaps value-creating -- applications. I predict that HPC will be a big driver for HPE, both in private cloud implementations and in supporting technical differentiation for HPE customers and partners.

Memory-driven computing

Computer architecture took a giant leap forward with the announcement that HPE has successfully demonstrated memory-driven computing, a concept that puts memory, not processing, at the center of the computing platform to realize performance and efficiency gains not possible today.

Developed as part of The Machine research program, HPE's proof-of-concept prototype represents a major milestone in the company's efforts to transform the fundamental architecture on which all computers have been built for the past 60 years.

Gartner predicts that by 2020, the number of connected devices will reach 20.8 billion and generate an unprecedented volume of data, which is growing at a faster rate than the ability to process, store, manage, and secure it with existing computing architectures.

"We have achieved a major milestone with The Machine research project -- one of the largest and most complex research projects in our company's history," said Antonio Neri, Executive Vice President and General Manager of the Enterprise Group at HPE. "With this prototype, we have demonstrated the potential of memory-driven computing and also opened the door to immediate innovation. Our customers and the industry as a whole can expect to benefit from these advancements as we continue our pursuit of game-changing technologies."

The proof-of-concept prototype, which was brought online in October, shows the fundamental building blocks of the new architecture working together, just as they had been designed by researchers at HPE and its research arm, Hewlett Packard Labs. HPE has demonstrated:
  • Compute nodes accessing a shared pool of fabric-attached memory
  • An optimized Linux-based operating system (OS) running on a customized system on a chip (SOC)
  • Photonics/Optical communication links, including the new X1 photonics module, are online and operational
  • New software programming tools designed to take advantage of abundant persistent memory.
During the design phase of the prototype, simulations predicted the speed of this architecture would improve current computing by multiple orders of magnitude. The company has run new software programming tools on existing products, illustrating improved execution speeds of up to 8,000 times on a variety of workloads. HPE expects to achieve similar results as it expands the capacity of the prototype with more nodes and memory.

In addition to bringing added capacity online, The Machine research project will increase focus on exascale computing. Exascale is a developing area of HPC that aims to create computers several orders of magnitude more powerful than any system online today. HPE's memory-driven computing architecture is incredibly scalable, from tiny IoT devices to the exascale, making it an ideal foundation for a wide range of emerging high-performance compute and data intensive workloads, including big data analytics.


HPE says it is committed to rapidly commercializing the technologies developed under The Machine research project into new and existing products. These technologies currently fall into four categories: Non-volatile memory, fabric (including photonics), ecosystem enablement and security.

Martin Banks, writing in Diginomica, questions whether these new technologies and new architectures represent a new beginning or a last hurrah for HPE. He poses the question to David Chalmers, HPE's Chief Technologist in EMEA, and Chalmers explains HPE's roadmap.

The conclusion? Banks feels that the in-memory architecture has the potential to be the next big step that IT takes. If all the pieces fall into place, Banks says, "There could soon be available a wide range of machines at price points that make fast, high-throughput systems the next obvious choice. . . . this could be the foundation for a whole range of new software innovations."

Storage initiative

HPE lastly announced a new initiative to address demand for flexible storage consumption models, accelerate all-flash data center adoption, assure the right level of resiliency, and help customers transform to a hybrid IT infrastructure.

Over the past several years, the industry has seen flash storage rapidly evolve from niche application performance accelerator to the default media for critical workloads. During this time, HPE's 3PAR StoreServ Storage platform has emerged as a leader in all-flash array market share growth, performance, and economics. The new HPE 3PAR Flash Now initiative gives customers a way to acquire this leading all-flash technology on-premises starting at $0.03 per usable Gigabyte per month, a fraction of the cost of public cloud solutions.
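
To make the quoted starting rate concrete, here is a small back-of-the-envelope sketch. The 100 TB capacity is a hypothetical figure, the conversion of 1 TB to 1,000 GB is an assumption, and the function name is made up for illustration. Actual pricing terms would depend on HPE's program details, which are not given here.

```python
RATE_PER_GB_MONTH = 0.03  # USD per usable gigabyte per month, the starting rate quoted above

def monthly_cost(usable_tb, rate=RATE_PER_GB_MONTH, gb_per_tb=1000):
    """Monthly cost in USD for a usable capacity given in terabytes.

    Hypothetical helper: assumes a flat rate and decimal terabytes.
    """
    return usable_tb * gb_per_tb * rate

capacity_tb = 100  # hypothetical usable capacity
per_month = monthly_cost(capacity_tb)
print(f"{capacity_tb} TB: ${per_month:,.2f} per month, ${per_month * 12:,.2f} per year")
```

Under these assumptions, the cost scales linearly with capacity, so doubling the usable terabytes doubles the monthly figure.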

"Capitalizing on digital disruption requires that customers be able to flexibly consume new technologies," said Bill Philbin, vice president and general manager, Storage, Hewlett Packard Enterprise. "Helping customers benefit from both technology and consumption flexibility is at the heart of HPE's innovation agenda."

Whitman's HPE, given all of the news at HPE Discover, has assembled the right business path to place HPE and its ecosystems of partners and alliances squarely at the very center of the major IT trends of the next five years.

Indeed, I’ve been at HPE Discover conferences for more than 10 years now, and this keynote address and the news make more sense for the current and future IT market than any I’ve seen.

How To Protect Your Cloud System From A DDoS Attack

Cloud systems offer a distinct advantage over on-premise legacy solutions. Hosting your content on a local server makes it vulnerable to cyber-attacks. With the resources available today, it is not difficult to launch an attack on systems that are completely on-premise. With the cloud, your content is distributed across a number of servers, sometimes located across the world. This way, even if a server comes under attack, your content is still safe and the backup systems can kick in to keep your services running.

But cloud systems are slowly losing this edge as well. As the recent attack on Dyn shows, services hosted in the cloud are no less susceptible to DDoS attacks than on-premise solutions are. And that is mainly possible because of the cloud itself. DDoS attacks are, in practice, always launched over the cloud. So no matter how distributed your cloud system is, it is theoretically possible to build a cloud network that can overwhelm it through a DDoS attack.

In essence, this is a cat and mouse game between the hosting provider and the attackers. While it may be nearly impossible to build a foolproof system that attackers can never breach, you can always make it harder for them ...

Read More on Datafloq
How to Achieve a Single Customer View in Ecommerce

What is a Single view of the customer?

Until recently, a single view of the customer meant bringing together all of a customer's data and consolidating it into a single record. The driving force behind a single view of the customer within an organization was usually operational. Marketing was merchandise driven and strictly followed quarterly planned marketing calendars, and hence was content with only a half-updated or even outdated view of a customer. In the present scenario, businesses have become multi-channel. Customers now interact across a range of touch points. Marketing has become customer centric and is no longer content with merely an updated view of the customer. Marketers need to know each customer interaction and the intent of each interaction to personalize each communication and ensure its contextual relevance. For businesses today, the single view of the customer means bringing together all of a customer's data, including interactions, transactions and intent.
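As a rough illustration of that definition, here is a minimal Python sketch that folds events from several channels into one record per customer. The event fields (`customer_id`, `kind`, `channel`, `detail`, `intent`) and the sample data are hypothetical, invented for this example; a real system would also have to resolve anonymous visitors to known customers, which this sketch does not attempt.

```python
from collections import defaultdict


def build_single_view(events):
    """Consolidate raw events into one record per customer ID.

    Each event is a dict with hypothetical keys: customer_id, kind,
    channel, detail, and an optional intent.
    """
    views = defaultdict(
        lambda: {"interactions": [], "transactions": [], "intents": set()}
    )
    for event in events:
        view = views[event["customer_id"]]
        if event["kind"] == "transaction":
            view["transactions"].append(event["detail"])
        else:
            # Keep the channel so the touch point is not lost.
            view["interactions"].append((event["channel"], event["detail"]))
        if event.get("intent"):
            view["intents"].add(event["intent"])
    return dict(views)


# Made-up sample events from three different touch points.
events = [
    {"customer_id": "c1", "kind": "interaction", "channel": "web",
     "detail": "viewed product", "intent": "browse"},
    {"customer_id": "c1", "kind": "transaction", "channel": "store",
     "detail": "order 1001", "intent": "purchase"},
    {"customer_id": "c2", "kind": "interaction", "channel": "social",
     "detail": "liked post", "intent": None},
]

views = build_single_view(events)
print(views["c1"])
```

The point of the sketch is only that interactions, transactions and intent end up side by side under one customer key, rather than in separate channel-specific stores.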

Data Challenge

The rise of digital commerce has resulted not just in huge data volumes but also in new types of data that businesses have to deal with. Visitors can browse through the site, view products, and read reviews without providing any personal information. Shoppers can interact with a brand’s social media properties, like posts, comment on them or share within their ...

Read More on Datafloq
How to Control Crime with Big Data and Predictive Analytics

Law enforcement has many highly sophisticated tools available for helping to pin down those who commit crimes. Cyber criminals are more difficult to catch than those who commit theft and assault, and still more difficult is preventing crime before it happens. Big data is proving very useful for filling in these gaps in law enforcement technology, providing insights and detecting anomalies that can help officials reduce crime. Since the potential applications of big data in this field are extensive, let’s take a look at some of the ways the technology is already being used to help control crime.

Facial Recognition

In 2013, facial recognition software did not catch the Tsarnaev brothers following the Boston Marathon bombing. However, after the Paris terrorist attacks in 2015, the technology had become advanced enough to help officials find suspects in the case. The hope is that as facial recognition improves, it can be used to help prevent crime before it occurs, identifying individuals like known terrorists as they approach public places. Baltimore police are also using facial recognition in their work, comparing images with photos in the state’s vehicle records. At this time, it’s unclear how this is being used in law enforcement, ...

Read More on Datafloq
How to Find the Best Tools for Data Collection and Analysis

Most marketers have trouble obtaining accurate data from the tools provided by Google. Inaccurate data can create conflict between the marketing department and management. The local search results might be correct, but the efficacy of the data collection tool is essential. Instead of relying on free tools offered by Google, learn how to find reliable ones.

Internal and external data

When you are running a business that has several branches, managing the data will be different. To know the performance of the various branches, the tool must be able to collect the data and analyse it independently. A tool that cannot handle internal and external data effectively will fail to produce correct results.

Phone conversions

The customers that view your ads might be using other electronic gadgets including mobile phones. In fact, mobile phones are convenient for numerous prospective customers. Most marketers are finding it hard to get precise data since their tools do not recognize the visits and the conversions made through the mobile phones. That will result in inaccuracy of the data.

Avoid free tools

It is advisable in business to look for means of reducing your expenditure. It ...

Read More on Datafloq
The Top 7 Big Data Trends for 2017

It is the end of the year again and a lot has happened in 2016. Google’s AlphaGo algorithm beat Lee Se-dol in the game of Go, Blockchain really took off and governments around the globe are investing heavily in smart cities. As every year, I will provide you with the big data trends for the upcoming year, just as I did for 2014, 2015 and 2016. 2017 promises to be a big year for Big Data. The Big Data hype is finally over and, therefore, we can finally get started with Big Data. That is why I would like to call 2017 the Year of Intelligence. So, which big data trends will affect your organisation in 2017? Let’s have a look at the seven top big data trends for 2017.

1. Blockchain-enabled Smart Contracts: Blockchain 2.0


In 2016, Blockchain took off with a lot of media attention on the distributed technology that will drastically change organisations and societies. Many organisations are exploring Blockchain solutions. The R3 Partnership, which involves over 70 of the largest banks in the world, seeks to invest almost $60 million in the development of their blockchain platform. Although four prominent banks left the consortium, it shows that banks ...

Read More on Datafloq
How Sacrificing Cyber Security Affects Big Data Innovations

Business executives quickly have to learn to prioritize in order to help their company grow. There are only so many resources available and so many hours in the day, so some sacrifices have to be made while the business is being built. Unfortunately, some companies don’t make the best decisions when prioritizing and allocating resources. There are some areas of the business that seem like a chore or an unnecessary expenditure, and many businesses skimp out in these areas—often resulting in problems down the line. Cyber security is one pressing issue that no one wants to deal with, but it’s something that businesses in every industry should be concerned about. Unfortunately, many companies are so laser-focused on innovation that they don’t put any resources into protecting the sensitive data they collect and store. How does this affect the business and its big data innovations down the road?

Prioritizing Innovation Leaves Vulnerabilities

Innovation is fun. Innovation is what sets businesses apart from their competitors, makes them industry leaders, and helps them to attract further funding. Cyber security is not fun. It’s an expenditure that doesn’t have an immediate return on investment (ROI)—and the ROI may never be obvious. However, cyber security is an ...

Read More on Datafloq
5 Ways How Businesses and Consumers can both Reap Rewards of Big Data

A company’s data is commonly considered one of its most important assets, provided it is managed effectively. If data is mismanaged, it will either grow into a business liability or, at the very least, the company will not perform as well as it could have if its information were:

Captured and input properly, with effective spelling and consistent formatting
Classified and indexed effectively
Free of duplicates and dummy data
Secured and only made available to privileged employees
Maintained with a scalable storage system with archiving options

Or, as the saying goes: “garbage in, garbage out.” Since you’re looking for the benefits of big data for business and consumer clients, here are five of them you can use to build a solid business case.

1. Effective customer and prospect profiling

Isn’t it great when you go to a website, like a self-service portal and you are already logged in, and everywhere you navigate, you feel like the website knows who you are, and your preferences for certain products and content types?

Whether you are shopping for clothing for your family, or office furniture for your business, effective data management provides your personalized online shopping experience. The same can be said for the customer experience in the real world. The ability ...

Read More on Datafloq
What is the Blockchain – part 5 – ICOs and DAOs

The Blockchain has the potential to completely change our societies. However, it is still a nascent technology, one which can be difficult to grasp. Therefore, I am writing a series of blog posts on what the blockchain is. This is part 5 and the final blog post of this series. The first blog post was a generic introduction to the blockchain. The second post provided insights into consensus mechanisms and different types of blockchain. The third post offered information on five challenges of the blockchain that need to be solved and the fourth post talked about smart contracts on the blockchain. In this final post, I will dive into two concepts that have enormous potential to radically change our world: ICOs and DAOs.

ICOs – Every Company Its Own Central Bank

Startups have always been looking for funds to invest in their venture to build the next Facebook or Google. However, money is expensive and any startup that raises money has to give a share of the company to the investors. The earlier an investor joins, the higher the risk, the more expensive it becomes. That has been the paradigm for the past decades. Not anymore. Since the rise of the Blockchain, ...

Read More on Datafloq
What You are Too Afraid to Ask About Artificial Intelligence (Part I): Machine Learning

AI is moving at a stellar speed and is probably one of the most complex and present sciences. The complexity here is not meant as a level of difficulty in understanding and innovating (although of course, this is quite high), but as the degree of interrelation with other, apparently disconnected fields.

There are basically two schools of thought on how an AI should be properly built: the Connectionists start from the assumption that we should draw inspiration from the neural networks of the human brain, while the Symbolists prefer to move from banks of knowledge and fixed rules on how the world works. Given these two pillars, they think it is possible to build a system capable of reasoning and interpreting.

In addition, a strong dichotomy is naturally taking shape in terms of problem-solving strategy: you can solve a problem through a simpler algorithm that increases its accuracy over time (the iterative approach), or you can divide the problem into smaller and smaller blocks (the parallel sequential decomposition approach).

To date, there is no clear answer on which approach or school of thought works best, and thus I find it appropriate to briefly discuss major advancements in both pure machine learning techniques (Part I) ...

Read More on Datafloq
FCO-IM at TDWI Roundtable

FCO-IM - Data Modeling by Example

Do you want to attend a presentation about Fully Communication Oriented Information Modeling (FCO-IM) in Frankfurt?
I’m very proud that we, the board of the TDWI Roundtable FFM, have secured Marco Wobben to speak about FCO-IM. In my opinion, it’s one of the most powerful techniques for building conceptual information models. And best of all, such models can be automatically transformed into ERM, UML, relational or dimensional models and much more. So we can gain more wisdom in data modeling overall.

But what is information modeling? Information modeling is making a model of the language used to communicate about some specific domain of business in a more or less fixed way. This involves not only the words used but also typical phrases and patterns that combine these words into meaningful standard statements about the domain [3].

What to Look for in Master Data Management Software

With such a large emphasis on data collection and regulatory compliance, the creation and maintenance of accurate master data has become extremely important for businesses. But does the average business know what to look for when choosing between different software options?

Why Master Data Management Matters

We’re living in a world that’s saturated with data. Everywhere you look, someone or something is collecting numbers, statistics, and figures. And while we – as a business world – have come close to mastering the process of collecting data, most organizations are still coming up short when it comes to sorting, extrapolating, and using data.

Businesses have been told for years that they need to collect data. And now that there are convenient tools that allow even the smallest businesses to cost-effectively gather information, everyone’s doing it. But if you’re collecting data without a plan, all you’re doing is complicating your business. It’s akin to continually buying furniture that you don’t need. Pretty soon, your home will be filled with so much furniture that you can’t actually use any of it.

This is where master data management (MDM) comes into play. MDM is a method of allowing businesses to link all of their critical data and information ...

Read More on Datafloq
How can the Internet of Things Revolutionize Your Relationship with Customers?

Customer relationship management (CRM) software has changed the way companies do business. It has provided increased capability for tracking and measuring not only customer relationships but also relationships within departments and with vendors. Companies can now determine exactly who becomes a customer in response to a marketing campaign or a sales promotion. Combined with the power of the Internet of Things (IoT), the amount of information available and the number of ways to use it to benefit both businesses and consumers are astounding.

The Great Information Exchange

For the IoT to work, consumers have to be willing to disclose personal information. An article in the Harvard Business Review points to the example of Waze, a wildly successful traffic management app that depends on users giving their location information. In exchange, they receive traffic information that saves them time and frustration. Each mobile device relays real-time traffic information to a central hub, which analyzes that data and returns informational messages containing the results back to each connected device.

In an age in which the loss of privacy to technology has become a hot topic, consumers expect real and tangible benefits in exchange for that information. A customer connected to IoT, through a shopping app, ...

Read More on Datafloq
How to Use Lead Scoring Models to Plan Your Communication Strategy

As lead generation methods become increasingly sophisticated, it’s no longer enough for sales teams to generate a lead and pass it on to the marketing department. Sales agents need to be clearer about lead scoring models. Marketers need to analyze and qualify leads based on the lead score and align their strategy depending on the stage of the customer journey where the lead is currently positioned. They also need to tweak their communication strategy accordingly.

This is a short and succinct guide to understanding lead scoring models and how best practices in lead generation and qualifying affect communication at various stages of the sales funnel.

B2B Communication Challenges

When you’re talking B2B, you usually imagine a large organization riddled with red tape. The product value is high, sometimes running into the millions. So it is natural that your customer journey is longer than usual. The complexity doesn’t stop here; consider these two stats:

Currently, 81% of all B2B purchases involve multiple decision makers. (Source: Ledger Bennett DGA)
By 2020 over 85% of B2B purchase decisions will be made without human interaction. (Source: Gartner Research)

It is easy to see that B2B marketing communication is in the middle of a giant transformation. There isn’t enough clarity or ...

Read More on Datafloq
How to Overcome Big Data Analytics Limitations With Hadoop

Hadoop is an open source Apache project that dates back to 2006, with its 1.0 release arriving at the end of 2011. The initial version had a variety of bugs, so a more stable version was introduced in August. Hadoop is a great tool for big data analytics, because it is highly scalable, flexible and cost-effective.

However, there are also some challenges big data analytics professionals need to be aware of. The good news is that new SQL tools are available, which can overcome them.

What are the benefits of Hadoop for Big Data Storage and Predictive Analytics?

Hadoop is a very scalable system that allows you to store multi-terabyte files across multiple servers. Here are some benefits of this big data storage and analytics platform.

Low Failure Rate

The data is replicated across multiple machines, which makes Hadoop a great option for backing up large files. Every time a block of data is written to a node, it is replicated on other nodes in the same cluster (three copies by default in HDFS). Since it is backed up across several nodes, there is a very small probability that the data will be permanently altered or destroyed.
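To see why replication makes permanent loss so unlikely, consider a deliberately simplified Python sketch. It assumes each node fails independently with the same probability, which real clusters only approximate (nodes in the same rack can fail together), and the 1% failure figure is made up for illustration. Under that assumption, a block is lost only if every replica's node fails.

```python
def loss_probability(node_failure_prob: float, replication_factor: int) -> float:
    """Probability that every replica of a block is lost.

    Simplifying assumption: nodes fail independently, each with the
    same probability, so the chance of losing all copies is p ** r.
    """
    if not 0.0 <= node_failure_prob <= 1.0:
        raise ValueError("node_failure_prob must be between 0 and 1")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return node_failure_prob ** replication_factor


# Hypothetical 1% chance that any given node fails in some time window.
for replicas in (1, 2, 3):
    print(replicas, loss_probability(0.01, replicas))
```

Each extra replica multiplies the loss probability by the per-node failure probability, which is why even a small replication factor makes a large difference.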


Hadoop is one of the most cost-effective big data analytics and storage solutions. According to research from Cloudera, it is possible to store ...

Read More on Datafloq
2017: The Year of Smart Prediction for Ecommerce

Ecommerce has been on the rise over the past couple of years. Global total retail sales are expected to be around $22 trillion, 6% higher than last year. And although growth rates are slowing, sales by the year 2020 are expected to reach $27 trillion. Truly, the future is still bright for ecommerce.

However, what is keeping the industry from reaching even greater heights is that a vast majority of ecommerce businesses focus only on key performance indicators, getting only statistics on how they’ve performed over a period of time. These are just numbers and do not exactly yield actionable insights.

So what really has to change so that the ecommerce industry will perform better than expected? What can make ecommerce businesses pick up the pace?

Soon to be Old-school Practices

There are some practices that players in the ecommerce industry have been doing, but they shouldn’t. Some things that worked before just don’t work anymore. Some people just don’t fix things until they’re broken, even habits, practices, and processes.

Online businesses get data from their database of orders, products, baskets, visits, users, marketing campaigns, referring links, keywords, and catalog browsing. Some businesses also tap social data from Facebook, Twitter, and Google. Google Analytics ...

Read More on Datafloq
Meet George Jetson – your new AI-empowered chief procurement officer

The next BriefingsDirect technology innovation thought leadership discussion explores how rapid advances in artificial intelligence (AI) and machine learning are poised to reshape procurement -- like a fast-forwarding to a once-fanciful vision of the future.

Whereas George Jetson of the 1960s cartoon portrayed a world of household robots, flying cars, and push-button corporate jobs -- the 2017 procurement landscape has its own impressive retinue of decision bots, automated processes, and data-driven insights.

We won’t need to wait long for this vision of futuristic business to arrive. As we enter 2017, applied intelligence derived from entirely new data analysis benefits has redefined productivity and provided business leaders with unprecedented tools for managing procurement, supply chains, and continuity risks.

To learn more about the future of predictive -- and even proactive -- procurement technologies, please welcome Chris Haydon, Chief Strategy Officer at SAP Ariba. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: It seems like only yesterday that we were content to gain a common view of the customer or develop an end-to-end bead on a single business process. These were our goals in refining business in general, but today we've leapfrogged to a future where we're using words like “predictive” and “proactive” to define what business function should do and be about. Chris, what's altered our reality to account for this rapid advancement from visibility into predictive -- and on to proactive?

Haydon: There are a couple of things. The acceleration of the smarts, the intelligence, or the artificial intelligence, whatever the terminology that you identify with, has really exploded. It’s a lot more real, and you see these use-cases on television all the time. The business world is just looking to go in and adopt that.

And then there’s this notion of the Lego block: being able to string multiple processes together via an API is really exciting -- that, coupled with the ability to have insight. The last piece, the ability to make sense of big data, either from a visualization perspective or from a machine-learning perspective, has accelerated things.

These trends are starting to come together in the business-to-business (B2B) world, and today, we're seeing them manifest themselves in procurement.

Gardner: What is it about procurement as a function that’s especially ripe for taking advantage of these technologies?

Transaction intense

Haydon: Procurement is obviously very transaction-intense. Historically, what transaction intensity means is people, processing, exceptions. When we talk about these trends now, the ability to componentize services, the ability to look at big data or machine learning, and the input on top of this contextualizes intelligence. It's cognitive and predictive by its very nature, a bigger data set, and [improves] historically inefficient human-based processes. That’s why procurement is starting to be at the forefront.


Gardner: Procurement itself has changed from the days of when we were highly vertically integrated as corporations. We had long lead times on product cycles and fulfillment. Nowadays, it’s all about agility and compressing the time across the board. So, procurement has elevated its position. Anything more to add?

Haydon: Everyone needs to be closer to the customer, and you need live business. So, procurement is live now. This change in dynamic -- speed and responsiveness -- is closer to your point. It’s also these other dimensions of the consumer experience that now has to be the business-to-business experience. All that means same-day shipping, real-time visibility, and changing dynamically. That's what we have to deliver.

Gardner: If we go back to our George Jetson reference, what is it about this coming year, 2017? Do you think it's an important inflection point when it comes to factoring things like the rising role of procurement, the rising role of analytics, and the fact that the Internet of Things (IoT) is going to bring more relevant data to bear? Why now?

Haydon: There are a couple of things. The procurement function is becoming more mature. Procurement leaders have extracted those first and second levels of savings from sourcing and the like. And they have control of their processes.

With cloud-based technologies and more control of their processes, they're looking now to how they're going to serve their internal customers by being value-generators and risk-reducers.

How do you forward the business, how do you de-risk, how do you get supply continuity, how do you protect your brand? You do that by having better insight, real-time insight into your supply base, and that’s what’s driving this investment.

Gardner: We've been talking about Ariba being a 20-year-old company. Congratulations on your anniversary of 20 years.

Haydon: Thank you.

AI and bots

Gardner: You're also, of course, part of SAP. Not only have you been focused on procurement for 20 years, but you've got a large global player with lots of other technologies and platform of benefits to avail yourselves of. So, that brings me to the point of AI and bots.

It seems to me that right at the time when procurement needs help, when procurement is more important than ever, that we're also in a position technically to start doing some innovative things that get us into those words "predictive" and more "intelligent."

Set the stage for how these things come together.

Haydon: You allude to being part of SAP, and that's really a great strength and advantage for a domain-focused procurement expertise company.

The machine-learning capabilities that are part of a native SAP HANA platform, which we naturally adopt and get access to, put us on the forefront of not having to invest in that part of the platform, but to focus on how we take that platform and put it into the context of procurement.

There are a couple of pretty obvious areas. There's no doubt that when you’ve got the largest B2B network, billions in spend, and hundreds of millions of transactions on invoicing, you apply some machine learning on that. We can start doing a lot smarter matching and exception management on that, pretty straightforward. That's at one end of the chain.

On the other end of the chain, we have bots. Some people get a little bit wired about the word “bot,” “robotics,” or whatever, maybe it's a digital assistant or it's a smart app. But, it's this notion of helping with decisions, helping with real-time decisions, whether it's identifying a new source of supply because there's a problem, and the problem is identified because you’ve got a live network. It's saying that you have a risk or you have a continuity problem, and not just that it's happening, but here's an alternative, here are other sources of a qualified supply.

Gardner: So, it strikes me that 2017 is such a pivotal year in business. This is the year where we're going to start to really define what machines do well, and what people do well, and not to confuse them. What is it about an end-to-end process in procurement that the machine can do better that we can then elevate the value in the decision-making process of the people?

Haydon: Machines can do better in just identifying patterns -- clusters, if you want to use a more technical word. They transform category management and enable procurement to be at the front of their internal customer set by looking not just at their traditional total cost of ownership (TCO), but total value and use. That's a part of that real dynamic change.

What we call end-to-end, or even what SAP Ariba defined in a very loose way when we talked about upstream, it was about outsourcing and contracting, and downstream was about procurement, purchasing, and invoicing. That's gone, Dana. It's not about upstream and downstream, it's about end-to-end process, and re-imagining and reinventing that.

The role of people

Gardner: When we give more power to a procurement professional by having highly elevated and intelligent tools, their role within the organization advances and the amount of improvement they can make financially advances. But I wonder where there's risk if we automate too much and whether companies might be thinking that they still want people in charge of these decisions. Where do we begin experimenting with how much automation to bring, now that we know how capable these machines have been, or is this going to be a period of exploration for the next few years?

Haydon: It will be a period of exploration, just because businesses have different risk tolerances and there are actually different parts of their life cycle. If you're in a hyper growth mode and you're pretty profitable, that's a little bit different than if you're under a very big margin pressure.

For example, maybe if you're in high tech in the Silicon Valley, and some big names that we could all talk about are, you're prepared to be able to go at it, and let it all come.

If you're in a natural-resource environment, every dollar is even more precious than it was a year ago.

That’s also the beauty, though, with technology. If you want to do it for this category, this supplier, this business unit, or this division you can do that a lot easier than ever before and so you go on a journey.

Gardner: That’s an important point that people might not appreciate, that there's a tolerance for your appetite for automation, intelligence, using machine learning, and AI. They might even change, given the context of the certain procurement activity you're doing within the same company. Maybe you could help people who are a little bit leery of this, thinking that they're losing control. It sounds to me like they're actually gaining more control.

Haydon: They gain more control, because they can do more and see more. To me, it’s layered. Does the first bot automatically requisition something -- yes or no? So, you put tolerances on it. I'm okay to do it if it is less than $50,000, $5,000, or whatever the limit is, and it's very simple. If the event is less than $5,000 and it’s within one percent of the last time I did it, go and do it. But tell me that you are going to do it or let’s have a cooling-off period.

If you don't tell me or if you don’t stop me, I'm going to do it, and that’s the little bit of this predictive as well. So you still control the gate, you just don’t have to be involved in all the sub-processes and all that stuff to get to the gate. That’s interesting.
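The layered tolerance rule Haydon describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Requisition` type, the `decide` function, and the returned labels are hypothetical names, not part of any SAP Ariba product, and the $5,000 limit and one percent tolerance are simply the figures from his example.

```python
from dataclasses import dataclass


@dataclass
class Requisition:
    """Hypothetical record of a proposed purchase (illustrative only)."""
    amount: float       # value of the proposed purchase
    last_amount: float  # value of the last comparable purchase


def decide(req: Requisition, limit: float = 5000.0, tolerance: float = 0.01) -> str:
    """Apply the two gates from the example: under the limit AND within
    `tolerance` of the last price. Anything else goes to a person."""
    if req.last_amount <= 0:
        # No usable price history, so a human has to look at it.
        return "needs_approval"
    drift = abs(req.amount - req.last_amount) / req.last_amount
    if req.amount < limit and drift <= tolerance:
        # The bot may proceed, but it announces the action first so the
        # buyer can stop it during a cooling-off period.
        return "auto_with_notice"
    return "needs_approval"


print(decide(Requisition(amount=4000.0, last_amount=4020.0)))
print(decide(Requisition(amount=4000.0, last_amount=3000.0)))
```

The first call passes both gates and the second fails the price-drift gate, so the buyer still controls the gate without handling every sub-process that leads up to it.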

Gardner: What’s interesting to me as well, Chris, is because the data is such a core element of how successful this is, it means that companies in a procurement intelligence drive will want more data, so they can make better decisions. Suppliers who want to be competitive in that environment will naturally be incentivized to provide more data, more quickly, with more openness. Tell us some of the implications for suppliers of intelligence brought to procurement. What should we expect suppliers to do differently as a result?

Notion of content

Haydon: There's no doubt that, at a couple of levels, suppliers will need to let the buyers know even more about themselves than they have ever known before.

That goes to the notion of content. There is unique content to be discovered, which is who am I, what do I do well, and can I demonstrate that I do it well. That’s being discovered. Then, there is the notion of being able to transact. What do I need to be able to do to transact with you efficiently, whether that's a payment, a bank account, or just the way in which I can consume this?

Then, there is also this last notion of the content. What content do I need to be able to provide to my customer, aka the end user, for them to be able to initiate the business with them?

There are these three dimensions: being discovered, being dynamically transacted with, and actually providing the content of what you do, even as a material or service, to the end user via the channel. You have to have all of these dimensions right.

That’s why we fundamentally believe that a network-based approach, when it's end to end, meaning a supplier can do it once to all of the customers across the [Ariba] Discovery channel, across the transactional channel, across the content channel is really value adding. In a digital economy, that's the only way to do it.

Gardner: So this idea of the business network, which is a virtual repository for all of this information, isn't just about quantity; it's really about the quality of the relationship. We hear about different business networks vying for attention. It seems to me that understanding that quality aspect is something you shouldn't lose track of.

Haydon: It’s the quality. It’s also the context of the business process. If you don't have the context of the business process between a buyer and a seller and what they are trying to effect through the network, how does it add value? The leading-practice networks, and we're a leading-practice network, are thinking about Discovery. We're thinking about content; we're thinking about transactions.

Gardner: Again, going back to the George Jetson view of the future, for organizations that want to see the return on their energy and devotion to these concepts around AI, bots, and intelligence, what sort of low-hanging fruit do we look for, for assuring them that they are on the right path? I'm going to answer my own question, but I want you to illustrate it a bit better: risk and compliance, and being able to adjust to unforeseen circumstances, seem to me an immediate payoff for doing this.

Severance of pleadings

Haydon: The United Kingdom is enacting a law before the end of the year for severance of pleadings. It’s the law, and you have to comply. The real question is how you comply.

You eye your brand, you eye your supply chain, and having the supply-chain profile information at hand right now is top of mind. If you're a Chief Procurement Officer (CPO) and you walk into the CEO’s office, the CEO could ask, "Can you tell me that I don’t have any forced labor, I don’t have any denied parties, and I'm Office of Foreign Assets Control (OFAC) compliant? Can you tell me that now?"

You might be able to do it for your top 50 suppliers or top 100 suppliers, and that’s great, but unfortunately, a small, $2,000 supplier who uses some forced labor in any part of the world is potentially a problem in this extended supply chain. We've seen brands boycotted very quickly. These things roll.

So yes, I think that’s just right at the forefront. Then, it's applying intelligence to that to give that risk threshold and to think about where those challenges are. It's being smart and saying, "Here is a high risk category. Look at this category first and all the suppliers in the category. We're not saying that the suppliers are bad, but you better have a double or triple look at that, because you're at high risk just because of the nature of the category."

Gardner: Technically, what should organizations be thinking about in terms of what they have in place in order for their systems and processes to take advantage of these business network intelligence values? If I'm intrigued by this concept, if I see the benefits in reducing risk and additional efficiency, what might I be thinking about in terms of my own architecture, my own technologies in order to be in the best position to take advantage of this?

Haydon: You have to question how much of that you think you can build yourself. If you think you're asking different questions than most of your competitors, you're probably not. I'm sure there are specific categories and specific areas on tight supplier relationships and co-innovation development, but when it comes to the core risk questions, more often, they're about an industry, a geography, or the intersection of both.

Our recommendation to corporations is never try and build it yourself. You might need to have some degree of privacy, but look to have it as more industry-based. Think larger than yourself in trying to solve that problem differently. Those cloud deployment models really help you.

Gardner: So it really is less of a technical preparatory thought process than a process of being a digital organization, availing yourself of cloud models, being ready to think about acting intelligently and finding that right demarcation between what the machines do best and what the people do best.

More visible

Haydon: By making things digital they are actually more visible. You have to be able to balance the pure nature of visibility to get at the product; that's the first step. That’s why people are on a digital journey.

Gardner: Machines can’t help you with a paper-based process, right?

Haydon: Not as much. You have to scan it and throw it in. Then you are digitizing it.

Gardner: We heard about Guided Buying last year from SAP Ariba. It sounds like we're going to be getting a sort of "Guided Buying-Plus" next year and we should keep an eye on that.

Haydon: We're very excited. We announced that earlier this year. We're trying to solve two problems quickly through Guided Buying.

One is the nature of the ad-hoc user. We're all ad-hoc users in the business today. I need to buy things, but I don’t want to read the policy, I don’t want to open the PDF on some corporate portal on some threshold limit that, quite honestly, I really need to know about once or twice a year.

So our Guided Buying has a beautiful consumer-based look and feel, but with embedded compliance. We hide the complexity. We just show the user what they need to know at the time, and the flow is very powerful.

Listen to the podcast. Find it on iTunes. Get the mobile app. Download the transcript. Sponsor: SAP Ariba.

DDoS Attack: A Wake-Up Call for IoT

Welcome to the world of the Internet of Things, in which a glut of devices is connected to the internet and emits massive amounts of data. Analysis and use of this data will have a real positive impact on our lives. But we have many hoops to jump through before we can claim that crown, starting with a huge number of devices lacking a unified platform and serious issues with security standards threatening the very progress of IoT.

The concept of IoT introduces a wide range of new security risks and challenges to IoT devices, platforms and operating systems, communications, and even the systems to which they're connected. New security technologies will be required to protect IoT devices and platforms from both information attacks and physical tampering, to encrypt their communications, and to address new challenges such as impersonating "things", denial-of-sleep attacks that drain batteries, and denial-of-service attacks. But IoT security will be complicated by the fact that many "things" use simple processors and operating systems that may not support sophisticated security approaches. In addition to all that, "Experienced IoT security specialists are scarce, and security solutions are currently fragmented and involve multiple vendors," said Mr. Jones from Gartner. He added, "New threats will emerge ...

Read More on Datafloq
8 Benefits of Cloud Accounting for Small and Medium Businesses

The ability to flawlessly execute business operations is a critical determining factor of a business's potential success. It is this flawless execution that guarantees that the business delivers its services or products to its customers and meets their demands, all the while maintaining a competitive edge over the rest of the industry.

Previously, large corporations had a great advantage when it came to data management, as they had all the necessary resources, such as in-house data servers and skilled IT professionals. When it comes to the competitive barrier between small and medium businesses (SMBs) and large companies, this technological gap meant that SMBs never had access to the same playing field as the larger organizations. Large corporations' capability to invest in custom IT infrastructure to enable more efficient and smoother business operations gave them a huge advantage that SMBs were previously unable to seize.

But with the advent of Cloud computing technology and Cloud-based accounting systems, today SMBs are also able to keep up with greater customer demands, all the while implementing the same operational protocols that large businesses use. All this is only possible with Cloud-based solutions, which offer a cost-effective solutions package to SMBs and levels ...

Read More on Datafloq
Strategic view across more data delivers digital business boost for AmeriPride

The next BriefingsDirect Voice of the Customer digital transformation case study explores how linen services industry leader AmeriPride Services uses big data to gain a competitive and comprehensive overview of its operations, finances and culture.

We’ll explore how improved data analytics allows for disparate company divisions and organizations to come under a single umbrella -- to become more aligned -- and to act as a whole greater than the sum of the parts. This is truly the path to a digital business.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

Here to describe how digital transformation has been supported by innovations at the big data core, we’re joined by Steven John, CIO, and Tony Ordner, Information Team Manager, both at AmeriPride Services in Minnetonka, Minnesota. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Let’s discuss your path to being a more digitally transformed organization. What were the requirements that led you to become more data-driven, more comprehensive, and more inclusive in managing your large, complex organization?


John: One of the key business drivers for us was that we're a company in transition -- from a very diverse organization to a very centralized organization. Before, it wasn't necessarily important for us to speak the same data language, but now it's critical. We’re developing the lexicon, the Rosetta Stone, that we can all rely on and use to make sure that we're aligned and heading in the same direction.

Gardner: And Tony, when we say “data,” are we talking about just databases and data within applications? Or are we being even more comprehensive -- across as many information types as we can?

Ordner: It’s across all of the different information types. When we embarked on this journey, we discovered that data itself is great to have, but you also have to have processes that are defined in a similar fashion. You really have to drive business change in order to be able to effectively utilize that data, analyze where you're going, and then use that to drive the business. We're trying to institute into this organization an iterative process of learning.

Gardner: For those who are not familiar with AmeriPride Services, tell us about the company. It’s been around for quite a while. What do you do, and how big of an umbrella organization are we talking about?

Long-term investments

John: The company is over 125 years old. It’s family-owned, which is nice, because we're not driven by the quarter. We can make longer-term investments through the family. We can have more of a future view and have ambition to drive change in different ways than a quarter-by-quarter corporation does.

We're in the laundry business. We're in the textiles and linen business. What that means is that for food and beverage, we handle tablecloths, napkins, chef coats, aprons, and those types of things. In oil and gas, we provide the safety garments that are required. We also provide the mats you cross as you walk in the door of various restaurants or retail stores. We're in healthcare facilities and meet the various needs of providing and cleansing the garments and linens coming out of those institutions. We're very diverse. We're the largest company of our kind in Canada, probably about fourth in the US, and growing.

Gardner: And this is a function that many companies don't view as core and they're very happy to outsource it. However, you need to remain competitive in a dynamic world. There's a lot of innovation going on. We've seen disruption in the taxicab industry and the hospitality industry. Many companies are saying, “We don’t want to be a deer in the headlights; we need to get out in front of this.”

Tony, how do you continue to get in front of this, not just at the data level, but also at the cultural level?

Ordner: Part of what we're doing is defining those standards across the company. And we're coming up with new programs and new ways to get in front and to partner with the customers.

As part of our initiative, we're installing a lot of different technology pieces that we can use to be right there with the customers, to make changes with them as partners, and maybe better understand their business and the products that they aren't buying from us today that we can provide. We’re really trying to build that partnership with customers, provide them more ways to access our products, and devise other ways they might not have thought of for using our products and services.

With all of those data points, it allows us to do a much better job.

Gardner: And we have heard from Hewlett Packard Enterprise (HPE) the concept that it's the “analytics that are at the core of the organization,” that then drive innovation and drive better operations. Is that something you subscribe to, and is that part of your thinking?

John: For me, you have to extend it a little bit further. In the past, our company was driven by the experience and judgment of the leadership. But what we discovered is that we really wanted to be more data-driven in our decision-making.

Data creates a context for conversation. In the context of their judgment and experience, our leaders can leverage that data to make better decisions. The data, in and of itself, doesn’t drive the decisions -- it's that experience and judgment of the leadership that's that final filter.

We often forget the human element at the end of that and think that everything is being driven by analytics, when analytics is a tool and will remain a tool that helps leaders lead great companies.

Gardner: Steven, tell us about your background. You were at a startup, a very successful one, on the leading edge of how to do things differently when it comes to apps, data, and cloud delivery.

New ways to innovate

John: Yes, you're referring to Workday. I was actually Workday’s 33rd customer, the first to go global with their product. Then, I joined Workday in two roles: as their Strategic CIO, working very closely with the sales force, helping CIOs understand the cloud and how to manage software as a service (SaaS); and also as their VP of Mid-Market Services, where we were developing new ways to innovate, to implement in different ways and much more rapidly.

And it was a great experience. I've done two things in my life, startups and turnarounds, and I thought that I was kind of stepping back and taking a relaxing job with AmeriPride. But in many ways, it's both; AmeriPride’s both a turnaround and a startup, and I'm really enjoying the experience.

Gardner: Let’s hear about how you translate technology advancement into business advancement. And the reason I ask it in that fashion is that it seems a bit of a chicken-and-egg problem, in that they need to be done in parallel -- strategy, ops, culture, as well as technology. How are you balancing that difficult equation?

John: Let me give you an example. Again, it goes back to that idea of, if you just have the human element, they may not know what to ask, but when you add the analytics, then you suddenly create a set of questions that drive to a truth.

We're a route-based business. We have more than 1,000 trucks out there delivering our products every day. When we started looking at margin, we discovered that our greatest margin was from those customers that were within a mile of another customer.

So factoring that in changes how we sell, that changes how we don't sell, or how we might actually let some customers go -- and it helps drive up our margin. You have that piece of data, and suddenly we as leaders knew some different questions to ask and different ways to orchestrate programs to drive higher margin.
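As a rough illustration of the kind of question this raises, the sketch below flags which customers have another customer within one mile. It assumes customer locations are available as latitude and longitude pairs and uses the standard haversine great-circle formula; the function names and sample coordinates are hypothetical and are not AmeriPride's data or method.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def has_close_neighbor(customers, radius_miles=1.0):
    """Map each customer name to True if any other customer lies within the radius."""
    return {
        name: any(
            other != name and haversine_miles(loc, other_loc) <= radius_miles
            for other, other_loc in customers.items()
        )
        for name, loc in customers.items()
    }


# Made-up coordinates: "a" and "b" are a few blocks apart, "c" is far away.
sample = {"a": (44.900, -93.400), "b": (44.905, -93.400), "c": (45.500, -93.400)}
print(has_close_neighbor(sample))
```

Comparing every pair is fine for a sketch; with many thousands of stops, a spatial index would be the usual way to avoid the quadratic cost.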

Gardner: Another trend we've seen is that putting data and analytics, very powerful tools, in the hands of more people can have unintended, often very positive, consequences. A knowledge worker isn't just in a cube and in front of a computer screen. They're often in the trenches doing the real physical work, and so can have real process insights. Has that kicked in yet at AmeriPride, and are you democratizing analytics?

Ordner: That’s a really great question. We've been trying to build a power-user base and bring some of these capabilities into the business segments to allow them to explore the data.

You always have to keep an eye on knowledge workers, because sometimes they can come to the wrong conclusions, as well as the right ones. So it's trying to make sure that we maintain that business layer, that final check. It's like, the data is telling me this, is that really where it is?

I liken it to having a flashlight in a dark room. That’s what we are really doing with visualizing this data and allowing them to eliminate certain things, and that's how they can raise the questions, what's in this room? Well, let me look over here, let me look over there. That’s how I see that.

Too much information

John: One of the things I worry about is that if you give people too much information or unstructured information, then they really get caught up in the academics of the information -- and it doesn’t necessarily drive a business process or drive a business result. It can cause people to get lost in the weeds of all that data.

You still have to orchestrate it, you still have to manage it, and you have to guide it. But you have to let people go off and play and innovate using the data. We actually have a competition among our power-users where they go out and create something, and there are judges and prizes. So we do try to encourage the innovation, but we also want to hold the reins in just a little bit.

Gardner: And that gets to the point of having a tight association between what goes on in the core and what goes on at the edge. Is that something that you're dabbling in as well?

John: It gets back to that idea of a common lexicon. If you think about evolution, you don't want a Madagascar or a Tasmania, where groups get cut off and then they develop their own truth, or a different truth, or they interpret data in a different way -- where they create their own definition of revenue, or they create their own definition of customer.

If you think about it as orbits, you have to have a balance. Maybe you only need to touch certain people in the outer orbit once a month, but you have to touch them once a month to make sure they're connected. The thing about orbits and keeping people in the proper orbits is that if you don't, then one of two things happens, based on gravity. They either spin out of orbit or they come crashing in. The idea is to figure out what's the right balance for the right groups to keep them aligned with where we are going, what the data means, and how we're using it, and how often.

Gardner: Let’s get back to the ability to pull together the data from disparate environments. I imagine, like many organizations, that you have SaaS apps. Maybe it’s for human capital management or maybe it’s for sales management. How does that data then get brought to bear with internal apps, some of which may even be on a mainframe still, or virtualized apps from older code bases and so forth? What’s the hurdle and what words of wisdom might you impart to others who are earlier in this journey of how to make all that data common and usable?

Ordner: That tends to be a hurdle. As to the data acquisition piece, as you set these things up in the cloud, a lot of the time the business units themselves are doing these things or making the agreements. They don't put into place the data access that we've always needed. That’s been our biggest hurdle. They'll sign the contracts, not getting us involved until they say, "Oh my gosh, now we need the data." We look at it and we say, "Well, it’s not in our contracts and now it’s going to cost more to access the data." That’s been our biggest hurdle for the cloud services that we've done.

Once you get past that, web services have been a great thing. Once you get the licensing and the contract in place, it becomes a very simple process, and it becomes a lot more seamless.

Gardner: So, maybe something to keep in mind is always think about the data before, during, and after your involvement with any acquisition, any contract, and any vendor?

Ordner: Absolutely.

You own three things

John: With SaaS, at the end of the day, you own three things: the process design, the data, and the integration points. When we construct a contract, one of the things I always insist upon is what I refer to as the “prenuptial agreement.”

What that simply means is, before the relationship begins, you understand how it can end. The key thing in how it ends is that you can take your data with you, that it has a migration path, and that they haven't created a stickiness that traps you there and you don't have the ability to migrate your data to somebody else, whether that’s somebody else in the cloud or on-premise.

Gardner: All right, let’s talk about lessons learned in infrastructure. Clearly, you've had an opportunity to look at a variety of different platforms, different requirements that you have had, that you have tested and required for your vendors. What is it about HPE Vertica, for example, that is appealing to you, and how does that factor into some of these digital transformation issues?

Ordner: There are two things that come to mind right away for me. One is there were some performance implications. We were struggling with our old world and certain processes that ran 36 hours. We did a proof of concept with HPE and Vertica and that ran in something like 17 minutes. So, right there, we were sold on performance changes.

As we got into it and negotiated with them, the other big advantage we discovered is the licensing model, which is based on the amount of data, versus the CPU-core model that everyone else uses. We're able to scale this and provide that service at a high speed, so we can maintain that performance without having to take penalties against licensing. Those are a couple of things I see. Anything from your end, Steven?

John: No, I think that was just brilliant.

Gardner: How about that acquisition and integration of data? Is there an issue with that that you have been able to solve?

Ordner: With acquisition and integration, we're still early in that process. We're still learning about how to put data into HPE Vertica in the most effective manner. So, we're really at our first source of data and we're looking forward to those additional pieces. We have a number of different telematics pieces that we want to include; wash aisle telematics as well as in-vehicle telematics. We're looking forward to that.

There's also scan data that I think will soon be on the horizon. All of our garments and our mats have chips in them. We scan them in and out, so we can see the activity and where they flow through the system. Those are some of our next targets to bring that data in and take a look at that and analyze it, but we're still a little bit early in that process as far as multiple sources. We're looking forward to some of the different ways that Vertica will allow us to connect to those data sources.

Gardner: I suppose another important consideration when you are picking and choosing systems and platforms is that extensibility. RFID tags are important now; we're expecting even more sensors, more data coming from the edge, the information from the Internet of Things (IoT). You need to feel that the systems you're putting in place now will scale out and up. Any thoughts about the IoT impact on what you're up to?

Overcoming past sins

John: We have had several conversations just this week with HPE and their teams, and they are coming out to visit with us on that exact topic. Being about a year into our journey, we've been doing two things. We've been forming the foundation with HPE Vertica and we've been getting our own house in order. So, there's a fair amount of cleanup and overcoming the sins of the past as we go through that process.

But Vertica is a platform; it's a platform where we have only tapped a small percentage of its capability. And in my personal opinion, even HPE is only aware of a portion of its capability. There are a whole set of things that it can do, and I don’t believe that we have discovered all of them.

With that said, we're going to do what you and Tony just described; we're going to use the telematics coming out of our trucks. We're going to track safety and seat belts. We're going to track green initiatives, routes, and the analytics around our routes and fuel consumption. We're going to make the place safer, we're going to make it more efficient, and we're going to get proactive about being able to tell when a machine is going to fail and when to bring in our vendor partners to get it fixed before it disrupts production.

Gardner: It really sounds like there is virtually no part of your business in the laundry services industry that won't be in some way beneficially impacted by more data, better analytics delivered to more people. Is that fair?

Ordner: I think that’s a very fair statement. As I prepared for this conference, one of the things I learned, and I have been with the company for 17 years, is that we've done a lot of technology changes, and technology has taken on added significance within our company. When you think of laundry, you certainly don't think of technology, but we've been at the leading edge of implementing technology to get closer to our customers, closer to understanding our products.

[Data technology] has become really ingrained within the industry, at least at our company.

John: It is one of those few projects where everyone is united, everybody believes that success is possible, and everybody is willing to pay the price to make it happen.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.

13 Forecasts on Artificial Intelligence

We have discussed some AI topics in previous posts, and the extraordinarily disruptive impact AI has had over the past few years should now seem obvious. However, what everyone is now wondering is where AI will be in five years' time. I find it useful, then, to describe a few emerging trends we are starting to see today, as well as make a few predictions about future developments in machine learning. The following list is not meant to be either exhaustive or set in stone, but it comes from a series of personal considerations that might be useful when thinking about the impact of AI on our world.

The 13 Forecasts on AI

1. AI is going to require less data to work

Companies like Vicarious or Geometric Intelligence are working toward reducing the data burden needed to train neural networks. The amount of data required nowadays represents the major barrier to the spread of AI (and the major competitive advantage), and the use of probabilistic induction (Lake et al., 2015) could solve this major problem for AGI development. A less data-intensive algorithm might eventually use the concepts learned and assimilated in richer ways, either for action, imagination, or exploration.

2. New types of learning methods are the key

The new ...

Read More on Datafloq
4 Ways to Improve Your eCommerce Business with Big Data

For the last couple of years, Big Data has been a buzzword in basically every industry, and seeing how much actionable data the Internet creates on a daily basis, it’s not hard to see why. In the last five years, the overall global population of internet users has grown by around 60% - from 2.2 billion in 2011 to 3.5 billion users at the moment – according to data gathered by Internet Live Stats. Today, Big Data is being used across sectors, but there has been an increasing emphasis on Big Data analytics in eCommerce in recent years.

When it comes to eCommerce, data usually falls under two distinct categories – structured and unstructured data. The first is regular data that businesses have been able to capture easily, like a customer’s name, address, age and preferences. Unstructured data, such as tweets, likes and videos, is much more valuable. For example, according to Harvard Business Review, Walmart collects roughly 2.5 petabytes of unstructured data every hour from its customer transactions. But one of the biggest challenges eCommerce businesses face today is how to take all of that data, analyze it and gain meaningful insights.

Leveraging Big Data

Sometimes, it takes parallel software running ...

Read More on Datafloq
10 Vital Data Protection Principles for Businesses

Protecting data is extremely important, and it is imperative for any business that holds its customers’ information. If people are willing to provide you with their details, then you have an obligation to protect them. Here are 10 ways to do just that.

1) Secure Consent

Whenever personal data is collected, stored, or used, the consent of the individuals who submit it needs to be secured. Every form that solicits personal information, whether online or on paper, needs to make it clear what the information will be used for and who will have access to it.

2) Respect Sensitive Information

Personal data that could be considered sensitive (e.g. racial, political, religious, medical, sexual, or criminal information) should be handled with special care. According to River Cohen, this information should never be collected or disclosed unless it is absolutely necessary. Obtaining the consent of individuals before collecting this type of information - and informing them of what it will be used for - is absolutely vital.

3) Transparency

Individuals should always have access to their personal information. This means that notes or reports entered into official records which relate to individuals should be prepared with the understanding that the subjects have the right to see them. This general ...

Read More on Datafloq
Big Data: Breaking Down Inefficiency in Global Shipping

Our food supply, our luxuries, and everything in between depends on global shipping. Shipping to your home used to be a big deal—if you wanted something sent to you, it was going to cost you a lot more than heading to the store to pick it up. Today, online shopping has made this struggle a thing of the past for consumers. Big retailers like Amazon offer free two-day shipping for subscribers, and most websites offer free shipping for orders over a certain spending threshold—some even offer free returns. Global shipping has increased immensely as well, particularly as many e-commerce sites choose drop shipping for distribution. With all this extra volume of packages going out to residential areas, shipping companies have been raking in the profit, and need to keep up with the speed demands of the public. How? As employee telecommuting continues to grow, big data is one of the tools companies are using to keep up. Let’s take a look at how big data is breaking down inefficiency in the shipping industry.

Container Efficiency

Large shipping containers also have a large downside: they’re heavy and difficult to maneuver. Packing them onto a cargo ship efficiently for a multi-stop journey is an ...

Read More on Datafloq
Is Big Data in Healthcare Worth the Cost?

Big data is finally beginning to make a dent in the healthcare industry, where it is projected to not only open up new possibilities in medical care, but save industry providers billions - between $300 billion and $450 billion according to a study conducted by McKinsey & Company. However, not everyone is convinced that this technological turn is a prudent investment for healthcare providers. Some have raised a voice of warning against an industry-wide adoption of big data, citing some of the challenges that may make the use of big data in the medical world not worth the hefty cost.

This may seem hard to believe. Indeed, the use of data and analytics has proven to be a considerable asset for a host of businesses in other industries. Why wouldn’t data do just as much good in the hands of doctors? Because doctors tend to not use data as well as professionals in other fields. The disconnect between doctors and data comes down to two problems: some doctors aren’t even interested in using data at all, and many of the ones who are lack the skills to maximize its value.

As mentioned previously, some are projecting that data will save the healthcare ...

Read More on Datafloq
How to Stop Data Theft in the Healthcare Industry?

Cybercrime in the healthcare industry is growing rapidly. Nearly 90% of all healthcare organizations have experienced at least one data breach in the past two years. This industry has suffered more breaches than any other industry in the last ten years, losing an estimated total of $6.2 billion. Healthcare organizations are failing largely because they haven’t been securing their applications. When it comes to online security, the healthcare industry has been stuck in its ways, but this needs to change. Organizations need to adapt with the times and become unpredictable, or we could see some life-threatening information get into the wrong hands.

When it comes to data theft in the healthcare industry, there are three areas that these thieves look for: medical records, billing and insurance records, and payment information. The hackers go mainly for medical records and are looking for the best way to sell this information to people. They are leveraging the information they have over social media to get more people interested in what they are selling. Medical records are still a little harder to sell than financial records, but experts believe that very soon, these hackers will be able to sell the information much more easily. This will ...

Read More on Datafloq
Using the Snowflake Information Schema

It is my 1 year anniversary of becoming the tech evangelist for Snowflake Computing! Hard to believe that a year ago I gave up independent consulting and joined this amazing team in San Mateo. While there has been a lot of travel recently with my speaking schedule, I have gotten to learn a ton about big data, […]
Artificial Intelligence: What is It and Why Now?

Artificial Intelligence (AI) nowadays represents a paradigm shift that is driving both scientific progress and industry evolution. Given the intense level of domain knowledge required to really appreciate the technicalities of artificial engines, what AI is and can do is often misunderstood: the general audience is fascinated by its development and frightened by Terminator-like scenarios; investors are mobilizing huge amounts of capital but do not have a clear picture of the competitive drivers that characterize companies and products; and managers are rushing to get their hands on the latest software that may improve their productivity and revenues, and eventually their bonuses.

Even though the general optimism around creating advancements in artificial intelligence is evident (Muller and Bostrom, 2016), I believe that, in order to foster the pace of growth facilitated by AI, it is necessary to clarify some concepts.

Basic Definitions & Categorization

First, let’s describe what artificial intelligence means. According to Bostrom (2014), AI today is perceived in three different ways: it is something that might answer all your questions, with an increasing degree of accuracy (“the Oracle”); it could do anything it is commanded to do (“the Genie”), or it might act autonomously ...

Read More on Datafloq
Why Artificial Intelligence is Now More Important Than Ever

The reason why we are now studying AI more actively is clearly the potential applications it might have, the media and general public attention it has received, and the unprecedented amount of funding investors are devoting to it.

Machine learning is being quickly commoditized, and this encourages a more profound democratization of intelligence, although this is true only for low-order knowledge. If, on the one hand, a large set of services and tools is now available to end users, on the other hand the real power is concentrating in the hands of a few major incumbents with the data availability and computational resources to really exploit AI at a higher level.

Apart from this technological polarization, the main problem the sector is experiencing can be divided into two key misalignments: i) long-term AGI research being sacrificed for short-term business applications, and ii) what AI can actually do versus what people think or assume it does. Both issues stem from the high level of technical knowledge intrinsically required to understand AI, and they are creating hype around it. Part of the hype is clearly justified, because AI has been useful in ...

Read More on Datafloq
How to Ensure Your CRM Data is Fit for Purpose

In a recent blog post we discussed the importance of data quality. In this post we provide you with seven ways you can improve the quality of your data and ensure it is fit for purpose. 

Constantly improving upon the quality of your data is essential to remain ahead of your competition. Failure to keep your data up to date will result in a 30% erosion in the value of your most valuable asset, your data, which is why it’s imperative to manage and maintain it effectively. What’s more, as explained by Experian, 99% of marketers feel driven to turn data into insights, yet 75% of organisations believe inaccurate data is undermining their ability to provide an excellent customer experience.

It’s impossible to provide outstanding customer experience with inaccurate or incomplete customer data. However, improving the quality of your CRM data doesn’t have to be costly or time consuming. Here’s how you can effectively improve the quality of your CRM data…

Set goals - Be sure to identify what it is you want to achieve, be it to increase leads and conversions, develop a product or service, or maintain regular communication with suppliers, partners, prospects and customers. Once you’ve outlined this, you can begin ...

Read More on Datafloq
How a North American railroad used the magic of Big data & Cloud for Crew / Workforce Management

This article is sponsored by CloudMoyo - Partner of choice for solutions at the intersection of Cloud & Analytics.

One of the most gratifying things that a big data analytics firm can be part of is the transformation of an established company as it integrates new practices and insights generated by its data. That is exactly what happened in the collaboration between CloudMoyo and a North American railroad operator, an organization which has been operating since 1887 across 10 central U.S. states as well as northern Mexico and southern Canada.

Logistics are the lifeblood of a transportation company. The railroad has over 13,000 freight cars and 1,044 locomotives, and its rail network comprises approximately 6,600 route miles that link commercial and industrial markets in the United States and Mexico. It runs approx. 500 trains per day, with an average of 800+ crew members daily, across 181 interchange points with other railroads. Add to this the complexities of repairs, re-crews, allocations, scheduling, incidents, services, people & goods movement, vacations and communications, and it turns out to be a heck of a day. Needless to say, it’s a massive transportation and logistics business with big data. The company had to get its operations right and it turned to ...

Read More on Datafloq
Disrupting the data market: Interview with EXASOL’s CEO Aaron Auld

Processing data quickly and efficiently has become a never-ending race. With companies’ increasing need for data consumption comes a never-ending “need for speed” in processing data and, consequently, the emergence of a new generation of database software solutions to fulfill this need for high-performance data processing.

These new database management systems incorporate novel technology to provide faster, more efficient access to and processing of large volumes of data.

EXASOL is one of these disruptive "new" database solutions. Headquartered in Nuremberg, Germany and with offices around the globe, EXASOL has worked hard to bring a fresh, new approach to the data analytics market by offering a world-class database solution.

In this interview, we took the opportunity to chat with EXASOL’s Aaron Auld about the company and its innovative database solution.

Aaron Auld is the Chief Executive Officer as well as the Chairman of the Board at EXASOL, positions he has held since July 2013. He was made a board member in 2009.

As CEO and Chairman, Aaron is responsible for the strategic direction and execution of the company, as well as growing the business internationally.

Aaron embarked on his career back in 1996 at MAN Technologie AG, where he worked on large industrial projects and M&A transactions in the aerospace sector. Subsequently, he worked for the law firm Eckner-Bähr & Colleagues in the field of corporate law.

After that, the native Brit joined Océ Printing Systems GmbH as legal counsel for sales, software, R&D and IT. He then moved to Océ Holding Germany and took over the global software business as head of corporate counsel. Aaron was also involved in the IPO (Prime Standard) of Primion Technology AG in a legal capacity, and led investment management and investor relations.

Aaron studied law at the Universities of Munich and St. Gallen. Passionate about nature, Aaron likes nothing more than to relax by walking or sailing and is interested in politics and history.

So, what is EXASOL and what is the story behind it?

EXASOL is a technology vendor that develops a high-performance in-memory analytic database that was built from the ground up to analyze large volumes of data extremely fast and with a high degree of flexibility.
The company was founded back in the early 2000s in Nuremberg, Germany, and went to market with the first version of the analytic database in 2008.

Now in its sixth generation, EXASOL continues to develop and market the in-memory analytic database working with organizations across the globe to help them derive business insight from their data that helps them to drive their businesses forward.

How does the database work? Could you tell us some of the main features?

We have always focused on delivering an analytic database with ultra-fast, massively scalable analytic performance. The database combines in-memory, columnar storage and massively parallel processing technologies to provide unrivaled performance, flexibility and scalability.

The database is tuning-free and therefore helps to reduce the total cost of ownership while enabling users to solve analytical tasks instead of having to cope with technical limits and constraints.

With the recently-announced version 6, the database now offers a data virtualization and data integration framework which allows users to connect to more data sources than ever before.

Also, alongside out-of-the-box support for R, Lua, Python and Java, users can integrate the analytics programming language of their choice and use it for in-database analytics.

Especially today, speed of data processing is important. I’ve read that EXASOL has taken part in some benchmarks in this regard. Could you tell us more about it?

One of the truly independent sets of benchmark tests available is offered by the Transaction Processing Performance Council (TPC). A few years ago we decided to take part in the TPC-H benchmark and ever since we have topped the tables in terms of not only performance (i.e. analytic speeds) but also price/performance (i.e. cost aligned with speed) when analyzing data volumes ranging from 100GB right up to 100TB. No other database vendor comes close.
The information is available online here.

One of the features of EXASOL is that, if I’m not mistaken, it is deployed on commodity hardware. How does EXASOL’s design guarantee optimal performance and reliability?

Offering flexible deployment models in terms of how businesses can benefit from EXASOL has always been important to us at EXASOL.

Years ago, the concept of the data warehouse appliance was talked about as the optimum deployment model, but in most cases it meant that vendors were forcing users to use their database on bespoke hardware, on hardware that then could not be re-purposed for any other task.  Things have changed since: while the appliance model is still offered, ours is and always has been one that uses commodity hardware.

Of course, users are free to download our software and install it on their own hardware too.
It all makes for a more open and transparent framework where there is no vendor lock-in, and for users that can only be a good thing. What’s more, because the hardware and chip vendors are always innovating, when a new processor or server is released, users only stand to benefit, as they will see even faster performance when they run EXASOL on that new technology.
We recently discussed this in a promotional video for Intel.

In terms of price point, is it intended only for large organizations? What about medium and small ones with needs for fast data processing?

We work with organizations both large and small.  The common denominator is always that they have an issue with their data analytics or incumbent database technology and that they just cannot get answers to their analytic queries fast enough.

Price-wise, our analytic database is extremely competitively priced, and we allow organizations of all shapes and sizes to use our database software on terms that best fit their own requirements, be that via a perpetual license model, a subscription model, or a bring-your-own-license (BYOL) model, whether on-premises or in the cloud.

What would be a minimal configuration example? Server, user licensing etc.?

Users can get started today with the EXASOL Free Small Business Edition.  It is a single-node only edition of the database software and users can pin up to 200GB of data into RAM.

Given that we advocate a 1:10 ratio of RAM vs raw data volume, this means that users can put 2TB of raw data into their EXASOL database instance and still get unrivaled analytic performance on their data – all for free. There are no limitations in terms of users.

We believe this is a very compelling advantage for businesses that want to get started with EXASOL.

Later, when data volumes grow and when businesses want to make use of advanced features such as in-database analytics or data virtualization, users can then upgrade to the EXASOL Enterprise Cluster Edition which offers much more in terms of functionality.
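
The 1:10 sizing rule mentioned above comes down to simple arithmetic. A minimal sketch in Python follows; the function names and the default ratio parameter are illustrative assumptions for this example, not anything provided by EXASOL:

```python
def max_raw_data_gb(ram_gb: float, raw_per_ram: float = 10.0) -> float:
    """Estimate how much raw data fits for a given amount of RAM.

    Assumes the 1:10 RAM-to-raw-data ratio quoted in the interview;
    raw_per_ram is a hypothetical parameter for trying other ratios.
    """
    if ram_gb < 0 or raw_per_ram <= 0:
        raise ValueError("ram_gb must be >= 0 and raw_per_ram must be > 0")
    return ram_gb * raw_per_ram


def required_ram_gb(raw_data_gb: float, raw_per_ram: float = 10.0) -> float:
    """Inverse calculation: RAM needed to hold a given raw data volume."""
    if raw_data_gb < 0 or raw_per_ram <= 0:
        raise ValueError("raw_data_gb must be >= 0 and raw_per_ram must be > 0")
    return raw_data_gb / raw_per_ram


# The free edition's 200GB RAM limit, as stated in the interview.
free_edition_ram_gb = 200
raw_gb = max_raw_data_gb(free_edition_ram_gb)
print(f"{free_edition_ram_gb} GB RAM -> about {raw_gb / 1000:.0f} TB of raw data")
```

With the 200GB limit this reproduces the 2TB figure quoted above; actual capacity would depend on how well a given data set compresses.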

Regarding big data requirements, could you tell us some of the possibilities to integrate or connect EXASOL with big data sources/repositories such as Hadoop and others?

EXASOL can be easily integrated into every IT infrastructure. It is SQL-compliant, is compatible with leading BI and ETL products such as Tableau, MicroStrategy, Birst, IBM Cognos, SAP BusinessObjects, Alteryx, Informatica, Talend, Looker and Pentaho, and provides the most flexible Hadoop connector on the market.

Furthermore, through an extensive data virtualization and integration framework, users can now analyze data from more sources more easily and faster than ever before.

Recently, the company announced that EXASOL is now available on Amazon. Could you tell us a bit more about the news? EXASOL is also available on Azure, right?

As more and more organizations are deploying applications and their systems in the cloud, it’s important that we allow them to use EXASOL in the cloud, too. As a result, we are now available on Amazon Web Services as well as Microsoft Azure. What’s more, we continue to offer our own cloud and hosting environment, which we call EXACloud.

Finally, on a more personal topic. Being a Scot who lives in Germany, would you go for a German beer or a Scottish whisky?

That’s an easy one. First enjoy a nice German beer (ideally, one from a Munich brewery) before dinner, then round the evening off by savoring a nice Scottish whisky. The best of both worlds.

How to Transform Your Business into an Analytics Company?

This thought leadership article is brought to you by SayOne Technologies – providing tailored data analytics applications for customers around the world.

Many organizations are affected by the rapid changes in our society due to new technologies. The rate of change that organizations experience has increased to such an extent that half of S&P 500 companies are expected to be replaced by newcomers in the next 10 years. These newcomers use emerging technologies, such as Artificial Intelligence, Robotics, 3D printing or the Internet of Things. In addition, most of the new organizations take a different form than traditional organizations; many new organizations, or startups, are decentralized platform organizations that have been around for less than a decade but have experienced exponential growth.

Examples of these newcomers are Uber, which is the world’s largest taxi company that does not own any taxis; AirBnB, the world’s largest accommodation provider that does not own any hotels; WhatsApp, the world’s largest telecom company that does not own any telecom infrastructure or Alibaba, the second largest retailer that does not own any inventory. Each of these innovators disrupts an entire industry and they have two common denominators; they collect data in everything they do and they use ...

Read More on Datafloq
What is the Difference Between Vulnerability Scanning and Security Audit

With the increased number of cybercrimes, security has become a major concern for every organization. As a result, IT professionals are discovering and adopting new techniques for defeating the threats and viruses that either harm the system or steal confidential and sensitive information. Various products are available on the market to safeguard the system and the data stored in it. However, in order to make the system secure, one needs to make sure that those products are actually capable of keeping it safe. Before safeguarding the system with any tool, one needs to know the difference between vulnerability scanning and security audits so that the system can be secured accordingly.

Vulnerability Scanning

It is a process for detecting all the vulnerabilities that could harm the system and its data.

This type of scanning is aimed at evaluating the security of hardware, software, networks and systems. Every organization wants to keep its systems and networks safe and secure from any kind of vulnerability. To perform vulnerability scanning, an organization needs a scanning tool that can identify both high-risk and low-risk vulnerabilities.

Once the vulnerabilities are detected and identified, you could work on removing ...

Read More on Datafloq
What Is the Future of Data Warehousing?

There is no denying it – we live in The Age of the Customer. Consumers all over the world are now digitally empowered, and they have the means to decide which businesses will succeed and grow, and which ones will fail. As a result, most savvy businesses now understand that they must be customer-obsessed to succeed. They must have up-to-the-second data and analytical information so that they can give their customers what they want and provide the very best customer satisfaction possible.

This understanding has given rise to the concept of business intelligence (BI), the use of data mining, big data, and data analytics to analyze raw data and create faster, more effective business solutions. However, while the concept of BI is not necessarily new, traditional BI tactics are no longer enough to keep up and ensure success in the future. Today, traditional BI must be combined with agile BI (the use of agile software development to accelerate traditional BI for faster results and more adaptability) and big data to deliver the fastest and most useful insights so that businesses may convert, serve, and retain more customers.

Essentially, for a business to survive, BI must continuously evolve and adapt to improve agility ...

Read More on Datafloq
How Connecting Big Data and Employee Training Improves Productivity

Businesses can get a number of advantages through big data solutions, but one that's often overlooked is improving the employee training experience. Less than one percent of this data is even analyzed, but a team of data analysts can use business intelligence to advantage in a number of training processes.

Improve Productivity, Especially in New Hires

If you are looking for a way to streamline the process of new employee training, companies like Allen Communications demonstrate that big data can drive onboarding as well as other processes. Reliable data and data mining techniques can help you spot trends and patterns that point to the most productive aspects of your training programs in various job roles. Big data helps your company get more benefits by speeding up the process and maximizing results.

A combination of performance tracking mechanisms, trainee feedback, and quality analytics software can help you to identify which training programs are having the most effect. Big data helps management recognize and improve the procedures that are most efficient in getting new hires up to expected levels of knowledge and skill.

Boosts in Revenue

These new training technologies allow you to customize your training modules to better fit the needs of your company and ...

Read More on Datafloq
Ethical Implications Of Industrialized Analytics

As analytics are embedded more and more deeply into processes and systems that we interact with, they now directly impact us far more than in the past. No longer constrained to providing marketing offers or assessing the risk of a credit application, analytics are beginning to make truly life and death decisions in areas as diverse as autonomous vehicles and healthcare. These developments necessitate that attention is given to the ethical and legal frameworks required to account for today’s analytic capabilities.

Analytics Will Create Winners & Losers

In my recent client meetings and conference talks, such as the Rock Stars of Big Data event in early November, a certain question has come up repeatedly. When I discuss the analytics embedded in autonomous vehicles, I am often asked about the ethics and legalities behind them. A lot of focus is given to the safety of autonomous vehicles and rightly so. If the automated analytics in the vehicle don’t work right, people can die. This means a lot of scrutiny is being, and will continue to be, placed on the algorithms under the hood. I often make the point that the technology for autonomous vehicles will be ready well before our laws and public ...

Read More on Datafloq
GPS, IoT, and the Future Tech that Could Replace it All

Whether you’re lost on a lonely country road, or simply want to check into a cute café in New York City, GPS always has your back. Now that it’s so fully integrated into our cars, our phones, and our lives, it can be hard to remember a time before location services were available on almost every device. Where would we be without GPS today? Besides making it tough to get lost as long as your battery lasts, GPS has become more sophisticated and useful than we could have imagined twenty years ago. In honor of Geographic Information Systems Day on November 16, let’s take a look at how far GPS has come—and how far it could go from where we are now. 

Recent Innovations

GPS is accurate enough for most of us when it’s being used on our smartphones or in our cars. This is because we have human brains, and we’re able to make deductions quickly if the GPS isn’t 100% precise. Robots, however, don’t have that ability yet—and neither do we, if we’re not relying on normal vision or reality. New GPS technology from the University of Texas uses software that can pick up precise GPS signals that are accurate ...

Read More on Datafloq
How Big Data Could Affect What You Pay For Car Insurance

Whether you are a good driver or a careless one, there are insurance companies who will be happy to reward you for your skill (or lack thereof). When it comes to using big data collected from cars to benefit our lives, automobile insurance is one clear industry that could see big changes in the next fifteen years. For instance, based on the information that your vehicle shares, your insurance company can learn a whole lot about your driving style, habits, and most importantly what risks you should be rated for. Beyond the obvious data your car can share, there are more personal items which also may impact your future insurance premiums.

One study found a direct correlation between positive Twitter posts and a healthier heart. So while you are letting your autonomous car drive you to work, and updating your Twitter feed, you may one day be telling your insurance company how to treat your policy. It is a science known as "sentiment analysis" and while it is relatively new, it has been said to be quite accurate. Combine posts with artificial intelligence analysis, and you may be sharing more with your insurance company than you realized or wanted to.

As you ...

Read More on Datafloq
How to build a Successful Big Data Analytics Proof-of-Concept

This article is sponsored by CloudMoyo - Partner of choice for solutions at the intersection of Cloud & Analytics.

For all kinds of organizations, whether large multi-national enterprises or small businesses, developing a big data strategy is a difficult and time-consuming exercise. In fact, big data projects can take up to 18 months to finish. While a few within an organization may be well aware of what Big Data is and what its possibilities are, not everyone else, including the decision-makers, is. Developing a proof of concept (PoC) is the right way to begin and to develop a business case. This can help organizations answer questions such as where to start, which departments to involve, which functional areas to address, and what the return on the required investment will be. All of these aspects should be included in your big data business case.

Buying Business Intelligence (BI) and Analytics solutions has changed dramatically within the past few years with the advent of emerging technologies and the ever-increasing sources of data. Historically, vendors didn't offer a Proof-of-Concept (PoC) stage during the buying process, and if a vendor did provide one, it'd typically take months to set up ...

Read More on Datafloq
Agile Amped Video – Live from Southern Fried Agile

A few weeks back I had the pleasure of attending and speaking at the Southern Fried Agile event in Charlotte, North Carolina. It was quite an event with over 700 people in attendance. This is a primarily local event drawing people mostly from the Charlotte area. To say agile is big in that town is […]
How Data Can Help Your Company Improve Internal Communication

If you’re struggling to make your internal communication fast and efficient, then you may be taking the wrong approach. Thankfully, with all of the data you now have at your fingertips, you can make educated choices and implement detailed strategies that lead to effortless communication.

3 Data-Driven Ways to Enhance Internal Communication

While you may not spend time thinking about the connection between data and communication strategies, the relationship does exist. Let’s take a quick look at three specific data-driven ways you can enhance your company’s internal communication with relative ease.

1. Have a Plan for Mass Communication

If your business depends on supply chain efficiency to consistently produce products and meet demand, then you know just how important it is to keep things moving at a steady pace. But what happens when a warehouse or factory employee calls in sick?

“For factories and manufacturing plants, the effect of call-offs can be serious,” DialMyCalls explains. “A factory floor is designed to operate with certain people at certain points in production to ensure that your goods are being produced at the quality you expect. One missing link in that chain could close down the entire production line.”

How do you ensure that call-offs and unforeseen scheduling issues ...

Read More on Datafloq
Data Science and Big Data Training – Gathering Expressions of Interest

In recent days many people have asked us what training we will be running in the near future, which of our BME lectures we will open to outside participants, and what company-specific training can be requested from us. So that we can say more than just "follow the blog", I thought it would be a good idea to collect the contact details of those who are interested:

If you would like to be notified in advance about the data science and big data training courses we take part in, register on the page below:

Register your interest

We will continue to announce current opportunities on the blog, and we have just relaunched the blog's newsletter (sign up here), where we will also share the information. This list, however, is not a newsletter. If you mark the kinds of topics you are interested in and we happen to launch a relevant course, we will proactively send registrants the information that applies to them.

We hope this will also help those who are already planning next year's training budget.

If you cannot wait for the announced courses to appear, or you have a specific need, feel free to get in touch: most of our training is tailored to a company or an individual, and we are happy to suggest a syllabus and provide a quote.

Gáspár Csaba: +36208234154

Big Data and Risk Management in Financial Markets (Part I)

We have seen how the interdisciplinary use of big data has affected many sectors. Examples include contagion spreading (Culotta, 2010), predictions of music album success (Dhar and Chang, 2009), and presidential elections (Tumasjan et al., 2010).

In financial markets, sentiment analysis is probably the best-known application of machine learning techniques to big datasets (Bollen et al., 2011). In spite of all the hype, though, risk management is still an exception. New information and technology stacks have not brought as many benefits to risk management as they have to trading, for instance.

Risk is indeed usually addressed from an operational perspective, from a customer relationship angle, or specifically for fraud prevention and credit scoring. However, applications strictly related to financial markets are still not so widespread, mainly because of the following problem: in theory, more information should entail a higher degree of accuracy, while in practice it (also) exponentially increases the complexity of the system, making it really complicated to identify and analyze in a timely manner unstructured data that might be extremely valuable in such fast-paced environments.

The likelihood (and the risk) of a network systemic failure is then multiplied by the increase in the interconnection degree of the markets. More and more data can help central ...

Read More on Datafloq
Automation of Excel Macros

There is still huge value for end users in reporting with Excel. Excel is simple to work with, allows data access, and lets end users program with minimal learning. As users develop their skills, they inevitably look at automation through macros. So if we want to turn this into something a little […]
Swift and massive data classification advances score a win for better securing sensitive information

The next BriefingsDirect Voice of the Customer digital transformation case study explores how -- in an era when cybersecurity attacks are on the rise and enterprises and governments are increasingly vulnerable -- new data intelligence capabilities are being brought to the edge to provide better data loss prevention (DLP).

We'll learn how Digital Guardian in Waltham, Massachusetts, analyzes both structured and unstructured data to predict and prevent loss of data and intellectual property (IP) with increased accuracy.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn how data recognition technology supports network and endpoint forensic insights for enhanced security and control, we're joined by Marcus Brown, Vice President of Corporate Business Development for Digital Guardian. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: What are some of the major trends making DLP even more important, and even more effective?

Brown: Data protection has very much come to the forefront in the last couple of years. Unfortunately, we wake up every morning and read in the newspapers, see on television, and hear on the radio a lot about data breaches. It’s pretty much every type of company, every type of organization, government organizations, etc., that’s being hit by this phenomenon at the moment.


So, awareness is very high, and apart from the frequency, a couple of key points are changing. First of all, you have a lot of very skilled adversaries coming into this, criminals, nation-state actors, hacktivists, and many others. All these people are well-trained and very well resourced to come after your data. That means that companies have a pretty big challenge in front of them. The threat has never been bigger.

In terms of data protection, there are a couple of key trends at the cyber-security level. People have been aware of the so-called insider threat for a long time. This could be a disgruntled employee or it could be someone who has been recruited for monetary gain to help some organization get to your data. That’s a difficult one, because the insider has all the privilege and the visibility and knows where the data is. So, that’s not a good thing.

Then, you have employees, well-meaning employees, who just make mistakes. It happens to all of us. We touch something in Outlook, and we have a different email address than the one we were intending, and it goes out. The well-meaning employees, as well, are part of the insider threat.

Outside threats

What’s really escalated over the last couple of years are the advanced external attackers or the outside threat, as we call it. These are well-resourced, well-trained people from nation-states or criminal organizations trying to break in from the outside. They do that with malware or phishing campaigns.

About 70 percent of the attacks start with the phishing campaign, when someone clicks on something that looked normal. Then, there's just general hacking, a lot of people getting in without malware at all. They just hack straight in using different techniques that don’t rely on malware.

People have become so good at developing malware and targeting malware at particular organizations, at particular types of data, that a lot of tools like antivirus and intrusion prevention just don’t work very well. The success rate is very low. So, there are new technologies that are better at detecting stuff at the perimeter and on the endpoint, but it’s a tough time.

There are internal and external attackers. A lot of people outside are ultimately after the two main types of data that companies have. One is customer data, which is credit card numbers, healthcare information, and all that stuff. All of this can be sold on the black market per record for so-and-so many dollars. It’s a billion-dollar business. People are very motivated to do this.

Most companies don’t want to lose their customers’ data. That’s seen as a pretty bad thing, a bad breach of trust, and people don’t like that. Then, obviously, for any company that has a product where you have IP, you spent lots of money developing that, whether it’s the new model of a car or some piece of electronics. It could be a movie, some new clothing, or whatever. It’s something that you have developed and it’s a secret IP. You don’t want that to get out, as well as all of your other internal information, whether it’s your financials, your plans, or your pricing. There are a lot of people going after both of those things, and that’s really the challenge.

In general, the world has become more mobile and spread out. There is no more perimeter to stop people from getting in. Everyone is everywhere, private life and work life are mixed, and you can access anything from anywhere. It’s a pretty big challenge.

Gardner: Even though there are so many different types of threats, internal, external, and so forth, one of the common things that we can do nowadays is get data to learn more about what we have as part of our inventory of important assets.

While we might not be able to seal off that perimeter, maybe we can limit the damage that takes place by early detection of problems. The earlier that an organization can detect that something is going on that shouldn’t be, the quicker they can come to the rescue. How does the instant analysis of data play a role in limiting negative outcomes?

Can't protect everything

Brown: If you want to protect something, you have to know it’s sensitive and that you want to protect it. You can’t protect everything. You have to find which data is sensitive, and we're able to do that on the fly, recognizing sensitive data and nonsensitive data. That’s a key part of the DLP puzzle, the data protection puzzle.

We work for some pretty large organizations, some of the largest companies and government organizations in the world, as well as a lot of medium- and smaller-sized customers. Whatever it is we're trying to protect, personal information or indeed the IP, we need to be in the right place to see what people are doing with that data.

Our solution consists of two main types of agents. Some agents are on endpoint computers, which could be desktops or servers, Windows, Linux, and Macintosh. It’s a good place to be on the endpoint computer, because that’s where people, particularly the insider, come into play and start doing something with data. That’s where people work. That’s how they come into the network and it’s how they handle a business process.

So the challenge in DLP is to support the business process. Let people do with data what they need to do, but don’t let that data get out. The way to do that is to be in the right place. I already mentioned the endpoint agent, but we also have network agents, sensors, and appliances in the network that can look at data moving around.

The endpoint is really in the middle of the business process. Someone is working, they're working with different applications, getting data out of those applications, and they're doing whatever they need to do in their daily work. That’s where we sit, right in the middle of that, and we can see who the user is and what application they're working with. It could be an engineer working with the computer-aided design (CAD) or the product lifecycle management (PLM) system developing some new automobile or whatever, and that’s a great place to be.

We rely very heavily on the HPE IDOL technology for helping us classify data. We use it particularly for structured data, anything like a credit card number, or alphanumeric data. It could also be free text about healthcare, patient information, and all this sort of stuff.

We use IDOL to help us scan documents. We can recognize regular expressions, that’s a credit card number type of thing, or Social Security. We can also recognize terminology. We rely on the fact that IDOL supports hundreds of languages and many different subject areas. So, using IDOL, we're able to recognize a whole lot of anything that’s written in textual language.
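As a rough illustration of the kind of pattern-based recognition described here, a minimal Python sketch might pair regular expressions with a Luhn checksum to reduce false positives on card-like numbers. This is not HPE IDOL's API or Digital Guardian's implementation; the patterns, function names, and labels are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration only; a real classifier would use
# far richer rules, dictionaries, and language support.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")   # 13-19 digits, optional separators
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")     # US Social Security number layout


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text):
    """Return the set of (assumed) sensitivity labels found in `text`."""
    labels = set()
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        labels.add("credit_card")
    if SSN_RE.search(text):
        labels.add("ssn")
    return labels
```

Calling `classify()` on the text of an outgoing document would return a set of labels that a policy layer could then act on. The checksum step matters because a bare digit pattern also matches order numbers and other long identifiers.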

Our endpoint agent also has some of its own intelligence built in that we put on top of what we call contextual recognition or contextual classification. As I said, we see the customer list coming out of or we see the jet fighter design coming out of the PLM system and we then tag that as well. We're using IDOL, we're using some of our technology, and we're using our vantage point on the endpoint being in the business process to figure out what the data is.

We call that data-in-use monitoring and, once we see something is sensitive, we put a tag on it, and that tag travels with the data no matter where it goes.

An interesting thing is that if you have someone making a mistake, an unintentional, good-willed employee, accidentally attaching the wrong doc to something that goes out, obviously it will warn the user of that.

We can stop that

If you have someone who is very, very malicious and is trying to obfuscate what they're doing, we can see that as well. For example, taking a screenshot of some top-secret diagram, embedding that in a PowerPoint and then encrypting the PowerPoint, we're tagging those docs. Anything that results from IP or top-secret information, we keep tagging that. When the guy then goes to put it on a thumb drive, put it on Dropbox, or whatever, we see that and stop that.

So that’s still a part of the problem, but the two points are classify it, that’s what we rely on IDOL a lot for, and then stop it from going out, that’s what our agent is responsible for.

Gardner: Let’s talk a little bit about the results here, when behaviors, people and the organization are brought to bear together with technology, because it’s people, process and technology. When it becomes known in the organization that you can do this, I should think that that must be a fairly important step. How do we measure effectiveness when you start using a technology like Digital Guardian? Where does that become explained and known in the organization and what impact does that have?

Brown: Our whole approach is a risk-based approach and it’s based on visibility. You’ve got to be able to see the problem and then you can take steps and exercise control to stop the problems.

When you deploy our solution, you immediately gain a lot of visibility. I mentioned the endpoints and I mentioned the network. Basically, you get a snapshot without deploying any rules or configuring in any complex way. You just turn this on and you suddenly get this rich visibility, which is manifested in reports, trends, and all this stuff. What you get, after a very short period of time, is a set of reports that tell you what your risks are, and some of those risks may be that your HR information is being put on Dropbox.

You have engineers putting the source code onto thumb drives. It could all be well-meaning, they want to work on it at home or whatever, or it could be some bad guy.

One of the biggest points of risk in any company is when an employee resigns and decides to move on. A lot of our customers use the monitoring and the reporting we have at that time to actually sit down with the employee and say, "We noticed that you downloaded 2,000 files and put them on a thumb drive. We’d like you to sign this saying that you're going to give us that data back."

That’s a typical use case, and that’s the visibility you get. You turn it on and you suddenly see all these risks, hopefully, not too many, but a certain number of risks and then you decide what you're going to do about it. In some areas you might want to be very draconian and say, "I'm not going to allow this. I'm going to completely block this. There is no reason why you should put the jet fighter design up on Dropbox."

Gardner: That’s where the epoxy in the USB drives comes in.

Warning people

Brown: Pretty much. On the other hand, you don’t want to stop people using USB, because it’s about their productivity, etc. So, you might want to warn people, if you're putting some financial data on to a thumb drive, we're going to encrypt that so nothing can happen to it, but do you really want to do this? Is this approach appropriate? People get a feeling that they're being monitored and that the way they are acting maybe isn't according to company policy. So, they'll back out of it.

In a nutshell, you look at the status quo, you put some controls in place, and after those controls are in place, within the space of a week, you suddenly see the risk posture changing, getting better, and the incidence of these dangerous actions dropping dramatically.

Very quickly, you can measure the security return on investment (ROI) in terms of people’s behavior and what’s happening. Our customers use that a lot internally to justify what they're doing.

Generally, you can get rid of a very large amount of the risk, say 90 percent, with an initial pass, or the first two passes of rules to say, we don’t want this, we don’t want that. Then, you're monitoring the status, and suddenly, new things will happen. People discover new ways of doing things, and then you’ve got to put some controls in place, but you're pretty quickly up into the 90 percent and then you're fine-tuning to get those last little bits of risk out.

Gardner: Because organizations are becoming increasingly data-driven, they're getting information and insight across their systems and their applications. Now, you're providing them with another data set that they could use. Is there some way that organizations are beginning to assimilate and analyze multiple data sets including what Digital Guardian’s agents are providing them in order to have even better analytics on what’s going on or how to prevent unpleasant activities?

Brown: In this security world, you have the security operations center (SOC), which is kind of the nerve center where everything to do with security comes into play. The main piece of technology in that area is the security information and event management (SIEM) technology. The market leader is HPE’s ArcSight, and that’s really where all of the many tools that security organizations use come together in one console, where all of that information can be looked at in a central place and can also be correlated.

We provide a lot of really interesting information for the SIEM for the SOC. I already mentioned we're on the endpoint and the network, particularly on the endpoint. That’s a bit of a blind spot for a lot of security organizations. They're traditionally looking at firewalls, other network devices, and this kind of stuff.

We provide rich information about the user, about the data, what’s going on with the data, and what’s going on with the system on the endpoint. That’s key for detecting malware, etc. We have all this rich visibility on the endpoint and also from the network. We actually pre-correlate that. We have our own correlation rules. On the endpoint computer in real time, we're correlating stuff. All of that gets populated into ArcSight.

At the recent HPE Protect Show in National Harbor in September, we showed the latest generation of our integration, which we're very excited about. We have a lot of ArcSight content, which helps people in the SOC leverage our data, and we gave a couple of presentations at the show on that.

Gardner: And is there a way to make this even more protected? I believe encryption could be brought to bear and it plays a role in how the SIEM can react and behave.

Seamless experience

Brown: We actually have a new partnership, related to HPE's acquisition of Voltage, which is a real leader in the e-mail security space. It’s all about applying encryption to messages and managing the keys and making that user experience very seamless and easy to use.

Adding to that, we're bundling up some of the classification functionality that we have in our network sensors. What we have is a combination of Digital Guardian Network DLP and the HPE Data Security Encryption solution, where an enterprise can define a whole bunch of rules based on templates.

We can say, "I need to comply with HIPAA," "I need to comply with PCI," or whatever standard it is. Digital Guardian on the network will automatically scan all the e-mail going out and automatically classify according to our rules which e-mails are sensitive and which attachments are sensitive. It then goes on to the HPE Data Security Solution where it gets encrypted automatically and then sent out.

It’s basically allowing corporations to apply a standard set of policies, not relying on the user to say they need to encrypt this, not leaving it to the user’s judgment, but actually applying standard policies across the enterprise for all e-mail, making sure it gets encrypted. We are very excited about it.

Gardner: That sounds key -- using encryption to the best of its potential, being smart about it, not just across the waterfront, and then not depending on a voluntary encryption, but doing it based on need and intelligence.

Brown: Exactly.

Gardner: For those organizations that are increasingly trying to be data-driven, intelligent, taking advantage of the technologies and doing analysis in new interesting ways, what advice might you offer in the realm of security? Clearly, we’ve heard at various conferences and other places that security is, in a sense, the killer application of big-data analytics. If you're an organization seeking to be more data-driven, how can you best use that to improve your security posture?

Brown: The key, as far as we’re concerned, is that you have to watch your data, you have to understand your data, you need to collect information, and you need visibility of your data.

The other key point is that the security market has been shifting pretty dramatically from more of a network view much more toward the endpoint. I mentioned earlier that antivirus and some of these standard technologies on the endpoint aren't really cutting it anymore. So, it’s very important that you get visibility down at the endpoint and you need to see what users are doing, you need to understand what your systems are running, and you need to understand where your data is.

So collect that, get that visibility, and then leverage that visibility with analytics and tools so that you can profit from an automated kind of intelligence.

Sponsor: Hewlett Packard Enterprise.

What is the Blockchain – part 4 – Transactions and Smart Contracts

Blockchain is rapidly gaining attention from organisations in every industry. However, it is a difficult-to-understand technology that, if not executed correctly, could result in serious harm for your organisation. Therefore, in this series of posts on the blockchain, I explain what the Blockchain is and how it affects your organisation. The first part was a generic introduction to the Blockchain, while the second part focused on different types of blockchains and dApps. The third post provided insights into several startups that are working hard on developing the required technology, as well as several Blockchain challenges that need to be overcome before we will see wide-scale adoption of the Blockchain. In this fourth post, I will dive deeper into the different types of transactions that can be recorded on the blockchain, as well as one particular type of transaction: smart contracts.

Different Transactions

A key characteristic of the blockchain is that it removes the need for trusted intermediaries: centralised organisations that take a fee for verifying transactions. Removing the middlemen completely changes the game for organisations that want to do business with each other. Last week, for the first time, a transaction took place between two organisations across the globe which ...

Read More on Datafloq
Big Data Strategy (Part II): a Data Maturity Map

As shown in Part I, there are a series of issues related to internal data management policies and approaches. The answers to these problems are not trivial, and we need a frame to approach them.

A Data Stage of Development Structure (DS2) is a maturity model built for this purpose, a roadmap developed to implement a revenue-generating and impactful data strategy. It can be used to assess the current situation of the company and to understand the future steps to undertake to enhance internal big data capabilities.

The following table provides a four-by-four matrix in which the increasing stages of evolution are labelled Primitive, Bespoke, Factory, and Scientific, while the metrics through which they are assessed are Culture, Data, Technology, and Talent. The final considerations are drawn in the last row, which concerns the financial impact on the business of a well-set data strategy.

Figure 1. Data Stage of Development Structure (DS2)

Stage one is about raising awareness: the realization that data science could be relevant to the company's business. In this phase, there is no governance structure in place, no pre-existing technology, and above all no organization-wide buy-in. Yet, tangible projects are still the result of individuals' enthusiasm for data being channeled into something actionable. The set of skills owned is ...

Read More on Datafloq
How to Use Big Data to Leverage Micro-Moments

With the vast amount of digital data and analytics available, business owners and marketers can better target their marketing initiatives. Think with Google used big data insights to create meaningful solutions businesses can utilize to increase sales. Through big data research, the researchers at Google found that micro-moments are where businesses big and small have the best opportunities to reach their audiences.

Using Big Data to Discover Insights

No other company deals with the vast amount of data that Google has access to. By learning how to capture and analyze this information, Google is able to provide their big data research findings to marketing professionals and business owners - allowing these professionals to hone in on the right marketing methods for their specific business model.

This leads us to a more important point, leveraging big data to create real results for your business. In “Retail Results: Six Case Studies of Big Data Success”, we see how many popular chains have utilized big data analysis to come up with actionable ways to quickly incorporate these insights in their marketing strategy. By taking quick and meaningful action on big data insights, these companies were able to develop exceptional results with data driven marketing strategies.

What is ...

Read More on Datafloq
How Data Visualization can help you Become a better “Salesperson”?

The amount of business data today is going through the roof, and production is conducted by people, computers, and machines. The sheer volume is mind-boggling, and one of the main challenges is how to cherry pick meaningful information from that ocean and pass it off in a captivating way. In such a situation, it’s not a good idea to rely on the imperfect working of the human brain and harness the power of words alone.

The cutting edge

The most prudent salespersons are in tune with tech innovations and invest a great deal of time in the research phase. They do not just read generic articles, but also scientific studies and business reports. Data visualization is of great help when you need to grasp large quantities of information, and that’s why businessmen, managers and sales teams employ business intelligence solutions that provide them with key insights on operations, strategies, and means of achieving goals.

One thing is for sure – there is no shortage of data to collect these days. With the unlimited broadband that many internet users now have access to, everything they do on their computers and devices, including any IoT devices they might own, produces and transmits data. Windows 10 is perhaps the ...

Read More on Datafloq
How to Bring Together Data Analytics with Business Intelligence

There is a very thin line that separates data analytics from business intelligence. In both cases, we make use of data to analyze and interpret results. While data analytics is all about asking questions and setting up predictive models and forecasts, business intelligence is all about using these processes to make business decisions.

In essence, data analytics comes before you can perform business intelligence tasks. Bringing together data analytics with business intelligence is a crucial first step towards generating meaningful information from the terabytes of business data.

Executing data analytics tasks for business intelligence is useful in a wide range of industries. For instance, data analytics could be used in the customer service industry to study the response time of support tickets with various degrees of priority, analyze customer satisfaction metrics, and then use this data for business intelligence decisions such as the number of agents assigned to each of these categories of support.
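The support-ticket example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ticket data is made up, and the per-priority targets and function names are hypothetical rather than taken from any real help-desk system. The analytics step computes the average response time per priority, and the business intelligence step turns that into a staffing signal.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample tickets: (priority, minutes until first response).
TICKETS = [
    ("high", 12), ("high", 18), ("medium", 45),
    ("medium", 75), ("low", 240), ("low", 180),
]

# Hypothetical response-time targets in minutes for each priority.
TARGETS = {"high": 10, "medium": 60, "low": 480}


def mean_response_by_priority(tickets):
    """Analytics step: average response time for each priority level."""
    grouped = defaultdict(list)
    for priority, minutes in tickets:
        grouped[priority].append(minutes)
    return {priority: mean(values) for priority, values in grouped.items()}


def priorities_needing_agents(tickets, targets):
    """Decision step: priorities whose average response exceeds the target."""
    averages = mean_response_by_priority(tickets)
    return sorted(p for p, avg in averages.items() if avg > targets[p])
```

With the sample data, only the high-priority queue misses its assumed target, which is the kind of result that would feed a decision about where to add agents.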

High Frequency Trading (HFT), which involves analysis of millions of data points each second to make buy/sell calls, is another wonderful example of how data analytics and business intelligence can come together to make profitable business decisions.

How To Bring Together Data Analytics With Business Intelligence

By necessity, ...

Read More on Datafloq
Ways Your Marketers Probably Aren’t Using Big Data to Their Advantage

Marketers love to say they are doing everything they need to get customers, but the truth is, most of them aren’t. The majority of marketing companies are failing in one HUGE area...big data. Companies realize data is critical in decision making, but a lot of companies aren’t exactly sure how to use it to its full advantage. Being able to analyze the data and implement it in your strategy is one of the most vital skills to have in the marketing world.

Analytics are the key to marketing these days. There is no better way to reach your audience and give them what they want than by reviewing analytics. You’ll know exactly where they are and what they react to the most. Never before have you been able to have this knowledge at the tips of your fingers so easily. Why not use this key information?

Knowing how to act on the data you get is the most important skill to have in the area of analytics. Being able to read the data means nothing if you don’t know what to do with the answers you get from it. You no longer have to wait weeks or months to get data and make ...

Read More on Datafloq
The Newcomers Guide to Data Warehouse Software

"Data warehouse" is a term that's been used in business intelligence since 1990. It refers to a collection of data that's kept separate from day-to-day operational data, such as transaction records, in order to provide a consistent historical dataset. For that reason, the information in a data warehouse must be unchanging and integrated as a cohesive structure that can be used for different analytic goals, on different types of business intelligence software.

Why use a data warehouse?

A data warehouse provides the basis for executives to understand collected business data across variants such as time or location. Suppose a marketing director is looking for a history of seasonal differences in the sale of "EZ Cleaning Product" in the Midwest. He/she could extract such results from the data warehouse over the past three years to look for buyer patterns that could potentially improve future sales.
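
The marketing director's question can be sketched as a query against a small in-memory table. This is only an illustration using Python's built-in `sqlite3` module; the `sales` table, its columns, and the sample rows are assumptions, not a real warehouse schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, region TEXT, product TEXT, units INTEGER)"
)
# Hypothetical historical rows standing in for warehouse data.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2014-01-15", "Midwest", "EZ Cleaning Product", 120),
        ("2014-07-10", "Midwest", "EZ Cleaning Product", 310),
        ("2015-01-20", "Midwest", "EZ Cleaning Product", 140),
        ("2015-07-05", "Midwest", "EZ Cleaning Product", 290),
        ("2015-07-09", "Northeast", "EZ Cleaning Product", 500),
    ],
)

# Bucket each sale into a season by month, then total units per season.
query = """
SELECT CASE
         WHEN CAST(strftime('%m', sale_date) AS INTEGER) IN (12, 1, 2) THEN 'winter'
         WHEN CAST(strftime('%m', sale_date) AS INTEGER) IN (3, 4, 5) THEN 'spring'
         WHEN CAST(strftime('%m', sale_date) AS INTEGER) IN (6, 7, 8) THEN 'summer'
         ELSE 'autumn'
       END AS season,
       SUM(units) AS total_units
FROM sales
WHERE region = ? AND product = ?
GROUP BY season
ORDER BY season
"""
rows = conn.execute(query, ("Midwest", "EZ Cleaning Product")).fetchall()
print(rows)
```

Because the historical rows do not change, the same query returns the same seasonal totals whenever it is run, which is the consistency the warehouse is meant to provide.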

This is a very common business need; 89% of American businesses invest in data and data analytics. This includes retailers, manufacturers, financial institutions, and any business large or small hoping to gain productive insights from review of their past business data.

A data warehouse provides companies a multidimensional view of this consolidated information. This can include any relevant information in ...

Read More on Datafloq
Let's Put a Twist on the Search for Innovative Data Analysis Directions

Let's Put a Twist on the Search for Innovative Data Analysis Directions

By the book, any self-respecting data mining or data analysis project should be managed according to the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology. Under it, a project has six main phases: (1) first we understand the business problem, then (2) the related data, (3) we carry out data transformations, (4) we run machine learning or statistical models, (5) we evaluate the results from a business perspective, and, if everything checks out, (6) we deploy the solution. The reality is obviously more complicated (a detailed description of the methodology can be read here), and iterations are almost always needed, where we have to jump back one or more phases because of what we learned in the current step. It is interesting to see how well this iterative development fits the agile mindset that is spreading more and more today.

But I don't want to dissect or second-guess the CRISP-DM methodology now; despite all its flaws I really like it, and it has often saved us in delicate situations. Rather, I would like to draw attention to a new kind of phenomenon: more and more often we have the opportunity to start a project not at the very first phase, understanding the business task, but at the second, getting to know the data.

The point is that there are companies that are very eager to launch some truly exciting data project capable of showing significant business impact, and they constantly sound out potential partners to name a really punchy use case. Often a well-positioned consulting firm also approaches us with a similar task: "I sit on company X's big data board, we are working out the strategy right now; if you have some great application ideas, tell us, and if they are good enough, this could turn into good business."

It is very hard to come up with a good proposal in such cases; it is an awkward situation when it is not the client who brings the business problem but us. So in these situations we often suggest turning to the data: we managed to transform several such inquiries into projects run along more data-driven thinking:

  • First we were given access to part of the data and simply came to understand what type, quantity, and quality of data these companies live with. Typically they handed over a few interesting data domains that they too believed held surprises.
  • Next, in a kind of data discovery phase, we poked around in the data to see what it hides. These analyses can be thought of as ad hoc reports; they help us offer ideas about what the data could be used for. The point is that here the data itself gives us hints.
  • Knowing the data, we prepared a good many proposals on how the data could be exploited for business. Here, with our fresh eyes coming from a different perspective, we described 10-20 use cases along the possibilities (and limits) offered by the data, in some cases putting together a few slides' worth of material. We often back up such a proposal with some data specific to the company.
  • The client's business team examined these; a significant share of the use cases did not reach their threshold of interest, but there were always one or two that caught their attention. Since they saw some relevant data alongside, they are often easier to convince than if I, as an outsider, simply claimed that this is surely a problem crying out for a solution at their company.
  • This is usually followed by a proof-of-concept phase, in which we work out the analysis for the given use case on historical data; this is where we prove that the analysis or data mining modelling is worth continuing.
  • Finally, if we were convincing in that step too, we deal with integrating the solution: this is when our results go onto the dashboards, we build the calculations into the systems, and we work out how the results should be refreshed in the long run.

As you can see, the iterative approach that delivers increasingly complex solutions is preserved here too, but the first point is about the data, not the business. In making the decisions it of course also weighed heavily that the costs appear gradually this way. Data exploration plus use-case preparation is a much smaller task (8-21 days) than, say, building a proof-of-concept solution that already runs on historical data, and before the integration, which requires truly serious investment, there is a valid decision point where the solution's business impact can be estimated well from historical data.

I really like working this way; very creative and much more business-flavoured solutions can emerge. And from the moment we back up what we say with analyses run on the partner's own data, even ideas that at first seem too simple or too sci-fi become much more interesting.

Does this approach interest you because your company too is constantly hunting for ideas on innovative directions, yet you always have the feeling that the incoming ideas are somehow not credible? Write to us, and we will gladly help you find, following the methodology above, what is truly worth introducing at your company.

Gáspár Csaba:


How to Solve Genomics’ Big Data Management Problem

How to Solve Genomics’ Big Data Management Problem

Today, medical and technological advancements are made more rapidly than ever before. Often the two go hand in hand, as seen in surgical robots and nanotechnology to name a couple. Even research and experiments on things that were previously theoretical or purely misunderstood are now opened up through the advancements in technology.

In the case of genomics, countless hours of research have been conducted to better understand the building blocks of our species as well as the diseases and ailments that plague us. But the resulting body of genomic information has grown so vast that it has entered the realm of big data, and it demands proper handling and accessibility for those who need it.

The three main problems facing the management of genomics big data are:

  • Storing this data properly
  • Protecting the privacy of genomics research and of the people whose genetic material is being used
  • Integrating new and improved programs that give medical professionals around the world better access to genomic research

Solving these problems is daunting, but overcoming them means having a system that saves both lives and money.


When it comes to storing large amounts of data, a room full of ...

Read More on Datafloq
Why Big Money and Big Data Win Elections

Why Big Money and Big Data Win Elections

Political campaigns are anything but simple. They’ve gotten even more complex over the years as industries lobby for their own interests by sending money to the candidates they believe will serve them best. While campaigns have used whatever data they could get their hands on in the past, the amount of specific information now available (thanks to big data) has been a boon to politicians hoping to target specific voters and gain new support. In 2016, all of the major players are using big data in their efforts to gain office. Here’s how analytics and money play a role in election outcomes. 

The Obama Campaign and the Shift Toward Analytics

TV advertising segments used to play an enormous role in American politics. While they’re still a tool that campaigns use to win voters and turn opinion against opponents, they’re less important than they used to be, and they’re more targeted—largely due to the popularity of big data. Using voter databases and information provided by voter surveys, the 2008 Obama campaign used big data successfully to win Iowa, a highly competitive primary state. It didn’t stop there—experts cite the Obama campaign’s use of big data and technology as a major factor in his ...

Read More on Datafloq
Logging challenges for containerized applications: Interview with Eduardo Silva

Logging challenges for containerized applications: Interview with Eduardo Silva

Next week, another edition of the CloudNativeCon conference will take place in the great city of Seattle. One of the key topics in this edition is containers, a software technology that enables and eases the development and deployment of applications by encapsulating them so they can be deployed through a simple process.

In this installment, we took the opportunity to chat with Eduardo Silva a bit about containers and his upcoming session, Logging for Containers, which will take place during the conference.

Eduardo Silva is a principal open source developer at Treasure Data Inc., where he currently leads the efforts to make the logging ecosystem friendlier for embedded systems, containers, and cloud services.

He also directs the Monkey Project organization, which is behind the open source projects Monkey HTTP Server and Duda I/O.

A well-known speaker, Eduardo has spoken at events across South America and at recent Linux Foundation events in the US, Asia, and Europe.

Thanks so much for your time Eduardo!

What is a container and how is it applied specifically in Linux?

When deploying applications, it is always desirable to have full control over the given resources, and we would likely want the application isolated as much as possible. Containers are the concept of packaging an application with its entire runtime environment in an isolated way.
To accomplish this at the operating system level, Linux provides two features that make it possible to implement containers: cgroups and namespaces.

  • cgroups (control groups) allow us to limit the resource usage of one or more processes, so you can define how much CPU or memory a program (or programs) may use when running.
  • namespaces, on the other hand (associated with users and groups), allow us to restrict access to specific resources such as mount points, network devices, and IPC, among others.
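
To make the cgroups point concrete, here is a minimal sketch that maps a CPU and memory limit to the values one would write into cgroup control files. It assumes the cgroup v2 interface, where `cpu.max` holds a quota and period in microseconds and `memory.max` holds a byte count; the helper name and the chosen limits are hypothetical, and nothing is written to disk.

```python
def cgroup_v2_limits(cpu_fraction, memory_mib, period_us=100_000):
    """Map limits to cgroup v2 control-file contents (nothing is written).

    cpu.max is "<quota> <period>" in microseconds; memory.max is bytes.
    """
    quota_us = int(cpu_fraction * period_us)
    return {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_mib * 1024 * 1024),
    }

# Hypothetical limits: half of one CPU and 256 MiB of memory.
limits = cgroup_v2_limits(cpu_fraction=0.5, memory_mib=256)
for control_file, value in limits.items():
    print(control_file, "=", value)
```

A container runtime does essentially this bookkeeping on your behalf, alongside setting up namespaces, which is why most people never touch these files directly.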

In short, if you like programming, you can implement your own containers with a few system calls. Since this could be tedious work from an operability perspective, there are libraries and services that abstract all the details and let you focus on what really matters: deployment and monitoring.

So, what is the difference between a Linux Container and, for example a virtual machine?

A container aims to be a granular unit of an application and its dependencies; it's one process or a group of processes. A virtual machine runs a whole operating system, which, as you might guess, is a bit heavier.

So, if we ought to define some advantages of containers versus virtualization, could you tell us a couple of advantages and disadvantages of both?

There are many differences, pros and cons. Taking into account our cloud environment, where you need to deploy applications at scale (and many times just on demand), containers are the best choice: deploying a container takes just a small fraction of a second, while deploying a virtual machine may take a few seconds and a bunch of resources that most likely will be wasted.

Due to the opportunities it brings, there are some container projects and solutions out there such as LXC, LXD or LXCFS. Could you share with us what is the difference between them? Do you have one you consider your main choice and why?

Having the technology to implement containers is the first step, but as I said before, not everybody would like to play with system calls; instead, different technologies exist to create and manage containers. LXC and LXD provide the next level of abstraction for managing containers, and LXCFS is a user-space file system for containers (it works on top of FUSE).
Since I don't play with containers at a low level, I don't have a strong preference.

And what about solutions such as Docker, CoreOS or Vagrant? Any take on them?

Docker is the big player nowadays; it provides good security and mechanisms to manage and deploy containers. CoreOS has a prominent container engine called Rocket (rkt). I have not used it, but it looks promising in terms of design and implementation, and orchestration services like Kubernetes already support it.

You are also working on a quite interesting project called Fluent-Bit. What is the project about?

I will give you a bit of context. I'm part of the open source engineering team at Treasure Data. Our primary focus in the team is to solve data collection and data delivery for a wide range of use cases and integrations, and Fluentd exists to accomplish this. It's a very successful project that nowadays is solving logging challenges in hundreds of thousands of systems; we are very proud of it.
A year ago we decided to dig into the embedded Linux space, and as you might know, the capacity of these devices in terms of CPU, memory, and storage is likely more restricted than that of a common server machine.
Fluentd is really good, but it also has its technical requirements: it's written in a mix of Ruby and C, and having Ruby on most embedded Linux systems could be a real challenge or a blocker. That's why a new solution was born: Fluent Bit.
Fluent Bit is a data collector and log shipper written 100% in C. It has a strong focus on Linux, but it also works on BSD-based systems, including OS X/macOS. Its architecture has been designed to be very lightweight and to provide high performance from collection to distribution.
Some of its features are:

  • Input / Output plugins
  • Event driven (async I/O operations)
  • Built-in Metrics
  • Security: SSL/TLS
  • Routing
  • Buffering
  • Fluentd Integration

Although it was initially conceived for embedded Linux, it has evolved, gaining features that make it cloud friendly without losing its performance and lightweight goals.
If you are interested in collecting data and delivering it somewhere, Fluent Bit allows you to do that through its built-in plugins. Some of them are:

  • Input
    • Forward: protocol on top of TCP; gets data from Fluentd or Docker containers.
    • Head: read the initial chunk of bytes from a file.
    • Health: check the health of a remote TCP server.
    • kmsg: read kernel log messages.
    • CPU: collect CPU usage metrics, globally and per core.
    • Mem: memory usage of the system or of a specific running process.
    • TCP: listen for JSON messages over TCP.
  • Output
    • Elasticsearch database
    • Treasure Data (our cloud analytics platform)
    • NATS Messaging Server
    • HTTP end-point

So as you can see, with Fluent Bit it would be easy to aggregate Docker logs into Elasticsearch, monitor your current OS resource usage, or collect JSON data over the network (TCP) and send it to your own HTTP end-point.
The use cases are many, and this is a very exciting tool, not just from an end-user perspective but also from a technical implementation point of view.
The project is moving forward pretty quickly and getting exceptional new features, such as support for writing your own plugins in Golang! (Yes, C -> Go.) Isn't it neat?
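
The input, buffering, routing, and output ideas mentioned above can be illustrated with a toy pipeline. This is a conceptual sketch in Python and not Fluent Bit's actual API or configuration format; the `Pipeline` class, its method names, the tags, and the sample records are all hypothetical.

```python
import json
from fnmatch import fnmatch

class Pipeline:
    """Toy input -> buffer -> routed output flow; not Fluent Bit's actual API."""

    def __init__(self):
        self.buffer = []   # (tag, record) pairs waiting to be flushed
        self.routes = []   # (tag pattern, output callable) pairs

    def add_output(self, match, output):
        """Register an output that receives records whose tag matches."""
        self.routes.append((match, output))

    def ingest(self, tag, raw_json):
        """Parse a JSON message from an input and buffer it under a tag."""
        self.buffer.append((tag, json.loads(raw_json)))

    def flush(self):
        """Deliver every buffered record to each matching output."""
        pending, self.buffer = self.buffer, []
        for tag, record in pending:
            for match, output in self.routes:
                if fnmatch(tag, match):
                    output(tag, record)

delivered = []
pipeline = Pipeline()
# Stand-ins for a search-database output and an HTTP end-point output.
pipeline.add_output("docker.*", lambda tag, rec: delivered.append(("search", rec)))
pipeline.add_output("*", lambda tag, rec: delivered.append(("http", rec)))
pipeline.ingest("docker.web", '{"log": "GET / 200"}')
pipeline.ingest("cpu", '{"cpu_p": 12.5}')
pipeline.flush()
print(delivered)
```

Tag-based matching is used here only to show how one record can fan out to several destinations; a production collector would also handle retries, back-pressure, and TLS.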

You will be presenting at CNCF event CloudNativeCon & KubeCon in November. Can you share with us a bit of what you will be presenting about in your session?

I will share our experience with logging in critical environments and dig into common pains and best practices that can be applied to different scenarios.
It will cover everything about logging in the scope of (but not limited to) containers, microservices, distributed logging, aggregation patterns, Kubernetes, open source solutions for logging, and demos.
I'd say that everyone who's a sysadmin, DevOps engineer, or developer will definitely benefit from the content of this session; logging "is" and is "required" everywhere.

Finally, on a personal note. Which do you consider to be the geekiest songs of this century?

That's a difficult question!
I am not an expert on geek music, but I would vouch for Spybreak by Propellerheads (Matrix).

How Data Analytics Can Empower B2B Sales in Pharma

How Data Analytics Can Empower B2B Sales in Pharma

This article is sponsored by CloudMoyo - Partner of choice for solutions at the intersection of Cloud & Analytics.

Earlier, pharmaceutical companies would invest in expensive, broad-scale product promotion via lengthy doctor visits. A recent survey suggests that 87% intend to increase their use of analytics to target spending and drive improved ROI. Some of that money is likely to go into monitoring doctors’ therapeutic tastes, geographic trends, peak prescription rates – anything that has a direct relevance to the sales cycle. Drug companies are employing predictive methods to determine which consumers and physicians are most likely to utilize a drug and create more targeted on-the-ground marketing efforts. Pharmas are providing drug reps with mobile devices and real-time analytics on their prospects. Reps can then tailor their agenda to suit the physician. Afterward, the sales team can analyze the results to determine whether the approach was effective.

For pharma clinical research organizations (CROs), business development mainly involves preparing responses to information requests and proposals for pharmaceutical, medical device, and biotechnology companies. Today, CROs rely on data-driven insights, which require reports and performance metrics available at decision makers’ fingertips for tracking multiple opportunities, maintaining win-loss ratios, predicting pipelines, analyzing lost sales, ...

Read More on Datafloq
2016 election campaigners look to big data analysis to gain an edge in intelligently reaching voters

2016 election campaigners look to big data analysis to gain an edge in intelligently reaching voters

The next BriefingsDirect Voice of the Customer digital transformation case study explores how data-analysis services startup BlueLabs in Washington, DC helps presidential election campaigns better know and engage with potential voters.

We'll learn how BlueLabs relies on high-performing analytics platforms that democratize querying, opening up the value of vast data resources to more of those who need to know.


Here to describe how big data is being used creatively by contemporary political organizations for two-way voter engagement, we're joined by Erek Dyskant, Co-Founder and Vice President of Impact at BlueLabs Analytics in Washington. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Obviously, this is a busy season for the analytics people who are focused on politics and campaigns. What are some of the trends that are different in 2016 from just four years ago? It’s a fast-changing technology set, and it's also a fast-changing methodology. And of course, the trends about how voters think, react, use social, and engage are also dynamic. So what's different this cycle?

Dyskant: From a voter-engagement perspective, in 2012, we could reach most of our voters online through a relatively small set of social media channels -- Facebook, Twitter, and a little bit on the Instagram side. Moving into 2016, we see a fragmentation of the online and offline media consumption landscape and many more folks moving toward purpose-built social media platforms.

If I'm at the HPE Conference and I want my colleagues back in D.C. to see what I'm seeing, then maybe I'll use Periscope, maybe Facebook Live, but probably Periscope. If I see something that I think one of my friends will think is really funny, I'll send that to them on Snapchat.
Where political campaigns have traditionally broadcast messages out through the news-feed style social-media strategies, now we need to consider how it is that one-to-one social media is acting as a force multiplier for our events and for the ideas of our candidates, filtered through our campaign’s champions.

Gardner: So, perhaps a way to look at that is that you're no longer focused on precincts physically and you're no longer able to use broadcast through social media. It’s much more of an influence within communities and identifying those communities in a new way through these apps, perhaps more than platforms.

Social media

Dyskant: That's exactly right. Campaigns have always organized voters at the door and on the phone. Now, we think of one more way. If you want to be a champion for a candidate, you can be a champion by knocking on doors for us, by making phone calls, or by making phone calls through online platforms.

You can also use one-to-one social media channels to let your friends know why the election matters so much to you and why they should turn out and vote, or vote for the issues that really matter to you.

Gardner: So, we're talking about retail campaigning, but it's a bit more virtual. What’s interesting though is that you can get a lot more data through the interaction than you might if you were physically knocking on someone's door.

Dyskant: The data is different. We're starting to see a shift from demographic targeting. In 2000, we were targeting on precincts. A little bit later, we were targeting on combinations of demographics, on soccer moms, on single women, on single men, on rural, urban, or suburban communities separately.


Moving to 2012, we looked at everything that we knew about a person and built individual-level predictive models, so that we knew how each person's individual set of characteristics made that person more or less likely to be someone with whom our candidate would have an engaging conversation through a volunteer.

Now, what we're starting to see is behavioral characteristics trumping demographic or even consumer data. You can put whiskey drinkers in your model, you can put cat owners in your model, but isn't it a lot more interesting to put in your model the fact that this person has an online profile on our website and this is their clickstream? Isn't it much more interesting to put into a model that this person is likely to consume media via TV, is likely to be a cord-cutter, is likely to be a social media trendsetter, is likely to view multiple channels, or to use both Facebook and media on TV?

That lets us have a really broad reach or really broad set of interested voters, rather than just creating an echo chamber where we're talking to the same voters across different platforms.
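
As a purely illustrative sketch of an individual-level model built on behavioral features, the snippet below scores a voter with a logistic function. The feature names, weights, and bias are hypothetical and hand-picked; they are not BlueLabs' model, and a real model would learn its weights from data.

```python
import math

# Hypothetical weights; a real model would estimate these from data.
WEIGHTS = {
    "has_online_profile": 1.2,
    "cord_cutter": 0.4,
    "social_trendsetter": 0.8,
}
BIAS = -1.5

def contact_score(features):
    """Return a 0-1 score from boolean behavioral features (logistic model)."""
    z = BIAS + sum(
        WEIGHTS[name]
        for name, present in features.items()
        if present and name in WEIGHTS
    )
    return 1 / (1 + math.exp(-z))

baseline = contact_score({})
engaged = contact_score({"has_online_profile": True, "social_trendsetter": True})
print(round(baseline, 3), round(engaged, 3))
```

Scores like these are what let a campaign rank individuals for outreach rather than targeting whole precincts or demographic groups.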

Gardner: So, over time, the analytics tools have gone from semi-blunt instruments to much more precise, and you're also able to better target what you think would be the right voter for you to get the right message out to.

One of the things you mentioned that struck me is the word "predictive." I suppose I think of campaigning as looking to influence people, and that polling then tries to predict what will happen as a result. Is there somewhat less daylight between these two than I am thinking, that being predictive and campaigning are much more closely associated, and how would that work?

Predictive modeling

Dyskant: When I think of predictive modeling, what I think of is predicting something that the campaign doesn't know. That may be something that will happen in the future or it may be something that already exists today, but that we don't have an observation for it.

In the case of the role of polling, what I really see about that is understanding what issues matter the most to voters and how it is that we can craft messages that resonate with those issues. When I think of predictive analytics, I think of how is it that we allocate our resources to persuade and activate voters.

Over the course of elections, what we've seen is an exponential trajectory in the amount of data that is considered by predictive models. Even more important than that is an exponential growth in the use cases of models. Today, we see every time a predictive model is used, it’s used in a million and one ways, whereas in 2012 it might have been used in 50, 20, or 100 sessions about each voter contact.

Gardner: It’s a fascinating use case to see how analytics and data can be brought to bear on the democratic process and to help you get messages out, probably in a way that's better received by the voter or the prospective voter, like in a retail or commercial environment. You don’t want to hear things that aren’t relevant to you, and when people do make an effort to provide you with information that's useful or that helps you make a decision, you benefit and you respect and even admire and enjoy it.

Dyskant: What I really want is for the voter experience to be as transparent and easy as possible, that campaigns reach out to me around the same time that I'm seeking information about who I'm going to vote for in November. I know who I'm voting for in 2016, but in some local elections, I may not have made that decision yet. So, I want a steady stream of information to be reaching voters, as they're in those key decision points, with messaging that really is relevant to their lives.

I also want to listen to what voters tell me. If a voter has a conversation with a volunteer at the door, that should inform future communications. If somebody has told me that they're definitely voting for the candidate, then the next conversation should be different from someone who says, "I work in energy. I really want to know more about the Secretary’s energy policies."

Gardner: It's just as when a salesperson is engaging with a prospect: they use customer relationship management (CRM), and that data is captured, analyzed, and shared. That becomes a much better process for both the buyer and the seller. It's the same thing in a campaign, right? The better information you have, the more likely you're going to be able to serve that user, that voter.

Dyskant: There definitely are parallels to marketing, and that’s how we at BlueLabs decided to found the company and work across industries. We work with Fortune 100 retail organizations that are interested in how, once someone buys one item, we can bring them back into the store to buy the follow-on item or maybe to buy the follow-on item through that same store’s online portal. How it is that we can provide relevant messaging as users engage in complex processes online? All those things are driven from our lessons in politics.

Politics is fundamentally different from retail, though. It's a civic decision, rather than an individual-level decision. I always want to be mindful that I have a duty to voters to provide extremely relevant information to them, so that they can be engaged in the civic decision that they need to make.

Gardner: Suffice it to say that good quality comparison shopping is still good quality comparison decision-making.

Dyskant: Yes, I would agree with you.

Relevant and speedy

Gardner: Now that we've established how really relevant, important, and powerful this type of analysis can be in the context of the 2016 campaign, I'd like to learn more about how you go about getting that analysis and making it relevant and speedy across a large variety of data sets and content sets. But first, let’s hear more about BlueLabs. Tell me about your company, how it started, why you started it, maybe a bit about yourself as well.

Dyskant: Of the four of us who started BlueLabs, some of us met in the 2008 elections and some of us met during the 2010 midterms working at the Democratic National Committee (DNC). Throughout that pre-2012 experience, we had the opportunity as practitioners to try a lot of things, sometimes just once or twice, sometimes things that we operationalized within those cycles.

Jumping forward to 2012, we had the opportunity to scale all that research and development, to say that we did this one thing that was a different way of building models, and it worked in this congressional race. We decided to make this three people’s full-time jobs and scale that up.

Moving past 2012, we got to build potentially one of the fastest-growing startups, one of the most data-driven organizations, and we knew that we built a special team. We wanted to continue working together with ourselves and the folks who we worked with and who made all this possible. We also wanted to apply the same types of techniques to other areas of social impact and other areas of commerce. This individual-level approach to identifying conversations is something that we found unique in the marketplace. We wanted to expand on that.
Increasingly, what we're working on is this segmentation-of-media problem. It's this idea that some people watch only TV, and you can't ignore a TV. It has lots of eyeballs. Some people watch only digital and some people consume a mix of media. How is it that you can build media plans that are aware of people's cross-channel media preferences and reach the right audience with their preferred means of communications?

Gardner: That’s fascinating. You start with the rigors of the demands of a political campaign, but then you can apply that in so many ways, answering and anticipating the types of questions that more verticals, more sectors, and charitable organizations would want to be involved with. That’s very cool.

Let’s go back to the data science. You have this vast pool of data. You have a snappy analytics platform to work with. But one of the things that I am interested in is how you get more people, whether it's in your organization or a campaign like the Hillary Clinton campaign or the DNC, to then be able to utilize that data to get to these inferences, get to these insights that you want.

What is it that you look for and what is it that you've been able to do in that form of getting more people able to query and utilize the data?

Dyskant: Data science happens when individuals have direct access to ask complex questions of a large, gnarly, but well-integrated data set. If I have 30 terabytes of data across online contacts, off-line contacts, and maybe a sample of clickstream data, and I want to ask things like of all the people who went to my online platform and clicked the password reset because they couldn't remember their password, then never followed up with an e-mail, how many of them showed up at a retail location within the next five days? They tried to engage online, and it didn't work out for them. I want to know whether we're losing them or are they showing up in person.

That type of question maybe would make it into a business-intelligence (BI) report a few months from then, but people who are thinking about what we do every day would say, "I wonder about this," turn it into a query, and say, "I think I found something." If we give these customers phone calls, maybe we can reset their passwords over the phone and reengage them.
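As an illustration of how the password-reset question above could be turned into a query, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are all hypothetical and exist only for this example; they are not BlueLabs' schema, and a production system such as Vertica would differ in syntax and scale.

```python
import sqlite3

# Hypothetical tables and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE password_resets (user_id INTEGER, reset_at TEXT);
CREATE TABLE followup_emails (user_id INTEGER, sent_at TEXT);
CREATE TABLE store_visits    (user_id INTEGER, visited_at TEXT);

INSERT INTO password_resets VALUES (1, '2016-10-01'), (2, '2016-10-01'), (3, '2016-10-02');
INSERT INTO followup_emails VALUES (2, '2016-10-02');
INSERT INTO store_visits    VALUES (1, '2016-10-04'), (3, '2016-10-20');
""")

# Users who clicked password reset, never followed up by e-mail,
# and showed up at a retail location within the next five days.
query = """
SELECT COUNT(DISTINCT r.user_id)
FROM password_resets r
JOIN store_visits v
  ON v.user_id = r.user_id
 AND julianday(v.visited_at) - julianday(r.reset_at) BETWEEN 0 AND 5
WHERE NOT EXISTS (
    SELECT 1 FROM followup_emails e
    WHERE e.user_id = r.user_id AND e.sent_at >= r.reset_at
)
"""
showed_up = conn.execute(query).fetchone()[0]
print("Reset, no e-mail follow-up, in store within 5 days:", showed_up)
```

In this toy data only user 1 qualifies: user 2 followed up by e-mail, and user 3 visited a store well outside the five-day window.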

Human intensive

That's just one tiny, micro example, which is why data science is truly a human-intensive exercise. You get 50-100 people working at an enterprise solving problems like that and what you ultimately get is a positive feedback loop of self-correcting systems. Every time there's a problem, somebody is thinking about how that problem is represented in the data. How do I quantify that? If it’s significant enough, then how is it that the organization can improve in this one specific area?

All that can be done with business logic is the interesting piece. You need very granular data that's accessible via query and you need reasonably fast query time, because you can’t ask questions like that when you're going to get coffee every time you run a query.

Layering predictive modeling allows you to understand the opportunity for impact if you fix that problem. One hypothesis about those users who cannot reset their passwords is that maybe those users aren't that engaged in the first place. You fix their password but it doesn’t move the needle.

The other hypothesis is that it's people who are actively trying to engage with your server and are unsuccessful because of this one very specific barrier. If you have a model of user engagement at an individual level, you can say that these are really high-value users that are having this problem, or maybe they aren’t. So you take data science, align it with really smart individual-level business analysis, and what you get is an organization that continues to improve without having to have an executive-level decision for each one of those things.

Gardner: So a great deal of inquiry, experimentation, iterative improvement, and feedback loops can all come together very powerfully. I'm all for the data scientist full-employment movement, but we need to do more than have people go through a data scientist to use, access, and develop these feedback insights. What is it about the SQL, natural language, or APIs? What is it that you like to see that allows for more people to be able to directly relate and engage with these powerful data sets?

Dyskant: One of the things is the product management of data schemas. So whenever we build an analytics database for a large-scale organization I think a lot about an analyst who is 22, knows VLOOKUP, took some statistics classes in college, and has some personal stories about the industry that they're working in. They know, "My grandmother isn't a native English speaker, and this is how she would use this website."

So it's taking that hypothesis that’s driven from personal stories, and being able to, through a relatively simple query, translate that into a database query, and find out if that hypothesis proves true at scale.

Then, potentially take the results of that query, dump them into a statistical-analysis language, or use database analytics to answer that in a more robust way. What that means is that we favor very wide schemas, because I want someone to be able to write a three-line SQL statement, no joins, that answers a business question that I wouldn't have thought to put in a report. So that’s the first line: analyst-friendly schemas that are accessed via SQL.
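To make the "three-line SQL statement, no joins" idea concrete, here is a small sketch, again using Python's sqlite3 module. The wide table `contacts_wide`, its columns, and its rows are assumptions made up for this example; the point is only that attributes an analyst might ask about are already columns on the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One wide, denormalized table (hypothetical): every attribute is already a column.
conn.execute("""
CREATE TABLE contacts_wide (
    person_id INTEGER, channel TEXT, contacted_on TEXT,
    preferred_language TEXT, age_band TEXT, completed_signup INTEGER
)
""")
rows = [
    (1, "web", "2016-10-01", "en", "18-29", 1),
    (2, "web", "2016-10-01", "en", "30-44", 1),
    (3, "web", "2016-10-02", "en", "45-64", 1),
    (4, "web", "2016-10-02", "en", "65+",   0),
    (5, "web", "2016-10-03", "es", "18-29", 1),
    (6, "web", "2016-10-03", "es", "30-44", 0),
    (7, "web", "2016-10-04", "es", "45-64", 0),
    (8, "web", "2016-10-04", "es", "65+",   0),
]
conn.executemany("INSERT INTO contacts_wide VALUES (?, ?, ?, ?, ?, ?)", rows)

# The analyst's hypothesis ("non-native speakers struggle with this website")
# becomes a three-line query with no joins.
three_line_sql = """
SELECT preferred_language, AVG(completed_signup) AS signup_rate
FROM contacts_wide
GROUP BY preferred_language
"""
signup_rates = dict(conn.execute(three_line_sql).fetchall())
print(signup_rates)
```

The trade-off is storage and load-time work in exchange for simpler queries: the joins are done once when the table is built rather than by every analyst.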

The next line is deep key performance indicators (KPIs). Once we step out of the analytics database, consumers drop into the wider organization that’s consuming data at a different level. I always want reporting to report on opportunity for impact, to report on whether we're reaching our most valuable customers, not how many customers are we reaching.

"Are we reaching our most valuable customers" is much more easily addressable; you just talk to different people. Whereas, when you ask, "Are we reaching enough customers," I don’t know how to find out. I can go over to the sales team and yell at them to work harder, but ultimately, I want our reporting to facilitate smarter working, which means incorporating model scores and predictive analytics into our KPIs.
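One way to picture a KPI that incorporates model scores is to report reach weighted by predicted customer value next to the raw head count. The sketch below is an assumption about how such a metric might be computed; the field names `reached` and `predicted_value` and the numbers are hypothetical.

```python
def reach_kpis(customers):
    """Return (raw reach, value-weighted reach) for a list of customer records."""
    reached = [c for c in customers if c["reached"]]
    raw_reach = len(reached) / len(customers)
    total_value = sum(c["predicted_value"] for c in customers)
    value_weighted_reach = sum(c["predicted_value"] for c in reached) / total_value
    return raw_reach, value_weighted_reach


# Hypothetical model output: one high-value customer we missed, three low-value ones we reached.
customers = [
    {"id": 1, "predicted_value": 90, "reached": False},
    {"id": 2, "predicted_value": 10, "reached": True},
    {"id": 3, "predicted_value": 10, "reached": True},
    {"id": 4, "predicted_value": 10, "reached": True},
]
raw_reach, value_weighted_reach = reach_kpis(customers)
print(f"Customers reached: {raw_reach:.0%}")
print(f"Predicted value reached: {value_weighted_reach:.0%}")
```

Here the head-count KPI looks healthy at 75%, while the value-weighted KPI shows that only a quarter of the predicted value was reached, which points to talking to different people rather than simply to more people.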

Getting to the core

Gardner: Let’s step back from the edge, where we engage the analysts, to the core, where we need to provide the ability for them to do what they want and which gets them those great results.

It seems to me that when you're dealing in a campaign cycle that is very spiky, you have a short period of time where there's a need for a tremendous amount of data, but that could quickly go down between cycles of an election, or in a retail environment, be very intensive leading up to a holiday season.

Do you therefore take advantage of the cloud models for your analytics that make a fit-for-purpose approach to data and analytics pay as you go? Tell us a little bit about your strategy for the data and the analytics engine.

Dyskant: All of our customers have a cyclical nature to them. I think that almost every business is cyclical, just some more than others. Horizontal scaling is incredibly important to us. It would be very difficult for us to do what we do without using a cloud model such as Amazon Web Services (AWS).

Also, one of the things that works well for us with HPE Vertica is the licensing model where we can add additional performance with only the cost of hardware or hardware provision through the cloud. That allows us to scale up our cost areas during the busy season. We'll sometimes even scale them back down during slower periods so that we can have those 150 analysts asking their own questions about the areas of the program that they're responsible for during busy cycles, and then during less busy cycles, scale down the footprint of the operation.

Gardner: Is there anything else about the HPE Vertica OnDemand platform that benefits your particular need for analysis? I'm thinking about the scale and the rows. You must have so many variables when it comes to a retail situation, a commercial situation, where you're trying to really understand that consumer?

Dyskant: I do everything I can to avoid aggregation. I want my analysts to be looking at the data at the interaction-by-interaction level. If it’s a website, I want them to be looking at clickstream data. If it's a retail organization, I want them to be looking at point-of-sale data. In order to do that, we build data sets that are very frequently in the billions of rows. They're also very frequently incredibly wide, because we don't just want to know every transaction with this dollar amount. We want to know things like what the variables were, and where that store was located.

Getting back to the idea that we want our queries to be dead-simple, that means that we very frequently append additional columns on to our transaction tables. We’re okay that the table is big, because in a columnar model, we can pick out just the columns that we want for that particular query.
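The reason a wide table is affordable in a columnar model can be sketched in a few lines of Python. This is a toy illustration of column-oriented storage in general, not a description of how HPE Vertica is implemented; the table and column names are invented.

```python
# Toy column store: each column is stored as its own list, so a query
# touches only the columns it names, however wide the table is.
transactions = {
    "amount":       [20, 35, 15, 50],
    "store_region": ["north", "south", "north", "south"],
    "weather":      ["rain", "sun", "sun", "rain"],
    "model_score":  [0.2, 0.9, 0.4, 0.7],
    # ...a real table might carry hundreds more appended columns
}


def sum_by(table, group_col, value_col):
    """Aggregate value_col by group_col, reading just those two columns."""
    totals = {}
    for key, value in zip(table[group_col], table[value_col]):
        totals[key] = totals.get(key, 0) + value
    return totals


revenue_by_region = sum_by(transactions, "store_region", "amount")
print(revenue_by_region)
```

Appending `weather` or `model_score` to the table adds storage but does not slow down this query, because those columns are never read.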
Then, moving into some of the in-database machine-learning algorithms allows us to perform higher-order computation within the database and have less data shipping.

Gardner: We're almost out of time, but I wanted to do some predictive analysis ourselves. Thinking about the next election cycle, midterms, only two years away, what might change between now and then? We hear so much about machine learning, bots, and advanced algorithms. How do you predict, Erek, the way that big data will come to bear on the next election cycle?

Behavioral targeting

Dyskant: I think that a big piece of the next election will be around moving even more away from demographic targeting, toward even more behavioral targeting. How is it that we reach every voter based on what they're telling us about themselves and what matters to them, and how it matters to them? That will increasingly drive our models.

To do that involves probably another 10X scale in the data, because that type of data is generally at the clickstream level, generally at the interaction-by-interaction level, incorporating things like Twitter feeds, which adds an additional level of complexity and laying in computational necessity to the data.

Gardner: It almost sounds like you're shooting for sentiment analysis on an issue-by-issue basis, a very complex undertaking, but it could be very powerful.

Dyskant: I think that it's heading in that direction, yes.

Sponsor: Hewlett Packard Enterprise.


Does a Budget Increase Lead to Successful Data Projects?

Back in 2015, organizations expressed high expectations with regard to the use of data. Over two-thirds of organizations said they saw huge opportunities with data, and nearly 70% expected a budget increase for 2016. Vision and management support were the major themes; operationally, the challenges were building up teams and developing central data warehouses. Organizations pointed out that they used data mainly for marketing purposes, but that they expected to start developing use cases across their entire organizations soon. In Big Data Survey 2016, we have researched these themes and more. Have organizations been able to live up to their expectations this year?

Financial room for projects

Although organizations indicate in 2016 that the opportunities with data are smaller than they were last year, more organizations employ data scientists and use predictive models.

Since last year, more organizations have allocated budget, which has created financial room for initial projects.

Beyond the hype

Let’s face the brutal facts: becoming a data-driven enterprise is not done overnight. There is no such thing as a magic box that spits out an algorithm after you put some data in it. Developing successful data use cases is primarily about making data available, exploring the ...

Read More on Datafloq
BBBT Hosts Webinar with Snowflake Computing on Changing the Game for Cloud Data Warehousing

This Friday, the Boulder Business Intelligence Brain Trust (BBBT), the largest industry analyst consortium of its kind, will host a private webinar with Snowflake Computing on how Snowflake’s unique data warehouse built for the cloud is making the scalability and complexity challenges of large scale business intelligence a thing of the past.

(PRWeb November 01, 2016)


Teradata Partners Conference 2016: Teradata Everywhere

Our technologized society is becoming opaque.
As technology becomes more ubiquitous and our relationship with digital devices ever
more seamless, our technical infrastructure seems to be increasingly intangible.
- Honor Harger

An idea that I could sense was in the air during my last meeting with Teradata’s crew in California, during their last influencer event, was confirmed and reaffirmed a couple of weeks ago during Teradata’s big partner conference: Teradata is now in full-fledged transformational mode.

Of course, for companies like Teradata that are used to being on the front line of the software industry, particularly in the data management space, transformation has now become much more than a “nice to do”. These days it’s pretty much the life breath of any organization at the top of the software food chain.

If they want to stay at the top, these companies have the complicated mandate to be fast and smart enough to provide the software, the method, and the means to enable customers to gain technology and business improvements and the value that results from these changes.

And while it seems Teradata has taken its time for this transformation, it is also evident that the company is taking it very seriously. Will this be enough to keep pace with peer vendors within a very active, competitive, and transformational market? Well, it’s hard to say, but certainly with a number of defined steps, Teradata looks like it will be able to meet its goal of remaining a key player in the data management and analytics industry.

Here we take an up-to-date look at Teradata’s business and technology strategy, including its flexible approach to deployment and ability for consistent and coherent analytics over all types of deployment, platforms, and sources of data; and then explore what the changes mean for the company and its current and future customers.

The Sentient Enterprise
As explained in detail in a previous installment, Teradata has developed a new approach towards the adoption of analytics, called the “sentient enterprise.” This approach aims to guide companies to:

  • improve their data agility
  • adopt a behavioral data platform
  • adopt an analytical application platform
  • adopt an autonomous decision platform

While we won’t give a full explanation of the model here (see the video below or my recent article on Teradata for a fuller description of the approach), there is no doubt that this is a crucial pillar for Teradata’s transformational process, as it forms the backbone of Teradata‘s approach to analytics and data management.

Teradata Video: The Sentient Enterprise

As mentioned in the previous post, one aspect of the “sentient enterprise” approach from Teradata that I particularly like is the “methodology before technology” aspect, which focuses on scoping the business problem, then selecting the right analytics methodology, and at the end choosing the right tools and technology (including tools such as automatic creation models and scoring datasets).

Teradata Everywhere
Another core element of the new Teradata approach consists of spreading its database offering wide, i.e., making it available everywhere, especially in the cloud. This movement involves putting Teradata’s powerful analytics to work. Teradata Database will now be available in different delivery modes and via different providers, including on:

  • Amazon Web Services—Teradata Database will be available in a massively parallel processing (MPP) configuration, scalable for up to 32 nodes, including services such as node failure recovery and backup, as well as restoring and querying data in Amazon’s Simple Storage Service (S3). The system will be available in more than ten geographic regions.
  • Microsoft’s Azure—Teradata Database is expected to be available by Q4 of 2016 in the Microsoft Azure Marketplace. It will be offered with MPP (massively parallel processing) features and scalability for up to 32 nodes.
  • VMware—via the Teradata Virtual Machine Edition (TVME), users have the option of deploying a virtual machine edition of Teradata Database for virtual environments and infrastructures.
  • Teradata Database as a Service—Extended availability for the Teradata Database will be available to customers in Europe through a data center hosted in Germany.
  • Teradata’s own on-premises IntelliFlex platform.

Availability of Teradata Database on different platforms

Borderless Analytics and Hybrid Clouds
The third element in the new Teradata Database picture involves a comprehensive provision of analytics regardless of the delivery mode chosen, an offering which fits the reality of many organizations—a hybrid environment consisting of both on-premises and cloud offerings.

With a strategy called Borderless Analytics, Teradata allows customers to deploy comprehensive analytics solutions within a single analytics framework. Enabled by Teradata solutions such as QueryGrid, its multi-source SQL and processing engine, and Unity, its orchestration engine for Teradata multi-system environments, this strategy proposes a way to perform consistent and coherent analytics over heterogeneous platforms with multiple systems and sources of data, i.e., in the cloud, on-premises, or virtual environments.

At the same time, this also serves Teradata as a basis for its larger strategy for addressing the Internet of Things (IoT) market. Teradata is addressing this goal with the release of a set of new offerings called Analytics of Things Accelerators (AoTAs), comprising technology-agnostic intellectual property that emerged from Teradata’s real-life IoT project engagements.

These accelerators can help organizations determine which IoT analytical techniques and sensors to use and trust. Due to the AoTAs’ enterprise readiness and design, companies can deploy them without having an enterprise scaling approach in mind, and not have to go through time-consuming experimentation phases before deployment to ensure the right analytical techniques have been used. Teradata’s AoTAs accelerate adoption, enabling deployment cost reduction and ensuring reliability. This is a noteworthy effort to provide IoT projects with an effective enterprise analytics approach.

What Does this Mean for Current and Potential Teradata Customers?
Teradata seems to have a concrete, practical, and well-thought-out strategy regarding the delivery of new generation solutions for analytics, focusing on giving omnipresence, agility, and versatility to its analytics offerings, and providing less product dependency and more business focus to its product stack.

But one thing Teradata needs to consider, given the increasing number of solutions available from its portfolio, is being sure to provide clarity and efficiency to customers regarding which solution blend to choose. This is especially true when the solution choice involves increasingly sophisticated big data solutions, a market that is getting “top notch” but certainly is still difficult to navigate, especially for those new to big data.

Teradata’s relatively new leadership team seems to have sensed right away that the company is currently in a very crucial position not only within itself but also within the industry of providing insights. If its strategy works, Teradata might be able to not only maintain its dominance in this arena but also increase its footprint in an industry destined to expand with the advent of the Internet of Things.

For Teradata’s existing customer base, these moves could be encouraging, as they could mean being able to expand the company’s existing analytics platforms using a single platform and therefore without any friction and with cost savings.

For those considering Teradata as a new option, it means having even more options for deploying end-to-end data management solutions using a single vendor rather than having a “best of breed” approach. Either way though, Teradata is pushing towards the future with a new and comprehensive approach to data management and analytics in an effort to remain a key player in this fierce market.

The question is whether Teradata’s strategic moves will resonate effectively within the enterprise market to compete with the existing software monsters such as Oracle, Microsoft, and SAP.

Are you a Teradata user? If so, let us know what you think in the comments section below.

(Originally published on TEC's Blog)
How NPO Provides Personal Recommendations Based on Viewed Content

For over 60 years, the Nederlandse Publieke Omroep (NPO), Public Broadcasting Organization, has been producing and broadcasting radio and television. Every week, NPO reaches 85% of the Dutch population, with a larger presence online. To provide relevant content for online viewers, NPO recently implemented several smart data applications.

Personal broadcasts

Because of changes in viewing behavior and an increased number of channels, it has become increasingly difficult for media companies to retain viewer attention. To remain relevant in the coming decade as a public broadcaster and to make sure that viewers don’t get lost in the overwhelming offer of video content, NPO set an objective to become a personal broadcasting organization.
NPO began identifying potential use cases, such as A/B tests, automatically generated playlists, and personal newsletters, to offer consumers personal viewing experiences. To develop these use cases successfully, it identified three main themes: dashboards, recommenders, and the introduction of NPO ID.

The right process for innovation

Product innovation required implementing an agile methodology and working in multidisciplinary teams.

“For NPO, agile working in multidisciplinary teams was not common practice, to put it mildly. We found out that it was not only necessary to adapt our way of working, but that the whole office needed ...

Read More on Datafloq
Big Data Strategy (Part I): Tips for Analyzing Your Data

We saw in a previous post what the common misconceptions in big data analytics are, and how important it is to start looking at data with a goal in mind.

Even if I personally believe that posing the right question is 50% of what a good data scientist should do, there are alternative approaches that can be implemented. The main one that is often suggested, in particular from non-technical professionals, is the “let the data speak” approach: a sort of magic random data discovery that should spot valuable insights that a human analyst does not notice.

Well, the reality is that this is a highly inefficient method: (random) data mining is resource-consuming and potentially value-destructive. The main reason why data mining is often ineffective is that it is undertaken without any rationale, and this leads to common mistakes such as false positives; over-fitting; neglected spurious relations; sampling biases; causation-correlation reversal; wrong variables inclusion; or eventually model selection (Doornik and Hendry, 2015; Harford, 2014). We should pay specific attention to the causation-correlation problem, since observational data only take into account the second aspect. However, according to Varian (2013), the problem can be easily solved through experimentation.
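The false-positive risk of goal-free data mining can be demonstrated with a small simulation. The sketch below is my own illustration, not taken from the cited authors: it correlates 1,000 columns of pure noise against a noise target and counts how many look "significant". The sample size, feature count, and the 0.361 cutoff (approximately the two-sided 5% critical value of Pearson's r for 30 observations) are assumptions chosen for the example.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)          # fixed seed so the run is repeatable
n_samples, n_features = 30, 1000
threshold = 0.361               # roughly the 5% significance cutoff for n = 30

# The target and every feature are independent noise: no real relationship exists.
target = [rng.gauss(0, 1) for _ in range(n_samples)]
false_positives = sum(
    1
    for _ in range(n_features)
    if abs(pearson([rng.gauss(0, 1) for _ in range(n_samples)], target)) > threshold
)
print(f"{false_positives} of {n_features} noise features look 'significant'")
```

With a 5% cutoff, around 50 of the 1,000 noise features should be expected to pass by chance alone, which is why mining without a hypothesis tends to surface relationships that are not there.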

Hence, I think that a hybrid approach is ...

Read More on Datafloq
Look Back Over DMZ

I was at the Data Modeling Zone Europe 2016 in Berlin as a speaker. It was the 4th Data Modeling Zone in Europe and, in my opinion, one of the best, judging by the conference program and the interesting and awesome chats with other speakers and attendees. This year’s venue was the Abion Hotel in Berlin, situated next to the Spreebogen, which made for a great environment around the venue.

The ITIL Certification was a mandate: In Conversation with Mutasim Abuzeid | Simplilearn

Mutasim Abuzeid comes from a telecommunications engineering background in Sudan, in Africa, where technology and the IT industry were introduced only over the last ten years. “I was in the software industry for more than 8 years, building systems infrastructure for Zain Sudan (Sudan’s first mobile operator) then I moved to the...
How to Foster the Adoption of Lean and Six Sigma in the Sharing Economy? | Simplilearn

Successful businesses like Uber and Airbnb prove that the sharing economy is here to stay. An increasing number of customers are now able to get services that are more affordable and available for everyone. But what about the quality of these new offerings? Today, you can travel by ride-sharing with Uber, eat at someone else’s house with Cooka...
IoT – Why should you care as a Business? (And the challenges you will have) | Simplilearn

The Internet of Things (IoT) is the buzz phrase that’s kept technology journalists busy since mid-2013! With the immense potential that this scenario has, organizations have started using the IoT angle in their work and have also added the IoT title to their executives and gadgets that are a part of the IoT enablement ecosystem. With the hue...
Investigating the Stock Market Futures for Better Investment | Simplilearn

Stock Market Futures Definition: A futures contract is a standardized contract between two parties to buy or sell a particular product or financial instrument at a specified future date and at a price agreed upon today. In finance, a stock market index future is a cash-settled futures contract on the value of a particular stock market index. Characteristi...
Unraveled – Social Media Marketing Career Options | Simplilearn

The role of a Social Media Marketer seems to offer the promise of a position steeped in creativity, of a job which actually pays you for browsing Facebook and Twitter. These and many other misconceptions are prevalent amongst a mass of aspiring professionals looking to this job role for career nirvana. With Social Media Marketing becoming a buzzwor...
Why all Businesses Are at Risk of Phishing Attacks? | Simplilearn

Phishing continues to plague businesses. You might think that by now – over a decade after criminals began sending emails impersonating banks to unsuspecting customers so they could steal credentials to bank accounts – we’d have the problem under control. But we don’t. And what’s worse, we are seeing an increas...
The Scrum Approach | Simplilearn

What is Scrum? It is an iterative methodology that treats major portions of development as a controlled black box. Iterations called sprints (described in more detail later in this article) are used to evolve the product, which is ready to ship after each sprint. This is different from the traditional way to build software, used by companies, which...
Advice from an Alumnus: How the Lean Six Sigma Training helped me Edge Ahead | Simplilearn

From managing shop-floor activity to heavy-hitting large-scale projects as AGM and beyond, Sanjay Bharne’s spectacular rise continues as he now aims for the stars with Simplilearn’s Six Sigma training. For as long as he can remember, Sanjay has wanted to be the man who brought about change for the better, whether at work, in business, o...
How to become a paid Ethical Hacker? | Simplilearn

For as long as the internet has been around, network security has been an issue. In the last few decades, there has been an explosion of interest in ethical hacking. Whereas traditional hackers exploit networks for malicious reasons, ethical hackers work on the side of the ‘good guys’ to protect computer systems from dangerous intrusion...
Framework and Applications of Financial Analysis Technique | Simplilearn

Financial Analysis refers to the process of analyzing and assessing a company’s financial statements to gain an understanding of its business model, financial performance, risk, and profitability. The role of financial statement analysis is to utilize the information available in a company's financial statements (Balance Sheet, ...
Supercharge your IT service Career | Simplilearn webinar starts 30-11-2016 10:30

If you’re in the ITSM field, you no doubt have a dream role. Maybe you want to be a Service Level Manager or work in Service Strategy. This webinar will tell you how to land that dream ITSM role, in 3 simple steps. It starts off with the basics – what is IT Service Management – and goes on to cover its key processes, the critical ...
How to supercharge your Digital Marketing Career | Simplilearn webinar starts 27-11-2016 10:30

There is significant demand for Digital Marketers around the world today and a shortage of qualified, well-trained candidates. You might find yourself questioning: Is digital marketing a career for me? How do I get started or how can I improve my current skills and abilities to find a better role? If you have any doubts whatsoever, then Gaurav, wil...
Internet of Things: 5 Best Connected Consumer Devices

The reach of the Internet of Things (IoT) is incredibly far and wide, as demonstrated by the increasing number of connected devices that are transforming the energy, agriculture, transportation, and healthcare industries. With over 34 billion devices expected to be connected to the internet by 2020, the IoT is being called the next Industrial Revolution, changing the way we interact with the physical world. Today, that revolution even enters our most private and cherished space by bringing a number of devices connected to the IoT into our homes.

If you’re interested in connecting your devices in an effort to create a seemingly impossible smart home, here are the 5 best connected consumer devices for your home on the market today:


Hate the feeling of returning to a freezing home in the dead of winter or walking into a sweltering apartment in the middle of a heat wave? If your answer is yes, Nest offers a way for you to control your home’s temperature with excellent precision right from your smartphone. The Nest family now includes a renowned thermostat; the Nest Protect, which includes a smoke detector; and the Nest Cam, a security camera with high-definition playback.

Nest works with an ...

Read More on Datafloq
3 Steps to your Dream Big Data role | Simplilearn webinar starts 30-11-2016 10:30

This webinar will take you through all the steps needed to land your dream Big Data job. The webinar will cover the following topics: knowledge of new-age datastores (NoSQL); knowledge of analytical software for Big Data like HDFS, Spark and Mahout; finding the right questions and validating their reliability; how to pick up latest developments ...Read More.
How Big Data Will Be Used to Predict the next Natural Disaster

Climate change has been increasing the severity and frequency of natural disasters in recent years, which, in turn, has caused the death of millions of people in different nations around the world, and cost billions of dollars in damages.

As these storms get worse, and the population continues to grow exponentially, it has become even more imperative that we learn everything we can about natural disasters before they happen. It has been well documented that early warning systems go a long way to saving lives. 

How We Are Creating Early Warning Systems

Early Warning Systems (EWS) such as satellites, seismographs, and other sensors have been improving every year, but they have not been very effective in predicting disasters with accuracy. Just look at the 2004 Indian Ocean tsunami, where the population was given only a 10-20 minute warning before they were hit. This late warning caused the death toll to reach hundreds of thousands.

Or, more recently, Hurricane Matthew was predicted to hit the east coast of the United States as a category 1 storm, but that prediction was quickly changed to a category 5 super-storm the very next day. When predictions are this bad, they leave a lot of room for improvement.

So, ...

Read More on Datafloq
Retail Results: Six Case Studies of Big Data Success

Thinking about using big data to help your company grow? It’s a good move—you can improve your ROI (return on investment) in several ways by leveraging the information you’ll gain with big data. However, getting a big data infrastructure set up can be costly and time consuming. Most retailers want evidence that they’ll benefit from this investment by looking to others who have succeeded. The good news? You don’t have to look far to find retailers who have made big money from big data. Here are just a few case studies of large companies showing how powerful predictive analytics can be for business growth and trimming unnecessary spending. 

1. Staples

Social media feedback is important for any business. A large corporation like Staples gets a massive amount of incoming data from different social sources, which can make it difficult to sort the relevant messages from the irrelevant. Enter big data. The company used Dell’s technology to implement a big data solution for social media, which cut irrelevant data down by 75%. Because they were focusing on the messages that mattered, Staples was able to improve customer communication and evaluate marketing campaigns quickly, saving money on projects that weren’t worth the investment.

2. Wal-Mart

For ...

Read More on Datafloq
ServiceMaster’s path to an agile development twofer: Better security and DevOps business benefits

The next BriefingsDirect Voice of the Customer security transformation discussion explores how home-maintenance repair and services provider ServiceMaster develops applications with a security-minded focus as a DevOps benefit.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn how security technology leads to posture maturity and DevOps business benefits, we're joined by Jennifer Cole, Chief Information Security Officer and Vice President of IT, Information Security, and Governance for ServiceMaster in Memphis, Tennessee, and Ashish Kuthiala, Senior Director of Marketing and Strategy at Hewlett Packard Enterprise DevOps. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Jennifer, tell me, what are some of the top trends that drive your need for security improvements and that also spurred DevOps benefits?

Cole: When we started our DevOps journey, security was a little bit ahead of the curve for application security and we were able to get in on the front end of our DevOps transformation.


The primary reason for our transformation as a company is that we are an 86-year-old company that has seven brands under one umbrella, and we needed to have one brand, one voice, and be able to talk to our customers in a way that they wanted us to talk to them.

That means enabling IT to get capabilities out there quickly, so that we can interact with our customers "digital first." As a result of that, we were able to see an increase in the way that we looked at security education and process. We were normally doing our penetration tests after the fact of a release. We were able to put tools in place to test prior to a release, and also teach our developers along the way that security is everyone's responsibility.

ServiceMaster has been fortunate that we have a C-suite willing to invest in DevOps and an Agile methodology. We also had developers who were willing to learn, and with the right intent to deliver code that would protect our customers. Those things collided, and we have the perfect storm.

So, we're delivering quicker, but we also fail faster allowing us to go back and fix things quicker. We're seeing an uptick in what we're delivering being a lot more secure.

Gardner: Ashish, it seems obvious, having heard Jennifer describe it, DevOps and security hand-in-hand -- a whole greater than the sum of the parts. Are you seeing this more across various industries?

Stopping defects

Kuthiala: Absolutely. With the adoption of DevOps increasing more across enterprises, security is no different than any other quality-assurance (QA) testing that you do. You can't let a defect reach your customer base; and you cannot let a security flaw reach your customer base as well.

If you look at it from that perspective, and the teams are willing to work together, security is treated no differently than any other QA process. This boils down not just to the vulnerability of the software that you're releasing in the marketplace; there are also so many different regulations and compliance [needs] -- internal, external, your own company policies -- that you have to take a look at. You don't want to go faster and compromise security. So, it's an essential part of DevOps.

Cole: DevOps allows for continuous improvement, too. Security comes at the front of a traditional SDLC process, while in the old days, security came last. We found problems after they were in production or something had been compromised. Now, we're at the beginning of the process and we're actually getting to train the people that are at the beginning of the process on how and why to deliver things that are safe for our customers.

Gardner: Jennifer, why is security so important? Is this about your brand preservation? Is this about privacy and security of data? Is this about the ability for high performance to maintain its role in the organization? All the above? What did I miss? Why is this so important?

Cole: Depending on the lens that you are looking through, that answer may be different. For me, as a CISO, it's making sure that our data is secure and that our customers have trust in us to take care of their information. The rest of the C-suite, I am sure, feels the same, but they're also very focused on transformation to digital-first, making sure customers can work with us in any way that they want to and that their ServiceMaster experience is healthy.

Our leaders also want to ensure our customers return to do business with us and are happy in the process.  Our company helps customers in some of the most difficult times in their life, or helps them prevent a difficult time in the ownership of their home.

But for me and the rest of our leadership team, it's making sure that we're doing what's right. We're training our teams along the way to do what's right, to just make the overall ServiceMaster experience better and safe. As young people move into different companies, we want to make sure they have that foundation of thinking about security first -- and also the customer.
Learn More About DevOps
Solutions that Unify
Development and Operations
We tend to put IT people in a back room, and they never see the customer. This methodology allows IT to see what they could have released and correct it if it's wrong, and we get an opportunity to train for the future.
Through my lens, it’s about protecting our data and making sure our customers are getting service that doesn't have vulnerabilities in it and is safe.

Gardner: Now, Ashish, user experience is top of mind for organizations, particularly organizations that are customer focused like ServiceMaster. When we look at security and DevOps coming together, we can put in place the requirements to maintain that data, but it also means we can get at more data and use it more strategically, more tactically, for personalization and customization -- and at the same time, making sure that those customers are protected.

How important is user experience and data gathering now when it comes to QA and making applications as robust as they can be?

Million-dollar question

Kuthiala: It's a million-dollar question. I'll give you an example of a client I work with. I happen to use their app very, very frequently, and I happen to know the team that owns that app. They told me about 12 months ago that they had invested -- let’s just make up this number -- $1 million in improving the user experience. They asked me how I liked it. I said, "Your app is good. I only use 20 percent of the features in your app. I really don’t use the other 80 percent. It's not so useful to me."

That was an eye-opener to them, because the $1 million or so that they had invested in enriching the user experience -- if they had known exactly what I was doing as a user, what I use, what I did not use, where I had problems -- could have been used toward that 20 percent that I use. They could have made it better than anybody else in the marketplace and also gathered information on what it is that the market wants by monitoring the user experience with people like me.

It's not just the availability and health of the application; it’s the user experience. It's having empathy for the user, as an end-user. HPE, of course, makes a lot of these tools, like HPE AppPulse, which is very specifically designed to capture that mobile user experience and bring it back before you have a flood of calls and support people screaming at you as to why the application isn’t working.

Security is also one of those things. All is good until something goes wrong. You don't want to be in a situation when something has actually gone wrong and your brand is being dragged through mud in the press, your revenue starts to decline, and then you look at it. It’s one of those things that you can't look at after the fact.

Gardner: Jennifer, this strikes me as an under-appreciated force multiplier, that the better you maintain data integrity, security, and privacy, the more trust you are going to get to get more data about your customers that you can then apply back to a better experience for them. Is that something that you are banking on at ServiceMaster?
Cole: Absolutely. Trust is important, not only with our customers, but also our employees and leaders. We want people to feel like they're in a healthy environment, where they can give us feedback on that user experience. What I would say to what Ashish was saying is that DevOps actually gives us the ability to deliver what the business wants IT to deliver for our customers.

In the past 25 years, IT has decided what the customer would like to see. In this methodology, you're actually working with your business partners who understand their products and their customers, and they're telling you the features that need to be delivered. Then, you're able to pick the minimum viable product and deliver it first, so that you can capture that 20 percent of functionality.

Also, if you're wrapping security in front of that, that means security is not coming back to you later with the penetration test results and saying that you have all of these things to fix, which takes time away from delivering something new for our customers.

This methodology pays off, but the journey is hard. It’s tough because in most companies you have a legacy environment that you have to support. Then, you have this new application environment that you’re creating. There's a healthy balance that you have to find there, and it takes time. But we've seen quicker results and better revenue, our customers are happier, they're enjoying the ServiceMaster experience, instead of our individual brand families, and we've really embraced the methodology.

Gardner: Do you have any examples that you can recall where you've done development projects and you’ve been able to track that data around that particular application? What’s going on with the testing, and then how is that applied back to a DevOps benefit? Maybe you could just walk us through an example of where this has really worked well.

Digital first

Cole: About a year and a half ago, we started with one of our brands, American Home Shield, and looked at where the low hanging fruit -- or minimum viable product -- was in that brand for digital first. Let me describe the business a little bit. Our customers reach out to us, they purchase a policy for their house and we maintain appliances and such in their home, but it is a contractor-based company. We send out a contractor who is not a ServiceMaster associate.

We have to make that work and make our customer feel like they've had a seamless experience with American Home Shield. We had some opportunity in that brand for digital first. We went after it and drastically changed the way that our customers did business with us. Now, it's caught on like wildfire, and we're really trying to focus on one brand and one voice. This is a top-down decision which does help us move faster.

All seven of our brands are home services. We're in 75,000 homes a day and we needed to identify the customers of all the brands, so that we could customize the way that we do business with them. DevOps allows us to move faster into the market and deliver that.

Gardner: Ashish, there aren't that many security vendors that do DevOps, or DevOps vendors that do security. At HPE, how have you made advances in terms of how these two areas come together?

Kuthiala: The strength of HPE in helping its customers lies in the very fact that we have an end-to-end diverse portfolio. Jennifer talked about taking the security practices and not leaving them toward the end of the cycle, but moving them to the very beginning, which means that you have to get developers to start thinking like security experts and work with the security experts.

Given that we have a portfolio that spans the developers and the security teams, our best practices include building our own customer-facing software products that incorporate security practices, so that when developers are writing code, they can begin to see any immediate security threats as well as whether their code is compliant with any applicable policies or not. Even before code is checked in, the process runs the code through security checks and follows it all the way through the software development lifecycle.

These are security-focused feedback loops. At any point, if there is a problem, the changes are rejected and sent back or feedback is sent back to the developers immediately.

If it makes it through the cycle and a known vulnerability is found before release to production, we have tools such as App Defender that can plug in to protect the code in production until developers can fix it, allowing you to go faster but remain protected.

Cole: It blocks it from the customer until you can fix it.

Kuthiala: Jennifer, can you describe a little bit how you use some of these products?

Strategic partnership

Cole: Sure. We’ve had a great strategic partnership with HPE in this particular space. Application security caught on fire about two years ago at RSA, which is one of the main security conferences for anyone in our profession.

The topic of application security has not been a focus for CISOs, in my opinion. I was fortunate enough that I had a great team member who came back and said that we have to get on board with this. We had some conversations with HPE and ended up in a great strategic partnership. They've really held our hands and helped us get through the process. In turn, that helped make them better, as well as make us better, and that's what a strategic partnership should be about.

Now, we're watching things as they are developed. So, we're teaching the developer in real-time. Then, if something happens to get through, we have App Defender, which will actually contain it until we can fix it before it releases to our customer. If all of those defenses don’t work, we still do the penetration test along with many other controls that are in place. We also try to go back to just grassroots, sit down with the developers, and help them understand why they would want to develop differently next time.

Someone from security is in every one of the development scrum meetings and on all the product teams. We also participate in Big Room Planning. We're trying to move out of that overall governing role and into a peer-to-peer type role, helping each other learn, and explaining to them why we want them to do things.

Gardner: It seems to me that, having gone at this at the methodological level with those collaboration issues solved, bringing people into the scrum who are security minded, puts you in a position to be able to scale this. I imagine that more and more applications are going to be of a mobile nature, where there's going to be continuous development. We're also going to start perhaps using micro-services for development and ultimately Internet of Things (IoT) if you start measuring more and more things in your homes with your contractors.

Cole: We reach 75,000 homes a day. So, you can imagine that all of those things are going to play a big part in our future.

Gardner: Before we sign off, perhaps you have projections as to where you'd like to see things go. How can DevOps and security work better for you as a tag team?
Cole: For me, the next step for ServiceMaster specifically is making solid plans to migrate off of our legacy systems, so that we can truly focus on maturing DevOps and delivering for our customer in a safer, quicker way, and so we're not always having to balance this legacy environment and this new environment.
If we could accelerate that, I think we will deliver to the customer quicker and also more securely.

Gardner: Ashish, last word, what should people who are on the security side of the house be thinking about DevOps that they might not have appreciated?

Higher quality

Kuthiala: This whole approach of adopting DevOps to deliver your software faster to your customers with higher quality says it. DevOps is an opportunity for security teams to get deeply embedded in the mindset of the developers, the business planners, testers, production teams – essentially the whole software development lifecycle, which earlier they didn’t have the opportunity to do.

They would usually come in before code went to production and often would push back the production cycles by a few weeks because they had to do the right thing and ensure release of code that was secure. Now, they’re able to collaborate with and educate developers, sit down with them, tell them exactly what they need to design and therefore deliver secure code right from the design stage. It’s the opportunity to make this a lot better and more secure for their customers.

Cole: The key is security being a strategic partner with the business and the rest of IT, instead of just being a governing body.

Sponsor: Hewlett Packard Enterprise.

Google Search Engine Marketing Ready Reckoner | Simplilearn

Digital ad spending, including mobile searches, is expected to double to almost $59 billion by 2018. This is because more than 55% of customers have made a purchase by clicking on a PPC ad. PPC ads are great for large companies and small businesses alike, because you can use them to target visitors at all stages of the buying funnel. Intrigued? Yo...Read More.
Why Big Data Could Pose Big Problems for Healthcare Privacy

Advances in technology have made massive changes in health care. Dozens of life-saving innovations and treatment options help to extend the average lifespan and improve life for people who are ailing.

Health care wouldn’t be nearly as excellent without recent technology, and the demand for tech-savvy health professionals continues to grow.

Perhaps the most momentous of technological advances involves the use of big data in medical care. Thanks to a constant flow of information that entails both private and public information, health-care organizations are capable of better diagnoses and courses of treatment than ever.

But there’s a dark side to technology and big data in health care that isn’t often discussed: the alarming frequency and volume of privacy breaches. Though violations of HIPAA have occurred for years, researchers argue there are more now, thanks to the high volume of data involved.

It’s difficult to keep track of all that information without letting some slip out here and there. In addition, we have better ways of tracking privacy violations now. Technology provides new ways to record and track privacy breaches. We have a much better idea where information has been slipping through the cracks.

There’s also a risk of privacy disappearing altogether because of ...

Read More on Datafloq
Data-Driven Unemployment: Something to Worry About?

The rise of big data has led to a great deal of excitement in its potential. Part of the hype surrounds big data’s ability to maximize efficiency and improve company operations. This can be accomplished a number of ways, from micro-targeting customers for marketing campaigns to identifying fraud and waste in financial transactions. Many companies have caught on to the advantage that big data analytics brings, and that has resulted in a technological revolution of sorts. Big data has spread everywhere, creeping into nearly any type of job you can imagine, from retail to sports to governments. With this growing popularity has come an even bigger concern. With so much happening in big data analytics, could that eventually lead to more data-driven unemployment? After all, greater efficiency usually leads to the need for fewer jobs. As it turns out, the concerns surrounding data and computers taking over people’s jobs are legitimate.

The foundation for these concerns lies in several studies conducted by reputable organizations. One study from the World Economic Forum predicts that we’ll see a reduction of 5 million jobs over the next five years, all thanks to computers and robots utilizing big data more. As worrying as that prediction is, ...

Read More on Datafloq
How Data Scientists and Other Big Data Jobs Drive Economic Growth

Big data has done more than produce and collect massive amounts of data to help us solve problems in nearly every industry—it has also created a brand new field and new job opportunities for smart and ambitious people. Raw data isn’t useful unless it’s analyzed—and big data analysts are in high demand for the value they can bring to organizations. Because these positions are so new, it can be hard for businesses to find qualified candidates—and salaries are high. Talented data scientists are transforming the landscape of modern business, but their role is still evolving. What do data scientists do, and what is their role in economic growth?

What Do Data Scientists Do?

A data scientist is a person who has a knack for wading through large quantities of data collected by an organization, analyzing that data, and discovering patterns that are significant within the data’s context to shape business strategy. Data scientists today have the ability to code, crunch data, and to communicate with shareholders and executives—justifying their own position and providing business insights that can be implemented for greater efficiency or profit.

Data scientists aren’t the only key players in the big data field. Other support positions, such as big data ...

Read More on Datafloq
The Emperor’s New Clothes… Resisting the Hadoop Imperative

Did You Jump on the Hadoop Bandwagon? Did You Do it for the Right Reasons?

I recently saw a statistic that only 15% of Hadoop projects are in production. So that would mean that 85% of the Hadoop projects are not yet in production, right? Hmmm…could that be because Hadoop is not the right choice for all data problems?

When the only tool you have is a hammer, everything looks like a nail.

As a builder of data warehouses I have been confounded by the proliferation of Hadoop and its ancillary “big data” products into the Data Warehouse ecosystem. Don’t get me wrong, Hadoop does have its place for certain use cases. And there are environments with such high talent density and capacity that they can handle the complexity without blinking an eye, for example Workday or Netflix.

Unfortunately, many people don’t get that Hadoop comes with a price. A fairly steep price. Yes, the software may be all or mostly open source, so your licensing costs will be low, and it can run on commodity machines, but the complexity and the number of tools and skills your people will have to learn and integrate into your existing environment, or that you will have to pay to get (and integrate), can be EXPENSIVE. Especially if your business’s core focus does not require a lot of Java engineers.

In 20 years of database and data warehouse experience I have never needed to know Java. Not once. In 2006, I did a tutorial and learned the basics of Java. Then last year, I took a Hadoop class. Guess what: the primary programming language used with Hadoop is Java. So I did another whole set of online Java courses. Nothing wrong with Java; tons of systems and applications are written in Java. It’s just that not many data warehouse professionals have had to learn it to accomplish their jobs.

Hadoop Responsiveness and Complexity:

But wait, you might say, there are constructs and indeed a whole ecosystem built on top of Hadoop to let you use SQL. Sure, but at what cost and what complexity? And what about the time lag? The fundamental architecture of Hadoop and the Hadoop Distributed File System (HDFS) is divide and conquer. If I have petabytes of data that I need to analyze, this can frequently be a good strategy. But the user should not expect instant results. They submit their query to Hive, which translates the query to Java and submits it to the master machine, which sends the query out to all of the slave machines, each of which processes a portion of the data. The partial results are aggregated and sent back to the master, which sends them back to Hive, which sends them back to the user. Hadoop was created to solve the problem of handling big data for less money; it is not trying to be responsive, nor is it designed for responsiveness with small data sets.
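
To make that round trip concrete, here is a minimal Python sketch of the scatter-and-gather pattern described above. It is not Hive or Hadoop code: the function names (split_into_partitions, process_partition, run_query) and the sample records are hypothetical, and local threads stand in for the slave machines. The "query" is assumed to be a simple count of rows per key.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_workers):
    """Divide the records into roughly equal slices, one per worker."""
    return [records[i::num_workers] for i in range(num_workers)]


def process_partition(partition):
    """Worker step: count rows per key within a single partition."""
    return Counter(key for key, _value in partition)


def run_query(records, num_workers=4):
    """Master step: scatter partitions to workers, then merge partial results."""
    partitions = split_into_partitions(records, num_workers)
    # Threads are a stand-in for separate machines (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(process_partition, partitions))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # aggregate the partial results
    return dict(total)


if __name__ == "__main__":
    # Hypothetical (region, amount) rows, used only for illustration.
    sample = [("east", 10), ("west", 5), ("east", 7), ("north", 1), ("west", 2)]
    print(run_query(sample, num_workers=2))
```

Even in this toy version, every query pays for splitting, dispatching, and merging before any answer comes back. That fixed overhead is worth it for petabytes and hard to justify for small data sets.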

So how is it that so many projects are trying to move to Hadoop?

I have two theories:

1. The Bandwagon effect or simply “peer pressure”:

Back when Hadoop and HDFS first (justifiably) started getting positive publicity, someone senior, perhaps the CIO, read an article extolling the virtues of Hadoop. He was curious, so he asked his direct reports about it, say at the director level. The directors either had or had not yet heard about Hadoop but, not wanting to look bad, and perhaps interpreting curiosity as interest and intention, started researching and discussing it with their own direct reports. Before long the scuttlebutt around the water cooler was that the CIO wanted to do Hadoop, and if you wanted to look good, get promoted and get a bonus, you should be doing Hadoop. Or at least putting it in your budget and planning for it.

Then word got out that ABC company was doing Hadoop projects, and company XYZ’s CIO didn’t want to fall behind the competition, so he started talking to his direct reports and the whole thing repeated itself.

At both companies there may have been people who questioned this wisdom, who knew or suspected that Hadoop was not the right tool, but like the story of the emperor’s clothes, we don’t want to be the only one who doesn’t see it. We don’t want to look stupid.

And there is a corollary to the story of the emperor’s clothes: the kid who pointed out that the emperor was naked did not win the Emperor’s royal scholarship that year, or any year thereafter.

2. Hot new tool (buzz word) effect:

I did hear one other explanation for why it is that some companies might be doing Hadoop projects even though their use case might not fit the “big data” profile… they do Hadoop to keep their good people. The theory is that if you don’t let your people work with the hot new tools and add the buzz words to their resumes, they will go somewhere else that will let them do just that. I’ve been ruminating on that one for a while. At first I accepted it as perhaps a necessary inefficiency for keeping good people. After thinking about it for a while and discussing it with a variety of friends and acquaintances in the industry, I’ve come to the conclusion that there needs to be some other way to engage your people to design, architect and implement an appropriate efficient solution with a minimal amount of waste.

An example from the field: working with a client recently, I was the data architect on an outsourced project converting an end-user-created manual legacy process to an engineered Informatica and Oracle implementation that could be supported by the IT department. About halfway through the project, one of the corporate architects came by and asked, “Why aren’t you doing this in Hadoop?” The senior ETL architect and I looked at each other, then looked at him, a little dumbfounded… Um, because we are only processing 20 gigabytes of data once a month?

An Alternative to Hadoop (for many use cases):

Until recently, I did not have a good alternative for this complexity-inflating, budget-killing, risky tendency to try to put everything in Hadoop. Then I attended the Denver Agile Analytics meetup, where the presenter that night was Kent Graziano, the senior evangelist for Snowflake Computing. His presentation was about his experience and some techniques used to do agile data warehouse development.

After his agile BI presentation he did a separate presentation on Snowflake Computing.

It rocked my world!

Snowflake is a new elastic data warehouse database engine built from the ground up for doing data warehouses on Amazon Web Services (AWS).

Kent referenced his blog post, 10 Cool Things I like About Snowflake, and went through some of the top 10 things he liked about Snowflake.

In my next blog post, titled “Snowflake – Will it give Hadoop a run for the money?”, I will tell you why I am so excited about this product and its many useful features. In a nutshell, it reduces the complexity by at least an order of magnitude and allows for the delivery of data warehouses at a whole new pace. At a recent Cloud Analytics City Tour, a Snowflake customer gave a presentation on how they had deployed a fully functional data warehouse from scratch in 90 days. With traditional data warehouse tools and vendors it can take more time than that just to negotiate the vendor contracts.

Copyright © 2016 BBBT - All Rights Reserved