Oracle 18c Goes for Database Automation in the Cloud

In what was probably the most important announcement made at the 2017 edition of Oracle’s OpenWorld conference, the company announced the release of version 18c of its world-renowned database management system, whose key promise is to be fully automated.

Oracle’s founder and CTO Larry Ellison announced the autonomous database, which includes database and cyber-security automation because, according to Mr. Ellison, “human processes stink”.

According to Oracle, the autonomous database will practically eliminate the human intervention associated with database management activities such as tuning, patching, updating and maintenance, through three major capabilities:

  • Self-Driving: Provides continuous adaptive performance tuning based on machine learning. Automatically upgrades and patches itself while running. Automatically applies security updates while running to protect against cyber-attacks.
  • Self-Scaling: Instantly resizes compute and storage without downtime. Cost savings are multiplied because Oracle Autonomous Database Cloud consumes less compute and storage than Amazon, with lower manual administration costs.
  • Self-Repairing: Provides automated protection from downtime. The SLA guarantees 99.995 percent reliability and availability, which reduces costly planned and unplanned downtime to less than 30 minutes per year.
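
As a quick sanity check on the Self-Repairing figure, the downtime budget implied by a 99.995 percent availability SLA can be worked out directly. This is only a back-of-the-envelope sketch: it assumes a 365-day year, and the function name is made up for illustration rather than taken from any Oracle tool.

```python
# Yearly downtime budget implied by an availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def max_downtime_minutes(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for a given SLA."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


budget = max_downtime_minutes(99.995)
print(f"{budget:.2f} minutes per year")  # roughly 26.28, under the quoted 30
```

So the "less than 30 minutes per year" claim is consistent with the stated percentage.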

To achieve this, the new autonomous database applies machine learning techniques to deliver self-driving, self-tuning, self-recovering and self-scaling management capabilities without human intervention, aiming to streamline operations and provide more efficient consumption of resources as well as higher security and reliability.

But first... the Data Warehouse

Oracle’s autonomous database service can handle different workload types, including transactional, non-transactional, mixed, graph and IoT workloads. While the automated OLTP version is scheduled to be available by June 2018, Oracle’s first autonomous database service will be directed at data warehouse workloads and is planned to be available in 2017.

Much like all its services, the design of Oracle’s Autonomous Database Cloud Service for Data Warehouse relies on machine learning to enable automatic tuning and performance optimization. By using artificial intelligence and machine learning, Oracle aims to achieve autonomous control, offering reliable, high-performance and highly elastic data management services and enabling deployments that can be done in seconds.
According to Oracle, some features to be offered by the new service include capabilities to:
  • Execute high-performance queries and concurrent workloads with optimized query performance and with pre-configured resource profiles for different types of users
  • Deploy highly elastic pre-configured compute and storage architectures to instantaneously scale up or down, avoiding overpay for fixed blocks of resources
  • Integrate DWCS, through Oracle SQL, with all business analytics tools that support Oracle Database
  • Make use of its built-in, web-based notebooks based on Apache Zeppelin
  • Deploy a self-driving, fully automated database that tunes, patches and upgrades itself while the system is running
  • Take advantage of its database migration utility and dedicated cloud-ready migration tools for easy migration from Amazon Redshift, SQL Server and other databases
  • Perform cloud-based scalable data-loading from Oracle Object Storage, AWS S3, or on-premises
  • Deploy under an enterprise-grade security scheme in which data is encrypted by default in the cloud, both in transit and at rest
The new Oracle autonomous database cloud service for data warehouse aims to eliminate manual configuration errors and ensure continuous reliability and self-correction. According to Oracle, it also includes unlimited concurrent access and an advanced clustering technology to enable organizations to scale without any downtime.

With this service, Oracle is expanding its data warehouse software portfolio across both on-premises and cloud platforms, aiming to reach a greater number of organizations, each with different data warehousing needs and complexities. The autonomous database cloud service joins the existing data warehouse services available within Oracle Exadata and Exadata Cloud.

The Rise of the Automated Database?

The ideal of full database automation is not new, and many, if not all, software vendors have made important efforts to automate different aspects of the database administration cycle. Examples include Teradata and Attunity, for automating data ingestion and data warehousing, and third-party software providers like BMC with BladeLogic Database Automation. Yet until now, full automation seemed to be an impossible task.

One main reason is that database automation involves not just common, repetitive configuration tasks, such as initial schema and security configuration, but also much more complex tasks such as database tuning and performance monitoring, which require the system to learn and adapt to changing conditions.

The evolution of machine learning, artificial intelligence and cognitive computing technologies is certainly making these automation efforts possible and, of course, Oracle deserves significant credit for embracing these technologies and taking a step further by aiming to achieve full database automation.

As we should expect, it will not take long for other software providers to join the race and the ranks of vendors offering fully automated database solutions. So, as a cautionary message, it will be critical, in my view, to start by making comprehensive assessments of these solutions’ capabilities and accuracy before rushing to push the autopilot button and get rid of your DBAs.

You might realize it will take some time before you can lower your IT footprint.

Comments? Let me know your thoughts

Google Cloud Dataprep: Spreadsheet-Style Data Wrangling Powered by Google Cloud Dataflow

I’ve been meaning to mention this new product on the blog for some time now, as it’s one of the tools I use almost every day at work and back home on my personal data projects. Google Cloud Dataprep is Google’s new serverless data preparation tool, a new category of ETL tool aimed at analysts and business users looking to load and prepare their own datasets for analysis, rather than developers looking to create industrial-strength ETL routines into corporate data warehouses.

In my case I’ve been using Cloud Dataprep to take the raw event data I’ve been landing into BigQuery from my various IoT, social media and wearable devices and using it to add descriptors, standardise and then join various feeds together so it’s easier to query using tools such as Google Data Studio and Looker. Based on technology originally from Trifacta and then extended by Google to add BigQuery and GCS as source and target options, it uses Google Cloud Dataflow as the underlying transformation engine and bases pricing on a multiple of the underlying Cloud Dataflow job cost (currently 1.16x, though that could change once it comes out of beta), making the cost of a typical data prep job just a few pence, at least for my data volumes.

To take an example, I have a set of Google BigQuery tables that receive data via streaming BigQuery inserts sent over by a FluentD server running on a Google Compute Engine VM. The screenshot below shows one of the tables with data that’s come in from the Fitbit health tracker I use, which sends over daily summary numbers each morning for metrics such as active and inactive minutes, calories burnt and steps recorded.

Google Cloud Dataprep presents the data from each table or file source using a spreadsheet-like interface, allowing you to visually build up a set of sequential steps like the ones below that adjust a time recording in AM/PM format to 24hr clock, and forward-fill missing values for weight readings that were skipped between certain days.
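
For readers without the screenshots to hand, here is roughly what those two steps do, written as a plain Python sketch. It is not Dataprep recipe syntax, and the sample rows and field names (`bedtime`, `weight_kg`) are invented for illustration.

```python
from datetime import datetime

# Hypothetical daily health rows; None marks a skipped weight reading.
rows = [
    {"date": "2017-10-01", "bedtime": "11:45 PM", "weight_kg": 82.4},
    {"date": "2017-10-02", "bedtime": "12:10 AM", "weight_kg": None},
    {"date": "2017-10-03", "bedtime": "10:30 PM", "weight_kg": None},
    {"date": "2017-10-04", "bedtime": "11:05 PM", "weight_kg": 81.9},
]


def to_24h(value):
    """Convert a time such as '11:45 PM' to 24hr clock ('23:45')."""
    return datetime.strptime(value, "%I:%M %p").strftime("%H:%M")


def forward_fill(records, field):
    """Replace missing values with the most recent earlier value."""
    last_seen = None
    filled = []
    for record in records:
        record = dict(record)  # copy so the input rows stay untouched
        if record[field] is None:
            record[field] = last_seen
        else:
            last_seen = record[field]
        filled.append(record)
    return filled


# Step 1: standardise the time format. Step 2: fill the skipped readings.
prepared = [dict(row, bedtime=to_24h(row["bedtime"])) for row in rows]
prepared = forward_fill(prepared, "weight_kg")

for row in prepared:
    print(row)
```

A missing value at the very start of the data stays missing, since there is no earlier reading to carry forward.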

Something I’d like to have seen but isn’t in the product yet is any support for Google’s Natural Language Processing and other Cloud APIs, key value-adds within Google’s GCP platform as I talked about in an earlier blog post, but presumably they’ll begin to get added into the product as Google extend the core Trifacta codebase to leverage more Google Cloud-specific features. That said, what was inherited from Trifacta is a pretty comprehensive set of regular, windowing and data wrangling transformations pulled together into a script, or “recipe”, as shown in the screenshot below.

I can also use Cloud Dataprep to pivot and aggregate data into summary tables, in the example below taking another BigQuery data source containing individual tweets and other social media interactions stored in the same table column and then pivoting it to give me one column per interaction type and counts of those interactions per day.
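
The pivot described above can be sketched in a few lines of plain Python. Again, this is an illustration of the idea rather than Dataprep's own recipe language, and the event tuples and interaction type names are made up.

```python
from collections import Counter, defaultdict

# Hypothetical (day, interaction_type) events, all kinds in one column.
events = [
    ("2017-10-01", "tweet"),
    ("2017-10-01", "tweet"),
    ("2017-10-01", "like"),
    ("2017-10-02", "retweet"),
    ("2017-10-02", "tweet"),
]


def pivot_counts(records):
    """Return one row per day with one count column per interaction type."""
    counts = defaultdict(Counter)
    for day, kind in records:
        counts[day][kind] += 1
    columns = sorted({kind for _, kind in records})
    # A Counter returns 0 for kinds that never occurred on a given day.
    return [
        {"day": day, **{col: counts[day][col] for col in columns}}
        for day in sorted(counts)
    ]


summary = pivot_counts(events)
for row in summary:
    print(row)
```

Every day gets the same set of columns, with zeros where an interaction type did not occur, which is what makes the result easy to chart or join later.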

Then, you can string together preparation steps into a sequence to produce your final output BigQuery table or file, as I’ve done in the screenshot below to produce my final prepared and combined health data table.

Then I finally go back to the BigQuery Web UI and check out my freshly-prepared table that I can thereafter keep up-to-date with new data by scheduling that same Google Cloud Dataprep recipe to run every night, appending newly-arrived data to that same BigQuery table.

Google Cloud Dataprep is currently in open beta and if you want to give it a try too, it’s accessible from the Google Cloud Platform console within the Big Data group of products.

Google Cloud Dataprep: Spreadsheet-Style Data Wrangling Powered by Google Cloud Dataflow was originally published in Mark Rittman’s Personal Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Why We Should Use Blockchain and Artificial Intelligence to Enable the Imagination Age

When the web was developed over 25 years ago, the technologies in place significantly lowered the cost of building a global company. Thanks to the internet, it has become possible to reach a large part of the global population simply from behind your computer. Those companies who first understood the power of the web, and managed to execute their vision correctly, are now the leading global monopolies we are so familiar with: we use Google for finding information, Facebook or WeChat for social activities, Amazon to shop and Apple for our hardware, etc.

For years, these companies have understood that data is a goldmine. Since their beginning, these companies have rigorously been collecting and storing data, resulting in them not only becoming powerful monopolies but also contributing to a centralisation of the world wide web. This centralisation is in stark contrast to Sir Tim Berners-Lee’s vision of a decentralised web in which everybody participates and has full control over their data and the content that they create.

Content Controlled by Tech Giants

As a result, these companies have become exceptionally influential. They have access to large amounts of data of their customers, which they use and misuse to track (potential) customers around the ...

Read More on Datafloq
U.S. Records in the Gulf War (1991)

Of course, there were problems with U.S. Army record keeping in the Gulf War (1991). There were serious problems with U.S. Army record keeping in the Vietnam War (1965-1973), so, not surprisingly, the problem had not been corrected and the same problems existed 20 years later. In the Vietnam War, the 82nd Airborne Division pretty much threw away most of their records. According to Don Hakenson, Director, Center of Unit Records Research, Records Management and Declassification Agency, 86% or 87% of the battalion daily journals from the Gulf War were not preserved (see War by Numbers, page 146).

This became a big problem when “Gulf War Syndrome” became an issue. People became suspicious that U.S. soldiers had been exposed to hazardous materials or chemical weapons. Yet, when the Veterans Administration and others tried to figure out where the units were at the time, they found that the records no longer existed for many of these units. In many cases, they could not determine where the unit or the people were during operations. Many of the records had simply been thrown out.

The Gulf War Syndrome was not a small issue. It has been estimated that 250,000 U.S. veterans were afflicted. It was a case where record keeping briefly became a major issue. Wikipedia article:

Since the 1960s, there have been serious gaps in U.S. record keeping. There still were in 1998 when we conducted a survey of the subject for the U.S. Army. We have conducted no other surveys since then, but gather that corrective action has been undertaken.

U.S. Army Record Keeping

Command and Combat Effectiveness: The Case of the British 51st Highland Division

Soldiers of the British 51st Highland Division take cover in bocage in Normandy, 1944. [Daily Record (UK)]

While Trevor Dupuy’s concept of combat effectiveness has been considered controversial by some, he was hardly the only one to observe that throughout history, some military forces have fought more successfully on the battlefield than others. While the sources of victory and defeat in battle remain a fertile, yet understudied topic, there is a growing literature on the topic of military effectiveness in the fields of strategic and security studies.

Anthony King, a professor in War Studies at the University of Warwick, has published an outstanding article in the most recent edition of British Journal of Military History, “Why did 51st Highland Division Fail? A case-study in command and combat effectiveness.” In it, he examined military command and combat effectiveness through the experience of the British 51st Highland Division in the 1944 Normandy Campaign. Most usefully, King developed a definition of military command that clarifies its relationship to combat effectiveness: “The function of a commander is to maximise combat power by defining achievable missions and, then, orchestrating subordinates into a cohesive whole committed to mission accomplishment.”

Defining Military Command

In order to analyze the relationship between command and combat effectiveness, King sought to “define the concept of command and to specify its relationship to management and leadership.” The construct he developed drew upon the work of Peter Drucker, an Austrian-born American business consultant and writer who is considered by many to be “the founder of modern management.” From Drucker, King distilled a definition of the function and process of military command: “command always consists of three elements: mission definition, mission management and mission motivation.”

As King explained, “When command is understood in this way, its connection to combat effectiveness begins to become clear.”

[C]ommand is an institutional solution to an organizational problem; it generates cohesion in a formation. Specifically, by uniting decision-making authority in one person and one role, a large military force is able to unite subordinate units, whose troops are not co-present with each other and who, in most cases, do not know each other. Crucially, the combat effectiveness of a formation, as a formation, is substantially dependent upon the ability of its commander to synchronise its disparate efforts in order to generate collective effects. Skillful command has a galvanising influence on a military force; by orchestrating the activities of subordinate units and motivating troops, command is able to create a level of combat power, which supervenes the capabilities of each of the parts. A well-commanded force has properties, which exceed those of its constituent units, fighting alone.

It is through the orchestration, synchronization, and motivation of effort, King concluded, that “command and combat effectiveness are immediately connected. Command fuses a formation together and increases its determination to fulfil its missions.”

Assessing the Combat Effectiveness of the 51st Division

The rest of King’s article is a detailed assessment of the combat effectiveness of the 51st Highland Division in Normandy in June and July 1944 using this military command construct. Observers at the time noted a decline in the division’s combat performance, which had been graded quite highly in North Africa and Sicily. The one obvious difference was the replacement of Major General Douglas Wimberley with Major General Charles Bullen-Smith in August 1943. After concluding that the 51st Division was no longer battleworthy, the commander of the British 21st Army Group, General Bernard Montgomery, personally relieved Bullen-Smith in late July 1944.

In reviewing Bullen-Smith’s performance, King concluded that

Although a number of factors contributed to the struggles of the Highland Division in Normandy, there is little doubt that the shortcomings of its commander, Major General Charles Bullen-Smith, were the critical factor. Charles Bullen-Smith failed to fulfill the three essential functions required of a commander… Bullen-Smith’s inadequacies are highly suggestive of a direct relationship between command and combat effectiveness; they demonstrate how command can augment or undermine combat performance.

King’s approach to military studies once again demonstrates the relevance of multi-disciplinary analysis based on solid historical research. His military command model should prove to be a very useful tool for analyzing the elements of combat effectiveness and assessing combat power. Along with Dr. Jonathan Fennell’s work on measuring morale, among others, it appears that good progress is being made on the study of human factors in combat and military operations, at least in the British academic community (even if Tom Ricks thinks otherwise).

Is Religion The Next Frontier For AI?

AI engineer Anthony Levandowski, who is notoriously at the centre of a lawsuit between Uber and Waymo, has filed the paperwork for a new Artificial Intelligence-based religion, Way of the Future or WOTF for short. This AI religion’s aim is to ‘contribute to the betterment of society’ through ‘understanding and worship of the (AI) Godhead’, according to the proposal. The WOTF is set to consist of Levandowski as a Dean, and a further small council of advisors.

It’s no surprise that in our increasingly secular society, we see a rise in new religious movements. Over millions of years, our planet and different civilisations on it have worshipped many different gods and deities. Science, discovery and new technologies have influenced religion in the past, so is it really all that far-fetched to think that our digital age should birth an AI god?  

According to WIRED, Levandowski’s reasoning behind the implementation of an AI godhead is down to transition. He explains that he believes humans rule the planet because they’re smarter than other species on it, so once we create something smarter than ourselves, there would naturally be a transition of power.

Arguably, you can also see the beginnings of an AI allegiance of ...

Read More on Datafloq
How Machine Learning Boosts Personalization in Travel

She opens the browser, puts the cursor in the search bar, types “cheap flights from Boston to London,” and up pops the first ten links from Google’s results page. After some surfing, she lands on Skyscanner filtering the flights by date and cost and selects the cheapest deal from Norwegian Airlines. As Skyscanner aggregates the offers from providers, she again bulk opens multiple links leading to deals by online travel agencies. She glances over the pages, never staying more than 30 seconds on any of them. At one of the online travel agencies, she opens flight details, hastily closes a pop-up window without reading its contents, and continues searching. In two days, she returns to the same agency closing the top deal from the search feed.

The behaviour of this imaginary user is quite common. The data scientists from AltexSoft, a travel tech provider, call this type of ticket surfer an “economy buyer”. The economy buyer accounts for about a half of airfare searches. They look for the most affordable deals, don’t spend too much time exploring flight details, don’t care about long layovers or seating.

Back in 2012, Amadeus published a study called Who Travels with You. The study outlined ...

Read More on Datafloq
How Big Data is Helping the Dark Web Industry?

The dark web is a secret platform accessed by criminals for trading, as it offers anonymity for financial transactions. It is one of three subsets of the internet—in all, there’s the surface web, the deep web and the dark web. Most of us normally use the surface web, which only makes up approximately three to four percent of all information available on the internet.

The rest of the content can be placed under the deep web category, which is filled with information that isn’t indexed by conventional search engines like Google. That content pertains to databases, encyclopedic catalogues, medical records, government resources, etc. A comprehensive table of deep web links has been posted on Dark Web News just to give you an idea of how extensive the content is.

You might consider that you can always access the entire internet but in reality, what you are accessing on a daily basis is only the tip of the iceberg. Search engines are only capable of surfing the surface web, which is only a tiny portion of the entire internet.

Big Data

Now, the dark web is a small division of the internet where drug trafficking, child pornography and other illegal trades and activities take place. These ...

Read More on Datafloq
Gulf War Records

We, of course, have never examined the other side’s records for the Gulf War (1990-91). We have included in our various combat databases over 20 division and battalion-level engagements from the Gulf War. These were all assembled for us by C. Curtis Johnson, former VP of HERO and author of something like eight books.

At the time he was working for a project that had collected the U.S. Gulf War records, so he had access to the U.S. records of the various engagements and was able to assemble the U.S. side. He had to assemble the estimates of Iraqi strength and losses based upon the U.S. intelligence records and a little educated guesswork. There are real problems in using intelligence estimates to determine the other side’s strength and losses. I can point out a number of cases where loss estimates were off by an order of magnitude (I discuss this in depth in my Kursk book). Still, as we had overrun most of the units involved, taking their records at the time, it appears that these were reasonable and certainly the best estimates that could be made at the time. Because the records Curt was working with were classified, and our database is unclassified, he could not leave a record of how he developed these estimates. There were, of course, also problems with the U.S. records, but that is the subject of another post.

Now, our engagements could be improved upon by a careful examination of the captured Iraqi records, which is why this caught our attention:

The Sad Story Of The Captured Iraqi DESERT STORM Documents

Needless to say, this means that for all practical purposes, the 20+ engagements in our database can never be cross-checked or improved upon. It is the best that can be done.

A Guide to Data Analysis with Python Libraries

Why Python

Python is a general-purpose language which can be fine-tuned to serve data analysis purposes but, unlike R and MATLAB, is not limited to those only. It also offers important advantages such as speed, performance, and scalability. An expert Python developer from Iflexion also mentioned flexibility and capacity, making this a great tool to handle Big Data projects. The extended community built around it adds to rating Python as a top choice, since you can always have someone to ask for help.

Where to Start

Python is so widespread that it can be overwhelming to choose which course or framework to learn first. There are a lot of free and paid options; you just need to find a teaching style that speaks to you, no matter if you choose a more academic path or go for a gamified approach. Just make sure you understand the basics of syntax and logic. Don’t spend an overwhelming amount of time delving into all the nuts and bolts, especially if you just want to use it for data analysis. Get up to speed and then head out to learn the necessary libraries.

Python Basics

Since this is a programming language, you will need to get a good grasp of data ...

Read More on Datafloq
How Can Big Data Help Cancer Research?

A ray of hope is shining on cancer research, and it is coming from Big Data. This umbrella concept, more common in the technology sector, is making its way into other industries, including healthcare, through its applications. The idea behind Big Data is that by analyzing vast volumes of information like medical records, lab tests, and even DNA information, it is possible to detect patterns. To be of any use, these patterns should be unique to each type of cancer, like a signature. So far, doctors look at symptoms and work to identify the cause; now they are aiming to get a tissue sample, analyze it and look for signatures by comparing results with an existing, ever-growing database.

Why is mesothelioma frightening?

Cancer, in general, is one of the most chilling words. It is associated with painful treatments, a significant decrease in the quality of life and even a gloomy perspective on surviving. To make matters worse, mesothelioma is considered a particularly aggressive type of cancer. The tumor affects the lining of internal organs, for example, the pleura that surrounds the lungs. It is believed that the cause is exposure to asbestos.

The survival rates of mesothelioma, although improving in the ...

Read More on Datafloq
Agile BI Building Blocks 2.0

Quite a while ago, I published a blog post about my Agile BI Maturity Model. In this post I’d like to show you the current state of the model.

First of all I renamed the model to “Agile BI Building Blocks”. I don’t like the “maturity” term anymore as it somehow values the way you are doing things. Building blocks are more neutral. I simply want to show what aspects you need to take into consideration to introduce Agile BI in a sustainable way. The following picture shows the current model:

What changed compared to version 1? Let me go through the individual building blocks:

  1. Agile Basics & Mindset. No change – still very important: You need to start with agile basics and the agile mindset. A good starting point is still the Agile Manifesto or the newer Modern Agile.
  2. Envision Cycle & Inception Phase. No change – this is about the importance of the Inception Phase, especially for BI projects. Simply don’t jump straight into development but do some minimal upfront work like setting up the infrastructure or creating a high-level release scope and securing funding.
  3. BI specific User Stories. Changed the term from simply User Stories to “BI specific User Stories”. Unfortunately I didn’t manage to write a blog post about this yet, but in my recent workshop materials you’ll find some ideas around it.
  4. No / Relative Estimating. Changed from Relative Estimating (which is mainly about Story Points) to include also No Estimating which is basically about the #NoEstimates movement. I held a recent presentation at TDWI Schweiz 2017 about this topic (in German only for now)
  5. (Self Organizing) Teams. Changed and put the term “Self Organizing” in brackets as this building block is about teams and team roles in general.
  6. Workspace & Co-Location. Added “Workspace” as this building block is not only about co-location (though this is an important aspect of the agile workspace in general)
  7. Agile Contracting. No change, in my recent presentation at TDWI Schweiz 2017 I talked about Agile Contracting including giving an overview of the idea of the “Agiler Festpreis”, more details you can find in the (German) book here.
  8. New: Data Modeling & Metadata Mgt. Data modeling tool support, and the question of how to deal with metadata, are crucial, and not only for AgileBI. In combination with Data Warehouse Automation these elements become even more important in the context of AgileBI.
  9. New: Data Warehouse Automation. The more I work with Data Warehouse Automation tools like WhereScape, the more I wonder how we could work previously without it. These kinds of tools are an important building block on your journey of becoming a more agile BI environment. You can get a glimpse at these tools in my recent TDWI / BI-Spektrum article (again, in German only unfortunately)
  10. Version Control. No change here – still a pity that version control and integration into common tools like Git are not standard in the BI world.
  11. Test Automation. No change here, a very important aspect. Glad to see finally some DWH specific tools emerging like BiGeval.
  12. Lean & Fast processes. No change here – this block refers to introducing an iterative-incremental process. There are various kinds of process frameworks available. I personally favor Disciplined Agile providing you with a goal-centric approach and a choice of different delivery lifecycles.
  13. Identify & Apply Design Patterns. No change except that I removed “Development Standards” as a separate building block as these are often tool- or technology-specific forms of a given design pattern. Common design patterns in the BI world range from requirements modeling patterns (e.g. the BEAM method by Lawrence Corr as well as the IBIREF framework) to data modeling patterns like Data Vault or Dimensional Modeling and design patterns for data visualization like the IBCS standards.
  14. New: Basic Refactoring. Refactoring is a crucial skill to become more agile and continuously improve already existing artefacts. Basic refactoring means that you are able to do a refactoring within the same technology or tool type, e.g. within the database using database refactoring patterns.
  15. New: Additive Iterative Data Modeling. At a certain step in your journey to AgileBI you can’t draw the full data model upfront but want to design the model more iteratively. A first step into that direction is the additive way, that means you typically enhance your data model iteration by iteration, but you model in a way that the existing model isn’t changed much. A good resource around agile / iterative modeling can be found here.
  16. Test Driven Development / Design (TDD). No change here. On the data warehouse layer tools like BiGeval simplify the application of this approach tremendously. There are also materials availble online to learn more about TDD in the database context.
  17. Sandbox Development Infrastructure. No change, but also not much progress since version 1.0. Most BI systems I know still work with a three or four system landscape. No way that every developer has its own full stack.
  18. Datalab Sandboxes. No change. The idea here is that (power) business users can get their own, temporary data warehouse copy to run their own analysis and add and integrate their own data. I see this currently only in the data science context, where a data scientist uses such a playground to experiment with data of various kinds.
  19. Scriptable BI/DWH toolset. No change. Still a very important aspect. If your journey to AgileBI takes you to this third stage of “Agile Infrastructure & Patterns” which includes topics like individual developer environments and subsequently Continuous Integration, a scriptable BI/DWH toolset is an important precondition. Otherwise automation will become pretty difficult.
  20. Continuous Integration. No change. Still a vision for me – will definitely need some more time to invest into this in the BI context.
  21. Push Button Deployments. No change. Data Warehouse Automation tools (cf. building block 9) can help with this a lot already. You still need a lot of manual automation coding to link them with test automation or to coordinate a deployment across multiple technology and tool layers.
  22. New: Multilayer Refactoring. In contrast to Basic Refactoring (cf. building block 14) this is the vision that you can refactor your artefacts across multiple technologies and tools. Clearly a vision (and not yet reality) for me…
  23. New:  Heavy Iterative Data Modeling. In contrast to Additive Iterative Data Modeling (cf. building block 15) this is about the vision that you can constantly evolve your data model incl. existing aspects of it. Having the multilayer refactoring capabilities is an important precondition to achieve this.
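
To make building blocks 14 and 15 a bit more tangible, here is a minimal sketch of an additive model change inside a single database. It uses Python's built-in sqlite3 module; the table and column names (customer, customer_order, segment) and the helper functions are hypothetical and only serve as an illustration, they are not taken from any particular tool mentioned above. The point is that the iteration only adds structures, so a query written against the earlier model keeps returning the same result.

```python
import sqlite3


def create_initial_model(conn):
    # Iteration 1: a deliberately small starting model (hypothetical names).
    conn.execute(
        "CREATE TABLE customer ("
        "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Alice')")


def existing_report(conn):
    # A query written against the iteration-1 model.
    return conn.execute(
        "SELECT customer_id, name FROM customer ORDER BY customer_id"
    ).fetchall()


def add_next_iteration(conn):
    # Iteration 2, additive only: a new nullable column and a new table.
    # Nothing that already exists is renamed, retyped or dropped.
    conn.execute("ALTER TABLE customer ADD COLUMN segment TEXT")
    conn.execute(
        "CREATE TABLE customer_order ("
        "order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customer(customer_id), "
        "amount REAL NOT NULL)"
    )


conn = sqlite3.connect(":memory:")
create_initial_model(conn)
before = existing_report(conn)
add_next_iteration(conn)
after = existing_report(conn)
print("existing report unchanged:", before == after)
```

Heavy iterative modeling (building block 23) would be the opposite case: renaming or restructuring the customer table itself, which is exactly where existing queries and downstream layers start to break without refactoring support.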

Looking at my own journey towards more agility in BI and data warehouse environments, I’m in the midst of the second phase about basic infrastructure and basic patterns & standards. Looking forward to an exciting year 2018. Maybe the jump over the chasm will work 😉

What about your journey? Where are you now? Do you have experience with the building blocks in the third phase about agile infrastructure & patterns? Let me know in the comments section!

TDI Friday Read: Naval Air Power

A rare photograph of the current Russian Navy aircraft carrier Admiral Kuznetsov (ex-Riga, ex-Leonid Brezhnev, ex-Tbilisi) alongside her unfinished sister, the now Chinese PLAN Liaoning (former Ukrainian Navy Varyag) in the Mykolaiv shipyards, Ukraine. [Pavel Nenashev/Pinterest]

Today’s edition of TDI Friday Read is a round-up of blog posts addressing various aspects of naval air power. The first set addresses Russian and Chinese aircraft carriers and recent carrier operations.

The Admiral Kuznetsov Adventure

Lives Of The Russian (And Ex-Russian) Aircraft Carriers

Chinese Carriers

Chinese Carriers II

The last pair of posts discuss aspects of future U.S. naval air power and the F-35.

U.S. Armed Forces Vision For Future Air Warfare

The U.S. Navy and U.S. Air Force Debate Future Air Superiority

IBM’s Integrated Analytics System Joins the Ranks of Full Powered Analytics Platforms

As we get deeper into an era of new software platforms, both big players and newcomers are industriously working to reshape or launch their proposed new-generation analytics platforms, especially aiming to appeal to the growing community of new information workers or “data scientists” (a community always eager to attain the best possible platform to “crunch the numbers”). Examples include Teradata with its new analytics platform and Cloudera with its Data Science Workbench.

So now it is IBM’s turn: the company recently unveiled its Integrated Analytics System. The new offering represents the company’s unified data system, aimed at providing organizations with an easy yet sophisticated platform for doing data science on data from on-premises, private, public or hybrid cloud environments.

The new offering coming from the “Big Blue” company is set to incorporate a myriad of data science tools and functionality features as well as the proper data management processes for developing and deploying advanced analytics models in-place.

The new offering aims to allow data scientists to easily perform all data science tasks, including moving workloads to the public cloud to begin automating their businesses with machine learning easily and rapidly.

The system is built on the IBM common SQL engine so that users can use a common language and engine across both hosted and cloud-based databases, allowing them to move and query data across multiple data stores, including Db2 Warehouse on Cloud or Hortonworks Data Platform.

According to IBM, the product team has developed the Integrated Analytics System to work seamlessly with IBM’s Data Science Experience, Apache Spark and Db2 Warehouse on Cloud, where:

  • The Data Science Experience is set to provide the necessary critical data science tools and a collaborative work space
  • Apache Spark is set to enable in-memory data processing to speed up analytic applications
  • Db2 Warehouse on Cloud is set to enable deployment and management of cloud-based Db2 Warehouse on Cloud clusters within a single management framework

All of this is aimed at allowing data scientists to create new analytic models that developers can then use to develop and deploy intelligent applications easily and rapidly.

According to Vitaly Tsivin, Executive Vice President at AMC Networks:

“The combination of high performance and advanced analytics – from the Data Science Experience to the open Spark platform – gives our business analysts the ability to conduct intense data investigations with ease and speed. The Integrated Analytics System is positioned as an integral component of an enterprise data architecture solution, connecting IBM Netezza Data Warehouse and IBM PureData System for Analytics, cloud-based Db2 Warehouse on Cloud clusters, and other data sources.”

The Integrated Analytics System is built with the IBM common SQL engine to enable users to seamlessly integrate the unit with cloud-based warehouse solutions and to give them the option to easily move workloads to public or private cloud environments with Spark clusters, according to their specific requirements.

Some of its capabilities include:

  • Asymmetric massively parallel processing (AMPP) with IBM Power technology and flash memory storage hardware
  • A design that builds on the IBM PureData System for Analytics and the previous IBM Netezza data warehouse offerings
  • Support for a variety of data types and data services, from the Watson Data Platform and IBM Db2 Warehouse on Cloud to Hadoop and IBM BigSQL.

Also, the new Integrated Analytics System incorporates hybrid transactional analytical processing (HTAP), which can run predictive analytics over transactional and historical data on the same database with faster response times.

Additionally, the Integrated Analytics System is designed to provide built-in data virtualization and compatibility with the rest of the IBM data management product stack including Netezza, Db2, and IBM PureData System for Analytics.

According to IBM, the company plans to incorporate support for HTAP within the IBM Db2 Analytics Accelerator for z/OS later this year, to enable the new platform to seamlessly integrate with IBM z Systems infrastructures.

A new “data science” platform era?

It seems a major reshaping is ongoing in the BI and analytics software market as new-generation solutions keep emerging or getting more robust.

It also seems that this transformation, seen from the user’s point of view, is enabling traditional business intelligence tasks to evolve, blurring the lines between traditional BI analysis and the analysis coming from data science. This helps departments evolve their BI teams more naturally into robust advanced analytics departments, and even eases somewhat the educational process these departments need to go through so that their personnel evolve with the times.

It seems we are seeing a new era in the evolution of enterprise BI/analytics/data science platforms that are about to take over the world. A new space worth keeping an eye on, I think.

The Five Main Benefits of the Internet of Things

Just a decade ago, the internet was still a fledgeling invention, with only a handful of devices able to connect. Now televisions, kettles, toasters, and more are all able to connect to the internet of things, and with technology constantly developing, the possibilities for internet-enabled appliances are only set to grow.

The internet of things is the connections between devices that allow them to collect and share information and communicate with each other via the internet. It’s a development that has many benefits that will make life easier and simpler, improving the world around us, especially as more devices become able to connect. It’s estimated that by 2020 as many as 30 billion objects (excluding computers, laptops, phones, and tablets) will be connected to the internet of things.

With such a large number of expected connected objects, it’s important to know what impact the internet of things - something you may never have heard of before - will have on your life. Here are the main benefits that the internet of things will have for you.

Car Safety

The World Health Organisation estimated that 1.25 million people died from traffic-related fatalities in 2013, and a large number of those deaths are due to human ...

Read More on Datafloq
Captured Records: Vietnam

There is a file of captured records for the Vietnam War. The Viet Cong, having political officers and a command structure, actually did keep records. The North Vietnamese Army also kept records. During the course of the war, some of these records were captured and are in a file at the National Archives. I don’t know of anyone who has used them. I did glance at the file, and there was no finder’s guide and nothing was translated. There did not appear to be much order to the file. I would have needed someone fluent in Vietnamese to help me (which is actually easy to find in Northern Virginia…for example General Nguyen Ngoc Loan ended up owning a pizza restaurant in Springfield, VA).

** EDS NOTE: GRAPHIC CONTENT ** South Vietnamese Gen. Nguyen Ngoc Loan, chief of the national police, fires his pistol into the head of suspected Viet Cong officer Nguyen Van Lem (also known as Bay Lop) on a Saigon street Feb. 1, 1968, early in the Tet Offensive. (AP Photo/Eddie Adams)

In the mid-1990s I did meet with Americans who had worked with the Vietnamese in trying to locate missing U.S. servicemen. They stated that the Vietnamese were very open and interested in researching and discussing the war. They felt that they would be receptive to a joint research project on Vietnam and would be willing to open their archives for us. As we had had access to the Soviet military archives since 1993, this looked like a fairly attractive next adventure for us. Unfortunately, we could not get anyone interested in funding research on insurgencies at that time. It was not something that the U.S. had researched or analyzed since 1973.

Needless to say, after we got involved in insurgencies in Afghanistan and Iraq, I again floated the idea to the Army of doing a joint research project on Vietnam. They listened to me a little longer, but in the end, there was really no interest in analyzing the insurgency in Vietnam. I am not sure why. It has the virtue of being one of the few insurgencies where the insurgents kept good records. This would allow us to do analysis based upon two-sided data. There was certainly something that could be learned from this.

Of course, one of the problems with studying Vietnam is that U.S. Army record keeping at that time was grossly substandard. It was the poorest quality records from the U.S. Army that I had ever observed. The files from most of the units were very scant. Sometimes it was difficult to even determine the unit’s strength and losses. Some divisions were missing almost all of their files (like the 82nd Airborne). For the 1st Brigade, 5th Mechanized Division on the DMZ, we could not determine the tank strength of the unit. There were no periodic strength and loss reports for armor. For the assault helicopter battalion my father commanded, there was only a battalion newspaper and few other files. You could not tell what aircraft the unit had, nor their status or strength. It was embarrassing.

We did actually flag this problem to the active duty army of the time and they ended up giving us a contract to examine the state of current U.S. Army record keeping, which is discussed in this post:

U.S. Army Record Keeping

Anyhow, this is an extended discussion of captured records originally inspired by this post:

The Sad Story Of The Captured Iraqi DESERT STORM Documents


5 Types of Virtual Reality that will Affect the Future

Virtual reality is a digital experience of a reality that nobody can see because it doesn’t exist in the real world. Virtual reality is like closing your eyes and experiencing the sound of music as if you are in front of a live artist, or at the exact place and time when the instrumentals of a song were being composed. It is experiencing things that do not exist.

Virtual reality is a 3D computer universe that one can access using digital technology. It comes with visual machinery attached to a computer to create the virtual experience. When you are connected to a virtual reality setup, the feeling is ecstatic as it looks so real.

With Virtual Reality, you can take a trip to the moon and it would feel and look exactly like you just landed on the moon, but in reality you are just sitting in a chair in a room experiencing all this. Unlike a movie experience, where you are stationed in one position looking at a big screen, Virtual Reality is quite different, as you can move around and the computer world moves along with you.

The Virtual reality world is usually very large and the person ...

Read More on Datafloq
Beyond Bitcoin: Seven of the Top Trending Cryptocurrencies

When most people think of a cryptocurrency, the first thing to come to mind is probably Bitcoin. But Bitcoin is just one of many. At the moment, there are over 1275 cryptocurrencies, with a total market cap of over $200 billion, each of which offers different values and benefits.

Unlike traditional currencies like dollars and euros, which have a fixed value regardless of how and where you use them, different types of cryptocurrencies and tokens perform differently and are designed to solve various issues and problems in the digital world. A dollar or euro has the same value whether you use it to buy a shirt at Target or a book on Amazon, but the different cryptocurrencies are designed to offer users specific features, like the ability to purchase goods and services anonymously, or to facilitate a particular action, like raising money online.

The current cryptocurrency marketplace has been likened to a "Wild West", with massive daily fluctuations. It is still in the infancy stage and is likely to continue to grow apace with technological developments and consumer appetite for more decentralised and democratic systems.

Five Cryptocurrency Trends

The cryptocurrency market is developing at a rapid pace and trends come and go. Nevertheless, there ...

Read More on Datafloq
The Digital Transformation of Healthcare, and What It Means for You

Medical care has come a long way in the last hundred years. Where once we had a system that required doctors to travel far and wide to respond to house calls in which they would ultimately offer medical treatment and consultation that is rudimentary by today’s standards, we now have an industry that is becoming increasingly high tech. Particularly in recent years, digitalization in the medical field has created unprecedented levels of precision and connectivity. Let’s take a look at some recent tech-based innovations in the medical industry and consider how these developments have impacted the experience of patients.


Running a hospital, clinic or dental office involves managing massive amounts of information. As medical professionals have incorporated technology into this process by digitizing records and analytics processes, they have decreased the chances of human error in recordkeeping, billing and even things like diagnostic work. Digitalization has also made it easier for medical personnel to navigate through information, in its many forms. Ultimately, the use of technology has meant greater convenience, reliability and timeliness for patients in accessing their medical records and billing information, which in turn has simplified their healthcare experience. More importantly, it has given doctors greater access to a ...

Read More on Datafloq
How to Prepare for a Cyber Attack During a Public Health Crisis

It’s no secret that the healthcare industry has gone digital. Medical records are stored in computers instead of filing cabinets, medical devices can be monitored from miles away, and even ACLS certification can be obtained online. However, with this digital approach comes the threat of cyber attacks, a threat that increases significantly during times of public health emergencies.

Bioterrorism, global public health emergencies and even mass shootings such as the recent Las Vegas tragedy strike at random. There is no telling when the next mass Ebola outbreak or Salmonella attack might occur, or what form that health emergency might take.

Unfortunately, opportunities for cyber attacks increase significantly during these crises. This is largely due to the extra strain on the system as well as the additional people using it. Volunteers and staff members working in a system for the first time mean well, but unfamiliarity with IT security protocols can easily create gaps and loopholes to exploit in what is otherwise a sound system.

Criminals might look to exploit the medical system for any number of reasons. Some attackers might be looking to commit identity theft or run online scams; others might use the information to access bank accounts or obtain prescriptions for ...

Read More on Datafloq
The Top Professionals of Your Big Data Team

We are now in the age of Big Data, with businesses using facts and figures collected through data analysis to help make strategic business decisions, as well as to gain an edge over their competitors.

Fast becoming a major asset used across all departments of an organisation, not only IT and Security, the algorithmic approach and predictive analysis techniques associated with Big Data are successfully fueling a new era for business.

To harness the power of Big Data in a business scenario, having a team of skilled professionals at your disposal is a definite advantage. Below is a list of essential members of your Big Data team.

The Scientist

A Data Scientist with extensive industry knowledge is like gold dust. Highly skilled in their trade, they can handle and process raw data as well as implement specialised statistical techniques to interpret and uncover unique patterns.  

Showcasing confidence using Big Data-specific programming languages such as Python, R and SQL, a Data Scientist will also have a strong statistical and machine learning background; this will enable them to support conclusions they draw from analysed data.

The Architect

When data is collected, it's often unstructured and in need of a practised hand to transform it into something readable. Hiring ...

Read More on Datafloq
How to Educate Stakeholders on the Realities of Cybersecurity

Cybersecurity is like the door ajar warning light on the console of a huge van going 80 mph down the freeway. Whoever’s driving the vehicle needs to pull over and close the door, but it’s in the back, it’s only slightly ajar, and at any rate, the van can keep going, so the driver figures they might as well just get to it later. But by the time they get to it, someone has already snuck in and made off with a crate of valuables to sell on the black market.

A new cybersecurity survey of businesses reveals 87 percent of respondents are “confident in their cybersecurity preparedness.” This comes at a time when 71 percent “had at least one breach in the previous year.” Types of breaches respondents reported included DoS attacks, fraud, insider attacks, and ransomware. The average cost of a single breach for SMBs was nearly $78K; for enterprises it was nearly $1 million.

Clearly, the door is ajar and many businesses (400 in the survey) naively think they know how to close it.      

Businesses aren’t the only ones facing this problem. Schools face cyber attacks too. About 27 percent of schools allow anyone access to their open networks, ...

Read More on Datafloq
Korean War Story

My father was a forward observer in Korea. In 1953, he and another U.S. soldier were camped out in a foxhole between the lines. It was nighttime and they were making dinner.

The U.S. command had requested that its soldiers should try to capture some Chinese soldiers. As added incentive, the people who captured one would get a three-day pass to Japan. This was a pretty good incentive for those living out in the field. So the two foxhole buddies were sitting making dinner and of course talking about what they would do on their three-day pass to Japan, assuming they could capture a Chinese soldier.

Suddenly, a Chinese soldier stuck his head over the rim of the foxhole. They saw him, yelled “There is one” and immediately leaped for him. The poor Chinese soldier took off running. They ran for a mile or two through the “no man’s land” between the lines (which would become the DMZ) and eventually the two larger Americans were able to run him down and capture him.

Now, they were in the middle of the (soon to be called) DMZ, in the middle of the night, dragging along a captured Chinese soldier, and not quite sure where their foxhole was. Furthermore, in their haste to get him, they forgot to grab their guns. For the two unarmed Americans dragging a Chinese prisoner through the dark, it was a very long and tense walk back to their foxhole.

They did get their three-day pass to Japan.


Note: This is a story told to me by my father many years ago. It was not written down and I have never checked the veracity of it. I have no doubt that it is mostly true, but one cannot rule out a little exaggeration for the sake of a good yarn. We do not know what became of the Chinese soldier.

AI In Telecom: Intelligent Operations is the New Norm

The move towards an intelligent world is faster than it ever was before. This transition has been accelerated by several key stakeholders that have redefined the way we look at technology. One of the key players in this transition is Huawei.

Huawei’s recent UBBF conference, held in Hangzhou on October 18-19, was a step towards awareness in this regard. I attended the conference in person and noted down numerous takeaways that I would like to present to my readers.

An Intelligent World

One cannot stay oblivious to the fact that we are indeed moving towards an intelligent world. The world of the future is being propelled by the use of Artificial Intelligence (AI) and Machine Learning (ML). Together, these technologies form the basis of numerous intelligent technologies that help make life easier for us. A few of these developments that substantially affect the usage of Telecom networks are:

Smart Homes: Through the use of AI or the Internet of Things (IoT), as it is known in more common language, the concept of Smart Homes is spreading at a rapid pace. We have homes where we can monitor and control things ...

Read More on Datafloq
Is Big Data Eliminating Gut Feel in Business?

Many of us trust our gut feeling implicitly.

We have all been there: something didn’t feel right, but we did it anyway. There was nothing obvious to suggest that it was a bad idea. Yet, when it turns out to be a disaster we wonder why we didn’t listen to that gentle little voice: “don’t do it.”

It seems that our gut feel is some magical force, but actually, it is an accumulation of thousands of tiny past moments, all informing our current situation. You could say that gut feel is a personification of Big Data – you can’t always explain it, but you sense that there is great wisdom to be gained from listening to it.

This understanding is crucial to any data scientist seeking to influence their colleagues. They might view the data as being cold and clinical, preferring to rely on their own intuition, but in actual fact, their intuition is based on their own internal data. The difference is that Big Data can reference infinitely more external data than a human mind could ever process. If we see the capabilities of Big Data, its powers of intuition are far greater than any human.

We simply have to use it in ...

Read More on Datafloq
Join the Cloud Analytics Academy

Maybe not as cool as Star Fleet Academy, but this is pretty cool. Snowflake and a number of our partners have come together to create the first self-paced, vendor-agnostic, online training academy for analytics in the cloud. This academy will get you up to speed on what is happening today in the cloud with […]
Why Blockchain and Analytics Don’t Mix Well

The concept of a blockchain is quite a phenomenon in recent times. It has quickly risen from a relatively obscure idea known mostly within some small circles to one that is being discussed as having potential to literally change some of the fundamentals of the world’s economic systems.

I don’t claim to be a blockchain expert, but given that it is a new paradigm for generating and storing data, my mind has naturally drifted toward thinking about how the mechanics and performance of analyzing data within a blockchain environment would be different from how we analyze data within other platforms. My initial thoughts point to some significant challenges.

A System That Isn’t Built for Analytics Isn’t Optimized for Analytics

Let’s start with a historical perspective by examining the early days of data warehousing and 3rd normal form data. Storing data in 3rd normal form does have a range of benefits, particularly when it comes to storing massive amounts of data at an enterprise scale. For one example, it minimizes data duplication and, therefore, storage costs. However, for building models and executing deeper analytics, we need to denormalize such data. So, 3rd normal form adds overhead to our analytic processing. There are benefits to ...

Read More on Datafloq
Korean War Records

Not much to say about captured records in the Korean War as I have never checked on them. I assume there must be some taken from North Korean and Chinese units and they are filed away somewhere. My father did capture a Chinese soldier during the Korean War.

Oddly enough, there has not been much done in the world of quantitative analysis on the Korean War outside of the work that ORO (Operations Research Office) did in the 1950s. We have never done any significant work on the Korean War. In the late 1980s we did explore conducting some analysis of Korean War battalion-level combat. As part of that effort Trevor Dupuy and I went over to the National Archives at Suitland and pulled up some U.S. Army Korean War records. They appeared to be quite complete. There were a couple of French infantry battalions attached to the U.S. Division and we appear to have good strength and loss data for them also.

Later, in 1989, Trevor Dupuy arranged with China to conduct a joint research project. It was funded by OSD Net Assessment (Andy Marshall). Trevor Dupuy really wanted to do some two-sided analysis of combat with the Chinese Army in Korea, but apparently getting access to the Chinese Army records was still too sensitive at that point. So, instead, they arranged to do a joint research contract on a more general and less sensitive theme like perceptions of each side’s intentions during the Korean War. But then in June 1989 the Chinese government rolled over the student protestors in Tiananmen Square with tanks. That ended all joint research projects for many years.

We never got back to trying to conduct a joint research project on combat with China. Instead, in 1995, we started a research project on Kursk using Russian records.

Trevor Dupuy did mention that the Chinese informally told him that the United States often overestimated the size of the Chinese forces they were facing, and often underestimated the casualties the Chinese took. I have no idea how valid that is.


Anyhow, this is an extended discussion of captured records originally inspired by this post:

The Sad Story Of The Captured Iraqi DESERT STORM Documents

A Brief History of AI

This article appeared on Medium. It has been awarded the Silver badge by KDnuggets, and the blog has also been recognized as one of the top 50 AI blogs by Feedspot.

I. The origins

In spite of all the current hype, AI is not a new field of study; it has its roots in the fifties. If we exclude the purely philosophical reasoning path that goes from the Ancient Greeks to Hobbes, Leibniz, and Pascal, AI as we know it officially started in 1956 at Dartmouth College, where the most eminent experts gathered to brainstorm on intelligence simulation.

This happened only a few years after Asimov set his own three laws of robotics, but more relevantly after the famous paper published by Turing (1950), where he proposes for the first time the idea of a thinking machine and the more popular Turing test to assess whether such a machine shows, in fact, any intelligence.

As soon as the research group at Dartmouth publicly released the contents and ideas that arose from that summer meeting, a flow of government funding was reserved for the study of creating a nonbiological intelligence.

II. The phantom menace

At that time, AI seemed to be easily reachable, but it turned out that was not the case. At ...

Read More on Datafloq
Don’t Be Left Behind With the New IT Revolution

The world has seen massive technological leaps in the last ten years or so. Innovations such as multi-touch tablets, cloud computing, smartphones, robotics, etc. have revolutionized the way we live and work. There is a kind of 'digital mesh' surrounding each one of us, and ambient, continuous experiences continue to emerge in a bid to exploit it. Think about it: in the last 24 hours, you have probably had several moments of continuous communication with people using devices (smartphones, tablets, computers, etc.), apps, and other services.

Tsunami of New Technologies

Virtually every aspect of our lives has been hit by the tsunami of new technologies. Our businesses, in particular, have experienced a lot of disruption and mind you, we are just getting started, believe it or not. There is more to come. While it's impossible to predict what exactly will happen in the future vis-à-vis technology, we can use emerging trends to make an educated guess of what the future holds when it comes to IT. Here are some of the top technological/computing trends that will be driving the new IT revolution.

GPU Processors

We use computers in many aspects of our lives: work, gaming, entertainment, etc. The Central Processing Unit (CPU), ...

Read More on Datafloq
Ethical Concerns for AI and Big Data

AI and Big Data have both been boons to the corporations that use them. They make it simpler for data analysts to understand the interests of the demographics being addressed, and they can help companies large and small improve the appeal of their products and advertisements. There has been, however, a natural backlash against the use of AI and Big Data. Many sources have cited the use of both – especially when users are unaware of their presence – as unethical and invasive. This is a genuine concern worth considering as AI and Big Data are used more frequently in companies of all sizes and interests.

Certain ethical lines need to be considered, and those at the head of particular companies need to have an open dialogue with the audience they’re serving in order to ensure that both AI and Big Data are being used effectively and ethically. At the moment, there are certain quirks of the deep learning algorithms and network tapping that read a little too 1984 for some folks. Considering these aspects also makes it possible to consider solutions that will better balance the relationship between technology and the individual.

Deep Learning Algorithms

Deep learning algorithms work within the existing ...

Read More on Datafloq
TDI Friday Read: How Many Troops Are Needed To Defeat An Insurgency?

A paratrooper from the French Foreign Legion (1er REP) with a captured fellagha during the Algerian War (1954-1962). [Via Pinterest]

Today’s edition of TDI Friday Read is a compilation of posts addressing the question of manpower and counterinsurgency. The first four posts summarize research on the question undertaken during the first decade of the 21st century, while the Afghan and Iraqi insurgencies were in full bloom. Despite different research questions and analytical methodologies, each of the studies concluded that there is a relationship between counterinsurgent manpower and counterinsurgency outcomes.

The fifth post addresses the U.S. Army’s lack of a formal methodology for calculating manpower requirements for counterinsurgencies and contingency operations.

Force Ratios and Counterinsurgency

Force Ratios and Counterinsurgency II

Force Ratios and Counterinsurgency III

Force Ratios and Counterinsurgency IV

Has The Army Given Up On Counterinsurgency Research, Again?

Why is Big Data Security so Difficult?

More data is collected and stored today than ever before, making big data a solution to just about every industry’s needs. Customers and clients want solutions and options catered perfectly to their needs before they even know they need them. Silos of data store personal information that allows companies and businesses to personalise interactions and shopping experiences for every individual. But with this great reaping of data comes the difficulty of protecting that personal information. Just as companies are becoming smarter and innovating in their collection and analysis of big data, hackers are also becoming smarter and innovating in their attacks on the servers that hold sensitive and valuable information.

From Target to Home Depot and JPMorgan Chase, big-name companies have been hit by hackers, but that doesn’t mean smaller companies that also hold your personal information aren’t susceptible. In fact, they are sometimes easier prey, as they often don’t have the budget to invest in top-notch integrated security solutions. These silos of data that companies store are a goldmine for cybercriminals. Data breaches at companies that collect and store big data are becoming more common and aren’t going away anytime soon.

But protecting big data isn’t just a matter of ...

Read More on Datafloq
Analytics with Maple Syrup Flavor: Canadian Data & Analytics Companies. Part 1

We all know Silicon Valley is the mecca of technology and, of course, this also applies to the business intelligence (BI) and analytics market, as it concentrates many of its vendors.

Still, it is not hard to realize that around the world we can find tech companies developing innovative technology and software in many areas of the data management space, both already consolidated companies and vibrant startups looking to disrupt the market.

While many people are not surprised by the relevant role some Canadian companies have played in the evolution of the business intelligence (BI) and analytics market, for some it is still unclear which companies in Canada are leaving a mark on the evolution of local data management technology.

As a brief sample, here are some honorable mentions of Canadian companies that played a key role in the evolution of BI:

  • Former Ottawa-based software company Cognos, a fundamental player in the BI enterprise performance management software market acquired by IBM
  • Dundas, a longtime runner in the market who remains as a relevant player and who sold part of its dashboard and reporting technology to become part of Microsoft’s large reporting and analytics arsenal
  • Or more recently Datazen, an innovative mobile BI and data visualization developer acquired also by Microsoft

So without further ado, here’s a list of some current players making waves in the BI and analytics market:

Core Analytx 
Solution(s): OnTime Analytx 

Based in Markham, Ontario, Core Analytx is the developer of OnTime Analytics, the company’s flagship product and main analytics offering.

With its solution offered in three flavors, including standard (SaaS-based) and enterprise (on-premises, as well as on private cloud), the company aims to encourage, guide and assist organizations with the implementation of analytics-centric processes.

Core Analytx develops its proprietary technology around the principles of ease of use and self-service, and the provision of practical and efficient analytics products and services to serve organizations from different industries and lines of business.

Major functions and features offered by OnTime Analytics include:

  • Data ingestion from databases, flat files, mainframes and others
  • Configurable web services for data connectivity
  • Test data management
  • Basic and advanced analytics features
  • Custom training features
  • Data transformation capabilities
  • Data visualization and publishing capabilities
  • Importing data via a data loader that connects to all standard databases (e.g. SQL Server, MySQL, Oracle, etc.)
  • “What if Scenario” capabilities
  • Application customization via developer API
  • Ad-hoc report creation
  • Integration with key partners’ software, including Oracle Cloud
  • Customized security capabilities

OnTime Analytx’ Screencap (Courtesy of Core Analytx)

Coveo
Solution(s): Coveo Intelligent Search Platform

Coveo is a company with a great deal of experience when it comes to searching, identifying and providing contextual information to end users.

Based in Quebec City, the company offers a number of data analysis and management capabilities in its flagship Intelligent Search Platform, bundled under its proprietary Coveo AI™ technology. With this technology, Coveo can search, find and deliver predictive insights across different cloud and on-premises systems.

Already well known for being a provider of enterprise search solutions, the company has expanded its solution to offer much more, using its now cloud-based solution.

Some core functional elements offered within its platform include:

  • Artificial Intelligence (AI)-powered search
  • Relevant search results
  • Advanced query suggestions and intelligent recommendations to website visitors, and automatic relevance tuning to recommend the best content
  • Single Sign-On (SSO), for unified search that honors user and group permissions across all enterprise content sources
  • Personal business content, like emails, can only be searched and seen by the individual user, and any designated super-users (e.g. compliance officers)
  • Usage Analytics

Coveo includes partnerships with key software companies to allow its platform to integrate and work with data from Microsoft, Sitecore and

Coveo’s Screencap (Courtesy of Coveo)

DMTI Spatial
Solution(s): Location Hub Analytics

For over 20 years, DMTI Spatial has been providing industry-leading location economics and Master Address Management (MAM) solutions to Global 2000 companies and government agencies; it is also the creator of CanMap mapping solutions and the award-winning Location Hub. DMTI Spatial is headquartered in Markham, Ontario.

Location Hub Analytics is a self-service data analytics engine that provides Canada’s robust, accurate and up-to-date location-based data.

Relevant functional features of Location Hub Analytics include:

  • Automatically consolidates, cleanses, validates and geocodes your address database
  • Each record is assigned a Unique Address Identifier (UAID™)
  • Quickly processes and analyzes data, to objectively reveal meaningful patterns and trends to help better understand customers and prospects
  • Allows you to visualize and interact with your results on a map for better data profiling
  • Enriches data with Canadian demographic information for further analysis and greater customer intelligence
  • Helps generate new business prospect lists by infilling the addresses within a specific territory that are not in your current database

Location Hub Analytics (Courtesy of DMTI Spatial)

Dundas
Solution(s): Dundas BI

Dundas is an experienced company in the business intelligence scene. Headquartered in Toronto, the company offers, via its now flagship product Dundas BI, a robust BI and analytics platform.

With its BI solution, Dundas aims to give users full control over their data so it can be quickly delivered in the most actionable way. The Dundas BI platform enables organizations to work with data, prepare it and transform it, and subsequently explore it visually within dashboards, reports and visual data analytics tools.

Also worth mentioning is that Dundas’ success relies on its ability to build a solution with a wide range of built-in functionality and a rich set of open APIs.

Main functional features include:

  • Customizable dashboards
  • Communication and collaboration tools
  • Slideshows
  • Rich, interactive Scorecards
  • Ad-hoc reporting
  • Mobile features
  • Predictive and advanced data analytics
  • Embedded BI with seamless data integration
  • Support for Windows authentication
  • Multi-tenancy support

Dundas BI’s Screencap (Courtesy of Dundas)

Panorama Software
Solution: NECTO

Necto is Panorama Software’s full BI and analytics solution. The Toronto-based company, which has offices in the US, UK and Israel, develops a business intelligence and analytics solution that offers automated analysis and recommendations that are easily disseminated throughout the organization.

With a fully customizable layout that can be modified to fit an organization’s language, and with easy point-and-click functionality, Panorama aims with Necto to take collaboration to the next level with the best business intelligence reporting tools that communicate real data.

Key features offered in Necto include:

  • Centrally administered, fully web based system
  • Fully functional dashboard design capabilities & simplified info-graphics
  • Automated analysis & hidden insights
  • Easy sharing of BI content
  • High security & scalability
  • Powered with KPI alerts
  • Mashup data from multiple sources
  • Simple & fast development

Necto’s Screencap (Courtesy of Panorama Software)

Semeon
Solution: Semeon Insights

With a great deal of experience in machine learning and artificial intelligence (AI) R&D, Montreal-based Semeon develops Semeon Insights, a next-generation cloud-based “AI Linguistic” text analytics platform, to serve businesses interested in better understanding what is being said about their brand, company, products, staff, competitors, and more.

All of Semeon’s solutions are developed using its series of patented Semantic Algorithms, which can determine the sentiment, intent, and predictive behaviors of clients, buyers or customers.

Key features offered by Semeon Insights include:

  • Sifts through public (Social Media, forums, blogs, review sites) as well as private data (CRM data, Customer Service data) to enhance customer-driven campaigns.
  • Uses a number of techniques to uncover key insights and influencers, including:
    • Sentiment Analysis
    • Concept Clouds
    • Timeline Tracking
    • Content Classification
    • Sources/channels
    • Influencer identification
    • Data Visualization
    • Geolocation
    • Intent Analysis
  • Leverages concepts and opinions that drive public perception to fuel content creation teams and boost ROI, as well as to glean insights from competitors’ digital campaigns.

Semeon’s Screencap (Courtesy of Semeon)

Ahh! And There’s More

So, in the second part of this series I will include some other start-ups and projects that will catch your attention with their innovation and the opportunities they offer, both for using them and for building business with them.

In the meantime, and considering I might be leaving some companies out, please feel free to let me know your comments or the name of any new Canadian analytics solution we all should know about.

* All logos are trademarks of their respective owners
Are You Ready to Have a Robot as Your Boss?

Artificial intelligence (AI) and robots have slowly but steadily made their way into a variety of industries, from fast food to the financial sector. This expansion is projected to continue and is forecast to replace a considerable number of jobs. A Forrester report notes that AI can replace as many as six percent of jobs by 2021. A PwC report notes that many jobs across the globe will be affected by the 2030s, including 38 percent of U.S. jobs, 35 percent of jobs in Germany, 30 percent of U.K. jobs, and 21 percent of occupations in Japan.

There is no doubt that AI and robots can replace frontline workers who complete routine tasks. But what does this mean for managers, supervisors and even executives? Are today's workers ready to have robots be their future bosses? Here are some factors to explore.

The Robots Are Coming

There are several reports about the positive and adverse implications of Artificial Intelligence. AI and robots are expected to increase production and efficiency in the workplace and increase salaries. They are even expected to increase jobs in sectors that are difficult to automate, such as health care, social work, and education. However, the influx of a robot workforce ...

Read More on Datafloq
5 AI Powered Tools to Help Run Business Smoothly

Artificial Intelligence is transforming the face of business, and no other technology has affected it more than AI. AI has continued to evolve over time and has helped people from various walks of life, industries, and businesses.

Entrepreneurs are no exception: AI has helped them automate their businesses, extract useful data, and gain insights. One thing is for sure: the adoption of AI by businesses and individuals will flourish. Presently, AI-based solutions are being used to solve business problems and help entrepreneurs transform the way they conduct operations.

Have a look at some of the statistics that will reveal the future of AI in business.

According to IDC FutureScape, by 2018, 75% of developer teams will include AI functionality in one or more applications or services.
The 2016 survey revealed that 58% of enterprise business executives are presently using predictive analytics within their organization (Narrative Science)
According to Gartner, by 2025, 85% of customer interactions will be managed without human intervention.
45% of fastest growing companies worldwide will employ smart machines and virtual assistants instead of people by 2018 (Gartner)

From the above statistics, it is clear that the impact of AI will be largely seen in the business ...

Read More on Datafloq
Big Data Contracts Shouldn’t Cause Big Headaches

With every innovation in the world of big data and analytics, corporate IT departments dive into new contracts with vendors to stay ahead of the competition. This is only natural, considering that the cost of collecting and maintaining data is one of the heaviest cost centres on a CIO’s balance sheet.

But, in a world where F.O.M.O., the fear of missing out, is guiding decision making, it’s easy to get suckered into a bad deal.

Thankfully, more and more is being handled internally by corporate data scientists. Hadoop completely changed the way we store and interact with data - significantly lowering costs. And the more manageable data becomes, the more we’ll be able to reduce reliance on external contractors.

But, there’s one significant advantage to utilising outside contractors, even if your internal team can handle the nuts and bolts of data management. By working with outside vendors, you gain access to fresh perspectives, innovative sources of new data and cutting-edge concepts.

And no, using a qualified outside vendor does not constitute a data security risk. I hear from C-suite personnel that they worry about trusting their information to an external vendor. There are a ton of easy steps you can take to limit your ...

Read More on Datafloq
Captured Records: WWII

At the end of World War I, the United States made sure they had access to the German military records in their treaty and sent a research team over there to review and copy the pertinent records. At the end of World War II, we just took everything. The allies gathered together all the material, in part because of concerns over war crimes, and eventually the entire collection was shipped back to the United States.

In the 1960s, the United States decided to repatriate the records back to Germany. Before they shipped them out they decided to copy the entire collection of German World War II records and place them on microfilm. This was a massive effort done by government contractors that took several years. Like all good government contracts, towards the end, this one was behind schedule, so they were having to cut corners by choosing not to copy things that they felt were not particularly relevant. So, the original records are now in the archives in Freiburg, Germany (a nice little town about as far away from the old East German border as you can get). There are copies on microfilm of most of the German records, sometimes in disorder, in the U.S. Archives II in College Park, Maryland (near Washington DC). The British also have microfilm copies of portions of the German records collection that they captured over at the Public Records Office in Kew (London). The UK collection is a fraction of the size of the U.S. microfilm collection, and as far as I know has nothing additional in it.

There were some records that were not copied by the U.S., but not many. For example, for the Kursk Data Base, I did the German research from the U.S. record collection. There were 17 German divisions in the offensive in the south in July 1943. I made a detailed listing of the records I had reviewed and sent that list to Dr. Arthur Volz over in Germany. He then went to Freiburg and tried to locate additional material on strength and losses from those files. About the only additional material he located was the panzer regiment files from the 11th Panzer Division, which were either not in the U.S. archives or which I overlooked when I did my research. That was it. Overall, the original copying effort was pretty exhaustive.

There was one major gap for a long time. For a couple of decades, many of the original German situation maps were in the U.S., but no longer accessible. They were supposed to be copied and sent to Germany, but there was a budget issue. Meanwhile one researcher was handling them so poorly that they canceled access so as to protect them. They have finally copied them and sent the originals back to Germany.

New WWII German Maps At The National Archives

There are also no real Luftwaffe files. Most of the Luftwaffe files were placed on a train and when the order came down from Hitler to destroy everything... these weenies actually obeyed the order and burned all their records. There are also major gaps in the German records after July 1944. Every six months, the German army units wrapped up their records and sent them back to their central archives. Because the war ended in May 1945, many of the records for July-December 1944 never made it back to be filed. Same for the 1945 records. This is why the QJM (Quantified Judgment Model) was originally developed from Italian Campaign Data from 1943 through June 1944.

Anyhow, this is an extended discussion of captured records originally inspired by this post and started with the discussions below.

The Sad Story Of The Captured Iraqi DESERT STORM Documents

Captured Records: World War I

Survey of German WWI Records


The Dichotomy Between Data Science and Business Analytics

As an analytics professional, I take a keen interest in trends of all sorts. Thanks to Google Trends, we don’t have to question the accuracy of this trend – in my business, any other trend can and will be questioned six ways from Sunday!

I know that IT investments are no popularity contests, but I think search term traffic is a good proxy for what interests businesses worldwide at a given point in time. In this blog, my objective is to try and highlight my understanding of what might be driving these trends. I’m stating what is obvious from this chart.

The interest in “Business Intelligence” has significantly waned in the last decade
The post-recession interest in “Data Science” has grown exponentially (and surpassed BI just a year ago!), while
“Business Analytics” has kept a decade-long marathon-paced trend in popularity

I started my career in analytics exactly 11 years ago. I think it coincided nicely with the emergence of analytics as a legitimate business function. And I worked at Hewlett Packard, which could afford investments in high-performance software and research.

Business intelligence was very popular, and everyone was crazy about ETLs and data warehousing and “putting 2 and 2 together”. It was an exciting time to learn ...

Read More on Datafloq
Will Tax Reform Throttle A U.S. Defense Budget Increase?

John Conger recently reported in Defense One that the tax reform initiative championed by the Trump administration and Republican congressional leaders may torpedo an increase in the U.S. defense budget for 2018. Both the House and Senate have passed authorizations approving the Trump administration’s budget request for $574.5 billion in defense spending, which is $52 billion higher than the limit established by the Budget Control Act (BCA). However, the House and Senate also recently passed a concurrent 2018 budget resolution to facilitate passage of a tax reform bill that caps the defense budget at $522 billion as mandated by the BCA.

The House and Senate armed services committees continue to hammer out the terms of the 2018 defense authorization, which includes increases in troop strength and pay. These priorities could crowd out other spending requested by the services to meet strategic and modernization requirements if the budget remains capped. Congress also continues to resist the call by Secretary of Defense James Mattis to close unneeded bases and facilities, which could free spending for other needs. There is also little interest in reforming Defense Department business practices that allegedly waste $125 billion annually.

Congressional Republicans and Democrats were already headed toward a showdown over 2018 BCA limits on defense spending. Even before the tax reform push, several legislators predicted yet another year-long continuing resolution limiting government spending to the previous year’s levels. A bipartisan consensus existed among some armed services committee members that this would constitute “borderline legislative malpractice, particularly for the Department of Defense.”

Despite the ambitious timeline set by President Trump to pass a tax reform bill, the chances of a continuing resolution remain high. It also seems likely that any agreement to increase defense spending will be through the Overseas Contingency Operations budget, which is not subject to the BCA. Many in Congress agree with Democratic Representative Adam Smith that resorting to this approach is “a fiscal sleight of hand [that] would be bad governance and ‘hypocritical.’”

Are tax reform and increased defense spending incompatible? Stay tuned.

How Computer Vision Could Save Your Life

Technology is driving huge disruption and innovation within healthcare. Big data, machine learning, and artificial intelligence-led applications are benefiting the ways in which medical professionals diagnose and treat patients.

Google Brain has recently made a significant breakthrough in computer vision. In 2011, the error rate was 26% for computers analyzing images, but now it’s been revealed that computers are able to recognise and analyse images better than humans, with an error rate of only 3% compared to a 5% error rate for humans. This opens up huge opportunities for this technology in an array of sectors including healthcare. Medical imaging accounts for a huge part of medicine, from image registration and annotation and image-guided therapy through to computer diagnostics.

Unfortunately, human error is no stranger to the healthcare sector and can put lives at risk. Whilst medical professionals strive to deliver the best care they can, is the time for AI-led healthcare here? And is computer vision really capable of providing more thorough results than doctors?

Accenture’s recent report suggests that the AI health market size is set to grow to a huge $6.6 billion by 2021, at an annual rate of 40%. This growth is enormous and signals a radical change in healthcare that ...

Read More on Datafloq
Ten Major Challenges of Big Data Analytics in the Healthcare Industry of Today

The hype surrounding big data analytics in healthcare can be termed “challenging, but inescapable”, with providers like Datafloq feeling the highest pinch of it. The need to store the data reliably, safely and securely, and to be sure that it will be efficiently accessible when needed, which is particularly useful in healthcare, adds to the excitement. Big data is long, complicated and bulky, often requiring a closer look at its more vital aspects to ensure that it becomes meaningful.

Forget about the enormous enthusiasm regarding how “big data” will address continuous cost and quality deficiencies in the system; interpreting and successfully integrating it isn’t a walk in the park. Clinical and IT departments with narrow focuses that solve a single problem at a time are the ones who feel the pressure even more. And those who have barely understood how to convert their data into Electronic Health Records (EHR) are now required to draw actionable insights out of the data.

Clearly, the pathways to meaningful healthcare analytics are thorny ones even though some of the perks are healthier patients, lower healthcare costs, and higher consumer satisfaction. The facility will first have to collect, store, analyse, process and present the data to its stakeholders ...

Read More on Datafloq
6 Predictions for the Future of Artificial Intelligence

Artificial Intelligence is no longer a futuristic technology. It is not an attention-grabbing, fiction-infused tool that a mobile game development company considers important. It is already here, allowing us to reap advantages like more precise predictions, more adaptive machine behaviour, context-aware machine reactions to human voice commands, etc. Machines are continuing to imitate human intelligence and unleashing automated as well as responsive behaviour in human situations that were unpredictable in the past.

At the pace at which Artificial Intelligence is paving the way for better comfort, ease of use and multifaceted advantages in everyday life, we will soon see AI make a lot of things happen that were previously unthinkable. The latest AI research projects underway and the predictions about the roles of AI in the future point to a future that is both bright and shrouded in anticipation.

Here we are going to explain six predictions about the future of AI that seem credible.

1. Robots for disaster management

AI, which refers to the intelligence of machines, will make machines more responsive and aware of human contexts. If one facet of modern technology can be predicted to reap the highest advantages of this new machine intelligence, it is robotics. ...

Read More on Datafloq
8 Deep Learning Use Cases to Know & 10 Deep Learning Startups to Watch

Over the past few years, deep learning has become another trendy term. It is mostly used in business language when the conversation is about Machine Learning, Artificial Intelligence, Big Data, analytics, etc. Currently, it is showing great promise when it comes to developing the autonomous, self-teaching systems that are revolutionising many industries. Therefore, I decided to write an article about deep learning startups, use cases and books.

Deep Learning was developed as a Machine Learning approach to deal with complex input-output mappings. Deep learning crunches more data than traditional machine learning, and that is the biggest difference. If you have a little bit of data, traditional machine learning is a good choice, but if you have a lot of data, deep learning is a better choice for you. Deep learning algorithms do complicated things, like matrix multiplications. They also learn high-level features, so in the case of facial recognition, the algorithm will get the image pretty close to the RAW version in replication, whereas traditional machine learning’s images would be blurry. Another powerful feature is that it forms an end-to-end solution instead of breaking a problem and solution down into parts.
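
To make the point about matrix multiplications a bit more concrete, here is a minimal sketch in plain Python of what a forward pass through a tiny two-layer network might look like. The function names (`matmul`, `relu`, `forward`) and all the numbers are made up for illustration; real deep learning frameworks do the same kind of arithmetic at a much larger scale and with trained weights.

```python
# Illustrative only: a tiny two-layer network whose forward pass is
# just matrix multiplications with a simple nonlinearity in between.
# The weights and inputs are invented numbers, not trained values.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def relu(m):
    """Replace every negative entry with zero (a common activation function)."""
    return [[max(0.0, v) for v in row] for row in m]

def forward(x, w1, w2):
    """Input -> hidden layer (matmul + ReLU) -> output layer (matmul)."""
    hidden = relu(matmul(x, w1))
    return matmul(hidden, w2)

x = [[1.0, 2.0]]                             # one sample with two features
w1 = [[0.5, -1.0, 0.0], [0.25, 0.5, 1.0]]    # 2 x 3 hidden-layer weights
w2 = [[1.0], [2.0], [-1.0]]                  # 3 x 1 output-layer weights

output = forward(x, w1, w2)
print(output)
```

Stacking more of these layers, and learning the weights from data instead of typing them in, is roughly what puts the "deep" in deep learning.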

What is Deep Learning?

But what is Deep Learning exactly? Why has it become so popular? In ...

Read More on Datafloq
Mobile app development trends of the year

This past year shaped up to be a phenomenal year for the app economy, and 2018 is set to be another great year. Mobile is more mainstream than ever, and businesses from all industries are relying on this channel to boost existing revenue by meeting customer demands. Mobile app development trends change a lot, and 2018 promises to see the continuing emergence of new, cutting-edge techniques and tools, along with growth in traditional technologies and approaches. But now, let’s look at the top trends to focus on next year.

4 Mobile App Development Trends


AI will make a mark both on the construction techniques of mobile apps and on their capabilities. Through the use of advanced analytics, cognitive interfaces into complex systems, and machine learning technologies, AI will give business users access to powerful insights never before available to them. Because of these advantages, big players like Google, IBM, eBay and Facebook have started investing in and acquiring startups that specialise in Artificial Intelligence.

AR & VR 

If you are even a little aware of modern mobile app development trends, then AR and VR apps are nothing new to you. They have been revolutionary in the gaming and entertainment industries. VR devices like Samsung Gear VR, Google Cardboard and Oculus Rift ...

Read More on Datafloq
Why I Worry About Semi-Autonomous Cars

The technological background

The invention of self-driving cars is regarded as one of the extraordinary steps made in technology. With this technology installed in our cars, it raises the question of whether a novice should still bother learning to drive. The technology behind self-driving cars is sophisticated, and its inner workings are hard for a layperson to grasp. In a nutshell, semi-autonomous vehicle technology aims to ensure safety by doing all the driving for you. When the weather is very cloudy and the road is not visible, self-driving cars come in handy with their sensors and robotic driving capabilities. Notably, a semi-autonomous car can outshine the human driver and drive for miles without bumping into another vehicle or drifting into the wrong lane.

The smartness that self-driving cars offer their users is beyond what we imagined. It is clear that computers are here to make things easy for us while making us lazy. For instance, can you imagine yourself getting into a car each morning, and the car then drives out of the parking area and heads to the highway by itself? ...

Read More on Datafloq
Recovering from Digital Transformation

Digital Transformation is exciting and Data Protection is boring. But to get the cool apps into production, they have to be recoverable.
Why You Should Set Up a Big Data Environment Before Seeking AI

The past decade has seen major strides in the IT industry with cloud computing, blockchain and big data emerging as successful trendsetters for more innovation. According to a study led by EMC and Cap Gemini, 65% of big companies have estimated that they run the risk of becoming obsolete if they do not adopt adequate Big Data analytics solutions to support their modern data platform. IDC further brings to the fore predictions that suggest Big Data annual spending will reach $48.6 billion in 2019. With big data initiatives taking centre stage, organisations are keen on leveraging the agility of big data processes in combination with artificial intelligence (AI) capabilities to speed up the delivery of business value.

How big data fits in the whole scheme of seeking artificial intelligence

With organisations grappling with hordes of data while managing their legacy IT processes, customer relationships, business intelligence, sales forecasting, logistics and so on, big data analytics is presenting itself with opportunities to work with massive data sets that offer value beyond that of just sample sets. That being said, the data storage industry is looking at newer ways of leveraging the power of huge data sets that are now readily available. And one ...

Read More on Datafloq
The CRS Casualty Estimates

Let’s just outline the specifics of the casualty estimates for a war with North Korea in the latest Congressional Research Service (CRS) report dated 27 October 2017:

On page 18:

Even if the DPRK uses only its conventional munitions, estimates range from between 30,000 and 300,000 dead in the first days of fighting, given that DPRK artillery is thought to be capable of firing 10,000 rounds per minute at Seoul. One observer states

Estimates are that hundreds of thousands of South Koreans would die in the first few hours of combat–from artillery, from rockets, from short range missiles–and if this war would escalate to the nuclear level, then you are looking at tens of millions of casualties and the destruction of the eleventh largest economy in the world.

It does not appear that CRS has done any independent analysis of this issue. Its sources in the footnotes are articles from Reuters, New York Times, CNN, NAPSNet Special Reports and GlobalSecurity.

And on page 3:

Should the DPRK use the nuclear, chemical or biological weapons in its arsenal, according to some estimates casualty figures could number in the millions.


Casualty Estimates for a War with North Korea

There are a few casualty estimates out there of the cost of a war with North Korea. A couple of these casualty estimates are summarized in this article:

They are:

1. As many as 2.1 million could die if nuclear detonations occurred over Seoul and Tokyo (source: website 38 North, October 2017).

2. As many as 300,000 could die in the first few days of a conflict between North Korea and the U.S. even without the use of nuclear weapons (source: Congressional Research Service, 27 October 2017).

The Dupuy Institute has not done any casualty estimates or analysis of a war with North Korea, nor are we planning to at this juncture. We have done a few casualty estimates in the past:




Assessing the TNDA 1990-91 Gulf War Forecast

Assessing the 1990-1991 Gulf War Forecasts

Forecasting U.S. Casualties in Bosnia

Forecasting the Iraqi Insurgency

Which Top IoT companies are losing, keeping or winning its Charm in 2017?

Nostalgics will remember the articles on the ReadWrite website that included the top Internet of Things (IoT) companies of 2009 and 2010. Unfortunately, these articles are no longer online.

The list of companies in the Internet of Things market has not stopped growing in the last eight years. During this time, I have been following many of the companies that have boarded the IoT train, which is heading for a still diffuse destination. In the USA alone, the Internet of Things includes 3,000 companies, $125B in funding, $613B in valuation and 342,000 employees.

In this post, I will review the top 10 IoT companies that, in my opinion, have lost, maintained or gained charm in 2017 in this dynamic, competitive and demanding sector.

I selected these companies from these articles: “Top 25 IoT Companies by Sales”, “the most powerful Internet of Things companies” and “the Top Internet of Things Companies”.

Depending on how this article is received, I will decide in the coming weeks whether to write about the top IoT platforms, the top IoT startups or the top M2M Service Providers that are losing, keeping or winning their charm.

Defining Charm

The power of pleasing or attracting through their strategy, investments, innovation, ...

Read More on Datafloq
Why the Internet of Things Offers a Bold New Frontier for Social Marketing

The year is nearly 2018, and we have yet to see flying cars become the norm or the colonisation of Mars. Science fiction may have over-promised, but I would argue that Silicon Valley hasn’t under-delivered.

Nearly 1 in 5 households are considered hyper-connected, which means that they have 10 or more internet-connected devices. That’s great news for people that are looking for machines to become more integrated with our daily lives.

Do The Risks of a Connected Life Outweigh the Benefits?

Unfortunately, with every advance, there are new risks and concerns. One of the concerns with the connected household is privacy. Even for corporate America, this presents a challenge. While 85% of companies plan to utilise connected devices, only 10% feel that they can successfully secure them for daily use.

If corporate America can’t protect itself from the onslaught of hackers, how are residential households supposed to venture confidently into the world of smart devices and connected homes?

There are a few products that I think are a step in the right direction for the secure, connected household. First is the introduction of highly-secure home routers that can detect attacks because they have security software baked into the network infrastructure. Second, there are a growing ...

Read More on Datafloq
TDI Friday Read: Afghanistan

[SIGAR, Quarterly Report to Congress, 30 October 2017, p. 107]

While it is too soon to tell if the Trump Administration’s revised strategy in Afghanistan will make a difference, the recent report by the Special Inspector General for Afghanistan Reconstruction (SIGAR) to Congress documents the continued slow erosion of security in that country. Today’s edition of TDI Friday Read offers a selection of recent posts addressing some of the problems facing the U.S. counterinsurgent and stabilization missions there.


Meanwhile, In Afghanistan…

We probably need to keep talking about Afghanistan

What will be our plans for Afghanistan?

Stalemate in Afghanistan

Troop Increase in Afghanistan?

Sending More Troops to Afghanistan

Mattis on Afghanistan

Deployed Troop Counts

Disappearing Statistics



8 Ways How Big Data Changes Marketing

Today's most successful companies and businesses depend on big data and analyse it to increase the quality of their customer experience. The information stored in the cloud, bookmarked URLs and new data streaming in every second from all over the world accumulate into what is referred to as big data.

Business intelligence relies entirely on the analysis and interpretation of big data. Although there is always a lot of data at a company's disposal, it takes a special tool to analyse and manipulate it to the company's advantage. Stiff competition in marketing has pushed companies to acquire the best tools for big data analysis. This is a clear indication that big data analytics plays a great role in marketing.

The roles of big data in marketing include the following:

1. Providing Better Customer Insights

Analysis of big data reveals what consumers prefer at the moment. If the majority of people on a social platform are discussing a certain product, that is the perfect moment and place to show the product's ad. This increases the accuracy of your list of target customers.

2. Personalization

Big data analysis results show your consumers' purchase patterns. As an outgoing marketer, you can take advantage of this and send them recommendations of ...

Read More on Datafloq
Why Artificial Intelligence Will Be Fundamentally Different from Human Intelligence

Last week, Dubai announced a minister of Artificial Intelligence. The 27-year-old Omar Bin Sultan Al Olama is the UAE’s first State Minister of AI. A remarkable move by UAE Vice-President and Prime Minister Shaikh Mohammad Bin Rashid Al Maktoum, who said: “we want the UAE to become the world’s most prepared country for artificial intelligence”. The role focuses on future skills, future sciences and future technology and offers Dubai a chance to be ready for an AI-driven future. The move by Dubai is a great showcase of vision and of the courage to take risks, since artificial intelligence will increasingly define our future.

In fact, I would recommend that any country that takes digitalisation seriously follow Dubai’s example and appoint a minister of Artificial Intelligence. Even more, I believe that every organisation should employ a Chief AI to understand how AI will impact the business and how it will affect organisations and societies. Why? Simply because Artificial Intelligence will be fundamentally different from human intelligence, and understanding this can bring you a competitive advantage and can help us mitigate risks.

Why AI Will Be Different

Intelligence is “the complex expression of a complex set of principles”, consisting of various interdependent subsystems all ...

Read More on Datafloq
Disappearing Statistics

There was a time during the Iraq insurgency when statistics on the war were readily available. As a small independent contractor, we were getting the daily feed of incidents, casualties and other such material during the Iraq War. It was one of the daily intelligence reports for Iraq. We had simply emailed someone in the field and were put on their distribution list, even though we had no presence in Iraq and no official position. This was public information so it was not a problem….until the second half of 2005…when suddenly the war was not going very well…then someone decided to restrict distribution. We received daily intelligence reports from 4 September 2004. They ended on 25 August 2005. There is more to this story, but maybe later.

This article was brought to my attention today:

A few highlights:

  1. From January 1 to May 8 Afghan forces sustained 2,531 killed in action and 4,238 wounded (a 1.67-to-1 wounded to killed ratio, which seems very low).

  2. Afghan forces control 56.8% of the 407 districts, a one percentage point drop over the last six months.

  3. The Afghan government controls 63.7% of the population.

  4. Some of these statistics will now be classified.
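
The wounded-to-killed ratio quoted in the first highlight is simple arithmetic on the two reported figures. A minimal sketch of the check follows; the function name `wounded_to_killed` is only an illustrative choice, and the inputs are the numbers quoted above.

```python
def wounded_to_killed(wounded: int, killed: int) -> float:
    """Return the number of wounded per one killed."""
    if killed <= 0:
        raise ValueError("killed must be a positive count")
    return wounded / killed


# Figures quoted above for Afghan forces, January 1 to May 8.
ratio = wounded_to_killed(wounded=4238, killed=2531)
print(f"{ratio:.2f}-to-1")
```

Rounded to two decimals, this reproduces the 1.67-to-1 figure cited in the list.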


One of our older posts on wounded-to-killed ratios. I have an entire chapter on the subject in War by Numbers.

Wounded-To-Killed Ratios

How Will Widespread A.I. Affect Generation Alpha?

Children growing up today, specifically those born after 2010, have been dubbed Generation Alpha, and they’re set to be “the most tech-intensive, educated generation yet,” according to the experts at KinderCare Learning Center. In the same way that children of the 70s and 80s saw computers grow from expensive machinery used by the elite to common devices that we carry in our pockets, Generation Alpha will watch the rise of technology that today we see as impossibly complex and advanced, including the rise of A.I., from clunky, unrefined chatbots to whatever they become in the future.

That change is coming sooner than most would probably expect. Business Insider reports that the Kingdom of Saudi Arabia has become the first country in the world to bestow citizenship upon an AI-powered robot, named Sophia.

"I am very honored and proud of this unique distinction," Sophia told the audience, speaking on a panel. "This is historical to be the first robot in the world to be recognized with a citizenship."

The ceremonial event was held in the capital city of Riyadh, ahead of the Future Investment Initiative, and while likely more of a PR stunt than anything else, it sets interesting precedents regarding A.I. and how we will interact ...

Read More on Datafloq
The Historical Combat Effectiveness of Lighter-Weight Armored Forces

A Stryker Infantry Carrier Vehicle-Dragoon fires 30 mm rounds during a live-fire demonstration at Aberdeen Proving Ground, Md., Aug. 16, 2017. Soldiers with 2nd Cavalry Regiment spent six weeks at Aberdeen testing and training on the new Stryker vehicle and a remote Javelin system, which are expected to head to Germany early next year for additional user testing. (Photo Credit: Sean Kimmons)

In 2001, The Dupuy Institute conducted a study for the U.S. Army Center for Army Analysis (CAA) on the historical effectiveness of lighter-weight armored forces. At the time, the Army had developed a requirement for an Interim Armored Vehicle (IAV), lighter and more deployable than the existing M1 Abrams Main Battle Tank and the M2 Bradley Infantry Fighting Vehicle, to form the backbone of the future “Objective Force.” This program would result in development of the Stryker Infantry Fighting Vehicle.

CAA initiated the TDI study at the request of Walter W. “Don” Hollis, then the Deputy Undersecretary of the Army for Operations Research (a position that was eliminated in 2006). TDI completed and submitted “The Historical Combat Effectiveness of Lighter-Weight Armored Forces” to CAA in August 2001. It examined the effectiveness of light and medium-weight armored forces in six scenarios:

  • Conventional conflicts against an armor supported or armor heavy force.
  • Emergency insertions against an armor supported or armor heavy force.
  • Conventional conflict against a primarily infantry force (as one might encounter in sub-Saharan Africa).
  • Emergency insertion against a primarily infantry force.
  • A small to medium insurgency (includes an insurgency that develops during a peacekeeping operation).
  • A peacekeeping operation or similar Operation Other Than War (OOTW) that has some potential for violence.

The historical data the study drew upon came from 146 cases of small-scale contingency operations; U.S. involvement in Vietnam; German counterinsurgency operations in the Balkans, 1941-1945; the Philippines Campaign, 1941-42; the Normandy Campaign, 1944; the Korean War 1950-51; the Persian Gulf War, 1990-91; and U.S. and European experiences with light and medium-weight armor in World War II.

The major conclusions of the study were:

Small Scale Contingency Operations (SSCOs)

  1. Implications for the Interim Armored Vehicle (IAV) Family of Vehicles. It would appear that existing systems (M-2 and M-3 Bradley and M-113) can fulfill most requirements. Current plans to develop an advanced LAV-type vehicle may cover almost all other shortfalls. Mine protection is a design feature that should be emphasized.
  2. Implications for the Interim Brigade Combat Team (IBCT). The need for armor in SSCOs that are not conventional or closely conventional in nature is limited and rarely approaches the requirements of a brigade-size armored force.


Insurgencies

  1. Implications for the Interim Armored Vehicle (IAV) Family of Vehicles. It would appear that existing systems (M-2 and M-3 Bradley and M-113) can fulfill most requirements. The armor threat in insurgencies is very limited until the later stages if the conflict transitions to conventional war. In either case, mine protection is a design feature that may be critical.
  2. Implications for the Interim Brigade Combat Team (IBCT). It is the nature of insurgencies that rapid deployment of armor is not essential. The armor threat in insurgencies is very limited until the later stages if the conflict transitions to a conventional war and rarely approaches the requirements of a brigade-size armored force.

Conventional Warfare

Conventional Conflict Against An Armor Supported Or Armor Heavy Force

  1. Implications for the Interim Armored Vehicle (IAV) Family of Vehicles. It may be expected that opposing heavy armor in a conventional armor versus armor engagement could significantly overmatch the IAV. In this case the primary requirement would be for a weapon system that would allow the IAV to defeat the enemy armor before it could engage the IAV.
  2. Implications for the Interim Brigade Combat Team (IBCT). The IBCT could substitute as an armored cavalry force in such a scenario.

Conventional Conflict Against A Primarily Infantry Force

  1. Implications for the Interim Armored Vehicle (IAV) Family of Vehicles. This appears to be little different from those conclusions found for the use of armor in SSCOs and Insurgencies.
  2. Implications for the Interim Brigade Combat Team (IBCT). The lack of a major armor threat will make the presence of armor useful.

Emergency Insertion Against An Armor Supported Or Armor Heavy Force

  1. Implications for the Interim Armored Vehicle (IAV) Family of Vehicles. It appears that the IAV may be of great use in an emergency insertion. However, the caveat regarding the threat of being overmatched by conventional heavy armor mentioned above should not be ignored. In this case the primary requirement would be for a weapon system that would allow the IAV to defeat the enemy armor before it could engage the IAV.
  2. Implications for the Interim Brigade Combat Team (IBCT). Although the theoretical utility of the IBCT in this scenario may be great it should be noted that The Dupuy Institute was only able to find one comparable case of such a deployment which resulted in actual conflict in US military history in the last 60 years (Korea, 1950). In this case the effect of pushing forward light tanks into the face of heavier enemy tanks was marginal.

Emergency Insertion Against A Primarily Infantry Force

  1. Implications for the Interim Armored Vehicle (IAV) Family of Vehicles. The lack of a major armor threat in this scenario will make the presence of any armor useful. However, The Dupuy Institute was unable to identify the existence of any such cases in the historical record.
  2. Implications for the Interim Brigade Combat Team (IBCT). The lack of a major armor threat will make the presence of any armor useful. However, The Dupuy Institute was unable to identify the existence of any such cases in the historical record.

Other Conclusions

Wheeled Vehicles

  1. There is little historical evidence one way or the other establishing whether wheels or tracks are the preferable feature of AFVs.

Vehicle Design

  1. In SSCOs, access to a large-caliber main gun was useful for demolishing obstacles and buildings. This capability is not unique and could be replaced by AT missile-armed CFVs, IFVs and APCs.
  2. Any new lighter tank-like vehicle should make its gun system the highest priority, armor secondary and mobility and maneuverability tertiary.
  3. Mine protection should be emphasized. Mines were a major threat to all types of armor in many scenarios. In many SSCOs it was the major cause of armored vehicle losses.
  4. The robust carrying capacity offered by an APC over a tank is an advantage during many SSCOs.

Terrain Issues

  1. The use of armor in urban fighting, even in SSCOs, is still limited. The threat to armor from other armor in urban terrain during SSCOs is almost nonexistent. Most urban warfare armor needs, where armor basically serves as a support weapon, can be met with light armor (CFVs, IFVs, and APCs).
  2. Vehicle weight is sometimes a limiting factor in less developed areas. In all cases where this was a problem, there was not a corresponding armor threat. As such, in almost all cases, the missions and tasks of a tank can be fulfilled with other light armor (CFVs, IFVs, or APCs).
  3. The primary terrain problem is rivers and flooded areas. It would appear that in difficult terrain, especially heavily forested terrain (areas with lots of rainfall, like jungles), a robust river crossing capability is required.

Operational Factors

  1. Emergency insertions and delaying actions sometimes appear to be a good way to lose lots of armor for limited gain. This tends to come about due to terrain problems, enemy infiltration and bypassing, and the general confusion prevalent in such operations. The Army should be careful not to piecemeal assets when inserting valuable armor resources into a ‘hot’ situation. In many cases holding back and massing the armor for defense or counter-attack may be the better option.
  2. Transportability limitations have not been a major factor in the past for determining whether lighter or heavier armor were sent into a SSCO or a combat environment.

Casualty Sensitivity

  1. In a SSCO or insurgency, in most cases the weight and armor of the AFVs is not critical. As such, one would not expect any significant changes in losses regardless of the type of AFV used (MBT, medium-weight armor, or light armor). However, the perception that US forces are not equipped with the best-protected vehicle may cause some domestic political problems. The US government is very casualty sensitive during SSCOs. Furthermore, the current US main battle tank is particularly impressive, and may help provide some additional intimidation in SSCOs.
  2. In any emergency insertion scenario or conventional war scenario, the use of lighter armor could result in higher US casualties and lesser combat effectiveness. This will certainly cause some domestic political problems and may impact army morale. However, by the same token, light infantry forces unsupported by easily deployable armor could present a worse situation.
How Big Data is Helping Social Workers Know Where to Focus

Social workers play an important role in how society shapes itself for the future. They make sure children, families, and adults can survive, thrive, and interact with local communities. Most social workers spend their time trying to help people in lower income communities get whatever help they need, to improve their quality of life and decrease crime.

The problem is, there aren’t enough social workers to go around. Many social workers are over-worked, emotionally burdened, and asked to accomplish projects way too big for them.

Luckily, social workers aren’t alone. Thanks to big data, social workers can find where best to spend their time, discover best practices for helping individuals and communities, and identify areas where some preventive action could help keep a community healthy.

Big Data Mapping Different Factors

With the right data, it’s possible to see patterns, trends, signals, and contributors to different elements of living in a low-income neighbourhood. Mapping out these different factors can give social workers guidance as to what type of actions they need to take to help the neighbourhood as a whole.

For example, let’s say a social worker wants to combat dangerous disease and help prevent illnesses like diabetes or heart disease. They can team up ...

Read More on Datafloq
How to Create an Intelligent Company

Nowadays, companies are besieged by information and the possibilities of new IT solutions. With the rapid advancements in information technology and high-speed Internet, companies now have access to huge amounts of information, processing and sharing systems regarding customers, their demographics, and their online behaviour across all touch points on the buyer’s journey. The advantage of access to so much information is not just in the form of increased revenue and developing long-lasting customer relationships. It is also about developing sensitivity to warning signals, which would allow companies to prevent or mitigate disasters.

So far, companies have improved their practices concerning capturing greater amounts of data. However, the prevalent norm is that company employees pass this data on to decision makers en masse, leaving it to them to sift through it and come up with relevant data upon which decisions could be taken or to a company process or person to do the same. Hence, although they’ve gotten good at collecting data, most companies have yet to develop the ability to process this data and generate actionable insights which they can share with decision makers, processors, and their clients.

In a study conducted by the Hackett Group in 2006, finance organizations that were ...

Read More on Datafloq
The Four Types of Data Analytics

Terms like “business intelligence” and “data analytics” mean different things in different contexts and the only way to hack through the Forest of Jargon is with a Machete of Specificity. Whether you're building an analytics tool, shopping for a business intelligence application, or just looking to get a better handle on IT terms, it's useful to be familiar with the analytics spectrum. In this article, we’re going to focus on disambiguating the term data analytics by breaking it down into types and aligning those with business objectives.

Data analytics is the process of extracting, transforming, loading, modelling, and drawing conclusions from data to make decisions. It’s the “drawing conclusions” bit that BI tools are most concerned with, as the extracting, transforming, and loading steps generally happen at the database level. There are four ways of making sense out of data once it’s been formatted for reporting, and these are descriptive, diagnostic, predictive, and prescriptive analytics.

Descriptive and diagnostic analytics help you construct a narrative of the past while predictive and prescriptive analytics help you envision a possible future. The above diagram shows examples of features that would fall into each of the four categories, along with the types of questions those features are designed to help answer.

Descriptive analytics comprise your reporting ...

Read More on Datafloq
U.S. Army Solicits Proposals For Mobile Protected Firepower (MPF) Light Tank

The U.S. Army’s late and apparently lamented M551 Sheridan light tank. [U.S. Department of the Army/Wikipedia]

The U.S. Army recently announced that it will begin soliciting Requests for Proposal (RFP) in November to produce a new lightweight armored vehicle for its Mobile Protected Firepower (MPF) program. MPF is intended to field a company of vehicles for each Army Infantry Brigade Combat Team to provide them with “a long-range direct-fire capability for forcible entry and breaching operations.”

The Army also plans to field the new vehicle quickly. It is dispensing with the usual two-to-three year technology development phase, and will ask for delivery of the first sample vehicles by April 2018, one month after the RFP phase is scheduled to end. This will invariably favor proposals using existing off-the-shelf vehicle designs and “mature technology.”

The Army apparently will also accept RFPs with turret-mounted 105mm main guns, at least initially. According to previous MPF parameters, acceptable designs will eventually need to be able to accommodate 120mm guns.

I have observed in the past that the MPF is the result of the Army’s concerns that its light infantry may be deprived of direct fire support on anti-access/area denial (A2/AD) battlefields. Track-mounted, large caliber direct fire guns dedicated to infantry support are something of a doctrinal throwback to the assault guns of World War II, however.

There was a noted tendency during World War II to use anything on the battlefield that resembled a tank as a main battle tank, with unhappy results for the not-main battle tanks. As a consequence, assault guns, tank destroyers, and light tanks became evolutionary dead-ends in the development of post-World War II armored doctrine (the late M551 Sheridan, retired without replacement in 1996, notwithstanding). [For more on the historical background, see The Dupuy Institute, “The Historical Effectiveness of Lighter-Weight Armored Forces,” August 2001.]

The Army has been reluctant to refer to MPF as a light tank, but as David Dopp, the MPF Program Manager, admitted, “I don’t want to say it’s a light tank, but it’s kind of like a light tank.” He went on to say that “It’s not going toe to toe with a tank…It’s for the infantry. It goes where the infantry goes - it breaks through bunkers, it works through targets that the infantry can’t get through.”

Major General David Bassett, program executive officer for the Army’s Ground Combat Systems, concurred. It will be a tracked vehicle with substantial armor protection, Bassett said, “but certainly not what you’d see on a main battle tank.”

It will be interesting to see what the proposals have to offer.


How Big Data is Changing the Insurance Industry

Big data and the Internet of Things are two technological wonders that are changing our lives. Our machines, equipment and devices are continuously connected and communicating with each other, and billions of devices are recording data at all times - sensors, microphones and cameras are attached to everything that we have - whether that's industrial hardware, our phones, our cars - or even in some cases our fridges!

Insurance is nothing new - but imagine how hard it must have been for insurers to know what was going on in the mid 19th century. Quantifying losses for business interruption insurance back then was hard because modern accounting and reporting standards weren't yet a thing. Most records were created by hand. Even just a few decades ago, when computers were becoming more common but data was still transferred by the lovingly nicknamed 'sneakernet' (walking to a colleague's desk to give them a floppy disk with the data), record keeping wasn't up to modern standards.

Today, we store data about the data that we store! That's not hyperbole. Datasets are huge and we track as much as we can about everyone and everything that we deal with, and we need ...

Logistic Map, Chaos, Randomness and Quantum Algorithms

The logistic map is the most basic recurrence formula exhibiting various levels of chaos depending on its parameter. It has been used in population demographics to model chaotic behavior. Here we explore this model in the context of randomness simulation and revisit a bizarre non-periodic random number generator discovered 70 years ago, based on the logistic map equation. We then discuss flaws and strengths in widely used random number generators, as well as how to reverse-engineer such algorithms. Finally, we discuss quantum algorithms, as they are appropriate in our context.

Logistic map

The logistic map is defined by the following recursion

X(k) = r X(k-1) (1 - X(k-1))

with one positive parameter r less than or equal to 4. The starting value X(0) is called the seed, and must be in [0, 1]. The higher r, the more chaotic the behavior. The onset of chaos is at r = 3.56995... At this point, from almost all seeds, we no longer see oscillations of finite period. In other words, slight variations in the initial population yield dramatically different results over time, a prime characteristic of chaos.
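
The recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `logistic_map` and `max_gap`, the seed 0.2, and the perturbation 1e-10 are choices made here, not part of the original article. The sketch iterates the recursion and compares two orbits whose seeds differ very slightly, once for a non-chaotic value of r and once for r = 4.

```python
def logistic_map(r, x0, n):
    """Return [X(0), ..., X(n)] for X(k) = r * X(k-1) * (1 - X(k-1))."""
    if not 0.0 < r <= 4.0:
        raise ValueError("r must be positive and at most 4")
    if not 0.0 <= x0 <= 1.0:
        raise ValueError("the seed X(0) must be in [0, 1]")
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def max_gap(r, x0, delta, n):
    """Largest distance between two orbits whose seeds differ by delta."""
    a = logistic_map(r, x0, n)
    b = logistic_map(r, x0 + delta, n)
    return max(abs(u - v) for u, v in zip(a, b))


if __name__ == "__main__":
    # Hypothetical seed and perturbation, chosen only for illustration.
    # Well below the onset of chaos, nearby seeds stay close together.
    print("r = 2.5:", max_gap(2.5, 0.2, 1e-10, 100))
    # At r = 4, the same tiny perturbation grows until the orbits differ widely.
    print("r = 4.0:", max_gap(4.0, 0.2, 1e-10, 100))
```

Running the script should show a gap that stays tiny for r = 2.5 and a gap of order one for r = 4, which is the sensitivity to the seed described above. Floating-point rounding also affects long orbits, so individual values far along a chaotic orbit should not be read as exact.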

When r=4, an exact solution is known, see here. In that case, the explicit formula is

The case r=4 was used ...

How to Use Location Data to Get Ahead of the Competition

It’s no longer business as usual in the retail industry. It’s not enough to know your customers and to provide them with the best services and products. One tiny piece of information -- where the customers are -- is the game changer. Retailers can use location data to obtain this information.

Location data refers to the trajectories of people and objects. In the retail industry, knowing where their target market is makes retailers more responsive to their customers’ needs and to environmental changes, and hence more profitable.

Location data presents a myriad of opportunities for business owners. Knowing where their customers are gives retailers the power to reach them and to effectively promote their products or services at the right place and time; increase their sales; redirect their resources in the best way; and, overall, make smarter business decisions.

Here are some examples of how retailers can use location data to find their customers and increase competitiveness. In each example, anonymized location data around Bukit Bintang, Kuala Lumpur, between 1 October 2016 and 31 October 2016 was used. The data was available from DataStreamX.

Finding the right catchment area

For retailers, understanding their catchment areas is very important in running their ...

How Augmented Reality Will Affect Marketing Campaigns

Often, I write about emerging technologies such as big data, blockchain, IoT and AI. However, other technologies are also increasingly affecting organizations, one of them being Augmented Reality (AR). Thanks to technological advancements, augmented reality is rapidly growing and is projected to drive billion-dollar annual revenues within the next decade. A Markets and Markets report estimates that AR will grow to be a $117.4 billion market by 2022. Moreover, a Citi GPS report projects that AR's annual revenues will further increase to $692 billion by 2025. AR will be booming and offers great opportunities for organizations to expand and enhance marketing activities. Take a look at augmented reality's impact on marketing:

AR Marketing Will Change the Marketer's Role and Marketing

Augmented reality will have significant implications for marketers. To leverage AR's opportunities for marketing campaigns, it is key to understand AR's differences and similarities in comparison to other digital engagement platforms. AR is easy to adopt for many users because virtual aspects are superimposed over familiar environments. For instance, one study found that consumers enjoyed the playful experience of an AR mirror-app, compared to physical testers. The app helped them to virtually 'try on' eyeshadow and lipstick so they could easily visualize ...

The Transformative Influence of IoT in Manufacturing

You don't need to use IoT in your home to understand that it's already transforming today's business environment. IoT is transforming how consumers order online, transfer their data to their employer when working remotely and deliver their demands to companies. It doesn't stop there. IoT is now transforming the world of manufacturing. From inventory tracking to big data management, IoT is reaching into the big, transformative areas of manufacturing to deliver greater efficiency and more profitable market shares in the industry.

How IoT Adds Value To Manufacturing

The potential users of IoT are global, large-scale and almost infinite. When the manufacturing industry taps this market, there will be a lot of opportunity for growth. That said, the areas of manufacturing that stand to benefit most from IoT are device connectivity, data analysis, business productivity and advanced analytics.

Device Connectivity

With IoT transforming the way data moves across manufacturing companies, those companies have better visibility into, and access to, the different ways of making the process even more efficient. The connected network via IoT and satellite M2M will be able to provide the manufacturing companies the insight and data management ...

How Big Data Can Help With Streamlined Fleet Tracking

According to a survey by eyefortransport, almost half of businesses considered the improvement of fuel and route management to be the biggest benefit of adopting fleet tracking technologies. What makes these technologies so effective in these areas of activity? The collection and analysis of "big data" such as mileage, route speeds, fuel usage and delivery times are responsible for these impressive improvements.

By identifying the most important metric for tracking your vehicles, you can improve your business in the areas you are the most interested in such as safety, operation costs and delivery times.

Routing Data

By knowing your driver, your truck and your delivery location, you may be able to determine the most efficient route to take your driver from A to B. You may even be able to figure out which of the alternative routes is the most effective during the morning rush hour compared with mid-afternoon. The problem appears when you start adding more drivers, trucks and routes, as the analysis of all incoming data can easily eat up a very large part of your time. All these incoming informational streams that enter a single system are defined as "big data."

Your GPS tracking system is able ...

Teradata Aims for a New Era in Data Management with its New IntelliSphere Offering

As Teradata continues to expand its Teradata Everywhere initiative, major announcements came out of its 2017 Partners conference. Along with the announcement of its brand new analytics platform, the company also unveiled a new comprehensive software portfolio that adds the data management power needed behind the analytics scenario.

According to Teradata, IntelliSphere is “a comprehensive software portfolio that unlocks a wealth of key capabilities for enterprises to leverage all the core software required to ingest, access, deploy and manage a flexible analytical ecosystem.”

Meanwhile, Teradata IntelliSphere is intended to complement the ongoing Teradata Everywhere initiative and be a natural companion for the Teradata Analytics Platform. It is also meant to be an important tool for enabling users across the organization to use their preferred analytic tools and engines across data sources at scale, while having all the necessary components to ensure efficient data management from ingestion to consumption.

According to Oliver Ratzesberger, Executive Vice President and Chief Product Officer at Teradata:

“With IntelliSphere, companies no longer need to purchase separate software applications to build and manage their ecosystem. Companies can design their environment to realize the full potential of their data and analytics today, with a guarantee that future updates can be leveraged immediately without another license or subscription.”

Available for purchase now, the IntelliSphere software portfolio includes a series of key capabilities to ensure efficiency across the whole data process:

  • Ingest, so companies can easily capture and distribute high volume data streams, with ready-to-run elastic architecture and quick access for business-critical analysis.
  • Access, so companies can gain easy access to data stored in a hybrid cloud or heterogeneous technology environment.
  • Deploy applications and analytic models for easy user access and enterprise collaboration.
  • Manage, to allow ad-hoc data movement, as well as ongoing monitoring and control via an operational interface.

According to the data management company, Teradata IntelliSphere is composed of ten software components, which include:

Finally, the company mentions that in the future, all new software releases will become part of the IntelliSphere bundle, a logical step towards building a consistent and more homogeneous analytics ecosystem that can help Teradata to provide simplicity and functional power to its user base.

As I mentioned in another blog in this same vein, it seems we are facing a new stage in the analytics and data management software market in which software companies are now fully renovating their offerings to consolidate as many functions as possible within single enterprise platforms that blend all analytics needs with a robust data engine.

In future posts I’ll try to bring more information about this and the rest of Teradata’s new set of offerings, so stay tuned.
Teradata includes brand New Analytics Platform to its Teradata Everywhere Initiative

In a recent announcement made during its 2017 Partners conference, data management software provider Teradata made an important new addition to its global Teradata Everywhere initiative: a brand new analytics platform.

The new offering, to be available for early access later this year, aims to let users work in the analytics environment of their choice. According to the company, the new analytics platform is planned to enable access to a myriad of analytics functions and engines so users can develop full analytics processes and business solutions using the tools of their choice. Initially, the new platform will natively integrate with Teradata and Aster technology (Figure 1), and in the near future it will enable integration with leading analytics engines including Spark, TensorFlow, Gluon, and Theano.

Figure 1.  Aster Analytics Functions (Courtesy of Teradata)

As corporate data is increasingly captured and stored in a wider number of formats, the platform includes support for several data types from multiple data sources, from traditional formats to newer social media and IoT formats, including text, spatial, CSV, and JSON formats, or Apache Avro, as well as other open-source data types that allow programmers to dynamically process data schemas.

As part of a new set of functional features, the Teradata Analytics Platform includes the provision of different scalable analytic functions like attribution, path analytics, time series, and a number of statistical, text and machine learning algorithms.

With support for multiple languages, including Python, R, SAS and SQL, and different tools like Jupyter, RStudio, KNIME, SAS, and Dataiku, Teradata expects that experienced users can use their tool of choice not just to develop with less disruption but also to promote efficiency through code and model re-use. Teradata’s AppCenter allows analysts to share analytic applications and deploy reusable models within a web-based interface.

According to Oliver Ratzesberger, Teradata’s executive vice president and chief product officer:

“In today’s environment different users have different analytic needs, this dynamic causes a proliferation of tools and approaches that are both costly and silo-ed. We solve this dilemma with the unmatched versatility of the Teradata Analytics Platform, where we are incorporating a choice of analytic functions and engines, as well as an individual’s preferred tools and languages across data types. Combined with the industry’s best scalability, elasticity and performance, the Teradata Analytics Platform drives superior business insight for our customers.”

According to Teradata, the benefits offered by the new analytics platform include:

  • Simplify data access to both data warehouses and data lakes
  • Speed up data preparation with embedded analytics
  • Allow fast and easy access to cutting-edge advanced analytics and AI technologies
  • Support preferred data science workbenches and languages like R, Python, and SQL
  • Help make prescriptive analytics operational to enable autonomous decisioning
  • Minimize risk of existing analytical architectures with Teradata Everywhere

More importantly, the announcement of the new analytics platform comes along with the announcement of IntelliSphere, Teradata’s new comprehensive software portfolio initiative and the company’s new proposal for easy data access, ingestion, deployment, and management.

According to Teradata, the new platform is planned to be flexibly delivered on-premises or via public and private clouds as well as managed cloud options, all of which will use the same software.

Teradata is definitely aiming to be everywhere

Teradata seems to have understood how important it is, and will be in the future, to offer new software solutions built on more open and agile architectures that play well with others and yet are solid and secure. This is a movement other data management companies are already exploring and adopting; such is the case for, among others, Cloudera with its new Data Science Workbench and SAS with its Open Analytics Platform.

It seems we are facing a new stage in the analytics and data management software market in which software companies are reshaping all their offerings to consolidate as many functions as possible within single enterprise platforms that blend all analytics needs with a robust data engine.
In the meantime, personally I’m eager to see Teradata’s new Analytics Platform in action.

What’s Wrong with Big Data

Big data may be the technology getting all the buzz nowadays, but that does not mean that it is infallible. Big data has wreaked havoc in many situations, yet the exact reasons are not always clear. They could be the detection of false positives, technical glitches, lack of tools, shabby data, incorrect data or even unnecessary data.

Needless to say, if you have some of the errors mentioned above, the results will be completely different from what you were expecting. To make matters worse, the results are sometimes not analyzed, which can result in some unpleasant consequences.  

Flaws of Big Data

Thanks to big data and the cloud, the powers of supercomputers are everybody’s for the taking. However, what we lose in the mix is that the tools we use to interpret, analyze and apply this tsunami of information usually have a fatal flaw. Most of the data analysis we conduct is based on erroneous models, which means that mistakes are inevitable. And when our overblown expectations exceed our capacity, the consequences can be dire.

If big data were not so ginormous, this would not be such a big problem. Unfortunately, given the volume of data that we have, we are able to use ...

RELX Group: The Transformation to a Leading Global Information & Analytics Company (Part 2)

With the world changing so rapidly, every company or organization that adapts to the changes becomes an example for all others. One very recent example for all companies to learn from and implement in their own decision making is the transformation of RELX Group to a leading global information and analytics company. I mentioned in my previous article, and the first part of this series, that the transformation at RELX Group has been interesting, but somewhat surprising at the same time. Considering the questions this brings to mind, I decided to delve into this concept even more.

For those new to the organization, RELX Group is a global company that provides information and analytics for business customers and professionals across industries. They help scientists make discoveries, lawyers win cases, doctors save lives, and insurance companies offer customers lower prices. In short, they enable their customers to make better decisions, get better results, and be more productive.

Last time around, I had the pleasure of talking with Kumsal Bayazit, who is the head of the CTO forum at RELX Group. She has been at the organization for over 13 years, which is why her insight was deeply appreciated by the readers and me. ...

Validating Trevor Dupuy’s Combat Models

[The article below is reprinted from Winter 2010 edition of The International TNDM Newsletter.]

A Summation of QJM/TNDM Validation Efforts

By Christopher A. Lawrence

There have been six or seven different validation tests conducted of the QJM (Quantified Judgment Model) and the TNDM (Tactical Numerical Deterministic Model). As the changes to these two models are evolutionary in nature but do not fundamentally change the nature of the models, the whole series of validation tests across both models is worth noting. To date, this is the only model we are aware of that has been through multiple validations. We are not aware of any DOD [Department of Defense] combat model that has undergone more than one validation effort. Most of the DOD combat models in use have not undergone any validation.

The Two Original Validations of the QJM

After its initial development using a 60-engagement WWII database, the QJM was tested in 1973 by application of its relationships and factors to a validation database of 21 World War II engagements in Northwest Europe in 1944 and 1945. The original model proved to be 95% accurate in explaining the outcomes of these additional engagements. Overall accuracy in predicting the results of the 81 engagements in the developmental and validation databases was 93%.[1]

During the same period the QJM was converted from a static model that only predicted success or failure to one capable of also predicting attrition and movement. This was accomplished by adding variables and modifying factor values. The original QJM structure was not changed in this process. The addition of movement and attrition as outputs allowed the model to be used dynamically in successive “snapshot” iterations of the same engagement.

From 1973 to 1979 the QJM’s formulae, procedures, and variable factor values were tested against the results of all of the 52 significant engagements of the 1967 and 1973 Arab-Israeli Wars (19 from the former, 33 from the latter). The TNDM was able to replicate all of those engagements with an accuracy of more than 90%.[2]

In 1979 the improved QJM was revalidated by application to 66 engagements. These included 35 from the original 81 engagements (the “development database”), and 31 new engagements. The new engagements included five from World War II and 26 from the 1973 Middle East War. This new validation test considered four outputs: success/failure, movement rates, personnel casualties, and tank losses. The TNDM predicted success/failure correctly for about 85% of the engagements. It predicted movement rates with an error of 15% and personnel attrition with an error of 40% or less. While the error rate for tank losses was about 80%, it was discovered that the model consistently underestimated tank losses because input data included all kinds of armored vehicles, but output data losses included only numbers of tanks.[3]

This completed the original validation efforts of the QJM. The data used for the validations, and parts of the results of the validation, were published, but no formal validation report was issued. The validation was conducted in-house by Colonel Dupuy’s organization, HERO [Historical Evaluation Research Organization]. The data used were mostly from division-level engagements, although they included some corps- and brigade-level actions. We count these as two separate validation efforts.

The Development of the TNDM and Desert Storm

In 1990 Col. Dupuy, with the collaborative assistance of Dr. James G. Taylor (author of Lanchester Models of Warfare [vol. 1] [vol. 2], published by the Operations Research Society of America, Arlington, Virginia, in 1983) introduced a significant modification: the representation of the passage of time in the model. Instead of resorting to successive “snapshots,” the introduction of Taylor’s differential equation technique permitted the representation of time as a continuous flow. While this new approach required substantial changes to the software, the relationship of the model to historical experience was unchanged.[4] This revision of the model also included the substitution of formulae for some of its tables so that there was a continuous flow of values across the individual points in the tables. It also included some adjustment to the values and tables in the QJM. Finally, it incorporated a revised OLI [Operational Lethality Index] calculation methodology for modern armor (mobile fighting machines) to take into account all the factors that influence modern tank warfare.[5] The model was reprogrammed in Turbo PASCAL (the original had been written in BASIC). The new model was called the TNDM (Tactical Numerical Deterministic Model).

Building on its foundation of historical validation and proven attrition methodology, in December 1990, HERO used the TNDM to predict the outcome of, and losses from, the impending Operation DESERT STORM.[6] It was the most accurate (lowest) public estimate of U.S. war casualties provided before the war. It differed from most other public estimates by an order of magnitude.

Also, in 1990, Trevor Dupuy published an abbreviated form of the TNDM in the book Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War. A brief validation exercise using 12 battles from 1805 to 1973 was published in this book.[7] This version was used for creation of M-COAT[8] and was also separately tested by a student (Lieutenant Gozel) at the Naval Postgraduate School in 2000.[9] This version did not have the firepower scoring system, and as such neither M-COAT, Lieutenant Gozel’s test, nor Colonel Dupuy’s 12-battle validation included the OLI methodology that is in the primary version of the TNDM.

For counting purposes, I consider the Gulf War the third validation of the model. In the end, for any model, the proof is in the pudding. Can the model be used as a predictive tool or not? If not, then there is probably a fundamental flaw or two in the model. Still the validation of the TNDM was somewhat second-hand, in the sense that the closely-related previous model, the QJM, was validated in the 1970s to 200 World War II and 1967 and 1973 Arab-Israeli War battles, but the TNDM had not been. Clearly, something further needed to be done.

The Battalion-Level Validation of the TNDM

Under the guidance of Christopher A. Lawrence, The Dupuy Institute undertook a battalion-level validation of the TNDM in late 1996. This effort tested the model against 76 engagements from World War I, World War II, and the post-1945 world including Vietnam, the Arab-Israeli Wars, the Falklands War, Angola, Nicaragua, etc. This effort was thoroughly documented in The International TNDM Newsletter.[10] This effort was probably one of the more independent and better-documented validations of a casualty estimation methodology that has ever been conducted to date, in that:

  • The data was independently assembled (assembled for other purposes before the validation) by a number of different historians.
  • There were no calibration runs or adjustments made to the model before the test.
  • The data included a wide range of material from different conflicts and times (from 1918 to 1983).
  • The validation runs were conducted independently (Susan Rich conducted the validation runs, while Christopher A. Lawrence evaluated them).
  • The results of the validation were fully published.
  • The people conducting the validation were independent, in the sense that:

a) there was no contract, management, or agency requesting the validation;
b) none of the validators had previously been involved in designing the model, and had only very limited experience in using it; and
c) the original model designer was not able to oversee or influence the validation.[11]

The validation was not truly independent, as the model tested was a commercial product of The Dupuy Institute, and the person conducting the test was an employee of the Institute. On the other hand, this was an independent effort in the sense that the effort was employee-initiated and not requested or reviewed by the management of the Institute. Furthermore, the results were published.

The TNDM was also given a limited validation test back to its original WWII data around 1997 by Niklas Zetterling of the Swedish War College, who retested the model to 15 or so Italian campaign engagements. This effort included a complete review of the historical data used for the validation back to their primary sources, and details were published in The International TNDM Newsletter.[12]

There has been one other effort to correlate outputs from QJM/TNDM-inspired formulae to historical data using the Ardennes and Kursk campaign-level (i.e., division-level) databases.[13] This effort did not use the complete model, but only selective pieces of it, and achieved various degrees of “goodness of fit.” While the model is hypothetically designed for use from squad level to army group level, to date no validation has been attempted below battalion level, or above division level. At this time, the TNDM also needs to be revalidated back to its original WWII and Arab-Israeli War data, as it has evolved since the original validation effort.

The Corps- and Division-level Validations of the TNDM

Having now done one extensive battalion-level validation of the model and published the results in our newsletters, Volume 1, issues 5 and 6, we were then presented an opportunity in 2006 to conduct two more validations of the model. These are discussed in depth in two articles of this issue of the newsletter.

These validations were conducted against historical data: 24 days of corps-level combat and 25 cases of division-level combat drawn from the Battle of Kursk during 4-15 July 1943. They were conducted using an independently-researched data collection (although the research was conducted by The Dupuy Institute), using a different person to conduct the model runs (although that person was an employee of the Institute) and using another person to compile the results (also an employee of the Institute). To summarize the results of this validation (the historical figure is listed first followed by the predicted result):

There was one other effort that was done as part of work we did for the Army Medical Department (AMEDD). This is fully explained in our report Casualty Estimation Methodologies Study: The Interim Report dated 25 July 2005. In this case, we tested six different casualty estimation methodologies to 22 cases. These consisted of 12 division-level cases from the Italian Campaign (4 where the attack failed, 4 where the attacker advanced, and 4 where the defender was penetrated) and 10 cases from the Battle of Kursk (2 cases where the attack failed, 4 where the attacker advanced and 4 where the defender was penetrated). These 22 cases were randomly selected from our earlier 628-case version of the DLEDB (Division-level Engagement Database; it now has 752 cases). Again, the TNDM performed as well as or better than any of the other casualty estimation methodologies tested. As this validation effort was using the Italian engagements previously used for validation (although some had been revised due to additional research) and three of the Kursk engagements that were later used for our division-level validation, then it is debatable whether one would want to call this a seventh validation effort. Still, it was done as above with one person assembling the historical data and another person conducting the model runs. This effort was conducted a year before the corps and division-level validation conducted above and influenced it to the extent that we chose a higher CEV (Combat Effectiveness Value) for the later validation. A CEV of 2.5 was used for the Soviets for this test, vice the CEV of 3.0 that was used for the later tests.


The QJM has been validated at least twice. The TNDM has been tested or validated at least four times, once to an upcoming, imminent war, once to battalion-level data from 1918 to 1989, once to division-level data from 1943 and once to corps-level data from 1943. These last four validation efforts have been published and described in depth. The model continues, regardless of which validation is examined, to accurately predict outcomes and make reasonable predictions of advance rates, loss rates and armor loss rates. This is regardless of level of combat (battalion, division or corps), historic period (WWI, WWII or modern), the situation of the combats, or the nationalities involved (American, German, Soviet, Israeli, various Arab armies, etc.). As the QJM, the model was effectively validated to around 200 World War II and 1967 and 1973 Arab-Israeli War battles. As the TNDM, the model was validated to 125 corps-, division-, and battalion-level engagements from 1918 to 1989 and used as a predictive model for the 1991 Gulf War. This is the most extensive and systematic validation effort yet done for any combat model. The model has been tested and re-tested. It has been tested across multiple levels of combat and in a wide range of environments. It has been tested where human factors are lopsided, and where human factors are roughly equal. It has been independently spot-checked several times by others outside of the Institute. It is hard to say what more can be done to establish its validity and accuracy.


[1] It is unclear what these percentages, quoted from Dupuy in the TNDM General Theoretical Description, specify. We suspect it is a measurement of the model’s ability to predict winner and loser. No validation report based on this effort was ever published. Also, the validation figures seem to reflect the results after any corrections made to the model based upon these tests. It does appear that the division-level validation was “incremental.” We do not know if the earlier validation tests were tested back to the earlier data, but we have reason to suspect not.

[2] The original QJM validation data was first published in the Combat Data Subscription Service Supplement, vol. 1, no. 3 (Dunn Loring VA: HERO, Summer 1975). (HERO Report #50) That effort used data from 1943 through 1973.

[3] HERO published its QJM validation database in The QJM Data Base (3 volumes) Fairfax VA: HERO, 1985 (HERO Report #100).

[4] The Dupuy Institute, The Tactical Numerical Deterministic Model (TNDM): A General and Theoretical Description, McLean VA: The Dupuy Institute, October 1994.

[5] This had the unfortunate effect of undervaluing WWII-era armor by about 75% relative to other WWII weapons when modeling WWII engagements. This left The Dupuy Institute with the compromise methodology of using the old OLI method for calculating armor (Mobile Fighting Machines) when doing WWII engagements and using the new OLI method for calculating armor when doing modern engagements.

[6] Testimony of Col. T. N. Dupuy, USA, Ret, Before the House Armed Services Committee, 13 Dec 1990. The Dupuy Institute File I-30, “Iraqi Invasion of Kuwait.”

[7] Trevor N. Dupuy, Attrition: Forecasting Battle Casualties and Equipment Losses in Modern War (HERO Books, Fairfax, VA, 1990), 123-4.

[8] M-COAT is the Medical Course of Action Tool created by Major Bruce Shahbaz. It is a spreadsheet model based upon the elements of the TNDM provided in Dupuy’s Attrition (op. cit.) It used a scoring system derived from elsewhere in the U.S. Army. As such, it is a simplified form of the TNDM with a different weapon scoring system.

[9] See Gözel, Ramazan. “Fitting Firepower Score Models to the Battle of Kursk Data,” NPGS Thesis. Monterey CA: Naval Postgraduate School.

[10] Lawrence, Christopher A. “Validation of the TNDM at Battalion Level.” The International TNDM Newsletter, vol. 1, no. 2 (October 1996); Bongard, Dave. “The 76 Battalion-Level Engagements.” The International TNDM Newsletter, vol. 1, no. 4 (February 1997); Lawrence, Christopher A. “The First Test of the TNDM Battalion-Level Validations: Predicting the Winner” and “The Second Test of the TNDM Battalion-Level Validations: Predicting Casualties,” The International TNDM Newsletter, vol. 1, no. 5 (April 1997); and Lawrence, Christopher A. “Use of Armor in the 76 Battalion-Level Engagements,” and “The Second Test of the Battalion-Level Validation: Predicting Casualties Final Scorecard.” The International TNDM Newsletter, vol. 1, no. 6 (June 1997).

[11] Trevor N. Dupuy passed away in July 1995, and the validation was conducted in 1996 and 1997.

[12] Zetterling, Niklas. “CEV Calculations in Italy, 1943,” The International TNDM Newsletter, vol. 1, no. 6. McLean VA: The Dupuy Institute, June 1997. See also Research Plan, The Dupuy Institute Report E-3, McLean VA: The Dupuy Institute, 7 Oct 1998.

[13] See Gözel, “Fitting Firepower Score Models to the Battle of Kursk Data.”

AI is the Next Wave of Innovation For Life Sciences and Pharma

Innovation doesn’t always happen at a steady pace. Most often, technological revolutions happen all at once, with rapid change and disruption, often followed by a lull as people get used to the new normal. The noted 20th-century futurist Alvin Toffler once expressed this concept as a series of ‘waves’. In his example, each wave of innovative technology washes over the previous one. This means that with each new technological development, the old is washed away and a new era ensues.

We believe that we are at the start of a new wave of technological innovation, one that will completely disrupt the way we view medicine, research, and our own health. This wave is being powered by advances in AI and machine learning that are making it possible for researchers to develop cures faster, doctors to deliver more effective care, and healthcare companies to reduce costs while increasing access to care.

Imagine a world in which better data analytics mean a cure takes months to develop instead of years. Or a future in which a doctor in a rural area has access to the same data resources that the biggest metropolitan hospitals use. By adapting machine learning technologies to the ...

Read More on Datafloq
7 Ways Big Data Analytics Can Augment Your Marketing Strategy

For the last few years, Big Data has been all the rage in the corporate world. Just when you thought the information age had been milked for all that could be squeezed out of it, the use of Big Data analytics became the avenue to massive corporate profits that keeps on giving. In fact, the more data that companies gather and leverage to improve their marketing strategies, the more income potential they stand to generate. This is why it is critical for smaller companies to get into the game and ride the Big Data wave in order to earn more from their marketing strategies.

1. Finding the Right Price Point

Every day information is being gathered on how much consumers are willing to pay for products and services your company is presently offering for sale. As more data is amassed about such pricing information, your company can take advantage of these Big Data findings to know how to zero in on the right pricing structure to use to attract more buyers. By adjusting your pricing to fit the data, you will inevitably increase sales without having to do so much guessing about how to price products and services being offered. This will ...

Read More on Datafloq
20 essential software development books to read

Software development books are a great source of knowledge and wisdom. But unfortunately, there are very few people reading books today, especially programmers. Most often they rely on internet search results to find answers.
But if you’re a software developer, you need to read more books, because software development is not only about coding; it is about thinking and about best practices. And books give you a good explanation and grounding that you won’t always find in short articles or Google search results. At Apiumhub we are big fans of reading good literature, and we even have a small library in the office with our favorite software development books. Today, we created a list of books we believe may help any developer become a better professional. And here you have a list of top 20 software development books that are worth mentioning in this article.

Top 20 software development books to read

1. Refactoring: Improving the Design of Existing Code by Martin Fowler, Kent Beck, John Brant, William Opdyke, Don Roberts, Erich Gamma

This book is the first one in the list of top software development books, and it is written by very well known software development influencers. It is basically about improving the design of ...

Read More on Datafloq
8 Top Smart City Projects & Leaders to Watch in 2017

The smart city has become another buzzword in recent years, and we are going to be hearing a lot more of it in the coming years. By 2020 we will be spending $400 billion a year building smart cities.

Let’s start with the smart city landscape to see the trend, and then look at some amazing smart city projects that may change our lives for the better. I think we have now reached the point where everyone cares about the planet and is conscious of the environmental and social problems we have. I really hope that this article will inspire others to do something for the planet, for cities and for citizens, and at the same time earn money! I will also give you a list of the big players and key startups that work on smart city projects the most. And since I am sure you are interested in knowing the top smart cities in the world, we will look at those too. I found some interesting facts!

Smart City Landscape

Smart cities are no longer the wave of the future. They are here now and growing quickly as the Internet of Things expands and impacts municipal services around the globe.

The smart city industry ...

Read More on Datafloq
Survey of German WWI Records

At one point, we did a survey of German records from World War I. This was for an exploratory effort to look at measuring the impact of chemical weapons in an urban environment. As World War I was one of the few wars with extensive use of chemical weapons, then it was natural to look there for operational use of chemical weapons. Specifically we were looking at the use of chemical weapons in villages and such, as there was little urban combat in World War I.

As discussed in my last post on this subject, there is a two-sided collection of records in the U.S. Archives for those German units that fought the Americans in 1918. As our customer was British, they wanted to work with British units. They conducted the British research, but, they needed records from the German side. Ironically, the German World War I records were destroyed by the British bombing of Potsdam in April 1945. So where to find good opposing force data for units facing the British during World War I?

Germany did not form into a nation until 1871. During World War I, there were still several independent states and duchies inside Germany, and some of these maintained their own armies. The kingdoms of Bavaria, Wurttemberg and Saxony, along with the Grand Duchy of Baden, fielded their own armies. They raised their own units and maintained their own records. So, if they maintained their records from World War I, then we could develop a two-sided database of combat between the British and Germans in those cases where the British units opposed German units from those states.

So, for practical purposes, we ended up making a “research trip” to Freiburg (German archives), then Karlsruhe (Baden), Stuttgart (Wurttemberg) and then Munich (Bavaria). Sure enough, Wurttemberg had a nice collection of records for its units (it was really only two divisions in a single corps) and Bavaria still had a complete collection of records for its many divisions. The Bavarian Army fielded over a dozen divisions during the war.

So we ended up in Munich for several days going through their records. Their archives were located near Munich’s Olympic Park, the site of the tragic 1972 Olympics. It was in the old Bavarian Army headquarters that had been converted to an archives. After World War II, it was occupied by the Americans, and on the doors of many of the offices were still the name tags of the American NCOs and officers who last occupied those offices. The records were in great shape. The German Army just before WWII had done a complete inventory of the Bavarian records and made sure that they were complete. It was clear when we looked into them that many of these files had not been opened since then. Many of the files had sixty years of dust on them. The exception was the Sixth Bavarian Division, which clearly had been accessed several times recently. Adolf Hitler had served in that division in WWI.

The staff was extremely helpful. I did bring them gifts of candy for their efforts. They were neatly wrapped in the box with plastic mice attached to the packaging. Later, they sent me this:

So we were able to establish that good German data could be assembled for those Wurttemberg and Bavarian units that faced the British. The British company that hired us determined that the British records were good for their research efforts. So the exploratory research effort was a success, but the main effort was never funded because of changing priorities among their sponsors. This research was occurring while the Iraq War (2003-2011) was going on, so sometimes budget priorities would change rather suddenly.

The Iran-Iraq War (1980-1988) also made extensive use of chemical weapons. This is discussed in depth in our newsletters. See:  (issues Vol 2, No. 3; Vol 2, No. 4, Vol. 3, No 1, and Vol 3, No 2). Specifically see:, page 21. To date, I am not aware of any significant work done on chemical warfare based upon their records of the war.

This post is the follow-up to these two posts:

Captured Records: World War I

The Sad Story Of The Captured Iraqi DESERT STORM Documents

Clean Code: Explanation, Benefits & Books

Every year, a tremendous amount of time and significant resources are lost because of poorly written code. Developers very often rush because they feel pressure from their managers or from the client to get the job done quickly, sometimes even sacrificing the quality. This is a big issue nowadays, and therefore I decided to write an article about clean code, where I want to show all the benefits of clean coding, of building the software project right from the beginning.

What is clean code?

I want to start this article with a very good quote: “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” – Martin Fowler. This quote explains the essence of clean coding very well.

When we talk about clean code, we talk about reader-focused development style that produces software that’s easy to write, read and maintain. Clean code is code that is easy to understand and easy to change.

The word “clean” has become very trendy. If you look at design, photography, etc., people go for clean things because nowadays our life is extremely complicated and we want to choose clean and clear options, which calm us down and save our precious time. Same in software ...

Read More on Datafloq
New U.S. Army Security Force Assistance Brigades Face Challenges

The shoulder sleeve insignia of the U.S. Army 1st Security Forces Assistance Brigade (SFAB). [U.S. Army]

The recent deaths of four U.S. Army Special Forces (ARSOF) operators in an apparent ambush in support of the Train and Assist mission in Niger appear to have reminded Congress of the enormous scope of ongoing Security Force Assistance (SFA) activities being conducted world-wide by the Defense Department. U.S. military forces deployed to 138 countries in 2016, the majority of which were by U.S. Special Operations Forces (SOF) conducting SFA activities. (While SFA deployments continue at a high tempo, the number of U.S. active-duty troops stationed overseas has fallen below 200,000 for the first time in 60 years, interestingly enough.)

SFA is the umbrella term for U.S. whole-of-government support provided to develop the capability and capacity of foreign security forces and institutions. SFA is intended to help defend host nations from external and internal threats, and encompasses foreign internal defense (FID), counterterrorism (CT), counterinsurgency (COIN), and stability operations.

Last year, the U.S. Army announced that it would revamp its contribution to SFA by creating a new type of unit, the Security Force Assistance Brigade (SFAB), and by establishing a Military Advisor Training Academy. The first of six projected SFABs is scheduled to stand up this month at Ft. Benning, Georgia.

Rick Montcalm has a nice piece up at the Modern War Institute describing the doctrinal and organizational challenges the Army faces in implementing the SFABs. The Army’s existing SFA structure features regionally-aligned Brigade Combat Teams (BCTs) providing combined training and partnered mission assistance for foreign conventional forces from the team to company level, while ARSOF focuses on partner-nation counterterrorism missions and advising and assisting commando and special operations-type forces.

Ideally, the SFABs would supplement and gradually replace most, but not all, of the regionally-aligned BCTs to allow them to focus on warfighting tasks. Concerns have arisen within the ARSOF community, however, that a dedicated SFAB force would encroach functionally on its mission and compete within the Army for trained personnel. The SFABs currently lack the intelligence capabilities necessary to successfully conduct the advisory mission in hostile environments. Although U.S. Army Chief of Staff General Mark Milley asserts that the SFABs are not Special Forces, properly preparing them for advise and assist roles would make them very similar to existing ARSOF.

Montcalm also points out that Army personnel policies complicate maintaining the SFABs in the long term. The Army has not created a specific military advisor career field and volunteering to serve in a SFAB could complicate the career progression of active duty personnel. Although the Army has taken steps to address this, the prospect of long repeat overseas tours and uncertain career prospects has forced the service to offer cash incentives and automatic promotions to bolster SFAB recruiting. As of August, the 1st SFAB needed 350 more soldiers to fully man the unit, which was scheduled to be operational in November.

SFA and the Army’s role in it will not decline anytime soon, so there is considerable pressure to make the SFAB concept successful. In light of the Army’s problematic efforts to build adequate security forces in Iraq and Afghanistan, there is also considerable room for improvement.

Data Science: To PhD or not to PhD?

...That is the question.

Further study often seems the most appealing route to go down, and for many companies out there it’s often heralded as a must have for data science, but it isn’t always so and isn’t always pivotal in advancing your career.

To PhD.

A PhD is your original, unique research into something not necessarily covered from that angle before. The content of a data science PhD should showcase new findings in the field that will make an impact, or contribute to a particular subject. A lot of people pursuing Ph.D.s are driven by a passion for their area of study, more than thinking of a specific role they can enter into afterwards. They can take a long time to complete, so having a particularly keen interest in the subject, and passion in the field will drive your success. PhDs are also a gateway into being published at top conferences in your chosen field, like ICCV (International Conference on Computer Vision), NIPS (Conference on Neural Information Processing Systems) and ECCV (European Conference on Computer Vision) to name just a few, which undoubtedly has a huge impact on a researcher’s career path.

Carrying out and writing about such in-depth research shows your ability ...

Read More on Datafloq
Project Management Series: Interactive vs Push vs Pull Communication | Simplilearn

Being a good communicator is an attribute that invariably finds its way onto every list of highly desirable qualities for any project manager, worldwide. And with good reason. Communication plays a critically important role in project management. If you do not communicate with your team, they will not know what they are supposed to do, and when. ...Read More.
What is Critical Chain Project Management? | Simplilearn

Critical Chain Project Management: A Brief Overview

Critical Chain Project Management was developed and publicized by Dr. Eliyahu M. Goldratt in 1997. Followers of this methodology of Project Management claim it to be an alternative to the established standard of Project Management as advocated by PMBOK® and other Standards of Proje...Read More.
What is the real impact of social media? | Simplilearn

Information and communication technology has changed rapidly over the past 20 years with a key development being the emergence of social media. The pace of change is accelerating. For example, the development of mobile technology has played an important role in shaping the impact of social media. Across the globe, mobile devices dominate in terms ...Read More.
Expert Webinar: The Pros and Cons of Paying for Social Placement | Simplilearn webinar starts 09-11-2017 13:00

With organic social reach falling across all platforms, marketers face a difficult choice; work overtime to create content so amazing people can't help but share it or pay established social media voices to promote your content for you. Learn to balance the need for increased exposure to the risks of pushing the boundaries too far, and discover...Read More.
The 3-to-1 Rule in Histories

I was reading a book this last week, The Blitzkrieg Legend: The 1940 Campaign in the West by Karl-Heinz Frieser (originally published in German in 1996). On page 54 it states:

According to a military rule of thumb, the attack should be numerically superior to the defender at a ratio of 3:1. That ratio goes up if the defender can fight from well developed fortification, such as the Maginot Line.

This rule never seems to go away. Trevor Dupuy had a chapter on it in Understanding War, published in 1987. It was Chapter 4: The Three-to-One Theory of Combat. I didn’t really bother discussing the 3-to-1 rule in my book, War by Numbers: Understanding Conventional Combat. I do have a chapter on force ratios: Chapter 2: Force Ratios. In that chapter I show a number of force ratios based on history. Here is my chart from the European Theater of Operations, 1944 (page 10):

Force Ratio                 Result                     Percentage of Failure    Number of Cases
0.55 to 1.01-to-1.00        Attack Fails               100                      5
1.15 to 1.88-to-1.00        Attack usually succeeds    21                       48
1.95 to 2.56-to-1.00        Attack usually succeeds    10                       21
2.71 to 1.00 and higher     Attack advances            0                        42
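
For readers who want to work with the chart programmatically, here is a minimal sketch that treats it as a lookup table. The band boundaries, results, percentages and case counts are transcribed from the chart; the function names and the idea of a case-weighted overall failure rate are illustrative assumptions, not something taken from the book. Note that the chart leaves gaps between bands (for example between 1.01 and 1.15), so the lookup returns nothing for ratios that fall there.

```python
# Bands transcribed from the chart: (low, high, result, % failure, number of cases).
BANDS = [
    (0.55, 1.01, "Attack Fails", 100, 5),
    (1.15, 1.88, "Attack usually succeeds", 21, 48),
    (1.95, 2.56, "Attack usually succeeds", 10, 21),
    (2.71, float("inf"), "Attack advances", 0, 42),
]


def lookup(force_ratio):
    """Return (result, % failure, cases) for the band covering force_ratio.

    Returns None when the ratio falls in a gap between the chart's bands.
    """
    for low, high, result, pct_fail, cases in BANDS:
        if low <= force_ratio <= high:
            return result, pct_fail, cases
    return None


def overall_failure_rate():
    """Case-weighted share of failed attacks across all bands (hypothetical helper).

    The chart's percentages are rounded, so this is only approximate.
    """
    total_cases = sum(cases for *_, cases in BANDS)
    failures = sum(pct * cases / 100 for *_, pct, cases in BANDS)
    return failures / total_cases


if __name__ == "__main__":
    print(lookup(3.0))
    print(lookup(1.05))
    print(round(overall_failure_rate(), 3))
```

The point of the sketch is simply that the historical data shows a graded relationship between force ratio and outcome, with no sharp threshold at 3-to-1.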


We have also done a number of blog posts on the subject (click on our category “Force Ratios”), primarily:

Trevor Dupuy and the 3-1 Rule

You will also see in that blog post another similar chart showing the odds of success at various force ratios.

Anyhow, I kind of think that people should probably quit referencing the 3-to-1 rule. It gives it far more weight and attention than it deserves.


Master the Art of Thinking Clearly Before Making Your Analytics Investment

I lead the technology industry business unit for a fast-growing analytics firm. And one of the perks of my job is the numerous conversations I get to have with a variety of prospects or clients.

I recently read the book ‘The Art of Thinking Clearly’ by Rolf Dobelli, which deals with systematic cognitive errors of human beings as a result of our evolution.

I enjoyed reading the book immensely, relating those cognitive errors to what I experience every day. I am selecting a few rampant cognitive errors mentioned in the book and linking them to situations I have come across.


Def: The mania for all things new and shiny.

The desire to have marketing automation, machine learning models, artificial intelligence, and other such cutting-edge capabilities is intense in organizations today and probably justified. However, it’s alarming to see many of these efforts promise a very large ROI but never deliver it. I once had an executive of a large organization say without batting an eyelid, “We have invested millions in a big data environment, and we now need to figure how to use it.”

I am always disappointed when I hear things like – “Can you tell me how I can use machine learning capabilities in ...

Read More on Datafloq
The Rise of Business Analytics as a Lucrative Career | Simplilearn

The field of business analytics has been driven to new heights by the astonishing explosion of data now available to businesses – and the analytics tools that give companies the ability to leverage that data to great benefit. A recent study on Big Data says that more than 85% of survey respondents report that they have started programs to cre...Read More.
How AI Will Affect the Travel Industry

Since I have a passion for travelling as well as the hospitality industry (having a bachelor's degree in hospitality management), and I believe in the power of Artificial Intelligence (AI), for this week’s article I decided to look into what happens when you combine the two. As it seems, Artificial Intelligence is set to be a game-changer for the travel industry. It is helping consumers and companies simplify making travel arrangements and streamlining business processes. AI is also modernising travel by taking it from a complicated, drawn-out experience to one that is more enhanced and customer-focused by improving the overall efficiency for hotels, airlines, and other travel providers. AI's impact on the travel industry is powerful and massive and it has the potential to transform business completely. Here's how AI will revolutionise the travel industry:

Enhanced Booking and Ticketing via Dynamic Pricing

Tracking price fluctuations for travel and lodging accommodations can be difficult for travellers who are looking for the best time to book a hotel, flights or vacation packages. However, travel providers can help enhance the customer experience using dynamic pricing tools that leverage predictive, or even prescriptive, analytics such as the Hopper's travel deal tracking app. These tools help consumers track ...

Read More on Datafloq
4 Reasons Why Cyber Security is a Growing Industry

The introduction of the Internet to the world in 1990 was something understood but unbelievable. The idea that one could use a machine to do something that always had to be done by hand left many puzzled. The Internet became a place full of information benefiting many in all aspects of life. Not until about 2014 did “cyber-attacks” become something to be feared. Since then, cybersecurity has grown and will continue to grow for many reasons.

Everything is Going Digital

Everywhere you look, something that was once done manually is now being automated.  With every piece of technology that is automated, there is an impressive amount of code that makes the system work.  With every new piece of technology created by code, there is an equal opportunity created for a cyber hacker to misuse the same technology.

This is similar to when the credit card gained traction and people around the world started using it because of how efficient it was. In the beginning, there was no such thing as credit card fraud, but as more and more people started relying on credit cards, credit card fraud became a thing and continued to grow.

Sensitive Data is Stored ...

Read More on Datafloq
Druid, Imply and Looker 5 bring OLAP Analysis to BigQuery’s Data Warehouse

Back in the good old days of on-premise data warehousing projects where data was structured, the software came on CDs and data engineering was called ETL development (and was considered the most boring job on a project, how times have changed), a typical technical architecture had sources on the left-hand side, a relational database hosting the data warehouse in the middle, and a multi-dimensional OLAP (or “MOLAP”) server on the right-hand side serving up data, fast, to specialist query and reporting tools.

Diagram courtesy of Readings in Database Systems, 3rd Edition, Stonebraker & Hellerstein, eds, 1996.

OLAP (Online Analytical Processing) Servers were a great complement to the relational database servers that hosted data warehouses back in those days. By taking a subset of the whole dataset, structuring it into a dimensional model and then storing it as indexed arrays of leaf-level and pre-computed aggregates, they served up results with split-second response times for any combination of dimension filters or level of aggregation requested … and users loved them.

The first ten years of my consulting career were spent working with OLAP Servers and if you’re interested in their history I’d recommend you check out Episode 37 of the Drill to Detail Podcast where I talked about arguably the last pure MOLAP Server still actively sold and implemented, Oracle’s Essbase Server, with Cameron Lackpour; Episode 34 of the Drill to Detail Podcast with Donald Farmer, the original Product Manager behind Microsoft Analysis Services and Episode 11 of the Drill to Detail Podcast with Graham Spicer, my original mentor and the single person most responsible for the nearly twenty-year career I’ve had since then in consulting, data warehousing and now product management, on Oracle’s original Express Server and then OLAP Option technologies.

But just like mainframes that reached perfection just at the time when PCs and mini-computers made them obsolete, OLAP Servers fell out of favour as data volumes and data types exploded whilst time available for loading them via batch processes just disappeared. On-premise data warehouses eventually transformed into elastic, cloud-hosted data warehouse platforms provided as fully-managed services such as Google BigQuery, Snowflake DB and Oracle Autonomous Data Warehouse Cloud Service and were accompanied by a new generation of BI tools like Looker, aimed at data-driven tech startups needing to analyze and understand vast amounts of consumer activity and other event-level behavioural data, as I talked about in my session at the recent Looker Join 2017 conference in San Francisco on Qubit, BigQuery and Looker for petabyte-scale analytics.

But BigQuery is, at the end of the day, a query and compute engine optimized for data warehouse-style queries and workloads albeit at a scale unimaginable ten years ago; Druid, an open-source project first announced in a white paper back in 2014 and now arguably the standard for new-world distributed data stores optimized this time for sub-second response times, may be the OLAP Server to BigQuery’s data warehouse.

To be clear, BigQuery and other distributed query engines like it are fast, particularly when filtering, sorting and aggregating single wide tables of columnar-organized data as you can see in the video below where I query and aggregate around four-and-a-half million smart device events to find out the average monthly temperature in each of the rooms in my house.
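
To make the shape of that query concrete, here is a minimal pure-Python sketch of the same kind of aggregation: scan every event, group by room and month, and average the temperature. The sample events, field layout and function name are hypothetical stand-ins for the smart device data described above. It illustrates the filter-and-aggregate pattern only and says nothing about how BigQuery executes it.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample events: (timestamp, room, temperature reading).
EVENTS = [
    (datetime(2017, 9, 1, 8, 0), "kitchen", 20.0),
    (datetime(2017, 9, 15, 8, 0), "kitchen", 22.0),
    (datetime(2017, 10, 1, 8, 0), "kitchen", 18.0),
    (datetime(2017, 9, 1, 8, 0), "bedroom", 19.0),
]


def average_monthly_temperature(events):
    """Scan every event and average the temperature per (room, month)."""
    sums = defaultdict(lambda: [0.0, 0])  # (room, "YYYY-MM") -> [total, count]
    for ts, room, temp in events:
        key = (room, ts.strftime("%Y-%m"))
        sums[key][0] += temp
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


if __name__ == "__main__":
    for (room, month), avg in sorted(average_monthly_temperature(EVENTS).items()):
        print(room, month, avg)
```

Every call walks the full list of events, which is the "computed on-the-fly" behaviour that a distributed engine parallelises across many workers.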

BigQuery supports joins between large tables, uses ANSI-standard SQL and more recently has benefited from a number of improvements to the response time for small queries as well as large ones, but compared to OLAP servers that typically pre-compute in advance all the different aggregations and store data indexed and organized by the dimensions that users filter results by, it’s definitely a general-purpose database engine rather than a single-purpose OLAP server, and all query aggregations have to be computed on-the-fly.

Druid, originally authored by Eric Tschetter and Fangjin Yang at Metamarkets in 2011 and described in detail in this white paper from 2014, explicitly re-implements key features of old-school OLAP servers: it pre-aggregates incoming real-time streaming and batch data and stores it in a more compressed form, organizes that compressed data as time-based segments bitmap-indexed by dimensions, and then presents data out as OLAP cubes to client applications.
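
To make the pre-aggregation and bitmap-indexing idea concrete, here is a minimal Python sketch. It is not Druid's actual implementation; the event data, the hourly rollup granularity and every function name are assumptions made up for illustration. It rolls raw events up to one row per hour and room, builds a bitmap per dimension value, and then answers a filtered average from the rolled-up rows alone.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events: (timestamp, room, temperature reading).
RAW_EVENTS = [
    ("2017-10-01T08:15:00", "kitchen", 19.5),
    ("2017-10-01T08:45:00", "kitchen", 20.5),
    ("2017-10-01T08:20:00", "lounge", 21.0),
    ("2017-10-01T09:05:00", "kitchen", 21.0),
    ("2017-10-01T09:30:00", "lounge", 22.0),
]


def rollup(events):
    """Pre-aggregate events to one row per (hour, room), keeping sum and count."""
    totals = defaultdict(lambda: [0.0, 0])
    for ts, room, temp in events:
        # Truncate the timestamp to the hour (the assumed rollup granularity).
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0)
        bucket = totals[(hour.isoformat(), room)]
        bucket[0] += temp
        bucket[1] += 1
    return [
        {"hour": hour, "room": room, "temp_sum": temp_sum, "count": count}
        for (hour, room), (temp_sum, count) in sorted(totals.items())
    ]


def build_bitmap_index(rows, dimension):
    """Map each dimension value to an integer whose set bits mark matching rows."""
    index = defaultdict(int)
    for position, row in enumerate(rows):
        index[row[dimension]] |= 1 << position
    return dict(index)


def average_temp(rows, bitmap):
    """Average temperature over the rolled-up rows selected by a bitmap."""
    selected = [row for i, row in enumerate(rows) if (bitmap >> i) & 1]
    total = sum(row["temp_sum"] for row in selected)
    count = sum(row["count"] for row in selected)
    return total / count


ROLLED = rollup(RAW_EVENTS)
ROOM_INDEX = build_bitmap_index(ROLLED, "room")
```

Five raw events collapse into four stored rows here. The filter on room is resolved by reading one bitmap rather than scanning every row, which is the trade-off the paragraph above describes: less flexibility than a general-purpose engine in exchange for less work at query time.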

Image courtesy of “What is Druid”, image downloaded in Oct. 2017

Druid has some significant limitations compared to more general-purpose analytic database engines such as BigQuery. It doesn’t support table joins right now (though it may do by the time you read this; as an open-source project it evolves rapidly) and its primary client interface is JSON over HTTP. Most importantly for organizations that moved to BigQuery because it runs as infrastructure-as-a-service, with Druid you have to take care of server upgrades, capacity scaling and all the other infrastructure management tasks that we thought we’d said goodbye to with data warehouse-as-a-service platforms.
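
For readers who haven't seen the JSON-over-HTTP interface, here is a rough Python sketch of what a native Druid timeseries query body looks like. The datasource name, the `room` dimension and the `temp_sum` and `count` metric names are hypothetical, and the sketch only builds and prints the JSON rather than sending it; in a real cluster a body like this would typically be POSTed to the Broker's `/druid/v2/` endpoint.

```python
import json


def monthly_temperature_query(datasource, room, interval):
    """Build a native Druid timeseries query as a plain dict.

    The dimension and metric names used here are illustrative assumptions,
    not taken from any real datasource.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "month",
        "intervals": [interval],
        # Keep only rows whose 'room' dimension matches the requested value.
        "filter": {"type": "selector", "dimension": "room", "value": room},
        # Sum the pre-aggregated metrics stored at ingestion time.
        "aggregations": [
            {"type": "doubleSum", "name": "temp_sum", "fieldName": "temp_sum"},
            {"type": "longSum", "name": "events", "fieldName": "count"},
        ],
        # Derive the average from the two sums after aggregation.
        "postAggregations": [
            {
                "type": "arithmetic",
                "name": "avg_temp",
                "fn": "/",
                "fields": [
                    {"type": "fieldAccess", "name": "temp_sum", "fieldName": "temp_sum"},
                    {"type": "fieldAccess", "name": "events", "fieldName": "events"},
                ],
            }
        ],
    }


QUERY = monthly_temperature_query(
    "smart_home_events", "kitchen", "2017-01-01/2018-01-01"
)
print(json.dumps(QUERY, indent=2))
```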

But companies offering services and management tools to manage Druid as just another platform service are starting to emerge and courtesy of the Apache Calcite project it’s now possible to query Druid using regular SQL queries, a capability Looker recently took advantage of to offer Druid connectivity as one of the new features in their recent Looker 5 release, as you can see me demonstrating in the video below.
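
As a rough illustration of the SQL route, the sketch below builds the kind of request body Druid's SQL-over-HTTP endpoint (`/druid/v2/sql/` on the Broker) accepts. The datasource and column names are hypothetical, nothing is actually sent, and this is not how Looker itself talks to Druid, just a hand-written equivalent of the sort of query a BI tool might generate.

```python
import json

# Hypothetical datasource and column names; __time is Druid's built-in
# timestamp column.
SQL = """
SELECT
  FLOOR(__time TO MONTH) AS "month",
  room,
  SUM(temp_sum) / SUM("count") AS avg_temp
FROM smart_home_events
GROUP BY FLOOR(__time TO MONTH), room
""".strip()

# The SQL endpoint takes a JSON object with the statement under "query".
SQL_PAYLOAD = json.dumps({"query": SQL})
print(SQL_PAYLOAD)
```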

But just as old-school OLAP servers worked best with query tools specifically designed to work with them, new open-source BI tools such as Superset (from the same engineering team at Airbnb that also brought us Airflow, my current ETL orchestration tool of choice) connect directly to Druid clusters and come close to their commercial rivals in terms of reporting and dashboard features offered to end users; in the video below you can see me creating a percentage-changed line-graph showing how the amount of time I spend cycling each month changed over time, using the same Druid datasource as in the other videos.

Superset, Looker and other BI tools that now support Druid are of course great but the one that’s really got my interest, and prompted me to look further into Druid and how it complements BigQuery and other cloud data warehouse platform services, is Imply, a startup launched by one of the original co-authors of Druid who, not unlike Looker who reinvented the enterprise BI platform for the big data and startup world, are reintroducing that same world to OLAP analysis whilst making the Druid back-end much easier to manage.

Imply runs either on-premise as open-source software you can download and then install on local or cloud-hosted VMs, or as platform-as-a-service through a contract with Imply. Druid is one of the more complex and involved analytic database types to load but Imply’s cloud service makes it simple to spin-up, manage and then ingest data into your Druid cluster either as real-time streaming sources, or via batch loads from Amazon S3 or other popular datasources.

Images courtesy of Imply Cloud Quickstart docs page

I’ve got Imply running on my own Google Compute Engine infrastructure-as-a-service platform so I take care of server management and data ingestion manually, but for me the standout feature in Imply’s platform is Pivot, their open-source OLAP query tool. If any reader is old enough to remember OLAP client tools such as Oracle Express Sales Analyzer and ProClarity you’ll recognize Pivot’s use of terms such as cubes, dimensions, aggregation types and measures as shown in the screenshot of my setup below…

… but more importantly, you’ll recognise the structured query environment and lightning-fast response to my queries against that same set of four-and-a-half million IoT and other events that I extracted from my BigQuery environment and then loaded and stored in compressed column-stored segments pre-aggregated and indexed by the same dimension fields I’m now analysing it by.

Well they say nothing’s new in fashion or music if you wait long enough, and sure enough as I said in my tweet a couple of years ago…

… yes it does make me feel old, but it’s great to see such a powerful concept as multidimensional OLAP storage and dimensional models being rediscovered by the big data and startup worlds.

Druid, Imply and Looker 5 bring OLAP Analysis to BigQuery’s Data Warehouse was originally published in Mark Rittman’s Personal Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Why Security Intelligence Is Going High-Tech

President Trump’s security detail now includes flying drones. It’s not just heads of state that are seeing technological advances change the way we look at security, though. The entire industry is in the midst of a revolution, thanks to a perfect storm of new technologies that will define next-generation security and surveillance.

Science-fiction buffs are quick to point out where a given book, show or movie may have predicted the future. For example, consider the tablet devices we see in Star Trek or the creation of cyberspace in William Gibson’s 1984 novel Neuromancer. Now we get to see a new generation of ideas once only imagined or viewed on the silver screen come to life. Here’s a look at some of the definitive technologies for next-generation security.

Artificial Intelligence

Speaking of science fiction, here's a term you'll recognize from the genre. AI has been characterized as the single greatest leap forward in human technology, but also the potential undoing of human existence. Today’s tech companies don’t seem frightened, though.

They are doing everything possible to learn more about how to build an artificial version of the human brain, capable of learning based on its experiences. In a security application, such a technology could strategically identify vulnerabilities in ...

Read More on Datafloq
TDI Friday Read: U.S. Airpower

[Image by Geopol Intelligence]

This weekend’s edition of TDI’s Friday Read is a collection of posts on the current state of U.S. airpower by guest contributor Geoffery Clark. The same factors changing the character of land warfare are changing the way conflict will be waged in the air. Clark’s posts highlight some of the ways these changes are influencing current and future U.S. airpower plans and concepts.

F-22 vs. F-35: Thoughts On Fifth Generation Fighters

The F-35 Is Not A Fighter

U.S. Armed Forces Vision For Future Air Warfare

The U.S. Navy and U.S. Air Force Debate Future Air Superiority

U.S. Marine Corps Concepts of Operation with the F-35B

The State of U.S. Air Force Air Power

Fifth Generation Deterrence


4 Proven Reasons Why Gamification Improves Employee Training | Simplilearn

Key causes of failure for employee Learning and Development programs (whether online or classroom-based) include lack of engagement by participants, lack of oversight by management and lack of social involvement by peers. However, what you might not know is that many of these issues can be ameliorated through gamification. Gamification, from a traini...Read More.
Gamification in eLearning: How Does it Work? | Simplilearn

Enterprises that are looking to impart effective team training without causing stress or disinterest are beginning to look to training providers that offer gamification in their digital learning models. According to research by Ambient Insight, spending on gamified learning will reach $2.4 billion by 2018, up from $1.7 billion in 2013, which su...Read More.
Dell Makes a Big Bet on IoT

Dell announced its company-wide strategy to capture the IoT opportunity and is betting big - $1B for IQT.
10 Ways the Internet of Things Will Change the World as We Know It

The Internet of Things (IoT) is a phrase that is becoming more and more common. It can be used to refer to everything from smartphones to smart houses. In a nutshell, it is the interconnectivity between all these different smart gadgets flooding the market and the Internet.

Business Insider estimates that by 2020 there will be more than 24 billion IoT devices. But, just as personal computers and the Internet changed the world as we know it, so too will the Internet of Things begin to change the world around us more and more. Here are ten ways things might change in the coming years from the view of Vince Robinson, CEO of ScalaHosting.

1. Appliances Will Do the Work for You

Have you ever wished you didn’t have to spend so much time building a grocery list, or that you could just place your order and pick it up? Smart appliances are not only on the market but are doing more and more.

For example, Samsung's Family Hub refrigerator does everything from playing hit tunes to managing your grocery list for you. The concept of grocery management is fairly simple. When the door of the refrigerator opens and closes, there are three cameras that ...

Read More on Datafloq
The 12 Basic Principles Of Data Visualization

The best data in the world won't be worth anything if no one can understand it. The job of a data analyst is not only to collect and analyze data, but also to present it to the end users and other interested parties who will then act on that data. Here’s where data visualization comes in.

Many data analysts are not necessarily experts in data communication or graphic design. This means a lot of them can be lost in the translation of data from the collection to the presentation in the boardroom. I often find myself teaching data visualization classes to more and more data science teams, who recognize this as an area of weakness.

If your job entails presenting findings from a set of data or analysis to a group of laymen, then it’s part of your job to present them in such a way that they are easy to understand and therefore easy to act on appropriately.

In this post, I’ll share a few tips to help you turn data into actionable insights that people will understand.

Keep your Audience in Mind

Any data visualization should be designed in such a way that it meets the audience’s information needs. ...

Read More on Datafloq

Copyright © 2017 BBBT - All Rights Reserved