Blockchain could solve Data Integrity problems

As the world relies more heavily on data as the basis for critical decision-making, it is vital that this data can be trusted. And that trust is the key issue here.

People (Data Scientists, Chief Innovation Officers) are looking for ways to automate using data. Automation translates to efficiency, which translates to value. This automation trend has accelerated through advances in business intelligence, big data, the rise of IoT and the necessary cloud infrastructure.

So why do I raise this trust issue? Isn’t this solved by the industry-standard DMBOK, which describes the relevant Data Quality Management processes?

Because data is vulnerable, not just to the breaches we hear about in the news, but to a much more subtle and potentially more destructive class of attack: an attack on data integrity. The data isn’t stolen, it is manipulated and changed.

Take the tech-savvy Staten Island high school student who studied advanced computer programming at an elite computer camp and used his skills to hack into a secure school computer system and improve his scores.

Enter the Blockchain

A possible solution for assuring data integrity could be blockchain technology.

In a blockchain, time-stamped entries are made into an immutable, linear log of events that is replicated across the network. Each discrete entry, in addition to being time-stamped, is irreversible and can have a strong identity attached. So it becomes irrefutable who made the entry, and when. These time-stamped entries are then approved by a distributed group of validators according to a previously agreed-upon rule set.

Once an entry is confirmed according to this rule set, the entry is replicated and stored by every node in the network, eliminating single points of failure and ensuring data resilience and availability.
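
As an illustration of the idea (not of any particular blockchain product), here is a minimal Python sketch of such a hash-chained, time-stamped log. The HashChainedLog class and its field names are my own; a real blockchain would add cryptographic signatures for identity and distributed consensus among validators on top of this.

```python
import hashlib
import json
import time


def entry_hash(entry: dict) -> str:
    """Deterministic SHA-256 hash over an entry's contents."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


class HashChainedLog:
    """Append-only log where every entry embeds the hash of its predecessor."""

    def __init__(self):
        self.entries = []

    def append(self, author: str, payload: dict) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {
            "timestamp": time.time(),  # when the entry was made
            "author": author,          # in practice a signature, not just a name
            "payload": payload,
            "prev_hash": prev,         # links this entry to the previous one
        }
        entry["hash"] = entry_hash(entry)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Any change to an earlier entry breaks every hash that follows it."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev_hash"] != prev or entry_hash(body) != e["hash"]:
                return False
            prev = e["hash"]
        return True


log = HashChainedLog()
log.append("sensor-42", {"reading": 21.7})
log.append("sensor-42", {"reading": 22.1})
print(log.verify())                        # True
log.entries[0]["payload"]["reading"] = 99  # tamper with history...
print(log.verify())                        # False: the manipulation is detectable
```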

Future

Because the promises of data integrity and security are so strong, new systems can be built to share blockchain-enforced data among organizations that may not trust each other. And once an ecosystem has shared data that everyone can trust, new automation opportunities emerge.

Smart contracts are perhaps the next step. They make it possible for different parties to create automated processes across companies and perhaps across industries. Blockchain could become an ecosystem for cross-industry workflows involving data from multiple parties. An entirely new class of loosely coupled integration applications can then be created.

All this availability of data connectivity makes me feel nostalgic

When I first started my journey online I was still on dial-up. Every once in a while, I listened to the whirly noises the little grey box on my computer made as it performed a ‘handshake’ with another modem somewhere.

Like magic, I was online, and I used those few expensive minutes to connect to a BBS, upload code, download email, perform a search via AltaVista and various other simple tasks. Then I hurried to disconnect. Dialing in was expensive, after all.

As I relax here in the beautiful Dordogne in the south of France, I’m oddly reminded of those hectic times. It’s very quiet here and there’s no Wi-Fi. The mobile signal we do have is hardly enough to download a single email and constantly switches from one provider to another. I promised my wife and kids that I would try to disconnect, get away and enjoy our vacation as much as possible. So once every couple of days I switch on my data connection and wait for the weather forecast (which turns out to be very important when camping) and other messages to trickle into my phone.

I’m not complaining – it’s nice to focus on other things. I read books (albeit mostly pulpy action novels), take walks in the hills and mountains for a change from the flatness of the Netherlands I am used to, and enjoy being slightly disconnected from the office and my life back home.

 

I know every photo I want to upload to Facebook is going to take a few minutes, and it will cost some money. While I’m only paying a couple of cents per megabyte here, it all adds up when you realize a photo easily comes in at 4MB.

By putting a price on those things you’re forced to think about the value of each status update on Facebook, each reload of your Twitter feed, or each casual browse through Instagram.

I’m not passing judgement here; I’m merely recognizing that I’m part of a generation that is still very much delighted by the technology we have access to, because we grew up in a time when none of it existed yet. I realize how fortunate we are to live in a time when technology is so ubiquitous and available to almost anyone at almost any time.

Cloud is all about the mindset and Application Architecture

One of the propositions of cloud is that it should be possible – through the use of intelligent software – to build reliable systems on top of unreliable hardware. Just like you can build reliable and affordable storage systems using RAID (Redundant Arrays of Inexpensive Disks).
One of the largest cloud providers says: “everything that can go wrong, will go wrong”.

So the hardware is unreliable, right? Mmm, no. Nowadays most large cloud providers buy very reliable, simpler (purpose-optimized) equipment directly from suppliers upstream in the server market. Sorry Dell, HP & Lenovo, there goes a large part of your market. When you run several hundred thousand servers, a failure rate of 1 PPM versus 2 PPM (parts per million) makes a huge difference.

Up-time is further increased by thinking carefully about what exactly is important for reliability. For example: one of the big providers routinely removes the overload protection from its transformers. They prefer that a transformer costing a few thousand dollars occasionally breaks down to regularly having whole aisles lose power because a transformer manufacturer was worried about possible warranty claims.

The real question continues to be what happens to your application when something like this happens. Does it simply remain operational, does it gracefully degrade to a slightly simpler, slightly slower but still usable version of itself, or does it just crash and burn? And for how long?
The cloud is not about technology or hardware; it’s about mindset and application architecture.
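
A short Python sketch of what that choice can look like in application code; the recommendation service, cache and bestseller fallback below are all hypothetical, made up for illustration:

```python
import logging
import random

# Hypothetical stand-ins for a real dependency and a local cache (illustration only).
CACHE = {"user-1": ["cached-item-a", "cached-item-b"]}
BESTSELLERS = ["bestseller-1", "bestseller-2"]


def unreliable_recommendation_service(user_id: str) -> list:
    """Simulates a dependency that sometimes fails, as cloud hardware eventually will."""
    roll = random.random()
    if roll < 0.1:
        raise TimeoutError("service too slow")
    if roll < 0.2:
        raise ConnectionError("service unreachable")
    return ["personalized-item-for-" + user_id]


def recommendations(user_id: str) -> list:
    """Prefer the live answer; degrade to cached or generic results instead of crashing."""
    try:
        return unreliable_recommendation_service(user_id)
    except TimeoutError:
        logging.warning("service slow, serving slightly stale cached results")
        return CACHE.get(user_id, BESTSELLERS)   # simpler, slower to refresh, still usable
    except ConnectionError:
        logging.warning("service down, serving generic bestsellers")
        return BESTSELLERS                       # degraded, but the page still renders


print(recommendations("user-1"))  # always returns something, whatever the dependency does
```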

 

“Who’s on First?”

The movie Purple Rain mixes up nicely with Abbott and Costello

Morris: Okay. What’s the password?
Jerome: You got it.
Morris: Got what?
Jerome: The password.
Morris: The password is what?
Jerome: Exactly.
Morris: The password is exactly?
Jerome: No, it’s okay.
Morris: The password is okay?
Jerome: Far as I’m concerned.
Morris: Damn it, say the password!
Jerome: What.
Morris: Say the password, onion head!
Jerome: The password is what?
Morris: [frustrated] That’s what I’m asking you!
Jerome: [more frustrated] It’s the password!
Morris: The password is it?
Jerome: [exasperated] Ahhhhh! The password is what!
Morris: It! You just said so!
Jerome: The password isn’t it! The password is?
Morris: What?
Jerome: Got it!
Morris: I got it?
Jerome: Right.
Morris: It or right?

“Who’s on First?” is a comedy routine made famous by Abbott and Costello. The premise of the sketch is that Abbott is identifying the players on a baseball team for Costello, but their names and nicknames can be interpreted as non-responsive answers to Costello’s questions.

But it also reminds me of Rain Man or Rush Hour 3 – He is Mi and I am Yu


 

Cloud, the new legacy?

One thing is for certain, we will spend a good part of 2015 talking about, discussing and disagreeing on how we now need to move, deliver, transport, carry, send and integrate the various component elements that make up our Business Applications.

The advent of Cloud, virtualization and managed hosting technologies means that we have all become used to the ‘as-a-Service’ suffix, as we now purchase a defined selection of software applications and data that are increasingly segmented and componentized in nature.

Because of the Cloud, businesses run on mobile devices, with employees, customers and partners collaborating easily and data securely stored and accessible from anywhere in the world, all without a worry about the infrastructure. That’s someone else’s problem, isn’t it? With low monthly prices, who wouldn’t sign up and embrace a SaaS app that makes your life easier?

All this convenience comes at a price.

That price is silos. Instead of tearing down silos, SaaS applications build strong, high walls around functionality and data. Not the traditional legacy silos, but loads of little silos within and between departments and teams. Instead of bringing teams into alignment, they are separated into fiefdoms of data if one does not govern the Cloud.

New platform lets people make money leaking confidential files

A team of cryptographers and developers want to create a website where anyone can sell data sets to the highest bidder. “You’ll hate it,” is the slogan of the service, which is accessible via Tor. Payments are made via Bitcoin.

Anyone who wants to leak a file to the highest bidder must upload it to Slur, a marketplace for data. There are no restrictions on the type of data that is offered or the motives of the seller, says spokesman Thom Lauret of U99, the group of cryptographers and developers behind the website. The site is designed to “subvert and destabilize the established order“.

The website expects stolen databases, source code for proprietary software, zero-day exploits and other confidential documents, as well as “unflattering” pictures and videos of celebrities. Only the highest bidder gets the data and may then choose to release it, or simply keep it hidden. Large companies may be able to deposit money to keep leaks out of the public eye. To counter this, the website allows users to pool their bids in a form of crowdsourcing, creating a larger bidding deposit.

Slur.io ensures that “whistleblowers” remain completely anonymous and are compensated. “Slur introduces a balanced system with the material interests of whistleblowers protected in exchange for the risks they take,” said spokesman Lauret. Datasets can only be offered once.

To prevent false claims being made about the content of the data, the buyer can see the data before the seller gets the money. If the buyer is not happy with the content, he can start an arbitration in which other members of the community vote on the content. If they agree with the buyer, the buyer gets his money back.

Payments are made via Bitcoin and the site will only be accessible via Tor to keep out the various governments. The developers do not expect to be targeted by the government, because source code would fall under free speech and, they claim, they do not benefit from the data that is sold on the site. The question is whether the American government agrees; the site is currently based in San Francisco.

The developers of the website hope to raise money from the public to pay for the development of the platform. A beta version of the site was due to open in April, with a full release to follow in July.

Cloud reliability

Reliability is often attributed as one of the reasons some organizations are wary of the cloud.

Last week, Amazon, Rackspace and IBM had to “reboot” their clouds to deal with maintenance issues with the Xen hypervisor. Details were scarce but it was pretty quickly established that an unspecified vulnerability in the Xen hypervisor was the issue.

The vulnerability, discovered by researcher Jan Beulich, concerned the Xen hypervisor, the open-source technology that cloud service providers use to create and run virtual machines. If exploited, the vulnerability would have allowed malicious virtual machines to read data from or crash other virtual machines as well as the host server.

Not all providers had to reboot their clouds for upgrades or maintenance. Google and EMC VMware support live migration, which keeps internal changes invisible to users and avoids these Xen reboots, and Microsoft uses (customized) Hyper-V, so it did not have that vulnerability.

It is interesting to see what “uptime” means in this context. In many reports of this nature, “uptime” doesn’t take into account “scheduled downtime.” And that could very well be the case here, as well. If one does a little bit of math:

  • 99.9% uptime is 8.77 hours of downtime per year
  • 99.99% uptime is 52.60 minutes of downtime per year
  • 99.999% uptime is 5.26 minutes of downtime per year
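
The arithmetic behind those numbers is easy to check; a few lines of Python:

```python
# Convert an uptime percentage into the downtime it allows per year.
HOURS_PER_YEAR = 365.25 * 24  # ~8766 hours

for uptime in (99.9, 99.99, 99.999):
    downtime_hours = (1 - uptime / 100) * HOURS_PER_YEAR
    print(f"{uptime}% uptime allows {downtime_hours:.2f} h "
          f"({downtime_hours * 60:.2f} min) of downtime per year")
```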

Although some users complained about the outage itself, most were complaining about the (lack of) communication from the providers.

Cloud providers can no longer be treated as a black box. As architects, we need to know the limitations of the architectural components the provider uses, such as Xen. We need to know how often these kinds of reboots have occurred, and how the provider handles transparent maintenance.

We also need to consider the lines of communication. Providers often drop the ball here. People are often unhappy because they didn’t get much (or any) heads-up about the reboot, not about the reboot itself.

We should remember that outages and other disruptions are few and far between these days, so these rare events get extra media attention.

Cloud adoption! Do you have a strategy?

As conversations about the Cloud continue to focus on IT’s inability to adopt it (or the gap between IT and Business), organizations outside of IT continue their cloud adoption. While many of these efforts are labeled Rogue or Shadow IT and are frowned upon by the IT organization, they are simply a response to a wider problem.

The IT organization needs to adopt a cloud strategy; a holistic one is even better. However, is it really ready for this approach? There are still CIOs who are resisting the cloud.

A large part of the problem is that most organizations are still in a much earlier state of adoption.

Common hurdles are:

  1. The mindset: “critical systems may not reside outside your own data center”
  2. Differentiation: “our applications and services are true differentiators”
  3. Organizational changes: “moving to cloud changes how our processes and governance models behave”
  4. Vendor management: “we like the current vendors and their sales representative”

In order to develop a holistic cloud strategy, it is important to follow a well-defined process. Plan-Do-Check-Act fits just about any organization:

Assess: Provide a holistic assessment of the entire IT organization and its applications and services, one that is business-focused, not technology-focused. Understand what is differentiating and what is not.

Roadmap: Use the options and recommendations from the assessment to provide a roadmap. The roadmap outlines priorities and valuations.

Execute: For many, it is important to start small because of the lower risk, and ramp up where possible.

Re-Assess & Adjust: As the IT organization starts down the path of execution, lessons are learned and adjustments are needed. Those adjustments will span technology, organization, process and governance. Continual improvement is key to staying in tune with changing demands.

Today, cloud is leveraged in many ways, from Software as a Service (SaaS) to Infrastructure as a Service (IaaS). However, the approach to leveraging cloud is most often fractured and disjointed. Yet the very applications and services in play require that organizations take a holistic approach in order to work most effectively.

The Cloud has landed! It’s become Foggy

We’re about to come full circle once again: decentralizing and giving a greater role to local storage and computing power.

It depends on the nature and the amount of data that needs to be stored and on its processing demands. With the enormous rise in the amount of data caused by the ‘Internet of Things’, the nature of the data is becoming more and more diffuse. These developments lead to yet another revolution in the data area: the Fog.

Smarter? Or gathering more data?

More and more devices are equipped with sensors: cars, lampposts, parking lots, windmills, solar power plants, and everything from animals to humans. Many of these developments are currently still in the design phase, but it will not be long before we live in smart homes in smart cities and drive our cars along smart streets wearing our smart tech.

Everything around us is ‘getting smarter’, or at least gathers more data. But where is that data stored, and why? Where is all that data processed into useful information? The bandwidth of the networks we use grows much more slowly than the amount of data that is sent through them. This requires thinking about the reason to store data (in the cloud).

If you want to compare data from many different locations, for instance data from parking-lot sensors feeding an app that shows where the nearest free parking space is, then the cloud is a good place to process the information. But what about data that is better handled locally?

Data Qualification

The more data is collected, the more important it will be to determine what the nature of the data is and what needs to be done with it. We need to look at the purpose of the collected data. For example: if the data is used for ‘predictive maintenance’, which monitors something so that a timely replacement or preventive maintenance can take place, it does not always make sense to send the data to the cloud.

Another example is the data generated by security cameras. Typically, 99.9% of the time these show an image of a room or space that has not changed. The interesting data is the remaining 0.1% where there is something to see. The rest can be stored locally, or not at all. Filtering the useful from the useless data again calls for local computing power.
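
As a toy sketch of that idea (the frame format, the 2% threshold and the probabilities are all made up for illustration), edge-side filtering is essentially a local loop that only forwards the interesting fraction:

```python
import random

UPLOAD_THRESHOLD = 0.02  # assumed: fraction of pixels that must change before we care


def capture_frame(size: int = 1000) -> list:
    """Stand-in for a camera frame: almost always static, very occasionally different."""
    frame = [0] * size
    if random.random() < 0.001:  # the rare interesting moment
        for i in random.sample(range(size), k=50):
            frame[i] = 255
    return frame


def changed_fraction(previous: list, current: list) -> float:
    """Fraction of pixels that differ between two frames."""
    return sum(1 for a, b in zip(previous, current) if a != b) / len(current)


def run_edge_filter(frames: int = 10_000) -> None:
    previous = capture_frame()
    uploaded = 0
    for _ in range(frames):
        current = capture_frame()
        if changed_fraction(previous, current) > UPLOAD_THRESHOLD:
            uploaded += 1  # only this frame would be sent on to the cloud
        previous = current
    print(f"uploaded {uploaded} of {frames} frames")  # typically well under 1%


run_edge_filter()
```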

This decentralization of computing power and storage is a recent trend that Cisco calls ‘fog computing’. With distributed intelligence, a more effective action can often be taken in response to the collected data, and unnecessary costs for bandwidth and storage can be avoided. This is a development that goes very well with the transition to the cloud.

Cisco

Fog Computing is a paradigm that extends Cloud computing and services to the edge of the network. Similar to Cloud, Fog provides data, compute, storage, and application services to end-users. The distinguishing Fog characteristics are its proximity to end-users, its dense geographical distribution, and its support for mobility. Services are hosted at the network edge or even end devices such as set-top-boxes or access points. By doing so, Fog reduces service latency, and improves Quality of Service (QoS), resulting in superior user-experience. Fog Computing supports emerging Internet of Everything (IoE) applications that demand real-time/predictable latency (industrial automation, transportation, networks of sensors and actuators). Thanks to its wide geographical distribution the Fog paradigm is well positioned for real time big data and real time analytics. Fog supports densely distributed data collection points, hence adding a fourth axis to the often mentioned Big Data dimensions (volume, variety, and velocity).

Unlike traditional data centers, Fog devices are geographically distributed over heterogeneous platforms, spanning multiple management domains. Cisco is interested in innovative proposals that facilitate service mobility across platforms, and technologies that preserve end-user and content security and privacy across domains.

The future? It will be hybrid with foggy edges.

What do price reductions for IaaS lead to?

The continued decline in IaaS pricing over the last six months is a signal that providers are seeking more business.

IBM thinks that prices and profit margins for x86 servers will be under continual pressure, and it sold its server business to Lenovo. This shows that IBM considers server hardware already commoditized, so few further cost reductions in basic cloud infrastructure can be expected.

From a supplier perspective: the lower prices can be a signal that IaaS might actually become a loss leader to get users into the cloud store and then on to PaaS and SaaS offerings. Providers will try to sell basic IaaS users other cloud services on top of IaaS.

From a user (IT Department) perspective: IaaS displaces only hardware cost; PaaS displaces hardware, OS and middleware costs; and SaaS displaces all application costs.

Amazon, Google, Microsoft and other cloud providers need a customer base to which they can sell their cloud-specific services on PaaS and SaaS. Price reductions for IaaS will keep that base, and opportunities to upsell into the emerging cloud-specific service market will grow.

Every cloud will potentially be a hybrid, so users and providers will rely on deployment and management tools that converge on a common model.