Cloud data storage is based on the delivery of files from local computers and servers to remote servers and storage facilities that are opaque to the user but can be accessed and managed at any time. The reliability of cloud storage services and the privacy of users (i.e., protecting files from being accessed by any party other than their owner) are therefore paramount when subscribing to and implementing any cloud service.
The market for cloud storage services comprises a large number of companies that operate and offer data storage programs, from small data centers that cater to the needs of individuals and SMEs to the large storage facilities of companies such as Amazon, Google and Microsoft, which were built to manage their own gigantic volumes of data but are also offered to external customers. However, since the earliest days of cloud storage services, experts have repeatedly raised concerns over the protection of data, the reliability of centralized data centers, the liability of cloud storage companies in cases of lost or incorrectly stored files, and the privacy of users (see, for example, Hu et al., 2010; Dai et al., 2017).
Faults associated with the technical performance of the cloud emerge from its servers, from retrieval systems (Content Distribution Networks, or CDNs) and from clients. Some faults are defined as crash faults, while others are performance-degrading faults. Crash faults are the most common category, characterized by service “blackouts”, whereas services that are temporarily disabled or exhibit lower degrees of performance suffer from performance-degrading faults. For example, an incident in which files that were uploaded to the cloud are not accessible due to errors in writing to a folder is a crash fault, while CPU leaks that lower the performance of a server (and therefore slow the retrieval of a file) are performance-degrading faults (Wang, 2017). When data and files are managed through a centralized data center (or through a series of them), a wide-scale fault, and in particular a crash fault that cuts off users’ access to their stored files, can halt the operations of companies, organizations and individuals for as long as the outage persists. For example, AWS’s outage in March 2017 lasted several hours, causing damages estimated at more than 300 million USD (Sverdlik, 2017).
Artificial Intelligence is a set of advanced computational models and processes inspired by research on the human brain. These models and tools operate behind the scenes of many apps and websites in a seamless way that does not interfere with the user’s interaction through the UI. For example, web search and term similarity, automated translation, face recognition and recommendation systems are some of the applications of AI.
Artificial Intelligence is often used to create a better user experience. A simple example is Google, which uses advanced machine learning algorithms to narrow its search results so that they closely match what users are looking for. As the algorithm learns and refines its search definitions, users may notice that results vary from day to day or from user to user. Targeted ads likewise use machine learning algorithms to propose products and advertisements based on the user’s search history.
The market for AI applications is expected to grow substantially in the coming years. Figure 2 presents some of the expected common uses and the revenues from their commercialization in the near future. Nonetheless, the widespread implementation of AI processes requires increasingly powerful computational facilities, due to the complexity of these operations. Therefore, companies invest vast amounts in purchasing GPU and CPU units dedicated to carrying out this scope of computation, or purchase, at great expense, processing power from one of the cloud processing providers (e.g., Amazon Web Services, Google Cloud, Microsoft Azure and IBM).
Just like a human brain, AI and machine learning algorithms require data inputs in order to draw inferences. Data mining is the computational process of discovering patterns in large data sets; it reduces large data structures so that machine learning algorithms can make decisions and inferences. Consequently, as organizations and companies accumulate large datasets in their day-to-day operations, covering virtually every aspect of their performance, suppliers and clients, they seek new ways to apply AI and machine learning methods to derive new managerial insights from the data on a continuous basis.
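As a toy illustration of the pattern-discovery step described above, a frequent-pair count over purchase baskets reduces raw transaction records to a compact summary that a learning algorithm can act on. The transaction data and item names below are invented for the example:

```python
from collections import Counter
from itertools import combinations

# Toy transaction log: which items were bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk", "butter"},
    {"bread", "butter"},
]

# Count co-occurring item pairs -- a reduced data structure that a
# recommendation or association-rule algorithm can reason over.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common = pair_counts.most_common(1)[0]
```

Real data-mining pipelines operate on far larger logs, but the principle is the same: the raw records are condensed into pattern counts before any inference is made.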
Nonetheless, AI and machine learning tools for analyzing vast amounts of data require volumes of computational power that organizations often lack, forcing them to subscribe to a commercial cloud service and upload their sensitive data files to another company’s servers. Due to the confidential nature of the data and its commercial value, many companies avoid doing so, and therefore forgo the potential value of analyzing their databases with advanced AI methods.
Blockchain technology provides a unique and highly secure approach to processing, storing and distributing data while maintaining its consistency and integrity, and it can support use cases such as decentralized processing. A blockchain is simply blocks of data hashed together and chained, with each block incorporating the previous block’s hash to maintain consistency across the chain (Vijayan, 2017). Blockchains use the SHA-256 algorithm to create a hash. The nature of the hash makes it resource-intensive to crack: today, a SHA-256 hash can only be broken through brute force, requiring computational power that is not available in the commercial hardware market (Vijayan, 2017).
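The chaining idea can be sketched in a few lines: each block’s hash commits to the previous hash, so altering any earlier block changes every later hash. The block structure and field names below are illustrative, not any particular blockchain’s format:

```python
import hashlib
import json

def block_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 hash of a block's payload together with the previous block's hash."""
    record = json.dumps({"prev": prev_hash, "data": payload}, sort_keys=True)
    return hashlib.sha256(record.encode()).hexdigest()

# Build a short chain; each hash commits to everything before it.
genesis = block_hash("0" * 64, {"tx": "genesis"})
second = block_hash(genesis, {"tx": "alice->bob"})

# Tampering with the first block's payload changes the second block's hash too.
tampered_genesis = block_hash("0" * 64, {"tx": "GENESIS"})
tampered_second = block_hash(tampered_genesis, {"tx": "alice->bob"})
```

Because `second != tampered_second`, any node holding the original chain can detect the tampering by recomputing the hashes.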
Distributed data mining of large datasets was introduced by the SETI Institute through its BOINC program (Estrada et al., 2009). The introduction of Bitcoin and its proof-of-work mechanism provided a framework for incentivizing miners for the work and energy required to carry out a large series of computations over a decentralized network (Nakamoto, 2008).
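The proof-of-work mechanism amounts to a brute-force search: find a nonce whose hash, combined with the block data, meets a difficulty target. A minimal sketch (the payload and difficulty level are chosen only for the example):

```python
import hashlib

def proof_of_work(data: bytes, difficulty: int) -> int:
    """Find a nonce such that SHA-256(data + nonce) starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

nonce = proof_of_work(b"block payload", 3)
digest = hashlib.sha256(b"block payload" + str(nonce).encode()).hexdigest()
```

Finding the nonce is costly (each extra zero multiplies the expected search by 16), but verifying it takes a single hash, which is what makes the mechanism usable as an incentive-compatible proof of expended work.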
There are many ongoing projects aimed at providing secure storage over a decentralized network. A decentralized storage network is a cloud platform in which each node stores either a part of the data or file, or the entire chain of data in a blockchain. Some of the better-known names in this space are FileCoin, IPFS, SiaCoin, Storj, NextCloud, and NEM’s Mijin project (see, e.g., Protocol Labs, 2017). Reliability and privacy on a decentralized network can be a major issue: most decentralized networks cannot recover lost data when a hosting node suffers a hardware crash, and nodes with malicious intent can alter the files they host in order to attack the file recipient (a common problem that plagues torrents).
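One common defense against malicious hosting nodes is for the file owner to keep a hash manifest of the shards it distributes, so every retrieved shard can be verified before reassembly. The sharding scheme and manifest layout below are assumptions for illustration, not any specific network’s protocol:

```python
import hashlib

def shard(data: bytes, n: int) -> list:
    """Split data into n roughly equal shards for distribution across nodes."""
    size = -(-len(data) // n)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n)]

def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

original = b"confidential dataset" * 100
shards = shard(original, 4)
manifest = [fingerprint(s) for s in shards]  # the owner keeps these hashes

# On retrieval, each shard is checked against the manifest before reassembly;
# a shard altered by a malicious node fails the check.
verified = all(fingerprint(s) == h for s, h in zip(shards, manifest))
```

This detects tampering but does not by itself recover lost shards; networks that also need crash resilience add replication or erasure coding on top of the verification layer.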
IAGON was built not only to serve decentralized networks but also to work with current data storage facilities such as SQL and NoSQL databases. IAGON’s approach is unique in that it uses a machine learning algorithm to distribute the load across a decentralized network for processing, and encrypts and decrypts the data that flows through its system.
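To make the load-distribution idea concrete, consider ranking candidate nodes by observed capacity and latency and routing work to the best-scoring one. This is only a toy heuristic to illustrate the general concept; the features, weights, and node records are invented for the example and do not describe IAGON’s actual algorithm:

```python
# Toy node-selection heuristic: rank candidate nodes by free storage
# capacity (higher is better) and observed latency (lower is better).
# All values and weights below are invented for illustration.
nodes = [
    {"id": "node-a", "free_gb": 120, "latency_ms": 40},
    {"id": "node-b", "free_gb": 300, "latency_ms": 150},
    {"id": "node-c", "free_gb": 80,  "latency_ms": 15},
]

def score(node: dict) -> float:
    # Normalize each feature to a comparable scale, then combine.
    return node["free_gb"] / 300 - node["latency_ms"] / 200

best = max(nodes, key=score)
```

A learning-based scheduler would replace the fixed weights with parameters fitted to observed node behavior, but the selection step itself stays this simple: score the candidates, pick the best.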
There are many use cases that IAGON can serve. IAGON can provide secure storage over centralized, clustered or decentralized networks, distribute data processing load across its network of data miners for data analytics, provide a secure solution for creating smart contracts over the Blockchain, or serve to identify honest and attacking nodes within a system.