The Development Of Data Storage Technology In The Era Of Big Data
Posted by
Md Ashikquer Rahman
In the digital economy era, data has become a new means of production, and applications built on data-driven experiences, data-driven decision-making, and data-driven processes keep emerging.
5G, cloud, and AI are accelerating the digital transformation of industry, and the era of massive data has arrived. Massive data is pushing enterprises from data management toward data operation, and they currently face three major challenges: first, storage costs are so high that not all data can be kept; second, efficiency is low and data does not flow between systems; third, automation is poor, which makes the data hard to manage.
In the Hadoop 1.0 era, compute and storage were tightly integrated, and the platform could handle only one kind of workload, MapReduce analysis. In the Hadoop 2.0 era, the compute layer and the data began to be decoupled: YARN provided independent resource management, and additional compute engines such as Spark were supported. Now, in the Hadoop 3.0 era, compute and storage evolve separately. HDFS erasure coding (EC) is used to store cold data, and external storage such as S3 is introduced to strengthen the storage foundation, so the platform gradually evolves toward a data lake architecture.
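To make the cold-data argument concrete, here is a minimal sketch of the raw capacity needed under replication versus erasure coding. It assumes HDFS's default 3x replication and a Reed-Solomon layout with 6 data units and 3 parity units; the function names and the 100 TB figure are illustrative only, and the arithmetic ignores small-file and partial-stripe effects.

```python
def raw_tb_replication(logical_tb: float, replicas: int = 3) -> float:
    """Raw capacity when every block is stored `replicas` times."""
    return logical_tb * replicas


def raw_tb_erasure_coding(logical_tb: float,
                          data_units: int = 6,
                          parity_units: int = 3) -> float:
    """Raw capacity when each stripe has `data_units` data cells
    plus `parity_units` parity cells (Reed-Solomon style layout)."""
    return logical_tb * (data_units + parity_units) / data_units


if __name__ == "__main__":
    cold_data_tb = 100  # hypothetical amount of cold data
    replicated = raw_tb_replication(cold_data_tb)
    erasure_coded = raw_tb_erasure_coding(cold_data_tb)
    print(f"3x replication: {replicated} TB raw")
    print(f"RS(6,3) erasure coding: {erasure_coded} TB raw")
```

Under these assumptions the same cold data needs half the raw capacity with erasure coding, at the price of more CPU and network work when data is read or rebuilt, which is why it suits data that is rarely accessed.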
In the Hadoop 3.0 era, computing is becoming lightweight and containerized, and the separation of compute and storage has become a reality. Once the two are separated, the original native big data storage layer is replaced with an enterprise-grade storage base. The advantage is that mature enterprise storage capabilities, such as high reliability, high utilization, and multi-protocol access, can be brought to big data to better release the value of the data.
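The benefit of scaling the two layers independently can be shown with a rough sizing sketch. All numbers and function names below are hypothetical, and the CPUs inside the storage nodes are deliberately not counted; the point is only that an integrated cluster must be sized for whichever resource runs out first.

```python
import math


def coupled_sizing(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Integrated nodes: every node adds CPU and disk together,
    so the node count follows the scarcer resource."""
    nodes = max(math.ceil(cores_needed / cores_per_node),
                math.ceil(tb_needed / tb_per_node))
    return {"nodes": nodes,
            "cores": nodes * cores_per_node,
            "tb": nodes * tb_per_node}


def decoupled_sizing(cores_needed, tb_needed,
                     cores_per_compute_node, tb_per_storage_node):
    """Separated layers: compute and storage are sized on their own."""
    compute_nodes = math.ceil(cores_needed / cores_per_compute_node)
    storage_nodes = math.ceil(tb_needed / tb_per_storage_node)
    return {"compute_nodes": compute_nodes,
            "storage_nodes": storage_nodes,
            "cores": compute_nodes * cores_per_compute_node,
            "tb": storage_nodes * tb_per_storage_node}


if __name__ == "__main__":
    # Hypothetical storage-heavy workload: 640 cores, 4800 TB.
    print(coupled_sizing(640, 4800, cores_per_node=32, tb_per_node=48))
    print(decoupled_sizing(640, 4800,
                           cores_per_compute_node=32,
                           tb_per_storage_node=48))
```

In this made-up case the integrated cluster ends up with 3200 compute cores to hold 4800 TB even though only 640 cores are needed, while the separated design provisions exactly the 640 cores the workload asks for.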
For example, in 2018 Huawei launched a big data storage-compute separation solution based on the OceanStor Pacific series. In terms of cost, the solution separates storage from compute, lets each resource be expanded independently on demand, and supports elastic EC and tiering of hot and cold data, all of which greatly reduce storage costs. In terms of data access efficiency, the OceanStor Pacific series adopts a fully symmetric distributed NameNode: cluster performance and the number of supported files grow linearly with the number of nodes, and a single namespace supports tens of billions of files. In terms of operation and maintenance, the OceanStor Pacific series exposes a native HDFS interface, which provides better performance and user experience. Old and new clusters can coexist through ViewFS or an HBase metadata gateway, so users can move smoothly from integrated storage and compute to separated storage and compute while protecting their existing investment.
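The coexistence idea behind a ViewFS-style mount table can be sketched as follows: a client-side table maps logical paths to the cluster that actually holds them, so applications keep one namespace while data moves from the old cluster to the new storage. This is an illustrative sketch, not Hadoop's or Huawei's implementation; the cluster names, paths, and the `resolve` helper are hypothetical.

```python
# Hypothetical mount table: logical path prefix -> physical location.
MOUNT_TABLE = {
    "/warehouse/legacy": "hdfs://old-cluster/warehouse/legacy",
    "/warehouse/new": "hdfs://new-storage/warehouse/new",
}


def resolve(path: str, mount_table: dict) -> str:
    """Translate a logical path into the URI of the cluster that owns it.

    Mount points are checked longest first so that a more specific
    entry wins over a shorter one that is also a prefix of the path.
    """
    for mount_point in sorted(mount_table, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            return mount_table[mount_point] + path[len(mount_point):]
    raise KeyError(f"no mount point covers {path}")


if __name__ == "__main__":
    print(resolve("/warehouse/legacy/orders/part-0", MOUNT_TABLE))
    print(resolve("/warehouse/new/events/part-0", MOUNT_TABLE))
```

Migrating a directory then amounts to copying its data and repointing one table entry, while the paths that applications use stay the same.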