Source: http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html
Translator: Chris Lin ( Hes Sin , Lin) -- Chris @ Internet
----------------------------------------------------------------------------------------------------
Purpose:
This document is a starting point for users working with the Hadoop Distributed File System (HDFS), either as part of a Hadoop cluster or as a stand-alone, general-purpose distributed file system. While HDFS is designed to "just work" in many environments, a working knowledge of HDFS helps greatly with configuration improvements and diagnostics on a specific cluster.
Overview
HDFS is the primary distributed storage used by Hadoop applications. An HDFS cluster primarily consists of a NameNode, which manages the file system metadata, and DataNodes, which store the actual data. The architecture of HDFS is described in detail in the HDFS architecture document. This user guide primarily deals with how users and administrators interact with HDFS clusters. The diagram in the HDFS architecture document depicts the basic interactions among the NameNode, the DataNodes, and the clients. Clients contact the NameNode for file metadata or file modifications, and perform the actual file I/O directly with the DataNodes.
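The division of labor described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the HDFS API: the class names, method names, the tiny block size, and the round-robin block placement are all invented for illustration, and replication and failure handling are left out entirely. It shows only the shape of the interaction: the client asks the NameNode for metadata, then moves the bytes directly to and from the DataNodes.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy DataNode: stores block contents and knows nothing about file names."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Toy NameNode: holds metadata only (which blocks form a file, and where)."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.files = {}      # path -> list of (block_id, datanode name)
        self.next_block = 0

    def allocate_block(self, path):
        # Round-robin placement is an assumption made for this sketch.
        node = self.datanodes[self.next_block % len(self.datanodes)]
        block_id = self.next_block
        self.next_block += 1
        self.files.setdefault(path, []).append((block_id, node.name))
        return block_id, node.name

    def get_block_locations(self, path):
        return list(self.files[path])


class Client:
    """Toy client: metadata calls go to the NameNode, data goes to DataNodes."""
    BLOCK_SIZE = 4  # bytes; deliberately tiny so the example splits a short file

    def __init__(self, namenode, datanodes):
        self.namenode = namenode
        self.datanodes = {d.name: d for d in datanodes}

    def write(self, path, data):
        for start in range(0, len(data), self.BLOCK_SIZE):
            # Metadata step: ask the NameNode where the next block should go.
            block_id, node_name = self.namenode.allocate_block(path)
            # I/O step: send the bytes straight to that DataNode.
            chunk = data[start:start + self.BLOCK_SIZE]
            self.datanodes[node_name].write_block(block_id, chunk)

    def read(self, path):
        # Metadata step: look up block locations, then fetch from the DataNodes.
        locations = self.namenode.get_block_locations(path)
        return b"".join(
            self.datanodes[node_name].read_block(block_id)
            for block_id, node_name in locations
        )


datanodes = [DataNode("dn1"), DataNode("dn2")]
namenode = NameNode(datanodes)
client = Client(namenode, datanodes)

client.write("/demo.txt", b"hello hdfs")
print(client.read("/demo.txt"))
```

Note that the toy NameNode never sees file contents; it records only which block lives on which DataNode, mirroring the metadata and data split described in the overview.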
The following are some of the salient features that could be of interest to many users:
- Hadoop, including HDFS, is well suited for distributed storage and distributed processing using commodity hardware. It is fault tolerant, scalable, and extremely simple to expand. Map-Reduce, well known for its simplicity and its applicability to a large set of distributed applications, is an integral part of Hadoop.
- HDFS is highly configurable, with a default configuration well suited for many installations. Most of the time, the configuration needs to be tuned only for very large clusters.