How do I connect to a Hadoop cluster?

import hadoop

fs = hadoop.HadoopDFS("username","password","ugi",64310)
fs.disconnect()

All of the examples below use the same HadoopDFS constructor call; substitute the connection parameters of your own cluster.

How do I get the current working directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.getWorkingDirectory()
fs.disconnect()

How do I change the current working directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.setWorkingDirectory("/user/ns-lsp/logs")
fs.disconnect()

setWorkingDirectory() returns -1 if the directory does not exist and 0 on success.

How do I check whether a file or directory exists?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.pathExists("/user/ns-lsp/logs")
fs.disconnect()

pathExists() returns 0 if the file or directory exists and -1 if it does not.

How do I create a directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.createDirectory("/user/ns-lsp/logs/cjj")
fs.disconnect()

createDirectory() returns -1 if the directory already exists and 0 if it was created successfully.

How do I get the current default block size?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.getDefaultBlockSize()
fs.disconnect()

How do I list the files and directories under a directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.listDirectory("/user/ns-lsp/logs")
fs.disconnect()

How do I move a file or directory?

Moving a file within the same HDFS:

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.move("/user/ns-lsp/logs/cjj","/user/ns-lsp/logs/cjj_new")
fs.disconnect()

Moving a file between two different HDFS clusters:

target_fs = hadoop.HadoopDFS("username","password","ugi",64310)
fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.move("/user/ns-lsp/logs/cjj","/user/ns-lsp/logs/cjj_new",target_fs)
fs.disconnect()

How do I delete a file or directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.delete("/user/ns-lsp/logs/cjj_new")
fs.disconnect()

How do I rename a file or directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.rename("/user/ns-lsp/logs/cjj","/user/ns-lsp/logs/cjj1")
fs.disconnect()

How do I change the permissions of a file or directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.chmod("/user/ns-lsp/logs/cjj",7)
fs.disconnect()

How do I find the names of the servers holding a file's blocks?

Sometimes we need to know which servers a file's blocks are stored on; getHosts() can be used as follows:

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.getHosts("/user/ns-lsp/logs/cjj/a",0,1)
fs.disconnect()

It returns a list of server names:

$ python gethosts.py
['xxxx']

How do I get information about a file or directory?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
pathinfo = fs.getPathInfo("/user/ns-lsp/logs/cjj")
fs.disconnect()

getPathInfo() returns an hdfsFileInfo object (see the sketch at the end of this page for one way to inspect it).

How do I set the replication factor of a file?

fs = hadoop.HadoopDFS("username","password","ugi",64310)
print fs.setReplication("/user/ns-lsp/logs/cjj/a",3)
fs.disconnect()

How do I open a file and read its data?

To operate on a file, create a HadoopFile object and read the data with its read() method:

fs = hadoop.HadoopDFS("username","password","ugi",64310)
fh = hadoop.HadoopFile(fs,'/user/ns-lsp/logs/cjj/a')
print fh.read()
fh.close()
fs.disconnect()
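How do I check the return codes in a real script?

Most of the HadoopDFS methods above report success as 0 and failure as -1 rather than raising exceptions, so a typical script checks these return values explicitly. The following is a minimal sketch, not part of the original examples: it combines several of the calls documented above, the connection parameters and the /user/ns-lsp/logs paths are placeholders, and the log_dir variable name is introduced only for illustration.

import hadoop

# Placeholder connection parameters -- replace with your cluster's values.
fs = hadoop.HadoopDFS("username", "password", "ugi", 64310)

log_dir = "/user/ns-lsp/logs/cjj"

# pathExists() returns 0 when the path exists and -1 when it does not.
if fs.pathExists(log_dir) != 0:
    # createDirectory() returns 0 on success and -1 otherwise.
    if fs.createDirectory(log_dir) != 0:
        print "failed to create %s" % log_dir

# List the parent directory to confirm the result.
print fs.listDirectory("/user/ns-lsp/logs")

fs.disconnect()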
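How do I see what is inside the hdfsFileInfo returned by getPathInfo()?

The page above does not describe the fields of hdfsFileInfo, and the attribute names exposed by the binding are not documented here, so the sketch below simply dumps whatever public attributes Python finds on the object rather than assuming specific field names. The path and connection parameters are the same placeholders used throughout this page.

import hadoop

fs = hadoop.HadoopDFS("username", "password", "ugi", 64310)

pathinfo = fs.getPathInfo("/user/ns-lsp/logs/cjj")

# Print whatever attributes the binding exposes on the hdfsFileInfo object;
# the exact names (size, owner, modification time, ...) depend on the binding.
for name in dir(pathinfo):
    if not name.startswith("_"):
        print name, getattr(pathinfo, name)

fs.disconnect()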