Hadoop-based services for Windows include an FTP server that operates directly on the Hadoop Distributed File System (HDFS). The FTPS protocol is used for secure transfers. FTP communication is wire-efficient and especially suited to transferring large data sets. The steps below describe how to use the FTP server.

  1. Log in to the portal at http://www.hadooponazure.com/.
  2. Click the Open Ports tile to access the FTP server port configuration.
  3. Click the toggle switch to open the FTPS port, 2226.
  4. Click the arrow to go back to the account page. You should see the FTPS port opened as shown below.
  5. To communicate with the FTP server, you will need an MD5 hash of the password for your account. You can get this from the cluster by clicking the Remote Desktop tile shown in the screenshot above.
  6. The portal will ask you whether you want to open or save an RDP connection file. Click Open to open the remote desktop connection.
  7. On the Remote Desktop Connection dialog, click Connect.
  8. On the Windows Security dialog, enter the correct password to authenticate yourself for the remote desktop connection. Then click OK.
  9. Click Start->Run.
  10. In the Run dialog, enter notepad c:\apps\dist\ftp\users.conf and press Enter to view the FTP configuration file in Notepad.
  11. Copy the MD5 hash of the password for the user account that will be used to transfer files over FTPS.
  12. Use the user name and the MD5 hash of the password to authenticate with the FTP server over FTPS port 2226. For example, the following screenshot shows the transfer of C:\MyBigDataFolder\MyBigData.dat to the Hadoop cluster named myhadoopcluster.cloudapp.net using a user named “MyUser” along with the MD5 hash of that user’s password.
    Note: Certificate validation is turned off (“-k”) in this use of curl.exe. This is because the FTP server on the Hadoop cluster uses a self-signed certificate that might not be fully trusted. CURL is a commonly used FTP client utility for scripting the FTP transfer of files.

     Note: If you receive an FTP error 550 (permission denied), make sure you are attempting your transfer to a directory for which the user has write permissions.

     

  13. You can verify the file transfer by logging back in to the portal at http://www.hadooponazure.com.
  14. Click the Interactive Console tile, then run a directory listing of the destination directory in the console to verify the FTP transfer. For example, “#ls /uploads”…
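
The upload in step 12 can also be scripted. The following Python sketch only assembles the curl.exe command line for such a transfer; it does not contact the server. The file, cluster, and user names are the ones from step 12, and /uploads is the directory from the example in step 14. The helper name build_curl_upload_command, the placeholder hash value, and the use of an ftps:// URL are assumptions made for illustration, so adjust them to match your cluster.

```python
def build_curl_upload_command(local_path, host, user, password_md5,
                              remote_path, port=2226):
    """Return the argument list for a curl.exe FTPS upload (hypothetical helper)."""
    # Assumption: the server accepts an ftps:// URL on the FTPS port.
    url = "ftps://{0}:{1}/{2}".format(host, port, remote_path.lstrip("/"))
    return [
        "curl.exe",
        "-k",                                      # skip certificate validation (self-signed certificate)
        "-T", local_path,                          # local file to upload
        "-u", "{0}:{1}".format(user, password_md5),  # user name and MD5 hash copied from users.conf
        url,
    ]


if __name__ == "__main__":
    command = build_curl_upload_command(
        local_path=r"C:\MyBigDataFolder\MyBigData.dat",
        host="myhadoopcluster.cloudapp.net",
        user="MyUser",
        password_md5="<MD5-hash-from-users.conf>",  # placeholder, not a real hash
        remote_path="/uploads/MyBigData.dat",
    )
    # Print the command so it can be reviewed before it is run.
    print(" ".join(command))
```

To run the transfer from the script instead of printing it, the list can be passed to subprocess.run(command, check=True), provided curl.exe is on the PATH.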
     

More Information