Distcp S3-hdfs for eu-central-1 (AWS Frankfurt) – details “behind the trenches”

  Distcp (distributed copy) is a fairly old tool for moving large numbers of files, usually within HDFS. It runs as a MapReduce job: the driver builds the list of source files, and the map tasks do the heavy lifting of the copy (the job is map-only, with no reducers). Another useful feature is that it can also handle file migrations between HDFS and AWS …