WebHDFS/HttpFS - uploading a JAR file does not work properly
I'm trying to upload a Java jar file to an HDFS cluster through WebHDFS, behind an HttpFS gateway.
I tried this curl command:
$ curl -v -X PUT --data-binary @myfile.jar "http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser" -H "Content-Type: application/octet-stream"
It seems to work:
* Trying myhost...
* Connected to smyhost (myhost) port 14000 (#0)
> PUT /webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser HTTP/1.1
> Host: myhost:14000
> User-Agent: curl/7.43.0
> Accept: */*
> Content-Type: application/octet-stream
> Content-Length: 2566043
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 307 Temporary Redirect
< X-Powered-By: Express
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: HEAD, POST, GET, OPTIONS, DELETE
< Access-Control-Allow-Headers: origin, content-type, X-Auth-Token, Tenant-ID, Authorization
< server: Apache-Coyote/1.1
< set-cookie: hadoop.auth="u=myuser&p=myuser&t=simple&e=1466112799770&s=nf0V1RauYozVoVVvR+PxHZnGJ1E="; Version=1; Path=/; Expires=Thu, 16-Jun-2016 21:33:11 GMT; HttpOnly
< location: http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true
< Content-Type: application/json; charset=utf-8
< content-length: 0
< date: Thu, 16 Jun 2016 11:33:11 GMT
< connection: close
<
* Closing connection 0
* Issue another request to this URL: 'http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true'
* Trying myhost...
* Connected to myhost (myhost) port 14000 (#1)
> PUT /webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true HTTP/1.1
> Host: myhost:14000
> User-Agent: curl/7.43.0
> Accept: */*
> Content-Type: application/octet-stream
> Content-Length: 2566043
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 201 Created
< X-Powered-By: Express
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: HEAD, POST, GET, OPTIONS, DELETE
< Access-Control-Allow-Headers: origin, content-type, X-Auth-Token, Tenant-ID, Authorization
< server: Apache-Coyote/1.1
< set-cookie: hadoop.auth="u=myuser&p=myuser&t=simple&e=1466112820064&s=p0i2IQ4Nbn2zytazKB1hHe3Dv+4="; Version=1; Path=/; Expires=Thu, 16-Jun-2016 21:33:48 GMT; HttpOnly
< Content-Type: application/json; charset=utf-8
< content-length: 0
< date: Thu, 16 Jun 2016 11:33:48 GMT
< connection: close
<
* Closing connection 1
But when I try to use the jar, I get an error:
$ sudo -u myuser hadoop fs -copyToLocal /user/myuser/myfile.jar /home/myuser
$ sudo -u myuser jar -tf /home/myuser/myfile.jar
java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:215)
at java.util.zip.ZipFile.<init>(ZipFile.java:145)
at java.util.zip.ZipFile.<init>(ZipFile.java:116)
at sun.tools.jar.Main.list(Main.java:1004)
at sun.tools.jar.Main.run(Main.java:245)
at sun.tools.jar.Main.main(Main.java:1177)
The thing is, the JAR file in HDFS is much larger than the original, so I suspect it was not uploaded properly:
$ ls -la myfile.jar
-rw-r--r-- 1 myuser myuser 2566043 14 jun 16:11 myfile.jar
$ sudo -u myuser hadoop fs -ls /user/myuser
Found 1 items
-rwxr-xr-x 3 myuser myuser 4620153 2016-06-16 13:10 /user/myuser/myfile.jar
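To check the round trip at the byte level rather than only by listed size, a small helper along these lines could be used. This is a minimal sketch: the function name compare_copies and the /tmp/roundtrip.jar path are illustrative assumptions, and it relies only on the standard wc and cmp utilities.

```shell
# Sketch: compare an original file with a copy fetched back from HDFS.
# compare_copies is a hypothetical helper name, not part of Hadoop or curl.

# Prints the size in bytes of each file, then returns 0 only if the
# two files are byte-for-byte identical.
compare_copies() {
    original=$1
    roundtrip=$2
    for f in "$original" "$roundtrip"; do
        # wc -c < file prints only the byte count; tr strips any padding
        size=$(wc -c < "$f" | tr -d ' ')
        echo "$f: $size bytes"
    done
    # cmp -s is silent and exits non-zero as soon as the contents differ
    cmp -s "$original" "$roundtrip"
}
```

For example, after fetching the file back with hadoop fs -copyToLocal /user/myuser/myfile.jar /tmp/roundtrip.jar, running compare_copies myfile.jar /tmp/roundtrip.jar || echo "files differ" would confirm whether the content changed somewhere between upload and download.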
With curl, using -T or --data-binary makes no difference. I wondered whether the problem was the Content-Type header, so I tried binary/octet-stream. However, HttpFS then returns HTTP Status 400 - Data upload requests must have content-type set to 'application/octet-stream'.
Any hints?