Can't run HRNET auto annotations, Connection refused #5744

Closed
2 tasks done
dwffls opened this issue Feb 21, 2023 · 4 comments

Comments

@dwffls

dwffls commented Feb 21, 2023

My actions before raising this issue

Steps to Reproduce (for bugs)

  1. Pull the latest repository
  2. Follow the instructions to install nuctl
  3. Run serverless/deploy_gpu.sh serverless/pytorch/saic-vul/hrnet to deploy the HRNET nuclio function container
  4. Run docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d

Expected Behaviour

HRNET auto annotation should be available in CVAT.

Current Behaviour

After running deploy_gpu.sh, the nuclio function container appears to be ready. However, when using auto annotation, an error pops up reporting:
Error: Request failed with status code 503. "HTTPConnectionPool(host='host.docker.internal', port=32783): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3730196d30>: Failed to establish a new connection: [Errno 111] Connection refused'))".

The nuclio web UI shows the container as unhealthy. After pressing Deploy there, an error pops up saying: open serverless/pytorch/saic-vul/hrnet/nuclio/function-gpu.yaml: no such file or directory

I tried running another annotation network (YOLOv7) and it works without a problem on both GPU and CPU.
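
The [Errno 111] Connection refused in the error above means that nothing accepted a TCP connection at that host and port. Below is a minimal sketch of a reachability check that could be run from inside the CVAT server container, assuming Python is available there. The helper name port_is_open is hypothetical and is not part of CVAT or nuclio; the host and port in the usage comment are the ones from the error message.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Usage (values taken from the error message in this report):
#   port_is_open("host.docker.internal", 32783)
```

A False result only says the port is not reachable from where the check runs; it does not say why the function container is not listening.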

Possible Solution

Context

Your Environment

Docker version:

 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb  9 19:46:56 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       bc3805a
  Built:            Thu Feb  9 19:46:56 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.18
  GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Nuclio build log:

23.02.21 13:30:21.879                     nuctl (I) Project created {"Name": "cvat", "Namespace": "nuclio"}
Deploying serverless/pytorch/saic-vul/hrnet function...
23.02.21 13:30:22.295                     nuctl (I) Deploying function {"name": ""}
23.02.21 13:30:22.295                     nuctl (I) Building {"builderKind": "docker", "versionInfo": "Label: 1.8.14, Git commit: cbb0774230996a3eb4621c1a2079e2317578005b, OS: linux, Arch: amd64, Go version: go1.17.8", "name": ""}
23.02.21 13:30:22.651                     nuctl (I) Staging files and preparing base images
23.02.21 13:30:22.652                     nuctl (I) Building processor image {"registryURL": "", "imageName": "cvat-pth.saic-vul.hrnet:latest"}
23.02.21 13:30:22.652     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/handler-builder-python-onbuild:1.8.14-amd64"}
23.02.21 13:30:25.457     nuctl.platform.docker (W) Docker command outputted to stderr - this may result in errors {"workingDir": "/tmp/nuclio-build-4211351214/staging", "cmd": "docker build --network host --force-rm -t nuclio-onbuild-cfqbinp2nhl9hfo5odig -f /tmp/nuclio-build-4211351214/staging/Dockerfile.onbuild   --build-arg NUCLIO_LABEL=1.8.14 --build-arg NUCLIO_ARCH=amd64 --build-arg NUCLIO_BUILD_LOCAL_HANDLER_DIR=handler  .", "stderr": "#1 [internal] load .dockerignore\n#1 transferring context: 2B done\n#1 DONE 0.7s\n\n#2 [internal] load build definition from Dockerfile.onbuild\n#2 transferring dockerfile: 142B done\n#2 DONE 0.7s\n\n#3 [internal] load metadata for quay.io/nuclio/handler-builder-python-onbuild:1.8.14-amd64\n#3 DONE 0.0s\n\n#4 [1/1] FROM quay.io/nuclio/handler-builder-python-onbuild:1.8.14-amd64\n#4 CACHED\n\n#5 exporting to image\n#5 exporting layers done\n#5 writing image sha256:a7adbd3a9671ac582fe78d151a7afd660f512c70eac942b99ef29b5372243015 done\n#5 naming to docker.io/library/nuclio-onbuild-cfqbinp2nhl9hfo5odig done\n#5 DONE 0.0s\n"}
23.02.21 13:30:27.972     nuctl.platform.docker (I) Pulling image {"imageName": "quay.io/nuclio/uhttpc:0.0.1-amd64"}
23.02.21 13:30:31.143     nuctl.platform.docker (W) Docker command outputted to stderr - this may result in errors {"workingDir": "/tmp/nuclio-build-4211351214/staging", "cmd": "docker build --network host --force-rm -t nuclio-onbuild-cfqbiph2nhl9hfo5odj0 -f /tmp/nuclio-build-4211351214/staging/Dockerfile.onbuild   --build-arg NUCLIO_LABEL=1.8.14 --build-arg NUCLIO_ARCH=amd64 --build-arg NUCLIO_BUILD_LOCAL_HANDLER_DIR=handler  .", "stderr": "#1 [internal] load .dockerignore\n#1 transferring context: 2B done\n#1 DONE 0.1s\n\n#2 [internal] load build definition from Dockerfile.onbuild\n#2 transferring dockerfile: 117B done\n#2 DONE 0.1s\n\n#3 [internal] load metadata for quay.io/nuclio/uhttpc:0.0.1-amd64\n#3 DONE 0.0s\n\n#4 [1/1] FROM quay.io/nuclio/uhttpc:0.0.1-amd64\n#4 CACHED\n\n#5 exporting to image\n#5 exporting layers done\n#5 writing image sha256:9943b85af94c13d2c584c901927b0b5c771dd68b142f846151c555a1bd049fb2 0.0s done\n#5 naming to docker.io/library/nuclio-onbuild-cfqbiph2nhl9hfo5odj0\n#5 naming to docker.io/library/nuclio-onbuild-cfqbiph2nhl9hfo5odj0 0.0s done\n#5 DONE 0.0s\n"}
23.02.21 13:30:33.007            nuctl.platform (I) Building docker image {"image": "cvat-pth.saic-vul.hrnet:latest"}
23.02.21 13:30:34.578     nuctl.platform.docker (W) Docker command outputted to stderr - this may result in errors {"workingDir": "/tmp/nuclio-build-4211351214/staging", "cmd": "docker build --network host --force-rm -t cvat-pth.saic-vul.hrnet:latest -f /tmp/nuclio-build-4211351214/staging/Dockerfile.processor   --build-arg NUCLIO_ARCH=amd64 --build-arg NUCLIO_BUILD_LOCAL_HANDLER_DIR=handler --build-arg NUCLIO_LABEL=1.8.14  .", "stderr": "#1 [internal] load .dockerignore\n#1 transferring context: 2B done\n#1 DONE 0.3s\n\n#2 [internal] load build definition from Dockerfile.processor\n#2 transferring dockerfile: 2.05kB done\n#2 DONE 0.4s\n\n#3 [internal] load metadata for docker.io/library/ubuntu:20.04\n#3 DONE 0.6s\n\n#4 [ 1/24] FROM docker.io/library/ubuntu:20.04@sha256:4a45212e9518f35983a976eead0de5eecc555a2f047134e9dd2cfc589076a00d\n#4 DONE 0.0s\n\n#5 [internal] load build context\n#5 transferring context: 1.88MB 0.0s done\n#5 DONE 0.1s\n\n#6 [17/24] RUN pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html\n#6 CACHED\n\n#7 [13/24] RUN pip3 install setuptools\n#7 CACHED\n\n#8 [ 3/24] RUN add-apt-repository ppa:deadsnakes/ppa\n#8 CACHED\n\n#9 [14/24] RUN pip3 install -r requirements.txt\n#9 CACHED\n\n#10 [ 5/24] RUN apt-get update && apt-get install -y --no-install-recommends build-essential git curl libglib2.0-0 software-properties-common python3 python3.6-dev python3-pip python3-tk\n#10 CACHED\n\n#11 [22/24] COPY artifacts/uhttpc /usr/local/bin/uhttpc\n#11 CACHED\n\n#12 [18/24] WORKDIR /opt/nuclio\n#12 CACHED\n\n#13 [19/24] COPY artifacts/processor /usr/local/bin/processor\n#13 CACHED\n\n#14 [ 7/24] RUN pip3 install --upgrade pip\n#14 CACHED\n\n#15 [23/24] COPY handler /opt/nuclio\n#15 CACHED\n\n#16 [20/24] COPY artifacts/py /opt/nuclio/\n#16 CACHED\n\n#17 [15/24] RUN apt update && apt install -y libgl1-mesa-glx\n#17 CACHED\n\n#18 [ 9/24] RUN git clone 
https://github.com/saic-vul/ritm_interactive_segmentation.git hrnet\n#18 CACHED\n\n#19 [ 6/24] RUN ln -s /usr/bin/pip3 /usr/local/bin/pip && ln -s /usr/bin/python3 /usr/bin/python\n#19 CACHED\n\n#20 [12/24] RUN wget https://github.com/saic-vul/ritm_interactive_segmentation/releases/download/v1.0/coco_lvis_h18_itermask.pth\n#20 CACHED\n\n#21 [11/24] RUN apt-get install -y --no-install-recommends wget\n#21 CACHED\n\n#22 [ 4/24] RUN apt remove python* -y\n#22 CACHED\n\n#23 [21/24] COPY artifacts/py3.8-whl /opt/nuclio/whl\n#23 CACHED\n\n#24 [ 2/24] RUN apt-get update && apt-get install software-properties-common -y\n#24 CACHED\n\n#25 [ 8/24] WORKDIR /opt/nuclio\n#25 CACHED\n\n#26 [16/24] RUN pip3 uninstall torch torch vision -y\n#26 CACHED\n\n#27 [10/24] WORKDIR /opt/nuclio/hrnet\n#27 CACHED\n\n#28 [24/24] RUN python /opt/nuclio/whl/$(basename /opt/nuclio/whl/pip-*.whl)/pip install pip --no-index --find-links /opt/nuclio/whl && python -m pip install nuclio-sdk msgpack --no-index --find-links /opt/nuclio/whl\n#28 CACHED\n\n#29 exporting to image\n#29 exporting layers done\n#29 writing image sha256:22a0f7464631adf151e7b756c3cc5709b9b8d6d3bf9d0129850a08ff92d2e357 done\n#29 naming to docker.io/library/cvat-pth.saic-vul.hrnet:latest done\n#29 DONE 0.0s\n"}
23.02.21 13:30:34.578            nuctl.platform (I) Pushing docker image into registry {"image": "cvat-pth.saic-vul.hrnet:latest", "registry": ""}
23.02.21 13:30:34.578            nuctl.platform (I) Docker image was successfully built and pushed into docker registry {"image": "cvat-pth.saic-vul.hrnet:latest"}
23.02.21 13:30:34.578                     nuctl (I) Build complete {"result": {"Image":"cvat-pth.saic-vul.hrnet:latest","UpdatedFunctionConfig":{"metadata":{"name":"pth-saic-vul-hrnet","namespace":"nuclio","labels":{"nuclio.io/project-name":"cvat"},"annotations":{"animated_gif":"https://raw.githubusercontent.com/opencv/cvat/develop/site/content/en/images/hrnet_example.gif","framework":"pytorch","help_message":"The interactor allows to get a mask for an object using positive points, and negative points","min_neg_points":"0","min_pos_points":"1","name":"HRNET","spec":"","type":"interactor","version":"2"}},"spec":{"description":"HRNet18 for click based interactive segmentation","handler":"main:handler","runtime":"python:3.8","env":[{"name":"PYTHONPATH","value":"/opt/nuclio/hrnet"}],"resources":{"limits":{"nvidia.com/gpu":"1"},"requests":{"cpu":"25m","memory":"1Mi"}},"image":"cvat-pth.saic-vul.hrnet:latest","targetCPU":75,"triggers":{"myHttpTrigger":{"class":"","kind":"http","name":"myHttpTrigger","maxWorkers":1,"workerAvailabilityTimeoutMilliseconds":10000,"attributes":{"maxRequestBodySize":33554432}}},"volumes":[{"volume":{"name":"volume-1","hostPath":{"path":"/mercury/admin_appdata/cvat/cvat-2.3.0/serverless/common"}},"volumeMount":{"name":"volume-1","mountPath":"/opt/nuclio/common"}}],"build":{"functionConfigPath":"serverless/pytorch/saic-vul/hrnet/nuclio/function-gpu.yaml","image":"cvat-pth.saic-vul.hrnet","baseImage":"ubuntu:20.04","directives":{"preCopy":[{"kind":"ENV","value":"DEBIAN_FRONTEND=noninteractive"},{"kind":"RUN","value":"apt-get update && apt-get install software-properties-common -y"},{"kind":"RUN","value":"add-apt-repository ppa:deadsnakes/ppa"},{"kind":"RUN","value":"apt remove python* -y"},{"kind":"RUN","value":"apt-get update && apt-get install -y --no-install-recommends build-essential git curl libglib2.0-0 software-properties-common python3 python3.6-dev python3-pip python3-tk"},{"kind":"RUN","value":"ln -s /usr/bin/pip3 /usr/local/bin/pip && 
ln -s /usr/bin/python3 /usr/bin/python"},{"kind":"RUN","value":"pip3 install --upgrade pip"},{"kind":"WORKDIR","value":"/opt/nuclio"},{"kind":"RUN","value":"git clone https://github.com/saic-vul/ritm_interactive_segmentation.git hrnet"},{"kind":"WORKDIR","value":"/opt/nuclio/hrnet"},{"kind":"RUN","value":"apt-get install -y --no-install-recommends wget"},{"kind":"RUN","value":"wget https://github.com/saic-vul/ritm_interactive_segmentation/releases/download/v1.0/coco_lvis_h18_itermask.pth"},{"kind":"RUN","value":"pip3 install setuptools"},{"kind":"RUN","value":"pip3 install -r requirements.txt"},{"kind":"RUN","value":"apt update && apt install -y libgl1-mesa-glx"},{"kind":"RUN","value":"pip3 uninstall torch torch vision -y"},{"kind":"RUN","value":"pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html"},{"kind":"WORKDIR","value":"/opt/nuclio"}]},"codeEntryType":"image"},"platform":{"attributes":{"mountMode":"volume","restartPolicy":{"maximumRetryCount":3,"name":"always"}}},"readinessTimeoutSeconds":120,"securityContext":{},"eventTimeout":"30s"}}}}
23.02.21 13:30:34.581                     nuctl (I) Cleaning up before deployment {"functionName": "pth-saic-vul-hrnet"}
23.02.21 13:30:34.700                     nuctl (I) Function already exists, deleting function containers {"functionName": "pth-saic-vul-hrnet"}
23.02.21 13:30:37.439            nuctl.platform (I) Waiting for function to be ready {"timeout": 120}
23.02.21 13:30:38.952                     nuctl (I) Function deploy complete {"functionName": "pth-saic-vul-hrnet", "httpPort": 34357, "internalInvocationURLs": ["172.17.0.5:8080"], "externalInvocationURLs": []}
  NAMESPACE |        NAME        | PROJECT | STATE | REPLICAS | NODE PORT  
  nuclio    | pth-saic-vul-hrnet | cvat    | ready | 1/1      |     34357
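
The final nuctl log line reports the port the function was published on (httpPort), which can be compared with the port named in the CVAT error. A rough sketch of pulling it out, assuming the log format shown above (free text followed by a JSON payload); parse_nuctl_payload is a hypothetical helper, not a nuclio API:

```python
import json


def parse_nuctl_payload(line: str) -> dict:
    """Parse the JSON object that trails a nuctl log line, or return {} if none."""
    start = line.find("{")
    if start == -1:
        return {}
    return json.loads(line[start:])


# The "Function deploy complete" line from the build log in this report.
line = (
    '23.02.21 13:30:38.952                     nuctl (I) Function deploy complete '
    '{"functionName": "pth-saic-vul-hrnet", "httpPort": 34357, '
    '"internalInvocationURLs": ["172.17.0.5:8080"], "externalInvocationURLs": []}'
)
payload = parse_nuctl_payload(line)
print(payload["httpPort"])  # 34357
```

In this report the error message mentions port 32783 while the log shows 34357. The two may simply come from different deployments, so the mismatch alone is not conclusive.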
dwffls changed the title from "Can't run auto annotiations, function-gpu.yaml not found" to "Can't run HRNET auto annotations" Feb 21, 2023
dwffls changed the title from "Can't run HRNET auto annotations" to "Can't run HRNET auto annotations, Connection refused" Feb 21, 2023
@onurtore

The same problem happens to us in the online version.

@kompaqt

kompaqt commented Mar 6, 2023

It works for me after changing these two lines in function-gpu.yaml:

- kind: RUN
  value: pip3 uninstall torch torch vision numpy -y
- kind: RUN
  value: pip install numpy==1.20 torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html

It seems to rely on an older numpy version. Uninstalling and forcing version 1.20 worked in my case.
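
To confirm that the pin took effect, one could check the numpy version inside the function container. A small sketch follows; version_has_prefix is a hypothetical helper, and the usage comment assumes numpy is importable in that container:

```python
def version_has_prefix(version: str, prefix: str) -> bool:
    """True if the dotted components of `version` start with those of `prefix`."""
    v = version.split(".")
    p = prefix.split(".")
    # Compare whole components so that "1.2" does not match the prefix "1.20".
    return v[:len(p)] == p


# Usage inside the container (assumes numpy is installed there):
#   import numpy
#   print(numpy.__version__, version_has_prefix(numpy.__version__, "1.20"))
```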

@kompaqt

kompaqt commented Mar 6, 2023

Also, #5574 should fix the issue permanently once it is approved and the tests pass.

@dwffls
Author

dwffls commented Mar 6, 2023

No idea how I missed the pull request.
Anyway, I tested the pull request and it works!

Thanks for the help

dwffls closed this as completed Mar 6, 2023