Increasing the resolution of the mask #8

Closed
remybonnav opened this issue Sep 22, 2022 · 5 comments

Comments

@remybonnav

When I run the code, I get a mask that seems to be made of large 32x32 squares across the whole 512x512-pixel image.

Is there a way to increase the resolution of the prediction and thus the mask generated?

Thanks a lot in advance.

@PancrasPZ

Hi,
I'm also looking for a way to do this.
The low resolution seems to be due to the CLIP model...

So I'm trying to find another way to increase the resolution.
For example, use the mask created by CLIPSeg as a trimap and generate the final high-resolution mask with another project (like Deep-Image-Matting?).
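
The trimap idea above could be sketched roughly as follows. This is only an illustration, not code from CLIPSeg or Deep-Image-Matting: the function names `dilate` and `make_trimap`, the threshold, the band radius, and the 0/128/255 encoding are all assumptions, and the matting project may expect a different trimap format. The sketch marks pixels far inside the coarse mask as sure foreground, pixels far outside as background, and leaves a band around the blocky boundary as "unknown" for the matting model to resolve.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a square (2*radius+1) window, using only NumPy."""
    h, w = mask.shape
    padded = np.pad(mask, radius)  # pads with False
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def make_trimap(prob, threshold=0.5, radius=8):
    """Turn a coarse probability map into a trimap (hypothetical helper).

    Assumed encoding: 0 = background, 128 = unknown, 255 = foreground.
    """
    mask = prob > threshold
    # Erosion is the complement of the dilated complement.
    # Pixels outside the image count as foreground here, so the
    # image border itself does not erode the mask.
    sure_fg = ~dilate(~mask, radius)
    maybe_fg = dilate(mask, radius)
    trimap = np.zeros(mask.shape, dtype=np.uint8)
    trimap[maybe_fg] = 128
    trimap[sure_fg] = 255
    return trimap

# Toy input: a square "object" standing in for an upsampled CLIPSeg prediction.
prob = np.zeros((64, 64))
prob[16:48, 16:48] = 0.9
trimap = make_trimap(prob, radius=4)
```

The width of the unknown band (`radius`) would probably need to be at least as large as the blocks in the coarse mask, so that the true object boundary falls inside it.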

@remybonnav
Author

I still have to try this code in the automatic1111 web UI:
https://github.com/ThereforeGames/txt2mask
to see whether it is also limited to 512*512 with a low-resolution mask, or whether they manage to increase it.

@PancrasPZ

I have already tried this; they use a simple resize to increase the resolution.
I don't know whether it suits your needs.
In my case, the quality is not good enough.
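
For reference, a plain resize of the kind mentioned above might look like the sketch below. It is not taken from txt2mask; `resize_bilinear` is a hypothetical NumPy helper, and the 16x16 input is a made-up stand-in for a coarse prediction. Interpolating the soft probabilities before thresholding smooths the stair-step edges, but it cannot add detail that the low-resolution prediction does not contain, which is likely why the quality stays limited.

```python
import numpy as np

def resize_bilinear(img, out_h, out_w):
    """Bilinear resize of a 2-D float array (pixel centers aligned)."""
    in_h, in_w = img.shape
    # Map each output pixel center back to input coordinates.
    ys = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# Made-up coarse prediction: a soft disk on a 16x16 grid.
yy, xx = np.mgrid[0:16, 0:16]
low = 1.0 / (1.0 + np.exp(np.hypot(yy - 7.5, xx - 7.5) - 5.0))

high = resize_bilinear(low, 512, 512)  # interpolate the probabilities first
mask = high > 0.5                      # then threshold at full resolution
```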

I have to try in automatic1111 webUi this code https://github.com/ThereforeGames/txt2mask to see if it is also limited to 512*512 and with low resolution or if they manage to increase it

@timojl
Owner

timojl commented Sep 26, 2022

The block-like appearance of the binary segmentations is a known problem and is reported in the paper. I'll take a closer look and see whether it can be improved.

@timojl
Owner

timojl commented Sep 27, 2022

Adding a convolution that considers the neighborhood before projecting into pixel space solves this problem to some degree; see the complex_trans_conv option in the CLIPSeg model for details. The updated model and corresponding weights are available.
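
To illustrate why a neighborhood convolution helps, here is a toy NumPy sketch. It is not the repository's implementation of complex_trans_conv: the function names, shapes, and random weights are assumptions chosen only for the demonstration. When a transposed convolution uses a stride equal to its kernel size, every output block depends on exactly one token, so blocks are decided independently. Running a 3x3 convolution over the token grid first lets each block also see its neighboring tokens.

```python
import numpy as np

def transposed_conv_project(tokens, kernel):
    """Project a (D, H, W) token grid to pixels with a transposed convolution
    whose stride equals its kernel size, so output blocks do not overlap.

    kernel has shape (D, p, p); the result has shape (H*p, W*p).
    """
    d, h, w = tokens.shape
    p = kernel.shape[1]
    # out[i*p + a, j*p + b] = sum_d tokens[d, i, j] * kernel[d, a, b]
    blocks = np.einsum("dij,dab->iajb", tokens, kernel)
    return blocks.reshape(h * p, w * p)

def conv3x3(tokens, weight):
    """Zero-padded 3x3 convolution over the token grid.

    tokens: (D, H, W); weight: (D_out, D, 3, 3); returns (D_out, H, W).
    """
    d, h, w = tokens.shape
    padded = np.pad(tokens, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("od,dij->oij", weight[:, :, dy, dx], window)
    return out

# Arbitrary toy sizes and random weights (assumptions for illustration).
rng = np.random.default_rng(0)
D, H, W, P = 4, 6, 6, 4
tokens = rng.normal(size=(D, H, W))
kernel = rng.normal(size=(D, P, P))
weight = rng.normal(size=(D, D, 3, 3))

# Perturb a single interior token and count how many output pixels change.
bumped = tokens.copy()
bumped[:, 2, 3] += 1.0

plain_delta = transposed_conv_project(bumped, kernel) - transposed_conv_project(tokens, kernel)
smooth_delta = (transposed_conv_project(conv3x3(bumped, weight), kernel)
                - transposed_conv_project(conv3x3(tokens, weight), kernel))

plain_changed = int((np.abs(plain_delta) > 1e-9).sum())
smooth_changed = int((np.abs(smooth_delta) > 1e-9).sum())
print(plain_changed, smooth_changed)
```

In this toy setup the perturbation affects a single block without the convolution and a 3x3 neighborhood of blocks with it, which is the sense in which the extra layer can soften block boundaries.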

timojl closed this as completed Sep 27, 2022