PDB FTP change unsupported #37

Closed
kristycarp opened this issue Oct 18, 2023 · 2 comments · Fixed by #38

kristycarp commented Oct 18, 2023

When updating the localpdb files (with localpdb_setup -db_path [DB PATH HERE] --update), I get the following error:

Failed to download url: 'http://ftp.rcsb.org/pub/pdb/derived_data/index/entries.idx' to destination: '[DESTINATION FOLDER]/localpdb/data/20231013/pdb_entries.txt'
Failed to download file_type: "entries"

I am fairly certain this is because of the (recent?) PDB change that makes their HTTP downloads available at files.rcsb.org, whereas their FTP downloads are still at ftp.rcsb.org (see https://www.wwpdb.org/ftp/pdb-ftp-sites).
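
To make the suspected cause concrete, here is a minimal sketch of the host swap described above. It assumes that the HTTP downloads moved to files.rcsb.org with the same path layout as before, which this thread does not confirm; `rewrite_rcsb_http_url` is a hypothetical helper, not part of localpdb.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: HTTP(S) downloads moved to this host and kept the same paths.
OLD_HTTP_HOST = "ftp.rcsb.org"
NEW_HTTP_HOST = "files.rcsb.org"


def rewrite_rcsb_http_url(url: str) -> str:
    """Point an http(s) URL on the old host at the new host.

    URLs with any other scheme (e.g. ftp://) or host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.hostname == OLD_HTTP_HOST:
        parts = parts._replace(netloc=NEW_HTTP_HOST)
    return urlunsplit(parts)


# The URL from the error message above, rewritten to the new host.
print(rewrite_rcsb_http_url(
    "http://ftp.rcsb.org/pub/pdb/derived_data/index/entries.idx"
))
```

FTP URLs are left alone on purpose, since the FTP service is reported to still live at ftp.rcsb.org.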

I tried to tweak the remote_sources.yml file to reflect these changes, but was only able to successfully update my localpdb files with these very janky changes to the localpdb code:

  • changing download_proto within the rcsb mirror from http to ftp in remote_sources.yml
  • adding the argument ftp=True to any call to download_url in PDBDownloader.py
  • getting rid of any versioning or modification checks in PDBDownloader.py (I suspect this is a very bad thing to do!)

I had to do all of this because 1) simply changing the url of the rcsb mirror in remote_sources.yml to files.rcsb.org led to an error saying that ftp://files.rcsb.org is inaccessible, and I couldn't figure out how to make it use http://files.rcsb.org instead; and 2) simply changing the download_proto of the rcsb mirror in remote_sources.yml from http to ftp led to version-check errors that were only fixed by the other tweaks mentioned above.
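
For readers who only need to check whether a given URL is reachable over one protocol or the other, the following is a rough, standalone sketch of a protocol-agnostic download using only the Python standard library. `fetch` is a hypothetical helper, not localpdb's `download_url`, and it performs none of the versioning or modification checks mentioned above.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url: str, dest: Path, timeout: float = 30.0) -> Path:
    """Download url to dest; urlopen handles http://, https:// and ftp://."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file at the final destination.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(dest)
    return dest


# Example call (needs network access, so it is left commented out):
# fetch("ftp://ftp.rcsb.org/pub/pdb/derived_data/index/entries.idx",
#       Path("pdb_entries.txt"))
```

Because the scheme is taken from the URL itself, the same call can be pointed at either host to see which combination of protocol and address actually responds.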

However, I suspect that this is not the smartest of solutions and that other users and I would benefit from a fix from the localpdb creators that either properly changes the protocol type from HTTP to FTP, or properly updates the HTTP address to files.rcsb.org.

jludwiczak (Member) commented

Hi @kristycarp - thanks for raising the issue and providing the temporary workaround.
I believe setting the -mirror option to either pdbe or pdbj can solve this too (these are identical replicas of the default rcsb mirror); at least this was helpful in a recent e-mail conversation regarding a similar, or the same, issue.
I'm away now but I'll look into this more thoroughly at the beginning of November and provide a proper fix.

@jludwiczak jludwiczak added the bug Something isn't working label Oct 24, 2023
@jludwiczak jludwiczak self-assigned this Oct 28, 2023

KYQiu21 commented Nov 2, 2023

Setting the mirror as pdbe works for me. Thanks.
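
As a sketch of the mirror workaround, the snippet below assembles the command line from the flags quoted in this thread (`-db_path`, `--update`, `-mirror`). It assumes the three flags can be combined in a single invocation, which the thread implies but does not spell out; `build_update_command` is a hypothetical helper for illustration only.

```python
import shlex

# Mirror names mentioned in this thread.
KNOWN_MIRRORS = ("rcsb", "pdbe", "pdbj")


def build_update_command(db_path: str, mirror: str = "pdbe") -> list:
    """Return the argv list for a localpdb update against the given mirror."""
    if mirror not in KNOWN_MIRRORS:
        raise ValueError(f"unknown mirror: {mirror!r}")
    # Assumption: --update and -mirror are accepted together.
    return ["localpdb_setup", "-db_path", str(db_path), "--update",
            "-mirror", mirror]


# Print the command so it can be reviewed before running it by hand.
print(shlex.join(build_update_command("/path/to/localpdb")))
```

The list form can be passed directly to `subprocess.run` if the command should be launched from Python instead.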

@jludwiczak jludwiczak mentioned this issue Nov 5, 2023
jludwiczak added a commit that referenced this issue Nov 5, 2023
* Fix ftp/http loc due to recent URL change in RCSB

* Allow more recent pandas

* Bump python and pandas versions, update lock file

* PDBSeqresMapper standalone

* Further updates in PDBSeqresMapper

* Update dependencies to serve docs properly

* Bump to v0.2.8

---------

Co-authored-by: Jan Ludwiczak <j.ludwiczak@cent.uw.edu.pl>