Adding interfaces to popular Python packages #709

Open
6 tasks
padix-key opened this issue Nov 29, 2024 · 6 comments

Comments

@padix-key
Member

padix-key commented Nov 29, 2024

Although Biotite implements many algorithms from the bioinformatics community, there are still many tasks that cannot be done in Biotite, either because a method is not implemented yet, because it is too complex, or because it is out of scope. For this purpose Biotite provides the biotite.application subpackage, which offers interfaces to external software.
The logic of the core biotite.application.Application class inherently assumes that another process is running the desired functionality, either locally (e.g. Muscle) or on some server (currently only NCBI BLAST).

However, there are also many popular Python packages for various bioinformatics tasks, for which an interface would be reasonable in my opinion. This would be a non-exhaustive list:

Pressing these into the logic of the Application would be possible, but

  • it would be slower, due to file I/O instead of converting Python objects,
  • it would be counterintuitive, as users probably expect to use a Python library as a library instead of some static application run, and
  • it would be inflexible, as one would be unable to use the full extent of the programming language.

Hence, I would like to discuss an additional biotite.interface subpackage: Analogous to biotite.application it would contain the biotite.interface.xyz subpackage for the Python package xyz and the user is responsible for installing xyz separately. However, unlike biotite.application its purpose is to convert between native biotite objects and the corresponding objects in xyz. So the user could use the converted object directly within the same Python script.
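
To make the proposal more concrete, here is a minimal sketch of what such a conversion module could look like. Everything in it is an assumption for illustration: `NativeSequence` and `XyzSequence` are hypothetical stand-ins (not real Biotite or `xyz` classes), and `import_optional`, `to_xyz` and `from_xyz` are made-up names. It only shows the two ideas from the paragraph above: the interfaced package is imported lazily because the user installs it separately, and the conversion happens between in-memory objects without any file I/O.

```python
import importlib
from dataclasses import dataclass


def import_optional(package_name):
    """Import an optional dependency or fail with an actionable message."""
    try:
        return importlib.import_module(package_name)
    except ImportError as err:
        raise ImportError(
            f"'{package_name}' is required for this interface, "
            f"please install it separately"
        ) from err


# Hypothetical stand-ins: neither class is a real Biotite or 'xyz' type
@dataclass
class NativeSequence:
    symbols: str


@dataclass
class XyzSequence:
    residues: list


def to_xyz(sequence):
    """Convert the native object into the (hypothetical) 'xyz' object."""
    return XyzSequence(residues=list(sequence.symbols))


def from_xyz(xyz_sequence):
    """Convert the (hypothetical) 'xyz' object back into the native one."""
    return NativeSequence(symbols="".join(xyz_sequence.residues))
```

In this sketch a user would call `to_xyz()`, work with the result using the full API of the interfaced package, and call `from_xyz()` to get back to a native object, all within the same script.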

Pros:

  • A pythonic way to interface other packages in the bioinformatics ecosystem.
  • Ability to migrate some current extension packages directly into biotite.interface to reduce maintenance overhead (namely ammolite and molmarbles).

Cons:

  • The conversion code needs to be maintained as well and needs to be kept up to date with the interfaced package.
  • Breaking changes in the interfaced package lead to ambiguity: For which version of the package does the conversion function work?
    • Could possibly be solved with a function decorator that specifies the version.

@t0mdavid-m @JHKru @MaxGreil (and everybody else having an opinion on it 😉)

@padix-key
Member Author

I think the structure of https://github.com/biotite-dev/molmarbles is a good example of how I would envision such a biotite.interface subpackage.

@t0mdavid-m
Member

@poshul

@t0mdavid-m
Member

I like the idea of convenience functions converting Biotite objects to key data structures of popular packages as this can be cumbersome for users and seems to occur frequently. I would also add pyOpenMS to the list of packages that could be interfaced. I could imagine an interface to AASequence for instance.

However, we have to make sure that we don't entangle Biotite too much with other packages. I would suggest continuing to permit the integration of some selected external algorithms (as done for the plotting of RNA structures with ViennaRNA, ...) but would in general keep this to a minimum. Otherwise, we risk having a huge maintenance burden just to keep up with external package changes.

@padix-key
Member Author

I agree. In the usual case I would also keep the interfaces very sparse: It should be only about converting the representations of structures/sequences.

A notable exception is Ammolite, which also provides a wrapper (PyMOLObject) to the objects used by PyMOL for convenience. @t0mdavid-m Do you think this is a reason not to integrate Ammolite into the hypothetical biotite.interface.pymol subpackage?

@t0mdavid-m
Member

I think Ammolite could definitely be an exception in terms of sparsity. In my opinion it is quite a useful package and keeping it as an extension package limits its exposure. I would vote for integrating it into a biotite.interface.pymol subpackage. I would even go as far as advertising its functionality a bit more prominently on the project website.

@padix-key
Member Author

I will probably begin by writing an interface to RDKit. We can then view this as a proof of concept to make a final decision on whether a biotite.interface subpackage would be worthwhile.

2 participants