We welcome any kind of contribution to our software, from a simple comment or question to a full-fledged pull request.
A contribution can be one of the following cases:
- you have a question;
- you think you may have found a bug (including unexpected behavior);
- you want to make some kind of change to the code base (e.g. to fix a bug, to add a new feature, to update documentation).
The sections below outline the steps in each case.
If you have a question:
- use the search functionality in the issues to see if someone already filed the same issue;
- if your issue search did not yield any relevant results, make a new issue.
If you think you may have found a bug:
- use the search functionality in the issues to see if someone already filed the same issue;
- if your issue search did not yield any relevant results, make a new issue, making sure to provide enough information for the rest of the community to understand the cause and context of the problem. Depending on the issue, you may want to include:
  - some identifying information (name and version number) for the dependencies you're using;
  - information about the operating system.
In case you feel like you've made a valuable contribution, but you don't know how to write or run tests for it, or how to generate the documentation: don't let this discourage you from making the pull request; we can help you! Just go ahead and submit the pull request, but keep in mind that you might be asked to append additional commits to your pull request.
If you want to make some kind of change to the code base:
- (important) announce your plan to the rest of the community before you start working. This announcement should be in the form of a (new) issue;
- (important) wait until some kind of consensus is reached about your idea being a good idea;
- if needed, fork the repository to your own GitHub profile and create your own feature branch off of the latest master commit. While working on your feature branch, make sure to stay up to date with the main branch by pulling in changes, possibly from the 'upstream' repository (see GitHub's documentation on configuring a remote for a fork and on syncing a fork);
- make sure the existing tests still work by running pytest;
- add your own tests (if necessary);
- update or expand the documentation;
- update the CHANGELOG file with your change;
- push your feature branch to (your fork of) this repository on GitHub;
- create the pull request, e.g. following GitHub's documentation on creating a pull request.
MSMetaEnhancer has a modular architecture that makes it easy to add new conversion services. There are two main types of converters:
- Web Converters: Services that make HTTP requests to external APIs
- Compute Converters: Services that perform local computations
The MSMetaEnhancer system consists of several key components:
- Base Converter Classes:
  - Converter: Abstract base class for all converters
  - WebConverter: Base class for web-based API services
  - ComputeConverter: Base class for local computation services
- Job System:
  - Job: Represents a conversion task (source → target using a specific converter)
  - Jobs are defined as tuples: (source_attribute, target_attribute, converter_name)
- Converter Builder:
  - Automatically discovers and instantiates available converters
  - Manages both web and compute converters
- Dynamic Method Creation:
  - Converters automatically generate methods like compound_name_to_inchi()
  - Based on the conversions list defined in each converter
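
To make the job tuples and the dynamic method creation more concrete, here is a small standalone sketch. It is not MSMetaEnhancer's implementation: ToyConverter, create_methods, lookup_inchi and run_job are hypothetical names, and the only assumption taken from the list above is that a conversion entry results in a method named source_to_target.

```python
class ToyConverter:
    """Hypothetical stand-in for a converter; not part of MSMetaEnhancer."""

    def __init__(self):
        # Each entry: (source attribute, target attribute, implementing method name)
        conversions = [("compound_name", "inchi", "lookup_inchi")]
        self.create_methods(conversions)

    def create_methods(self, conversions):
        """Expose each conversion as a method named <source>_to_<target>."""
        for source, target, method_name in conversions:
            implementation = getattr(self, method_name)
            setattr(self, f"{source}_to_{target}", implementation)

    def lookup_inchi(self, name):
        # Toy lookup table standing in for a real service call.
        table = {"methane": "InChI=1S/CH4/h1H4"}
        value = table.get(name)
        return {"inchi": value} if value else {}


def run_job(job, converters, value):
    """Dispatch a (source, target, converter_name) tuple to the generated method."""
    source, target, converter_name = job
    method = getattr(converters[converter_name], f"{source}_to_{target}")
    return method(value)


converters = {"ToyConverter": ToyConverter()}
job = ("compound_name", "inchi", "ToyConverter")
result = run_job(job, converters, "methane")
```

The point of the pattern is that a job only needs attribute names and a converter name; the method to call can be derived from those three strings.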
To add a new web-based service, follow these steps:
Create a new Python file in MSMetaEnhancer/libs/converters/web/ named after your service (e.g., MyService.py).

    from MSMetaEnhancer.libs.converters.web.WebConverter import WebConverter


    class MyService(WebConverter):
        """
        Brief description of what your service does.

        Service URL: https://example.com/api
        """
        def __init__(self, session):
            super().__init__(session)
            # Define the service endpoints
            self.endpoints = {
                'MyService': 'https://api.example.com/v1/'
            }

            # Define the conversions this service supports
            conversions = [
                ('source_attr', 'target_attr', 'conversion_method'),
                # Add more conversions as needed
            ]
            self.create_top_level_conversion_methods(conversions)

            # Add rate limiting if needed (optional)
            # self.throttler = Throttler(rate_limit=5)  # 5 requests per second

        async def conversion_method(self, input_data):
            """
            Implement the actual conversion logic.

            :param input_data: The input data to convert
            :return: Dictionary with converted data
            """
            # Build the API request
            args = f'endpoint/{input_data}'

            # Make the request (with throttling if configured)
            response = await self.query_the_service('MyService', args)

            # Parse and return the result
            if response:
                return self.parse_response(response)
            return {}

        def parse_response(self, response):
            """
            Parse the API response and extract relevant data.

            :param response: Raw API response
            :return: Dictionary with parsed data
            """
            # Placeholder: replace this line with the parsing logic for your API
            parsed_value = response
            # Return a dictionary with attribute names as keys
            return {'target_attr': parsed_value}

Add your new converter to MSMetaEnhancer/libs/converters/web/__init__.py:

    from MSMetaEnhancer.libs.converters.web.MyService import MyService

    __all__ = ['IDSM', 'CTS', 'CIR', 'PubChem', 'BridgeDb', 'MyService']

Create a test file tests/test_MyService.py:

    import pytest

    from MSMetaEnhancer.libs.converters.web.MyService import MyService


    # Note: async test functions only run if an asyncio plugin for pytest is
    # configured in the project.
    @pytest.mark.dependency()
    async def test_service_available():
        """Test if the service is available."""
        # Implementation depends on your service
        pass


    @pytest.mark.dependency(depends=["test_service_available"])
    async def test_conversion():
        """Test the conversion functionality."""
        # Mock the service and test your conversion methods
        pass

For local computation services (like RDKit), follow these steps:
Create a new Python file in MSMetaEnhancer/libs/converters/compute/ named after your service.

    from MSMetaEnhancer.libs.converters.compute.ComputeConverter import ComputeConverter


    class MyComputeService(ComputeConverter):
        """
        Description of your compute service.
        """
        def __init__(self):
            super().__init__()

            # Define the conversions this service supports
            conversions = [
                ('source_attr', 'target_attr', 'conversion_method'),
                # Add more conversions as needed
            ]
            # Compute converters run synchronously, hence asynch=False
            self.create_top_level_conversion_methods(conversions, asynch=False)

        def conversion_method(self, input_data):
            """
            Implement the computation logic.

            :param input_data: The input data to process
            :return: Dictionary with computed data
            """
            # Perform local computation
            # (some_computation is a placeholder for your own function)
            result = some_computation(input_data)
            return {'target_attr': result}

Add your new converter to MSMetaEnhancer/libs/converters/compute/__init__.py:

    from MSMetaEnhancer.libs.converters.compute.MyComputeService import MyComputeService

    __all__ = ['RDKit', 'MyComputeService']

To add new conversion functions to existing converters:
Add a new method to the existing converter class:

    async def new_conversion_method(self, input_data):
        """
        Description of what this conversion does.

        :param input_data: Input data
        :return: Converted data
        """
        # Implementation here
        pass

Add the new conversion to the conversions list in the __init__ method:

    conversions = [
        # existing conversions...
        ('new_source_attr', 'new_target_attr', 'new_conversion_method'),
    ]

Keep the following practices in mind:
- Error Handling: Always handle API errors gracefully and return empty dictionaries when data is not available
- Rate Limiting: Respect API rate limits using throttling mechanisms
- Data Validation: Validate input data before making API calls
- Response Parsing: Implement robust response parsing that handles various response formats
- Documentation: Include docstrings for all methods explaining parameters and return values
- Testing: Write comprehensive tests including service availability and conversion functionality
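
The error handling, data validation and response parsing points can be sketched together in one small example. Everything here is hypothetical: fake_query stands in for a real HTTP request, and name_to_inchikey is an illustrative conversion, not a method of any existing converter.

```python
import asyncio
import json


async def fake_query(name):
    """Hypothetical stand-in for an HTTP request to an external API."""
    if name == "timeout":
        raise ConnectionError("service unreachable")
    if name == "garbage":
        return "<html>not json</html>"
    return json.dumps({"inchikey": f"KEY-{name}"})


async def name_to_inchikey(name):
    """Convert a name to an InChIKey, returning {} when no data is available."""
    # Data validation: reject empty or non-string input before any request.
    if not isinstance(name, str) or not name.strip():
        return {}
    try:
        raw = await fake_query(name.strip())
        payload = json.loads(raw)
    except (ConnectionError, json.JSONDecodeError):
        # Error handling: a failed request or malformed body gives an empty result.
        return {}
    # Response parsing: tolerate a missing key instead of raising KeyError.
    value = payload.get("inchikey") if isinstance(payload, dict) else None
    return {"inchikey": value} if value else {}


async def main():
    inputs = ["aspirin", "", "timeout", "garbage"]
    return [await name_to_inchikey(item) for item in inputs]


results = asyncio.run(main())
```

Only the first input produces data; the other three fall back to an empty dictionary without raising, so a failing service cannot abort a whole run.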
After implementing your converter:
- Run the existing tests to ensure you haven't broken anything:

      pytest tests/

- Run your specific tests:

      pytest tests/test_YourService.py -v

- Test the integration by using the converter in a real scenario
- Throttling: Use the Throttler class for rate limiting
- Caching: Use the @lru_cache decorator for caching responses
- Error handling: Inherit from base converter classes for consistent error handling
- Data escaping: Use decorators like @escape_single_quotes for input sanitization
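
As an illustration of the caching point, the sketch below uses functools.lru_cache from the Python standard library on a synchronous function. Which lru_cache the project's converters import is not stated in this guide, so treat this as a general demonstration of the idea; expensive_lookup and call_count are hypothetical.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body really runs


@lru_cache(maxsize=128)
def expensive_lookup(identifier):
    """Hypothetical slow computation whose results are worth caching."""
    global call_count
    call_count += 1
    return identifier.upper()


first = expensive_lookup("caffeine")
second = expensive_lookup("caffeine")  # served from the cache
info = expensive_lookup.cache_info()
```

Note that functools.lru_cache applied directly to an async def function caches the coroutine object rather than its awaited result, so asynchronous conversion methods need an async-aware cache instead.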
To help you get started quickly, you can use these template files as starting points:
- Web Converter Template: Use the CTS or PubChem converters as reference implementations
- Compute Converter Template: Use the RDKit converter as a reference implementation
- Test Template: Follow the existing test patterns in the tests/ directory
These templates include:
- Proper class structure and inheritance
- Common import patterns
- Standard method signatures
- Error handling patterns
- Documentation structure
- Test structure and mocking examples
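
If you have not mocked an asynchronous service before, the following sketch shows one possible shape for such a test using unittest.mock.AsyncMock from the standard library. ToyWebConverter and name_to_smiles are hypothetical; only the method name query_the_service is borrowed from the web converter example in this guide, and its real signature may differ.

```python
import asyncio
from unittest.mock import AsyncMock


class ToyWebConverter:
    """Hypothetical converter used only to demonstrate mocking."""

    async def query_the_service(self, service, args):
        raise RuntimeError("real network access is not expected in unit tests")

    async def name_to_smiles(self, name):
        response = await self.query_the_service("ToyService", f"smiles/{name}")
        return {"smiles": response} if response else {}


def test_conversion_with_mocked_service():
    converter = ToyWebConverter()
    # Replace the network call with a mock that returns a canned response.
    converter.query_the_service = AsyncMock(return_value="CCO")
    result = asyncio.run(converter.name_to_smiles("ethanol"))
    assert result == {"smiles": "CCO"}
    converter.query_the_service.assert_awaited_once_with(
        "ToyService", "smiles/ethanol"
    )


def test_empty_response_gives_empty_dict():
    converter = ToyWebConverter()
    converter.query_the_service = AsyncMock(return_value=None)
    assert asyncio.run(converter.name_to_smiles("unknown")) == {}
```

Mocking at the level of the request method keeps the tests independent of the external service while still exercising the request building and response parsing of the conversion itself.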