The last few weeks have seen a lot of new features added to DistilBio. While looking at the queries run by users and going through user feedback, we realized that the following were areas for improvement:
- Ability to filter, refine and analyse results
- “Ease of use of DistilBio”
- Missing results which were known and relevant
- Queries yielding no results
Let’s look at these issues a little more closely, along with the measures taken by the DistilBio team to handle them. (Watch demo video)
Features
Filtering and Analysis
Filtering and analysis of results were areas that needed refinement in the previous version of DistilBio. Earlier, all the facets were displayed linearly, even though queries could be branched. This led to some confusion in understanding the results. The revamped result interface of DistilBio shows the relationships on the canvas, thus retaining and displaying the connections accurately.
The results can also be selected and filtered to display only the instances and relationships that a user is interested in. For instance, if a query is created for Aspirin – Protein – Disease, multiple target proteins and associated diseases are generally shown. Clicking and highlighting the protein facet displays information about the proteins. Selecting diseases (check boxes) narrows the results to show only the proteins connected with the selected diseases. The details tab to the right displays data for the facet that has been selected, in this case Protein. To view another facet’s details, click the facet header and the description changes while retaining the filters.
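The disease filter described above can be pictured as a simple selection over result rows. The sketch below is only an illustration: the row layout, the function name `proteins_for_diseases` and the protein and disease names are placeholders, not DistilBio's actual data or implementation.

```python
# Hypothetical result rows for a Drug > Protein > Disease query.
# Protein and disease names are placeholders, not real DistilBio output.
rows = [
    ("Aspirin", "PROTEIN_A", "DISEASE_X"),
    ("Aspirin", "PROTEIN_A", "DISEASE_Y"),
    ("Aspirin", "PROTEIN_B", "DISEASE_Y"),
    ("Aspirin", "PROTEIN_C", "DISEASE_Z"),
]


def proteins_for_diseases(rows, selected_diseases):
    """Return the proteins connected to at least one selected disease."""
    selected = set(selected_diseases)
    return sorted({protein for _, protein, disease in rows if disease in selected})


# Ticking the check box for DISEASE_Y keeps only the proteins linked to it.
print(proteins_for_diseases(rows, ["DISEASE_Y"]))
```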
Apart from this, if a user finds one or more instances of interest, the query can now be extended by selecting the facet and the instance(s) of interest and clicking on the “Extend” button (displayed at the top of the page). This creates a new query for the selected instance(s), and more nodes can be added to this query.
Clicking on the name of the instance in the details tab provides a relationship profile of the instance, which is very similar to doing a simple search such as Aspirin. The simple search has been further enhanced to encompass all relationships between the facets, both direct and inverse. For example, UniProt links proteins to drugs while DrugBank links drugs to proteins. DistilBio captures both these relations and links them inversely too.
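One way to think about direct and inverse relationships is as an index in which every link can be followed from either end. The sketch below is an assumption about the idea, not DistilBio's implementation: the function name `build_bidirectional_index` is hypothetical, and which source supplies which link is purely illustrative.

```python
from collections import defaultdict


def build_bidirectional_index(links):
    """Index each (source, target) link in both directions."""
    index = defaultdict(set)
    for source, target in links:
        index[source].add(target)  # direct relationship
        index[target].add(source)  # inverse relationship
    return index


# Illustrative links only; the attribution to a particular source is assumed.
links = [
    ("PGH1_HUMAN", "Aspirin"),  # a protein -> drug link (UniProt-style)
    ("Aspirin", "PGH2_HUMAN"),  # a drug -> protein link (DrugBank-style)
]

index = build_bidirectional_index(links)
# Searching for the drug now finds both proteins, whichever way the link was recorded.
print(sorted(index["Aspirin"]))
```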
More Complete Results
In a simple search that yields the relationship profile, there is also a “More” link (displayed at the top in each facet) if more results are available.
In a regular search, if more results are available for the query, DistilBio now displays a “Get More Results” button at the top of the result page. Clicking on the button allows the user to increase the limit. The user can iteratively fetch more results, ensuring that relevant results for each query are not missed!
List Query (Multiple Instance Query)
Another interesting feature is the “List Query”. In an earlier post of mine, I did a comparative study of the drugs aspirin, acetaminophen and ibuprofen. The query I created gave me the common protein targets between these drugs. With the “List Query”, all 3 drugs can be starting nodes of the query, and the protein targets of all three drugs can be found in a single query. This gives us a union of all the protein targets of these drugs. Clicking on the drug facet displays a number beside each protein indicating how many of the drugs target that protein. Proteins with the number 3 are the common protein targets between these drugs.
Let us see the query below:
Query: Aspirin, Acetaminophen, Ibuprofen > protein
As can be seen, PGH1_HUMAN and PGH2_HUMAN are common protein targets for these drugs.
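The counting behind the List Query can be sketched in a few lines. This is an illustration under assumptions: the function name `list_query_counts` is hypothetical, and apart from PGH1_HUMAN and PGH2_HUMAN the target names are placeholders rather than real query results.

```python
from collections import Counter

# PGH1_HUMAN and PGH2_HUMAN come from the query above; the PLACEHOLDER_*
# entries are invented stand-ins for drug-specific targets.
drug_targets = {
    "Aspirin": {"PGH1_HUMAN", "PGH2_HUMAN", "PLACEHOLDER_A"},
    "Acetaminophen": {"PGH1_HUMAN", "PGH2_HUMAN"},
    "Ibuprofen": {"PGH1_HUMAN", "PGH2_HUMAN", "PLACEHOLDER_B"},
}


def list_query_counts(drug_targets):
    """Count, for every protein in the union, how many drugs target it."""
    counts = Counter()
    for targets in drug_targets.values():
        counts.update(targets)
    return counts


counts = list_query_counts(drug_targets)
# Proteins whose count equals the number of drugs are common to all of them.
common = sorted(p for p, n in counts.items() if n == len(drug_targets))
print(common)
```

The keys of `counts` form the union of targets, and the number beside each protein plays the same role as the number shown in the drug facet.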
I will also use the Rabeprazole example to highlight some of the features we have discussed. While working on my earlier post, I had to run multiple queries to get my results. Some of the new features added to DistilBio have made the searches much easier!
I first ran a query to find the protein targets of the drug Rabeprazole and the drugs that target these proteins.
Query: Rabeprazole > protein > drug
The results are as below
The “Get More Results” button is displayed. By clicking on it, you can increase the limit and all results available are displayed.
In the screenshot below, I have selected 2 proteins, 5HT1D_HUMAN and DRD3_HUMAN, for further analysis. The drug list below now displays drugs targeting these 2 proteins. Clicking on the name 5HT1D_HUMAN in the right tab will create a simple search and display all relationships available for the protein.
You can also see the “Extend” button. You can extend your query for the selected proteins by clicking on this. You could also add further nodes to the query as shown below. I have selected the proteins 5HT1D_HUMAN and DRD3_HUMAN to extend my query. Once the new query graph was created, I also added a node to find the diseases associated with these proteins.
The above query is an example of a “List Query”, and it displays all diseases associated with the 2 proteins.
Saved Searches and Recent Queries
Created a query and found interesting results? You can now save your search and retrieve it at any time. All you have to do is log in to your account (registration is free).
The “Recent searches” tab on the query page earlier used to display only the names of the searches. Selecting queries by name alone used to be difficult. Now, hovering over the name displays the query graph on the canvas, making it easier to choose and modify the query.
BLAST and Structure Viewer
We have also added new tools that would be useful for users: a PDB structure viewer and BLAST. Protein structures can be viewed in DistilBio by selecting proteins from the protein details tab and clicking on “View Structure”. The BLAST search is accessible from the home page (link under the search box). Paste in your sequence and BLAST it against SWISSPROT. Once the results are fetched from the server, a page lists the matching proteins. Select the protein(s) of interest (as in the list query) to display all information and relationships for those proteins.
Limit Enhancement
Previously, when a user ran a query, the default limit for the number of results to be displayed was 100. Sometimes, users did not find some of the known relevant results for their queries. Also, if more results were available for the query, the user would be unaware of it. Now, in the query interface, the default limit has been increased from 100 to 500, so a user gets more relevant results in each query. Some relationship searches, such as Protein-Disease, will have a huge number of results. The upper limit has been set to 10000.
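The interplay between the default limit, the upper limit and fetching more results can be sketched as a loop. Only the numbers 500 and 10000 come from this post; the step size, the function names and the `run_query(limit)` callable are assumptions made for illustration, not how DistilBio is actually built.

```python
DEFAULT_LIMIT = 500  # default limit mentioned in this post
MAX_LIMIT = 10000    # upper limit mentioned in this post


def next_limit(current, step=DEFAULT_LIMIT):
    """Raise the limit by one step (assumed), never beyond the upper limit."""
    return min(current + step, MAX_LIMIT)


def fetch_all(run_query, limit=DEFAULT_LIMIT):
    """Keep raising the limit until the query is exhausted or the cap is hit.

    run_query(limit) is a hypothetical callable returning at most `limit` rows.
    """
    while True:
        rows = run_query(limit)
        if len(rows) < limit or limit >= MAX_LIMIT:
            return rows
        limit = next_limit(limit)


# Stand-in for a query with 1200 matching rows.
data = list(range(1200))
print(len(fetch_all(lambda limit: data[:limit])))
```

Each pass through the loop corresponds to one click on “Get More Results”.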
Empty Result
One reason for missing results was that if a user created a query for an instance and added other nodes, results were displayed only if they were available for all nodes. E.g., if a query was created for a drug, its target proteins and the properties of the drug, results were previously displayed only if both proteins and properties were available. This problem has been handled in the latest release of DistilBio, where a user can add nodes to the instance and get results even if only one relationship is available, i.e., target proteins will be displayed even if drug properties are not available.
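The difference between the old and new behaviour resembles the difference between requiring every relationship and treating each one as optional. The sketch below is an illustration only; the function name `collect_results` and the `require_all` flag are hypothetical and do not describe DistilBio's internals.

```python
def collect_results(relations, require_all=False):
    """Return the non-empty relationships for an instance.

    relations maps a relationship name to its (possibly empty) list of results.
    With require_all=True, one empty relationship empties the whole result,
    which mimics the earlier behaviour described above.
    """
    if require_all and any(not values for values in relations.values()):
        return {}
    return {name: values for name, values in relations.items() if values}


# A drug with known target proteins but no recorded properties.
relations = {
    "target_proteins": ["PGH1_HUMAN", "PGH2_HUMAN"],
    "properties": [],
}

print(collect_results(relations, require_all=True))  # earlier behaviour
print(collect_results(relations))                    # latest release
```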
The other issue frequently encountered was that no results were available for some queries. Now, the user has the option of exploring each node of the query to find relationships available for that node.
New Datasets
Gene Ontology
Gene Ontology (GO) terms associated with gene products, in terms of biological process, cellular component and molecular function, are now available in DistilBio. I highlighted the use of this in my earlier post; you could check it to understand how the GO terms could be used.
Cell Image Library
DistilBio now also has data from The Cell Image Library. You could run a query to find images and the GO biological processes and publications associated with them. You could also include the organism and the cell type that you are interested in. Alternatively, you could run a query for your protein of interest and the GO biological processes and images associated with it. You could also find publications and images associated with your protein of interest.
We would love to hear your feedback on the new features and enhancements to DistilBio! Email us at: distilbio@metaome.com or leave your feedback on www.distilbio.com. Or you could always leave your comments to this post!

