In a nutshell...
- A FunFam is a collection of protein structural domains that share a common function
- FunFams can be used to predict the location of structural domains and provide clues about putative function in novel protein sequences
- This site contains information, scripts and tutorials on how FunFams can be used to annotate your own protein sequences (see below)
There are a few different options. The most suitable method will depend on:
- how many protein sequences you want to scan (1? 100? >100000?)
- how comfortable you are working in a technical environment (Linux terminal? scripting?)
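If you are not sure how many sequences you have, a quick count of the FASTA header lines can help you choose. The sketch below is only an illustration: the function names and the `api_limit` cutoff are assumptions made for this example, not official limits of any CATH service.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count sequences in FASTA text by counting header lines (starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def suggest_option(n_sequences: int, api_limit: int = 1000) -> str:
    """Map a sequence count to one of the options described on this page.

    api_limit is an illustrative cutoff (an assumption), not an official figure.
    """
    if n_sequences <= 1:
        return "web pages"
    if n_sequences <= api_limit:
        return "API"
    return "local scan"


# Small in-memory example; in practice, pass an open file handle instead.
example = """>seq1
MKTAYIAKQR
>seq2
GSHMLEDPVA
""".splitlines()

n = count_fasta_records(example)
print(n, suggest_option(n))
```

Adjust `api_limit` to whatever you consider reasonable under the fair use policy of the public server.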
- Difficulty: EASY
- Expected time: < 1 minute
- Technical requirements: none
The simplest way of predicting the location of FunFams on your protein sequence is to use the CATH web pages. Simply copy and paste your protein sequence into the sequence search and follow the instructions.
http://cathdb.info/search/by_sequence
- Difficulty: MEDIUM
- Expected time: < 10 mins
- Technical requirements: simple scripting (e.g. Perl, Python)
The FunFHMMER server provides a public API that allows users to submit sequence scans from their own scripts. Note that queries are submitted to a queueing system and are subject to a fair use policy.
API: (click the 'API' tab)
http://cathdb.info/search/by_sequence#api
Example client:
https://github.com/UCLOrengoGroup/cath-tools-seqscan
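Because queries go through a queue, a client script typically submits a job and then checks back at intervals until the job has finished. The sketch below shows only that generic polling step. It makes no assumptions about the real endpoints: `check_status` is a hypothetical callable that you would implement against the API documentation, and the status strings `"done"` and `"error"` are placeholders for whatever the server actually returns.

```python
import time
from typing import Callable


def wait_for_job(
    check_status: Callable[[], str],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check_status until it reports completion or attempts run out.

    Returns True if the job finished, False if max_attempts was reached.
    The status strings "done" and "error" are placeholders (assumptions).
    """
    for attempt in range(max_attempts):
        status = check_status()
        if status == "done":
            return True
        if status == "error":
            raise RuntimeError("scan job reported an error")
        # Wait between checks, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return False


# Demonstration with a fake status source instead of a real server.
fake_statuses = iter(["queued", "running", "done"])
finished = wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(finished)
```

A generous `interval_seconds` keeps the load on the shared server low, which is in the spirit of the fair use policy.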
- Difficulty: MEDIUM
- Expected time: < 1 hour
- Technical requirements: Linux terminal
The FunFHMMER protocol can be used to scan your own sequences on your own machines. This is the recommended approach if you are trying to scan many thousands of protein sequences (e.g. entire genomes).
https://github.com/UCLOrengoGroup/cath-tools-genomescan
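When scanning a whole genome locally, it can be convenient to split a large FASTA file into smaller batches so that they can be processed separately. This is a general preprocessing sketch, not part of the genomescan tools: the function name `fasta_batches` and the batch size are assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, List


def fasta_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of FASTA lines, each holding at most batch_size complete records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    records_in_batch = 0
    for line in lines:
        if line.startswith(">"):
            # A new record starts here; emit the current batch first if it is full.
            if records_in_batch == batch_size:
                yield batch
                batch, records_in_batch = [], 0
            records_in_batch += 1
        batch.append(line)
    if batch:
        yield batch


# Small in-memory example with three records split into batches of two.
example_lines = [">a", "MKT", ">b", "GSH", ">c", "LED"]
batches = list(fasta_batches(example_lines, batch_size=2))
print(len(batches))
```

Each batch can then be written to its own file and scanned independently.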
The wiki area contains lots of information, so please have a good look through that documentation. If you have a question that is not already answered, please help us improve the documentation by letting us know (by raising an issue or emailing us).