A web server for identifying AMPs with their functional activities

Download Data Sets

Stage 1

The positive dataset, used to construct the AMP classifier, was collected from a variety of AMP databases, including APD3, ADAM, ParaPep, AVPdb, CancerPPD, MLACP, AntiCP, AntiFP, and DRAMP, for a total of 6,766 sequences. The negative dataset consists of non-AMP sequences obtained from AmPEP4, which collected sequences 5-255 amino acid residues long from UniProt28; sequences containing the non-standard amino acid codes B, J, O, U, X, or Z were then filtered out. After reducing homology bias and redundancy, the training set contained 1,686 AMP sequences and the testing set contained 723.

Training set		Testing set
Positive	Negative	Positive	Negative
AMP	AMP	AMP	AMP
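
The length and residue filter described for the Stage 1 negative dataset can be sketched as follows. This is a minimal illustration, not the pipeline actually used to build the datasets: the function names `keep_sequence` and `filter_sequences` are hypothetical, and treating the 5-255 bound as inclusive is an assumption.

```python
# Residue codes the text says were filtered out (assumed to mean: any
# sequence containing one of these letters is discarded).
EXCLUDED_CODES = set("BJOUXZ")


def keep_sequence(seq, min_len=5, max_len=255):
    """Return True if seq passes the length and residue filter.

    The inclusive 5-255 bounds are an assumption about the original filter.
    """
    seq = seq.strip().upper()
    if not (min_len <= len(seq) <= max_len):
        return False
    # Reject sequences that contain any excluded residue code.
    return not (set(seq) & EXCLUDED_CODES)


def filter_sequences(seqs):
    """Keep only the sequences that pass keep_sequence, normalized to upper case."""
    return [s.strip().upper() for s in seqs if keep_sequence(s)]
```

For example, `filter_sequences(["acdefg", "ACD", "ACDEXFG"])` keeps only the first sequence: the second is shorter than five residues and the third contains `X`.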

Stage 2

The datasets for the class-specific classifiers were constructed from the aforementioned AMP databases. For each functional activity, a sequence that has the activity in question belongs to the positive set; otherwise, it belongs to the negative set.

Training set		Testing set
Positive	Negative	Positive	Negative
Anti-parasitic	Anti-parasitic	Anti-parasitic	Anti-parasitic
Anti-viral	Anti-viral	Anti-viral	Anti-viral
Anti-cancer	Anti-cancer	Anti-cancer	Anti-cancer
Targeting mammals	Targeting mammals	Targeting mammals	Targeting mammals
Anti-fungal	Anti-fungal	Anti-fungal	Anti-fungal
Targeting Gram-positive bacteria	Targeting Gram-positive bacteria	Targeting Gram-positive bacteria	Targeting Gram-positive bacteria
Targeting Gram-negative bacteria	Targeting Gram-negative bacteria	Targeting Gram-negative bacteria	Targeting Gram-negative bacteria
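
The Stage 2 labeling rule (a sequence is positive for an activity if it has that activity, negative otherwise) can be sketched as below. This is an illustrative sketch only: the function name `split_by_activity`, the record layout of `(sequence, set_of_activities)`, and the example sequences and labels are all hypothetical and are not taken from the actual datasets.

```python
def split_by_activity(records, activity):
    """Split records into positive and negative sets for one activity.

    records: iterable of (sequence, set_of_activities) pairs
             (a hypothetical layout chosen for this sketch).
    """
    positive = []
    negative = []
    for seq, activities in records:
        if activity in activities:
            positive.append(seq)
        else:
            negative.append(seq)
    return positive, negative


# Made-up records for illustration only.
example_records = [
    ("GLFDIVKKVV", {"Anti-viral", "Anti-fungal"}),
    ("KWKLFKKIEK", {"Anti-cancer"}),
    ("FLPLIGRVLS", {"Anti-fungal"}),
]

antifungal_pos, antifungal_neg = split_by_activity(example_records, "Anti-fungal")
```

Under this rule the same sequence can be positive for one class-specific classifier and negative for another, since each activity is labeled independently.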