kisterae

Location: New York, New York, United States

My name is 江奕賢.

Friday, September 30, 2005

10A should also be red

same as 10B

Just as Dr. Kister said, when a black SM has an odd number of strandons, something is wrong.
Now we know: SOMETHING = RED.

why doesn't 2AF exist?

2AF follows every rule, so why is there no 2AF?

Wednesday, September 28, 2005

motifs in SMs that contain an odd number of strandons are all red in some SMs

e.g., SM#7, SM#11, maybe SM#16

Tuesday, September 27, 2005

error correction status

error type | corrected | description | example
motif does not exist | yes | auto fix | 2A, 2X, 2AL, 3C, 3J, 3N, 3AB, 4N, 23E
jump caused by missing strand | yes | add | 1Y, 2S, 2Y, 2AM, 3P, 3Q, 11A
other | | | 2L: should delete strand 6
wrong direction caused by uninverted sheet | yes | | 3AB, should be 2F
multiple motif numbers assigned to the same motif | | | 3P, 3Q
combination of two motifs | | | 11A

Wednesday, September 21, 2005

error correction in statistics table

error type | corrected | description | example
motif does not exist | not yet | | 2A, 2X, 2AL, 3C
jump caused by missing strand | not yet | | 1Y, 2S, 2AM
other | | | 2L: should delete strand 6
wrong direction caused by uninverted sheet | | | 3AB, should be 2F
multiple motif numbers assigned to the same motif | | | 3P, 3Q
combination of two motifs | | | 11A

change the table


1ciy (13A)

1
3 4 11 6 9

2 5 10 7 8

change to

1 4
3 4 11 6 9

2 5 10 7 8

Tuesday, September 20, 2005

d & j only on edge

Wrong strandons: wrong direction (d) and jump (j) occur only on edge strandons, except in 7H.
(Wrong SMs are not included here.)

Wednesday, September 14, 2005

idea about parallel and anti-parallel


[Figure: panel A, parallel strands; panel B, anti-parallel strands]
We get same-sheet parallel strands (A) only if the loop is long enough.
Otherwise, two consecutive strands next to each other in the same sheet can only be anti-parallel, as in B.
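
A minimal sketch of the rule above as a lookup. The note does not say how long "long enough" is, so the threshold `min_parallel_loop` is a hypothetical parameter the caller has to supply. The function name is mine too. I also assume that a long loop still allows the anti-parallel arrangement, which the note does not rule out.

```python
def allowed_orientations(loop_length, min_parallel_loop):
    """Orientations allowed for two consecutive strands adjacent in the same sheet.

    loop_length: number of residues in the connecting loop.
    min_parallel_loop: hypothetical cutoff for "long enough" (not given in the note).
    """
    if loop_length < 0 or min_parallel_loop < 0:
        raise ValueError("lengths must be non-negative")
    if loop_length >= min_parallel_loop:
        # Long loop: parallel (A) becomes possible; anti-parallel (B) is assumed to stay possible.
        return {"parallel", "anti-parallel"}
    # Short loop: only the anti-parallel arrangement (B).
    return {"anti-parallel"}
```

With a made-up cutoff of 10 residues, `allowed_orientations(2, 10)` leaves only the anti-parallel case.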

Tuesday, September 13, 2005

idea about strandon prediction

A motif of this kind should have a preference signal like this,

which shows that strand 1 likes strand 2, strand 2 likes strand 1 and strand 5, etc.

Thursday, September 08, 2005

ref papers

This paper uses probes about 5 to 9 residues long to predict structures.
It sits between homology modeling (lots of residues) and my previous finding, which used 4 residues to predict loops.
They obtained the probes by statistics similar to an HMM.

This paper uses an approach similar to mine: they try to predict strand-loop-strand structures.
I think they got pretty good results, and I might use their results as input.

This paper tries to predict folding initiation sites, which might be a way to solve our problem.

This paper predicts long-range contacts from contact maps.
I think this might help a bit in our case.

structure prediction

This paper uses rules for β-sheet motifs to predict protein structures,
which is pretty similar to our approach.

Wednesday, September 07, 2005

plan 050907

Align the four interlock strands of fold 3, SM#2, using HB info.

Check SMs that have an odd number of strandons

ex: SM#7

7E 5 (2 1) (8 9)
(4 3) (6 7)
If you swap the order of 8 and 9, the motif is no longer in SM#7; it then has an even number of strandons.

Thursday, September 01, 2005

discover rules/associations by weka

 
=== Run information ===

Scheme: weka.associations.Apriori -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0
Relation: predictstrandon2.txt-weka.filters.unsupervised.attribute.Remove-R12-19,25-weka.filters.unsupervised.attribute.Remove-R1,14-15,17-18-weka.filters.unsupervised.attribute.Remove-R14-weka.filters.unsupervised.attribute.Remove-R9-weka.filters.unsupervised.attribute.Remove-R13
Instances: 61
Attributes: 13
seq
res1
res2
res3
res4
ss
ss1
ss2
ss4
type
type2
typen
intrastrdonn
=== Associator model (full training set) ===


Apriori
=======

Minimum support: 0.75
Minimum metric : 0.9
Number of cycles performed: 5

Generated sets of large itemsets:

Size of set of large itemsets L(1): 4

Size of set of large itemsets L(2): 5

Size of set of large itemsets L(3): 2

Best rules found:

1. intrastrdonn=FALSE 55 ==> ss2=C 55 conf:(1)
2. ss1=C 50 ==> ss2=C 50 conf:(1)
3. ss4=C 49 ==> ss2=C 49 conf:(1)
4. ss4=C intrastrdonn=FALSE 48 ==> ss2=C 48 conf:(1)
5. ss1=C intrastrdonn=FALSE 48 ==> ss2=C 48 conf:(1)
6. ss4=C 49 ==> ss2=C intrastrdonn=FALSE 48 conf:(0.98)
7. ss2=C ss4=C 49 ==> intrastrdonn=FALSE 48 conf:(0.98)
8. ss4=C 49 ==> intrastrdonn=FALSE 48 conf:(0.98)
9. ss1=C 50 ==> ss2=C intrastrdonn=FALSE 48 conf:(0.96)
10. ss1=C ss2=C 50 ==> intrastrdonn=FALSE 48 conf:(0.96)
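
A quick sanity check on the numbers in this run, as a sketch. In each rule the first count is how many instances match the left-hand side and the second is how many match both sides, so confidence is the second divided by the first. The step-down of minimum support (start at `-U`, subtract `-D` per cycle) is my reading of the Apriori options, not something the output states directly. The function names are mine.

```python
import math

N_INSTANCES = 61  # "Instances: 61" in the run information

# rule number -> (left-hand-side count, whole-rule count, confidence printed by Weka)
RULES = {
    1: (55, 55, 1.00),
    6: (49, 48, 0.98),
    8: (49, 48, 0.98),
    9: (50, 48, 0.96),
}


def confidence(lhs_count, rule_count):
    """Fraction of instances matching the left-hand side that also match the right-hand side."""
    return rule_count / lhs_count


def min_support_after(cycles, upper=1.0, delta=0.05):
    """Assumed schedule: minimum support starts at `upper` and drops by `delta` per cycle."""
    return upper - cycles * delta


for number, (lhs, both, printed) in RULES.items():
    print(number, round(confidence(lhs, both), 2), printed)

support = min_support_after(5)               # 5 cycles were performed
print(support, math.ceil(support * N_INSTANCES))
```

Under that reading, 5 cycles take the minimum support from 1.0 down to 0.75, which matches the reported value and corresponds to at least 46 of the 61 instances.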


strandon prediction using weka, result1


=== Run information ===

Scheme: weka.classifiers.trees.ADTree -B 10 -E -3
Relation: predictstrandon2.txt-weka.filters.unsupervised.attribute.Remove-R2,7-weka.filters.unsupervised.attribute.Remove-R14-17-weka.filters.unsupervised.attribute.Remove-R16-17-weka.filters.unsupervised.attribute.Remove-R18
Instances: 61
Attributes: 17
pos
res1
res2
res3
res4
score
s1
s3
s3
s4
s5
s6
s7
typen
total
relpos
intrastrandon
Test mode: evaluate on training data

=== Classifier model (full training set) ===

Alternating decision tree:

: -1.04
| (1)s6 < 0.291: -1.434
| (1)s6 >= 0.291: 0.773
| | (3)score < 1.401: -0.731
| | (3)score >= 1.401: 0.706
| | | (5)s3 < 2.062: 1.42
| | | (5)s3 >= 2.062: -0.452
| (2)relpos < 6.5: -0.737
| | (4)res3 = T: 0.427
| | (4)res3 != T: -0.819
| (2)relpos >= 6.5: 0.658
| (6)res1 = A: 0.254
| (6)res1 != A: -0.452
| | (7)pos < 54: 0.146
| | (7)pos >= 54: -0.418
| | | (8)typen = &12: 0.119
| | | (8)typen != &12: -0.378
Legend: -ve = FALSE, +ve = TRUE
Tree size (total number of nodes): 25
Leaves (number of predictor nodes): 17
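
To read the tree by hand: an alternating decision tree adds up the root value plus the value of every prediction node whose conditions hold, and the sign of the sum gives the class. Below is a sketch of that sum with the numbers copied from the tree above. The nesting follows my reading of the indentation (for example, splitter 3 is only reached when s6 >= 0.291). The function names and the dict-based instance format are mine, and missing values are not handled.

```python
def adtree_score(x):
    """Sum the prediction values reached by instance x (a dict of attribute values)."""
    total = -1.04  # root prediction value

    # Splitter (1) on s6, with (3) and (5) nested under s6 >= 0.291
    if x["s6"] < 0.291:
        total += -1.434
    else:
        total += 0.773
        if x["score"] < 1.401:
            total += -0.731
        else:
            total += 0.706
            total += 1.42 if x["s3"] < 2.062 else -0.452

    # Splitter (2) on relpos, with (4) nested under relpos < 6.5
    if x["relpos"] < 6.5:
        total += -0.737
        total += 0.427 if x["res3"] == "T" else -0.819
    else:
        total += 0.658

    # Splitter (6) on res1, with (7) and (8) nested under res1 != A
    if x["res1"] == "A":
        total += 0.254
    else:
        total += -0.452
        if x["pos"] < 54:
            total += 0.146
        else:
            total += -0.418
            total += 0.119 if x["typen"] == "&12" else -0.378

    return total


def predict_intrastrandon(x):
    """Legend: negative sum = FALSE, positive sum = TRUE."""
    return adtree_score(x) > 0
```

For a made-up instance with s6 = 0.5, score = 2.0, s3 = 1.0, relpos = 7 and res1 = A, the sum is -1.04 + 0.773 + 0.706 + 1.42 + 0.658 + 0.254 = 2.771, so the prediction is TRUE.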

Time taken to build model: 0.02 seconds

=== Evaluation on training set ===
=== Summary ===

Correctly Classified Instances 61 100 %
Incorrectly Classified Instances 0 0 %
Kappa statistic 1
Mean absolute error 0.0237
Root mean squared error 0.0445
Relative absolute error 12.6362 %
Root relative squared error 14.9366 %
Total Number of Instances 61

=== Detailed Accuracy By Class ===

TP Rate FP Rate Precision Recall F-Measure Class
1 0 1 1 1 FALSE
1 0 1 1 1 TRUE

=== Confusion Matrix ===

a b <-- classified as
55 0 a = FALSE
0 6 b = TRUE
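
A small sketch that recomputes accuracy and the kappa statistic from the confusion matrix above, mostly to check that I read the table correctly. The function name is mine. Kappa here is the usual (observed - expected) / (1 - expected), where expected agreement comes from the row and column totals.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, kappa) for a square confusion matrix given as a list of rows."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Expected agreement by chance, from row totals times column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


CONFUSION = [[55, 0], [0, 6]]  # rows: actual FALSE, actual TRUE
accuracy, kappa = accuracy_and_kappa(CONFUSION)
print(accuracy, kappa)
```

Both come out as 1, matching the summary. For comparison, always answering FALSE would already get 55 of the 61 instances right, and these numbers are measured on the training data itself.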