Initial commit
hulianyu committed Nov 29, 2023
0 parents commit c941f4f
Showing 59 changed files with 10,836 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitattributes
@@ -0,0 +1,2 @@
# Auto detect text files and perform LF normalization
* text=auto
6 changes: 6 additions & 0 deletions ACC.m
@@ -0,0 +1,6 @@
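function result = ACC(Y, predY)
% ACC  Clustering accuracy of the predicted labels predY against the labels Y.
% Relabel predY so that its cluster ids line up with those used in Y.
res = bestMap(Y, predY);
% accuracy: fraction of samples whose mapped label equals the true label
result = length(find(Y == res))/length(Y);
end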
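The helper `bestMap` is not shown in this part of the commit. As a rough illustration of why a label mapping is needed before counting matches, the sketch below tries every one-to-one relabeling on a toy example. The brute-force search and the toy labels are assumptions made for illustration only; it is practical only for a handful of clusters and is not claimed to be how `bestMap` works.

```matlab
% Toy labels: predY groups the samples almost like Y but uses other ids.
Y     = [1 1 1 2 2 3 3 3]';
predY = [2 2 2 3 3 1 1 2]';

% Assumption: brute-force stand-in for bestMap (equal cluster counts only).
trueIds = unique(Y);
predIds = unique(predY);
assert(numel(trueIds) == numel(predIds), 'sketch assumes equal cluster counts');

P = perms(1:numel(trueIds));            % every one-to-one relabeling
bestAcc = 0;
for r = 1:size(P, 1)
    mapped = zeros(size(predY));
    for j = 1:numel(predIds)
        % send predicted id predIds(j) to true id trueIds(P(r, j))
        mapped(predY == predIds(j)) = trueIds(P(r, j));
    end
    bestAcc = max(bestAcc, mean(mapped == Y));
end
fprintf('accuracy under the best relabeling = %.3f\n', bestAcc);
```

Without the relabeling step, `mean(predY == Y)` would report zero matches here even though the two partitions agree on seven of the eight samples.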
46 changes: 46 additions & 0 deletions Assign_Control.m
@@ -0,0 +1,46 @@
function pv_point = Assign_Control(sList,X,k,pi)
% Calculate the percentage of samples controlled by the assignment.
%   sList(:,1) - row indices into X of the points to test
%   X          - data matrix, one object per row
%   k          - label of the current cluster
%   pi         - cluster label of every row of X (shadows MATLAB's built-in pi)
%% assignment: pval(thisK)<pval(otherK)
OtherClusters = setdiff(unique(pi),k);
numClu = size(sList,1);
numOther = length(OtherClusters);
pv_point = zeros(numClu,numOther);
for kth = 1:length(OtherClusters)
    otherK = find(pi==OtherClusters(kth));
    OtherS = X(otherK,:);
    % X_s = X(otherK,:);
    pvs = Other_single_point_fisher_exactG(X, sList, OtherS);
    for sth=1:numClu
        pv_point(sth,kth) = binomtest(pvs(sth,:),0.05);
    end
end
end

function p = Other_single_point_fisher_exactG(X, sList, OtherS)
% He Z, Zhao C, Liang H, et al. Protein complexes identification with family-wise error rate control[J].
% IEEE/ACM transactions on computational biology and bioinformatics, 2019, 17(6): 2062-2073.
% Lancichinetti A, Radicchi F, Ramasco J J, et al. Finding statistically significant communities in networks[J].
% PloS one, 2011, 6(4): e18961.
% record of revisions:
%     date           programmer       description of change
%     ------------   --------------   ------------------------
%     Nov 20, 2023   Lianyu Hu        Original code version
%% for each attribute value u of other cluster OtherS, compute p-value on point sList
[N,num_attr] = size(X);
i_S = size(sList,1);
i_X = X(sList(:,1),:);
D_S = size(OtherS,1);
p = zeros(i_S,num_attr);
for attr=1:num_attr
    X_attr = X(:,attr);
    [value, tab] = histRate(X_attr);
    for th = 1:i_S
        s = i_X(th,attr);
        k_i = tab(value==s);            % objects in X with value s (tab is used as a count)
        k_in = sum(OtherS(:,attr)==s);  % of those, objects inside OtherS
        k_out = k_i - k_in;
        volume = min(D_S-k_in, k_out);
        % upper-tail hypergeometric probability P(X >= k_in):
        % population N, D_S objects in the other cluster, k_i objects with value s
        p(th,attr) = sum(hygepdf(k_in:k_in+volume,N,D_S,k_i));
    end
end
end
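The p-value above is an upper-tail hypergeometric sum. As a minimal sketch, the same quantity can be written out with binomial coefficients for a single attribute value, which avoids the toolbox call. The counts below are hypothetical toy numbers and the variable `upper` is introduced only for this illustration.

```matlab
% Hypothetical toy counts: N objects overall, D_S of them in the other
% cluster, k_i objects carry the attribute value, k_in of those lie in
% the other cluster.
N = 20; D_S = 8; k_i = 6; k_in = 4;

% Largest feasible overlap; equals k_in + min(D_S - k_in, k_i - k_in).
upper = min(D_S, k_i);

p = 0;
for x = k_in:upper
    % hypergeometric pmf: population N, D_S marked objects, k_i draws
    p = p + nchoosek(D_S, x) * nchoosek(N - D_S, k_i - x) / nchoosek(N, k_i);
end
fprintf('upper-tail p-value = %.4f\n', p);
```

A small p-value means the value is over-represented in the other cluster relative to a random draw of `k_i` objects from all `N`.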
12 changes: 12 additions & 0 deletions CMI.m
@@ -0,0 +1,12 @@
function result = CMI(X,pi)
% Cluster membership index (CMI)
N = size(X,1);
K = max(pi);
signum = zeros(K,1);
parfor k =1:K
    clusterk = find(pi==k);
    pval = SigCM_intra(X,clusterk);
    [signum(k,1),~,~] = FWER_Control(pval);
end
result = sum(signum)/N;
end
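Neither `SigCM_intra` nor `FWER_Control` appears in this part of the commit, so the following is only a sketch of the arithmetic in `CMI`: count the samples judged significant in each cluster and divide by the total number of samples. It assumes one p-value per sample and uses a plain Bonferroni threshold as a stand-in for the family-wise error control. The p-values are made up, and the actual `FWER_Control` may use a different procedure.

```matlab
% Hypothetical per-sample p-values for two clusters (made-up numbers).
pvals = {[0.001 0.004 0.030 0.200], [0.0005 0.010 0.600]};
alpha = 0.05;

N = 0;        % total number of samples
numSig = 0;   % samples whose membership is judged significant
for k = 1:numel(pvals)
    m = numel(pvals{k});
    % Bonferroni (assumed stand-in for FWER_Control): reject if p <= alpha/m
    numSig = numSig + sum(pvals{k} <= alpha / m);
    N = N + m;
end
cmi = numSig / N;
fprintf('CMI = %d/%d = %.3f\n', numSig, N, cmi);
```

Under these assumptions the index lies between 0 and 1, and higher values mean that more samples pass the significance test in their assigned cluster.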
57 changes: 57 additions & 0 deletions Datasets/balance-scale.names
@@ -0,0 +1,57 @@
1. Title: Balance Scale Weight & Distance Database

2. Source Information:
(a) Source: Generated to model psychological experiments reported
by Siegler, R. S. (1976). Three Aspects of Cognitive
Development. Cognitive Psychology, 8, 481-520.
(b) Donor: Tim Hume ([email protected])
(c) Date: 22 April 1994

3. Past Usage: (possibly different formats of this data)
- Publications
1. Klahr, D., & Siegler, R.S. (1978). The Representation of
Children's Knowledge. In H. W. Reese & L. P. Lipsitt (Eds.),
Advances in Child Development and Behavior, pp. 61-116. New
York: Academic Press
2. Langley, P. (1987). A General Theory of Discrimination
Learning. In D. Klahr, P. Langley, & R. Neches (Eds.),
Production System Models of Learning and Development, pp.
99-161. Cambridge, MA: MIT Press
3. Newell, A. (1990). Unified Theories of Cognition.
Cambridge, MA: Harvard University Press
4. McClelland, J.L. (1988). Parallel Distributed Processing:
Implications for Cognition and Development. Technical
Report AIP-47, Department of Psychology, Carnegie-Mellon
University
5. Shultz, T., Mareschal, D., & Schmidt, W. (1994). Modeling
Cognitive Development on Balance Scale Phenomena. Machine
Learning, Vol. 16, pp. 59-88.

4. Relevant Information:
This data set was generated to model psychological
experimental results. Each example is classified as having the
balance scale tip to the right, tip to the left, or be
balanced. The attributes are the left weight, the left
distance, the right weight, and the right distance. The
class is found by comparing (left-distance * left-weight) with
(right-distance * right-weight): the scale tips toward the side
with the greater product. If they are equal, it is balanced.

5. Number of Instances: 625 (49 balanced, 288 left, 288 right)

6. Number of Attributes: 4 (numeric) + class name = 5

7. Attribute Information:
1. Class Name: 3 (L, B, R)
2. Left-Weight: 5 (1, 2, 3, 4, 5)
3. Left-Distance: 5 (1, 2, 3, 4, 5)
4. Right-Weight: 5 (1, 2, 3, 4, 5)
5. Right-Distance: 5 (1, 2, 3, 4, 5)

8. Missing Attribute Values:
none

9. Class Distribution:
1. 46.08 percent are L
2. 07.84 percent are B
3. 46.08 percent are R
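
The class rule in section 4 and the counts in sections 5 and 9 can be cross-checked by enumerating all 5^4 attribute combinations. The sketch below is not part of the dataset file and assumes only the rule as stated there. The variable names are illustrative.

```matlab
% Enumerate every combination of the four attributes (values 1..5 each).
[lw, ld, rw, rd] = ndgrid(1:5, 1:5, 1:5, 1:5);

leftTorque  = lw(:) .* ld(:);   % left-weight  * left-distance
rightTorque = rw(:) .* rd(:);   % right-weight * right-distance

nLeft     = sum(leftTorque >  rightTorque);   % class L
nRight    = sum(leftTorque <  rightTorque);   % class R
nBalanced = sum(leftTorque == rightTorque);   % class B

total = numel(leftTorque);
fprintf('L=%d B=%d R=%d of %d\n', nLeft, nBalanced, nRight, total);
fprintf('L=%.2f%% B=%.2f%% R=%.2f%%\n', ...
        100*nLeft/total, 100*nBalanced/total, 100*nRight/total);
```

The enumeration reproduces the stated 288 left, 49 balanced, and 288 right instances, which indicates that the dataset contains each attribute combination exactly once.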