Commit 9d0b53b

committed
En docs for graph
1 parent 91f2873 commit 9d0b53b

13 files changed

+2279
-21
lines changed

docs/graph/data_loader_en.md

+366
Large diffs are not rendered by default.

docs/graph/data_object_en.md

+131
@@ -0,0 +1,131 @@
1+
# Data objects
2+
3+
GraphLearn represents the results of traversal and sampling as data objects. Both traversal and sampling operate on batches. Within a batch, the number of neighbors (or negative neighbors) per vertex may be the same for every vertex or may differ, so sampling is divided into aligned and non-aligned sampling.
4+
5+
Vertex traversal and aligned vertex sampling return `Nodes`; non-aligned vertex sampling returns `SparseNodes`. Correspondingly, edge traversal and aligned edge sampling return `Edges`, and non-aligned edge sampling returns `SparseEdges`.
6+
7+
## Dense data objects
8+
### `Nodes`
9+
10+
```python
@property
def ids(self):
    """ vertex ids, numpy.ndarray(int64) """

@property
def shape(self):
    """ shape of the vertex ids, (batch_size) / (batch_size, neighbor_count) """

@property
def int_attrs(self):
    """ attributes of type int, numpy.ndarray(int64), shape is [ids.shape, number of attributes of type int] """

@property
def float_attrs(self):
    """ attributes of type float, numpy.ndarray(float32), shape is [ids.shape, number of attributes of type float] """

@property
def string_attrs(self):
    """ attributes of type string, numpy.ndarray(string), shape is [ids.shape, number of attributes of type string] """

@property
def weights(self):
    """ weights, numpy.ndarray(float32), shape is ids.shape """

@property
def labels(self):
    """ labels, numpy.ndarray(int32), shape is ids.shape """
```
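
To make the shape conventions in these docstrings concrete, here is a minimal sketch that computes the shape each property would be expected to have. It assumes that "[ids.shape, number of attributes]" means the shape of `ids` with one extra trailing dimension; the helper `expected_shapes` and its parameter names are hypothetical and not part of GraphLearn.

```python
def expected_shapes(ids_shape, int_attr_num, float_attr_num, string_attr_num):
    """Hypothetical helper: shape of each Nodes property for a given ids shape.

    Assumption: attribute arrays append one trailing dimension (the number
    of attributes of that type) to ids.shape.
    """
    ids_shape = tuple(ids_shape)
    return {
        "ids": ids_shape,
        "int_attrs": ids_shape + (int_attr_num,),
        "float_attrs": ids_shape + (float_attr_num,),
        "string_attrs": ids_shape + (string_attr_num,),
        "weights": ids_shape,
        "labels": ids_shape,
    }

# Traversal: one batch of 64 vertices with 2 int and 3 float attributes.
traversal = expected_shapes((64,), 2, 3, 0)
# Aligned sampling: 5 neighbors for each of the 64 vertices.
sampling = expected_shapes((64, 5), 2, 3, 0)

print(traversal["int_attrs"])   # (64, 2)
print(sampling["float_attrs"])  # (64, 5, 3)
```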
65+
66+
### `Edges`
67+
The `Edges` interface differs from `Nodes` in that the `ids` property is removed and the following four properties are added for accessing the source and destination vertices.
68+
69+
```python
@property
def src_nodes(self):
    """ source vertex Nodes object """

@property
def dst_nodes(self):
    """ destination vertex Nodes object """

@property
def src_ids(self):
    """ source vertex ids, numpy.ndarray(int64) """

@property
def dst_ids(self):
    """ destination vertex ids, numpy.ndarray(int64) """
```
86+
87+
The shape of `ids` depends on the operation. In vertex and edge traversal, `ids` is one-dimensional and its length is the specified batch size. In sampling, `ids` is two-dimensional with shape [size of the input data after flattening to one dimension, number of samples drawn in the current sampling step].
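
As an illustration of this rule, the sketch below tracks how the shape of `ids` would evolve over a traversal followed by two aligned sampling hops. The function names are hypothetical, and the code only does shape arithmetic under the reading given above; it does not call GraphLearn.

```python
def ids_shape_for_traversal(batch_size):
    """Traversal yields a one-dimensional batch of ids."""
    return (batch_size,)

def ids_shape_for_sampling(input_shape, sample_count):
    """Sampling flattens its input to one dimension, then draws
    `sample_count` neighbors for each flattened entry."""
    flat_size = 1
    for dim in input_shape:
        flat_size *= dim
    return (flat_size, sample_count)

hop0 = ids_shape_for_traversal(64)        # seed vertices
hop1 = ids_shape_for_sampling(hop0, 10)   # 10 neighbors per seed
hop2 = ids_shape_for_sampling(hop1, 5)    # 5 neighbors per hop-1 vertex

print(hop0, hop1, hop2)  # (64,) (64, 10) (640, 5)
```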
88+
89+
## Sparse data objects
90+
91+
### `SparseNodes`
92+
`SparseNodes` represents the sparse neighbor vertices of a vertex. Relative to `Nodes`, it adds the following interfaces.
93+
94+
```python
@property
def offsets(self):
    """ one-dimensional integer array: the number of neighbors per vertex """

@property
def dense_shape(self):
    """ tuple with 2 elements: the shape of the corresponding dense Nodes """

@property
def indices(self):
    """ 2-dimensional array representing the position of each neighbor """

def __next__(self):
    """ the traversal interface: iterates over the vertices,
    returning the neighbors of each vertex as a Nodes object """
```
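
The relationship between `offsets`, `indices`, and `dense_shape` can be sketched with plain Python lists. This is an illustration under stated assumptions, not GraphLearn's implementation: it assumes neighbor ids are stored as one flat sequence, that `indices` holds [row, column] positions in the dense layout, and that the second element of `dense_shape` is the largest neighbor count in the batch. The helpers `sparse_layout` and `iter_neighbors` are hypothetical.

```python
def sparse_layout(neighbor_lists):
    """Build a flat, sparse description of ragged per-vertex neighbor lists."""
    # offsets: number of neighbors per vertex
    offsets = [len(nbrs) for nbrs in neighbor_lists]
    # indices: [row, col] position of every neighbor in the dense layout
    indices = [[row, col]
               for row, nbrs in enumerate(neighbor_lists)
               for col in range(len(nbrs))]
    # dense_shape: (batch size, max neighbor count), an assumption
    dense_shape = (len(neighbor_lists), max(offsets, default=0))
    flat_ids = [nbr for nbrs in neighbor_lists for nbr in nbrs]
    return flat_ids, offsets, indices, dense_shape

def iter_neighbors(flat_ids, offsets):
    """Yield each vertex's neighbors in turn, mirroring what __next__ describes."""
    start = 0
    for count in offsets:
        yield flat_ids[start:start + count]
        start += count

flat_ids, offsets, indices, dense_shape = sparse_layout([[11, 12], [], [31, 32, 33]])
per_vertex = list(iter_neighbors(flat_ids, offsets))

print(offsets)      # [2, 0, 3]
print(dense_shape)  # (3, 3)
print(per_vertex)   # [[11, 12], [], [31, 32, 33]]
```

Storing only the counts and positions avoids padding every vertex to the maximum neighbor count, which is the point of the non-aligned form.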
111+
112+
### `SparseEdges`
113+
`SparseEdges` represents the sparse neighboring edges of a vertex. Relative to `Edges`, it adds the following interfaces.
114+
115+
```python
@property
def offsets(self):
    """ one-dimensional integer array: the number of neighbors per vertex """

@property
def dense_shape(self):
    """ tuple with 2 elements: the shape of the corresponding dense Edges """

@property
def indices(self):
    """ 2-dimensional array representing the position of each neighbor """

def __next__(self):
    """ the traversal interface: iterates over the vertices,
    returning the neighboring edges of each vertex as an Edges object """
```

docs/graph/graph_object_cn.md

+10-11
@@ -133,37 +133,36 @@ item item -> item item
133133
2 3 3 1
134134
3 3 2 3
135135
1 4 3 2
136-
3 3
137-
1 4
138-
4 1
136+
3 3
137+
1 4
138+
4 1
139139
```
140140

141141
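- When src_type and dst_type differ, for example when the edge type is ("user", "item", "u2i") and u2i is an undirected edge, loading adds a copy of the reversed i2u edges in addition to the original u2i edges.<br />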
142142

143143
```
144144
Original u2i data    u2i data loaded into the graph + i2u data loaded into the graph
145-
user item -> user item item user
145+
user item -> user item item user
146146
1 2 1 2 2 1
147147
2 1 2 1 1 2
148-
1 3 1 3 3 1
149-
2 3 2 3 3 2
150-
3 3 3 3 3 3
151-
1 4 1 4 4 1
148+
1 3 1 3 3 1
149+
2 3 2 3 3 2
150+
3 3 3 3 3 3
151+
1 4 1 4 4 1
152152
```
153153

154154
<br />For traversal, see the GSL documentation for how to use homogeneous and heterogeneous undirected edges.
155155
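<br />For sampling, choose a sensible edge direction according to the meta-path you specify, using outV (from src to dst) and inV (from dst to src) appropriately. See the GSL documentation for the outV and inV interfaces.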
156156

157157
<a name="OVdVh"></a>
158158
### partition
159-
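In a distributed setting, i.e. when there are multiple GraphLeanrn Servers, the graph is partitioned automatically during construction and stored in a distributed manner. The default partiton assigns nodes and edges by src_id % number of servers.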
159+
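In a distributed setting, i.e. when there are multiple GraphLearn Servers, the graph is partitioned automatically during construction and stored in a distributed manner. The default partition assigns nodes and edges by src_id % number of servers.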
160160

161-
<a name="HNiIP"></a>
162161
## Initialization
163162

164163
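After the vertices and edges have been added, call the initialization interface to build the in-memory index from the raw data. Initialization determines how the graph data is served: standalone or distributed. In the distributed case, Server Mode and Client-Server Mode are further distinguished. Once initialization completes, the Graph object can be operated on.<br />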
165164

166-
<a name="nGHkF"></a>
165+
167166

168167
### Standalone
169168
Standalone mode is straightforward: the Graph object holds all of the graph data.<br />
