28.1.6 Binary File Formats
PGX Binary Format (PGB)
PGX binary format (.pgb) is the proprietary binary format for graph server (PGX), which allows fast and efficient file processing. Fundamentally, the file is a binary dump of the graph and property data. Bytes are written in network byte order (big endian).
Type Encoding
Table 28-3 Type Encoding
Value | Type | Size in bytes |
---|---|---|
0 | Boolean | 1 |
1 | Integer | 4 |
2 | Long | 8 |
3 | Float | 4 |
4 | Double | 8 |
7 | String | varies |
11 | Vertex labels | varies |
13 | Local date | 4 |
14 | Time | 4 |
15 | Timestamp | 8 |
16 | Time with time zone | 8 |
17 | Timestamp with time zone | 12 |
18 | Vector property | variable: <sizeof component-type> * <dimension> |
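As a minimal sketch, the type encoding above can be captured in a lookup table. The names `PGB_TYPES` and `fixed_size` are illustrative and not part of any PGX API; variable-length types are marked with `None`.

```python
# Type codes and sizes copied from Table 28-3.
# None marks types whose size is not fixed ("varies" / "variable").
PGB_TYPES = {
    0: ("boolean", 1),
    1: ("integer", 4),
    2: ("long", 8),
    3: ("float", 4),
    4: ("double", 8),
    7: ("string", None),
    11: ("vertex labels", None),
    13: ("local date", 4),
    14: ("time", 4),
    15: ("timestamp", 8),
    16: ("time with time zone", 8),
    17: ("timestamp with time zone", 12),
    18: ("vector property", None),
}


def fixed_size(type_code):
    """Return the size in bytes of a fixed-size type, or raise for other codes."""
    if type_code not in PGB_TYPES:
        raise ValueError(f"unknown PGB type code: {type_code}")
    name, size = PGB_TYPES[type_code]
    if size is None:
        raise ValueError(f"type '{name}' has no fixed size")
    return size


long_size = fixed_size(2)
```

Codes 5, 6, 8, 9, 10, and 12 do not appear in the table, so the sketch rejects them rather than guessing a meaning.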
File Layout
Table 28-4 File Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
4 | magic word | Yes | 0x99191191 |
4 | vertex size | Yes | Allowed values are 4 and 8. |
4 | edge size | Yes | Allowed values are 4 and 8. |
<vertex size> | number of vertices | Yes | |
<edge size> | number of edges | Yes | |
<edge size> * (<numVertices> + 1) | edge begin array | Yes | |
<vertex size> * <numEdges> | destination vertex array | Yes | |
1 | component bitmap | Yes | |
4 | vertexKey type | No | Only present if component bitmap & 0x0001 == 0x0001. See Table 28-3 for type encoding. |
<vertex key layout> | vertex keys | No | Only present if component bitmap & 0x0001 == 0x0001. |
4 | edgeKey type | No | Only present if component bitmap & 0x0008 == 0x0008. See Table 28-3 for type encoding. |
<numEdges> * 8 | edge keys | No | Only present if component bitmap & 0x0008 == 0x0008. |
4 | number of vertex properties | Yes | |
<num vertex properties> * <property layout> | property data | Yes | See Table 28-10. |
4 | number of edge properties | Yes | |
<num edge properties> * <property layout> | property data | Yes | See Edge Property Layout. |
<vertex labels layout> | vertex labels | No | Only present if component bitmap & 0x0002 == 0x0002. |
<edge labels layout> | edge label | No | Only present if component bitmap & 0x0004 == 0x0004. |
4 | number of shared pools | Yes | |
<shared pools size> | shared pools | No | |
<property names size> | property names | No | Only present if component bitmap & 0x0010 == 0x0010. See Table 28-19. |
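The required leading fields of the file layout can be read with a few lines of Python. The following is a sketch under stated assumptions, not a reference reader: it treats all integers as unsigned, and it assumes that the edge begin array and destination vertex array form a compressed adjacency list (entry `v` and entry `v + 1` of the begin array delimit the edges of vertex `v`). The function names `read_topology` and `out_neighbors` are illustrative only.

```python
import io
import struct

PGB_MAGIC = 0x99191191

# Bit meanings taken from the "Only present if" comments in Table 28-4.
COMPONENT_FLAGS = {
    0x01: "vertex keys",
    0x02: "vertex labels",
    0x04: "edge label",
    0x08: "edge keys",
    0x10: "property names",
}


def read_uint(stream, size):
    """Read a big-endian integer of `size` bytes (assumed unsigned)."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of PGB data")
    return int.from_bytes(data, "big")


def read_topology(stream):
    """Parse the fields up to and including the component bitmap."""
    if read_uint(stream, 4) != PGB_MAGIC:
        raise ValueError("bad magic word, not a PGB file")
    vertex_size = read_uint(stream, 4)
    edge_size = read_uint(stream, 4)
    if vertex_size not in (4, 8) or edge_size not in (4, 8):
        raise ValueError("vertex size and edge size must be 4 or 8")
    num_vertices = read_uint(stream, vertex_size)
    num_edges = read_uint(stream, edge_size)
    begin = [read_uint(stream, edge_size) for _ in range(num_vertices + 1)]
    dest = [read_uint(stream, vertex_size) for _ in range(num_edges)]
    bitmap = read_uint(stream, 1)
    return {
        "num_vertices": num_vertices,
        "num_edges": num_edges,
        "begin": begin,
        "dest": dest,
        "components": [n for bit, n in COMPONENT_FLAGS.items() if bitmap & bit],
    }


def out_neighbors(topology, vertex):
    """Destinations of `vertex`, assuming a compressed adjacency layout."""
    begin, dest = topology["begin"], topology["dest"]
    return dest[begin[vertex]:begin[vertex + 1]]


# Hand-built sample: 2 vertices, 2 edges (0 -> 1, 1 -> 0), bitmap 0x03.
sample = (
    struct.pack(">III", PGB_MAGIC, 4, 4)
    + struct.pack(">II", 2, 2)
    + struct.pack(">3I", 0, 1, 2)
    + struct.pack(">2I", 1, 0)
    + bytes([0x03])
)
topology = read_topology(io.BytesIO(sample))
```

The optional sections that follow the bitmap (keys, properties, labels, shared pools, property names) would be read in table order, each guarded by the corresponding bit.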
Vertex Key Layout
The layout of vertex keys depends on the vertexKey type. PGB supports integer, long, and string vertex keys.
Table 28-5 Integer Vertex Keys
Size in bytes | Description | Required | Comment |
---|---|---|---|
<numVertices> * 4 | key data | Yes | For each vertex, the corresponding integer key value. |
Table 28-6 Long Vertex Keys
Size in bytes | Description | Required | Comment |
---|---|---|---|
<numVertices> * 8 | key data | Yes | For each vertex, the corresponding long key value. |
Table 28-7 String Vertex Keys
Size in bytes | Description | Required | Comment |
---|---|---|---|
4 | compression scheme | Yes | Reserved (must be 0). |
8 | property size | Yes | Size of each element in bytes in the following data. |
<number of keys> * <string key element layout> | string key data | Yes | Content of the vertex keys (see Table 28-8). |
Table 28-8 String Key Element Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
4 | string length | Yes | Length of the string in bytes. |
<string length> | string key data | Yes | Content of the string as bytes, no zero-character. |
Property Layout
The following tables show the layout for primitive-type properties, followed by the special layouts for vector properties and string properties:
Table 28-9 Primitive Type Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
4 | property type | Yes | See Table 28-3 for type encoding. |
8 | property size | Yes | Size of the property data in bytes. |
<property size> | property data | Yes | Stored as <numVertices/numEdges> * <type size>. |
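A primitive property can be decoded directly from its header, as the following sketch suggests. It covers only boolean, integer, long, float, and double; it assumes integers are signed two's complement, floats are IEEE 754, and a boolean is one byte where nonzero means true. None of this is stated in the table, and the name `read_primitive_property` is illustrative.

```python
import io
import struct

# struct format characters for a subset of the codes in Table 28-3.
# Signedness and the boolean representation are assumptions.
PRIMITIVE_FORMATS = {0: "?", 1: "i", 2: "q", 3: "f", 4: "d"}


def read_exact(stream, size):
    """Read exactly `size` bytes or fail."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of PGB data")
    return data


def read_primitive_property(stream, num_entities):
    """Read property type, property size, then one value per entity."""
    type_code = int.from_bytes(read_exact(stream, 4), "big")
    size = int.from_bytes(read_exact(stream, 8), "big")
    fmt = PRIMITIVE_FORMATS.get(type_code)
    if fmt is None:
        raise ValueError(f"type code {type_code} is not handled by this sketch")
    expected = num_entities * struct.calcsize(">" + fmt)
    if size != expected:
        raise ValueError(f"property size {size} does not match {expected}")
    data = read_exact(stream, size)
    return list(struct.unpack(f">{num_entities}{fmt}", data))


# Hand-built long property (type 2) for two entities: 16 bytes of data.
sample = (2).to_bytes(4, "big") + (16).to_bytes(8, "big") + struct.pack(">2q", 7, -1)
values = read_primitive_property(io.BytesIO(sample), 2)
```

Checking the stored property size against `<numVertices/numEdges> * <type size>` catches a mismatched entity count early.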
Table 28-10 Vector Property Layout
Size in bytes | Description | Comment |
---|---|---|
4 | vector type mark | Always equal to 18. |
8 | size of vector property data and extra fields | dataSize = <sizeof component-type> * <dimension> + 8 (the 8 extra bytes are for the following 2 extra fields in the vector property header). |
4 | vector component data type | Valid types are integer, long, float, double. Encoded with the value specified in Table 28-3. |
4 | vector dimension | Number of components per vector value. Must be greater than 0 to be a valid vector property. |
dataSize - 8 | data | Stored as an array of length <numVertices/numEdges> * <dimension>, in which the value of the j-th component of the vector for the i-th entity is at position i * <dimension> + j. |
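The position rule for vector data (component j of entity i sits at element index i * dimension + j) can be illustrated with a short sketch. It assumes signed integers and IEEE 754 floats for the component types; `vector_component` is a hypothetical helper name.

```python
import struct

# struct format characters for the component types allowed in Table 28-10
# (integer, long, float, double), keyed by their Table 28-3 codes.
COMPONENT_FORMATS = {1: "i", 2: "q", 3: "f", 4: "d"}


def vector_component(data, component_type, dimension, i, j):
    """Return component j of the vector belonging to entity i."""
    if dimension <= 0:
        raise ValueError("vector dimension must be greater than 0")
    if not 0 <= j < dimension:
        raise IndexError("component index out of range")
    fmt = ">" + COMPONENT_FORMATS[component_type]
    element_index = i * dimension + j
    return struct.unpack_from(fmt, data, element_index * struct.calcsize(fmt))[0]


# Two entities with 3 float components each, stored back to back.
data = struct.pack(">6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
value = vector_component(data, 3, 3, 1, 2)
```

Here `data` stands for the bytes after the two extra header fields, that is, the final row of the table.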
Table 28-11 String Type Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
4 | property type | Yes | Must be 7. |
8 | property size | Yes | Size of the following data in bytes. |
1 | reserved | Yes | Reserved (must be 0). |
<dictionary layout> | dictionary | Yes | String dictionary used in the property. |
<numVertices/numEdges> * 8 | property content | Yes | Content of the string property, stored as IDs that refer to the strings in the dictionary. |
Table 28-12 String Dictionary Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
1 | reserved | Yes | Reserved (must be 0). |
8 | number of strings | Yes | Number of strings in the following dictionary. |
<number of strings> * <dictionary element layout> | dictionary data | Yes | See Table 28-13. |
Table 28-13 String Dictionary Element Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
8 | string id | Yes | Unique ID of the string. |
4 | string length | Yes | Length of the string in bytes. |
<string length> | string data | Yes | Content of the string as bytes, no zero-character. |
Vertex Labels Layout
Table 28-14 Vertex Labels Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
4 | type | Yes | Must be 11. |
8 | size | Yes | Size of the following data in bytes. |
<dictionary layout> | dictionary | Yes | String dictionary used in the vertex labels. |
<numVertices + 1> * 8 | string id begin array | Yes | <string ids> offset array for each vertex. |
8 | number of string ids | Yes | The number of string ids. |
<number of string ids> * 8 | string ids | Yes | Array of string ids in the string dictionary. |
Edge Label Layout
The edge label layout follows the string type layout.
Shared Pools Layout
Table 28-15 Shared Pools Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
1 | type | Yes | 1: enum, 2: prefixed |
Table 28-16 Type == Enum
Size in bytes | Description | Required | Comment |
---|---|---|---|
8 | num strings | Yes | |
<number of strings> * <string table layout> | dictionary data | Yes | See Table 28-18. |
Table 28-17 Type == Prefix
Size in bytes | Description | Required | Comment |
---|---|---|---|
8 | num prefixes | Yes | |
<number of prefixes> * <string table layout> | dictionary data | Yes | See Table 28-18. |
8 | num suffixes | Yes | |
<number of suffixes> * <string table layout> | dictionary data | Yes | See Table 28-18. |
Table 28-18 String Table for Shared Pools
Size in bytes | Description | Required | Comment |
---|---|---|---|
8 | string id | Yes | String can be literal (in case of enum) or prefix/suffix (in case of prefix). |
4 | string length | Yes | |
<string length> | string data | Yes | |
Property Names Layout
Table 28-19 Property Names Layout
Size in bytes | Description | Required | Comment |
---|---|---|---|
8 | size | Yes | String can be literal (in case of enum) or prefix/suffix (in case of prefix). |
<sum of size of vertex property names> | vertex property names | No | Follows the String Key Element Layout. See Table 28-8. |
<sum of size of edge property names> | edge property names | No | Follows the String Key Element Layout. See Table 28-8. |